Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove support for redundant request media types in the API #5874

Merged
merged 3 commits into from
Mar 29, 2023

Conversation

SpecLad
Copy link
Contributor

@SpecLad SpecLad commented Mar 16, 2023

Motivation and context

Currently, every API endpoint that takes a request body supports (or at least declares to support) 4 media types:

  • application/json
  • application/offset+octet-stream
  • application/x-www-form-urlencoded
  • multipart/form-data

Supporting multiple media types has a cost. We need to test that the various media types actually work, and we need to document their use (e.g., providing examples for each supported type). In practice, we mostly don't... but we still need to. In addition, the user, seeing the list of supported types, has to decide which one to use.

Now, the cost could be worthwhile if the multiple type support provided value. However, for the most part, it doesn't:

  • application/offset+octet-stream only makes sense for the TUS endpoints. Moreover, for those endpoints it's the only type that makes sense.

  • application/x-www-form-urlencoded is strictly inferior to JSON. It doesn't support compound values, and it doesn't carry type information, so you can't, for example, distinguish a string from a null. It's potentially susceptible to CSRF attacks (we have protections against those, but we could accidentally disable them and not notice). Its main use is for form submissions, but we don't use HTML-based submissions.

  • multipart/form-data shares the downsides of application/x-www-form-urlencoded, however it does have a redeeming quality: it allows to efficiently upload binary files. Therefore, it has legitimate uses in endpoints that accept such files.

Therefore, I believe it is justified to reduce the API surface area as follows:

  • Restrict application/offset+octet-stream to TUS endpoints and remove support for other types from those endpoints.

  • Remove application/x-www-form-urlencoded support entirely.

  • Restrict multipart/form-data support to endpoints dealing with file uploads.

Note that I had to keep multipart/form-data support for POST /api/cloudstorages and PATCH /api/cloudstorages/<id>. That's because they accept a file-type parameter (key_file). I don't especially like this. Key files are not big, so the efficiency benefits of multipart/form-data don't matter. Therefore, I don't think we really need to support this type here; it would be more elegant to just use JSON and Base64-encode the key file contents. However, I don't have time to make that change right now, so I'm
leaving it for another time.

How has this been tested?

CI.

Checklist

  • I submit my changes into the develop branch
  • I have added a description of my changes into the CHANGELOG file
  • [ ] I have updated the documentation accordingly
  • [ ] I have added tests to cover my changes
  • [ ] I have linked related issues (see GitHub docs)
  • [ ] I have increased versions of npm packages if it is necessary
    (cvat-canvas,
    cvat-core,
    cvat-data and
    cvat-ui)

License

  • I submit my code changes under the same MIT License that covers the project.
    Feel free to contact the maintainers if that's a concern.

@SpecLad
Copy link
Contributor Author

SpecLad commented Mar 16, 2023

/check

@github-actions
Copy link
Contributor

github-actions bot commented Mar 16, 2023

❌ Some checks failed
📄 See logs here

@SpecLad
Copy link
Contributor Author

SpecLad commented Mar 17, 2023

/check

@github-actions
Copy link
Contributor

github-actions bot commented Mar 17, 2023

❌ Some checks failed
📄 See logs here

@SpecLad SpecLad force-pushed the remove-redundant-parsers branch 2 times, most recently from e51bbff to 33f4ae6 Compare March 19, 2023 13:48
@SpecLad
Copy link
Contributor Author

SpecLad commented Mar 19, 2023

/check

@github-actions
Copy link
Contributor

github-actions bot commented Mar 19, 2023

❌ Some checks failed
📄 See logs here

@SpecLad
Copy link
Contributor Author

SpecLad commented Mar 28, 2023

/check

@github-actions
Copy link
Contributor

github-actions bot commented Mar 28, 2023

✔️ All checks completed successfully
📄 See logs here

Currently, every API endpoint that takes a request body supports (or at
least declares to support) 4 media types:

* `application/json`
* `application/offset+octet-stream`
* `application/x-www-form-urlencoded`
* `multipart/form-data`

Supporting multiple media types has a cost. We need to test that the various
media types actually work, and we need to document their use (e.g.,
providing examples for each supported type). In practice, we mostly don't...
but we still need to. In addition, the user, seeing the list of supported
types, has to decide which one to use.

Now, the cost could be worthwhile if the multiple type support provided value.
However, for the most part, it doesn't:

* `application/offset+octet-stream` only makes sense for the TUS endpoints.
  Moreover, for those endpoints it's the only type that makes sense.

* `application/x-www-form-urlencoded` is strictly inferior to JSON. It doesn't
  support compound values, and it doesn't carry type information, so you
  can't, for example, distinguish a string from a null. It's potentially
  susceptible to CSRF attacks (we have protections against those, but we
  could accidentally disable them and not notice). Its main use is for form
  submissions, but we don't use HTML-based submissions.

* `multipart/form-data` shares the downsides of `application/x-www-form-urlencoded`,
  however it does have a redeeming quality: it allows to efficiently upload
  binary files. Therefore, it has legitimate uses in endpoints that accept
  such files.

Therefore, I believe it is justified to reduce the API surface area as follows:

* Restrict `application/offset+octet-stream` to TUS endpoints and remove
  support for other types from those endpoints.

* Remove `application/x-www-form-urlencoded` support entirely.

* Restrict `multipart/form-data` support to endpoints dealing with file
  uploads.

Note that I had to keep `multipart/form-data` support for `POST
/api/cloudstorages` and `PATCH /api/cloudstorages/<id>`. That's because they
accept a file-type parameter (`key_file`). I don't especially like this. Key
files are not big, so the efficiency benefits of `multipart/form-data` don't
matter. Therefore, I don't think we really need to support this type here;
it would be more elegant to just use JSON and Base64-encode the key file
contents. However, I don't have time to make that change right now, so I'm
leaving it for another time.
@SpecLad SpecLad changed the title Remove redundant parsers Remove support for redundant request media types in the API Mar 28, 2023
@SpecLad SpecLad marked this pull request as ready for review March 28, 2023 15:28
@SpecLad SpecLad requested a review from nmanovic as a code owner March 28, 2023 15:28
@SpecLad SpecLad requested review from zhiltsov-max and removed request for nmanovic March 28, 2023 15:28
@SpecLad SpecLad merged commit 409fac5 into cvat-ai:develop Mar 29, 2023
@SpecLad SpecLad deleted the remove-redundant-parsers branch March 29, 2023 09:58
mikhail-treskin pushed a commit to retailnext/cvat that referenced this pull request Jul 1, 2023
…5874)

Currently, every API endpoint that takes a request body supports (or at
least declares to support) 4 media types:

* `application/json`
* `application/offset+octet-stream`
* `application/x-www-form-urlencoded`
* `multipart/form-data`

Supporting multiple media types has a cost. We need to test that the
various media types actually work, and we need to document their use
(e.g., providing examples for each supported type). In practice, we
mostly don't... but we still need to. In addition, the user, seeing the
list of supported types, has to decide which one to use.

Now, the cost could be worthwhile if the multiple type support provided
value. However, for the most part, it doesn't:

* `application/offset+octet-stream` only makes sense for the TUS
endpoints. Moreover, for those endpoints it's the only type that makes
sense.

* `application/x-www-form-urlencoded` is strictly inferior to JSON. It
doesn't support compound values, and it doesn't carry type information,
so you can't, for example, distinguish a string from a null. It's
potentially susceptible to CSRF attacks (we have protections against
those, but we could accidentally disable them and not notice). Its main
use is for form submissions, but we don't use HTML-based submissions.

* `multipart/form-data` shares the downsides of
`application/x-www-form-urlencoded`, however it does have a redeeming
quality: it allows to efficiently upload binary files. Therefore, it has
legitimate uses in endpoints that accept such files.

Therefore, I believe it is justified to reduce the API surface area as
follows:

* Restrict `application/offset+octet-stream` to TUS endpoints and remove
support for other types from those endpoints.

* Remove `application/x-www-form-urlencoded` support entirely.

* Restrict `multipart/form-data` support to endpoints dealing with file
uploads.

Note that I had to keep `multipart/form-data` support for `POST
/api/cloudstorages` and `PATCH /api/cloudstorages/<id>`. That's because
they accept a file-type parameter (`key_file`). I don't especially like
this. Key files are not big, so the efficiency benefits of
`multipart/form-data` don't matter. Therefore, I don't think we really
need to support this type here; it would be more elegant to just use
JSON and Base64-encode the key file contents. However, I don't have time
to make that change right now, so I'm
leaving it for another time.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants