Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mark endpoints for streaming uploads #1371

Conversation

tst-lsavoie
Copy link

Fixes #592 based on the suggestion by @MatteoRagni: #592 (comment).
Also Fixes #1332, I think.

Changes proposed in this pull request:

  • Adds an x-stream-upload option for requests. When this is specified, connexion will pass flask.request.stream as the request.body instead of calling get_data(). This allows streaming uploads to disk instead of reading the entire file into memory.

Adds an "x-stream-upload" option that users can add to a requestBody to
specify that it should be read as a stream rather than loading the
entire request into memory. This allows it to be streamed to disk,
which is necessary to support large requests like file uploads.
@RobbeSneyders
Copy link
Member

Hi @tst-lsavoie, thanks for the PR.

  • I'm a bit hesitant to define a custom extension. Is there a reason we cannot inspect Content-Type instead?
  • We moved to the main branch. Can you rebase or merge and change the target branch of the PR?

@tst-lsavoie tst-lsavoie changed the base branch from master to main July 1, 2021 20:49
@tst-lsavoie
Copy link
Author

@RobbeSneyders I didn't use the Content-Type because it's less reliable since a client could leave the content type out or set it incorrectly. There's also the question of which content types should be streamed, which might be different for each application, or even different for different endpoints in the same application. I don't love the idea of adding a new extension either but it seemed like the least bad option.

I changed the base branch to main but it looks like there are non-trivial merge conflicts I need to figure out. I'm going to hold off on that until we have a clearer picture of how the PR will need to change.

@RobbeSneyders
Copy link
Member

I've been looking into this a bit more, and I would suggest the following solution.

Define a body property with body_getter on ConnexionRequest, similar to the json property with json_getter.

https://github.com/zalando/connexion/blob/f46551cee8056dca65c1177e8252f19805d95d50/connexion/lifecycle.py#L25-L27

We can then assign request.get_data as the boddy_getter for Flask, which would delay stream consumption until the body property is evaluated. There's only two places in Connexion where request.body is evaluated:

  1. During request body validation. Here the body is only consumed if Connexion supports validation for the provided Content-Type, which is currently only for json or form:
    https://github.com/zalando/connexion/blob/f46551cee8056dca65c1177e8252f19805d95d50/connexion/decorators/validation.py#L137-L190

  2. During parameter parsing. Here the body is currently always consumed:
    https://github.com/zalando/connexion/blob/f46551cee8056dca65c1177e8252f19805d95d50/connexion/decorators/parameter.py#L79-L84
    But it's actually not used if the body parameter is not defined in the view function parameters:
    https://github.com/zalando/connexion/blob/f46551cee8056dca65c1177e8252f19805d95d50/connexion/operations/swagger2.py#L277-L282
    If we refactor the parameter parsing a bit, we can prevent body consumption here as well if the body parameter is not present in the view function parameters.

You can then prevent stream consumption in connexion by

  • Defining a Content-Type for which Connexion does not support validation, this should be the case for all streaming types
  • Not defining a body parameter in the view function

WDYT?

@RobbeSneyders
Copy link
Member

RobbeSneyders commented Jul 3, 2021

You can find a quick and dirty PoC of the solution here:
thermopylae/connexion@mark-endpoints-for-streaming-uploads...spec-first:streaming-uploads-poc

@tst-lsavoie
Copy link
Author

I think I understand the proposed solution. So, if the content type was not form or JSON, and if the user didn't add a parameter for the body to their function, then the stream would be available.

I could see that working. I worry about it for the future, though. One small, innocent looking update that accesses the request body could break it. Hopefully testing would catch that, but it's not guaranteed.

We'd have to do some refactoring for the OpenAPI 3 version of the code as well:
https://github.com/zalando/connexion/blob/c014299435990ff1d4c20596396ca563f7b266b4/connexion/operations/openapi.py#L270-L300

The biggest problem is going to be time. I'm not sure when I'll next have a good chunk of time to focus on this issue.

@clementperon
Copy link

This is great!

I have used it marking my endpoint with x-stream-upload and work like a charm, thanks :)

@RobbeSneyders
Copy link
Member

RobbeSneyders commented Feb 18, 2023

Fixed by #1618

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants