Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Iqss/8726 better HTTP range request support #8727

Merged
merged 10 commits into from
Jun 23, 2022

Conversation

haarli
Copy link
Contributor

@haarli haarli commented May 23, 2022

This PR addresses #8726 with the following changes:

  • Access.java datafile()
    • If a certain range is requested, the returned status is now 206 (partial content) instead of 200
  • DownloadInstanceWriter writeTo()
    • I f a range is requested, a Content-Range header is returned containing information about the returned bytes
    • I f a range is requested, a Accept-Ranges header with value "bytes" is returned
  • APIBlockingFilter doFilter()
    • CORS: The "Range" header is added to "Access-Control-Allow-Headers"
    • CORS: The Content-Range and Accept-Ranges header are added to "Access-Control-Expose-Headers"

[from L.A.:]
A quick note on how/what to test/QA:

A call to file download api with the Range request should now return 206 instead of 200; and the extra headers as described above.
May make sense to confirm that regular downloads (without the range header) are not affected and everything is working as before.
This would be best to test on locally-stored files; or, if testing with s3 storage, make sure to disable the direct download option. (otherwise the return codes and the headers you're seeing are not ours, but Jeff Bezos's).
(sorry if the above is all stating the obvious)

@coveralls
Copy link

Coverage Status

Coverage decreased (-0.001%) to 19.241% when pulling d3de1fa on MPDL:IQSS/8726_better_http_range_support into fdef1f6 on IQSS:develop.

if (headers.getRequestHeaders().containsKey("Range")) {
return Response.status(Response.Status.PARTIAL_CONTENT).entity(downloadInstance).build();
}
return Response.ok(downloadInstance).build();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please clarify what this change does?
Specifically, changing the return value from DownloadInstance to Response - are you positive that it's still going to stream the output? (As opposed to trying to read everything from the DownloadInstance into memory at once, and then build the Response?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's start with the requirement:
If the request contains a "Range" header and the requested range of bytes is returned, a HTTP status of 206 (Partial Content) instead of 200 must be returned. See also
https://datatracker.ietf.org/doc/html/rfc2616#section-14.35.2

So, these lines change the status code of the HTTP response dependent on the existence of the Range header in the HTTP request. The only way to change it is by using the Response object. (Setting it directly in the HttpServletResponse would be overwritten by Jersey later).

However, there is no real difference between returning the entity (DownloadInstance) directly, or wrapped in a Response object. Except that Response allows more control over settings like the status code. The custom MessageBodyWriter for DownloadInstance is still called, and that is where the streaming happens.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. For the record, this is very old code. Written for a very early version of Jersey.
We have an open issue for refactoring that download subsystem. But it's going to be a project, with years worth of special case handling added there.

Copy link
Contributor

@landreev landreev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moving along.
(added a quick note for QA to the description)

@landreev
Copy link
Contributor

landreev commented Jun 7, 2022

Sorry, I rushed to move this into QA.
Some API tests are of course failing now, on account of the changed status code.
(I'm assuming it's just that; and should be localized to FilesIT->testRange() - but I would double-check. but should be trivial to fix regardless)
Thanks.

@haarli
Copy link
Contributor Author

haarli commented Jun 8, 2022

Uh, sorry, didn't have the integration tests on my mind. I'll have a look at them.
Hope the status code change won't break any external tools already working with Range requests in Dataverse. Do you know of any?

@landreev
Copy link
Contributor

landreev commented Jun 8, 2022

I'm not aware of any external tools that this would break, no. This native range support is a relatively new addition; and even if anyone is already relying on it, the change would be trivial on their end.

But this makes me realize - it would be proper style to announce this change, even if it's not likely to affect many people. Our normal procedure for having something included in the next release notes is to add your PR-specific note as a file under doc/release-notes (please see the notes already there for an example). Just one sentence - the return code is changing, the extra headers are being added, per the http spec...

And, uh, better late than never: thank you for contributing to the project! 😄

@haarli
Copy link
Contributor Author

haarli commented Jun 9, 2022

Thanks. I've added notes for this PR under doc/release-notes.

FWIW: We just went productive with the ZIP Previewer, which relies on these changes, see e.g. here: https://edmond.mpdl.mpg.de/file.xhtml?fileId=182288
I'll create a PR for the previewer on GDCC soon.

@landreev
Copy link
Contributor

landreev commented Jun 9, 2022

We just went productive with the ZIP Previewer

Very nice, I like it a lot! I'll be looking forward to having it available @ GDCC.

@kcondon
Copy link
Contributor

kcondon commented Jun 23, 2022

Hi @haarli would you mind refreshing this branch from develop? We've since released a new version and the version numbers have incremented. Regards, Kevin

@kcondon kcondon merged commit 991a39d into IQSS:develop Jun 23, 2022
@pdurbin pdurbin added this to the 5.12 milestone Jun 23, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants