Response.content iterates in needlessly small chunks #3186
`Response.content` iterates over the response data in chunks of 10240 bytes. The number 10240 was set in commit 62d2ea8.

After tracing the source code of urllib3 and httplib, I can't see a reason for this behavior. It all ultimately goes through httplib's `HTTPResponse.readinto`, which automatically limits the read size according to `Content-Length` or the `chunked` framing. Therefore, it seems that, if you simply set `CONTENT_CHUNK_SIZE` to a much larger number (like 10240000), nothing should change, except `Response.content` will become more efficient on large responses.

Update: it seems like httplib allocates a buffer of the requested size (to be read into), so simply setting `CONTENT_CHUNK_SIZE` to a large value will cause large chunks of memory to be allocated, which is probably a bad idea.

This is not a problem for me and I have not researched it thoroughly. I'm filing this issue after investigating a Stack Overflow question where this caused an unexpected slowdown for the poster, and a subsequent IRC exchange with @Lukasa. Feel free to do (or not do) whatever you think is right here.
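A minimal sketch of the workaround implied above: stream the response and pass an explicit `chunk_size` to `iter_content()` rather than relying on `Response.content`. The URL and the 64 KiB chunk size here are illustrative assumptions, not values taken from this issue.

```python
import requests

# Sketch: read the body in larger chunks than the 10240-byte
# CONTENT_CHUNK_SIZE that Response.content uses internally.
# The URL and the 64 KiB figure are illustrative assumptions.
resp = requests.get("https://example.com/large-file", stream=True)
resp.raise_for_status()
body = b"".join(resp.iter_content(chunk_size=64 * 1024))
```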
Comments

It's good to know that httplib allocates a buffer of that size. I think we can probably stretch to double that buffer though: 20kb of buffer is unlikely to be the end of the world. At the very least, though, we should understand how this works so that we can write documentation to explain this.
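For reference, a simplified model of the allocation behavior being discussed, patterned on CPython's `http.client.HTTPResponse.read` (a sketch, not the verbatim stdlib code):

```python
# Simplified model of http.client.HTTPResponse.read(amt) in Python 3:
# a bytearray of the requested size is allocated up front and filled via
# readinto(), which caps the read at Content-Length or the current chunk
# boundary. Every call therefore pays for an amt-sized buffer, even if
# far fewer bytes actually arrive.
def read(self, amt):
    b = bytearray(amt)                  # allocation proportional to amt
    n = self.readinto(b)                # may fill much less than amt
    return memoryview(b)[:n].tobytes()  # copy out only the bytes received
```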
Originally, I iterated over a chunk size of …

While we're on the topic, we have 4 different default chunk_sizes between all of our iterator functions in Requests. Some I can find reasoning for (…). I'm not saying these are wrong, just curious why …
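For concreteness, the divergent defaults look roughly like this (paraphrased from requests' models.py around the time of this issue; exact values and signatures may differ between versions):

```python
# Paraphrased from requests/models.py circa this issue (not verbatim):
CONTENT_CHUNK_SIZE = 10 * 1024  # used by Response.content
ITER_CHUNK_SIZE = 512           # default for Response.iter_lines()

class Response:
    def iter_content(self, chunk_size=1, decode_unicode=False):
        # yet another default when iter_content() is called directly
        ...
```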
There is a long, long issue to look at in the backlog. Anyone wanting to make progress on this needs to read and understand #844. Safe to say this is not a good choice for anyone who isn't up for a really tough slog of a job.

Just a ping back from the pip project on this 12-year-old bug. :) The iter_content() chunk size was set to 10240 bytes 12 years ago in requests. It's a needlessly small size and incurs a lot of overhead. I'm quite curious whether there is any reason that prevents updating it. On Linux, the network read buffer was increased to 64k in kernel v4.20 (2018); the read and write buffers were historically 16k before that.
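The kernel defaults cited here are easy to check; a small sketch (assumes Linux, and values vary with kernel version and sysctl configuration):

```python
import socket

# Print the kernel's default receive/send buffer sizes for a fresh TCP
# socket. Per the comment above, the read-buffer default grew to 64k in
# kernel v4.20; this just reports whatever the running system uses.
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
print("SO_RCVBUF:", s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF))
print("SO_SNDBUF:", s.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF))
s.close()
```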