-
Notifications
You must be signed in to change notification settings - Fork 322
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Frequent HTTP::TimeoutError's (Read) on 1.0.x but not on 0.9.x #298
Comments
How long does the request normally take, and how long before it times out? Also what version of Ruby? There's some changes @tarcieri made around the API calls we use for timeouts, but they should work identical to before. I could believe there's a small rounding change where if you were cutting it close on timeouts, but given you said increasing it doesn't help, it's probably not that. |
@javierjulio can you provide any sort of timeline of the events that are happening and how those don't line up with your described timeout model? |
@zanker I don't know how long requests normally take, since I'm making requests to a sandbox service I'd say about a second or so each. Note that the timeouts are for each request (per operation). I've updated the description with this info and Ruby version (2.2.3). Thanks! @tarcieri to be honest I don't know what the best timeouts are, I just went with what shown in the README as it seemed better than nothing. I figured I would just come back and fine tune it later. What I do know though is after upgrading to 1.0.x I consistently received HTTP::TimeoutError (Read) errors but I'm not seeing it on 0.9.x. I just deployed 3 separate branches to CircleCI that runs a fair amount of integration tests and all passed without timeout errors. Before it happened on each deploy. Having a hard time figuring out how I can get you more information. If I find out more if you have a suggestion let me know and I'll do my best to get it to you ASAP. Thanks! |
@javierjulio more generally, do you think you're getting a |
@tarcieri that's a great question, I'm not sure. I don't believe so. I'm not expecting it to timeout. It hasn't before and when downgrading its fine. At first I thought it was service related but the third party service has been up and no reported issues any time I check. I started to suspect http since I upgrade dependencies frequently. Since I've been on 1.0.x I've had a few requests fail due to timeout errors consistently. The requests vary. I don't have more info sadly. Sorry. 😩 If I find out more I'll provide it or if any questions I can answer. |
We experienced same issue. So we had to downgrade to |
Was unable to 100% simple reproducible variant. # reproducible on ruby 2.1.8
url = "https://play.google.com/store/apps/" \
"category/GAMES/collection/top?start=400&num=100&hl=en&gl=us"
42.times do
start = Time.now.to_f
begin
HTTP.timeout(:read => 5, :connect => 5, :write => 5).get(url).flush
rescue
puts "Failed after: #{Time.now.to_f - start}"
raise
end
end |
Fails for me with:
|
The snippet abbove reproduces error on ruby |
@ixti awesome thanks! I was thinking I might be crazy. Good to know its reproducible elsewhere. I just took your same code snippet, ran it on Ruby 2.2.3 using HTTP 1.0.2 and immediately replicated the issue. It fails right away. The output from the first failure is below. I ran the snippet four more times and each one failed consistently for me.
|
@ixti @javierjulio thanks for the repro. Will look into it |
Just to explain a bit. It fails pretty randomly: url = "https://play.google.com/store/apps/" \
"category/GAMES/collection/top?start=400&num=100&hl=en&gl=us"
42.times do |i|
start = Time.now.to_f
print format("%3d ... ", i + 1)
begin
HTTP.timeout(:read => 5, :connect => 5, :write => 5).get(url).flush
puts "pass (#{Time.now.to_f - start})"
rescue
puts "fail (#{Time.now.to_f - start})"
raise
end
end It may fail at any cycle. Absolutely randomly. So if you will remove
|
Notice that each cycle is pretty same in timing. So it has nothing to do with actual timing. # https://github.com/httprb/http/blob/master/lib/http/timeout/null.rb#L78-L83
class Null
def rescue_readable
yield
rescue IO::WaitReadable
retry if @socket.to_io.wait_readable(read_timeout)
raise TimeoutError, "Read timed out after #{read_timeout} seconds"
end
end
# https://github.com/httprb/http/blob/master/lib/http/timeout/per_operation.rb#L59-L78
class PerOperation < Null
def readpartial(size)
loop do
# JRuby may still raise exceptions on SSL sockets even though
# we explicitly specify `:exception => false`
result = rescue_readable do
@socket.read_nonblock(size, :exception => false)
end
if result.nil?
return :eof
elsif result != :wait_readable
return result
end
unless @socket.to_io.wait_readable(read_timeout)
fail TimeoutError, "Read timed out after #{read_timeout} seconds"
end
end
end
end |
@ixti @javierjulio if this was working on an earlier version and regressed, can either of you do a git bisect and try to find the commit that introduced the bug? |
@tarcieri sure.
|
@ixti joy! Since you have it reproed, mind playing with it a bit more? That commit changed all 3 types of timeouts ( |
@tarcieri sure. :D |
But I won;t be able to till weekend. Just for the record. |
No worries |
@ixti well, I'm thinking I should revert this entirely unless we can narrow it down |
@tarcieri I've pushed a patch that makes snippet from the above work on all rubies. Also it passes all tests... |
@ixti odd... that is something that sticks out from my changes to the original code. The comment is actually documenting why it's using We may not have sufficient coverage for EAGAIN cases for SSL (they're rather difficult to test) |
After a discussion over an IRC, I agree that proposed patch is terribly wrong. |
After my attempt to simplify code in order to find a better fix, it happened that it requires more thoughtful code reading. I will try to find time over a week to come with a better fix. Meanwhile, @tarcieri if you're OK to revert that particular commit, probably that's the best option. If anybody wanna jump on fixing this, please do so! :D |
@tarcieri I saw the revert in master so I installed http from master locally and in a console ran the same script @ixti wrote. I fooled around with the timeouts too but I still keep getting frequent read errors, in fact more so than what we showed here. I made a small adjustment to the script to be sure the timeout is on read and it is. Although I could very well be doing something wrong. I did verify my install was using the master branch as it printed out the last git commit sha.
The script: url = "https://play.google.com/store/apps/" \
"category/GAMES/collection/top?start=400&num=100&hl=en&gl=us"
42.times do |i|
start = Time.now.to_f
print format("%3d ... ", i + 1)
begin
HTTP.timeout(:read => 10, :connect => 5, :write => 5).get(url).flush
puts "pass (#{Time.now.to_f - start})"
rescue HTTP::Error => error
puts "fail #{error} (#{Time.now.to_f - start})"
end
end The output:
|
@javierjulio it wasn't a full revert... I was trying to preserve the new changes to support an IO.wait backend. Can you try bisecting yourself and seeing if you get the same commit as @ixti? f4df840 I can try a full revert of this commit next if so... |
@tarcieri oops sorry about that! |
@javierjulio what Ruby was that on? I can't reproduce your problem |
I did a full revert in #320 and released that as 1.0.4. That should hopefully fix your problem. |
Considering this solved for now. Please reopen if you disagree. |
Not that I disagree... But unfortunately snippet still fails for me on ruby < 2.3.0: chruby | sed 's/^[^jr]*//' | while read x; do
chruby $x && echo "== $(ruby -v)"
ruby -Ilib -rhttp snippet.rb | cut -b8-12 | sort | uniq -c
done outputs
|
Snippet was executed against current master. |
@ixti can you repro against 1.0.4 too? |
Ok, I'm able to reproduce the problem on Ruby 2.2 for |
I think this actually fixes the problem: #322 |
Revert to IO.select for PerOperation timeouts (fixes #298)
@ixti @javierjulio please double check, but I think both |
@tarcieri sorry, I wasn't able to look at this sooner. I just tried this on Ruby 2.2.3 (Mac OS X latest) with http from |
Woop! The root cause is still mysterious, but I'm glad we got this narrowed down and resolved |
I confirm that |
I've been using the 0.9.x releases for a few months now and have since upgraded to 1.0.0 and then 1.0.2. I've noticed several HTTP::TimeoutError's on 1.0.x locally but I get it almost every single time I deploy a gem that runs integration tests on CircleCI (Ubuntu). If I update the two gems I've created that depend on http and downgrade it back to
add_dependency "http", '~> 0.9'
all my integration tests that are making live HTTP requests consistently pass without a single timeout error. At first it seemed that the 1.0.2 release was helping but the timeout errors still occur. I'm trying to debug the issue further but figured in the meantime I'd post this and see if anyone else has had a similar issue when upgrading to 1.0.x. As I learn more I'll report back.For configuring timeouts I went with a default of
{ write: 2, connect: 5, read: 10 }
. Increasing the read timeout value doesn't help. Note that this is per request. While I don't have stats I know that requests don't take 10 seconds each (just the read part fails). The same timeouts are used for http 0.9.x and all integration tests pass with no frequent timeout errors.I use
http
gem in the following JSON API client gems: synapse_payments and paysafe.Using Ruby 2.2.3 in app and each gem using http gem.
The text was updated successfully, but these errors were encountered: