HTTP::Client Confusion #4947

Closed
Thyra opened this issue Sep 10, 2017 · 4 comments

Comments

@Thyra
Contributor

Thyra commented Sep 10, 2017

Hi,
I want to check a million URLs for validity (does the domain still exist, has the resource moved, etc.), and I am using HTTP::Client.get concurrently for that. The only problem is its very long default timeout, which causes my application to hang for minutes at a time. By instantiating HTTP::Client I can adjust the timeouts with #connect_timeout= and #read_timeout=. But now I can't get the actual request to work properly. I have experimented with #exec and #get and got the best results with this:

require "http/client"

url = "http://schlossereinoll.de"
c = HTTP::Client.new(URI.parse(url).host.not_nil!)
c.connect_timeout = 30 # seconds
c.read_timeout = 30 # seconds
response = c.exec "GET", url
c.close

For most URLs that works, but with certain ones (for example http://sprintmetal.de) I get a 400 response and I don't know why.
So my question is: how do you execute a GET request with custom timeouts?

@Thyra
Contributor Author

Thyra commented Sep 11, 2017

I did find a way and it looks like this:

require "http/client"

url = "http://schlossereinoll.de"
uri_obj = URI.parse(url)

client = HTTP::Client.new(uri_obj)
client.connect_timeout = 30 # seconds
client.read_timeout = 30 # seconds

request = !(uri_obj.path.nil? || uri_obj.path == "") ? uri_obj.path.not_nil! : "/"
request += "?" + uri_obj.query.not_nil! unless uri_obj.query.nil? || uri_obj.query == ""
response = client.get(request)

client.close

I had to rebuild the request string manually, which I think is really ugly and not something everyone should have to do for themselves. So why not introduce a #get method similar to .get, where you can just pass a string or a URI and the method extracts what it needs?

@asterite
Member

require "http/client"

uri = URI.parse("http://api.icndb.com/jokes/random?firstName=John&lastName=Doe")

HTTP::Client.new(uri) do |client|
  client.connect_timeout = 30 # seconds
  client.read_timeout = 30    # seconds
  response = client.get uri.full_path
  puts response.body
end

@RX14
Contributor

RX14 commented Sep 11, 2017

This has been solved, but next time please use Stack Overflow or the mailing list to ask for help instead of the issue tracker.

@straight-shoota
Member

Though it would make sense to allow passing a URI directly instead of enforcing a string.
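
Until something like that exists, the pattern from this thread can be wrapped in a small helper. The following is only a minimal sketch: `request_target_for` and `get_with_timeouts` are hypothetical names, not part of the standard library, and the 30-second default is just the value used above. The path is rebuilt by hand so the sketch does not depend on `URI#full_path`, which, as far as I know, newer Crystal releases expose as `URI#request_target` instead.

```crystal
require "http/client"

# Hypothetical helper: builds the path-and-query part of a URI,
# falling back to "/" when the URI has no path.
def request_target_for(uri : URI) : String
  path = uri.path.to_s # to_s also covers versions where path may be nil
  path = "/" if path.empty?
  query = uri.query
  target = path
  target += "?" + query if query && !query.empty?
  target
end

# Hypothetical wrapper: performs a GET with explicit timeouts and
# closes the client when the block finishes.
def get_with_timeouts(url : String, timeout : Time::Span = 30.seconds) : HTTP::Client::Response
  uri = URI.parse(url)
  HTTP::Client.new(uri) do |client|
    client.connect_timeout = timeout
    client.read_timeout = timeout
    client.get(request_target_for(uri))
  end
end
```

With this in place, a call such as `get_with_timeouts("http://schlossereinoll.de").status_code` sends only the path and query to the server rather than the full URL, and a slow host raises a timeout error after the given span instead of blocking for minutes.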
