Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Timeout on start #2

Closed
mihaitodor opened this issue Jan 1, 2024 · 15 comments · Fixed by #4
Closed

Timeout on start #2

mihaitodor opened this issue Jan 1, 2024 · 15 comments · Fixed by #4
Assignees
Labels
enhancement New feature or request

Comments

@mihaitodor
Copy link

Hey @davep, thanks for taking the time to put this project together! I really like it!

One issue is that, on startup, it tries to make 500+ HTTP calls and, since the httpx default timeout is 5 seconds, some requests end up failing and the UI just shows a rather cryptic "[Errno 8] nodename nor servname provided, or not known" error when that happens. I think this is because it tries to fetch details about all the posts that the API sends instead of just the top 30, but some pagination would be better / faster. I haven't dug enough into the code to see how it's all wired up, but, at the very least, it would help to be able to configure the AsyncClient HTTP timeout in oshit/hn/client.py.

@davep
Copy link
Owner

davep commented Jan 1, 2024

Yup, right now, mainly out of curiosity about how well it might work in various environments1 (and thanks to the wonderful HackerNews API design), it just goes all-in on loading up the stories. I'm likely going to change it up a bit to either, as you suggest, do some form of pagination, or perhaps chunk up the requests and add them to each of the lists in the background.

Footnotes

  1. I've not been able to make it fail on anything I've tested with her, so was curious if anyone else might run it and if they might see an issue, so this is really useful, thanks.

@davep davep added the enhancement New feature or request label Jan 1, 2024
@davep davep self-assigned this Jan 1, 2024
@mihaitodor
Copy link
Author

You're welcome! Since I'm based in Ireland, I suspect the requests have to travel across the ocean, so it's slower to fetch all the data :)

@davep
Copy link
Owner

davep commented Jan 1, 2024

Pfft, I'm in Scotland, the packets have to get past you before they get to me! :-P

@davep
Copy link
Owner

davep commented Jan 1, 2024

@mihaitodor Are you able to check out a branch from this repo and test it for me?

@mihaitodor
Copy link
Author

@davep Of course! Which branch should I test?

@davep
Copy link
Owner

davep commented Jan 1, 2024

Awesome, I'll ping when I have it up in a wee but (there's zero rush to test of course). I've got an easy-to-plug-in approach I'd like to try out.

davep added a commit that referenced this issue Jan 1, 2024
Due to the way the HackerNews API works, getting items is a bit of a pain.
There's no "get me the data for all these items please", it's "get me a list
of IDs and them make me load every single on individually". To start with I
just went all in on loading items up in parallel; which works fine for me
but, eh, there are some situations where 500 simultaneous HTTP requests is
a *bit* over the top (despite it being 2023! IKR?!?).

So, in an initial effort to explore solutions for #2 without leaning into
pagination, let's have a play with limiting the number of concurrent
connections and see what works.
@davep
Copy link
Owner

davep commented Jan 1, 2024

Okay, there's be-more-chill-when-getting-stuff. In it I'm trying out using a Semaphone to limit the number of concurrent requests running at any one time. So now rather than 500 all at once, it'll be 10 lots of 50 all one after the other.

I'd be interested to see if this makes any difference for you; feel free to muck with the max concurrency value which is currently hard-coded at 50 if it doesn't help, just to see if it does make any difference at some level.

If this seems like a reasonable approach I'll expose it as a configuration value, or environment variable, or something; but ideally with a default that is a little more reasonable than the current nuclear option.

Yes, this is an attempt to not have to rework the whole design to use paging; at least for tonight. ;-)

@mihaitodor
Copy link
Author

mihaitodor commented Jan 1, 2024

No dice... It choked here on these requests:

https://hacker-news.firebaseio.com/v0/item/38817271.json
https://hacker-news.firebaseio.com/v0/item/38815893.json
https://hacker-news.firebaseio.com/v0/item/38795559.json
https://hacker-news.firebaseio.com/v0/item/38829927.json
https://hacker-news.firebaseio.com/v0/item/38827026.json
https://hacker-news.firebaseio.com/v0/item/38809728.json
https://hacker-news.firebaseio.com/v0/item/38809642.json
https://hacker-news.firebaseio.com/v0/item/38803160.json
https://hacker-news.firebaseio.com/v0/item/38822051.json
https://hacker-news.firebaseio.com/v0/item/38820152.json
https://hacker-news.firebaseio.com/v0/item/38814093.json
https://hacker-news.firebaseio.com/v0/item/38805811.json
https://hacker-news.firebaseio.com/v0/item/38809770.json
https://hacker-news.firebaseio.com/v0/item/38807001.json
https://hacker-news.firebaseio.com/v0/item/38818700.json
https://hacker-news.firebaseio.com/v0/item/38827090.json
https://hacker-news.firebaseio.com/v0/item/38813496.json
https://hacker-news.firebaseio.com/v0/item/38817128.json
https://hacker-news.firebaseio.com/v0/item/38813647.json
https://hacker-news.firebaseio.com/v0/item/38814652.json
https://hacker-news.firebaseio.com/v0/item/38814215.json
https://hacker-news.firebaseio.com/v0/item/38797076.json

LE: I'm running Python 3.11.4 if that helps. Also, I used python3 -m oshit to run it from the cloned repo on my Macbook.

@davep
Copy link
Owner

davep commented Jan 1, 2024

Damn. Thanks for trying. I was hoping that, as you suggested, the main issue was the number of concurrent requests.

So it does seem to be a pure timeout issue? Would you mind doing a quick test at your end, around line 76 in client.py, and add either a longer timeout, or even set timeout=None? If that seems to do the trick then I'll throw configuration for that in too (I mean, I'll do it anyway, it makes sense, but it'll be good to know if this is the route to go for your problem).

@mihaitodor
Copy link
Author

Adding timeout=10.0 to self._client.get(...) on the be-more-chill-when-getting-stuff branch does make the issue go away and I tried several times. On the main branch, I can't get it to work, not sure why. Dunno what the default value should be, but maybe something like 30.0 wouldn't be too bad.

@davep
Copy link
Owner

davep commented Jan 1, 2024

Okay, cool, thanks for that! Assuming when you say you couldn't get it to work on the main branch, you mean that it kept running into the same problem no matter the timeout... looks like it is a combination of timeout and maximum concurrent connections; which I guess makes a lot of sense.

In that case, for the moment anyway, I'll go ahead with this change. There's also going to be two new values in the config file (which lives in ~/.config/oshit/ -- I should probably mention that in the README now I say that) which can be used to tweak things too.

Also... hey, thanks for raising the issue. It's been really useful! Always a joy when someone runs into a problem and flags it up. :-)

@mihaitodor
Copy link
Author

Okay, cool, thanks for that!

You're welcome!

Assuming when you say you couldn't get it to work on the main branch, you mean that it kept running into the same problem no matter the timeout...

Yep!

Also... hey, thanks for raising the issue. It's been really useful! Always a joy when someone runs into a problem and flags it up. :-)

Happy to help! I'm also a daily HN reader and I like this TUI app a lot! I'm thinking to start using it instead of the web-based one, at least for skimming across posts, which is what I usually do, although I also upvote stuff that catches my eye, so it would be nice to also support some basic interactions like that in the future 😅

Not sure if you're aware of other similar sites, but there's also https://lobste.rs, https://slashdot.org and https://tildes.net which I also browse sometimes, but HN is my usual goto place for updates. Would be cool to support some of those as well 😅

@davep
Copy link
Owner

davep commented Jan 1, 2024

Oh man, I used to live on /. back in the day. Can't say I know the other two though. I'll take a look.

@davep
Copy link
Owner

davep commented Jan 1, 2024

v0.1.1 is up on PyPi. 🤞

@mihaitodor
Copy link
Author

Works as expected, thank you for the quick fix! 🥇

/. is still quite OK in terms of links and summaries and their MOTD statuses never get old. The mobile site is rubbish, though. Also, no idea about the comments section, since I never looked at those 😅

If you want a Lobsters invite, message me on Fosstodon and I'll get you sorted.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants