Timeout on start #2

mihaitodor · 2024-01-01T17:54:35Z

Hey @davep, thanks for taking the time to put this project together! I really like it!

One issue is that, on startup, it tries to make 500+ HTTP calls and, since the httpx default timeout is 5 seconds, some requests end up failing and the UI just shows a rather cryptic "[Errno 8] nodename nor servname provided, or not known" error when that happens. I think this is because it tries to fetch details about all the posts that the API sends instead of just the top 30, but some pagination would be better / faster. I haven't dug enough into the code to see how it's all wired up, but, at the very least, it would help to be able to configure the AsyncClient HTTP timeout in oshit/hn/client.py.

The text was updated successfully, but these errors were encountered:

davep · 2024-01-01T18:02:52Z

Yup, right now, mainly out of curiosity about how well it might work in various environments¹ (and thanks to the wonderful HackerNews API design), it just goes all-in on loading up the stories. I'm likely going to change it up a bit to either, as you suggest, do some form of pagination, or perhaps chunk up the requests and add them to each of the lists in the background.

I've not been able to make it fail on anything I've tested with her, so was curious if anyone else might run it and if they might see an issue, so this is really useful, thanks. ↩

mihaitodor · 2024-01-01T18:05:03Z

You're welcome! Since I'm based in Ireland, I suspect the requests have to travel across the ocean, so it's slower to fetch all the data :)

davep · 2024-01-01T18:06:48Z

Pfft, I'm in Scotland, the packets have to get past you before they get to me! :-P

davep · 2024-01-01T20:05:02Z

@mihaitodor Are you able to check out a branch from this repo and test it for me?

mihaitodor · 2024-01-01T20:06:20Z

@davep Of course! Which branch should I test?

davep · 2024-01-01T20:08:12Z

Awesome, I'll ping when I have it up in a wee but (there's zero rush to test of course). I've got an easy-to-plug-in approach I'd like to try out.

Due to the way the HackerNews API works, getting items is a bit of a pain. There's no "get me the data for all these items please", it's "get me a list of IDs and them make me load every single on individually". To start with I just went all in on loading items up in parallel; which works fine for me but, eh, there are some situations where 500 simultaneous HTTP requests is a *bit* over the top (despite it being 2023! IKR?!?). So, in an initial effort to explore solutions for #2 without leaning into pagination, let's have a play with limiting the number of concurrent connections and see what works.

davep · 2024-01-01T20:21:19Z

Okay, there's be-more-chill-when-getting-stuff. In it I'm trying out using a Semaphone to limit the number of concurrent requests running at any one time. So now rather than 500 all at once, it'll be 10 lots of 50 all one after the other.

I'd be interested to see if this makes any difference for you; feel free to muck with the max concurrency value which is currently hard-coded at 50 if it doesn't help, just to see if it does make any difference at some level.

If this seems like a reasonable approach I'll expose it as a configuration value, or environment variable, or something; but ideally with a default that is a little more reasonable than the current nuclear option.

Yes, this is an attempt to not have to rework the whole design to use paging; at least for tonight. ;-)

mihaitodor · 2024-01-01T20:30:27Z

No dice... It choked here on these requests:

https://hacker-news.firebaseio.com/v0/item/38817271.json
https://hacker-news.firebaseio.com/v0/item/38815893.json
https://hacker-news.firebaseio.com/v0/item/38795559.json
https://hacker-news.firebaseio.com/v0/item/38829927.json
https://hacker-news.firebaseio.com/v0/item/38827026.json
https://hacker-news.firebaseio.com/v0/item/38809728.json
https://hacker-news.firebaseio.com/v0/item/38809642.json
https://hacker-news.firebaseio.com/v0/item/38803160.json
https://hacker-news.firebaseio.com/v0/item/38822051.json
https://hacker-news.firebaseio.com/v0/item/38820152.json
https://hacker-news.firebaseio.com/v0/item/38814093.json
https://hacker-news.firebaseio.com/v0/item/38805811.json
https://hacker-news.firebaseio.com/v0/item/38809770.json
https://hacker-news.firebaseio.com/v0/item/38807001.json
https://hacker-news.firebaseio.com/v0/item/38818700.json
https://hacker-news.firebaseio.com/v0/item/38827090.json
https://hacker-news.firebaseio.com/v0/item/38813496.json
https://hacker-news.firebaseio.com/v0/item/38817128.json
https://hacker-news.firebaseio.com/v0/item/38813647.json
https://hacker-news.firebaseio.com/v0/item/38814652.json
https://hacker-news.firebaseio.com/v0/item/38814215.json
https://hacker-news.firebaseio.com/v0/item/38797076.json

LE: I'm running Python 3.11.4 if that helps. Also, I used python3 -m oshit to run it from the cloned repo on my Macbook.

davep · 2024-01-01T20:35:54Z

Damn. Thanks for trying. I was hoping that, as you suggested, the main issue was the number of concurrent requests.

So it does seem to be a pure timeout issue? Would you mind doing a quick test at your end, around line 76 in client.py, and add either a longer timeout, or even set timeout=None? If that seems to do the trick then I'll throw configuration for that in too (I mean, I'll do it anyway, it makes sense, but it'll be good to know if this is the route to go for your problem).

mihaitodor · 2024-01-01T21:22:45Z

Adding timeout=10.0 to self._client.get(...) on the be-more-chill-when-getting-stuff branch does make the issue go away and I tried several times. On the main branch, I can't get it to work, not sure why. Dunno what the default value should be, but maybe something like 30.0 wouldn't be too bad.

davep · 2024-01-01T21:27:39Z

Okay, cool, thanks for that! Assuming when you say you couldn't get it to work on the main branch, you mean that it kept running into the same problem no matter the timeout... looks like it is a combination of timeout and maximum concurrent connections; which I guess makes a lot of sense.

In that case, for the moment anyway, I'll go ahead with this change. There's also going to be two new values in the config file (which lives in ~/.config/oshit/ -- I should probably mention that in the README now I say that) which can be used to tweak things too.

Also... hey, thanks for raising the issue. It's been really useful! Always a joy when someone runs into a problem and flags it up. :-)

mihaitodor · 2024-01-01T21:34:12Z

Okay, cool, thanks for that!

You're welcome!

Assuming when you say you couldn't get it to work on the main branch, you mean that it kept running into the same problem no matter the timeout...

Yep!

Also... hey, thanks for raising the issue. It's been really useful! Always a joy when someone runs into a problem and flags it up. :-)

Happy to help! I'm also a daily HN reader and I like this TUI app a lot! I'm thinking to start using it instead of the web-based one, at least for skimming across posts, which is what I usually do, although I also upvote stuff that catches my eye, so it would be nice to also support some basic interactions like that in the future 😅

Not sure if you're aware of other similar sites, but there's also https://lobste.rs, https://slashdot.org and https://tildes.net which I also browse sometimes, but HN is my usual goto place for updates. Would be cool to support some of those as well 😅

davep · 2024-01-01T21:48:04Z

Oh man, I used to live on /. back in the day. Can't say I know the other two though. I'll take a look.

davep · 2024-01-01T21:52:36Z

v0.1.1 is up on PyPi. 🤞

mihaitodor · 2024-01-02T01:49:54Z

Works as expected, thank you for the quick fix! 🥇

/. is still quite OK in terms of links and summaries and their MOTD statuses never get old. The mobile site is rubbish, though. Also, no idea about the comments section, since I never looked at those 😅

If you want a Lobsters invite, message me on Fosstodon and I'll get you sorted.

davep added the enhancement New feature or request label Jan 1, 2024

davep self-assigned this Jan 1, 2024

davep mentioned this issue Jan 1, 2024

Slow down a little when getting items from the HackerNews API #4

Merged

davep closed this as completed in #4 Jan 1, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Timeout on start #2

Timeout on start #2

mihaitodor commented Jan 1, 2024

davep commented Jan 1, 2024

mihaitodor commented Jan 1, 2024

davep commented Jan 1, 2024

davep commented Jan 1, 2024

mihaitodor commented Jan 1, 2024

davep commented Jan 1, 2024

davep commented Jan 1, 2024

mihaitodor commented Jan 1, 2024 •

edited

Loading

davep commented Jan 1, 2024

mihaitodor commented Jan 1, 2024

davep commented Jan 1, 2024

mihaitodor commented Jan 1, 2024

davep commented Jan 1, 2024

davep commented Jan 1, 2024

mihaitodor commented Jan 2, 2024

Timeout on start #2

Timeout on start #2

Comments

mihaitodor commented Jan 1, 2024

davep commented Jan 1, 2024

Footnotes

mihaitodor commented Jan 1, 2024

davep commented Jan 1, 2024

davep commented Jan 1, 2024

mihaitodor commented Jan 1, 2024

davep commented Jan 1, 2024

davep commented Jan 1, 2024

mihaitodor commented Jan 1, 2024 • edited Loading

davep commented Jan 1, 2024

mihaitodor commented Jan 1, 2024

davep commented Jan 1, 2024

mihaitodor commented Jan 1, 2024

davep commented Jan 1, 2024

davep commented Jan 1, 2024

mihaitodor commented Jan 2, 2024

mihaitodor commented Jan 1, 2024 •

edited

Loading