Rate Limit Workflow #1107

Open
singlash26 opened this issue Nov 8, 2023 · 8 comments

@singlash26

Hi @zedeus, can you explain how the rate limit works here?

  1. From your responses to a few questions in the issues, I understand there is supposedly a limit of 500 requests per 15 minutes per guest account. So my question is: if I have 5 guest accounts in my JSONL file, does the limit simply multiply by 5, meaning 2500 requests can be made per 15 minutes?
  2. How does the system pick a guest account from the JSONL file for each request when there are multiple guest accounts? Is there any chance of request throttling if the requests are spread unevenly across the guest accounts?

Thanks in advance!

@zedeus
Owner

zedeus commented Nov 8, 2023

The main rate limit is a 15 minute window, starting from the first request made with a given account. Within this window there's a fixed number of requests you can make, depending on the endpoint:

nitter/src/auth.nim

Lines 11 to 23 in d175832

Api.search: 50,
Api.tweetDetail: 150,
Api.photoRail: 180,
Api.userTweets: 500,
Api.userTweetsAndReplies: 500,
Api.userMedia: 500,
Api.userRestId: 500,
Api.userScreenName: 500,
Api.tweetResult: 500,
Api.list: 500,
Api.listTweets: 500,
Api.listMembers: 500,
Api.listBySlug: 500

tweetResult is only used for embeds, tweetDetail is for the main /i/status/....
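
To illustrate how a window like this behaves, here is a minimal Python sketch of the bookkeeping for a single account. It is not nitter's actual implementation (which is in Nim): `WindowTracker`, `try_acquire`, and the `clock` parameter are hypothetical names, only a subset of the limits above is copied in, and keeping a separate window per endpoint is an assumption.

```python
import time

WINDOW_SECONDS = 15 * 60

# A subset of the per-endpoint limits quoted above.
LIMITS = {"search": 50, "tweetDetail": 150, "photoRail": 180, "userTweets": 500}


class WindowTracker:
    """Hypothetical bookkeeping for ONE account; not nitter's code."""

    def __init__(self, clock=time.monotonic):
        self.clock = clock
        # endpoint -> (window_start, requests_used)
        # Assumption: each endpoint has its own window.
        self.windows = {}

    def try_acquire(self, endpoint):
        """Return True and count the request if the window still has room."""
        now = self.clock()
        # The window starts at the first request made for this endpoint.
        start, used = self.windows.get(endpoint, (now, 0))
        if now - start >= WINDOW_SECONDS:
            # The previous window has expired, so start a fresh one.
            start, used = now, 0
        if used >= LIMITS[endpoint]:
            return False
        self.windows[endpoint] = (start, used + 1)
        return True
```

With one tracker per account, the naive ceiling is the per-endpoint limit multiplied by the number of accounts, and this sketch models only that main limit.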

Note however that most of these endpoints have a second rate limit which kicks in if you make requests too rapidly. Even 1 request per second is enough to hit it. I haven't yet done extensive testing to find the exact limit, but that's why you can't just multiply e.g. the search endpoint limit of 50 by the number of accounts you have. This does not happen with the main tweets endpoint (used for the Tweets tab on a profile, e.g. https://nitter.net/nasa), but pretty much all the other ones will "block" a guest account for the following 24 hours, and the block is applied across endpoints, not just the one that got you blocked. There are no IP-based limits that I'm aware of.
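
The 24 hour block could be tracked per account roughly as follows. This is a minimal sketch, not nitter's code: `BlockList`, `mark_blocked`, and `is_usable` are hypothetical names, and it assumes the caller marks an account as blocked when a request with it comes back rate limited (e.g. with a 429).

```python
import time

BLOCK_SECONDS = 24 * 60 * 60


class BlockList:
    """Hypothetical tracker for accounts that hit the second rate limit."""

    def __init__(self, clock=time.monotonic):
        self.clock = clock
        self.blocked_until = {}  # account id -> time the block ends

    def mark_blocked(self, account_id):
        # The block applies across endpoints, so it is keyed by the
        # account alone, not by (account, endpoint).
        self.blocked_until[account_id] = self.clock() + BLOCK_SECONDS

    def is_usable(self, account_id):
        # Accounts that were never blocked are always usable.
        return self.clock() >= self.blocked_until.get(account_id, float("-inf"))
```

Because the exact threshold of the second limit is unknown, the sketch only reacts to a block after it happens and does not try to predict it.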

Accounts are randomly selected, and pending requests are tracked so the same one isn't used for too many requests at a time.
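
A rough Python sketch of random selection with pending-request tracking, for illustration only. `Account`, `pick_account`, `release`, and the `MAX_PENDING` cap of 3 are all hypothetical; the real logic lives in nitter's Nim source and may differ.

```python
import random

MAX_PENDING = 3  # hypothetical cap on in-flight requests per account


class Account:
    def __init__(self, name):
        self.name = name
        self.pending = 0  # requests currently in flight


def pick_account(accounts, rng=random):
    """Pick a random account that is not already busy, or None."""
    candidates = [a for a in accounts if a.pending < MAX_PENDING]
    if not candidates:
        return None
    account = rng.choice(candidates)
    account.pending += 1
    return account


def release(account):
    """Call once the request made with this account has finished."""
    account.pending -= 1
```

With only a few accounts, random choice gives no guarantee of an even spread in the short term, so a single account can still receive several requests in quick succession.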

@singlash26
Author

Thanks for the detailed response @zedeus !

We are running our instance with one guest account for now. We tried sleep intervals of 10 and 15 seconds between consecutive requests, aiming for around 40 search requests within the 15 minute window. But we still got a rate limit exceeded error in the 2nd run. Any thoughts on this?

@zedeus
Owner

zedeus commented Nov 14, 2023

That's surprising, but not much can be said without lots of extensive testing.

@singlash26
Author

Hi @zedeus, we tried 3 guest accounts in the JSONL file with a similar approach and got a 429 again. However, it lasted a little longer this time (we got blocked while processing the 6th batch of those 40 requests per 15 minute window).

Could you please explain how the public instances use those guest accounts to make it work? Do you think we are missing something? Please advise.

Also, what is the significance of "tokenCount" in nitter.conf? We set it to 3 to match the number of guest accounts. Is that the correct approach?

@singlash26
Author

Hi @zedeus, did you get a chance to see the above comment? Any thoughts?

@zedeus
Owner

zedeus commented Nov 29, 2023

Public instances use 20k-50k guest accounts, not 3, that's how :) You can always visit /.health to check this information, e.g. nitter.net/.health

tokenCount is deprecated, as it was only used for the old guest tokens; it's being removed in #1116

@singlash26
Author

Right, I understand that the public instances are using thousands of guest accounts. But my question is: how are the guest accounts rotated or utilized to make sure we can hit 40-50 "/search" API requests per guest account per 15 minute window?

@zedeus
Owner

zedeus commented Nov 30, 2023

They're used randomly, but with a small number of accounts you inevitably run into their really strange rate limit.
