-
Notifications
You must be signed in to change notification settings - Fork 310
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SDK-1975] Use exponential backoff rather than rate limit headers #538
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice and simple, love it!
It might be nice to allow all the retry options to be passed in as an object. Not necessary but could make it easier to adjust based on what the time/rate limits are.
One thing that came to mind ... we're removing the reliance on the headers, which means we're ignoring info we have that mind make us skip retry altogether. The management API limit might be hit within, say, 10 seconds and not have anything available for another 50. So we might try 4 additional when we already know that we'll hit a 429 for 50 more seconds.
It might be that this kind of logic would complicate things and we'd, at best, only save ~4 API calls. Just wanted to bring it up.
Thanks @joshcanhelp!
Looking at that management api rate limit doc, it says:
Which tells me that requests are added back to the bucket at a rate of 60/per_minute_rate_limit per second, so when you say "hit within, say, 10 seconds and not have anything available for another 50" - I would expect only an endpoint that was limited at '1 per minute' would behave like this - and I don't see any endpoints like that (except possibly the signing key rotation one) Although, looking at that doc, I wonder if I should make the |
Good idea 👍 |
@adamjmcgrath - Ah, I see, it's a rolling check ... makes more sense than "sorry, this API won't work for another minute" 😆 thank god I'm not in charge of setting those limits! In that case, with access to all the vars, this looks great! I'll bump the version in our Rule utility library and update the docs once it's ready. |
@adamjmcgrath looks like the codecov task is failing due to a decrease in coverage. Is that something that should be addressed in this PR, or do we need to re-evaluate our coverage requirements? @davidpatrick will be good for you to review this just to ensure there's no risk here that I can't see. It looks good to me 👍 |
Thanks @jimmyjames - it's the project coverage threshold, we should probably ignore them. I deleted a bunch of code that had 100% coverage so the overall level has decreased. The only way to make that green would be to go and find some unrelated uncovered code and write some tests for it (which I'm happy to do, just not in this pr)
I'm going to get someone from #iam-core-foundations to take a look as well, I just haven't gotten around to pinging them yet |
# Conflicts: # src/RetryRestClient.js
@jimmyjames @davidpatrick #iam-core-foundation has no issues so if one of you could approve, I'll go ahead and merge |
Changes
Currently
RetryRestClient
uses theX-RateLimit-Reset
header to calculate retry duration, eg if you hit a per minute rate limit, the backoff for 10 retries will be: [60s, 60s, ..., 60s, 60s]But the
X-RateLimit-Reset
only tells when thebucket
of available requests is full - not when there might be a request available in the bucket to use. So 60s will likely be too long to wait, and cause things like Webtasks to fail.Instead we are switching to a generic exponential backoff with jitter, starting at ~1sec, eg for 10 retries: [ 1-2s, 2-4s, 4-8s, 8-16s, 16-32s, 32-64s, 64-128s, 128-256s, 256-512s, 512-1024s ]
Calculated by
(1 + Math.random()) * 1sec * Math.pow(2, attemptNumber)
- happy to tweak these numbers if we wantThe SDK already uses retry - so we can leverage the default behaviour of this library
References
https://auth0team.atlassian.net/browse/ESD-8390
https://auth0.com/docs/policies/rate-limit-policy#handle-rates-limitations-in-code
https://auth0.com/docs/policies/rate-limit-policy/management-api-endpoint-rate-limits
https://github.com/tim-kos/node-retry
Testing
- [ ] This change adds integration test coverage