update okta-sdk-golang to v2.9.1 #13439
Merged
It is possible for Vault to get into a deadlock with the current version of the Okta SDK based on how it handles backoff for 429 responses. The SDK trusts the backoff time from the Okta API without any means of short-circuiting with a maximum backoff limit.

We have witnessed long lived Okta login requests resulting in the k8s service registry incorrectly labeling two nodes as `vault-active=true`. In this scenario, Okta login request goroutines were sleeping in a function called `backoffPause`. The following is the call chain from Vault’s Okta backend:

- `Login` - builtin/credential/okta/backend.go:291
- `getOktaGroups` - builtin/credential/okta/backend.go:340
- `ListUserGroups` - okta-sdk-golang/v2/okta/user.go:519
- `Do` - okta-sdk-golang/v2/okta/requestExecutor.go:237
- `doWithRetries` - okta-sdk-golang/v2/okta/requestExecutor.go:284
- `backoffPause` - okta-sdk-golang/v2/okta/requestExecutor.go:314

In order to enter `backoffPause`, Vault needed to receive a `429 Too Many Requests` response from the Okta API. The implementation of the `backoffPause` function in version 2.0.0 of the SDK trusts the `X-Rate-Limit-Reset` header value from the Okta API which is used to calculate the sleep time. My theory is that the SDK calculated an extremely large backoff time given that the default max retries is set to 2. The backoff logic has since changed in newer versions of the SDK by introducing a config parameter `Okta.Client.RateLimit.MaxBackoff` which defaults to 30 seconds when using `NewClient` (see builtin/credential/okta/path_config.go:326). The important change is that if the SDK calculates a backoff time using the response headers that exceeds the one configured on the Okta client, the SDK will default to using the one configured on the client, see okta-sdk-golang/requestExecutor.go at v2.9.1.

This PR updates `okta-sdk-golang` to v2.9.1 in order to consume refactored request handling logic which introduces this maximum backoff limit. In addition to the existing unit tests, I have also manually tested the Okta auth method with MFA to ensure that there have not been any regressions.