-
Notifications
You must be signed in to change notification settings - Fork 4.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Store and return RPC error in ACL cache entries #12885
Conversation
agent/consul/acl_test.go
Outdated
// Attempt to resolve the token - this should set up the cache entry with a nil | ||
// identity and permission denied error. | ||
authz, err := r.ResolveToken(secretID) | ||
require.NoError(t, err) | ||
require.NotNil(t, authz) | ||
entry := r.cache.GetIdentity(tokenSecretCacheID(secretID)) | ||
require.Nil(t, entry.Identity) | ||
require.Equal(t, acl.ErrPermissionDenied, entry.Error) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The original context was that after this initial ResolveToken
primes the cache, the followup call would fail on acl.ErrNotFound
and not apply the ACLDownPolicy (bug since it should fail on ACLRemoteError
and apply the ACLDownPolicy).
I think this test needs to repeat the above chunk (you'll see that it fails the second time around because ResolveToken
returns the cached error)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After applying the proposed change in acl.go
the second ResolveToken
call passes for me
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I wasn't clear:
authz, err := r.ResolveToken(secretID)
require.NoError(t, err)
require.NotNil(t, authz)
entry := r.cache.GetIdentity(tokenSecretCacheID(secretID))
require.Nil(t, entry.Identity)
require.Equal(t, acl.ErrPermissionDenied, entry.Error)
authz, err = r.ResolveToken(secretID)
require.NoError(t, err) <- your implementation fails here but it shouldn't
require.NotNil(t, authz)
entry = r.cache.GetIdentity(tokenSecretCacheID(secretID))
require.Nil(t, entry.Identity)
require.Equal(t, acl.ErrPermissionDenied, entry.Error)
@@ -416,7 +416,7 @@ func (r *ACLResolver) resolveIdentityFromToken(token string) (structs.ACLIdentit | |||
waitForResult := cacheEntry == nil || r.config.ACLDownPolicy != "async-cache" | |||
if !waitForResult { | |||
// waitForResult being false requires the cacheEntry to not be nil | |||
return cacheEntry.Identity, nil | |||
return cacheEntry.Identity, cacheEntry.Error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the caller has special handling for ACLRemoteError
, we need to make sure we wrap the error like we do in L430, otherwise we'll miss the ACLRemoteError
codepath
return cacheEntry.Identity, cacheEntry.Error | |
if cacheEntry.Error != nil && !acl.IsErrNotFound(cacheEntry.Error) { | |
return cacheEntry.Identity, ACLRemoteError{Err: cacheEntry.Error} | |
} | |
return cacheEntry.Identity, cacheEntry.Error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I totally missed this part about needing the return the err wrapped in an ACLRemoteError
- I thought the bug was only that we weren't caching the error at all and returning an ErrNotFound when we shouldn't be
agent/consul/acl_test.go
Outdated
localTokens: false, | ||
localPolicies: false, | ||
tokenReadFn: func(_ *structs.ACLTokenGetRequest, reply *structs.ACLTokenResponse) error { | ||
return acl.ErrPermissionDenied |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
slight nit but can we return something like testErr := fmt.Errorf("network error")
?
The type of error returned here isn't too important but I think using an error from the acl
package (and the fact that we have special handling for a bunch of ACL error types) makes it a bit confusing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to add special handling to the newly returned errors and should update the test a bit to replicate the bug this PR aims to address
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Thanks for working on this. Hopefully this cuts down on all the flakes that the bug subtly caused
🍒 If backport labels were added before merging, cherry-picking will start automatically. To retroactively trigger a backport after merging, add backport labels and re-run https://circleci.com/gh/hashicorp/consul/659162. |
This PR adds an
Error
field to theIdentityCacheEntry
to preserve the error returned when attempting to resolve a token. Currently we returnacl.ErrNotFound
in this case which bubbles up to the client agent that made the request and the down policy ends up not being respected if it's set toallow
.