-
Notifications
You must be signed in to change notification settings - Fork 4.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
core: revoke the proper token on partial failures from token-related requests #7835
Conversation
Thank you for all the hard work on this code! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for giving this attention.
vault/request_handling.go
Outdated
@@ -1209,7 +1217,9 @@ func (c *Core) RegisterAuth(ctx context.Context, tokenTTL time.Duration, path st | |||
case logical.TokenTypeService: | |||
// Register with the expiration manager | |||
if err := c.expiration.RegisterAuth(ctx, &te, auth); err != nil { | |||
c.tokenStore.revokeOrphan(ctx, te.ID) | |||
if err := c.tokenStore.revokeOrphan(ctx, te.ID); err != nil { | |||
c.logger.Warn("failed to clean up token lease", "request_path", path, "error", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps failing here should produce a slightly different log message to make orphan revocation by ID distinct from revocation by client token lease. Something like failed to clean up token lease - id not found
vs failed to clean up token lease - not found
. If err
will be sufficiently unique and the logger in fact captures that output, then this isn't necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you referring to the difference between using te.ID
here and resp.Auth.ClientToken
above? The revoke orphan call here ultimately points to to the same thing (i.e. the token ID that we're trying to revoke) since the auth
that we pass into this func is from resp.Auth
and we set auth.ClientToken = te.ID
up in L1208. The revocation is not by client token lease ID.
The reason why we're able to use te.ID
is because we instantiate a new instance of te
within the func, and in all three revoke orphan cases we're now referencing the token ID that's being created instead of the request's client token.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I was indeed referring to te.ID
vs resp.Auth.ClientToken
. If it's all the same value and call then the messaging is sensible. Thanks.
Is there a reason you put a test in the http package? If you put it in the vault package you wouldn't need to expose a public function just to read the value in the test. |
@jefferai Moved the test over to the vault package. I also moved the test trigger into the expiration manager since we don't really need to expose it to core now the trigger can be accessed internally by the test. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was a good find! I agree with the outstanding comment about the log message, but otherwise looks great!
…requests (#7835) * core: revoke the proper token on partial failures from token-related requests * move test to vault package, move test trigger to expiration manager * update logging messages for clarity * docstring fix
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me.
@@ -860,7 +860,11 @@ func (c *Core) handleRequest(ctx context.Context, req *logical.Request) (retResp | |||
|
|||
_, identityPolicies, err := c.fetchEntityAndDerivedPolicies(ctx, tokenNS, resp.Auth.EntityID) | |||
if err != nil { | |||
c.tokenStore.revokeOrphan(ctx, te.ID) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we probably should log the error here as well?
During the request flow, if a request from
auth/token/
resulted in a non-emptyresp.Auth
, we would register the token present in there to the expiration manager. If we encounter an error during this process, we attempt to perform cleanup by revoking said token.However, the call to revoking the token via
c.expiration.revokeOrphan
was using the client token from the request as opposed to the token inresp.Auth
. This led to a case where the client token would be revoked if failures triggered this cleanup, such as aRegisterAuth
error that would result from a storage write failure.The chances of triggering this failure is fairly slim, as it would have to succeed on writes in the token store, but fail right when it tries to register the lease. Nonetheless, a cluster that has unstable connectivity to its storage and that issues a lot of token creation request could trigger this from time to time.