Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

core: revoke the proper token on partial failures from token-related requests #7835

Merged
merged 4 commits into from
Nov 8, 2019

Conversation

calvn
Copy link
Contributor

@calvn calvn commented Nov 7, 2019

During the request flow, if a request from auth/token/ resulted in a non-empty resp.Auth, we would register the token present in there to the expiration manager. If we encounter an error during this process, we attempt to perform cleanup by revoking said token.

However, the call to revoking the token via c.expiration.revokeOrphan was using the client token from the request as opposed to the token in resp.Auth. This led to a case where the client token would be revoked if failures triggered this cleanup, such as a RegisterAuth error that would result from a storage write failure.

The chances of triggering this failure is fairly slim, as it would have to succeed on writes in the token store, but fail right when it tries to register the lease. Nonetheless, a cluster that has unstable connectivity to its storage and that issues a lot of token creation request could trigger this from time to time.

@calvn calvn added this to the 1.3 milestone Nov 7, 2019
@calvn calvn requested a review from briankassouf November 7, 2019 21:36
@robloxrob
Copy link

Thank you for all the hard work on this code!

Copy link

@roberteckert roberteckert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for giving this attention.

vault/core.go Outdated Show resolved Hide resolved
@@ -1209,7 +1217,9 @@ func (c *Core) RegisterAuth(ctx context.Context, tokenTTL time.Duration, path st
case logical.TokenTypeService:
// Register with the expiration manager
if err := c.expiration.RegisterAuth(ctx, &te, auth); err != nil {
c.tokenStore.revokeOrphan(ctx, te.ID)
if err := c.tokenStore.revokeOrphan(ctx, te.ID); err != nil {
c.logger.Warn("failed to clean up token lease", "request_path", path, "error", err)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps failing here should produce a slightly different log message to make orphan revocation by ID distinct from revocation by client token lease. Something like failed to clean up token lease - id not found vs failed to clean up token lease - not found. If err will be sufficiently unique and the logger in fact captures that output, then this isn't necessary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you referring to the difference between using te.ID here and resp.Auth.ClientToken above? The revoke orphan call here ultimately points to to the same thing (i.e. the token ID that we're trying to revoke) since the auth that we pass into this func is from resp.Auth and we set auth.ClientToken = te.ID up in L1208. The revocation is not by client token lease ID.

The reason why we're able to use te.ID is because we instantiate a new instance of te within the func, and in all three revoke orphan cases we're now referencing the token ID that's being created instead of the request's client token.

Copy link

@roberteckert roberteckert Nov 9, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I was indeed referring to te.ID vs resp.Auth.ClientToken. If it's all the same value and call then the messaging is sensible. Thanks.

@jefferai
Copy link
Member

jefferai commented Nov 7, 2019

Is there a reason you put a test in the http package? If you put it in the vault package you wouldn't need to expose a public function just to read the value in the test.

@calvn
Copy link
Contributor Author

calvn commented Nov 8, 2019

@jefferai Moved the test over to the vault package. I also moved the test trigger into the expiration manager since we don't really need to expose it to core now the trigger can be accessed internally by the test.

briankassouf
briankassouf previously approved these changes Nov 8, 2019
Copy link
Contributor

@briankassouf briankassouf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was a good find! I agree with the outstanding comment about the log message, but otherwise looks great!

briankassouf
briankassouf previously approved these changes Nov 8, 2019
@calvn calvn merged commit eb74829 into master Nov 8, 2019
@calvn calvn deleted the token-revocation-fix branch November 8, 2019 21:14
calvn added a commit that referenced this pull request Nov 8, 2019
…requests (#7835)

* core: revoke the proper token on partial failures from token-related requests

* move test to vault package, move test trigger to expiration manager

* update logging messages for clarity

* docstring fix
briankassouf pushed a commit that referenced this pull request Nov 8, 2019
…requests (#7835) (#7846)

* core: revoke the proper token on partial failures from token-related requests

* move test to vault package, move test trigger to expiration manager

* update logging messages for clarity

* docstring fix
Copy link

@notnoop notnoop left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

@@ -860,7 +860,11 @@ func (c *Core) handleRequest(ctx context.Context, req *logical.Request) (retResp

_, identityPolicies, err := c.fetchEntityAndDerivedPolicies(ctx, tokenNS, resp.Auth.EntityID)
if err != nil {
c.tokenStore.revokeOrphan(ctx, te.ID)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we probably should log the error here as well?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants