Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make the validation of the access token returned with the code grant optional #384

Closed

Conversation

sberyozkin
Copy link

@sberyozkin sberyozkin commented Apr 29, 2020

Motivation:

As shown in quarkusio/quarkus#8396, OIDC providers may return the opaque access tokens as part of the code flow - such tokens do not have to be decoded as JWT and also introspected as opaque tokens - they are only for the client web applications to propagate such a token to the remote endpoint.

It will resolve #381 once 3.9.x is updated.

@pmlopes Hi Paulo, The master code is different from the 3.9.x branch so I'll need to follow up with another PR for 3.9.x, correct once this PR is ready ? Thanks

CC @cescoffier

@vietj vietj requested a review from pmlopes May 4, 2020 08:26
@vietj vietj added this to the 3.9.1 milestone May 4, 2020
@vietj
Copy link
Member

vietj commented May 4, 2020

can you review this @pmlopes ?

@vietj vietj modified the milestones: 3.9.1, 4.0.0 May 4, 2020
Copy link
Contributor

@pmlopes pmlopes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before merge this, I am a bit concerned because I don't see what the PR is trying to fix. Just looking at the changes it looks like the fix will just avoid logs to be filled with debug information.

Maybe in order to us to understand what the underline issue, we need a test with a dummy response as sent by Auth0 with this strange token response.

@sberyozkin can you provide a response as it comes back from the idp?, with that we could add a test to see what is happening.

@sberyozkin
Copy link
Author

sberyozkin commented May 6, 2020

Hi @pmlopes Thanks for the review so far, sorry I missed your comments. I'm actually looking at how to test it right now and came here to ask a question :-). Let me get some payload, it is Auth0, but I can get a test payload.

Just to summarize what I am trying to fix for Quarkus. In Quarkus, when we have a code flow, IdToken is verified and used for RBAC. AccessToken is made available for the injection into a service code only (either as the opaque token or JWT), we don't verify it at all. This is because in the code flow, the access token is only meant for the service endpoint to use it on behalf of the user and invoke some remote endpoint.

Vertx OAuth2 verifies both IdToken and AccessToken when it gets them as part of the code flow grant.
It works in most cases because with Keycloak we get the access token in a JWT format.
But the user has reported a case where with Auth0 the opaque token is binary.

@sberyozkin
Copy link
Author

sberyozkin commented May 6, 2020

HTTP/1.1 200 OK
Content-Type: application/json
Cache-Control: no-store
Pragma: no-cache
{
    "access_token": "SlAV32hkKG",
    "token_type": "Bearer",
    "refresh_token": "8xLOxBtZp8",
    "expires_in": 3600,
    "id_token": "eyJ..."
}

This is from https://auth0.com/docs/api-auth/tutorials/adoption/authorization-code.
Note id_token is JWT, access_token is binary/opaque (https://auth0.com/docs/tokens/concepts/access-tokens).

Paulo, @pmlopes, so I can try to test it but I'd appreciate some hints :-)
There is OAuth2AuthCodeTest and it returns an opaque access token. But it is not verified in the test because JWT.isUnsecured returns true.
So may be we can get JWT.isUnsecured return false by adding some JWK to JWT and set this new property to disable the verification, otherwise the test will fail...
What do you think ?

@sberyozkin
Copy link
Author

@pmlopes I've managed to get the test server returning JWK set, and indeed the exception is caught here and the test still passes. Let me try to reproduce it. But I'm getting a bit more comfortable with testing it :-)

@sberyozkin
Copy link
Author

@pmlopes I've asked @majonga88 to clarify.
It may be this PR does not really help much, except that we won't see the JWT decoding exception in the logs which is a bit noisy when AT is expected to be opaque.

@sberyozkin sberyozkin force-pushed the code_flow_at_validation branch from 210d0cd to 364b150 Compare May 6, 2020 16:28
@sberyozkin
Copy link
Author

@pmlopes Hi, so I've improved the test a bit and removed the code adding an access token as an attribute, as long as it is available as an opaque access token that should be fine.
How does it look now ?
May be only rename validateCodeFlowAccessToken to simply validateAccessToken ? If the Quarkus/etc user knows the opaque token is coming then they would set validateAccessToken to false and there would be no attempt to decode it as JWT, which is a bit better than always trying to decode it ?

Thanks

@pmlopes
Copy link
Contributor

pmlopes commented May 8, 2020

@sberyozkin I've just copied your test to the current master branch. I've run the tests and they all pass. So my question, like before (but I probably am not expressing it right) is, what is the flag doing? imho, it basically only avoids the logs to be fulled with stacktraces but behavior wise, there's no change.

I think The goal here, instead of adding more flags is to create better logs. Is this it?

@pmlopes
Copy link
Contributor

pmlopes commented May 8, 2020

I've merged only the test on master for illustration purposes. As you can see the flag doesn't seem to affect in any way the result: https://travis-ci.org/github/vert-x3/vertx-auth/builds/684629720

@sberyozkin
Copy link
Author

sberyozkin commented May 8, 2020

@pmlopes I thought the goal would be to let the Quarkus users avoid the redundant verification of access tokens in the code flow since as I said they are not used by default and only made available for further propagation by the service code if it needs it.
Right now:

  • if the access token is JWT it is verified, it passes, but the verification is not needed
  • if the access token is opaque then it is verified, and the error is in the logs, but it should really be not there at all in this case

So this flag lets users optimize (by skipping the redundant verification), and also keep a clean log

If we have a bearer access token coming in the Authorization header then we need to verify it. If we have obtained IdToken + Access token as part of the code flow then we only need an id token verification.

See what I mean ?

@sberyozkin
Copy link
Author

@pmlopes Without the flag we won't be able to improve the logs, since the default behaviour is to verify all the tokens (AT + ID) and we can't change it I guess. In case someone uses them interchangeably (ID token is effectively = access token). So the flag lets cleanly do it.
May be if the flag is enabled then we can improve the log by saying: "Acess token verification is skipped" ?
What do you think ?

@sberyozkin
Copy link
Author

sberyozkin commented May 8, 2020

The only thing I'm not really sure about is how to name this flag correctly. skipAcquiredAccessTokenVerification is long but the idea is to say: if you get a token from the token endpoint then if this property is enabled then do no verify it...validateCodeFlowAccessToken is not bad but it is specific to a code grant, in general if one gets a token with whatever grant then this token is meant to be propagated somewhere...hence it will up to the target to verify it

@sberyozkin
Copy link
Author

sberyozkin commented May 8, 2020

We could've just checked if 3 parts are there, if not, not JWT and as such - skip throwing the verification exception as it is done now and also log about it, but it is risky because may be someone does expect AT as JWT and also even if it is JWT, then, as noted above, if the service only wants to use to invoke some other service on the user behalf, then there is no point in verifying it... So the flag would do well IMHO.
No problems if you are not convinced :-) You are right it still works.
But I should say I (and the user) thought it was a bug when we saw this error validation error in the log with Auth0 returning an opaque access token :-)

@pmlopes
Copy link
Contributor

pmlopes commented May 8, 2020

Ok, so we agree that there is no behavior issue but it's a cosmetic feature to avoid less logging. In that case the the solution should be changing the runtime logging config to something like:

handlers=java.util.logging.ConsoleHandler
# allow the logger to still log at it's finest
java.util.logging.ConsoleHandler.level=FINEST

# generally we only care about WARN and up
.level=WARN
# as the level has been lowered to trace on master this will avoid it to be printed
io.vertx.ext.auth.oauth2.level=INFO

When things go wrong and users don't know why, then without rebuilding the application, they can change the log level to FINEST and see the whole debug and trace information, which will help debug why the tokens are not parsed.

handlers=java.util.logging.ConsoleHandler
# allow the logger to still log at it's finest
java.util.logging.ConsoleHandler.level=FINEST

# generally we only care about WARN and up
.level=WARN
# as the level has been lowered to trace on master this will avoid it to be printed
io.vertx.ext.auth.oauth2.level=FINEST

This imho is the correct solution instead of adding more configuration flags, which will make the process more complex for the end user and add yet another thing that needs to be set by the user in order to get the module working as expected.

I think @vietj can also give his opinion here.

@sberyozkin
Copy link
Author

sberyozkin commented May 8, 2020

@pmlopes It is more than just a cosmetic issue.
As I said above it is:

This imho is the correct solution instead of adding more configuration flags, which will make the process more complex for the end user and add yet another thing that needs to be set by the user in order to get the module working as expected.

How does it make more complex compared to the users having to start dealing with Vertx Oauth2 implementation specific log levels as you suggest ? The new property is optional and the default behaviour is retained OOB.

If the users know that they only need AT to propagate then they set a Quarkus specific property to disable the verification (internally we set this new property). The end result - no redundant verification and no noise in the logs but a proper log message like AT verification is skipped on request etc...

@sberyozkin
Copy link
Author

@pmlopes Paulo, I myself always add a lot of properties hence I suppose I'm keen on them :-).

IMHO if I say via the property, as a Quarkus user, "Do not verify the access token for code flow" because I only need ID token or I only need AT to propagate then it will be more user centric as opposed to them noticing, oh, what is that in the logs, is it a real problem, who knows, etc...It is just a property :-)

Anyway. Sorry for the noise, thanks for the review, and I'll leave it up to you how to deal with this PR.
Cheers

@pmlopes
Copy link
Contributor

pmlopes commented May 8, 2020

@pmlopes It is more than just a cosmetic issue.
As I said above it is:

  • confusing to the users, please see quarkusio/quarkus#8396, we've both been under impression something is broken
  • it is suboptimal to try to verify the token even if it is in the JWT format which is not meant to be used locally

This is not entirely correct. The message is currently logged with debug level. By default vert.x does not log at that level, but above, so quarkus or the end user is already changing the log levels of the application.

Also note that the token is not being validated in a cryptographic way. The exception as you can trace back is a simple if statement check, it is triggered when the jwt object is initialized with keys and the token does not contain the right number of segments (which is a simple count of . characters in the string).

The API design choice was throwing an exception instead of returning null. Returning null is possible but it means it will become a breaking change (which we can do on master atm). I don't mind to do that, yet we will still log the event as, in the past we weren't logging and the community asked for it. Going back now doesn't sound like a valid point.

How does it make more complex compared to the users having to start dealing with Vertx Oauth2 implementation specific log levels as you suggest ? The new property is optional and the default behaviour is retained OOB.

Let me give you an example, let's say that we introduce the flag, now imagine you have a keycloak user, before the user would just point to the discovery url with the client id and client secret knowledge and was done, now she also must think about this new flag. Grepping the documentation will not show anything about this property.

The exact same will happen with Google, if a user specifies the scope openid then she also needs to think of the flag.

Going back to auth0 the original report for this issue their documentation says: https://auth0.com/docs/tokens/concepts/access-tokens tokens can be opaque or jwts and currently we can handle both, but after we will only be able to handle one at the time.

If the users know that they only need AT to propagate then they set a Quarkus specific property to disable the verification (internally we set this new property). The end result - no redundant verification and no noise in the logs but a proper log message like AT verification is skipped on request etc...

My conclusion, quarkus is already modifying the log levels (and as a small note, vertx logging, is generic, we don't include a logger, it will pick what is provided by the user (quarkus)), so instead of picking the debug level, switch to info. There's no crypto checks happening on that message, we could change the API to return null but we will break the contract with existing users.

I've no knowledge on which logger quarkus uses, so based on the j.u.l config provided above, you should adapt quarkus to do the same. In the meantime, I've already dropped the level from DEBUG to TRACE which is even lower, so you should not see it at all in a production environment.

@pmlopes pmlopes closed this May 8, 2020
@sberyozkin
Copy link
Author

@pmlopes

but after we will only be able to handle one at the time.

This is an obvious misundestanding, that property was only about skipping the code which tries to treat an opaque token as a JWT token. I've said it 3 times at least but it did not work

@sberyozkin
Copy link
Author

@pmlopes

Grepping the documentation will not show anything about this property.

This just makes me frustrated - what kind of argument it is ? Obviously there would've been a documentation around this property.

@sberyozkin
Copy link
Author

sberyozkin commented May 9, 2020

@pmlopes

The API design choice was throwing an exception instead of returning null.

The property was simply about optionally skipping this code: https://github.com/vert-x3/vertx-auth/blob/master/vertx-auth-oauth2/src/main/java/io/vertx/ext/auth/oauth2/impl/OAuth2AuthProviderImpl.java#L307

This swallows the exception anyway. And it still cryptographically verifies the access token if it is JWT. Which is of no use to Quarkus users in the code flow. Which I've tried to explain several times above.

So the solution just to hide the exception in the lowest category log if it is an opaque token while continuing verifying the access tokens which is not needed is not a good solution IMHO.

I apologize for trying to say the last word.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Code grant request fails when OIDC server returns an opaque access token
3 participants