Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add OAuth support for Kafka client in native mode #12732

Merged
merged 1 commit into from
Jan 4, 2021

Conversation

dejanb
Copy link
Contributor

@dejanb dejanb commented Oct 15, 2020

No description provided.

@dejanb dejanb marked this pull request as draft October 15, 2020 10:58
@dejanb
Copy link
Contributor Author

dejanb commented Oct 15, 2020

This is a first step toward full OAUTH support for the Kafka extension in native mode. These changes get the communication with Keycloak going and obtain a proper token. However, there's an issue with the Kafka client trying to connect to the broker with that token.

I tracked it down to this issue

oracle/graal#2745

Basically,

Subject subject = Subject.getSubject(AccessController.getContext());

call return null in the native mode.

I tried to add some classes to the jni configuration, but as with the above reporter that didn't help.

Anyone experienced something like this before? Any pointers on where to look next would be helpful.

Note that the PR in the current state is usable/mergeable, but it doesn't get us to the end.

@cescoffier
Copy link
Member

About Subject, we had a similar issue with Jass and Kafka. We need to check why the lookup fails. IT generally because of a missing class or module.

@sberyozkin
Copy link
Member

@dejanb Can quarkus-oidc help here ? Why to wire the Keycloak specific code into the Kafka extension ?

@dejanb
Copy link
Contributor Author

dejanb commented Oct 15, 2020

About Subject, we had a similar issue with Jass and Kafka. We need to check why the lookup fails. IT generally because of a missing class or module.

@cescoffier If you have any pointers on what to try and how to more info please let me know. I can try to provide substitutions for some of these classes and see if that would help.

@dejanb
Copy link
Contributor Author

dejanb commented Oct 15, 2020

@dejanb Can quarkus-oidc help here ? Why to wire the Keycloak specific code into the Kafka extension ?

@sberyozkin these are used by strimzi oauth client, so this PR is intended to provide a better support for that case. I'd need to give oidc a look, but I assume that the client would need first changed to use it.

@sberyozkin
Copy link
Member

sberyozkin commented Oct 15, 2020

@dejanb I see, so in your PR it is only about getting a token by directly requesting it from Keycloak. So quarkus-oidc won't help at the moment, I've started working on OidcClient (quarkus-oidc-client) which will work not only with Keycloak, but it is not ready yet.
Thanks

@dejanb
Copy link
Contributor Author

dejanb commented Oct 15, 2020

@sberyozkin Great. We can take a look of moving strimzi oauth to it later on.

@cescoffier
Copy link
Member

cescoffier commented Oct 26, 2020

@dejanb Sorry, missed the notifications.

I would go with simple substitutions adding traces to check where it fails. Unfortunately, this is going to be a lot of fails and retries cycles.

@dejanb
Copy link
Contributor Author

dejanb commented Oct 26, 2020

@cescoffier no worries, I was on other stuff as well. I pretty much pin-pointed the problematic places already, I'll try to play with substitutions to come up with a fix. Hopefully, I'll have something to show this week. Thanks for all your help so far.

@cescoffier
Copy link
Member

Can you link the place you think it fails?

@dejanb
Copy link
Contributor Author

dejanb commented Oct 27, 2020

Can you link the place you think it fails?
@cescoffier

So what I think is going on is that SaslClientAuthenticator do Subject.doAs() call with the proper subject it wants to use

https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/common/security/authenticator/SaslClientAuthenticator.java#L524

This subject later on should be available with the call to Subject.getSubject(AccessController.getContext()), but this call returns null in the callback handler.

https://github.com/apache/kafka/blob/aa287acb2eed07cf6d75c10e71f051f538a57872/clients/src/main/java/org/apache/kafka/common/security/oauthbearer/internals/OAuthBearerSaslClientCallbackHandler.java#L99

That's where it fails.

Let me know if this helpful or do you need any more info from my side.

@cescoffier
Copy link
Member

cescoffier commented Oct 27, 2020

It seems close to the SASL issue we had with kafka a few months ago.

@dejanb dejanb marked this pull request as ready for review November 3, 2020 11:09
@dejanb
Copy link
Contributor Author

dejanb commented Nov 3, 2020

@cescoffier unfortunately SASL fix wasn't applicable here. I ended up substituting Subject class which should help with all these doAs() -> getSubject() usage patterns until graalvm issue is fixed. It might not be ideal but all other solutions would need much more substitutions to avoid Subject usage at all (which is even less ideal). This now works nicely. Let me know what you think.

@dejanb dejanb requested a review from cescoffier November 3, 2020 11:13
@cescoffier
Copy link
Member

Thanks @dejanb.
@galderz WDYT about the substitutions?

Copy link
Member

@cescoffier cescoffier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice trick.
I've done a few suggestions around SubstitutionsUtils.

There is a potential memory leak if there is lots of call to doAs breaking before the call to getSubject- the map size could grow unexpectedly.
I guess it's fine here, as we won't have such a scenario.


import javax.security.auth.Subject;

public class SubstitutionUtils {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure about the name.
Maybe SubjectHolder ?


public class SubstitutionUtils {

public static HashMap<AccessControlContext, Subject> subjects = new HashMap<>();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should it be final?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this field be moved to the Subject class?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cescoffier The field has to be kept outside otherwise IIRC you get an error. The only fields allowed in substituted classes are aliases of existing fields.

I'm not an expert on this java security APIs, but a couple of things to bear in mind would be:

  1. Should be it be concurrent?
  2. Is there a chance entries could be stored and not removed? Maybe you need some kind of expiration of entries to avoid a leak?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, thanks for reminding me of this @galderz! I always forgot.

Handling expiration might be good as it would avoid the potential growth I mentioned in my review.

@cescoffier cescoffier requested a review from galderz November 3, 2020 13:53
@dejanb
Copy link
Contributor Author

dejanb commented Nov 3, 2020

Nice trick.
I've done a few suggestions around SubstitutionsUtils.

There is a potential memory leak if there is lots of call to doAs breaking before the call to getSubject- the map size could grow unexpectedly.
I guess it's fine here, as we won't have such a scenario.

@cescoffier thanks, great suggestions. I'll incorporate them in next.

@dejanb
Copy link
Contributor Author

dejanb commented Nov 4, 2020

@cescoffier @galderz Thanks for the feedback. I pushed some more changes. As Galder mentioned, the static field cannot be added to the target class. But I moved the SubjectHolder class to the target class and that works and looks a bit better.

In terms of actual map, I made it synchronized and converted to WeakHashMap() which removes entries when the key is no longer in use. I think this should help with potential leaks.

I can use full blown cache (with expiring entries) here but I'll need to add a dependency on Infinispan/Guava for that. Let me know if you want me to explore that route.

@galderz
Copy link
Member

galderz commented Nov 5, 2020

@dejanb Rather than WeakHashMap, I'd probably use Caffeine for this which is a library already in use within Quarkus in different places (e.g. to power caching for quarkus end user apps). I'm not expert on WeakHashMap but you can find some details here on the gotchas of using them as a cache. Also, I don't think WeakHashMap is designed for concurrent access.

@dejanb
Copy link
Contributor Author

dejanb commented Nov 9, 2020

@galderz Makes sense. I pushed an implementation using Caffeine cache. I used some defaults for sizes and expiry times we use on other projects and it seems to me they should work well here as well. Let me know if you have any other suggestions.

@machi1990 machi1990 requested a review from cescoffier November 14, 2020 16:52
@cescoffier
Copy link
Member

Re-running the CI before merging.

@cescoffier
Copy link
Member

@dejanb Sorry for the delay. It should be fine to merge it, but I re-run the CI to be sure.

@machi1990
Copy link
Member

This errored again with

Error: Unable to process command '::set-env name=JAVA_HOME_11_x64::/opt/hostedtoolcache/AdoptOpenJDK/1.0.0-ga-11-jdk-hotspot-linux-x64-normal-latest/x64' successfully.
Error: The `set-env` command is disabled. Please upgrade to using Environment Files or opt into unsecure command execution by setting the `ACTIONS_ALLOW_UNSECURE_COMMANDS` environment variable to `true`. For more information see: 

I've this these error before (a month or so ago) and they were fixed by @loicmathieu or @gsmet. I think a rebase will do and it will re-trigger a clean build. Do you want to take a look? @dejanb

@ghost ghost added the area/kafka label Dec 9, 2020
@dejanb
Copy link
Contributor Author

dejanb commented Dec 9, 2020

@cescoffier @machi1990 Done. Hopefully it'll do it.

@cescoffier cescoffier merged commit d71f941 into quarkusio:master Jan 4, 2021
@ghost ghost added this to the 1.11 - master milestone Jan 4, 2021
@cescoffier
Copy link
Member

@dejanb It's IN! What a great way to start the year!

@dejanb
Copy link
Contributor Author

dejanb commented Jan 4, 2021

@cescoffier awesome ... To many more in this year :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants