-
Notifications
You must be signed in to change notification settings - Fork 8.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce Kerberos authentication provider. #36112
Conversation
Pinging @elastic/kibana-security |
* host/[email protected] | ||
* HTTP/[email protected] | ||
|
||
The SPNEGO token used in tests is generated for for `[email protected]`. We can re-use it multiple times because we |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @bizybot ,
Can you please help us to understand whether the approach we're taking here for integration tests is going to work with Elasticsearch Kerberos support in the long run and we don't make any wrong assumptions?
The key here is that we disable replay cache (-Dsun.security.krb5.rcache=none
) so that we can use the same SPNEGO token without any interval between calls and use very large clockskew
(2147483647 seconds or ~68 years) to use the same token for a very long time. It looks like in this case we don't need to have a dedicated KDC for tests, unless we're missing something.
Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And the second question, it seems we can't exchange SPNEGO token to an access token for a "normal" user using this:
POST /_security/oauth2/token
{
"grant_type" : "client_credentials"
}
I'm getting action [cluster:admin/xpack/security/token/create] is unauthorized for user ...
. I thought we were supposed to call this API endpoint using Negotiate xxx-spnego-token
and user won't need any admin privileges to get the token?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While we discuss this in elastic/elasticsearch#41943, it made more sense to point this out here:
Using the client_credentials
grant will get you an access token only, and no refresh token. This access token would only be usable for (usually 20min) max 1 hour depending on the configuration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, thanks for the note! We're going to return WWW-Authenticate: Negotiate
when access token expires to request a new SPNEGO token that we can exchange for a new access token (poor man replacement of refresh token).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The key here is that we disable replay cache (
-Dsun.security.krb5.rcache=none
) so that we can use the same SPNEGO token without any interval between calls and use very largeclockskew
(2147483647 seconds or ~68 years) to use the same token for a very long time. It looks like in this case we don't need to have a dedicated KDC for tests, unless we're missing something.
@azasypkin I think this may not be entirely true. -Dsun.security.krb5.rcache=none
would disable the cache but there are other protections which might come into play when replay cache is disabled or not working, sun.security.krb5.acceptor.subkey
is one more property which we may have to play with (this one seems to be related to some session key). I have not been able to test my finding, I will test tomorrow if there is anything else needed to do here and update. Thank you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for looking into this @bizybot ! The SPNEGO token I've generated ~5 days ago is still accepted by the ES in integration tests, so I assume we should be fine, but would be great if you can confirm or refute that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @azasypkin, Sorry it took time to get back to you.
I took this opportunity to investigate further and sun.security.krb5.acceptor.subkey
does not help in the authenticator requests but only to prevent further msg replay.
I think you are fine to disable replay cache during your testing. Thank you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome, thanks!
372cd73
to
40ca9bd
Compare
@@ -238,6 +238,10 @@ exports.Cluster = class Cluster { | |||
|
|||
this._process = execa(ES_BIN, args, { | |||
cwd: installPath, | |||
env: { | |||
...process.env, | |||
...(options.esEnvVars || {}), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question: @elastic/kibana-operations how do you feel about this change? We need to specify some JVM options for the test ES instance (namely -Djava.security.krb5.conf
and -Dsun.security.krb5.rcache
) and I'm aware of only two ways of doing that, either via config/jvm.options
or ES_JAVA_OPTS
env variable.
Dealing with ES_JAVA_OPTS
seemed easier to me, so I picked that option, but happy to hear about any better ways of doing that. With this change test config would look like this:
{
esTestCluster: {
...xPackAPITestsConfig.get('esTestCluster'),
serverArgs: [
...xPackAPITestsConfig.get('esTestCluster.serverArgs'),
`xpack.security.authc.realms.kerberos.kerb1.keytab.path=${kerberosKeytabPath}`,
],
serverEnvVars: {
ES_JAVA_OPTS: `-Djava.security.krb5.conf=${kerberosConfigPath} -Dsun.security.krb5.rcache=none`,
},
},
...
};
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Works for me, though I think you might run into problems with paths containing spaces based on your snippet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great!
I think you might run into problems with paths containing spaces based on your snippet.
Thanks for pointing out! We shouldn't have spaces there, but yeah, I pretend this problem doesn't exist yet 🙈
40ca9bd
to
4726186
Compare
return wrapError(authenticationResult.error); | ||
} else { | ||
return Boom.unauthorized(); | ||
const error = wrapError(authenticationResult.error); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note: we could potentially move this if
and optional challenges
to wrapError
, but I'm leaning towards targeted usage at this stage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the explicitness of how it's done currently.
*/ | ||
public static failed(error: Error) { | ||
public static failed(error: Error, challenges?: string[]) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note: If I understand that correctly challenges
makes sense only with 401
error so I made it an optional argument for failed
, but based on the way you handled it in your WIP I'd be glad to hear your thoughts or concerns!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand the current implementation correctly, when the Kerberos provider returns with AuthenticationResult.failed(Boom.unauthorized(), ['Negotiate'])
, this will immediately cause the response to be sent to the user. Assuming the user's browser doesn't support Kerberos, will they still eventually get to the login screen?
The previous WIP approach which I implemented was intending to allow the kerberos
provider to behave more seamlessly with the basic
provider, which is why the challenges weren't necessarily terminating the way we iterate through the auth providers within the authenticator.
Other internals might have changed where the previous approach is no longer relevant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand the current implementation correctly, when the Kerberos provider returns with AuthenticationResult.failed(Boom.unauthorized(), ['Negotiate']), this will immediately cause the response to be sent to the user. Assuming the user's browser doesn't support Kerberos, will they still eventually get to the login screen?
Right, the only way user will be able to login using basic
auth provider in [kerberos, basic]
setup would be to go directly to /login
page. Like we do for saml
.
The previous WIP approach which I implemented was intending to allow the kerberos provider to behave more seamlessly with the basic provider, which is why the challenges weren't necessarily terminating the way we iterate through the auth providers within the authenticator.
Thanks for pointing out! I've played with this approach last week, it looks interesting, but I'd rather tackle this separately if find out that current approach is annoying for our users. Would that work for you? There are three things that make me think about handling this separately:
- Same experience (okay, ^experience^ 🙂 ) even if
basic
is not enabled basic
in[kerberos, basic]
scenario feels more like admin-only and "normal" users that can't exchange Kerberos tickets to access tokens wouldn't benefit fromlogin
page too much, and admins know where to goAuthenticator
will have to handlechallenges
in a special manner, that I'd rather avoid unless it's really needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think what you have is perfectly fine :) I think at some point, we'll likely need to add the "choose your login method" when we have multiple auth providers configured.
* Parses request's `Authorization` HTTP header if present and extracts authentication scheme. | ||
* @param request Request instance to extract authentication scheme for. | ||
*/ | ||
function getRequestAuthenticationScheme(request: Legacy.Request) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note: experimenting with another way of handling headerNotRecognized
we have in token and saml providers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The way you have it right now is definitely easier for me to comprehend, I like it!
return DeauthenticationResult.notHandled(); | ||
} | ||
|
||
try { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note: it's probably time to move token handling stuff to a separate shared "module", but not in this PR likely.
return DeauthenticationResult.failed(err); | ||
} | ||
|
||
return DeauthenticationResult.redirectTo('/logged_out'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note: It seems /logged_out
page would be appropriate here since with SPNEGO user will be automatically authenticated from any other page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense to me, we didn't have this when I was initially working on Kerberos, but I think it makes complete sense here.
|
||
// We validate that access token exists in the response so other private methods in this class | ||
// can rely on them both existing. | ||
if (!accessToken) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note: I'm copying this code from token provider, but probably it doesn't make sense... What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't make sense because we really should always get an access token?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I believe so. Will remove this check
private async authenticateViaSPNEGO(request: Legacy.Request, state?: ProviderState | null) { | ||
this.debug('Trying to authenticate request via SPNEGO.'); | ||
|
||
// If client can't handle redirect response, basically if it's an AJAX request, we shouldn't |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note: I'm hesitating about this one, I'd rather "fix" logged_out page and allow AJAX request to start session as long as SPNEGO token is valid... But didn't think it through, maybe I'm missing some cases when it'd be a problem?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I completely agree with fixing the logged_out page. My concern with the current approach is that we don't have "refresh tokens" so if after let's say 20 minutes the access token is expired, we can't transparently re-auth the user when an "ajax request" is unauthenticated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that was my concern as well, feels like something that will be happening frequently.
Hey @kobelb , Would you mind giving a preliminary feedback on the approach taken before I start covering the functionality with unit tests? I added a bunch of integration tests to test main functionality as well. Thanks! |
932cf92
to
0dbcf09
Compare
0dbcf09
to
b7f3175
Compare
💚 Build Succeeded |
Done!
Yeah, I'd merge it and have tests running so that we can spot any breaking changes early and be up to date with any refactorings we may do inside of providers in the meantime (e.g. migrating to NP). PR should be ready for another pass :) |
@spalger tagged your for review of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM for the kbn-es stuff.
💚 Build Succeeded |
Is it possible to customize krb5.conf and krb5.keytab locations in kibana.yml ? |
These are configured only on ES side, not Kibana: |
Thanks for this information and the quick answer! So with this feature, Kibana manages Kerberos sso authentication with browser and then forwards browser Kerberos ticket to elasticsearch through http header. Am I right? |
Correct, Kibana exchanges SPNEGO token that it receives from the UA (user agent/browser) to a proper Elasticsearch access token (like demonstrated here) and then relies on this token instead. If token expires/is invalidated/etc Kibana will ask UA for a fresh SPNEGO token. The internal implementation may change slightly once elastic/elasticsearch#41943 is resolved, but the main idea will stay the same. |
7.x/7.3.0: c62833b |
Great! Thanks for the detailed explanation! |
Introduce Kerberos authentication provider.
How to test
In case it's useful, here is how I test it locally:
Create TGT with
kinit tester
Run Elasticsearch with (keytab file will be automatically copied into
config
directory):$ ES_JAVA_OPTS="-Djava.security.krb5.conf=/some-accessible-path/krb5.conf" yarn es snapshot \ --license trial \ -E xpack.security.authc.token.enabled=true \ -E xpack.security.authc.realms.kerberos.kerb1.keytab.path=/some-accessible-path/krb5.keytab
xpack.security.authProviders: [kerberos, basic]
Replaces: #27008
Fixes: #18188