Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add kerberos support #214

Merged
merged 3 commits into from
Dec 1, 2022
Merged

Add kerberos support #214

merged 3 commits into from
Dec 1, 2022

Conversation

psaiz
Copy link
Contributor

@psaiz psaiz commented Oct 6, 2022

Description

Provide kerberos authentication support to connect to clusters

Issues Resolved

This pull request creates a new connection class that extends the HTTP connection to support kerberos. Thanks to it, using the code below, the user can connect using kerberos for authentication:

from opensearchpy import OpenSearch
from opensearchpy.connection.krb_requests  import RequestsKrbConnection

client = OpenSearch(["https://<host>/"], connection_class=RequestsKrbConnection)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@psaiz psaiz requested review from a team, axeoman and deztructor as code owners October 6, 2022 12:40
@psaiz psaiz mentioned this pull request Oct 6, 2022
@harshavamsi
Copy link
Collaborator

Hi @psaiz, thanks for the contribution. A couple of thoughts.

We need to get DCO check to pass, so could you sign off your commit with the -s flag.

I noticed you're importing kerberos, I believe we would need to include it as a dependency to be installed in setup.py

We would need unit tests to make sure that this works for both sync and async connections. This ensures test compatibility.

And it would be great if you could add this class to the docs folder so that we can auto generate docs.

@psaiz
Copy link
Contributor Author

psaiz commented Oct 11, 2022

Hi @harshavamsi,
Thanks for your comments. I'll do the docs and I'll look into the tests and the -s flag, thanks!

About adding kerberos as a dependency, what would you think about making it an optional dependency? I mean, the opensearch-py can still be used without kerberos.
Thanks again ,

Pablo

@psaiz
Copy link
Contributor Author

psaiz commented Oct 11, 2022

Hi again @harshavamsi,
Ok, so the DCO is happy, and I've added the optional dependency and the doc (I think...).

Regarding the testing, I'm not sure how to proceed. Note that to test it properly it requires a kerberos setup, and I'm afraid that won't be available during the pipelines. I've checked how it is done in the pykerberos, and they give a test.py file that should be run against their system. Is that what you have in mind?

Cheers,
pablo

@dblock
Copy link
Member

dblock commented Oct 11, 2022

Regarding the testing, I'm not sure how to proceed. Note that to test it properly it requires a kerberos setup, and I'm afraid that won't be available during the pipelines. I've checked how it is done in the pykerberos, and they give a test.py file that should be run against their system. Is that what you have in mind?

At least exercise the code to some extent. Mock the kerberos setup.

@psaiz
Copy link
Contributor Author

psaiz commented Oct 12, 2022

Hi @dblock,
Thanks for the message. Here you have a very basic unittest mocking the kerberos calls. Let me know if this is what you had in mind.

Best,
pablo

Copy link
Member

@dblock dblock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Close!

I made some other comments a bit more in-depth.

opensearchpy/connection/krb_requests.py Outdated Show resolved Hide resolved
opensearchpy/connection/krb_requests.py Outdated Show resolved Hide resolved
opensearchpy/connection/krb_requests.py Outdated Show resolved Hide resolved
opensearchpy/connection/krb_requests.py Outdated Show resolved Hide resolved
opensearchpy/connection/krb_requests.py Outdated Show resolved Hide resolved
test_opensearchpy/test_krbconnection.py Outdated Show resolved Hide resolved
@psaiz
Copy link
Contributor Author

psaiz commented Oct 18, 2022

Thanks @dblock! I've addressed most of the comments. I left two open:

  • exception handling, where I've changed the logic to let the exceptions just go through
  • the testing, in case something else should be tested
    Let me know if you want me to change anything else

Copy link
Member

@dblock dblock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have some more. I know it's just a small piece of code, but let's get it right! Thanks for hanging in here with me.

opensearchpy/connection/krb_requests.py Outdated Show resolved Hide resolved
opensearchpy/connection/krb_requests.py Outdated Show resolved Hide resolved
opensearchpy/connection/krb_requests.py Outdated Show resolved Hide resolved
opensearchpy/connection/krb_requests.py Outdated Show resolved Hide resolved
test_opensearchpy/test_krbconnection.py Outdated Show resolved Hide resolved
@psaiz
Copy link
Contributor Author

psaiz commented Oct 20, 2022

Thanks @dblock ! I'm happy to do as many changes as are needed, so please, keep the comments coming.

I left a reply in the three open discussions. Please, let me know if those are enough to resolve them.

@dblock
Copy link
Member

dblock commented Oct 20, 2022

Thanks @dblock ! I'm happy to do as many changes as are needed, so please, keep the comments coming.

I left a reply in the three open discussions. Please, let me know if those are enough to resolve them.

Thanks. See my two comments above, the unused variable is definitely a bug.

@aiudirog
Copy link
Contributor

@psaiz Have you considered just leveraging a standard Kerberos authentication library for requests like requests-gssapi or requests-kerberos? If so, is there a particular reason you don't want to use them?

If you haven't, you should be able to use them this way:

from opensearchpy import OpenSearch, RequestsHttpConnection
from requests_gssapi import HTTPSPNEGOAuth

opensearch = OpenSearch(
    ...,
    connection_class=RequestsHttpConnection,
    http_auth=HTTPSPNEGOAuth(),
)

This is similar to the standard mechanism for authenticating with AWS IAM roles.

Making dedicated Connection implementations for every combination of HTTP client and authentication mechanism is an N^2 problem that could get out of hand pretty quickly, especially when most clients already have a standard authentication API.

Some of my personal projects are a port of requests-gssapi to HTTPX and adding Windows support to the Python gssapi client, so I know my way around this scene a bit.

@psaiz
Copy link
Contributor Author

psaiz commented Oct 21, 2022

Hi @aiudirog, thanks for the input. If that approach works, it is indeed much better than creating a new class. Let me play with it and see if it works in my case.

@dblock
Copy link
Member

dblock commented Oct 21, 2022

Either way, I'd appreciate contributions to https://github.com/opensearch-project/opensearch-py/blob/main/GETTING_STARTED.md for this, please! We might also want to rename that to USER_GUIDE to match other clients, cc: @wbeckler - is there a campaign to standardize those things?

@psaiz
Copy link
Contributor Author

psaiz commented Oct 31, 2022

Hi @aiudirog,
Sorry for the delay in getting back to you. I don't manage to get it working with the

opensearch = OpenSearch(
    ...,
    connection_class=RequestsHttpConnection,
    http_auth=HTTPSPNEGOAuth(),
)

I get the error:

Traceback (most recent call last):
  File "connect.py", line 53, in <module>
    state = client.cluster.health()
  File "/usr/lib/python3.6/site-packages/opensearchpy/client/utils.py", line 177, in _wrapped
    return func(*args, params=params, headers=headers, **kwargs)
  File "/usr/lib/python3.6/site-packages/opensearchpy/client/cluster.py", line 78, in health
    headers=headers,
  File "/usr/lib/python3.6/site-packages/opensearchpy/transport.py", line 412, in perform_request
    raise e
  File "/usr/lib/python3.6/site-packages/opensearchpy/transport.py", line 380, in perform_request
    timeout=timeout,
  File "/usr/lib/python3.6/site-packages/opensearchpy/connection/http_requests.py", line 210, in perform_request
    self._raise_error(response.status_code, raw_data)
  File "/usr/lib/python3.6/site-packages/opensearchpy/connection/base.py", line 331, in _raise_error
    status_code, error_message, additional_info
opensearchpy.exceptions.AuthenticationException: AuthenticationException(401, {'header': {'WWW-Authenticate': 'Negotiate'}})

I've checked with different options, like opportunistic_auth=True,mutual_authentication=DISABLED, delegate=True, and still no luck. It does work if I pass the headers, with the thing built from the kerberos.authGSSClientResponse, but that kind of defeats the purpose of using requests_gssapi

Do you know what I'm doing wrong?

@aiudirog
Copy link
Contributor

aiudirog commented Oct 31, 2022

Can you give the requests_kerberos library a try? That one uses pykerberos directly and I'm curious if the issue is GSSAPI or requests' handling of the auth.

Edit: You can also give mech=None a try with HTTPSPNEGOAuth to try and force Kerberos 5 instead of SPNEGO: pythongssapi/requests-gssapi#41

Edit 2: requests_gssapi also has extensive logging if you want to try and see what's going on under the hood:

import logging
logging.basicConfig()
logging.getLogger('requests_gssapi').setLevel(logging.DEBUG)

@psaiz
Copy link
Contributor Author

psaiz commented Nov 1, 2022

Hi @aiudirog ,

The requests_kerberos worked, thanks! In my case, something like:

from opensearchpy import OpenSearch, RequestsHttpConnection
import requests_kerberos
url = 'htps://...'
certpath = '/etc/...'
client = OpenSearch([url],
                   use_ssl=True,
                   verify_certs=True,
                   ca_certs=certpath,
                   connection_class=RequestsHttpConnection,
http_auth=requests_kerberos.HTTPKerberosAuth(mutual_authentication=requests_kerberos.OPTIONAL)
)

did the trick. And, once I knew the parameters for the mutual_authentication, I went back to requests_gssapi, and

...
from requests_gssapi import HTTPSPNEGOAuth, OPTIONAL
...
client = OpenSearch([url],
 ...
http_auth=HTTPSPNEGOAuth(mutual_authentication=OPTIONAL),
)

also worked.

So, indeed, the new class makes no sense. Following the comment from @dblock, I'll change this MR to update the documentation. I was thinking about leaving in a test (mocking the calls), and the optional dependency for the python_requests. Please, let me know if you want me to remove those as well.

@psaiz psaiz changed the title Add kerberos support DRAFT: Add kerberos support Nov 2, 2022
@psaiz psaiz closed this Nov 2, 2022
@psaiz psaiz reopened this Nov 2, 2022
@psaiz psaiz changed the title DRAFT: Add kerberos support Add kerberos support Nov 2, 2022
@psaiz
Copy link
Contributor Author

psaiz commented Nov 2, 2022

Ready for a new round of review. Note that now, the changes are mainly on the documentation. I've just made sure that my source branch was up to date with the changes that have happened in the meantime.

Please, do check the test. I'm not really happy with it. It works as an example of the code, However, it doesn't test much, since it mocks the call that does the connection. I checked if it would be better to mock something at a lower level (like the socket connect), but I didn't like that idea either. If you have any better ideas, I'm happy to look into that (or to drop the test completely if you prefer)

@psaiz psaiz requested a review from dblock November 2, 2022 14:36
Copy link
Member

@dblock dblock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docs are great, and I would be down with just merging that.

You correctly point out that the test doesn't quite test that the kerberos auth instance is actually called. What else does send call? Is there a way to mock those other things? If it's too difficult, then I think we should just remove it. We don't need to test other people's code really, and I believe we already have tests for http_auth being used (or we should have some).

USER_GUIDE.md Outdated Show resolved Hide resolved
USER_GUIDE.md Outdated Show resolved Hide resolved
USER_GUIDE.md Outdated Show resolved Hide resolved
@psaiz
Copy link
Contributor Author

psaiz commented Nov 10, 2022

Thanks @dblock, changes applied, and test removed

Copy link
Member

@dblock dblock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only must have is schemas -> schemes (or methods). Schemas means something else.

Thanks for hanging in here with me! Really appreciate your help.

USER_GUIDE.md Outdated Show resolved Hide resolved
USER_GUIDE.md Outdated Show resolved Hide resolved
USER_GUIDE.md Outdated Show resolved Hide resolved
@psaiz
Copy link
Contributor Author

psaiz commented Nov 22, 2022

Thanks @dblock, ready for the next round

USER_GUIDE.md Outdated Show resolved Hide resolved
dblock
dblock previously approved these changes Nov 23, 2022
Signed-off-by: Pablo Saiz <[email protected]>
@psaiz psaiz reopened this Nov 24, 2022
@psaiz
Copy link
Contributor Author

psaiz commented Nov 24, 2022

Ok, sorry that I closed it. I've merged it with the changes that had happened in the mean time, and added a line to the Changelog, since one of the workflows was failing

dblock
dblock previously approved these changes Nov 28, 2022
@dblock
Copy link
Member

dblock commented Nov 28, 2022

@harshavamsi @VachaShah 2nd pls? Let's merge this?

@harshavamsi harshavamsi merged commit d6e994a into opensearch-project:main Dec 1, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants