docs(adding users): Refreshing the docs for adding new DataHub Users (#…
1 parent d9b71ce, commit 1503ef3
Showing 5 changed files with 173 additions and 118 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,41 +1,55 @@ | ||
# Overview | ||
|
||
Authentication is the process of verifying the identity of a user or service. In DataHub this can be split into 2 main components: | ||
- How to login into DataHub. | ||
- How to make some action within DataHub on **behalf** of a user/service. | ||
Authentication is the process of verifying the identity of a user or service. There are two | ||
places where Authentication occurs inside DataHub: | ||
|
||
:::note | ||
1. DataHub frontend service when a user attempts to log in to the DataHub application. | ||
2. DataHub backend service when making API requests to DataHub. | ||
|
||
Authentication in DataHub does not necessarily mean that the user/service being authenticated will be part of the metadata graph within DataHub itself, like other concepts such as Datasets or Dashboards. | ||
In other words, a user called `john.smith` logging into DataHub does not mean that john.smith appears as a CorpUser Entity within DataHub. | ||
In this document, we'll take a closer look at both. | ||
|
||
For a quick video on that subject, have a look at our video on [DataHub Basics — Users, Groups, & Authentication 101 | ||
](https://youtu.be/8Osw6p9vDYY) | ||
### Authentication in the Frontend | ||
|
||
::: | ||
Authentication of normal users of DataHub takes place in two phases. | ||
|
||
### Authentication in the Frontend | ||
At login time, authentication is performed by either DataHub itself (via username / password entry) or a third-party Identity Provider. Once the identity | ||
of the user has been established, and credentials validated, a persistent session token is generated for the user and stored | ||
in a browser-side session cookie. | ||
|
||
DataHub provides 3 mechanisms for authentication at login time: | ||
|
||
- **Native Authentication** which uses username and password combinations natively stored and managed by DataHub, with users invited via an invite link. | ||
- [Single Sign-On with OpenID Connect](guides/sso/configure-oidc-react.md) to delegate authentication responsibility to third party systems like Okta or Google/Azure Authentication. This is the recommended approach for production systems. | ||
- [JaaS Authentication](guides/jaas.md) for simple deployments where authenticated users are part of some known list or invited as a [Native DataHub User](guides/add-users.md). | ||
|
||
In subsequent requests, the session token is used to represent the authenticated identity of the user, and is validated by DataHub's backend service (discussed below). | ||
Eventually, the session token expires (after 24 hours by default), at which point the end user is required to log in again. | ||
|
||
### Authentication in the Backend (Metadata Service) | ||
|
||
Authentication in DataHub happens at 2 possible moments, if enabled. | ||
When a user makes a request for Data within DataHub, the request is authenticated by DataHub's Backend (Metadata Service) via a JSON Web Token. This applies to both requests originating from the DataHub application, | ||
and programmatic calls to DataHub APIs. There are two types of tokens that are important: | ||
|
||
The first happens in the **DataHub Frontend** component when you access the UI. | ||
You will be prompted with a login screen, upon which you must supply a username/password combo or OIDC login to access DataHub's UI. | ||
This is typical scenario for a human interacting with DataHub. | ||
1. **Session Tokens**: Generated for users of the DataHub web application. By default, these have a duration of 24 hours. | ||
These tokens are encoded and stored inside browser-side session cookies. | ||
2. **Personal Access Tokens**: These are tokens generated via the DataHub settings panel useful for interacting | ||
with DataHub APIs. They can be used to automate processes like enriching documentation, ownership, tags, and more on DataHub. Learn | ||
more about Personal Access Tokens [here](personal-access-tokens.md). | ||
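
Because both token types are JSON Web Tokens, their claims can be inspected locally when debugging. The following is a minimal sketch, not DataHub's own tooling: the function name `decode_jwt_payload`, the sample claims (`sub`, `exp`), and the sample token are all hypothetical, and the payload is decoded without verifying the signature, so the result must never be trusted for authorization decisions.

```python
import base64
import json


def decode_jwt_payload(token: str) -> dict:
    """Decode the payload segment of a JWT WITHOUT verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a token of the form header.payload.signature")
    payload = parts[1]
    # JWTs drop base64 padding; restore it before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def _b64url(data: dict) -> str:
    """Helper used only to build a sample token for this example."""
    raw = json.dumps(data).encode()
    return base64.urlsafe_b64encode(raw).decode().rstrip("=")


# Hypothetical token; real DataHub tokens may carry different claims.
sample_token = ".".join(
    [
        _b64url({"alg": "HS256", "typ": "JWT"}),
        _b64url({"sub": "john.smith", "exp": 1700000000}),
        "signature-placeholder",
    ]
)

claims = decode_jwt_payload(sample_token)
print(claims)
```

Validation of the signature and expiry is done by the Metadata Service itself; a helper like this is only useful for checking which identity and lifetime a token appears to carry.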
|
||
DataHub provides 2 methods of authentication: | ||
- [JaaS Authentication](guides/jaas.md) for simple deployments where authenticated users are part of some known list or invited as a [Native DataHub User](guides/add-users.md). | ||
- [OIDC Authentication](guides/sso/configure-oidc-react.md) to delegate authentication responsibility to third party systems like Okta or Google/Azure Authentication. This is the recommended approach for production systems. | ||
To learn more about DataHub's backend authentication, check out [Introducing Metadata Service Authentication](introducing-metadata-service-authentication.md). | ||
|
||
Upon validation of a user's credentials through one of these authentication systems, DataHub will generate a session token with which all subsequent requests will be made. | ||
Credentials must be provided as Bearer Tokens inside of the **Authorization** header in any request made to DataHub's API layer. To learn | ||
|
||
### Authentication in the Backend | ||
```shell | ||
Authorization: Bearer <your-token> | ||
``` | ||
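
As an illustration of the header above, here is a minimal Python sketch that attaches a token to a request using only the standard library. The base URL `http://localhost:8080`, the `/api/graphql` path, the placeholder query body, and the helper name `build_datahub_request` are assumptions made for the example; substitute the values for your own deployment. The request is built but not sent.

```python
import json
import urllib.request


def build_datahub_request(
    base_url: str, token: str, query: str, path: str = "/api/graphql"
) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the token as a Bearer credential."""
    url = base_url.rstrip("/") + path
    body = json.dumps({"query": query}).encode()
    request = urllib.request.Request(url, data=body, method="POST")
    # The token goes in the Authorization header, prefixed with "Bearer ".
    request.add_header("Authorization", f"Bearer {token}")
    request.add_header("Content-Type", "application/json")
    return request


# Assumed host, port, and placeholder query; replace with real values.
request = build_datahub_request(
    "http://localhost:8080", "<your-token>", "{ __typename }"
)
print(request.get_header("Authorization"))
```

To actually issue the call, pass the request to `urllib.request.urlopen(request)` against a running instance with a valid token.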
|
||
The second way in which authentication occurs, is within DataHub's Backend (Metadata Service) when a user makes a request either through the UI or through APIs. | ||
In this case DataHub makes use of Personal Access Tokens or session HTTP headers to apply actions on behalf of some user. | ||
To learn more about DataHub's backend authentication have a look at our docs on [Introducing Metadata Service Authentication](introducing-metadata-service-authentication.md). | ||
Note that in DataHub local quickstarts, Authentication at the backend layer is disabled for convenience. This leaves the backend | ||
vulnerable to unauthenticated requests and should not be used in production. To enable | ||
backend (token-based) authentication, simply set the `METADATA_SERVICE_AUTH_ENABLED=true` environment variable | ||
for the datahub-gms container or pod. | ||
|
||
Note, while authentication can happen on both the frontend or backend components of DataHub, they are separate, related processes. | ||
The first is to authenticate users/services by a third party system (Open-ID connect or Java based authentication) and the latter to only permit identified requests to be accepted by DataHub via access tokens or bearer cookies. | ||
### References | ||
|
||
If you only want some users to interact with DataHub's UI, enable authentication in the Frontend and manage who is allowed either through JaaS or OIDC login methods. | ||
If you want users to be able to access DataHub's backend directly without going through the UI in an authenticated manner, then enable authentication in the backend and generate access tokens for them. | ||
For a quick video on the topic of users and groups within DataHub, have a look at [DataHub Basics — Users, Groups, & Authentication 101 | ||
](https://youtu.be/8Osw6p9vDYY) |