forked from datahub-project/datahub
docs(ingest): improve doc gen, docs for snowflake, looker (datahub-pr…
1 parent
62699a1
commit fe73ab9
Showing 13 changed files with 276 additions and 65 deletions.
@@ -0,0 +1,62 @@
### Pre-Requisites

#### Set up the right permissions
You need to provide the following permissions for ingestion to work correctly.
```
access_data
explore
manage_models
see_datagroups
see_lookml
see_lookml_dashboards
see_looks
see_pdts
see_queries
see_schedules
see_sql
see_system_activity
see_user_dashboards
see_users
```
Here is an example permission set after configuration.
![Looker DataHub Permission Set](./looker_datahub_permission_set.png)
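
As a quick sanity check before running ingestion, the permission list above can be compared against whatever permission set the ingestion account actually has. The following is a minimal sketch in plain Python; the `missing_permissions` helper and the hand-copied `granted` list are illustrative assumptions, not part of DataHub or the Looker SDK.

```python
from typing import Iterable, List

# Permissions listed in this section as required for Looker ingestion.
REQUIRED_PERMISSIONS = [
    "access_data",
    "explore",
    "manage_models",
    "see_datagroups",
    "see_lookml",
    "see_lookml_dashboards",
    "see_looks",
    "see_pdts",
    "see_queries",
    "see_schedules",
    "see_sql",
    "see_system_activity",
    "see_user_dashboards",
    "see_users",
]


def missing_permissions(granted: Iterable[str]) -> List[str]:
    """Return the required permissions absent from `granted`, in list order."""
    granted_set = set(granted)
    return [p for p in REQUIRED_PERMISSIONS if p not in granted_set]


# Hypothetical permission set, copied by hand from the Looker admin page.
granted = ["access_data", "explore", "see_lookml", "see_looks"]
print("Missing permissions:", missing_permissions(granted))
```

An empty result means the permission set covers everything listed above.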

#### Get an API key

Ingestion requires an API key for an account that has the permissions listed above. See the [Looker authentication docs](https://docs.looker.com/reference/api-and-integration/api-auth#authentication_with_an_sdk) for the steps to create a client ID and secret.

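Once the client ID and secret exist, they are typically passed to the `looker` source in a recipe. The sketch below builds such a recipe as a Python dictionary and shows one way to run it programmatically. The `base_url`, `client_id`, and `client_secret` keys and the environment variable names are assumptions for illustration; check them against the connector's configuration reference before relying on them.

```python
from typing import Any, Dict, Mapping


def build_looker_recipe(env: Mapping[str, str]) -> Dict[str, Any]:
    """Assemble a recipe for the `looker` source from environment-style values."""
    return {
        "source": {
            "type": "looker",
            "config": {
                # Assumed config keys; verify against the connector docs.
                "base_url": env["LOOKER_BASE_URL"],
                "client_id": env["LOOKER_CLIENT_ID"],
                "client_secret": env["LOOKER_CLIENT_SECRET"],
            },
        },
        "sink": {
            "type": "datahub-rest",
            "config": {"server": env["DATAHUB_GMS_HOST"]},
        },
    }


def run_recipe(recipe: Dict[str, Any]) -> None:
    """Run the recipe in-process; requires the acryl-datahub package."""
    # Imported lazily so that building a recipe does not require DataHub.
    from datahub.ingestion.run.pipeline import Pipeline

    pipeline = Pipeline.create(recipe)
    pipeline.run()
    pipeline.raise_from_status()
```

Calling `run_recipe(build_looker_recipe(os.environ))` would start ingestion; keeping the secret in an environment variable avoids writing it into a file.
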
### Ingestion through UI

The following video shows you how to get started with ingesting Looker metadata through the UI.

:::note

You will need to run `lookml` ingestion through the CLI after you have ingested Looker metadata through the UI. Otherwise, you will not be able to see Looker Views and their lineage to your warehouse tables.

:::

<div
  style={{
    position: "relative",
    paddingBottom: "57.692307692307686%",
    height: 0
  }}
>
  <iframe
    src="https://www.loom.com/embed/b8b9654e02714d20a44122cc1bffc1bb"
    frameBorder={0}
    webkitallowfullscreen=""
    mozallowfullscreen=""
    allowFullScreen=""
    style={{
      position: "absolute",
      top: 0,
      left: 0,
      width: "100%",
      height: "100%"
    }}
  />
</div>

@@ -0,0 +1,11 @@
#### Configuration Notes

:::note

The integration can use an SQL parser to try to parse the tables the views depend on.

:::

This parsing is disabled by default, but can be enabled by setting `parse_table_names_from_sql: True`. The default parser is based on the [`sqllineage`](https://pypi.org/project/sqllineage/) package.
As this package doesn't officially support all the SQL dialects that Looker supports, the result might not be correct. You can, however, implement a custom parser and use it by setting the `sql_parser` configuration value. A custom SQL parser must inherit from `datahub.utilities.sql_parser.SQLParser`
and must be made available to DataHub, for example by installing it. The configuration value must then be set to the parser's `module_name.ClassName`.
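
To make the custom parser requirement concrete, here is a minimal sketch of what such a class might look like. It assumes the base class takes the SQL text in its constructor and expects `get_tables()` and `get_columns()` to be implemented; that interface, the `RegexTableParser` name, and the regex approach are illustrative assumptions and should be checked against the installed DataHub version. A stand-in base class is defined when DataHub is not importable so that the sketch runs on its own.

```python
import re
from abc import ABC, abstractmethod
from typing import List

try:
    from datahub.utilities.sql_parser import SQLParser
except ImportError:

    class SQLParser(ABC):  # stand-in with the assumed interface
        def __init__(self, sql_query: str) -> None:
            self._sql_query = sql_query

        @abstractmethod
        def get_tables(self) -> List[str]:
            ...

        @abstractmethod
        def get_columns(self) -> List[str]:
            ...


class RegexTableParser(SQLParser):
    """Naive parser: collects identifiers that follow FROM or JOIN."""

    _TABLE_RE = re.compile(r"\b(?:from|join)\s+([A-Za-z_][\w.]*)", re.IGNORECASE)

    def __init__(self, sql_query: str) -> None:
        super().__init__(sql_query)
        self._query = sql_query

    def get_tables(self) -> List[str]:
        tables: List[str] = []
        for name in self._TABLE_RE.findall(self._query):
            if name not in tables:  # keep first-seen order, drop repeats
                tables.append(name)
        return tables

    def get_columns(self) -> List[str]:
        # Column extraction is out of scope for this sketch.
        return []
```

If this class lived in an installed module named `my_parsers` (a hypothetical name), the configuration value would be `sql_parser: my_parsers.RegexTableParser`. A regex like this misses subqueries, CTEs, and quoted identifiers, so it is only a starting point.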
@@ -0,0 +1,84 @@
### Pre-requisites

#### [Optional] Create an API key

See the [Looker authentication docs](https://docs.looker.com/reference/api-and-integration/api-auth#authentication_with_an_sdk) for the steps to create a client ID and secret.
You need to ensure that the API key is attached to a user that has Admin privileges.

If that is not possible, read the configuration section and provide an offline specification of the `connection_to_platform_map` and the `project_name`.
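
For the offline route, the recipe omits the Looker API credentials and states the connection mapping and project name directly. The sketch below assembles such a recipe as a Python dictionary. The helper name, the example values, and the simple connection-name-to-platform mapping are assumptions for illustration; the configuration section remains the authority on the accepted shapes.

```python
from typing import Any, Dict, Mapping


def build_offline_lookml_recipe(
    base_folder: str,
    project_name: str,
    connection_to_platform_map: Mapping[str, str],
    server: str,
) -> Dict[str, Any]:
    """Build a `lookml` recipe that does not need Looker API credentials."""
    return {
        "source": {
            "type": "lookml",
            "config": {
                "base_folder": base_folder,
                # Supplied by hand because no API key is available.
                "project_name": project_name,
                "connection_to_platform_map": dict(connection_to_platform_map),
            },
        },
        "sink": {"type": "datahub-rest", "config": {"server": server}},
    }


# Hypothetical values for illustration only.
recipe = build_offline_lookml_recipe(
    base_folder="/path/to/lookml/repo",
    project_name="my_lookml_project",
    connection_to_platform_map={"my_connection": "snowflake"},
    server="http://localhost:8080",
)
```

The resulting dictionary mirrors the YAML structure a recipe file would have, so it can be dumped to a file or handed to a programmatic pipeline runner.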

### Ingestion through UI

Ingestion using the LookML connector is not supported through the UI.
However, you can set up ingestion using a GitHub Action to push metadata whenever your main LookML repo changes.

#### Sample GitHub Action

Drop this file into the `.github/workflows` directory inside your Looker GitHub repo.

```
name: lookml metadata upload
on:
  push:
    branches:
      - main
    paths-ignore:
      - "docs/**"
      - "**.md"
  pull_request:
    branches:
      - main
    paths-ignore:
      - "docs/**"
      - "**.md"
  release:
    types: [published, edited]
  workflow_dispatch:
jobs:
  lookml-metadata-upload:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-python@v4
        with:
          python-version: '3.9'
      - name: Run LookML ingestion
        run: |
          pip install 'acryl-datahub[lookml,datahub-rest]'
          cat << EOF > lookml_ingestion.yml
          # LookML ingestion configuration
          source:
            type: "lookml"
            config:
              base_folder: ${{ github.workspace }}
              parse_table_names_from_sql: true
              github_info:
                repo: ${{ github.repository }}
                branch: ${{ github.ref }}
              # Options
              #connection_to_platform_map:
              #  acryl-snow: snowflake
              #platform: snowflake
              #default_db: DEMO_PIPELINE
              api:
                client_id: ${LOOKER_CLIENT_ID}
                client_secret: ${LOOKER_CLIENT_SECRET}
                base_url: ${LOOKER_BASE_URL}
          sink:
            type: datahub-rest
            config:
              server: ${DATAHUB_GMS_HOST}
              token: ${DATAHUB_TOKEN}
          EOF
          datahub ingest -c lookml_ingestion.yml
        env:
          DATAHUB_GMS_HOST: ${{ secrets.DATAHUB_GMS_HOST }}
          DATAHUB_TOKEN: ${{ secrets.DATAHUB_TOKEN }}
          LOOKER_BASE_URL: https://acryl.cloud.looker.com # <--- replace with your Looker base URL
          LOOKER_CLIENT_ID: ${{ secrets.LOOKER_CLIENT_ID }}
          LOOKER_CLIENT_SECRET: ${{ secrets.LOOKER_CLIENT_SECRET }}
```

If you want to ingest LookML using the **datahub** CLI directly, read on for instructions and configuration details.
@@ -1,4 +1,29 @@
To get all metadata from Snowflake you need to use two plugins `snowflake` and `snowflake-usage`. Both of them are described in this page. These will require 2 separate recipes.
Ingesting metadata from Snowflake requires either using the **snowflake-beta** module with just one recipe (recommended) or the two separate modules **snowflake** and **snowflake-usage** (soon to be deprecated) with two separate recipes.

All three modules are described on this page.

We encourage you to try out new `snowflake-beta` plugin as alternative to running both `snowflake` and `snowflake-usage` plugins and share feedback. `snowflake-beta` is much faster than `snowflake` for extracting metadata .
We encourage you to try out the new **snowflake-beta** plugin as an alternative to running both the **snowflake** and **snowflake-usage** plugins, and to share feedback. `snowflake-beta` is much faster than `snowflake` for extracting metadata.

## Snowflake Ingestion through the UI

The following video shows you how to ingest Snowflake metadata through the UI.

<div style={{ position: "relative", paddingBottom: "56.25%", height: 0 }}>
  <iframe
    src="https://www.loom.com/embed/15d0401caa1c4aa483afef1d351760db"
    frameBorder={0}
    webkitallowfullscreen=""
    mozallowfullscreen=""
    allowFullScreen=""
    style={{
      position: "absolute",
      top: 0,
      left: 0,
      width: "100%",
      height: "100%"
    }}
  />
</div>

Read on if you are interested in ingesting Snowflake metadata using the **datahub** CLI, or want to learn about all the configuration parameters that are supported by the connectors.