feat(ingest): snowflake - update snowflake docs, add simple validatio…
mayurinehate authored and cccs-Dustin committed Feb 1, 2023
1 parent 19fee26 commit 02510af
Showing 13 changed files with 43 additions and 2,049 deletions.
2 changes: 1 addition & 1 deletion docs/how/updating-datahub.md
@@ -9,7 +9,6 @@ This file documents any backwards-incompatible changes in DataHub and assists pe
- #6243 apache-ranger authorizer as plugin is not supported in DataHub Kubernetes deployment.
- #6243 Authentication and Authorization plugins configuration are removed from [application.yml](../../metadata-service/factories/src/main/resources/application.yml). Refer documentation [Migration Of Plugins From application.yml](../plugins.md#migration-of-plugins-from-applicationyml) for migrating any existing custom plugins.
- `datahub check graph-consistency` command has been removed. It was a beta API that we had considered, but we decided there are better solutions for this.

- `graphql_url` option of `powerbi-report-server` source is deprecated, as the option is not used.

### Potential Downtime
@@ -19,6 +18,7 @@ This file documents any backwards-incompatible changes in DataHub and assists pe
### Other notable Changes

- #6611 - Snowflake `schema_pattern` now accepts patterns for the fully qualified schema name in the format `<catalog_name>.<schema_name>` when the config `match_fully_qualified_names: True` is set. The current default, `match_fully_qualified_names: False`, exists only to maintain backward compatibility. The config option `match_fully_qualified_names` will be deprecated in the future, and the default behavior will then assume `match_fully_qualified_names: True`.
- #6636 - Sources `snowflake-legacy` and `snowflake-usage-legacy` have been removed.
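
To illustrate the #6611 note, here is a minimal Python sketch of how a `schema_pattern` allow list could behave with and without `match_fully_qualified_names`. It is not DataHub's implementation: the helper `schema_allowed` and the sample pattern are hypothetical, and the sketch assumes plain `re.match` semantics.

```python
import re


def schema_allowed(catalog, schema, allow_patterns, match_fully_qualified_names=False):
    """Return True if the schema matches any allow pattern.

    Hypothetical helper for illustration only; not part of DataHub.
    """
    # With the flag on, patterns are tested against "<catalog_name>.<schema_name>";
    # with it off, they are tested against "<schema_name>" alone.
    candidate = f"{catalog}.{schema}" if match_fully_qualified_names else schema
    return any(re.match(pattern, candidate) for pattern in allow_patterns)


# A pattern written against the fully qualified name (assumed example).
patterns = [r"^ACCOUNTING_DB\.PUBLIC$"]

legacy = schema_allowed("ACCOUNTING_DB", "PUBLIC", patterns)
qualified = schema_allowed(
    "ACCOUNTING_DB", "PUBLIC", patterns, match_fully_qualified_names=True
)
print(legacy, qualified)
```

Under these assumptions, a pattern written for the fully qualified name only matches once the flag is enabled, which is why existing recipes keep working under the backward-compatible default.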

## 0.9.3

7 changes: 1 addition & 6 deletions metadata-ingestion/docs/sources/snowflake/README.md
@@ -1,7 +1,3 @@
Ingesting metadata from Snowflake requires either using the **snowflake** module with just one recipe (recommended) or the two separate modules **snowflake-legacy** and **snowflake-usage-legacy** (soon to be deprecated) with two separate recipes.

All three modules are described on this page.

## Snowflake Ingestion through the UI

The following video shows you how to ingest Snowflake metadata through the UI.
@@ -23,5 +19,4 @@ The following video shows you how to ingest Snowflake metadata through the UI.
/>
</div>


Read on if you are interested in ingesting Snowflake metadata using the **datahub** cli, or want to learn about all the configuration parameters that are supported by the connectors.
Read on if you are interested in ingesting Snowflake metadata using the **datahub** cli, or want to learn about all the configuration parameters that are supported by the connectors.
56 changes: 0 additions & 56 deletions metadata-ingestion/docs/sources/snowflake/snowflake-legacy_pre.md

This file was deleted.

This file was deleted.

This file was deleted.

This file was deleted.

33 changes: 14 additions & 19 deletions metadata-ingestion/docs/sources/snowflake/snowflake_recipe.yml
@@ -1,11 +1,8 @@
source:
type: snowflake
config:
# This option is recommended to be used for the first time to ingest all lineage
# This option is recommended to be used to ingest all lineage
ignore_start_time_lineage: true
# This is an alternative option to specify the start_time for lineage
# if you don't want to look back since beginning
start_time: "2022-03-01T00:00:00Z"

# Coordinates
account_id: "abc48144"
@@ -16,25 +13,23 @@ source:
password: "${SNOWFLAKE_PASS}"
role: "datahub_role"

# Change these as per your database names. Remove to get all databases
database_pattern:
allow:
- "^ACCOUNTING_DB$"
- "^MARKETING_DB$"

table_pattern:
allow:
# If you want to ingest only few tables with name revenue and sales
- ".*revenue"
- ".*sales"
# (Optional) Uncomment and update this section to filter ingested datasets
# database_pattern:
# allow:
# - "^ACCOUNTING_DB$"
# - "^MARKETING_DB$"

profiling:
# Change to false to disable profiling
enabled: true
# This option is recommended to reduce profiling time and costs.
turn_off_expensive_profiling_metrics: true
profile_pattern:
allow:
- "ACCOUNTING_DB.*.*"
- "MARKETING_DB.*.*"

# (Optional) Uncomment and update this section to filter profiled tables
# profile_pattern:
# allow:
# - "ACCOUNTING_DB.*.*"
# - "MARKETING_DB.*.*"

# Default sink is datahub-rest and doesn't need to be configured
# See https://datahubproject.io/docs/metadata-ingestion/sink_docs/datahub for customization options
8 changes: 0 additions & 8 deletions metadata-ingestion/setup.py
@@ -329,12 +329,6 @@ def get_long_description():
"s3": {*s3_base, *data_lake_profiling},
"sagemaker": aws_common,
"salesforce": {"simple-salesforce"},
"snowflake-legacy": snowflake_common,
"snowflake-usage-legacy": snowflake_common
| usage_common
| {
"more-itertools>=8.12.0",
},
"snowflake": snowflake_common | usage_common,
"snowflake-beta": (
snowflake_common | usage_common
@@ -532,8 +526,6 @@ def get_long_description():
"redash = datahub.ingestion.source.redash:RedashSource",
"redshift = datahub.ingestion.source.sql.redshift:RedshiftSource",
"redshift-usage = datahub.ingestion.source.usage.redshift_usage:RedshiftUsageSource",
"snowflake-legacy = datahub.ingestion.source.sql.snowflake:SnowflakeSource",
"snowflake-usage-legacy = datahub.ingestion.source.usage.snowflake_usage:SnowflakeUsageSource",
"snowflake = datahub.ingestion.source.snowflake.snowflake_v2:SnowflakeV2Source",
"superset = datahub.ingestion.source.superset:SupersetSource",
"tableau = datahub.ingestion.source.tableau:TableauSource",
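
The `setup.py` hunks above drop the `snowflake-legacy` and `snowflake-usage-legacy` entry points, so recipes that still name those source types will no longer resolve. The following sketch shows how an entry-point string of the form `name = module:Class` breaks down, and how a recipe's `type` could be checked before ingestion. Both helpers (`parse_entry_point`, `check_source_type`) and the error message are hypothetical, not DataHub code.

```python
# Source types removed by this commit, taken from the diff above.
REMOVED_SOURCES = {"snowflake-legacy", "snowflake-usage-legacy"}


def parse_entry_point(spec):
    """Split 'name = package.module:Attr' into (name, module, attr)."""
    name, _, target = spec.partition("=")
    module, _, attr = target.strip().partition(":")
    return name.strip(), module, attr


def check_source_type(source_type):
    """Raise ValueError if a recipe still refers to a removed source type."""
    if source_type in REMOVED_SOURCES:
        raise ValueError(
            f"source type {source_type!r} was removed; use 'snowflake' instead"
        )
    return source_type


parsed = parse_entry_point(
    "snowflake = datahub.ingestion.source.snowflake.snowflake_v2:SnowflakeV2Source"
)
print(parsed)
```

A check like this could run in CI against stored recipes to flag ones that need migrating to the single `snowflake` source.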