Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(ingest): add column-level lineage support for snowflake #6034

Merged
merged 3 commits into from
Sep 28, 2022

Conversation

mayurinehate
Copy link
Collaborator

@mayurinehate mayurinehate commented Sep 23, 2022

Other changes

  • fix missing view upstream lineage
  • remove customProperties with column-lineage info

Note:

  • Direct Column level lineage is available only for tables(as downstream) and not for views(as downstream).
  • The bag of columns lineage as available in customProperties earlier is not used as fine grained lineage, but could be used if required with small code change, if required.

Checklist

  • The PR conforms to DataHub's Contributing Guideline (particularly Commit Message Format)
  • Links to related issues (if applicable)
  • Tests for the changes have been added/updated (if applicable)
  • Docs related to the changes have been added/updated (if applicable). If a new feature has been added a Usage Guide has been added for the same.
  • For any breaking change/potential downtime/deprecation/big changes an entry has been made in Updating DataHub

@github-actions github-actions bot added the ingestion PR or Issue related to the ingestion of metadata label Sep 23, 2022
@github-actions
Copy link

github-actions bot commented Sep 23, 2022

Unit Test Results (metadata ingestion)

       8 files  ±0         8 suites  ±0   1h 0m 58s ⏱️ -33s
   719 tests +2     716 ✔️ +2  3 💤 ±0  0 ±0 
1 440 runs  +4  1 434 ✔️ +4  6 💤 ±0  0 ±0 

Results for commit 75e71c7. ± Comparison against base commit 0728656.

This pull request removes 8 and adds 10 tests. Note that renamed tests count towards both.
tests.unit.config.test_config_loader ‑ test_load[tests/unit/config/bad_complex_variable_expansion.yml-None-env6-MissingClosingBrace]
tests.unit.config.test_config_loader ‑ test_load[tests/unit/config/bad_extension.whatevenisthis-None-env5-ConfigurationError]
tests.unit.config.test_config_loader ‑ test_load[tests/unit/config/bad_variable_expansion.yml-None-env3-ParameterNullOrNotSet]
tests.unit.config.test_config_loader ‑ test_load[tests/unit/config/basic.toml-golden_config1-env1-None]
tests.unit.config.test_config_loader ‑ test_load[tests/unit/config/basic.yml-golden_config0-env0-None]
tests.unit.config.test_config_loader ‑ test_load[tests/unit/config/complex_variable_expansion.yml-golden_config7-env7-None]
tests.unit.config.test_config_loader ‑ test_load[tests/unit/config/simple_variable_expansion.yml-golden_config2-env2-None]
tests.unit.config.test_config_loader ‑ test_load[tests/unit/config/this_file_does_not_exist.yml-None-env4-ConfigurationError]
tests.integration.tableau.test_tableau_ingest ‑ test_tableau_usage_stat
tests.unit.config.test_config_loader ‑ test_load_error[tests/unit/config/bad_complex_variable_expansion.yml-env3-MissingClosingBrace]
tests.unit.config.test_config_loader ‑ test_load_error[tests/unit/config/bad_extension.whatevenisthis-env2-ConfigurationError]
tests.unit.config.test_config_loader ‑ test_load_error[tests/unit/config/bad_variable_expansion.yml-env0-ParameterNullOrNotSet]
tests.unit.config.test_config_loader ‑ test_load_error[tests/unit/config/this_file_does_not_exist.yml-env1-ConfigurationError]
tests.unit.config.test_config_loader ‑ test_load_success[tests/unit/config/basic.toml-golden_config1-env1-None]
tests.unit.config.test_config_loader ‑ test_load_success[tests/unit/config/basic.yml-golden_config0-env0-referenced_env_vars0]
tests.unit.config.test_config_loader ‑ test_load_success[tests/unit/config/complex_variable_expansion.yml-golden_config3-env3-referenced_env_vars3]
tests.unit.config.test_config_loader ‑ test_load_success[tests/unit/config/simple_variable_expansion.yml-golden_config2-env2-referenced_env_vars2]
tests.unit.stateful_ingestion.state.test_checkpoint ‑ test_supported_encodings

♻️ This comment has been updated with latest results.

@github-actions
Copy link

github-actions bot commented Sep 23, 2022

Unit Test Results (build & test)

584 tests  +16   580 ✔️ +16   12m 54s ⏱️ -2s
143 suites +  3       4 💤 ±  0 
143 files   +  3       0 ±  0 

Results for commit 75e71c7. ± Comparison against base commit 0728656.

This pull request removes 9 and adds 25 tests. Note that renamed tests count towards both.
com.datahub.authentication.user.NativeUserServiceTest ‑ testConstructor
com.datahub.authentication.user.NativeUserServiceTest ‑ testCreateNativeUserInviteTokenDoesNotExist
com.datahub.authentication.user.NativeUserServiceTest ‑ testGenerateNativeUserInviteTokenPasses
com.datahub.authentication.user.NativeUserServiceTest ‑ testGetNativeUserInviteTokenInviteTokenDoesNotExistPasses
com.datahub.authentication.user.NativeUserServiceTest ‑ testGetNativeUserInviteTokenPasses
com.linkedin.datahub.graphql.resolvers.user.CreateNativeUserInviteTokenResolverTest ‑ testFailsCannotManageUserCredentials
com.linkedin.datahub.graphql.resolvers.user.CreateNativeUserInviteTokenResolverTest ‑ testPasses
com.linkedin.datahub.graphql.resolvers.user.GetNativeUserInviteTokenResolverTest ‑ testFailsCannotManageUserCredentials
com.linkedin.datahub.graphql.resolvers.user.GetNativeUserInviteTokenResolverTest ‑ testPasses
com.datahub.authentication.invite.InviteTokenServiceTest ‑ getInviteToken
com.datahub.authentication.invite.InviteTokenServiceTest ‑ getInviteTokenEmptySearchResult
com.datahub.authentication.invite.InviteTokenServiceTest ‑ getInviteTokenNoInviteTokenAspect
com.datahub.authentication.invite.InviteTokenServiceTest ‑ getInviteTokenNullEntity
com.datahub.authentication.invite.InviteTokenServiceTest ‑ getInviteTokenRegenerate
com.datahub.authentication.invite.InviteTokenServiceTest ‑ getInviteTokenRoleUrnDoesNotExist
com.datahub.authentication.invite.InviteTokenServiceTest ‑ testGetInviteTokenRole
com.datahub.authentication.invite.InviteTokenServiceTest ‑ testGetInviteTokenRoleEmptyAspectMap
com.datahub.authentication.invite.InviteTokenServiceTest ‑ testGetInviteTokenRoleNoRole
com.datahub.authentication.invite.InviteTokenServiceTest ‑ testGetInviteTokenRoleNullEntity
…

♻️ This comment has been updated with latest results.

view_name,
db_row["VIEW_COLUMNS"],
db_row["DOWNSTREAM_TABLE_COLUMNS"],
json.loads(db_row["VIEW_COLUMNS"]),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what happens if this bombs?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, I don't expect bombing since the VIEW_COLUMNS column is equivalent to columns json array as per snowflake access history view documentation and the dict to class conversion is protected already.

Copy link
Collaborator

@hsheth2 hsheth2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

have some questions around __eq__ and __hash__

Copy link
Collaborator

@hsheth2 hsheth2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@hsheth2 hsheth2 changed the title feat(ingest): add column level lineage support in snowflake source feat(ingest): add column-level lineage support for snowflake Sep 28, 2022
@hsheth2 hsheth2 merged commit 21a8718 into datahub-project:master Sep 28, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ingestion PR or Issue related to the ingestion of metadata
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants