feat(ingest): add ability to preserve dbt table identifier casing #7854

Closed
viplazylmht wants to merge 4 commits

viplazylmht

Summary

Resolves issue #7853.

Checklist

  • The PR conforms to DataHub's Contributing Guideline (particularly Commit Message Format)
  • Links to related issues (if applicable)
  • Tests for the changes have been added/updated (if applicable)
  • Docs related to the changes have been added/updated (if applicable). If a new feature has been added a Usage Guide has been added for the same.
  • For any breaking change/potential downtime/deprecation/big changes an entry has been made in Updating DataHub

github-actions bot added the ingestion label (PR or Issue related to the ingestion of metadata) on Apr 19, 2023
@hsheth2
Collaborator

hsheth2 commented May 24, 2023

@viplazylmht the code overall looks good here.

However, I've generally found that setting convert_dataset_urns to True everywhere more reliably produces correct lineage, so I'd like to understand the motivation behind this PR.

@laulpogan
Contributor

Hi @viplazylmht, we haven't seen activity on this PR for a while. Are you still interested in contributing? If we don't hear back from you within a week, we'll go ahead and close it.

@viplazylmht
Author

@hsheth2 @laulpogan I'm here. convert_dataset_urns_to_lowercase currently defaults to True, so this change will not break any existing lineage.

In my case, I use DataHub with dbt and BigQuery. The BigQuery source has a convert_urns_to_lowercase option, but it defaults to False, so the URNs it produces differ from the ones dbt produces (our BigQuery tables are named in UPPERCASE).

I am planning to integrate dbt into our existing DataHub and BigQuery production environment, so the dbt source needs an equivalent option. The alternative would be dropping all current metadata and re-ingesting everything.
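
To make the mismatch concrete, here is a minimal sketch of how casing changes a dataset URN. It assumes the usual DataHub dataset URN layout, `urn:li:dataset:(urn:li:dataPlatform:<platform>,<name>,<env>)`. The helper `build_dataset_urn`, its `lowercase` flag, and the table name are hypothetical and only for illustration; this is not the code from the PR or the DataHub library API.

```python
def build_dataset_urn(platform: str, name: str, env: str = "PROD",
                      lowercase: bool = True) -> str:
    """Hypothetical helper (not DataHub's API): build a dataset URN string,
    optionally lowercasing the table identifier first."""
    if lowercase:
        name = name.lower()
    return f"urn:li:dataset:(urn:li:dataPlatform:{platform},{name},{env})"


# Made-up BigQuery table whose dataset and table names are in UPPERCASE.
table = "my-project.ANALYTICS.ORDERS"

# BigQuery ingestion with lowercasing disabled keeps the original casing.
bigquery_urn = build_dataset_urn("bigquery", table, lowercase=False)

# dbt ingestion that always lowercases yields a different URN string.
dbt_urn_lowercased = build_dataset_urn("bigquery", table, lowercase=True)

# dbt ingestion that preserves casing yields the same URN as BigQuery.
dbt_urn_preserved = build_dataset_urn("bigquery", table, lowercase=False)

print(bigquery_urn == dbt_urn_lowercased)  # False: two separate entities
print(bigquery_urn == dbt_urn_preserved)   # True: metadata lands on one entity
```

Because URNs are compared as plain strings, the two sources only attach metadata to the same entity when they agree on casing, which is why either both lowercase or both preserve.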

anshbansal added the community-contribution label (PR or Issue raised by member(s) of DataHub Community) on Jun 23, 2023
shirshanka added the on-deck label (PR or Issue that will be reviewed and/or addressed by the DataHub Maintainers in future cycles) on Jun 28, 2024
viplazylmht closed this by deleting the head repository on Oct 13, 2024
Labels
  • community-contribution: PR or Issue raised by member(s) of DataHub Community
  • ingestion: PR or Issue related to the ingestion of metadata
  • on-deck: PR or Issue that will be reviewed and/or addressed by the DataHub Maintainers in future cycles
5 participants