-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(tableau): skip ingesting filtered project assets #12591
fix(tableau): skip ingesting filtered project assets #12591
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Continue to review full report in Codecov by Sentry.
|
metadata-ingestion/tests/integration/tableau/test_tableau_ingest.py
Outdated
Show resolved
Hide resolved
metadata-ingestion/src/datahub/ingestion/source/tableau/tableau.py
Outdated
Show resolved
Hide resolved
Thanks for raising the issue and the proposal. Overall, it looks good. As a general comment, wondering if it would be possible to do the filtering upper in the call stack? Without a heavy refactor, of course. Ideally, and not sure if we are very consistent with that, emit methods should be exclusively responsible for generating the aspects and build the |
metadata-ingestion/tests/integration/tableau/test_tableau_ingest.py
Outdated
Show resolved
Hide resolved
metadata-ingestion/src/datahub/ingestion/source/tableau/tableau.py
Outdated
Show resolved
Hide resolved
I agree, but due to the way everything works, by traversing everything from downstream to upstream and the fact that even though published datasource is an upstream for embedded datasources, they still share all the upstreams, I can't think of a way to do it without a heavy refactoring. |
metadata-ingestion/src/datahub/ingestion/source/tableau/tableau.py
Outdated
Show resolved
Hide resolved
Those files may have been accidentally pushed. Maybe some issue during the merge? Once they are removed from the PR, this is ready to be merged. |
…-filtered-project-assets
Co-authored-by: Sergio Gómez Villamor <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Thanks for the contrib!
dcf1576
into
datahub-project:master
…#12591) Co-authored-by: Mihai Ciocirdel <[email protected]> Co-authored-by: Jay <[email protected]> Co-authored-by: Aseem Bansal <[email protected]> Co-authored-by: Sergio Gómez Villamor <[email protected]>
…#12591) Co-authored-by: Mihai Ciocirdel <[email protected]> Co-authored-by: Jay <[email protected]> Co-authored-by: Aseem Bansal <[email protected]> Co-authored-by: Sergio Gómez Villamor <[email protected]>
Issue Description:
Problem: The Tableau asset ingestion process is designed to filter and ingest allowed projects first. Subsequent layer-by-layer asset ingestion incorrectly includes published data sources from projects that should be filtered out if these data sources are upstream of embedded data sources in allowed projects. This results in unauthorized assets being ingested without correct 'browsePaths', as their parent projects are not ingested.
Expected: The ingestion should strictly honor project filters and not ingest any upstream assets from disallowed projects. Assets only in allowed projects should have their 'browsePaths' created and ingested.
Checklist