fix(ingest): bigquery-beta - Lowering a bit memory footprint of bigquery usage #6095

Merged

treff7es
Contributor

  • Slightly lowers the memory footprint of BigQuery usage extraction
  • Filters tables that were not seen out of usage generation

Checklist

  • The PR conforms to DataHub's Contributing Guideline (particularly Commit Message Format)
  • Links to related issues (if applicable)
  • Tests for the changes have been added/updated (if applicable)
  • Docs related to the changes have been added/updated (if applicable). If a new feature has been added a Usage Guide has been added for the same.
  • For any breaking change/potential downtime/deprecation/big changes an entry has been made in Updating DataHub

Filtering out not seen tables from usage generation
@github-actions github-actions bot added the ingestion PR or Issue related to the ingestion of metadata label Sep 30, 2022
github-actions bot commented Sep 30, 2022

Unit Test Results (metadata ingestion)

8 files, 8 suites, 56m 7s ⏱️
719 tests: 716 ✔️ passed, 3 💤 skipped, 0 failed
1 440 runs: 1 434 ✔️ passed, 6 💤 skipped, 0 failed

Results for commit 150f0e9.

♻️ This comment has been updated with latest results.

@github-actions

Unit Test Results (build & test)

584 tests ±0, 143 suites ±0, 143 files ±0, 12m 52s ⏱️ -37s
580 ✔️ passed ±0, 4 💤 skipped ±0, 0 failed ±0

Results for commit 150f0e9. ± Comparison against base commit 79575b2.

:::
"""

aggregated_info: Dict[
Collaborator

Out of curiosity, what was the reason for moving this into generate_usage_for_project?

Contributor Author

Because I wanted to make sure we don't keep this dict in memory longer than necessary, and that if we run the usage extraction multiple times, it won't add other projects' usage to the same dict.
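
A minimal sketch of that scoping idea, not the actual DataHub code: only the names `generate_usage_for_project` and `aggregated_info` come from this PR, while the signature, the `(project_id, table_name)` event shape, and the `seen_tables` parameter are hypothetical and chosen for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

# Hypothetical event shape for illustration: (project_id, table_name).
Event = Tuple[str, str]


def generate_usage_for_project(
    project_id: str,
    events: Iterable[Event],
    seen_tables: Set[str],
) -> Dict[str, int]:
    # The dict is local to this call, so it can be garbage-collected once the
    # call returns, and a later call for another project starts from empty.
    aggregated_info: Dict[str, int] = defaultdict(int)
    for event_project, table in events:
        if event_project != project_id:
            continue
        # Skip tables that were not seen, so no usage is generated for them.
        if table not in seen_tables:
            continue
        aggregated_info[table] += 1
    return dict(aggregated_info)
```

With the dict created inside the function, calling it once per project cannot mix one project's counts into another's, which a longer-lived shared dict would allow.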

Collaborator

@hsheth2 hsheth2 Sep 30, 2022

Makes sense - feel free to merge whenever

@treff7es treff7es merged commit 05f5c12 into datahub-project:master Oct 1, 2022
@treff7es treff7es deleted the bigquery-usage-mem-improvements branch February 8, 2023 11:57