Rewrite dbt cloud crawler using discovery API #1052

Merged: 11 commits, Dec 26, 2024

@alyiwang alyiwang commented Dec 24, 2024

🤔 Why?

The crawler should fetch most of its metadata from the dbt Discovery API environment endpoint instead of from jobs. This simplifies the steps and yields more complete lineage info.

🤓 What?

  • Rewrite the whole dbt Cloud crawler to use the Discovery API environment endpoint, plus the admin API to list all projects and environments
  • Update the required configs; job_ids is no longer used
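
For reviewers, a minimal sketch of how paging through one environment's models via the Discovery API could look. This is not the code in this PR. The GraphQL field names (`environment.applied.models`, `pageInfo`, `edges`) are assumptions based on dbt's public Discovery API schema and should be checked against it; `build_models_request`, `iter_models`, and the injected `post` callable are hypothetical names used only for illustration. The environment IDs would come from the admin API listing mentioned above.

```python
from typing import Any, Callable, Dict, Iterator, Optional

# Assumed query shape for the Discovery API environment endpoint;
# verify field names against the current dbt schema before relying on it.
MODELS_QUERY = """
query Models($environmentId: BigInt!, $first: Int!, $after: String) {
  environment(id: $environmentId) {
    applied {
      models(first: $first, after: $after) {
        pageInfo { hasNextPage endCursor }
        edges { node { uniqueId name } }
      }
    }
  }
}
"""

# Hypothetical transport: takes a JSON request body, returns the decoded JSON response.
Post = Callable[[Dict[str, Any]], Dict[str, Any]]


def build_models_request(
    environment_id: int, first: int, after: Optional[str]
) -> Dict[str, Any]:
    """Build the JSON body for one page of the models query."""
    return {
        "query": MODELS_QUERY,
        "variables": {
            "environmentId": environment_id,
            "first": first,
            "after": after,
        },
    }


def iter_models(
    post: Post, environment_id: int, page_size: int = 500
) -> Iterator[Dict[str, Any]]:
    """Yield every model node in an environment, following cursors to the end."""
    after: Optional[str] = None
    while True:
        response = post(build_models_request(environment_id, page_size, after))
        models = response["data"]["environment"]["applied"]["models"]
        for edge in models["edges"]:
            yield edge["node"]
        page_info = models["pageInfo"]
        if not page_info["hasNextPage"]:
            return
        after = page_info["endCursor"]
```

In practice `post` would wrap an authenticated HTTP POST to the account's Discovery API GraphQL URL; it is injected here so the paging logic can be exercised without network access.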

🧪 Tested?

Tested against the Metaphor dbt instance. Diffed the output against the MCE files generated by the previous crawler; the results are mostly the same.
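
A rough sketch of the kind of comparison described above, assuming each crawler output has already been loaded as a list of dicts. The real MCE file layout is not assumed here: the `id` key and the `diff_entities` helper are hypothetical and only illustrate the approach.

```python
from typing import Any, Dict, List


def diff_entities(
    old: List[Dict[str, Any]], new: List[Dict[str, Any]], key: str = "id"
) -> Dict[str, Any]:
    """Compare two entity lists by a shared key (hypothetical field name).

    Returns the keys that were added or removed, and for each key present
    in both lists, the sorted names of fields whose values differ.
    """
    old_by_key = {entity[key]: entity for entity in old}
    new_by_key = {entity[key]: entity for entity in new}

    added = sorted(new_by_key.keys() - old_by_key.keys())
    removed = sorted(old_by_key.keys() - new_by_key.keys())

    changed: Dict[str, List[str]] = {}
    for k in sorted(old_by_key.keys() & new_by_key.keys()):
        before, after = old_by_key[k], new_by_key[k]
        fields = sorted(
            f for f in before.keys() | after.keys() if before.get(f) != after.get(f)
        )
        if fields:
            changed[k] = fields

    return {"added": added, "removed": removed, "changed": changed}
```

A field that is present in one output and missing in the other (for example, a field the new crawler no longer fills) shows up under `changed` for that entity.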

Known differences

  • dbt model columns can now be retrieved
  • dbtModel.sourceModels is no longer filled, as it is a deprecated field in favor of entityUpstream
  • docsUrl is no longer filled, as the previous format https://cloud.getdbt.com/accounts/123/jobs/146/docs/#!/xxx is no longer supported by dbt
  • test.sql is not currently available; it can be retrieved later from the top-level environment.tests endpoint
  • metric label, dimensions, filters, and timeGrains are not currently available, but formula is now available

☑️ Checks

  • My PR contains actual code changes, and I have updated the version number in pyproject.toml.

@alyiwang alyiwang requested a review from mars-lan December 24, 2024 19:48

github-actions bot commented Dec 24, 2024

☂️ Python Coverage

current status: ✅

Overall Coverage

Lines    Covered    Coverage    Threshold    Status
13525    12117      90%         85%          🟢

New Files

File Coverage Status
metaphor/dbt/cloud/parser/env_parser.py 100% 🟢
metaphor/dbt/cloud/parser/lineage_parser.py 93% 🟢
metaphor/dbt/cloud/parser/macro_parser.py 96% 🟢
metaphor/dbt/cloud/parser/metric_parser.py 97% 🟢
metaphor/dbt/cloud/parser/source_parser.py 92% 🟢
TOTAL 96% 🟢

Modified Files

File Coverage Status
metaphor/common/entity_id.py 96% 🟢
metaphor/dbt/cloud/client.py 100% 🟢
metaphor/dbt/cloud/config.py 100% 🟢
metaphor/dbt/cloud/extractor.py 96% 🟢
metaphor/dbt/cloud/parser/common.py 71% 🟢
metaphor/dbt/util.py 94% 🟢
TOTAL 93% 🟢

updated for commit: 050dcb0 by action🐍

codecov bot commented Dec 24, 2024

Codecov Report

Attention: Patch coverage is 93.26425% with 26 lines in your changes missing coverage. Please review.

Project coverage is 89.58%. Comparing base (b4913d5) to head (050dcb0).
Report is 1 commit behind head on main.

Files with missing lines Patch % Lines
metaphor/dbt/cloud/parser/model_parser.py 86.25% 11 Missing ⚠️
metaphor/dbt/cloud/parser/lineage_parser.py 92.53% 5 Missing ⚠️
metaphor/dbt/cloud/parser/source_parser.py 91.83% 4 Missing ⚠️
metaphor/dbt/util.py 90.47% 2 Missing ⚠️
metaphor/common/entity_id.py 80.00% 1 Missing ⚠️
metaphor/dbt/cloud/extractor.py 94.73% 1 Missing ⚠️
metaphor/dbt/cloud/parser/macro_parser.py 96.42% 1 Missing ⚠️
metaphor/dbt/cloud/parser/metric_parser.py 96.77% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1052      +/-   ##
==========================================
+ Coverage   89.54%   89.58%   +0.04%     
==========================================
  Files         211      210       -1     
  Lines       13525    13525              
==========================================
+ Hits        12111    12117       +6     
+ Misses       1414     1408       -6     

@alyiwang alyiwang enabled auto-merge (squash) December 25, 2024 00:00

@mars-lan mars-lan left a comment

Great work! Thanks for the refactoring.

@alyiwang alyiwang merged commit 4433922 into main Dec 26, 2024
6 checks passed
@alyiwang alyiwang deleted the yi.wang/sc-29961/missing-complied-code-from-dbt branch December 26, 2024 15:26