Add new dbt example DAG #9
Conversation
… to permanently store the store_failures tables generated by dbt test.
Issues with the BigQuery table ID and the dbt schema led to dropping the custom test that was designed to fail and designing the month test to fail instead. For whatever reason, the custom test was trying to use schema.project_id as the project ID, and BigQuery did NOT like that; not sure what the fix is. Additionally, the DAG is completed by adding a BigQuery copy-table operator, which copies a single table written by dbt test on failures to a permanent path. This is a not-so-great, but workable, solution until AIP-42 gets merged and we can do some real dynamic task generation based on XCom return values. In preparation for that time, BigQueryGetDatasetTablesOperator is left in, commented out.
Renamed DAGs and cleaned up the BigQuery version to not use certain imports and operators until AIP-42 is released. Added a Snowflake version of the DAG, which can also use the FFMC test. It runs dbt run then dbt test, with 2 tests designed to fail; the outputs are then loaded to a permanent table in the original schema. The directory structure was modified to support these two DAGs as their own dbt projects, since they work slightly differently.
New DAG and dbt project were created and modified to allow use of Redshift as a backend to permanently store the store_failures tables generated by dbt on test failures.
```
"""
Run dbt test suite
```
Only feedback here is that, from the perspective of a new user (me), it's not clear reading this DAG what exactly the `dbt test` command is doing and what it's testing for. I don't know if that'd be obvious to someone with more dbt experience, but just wanted to call that out.
Gotcha. So the `dbt test` command runs the test suite specified under `include/dbt/[project]/models/[model]/schema.yml`. I can add this to the docstring.
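For readers unfamiliar with dbt, a minimal sketch of what such a `schema.yml` test suite might look like (the model and column names are illustrative, not from this PR); with `store_failures` enabled, rows that fail a test are written to an audit table, which is what the DAG then copies to a permanent location:

```yaml
version: 2

models:
  - name: orders            # illustrative model name
    columns:
      - name: order_id
        tests:
          - unique
          - not_null
      - name: order_month
        tests:
          # a test like the "month test" this PR designs to fail
          - accepted_values:
              values: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
              config:
                store_failures: true
```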
Killer stuff! Really great to see another set of dbt examples.
```
limitations in dynamic task mapping, where needed values like 'source_table'
cannot be retrieved from Variables or other backend sources.

One is given as an example.
```
Is this comment needed here and in `copy_store_failures_snowflake.py`?
Add new dbt example DAG and necessary updates for dbt to showcase how to permanently store the store_failures tables generated by dbt test. The DAG comes from a question from the data quality webinar:
From dbt PR #2593: