-
Notifications
You must be signed in to change notification settings - Fork 104
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DENG1381 - Add bqetl support for deprecation metadata #4213
Conversation
7f3093e
to
9cb8958
Compare
1e36fce
to
dd80ffb
Compare
Integration report for "Initial draft for adding deprecation support to bqetl"
|
Integration report for "Merge branch 'main' into DENG1381-support_deprecation"
|
3900d3c
to
5f82349
Compare
47b944b
to
14c44c1
Compare
Integration report for "Merge branch 'main' into DENG1381-support_deprecation"
|
7a87a6e
to
faf30c8
Compare
3d918a3
to
45a54a7
Compare
e0294bb
to
a0bcedb
Compare
32df3a0
to
798ebf1
Compare
Integration report for "Ignore dirs that do not have dataset_metadata yaml"
|
Integration report for "Merge branch 'main' into DENG1381-support_deprecation"
|
Integration report for "Merge branch 'main' into DENG1381-support_deprecation"
|
Integration report for "Merge branch 'main' into DENG1381-support_deprecation"
|
@whd and @scholtzan I have updated the code based on your comments. bigquery-etl/bigquery_etl/cli/metadata.py Line 51 in bd3a9ae
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
r+wc
@@ -0,0 +1,14 @@ | |||
friendly_name: cjms non prod derived tables |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe we discussed this in slack, but I'm of the opinion that we should (for now) only create dataset_metadata.yaml
files in moz-fx-data-shared-prod
, since that's where they are authoritative. So you should delete all of these dataset_metadata.yaml
files.
There's actually a single dataset_metadata.yaml
file https://github.com/mozilla/bigquery-etl/blob/bd3a9ae0b2e9c289ac93e67e426189d78749c2fd/sql/moz-fx-data-marketing-prod/adjust_derived/dataset_metadata.yaml that exists in a non-shared-prod dir and is technically incorrect. I think removing that can be handled in a separate PR.
workgroup_access: list = attr.ib(DEFAULT_WORKGROUP_ACCESS) | ||
|
||
def __attrs_post_init__(self): | ||
"""Set default table workgroup access to workgroup access.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, it appears that this block makes it so that a default value is always available, which means the logic in most recent suggestion won't do anything. I suppose that explains some earlier logic that was writing this file all the time redundantly. It seems fine to do that but there are probably ways to avoid it as well.
bd3a9ae
to
f05e634
Compare
Integration report for "Remove unwanted dataset metadata yamls"
|
Integration report for "Merge branch 'main' into DENG1381-support_deprecation"
|
Integration report for "Update bigquery_etl/cli/metadata.py"
|
Integration report for "Merge branch 'main' into DENG1381-support_deprecation"
|
Integration report for "Merge branch 'main' into DENG1381-support_deprecation"
|
https://github.com/mozilla/private-bigquery-etl/commit/dbc0d01b41c947833718e926ba2f1ce8ab58af31 has the generated diff from this change. From my cursory scan it looks correct with some miscellaneous formatting changes but does have some interesting effects:
@scholtzan do you think it's worth trying to preserve comments in generated metadata? |
There shouldn't be downstream effects from this. Afaik we can't preserve comments. Once we parse the metadata files and enrich them with additional information the comments are lost. I think we'd need to use a different library for working with yaml files, |
Implements - https://mozilla-hub.atlassian.net/browse/DENG-1381
Checklist for reviewer:
<username>:<branch>
of the fork as parameter. The parameter will also show upin the logs of the
manual-trigger-required-for-fork
CI task together with more detailed instructions.For modifications to schemas in restricted namespaces (see
CODEOWNERS
):┆Issue is synchronized with this Jira Task