-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(ingest): bigquery - Fixing querying non-date partition columns in profiling #6554
fix(ingest): bigquery - Fixing querying non-date partition columns in profiling #6554
Conversation
Adding to the source report the number of non-eligible tables for profiling
Unit Test Results (metadata ingestion) 8 files ±0 8 suites ±0 57m 52s ⏱️ - 5m 51s For more details on these failures, see this check. Results for commit fa1b51a. ± Comparison against base commit d424edd. ♻️ This comment has been updated with latest results. |
if table.time_partitioning.type_ in ("DAY", "MONTH", "YEAR"): | ||
partition_where_clause = "{column_name} BETWEEN DATE('{partition_id}') AND DATE('{upper_bound_partition_id}')".format( | ||
partition_where_clause = "{column_name} BETWEEN {partition_column_type}('{partition_id}') AND {partition_column_type}('{upper_bound_partition_id}')".format( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion: you could use f-strings here and avoid .format
else: | ||
skip_profiling = True | ||
profile_table_level_only = True | ||
self.report.num_tables_not_eligible_profiling[dataset] = ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion: make num_tables_not_eligible_profiling
a Counter and no need to do this tricks.
just: self.report.num_tables_not_eligible_profiling[dataset]++
https://docs.python.org/3/library/collections.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we try to use in report structs that can't grow indefinitely and there this way I think won't work :(
@@ -66,7 +66,7 @@ class OperationalDataMeta: | |||
def bigquery_audit_metadata_query_template( | |||
dataset: str, | |||
use_date_sharded_tables: bool, | |||
table_allow_filter: str = None, | |||
table_allow_filter: Optional[str] = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
was this change necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This field should be Optional and mypy (linter) was complaining about it.
Adding to the source report the number of non-eligible tables for profiling
Checklist