fix(ingest/bigquery): ignore complex types from profiling (datahub-pr…
treff7es authored and cccs-Dustin committed Feb 1, 2023
1 parent d1fe32d commit 0f12509
Showing 1 changed file with 10 additions and 0 deletions.
@@ -165,6 +165,16 @@ def get_workunits(
        continue

    for table in tables[project][dataset]:
        for column in table.columns:
            # Profiler has issues with complex types (array, struct, geography, json), so we deny those types from profiling
            # We also filter columns without data type as it means that column is part of a complex type.
            if not column.data_type or any(
                word in column.data_type.lower()
                for word in ["array", "struct", "geography", "json"]
            ):
                self.config.profile_pattern.deny.append(
                    f"^{project}.{dataset}.{table.name}.{column.field_path}$"
                )
        # Emit the profile work unit
        profile_request = self.get_bigquery_profile_request(
            project=project, dataset=dataset, table=table
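
The filtering rule added in this hunk can be tried in isolation. The following is a minimal sketch, not the DataHub implementation: the `Column` dataclass, `is_complex_or_untyped`, `build_deny_patterns`, and the sample project, dataset, and table names are all hypothetical stand-ins. Only the type-word list and the pattern format are taken from the diff.

```python
from dataclasses import dataclass
from typing import List, Optional

# Type keywords copied from the diff; matching is by case-insensitive substring.
COMPLEX_TYPE_WORDS = ["array", "struct", "geography", "json"]


@dataclass
class Column:
    """Hypothetical stand-in for the column objects iterated in the diff."""
    field_path: str
    data_type: Optional[str]


def is_complex_or_untyped(column: Column) -> bool:
    # Same condition as the diff: a missing data type, or a complex-type keyword.
    return not column.data_type or any(
        word in column.data_type.lower() for word in COMPLEX_TYPE_WORDS
    )


def build_deny_patterns(
    project: str, dataset: str, table_name: str, columns: List[Column]
) -> List[str]:
    # Same pattern format as the diff. The dots are not escaped, so in a regex
    # each one matches any single character, not only a literal ".".
    return [
        f"^{project}.{dataset}.{table_name}.{column.field_path}$"
        for column in columns
        if is_complex_or_untyped(column)
    ]


# Made-up sample schema for illustration only.
columns = [
    Column("id", "INT64"),
    Column("tags", "ARRAY<STRING>"),
    Column("address", "STRUCT<city STRING>"),
    Column("address.city", None),  # no data type: treated as part of a complex type
    Column("location", "GEOGRAPHY"),
    Column("name", "STRING"),
]

deny = build_deny_patterns("proj", "ds", "users", columns)
for pattern in deny:
    print(pattern)
```

With this sample schema, only `id` and `name` stay eligible for profiling. Because the check is a substring match, parameterized types such as `ARRAY<STRING>` are caught without parsing the type string.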