Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(ui) Handle encoded schemaField urns on the frontend #6321

Conversation

chriscollins3456
Copy link
Collaborator

Our ingestion crew had to encode schemaField urns so that the fieldPath of the schema is encoding a few special characters ((, ), and ,). We need to decode those on the frontend wherever we display fieldPaths that we get from upstreamLineage aspects. We also need to encode fieldPaths for CLL impact analysis since the urn in our graph index will have the encoded urns.

I also added a comment where we ingestion calls this encoding to make sure they are always in sync with the frontend code as well.

Checklist

  • The PR conforms to DataHub's Contributing Guideline (particularly Commit Message Format)
  • Links to related issues (if applicable)
  • Tests for the changes have been added/updated (if applicable)
  • Docs related to the changes have been added/updated (if applicable). If a new feature has been added a Usage Guide has been added for the same.
  • For any breaking change/potential downtime/deprecation/big changes an entry has been made in Updating DataHub

@github-actions github-actions bot added ingestion PR or Issue related to the ingestion of metadata product PR or Issue related to the DataHub UI/UX labels Oct 31, 2022
@github-actions
Copy link

github-actions bot commented Oct 31, 2022

Unit Test Results (metadata ingestion)

       8 files         8 suites   55m 58s ⏱️
   756 tests    753 ✔️ 3 💤 0
1 514 runs  1 507 ✔️ 7 💤 0

Results for commit dbe889f.

♻️ This comment has been updated with latest results.

@github-actions
Copy link

github-actions bot commented Oct 31, 2022

Unit Test Results (build & test)

602 tests  +3   598 ✔️ +3   11m 54s ⏱️ +12s
149 suites ±0       4 💤 ±0 
149 files   ±0       0 ±0 

Results for commit dbe889f. ± Comparison against base commit 9e6489f.

♻️ This comment has been updated with latest results.

@@ -116,9 +116,17 @@ export function filterColumns(
}
}

export function decodeSchemaField(fieldPath: string) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

@@ -25,7 +26,7 @@ export default function DisplayedColumns({ displayedColumns }: Props) {
return (
<ColumnNameWrapper>
{entity.type === EntityType.SchemaField
? downgradeV2FieldPath((entity as SchemaFieldEntity).fieldPath)
? decodeSchemaField(downgradeV2FieldPath((entity as SchemaFieldEntity).fieldPath) || '')
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we decode after downgrading or before?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it would really matter as decoding it just converts a few encoded characters back to normal ((, ), and ,) and none of those should be affected by the V2 field path. I believe decoding after downgrading would be safest so then we're decoding what we want to show in the UI.

@@ -1,6 +1,8 @@
import urllib.parse
from typing import List

# NOTE: Frontend relies on encoding these three characters. Specifically, we decode and encode schema fields for column level lineage.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question - In such cases, how will these column fields appear in

a) Schema Tab
b) Stats Tab

Do we also need that encoding there? If yes, maybe consider adding in this PR, or marking TODOs.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great question - I believe these tabs do not need to be decoded, since the tables in them are reading from fields in mysql that have the fieldPath stored in them, not from schemaField urns that we generate and encode.

So the only time we have encoded fieldPaths is when we're actually grabbing the fieldPath from the schema field urn - and we only have those stored in upstreamLineage, inputFields, and now that you're calling this out, I discovered we're also storing them in DatasetAssertionInfo and under foreignKeys in SchemaMetadata - going to add some coverage for that just in case

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

Copy link
Collaborator

@jjoyce0510 jjoyce0510 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - Thanks!

@jjoyce0510 jjoyce0510 merged commit f24f745 into datahub-project:master Nov 3, 2022
anshbansal pushed a commit to acryldata/datahub that referenced this pull request Nov 4, 2022
cccs-tom pushed a commit to CybercentreCanada/datahub that referenced this pull request Nov 18, 2022
cccs-tom pushed a commit to CybercentreCanada/datahub that referenced this pull request Nov 18, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ingestion PR or Issue related to the ingestion of metadata product PR or Issue related to the DataHub UI/UX
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants