Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(ui) Handle encoded schemaField urns on the frontend #6321

Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,7 @@ import { Link } from 'react-router-dom';
import styled from 'styled-components';
import { GetDatasetQuery } from '../../../../../../../graphql/dataset.generated';
import { EntityType } from '../../../../../../../types.generated';
import { decodeSchemaField } from '../../../../../../lineage/utils/columnLineageUtils';
import CompactContext from '../../../../../../shared/CompactContext';
import { useEntityRegistry } from '../../../../../../useEntityRegistry';
import { ANTD_GRAY } from '../../../../constants';
Expand Down Expand Up @@ -130,7 +131,7 @@ export const SchemaRow = ({
<TableTitle>{baseEntity.dataset?.name}</TableTitle>
{selectedFk?.constraint?.sourceFields?.map((field) => (
<div key={field?.fieldPath}>
<FieldBadge count={field?.fieldPath} />
<FieldBadge count={decodeSchemaField(field?.fieldPath || '')} />
</div>
))}
</div>
Expand All @@ -139,7 +140,7 @@ export const SchemaRow = ({
<TableTitle>{selectedFk?.constraint?.foreignDataset?.name}</TableTitle>
{selectedFk?.constraint?.foreignFields?.map((field) => (
<div key={field?.fieldPath}>
<FieldBadge count={field?.fieldPath} />
<FieldBadge count={decodeSchemaField(field?.fieldPath || '')} />
</div>
))}
</div>
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,7 @@ import {
DatasetAssertionScope,
SchemaFieldRef,
} from '../../../../../../types.generated';
import { decodeSchemaField } from '../../../../../lineage/utils/columnLineageUtils';
import { getFormattedParameterValue } from './assertionUtils';
import { DatasetAssertionLogicModal } from './DatasetAssertionLogicModal';

Expand Down Expand Up @@ -37,7 +38,7 @@ const getSchemaAggregationText = (
case AssertionStdAggregation.Columns:
return <Typography.Text>Dataset columns are</Typography.Text>;
case AssertionStdAggregation.Native: {
const fieldNames = fields?.map((field) => field.path) || [];
const fieldNames = fields?.map((field) => decodeSchemaField(field.path)) || [];
return (
<Typography.Text>
Dataset columns <Typography.Text strong>{JSON.stringify(fieldNames)}</Typography.Text> are
Expand Down Expand Up @@ -76,7 +77,7 @@ const getColumnAggregationText = (
aggregation: AssertionStdAggregation | undefined | null,
field: SchemaFieldRef | undefined,
) => {
let columnText = field?.path;
let columnText = decodeSchemaField(field?.path || '');
if (field === undefined) {
columnText = 'undefined';
console.error(`Invalid field provided for Dataset Assertion with scope Column ${JSON.stringify(field)}`);
Expand Down
Original file line number Diff line number Diff line change
@@ -1,4 +1,6 @@
import { encodeSchemaField } from '../../../../lineage/utils/columnLineageUtils';

export function generateSchemaFieldUrn(fieldPath: string | undefined, resourceUrn: string) {
if (!fieldPath) return null;
return `urn:li:schemaField:(${resourceUrn},${fieldPath})`;
return `urn:li:schemaField:(${resourceUrn},${encodeSchemaField(fieldPath)})`;
}
Original file line number Diff line number Diff line change
@@ -1,4 +1,9 @@
import { getFieldPathFromSchemaFieldUrn, getSourceUrnFromSchemaFieldUrn } from '../columnLineageUtils';
import {
decodeSchemaField,
encodeSchemaField,
getFieldPathFromSchemaFieldUrn,
getSourceUrnFromSchemaFieldUrn,
} from '../columnLineageUtils';

describe('getSourceUrnFromSchemaFieldUrn', () => {
it('should get the source urn for a chart schemaField', () => {
Expand Down Expand Up @@ -43,3 +48,37 @@ describe('getFieldPathFromSchemaFieldUrn', () => {
expect(sourceUrn).toBe('user.name.test');
});
});

describe('decodeSchemaField', () => {
it('should decode a schemaField path when it has encoded reserved characters', () => {
const decodedSchemaField = decodeSchemaField('Test%2C test%2C test %28and test%29');
expect(decodedSchemaField).toBe('Test, test, test (and test)');
});

it('should return a regular schemaField path when it was not encoded', () => {
const schemaField = 'user.name';
const decodedSchemaField = decodeSchemaField(schemaField);
expect(decodedSchemaField).toBe(schemaField);
});
});

describe('encodeSchemaField', () => {
it('should encode a schemaField path when it has encoded reserved characters', () => {
const encodedSchemaField = encodeSchemaField('Test, test, test (and test)');
expect(encodedSchemaField).toBe('Test%2C test%2C test %28and test%29');
});

it('should return a regular schemaField path when it does not have reserved characters', () => {
const schemaField = 'user.name';
const encodedSchemaField = encodeSchemaField(schemaField);
expect(encodedSchemaField).toBe(schemaField);
});

it('should encode a decoded schemaField that we generate', () => {
const schemaField = 'Adults, men (% of pop)';
const encodedSchemaField = encodeSchemaField(schemaField);
const decodedSchemaField = decodeSchemaField(encodedSchemaField);
expect(encodedSchemaField).toBe('Adults%2C men %28% of pop%29');
expect(decodedSchemaField).toBe(schemaField);
});
});
10 changes: 9 additions & 1 deletion datahub-web-react/src/app/lineage/utils/columnLineageUtils.ts
Original file line number Diff line number Diff line change
Expand Up @@ -116,9 +116,17 @@ export function filterColumns(
}
}

export function decodeSchemaField(fieldPath: string) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

return fieldPath.replaceAll('%28', '(').replaceAll('%29', ')').replaceAll('%2C', ',');
}

export function encodeSchemaField(fieldPath: string) {
return fieldPath.replaceAll('(', '%28').replaceAll(')', '%29').replaceAll(',', '%2C');
}

export function getSourceUrnFromSchemaFieldUrn(schemaFieldUrn: string) {
return schemaFieldUrn.replace('urn:li:schemaField:(', '').split(')')[0].concat(')');
}
export function getFieldPathFromSchemaFieldUrn(schemaFieldUrn: string) {
return schemaFieldUrn.replace('urn:li:schemaField:(', '').split(')')[1].replace(',', '');
return decodeSchemaField(schemaFieldUrn.replace('urn:li:schemaField:(', '').split(')')[1].replace(',', ''));
}
Original file line number Diff line number Diff line change
@@ -1,11 +1,15 @@
import { SchemaFieldRef } from '../../../types.generated';
import EntityRegistry from '../../entity/EntityRegistry';
import { EntityAndType, FetchedEntities, FetchedEntity } from '../types';
import { getFieldPathFromSchemaFieldUrn, getSourceUrnFromSchemaFieldUrn } from './columnLineageUtils';
import {
decodeSchemaField,
getFieldPathFromSchemaFieldUrn,
getSourceUrnFromSchemaFieldUrn,
} from './columnLineageUtils';

const breakFieldUrn = (ref: SchemaFieldRef) => {
const before = ref.urn;
const after = ref.path;
const after = decodeSchemaField(ref.path);

return [before, after];
};
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,7 @@ import styled from 'styled-components/macro';
import { Entity, LineageDirection } from '../../../types.generated';
import { downgradeV2FieldPath } from '../../entity/dataset/profile/schema/utils/utils';
import { LineageTabContext } from '../../entity/shared/tabs/Lineage/LineageTabContext';
import { decodeSchemaField } from '../../lineage/utils/columnLineageUtils';
import DisplayedColumns from './DisplayedColumns';

const ColumnNameWrapper = styled.span<{ isBlack?: boolean }>`
Expand All @@ -19,7 +20,7 @@ interface Props {
export default function ColumnsRelationshipText({ displayedColumns }: Props) {
const { selectedColumn, lineageDirection } = useContext(LineageTabContext);

const displayedFieldPath = downgradeV2FieldPath(selectedColumn);
const displayedFieldPath = decodeSchemaField(downgradeV2FieldPath(selectedColumn) || '');

return (
<>
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,7 @@ import React from 'react';
import styled from 'styled-components/macro';
import { Entity, EntityType, SchemaFieldEntity } from '../../../types.generated';
import { downgradeV2FieldPath } from '../../entity/dataset/profile/schema/utils/utils';
import { decodeSchemaField } from '../../lineage/utils/columnLineageUtils';
import { useEntityRegistry } from '../../useEntityRegistry';

const ColumnNameWrapper = styled.span<{ isBlack?: boolean }>`
Expand All @@ -25,7 +26,7 @@ export default function DisplayedColumns({ displayedColumns }: Props) {
return (
<ColumnNameWrapper>
{entity.type === EntityType.SchemaField
? downgradeV2FieldPath((entity as SchemaFieldEntity).fieldPath)
? decodeSchemaField(downgradeV2FieldPath((entity as SchemaFieldEntity).fieldPath) || '')
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we decode after downgrading or before?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it would really matter as decoding it just converts a few encoded characters back to normal ((, ), and ,) and none of those should be affected by the V2 field path. I believe decoding after downgrading would be safest so then we're decoding what we want to show in the UI.

: entityRegistry.getDisplayName(entity.type, entity)}
{index !== displayedColumns.length - 1 && ', '}
</ColumnNameWrapper>
Expand Down
2 changes: 2 additions & 0 deletions metadata-ingestion/src/datahub/utilities/urn_encoder.py
Original file line number Diff line number Diff line change
@@ -1,6 +1,8 @@
import urllib.parse
from typing import List

# NOTE: Frontend relies on encoding these three characters. Specifically, we decode and encode schema fields for column level lineage.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question - In such cases, how will these column fields appear in

a) Schema Tab
b) Stats Tab

Do we also need that encoding there? If yes, maybe consider adding in this PR, or marking TODOs.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great question - I believe these tabs do not need to be decoded, since the tables in them are reading from fields in mysql that have the fieldPath stored in them, not from schemaField urns that we generate and encode.

So the only time we have encoded fieldPaths is when we're actually grabbing the fieldPath from the schema field urn - and we only have those stored in upstreamLineage, inputFields, and now that you're calling this out, I discovered we're also storing them in DatasetAssertionInfo and under foreignKeys in SchemaMetadata - going to add some coverage for that just in case

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

# If this changes, make appropriate changes to datahub-web-react/src/app/lineage/utils/columnLineageUtils.ts
RESERVED_CHARS = [",", "(", ")"]


Expand Down