Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(graph) Add createdOn, createdActor, updatedOn, updatedActor to graph edges #6615

Conversation

chriscollins3456
Copy link
Collaborator

This PR adds the mechanism to define what the createdOn and updatedOn timestamps are on a graph index edge as well as the createdActor and updatedActor. In order to specify, all you need to do is set the path to the field on the aspect inside of the Relationship annotation. As an example, this is what you can do to set these fields on Upstream in UpstreamLineage

  /**
   * The upstream dataset the lineage points to
   */
  @Relationship = {
    "name": "DownstreamOf",
    "entityTypes": [ "dataset" ],
    "isLineage": true,
    "createdOn": "upstreams/*/created/time"
    "createdActor": "upstreams/*/created/actor"
    "updatedOn": "upstreams/*/auditStamp/time"
    "updatedActor": "upstreams/*/auditStamp/actor"
  }
  @Searchable = {
    "fieldName": "upstreams",
    "fieldType": "URN",
    "queryByDefault": false
  }
  dataset: DatasetUrn

This also sets a default createdOn and createdActor on all new graph edges being created using the metadataChangeLog.systemMetadata.lastObserved timestamp and the metadataChangeLog.created.actor for those fields.

Checklist

  • The PR conforms to DataHub's Contributing Guideline (particularly Commit Message Format)
  • Links to related issues (if applicable)
  • Tests for the changes have been added/updated (if applicable)
  • Docs related to the changes have been added/updated (if applicable). If a new feature has been added a Usage Guide has been added for the same.
  • For any breaking change/potential downtime/deprecation/big changes an entry has been made in Updating DataHub

@github-actions github-actions bot added the product PR or Issue related to the DataHub UI/UX label Dec 2, 2022
@github-actions
Copy link

github-actions bot commented Dec 2, 2022

Unit Test Results (build & test)

622 tests  +1   618 ✔️ +1   16m 13s ⏱️ -9s
158 suites +1       4 💤 ±0 
158 files   +1       0 ±0 

Results for commit 1e8c13a. ± Comparison against base commit 4f7b5ac.

♻️ This comment has been updated with latest results.

Copy link
Contributor

@aditya-radhakrishnan aditya-radhakrishnan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Just some small comments

@Nullable
private static List<Urn> getActorList(@Nullable final String path, @Nonnull final RecordTemplate aspect) {
List<Urn> actorList = null;
if (path != null) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually think this would read cleaner without excessive nesting. So something like

if (path == null) {return null};

if (!value.isPresent() {return null;)

Because then it becomes a bit more obvious in the cases this returns null. Same for the below functions

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for sure makes sense!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sweet!

@@ -231,7 +231,8 @@ private void updateFineGrainedEdgesAndRelationships(
// for every downstream, create an edge with each of the upstreams
for (Urn downstream : fineGrainedLineage.getDownstreams()) {
for (Urn upstream : fineGrainedLineage.getUpstreams()) {
edgesToAdd.add(new Edge(downstream, upstream, DOWNSTREAM_OF));
// TODO: add edges uniformly across aspects
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this todo mean exactly?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah it's just that we're adding edges differently for InputFields and fineGrainedLineages right now (like this method and the one below it) and once we add some changes to specifying what the source is, we should be able to remove these methods and just create edges properly along with all the other edges

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay cool

FieldExtractor.extractFields(upstreamLineage, aspectSpec.getRelationshipFieldSpecs());

for (Map.Entry<RelationshipFieldSpec, List<Object>> entry : extractedFields.entrySet()) {
if (entry.getKey().getPath().toString().equals("/upstreams/*/dataset")) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where is this path coming from? is this a constant somewhere?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah it's just the path I deduced from this specific UpstreamLineage aspect - not a constant anywhere (however I can make it a constant in this test file for readability)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sg!

}

@Test
public void testExtractGraphEdgesDefault() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think there are any other test cases that should be added besides for the default?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I certainly will add more once I start adding changes to relationship annotations on aspects (and add tests for those aspects). I could write one where there's an invalid urn to make sure we don't add that edge.. otherwise other than testing different aspects and checking the types of edges we get are correct I'm not sure.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay no worries!

}

@Nullable
private static List<Long> getTimestampList(@Nullable final String path, @Nonnull final RecordTemplate aspect) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really nice functions. I like how tight you were able to break them up!


@BeforeMethod
public void setupTest() {
_createdActorUrn = UrnUtils.getUrn(CREATED_ACTOR_URN);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sweet test! No need to refactor, but I think the firstt 5 of these vars could also be static final variables up top

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah yeah I actually tried doing that first but I think creating an urn from a string up there was causing issues and the build was failing (with a pretty unhelpful error message) - i'm sure there's a different way I could be going about it though 🤷

Copy link
Collaborator

@jjoyce0510 jjoyce0510 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No major comments! Nice.

@jjoyce0510
Copy link
Collaborator

Aside from build failing :)

@chriscollins3456 chriscollins3456 merged commit b6887d2 into datahub-project:master Dec 5, 2022
cccs-Dustin pushed a commit to CybercentreCanada/datahub that referenced this pull request Feb 1, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
product PR or Issue related to the DataHub UI/UX
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants