Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(ingest): Add custom properties to the ldap ingestion #7125

Merged
merged 17 commits into from
Feb 2, 2023

Conversation

bda618
Copy link
Contributor

@bda618 bda618 commented Jan 24, 2023

Checklist

  • The PR conforms to DataHub's Contributing Guideline (particularly Commit Message Format)
  • Links to related issues (if applicable)
  • Tests for the changes have been added/updated (if applicable)
  • Docs related to the changes have been added/updated (if applicable). If a new feature has been added a Usage Guide has been added for the same.
  • For any breaking change/potential downtime/deprecation/big changes an entry has been made in Updating DataHub

@github-actions github-actions bot added devops PR or Issue related to DataHub backend & deployment ingestion PR or Issue related to the ingestion of metadata product PR or Issue related to the DataHub UI/UX labels Jan 24, 2023
Copy link
Collaborator

@hsheth2 hsheth2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one comment on the python side of things

for prop in self.config.custom_props_list:
if self.config.user_attrs_map.get(prop) in attrs:
custom_props_map[prop] = (
attrs[self.config.user_attrs_map[prop]][0]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would require each item in custom_props_list to also be included in user_attrs_map. Additionally, user_attrs_map is generally a mapping of datahub name -> custom attr name, not the reverse. I think we should just use the items in custom_props_list directly

Copy link
Contributor Author

@bda618 bda618 Jan 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would require each item in custom_props_list to also be included in user_attrs_map. Additionally, >user_attrs_map is generally a mapping of datahub name -> custom attr name, not the reverse. I think we should just >use the items in custom_props_list directly.

I use "user_attrs_map" to map datahub property to an actual ldap property. For example, I have this mapping in my "user_attrs_map": {"countryCode":"c"}. I do this to replace a default datahub mapping for "countryCode", which is "countryCode":"countryCode", by an actual ldap attribute name for my company, which is "c".

custom_props_list is just a list of my custom datahub properties for corpUser, not an actual ldap attribute's names. For example, in my case I want to use "accountType" as a custom name property. This is why I use "user_attrs_map", I need to map "accountType" to an actual ldap attribute.

I think instead you suggested to use a ldap attribute name directly at custom_props_list and not to use an alias. This could be done, but I believe we are going to have some logic in UI build on values of some custom properties, not just display them. In this case, if ldap attribute name is changed, we will need to rename all place in UI code with a hardcoded previous ldap attribute name and we want to avoid this by using aliases.

However, I am going to check with my team, I think we can simplify this and use ldap attribute name directly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hsheth2, I am going to make an update you have suggested.

@jjoyce0510
Copy link
Collaborator

Thanks for updating @bda618!

We are taking another pass tomorrow.

Cheers
John

Copy link
Collaborator

@hsheth2 hsheth2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not essential, but it'd be great if you could update the ldap test too (test_ldap_ingest)

@@ -117,6 +117,10 @@ class LDAPSourceConfig(StatefulIngestionConfigBase):
default=None, description="Retrieved attributes list"
)

custom_props_list: Optional[List[str]] = Field(
default=None, description="Customer attributes list"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
default=None, description="Customer attributes list"
default=None, description="A list of custom attributes to extract from the LDAP provider."

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hsheth2 Yes, your description is a way more accurate, will update today.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hsheth2, will update ldap test too.

@jjoyce0510 jjoyce0510 merged commit 1a1f290 into datahub-project:master Feb 2, 2023
ericyomi pushed a commit to ericyomi/datahub that referenced this pull request Feb 8, 2023
oleg-ruban pushed a commit to RChygir/datahub that referenced this pull request Feb 28, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
devops PR or Issue related to DataHub backend & deployment ingestion PR or Issue related to the ingestion of metadata product PR or Issue related to the DataHub UI/UX
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants