Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update data_platforms.json #7244

Merged
merged 1 commit into from
Feb 3, 2023
Merged

Conversation

RainerGa
Copy link
Contributor

@RainerGa RainerGa commented Feb 3, 2023

SAP Hana (SAP is the Company behind SAP Hana).

Reason 1)
Everywhere (directory, python scripts and so on) is only used "hana" Cause of consistency it should be "Hana" in the Navigation.

Reason 2)
"SAP Hana" is actually in the Alphabetic order after "G" (S after G, not really :-) cause internally it is sorted for the word "Hana"

BUT
The best would be to use the Name "SAP Hana". Cause this is the used Name in the IT industry. But I dont know for what the "name:" Attribute is used for. Only for sorting or are there other dependencies?

Checklist

  • The PR conforms to DataHub's Contributing Guideline (particularly Commit Message Format)
  • Links to related issues (if applicable)
  • Tests for the changes have been added/updated (if applicable)
  • Docs related to the changes have been added/updated (if applicable). If a new feature has been added a Usage Guide has been added for the same.
  • For any breaking change/potential downtime/deprecation/big changes an entry has been made in Updating DataHub

SAP Hana (SAP is the Company behind SAP Hana).

Reason 1)
Everywhere (directory, python scripts and so on) is only used "hana"
Cause of consistency it should be "Hana" in the Navigation.

Reason 2)
"SAP Hana" is actually in the Alphabetic order after "G" (S after G, not really :-) cause internally it is sorted for the word "Hana"

BUT
The best would be to use the Name "SAP Hana". Cause this is the used Name in the IT industry. But I dont know for what the "name:" Attribute is used for. Only for sorting or are there other dependencies?
@github-actions github-actions bot added the devops PR or Issue related to DataHub backend & deployment label Feb 3, 2023
Copy link
Collaborator

@jjoyce0510 jjoyce0510 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks for the detailed explanation.

@jjoyce0510 jjoyce0510 merged commit 981810a into datahub-project:master Feb 3, 2023
@Masterchen09
Copy link
Contributor

@jjoyce0510 @RainerGa

Even if this PR is already merged, I strongly disagree with the change - I was responsible for the change of the display name to SAP HANA (instead of Hana, when the source was introduced) in PR #5043.

SAP HANA is not called HANA (not all products, but nearly all products of SAP are starting with SAP) or SAP Hana (I have never seen SAP HANA written like this before in any official publication of SAP).

See also Wikipedia or the website of SAP as a reference:

I agree that it is not really nice that the name is used for sorting (e.g., in the documentation...I am not aware of other places where the data platforms are sorted by the name?), but then the sorting should be fixed e.g. by using the display name when a display name is available instead of changing the display name of a data platform.

In my opinion the inconsistency regarding the name of the data platform is also fine: The dialect for SQLAlchemy which is used for the SAP HANA source and is provided by SAP is called sqlalchemy-hana (https://github.com/SAP/sqlalchemy-hana) - internally/technically it is called hana, but the display name could be SAP HANA anyway (let's say the name is more or less an internal/technical identifier of the data platform and not the name itself).

If we change SAP HANA to HANA now, what should we do with other SAP products like the SAP Analytics Cloud (this is something I am working on, resp. is already finished and is tested internally, but I had not the time to write the documentation and publish it until now)...are we calling it Analytics Cloud then?

P.S.: If you really want to change the display name to HANA, then this line must also be changed:

@platform_name("SAP HANA", id="hana")

@RainerGa
Copy link
Contributor Author

RainerGa commented Feb 7, 2023

@Masterchen09 @jjoyce0510

I agree completly that Hana is called "SAP Hana" the users are searching for this (as you wrote and as I wrote to!).

But the name change to "SAP Hana" should be complete implemented and not only change the display name (reasons see my first comment).

@Masterchen09 : Change the SAP Product names to
sap_[name].py, SAP [Name] (for display name AND internal sorting "name")
Example: sam_hana.py, @platform_name("SAP HANA", id="sap_hana")
Are there other locations in the implementation that should be changed? I don`t know.
What do you think about this proposal.

@Masterchen09
Copy link
Contributor

@RainerGa

Changing the platform id/name would make all already ingested entities "incompatible" to the renamed data platform, because the data platform is part of the urn of the entities. In case stateful ingestion is enabled the existing entities would be soft-deleted and the then ingested entities would be treated as new entities, which means all user-provided information (documentation, owner, tags, etc...) would be lost.

There might be a possiblity - I am not really sure about it - to migrate the already ingested entities to the new urn using the new platform id/name (datahub-upgrade?), but I don't really think it's worth the effort.

If the order in the menu of the documentation is the only visible place for the end-user, I think it should be enough to fix this (yesterday I have already tried to understand the logic of the generation of the documentation, but it is not that easy: here and here)...I think it does not really matter if the module or the platform id/name is called hana or sap_hana if you can find the name of the module/plugin in the documentation (even SAP is calling their SQLAlchemy dialect sqlalchemy-hana and not sqlalchemy-sap_hana, therefore I think it is fine to stick with it).

ericyomi pushed a commit to ericyomi/datahub that referenced this pull request Feb 8, 2023
oleg-ruban pushed a commit to RChygir/datahub that referenced this pull request Feb 28, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
devops PR or Issue related to DataHub backend & deployment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants