Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(Ingestion): Elastic http/https host support #4191

Merged
merged 3 commits into from
Feb 19, 2022

Conversation

abiwill
Copy link
Contributor

@abiwill abiwill commented Feb 18, 2022

allow users to specify url scheme. enable ingestion from both TLS enabled clusters.
currently specifying the url scheme resulted in assertion error.

Checklist

  • The PR conforms to DataHub's Contributing Guideline (particularly Commit Message Format)
  • Links to related issues (if applicable)
  • Tests for the changes have been added/updated (if applicable)
  • Docs related to the changes have been added/updated (if applicable)

abiwill and others added 2 commits February 19, 2022 01:24
allow users to specify url scheme. enable ingestion from both TLS enabled clusters.
currently specifying the url scheme resulted in assertion error.
tested with 
```
host = 192.0.0.1
host = localhost
host = http://localhost:9002
host = https://localhost:9002
@shirshanka
Copy link
Contributor

Thanks for this @abiwill
I fixed up the regex to handle the old style as well..
could you add a unit test here?
https://github.com/linkedin/datahub/blob/master/metadata-ingestion/tests/unit/test_elasticsearch_source.py

That will ensure we handle different styles of host_port specification correctly.

@github-actions
Copy link

github-actions bot commented Feb 18, 2022

Unit Test Results (metadata ingestion)

       5 files         5 suites   41m 7s ⏱️
   333 tests    333 ✔️   0 💤 0
1 536 runs  1 498 ✔️ 38 💤 0

Results for commit 1cd4db9.

♻️ This comment has been updated with latest results.

@github-actions
Copy link

github-actions bot commented Feb 18, 2022

Unit Test Results (build & test)

  70 files  +  1    70 suites  +1   19m 31s ⏱️ + 11m 42s
611 tests +71  552 ✔️ +64  59 💤 +7  0 ±0 

Results for commit 1cd4db9. ± Comparison against base commit b08784e.

♻️ This comment has been updated with latest results.

@abiwill
Copy link
Contributor Author

abiwill commented Feb 18, 2022

Unittest PR: #4194
Thanks for the fix @shirshanka
There is one more problem with the regex: \w allows use of _ which is not a valid hostname.
Instead of \w we can use [0-9a-zA-Z]

@mmmeeedddsss
Copy link
Contributor

It is impressive that only changing the validation regex solves the issue. It is fascinating to see a project growing in such a way.

I want to thank you for spending your time on investigating, implementing and testing this change. 🚀

@abiwill
Copy link
Contributor Author

abiwill commented Feb 18, 2022

I have just started my journey into programming. Trying to learn and solve :)

@shirshanka
Copy link
Contributor

shirshanka commented Feb 19, 2022

Since this check is really a first line of defense, I think it is okay to be a bit more loose, since the underlying elastic client will anyway complain if it cannot connect to the configured endpoint. We just want to make sure that clearly valid values are not blocked by it.

Copy link
Contributor

@shirshanka shirshanka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Made changes to unit test in this PR itself.

@shirshanka shirshanka merged commit 8bbc66b into datahub-project:master Feb 19, 2022
maggiehays pushed a commit to maggiehays/datahub that referenced this pull request Aug 1, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants