Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(standalone-consumers): Exclude Solr from spring boot application config & make them run on M1 #5827

Merged

Conversation

pedro93
Copy link
Collaborator

@pedro93 pedro93 commented Sep 5, 2022

This PR modifies a few things related to MAE/MCE consumers, namely:

  • Excludes Solr config from spring boot application, this was the reason they were failing. Should now be working.
  • Changes consumer docker files to use same base image as GMS to work for multiple platforms, including ARM.
  • Updated GitHub workflow to build ARM versions of MAE/MCE consumers.
  • Adds option in datahub docker quickstart to deploy standalone consumers in preparation for smoke tests to run with standalone consumers.

Checklist

  • The PR conforms to DataHub's Contributing Guideline (particularly Commit Message Format)
  • Links to related issues (if applicable)
  • Tests for the changes have been added/updated (if applicable)
  • Docs related to the changes have been added/updated (if applicable). If a new feature has been added a Usage Guide has been added for the same.
  • For any breaking change/potential downtime/deprecation/big changes an entry has been made in Updating DataHub

@pedro93 pedro93 requested a review from RyanHolstien September 5, 2022 15:44
@github-actions
Copy link

github-actions bot commented Sep 5, 2022

Unit Test Results (build & test)

517 tests  ±0   517 ✔️ ±0   8m 22s ⏱️ - 1m 8s
121 suites ±0       0 💤 ±0 
121 files   ±0       0 ±0 

Results for commit 0309a2a. ± Comparison against base commit 13f20ad.

♻️ This comment has been updated with latest results.

Copy link
Contributor

@shirshanka shirshanka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@github-actions
Copy link

github-actions bot commented Sep 5, 2022

Unit Test Results (metadata ingestion)

       8 files         8 suites   55m 55s ⏱️
   672 tests    669 ✔️ 3 💤 0
1 344 runs  1 338 ✔️ 6 💤 0

Results for commit 0309a2a.

♻️ This comment has been updated with latest results.

@pedro93 pedro93 changed the title fix(standalone-consumers) Removes Solr from spring boot application config (WIP) fix(standalone-consumers) Removes Solr from spring boot application config Sep 6, 2022
@pedro93 pedro93 changed the title (WIP) fix(standalone-consumers) Removes Solr from spring boot application config (WIP) fix(standalone-consumers): Exclude Solr from spring boot application config & make them run o M1 Sep 6, 2022
@pedro93 pedro93 changed the title (WIP) fix(standalone-consumers): Exclude Solr from spring boot application config & make them run o M1 (WIP) fix(standalone-consumers): Exclude Solr from spring boot application config & make them run on M1 Sep 6, 2022
@pedro93 pedro93 changed the title (WIP) fix(standalone-consumers): Exclude Solr from spring boot application config & make them run on M1 fix(standalone-consumers): Exclude Solr from spring boot application config & make them run on M1 Sep 6, 2022
Copy link
Collaborator

@anshbansal anshbansal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM minor comments

@@ -21,6 +21,7 @@ pip install -r requirements.txt

echo "DATAHUB_VERSION = $DATAHUB_VERSION"
DATAHUB_TELEMETRY_ENABLED=false datahub docker quickstart --quickstart-compose-file ../docker/quickstart/docker-compose-without-neo4j.quickstart.yml --dump-logs-on-failure
#DATAHUB_TELEMETRY_ENABLED=false datahub docker quickstart --standalone_consumers --build-locally --dump-logs-on-failure
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cleanup?

Copy link
Collaborator

@RyanHolstien RyanHolstien Sep 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm personally okay with leaving this in as a way to easily run it locally. Might even want to create a separate file for this and have it run as a part of releases at some point or at least run some of the short tests against an instance with standalone consumers. We're starting to pretty commonly that people are wanting this option to scale.

Copy link
Collaborator Author

@pedro93 pedro93 Sep 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kept this to change after this PR is merged so that we start using standalone consumers during our smoke tests. As Ryan said, we should have this deploy config as part of our CI.

metadata-ingestion/src/datahub/cli/docker.py Outdated Show resolved Hide resolved
- DATAHUB_GMS_HOST=datahub-gms
- DATAHUB_GMS_PORT=8080
- MAE_CONSUMER_ENABLED=true
- PE_CONSUMER_ENABLED=true
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this variable not have a default anywhere? Tried looking but couldn't find so I am assuming it is null? Do you know?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No idea, this comes from mae-consumer docker.env file, possibly legacy? @jjoyce0510 do you know?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It defaults to false because of its usage. Instead of using Spring style properties it uses System.getenv with a "true".equals(prop). This should be refactored at some point since this won't even handle case sensitivity.

@pedro93
Copy link
Collaborator Author

pedro93 commented Sep 6, 2022

Do we have a flaky test?
test_create_list_get_ingestion_execution_request

@RyanHolstien
Copy link
Collaborator

Do we have a flaky test? test_create_list_get_ingestion_execution_request

Hmm I thought Harshal was fixing this one, but yeah it was flaking

@RyanHolstien
Copy link
Collaborator

Okay yeah those changes got merged in, @hsheth2 looks like that test is still flaking :(

@RyanHolstien RyanHolstien merged commit 20138a3 into datahub-project:master Sep 6, 2022
@RyanHolstien RyanHolstien deleted the pedro-fix-standalone-consumers branch September 6, 2022 18:55
@hsheth2
Copy link
Collaborator

hsheth2 commented Sep 6, 2022

@RyanHolstien yep the same test is flaking, but seems like it's flaking in a different spot now. Added an additional log here #5842 to debug more

shirshanka pushed a commit to shirshanka/datahub that referenced this pull request Sep 8, 2022
…config & make them run on M1 (datahub-project#5827)

* fix(standalone-consumers) Removes Solr from spring boot application config

* Adds standalone consumer option to datahub quickstart cli

* Rename files

* Make dockerize platform agnostic & change docker compose utility to work with M1

* Fix MAE/MCE dockerfiles for arm & make smoke tests use standalone consumers

* Fix identation

* Use master code

* Adds ARM image publishing for consumers

* Fix linter

* Fix lint
shirshanka pushed a commit to shirshanka/datahub that referenced this pull request Sep 8, 2022
…config & make them run on M1 (datahub-project#5827)

* fix(standalone-consumers) Removes Solr from spring boot application config

* Adds standalone consumer option to datahub quickstart cli

* Rename files

* Make dockerize platform agnostic & change docker compose utility to work with M1

* Fix MAE/MCE dockerfiles for arm & make smoke tests use standalone consumers

* Fix identation

* Use master code

* Adds ARM image publishing for consumers

* Fix linter

* Fix lint
pedro93 added a commit that referenced this pull request Sep 29, 2022
Pull request #5827 introduced a regression by removing coreutils
from the mae-consumer Dockerfile (coreutils was added in #3723).
This broke the base64 call in the startup script s.th. the Elasticsearch
auth header will not be correctly set when username and password are
provided.

To make sure that the startup script fails on these errors in the
future, set "-euo pipefail" which lets the bash script fail on errors
and unset variables.
Also refactor the startup script to make it more stable and readable.

Co-authored-by: Pedro Silva <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants