feat(quickstart): Remove kafka-setup as a hard deployment requirement #7073

Merged on Jan 24, 2023 (30 commits)

Conversation

pedro93 (Collaborator) commented Jan 18, 2023

This PR makes a few improvements to our Docker CLI behaviour, namely:

  • Makes the kafka-setup job optional behind the --kafka-setup flag, which works similarly to the standalone consumer containers.
  • Makes the docker-compose Quickstart generation logic able to handle duplicate env vars with a last-wins heuristic. This was needed because we have conflicting configs coming from Docker env files and explicit docker-compose environment variables.
  • Generates Neo4j M1-compatible docker-compose files, which removes a long-standing limitation in our CLI.
  • The M1 Quickstart was not being generated by our internal scripts, which made it very easy for changes in our partial compose files to not get mirrored in the final Quickstart file. This has now been fixed.
  • Makes standalone consumer configs for Neo4j deployments consistent.
  • Makes the Quickstart comparison logic more flexible by using yq.
  • Bumps Confluent dependencies to 7.2.2.
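
For readers unfamiliar with the last-wins heuristic mentioned above, here is a minimal sketch of the idea. It is not the generation script from this PR: `parse_env_lines` and `merge_last_wins` are hypothetical helpers, and the variable names and values are illustrative only.

```python
from typing import Dict, Iterable, List, Tuple


def parse_env_lines(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Parse KEY=VALUE lines, skipping blanks, comments, and malformed lines."""
    pairs = []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        pairs.append((key.strip(), value))
    return pairs


def merge_last_wins(*sources: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Merge sources in order; a later definition of a key overrides an earlier one."""
    merged: Dict[str, str] = {}
    for source in sources:
        for key, value in source:
            merged[key] = value
    return merged


# Illustrative inputs (assumed, not taken from the PR): an env file and the
# explicit environment block of a compose service that redefines one key.
env_file = ["KAFKA_BOOTSTRAP_SERVER=broker:29092", "# a comment", "GRAPH_SERVICE_IMPL=elasticsearch"]
compose_env = ["GRAPH_SERVICE_IMPL=neo4j"]

merged = merge_last_wins(parse_env_lines(env_file), parse_env_lines(compose_env))
print(merged)
```

Here the explicit compose value overrides the env file value because it is merged last; whether the real script orders its sources this way is an assumption.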

Checklist

  • The PR conforms to DataHub's Contributing Guideline (particularly Commit Message Format)
  • Links to related issues (if applicable)
  • Tests for the changes have been added/updated (if applicable)
  • Docs related to the changes have been added/updated (if applicable). If a new feature has been added a Usage Guide has been added for the same.
  • For any breaking change/potential downtime/deprecation/big changes an entry has been made in Updating DataHub

pedro93 (Collaborator, Author) commented Jan 18, 2023

Note: this PR will fail if you run it locally to test the new flag, because it assumes that the files already exist on GitHub.

You will need the following git diff to test this locally:

diff --git a/metadata-ingestion/src/datahub/utilities/sample_data.py b/metadata-ingestion/src/datahub/utilities/sample_data.py
index b908f7f89f..990e86fc97 100644
--- a/metadata-ingestion/src/datahub/utilities/sample_data.py
+++ b/metadata-ingestion/src/datahub/utilities/sample_data.py
@@ -6,7 +6,8 @@ import requests

 DOCKER_COMPOSE_BASE = os.getenv(
     "DOCKER_COMPOSE_BASE",
-    "https://raw.githubusercontent.com/datahub-project/datahub/master",
+    "https://raw.githubusercontent.com/acryldata/datahub/remove-kafka-setup-from-cli",
+    #"https://raw.githubusercontent.com/datahub-project/datahub/master",
 )
 BOOTSTRAP_MCES_FILE = "metadata-ingestion/examples/mce_files/bootstrap_mce.json"
 BOOTSTRAP_MCES_URL = f"{DOCKER_COMPOSE_BASE}/{BOOTSTRAP_MCES_FILE}"
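
Since the diff shows the base URL being read with `os.getenv("DOCKER_COMPOSE_BASE", ...)`, setting that environment variable may be an alternative to patching the file. The sketch below only illustrates the lookup: `bootstrap_mces_url` and `DEFAULT_BASE` are hypothetical names, not part of `sample_data.py`.

```python
import os
from typing import Mapping, Optional

# Default and file path as they appear in the diff above.
DEFAULT_BASE = "https://raw.githubusercontent.com/datahub-project/datahub/master"
BOOTSTRAP_MCES_FILE = "metadata-ingestion/examples/mce_files/bootstrap_mce.json"


def bootstrap_mces_url(env: Optional[Mapping[str, str]] = None) -> str:
    """Build the bootstrap file URL, preferring DOCKER_COMPOSE_BASE if it is set."""
    env = os.environ if env is None else env
    base = env.get("DOCKER_COMPOSE_BASE", DEFAULT_BASE)
    return f"{base}/{BOOTSTRAP_MCES_FILE}"


# Pointing the lookup at the PR branch instead of master.
branch_base = "https://raw.githubusercontent.com/acryldata/datahub/remove-kafka-setup-from-cli"
print(bootstrap_mces_url({"DOCKER_COMPOSE_BASE": branch_base}))
print(bootstrap_mces_url({}))
```

In the diff the lookup sits at module level, so the variable would presumably need to be set before the CLI process starts.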

github-actions bot added the devops and ingestion labels Jan 18, 2023
david-leifker (Collaborator) commented

Like the approach; you've probably found an issue with the MCE consumer quickstart generation.

pedro93 requested a review from david-leifker January 19, 2023 16:24
pedro93 force-pushed the remove-kafka-setup-from-cli branch from 938b449 to 0255c94 January 20, 2023 11:49
pedro93 force-pushed the remove-kafka-setup-from-cli branch from 0255c94 to 3b15857 January 20, 2023 11:50
david-leifker (Collaborator) commented

I am wondering if we even need a separate M1 flavor anymore. It seems like folks have moved to generating ARM images consistently; however, perhaps I am overlooking something here.

pedro93 (Collaborator, Author) commented Jan 23, 2023

> I am wondering if we even need a separate M1 flavor anymore. It seems like folks have moved to generating ARM images consistently; however, perhaps I am overlooking something here.

There are still a few images that we need M1 alternatives for, such as Postgres and Neo4j.

pedro93 merged commit bef59b0 into datahub-project:master Jan 24, 2023
pedro93 deleted the remove-kafka-setup-from-cli branch January 24, 2023 16:13
Labels: devops, ingestion