feat(quickstart): Remove kafka-setup as a hard deployment requirement #7073

Merged
merged 30 commits into from
Jan 24, 2023
Changes from 26 commits
30 commits
86879ee feat(quickstart): Remove kafka-setup as a hard deployment requirement (pedro93, Jan 18, 2023)
7927b81 Update quickstart (pedro93, Jan 19, 2023)
4e66f42 Fix mce env vars (pedro93, Jan 19, 2023)
e8fd760 more fixes (pedro93, Jan 19, 2023)
b8d3c77 Fix list merging (pedro93, Jan 19, 2023)
959de38 Dedup env vars & make neo4j consumers consistent (pedro93, Jan 19, 2023)
733d586 change mae consumer env var file (pedro93, Jan 19, 2023)
2d71155 Merge branch 'master' into remove-kafka-setup-from-cli (pedro93, Jan 19, 2023)
91ae779 Fix compose check logic, adds neo4j supported m1 docker-compose (pedro93, Jan 19, 2023)
a6469f9 Add support for Neo4J M1 quickstart (pedro93, Jan 19, 2023)
bcc1c54 Remove temp files (pedro93, Jan 19, 2023)
b3eba22 Fix docker cli linting issues (pedro93, Jan 19, 2023)
b18672a Merge branch 'master' into remove-kafka-setup-from-cli (pedro93, Jan 19, 2023)
19d3b26 Correct url (pedro93, Jan 19, 2023)
a905b21 remove old code (pedro93, Jan 19, 2023)
c8d35db Merge branch 'master' into remove-kafka-setup-from-cli (pedro93, Jan 19, 2023)
a8118ff Fix cyclomatic complexity of docker method (pedro93, Jan 19, 2023)
52ed268 Merge branch 'master' into remove-kafka-setup-from-cli (pedro93, Jan 20, 2023)
4dc6332 Use YQ installation Github Action (pedro93, Jan 20, 2023)
4af8370 Specify yq version to install (pedro93, Jan 20, 2023)
7f85331 Fix action version (pedro93, Jan 20, 2023)
3b15857 bump to latest action version (pedro93, Jan 20, 2023)
7c50481 Merge branch 'master' into remove-kafka-setup-from-cli (pedro93, Jan 20, 2023)
566760d Fix bug in docker cli python file (pedro93, Jan 20, 2023)
ac1ad8e Merge branch 'master' into remove-kafka-setup-from-cli (pedro93, Jan 21, 2023)
7572727 Merge branch 'master' into remove-kafka-setup-from-cli (pedro93, Jan 23, 2023)
3003b52 Update docker images to use kafka 7.2.2 (pedro93, Jan 23, 2023)
4feb4a0 Merge branch 'master' into remove-kafka-setup-from-cli (pedro93, Jan 23, 2023)
c8a4602 Remove kymeric reference (pedro93, Jan 23, 2023)
23b6f91 Bump confluent to 7.2.2 (pedro93, Jan 24, 2023)
4 changes: 4 additions & 0 deletions .github/workflows/build-and-test.yml
@@ -73,6 +73,10 @@ jobs:
       - uses: actions/setup-python@v4
         with:
           python-version: "3.7"
+      - name: Download YQ
+        uses: chrisdickinson/[email protected]
+        with:
+          yq-version: v4.28.2
       - name: Quickstart Compose Validation
         run: ./docker/quickstart/generate_and_compare.sh

6 changes: 5 additions & 1 deletion docker/datahub-mce-consumer/env/docker.env
@@ -2,7 +2,7 @@ MCE_CONSUMER_ENABLED=true
 EBEAN_DATASOURCE_USERNAME=datahub
 EBEAN_DATASOURCE_PASSWORD=datahub
 EBEAN_DATASOURCE_HOST=mysql:3306
-EBEAN_DATASOURCE_URL=jdbc:mysql://mysql:3306/datahub?verifyServerCertificate=false&useSSL=true&useUnicode=yes&characterEncoding=UTF-8
+EBEAN_DATASOURCE_URL=jdbc:mysql://mysql:3306/datahub?verifyServerCertificate=false&useSSL=true&useUnicode=yes&characterEncoding=UTF-8&enabledTLSProtocols=TLSv1.2
 EBEAN_DATASOURCE_DRIVER=com.mysql.jdbc.Driver
 KAFKA_BOOTSTRAP_SERVER=broker:29092
 KAFKA_SCHEMAREGISTRY_URL=http://schema-registry:8081
@@ -15,6 +15,10 @@ JAVA_OPTS=-Xms1g -Xmx1g
 ENTITY_REGISTRY_CONFIG_PATH=/datahub/datahub-mce-consumer/resources/entity-registry.yml
 DATAHUB_SYSTEM_CLIENT_ID=__datahub_system
 DATAHUB_SYSTEM_CLIENT_SECRET=JohnSnowKnowsNothing
+ENTITY_SERVICE_ENABLE_RETENTION=true
+MAE_CONSUMER_ENABLED=false
+PE_CONSUMER_ENABLED=false
+UI_INGESTION_ENABLED=false

 # Uncomment to configure kafka topic names
 # Make sure these names are consistent across the whole deployment
12 changes: 0 additions & 12 deletions docker/docker-compose-with-cassandra.yml
@@ -29,18 +29,6 @@ services:
     volumes:
       - broker:/var/lib/kafka/data/

-  # This "container" is a workaround to pre-create topics
-  kafka-setup:
-    build:
-      context: kafka-setup
-    image: ${DATAHUB_KAFKA_SETUP_IMAGE:-linkedin/datahub-kafka-setup}:${DATAHUB_VERSION:-head}
-    env_file: kafka-setup/env/docker.env
-    hostname: kafka-setup
-    container_name: kafka-setup
-    depends_on:
-      - broker
-      - schema-registry
-
   schema-registry:
     image: confluentinc/cp-schema-registry:5.4.0
     env_file: schema-registry/env/docker.env
2 changes: 1 addition & 1 deletion docker/docker-compose-without-neo4j.override.yml
@@ -6,7 +6,7 @@ services:
     hostname: mysql
     image: mysql:5.7
     env_file: mysql/env/docker.env
-    command: --character-set-server=utf8mb4 --collation-server=utf8mb4_bin
+    command: --character-set-server=utf8mb4 --collation-server=utf8mb4_bin --default-authentication-plugin=mysql_native_password
     ports:
       - ${DATAHUB_MAPPED_MYSQL_PORT:-3306}:3306
     volumes:
12 changes: 0 additions & 12 deletions docker/docker-compose-without-neo4j.yml
@@ -26,18 +26,6 @@ services:
     ports:
       - ${DATAHUB_MAPPED_KAFKA_BROKER_PORT:-9092}:9092

-  # This "container" is a workaround to pre-create topics
-  kafka-setup:
-    build:
-      context: kafka-setup
-    image: ${DATAHUB_KAFKA_SETUP_IMAGE:-linkedin/datahub-kafka-setup}:${DATAHUB_VERSION:-head}
-    env_file: kafka-setup/env/docker.env
-    hostname: kafka-setup
-    container_name: kafka-setup
-    depends_on:
-      - broker
-      - schema-registry
-
   schema-registry:
     image: confluentinc/cp-schema-registry:5.4.0
     env_file: schema-registry/env/docker.env
3 changes: 3 additions & 0 deletions docker/docker-compose.consumers-without-neo4j.yml
@@ -25,5 +25,8 @@ services:
     env_file: datahub-mce-consumer/env/docker.env
     hostname: datahub-mce-consumer
     container_name: datahub-mce-consumer
+    environment:
+      - DATAHUB_SERVER_TYPE=${DATAHUB_SERVER_TYPE:-quickstart}
+      - DATAHUB_TELEMETRY_ENABLED=${DATAHUB_TELEMETRY_ENABLED:-true}
     ports:
       - "9090:9090"
10 changes: 10 additions & 0 deletions docker/docker-compose.consumers.yml
@@ -27,5 +27,15 @@ services:
     env_file: datahub-mce-consumer/env/docker.env
     hostname: datahub-mce-consumer
     container_name: datahub-mce-consumer
+    environment:
+      - DATAHUB_SERVER_TYPE=${DATAHUB_SERVER_TYPE:-quickstart}
+      - DATAHUB_TELEMETRY_ENABLED=${DATAHUB_TELEMETRY_ENABLED:-true}
+      - NEO4J_HOST=http://neo4j:7474
+      - NEO4J_URI=bolt://neo4j
+      - NEO4J_USERNAME=neo4j
+      - NEO4J_PASSWORD=datahub
+      - GRAPH_SERVICE_IMPL=neo4j
     ports:
       - "9090:9090"
+    depends_on:
+      - neo4j
11 changes: 0 additions & 11 deletions docker/docker-compose.dev.yml
@@ -24,17 +24,6 @@ services:
       - ./elasticsearch-setup/create-indices.sh:/create-indices.sh
       - ../metadata-service/restli-servlet-impl/src/main/resources/index/:/index

-  kafka-setup:
-    image: linkedin/datahub-kafka-setup:debug
-    build:
-      context: ../
-      dockerfile: ./docker/kafka-setup/Dockerfile
-      args:
-        APP_ENV: dev
-    depends_on:
-      - broker
-      - schema-registry
-
   datahub-gms:
     image: linkedin/datahub-gms:debug
     build:
15 changes: 15 additions & 0 deletions docker/docker-compose.kafka-setup.yml
@@ -0,0 +1,15 @@
+# Service definitions for Kafka Setup container.
+version: '3.8'
+services:
+
+  # This "container" is a workaround to pre-create topics
+  kafka-setup:
+    build:
+      context: kafka-setup
+    image: ${DATAHUB_KAFKA_SETUP_IMAGE:-linkedin/datahub-kafka-setup}:${DATAHUB_VERSION:-head}
+    env_file: kafka-setup/env/docker.env
+    hostname: kafka-setup
+    container_name: kafka-setup
+    depends_on:
+      - broker
+      - schema-registry
2 changes: 1 addition & 1 deletion docker/docker-compose.override.yml
@@ -7,7 +7,7 @@ services:
     hostname: mysql
     image: mysql:5.7
     env_file: mysql/env/docker.env
-    command: --character-set-server=utf8mb4 --collation-server=utf8mb4_bin
+    command: --character-set-server=utf8mb4 --collation-server=utf8mb4_bin --default-authentication-plugin=mysql_native_password
     ports:
       - ${DATAHUB_MAPPED_MYSQL_PORT:-3306}:3306
     volumes:
13 changes: 0 additions & 13 deletions docker/docker-compose.yml
@@ -28,19 +28,6 @@ services:
     volumes:
       - broker:/var/lib/kafka/data/

-  # This "container" is a workaround to pre-create topics
-  kafka-setup:
-    build:
-      dockerfile: ./docker/kafka-setup/Dockerfile
-      context: ../
-    image: ${DATAHUB_KAFKA_SETUP_IMAGE:-linkedin/datahub-kafka-setup}:${DATAHUB_VERSION:-head}
-    env_file: kafka-setup/env/docker.env
-    hostname: kafka-setup
-    container_name: kafka-setup
-    depends_on:
-      - broker
-      - schema-registry
-
   schema-registry:
     image: confluentinc/cp-schema-registry:5.4.0
     env_file: schema-registry/env/docker.env
204 changes: 204 additions & 0 deletions docker/quickstart/docker-compose-m1.quickstart.yml
@@ -0,0 +1,204 @@
+networks:
+  default:
+    name: datahub_network
+services:
+  broker:
+    container_name: broker
+    depends_on:
+    - zookeeper
+    environment:
+    - KAFKA_BROKER_ID=1
+    - KAFKA_ZOOKEEPER_CONNECT=zookeeper:2181
+    - KAFKA_LISTENER_SECURITY_PROTOCOL_MAP=PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
+    - KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://broker:29092,PLAINTEXT_HOST://localhost:9092
+    - KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR=1
+    - KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS=0
+    - KAFKA_HEAP_OPTS=-Xms256m -Xmx256m
+    - KAFKA_CONFLUENT_SUPPORT_METRICS_ENABLE=false
+    hostname: broker
+    image: kymeric/cp-kafka:latest
+    ports:
+    - ${DATAHUB_MAPPED_KAFKA_BROKER_PORT:-9092}:9092
+    volumes:
+    - broker:/var/lib/kafka/data/
+  datahub-actions:
+    depends_on:
+    - datahub-gms
+    environment:
+    - DATAHUB_GMS_PROTOCOL=http
+    - DATAHUB_GMS_HOST=datahub-gms
+    - DATAHUB_GMS_PORT=8080
+    - KAFKA_BOOTSTRAP_SERVER=broker:29092
+    - SCHEMA_REGISTRY_URL=http://schema-registry:8081
+    - METADATA_AUDIT_EVENT_NAME=MetadataAuditEvent_v4
+    - METADATA_CHANGE_LOG_VERSIONED_TOPIC_NAME=MetadataChangeLog_Versioned_v1
+    - DATAHUB_SYSTEM_CLIENT_ID=__datahub_system
+    - DATAHUB_SYSTEM_CLIENT_SECRET=JohnSnowKnowsNothing
+    - KAFKA_PROPERTIES_SECURITY_PROTOCOL=PLAINTEXT
+    hostname: actions
+    image: acryldata/datahub-actions:${ACTIONS_VERSION:-head}
+    restart: on-failure:5
+  datahub-frontend-react:
+    container_name: datahub-frontend-react
+    depends_on:
+    - datahub-gms
+    environment:
+    - DATAHUB_GMS_HOST=datahub-gms
+    - DATAHUB_GMS_PORT=8080
+    - DATAHUB_SECRET=YouKnowNothing
+    - DATAHUB_APP_VERSION=1.0
+    - DATAHUB_PLAY_MEM_BUFFER_SIZE=10MB
+    - JAVA_OPTS=-Xms512m -Xmx512m -Dhttp.port=9002 -Dconfig.file=datahub-frontend/conf/application.conf
+      -Djava.security.auth.login.config=datahub-frontend/conf/jaas.conf -Dlogback.configurationFile=datahub-frontend/conf/logback.xml
+      -Dlogback.debug=false -Dpidfile.path=/dev/null
+    - KAFKA_BOOTSTRAP_SERVER=broker:29092
+    - DATAHUB_TRACKING_TOPIC=DataHubUsageEvent_v1
+    - ELASTIC_CLIENT_HOST=elasticsearch
+    - ELASTIC_CLIENT_PORT=9200
+    hostname: datahub-frontend-react
+    image: ${DATAHUB_FRONTEND_IMAGE:-linkedin/datahub-frontend-react}:${DATAHUB_VERSION:-head}
+    ports:
+    - ${DATAHUB_MAPPED_FRONTEND_PORT:-9002}:9002
+    volumes:
+    - ${HOME}/.datahub/plugins:/etc/datahub/plugins
+  datahub-gms:
+    container_name: datahub-gms
+    depends_on:
+    - neo4j
+    - mysql
+    environment:
+    - DATAHUB_SERVER_TYPE=${DATAHUB_SERVER_TYPE:-quickstart}
+    - DATAHUB_TELEMETRY_ENABLED=${DATAHUB_TELEMETRY_ENABLED:-true}
+    - EBEAN_DATASOURCE_USERNAME=datahub
+    - EBEAN_DATASOURCE_PASSWORD=datahub
+    - EBEAN_DATASOURCE_HOST=mysql:3306
+    - EBEAN_DATASOURCE_URL=jdbc:mysql://mysql:3306/datahub?verifyServerCertificate=false&useSSL=true&useUnicode=yes&characterEncoding=UTF-8&enabledTLSProtocols=TLSv1.2
+    - EBEAN_DATASOURCE_DRIVER=com.mysql.jdbc.Driver
+    - KAFKA_BOOTSTRAP_SERVER=broker:29092
+    - KAFKA_SCHEMAREGISTRY_URL=http://schema-registry:8081
+    - ELASTICSEARCH_HOST=elasticsearch
+    - ELASTICSEARCH_PORT=9200
+    - ES_BULK_REFRESH_POLICY=WAIT_UNTIL
+    - ELASTICSEARCH_INDEX_BUILDER_SETTINGS_REINDEX=true
+    - ELASTICSEARCH_INDEX_BUILDER_MAPPINGS_REINDEX=true
+    - NEO4J_HOST=http://neo4j:7474
+    - NEO4J_URI=bolt://neo4j
+    - NEO4J_USERNAME=neo4j
+    - NEO4J_PASSWORD=datahub
+    - JAVA_OPTS=-Xms1g -Xmx1g
+    - GRAPH_SERVICE_DIFF_MODE_ENABLED=true
+    - GRAPH_SERVICE_IMPL=neo4j
+    - ENTITY_REGISTRY_CONFIG_PATH=/datahub/datahub-gms/resources/entity-registry.yml
+    - ENTITY_SERVICE_ENABLE_RETENTION=true
+    - MAE_CONSUMER_ENABLED=true
+    - MCE_CONSUMER_ENABLED=true
+    - PE_CONSUMER_ENABLED=true
+    - UI_INGESTION_ENABLED=true
+    - METADATA_SERVICE_AUTH_ENABLED=false
+    hostname: datahub-gms
+    image: ${DATAHUB_GMS_IMAGE:-linkedin/datahub-gms}:${DATAHUB_VERSION:-head}
+    ports:
+    - ${DATAHUB_MAPPED_GMS_PORT:-8080}:8080
+    volumes:
+    - ${HOME}/.datahub/plugins/:/etc/datahub/plugins
+    - ${HOME}/.datahub/plugins/auth/resources/:/etc/datahub/plugins/auth/resources
+  elasticsearch:
+    container_name: elasticsearch
+    environment:
+    - discovery.type=single-node
+    - xpack.security.enabled=false
+    - ES_JAVA_OPTS=-Xms256m -Xmx256m -Dlog4j2.formatMsgNoLookups=true
+    healthcheck:
+      retries: 4
+      start_period: 2m
+      test:
+      - CMD-SHELL
+      - curl -sS --fail 'http://localhost:9200/_cluster/health?wait_for_status=yellow&timeout=0s'
+        || exit 1
+    hostname: elasticsearch
+    image: elasticsearch:7.9.3
+    mem_limit: 1g
+    ports:
+    - ${DATAHUB_MAPPED_ELASTIC_PORT:-9200}:9200
+    volumes:
+    - esdata:/usr/share/elasticsearch/data
+  elasticsearch-setup:
+    container_name: elasticsearch-setup
+    depends_on:
+    - elasticsearch
+    environment:
+    - ELASTICSEARCH_HOST=elasticsearch
+    - ELASTICSEARCH_PORT=9200
+    - ELASTICSEARCH_PROTOCOL=http
+    hostname: elasticsearch-setup
+    image: ${DATAHUB_ELASTIC_SETUP_IMAGE:-linkedin/datahub-elasticsearch-setup}:${DATAHUB_VERSION:-head}
+  mysql:
+    command: --character-set-server=utf8mb4 --collation-server=utf8mb4_bin --default-authentication-plugin=mysql_native_password
+    container_name: mysql
+    environment:
+    - MYSQL_DATABASE=datahub
+    - MYSQL_USER=datahub
+    - MYSQL_PASSWORD=datahub
+    - MYSQL_ROOT_PASSWORD=datahub
+    hostname: mysql
+    image: mariadb:10.5.8
+    ports:
+    - ${DATAHUB_MAPPED_MYSQL_PORT:-3306}:3306
+    volumes:
+    - ../mysql/init.sql:/docker-entrypoint-initdb.d/init.sql
+    - mysqldata:/var/lib/mysql
+  mysql-setup:
+    container_name: mysql-setup
+    depends_on:
+    - mysql
+    environment:
+    - MYSQL_HOST=mysql
+    - MYSQL_PORT=3306
+    - MYSQL_USERNAME=datahub
+    - MYSQL_PASSWORD=datahub
+    - DATAHUB_DB_NAME=datahub
+    hostname: mysql-setup
+    image: acryldata/datahub-mysql-setup:${DATAHUB_VERSION:-head}
+  neo4j:
+    container_name: neo4j
+    environment:
+    - NEO4J_AUTH=neo4j/datahub
+    - NEO4J_dbms_default__database=graph.db
+    - NEO4J_dbms_allow__upgrade=true
+    hostname: neo4j
+    image: neo4j/neo4j-arm64-experimental:4.0.6-arm64
+    ports:
+    - ${DATAHUB_MAPPED_NEO4J_HTTP_PORT:-7474}:7474
+    - ${DATAHUB_MAPPED_NEO4J_BOLT_PORT:-7687}:7687
+    volumes:
+    - neo4jdata:/data
+  schema-registry:
+    container_name: schema-registry
+    depends_on:
+    - zookeeper
+    - broker
+    environment:
+    - SCHEMA_REGISTRY_HOST_NAME=schemaregistry
+    - SCHEMA_REGISTRY_KAFKASTORE_CONNECTION_URL=zookeeper:2181
+    hostname: schema-registry
+    image: eugenetea/schema-registry-arm64:latest
+    ports:
+    - ${DATAHUB_MAPPED_SCHEMA_REGISTRY_PORT:-8081}:8081
+  zookeeper:
+    container_name: zookeeper
+    environment:
+    - ZOOKEEPER_CLIENT_PORT=2181
+    - ZOOKEEPER_TICK_TIME=2000
+    hostname: zookeeper
+    image: kymeric/cp-zookeeper:latest
+    ports:
+    - ${DATAHUB_MAPPED_ZK_PORT:-2181}:2181
+    volumes:
+    - zkdata:/var/lib/zookeeper
+version: '2.3'
+volumes:
+  broker: null
+  esdata: null
+  mysqldata: null
+  neo4jdata: null
+  zkdata: null