[#434] feat(CI): Graviton Trino connector E2E testing #616

Merged 6 commits on Nov 3, 2023
Changes from 1 commit
19 changes: 11 additions & 8 deletions .github/workflows/integration-test.yml
@@ -9,8 +9,8 @@ on:
branches: [ "main", "branch-*" ]

env:
HIVE_IMAGE_NAME: datastrato/gravitino-ci-hive
HIVE_IMAGE_TAG_NAME: 0.1.4
HIVE_IMAGE_NAME: datastrato/gravitino-ci-hive:0.1.5
TRINO_IMAGE_NAME: datastrato/gravitino-ci-trino:0.1.0

concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
@@ -41,13 +41,15 @@ jobs:
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v2

- name: Build the hive Docker image for AMD64
if: ${{ contains(github.event.pull_request.labels.*.name, 'build docker image') }}
run: ./dev/docker/hive/build-docker.sh --platform ${PLATFORM} --image ${HIVE_IMAGE_NAME}:${HIVE_IMAGE_TAG_NAME}
- name: Pulling the Docker image in advance
if: ${{ !contains(github.event.pull_request.labels.*.name, 'build docker image') }}
Contributor:
Is this label still useful? I see that you removed the docker run logic, so is it still meaningful?

Member Author:
I rolled back these changes.

run: |
docker pull ${HIVE_IMAGE_NAME}
docker pull ${TRINO_IMAGE_NAME}

- name: Run AMD64 container
run: |
docker run --rm --name ${DOCKER_RUN_NAME} --platform ${PLATFORM} -d -p 8022:22 -p 8088:8088 -p 9000:9000 -p 9083:9083 -p 10000:10000 -p 10002:10002 -p 50010:50010 -p 50070:50070 -p 50075:50075 ${HIVE_IMAGE_NAME}:${HIVE_IMAGE_TAG_NAME}
docker run --rm --name ${DOCKER_RUN_NAME} --platform ${PLATFORM} -d -p 8022:22 -p 8088:8088 -p 9000:9000 -p 9083:9083 -p 10000:10000 -p 10002:10002 -p 50010:50010 -p 50070:50070 -p 50075:50075 ${HIVE_IMAGE_NAME}
Contributor:
I don't see a Trino `docker run` command. Is it required?

Member Author:
The Trino container is started automatically by the integration test, so we don't execute a Trino `docker run` command here.

docker ps -a

- name: Setup Gradle
@@ -76,7 +78,7 @@ jobs:
docker stop ${DOCKER_RUN_NAME}
sleep 3
docker ps -a
docker run --rm --name ${DOCKER_RUN_NAME} --platform ${PLATFORM} -d -p 8022:22 -p 8088:8088 -p 9000:9000 -p 9083:9083 -p 10000:10000 -p 10002:10002 -p 50010:50010 -p 50070:50070 -p 50075:50075 ${HIVE_IMAGE_NAME}:${HIVE_IMAGE_TAG_NAME}
docker run --rm --name ${DOCKER_RUN_NAME} --platform ${PLATFORM} -d -p 8022:22 -p 8088:8088 -p 9000:9000 -p 9083:9083 -p 10000:10000 -p 10002:10002 -p 50010:50010 -p 50070:50070 -p 50075:50075 ${HIVE_IMAGE_NAME}
sleep 60

- name: Setup debug Github Action
@@ -107,4 +109,5 @@ jobs:
docker stop ${DOCKER_RUN_NAME}
sleep 3
docker ps -a
docker rmi ${HIVE_IMAGE_NAME}:${HIVE_IMAGE_TAG_NAME}
docker rmi ${HIVE_IMAGE_NAME}
docker rmi ${TRINO_IMAGE_NAME}
5 changes: 4 additions & 1 deletion .gitignore
@@ -37,4 +37,7 @@ distribution
server/src/main/resources/project.properties

dev/docker/hive/packages
docs/build
docs/build

dev/docker/tools/docker-connector
dev/docker/tools/docker-connector.conf
4 changes: 2 additions & 2 deletions conf/gravitino.conf.template
@@ -8,7 +8,7 @@ gravitino.server.shutdown.timeout = 3000

# THE CONFIGURATION FOR Gravitino WEB SERVER
# The host name of the built-in web server
gravitino.server.webserver.host = 127.0.0.1
gravitino.server.webserver.host = 0.0.0.0
# The http port number of the built-in web server
gravitino.server.webserver.httpPort = 8090
# The min thread size of the built-in web server
@@ -44,7 +44,7 @@ gravitino.auxService.names = iceberg-rest
# Iceberg REST service classpath
gravitino.auxService.iceberg-rest.classpath = catalogs/lakehouse-iceberg/libs, catalogs/lakehouse-iceberg/conf
# Iceberg REST service host
gravitino.auxService.iceberg-rest.host = 127.0.0.1
gravitino.auxService.iceberg-rest.host = 0.0.0.0
# Iceberg REST service http port
gravitino.auxService.iceberg-rest.httpPort = 9001

2 changes: 1 addition & 1 deletion dev/docker/hive/Dockerfile
@@ -4,7 +4,7 @@
#

FROM ubuntu:16.04
LABEL maintainer="dev@datastrato.com"
LABEL maintainer="support@datastrato.com"

ARG HADOOP_PACKAGE_NAME
ARG HIVE_PACKAGE_NAME
35 changes: 35 additions & 0 deletions dev/docker/tools/macos-docker-connector.sh
@@ -0,0 +1,35 @@
#!/bin/bash
#
# Copyright 2023 Datastrato.
# This software is licensed under the Apache License version 2.
#
#set -ex
Contributor:
What is this script used for? Please add some comments explaining its usage scenario.

Member Author:
OK, I added a README.md in the dev/docker/tools/ directory.

set -ex
bin="$(dirname "${BASH_SOURCE-$0}")"
bin="$(cd "${bin}">/dev/null; pwd)"

OS=$(uname -s)
if [ "${OS}" != "Darwin" ]; then
echo "Only macOS needs to start macos-docker-connector."
exit 1
fi

if pgrep -xq "docker-connector"; then
echo "docker-connector is running."
exit 1
fi

DOCKER_CONNECTOR_PACKAGE_NAME="docker-connector-darwin.tar.gz"
DOCKER_CONNECTOR_DOWNLOAD_URL="https://github.com/wenjunxiao/mac-docker-connector/releases/download/v3.2/${DOCKER_CONNECTOR_PACKAGE_NAME}"

if [ ! -f "${bin}/docker-connector" ]; then
wget -q -P "${bin}" ${DOCKER_CONNECTOR_DOWNLOAD_URL}
tar -xzf "${bin}/${DOCKER_CONNECTOR_PACKAGE_NAME}" -C "${bin}"
rm -rf "${bin}/${DOCKER_CONNECTOR_PACKAGE_NAME}"
fi

# Create a docker-connector.conf file with the routes to the docker networks
docker network ls --filter driver=bridge --format "{{.ID}}" | xargs docker network inspect --format "route {{range .IPAM.Config}}{{.Subnet}}{{end}}" > ./docker-connector.conf

echo "Starting docker-connector requires root privileges. Please enter the root password."
sudo "${bin}/docker-connector" -config ./docker-connector.conf
7 changes: 7 additions & 0 deletions dev/docker/trino/conf/catalog/gravitino.properties.template
@@ -0,0 +1,7 @@
#
# Copyright 2023 Datastrato.
# This software is licensed under the Apache License version 2.
#
connector.name=gravitino
gravitino.uri=http://GRAVITINO_HOST_IP:GRAVITINO_HOST_PORT
gravitino.metalake=GRAVITINO_METALAKE_NAME
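
The upper-case tokens in this template (`GRAVITINO_HOST_IP`, `GRAVITINO_HOST_PORT`, `GRAVITINO_METALAKE_NAME`) are placeholders to be replaced before Trino reads the file. This diff does not show the code that performs the substitution, so the following is only a minimal sketch of one way it could be done with plain string replacement. The class `TemplateRenderer`, the method `render`, the sample IP address, and the metalake name are hypothetical; port 8090 matches `gravitino.server.webserver.httpPort` in `conf/gravitino.conf.template`.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of this PR: fills placeholder tokens in a
// properties template with concrete values.
final class TemplateRenderer {
    private TemplateRenderer() {}

    /** Replaces every occurrence of each placeholder key with its value. */
    static String render(String template, Map<String, String> values) {
        String result = template;
        for (Map.Entry<String, String> entry : values.entrySet()) {
            result = result.replace(entry.getKey(), entry.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        String template =
            "connector.name=gravitino\n"
                + "gravitino.uri=http://GRAVITINO_HOST_IP:GRAVITINO_HOST_PORT\n"
                + "gravitino.metalake=GRAVITINO_METALAKE_NAME\n";

        Map<String, String> values = new LinkedHashMap<>();
        values.put("GRAVITINO_HOST_IP", "192.168.0.10"); // assumed host IP
        values.put("GRAVITINO_HOST_PORT", "8090");
        values.put("GRAVITINO_METALAKE_NAME", "test"); // assumed metalake name

        System.out.print(render(template, values));
    }
}
```

Plain replacement is enough here because no placeholder name is a prefix of another; a template with overlapping names would need delimiters around each token.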
7 changes: 7 additions & 0 deletions dev/docker/trino/conf/catalog/hive.properties.template
@@ -0,0 +1,7 @@
#
# Copyright 2023 Datastrato.
# This software is licensed under the Apache License version 2.
#
connector.name = hive
hive.metastore.uri = thrift://HIVE_HOST_IP:9083
hive.allow-drop-table=true
5 changes: 5 additions & 0 deletions dev/docker/trino/conf/catalog/jmx.properties
@@ -0,0 +1,5 @@
#
# Copyright 2023 Datastrato.
# This software is licensed under the Apache License version 2.
#
connector.name=jmx
6 changes: 6 additions & 0 deletions dev/docker/trino/conf/catalog/tpcds.properties
@@ -0,0 +1,6 @@
#
# Copyright 2023 Datastrato.
# This software is licensed under the Apache License version 2.
#
connector.name=tpcds
tpcds.splits-per-node=4
6 changes: 6 additions & 0 deletions dev/docker/trino/conf/catalog/tpch.properties
@@ -0,0 +1,6 @@
#
# Copyright 2023 Datastrato.
# This software is licensed under the Apache License version 2.
#
connector.name=tpch
tpch.splits-per-node=4
16 changes: 16 additions & 0 deletions dev/docker/trino/conf/config.properties
@@ -0,0 +1,16 @@
#
# Copyright 2023 Datastrato.
# This software is licensed under the Apache License version 2.
#
#single node install config
#coordinator=true
#node-scheduler.include-coordinator=true
#http-server.http.port=8080
#discovery-server.enabled=true
#discovery.uri=http://localhost:8080
#protocol.v1.alternate-header-name=Presto
#hive.hdfs.impersonation.enabled=true
coordinator=true
node-scheduler.include-coordinator=true
http-server.http.port=8080
discovery.uri=http://0.0.0.0:8080
18 changes: 18 additions & 0 deletions dev/docker/trino/conf/jvm.config
@@ -0,0 +1,18 @@
#
# Copyright 2023 Datastrato.
# This software is licensed under the Apache License version 2.
#
-server
-Xmx1G
-XX:-UseBiasedLocking
-XX:+UseG1GC
-XX:G1HeapRegionSize=32M
-XX:+ExplicitGCInvokesConcurrent
-XX:+HeapDumpOnOutOfMemoryError
-XX:+UseGCOverheadLimit
-XX:+ExitOnOutOfMemoryError
-XX:ReservedCodeCacheSize=256M
-Djdk.attach.allowAttachSelf=true
-Djdk.nio.maxCachedBufferSize=2000000
-DHADOOP_USER_NAME=hive
-Dlog4j.configurationFile=/etc/trino/log4j2.properties
6 changes: 6 additions & 0 deletions dev/docker/trino/conf/log.properties
@@ -0,0 +1,6 @@
#
# Copyright 2023 Datastrato.
# This software is licensed under the Apache License version 2.
#
# Log level for Trino classes
io.trino=INFO
30 changes: 30 additions & 0 deletions dev/docker/trino/conf/log4j2.properties
@@ -0,0 +1,30 @@
#
# Copyright 2023 Datastrato.
# This software is licensed under the Apache License version 2.
#

# Set to debug or trace if log4j initialization is failing
status = info

# Name of the configuration
name = ConsoleLogConfig

# Console appender configuration
appender.console.type = Console
appender.console.name = consoleLogger
appender.console.layout.type = PatternLayout
appender.console.layout.pattern = %d{yyyy-MM-dd HH:mm:ss} %-5p %c{1}:%L - %m%n

# File appender configuration
appender.file.type = File
appender.file.name = fileLogger
appender.file.fileName = gravitino-trino-connector.log
appender.file.layout.type = PatternLayout
appender.file.layout.pattern = %d{yyyy-MM-dd HH:mm:ss} %-5p %c{1}:%L - %m%n

# Root logger level
rootLogger.level = info

# Root logger referring to console and file appenders
rootLogger.appenderRef.stdout.ref = consoleLogger
rootLogger.appenderRef.file.ref = fileLogger
7 changes: 7 additions & 0 deletions dev/docker/trino/conf/node.properties
@@ -0,0 +1,7 @@
#
# Copyright 2023 Datastrato.
# This software is licensed under the Apache License version 2.
#
node.environment=docker
node.data-dir=/data/trino
plugin.dir=/usr/lib/trino/plugin
5 changes: 5 additions & 0 deletions docs/integration-test.md
@@ -88,6 +88,11 @@ Run only test cases where tag is set `gravitino-docker-it`. [embbeded|deplo

Before running the tests, make sure Docker is installed.

#### macOS Docker connector
Docker Desktop for Mac does not let the host (macOS) reach containers by their IP addresses.
The macos-docker-connector gives the macOS host direct access to Docker container IPs.
Before running the integration tests on macOS, execute the `dev/docker/tools/macos-docker-connector.sh` script.

#### Running Gravitino Hive CI Docker Environment

1. Run a Hive Docker test environment container locally using the `docker run --rm -d -p 8022:22 -p 8088:8088 -p 9000:9000 -p 9083:9083 -p 10000:10000 -p 10002:10002 -p 50010:50010 -p 50070:50070 -p 50075:50075 datastrato/gravitino-ci-hive` command.
Contributor:
Can you please add documentation on how to run the Trino integration test locally?

Member Author (@xunliu, Oct 31, 2023):
OK, I updated docs/integration-test.md to describe how to run the Trino integration test. TrinoIT checks the test environment automatically: if mac-docker-connector is not running locally, TrinoIT will not run.

Contributor:
Then you need to change the paragraph title, because it only mentions the "Hive CI Docker". Also, do we need to start a Docker container before running the integration test?

I think you should reorganize the doc so that people without background knowledge can run the tests easily, and hide as many details as you can.
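
The environment check discussed in this thread is not part of the diff shown here. As a rough sketch of what such a precondition could look like, the following assumes the rule "on macOS, run only if a `docker-connector` process is visible; on other systems, always run". The class `TrinoITPrecheck` and its methods are hypothetical names, and the process lookup is an assumption: `ProcessHandle` may not expose the command of a process owned by another user, which matters because the connector is started with `sudo`.

```java
import java.util.Locale;

// Hypothetical precheck, not the PR's actual TrinoIT logic.
final class TrinoITPrecheck {
    private TrinoITPrecheck() {}

    /** Returns true when the JVM's os.name value denotes macOS (for example "Mac OS X"). */
    static boolean isMacOs(String osName) {
        return osName.toLowerCase(Locale.ROOT).contains("mac");
    }

    /** Returns true when a visible process has an executable path ending in "docker-connector". */
    static boolean isDockerConnectorRunning() {
        return ProcessHandle.allProcesses()
            .map(process -> process.info().command().orElse(""))
            .anyMatch(command -> command.endsWith("docker-connector"));
    }

    /** Only macOS needs the connector; every other OS can run the test directly. */
    static boolean canRun(String osName, boolean connectorRunning) {
        return !isMacOs(osName) || connectorRunning;
    }

    public static void main(String[] args) {
        String osName = System.getProperty("os.name");
        boolean runnable = canRun(osName, isDockerConnectorRunning());
        System.out.println(
            runnable
                ? "Trino integration test can run"
                : "Skipping: run dev/docker/tools/macos-docker-connector.sh first");
    }
}
```

Keeping the decision in a pure function (`canRun`) separates it from the process lookup, so the rule can be tested without Docker or macOS.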

7 changes: 5 additions & 2 deletions gradle/libs.versions.toml
@@ -31,7 +31,8 @@ spark = "3.4.1"
scala-collection-compat = "2.7.0"
sqlite-jdbc = "3.42.0.0"
testng = "7.7.1"

testcontainers = "1.19.0"
trino-jdbc = "426"

protobuf-plugin = "0.9.2"
spotless-plugin = '6.11.0'
@@ -99,7 +100,9 @@ scala-collection-compat = { group = "org.scala-lang.modules", name = "scala-col
sqlite-jdbc = { group = "org.xerial", name = "sqlite-jdbc", version.ref = "sqlite-jdbc" }
testng = { group = "org.testng", name = "testng", version.ref = "testng" }
spark-hive = { group = "org.apache.spark", name = "spark-hive_2.13", version.ref = "spark" }

testcontainers = { group = "org.testcontainers", name = "testcontainers", version.ref = "testcontainers" }
testcontainers-junit-jupiter = { group = "org.testcontainers", name = "junit-jupiter", version.ref = "testcontainers" }
trino-jdbc = { group = "io.trino", name = "trino-jdbc", version.ref = "trino-jdbc" }

[bundles]
log4j = ["slf4j-api", "log4j-slf4j2-impl", "log4j-api", "log4j-core", "log4j-12-api"]
17 changes: 15 additions & 2 deletions integration-test/build.gradle.kts
@@ -111,6 +111,9 @@ dependencies {
testImplementation(libs.scala.collection.compat)
testImplementation(libs.sqlite.jdbc)
testImplementation(libs.spark.hive)
testImplementation(libs.testcontainers)
testImplementation(libs.testcontainers.junit.jupiter)
testImplementation(libs.trino.jdbc)
}

/* Optimizing integration test execution conditions */
@@ -196,17 +199,27 @@ tasks.test {
exclude("**/integration/test/**")
} else {
dependsOn("checkDockerRunning")

doFirst {
copy {
from("${project.rootDir}/dev/docker/trino/conf")
into("build/tirno-conf")
Contributor:
Typo: this should be `trino-conf`.

Member Author:
Done.

Contributor:
Still not fixed.

rename { fileName ->
fileName.replace(".template", "")
}
fileMode = 0b111101101 // rwxr-xr-x (octal 0755)
}

// Default use MiniGravitino to run integration tests
environment("GRAVITINO_ROOT_DIR", rootDir.path)
// TODO: use hive user instead after we fix the permission issue #554
environment("HADOOP_USER_NAME", "root")
environment("HADOOP_HOME", "/tmp")
environment("PROJECT_VERSION", version)
environment("TRINO_CONF_DIR", buildDir.path + "/tirno-conf")

val testMode = project.properties["testMode"] as? String ?: "embedded"
systemProperty("gravitino.log.path", buildDir.path)
systemProperty("gravitino.log.path", buildDir.path + "/integration-test.log")
delete(buildDir.path + "/integration-test.log")
if (testMode == "deploy") {
environment("GRAVITINO_HOME", rootDir.path + "/distribution/package")
systemProperty("testMode", "deploy")
@@ -0,0 +1,51 @@
/*
* Copyright 2023 Datastrato.
* This software is licensed under the Apache License version 2.
*/
package com.datastrato.gravitino.integration.test.trino;

import static com.google.common.base.Throwables.propagateIfPossible;
import static java.util.Objects.requireNonNull;

import java.util.ArrayDeque;
import java.util.Deque;

/** This class is inspired by com.google.common.io.Closer */
public final class AutoCloseableCloser implements AutoCloseable {
private final Deque<AutoCloseable> stack = new ArrayDeque<>(4);

private AutoCloseableCloser() {}

public static AutoCloseableCloser create() {
return new AutoCloseableCloser();
}

public <C extends AutoCloseable> C register(C closeable) {
requireNonNull(closeable, "closeable is null");
stack.addFirst(closeable);
return closeable;
}

@Override
public void close() throws Exception {
Throwable rootCause = null;
while (!stack.isEmpty()) {
AutoCloseable closeable = stack.removeFirst();
try {
closeable.close();
} catch (Throwable t) {
if (rootCause == null) {
rootCause = t;
} else if (rootCause != t) {
// Self-suppression not permitted
rootCause.addSuppressed(t);
}
}
}
if (rootCause != null) {
propagateIfPossible(rootCause, Exception.class);
// not possible
throw new AssertionError(rootCause);
}
}
}
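
To illustrate the behavior of the class above, here is a small sketch of how registered resources are closed in reverse registration order and how later failures are attached to the first one as suppressed exceptions. To stay free of the Guava dependency it uses a simplified stand-in, `SimpleCloser`, which catches `Exception` rather than `Throwable`; `SimpleCloser`, `CloserDemo`, `resource`, and the resource names are all hypothetical and not part of this PR.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class CloserDemo {
    private CloserDemo() {}

    /** Simplified stand-in for AutoCloseableCloser (assumption: Exception only, no Guava). */
    static final class SimpleCloser implements AutoCloseable {
        private final Deque<AutoCloseable> stack = new ArrayDeque<>();

        <C extends AutoCloseable> C register(C closeable) {
            stack.addFirst(closeable); // last registered is closed first
            return closeable;
        }

        @Override
        public void close() throws Exception {
            Exception first = null;
            while (!stack.isEmpty()) {
                try {
                    stack.removeFirst().close();
                } catch (Exception e) {
                    if (first == null) {
                        first = e;
                    } else if (first != e) {
                        first.addSuppressed(e); // keep later failures without losing the first
                    }
                }
            }
            if (first != null) {
                throw first;
            }
        }
    }

    /** Creates a fake resource that logs its close and optionally fails. */
    static AutoCloseable resource(String name, boolean failOnClose, List<String> log) {
        return () -> {
            log.add("closed " + name);
            if (failOnClose) {
                throw new IllegalStateException(name + " failed to close");
            }
        };
    }

    /** Registers three resources and returns the recorded sequence of events. */
    static List<String> run() {
        List<String> log = new ArrayList<>();
        try (SimpleCloser closer = new SimpleCloser()) {
            closer.register(resource("hive", false, log));
            closer.register(resource("gravitino", true, log));
            closer.register(resource("trino", true, log));
        } catch (Exception e) {
            log.add("thrown: " + e.getMessage());
            for (Throwable suppressed : e.getSuppressed()) {
                log.add("suppressed: " + suppressed.getMessage());
            }
        }
        return log;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

Every resource is closed even when an earlier close fails, which is the main reason to use a closer of this kind for test containers that are started in sequence.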