Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[#434] feat(CI): Graviton Trino connector E2E testing #616

Merged
merged 6 commits into from
Nov 3, 2023
Merged

[#434] feat(CI): Graviton Trino connector E2E testing #616

merged 6 commits into from
Nov 3, 2023

Conversation

xunliu
Copy link
Member

@xunliu xunliu commented Oct 27, 2023

What changes were proposed in this pull request?

  1. Add HiveContainer and TrinoContainer
  2. Use Trino JDBC to connect TrinoContainer execute create database, table, insert, select and join two tables test.

Why are the changes needed?

  • Through Trino operations hive
  • Supports Trino E2E testing in the local and GitHub Action.

Fix: #434

Does this PR introduce any user-facing change?

N/A

How was this patch tested?

Add TrinoConnectorIT in the Integration-test

@github-actions
Copy link

github-actions bot commented Oct 27, 2023

Code Coverage Report

Overall Project 67.17% 🟢

There is no coverage information present for the Files changed

if: ${{ contains(github.event.pull_request.labels.*.name, 'build docker image') }}
run: ./dev/docker/hive/build-docker.sh --platform ${PLATFORM} --image ${HIVE_IMAGE_NAME}:${HIVE_IMAGE_TAG_NAME}
- name: Pulling the Docker image in advance
if: ${{ !contains(github.event.pull_request.labels.*.name, 'build docker image') }}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this label still useful? I found that you removed docker run logic, so is this tag still meaningful?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I rollback these changes.


- name: Run AMD64 container
run: |
docker run --rm --name ${DOCKER_RUN_NAME} --platform ${PLATFORM} -d -p 8022:22 -p 8088:8088 -p 9000:9000 -p 9083:9083 -p 10000:10000 -p 10002:10002 -p 50010:50010 -p 50070:50070 -p 50075:50075 ${HIVE_IMAGE_NAME}:${HIVE_IMAGE_TAG_NAME}
docker run --rm --name ${DOCKER_RUN_NAME} --platform ${PLATFORM} -d -p 8022:22 -p 8088:8088 -p 9000:9000 -p 9083:9083 -p 10000:10000 -p 10002:10002 -p 50010:50010 -p 50070:50070 -p 50075:50075 ${HIVE_IMAGE_NAME}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see trino dock run command, is it required?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trino container will auto-running in the integration test. so we didn't execute Trino docker run command.

# Copyright 2023 Datastrato.
# This software is licensed under the Apache License version 2.
#
#set -ex
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the usage of this script, can you please add some comments to explain the usage scenario of this script.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I added README.md in the dev/docker/tools/ directory.

Because Docker Desktop for Mac does not provide access to container IP from host(macOS).
The macos-docker-connector provides the ability for the macOS host to directly access the docker container IP.
Before running the integration tests, make sure to execute the `dev/docker/tools/macos-docker-connector.sh` script.

#### Running Gravitino Hive CI Docker Environment

1. Run a hive docker test environment container in the local using the `docker run --rm -d -p 8022:22 -p 8088:8088 -p 9000:9000 -p 9083:9083 -p 10000:10000 -p 10002:10002 -p 50010:50010 -p 50070:50070 -p 50075:50075 datastrato/gravitino-ci-hive` command.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please add the doc about how to run trino integration test locally.

Copy link
Member Author

@xunliu xunliu Oct 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I updated you about how to run the Trino integration test in the docs/integration-test.md, Actual TrinoIT will auto-check the test environment. If you didn't run mac-docker-connector in your local, then TrinoIT will not be running.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you have to change the paragraph title. Here it only mentions the "Hive CI Docker". Another thing is that do we need to start a docker before running the integration test?

I think you should reorganize the doc for others without background easy to run and hide details as much as you can.

doFirst {
copy {
from("${project.rootDir}/dev/docker/trino/conf")
into("build/tirno-conf")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: trino-conf

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DONE.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still not fixed.

}

@Test
public void joinTwoHiveTableTest() throws TException, InterruptedException {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please split into several test, for each test only cover one functionality, which will be easy to maintain and add more tests later on.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DONE.

@xunliu
Copy link
Member Author

xunliu commented Oct 31, 2023

hi @jerryshao @yuqi1129 @mchades I improved this PR, Please help me review it, Thanks.

@xunliu
Copy link
Member Author

xunliu commented Nov 1, 2023

@jerryshao I have fixed all the comments issue, Please help me review this PR, thanks.

Comment on lines 7 to 9
Because Docker Desktop for Mac does not provide access to container IP from host(macOS).
The [mac-docker-connector](https://github.com/wenjunxiao/mac-docker-connector) provides the ability for the macOS host to directly access the docker container IP.
This can result in host(macOS) and containers not being able to access each other's internal services directly over IPs.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The words here are confusing to me. I think it might be:

Because Docker Desktop for Mac does not provide access to container IP from host(macOS). This can result in host(macOS) and containers not being able to access each other's internal services directly over IPs.

The [mac-docker-connector](https://github.com/wenjunxiao/mac-docker-connector) provides the ability for the macOS host to directly access the docker container IP.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I fixed it.

------------------- Check Docker environment ------------------
Docker server status .......................................... [running]
Gravitino IT Docker container is already running ............... [no]
Run only test cases where tag is set `gravitino-trino-it`. [embbeded|deploy test]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found that the gradle output doesn't match the description you mentioned here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have tested in these case:

  1. No Gravitino-hive-ci docker container and mac-docker-connector running.
------------------ Check Docker environment -----------------
Docker server status ........................................ [running]
Gravitino IT Docker container is already running ............. [no]
Run test cases without `gravitino-tirno-it` tag .............. [embedded test]
Run test cases without `gravitino-docker-it` tag ............. [embedded test]
-------------------------------------------------------------
  1. Only running Gravitino-hive-ci docker container
------------------ Check Docker environment -----------------
Docker server status ........................................ [running]
Gravitino IT Docker container is already running ............. [yes]
Run test cases without `gravitino-tirno-it` tag .............. [embedded test]
Use Gravitino IT Docker container to run all integration test. [embedded test]
-------------------------------------------------------------
  1. Running Gravitino-hive-ci docker container and mac-docker-connector
------------------ Check Docker environment -----------------
Docker server status ........................................ [running]
Gravitino IT Docker container is already running ............. [yes]
Use Gravitino IT Docker container to run all integration test. [embedded test]
-------------------------------------------------------------

Other, I will reconstruction HiveCatalogIT code and document in the #639

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I see, but there's a typo "tirno".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I fixed it.

Because Docker Desktop for Mac does not provide access to container IP from host(macOS).
The macos-docker-connector provides the ability for the macOS host to directly access the docker container IP.
Before running the integration tests, make sure to execute the `dev/docker/tools/macos-docker-connector.sh` script.

#### Running Gravitino Hive CI Docker Environment

1. Run a hive docker test environment container in the local using the `docker run --rm -d -p 8022:22 -p 8088:8088 -p 9000:9000 -p 9083:9083 -p 10000:10000 -p 10002:10002 -p 50010:50010 -p 50070:50070 -p 50075:50075 datastrato/gravitino-ci-hive` command.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you have to change the paragraph title. Here it only mentions the "Hive CI Docker". Another thing is that do we need to start a docker before running the integration test?

I think you should reorganize the doc for others without background easy to run and hide details as much as you can.

@@ -129,13 +135,20 @@ fun printDockerCheckInfo() {
}
val dockerRunning = project.extra["dockerRunning"] as? Boolean ?: false
val hiveContainerRunning = project.extra["hiveContainerRunning"] as? Boolean ?: false
val macDockerConnector = project.extra["macDockerConnector"] as? Boolean ?: false
if (dockerRunning && hiveContainerRunning) {
EXCLUDE_DOCKER_TEST = false
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You'd better change to "EXCLUDE_HIVE_TEST".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to refactor the HiveCatalogIT code and documentation in #639 and we won't need to check the container environment anymore!

if (dockerRunning && hiveContainerRunning) {
EXCLUDE_DOCKER_TEST = false
}
if (dockerRunning && macDockerConnector) {
EXCLUDE_TRINO_TEST = false
}

println("------------------ Check Docker environment -----------------")
println("Docker server status ........................................ [${if (dockerRunning) "running" else "stop"}]")
println("Gravitino IT Docker container is already running ............. [${if (hiveContainerRunning) "yes" else "no"}]")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's actually hive container, you need to change to make people easy to understand.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the next PR #639, we simply check to see if Docker is running. the Hive and Trino Docker containers will start automatically in the integration test code.

doFirst {
copy {
from("${project.rootDir}/dev/docker/trino/conf")
into("build/tirno-conf")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still not fixed.

String ipAddress = containerResponse.getNetworkSettings().getIpAddress();
Map<String, ContainerNetwork> containerNetworkMap =
containerResponse.getNetworkSettings().getNetworks();
Assertions.assertEquals(1, containerNetworkMap.size());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not proper to use Assertions here for non-test code. You can change to use Guava Preconditions instead.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I fixed it.

* Copyright 2023 Datastrato.
* This software is licensed under the Apache License version 2.
*/
package com.datastrato.gravitino.integration.test.trino;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here it is still the same: a) it is not proper to put HiveContainer code to trino package. b) this is non-test code, better moving to src utils package.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point! I fixed it.

private Builder() {
this.image = DEFAULT_IMAGE;
this.hostName = HOST_NAME;
this.exposePorts = ImmutableSet.of(8088, 9000, 9083, 10000, 10002, 50070, 50075, 50010);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You'd better clearly define the port number in code, not just a hard-coded integer here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DONE.

import org.testcontainers.containers.Network;

@Tag("gravitino-trino-it")
public class TrinoConnectorIT extends AbstractIT {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you need to refactor this class a bit.

  • Move the prepare-to-test and check-status code to an abstract class. With this test class, it is better only having test code.
  • Test the functionalities one by one. For example, 1) operate schemas, like create, list, alter, drop...; b) operate tables; c) operate data.

Here your test is actually a scenario test, not a functional test.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also it is hard to add new functional validation to your code if Trino connector supports more functions.

Copy link
Member Author

@xunliu xunliu Nov 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point.

Move the prepare-to-test and check-status code to an abstract class. With this test class, it is better only having test code.

DONE.

Test the functionalities one by one. For example, 1) operate schemas, like create, list, alter, drop...; b) operate tables; c) operate data.

Trino supports many SQL, We need to add test case to cover that.
I think the best way to verify our trino-connector is to use the official Trino test cases in Gravitino.
So I created #657 to track this issue.

@xunliu
Copy link
Member Author

xunliu commented Nov 2, 2023

@jerryshao I fixed all the issues about comments, Please help me review this PR, Thanks.


@Test
@Order(2)
public void createTableTest() throws TException, InterruptedException {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please rename all the test method name to "testXXX"

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DONE


<logger name="org.testcontainers" level="DEBUG"/>
<logger name="com.github.dockerjava" level="WARN"/>
</configuration>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the usage of this file?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I deleted this unused logback-test.xml

@xunliu
Copy link
Member Author

xunliu commented Nov 3, 2023

@jerryshao I fixed the 2 issues you raised. Please review it again. Thanks!

@jerryshao jerryshao merged commit 8cf2f61 into apache:main Nov 3, 2023
github-actions bot pushed a commit that referenced this pull request Nov 3, 2023
### What changes were proposed in this pull request?

1. Add HiveContainer and TrinoContainer
3. Use Trino JDBC to connect TrinoContainer execute create database,
table, insert, select and join two tables test.

### Why are the changes needed?

+ Through Trino operations hive
+ Supports Trino E2E testing in the local and GitHub Action.

Fix: #434 

### Does this PR introduce _any_ user-facing change?
N/A

### How was this patch tested?
Add TrinoConnectorIT in the Integration-test
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Subtask] Graviton Trino connector E2E testing
4 participants