Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore: Fix flaky IT test cases #8260

Merged
merged 24 commits into from
Sep 12, 2022
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
24 commits
Select commit Hold shift + click to select a range
9fb64e6
chore: Busy wait java-container CREATE operation to finish
lqiu96 Aug 31, 2022
c3edca8
chore: Delete any existing java-notebook instances
lqiu96 Aug 31, 2022
e6c548e
chore: Use ListInstanceRequest with parent
lqiu96 Aug 31, 2022
06d3f67
chore: Delete gapic compute instances
lqiu96 Sep 1, 2022
7ad0c30
chore: Use ZonedDateTime for RFC3339 with Zone
lqiu96 Sep 1, 2022
e77cefd
chore: Setup the correct Request object field
lqiu96 Sep 1, 2022
b466d5f
chore: Create new IT profile
lqiu96 Sep 1, 2022
704cd3c
chore: Move shared deps profile to root pom.xml
lqiu96 Sep 1, 2022
9c21c07
chore: Use test lifecycle phase for graalvm tests
lqiu96 Sep 1, 2022
056b2cd
chore: Use old pom configs
lqiu96 Sep 2, 2022
0ad9b28
chore: Use correct path to vision resources
lqiu96 Sep 2, 2022
06c618d
Merge branch 'main' into update_container_IT
lqiu96 Sep 2, 2022
c200945
chore: Remove unused pom.xml
lqiu96 Sep 6, 2022
3922ea7
chore: Move busy wait after CREATE operation
lqiu96 Sep 6, 2022
08a5645
chore: Add specific names for IT compute instances
lqiu96 Sep 6, 2022
7782c59
🦉 Updates from OwlBot post-processor
gcf-owl-bot[bot] Sep 6, 2022
aaa73ba
chore: Clean up compute instances names
lqiu96 Sep 7, 2022
7ac0434
chore: Clean up instances
lqiu96 Sep 7, 2022
ad03253
🦉 Updates from OwlBot post-processor
gcf-owl-bot[bot] Sep 7, 2022
2bfe213
chore: Add checks for deletion
lqiu96 Sep 8, 2022
24462f0
Merge branch 'update_container_IT' of github.com:googleapis/google-cl…
lqiu96 Sep 8, 2022
dd1bad5
chore: Add missing import
lqiu96 Sep 8, 2022
a99d4dd
chore: Add missing import for notebooks
lqiu96 Sep 8, 2022
b087f07
Merge branch 'main' into update_container_IT
lqiu96 Sep 9, 2022
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 2 additions & 2 deletions .kokoro/build.sh
Original file line number Diff line number Diff line change
Expand Up @@ -156,7 +156,7 @@ case ${JOB_TYPE} in
-Penable-integration-tests \
-Pnative \
-fae \
verify
test
meltsufin marked this conversation as resolved.
Show resolved Hide resolved
RETURN_CODE=$?
printf "Finished Unit and Integration Tests for GraalVM Native:\n%s\n" "${module_list}"
else
Expand All @@ -182,7 +182,7 @@ case ${JOB_TYPE} in
-Penable-integration-tests \
-Pnative \
-fae \
verify
test
RETURN_CODE=$?
printf "Finished Unit and Integration Tests for GraalVM Native 17:\n%s\n" "${module_list}"
else
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -22,8 +22,9 @@ public class BaseTest {
protected static final String DEFAULT_PROJECT = ServiceOptions.getDefaultProjectId();
protected static final String DEFAULT_ZONE = "us-central1-a";
protected static final String DEFAULT_REGION = "us-west1";
protected static final String COMPUTE_PREFIX = "it-test-compute";

public static String generateRandomName(String placeholder) {
return "gapic-" + placeholder + UUID.randomUUID().toString().substring(0, 8);
return COMPUTE_PREFIX + "-" + placeholder + "-" + UUID.randomUUID().toString().substring(0, 8);
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -72,6 +72,8 @@ public static void setUp() throws IOException {
instances = new ArrayList<>();
InstancesSettings instanceSettings = InstancesSettings.newBuilder().build();
instancesClient = InstancesClient.create(instanceSettings);

Util.cleanUpComputeInstances(instancesClient, DEFAULT_PROJECT, DEFAULT_ZONE);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this a bit too aggressive to delete all compute instances? Why isn't the cleanup in tearDown() sufficient?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tearDown() cleanup is sufficient for 99% of the case. There was a significant amount of compute instances that were in there (my guess from previous runs that didn't have the cleanup or when the IT jobs were cancelled halfway).

This project is only used for ITs so it would have 24 hours to report the results before the previous instances are deleted.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm worried that someone runs the test on their own project and causes inadvertent deletion of instances. If we want to do this kind of cleanup, maybe we can have instance name prefix that is consistent and can accurately filter instances created by this test.

Copy link
Contributor Author

@lqiu96 lqiu96 Sep 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm worried that someone runs the test on their own project and causes inadvertent deletion of instances.

Actually, I was thinking that this would be fine for the ITs. I was thinking the ideal state for gcloud-devel would have 0 compute instances when there are no active ITs running. If Util.cleanUpComputeInstances (or more filtered version of that function) matches any existing instances, then those should be deleted. 24 hours is 12x as long as the longest GraalVM job.

maybe we can have instance name prefix that is consistent and can accurately filter instances created by this test.

Yep! I was planning on creating another PR that would update the instances for each module to have a more unique name instead test-instance or gapic-instance. i.e. Compute would be test-compute-instance-UUID, Container would be test-container-instance-UUID or something like that.

Copy link
Contributor Author

@lqiu96 lqiu96 Sep 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok nvm, I just re-read the comment.

test on their own project

}

@Before
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,37 @@
package com.google.cloud.compute.v1.integration;

import com.google.cloud.compute.v1.DeleteInstanceRequest;
import com.google.cloud.compute.v1.Instance;
import com.google.cloud.compute.v1.InstancesClient;
import com.google.cloud.compute.v1.InstancesClient.ListPagedResponse;
import java.time.Instant;
import java.time.ZonedDateTime;
import java.time.temporal.ChronoUnit;

public class Util {

// Cleans existing test resources if any.
private static final int DELETION_THRESHOLD_TIME_HOURS = 24;

/** Bring down any instances that are older than 24 hours */
public static void cleanUpComputeInstances(
InstancesClient instancesClient, String project, String zone) {
ListPagedResponse listPagedResponse = instancesClient.list(project, zone);
for (Instance instance : listPagedResponse.iterateAll()) {
if (isCreatedBeforeThresholdTime(
ZonedDateTime.parse(instance.getCreationTimestamp()).toInstant())
&& instance.getName().startsWith(BaseTest.COMPUTE_PREFIX)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For both Util classes, I think I would rather have the prefix passed in as an argument into this method, but I guess this is OK too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On second thought, this seems better. I'll create a new PR for this.

instancesClient.deleteAsync(
DeleteInstanceRequest.newBuilder()
.setInstance(instance.getName())
.setProject(project)
.setZone(zone)
.build());
}
}
}

private static boolean isCreatedBeforeThresholdTime(Instant instant) {
return instant.isBefore(Instant.now().minus(DELETION_THRESHOLD_TIME_HOURS, ChronoUnit.HOURS));
}
}
4 changes: 2 additions & 2 deletions java-container/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -19,7 +19,7 @@ If you are using Maven with [BOM][libraries-bom], add this to your pom.xml file:
<dependency>
<groupId>com.google.cloud</groupId>
<artifactId>libraries-bom</artifactId>
<version>26.1.0</version>
<version>26.1.1</version>
<type>pom</type>
<scope>import</scope>
</dependency>
Expand Down Expand Up @@ -49,7 +49,7 @@ If you are using Maven without BOM, add this to your dependencies:
If you are using Gradle 5.x or later, add this to your dependencies:

```Groovy
implementation platform('com.google.cloud:libraries-bom:26.1.0')
implementation platform('com.google.cloud:libraries-bom:26.1.1')

implementation 'com.google.cloud:google-cloud-container'
```
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -26,6 +26,7 @@
import com.google.container.v1.ListOperationsResponse;
import com.google.container.v1.NodePool;
import com.google.container.v1.Operation;
import com.google.container.v1.Operation.Status;
import com.google.container.v1.ServerConfig;
import java.util.List;
import java.util.UUID;
Expand All @@ -37,16 +38,18 @@

public class ITSystemTest {

protected static final String CONTAINER_PREFIX = "it-test-container";

private static ClusterManagerClient client;
private static Operation operation;

private static final Logger LOG = Logger.getLogger(ITSystemTest.class.getName());
private static final String PROJECT_ID = ServiceOptions.getDefaultProjectId();
private static final String ZONE = "us-central1-a";
private static final String CLUSTER_NAME =
"test-cluster-" + UUID.randomUUID().toString().substring(0, 8);
CONTAINER_PREFIX + "-cluster-" + UUID.randomUUID().toString().substring(0, 8);
private static final String NODE_POOL_NAME =
"test-node-pool-" + UUID.randomUUID().toString().substring(0, 8);
CONTAINER_PREFIX + "-node-pool-" + UUID.randomUUID().toString().substring(0, 8);
private static final String DETAIL = "test-detail";
private static final String STATUS_MESSAGE = "test-status-message";
private static final String SELF_LINK =
Expand All @@ -63,7 +66,7 @@ public static void beforeClass() throws Exception {
client = ClusterManagerClient.create();
Util.cleanUpExistingInstanceCluster(PROJECT_ID, ZONE, client);

/** create node pool* */
/* create node pool* */
NodePool nodePool =
NodePool.newBuilder()
.setInitialNodeCount(INITIAL_NODE_COUNT)
Expand All @@ -72,7 +75,7 @@ public static void beforeClass() throws Exception {
.setStatusMessage(STATUS_MESSAGE)
.build();

/** create cluster */
/* create cluster */
Cluster cluster =
Cluster.newBuilder()
.setName(CLUSTER_NAME)
Expand All @@ -84,13 +87,20 @@ public static void beforeClass() throws Exception {
.setNetwork(NETWORK)
.build();
operation = client.createCluster(PROJECT_ID, ZONE, cluster);

Operation response = client.getOperation(PROJECT_ID, ZONE, operation.getName());
// Busy Wait for one minute at a time until Cluster CREATE operation is complete
while (response.getStatus() != Status.DONE) {
LOG.info(String.format("Cluster CREATE Operation Status: %s", response.getStatus()));
Thread.sleep(TimeUnit.MINUTES.toMillis(1));
response = client.getOperation(PROJECT_ID, ZONE, operation.getName());
}
LOG.info(String.format("%s cluster created successfully.", CLUSTER_NAME));
LOG.info(String.format("%s node pool created successfully.", NODE_POOL_NAME));
}

@AfterClass
public static void afterClass() throws Exception {
Thread.sleep(TimeUnit.MINUTES.toMillis(5));
public static void afterClass() {
client.deleteCluster(PROJECT_ID, ZONE, CLUSTER_NAME);
LOG.info(String.format("%s cluster deleted successfully.", CLUSTER_NAME));
client.close();
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -19,27 +19,26 @@
import com.google.cloud.container.v1.ClusterManagerClient;
import com.google.container.v1.Cluster;
import com.google.container.v1.ListClustersResponse;
import java.io.IOException;
import java.time.Instant;
import java.time.OffsetDateTime;
import java.time.temporal.ChronoUnit;
import java.util.List;
import java.util.concurrent.ExecutionException;

public class Util {

// Cleans existing test resources if any.
private static final int DELETION_THRESHOLD_TIME_HOURS = 24;

/** tear down any clusters that are older than 24 hours * */
public static void cleanUpExistingInstanceCluster(
String projectId, String zone, ClusterManagerClient client)
throws IOException, ExecutionException, InterruptedException {
String projectId, String zone, ClusterManagerClient client) {

ListClustersResponse clustersResponse = client.listClusters(projectId, zone);
List<Cluster> clusters = clustersResponse.getClustersList();

for (Cluster cluster : clusters) {
if (isCreatedBeforeThresholdTime(cluster.getCreateTime())) {
if (isCreatedBeforeThresholdTime(cluster.getCreateTime())
&& cluster.getName().startsWith(ITSystemTest.CONTAINER_PREFIX)) {
client.deleteCluster(projectId, zone, cluster.getName());
}
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -48,13 +48,15 @@

public class ITNotebookServiceClientTest {

protected static final String NOTEBOOK_PREFIX = "it-test-notebook";

private static final String PROJECT_ID = ServiceOptions.getDefaultProjectId();
private static final String LOCATION = "us-central1-a";
private static final String PARENT = "projects/" + PROJECT_ID + "/locations/" + LOCATION;
private static NotebookServiceClient client;
private static final String ID = UUID.randomUUID().toString().substring(0, 8);
private static final String NOTEBOOK_INSTANCE_ID = "test-notebook-instance-id-" + ID;
private static final String ENVIRONMENT_ID = "test-environment-id-" + ID;
private static final String NOTEBOOK_INSTANCE_ID = NOTEBOOK_PREFIX + "-instance-id-" + ID;
private static final String ENVIRONMENT_ID = NOTEBOOK_PREFIX + "-environment-id-" + ID;
private static final String INSTANCE_NAME = PARENT + "/instances/" + NOTEBOOK_INSTANCE_ID;
private static final String ENVIRONMENT_NAME = PARENT + "/environments/" + ENVIRONMENT_ID;
private static Instance expectedNotebookInstance;
Expand All @@ -66,6 +68,8 @@ public class ITNotebookServiceClientTest {
public static void setUp() throws IOException, ExecutionException, InterruptedException {
// Create Test Notebook Instance
client = NotebookServiceClient.create();
Util.cleanUpNotebookInstances(client, PARENT);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't this delete the one created above before it's even used?
Also, why isn't cleanup in tearDown() sufficient?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would only delete notebooks that have a creation time more than 24 hours earlier.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we have a prefix for the notebook names or something to avoid accidentally deleting unrelated notebooks.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, it's similar to the comment above about compute and container. I'll add in more specific names for the IT instances.


ContainerImage containerImage =
ContainerImage.newBuilder().setRepository(FieldBehavior.OPTIONAL.name()).build();

Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,34 @@
package com.google.cloud.notebooks.v1beta1.it;

import com.google.cloud.notebooks.v1beta1.DeleteInstanceRequest;
import com.google.cloud.notebooks.v1beta1.Instance;
import com.google.cloud.notebooks.v1beta1.ListInstancesRequest;
import com.google.cloud.notebooks.v1beta1.NotebookServiceClient;
import com.google.cloud.notebooks.v1beta1.NotebookServiceClient.ListInstancesPagedResponse;
import com.google.protobuf.util.Timestamps;
import java.time.Instant;
import java.time.temporal.ChronoUnit;

public class Util {

// Cleans existing test resources if any.
private static final int DELETION_THRESHOLD_TIME_HOURS = 24;

/** Bring down any instances that are older than 24 hours */
public static void cleanUpNotebookInstances(NotebookServiceClient client, String parent) {
ListInstancesPagedResponse listInstancesPagedResponse =
client.listInstances(ListInstancesRequest.newBuilder().setParent(parent).build());
for (Instance instance : listInstancesPagedResponse.iterateAll()) {
if (isCreatedBeforeThresholdTime(
Instant.ofEpochMilli(Timestamps.toMillis(instance.getCreateTime())))
&& instance.getName().startsWith(ITNotebookServiceClientTest.NOTEBOOK_PREFIX)) {
client.deleteInstanceAsync(
DeleteInstanceRequest.newBuilder().setName(instance.getName()).build());
}
}
}

private static boolean isCreatedBeforeThresholdTime(Instant instant) {
return instant.isBefore(Instant.now().minus(DELETION_THRESHOLD_TIME_HOURS, ChronoUnit.HOURS));
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -99,8 +99,8 @@ public class ITSystemTest {
private static final String ID = UUID.randomUUID().toString().substring(0, 8);
// GraalVM native-image test uses the project root as working directory, not google-cloud-vision
private static final String RESOURCES =
Files.exists(Paths.get("google-cloud-vision", "src", "test", "resources"))
? "google-cloud-vision/src/test/resources/"
Files.exists(Paths.get("java-vision","google-cloud-vision", "src", "test", "resources"))
? "java-vision/google-cloud-vision/src/test/resources/"
: "src/test/resources/";

private static final String GCS_BUCKET_ENV_VAR = "GOOGLE_CLOUD_TESTS_VISION_BUCKET";
Expand Down
74 changes: 37 additions & 37 deletions pom.xml
Original file line number Diff line number Diff line change
Expand Up @@ -155,42 +155,42 @@
</modules>

<build>
<plugins>
<!-- jacoco code coverage -->
<plugin>
<groupId>org.jacoco</groupId>
<artifactId>jacoco-maven-plugin</artifactId>
<version>0.8.7</version>
<executions>
<execution>
<goals>
<goal>prepare-agent</goal>
</goals>
</execution>
<execution>
<id>report</id>
<phase>test</phase>
<goals>
<goal>report</goal>
</goals>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-surefire-plugin</artifactId>
<configuration>
<skip>${skipUnitTests}</skip>
</configuration>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-failsafe-plugin</artifactId>
<configuration>
<forkCount>1C</forkCount>
<reuseForks>true</reuseForks>
</configuration>
</plugin>
</plugins>
<plugins>
<!-- jacoco code coverage -->
<plugin>
<groupId>org.jacoco</groupId>
<artifactId>jacoco-maven-plugin</artifactId>
<version>0.8.7</version>
<executions>
<execution>
<goals>
<goal>prepare-agent</goal>
</goals>
</execution>
<execution>
<id>report</id>
<phase>test</phase>
<goals>
<goal>report</goal>
</goals>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-surefire-plugin</artifactId>
<configuration>
<skip>${skipUnitTests}</skip>
</configuration>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-failsafe-plugin</artifactId>
<configuration>
<forkCount>1C</forkCount>
<reuseForks>true</reuseForks>
</configuration>
</plugin>
</plugins>
</build>
</project>