[MINOR] Fixed unit tests #10362

Merged
merged 2 commits into from
Jan 10, 2024

geserdugarov
Contributor

@geserdugarov geserdugarov commented Dec 19, 2023

Change Logs

Fixed unit tests in TestJavaHoodieBackedMetadata and TestHoodieDeltaStreamer.

Impact

Fixed unit tests.

Risk level (write none, low medium or high below)

Low.

Contributor's checklist

  • Read through contributor's guide
  • Change Logs and Impact were stated clearly
  • Adequate tests were added if applicable
  • CI passed

@@ -492,6 +493,9 @@ public void testTableOperationsWithMetadataIndex(HoodieTableType tableType) thro
.withMaxNumDeltaCommitsBeforeCompaction(12) // cannot restore to before the oldest compaction on MDT as there are no base files before that time
.build())
.build();
// module com.fasterxml.jackson.datatype:jackson-datatype-jsr310 is needed for proper column stats processing for Jackson >= 2.11 (Spark >= 3.3)
// Java 8 date/time type `java.time.LocalDate` is not supported by default
JsonUtils.registerModules();
Contributor

nice catch ~

@geserdugarov
Contributor Author

geserdugarov commented Dec 19, 2023

I've reverted the changes in TestHoodieTableSource. I couldn't quickly figure out why the results differ between my local machine, the work cluster, and the Azure pipeline, but the difference is in the changed test testBucketPruningSpecialKeyDataType.

I even tried running locally a full copy of the Maven command from the Azure pipeline "UT FT common & flink & UT client/spark-client":

/usr/bin/mvn -f /home/vsts/work/1/s/pom.xml -fae -Pwarn-log -Dscala-2.12 -Dspark3.2 -Dflink1.18 -Dcheckstyle.skip=true -Drat.skip=true -Djacoco.skip=true -ntp -B -V -Pwarn-log -Dorg.slf4j.simpleLogger.log.org.apache.maven.plugins.shade=warn -Dorg.slf4j.simpleLogger.log.org.apache.maven.plugins.dependency=warn -Punit-tests -pl hudi-common,hudi-flink-datasource,hudi-flink-datasource/hudi-flink,hudi-flink-datasource/hudi-flink1.14.x,hudi-flink-datasource/hudi-flink1.15.x,hudi-flink-datasource/hudi-flink1.16.x,hudi-flink-datasource/hudi-flink1.17.x,hudi-flink-datasource/hudi-flink1.18.x,hudi-client/hudi-spark-client test

but I can't reproduce the Azure results.

The Azure pipeline "UT FT other modules" hung, so I will just try to restart it. But I've checked that the available log contains:

[WARNING] Tests run: 5, Failures: 0, Errors: 0, Skipped: 1, Time elapsed: 153.622 s - in org.apache.hudi.utilities.deltastreamer.TestHoodieDeltaStreamerWithMultiWriter

which means that the fix in TestHoodieDeltaStreamer should be correct.
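
The check described above (reading the per-class summary line and confirming there are no failures or errors) can be sketched in Java. This is only an illustration: SurefireSummary, parse and passed are hypothetical names, and the line format is assumed from the single log line quoted above.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; it is not part of Hudi or Surefire.
final class SurefireSummary {
    // Assumed format, taken from the log line quoted above:
    // "Tests run: N, Failures: N, Errors: N, Skipped: N, Time elapsed: X s ... in <class>"
    private static final Pattern LINE = Pattern.compile(
        "Tests run: (\\d+), Failures: (\\d+), Errors: (\\d+), Skipped: (\\d+),"
            + " Time elapsed: [\\d.]+ s.* in (\\S+)");

    final int run;
    final int failures;
    final int errors;
    final int skipped;
    final String testClass;

    private SurefireSummary(int run, int failures, int errors, int skipped, String testClass) {
        this.run = run;
        this.failures = failures;
        this.errors = errors;
        this.skipped = skipped;
        this.testClass = testClass;
    }

    // Returns empty for lines that are not per-class summaries (e.g. "[INFO] Running ...").
    static Optional<SurefireSummary> parse(String logLine) {
        Matcher m = LINE.matcher(logLine);
        if (!m.find()) {
            return Optional.empty();
        }
        return Optional.of(new SurefireSummary(
            Integer.parseInt(m.group(1)),
            Integer.parseInt(m.group(2)),
            Integer.parseInt(m.group(3)),
            Integer.parseInt(m.group(4)),
            m.group(5)));
    }

    // A class counts as passed when nothing failed or errored; skipped tests are allowed.
    boolean passed() {
        return failures == 0 && errors == 0;
    }
}
```

Calling SurefireSummary.parse on the quoted line gives a summary for TestHoodieDeltaStreamerWithMultiWriter with 5 tests run and 1 skipped, and passed() returns true. Note that this only says something about the class named in that line.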

@geserdugarov
Contributor Author

@hudi-bot run azure

@geserdugarov geserdugarov changed the title [MINOR] Fixed unit tests [MINOR] Fixed unit test in TestJavaHoodieBackedMetadata Dec 20, 2023
@geserdugarov geserdugarov changed the title [MINOR] Fixed unit test in TestJavaHoodieBackedMetadata [MINOR] Fixed unit tests Dec 22, 2023
@geserdugarov geserdugarov force-pushed the fix-unit-tests-master branch 2 times, most recently from ef0f9f4 to 4365f28 Compare December 22, 2023 12:28
@geserdugarov
Contributor Author

@hudi-bot run azure

@geserdugarov
Contributor Author

geserdugarov commented Dec 23, 2023

I don't understand what is happening with CI. I've changed 2 unit tests:

  • TestJavaHoodieBackedMetadata, from hudi-client/hudi-java-client,
  • TestHoodieDeltaStreamer, from hudi-utilities.

Both are Java tests.

Azure CI

hudi-client/hudi-java-client is not included in the Azure CI.
hudi-utilities is included in the Azure CI, in the "UT FT other modules" job of the "UT other modules" stage.
So TestHoodieDeltaStreamer is the only test that could break the Azure CI.
But the last log line from the "UT other modules" stage is

[INFO] Running org.apache.hudi.utilities.sources.TestSqlSource

before

This job was abandoned. We have detected that logs from the agent may have not finished uploading. We have included our in-memory record of all log lines uploaded before we lost contact with the agent:

My change in this test couldn't break it this way; only a test failure is possible. Maybe my change altered the test ordering, and the unit test run hangs in @AfterAll/@AfterEach of some test class or in @BeforeAll/@BeforeEach of another one. But I couldn't reproduce the problem locally: this part of the CI job passes locally without any problem.

If the order in which test classes run hasn't changed, then, judging by another successful run, the order is:

  • 71 other test classes
  • TestHoodieDeltaStreamer
  • 28 other test classes
  • TestGenericRddTransform
  • TestPostgresDebeziumSource
  • TestMysqlDebeziumSource
  • TestGcsEventsHoodieIncrSource
  • TestAvroDFSSource
  • TestSqlSource
  • 39 other test classes.

In my failed Azure CI log, only the part from TestGenericRddTransform to TestSqlSource is available; the earlier part of the log is missing. If the ordering is the same, then the changed TestHoodieDeltaStreamer should have passed successfully, and the hang is in some other test.
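
The reasoning above can be sketched in Java: take the last "[INFO] Running" line from the truncated log, then list the classes that precede it in the order seen in a successful run. HangLocator and its methods are hypothetical names used only for illustration, and the result holds only under the assumption that the class order is the same in both runs.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper for illustration; it is not part of Hudi or the CI tooling.
final class HangLocator {
    private static final String MARKER = "[INFO] Running ";

    // Simple name of the last test class that the (possibly truncated) log shows
    // as started, or null if the log has no "[INFO] Running <class>" line.
    static String lastStarted(List<String> logLines) {
        String last = null;
        for (String line : logLines) {
            int i = line.indexOf(MARKER);
            if (i >= 0) {
                String fqcn = line.substring(i + MARKER.length()).trim();
                last = fqcn.substring(fqcn.lastIndexOf('.') + 1);
            }
        }
        return last;
    }

    // Classes that precede the last started class in the reference order.
    // Assumption: the order did not change between the reference run and the failed run.
    static List<String> presumedFinished(List<String> referenceOrder, String lastStarted) {
        if (lastStarted == null) {
            return Collections.emptyList();
        }
        int idx = referenceOrder.indexOf(lastStarted);
        if (idx < 0) {
            return Collections.emptyList();
        }
        return new ArrayList<>(referenceOrder.subList(0, idx));
    }
}
```

With the named classes from the list above as the reference order and "[INFO] Running org.apache.hudi.utilities.sources.TestSqlSource" as the last log line, lastStarted returns TestSqlSource and presumedFinished includes TestHoodieDeltaStreamer, so the hang would be in TestSqlSource or in whatever runs right after it.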

GitHub Actions

My change in TestJavaHoodieBackedMetadata from hudi-client/hudi-java-client should affect only the test-hudi-hadoop-mr-and-hudi-java-client job, not test-spark.
And I see that test-hudi-hadoop-mr-and-hudi-java-client is OK, but there are hangs in test-spark and a failure in the TestDataSourceForBootstrap Scala test after

2023-12-23T04:01:07.0996155Z 4017081 [Executor task launch worker for task 372] ERROR org.apache.spark.executor.Executor [] - Exception in task 0.0 in stage 133.0 (TID 372)
2023-12-23T04:01:07.0997116Z java.lang.OutOfMemoryError: GC overhead limit exceeded

@danny0405, @yihua Could you please give me any suggestions on what else I can try?

@geserdugarov
Contributor Author

@hudi-bot run azure

@danny0405
Contributor

@Geser There are some OOM issues in the master code that we are trying to fix; they should not be related to your change.

@hudi-bot

CI report:

Bot commands

@hudi-bot supports the following commands:
  • @hudi-bot run azure: re-run the last Azure build

Contributor

@bvaradar bvaradar left a comment

LGTM

@bvaradar
Contributor

bvaradar commented Jan 8, 2024

@yihua: If you are OK with this change, can you land it?

@yihua
Contributor

yihua commented Jan 10, 2024

2023-12-23T04:01:07.0996155Z 4017081 [Executor task launch worker for task 372] ERROR org.apache.spark.executor.Executor [] - Exception in task 0.0 in stage 133.0 (TID 372)
2023-12-23T04:01:07.0997116Z java.lang.OutOfMemoryError: GC overhead limit exceeded

@danny0405, @yihua Could you please give me any suggestions on what else I can try?

The OOM looks to be unrelated to this PR; it happens on master too.

Contributor

@yihua yihua left a comment

LGTM

@yihua yihua merged commit 76cbca3 into apache:master Jan 10, 2024
31 checks passed
VitoMakarevich pushed a commit to VitoMakarevich/hudi that referenced this pull request Jan 13, 2024
VitoMakarevich pushed a commit to VitoMakarevich/hudi that referenced this pull request Jan 13, 2024
@geserdugarov geserdugarov deleted the fix-unit-tests-master branch February 6, 2024 06:04
yihua pushed a commit that referenced this pull request Feb 27, 2024
5 participants