Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HADOOP-19399. Add support for CRT Client #7333

Closed

Conversation

ahmarsuhail
Copy link
Contributor

@ahmarsuhail ahmarsuhail commented Jan 27, 2025

Description of PR

Initial changes for CRT support.

How was this patch tested?

Not tested yet

For code changes:

  • Does the title or this PR starts with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
  • Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • If applicable, have you updated the LICENSE, LICENSE-binary, NOTICE-binary files?

Sorry, something went wrong.

@ahmarsuhail ahmarsuhail marked this pull request as draft January 27, 2025 17:34
@ahmarsuhail
Copy link
Contributor Author

@steveloughran before I do anymore work here, could you please validate these changes. Here is what I want to achieve from this PR:

  • Minimum required configuration

    • Proxy is not configured
    • Region configuration logic is highly simplified (no endpoint parsing)
    • A lot of HTTP configurations we apply cannot be applied to CRT
  • Add a new getOrCreateCRTClient(). We previously discussed getOrCreateAsyncClient(crtRequired). But I want to keep CRT client separate from the Java async client.

I am initially aiming for this behaviour:

  • When using analytic accelerator (AAL), use CRT as default. This will be controlled by ANALYTICS_ACCELERATOR_CRT_ENABLED. If you want to use it without CRT, just disable this.

  • Transfer manager to continue to use Java async client.

So, CRT default on with AAL, can be disabled, in which case getOrCreateAsyncClient() will get called. No changes to existing functionality of transfer manager.

The other option is to have a CRT_ENABLED flag for all of S3A, which will be default OFF. This will control if CRT is used in both AAL and the transfer manager. So will mean AAL is used with Java async as default.

Our benchmarks have been with the CRT client, and so would really like CRT to be the default client with AAL.

What do you think?

@steveloughran
Copy link
Contributor

I'd like a single CRT on/off switch, rather than a separate client for AAL

Create/destroy of a client is expensive, hence the move to lazy creation. Even with that, fs close() times get worse. Threads and the connection pool are key factor .

We need those shared thread and connection pools, not just for those overheads, but so that we can have warm pools of both waiting for new requests coming in.

We don't want to make things worse there.

Equally critical is test coverage: being able to turn it on for the single client makes it possible to run the entire ITest suites against it. Ideally a -Pcrt switch could be added to maven to turn this on and off, and we can add it to the test matrix for release qualification

@hadoop-yetus
Copy link

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 33s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
-1 ❌ test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
_ trunk Compile Tests _
+1 💚 mvninstall 35m 45s trunk passed
+1 💚 compile 0m 44s trunk passed with JDK Ubuntu-11.0.25+9-post-Ubuntu-1ubuntu120.04
+1 💚 compile 0m 36s trunk passed with JDK Private Build-1.8.0_432-8u432-gaus1-0ubuntu220.04-ga
+1 💚 checkstyle 0m 33s trunk passed
+1 💚 mvnsite 0m 42s trunk passed
+1 💚 javadoc 0m 43s trunk passed with JDK Ubuntu-11.0.25+9-post-Ubuntu-1ubuntu120.04
+1 💚 javadoc 0m 34s trunk passed with JDK Private Build-1.8.0_432-8u432-gaus1-0ubuntu220.04-ga
+1 💚 spotbugs 1m 8s trunk passed
+1 💚 shadedclient 34m 16s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
-1 ❌ mvninstall 0m 26s /patch-mvninstall-hadoop-tools_hadoop-aws.txt hadoop-aws in the patch failed.
-1 ❌ compile 0m 34s /patch-compile-hadoop-tools_hadoop-aws-jdkUbuntu-11.0.25+9-post-Ubuntu-1ubuntu120.04.txt hadoop-aws in the patch failed with JDK Ubuntu-11.0.25+9-post-Ubuntu-1ubuntu120.04.
-1 ❌ javac 0m 34s /patch-compile-hadoop-tools_hadoop-aws-jdkUbuntu-11.0.25+9-post-Ubuntu-1ubuntu120.04.txt hadoop-aws in the patch failed with JDK Ubuntu-11.0.25+9-post-Ubuntu-1ubuntu120.04.
-1 ❌ compile 0m 28s /patch-compile-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_432-8u432-gaus1-0ubuntu220.04-ga.txt hadoop-aws in the patch failed with JDK Private Build-1.8.0_432-8u432-gaus1-0ubuntu220.04-ga.
-1 ❌ javac 0m 28s /patch-compile-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_432-8u432-gaus1-0ubuntu220.04-ga.txt hadoop-aws in the patch failed with JDK Private Build-1.8.0_432-8u432-gaus1-0ubuntu220.04-ga.
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 0m 20s /results-checkstyle-hadoop-tools_hadoop-aws.txt hadoop-tools/hadoop-aws: The patch generated 4 new + 6 unchanged - 0 fixed = 10 total (was 6)
-1 ❌ mvnsite 0m 28s /patch-mvnsite-hadoop-tools_hadoop-aws.txt hadoop-aws in the patch failed.
+1 💚 javadoc 0m 30s the patch passed with JDK Ubuntu-11.0.25+9-post-Ubuntu-1ubuntu120.04
+1 💚 javadoc 0m 26s the patch passed with JDK Private Build-1.8.0_432-8u432-gaus1-0ubuntu220.04-ga
-1 ❌ spotbugs 0m 27s /patch-spotbugs-hadoop-tools_hadoop-aws.txt hadoop-aws in the patch failed.
+1 💚 shadedclient 36m 25s patch has no errors when building and testing our client artifacts.
_ Other Tests _
-1 ❌ unit 0m 31s /patch-unit-hadoop-tools_hadoop-aws.txt hadoop-aws in the patch failed.
+1 💚 asflicense 0m 36s The patch does not generate ASF License warnings.
115m 7s
Subsystem Report/Notes
Docker ClientAPI=1.47 ServerAPI=1.47 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7333/1/artifact/out/Dockerfile
GITHUB PR #7333
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux f2b8903c61c9 5.15.0-130-generic #140-Ubuntu SMP Wed Dec 18 17:59:53 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 0f47ebb
Default Java Private Build-1.8.0_432-8u432-gaus1-0ubuntu220.04-ga
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.25+9-post-Ubuntu-1ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_432-8u432-gaus1-0ubuntu220.04-ga
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7333/1/testReport/
Max. process+thread count 546 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7333/1/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@github-actions github-actions bot added the build label Jan 28, 2025
@hadoop-yetus
Copy link

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 33s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+0 🆗 xmllint 0m 0s xmllint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 3 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 6m 0s Maven dependency ordering for branch
+1 💚 mvninstall 30m 26s trunk passed
+1 💚 compile 17m 5s trunk passed with JDK Ubuntu-11.0.25+9-post-Ubuntu-1ubuntu120.04
+1 💚 compile 15m 26s trunk passed with JDK Private Build-1.8.0_432-8u432-gaus1-0ubuntu220.04-ga
+1 💚 checkstyle 4m 16s trunk passed
+1 💚 mvnsite 1m 37s trunk passed
+1 💚 javadoc 1m 35s trunk passed with JDK Ubuntu-11.0.25+9-post-Ubuntu-1ubuntu120.04
+1 💚 javadoc 1m 25s trunk passed with JDK Private Build-1.8.0_432-8u432-gaus1-0ubuntu220.04-ga
+0 🆗 spotbugs 0m 45s branch/hadoop-project no spotbugs output file (spotbugsXml.xml)
+1 💚 shadedclient 34m 49s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 35s Maven dependency ordering for patch
+1 💚 mvninstall 0m 43s the patch passed
+1 💚 compile 16m 0s the patch passed with JDK Ubuntu-11.0.25+9-post-Ubuntu-1ubuntu120.04
+1 💚 javac 16m 0s the patch passed
+1 💚 compile 15m 28s the patch passed with JDK Private Build-1.8.0_432-8u432-gaus1-0ubuntu220.04-ga
+1 💚 javac 15m 28s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 4m 9s /results-checkstyle-root.txt root: The patch generated 6 new + 13 unchanged - 0 fixed = 19 total (was 13)
+1 💚 mvnsite 1m 36s the patch passed
-1 ❌ javadoc 0m 53s /results-javadoc-javadoc-hadoop-tools_hadoop-aws-jdkUbuntu-11.0.25+9-post-Ubuntu-1ubuntu120.04.txt hadoop-tools_hadoop-aws-jdkUbuntu-11.0.25+9-post-Ubuntu-1ubuntu120.04 with JDK Ubuntu-11.0.25+9-post-Ubuntu-1ubuntu120.04 generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0)
-1 ❌ javadoc 0m 50s /results-javadoc-javadoc-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_432-8u432-gaus1-0ubuntu220.04-ga.txt hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_432-8u432-gaus1-0ubuntu220.04-ga with JDK Private Build-1.8.0_432-8u432-gaus1-0ubuntu220.04-ga generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0)
+0 🆗 spotbugs 0m 39s hadoop-project has no data from spotbugs
+1 💚 shadedclient 36m 46s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 0m 38s hadoop-project in the patch passed.
+1 💚 unit 0m 55s hadoop-aws in the patch passed.
+1 💚 asflicense 1m 3s The patch does not generate ASF License warnings.
202m 1s
Subsystem Report/Notes
Docker ClientAPI=1.47 ServerAPI=1.47 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7333/2/artifact/out/Dockerfile
GITHUB PR #7333
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient codespell detsecrets xmllint spotbugs checkstyle
uname Linux e02aba4d9019 5.15.0-130-generic #140-Ubuntu SMP Wed Dec 18 17:59:53 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / dc54f7a
Default Java Private Build-1.8.0_432-8u432-gaus1-0ubuntu220.04-ga
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.25+9-post-Ubuntu-1ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_432-8u432-gaus1-0ubuntu220.04-ga
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7333/2/testReport/
Max. process+thread count 546 (vs. ulimit of 5500)
modules C: hadoop-project hadoop-tools/hadoop-aws U: .
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7333/2/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants