[jvm packages] xgboost4j.jar is containing the libxgboost4j.so which is compiled with CUDA #10879

Closed
wbo4958 opened this issue Oct 9, 2024 · 6 comments · Fixed by #10982

@wbo4958
Contributor

wbo4958 commented Oct 9, 2024

The latest snapshot jars of xgboost4j at https://s3-us-west-2.amazonaws.com/xgboost-maven-repo/list.html?prefix=snapshot/ml/dmlc/xgboost4j_2.12/2.2.0-SNAPSHOT/ are about 330 MB each, far larger than the xgboost4j 2.1.1 release at https://s3-us-west-2.amazonaws.com/xgboost-maven-repo/list.html?prefix=release/ml/dmlc/xgboost4j_2.12/2.1.1/

The same is true of the xgboost4j-spark package at https://s3-us-west-2.amazonaws.com/xgboost-maven-repo/list.html?prefix=snapshot/ml/dmlc/xgboost4j-spark_2.12/2.2.0-SNAPSHOT/

Hi @hcho3, could you help check it?
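One way to see what accounts for the size is to list the largest entries inside the jar. The following is a minimal sketch, not part of the XGBoost tooling: it relies only on the fact that a jar is an ordinary zip archive. The helper names and the entry path used in the demo archive are illustrative assumptions.

```python
import io
import zipfile


def largest_entries(jar, n=5):
    """Return the n largest entries of a jar as (name, uncompressed_size) pairs.

    `jar` may be a filesystem path or a binary file-like object. A jar is a
    regular zip archive, so the standard zipfile module can read it.
    """
    with zipfile.ZipFile(jar) as zf:
        infos = [info for info in zf.infolist() if not info.is_dir()]
    infos.sort(key=lambda info: info.file_size, reverse=True)
    return [(info.filename, info.file_size) for info in infos[:n]]


def build_demo_jar():
    """Build a tiny in-memory archive for demonstration only.

    The entry path below is an assumed, illustrative location for the
    native library; a real jar may lay out its files differently.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("lib/linux/x86_64/libxgboost4j.so", b"\0" * 4096)
        zf.writestr("META-INF/MANIFEST.MF", "Manifest-Version: 1.0\n")
    buf.seek(0)
    return buf


if __name__ == "__main__":
    # Replace build_demo_jar() with the path to a downloaded jar to inspect it.
    for name, size in largest_entries(build_demo_jar()):
        print(f"{size:>12}  {name}")
```

Running this against the snapshot jar and the 2.1.1 release jar side by side should show whether the native library is the entry that grew.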

@hcho3
Collaborator

hcho3 commented Oct 9, 2024

My hands are pretty full right now, but I will try to get to it before the end of this month.

@hcho3 hcho3 self-assigned this Oct 11, 2024
@hcho3 hcho3 mentioned this issue Oct 11, 2024
@hcho3
Collaborator

hcho3 commented Oct 12, 2024

Given that the 2.1.2 release is upcoming, I will prioritize fixing this issue.

@hcho3
Collaborator

hcho3 commented Oct 17, 2024

@wbo4958 I took a deep dive and here's the diagnosis.

  • To shade xgboost4j-spark and xgboost4j-spark-gpu, we need to build two variants of xgboost4j: one with GPU code and one without.
  • When running the deploy step, Maven uploads the GPU variant of xgboost4j, as the GPU variant has overwritten the CPU variant of xgboost4j.
  • We used to have a stub package xgboost4j-gpu, which let the GPU variant be deployed alongside xgboost4j (CPU variant), but it has since been removed.

We would need to run mvn deploy twice, with the following effects:

  1. First run: Deploy xgboost4j (CPU variant), xgboost4j-spark
  2. Second run: Build xgboost4j (GPU variant) and xgboost4j-spark-gpu, but deploy only xgboost4j-spark-gpu.

I'm not quite sure how to achieve Step 2 (build xgboost4j but not deploy it). @wbo4958 Any ideas as to how?

Alternatively, we can bring back xgboost4j-gpu by implementing a script dev/rename_xgboost4j.py that replaces all occurrences of xgboost4j with xgboost4j-gpu in POM files. It would be similar to how we support two versions of Scala: dev/change_scala_version.py.

I don't want to bring back the old method of symlinking source files, as it proved to be rather fragile.
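The renaming idea above could look roughly like the sketch below. It is a hypothetical illustration, not the actual dev/rename_xgboost4j.py: it assumes artifact IDs appear literally as xgboost4j or xgboost4j_2.12 inside `<artifactId>` elements, and it deliberately leaves sibling artifacts such as xgboost4j-spark_2.12 alone, which is narrower than replacing every occurrence.

```python
import re

# Assumption: the artifact ID is written literally, optionally followed by a
# Scala suffix such as "_2.12". Artifacts like "xgboost4j-spark_2.12" do not
# match, because "-spark" sits between the name and the closing tag.
_ARTIFACT = re.compile(r"(<artifactId>)xgboost4j((?:_[0-9.]+)?</artifactId>)")


def rename_xgboost4j(pom_text, new_name="xgboost4j-gpu"):
    """Return pom_text with the xgboost4j artifact ID replaced by new_name."""
    return _ARTIFACT.sub(lambda m: m.group(1) + new_name + m.group(2), pom_text)


if __name__ == "__main__":
    sample = (
        "<artifactId>xgboost4j_2.12</artifactId>\n"
        "<artifactId>xgboost4j-spark_2.12</artifactId>\n"
    )
    print(rename_xgboost4j(sample))
```

A real script would also need to walk the POM files on disk and handle any place where the name is built from a property rather than written out.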

@wbo4958
Contributor Author

wbo4958 commented Oct 17, 2024

Hi @hcho3,

To be honest, I don't know. But I got some info from an LLM; maybe you can refer to it.

To deploy only module B when executing mvn deploy, you need to understand how Maven handles multi-module projects and dependencies. Here are the steps and considerations:

Project Structure: Ensure your project structure looks something like this:

parent-project
├── pom.xml (Parent POM)
├── module-A
│   └── pom.xml
└── module-B
    └── pom.xml

Parent POM Configuration: In the parent pom.xml, you should list both modules:

<modules>
    <module>module-A</module>
    <module>module-B</module>
</modules>

Module B Dependency: In module-B/pom.xml, declare the dependency on module A:

<dependencies>
    <dependency>
        <groupId>${project.groupId}</groupId>
        <artifactId>module-A</artifactId>
        <version>${project.version}</version>
    </dependency>
</dependencies>

Deploying Only Module B: By default, Maven's deploy phase will attempt to deploy all modules defined in the reactor. However, you can control this behavior.

Command Line Approach: You can navigate into module B's directory and run the deploy command from there. This ensures only B is deployed:

cd module-B
mvn deploy

However, this fails to resolve module A if A's artifact has not been installed or deployed yet, since B depends on A.

Using Maven Profiles: You can define profiles in your parent POM to control which modules get deployed:

<profiles>
    <profile>
        <id>deploy-B</id>
        <modules>
            <module>module-B</module>
        </modules>
    </profile>
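</profiles>

Modules declared in a profile are added to the parent's default module list, so this only restricts the build if module-A is not listed there. Then, activate this profile when running Maven: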

mvn deploy -Pdeploy-B

Skipping Module A: If you want to deploy B without deploying A (assuming A's artifacts are already in the repository or handled separately), you might need to temporarily modify B's POM or use Maven's -pl option to specify which projects to include:

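mvn deploy -pl module-B

Here, -pl specifies the project list. Do not add -am ("also make") in this case: it pulls module A into the reactor as a dependency of B, so A would go through the deploy phase as well.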
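To make the effect of the two flags concrete, here is a toy model of reactor selection. It is a simplified sketch and not Maven's implementation: it assumes only that -pl restricts the reactor to the listed modules and that -am adds the in-reactor modules they depend on, transitively. The function and variable names are hypothetical.

```python
def select_reactor(deps, project_list, also_make=False):
    """Return the modules that take part in the build, in declaration order.

    deps maps each module to the in-reactor modules it depends on.
    project_list models -pl; also_make models -am.
    """
    selected = set(project_list)
    if also_make:
        stack = list(project_list)
        while stack:
            for dep in deps.get(stack.pop(), []):
                if dep not in selected:
                    selected.add(dep)
                    stack.append(dep)
    # Every selected module runs the requested phase, so with `deploy`
    # each module returned here would be uploaded.
    return [module for module in deps if module in selected]


if __name__ == "__main__":
    deps = {"module-A": [], "module-B": ["module-A"]}
    print(select_reactor(deps, ["module-B"]))                  # ['module-B']
    print(select_reactor(deps, ["module-B"], also_make=True))  # ['module-A', 'module-B']
```

With -pl alone, module A is not built in the same run, so its artifact has to be resolvable from the local or remote repository.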
Considerations:

  • Consistency: Ensure that module A's artifact is available in your local repository or remote repository before deploying B, or you'll encounter dependency resolution issues.
  • Version Management: If A's version changes, you'll need to update B's dependency or ensure that SNAPSHOT versions are handled correctly.
  • Automated Builds: If this is part of a CI/CD pipeline, ensure your build script reflects these commands or profile activations.

By using one of these methods, you can control Maven to deploy only module B when you run mvn deploy. Remember, handling dependencies correctly is crucial, so ensure module A's deployment or availability is managed appropriately.

@hcho3
Collaborator

hcho3 commented Nov 5, 2024

@wbo4958 I finally figured out a workflow to deploy the JVM packages.

  1. mvn deploy -Pdefault,release-to-s3, to deploy all CPU packages (xgboost4j, xgboost4j-example, xgboost4j-spark, xgboost4j-flink).
  2. mvn clean to clean all artifacts
  3. mvn install -Pgpu, to build xgboost4j and xgboost4j-spark-gpu with CUDA enabled and to install them locally.
  4. mvn deploy -Pgpu,release-to-s3 -pl xgboost4j-spark-gpu to deploy xgboost4j-spark-gpu only.

I'm now working on #10982 to test the new workflow.
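The four steps above can be sketched as a small driver script. This is an illustration only, not the workflow implemented in #10982: the commands are copied from the list above, while the function name, the dry-run default, and the idea of running them from a single working directory are assumptions.

```python
import shlex
import subprocess

# Commands taken verbatim from the four steps above.
DEPLOY_STEPS = [
    ["mvn", "deploy", "-Pdefault,release-to-s3"],   # 1. deploy all CPU packages
    ["mvn", "clean"],                               # 2. remove CPU build artifacts
    ["mvn", "install", "-Pgpu"],                    # 3. build GPU variants, install locally
    ["mvn", "deploy", "-Pgpu,release-to-s3",
     "-pl", "xgboost4j-spark-gpu"],                 # 4. deploy only xgboost4j-spark-gpu
]


def run_steps(steps=DEPLOY_STEPS, cwd=None, dry_run=True):
    """Run each step in order and return the command lines that were handled.

    With dry_run=True (the default) the commands are only printed. Otherwise
    each one is executed, and check=True stops the sequence at the first
    failing command by raising CalledProcessError.
    """
    handled = []
    for argv in steps:
        line = shlex.join(argv)
        print("+", line)
        if not dry_run:
            subprocess.run(argv, cwd=cwd, check=True)
        handled.append(line)
    return handled


if __name__ == "__main__":
    run_steps(dry_run=True)
```

The order matters: step 3 installs the GPU build of xgboost4j into the local repository only, and step 4 restricts the reactor with -pl so that build is never uploaded.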

@wbo4958
Contributor Author

wbo4958 commented Nov 5, 2024

Amazing. Great work, @hcho3!
