Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[GLUTEN-4903][CELEBORN] Support multiple versions of Celeborn #4913

Merged
merged 1 commit into from
Mar 15, 2024

Conversation

kerwin-zk
Copy link
Contributor

What changes were proposed in this pull request?

Support multiple versions of Celeborn

#4903

How was this patch tested?

CI

Copy link

#4903

Copy link

Run Gluten Clickhouse CI

1 similar comment
Copy link

Run Gluten Clickhouse CI

Copy link

Run Gluten Clickhouse CI

Copy link

Run Gluten Clickhouse CI

Copy link

Run Gluten Clickhouse CI

Copy link

Run Gluten Clickhouse CI

Copy link

Run Gluten Clickhouse CI

@ulysses-you
Copy link
Contributor

ulysses-you commented Mar 12, 2024

Is it possible to use one module to be compatible with multi-version celeborn ? It is hard to maintain if we always create new module for new coming celeborn version. Besides, it's hard to work if someone use 0.3 branch but backport some 0.4 patch..

cc @pan3793

@kerwin-zk
Copy link
Contributor Author

Is it possible to use one module to be compatible with multi-version celeborn ? It is hard to maintain if we always create new module for new coming celeborn version. Besides, it's hard to work if someone use 0.3 branch but backport some 0.4 patch..

cc @pan3793

@ulysses-you I do agree that we need to consider the case of it's hard to work if someone uses the 0.3 branch but backports some 0.4 patch, and I will consolidate it into one.

@kerwin-zk kerwin-zk changed the title [GLUTEN-4903][CELEBORN] Support multiple versions of Celeborn [GLUTEN-4903][CELEBORN][WIP] Support multiple versions of Celeborn Mar 12, 2024
Copy link

#4903

Copy link

Run Gluten Clickhouse CI

Copy link

Run Gluten Clickhouse CI

1 similar comment
Copy link

Run Gluten Clickhouse CI

Copy link

Run Gluten Clickhouse CI

Copy link

Run Gluten Clickhouse CI

3 similar comments
Copy link

Run Gluten Clickhouse CI

Copy link

Run Gluten Clickhouse CI

Copy link

Run Gluten Clickhouse CI

Copy link

Run Gluten Clickhouse CI

Copy link

Run Gluten Clickhouse CI

@kerwin-zk kerwin-zk force-pushed the issue-4903 branch 2 times, most recently from 0678ce5 to 56cda5b Compare March 14, 2024 02:16
Copy link

Run Gluten Clickhouse CI

1 similar comment
Copy link

Run Gluten Clickhouse CI

@kerwin-zk kerwin-zk changed the title [GLUTEN-4903][CELEBORN][WIP] Support multiple versions of Celeborn [GLUTEN-4903][CELEBORN] Support multiple versions of Celeborn Mar 14, 2024
run: |
$PATH_TO_GLUTEN_TE/$OS_IMAGE_NAME/gha/gha-checkout/exec.sh \
'wget https://archive.apache.org/dist/incubator/celeborn/celeborn-0.3.0-incubating/apache-celeborn-0.3.0-incubating-bin.tgz && \
tar xzf apache-celeborn-0.3.0-incubating-bin.tgz -C /opt/ && mv /opt/apache-celeborn-0.3.0-incubating-bin /opt/celeborn && cd /opt/celeborn && \
'wget https://archive.apache.org/dist/incubator/celeborn/celeborn-0.4.0-incubating/apache-celeborn-0.4.0-incubating-bin.tgz && \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CI only covers 0.3.2 and 0.4.0, what about 0.3.0 and 0.3.1? we actually have breaking changes on the APIs that Gluten used in the 0.3.x serial

Copy link
Contributor Author

@kerwin-zk kerwin-zk Mar 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pan3793 CI of 0.3.0 and 0.3.1 has been successfully run. Mainly considering the running time of CI, only one version of 0.3.x is currently retained. If everyone feels that CI duration is not an issue after discussion, I can add it.

Comment on lines +29 to +38
<exclusions>
<exclusion>
<groupId>org.apache.celeborn</groupId>
<artifactId>celeborn-client-spark-${spark.major.version}_${scala.binary.version}</artifactId>
</exclusion>
<exclusion>
<groupId>org.apache.celeborn</groupId>
<artifactId>celeborn-spark-${spark.major.version}-columnar-shuffle_${scala.binary.version}</artifactId>
</exclusion>
</exclusions>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the shaded artifacts should not expose the shipped transitive deps, seems there are issues in 0.4.0-incubating cc @cfmcgrady

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pan3793 I am handling this issue.

Copy link

Run Gluten Clickhouse CI

Copy link
Contributor

@ulysses-you ulysses-you left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you @kerwin-zk

@@ -47,7 +47,7 @@
<sparkbundle.version>3.2</sparkbundle.version>
<spark.version>3.4.2</spark.version>
<sparkshim.artifactId>spark-sql-columnar-shims-spark32</sparkshim.artifactId>
<celeborn.version>0.3.0-incubating</celeborn.version>
<celeborn.version>0.3.2-incubating</celeborn.version>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shall we add profile for 0.4 ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In theory, Celeborn only needs one version, as differences between versions have been eliminated through reflection.

Copy link

Run Gluten Clickhouse CI

Copy link

Run Gluten Clickhouse CI

@ulysses-you ulysses-you merged commit 62746c4 into apache:main Mar 15, 2024
17 checks passed
@GlutenPerfBot
Copy link
Contributor

===== Performance report for TPCH SF2000 with Velox backend, for reference only ====

query log/native_4913_time.csv log/native_master_03_14_2024_58a459bf4_time.csv difference percentage
q1 37.79 36.11 -1.683 95.55%
q2 23.97 23.82 -0.151 99.37%
q3 36.67 36.95 0.276 100.75%
q4 38.50 38.44 -0.065 99.83%
q5 72.26 69.68 -2.583 96.43%
q6 7.39 7.39 -0.007 99.91%
q7 84.01 82.24 -1.773 97.89%
q8 84.61 84.83 0.218 100.26%
q9 118.83 125.41 6.573 105.53%
q10 44.25 45.43 1.180 102.67%
q11 19.81 20.82 1.007 105.08%
q12 28.92 24.93 -3.990 86.20%
q13 46.80 47.32 0.520 101.11%
q14 21.03 20.91 -0.116 99.45%
q15 29.46 31.01 1.547 105.25%
q16 14.00 12.67 -1.333 90.48%
q17 100.03 100.39 0.368 100.37%
q18 143.23 141.86 -1.362 99.05%
q19 13.74 14.82 1.080 107.86%
q20 29.49 28.88 -0.602 97.96%
q21 227.55 226.25 -1.306 99.43%
q22 13.67 13.94 0.272 101.99%
total 1236.01 1234.08 -1.929 99.84%

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants