[SPARK-50082][CORE][ML] Upgrade pmml-model to 1.7.1 and migrate jaxb-api to jakarta.xml.bind-api, activation to jakarta.activation-api to remove jersey-related warning logs #48611
cc @pan3793 and @dongjoon-hyun FYI
also cc @panbingkun
Thank you so much, @wayneguow . I also observed the issue.
Since this is a log-related thing, it seems a little hard to validate.
Could you revise your PR description to explain how to validate the fix from the logs in the following cases?
- Are all Spark shell environments (spark-shell, spark-sql, pyspark, sparkR) fixed?
- Are the Spark daemons (Spark Master/Worker/HistoryServer) fixed?
- Are Spark applications on a Spark Standalone cluster and a K8s cluster (at least) fixed?
Are there any negative impacts of disabling these features? Is it possible to retain these features through upgrading dependencies or code changes?
Gentle ping, @wayneguow. Could you answer the above questions?
@dongjoon-hyun Sorry, I forgot about this. I will confirm the details and post the final solution and reasoning in the next few days.
Thank you so much!
After doing further research, I found out why these two warning logs appear in Spark 4.0 but not in Spark 3.x and earlier versions:
If the following two dependencies are on the classpath, there are no corresponding warning logs, but we excluded them in this PR: #25481
For the first question:
For the second question:
Thank you for sharing that, @wayneguow.
Given that, this PR is the best and safest way to handle these, right, @wayneguow?
+1, LGTM.
@dongjoon-hyun In my opinion, yes. We can also wait for @LuciferYang's opinion.
Ack! Sure, let's wait for @LuciferYang's opinion.
Other than that, I'm checking the items listed here one by one and will update the PR description once it's all finished.
We upgraded from 2.41 to 3.0.x. Actually, there was a similar registration in 2.41 as well, with just a slight difference in the namespace between
Why didn't we need settings similar to the current PR before? I want to know: if we revert the changes made by #25481, will the issue described in #25481 still exist? And could that also resolve the current problem?
A little bit of further investigation: Spark already pulls spark/dev/deps/spark-deps-hadoop-3-hive-2.3 Line 122 in 98f2767
while this version actually contains
I think Spark eventually needs to follow the Jakarta EE specification to migrate to
Back to the issue specific to this PR, I think the current approach and @LuciferYang's suggestion have no real runtime differences, because Spark only uses a small set of Jersey features (Jersey integrates with a lot of Jakarta EE APIs), but I lean toward @LuciferYang's suggestion because it simplifies both
If time permits, I suggest we try to migrate them to jakarta before the Spark 4.0 release.
@dongjoon-hyun @LuciferYang @pan3793 Sorry for the late update to the PR. If you get a chance, I'd appreciate another look!
Also cc @zhengruifeng for the ML part.
It seems we need to change the code for
also cc @WeichenXu123
Okay, I agree with your suggestion, let me first ensure that
What changes were proposed in this pull request?
This PR aims to:
- upgrade pmml-model to 1.7.1;
- migrate javax.xml.bind:jaxb-api to jakarta.xml.bind:jakarta.xml.bind-api and javax.activation:activation to jakarta.activation:jakarta.activation-api, to remove some jersey-related warning logs when the ApiRootResource or PrometheusResource API is called.
When we start spark-shell with the latest master code, open the Spark UI, and click the Executors tab, the relevant warning logs appear in the spark-shell.
Why are the changes needed?
- Upgrade pmml-model to the latest version;
Does this PR introduce any user-facing change?
No.
How was this patch tested?
The configuration of metrics.properties is as follows:
./bin/spark-shell --conf spark.ui.prometheus.enabled=true --conf spark.metrics.conf=conf/metrics.properties
curl http://192.168.124.7:4040/api/v1/version
curl http://192.168.124.7:4040/metrics/executors/prometheus
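Since the fix only shows up in the logs, one possible way to make the manual check above repeatable is to probe both endpoints and then count warning lines in a captured spark-shell log. This is a minimal sketch, not part of the PR: the helper names `probe` and `count_warnings`, the `BASE_URL` variable, and the `spark-shell.log` file name are all hypothetical, and the default pattern simply matches every `WARN` line because the exact warning text is not quoted in this thread. Pass a narrower regex once the warning text is known.

```shell
#!/usr/bin/env bash
# Sketch only: helper names, BASE_URL and the log file name are assumptions,
# not anything defined by Spark or by this PR.

# Print only the HTTP status code returned for a URL.
probe() {
  curl -s -o /dev/null -w '%{http_code}\n' "$1"
}

# Count log lines matching an extended regex.
#   $1: path to a captured log file
#   $2: regex to match (defaults to every WARN line; narrow it as needed)
count_warnings() {
  local log_file="$1"
  local pattern="${2:-WARN}"
  # grep -c exits non-zero when the count is 0, so keep the exit status clean.
  grep -E -c -- "$pattern" "$log_file" || true
}

# Example usage, assuming spark-shell output was captured to spark-shell.log:
#   BASE_URL=http://192.168.124.7:4040
#   probe "$BASE_URL/api/v1/version"
#   probe "$BASE_URL/metrics/executors/prometheus"
#   count_warnings spark-shell.log
```

Running the same steps before and after the patch and comparing the two counts would give a rough signal; the count should drop if the jersey-related warnings are gone, though unrelated WARN lines will also be included unless a narrower pattern is supplied.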
Was this patch authored or co-authored using generative AI tooling?
No.