-
Notifications
You must be signed in to change notification settings - Fork 28.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-1774] Respect SparkSubmit --jars on YARN (client) #710
Conversation
Conflicts: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala
We may not always want to automatically add the jar to |
Merged build triggered. |
Merged build started. |
Merged build finished. All automated tests passed. |
All automated tests passed. |
Maybe you're already aware of this - in YARN cluster mode, the app jar is renamed to "app.jar" on the cluster machine and addJar'd . So adding it to spark.jars with its original name wouldn't work. It could make sense to not rename it on the cluster machine, but there might be backwards compatibility issues there with supporting YARN jobs outside spark-submit. |
Yes, for yarn-cluster, we're not replacing the |
That sounds correct to me (that it's redundant). Just looked through the code to confirm. |
These scenarios already distribute the primary resource properly. See code comments for more detail.
Merged build triggered. |
Merged build started. |
OptionAssigner(args.files, LOCAL | STANDALONE | MESOS, false, sysProp = "spark.files"), | ||
OptionAssigner(args.files, LOCAL | STANDALONE | MESOS, true, sysProp = "spark.files"), | ||
OptionAssigner(args.jars, LOCAL | STANDALONE | MESOS, false, sysProp = "spark.jars") | ||
OptionAssigner(args.jars, ALL_CLUSTER_MGRS, false, sysProp = "spark.jars") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the line that makes SparkSubmit respect --jars
for yarn-client mode
Other than the two lines I pointed out in the comments, the rest are basically format and test fixes. |
Merged build finished. All automated tests passed. |
All automated tests passed. |
Thanks @andrewor14 - this looks good. I'm merging it! |
SparkSubmit ignores `--jars` for YARN client. This is a bug. This PR also automatically adds the application jar to `spark.jar`. Previously, when running as yarn-client, you must specify the jar additionally through `--files` (because `--jars` didn't work). Now you don't have to explicitly specify it through either. Tested on a YARN cluster. Author: Andrew Or <[email protected]> Closes #710 from andrewor14/yarn-jars and squashes the following commits: 35d1928 [Andrew Or] Merge branch 'master' of github.com:apache/spark into yarn-jars c27bf6c [Andrew Or] For yarn-cluster and python, do not add primaryResource to spark.jar c92c5bf [Andrew Or] Minor cleanups 269f9f3 [Andrew Or] Fix format 013d840 [Andrew Or] Fix tests 1407474 [Andrew Or] Merge branch 'master' of github.com:apache/spark into yarn-jars 3bb75e8 [Andrew Or] Allow SparkSubmit --jars to take effect in yarn-client mode (cherry picked from commit 83e0424) Signed-off-by: Patrick Wendell <[email protected]>
SparkSubmit ignores `--jars` for YARN client. This is a bug. This PR also automatically adds the application jar to `spark.jar`. Previously, when running as yarn-client, you must specify the jar additionally through `--files` (because `--jars` didn't work). Now you don't have to explicitly specify it through either. Tested on a YARN cluster. Author: Andrew Or <[email protected]> Closes apache#710 from andrewor14/yarn-jars and squashes the following commits: 35d1928 [Andrew Or] Merge branch 'master' of github.com:apache/spark into yarn-jars c27bf6c [Andrew Or] For yarn-cluster and python, do not add primaryResource to spark.jar c92c5bf [Andrew Or] Minor cleanups 269f9f3 [Andrew Or] Fix format 013d840 [Andrew Or] Fix tests 1407474 [Andrew Or] Merge branch 'master' of github.com:apache/spark into yarn-jars 3bb75e8 [Andrew Or] Allow SparkSubmit --jars to take effect in yarn-client mode
…'s transitive packages (apache#710) * use verbosity flag in install * always pass down file instruction for executors conda env * check style * reuse yarn test to double check executor conda env * fix yarn cluster test python file * better documentation, tests * update doc w/ checkstyle Co-authored-by: Willi Raschkowski <[email protected]> * remove unnecessary import test Co-authored-by: Willi Raschkowski <[email protected]>
SparkSubmit ignores
--jars
for YARN client. This is a bug.This PR also automatically adds the application jar to
spark.jar
. Previously, when running as yarn-client, you must specify the jar additionally through--files
(because--jars
didn't work). Now you don't have to explicitly specify it through either.Tested on a YARN cluster.