Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-4504][Examples] fix run-example failure if multiple example jars present #3377

Closed
wants to merge 2 commits into from

Conversation

gvramana
Copy link
Contributor

Fix run-examples script error during multiple jars.
Modified to support failing fast similar to assembly.jar

Author: Venkata Ramana G [email protected]

@AmplabJenkins
Copy link

Can one of the admins verify this patch?

export SPARK_EXAMPLES_JAR="`ls "$FWDIR"/lib/spark-examples-*hadoop*.jar`"
elif [ -e "$EXAMPLES_DIR"/target/scala-$SPARK_SCALA_VERSION/spark-examples-*hadoop*.jar ]; then
export SPARK_EXAMPLES_JAR="`ls "$EXAMPLES_DIR"/target/scala-$SPARK_SCALA_VERSION/spark-examples-*hadoop*.jar`"
export SPARK_EXAMPLES_JAR="`ls -t "$FWDIR"/lib/spark-examples-*hadoop*.jar 2>>/dev/null | head -1`"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do you know this will pick up the right jar?

More generally, I think the only situation someone will run into this is when switching between branches and rebuilding without cleaning. Not sure it's worth it to add workarounds in the code for that.

You'll get a similar error from compute-classpath.sh if multiple assembly jars exist. The same behavior should be just fine for the examples.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand that required jar will not always be latest. But failing with script error confuses user. We can at least correct to give a proper error message.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer to fail fast in this case like we do for the assembly jar.

@gvramana
Copy link
Contributor Author

Modified to support failing fast similar to assembly.jar

@gvramana
Copy link
Contributor Author

gvramana commented Dec 1, 2014

Please review the patch, thanks

@JoshRosen
Copy link
Contributor

This looks pretty good, but I've noticed that there are some small differences between the code here and the analogous code in compute-classpath.sh. We should keep them as similar as possible.

echo "Failed to find Spark examples assembly in $FWDIR/lib or $FWDIR/examples/target" 1>&2
echo "You need to build Spark before running this program" 1>&2
exit 1
fi

if [ "$JAR_COUNT" -gt "1" ]; then
JARS_LIST="`ls -t ${JAR_PATH}/spark-examples-*hadoop*.jar`"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For instance, compute-classpath uses

  jars_list=$(ls "$assembly_folder" | grep "spark-assembly.*hadoop.*.jar$")

for this; we should match this style rather than introducing a new one using backticks, ls - t, etc.

@JoshRosen
Copy link
Contributor

@nchammas, do you want to chime in here regarding bash style?

@nchammas
Copy link
Contributor

The style looks good to me, with a couple of exceptions: In general we should use $(...) in place of backticks, and we should also avoid parsing the output of ls.

fi

if [[ -z "$SPARK_EXAMPLES_JAR" ]]; then
JAR_COUNT="`ls ${JAR_PATH}/spark-examples-*hadoop*.jar 2>>/dev/null | wc -l`"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know that the old code used this style, but it would be great if we could replace all occurrences where we capture the output of ls with one of the recommended patterns in either of these two pages (e.g. using for f in glob_expression or using find).

Even better would be to find a way to avoid having to capture this output at all

@srowen
Copy link
Member

srowen commented Jan 5, 2015

There was already a PR for this: #3069
But it seems to be fixing a different root cause, that the assembly generated by SBT and Maven are named differently. I am not sure if this also addresses this problem?

@JoshRosen
Copy link
Contributor

@srowen You can still end up with multiple examples assemblies if you're switching between branches without performing cleans, e.g. switch from branch-1.2 to master.

@srowen
Copy link
Member

srowen commented Jan 5, 2015

@JoshRosen Yeah I get that; I think there is another different issue lurking here, though I haven't really looked at it, and that is that the glob pattern does not match what SBT builds. The other PR might also be worth merging.

@gvramana
Copy link
Contributor Author

@nchammas @JoshRosen Thanks for your comments. I have updated the same. please review.

@nchammas
Copy link
Contributor

Updated Bash style looks good to me.

@JoshRosen
Copy link
Contributor

Jenkins, this is ok to test.

@SparkQA
Copy link

SparkQA commented Jan 15, 2015

Test build #25611 has started for PR 3377 at commit fa7f481.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Jan 15, 2015

Test build #25611 has finished for PR 3377 at commit fa7f481.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25611/
Test PASSed.

@JoshRosen
Copy link
Contributor

LGTM. Thanks @nchammas for looking this over. I'm going to merge this into master (1.3.0), branch-1.2 (1.2.1), and branch-1.1 (1.1.2).

@asfgit asfgit closed this in 74de94e Jan 19, 2015
asfgit pushed a commit that referenced this pull request Jan 19, 2015
…ars exist

Fix run-example script to fail fast with useful error message if multiple
example assembly JARs are present.

Author: Venkata Ramana Gollamudi <[email protected]>

Closes #3377 from gvramana/run-example_fails and squashes the following commits:

fa7f481 [Venkata Ramana Gollamudi] Fixed review comments, avoiding ls output scanning.
6aa1ab7 [Venkata Ramana Gollamudi] Fix run-examples script error during multiple jars

(cherry picked from commit 74de94e)
Signed-off-by: Josh Rosen <[email protected]>

Conflicts:
	bin/compute-classpath.sh
asfgit pushed a commit that referenced this pull request Jan 19, 2015
…ars exist

Fix run-example script to fail fast with useful error message if multiple
example assembly JARs are present.

Author: Venkata Ramana Gollamudi <[email protected]>

Closes #3377 from gvramana/run-example_fails and squashes the following commits:

fa7f481 [Venkata Ramana Gollamudi] Fixed review comments, avoiding ls output scanning.
6aa1ab7 [Venkata Ramana Gollamudi] Fix run-examples script error during multiple jars

(cherry picked from commit 74de94e)
Signed-off-by: Josh Rosen <[email protected]>

Conflicts:
	bin/compute-classpath.sh
	bin/run-example
fi
num_jars=0

for f in ${assembly_folder}/spark-assembly*hadoop*.jar; do
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a word splitting bug here. My bad for missing this during review.

@gvramana
Copy link
Contributor Author

Oh, basic mistake :( . Sorry I missed it completely.
Fixed the same in PR #4561
@nchammas Thanks for identifying.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants