[SPARK-6122][Core] Upgrade tachyon-client version to 0.6.3 #5354

Closed

calvinjia
Contributor

This is a reopening of #4867.
A short summary of the issues resolved from the previous PR:

  1. HTTPClient version mismatch: Selenium (used for UI tests) requires version 4.3.x, and Tachyon included 4.2.5 through a transitive dependency of its shaded thrift jar. To address this, Tachyon 0.6.3 will promote the transitive dependencies of the shaded jar so they can be excluded in Spark.
  2. Jackson-Mapper-ASL version mismatch: Lower versions of hadoop-client (i.e. 1.0.4) include version 1.0.1, but the parquet library used in Spark SQL requires version 1.8+. It's unclear to me why upgrading tachyon-client would cause this dependency to break. The solution was to exclude jackson-mapper-asl from hadoop-client.

It seems that the dependency management in spark-parent will not work on transitive dependencies; one way to make sure jackson-mapper-asl is included with the correct version is to add it as a top-level dependency. The best solution would be to exclude the dependency in the modules which require a higher version, but that did not fix the unit tests. Any suggestions on the best way to solve this would be appreciated!

@AmplabJenkins

Can one of the admins verify this patch?

@haoyuan
Contributor

haoyuan commented Apr 3, 2015

Jenkins, test this please.

@SparkQA

SparkQA commented Apr 3, 2015

Test build #29691 has started for PR 5354 at commit 0ae6c97.

@SparkQA

SparkQA commented Apr 3, 2015

Test build #29691 has finished for PR 5354 at commit 0ae6c97.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.
  • This patch adds the following new dependencies:
    • com.sun.el-1.0.0.v201105211818.jar
    • commons-el-1.0.jar
    • commons-logging-1.0.3.jar
    • hadoop-core-1.0.4.jar
    • hsqldb-1.8.0.10.jar
    • javax.el-2.1.0.v201105211819.jar
    • javax.servlet.jsp-2.1.0.v201105211820.jar
    • javax.servlet.jsp.jstl-1.2.0.v201105211821.jar
    • jetty-continuation-8.1.14.v20131031.jar
    • jetty-http-8.1.14.v20131031.jar
    • jetty-io-7.6.15.v20140411.jar
    • jetty-security-8.1.14.v20131031.jar
    • jetty-util-8.1.14.v20131031.jar
    • jetty-xml-7.6.15.v20140411.jar
    • org.apache.jasper.glassfish-2.1.0.v201110031002.jar
    • org.apache.taglibs.standard.glassfish-1.2.0.v201112081803.jar
    • org.eclipse.jdt.core-3.7.1.jar
    • tachyon-0.6.3.jar
    • tachyon-client-0.6.3.jar
  • This patch removes the following dependencies:
    • tachyon-0.5.0.jar
    • tachyon-client-0.5.0.jar

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29691/
Test FAILed.

@haoyuan
Contributor

haoyuan commented Apr 3, 2015

Jenkins, test this please

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29697/
Test PASSed.

@JoshRosen
Contributor

Jenkins, retest this please.

@SparkQA

SparkQA commented Apr 5, 2015

Test build #29731 has started for PR 5354 at commit a3a29da.

@SparkQA

SparkQA commented Apr 5, 2015

Test build #29731 has finished for PR 5354 at commit a3a29da.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.
  • This patch adds the following new dependencies:
    • jetty-continuation-8.1.14.v20131031.jar
    • jetty-http-8.1.14.v20131031.jar
    • jetty-io-7.6.15.v20140411.jar
    • jetty-security-8.1.14.v20131031.jar
    • jetty-util-8.1.14.v20131031.jar
    • jetty-xml-7.6.15.v20140411.jar
    • tachyon-0.6.3.jar
    • tachyon-client-0.6.3.jar
  • This patch removes the following dependencies:
    • tachyon-0.5.0.jar
    • tachyon-client-0.5.0.jar

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29731/
Test PASSed.

@calvinjia
Contributor Author

@JoshRosen @haoyuan
Thanks for retesting, I've updated the pom to exclude the promoted dependencies of jetty and curator.

@haoyuan
Contributor

haoyuan commented Apr 6, 2015

Jenkins, retest this please.

@JoshRosen
Contributor

Jenkins, retest this please.

@SparkQA

SparkQA commented Apr 6, 2015

Test build #29753 has started for PR 5354 at commit e2ff80a.

@SparkQA

SparkQA commented Apr 6, 2015

Test build #29753 has finished for PR 5354 at commit e2ff80a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.
  • This patch adds the following new dependencies:
    • tachyon-0.6.3.jar
    • tachyon-client-0.6.3.jar
  • This patch removes the following dependencies:
    • tachyon-0.5.0.jar
    • tachyon-client-0.5.0.jar

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29753/
Test PASSed.

@calvinjia
Contributor Author

@JoshRosen @aarondav @pwendell
What do you think about the current PR? Is there any better strategy for handling the Jackson-mapper-asl issue?

Thanks!

@aarondav
Contributor

The change LGTM but I'll leave it to @pwendell to OK the pom changes. I'm not sure if there's a better way to express basically "don't include any of this guy's transitive dependencies".

@srowen
Member

srowen commented Apr 13, 2015

Yea, this is really messy. Why are so many exclusions needed? Many of these are artifacts that are already in the assembly, so there's no particular problem.

However it does look like the Tachyon client has a lot of big dependencies. Is the client really pulling in hadoop-core and jetty-*? Can the client's dependencies be trimmed somewhat? Basically this declaration says that few of its dependencies are actually needed.

At the least we need to keep out hadoop-core, the servlet APIs, EL, probably hsqldb if possible, Glassfish. Anything that's added needs its license checked.

@srowen
Member

srowen commented Apr 20, 2015

Oh, I'm referring to commons.httpclient.version declared in the parent POM. It's actually only used, however, to manage the version of httpclient used by Kinesis. I saw you were the one that added the line, but this could be a false alarm; maybe you were just moving code.

I think the right-er thing to do given the facts here is just make the whole project actually use the version of httpclient / httpcore that this string implies, and make it 4.3.6.

@ScrapCodes
Member

hm.. your approach sounds good to me, we should really make sure we have the same version of a library across dependent projects.

@calvinjia
Contributor Author

@srowen
Despite Jackson being managed by dependencyManagement, lower versions make it into lib_managed, which causes issues when the mllib tests run, since they expect the higher version. This is why I exclude the library from hadoop-client, which brings in the lower versions.

For httpclient, 4.2.5 is brought in by sql/hive when they reference libthrift. This breaks the Selenium tests, which expect 4.3.2.

Finding the perfect way to reconcile Spark's dependencies, as you mentioned before, deserves its own ticket and will require substantial effort. For example, it is not even easy to tell that different versions of httpclient are referenced in Spark.

@ScrapCodes
Let's address cleaning up the Kinesis profile in a separate PR.
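
The point above, that it is hard to tell when several versions of one library are referenced, can be checked mechanically. The following is a minimal sketch in Python, assuming jars follow the usual artifact-version.jar naming. The function name find_version_conflicts and the sample jar names (built from versions mentioned in this thread) are illustrative and not part of the Spark build:

```python
import re
from collections import defaultdict

# Split "artifact-version.jar" at the first hyphen that is followed by a digit.
JAR_NAME = re.compile(r"^(?P<artifact>.+?)-(?P<version>\d[^/]*)\.jar$")


def find_version_conflicts(jar_names):
    """Return {artifact: versions} for artifacts seen with more than one version.

    Versions are sorted as plain strings, which is enough for reporting.
    """
    versions = defaultdict(set)
    for name in jar_names:
        match = JAR_NAME.match(name)
        if match:  # skip names that do not follow the artifact-version.jar pattern
            versions[match.group("artifact")].add(match.group("version"))
    return {a: sorted(v) for a, v in versions.items() if len(v) > 1}


# Hypothetical input: in practice the names could be collected by walking
# the directories under lib_managed.
jars = [
    "httpclient-4.2.5.jar",
    "httpclient-4.3.2.jar",
    "jackson-mapper-asl-1.0.1.jar",
    "jackson-mapper-asl-1.8.8.jar",
    "tachyon-client-0.6.3.jar",
]
conflicts = find_version_conflicts(jars)
for artifact, found in sorted(conflicts.items()):
    print(artifact, found)
```

A report like this only shows which jars are present. It does not say which one the test classpath will actually load.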

@srowen
Member

srowen commented Apr 21, 2015

I don't see anything in lib_managed except the datanucleus jars. Jackson should not be something that has to be managed specially like this, and I see no reference to Jackson jars in this dir in the code. Are you sure this isn't something left around from some other process?

Right now, mvn -Phadoop-2.4 dependency:tree shows a dependency on Jackson 1.9.13 only, across the project. What are you seeing that is different and that causes a problem? The current Hadoop deps do not bring in a different version, and Tachyon doesn't add new Hadoop deps (right?)

If Tachyon requires HttpClient 4.3.2+, then that will need to be resolved here. Your change does not actually cause Spark to use 4.3.2, as I say above. That's easy to address, and needs to happen in this PR. We don't need a separate PR for the Kinesis change since that's logically part of the change you will have to make here to make this do what you want.

@calvinjia
Contributor Author

@srowen
These issues only occur when running the sbt tests using lower versions of Hadoop (i.e. 1.0.4).

Tachyon does not require httpclient 4.3.2; it's the selenium tests that do. To be clear, the dependency changes I am making are to address the jar conflicts that occur when running the sbt tests.

@srowen
Member

srowen commented Apr 21, 2015

Why would the Tachyon version change itself alter the jackson or httpclient dependencies? Was it already not working?

@calvinjia
Contributor Author

The conflicting dependencies were already included prior to my changes. I think that because the correct jars were also available, the build could happen to pick those up, which masked the problem.

@calvinjia
Contributor Author

@srowen
Do you have any other comments? Thanks.

@srowen
Member

srowen commented Apr 22, 2015

I'm running SBT tests now just to see if I can reproduce a failure. I guess I'd be surprised if the default SBT build doesn't work, since it would have been this way for a while. I do see that the SBT build adds a bunch of stuff to lib_managed, yes, which Maven doesn't.

So, if the dependency changes are not related to Tachyon, I think I'd skip them. There's no problem with Maven right now. The Jackson exclusion should not make a difference; in Maven, the version is consistently 1.8.8 for pre-Hadoop-2 builds, and 1.9.13 for Hadoop 2.2+ builds. The HttpClient situation is a little bit wrong but if it's not actually affecting Tachyon, that can be fixed separately.

@calvinjia
Contributor Author

@srowen
You shouldn't be able to reproduce a failure with master branch, but upgrading Tachyon triggers the issue (picking up the wrong jar out of the ones that get pulled into lib_managed). This is why I made modifications to prevent the wrong jars from being pulled in.

@srowen
Member

srowen commented Apr 22, 2015

Yeah, that still confuses me. If Tachyon doesn't touch hadoop-client or hadoop-common I'm not sure how it could change the dependencies. I'm wary of doing things like excluding deps to manage versions, especially when Hadoop does in fact need Jackson.

I can appreciate that -- whatever is going on -- we still want the SBT build to work even if it's not the main build, and still want Hadoop 1.x builds to work even if it's not the default. Let me try applying this change and seeing the diff in SBT myself, to try to verify what is essential to change.

@calvinjia
Contributor Author

@srowen
Are you completely opposed to using an exclusion for the jackson library? Thanks for taking a look at the sbt build.

@srowen
Member

srowen commented Apr 22, 2015

I'm still working through this -- I do see the same problem you see with the httpclient library, and it's because the SBT resolution rules are different. It's weird, but yeah that probably has to be patched up as you have done. I have a proposed change that touches up the handling in Kinesis, etc.

I didn't see a Jackson-related problem, not yet, but that may be masked by earlier failures. Basically, I'd be surprised if this exclusion is the only possible solution, so I want to try it out in a different environment.

@calvinjia
Contributor Author

@srowen
You can try running org.apache.spark.mllib.regression.RidgeRegressionSuite with hadoop version 1.0.4 to reproduce the Jackson library conflict.
What kind of solutions are you more open to?

@srowen
Member

srowen commented Apr 23, 2015

I don't see a Jackson-related failure, even when directly running org.apache.spark.mllib.regression.RidgeRegressionSuite, with the changes in this patch.

However I do see the HttpClient problem:

[info] Exception encountered when attempting to run a suite with class name: org.apache.spark.streaming.UISeleniumSuite *** ABORTED *** (106 milliseconds)
[info]   java.lang.NoSuchMethodError: org.apache.http.impl.cookie.BrowserCompatSpecFactory.create(Lorg/apache/http/protocol/HttpContext;)Lorg/apache/http/cookie/CookieSpec;

The change here works, although I think it can be tightened up a little bit. My last experiment will be to see if we can get away with runtime scope instead; test scope didn't work. I'll post what I have in a branch for you to take a look at after that.
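
For readers less used to JVM method descriptors, the signature in the NoSuchMethodError above can be decoded mechanically. This is a minimal sketch in Python; the helper names decode_descriptor and _readable are made up for illustration, and only the descriptor string comes from the log:

```python
import re

# JVM primitive type codes as they appear in method descriptors.
PRIMITIVES = {"B": "byte", "C": "char", "D": "double", "F": "float",
              "I": "int", "J": "long", "S": "short", "Z": "boolean", "V": "void"}

# One type: optional array brackets, then a primitive code or L<class name>;
FIELD_TYPE = re.compile(r"(\[*)(?:([BCDFIJSZV])|L([^;]+);)")


def _readable(match):
    """Convert one matched descriptor type into Java source notation."""
    dims, primitive, class_name = match.groups()
    base = PRIMITIVES[primitive] if primitive else class_name.replace("/", ".")
    return base + "[]" * len(dims)


def decode_descriptor(descriptor):
    """Turn '(Lpkg/A;I)Lpkg/B;' into (['pkg.A', 'int'], 'pkg.B')."""
    params, _, ret = descriptor[1:].partition(")")
    param_types = [_readable(m) for m in FIELD_TYPE.finditer(params)]
    return_type = _readable(FIELD_TYPE.fullmatch(ret))
    return param_types, return_type


descriptor = "(Lorg/apache/http/protocol/HttpContext;)Lorg/apache/http/cookie/CookieSpec;"
params, returns = decode_descriptor(descriptor)
print(f"create({', '.join(params)}) -> {returns}")
```

Read this way, the error says that the loaded BrowserCompatSpecFactory class has no create method taking an HttpContext and returning a CookieSpec. That is consistent with the diagnosis in this thread that an older httpclient than the one Selenium expects is on the test classpath.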

@srowen
Member

srowen commented Apr 23, 2015

This is the best I've got, which still works with SBT:
srowen@c1e40f8

It's mostly the same, but I don't find a jackson exclusion is needed, and I think the httpclient situation could be further tightened.

@aniketbhatnagar @ScrapCodes Does the change to the kinesis-asl profile make sense? Basically, now we need to manage httpclient versions correctly for the whole project, so I figure there's no need to redundantly manage it in the profiles.

@srowen
Member

srowen commented Apr 23, 2015

It seems like the Jackson dep has to be excluded to get SBT + Hadoop 1.0.4 to work. I think that has to stay then, yeah. I think the httpclient stuff can be cleaned up a small bit but that too is essential.

I'm getting worried at how much the divergence between SBT and Maven is causing us to hack the build, making it harder to get the build right for both. For example, these changes aren't necessary at all for Maven. It's exacerbated by trying to support Hadoop 1.x.

Still maybe we kick this can down the road a bit longer, to get in this change.

@calvinjia
Contributor Author

@srowen
I appreciate the feedback, and I've cleaned up the httpclient versions as you suggested.
Do you have any other comments? Thanks.

@SparkQA

SparkQA commented Apr 24, 2015

Test build #30915 has started for PR 5354 at commit 0eefe4d.

@aniketbhatnagar
Contributor

+1 from my side. Having a consistent httpclient version would be so much better!

@SparkQA

SparkQA commented Apr 24, 2015

Test build #30915 has finished for PR 5354 at commit 0eefe4d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.
  • This patch removes the following dependencies:
    • RoaringBitmap-0.4.5.jar
    • activation-1.1.jar
    • akka-actor_2.10-2.3.4-spark.jar
    • akka-remote_2.10-2.3.4-spark.jar
    • akka-slf4j_2.10-2.3.4-spark.jar
    • aopalliance-1.0.jar
    • arpack_combined_all-0.1.jar
    • avro-1.7.7.jar
    • breeze-macros_2.10-0.11.2.jar
    • breeze_2.10-0.11.2.jar
    • chill-java-0.5.0.jar
    • chill_2.10-0.5.0.jar
    • commons-beanutils-1.7.0.jar
    • commons-beanutils-core-1.8.0.jar
    • commons-cli-1.2.jar
    • commons-codec-1.10.jar
    • commons-collections-3.2.1.jar
    • commons-compress-1.4.1.jar
    • commons-configuration-1.6.jar
    • commons-digester-1.8.jar
    • commons-httpclient-3.1.jar
    • commons-io-2.1.jar
    • commons-lang-2.5.jar
    • commons-lang3-3.3.2.jar
    • commons-math-2.1.jar
    • commons-math3-3.4.1.jar
    • commons-net-2.2.jar
    • compress-lzf-1.0.0.jar
    • config-1.2.1.jar
    • core-1.1.2.jar
    • curator-client-2.4.0.jar
    • curator-framework-2.4.0.jar
    • curator-recipes-2.4.0.jar
    • gmbal-api-only-3.0.0-b023.jar
    • grizzly-framework-2.1.2.jar
    • grizzly-http-2.1.2.jar
    • grizzly-http-server-2.1.2.jar
    • grizzly-http-servlet-2.1.2.jar
    • grizzly-rcm-2.1.2.jar
    • groovy-all-2.3.7.jar
    • guava-14.0.1.jar
    • guice-3.0.jar
    • hadoop-annotations-2.2.0.jar
    • hadoop-auth-2.2.0.jar
    • hadoop-client-2.2.0.jar
    • hadoop-common-2.2.0.jar
    • hadoop-hdfs-2.2.0.jar
    • hadoop-mapreduce-client-app-2.2.0.jar
    • hadoop-mapreduce-client-common-2.2.0.jar
    • hadoop-mapreduce-client-core-2.2.0.jar
    • hadoop-mapreduce-client-jobclient-2.2.0.jar
    • hadoop-mapreduce-client-shuffle-2.2.0.jar
    • hadoop-yarn-api-2.2.0.jar
    • hadoop-yarn-client-2.2.0.jar
    • hadoop-yarn-common-2.2.0.jar
    • hadoop-yarn-server-common-2.2.0.jar
    • ivy-2.4.0.jar
    • jackson-annotations-2.4.0.jar
    • jackson-core-2.4.4.jar
    • jackson-core-asl-1.8.8.jar
    • jackson-databind-2.4.4.jar
    • jackson-jaxrs-1.8.8.jar
    • jackson-mapper-asl-1.8.8.jar
    • jackson-module-scala_2.10-2.4.4.jar
    • jackson-xc-1.8.8.jar
    • jansi-1.4.jar
    • javax.inject-1.jar
    • javax.servlet-3.0.0.v201112011016.jar
    • javax.servlet-3.1.jar
    • javax.servlet-api-3.0.1.jar
    • jaxb-api-2.2.2.jar
    • jaxb-impl-2.2.3-1.jar
    • jcl-over-slf4j-1.7.10.jar
    • jersey-client-1.9.jar
    • jersey-core-1.9.jar
    • jersey-grizzly2-1.9.jar
    • jersey-guice-1.9.jar
    • jersey-json-1.9.jar
    • jersey-server-1.9.jar
    • jersey-test-framework-core-1.9.jar
    • jersey-test-framework-grizzly2-1.9.jar
    • jets3t-0.7.1.jar
    • jettison-1.1.jar
    • jetty-util-6.1.26.jar
    • jline-0.9.94.jar
    • jline-2.10.4.jar
    • jodd-core-3.6.3.jar
    • json4s-ast_2.10-3.2.10.jar
    • json4s-core_2.10-3.2.10.jar
    • json4s-jackson_2.10-3.2.10.jar
    • jsr305-1.3.9.jar
    • jtransforms-2.4.0.jar
    • jul-to-slf4j-1.7.10.jar
    • kryo-2.21.jar
    • log4j-1.2.17.jar
    • lz4-1.2.0.jar
    • management-api-3.0.0-b012.jar
    • mesos-0.21.0-shaded-protobuf.jar
    • metrics-core-3.1.0.jar
    • metrics-graphite-3.1.0.jar
    • metrics-json-3.1.0.jar
    • metrics-jvm-3.1.0.jar
    • minlog-1.2.jar
    • netty-3.8.0.Final.jar
    • netty-all-4.0.23.Final.jar
    • objenesis-1.2.jar
    • opencsv-2.3.jar
    • oro-2.0.8.jar
    • paranamer-2.6.jar
    • parquet-column-1.6.0rc3.jar
    • parquet-common-1.6.0rc3.jar
    • parquet-encoding-1.6.0rc3.jar
    • parquet-format-2.2.0-rc1.jar
    • parquet-generator-1.6.0rc3.jar
    • parquet-hadoop-1.6.0rc3.jar
    • parquet-jackson-1.6.0rc3.jar
    • protobuf-java-2.4.1.jar
    • protobuf-java-2.5.0-spark.jar
    • py4j-0.8.2.1.jar
    • pyrolite-2.0.1.jar
    • quasiquotes_2.10-2.0.1.jar
    • reflectasm-1.07-shaded.jar
    • scala-compiler-2.10.4.jar
    • scala-library-2.10.4.jar
    • scala-reflect-2.10.4.jar
    • scalap-2.10.4.jar
    • scalatest_2.10-2.2.1.jar
    • slf4j-api-1.7.10.jar
    • slf4j-log4j12-1.7.10.jar
    • snappy-java-1.1.1.7.jar
    • spark-bagel_2.10-1.4.0-SNAPSHOT.jar
    • spark-catalyst_2.10-1.4.0-SNAPSHOT.jar
    • spark-core_2.10-1.4.0-SNAPSHOT.jar
    • spark-graphx_2.10-1.4.0-SNAPSHOT.jar
    • spark-launcher_2.10-1.4.0-SNAPSHOT.jar
    • spark-mllib_2.10-1.4.0-SNAPSHOT.jar
    • spark-network-common_2.10-1.4.0-SNAPSHOT.jar
    • spark-network-shuffle_2.10-1.4.0-SNAPSHOT.jar
    • spark-repl_2.10-1.4.0-SNAPSHOT.jar
    • spark-sql_2.10-1.4.0-SNAPSHOT.jar
    • spark-streaming_2.10-1.4.0-SNAPSHOT.jar
    • spire-macros_2.10-0.7.4.jar
    • spire_2.10-0.7.4.jar
    • stax-api-1.0.1.jar
    • stream-2.7.0.jar
    • tachyon-0.5.0.jar
    • tachyon-client-0.5.0.jar
    • uncommons-maths-1.2.2a.jar
    • unused-1.0.0.jar
    • xmlenc-0.52.jar
    • xz-1.0.jar
    • zookeeper-3.4.5.jar

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30915/
Test PASSed.

@srowen
Member

srowen commented Apr 24, 2015

LGTM. Thank you for your perseverance. This gets the change in with minimal additional change to the build, keeps everything compiling and actually improves the management of one dependency along the way.

I think the large list of removed dependencies above is a false positive. It can't remove these.

Let me merge and let's double check that the other Jenkins builds are still happy.

@calvinjia
Contributor Author

@srowen @aniketbhatnagar
Thanks for reviewing this PR.

@asfgit asfgit closed this in 438859e Apr 24, 2015
jeanlyn pushed a commit to jeanlyn/spark that referenced this pull request May 14, 2015
Author: Calvin Jia

Closes apache#5354 from calvinjia/upgrade_tachyon_0.6.3 and squashes the following commits:

0eefe4d [Calvin Jia] Handle httpclient version in maven dependency management. Remove httpclient version setting from profiles.
7c00dfa [Calvin Jia] Set httpclient version to 4.3.2 for selenium. Specify version of httpclient for sql/hive (previously 4.2.5 transitive dependency of libthrift).
9263097 [Calvin Jia] Merge master to test latest changes
dbfc1bd [Calvin Jia] Use Tachyon 0.6.4 for cleaner dependencies.
e2ff80a [Calvin Jia] Exclude the jetty and curator promoted dependencies from tachyon-client.
a3a29da [Calvin Jia] Update tachyon-client exclusions.
0ae6c97 [Calvin Jia] Change tachyon version to 0.6.3
a204df9 [Calvin Jia] Update make distribution tachyon version.
a93c94f [Calvin Jia] Exclude jackson-mapper-asl from hadoop client since it has a lower version than spark's expected version.
a8a923c [Calvin Jia] Exclude httpcomponents from Tachyon
910fabd [Calvin Jia] Update to master
eed9230 [Calvin Jia] Update tachyon version to 0.6.1.
11907b3 [Calvin Jia] Use TachyonURI for tachyon paths instead of strings.
71bf441 [Calvin Jia] Upgrade Tachyon client version to 0.6.0.
@vnkesarwani

Hi,

I have tried applying the patch on spark-1.3.1. I am getting the following error:

git apply --check /tmp/5354.patch
error: launcher/pom.xml: No such file or directory
error: patch failed: pom.xml:146
error: pom.xml: patch does not apply

The launcher folder is not available when we download the spark-1.3.1 source code.

Any help would be appreciated.

@JoshRosen
Contributor

@vnkesarwani, try cherry-picking the commit then resolving the conflicts.

nemccarthy pushed a commit to nemccarthy/spark that referenced this pull request Jun 19, 2015