Bump artifacts to latest release #179
Conversation
@ash211 what do you think? Is there a checkstyle test to verify things?
All the versions got bigger 👍
This might introduce more merge conflicts as we pull in from upstream -- I'm not entirely sure the benefit of newness (I don't think there's a specific bug in each lib driving the upgrade) is worth that.
```java
import io.netty.channel.ChannelOutboundHandlerAdapter;
import io.netty.channel.ChannelPromise;
import io.netty.channel.FileRegion;
import io.netty.util.AbstractReferenceCounted;
```
these import reorderings probably don't pass checkstyle
they do
c85218d to 1ffa236 (Compare)
Had to slightly tweak the grammar to fail when parsing a schema leaves some unconsumed characters. I actually don't understand why the test case of …
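A minimal sketch of the failure mode described above, under the assumption that the tweaked grammar now rejects unconsumed trailing characters; `DataType.fromDDL` is used purely as an illustration of schema parsing and is not necessarily the code path this change touches:

```scala
// Illustration only (assumed behavior): with a grammar that fails on
// unconsumed characters, trailing text after a valid schema string should
// raise a ParseException instead of being silently ignored.
import org.apache.spark.sql.types.DataType

val parsed = DataType.fromDDL("struct<a: int, b: string>") // parses cleanly
println(parsed.simpleString)                               // struct<a:int,b:string>

// Expected to fail under the stricter grammar (unconsumed trailing text):
// DataType.fromDDL("struct<a: int, b: string> trailing")
```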
…eExecutionEnabled

### What changes were proposed in this pull request?

This PR makes `repartition`/`DISTRIBUTE BY` obey [initialPartitionNum](https://github.com/apache/spark/blob/af4248b2d661d04fec89b37857a47713246d9465/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala#L446-L455) when adaptive execution is enabled.

### Why are the changes needed?

To make `DISTRIBUTE BY`/`GROUP BY` partition with the same partition number.

How to reproduce:

```scala
spark.sql("CREATE TABLE spark_31220(id int)")
spark.sql("set spark.sql.adaptive.enabled=true")
spark.sql("set spark.sql.adaptive.coalescePartitions.initialPartitionNum=1000")
```

Before this PR:

```
scala> spark.sql("SELECT id from spark_31220 GROUP BY id").explain
== Physical Plan ==
AdaptiveSparkPlan(isFinalPlan=false)
+- HashAggregate(keys=[id#5], functions=[])
   +- Exchange hashpartitioning(id#5, 1000), true, [id=#171]
      +- HashAggregate(keys=[id#5], functions=[])
         +- FileScan parquet default.spark_31220[id#5]

scala> spark.sql("SELECT id from spark_31220 DISTRIBUTE BY id").explain
== Physical Plan ==
AdaptiveSparkPlan(isFinalPlan=false)
+- Exchange hashpartitioning(id#5, 200), false, [id=#179]
   +- FileScan parquet default.spark_31220[id#5]
```

After this PR:

```
scala> spark.sql("SELECT id from spark_31220 GROUP BY id").explain
== Physical Plan ==
AdaptiveSparkPlan(isFinalPlan=false)
+- HashAggregate(keys=[id#5], functions=[])
   +- Exchange hashpartitioning(id#5, 1000), true, [id=#171]
      +- HashAggregate(keys=[id#5], functions=[])
         +- FileScan parquet default.spark_31220[id#5]

scala> spark.sql("SELECT id from spark_31220 DISTRIBUTE BY id").explain
== Physical Plan ==
AdaptiveSparkPlan(isFinalPlan=false)
+- Exchange hashpartitioning(id#5, 1000), false, [id=#179]
   +- FileScan parquet default.spark_31220[id#5]
```

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

Unit test.

Closes apache#27986 from wangyum/SPARK-31220.

Authored-by: Yuming Wang <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
(cherry picked from commit 1d1eacd)
Signed-off-by: Wenchen Fan <[email protected]>
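As a hedged follow-up to the reproduction above, a quick programmatic check of the same behavior; the table name `spark_31220` and config values come from the steps in the commit message, and the string assertions on the plan are an assumption about its exact formatting:

```scala
// Sketch: after the fix, DISTRIBUTE BY under adaptive execution should plan
// its Exchange with initialPartitionNum (1000) rather than the default 200.
spark.sql("set spark.sql.adaptive.enabled=true")
spark.sql("set spark.sql.adaptive.coalescePartitions.initialPartitionNum=1000")

val plan = spark.sql("SELECT id FROM spark_31220 DISTRIBUTE BY id")
  .queryExecution.executedPlan.toString

assert(plan.contains("hashpartitioning"), "expected a shuffle Exchange")
assert(plan.contains("1000"), "expected the initial partition number, not 200")
```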
Needs some offline testing. Fixes #174.