[SPARK-24418][Build] Upgrade Scala to 2.11.12 and 2.12.6 #21495

Closed
wants to merge 8 commits

Conversation

dbtsai
Member

@dbtsai dbtsai commented Jun 5, 2018

What changes were proposed in this pull request?

Scala is upgraded to 2.11.12 and 2.12.6.

We used loadFiles() in ILoop as a hook to initialize Spark before the REPL saw any files in Scala 2.11.8. However, it was a hack that was never intended to be a public API, and it was removed in Scala 2.11.12.

Following the discussion with the Scala community in scala/bug#10913, we can use initializeSynchronous to initialize Spark instead. This PR implements the Spark initialization there.
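
Concretely, the new hook boils down to the following override (a condensed sketch assembled from the diff discussed in the review below; imports added here for completeness):

import scala.tools.nsc.Settings
import scala.tools.nsc.interpreter.{IMain, JPrintWriter}

// initializeSynchronous is the earliest post-initialization callback the
// 2.11.12 REPL exposes, so Spark bootstraps itself there.
class SparkILoopInterpreter(settings: Settings, out: JPrintWriter, initializeSpark: () => Unit)
  extends IMain(settings, out) { self =>

  override def initializeSynchronous(): Unit = {
    super.initializeSynchronous()
    initializeSpark()
  }
}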

However, in Scala 2.11.12's ILoop.scala, the first thing def startup() calls is printWelcome(). As a result, Scala calls printWelcome() and starts the splash screen before calling initializeSynchronous.

Thus, the Spark shell lets users type commands first and only then shows the Spark UI URL. It works, but it changes the Spark shell interface as follows.

  apache-spark git:(scala-2.11.12)  ./bin/spark-shell 
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.4.0-SNAPSHOT
      /_/
         
Using Scala version 2.11.12 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_161)
Type in expressions to have them evaluated.
Type :help for more information.

scala> Spark context Web UI available at http://192.168.1.169:4040
Spark context available as 'sc' (master = local[*], app id = local-1528180279528).
Spark session available as 'spark'.


scala> 

It seems there is no easy way to inject the Spark initialization code in the proper place, as Scala doesn't provide a hook. Maybe @som-snytt can comment on this.

The following command was used to update the dependency manifest files.

./dev/test-dependencies.sh --replace-manifest

How was this patch tested?

Existing tests

override def initializeSynchronous(): Unit = {
  super.initializeSynchronous()
  initializeSpark()
}
Member Author

@dbtsai dbtsai Jun 5, 2018

@som-snytt It's working, but I'm wondering if I'm doing it correctly.

In this case, I'll use $intp without checking

if (intp.reporter.hasErrors) {
  echo("Interpreter encountered errors during initialization!")
  null
} 

in ILoop.scala. Of course, I can add the check in our own code, but it doesn't look right to me.

And intp.quietBind(NamedParam[IMain]("$intp", intp)(tagOfIMain, classTag[IMain])) will not be executed before our custom Spark initialization code.

Member

@dongjoon-hyun dongjoon-hyun Jun 7, 2018

In my environment (built with build/sbt), I'm hitting a NoSuchMethodError like the following. Did you see something like this?

~/PR-21495:PR-21495$ bin/spark-shell
18/06/07 23:39:00 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.4.0-SNAPSHOT
      /_/

Using Scala version 2.11.12 (OpenJDK 64-Bit Server VM, Java 1.8.0_171)
Type in expressions to have them evaluated.
Type :help for more information.

scala> Spark context Web UI available at http://localhost:4040
Spark context available as 'sc' (master = local[*], app id = local-1528414746558).
Spark session available as 'spark'.
Exception in thread "main" java.lang.NoSuchMethodError: jline.console.completer.CandidateListCompletionHandler.setPrintSpaceAfterFullCompletion(Z)V
        at scala.tools.nsc.interpreter.jline.JLineConsoleReader.initCompletion(JLineReader.scala:139)
        at scala.tools.nsc.interpreter.jline.InteractiveReader.postInit(JLineReader.scala:54)

Member

It looks like a jline version mismatch:

  • Spark 2.4.0-SNAPSHOT uses jline 2.12.1
  • Scala 2.11.12 uses jline 2.14.3
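
If we bump it, the change could be as small as pinning the override in the build (a hypothetical build.sbt fragment; the exact coordinates are an assumption, not this PR's diff):

// Pin jline to the version Scala 2.11.12's REPL expects.
dependencyOverrides += "jline" % "jline" % "2.14.3"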

Member Author

It works for me; I have scala-compiler-2.11.12.jar in my classpath. Can you do a clean build?

Member

@dongjoon-hyun dongjoon-hyun Jun 8, 2018

Yep. Of course, it's a clean clone and build of this PR. According to the error message and the following, we need jline-2.14.3.jar because Scala calls the setPrintSpaceAfterFullCompletion API from a newer version of jline. Could you confirm this, @som-snytt?

$ java -version
openjdk version "1.8.0_171"
OpenJDK Runtime Environment (build 1.8.0_171-8u171-b11-0ubuntu0.18.04.1-b11)
OpenJDK 64-Bit Server VM (build 25.171-b11, mixed mode)

$ javap -cp jline-2.12.1.jar jline.console.completer.CandidateListCompletionHandler
Compiled from "CandidateListCompletionHandler.java"
public class jline.console.completer.CandidateListCompletionHandler implements jline.console.completer.CompletionHandler {
  public jline.console.completer.CandidateListCompletionHandler();
  public boolean complete(jline.console.ConsoleReader, java.util.List<java.lang.CharSequence>, int) throws java.io.IOException;
  public static void setBuffer(jline.console.ConsoleReader, java.lang.CharSequence, int) throws java.io.IOException;
  public static void printCandidates(jline.console.ConsoleReader, java.util.Collection<java.lang.CharSequence>) throws java.io.IOException;
}

$ javap -cp jline-2.14.3.jar  jline.console.completer.CandidateListCompletionHandler
Compiled from "CandidateListCompletionHandler.java"
public class jline.console.completer.CandidateListCompletionHandler implements jline.console.completer.CompletionHandler {
  public jline.console.completer.CandidateListCompletionHandler();
  public boolean getPrintSpaceAfterFullCompletion();
  public void setPrintSpaceAfterFullCompletion(boolean);
  public boolean isStripAnsi();
  public void setStripAnsi(boolean);
  public boolean complete(jline.console.ConsoleReader, java.util.List<java.lang.CharSequence>, int) throws java.io.IOException;
  public static void setBuffer(jline.console.ConsoleReader, java.lang.CharSequence, int) throws java.io.IOException;
  public static void printCandidates(jline.console.ConsoleReader, java.util.Collection<java.lang.CharSequence>) throws java.io.IOException;
}
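
A quick way to verify which jline actually got loaded is to probe for the missing setter at runtime (a small standalone sketch; assumes jline is on the classpath):

import jline.console.completer.CandidateListCompletionHandler

// Probe for the setter that Scala 2.11.12's JLineConsoleReader calls.
object JLineCheck extends App {
  val hasSetter = classOf[CandidateListCompletionHandler].getMethods
    .exists(_.getName == "setPrintSpaceAfterFullCompletion")
  println(
    if (hasSetter) "jline is new enough (e.g. 2.14.3)"
    else "jline is too old (e.g. 2.12.1): expect NoSuchMethodError")
}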

Member

Can we upgrade to the corresponding jline version together in this PR, @dbtsai ?


Completion was upgraded since the old days; also, other bugs required updating jline. There is interest in upgrading to jline 3.

Member

Thank you for confirming, @som-snytt .

Member Author

@dbtsai dbtsai Jun 8, 2018

Oh, I know why it works for me.

I just ran the following without building a distribution, so jline was automatically resolved to the newer version.

./build/sbt clean package
./bin/spark-shell

@SparkQA

SparkQA commented Jun 5, 2018

Test build #91474 has finished for PR 21495 at commit de790fd.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • class SparkILoopInterpreter(settings: Settings, out: JPrintWriter, initializeSpark: () => Unit)

@viirya
Member

viirya commented Jun 5, 2018

retest this please.

@SparkQA

SparkQA commented Jun 5, 2018

Test build #91476 has finished for PR 21495 at commit de790fd.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • class SparkILoopInterpreter(settings: Settings, out: JPrintWriter, initializeSpark: () => Unit)

@som-snytt

som-snytt commented Jun 5, 2018

Your best bet for synchronous startup is to do everything in printWelcome: createInterpreter, plus your init commands that produce output. IMain has a compiler that is initialized lazily, so I don't think you have to explicitly call intp.initializeSynchronous. createInterpreter will be called again; you'll want to detect that and make it a no-op.
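
A sketch of that approach (hypothetical names, not this PR's code): run all initialization from printWelcome and turn the repeat createInterpreter call into a no-op.

import scala.tools.nsc.interpreter.ILoop

class EagerInitILoop extends ILoop {
  private var interpreterCreated = false

  // Detect the second call from the stock startup path and skip it.
  override def createInterpreter(): Unit =
    if (!interpreterCreated) {
      super.createInterpreter()
      interpreterCreated = true
    }

  override def printWelcome(): Unit = {
    createInterpreter()  // build intp before the banner
    intp.interpret("""println("init command that produces output")""")
    super.printWelcome()
  }
}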

It looks like Scala REPL is moving toward a different architecture in 2.13, with loosely coupled front and back ends.

@dbtsai
Member Author

dbtsai commented Jun 5, 2018

retest this please

@SparkQA

SparkQA commented Jun 5, 2018

Test build #91491 has finished for PR 21495 at commit de790fd.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • class SparkILoopInterpreter(settings: Settings, out: JPrintWriter, initializeSpark: () => Unit)

@dongjoon-hyun
Member

Retest this please.

@dbtsai
Member Author

dbtsai commented Jun 8, 2018

@som-snytt Initializing it in printWelcome will not work since, in older versions of Scala, printWelcome is the last thing to be executed.

@som-snytt

som-snytt commented Jun 8, 2018

@dbtsai 2.11 looks similar to 2.12. Do you mean you want the same technique on 2.10? I would not expect to find a single hook for all versions.

Edit: sorry, I see it was a mid-2.11 change.

@SparkQA

SparkQA commented Jun 8, 2018

Test build #91538 has finished for PR 21495 at commit de790fd.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • class SparkILoopInterpreter(settings: Settings, out: JPrintWriter, initializeSpark: () => Unit)

@SparkQA

SparkQA commented Jun 8, 2018

Test build #91541 has finished for PR 21495 at commit f91d75a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

-class SparkILoopInterpreter(settings: Settings, out: JPrintWriter) extends IMain(settings, out) { self =>
+class SparkILoopInterpreter(settings: Settings, out: JPrintWriter, initializeSpark: () => Unit)
+    extends IMain(settings, out) { self =>
Member

nit: two spaces

Member Author

I thought for extends, it's four spaces?

Contributor

IIRC, four spaces is OK.


It's definitely two spaces after a period. I've been wanting to make that joke, but held off.

Member

@HyukjinKwon HyukjinKwon Jun 20, 2018

Guys, parameters are indented 4 spaces and the keywords that follow are lined up with 2 spaces, as written in https://github.com/databricks/scala-style-guide#spacing-and-indentation and as sketched below.

The two-line case isn't explicitly written down, but wouldn't we be better off sticking to the example as much as we can? "Use 2-space indentation in general."
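
For reference, a sketch of the convention being cited (illustrative only, not this PR's final diff):

import scala.tools.nsc.Settings
import scala.tools.nsc.interpreter.{IMain, JPrintWriter}

class SparkILoopInterpreter(
    settings: Settings,  // constructor parameters: 4-space indent
    out: JPrintWriter)
  extends IMain(settings, out) {  // keywords after the parameter list: 2-space indent
  // ...
}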

Contributor

Well, there's a bunch of code in Spark where both 4-space and 2-space indentation exist. In my view it is not worth sticking on such a tiny point.

If you want to fix it, create a separate PR to fix all of them.

Member

We can fix them in place. There are many other style nits too; it's not worth sweeping through them all and making backports harder.

I am fine with ignoring such nits, and they don't block this PR, but that's not the same as saying the opposite style is okay.

Contributor

Really not worth blocking on this (if it is the only issue).

@SparkQA

SparkQA commented Jun 8, 2018

Test build #91561 has finished for PR 21495 at commit 631ef48.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jun 8, 2018

Test build #91559 has finished for PR 21495 at commit c1ffd0b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jun 8, 2018

Test build #91579 has finished for PR 21495 at commit 96e87c2.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jun 9, 2018

Test build #91580 has finished for PR 21495 at commit 4c852fa.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@jerryshao
Contributor

jerryshao commented Jun 11, 2018

Having issues testing with the latest patch:

NVM, I was just using the old patch...

@dbtsai
Member Author

dbtsai commented Jun 11, 2018

I decided to remove the hack I put in to keep the Spark UI output consistent, because that hack would bring in more problems.

@som-snytt Is it possible to move the printWelcome and splash.start() to some place after initializeSynchronous() is called?

@SparkQA

SparkQA commented Jun 11, 2018

Test build #91650 has finished for PR 21495 at commit 82ca5f6.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jun 25, 2018

Test build #92292 has finished for PR 21495 at commit 82ca5f6.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Contributor

@jerryshao jerryshao left a comment

LGTM.

Member

@felixcheung felixcheung left a comment

LGTM

@felixcheung
Member

db might be out for a bit. We should merge it if it looks good and tests pass?

@jerryshao
Contributor

OK, I'm going to merge it. We can fix any follow-up issues if they exist.

@HyukjinKwon
Member

LGTM too.

@asfgit asfgit closed this in c7967c6 Jun 26, 2018
@dbtsai
Member Author

dbtsai commented Jun 27, 2018

I was on family leave for a couple of weeks. Thank you all for helping out and merging it.

The only change with this PR is that the welcome message is printed first, and the Spark URL is shown later. It's a minor difference.

I had an offline discussion with @adriaanm from the Scala community. To overcome this issue, he suggested we override the entire def process(settings: Settings): Boolean to place our initialization code. I have a working implementation (roughly sketched below), but it copies a lot of code from Scala just to get the printing order right. If we decide we want a consistent printing order, I can submit a separate PR for it.
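
The shape of that override would be something like this (a hypothetical sketch, not the actual follow-up PR; the stock process does considerably more):

import java.io.BufferedReader
import scala.tools.nsc.Settings
import scala.tools.nsc.interpreter.{ILoop, JPrintWriter}

class OrderedSparkILoop(in0: Option[BufferedReader], out: JPrintWriter)
    extends ILoop(in0, out) {

  private def initializeSpark(): Unit = ()  // placeholder for the real bootstrap

  override def process(settings: Settings): Boolean = {
    this.settings = settings   // mirrors what the stock ILoop does first
    createInterpreter()
    initializeSpark()          // Spark UI URL and sc/spark banners print here
    printWelcome()             // the Scala banner comes after Spark is up
    try { loop(); true }
    finally closeInterpreter()
  }
}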

He is open to working with us to add a proper hook in Scala so we don't need to use hacks to initialize our code.

Thanks.

@dbtsai dbtsai deleted the scala-2.11.12 branch June 27, 2018 15:21
@felixcheung
Member

felixcheung commented Jun 27, 2018 via email

@jerryshao
Contributor

The only change with this PR is that the welcome message is printed first, and the Spark URL is shown later. It's a minor difference.

I think we should create a JIRA to track this change and also evaluate the necessity of fixing this issue.

If this behavior change affects some users, maybe we should move the target version to 3.0.0? @dbtsai @felixcheung

@felixcheung
Member

felixcheung commented Jun 28, 2018 via email

@dongjoon-hyun
Member

In addition to that, it would be great if we fix sbt soon. After this PR, mvn works correctly, but sbt is still hitting NoSuchMethodError on the master branch.

$ ./build/sbt -Pyarn -Phadoop-2.7 -Phadoop-cloud -Phive -Phive-thriftserver -Psparkr test:package
$ bin/spark-shell
scala> Spark context Web UI available at http://localhost:4040
Spark context available as 'sc' (master = local[*], app id = local-1530385877441).
Spark session available as 'spark'.
Exception in thread "main" java.lang.NoSuchMethodError: jline.console.completer.CandidateListCompletionHandler.setPrintSpaceAfterFullCompletion(Z)V

@jerryshao
Contributor

@dongjoon-hyun Can you please create a JIRA to track this issue?

@dongjoon-hyun
Member

Sure. https://issues.apache.org/jira/browse/SPARK-24715 is created now.

@dbtsai
Member Author

dbtsai commented Jul 11, 2018

A JIRA and a PR have been created to make sure the messages are printed in the right order.

https://issues.apache.org/jira/browse/SPARK-24785

#21749

curtishoward pushed a commit to twosigma/spark that referenced this pull request Jul 26, 2018
* Upgrade Scala to 2.11.12

- Modifies slightly the Spark REPL code to reflect internal changes in Scala tooling
- This code was ported from apache#21495

(cherry picked from commit 3e52a9160875ec5c145c4e9fa0106ff7d1f380b2)
asfgit pushed a commit that referenced this pull request Aug 22, 2018
…elcome message

## What changes were proposed in this pull request?

After #21495 the welcome message is printed first, and then Scala prompt will be shown before the Spark UI info is printed.

Although it's a minor issue, but visually, it doesn't look as nice as the existing behavior. This PR intends to fix it by duplicating the Scala `process` code to arrange the printing order. However, one variable is private, so reflection has to be used which is not desirable.

We can use this PR to brainstorm how to handle it properly and how Scala can change their APIs to fit our need.

## How was this patch tested?

Existing test

Closes #21749 from dbtsai/repl-followup.

Authored-by: DB Tsai <[email protected]>
Signed-off-by: DB Tsai <[email protected]>
curtishoward pushed a commit to twosigma/spark that referenced this pull request Aug 27, 2018
* Upgrade Scala to 2.11.12

- Modifies slightly the Spark REPL code to reflect internal changes in Scala tooling
- This code was ported from apache#21495

(cherry picked from commit 3e52a9160875ec5c145c4e9fa0106ff7d1f380b2)
(cherry picked from commit f5a3901)
curtishoward pushed a commit to twosigma/spark that referenced this pull request Aug 28, 2018
* Upgrade Scala to 2.11.12

- Modifies slightly the Spark REPL code to reflect internal changes in Scala tooling
- This code was ported from apache#21495

(cherry picked from commit 3e52a9160875ec5c145c4e9fa0106ff7d1f380b2)
(cherry picked from commit f5a3901)
(cherry picked from commit a72488d)
Willymontaz pushed a commit to Willymontaz/spark that referenced this pull request Sep 26, 2018
curtishoward pushed a commit to curtishoward/spark that referenced this pull request Oct 8, 2018
* Upgrade Scala to 2.11.12

- Modifies slightly the Spark REPL code to reflect internal changes in Scala tooling
- This code was ported from apache#21495

(cherry picked from commit 3e52a9160875ec5c145c4e9fa0106ff7d1f380b2)
(cherry picked from commit f5a3901)
(cherry picked from commit a72488d)
Willymontaz pushed a commit to Willymontaz/spark that referenced this pull request Jun 6, 2019
Willymontaz added a commit to criteo-forks/spark that referenced this pull request Jun 12, 2019