[SPARK-12120][PYSPARK] Improve exception message when failing to init… #10126

zjffdu · 2015-12-03T09:02:49Z

…ialize HiveContext in PySpark

@davies Could you review this?

This is the error message after this PR:

15/12/03 16:59:53 WARN ObjectStore: Failed to get database default, returning NoSuchObjectException
/Users/jzhang/github/spark/python/pyspark/sql/context.py:689: UserWarning: You must build Spark with Hive. Export 'SPARK_HIVE=true' and run build/sbt assembly
  warnings.warn("You must build Spark with Hive. "
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/jzhang/github/spark/python/pyspark/sql/context.py", line 663, in read
    return DataFrameReader(self)
  File "/Users/jzhang/github/spark/python/pyspark/sql/readwriter.py", line 56, in __init__
    self._jreader = sqlContext._ssql_ctx.read()
  File "/Users/jzhang/github/spark/python/pyspark/sql/context.py", line 692, in _ssql_ctx
    raise e
py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.sql.hive.HiveContext.
: java.lang.RuntimeException: java.net.ConnectException: Call From jzhangMBPr.local/127.0.0.1 to 0.0.0.0:9000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
    at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:522)
    at org.apache.spark.sql.hive.client.ClientWrapper.<init>(ClientWrapper.scala:194)
    at org.apache.spark.sql.hive.client.IsolatedClientLoader.createClient(IsolatedClientLoader.scala:238)
    at org.apache.spark.sql.hive.HiveContext.executionHive$lzycompute(HiveContext.scala:218)
    at org.apache.spark.sql.hive.HiveContext.executionHive(HiveContext.scala:208)
    at org.apache.spark.sql.hive.HiveContext.functionRegistry$lzycompute(HiveContext.scala:462)
    at org.apache.spark.sql.hive.HiveContext.functionRegistry(HiveContext.scala:461)
    at org.apache.spark.sql.UDFRegistration.<init>(UDFRegistration.scala:40)
    at org.apache.spark.sql.SQLContext.<init>(SQLContext.scala:330)
    at org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:90)
    at org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:101)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:234)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:381)
    at py4j.Gateway.invoke(Gateway.java:214)
    at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:79)
    at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:68)
    at py4j.GatewayConnection.run(GatewayConnection.java:209)
    at java.lang.Thread.run(Thread.java:745)
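
The trace above shows the build hint first and the original `Py4JJavaError` after it, so the real cause (here a refused connection) is no longer hidden behind a generic message. A minimal sketch of that pattern follows. It is not the code in `context.py`: `make_hive_context` and `get_context` are hypothetical names, and a plain `RuntimeError` stands in for the Py4J error.

```python
import warnings


def make_hive_context():
    """Hypothetical stand-in for the JVM call that constructs HiveContext."""
    raise RuntimeError("java.net.ConnectException: Connection refused")


def get_context(factory=make_hive_context):
    """Call the factory; on failure, emit a hint and re-raise the original error."""
    try:
        return factory()
    except Exception:
        # The hint is only a guess at the cause, so it accompanies the real
        # exception instead of replacing it.
        warnings.warn("You must build Spark with Hive. "
                      "Export 'SPARK_HIVE=true' and run build/sbt assembly")
        raise  # a bare raise keeps the original exception and its traceback
```

Calling `get_context()` in this sketch emits the hint as a `UserWarning` and then propagates the `RuntimeError` unchanged, which mirrors the order of messages in the trace.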

SparkQA · 2015-12-03T09:48:43Z

Test build #47132 has finished for PR 10126 at commit 1878c70.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

JoshRosen · 2016-01-13T20:34:21Z

python/pyspark/sql/context.py

-            raise Exception("You must build Spark with Hive. "
-                            "Export 'SPARK_HIVE=true' and run "
-                            "build/sbt assembly", e)
+            warnings.warn("You must build Spark with Hive. "


Python warnings can be disabled, so I'm worried that a lot of users might not end up seeing this message in that case. Also, this is more of an error message than a warning. Therefore, I think we should change this to a print statement (if we had a good logging story in PySpark, I'd say to log it as an error instead).
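
The concern about disabled warnings can be reproduced with a short sketch. It is an illustration only, not PySpark code: `hint_via_warning`, `hint_via_print`, and `captured_output` are hypothetical helpers, and the `"ignore"` filter stands in for a user running with warnings turned off.

```python
import contextlib
import io
import warnings

HINT = "You must build Spark with Hive."


def hint_via_warning():
    """Emit the hint through the warnings machinery."""
    warnings.warn(HINT)


def hint_via_print():
    """Emit the hint with a plain print."""
    print(HINT)


def captured_output(func):
    """Run func with all warnings ignored; return what reached stdout or stderr."""
    buffer = io.StringIO()
    with warnings.catch_warnings(), \
            contextlib.redirect_stdout(buffer), \
            contextlib.redirect_stderr(buffer):
        warnings.simplefilter("ignore")  # the user has disabled warnings
        func()
    return buffer.getvalue()


print(repr(captured_output(hint_via_warning)))
print(repr(captured_output(hint_via_print)))
```

With warnings ignored, the first call produces no output at all, while the printed hint still appears.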

JoshRosen · 2016-01-13T20:35:52Z

This looks like a good improvement, but I had two really minor suggestions. If you take care of those, I'll get this merged quickly. Thanks for fixing this.

zjffdu · 2016-01-13T21:38:16Z

Thanks @JoshRosen. Today is busy for me; I will update the patch tonight or tomorrow.

JoshRosen · 2016-01-13T22:06:35Z

Yeah, no huge rush. Any time in the next couple of days is fine.

JoshRosen · 2016-01-19T07:01:37Z

Ping. No rush but just wanted to bump this back up in the PR review list.

SparkQA · 2016-01-20T00:34:17Z

Test build #49730 has finished for PR 10126 at commit b1d404e.

This patch fails Python style tests.
This patch merges cleanly.
This patch adds no public classes.

zjffdu · 2016-01-20T00:52:53Z

It seems the build failure is not related.

JoshRosen · 2016-01-20T00:55:10Z

The failure is related:

./python/pyspark/sql/context.py:576:27: E127 continuation line over-indented for visual indent

This corresponds to 69955c9#diff-74ba016ef40c1cb268e14aee817d71bdR575

JoshRosen · 2016-01-20T00:55:32Z

python/pyspark/sql/context.py

-                            "Export 'SPARK_HIVE=true' and run "
-                            "build/sbt assembly", e)
+            print("You must build Spark with Hive. "
+                          "Export 'SPARK_HIVE=true' and run "


This line is overindented. The leftmost quote needs to line up with the one on the previous line.

Thanks, I looked at the wrong line.
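
The alignment rule behind the E127 failure can be illustrated with a short sketch. It reuses the message text from the diff but is not the merged code; `build_hint` is a hypothetical helper added so the result can be checked.

```python
def build_hint():
    """Return the hint text, split across aligned continuation lines."""
    # Each continuation line starts in the column just after the opening
    # parenthesis, so every opening quote sits directly under the first one.
    return ("You must build Spark with Hive. "
            "Export 'SPARK_HIVE=true' and run "
            "build/sbt assembly")


# The same visual indent applies when the strings are passed to print().
print("You must build Spark with Hive. "
      "Export 'SPARK_HIVE=true' and run "
      "build/sbt assembly")
```

Adjacent string literals are concatenated by the parser, so the indentation of the continuation lines affects only style checks, not the resulting message.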

SparkQA · 2016-01-20T02:18:30Z

Test build #49745 has finished for PR 10126 at commit 180fb9b.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

JoshRosen · 2016-01-24T20:28:09Z

Thanks for bringing this up to date. I'm going to merge this into master and branch-1.6.

Author: Jeff Zhang
Closes #10126 from zjffdu/SPARK-12120.
(cherry picked from commit e789b1d)
Signed-off-by: Josh Rosen

[SPARK-12120][PYSPARK] Improve exception message when failing to init…

1878c70

…ialize HiveContext in PySpark

JoshRosen reviewed Jan 13, 2016
Address review comments

b1d404e

JoshRosen reviewed Jan 20, 2016
fix style issue

180fb9b

asfgit closed this in e789b1d Jan 24, 2016
