[BUG] NoClassDefFoundError for spark-avro when the Plugin is deployed via extraClassPath/driver-class-path #5758

Open
gerashegalov opened this issue Jun 6, 2022 · 1 comment
Labels: bug, reliability

gerashegalov commented Jun 6, 2022

Describe the bug
The plugin crashes with java.lang.NoClassDefFoundError: org/apache/spark/sql/v2/avro/AvroScan when the plugin jar(s) are deployed via extraClassPath or --driver-class-path instead of the --jars or --packages spark-submit options.

Steps/Code to reproduce bug
Invoke pyspark with --driver-class-path:

pyspark --driver-class-path dist/target/rapids-4-spark_2.12-22.08.0-SNAPSHOT-cuda11.jar \
  --packages org.apache.spark:spark-avro_2.12:3.2.1 \
  --conf spark.rapids.sql.enabled=true \
  --conf spark.plugins=com.nvidia.spark.SQLPlugin

22/06/06 15:56:02 ERROR RapidsExecutorPlugin: Exception in the executor plugin, shutting down!
java.lang.BootstrapMethodError: java.lang.NoClassDefFoundError: org/apache/spark/sql/v2/avro/AvroScan
        at org.apache.spark.sql.rapids.ExternalSource$.getScans(ExternalSource.scala:128)
        at com.nvidia.spark.rapids.GpuOverrides$.<init>(GpuOverrides.scala:3555)
        at com.nvidia.spark.rapids.GpuOverrides$.<clinit>(GpuOverrides.scala)
        at com.nvidia.spark.rapids.TypeChecks$.areTimestampsSupported(TypeChecks.scala:797)
        at com.nvidia.spark.rapids.RapidsExecutorPlugin.init(Plugin.scala:218)
        at org.apache.spark.internal.plugin.ExecutorPluginContainer.$anonfun$executorPlugins$1(PluginContainer.scala:125)
        at scala.collection.TraversableLike.$anonfun$flatMap$1(TraversableLike.scala:293)
        at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
        at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
        at scala.collection.TraversableLike.flatMap(TraversableLike.scala:293)
        at scala.collection.TraversableLike.flatMap$(TraversableLike.scala:290)
        at scala.collection.AbstractTraversable.flatMap(Traversable.scala:108)
        at org.apache.spark.internal.plugin.ExecutorPluginContainer.<init>(PluginContainer.scala:113)
        at org.apache.spark.internal.plugin.PluginContainer$.apply(PluginContainer.scala:211)
        at org.apache.spark.internal.plugin.PluginContainer$.apply(PluginContainer.scala:199)
        at org.apache.spark.executor.Executor.$anonfun$plugins$1(Executor.scala:253)
        at org.apache.spark.util.Utils$.withContextClassLoader(Utils.scala:231)
        at org.apache.spark.executor.Executor.<init>(Executor.scala:253)
        at org.apache.spark.scheduler.local.LocalEndpoint.<init>(LocalSchedulerBackend.scala:64)
        at org.apache.spark.scheduler.local.LocalSchedulerBackend.start(LocalSchedulerBackend.scala:132)
        at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:220)
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:581)
        at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)
        at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
        at py4j.Gateway.invoke(Gateway.java:238)
        at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
        at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
        at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:182)
        at py4j.ClientServerConnection.run(ClientServerConnection.java:106)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NoClassDefFoundError: org/apache/spark/sql/v2/avro/AvroScan
        ... 36 more
Caused by: java.lang.ClassNotFoundException: org.apache.spark.sql.v2.avro.AvroScan
        at java.net.URLClassLoader.findClass(URLClassLoader.java:387)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
        ... 36 more
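
The final ClassNotFoundException above is raised by sun.misc.Launcher$AppClassLoader, the JVM's application class loader. One plausible reading, which this issue does not confirm, is that the jar passed via --driver-class-path is defined by that loader, while jars resolved via --packages are only visible to a child class loader that Spark creates, and a class defined by a parent loader cannot resolve classes that only a child can see. The sketch below is a generic JVM illustration of that visibility rule, not plugin code; DelegationDemo and canLoad are hypothetical names introduced here.

```java
public class DelegationDemo {

    // Returns true if `loader` (or one of its parents) can resolve `name`.
    // A null loader means the bootstrap class loader.
    static boolean canLoad(ClassLoader loader, String name) {
        try {
            Class.forName(name, false, loader);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The loader that defined this class plays the role of the "child".
        ClassLoader child = DelegationDemo.class.getClassLoader();
        // Its parent plays the role of the loader that cannot see child-only classes.
        ClassLoader parent = child.getParent();
        String childOnlyClass = DelegationDemo.class.getName();

        // Lookups delegate upward, so the child sees its own and its parent's classes.
        System.out.println("child sees its own class:    " + canLoad(child, childOnlyClass));
        System.out.println("child sees a parent's class: " + canLoad(child, "java.lang.String"));

        // Lookups never delegate downward, so the parent cannot see the child's class.
        System.out.println("parent sees child's class:   " + canLoad(parent, childOnlyClass));
    }
}
```

If the assumption holds, the last lookup is the analogue of the failing AvroScan lookup: the parent loader is asked for a class that only its child can provide, and it is expected to report false.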

Expected behavior
Deploying the plugin via --driver-class-path should work just like deploying it via --jars:

pyspark --jars dist/target/rapids-4-spark_2.12-22.08.0-SNAPSHOT-cuda11.jar \
  --packages org.apache.spark:spark-avro_2.12:3.2.1 \
  --conf spark.rapids.sql.enabled=true \
  --conf spark.plugins=com.nvidia.spark.SQLPlugin

The --jars invocation loads both the plugin and Avro correctly.

Environment details (please complete the following information)
any

Additional context
N/A

gerashegalov added the bug and ? - Needs Triage labels on Jun 6, 2022
gerashegalov removed the ? - Needs Triage label on Jun 6, 2022
gerashegalov added this to the Jun 6 - Jun 17 milestone on Jun 6, 2022
gerashegalov assigned razajafri and unassigned razajafri on Jun 6, 2022
sameerz removed this from the Jun 6 - Jun 17 milestone on Jun 18, 2022
sameerz added the reliability label on Jul 12, 2022

razajafri commented

I am going to unassign myself until this is a higher priority.
