[SPARK-5307] SerializationDebugger to help debug NotSerializableException #4093

Closed
wants to merge 3 commits into from

@rxin rxin commented Jan 18, 2015

This patch adds a SerializationDebugger that attaches more information to a NotSerializableException. When a NotSerializableException is encountered, the debugger tries to serialize the object one more time through a DebugStream that hooks into the internals of ObjectOutputStream to capture the serialization stack. Because this second pass runs only after a failure, SerializationDebugger adds no overhead to successful serialization, unlike setting the sun.io.serialization.extendedDebugInfo flag.

An example output looks like this:

org.apache.spark.serializer.NotSerializableClass
    Serialization stack (3):
    - org.apache.spark.serializer.NotSerializableClass@5e20dc10 (class org.apache.spark.serializer.NotSerializableClass)
    - org.apache.spark.serializer.SerializableClass2@521fb14e (class org.apache.spark.serializer.SerializableClass2)
    - org.apache.spark.serializer.SerializableClass1@5f54e92c (class org.apache.spark.serializer.SerializableClass1)
    Run the JVM with sun.io.serialization.extendedDebugInfo for more information.
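To make the "serialize once more and record what was visited" idea concrete, here is a minimal sketch. It is not the DebugStream from this patch: instead of hooking into ObjectOutputStream internals it uses the public `enableReplaceObject`/`replaceObject` hook, and the names `SerializationTrailSketch`, `TrailStream`, `describeFailure`, `Top`, `Middle`, and `NotSerializableLeaf` are all made up for illustration. It records objects in visitation order, which matches the containment stack only for a simple chain like the one below; siblings that serialized successfully earlier would also show up in the trail.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream, OutputStream}
import scala.collection.mutable.ArrayBuffer

object SerializationTrailSketch {

  // Hypothetical stand-in for the patch's DebugStream: it records every object
  // that ObjectOutputStream hands to the public replaceObject hook.
  class TrailStream(out: OutputStream) extends ObjectOutputStream(out) {
    val trail = ArrayBuffer.empty[AnyRef]
    enableReplaceObject(true)

    override protected def replaceObject(obj: AnyRef): AnyRef = {
      trail += obj // remember the object, then serialize it unchanged
      obj
    }
  }

  // Hypothetical sample classes: Top -> Middle -> NotSerializableLeaf.
  class NotSerializableLeaf
  class Middle(val leaf: NotSerializableLeaf) extends Serializable
  class Top(val middle: Middle) extends Serializable

  // Serializes `root` into an in-memory buffer. Returns None on success, or a
  // message listing the visited objects (most recent first) on failure.
  def describeFailure(root: AnyRef): Option[String] = {
    val stream = new TrailStream(new ByteArrayOutputStream())
    try {
      stream.writeObject(root)
      None
    } catch {
      case e: NotSerializableException =>
        val lines = stream.trail.reverse.map(o => s"\t- $o (class ${o.getClass.getName})")
        Some(e.getMessage + "\n" +
          s"\tSerialization trail (${lines.size}):\n" +
          lines.mkString("\n"))
    }
  }
}
```

Calling `SerializationTrailSketch.describeFailure(new Top(new Middle(new NotSerializableLeaf)))` yields a message in the same shape as the example output: the object that could not be serialized comes first, and each following line is an object visited before it, ending with the root that was passed in.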

When sun.io.serialization.extendedDebugInfo is turned on, this debugger adds no further information. Note that sun.io.serialization.extendedDebugInfo can also show field names, which are harder for the SerializationDebugger to obtain (technically possible with reflection, but fairly convoluted).
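The "step aside when the flag is on" behavior can be sketched as below. This is only an illustration under assumptions, not code from the patch: `ExtendedDebugInfoSketch`, `extendedDebugInfoEnabled`, and `improveMessage` are hypothetical names. The property itself only helps if it is passed when the JVM starts, for example as `-Dsun.io.serialization.extendedDebugInfo=true`.

```scala
object ExtendedDebugInfoSketch {
  // Property name taken from the PR description.
  val FlagName = "sun.io.serialization.extendedDebugInfo"

  // True when the system property is set to "true" (case-insensitive).
  def extendedDebugInfoEnabled: Boolean = java.lang.Boolean.getBoolean(FlagName)

  // Hypothetical helper: append the extra trail only when the JVM flag is off,
  // since the JVM already adds its own debug information when it is on.
  def improveMessage(original: String, trail: => String): String =
    if (extendedDebugInfoEnabled) original else original + "\n" + trail
}
```

The `trail` parameter is by-name so that the second serialization pass, which produces it, is skipped entirely when the flag is enabled.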

nse.getMessage + "\n" +
s"\tSerialization stack (${out.stack.size}):\n" +
out.stack.map(o => s"\t- $o (class ${o.getClass.getName})").mkString("\n") + "\n" +
"\tRun the JVM with sun.io.serialization.extendedDebugInfo for more information.")
Contributor

Maybe -Dsun.io.serialization.extendedDebugInfo?

Contributor Author

It is actually "-Dsun.io.serialization.extendedDebugInfo=true". Kinda long ...

SparkQA commented Jan 18, 2015

Test build #25716 has finished for PR 4093 at commit bde6512.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • out.stack.map(o => s" - $o (class $

@liancheng
Contributor

This LGTM.

SparkQA commented Jan 18, 2015

Test build #25715 timed out for PR 4093 at commit f7e6320 after a configured wait of 120m.

SparkQA commented Jan 18, 2015

Test build #25719 has finished for PR 4093 at commit beed86d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • out.stack.map(o => s" - $o (class $

rxin commented Jan 28, 2015

Closing this one in favor of #4098

@rxin rxin closed this Jan 28, 2015

geofflangenderfer commented Nov 18, 2022

Could someone give a simple example of how to read the graph? I'm not sure where to start.

https://stackoverflow.com/questions/74487510/sparkexception-task-not-serializable-how-to-read-serialization-stack
