[SPARK-25174][YARN] Limit the size of diagnostic message for am to unregister itself from rm

## What changes were proposed in this pull request?

With older Spark releases, one use case generated a huge code-gen file that hit the limit `Constant pool has grown past JVM limit of 0xFFFF`. In this situation the application should fail immediately. However, the diagnostic message sent to the RM was too large: the ApplicationMaster hung and the RM's ZKStateStore crashed. In Spark 2.3 and later the code-gen limitation has been removed, but other uncaught exceptions that carry oversized error messages could still cause the same problem.

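This PR aims to cut down the diagnostic message size by abbreviating the final message to a configurable limit, `spark.yarn.am.finalMessageLimit` (default `1m`).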
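
To illustrate the truncation described here, the following is a minimal, self-contained sketch. The `abbreviate` helper is a simplified, hypothetical stand-in for the commons-lang3 `StringUtils.abbreviate(str, maxWidth)` call used in the patch, and the oversized message is made up for illustration; it is not the actual ApplicationMaster code.

```scala
object AbbreviateSketch {
  // Simplified stand-in (an assumption, not the library source) for
  // commons-lang3 StringUtils.abbreviate(str, maxWidth): a message that fits
  // is returned unchanged, otherwise the first (maxWidth - 3) characters are
  // kept and "..." is appended, so the result is exactly maxWidth long.
  def abbreviate(msg: String, maxWidth: Int): String = {
    if (msg == null || msg.isEmpty) {
      msg
    } else {
      require(maxWidth >= 4, "maxWidth must be at least 4 to leave room for the ellipsis")
      if (msg.length <= maxWidth) msg
      else msg.substring(0, maxWidth - 3) + "..."
    }
  }

  def main(args: Array[String]): Unit = {
    val huge = "x" * 5000000 // illustrative oversized diagnostic message
    val limit = 1024 * 1024  // 1m, the default limit in this patch
    val finalMsg = abbreviate(huge, limit)
    println(finalMsg.length) // prints 1048576
  }
}
```

Note that a limit below 4 is rejected in this sketch, mirroring the library's need for room for the `...` marker, so very small values of the setting are best avoided.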

## How was this patch tested?

Please review http://spark.apache.org/contributing.html before opening a pull request.

Closes apache#22180 from yaooqinn/SPARK-25174.

Authored-by: Kent Yao
Signed-off-by: Marcelo Vanzin
yaooqinn authored and Marcelo Vanzin committed Aug 24, 2018
1 parent 8bb9414 commit f8346d2
Showing 2 changed files with 9 additions and 2 deletions.
@@ -19,7 +19,7 @@ package org.apache.spark.deploy.yarn

 import java.io.{File, IOException}
 import java.lang.reflect.{InvocationTargetException, Modifier}
-import java.net.{Socket, URI, URL}
+import java.net.{URI, URL}
 import java.security.PrivilegedExceptionAction
 import java.util.concurrent.{TimeoutException, TimeUnit}

@@ -28,6 +28,7 @@ import scala.concurrent.Promise
 import scala.concurrent.duration.Duration
 import scala.util.control.NonFatal

+import org.apache.commons.lang3.{StringUtils => ComStrUtils}
 import org.apache.hadoop.fs.{FileSystem, Path}
 import org.apache.hadoop.util.StringUtils
 import org.apache.hadoop.yarn.api._
@@ -368,7 +369,7 @@ private[spark] class ApplicationMaster(args: ApplicationMasterArguments) extends
         }
         logInfo(s"Final app status: $finalStatus, exitCode: $exitCode" +
           Option(msg).map(msg => s", (reason: $msg)").getOrElse(""))
-        finalMsg = msg
+        finalMsg = ComStrUtils.abbreviate(msg, sparkConf.get(AM_FINAL_MSG_LIMIT).toInt)
         finished = true
         if (!inShutdown && Thread.currentThread() != reporterThread && reporterThread != null) {
           logDebug("shutting down reporter thread")
@@ -192,6 +192,12 @@ package object config {
     .toSequence
     .createWithDefault(Nil)

+  private[spark] val AM_FINAL_MSG_LIMIT = ConfigBuilder("spark.yarn.am.finalMessageLimit")
+    .doc("The limit size of final diagnostic message for our ApplicationMaster to unregister from" +
+      " the ResourceManager.")
+    .bytesConf(ByteUnit.BYTE)
+    .createWithDefaultString("1m")
+
   /* Client-mode AM configuration. */

   private[spark] val AM_CORES = ConfigBuilder("spark.yarn.am.cores")
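
The new setting is declared as a byte size with the default string `"1m"`, and the ApplicationMaster change converts the configured value with `.toInt` before abbreviating. As a rough sketch of what that default works out to, the hypothetical `sizeToBytes` parser below assumes binary multiples (1k = 1024 bytes); it is a simplified illustration, not Spark's own parsing code.

```scala
object SizeLimitSketch {
  // Hypothetical, simplified parser for size strings such as "512k" or "1m".
  // Assumes binary multiples (1k = 1024 bytes); this is not Spark's parser.
  def sizeToBytes(s: String): Long = {
    val t = s.trim.toLowerCase
    // Split into the leading digits and the unit suffix that follows them.
    val (digits, suffix) = t.span(_.isDigit)
    require(digits.nonEmpty, s"no numeric part in '$s'")
    val multiplier = suffix match {
      case "" | "b"   => 1L
      case "k" | "kb" => 1L << 10
      case "m" | "mb" => 1L << 20
      case "g" | "gb" => 1L << 30
      case other      => throw new IllegalArgumentException(s"unknown size suffix '$other'")
    }
    digits.toLong * multiplier
  }

  def main(args: Array[String]): Unit = {
    // The patch narrows the configured value with .toInt before using it.
    val limit = sizeToBytes("1m").toInt
    println(limit) // prints 1048576
  }
}
```

Two caveats follow from the diff itself: the value is used as a maximum string width, so it effectively counts characters rather than encoded bytes, and because a `Long` is narrowed with `.toInt`, a configured size of 2g or more would not fit in an `Int`.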
