Bypass init-containers if spark.jars and spark.files is empty or only has local:// URIs #348
Conversation
@mccheah Would you like to take a look?
Thanks for the contribution @chenchun! Definitely appreciate seeing more and more people using this project! For this PR, at first glance it looks like the scalastyle check is failing because one of the lines is too long; see
http://spark-k8s-jenkins.pepperdata.org:8080/job/PR-spark-k8s-full-build/569/consoleFull#1143296497853a0453-9a85-4740-a867-694552c49a93
Would you mind updating the PR to pass scalastyle? You can run it locally with …
}
}
var podBuilder = basePod
var sparkConfWithExecutorInit = sparkConf
Prefer using val over var where possible.
if (e.nonEmpty && !e.startsWith("local://")) {
  needInitContainer = true
}
}
Can do something like:

val needInitContainer = (resolvedSparkJars ++ resolvedSparkFiles).exists { e =>
  e.nonEmpty && !e.startsWith("local://")
}
needInitContainer = true
}
}
var podBuilder = basePod |
Write the logic such that this can be kept as a val. There are a few options, but here's one:

val (sparkConfWithExecutorInit, podBuilderWithInitContainer) =
  if (needInitContainer) { ... } else (sparkConf, basePod)
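Spelled out, the pattern is an if-expression that returns a tuple, so both results stay immutable. A minimal self-contained sketch with hypothetical stand-in types, not the project's real builders:

// Hypothetical stand-ins for the real pod builder and Spark conf types.
case class Conf(settings: Map[String, String])
case class Pod(containers: List[String])

def configureDriver(
    needInitContainer: Boolean,
    sparkConf: Conf,
    basePod: Pod): (Conf, Pod) = {
  if (needInitContainer) {
    // Hypothetical init-container wiring for illustration only.
    (Conf(sparkConf.settings + ("init-container" -> "enabled")),
      Pod("spark-init" :: basePod.containers))
  } else {
    (sparkConf, basePod)
  }
}

// Both names stay vals at the call site:
// val (sparkConfWithExecutorInit, podBuilder) =
//   configureDriver(needInitContainer, conf, pod)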
val credentialsSecret = credentialsMounter.createCredentialsSecret()
val podWithInitContainerAndMountedCreds = credentialsMounter.mountDriverKubernetesCredentials(
  podWithInitContainer, driverContainer.getName, credentialsSecret)
podBuilder = credentialsMounter.mountDriverKubernetesCredentials( |
Again, use val everywhere and don't re-assign here. It might be preferable to have the switch on whether init-containers are needed handled inside one of the existing modules or in a new module. See …
@@ -168,24 +168,34 @@ private[spark] class Client(
val initContainerConfigMap = initContainerComponentsProvider
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And we shouldn't be making a config map if we're not using an init-container. In general we should try to encapsulate everything related to whether or not we need an init-container in one place, so that we only have to make the check once and run everything that relates to that check there.
Thanks for the review. I haven't had a chance to test it yet, but is this the right direction? @mccheah
override def provideExecutorInitContainerConfiguration(): ExecutorInitContainerConfiguration = {
new ExecutorInitContainerConfigurationImpl(
if (needInitContainer) new ExecutorInitContainerConfigurationImpl( |
I think the repeated instances of checking this variable indicate we need to do some refactoring here. I already had a hunch that we had done too much sharding and fragmentation in this architecture, and this repeated logic seems to confirm that theory. In v1 we had the extreme of a particularly monolithic class hierarchy that was difficult to read because we had everything in one place, but in this code we appear to have gone to the opposite extreme by trying to make every trait do exactly one thing; that may have been too granular and results in problems like this.
I wonder if you could come up with a design such that we only have to switch once. One example would be a single method in this provider class that creates an init container "bundle", which can be a case class / struct of all the things required to set up one of these init containers. Then we can have a switch that provides an Option of one of these bundles, and either returns all of the components or none of them.
We don't actually have to merge the classes themselves, though in some cases that might make sense. We particularly need to isolate the SparkPodInitContainerBootstrap because it's a component that's re-used for both the driver pod and the executor pods. But there might be other cases where it makes more sense to combine some of the classes.
I'm fairly open to any ideas on how to make this code better, but can you see what can be done here?
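One possible shape of that bundle, sketched with placeholder types (the real components live in their own modules; these names and fields are assumptions for illustration):

// Placeholder stand-ins for the real init-container components.
case class InitContainerConfigMap(properties: Map[String, String])
case class ExecutorInitConf(properties: Map[String, String])
case class PodBootstrap(initContainerImage: String)

// Everything required to set up an init-container, grouped in one struct.
case class InitContainerBundle(
    configMap: InitContainerConfigMap,
    executorInitConf: ExecutorInitConf,
    podBootstrap: PodBootstrap)

// The single switch: callers get all of the components or none of them.
def provideInitContainerBundle(uris: Iterable[String]): Option[InitContainerBundle] =
  if (uris.exists(uri => uri.nonEmpty && !uri.startsWith("local://"))) {
    Some(InitContainerBundle(
      InitContainerConfigMap(Map("download-timeout-minutes" -> "5")),
      ExecutorInitConf(Map("init-container" -> "enabled")),
      PodBootstrap("spark-init:latest")))
  } else {
    None
  }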
cc @ifilonenko who has also been working on this code.
val resolvedSparkFiles = containerLocalizedFilesResolver.resolveSubmittedSparkFiles()
// Bypass init-containers if `spark.jars` and `spark.files` is empty or only has `local://` URIs
private val needInitContainer = (resolvedSparkJars ++ resolvedSparkFiles).exists { e =>
e.nonEmpty && !e.startsWith("local://") |
Try to refrain from this check by leveraging the functions already provided here, or, if the use case is specific enough, try adding the function to the Utils for re-usability within the system.
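For illustration, a minimal sketch of the kind of reusable helper being suggested (the names and placement are assumptions, not the actual KubernetesFileUtils API):

// Hypothetical helper: by this project's convention, only local:// URIs
// point at files already present on the Docker image.
def isContainerLocal(uri: String): Boolean = uri.startsWith("local://")

// Bypass the init-container only when every non-empty URI is container-local.
def needInitContainer(uris: Iterable[String]): Boolean =
  uris.exists(uri => uri.nonEmpty && !isContainerLocal(uri))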
Thanks for your comments. I think I'm starting to get into Scala. @mccheah @ifilonenko PTAL.
@@ -105,7 +105,7 @@ private[spark] class DriverInitContainerComponentsProviderImpl(
private val dockerImagePullPolicy = sparkConf.get(DOCKER_IMAGE_PULL_POLICY)
private val downloadTimeoutMinutes = sparkConf.get(INIT_CONTAINER_MOUNT_TIMEOUT)

override def provideInitContainerConfigMapBuilder(
private def provideInitContainerConfigMapBuilder(
This is causing compile errors in the unit tests (as seen here in ClientV2Suite); please modify ClientV2Suite to account for these changes.
val containerLocalizedFilesResolver = provideContainerLocalizedFilesResolver()
// Bypass init-containers if `spark.jars` and `spark.files` is empty or only has `local://` URIs
if (KubernetesFileUtils.getNonContainerLocalFiles(
containerLocalizedFilesResolver.resolveSubmittedSparkJars() |
Good use of KubernetesFileUtils; however, we are already calling this in Client.scala. Maybe pass those jars down so we don't have to re-run the same computation in multiple places.
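Concretely, the suggestion amounts to resolving once in Client.scala and threading the result through; a rough sketch of the wiring, reusing method names that appear elsewhere in this PR:

// In Client.scala: resolve the submitted jars and files once...
val resolvedUris = containerLocalizedFilesResolver.resolveSubmittedSparkJars() ++
  containerLocalizedFilesResolver.resolveSubmittedSparkFiles()
// ...then hand them to the provider instead of re-resolving there:
val maybeBundle = initContainerComponentsProvider
  .provideInitContainerBundle(maybeSubmittedResourceIds, resolvedUris)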
maybeSubmittedResourceIds: Option[SubmittedResourceIds]): Option[InitContainerBundle] = {
val containerLocalizedFilesResolver = provideContainerLocalizedFilesResolver()
// Bypass init-containers if `spark.jars` and `spark.files` is empty or only has `local://` URIs
if (KubernetesFileUtils.getNonContainerLocalFiles( |
Why local://? What if it includes hdfs://, s3://, or whatnot? @mccheah, is the intention here to bypass only if everything is local://, or if everything is non-file://? In that case the logic here should change to not just be a check for != "local".
If there are HDFS or other remote URIs, we still need the init-container to download them, because while they might not be needed for Spark-specific things, they're still needed for the application itself. For example, if my main class is in a jar stored in HDFS, the jar needs to be downloaded before the JVM can launch at all.
Therefore bypassing is only guaranteed to be safe if all dependencies are located on the Docker image or can afford to be downloaded after the JVM starts running (via SparkContext.addJar).
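To make the distinction concrete, a rough classification of the schemes (illustrative only, not the project's actual resolver logic):

// local://  -> already on the Docker image; nothing to fetch
// file://   -> on the submitter's machine; must be staged, then downloaded
// hdfs://, http://, s3a://, ... -> remote; must be fetched before the JVM
//   starts if, for example, the main application jar lives there
val uris = Seq("local:///opt/spark/app.jar", "hdfs://namenode:8020/jars/app.jar")
val needsInitContainer = uris.exists(uri => !uri.startsWith("local://"))
// needsInitContainer == true: the HDFS jar must be downloaded up front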
Force-pushed from a07409a to bf54fd2.
@ifilonenko Addressed your comments.
A couple minor suggestions from me, then I'm good to merge.
@mccheah any last bits from you?
All, let's aim to merge in the next 24hr
val containerLocalizedFilesResolver = provideContainerLocalizedFilesResolver()
// Bypass init-containers if `spark.jars` and `spark.files` is empty or only has `local://` URIs
if (KubernetesFileUtils.getNonContainerLocalFiles(uris).nonEmpty) {
Some(InitContainerBundle(provideInitContainerConfigMapBuilder(maybeSubmittedResourceIds), |
Maybe should call .build() on provideInitContainerConfigMapBuilder(maybeSubmittedResourceIds), since all clients immediately call .build() on this value.
I'll leave it as it is, because it seems @mccheah will do a refactor afterwards.
I think the suggestion above is still worth addressing.
Updated.
val initContainerConfigMapSeq = maybeInitContainerConfigMap match {
  case Some(configMap) => Seq(configMap)
  case None => Seq()
}
Does .toSeq work here?
Yes, it works. Updated the PR.
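For reference, Option.toSeq yields a one-element sequence for Some and an empty one for None, so the pattern match quoted above collapses to a one-liner:

Some(42).toSeq            // Seq(42)
(None: Option[Int]).toSeq // Seq()

// Equivalent to the match above:
val initContainerConfigMapSeq = maybeInitContainerConfigMap.toSeq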
@@ -224,7 +232,7 @@ private[spark] class Client(
  .watch(loggingPodStatusWatcher)) { _ =>
  val createdDriverPod = kubernetesClient.pods().create(resolvedDriverPod)
  try {
val driverOwnedResources = Seq(initContainerConfigMap) ++
val driverOwnedResources = initContainerConfigMapSeq ++
Inline initContainerConfigMapSeq; no need to declare a variable used only once.
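Combined with the .toSeq suggestion above, the single-use val disappears entirely; roughly (otherOwnedResources is a hypothetical placeholder for the remaining resources in the diff):

val driverOwnedResources = maybeInitContainerConfigMap.toSeq ++
  otherOwnedResources // hypothetical placeholder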
Like the idea - I think this actually addresses point 2 on #354 where we should return either all the init-container things or none of them.
@@ -202,4 +201,15 @@ private[spark] class DriverInitContainerComponentsProviderImpl(
    configMapKey,
    resourceStagingServerSecretPlugin)
}

override def provideInitContainerBundle(maybeSubmittedResourceIds: Option[SubmittedResourceIds],
Move this argument down a line and put each argument on one line.
Force-pushed from 17fdc74 to c466ef6.
Thanks for the contribution @chenchun! I'm looking forward to trying this out in my own clusters. Out of curiosity, how much less time is spent in pod startup on your cluster now that the init container is bypassed?
It's like 0.3s per container, based on aufs.
when(initContainerComponentsProvider
  .provideInitContainerConfigMapBuilder(Some(SUBMITTED_RESOURCES.ids())))
  .thenReturn(initContainerConfigMapBuilder)
when(initContainerComponentsProvider.provideInitContainerBundle(Some(SUBMITTED_RESOURCES.ids()), |
I modified this in the merge but you should be using mockitoEq() here.
when(initContainerComponentsProvider
  .provideInitContainerConfigMapBuilder(None))
  .thenReturn(initContainerConfigMapBuilder)
when(initContainerComponentsProvider.provideInitContainerBundle(None, RESOLVED_SPARK_JARS ++ |
Same as above; use mockitoEq().
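For reference, a sketch of the matcher style being asked for, assuming the suite imports Mockito's eq matcher under Spark's usual alias (the stubbed return value is a stand-in, not the suite's actual fixture):

import org.mockito.Matchers.{eq => mockitoEq}
import org.mockito.Mockito.when

when(initContainerComponentsProvider.provideInitContainerBundle(
    mockitoEq(None),
    mockitoEq(RESOLVED_SPARK_JARS ++ RESOLVED_SPARK_FILES)))
  .thenReturn(Some(initContainerBundle)) // stand-in for the suite's fixture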
@ifilonenko so does this require a followup commit?
override def provideInitContainerBundle(
    maybeSubmittedResourceIds: Option[SubmittedResourceIds],
    uris: Iterable[String]): Option[InitContainerBundle] = {
val containerLocalizedFilesResolver = provideContainerLocalizedFilesResolver() |
This line isn't used; no need to build this out. I removed it in my merge.
@ash211 I refactored parts of the testing environment and accounted for pySparkFiles in my merge in PR-351. I also took out the unnecessary line in the DriverInitComponentImpl which initialized a FileResolver that wasn't used. So this will not require a followup commit, as I fixed the issues addressed above; it's just for the sake of bookkeeping.
Bypass init-containers if spark.jars and spark.files is empty or only has local:// URIs. Fixes #338