Use the driver pod IP address for spark.driver.bindAddress #533
Conversation
The change looks good to me. Thanks!
Maybe @mccheah or @ifilonenko can take a look as well?
rerun integration tests please
@@ -147,7 +128,7 @@ private[spark] class DriverAddressConfigurationStepSuite
      } catch {
        case e: Throwable =>
          assert(e.getMessage ===
-           s"requirement failed: ${DriverAddressConfigurationStep.DRIVER_HOST_KEY} is" +
+           s"requirement failed: ${DriverServiceBootstrapStep.DRIVER_HOST_KEY} is" +
this should be indented 2 spaces not 4 (same for the ones above).
http://spark.apache.org/contributing.html
Done.
@@ -108,7 +91,7 @@ private[spark] class DriverAddressConfigurationStepSuite
   }

   test("Long prefixes should switch to using a generated name.") {
-    val configurationStep = new DriverAddressConfigurationStep(
+    val configurationStep = new DriverServiceBootstrapStep(
       LONG_RESOURCE_NAME_PREFIX,
this should be indented 2 spaces not 4 (same for the ones above).
Done.
Has this been tested in production clusters?
Particularly with SPARK-21642 also merged.
Actually I wanted to circle back to this and what we concluded from #523. It sounds like from the comments in the issue, the problem is a race condition where the driver tries to bind to the service hostname before the endpoints controller has updated kube-dns.

I wonder if we can just fix the race condition itself. Can the scheduler backend's initialization have a Watch that blocks until the endpoint object is ready before attempting to bind to the given hostname? That seems more idiomatic and fixes the actual underlying problem, rather than working with IP address magic. But there are pros and cons either way.

If we use the IP address, is kube-dns a requirement for this code path? I think we still rely on kube-dns to resolve the K8s master hostname. But relying less on kube-dns seems beneficial anyway.
@mccheah Excellent questions.

This might also work. However, it could be difficult to block the binding until the endpoint is ready. It depends on which Spark core classes are in charge of binding the ports (I think there are actually two: the scheduler backend port and the block manager port). Personally, I do not know which core classes handle the binding.

The IP address approach allows the driver to avoid depending on kube-dns when binding.

For the k8s master hostname, we are relying on the underlying bare-metal DNS, not kube-dns.
If we wanted, we could block this by polling for the Endpoints resource's readiness in the scheduler backend - we know for sure that the scheduler needs to initialize before it requests any executors, unlike the general case where we don't know when the driver will attempt to bind to an address. Having it bind to the IP address instead seems fine, but I think we want to make sure this is tested with SPARK-21642 merged. If someone can custom build and test Spark with the appropriate patches then that would be fantastic, something like what @ash211 did with #484.
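For illustration, such a blocking check in the scheduler backend could look roughly like the sketch below. This is not code from this PR; it assumes a fabric8 KubernetesClient, and the namespace and driverServiceName values are hypothetical names.

```scala
import io.fabric8.kubernetes.client.KubernetesClient
import scala.collection.JavaConverters._

// Sketch: poll the driver service's Endpoints object until it has at least one
// ready address, so the driver hostname is resolvable before executors are requested.
// kubernetesClient, namespace, and driverServiceName are illustrative names.
def waitForDriverEndpoints(
    kubernetesClient: KubernetesClient,
    namespace: String,
    driverServiceName: String,
    timeoutMs: Long = 60000L,
    pollIntervalMs: Long = 1000L): Unit = {
  val deadline = System.currentTimeMillis() + timeoutMs
  while (System.currentTimeMillis() < deadline) {
    val endpoints = kubernetesClient.endpoints()
      .inNamespace(namespace)
      .withName(driverServiceName)
      .get()
    val subsets = Option(endpoints)
      .flatMap(e => Option(e.getSubsets))
      .map(_.asScala)
      .getOrElse(Seq.empty)
    val ready = subsets.exists(s => Option(s.getAddresses).exists(!_.isEmpty))
    if (ready) {
      return
    }
    Thread.sleep(pollIntervalMs)
  }
  throw new IllegalStateException(
    s"Timed out waiting for Endpoints of service $driverServiceName to become ready.")
}
```

A Watch-based version would avoid the sleep loop at the cost of managing the watch lifecycle; the polling form keeps the sketch short.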
I'm testing a custom build with changes from SPARK-21642. Will report back later.
I agree the scheduler backend should block and wait for the Endpoints resource to become ready before launching executor pods.
That would require a minor and very subtle API break. We allow the driver to be provided specific credentials so that the driver can have only the minimum privilege to create/delete pods. Now, the driver would also need privileges to read the status of Endpoints in its namespace. I don't see this as a significant concern, but I am noting it here for reference.
There seems to be an issue with changes from this PR and changes from SPARK-21642. I got the following exception when running the SparkPi example:
The problem is when the executor started up. The hostname of the driver pod can be customized using the
Actually, to solve the issue above, we just need to pass
After failing to make the approach above work, I landed on a totally different approach that finally works. Tested the new solution on a GKE cluster with changes from SPARK-21642 merged. Please see the updated PR description.
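For reference, the downward-API wiring that the updated description talks about can be expressed with the fabric8 container builder roughly as below. This is only a sketch of the technique, not the PR's actual code; the ENV_DRIVER_BIND_ADDRESS constant and the withBindAddressEnv helper are illustrative names.

```scala
import io.fabric8.kubernetes.api.model.{Container, ContainerBuilder, EnvVarSourceBuilder}

// Illustrative constant name; the real constant in the PR may be named differently.
val ENV_DRIVER_BIND_ADDRESS = "SPARK_DRIVER_BIND_ADDRESS"

// Sketch: add an env var whose value the kubelet fills in at runtime with the
// pod's IP address, via the downward API (a fieldRef on status.podIP).
def withBindAddressEnv(driverContainer: Container): Container = {
  new ContainerBuilder(driverContainer)
    .addNewEnv()
      .withName(ENV_DRIVER_BIND_ADDRESS)
      .withValueFrom(new EnvVarSourceBuilder()
        .withNewFieldRef("v1", "status.podIP")
        .build())
      .endEnv()
    .build()
}
```

The container entrypoint can then pass -Dspark.driver.bindAddress=$SPARK_DRIVER_BIND_ADDRESS to the driver JVM, as described in the PR description below.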
I believe the main point of SPARK-21642 was to tie the driver to a hostname instead of an IP address. It seems like moving to set the host to an IP address is a regression in this behavior. Should we still be trying to make this work with the service hostname?
We are still setting spark.driver.host to the driver service DNS name. Only spark.driver.bindAddress now uses the pod IP address.
I was curious why it does that. It is helpful to consider the behavior in two pieces: (1) advertising the driver to executors via spark.driver.host, and (2) binding the driver's listening sockets via spark.driver.bindAddress.

Is only (1) the point of SPARK-21642? Or is (2) also important for SPARK-21642?
@kimoonkim Very good points. IMO (1) is what SPARK-21642 really is all about. (2) is a side effect of it, simply because spark.driver.bindAddress defaults to the value of spark.driver.host.
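For context, Spark core defines spark.driver.bindAddress as a fallback of spark.driver.host, which is why changing the host also changed where the driver bound. A small, self-contained sketch of that observable behavior (not Spark's internal code; the hostname is just an example value):

```scala
import org.apache.spark.SparkConf

// Sketch: spark.driver.bindAddress is resolved from spark.driver.host when it is
// not set explicitly. Reading it with the host as the default mirrors that fallback.
val conf = new SparkConf(false)
  .set("spark.driver.host", "my-app-driver-svc.default.svc.cluster.local")

val bindAddress = conf.get("spark.driver.bindAddress", conf.get("spark.driver.host"))
// bindAddress == "my-app-driver-svc.default.svc.cluster.local"
```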
s"requirement failed: ${DriverAddressConfigurationStep.DRIVER_BIND_ADDRESS_KEY} is" + | ||
s" not supported in Kubernetes mode, as the driver's hostname will be managed via" + | ||
s" a Kubernetes service.") | ||
s"requirement failed: ${DriverServiceBootstrapStep.DRIVER_BIND_ADDRESS_KEY} is" + |
Error message should probably say "the driver's bind address will be managed...". It's also not by a Kubernetes service anymore, but by the pod's IP address.
Done.
@@ -75,7 +75,7 @@ private[spark] class BaseDriverConfigurationStepSuite extends SparkFunSuite {
       .asScala
       .map(env => (env.getName, env.getValue))
       .toMap
-    assert(envs.size === 6)
+    assert(envs.size === 7)
Check the value of the new env we put here.
We cannot check the value as it will be null unless there's a way to set status.podIP. But we can definitely check the existence of the new env key. Added a check.
Shouldn't we be able to check that there is a valueFrom field?
Ah, because we're setting the envs in a map but the valueFrom isn't captured in that map. Looks like we'll need to break out of the Map[envKey -> envValue] paradigm to check this.
Done.
Looks ok apart from the minor comments.
val envDriverBindAddress = preparedDriverSpec.driverContainer
  .getEnv
  .asScala
  .filter(envVar => envVar.getName.equals(ENV_DRIVER_BIND_ADDRESS))
nit -> can compact this all into an exists call. E.g.

val hasBindAddressWithPodIP = preparedDriverSpec.driverContainer.getEnv.asScala.exists { envVar => envVar.getName == ... && envVar.getValueFrom.... }
assert(hasBindAddressWithPodIP, <message>)

etc.
Done.
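Concretely, the exists-based check could look roughly like this sketch against the fabric8 model classes; preparedDriverSpec and ENV_DRIVER_BIND_ADDRESS are names assumed from the surrounding test code.

```scala
import scala.collection.JavaConverters._

// Sketch: assert the driver container carries the bind-address env var and that its
// value is sourced from the downward API field status.podIP rather than a literal value.
val hasBindAddressFromPodIP = preparedDriverSpec.driverContainer
  .getEnv
  .asScala
  .exists { envVar =>
    envVar.getName == ENV_DRIVER_BIND_ADDRESS &&
      envVar.getValueFrom != null &&
      envVar.getValueFrom.getFieldRef != null &&
      envVar.getValueFrom.getFieldRef.getFieldPath == "status.podIP"
  }
assert(hasBindAddressFromPodIP,
  s"$ENV_DRIVER_BIND_ADDRESS should be set from the pod IP via the downward API")
```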
I'll merge when the build passes; feel free to merge if I miss it otherwise.
…ark-on-k8s#533)
* Use the driver pod IP address for spark.driver.bindAddress
* Addressed comments
* Addressed more comments
* Fixed broken DriverServiceBootstrapStepSuite
What changes were proposed in this pull request?
This PR attempts to fix the issue reported in #523 that may happen if the driver tries to bind to the driver host name before the endpoint controller modifies the DNS configuration.
Changes:
- The submission client stops setting spark.driver.bindAddress based on the name of the headless service for the driver in DriverAddressConfigurationStep, which is renamed to DriverServiceBootstrapStep in this PR. Instead, this PR introduces a new environment variable SPARK_DRIVER_BIND_ADDRESS that gets its value from status.podIP using the downward API, so at runtime SPARK_DRIVER_BIND_ADDRESS's value is the driver pod's IP address. Then we can do -Dspark.driver.bindAddress=$SPARK_DRIVER_BIND_ADDRESS in the Dockerfile to give spark.driver.bindAddress the right value. The submission client still sets spark.driver.host to the driver service DNS name, though.

Tested on a GKE cluster using the SparkPi example. Verified that the following showed up in the driver container:
And the driver pod YAML contained the following:
@foxish @mccheah @kimoonkim