
[SPARK-41781][K8S] Add the ability to create pvc before creating driver/executor pod #39306

Closed
wants to merge 3 commits

Conversation

Contributor

@dcoliversun dcoliversun commented Dec 30, 2022

What changes were proposed in this pull request?

This patch aims to treat PVCs as pre-resources and set them up before driver/executor Pod creation. After this patch, the workflow looks like this:

  • set up pre-resources (including PVCs) before Pod creation
  • create the Pod
  • refresh all pre-resources' owner references to point at the created Pod
  • make the created Pod the owner of the remaining resources and create them
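
A minimal sketch of this workflow, assuming a fabric8 6.x-style KubernetesClient (the method and parameter names below are illustrative, not the PR's actual code):

```scala
import io.fabric8.kubernetes.api.model.{HasMetadata, OwnerReferenceBuilder, Pod}
import io.fabric8.kubernetes.client.KubernetesClient
import java.util.Collections

// Illustrative only: create pre-resources first, then the Pod, then wire up ownership.
def createPodWithPreResources(
    client: KubernetesClient,
    namespace: String,
    pod: Pod,
    preResources: Seq[HasMetadata],
    otherResources: Seq[HasMetadata]): Pod = {
  // 1. Create pre-resources (e.g. PVCs) first, so the scheduler can resolve them.
  preResources.foreach(r => client.resource(r).inNamespace(namespace).create())

  // 2. Create the Pod itself.
  val createdPod = client.pods().inNamespace(namespace).resource(pod).create()

  // Owner reference pointing at the created Pod, so K8s garbage-collects
  // the owned resources when the Pod is deleted.
  val ownerRef = new OwnerReferenceBuilder()
    .withApiVersion("v1")
    .withKind("Pod")
    .withName(createdPod.getMetadata.getName)
    .withUid(createdPod.getMetadata.getUid)
    .build()

  // 3. Refresh the pre-resources' owner references to the created Pod.
  preResources.foreach { r =>
    r.getMetadata.setOwnerReferences(Collections.singletonList(ownerRef))
    client.resource(r).inNamespace(namespace).update()
  }

  // 4. Create the remaining resources with the Pod already set as their owner.
  otherResources.foreach { r =>
    r.getMetadata.setOwnerReferences(Collections.singletonList(ownerRef))
    client.resource(r).inNamespace(namespace).create()
  }
  createdPod
}
```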

Why are the changes needed?

These pre-resources are usually required for Pod creation and scheduling, so they should be ready before the Pod is created and deleted when the user deletes the Pod; their lifecycle is the same as the Pod's.

If PVCs are created after the Pod, the Pod gets a warning event like the following:

error getting PVC "spark/application-exec-1-pvc-0": could not find v1.PersistentVolumeClaim "spark/application-exec-1-pvc-0" 

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Add new UTs.

@AmplabJenkins

Can one of the admins verify this patch?

Member

@dongjoon-hyun dongjoon-hyun left a comment


The proposed code changes look a little too intrusive. Do you think you can minimize some of them, @dcoliversun?

@dcoliversun
Contributor Author

@dongjoon-hyun Sorry for my late reply. I'm working on minimizing the intrusive code :)

@dcoliversun
Contributor Author

@dongjoon-hyun
I'd like to discuss two options with you :)

  • First: Following the idea of SPARK-37331, move PVCs from resources to pre-resources and set them up before driver and executor creation. In the future, we could also make ConfigMaps pre-resources, because the k8s scheduler emits the same resource-not-found warning for them. (This is how this PR works.)
  • Second: Leave PVCs as regular resources, filter them out of the resource list, and set them up before Pod creation (see the sketch below).
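
A rough sketch of what the second option could look like (illustrative only, using fabric8 model classes; this is not code from the PR):

```scala
import io.fabric8.kubernetes.api.model.{HasMetadata, PersistentVolumeClaim}

// Illustrative only: keep PVCs in the normal resource list, but split them out
// so they can be created ahead of the Pod.
def splitPvcs(resources: Seq[HasMetadata]): (Seq[HasMetadata], Seq[HasMetadata]) =
  resources.partition(_.isInstanceOf[PersistentVolumeClaim])

// val (pvcs, others) = splitPvcs(resources)
// pvcs would be created before the Pod; others after it, as today.
```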

I prefer the first option: it is more extensible, since we could mark other k8s resources as pre-resources if needed. I agree with you about code intrusiveness, but I think it is worth making PVCs pre-resources.

Any suggestion is greatly appreciated.

@dcoliversun dcoliversun changed the title [WIP][SPARK-41781][K8S] Add the ability to create pvc before creating driver/executor pod [SPARK-41781][K8S] Add the ability to create pvc before creating driver/executor pod Jan 10, 2023
Member

@dongjoon-hyun dongjoon-hyun left a comment


Thank you for updates. I have two questions.

  1. According to the updated PR description, is the only benefit of this refactoring the removal of warning K8s event messages?
  2. Does this PR work with existing features like SPARK-41515?

@dcoliversun
Contributor Author

dcoliversun commented Jan 16, 2023

@dongjoon-hyun

According to the updated PR description, is the only benefit of this refactoring the removal of warning K8s event messages?

This PR brings two benefits:

  • It removes unnecessary K8s warning events. We collect events as alarms, and currently it is impossible to tell whether such an event is caused by insufficient storage resources or merely by resource creation order.
  • In some scheduling scenarios, it fixes the problem that a webhook cannot find the PVC referenced by the Pod spec, because the old logic created PVCs only after the Pod.

Does this PR work with existing features like SPARK-41515?

ID          PR     Result
SPARK-39006 #36374 Passes the original UT
SPARK-39688 #37095 Passes the original UT
SPARK-39898 #37321 Only upgrades the fabric8 version
SPARK-39965 #37433 This PR keeps the driver owning PVCs when spark.kubernetes.driver.ownPersistentVolumeClaim=true
SPARK-40459 #37903 Failed to reproduce FileExistsException, and the additional exception catching is unchanged; I believe this PR works fine with that feature
SPARK-41388 #38912 Passes the original UT
SPARK-41410 #38943 Passes the original UT
SPARK-41410 #38949 Passes the original UT
SPARK-41514 #39058 Only updates the doc
SPARK-41525 #39070 This PR does not touch ExecId or PVCName; k8sKnownExecIds and k8sKnownPVCNames remain distinct
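
For context on the SPARK-39965 row, here is a minimal illustrative configuration for driver-owned, on-demand executor PVCs (the conf keys are real Spark options; the volume name `data` and the values are example choices, not from this PR):

```scala
import org.apache.spark.SparkConf

// Illustrative only: request an on-demand PVC per executor, owned by the driver.
val conf = new SparkConf()
  .set("spark.kubernetes.driver.ownPersistentVolumeClaim", "true")
  .set("spark.kubernetes.executor.volumes.persistentVolumeClaim.data.options.claimName",
    "OnDemand")
  .set("spark.kubernetes.executor.volumes.persistentVolumeClaim.data.options.sizeLimit",
    "10Gi")
  .set("spark.kubernetes.executor.volumes.persistentVolumeClaim.data.mount.path", "/data")
```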

@dcoliversun
Contributor Author

dcoliversun commented Jan 21, 2023

Thank you for the reviews @dongjoon-hyun, I believe I've addressed your comments!

@dcoliversun
Contributor Author

gentle ping @dongjoon-hyun :)

@cometta

cometta commented Apr 29, 2023

On Spark 3.3 I was able to share the same PVC between the driver and executors, but on Spark 3.4 I get the error should contain OnDemand or SPARK_EXECUTOR_ID when requiring multiple executors. Is this related to this ticket?

@dcoliversun
Contributor Author

@cometta Hi, this PR is not related to the problem you're having. Could you create an issue in JIRA and attach your Spark configuration? I will follow up :)

@cometta

cometta commented May 4, 2023

@dcoliversun created jira ticket https://issues.apache.org/jira/browse/SPARK-43329

@dcoliversun dcoliversun deleted the SPARK-41781 branch November 21, 2024 04:27