
[SPARK-27872][K8s] Fix executor service account inconsistency #24748

Closed
skonto wants to merge 1 commit into master from fix_executor_sa

Conversation

skonto (Contributor) commented May 30, 2019

What changes were proposed in this pull request?

Fixes the service account inconsistency that breaks pull secrets. It gives the user the option to set up a specific service account for the executors if they need one (via spark.kubernetes.authenticate.executor.serviceAccountName); it defaults to the driver's service account. This PR does not add support for separate authentication credentials for the executors.
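
For illustration only, here is a minimal sketch of how the new executor setting could sit next to the existing driver setting in an application's SparkConf; the image, pull secret, and account names are placeholders, not values from this PR:

```scala
import org.apache.spark.SparkConf

// Sketch only: the two serviceAccountName keys are the ones discussed in this PR;
// the image, pull secret, and account names are placeholders.
val conf = new SparkConf()
  .setMaster("k8s://https://kubernetes.default.svc")
  .set("spark.kubernetes.container.image", "my-registry/spark:latest")               // placeholder image
  .set("spark.kubernetes.container.image.pullSecrets", "my-registry-secret")          // pull secret the executor pods also need
  .set("spark.kubernetes.authenticate.driver.serviceAccountName", "spark-driver")     // existing driver setting
  .set("spark.kubernetes.authenticate.executor.serviceAccountName", "spark-executor") // new executor setting; omit it to reuse the driver's SA
```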

How was this patch tested?

Tested manually by launching a Spark job exercising the introduced settings.
Added a new integration test for this fix.

skonto force-pushed the fix_executor_sa branch from 9698ab8 to d91a41c on May 30, 2019 at 21:11
skonto (Contributor, Author) commented May 30, 2019

jenkins test this please

SparkQA commented May 30, 2019

Test build #105980 has finished for PR 24748 at commit d91a41c.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

skonto force-pushed the fix_executor_sa branch from d91a41c to c0573e9 on May 30, 2019 at 21:22
skonto (Contributor, Author) commented May 30, 2019

@shaneknapp I think it would be good to run the tests using the available RBAC file, to make sure we cover cases like this. Thoughts?

skonto (Contributor, Author) commented May 30, 2019

@erikerlandson @liyinan926 @srowen please review. Should we backport this to branch-2.4?

SparkQA commented May 30, 2019

Test build #105981 has finished for PR 24748 at commit c0573e9.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

srowen (Member) commented May 31, 2019

I don't know much about the context here, but would someone really need separate service accounts for driver and executor?

skonto (Contributor, Author) commented May 31, 2019

@srowen the idea so far was to have the executors assigned the default service account (independently of the driver's SA), assuming that the default account has the least privileges; executors don't create resources. We discussed this a bit at the User Group Meeting, and making it configurable seems a viable option in case you don't have to comply with the driver's SA. I will wait to see what the others have to say; I am OK if we don't make it configurable.

erikerlandson (Contributor) commented

One possibility would be to constrain it to two options: either "default" or whatever the driver's SA is, which I believe solves the original issue.

liyinan926 (Contributor) commented May 31, 2019

+1 on @erikerlandson's suggestion. The only question is how do we know when to use the default and when to use the driver's.

felixcheung (Member) commented

So with the driver SA the driver can still do stuff? That doesn't sound very secure, as the driver is still running user/app code.

skonto (Contributor, Author) commented Jun 2, 2019

@felixcheung that is true, yet the driver needs to create certain resources, so it needs some privileges. A DoS was therefore possible by design before this PR, and as you said it is not secure in that sense. On the other hand, AFAIK this is also the case with other resource managers, e.g. Mesos, where you could potentially register an arbitrary framework and start accepting offers with the same privileges. Only quotas can restrict misuse here, so if there are no quotas the user can by definition do whatever the account allows. So it does not seem like a security issue to me: the malicious code can only do what the app is supposed to do, nothing more; it cannot, for example, escalate privileges, unless you run with an admin account.
Malicious code will, however, be able to bypass the Spark configuration and launch, for example, more than the allowed number of pods.

skonto (Contributor, Author) commented Jun 2, 2019

@liyinan926 the initial PR followed @erikerlandson's suggestion to make it configurable, but "default or the driver's SA" sounds good too ;)

The only question is how do we know when to use the default and when to use the driver's.

If no service account is defined via the config property, we don't set anything, which means the default is picked up for both the driver and the executors.
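
For illustration, the resolution order described here would be: an explicitly configured executor SA, otherwise the driver's SA, otherwise the namespace default. A minimal sketch of that fallback, assuming a plain key/value view of the configuration (the helper name is hypothetical, not the actual feature-step code in the PR):

```scala
// Illustrative fallback chain only; the helper and the plain Map are assumptions,
// the two configuration keys are the ones discussed in this PR.
def resolveExecutorServiceAccount(conf: Map[String, String]): Option[String] = {
  val executorSa = conf.get("spark.kubernetes.authenticate.executor.serviceAccountName")
  val driverSa   = conf.get("spark.kubernetes.authenticate.driver.serviceAccountName")
  // If neither key is set, return None and leave the pod spec untouched,
  // so Kubernetes assigns the namespace's "default" service account.
  executorSa.orElse(driverSa)
}
```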

skonto force-pushed the fix_executor_sa branch from c0573e9 to a42f7a8 on June 3, 2019 at 13:26
SparkQA commented Jun 3, 2019

Test build #106109 has finished for PR 24748 at commit a42f7a8.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

skonto (Contributor, Author) commented Jun 3, 2019

@erikerlandson @felixcheung please review; I made the needed modifications.

erikerlandson (Contributor) commented

My take on "driver's SA or default" is that the executor setting would be some kind of boolean flag, such as spark.kubernetes.executor.inherit.service.account: if it is true, the executor inherits the SA from the driver; it defaults to false (use the default SA, as before).
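
For comparison, here is a sketch of this flag-based alternative (not what was eventually merged); the flag name is taken from the suggestion above and is illustrative only:

```scala
// Alternative discussed here: a boolean "inherit" flag instead of a separate executor SA setting.
// The flag name is hypothetical, taken from the suggestion above.
def executorServiceAccountViaFlag(conf: Map[String, String]): Option[String] = {
  val inherit = conf.get("spark.kubernetes.executor.inherit.service.account").exists(_.toBoolean)
  if (inherit)
    conf.get("spark.kubernetes.authenticate.driver.serviceAccountName") // reuse the driver's SA
  else
    None // leave unset, so executors keep the namespace's "default" SA as before
}
```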

skonto (Contributor, Author) commented Jun 3, 2019

@erikerlandson I am not sure we can assume, as before, that the default SA has fewer privileges; the user may assign it any kind of privileges. I don't see any benefit in keeping the old behavior. It would be good to have clear semantics. @liyinan926, what's your take?

erikerlandson (Contributor) commented

The proposed behavior of defaulting the executor SA to the driver SA (and the driver SA to "default") is also reasonable. My earlier concern was minimal permissions for executors; however, I don't think this adds much vulnerability surface, since executors will not by default have more permissions than the driver, and could only have more if such an SA is deliberately made available to the user.

skonto (Contributor, Author) commented Jun 3, 2019

@erikerlandson I see, OK. Whatever works, I don't mind; let me know if this can be merged, otherwise I can add a flag. ;)

erikerlandson (Contributor) commented

LGTM

skonto (Contributor, Author) commented Jun 4, 2019

@erikerlandson could you merge it, please, if there are no objections from @liyinan926 or @felixcheung?

skonto (Contributor, Author) commented Jun 5, 2019

@erikerlandson gentle ping :)

skonto (Contributor, Author) commented Jun 9, 2019

@srowen do you think I can get this merged? Erik is off until the 24th.

srowen (Member) commented Jun 9, 2019

Merged to master

srowen closed this in 7912ab8 on Jun 9, 2019
emanuelebardelli pushed a commit to emanuelebardelli/spark that referenced this pull request Jun 15, 2019

Closes apache#24748 from skonto/fix_executor_sa.

Authored-by: Stavros Kontopoulos <[email protected]>
Signed-off-by: Sean Owen <[email protected]>
gregsterin pushed a commit to Affirm/spark that referenced this pull request Sep 18, 2020

Closes apache#24748 from skonto/fix_executor_sa.

Authored-by: Stavros Kontopoulos <[email protected]>
Signed-off-by: Sean Owen <[email protected]>
(cherry picked from commit 7912ab8)
erikerlandson pushed a commit that referenced this pull request Sep 25, 2020
### What changes were proposed in this pull request?

Similar patch to #24748, but applied to branch-2.4.
Backports the fix to the 2.4.x releases.
Please let me know if I missed a step; I haven't contributed to Spark in a long time.

Closes #29844 from nssalian/patch-SPARK-27872.

Authored-by: nssalian <[email protected]>
Signed-off-by: Erik Erlandson <[email protected]>
dongjoon-hyun pushed a commit that referenced this pull request Oct 12, 2020
### What changes were proposed in this pull request?

Similar patch to #24748, but applied to branch-2.4.
Backports the fix to the 2.4.x releases.

Closes #29877 from nssalian/SPARK-27872.

Authored-by: nssalian <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
SemanticBeeng commented May 2, 2021

@CloudFlow @RayRoestenburg @yuchaoran2011: can the latest Cloudflow work with the latest Spark operator for Spark 3.1.1?

On https://cloudflow.io/docs/dev/administration/installing-spark-operator.html (https://archive.ph/wip/Qa1Vn) it says

The only Cloudflow compatible Spark operator is a fork maintained by Lightbend. See Backported Fix for Spark 2.4.5 for more details.

Is it still the case that the Spark 2.4 fork is required?
If so, please advise what needs to be done to make Cloudflow run with Spark 3.1.1.

Resources

  1. Spark executor pods fail to pull image when private registry is used lightbend/cloudflow#511 (comment)
  2. Made field serviceAccount available for both driver and executors kubeflow/spark-operator#924 (comment)
  3. Updated Spark base image to use Alpine; Fixed Spark build script lightbend/cloudflow#578 (comment)

RayRoestenburg commented

Hi @SemanticBeeng, we are in the process of releasing Cloudflow 2.1.0, which takes a different approach to Spark integration. We have tested that this approach works with the latest version of the Spark operator and with Spark 3.1.1. Please see the cloudflow-contrib docs for the new approach: https://lightbend.github.io/cloudflow-contrib/docs/0.0.4/index.html (the repo is here: https://github.com/lightbend/cloudflow-contrib).
