[SPARK-24948][SHS][BACKPORT-2.2] Delegate check access permissions to the file system #22022

Closed
wants to merge 4 commits into from

Conversation

mgaido91
Contributor

@mgaido91 mgaido91 commented Aug 7, 2018

What changes were proposed in this pull request?

In `SparkHadoopUtil.checkAccessPermission`, we consider only basic permissions when checking whether a user can access a file. This check is incomplete, as it ignores ACLs and other policies that a file system may apply internally. As a result, it can wrongly report that a user cannot access a file even though they actually can.

This PR delegates the check of whether a file is accessible to the file system, so that the correct result is returned. A caching layer is added for performance reasons.
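
The idea described above can be illustrated with a minimal sketch. This is not the code from the patch: Spark is written in Scala and works against the Hadoop file system API, whereas the sketch below uses only the Java standard library, with `Files.isReadable` standing in for "ask the file system". The class name `CachedAccessCheck` and its methods are hypothetical and exist only for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Predicate;

/**
 * Hypothetical helper (not part of Spark): instead of inspecting
 * owner/group/other permission bits itself, it asks a delegate whether a
 * path is readable and remembers the answer.
 */
class CachedAccessCheck {
    // The delegate performs the real check. By default this is
    // Files.isReadable, which leaves the decision to the file system provider.
    private final Predicate<Path> delegate;

    // Cache of earlier answers, so repeated checks do not hit the file system.
    private final Map<Path, Boolean> cache = new ConcurrentHashMap<>();

    // Counts how often the delegate was actually consulted.
    private final AtomicInteger delegateCalls = new AtomicInteger();

    CachedAccessCheck() {
        this(Files::isReadable);
    }

    CachedAccessCheck(Predicate<Path> delegate) {
        this.delegate = delegate;
    }

    /** Returns the cached answer if present, otherwise asks the delegate once. */
    boolean canRead(Path path) {
        return cache.computeIfAbsent(path, p -> {
            delegateCalls.incrementAndGet();
            return delegate.test(p);
        });
    }

    int delegateCalls() {
        return delegateCalls.get();
    }

    public static void main(String[] args) throws IOException {
        Path log = Files.createTempFile("eventlog", ".tmp");
        CachedAccessCheck checker = new CachedAccessCheck();
        System.out.println("readable: " + checker.canRead(log));
        System.out.println("readable (cached): " + checker.canRead(log));
        System.out.println("delegate calls: " + checker.delegateCalls());
        Files.delete(log);
    }
}
```

One design consequence worth noting: a cache like this never expires entries, so an answer can go stale if permissions change after the first check. A real implementation would likely need some form of expiry or invalidation.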

How was this patch tested?

added UT

In `SparkHadoopUtil.checkAccessPermission`, we consider only basic permissions when checking whether a user can access a file. This check is incomplete, as it ignores ACLs and other policies that a file system may apply internally. As a result, it can wrongly report that a user cannot access a file even though they actually can.

This PR delegates the check of whether a file is accessible to the file system, so that the correct result is returned. A caching layer is added for performance reasons.

modified UTs

Author: Marco Gaido <[email protected]>

Closes apache#21895 from mgaido91/SPARK-24948.
@SparkQA

SparkQA commented Aug 7, 2018

Test build #94360 has finished for PR 22022 at commit 16b7b40.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@mgaido91 mgaido91 changed the title [SPARK-24948][SHS] Delegate check access permissions to the file system [SPARK-24948][SHS][BACKPORT-2.2] Delegate check access permissions to the file system Aug 7, 2018
@mgaido91
Contributor Author

mgaido91 commented Aug 7, 2018

cc @jerryshao @mridulm

@SparkQA

SparkQA commented Aug 7, 2018

Test build #94361 has finished for PR 22022 at commit 657d364.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Member

retest this please

@SparkQA

SparkQA commented Aug 7, 2018

Test build #94369 has finished for PR 22022 at commit 657d364.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Aug 7, 2018

Test build #94374 has finished for PR 22022 at commit 16233d1.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Contributor

@jerryshao jerryshao left a comment

LGTM. Merging to branch 2.2

@jerryshao
Contributor

jerryshao commented Aug 8, 2018

Sorry, let me test again to make sure everything is OK. I will merge it once the tests pass.

@jerryshao
Contributor

Jenkins, retest this please.

@SparkQA

SparkQA commented Aug 8, 2018

Test build #94401 has finished for PR 22022 at commit 16233d1.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

asfgit pushed a commit that referenced this pull request Aug 8, 2018
… the file system

## What changes were proposed in this pull request?

In `SparkHadoopUtil.checkAccessPermission`, we consider only basic permissions when checking whether a user can access a file. This check is incomplete, as it ignores ACLs and other policies that a file system may apply internally. As a result, it can wrongly report that a user cannot access a file even though they actually can.

This PR delegates the check of whether a file is accessible to the file system, so that the correct result is returned. A caching layer is added for performance reasons.

## How was this patch tested?

added UT

Author: Marco Gaido <[email protected]>

Closes #22022 from mgaido91/SPARK-24948_2.2.
@jerryshao
Contributor

Merged to branch 2.2, please close this PR @mgaido91

@mgaido91
Contributor Author

mgaido91 commented Aug 8, 2018

Thanks @jerryshao , closing.

@mgaido91 mgaido91 closed this Aug 8, 2018
@mgaido91 mgaido91 deleted the SPARK-24948_2.2 branch August 8, 2018 08:40
Willymontaz pushed a commit to criteo-forks/spark that referenced this pull request Sep 26, 2019
Willymontaz pushed a commit to criteo-forks/spark that referenced this pull request Sep 27, 2019