[SPARK-24948][SHS] Delegate check access permissions to the file system #21895
Conversation
@jerryshao @vanzin would you please take a look at this? Thanks.
Test build #93673 has finished for PR 21895 at commit
Test build #93676 has finished for PR 21895 at commit
retest this please
Test build #93719 has finished for PR 21895 at commit
cc @mridulm too
 * Cache containing the result for the already checked files.
 */
// Visible for testing.
private[history] val cache = new mutable.HashMap[String, Boolean]
For a long-running history server in busy clusters (particularly where spark.history.fs.cleaner.maxAge is configured to be low), this Map will cause an OOM. Either an LRU cache or a disk-backed map with periodic cleanup (based on maxAge) might be better?
+CC @jerryshao
Test build #93726 has finished for PR 21895 at commit
Test build #93735 has finished for PR 21895 at commit
@@ -973,6 +978,42 @@ private[history] object FsHistoryProvider {
   private[history] val CURRENT_LISTING_VERSION = 1L
 }
 
+private[history] trait CachedFileSystemHelper extends Logging {
As discussed offline, my main concern is cache inconsistency if the user changes the file permissions while the cache entry is still valid.
This is true, but the only way to avoid this issue is to call fs.access every time, which may cause huge performance issues. Moreover, I think it is very unlikely that a user manually changes the permissions of an application's event logs, and restarting the SHS in such a scenario would solve the problem. In the current state, even though the file is accessible, it is ignored and the user has no workaround other than changing the ownership or permissions of all the files, despite the fact that the user running the SHS can read them (moreover, it is a regression for these users)...
Anyway, if you have a better suggestion I am more than happy to follow it.
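For context, here is a minimal sketch of what delegating the check to the file system could look like, using Hadoop's FileSystem.access API; the helper object and method names are assumptions for illustration, not the PR's actual code:

```scala
import java.io.FileNotFoundException
import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.hadoop.fs.permission.FsAction
import org.apache.hadoop.security.AccessControlException

// Hypothetical helper: ask the file system itself whether the current user can
// read the path. FileSystem.access consults ACLs and other server-side policies,
// unlike a local re-implementation of the basic permission bits.
object AccessCheckSketch {
  def canRead(fs: FileSystem, path: Path): Boolean = {
    try {
      fs.access(path, FsAction.READ)
      true
    } catch {
      case _: AccessControlException => false   // user is not allowed to read
      case _: FileNotFoundException => false    // file disappeared in the meantime
    }
  }
}
```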
// Visible for testing.
private[history] val cache = CacheBuilder.newBuilder()
  .expireAfterAccess(expireTimeInSeconds, TimeUnit.SECONDS)
  .build[String, java.lang.Boolean]()
In the real world, there will be many event logs under the folder, so this will lead to memory increasing indefinitely and potentially to an OOM. We have seen a customer with more than 100K event logs in this folder.
Memory doesn't increase indefinitely, as entries expire over time. Moreover, since we are storing only a string containing the name of the file and a Boolean, each entry needs about 100 bytes in memory. With 100K event logs, this means about 10MB, which doesn't seem to me a value which can cause an OOM. Anyway, we can also add a maximum number of entries for this cache if you think it is necessary; this would cause some more RPC calls, though.
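If a size bound were added as discussed, a minimal sketch could look like the following; the 100K cap and one-hour expiry are illustrative assumptions, not values from the PR:

```scala
import java.util.concurrent.TimeUnit
import com.google.common.cache.CacheBuilder

// Sketch of a bounded variant of the cache above: maximumSize caps the memory
// footprint (at the cost of extra fs.access calls after evictions), while
// expireAfterAccess still drops entries that have not been touched recently.
object BoundedAccessCacheSketch {
  val cache = CacheBuilder.newBuilder()
    .maximumSize(100000)                         // illustrative cap, not from the PR
    .expireAfterAccess(3600, TimeUnit.SECONDS)   // illustrative expiry
    .build[String, java.lang.Boolean]()
}
```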
My current thinking is to revert SPARK-20172 and improve the logging when an exception is met during the actual read. Also, if a file cannot be read the first time, add it to a blacklist to avoid reading it again.
@jerryshao thanks for your reply and suggestion. That can be done, but IIUC I see mainly 2 problems:
What do you think? Thanks.
I don't think the problem you mentioned is a big problem.
thanks for the comments @jerryshao. I will update this PR accordingly. Thanks.
Test build #93881 has finished for PR 21895 at commit
retest this please
Test build #93894 has finished for PR 21895 at commit
-private[history] class FsHistoryProvider(conf: SparkConf, clock: Clock)
-  extends ApplicationHistoryProvider with Logging {
+private[history] class FsHistoryProvider(conf: SparkConf, protected val clock: Clock)
+  extends ApplicationHistoryProvider with LogFilesBlacklisting with Logging {
What is the special advantage of using a mixin trait rather than directly changing the code here in FsHistoryProvider?
I just wanted to separate the blacklisting logic, since FsHistoryProvider already contains a lot of code, so I considered it more readable. If you prefer, I can inline it.
This doesn't seem necessary; let's inline this trait.
try {
  task.get()
} catch {
  case e: InterruptedException =>
    throw e
  case e: ExecutionException if e.getCause.isInstanceOf[AccessControlException] =>
    // We don't have read permissions on the log file
    logDebug(s"Unable to read log $path", e.getCause)
I would suggest using a warning log the first time we meet such an issue, to notify the user that some event logs cannot be read.
Sure, will do, thanks
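A rough sketch of that "warn once, then stay quiet" idea; the warned set, the object name, and the plain println calls (standing in for Spark's logWarning/logDebug) are all assumptions for illustration, not the PR's actual code:

```scala
import java.util.concurrent.ConcurrentHashMap
import org.apache.hadoop.security.AccessControlException

// Sketch only: remember which paths we already warned about, so the first
// failure is visible to the operator and later failures stay at debug level.
object WarnOnceSketch {
  private val warned = ConcurrentHashMap.newKeySet[String]()

  def onAccessDenied(path: String, cause: AccessControlException): Unit = {
    if (warned.add(path)) {
      println(s"WARN: Unable to read log $path: ${cause.getMessage}")   // logWarning in practice
    } else {
      println(s"DEBUG: Skipping unreadable log $path")                  // logDebug in practice
    }
  }
}
```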
protected def clearBlacklist(expireTimeInSeconds: Long): Unit = {
  val expiredThreshold = clock.getTimeMillis() - expireTimeInSeconds * 1000
  val expired = new mutable.ArrayBuffer[String]
  blacklist.asScala.foreach {
Ideally the iteration should be synchronized, but I think it is not a big deal here.
I don't think it is needed, as a new collection is built when doing asScala, so we work on a definite snapshot of the original map.
AFAIK, asScala doesn't copy or create a snapshot of the original map; it just wraps the original map and provides a Scala API. Changes to the original map will also affect the object returned by asScala.
Yes, sorry, you are right. I got confused because I changed this line before pushing here and was thinking of my previous implementation. Yes, we are not working on a definite snapshot of the values here, but I don't think this is a problem anyway, as we are not updating the values; we may miss processing new entries, but I don't think that is an issue. Thanks.
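A small sketch illustrating the point: asScala only wraps the Java map, so later changes to the underlying ConcurrentHashMap are visible through the wrapper (the map contents here are made up for the example):

```scala
import java.util.concurrent.ConcurrentHashMap
import scala.collection.JavaConverters._

object AsScalaViewSketch {
  def main(args: Array[String]): Unit = {
    val blacklist = new ConcurrentHashMap[String, Long]()
    blacklist.put("hdfs://logs/app-1", 1000L)

    // asScala wraps the Java map; it does not copy it into a new collection.
    val view = blacklist.asScala

    // An update to the underlying map is immediately visible through the wrapper.
    blacklist.put("hdfs://logs/app-2", 2000L)
    println(view.size)   // prints 2, not 1
  }
}
```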
@mridulm would you please also take a look? Thanks!
Test build #93991 has finished for PR 21895 at commit
retest this please
Jenkins, retest this please.
Ping @mridulm, would you please also take a look? Thanks!
Test build #94084 has finished for PR 21895 at commit
Hi @mgaido91, would you please check whether this is auto-mergeable to branches 2.2/2.3? If not, please also prepare the fix for the related branches once this is merged.
sure @jerryshao, will do. Thanks for the review.
Looks good to me! I left a couple of minor comments though.
@@ -779,6 +808,8 @@ private[history] class FsHistoryProvider(conf: SparkConf, clock: Clock)
       listing.delete(classOf[LogInfo], log.logPath)
     }
   }
+  // Clean the blacklist from the expired entries.
+  clearBlacklist(CLEAN_INTERVAL_S)
My only concern is that, if there happens to be a transient ACL issue when initially accessing the file, we will never see it in the application list even after the ACL is fixed, without an SHS restart.
I am wondering if the clean interval here could be a fraction of CLEAN_INTERVAL_S, so that these files have a chance of making it to the app list without much overhead on the NN.
This is scheduled anyway every CLEAN_INTERVAL_S, so I don't think that changing the value here helps. We may define another config for the blacklisting expiration, but that seems overkill to me. I think it is very unlikely that a user changes permissions on these files, and when they do, they can always restart the SHS. Or we can also decide to clean the blacklist every fixed amount of time. Honestly, I don't have a strong opinion on which of these options is best.
I misread it as MAX_LOG_AGE_S ... CLEAN_INTERVAL_S should be fine here, you are right.
blacklist.asScala.foreach {
  case (path, creationTime) if creationTime < expiredThreshold => expired += path
}
expired.foreach(blacklist.remove(_))
Instead of this, why not simply:
blacklist.asScala.retain((_, creationTime) => creationTime >= expiredThreshold)
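Putting the suggestion together, a self-contained sketch of the blacklist with a single-pass retain cleanup might look like this; the class and method names are assumptions for illustration, and retain is the Scala 2.11/2.12 mutable-map API:

```scala
import java.util.concurrent.ConcurrentHashMap
import scala.collection.JavaConverters._

// Sketch only: a blacklist mapping a log path to the time (ms) it was added.
class BlacklistSketch(clock: () => Long) {
  private val blacklist = new ConcurrentHashMap[String, Long]()

  def blacklistPath(path: String): Unit = blacklist.put(path, clock())

  def isBlacklisted(path: String): Boolean = blacklist.containsKey(path)

  // Drop expired entries in a single pass, instead of collecting the expired
  // paths into a buffer and removing them afterwards.
  def clearBlacklist(expireTimeInSeconds: Long): Unit = {
    val expiredThreshold = clock() - expireTimeInSeconds * 1000
    blacklist.asScala.retain((_, creationTime) => creationTime >= expiredThreshold)
  }
}
```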
Test build #94111 has finished for PR 21895 at commit
Test build #94127 has finished for PR 21895 at commit
retest this please
Test build #94140 has finished for PR 21895 at commit
retest this please
Test build #94171 has finished for PR 21895 at commit
retest this please
Test build #94268 has finished for PR 21895 at commit
LGTM.
@mridulm do you have any further comments?
I merged this to master, thanks for the work @mgaido91!
Thanks for your help with this @mridulm @jerryshao. Yes, sure, I am working on it; I'll ping you once they are ready, thanks.
In `SparkHadoopUtil.checkAccessPermission`, we consider only basic permissions in order to check whether a user can access a file or not. This is not a complete check, as it ignores ACLs and other policies a file system may apply internally. So this can result in wrongly returning that a user cannot access a file (even though they actually can). The PR proposes to delegate to the file system the check of whether a file is accessible or not, in order to return the right result. A caching layer is added for performance reasons. Modified UTs. Author: Marco Gaido <[email protected]> Closes apache#21895 from mgaido91/SPARK-24948.
What changes were proposed in this pull request?
In `SparkHadoopUtil.checkAccessPermission`, we consider only basic permissions in order to check whether a user can access a file or not. This is not a complete check, as it ignores ACLs and other policies a file system may apply internally. So this can result in wrongly returning that a user cannot access a file (even though they actually can). The PR proposes to delegate to the file system the check of whether a file is accessible or not, in order to return the right result. A caching layer is added for performance reasons.
How was this patch tested?
Modified UTs.