[SPARK-23639][SQL]Obtain token before init metastore client in SparkSQL CLI #20784
cc @cloud-fan
Test build #88122 has finished for PR 20784 at commit
Which cluster manager are you using? This should be completely unnecessary in YARN and Mesos, and standalone in general does not support kerberos.
yarn @vanzin
if (isSecuredAndProxy(conf)) {
  val currentUser = UserGroupInformation.getCurrentUser
  try {
    SparkHadoopUtil.get.doAsRealUser {
Couldn't you just call HiveDelegationTokenProvider here instead of copy & pasting the code?
updated
Test build #88612 has finished for PR 20784 at commit
cc @vanzin
cc @jerryshao
@@ -77,6 +80,12 @@ private[hive] object SparkSQLCLIDriver extends Logging {
    })
  }

  private def isSecuredAndProxy(hiveConf: HiveConf): Boolean = {
Isn't this basically HiveDelegationTokenProvider.delegationTokensRequired? Doesn't it work if you just call that method? The only difference is the check for deploy mode, which should work fine in this context.
Test build #88613 has finished for PR 20784 at commit
Test build #88615 has finished for PR 20784 at commit
Test build #88617 has finished for PR 20784 at commit
retest this please
Test build #88620 has finished for PR 20784 at commit
retest this please
Test build #88624 has finished for PR 20784 at commit
@@ -121,6 +123,11 @@ private[hive] object SparkSQLCLIDriver extends Logging {
      }
    }

    Option(new HiveDelegationTokenProvider)
What are you trying to achieve? new either returns something or throws an exception, so either you get Some(foo) here or an exception.
It will always be a Some, which is then filtered out if no token is needed.
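The exchange above can be illustrated with a small, self-contained sketch. DemoProvider is a hypothetical stand-in, not Spark's HiveDelegationTokenProvider, and its required flag only mimics the idea of delegationTokensRequired. The sketch shows that wrapping a freshly constructed object in Option always yields a Some, and that only the filter step can produce a None.

```scala
// Hypothetical stand-in for a provider class; not part of Spark.
final class DemoProvider(val required: Boolean)

object OptionOfNewDemo {
  // Option(x) is None only when x is null, and `new` never returns null,
  // so this is always a Some (or the constructor throws).
  val wrapped: Option[DemoProvider] = Option(new DemoProvider(required = false))

  // The filter is the only step that can turn the Some into a None.
  val filtered: Option[DemoProvider] = wrapped.filter(_.required)

  // An equivalent formulation without the Option wrapper.
  def runIfRequired(p: DemoProvider)(action: DemoProvider => Unit): Unit =
    if (p.required) action(p)
}
```

Both forms behave the same; the plain if form arguably states the intent more directly, which seems to be the point of the review question.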
Option(new HiveDelegationTokenProvider)
  .filter(_.delegationTokensRequired(sparkConf, hadoopConf))
  .foreach(_.obtainDelegationTokens(
    hadoopConf, sparkConf, UserGroupInformation.getCurrentUser.getCredentials))
This will not insert the tokens into the current UGI, because getCredentials returns a copy.
OK, I got it.
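The copy pitfall raised in this thread can be modeled in plain Scala. DemoUgi below is a hypothetical stand-in for UserGroupInformation, with tokens reduced to a string map; it is a sketch of the behavior the reviewer describes, not Hadoop's implementation. Hadoop's real UserGroupInformation does offer an addCredentials method, but whether the final patch used it is not shown in this conversation.

```scala
import scala.collection.mutable

// Hypothetical stand-in for UserGroupInformation; tokens are modeled as a map.
final class DemoUgi {
  private val creds = mutable.Map.empty[String, String]

  // Returns a defensive copy, mirroring what the reviewer says about getCredentials.
  def getCredentials: mutable.Map[String, String] = {
    val copy = mutable.Map.empty[String, String]
    copy ++= creds
    copy
  }

  // Merges the supplied credentials into the user's own set.
  def addCredentials(extra: collection.Map[String, String]): Unit = creds ++= extra

  def tokenCount: Int = creds.size
}

object CopyPitfallDemo {
  // Writing into the returned copy: the token never reaches the user.
  def writeIntoCopy(ugi: DemoUgi): Unit =
    ugi.getCredentials.put("hive", "token")

  // Collecting into a fresh container, then adding it back explicitly.
  def collectThenAdd(ugi: DemoUgi): Unit = {
    val fresh = mutable.Map.empty[String, String]
    fresh.put("hive", "token")
    ugi.addCredentials(fresh)
  }
}
```

With this model, writeIntoCopy leaves the user with no tokens, while collectThenAdd leaves exactly one.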
Test build #88647 has finished for PR 20784 at commit
LGTM. retest this please
Test build #88680 has finished for PR 20784 at commit
Merging to master / 2.3.
…SQL CLI
## What changes were proposed in this pull request?
In SparkSQLCLI, SessionState generates before SparkContext instantiating. When we use --proxy-user to impersonate, it's unable to initializing a metastore client to talk to the secured metastore for no kerberos ticket. This PR use real user ugi to obtain token for owner before talking to kerberized metastore.
## How was this patch tested?
Manually verified with kerberized hive metasotre / hdfs.
Author: Kent Yao <[email protected]>
Closes #20784 from yaooqinn/SPARK-23639.
(cherry picked from commit a7755fd)
Signed-off-by: Marcelo Vanzin <[email protected]>
What changes were proposed in this pull request?
In SparkSQLCLI, the SessionState is created before the SparkContext is instantiated. When --proxy-user is used to impersonate another user, the CLI cannot initialize a metastore client to talk to the secured metastore, because there is no Kerberos ticket.
This PR uses the real user's UGI to obtain a token for the owner before talking to the kerberized metastore.
How was this patch tested?
Manually verified with a kerberized Hive metastore / HDFS.