Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-23639][SQL]Obtain token before init metastore client in SparkSQL CLI #20784

Closed
wants to merge 6 commits into from
Closed

Conversation

yaooqinn
Copy link
Member

@yaooqinn yaooqinn commented Mar 9, 2018

What changes were proposed in this pull request?

In SparkSQLCLI, SessionState generates before SparkContext instantiating. When we use --proxy-user to impersonate, it's unable to initializing a metastore client to talk to the secured metastore for no kerberos ticket.

This PR use real user ugi to obtain token for owner before talking to kerberized metastore.

How was this patch tested?

Manually verified with kerberized hive metasotre / hdfs.

@yaooqinn
Copy link
Member Author

yaooqinn commented Mar 9, 2018

cc @cloud-fan

@SparkQA
Copy link

SparkQA commented Mar 9, 2018

Test build #88122 has finished for PR 20784 at commit 2011ef2.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@vanzin
Copy link
Contributor

vanzin commented Mar 9, 2018

Which cluster manager are you using? This should be completely unnecessary in YARN and Mesos, and standalone in general does not support kerberos.

@yaooqinn
Copy link
Member Author

yarn @vanzin

if (isSecuredAndProxy(conf)) {
val currentUser = UserGroupInformation.getCurrentUser
try {
SparkHadoopUtil.get.doAsRealUser {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couldn't you just call HiveDelegationTokenProvider here instead of copy & pasting the code?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

@SparkQA
Copy link

SparkQA commented Mar 27, 2018

Test build #88612 has finished for PR 20784 at commit 4063855.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Member Author

@yaooqinn yaooqinn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @vanzin

if (isSecuredAndProxy(conf)) {
val currentUser = UserGroupInformation.getCurrentUser
try {
SparkHadoopUtil.get.doAsRealUser {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

@cloud-fan
Copy link
Contributor

cc @jerryshao

@@ -77,6 +80,12 @@ private[hive] object SparkSQLCLIDriver extends Logging {
})
}

private def isSecuredAndProxy(hiveConf: HiveConf): Boolean = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this basically HiveDelegationTokenProvider.delegationTokensRequired? Doesn't it work if you just call that method? The only difference is the check for deploy mode, which should work fine in this context.

@SparkQA
Copy link

SparkQA commented Mar 27, 2018

Test build #88613 has finished for PR 20784 at commit ba6b086.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Mar 27, 2018

Test build #88615 has finished for PR 20784 at commit 5c4335b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Mar 27, 2018

Test build #88617 has finished for PR 20784 at commit 1e38840.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@yaooqinn
Copy link
Member Author

retest this please

@SparkQA
Copy link

SparkQA commented Mar 27, 2018

Test build #88620 has finished for PR 20784 at commit 1e38840.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@yaooqinn
Copy link
Member Author

retest this please

@SparkQA
Copy link

SparkQA commented Mar 27, 2018

Test build #88624 has finished for PR 20784 at commit 1e38840.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -121,6 +123,11 @@ private[hive] object SparkSQLCLIDriver extends Logging {
}
}

Option(new HiveDelegationTokenProvider)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are you trying to achieve? new either returns something or throws an exception, so either you get Some(foo) here or an exception.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it will get a Some and filtered if token unneeded

Option(new HiveDelegationTokenProvider)
.filter(_.delegationTokensRequired(sparkConf, hadoopConf))
.foreach(_.obtainDelegationTokens(
hadoopConf, sparkConf, UserGroupInformation.getCurrentUser.getCredentials))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will not insert the tokens into the current UGI, because getCredentials returns a copy.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK,i got it

@SparkQA
Copy link

SparkQA commented Mar 28, 2018

Test build #88647 has finished for PR 20784 at commit cd8056c.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@vanzin
Copy link
Contributor

vanzin commented Mar 28, 2018

LGTM. retest this please

@SparkQA
Copy link

SparkQA commented Mar 29, 2018

Test build #88680 has finished for PR 20784 at commit cd8056c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@vanzin
Copy link
Contributor

vanzin commented Mar 29, 2018

Merging to master / 2.3.

asfgit pushed a commit that referenced this pull request Mar 29, 2018
…SQL CLI

## What changes were proposed in this pull request?

In SparkSQLCLI, SessionState generates before SparkContext instantiating. When we use --proxy-user to impersonate, it's unable to initializing a metastore client to talk to the secured metastore for no kerberos ticket.

This PR use real user ugi to obtain token for owner before talking to kerberized metastore.

## How was this patch tested?

Manually verified with kerberized hive metasotre / hdfs.

Author: Kent Yao <[email protected]>

Closes #20784 from yaooqinn/SPARK-23639.

(cherry picked from commit a7755fd)
Signed-off-by: Marcelo Vanzin <[email protected]>
@asfgit asfgit closed this in a7755fd Mar 29, 2018
mshtelma pushed a commit to mshtelma/spark that referenced this pull request Apr 5, 2018
…SQL CLI

## What changes were proposed in this pull request?

In SparkSQLCLI, SessionState generates before SparkContext instantiating. When we use --proxy-user to impersonate, it's unable to initializing a metastore client to talk to the secured metastore for no kerberos ticket.

This PR use real user ugi to obtain token for owner before talking to kerberized metastore.

## How was this patch tested?

Manually verified with kerberized hive metasotre / hdfs.

Author: Kent Yao <[email protected]>

Closes apache#20784 from yaooqinn/SPARK-23639.
peter-toth pushed a commit to peter-toth/spark that referenced this pull request Oct 6, 2018
…SQL CLI

In SparkSQLCLI, SessionState generates before SparkContext instantiating. When we use --proxy-user to impersonate, it's unable to initializing a metastore client to talk to the secured metastore for no kerberos ticket.

This PR use real user ugi to obtain token for owner before talking to kerberized metastore.

Manually verified with kerberized hive metasotre / hdfs.

Author: Kent Yao <[email protected]>

Closes apache#20784 from yaooqinn/SPARK-23639.

(cherry picked from commit a7755fd)
Signed-off-by: Marcelo Vanzin <[email protected]>

Change-Id: I78879cd2500f911c19eccef6a1140fb996485e26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants