-
Notifications
You must be signed in to change notification settings - Fork 28.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-3286] - Cannot view ApplicationMaster UI when Yarn’s url scheme i... #2276
Conversation
Can one of the admins verify this patch? |
@@ -188,7 +188,7 @@ private[spark] class ApplicationMaster(args: ApplicationMasterArguments, | |||
if (sc == null) { | |||
finish(FinalApplicationStatus.FAILED, "Timed out waiting for SparkContext.") | |||
} else { | |||
registerAM(sc.ui.appUIHostPort) | |||
registerAM(sc.ui.appUIAddress) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pulling over my comment from the previous pr - #2206
This won't work with yarn alpha.
yes so unfortunately I went and verified that hadoop 0.23 can't have the scheme. It automatically adds the scheme itself (just supports http://).
This is already calling down into stable/alpha versions of YarnRMClientImpl.register. We could potentially pass whole uri, including scheme and then have the alpha version strip off the scheme.
Also you modified the call in runDriver to registerAM which covers client mode, I assume the same change needs to happen in runExecutorLauncher for yarn client mode.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added code to strip off the scheme in alpha version.
Not sure if I need to modify runExecutorLauncher.
@@ -96,7 +96,7 @@ private class YarnRMClientImpl(args: ApplicationMasterArguments) extends YarnRMC | |||
// Users can then monitor stderr/stdout on that node if required. | |||
appMasterRequest.setHost(Utils.localHostName()) | |||
appMasterRequest.setRpcPort(0) | |||
appMasterRequest.setTrackingUrl(uiAddress) | |||
appMasterRequest.setTrackingUrl(uiAddress.replaceAll("^http(\\w)*://", "")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would rather this done with something more reliable like URI class and just removing the scheme if it has one.
Also can you add a comment about we are removing it because hadoop doesn't handle the scheme.
Sure. I'll do both. |
No, alpha means pre-branch-2 hadoop (I think, Hadoop branching is not exactly an exact science). Anyway, there are stable releases without YARN-1203. So that probably should be handled. If there isn't an API to figure out the Yarn version, I'd use reflection to detect a method that was added after YARN-1203 (preferrably around this API), and only apply the fix if the method is available. |
Can one of the admins verify this patch? |
Thanks . With this implementation, I prefer not to use URI to remove the scheme from uiAddress for the alpha version of YarnRMClientImpl since there is a good chance of passing uiAddress without scheme and URI construction with throw exception. |
@@ -189,7 +189,11 @@ private[spark] class ApplicationMaster(args: ApplicationMasterArguments, | |||
if (sc == null) { | |||
finish(FinalApplicationStatus.FAILED, "Timed out waiting for SparkContext.") | |||
} else { | |||
registerAM(sc.ui.appUIHostPort, securityMgr) | |||
var uiAddress = sc.ui.appUIHostPort | |||
if (yarnConf.get("yarn.http.policy").equals("HTTPS_ONLY")) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I like this as much. The reason I prefer stripping it off in alpha is the yarn alpha support will be deprecated at some point (hopefully fairly soon) and I would rather have it be optimized for the yarn stable case which is what most people are using.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Per @vanzin , alpha means pre-branch-2 hadoop. Before YARN-1203(Hadoop 2.2), we cannot pass AM URLS with scheme. There are stable releases before YARN-1203. So for stable releases without YARN-1203, we have to pass host:port.
There are 4 cases:
Alpha and yarn.http.policy !=HTTPS_ONLY : uiAddress = host:port
Alpha and yarn.http.policy ==HTTPS_ONLY : uiAddress = scheme:host:port (strip off scheme since alpha will choke on scheme.)
stable and yarn.http.policy !=HTTPS_ONLY : uiAddress = host:port
stable and yarn.http.policy ==HTTPS_ONLY : uiAddress = scheme:host:port (we are taking a bet that yarn.http.policy==HTTPS_ONLY occurs only for stable release after Yarn-1203 since this property was introduced by Yarn-1277 . Both these changes are released in Hadoop-2.2. )
So I think, we need to do the conditional logic for both stable and alpha.
We also need to strip off the scheme for alpha.
Please let me know , if there is a better solution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually the yarn versioning thing is: Alpha is branch 0.23 and 2.x up til 2.1.x .
stable is 2.2.x and beyond
see http://spark.apache.org/docs/latest/building-with-maven.html#specifying-the-hadoop-version
since YARN-1203 went into 2.2 its fine to only special case in alpha but do the easy thing for stable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll defer to @tgravescs on Hadoop branch trivia; I see the point of having different logic for stable and alpha, although I'm ok with the current approach too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @tgravescs .
So all stable versions are post YARN-1203 and can handle urls with scheme.
Alpha versions cannot handle url with scheme.
Address with scheme is passed from the ApplicationMaster.scala.
There are only two cases.
Stable : pass the address with scheme.
Alpha : strips of the scheme from the address.
I have changed the code appropriately
LGTM. |
+1, looks good. Thanks @benoyantony |
Thank you. |
@pwendell can you add @benoyantony as contributor in jira. |
This broke the Jenkins PRB build because it introduced a style error:
I'll push a hotfix now. |
Thanks @JoshRosen. Sorry about that I forgot to trigger jenkins again. I keep having issues with it not triggering I forgot to double check this one. |
@tgravescs Don't sweat it, Jenkins has been flaky lately. I'm spending the entire day today ironing out Jenkins issues and building my own alternative to the Jenkins GitHub pull request builder plugin; follow https://github.com/databricks/spark-pr-dashboard/compare/pull-request-builder to track my progress. |
...s https