TEZ-4441: TezAppMaster may stuck because of reportError skip send err… #236
🎊 +1 overall
This message was automatically generated.
@@ -910,7 +910,8 @@ public void reportError(int taskSchedulerIndex, ServicePluginError servicePlugin
 LOG.info("Error reported by scheduler {} - {}",
 Utils.getTaskSchedulerIdentifierString(taskSchedulerIndex, appContext) + ": " +
 diagnostics);
-if (taskSchedulerDescriptors[taskSchedulerIndex].getClassName().equals(yarnSchedulerClassName)) {
+if (taskSchedulerDescriptors[taskSchedulerIndex].getEntityName()
I guess this is the actual fix.
Just a note: what are these values in your case?
taskSchedulerDescriptors[taskSchedulerIndex].getClassName()
yarnSchedulerClassName
taskSchedulerDescriptors[taskSchedulerIndex].getEntityName()
Yes, this is the only actual fix.
- Before this PR
method | return value |
---|---|
taskSchedulerDescriptors[taskSchedulerIndex].getClassName() | null |
yarnSchedulerClassName | "org.apache.tez.dag.app.rm.YarnTaskSchedulerService" |
taskSchedulerDescriptors[taskSchedulerIndex].getClassName() is set from the 'taskSchedulerDescriptors' variable in DAGAppMaster::serviceInit. In DAGAppMaster::parsePlugin, when we construct the NamedEntityDescriptor for the Tez YARN plugin, the className is always null.
yarnSchedulerClassName is set from tez.am.yarn.scheduler.class; the default value is "org.apache.tez.dag.app.rm.YarnTaskSchedulerService".
So for the Tez YARN plugin, taskSchedulerDescriptors[taskSchedulerIndex].getClassName() will never equal yarnSchedulerClassName, and reportError then skips sending the error event.
- After this PR
taskSchedulerDescriptors[taskSchedulerIndex].getEntityName() will return "TezYarn"
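The before/after comparison above can be illustrated with a small standalone sketch. It does not use Tez classes: Descriptor is a hypothetical stand-in for NamedEntityDescriptor with only the two getters discussed here, and the constant names are made up for the example. It also assumes the new check compares the entity name against "TezYarn", since the second line of the changed condition is cut off in the diff. The class-name comparison uses a null-safe Objects.equals purely to show the mismatch; it makes no claim about how the real code behaves when className is null.

```java
import java.util.Objects;

public class SchedulerMatchSketch {

  // Hypothetical stand-in for NamedEntityDescriptor (not the Tez class).
  static final class Descriptor {
    private final String entityName;
    private final String className; // left null for the built-in YARN plugin

    Descriptor(String entityName, String className) {
      this.entityName = entityName;
      this.className = className;
    }

    String getEntityName() { return entityName; }
    String getClassName() { return className; }
  }

  // Default of tez.am.yarn.scheduler.class, as quoted in the thread.
  static final String YARN_SCHEDULER_CLASS_NAME =
      "org.apache.tez.dag.app.rm.YarnTaskSchedulerService";

  // Entity name reported for the built-in plugin in the thread.
  static final String TEZ_YARN_PLUGIN_NAME = "TezYarn";

  // Old-style check: compare the descriptor's class name (null-safe here).
  static boolean matchesByClassName(Descriptor d) {
    return Objects.equals(d.getClassName(), YARN_SCHEDULER_CLASS_NAME);
  }

  // New-style check (assumed form): compare the descriptor's entity name.
  static boolean matchesByEntityName(Descriptor d) {
    return TEZ_YARN_PLUGIN_NAME.equals(d.getEntityName());
  }

  public static void main(String[] args) {
    // The built-in plugin descriptor: entity name set, class name never filled.
    Descriptor yarn = new Descriptor(TEZ_YARN_PLUGIN_NAME, null);
    System.out.println("by class name:  " + matchesByClassName(yarn));  // false
    System.out.println("by entity name: " + matchesByEntityName(yarn)); // true
  }
}
```

With a null class name, only the entity-name comparison identifies the descriptor as the YARN scheduler, which is why the branch guarded by the old condition was never taken.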
I see, nice catch:
TezConstants.getTezYarnServicePluginName(), null).setUserPayload(defaultPayload);
We simply don't fill in the class name, so we should not rely on it; it should only be used in the createCustomTaskScheduler case.
tez-dag/src/test/java/org/apache/tez/dag/app/rm/TestTaskSchedulerManager.java
…or event (#236) (zhengchenyu reviewed by Laszlo Bodor)
…or event (apache#236) (zhengchenyu reviewed by Laszlo Bodor) (cherry picked from commit 55b6031)
https://issues.apache.org/jira/browse/TEZ-4441