Skip to content
This repository has been archived by the owner on Jun 6, 2024. It is now read-only.

[Rest Server] Expose task role completion policy #879

Merged
merged 12 commits into from
Jul 27, 2018

Conversation

abuccts
Copy link
Member

@abuccts abuccts commented Jul 13, 2018

Obsolete killOnAnyComplete and use task role completion policy.

@coveralls
Copy link

coveralls commented Jul 13, 2018

Coverage Status

Coverage increased (+20.7%) to 72.546% when pulling ba1a772 on xiongyf/rest-server-retry-policy into 7e2d587 on master.

@abuccts abuccts force-pushed the xiongyf/rest-server-retry-policy branch from 0ee9807 to 0e98e97 Compare July 13, 2018 09:21
| `taskRole.portType.portNumber` | Integer, required | Number of ports for the specific type |
| `taskRole.command` | String, required | Executable command for tasks in the task role, can not be empty |
| `taskRole.minFailedTaskCount` | Integer, optional | Number of failed tasks to kill the entire job, no less than 1 |
| `taskRole.minSucceededTaskCount` | Integer, optional | Number of succeeded tasks to kill the entire job, no less than 1 |
Copy link
Member

@yqwang-ms yqwang-ms Jul 13, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

@abuccts abuccts Jul 14, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok #Closed

Copy link
Member Author

@abuccts abuccts Jul 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated #Closed

minSucceededTaskCount: Joi.number()
.integer()
.min(1)
.optional(),
Copy link
Member

@yqwang-ms yqwang-ms Jul 13, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To distinguish field NOT EXIST and field is set to NULL (To allow user can set minFailedTaskCount = null).

Can we make:
minFailedTaskCount.default(1)
minSucceededTaskCount.default(null)

And change this:
'applicationCompletionPolicy': {
'minFailedTaskCount': data.taskRoles[i].minFailedTaskCount,
'minSucceededTaskCount': data.taskRoles[i].minSucceededTaskCount,
}, #Closed

Copy link
Member Author

@abuccts abuccts Jul 14, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Default value means if user doesn't set the field then the field will be set to default value, which is the same as current code.
We can make minFailedTaskCount.default(1) but can't make minSucceededTaskCount.default(null) because minSucceededTaskCount is limited to integer in schema. #Closed

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If user set the minFailedTaskCount = null, from below line:

'minFailedTaskCount': data.taskRoles[i].minFailedTaskCount || 1,

What will you send to launcher? null or 1?
I expect you also send null to launcher.


In reply to: 202507760 [](ancestors = 202507760)

Copy link
Member Author

@abuccts abuccts Jul 19, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the minFailedTaskCount is limited to integer, user can not set it to null. I suggest we can use 0 to replace null in config file.
And what's difference between minFailedTaskCount=null and minFailedTaskCount=taskNumber, are they the same? #Closed

Copy link
Member

@yqwang-ms yqwang-ms Jul 19, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For example, 2 taskrole A and B, and B is always running,

If you set A.minFailedTaskCount=null, then if all task in A are failed, the whole job will still run.

If you set A.minFailedTaskCount=A.taskNumber, then if all task in A are failed, the whole job will fail.


In reply to: 203646504 [](ancestors = 203646504)

Copy link
Member Author

@abuccts abuccts Jul 19, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use 0 to support null? Because minFailedTaskCount is greater or equal than 1 at present. #Closed

Copy link
Member

@yqwang-ms yqwang-ms Jul 19, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use 0 is the last resort, since it is different from Launcher User Manaual, user will see misleading wiki. #Closed

Copy link
Member

@yqwang-ms yqwang-ms Jul 19, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And 0 means any failed task (because it is larger than 0) will trigger the whole app fail. #Closed

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use -1 to substitute null if js really cannot accept null as int value.


In reply to: 203690975 [](ancestors = 203690975)

Copy link
Member Author

@abuccts abuccts Jul 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated, allow null in minFailedTaskCount and minSucceededTaskCount #Closed

@yqwang-ms
Copy link
Member

yqwang-ms commented Jul 13, 2018

Seems there are other places in PAI using "killAllOnCompletedTaskNumber", could you please search and clean up them? #Closed

@abuccts
Copy link
Member Author

abuccts commented Jul 14, 2018

I have removed all killAllOnCompletedTaskNumber in docs and examples. #Closed

@yqwang-ms
Copy link
Member

Copy link
Member

@Gerhut Gerhut left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, #839 should be update after this pr merged.

@yqwang-ms
Copy link
Member

yqwang-ms commented Jul 16, 2018

@Gerhut If without #839, this PR will still work (backward compatiable), it is fine. If not, we would better make the update here. #Closed

@Gerhut
Copy link
Member

Gerhut commented Jul 16, 2018

@yqwang-ms Confirmed that it will work without #839. #Closed

@yqwang-ms
Copy link
Member

yqwang-ms commented Jul 16, 2018

@Gerhut Thanks for confirm #Closed

@Gerhut
Copy link
Member

Gerhut commented Jul 24, 2018

@abuccts Since #807 is merge into master, if you have trouble with resolving conflict, feel free to contact me. #Closed

@yqwang-ms
Copy link
Member

yqwang-ms commented Jul 24, 2018

minSucceededTaskCount: Joi.number()
.integer()
.min(1)
.allow(null)
Copy link
Member

@yqwang-ms yqwang-ms Jul 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, js allow null to be int value? #Closed

Copy link
Member Author

@abuccts abuccts Jul 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's about how to validate one field on both int and null. #Closed

.integer()
.min(1)
.allow(null)
.default(1),
Copy link
Member

@yqwang-ms yqwang-ms Jul 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you test below things:

If user set the minFailedTaskCount = null, from below line:

'minFailedTaskCount': data.taskRoles[i].minFailedTaskCount,

What will you send to launcher?
I expect you also send null to launcher.
#Closed

Copy link
Member Author

@abuccts abuccts Jul 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. #Closed

.integer()
.min(1)
.allow(null)
.default(null),
}))
Copy link
Member

@yqwang-ms yqwang-ms Jul 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we have default, do we still need optional? #Closed

Copy link
Member Author

@abuccts abuccts Jul 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. If user do not fill the field, the default value will be used. #Closed

Copy link
Member

@yqwang-ms yqwang-ms left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@fanyangCS
Copy link
Contributor

Can you support backward compatibility?

@abuccts abuccts merged commit 39e0015 into master Jul 27, 2018
@abuccts abuccts deleted the xiongyf/rest-server-retry-policy branch July 27, 2018 08:07
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants