Change code in framework launcher to support Specific Ports and Random port request co-exist. #1402
fix minus issue
fix ports
fix comment
adjust
List<ValueRange> newCandidatePorts = ValueRangeUtils.getSubRangeRandomly(selectionResult.getOverlapPorts(), optimizedRequestResource.getPortNumber(),
private synchronized List<ValueRange> selectPorts(List<ValueRange> availablePorts, ResourceDescriptor optimizedRequestResource) {
// Need select the ports for this task.
if (optimizedRequestResource.getPortNumber() > 0) {
refine #Resolved
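The fragment above picks a random subset of the available ports when the task requests a positive number of ports. A minimal sketch of that idea follows, assuming inclusive {begin, end} integer pairs in place of ValueRange. The class RandomPortPicker and the method pickRandomPorts are hypothetical names for illustration; this is not the implementation of ValueRangeUtils.getSubRangeRandomly.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

final class RandomPortPicker {
    // Hypothetical helper for illustration only; it is NOT the real
    // ValueRangeUtils.getSubRangeRandomly. Each range is an inclusive
    // {begin, end} pair (an assumption standing in for ValueRange).
    static List<Integer> pickRandomPorts(List<int[]> availableRanges, int portNumber, Random random) {
        // Mirror the getPortNumber() > 0 guard: nothing requested, nothing selected.
        if (portNumber <= 0) {
            return new ArrayList<>();
        }
        // Expand every range into individual candidate ports.
        List<Integer> candidates = new ArrayList<>();
        for (int[] range : availableRanges) {
            for (int port = range[0]; port <= range[1]; port++) {
                candidates.add(port);
            }
        }
        if (portNumber > candidates.size()) {
            throw new IllegalArgumentException(
                "Requested " + portNumber + " ports but only " + candidates.size() + " are available");
        }
        // Shuffle, keep the first portNumber candidates, and return them sorted.
        Collections.shuffle(candidates, random);
        List<Integer> selected = new ArrayList<>(candidates.subList(0, portNumber));
        Collections.sort(selected);
        return selected;
    }
}
```

Passing the Random instance in keeps the selection reproducible in tests; a caller that wants real randomness can pass new Random().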
String portLabel = portDefinition.getKey();
Ports ports = portDefinition.getValue();
// If defined static ports, directly use it, and remove the static ports from all ports list
if (ports.getCount() > 0 && ports.getStart() > 0) {
ports.getStart() > 0
!= 0 #Resolved
}
portString.append(";");
} else {
if (ports.getStart() == 0) {
Merge #Resolved
@@ -35,20 +33,32 @@ public static String toPortString(
if (portsDefinitions != null && !portsDefinitions.isEmpty()) {
List<ValueRange> coalescedPortRanges = ValueRangeUtils.coalesceRangeList(portRanges);
// Check user specified ports.
// Check user specified ports.
Assign static ports #Resolved
}
coalescedPortRanges = ValueRangeUtils.subtractRange(coalescedPortRanges, staticPorts);

// Check the dynamic ports.
for (Map.Entry<String, Ports> portDefinition : portsDefinitions.entrySet()) {
Assign dynamic ports #Resolved
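The two passes discussed in this thread (assign static ports first, subtract them from the available ports, then assign dynamic ports from what remains) can be sketched as below. This is an illustration under stated assumptions, not the code in this PR: PortAssignmentSketch and assignPorts are hypothetical names, each definition is a {start, count} pair standing in for Ports, available ports are a flat list rather than List<ValueRange>, and the "label:port,port;" output format is assumed. For determinism the sketch hands out the lowest free port rather than a random one, and it does not validate that static ports are actually available.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

final class PortAssignmentSketch {
    // Hypothetical illustration; each definition is {start, count}, where
    // start != 0 means a static request and start == 0 means "any free port".
    static String assignPorts(Map<String, int[]> portsDefinitions, List<Integer> availablePorts) {
        TreeSet<Integer> free = new TreeSet<>(availablePorts);
        Map<String, List<Integer>> assigned = new LinkedHashMap<>();

        // Pass 1: assign static ports and remove them from the free set.
        for (Map.Entry<String, int[]> definition : portsDefinitions.entrySet()) {
            int start = definition.getValue()[0];
            int count = definition.getValue()[1];
            List<Integer> ports = new ArrayList<>();
            if (count > 0 && start != 0) {
                for (int i = 0; i < count; i++) {
                    ports.add(start + i);
                    free.remove(start + i);
                }
            }
            assigned.put(definition.getKey(), ports);
        }

        // Pass 2: assign dynamic ports from whatever is still free.
        for (Map.Entry<String, int[]> definition : portsDefinitions.entrySet()) {
            int start = definition.getValue()[0];
            int count = definition.getValue()[1];
            if (count > 0 && start == 0) {
                for (int i = 0; i < count; i++) {
                    Integer port = free.pollFirst();
                    if (port == null) {
                        throw new IllegalStateException("Not enough free ports for " + definition.getKey());
                    }
                    assigned.get(definition.getKey()).add(port);
                }
            }
        }

        // Render as "label:p1,p2;" per definition (assumed format).
        StringBuilder portString = new StringBuilder();
        for (Map.Entry<String, List<Integer>> entry : assigned.entrySet()) {
            portString.append(entry.getKey()).append(":");
            for (int i = 0; i < entry.getValue().size(); i++) {
                if (i > 0) {
                    portString.append(",");
                }
                portString.append(entry.getValue().get(i));
            }
            portString.append(";");
        }
        return portString.toString();
    }
}
```

With definitions http = {0, 2} and ssh = {2001, 1} and available ports 2000 to 2003, the sketch returns "http:2000,2002;ssh:2001;": the static port 2001 is reserved before the dynamic request is served, so the two kinds of request cannot collide.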
* [Rest-server] Add OS check for ssh-keygen and fix code size bug (#1399)
* add OS check for ssh-keygen + fix code size bug
* fix code size bug
* add callback if generate ssh keyfiles failed
* workaround cleanup failed
* Pylon: Fix rewrite of WebHDFS redirection (#1328) (#1407)
* Zhaoyu/deleted files (#1394)
* add list file script
* add file checker
* add file worker
* add command line and test cases
* fix the test case
* fix typo and change configure
* leverage lsof to get deleted files
* change output
* change comments
* add test case
* add more test cases
* call the deleted file command directly
* fix scrape time env config (#1401)
* change name of launcher (#1404)
* make node-exporter's readiness probe less sensitive (#1412)
* REST Server: Refine error message transferring from upstream. (#1410)
* fix gpu num display bugs (#1411)
* Change code in framework launcher to support Specific Ports and Random port request co-exist. (#1402)
* fixportsIssue
* fix minus issue
* fix ports
* fix minus issue
* fix comment
* adjust
* fix minus issue fix minus issue fix ports fix comment adjust
* fix CR comments
* refresh service (#1388)
* refresh all
* remove chmod
* extract public part
* update hadoop version to fix NodeManager GPU detect issue when ecc is turn off. (#1421)
* update hadoop version
* update
* add push
* [Deployment] service stop also check frameworklauncher-ds for backward compatibility (#1409)
* check frameworklauncher-ds for backward compatibility
* fix frameworklauncher rename backward capability
* fix travis markdown version error
* [Rest-server] Add job retry history (#1425)
* add job retry history link
* add job history link
* revert stop.sh change
* mount the code dir as readonly (#1422)
* Webportal: fix version display in PAIShare pages (#1424)
* Fix backward compatibility of killAllOnCompletedTaskNumber (#1329) (#1408)
* [PAIShare Doc] How-to-config-gitHubPAT.md (#1427)
* add Jenkinsfile
* Modify Jenkinsfile
* minor change
* minor change
* Add paishare test case in cluster test
* how-to-config-gitHubPAT.md
* minor change
* minor change to Jenkinsfile
* Refine Images for githubPAT config
* Refine Images for githubPAT config2
* minor change
* resize image
* refine
* refine
* refine
* Minor change to image
* Add empty line
* Enable launcher ACL (#1150)
* Mount job router under user routes
* Allow job router read user params
* Add namespace to API endpoints
* Allow PUT execution type of legacy jobs
* Fix HDFS path
* Fix Docker container name
* Add namespace support to web portal
* Add default user name for legacy jobs
* enable ACL
* Add user namespace to Jenkins CI
* Fix backward compatibility for JobConfig & JobSSHInfo
* Support namespace in e2e test
* Fix e2e test
* fix test case after back-compat
* Fix launcher test
* Disable tildes in job name
* docs
* Fix stop job in detail page
* Fix job detail
* Fix doc link
* Lint
* support acl in submit V2
* fix unit test
* collect network metrics for containers (#1418)
* [aks] deploy dev-box as a daemonset choice for user (#1413)
* [aks] deploy dev-box as a daemonset
* rename dev-box.yaml file to dev-box-k8s-deploy.yaml
* remove docker mount / make it to pod /remove redundancy
* rename the dev-box name
* fix path at deploy doc for new code structure (#1398) add / for pai/deployment Update document after refactor. (#1397) [Rest-server] Add OS check for ssh-keygen and fix code size bug (#1399)
* add OS check for ssh-keygen + fix code size bug
* fix code size bug
* add callback if generate ssh keyfiles failed
* [Launcher]: Recover the queue for TASK_COMPLETED tasks (#1432)
* Export config at job detail page (#1429)
* config export
* move label machine step from kubelet start into service deployment (#1403)
* fix dependencies check (#1430)
* add init option in docker run, help reaping zombie processes (#1435) add init option in docker run
* Add memory limit for all PAI services, make it 'Burstable' Qos class (#1384)
* set kubernetes memory eviction threshold To reach that capacity, either some Pod is using more than its request, or the system is using more than 3Gi - 1Gi = 2Gi.
* set those pods as 'Guaranteed' QoS: node-exporter hadoop-node-manager hadoop-data-node drivers-one-shot
* Set '--oom-score-adj=1000' for job container so it would oom killed first
* set those pods as 'Burstable' QoS: prometheus grafana
* set those pods as 'Guaranteed' QoS: frameworklauncher hadoop-jobhistory hadoop-name-node hadoop-resource-manager pylon rest-server webportal zookeeper
* adjust services memory limits
* add k8s services resource limit
* seem 1g is not enough for launcher
* adjust hadoop-resource-manager limit
* adjust webportal memory limit
* adjust cpu limits
* rm yarn-exporter resource limits
* adjust prometheus limits
* adjust limits
* frameworklauncher: set JAVA_OPTS="-server -Xmx512m" zookeeper: set JAVA_OPTS="-server -Xmx512m" fix env name to JAVA_OPTS fix zookeeper
* add heapsize limit for hadoop-data-node hadoop-jobhistory
* add xmx for hadoop
* modify memory limits
* reserve 40g for singlebox, else reserve 12g
* using LAUNCHER_OPTS
* revert zookeeper dockerfile
* adjust node manager memory limit
* drivers would take more memory when install
* increase memory for zookeeper and launcher
* set requests to a lower value
* comment it out, using the container env "YARN_RESOURCEMANAGER_HEAPSIZE"
* add comments
* fix dependency check (#1442)
* PAIShare opt-in (#1436)
* Set home page back to dashboard
* Add PAIShare optIn
* REST server: Allow user to set its own GitHub PAT
* Remove opt-in and add PAT notification
* Update how-to-config-github-pat.component.js
* Refine
* lint
* Zhaoyu/cleaner build deploy (#1441)
* add docker file
* fix cleaner dockerfile
* add deploy script
* add liveness probe
* fix liveness probe
* track stopped worker
* fix docker mount
* add probe period
* fix rule
* add delete refresh
* delete template
* change per the review comments
* change the cool down time to 1800 seconds
No description provided.