This repository has been archived by the owner on Jun 6, 2024. It is now read-only.
[machine maintain]add machine and remove machine issue #1863
Comments
Node-manager issue:
Please refer to this: drivers issue: remove problems: kind a bug
For the NM issue, following your #253 suggestion: the hostname is the same as the hostname in machinelist.yaml: root@paidevbox:~/PaiDeployment# cd pai-cluster-msr-next-p100/ Currently I want to remove this node and try to deploy again, but I also hit the remove-node error described above.
For #253, my guess is that the cause is that I used the config file directly instead of replacing it with a generated config. This assumption is currently not mentioned in our doc. If you have a method to remove this node without throwing an exception, tell me and I can try to redeploy it.
Yundong's fix for k8s remove: #1864
I followed the paictl manual (linked below) to add and remove machine 10.0.0.7 in an existing p100 cluster (PAI v0.8.1), but ran into problems:
https://github.com/Microsoft/pai/blob/pai-0.8.y/docs/paictl/paictl-manual.md#Machine_Nodelist_Example
For add:
machine-list:
  - hostip: 10.0.0.7
    machine-type: GENERIC
    k8s-role: worker
    pai-worker: "true"
    sshport: 22
    username: xx
    password: xx
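A quick pre-flight check on a machine-list entry can help rule out config typos before running the add or remove step. The following is only a minimal sketch: `REQUIRED_FIELDS` and `missing_fields` are hypothetical names, and the set of required keys is an assumption drawn from the fields shown in this issue plus `hostname` (which the comments say must match machinelist.yaml), not from paictl itself.

```python
# Hypothetical pre-flight check for one machine-list entry.
# ASSUMPTION: this field list is illustrative; it is not read from paictl.
REQUIRED_FIELDS = (
    "hostname",
    "hostip",
    "machine-type",
    "k8s-role",
    "pai-worker",
    "sshport",
    "username",
    "password",
)


def missing_fields(entry, required=REQUIRED_FIELDS):
    """Return the required keys that are absent or empty in the entry."""
    return [key for key in required if entry.get(key) in (None, "")]


# The entry as posted in this issue (credentials shown as "xx").
posted_entry = {
    "hostip": "10.0.0.7",
    "machine-type": "GENERIC",
    "k8s-role": "worker",
    "pai-worker": "true",
    "sshport": 22,
    "username": "xx",
    "password": "xx",
}

print(missing_fields(posted_entry))  # prints ['hostname']
```

The posted snippet may simply have elided `hostname`, so a hit here is a prompt to double-check the real file rather than a diagnosis.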
But the datanode and namenode IP addresses are wrong:
[image: screenshot not preserved]
datanode
namenode log:
[image: screenshot not preserved]
The driver hits a readiness probe error, though the log seems right:
[image: screenshot not preserved]
For remove machine: I hit an error.