/paictl.py cluster k8s-bootup -p /cluster-configuration randomly failed #1153
It's similar to #841, all about the same "can't create" failure.
Also #917.
Can you close all related issues and keep only one open, for ease of tracking?
@fanyangCS OK, the others are closed but can't be deleted. The references to the similar issues are kept for easy tracking.
When …
@mzmssg Could you provide more details?
Update in #1166.
For `no matches for kind "DaemonSet" in version "apps/v1"`, it may be a problem with the k8s API version. Our single-node k8s version is 1.9.4, which may not fully support the stable "apps/v1" version. #1126 changes the version to 1.9.9, and the error has not appeared since. Alternatively, you can use "apps/v1beta1" or "apps/v1beta2" in your DaemonSet yaml file.
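To decide which apiVersion a DaemonSet manifest should use on a given cluster, one option is to ask the API server first. Below is a minimal sketch using `kubectl api-versions`; the helper name and the fallback order are illustrative assumptions, not part of the paictl flow.

```python
# Sketch: ask the cluster which apps/* API versions it actually serves before
# picking the apiVersion for the DaemonSet manifest. Assumes `kubectl` is on
# PATH and configured against the target cluster.
import subprocess

def supported_apps_versions():
    out = subprocess.check_output(["kubectl", "api-versions"], text=True)
    return [v for v in out.splitlines() if v.startswith("apps/")]

if __name__ == "__main__":
    versions = supported_apps_versions()
    print("apps API versions served by the cluster:", versions)
    # Prefer apps/v1 when available; otherwise fall back to the beta groups.
    preferred = next(
        (v for v in ("apps/v1", "apps/v1beta2", "apps/v1beta1") if v in versions),
        None,
    )
    print("Use apiVersion:", preferred)
```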
More details:
Hi @YitongFeng, can we check …?
It may be etcd's problem. When the deployment succeeds, the etcd log is clean. When deploying a pod, the api-server registers the ResourceGroup and ResourceVersion as a REST API and communicates with etcd, so an etcd timeout may cause the deploy process to fail to find the resource type. Possible causes of the etcd timeout: …
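As a quick way to rule etcd in or out during bootstrap, one could poll etcd's `/health` endpoint before creating resources. The sketch below assumes etcd serves plain HTTP on the default client port 2379 on localhost; the endpoint is an assumption, not taken from the PAI cluster configuration.

```python
# Sketch: poll etcd's /health endpoint before creating k8s resources.
# The endpoint (plain HTTP, 127.0.0.1:2379) is an assumption about the deployment.
import json
import time
import urllib.request

def etcd_healthy(endpoint="http://127.0.0.1:2379", timeout=5):
    try:
        with urllib.request.urlopen(endpoint + "/health", timeout=timeout) as resp:
            body = json.loads(resp.read().decode())
            # etcd reports {"health": "true"} when the member is healthy.
            return body.get("health") == "true"
    except Exception:
        return False

if __name__ == "__main__":
    for _ in range(10):
        if etcd_healthy():
            print("etcd is healthy")
            break
        print("etcd not healthy yet, retrying ...")
        time.sleep(3)
```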
Is this possibly caused by too-small VMs on the qualification bed? One possible action is to increase the VM size on the qualification bed to see whether there is any improvement.
@sterowang Low disk I/O is the usual cause.
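One hedged way to confirm the disk I/O hypothesis is to look at etcd's WAL fsync latency histogram on its `/metrics` endpoint; the sketch below just dumps the relevant metric lines. The metrics URL is an assumption about the deployment.

```python
# Sketch: dump etcd's WAL fsync latency metrics to check for slow disk I/O.
# Consistently high values in the upper histogram buckets suggest the disk
# is too slow for etcd. The metrics URL is an assumption about the deployment.
import urllib.request

METRICS_URL = "http://127.0.0.1:2379/metrics"

with urllib.request.urlopen(METRICS_URL, timeout=5) as resp:
    text = resp.read().decode()

for line in text.splitlines():
    if "etcd_disk_wal_fsync_duration_seconds" in line:
        print(line)
```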
Tried using a ramdisk for etcd; it works fine.
If this is a disk I/O issue, it could very likely happen on other beds as well. The short-term solution could be to add some retries to every etcd operation (don't forget to add some random backoff between retries). Long term, we could add an option to support deploying etcd to a disk other than the OS disk. I do not prefer a ramdisk, as it is not a realistic real-world deployment environment.
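A minimal sketch of such a retry wrapper with randomized exponential backoff is below; the names and defaults are hypothetical, not taken from paictl.

```python
# Sketch: retry helper with randomized exponential backoff, as a short-term
# mitigation for transient etcd timeouts. Names and defaults are hypothetical.
import random
import time

def retry_with_backoff(operation, max_attempts=5, base_delay=1.0, max_delay=30.0):
    """Run `operation` until it succeeds or max_attempts is exhausted.

    Sleeps for an exponentially growing, jittered interval between attempts so
    that concurrent retries do not hit etcd at the same moment.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts:
                raise
            delay = min(max_delay, base_delay * (2 ** (attempt - 1)))
            time.sleep(delay * random.uniform(0.5, 1.5))

# Hypothetical usage:
# retry_with_backoff(lambda: create_daemonset("kube-proxy.yaml"))
```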
We get some output from … when we are trying to create …
The k8s-bootup randomly failed because of kube-proxy.
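To confirm that kube-proxy is indeed the component that fails during k8s-bootup, one could inspect its pods after a failed run. The sketch below assumes kube-proxy runs as pods in the kube-system namespace, which is an assumption about this deployment rather than something stated in the issue.

```python
# Sketch: list kube-proxy pods and their status after a failed k8s-bootup.
# The kube-system namespace is an assumption about how kube-proxy is deployed.
import subprocess

out = subprocess.check_output(
    ["kubectl", "get", "pods", "-n", "kube-system", "-o", "wide"], text=True
)
for line in out.splitlines():
    if line.startswith("NAME") or "kube-proxy" in line:
        print(line)
```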