Possibility to change update behavior for ibm_container_cluster #1969
Subscribing to this issue as a VERY interested party working in IBM Watson Health. Another common scenario that should not require a master update is when there is only a worker update available (which usually happens every two weeks) and no master update. The desire is to update the workers within the context of the same existing major.minor, with no change to the master. It is also worth noting that our observation when using `update_all_workers` is that all workers updated at the same time -- which is contrary to what Josh reported when opening this issue. Perhaps that is because the workers were of provider type "vpc-gen2". Obviously, updating all workers at once would result in a down system, which is not tolerable for most of our cloud solutions. Josh @jkayani, please confirm that your result was with classic or vpc-classic, and not vpc-gen2 -- that would probably explain why we see different behavior: all workers updated at once vs. one at a time.
@TBradCreech we are using classic clusters. The reason we saw one-at-a-time versus all-at-once appears to be the way the code was written. If you look at the code here https://github.com/IBM-Cloud/terraform-provider-ibm/blob/master/ibm/resource_ibm_container_cluster.go#L849 you can see it loops over each worker, updates it, and waits for the update to finish before moving on to the next worker. Looking at the VPC code, it does not appear to wait for each worker to finish: https://github.com/IBM-Cloud/terraform-provider-ibm/blob/master/ibm/resource_ibm_container_vpc_cluster.go#L574.
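For illustration, here is a minimal Go sketch of the two looping strategies described above. The `Worker` type and the `updateWorker`/`waitForWorkerUpdate` helpers are hypothetical stand-ins, not the provider's actual API; the point is only the control flow: the classic path blocks on each worker before continuing, while the VPC path triggers every update without waiting.

```go
package main

import (
	"fmt"
	"time"
)

// Worker is a hypothetical stand-in for the provider's worker record.
type Worker struct{ ID string }

// updateWorker and waitForWorkerUpdate are placeholders for the calls the
// provider would make against the IBM Cloud containers API.
func updateWorker(w Worker) error {
	fmt.Println("triggering update for", w.ID)
	return nil
}

func waitForWorkerUpdate(w Worker, timeout time.Duration) error {
	fmt.Println("waiting up to", timeout, "for", w.ID)
	return nil
}

// Classic-cluster style: update one worker and block until it finishes
// before moving on, so workers roll one at a time.
func updateWorkersSequentially(workers []Worker, timeout time.Duration) error {
	for _, w := range workers {
		if err := updateWorker(w); err != nil {
			return err
		}
		if err := waitForWorkerUpdate(w, timeout); err != nil {
			return err
		}
	}
	return nil
}

// VPC-cluster style: trigger the update on every worker without waiting,
// so all nodes end up upgrading at roughly the same time.
func updateWorkersWithoutWaiting(workers []Worker) error {
	for _, w := range workers {
		if err := updateWorker(w); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	workers := []Worker{{ID: "w1"}, {ID: "w2"}, {ID: "w3"}}
	_ = updateWorkersSequentially(workers, 30*time.Minute)
	_ = updateWorkersWithoutWaiting(workers)
}
```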
@TBradCreech I confirm that, yes, our use case involved classic OpenShift 4 clusters on IBM Cloud. No VPC at all right now.
@jkayani we addressed these issues in PR #1989. The PR addresses the following:
We are coming up with a new resource for kube version upgrades to handle all the requirements.
Fixed in the latest release.
@hkantare I have been testing this feature for the last 2 days and I wasn't able to make it work.
Terraform code
Initial deployment
Updated second deployment
The master version was updated, but no worker nodes were updated.
Hi @tpolekhin, by default
@Anil-CM we have both set to true.
@hkantare @Anil-CM I've tested the new provider release 1.16.0 with this feature and can confirm that it's now working and upgrading worker nodes one by one. Can we somehow speed up this process? Possibly by upgrading multiple nodes at a time, but not all at once? VPC-Gen2 advertised significantly faster worker provisioning as part of its features, but from my observations it still takes 20-30 minutes for a worker node to be replaced. I would appreciate any suggestion you could give to speed up this process.
@tpolekhin that is outside the scope of the provider. As per the basic design of VPC clusters, the upgrade of the nodes happens in parallel and each node takes around 15 to 20 minutes.
Hi,
We have a use case of creating and updating IBM Cloud IKS/ROKS clusters via Terraform. We execute `terraform apply` in our CI/CD system, which is subject to time limits on jobs. We noticed 2 things with the `update_all_workers` argument:

1. It only tries to update workers if the same `terraform apply` registers a change in the master version: https://github.com/IBM-Cloud/terraform-provider-ibm/blob/master/ibm/resource_ibm_container_cluster.go#L818. This can be a problem in the case where the master update finishes, but the `terraform apply` fails shortly after. We had this problem recently, and were unable to complete the cluster upgrade (master and workers) purely via Terraform.
2. It waits for each worker to finish before continuing to the next one: https://github.com/IBM-Cloud/terraform-provider-ibm/blob/master/ibm/resource_ibm_container_cluster.go#L858. In our case, we had clusters of 9 nodes - this of course caused our CI/CD system to terminate `terraform apply` before it finished.

For the first observation, we were wondering if it'd be possible to modify the logic so that if `update_all_workers` is true and the master version doesn't match the worker version, worker updates are planned.

For the second observation, it makes sense why it's done that way, since Terraform has to wait for the operation to complete before recording success in the state. However, we were wondering if we could have an option like `wait_for_workers` that we could set to `false` for our use case. Then, we could trigger worker updates (ensuring each worker received the update command), have the CI/CD job finish on time, and simply `terraform refresh` the state of the cluster once the update is complete.
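To make the two requests concrete, below is a minimal Go sketch of the reconciliation behavior being asked for. It is not the provider's actual code; `Worker`, `updateWorker`, `waitForWorkerUpdate`, and the `waitForWorkers` flag are assumed names used only for illustration. The idea is that workers are planned for update whenever their version lags the master, regardless of whether the same apply changed the master, and the per-worker wait can be switched off so a later `terraform refresh` picks up the result.

```go
package main

import (
	"fmt"
	"time"
)

// Hypothetical stand-ins; names are illustrative, not the provider's API.
type Worker struct {
	ID      string
	Version string
}

func updateWorker(w Worker, target string) error {
	fmt.Printf("requesting update of %s to %s\n", w.ID, target)
	return nil
}

func waitForWorkerUpdate(w Worker, timeout time.Duration) error {
	fmt.Println("waiting for", w.ID)
	return nil
}

// reconcileWorkers sketches the requested behavior: with updateAllWorkers set,
// any worker whose version lags the master is updated, even if the current
// apply did not change the master; with waitForWorkers false, updates are only
// triggered and the final state is picked up by a later refresh.
func reconcileWorkers(masterVersion string, workers []Worker,
	updateAllWorkers, waitForWorkers bool, timeout time.Duration) error {
	if !updateAllWorkers {
		return nil
	}
	for _, w := range workers {
		if w.Version == masterVersion {
			continue // already in sync with the master
		}
		if err := updateWorker(w, masterVersion); err != nil {
			return err
		}
		if waitForWorkers {
			if err := waitForWorkerUpdate(w, timeout); err != nil {
				return err
			}
		}
	}
	return nil
}

func main() {
	workers := []Worker{
		{ID: "w1", Version: "4.5_openshift"},
		{ID: "w2", Version: "4.6_openshift"},
	}
	// Fire-and-forget mode: trigger updates, let the CI/CD job finish on time.
	_ = reconcileWorkers("4.6_openshift", workers, true, false, 90*time.Minute)
}
```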