Changing multiple fields of cluster while update is not working #667

cpinjani · 2024-09-16T12:53:24Z

Rancher version:
Rancher - v2.9-f1b43d2568d7c53c3adf45d9ffd74a04ea65fc22-head
aks-operator:v1.9.2-rc.2

Cluster Type: Downstream AKS cluster

Describe the bug:

Provision a cluster with multiple node pools and wait for it to become Active.
Edit the cluster: delete any node pool, enable HTTP Application Routing, and click Save.

The spec is updated to reflect both changes.
However, only the routing change is applied upstream; the node pool remains intact after the update.

aksConfig:
    authBaseUrl: https://login.microsoftonline.com/
    authorizedIpRanges: null
    azureCredentialSecret: cattle-global-data:cc-j7sg7
    baseUrl: https://management.azure.com/
    clusterName: cpinjani-aks
    dnsPrefix: cpinjani-aks
    dnsServiceIp: 10.0.0.10
    dockerBridgeCidr: null
    httpApplicationRouting: true
    imported: false
    kubernetesVersion: 1.29.0
    linuxAdminUsername: azureuser
    loadBalancerSku: Standard
    logAnalyticsWorkspaceGroup: null
    logAnalyticsWorkspaceName: null
    managedIdentity: null
    monitoring: null
    networkPlugin: kubenet
    networkPolicy: null
    nodePools:
      - availabilityZones:
          - '1'
          - '2'
          - '3'
        count: 1
        maxPods: 110
        maxSurge: '1'
        mode: System
        name: np1
        orchestratorVersion: 1.29.0
        osDiskSizeGB: 128
        osDiskType: Managed
        osType: Linux
        vmSize: Standard_DS2_v2
    outboundType: loadBalancer
    podCidr: 10.244.0.0/16
    privateCluster: false
    privateDnsZone: null
    resourceGroup: cpinjani-aks
    resourceLocation: eastus
    serviceCidr: 10.0.0.0/16
    subnet: null
    tags:
      Account Type: group
    userAssignedIdentity: null
    virtualNetwork: null
    virtualNetworkResourceGroup: null
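
To make the mismatch concrete, here is a minimal sketch that compares the node pool names in the saved spec with the names reported upstream and lists what is still pending. The function name and the upstream pool name `np2` are hypothetical, chosen only for illustration; only `np1` appears in the spec above.

```python
def pending_node_pool_changes(spec_pools, upstream_pools):
    """Return (to_add, to_remove) as sorted lists of node pool names."""
    spec = set(spec_pools)
    upstream = set(upstream_pools)
    return sorted(spec - upstream), sorted(upstream - spec)


# Hypothetical upstream state: the pool deleted in the edit is still present.
to_add, to_remove = pending_node_pool_changes(["np1"], ["np1", "np2"])
print("to add:", to_add)
print("to remove:", to_remove)
```

A non-empty `to_remove` list after the cluster returns to Active would correspond to the behavior reported here.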

Logs:

2.9:

time="2024-09-16T11:03:50Z" level=info msg="Checking configuration for cluster [cpinjani-aks (id: c-jm2vn)]"
time="2024-09-16T11:03:50Z" level=info msg="Updating HTTP application routing to true for cluster [cpinjani-aks (id: c-jm2vn)]"
time="2024-09-16T11:06:27Z" level=error msg="Error recording akscc [ (id: )] failure message: resource name may not be empty"
time="2024-09-16T11:06:28Z" level=info msg="Checking configuration for cluster [cpinjani-aks (id: c-jm2vn)]"
time="2024-09-16T11:06:28Z" level=info msg="Configuration for cluster [cpinjani-aks (id: c-jm2vn)] was verified"
time="2024-09-16T11:06:29Z" level=info msg="Checking configuration for cluster [cpinjani-aks (id: c-jm2vn)]"
time="2024-09-16T11:06:30Z" level=info msg="Configuration for cluster [cpinjani-aks (id: c-jm2vn)] was verified"

2.8:

time="2024-09-16T10:42:39Z" level=info msg="Checking configuration for cluster [cpinjani-aks28]"
time="2024-09-16T10:42:40Z" level=info msg="Updating HTTP application routing for cluster [cpinjani-aks28]"
time="2024-09-16T10:42:45Z" level=info msg="Waiting for cluster [c-zw6hc] to finish updating"
time="2024-09-16T10:43:15Z" level=info msg="Waiting for cluster [c-zw6hc] to finish updating"
time="2024-09-16T10:43:46Z" level=info msg="Waiting for cluster [c-zw6hc] to finish updating"
time="2024-09-16T10:44:17Z" level=info msg="Waiting for cluster [c-zw6hc] to finish updating"
time="2024-09-16T10:44:48Z" level=info msg="Checking configuration for cluster [cpinjani-aks28]"
time="2024-09-16T10:44:49Z" level=info msg="Removing node pool [pool3] from cluster [cpinjani-aks28]"
time="2024-09-16T10:44:51Z" level=info msg="Waiting for cluster [c-zw6hc] to delete node pool [pool3]"
time="2024-09-16T10:45:22Z" level=info msg="Waiting for cluster [c-zw6hc] to delete node pool [pool3]"
time="2024-09-16T10:45:52Z" level=info msg="Waiting for cluster [c-zw6hc] to delete node pool [pool3]"
time="2024-09-16T10:46:23Z" level=info msg="Checking configuration for cluster [cpinjani-aks28]"
time="2024-09-16T10:46:24Z" level=info msg="Cluster [c-zw6hc] finished updating"
time="2024-09-16T10:46:25Z" level=info msg="Checking configuration for cluster [cpinjani-aks28]"
time="2024-09-16T10:46:25Z" level=info msg="Configuration for cluster [cpinjani-aks28] was verified"
time="2024-09-16T10:46:40Z" level=info msg="Checking configuration for cluster [cpinjani-aks28]"
time="2024-09-16T10:46:41Z" level=info msg="Configuration for cluster [cpinjani-aks28] was verified"
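
The difference between the two runs is easier to see when the wait and verify lines are filtered out. The following is a rough sketch, assuming the `time="…" level=… msg="…"` line format shown above; the helper name `actions` and the list of prefixes are assumptions made for this example, not part of the operator.

```python
import re

# Matches lines such as: time="..." level=info msg="..."
LOG_LINE = re.compile(r'time="(?P<time>[^"]+)" level=(?P<level>\w+) msg="(?P<msg>.*)"$')

# Message prefixes treated as "an update action started" (assumed list).
ACTION_PREFIXES = ("Updating ", "Removing node pool", "Adding node pool")


def actions(log_text):
    """Return the messages that start an update action, in log order."""
    found = []
    for line in log_text.splitlines():
        match = LOG_LINE.match(line.strip())
        if match and match.group("msg").startswith(ACTION_PREFIXES):
            found.append(match.group("msg"))
    return found


log_29 = '''
time="2024-09-16T11:03:50Z" level=info msg="Updating HTTP application routing to true for cluster [cpinjani-aks (id: c-jm2vn)]"
time="2024-09-16T11:06:28Z" level=info msg="Configuration for cluster [cpinjani-aks (id: c-jm2vn)] was verified"
'''
log_28 = '''
time="2024-09-16T10:42:40Z" level=info msg="Updating HTTP application routing for cluster [cpinjani-aks28]"
time="2024-09-16T10:44:49Z" level=info msg="Removing node pool [pool3] from cluster [cpinjani-aks28]"
'''
print(len(actions(log_29)), len(actions(log_28)))
```

Run against the excerpts above, the 2.9 log yields one action (the routing update) while the 2.8 log yields two (the routing update followed by the node pool removal).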


valaparthvi · 2024-09-23T14:09:11Z

Along the same lines, it is not possible to add more than one node pool to the cluster via the API. One of the test cases updates a cluster while it is still provisioning.
I tested with 2 updates:

K8s upgrade and node pool addition: only the k8s version was upgraded
Adding more than one node pool: only one node pool was added

When I tested the same things via UI, it seemed to be working as expected.

AKSConfig is updated when the request is first sent, but then it seems to revert.

Logs

time="2024-09-23T13:52:39Z" level=info msg="Checking if cluster [auto-aks-pvala-hp-ci-cqsni (id: c-p98qf)] exists"
time="2024-09-23T13:52:40Z" level=info msg="Checking if resource group [auto-aks-pvala-hp-ci-cqsni] exists"
time="2024-09-23T13:52:40Z" level=info msg="Creating resource group [auto-aks-pvala-hp-ci-cqsni] for cluster [auto-aks-pvala-hp-ci-cqsni (id: c-p98qf)]"
time="2024-09-23T13:52:41Z" level=info msg="Resource group [auto-aks-pvala-hp-ci-cqsni] created successfully"
time="2024-09-23T13:52:41Z" level=info msg="Creating AKS cluster [auto-aks-pvala-hp-ci-cqsni (id: c-p98qf)]"
time="2024-09-23T13:52:49Z" level=info msg="Waiting for cluster [auto-aks-pvala-hp-ci-cqsni (id: c-p98qf)] to finish creating, cluster state: Creating"
time="2024-09-23T13:53:13Z" level=info msg="Waiting for cluster [auto-aks-pvala-hp-ci-cqsni (id: c-p98qf)] to finish creating, cluster state: Creating"
time="2024-09-23T13:53:19Z" level=info msg="Waiting for cluster [auto-aks-pvala-hp-ci-cqsni (id: c-p98qf)] to finish creating, cluster state: Creating"
time="2024-09-23T13:53:20Z" level=info msg="Waiting for cluster [auto-aks-pvala-hp-ci-cqsni (id: c-p98qf)] to finish creating, cluster state: Creating"
time="2024-09-23T13:53:50Z" level=info msg="Waiting for cluster [auto-aks-pvala-hp-ci-cqsni (id: c-p98qf)] to finish creating, cluster state: Creating"
time="2024-09-23T13:54:22Z" level=info msg="Waiting for cluster [auto-aks-pvala-hp-ci-cqsni (id: c-p98qf)] to finish creating, cluster state: Creating"
time="2024-09-23T13:54:54Z" level=info msg="Waiting for cluster [auto-aks-pvala-hp-ci-cqsni (id: c-p98qf)] to finish creating, cluster state: Creating"
time="2024-09-23T13:55:25Z" level=info msg="Waiting for cluster [auto-aks-pvala-hp-ci-cqsni (id: c-p98qf)] to finish creating, cluster state: Creating"
time="2024-09-23T13:55:56Z" level=info msg="Waiting for cluster [auto-aks-pvala-hp-ci-cqsni (id: c-p98qf)] to finish creating, cluster state: Creating"
time="2024-09-23T13:56:27Z" level=info msg="Cluster [auto-aks-pvala-hp-ci-cqsni (id: c-p98qf)] created successfully"
time="2024-09-23T13:56:30Z" level=info msg="Checking configuration for cluster [auto-aks-pvala-hp-ci-cqsni (id: c-p98qf)]"
time="2024-09-23T13:56:31Z" level=info msg="Updating tags for cluster [auto-aks-pvala-hp-ci-cqsni (id: c-p98qf)]"
time="2024-09-23T13:56:33Z" level=info msg="Tags were not updated for cluster [auto-aks-pvala-hp-ci-cqsni (id: c-p98qf)], config map[janitor-ignore:true owner:hosted-providers-qa-ci-pvala testfilenumber:line125_p1_provisioning_test], upstream map[Account Owner:[email protected] Account Type:group Cost Center:211799999 Department:ecm Environment:development Finance Business Partner:[email protected] General Ledger Code:200000119 Stakeholder:[email protected] Team:container-es janitor-ignore:true owner:hosted-providers-qa-ci-pvala testfilenumber:line125_p1_provisioning_test], moving on"
time="2024-09-23T13:56:33Z" level=info msg="Updating kubernetes version to 1.30.3 for cluster [auto-aks-pvala-hp-ci-cqsni (id: c-p98qf)]"
time="2024-09-23T13:59:11Z" level=error msg="Error recording akscc [ (id: )] failure message: resource name may not be empty"
time="2024-09-23T13:59:12Z" level=info msg="Checking configuration for cluster [auto-aks-pvala-hp-ci-cqsni (id: c-p98qf)]"
time="2024-09-23T13:59:13Z" level=info msg="Configuration for cluster [auto-aks-pvala-hp-ci-cqsni (id: c-p98qf)] was verified"
time="2024-09-23T13:59:15Z" level=info msg="Checking configuration for cluster [auto-aks-pvala-hp-ci-cqsni (id: c-p98qf)]"
time="2024-09-23T13:59:16Z" level=info msg="Configuration for cluster [auto-aks-pvala-hp-ci-cqsni (id: c-p98qf)] was verified"
time="2024-09-23T13:59:24Z" level=info msg="Removing cluster [auto-aks-pvala-hp-ci-cqsni (id: c-p98qf)]"
time="2024-09-23T14:01:18Z" level=info msg="Creating cluster [auto-aks-hp-ci-vgcla (id: c-gjz7d)]"
time="2024-09-23T14:01:18Z" level=info msg="Checking if cluster [auto-aks-hp-ci-vgcla (id: c-gjz7d)] exists"
time="2024-09-23T14:01:19Z" level=info msg="Checking if resource group [auto-aks-hp-ci-vgcla] exists"
time="2024-09-23T14:01:19Z" level=info msg="Creating resource group [auto-aks-hp-ci-vgcla] for cluster [auto-aks-hp-ci-vgcla (id: c-gjz7d)]"
time="2024-09-23T14:01:20Z" level=info msg="Resource group [auto-aks-hp-ci-vgcla] created successfully"
time="2024-09-23T14:01:20Z" level=info msg="Creating AKS cluster [auto-aks-hp-ci-vgcla (id: c-gjz7d)]"
time="2024-09-23T14:01:32Z" level=info msg="Waiting for cluster [auto-aks-hp-ci-vgcla (id: c-gjz7d)] to finish creating, cluster state: Creating"
time="2024-09-23T14:01:44Z" level=info msg="Waiting for cluster [auto-aks-hp-ci-vgcla (id: c-gjz7d)] to finish creating, cluster state: Creating"
time="2024-09-23T14:02:03Z" level=info msg="Waiting for cluster [auto-aks-hp-ci-vgcla (id: c-gjz7d)] to finish creating, cluster state: Creating"
time="2024-09-23T14:02:34Z" level=info msg="Waiting for cluster [auto-aks-hp-ci-vgcla (id: c-gjz7d)] to finish creating, cluster state: Creating"
time="2024-09-23T14:02:57Z" level=info msg="Cluster auto-aks-pvala-hp-ci-cqsni removed successfully"
time="2024-09-23T14:02:57Z" level=info msg="Cluster [auto-aks-pvala-hp-ci-cqsni (id: c-p98qf)] was removed successfully"
time="2024-09-23T14:02:57Z" level=info msg="Resource group [auto-aks-pvala-hp-ci-cqsni] for cluster [auto-aks-pvala-hp-ci-cqsni (id: c-p98qf)] still exists, please remove it if needed"
time="2024-09-23T14:03:06Z" level=info msg="Waiting for cluster [auto-aks-hp-ci-vgcla (id: c-gjz7d)] to finish creating, cluster state: Creating"
time="2024-09-23T14:03:37Z" level=info msg="Waiting for cluster [auto-aks-hp-ci-vgcla (id: c-gjz7d)] to finish creating, cluster state: Creating"
time="2024-09-23T14:04:08Z" level=info msg="Waiting for cluster [auto-aks-hp-ci-vgcla (id: c-gjz7d)] to finish creating, cluster state: Creating"
time="2024-09-23T14:04:40Z" level=info msg="Cluster [auto-aks-hp-ci-vgcla (id: c-gjz7d)] created successfully"
time="2024-09-23T14:04:43Z" level=info msg="Checking configuration for cluster [auto-aks-hp-ci-vgcla (id: c-gjz7d)]"
time="2024-09-23T14:04:44Z" level=info msg="Updating tags for cluster [auto-aks-hp-ci-vgcla (id: c-gjz7d)]"
time="2024-09-23T14:04:47Z" level=info msg="Tags were not updated for cluster [auto-aks-hp-ci-vgcla (id: c-gjz7d)], config map[owner:hosted-providers-qa-ci-pvala testfilenumber:line125_p1_provisioning_test], upstream map[Account Owner:[email protected] Account Type:group Cost Center:211799999 Department:ecm Environment:development Finance Business Partner:[email protected] General Ledger Code:200000119 Stakeholder:[email protected] Team:container-es owner:hosted-providers-qa-ci-pvala testfilenumber:line125_p1_provisioning_test], moving on"
time="2024-09-23T14:04:47Z" level=info msg="Adding node pool [ggygl] for cluster [auto-aks-hp-ci-vgcla (id: c-gjz7d)]"
time="2024-09-23T14:04:52Z" level=info msg="Waiting for cluster [auto-aks-hp-ci-vgcla (id: c-gjz7d)] to update node pool [ggygl]"
time="2024-09-23T14:05:23Z" level=info msg="Waiting for cluster [auto-aks-hp-ci-vgcla (id: c-gjz7d)] to update node pool [ggygl]"
time="2024-09-23T14:05:55Z" level=info msg="Waiting for cluster [auto-aks-hp-ci-vgcla (id: c-gjz7d)] to update node pool [ggygl]"
time="2024-09-23T14:06:26Z" level=info msg="Waiting for cluster [auto-aks-hp-ci-vgcla (id: c-gjz7d)] to update node pool [ggygl]"
time="2024-09-23T14:06:39Z" level=info msg="Waiting for cluster [auto-aks-hp-ci-vgcla (id: c-gjz7d)] to update node pool [ggygl]"
time="2024-09-23T14:06:57Z" level=info msg="Checking configuration for cluster [auto-aks-hp-ci-vgcla (id: c-gjz7d)]"
time="2024-09-23T14:06:59Z" level=info msg="Cluster [auto-aks-hp-ci-vgcla (id: c-gjz7d)] finished updating"
time="2024-09-23T14:07:00Z" level=info msg="Checking configuration for cluster [auto-aks-hp-ci-vgcla (id: c-gjz7d)]"
time="2024-09-23T14:07:01Z" level=info msg="Configuration for cluster [auto-aks-hp-ci-vgcla (id: c-gjz7d)] was verified"
time="2024-09-23T14:07:50Z" level=info msg="Removing cluster [auto-aks-hp-ci-vgcla (id: c-gjz7d)]"

valaparthvi · 2024-09-24T09:47:19Z

This was also seen when deleting a node pool and adding a new one in the same update. The new node pool remained, but the deleted node pool was re-added after a few minutes.
AKSConfig holds the desired state until it is reverted.
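
For reference, the 2.8 logs earlier in this issue suggest a pattern where one change is applied per pass and the configuration is re-checked until nothing is left to do. Below is a toy sketch of that pattern under stated assumptions: plain dicts stand in for the spec and the upstream cluster, and `reconcile_once` and `reconcile` are hypothetical names. It is not the aks-operator implementation.

```python
def reconcile_once(desired, upstream):
    """Apply at most one pending change to `upstream`; return True if one was applied."""
    if desired["httpApplicationRouting"] != upstream["httpApplicationRouting"]:
        upstream["httpApplicationRouting"] = desired["httpApplicationRouting"]
        return True
    # Remove one node pool that is no longer in the desired state.
    for name in sorted(set(upstream["nodePools"]) - set(desired["nodePools"])):
        upstream["nodePools"].remove(name)
        return True
    # Add one node pool that is missing upstream.
    for name in desired["nodePools"]:
        if name not in upstream["nodePools"]:
            upstream["nodePools"].append(name)
            return True
    return False


def reconcile(desired, upstream, max_passes=10):
    """Re-run reconcile_once until no change is pending; return the number of passes that changed something."""
    passes = 0
    while passes < max_passes and reconcile_once(desired, upstream):
        passes += 1
    return passes


desired = {"httpApplicationRouting": True, "nodePools": ["np1"]}
upstream = {"httpApplicationRouting": False, "nodePools": ["np1", "pool3"]}
print(reconcile(desired, upstream), upstream == desired)
```

The point of the sketch is that the desired state must stay unchanged between passes. If it is overwritten with the upstream state after the first pass, as the reverting AKSConfig described above suggests, the second change is never applied.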

krunalhinguu · 2025-01-28T06:26:51Z

@cpinjani After attempting to reproduce the reported problem in the AKS cluster management workflow (v2.9), the scenario executed as expected, with cluster updates functioning correctly. Coordination with @valaparthvi confirmed no inconsistencies in the observed behavior.

The recorded error, time="2024-09-16T11:06:27Z" level=error msg="Error recording akscc [ (id: )] failure message: resource name may not be empty", was traced to a separate code path unrelated to the cluster update mechanism. Root cause analysis identified an edge case in the resource metadata validation logic, where empty resource identifiers were handled improperly. A fix is in development and will be released in an upcoming patch.

cpinjani added the kind/bug and kind/regression labels Sep 16, 2024

cpinjani added this to the v2.9-Next1 milestone Sep 16, 2024

cpinjani added this to CAPI / Turtles Sep 16, 2024

cpinjani moved this to Backlog in CAPI / Turtles Sep 16, 2024

vatsalparekh self-assigned this Sep 19, 2024

valaparthvi mentioned this issue Sep 24, 2024

Automate Qase 222, 210, 261, 209, 217, 195, 230 rancher/hosted-providers-e2e#171

Merged

3 tasks

vatsalparekh mentioned this issue Oct 8, 2024

Change order of applying changes #683

Closed

5 tasks

mjura modified the milestones: v2.9.3, v2.9.4 Oct 17, 2024

mjura self-assigned this Oct 18, 2024

mjura removed their assignment Oct 25, 2024

vatsalparekh mentioned this issue Oct 29, 2024

Correct code flow of node pools #703

Closed

5 tasks

kkaempf modified the milestones: v2.9.4, v2.10.1 Nov 5, 2024

kkaempf modified the milestones: v2.10.1, v2.10.2 Dec 10, 2024

mjura moved this from PR to be reviewed to Backlog in CAPI / Turtles Dec 24, 2024

mjura unassigned vatsalparekh Dec 24, 2024

kkaempf modified the milestones: v2.10.2, v2.10.3 Jan 14, 2025

mitulshah-suse added the team/infracloud label Jan 17, 2025

kkaempf removed this from CAPI / Turtles Jan 28, 2025
