New resource - azurerm_kubernetes_cluster_agentpool #4001
Comments
Also, note that Azure just changed the way they handle that call: until Wednesday, we could pass multiple agentPoolProfiles to it and the first one was treated as the primary. Now, when we pass multiple AgentPoolProfiles, it creates an AKS resource which will be marked as a success, but there won't be any VMSS created, hence no nodes. It looks like MSFT are changing the way they handle the creation of the AKS resource while forcing users to manage agent pools separately, using the AgentPoolClient API. |
Hey @titilambert , I am not sure if you have started on this yet but I wanted to first thank you for this contribution and also for proactively letting us know that you will be working on this and your proposed schema. Super helpful! MarkG |
This is awesome, thanks for this... based off what I've seen here I have a couple of suggestions if you are open to hearing them.
Other than those couple of suggestions, I am really looking forward to this contribution. 🚀 |
@titilambert I don’t know if you have this in your design but we noticed a behavior that AKS won’t allow nodepool changes if the auto scaler is active on the nodepool. If you haven’t already accounted for this could you build in logic to disable the auto scaler if enabled before performing nodepool changes on an existing nodepool? |
@jeffreyCline: are AgentPools supported with AvailabilitySet, or is VMSS required? |
👋🏻 This is great 👍 - there are a few changes I'd suggest to @jeffreyCline's proposal, but this otherwise looks good:
WDYT? |
Thanks for all your comments, |
@tombuildsstuff I will need to make some changes to the kubernetes resources.
Thank you @titilambert! @tombuildsstuff's suggestions sound good with a couple of comments:
|
Also the following should not be required:
|
@titilambert I noticed some issues when updating an agent_pool_profile today. We are loading the values from a terraform variable of type "map". It seems like the value is being interpreted as the literal value after the = for each attribute.
-/+ module.azure-aks.azurerm_kubernetes_cluster.aks (new resource required)
id: "/subscriptions/************/resourcegroups/********/providers/Microsoft.ContainerService/managedClusters/********" => <computed> (forces new resource)
addon_profile.#: "1" => <computed>
agent_pool_profile.#: "2" => "1"
agent_pool_profile.0.count: "2" => "0"
agent_pool_profile.0.dns_prefix: "" => <computed>
agent_pool_profile.0.fqdn: "********-*****.hcp.centralus.azmk8s.io" => <computed>
agent_pool_profile.0.max_pods: "60" => "0" (forces new resource)
agent_pool_profile.0.name: "pool1" => "${lookup(data.external.clusters.result, \"scaleset01_name\")}" (forces new resource)
agent_pool_profile.0.os_disk_size_gb: "30" => "0" (forces new resource)
agent_pool_profile.0.os_type: "Linux" => "${lookup(data.external.clusters.result, \"scaleset01_os_type\")}" (forces new resource)
agent_pool_profile.0.type: "VirtualMachineScaleSets" => "${lookup(data.external.clusters.result, \"scaleset01_type\")}" (forces new resource)
agent_pool_profile.0.vm_size: "Standard_DS13_v2" => "${lookup(data.external.clusters.result, \"scaleset01_vm_size\")}" (forces new resource)
agent_pool_profile.0.vnet_subnet_id: "/subscriptions/************/resourceGroups/*********/providers/Microsoft.Network/virtualNetworks/********-vnet/subnets/********-nodes-subnet" => "/subscriptions/************/resourceGroups/********/providers/Microsoft.Network/virtualNetworks/********-vnet/subnets/********-nodes-subnet"
I may need to file a separate issue about this, but reading your post it seemed relevant and possibly related.
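To illustrate the pattern being described (not the reporter's exact configuration — sorry, to be precise: the data source and key names below are copied from the plan output above, and the rest is assumed), the block would look roughly like this in Terraform 0.11 syntax:

agent_pool_profile {
  name            = "${lookup(data.external.clusters.result, "scaleset01_name")}"
  count           = 2
  vm_size         = "${lookup(data.external.clusters.result, "scaleset01_vm_size")}"
  os_type         = "${lookup(data.external.clusters.result, "scaleset01_os_type")}"
  type            = "${lookup(data.external.clusters.result, "scaleset01_type")}"
  max_pods        = 60
  os_disk_size_gb = 30
}

If the plan renders the lookup expression itself as the new value, it suggests the string is not being interpolated before the diff is computed. |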
Since there are two very different behaviours, perhaps it makes sense to expose this field twice, once as
👍 to looking this up if it's not specified
whilst that's the behaviour for that field, the OSType is more complicated - whilst I can see the argument for defaulting to Linux here, it may be prudent to be explicit, given changing this requires reprovisioning the pool? |
Just to ensure I understand the behaviour here to point you in the right direction: is the intention that users would provision an AKS Cluster with a single node via the |
@tombuildsstuff the API expects the primary agent pool to be passed via the managedcluster client; the other agent pools need to be provisioned with the agentpool client. There was a question we were wondering about: should we do like the NSG resource, where pools can be defined on the kubernetes_cluster resource using a list, with a flag to mark the primary agent pool (that one would be passed to the managedcluster client, while the remaining ones would be handled by the agentpool client), or should we limit the kubernetes_cluster resource to a single agent_pool block and force users to use the kubernetes_cluster_agent_pool resource?
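As a rough sketch of the first option (the primary flag and exact attribute names here are purely illustrative, since no schema had been agreed at this point), the cluster resource would carry a list of pools with one marked as primary:

resource "azurerm_kubernetes_cluster" "example" {
  # ... other cluster configuration ...

  agent_pool_profile {
    name    = "default"
    primary = true              # hypothetical flag: this pool would be sent via the managedcluster client
    count   = 1
    vm_size = "Standard_DS2_v2"
  }

  agent_pool_profile {
    name    = "pool2"           # remaining pools would be handled via the agentpool client
    count   = 2
    vm_size = "Standard_DS2_v2"
  }
}

The second option would instead restrict the cluster resource to a single pool block and move every additional pool into the separate agent pool resource. |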
This one is actually implemented in a way that is not TF friendly. The count is required on create and can be passed with autoscale enabled but cannot be passed afterwards if the count differs from the actual node count. I will touch base with the AKS team about this.
I am not saying that we should set the defaults for these in the provider. The properties are not required by the API, so we should not require them. And I stand corrected: it looks like the default for the type is actually VMSS for the Agent Pool API. For OSType, I am always in favor of reducing unnecessary typing, but I can see your point. |
This is the first commit for this new feature: #4046
Breaking change:
What is working:
Missing:
I will probably wait for some feedback for some days before continuing on this. |
@titilambert did you change the limit of agent profiles for azure_kubernetes_cluster to 1 just in the azurerm provider, or did you change the API itself? Because I am seeing behavior that indicates the API itself was changed. Terraform code with 4 agent_pool_profile resources in my azure_kubernetes_cluster, which worked fine last week, now results in this error:
Error: Error creating/updating Managed Kubernetes Cluster "foo" (Resource Group "foo-rg"): containerservice.ManagedClustersClient#CreateOrUpdate: Failure sending request: StatusCode=400 -- Original Error: Code="AgentPoolUpdateViaManagedClusterNotSupported" Message="Node pool update via managed cluster not allowed when the cluster contains more than one node pool. Please use per node pool operations."
This is the same problem in #3549. I will add this comment there as well... that issue is the bug, this issue is the fix. In the meantime, if you could get whoever made this API change to back it out until version 1.33 of the azurerm provider with your kubernetes_cluster_agent_pool changes is released, that would be great. As it is, I am now unable to provision an AKS cluster with Terraform due to this problem. |
MSFT have made an update on the backend which reverted their server-side API. I'm not sure if they did it for all regions, but we opened a support ticket and got them to revert EASTUS for us to unblock us. I would say that this should be your best bet for now until the agent_pool resource is finished and released.
… On Aug 15, 2019, at 4:47 PM, davidack ***@***.***> wrote:
@titilambert did you change the limit of agent profiles for azure_kubernetes_cluster to 1 just in the azurerm provider, or did you change the API itself? Because I am seeing behavior that indicates the API itself was changed. Terraform code with 4 agent_pool_profile resources in my azure_kubernetes_cluster which worked fine last week, now results in this error:
Error: Error creating/updating Managed Kubernetes Cluster "foo" (Resource Group "foo-rg"): containerservice.ManagedClustersClient#CreateOrUpdate: Failure sending request: StatusCode=400 -- Original Error: Code="AgentPoolUpdateViaManagedClusterNotSupported" Message="Node pool update via managed cluster not allowed when the cluster contains more than one node pool. Please use per node pool operations."
This is the same problem in #3549. I will add this comment there as well... that issue is the bug, this issue is the fix. I will leave a related comment there too.
In the mean time, if you could get whoever made this API change to back it out until version 1.33 of the azurerm provider with your kubernetes_cluster_agent_pool changes is released, that would be great. As it is, I am now unable to provision an AKS cluster with Terraform due to this problem.
|
Any updates when this will be merged and released? |
@djsly |
@ranokarno MSFT have yet to revert the API change so I am still blocked. They are rolling out an update across all regions that they say will fix the problem, but they have not provided any specifics about how. Until the new azurerm_kubernetes_cluster_agentpool resource (using the VMSS API instead of the AKS API) is released as part of a new version of the azurerm provider, I can't think of any way the change they are rolling out now would fix the problem I am having unless they remove the limit of not being able to create or update more than one node pool via the AKS API. |
@davidack, sorry for the delay. The fix has been deployed in the Japan East region. With the fix, you can still use the MC API to scale agent pools. Please try it. |
@davidack, I encountered the same issue in the "eastasia" and "southeastasia" Azure regions. |
@titilambert Thank you for working on this new resource. Do you have any updates when you expect the PR for this can be merged? |
FYI, @titilambert is currently on paternity leave.
If someone wants to pick this up in the meantime, feel free.
He's expected to come back online on October 1st.
Unless he wants to work on this during his time off, this won't see the light of day for another month.
… On Sep 4, 2019, at 8:52 AM, lewalt000 ***@***.***> wrote:
@titilambert Thank you for working on this new resource. Do you have any updates when you expect the PR for this can be merged?
#4046
|
@xizhamsft , thanks for the heads-up... I tried it last week and I was able to provision a new AKS cluster in japaneast with multiple node pools that were specified as blocks within the azure_kubernetes_cluster resource. I look forward to using the new azurerm_kubernetes_cluster_agentpool resource once it is available, but for now the original method using the MC API once again works. Thanks to the Azure team for getting this fixed. Also, congrats to @titilambert on the new baby! |
@ranokarno, sorry for the late reply. Yes. We already pushed the release to all regions. |
Recently I have been investigating how to add custom data to my node pools to make some changes to every node as it boots, and I discovered that what I need is the… Wouldn't it make sense to have a |
@titilambert any updates on this? Will this be in the next provider release? Also hope paternity leave went well! |
For awareness, and to help with prioritization: the multiple node pools feature will be GA in a few weeks. However, some of the properties in the agent pool body described above are limited-preview features, including:
|
@coreywagehoft I just rebased the PR on master |
@tombuildsstuff I noticed you recently assigned this issue to yourself. Would you be able to provide any guidance on how much work you have left for it? Thank you. |
@lewalt000 at the moment we're waiting on a PR to be merged on Microsoft's side so that we can push the changes needed to make this happen |
This has been released in version 1.37.0 of the provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading. As an example:
provider "azurerm" {
version = "~> 1.37.0"
}
# ... other configuration ... |
I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. If you feel I made an error 🤖 🙉 , please reach out to my human friends 👉 [email protected]. Thanks! |
Community Note
Description
This is a new resource to correctly manage AKS cluster agent pools.
With it, we will be able to upgrade, scale up, or scale down an agent pool.
New or Affected Resource(s)
Potential Terraform Configuration
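(The configuration block from the original issue was not captured here; the following is only an illustrative sketch of what the proposed resource could look like. The attribute names are assumptions, not a final schema; only the resource name comes from the issue title.)

resource "azurerm_kubernetes_cluster_agentpool" "example" {
  name                  = "pool2"
  kubernetes_cluster_id = "${azurerm_kubernetes_cluster.example.id}"
  node_count            = 2                  # "node_count" rather than "count", to avoid clashing with the Terraform meta-argument
  vm_size               = "Standard_DS2_v2"
  os_type               = "Linux"
}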
References
@mbfrahry @tombuildsstuff
I'm going to start this PR on Monday; do you have any warnings/thoughts/recommendations?