Support for dynamic quota in azurerm_cognitive_deployment #23988

rmoesbergen · 2023-11-22T13:56:41Z

Is there an existing issue for this?

I have searched the existing issues

Community Note

Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
If you are interested in working on this issue or have submitted a pull request, please leave a comment and review the contribution guide to help.

Description

An Azure Cognitive Services deployment now supports dynamic scaling of quota when capacity is available in the account. Please add this setting to the azurerm_cognitive_deployment Terraform resource so it can be auto-provisioned. (The setting is called "Dynamic Quota" in the UI.)

New or Affected Resource(s)/Data Source(s)

azurerm_cognitive_deployment

Potential Terraform Configuration

resource "azurerm_resource_group" "example" {
  name     = "example-resources"
  location = "West Europe"
}

resource "azurerm_cognitive_account" "example" {
  name                = "example-ca"
  location            = azurerm_resource_group.example.location
  resource_group_name = azurerm_resource_group.example.name
  kind                = "OpenAI"
  sku_name            = "S0"
}

resource "azurerm_cognitive_deployment" "example" {
  name                 = "example-cd"
  cognitive_account_id = azurerm_cognitive_account.example.id
  model {
    format  = "OpenAI"
    name    = "text-curie-001"
    version = "1"
  }

  scale {
    type     = "Standard"
    capacity = 10   # K Transactions per minute
    dynamic  = true # <---- this would be the new setting
  }
}

References

https://microsoftlearning.github.io/mslearn-openai/Instructions/Labs/01-get-started-azure-openai.html#deploy-a-model

rcskosir · 2023-11-22T14:40:12Z

Thank you for taking the time to open this feature request!

unique-dominik · 2024-01-03T08:52:55Z

Good request 🍻

I observe that if you toggle this from the portal it sends:

{"displayName":"gpt-35-turbo-16k","sku":{"name":"Standard","capacity":240},"properties":{"model":{"format":"OpenAI","version":"0613","name":"gpt-35-turbo-16k"},"versionUpgradeOption":"NoAutoUpgrade","dynamicThrottlingEnabled":true,"raiPolicyName":"Microsoft.Nil"}}

and when toggled back off:

{"displayName":"gpt-35-turbo-16k","sku":{"name":"Standard","capacity":240},"properties":{"model":{"format":"OpenAI","version":"0613","name":"gpt-35-turbo-16k"},"versionUpgradeOption":"NoAutoUpgrade","dynamicThrottlingEnabled":false,"raiPolicyName":"Microsoft.Nil"}}

Notice that the dynamicThrottlingEnabled flips.

Interestingly, on the cognitive_account there is a dynamic_throttling_enabled.

I will test later whether the two are equivalent 👀 I suspect not, as accounts/create#dynamicThrottlingEnabled is a separate property from deployments/create#dynamicThrottlingEnabled.

If we are lucky, they get inherited 🤣

illgitthat · 2024-01-17T00:55:52Z

unique-dominik wrote:

"Interestingly, on the cognitive_account there is a dynamic_throttling_enabled. I try out later if these are equal or not 👀 I suspect no as accounts/create#dynamicThrottlingEnabled has its own dynamicThrottlingEnabled vs deployments/create#dynamicThrottlingEnabled. If we are lucky, they get inherited 🤣"

As you suspected, dynamic_throttling_enabled at the cognitive account level is a different setting from the one on the model deployments.

I tried it and it gives:
DynamicThrottlingNotSupported: Thank you for your interest in Dynamic Throttling for Cognitive Services. This feature is currently not supported for the resource kind OpenAI and sku S0.

Thanks for opening this issue. I would love to know whether an update is planned for this; otherwise I will look into implementing it via the azapi Terraform provider.
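
For reference, the account-level setting discussed here is the dynamic_throttling_enabled argument of azurerm_cognitive_account. The comment does not show the configuration that was tried, so the following is only a sketch of what it presumably looked like, reusing the resource names from the issue description. Applying it to an OpenAI account with the S0 SKU is what reportedly produces the DynamicThrottlingNotSupported error quoted above.

```hcl
# Sketch only: resource names are borrowed from the issue description.
resource "azurerm_cognitive_account" "example" {
  name                = "example-ca"
  location            = azurerm_resource_group.example.location
  resource_group_name = azurerm_resource_group.example.name
  kind                = "OpenAI"
  sku_name            = "S0"

  # Account-level throttling switch. This is not the per-deployment
  # "Dynamic Quota" toggle that the issue asks for.
  dynamic_throttling_enabled = true
}
```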

JorisAndrade · 2024-03-07T10:44:08Z

Any news on this? According to https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/dynamic-quota, the property is indeed dynamicThrottlingEnabled:

az rest --method patch --url "https://management.azure.com/subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/providers/Microsoft.CognitiveServices/accounts/{accountName}/deployments/{deploymentName}?api-version=2023-10-01-preview" --body '{"properties": {"dynamicThrottlingEnabled": true} }'

illgitthat · 2024-03-08T16:52:15Z

@JorisAndrade you can accomplish this via the azapi provider in the meantime. Hope this helps!

resource "azapi_resource" "model_deployment" {
  # The type string must be quoted: "<resource type>@<api version>"
  type                      = "Microsoft.CognitiveServices/accounts/deployments@2023-10-01-preview"
  schema_validation_enabled = false
  # Assumes the account is declared as azurerm_cognitive_account.account
  parent_id                 = azurerm_cognitive_account.account.id
  name                      = "gpt-4"
  body = jsonencode({
    sku = {
      name     = "Standard", # SKU name as in the portal payload above, not the model name
      capacity = 1
    },
    properties = {
      model = {
        format  = "OpenAI"
        name    = "gpt-4"
        version = "1106-preview"
      },
      dynamicThrottlingEnabled = true
      versionUpgradeOption     = "OnceNewDefaultVersionAvailable" # Options: NoAutoUpgrade, OnceCurrentVersionExpired, OnceNewDefaultVersionAvailable
    }
  })
  depends_on = [azurerm_cognitive_account.account]
}

VickyWinner · 2024-03-12T13:51:14Z

@illgitthat thanks for sharing. will try this out.
@rcskosir any ETA on the enhancement to be available as part of the provider?

rcskosir · 2024-03-12T15:27:23Z

Thanks for reaching out, unfortunately I do not have an ETA on this enhancement. Any future work via the team or the community should end up linked here via a PR.


guilhem · 2024-03-24T23:41:58Z

I opened a PR on pandora to add the 2023-10-01-preview API version (https://github.com/Azure/azure-rest-api-specs/tree/main/specification/cognitiveservices/resource-manager/Microsoft.CognitiveServices/preview/2023-10-01-preview) to https://github.com/hashicorp/go-azure-sdk

https://github.com/Azure/azure-rest-api-specs/blob/82f3d9571517966992eaf97b1db73f0a821cd06b/specification/cognitiveservices/resource-manager/Microsoft.CognitiveServices/preview/2023-10-01-preview/cognitiveservices.json#L3303

Once that is done, we will be able to import it into the provider to add this feature.

savage-alex · 2024-11-05T19:02:35Z

Hi, thanks for raising this issue. Is there any idea of timeframe for this to get implemented please? Thank you again.

github-actions · 2025-01-02T02:13:43Z

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

github-actions bot added the service/cognitive-services label Nov 22, 2023

rcskosir added the enhancement label Nov 22, 2023

guilhem added a commit to guilhem/pandora that referenced this issue Mar 24, 2024

config: add cognitiveservices 2023-10-01-preview

4b07bce

This version adds many options. Should be useful for hashicorp/terraform-provider-azurerm#23988

guilhem mentioned this issue Mar 24, 2024

config: add cognitiveservices 2023-10-01-preview hashicorp/pandora#3997

Merged

guilhem mentioned this issue Mar 25, 2024

azurerm_cognitive_deployment dynamic quota #25401

Closed

14 tasks

zioproto mentioned this issue Aug 6, 2024

Support for Dynamic Quota in Chat-Models Azure/terraform-azurerm-openai#91

Open

1 task

onordberg mentioned this issue Sep 7, 2024

Support OpenAI Dynamic Quota (DynamicThrottling) in azure-native.cognitiveservices.Deployment pulumi/pulumi-azure-native#3564

Open

liuwuliuyun mentioned this issue Nov 25, 2024

azurerm_cognitive_deployment - support for the property dynamic_throttling_enabled #28100

Merged

14 tasks

stephybun closed this as completed in #28100 Dec 2, 2024

github-actions bot added this to the v4.13.0 milestone Dec 2, 2024

github-actions bot locked as resolved and limited conversation to collaborators Jan 2, 2025
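
The timeline above shows this issue closed by #28100 (titled "support for the property dynamic_throttling_enabled") and assigned to the v4.13.0 milestone. A minimal sketch of what the native configuration might look like follows. It assumes the property is exposed as a top-level dynamic_throttling_enabled argument and that the 4.x provider uses a sku block in place of the scale block from the original proposal; check the provider documentation for the exact schema. The model and capacity values are taken from the portal payload earlier in the thread, and the account reference reuses the name from the issue description.

```hcl
terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = ">= 4.13.0" # milestone named in the timeline
    }
  }
}

resource "azurerm_cognitive_deployment" "example" {
  name                 = "example-cd"
  cognitive_account_id = azurerm_cognitive_account.example.id

  # Assumed argument name and placement, based on the title of #28100.
  dynamic_throttling_enabled = true

  model {
    format  = "OpenAI"
    name    = "gpt-35-turbo-16k"
    version = "0613"
  }

  # Assumed: "sku" replaces the "scale" block used in the original proposal.
  sku {
    name     = "Standard"
    capacity = 240
  }
}
```

Running terraform plan against an existing deployment should then show whether the flag would change, without needing the azapi workaround.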
