Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

key vault soft-delete causing failure when updating secrets or certificates #5659

Closed
AliAllomani opened this issue Feb 10, 2020 · 15 comments · Fixed by #9911
Closed

key vault soft-delete causing failure when updating secrets or certificates #5659

AliAllomani opened this issue Feb 10, 2020 · 15 comments · Fixed by #9911
Assignees
Labels
Milestone

Comments

@AliAllomani
Copy link
Contributor

key vault soft-delete causing failure when updating secrets and certificates

instead of creating a new version of the secret of the certificate, terraform apply fails

15:07:42  module.acme.azurerm_key_vault_secret.cert[0]: Creating...
15:07:44  
15:07:44  Error: keyvault.BaseClient#SetSecret: Failure responding to request: StatusCode=403 -- Original Error: autorest/azure: Service returned an error. Status=403 Code="Forbidden" Message="Operation \"set\" is not allowed as this object's lifetime is managed by Key Vault." InnerError={"code":"SecretManagedByKeyVault"}
15:07:44  
15:07:44    on .terraform/modules/acme/Terraform-modules/acme/vault.tf line 30, in resource "azurerm_key_vault_secret" "cert":
15:07:44    30: resource "azurerm_key_vault_secret" "cert" {
15:07:44  
15:07:44  
14:10:02  module.acme.azurerm_key_vault_certificate.cert[0]: Destruction complete after 3s
14:10:02  module.acme.azurerm_key_vault_certificate.cert[0]: Creating...
14:10:03  
14:10:03  Error: keyvault.BaseClient#ImportCertificate: Failure responding to request: StatusCode=409 -- Original Error: autorest/azure: Service returned an error. Status=409 Code="Conflict" Message="Certificate jenkins is currently being deleted and cannot be re-created; retry later." InnerError={"code":"ObjectIsBeingDeleted"}
14:10:03  
14:10:03    on .terraform/modules/acme/Terraform-modules/acme/vault.tf line 1, in resource "azurerm_key_vault_certificate" "cert":
14:10:03     1: resource "azurerm_key_vault_certificate" "cert" {
14:10:03  
14:10:03 
14:12:17  module.acme.azurerm_key_vault_certificate.cert[0]: Creating...
14:12:18  
14:12:18  Error: keyvault.BaseClient#ImportCertificate: Failure responding to request: StatusCode=409 -- Original Error: autorest/azure: Service returned an error. Status=409 Code="Conflict" Message="Certificate jenkins is currently in a deleted but recoverable state, and its name cannot be reused; in this state, the certificate can only be recovered or purged." InnerError={"code":"ObjectIsDeletedButRecoverable"}
14:12:18  
14:12:18    on .terraform/modules/acme/Terraform-modules/acme/vault.tf line 1, in resource "azurerm_key_vault_certificate" "cert":
14:12:18     1: resource "azurerm_key_vault_certificate" "cert" {
14:12:18  
14:12:18 
@jamesbannan
Copy link

I'm seeing exactly the same issue. The new key_vault block in the features block in the v.2.0 AzureRm Terraform provider does support deleting and purging soft delete-enabled Key Vaults, but this functionality doesn't seem to extend to deleting and purging secrets/keys/certificates which are managed by Terraform:

https://www.terraform.io/docs/providers/azurerm/index.html#purge_soft_delete_on_destroy

@mgrlabs
Copy link

mgrlabs commented Feb 26, 2020

Hi guys,

I have a client that requires soft delete to be enabled on their Key Vaults. This limits the existing AzureRM provider's usefulness due to a lack of handling around deleted but not purged secrets with the same name. In the case where the secret needs to be deleted and recreated using Terraform, the apply will fail due to the secret being in a recoverable state.

There are also some interesting timing challenges around the delete --> purge --> recover that the API presents if you script this use AZ CLI as the secret enters a transition state between each of these major states.

Would love to see support for this in the provider.

@garethmorris78
Copy link

garethmorris78 commented Jul 22, 2020

Any timeline on when this will be fixed; I've just run into this issue. Also Looks like soft delete is now a default option when you create a key vault via the azure portal
image
Also a warning banner for any key vault without it enabled. so chances are it will become the norm rather than the exception to have it enabled.

@LaurentLesle
Copy link
Contributor

@garethmorris78 have you tried adding to your keyvault attributes:

soft_delete_enabled = true

@jvassbo
Copy link

jvassbo commented Oct 13, 2020

We also have this problem. Now that Azure will remove opt-out of soft-delete on all Key Vaults by the end of year, this presents a major problem.
All automation we've built on Terraform for certificate renewals is now a manual taks.

Side note / question: Is it a bug that a azurerm_key_vault_certificate must be re-created and is not updated in-place? The certificate resource does support versioning, but maybe the underlying API does not?

@alex-3sr
Copy link

Hi,

As said just above by @jvassbo , that will be an issue when Azure will force Soft delete on all KV until this end of year, and good remark about update certificate and not destroy/create (versionning must me used here).

This is espacially annoying when we used automatic ACME certificate, and it's not impossible to automatic renew certificates now.

As workaround, i use a random ressource for the naming the key vault certificate, and in CICD task, i terraform taint this random name for force renew with a new certificate name. but I've to schedule it, and renew is not based on expiration days left.

It could be nice to have an option to update actual certificate with new one, without destroy previous one.

Regards
Alex

@tareks
Copy link

tareks commented Oct 30, 2020

I've started looking into this and it appears that there is an overlap in how the key_vault features block impacts the Azure Key Vault itself and the objects (secrets, keys, certificates) within it assuming purge protection is not enabled.

The recover_soft_deleted_key_vaults option will recover both the key vault itself and objects within it. If this is set to false, and the either the keyvault itself or the object in question (for example: azurerm_key_vault_secret) have been destroyed but not purged, terraform apply will error out saying that the resource already exists. If this is set to true, then either the keyvault or the objects will be recovered on terraform apply.

The purge_soft_delete_on_destroy option will permanently delete the azurerm_key_vault resource on a terraform destroy when set to true. Note that this does not apply to all objects within the key vault itself.

In a specific case to azurerm_key_vault_secret (could be the same for keys and certificates), the secret will be deleted and remain in deleted state after a destroy. This means that further "post-execution" steps are required to completely purge the secret. This can be done before/after terraform runs or using local-exec on the same secret resource in question.

As some have already highlighted both in this issue and others, this is becoming more of a concern now that soft-delete will be enforced across all keyvaults later this year and no longer be optional. In situations where terraform is managing its own secrets (creation/updates/deletion), one would expect that a destroy operation can provide the option to "purge" the secret.

That said, one question is: would we use the existing provider-level feature purge_soft_delete_on_destroy to perform a purge after the secret is deleted or would a new option need to be defined that is clearer in terms of what exactly is happening? Would or can we want to enforce this on the resource (per secret/key/certificate) level or keep it provider wide? Note that its not always the case that both the keyvault and the secret are being provisioned within the same state context.

This could be implemented with something like this:

	_, err = client.DeleteSecret(ctx, id.KeyVaultBaseUrl, id.Name)
	log.Printf("[DEBUG] Secret %q deleted.", id.Name)
	// If purge_soft_delete_on_destroy = true and soft_delete is enabled on the keyvault, then purge the secret on deletion
	if meta.(*clients.Client).Features.KeyVault.PurgeSoftDeleteOnDestroy {
		log.Printf("[DEBUG] Purging %q from %q.", id.Name, id.KeyVaultBaseUrl)
		_, err = client.PurgeDeletedSecret(ctx, id.KeyVaultBaseUrl, id.Name)
		// TODO: Wait for secret state to transition as purge operation is not instant and we want to avoid 409s ("ObjectIsBeingDeleted")
		if err != nil {
			return fmt.Errorf("Error purging secret %q from %q: %v", id.Name, id.KeyVaultBaseUrl, err)
		}
	}

Feedback is appreciated. Happy to raise a PR if needed based on what the wider group comes back with.

@tareks
Copy link

tareks commented Nov 9, 2020

@tombuildsstuff Would you be able to provide feedback on this or perhaps suggest someone who can help us move this forward?

@hbuckle
Copy link
Contributor

hbuckle commented Nov 24, 2020

Definitely seems like a big problem waiting to happen. From my observations with certificates

  • When Terraform tries to replace the certificate with a new one it fails with the ObjectIsBeingDeleted error

  • After waiting and trying again, the apply succeeds, but instead of the new certificate being created the old one is restored

@thomasbrueggemann
Copy link

@hbuckle I noticed the same behaviour. What happens if you then run another round of terraform plan & apply? Does it update the key vault objects to the latest values?

@hbuckle
Copy link
Contributor

hbuckle commented Nov 25, 2020

I haven't tried but I assume it would try and remove and re-create it again and therefore fail.

TBH I am currently going through and ripping out the azurerm_key_vault_certificate resource and replacing it with a script provisioner to create new versions, as I can't afford for this to stop working

@tareks
Copy link

tareks commented Nov 25, 2020

@thomasbrueggemann If you re-run the apply, terraform will find that the (recovered) does not match what it expects in state and will then force the update, creating a new version of the secret.

Recovery on apply may well be the right option in certain situations where the secret may have been deleted outside of terraform. It seems like the most pressing concern is that if terraform destroys the secret, we need to be able to specify whether it should be purged or not (perhaps using the features purge_soft_delete_on_destroy option?).

This behavior is specifically for secrets. I haven't tested keys or certificates.

@rdvansloten
Copy link

rdvansloten commented Dec 1, 2020

Bumping this so developers see this. Does anyone have a working fix that doesn't require null_resource?

The az cli module allows specific methods and detects existing configurations without being destructive. It must be possible for Terraform to tap into these same API calls. Below is a workaround. The local_file provider is only necessary to convert base64 to binary .pfx. content_base64 is required to convert base64 to binary, base64decode will error out with UTF-8 issues.

Workaround example:

# Create .pfx file locally
resource "local_file" "b64cert" {
  content_base64 = acme_certificate.certificate.certificate_p12
  filename       = "${filename}.pfx"
}

# Import certificate to Azure Key Vault - Workaround
resource "null_resource" "AzCommand" {
  provisioner "local-exec" {
    command = "az keyvault certificate import --vault-name ${data.azurerm_key_vault.vault.name} -n ${var.certificatename} -f ${filename}.pfx"
  }
}

@ghost
Copy link

ghost commented Dec 17, 2020

This has been released in version 2.41.0 of the provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading. As an example:

provider "azurerm" {
    version = "~> 2.41.0"
}
# ... other configuration ...

@ghost
Copy link

ghost commented Jan 17, 2021

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. If you feel I made an error 🤖 🙉 , please reach out to my human friends 👉 [email protected]. Thanks!

@ghost ghost locked as resolved and limited conversation to collaborators Jan 17, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.