Core key vault firewall should not be set to "Allow public access from all networks" #4260

Open: jonnyry wants to merge 23 commits into base: main

Collaborator

jonnyry commented Jan 7, 2025

Resolves #4250

What is being addressed

  • Changes the core key vault firewall from "Allow public access from all networks" to "Allow public access from specific virtual networks and IP addresses"
  • Adds an IP exception to the key vault firewall for the deployment machine's internet IP (or the PUBLIC_DEPLOYMENT_IP_ADDRESS variable if set) during deployment
  • Removes the IP exception at the end of deployment (whether deployment succeeds or fails)
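
A minimal sketch of the IP selection described in the second bullet, not the PR's actual script. It assumes the override variable `PUBLIC_DEPLOYMENT_IP_ADDRESS` named above; the helper names `lookup_public_ip` and `get_deployment_ip` and the lookup URL are hypothetical and chosen only for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: pick the IP address to allow through the key vault firewall.
# PUBLIC_DEPLOYMENT_IP_ADDRESS comes from the PR description; everything else
# here (function names, lookup URL) is an assumption for illustration.

lookup_public_ip() {
  # Hypothetical lookup: any service that echoes the caller's public IP would do.
  curl --silent --fail --max-time 10 "https://api.ipify.org"
}

get_deployment_ip() {
  if [ -n "${PUBLIC_DEPLOYMENT_IP_ADDRESS:-}" ]; then
    # An explicit override wins, e.g. when the runner sits behind a known NAT address.
    printf '%s\n' "${PUBLIC_DEPLOYMENT_IP_ADDRESS}"
  else
    # Otherwise fall back to the machine's internet-facing address.
    lookup_public_ip
  fi
}
```

Keeping the override check in one function means every caller resolves the address the same way, whichever deployment script it is invoked from.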

How is this addressed

  • A new script to add and remove the keyvault deployment IP exception:

    • devops/scripts/kv_add_network_exception.sh
  • It is called from the following scripts in order to provide access to the key vault:

    • core/terraform/deploy.sh
    • core/terraform/scripts/letsencrypt.sh
    • devops/scripts/destroy_env_no_terraform.sh
    • core/terraform/destroy.sh
  • The script uses a bash trap so that the removal runs whether or not the preceding code fails, ensuring the IP exception is always removed
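
The trap behaviour can be sketched roughly as follows. This is an illustration of the pattern under stated assumptions, not the contents of kv_add_network_exception.sh: `add_exception` and `remove_exception` are hypothetical stand-ins for the real Azure CLI calls, and the subshell wrapper `run_with_kv_exception` is one possible structure rather than the one the PR uses.

```shell
#!/usr/bin/env bash
# Sketch only: add a firewall exception, run a command, and always remove
# the exception afterwards. The two helpers below are hypothetical stubs
# that just print what the real script would do against Azure.

add_exception()    { echo "add exception for $1"; }
remove_exception() { echo "remove exception for $1"; }

run_with_kv_exception() {
  local ip="$1"
  shift
  (
    # The EXIT trap fires when this subshell ends, whether the wrapped
    # command succeeded or failed, so the exception is never left behind.
    trap 'remove_exception "$ip"' EXIT
    add_exception "$ip"
    "$@"
  )
}
```

Called as `run_with_kv_exception 203.0.113.7 some_deploy_command`, the wrapper returns the exit status of the wrapped command, so a failed deployment still fails the calling script after the cleanup has run.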

A bug in azurerm provider was encountered which required the use of a terraform provisioner:

  1. A create provisioner on azurerm_key_vault was required to work around an azurerm provider bug: if a key vault is being re-created (it was previously soft deleted), the network ACLs are not updated. The provisioner can be removed when the bug is fixed, or when a different workaround is found.

Updates since initial commit (as discussed with @marrobi):

  1. Remove use of tags and null provisioner to add tag.
  2. Delete the following scripts as they're no longer used:
    • devops/scripts/key_vault_list.sh
    • devops/scripts/set_contributor_sp_secrets.sh
  3. Refactor the add and remove scripts into a single script

github-actions bot commented Jan 7, 2025

Unit Test Results

0 tests   0 ✅  0s ⏱️
0 suites  0 💤
0 files    0 ❌

Results for commit 67c2b2a.

♻️ This comment has been updated with latest results.

Collaborator Author

jonnyry commented Jan 7, 2025

/test 8af920d

github-actions bot commented Jan 7, 2025

🤖 pr-bot 🤖

🏃 Running tests: https://github.com/microsoft/AzureTRE/actions/runs/12660338621 (with refid 26f9d939)

(in response to this comment from @jonnyry)

Collaborator Author

jonnyry commented Jan 7, 2025

/test-extended 8af920d

github-actions bot commented Jan 7, 2025

🤖 pr-bot 🤖

🏃 Running extended tests: https://github.com/microsoft/AzureTRE/actions/runs/12661150197 (with refid 26f9d939)

(in response to this comment from @jonnyry)

#
resource "null_resource" "add_deployment_tag" {
  triggers = {
    always_run = timestamp()

Member

Why does this always need to run? Once it's added once, it shouldn't get removed?

Collaborator Author

The intention was so if the tag is removed in Azure, it will always be readded.

However as discussed, have removed the use of tags altogether, so the provisioner has been removed.

Member

Can the Storage Account rules in this script be handled the same way?

Collaborator Author

Yes certainly - planning to have a look at storage accounts after this.

Collaborator Author

jonnyry commented Jan 8, 2025

/test-destroy-env

github-actions bot commented Jan 8, 2025

Destroying PR test environment (RG: rg-tre26f9d939)... (run: https://github.com/microsoft/AzureTRE/actions/runs/12669260987)

Collaborator Author

jonnyry commented Jan 8, 2025

/test 2970a5d

github-actions bot commented Jan 8, 2025

🤖 pr-bot 🤖

🏃 Running tests: https://github.com/microsoft/AzureTRE/actions/runs/12669597448 (with refid 26f9d939)

(in response to this comment from @jonnyry)

jonnyry force-pushed the jr/upstream-main/93-close-keyvault-firewall branch from 2970a5d to dcb0b8f on January 8, 2025 12:00

Collaborator Author

jonnyry commented Jan 8, 2025

/test 272589f

github-actions bot commented Jan 8, 2025

🤖 pr-bot 🤖

🏃 Running tests: https://github.com/microsoft/AzureTRE/actions/runs/12670289419 (with refid 26f9d939)

(in response to this comment from @jonnyry)

Collaborator Author

jonnyry commented Jan 8, 2025

/test bf9fd32

github-actions bot commented Jan 8, 2025

🤖 pr-bot 🤖

🏃 Running tests: https://github.com/microsoft/AzureTRE/actions/runs/12670349633 (with refid 26f9d939)

(in response to this comment from @jonnyry)

Collaborator Author

jonnyry commented Jan 8, 2025

/test-destroy-env

github-actions bot commented Jan 8, 2025

Destroying PR test environment (RG: rg-tre26f9d939)... (run: https://github.com/microsoft/AzureTRE/actions/runs/12670413797)

jonnyry force-pushed the jr/upstream-main/93-close-keyvault-firewall branch from bf9fd32 to dcb0b8f on January 8, 2025 12:25

github-actions bot commented Jan 8, 2025

PR test environment destroy complete (RG: rg-tre26f9d939)

jonnyry requested a review from tamirkamara on January 8, 2025 13:04

Collaborator Author

jonnyry commented Jan 8, 2025

/test dcb0b8f

github-actions bot commented Jan 8, 2025

🤖 pr-bot 🤖

🏃 Running tests: https://github.com/microsoft/AzureTRE/actions/runs/12671159667 (with refid 26f9d939)

(in response to this comment from @jonnyry)

Collaborator Author

jonnyry commented Jan 8, 2025

/test dcb0b8f

github-actions bot commented Jan 8, 2025

🤖 pr-bot 🤖

🏃 Running tests: https://github.com/microsoft/AzureTRE/actions/runs/12671848713 (with refid 26f9d939)

(in response to this comment from @jonnyry)

CHANGELOG.md Outdated
Collaborator

I don't have time to go over this the next few days and guess @marrobi is the same. Just wanted to point out we now have 2 vaults being used from the deployer point of view.

Collaborator

When CMK is enabled another vault is created in the mgmt resource group

Collaborator Author

jonnyry commented Jan 8, 2025

🤖 pr-bot 🤖

🏃 Running tests: https://github.com/microsoft/AzureTRE/actions/runs/12671848713 (with refid 26f9d939)

(in response to this comment from @jonnyry)

Notes on test run starting with an empty environment:

KV exception added here:

https://github.com/microsoft/AzureTRE/actions/runs/12671848713/job/35314921879#step:3:432

Adding deployment network exception to key vault kv-***...
 Core resource group rg-*** not found

KV exception removed here:

https://github.com/microsoft/AzureTRE/actions/runs/12671848713/job/35314921879#step:3:8259

Removing deployment network exception to key vault kv-***...
 Deployment network exception removed

Collaborator Author

jonnyry commented Jan 8, 2025

/test 135be76

github-actions bot commented Jan 8, 2025

🤖 pr-bot 🤖

🏃 Running tests: https://github.com/microsoft/AzureTRE/actions/runs/12674163834 (with refid 26f9d939)

(in response to this comment from @jonnyry)

Collaborator Author

jonnyry commented Jan 8, 2025

🤖 pr-bot 🤖

🏃 Running tests: https://github.com/microsoft/AzureTRE/actions/runs/12674163834 (with refid 26f9d939)

(in response to this comment from @jonnyry)

Notes on test run starting with an existing TRE:

KV exception added here:

https://github.com/microsoft/AzureTRE/actions/runs/12674163834/job/35322577601#step:3:456

 Adding deployment network exception to key vault kv-***...
 Keyvault kv-*** is now accessible

KV exception removed here:

 Removing deployment network exception to key vault kv-***...
 Deployment network exception removed

https://github.com/microsoft/AzureTRE/actions/runs/12674163834/job/35322577601#step:3:1181

CHANGELOG.md Outdated
Collaborator

There's a pending ask to enable the deployer to access resources such as these keyvaults over private network only.

  1. Will this make this approach obsolete?
  2. If not, this means we will need to do all of this in a conditional way. Right?

Collaborator Author

  1. That would mean all TRE deployers would need to switch to private self hosted runners right? If so, yes this would be obsolete.
  2. If we want to support both deployment patterns - deployment from GitHub hosted runners + deployment from private self hosted runners with KV set to private networking - then yes we'd need to do it conditionally (in order to prevent the KV from being fully public).

I guess it depends on whether making the key vaults private-networking only means switching off the ability to use GitHub hosted runners - is that the plan?

Collaborator

The goal I'm referring to is that resources will be accessed via private network only, which means private runners.

The question for you is whether this PR comes from a similar place but doesn't go as far yet and just limits which public IPs can access. Or, in a situation where private agents / network is done, will you still need this method of limiting public IPs?

Collaborator Author

jonnyry Jan 19, 2025

Yep when the codebase switches to deployment from private runners ONLY, then this change won't be needed anymore.

Collaborator

To clarify, in your deployment/usecase you would also like to use private runners?

Member

Although there are a lot of scripts to maintain...

Not sure there is another way?

Collaborator Author

Was thinking the add & remove scripts could be consolidated down to a single script, so there's only one script to call when you need KV access.

Also there's a couple of scripts I added the add/remove calls to that don't seem to be called from anywhere (they could possibly be removed?):

Member

I think they can go. I've definitely not used either in recent times. @tamirkamara ?

If we can get it into a single script, it might be cleaner?

Collaborator Author

jonnyry Feb 6, 2025

OK simplified version is checked in:

  1. Single kv_add_network_exception.sh script which does both the add and the remove on exit
  2. Deleted old scripts key_vault_list.sh & set_contributor_sp_secrets.sh

Needs a good re-test.

@marrobi @tamirkamara WDYT?

Collaborator Author

Re testing, have run the following scenarios on a local CICD build:

  1. Deploy to empty environment.
  2. Deploy to existing TRE.
  3. Certificate renewal.
  4. Destroy TRE

microsoft deleted a comment from github-actions bot on Jan 19, 2025

Collaborator Author

jonnyry commented Feb 6, 2025

/test dee1c14

github-actions bot commented Feb 6, 2025

🤖 pr-bot 🤖

🏃 Running tests: https://github.com/microsoft/AzureTRE/actions/runs/13178338837 (with refid 26f9d939)

(in response to this comment from @jonnyry)

marrobi requested a review from Copilot on February 7, 2025 19:48

Copilot reviewed 1 out of 11 changed files in this pull request and generated no comments.

Files not reviewed (10)
  • core/terraform/deploy.sh: Language not supported
  • core/terraform/destroy.sh: Language not supported
  • core/terraform/keyvault.tf: Language not supported
  • core/terraform/locals.tf: Language not supported
  • core/terraform/scripts/letsencrypt.sh: Language not supported
  • core/version.txt: Language not supported
  • devops/scripts/destroy_env_no_terraform.sh: Language not supported
  • devops/scripts/key_vault_list.sh: Language not supported
  • devops/scripts/kv_add_network_exception.sh: Language not supported
  • devops/scripts/set_contributor_sp_secrets.sh: Language not supported
Development

Successfully merging this pull request may close these issues.

Core key vault firewall should not be set to "Allow public access from all networks"
3 participants