Polykey secrets env integration #289

brynblack · 2024-09-27T02:32:33Z

Description

This PR integrates the polykey secrets env command into the development shell hook, to securely load development secrets into the development environment.

Tasks

1. Replace . ./.env with pk secrets env ...
2. Update CI to include Polykey and pull from seed node
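
A minimal sketch of the idea behind task 1, written in Python purely for illustration: rather than sourcing a plaintext .env file, secrets are handed to a child process through its environment only. The `run_with_secrets` helper and the `API_TOKEN` value are hypothetical; the actual `pk secrets env` invocation and its output format are not shown in this PR and are not modelled here.

```python
import os
import subprocess
import sys


def run_with_secrets(command, secrets):
    """Run `command` with `secrets` added to its environment only.

    Nothing is written to disk, and the parent's environment is not modified.
    """
    child_env = {**os.environ, **secrets}
    return subprocess.run(
        command, env=child_env, capture_output=True, text=True, check=True
    )


if __name__ == "__main__":
    # Hypothetical secret; in practice the value would come from a Polykey vault.
    result = run_with_secrets(
        [sys.executable, "-c", "import os; print(os.environ['API_TOKEN'])"],
        {"API_TOKEN": "example-value"},
    )
    print(result.stdout.strip())
```

The child sees the secret, while the shell that launched it never has the value exported or stored in a file.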

Final checklist

tegefaulkes · 2024-09-27T06:01:19Z

@brynblack does this not link back to any issues?

flake.nix

CMCDragonkai · 2024-09-28T02:27:14Z

Where's the issue?

aryanjassal · 2025-01-31T01:22:59Z

We need to get this issue completed. The main blocker is the usage of Polykey in the CI environment.

So far, we have discussed that creating an agent the normal way isn't predictable, as it randomly generates a node ID. As such, we cannot delegate authority to a randomly created agent.

We can use the recovery passphrase to re-generate the node with the same node ID. However, how will the password and the passphrase be stored safely? If it is hard-coded in the repo or in the CI file, then it can be leaked. Now that we are no longer using a runner image in newer CIs, we cannot pre-load the information onto the runner file system either. Perhaps the passphrase and password could be a repo-specific secret.
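
To make the contrast concrete, here is a rough sketch in Python of why a recovery passphrase gives a predictable identity while a fresh bootstrap does not. This is a stand-in only: the PBKDF2 and SHA-256 steps, the salt, and both function names are assumptions made for illustration, and Polykey's real key derivation is not reproduced here.

```python
import hashlib
import secrets


def derive_node_id(recovery_phrase: str) -> str:
    """Derive a stable identifier from a recovery phrase (illustrative only).

    PBKDF2 stretches the phrase into a seed, and a truncated SHA-256 digest of
    that seed plays the role of the node ID.
    """
    seed = hashlib.pbkdf2_hmac(
        "sha512", recovery_phrase.encode("utf-8"), b"demo-salt", 2048
    )
    return hashlib.sha256(seed).hexdigest()[:16]


def random_node_id() -> str:
    """Model a freshly bootstrapped agent: a new identity on every call."""
    return secrets.token_hex(8)


if __name__ == "__main__":
    phrase = "hypothetical recovery phrase stored as a repo secret"
    # Same phrase, same ID: authority can be delegated to it ahead of time.
    print(derive_node_id(phrase) == derive_node_id(phrase))
    # Fresh identities differ between runs, so nothing can be pre-delegated.
    print(random_node_id() != random_node_id())
```

The practical consequence is the one raised above: the phrase itself becomes the thing that has to be stored safely.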

This is not the end of all issues, however. We also need to figure out a distribution method for a Polykey agent.

Should we integrate the agent in the runner image itself? That can no longer be done, as we are transitioning to the usage of a custom action which sets up nix on the stock runners. No images are being used anymore, so this idea will not work after the CI upgrade.

We currently don't have an action which sets up Polykey and runs the agent in the background. Brynley has created a custom action which prepares the nix environment, so she might be able to set up a Polykey action. The simplest way I can think of doing this would be downloading the package directly from the releases page for the appropriate platform, then using the aforementioned methods to launch an agent and delegate it authority.

Alternatively, we can use the inbuilt package manager to download Polykey (https://docs.github.com/en/actions/using-github-hosted-runners/using-github-hosted-runners/customizing-github-hosted-runners#installing-software-on-ubuntu-runners). This method works on all platforms, so custom packages can be imported. However, the examples use apt for Ubuntu, brew for macOS, and Chocolatey for Windows, and we currently don't have Polykey published to these package managers. There might be ways to 'sideload' Polykey through them, but that likely won't be supported, and an API change might introduce failures in all our CIs.

This is partially related to MatrixAI/Polykey#222. Implementing egress schema would help with smooth CI by restricting the exported secrets for each workflow.

This leads to another point - how to delegate secrets. Should each repo get a vault? Or each branch? Technically, each runtime is unique, so do we need a unique vault per runtime? (of course not!)

Once support for egress schema has been implemented, a vault can be dedicated for a single repo, with egress schema controlling the secrets that are exported. This idea seems the best to me, but perhaps I could be overlooking something.
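
As a rough sketch of the one-vault-per-repo idea, the Python below filters a vault's contents through a per-workflow allow-list before anything is exported. The schema shape (workflow name mapped to a list of secret names), the `export_secrets` helper, and all secret names are assumptions for illustration; the actual egress schema format from MatrixAI/Polykey#222 is not defined in this thread.

```python
def export_secrets(vault, egress_schema, workflow):
    """Return only the secrets the schema allows `workflow` to receive.

    A workflow that is not listed in the schema gets nothing.
    """
    allowed = egress_schema.get(workflow, [])
    missing = [name for name in allowed if name not in vault]
    if missing:
        raise KeyError(f"schema lists secrets absent from the vault: {missing}")
    return {name: vault[name] for name in allowed}


if __name__ == "__main__":
    # Hypothetical vault dedicated to a single repo.
    vault = {
        "NPM_TOKEN": "npm-example",
        "DOCKER_PASSWORD": "docker-example",
        "SIGNING_KEY": "key-example",
    }
    # Hypothetical egress schema: each workflow sees only what it needs.
    schema = {"test": [], "release": ["NPM_TOKEN", "SIGNING_KEY"]}
    print(sorted(export_secrets(vault, schema, "release")))
```

Defaulting to an empty export for unknown workflows keeps a misnamed job from receiving the whole vault.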

Eventually, we would use a tool like Orchestrator or even Polykey Enterprise to manage secret delegation, but initially, we would need someone to delegate and manage these secrets, which can get cumbersome very quickly.

However, there is another issue with dogfooding Polykey to this extent.

What if a commit deploys a change which breaks Polykey on the CI, but isn't caught by the tests? The newer Polykey version would be published but break in CI, which would prevent any other commits from triggering the CI, leaving it deadlocked in a broken state. We would need to manually downgrade Polykey to allow the CI to trigger, then upgrade it to the default version.

That issue is specific to a scenario where Polykey is always fetched at the latest version. However, what about the case where Polykey is fetched at a fixed version instead? In that case, whenever a new major update is released, the pin would need to be manually updated in every repo.
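
The cost of pinning can at least be made visible. The Python sketch below classifies a repo's pinned version against the latest release, so lagging pins could be reported rather than discovered by accident. It is an illustration under stated assumptions: versions are plain `X.Y.Z` strings (optionally prefixed with `v`, no pre-release tags), and `pin_status` and the repo names are hypothetical. It does not address the deadlock scenario itself.

```python
def parse_version(version: str) -> tuple:
    """Turn 'v1.2.3' or '1.2.3' into (1, 2, 3)."""
    return tuple(int(part) for part in version.lstrip("v").split("."))


def pin_status(pinned: str, latest: str) -> str:
    """Classify a pinned version relative to the latest release."""
    pin, new = parse_version(pinned), parse_version(latest)
    if pin == new:
        return "current"
    if pin[0] < new[0]:
        return "major-behind"
    if pin < new:
        return "behind"
    return "ahead"


if __name__ == "__main__":
    # Hypothetical pins across repos, compared against one latest release.
    pins = {"repo-a": "1.4.0", "repo-b": "2.0.0", "repo-c": "2.0.1"}
    for repo, pinned in pins.items():
        print(repo, pin_status(pinned, "2.0.1"))
```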

I haven't thought of a solution to this issue yet. This potential failure case might need more discussion.

This is my understanding of the current state of this issue, namely getting Polykey into the CI environments. I might be missing some key details, though, so some discussion might be needed. Thoughts, @CMCDragonkai @brynblack?

CMCDragonkai · 2025-02-03T19:14:15Z

As we discussed during our earlier meeting, I believe it makes sense to have a Polykey agent running separately, orchestrated by the Orchestrator. This agent will run in our cloud. Our CI workers will then pull secrets down according to the schema.

You want to incorporate:

Vault and File Schema - Ingress and Egress Schemas (Polykey#222), to support the egress secrets.schema.json
Filesystem Egress & Ingress - Authority Flow Push/Pull Automation (Polykey#835); to a lesser extent, this is important for any usage of "desktop CLI tools", but ideally we prefer to use env variables where we can
https://github.com/MatrixAI/Orchestrator/issues/42
https://github.com/MatrixAI/Orchestrator/issues/36
https://github.com/MatrixAI/Orchestrator/issues/37
https://github.com/MatrixAI/Orchestrator/issues/33
https://github.com/MatrixAI/Orchestrator/issues/1

There's the issue of secret-zero. How do you "give the initial secret" to the polykey client to call the agent?

I recently had a discussion with ChatGPT about "workload identities", and I believe there is a short-lived token delegation process that should be investigated: ChatGPT-IAM and Service Accounts.pdf

But I think that, at the moment, the easiest way is to just pass the root password to each job at an Organisation level.

But @brynblack, you have not been keeping up with closing issues, so there's way too much entropy here. I'm going to try to get @Abby010 to help (along with some debug/tracing problems for PK in production). It's time to think operationally now.

CMCDragonkai · 2025-02-10T00:41:58Z

@brynblack comment when consolidated.

brynblack self-assigned this Sep 27, 2024

CMCDragonkai reviewed Sep 28, 2024

CMCDragonkai requested a review from aryanjassal September 28, 2024 02:27

brynblack force-pushed the feature-pk-integration branch from cbdc2c4 to bab5959 Compare October 9, 2024 01:20

wip

e685d20

brynblack force-pushed the feature-pk-integration branch from bab5959 to e685d20 Compare October 10, 2024 03:11

This was referenced Oct 18, 2024

Allow parsing just the vault name without requiring the secret path #305

Merged

Allow exporting all environment variables by default for secrets env #312

Merged
