Exposing different interfaces for use of secrets within polykey #55

Closed
robert-cronin opened this issue Jul 10, 2020 · 9 comments
Labels
development Standard development

Comments

@robert-cronin
Contributor

robert-cronin commented Jul 10, 2020

We need to figure out how to expose the secrets contained within polykey for use in different contexts.

Some potential interfaces are:

  • clipboard
  • stdout
  • a file descriptor
  • output to file (or just pipe to file/redirect to file)
  • "env variable" injection
  • http
@robert-cronin robert-cronin added the development Standard development label Jul 10, 2020
@robert-cronin robert-cronin self-assigned this Jul 10, 2020
@CMCDragonkai
Member

Should keep track of this issue too: NixOS/rfcs#59 (comment)

@CMCDragonkai
Member

CMCDragonkai commented Jul 15, 2020

Environment Variables

So to give an example of environment variables.

In any Unix shell you can set environment variables:

x=3
echo "$x"

You can expose these variables to subprograms:

export x=3
echo "$x"
bash -c 'echo "$x"'

However, subprograms CANNOT set environment variables in the parent process, which here is the current shell.
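A quick way to check the point above is the following sketch. The variable names child_view and parent_view are only for illustration.

```bash
#!/usr/bin/env bash
# A child process receives a copy of the environment; changes stay in the child.
export x=3

# The child overwrites its own copy of x and prints it.
child_view="$(bash -c 'x=4; echo "$x"')"

# Back in the parent shell, x still has its original value.
parent_view="$x"

echo "child saw: $child_view, parent still has: $parent_view"
```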

So how would polykey help in the process of setting environment variables?

There are 2 ways:

  1. Have the pk command print instructions for setting the environment variables in the shell. This requires the user to copy the command. I've seen this in other programs, but it's not really a nice UX.
  2. Do what env does. The env command is basically a "fork+exec" at the shell level, with the additional ability to set environment variables.

Let's explore 2.

To do 2. it is possible to not even use env at all.

x=3 bash -c 'echo "$x"'

This ensures that the environment variable is set only for that subprocess, and is not exported to every other subprocess. This is good because it follows the principle of least privilege.
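As a small sketch of that scoping (the variable names scoped, later and in_shell are only for illustration):

```bash
#!/usr/bin/env bash
unset x

# The assignment prefix applies to this one child process only.
scoped="$(x=3 bash -c 'echo "${x:-unset}"')"

# A later child does not inherit it, and neither does the current shell.
later="$(bash -c 'echo "${x:-unset}"')"
in_shell="${x:-unset}"

echo "scoped child: $scoped, later child: $later, current shell: $in_shell"
```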

So why use env? Because it has a lot of extra options for environment variable manipulation and signal manipulation, and for dealing with multi-parameter shebangs.

So how does this apply to Polykey?

The second way for Polykey to interface via environment variables is to expose an env like interface.

Suppose I wanted to pass some secret into some program that uses environment variables.

I could run it with polykey like this.

pk secrets get K --run 'bash -c "echo $K"'

The above basically "chains" the secret coming out of polykey, injecting it directly as an environment variable into the command given to --run.
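To make the idea concrete, here is a rough sketch of what such an env-like wrapper could do, written as a shell function. Both pk_secrets_get (a stand-in for `pk secrets get`, returning a made-up value) and run_with_secret are hypothetical names; this is not how pk is implemented.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for `pk secrets get NAME`; the value is made up.
pk_secrets_get() { printf '%s' 'hunter2'; }

# Run a command with the secret NAME exported only into that command.
run_with_secret() {
  local name="$1"
  shift
  (
    # The subshell is the "fork"; the export never reaches the calling shell.
    export "$name=$(pk_secrets_get "$name")"
    # The "exec": replace the subshell with the requested program.
    exec "$@"
  )
}

out="$(run_with_secret K bash -c 'echo "$K"')"
echo "$out"
```

Exporting inside a subshell, rather than calling env NAME=value, keeps the secret out of any command line that other users could see in a process listing.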

See nix-shell for examples of this. nix-shell has [--command cmd] [--run cmd]. Both take a shell command: --command executes it in an interactive shell, whereas --run executes it in a non-interactive shell. For pk, the distinction worth drawing is between an option that evaluates a shell expression and one that executes the actual program directly, i.e. something in the exec family. See: https://stackoverflow.com/a/20823413 (execvp looks the program up in PATH, whereas execv takes a path to it).

So this would mean that we are embedding the ability to do env-like functionality inside the pk commands. However it isn't entirely necessary. It is possible to allow the user to do command substitution.

K="$(pk secrets get K)" bash -c 'echo "$K"'

However... we should probably care about UX, so let's add the ability to "run" something with a given context. But it should probably not be done in the trivial way I just explained above.

To deal with more complex cases, it may be better to have some "chording" interface. This means almost a small kind of APL-like language in the pk commandline. So you can do things like pk secrets get K, get Y, get C... etc.

There are so many possible ways to organize/format the output. The best way to approach it is to think of it as a language, or a "formatting" language. So you could spit out JSON, or you could spit out environment variable setters:

K='3' Y='4' C='5'

And many possible others.

The point is, this is the "env" variable interface.
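A minimal sketch of the "environment variable setters" format, assuming a hypothetical pk_secrets_get stand-in with made-up values. The helper names shell_quote and format_setters are also only for illustration. The one real subtlety is quoting: a value containing a single quote must be escaped, or applying the setters with eval breaks.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for `pk secrets get NAME`; the values are made up.
pk_secrets_get() {
  case "$1" in
    K) printf '%s' '3' ;;
    Y) printf '%s' '4' ;;
    C) printf '%s' "it's 5" ;;
  esac
}

# Wrap a value in single quotes, rewriting each embedded ' as '\''.
shell_quote() {
  printf "'%s'" "$(printf '%s' "$1" | sed "s/'/'\\\\''/g")"
}

# Print NAME='VALUE' pairs separated by spaces.
format_setters() {
  local name
  for name in "$@"; do
    printf '%s=' "$name"
    shell_quote "$(pk_secrets_get "$name")"
    printf ' '
  done
}

setters="$(format_setters K Y C)"
echo "$setters"

# Apply the setters inside a command substitution so they reach one child only.
child_output="$(
  eval "export $setters"
  bash -c 'echo "$K|$Y|$C"'
)"
echo "$child_output"
```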

@CMCDragonkai
Member

CMCDragonkai commented Jul 15, 2020

Clipboard

Clipboards are a property of the "windowing" system of an operating system, which means there is no standard API for clipboards. If you are on Linux, you have both the X Window System AND Wayland, and these are 2 completely different APIs. If you are on Windows, you probably have some "windows native api" to hook into. If you are on Android or iOS, again, there are native APIs.

Therefore the clipboard interface cannot be done in js-polykey, because clipboards imply a GUI. That means the clipboard interface has to be coded in Polykey.

So this sort of means that we can add a "stub" to the CLI to support clipboard. However, if the GUI version of polykey hasn't been installed, then it just fails: it errors out and outputs to STDERR something like "clipboard is not supported, use Polykey".

Once the Polykey GUI is installed or launched, this functionality becomes possible, and using the clipboard should work.

Examples of doing this:

# this copies to clipboard on linux X window
echo 'abc' | xclip -selection c
# this copies OUT from clipboard on linux X window
xclip -selection clipboard -o | cat

That would mean something like:

pk secrets get K --clipboard

Consider this as part of the "chording" interface.
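As a rough sketch of the "stub" behaviour described earlier, the function below tries a few existing clipboard tools (wl-copy, xclip, pbcopy) and otherwise fails with a message on STDERR. The function name copy_to_clipboard and the choice of tools are assumptions for illustration; the real integration would go through the Polykey GUI.

```bash
#!/usr/bin/env bash
# Copy "$1" to the clipboard if a known tool is usable, else fail on STDERR.
copy_to_clipboard() {
  if command -v wl-copy >/dev/null 2>&1 && [ -n "${WAYLAND_DISPLAY:-}" ]; then
    printf '%s' "$1" | wl-copy
  elif command -v xclip >/dev/null 2>&1 && [ -n "${DISPLAY:-}" ]; then
    printf '%s' "$1" | xclip -selection clipboard
  elif command -v pbcopy >/dev/null 2>&1; then
    printf '%s' "$1" | pbcopy
  else
    echo 'clipboard is not supported, use Polykey' >&2
    return 1
  fi
}

# Demonstrate only the failure path: with an empty PATH no tool can be found.
status=0
message="$(PATH=/nonexistent copy_to_clipboard 'abc' 2>&1)" || status=$?
echo "exit status: $status, stderr: $message"
```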

Clipboard access is primarily a UX thing for desktop GUIs, and for interactive usage on mobile platforms too. On mobile platforms people are more likely to use direct integrations with the system password autofiller. However, the autofiller doesn't always work, so clipboard is still necessary.

For interactive usage on a desktop GUI, you pretty much always use the clipboard. There's no OS-level autofiller that I know of. There are "application-specific" autofillers like browser extensions, and agents/prompts like gpg-agent and pinentry (from the GnuPG project) and ssh-agent (from OpenSSH). There are too many of these integration points, which is why the clipboard is still the universal adapter for passing data between arbitrary GUI programs.

If we were to hook into OS level autofillers/entry systems:

  • Linux - pinentry (GnuPG)
  • Mac - keychain
  • Windows - ?

@CMCDragonkai
Member

CMCDragonkai commented Jul 15, 2020

File Descriptor

This is a theoretically elegant way of passing capabilities around. Because of its advanced nature, it will probably only be used by advanced users like Matrix AI, or other people who really care about security.

A file descriptor is a handle to a kernel object that represents and acts like a file. But it doesn't exist on the filesystem. It's not a "named" file.

myprogramthatneedssecret --password-file /path/to/the/secret/file # we saw this with step-ca
# now with process substitution/redirection, we get a magical file descriptor
myprogramthatneedssecret --password-file <(pk secrets get blah)

Now technically this just really makes use of the shell. There's no magic supplied by polykey.
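To see what the shell actually hands over, here is a small bash sketch. pk_secrets_get and needs_password_file are hypothetical stand-ins for `pk secrets get` and for a program with a --password-file option; the secret value is made up.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for `pk secrets get NAME`; the value is made up.
pk_secrets_get() { printf '%s\n' 'hunter2'; }

# Stand-in for a program that takes --password-file PATH.
needs_password_file() {
  # "$2" is whatever path the shell substituted for <(...).
  printf 'path=%s\n' "$2"
  printf 'secret=%s\n' "$(cat "$2")"
}

report="$(needs_password_file --password-file <(pk_secrets_get blah))"
echo "$report"
```

On Linux the path is typically of the form /dev/fd/N, which is a name that only means something inside the receiving process.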

So this is elegant because it really follows the principle of least privilege.

There's no secret floating around in the environment variables, nor in the filesystem, nor in the clipboard. The file descriptor is created for, and ONLY accessible by, the program it was passed to. There's no way for any other program to read that file descriptor (unless they are able to break the kernel).

If another program wanted to read it, it could not. Firstly because there's no "name" for this file descriptor. The name that myprogramthatneedssecret knows is not a universal name. It's a name only local to that specific process.

Now this also allows myprogramthatneedssecret to pass that file descriptor on. It is possible to use Unix domain sockets (UDS) to pass file descriptors between processes.

One thing though, this makes the fd read-once. It's a "single-use" place. That's good and bad. What if you wanted to be able to read multiple times? It is not possible unless you are using ZSH:

myprogramthatneedssecret --password-file =(pk secrets get blah)

However this is not as secure as before, because ZSH creates a file at /tmp, and uses that as the real file, to allow random access and repeated reads.
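The read-once behaviour of bash process substitution can be checked with a sketch like the following. The function name read_twice and the secret value are made up for illustration.

```bash
#!/usr/bin/env bash
# Read the same path twice and keep both results.
read_twice() {
  first="$(cat "$1")"
  second="$(cat "$1")"
}

# The pipe behind <(...) is drained by the first read.
read_twice <(printf 'hunter2')

echo "first read:  '$first'"
echo "second read: '$second'"
```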

Also, the file descriptor is only readable for as long as myprogramthatneedssecret keeps it open. As soon as the program exits, or explicitly closes the file descriptor, it is no longer readable.

Now because we are using the shell, the idea that the descriptor is only readable by myprogramthatneedssecret is not entirely true.

Because you can do this:

#!/usr/bin/env bash

echo $$

exec 3< <(echo abc)

sleep 10000

cat <&3

Then, from a separate process, run cat /proc/PID/fd/3, where PID is the number echoed by the script above.

The fd however is limited to r-x------, so it is only readable by a program acting under the same user authority.

lr-x------ 1 cmcdragonkai operators 64 Jul 15 20:44 3 -> 'pipe:[10749868]'

But this is all because it is the shell that is doing the magic.

What if Polykey was doing the magic?

Then it would be possible to ensure absolute security of the file descriptor.

How would this work? 2 ways:

  1. Passing directly from parent to child
  2. Via UDS

For parent to child, this is actually similar to the environment variable situation. You would "run" the process, having opened an anonymous pipe, and then write the secret into the pipe, whose read end the child inherits as a file descriptor. However, this requires telling the child process (for example via an argument) which file descriptor number relates to the secret. This is explained here: https://stackoverflow.com/a/21512395

This won't work for a lot of programs, because they were not designed to receive parameters referring to a file descriptor number.
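For a program that does accept a descriptor number, the parent-to-child handover can be sketched in plain shell. Here pk_secrets_get is a hypothetical stand-in for `pk secrets get` with a made-up value, and the child is a small bash script that takes the fd number as its first argument; both are assumptions for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for `pk secrets get NAME`; the value is made up.
pk_secrets_get() { printf '%s' 'hunter2'; }

# The child reads the secret from the fd number given as its first argument.
child='secret="$(cat <&"$1")"; echo "child read ${#secret} bytes from fd $1"'

# 3<&0 duplicates the read end of the pipe onto fd 3 for the child only,
# then 0</dev/null detaches the child's stdin from the pipe.
out="$(pk_secrets_get blah | bash -c "$child" child 3 3<&0 0</dev/null)"
echo "$out"
```

The secret never appears in the environment or on a command line; only the number 3 is passed as an argument.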

For UDS, this could work as well, but again relies on the idea that the receiving program knows how to use the socket. Furthermore the socket has to exist on the filesystem.

Which is why this is theoretically elegant, but practically difficult to do. Unless we are the ones writing the target program as well. So in the case of Matrix OS, this is the case where we could potentially use UDS or other stuff.

For normal use cases, it appears that process substitution/redirection is the best way to do it if you want to use file descriptors. The important thing to realize is that the secret is still readable through /proc by another process if it knows the PID of the shell managing it, if the fd hasn't been fully read yet (because it's a read-once fd), and if it is running under the authority of the same user that is running the shell.

@CMCDragonkai
Member

CMCDragonkai commented Jul 15, 2020

HTTP or GRPC or take your pick of any networked protocol

This interfacing method makes Polykey equivalent to any other password/secret service like Vault or AWS Secrets Manager.

Basically this requires a third party networked client to send a request message with an authentication token that PK checks, after which PK returns a response with the relevant secret.

This adds a little complication, because now you have these authentication tokens that PK needs to keep track of. It can do this statelessly with HMAC tokens, or it can do it statefully. But essentially this makes PK a "web service" like any other secret management system.
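The stateless option can be sketched with openssl. Everything named here is an assumption for illustration: the signing key, the payload layout of user, vault and expiry, and the function names. It is not a proposal for PK's actual token format.

```bash
#!/usr/bin/env bash
# Assumption: a server-side signing key; it would never be hard-coded in practice.
signing_key='example-signing-key'

# Print the hex HMAC-SHA256 of stdin (the last field of openssl's output).
hmac_hex() {
  openssl dgst -sha256 -hmac "$signing_key" | awk '{print $NF}'
}

# Token = payload + "." + HMAC(payload). Nothing is stored on the server.
issue_token() {
  printf '%s.%s' "$1" "$(printf '%s' "$1" | hmac_hex)"
}

# Recompute the HMAC over the payload and check the expiry field.
verify_token() {
  local payload="${1%.*}" signature="${1##*.}" expected
  expected="$(printf '%s' "$payload" | hmac_hex)"
  [ "$signature" = "$expected" ] || return 1
  [ "${payload##*:}" -gt "$(date +%s)" ]
}

# Hypothetical payload layout: user:vault:expiry (Unix time, here 2100-01-01).
token="$(issue_token 'alice:vault-a:4102444800')"

# Same signature, but the scope was edited to a different vault.
tampered="alice:vault-b:4102444800.${token##*.}"

verify_token "$token" && echo 'token accepted'
verify_token "$tampered" || echo 'tampered token rejected'
```

A real service would compare signatures in constant time and keep the key off the command line; the stateful alternative is simply a lookup table of issued tokens.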

So if somebody wanted to use polykey IN THE EXACT SAME WAY they already use AWS Secrets Manager or Vault, then they could. But it does add some extra work to provide the token generation/distribution ability.

To see how to do this, just look at Vault. They have an existing token system for this particular use case.

And also OAuth2 is basically this. @DrFacepalm

To simplify this implementation, you could just take an OAuth2 server JavaScript implementation and use that as the token system.

It seems like we can just use this: https://github.com/oauthjs/node-oauth2-server

Once you do this you have these concepts to model:

  • Users
  • Tokens
  • The relationship between users, tokens and vaults

So basically 1 user, may be given N tokens, and each token has the "authority" a.k.a. "scope" to access a subset of all vaults.

See the OAuth2 spec here: https://tools.ietf.org/html/rfc6749#section-1.2

And talk to @DrFacepalm since he's already gone through it.

This thing should be left as low priority for now.

@CMCDragonkai
Member

Adding another one...

External secret stores

There are some other external secret stores that sometimes users have to utilize.

For example, when using AWS ECS we put secret keys in during deployment via environment variables. However, sometimes these keys should not be exposed to the infrastructure team.

So AWS allows one to supply secret variables as paths into AWS SSM.

So if a user were to be using AWS ECS/SSM then where does polykey fit into that picture?

There needs to be external push/pull integrations.

The HTTP interface is basically a pull integration that any external system can arbitrarily fetch.

But in some cases external push is needed. And this is currently facilitated through the other integrations already mentioned. Users have to be able to "put" stuff into SSM. And sometimes that could mean a manual clipboard copy & paste.

This is not an implementation task, this is just something for us to be wary of during our UI/UX design.

@CMCDragonkai
Member

Adding another one when I was thinking about integration into Matrix OS and NixOS.

Pseudo Filesystems

See: https://en.wikipedia.org/wiki/Synthetic_file_system

Basically we can present certain secrets to the user through presenting a pseudo filesystem and automatically mounting it somewhere.

Of course, this involves things like FUSE, since Polykey is likely to be run by a non-root user.

On the other hand running as root is also possible and that may be relevant to our Matrix OS system (thus managing root level secrets).

Windows will need to do something different like https://dokan-dev.github.io/. That's what keybase uses.

@robert-cronin
Contributor Author

This should really be split into separate issues so we can tackle them asynchronously.

@robert-cronin
Contributor Author

Has been split, child issues:
#67
#68
#69
#70
#71
#72


2 participants