Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add handling in agent cache for server leaf certs #14556

Merged
merged 7 commits into from
Sep 17, 2022
Merged

Conversation

freddygv
Copy link
Contributor

@freddygv freddygv commented Sep 9, 2022

Description

  • Adds a new internally generated and locally managed ACL token for servers to use for management actions. This new ACL token has a corresponding ACL identity and unlimited permissions given that it is generated for server agents.

  • Adds a new server certificate manager for requesting a certificate for servers and updating the TLS configurator as needed.

Testing & Reproduction steps

  • Manual testing
  • Unit testing with multiple test agents

Links

https://go.hashi.co/csl-204-server-cert

PR Checklist

  • updated test coverage
  • external facing docs updated
  • not a security concern

@freddygv freddygv requested review from a team and boxofrad and removed request for a team September 9, 2022 19:09
@github-actions github-actions bot added theme/certificates Related to creating, distributing, and rotating certificates in Consul theme/connect Anything related to Consul Connect, Service Mesh, Side Car Proxies theme/tls Using TLS (Transport Layer Security) or mTLS (mutual TLS) to secure communication labels Sep 9, 2022
@freddygv
Copy link
Contributor Author

freddygv commented Sep 9, 2022

Hmm will take a look at these failures on Monday

Copy link
Contributor

@boxofrad boxofrad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great, nice job! 👏🏻 I really enjoyed getting this review request because it nudged me to read the RFC and learn about this feature.

I've left a bunch of mostly non-blocking comments. The only one I think needs to be addressed is how we're hanging onto stale state stores.

agent/agent.go Outdated Show resolved Hide resolved
agent/structs/acl.go Outdated Show resolved Hide resolved
tlsutil/config.go Show resolved Hide resolved
agent/consul/leader.go Show resolved Hide resolved
agent/consul/servercert/manager.go Outdated Show resolved Hide resolved
agent/consul/servercert/manager.go Outdated Show resolved Hide resolved
@freddygv
Copy link
Contributor Author

freddygv commented Sep 13, 2022

Err the TLS configurator side needs more thinking. I've been trying to fix the test failures but there is a legitimate issue here:

When an autoTLS cert is present we ALWAYS use it as the outbound cert. However, for WANFed via mesh gateways the secondary DCs can't validate the autoTLS server certificates from servers in the primary because the secondaries don't have the Connect CA roots until federation succeeds. This wasn't an issue before because servers never had an autoTLS cert configured.

For server agents we shouldn't use the autoTLS certs for any outbound connections other than peering. But the configurator currently isn't aware of whether it's attached to a server or client agent.

Defaulting to using the manually configured certs for outbound connections would work, but it's a breaking change. To be honest I don't know why we default to autoTLS when a manual cert exists. Seems kind of odd. Also the auto_config docs say These configurations get merged in as defaults with any user-supplied configuration on the client agent able to override them. But I don't see that happening anywhere?

Thoughts @boxofrad ?

@boxofrad
Copy link
Contributor

Ah yes, I remember staring at this for a long while when we switched to a per-listener TLS configuration. 😅

I'm not 100%, but maybe @i0rek remembers? why we always prefer the auto-encrypt/config certificate in outgoing connections. If I had to hazard a guess, I'd say it's because if you're using one of those features, only their automatically provisioned client certificates (signed by the Connect CA) are accepted by the server, not certificates created by consul tls cert create.

I think it'd be ok to break backwards compatibility here just for the gRPC port, as long as we leave the server/multiplexed port alone. Only peering makes outgoing connections on that port, and we've told folks using auto-{encrypt,config} not to upgrade to 1.13 yet, so I don't think it'd impact anyone.

@freddygv freddygv requested a review from boxofrad September 16, 2022 01:35
This commit introduces a new ACL token used for internal server
management purposes.

It has a few key properties:
- It has unlimited permissions.
- It is persisted through Raft as System Metadata rather than in the
ACL tokens table. This is to avoid users seeing or modifying it.
- It is re-generated on leadership establishment.
When the TLS-enabled gRPC port receives a request for the expected
it must use the auto-tls certificates.
This certificate manager will request a leaf certificate for server
agents and then keep them up to date.
- Pulls in CLI test fix from main
- Updates psutils to fix TestAgent_Host on M1 Mac
Preivously the TLS configurator would default to presenting auto TLS
certificates as client certificates.

Server agents should not have this behavior and should instead present
the manually configured certs. The autoTLS certs for servers are
exclusively used for peering and should not be used as the default for
outbound communication.
@freddygv freddygv force-pushed the NET-818-server-cert-v2 branch from c49426a to 1248912 Compare September 16, 2022 23:57
@freddygv freddygv merged commit 0095ca0 into main Sep 17, 2022
@freddygv freddygv deleted the NET-818-server-cert-v2 branch September 17, 2022 00:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
theme/certificates Related to creating, distributing, and rotating certificates in Consul theme/connect Anything related to Consul Connect, Service Mesh, Side Car Proxies theme/tls Using TLS (Transport Layer Security) or mTLS (mutual TLS) to secure communication
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants