Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Server CA Manager cleanup #859

Merged
merged 6 commits into from
Apr 25, 2019
Merged

Conversation

azdagron
Copy link
Member

  • CA no longer returns full SVID chain back to signing authority and
    manager code required to facilitate is removed.
  • X509CA and JWT keys rotate separately (currently at the same interval)
  • removed unnecessary details out of the CA struct so it can be trivially
    used from tests.
  • replaced on-disk JSON store with a protobuf-based journal that
    maintains a number of old entries for debugging.

- CA no longer returns full SVID chain back to signing authority and
manager code required to facilitate is removed.
- removed unnecessary details out of the CA struct so it can be trivally
used from tests.
- replaced on-disk JSON store with a protobuf-based journal that
maintains a number of old entries for debugging.

Signed-off-by: Andrew Harding <[email protected]>
Signed-off-by: Andrew Harding <[email protected]>
Copy link
Member

@evan2645 evan2645 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Super clean, things are a lot easier to follow now... thank you for this!

A few comments here, some more important than others.

@@ -703,7 +708,7 @@ func (h *Handler) signCSRs(ctx context.Context,
}

} else if spiffeID == h.c.TrustDomain.String() {
h.c.Log.Debugf("Signing downstream SVID for %v on request by %v", spiffeID, callerID)
signLog.Debug("Signing downstream SVID for caller")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should be explicit about this being a CA cert

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@@ -691,7 +696,7 @@ func (h *Handler) signCSRs(ctx context.Context,
return nil, errors.New("SVID serial number does not match")
}

h.c.Log.Debugf("Signing SVID for %v on request by %v", spiffeID, callerID)
signLog.Debug("Signing SVID for caller")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this can be "Renewing agent SVID"?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

signLog := h.c.Log.WithFields(logrus.Fields{
"caller_id": callerID,
"spiffe_id": spiffeID,
})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As long as we're in here, it would be good to add source ip and port to this entry e.g. source_address

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do.

token, err := ca.jwtSigner.SignToken(jsr.SpiffeId, jsr.Audience, expiresAt, signer, kp.jwtSigningKey.Kid)
if err != nil {
return "", fmt.Errorf("unable to sign JWT-SVID: %v", err)
func (ca *CA) calculateLifetime(ttl time.Duration, expirationCap time.Time) (notBefore, notAfter time.Time) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: perhaps capLifetime? It would make the purpose of the second argument very clear when seen from a callsite

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

}

upstreamCA, _ := m.c.Catalog.GetUpstreamCA()
x509CA, trustBundle, err := SignX509CA(ctx, signer, upstreamCA, m.c.UpstreamBundle, m.c.TrustDomain.Host, m.c.CASubject, notBefore, notAfter)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason not to have UpstreamSignX509CA and check ok? This function signature is a mouthful and it seems that SignX509CA complexity can be reduced by a lighter if/else check here

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, i did this to DRY up this code when called from the fakeserverca.... i could dup the if/then code if you think it would be clearer...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a blocker. I think this is probably the only real spot where GetUpstreamCA matters, ignoring the ok return parameter is mostly what caught my eye, in addition to the curiousity of why we are passing a signer and also an upstreamCA. I do think having UpstreamSignX509CA would be an improvement, but it's pretty minor.

I didn't carefully consider impact on fakeserverca, though I would probably lean towards optimization in the "real" code. As you wish!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ended up fixing it and liked the results. good suggestion.

@@ -237,11 +304,21 @@ func (m *manager) pruneBundleEvery(ctx context.Context, interval time.Duration)
}
}

func (m *manager) pruneBundle(ctx context.Context) (err error) {
func (m *Manager) activateJWTKey() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function should probably appear after prepareJWTKey and before pruneBundleEvery

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😬 I done goofed resolving a merge conflict. Thanks for catching.

DNSList []string
}

type serverCAConfig struct {
type X509CA struct {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this struct could be a little clearer if we separately define something like Certificate and UpstreamChain. I know that in node api and workload api we have a single slice where element 0 is the leaf and the rest is the chain (if any), but AFAICT the reason for that is that this is how TLS does it.

IsIntermediate is then defined by len(UpstreamChain) > 0. The IsIntermediate bool isn't often evaluated, and where it is, an empty UpstreamChain seems like it could simplify things. The name IsIntermediate is also a little confusing because when upstream_bundle is false and we are using UpstreamCA, the cert is still technically an intermediate.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can get behind that. Will fix.


template, err := CreateServerCATemplate(csrDER, ca.c.TrustDomain.Host, notBefore, notAfter, serialNumber)
template, err := CreateX509SVIDTemplate(csrDER, ca.c.TrustDomain.Host, notBefore, notAfter, serialNumber)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where does CA bit get set? Key usages there are different too.

I get worried about DRYing the CA minting process with the regular leaf SVID minting process, and I think that this change validates the paranoia (not that it is your fault, it can happen to anyone). Note that previously we called CreateServerCATemplate. Handling CA certificate signing for downstream servers is a super sensitive operation... aside from having nuanced differences in the minting process, it is also easier to introduce security bugs/holes that pass CR (I almost missed this one myself).

I'd feel a lot more comfortable having repeated code, one path which handles minting of downstream CA certificate, and one which handles minting of leaf certificates.

Relatedly, seems that we don't have adequate unit test coverage on SignX509SVID vs SignX509CASVID. I can see that we were anemic there from before, but now would be an awesome time to beef that up to assert that the former is not giving CA certs, and the latter is giving properly-formed CA certs (e.g. CA bit is set, key usage is appropriate, etc)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops 😬 I'll un-DRY and beef up the testing.

evan2645
evan2645 previously approved these changes Apr 25, 2019
Copy link
Member

@evan2645 evan2645 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great!! thanks again, and sorry it took so long to review. A couple small non-blocking nits, approved anyway 🍨 🦅

// UpstreamChain contains the CA certificate and intermediates necessary
// to chain back to the upstream trust bundle. It is only set if the CA
// is signed by an UpstreamCA and the upstream trust bundle is included in
// the SPIRE trust bundle (see the upstream_bundle configurable).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The word included is a little confusing since IIUC if upstream_bundle is set, then the SPIRE trust bundle is the upstream bundle? Perhaps It is only set if the CA is signed by an UpstreamCA and upstream_bundle is set to true?

s.NotEmpty(svid.SubjectKeyId, "SubjectKeyId is not set")
s.Equal(x509.KeyUsageKeyEncipherment|x509.KeyUsageKeyAgreement|x509.KeyUsageDigitalSignature, svid.KeyUsage, "key usage does not match")
s.Equal([]x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth, x509.ExtKeyUsageClientAuth}, svid.ExtKeyUsage, "ext key usage does not match")
s.False(svid.IsCA, "CA bit is set")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, thanks for this ❤️


// SPIFFE ID should be set to that of the trust domain
if s.Len(svid.URIs, 1, "has no URIs") {
s.Equal("spiffe://example.org", svid.URIs[0].String())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we be allowing this? Perhaps we should file an issue, I think it is against SPIFFE spec

// when the upstream bundle has been included in the trust bundle (see
// upstream_bundle configurable).
bool is_intermediate = 4;
// DER encoded upstream CA chain. This will also contain the CA certificate.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this may also contain the CA certificate?

- changed AllowAnyInTrustDomain to only allow trust domain members, and
not the trust domain ID itself
- cleaned up some comments
- added a few more tests around ID validation for signing ops

Signed-off-by: Andrew Harding <[email protected]>
}

// Subject is hard coded by the CA and should not be pulled from the CSR.
s.Equal("O=SPIRE,C=US", svid.Subject.String())
}

func (s *CATestSuite) TestSignX509SVIDCannotSignTrustDomainID() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💯

s.Require().NoError(err)
s.Require().Len(svid, 1)
s.Require().Equal(s.clock.Now().Add(-backdate), svid[0].NotBefore)
s.Require().Equal(s.clock.Now().Add(time.Minute), svid[0].NotAfter)
}

func (s *CATestSuite) TestSignCAX509SVIDValidatesTrustDomain() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💯

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants