-
Notifications
You must be signed in to change notification settings - Fork 493
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement upstream authority plugin for GCP #2172
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @dragcp! I have some initial comments/suggestions.
} | ||
if len(allCertRoots) == 0 { | ||
rootSpec := config.RootSpec | ||
return nil, makeError(codes.Internal, "no certificate authorities found with label pair %q:%q", rootSpec.LabelKey, rootSpec.LabelValue) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Returning codes.InvalidArgument
feels more appropriate to me here, since the error is due to the current configuration / state rather than an internal plugin failure.
| Configuration | Description | | ||
| ----------------------------- | ----------------------------------------------------------------- | | ||
| root_cert_spec.project_name | Project in GCP that has the root CA certificate | | ||
| root_cert_spec.region_name | The name of the region within GCP | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like this is implemented without the root_cert_spec.
prefix, which I think that is a good choice.
| root_cert_spec.region_name | The name of the region within GCP | | |
| region_name | The name of the region within GCP | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry @amartinezfayo
This configuration table is correct. We do need root_cert_spec. It is the "sample configuration" section which is incorrect. The correct version is there in gcpcas.go as well as in gcpcas_test.go
I will correct and update the sample configuration in this doc to be:
upstreamAuthority "gcp_cas" {
plugin_data {
root_cert_spec {
project_name = "MyProject"
region_name = "us-central1"
label_key = "myapp-identity-root"
label_value = "true"
}
trust_bundle_cert_spec = [
{
project_name = "DifferentProject"
region_name = "us-east1"
label_key = "differentapp-identity-root"
label_value = "true"
}
]
}
With that being the case, is this Configuration table reasonable? Or is there a better way to denote / describe it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see now that there is a hierarchy, I missed that. I think that it's good to have the table but maybe split them in two: one for root_cert_spec
and the other one for trust_bundle_cert_spec
configs, and make it explicit. That way we can avoid to have the prefix notation that may be confusing.
Having a configuration example in addition to that is also good IMO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you
I will incorporate all of your feedback, squash, rebase and repush
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you wouldn't mind just pushing new commits on top, it's easier to see the diff of what has changed that way (reduces review burden). A rebase and squash at the end after it has all been sorted out would be great.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pushing new commits on top is easier for me. I was under the wrong impression that squash and rebase was necessary
Let me stop squashing
| ----------------------------- | ----------------------------------------------------------------- | | ||
| root_cert_spec.project_name | Project in GCP that has the root CA certificate | | ||
| root_cert_spec.region_name | The name of the region within GCP | | ||
| root_cert_spec.label_key | Label key - value pair is used to filter and select the relevant certificate | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like this is implemented without the root_cert_spec.
prefix, which I think that is a good choice.
| root_cert_spec.label_key | Label key - value pair is used to filter and select the relevant certificate | | |
| label_key | Label key - value pair is used to filter and select the relevant certificate | |
| root_cert_spec.project_name | Project in GCP that has the root CA certificate | | ||
| root_cert_spec.region_name | The name of the region within GCP | | ||
| root_cert_spec.label_key | Label key - value pair is used to filter and select the relevant certificate | | ||
| root_cert_spec.label_value | Label key - value pair is used to filter and select the relevant certificate | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like this is implemented without the root_cert_spec.
prefix, which I think that is a good choice.
| root_cert_spec.label_value | Label key - value pair is used to filter and select the relevant certificate | | |
| label_value | Label key - value pair is used to filter and select the relevant certificate | |
| trusted_root_spec | ( Optional ) Array of entries containing project_name, region_name, label_key and label_value as described in root_cert_spec | | ||
|
||
|
||
##Sample configuration: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The conf/server/server_full.conf
file should be updated with an example configuration also.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added it to server_full_conf
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for putting the final touches on this @dragcp. Things are looking pretty good. I had one big question around the inclusion of intermediates being included in the trust bundle that don't terminate with a root in the trust bundle that I think we need to figure out.
Also, it looks like the use of cloud.google.com/go
has pulled in a bump to the gRPC and protobuf dependencies. I think it would be best if those dependencies were bumped in their own PR (that includes the changes to all of the regenerated code). I'm happy to put that PR up if you'd rather.
to generate intermediate signing certificates for SPIRE Server. | ||
|
||
# Considerations | ||
This plugin relies on GCP Certificate Authority Service which is currently in Beta. So please do not use this plugin in production |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the plugin is not ready for production, because GCP CAS is not ready for production, we may consider logging a warning during the Configure()
call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added a warning message
LoadCertificateAuthorities(ctx context.Context, spec CertificateAuthoritySpec) ([]*privatecapb.CertificateAuthority, error) | ||
} | ||
|
||
type gcpCAClient struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: we typically group unexported structs, methods, and functions below exported ones
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have switched the order
} | ||
|
||
// Swap out the current configuration with the new configuration | ||
p.setConfig(config) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should be validating the configuration before setting (or replacing) here. At a minimum we need to ensure the required fields are present and where it is possible, ensure the values are within the proper range or have the proper shape.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am validating it. And added a few tests for that
p.log.Debug("Request to GCP_CAS to mint new X509") | ||
csrParsed, err := x509.ParseCertificateRequest(csr) | ||
if err != nil { | ||
return nil, makeError(codes.Internal, "unable to parse CSR: %v", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably want InvalidArgument here since this would imply SPIRE gave the plugin a bad value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated
// trust bundle spec | ||
rootBundle := make([]*x509.Certificate, 0, len(trustBundle)) | ||
for _, c := range trustBundle { | ||
// Note. We are intentionally retrieving the immediate CAs from the GCP CAS that |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this mean that we can end up with intermediates in the trust bundle that don't chain up to a root also in the trust bundle? This can be very problematic for downstream software, which does not always verify such chains successfully by default (e.g. OpenSSL).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The scenario that I have in mind is:
Lets say that there are three departments of the same organization in google cloud project. The three departments have their own CA - say caX, caY and caZ respectively. And they might all have the same root CA - caM
If each department is their own trust domain ( say domain X, Y and Z ) and trust domain X and Y needs to trust each other but not domain Z
If we have caM in the trust bundle, then wouldn't it allow any department to accept any SVID generated from any department?
I was hoping that if we can tag with appropriate labels, then the trust bundle will only contain caX and caY rather than caM
Or is it a precondition that each trust domain should have an exclusive root CA? That would also eliminate the above problem
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Outside of verification woes by stacks that use OpenSSL (and probably others), I think including intermediates in the trust bundle is dangerous. It makes it much easier to accidentally open up trust in unintended ways. In this case, what you REALLY trust is the root. Without some way of constraining that intermediate, its hard to anchor trust in it. What you don't want is workloads in two trust domains with intermediates in the trust bundle to accidentally trust each other because they got their hands on the root somewhere (say from the system pool) and included it in their verification chain. That opens up doors for one trust domain to impersonate workloads in the other trust domain.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or is it a precondition that each trust domain should have an exclusive root CA? That would also eliminate the above problem
It's not strictly required, but is strongly encouraged and generally assumed
I agree with everything @azdagron said. SPIRE used to include intermediates in the bundle (and exclude the real roots) to address the use case you're describing, but we found that in practice it didn't work well, wasn't very safe, and led to a lot of confusion. Removing certs from the bundle is hard from a compat standpoint - adding them is of course easy :)
I think people that find themselves in the circumstance you describe... which I might summarize as "We all have to have intermediates that share a root due to corporate policy, but we don't really trust each other" ... should lean on name constraints to solve their problems.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. Thank you. Switched to root rather than intermediate
return allCerts, nil | ||
} | ||
|
||
func TestGcpCAS(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: prefer the test methods to be at the top and all of the scaffolding (e.g. fakeClient
) to be below. This follows our general guidance of exported methods/types sorting before unexported methods/types.
Do I need to do "go get cloud.google.com/go", "make all" and "make generate-check" and commit whatever has changed to a new PR? |
We can scope it more finely to just gRPC and protobuf with:
|
Thank you. I have raised a PR at #2175 to upgrade the dependencies separately |
#2175 has been merged. |
I just merged a big change related to UpstreamAuthority plugin interfaces that will cause a some conflicts here. If you don't mind, I can merge in master and push a commit to fix it all up. Happy to clean up my mess. Please let me know if you are ok with that. |
) | ||
|
||
func makeError(code codes.Code, format string, args ...interface{}) error { | ||
return status.Errorf(code, "GCPCAS: "+format, args...) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
error must never start with uppercase, this may be:
return status.Errorf(code, "gcp_cas: "+format, args...)
TrustBundleSpecs []CertificateAuthoritySpec `hcl:"trust_bundle_cert_spec,block"` | ||
} | ||
|
||
type CAClient interface { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure about this interface, with this approach we are not able to tests our filters that are inside LoadCertificateAuthorities,
what do you think about our LoadCerticates only calls gcp but filter and maybe sort is done post loading?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have separated them out and updated test case to cover it
) | ||
|
||
func generateCert(cn string, issuer *x509.Certificate, issuerKey crypto.PrivateKey) (*x509.Certificate, crypto.PrivateKey, error) { | ||
priv, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not to use keys we have on fixture? https://github.com/spiffe/spire/blob/master/test/testkey/new.go#L41
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am using it now
`}) | ||
require.NoError(t, err) | ||
|
||
priv, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
may we use fixture here?
Sorry @azdagron. I did not notice your comment till now. Thank you. Please feel free to do so I have addressed the review comments locally. After you push the changes, I will commit on top of that |
Signed-off-by: ramand.dragcp <[email protected]>
Co-authored-by: Agustín Martínez Fayó <[email protected]> Signed-off-by: ramand.dragcp <[email protected]>
Thanks, @dragcp. I'll take a look. |
conf/server/server_full.conf
Outdated
# project_name = "" | ||
# region_name = "" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wrong keys here
conf/server/server_full.conf
Outdated
# project_name = "" | ||
# region_name = "" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wrong keys here
// Also pick up if there any additional trust CAs (as per label in plugin config) | ||
for _, spec := range config.TrustBundleSpecs { | ||
trustBundleCerts, err := pcaClient.LoadCertificateAuthorities(ctx, spec) | ||
trustBundleCerts = filterOutNonEnabledCAs(trustBundleCerts) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: can you move this after the error check? in general, using the result without checking the error first is a code smell.
Maybe we do the Load+Filter step in a helper that is shared for both the root spec and trust bundle spec loading.
func loadCertificateAuthorities(ctx context.Context, caClient CAClient, spec CertificateAuthoritySpec) ([]*privatecapb.CertificateAuthority, error) {
cas, err := caClient.LoadCertificateAuthorities(ctx, spec)
if err != nil {
// handle error
}
return filterOutNonEnabledCAs(cas), nil
}
I've given it a lot of thought. I don't think we want the functionality provided by In other words, the trust bundle is not the appropriate way to distribute other roots to workloads. There are other mechanisms for that. The Maybe I'm missing something here? |
I've just merged #2180. Consequently, you'll need to touch up the import for the upstreamauthority definitions. Old import: If you'd rather me touch it up and push a commit, I'm happy to oblige. Just let me know. |
Wouldn't it be the same as "supplemental_bundle_path" in aws_pca plugin? The only difference is that this is coming from GCP CAS instead of local disk The associated ticket has a use case listed for two sets of root CA in a single trust domain where we sign with one set but the trust bundle has both: |
Thank you. I will update it as part of my fixes |
Can you help me understand maybe a concrete use case that requires the functionality provided by trust bundle spec? I've dug into the previous PR when it was introduced and I'm not convinced there wasn't a misunderstanding there on how the trust bundle is built and maintained through the preparation and activation intervals, based on the responses from the UpstreamAuthority. I'm not opposed to it per-se, I just need to understand the concrete scenario it is solving. If the use case is theoretical, I'd probably advocate for leaving it out for now. It's much easier to add something like this later than it is to deprecate it out if it turns out to be the wrong thing. |
We have an application composed of multiple services talking to each other. We might also have the same service running concurrently with different versions of the software ( say RC vs Prod ). Any service can call any other service including RC version calling Production version. We want each service to authorize the call based on which service is calling and whether that service is running non-prod version or prod version We could model this as single trust domain. There are a set of root CAs for prod and set of root CAs for non-prod. Then we can have a set of spire servers that sign with only non-prod CAs and another set of spire servers which sign only with prod CAs. The trust bundle in all cases will contain both the prod and non-prod root CAs. In this case, we will have two sets of information. (1) the SPIFFE Id identifies the workload. Say spiffe://mytrustdomain.org/serviceA. (2) the rootCA can tell us whether it was a prod version of the software or non-prod version of the software So after a service to service call has performed mTLS for authentication, it can traverse the CA chain to identify whether it was signed by prod CA or non-prod CA and use that information to take authorization decision Extending it further, we could have few more intermediate certificates with parent as non-prod CA for finer grained authorization ( than just a binary prod vs non-prod ) |
Thank you for taking the time to provide that use case. Some questions:
Do these SPIRE servers share a database (i.e. configured to use the same datastore)? If they have a shared database, then there is no need for each SPIRE server to return anything but the exact upstream root that signed its intermediate, since upstream roots are appended by each SPIRE server to the trust bundle. The datastore layer is the canonical source of the trust bundle for the trust domain. The roots returned from the UpstreamAuthority plugin are _not _ treated as the definitive set of authorities, but are instead appended to the existing roots present in bundle.
An alternative here would be to model the prod and non-prod services as having distinct SPIFFE IDs (e.g. spiffe://mytrustdomain.org/prod/serviceA v.s. spiffe://mytrustdomain.org/prod/serviceB). One complication of using the rootCA as an indicator of whether it is prod or non-prod is that each 1) workload needs to understand what the set of prod CAs is, and that understanding needs to survive CA rotation. 2) each workload needs to perform this awkward inspection of the root CA to determine prod v.s. non-prod. |
Furthermore, let's say you decide later on that you need to federate with other trust domains. Now you've likely pushed this prod/non-prod authorization semantic onto the other trust domain. |
Thank you Let me discuss this with @Jonpez2 and get back to you tomorrow |
Awesome. I appreciate your willingness to work through these things. We're very excited about this plugin. |
Could you please help with one data point: Is there any functional difference between two different spiffe servers running with isolated independent databases ( say sqllite ) vs a single shared database? Is it that: |
Assuming each server has an independent CA, whether self-signed or a distinct upstream CA, then these are effectively two trust domains that happen to share the same name. That's one difference. There are others, like each server will have its own set of registration entries, view of attested agents, etc. |
Thank you |
Another alternative: use a distinct trust domain for prod v.s. non-prod and federate the two. |
Signed-off-by: ramand.dragcp <[email protected]>
Signed-off-by: ramand.dragcp <[email protected]>
Thank you for all the suggestions. I have dropped trust_bundle_cert_spec and also merged in from the master. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Implement upstream authority plugin for GCP Signed-off-by: Andrew Harding <[email protected]>
Signed-off-by: ramand.dragcp [email protected]
Pull Request check list
Affected functionality
Adds a new plugin for GCP Certificate Authority Service backed upstream authority
Description of change
Thank you for reviewing the PR #2039 by @Jonpez2. I am continuing that PR here. I believe I have incorporated all of the feedback
A few notes:
Which issue this PR fixes