Workload identity : lack of usable user_claim when using Nomad namespaces and Vault entities #23510
Hi @the-nando! So just to summarize the problem as you see it here, it's not that bound claims don't work, but that the (For what it's worth, I suspect our intent here is that there's a 1:1 mapping between Nomad namespace and Vault namespace, but I realize that's not always going to be feasible. Especially because Nomad namespaces are in CE and Vault namespaces are in ENT.)
Hey @tgross 👋 Bound claims work as intended but It would also be worth adding a note in the tutorial mentioning the possible implications of using
Ok thanks @the-nando. I'll get this surfaced for roadmapping.
When defining Vault entities the `user_claim` must be unique. When writing Vault binding rules for use with Nomad workload identities the binding rule won't be able to create a 1:1 mapping because the selector language allows accessing only a single field. The `nomad_job_id` claim isn't sufficient to uniquely identify a job because of namespaces. It's possible to create a JWT auth role with `bound_claims` to avoid this becoming a security problem, but this doesn't allow for correct accounting of user claims.
Add a new claim `nomad_workload_id` that uniquely identifies a Nomad job by using the namespaced job ID (with a separator that cannot appear inside a namespace name). This will allow any external consumer of WI to use a single claim field for binding rules, so long as that consumer is ok with sharing the binding rule across groups within a job or tasks within a group (at which point they'll need to go look at the task/service fields).
Fixes: #23510
Ref: https://hashicorp.atlassian.net/browse/NET-10372
Ref: https://hashicorp.atlassian.net/browse/NET-10387
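To make the collision concrete, here is a minimal Go sketch, not Nomad's implementation. It assumes a `:` separator (the PR text only says the separator must not be able to appear in a namespace name), and the `job` type, `workloadID`, and `aliasCounts` helpers are hypothetical names introduced for illustration. It counts how many distinct alias names result when two same-named jobs in different namespaces are keyed by job ID alone versus by the namespaced ID.

```go
package main

import "fmt"

// job is a hypothetical, stripped-down view of a Nomad job.
type job struct {
	Namespace string
	ID        string
}

// workloadID joins namespace and job ID into a single claim value.
// The ":" separator is an assumption made for this sketch.
func workloadID(namespace, jobID string) string {
	return namespace + ":" + jobID
}

// aliasCounts counts how many jobs map to each alias name produced by keyFn.
func aliasCounts(jobs []job, keyFn func(job) string) map[string]int {
	counts := make(map[string]int)
	for _, j := range jobs {
		counts[keyFn(j)]++
	}
	return counts
}

func main() {
	jobs := []job{{"team-a", "api"}, {"team-b", "api"}}

	// Keyed on job ID only: both jobs share the alias "api".
	byJobID := aliasCounts(jobs, func(j job) string { return j.ID })

	// Keyed on the namespaced ID: each job gets its own alias.
	byWorkload := aliasCounts(jobs, func(j job) string {
		return workloadID(j.Namespace, j.ID)
	})

	fmt.Println(len(byJobID), len(byWorkload)) // prints "1 2"
}
```

With job ID as the key, both jobs land on one alias, which is the accounting problem described above; the namespaced key keeps them apart.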
I've got a draft PR up here #23675. The implementation is easy, but I want to do some testing with Vault to make sure it's getting us what we want so that'll need some E2E testing.
Hi @the-nando, I was wondering if you could elaborate on this. I can understand that agreeing upon a fully qualified name ahead of time might be a hassle, but I'm worried about the security implications of relying purely on If we do add a new field would it make sense to make it Out of curiosity would #19438 (custom claims) also address this? It would not prevent multiple jobs from sharing a value, but perhaps there's no concern with job submitters being able to do that. If custom claims would address your use case, I have a slight preference for it since it seems very difficult to articulate to users when to use
Custom claims as described in #19438 could totally solve it but they make the ergonomics for job authors not very nice, as now the job author is responsible for describing the claim for all their jobs. Maybe not bad for "this one job needs it" but if there was a case where many many jobs need third-party auth that needs a claim like However, along those lines what if we made this a server configuration? For example, cluster administrators could specify extra claims in their
server {
  identity {
    extra_claims = {
      "example"  = "${region}:${namespace}:${id}"
      "whatever" = "${region}:${namespace}:${id}"
    }
  }
}
vault {
  default_identity {
    aud = ["vault.io"]
    ttl = "1h"
    extra_claims = {
      "example"  = "${region}:${namespace}:${id}"
      "whatever" = "${region}:${namespace}:${id}"
    }
  }
}
If we did this, we could allow job authors to have
Hi @schmichael
I do have overlapping namespaces and job names across clusters but they are connected to different Vault clusters.
@tgross thanks for the input on the custom claims, your answer sums up my point of view as well. A generic solution for custom claims is more versatile and welcome, as long as it can also be controlled at the server configuration level. Introducing changes to job specs is often a non-trivial exercise when running hundreds of them deployed by different teams.
Ok, so @schmichael and I had a chat and I think we've settled on the idea of introducing an extra claims block that accepts template strings in the server configuration. So in the Vault block you'll do something like this:
vault {
  address = "https://vault.example.com:8200"
  enabled = true
  default_identity {
    aud = ["vault.io"]
    ttl = "1h"
    extra_claims {
      nomad_workload_id = "${job.namespace}:${job.id}"
      some_other_claim  = "foo"
    }
  }
}
We'll need to do a little investigation to see the exact objects we can expose in those templates, but that's the gist of things. This allows us to avoid adding lots more claims to the JWT that some users might not need, while giving cluster admins the flexibility they need to meet their requirements for controls. We'll also probably want to add the same feature for a top-level
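As a rough illustration of how such template strings could be resolved, the Go sketch below expands `${...}` references against an allow-list of fields and rejects anything outside it. This is not Nomad's interpolation code: it uses the standard library's `os.Expand` as a stand-in, `expandClaim` is a hypothetical helper, and the field names `job.namespace` and `job.id` are taken from the example in this comment, with the final set of exposed fields still to be decided.

```go
package main

import (
	"fmt"
	"os"
)

// expandClaim interpolates ${...} references in tmpl using only the
// fields in the allow-list. Referencing any other field is an error.
func expandClaim(tmpl string, fields map[string]string) (string, error) {
	var unknown []string
	out := os.Expand(tmpl, func(name string) string {
		v, ok := fields[name]
		if !ok {
			unknown = append(unknown, name)
		}
		return v
	})
	if len(unknown) > 0 {
		return "", fmt.Errorf("unsupported fields in claim template: %v", unknown)
	}
	return out, nil
}

func main() {
	// Assumed field names, matching the example configuration.
	fields := map[string]string{
		"job.namespace": "team-a",
		"job.id":        "api",
	}

	claim, err := expandClaim("${job.namespace}:${job.id}", fields)
	fmt.Println(claim, err)

	// A static value such as "foo" passes through unchanged.
	static, err := expandClaim("foo", fields)
	fmt.Println(static, err)
}
```

Failing on unknown fields, rather than silently substituting an empty string, keeps a typo in the server configuration from producing a claim value that several jobs would share.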
Upcoming work to add extensibility to identity claims for Vault (ref #23510) will require exposing server configuration and more objects from state to the process of creating an `IdentityClaims` struct. Depending on how we inject these parameters into the constructor, we end up creating circular dependencies or a lot more logic in the setup in the plan applier and alloc endpoint.
There are three contexts where we call `NewIdentityClaims`: the plan applier (where we only care about the default identity), signing task identities, and signing service identities. Each needs different parameters. So we'll refactor the constructor as a builder with methods that the caller can decide to use (or not) depending on context.
I've pulled this work out of #23675 to make it easier to review separately.
Ref: #23510
Ref: #23675
Ref: https://hashicorp.atlassian.net/browse/NET-10372
Ref: https://hashicorp.atlassian.net/browse/NET-10387
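The builder shape described here might look roughly like the Go sketch below. Every name in it (`identityClaims`, `claimsBuilder`, `withTask`, `withExtraClaims`) is hypothetical and simplified; Nomad's actual types and method names are not shown in this thread. The point is only that each caller opts into the inputs it has, instead of one constructor taking every parameter.

```go
package main

import "fmt"

// identityClaims is a simplified stand-in for Nomad's IdentityClaims struct.
type identityClaims struct {
	Namespace string
	JobID     string
	TaskName  string
	Extra     map[string]string
}

// claimsBuilder collects optional inputs before producing the claims.
type claimsBuilder struct {
	claims identityClaims
}

// newClaimsBuilder takes only the inputs every caller has.
func newClaimsBuilder(namespace, jobID string) *claimsBuilder {
	return &claimsBuilder{claims: identityClaims{Namespace: namespace, JobID: jobID}}
}

// withTask adds task details; only task identity signing needs this.
func (b *claimsBuilder) withTask(name string) *claimsBuilder {
	b.claims.TaskName = name
	return b
}

// withExtraClaims copies administrator-defined claims into the result.
func (b *claimsBuilder) withExtraClaims(extra map[string]string) *claimsBuilder {
	b.claims.Extra = make(map[string]string, len(extra))
	for k, v := range extra {
		b.claims.Extra[k] = v
	}
	return b
}

// build returns the assembled claims.
func (b *claimsBuilder) build() identityClaims {
	return b.claims
}

func main() {
	// A caller that only cares about the default identity skips the optional steps.
	base := newClaimsBuilder("team-a", "api").build()

	// A caller signing a task identity opts into task and extra claims.
	task := newClaimsBuilder("team-a", "api").
		withTask("web").
		withExtraClaims(map[string]string{"nomad_workload_id": "team-a:api"}).
		build()

	fmt.Printf("%+v\n%+v\n", base, task)
}
```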
Although we encourage users to use Vault roles, sometimes they're going to want to assign policies based on entity and pre-create entities and aliases based on claims. This allows them to use a single default role (or at least a small number of them) that has a templated policy, but have an escape hatch from that.
When defining Vault entities the `user_claim` must be unique. When writing Vault binding rules for use with Nomad workload identities the binding rule won't be able to create a 1:1 mapping because the selector language allows accessing only a single field. The `nomad_job_id` claim isn't sufficient to uniquely identify a job because of namespaces. It's possible to create a JWT auth role with `bound_claims` to avoid this becoming a security problem, but this doesn't allow for correct accounting of user claims.
Add support for an `extra_claims` block on the server's `default_identity` blocks for Vault. This allows a cluster administrator to add a custom claim on all allocations. The values for these claims are interpolatable with a limited subset of fields, similar to how we interpolate the task environment.
Fixes: #23510
Ref: https://hashicorp.atlassian.net/browse/NET-10372
Ref: https://hashicorp.atlassian.net/browse/NET-10387
#23675 has been merged and will ship in the upcoming Nomad 1.8.3 (with backports to Nomad Enterprise 1.7.x and 1.6.x).
Thanks a LOT @tgross!
I'm working on migrating some clusters from the legacy Vault token based integration to the new workload identity based one.
My aim is to be able to create a single Vault entity per workload, set entity specific policies and use that in addition to the generic role's token policy.
The tutorial suggests using `"user_claim": "/nomad_job_id"` and a templated Vault policy utilising the claim-mapped metadata, something along the lines of:
To cater for jobs which may require additional ad-hoc policies, I want to pre-create Vault identities for workloads that will have one or more additional identity policies.
To get this to work I would use an entity alias based on the `user_claim` to map it to that entity. This would allow me to set up a default token workload policy, like in the tutorial, with templated paths, and for any exception I could just create a policy with the same name as the one we assign to the entity.
The problem is that the `user_claim` isn't unique when one uses `/nomad_job_id` in combination with Nomad namespaces, as the job ID isn't unique within a Nomad cluster.
The implication on the Vault side is that any job by the same name will get assigned the same implied identity, which is a potential security risk that could lead to unintended access to Vault resources.
A workaround is to create a Vault JWT role per workload and configure `bound_claims`:
But this completely invalidates the features of Vault entity management. Furthermore, to my knowledge, a JWT user claim must be unique within the system. It would perhaps be better to recommend that users use `"user_claim": "/sub"` if they don't intend to use `bound_claims`.
What I would like is to be able to use a unique claim, something like `nomad_workload_id: "<namespace>:::<job_id>"`, which can then be leveraged on the Vault side to configure entities and aliases accordingly.
`"/sub"` wouldn't work as it contains additional details, like region/taskgroup/task/identity, which are something a Vault operator may not know upfront for each job.
Can such a `user_claim` be made available?
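The two approaches contrasted above can be sketched as Vault JWT role payloads built in Go. This is an illustration under assumptions, not the tutorial's configuration: the `jwtRole` struct covers only a subset of role parameters, the policy name `nomad-workloads` and audience `vault.io` are placeholder values, and `/nomad_workload_id` is the claim being requested in this issue rather than one Nomad emits by default.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// jwtRole models a subset of the parameters of a Vault JWT auth role.
type jwtRole struct {
	RoleType             string            `json:"role_type"`
	BoundAudiences       []string          `json:"bound_audiences"`
	UserClaim            string            `json:"user_claim"`
	UserClaimJSONPointer bool              `json:"user_claim_json_pointer"`
	BoundClaims          map[string]string `json:"bound_claims,omitempty"`
	TokenPolicies        []string          `json:"token_policies"`
}

// perWorkloadRole is the workaround: one role per job, pinned to a
// namespace and job ID through bound_claims.
func perWorkloadRole(namespace, jobID string) jwtRole {
	return jwtRole{
		RoleType:             "jwt",
		BoundAudiences:       []string{"vault.io"},
		UserClaim:            "/nomad_job_id",
		UserClaimJSONPointer: true,
		BoundClaims: map[string]string{
			"nomad_namespace": namespace,
			"nomad_job_id":    jobID,
		},
		TokenPolicies: []string{"nomad-workloads"}, // assumed policy name
	}
}

// sharedRole is the requested alternative: a single role whose user claim
// is already unique per namespaced job, so no bound_claims are needed.
func sharedRole() jwtRole {
	return jwtRole{
		RoleType:             "jwt",
		BoundAudiences:       []string{"vault.io"},
		UserClaim:            "/nomad_workload_id", // hypothetical claim requested here
		UserClaimJSONPointer: true,
		TokenPolicies:        []string{"nomad-workloads"}, // assumed policy name
	}
}

func main() {
	for _, r := range []jwtRole{perWorkloadRole("team-a", "api"), sharedRole()} {
		body, err := json.MarshalIndent(r, "", "  ")
		if err != nil {
			panic(err)
		}
		fmt.Println(string(body))
	}
}
```

The per-workload variant needs one role per job, while the shared variant needs one role in total and leaves per-job exceptions to entities and aliases.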