
[Discuss] Indexing permissions as part of the Elastic Agent policy #101

Closed
ruflin opened this issue Feb 23, 2021 · 25 comments
Labels
Team:Elastic-Agent Label for the Agent team

Comments


ruflin commented Feb 23, 2021

Currently all Elastic Agents enrolled into the fleet-server get the same permissions. To cover all the use cases, these are the most permissive permissions. This proposal is to reduce the permissions given to an Elastic Agent based on its policy. With this, each Elastic Agent would receive an API Key with the minimal permissions needed to get its job done.

Elastic Agent Policy contains permissions block

The fleet-server creates the API Keys for the Elastic Agents. Which permissions an Elastic Agent requires is based on the content of the policy. Because of this the policy should contain a section with the permissions it requires. This section could look similar to the following (inspired by Elasticsearch API Key permissions):

policy: A
permissions: [
  {
    "names": ["logs-*", "metrics-*"],
    "privileges": ["create_doc"]
  }
]

The above would give create_doc permissions for the logs-* and metrics-* data streams. If an Elastic Agent is enrolled in policy A, an Elasticsearch output API Key would be created with only the above permissions and added to the policy.

The above model also works for more complex cases. Let's assume we have a case where only two indices should be allowed to be written to, and one index should also have read permissions. This could look like the following:

policy: B
permissions: [
  {
    "names": ["logs-nginx.access-default", "logs-nginx.error-default"],
    "privileges": ["create_doc"]
  },
  {
    "names": ["state-docs"],
    "privileges": ["create_doc", "read"]
  }
]

If an Elastic Agent is enrolled in policy B, it is given permission to write to the nginx data streams and to read from and write to the state-docs index.

In general, this model can be used to be as permissive or restrictive as needed based on the Elasticsearch permission model. The maximum permissions that can be given to an Elastic Agent are limited by the permissions the fleet-server user itself has.
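
To make the mapping from a permissions block to an Elasticsearch API key concrete, here is a minimal sketch in Go (fleet-server's language) with hypothetical type and function names. It builds the body of an Elasticsearch Create API key request (`POST /_security/api_key`), embedding the policy permissions as the index privileges of a single role descriptor; the key and role names are illustrative only.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Permission mirrors one entry of the policy's permissions block
// (hypothetical struct; the real policy schema may differ).
type Permission struct {
	Names      []string `json:"names"`
	Privileges []string `json:"privileges"`
}

// apiKeyRequestBody builds the JSON body for POST /_security/api_key,
// embedding the policy permissions as the index privileges of a single
// role descriptor.
func apiKeyRequestBody(agentID string, perms []Permission) ([]byte, error) {
	body := map[string]interface{}{
		"name": "agent-output-" + agentID, // naming convention is illustrative only
		"role_descriptors": map[string]interface{}{
			"output": map[string]interface{}{
				"indices": perms,
			},
		},
	}
	return json.MarshalIndent(body, "", "  ")
}

func main() {
	perms := []Permission{
		{Names: []string{"logs-*", "metrics-*"}, Privileges: []string{"create_doc"}},
	}
	out, err := apiKeyRequestBody("agent-1", perms)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```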

Change of policy

A change to a policy can mean the permissions required by the Elastic Agent change. For example, at first only nginx access logs were collected, but now error logs are collected as well, which requires additional privileges for the error log data stream. The fleet-server must be able to hand out new API Keys with increased / reduced permissions when a policy changes. In addition, old API Keys have to be invalidated.
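
For the invalidation part, a sketch of what the call could look like, assuming fleet-server talks to the Elasticsearch invalidate API key endpoint (`DELETE /_security/api_key`) directly over HTTP; authentication and the decision of when it is safe to invalidate (see the ACK discussion further down) are left out.

```go
package keys

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// invalidateAPIKeys asks Elasticsearch to invalidate the previous output
// API keys after a policy change has been rolled out. esURL and the key
// IDs are placeholders; authentication and retries are omitted for brevity.
func invalidateAPIKeys(esURL string, keyIDs []string) error {
	body, err := json.Marshal(map[string]interface{}{"ids": keyIDs})
	if err != nil {
		return err
	}
	req, err := http.NewRequest(http.MethodDelete, esURL+"/_security/api_key", bytes.NewReader(body))
	if err != nil {
		return err
	}
	req.Header.Set("Content-Type", "application/json")
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("invalidating API keys: %s", resp.Status)
	}
	return nil
}
```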

Same permissions for all processes per Elastic Agent

The above assumes there are no sub-permissions per input in an Elastic Agent. Whatever runs in an Elastic Agent, the input that requires the most permissions will define the permissions of the API Key. If, for example, the APM integration is run together with the Endpoint integration and the APM process requires read access to certain indices, the Endpoint process would also get the same permissions. This simplifies the permissions model.

On the policy creation side, it is important to notify the user about potential issues through concepts like Trusted / Untrusted integrations or similar, but this is not part of the privileges concept itself.

Fleet

The permissions which are part of the policy need to be created somewhere. It is expected that these are created in Fleet. Every integration should have the option to specify which permissions it needs. Based on this information and user input such as the namespace, Fleet needs to generate the permissions block. The UX and parts needed in Fleet should be discussed separately.
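
The generation itself would live in Fleet (Kibana), so the snippet below is only an illustration in Go, with hypothetical types, of how a permissions entry could be derived from an integration's declared data streams and the user-selected namespace, following the `<type>-<dataset>-<namespace>` naming scheme.

```go
package fleet

import "fmt"

// DataStream describes one data stream declared by an integration
// (hypothetical type; the real definition lives in the package manifest).
type DataStream struct {
	Type    string // e.g. "logs" or "metrics"
	Dataset string // e.g. "nginx.access"
}

// permissionEntry builds one entry of the policy permissions block for an
// integration, scoping it to the user-selected namespace following the
// <type>-<dataset>-<namespace> naming scheme.
func permissionEntry(streams []DataStream, namespace string) map[string]interface{} {
	names := make([]string, 0, len(streams))
	for _, s := range streams {
		names = append(names, fmt.Sprintf("%s-%s-%s", s.Type, s.Dataset, namespace))
	}
	return map[string]interface{}{
		"names":      names,
		"privileges": []string{"create_doc"},
	}
}
```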

@scunningham

To the extent that can be achieved, it would be desirable for the fleet server to limit the amount of interpretation it has to do. Ideally, the permissions required would be listed explicitly in a section of the policy, or if necessary, multiple sections that the fleet server has to combine. It's not yet clear where in the system these explicit permissions will be enumerated and what support is necessary to do so.

Also, there will be some race conditions to deal with here. Ideally, the fleet server would not revoke the previous ingest key until the policy was ACK'd by the agent. This would largely prevent the scenario where the key was delayed in rolling out, or we lost comms to the agent immediately after the old key was revoked, but before the new policy rolled out. That said, this will require some stateful book-keeping in the fleet server.

@scunningham

Also, Fleet-Server should have the permission to create these data streams. Basically, for any explicitly enumerated data stream that does not have the auto_configure privilege, Fleet-Server will need to attempt to create the data stream (and no-op if already there). This is to avoid having to provide auto_configure privileges to untrusted endpoints, which is what we have right now.
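
A sketch of that create-if-missing behaviour, assuming the data stream is created via `PUT /_data_stream/<name>` and that an already-existing data stream surfaces as a `resource_already_exists_exception`; which component runs this (Fleet-Server or Kibana) is discussed further down, and names and error handling here are illustrative only.

```go
package setup

import (
	"fmt"
	"io"
	"net/http"
	"strings"
)

// ensureDataStream creates a data stream if it does not exist yet, and
// treats an "already exists" response as a no-op. esURL is a placeholder;
// authentication and retries are omitted for brevity.
func ensureDataStream(esURL, name string) error {
	req, err := http.NewRequest(http.MethodPut, esURL+"/_data_stream/"+name, nil)
	if err != nil {
		return err
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode == http.StatusOK {
		return nil // created
	}
	body, _ := io.ReadAll(resp.Body)
	if strings.Contains(string(body), "resource_already_exists_exception") {
		return nil // already there, nothing to do
	}
	return fmt.Errorf("creating data stream %s: %s", name, resp.Status)
}
```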

@ph ph added the Team:Fleet Label for the Fleet team label Feb 23, 2021

ph commented Feb 23, 2021

The data stream fields are already present in each input, or default values already exist. If we add a "permission" key under the "input", the "stream", or both, the inputs can be traversed and an aggregated set of permissions can be generated.

@ruflin Where would the required permissions be defined? Would the package be responsible for defining the permissions? The system integration should only be able to define permissions for `logs-system.*` and `metrics-system.*`.
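
A rough sketch of the traversal/aggregation idea described above, with hypothetical field names since the exact schema for per-input permissions is still open.

```go
package fleetserver

import "strings"

// InputPermission is a hypothetical per-input permissions entry; the real
// schema for attaching permissions to inputs/streams is still to be decided.
type InputPermission struct {
	Names      []string
	Privileges []string
}

// aggregatePermissions walks the permission entries of all inputs in a
// policy and merges them into one set for the output API key, dropping
// exact duplicates. Smarter merging (e.g. collapsing overlapping index
// patterns) could be added later.
func aggregatePermissions(inputs [][]InputPermission) []InputPermission {
	seen := map[string]bool{}
	var merged []InputPermission
	for _, perms := range inputs {
		for _, p := range perms {
			k := strings.Join(p.Names, ",") + "|" + strings.Join(p.Privileges, ",")
			if !seen[k] {
				seen[k] = true
				merged = append(merged, p)
			}
		}
	}
	return merged
}
```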

@ph ph added the Team:Elastic-Agent Label for the Agent team label Feb 23, 2021

ph commented Feb 23, 2021

@andresrc FYI, I am adding this item to both the Agent and Fleet boards; once the design is done we will surely have 2 (or more) issues.


ph commented Feb 23, 2021

This is the design issue to address the different permissions needed by APM-Server.

With the above proposal the Elastic Agent would still have a single API key to talk to Elasticsearch; that API key will have the correct permissions defined on it based on the configuration. Based on this discussion I will close elastic/kibana#91704


axw commented Feb 24, 2021

Thanks for filing this @ruflin. This sounds similar to what I've had in mind, like an app store type approach: install an app (integration), and the store (Fleet/Integrations UI) prompts you to grant additional privileges. I guess we can skip this until untrusted integrations need the feature.

In an email thread @simitt raised the principle of least privilege. I tend to agree that in an ideal world we would restrict each integration's process from obtaining another integration's privileges that it doesn't need access to; but there's no practical issue with this for APM's foreseeable future.


ruflin commented Feb 24, 2021

@scunningham My assumption was that Fleet (Kibana) would be the tool that sets up all the data streams if needed, as setup of the stack already happens there. fleet-server just takes the permissions and creates API keys based on them.

@ph I assume the definition of the permissions needed should happen where the data streams are defined. So this would be in the manifest of the data_stream directory, for example. But we should have a separate discussion around this in the package-spec repo.

@scunningham

@ruflin I would prefer if Kibana creates the data streams as well, assuming the kibana_system user has permissions to do that.

@scunningham

I should clarify that, to the extent we can, we should not be giving wildcarded privileges to the endpoints. Even an append-only logs-* privilege will allow an attacker to scribble over any log data stream in the system. We should try to limit permissions as specifically as possible to the target data streams.


ruflin commented Feb 24, 2021

I don't think the kibana_system user has, or should have, that permission. But the user that creates the policy should have sufficient permissions, be it in the UI or via an API call. The creation of the data streams happens as soon as the user hits save for a policy.

With the above model, we delegate the responsibility for how many privileges an integration needs to the creator of the integration. The model would still allow someone to use logs-*-* if really needed; fleet-server would just create the API keys. Instead, I think we should have additional features in Fleet where certain flags can be set on policies. For example, only if a user sets a trusted flag on a policy would * or dynamic fields be allowed in the policy permissions. Otherwise Fleet would overwrite certain flags from the integrations and hopefully tell the user that not all the permissions are ok for this policy.
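
As a purely hypothetical illustration of such a trusted flag, Fleet could filter wildcard patterns out of untrusted policies before generating the permissions block; the flag name and the behaviour below are not decided anywhere yet.

```go
package fleet

import "strings"

// validatePermissionNames drops wildcard index patterns unless the policy
// is flagged as trusted. Both the flag and this check are hypothetical;
// the real UX and flags are to be designed in Fleet.
func validatePermissionNames(names []string, trusted bool) []string {
	if trusted {
		return names
	}
	var allowed []string
	for _, n := range names {
		if strings.Contains(n, "*") {
			continue // dropped; the UI should warn the user about this
		}
		allowed = append(allowed, n)
	}
	return allowed
}
```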

@scunningham

Yep, that sounds right. Customers should not be able to implicitly create data streams by adding an integration to a policy without having the privileges to do so. For the exceptional cases, and there will be some, the user adding the integration should have those privileges as well. Exceptional cases should be called out in the UX by marking a policy as untrusted or with some other obvious treatment.


ruflin commented Feb 25, 2021

Do we foresee any other permissions which might be given out to Agents besides read / write index permissions? The reason I'm asking is that if we only need index permissions, we could just take the ES format that exists today and add it to the policy. So if new permissions are introduced, only Kibana would need to be updated; fleet-server would just forward the request to ES and not know about the details. One additional permission I have in mind is around pulling artifacts. That is not directly defined in the API key itself, but fleet-server should know whether agents are allowed to pull artifacts or not.


ruflin commented Feb 25, 2021

Thinking out loud about a potential implementation: we add this feature to fleet-server and Fleet at first, but just keep shipping down what we have defined today as defaults. This makes it possible to start iterating on this feature in Kibana without requiring changes to fleet-server.

@ruflin ruflin mentioned this issue Mar 8, 2021
@ph ph unassigned ruflin Mar 8, 2021
@ph ph added Team:Elastic-Agent Label for the Agent team and removed Team:Elastic-Agent Label for the Agent team Team:Fleet Label for the Fleet team labels Mar 8, 2021

ph commented Mar 8, 2021

@ruflin looking at #101 (comment) I think @aleksmaus could take that action item?


ruflin commented Mar 9, 2021

I filed elastic/kibana#94058 on the Kibana side as a follow up.

There was an offline discussion around how we detect changes to the permissions so that we don't have to give out new API keys all the time. The idea was to potentially use a hash that is persisted and used for comparison. @nchaulet This might also solve elastic/kibana#85777?


ph commented Mar 9, 2021

@ruflin Concerning the hash, is this something that fleet-server would compute? We have code in place in the Elastic Agent that can be reused to generate a predictable hash for a specific tree structure.


ruflin commented Mar 10, 2021

The hash could be calculated by Fleet or fleet-server. I was thinking of fleet-server as it needs to read and store the config anyway. Someone should think through the pros / cons here.

@scunningham

Fleet Server should normalize the input and calculate a SHA-2 hash of the normalized input. This is the natural place for the calculation, as the Fleet Server will be making the decision whether or not it needs to regenerate an output API key based on this input. SHA-2 is pretty brute force; a more subtle method could be employed later; however, I think it is sufficient to handle the 99% use case here.

By normalized, I mean pull in the JSON, pretty print it, and hash that. We want to avoid the situation where the upstream processor (Kibana) generates a random key order, breaking the hash on each policy update.
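
A minimal sketch of that normalize-then-hash idea, assuming the permissions section arrives as JSON: unmarshalling into a generic value and re-marshalling gives a stable key order (Go's encoding/json writes map keys sorted), so upstream key ordering no longer affects the hash. SHA-256 stands in for "sha2" here.

```go
package fleetserver

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
)

// permissionsHash normalizes a JSON permissions section and returns a
// SHA-256 hash of the normalized form. Round-tripping through a generic
// value makes the hash independent of the key order produced upstream,
// since encoding/json writes map keys in sorted order.
func permissionsHash(raw []byte) (string, error) {
	var v interface{}
	if err := json.Unmarshal(raw, &v); err != nil {
		return "", err
	}
	norm, err := json.Marshal(v)
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256(norm)
	return hex.EncodeToString(sum[:]), nil
}
```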


ph commented Mar 10, 2021

This is what I had in mind. We have something in elastic-agent that does the normalization to decide if the configuration changed, so we could use the same logic there.


ruflin commented Mar 17, 2021

One issue to think through is how and when we revoke the old API keys.


aleksmaus commented Mar 25, 2021

For the very first-cut PR that works against elastic/kibana#94591, we are not invalidating the API keys.

In the next iteration we could:
- keep an array of old key ids (maybe with a TTL, possibly with coordinator/revision idxs) within the agent document
- revoke the keys as soon as we get the policy coordinator/revision past the values that are kept, or expire them within a reasonable time (1 week)?
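
A sketch of what that bookkeeping could look like, with hypothetical field names for the records kept on the agent document.

```go
package fleetserver

import "time"

// retiredKey is a hypothetical record kept on the agent document for an
// output API key that has been superseded by a policy change.
type retiredKey struct {
	ID             string
	PolicyRevision int64
	RetiredAt      time.Time
}

// keysToInvalidate returns the retired keys that are safe to invalidate:
// either the agent has ACKed a newer policy revision, or the key has
// outlived a grace period (TTL).
func keysToInvalidate(retired []retiredKey, ackedRevision int64, ttl time.Duration, now time.Time) []string {
	var ids []string
	for _, k := range retired {
		if ackedRevision > k.PolicyRevision || now.Sub(k.RetiredAt) > ttl {
			ids = append(ids, k.ID)
		}
	}
	return ids
}
```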


ruflin commented Mar 25, 2021

If we get the ACK from all Agents for a policy, does it mean all sub-processes are already shipping data with the new API Keys, or can there be a certain delay, meaning some of the events are still sent with the old API Keys? A TTL of 1 week sounds pretty long; assuming all Agents have acked, we could probably choose something shorter?


aleksmaus commented Mar 25, 2021

Just had a conversation with @blakerouse over Slack. As far as I understood, the Beats are not restarted and will continue using the old key.
Here is the issue about that:
elastic/beats#24538

So it's not clear at the moment when it is safe to invalidate the key.

It would be nice to be able to update the output key without restarting the Beats, though.


ruflin commented Mar 26, 2021

@urso Would it be possible to exchange API Keys in the output without a restart of the Beat?


ruflin commented Mar 31, 2021

Closing as #187 was merged.
