Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Should we restrict the alerting and task manager health APIs further? #102100

Closed
mikecote opened this issue Jun 14, 2021 · 14 comments
Closed

Should we restrict the alerting and task manager health APIs further? #102100

mikecote opened this issue Jun 14, 2021 · 14 comments
Assignees
Labels
bug Fixes for quality problems that affect the customer experience discuss estimate:medium Medium Estimated Level of Effort Feature:Alerting/RulesFramework Issues related to the Alerting Rules Framework Feature:Alerting Feature:Task Manager security Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams)

Comments

@mikecote
Copy link
Contributor

Currently, both health APIs are available to any Kibana user.

  1. Should the alerting framework health API work only for users who access the alerting APIs?
  2. Should the task manager health API work only for system administrators (+superusers)?
@mikecote mikecote added discuss Feature:Alerting Feature:Task Manager Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) labels Jun 14, 2021
@elasticmachine
Copy link
Contributor

Pinging @elastic/kibana-alerting-services (Team:Alerting Services)

@mikecote
Copy link
Contributor Author

mikecote commented Jun 23, 2021

Adding to 7.15 candidates to explore if we need to do this in a major.

@gmmorris gmmorris added the Feature:Alerting/RulesFramework Issues related to the Alerting Rules Framework label Jul 1, 2021
@gmmorris gmmorris added the loe:large Large Level of Effort label Jul 6, 2021
@mikecote mikecote added the loe:needs-research This issue requires some research before it can be worked on or estimated label Jul 12, 2021
@gmmorris gmmorris added security estimate:medium Medium Estimated Level of Effort estimate:needs-research Estimated as too large and requires research to break down into workable issues labels Aug 16, 2021
@gmmorris gmmorris removed loe:needs-research This issue requires some research before it can be worked on or estimated loe:large Large Level of Effort labels Sep 2, 2021
@mikecote mikecote changed the title [8.0 only] Should we restrict the alerting and task manager health APIs further? Should we restrict the alerting and task manager health APIs further? Sep 29, 2021
@gmmorris gmmorris added the bug Fixes for quality problems that affect the customer experience label Oct 1, 2021
@gmmorris
Copy link
Contributor

gmmorris commented Oct 1, 2021

We have decided to treat how open these health endpoints are as a bug.

This means we no longer consider this a breaking change, but rather a fix intended on correcting a broken implementation.

@gmmorris
Copy link
Contributor

gmmorris commented Oct 4, 2021

Whoever picks this up, please talk to @elastic/kibana-security to figure out how best to approach this in terms of who should have access. :)

@gmmorris gmmorris removed the estimate:needs-research Estimated as too large and requires research to break down into workable issues label Oct 18, 2021
@gmmorris
Copy link
Contributor

As we have decided to consider this a bug we will aim to get this done asap, including post FF.

@chrisronline chrisronline self-assigned this Oct 20, 2021
@chrisronline chrisronline removed their assignment Oct 21, 2021
@pmuellr
Copy link
Member

pmuellr commented Nov 1, 2021

Still not clear to me HOW to determine a user is authorized to get this info. The easiest thing I think we could do is try to read the task manager index with the es client using the user credentials. So if you can do a search against the index, then you are authorized to use this API. So we'd basically be requiring read access to the task manager index, which would work for super user, and anyone given broader index privileges than usually provided (for example, for diagnostic purposes).

@ymao1 ymao1 self-assigned this Nov 17, 2021
@ymao1
Copy link
Contributor

ymao1 commented Nov 18, 2021

Should the task manager health API work only for system administrators (+superusers)?

We can limit the task manager health API to users with Global All privileges (basically the kibana_admin role) by adding an internal privilege definition for the task manager feature (or the task manager health feature if we want to be more targeted). There are examples of other APIs that do this (thanks @legrego for this information!), like the features API

router.get(
{
path: '/api/features',
options: { tags: ['access:features'] },
validate: {
query: schema.object({ ignoreValidLicenses: schema.boolean({ defaultValue: false }) }),
},
},

global: {
all: [
actions.login,
actions.version,
actions.api.get('decryptedTelemetry'),
actions.api.get('features'),
actions.space.manage,
actions.ui.get('spaces', 'manage'),
actions.ui.get('management', 'kibana', 'spaces'),
actions.ui.get('catalogue', 'spaces'),
actions.ui.get('enterpriseSearch', 'all'),
...allActions,
],

This would be more restrictive than @pmuellr's suggestion of testing for read access to the task manager index. Do we have any concerns that this will make SDHs more difficult in the future? We currently ask for the task manager health output pretty frequently when trying to diagnose issues. Is it too much of a barrier to limit this API to superusers?

@pmuellr
Copy link
Member

pmuellr commented Nov 18, 2021

This would be more restrictive than @pmuellr's suggestion of testing for read access to the task manager index. Do we have any concerns that this will make SDHs more difficult in the future? We currently ask for the task manager health output pretty frequently when trying to diagnose issues. Is it too much of a barrier to limit this API to superusers?

I'm not sure there's much of a difference. You'd need to add a priv to read the index anyway. Feels more natural to go with the flow and make use of this Global All privilege.

@mikecote
Copy link
Contributor Author

@kobelb, @ymao1 and I are looking for your 👍.

We are aiming to restrict the task manager health API to superuser only. Unfortunately, we do not have the data to understand what impacts this change will cause. Especially from a supportability perspective when it comes to users pulling data from those APIs (SDHs). We are thinking of gathering the necessary data (telemetry) to understand the impacts before restricting the APIs. Would this path make sense as compared to making the restriction without knowing the implications?

@kobelb
Copy link
Contributor

kobelb commented Nov 18, 2021

Gathering the necessary data using telemetry before putting this restriction in place makes sense to me. What data were you going to collect using telemetry? All of the privileges that the user has that is accessing the task manager health API or were you going to look for specific privileges?

@mikecote
Copy link
Contributor Author

mikecote commented Nov 19, 2021

What data were you going to collect using telemetry? All of the privileges that the user has that is accessing the task manager health API or were you going to look for specific privileges?

@kobelb We would be looking for a specific privilege. The plan would be to capture if the user has Global All privilege when calling the API (after adding an internal privilege definition to task manager #102100 (comment)). This would help us track the ratio between users having and not having such role. If we see most users have superuser roles, we will be comfortable implementing the restriction, otherwise, we will have to implement restrictions in a more gradual method (ex: warning/deprecation messages, etc.)

@kobelb
Copy link
Contributor

kobelb commented Nov 19, 2021

Thanks, @mikecote. Adding telemetry to determine whether it's safe to restrict access to the task-manager health API to users with the Global All privilege seems fine for the short-term solution.

Long-term, I don't think that we should require such a high-level of privileges to be able to consume this HTTP API. The Global All privilege gives uses the ability to make rather destructive changes to Kibana. This API would ideally be consumed by external systems that are monitoring Kibana's health, and these external systems shouldn't need all of these privileges.

@ymao1
Copy link
Contributor

ymao1 commented Nov 19, 2021

So I am able to add actions.api.get('taskManager') to the Global All privilege, but I don't want to add options: { tags: ['access:taskManager'] } to the task manager health API yet because we don't yet want to block access to non-superusers.

Instead, I would like to use checkPrivilegesDynamicallyWithRequest from the security plugin start contract to manually do this check and generate telemetry accordingly. However, the security plugin has a dependency on taskManager so I can't add security as a dependency of taskManager to get access to its start contract.

@legrego Is there another way to achieve what I'm trying to do?

@jportner
Copy link
Contributor

@legrego Is there another way to achieve what I'm trying to do?

You don't have to use the Security plugin's abstraction for the ES _has_privileges API.
You can call it directly using the ES client.

For example, if you create a "test" user and assign it the "kibana_admin" role, that gives it the Global Admin privilege.

Example of checking this in Dev Tools for the "manage spaces" privilege (which only Global Admins hold):

GET _security/user/_has_privileges
{
  "application": [{
    "application": "kibana-.kibana",
    "privileges": ["space:8.1.0:manage"],
    "resources": ["*"]
  }]
}

Response:

{
  "username" : "test",
  "has_all_requested" : true,
  "cluster" : { },
  "index" : { },
  "application" : {
    "kibana-.kibana" : {
      "*" : {
        "space:8.1.0:manage" : true
      }
    }
  }
}

Here's where the Security plugin uses this API:

const { body } = await clusterClient.asScoped(request).asCurrentUser.security.hasPrivileges({
body: {
cluster: privileges.elasticsearch?.cluster as estypes.SecurityClusterPrivilege[],
index: Object.entries(privileges.elasticsearch?.index ?? {}).map(
([name, indexPrivileges]) => ({
names: [name],
privileges: indexPrivileges as estypes.SecurityIndexPrivilege[],
})
),
application: [
{ application: applicationName, resources, privileges: allApplicationPrivileges },
],
},
});

So you could call it like so:

// plugin constructor
const kibanaVersion = initializerContext.env.packageInfo.version;
...

// plugin setup
const kibanaIndexName = core.savedObjects.getKibanaIndex();
...

const { body } = await clusterClient.asScoped(request).asCurrentUser.security.hasPrivileges({
  body: {
    application: [
      {
        application: `kibana-${kibanaIndexName}`,
        resources: ['*'],
        privileges: [`api:${kibanaVersion}:taskManager`]
      },
    ],
  },
});

The only other way that I can think of to accomplish this would be to flip this dependency around -- expose an extension from Task Manager that the Security plugin could register a handler for. But that seems worse, IMO.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Fixes for quality problems that affect the customer experience discuss estimate:medium Medium Estimated Level of Effort Feature:Alerting/RulesFramework Issues related to the Alerting Rules Framework Feature:Alerting Feature:Task Manager security Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams)
Projects
None yet
Development

No branches or pull requests

8 participants