[Fleet] Fix agent policy selection and creation when there are a large number of agent policies ( > 1k) #151119

hop-dev · 2023-02-14T11:28:01Z

Summary

When adding an integration, the agent policy selector (create or select existing agent policy) does not work if there are too many agent policies.

In an environment with a large number of agent policies (e.g. 1000+), the following API request times out or takes 30 to 60 seconds to respond:

http://localhost:5601/mark/api/fleet/agent_policies?page=1&perPage=10000&sortField=name&sortOrder=asc&full=true

That is because, for each agent policy, we get the saved object twice, get ALL package policies, and count the agents, so for 1000 agent policies we run 4000 queries, using pmap in some places.

This PR changes it so that we do not get the agent count or the full agent policy for every agent policy when populating the agent policy select; instead, we only get the full agent policy for the selected agent policy.

This has meant that I have to query package policies separately in order to do the limited packages and APM output checks, but the component now loads instantly on an environment with 3k agent policies.
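To make the arithmetic above concrete, here is a minimal sketch of the query count before and after the change. The per-policy figure of four queries comes from the description; the "after" figure is an assumption for illustration only (one list query, the per-policy work for the single selected policy, and one separate package policies query), and the function names are hypothetical rather than taken from the Fleet code.

```typescript
// Per-policy work described in the PR: two saved-object reads, one package
// policies lookup, and one agent count.
const QUERIES_PER_POLICY = 4;

// Before: the full per-policy work runs for every agent policy in the list.
function queriesBefore(policyCount: number): number {
  return policyCount * QUERIES_PER_POLICY;
}

// After (ASSUMED model, not measured): one list query, the per-policy work
// only for the selected policies, plus one separate package policies query.
function queriesAfter(selectedPolicies: number = 1): number {
  const listQuery = 1;
  const packagePoliciesQuery = 1;
  return listQuery + selectedPolicies * QUERIES_PER_POLICY + packagePoliciesQuery;
}

const before = queriesBefore(1000);
const after = queriesAfter();
```

Under this model the cost no longer grows with the number of agent policies, which is consistent with the selector loading instantly at 3k policies.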

Key changes:

- Add a `noAgentCount` query param to the agent policies API, to skip counting the agents for all agent policies.
- Populate `agents` in the get-one agent policy API; this was an inconsistency in our agent policy API.
- When selecting an existing agent policy, only get the full agent policy for the selected agent policy.
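As a rough illustration of the first change, the list request for the selector could drop `full=true` and pass `noAgentCount=true`. The sketch below only builds the URL; the `buildListUrl` helper and the `AgentPolicyListQuery` type are hypothetical names, and the exact set of parameters the selector sends after this PR is an assumption.

```typescript
// Hypothetical shape of the list query; parameter names mirror the URL in the
// PR description plus the new noAgentCount flag.
interface AgentPolicyListQuery {
  page: number;
  perPage: number;
  sortField?: string;
  sortOrder?: "asc" | "desc";
  full?: boolean;
  noAgentCount?: boolean;
}

// Builds the agent policies list URL, skipping parameters that are undefined.
function buildListUrl(basePath: string, query: AgentPolicyListQuery): string {
  const keys = Object.keys(query) as (keyof AgentPolicyListQuery)[];
  const params: string[] = [];
  keys.forEach((key) => {
    const value = query[key];
    if (value !== undefined) {
      params.push(encodeURIComponent(key) + "=" + encodeURIComponent(String(value)));
    }
  });
  return basePath + "/api/fleet/agent_policies?" + params.join("&");
}

// Lightweight list request for populating the selector (assumed usage).
const listUrl = buildListUrl("", {
  page: 1,
  perPage: 10000,
  sortField: "name",
  sortOrder: "asc",
  noAgentCount: true,
});
```

The full agent policy would then be requested separately, and only for the policy the user actually selects.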

Testing

No behaviour should have changed for the agent policy select. Some niche behaviour of the selector to check:

- If adding the APM integration, any agent policy with a Logstash output configured for integration data should be disabled.
- If adding a limited integration, e.g. apm or defend, any agent policy already containing that integration should be disabled.
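The two rules above can be sketched as a single predicate. This is an illustrative model, not the selector's actual code: the `PolicyOption` shape, the `hasLogstashDataOutput` flag, and the `isPolicyDisabled` name are all assumptions.

```typescript
// Hypothetical, simplified view of an agent policy as seen by the selector.
interface PolicyOption {
  id: string;
  // ASSUMED precomputed flag: the policy sends integration data to Logstash.
  hasLogstashDataOutput: boolean;
  // Names of integrations (packages) already present on the policy.
  packageNames: string[];
}

// Returns true when the policy should be disabled in the selector.
function isPolicyDisabled(
  policy: PolicyOption,
  packageName: string,
  isLimitedPackage: boolean
): boolean {
  // Rule 1: APM cannot be added to a policy whose integration data goes to Logstash.
  if (packageName === "apm" && policy.hasLogstashDataOutput) {
    return true;
  }
  // Rule 2: a limited integration can only appear once per agent policy.
  if (isLimitedPackage && policy.packageNames.indexOf(packageName) !== -1) {
    return true;
  }
  return false;
}

const logstashPolicy: PolicyOption = { id: "a", hasLogstashDataOutput: true, packageNames: [] };
const apmPolicy: PolicyOption = { id: "b", hasLogstashDataOutput: false, packageNames: ["apm"] };
const emptyPolicy: PolicyOption = { id: "c", hasLogstashDataOutput: false, packageNames: [] };
```

Both checks depend on package policy and output data, which is why the PR has to query package policies separately once the list no longer returns full agent policies.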


kibana-ci · 2023-02-14T12:27:38Z

💚 Build Succeeded

Buildkite Build
Commit: ecf601c

Metrics [docs]

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

id	before	after	diff
`fleet`	927.0KB	927.3KB	+288.0B

Page load bundle

Size of the bundles that are downloaded on every page load. Target size is below 100kb

id	before	after	diff
`fleet`	124.3KB	124.3KB	+38.0B

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

cc @hop-dev

elasticmachine · 2023-02-14T12:57:53Z

Pinging @elastic/fleet (Team:Fleet)

juliaElastic · 2023-02-14T13:21:34Z

x-pack/plugins/fleet/common/openapi/paths/agent_policies.yaml

+    - schema:
+        type: boolean
+      in: query
+      name: noAgentCount


I would find it more intuitive to have a flag when we want to include agentCount, rather than having a flag to disable. Though this might be a bigger refactor if the API is already used in many places.

I agree, I was so tempted to do it, but this change is already larger than I expected. It would be a breaking change to the API.

You're right, we shouldn't do a breaking change.
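A small sketch of why the opt-out flag avoids a breaking change: callers that send no flag keep getting agent counts. The `shouldCountAgents` and `shouldCountAgentsOptIn` helpers and the `withAgentCount` parameter are hypothetical and exist only to contrast the two designs.

```typescript
interface ListRequestQuery {
  noAgentCount?: boolean; // opt-out flag added in this PR
  withAgentCount?: boolean; // HYPOTHETICAL opt-in alternative, not part of the API
}

// Opt-out design: counting stays on unless the caller explicitly disables it.
function shouldCountAgents(query: ListRequestQuery): boolean {
  return query.noAgentCount !== true;
}

// Opt-in design: existing callers that send no flag would silently lose counts.
function shouldCountAgentsOptIn(query: ListRequestQuery): boolean {
  return query.withAgentCount === true;
}

const existingCaller: ListRequestQuery = {};
const optOutKeepsCounts = shouldCountAgents(existingCaller);
const optInKeepsCounts = shouldCountAgentsOptIn(existingCaller);
```

With the opt-out flag, an unchanged request behaves as before; with the opt-in variant, the same request would stop returning counts, which is the breaking change discussed above.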

juliaElastic

LGTM, great improvement.

…e number of agent policies ( > 1k) (elastic#151119)

Closes elastic#150605

(cherry picked from commit 9a52ef4)

kibanamachine · 2023-02-14T13:40:52Z

💚 All backports created successfully

Status	Branch	Result
✅	8.7

Note: Successful backport PRs will be merged automatically after passing CI.

Questions ?

Please refer to the Backport tool documentation

…a large number of agent policies ( > 1k) (#151119) (#151131)

Backport

This will backport the following commits from `main` to `8.7`:

- [Fleet] Fix agent policy selection and creation when there are a large number of agent policies ( > 1k) (#151119)

Questions? Please refer to the Backport tool documentation (https://github.com/sqren/backport).

Co-authored-by: Mark Hopkin <[email protected]>


hop-dev added 2 commits February 14, 2023 11:17

add noAgentCount query param

7e677a1

greatly improve performance of agent policy select over lots of agent policies

ecf601c

hop-dev self-assigned this Feb 14, 2023

hop-dev added the release_note:fix, Team:Fleet, v8.7.0, and v8.8.0 labels Feb 14, 2023

hop-dev marked this pull request as ready for review February 14, 2023 12:57

hop-dev requested a review from a team as a code owner February 14, 2023 12:57

juliaElastic reviewed Feb 14, 2023


juliaElastic approved these changes Feb 14, 2023


hop-dev merged commit 9a52ef4 into elastic:main Feb 14, 2023

hop-dev deleted the 150605-limit-agent-policies-when-selecting branch February 14, 2023 13:36

kibanamachine mentioned this pull request Feb 14, 2023

[8.7] [Fleet] Fix agent policy selection and creation when there are a large number of agent policies ( > 1k) (#151119) #151131

Merged
