[Security Solution] Implement reliable tests to catch OOMs during rules package installation #188090

Open
Tracked by #179907
xcrzx opened this issue Jul 11, 2024 · 4 comments
Labels
8.18 candidate Feature:Prebuilt Detection Rules Security Solution Prebuilt Detection Rules area rca-action Team:Detection Rule Management Security Detection Rule Management Team Team:Detections and Resp Security Detection Response Team Team: SecuritySolution Security Solutions Team working on SIEM, Endpoint, Timeline, Resolver, etc. technical debt Improvement of the software architecture and operational architecture v8.18.0

Comments

xcrzx commented Jul 11, 2024

Epic: #179907

Summary

Our existing package installation test has not been effective at catching potential OOMs. The test emulates the installation of 15,000 rules, yet in a Serverless environment we've seen OOMs when installing ~5,000 rules. A likely cause is that the integration test limits the heap but cannot control external memory: during our investigation, we saw a significant amount of external memory (300+ MB) being used before an OOM.

We need to add an MKI test so that, even if we change the heap or memory limits on Serverless, the test keeps catching regressions.
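
To illustrate the heap vs. external memory distinction, here is a minimal TypeScript sketch (not the existing Kibana test) that uses Node's `process.memoryUsage()` to report the two separately. The helper names `takeMemorySnapshot` and `exceedsBudget`, and the 512 MB budget, are hypothetical and only for illustration.

```typescript
// Memory counters reported by Node.js, converted to megabytes.
interface MemorySnapshot {
  heapUsedMb: number; // V8 heap in use (what a heap limit constrains)
  externalMb: number; // memory held by C++ objects bound to JS, incl. Buffers
  arrayBuffersMb: number; // ArrayBuffer/Buffer memory (also counted in external)
  rssMb: number; // resident set size of the whole process
}

// Convert bytes to megabytes, rounded to one decimal place.
const toMb = (bytes: number): number => Math.round((bytes / (1024 * 1024)) * 10) / 10;

// Hypothetical helper: capture the current memory counters.
function takeMemorySnapshot(): MemorySnapshot {
  const usage = process.memoryUsage();
  return {
    heapUsedMb: toMb(usage.heapUsed),
    externalMb: toMb(usage.external),
    arrayBuffersMb: toMb(usage.arrayBuffers),
    rssMb: toMb(usage.rss),
  };
}

// Hypothetical check: compare heap plus external memory against a budget,
// so that off-heap usage is not ignored as it would be by a heap-only limit.
function exceedsBudget(snapshot: MemorySnapshot, budgetMb: number): boolean {
  return snapshot.heapUsedMb + snapshot.externalMb > budgetMb;
}

// Demo: a large Buffer is allocated outside the V8 heap.
const before = takeMemorySnapshot();
const buffer = Buffer.alloc(50 * 1024 * 1024);
const after = takeMemorySnapshot();

console.log('allocated bytes:', buffer.length);
console.log('heap delta (MB):', after.heapUsedMb - before.heapUsedMb);
console.log('external delta (MB):', after.externalMb - before.externalMb);
console.log('over 512 MB budget:', exceedsBudget(after, 512));
```

A test that only caps the heap would watch `heapUsedMb`; tracking `externalMb` or `rssMb` as well is one way to surface the kind of off-heap growth described above.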

@xcrzx xcrzx added bug Fixes for quality problems that affect the customer experience triage_needed Team: SecuritySolution Security Solutions Team working on SIEM, Endpoint, Timeline, Resolver, etc. labels Jul 11, 2024
@elasticmachine
Pinging @elastic/security-solution (Team: SecuritySolution)

@xcrzx xcrzx added technical debt Improvement of the software architecture and operational architecture Team:Detections and Resp Security Detection Response Team Team:Detection Rule Management Security Detection Rule Management Team Feature:Prebuilt Detection Rules Security Solution Prebuilt Detection Rules area and removed bug Fixes for quality problems that affect the customer experience labels Jul 11, 2024
@elasticmachine
Pinging @elastic/security-detections-response (Team:Detections and Resp)

@elasticmachine
Pinging @elastic/security-detection-rule-management (Team:Detection Rule Management)

@sophiec20
This is an RCA follow-up action that is overdue (unfortunately, it was labelled incorrectly, so we did not notice until now).
Can we please get this item scheduled so that we can help avoid OOMs during package installs?

Projects
None yet
Development

No branches or pull requests

4 participants