[Security Solution] Rule import creates extra rules while importing a large number of rules #176207

Open
Tracked by #179907
maximpn opened this issue Feb 5, 2024 · 6 comments
Labels: bug, Feature:Rule Import/Export, Feature:Rule Management, impact:high, Team:Detection Rule Management, Team:Detections and Resp, Team: SecuritySolution

maximpn commented Feb 5, 2024

Kibana version:
8.12.0

Describe the bug:

More rules than expected are created upon importing rules.

Screenshot 2024-02-05 at 10 26 32
Screenshot 2024-02-05 at 10 27 19
Screenshot 2024-02-05 at 10 29 23

An attempt to import 9930 rules actually created 118411 rules.

Steps to reproduce:

  • Clear ES state
  • Open Rules Management table
  • Make sure there are no custom rules in the table and no prebuilt rules installed
  • Import 9930 rules by providing rules_export.ndjson.zip (unzipping is required)
  • Wait for the operation to be finished

Expected result (ER): 9930 rules are imported and no errors or warnings appear.
Actual result (AR): 118411 rules are imported and errors about conflicts appear.
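
One way to check the expected count before and after an import is to count `rule_id` occurrences in the NDJSON text. The following is a minimal sketch, not part of the original report: `countRuleIds` and `findDuplicateRuleIds` are hypothetical helper names, and the sketch assumes each line is one JSON object and that lines without a `rule_id` (such as a trailing export summary) can be skipped.

```typescript
// Minimal shape of one NDJSON line; only rule_id matters for this check.
interface ExportedLine {
  rule_id?: unknown;
}

// Counts how many times each rule_id appears in NDJSON text.
function countRuleIds(ndjson: string): Record<string, number> {
  // A prototype-less object avoids clashes with inherited keys such as "constructor".
  const counts: Record<string, number> = Object.create(null);
  for (const line of ndjson.split('\n')) {
    const trimmed = line.trim();
    if (trimmed === '') continue; // ignore blank lines
    const parsed = JSON.parse(trimmed) as ExportedLine;
    // Assumption: lines without a string rule_id are not rules and are skipped.
    if (typeof parsed.rule_id !== 'string') continue;
    counts[parsed.rule_id] = (counts[parsed.rule_id] ?? 0) + 1;
  }
  return counts;
}

// Returns the rule_ids that occur more than once.
function findDuplicateRuleIds(counts: Record<string, number>): string[] {
  return Object.keys(counts).filter((id) => counts[id] > 1);
}
```

Running `countRuleIds` on the file being imported gives the number of distinct `rule_id` values to expect; running it on an export taken after the import shows which `rule_id` values were created more than once.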

@maximpn maximpn added the bug, impact:high, Team:Detections and Resp, Team: SecuritySolution, Feature:Rule Management, Team:Detection Rule Management, Feature:Rule Import/Export labels Feb 5, 2024
@elasticmachine

Pinging @elastic/security-detections-response (Team:Detections and Resp)

@elasticmachine

Pinging @elastic/security-solution (Team: SecuritySolution)

@elasticmachine

Pinging @elastic/security-detection-rule-management (Team:Detection Rule Management)

banderror commented Feb 5, 2024

@maximpn Not sure I correctly understand the description and steps to reproduce. Does the app create rules on export? Could you please record a video demonstrating the bug?

@banderror

Chatted with @maximpn about the fact that this can also happen on rule duplication and rule upgrade, and is likely caused by a race condition around rule ids. So it would make sense to check these workflows as well.

@maximpn maximpn changed the title [Security Solution] Rule import creates extra rules while exporting large number of rules [Security Solution] Rule import creates extra rules while importing large number of rules Feb 8, 2024
@maximpn maximpn changed the title [Security Solution] Rule import creates extra rules while importing large number of rules [Security Solution] Rule import creates extra rules while importing a large number of rules Feb 8, 2024
@maximpn maximpn self-assigned this Feb 9, 2024
maximpn added a commit that referenced this issue Mar 6, 2024
…ule Management API endpoints (#177329)

**Fixes: #177277**

## Summary

This PR sets a reasonably high (1 hour) socket timeout for potentially long-running Rule Management API endpoints.

It's important to note that this fix only mitigates the risk of the TCP connection being closed. Proxies have their own TCP connection timeouts, though these are higher than the default node.js timeout of 2 minutes.
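
As a rough illustration of a per-route override, the sketch below models a route whose socket timeout is raised from the 2 minute default to 1 hour. It is a simplified stand-in rather than the code from this PR: the `RouteConfig` and `RouteTimeouts` types, the `idleSocket` option name, the route path, and `effectiveSocketTimeout` are all assumptions made for the example.

```typescript
// Hypothetical, simplified route-config types used only for this sketch.
interface RouteTimeouts {
  idleSocket?: number; // socket timeout in milliseconds
}

interface RouteConfig {
  path: string;
  options?: { timeout?: RouteTimeouts };
}

// Default described in the text: 2 minutes.
const DEFAULT_SOCKET_TIMEOUT_MS = 2 * 60 * 1000;
// Value described in the summary: 1 hour.
const LONG_RUNNING_SOCKET_TIMEOUT_MS = 60 * 60 * 1000;

// Illustrative long-running route with the raised timeout (path is an assumption).
const importRulesRoute: RouteConfig = {
  path: '/api/detection_engine/rules/_import',
  options: { timeout: { idleSocket: LONG_RUNNING_SOCKET_TIMEOUT_MS } },
};

// Resolves the timeout that applies to a route, falling back to the default.
function effectiveSocketTimeout(route: RouteConfig): number {
  return route.options?.timeout?.idleSocket ?? DEFAULT_SOCKET_TIMEOUT_MS;
}
```

Routes that do not set the override keep the default, so only endpoints known to run long get the longer window.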

## Details

When operating on a large number of rules, or in an environment that is resource-limited or suffering from performance degradation, endpoints may take longer than the default node.js socket timeout of 2 minutes. According to the [HTTP spec](https://www.w3.org/Protocols/rfc2616/rfc2616-sec8.html#sec8.2.4), the browser should retry if the connection was closed by the server. Because the API endpoint's handler isn't terminated when the TCP connection closes, a retry spawns a second request that is processed in parallel. Under some circumstances this can create multiple rules with the same `rule_id` and, for example, end up creating more rules than expected, as described in #176207.
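
The sequence described above can be illustrated with a small simulation. Everything in it is hypothetical: `InMemoryRuleStore`, `importRules`, and `simulate` are stand-ins invented for this sketch, and the store deliberately has no uniqueness constraint on `rule_id`. It only shows how a check-then-create import that is safe on its own produces duplicates when the original request and a retry are processed in parallel.

```typescript
type Rule = { rule_id: string; name: string };

// Hypothetical in-memory stand-in for a rule store; not Kibana's actual API.
class InMemoryRuleStore {
  private readonly rules: Rule[] = [];

  // Async lookup by rule_id; the await stands in for a network round trip.
  async exists(ruleId: string): Promise<boolean> {
    await Promise.resolve();
    return this.rules.some((r) => r.rule_id === ruleId);
  }

  // Async create with no uniqueness check on rule_id.
  async create(rule: Rule): Promise<void> {
    await Promise.resolve();
    this.rules.push(rule);
  }

  count(): number {
    return this.rules.length;
  }
}

// Check-then-create import: creates a rule only if its rule_id was not found.
async function importRules(store: InMemoryRuleStore, rules: Rule[]): Promise<void> {
  for (const rule of rules) {
    if (!(await store.exists(rule.rule_id))) {
      await store.create(rule);
    }
  }
}

// Imports the same payload twice, either in parallel (original request plus
// retry) or one after the other, and returns the resulting rule count.
async function simulate(parallel: boolean): Promise<number> {
  const store = new InMemoryRuleStore();
  const payload: Rule[] = [
    { rule_id: 'rule-a', name: 'Rule A' },
    { rule_id: 'rule-b', name: 'Rule B' },
  ];
  if (parallel) {
    await Promise.all([importRules(store, payload), importRules(store, payload)]);
  } else {
    await importRules(store, payload);
    await importRules(store, payload);
  }
  return store.count();
}
```

In this sketch, `simulate(false)` leaves two rules in the store, while `simulate(true)` leaves four: both handlers check for each `rule_id` before either has created it. A real system has many more interleavings, so the actual number of extra rules would vary from run to run.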
kibanamachine pushed a commit to kibanamachine/kibana that referenced this issue Mar 6, 2024
…ule Management API endpoints (elastic#177329)

(cherry picked from commit 05d3dfa)
kibanamachine referenced this issue Mar 6, 2024
…nning Rule Management API endpoints (#177329) (#178084)

# Backport

This will backport the following commits from `main` to `8.13`:
- [[Security Solution] Set socket timeout for potentially long running
Rule Management API endpoints
(#177329)](#177329)

<!--- Backport version: 9.4.3 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sqren/backport)


Co-authored-by: Maxim Palenov <[email protected]>
@banderror

Put on hold: #177159 (comment)
