Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Security Solution] Fix flaky test: x-pack/test/detection_engine_api_integration/security_and_spaces/update_prebuilt_rules_package/update_prebuilt_rules_package·ts - update_prebuilt_rules_package should allow user to install prebuilt rules from scratch, then install new rules and upgrade existing rules from the new package #163241

Merged

Conversation

jpdjere
Copy link
Contributor

@jpdjere jpdjere commented Aug 7, 2023

Fixes: #162658

Summary

  • Fixes flaky test: x-pack/test/detection_engine_api_integration/security_and_spaces/update_prebuilt_rules_package/update_prebuilt_rules_package·ts
  • Test title: update_prebuilt_rules_package should allow user to install prebuilt rules from scratch, then install new rules and upgrade existing rules from the new package

Passing flaky test runner

300 runs: https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/2799

Root cause and what this PR does

Refactor

  • Moves the refreshing of the indexes within the utility function that write to indexes:
    • installPrebuiltRules
    • upgradePrebuiltRules
    • installPrebuiltRulesAndTimelines (legacy, but still used)
    • installPrebuiltRulesFleetPackage
  • Creates 2 new utils:
    • installPrebuiltRulesPackageByVersion, which installs security_detection_engine package via Fleet API, with its version passed in as param
    • getInstalledRules, reusable function to fetch all installed rules

@jpdjere jpdjere self-assigned this Aug 7, 2023
@jpdjere
Copy link
Contributor Author

jpdjere commented Aug 7, 2023

@jpdjere
Copy link
Contributor Author

jpdjere commented Aug 7, 2023

@jpdjere jpdjere changed the title [Security Solution] Flaky test runner: x-pack/test/detection_engine_api_integration/security_and_spaces/update_prebuilt_rules_package/update_prebuilt_rules_package·ts - update_prebuilt_rules_package should allow user to install prebuilt rules from scratch, then install new rules and upgrade existing rules from the new package [Security Solution] Fix flaky test: x-pack/test/detection_engine_api_integration/security_and_spaces/update_prebuilt_rules_package/update_prebuilt_rules_package·ts - update_prebuilt_rules_package should allow user to install prebuilt rules from scratch, then install new rules and upgrade existing rules from the new package Aug 8, 2023
@jpdjere jpdjere requested a review from maximpn August 8, 2023 15:09
@jpdjere jpdjere added release_note:skip Skip the PR/issue when compiling release notes test failed-test A test failure on a tracked branch, potentially flaky-test Team:Detections and Resp Security Detection Response Team Team: SecuritySolution Security Solutions Team working on SIEM, Endpoint, Timeline, Resolver, etc. Team:Detection Rule Management Security Detection Rule Management Team v8.10.0 labels Aug 8, 2023
@jpdjere jpdjere marked this pull request as ready for review August 8, 2023 15:10
@jpdjere jpdjere requested a review from a team as a code owner August 8, 2023 15:10
@elasticmachine
Copy link
Contributor

Pinging @elastic/security-detections-response (Team:Detections and Resp)

@elasticmachine
Copy link
Contributor

Pinging @elastic/security-solution (Team: SecuritySolution)

Copy link
Contributor

@maximpn maximpn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jpdjere thank you for fixing this flaky test 👍

Could you describe the root cause in the ticket's description? I'd be good to isolate es.indices.refresh() to the utility functions where it's needed and have a clear comment describing why it's needed there.

Have you checked also all the above requests have ?refresh=wait_for so the changed data is available? I'm a bit afraid we don't make it clear here in tests so any further readers will be confused.

@@ -133,6 +134,8 @@ export default ({ getService }: FtrProviderContext): void => {
// Verify that the _perform endpoint returns the same number of installed rules as the status endpoint
// and the _review endpoint
const installPrebuiltRulesResponse = await installPrebuiltRules(supertest);
await es.indices.refresh({ index: ALL_SAVED_OBJECT_INDICES });
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's quite easy to forget why es.indices.refresh() is needed and what part of test it's related to. It'd be nicer to have it right before an HTTP call depending on it so anybody reusing it doesn't have to keep in mind es.indices.refresh() has to be called before. For example getPrebuiltRulesStatus() should contain es.indices.refresh() inside.

@@ -133,6 +134,8 @@ export default ({ getService }: FtrProviderContext): void => {
// Verify that the _perform endpoint returns the same number of installed rules as the status endpoint
// and the _review endpoint
const installPrebuiltRulesResponse = await installPrebuiltRules(supertest);
await es.indices.refresh({ index: ALL_SAVED_OBJECT_INDICES });

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Installed rules fetching chunk looks to be repeated. It could be a reusable function with es.indices.refresh() inside

const { body: rulesResponse } = await supertest
        .get(`${DETECTION_ENGINE_RULES_URL_FIND}?per_page=10000`)
        .set('kbn-xsrf', 'true')
        .send()
        .expect(200);

@jpdjere
Copy link
Contributor Author

jpdjere commented Aug 10, 2023

@maximpn Refactored the utils according to your feedback.

In the case of utility functions that write to an index, I moved the refresh method to within the util and added a comment of why there. Notice that this is done after writing to the index for each case, to leave the test in a state in which it is ready and updated for any subsequent read operation and assertion on those results.

Also moved the "find rules" to its own utility getInstalledRules; this one doesn't need an index refresh since it's a read operation.

Copy link
Contributor

@maximpn maximpn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jpdjere thank you for addressing my comments 🙏

* @returns Fleet install package response
*/

export const getInstalledRules = async (
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: IMHO fetchInstalledRules would be the best name for the functionality under the hood as it clearly describes that it fetches the data presumable from an endpoint. While get usually used for sync functions extracting some data like Map.get. At the same time getInstalledRules conforms with the other functions in the folder like getPrebuiltRulesStatus. Ideally getPrebuiltRulesStatus should be renamed to fetchPrebuiltRulesStatus as well as the others but it definitely out of scope of this PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. I'll address this in any of the following PRs I'm working on. Thanks for all the feedback!

@jpdjere jpdjere added the backport:prev-minor Backport to (8.x) the previous minor version (i.e. one version back from main) label Aug 11, 2023
@kibana-ci
Copy link
Collaborator

💚 Build Succeeded

Metrics [docs]

✅ unchanged

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

cc @jpdjere

@jpdjere jpdjere removed the backport:prev-minor Backport to (8.x) the previous minor version (i.e. one version back from main) label Aug 11, 2023
@jpdjere jpdjere merged commit 2ba659d into elastic:main Aug 11, 2023
@kibanamachine kibanamachine added the backport:skip This commit does not require backporting label Aug 11, 2023
jloleysens added a commit to jloleysens/kibana that referenced this pull request Aug 11, 2023
* main: (64 commits)
  [ML] Transforms: Fix privileges check. (elastic#163687)
  [Log Explorer] Add test suite for Dataset Selector (elastic#163079)
  [Security Solution][Endpoint] Add API checks to Endpoint Policy create/update for checking `endpointPolicyProtections` is enabled (elastic#163429)
  [Security Solution] Fix flaky test: x-pack/test/detection_engine_api_integration/security_and_spaces/update_prebuilt_rules_package/update_prebuilt_rules_package·ts - update_prebuilt_rules_package should allow user to install prebuilt rules from scratch, then install new rules and upgrade existing rules from the new package (elastic#163241)
  [Security Solution] expandable flyout - replace feature flag with advanced settings toggle (elastic#161614)
  [DOCS] Adds the release notes for the 8.9.1 release. (elastic#163578)
  [FTR] Implement browser network condition utils (elastic#163633)
  [Security Solution] Unskip rules table auto-refresh Cypress tests (elastic#163451)
  [Security Solution] Re-enable fixed rule snoozing Cypress test (elastic#160037)
  [Flaky Test elastic#111821] Mock `moment` to avoid midnight TZ issues (elastic#163157)
  Document interactive setup (elastic#163619)
  [Lens] Align decoration color with text color for layer actions (elastic#163630)
  [Lens] Relax counter field checks for saved visualizations with unsupported operations (elastic#163515)
  [Security Solution][Endpoint] Removes pMap and uses a for loop instead (elastic#163509)
  [Enterprise Search] Update Workplace Search connectors doclink (elastic#163676)
  Update APM (main) (elastic#163623)
  [Serverless] Partially fix lens/maps/visualize breadcrumbs missing title  (elastic#163476)
  [Flaky elastic#118272] Unskip tests (elastic#163319)
  [APM] Make service group saved objects exportable (elastic#163569)
  [Observability AI Assistant] Action menu item (elastic#163463)
  ...
jpdjere added a commit that referenced this pull request Aug 24, 2023
…e_api_integration/security_and_spaces/update_prebuilt_rules_package/update_prebuilt_rules_package·ts - update_prebuilt_rules_package should allow user to install prebuilt rules from scratch, then install new rules and upgrade existing rules from the new package (#164558)

**NOTE: This is a manual backport of
#163241. It had to be done
manually because of changes that happened from 8.9 -> 8.10 like the
common folder restructuring and some type changes because of the OpenAPI
effort, but it is mostly the changes to the same files as that PR.**

**Original PR description follows:**

Fixes: #162658

## Summary

- Fixes flaky test:
`x-pack/test/detection_engine_api_integration/security_and_spaces/update_prebuilt_rules_package/update_prebuilt_rules_package·ts`
- Test title: `update_prebuilt_rules_package should allow user to
install prebuilt rules from scratch, then install new rules and upgrade
existing rules from the new package`

## Passing flaky test runner

300 runs:
https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/2799

## Root cause and what this PR does

- On initial investigation, the flaky test runner was executed for this
test with 300 iterations and all of them succeeded. This gives us great
confidence that the test is not actually flaky.
- Further investigation showed that @kibanamachine reported this tests
as failing when running the CI for two backport PRs to `8.9`:
-
https://buildkite.com/elastic/kibana-on-merge/builds/33282#0189987d-3a80-49c2-8332-3105ec3c2109
-
https://buildkite.com/elastic/kibana-on-merge/builds/33444#0189b1fa-4bc4-4422-9ce9-5c9a24f11ad5
- These flakiness was caused **by a race condition** between the writing
of rules into indeces, and them being made available for reading. This
is a known issue, already and solved for other integration test, by
manually refreshing ES's indeces.
- In order to reduce the probability of flakiness in a similar scenario,
this PR adds code to refresh the indices after each rule installation or
upgrade during the test.

## Refactor

- Moves the refreshing of the indexes within the utility function that
write to indexes:
    - `installPrebuiltRules`
    - `upgradePrebuiltRules`
    - `installPrebuiltRulesAndTimelines` (legacy, but still used)
    - `installPrebuiltRulesFleetPackage`
- Creates 2 new utils:
- `installPrebuiltRulesPackageByVersion`, which installs
`security_detection_engine` package via Fleet API, with its version
passed in as param
- `getInstalledRules`, reusable function to fetch all installed rules
    -
@jpdjere jpdjere added the flake-docs Temp label to gather PRs used to create dev docs label Aug 30, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport:skip This commit does not require backporting failed-test A test failure on a tracked branch, potentially flaky-test flake-docs Temp label to gather PRs used to create dev docs release_note:skip Skip the PR/issue when compiling release notes Team:Detection Rule Management Security Detection Rule Management Team Team:Detections and Resp Security Detection Response Team Team: SecuritySolution Security Solutions Team working on SIEM, Endpoint, Timeline, Resolver, etc. test v8.10.0
Projects
None yet
5 participants