[SIEM] [ML] Starting a job without enough memory doesn't always show Out of Memory error #54382

spong · 2020-01-09T18:24:05Z

If a job fails to start due to a lack of memory, this error should be presented to the user by means of an error toast. While this behavior is present in some cases (see #45316), an error is not displayed consistently.

In the below gif, you'll notice that the force_start_datafeed request returns an empty object instead of an error. In this instance, when we next refresh the state of all jobs, we can show that the job was not able to start due to a lack of memory by putting the error message/current state as hover text on the job (or potentially a callout at the top saying we're at max utilization).
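A rough sketch of the refresh-time check described above, not the actual Kibana implementation. The names `JobStatus`, `StartResult`, `assignmentExplanation`, and `describeStartProblem` are hypothetical and chosen only for illustration; the idea is that an empty start response is not treated as success until the refreshed job state confirms it.

```typescript
// Hypothetical shapes for illustration; these are not the actual Kibana types.
type JobState = "closed" | "opening" | "opened" | "closing" | "failed";

interface JobStatus {
  id: string;
  state: JobState;
  // Assumed field: free-text reason reported when the job has no node assigned.
  assignmentExplanation?: string;
}

interface StartResult {
  // Assumed shape: empty object on "success", error details otherwise.
  error?: { message: string };
}

// Returns hover text / toast text for a job, or undefined when nothing needs surfacing.
function describeStartProblem(
  startResult: StartResult,
  status: JobStatus
): string | undefined {
  // Case 1: the start request itself reported an error, so show it directly.
  if (startResult.error) {
    return `Job ${status.id} failed to start: ${startResult.error.message}`;
  }

  // Case 2: the request returned an empty object, but the refreshed state
  // shows the job is stuck opening or has failed.
  if (status.state === "opening" || status.state === "failed") {
    const reason = status.assignmentExplanation?.trim() || "no reason reported";
    return `Job ${status.id} is ${status.state}: ${reason}`;
  }

  // Job opened (or is closed/closing): nothing to report.
  return undefined;
}
```

The returned string could be used as hover text on the job row, or aggregated into a single callout when several jobs report the same reason.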

State stuck as 'opening' in ML UI:

elasticmachine · 2020-01-09T18:24:09Z

Pinging @elastic/siem (Team:SIEM)

spong · 2020-01-10T17:08:01Z

As introduced in #50766, looks like we can add even more details around the no node/OoM cases to better improve the UX here.

spong · 2020-06-25T22:33:50Z

@MadameSheema this is still relevant, and should be prioritized as we continue to add more ML jobs.

cybersecdiva · 2023-08-17T22:45:08Z

Tested in 8.9.0 BC5

Build Details:
VERSION: 8.9.0 BC5
BUILD: 64715
COMMIT: beb56356c5c037441f89264361302513ff5bd9f8

Preconditions:

Kibana must be running
ML node must be set up with the lowest memory (1 GB)

Describe the bug:
Starting an ML job without enough memory doesn't always show Out of Memory error

Steps to reproduce:

Navigate to Security --> Manage Rules --> ML job settings
Under ML job settings, enable a minimum of 10 ML jobs
Save the change when the pop-up notification displays to confirm changes
Reload the page for the change to take effect
Navigate to Security --> Manage --> Rules

Current behavior

No Out of Memory errors or warnings are displayed for the ML jobs started (more than 10 jobs enabled)

Expected behavior:

No Out of Memory errors or warnings are displayed for the ML jobs started (more than 10 jobs enabled)

Observations:

A review of ML job memory usage shows that with more than 10 jobs enabled on the cluster, even 20 or more, no out-of-memory errors are displayed

Screenshots of behavior:

Machine Learning Memory Usage

Conclusion:

Behavior appears to be as expected. Validated ✅ bug is fixed.

@MadameSheema and @spong FYI Updated observations

spong added bug Fixes for quality problems that affect the customer experience Team:SIEM labels Jan 9, 2020

spong self-assigned this Jan 10, 2020

spong mentioned this issue Apr 9, 2020

[SIEM] [ML] Deployments without ML Nodes do not show an error when enabling jobs #63155

Closed

MadameSheema added the Team:Detections and Resp Security Detection Response Team label Oct 1, 2020

MindyRS added the Team: SecuritySolution Security Solutions Team working on SIEM, Endpoint, Timeline, Resolver, etc. label Oct 27, 2020

peluja1012 added the impact:low Addressing this issue will have a low level of impact on the quality/strength of our product. label Oct 28, 2020

peluja1012 added the Feature:ML Rule Security Solution Machine Learning rule type label Nov 17, 2020

spong mentioned this issue Mar 4, 2021

[Security Solution] [Detections] Enabling ML jobs from Security App keeps trying when no ML Nodes Available #93337

Closed

spong mentioned this issue Jun 21, 2021

[Security Solution][Detections] Migrate ML Job Settings UI out of popover to dedicated UI #102837

Open

peluja1012 added the Team:Detection Rule Management Security Detection Rule Management Team label Sep 15, 2021

spong removed their assignment Nov 10, 2021

cybersecdiva added fixed QA:Validated Issue has been validated by QA labels Aug 17, 2023

cybersecdiva closed this as completed Aug 17, 2023
