Adds ML supervised model Problem child package #2115

alvarezmelissa87 · 2021-11-03T00:03:30Z

What does this PR do?

Adds the ML supervised model package for Problem child model.

Package includes:

pipelines
ml_module
ml_model
security rules

Checklist

I have reviewed tips for building integrations and this pull request is aligned with them.
I have verified that all data streams collect metrics or logs.
I have added an entry to my package's changelog.yml file.
If I'm introducing a new feature, I have modified the Kibana version constraint in my package's manifest.yml file to point to the latest Elastic stack release (e.g. ^7.13.0).

Author's Checklist

[ ]

How to test this PR locally

Related issues

Screenshots

elasticmachine · 2021-11-03T00:13:02Z

💚 Build Succeeded

Build stats

Start Time: 2022-03-28T14:39:27.986+0000
Duration: 21 min 46 sec

🤖 GitHub comments

To re-run your PR in the CI, just comment with:

/test : Re-trigger the build.

peteharverson · 2021-12-17T15:57:40Z

Is it possible to set a description field for the model, so that something like "The Problem child model is used to detect living off the land (LOtl) activity." appears in the models list?

packages/ml_problem_child/kibana/ml_module/problem-child-ml.json

packages/ml_problem_child/changelog.yml

packages/ml_problem_child/kibana/ml_module/problem-child-ml.json

peteharverson · 2022-01-31T11:45:49Z

packages/ml_problem_child/kibana/ml_module/problem-child-ml.json

+                            },
+                            {
+                                "by_field_name": "host.hostname",
+                                "detector_description": "high sum by host",


As above, could the description here be edited to indicate the field it is running on (blocklist_label)? I don't have enough context on what information is held in blocklist_label to advise here.

@peteharverson So the way we classify processes as malicious is using either a supervised model or a blocklist for things the model may have missed. I think adding this in the description will introduce complexity.

OK. Thanks for explanation @ajosh0504
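A minimal sketch of the either/or classification described above, written as a plain Python predicate. It mirrors the query used by one of the package's security rules, `(problemchild.prediction:1 and problemchild.prediction_probability > 0.98) or blocklist_label:1`. The function name, the nested document layout, and the sample events are hypothetical; the real matching is done by the detection engine, not by code like this.

```python
def is_flagged(doc: dict, threshold: float = 0.98) -> bool:
    """Return True if an event would match the rule query (illustrative only)."""
    # Assumption: the enriched event stores model output under a nested "problemchild" key.
    pc = doc.get("problemchild", {})
    # Supervised model path: positive prediction with a probability above the threshold.
    model_hit = pc.get("prediction") == 1 and pc.get("prediction_probability", 0.0) > threshold
    # Blocklist path: catches processes the model may have missed.
    blocklist_hit = doc.get("blocklist_label") == 1
    return model_hit or blocklist_hit


# Hypothetical sample events
model_event = {"problemchild": {"prediction": 1, "prediction_probability": 0.99}, "blocklist_label": 0}
blocklist_event = {"problemchild": {"prediction": 0, "prediction_probability": 0.10}, "blocklist_label": 1}
benign_event = {"problemchild": {"prediction": 0, "prediction_probability": 0.05}, "blocklist_label": 0}
```

Either path alone is enough to flag an event, which is why a detector description naming only one field would not tell the whole story.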

peteharverson · 2022-01-31T11:49:51Z

packages/ml_problem_child/kibana/ml_module/problem-child-ml.json

+                            },
+                            {
+                                "by_field_name": "user.name",
+                                "detector_description": "high sum by host",


Could we include the field name in the description of this detector - what is it a high sum of?

peteharverson · 2022-01-31T11:49:55Z

packages/ml_problem_child/kibana/ml_module/problem-child-ml.json

+                        "detectors": [
+                            {
+                                "by_field_name": "user.name",
+                                "detector_description": "high sum by user",


Could we include the field name in the description of this detector - what is it a high sum of?

As above, I think introducing field names in the descriptions might introduce complexity, since users may not know what the fields exactly mean.

I agree with Apoorva, field names aren't always the most user-centric and don't add that much value (imo) for the complexity they could add

peteharverson · 2022-01-31T11:50:08Z

packages/ml_problem_child/kibana/ml_module/problem-child-ml.json

+                        "detectors": [
+                            {
+                                "by_field_name": "process.parent.name",
+                                "detector_description": "high sum by parent process",


Could we include the field name in the description of this detector - what is it a high sum of?

peteharverson · 2022-01-31T11:50:14Z

packages/ml_problem_child/kibana/ml_module/problem-child-ml.json

+                            },
+                            {
+                                "by_field_name": "process.parent.name",
+                                "detector_description": "high sum by host",


Could we include the field name in the description of this detector - what is it a high sum of?

packages/ml_problem_child/manifest.yml

ajosh0504

Thanks for getting this done! Super excited to see these models in packages. Just one question. So once a user installs a package, what's the process for them to update/change artifacts produced by the package?

packages/ml_problem_child/manifest.yml

ajosh0504 · 2022-01-31T17:40:43Z

packages/ml_problem_child/docs/README.md

@@ -0,0 +1,6 @@
+# ML Problem Child


Can we keep the name consistent everywhere? I'm seeing ProblemChild, problem child and ProblemChild being used. We've been going with ProblemChild so far.

Updated to use ProblemChild everywhere in 7d33b9d

ajosh0504 · 2022-01-31T17:41:11Z

packages/ml_problem_child/docs/README.md

@@ -0,0 +1,6 @@
+# ML Problem Child
+
+The Problem child model package contains the [Problem child model and associated assets](https://www.elastic.co/blog/problemchild-generate-alerts-to-detect-living-off-the-land-attacks), which are used to detect living off the land (LOtl) activity.


Suggested change

The Problem child model package contains the [Problem child model and associated assets](https://www.elastic.co/blog/problemchild-generate-alerts-to-detect-living-off-the-land-attacks), which are used to detect living off the land (LOtl) activity.

The Problem child package contains the [Problem child model and associated assets](https://www.elastic.co/blog/problemchild-generate-alerts-to-detect-living-off-the-land-attacks), which are used to detect living off the land (LotL) activity.

Updated in 7d33b9d

packages/ml_problem_child/manifest.yml

packages/ml_problem_child/kibana/ml_module/problem-child-ml.json

ajosh0504 · 2022-01-31T21:10:28Z

packages/ml_problem_child/kibana/ml_module/problem-child-ml.json

+                            },
+                            {
+                                "by_field_name": "host.hostname",
+                                "detector_description": "high sum by host",


@peteharverson So the way we classify processes as malicious is using either a supervised model or a blocklist for things the model may have missed. I think adding this in the description will introduce complexity.

ajosh0504 · 2022-01-31T21:11:47Z

packages/ml_problem_child/kibana/ml_module/problem-child-ml.json

+                        "detectors": [
+                            {
+                                "by_field_name": "user.name",
+                                "detector_description": "high sum by user",


As above, I think introducing field names in the descriptions might introduce complexity, since users may not know what the fields exactly mean.

ajosh0504 · 2022-01-31T21:28:04Z

As with the DGA model, should the model file name here be something more understandable? It's currently problemchild_20210526_1.0.json. This will also need to be changed in the inference pipeline.

@ajosh0504 - this file name is the name of the model id since the file name is used to determine the model id to send up to the put trained models api to install it to ES. Filenames being the asset id is consistent with the way fleet installs things.
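As a rough illustration of the convention described here (file name doubles as asset id), the sketch below derives an id by dropping the `.json` extension. The helper name is hypothetical and this is not Fleet's actual implementation, only an assumption about what "the file name is the model id" amounts to.

```python
from pathlib import Path


def model_id_from_filename(filename: str) -> str:
    """Derive an asset id from a model file name by removing the final extension."""
    # Path.stem strips only the last suffix, so the "1.0" version part is preserved.
    return Path(filename).stem


model_id = model_id_from_filename("problemchild_20210526_1.0.json")
```

Under this convention, renaming the file would also change the model id referenced by the inference pipeline, so both would need to be updated together.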

dishadasgupta · 2022-01-31T23:11:20Z

packages/ml_problem_child/elasticsearch/ingest_pipeline/ml_problem_child_ingest_pipeline.yml

@@ -0,0 +1,10 @@
+---
+description: "A pipeline of pipelines for ProblemChild detection"


Is there a way to make this description a bit clearer? I feel like "pipeline of pipelines" at first read could be a bit confusing

How about just "Pipelines for ProblemChild detection"? Is this label actually shown anywhere in the UI?

dishadasgupta · 2022-01-31T23:19:34Z

packages/ml_problem_child/docs/README.md

+
+The Problem child model package contains the [Problem child model and associated assets](https://www.elastic.co/blog/problemchild-generate-alerts-to-detect-living-off-the-land-attacks), which are used to detect living off the land (LOtl) activity.
+
+To download the assets, click **Settings** > **Install ML Problem child assets**.


I'm assuming this is within a cluster? Are there any steps needed before this? i.e. ProblemChild requires Platinum level capabilities, is that denoted anywhere (unless I missed it)? Maybe even a quick note in the README would be helpful

The package manifest file has the license type that is required - that is enforced in Fleet and the installation will fail due to failing to meet the required license type.

alvarezmelissa87 · 2022-02-02T15:29:43Z

cc @lcawl for package/model description text in the readme and such.

packages/problem_child/manifest.yml

brokensound77

So cool to see this finally going into the product (after being in an experimental state for so long). I left a few suggestions.

@spong FYSA, if I am not mistaken, this and #2352 are the first integrations outside of the prebuilt rules to add rules destined for the detection engine. Not sure if there were any assumptions or considerations to be made.

brokensound77 · 2022-02-02T22:25:46Z

packages/problem_child/kibana/security_rule/34184d4e-ef61-477b-8d76-5c93448c29bf.json

+        "author": [
+            "Elastic"
+        ],
+        "description": "A supervised machine learning model (ProblemChild) or its blocklist has identified\na suspicious Windows process event to be malicious activity.\n",


is the line break in the middle intentional?

Shouldn't need a trailing one either

These line breaks were part of the release and haven't been changed. I'd need confirmation that it's okay to remove them.

FWIW there doesn't appear to be any issue with rendering (these newlines are ignored):

brokensound77 · 2022-02-02T22:28:22Z

packages/problem_child/kibana/security_rule/9a2e372a-cbeb-4ad6-a288-017ef086324c.json

+        "license": "Elastic License",
+        "max_signals": 10000,
+        "name": "Machine Learning Detected a Suspicious Windows Event with a High Malicious Probability Score",
+        "query": "(problemchild.prediction:1 and problemchild.prediction_probability \u003e 0.98) or blocklist_label:1",


was the \u003e a unicode conversion error?

Yep - fixed! thanks! 7d0d452

Hmmm, I'm still seeing this in the latest. The other changes from 7d0d452 appear to have made it though? (i.e. packages/problem_child/manifest.yml title has been updated to ProblemChild instead of ML ProblemChild)

Thanks for catching this - this should now be a > instead of the encoded version 👍 Fixed in c3692a17d515ebdd31e95e03652f735d04db5748

@spong - looks like running the elastic-package format command is what's causing the change from > to the encoded version. Is this expected?
cc @mtojek - also - running elastic-package check locally is failing with a reference to the ml_model file name - though the update was made to package-spec to match that file name pattern and I've tested with the package-spec validator test so curious if this could be something going on locally? Might need to chat with you quickly on that one.

Oh wow, looks like you are right

integrations/packages/security_detection_engine/kibana/security_rule/a9b05c3b-b304-4bf9-970d-acdfaef2944c.json

Line 16 in 1d18f55

"query": "/* Registry Path ends with backslash */\nregistry where /* length(registry.data.strings) \u003e 0 and */\n registry.path : (\"HKEY_USERS\\\\*\\\\Software\\\\Microsoft\\\\Windows\\\\CurrentVersion\\\\Run\\\\\",\n \"HKU\\\\*\\\\Software\\\\Microsoft\\\\Windows\\\\CurrentVersion\\\\Run\\\\\",\n \"HKLM\\\\Software\\\\Microsoft\\\\Windows\\\\CurrentVersion\\\\Run\\\\\", \n \"HKLM\\\\Software\\\\WOW6432Node\\\\Microsoft\\\\Windows\\\\CurrentVersion\\\\Run\\\\\", \n \"HKEY_USERS\\\\*\\\\Software\\\\Microsoft\\\\Windows\\\\CurrentVersion\\\\Policies\\\\Explorer\\\\Run\\\\\",\n \"HKU\\\\*\\\\Software\\\\Microsoft\\\\Windows\\\\CurrentVersion\\\\Policies\\\\Explorer\\\\Run\\\\\",\n \"HKLM\\\\Software\\\\Microsoft\\\\Windows\\\\CurrentVersion\\\\Policies\\\\Explorer\\\\Run\\\\\")\n",

I guess it is ok to leave it converted then 🤷

What we can see in the CI output is:

[2022-03-18T15:59:14.998Z] Error: checking package failed: formatting the integration failed (path: /var/lib/jenkins/workspace/est-manager_integrations_PR-2115/src/github.com/elastic/integrations/packages/problem_child, failFast: true): walking through the integration files failed: formatting file failed (path: /var/lib/jenkins/workspace/est-manager_integrations_PR-2115/src/github.com/elastic/integrations/packages/problem_child/kibana/security_rule/9a2e372a-cbeb-4ad6-a288-017ef086324c.json): file is not formatted (path: /var/lib/jenkins/workspace/est-manager_integrations_PR-2115/src/github.com/elastic/integrations/packages/problem_child/kibana/security_rule/9a2e372a-cbeb-4ad6-a288-017ef086324c.json)

so it confirms that you have to post the formatted code. Sometimes the formatted code might be harder to read, but we don't have a choice here if we want to depend on standard JSON libraries. Please run elastic-package format and post the formatted content.
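For readers wondering whether the escaped form changes the rule, here is a small Python check showing that `\u003e` and `>` decode to the same JSON string, so the query is semantically identical either way. The variable names are illustrative. As background, elastic-package is written in Go, and Go's `encoding/json` escapes `<`, `>`, and `&` by default; that this is the exact cause of the conversion here is an assumption, not something confirmed in this thread.

```python
import json

# The same query fragment, once with the escaped form and once with a literal ">".
# A raw string keeps the backslash so the JSON parser sees the \u003e escape itself.
escaped = r'"problemchild.prediction_probability \u003e 0.98"'
literal = '"problemchild.prediction_probability > 0.98"'

# Any standards-compliant JSON parser decodes both to the same string.
same = json.loads(escaped) == json.loads(literal)
```

The escaped file is harder to read in a diff, but any consumer that parses the JSON sees the same query.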

cc @mtojek - also - running elastic-package check locally is failing with a reference to the ml_model file name - though the update was made to package-spec to match that file name pattern and I've tested with the package-spec validator test so curious if this could be something going on locally? Might need to chat with you quickly on that one.

It depends on which elastic-package release you have locally and which one is in the go.mod in the Integrations.

BTW we're struggling a bit with updating the dependency - here.

brokensound77 · 2022-02-02T22:37:31Z

packages/problem_child/docs/README.md

+The ProblemChild package contains the [ProblemChild model and associated assets](https://www.elastic.co/blog/problemchild-generate-alerts-to-detect-living-off-the-land-attacks), which are used to detect living off the land (LotL) activity.
+
+To download the assets, click **Settings** > **Install ML ProblemChild assets**.
+


is it worth elaborating on this to explain the relationship between the model, jobs, and rules. I believe the job and rules will be installed but in a disabled state. Also if the rule is enabled before the job, it will throw an error.

The schema for rules accepts defining enabled = true, though I am not totally sure on the ml jobs

How about adding something to the Configuration section of the readme, after the "to download the assets" bit?
Something like "Ingest data with the installed ingest pipeline to enrich your indices with inference data and run the provided anomaly detection jobs."

If rules are enabled in the same way as described in https://www.elastic.co/guide/en/security/master/rules-ui-management.html, worst case we can link there. For example, add a sentence like, "For more information about activating the detection rules, refer to .

cc @peteharverson, @lcawl, @brokensound77

The goal is to ultimately have a page in the docs that walks through the installation and use of this integration's assets, but since that's not available to link to yet, I agree with your suggestion for a couple of lines in the Configuration section of the readme as a stop-gap.

Added in 11eb4a152ddadf13c75d6e469e97f01d6db9dd06
cc @lcawl, @peteharverson

@lcawl, @peteharverson, @Winterflower
The latest screenshots and README file have been updated with more detailed instructions on how to use the package - as discussed - in c3692a17d515ebdd31e95e03652f735d04db5748
Would appreciate confirmation that it's what we want when you get a chance! 🙏

peteharverson

Latest changes LGTM

mtojek · 2022-03-22T22:18:37Z

packages/problemchild/kibana/ml_module/problemchild-ml.json

+            }
+        ]
+    },
+    "id": "problem_child",


@alvarezmelissa87 What about this "id"? CI mentioned that field.

Ah, thanks - missed this the first time! Updated in 38fd11b

spong · 2022-03-24T22:54:25Z

Just commenting to update that we've come to consensus on how we'll handle the aforementioned issue around rule_id uniqueness between packages, and that there is no reason to hold up the merging or publishing of this package from the Security Solution side. Please see this comment elastic/kibana#128202 (comment) for all the details, and thank you all for all your input here! 🙂

mtojek · 2022-03-28T14:38:46Z

/test

* adds ml_problem_child package
* add ml_module to problem child package
* ensure modules in right path. update query
* fix jobs configs
* move datafeeds into attributes
* update ingest pipeline name and update job configs
* remove hardcoded indices for datafeeds. update descriptions
* adds job groups and security rules
* update deprecated property and package manifest minimum version
* format files
* rename package folder. update logo
* fix encoding error
* update package name to match validation pattern. fix encoding in rule
* Add (experimental) to job descriptions)
* update README with more asset info
* add license requirement to card and readme
* add asset context to readme
* update card title and description
* update overview config section with more instructions
* change back to basic license but add platinum subscription language and notice
* update codeowners file
* update codeowners and readme
* update owners in manifest
* ensure files formatted correctly
* update ml_module asset id to match filename
* rename problem_child directory to problemchild for consistency
* update ml module id to match filename
* fix module id

alvarezmelissa87 added the enhancement (New feature or request) label Nov 3, 2021

alvarezmelissa87 self-assigned this Nov 3, 2021

alvarezmelissa87 force-pushed the ml-problem-child-package branch 2 times, most recently from 6a60abe to fc68b33 Compare December 6, 2021 18:06

alvarezmelissa87 force-pushed the ml-problem-child-package branch from fc68b33 to b83c6dd Compare December 8, 2021 23:02

peteharverson reviewed Dec 17, 2021

packages/ml_problem_child/kibana/ml_module/problem-child-ml.json (outdated, resolved)

peteharverson reviewed Dec 17, 2021

packages/ml_problem_child/kibana/ml_module/problem-child-ml.json (outdated, resolved)

alvarezmelissa87 force-pushed the ml-problem-child-package branch from 8074733 to d6d356e Compare December 20, 2021 20:02

peteharverson mentioned this pull request Dec 22, 2021

[Discuss] Support stored scripts in Fleet packages elastic/package-spec#202

Open

alvarezmelissa87 force-pushed the ml-problem-child-package branch from 3876651 to 0cac4a9 Compare January 20, 2022 22:44

peteharverson reviewed Jan 31, 2022

packages/ml_problem_child/changelog.yml (outdated, resolved)

packages/ml_problem_child/changelog.yml (outdated, resolved)

peteharverson reviewed Jan 31, 2022

packages/ml_problem_child/manifest.yml (outdated, resolved)

peteharverson reviewed Jan 31, 2022

packages/ml_problem_child/manifest.yml (outdated, resolved)

ajosh0504 reviewed Jan 31, 2022

dishadasgupta reviewed Jan 31, 2022

alvarezmelissa87 marked this pull request as ready for review February 2, 2022 00:14

peteharverson reviewed Feb 2, 2022

packages/problem_child/manifest.yml (outdated, resolved)

brokensound77 reviewed Feb 2, 2022

peteharverson mentioned this pull request Feb 4, 2022

Add ML modules to Apache integration readme #2640

Merged

4 tasks

alvarezmelissa87 requested a review from lcawl February 4, 2022 16:28

peteharverson approved these changes Feb 4, 2022

lcawl mentioned this pull request Feb 5, 2022

Add ML modules and detection rules to problem child readme alvarezmelissa87/integrations#2

Closed

alvarezmelissa87 added 14 commits March 21, 2022 21:24

fix encoding error

7947f8e

update package name to match validation pattern. fix encoding in rule

2d4b065

Add (experimental) to job descriptions)

55bc30f

update README with more asset info

d226c14

add license requirement to card and readme

116d91b

add asset context to readme

0a467e9

update card title and description

4b8bcf6

update overview config section with more instructions

3eb6c18

change back to basic license but add platinum subscription language and notice

c283038

update codeowners file

fe6f9b7

update codeowners and readme

eaec910

update owners in manifest

1d8bbdc

ensure files formatted correctly

07e522b

update ml_module asset id to match filename

22cc5f4

alvarezmelissa87 force-pushed the ml-problem-child-package branch from 4fe4ad2 to 22cc5f4 Compare March 22, 2022 03:24

alvarezmelissa87 added 2 commits March 22, 2022 09:21

rename problem_child directory to problemchild for consistency

8142e08

update ml module id to match filename

f6bd70c

mtojek reviewed Mar 22, 2022

fix module id

38fd11b

spong mentioned this pull request Mar 24, 2022

[Security Solution][Detections] Add package tests for rule_id uniqueness and ensuring correct sourcePackage elastic/kibana#128547

Open

mtojek approved these changes Mar 28, 2022

alvarezmelissa87 merged commit 99ce3f1 into elastic:main Mar 29, 2022

alvarezmelissa87 deleted the ml-problem-child-package branch March 29, 2022 00:30

alvarezmelissa87 mentioned this pull request Apr 6, 2022

Fix typo in ML supervised model Problem child package #3022

Merged

4 tasks

spong mentioned this pull request Jan 11, 2023

[Change Proposal] Allow security_rule objects to have rule IDs different from the object IDs elastic/package-spec#459

Closed

andrewkroh added the Integration:problemchild (Living off the Land Attack Detection) and New Integration (Issue or pull request for creating a new integration package) labels Aug 13, 2024
