Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

detect-flowbits: adding details for flowbits v4 #9691

Closed

Conversation

hadiqaalamdar
Copy link
Contributor

@hadiqaalamdar hadiqaalamdar commented Oct 25, 2023

Task #6309

Link to redmine ticket:
Previous PR: #9688

Describe changes:

  • added the recommended changes from the last PR.
  • The SV tests id 1, 8 and 9 are failing

Output from console:

===> flowbits: Sub test #1: FAIL : expected 1 matches; got 0 for filter {'filename': 'rules.json', 'count': 1, 'match': {'id': 1, 'lists.packet.matches[0].name': 'flowbits', 'lists.packet.matches[0].flowbits.action': 'noalert'}}
===> flowbits: Sub test #8: FAIL : expected 1 matches; got 0 for filter {'filename': 'rules.json', 'count': 1, 'match': {'id': 8, 'lists.postmatch.matches[0].name': 'flowbits', 'lists.postmatch.matches[0].flowbits.cmd': 'isset', 'lists.postmatch.matches[0].flowbits.ored_flowbits': 2, 'lists.postmatch.matches[0].flowbits.ored_variables[0]': 'fb1', 'lists.postmatch.matches[0].flowbits.ored_variables[1]': 'fb2'}}
===> flowbits: Sub test #9: FAIL : expected 1 matches; got 0 for filter {'filename': 'rules.json', 'count': 1, 'match': {'id': 9, 'lists.packet.matches[0].name': 'flowbits', 'lists.packet.matches[0].flowbits.ored_flowbits': 1, 'lists.packet.matches[0].flowbits.cmd': 'isnotset', 'lists.packet.matches[0].flowbits.ored_variables[0]': 'fb1', 'lists.packet.matches[0].flowbits.ored_variables[1]': 'fb2'}}

SV BRANCH = OISF/suricata-verify#1441

if (cd->or_list_size == 0) {
jb_set_string(js, "name", VarNameStoreSetupLookup(cd->idx, VAR_TYPE_FLOW_BIT));
} else if (cd->or_list_size > 0) {
jb_set_uint(js, "ored_flowbits", cd->or_list_size);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure about the use of "ored", IMO, "flowbits" should be fine.

jb_open_object(js, "flowbits");
switch (cd->cmd) {
case DETECT_FLOWBITS_CMD_NOALERT:
jb_set_string(js, "action", "noalert");
Copy link
Member

@inashivb inashivb Oct 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that this behavior is diff in 6.0.x, so you may have to check the tests
nvm we're just adding it so it doesn't matter

jb_set_string(js, "name", VarNameStoreSetupLookup(cd->idx, VAR_TYPE_FLOW_BIT));
} else if (cd->or_list_size > 0) {
jb_set_uint(js, "ored_flowbits", cd->or_list_size);
jb_open_array(js, "ored_variables");
Copy link
Member

@inashivb inashivb Oct 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should keep the output simple. Like

flowbits:
   name: [fb1]

or

flowbits:
  name: [fb1, fb2]  # in case of an or operation

Also, one thing to note is that even in an or list, not all flowbits may have matched. From what I gather, we want to log the postmatch stuff so we should only log the names of the flowbit that matched? I'll check today and get back if there's a better way to retrieve it.

Also, need to look if or list may be specific to certain cmds.

Copy link
Contributor Author

@hadiqaalamdar hadiqaalamdar Oct 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand what you mean by changing the output. Currently, I have output being stored in name and the ored variables in ored_variables. Should I just have the name output? Also, or list is specific only to isset and isnotset cmds from what I understand.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no such thing as ored variables. Those are also flowbit names separated by | which signifies or operator. And, there are flowbit names which are individually present in a rule.
I still have to check the code a bit so please wait on it. I'll get to it very soon now.

Copy link
Member

@inashivb inashivb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @hadiqaalamdar !

After looking at the code, a few things to consider:

  1. A rule can have multiple flowbits, what would it's layout look like in rules.json if we were to proceed with the current code?
  2. Flowbit oring is supported only for isset and isnotset cmds. When you think about it, it makes sense why.
    - Even in case of oring, there will be just one match because of how the or operator works.
    - We currently do not store which flowbit out of the or'd flowbits actually was responsible for a match, we'll have to. (somewhere in DetectFlowbitsData)

In conclusion,

  1. We do not need to store a list in case of or operator's usage.
  2. We may need a list to sanitize the output in case multiple flowbits are used in the same rule. [This can be done after we have something working]
  3. We need lots of tests for all the scenarios. Please refer to pre-existing s-v tests about flowbits.

Recommendation:
Questions to ask yourself before proceeding.

  1. What are flowbits?
  2. What all commands are supported by flowbits?
  3. What do flowbits rules look like?
  4. What should the output of engine-analysis be in my view?
  5. Write a suricata-verify test with all the different rules you can come up with about flowbits.

These would help you get a clearer idea about what needs to be done. If needed, feel free to ask your mentors for brief overview of concepts and/or validate your understanding.

Please know that we understand this one is a complicated task so you can take your time. 😉

@jufajardini jufajardini added the outreachy Contributions made by Outreachy applicants label Oct 26, 2023
@github-actions
Copy link

NOTE: This PR may contain new authors:

Hadiqa Alamdar Bukhari <[email protected]>

@codecov
Copy link

codecov bot commented Oct 26, 2023

Codecov Report

Merging #9691 (e373c3f) into master (2fe2d82) will decrease coverage by 0.01%.
The diff coverage is 45.45%.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #9691      +/-   ##
==========================================
- Coverage   82.39%   82.39%   -0.01%     
==========================================
  Files         968      968              
  Lines      274337   274370      +33     
==========================================
+ Hits       226047   226055       +8     
- Misses      48290    48315      +25     
Flag Coverage Δ
fuzzcorpus 64.66% <0.00%> (-0.04%) ⬇️
suricata-verify 60.91% <45.45%> (-0.01%) ⬇️
unittests 62.86% <0.00%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

@hadiqaalamdar
Copy link
Contributor Author

hadiqaalamdar commented Oct 26, 2023

@inashivb
I've looked through the flowbits code however, I'm not really sure on how to deal with the ored flowbits. This function here is what I found to be most relevant to the type of code that I need to add. My thinking was to create an array and use a for loop to store any flowbits, but since I'm not as well acquainted with the suricata code base, I'm unsure on how to proceed.

Edit: Also, should all flowbit names be stored in the same name array? Is there a pre-existing function that I can use to parse through flowbit names or should I use regex and store the names in an array if cd->or_list_size > 0

@inashivb
Copy link
Member

inashivb commented Oct 27, 2023

@inashivb I've looked through the flowbits code however, I'm not really sure on how to deal with the ored flowbits. This function here is what I found to be most relevant to the type of code that I need to add. My thinking was to create an array and use a for loop to store any flowbits, but since I'm not as well acquainted with the suricata code base, I'm unsure on how to proceed.

Hi @hadiqaalamdar !
Seems like I had a misunderstanding about these engine analysis tasks in general that was cleared up after talking w Juliana. I have edited my last review (striked through the unneeded stuff) in case you'd want to check it later if needed.
Don't worry about the issues, we are here to assist you through them while learning things ourselves. Thank you for being patient with us. 😃

Edit: Also, should all flowbit names be stored in the same name array?

Yes, that was my suggestion.

The thing is for a rule like
Rule 1

flowbits:isset,fb1|fb2|fb3

fb1, fb2 and fb3 are the names of flowbits that are separated by or operand (|).
So, in this case, if fb1 is set, we return immediately and for Suricata, its a match!
But, in case fb1 isn't set, we check fb2, if fb2 is set, we return and for Suricata its a match. and so on.

Now, please note the following rule
Rule 2

flowbits:isset,fb1; flowbits:isset,fb2; flowbits:isset,fb3;

In this case, we check if fb1 is set, if yes, we move on to check if fb2 is set and so on.. Only when fb1, fb2 and fb3 are set would the rule considered to be a match. So, this is and logical operation.

Conclusion:

  1. or operation is not anything complicated, it's just flowbit names ored together.
  2. We must have tests for all such scenarios in s-v and see the corresponding output.
  3. Since Rule 1 mentioned above also makes use of flowbit names only, I think we should put it under name as well.
  4. So, ideally, flowbits name should be an array which can store one or more flowbits as shown in the example here: detect-flowbits: adding details for flowbits v4 #9691 (comment) depending on how many flowbits are in a rule for the same command.
  5. We can optionally have an operand key (we can decide on the name later) where we can store if we want to whether it was an or operation but things can become complicated if we tried to do it for and operation.

Your next steps:

  1. Make more flowbit tests with cmds and combinations as mentioned in the rules above.
  2. Don't have separate arrays for ored stuff, put flowbit names under the flowbit.name array.

Does this help you, Hadiqa?

@hadiqaalamdar
Copy link
Contributor Author

Thank you for your detailed response. I'll keep working on this and let you know if I have any further questions

@jufajardini
Copy link
Contributor

The thing is for a rule like Rule 1

flowbits:isset,fb1|fb2|fb3

fb1, fb2 and fb3 are the names of flowbits that are separated by or operand (|). So, in this case, if fb1 is set, we return immediately and for Suricata, its a match! But, in case fb1 isn't set, we check fb2, if fb2 is set, we return and for Suricata its a match. and so on.

Now, please note the following rule Rule 2

flowbits:isset,fb1; flowbits:isset,fb2; flowbits:isset,fb3;

In this case, we check if fb1 is set, if yes, we move on to check if fb2 is set and so on.. Only when fb1, fb2 and fb3 are set would the rule considered to be a match. So, this is and logical operation.

I think that the only bit where things seem still a bit fuzzy is this:
Since the engine-analysis output isn't based on matches - as it only analysing rules, not traffic matches, my understanding is that we should focus on outputting a description of the rule. Therefore, for a rule like Rule 1 you've mentioned, we should log what is the flowbits action, and an array with all three flowbit variables, as that's what the rule contains. Does that make sense?

@jufajardini
Copy link
Contributor

@inashivb I've looked through the flowbits code however, I'm not really sure on how to deal with the ored flowbits. This function here is what I found to be most relevant to the type of code that I need to add. My thinking was to create an array and use a for loop to store any flowbits, but since I'm not as well acquainted with the suricata code base, I'm unsure on how to proceed.

Edit: Also, should all flowbit names be stored in the same name array? Is there a pre-existing function that I can use to parse through flowbit names or should I use regex and store the names in an array if cd->or_list_size > 0

Now, to create the json array when we have more than one variable in the same rule, we can still use VarNameStoreSetupLookup, and we don't have to create an array as a data structure, I'd say that if we open an array json object and have a loop using VarNameStoreSetupLookup, we should be able to log all existing flowbit variables for a given rule.

@hadiqaalamdar
Copy link
Contributor Author

hadiqaalamdar commented Oct 27, 2023

The thing is for a rule like Rule 1

flowbits:isset,fb1|fb2|fb3

fb1, fb2 and fb3 are the names of flowbits that are separated by or operand (|). So, in this case, if fb1 is set, we return immediately and for Suricata, its a match! But, in case fb1 isn't set, we check fb2, if fb2 is set, we return and for Suricata its a match. and so on.

Now, please note the following rule Rule 2

flowbits:isset,fb1; flowbits:isset,fb2; flowbits:isset,fb3;

In this case, we check if fb1 is set, if yes, we move on to check if fb2 is set and so on.. Only when fb1, fb2 and fb3 are set would the rule considered to be a match. So, this is and logical operation.

@jufajardini from what I understand, I should create one name array json object that stores all the flowbit names. The array will store flowbits in 2 ways

  1. if cd->or_list_size = 0 the flowbit would be at cd->idx and then stored in name.
  2. if cd->or_list_size > 0 there are multiple flowbits and a for loop should be used alongside VarNameStoreSetupLookup to store all the flowbits in name.

My thinking is that this should cover both Rule 1 and Rule 2. Please let me know if I'm missing something.

@jufajardini
Copy link
Contributor

The thing is for a rule like Rule 1

flowbits:isset,fb1|fb2|fb3

fb1, fb2 and fb3 are the names of flowbits that are separated by or operand (|). So, in this case, if fb1 is set, we return immediately and for Suricata, its a match! But, in case fb1 isn't set, we check fb2, if fb2 is set, we return and for Suricata its a match. and so on.
Now, please note the following rule Rule 2

flowbits:isset,fb1; flowbits:isset,fb2; flowbits:isset,fb3;

In this case, we check if fb1 is set, if yes, we move on to check if fb2 is set and so on.. Only when fb1, fb2 and fb3 are set would the rule considered to be a match. So, this is and logical operation.

@jufajardini from what I understand, I should create one name array json object that stores all the flowbit names. The array will store flowbits in 2 ways

  1. if cd->or_list_size = 0 the flowbit would be at cd->idx and then stored in name.
  2. if cd->or_list_size > 0 there are multiple flowbits and a for loop should be used alongside VarNameStoreSetupLookup to store all the flowbits in name.

My thinking is that this should cover both Rule 1 and Rule 2. Please let me know if I'm missing something.

From your explanation, this sounds about right!

@jasonish
Copy link
Member

Just a comment on the format of the JSON. For the simple case we have:

"name": "flowbits",
          "flowbits": {
            "cmd": "isset",
            "name": "fb2"
          }

This looks OK, I'm not so sure about the OR case.. I'd rather see something like:

         "name": "flowbits",
         "flowbits": {
           "cmd": "isset",
           "names": [
             "fb1",
             "fb2"
           ],
           "operator": "or"
         }

Or the following is often used to express such conditions as well:

         "name": "flowbits",
         "flowbits": {
           "cmd": "isset",
           "or": ["fb1", "fb2"],
         }

@hadiqaalamdar
Copy link
Contributor Author

I'll give this a try, thank you!

@inashivb
Copy link
Member

inashivb commented Oct 28, 2023

This looks OK, I'm not so sure about the OR case.. I'd rather see something like:

         "name": "flowbits",
         "flowbits": {
           "cmd": "isset",
           "names": [
             "fb1",
             "fb2"
           ],
           "operator": "or"
         }

This was exactly my suggestion. Wasn't sure on the name of the operator or operand key though. Thanks for clarifying the entire format properly! :)

@jufajardini
Copy link
Contributor

I'll give this a try, thank you!

Hi there, I know there was a lot of back and forth for the discussions here, so please let us know if you are having trouble following up on what we mentioned here, ok?

@hadiqaalamdar
Copy link
Contributor Author

Hi there, I know there was a lot of back and forth for the discussions here, so please let us know if you are having trouble following up on what we mentioned here, ok?

Hi, sorry for the delay. I will continue to work on this and share an update this week.

@hadiqaalamdar
Copy link
Contributor Author

New PR: #9971

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
outreachy Contributions made by Outreachy applicants
Development

Successfully merging this pull request may close these issues.

4 participants