service_nftables_disabled fails after remediation #10424

Closed
marcusburghardt opened this issue Apr 4, 2023 · 32 comments
Labels
productization-issue Issue found in upstream stabilization process. RHEL8 Red Hat Enterprise Linux 8 product related. triaged

@marcusburghardt
Member

marcusburghardt commented Apr 4, 2023

Description of problem:

This rule was introduced by #10390.
It is failing after remediation when checking CIS Server Level 2 profile.

SCAP Security Guide Version:

master branch as of 2023-04-01

Operating System Version:

RHEL8.8 and RHEL9.2

Steps to Reproduce:

  1. ./build_product rhel8
  2. oscap xccdf eval --progress --remediate --profile xccdf_org.ssgproject.content_profile_cis --report /cis_remediate_report.html /ssg-rhel8-ds.xml
  3. oscap xccdf eval --progress --profile xccdf_org.ssgproject.content_profile_cis --results cis-xccdf-results.xml --report cis.html /ssg-rhel8-ds.xml

Actual Results:

xccdf_org.ssgproject.content_rule_service_nftables_disabled - fail

Expected Results:

xccdf_org.ssgproject.content_rule_service_nftables_disabled - pass

Additional Information/Debugging Steps:

@marcusburghardt marcusburghardt added productization-issue Issue found in upstream stabilization process. RHEL8 Red Hat Enterprise Linux 8 product related. labels Apr 4, 2023
@marcusburghardt marcusburghardt added this to the 0.1.68 milestone Apr 4, 2023
@jan-cerny
Collaborator

It passes using Automatus:

[jcerny@thinkpad scap-security-guide{master}]$ python3 tests/automatus.py rule --libvirt qemu:///system ssgts_rhel8 service_nftables_disabled
Setting console output to log level INFO
INFO - The base image option has not been specified, choosing libvirt-based test environment.
INFO - Logging into /home/jcerny/work/git/scap-security-guide/logs/rule-custom-2023-04-05-1025/test_suite.log
INFO - xccdf_org.ssgproject.content_rule_service_nftables_disabled
INFO - Script service_disabled.pass.sh using profile (all) OK
INFO - Script service_enabled.fail.sh using profile (all) OK

So I will take a look at what happens in the context of the whole profile.

@jan-cerny jan-cerny self-assigned this Apr 5, 2023
@jan-cerny
Collaborator

I haven't reproduced this in a RHEL 8.8 VM. The rule passes after the remediation and the nftables service is masked after the remediation. The rule is templated, so it should work the same way as all other service_disabled rules. @marcusburghardt I suspect that there may be a problem with the service itself, maybe something that prevents the service from being disabled?

@marcusburghardt
Member Author

> I haven't reproduced this in a RHEL 8.8 VM. The rule passes after the remediation and the nftables service is masked after the remediation. The rule is templated, so it should work the same way as all other service_disabled rules. @marcusburghardt I suspect that there may be a problem with the service itself, maybe something that prevents the service from being disabled?

Same here @jan-cerny. Yesterday I also ran a few profile tests locally and couldn't reproduce it. I also tested a system with the nftables service disabled during the rule implementation and it worked fine. I am investigating the service to see whether any detail was missed.

@jan-cerny
Collaborator

I suspect that the state might change during the reboot. I'm trying it now.

@jan-cerny
Collaborator

In this productization test there is a reboot and a scan is performed both before and after reboot. Before the reboot the service is masked but after the reboot the service is unmasked. I have tried to reproduce it locally on a virtual machine and on various remote machines, but I wasn't able to reproduce it outside the specific environment that is used during the productization, which prevents me from debugging it. I suspect that it can be something specific to the infrastructure.

For the time being, let's keep the issue open and observe the result of next week's productization test. Then, we might solve it by adding a (temporary) waiver.

@Mab879
Member

Mab879 commented Apr 10, 2023

This is still an issue in the latest run.

@jan-cerny
Collaborator

I'm going to revisit this issue this week.

@mildas
Contributor

mildas commented Apr 12, 2023

The Sanity/machine-hardening test is one of those where it fails. If you want, I can fairly quickly get you a machine where the test was run and the rule fails.

The issue doesn't seem to be an environment problem. It also fails in the kickstart test. The kickstart test installs a VM, hardens it via the Anaconda addon, and performs a VM scan after its first boot. No beakerlib, no workarounds (except a few rule unselects so it's accessible), basically a freshly installed RHEL.

@jan-cerny
Collaborator

I wasn't able to reproduce this. You can ping me off-list to get some details about my machines. I'm honestly giving up.

@jan-cerny jan-cerny removed their assignment Apr 14, 2023
@marcusburghardt
Member Author

> The Sanity/machine-hardening test is one of those where it fails. If you want, I can fairly quickly get you a machine where the test was run and the rule fails.
>
> The issue doesn't seem to be an environment problem. It also fails in the kickstart test. The kickstart test installs a VM, hardens it via the Anaconda addon, and performs a VM scan after its first boot. No beakerlib, no workarounds (except a few rule unselects so it's accessible), basically a freshly installed RHEL.

It would be great if you could give me access to this machine, Milan.

@marcusburghardt
Member Author

> I wasn't able to reproduce this. You can ping me off-list to get some details about my machines. I'm honestly giving up.

Thanks for the effort @jan-cerny. You provided valuable information with your tests. I will try to continue the investigation of this issue.

@marcusburghardt marcusburghardt self-assigned this Apr 14, 2023
@yuumasato
Member

This still happens as of this week.

@mildas
Contributor

mildas commented May 15, 2023

@marcusburghardt
On the reserved machine, service_nftables_disabled passes by default. There, I ran the service_nftables_disabled check after every CIS Level 2 rule remediation, and it starts failing right after service_firewalld_enabled. So that's the collision.

I haven't done any further investigation into whether there's a service dependency, whether there are more problematic rules in CIS (and firewalld was just the first hit), or what else is going on. I might look at it later, but wanted to inform you as you might already know what's going on.
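
The bisection described above (re-running one check after each individual remediation) can be sketched roughly as follows. This is only an illustration under assumptions: `find_first_breaking_rule` and its two callback arguments are hypothetical names and are not part of the content repository or of oscap.

```sh
# Hypothetical helper: remediate rules one at a time and print the first rule
# after which the watched check stops passing.
#   $1      command that remediates a single rule (receives the rule ID)
#   $2      command that returns 0 while the watched rule still passes
#   $3...   rule IDs, in the order they should be remediated
find_first_breaking_rule() {
    remediate_cmd=$1
    check_cmd=$2
    shift 2
    for rule in "$@"; do
        # A failing remediation should not stop the bisection.
        "$remediate_cmd" "$rule" || true
        if ! "$check_cmd"; then
            printf '%s\n' "$rule"
            return 0
        fi
    done
    # The check never started failing.
    return 1
}
```

With oscap, the two callbacks could wrap `oscap xccdf eval --remediate --rule <id>` and `oscap xccdf eval --rule xccdf_org.ssgproject.content_rule_service_nftables_disabled`, using the same profile and data stream as in the reproducer; `oscap xccdf eval` exits non-zero when a selected rule fails.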

@marcusburghardt
Member Author

> @marcusburghardt On the reserved machine, service_nftables_disabled passes by default. There, I ran the service_nftables_disabled check after every CIS Level 2 rule remediation, and it starts failing right after service_firewalld_enabled. So that's the collision.
>
> I haven't done any further investigation into whether there's a service dependency, whether there are more problematic rules in CIS (and firewalld was just the first hit), or what else is going on. I might look at it later, but wanted to inform you as you might already know what's going on.

@mildas I didn't find any relationship between service_nftables_disabled and service_firewalld_enabled. I also checked the template and the remediation. Everything seems to be ok.

Something weird is that the first scan should not pass. It seems that, for some reason, the first scan is unable to properly assess the systemd units:
[Screenshot from 2023-05-15 17-35-33]

So, the remediation is not applied.
However, after reboot, the systemd units are properly assessed:
[Screenshot from 2023-05-15 17-37-12]

This second scan is correct, and it fails because the nftables.service unit is not masked. It would only be masked if the remediation had been applied. But since the initial scan does not properly discover the nftables.service state, it reports a false pass.

Do you have any idea on why the nftables.service state is not detected during the first scan?

@mildas
Contributor

mildas commented May 16, 2023

> Do you have any idea on why the nftables.service state is not detected during the first scan?

I haven't found anything obvious; the services don't reveal anything, and the oscap devel log didn't help me see why it doesn't detect it on the first scan.

@jan-cerny Could you look into it? Or should we report it against the openscap project? See the last 2 messages, but basically openscap doesn't see that the nftables service is not masked until firewalld_enabled gets remediated.

@jan-cerny
Collaborator

I can reproduce the situation that @marcusburghardt described, i.e. the situation where oscap can't read the state of the nftables service.

I have found that the reason is that oscap doesn't get the data about the nftables.service systemd unit from dbus.

However, even systemd doesn't show this unit.

This gives no output:

systemctl list-units --all | grep nftables

Also, this doesn't give any output:

 dbus-send --system --print-reply --reply-timeout=2000 --type=method_call --dest=org.freedesktop.systemd1 /org/freedesktop/systemd1 org.freedesktop.systemd1.Manager.ListUnits | grep nftables

OTOH the status can be displayed

[root@kvm-05-guest10 build]# systemctl status nftables
● nftables.service - Netfilter Tables
   Loaded: loaded (/usr/lib/systemd/system/nftables.service; disabled; vendor preset: disabled)
   Active: inactive (dead)
     Docs: man:nft(8)

I guess that this is some specific behavior of systemd/dbus that I don't understand.
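
One possible explanation, based on the systemctl man page rather than on anything confirmed in this thread: `systemctl list-units` only reports units that systemd currently holds in memory, `systemctl list-unit-files` reports unit files installed on disk, and `systemctl status` implicitly loads the unit it is asked about. A minimal sketch for comparing the two listings follows; `unit_in_listing` is a hypothetical helper name.

```sh
# Hypothetical helper: read a systemctl listing on stdin and succeed only if
# the first column of some line is exactly the given unit name.
unit_in_listing() {
    awk -v unit="$1" '$1 == unit { found = 1 } END { exit !found }'
}

# Example comparison (needs a running systemd, so it is left commented out):
#   systemctl list-units --all --plain --no-legend | unit_in_listing nftables.service \
#       && echo "loaded in memory" || echo "not loaded in memory"
#   systemctl list-unit-files --plain --no-legend | unit_in_listing nftables.service \
#       && echo "unit file installed" || echo "no unit file"
```

If the unit file is installed but the unit is not in memory, a scanner that relies only on the in-memory listing would have no record of it at all.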

@marcusburghardt
Member Author

> I can reproduce the situation that @marcusburghardt described, i.e. the situation where oscap can't read the state of the nftables service.
>
> I have found that the reason is that oscap doesn't get the data about the nftables.service systemd unit from dbus.
>
> However, even systemd doesn't show this unit.
>
> This gives no output:
>
> systemctl list-units --all | grep nftables
>
> Also, this doesn't give any output:
>
> dbus-send --system --print-reply --reply-timeout=2000 --type=method_call --dest=org.freedesktop.systemd1 /org/freedesktop/systemd1 org.freedesktop.systemd1.Manager.ListUnits | grep nftables
>
> OTOH the status can be displayed
>
> [root@kvm-05-guest10 build]# systemctl status nftables
> ● nftables.service - Netfilter Tables
>    Loaded: loaded (/usr/lib/systemd/system/nftables.service; disabled; vendor preset: disabled)
>    Active: inactive (dead)
>      Docs: man:nft(8)
>
> I guess that this is some specific behavior of systemd/dbus that I don't understand.

@jan-cerny , in your test environment, could you try to execute the systemctl daemon-reload command before the systemctl list-units --all | grep nftables, please? After rebooting the system it seems to work, so probably this command should help, but we need to confirm if it is the case.

@jan-cerny
Collaborator

Thanks! I will try it.

@jan-cerny
Collaborator

The result is that systemctl daemon-reload doesn't change anything. After executing it, the systemctl list-units --all | grep nftables still returns nothing. Also the output of other commands is still the same as in my previous comment. I tried multiple times.

@marcusburghardt
Member Author

> The result is that systemctl daemon-reload doesn't change anything. After executing it, the systemctl list-units --all | grep nftables still returns nothing. Also the output of other commands is still the same as in my previous comment. I tried multiple times.

Ok. Thank you very much for this test @jan-cerny. I was considering whether we might be able to work around this issue without a reboot, but that doesn't seem to be the case.

@marcusburghardt
Member Author

@mildas and @jan-cerny, would you agree to move this issue to the scanner and waive this rule on the content side?

@jan-cerny
Collaborator

@marcusburghardt This doesn't seem to be an issue on the content side. However, I'm not sure if it's an issue in the scanner. I don't know exactly where the issue is. In OpenSCAP, we only perform a dbus call of the org.freedesktop.systemd1.Manager.ListUnits method and parse the returned value. I have shown in the comment above that in this situation this dbus call doesn't return any data about the nftables unit. So, the first option is that we call the wrong method in OpenSCAP. The second option is a problem with systemd or nftables itself.
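
For comparison with the ListUnits approach, a unit file's state can also be requested by name through the systemd manager's `GetUnitFileState` D-Bus method. The sketch below is an illustration only and makes no claim about how OpenSCAP queries systemd or how it was later changed; `busctl_string_reply` and `unit_file_state` are hypothetical helper names.

```sh
# Hypothetical helper: extract the value from a busctl string reply,
# which has the form:  s "masked"
busctl_string_reply() {
    sed -n 's/^s "\(.*\)"$/\1/p'
}

# Hypothetical helper: print the unit file state (e.g. enabled, disabled,
# masked) of the unit named in $1. Needs a running systemd and busctl.
unit_file_state() {
    busctl call org.freedesktop.systemd1 /org/freedesktop/systemd1 \
        org.freedesktop.systemd1.Manager GetUnitFileState s "$1" |
        busctl_string_reply
}
```

For example, `unit_file_state nftables.service` would be expected to print the state of the installed unit file whether or not the unit is currently loaded in memory.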

@marcusburghardt
Member Author

> @marcusburghardt This doesn't seem to be an issue on the content side. However, I'm not sure if it's an issue in the scanner. I don't know exactly where the issue is. In OpenSCAP, we only perform a dbus call of the org.freedesktop.systemd1.Manager.ListUnits method and parse the returned value. I have shown in the comment above that in this situation this dbus call doesn't return any data about the nftables unit. So, the first option is that we call the wrong method in OpenSCAP. The second option is a problem with systemd or nftables itself.

I see. It makes sense. So we need more investigation to clarify the source of the problem. It sounds reasonable to keep this issue open here. It is also clear to us that the rule itself is working as expected, so we should be fine to waive it in productization tests while this issue is open. Is that ok for you, @mildas?

@mildas
Contributor

mildas commented May 17, 2023

Sure, I will update the waivers.
I suggest contacting someone from systemd to ask if they could briefly check it. That could save us time. There's still some time before the next RHEL release, so if we act quickly, it could get fixed there. We just need to know whom to assign the bugzilla to, and then create it.

@evgenyz
Member

evgenyz commented May 22, 2023

The nftables.service is:

[Unit]
Description=Netfilter Tables
Documentation=man:nft(8)
Wants=network-pre.target
Before=network-pre.target

[Service]
Type=oneshot
ProtectSystem=full
ProtectHome=true
ExecStart=/sbin/nft -f /etc/sysconfig/nftables.conf
ExecReload=/sbin/nft 'flush ruleset; include "/etc/sysconfig/nftables.conf";'
ExecStop=/sbin/nft flush ruleset
RemainAfterExit=yes

[Install]
WantedBy=multi-user.target

The most important part of it is Type=oneshot: https://trstringer.com/simple-vs-oneshot-systemd-service/. This might be the reason it is not listed.

@evgenyz
Member

evgenyz commented May 22, 2023

When I call the method via D-Spy I get ('nftables.service', 'Netfilter Tables', 'loaded', 'inactive', 'dead', '', '/org/freedesktop/systemd1/unit/nftables_2eservice', 0, '', '/'),.

@jan-cerny jan-cerny modified the milestones: 0.1.68, 0.1.69 May 29, 2023
@marcusburghardt marcusburghardt removed their assignment Jun 5, 2023
@marcusburghardt
Member Author

This issue is not content related, but related to D-Bus and systemd. @evgenyz, would you like to investigate whether we can fix this on the scanner side?

@mildas
Contributor

mildas commented Jun 5, 2023

Could you create a BZ against either openscap or dbus and close this issue? @evgenyz or @marcusburghardt

@ggbecker
Member

ggbecker commented Sep 5, 2023

We need to retest this using an openscap build that contains the fix OpenSCAP/openscap#1980

@Mab879 Mab879 modified the milestones: 0.1.70, 0.1.71 Oct 2, 2023
@vojtapolasek vojtapolasek modified the milestones: 0.1.71, 0.1.72 Nov 28, 2023
@marcusburghardt marcusburghardt modified the milestones: 0.1.72, 0.1.73 Jan 29, 2024
@vojtapolasek vojtapolasek modified the milestones: 0.1.73, 0.1.74 Apr 30, 2024
@Mab879 Mab879 modified the milestones: 0.1.74, 0.1.75 Jul 29, 2024
@Mab879 Mab879 modified the milestones: 0.1.75, 0.1.76 Nov 6, 2024
@Mab879 Mab879 removed the blocked Issue that can't be fixed in content. label Dec 19, 2024
@Mab879
Member

Mab879 commented Dec 19, 2024

The OpenSCAP fix has been released, so this should be reviewed.

@Mab879 Mab879 added the triaged label Dec 19, 2024
@Mab879
Member

Mab879 commented Jan 13, 2025

This seems to be fixed now. Closing.

@Mab879 Mab879 closed this as completed Jan 13, 2025
8 participants