Killing multi process fuzzer #2110

Closed
vringar opened this issue Apr 25, 2024 · 16 comments
Labels
bug Something isn't working

Comments

@vringar
Contributor

vringar commented Apr 25, 2024

Report based on abcb2bf

Describe the bug
When a fuzzer started through the launcher is killed with ctrl+C in the attached terminal, only the root process gets killed while the others keep running in the background.

To Reproduce
Steps to reproduce the behavior:

  1. Go to fuzzers/qemu_systemmode
  2. cargo make build
  3. cargo make run
  4. Wait until a couple of progress reports appear
  5. Press ctrl+C

Expected behavior
I would expect an orderly shutdown of all involved processes. The other ones don't even respond to a SIGTERM but have to be SIGKILLed by the kernel

Screen output/Screenshots

^C
Fuzzing stopped by user. Good bye.
[2024-04-25T20:41:49Z ERROR libafl::events::llmp] Failed to send tcp message OsError(
        Os {
            code: 32,
            kind: BrokenPipe,
            message: "Broken pipe",
        },
        "io::Error ocurred",
        ErrorBacktrace,
    )
thread 'main' panicked at /run/media/stefan/02e400c2-1bdd-4d46-a0dd-044d6b4f3af4/uni/LibAFL/libafl/src/events/llmp.rs:1421:21:
Fuzzer-respawner: Storing state in crashed fuzzer instance did not work, no point to spawn the next client! This can happen if the child calls `exit()`, in that case make sure it uses `abort()`, if it got killed unrecoverable (OOM), or if there is a bug in the fuzzer itself. (Child exited with: -1)
stack backtrace:
<lots of backtrace>
<Terminal is still blocked>
^C
<Terminal is now usable again>
> pgrep qemu_systemmode
<PID of a running process>
> pkill qemu_systemmode
> pgrep qemu_systemmode
<PID of a running process>
> pkill -9 qemu_systemmode
> pgrep qemu_systemmode
<No more output>


@vringar vringar added the bug Something isn't working label Apr 25, 2024
@tokatoka
Member

We do have a handler for ctrl-c.
For example, ctrl-c works on qemu_launcher.

@tokatoka
Member

Fuzzer-respawner: Storing state in crashed fuzzer instance did not work, no point to spawn the next client! This can happen if the child calls exit(), in that case make sure it uses abort(), if it got killed unrecoverable (OOM), or if there is a bug in the fuzzer itself. (Child exited with: -1)

This is more of a problem, as you should not see this log.
I suspect it's qemu that's intercepting the ctrl-c.

@tokatoka
Member

@rmalmain can you check?

@tokatoka tokatoka reopened this Apr 29, 2024
@tokatoka
Member

I noticed one issue with our signal handling; I'll fix it soon.

@tokatoka
Member

Can you try if #2124 fixes your issue?

@vringar
Contributor Author

vringar commented Apr 30, 2024

I just saw that #2124 got merged into main, and the child processes now react correctly to getting SIGTERMed. However, the propagation from parent to child still doesn't seem to work.

@tokatoka
Member

tokatoka commented Apr 30, 2024

Can you explain what you mean by "However, the propagation from parent to child still doesn't seem to work."?

What we do is

  1. The child process, on receiving SIGINT or SIGTERM, exits with exit code 100.
  2. The restarter parent process ignores SIGINT.
  3. The parent checks the child's exit code; if it is 100, the parent exits too.

Note that if you run it on N cores, for example, you will see 2N + 1 processes.
The very first process is the Launcher process,
and for each client you have one restarter (which I call the parent) and one client (which I call the child).
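
The exit-code handshake and the process count described above can be sketched as follows. This is a minimal illustration under stated assumptions, not LibAFL's actual code: `CTRL_C_EXIT`, `Next`, `decide`, and `process_count` are hypothetical names, the value 100 is taken from the comment above, and `sh -c "exit 100"` merely stands in for a client that was stopped by the user. Treating every other exit code as "respawn" is a simplification.

```rust
use std::process::Command;

/// Exit code a client uses to report "stopped by user" (value from the comment above).
const CTRL_C_EXIT: i32 = 100;

/// What a restarter does once its client has exited (hypothetical type).
#[derive(Debug, PartialEq)]
enum Next {
    Stop,
    Respawn,
}

/// Stop only when the client reported a user-requested shutdown.
/// `None` means the client was terminated by a signal and has no exit code.
fn decide(exit_code: Option<i32>) -> Next {
    match exit_code {
        Some(CTRL_C_EXIT) => Next::Stop,
        _ => Next::Respawn,
    }
}

/// One launcher, plus one restarter and one client per core.
fn process_count(cores: usize) -> usize {
    2 * cores + 1
}

fn main() {
    // Stand-in for a client that exits with the "stopped by user" code.
    let status = Command::new("sh")
        .args(["-c", "exit 100"])
        .status()
        .expect("failed to run sh");
    println!("restarter decision: {:?}", decide(status.code()));
    println!("processes on 4 cores: {}", process_count(4));
}
```

With this scheme a restarter never needs to handle SIGINT itself: it only has to wait for its client and look at the exit code.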

For me, as far as I tried with qemu_launcher:

  1. SIGINT in the terminal -> all processes are killed (because everybody is in the foreground)
  2. SIGINT to the launcher -> all processes are killed (I don't remember how the launcher works, but this looks fine)
  3. SIGINT to (one of) the parents -> nothing happens (I coded it to behave this way, but maybe this is what you are talking about?)
  4. SIGINT to (one of) the children -> the child then dies, and the corresponding parent becomes a zombie process.

What exactly do you want to propagate from the parent to the child? Won't the signal be sent to all the foreground processes in the same process group (assuming you press ctrl-c)?
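
Whether a process sees the terminal's ctrl-c depends on its process group, so it can help to check which group each process is in. The following Linux-only sketch is an illustration, not a diagnosis of this issue: it spawns two `sleep` children, one inheriting the parent's group and one moved into its own group, and reads each process group ID from `/proc/<pid>/stat`. `pgid_of` and `spawn_sleep` are hypothetical helper names.

```rust
use std::fs;
use std::os::unix::process::CommandExt;
use std::process::{Child, Command};

/// Read the process group ID of `pid` from /proc/<pid>/stat (Linux only).
fn pgid_of(pid: u32) -> Option<u32> {
    let stat = fs::read_to_string(format!("/proc/{pid}/stat")).ok()?;
    // Fields after the closing ')' of the command name: state, ppid, pgrp, ...
    let rest = stat.rsplit_once(')')?.1;
    rest.split_whitespace().nth(2)?.parse().ok()
}

/// Spawn `sleep 5`, optionally as the leader of a new process group.
fn spawn_sleep(own_group: bool) -> std::io::Result<Child> {
    let mut cmd = Command::new("sleep");
    cmd.arg("5");
    if own_group {
        // 0 means "use the child's own PID as its process group ID".
        cmd.process_group(0);
    }
    cmd.spawn()
}

fn main() -> std::io::Result<()> {
    let mut shared = spawn_sleep(false)?;
    let mut detached = spawn_sleep(true)?;
    println!("this process:   pgid {:?}", pgid_of(std::process::id()));
    println!("shared child:   pgid {:?}", pgid_of(shared.id()));
    println!(
        "detached child: pid {} pgid {:?}",
        detached.id(),
        pgid_of(detached.id())
    );
    // Clean up both children.
    for child in [&mut shared, &mut detached] {
        child.kill()?;
        child.wait()?;
    }
    Ok(())
}
```

A child that shares the foreground process group receives the terminal's SIGINT directly; a child in a different group does not, and then relies on some other process to forward the signal or stop it.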

@tokatoka
Member

also can you give me an example to reproduce other than qemu_systemmode?
i don't want to install arm stuff

@vringar
Contributor Author

vringar commented May 2, 2024

Okay, I am an absolute novice, so all I was aware of was the Launcher process, which I called parent and that there were new processes getting spawned which I called "child". I was unaware of the existence of the restarter.

From here on, I will use your terminology, because it is more precise.
From my observation, when running cargo make run in fuzzers/qemu_systemmode, I see that upon pressing ctrl+c the Launcher process exits, but that the parent and child keep on running. A pkill qemu_systemmode then kills the child, and pkill -9 qemu_systemmode finally kills the parent.

I'm unsure how cargo make handles the multiple processes. What I have seen in other runners is:

  1. All subprocesses receive the SIGINT (This would match your 1.)
  2. Only the Launcher as the initial process receives the signal (This would match your 2.)
  3. The Launcher gets SIGKILLED and the children survive

As I see the "Fuzzing stopped by user. Good bye." message, I can assume that 3. is not happening, and as the child processes respect the SIGINT, I assume 1. isn't happening either.
This would suggest that

SIGINT to the launcher -> all processes are killed (I don't remember how the launcher works, but this looks fine)

is not working as expected for me.

also can you give me an example to reproduce other than qemu_systemmode

I'll look into finding a better reproducer. Thanks for taking the time to debug this with me.

@tokatoka
Member

tokatoka commented May 2, 2024

Wait... I think you don't have the fork feature enabled, right?
Then SIGINT is ignored for both the child and the parent.

@vringar
Contributor Author

vringar commented May 2, 2024

I'm using this unmodified Cargo.toml and since fork is in the default features and the default features aren't disabled, I'm assuming this is using fork mode.

@tokatoka
Member

tokatoka commented May 2, 2024

Can you try #2132 ?

@vringar
Contributor Author

vringar commented May 2, 2024

Just tried it, but unfortunately, I'm still getting the same behaviour as described an hour ago.
(I should mention that this is still much nicer than the initial behaviour, where you had to press ctrl+c three times to regain control over the terminal so #2124 was definitely a massive improvement)

@tokatoka
Member

tokatoka commented May 2, 2024

For sure something (something outside libafl) is overriding the SIGINT handler then.
What you can do is attach to the child process, send the signal, and see if the SIGINT handler is called.
The next thing you can do is break at signal() (or sigaction()) and see who is calling it besides libafl.
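
Before attaching a debugger, a quick check on Linux is to look at the signal masks in `/proc/<pid>/status`: `SigIgn` and `SigCgt` are hexadecimal bitmasks in which bit n-1 stands for signal n. The sketch below is a hedged illustration of that check; `Disposition` and `disposition` are hypothetical names, and the signal numbers 2 (SIGINT) and 15 (SIGTERM) are the Linux values.

```rust
use std::{env, fs};

/// How a process currently treats one signal, according to /proc/<pid>/status.
#[derive(Debug, PartialEq)]
enum Disposition {
    Ignored,
    Caught,
    Default,
}

/// Look up signal `signum` (1-based) in the SigIgn and SigCgt masks of a
/// /proc/<pid>/status dump. Returns None if a mask line is missing.
fn disposition(status: &str, signum: u32) -> Option<Disposition> {
    // Find the line starting with `key` and parse the rest as a hex mask.
    let mask = |key: &str| -> Option<u64> {
        let line = status.lines().find(|l| l.starts_with(key))?;
        u64::from_str_radix(line[key.len()..].trim(), 16).ok()
    };
    let bit = 1u64 << (signum - 1);
    if (mask("SigIgn:")? & bit) != 0 {
        Some(Disposition::Ignored)
    } else if (mask("SigCgt:")? & bit) != 0 {
        Some(Disposition::Caught)
    } else {
        Some(Disposition::Default)
    }
}

fn main() {
    // Pass a PID as the first argument; without one, inspect this process.
    let pid = env::args().nth(1).unwrap_or_else(|| "self".to_string());
    let status = fs::read_to_string(format!("/proc/{pid}/status"))
        .expect("cannot read /proc status");
    println!("SIGINT:  {:?}", disposition(&status, 2));
    println!("SIGTERM: {:?}", disposition(&status, 15));
}
```

If SIGINT shows up as `Ignored` or `Default` in a client that is expected to handle it, something replaced the handler after it was installed, which is the situation the breakpoint approach above is meant to track down.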

@tokatoka
Member

tokatoka commented May 2, 2024

I cannot reproduce what you said

toka@toka:~/LibAFL/fuzzers/qemu_systemmode$ cargo make run
[cargo-make] INFO - cargo make 0.37.9
[cargo-make] INFO - Calling cargo metadata to extract project info
[cargo-make] INFO - Cargo metadata done
[cargo-make] INFO - Project: qemu_systemmode
[cargo-make] INFO - Build File: Makefile.toml
[cargo-make] INFO - Task: run
[cargo-make] INFO - Profile: development
[cargo-make] INFO - Running Task: legacy-migration
[cargo-make] INFO - Execute Command: "cargo" "make" "-e" "FEATURE=classic" "-e" "TARGET_DEFINE=TARGET_CLASSIC" "run_fuzzer"
[cargo-make][1] INFO - Calling cargo metadata to extract project info
[cargo-make][1] INFO - Cargo metadata done
[cargo-make][1] INFO - Project: qemu_systemmode
[cargo-make][1] INFO - Build File: Makefile.toml
[cargo-make][1] INFO - Task: run_fuzzer
[cargo-make][1] INFO - Profile: development
[cargo-make][1] INFO - Skipping Task: legacy-migration
[cargo-make][1] INFO - Skipping Task: target_dir
[cargo-make][1] INFO - Execute Command: "arm-none-eabi-gcc" "-ggdb" "-ffreestanding" "-nostartfiles" "-lgcc" "-T" "/home/toka/LibAFL/fuzzers/qemu_systemmode/example/mps2_m3.ld" "-mcpu=cortex-m3" "/home/toka/LibAFL/fuzzers/qemu_systemmode/example/main.c" "/home/toka/LibAFL/fuzzers/qemu_systemmode/example/startup.c" "-D" "TARGET_CLASSIC" "-I" "/home/toka/LibAFL/fuzzers/qemu_systemmode/target/classic/release/include" "-o" "/home/toka/LibAFL/fuzzers/qemu_systemmode/target/classic/example.elf"
[cargo-make][1] INFO - Execute Command: "/home/toka/LibAFL/fuzzers/qemu_systemmode/target/classic/release/qemu_systemmode" "-icount" "shift=auto,align=off,sleep=off" "-machine" "mps2-an385" "-monitor" "null" "-kernel" "/home/toka/LibAFL/fuzzers/qemu_systemmode/target/classic/example.elf" "-serial" "null" "-nographic" "-snapshot" "-drive" "if=none,format=qcow2,file=/home/toka/LibAFL/fuzzers/qemu_systemmode/target/classic/dummy.qcow2" "-S"
FUZZ_INPUT @ 0x29c
main address = 0x136
Breakpoint address = 0x78
Devices = ["timer", "cpu_common", "cpu", "armv7m_nvic", "armv7m_systick", "armv7m", "or-irq", "cmsdk-apb-uart", "cmsdk-apb-uart", "cmsdk-apb-uart", "cmsdk-apb-uart", "cmsdk-apb-uart", "cmsdk-apb-timer", "cmsdk-apb-timer", "cmsdk-apb-dualtimer", "cmsdk-apb-watchdog", "led", "led", "led", "led", "led", "led", "led", "led", "mps2-scc", "led", "led", "mps2-fpgaio", "pl022_ssp", "or-irq", "pl022_ssp", "pl022_ssp", "or-irq", "pl022_ssp", "pl022_ssp", "i2c_bus", "i2c_bus", "i2c_bus", "i2c_bus", "lan9118"]
[UserStats   #1]  (GLOBAL) run time: 0h-0m-0s, clients: 1, corpus: 0, objectives: 0, executions: 0, exec/sec: 0.000, edges: 100.000%
                  (CLIENT) corpus: 0, objectives: 0, executions: 0, exec/sec: 0.000, edges: 12/12 (100%)
[Testcase    #1]  (GLOBAL) run time: 0h-0m-0s, clients: 1, corpus: 1, objectives: 0, executions: 1, exec/sec: 0.000, edges: 100.000%
                  (CLIENT) corpus: 1, objectives: 0, executions: 1, exec/sec: 0.000, edges: 12/12 (100%)
We imported 2 inputs from disk.
[UserStats   #1]  (GLOBAL) run time: 0h-0m-0s, clients: 1, corpus: 1, objectives: 0, executions: 1, exec/sec: 0.000, edges: 100.000%
                  (CLIENT) corpus: 1, objectives: 0, executions: 1, exec/sec: 0.000, edges: 16/16 (100%)
[Testcase    #1]  (GLOBAL) run time: 0h-0m-0s, clients: 1, corpus: 2, objectives: 0, executions: 2, exec/sec: 0.000, edges: 100.000%
                  (CLIENT) corpus: 2, objectives: 0, executions: 2, exec/sec: 0.000, edges: 16/16 (100%)
[UserStats   #1]  (GLOBAL) run time: 0h-0m-0s, clients: 1, corpus: 2, objectives: 0, executions: 2, exec/sec: 0.000, edges: 100.000%
                  (CLIENT) corpus: 2, objectives: 0, executions: 2, exec/sec: 0.000, edges: 16/16 (100%)
[Testcase    #1]  (GLOBAL) run time: 0h-0m-0s, clients: 1, corpus: 3, objectives: 0, executions: 3, exec/sec: 0.000, edges: 100.000%
                  (CLIENT) corpus: 3, objectives: 0, executions: 3, exec/sec: 0.000, edges: 16/16 (100%)
[UserStats   #1]  (GLOBAL) run time: 0h-0m-0s, clients: 1, corpus: 3, objectives: 0, executions: 3, exec/sec: 0.000, edges: 100.000%
                  (CLIENT) corpus: 3, objectives: 0, executions: 3, exec/sec: 0.000, edges: 16/16 (100%)
[Testcase    #1]  (GLOBAL) run time: 0h-0m-0s, clients: 1, corpus: 4, objectives: 0, executions: 4, exec/sec: 0.000, edges: 100.000%
                  (CLIENT) corpus: 4, objectives: 0, executions: 4, exec/sec: 0.000, edges: 16/16 (100%)
^Cqemu_systemmode: terminating on signal 2
Fuzzing stopped by user. Good bye.

toka@toka:~/LibAFL/fuzzers/qemu_systemmode$ [2024-05-02T13:27:47Z ERROR libafl::events::llmp] Connection refused.
thread 'main' panicked at /home/toka/LibAFL/libafl/src/events/llmp.rs:1602:21:
Fuzzer-respawner: Storing state in crashed fuzzer instance did not work, no point to spawn the next client! This can happen if the child calls `exit()`, in that case make sure it uses `abort()`, if it got killed unrecoverable (OOM), or if there is a bug in the fuzzer itself. (Child exited with: 0)
stack backtrace:
   0:     0x5796ea582cbf - std::backtrace_rs::backtrace::libunwind::trace::h516a98be21ea1e9d
                               at /rustc/ccfcd950b333fed046275dd8d54fe736ca498aa7/library/std/src/../../backtrace/src/backtrace/libunwind.rs:105:5
   1:     0x5796ea582cbf - std::backtrace_rs::backtrace::trace_unsynchronized::h0beee091dd45e212
                               at /rustc/ccfcd950b333fed046275dd8d54fe736ca498aa7/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5
   2:     0x5796ea582cbf - std::sys_common::backtrace::_print_fmt::hd6e747bb9d3f708b
                               at /rustc/ccfcd950b333fed046275dd8d54fe736ca498aa7/library/std/src/sys_common/backtrace.rs:68:5
   3:     0x5796ea582cbf - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::h88bd8885dd8bc971
                               at /rustc/ccfcd950b333fed046275dd8d54fe736ca498aa7/library/std/src/sys_common/backtrace.rs:44:22
   4:     0x5796ea36094b - core::fmt::rt::Argument::fmt::h0a98b623e411d353
                               at /rustc/ccfcd950b333fed046275dd8d54fe736ca498aa7/library/core/src/fmt/rt.rs:165:63
   5:     0x5796ea36094b - core::fmt::write::h05543e7a8f793da9
                               at /rustc/ccfcd950b333fed046275dd8d54fe736ca498aa7/library/core/src/fmt/mod.rs:1157:21
   6:     0x5796ea55b6a2 - std::io::Write::write_fmt::h32fe40906746efba
                               at /rustc/ccfcd950b333fed046275dd8d54fe736ca498aa7/library/std/src/io/mod.rs:1832:15
   7:     0x5796ea5886a9 - std::sys_common::backtrace::_print::h18bd86a960ce0efd
                               at /rustc/ccfcd950b333fed046275dd8d54fe736ca498aa7/library/std/src/sys_common/backtrace.rs:47:5
   8:     0x5796ea5886a9 - std::sys_common::backtrace::print::h46870d6ea993b433
                               at /rustc/ccfcd950b333fed046275dd8d54fe736ca498aa7/library/std/src/sys_common/backtrace.rs:34:9
   9:     0x5796ea587ece - std::panicking::default_hook::{{closure}}::h1eb28bb3c5a81eb1
                               at /rustc/ccfcd950b333fed046275dd8d54fe736ca498aa7/library/std/src/panicking.rs:271:22
  10:     0x5796ea5879e9 - std::panicking::default_hook::h0776def55f4233d2
                               at /rustc/ccfcd950b333fed046275dd8d54fe736ca498aa7/library/std/src/panicking.rs:291:9
  11:     0x5796ea588c34 - std::panicking::rust_panic_with_hook::h7775800bf8e4e70f
                               at /rustc/ccfcd950b333fed046275dd8d54fe736ca498aa7/library/std/src/panicking.rs:788:13
  12:     0x5796ea588a14 - std::panicking::begin_panic_handler::{{closure}}::h09c5c8c95aa35ba4
                               at /rustc/ccfcd950b333fed046275dd8d54fe736ca498aa7/library/std/src/panicking.rs:657:13
  13:     0x5796ea588969 - std::sys_common::backtrace::__rust_end_short_backtrace::hecf6df049235318b
                               at /rustc/ccfcd950b333fed046275dd8d54fe736ca498aa7/library/std/src/sys_common/backtrace.rs:171:18
  14:     0x5796ea588956 - rust_begin_unwind
                               at /rustc/ccfcd950b333fed046275dd8d54fe736ca498aa7/library/std/src/panicking.rs:645:5
  15:     0x5796ea286c85 - core::panicking::panic_fmt::hc9686837370900a4
                               at /rustc/ccfcd950b333fed046275dd8d54fe736ca498aa7/library/core/src/panicking.rs:72:14
  16:     0x5796ea2bbd12 - libafl::events::llmp::RestartingMgr<EMH,MT,S,SP>::launch::h53ef78917c0968dd
  17:     0x5796ea2b1ad2 - libafl::events::launcher::Launcher<CF,EMH,MT,S,SP>::launch_with_hooks::hf8019bce1a975f8b
                               at /home/toka/LibAFL/libafl/src/events/launcher.rs:270:44
  18:     0x5796ea2b1ad2 - libafl::events::launcher::Launcher<CF,(),MT,S,SP>::launch::h347a78eb6c19c3b7
                               at /home/toka/LibAFL/libafl/src/events/launcher.rs:169:9
  19:     0x5796ea2b1ad2 - qemu_systemmode::fuzzer_classic::fuzz::h9f1207446724cee0
                               at /home/toka/LibAFL/fuzzers/qemu_systemmode/src/fuzzer_classic.rs:261:11
  20:     0x5796ea2b1ad2 - qemu_systemmode::main::h7051ccdd482a4712
                               at /home/toka/LibAFL/fuzzers/qemu_systemmode/src/main.rs:14:5
  21:     0x5796ea2ad823 - core::ops::function::FnOnce::call_once::hf6e5f96cfc3e1006
                               at /rustc/ccfcd950b333fed046275dd8d54fe736ca498aa7/library/core/src/ops/function.rs:250:5
  22:     0x5796ea2ad823 - std::sys_common::backtrace::__rust_begin_short_backtrace::h7db4db98a6907f8a
                               at /rustc/ccfcd950b333fed046275dd8d54fe736ca498aa7/library/std/src/sys_common/backtrace.rs:155:18
  23:     0x5796ea2adbe2 - main
  24:     0x707ac3a29d90 - __libc_start_call_main
                               at ./csu/../sysdeps/nptl/libc_start_call_main.h:58:16
  25:     0x707ac3a29e40 - __libc_start_main_impl
                               at ./csu/../csu/libc-start.c:392:3
  26:     0x5796ea2a4045 - _start
  27:                0x0 - <unknown>
^C
toka@toka:~/LibAFL/fuzzers/qemu_systemmode$ ps ax | grep qemu
 335559 pts/0    S+     0:00 grep --color=auto qemu

For me, after I run cargo make run and press ctrl-c twice,
every qemu process is killed, so there's no issue.

@tokatoka tokatoka closed this as completed May 2, 2024
@tokatoka
Member

tokatoka commented May 3, 2024

I think that with #2133 we now have proper exit handling and have removed this stack trace.
