-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Next/20200407/v3 #4785
Merged
Merged
Next/20200407/v3 #4785
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Also add tests for parsing them.
Make it easy to compare 'struct timeval's and get their difference.
…d by_both. The only difference between threshold calculations for by_src/by_dst, by_rule or by_both is which table stores the DetectThresholdEntry. Refactor the ThresholdHandlePacket* functions to do table lookup and storage individually, but calculate thresholds in a common function.
…hreshold table. Ensure that the by_rule threshold table is initialized if a rule is thresholded by_rule. Replace manual table reallocaton with calls to the common function.
Fixes runs with --enable-debug-validation. The target did not init a packet pool, so for a tunnel packet would try to get a packet from an uninitialized pool. In non-debug mode, this silently works by falling back to a packet from alloc. (gdb) bt #0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51 #1 0x00007ffff35a6801 in __GI_abort () at abort.c:79 #2 0x00007ffff359639a in __assert_fail_base (fmt=0x7ffff371d7d8 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x555557fe7260 "!(pool->initialized == 0)", file=file@entry=0x555557fe7220 "tmqh-packetpool.c", line=line@entry=253, function=function@entry=0x555557fe7500 <__PRETTY_FUNCTION__.21181> "PacketPoolGetPacket") at assert.c:92 #3 0x00007ffff3596412 in __GI___assert_fail (assertion=0x555557fe7260 "!(pool->initialized == 0)", file=0x555557fe7220 "tmqh-packetpool.c", line=253, function=0x555557fe7500 <__PRETTY_FUNCTION__.21181> "PacketPoolGetPacket") at assert.c:101 #4 0x00005555577e24be in PacketPoolGetPacket () at tmqh-packetpool.c:253 #5 0x0000555556914ecd in PacketGetFromQueueOrAlloc () at decode.c:183 #6 0x00005555569161e1 in PacketTunnelPktSetup (tv=0x555559863980 <tv>, dtv=0x614000068e40, parent=0x61e0000fc080, pkt=0x61e0000fc470 "LL", len=72, proto=DECODE_TUNNEL_IPV4) at decode.c:286 #7 0x00005555569de694 in DecodeIPv4inIPv6 (tv=0x555559863980 <tv>, dtv=0x614000068e40, p=0x61e0000fc080, pkt=0x61e0000fc470 "LL", plen=72) at decode-ipv6.c:59 #8 0x00005555569e60b5 in DecodeIPV6ExtHdrs (tv=0x555559863980 <tv>, dtv=0x614000068e40, p=0x61e0000fc080, pkt=0x61e0000fc470 "LL", len=112) at decode-ipv6.c:522 #9 0x00005555569e846f in DecodeIPV6 (tv=0x555559863980 <tv>, dtv=0x614000068e40, p=0x61e0000fc080, pkt=0x61e0000fc420 "cLL", len=255) at decode-ipv6.c:641 #10 0x0000555556a032f9 in DecodeRaw (tv=0x555559863980 <tv>, dtv=0x614000068e40, p=0x61e0000fc080, pkt=0x61e0000fc420 "cLL", len=255) at decode-raw.c:70 #11 0x0000555557659ba8 in DecodePcapFile (tv=0x555559863980 <tv>, p=0x61e0000fc080, data=0x614000068e40) at source-pcap-file.c:412 #12 0x0000555556573401 in LLVMFuzzerTestOneInput (data=0x613000000047 "\241\262\315\064", size=339) at tests/fuzz/fuzz_sigpcap.c:158 #13 0x0000555557a4dc66 in main (argc=2, argv=0x7fffffffdfa8) at tests/fuzz/onefile.c:51 That line: BUG_ON(pool->initialized == 0);
3 tasks
victorjulien
added a commit
to victorjulien/suricata
that referenced
this pull request
Oct 31, 2021
The Suricata AF_PACKET code opens a socket per thread, then after some minor setup enters a loop where the socket is poll()'d with a timeout. When the poll() call returns a non zero positive value, the AF_PACKET ring will be processed. The ringbuffer processing logic has a pointer into the ring where we last checked the ring. From this position we will inspect each frame until we find a frame with tp_status == TP_STATUS_KERNEL (so essentially 0). This means the frame is currently owned by the kernel. There is a special case handling for starting the ring processing but finding a TP_STATUS_KERNEL immediately. This logic then skip to the next frame, rerun the check, etc until it either finds an initialized frame or the last frame of the ringbuffer. The problem was, however, that the initial uninitialized frame was possibly (likely?) still being initialized by the kernel. A data race between the notification through the socket (the poll()) and the updating of the `tp_status` field in the frame could lead to a valid frame getting skipped. Of note is that for example libpcap does not do frame scanning. Instead it simply exits it ring processing loop. Also interesting is that libpcap uses atomic loads and stores on the tp_status field. This skipping of frames had 2 bad side effects: 1. in most cases, the buffer would be full enough that the frame would be processed in the next pass of the ring, but now the frame would out of order. This might have lead to packets belong to the same flow getting processed in the wrong order. 2. more severe is the soft lockup case. The skipped frame sits at ring buffer index 0. The rest of the ring has been cleared, after the initial frame was skipped. As our pass of the ring stops at the end of the ring (ptv->frame_offset + 1 == ptv->req.v2.tp_frame_nr) the code exits the ring processing loop at goes back to poll(). However, poll() will not indicate that there is more data, as the stale frame in the ring blocks the kernel from populating more frames beyond it. This is now a dead lock, as the kernel waits for Suricata and Suricata never touches the ring until it hears from the kernel. The scan logic will scan the whole ring at most once, so it won't reconsider the stale frame either. This patch addresses the issues in several ways: 1. the startup "discard" logic was fixed to not skip over kernel frames. Doing so would get us in a bad state at start up. 2. Instead of scanning the ring, we now enter a busy wait loop when encountering a kernel frame where we didn't expect one. This means that if we got a > 0 poll() result, we'll busy wait until we get at least one frame. 3. Error handling is unified and cleaned up. Any frame error now returns the frame to the kernel and progresses the frame pointer. 4. If we find a frame that is owned by us (TP_STATUS_USER_BUSY) we yield to poll() immediately, as the next expected status of that frame is TP_STATUS_KERNEL. 5. the ring is no longer processed until the "end" of the ring (so highest index), but instead we process at most one full ring size per run. 6. Work with a copy of `tp_status` instead of accessing original touched also by the kernel. Bug: OISF#4785.
victorjulien
added a commit
to victorjulien/suricata
that referenced
this pull request
Nov 1, 2021
The Suricata AF_PACKET code opens a socket per thread, then after some minor setup enters a loop where the socket is poll()'d with a timeout. When the poll() call returns a non zero positive value, the AF_PACKET ring will be processed. The ringbuffer processing logic has a pointer into the ring where we last checked the ring. From this position we will inspect each frame until we find a frame with tp_status == TP_STATUS_KERNEL (so essentially 0). This means the frame is currently owned by the kernel. There is a special case handling for starting the ring processing but finding a TP_STATUS_KERNEL immediately. This logic then skip to the next frame, rerun the check, etc until it either finds an initialized frame or the last frame of the ringbuffer. The problem was, however, that the initial uninitialized frame was possibly (likely?) still being initialized by the kernel. A data race between the notification through the socket (the poll()) and the updating of the `tp_status` field in the frame could lead to a valid frame getting skipped. Of note is that for example libpcap does not do frame scanning. Instead it simply exits it ring processing loop. Also interesting is that libpcap uses atomic loads and stores on the tp_status field. This skipping of frames had 2 bad side effects: 1. in most cases, the buffer would be full enough that the frame would be processed in the next pass of the ring, but now the frame would out of order. This might have lead to packets belong to the same flow getting processed in the wrong order. 2. more severe is the soft lockup case. The skipped frame sits at ring buffer index 0. The rest of the ring has been cleared, after the initial frame was skipped. As our pass of the ring stops at the end of the ring (ptv->frame_offset + 1 == ptv->req.v2.tp_frame_nr) the code exits the ring processing loop at goes back to poll(). However, poll() will not indicate that there is more data, as the stale frame in the ring blocks the kernel from populating more frames beyond it. This is now a dead lock, as the kernel waits for Suricata and Suricata never touches the ring until it hears from the kernel. The scan logic will scan the whole ring at most once, so it won't reconsider the stale frame either. This patch addresses the issues in several ways: 1. the startup "discard" logic was fixed to not skip over kernel frames. Doing so would get us in a bad state at start up. 2. Instead of scanning the ring, we now enter a busy wait loop when encountering a kernel frame where we didn't expect one. This means that if we got a > 0 poll() result, we'll busy wait until we get at least one frame. 3. Error handling is unified and cleaned up. Any frame error now returns the frame to the kernel and progresses the frame pointer. 4. If we find a frame that is owned by us (TP_STATUS_USER_BUSY) we yield to poll() immediately, as the next expected status of that frame is TP_STATUS_KERNEL. 5. the ring is no longer processed until the "end" of the ring (so highest index), but instead we process at most one full ring size per run. 6. Work with a copy of `tp_status` instead of accessing original touched also by the kernel. Bug: OISF#4785.
victorjulien
added a commit
to victorjulien/suricata
that referenced
this pull request
Nov 2, 2021
The Suricata AF_PACKET code opens a socket per thread, then after some minor setup enters a loop where the socket is poll()'d with a timeout. When the poll() call returns a non zero positive value, the AF_PACKET ring will be processed. The ringbuffer processing logic has a pointer into the ring where we last checked the ring. From this position we will inspect each frame until we find a frame with tp_status == TP_STATUS_KERNEL (so essentially 0). This means the frame is currently owned by the kernel. There is a special case handling for starting the ring processing but finding a TP_STATUS_KERNEL immediately. This logic then skip to the next frame, rerun the check, etc until it either finds an initialized frame or the last frame of the ringbuffer. The problem was, however, that the initial uninitialized frame was possibly (likely?) still being initialized by the kernel. A data race between the notification through the socket (the poll()) and the updating of the `tp_status` field in the frame could lead to a valid frame getting skipped. Of note is that for example libpcap does not do frame scanning. Instead it simply exits it ring processing loop. Also interesting is that libpcap uses atomic loads and stores on the tp_status field. This skipping of frames had 2 bad side effects: 1. in most cases, the buffer would be full enough that the frame would be processed in the next pass of the ring, but now the frame would out of order. This might have lead to packets belong to the same flow getting processed in the wrong order. 2. more severe is the soft lockup case. The skipped frame sits at ring buffer index 0. The rest of the ring has been cleared, after the initial frame was skipped. As our pass of the ring stops at the end of the ring (ptv->frame_offset + 1 == ptv->req.v2.tp_frame_nr) the code exits the ring processing loop at goes back to poll(). However, poll() will not indicate that there is more data, as the stale frame in the ring blocks the kernel from populating more frames beyond it. This is now a dead lock, as the kernel waits for Suricata and Suricata never touches the ring until it hears from the kernel. The scan logic will scan the whole ring at most once, so it won't reconsider the stale frame either. This patch addresses the issues in several ways: 1. the startup "discard" logic was fixed to not skip over kernel frames. Doing so would get us in a bad state at start up. 2. Instead of scanning the ring, we now enter a busy wait loop when encountering a kernel frame where we didn't expect one. This means that if we got a > 0 poll() result, we'll busy wait until we get at least one frame. 3. Error handling is unified and cleaned up. Any frame error now returns the frame to the kernel and progresses the frame pointer. 4. If we find a frame that is owned by us (TP_STATUS_USER_BUSY) we yield to poll() immediately, as the next expected status of that frame is TP_STATUS_KERNEL. 5. the ring is no longer processed until the "end" of the ring (so highest index), but instead we process at most one full ring size per run. 6. Work with a copy of `tp_status` instead of accessing original touched also by the kernel. Bug: OISF#4785.
victorjulien
added a commit
to victorjulien/suricata
that referenced
this pull request
Nov 5, 2021
The Suricata AF_PACKET code opens a socket per thread, then after some minor setup enters a loop where the socket is poll()'d with a timeout. When the poll() call returns a non zero positive value, the AF_PACKET ring will be processed. The ringbuffer processing logic has a pointer into the ring where we last checked the ring. From this position we will inspect each frame until we find a frame with tp_status == TP_STATUS_KERNEL (so essentially 0). This means the frame is currently owned by the kernel. There is a special case handling for starting the ring processing but finding a TP_STATUS_KERNEL immediately. This logic then skip to the next frame, rerun the check, etc until it either finds an initialized frame or the last frame of the ringbuffer. The problem was, however, that the initial uninitialized frame was possibly (likely?) still being initialized by the kernel. A data race between the notification through the socket (the poll()) and the updating of the `tp_status` field in the frame could lead to a valid frame getting skipped. Of note is that for example libpcap does not do frame scanning. Instead it simply exits it ring processing loop. Also interesting is that libpcap uses atomic loads and stores on the tp_status field. This skipping of frames had 2 bad side effects: 1. in most cases, the buffer would be full enough that the frame would be processed in the next pass of the ring, but now the frame would out of order. This might have lead to packets belong to the same flow getting processed in the wrong order. 2. more severe is the soft lockup case. The skipped frame sits at ring buffer index 0. The rest of the ring has been cleared, after the initial frame was skipped. As our pass of the ring stops at the end of the ring (ptv->frame_offset + 1 == ptv->req.v2.tp_frame_nr) the code exits the ring processing loop at goes back to poll(). However, poll() will not indicate that there is more data, as the stale frame in the ring blocks the kernel from populating more frames beyond it. This is now a dead lock, as the kernel waits for Suricata and Suricata never touches the ring until it hears from the kernel. The scan logic will scan the whole ring at most once, so it won't reconsider the stale frame either. This patch addresses the issues in several ways: 1. the startup "discard" logic was fixed to not skip over kernel frames. Doing so would get us in a bad state at start up. 2. Instead of scanning the ring, we now enter a busy wait loop when encountering a kernel frame where we didn't expect one. This means that if we got a > 0 poll() result, we'll busy wait until we get at least one frame. 3. Error handling is unified and cleaned up. Any frame error now returns the frame to the kernel and progresses the frame pointer. 4. If we find a frame that is owned by us (TP_STATUS_USER_BUSY) we yield to poll() immediately, as the next expected status of that frame is TP_STATUS_KERNEL. 5. the ring is no longer processed until the "end" of the ring (so highest index), but instead we process at most one full ring size per run. 6. Work with a copy of `tp_status` instead of accessing original touched also by the kernel. Bug: OISF#4785.
victorjulien
added a commit
to victorjulien/suricata
that referenced
this pull request
Nov 11, 2021
The Suricata AF_PACKET code opens a socket per thread, then after some minor setup enters a loop where the socket is poll()'d with a timeout. When the poll() call returns a non zero positive value, the AF_PACKET ring will be processed. The ringbuffer processing logic has a pointer into the ring where we last checked the ring. From this position we will inspect each frame until we find a frame with tp_status == TP_STATUS_KERNEL (so essentially 0). This means the frame is currently owned by the kernel. There is a special case handling for starting the ring processing but finding a TP_STATUS_KERNEL immediately. This logic then skip to the next frame, rerun the check, etc until it either finds an initialized frame or the last frame of the ringbuffer. The problem was, however, that the initial uninitialized frame was possibly (likely?) still being initialized by the kernel. A data race between the notification through the socket (the poll()) and the updating of the `tp_status` field in the frame could lead to a valid frame getting skipped. Of note is that for example libpcap does not do frame scanning. Instead it simply exits it ring processing loop. Also interesting is that libpcap uses atomic loads and stores on the tp_status field. This skipping of frames had 2 bad side effects: 1. in most cases, the buffer would be full enough that the frame would be processed in the next pass of the ring, but now the frame would out of order. This might have lead to packets belong to the same flow getting processed in the wrong order. 2. more severe is the soft lockup case. The skipped frame sits at ring buffer index 0. The rest of the ring has been cleared, after the initial frame was skipped. As our pass of the ring stops at the end of the ring (ptv->frame_offset + 1 == ptv->req.v2.tp_frame_nr) the code exits the ring processing loop at goes back to poll(). However, poll() will not indicate that there is more data, as the stale frame in the ring blocks the kernel from populating more frames beyond it. This is now a dead lock, as the kernel waits for Suricata and Suricata never touches the ring until it hears from the kernel. The scan logic will scan the whole ring at most once, so it won't reconsider the stale frame either. This patch addresses the issues in several ways: 1. the startup "discard" logic was fixed to not skip over kernel frames. Doing so would get us in a bad state at start up. 2. Instead of scanning the ring, we now enter a busy wait loop when encountering a kernel frame where we didn't expect one. This means that if we got a > 0 poll() result, we'll busy wait until we get at least one frame. 3. Error handling is unified and cleaned up. Any frame error now returns the frame to the kernel and progresses the frame pointer. 4. If we find a frame that is owned by us (TP_STATUS_USER_BUSY) we yield to poll() immediately, as the next expected status of that frame is TP_STATUS_KERNEL. 5. the ring is no longer processed until the "end" of the ring (so highest index), but instead we process at most one full ring size per run. 6. Work with a copy of `tp_status` instead of accessing original touched also by the kernel. Bug: OISF#4785.
victorjulien
added a commit
to victorjulien/suricata
that referenced
this pull request
Nov 12, 2021
The Suricata AF_PACKET code opens a socket per thread, then after some minor setup enters a loop where the socket is poll()'d with a timeout. When the poll() call returns a non zero positive value, the AF_PACKET ring will be processed. The ringbuffer processing logic has a pointer into the ring where we last checked the ring. From this position we will inspect each frame until we find a frame with tp_status == TP_STATUS_KERNEL (so essentially 0). This means the frame is currently owned by the kernel. There is a special case handling for starting the ring processing but finding a TP_STATUS_KERNEL immediately. This logic then skip to the next frame, rerun the check, etc until it either finds an initialized frame or the last frame of the ringbuffer. The problem was, however, that the initial uninitialized frame was possibly (likely?) still being initialized by the kernel. A data race between the notification through the socket (the poll()) and the updating of the `tp_status` field in the frame could lead to a valid frame getting skipped. Of note is that for example libpcap does not do frame scanning. Instead it simply exits it ring processing loop. Also interesting is that libpcap uses atomic loads and stores on the tp_status field. This skipping of frames had 2 bad side effects: 1. in most cases, the buffer would be full enough that the frame would be processed in the next pass of the ring, but now the frame would out of order. This might have lead to packets belong to the same flow getting processed in the wrong order. 2. more severe is the soft lockup case. The skipped frame sits at ring buffer index 0. The rest of the ring has been cleared, after the initial frame was skipped. As our pass of the ring stops at the end of the ring (ptv->frame_offset + 1 == ptv->req.v2.tp_frame_nr) the code exits the ring processing loop at goes back to poll(). However, poll() will not indicate that there is more data, as the stale frame in the ring blocks the kernel from populating more frames beyond it. This is now a dead lock, as the kernel waits for Suricata and Suricata never touches the ring until it hears from the kernel. The scan logic will scan the whole ring at most once, so it won't reconsider the stale frame either. This patch addresses the issues in several ways: 1. the startup "discard" logic was fixed to not skip over kernel frames. Doing so would get us in a bad state at start up. 2. Instead of scanning the ring, we now enter a busy wait loop when encountering a kernel frame where we didn't expect one. This means that if we got a > 0 poll() result, we'll busy wait until we get at least one frame. 3. Error handling is unified and cleaned up. Any frame error now returns the frame to the kernel and progresses the frame pointer. 4. If we find a frame that is owned by us (TP_STATUS_USER_BUSY) we yield to poll() immediately, as the next expected status of that frame is TP_STATUS_KERNEL. 5. the ring is no longer processed until the "end" of the ring (so highest index), but instead we process at most one full ring size per run. 6. Work with a copy of `tp_status` instead of accessing original touched also by the kernel. Bug: OISF#4785. (cherry picked from commit a022648)
victorjulien
added a commit
to victorjulien/suricata
that referenced
this pull request
Nov 12, 2021
The Suricata AF_PACKET code opens a socket per thread, then after some minor setup enters a loop where the socket is poll()'d with a timeout. When the poll() call returns a non zero positive value, the AF_PACKET ring will be processed. The ringbuffer processing logic has a pointer into the ring where we last checked the ring. From this position we will inspect each frame until we find a frame with tp_status == TP_STATUS_KERNEL (so essentially 0). This means the frame is currently owned by the kernel. There is a special case handling for starting the ring processing but finding a TP_STATUS_KERNEL immediately. This logic then skip to the next frame, rerun the check, etc until it either finds an initialized frame or the last frame of the ringbuffer. The problem was, however, that the initial uninitialized frame was possibly (likely?) still being initialized by the kernel. A data race between the notification through the socket (the poll()) and the updating of the `tp_status` field in the frame could lead to a valid frame getting skipped. Of note is that for example libpcap does not do frame scanning. Instead it simply exits it ring processing loop. Also interesting is that libpcap uses atomic loads and stores on the tp_status field. This skipping of frames had 2 bad side effects: 1. in most cases, the buffer would be full enough that the frame would be processed in the next pass of the ring, but now the frame would out of order. This might have lead to packets belong to the same flow getting processed in the wrong order. 2. more severe is the soft lockup case. The skipped frame sits at ring buffer index 0. The rest of the ring has been cleared, after the initial frame was skipped. As our pass of the ring stops at the end of the ring (ptv->frame_offset + 1 == ptv->req.v2.tp_frame_nr) the code exits the ring processing loop at goes back to poll(). However, poll() will not indicate that there is more data, as the stale frame in the ring blocks the kernel from populating more frames beyond it. This is now a dead lock, as the kernel waits for Suricata and Suricata never touches the ring until it hears from the kernel. The scan logic will scan the whole ring at most once, so it won't reconsider the stale frame either. This patch addresses the issues in several ways: 1. the startup "discard" logic was fixed to not skip over kernel frames. Doing so would get us in a bad state at start up. 2. Instead of scanning the ring, we now enter a busy wait loop when encountering a kernel frame where we didn't expect one. This means that if we got a > 0 poll() result, we'll busy wait until we get at least one frame. 3. Error handling is unified and cleaned up. Any frame error now returns the frame to the kernel and progresses the frame pointer. 4. If we find a frame that is owned by us (TP_STATUS_USER_BUSY) we yield to poll() immediately, as the next expected status of that frame is TP_STATUS_KERNEL. 5. the ring is no longer processed until the "end" of the ring (so highest index), but instead we process at most one full ring size per run. 6. Work with a copy of `tp_status` instead of accessing original touched also by the kernel. Bug: OISF#4785. (cherry picked from commit a022648)
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
#4760, rebased
fuzz sigpcap target improvement
PRScript output (if applicable):
Passed