Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pcap: make stoppable when no packets to sniff #256

Merged
merged 4 commits into from
Aug 8, 2017

Conversation

jcrussell
Copy link
Contributor

In newer versions of libpcap (1.8.1), the read timeout will not cause
the read to return if there are no packets to sniff. To fix this, I
changed the pacp handle to non-blocking and added a new helper,
pcap_wait, that uses select to wait up to the read timeout for
packets. I have tested that it has the desired behavior but I have not
benchmarked it to determine how much of a slowdown, if any, there is.

Fixes #253.

In newer versions of libpcap (1.8.1), the read timeout will not cause
the read to return if there are no packets to sniff. To fix this, I
changed the pacp handle to non-blocking and added a new helper,
pcap_wait, that uses `select` to wait up to the read timeout for
packets. I have tested that it has the desired behavior but I have not
benchmarked it to determine how much of a slowdown, if any, there is.

Fixes google#253.
pcap/pcap.go Outdated
closeMu sync.Mutex
// stop is closed by Handle.Close to signal to getNextBufPtrLocked to stop
// trying to read packets
stop chan bool
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of a stop channel here, let's use an atomic. From quick benchmarking, they appear to be around 4x faster:

package main

import (
        "sync/atomic"
        "testing"
)

func BenchmarkSelectBool(b *testing.B) {
        c := make(chan bool)
        for i := 0; i < b.N; i++ {
                select {
                case <-c:
                default:
                }
        }
}
func BenchmarkSelectStruct(b *testing.B) {
        c := make(chan struct{})
        for i := 0; i < b.N; i++ {
                select {
                case <-c:
                default:
                }
        }
}
func BenchmarkAtomic(b *testing.B) {
        x := uint64(0)
        for i := 0; i < b.N; i++ {
                if atomic.LoadUint64(&x) != 0 {
                        b.Fatal("oops")
                }
        }
}

gives this result:

gconnell@rockyduck:~$ go test select_test.go --test.bench=.
testing: warning: no tests to run
BenchmarkSelectBool-12          200000000                9.10 ns/op
BenchmarkSelectStruct-12        200000000                9.10 ns/op
BenchmarkAtomic-12              1000000000               2.51 ns/op
PASS
ok      command-line-arguments  8.305s

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, updated.

pcap/pcap.go Outdated
@@ -336,8 +379,27 @@ func (p *Handle) getNextBufPtrLocked(ci *gopacket.CaptureInfo) error {
}
var result NextError
for {
// test if we need to stop
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's do the call to pcap_next_ex FIRST in the loop, then fall back to the timeout only if we don't have a packet.

for atomic.LoadUint32(&p.stop) == 0 {
  result = NextError(pcap_next_ex(...))
  if (we should error) {
    return err
  } else if (we have packet) {
    return packet
  }
  pcap_wait(...)
}

This way, we don't call an extraneous (expensive) select syscall in between every packet, only if there isn't one already on the queue.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rewrote the core loop as you suggested.

pcap/pcap.go Outdated
default:
}

if p.timeout < 0 {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to be a little more self-documenting, let's say: if p.timeout == BlockForever

pcap/pcap.go Outdated
int fd;
struct timeval tv;

fd = pcap_get_selectable_fd(p);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to the man pages, pcap_get_selectable_fd() is not available on Windows.

Folks do use this for winpcap, so we'll need to hande that eventuality.

Copy link
Contributor Author

@jcrussell jcrussell Nov 18, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should I update the package documentation to note this? Or should we check runtime.GOOS and do something else?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If only things were that simple... :(

Breaking Windows users is a non-starter, so we need to write code that compiles for both Windows and others. Here, this probably means splitting the implementation up into different files and using https://golang.org/pkg/go/build/ build constraints to specify Windows vs. non-Windows code.

In short, we'll probably end up with three files: pcap.go, for common stuff. pcap_win.go, with the windows implementation of Windows-specific functionality. pcap_other.go with non-windows stuff.

It would have been really great if libpcap hadn't decided to do backwards-incompatible changes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added separate files for windows/non-windows. From the standard library, there was one example of a +build !windows and the filename was _unix.go instead of _other.go.

In the Windows version, I left the pcap handle in blocking mode -- it should work at least as well as the existing code.

pcap/pcap.go Outdated
if p.timeout < 0 {
C.pcap_wait(p.cptr, 0)
} else {
// need to wait less than the read timeout according to pcap
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to http://www.tcpdump.org/manpages/pcap_get_selectable_fd.3pcap.html:

select() and poll() do not work correctly on BPF devices; pcap_get_selectable_fd() will return a file descriptor on most of those versions (the exceptions being FreeBSD 4.3 and 4.4), but a simple select() or poll() will not indicate that the descriptor is readable until a full buffer's worth of packets is received, even if the read timeout expires before then. To work around this, an application that uses select() or poll() to wait for packets to arrive must put the pcap_t in non-blocking mode, and must arrange that the select() or poll() have a timeout less than or equal to the read timeout, and must try to read packets after that timeout expires, regardless of whether select() or poll() indicated that the file descriptor for the pcap_t is ready to be read or not. (That workaround will not work in FreeBSD 4.3 and later; however, in FreeBSD 4.6 and later, select() and poll() work correctly on BPF devices, so the workaround isn't necessary, although it does no harm.)

Does this mean that in FreeBSD 4.3-4.6, this code will fail to work? Or that it will fail to work in FreeBSD 4.3 on?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I interpret that as meaning that it won't work on FreeBSD 4.3-4.6.

Another thing to note in the package documentation? Is it worth trying to test?

Rewrite `getNextBufPtrLocked` to try to read a packet first before
calling `pcap_wait` which uses an expensive `select` syscall. Replaced
`stop` channel with atomic uint64.
Create pcap_windows and pcap_unix for windows and non-windows
functionality, respectively. libpcap does not support
pcap_get_selectable_fd on Windows so we cannot use it. In the best case,
this issue does not exist on Windows.
@gconnell
Copy link
Collaborator

Sorry to take so long on this, but if you're still up for this and want to merge up to head, I can take another look and see if we can push this through.

@avgerin0s
Copy link

Any news on this one?

@jcrussell
Copy link
Contributor Author

@gconnell: merged to master -- sorry for the delay.

@gconnell
Copy link
Collaborator

gconnell commented Aug 8, 2017

Hey, jcrussel, I still get "This branch cannot be rebased due to conflicts"... are you sure you sync'd to the most current master?

@gconnell
Copy link
Collaborator

gconnell commented Aug 8, 2017

Oh, it looks like you just may not have pushed the merge up to your pull request branch...

@jcrussell
Copy link
Contributor Author

jcrussell commented Aug 8, 2017

Hm, I did merge (25a8a4c).

Did you want me to rebase it instead? I think that breaks the PR though, doesn't it?

@gconnell gconnell merged commit 7d9ba85 into google:master Aug 8, 2017
@gconnell
Copy link
Collaborator

gconnell commented Aug 8, 2017

Never mind, apparently my merge worked fine. I just have my merge/rebase dropdown set to rebase by default, and it was showing me an error.

@avgerin0s
Copy link

@jcrussell @gconnell Thanks both. Any ideas when this is going to be released? It's been a long time since one (see #331)

@gconnell
Copy link
Collaborator

gconnell commented Aug 8, 2017

Apologies for being horrible at actual releases/tagging. Just created v1.1.13 at head.

@avgerin0s
Copy link

@gconnell Thanks for releasing. closed #331

avgerin0s added a commit to avgerin0s/goreplay that referenced this pull request Sep 2, 2017
in order to be compatible with libpcap 1.8+.

This fixing the bug when a program using libpcap could not terminate
properly.

For more information see google/gopacket#256
traetox pushed a commit to traetox/gopacket that referenced this pull request Jan 2, 2019
pcap: make stoppable when no packets to sniff
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Deadlock when trying to quit
3 participants