Data race on stream.memory access #95

Closed
Tracked by #1690
SaveTheRbtz opened this issue Aug 31, 2022 · 5 comments · Fixed by #100

@SaveTheRbtz

We have a couple of test failures on a -race build due to yamux racing on stream.memory. Looking at the code, the comment on sendWindowUpdate does say that it should be called under a lock, but since it is called from Read, I'm not sure any of the existing users (including swarm) synchronize Read/Close with a separate lock.

What do you think is the best approach for solving this race?

==================
WARNING: DATA RACE
Read at 0x00c05798a548 by goroutine 1474:
  github.com/libp2p/go-yamux/v3.(*Session).Close()
      /home/runner/go/pkg/mod/github.com/libp2p/go-yamux/[email protected]/session.go:299 +0x3e7
  github.com/libp2p/go-yamux/v3.(*Session).exitErr()
      /home/runner/go/pkg/mod/github.com/libp2p/go-yamux/[email protected]/session.go:317 +0xce
  github.com/libp2p/go-yamux/v3.(*Session).send()
      /home/runner/go/pkg/mod/github.com/libp2p/go-yamux/[email protected]/session.go:489 +0x44
  github.com/libp2p/go-yamux/v3.newSession.func2()
      /home/runner/go/pkg/mod/github.com/libp2p/go-yamux/[email protected]/session.go:159 +0x39

Previous write at 0x00c05798a548 by goroutine 3271:
  github.com/libp2p/go-yamux/v3.(*Stream).sendWindowUpdate()
      /home/runner/go/pkg/mod/github.com/libp2p/go-yamux/[email protected]/stream.go:234 +0x651
  github.com/libp2p/go-yamux/v3.(*Stream).Read()
      /home/runner/go/pkg/mod/github.com/libp2p/go-yamux/[email protected]/stream.go:123 +0x224
  github.com/libp2p/go-libp2p/p2p/muxer/yamux.(*stream).Read()
      /home/runner/go/pkg/mod/github.com/libp2p/[email protected]/p2p/muxer/yamux/stream.go:17 +0x4e
  github.com/libp2p/go-libp2p/p2p/net/swarm.(*Stream).Read()
      /home/runner/go/pkg/mod/github.com/libp2p/[email protected]/p2p/net/swarm/swarm_stream.go:55 +0x93
  bufio.(*Reader).Read()
      /opt/hostedtoolcache/go/1.18.5/x64/src/bufio/bufio.go:222 +0x2b5
  io.ReadAtLeast()
      /opt/hostedtoolcache/go/1.18.5/x64/src/io/io.go:331 +0xdd
  io.ReadFull()
      /opt/hostedtoolcache/go/1.18.5/x64/src/io/io.go:350 +0x268
  github.com/libp2p/go-msgio/protoio.(*uvarintReader).ReadMsg()
      /home/runner/go/pkg/mod/github.com/libp2p/[email protected]/protoio/uvarint_reader.go:82 +0x238
  github.com/libp2p/go-libp2p-pubsub.(*PubSub).handleNewStream()
      /home/runner/go/pkg/mod/github.com/libp2p/[email protected]/comm.go:66 +0x397
  github.com/libp2p/go-libp2p-pubsub.(*PubSub).handleNewStream-fm()
      <autogenerated>:1 +0x4d
  github.com/libp2p/go-libp2p/p2p/host/basic.(*BasicHost).SetStreamHandler.func1()
      /home/runner/go/pkg/mod/github.com/libp2p/[email protected]/p2p/host/basic/basic_host.go:568 +0x86
  github.com/libp2p/go-libp2p/p2p/host/basic.(*BasicHost).newStreamHandler.func1()
      /home/runner/go/pkg/mod/github.com/libp2p/[email protected]/p2p/host/basic/basic_host.go:411 +0x74

Goroutine 1474 (running) created at:
  github.com/libp2p/go-yamux/v3.newSession()
      /home/runner/go/pkg/mod/github.com/libp2p/go-yamux/[email protected]/session.go:159 +0xb24
  github.com/libp2p/go-yamux/v3.Server()
      /home/runner/go/pkg/mod/github.com/libp2p/go-yamux/[email protected]/mux.go:119 +0x231
  github.com/libp2p/go-libp2p/p2p/muxer/yamux.(*Transport).NewConn()
      /home/runner/go/pkg/mod/github.com/libp2p/[email protected]/p2p/muxer/yamux/transport.go:44 +0x84
  github.com/libp2p/go-libp2p/p2p/muxer/muxer-multistream.(*Transport).NewConn()
      /home/runner/go/pkg/mod/github.com/libp2p/[email protected]/p2p/muxer/muxer-multistream/multistream.go:74 +0x2f9
  github.com/libp2p/go-libp2p/p2p/net/upgrader.(*upgrader).setupMuxer.func1()
      /home/runner/go/pkg/mod/github.com/libp2p/[email protected]/p2p/net/upgrader/upgrader.go:206 +0x128

Goroutine 3271 (running) created at:
  github.com/libp2p/go-libp2p/p2p/host/basic.(*BasicHost).newStreamHandler()
      /home/runner/go/pkg/mod/github.com/libp2p/[email protected]/p2p/host/basic/basic_host.go:411 +0x82e
  github.com/libp2p/go-libp2p/p2p/host/basic.(*BasicHost).newStreamHandler-fm()
      <autogenerated>:1 +0x4d
  github.com/libp2p/go-libp2p/p2p/net/swarm.(*Conn).start.func1.1()
      /home/runner/go/pkg/mod/github.com/libp2p/[email protected]/p2p/net/swarm/swarm_conn.go:133 +0x102
==================
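
For illustration, here is a reduced sketch of the flagged pattern, with hypothetical, heavily simplified types that do not match the real go-yamux code: Close reads the field under the session lock while Read's window-update path writes it with no lock held, so `go run -race` flags the pair.

package main

import "sync"

// Hypothetical reduction of the racy pattern; names are simplified and do
// not match the real go-yamux types. Run with: go run -race main.go
type stream struct {
    memory int // written by Read's window-update path, read by Session.Close
}

type session struct {
    mu      sync.Mutex
    streams map[uint32]*stream
}

// Close reads st.memory under the session lock (the read side of the race).
func (s *session) Close() {
    s.mu.Lock()
    defer s.mu.Unlock()
    for _, st := range s.streams {
        _ = st.memory
    }
}

// read models Stream.Read -> sendWindowUpdate: it grows st.memory without
// holding the session lock (the write side of the race).
func (st *stream) read(grow int) {
    st.memory += grow
}

func main() {
    s := &session{streams: map[uint32]*stream{1: {}}}
    st := s.streams[1]

    var wg sync.WaitGroup
    wg.Add(2)
    go func() { defer wg.Done(); st.read(128) }()
    go func() { defer wg.Done(); s.Close() }()
    wg.Wait()
}
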
@marten-seemann
Contributor

Do you have a test case that reproduces this? Our CI runs our tests with the race detector, and it hasn't occurred there.

@SaveTheRbtz
Author

It is currently being triggered by some of our integration tests. It won't be trivial to write a reproducible test, given that the code that accesses s.memory is conditioned on time.Now() and the measured RTT:

go-yamux/stream.go, lines 222 to 240 in a39b231:

now := time.Now()
if rtt := s.session.getRTT(); flags == 0 && rtt > 0 && now.Sub(s.epochStart) < rtt*4 {
    var recvWindow uint32
    if s.recvWindow > math.MaxUint32/2 {
        recvWindow = min(math.MaxUint32, s.session.config.MaxStreamWindowSize)
    } else {
        recvWindow = min(s.recvWindow*2, s.session.config.MaxStreamWindowSize)
    }
    if recvWindow > s.recvWindow {
        grow := recvWindow - s.recvWindow
        if err := s.session.memoryManager.ReserveMemory(int(grow), 128); err == nil {
            s.recvWindow = recvWindow
            s.memory += int(grow)
            _, delta = s.recvBuf.GrowTo(s.recvWindow, true)
        }
    }
}
s.epochStart = now

@marten-seemann
Contributor

Just had a closer look at the code. I'd assume the simplest fix is to actually hold the lock when calling sendWindowUpdate, as its comment requires. The problem is that we don't want to hold the lock while calling sendMsg, as that call might block.

Do you want to submit a PR?
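
A toy illustration of the constraint above, with hypothetical names that do not match the actual yamux code: if the send queue is full, a sendMsg-style call blocks, and every other goroutine that needs the same lock stalls behind it.

package main

import (
    "fmt"
    "sync"
    "time"
)

// Toy illustration only; names do not match the real go-yamux code.
type stream struct {
    mu    sync.Mutex
    sendC chan []byte // stands in for the session's send queue
}

// Naive fix: hold the lock across the (potentially blocking) send.
func (s *stream) sendWindowUpdateLocked(hdr []byte) {
    s.mu.Lock()
    defer s.mu.Unlock()
    s.sendC <- hdr // blocks while the queue is full...
}

func main() {
    s := &stream{sendC: make(chan []byte)} // unbuffered: always "full"

    go s.sendWindowUpdateLocked([]byte("update"))
    time.Sleep(10 * time.Millisecond)

    // ...so this unrelated lock acquisition stalls behind the send.
    done := make(chan struct{})
    go func() {
        s.mu.Lock()
        s.mu.Unlock()
        close(done)
    }()

    select {
    case <-done:
        fmt.Println("lock acquired")
    case <-time.After(100 * time.Millisecond):
        fmt.Println("stuck: lock held across a blocking send")
    }
}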

@marten-seemann marten-seemann added the kind/bug A bug in existing code (including security flaws) label Aug 31, 2022
@SaveTheRbtz
Author

I can try! Holding the session lock when growing the stream's memory would be sufficient. The locking pattern there is a bit weird, though (a child object grabbing its parent's lock), and the reservation/release pattern is asymmetric: reservation is done by the stream, while release is done by the session.

@marten-seemann
Contributor

Maybe one could split sendWindowUpdate into a getWindowUpdate function, and have the caller send it out?
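
A minimal sketch of that split, with illustrative types and names that do not match the actual go-yamux API: getWindowUpdate does all the bookkeeping under the lock and returns the frame to send, and sendWindowUpdate transmits it only after the lock is released.

package main

import (
    "fmt"
    "sync"
)

// Illustrative types only; these do not match the real go-yamux definitions.
type header []byte

type session struct{}

// sendMsg stands in for the potentially blocking write to the connection.
func (s *session) sendMsg(hdr header) error {
    fmt.Printf("sent %d-byte window update\n", len(hdr))
    return nil
}

type stream struct {
    mu         sync.Mutex // the lock sendWindowUpdate's comment requires; Close would take the same one
    session    *session
    recvWindow uint32
    memory     int
}

// getWindowUpdate does the bookkeeping under the lock and returns the frame
// to send (a real implementation would return nil when no update is due).
// It never blocks.
func (s *stream) getWindowUpdate() header {
    s.mu.Lock()
    defer s.mu.Unlock()
    const grow = 128
    s.recvWindow += grow
    s.memory += grow // grow the reservation under the lock
    return make(header, 12) // placeholder window-update frame
}

// sendWindowUpdate transmits outside the lock, so a slow connection cannot
// stall other goroutines waiting on mu.
func (s *stream) sendWindowUpdate() error {
    if hdr := s.getWindowUpdate(); hdr != nil {
        return s.session.sendMsg(hdr)
    }
    return nil
}

func main() {
    st := &stream{session: &session{}}
    if err := st.sendWindowUpdate(); err != nil {
        fmt.Println("send failed:", err)
    }
}

With that split, Close and getWindowUpdate can coordinate on the same lock without the lock ever being held across a network write.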

@p-shahi p-shahi added the P1 High: Likely tackled by core team if no one steps up label Sep 2, 2022
@p-shahi p-shahi added this to go-libp2p Sep 2, 2022
@p-shahi p-shahi linked a pull request Sep 2, 2022 that will close this issue
@p-shahi p-shahi moved this to 🏃‍♀️ In Progress in go-libp2p Sep 2, 2022
@marten-seemann marten-seemann self-assigned this Sep 5, 2022
Repository owner moved this from 🏃‍♀️ In Progress to 🎉 Done in go-libp2p Sep 5, 2022