-
Notifications
You must be signed in to change notification settings - Fork 27
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: datarace in the handshake #17
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
coverage: | ||
range: "50...100" | ||
comment: off |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -36,13 +36,13 @@ func NewMultistream(c io.ReadWriteCloser, proto string) Multistream { | |
type lazyConn struct { | ||
rhandshake bool // only accessed by 'Read' should not call read async | ||
|
||
rhlock sync.Mutex | ||
rhlock sync.RWMutex | ||
rhsync bool //protected by mutex | ||
rerr error | ||
|
||
whandshake bool | ||
|
||
whlock sync.Mutex | ||
whlock sync.RWMutex | ||
whsync bool | ||
werr error | ||
|
||
|
@@ -51,7 +51,11 @@ type lazyConn struct { | |
} | ||
|
||
func (l *lazyConn) Read(b []byte) (int, error) { | ||
if !l.rhandshake { | ||
l.rhlock.RLock() | ||
rhandshake := l.rhandshake | ||
l.rhlock.RUnlock() | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This check was intentionally racy to avoid atomic operations on the fast path. Go routines should erroneously conclude that the handshake hasn't happened at most once (and the synchronization inside the handshake functions should prevent double handshakes). There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I agree that it might be only case when races are "good" but it screws over race test in all code that uses this. This means that we can't run race detector on big part of go-libp2p and go-ipfs code unless we tolerate false positives. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah, there's got to be a way to tell go this is "fine" (hopefully without writing a C extension). I know I've seen code in the go standard library that does this... There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ok, we can probably use There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We can't use There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I was trying to do it previously with Once but I run into some problems. One of them was that I had to implement There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why even do it in a goroutine? We need to wait for it to finish anyways. Can't we just have a There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We could have two Once's, one for the read handshake, one for the write handshake (where each handshake side would asynchronously fire off the other side of the handshake (in a once)). There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Also, sorry for the massive delay... |
||
|
||
if !rhandshake { | ||
go l.writeHandshake() | ||
err := l.readHandshake() | ||
if err != nil { | ||
|
@@ -69,13 +73,17 @@ func (l *lazyConn) Read(b []byte) (int, error) { | |
} | ||
|
||
func (l *lazyConn) readHandshake() error { | ||
l.rhlock.Lock() | ||
defer l.rhlock.Unlock() | ||
|
||
l.rhlock.RLock() | ||
rhsync := l.rhsync | ||
l.rhlock.RUnlock() | ||
// if we've already done this, exit | ||
if l.rhsync { | ||
if rhsync { | ||
return l.rerr | ||
} | ||
|
||
l.rhlock.Lock() | ||
defer l.rhlock.Unlock() | ||
|
||
l.rhsync = true | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is racy. You can't check, unlock, then relock and expect the state to stay the same. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. You are right, I think I missed because I restructured this code twice while writing it reverting to simpler solution. Let's figure out #17 (comment) first. |
||
|
||
for _, proto := range l.protos { | ||
|
@@ -96,13 +104,17 @@ func (l *lazyConn) readHandshake() error { | |
} | ||
|
||
func (l *lazyConn) writeHandshake() error { | ||
l.whlock.Lock() | ||
defer l.whlock.Unlock() | ||
|
||
if l.whsync { | ||
l.whlock.RLock() | ||
whsync := l.whsync | ||
l.whlock.RUnlock() | ||
// if we've already done this, exit | ||
if whsync { | ||
return l.werr | ||
} | ||
|
||
l.whlock.Lock() | ||
defer l.whlock.Unlock() | ||
|
||
l.whsync = true | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Same race here. |
||
|
||
buf := bufio.NewWriter(l.con) | ||
|
@@ -119,7 +131,11 @@ func (l *lazyConn) writeHandshake() error { | |
} | ||
|
||
func (l *lazyConn) Write(b []byte) (int, error) { | ||
if !l.whandshake { | ||
l.whlock.RLock() | ||
whandshake := l.whandshake | ||
l.whlock.RUnlock() | ||
|
||
if !whandshake { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Again, intentionally a racy fast-path. |
||
go l.readHandshake() | ||
err := l.writeHandshake() | ||
if err != nil { | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
did we disable macos for a reason?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OSX is delaying builds on Travis so I disable it in repos that don't have code that might depend on the system.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it.