-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[MM-55415] Fix race condition on stop #124
Conversation
@@ -252,6 +254,14 @@ func (s *Server) InitSession(cfg SessionConfig, closeCb func() error) error { | |||
s.log.Error("failed to create ICE message", mlog.Err(err), mlog.String("sessionID", cfg.SessionID)) | |||
return | |||
} | |||
|
|||
select { | |||
case <-us.closeCh: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So we're just checking if the session has been closed before writing to its receiveCh? (just restating for my own understanding)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but we do it under lock (us.mut
), that's the key essentially. So we are guaranteeing that either closeCh
has been closed already when we are inside this block and we exit early otherwise s.receiveCh
must be still open and we proceed. Unfortunately I couldn't merge the two selects because the order of selection is not deterministic so we could still trigger a race.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, it's random on purpose. Gotcha thanks!
@cpoile There's another small race (see test failure), this time totally our fault (mine I should say) . Will send a fix on top of this in a minute. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice that Go found the race.
Summary
Fixing this on our side for expediency. This race could cause a
panic
when stopping the server as closing the peer connection didn't guarantee for theOnICECandidate
handler to be done so we'd go ahead and close thereceiveCh
which could still be written into, leading to a crash.I believe the race was introduced after the recent dependency upgrade, more specifically by pion/ice@898746c
I'll still see if I can send a patch upstream as I believe this sort of cases should be dealt with by the underlying API, essentially guaranteeing that after closing the
ice.Agent
no goroutines created by it would still be running.Ticket Link
https://mattermost.atlassian.net/browse/MM-55415