-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
handleGRPC can panic #13
Comments
TLDR: In the rare occasion during shutdown where a new gprc connection comes in after grpcServer shutdown and before httpServer shutdown, the user program panics. Managed to reliably reproduce the issue.
This only happens in the rare occasion where user program is shutting down and a new grpc connection comes in between grpcServer shutdown and httpServer shutdown. This is caused by httpServer and grpcServer closing order in code using gmux. There is no bug inside gmux The program panics when the servers are closed in this order in user code: grpcServer.GracefulStop()
httpServer.Shutdown(context.Background()) This is because The solution is to reverse the closing order in user code, i.e. httpServer.Shutdown(context.Background())
grpcServer.GracefulStop() such that the gmux configured httpServer stops accepting connections first. |
@carsonip How is |
@axw It is not closed in |
@carsonip I don't mean closing the server, but the channel. You said:
How does closing the server close the channel? |
@axw Good question. It isn't immediately obvious. In apm-server, on startup, we start the grpc server with s.grpcServer.Serve(s.httpServer.grpcListener) which calls func (s *Server) Serve(lis net.Listener) error {
...
ls := &listenSocket{Listener: lis}
s.lis[ls] = true
...
} In apm-server, on shutdown, s.grpcServer.GracefulStop() which calls func (s *Server) GracefulStop() {
...
for lis := range s.lis {
lis.Close()
}
...
} which in turn calls func (l *chanListener) Close() error {
l.closeOnce.Do(func() {
close(l.conns)
})
return nil
} |
Thanks @carsonip! Makes sense now. I think it would still be good to fix in gmux, as it feels like a bit of a footgun. Not urgent, but can we leave this open? |
Yes, let's leave this open. Would like to know how we can fix this properly, otherwise this will be a point to document in readme. People usually say "only writers should close the channel" but in this case gmux has no idea whether the channel is closed / should be closed due to reasons including but not limited to shutdown. We may have to add something to signal shutdown to gmux so that it stops sending stuff into the channel. |
Could we just never close the |
I have a draft of a fix in gmux at #16 although it may be a little ugly. CI isn't happy and tests are timing out for the PR and even on main. |
(*mux).handleGRPC(..)
occasionally panics with following error:I have not yet found how to reproduce it.
The text was updated successfully, but these errors were encountered: