Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

*: gRPC v1.4.1, gateway v1.2.2, metadata Incoming/OutgoingContext #7896

Merged
merged 3 commits into from
Jun 15, 2017

Conversation

gyuho
Copy link
Contributor

@gyuho gyuho commented May 8, 2017

Fix #7888.

@gyuho gyuho added the WIP label May 8, 2017
@gyuho gyuho changed the title *: gRPC v1.3.0, gateway v1.2.2, metadata New/FromOutgoingContext [DO NOT MERGE] *: gRPC v1.3.0, gateway v1.2.2, metadata New/FromOutgoingContext May 8, 2017
@gyuho gyuho force-pushed the metadata-grpc branch 2 times, most recently from 4169033 to 0605168 Compare May 8, 2017 17:44
@gyuho gyuho changed the title [DO NOT MERGE] *: gRPC v1.3.0, gateway v1.2.2, metadata New/FromOutgoingContext *: gRPC v1.3.0, gateway v1.2.2, metadata New/FromOutgoingContext May 8, 2017
@gyuho gyuho removed the WIP label May 8, 2017
@gyuho gyuho force-pushed the metadata-grpc branch from 496d90b to 10e2052 Compare May 8, 2017 18:06
@gyuho gyuho added the WIP label May 8, 2017
@gyuho gyuho changed the title *: gRPC v1.3.0, gateway v1.2.2, metadata New/FromOutgoingContext *: gRPC v1.3.0, gateway v1.2.2, metadata Incoming/OutgoingContext May 8, 2017
@gyuho gyuho force-pushed the metadata-grpc branch from 10e2052 to c4a32d7 Compare May 8, 2017 18:45
@gyuho gyuho removed the WIP label May 8, 2017
@gyuho gyuho force-pushed the metadata-grpc branch 2 times, most recently from f891e3f to 51fd3eb Compare May 8, 2017 20:32
@gyuho gyuho added this to the v3.3.0 milestone May 9, 2017
@gyuho gyuho force-pushed the metadata-grpc branch 2 times, most recently from 25012d2 to 93ee59c Compare May 31, 2017 03:41
@gyuho gyuho added the WIP label May 31, 2017
@gyuho
Copy link
Contributor Author

gyuho commented May 31, 2017

fyi, still investigating the breaks:

go test -v -race -tags cluster_proxy -run TestV3ElectionObserve
=== RUN   TestV3ElectionObserve
--- PASS: TestV3ElectionObserve (0.33s)
PASS
Too many goroutines running after all test(s).
1 instances of:
google.golang.org/grpc/transport.(*recvBufferReader).Read(...)
	/Users/gyuho/go/src/google.golang.org/grpc/transport/transport.go:144 +0x70a
google.golang.org/grpc/transport.(*Stream).Read(...)
	/Users/gyuho/go/src/google.golang.org/grpc/transport/transport.go:332 +0x7f
io.ReadAtLeast(...)
	/usr/local/go/src/io/io.go:307 +0xb7
io.ReadFull(...)
	/usr/local/go/src/io/io.go:325 +0x73
google.golang.org/grpc.(*parser).recvMsg(...)
	/Users/gyuho/go/src/google.golang.org/grpc/rpc_util.go:219 +0xa0
google.golang.org/grpc.recv(0xc4201745a0, 0x2514520, 0x257e3a8, 0xc42045a480, 0x0, 0x0, 0x1ec92a0, 0xc420168560, 0x7fffffff, 0x0, ...)
	/Users/gyuho/go/src/google.golang.org/grpc/rpc_util.go:315 +0x61
google.golang.org/grpc.(*clientStream).RecvMsg(...)
	/Users/gyuho/go/src/google.golang.org/grpc/stream.go:376 +0x1d4
github.com/coreos/etcd/etcdserver/api/v3election/v3electionpb.(*electionObserveClient).Recv(...)
	/Users/gyuho/go/src/github.com/coreos/etcd/etcdserver/api/v3election/v3electionpb/v3election.pb.go:318 +0x86
github.com/coreos/etcd/proxy/grpcproxy.(*electionProxy).Observe(...)
	/Users/gyuho/go/src/github.com/coreos/etcd/proxy/grpcproxy/election.go:53 +0x1c4
github.com/coreos/etcd/proxy/grpcproxy/adapter.(*es2ec).Observe.func1(...)
	/Users/gyuho/go/src/github.com/coreos/etcd/proxy/grpcproxy/adapter/election_client_adapter.go:48 +0x138
github.com/coreos/etcd/proxy/grpcproxy/adapter.newPipeStream.func1(0xc42014dc40, 0xc4201ca3c0, 0xc4201ca420, 0x25129a0, 0xc420155bc0, 0x0, 0x0, 0x0, 0xc420155bc0, 0x3465400, ...)
	/Users/gyuho/go/src/github.com/coreos/etcd/proxy/grpcproxy/adapter/chan_stream.go:154 +0x7c
created by github.com/coreos/etcd/proxy/grpcproxy/adapter.newPipeStream
	/Users/gyuho/go/src/github.com/coreos/etcd/proxy/grpcproxy/adapter/chan_stream.go:163 +0x48c
exit status 1
FAIL	github.com/coreos/etcd/integration	0.381s
// blocks on v3election.pb.go
func (x *electionObserveClient) Recv() (*LeaderResponse, error) {
	m := new(LeaderResponse)
	fmt.Println("electionObserveClient electionObserveClient 1")
	if err := x.ClientStream.RecvMsg(m); err != nil {
		fmt.Println("electionObserveClient electionObserveClient 2", err)
		return nil, err
	}
	fmt.Println("electionObserveClient electionObserveClient 3")
	return m, nil
}

Bisect tells it breaks since grpc/grpc-go#1136.


Another failure:
go test -v -race -tags cluster_proxy -run TestSTM
=== RUN   TestSTMConflict
--- PASS: TestSTMConflict (0.85s)
=== RUN   TestSTMAbort
--- PASS: TestSTMAbort (0.19s)
PASS
Too many goroutines running after all test(s).
1 instances of:
context.propagateCancel.func1(...)
	/usr/local/go/src/context/context.go:262 +0x188
created by context.propagateCancel
	/usr/local/go/src/context/context.go:267 +0x298
1 instances of:
github.com/coreos/etcd/clientv3.(*watchGrpcStream).run(...)
	/Users/gyuho/go/src/github.com/coreos/etcd/clientv3/watch.go:430 +0x1f94
created by github.com/coreos/etcd/clientv3.(*watcher).newWatcherGrpcStream
	/Users/gyuho/go/src/github.com/coreos/etcd/clientv3/watch.go:228 +0x5b1
1 instances of:
google.golang.org/grpc/transport.(*recvBufferReader).Read(...)
	/Users/gyuho/go/src/google.golang.org/grpc/transport/transport.go:144 +0x70a
google.golang.org/grpc/transport.(*Stream).Read(...)
	/Users/gyuho/go/src/google.golang.org/grpc/transport/transport.go:332 +0x7f
io.ReadAtLeast(...)
	/usr/local/go/src/io/io.go:307 +0xb7
io.ReadFull(...)
	/usr/local/go/src/io/io.go:325 +0x73
google.golang.org/grpc.(*parser).recvMsg(...)
	/Users/gyuho/go/src/google.golang.org/grpc/rpc_util.go:219 +0xa0
google.golang.org/grpc.recv(0xc420183be0, 0x2514520, 0x257e3a8, 0xc422137680, 0x0, 0x0, 0x1ec9ca0, 0xc42030f8b0, 0x7fffffff, 0x0, ...)
	/Users/gyuho/go/src/google.golang.org/grpc/rpc_util.go:315 +0x61
google.golang.org/grpc.(*clientStream).RecvMsg(...)
	/Users/gyuho/go/src/google.golang.org/grpc/stream.go:376 +0x1d4
github.com/coreos/etcd/etcdserver/etcdserverpb.(*watchWatchClient).Recv(...)
	/Users/gyuho/go/src/github.com/coreos/etcd/etcdserver/etcdserverpb/rpc.pb.go:2473 +0x86
github.com/coreos/etcd/clientv3.(*watchGrpcStream).serveWatchClient(...)
	/Users/gyuho/go/src/github.com/coreos/etcd/clientv3/watch.go:557 +0x5d
created by github.com/coreos/etcd/clientv3.(*watchGrpcStream).newWatchClient
	/Users/gyuho/go/src/github.com/coreos/etcd/clientv3/watch.go:712 +0x79e
exit status 1
FAIL	github.com/coreos/etcd/integration	2.264s

@gyuho gyuho force-pushed the master branch 3 times, most recently from 44ca396 to 4301f49 Compare June 2, 2017 15:53
MakMukhi pushed a commit to grpc/grpc-go that referenced this pull request Jun 14, 2017
This patch writes client-side error before closing the active stream
to fix blocking `RecvMsg` issue on `grpc.ClientStream` [1].

Previous gRPC client stream just exits on `ClientTransport.Error` [2].
And latest gRPC added another select case on client connection context
cancel [3]. Now when client stream closes from client connection context
cancel, it calls `CloseStream` with `ErrClientConnClosing` error. And then
the stream gets deleted from `*http2Client.activeStreams`, without processing
the error [4]. Then in-flight `RecvMsg` call on this client will block on
`*parser.Reader.recvMsg` [5].

In short,

1. `ClientConn.Close`.
2. in-flight streams will receive case `<-cc.ctx.Done()`
   https://github.com/grpc/grpc-go/blob/master/stream.go#L253-L255.
3. `cs.closeTransportStream(ErrClientConnClosing)` calls `cs.t.CloseStream(cs.s, err)`.
4. `CloseStream(cs.s, err)` calls `delete(t.activeStreams, s.id)`
   without handling the error.
5. in-flight streams will never receive error, left hanging.

I can reproduce in etcd tests with in-flight `recvMsg` calls to `Observe` RPC.

---
[1] etcd-io/etcd#7896 (comment)
[2] https://github.com/grpc/grpc-go/blob/v1.2.x/stream.go#L235-L238
[3] #1136
[4] https://github.com/grpc/grpc-go/blob/master/transport/http2_client.go#L569
[5] https://github.com/grpc/grpc-go/blob/master/rpc_util.go#L280

Signed-off-by: Gyu-Ho Lee <[email protected]>
menghanl pushed a commit to menghanl/grpc-go that referenced this pull request Jun 14, 2017
This patch writes client-side error before closing the active stream
to fix blocking `RecvMsg` issue on `grpc.ClientStream` [1].

Previous gRPC client stream just exits on `ClientTransport.Error` [2].
And latest gRPC added another select case on client connection context
cancel [3]. Now when client stream closes from client connection context
cancel, it calls `CloseStream` with `ErrClientConnClosing` error. And then
the stream gets deleted from `*http2Client.activeStreams`, without processing
the error [4]. Then in-flight `RecvMsg` call on this client will block on
`*parser.Reader.recvMsg` [5].

In short,

1. `ClientConn.Close`.
2. in-flight streams will receive case `<-cc.ctx.Done()`
   https://github.com/grpc/grpc-go/blob/master/stream.go#L253-L255.
3. `cs.closeTransportStream(ErrClientConnClosing)` calls `cs.t.CloseStream(cs.s, err)`.
4. `CloseStream(cs.s, err)` calls `delete(t.activeStreams, s.id)`
   without handling the error.
5. in-flight streams will never receive error, left hanging.

I can reproduce in etcd tests with in-flight `recvMsg` calls to `Observe` RPC.

---
[1] etcd-io/etcd#7896 (comment)
[2] https://github.com/grpc/grpc-go/blob/v1.2.x/stream.go#L235-L238
[3] grpc#1136
[4] https://github.com/grpc/grpc-go/blob/master/transport/http2_client.go#L569
[5] https://github.com/grpc/grpc-go/blob/master/rpc_util.go#L280

Signed-off-by: Gyu-Ho Lee <[email protected]>
@gyuho gyuho changed the title *: gRPC v1.3.0, gateway v1.2.2, metadata Incoming/OutgoingContext *: gRPC v1.4.1, gateway v1.2.2, metadata Incoming/OutgoingContext Jun 15, 2017
@gyuho gyuho removed the WIP label Jun 15, 2017
@gyuho
Copy link
Contributor Author

gyuho commented Jun 15, 2017

@heyitsanthony @xiang90 PTAL.

@gyuho gyuho merged commit 5fedaf2 into etcd-io:master Jun 15, 2017
@gyuho gyuho deleted the metadata-grpc branch June 15, 2017 23:42
@codecov-io
Copy link

Codecov Report

❗ No coverage uploaded for pull request base (master@b9a53db). Click here to learn what that means.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff            @@
##             master    #7896   +/-   ##
=========================================
  Coverage          ?   76.52%           
=========================================
  Files             ?      342           
  Lines             ?    26584           
  Branches          ?        0           
=========================================
  Hits              ?    20344           
  Misses            ?     4783           
  Partials          ?     1457
Impacted Files Coverage Δ
proxy/grpcproxy/lease.go 88.57% <100%> (ø)
clientv3/lease.go 92.33% <100%> (ø)
auth/store.go 80.48% <100%> (ø)
proxy/grpcproxy/watch.go 94.44% <100%> (ø)
etcdserver/api/v3rpc/interceptor.go 92.98% <100%> (ø)
clientv3/client.go 84.17% <100%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b9a53db...5e059fd. Read the comment docs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants