ClientV3: Add integration test for server timeout #10211
Changes from all commits: 0187061, 3f6d508, e48af62, 01afcf1, 332e0a8
@@ -104,3 +104,4 @@ PermitWithoutStream
__lostleader
ErrConnClosing
unfreed
ServerParameters
@@ -26,8 +26,69 @@ import (
    "go.etcd.io/etcd/integration"
    "go.etcd.io/etcd/pkg/testutil"
    "google.golang.org/grpc"
    "google.golang.org/grpc/codes"
    "google.golang.org/grpc/status"
)

// TestServerGRPCKeepAliveTimeout tests the ServerParameters keepalive ping on the server side:
// after sending a keepalive ping, the server waits for a duration of Timeout and, if no activity
// is seen even after that, the connection is closed on the server side. The keepalive takes a few
// extra seconds to take effect, so typically the time it takes to close the connection is somewhat
// higher than the specified Timeout value.
func TestServerGRPCKeepAliveTimeout(t *testing.T) {
    defer testutil.AfterTest(t)

    clus := integration.NewClusterV3(t, &integration.ClusterConfig{
        Size:                  1,
        GRPCKeepAliveInterval: 1 * time.Second,
        GRPCKeepAliveTimeout:  500 * time.Millisecond,
    })
    defer clus.Terminate(t)

    eps := []string{clus.Members[0].GRPCAddr()}
    ccfg := clientv3.Config{
        Endpoints:   []string{eps[0]},

Review comment: Just …

        DialTimeout: 12 * time.Second,
    }
    cli, err := clientv3.New(ccfg)

Review comment: Do we need cli in this test, given that NewClusterV3() already creates clients?

Reply: Yes, the client is created here. I believe we need this for the desired config. Thanks!

Reply: Sorry, I was not able to understand. Do you mean we need this for the desired config for the cluster? My understanding is that the keepalive parameters are configured on the server side, and clus.clients are configured inside NewClusterV3.

Reply: Yes, that's my understanding as well: to override the defaults used in NewClusterV3. In this particular case it's just the endpoint, though, and it's more readable this way. Hope that makes sense, but I am open to suggestions. Thanks!

Reply: I think the endpoints of clus.clients are configured in NewClusterV3. It is not clear to me what "override" means here; cli and clus.Client(1) are two different clientv3 instances.

Reply: Agreed; I tried to address your concern in the new commit. Thanks!

    if err != nil {
        t.Fatal(err)
    }
    defer cli.Close()

    // give keepalive some time
    time.Sleep(4 * time.Second)

    if _, err = cli.Put(context.TODO(), "foo", "bar"); err != nil {
        t.Fatal(err)
    }
    // TODO: keepalive sometimes doesn't work on first attempt.

Review comment: @gyuho hi, somewhat inspired by https://github.com/etcd-io/etcd/blob/master/clientv3/integration/black_hole_test.go#L197, I have added a for loop here. The info log should also help show how often the test iterates. Thanks!

Reply: Hmmm... when I run the test locally it usually passes on the first attempt, and sometimes only on the third run, but here it's failing again :( (Outside tests, it also runs fine when I manually drop the client connection.)

Reply: @gyuho so this time it did pass the build. It seems like 1 CPU is always fine, but 2 and 4 don't pass on the first attempt (on my local env it actually does pass mostly). Can we keep this solution and watch its behavior (I will be watching)? Thanks!

Reply: @gyuho I guess we need a better approach. I'm thinking about it, but if you have a suggestion please let me know. Thanks!!

    for i := 0; i < 5; i++ {
        clus.Members[0].Blackhole()
        time.Sleep(10 * time.Second)
        // remove the blackhole, but by now the keepalive ping should have triggered the
        // server to close the server-to-client connection.
        clus.Members[0].Unblackhole()
        _, err = cli.Put(context.TODO(), "foo1", "bar1")
        if err != nil {
            ev, ok := status.FromError(err)
            if !ok {
                t.Fatal(err)
            }
            if ev.Code() != codes.Unavailable {
                t.Fatal(err)
            }
            break
        } else {
            t.Logf("info: expected an error, received none.")

Review comment: Just …

        }
    }

    if err == nil {
        t.Error("rpc error expected")
    }
}

// TestBalancerUnderBlackholeKeepAliveWatch tests when watch discovers it cannot talk to
// blackholed endpoint, client balancer switches to healthy one.
// TODO: test server-to-client keepalive ping
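
A note on what the ClusterConfig knobs above exercise: GRPCKeepAliveInterval and GRPCKeepAliveTimeout correspond to the server-side gRPC keepalive settings that the test comment describes (ping an idle client, then close the connection if no activity follows within Timeout). Below is a minimal sketch, not part of this PR, of a standalone gRPC server configured with the same 1s/500ms values through google.golang.org/grpc/keepalive; the listener address is arbitrary and the server registers no services.

package main

import (
    "log"
    "net"
    "time"

    "google.golang.org/grpc"
    "google.golang.org/grpc/keepalive"
)

func main() {
    lis, err := net.Listen("tcp", "localhost:0") // arbitrary free port, illustrative only
    if err != nil {
        log.Fatal(err)
    }
    // Ping an idle client every second; if no activity is seen within 500ms of the ping,
    // the server closes the connection (mirrors GRPCKeepAliveInterval and
    // GRPCKeepAliveTimeout in the test's ClusterConfig).
    srv := grpc.NewServer(grpc.KeepaliveParams(keepalive.ServerParameters{
        Time:    1 * time.Second,
        Timeout: 500 * time.Millisecond,
    }))
    if err := srv.Serve(lis); err != nil {
        log.Fatal(err)
    }
}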
Review comment: Also document what this is testing?

Reply: Yup, that sounds good.
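
On the earlier review question about reusing the clients that NewClusterV3 already creates instead of dialing a separate clientv3 instance: a minimal sketch of that alternative for the test body, assuming the integration package's ClusterV3.Client(i) accessor returns the i-th pre-configured client (the key, value, and error handling are illustrative, not from this PR).

    // Reuse a client that NewClusterV3 created, rather than calling clientv3.New
    // with a hand-built Config. Assumes clus.Client(0) is the pre-configured client
    // for the first member.
    cli := clus.Client(0)
    if _, err := cli.Put(context.TODO(), "foo", "bar"); err != nil {
        t.Fatal(err)
    }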