Skip to content

Commit

Permalink
Merge #139238
Browse files Browse the repository at this point in the history
139238: testcluster: deflake TestManualReplication r=iskettaneh a=iskettaneh

When we transfer a lease from N1 to N2, N2 might not immediately know about the lease transfer, and might still think that N1 holds the lease. With leader leases however, since sometimes it takes time until store liveness grants support, the test might run into timing issues where N2 thinks that N1 has the lease, but the lease is UNUSABLE since now() is so close to the lease min_expiration time. Also, sometimes N2 can't determine the lease validity all together.

This commit fixes this by wrapping the lease enquiry after the lease transfer by a succeeds soon.

Couldn't reproduce the bug after more than 10,000 attempts.

Fixes: #136801

Release Note: None

Co-authored-by: Ibrahim Kettaneh <[email protected]>
  • Loading branch information
craig[bot] and iskettaneh committed Jan 16, 2025
2 parents 5b61a0e + 391a68c commit 521d415
Showing 1 changed file with 11 additions and 6 deletions.
17 changes: 11 additions & 6 deletions pkg/testutils/testcluster/testcluster_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -131,7 +131,7 @@ func TestManualReplication(t *testing.T) {
}
}

// Transfer the lease to node 1.
// Transfer the lease to node 2.
target := tc.Target(0)
leaseHolder, err := tc.FindRangeLeaseHolder(tableRangeDesc, &target)
if err != nil {
Expand All @@ -150,11 +150,16 @@ func TestManualReplication(t *testing.T) {
// Check that the lease holder has changed. We'll use the old lease holder as
// the hint, since it's guaranteed that the old lease holder has applied the
// new lease.
target = tc.Target(0)
leaseHolder, err = tc.FindRangeLeaseHolder(tableRangeDesc, &target)
if err != nil {
t.Fatal(err)
}
// We wrap this in a SucceedsSoon because N2 might have not applied the lease
// yet, it might think that the N1 is the current leaseholder, and the lease
// might be INVALID if the time is close to the lease expiration time, or in
// the case of leader leases, N2 might not be able to determine the validity
// of the lease all together.
testutils.SucceedsSoon(t, func() error {
leaseHolder, err = tc.FindRangeLeaseHolder(tableRangeDesc, &target)
return err
})

if leaseHolder.StoreID != tc.Servers[1].StorageLayer().GetFirstStoreID() {
t.Fatalf("expected lease on server idx 1 (node: %d store: %d), but is on node: %+v",
tc.Server(1).NodeID(),
Expand Down

0 comments on commit 521d415

Please sign in to comment.