feat: DispatchTracker to replace everything #2179
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -502,10 +502,18 @@ void ClusterFamily::DflyClusterConfig(CmdArgList args, ConnectionContext* cntx) | |
before = tl_cluster_config->GetOwnedSlots(); | ||
} | ||
|
||
auto cb = [&](util::ProactorBase* pb) { tl_cluster_config = new_config; }; | ||
DispatchTracker tracker{server_family_->GetListeners(), cntx->conn()}; | ||
auto cb = [&tracker, &new_config](util::ProactorBase* pb) { | ||
tl_cluster_config = new_config; | ||
tracker.TrackOnThread(); | ||
Review comment: Why do we need TrackOnThread?
Reply: Why not do it correctly if it takes only a few lines 🙂 Because it's possibly waiting for totally different stuff to finish, including commands that would already be running with the new cluster config. The connection has spurious suspends (Yields()), so we might miss operations.
||
}; | ||
Comment on lines +505 to +509
Review comment: So we can now pause writes when replacing the cluster config, to actually return an error without applying. wdyt?
Reply: As in
    run {
      paused = true
      tracker.track_on_thread()
    }
    if not tracker.wait( 1s ):
      return "Can't replace cluster config with constant ops running"
    # no writes
    run {
      tl_config = new_config
      paused = false
    }
but why should we pause under heavy load without long running ops 😞? One more option would be using a global tx, but that doesn't prevent write commands from still being scheduled.
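For reference, the pause-then-swap idea from the thread above could be sketched roughly as follows. This is a single-process approximation in standard C++, not the PR's code: `InFlightTracker`, `Node`, and `TryReplaceConfig` are hypothetical names, the real `DispatchTracker` runs its tracking step on every proactor thread, and unpausing on the failure path is an assumption the pseudocode does not spell out.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <string>

// Hypothetical stand-in for the PR's DispatchTracker: it only counts
// operations that were already in flight when tracking started.
class InFlightTracker {
 public:
  void Track(int in_flight) {
    std::lock_guard<std::mutex> lk(mu_);
    pending_ += in_flight;
  }

  // Called when one tracked operation completes.
  void Finish() {
    {
      std::lock_guard<std::mutex> lk(mu_);
      --pending_;
    }
    cv_.notify_all();
  }

  // Returns true if all tracked operations finished before the timeout.
  bool Wait(std::chrono::milliseconds timeout) {
    std::unique_lock<std::mutex> lk(mu_);
    return cv_.wait_for(lk, timeout, [this] { return pending_ == 0; });
  }

 private:
  std::mutex mu_;
  std::condition_variable cv_;
  int pending_ = 0;
};

// Hypothetical node state: the active config and a write-pause flag.
struct Node {
  std::string config = "old";
  bool writes_paused = false;
};

// Pause writes, wait for tracked operations, then either swap or reject.
bool TryReplaceConfig(Node& node, InFlightTracker& tracker, int in_flight,
                      const std::string& new_config,
                      std::chrono::milliseconds timeout) {
  node.writes_paused = true;
  tracker.Track(in_flight);
  if (!tracker.Wait(timeout)) {
    node.writes_paused = false;  // assumption: unpause before reporting failure
    return false;                // config is left untouched
  }
  node.config = new_config;      // no tracked writes are running at this point
  node.writes_paused = false;
  return true;
}
```

The trade-off raised in the reply still applies to this sketch: writes stay paused for the whole wait, even when the only thing in flight is ordinary short-lived traffic.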
||
server_family_->service().proactor_pool().AwaitFiberOnAll(std::move(cb)); | ||
DCHECK(tl_cluster_config != nullptr); | ||
|
||
if (!tracker.Wait(absl::Seconds(1))) { | ||
LOG(WARNING) << "Cluster config change timed out"; | ||
} | ||
|
||
SlotSet after = tl_cluster_config->GetOwnedSlots(); | ||
if (ServerState::tlocal()->is_master) { | ||
auto deleted_slots = GetDeletedSlots(is_first_config, before, after); | ||
|
Review comment: I want to make it safe to pass a single {this}, but if we just copy the span it becomes a dangling array... So either use a vector or include absl fixed array, but given how rare this operation is we can simply use a vector.
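A minimal sketch of the vector option, assuming a span-like non-owning view. `ListenerView` is a hypothetical stand-in for `absl::Span`, and `Listener`, `OwningTracker`, and `MakeTracker` are illustrative names, not the PR's types. Copying only the view would keep a pointer into the caller's temporary array; copying the elements into a vector does not.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Listener {
  int id;
};

// Hypothetical non-owning view over an array of listener pointers.
struct ListenerView {
  Listener* const* data;
  std::size_t size;
};

class OwningTracker {
 public:
  // Copy the elements, not the view, so the tracker does not depend on
  // the lifetime of the array the view points into.
  explicit OwningTracker(ListenerView v) : listeners_(v.data, v.data + v.size) {}

  std::size_t count() const { return listeners_.size(); }
  int first_id() const { return listeners_.front()->id; }

 private:
  std::vector<Listener*> listeners_;
};

OwningTracker MakeTracker(Listener* l) {
  Listener* tmp[] = {l};  // temporary array, destroyed when this function returns
  return OwningTracker{ListenerView{tmp, 1}};
}
```

The tracker returned by `MakeTracker` remains valid after `tmp` is gone because it holds its own copy of the pointers. The `Listener` objects themselves must still outlive the tracker.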