Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

opt(server): Short-circuit ExecuteAsync(). #1601

Merged
merged 4 commits into from
Jul 31, 2023
Merged

opt(server): Short-circuit ExecuteAsync(). #1601

merged 4 commits into from
Jul 31, 2023

Conversation

chakaz
Copy link
Collaborator

@chakaz chakaz commented Jul 30, 2023

With this change, an attempt to execute a hop will short circuit (i.e. run the callback inline instead of a async hop) if it can (i.e. it should run in a single shard which happens to be the current shard).
This change has to potential to reduce latency of execution commands, and can be further utilized in the future for multi/eval if we'll run all commands on the target shard (if it's just one).

@chakaz chakaz changed the title feat(server): Short-circuit ExecuteAsync(). opt(server): Short-circuit ExecuteAsync(). Jul 30, 2023
@chakaz chakaz requested a review from romange July 30, 2023 12:59
Comment on lines 847 to 852
EngineShard* shard = EngineShard::tlocal();
if (unique_shard_cnt_ == 1 && shard != nullptr && shard->shard_id() == unique_shard_id_) {
DVLOG(1) << "Short-circuit ExecuteAsync " << DebugId();
shard->PollExecution("exec_cb", this);
return;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should move the use_count_ increment below your fast-path (see lines 833 and 890)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This causes DCHECK(trans != head); (engine_shard_set.cc:386) to fail in many tests.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, @dranikpg has a point. I am afraid you have a leak when you increase use_count_ like this,.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but now I understand the complexity of this change. I think that run_count_ should also move. we need to deep dive into the logic there and it can not be done during these hours :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we should carefully look into it because the use_count_ mechanism is a little fragile. But I'm more than sure that just decrementing it (instead of moving the increment below) will work 😄

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't I be calling intrusive_ptr_release(this); instead of just decrementing, to make sure that it's deleted?
BTW I don't understand how this mechanism works with multi/lua, how are transactions not destroyed after every single operation? We don't use run_count_.fetch_add() but .store() instead..

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is run_count_ and there is use_count_. #jobsecurity

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I added an explicit call to intrusive_ptr_release(this);.
@romange @dranikpg - please be extra pedantic :)

@romange
Copy link
Collaborator

romange commented Jul 30, 2023

Please describe the reasoning in the commit message (what's impacted, why the change etc). The same true for any other non-trivial PR.

@@ -844,6 +844,14 @@ void Transaction::ExecuteAsync() {
// IsArmedInShard in other threads.
run_count_.store(unique_shard_cnt_, memory_order_release);

EngineShard* shard = EngineShard::tlocal();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please mention that we can not use coordinator_index_ because we may offload this function to run in a different thread.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: using ServerState::tlocal()->thread_index() is slightly better because you do not need to check for nullptr.

@chakaz chakaz requested review from dranikpg and romange July 31, 2023 03:52
Copy link
Collaborator

@romange romange left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@chakaz chakaz requested a review from romange July 31, 2023 04:17
@chakaz chakaz merged commit fba0800 into main Jul 31, 2023
@chakaz chakaz deleted the short-circuit branch July 31, 2023 09:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants