fix(server): Adjust batching behavior to reduce network latency on MULTI blocks #1777
Conversation
A small correction: can you please add here how you reproduced the scenario with increased latency?
I reproduced it by running the benchmark provided in the issue (…).
src/facade/dragonfly_connection.cc
Outdated
// As a small optimization, if the queue was never larger than 1, skip the Yield() call.
if (dispatch_q_.size() == 1 && queue_was_larger_than_1) {
  ThisFiber::Yield();
}
With memtier_benchmark --pipeline=2 you'll yield a lot 🤔 But it's still more effective, right?
It would also be more efficient to yield only if we didn't yield on the previous iteration (i.e. the transaction was run with inline scheduling); see the sketch below.
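A minimal sketch of that heuristic, reusing dispatch_q_, queue_was_larger_than_1 and ThisFiber::Yield() from the snippet above; yielded_prev_iteration is a hypothetical name introduced only for this sketch:

bool yielded_prev_iteration = false;
while (!dispatch_q_.empty()) {
  // ... pop and execute one request ...

  // Yield only when the previous iteration ran without a yield, so the
  // consumer never yields twice in a row.
  if (dispatch_q_.size() == 1 && queue_was_larger_than_1 && !yielded_prev_iteration) {
    ThisFiber::Yield();
    yielded_prev_iteration = true;
  } else {
    yielded_prev_iteration = false;
  }
}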
Yes, we need helio machinery for this. Moreover, the io_uring API lets us learn whether more data is still pending in the network buffer after the last Recv call (IORING_CQE_F_SOCK_NONEMPTY, available since kernel 5.19). I do not think I implemented support for this.
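For reference, a standalone sketch of reading that flag with liburing on kernel 5.19+; this is not helio code, and fd, buf and len are placeholders:

#include <liburing.h>
#include <cstddef>

// Issues a single recv on `fd` and reports whether the kernel says more data
// is still queued in the socket buffer after this completion.
bool RecvAndCheckMore(io_uring* ring, int fd, char* buf, size_t len) {
  io_uring_sqe* sqe = io_uring_get_sqe(ring);
  io_uring_prep_recv(sqe, fd, buf, len, 0);
  io_uring_submit(ring);

  io_uring_cqe* cqe = nullptr;
  if (io_uring_wait_cqe(ring, &cqe) != 0 || cqe->res <= 0) {
    if (cqe) io_uring_cqe_seen(ring, cqe);
    return false;  // error or connection closed; no hint available
  }

  // IORING_CQE_F_SOCK_NONEMPTY: the socket still has unread data, so the
  // caller could keep reading instead of yielding.
  bool more_pending = (cqe->flags & IORING_CQE_F_SOCK_NONEMPTY) != 0;
  io_uring_cqe_seen(ring, cqe);
  return more_pending;
}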
It does trigger quite a few more Yields; I'm not sure what the performance impact is.
Updated to use the new helio epoch interface. I looked at the logs with memtier_benchmark and the JuiceFS benchmark and it looks quite good, roughly one yield every 10 commands or so.
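Roughly, the epoch-based check can look like the sketch below; FiberSwitchEpoch() is used as a stand-in for the real helio accessor, whose exact name is not shown in this thread:

#include <cstdint>

uint64_t last_switch_epoch = 0;  // epoch observed the last time we checked

void MaybeYield() {
  // If no fiber switch happened since the last check, the producer fiber had
  // no chance to run, so yield now; otherwise a switch already occurred
  // (e.g. the transaction suspended) and an extra yield would be wasted.
  if (FiberSwitchEpoch() == last_switch_epoch) {
    ThisFiber::Yield();
  }
  last_switch_epoch = FiberSwitchEpoch();
}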
src/facade/dragonfly_connection.cc
Outdated
// command in the queue and let the producer fiber push more commands if it wants
// to.
// As a small optimization, if the queue was never larger than 1, skip the Yield() call.
if (dispatch_q_.size() == 1 && queue_was_larger_than_1) {
Can it be that we wake up on a non-empty queue that does not have size > 1?
I thought we only push to dispatch_q if we have more than one request.
At least all pubsub/monitor messages go over the dispatch queue, and there we can wake up for a single element.
…queue when needed. 2. Allow the receive buffer to grow when needed. 3. Improve debugging logs for batching behavior.
Force-pushed from 6e04377 to 3908c3e.
The whole 'triggering the bug' process looks like this: the ParseRedis loop reads commands from the first part of the packet and pushes them into the dispatch queue.
Closes #1285
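A self-contained toy model of that flow, pairing the parse-and-dispatch step described above with a consumer that mirrors the "batch while more items are queued" policy discussed in this PR; all names (first_packet, dispatch_q) are invented for illustration and this is not the real connection code:

#include <deque>
#include <iostream>
#include <string>
#include <vector>

int main() {
  std::deque<std::string> dispatch_q;

  // ParseRedis-style step: the first packet carries only part of the MULTI block.
  std::vector<std::string> first_packet = {"MULTI", "SET k v", "GET k"};
  for (const std::string& cmd : first_packet) {
    dispatch_q.push_back(cmd);
  }

  // Consumer side: every reply except the last is written in batch mode, so
  // flushing is deferred until the queue becomes empty.
  while (!dispatch_q.empty()) {
    std::string cmd = dispatch_q.front();
    dispatch_q.pop_front();
    bool batch_mode = !dispatch_q.empty();
    std::cout << "execute '" << cmd << "' batch_mode=" << batch_mode << "\n";
  }
  return 0;
}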