Bulk: Add TimerWheel to Bulk to improve latency #1640
private readonly int congestionDecreaseFactor = 5;
private readonly int maxDegreeOfConcurrency;
private readonly TimerWheel timerWheel;
Can E2E performance tests be added to make it easy to detect regressions in this?
The issue here is where to put them. The Tools/Benchmark project is not the right place, and we don't have any E2E test projects.
We don't need to block on this. This can be done when the rest of the perf gates are done.
I think we need a new perf project and test project that are explicit contract validation tests.
* using timerwheel
* tests
Closing due to inactivity, please feel free to re-open.
Description
The current Bulk implementation is optimized for high throughput utilization, which is the ideal scenario for Bulk.
There are, however, scenarios where the volume of data might be dynamic in nature and not always high (compared with the provisioned RU/s). In these scenarios where the volume of data is low compared with the provisioned throughput, Bulk leverages an internal timer to dispatch the batches that don't get filled.
This internal timer leverages a construct called TimerPool, which is an implementation of an ordered timer list. The TimerPool has a minimum dispatch time of 1 second, so the internal timers we were using for Bulk could dispatch half-filled batches only at 1-second intervals.
The effect was that low data volume usages had 1-2 second latency (see below in performance baselines).
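To make the 1-2 second figure concrete, here is a minimal sketch of how timer resolution rounds up the wait for a partially filled batch. It is a simplified model, not SDK code: `DispatchLatencyModel` and `FireTime` are hypothetical names, and it assumes the batch timeout equals the timer resolution and that timers fire only on multiples of that resolution.

```csharp
using System;

// Hypothetical helper, not part of the SDK: models a timer that can only
// fire on multiples of a fixed resolution.
public static class DispatchLatencyModel
{
    // Returns the time at which a timer started at `requestedAt` with the
    // given `timeout` fires, assuming firing happens only on tick boundaries.
    public static TimeSpan FireTime(TimeSpan requestedAt, TimeSpan timeout, TimeSpan resolution)
    {
        long dueTicks = (requestedAt + timeout).Ticks;
        long res = resolution.Ticks;
        long boundaries = (dueTicks + res - 1) / res; // ceiling division
        return TimeSpan.FromTicks(boundaries * res);
    }
}

public static class Program
{
    public static void Main()
    {
        // Assumed arrival time of the first item in an otherwise empty batch.
        TimeSpan enqueuedAt = TimeSpan.FromMilliseconds(250);
        foreach (int ms in new[] { 1000, 100 })
        {
            TimeSpan resolution = TimeSpan.FromMilliseconds(ms);
            // Assumption: the batch timeout equals the timer resolution.
            TimeSpan fired = DispatchLatencyModel.FireTime(enqueuedAt, resolution, resolution);
            Console.WriteLine($"{ms} ms resolution: waited {(fired - enqueuedAt).TotalMilliseconds} ms");
        }
    }
}
```

Under these assumptions, an item that arrives 250 ms after a tick waits 1750 ms with a 1-second resolution and 150 ms with a 100 ms resolution, which is in line with the latency range described above.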
This PR changes the TimerPool for a TimerWheel (introduced in #1612) with a 100ms dispatch time. The effect is that in low-volume scenarios, overall latency improves by up to 90%, while normal high-volume scenarios are unaffected.
Why not use a TimerWheel with < 100ms? During performance benchmarking, a lower value (for example 50ms) only helped when the number of documents was less than the internal batch size (100). In all other scenarios, performance ended up worse than with 100ms because batches were dispatched too quickly, before they had a chance to fill.
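For readers unfamiliar with the general technique, the following is a simplified, single-threaded timer wheel. It is not the TimerWheel from #1612: `SimpleTimerWheel`, `Schedule`, and `Tick` are hypothetical names, and the wheel is advanced manually here so the behavior is deterministic, whereas a real implementation would presumably be driven by a periodic system timer.

```csharp
using System;
using System.Collections.Generic;

// Illustrative only: a minimal timer wheel with hypothetical names and API.
public sealed class SimpleTimerWheel
{
    private readonly List<Action>[] buckets;
    private readonly TimeSpan resolution;
    private int currentIndex;

    public SimpleTimerWheel(TimeSpan resolution, int bucketCount)
    {
        if (resolution <= TimeSpan.Zero) throw new ArgumentOutOfRangeException(nameof(resolution));
        if (bucketCount < 2) throw new ArgumentOutOfRangeException(nameof(bucketCount));
        this.resolution = resolution;
        this.buckets = new List<Action>[bucketCount];
        for (int i = 0; i < bucketCount; i++) this.buckets[i] = new List<Action>();
    }

    // Registers a callback to fire after `timeout`, rounded up to whole ticks.
    public void Schedule(TimeSpan timeout, Action callback)
    {
        long ticksAhead = (timeout.Ticks + this.resolution.Ticks - 1) / this.resolution.Ticks;
        if (ticksAhead < 1) ticksAhead = 1; // never fire in the current tick
        if (ticksAhead > this.buckets.Length) throw new ArgumentOutOfRangeException(nameof(timeout));
        int index = (int)((this.currentIndex + ticksAhead) % this.buckets.Length);
        this.buckets[index].Add(callback);
    }

    // Advances the wheel by one tick and fires every callback in the new bucket.
    public void Tick()
    {
        this.currentIndex = (this.currentIndex + 1) % this.buckets.Length;
        List<Action> due = this.buckets[this.currentIndex];
        Action[] toFire = due.ToArray(); // copy so callbacks may reschedule safely
        due.Clear();
        foreach (Action callback in toFire) callback();
    }
}

public static class Program
{
    public static void Main()
    {
        var wheel = new SimpleTimerWheel(TimeSpan.FromMilliseconds(100), bucketCount: 10);
        bool dispatched = false;
        // A 250 ms timeout rounds up to 3 ticks of 100 ms.
        wheel.Schedule(TimeSpan.FromMilliseconds(250), () => dispatched = true);
        for (int tick = 1; tick <= 3; tick++)
        {
            wheel.Tick();
            Console.WriteLine($"tick {tick}: dispatched = {dispatched}");
        }
    }
}
```

Scheduling and firing are both constant-time bucket operations, which is the usual reason to pick a wheel when many short timers are created. The cost is that every timeout is rounded up to the wheel's resolution, so the resolution directly bounds how early a partially filled batch can be dispatched.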
Performance comparison
All comparisons were made using an Indexing Policy that does not index any field to increase throughput:
All time measurements were calculated by running the scenario 10 times and obtaining the average.
Container with 100K RU/s provisioned, Item size 1KB
Baseline (with PooledTimer 1s)
Change (with TimerWheel 100ms)
Container with 10K RU/s provisioned, Item size 1KB
Baseline (with PooledTimer 1s)
Change (with TimerWheel 100ms)
Versus TimerPool
To verify that TimerWheel is not bringing more allocations into the picture, this is a comparison between a 1 sec TimerPool and 1 sec TimerWheel, creating 10000 timers and creating 1 timer:
Type of change