-
Notifications
You must be signed in to change notification settings - Fork 261
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
async batch verification (+40% sig verification throughput) #5176
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excellent.
In that case we can probably go a bit further, create the Taskpool directly in another thread.
Because in this case, assuming you have a Rpi with 4 cores, only 3 will work on cryptography and there is 1 which will handle the rest but I think the rest might not be heavy enough to fully utilize it.
In that case we have a slight oversubscription but the OS routinely handles that fine.
Main issue is changing taskpools to handle work submitted from a thread that is not part of the threadpool.
# If the hardware is too slow to keep up or an event caused a temporary | ||
# buildup of signature verification tasks, the batch will be dropped so as to | ||
# recover and not cause even further buildup - this puts an (elastic) upper | ||
# bound on the amount of queued-up work |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens to the dropped work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the attestations are dropped instead of being processed -> no gossip etc
26328f4
to
6a4e9a4
Compare
# Pruning the queue here makes sure we catch up with processing if need be | ||
batchCrypto.pruneBatchQueue() # Skip old batches | ||
proc batchVerifyTask(task: ptr BatchTask) {.nimcall.} = | ||
# Task suitable for running in taskpools - look, no GC! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I see you're just skipping the GC.
3af74be
to
23ed805
Compare
with a Flowvar in hand, one can wait for specific tasks instead of the whole pool so which thread "owns" the pool should ideally be less critical - it's already an mpmc:ish kind of setup where tasks can spawn more tasks and wait for tasks from anywhere, no? This, along with a preference to post tasks to not-this-thread. potentially one should allow |
23ed805
to
b8ad597
Compare
When batch verification is done, the main thread is blocked reducing concurrency. With this PR, the new thread signalling primitive in chronos is used to offload the full batch verification process to a separate thread allowing the main threads to continue async operations while the other threads verify signatures. Similar to previous behavior, the number of ongoing batch verifications is capped to prevent runaway resource usage. In addition to the asynchronous processing, 3 addition changes help drive throughput: * A loop is used for batch accumulation: this prevents a stampede of small batches in eager mode where both the eager and the scheduled batch runner would pick batches off the queue, prematurely picking "fresh" batches off the queue * An additional small wait is introduced for small batches - this helps create slightly larger batches which make better used of the increased concurrency * Up to 2 batches are scheduled to the threadpool during high pressure, reducing startup latency for the threads Together, these changes increase attestation verification throughput under load up to 30%.
When batch verification is done, the main thread is blocked reducing concurrency.
With this PR, the new thread signalling primitive in chronos is used to offload the full batch verification process to a separate thread allowing the main threads to continue async operations while the other threads verify signatures.
Similar to previous behavior, the number of ongoing batch verifications is capped to prevent runaway resource usage.
In addition to the asynchronous processing, 3 addition changes help drive throughput:
Together, these changes increase attestation verification throughput under load up to 40%.
More CPU cores used (110 vs 140% usage - the cores here are running other nimbus' - on my node, CPU hovers around 180% usage after this PR):
Better attestation throughput (7k -> ~10k/slot)
Increased batch size (leading to more efficient verification):
Depends on:
status-im/nim-chronos#406
status-im/nim-blscurve#154
status-im/nim-taskpools#33
#5169