This repository has been archived by the owner on Nov 6, 2020. It is now read-only.
As detailed in #9531, the ancient block import queue fills up frequently, which was causing the ancient blocks sync to retract. That fix should prevent the retractions, but the queue still fills up fairly often, probably due to contention with the `NewBlocks` sync or slow writes to disk (this needs investigation). As a result, many downloaded ancient blocks are discarded (those that can't fit in the queue) and have to be downloaded again. The problem gets worse once we get into the `3_000_000+` range, where we download blocks faster than we can import them.
Brainstorming solutions
One solution would be to reuse the mechanism of the `NewBlocks` sync, which pauses the downloading of blocks when the queue is full. For that to work we would need a way to keep the downloaded blocks that couldn't go in the queue. We could make the queue unbounded (like the transaction and main block import queues), allowing the current batch to be added in its entirety, but with a check against `queue_full` that would pause the downloading of further blocks.
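The unbounded-queue-plus-`queue_full`-check idea might look roughly like the sketch below. `AncientQueue`, its block representation, and the capacity constant are hypothetical illustrations, not the actual parity-ethereum types:

```rust
// Sketch: accept whole batches unconditionally, but expose a fullness
// check the downloader consults before requesting more blocks.
use std::collections::VecDeque;

const QUEUE_CAPACITY: usize = 4; // illustrative; the real capacity differs

struct AncientQueue {
    blocks: VecDeque<u64>, // block numbers stand in for full block data
}

impl AncientQueue {
    fn new() -> Self {
        AncientQueue { blocks: VecDeque::new() }
    }

    // Unbounded push: the current batch is always accepted in full,
    // so no downloaded block is ever discarded.
    fn push_batch(&mut self, batch: Vec<u64>) {
        self.blocks.extend(batch);
    }

    // The downloader checks this before requesting the next batch.
    fn queue_full(&self) -> bool {
        self.blocks.len() >= QUEUE_CAPACITY
    }
}

fn main() {
    let mut queue = AncientQueue::new();
    queue.push_batch(vec![1, 2, 3, 4, 5]); // whole batch kept, even past capacity
    assert_eq!(queue.blocks.len(), 5);
    assert!(queue.queue_full()); // the downloader would pause here
    println!("paused: {}", queue.queue_full());
}
```

The key property is that overshoot is bounded by one batch: the queue can exceed capacity only by however many blocks arrived in the batch that tipped it over.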
A potentially simpler alternative might be to make the number of blocks requested per round equal to the queue capacity, and only move to the next round once the queue is empty. The downside is that this would slow down syncing, though that may not be much of a problem. We could mitigate it by waiting until the queue is half (or some percentage) full and then requesting that many blocks. We might still occasionally hit a full queue because of a race condition, but we can handle that.
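The half-full heuristic above can be sketched as a small sizing function. The function name and capacity constant are hypothetical, purely to illustrate the policy:

```rust
// Sketch: size the next ancient-block request from the queue's remaining
// headroom, requesting nothing until the queue has drained to half full.
const QUEUE_CAPACITY: usize = 128; // illustrative

fn next_request_size(queued: usize) -> usize {
    if queued > QUEUE_CAPACITY / 2 {
        0 // still draining; skip this round
    } else {
        QUEUE_CAPACITY - queued // top up to capacity
    }
}

fn main() {
    assert_eq!(next_request_size(128), 0);  // full: wait
    assert_eq!(next_request_size(100), 0);  // above half: wait
    assert_eq!(next_request_size(64), 64);  // at half: request the headroom
    assert_eq!(next_request_size(0), 128);  // empty: request full capacity
    println!("ok");
}
```

Because the request never exceeds the current headroom, a full queue can only arise from the race the text mentions (new blocks landing between the check and the request), not from over-requesting.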
Another avenue would be to investigate whether we can improve the import speed: is it limited by contention with the `NewBlocks` sync, or just by disk I/O?
Pinging @tomusdrw as we've chatted about this before, if you have any thoughts.
I think this fits into a more general rework of how we import things from the network. I believe it would be worth trying `mpsc::sync_channel` for all the cases we have. The idea is that one thread handles the import, and networking tasks just push work onto the channel; when the bound is reached (to be determined experimentally), the channel starts blocking the sending threads until the importer catches up, which effectively back-pressures the sync process.
This has to be carefully tested, though, because we don't really want all IoWorker threads to be blocked on tasks trying to import to the DB (we would essentially return to the issues we had before this process was made asynchronous). Potentially this could be reworked further to specialise the IoWorker threads and give different behaviour to "write" and "read-only" tasks: we really want the networking side to be able to respond to all requests and keep connections running (best effort) while we are busy importing blocks, but we want to back-pressure the import of new blocks/transactions when we are struggling with the ones we already have. That would require some kind of prioritisation mechanism for the different types of tasks run on the IoWorkers (opening/closing connections, responding to ping, and handling different kinds of packets would need different priorities).
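The `sync_channel` back-pressure idea can be demonstrated minimally with the standard library alone; the thread roles here are stand-ins for the networking and import sides, not the actual parity-ethereum code:

```rust
// Minimal demonstration of back-pressure via a bounded channel: once the
// bound is reached, std::sync::mpsc::sync_channel blocks the sending
// ("networking") thread until the import thread drains items.
use std::sync::mpsc::sync_channel;
use std::thread;
use std::time::Duration;

fn main() {
    // Bound of 2: a send while two items are queued blocks the sender.
    let (tx, rx) = sync_channel::<u64>(2);

    let importer = thread::spawn(move || {
        let mut imported = Vec::new();
        // Simulate a slow import loop draining the channel.
        while let Ok(block) = rx.recv() {
            thread::sleep(Duration::from_millis(10));
            imported.push(block);
        }
        imported
    });

    // The "networking" side pushes five blocks; later sends block
    // whenever the channel is full, back-pressuring this thread.
    for block in 1..=5 {
        tx.send(block).unwrap();
    }
    drop(tx); // close the channel so the importer's recv loop exits

    let imported = importer.join().unwrap();
    assert_eq!(imported, vec![1, 2, 3, 4, 5]);
    println!("imported {} blocks in order", imported.len());
}
```

This captures the trade-off discussed above: back-pressure comes from blocking the sender, so if the senders are shared IoWorker threads, a slow importer can stall unrelated networking work unless read-only tasks are kept off those threads.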