Ancient block sync stalls #7008
Comments
Is this normal behaviour or a misunderstanding? Would somebody from Parity please document this behaviour? |
That's the ancient block download. Warp-sync downloads the latest snapshot and the last 30k blocks. After that, it starts downloading the full blockchain (yellow numbers). |
@5chdn @roninkaizen Please check the block numbers: ancient blocks are being downloaded in the same range again and again. |
This is really weird indeed. Never seen this before. Could you restart with |
yes, please: Trace-Logs: |
Having similar issues with 1.8.2 |
Can we also make sure we do not purge ancient blocks whenever a warp sync kicks in? |
As a workaround:
@5chdn That's a separate issue, can you log it? |
👍 for this workaround. With a 1.10.0 nightly build, warp took only ~40min from a fresh db! Peer connections are still really volatile, just dropped to ~10 peers. Got to a max of 25 peers, but spending most time with 1-5 peers. Will report back with sync status, currently at block 4880220. Warp synced to block 4880000.

Update: confirmed, fully synced from scratch using warp and fast compaction!! Took ~30hrs from start-to-fully-synced. |
lght, this issue is about ancient block sync that happens after the warp sync. |
The "Ancient block sync stalls" issue has happened to me twice so far. Parity and Linux version strings follow:
Parity has been started with command:
It seems to happen once per ~1.5M blocks (1st occurrence at block 1639023, 2nd occurrence at block 2996831). At the first occurrence, after a graceful restart (SIGHUP), the parity process continued to download ancient blocks from where it left off. There were parity process restarts in between the occurrences.

The second occurrence has just happened, and I am keeping the process running in the stalled state. The blockchain head keeps syncing. If you need any further info from the stalled process or need access to the system, let me know (there is no private info on the system). If I do not hear back within 3 days, I plan to restart the process with TRACE loglevel. I expect the issue could happen again before all blocks are downloaded.

While the process was in the stalled state, I took a coredump of it using gcore. Attached is the gdb backtrace gdb_core_9187.txt. The log is at info level, so probably not useful. If you need the coredump file, I will share it (it contains no private info). If it helps, I can rebuild parity in a different way, e.g. with symbols and without optimizations. I would appreciate a hint on how to do that, as I have zero Rust knowledge. |
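For anyone who wants to catch the stall automatically instead of watching the log, here is a minimal sketch of the condition described above: the head keeps advancing while the ancient block number stays fixed. The `Sample` type and the `ancient_sync_stalled` helper are hypothetical names for illustration, not part of Parity. How the samples are collected is left open; one possibility, assumed here and not verified against this Parity version, is polling the `blockGap` field of the `parity_chainStatus` RPC.

```python
from typing import List, NamedTuple


class Sample(NamedTuple):
    """One observation of sync state (hypothetical structure)."""
    head: int     # newest block number at sampling time
    ancient: int  # highest ancient block imported so far


def ancient_sync_stalled(samples: List[Sample], min_samples: int = 3) -> bool:
    """Return True if, over the last `min_samples` observations,
    the head advanced while the ancient block number did not change."""
    if len(samples) < min_samples:
        return False  # not enough data to decide
    recent = samples[-min_samples:]
    head_advanced = recent[-1].head > recent[0].head
    ancient_frozen = all(s.ancient == recent[0].ancient for s in recent)
    return head_advanced and ancient_frozen
```

The head numbers fed to it would come from the same source as the ancient numbers; sampling every few minutes with a larger `min_samples` should avoid false alarms during short peer dropouts.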
This just happened to one of my fresh 1.10.0 nodes if anyone wants to debug this. |
I have tried twice with yesterday's commit 6552256. It might be attributed to compilation with rustc 1.26. Plus, there seems to be different memory allocation behavior when compiled with rustc 1.26 instead of 1.25. Either there is a memory leak, or it cannot handle running in an OpenVZ container as opposed to a KVM VM. Though running in an OpenVZ container is probably an uninteresting use case (I am using it because it comes at a good price), it may help to know that the parity process in the containerized node always dies after some time, which is not the case when compiled with 1.25. The OpenVZ container correctly reports the total amount of memory available, but I am not sure whether it is able to indicate memory allocation failure via ENOMEM under memory pressure or just invokes the OOM killer.
in a container with 4GB RAM. Setting I am sorry I could not spend more time as this is just a hobby. Currently, I am trying to run both my nodes with vanilla 1.10.4 stable release binaries compiled with 1.25, downloaded from github. |
@GoodMirek Thanks so much for this, it's really helpful. However, I don't think it's related to this issue. I'd also say that 4GB RAM is probably not enough to run Parity with those caches; RAM usage can depend a bit on peers as well, although not a ton. My own node consumes about 10GB RAM with similar cache sizes, though that is still a bit above expectation for various reasons. Investigating running Parity under OpenVZ vs a KVM VM would be a separate issue. It should work, but I personally don't know the requirements on the VM side. |
@folsen The point is that in the same container, with the same command line options, a previous version built with rustc 1.25 ran just fine for more than a week. I do not remember which commit hash it was, but it was not more than a month old. |
@GoodMirek Interesting, please open a separate issue for the difference between 1.25 and 1.26, we definitely don't want regressions between rust versions. |
Configuration detail: chain is symlinked to another drive
After warp sync, I'm unable to get all missing blocks fetched. After each restart, Parity forgets the already fetched blocks and starts downloading them again and again.
Here is a sample of 3 parity runs
Here is the log file from above (as text file)
parity-3x-sync.txt
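To check a set of runs like these for lost progress, the comparison can be sketched as follows. It assumes you have already extracted, by whatever means, the first and last ancient block number seen in each run; the `runs_that_refetched` helper is a hypothetical name and the numbers in the usage example are made up, not taken from the attached log.

```python
from typing import List, Tuple


def runs_that_refetched(runs: List[Tuple[int, int]]) -> List[int]:
    """runs[i] = (first, last) ancient block numbers observed in run i.

    Return the indices of runs that started at or below the last block
    the previous run had already fetched, i.e. runs that downloaded
    part of the same range again instead of resuming."""
    refetched = []
    for i in range(1, len(runs)):
        _, prev_last = runs[i - 1]
        first, _ = runs[i]
        if first <= prev_last:
            refetched.append(i)
    return refetched


# Illustrative values only: every run starts over from the same block.
print(runs_that_refetched([(100, 5000), (100, 4800), (100, 5100)]))
```

An empty result would mean every run resumed past where the previous one stopped, which is the behaviour expected after a clean restart.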