Initial Sync disk IO / write amplification / disk usage #6280
Comments
You are not alone with this issue. Probably related, but not obviously:
|
I am the user who posted that the size of my node when I don't run parity is about 235GB. When I launched it overnight with this command: parity --pruning archive --snapshot-peers 40 --cache-size-db 256 --cache-size-blocks 128 --cache-size-queue 256 --cache-size-state 256 --cache-size 4096 --db-compaction hdd it peaked at 430GB in the morning, and when I closed parity it went back to around 230GB. For information about my system I use:
and here's a pastebin of my syncing with parity |
Cache settings do not really affect write amplification. They are designed to reduce read amplification and minimize block processing times when the node is up to date. We have a long-term plan to move to a custom database backend that would allow for more efficient state I/O. |
Subscribing to this as well. It's so annoying: parity is eating I/O like a monster constantly, 1000 times more than bitcoin or any bitcoin-based coin. Is there currently no workaround to control/limit the I/O without breaking the syncing process, @arkpar? |
@gituser could you post logs? |
Parity running with
just went from 250G to 500G in 1 hour, filling the volume. After resizing the volume to 750G, disk usage drops back to 250G as soon as I started parity again. I'm constantly having WTF moments working with parity :) It seems we have to deploy monit to restart parity as soon as it goes nuts to prevent it filling the whole volume in minutes. |
The work on that already started, but it includes a new database layer and a lot of refactoring. It will not be available before 1.8. #6418 |
The growing starts at around 09:30 where the reorg is logged. Reorgs happened often before so it's not clear this has anything to do with the issue. For reference:
|
The thing about archiveDB is that it keeps everything. It will keep the full state of all blocks processed, even those which are eventually reorganized out of the chain. I'm not sure how well rocksdb handles having that much data, but it will definitely put a strain on your storage. I am not sure that even a specialized database (which we are in the process of building) would alleviate this much. Something more useful might be a semi-pruned mode, where we discard non-canonical states after a certain point, but keep all state of canonical blocks. |
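To make the semi-pruned idea concrete, here is a toy sketch of such a retention policy. It is not Parity's implementation: the `StateEntry` type, the `semi_prune` function, and the `keep_depth` parameter are all hypothetical names introduced for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StateEntry:
    number: int       # block height
    block_hash: str   # hypothetical identifier for the block
    canonical: bool   # True if the block is on the canonical chain


def semi_prune(entries, head_number, keep_depth):
    """Keep every canonical state; drop non-canonical states that are
    at least keep_depth blocks behind the current head."""
    kept = []
    for entry in entries:
        recent = head_number - entry.number < keep_depth
        if entry.canonical or recent:
            kept.append(entry)
    return kept


entries = [
    StateEntry(100, "a", True),
    StateEntry(100, "a'", False),  # reorganized out long ago
    StateEntry(199, "b", True),
    StateEntry(199, "b'", False),  # recent fork, could still win a reorg
]
kept = semi_prune(entries, head_number=200, keep_depth=64)
print([e.block_hash for e in kept])
```

Recent non-canonical states are kept in this sketch because a reorg could still make them canonical; only forks older than the assumed depth are discarded.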
I created a chart of the chain folder size while syncing parity (v1.6.0) in A) For the purple line the zig-zag is due to the regular pruning that occurs, right? Higher resolution: https://imgur.com/a/cx9et |
This seems to be the result of compacting the RocksDB. RocksDB writes a LOT to disk and from time to time this is compacted, resulting in massive disk usage drops. |
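A rough way to see the zig-zag for yourself is to sample the size of the chain folder at fixed intervals. The sketch below is only an assumption about how such a measurement could be done; the function names are made up here, and it sums apparent file sizes, which can differ slightly from what `df` reports.

```python
import os
import time


def dir_size_bytes(path):
    """Sum the sizes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.path.getsize(os.path.join(root, name))
            except OSError:
                # Files can disappear between listing and stat,
                # for example while the database is compacting.
                pass
    return total


def sample_sizes(path, interval_s, samples):
    """Return a list of (timestamp, size_in_bytes) pairs."""
    out = []
    for i in range(samples):
        out.append((time.time(), dir_size_bytes(path)))
        if i + 1 < samples:
            time.sleep(interval_s)
    return out
```

Calling `sample_sizes(chain_dir, 60, 120)` with your own chain directory would give two hours of one-minute samples that can be plotted directly.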
I'm experiencing the exact same thing as @jlopp, even wrote a similar script. |
Just to follow up on this, we were syncing Parity 1.7.2 with a 500 GB disk. Eventually we increased it to a 1 TB disk and were able to complete the sync. So there definitely appears to be a huge inefficiency somewhere that is causing the disk usage to be far higher than it needs to be. I just checked and one of our nodes that is still syncing is using 660GB of disk space, but if I restart parity it drops to 300GB. |
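For a sense of scale, the figures reported so far in this thread can be turned into a rough peak-to-settled ratio. This is plain arithmetic on the numbers quoted above (assuming Python 3 division); the labels and the function name are made up for the example.

```python
# (label, peak size in GB, size after restarting parity in GB)
reports = [
    ("archive sync with --db-compaction hdd", 430, 230),
    ("volume filled within 1 hour", 500, 250),
    ("Parity 1.7.2 sync", 660, 300),
]


def space_amplification(peak_gb, settled_gb):
    """Ratio of peak disk usage to usage after a restart."""
    return peak_gb / settled_gb


for label, peak, settled in reports:
    ratio = space_amplification(peak, settled)
    print("{0}: {1:.2f}x, {2} GB reclaimable".format(label, ratio, peak - settled))
```

In all three reports the disk needs roughly twice the space that the data occupies once the database has settled.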
Yep. Looks like a more permanent fix has been pushed out to 1.9 - as an interim solution, is there some way that Parity could trigger DB compaction more frequently, instead of having to stop and restart the process? |
We are looking for a more permanent solution for this and have started working on our own database implementation: https://github.com/debris/paritydb/ But 1.8 is about to be released very soon, so I modified the milestone. |
Cool; worth noting that I ran into similar issues with Ripple nodes - they also use RocksDB by default. Ripple ended up writing their own DB called NuDB and when we switched to it, the problems were fixed. |
In case some weary traveler with finite disk space happens upon this github ticket before v1.9 comes out, here's my simple script to get sync working on a Mac:

import subprocess
import os
import time
import signal

while True:
    print("Running Parity...")
    proc = subprocess.Popen(['parity', '--tracing', 'on', '--pruning', 'archive'])
    print("Parity running with pid {0}".format(proc.pid))
    # Poll free disk space every 30 seconds until it runs low.
    while True:
        time.sleep(30)
        # https://stackoverflow.com/a/787832
        s = os.statvfs('/')
        gigs_left = (s.f_bavail * s.f_frsize) / 1024 / 1024 / 1024
        print('{0} GB left'.format(gigs_left))
        if gigs_left < 90:
            break
    # Stop Parity so the space is released, then loop around and restart it.
    print("Terminating Parity...")
    os.kill(proc.pid, signal.SIGINT)
    proc.wait()
|
Has anyone suggested an archive mode that stores only the balances at each block? I'm working on a fully decentralized accounting/auditing project that has been working fine since summer 2016, but over the last few weeks Parity has been failing constantly because its disk usage grows from 400GB to over 800GB about twice a day. This blows out my 1TB drive. The recent article about the chain's size argues that archive mode is unneeded (and does not increase security) because one can always rebuild the state by replaying transactions. This is true, and a perfectly legitimate position, but it misses a point. Without some source of a "double-check" that the rebuilding of state from transactions is accurate, it's impossible to have any faith in the results. You can end up at the end of the process with the same state, but what happens if you don't? You have a bug, but without an archive of previous states, finding that bug is impossible. What I'd like is a mode where, at each block, my code (which is building state from transaction history) could double-check that it's correct and quickly identify problems. I know that some addresses don't even carry a balance, so this doesn't work for every address, but it would work for "accounting", where balances are all that really matters. Upshot: add a feature called |
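The kind of double-check described here can already be done against a full archive node through the standard `eth_getBalance` JSON-RPC method, which accepts a block number. The sketch below only builds the request and compares a response, without any network code; the helper names, the zero address, and the sample response are assumptions for illustration.

```python
import json


def balance_request(address, block_number, request_id=1):
    """Build an eth_getBalance request for a balance at a given block."""
    return {
        "jsonrpc": "2.0",
        "method": "eth_getBalance",
        "params": [address, hex(block_number)],
        "id": request_id,
    }


def parse_balance(response):
    """The result is a hex-encoded quantity in wei."""
    return int(response["result"], 16)


def balance_matches(replayed_wei, response):
    """True if the balance rebuilt from transactions agrees with the node."""
    return replayed_wei == parse_balance(response)


# Hypothetical address and a made-up response, for illustration only.
address = "0x0000000000000000000000000000000000000000"
request = balance_request(address, 2000000)
print(json.dumps(request))

sample_response = {"jsonrpc": "2.0", "id": 1, "result": "0xde0b6b3a7640000"}
print(balance_matches(10 ** 18, sample_response))
```

The request would be sent as an HTTP POST to the node's JSON-RPC endpoint. Answering it for old blocks is exactly what requires the archived state, which is why a balances-only mode would cover this use case.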
Also a possible solution would be to store a checkpoint state every |
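As a sketch of the trade-off with checkpoints: if a state were stored at some fixed interval, reconstructing any other block's state would mean replaying from the nearest earlier checkpoint. The comment does not say what interval is intended, so `interval` below is purely a hypothetical parameter, as are the function names.

```python
def nearest_checkpoint(block_number, interval):
    """Highest checkpoint block at or below block_number."""
    return (block_number // interval) * interval


def blocks_to_replay(block_number, interval):
    """Number of blocks to re-execute after loading the checkpoint state."""
    return block_number - nearest_checkpoint(block_number, interval)


# Hypothetical interval of 10,000 blocks.
print(nearest_checkpoint(2004321, 10000))
print(blocks_to_replay(2004321, 10000))
```

A larger interval stores fewer states but raises the worst-case replay cost, which is bounded by `interval - 1` blocks.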
I asked a question a couple of days ago about the snapshot in Parity: (1) does the snapshot work even if one is not using archive mode, and (2) can I get at the data in the snapshot? There's a continuum from full archive mode to warp mode. Storing just balances would be closer to full archive, and giving access to snapshots would be closer to warp mode. Both would work. Balances at every block would be easier for my work, but either would be welcome because full archive is a real problem. |
Why do we care about old state anyway? Do smart contracts look back in time? I thought they can only see the blockchain and receipts. I don't believe they need to be able to see the full history of each account, but perhaps I'm wrong. Nodes run synchronously and check the current state of current variables. My understanding is that by having the full state you can run any and all transactions and therefore collect transaction gas fees etc. With a partial database you could not run all transactions broadcast, but you could still run a lot. Having said that, I reckon it would be feasible to write a node that selectively dumps massive chunks of old unused state by performing some kind of opinionated and largely negative analysis of the chance of a future transaction ever happening. Cos there must be a fair amount of crunk junk data in there. Plus way too much use of int256 means a lot of 0x000000 in front of yer digits. |
I'm trying to set up a full node with complete history, thus pruning=archive.
Disk IO looks like this on Amazon EC2 instance type c4.8xlarge:
Parity is constantly writing to disk at ~300-500 MB/s, and peaks reach ~5000 IOPS.
What really bothers me is that parity is wasting so much disk space. There are times when I can see /home grow by 1 GB/s just by watching df -h /home.
Some time after 2 million blocks had been processed, disk usage on /home was ~80 GB from parity alone. When I stop parity, that 80 GB magically shrinks to ~37 GB, only to grow at 1 GB/s again after parity restarts.
Parity even ran out of disk space after filling up a 100GB EBS volume on Amazon AWS and at that time it had only downloaded about 50% of blocks.
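Instead of eyeballing `df -h`, the growth rate and the time left before the volume fills can be estimated from two samples. This is a minimal sketch using `os.statvfs` (Unix only); the function names are made up here and the numbers in the example are illustrative, loosely based on the figures above.

```python
import os

GIB = 1024 ** 3


def used_bytes(path):
    """Bytes currently used on the filesystem containing path."""
    s = os.statvfs(path)
    return (s.f_blocks - s.f_bfree) * s.f_frsize


def growth_rate(used_before, used_after, seconds):
    """Average growth in bytes per second between two samples."""
    return (used_after - used_before) / seconds


def seconds_until_full(free_bytes, rate):
    """Time until the volume is full, or None if usage is not growing."""
    if rate <= 0:
        return None
    return free_bytes / rate


# Illustrative numbers: 10 GiB written in 10 seconds, 63 GiB still free.
rate = growth_rate(37 * GIB, 47 * GIB, 10)
print(rate / GIB, "GiB/s")
print(seconds_until_full(63 * GIB, rate), "seconds until full")
```

In practice the two samples would come from calling `used_bytes("/home")` a few seconds apart.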
My questions are: