disable state history log file compression #33
Merged
I would like to disable ship's log file compression.
Compressing and writing the log is done synchronously on the main thread, so I feel these operations should be as quick as possible. (I can think of some complex ways to compress on another thread, but would suggest not adding more complexity.) This change writes uncompressed zlib streams into the log. The benefit of doing it this way is that the logs remain backward compatible with older versions: nothing about the log format changes. A small overhead remains due to the zlib framing and checksumming.
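To illustrate why the format stays compatible, here is a minimal Python sketch (not the code in this PR) using the standard `zlib` module. It assumes the log entries are ordinary zlib streams; `stored_stream` and `payload` are names made up for the example. At level 0 the data is written as stored blocks, so any standard inflate call still reads it, and the only extra bytes are the stream header, per-block framing, and the Adler-32 checksum.

```python
import zlib


def stored_stream(payload: bytes) -> bytes:
    """Wrap payload in a zlib stream at level 0 (stored blocks, no compression)."""
    return zlib.compress(payload, 0)


# 1 MiB of sample data standing in for a log entry (hypothetical payload).
payload = bytes(range(256)) * 4096

stored = stored_stream(payload)

# A reader that only knows how to inflate zlib streams needs no changes.
roundtrip = zlib.decompress(stored)

# Framing and checksum overhead, in bytes, relative to the raw payload.
overhead = len(stored) - len(payload)
print(f"payload={len(payload)} stored={len(stored)} overhead={overhead}")
```

The overhead here is a small fraction of a percent of the payload size, which matches the "small overhead" noted above.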
The performance benefit from this change is substantial: the initial state write drops from 12m30s to 6m20s. A significant part of the remaining time is spent in what looks like grossly inefficient ostream churn, so I suspect that once the writing is optimized further, the zlib compression will turn out to have carried more than a 2x overhead.
Of course, the logs will be larger. In this case the initial state increased from 5.2GB to 44GB. A user who cares about space usage has other options available, such as filesystem compression (which can use better compression algorithms like zstd). Or we could potentially offer a configuration item for compression vs. no compression?
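If a configuration item were added, the writer could simply pick the zlib level from it while the reader stays the same. A rough Python sketch of that idea follows; the `compress` flag and the `encode_entry` / `decode_entry` helpers are hypothetical and do not correspond to any existing option or function in the codebase.

```python
import zlib


def encode_entry(payload: bytes, compress: bool) -> bytes:
    """Encode a log entry as a zlib stream; `compress` is a hypothetical config flag."""
    level = zlib.Z_DEFAULT_COMPRESSION if compress else zlib.Z_NO_COMPRESSION
    return zlib.compress(payload, level)


def decode_entry(blob: bytes) -> bytes:
    """Decode a log entry; works for both settings because both are valid zlib streams."""
    return zlib.decompress(blob)


# Repetitive sample data standing in for state history content (hypothetical).
sample = b"account:balance:0000;" * 50_000

compressed_entry = encode_entry(sample, compress=True)
plain_entry = encode_entry(sample, compress=False)

print(f"compressed={len(compressed_entry)} uncompressed={len(plain_entry)}")
```

Because both settings produce valid zlib streams, a log could in principle mix entries written under either setting without the reader needing to know which one was used.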