Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Zstd/LZ4 support for region files (1.15 edition) #2814

Closed
wants to merge 18 commits into from

Conversation

egg82
Copy link
Contributor

@egg82 egg82 commented Jan 1, 2020

Zstd using zstd-jni: https://github.com/luben/zstd-jni
LZ4 using lz4-java: https://github.com/lz4/lz4-java

Added new command-line option, --forceWrite, which will forcibly write the chunk/flush to disk regardless of whether or not the chunk has actually changed in a --forceUpgrade.
New compression only happens when a chunk is saved, so running a --forceUpgrade AND --forceWrite may be desirable.

New config option for the compression method to use.
There's a new option in PaperConfig (compression level) for Zstd which defaults to 6.
There's a new option in PaperConfig (compression level) for Zlib which defaults to 5.

Since the patch hooks Mojang's versioning system for new compression methods, it
should be internally consistent, easily-updatable, and reversible with a --forceUpgrade AND a --forceWrite.

This will likely conflict with plugins and programs expecting region files to be compressed using the current standard.

  • Implementation
  • Dictionary

Zstd using zstd-jni: https://github.com/luben/zstd-jni
LZ4 using lz4-java: https://github.com/lz4/lz4-java

New compression only happens when a chunk is saved, so running a force-upgrade may be desirable.

New config option for the compression method to use.
There's a new option in PaperConfig (heavy compression) for Zstd which defaults to false.

Since the patch hooks Mojang's versioning system for new compression methods, it
should be internally consistent, easily-updatable, and reversible with a force-upgrade.

This will likely conflict with plugins and programs expecting region files to be compressed using the current standard.
Zstd using zstd-jni: https://github.com/luben/zstd-jni
LZ4 using lz4-java: https://github.com/lz4/lz4-java

New compression only happens when a chunk is saved, so running a force-upgrade may be desirable.

New config option for the compression method to use.
There's a new option in PaperConfig (heavy compression) for Zstd which defaults to false.

Since the patch hooks Mojang's versioning system for new compression methods, it
should be internally consistent, easily-updatable, and reversible with a force-upgrade.

This will likely conflict with plugins and programs expecting region files to be compressed using the current standard.
@egg82
Copy link
Contributor Author

egg82 commented Jan 1, 2020

Dictionary coming soon, as soon as I train one. This is the rest of the implementation completed.

Due to some updates in the Zstd library, this version should be even faster than the previous implementation!

@egg82
Copy link
Contributor Author

egg82 commented Jan 1, 2020

Dictionary is done and I made some suggested corrections.

@egg82
Copy link
Contributor Author

egg82 commented Jan 4, 2020

Test results
Ratio = "how many times can the compressed data fit into its own uncompressed data?" - so, higher is better.
Speed is in us, or microseconds. eg. 2000us = 2ms. Lower is better.
Dict (or dictionary) is with a dictionary trained on the uncompressed data.

Byte array decompression can be made more efficient by including the uncompressed size in the region/chunk headers, but using both byte arrays and adding the new header would require modification that is beyond the scope of this patch. Perhaps a future patch.

Zlib stream compression levels:
https://docs.google.com/spreadsheets/d/1eKatLMxA560RerGIRyYVWC8n4bkkRe4Uj0sBFg2wtOY/edit?usp=sharing
Zlib stream w/ dict compression levels:
https://docs.google.com/spreadsheets/d/1PVq4vWJWLyj-crCiB1nUdC13zc4HcCsPKotv5dLOU9c/edit?usp=sharing

Zstd stream compression levels:
https://docs.google.com/spreadsheets/d/1k0YL8nQEPYtV6pY1U_uRc2jfIz4WDObpmrM0KLWPvaM/edit?usp=sharing
Zstd stream w/ dict compression levels:
https://docs.google.com/spreadsheets/d/1P_75_VPDTjopefps26fgNnfspA-Zg7DEmkUVb-lQjz0/edit?usp=sharing

LZ4:
https://docs.google.com/spreadsheets/d/1iNcqo07ySM-r2-1DTBOmd5vl809u97AmacdxSOMiPCQ/edit?usp=sharing

Since 1.15 relies on streams, I decided not to fight it and let the ~15us penalty for streams over byte arrays go, as converting to a byte array would require more code, and thus be less maintainable down the line.

tl;dr from Zlib (current) to Zlib (new, default level) would mean:

  1. An average compression ratio from ~9.001 to ~8.875
  2. An average compression speed from ~651.813us to ~375.993us
  3. An average decompression speed from ~89.391us to ~91.567us

from Zlib (new, default level) to Zstd (new, default level, with the included dictionary) would mean:

  1. An average compression ratio from ~8.875 to ~21.299
  2. An average compression speed from ~375.993us to ~346.219us
  3. An average decompression speed from ~91.567us to ~40.155us

@Black-Hole
Copy link
Contributor

How about a versioned dictionary? With future Minecraft versions it might be useful to update the dictionary.

@electronicboy electronicboy reopened this Feb 7, 2020
@zachbr zachbr self-requested a review February 7, 2020 03:12
@electronicboy electronicboy self-requested a review February 7, 2020 03:13
Copy link
Member

@electronicboy electronicboy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just wanna clarify here,what's the behavior of the zstd dicts if we bump? Do we have to keep the existing dict for decompression chunks or is the dict saved/whatever with the chunk data?

+ if (regionCompressionMethod == RegionCompressionType.ZSTD) {
+ log("Using level " + regionCompressionZstdLevel + " for Zstd compression." + (regionCompressionZstdLevel >= 14 ? " (Make my PC hurt)" : ""));
+ }
+ regionCompressionZlibLevel = getInt("settings.region-compression.zlib-level", 5);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should default from the field, same for the one above too

@Spottedleaf
Copy link
Member

existing dict needs to be kept, only the dict version is stored on chunk data

@Proximyst Proximyst added status: rebase required type: feature Request for a new Feature. labels Aug 24, 2020
@egg82 egg82 requested a review from a team as a code owner August 25, 2020 02:41
@y6x2digc
Copy link

y6x2digc commented Sep 9, 2020

Any news about this?

@stale
Copy link

stale bot commented Dec 2, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@MiniDigger MiniDigger added status: accepted Disputed bug is accepted as valid or Feature accepted as desired to be added. and removed resolution: stale labels Dec 9, 2020
@egg82
Copy link
Contributor Author

egg82 commented Jan 8, 2021

Closed in favor of #5029

@egg82 egg82 closed this Jan 8, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
status: accepted Disputed bug is accepted as valid or Feature accepted as desired to be added. status: rebase required type: feature Request for a new Feature.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants