erofs-utils: update README for the upcoming 1.8
Add descriptions to multi-threaded compression and reproducible builds.

Signed-off-by: Gao Xiang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
hsiangkao committed Aug 8, 2024
1 parent 10c1590 commit 971f9eb
Showing 1 changed file with 67 additions and 27 deletions.
94 changes: 67 additions & 27 deletions README
@@ -54,51 +54,91 @@ mkfs.erofs

Two main kinds of EROFS images can be generated: (un)compressed images.

- For uncompressed images, there will be none of compressed files in
these images. However, it can decide whether the tail block of a
file should be inlined or not properly [1].
- For uncompressed images, there will be no compressed files in these
images. However, an EROFS image can contain files which consist of
various aligned data blocks and then a tail that is stored inline in
order to compact images [1].

- For compressed images, it'll try to use the given algorithms first
- For compressed images, it will try to use the given algorithms first
for each regular file and see if storage space can be saved with
compression. If not, fallback to an uncompressed file.
compression. If not, it will fall back to an uncompressed file.

How to generate EROFS images (LZ4 for Linux 5.3+, LZMA for Linux 5.16+)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Note that EROFS supports per-file compression configuration, so the
proper configuration options need to be enabled in the Linux kernel to
parse compressed files.

Currently lz4(hc) and lzma are available for compression, e.g.
How to generate EROFS images
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Compression algorithms could be specified with the command-line option
`-z` to build a compressed EROFS image from a local directory:
$ mkfs.erofs -zlz4hc foo.erofs.img foo/

Or leave all files uncompressed as an option:
Supported algorithms by the Linux kernel:
- LZ4 (Linux 5.3+);
- LZMA (Linux 5.16+);
- DEFLATE (Linux 6.6+);
- Zstandard (Linux 6.10+).
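
As a rough illustration of the list above, the following sketch maps a
kernel version to the algorithms it can read. The helper name
`erofs_algos_for` is hypothetical, and the lowercase names it prints
(`lz4`, `lzma`, `deflate`, `zstd`) are assumed to match what `-z` accepts;
the kernel must also have the corresponding configuration options
enabled, which this sketch does not check.

```shell
#!/bin/sh
# erofs_algos_for MAJOR MINOR (hypothetical helper, not part of erofs-utils):
# print the algorithms readable by a given Linux kernel version, using the
# minimum versions listed above.
erofs_algos_for() {
    v=$(( $1 * 1000 + $2 ))   # e.g. 5.16 -> 5016, 6.6 -> 6006
    algos=""
    if [ "$v" -ge 5003 ]; then algos="lz4"; fi
    if [ "$v" -ge 5016 ]; then algos="$algos lzma"; fi
    if [ "$v" -ge 6006 ]; then algos="$algos deflate"; fi
    if [ "$v" -ge 6010 ]; then algos="$algos zstd"; fi
    echo "$algos"
}

erofs_algos_for 6 6
```

For a target kernel older than 5.3 the function prints an empty line, in
which case an uncompressed image is the safer choice.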

Alternatively, generate an uncompressed EROFS from a local directory:
$ mkfs.erofs foo.erofs.img foo/

In addition, you could specify a higher compression level to get a
(slightly) better compression ratio than the default level, e.g.
Additionally, you can specify a higher compression level to get a
(slightly) smaller image than the default level:
$ mkfs.erofs -zlz4hc,12 foo.erofs.img foo/

Note that all compressors are still single-threaded for now, thus it
could take more time on the multiprocessor platform. Multi-threaded
approach is already in our TODO list.
Multi-threaded support can be explicitly enabled with the ./configure
option `--enable-multithreading`; otherwise, single-threaded compression
will be used for now. It may take more time on multiprocessor platforms
if multi-threaded support is not enabled.

Currently, both `-Efragments` (not `-Eall-fragments`) and `-Ededupe`
don't support multi-threading due to time limitations.

Reproducible builds
~~~~~~~~~~~~~~~~~~~

Reproducible builds are typically used for verification and security,
ensuring that the same binaries/distributions can be reproduced in a
deterministic way.

Images generated by the same version of `mkfs.erofs` will be identical
to previous runs if the same input is specified, and the same options
are used.

Specifically, variable timestamps and filesystem UUIDs can result in
unreproducible EROFS images. `-T` and `-U` can be used to fix them.
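
A minimal sketch of the idea, assuming `-T` takes a UNIX timestamp and
`-U` takes a UUID string: build the same directory twice with both values
pinned, then compare the results. `FIXED_TIME`, `FIXED_UUID` and the
helper `build_fixed` are illustrative names and values only, not defaults
of `mkfs.erofs`.

```shell
#!/bin/sh
# Arbitrary example values; any fixed timestamp and UUID should do.
FIXED_TIME=1700000000
FIXED_UUID=c1b9d5a2-f162-11cf-9ece-0020afc76f16

# build_fixed SRC_DIR OUT_IMG (hypothetical helper):
# generate an image with a pinned timestamp and filesystem UUID.
build_fixed() {
    mkfs.erofs -T "$FIXED_TIME" -U "$FIXED_UUID" "$2" "$1"
}

# Only attempt the build when mkfs.erofs and a foo/ directory are present.
if command -v mkfs.erofs >/dev/null 2>&1 && [ -d foo ]; then
    build_fixed foo a.erofs.img &&
        build_fixed foo b.erofs.img &&
        cmp a.erofs.img b.erofs.img &&
        echo "images are identical"
fi
```

If the two images still differ, the input tree or the remaining options
are the next things to check, since both are expected to be the same
across runs.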

How to generate EROFS big pcluster images (Linux 5.13+)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In order to get much better compression ratios (thus better sequential
read performance for common storage devices), big pcluster feature has
been introduced since linux-5.13, which is not forward-compatible with
old kernels.

In details, -C is used to specify the maximum size of each big pcluster
in bytes, e.g.
By default, EROFS formatter compresses data into separate one-block
(e.g. 4KiB) filesystem physical clusters for outstanding random read
performance. In other words, each EROFS filesystem block can be
independently decompressed. However, other similar filesystems
typically compress data into "blocks" of 128KiB or more for much smaller
images. Users may prefer smaller images for archiving purposes, even if
random performance is compromised with those configurations, and even
worse when using 4KiB blocks.

In order to fulfill users' needs, big pclusters have been introduced
since Linux 5.13, in which each physical cluster can span more than one
block.

Specifically, `-C` is used to specify the maximum size of each pcluster
in bytes:
$ mkfs.erofs -zlz4hc -C65536 foo.erofs.img foo/

So in that case, pcluster size can be 64KiB at most.
Thus, in this case, pcluster sizes can be up to 64KiB.
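
To make the arithmetic concrete, the sketch below converts a `-C` value
into a number of filesystem blocks. The helper name `pcluster_blocks` is
hypothetical, and the 4096-byte default block size is an assumption
taken from the 4KiB example above.

```shell
#!/bin/sh
# pcluster_blocks MAX_BYTES [BLOCK_BYTES] (hypothetical helper):
# number of filesystem blocks that one maximum-size pcluster spans.
# BLOCK_BYTES defaults to 4096, matching the 4KiB example above.
pcluster_blocks() {
    echo $(( $1 / ${2:-4096} ))
}

pcluster_blocks 65536   # -C65536 with 4KiB blocks
```

With `-C65536` and 4KiB blocks, a pcluster therefore covers at most 16
blocks, compared to a single block by default.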

Note that large pcluster size can cause bad random performance, so
please evaluate carefully in advance. Or make your own per-(sub)file
compression strategies according to file access patterns if needed.
Note that large pcluster size can degrade random performance (though it
may improve sequential read performance for typical storage devices), so
please evaluate carefully in advance. Alternatively, you can make
per-(sub)file compression strategies according to file access patterns
if needed.

How to generate EROFS images with multiple algorithms (Linux 5.16+)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
How to generate EROFS images with multiple algorithms
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

It's possible to generate an EROFS image with files in different
algorithms due to various purposes. For example, LZMA for archival