Building the library

Oz edited this page Oct 23, 2023 · 36 revisions

🧰 Requirements

  • Operating System: Windows, Linux, macOS[^1].
  • Compiler: MSVC 2015 or later[^2], MinGW v6.4 or later, GCC v4.9 or later, Clang 3.5 or later.
  • Build Automation: CMake v3.11 or later
  • Third-party libraries: 7-zip's source code (any version, e.g. v23.01) [^3]

[^1]: On Windows, you should also link your program with oleaut32 (e.g., -lbit7z -loleaut32).
On Linux and macOS, you should also link your program with dl (e.g., -lbit7z -ldl).
If you are using the library via CMake, these dependencies will be linked automatically to your project.

[^2]: MSVC 2010 was supported until v2.x, MSVC 2012/2013 until v3.x.

[^3]: Since the stable v4, the source code of 7-zip is automatically downloaded when configuring/using the bit7z project via CMake; you can specify the version of 7-zip to be used for compiling bit7z by passing the option BIT7Z_7ZIP_VERSION to CMake (e.g., -DBIT7Z_7ZIP_VERSION="22.01"). Alternatively, you can manually specify the path to the 7-zip source code using the BIT7Z_CUSTOM_7ZIP_PATH CMake option.

Note: you should preferably use the same 7-zip version used to build the shared libraries you are going to employ in your software.

⚙️ How to Build

The library can be built using CMake:

cd <bit7z folder>
mkdir build && cd build
cmake ../ -DCMAKE_BUILD_TYPE=Release
cmake --build . -j --config Release

You can also directly integrate the library into your project via CMake:

  • Download bit7z and copy it into a sub-directory of your project (e.g., third_party), or add it as a git submodule of your repository.
  • Then, use the command add_subdirectory() in your CMakeLists.txt to include bit7z.
  • Finally, link the bit7z library using the target_link_libraries() command.

For example:

add_subdirectory( ${CMAKE_SOURCE_DIR}/third_party/bit7z )
target_link_libraries( ${YOUR_TARGET} PRIVATE bit7z )

The library is highly customizable:

| Build Option | Description |
|--------------|-------------|
| BIT7Z_7ZIP_VERSION | |
| BIT7Z_AUTO_FORMAT | Enables the automatic format detection of input archives (default: OFF). |
| BIT7Z_BUILD_TESTS | |
| BIT7Z_BUILD_DOCS | |
| BIT7Z_CUSTOM_7ZIP_PATH | |
| BIT7Z_DISABLE_ZIP_ASCII_PWD_CHECK | |
| BIT7Z_ENABLE_SANITIZERS | |
| BIT7Z_GENERATE_PIC | |
| BIT7Z_REGEX_MATCHING | Enables the support for the extraction of files matching regular expressions. |
| BIT7Z_TESTS_FILESYSTEM | |
| BIT7Z_TESTS_USE_SYSTEM_7ZIP | |
| BIT7Z_USE_NATIVE_STRING | |
| BIT7Z_USE_STD_BYTE | |
| BIT7Z_VS_LIBNAME_OUTDIR_STYLE | |

Windows-only options

| Build Option | Description |
|--------------|-------------|
| BIT7Z_AUTO_PREFIX_LONG_PATHS | Enables or disables automatically prepending paths with the Windows long path prefix, making it possible to handle archives and files with long paths (default: OFF). |
| BIT7Z_PATH_SANITIZATION | Enables or disables path sanitization when extracting archives containing files with invalid Windows filenames (default: OFF). |
| BIT7Z_USE_SYSTEM_CODEPAGE | Enables or disables using the default Windows codepage for string conversions (default: OFF). |

MSVC-only options

| Build Option | Description |
|--------------|-------------|
| BIT7Z_ANALYZE_CODE | |
| BIT7Z_STATIC_RUNTIME | |

Unix-only options

BIT7Z_USE_LEGACY_IUNKNOWN:

Clang-only options

BIT7Z_LINK_LIBCPP:

📑 7-zip Version

When you configure bit7z via CMake, the latest version of 7-zip currently supported by the library is downloaded automatically.

Optionally, you can specify a different version of 7-zip via the CMake option BIT7Z_7ZIP_VERSION (e.g., -DBIT7Z_7ZIP_VERSION="22.01").

Alternatively, you can specify a custom path containing the 7-zip source code via the option BIT7Z_CUSTOM_7ZIP_PATH.

Please note that, in general, it is best to use the same version of 7-zip as the shared libraries that you will use at runtime.

Using 7-zip v23.01 on Linux and macOS

On Linux and macOS, 7-zip v23.01 introduced breaking changes to the IUnknown interface. If you build bit7z for such a version of 7-zip (the default), it will not support using the shared libraries from previous versions of 7-zip (or from p7zip). Conversely, bit7z built for earlier versions of 7-zip or for p7zip is incompatible with the shared libraries from 7-zip v23.01 and later.

You can build the shared libraries of 7-zip v23.01 in a backward-compatible mode by defining the macro Z7_USE_VIRTUAL_DESTRUCTOR_IN_IUNKNOWN. If you do so, you can build bit7z for v23.01 using the option BIT7Z_USE_LEGACY_IUNKNOWN (in this case, bit7z will also be compatible with previous versions of 7-zip/p7zip).

🌐 String Encoding

By default, bit7z follows the UTF-8 Everywhere Manifesto to simplify the use of the library within cross-platform projects. In short, this means that:

  • The default path string type is std::string.
  • Input std::strings are considered as UTF-8 encoded; output std::strings are UTF-8 encoded.

On POSIX systems, std::strings are usually already UTF-8 encoded, and no configuration is needed.

The situation is a bit more complex on Windows since, by default, Windows treats std::strings as encoded using the system code page, which is not necessarily UTF-8 (it may be, for example, Windows-1252).

If your program deals exclusively with ASCII-only strings, you should be fine with the default bit7z settings (as ASCII characters are also UTF-8).
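Whether a given string falls into this ASCII-only case can be checked at runtime. The following is a minimal sketch and not part of bit7z: the helper name is_ascii is hypothetical, and it simply verifies that every byte of the string is below 0x80.

```cpp
#include <string>

// Hypothetical helper (not part of bit7z): returns true when every byte of
// the string is a 7-bit ASCII character. Such strings have the same byte
// representation in UTF-8 and in ASCII-compatible code pages such as
// Windows-1252, so no conversion is needed before passing them on.
inline bool is_ascii( const std::string& text ) {
    for ( const char ch : text ) {
        if ( static_cast< unsigned char >( ch ) > 0x7F ) {
            return false; // a byte outside the ASCII range was found
        }
    }
    return true;
}
```

For instance, is_ascii( "archive.7z" ) yields true, while a file name containing an UTF-8 encoded "é" (bytes 0xC3 0xA9) yields false and needs one of the approaches described next.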

However, if you need to handle non-ASCII/Unicode characters, as is likely, you have the following options:

  • Enforcing the UTF-8 code page for your whole application, as described in Microsoft's documentation:

    • Recommended, but supported only on Windows 10 version 1903 and later.
  • Manually ensuring the encoding of the std::strings passed to bit7z:

    • You can use some string encoding library or C++11's UTF-8 string literals for input strings.
    • User-input strings (e.g., the password of an archive) need a conversion step: read the input as a UTF-16 wide string (e.g., via ReadConsoleW), and convert it to UTF-8 (bit7z provides a utility function for this, bit7z::to_tstring).
    • You can correctly print the UTF-8 output strings from bit7z (e.g., the path/name metadata of a file in an archive) to the console by calling SetConsoleOutputCP(CP_UTF8) beforehand.
  • Configuring bit7z to use UTF-16 encoded wide strings (i.e., std::wstring) by enabling the BIT7Z_USE_NATIVE_STRING option via CMake.

    • If your program is Windows-only, or you already use wide strings on Windows, this might be the best choice since it will avoid any internal string conversions (7-zip always uses wide strings).

    • This option makes developing cross-platform applications slightly inconvenient since you'll still have to use std::string on POSIX systems.

    • The library provides a type alias bit7z::tstring and a macro function BIT7Z_STRING for defining wide string variables and literals on Windows and narrow ones on other platforms.

    • You must programmatically set the standard input and output encoding to UTF-16 to correctly read and print Unicode characters:

      #include <fcntl.h> // for _O_U16TEXT
      #include <io.h>    // for _setmode
      #include <stdio.h> // for _fileno, stdin, stdout
      
      _setmode(_fileno(stdout), _O_U16TEXT); // setting the stdout encoding to UTF-16
      _setmode(_fileno(stdin), _O_U16TEXT);  // setting the stdin encoding to UTF-16
  • Configuring bit7z to use the system code page encoding for std::string by enabling the BIT7Z_USE_SYSTEM_CODEPAGE option via CMake.

    • Not recommended: using this option, your program will be limited in the set of characters it can pass to and read from bit7z.
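To make the "read as UTF-16, convert to UTF-8" step for user-input strings more concrete, here is a platform-independent sketch of what such a conversion does. It is only an illustration under stated assumptions: the function name utf16_to_utf8 is hypothetical and not part of bit7z, the input is taken as a std::u16string rather than a Windows wide string, and unpaired surrogates are replaced with U+FFFD. In a real program, you would more likely rely on the utility provided by bit7z or on a string encoding library.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of bit7z): encodes a UTF-16 string as UTF-8.
// Unpaired surrogates are replaced with U+FFFD (the replacement character).
inline std::string utf16_to_utf8( const std::u16string& input ) {
    std::string output;
    for ( std::size_t i = 0; i < input.size(); ++i ) {
        char32_t codePoint = input[ i ];
        if ( codePoint >= 0xD800 && codePoint <= 0xDBFF ) {
            // High surrogate: it must be followed by a low surrogate.
            if ( i + 1 < input.size() && input[ i + 1 ] >= 0xDC00 && input[ i + 1 ] <= 0xDFFF ) {
                codePoint = 0x10000 + ( ( codePoint - 0xD800 ) << 10 ) + ( input[ i + 1 ] - 0xDC00 );
                ++i; // the low surrogate has been consumed
            } else {
                codePoint = 0xFFFD;
            }
        } else if ( codePoint >= 0xDC00 && codePoint <= 0xDFFF ) {
            codePoint = 0xFFFD; // low surrogate without a preceding high one
        }

        // Emit 1 to 4 bytes depending on the magnitude of the code point.
        if ( codePoint < 0x80 ) {
            output += static_cast< char >( codePoint );
        } else if ( codePoint < 0x800 ) {
            output += static_cast< char >( 0xC0 | ( codePoint >> 6 ) );
            output += static_cast< char >( 0x80 | ( codePoint & 0x3F ) );
        } else if ( codePoint < 0x10000 ) {
            output += static_cast< char >( 0xE0 | ( codePoint >> 12 ) );
            output += static_cast< char >( 0x80 | ( ( codePoint >> 6 ) & 0x3F ) );
            output += static_cast< char >( 0x80 | ( codePoint & 0x3F ) );
        } else {
            output += static_cast< char >( 0xF0 | ( codePoint >> 18 ) );
            output += static_cast< char >( 0x80 | ( ( codePoint >> 12 ) & 0x3F ) );
            output += static_cast< char >( 0x80 | ( ( codePoint >> 6 ) & 0x3F ) );
            output += static_cast< char >( 0x80 | ( codePoint & 0x3F ) );
        }
    }
    return output;
}
```

The resulting std::string is UTF-8 encoded, which is the encoding bit7z expects for input strings with its default settings.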