All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog.
- Removed the CI test with Spack on CentOS 7
- Added the `sarus ps` command to list running containers. More details here.
- Added the `sarus kill` command to terminate (and subsequently remove) containers.
- Added the `-n, --name` option to the `sarus run` command to specify the name of the container to run. If the option is not specified, Sarus assigns a default name in the form `sarus-container-*`. More details here.
- MPI hook: added support for the `MPI_COMPATIBILITY_TYPE` environment variable, which defines the behaviour of the compatibility check of the libraries that the hook mounts. Valid values are `major`, `full` and `strict`. Default value is `major`. More details here.
- MPI hook: added support for the `HOOK_ROOTLESS` environment variable, enabling the hook to be used in rootless container runtimes. More details here.
- SSH Hook: added a poststop functionality that kills the Dropbear process in case the hook does not join the container's PID namespace. More details here.
- Configuration templates and documentation for OCI hooks now use the `createRuntime`, `createContainer`, or `startContainer` execution stages instead of the `prestart` stage, which has been deprecated since version 1.0.2 of the OCI Runtime specification. The only exception is the NVIDIA Container Toolkit hook.
- Updated the build environment of the Sarus static standalone package to Alpine Linux 3.20 with a GCC 13.2.1 toolchain.
- Updated recommended runc version to 1.1.14
- Updated recommended Boost version to 1.85.0
- Updated recommended RapidJSON version to commit ab1842a2da
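
The move away from the `prestart` stage in hook configuration files can be illustrated with a short Python sketch. It assumes a configuration that follows the general OCI hook JSON layout (`version`, `hook`, `when`, `stages`); the hook path, the `when` condition and the choice of `createRuntime` as the replacement are placeholders, since the appropriate stage depends on the hook.

```python
import json

# Hypothetical hook configuration still using the deprecated "prestart" stage.
# The path and the "when" condition are placeholders, not values from Sarus.
legacy_config = {
    "version": "1.0.0",
    "hook": {"path": "/opt/example/bin/example_hook"},
    "when": {"always": True},
    "stages": ["prestart"],
}


def migrate_stages(config, replacement="createRuntime"):
    """Return a copy of config with 'prestart' replaced by another stage."""
    allowed = {"createRuntime", "createContainer", "startContainer"}
    if replacement not in allowed:
        raise ValueError(f"unsupported stage: {replacement}")
    migrated = dict(config)
    migrated["stages"] = [
        replacement if stage == "prestart" else stage
        for stage in config.get("stages", [])
    ]
    return migrated


new_config = migrate_stages(legacy_config)
print(json.dumps(new_config, indent=4))
```

The original dictionary is left untouched, so both versions can be compared before a configuration file is rewritten.
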
- SSH hook: added support for the `com.hooks.ssh.port` OCI annotation, which allows customizing the port used by the Dropbear server.
- MPI hook: verbosity levels for log messages about ABI compatibility and library replacements have been slightly adjusted. In particular, a warning about adding libraries into the container has been moved to a higher verbosity level (i.e. it will only be displayed when using the `--verbose` or `--debug` global command-line options).
- SSH hook: the default port used by the Dropbear server is now set through the `SERVER_PORT_DEFAULT` environment variable in the hook JSON configuration file. The `SERVER_PORT` variable is still supported for backward compatibility, although `SERVER_PORT_DEFAULT` takes precedence if set.
- SSH hook: usage of the `SERVER_PORT` environment variable in the hook JSON configuration file has been deprecated. Support for it will be removed in a future release.
- Glibc hook: fixed detection of the container's glibc version, which was causing a shell-init error on some systems
- SSH hook: permissions on the container's authorized keys file are now set explicitly, fixing possible errors caused by applying unsuitable defaults from the process.
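
The precedence between the SSH hook port settings listed above can be sketched in Python as follows. The function name is hypothetical, and the rule that the `com.hooks.ssh.port` annotation overrides the configured default is an assumption; only the precedence of `SERVER_PORT_DEFAULT` over `SERVER_PORT` is stated in this changelog.

```python
def resolve_ssh_port(hook_env, annotations):
    """Pick the Dropbear port from the annotation, then the hook environment."""
    # Assumption: a per-container annotation overrides the configured default.
    annotated = annotations.get("com.hooks.ssh.port")
    if annotated is not None:
        return int(annotated)
    # SERVER_PORT_DEFAULT takes precedence over the deprecated SERVER_PORT.
    for name in ("SERVER_PORT_DEFAULT", "SERVER_PORT"):
        value = hook_env.get(name)
        if value:
            return int(value)
    raise ValueError("no SSH server port configured")


# Both variables set: the newer one wins.
print(resolve_ssh_port({"SERVER_PORT": "15263", "SERVER_PORT_DEFAULT": "2222"}, {}))
```
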
- Changed the implementation of the lock for the image repository metadata file to a mechanism based on flock(2). The new implementation can support both shared locks (a.k.a. read locks) and exclusive locks (a.k.a. write locks), and improves the startup time when launching large numbers of containers at scale.
- Updated recommended runc version to 1.1.12
- Updated recommended libnvidia-container version to 1.14.5
- Updated recommended NVIDIA Container Toolkit version to 1.14.5
- Updated CI integration tests on Rocky 8 to use Python 3.9, solving a problem of missing wheel packages for the previous Python version
- Updated CI distributed tests to use Docker Compose V2 and Compose file format version 3
- Updated automatic documentation build to use Sphinx 7.2.6 and Sphinx RTD Theme 2.0.0
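
The difference between shared and exclusive locks under flock(2) can be seen with Python's `fcntl` module. This is a minimal sketch on a temporary file standing in for the repository metadata file; it does not reproduce the Sarus implementation.

```python
import fcntl
import os
import tempfile

# Stand-in for the repository metadata file (temporary, hypothetical path).
fd, path = tempfile.mkstemp()
os.close(fd)

reader_a = open(path)
reader_b = open(path)
writer = open(path, "a")

# Shared (read) locks can be held by several open file descriptions at once.
fcntl.flock(reader_a, fcntl.LOCK_SH | fcntl.LOCK_NB)
fcntl.flock(reader_b, fcntl.LOCK_SH | fcntl.LOCK_NB)

# An exclusive (write) lock is refused while any shared lock is held.
try:
    fcntl.flock(writer, fcntl.LOCK_EX | fcntl.LOCK_NB)
    writer_blocked = False
except BlockingIOError:
    writer_blocked = True

# Once the readers release their locks, the writer can proceed.
fcntl.flock(reader_a, fcntl.LOCK_UN)
fcntl.flock(reader_b, fcntl.LOCK_UN)
fcntl.flock(writer, fcntl.LOCK_EX | fcntl.LOCK_NB)
writer_locked = True
fcntl.flock(writer, fcntl.LOCK_UN)

for handle in (reader_a, reader_b, writer):
    handle.close()
os.remove(path)
```

Because many readers can hold the lock at the same time, container launches that only read the metadata do not need to wait for one another.
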
- SSH Hook: added support for the `com.hooks.ssh.pidfile_container` OCI annotation, which allows customizing the path to the Dropbear daemon PIDfile inside the container.
- SSH Hook: added support for the `com.hooks.ssh.pidfile_host` OCI annotation, which optionally copies the PIDfile of the Dropbear server to the specified path on the host.
- SSH Hook: added support for the `OVERLAY_MOUNT_HOME_SSH` environment variable, which allows controlling the creation of an overlay filesystem on top of the container's `${HOME}/.ssh` directory. More details here.
- SSH Hook: added support for the `com.hooks.ssh.authorize_ssh_key` OCI annotation, which allows authorizing a user-provided public key for connecting to the running container.
- Added a User Guide section about using Visual Studio Code's Remote Development extension in conjunction with Sarus and the SSH hook. More details here.
- The configuration files for the SSH hook and the Slurm sync hook are no longer generated automatically as part of the CMake installation process. In other words, the aforementioned hooks are no longer configured and enabled by default.
- Updated recommended runc version to 1.1.9
- Updated CI tests from source on Fedora (36 -> 38) and OpenSUSE Leap (15.4 -> 15.5)
- Fixed support for image manifests which are provided by registries as multi-line, not indented JSON
- Fixed parsing from the command line of image references which feature registry host and image name, but no namespaces (e.g. `<registry>/<image>`)
- The installation directory of Sarus binaries is now always verified by the security checks. Previously the check on this directory could be skipped if no Sarus hooks were configured and if the runc and init binaries were located elsewhere.
- Added the `sarus hooks` command to list the hooks configured for the engine
- Added the `--annotation` option to `sarus run` for setting custom annotations in the OCI bundle. More details here
- Added the `--mpi-type` option to `sarus run` for selecting an MPI hook among those configured by the system administrator
- Added a warning message when acquisition of a lock file on the local repository metadata file is taking an unusually long time. The message is displayed at a configurable interval (default 10 seconds), until the lock acquisition timeout is reached.
- Added support for the optional `defaultMPIType` parameter in the `sarus.json` configuration file. More details here.
- Added support for the optional `repositoryMetadataLockTimings` parameter in the `sarus.json` configuration file. More details here.
- Added the AMD GPU OCI hook to provide access to ROCm AMD GPU devices inside the container. More details here
- Added a new OCI hook to perform arbitrary sequences of bind mounts and device mounts into containers. The hook is meant to streamline the implementation and usage of advanced features which can be enabled through sets of related mounts. More details here.
- Added a note about the Boost minimum required version 1.77 when building on ARM.
- Sarus will now exit with an error if an operation requiring a lock file on the local repository metadata cannot acquire a lock within the configured timeout duration (default 60 seconds). Previously, Sarus would keep attempting to acquire a lock indefinitely.
- When printing error traces, entries related to standard C++ exceptions now provide clearer information
- Updated recommended runc version to 1.1.6
- Updated recommended libnvidia-container version to 1.13.0
- Updated recommended NVIDIA Container Toolkit version to 1.13.0
- Fixed a race condition when pulling private images concurrently with the same user
- Fixed a bug which was causing repository metadata files and their corresponding lockfiles to be created or atomically updated with root group ownership after executing a `sarus run` command. The aforementioned files are now correctly created or updated with user and group ownership of the user who launched Sarus.
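
The lock timeout and periodic warning described for the repository metadata lock can be sketched in Python as below. The defaults of 60 and 10 seconds come from the entries above; the polling approach, the function name and the message texts are assumptions made for illustration.

```python
import fcntl
import time


def acquire_with_timeout(handle, timeout=60.0, warning_interval=10.0, poll=0.1):
    """Try to take an exclusive flock, warning periodically until timeout."""
    start = time.monotonic()
    next_warning = start + warning_interval
    while True:
        try:
            fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
            return
        except BlockingIOError:
            pass  # lock is held elsewhere; keep waiting
        now = time.monotonic()
        if now - start >= timeout:
            raise TimeoutError(f"could not acquire lock within {timeout} s")
        if now >= next_warning:
            print(f"still waiting for lock after {now - start:.0f} s")
            next_warning += warning_interval
        time.sleep(poll)
```

Raising an error at the timeout mirrors the change from indefinite retries to a bounded wait.
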
- Added support for passing command-line options to `mksquashfs` through the `mksquashfsOptions` parameter in the `sarus.json` configuration file
- Added explicit forwarding of standard signals from engine to OCI runtime
- Added experimental support for the PMIx v3 interface. Given its experimental nature, this feature has to be enabled through a parameter in the `sarus.json` configuration file
- Added CI unit and integration tests from source on Rocky Linux 8 and 9
- The `sarus run` and `sarus images` commands now automatically remove images missing the internal SquashFS or metadata file, and report them as not available
- The MPI hook and Glibc hook no longer enter the container PID namespace
- The Slurm Global Sync hook and the Timestamp hook no longer enter any container namespace
- Updated recommended runc version to 1.1.3
- Updated recommended libnvidia-container version to 1.11.0
- Updated recommended NVIDIA Container Toolkit version to 1.11.0
- Updated documentation about the NVIDIA Container Toolkit to refer more specifically to the NVIDIA Container Runtime hook
- The `configure_installation.sh` script can now acquire custom values for the local and/or centralized repository paths from environment variables. More details here
- Updated CI tests from source on Ubuntu (21.10 -> 22.04), Fedora (35 -> 36) and OpenSUSE Leap (15.3 -> 15.4)
- Removed CI tests from source on Ubuntu 20.04 and CentOS 7
- The executable pointed to by the `mksquashfsPath` parameter in the `sarus.json` configuration file has been excluded from the security checks. The `mksquashfs` utility is only used by the `sarus pull` and `sarus load` commands, which already run without privileges
- Changed the default registry to `docker.io`. When the server is not entered as part of the image reference, the `sarus run` command first looks under `docker.io` repositories and, if the image is not available, falls back to images under the previous default server (`index.docker.io`). This is done to preserve compatibility with existing workflows. The `sarus images` and `sarus rmi` commands treat images from `index.docker.io` as images from a 3rd party registry.
- If the image manifest obtained from a registry during a pull does not feature the `mediaType` property, Sarus now attempts to process the manifest as an OCI Manifest V1 instead of failing with an error.
- Updated recommended libnvidia-container version to 1.10.0
- Updated recommended NVIDIA Container Toolkit version to 1.10.0
- Replaced Travis public CI/CD with GitHub Actions
- Fixed an issue in the generation of manifest digests, where the digest result was incorrectly influenced by JSON formatting
- Fixed an inconsistency with Skopeo which was preventing private images from being pulled from Docker Hub
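
The manifest digest issue comes down to the fact that a content digest is a hash of bytes, so two serializations of the same JSON document yield different digests. The Python sketch below illustrates this with a made-up manifest; it shows the general principle and is not the code used by Sarus.

```python
import hashlib
import json


def manifest_digest(raw_bytes):
    """Digest of a manifest, computed over the exact bytes received."""
    return "sha256:" + hashlib.sha256(raw_bytes).hexdigest()


# The same manifest content, serialized compactly and with indentation.
manifest = {"schemaVersion": 2, "layers": []}
compact = json.dumps(manifest, separators=(",", ":")).encode()
indented = json.dumps(manifest, indent=4).encode()

print(manifest_digest(compact) == manifest_digest(indented))  # → False
```

Hashing the bytes as served by the registry, rather than a re-serialized copy, keeps the digest independent of local JSON formatting.
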
- Added Skopeo as a dependency to pull or load container images
- Added Umoci as a dependency to unpack OCI images
- Added support for pulling, running and removing images by digest
- Added the `--digests` option to `sarus images` for displaying digests of locally available images
- Added the `--username` and `--password-stdin` options to `sarus pull` for supplying authentication credentials directly on the command line. More details here
- Added support for the optional `containersPolicy` parameter in the `sarus.json` configuration file. More details here.
- Added support for the optional `containersRegistries.dPath` parameter in the `sarus.json` configuration file. More details here.
- Added support for labels defined in OCI image configurations
- Added CI unit and integration tests from source on Ubuntu 21.10, Debian 11 and Fedora 35
- Added git submodule for RapidJSON (commit fcb23c2dbf) to simplify dependency management and build process
- The `sarus images` command now displays the image ID by default. The image ID, as defined by the OCI Image Specification, is the hash of the image's configuration JSON. More details here
- The `sarus pull` command now skips the pull if the requested image is already available locally and up-to-date
- zlib is no longer a dependency of Sarus itself, but remains a dependency of the Dropbear software used by the SSH hook
- Updated the build environment of the Sarus static standalone package to Alpine Linux 3.15
- Removed the `insecureRegistries` parameter from `sarus.json` and the built-in support for insecure registries. Access to insecure registries via Skopeo must now be enabled through containers-registries.conf(5) files. More details here
- Removed dependencies on cpprestsdk, libarchive, OpenSSL, libcap, and libexpat
- Removed CI unit and integration tests from source on Ubuntu 18.04, Debian 10 and Fedora 34
- The Glibc hook now uses the output of `ldd` to detect the version of glibc
- Sarus now attempts to parse the Bearer authorization token regardless of the value of the `Content-Type` response header when pulling images
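
The Bearer token change can be pictured as parsing the response body as JSON without first checking the `Content-Type` header. The following Python sketch uses a hypothetical function name and assumes the token is carried in a `token` or `access_token` field, as is common for registry token servers.

```python
import json


def extract_bearer_token(body, content_type=None):
    """Parse a token response body as JSON, ignoring the Content-Type header."""
    # content_type is accepted but deliberately not inspected.
    try:
        payload = json.loads(body)
    except (ValueError, TypeError):
        return None
    if not isinstance(payload, dict):
        return None
    # Assumption: the token sits in "token" or "access_token".
    return payload.get("token") or payload.get("access_token")


# A server that labels its JSON reply as plain text is still handled.
print(extract_bearer_token('{"token": "abc"}', content_type="text/plain"))
```
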
- Added support for proxy connections when pulling images from remote registries
- Added CMake option to control build of unit test executables
- Updated recommended runc version to 1.0.3
- Updated recommended libnvidia-container version to 1.7.0
- Updated recommended NVIDIA Container Toolkit version to 1.7.0
- Updated CppUTest framework for unit tests to version 4.0
- Fixed generation of README files for standalone archives
- Added the ability to pull from insecure registries via the `insecureRegistries` parameter in `sarus.json`
- Added the `-e/--env` option to `sarus run` for setting environment variables inside the container. More details here
- Added the `--device` option to `sarus run` for mounting and whitelisting devices inside containers. More details here
- Added support for the optional `siteDevices` parameter in the `sarus.json` configuration file. This parameter can be used by administrators for defining devices to be automatically mounted and whitelisted inside containers.
- Added the `--pid` option to `sarus run` for setting the container PID namespace. More details here
- Added support for applying seccomp profiles to containers
- Added support for applying AppArmor profiles to containers
- Added support for applying SELinux labels to container processes and to mounts performed by the OCI runtime
- The MPI hook whitelists access to devices bind mounted inside containers
- cgroup filesystems are mounted inside containers
- Added script to check for host requirements in CI, linked in documentation.
- Added CI unit and integration tests from source on Fedora 34 and OpenSUSE Leap 15.3
- Containers now use the host's PID namespace by default. A private PID namespace can be requested through the CLI
- The `--ssh` option of `sarus run` now implies `--pid=private`
- Changed format of the `environment` parameter in the `sarus.json` configuration file
- Updated documentation about how the initial environment variables are set in containers
- Updated recommended Boost version to 1.77.0
- Updated recommended Cpprestsdk version to 2.10.18
- Updated recommended libarchive version to 3.5.2
- Updated recommended RapidJSON version to commit 00dbcf2
- Updated recommended runc version to 1.0.2
- Updated recommended libnvidia-container version to 1.5.1
- Updated recommended NVIDIA Container Toolkit version to 1.5.1
- Updated Dropbear software used by the SSH hook to version 2020.81
- Miscellaneous updates to Dockerfiles used for CI stages; in particular, the Sarus static standalone package is now built on Alpine Linux 3.14 with a GCC 10.3.1 toolchain
- Corrected the error message when attempting to pull an image by digest
- The use of the `bind-propagation` property for bind mounts (deprecated in Sarus 1.1.0) has now been removed. All bind mounts are done with recursive private (`rprivate`) propagation.
- Access to custom devices within containers is not allowed by default
- Added CI unit and integration tests from source on Ubuntu 20.04
- Added regular cleanups of CI caches on GitLab
- Added diagrams representing CI/CD workflows to developer documentation
- Added Markdown builder for Sphinx documentation
- Updated minimum required CMake version to 2.8.12
- Improved clarity of some messages from the MPI hook
- Updated copyright notice and license formatting
- Migrated container images used by unit and integration tests to Quay.io
- Fixed bug preventing extraction of image layers with hardlinks pointing to absolute paths
- Small fix to RapidJSON installation documentation
- Added `CONTRIBUTING.md` file with guidelines about contributing to the project
- Added CI tests for the Spack package on Ubuntu 18.04, Debian 10, CentOS 7, Fedora 31, OpenSUSE Leap 15.2
- Added `wget` and `autoconf` as buildtime dependencies in the Spack package
- Added a documentation note about compiler selection when installing on CentOS 7 using the Spack package
- Added a documentation note about installing the static version of the glibc libraries when installing using the Spack package
- Fixed a bug preventing bind mounts to `/dev` in the container
- Removed the CI test for the Spack package on Ubuntu 16.04
- Support for pulling images from registries which do not use content redirect for blobs
- Fixed extraction of image layers when replacing directories with other file types
- MPI and Glibc hooks skip entries from the dynamic linker cache if such entries do not exist in the container's filesystem
- Slurm global sync hook drops privileges at startup
- MPI and Glibc hooks now perform validations with user credentials for host mounts and writes
- Customizable sarus and hooks configuration templates within etc folder
- Port number used by the SSH hook is now configurable
- Added note in the User Guide about bind mounting FUSE filesystems into Sarus containers
- The OCI hooks are now configured through OCI hook JSON configuration files. The previous OCI hooks configuration through `sarus.json` is no longer supported and Sarus Administrators should reconfigure their hooks according to the Sarus hook documentation page
- Replaced the custom OpenSSH used by the SSH hook with Dropbear
- Made CPU affinity detection more robust
- Updated recommended tini version to 0.19.0
- Updated recommended libnvidia-container version to 1.2.0
- Updated recommended NVIDIA Container Toolkit version to 1.2.1
- CLI: fixed detection of option values separated by whitespace
- CLI: `sarus run` no longer returns an error when an option (i.e. a token starting with "-") is passed as the first argument to the container application. This allows passing options directly to containers which feature an entrypoint
- Support for root_squashed filesystems as image storage and as bind mounts sources
- When executing unit tests through the CTest program, tests now run in the directory of the test binary
- Fixed broken links in the documentation
- Enabled Sarus to print log messages from the OCI Hooks
- Better documentation for ABI Compatibility here
- Added User Guide section about running MPI applications without the MPI hook. See here
- Added documentation about requiring Linux kernel >= 3.0 and util-linux >= 2.20
- Added AddressSanitizer CI job
- The glibc Hook is no longer activated by default, unless the `--mpi` option is used. To activate it explicitly, the new `--glibc` option of `sarus run` can be used. See here
- Using OCI annotations instead of environment variables to pass information to hooks. It is an internal change, transparent to users, moving towards OCI Hooks independence from Sarus
- Most of the Environment Variables for Hooks were renamed. Sarus Administrators should check the new names in the respective hook documentation pages
- OCI MPI Hook will now enable MPI "backwards" library injections, issuing a warning. More details here
- Improved the retrieval of image manifests from remote registries to better leverage the OCI Distribution specification
- Removed the explicit use of the `autoclear` option when loop-mounting squashfs images. Explicit use of the option causes a failure on Linux kernels >= 5.4. The `autoclear` option is still set implicitly by the `mount` system utility since June 2011 for kernels > 2.6.37.
- Updated Spack packages and installation instructions
- Updated documentation about the NVIDIA Container Toolkit. See here
- The SSH and Slurm global sync hooks now use configurable paths for their resources and are no longer dependent on Sarus-specific directories
- Reviewed and updated documentation about runtime security checks. See here
- Several improvements to the Continuous Integration workflow
- Fixed bug on OCI MPI Hook which failed to run containers having multiple versions of an MPI Dependency library
- Runtime security checks no longer fail if a checked path does not exist
- Fixed setting of default bind propagation values for custom mounts
- Fixed parsing of authentication challenges from the NVIDIA GPU Cloud registry
- Fixed the ability to pull images from the Quay.io registry
- Compiling now with -fstack-protector-strong as a measure against buffer overflows
- Added the `--workdir` option to `sarus run` for setting the initial working directory inside the container.
- Added "Communications" and "Publications" section to project README.
- Added documentation about complementing Sarus with Skopeo for interacting with 3rd party registries.
- Added integration tests for security checks.
- Updated libarchive dependency to version 3.4.1.
- Updated recommended runc version to 1.0.0-rc10.
- Improved string parsing by using Boost functions.
- Site/user bind mounts have "recursive private" propagation by default. More details here.
- Extensive code refactoring on the Native MPI hook:
  - Easier to extend and better control of performed actions.
  - More robust symlink generation.
  - Enhanced ABI version resolution.
  - Improved unit tests.
  - Factored out non-specific code to common utility functions.
- The Slurm global sync hook is activated only when the user requests activation of the SSH hook.
- Transitioned integration tests to Python 3 and pytest.
- Integration tests for the virtual cluster reuse the same Docker image of unit and integration tests.
- Updated cookbook page about the Intel Cluster Edition software.
- Deprecated the use of the `bind-propagation` property for site/user bind mounts. It will be removed in a future release.
- Fixed propagation of CPU affinity from the host to the container process.
- Fixed some hyperlinks in the documentation
- Changes to security checks:
  - Reorganized and unified code for the checks.
  - Root ownership is checked based on uid, regardless of gid.
  - Root ownership for directories is checked recursively all the way up to the `/` directory.
  - Always check that `sarus.json` is untamperable regardless of the value of the configuration parameter.
- Improved usage of libarchive to prevent image contents from spilling outside of the expansion directory when extracting layers.
- The SSH hook runs sshd as an unprivileged process.
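
Keeping extracted image contents inside the expansion directory amounts to resolving each entry's destination and rejecting any path that ends up outside the root. The Python sketch below illustrates that idea under assumed names; Sarus itself relied on libarchive for this at the time, and its actual checks are not shown here.

```python
import os


def safe_destination(expansion_dir, member_name):
    """Resolve a layer entry path and reject it if it leaves expansion_dir."""
    root = os.path.realpath(expansion_dir)
    # Strip leading separators so absolute entry names stay under the root.
    candidate = os.path.realpath(os.path.join(root, member_name.lstrip("/")))
    if os.path.commonpath([root, candidate]) != root:
        raise ValueError(f"entry escapes expansion directory: {member_name}")
    return candidate
```

Since `os.path.realpath` follows symbolic links that already exist, an entry routed through a previously extracted link pointing outside the root is rejected as well.
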
- Added the possibility to build Sarus as a static standalone binary.
- The CI generates a standalone archive which packages a static binary of Sarus, the hooks and binary dependencies. This archive can be used to quickly deploy Sarus.
- Added a script to configure a Sarus installation regardless of the installation method. The script automatically sets up Sarus for basic functionality. Advanced features can be enabled by editing the configuration file.
- Added a mechanism to preserve PMI2 file descriptors from the host into the container.
- Sarus' stdout and stderr file descriptors are duplicated and exposed to OCI hooks. The glibc hook reuses these file descriptors to print messages about its activation.
- Added the possibility to start an init process within the container with the `--init` option to `sarus run`. An init process is useful for handling signals or reaping zombie processes within containers (see the User guide for further details). Sarus uses tini as its default init process.
- Added support for Travis CI.
- Added a Spack package for Sarus.
- Added a Quickstart documentation page.
- Added CI tests to verify installation from source on various Linux distributions. The scripts for these tests are also reused to generate code snippets in the documentation.
- Errors resulting from incorrect CLI usage print error messages instead of exception traces. Traces are still displayed when using `--verbose` or `--debug` options.
- `/dev/shm` is bind mounted from the host instead of having the OCI runtime create a new filesystem.
- Execution of security checks is now controlled through a parameter in the configuration file rather than a CMake option.
- Restored use of `pivot_root()` by runc.
- Improved output format and clarity of `sarus --help`.
- Changes to Sarus SSH key generation:
  - By default, `sarus ssh-keygen` generates ssh keys only if no keys already exist. `sarus ssh-keygen --overwrite` can be used to force the regeneration of keys.
  - Generation of keys is now protected by a lockfile to prevent race conditions.
- Updated recommended runc version to 1.0.0-rc9 (now also bundled in the standalone archive).
- Improved accuracy of test coverage data.
- Updated documentation for the NVIDIA Container Toolkit (formerly NVIDIA GPU hook).
- Refactored and streamlined documentation about installation and configuration procedures.
- Fixed files/directories generation in some situations by explicitly setting umask at the start of Sarus.
- Improved consistency of some integration tests.
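
The effect of setting the umask explicitly at startup can be demonstrated with a few lines of Python. The value `0o022` is an assumption chosen for illustration; the changelog does not state which mask Sarus applies.

```python
import os
import stat
import tempfile

# Assumed value: 0o022 is a common default; the source does not name one.
os.umask(0o022)

with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "example")
    # Request rw for everyone; the umask strips group/other write bits.
    fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o666)
    os.close(fd)
    mode = stat.S_IMODE(os.stat(path).st_mode)
    print(oct(mode))  # → 0o644
```

Setting the mask at the start makes the resulting permissions independent of whatever umask the calling shell or scheduler happened to use.
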
- Added the glibc hook: performs replacement of the container's glibc stack with a host counterpart. This ensures that resources injected from the host (e.g. MPI) work correctly on images which feature glibc versions too old for the host resources.
- Added cookbook page about the Intel Cluster Edition software
- Added support for publishing documentation on Read the Docs.
- Extensive review and update of the cookbook.
- Native MPI hook: improved generation of symlinks in the container for more robust detection of the injected libraries by the dynamic linker.
- Print an error message when an unrecognized CLI global option is detected.
- Updated license and applied license header to source files.
- Added the Timestamp hook and related documentation.
- Container processes now inherit supplementary gids from the host process that called Sarus.
- The SSH hook build scripts now compile OpenSSL with a single process. Multi-process building was causing the process to fail on some Linux flavors.
- The Slurm global sync hook doesn't perform any action if Slurm environment variables are not defined in the container.
- Updated recommended runc version to 1.0.0-rc8.
- Removed explicit documentation about building runc from source. A link to the related section in the official runc project is still present.
- Various fixes to documentation.
- Fixed validation of destination paths for site/user mounts.
- Fixed application of whiteouts during extraction of image layers.
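
Whiteouts are the marker files that image layers use to delete content from lower layers: a file named `.wh.<name>` removes `<name>`, and `.wh..wh..opq` hides everything a lower layer placed in that directory. The Python sketch below applies these rules to filesystems modelled as plain dictionaries; it is a simplified illustration, not the extraction code used by Sarus.

```python
WHITEOUT_PREFIX = ".wh."
OPAQUE_MARKER = ".wh..wh..opq"


def apply_layer(lower, layer):
    """Merge a layer onto a lower filesystem, both given as {path: content}."""
    result = dict(lower)
    additions = {}
    for path, content in layer.items():
        parent, _, name = path.rpartition("/")
        prefix = parent + "/" if parent else ""
        if name == OPAQUE_MARKER:
            # Opaque whiteout: hide everything the lower layers put in parent.
            result = {p: c for p, c in result.items() if not p.startswith(prefix)}
        elif name.startswith(WHITEOUT_PREFIX):
            # Plain whiteout: remove the named entry and anything beneath it.
            target = prefix + name[len(WHITEOUT_PREFIX):]
            result = {
                p: c for p, c in result.items()
                if p != target and not p.startswith(target + "/")
            }
        else:
            additions[path] = content
    # Entries added by the layer itself are applied after all whiteouts.
    result.update(additions)
    return result


lower = {"etc/hosts": "a", "etc/motd": "b", "var/cache/x": "c", "var/cache/y": "d"}
layer = {"etc/.wh.motd": "", "var/cache/.wh..wh..opq": "", "var/cache/z": "e"}
merged = apply_layer(lower, layer)
print(sorted(merged))  # → ['etc/hosts', 'var/cache/z']
```
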
- New documentation content:
  - Section about building runc from source
  - Page about Slurm global sync hook
  - User doc about initial working directory in the container
  - Developer doc about running unit and integration tests
- CI checks to verify documentation can be built whether the source directory is a git repository or not.
- CI checks to verify correct detection of version string.
- CI checks to verify VERSION file is in sync with latest git tag.
- Cookbook example about OpenMPI through SSH.
- Explicitly set permissions of the OCI bundle directory
- Enabled security checks when performing integration tests.
- Updated documentation about security checks.
- The SSH hook is built by default when building Sarus.
- Fixed extraction of image layers containing directories without the executable bit set.
- Various fixes to documentation.
- Updated recommended runc version to commit 6635b4f0c6 (addresses CVE-2019-5736).
- Cookbook with use cases as part of the documentation.
- Improved output of `sarus images`: includes server name in the repository string if the image is not from Docker Hub.
- Improved help messages of `sarus run`, `sarus rmi`: clearly state that the image repository must match the value displayed by `sarus images`.
- Fixed the `--version` global option.
- Pass the `--no-pivot` option to runc to avoid mount problems on CLE6.
- Security checks recursively verify parent directories and writable permissions of group or others.
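
The recursive checks on ownership and write permissions mentioned in this changelog can be approximated with the Python sketch below. The function name and messages are hypothetical, and the `stat_fn` parameter exists only so the logic can be exercised without root-owned files; the real checks are part of Sarus and may differ in detail.

```python
import os
import stat


def tamper_violations(path, stat_fn=os.stat):
    """List reasons why path, or any parent up to '/', could be tampered with."""
    violations = []
    current = os.path.abspath(path)
    while True:
        info = stat_fn(current)
        # Ownership is judged by uid only; the gid is not considered.
        if info.st_uid != 0:
            violations.append(f"{current}: not owned by root")
        # Write permission for group or others is treated as a violation.
        if info.st_mode & (stat.S_IWGRP | stat.S_IWOTH):
            violations.append(f"{current}: writable by group or others")
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            return violations
        current = parent
```

An empty list means every directory on the way up to `/` is owned by uid 0 and is not writable by group or others.
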
Initial release.