The Open Containers Initiative (OCI) container format, which grew out of Docker, is the dominant standard for cloud-focused containerized deployments of software. Although {Project}'s own container format has many unique advantages, it's likely you will need to work with Docker/OCI containers at some point.
{Project} aims for maximum compatibility with Docker, within the constraints of a runtime that is well suited for use on shared systems and especially in HPC environments.
Using {Project} you can:
- Pull, run, and build from most containers on Docker Hub, without changes.
- Pull, run, and build from containers hosted on other registries, including private registries deployed on premise, or in the cloud.
- Pull and build from OCI containers in archive formats, or cached in a local Docker daemon.
This section will highlight these workflows, and discuss the limitations and best practices to keep in mind when creating containers targeting both Docker and {Project}.
Docker Hub is the most common place that projects publish public container images. At some point, it's likely that you will want to run or build from containers that are hosted there.
It's easy to run a public Docker Hub container with {Project}. Just put docker:// in front of the container repository and tag. To run the container that's called sylabsio/lolcow:latest:
$ {command} run docker://sylabsio/lolcow:latest
INFO: Converting OCI blobs to SIF format
INFO: Starting build...
Getting image source signatures
Copying blob 16ec32c2132b done
Copying blob 5ca731fc36c2 done
Copying config fd0daa4d89 done
Writing manifest to image destination
Storing signatures
2021/10/04 14:50:21 info unpack layer: sha256:16ec32c2132b43494832a05f2b02f7a822479f8250c173d0ab27b3de78b2f058
2021/10/04 14:50:23 info unpack layer: sha256:5ca731fc36c28789c5ddc3216563e8bfca2ab3ea10347e07554ebba1c953242e
INFO: Creating SIF file...
 _____________________________
< Mon Oct 4 14:50:30 CDT 2021 >
 -----------------------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||
Note that {Project} retrieves blobs and configuration data from Docker Hub, extracts the layers that make up the Docker container, and creates a SIF file from them. This SIF file is kept in your {Project} :ref:`cache directory <sec:cache>`, so if you run the same Docker container again the downloads and conversion aren't required.
To obtain the Docker container as a SIF file in a specific location, which you can move, share, and keep for later, {command} pull it:
$ {command} pull docker://sylabsio/lolcow
INFO: Using cached SIF image
$ ls -l lolcow_latest.sif
-rwxr-xr-x 1 myuser myuser 74993664 Oct 4 14:55 lolcow_latest.sif
If it's the first time you pull the container it'll be downloaded and translated. If you have pulled the container before, it will be copied from the cache.
Note
{command} pull of a Docker container actually runs a {command} build behind the scenes, since we are translating from OCI to SIF. If you {command} pull a Docker container twice, the output file isn't identical because metadata such as dates from the conversion will vary. This differs from pulling a SIF container (e.g. from an oras:// or library:// URI), which always gives you an exact copy of the image.
Docker Hub introduced limits on anonymous access to its API in November 2020. Every time you use a docker:// URI to run, pull etc. a container, {Project} will make requests to Docker Hub in order to check whether the container has been modified there. On shared systems, and when running containers in parallel, this can quickly exhaust the Docker Hub API limits.
We recommend that you {command} pull a Docker image to a local SIF, and then always run from the SIF file, rather than using {command} run docker://... repeatedly.
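As a minimal sketch of this pull-once pattern, the shell function below pulls to a local SIF only when the file is missing, and then runs from that file. The names RUNTIME and run_cached are hypothetical and not part of {Project}. RUNTIME defaults to echo, so as written the script only prints the commands it would issue. The sketch also assumes that the pull subcommand accepts an output filename before the URI.

```shell
#!/bin/sh
# RUNTIME is a hypothetical variable, not a setting of the container runtime.
# It defaults to "echo" (dry run); set it to your container command to run for real.
RUNTIME="${RUNTIME:-echo}"

# run_cached URI SIF [ARGS...]
# Contact the registry only when SIF does not exist yet, then run from the local file.
run_cached() {
    uri="$1"
    sif="$2"
    shift 2
    if [ ! -f "$sif" ]; then
        # Assumption: "pull <output file> <URI>" writes the image to the given path.
        "$RUNTIME" pull "$sif" "$uri" || return 1
    fi
    "$RUNTIME" run "$sif" "$@"
}

run_cached docker://sylabsio/lolcow:latest lolcow_latest.sif
```

Once the SIF exists, later calls skip the pull step entirely, so no request is made to Docker Hub.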
Alternatively, if you have signed up for a Docker Hub account, make sure that you authenticate before using docker:// container URIs.
To make use of the API limits under a Docker Hub account, or to access private containers, you'll need to authenticate to Docker Hub. There are a number of ways to do this with {Project}.
The {command} registry login command supports logging into Docker Hub and other OCI registries. For Docker Hub, the registry hostname is docker.io, so you will need to log in as below, specifying your username:
$ {command} registry login --username myuser docker://docker.io
Password / Token:
INFO: Token stored in /home/myuser/.{command}/remote.yaml
The Password / Token you enter must be a Docker Hub CLI access token, which you should generate in the 'Security' section of your account profile page on Docker Hub.
To check which Docker / OCI registries you are currently logged in to, use {command} registry list.
To log out of a registry, so that your credentials are forgotten, use {command} registry logout:
$ {command} registry logout docker://docker.io
INFO: Logout succeeded
If you have the docker CLI installed on your machine, you can docker login to your account. This stores authentication information in ~/.docker/config.json. The process that {Project} uses to retrieve Docker / OCI containers will attempt to use this information to log in.
Note
{Project} can only read credentials stored directly in ~/.docker/config.json. It cannot read credentials from external Docker credential helpers.
To perform a one-off interactive login, which will not store your credentials, use the --docker-login flag:
$ {command} pull --docker-login docker://myuser/private
Enter Docker Username: myuser
Enter Docker Password:
When calling {Project} in a CI/CD workflow, or other non-interactive scenario, it may be useful to specify Docker Hub login credentials using environment variables. These are often the default way of passing secrets into jobs within CI pipelines.
{Project} accepts a username, and password / token, as {ENVPREFIX}_DOCKER_USERNAME and {ENVPREFIX}_DOCKER_PASSWORD respectively. These environment variables will override any stored credentials.
If DOCKER_USERNAME and DOCKER_PASSWORD, without the {ENVPREFIX}_ prefix, are set they will also be used, provided the {ENVPREFIX}_ equivalents are not set and overriding them. This allows a single set of environment variables to be set for both {command} and docker operations.
$ export {ENVPREFIX}_DOCKER_USERNAME=myuser
$ export {ENVPREFIX}_DOCKER_PASSWORD=mytoken
$ {command} pull docker://myuser/private
You can use docker:// URIs with {Project} to pull and run containers from OCI registries other than Docker Hub. To do this, you'll need to include the hostname or IP address of the registry in your docker:// URI. Authentication with other registries is carried out in the same basic manner, but sometimes you'll need to retrieve your credentials using a specific tool, especially when working with Cloud Service Provider environments.
Below are specific examples for some common registries. Most other registries follow a similar pattern for pulling public images, and authenticating to access private images.
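The URI pattern described above can be sketched as a small shell helper. The function name docker_uri is hypothetical; it only assembles the string, and assumes that an image reference without a hostname refers to Docker Hub.

```shell
#!/bin/sh
# docker_uri REPO[:TAG] [REGISTRY]
# Hypothetical helper: prefix an image reference with docker:// and an optional registry host.
docker_uri() {
    ref="$1"
    registry="${2:-}"
    if [ -n "$registry" ]; then
        printf 'docker://%s/%s\n' "$registry" "$ref"
    else
        # No hostname given: the reference is resolved against Docker Hub.
        printf 'docker://%s\n' "$ref"
    fi
}

docker_uri sylabsio/lolcow:latest
docker_uri bitnami/python:3.7 quay.io
```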
Quay is an OCI container registry used by a large number of projects, and hosted at https://quay.io. To pull public containers from Quay, just include the quay.io hostname in your docker:// URI:
$ {command} pull docker://quay.io/bitnami/python:3.7
INFO: Converting OCI blobs to SIF format
INFO: Starting build...
...
$ {command} run python_3.7.sif
Python 3.7.12 (default, Sep 24 2021, 11:48:27)
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>
To pull containers from private repositories you will need to generate a CLI token in the Quay web interface, then use it to log in with {Project}. Use the same methods as described for Docker Hub above:
- Run {command} registry login --username myuser docker://quay.io to store your credentials for {Project}.
- Use docker login quay.io if docker is on your machine.
- Use the --docker-login flag for a one-time interactive login.
- Set the {ENVPREFIX}_DOCKER_USERNAME and {ENVPREFIX}_DOCKER_PASSWORD environment variables.
The NVIDIA NGC catalog at https://ngc.nvidia.com contains various GPU software, packaged in containers. Many of these containers are specifically documented by NVIDIA as supported by {Project}, with instructions available.
Previously, an account and API token were required to pull NGC containers. However, they are now available to pull as a guest without login:
$ {command} pull docker://nvcr.io/nvidia/pytorch:21.09-py3
INFO: Converting OCI blobs to SIF format
INFO: Starting build...
If you do need to pull containers using an NVIDIA account, e.g. if you have access to an NGC Private Registry, you will need to generate an API key in the web interface in order to authenticate.
Use one of the following authentication methods (detailed above for Docker Hub), with the username $oauthtoken and the password set to your NGC API key.
- Run {command} registry login --username \$oauthtoken docker://nvcr.io to store your credentials for {Project}.
- Use docker login nvcr.io if docker is on your machine.
- Use the --docker-login flag for a one-time interactive login.
- Set the {ENVPREFIX}_DOCKER_USERNAME="\$oauthtoken" and {ENVPREFIX}_DOCKER_PASSWORD environment variables.
See also: https://docs.nvidia.com/ngc/ngc-private-registry-user-guide/index.html
GitHub Container Registry is increasingly used to provide Docker containers alongside the source code of hosted projects. You can pull a public container from GitHub Container Registry using a ghcr.io URI:
$ {command} pull docker://ghcr.io/containerd/alpine:latest
INFO: Converting OCI blobs to SIF format
INFO: Starting build...
To pull private containers from GHCR you will need to generate a personal access token in the GitHub web interface in order to authenticate. This token must have the required scopes; see the GitHub documentation for details.
Use one of the following authentication methods (detailed above for Docker Hub), with your username and personal access token:
- Run {command} registry login --username myuser docker://ghcr.io to store your credentials for {Project}.
- Use docker login ghcr.io if docker is on your machine.
- Use the --docker-login flag for a one-time interactive login.
- Set the {ENVPREFIX}_DOCKER_USERNAME and {ENVPREFIX}_DOCKER_PASSWORD environment variables.
Note
{Project} can directly push SIF files to ghcr.io as well, using the oras:// protocol. The containers share the same namespace, but they have to be pulled using the same protocol that they were pushed with.
Working with an AWS-hosted Elastic Container Registry (ECR) generally requires authentication. There are various ways to generate credentials. You should follow one of the approaches in the ECR guide in order to obtain a username and password.
Warning
The ECR Docker credential helper cannot be used, as {Project} does not currently support external credential helpers used with Docker, only reading credentials stored directly in the .docker/config.json file.
The get-login-password approach is the most straightforward. It uses the AWS CLI to request a password, which can then be used to authenticate to an ECR private registry in the specified region. The username used in conjunction with this password is always AWS.
$ aws ecr get-login-password --region region
Then login using one of the following methods:
- Run {command} registry login --username AWS docker://<accountid>.dkr.ecr.<region>.amazonaws.com to store your credentials for {Project}.
- Use docker login --username AWS <accountid>.dkr.ecr.<region>.amazonaws.com if docker is on your machine.
- Use the --docker-login flag for a one-time interactive login.
- Set the {ENVPREFIX}_DOCKER_USERNAME=AWS and {ENVPREFIX}_DOCKER_PASSWORD environment variables.
You should now be able to pull containers from your ECR URI at docker://<accountid>.dkr.ecr.<region>.amazonaws.com.
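The ECR hostname pattern used above can be assembled from an account ID and a region, as in the following sketch. The helper name ecr_registry is hypothetical, and the account ID and region shown are placeholder values rather than a real registry.

```shell
#!/bin/sh
# ecr_registry ACCOUNT_ID REGION
# Hypothetical helper: print the hostname of an ECR private registry.
ecr_registry() {
    printf '%s.dkr.ecr.%s.amazonaws.com\n' "$1" "$2"
}

# 123456789012 and us-east-1 are placeholder values.
registry="$(ecr_registry 123456789012 us-east-1)"
echo "docker://$registry"

# With the AWS CLI installed and configured, the password could be captured
# in a variable and passed to one of the login methods listed above:
#   password="$(aws ecr get-login-password --region us-east-1)"
```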
An Azure hosted Azure Container Registry (ACR) will generally hold private images and require authentication to pull from. There are several ways to authenticate to ACR, depending on the account type you use in Azure. See the ACR documentation for more information on these options.
Generally, for identities, using az acr login from the Azure CLI will add credentials to .docker/config.json which can be read by {Project}.
Service principal accounts will have an explicit username and password, and you should authenticate using one of the following methods:
- Run {command} registry login --username myuser docker://myregistry.azurecr.io to store your credentials for {Project}.
- Use docker login --username myuser myregistry.azurecr.io if docker is on your machine.
- Use the --docker-login flag for a one-time interactive login.
- Set the {ENVPREFIX}_DOCKER_USERNAME and {ENVPREFIX}_DOCKER_PASSWORD environment variables.
The recent repository-scoped access token preview may be more convenient. See the preview documentation, which details how to use az acr token create to obtain a token name and password pair that can be used to authenticate with the above methods.
By default, {command} pull from a docker:// URI will attempt to fetch a container that matches the architecture of your host system. If you need to retrieve a container that does not have the same architecture as your host (e.g. an arm64 container on an amd64 host), you can use the --arch option.
The --arch option accepts a CPU architecture only. For example, to pull an Ubuntu image for a 64-bit ARM system:
$ {command} pull --arch arm64 docker://ubuntu
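Note that uname -m reports machine names such as x86_64 and aarch64, while this section uses the names amd64 and arm64. The sketch below maps one spelling to the other so that a script can compare the host against a desired architecture. The function name to_arch_name is hypothetical, and only the two common renamings are handled; any other name is passed through unchanged.

```shell
#!/bin/sh
# to_arch_name MACHINE
# Hypothetical helper: map a `uname -m` machine name to the amd64 / arm64 style of name.
to_arch_name() {
    case "$1" in
        x86_64)        echo amd64 ;;
        aarch64|arm64) echo arm64 ;;
        *)             echo "$1" ;;   # e.g. ppc64le is spelled the same way
    esac
}

host_arch="$(to_arch_name "$(uname -m)")"
echo "host architecture: $host_arch"
```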
If you try to run a container that does not match the host CPU architecture, it will likely fail:
$ {command} run ppc64le.sif
FATAL: While checking image: could not open image ppc64le.sif: the image's architecture (ppc64le) could not run on the host's (amd64)
However, {Project} is able to make use of CPU emulation with QEMU, and the Linux kernel's binfmt_misc mechanism, to run containers that do not match the host CPU.
An administrator can configure emulation support by installing distribution packages, or using the multiarch/qemu-user-static container from Docker Hub:
$ sudo {command} run docker://multiarch/qemu-user-static --reset -p yes
Note
Running this container with sudo will modify system configuration files, and register binaries on the host.
It is now possible to run containers for other architectures:
# The host system is an AMD64 / x86_64 machine
$ uname -m
x86_64
# A ppc64le container can be run using emulation
$ {command} run ppc64le.sif uname -m
ppc64le
Running a container in this manner, using emulation, will be many times slower than running on a system where the CPU architecture matches the container. Emulation is often useful for testing and development purposes, but rarely appropriate when deploying a container to an HPC system.
If you wish to use an existing Docker or OCI container as the basis for a new container, you will need to specify it as the bootstrap source in {aProject} definition file.
Just as you can run or pull containers from different registries using a docker:// URI, you can use different headers in a definition file to instruct {Project} where to find the container you want to use as the starting point for your build.
When you wish to build from a Docker or OCI container that's hosted in a registry, such as Docker Hub, your definition file should begin with Bootstrap: docker, followed by a From: line which specifies the location of the container you wish to pull.
Docker Hub is the default registry, so when building from Docker Hub the From: header only needs to specify the container repository and tag:
Bootstrap: docker
From: ubuntu:20.04
If you {command} build a definition file with these lines, {Project} will fetch the ubuntu:20.04 container image from Docker Hub, and extract it as the basis for your new container.
To pull from a different Docker registry, you can either specify the hostname in the From: header, or use the separate Registry: header. The following two examples are equivalent:
Bootstrap: docker
From: quay.io/bitnami/python:3.7

Bootstrap: docker
Registry: quay.io
From: bitnami/python:3.7
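When definition files are generated by a script, the headers above can be written with a few lines of shell. This is only a sketch: write_def is a hypothetical helper, and it emits just the bootstrap headers shown in this section, not a complete definition file.

```shell
#!/bin/sh
# write_def OUTFILE FROM [REGISTRY]
# Hypothetical helper: write the bootstrap headers of a definition file.
write_def() {
    outfile="$1"
    from="$2"
    registry="${3:-}"
    {
        echo "Bootstrap: docker"
        # The Registry: header is only needed when the hostname is not part of From:.
        if [ -n "$registry" ]; then
            echo "Registry: $registry"
        fi
        echo "From: $from"
    } > "$outfile"
}

def_file="$(mktemp)"
write_def "$def_file" bitnami/python:3.7 quay.io
cat "$def_file"
rm -f "$def_file"
```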
If you are building from an image in a private registry you will need to ensure that the credentials needed to access the image are available to {Project}.
A build might be run as the root user, e.g. via sudo, or under your own account.
If you are running the build as root, using sudo, then any stored credentials or environment variables must be available to the root user:
- Use the --docker-login flag for a one-time interactive login, i.e. run sudo {command} build --docker-login myimage.sif {Project}.
- Set the {ENVPREFIX}_DOCKER_USERNAME and {ENVPREFIX}_DOCKER_PASSWORD environment variables. Run sudo -E {command} build ... to pass the environment variables through sudo to the root build process.
- Run sudo {command} registry login ... to store your credentials for the root user on your system. This is separate from storing the credentials under your own account.
- Use sudo docker login if docker is on your machine. This is separate from storing the credentials under your own account.
If you are running the build under your account you do not need to specially set credentials for the root user.
As well as being hosted in a registry, Docker / OCI containers might be found inside a running Docker daemon, or saved as an archive. {Project} can build from these locations by using specialized bootstrap agents.
If you have pulled or run a container on your machine under docker, it will be cached locally by the Docker daemon. The docker images command will list containers that are available:
$ docker images
REPOSITORY        TAG      IMAGE ID       CREATED       SIZE
sylabsio/lolcow   latest   5a15b484bc65   2 hours ago   188MB
This indicates that sylabsio/lolcow:latest has been cached locally by Docker. You can directly build it into a SIF file using a docker-daemon: URI specifying the REPOSITORY:TAG container name:
$ {command} build lolcow_from_docker_cache.sif docker-daemon:sylabsio/lolcow:latest
INFO: Starting build...
Getting image source signatures
Copying blob sha256:a2022691bf950a72f9d2d84d557183cb9eee07c065a76485f1695784855c5193
 119.83 MiB / 119.83 MiB [==================================================] 6s
Copying blob sha256:ae620432889d2553535199dbdd8ba5a264ce85fcdcd5a430974d81fc27c02b45
 15.50 KiB / 15.50 KiB [====================================================] 0s
Copying blob sha256:c561538251751e3685c7c6e7479d488745455ad7f84e842019dcb452c7b6fecc
 14.50 KiB / 14.50 KiB [====================================================] 0s
Copying blob sha256:f96e6b25195f1b36ad02598b5d4381e41997c93ce6170cab1b81d9c68c514db0
 5.50 KiB / 5.50 KiB [======================================================] 0s
Copying blob sha256:7f7a065d245a6501a782bf674f4d7e9d0a62fa6bd212edbf1f17bad0d5cd0bfc
 3.00 KiB / 3.00 KiB [======================================================] 0s
Copying blob sha256:70ca7d49f8e9c44705431e3dade0636a2156300ae646ff4f09c904c138728839
 116.56 MiB / 116.56 MiB [==================================================] 6s
Copying config sha256:73d5b1025fbfa138f2cacf45bbf3f61f7de891559fa25b28ab365c7d9c3cbd82
 3.33 KiB / 3.33 KiB [======================================================] 0s
Writing manifest to image destination
Storing signatures
INFO: Creating SIF file...
INFO: Build complete: lolcow_from_docker_cache.sif
The tag name must be included in the URI. Unlike when pulling from a registry, the docker-daemon bootstrap agent will not try to pull a latest tag automatically.
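Because the tag is mandatory here, a wrapper script may want to make it explicit before building. The sketch below appends :latest only when the final path component carries no tag or digest. The function name daemon_uri is hypothetical, and defaulting to latest is an assumption about what the caller wants, not behavior of the bootstrap agent.

```shell
#!/bin/sh
# daemon_uri NAME
# Hypothetical helper: build a docker-daemon: URI, adding an explicit tag if none is given.
daemon_uri() {
    name="$1"
    # Inspect only the part after the last "/", so that a registry port
    # such as localhost:5000 is not mistaken for a tag.
    last="${name##*/}"
    case "$last" in
        *:*) ;;                       # a tag or digest is already present
        *)   name="$name:latest" ;;   # assumption: default to the latest tag
    esac
    printf 'docker-daemon:%s\n' "$name"
}

daemon_uri sylabsio/lolcow
daemon_uri sylabsio/lolcow:latest
```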
Note
In the example above, the build was performed without sudo. This is possible only when the user is part of the docker group on the host, since {Project} must contact the Docker daemon through its socket. If you are not part of the docker group you will need to use sudo for the build to complete successfully.
To build from an image cached by the Docker daemon in a definition file, use Bootstrap: docker-daemon and a From: <REPOSITORY>:<TAG> line:
Bootstrap: docker-daemon
From: sylabsio/lolcow:latest
Docker allows containers to be exported into single file tar archives. These cannot be run directly, but are intended to be imported into Docker to run at a later date, or another location. {Project} can build from (or run) these archive files, by extracting them as part of the build process.
If an image is listed by the docker images command, then we can create a tar archive file using docker save and the image ID:
$ sudo docker images
REPOSITORY        TAG      IMAGE ID       CREATED       SIZE
sylabsio/lolcow   latest   5a15b484bc65   2 hours ago   188MB
$ docker save 5a15b484bc65 -o lolcow.tar
If we examine the contents of the tar file we can see that it contains the layers and metadata that make up a Docker container:
$ tar tvf lolcow.tar
drwxr-xr-x 0 0 0 0 Aug 16 11:22 2f0514a4c044af1ff4f47a46e14b6d46143044522fcd7a9901124209d16d6171/
-rw-r--r-- 0 0 0 3 Aug 16 11:22 2f0514a4c044af1ff4f47a46e14b6d46143044522fcd7a9901124209d16d6171/VERSION
-rw-r--r-- 0 0 0 401 Aug 16 11:22 2f0514a4c044af1ff4f47a46e14b6d46143044522fcd7a9901124209d16d6171/json
-rw-r--r-- 0 0 0 75156480 Aug 16 11:22 2f0514a4c044af1ff4f47a46e14b6d46143044522fcd7a9901124209d16d6171/layer.tar
-rw-r--r-- 0 0 0 1499 Aug 16 11:22 5a15b484bc657d2b418f2c20628c29945ec19f1a0c019d004eaf0ca1db9f952b.json
drwxr-xr-x 0 0 0 0 Aug 16 11:22 af7e389ea6636873dbc5adc17826e8401d96d3d384135b2f9fe990865af202ab/
-rw-r--r-- 0 0 0 3 Aug 16 11:22 af7e389ea6636873dbc5adc17826e8401d96d3d384135b2f9fe990865af202ab/VERSION
-rw-r--r-- 0 0 0 946 Aug 16 11:22 af7e389ea6636873dbc5adc17826e8401d96d3d384135b2f9fe990865af202ab/json
-rw-r--r-- 0 0 0 118356480 Aug 16 11:22 af7e389ea6636873dbc5adc17826e8401d96d3d384135b2f9fe990865af202ab/layer.tar
-rw-r--r-- 0 0 0 266 Dec 31 1969 manifest.json
We can convert this tar file into {aProject} container using the docker-archive bootstrap agent. Because the agent accesses a file, rather than an object hosted by a service, it uses :<filename>, not ://<location>. To build a tar archive directly to a SIF container:
$ {command} build lolcow_tar.sif docker-archive:lolcow.tar
INFO: Starting build...
Getting image source signatures
Copying blob sha256:2f0514a4c044af1ff4f47a46e14b6d46143044522fcd7a9901124209d16d6171
 119.83 MiB / 119.83 MiB [==================================================] 6s
Copying blob sha256:af7e389ea6636873dbc5adc17826e8401d96d3d384135b2f9fe990865af202ab
 15.50 KiB / 15.50 KiB [====================================================] 0s
Copying config sha256:5a15b484bc657d2b418f2c20628c29945ec19f1a0c019d004eaf0ca1db9f952b
 3.33 KiB / 3.33 KiB [======================================================] 0s
Writing manifest to image destination
Storing signatures
INFO: Creating SIF file...
INFO: Build complete: lolcow_tar.sif
Note
The docker-archive bootstrap agent can also handle gzipped Docker archives (.tar.gz or .tgz files).
To build an image using a definition file, which starts from a container in a Docker archive, use Bootstrap: docker-archive and specify the filename in the From: line:
Bootstrap: docker-archive
From: lolcow.tar
Though Docker / OCI container compatibility is a goal of {Project}, there are some differences and limitations due to the way {Project} was designed to work well on shared systems and HPC clusters. If you are having difficulty running a specific Docker container, check through the list of differences below. There are workarounds for many of the issues that you are most likely to face.
{Project}'s container image format (SIF) is generally read-only. This permits containers to be run in parallel from a shared location on a network filesystem, support in-built signing and verification, and offer encryption. A container's filesystem is mounted directly from the SIF, as SquashFS, so cannot be written to by default.
When a container is run using Docker its layers are extracted, and the resulting container filesystem can be written to and modified by default. If a Docker container expects to write files, you will need to follow one of the following methods to allow it to run under {Project}.
- A directory from the host can be passed into the container with the --bind or --mount flags. It needs to be mounted inside the container at the location where files will be written.
- The --writable-tmpfs flag can be used to allow files to be created in a special temporary overlay. Any changes are lost when the container exits. The SIF file is never modified.
- The container can be converted to a sandbox directory, and executed with the --writable flag, which allows modification of the sandbox content.
- A writable overlay partition can be added to the SIF file, and the container executed with the --writable flag. Any changes made are kept permanently in the overlay partition.
Of these methods, only --writable-tmpfs is always safe to run in parallel. Each time the container is executed, a separate temporary overlay is used and then discarded.
Binding a directory into a container, or running a writable sandbox, may or may not be safe, depending on the program executed. The program must use some type of locking, and the filesystem must support it, so that parallel runs do not interfere with each other.
A writable overlay partition in a SIF file cannot be used in parallel. {Project} will refuse to run concurrently using the same SIF writable overlay partition.
Note
The --writable-tmpfs size is controlled by sessiondir max size in {command}.conf. This defaults to 64MiB, and may need to be increased if your workflows create larger temporary files.
The Dockerfile used to build a Docker container may contain a USER statement. This tells the container runtime that it should run the container under the specified user account.
Because {Project} is designed to provide easy and safe access to data on the host system, work under batch schedulers, etc., it does not permit changing the user account the container is run as.
Any USER statement in a Dockerfile will be ignored by {Project} when the container is run. In practice, this often does not affect the execution of the software in the container. Software that is written in a way that requires execution under a specific user account will generally require modification for use with {Project}.
{Project}'s --fakeroot mode will start a container as a fake root user, mapped to the user's real account outside of the container. When using the fakeroot mode that is based on /etc/subuid, it is possible to change, inside the container, to another user account which is mapped to different subuids belonging to the original user. It may be possible to execute software expecting a fixed user account manually inside such a --fakeroot shell.
A default installation of {Project} will mount the user's home directory, /tmp directory, and the current working directory, into each container that is run. Administrators may also configure e.g. HPC project directories to automatically bind mount. Docker does not mount host directories into the container by default.
The home directory mount is the most likely to cause problems when running Docker containers. Various software will look for packages, plugins, and configuration files in $HOME. If you have, for example, installed packages for Python into your home directory (pip install --user) then a Python container may find and attempt to use them. This can cause conflicts and unexpected behavior.
If you experience issues, use the --contain option to stop {Project} automatically binding directories into the container. You may need to use --bind or --mount to then add back e.g. an HPC project directory that you need access to.
# Without --contain, python in the container finds packages
# in your $HOME directory.
$ {command} exec docker://python:3.9 pip list
Package    Version
---------- -------
pip        21.2.4
rstcheck   3.3.1
setuptools 57.5.0
wheel      0.37.0
# With --contain, python in the container only finds packages
# installed in the container.
$ {command} exec --contain docker://python:3.9 pip list
Package    Version
---------- -------
pip        21.2.4
setuptools 57.5.0
wheel      0.37.0
{Project} propagates most environment variables set on the host into the container, by default. Docker does not propagate any host environment variables into the container. Environment variables may change the behaviour of software.
To disable automatic propagation of environment variables, the --cleanenv / -e flag can be specified. When --cleanenv is used, only variables on the host that are prefixed with {ENVPREFIX}ENV_ are set in the container:
# Set a host variable
$ export HOST_VAR=123
# Set a container environment variable
$ export {ENVPREFIX}ENV_FORCE_VAR=123
$ {command} run docker://alpine env | grep VAR
FORCE_VAR=123
HOST_VAR=123
$ {command} run --cleanenv docker://alpine env | grep VAR
FORCE_VAR=123
Any environment variables set via an ENV line in a Dockerfile will be available when the container is run with {Project}. You can override them with {ENVPREFIX}ENV_ vars, or the --env / --env-file flags, but they will not be overridden by host environment variables.
For example, the docker://openjdk:latest container sets JAVA_HOME:
# Set a host JAVA_HOME
$ export JAVA_HOME=/test
# Check JAVA_HOME in the docker container.
# This value comes from ENV in the Dockerfile.
$ {command} run docker://openjdk:latest echo \$JAVA_HOME
/usr/java/openjdk-17
# Override JAVA_HOME in the container
$ export {ENVPREFIX}ENV_JAVA_HOME=/test
$ {command} run docker://openjdk:latest echo \$JAVA_HOME
/test
The default behavior of {Project} differs from Docker/OCI handling of environment variables as {Project} uses a shell interpreter to process environment on container startup, in a manner that evaluates environment variables. To avoid the extra evaluation of variables that {Project} performs you can:
- Follow the instructions in the :ref:`escaping-environment` section to explicitly escape environment variables.
- Use the --no-eval flag.
--no-eval prevents {Project} from evaluating environment variables on container startup, so that they will take the same value as with a Docker/OCI runtime:
# Set an environment variable that would run `date` if evaluated
$ export {ENVPREFIX}ENV_MYVAR='$(date)'
# Default behavior
# MYVAR was evaluated in the container, and is set to the output of `date`
$ {command} run ~/ubuntu_latest.sif env | grep MYVAR
MYVAR=Tue Apr 26 14:37:07 CDT 2022
# --no-eval / --compat behavior
# MYVAR was not evaluated and is a literal `$(date)`
$ {command} run --no-eval ~/ubuntu_latest.sif env | grep MYVAR
MYVAR=$(date)
Because {Project} favors an integration over isolation approach it does not, by default, use all the methods through which a container can be isolated from the host system. This makes it much easier to run a {Project} container like any other program, while the unique security model ensures safety. You can access the host's network, GPUs, and other devices directly. Processes in the container are not numbered separately from host processes. Hostnames are not changed, etc.
Most containers are not impacted by the differences in isolation. If you require more isolation than {Project} provides by default, you can enable some of the extra namespaces that Docker uses, with flags:
- --ipc / -i creates a separate IPC (inter process communication) namespace, for SystemV IPC objects and POSIX message queues.
- --net / -n creates a new network namespace, abstracting the container networking from the host.
- --userns / -u runs the container unprivileged, inside a user namespace, avoiding setuid setup code if it is installed.
- --uts creates a new UTS namespace, which allows a different hostname and/or NIS domain for the container.
To limit presentation of devices from the host into the container, use the --contain flag. As well as preventing automatic binds of host directories into the container, --contain sets up a minimal /dev directory, rather than binding in the entire host /dev tree.
Note
When using the --nv or --rocm flags, GPU devices are present in the container even when --contain is used.
When {aProject} container is run using the --pid / -p option, or started as an instance (which implies --pid), a shim init process is executed that will run the container payload itself.
The shim process helps to ensure signals are propagated correctly from the terminal, or batch schedulers etc. when containers are not designed for interactive use. Because Docker does not provide an init process by default, some containers have been designed to run their own init process, which cannot operate under the control of {Project}'s shim.
For example, a container using the tini init process will produce warnings when started as an instance, or if run with --pid. To work around this, use the --no-init flag to disable the shim:
    $ {command} run --pid tini_example.sif
    [WARN tini (2690)] Tini is not running as PID 1 .
    Zombie processes will not be re-parented to Tini, so zombie reaping won't work.
    To fix the problem, run Tini as PID 1.

    $ {command} run --pid --no-init tini_example.sif
    ...
    # NO WARNINGS
If Docker-like behavior is important, {Project} can be started with the --compat flag. This flag is a convenient shorthand alternative to using all of:
--containall
--no-init
--no-umask
--writable-tmpfs
--no-eval
A container run with --compat has:
- A writable root filesystem, using a temporary overlay where changes are discarded at container exit.
- No automatic bind mounts of $HOME or other directories from the host into the container.
- Empty temporary $HOME and /tmp directories, the contents of which will be discarded at container exit.
- A minimal /dev tree, that does not expose host devices inside the container (except GPUs when used with --nv or --rocm).
- A clean environment, not including environment variables set on the host.
- Its own PID and IPC namespaces.
- No shim init process.
- Argument and environment variable handling matching Docker / OCI runtimes, with respect to evaluation and escaping.
These options will allow most, but not all, Docker / OCI containers to execute correctly under {Project}. The user namespace and network namespace are not used, as these negate benefits of SIF and direct access to high performance cluster networks.
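The shorthand described above can be pictured as a simple flag substitution. The following is a minimal sketch only: `expand_compat` is a hypothetical helper written for this illustration, not part of {Project}, and it merely prints the expanded argument list rather than running a container.

```shell
#!/bin/sh
# Flags that --compat stands for, as listed in this section.
COMPAT_FLAGS="--containall --no-init --no-umask --writable-tmpfs --no-eval"

# Hypothetical helper: prints its arguments with --compat replaced
# by the individual flags it is shorthand for.
expand_compat() {
    out=""
    for arg in "$@"; do
        if [ "$arg" = "--compat" ]; then
            out="$out $COMPAT_FLAGS"
        else
            out="$out $arg"
        fi
    done
    # Strip the leading space added by the first append
    printf '%s\n' "${out# }"
}

expand_compat run --compat mycontainer.sif
```

Running the sketch prints the long form of the command-line arguments, which can be useful when you want to drop one of the individual flags while keeping the rest.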
When a container is run using docker, its default behavior depends on the CMD and/or ENTRYPOINT set in the Dockerfile that was used to build it, along with any arguments on the command line. The CMD and ENTRYPOINT can also be overridden by flags.
{AProject} container has the concept of a runscript, which is a single shell script defining what happens when you {command} run the container. Because there is no internal concept of CMD and ENTRYPOINT, {Project} must create a runscript from the CMD and ENTRYPOINT when converting a Docker container. The behavior of this script mirrors Docker as closely as possible.
If the Docker container only has an ENTRYPOINT, that ENTRYPOINT is run, with any arguments appended:
    # ENTRYPOINT="date"

    # Runs 'date'
    $ {command} run mycontainer.sif
    Wed 06 Oct 2021 02:42:54 PM CDT

    # Runs 'date --utc'
    $ {command} run mycontainer.sif --utc
    Wed 06 Oct 2021 07:44:27 PM UTC
If the Docker container only has a CMD, the CMD is run, or is replaced with any arguments:
    # CMD="date"

    # Runs 'date'
    $ {command} run mycontainer.sif
    Wed 06 Oct 2021 02:45:39 PM CDT

    # Runs 'echo hello'
    $ {command} run mycontainer.sif echo hello
    hello
If the Docker container has both a CMD and an ENTRYPOINT, the ENTRYPOINT is run with either the CMD as default arguments, or with any user-supplied arguments in place of the CMD:
    # ENTRYPOINT="date"
    # CMD="--utc"

    # Runs 'date --utc'
    $ {command} run mycontainer.sif
    Wed 06 Oct 2021 07:48:43 PM UTC

    # Runs 'date -R'
    $ {command} run mycontainer.sif -R
    Wed, 06 Oct 2021 14:49:07 -0500
There is no flag to override an ENTRYPOINT set for a Docker container. Instead, use {command} exec to run an arbitrary program inside a container.
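The three ENTRYPOINT / CMD cases above can be summarized in a few lines of shell. This is a simplified sketch of the rules as described in this section, not the runscript that {Project} actually generates; `resolve_runscript` is a hypothetical name, and an empty string stands for an ENTRYPOINT or CMD that is not set.

```shell
#!/bin/sh
# Hypothetical helper: prints the command line that would be run.
# Usage: resolve_runscript ENTRYPOINT CMD [user arguments...]
resolve_runscript() {
    entrypoint="$1"
    cmd="$2"
    shift 2

    if [ -n "$entrypoint" ]; then
        if [ "$#" -gt 0 ]; then
            # User arguments replace CMD and are appended to ENTRYPOINT
            printf '%s\n' "$entrypoint $*"
        elif [ -n "$cmd" ]; then
            # CMD supplies the default arguments for ENTRYPOINT
            printf '%s\n' "$entrypoint $cmd"
        else
            printf '%s\n' "$entrypoint"
        fi
    elif [ "$#" -gt 0 ]; then
        # No ENTRYPOINT: user arguments replace CMD entirely
        printf '%s\n' "$*"
    else
        printf '%s\n' "$cmd"
    fi
}

resolve_runscript "date" ""             # ENTRYPOINT only: date
resolve_runscript "date" "" --utc       # ENTRYPOINT only: date --utc
resolve_runscript "" "date" echo hello  # CMD only: echo hello
resolve_runscript "date" "--utc"        # both: date --utc
resolve_runscript "date" "--utc" -R     # both: date -R
```

The sketch ignores quoting and the exec-form versus shell-form distinction in a Dockerfile, both of which matter in a real runscript.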
Because {Project} runscripts are evaluated shell scripts, arguments can behave slightly differently than in Docker/OCI runtimes if they contain shell code that may be evaluated.
If you are using a container that was directly built or run from a Docker/OCI source, with {Project} 1.1.0 or later, the --no-eval flag will prevent this extra evaluation so that arguments are handled in a compatible manner:
    # docker/OCI behavior
    $ docker run -it --rm alpine echo "\$HOSTNAME"
    $HOSTNAME

    # {Project} default
    $ {command} run docker://alpine echo "\$HOSTNAME"
    p700

    # {Project} with --no-eval
    $ {command} run --no-eval docker://alpine echo "\$HOSTNAME"
    $HOSTNAME
Note
--no-eval will not change argument behavior for containers built with a version of {Project} earlier than 1.1.0, as the handling is implemented in the runscript that is built into the container.
You can check the version of {Project} used to build a container with {command} inspect mycontainer.sif.
To avoid evaluation without --no-eval, and when using containers built with a version earlier than {Project} 1.1.0, you will need to add an extra level of shell escaping to arguments on the command line:
    $ docker run -it --rm alpine echo "\$HOSTNAME"
    $HOSTNAME

    $ {command} run docker://alpine echo "\$HOSTNAME"
    p700

    $ {command} run docker://alpine echo "\\\$HOSTNAME"
    $HOSTNAME
If you are running a binary inside a docker:// container directly, using the exec command, the argument handling mirrors Docker/OCI runtimes as there is no evaluated runscript.
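The effect of the extra evaluation can be reproduced in plain shell, without any container runtime. The sketch below is an analogy only: `run_direct` and `run_evaluated` are hypothetical functions that stand in for direct execution and for an evaluated runscript, and `DEMO_HOST` is an illustrative variable playing the role of `HOSTNAME`.

```shell
#!/bin/sh
DEMO_HOST=p700

# Stands in for direct execution: arguments are passed through untouched.
run_direct() {
    printf '%s\n' "$*"
}

# Stands in for an evaluated runscript: the argument string is parsed
# by the shell a second time, so unescaped variables are expanded.
run_evaluated() {
    eval "printf '%s\n' $*"
}

run_direct "\$DEMO_HOST"         # prints $DEMO_HOST
run_evaluated "\$DEMO_HOST"      # prints p700
run_evaluated "\\\$DEMO_HOST"    # prints $DEMO_HOST
```

The third call shows why one more level of escaping is needed when the arguments pass through an evaluated script: the first level is consumed by your interactive shell, and the second by the evaluation.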
As detailed previously, {Project} can make use of most Docker and OCI images without issues, or via simple workarounds. In general, however, there are some best practices that should be applied when creating Docker / OCI containers that will also be run using {Project}.
- Don't require execution by a specific user
  Avoid using the USER instruction in your Dockerfile, as it is ignored by {Project}. Install and configure software inside the container so that it can be run by any user.
- Don't install software under /root or in another user's home directory
  Because a Docker container builds and runs as the root user by default, it's tempting to install software into root's home directory (/root). Permissions on /root are usually set so that it is inaccessible to non-root users. When the container is run as another user the software may be inaccessible.
  Software inside another user's home directory, e.g. /home/myapp, may be obscured by {Project}'s automatic mounts onto /home.
  Install software into system-wide locations in the container, such as under /usr or /opt to avoid these issues.
- Support a read-only filesystem
  Because of the immutable nature of the SIF format, a container run with {Project} is read-only by default.
  Try to ensure your container will run with a read-only filesystem. If this is not possible, document exactly where the container needs to write, so that a user can bind in a writable location, or use --writable-tmpfs as appropriate.
  You can test read-only execution with Docker using docker run --read-only --tmpfs /run --tmpfs /tmp sylabsio/lolcow.
- Be careful writing to /tmp
  {Project} mounts the host /tmp into the container, by default. This means you must be careful when writing sensitive information to /tmp, and should ensure your container cleans up files it writes there.
- Consider library caches / ldconfig
  If your Dockerfile adds libraries and / or manipulates the ld search path in the container (ld.so.conf / ld.so.conf.d), you should ensure the library cache is updated during the build.
  Because {Project} runs containers read-only by default, the cache and any missing library symlinks may not be able to be updated / created at execution time.
  Run ldconfig toward the end of your Dockerfile to ensure symbolic links and the ld.so.cache are up-to-date.
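As an illustration of the install-location advice in the list above, the following sketch classifies a prefix before software is installed into it. `check_install_prefix` is a hypothetical helper written for this example; it only inspects the path string and does not examine a real container.

```shell
#!/bin/sh
# Hypothetical helper: flags install prefixes that are likely to cause
# problems when the container is run as an arbitrary user.
check_install_prefix() {
    case "$1" in
        /root|/root/*)
            echo "avoid: $1 is usually inaccessible to non-root users" ;;
        /home|/home/*)
            echo "avoid: $1 may be hidden by automatic mounts onto /home" ;;
        *)
            echo "ok: $1" ;;
    esac
}

check_install_prefix /root/myapp
check_install_prefix /home/myapp
check_install_prefix /opt/myapp
```

A check like this could be run from a build script before the install step, so that a problematic prefix is caught at build time rather than when a user first runs the container.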
If you experience problems pulling containers from a private registry, check your credentials carefully. You can {command} pull with the --docker-login flag to perform an interactive login. This may be useful if you are unsure whether you have stored credentials properly via {command} registry login or docker login.
OCI registries expect different values for username and password fields.
Some require a token to be generated and used instead of your account
password. Some take a generic username, and rely only on the token to
identify you. Consult the documentation for your registry carefully.
Look for instructions that detail how to log in via docker login without external helper programs, if possible.
If a Docker container fails to start, the most common cause is that it needs to write files, while {Project} runs read-only by default.
Try running with the --writable-tmpfs option, or the --compat flag (which enables additional compatibility fixes).
You can also look for error messages mentioning 'permission denied' or 'read-only filesystem'. Note where the program is attempting to write, and use --bind or --mount to bind a directory from the host system into that location. This will allow the container to write the needed files, which will appear in the directory you bind in.
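When a container produces a lot of output, filtering it for the error messages mentioned above can help locate the paths that need a writable bind. The following is a minimal sketch: `find_write_errors` is a hypothetical helper, and the sample log lines are invented for this example rather than taken from a real container.

```shell
#!/bin/sh
# Hypothetical helper: reads program output on stdin and prints only
# the lines that suggest a failed write. The pattern accepts both
# "read-only filesystem" and "Read-only file system".
find_write_errors() {
    grep -i -E 'permission denied|read-only file ?system' || true
}

# Sample output, invented for illustration
printf '%s\n' \
    "starting service" \
    "mkdir: cannot create directory '/var/run/app': Read-only file system" \
    "open /opt/app/cache.db: permission denied" \
    | find_write_errors
```

In practice you would pipe the combined standard output and standard error of the failing container command into the helper, then bind a host directory over each path that appears in the matching lines.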
If a Docker container runs, but exhibits unexpected behavior, the most likely cause is the different level of isolation that {Project} provides vs Docker.
Try running the container with the --contain option, or the --compat option (which is more strict). This disables the automatic mount of your home directory, which is a common source of issues where software in the container loads configuration or packages that may be present there.
The community Slack channels and mailing list are excellent places to ask for help with running a specific Docker container. Other users may have already had success running the same container or software. Please don't report issues with specific Docker containers on GitHub, unless you believe they are due to a bug in {Project}.
An alternative to running Docker containers with {Project} is to re-write the Dockerfile as a definition file, and build a native SIF image.
The table below gives a quick reference comparing Dockerfile and {Project} definition files. For more detail please see :ref:`definition-files`.
| {Project} definition file section | Description | Dockerfile section | Description |
|---|---|---|---|
| Bootstrap | Defines the source of the base image to build your container from. Many bootstrap agents are supported, e.g. library, docker, http, shub, yum, debootstrap. | - | Can only bootstrap from Docker Hub. |
| From: | Specifies the base image from which to build the container. | FROM | Creates a layer from the specified docker image. |
| %arguments | Section to set the default values for defined variables in the definition file. Used for the image building process. | ARG | Supports templated builds. Users can change the variable values among different builds. |
| %setup | Run setup commands outside of the container (on the host system) after the base image bootstrap. | - | Not supported. |
| %files | Copy files from your host to the container, or between build stages. | COPY | Copy files from your host to the container, or between build stages. |
| %environment | Declare and set container environment variables. | ENV | Declare and set a container environment variable. |
| %help | Provide a help section for your container image. | - | Not supported. |
| %post | Commands that will be run at build-time. | RUN | Commands that will be run at build-time. |
| %runscript | Commands that will be run when you {command} run the container image. | ENTRYPOINT, CMD | Commands / arguments that will run in the container image. |
| %startscript | Commands that will be run when an instance is started. | - | Not applicable. |
| %test | Commands that run at the very end of the build process to validate the container using a method of your choice (e.g. to verify distribution or software versions installed inside the container). | HEALTHCHECK | Commands that verify the health status of the container. |
| %apps | Allows you to install internal modules based on the concept of SCIF-apps. | - | Not supported. |
| %labels | Section to add and define metadata describing your container. | LABEL | Declare container metadata as a key-value pair. |