
Update Dockerfile to use pip install and venv #1509

Merged (9 commits, Aug 31, 2021)
@@ -199,7 +199,7 @@ def test_python_ndcg_at_k(rating_true, rating_pred, rating_nohit):
col_prediction=DEFAULT_RATING_COL,
k=10,
)
== 1
== pytest.approx(1.0, TOL)
)
assert ndcg_at_k(rating_true, rating_nohit, k=10) == 0.0
assert ndcg_at_k(rating_true, rating_pred, k=10) == pytest.approx(0.38172, TOL)
129 changes: 79 additions & 50 deletions tools/docker/Dockerfile
@@ -9,109 +9,138 @@ FROM ubuntu:18.04 AS base
LABEL maintainer="Microsoft Recommender Project <[email protected]>"

ARG HOME
ARG VIRTUAL_ENV
ENV HOME="${HOME}"
WORKDIR ${HOME}

# Install base dependencies
# Exit if VIRTUAL_ENV is not specified correctly
RUN if [ "${VIRTUAL_ENV}" != "conda" ] && [ "${VIRTUAL_ENV}" != "venv" ]; then \
echo 'VIRTUAL_ENV argument should be either "conda" or "venv"'; exit 1; fi

# Install base dependencies, cmake (for xlearn) and libpython3.6 (for cornac)
RUN apt-get update && \
apt-get install -y curl git
apt-get install -y curl build-essential cmake libpython3.6

# Install Anaconda
ARG ANACONDA="https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh"
RUN curl ${ANACONDA} -o anaconda.sh && \
RUN if [ "${VIRTUAL_ENV}" = "conda" ] ; then curl ${ANACONDA} -o anaconda.sh && \
/bin/bash anaconda.sh -b -p conda && \
rm anaconda.sh && \
echo ". ${HOME}/conda/etc/profile.d/conda.sh" >> ~/.bashrc && \
echo "conda activate base" >> ~/.bashrc
ENV PATH="${HOME}/conda/bin:${PATH}"
echo "conda activate base" >> ~/.bashrc ; fi

# Clone Recommenders repo
ARG BRANCH="main"
RUN git clone --depth 1 --single-branch -b ${BRANCH} https://github.com/microsoft/recommenders
ENV PATH="${HOME}/${VIRTUAL_ENV}/bin:${PATH}"

# Setup Jupyter notebook configuration
ENV NOTEBOOK_CONFIG="${HOME}/.jupyter/jupyter_notebook_config.py"
RUN mkdir ${HOME}/.jupyter && \
echo "c.NotebookApp.token = ''" >> ${NOTEBOOK_CONFIG} && \
echo "c.NotebookApp.ip = '0.0.0.0'" >> ${NOTEBOOK_CONFIG} && \
echo "c.NotebookApp.allow_root = True" >> ${NOTEBOOK_CONFIG} && \
echo "c.NotebookApp.open_browser = False" >> ${NOTEBOOK_CONFIG} && \
echo "c.MultiKernelManager.default_kernel_name = 'python3'" >> ${NOTEBOOK_CONFIG}
# Python version supported by recommenders
RUN if [ "${VIRTUAL_ENV}" = "conda" ] ; then conda install python=3.6; fi
SHELL ["/bin/bash", "-c"]
RUN if [ "${VIRTUAL_ENV}" = "venv" ] ; then apt-get -y install python3.6; \
apt-get -y install python3-pip; \
apt-get -y install python3.6-venv; \
python3.6 -m venv --system-site-packages $HOME/venv; \
source $HOME/venv/bin/activate; \
pip install --upgrade pip; \
pip install --upgrade setuptools; fi


###########
# CPU Stage
###########
FROM base AS cpu

RUN python recommenders/tools/generate_conda_file.py --name base
RUN if [ "${VIRTUAL_ENV}" = "venv" ] ; then source $HOME/venv/bin/activate; \
pip install recommenders[xlearn,examples]; fi
RUN if [ "${VIRTUAL_ENV}" = "conda" ] ; then pip install recommenders[xlearn,examples]; fi


###############
# PySpark Stage
###############
FROM base AS pyspark

# Install base dependencies
# Install Java version 8
RUN apt-get update && \
apt-get install -y libgomp1 openjdk-8-jre

# Install Spark
ARG SPARK="http://archive.apache.org/dist/spark/spark-2.3.1/spark-2.3.1-bin-hadoop2.7.tgz"
RUN mkdir spark && \
curl ${SPARK} -o spark.tgz && \
tar xzf spark.tgz --strip-components 1 -C spark && \
rm spark.tgz

# Setup Conda environment
RUN python recommenders/tools/generate_conda_file.py --name base --pyspark

ENV JAVA_HOME="/usr/lib/jvm/java-8-openjdk-amd64" \
PYSPARK_PYTHON="${HOME}/conda/bin/python" \
PYSPARK_DRIVER_PYTHON="${HOME}/conda/bin/python" \
SPARK_HOME="${HOME}/spark"
PYSPARK_PYTHON="${HOME}/${VIRTUAL_ENV}/bin/python" \
PYSPARK_DRIVER_PYTHON="${HOME}/${VIRTUAL_ENV}/bin/python"

# Install dependencies in Conda environment
RUN pip install recommenders[spark,xlearn,examples]


###########
# GPU Stage
FROM nvidia/cuda:9.0-base AS gpu
###########
FROM nvidia/cuda:10.0-cudnn7-runtime-ubuntu18.04 AS gpu

ARG HOME
ARG VIRTUAL_ENV
WORKDIR ${HOME}

# Get up to date with base
COPY --from=base ${HOME} .
RUN apt-get update && \
apt-get install -y build-essential cmake libpython3.6
ENV PATH="${HOME}/${VIRTUAL_ENV}/bin:${PATH}"

# Install dependencies in virtual environment
SHELL ["/bin/bash", "-c"]
RUN if [ "${VIRTUAL_ENV}" = "venv" ] ; then apt-get -y install python3.6; \
apt-get -y install python3-pip; \
apt-get -y install python3.6-venv; \
python3.6 -m venv --system-site-packages $HOME/venv; \
source $HOME/venv/bin/activate; \
pip install --upgrade pip; \
pip install --upgrade setuptools; \
pip install recommenders[gpu,xlearn,examples]; fi

# Setup Conda environment
ENV PATH="${HOME}/conda/bin:${PATH}"
RUN python recommenders/tools/generate_conda_file.py --name base --gpu
RUN if [ "${VIRTUAL_ENV}" = "conda" ] ; then \
pip install recommenders[gpu,xlearn,examples] -f https://download.pytorch.org/whl/cu100/torch_stable.html; fi


############
# Full Stage
############
FROM gpu AS full

ARG HOME
WORKDIR ${HOME}

COPY --from=pyspark ${HOME}/spark spark

# Setup Conda environment
RUN python recommenders/tools/generate_conda_file.py --name base --gpu --pyspark
# Install Java version 8
RUN apt-get update && \
apt-get install -y libgomp1 openjdk-8-jre

ENV JAVA_HOME="/usr/lib/jvm/java-8-openjdk-amd64" \
PYSPARK_PYTHON="${HOME}/conda/bin/python" \
PYSPARK_DRIVER_PYTHON="${HOME}/conda/bin/python" \
SPARK_HOME="${HOME}/spark"
PYSPARK_PYTHON="${HOME}/${VIRTUAL_ENV}/bin/python" \
PYSPARK_DRIVER_PYTHON="${HOME}/${VIRTUAL_ENV}/bin/python"

# Install dependencies in Conda environment
RUN pip install recommenders[all]


#############
# Final Stage
#############
FROM $ENV AS final

# Install XLearn dependencies
RUN apt-get update && \
apt-get install -y build-essential cmake
# Setup Jupyter notebook configuration
ENV NOTEBOOK_CONFIG="${HOME}/.jupyter/jupyter_notebook_config.py"
RUN mkdir ${HOME}/.jupyter && \
echo "c.NotebookApp.token = ''" >> ${NOTEBOOK_CONFIG} && \
echo "c.NotebookApp.ip = '0.0.0.0'" >> ${NOTEBOOK_CONFIG} && \
echo "c.NotebookApp.allow_root = True" >> ${NOTEBOOK_CONFIG} && \
echo "c.NotebookApp.open_browser = False" >> ${NOTEBOOK_CONFIG} && \
echo "c.MultiKernelManager.default_kernel_name = 'python3'" >> ${NOTEBOOK_CONFIG}

# Install Conda packages
RUN conda env update -f base.yaml && \
conda clean -fay && \
python -m ipykernel install --user --name 'python3' --display-name 'python3'
# Register the environment with Jupyter
RUN if [ ${VIRTUAL_ENV} = "conda" ]; then python -m ipykernel install --user --name base --display-name "Python (base)"; fi
RUN if [ ${VIRTUAL_ENV} = "venv" ]; then source $HOME/venv/bin/activate; \
python -m ipykernel install --user --name venv --display-name "Python (venv)"; fi

ARG HOME
WORKDIR ${HOME}/recommenders
WORKDIR ${HOME}

EXPOSE 8888
CMD ["jupyter", "notebook"]
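Note that the rewritten Dockerfile never runs `conda activate` or `source ${HOME}/venv/bin/activate` in the later stages; it relies on `ENV PATH="${HOME}/${VIRTUAL_ENV}/bin:${PATH}"` instead. A minimal sketch of why prepending the environment's `bin` directory is sufficient (`/tmp/demo_env` is a hypothetical path, not part of the image):

```shell
# Simulate an environment's bin directory containing a stub interpreter
mkdir -p /tmp/demo_env/bin
printf '#!/bin/sh\necho demo-env-python\n' > /tmp/demo_env/bin/python
chmod +x /tmp/demo_env/bin/python

# Prepending the bin directory is all the Dockerfile's ENV PATH line does:
# the environment's interpreter now shadows any system python
PATH="/tmp/demo_env/bin:${PATH}"
python   # prints: demo-env-python
```

Because `ENV` values persist into stages built `FROM base`, the `cpu` and `pyspark` stages inherit this `PATH` automatically; the GPU stage, which starts from the nvidia image, re-declares it.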
26 changes: 16 additions & 10 deletions tools/docker/README.md
@@ -17,11 +17,12 @@ Once the container is running you can access Jupyter notebooks at http://localhost:8888
Building and Running with Docker
--------------------------------

See examples below for the case of conda. If you use venv instead, replace `--build-arg VIRTUAL_ENV=conda` with `--build-arg VIRTUAL_ENV=venv`.
<details>
<summary><strong><em>CPU environment</em></strong></summary>

```
DOCKER_BUILDKIT=1 docker build -t recommenders:cpu --build-arg ENV="cpu" .
DOCKER_BUILDKIT=1 docker build -t recommenders:cpu --build-arg ENV=cpu --build-arg VIRTUAL_ENV=conda .
docker run -p 8888:8888 -d recommenders:cpu
```

@@ -31,7 +32,7 @@ docker run -p 8888:8888 -d recommenders:cpu
<summary><strong><em>PySpark environment</em></strong></summary>

```
DOCKER_BUILDKIT=1 docker build -t recommenders:pyspark --build-arg ENV="pyspark" .
DOCKER_BUILDKIT=1 docker build -t recommenders:pyspark --build-arg ENV=pyspark --build-arg VIRTUAL_ENV=conda .
docker run -p 8888:8888 -d recommenders:pyspark
```

@@ -41,7 +42,7 @@ docker run -p 8888:8888 -d recommenders:pyspark
<summary><strong><em>GPU environment</em></strong></summary>

```
DOCKER_BUILDKIT=1 docker build -t recommenders:gpu --build-arg ENV="gpu" .
DOCKER_BUILDKIT=1 docker build -t recommenders:gpu --build-arg ENV=gpu --build-arg VIRTUAL_ENV=conda .
docker run --runtime=nvidia -p 8888:8888 -d recommenders:gpu
```

@@ -51,7 +52,7 @@ docker run --runtime=nvidia -p 8888:8888 -d recommenders:gpu
<summary><strong><em>GPU + PySpark environment</em></strong></summary>

```
DOCKER_BUILDKIT=1 docker build -t recommenders:full --build-arg ENV="full" .
DOCKER_BUILDKIT=1 docker build -t recommenders:full --build-arg ENV=full --build-arg VIRTUAL_ENV=conda .
docker run --runtime=nvidia -p 8888:8888 -d recommenders:full
```

@@ -64,22 +65,27 @@ There are several build arguments which can change how the image is built. Simil

Build Arg|Description|
---------|-----------|
ENV|Environment to use, options: cpu, pyspark, gpu, full|
BRANCH|Git branch of the repo to use (defaults to `main`)
ENV|Environment to use, options: cpu, pyspark, gpu, full (defaults to cpu)|
VIRTUAL_ENV|Virtual environment to use; mandatory argument, must be one of "conda", "venv"|
ANACONDA|Anaconda installation script (defaults to miniconda3 4.6.14)|
SPARK|Spark installation tarball (defaults to Spark 2.3.1)|

Example using the staging branch:
Example:

```
DOCKER_BUILDKIT=1 docker build -t recommenders:cpu --build-arg ENV="cpu" --build-arg BRANCH="staging" .
DOCKER_BUILDKIT=1 docker build -t recommenders:cpu --build-arg ENV=cpu --build-arg VIRTUAL_ENV=conda .
```

To see detailed progress with BuildKit, pass the ```--progress=plain``` flag to the build command.

Running tests with docker
-------------------------

To run the tests using e.g. the CPU image, do the following:
```
docker run -it recommenders:cpu pytest tests/unit -m "not spark and not gpu and not notebooks"
docker run -it recommenders:cpu bash -c 'pip install pytest; \
pip install pytest-cov; \
apt-get install -y git; \
git clone https://github.com/microsoft/recommenders.git; \
cd recommenders; \
pytest tests/unit -m "not spark and not gpu and not notebooks"'
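As the build-arg table above notes, `VIRTUAL_ENV` is now a mandatory argument. The fail-fast guard the PR adds at the top of the Dockerfile is plain POSIX shell and can be tried outside Docker (`pipenv` is just an illustrative bad value):

```shell
VIRTUAL_ENV="pipenv"   # hypothetical unsupported value
if [ "${VIRTUAL_ENV}" != "conda" ] && [ "${VIRTUAL_ENV}" != "venv" ]; then
    echo 'VIRTUAL_ENV argument should be either "conda" or "venv"'
fi
```

In the Dockerfile the branch additionally runs `exit 1`, so a misspelled build arg aborts the build immediately instead of silently skipping every conditional install step later on.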
```