Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Docker build failed at 'Failed to build xlearn' #1121

Closed
r-yanyo opened this issue Jun 11, 2020 · 5 comments
Closed

[BUG] Docker build failed at 'Failed to build xlearn' #1121

r-yanyo opened this issue Jun 11, 2020 · 5 comments
Labels
bug Something isn't working

Comments

@r-yanyo
Copy link

r-yanyo commented Jun 11, 2020

Description

Docker build failed at 'Failed to build xlearn' in only GPU stage.

In which platform does it happen?

macOS: 10.15.4
docker: 19.03.8

I also tried to build in two debian server, and failed with same error.

How do we replicate the issue?

DOCKER_BUILDKIT=1 docker build -t recommenders:gpu --build-arg ENV="gpu" .
or
DOCKER_BUILDKIT=1 docker build -t recommenders:full --build-arg ENV="full" .

Other Comments

I issued this problem firstly in #1120 (comment)

$ DOCKER_BUILDKIT=1 docker build -t recommenders:gpu --build-arg ENV="gpu" . --no-cache=true
[+] Building 1353.8s (16/17)
 => [internal] load build definition from Dockerfile                                                      0.0s
 => => transferring dockerfile: 39B                                                                       0.0s
 => [internal] load .dockerignore                                                                         0.0s
 => => transferring context: 2B                                                                           0.0s
 => [internal] load metadata for docker.io/nvidia/cuda:9.0-base                                           1.7s
 => [internal] load metadata for docker.io/library/ubuntu:18.04                                           1.8s
 => [base 1/7] FROM docker.io/library/ubuntu:18.04@sha256:3235326357dfb65f1781dbc4df3b834546d8bf914e82cc  0.0s
 => [gpu 1/4] FROM docker.io/nvidia/cuda:9.0-base@sha256:56bfa4e0b6d923bf47a71c91b4e00b62ea251a04425598d  0.0s
 => CACHED [base 2/7] WORKDIR /root                                                                       0.0s
 => CACHED [gpu 2/4] WORKDIR /root                                                                        0.0s
 => [base 3/7] RUN apt-get update &&     apt-get install -y curl git wget build-essential               201.0s
 => [base 4/7] RUN wget https://github.com/Kitware/CMake/releases/download/v3.15.2/cmake-3.15.2.tar.gz  775.4s
 => [base 5/7] RUN curl https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -o anacon  10.0s
 => [base 6/7] RUN git clone --depth 1 --single-branch -b master https://github.com/microsoft/recommende  4.0s
 => [base 7/7] RUN mkdir /root/.jupyter &&     echo "c.NotebookApp.token = ''" >> /root/.jupyter/jupyter  0.7s
 => [gpu 3/4] COPY --from=base /root .                                                                    1.5s
 => [gpu 4/4] RUN python recommenders/scripts/generate_conda_file.py --name base --gpu                    0.6s
 => ERROR [final 1/2] RUN conda env update -f base.yaml &&     conda clean -fay &&     python -m ipyke  355.8s
------
 > [final 1/2] RUN conda env update -f base.yaml &&     conda clean -fay &&     python -m ipykernel install --user --name 'python3' --display-name 'python3':
#16 1.113 Collecting package metadata (repodata.json): ...working... done
#16 28.35 Solving environment: ...working... done
#16 122.7
#16 122.7 Downloading and Extracting Packages
nltk-3.4.5           | 1.7 MB    | ########## | 100%
plac-0.9.6           | 38 KB     | ########## | 100%
_pytorch_select-0.2  | 2 KB      | ########## | 100%

.............
**TOO LONG MESSAGE**
..............

256=1462193f8e28e57c59cf26e52acb94ae8f8b9480cea8c52fa591e239917050bc
#16 353.9   Stored in directory: /root/.cache/pip/wheels/c3/af/84/3962a6af7b4ab336e951b7877dcfb758cf94548bb1771e0679
#16 353.9   Building wheel for gast (setup.py): started
#16 353.9   Building wheel for gast (setup.py): finished with status 'done'
#16 353.9   Created wheel for gast: filename=gast-0.2.2-py3-none-any.whl size=7539 sha256=9dc42929648b7939f16315d477a1f5537ec13581f17cec57dd8e008ecbd83d8d
#16 353.9   Stored in directory: /root/.cache/pip/wheels/19/a7/b9/0740c7a3a7d1d348f04823339274b90de25fbcd217b2ee1fbe
#16 353.9   Building wheel for fusepy (setup.py): started
#16 353.9   Building wheel for fusepy (setup.py): finished with status 'done'
#16 353.9   Created wheel for fusepy: filename=fusepy-3.0.1-py3-none-any.whl size=10503 sha256=06d27d1182817ef6bf929d5853b38141a86f9bc2df01581c9427e4182e71005e
#16 353.9   Stored in directory: /root/.cache/pip/wheels/21/5c/83/1dd7e8a232d12227e5410120f4374b33adeb4037473105b079
#16 353.9   Building wheel for simplejson (setup.py): started
#16 353.9   Building wheel for simplejson (setup.py): finished with status 'done'
#16 353.9   Created wheel for simplejson: filename=simplejson-3.17.0-py3-none-any.whl size=55462 sha256=7d5a833313f073093114b197eb5fc7380ba1565614b4fd801b585df42eca0514
#16 353.9   Stored in directory: /root/.cache/pip/wheels/c2/95/a2/477b46cfe980061fddc4cb76ed4657eceec35d2b2090355e41
#16 353.9 Successfully built memory-profiler nvidia-ml-py3 sacremoses wrapt termcolor absl-py gast fusepy simplejson
#16 353.9 Failed to build xlearn
#16 353.9 Installing collected packages: azureml-dataprep-native, distro, dotnetcore2, fusepy, azureml-dataprep, oauthlib, requests-oauthlib, isodate, msrest, PyJWT, adal, msrestazure, azureml-train-restclients-hyperdrive, applicationinsights, azureml-telemetry, azure-common, contextlib2, azure-graphrbac, pathspec, jsonpickle, jmespath, azure-mgmt-storage, jeepney, SecretStorage, azure-mgmt-authorization, azure-mgmt-resource, azure-mgmt-containerregistry, ruamel.yaml.clib, ruamel.yaml, websocket-client, docker, azure-mgmt-keyvault, pyasn1, ndg-httpsclient, azureml-core, azureml-train-core, azureml-train, azureml-pipeline-core, azureml-pipeline-steps, azureml-pipeline, azureml-contrib-notebook, azureml-widgets, azureml-tensorboard, azureml-sdk, azure-core, azure-storage-blob, msal, colorama, bcrypt, pynacl, paramiko, portalocker, msal-extensions, pkginfo, azure-nspkg, azure-cli-nspkg, azure-cli-telemetry, humanfriendly, azure-mgmt-core, argcomplete, tabulate, knack, azure-cli-core, azure-mgmt-cosmosdb, regex, typed-ast, appdirs, toml, black, patsy, statsmodels, category-encoders, pymongo, networkx, hyperopt, idna, zope.event, greenlet, zope.interface, gevent, Werkzeug, itsdangerous, flask, locustio, memory-profiler, nbconvert, pydocumentdb, pymanopt, xlearn, docutils, botocore, s3transfer, boto3, sentencepiece, sacremoses, tokenizers, filelock, transformers, grpcio, absl-py, markdown, protobuf, tensorboard, tensorflow-estimator, wrapt, keras-preprocessing, google-pasta, opt-einsum, termcolor, h5py, keras-applications, astor, gast, tensorflow-gpu, nvidia-ml-py3, schema, simplejson, PythonWebHDFS, coverage, json-tricks, nni
#16 353.9   Attempting uninstall: idna
#16 353.9     Found existing installation: idna 2.9
#16 353.9     Uninstalling idna-2.9:
#16 353.9       Successfully uninstalled idna-2.9
#16 353.9   Attempting uninstall: nbconvert
#16 353.9     Found existing installation: nbconvert 5.6.1
#16 353.9     Uninstalling nbconvert-5.6.1:
#16 353.9       Successfully uninstalled nbconvert-5.6.1
#16 353.9     Running setup.py install for xlearn: started
#16 353.9     Running setup.py install for xlearn: finished with status 'error'
#16 353.9
#16 353.9 Pip subprocess error:
#16 353.9   ERROR: Command errored out with exit status 1:
#16 353.9    command: /root/conda/bin/python -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-1ko158_p/xlearn/setup.py'"'"'; __file__='"'"'/tmp/pip-install-1ko158_p/xlearn/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' bdist_wheel -d /tmp/pip-wheel-_39dsaf8
#16 353.9        cwd: /tmp/pip-install-1ko158_p/xlearn/
#16 353.9   Complete output (51 lines):
#16 353.9   /root/conda/lib/python3.6/site-packages/setuptools/dist.py:454: UserWarning: Normalizing '0.40.a1' to '0.40a1'
#16 353.9     warnings.warn(tmpl.format(**locals()))
#16 353.9   running bdist_wheel
#16 353.9   running build
#16 353.9   running build_py
#16 353.9   Traceback (most recent call last):
#16 353.9     File "/tmp/pip-install-1ko158_p/xlearn/setup.py", line 37, in silent_call
#16 353.9       subprocess.check_output(cmd, stderr=shut_up)
#16 353.9     File "/root/conda/lib/python3.6/subprocess.py", line 356, in check_output
#16 353.9       **kwargs).stdout
#16 353.9     File "/root/conda/lib/python3.6/subprocess.py", line 423, in run
#16 353.9       with Popen(*popenargs, **kwargs) as process:
#16 353.9     File "/root/conda/lib/python3.6/subprocess.py", line 729, in __init__
#16 353.9       restore_signals, start_new_session)
#16 353.9     File "/root/conda/lib/python3.6/subprocess.py", line 1364, in _execute_child
#16 353.9       raise child_exception_type(errno_num, err_msg, err_filename)
#16 353.9   FileNotFoundError: [Errno 2] No such file or directory: 'cmake': 'cmake'
#16 353.9
#16 353.9   During handling of the above exception, another exception occurred:
#16 353.9
#16 353.9   Traceback (most recent call last):
#16 353.9     File "<string>", line 1, in <module>
#16 353.9     File "/tmp/pip-install-1ko158_p/xlearn/setup.py", line 126, in <module>
#16 353.9       url='https://github.com/aksnzhy/xlearn')
#16 353.9     File "/root/conda/lib/python3.6/site-packages/setuptools/__init__.py", line 161, in setup
#16 353.9       return distutils.core.setup(**attrs)
#16 353.9     File "/root/conda/lib/python3.6/distutils/core.py", line 148, in setup
#16 353.9       dist.run_commands()
#16 353.9     File "/root/conda/lib/python3.6/distutils/dist.py", line 955, in run_commands
#16 353.9       self.run_command(cmd)
#16 353.9     File "/root/conda/lib/python3.6/distutils/dist.py", line 974, in run_command
#16 353.9       cmd_obj.run()
#16 353.9     File "/root/conda/lib/python3.6/site-packages/wheel/bdist_wheel.py", line 223, in run
#16 353.9       self.run_command('build')
#16 353.9     File "/root/conda/lib/python3.6/distutils/cmd.py", line 313, in run_command
#16 353.9       self.distribution.run_command(command)
#16 353.9     File "/root/conda/lib/python3.6/distutils/dist.py", line 974, in run_command
#16 353.9       cmd_obj.run()
#16 353.9     File "/root/conda/lib/python3.6/distutils/command/build.py", line 135, in run
#16 353.9       self.run_command(cmd_name)
#16 353.9     File "/root/conda/lib/python3.6/distutils/cmd.py", line 313, in run_command
#16 353.9       self.distribution.run_command(command)
#16 353.9     File "/root/conda/lib/python3.6/distutils/dist.py", line 974, in run_command
#16 353.9       cmd_obj.run()
#16 353.9     File "/tmp/pip-install-1ko158_p/xlearn/setup.py", line 104, in run
#16 353.9       compile_cpp()
#16 353.9     File "/tmp/pip-install-1ko158_p/xlearn/setup.py", line 75, in compile_cpp
#16 353.9       silent_call(cmake_cmd, raise_error=True, error_msg='Please install CMake first')
#16 353.9     File "/tmp/pip-install-1ko158_p/xlearn/setup.py", line 41, in silent_call
#16 353.9       raise Exception(error_msg);
#16 353.9   Exception: Please install CMake first
#16 353.9   ----------------------------------------
#16 353.9   ERROR: Failed building wheel for xlearn
#16 353.9 ERROR: azureml-core 1.0.69 has requirement ruamel.yaml<=0.15.89,>=0.15.35, but you'll have ruamel-yaml 0.16.10 which is incompatible.
#16 353.9 ERROR: nni 1.5 has requirement hyperopt==0.1.2, but you'll have hyperopt 0.1.1 which is incompatible.
#16 353.9 ERROR: nni 1.5 has requirement scikit-learn<0.22,>=0.20, but you'll have scikit-learn 0.22.1 which is incompatible.
#16 353.9     ERROR: Command errored out with exit status 1:
#16 353.9      command: /root/conda/bin/python -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-1ko158_p/xlearn/setup.py'"'"'; __file__='"'"'/tmp/pip-install-1ko158_p/xlearn/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /tmp/pip-record-bpbpxvcm/install-record.txt --single-version-externally-managed --compile --install-headers /root/conda/include/python3.6m/xlearn
#16 353.9          cwd: /tmp/pip-install-1ko158_p/xlearn/
#16 353.9     Complete output (55 lines):
#16 353.9     /root/conda/lib/python3.6/site-packages/setuptools/dist.py:454: UserWarning: Normalizing '0.40.a1' to '0.40a1'
#16 353.9       warnings.warn(tmpl.format(**locals()))
#16 353.9     running install
#16 353.9     running build
#16 353.9     running build_py
#16 353.9     Traceback (most recent call last):
#16 353.9       File "/tmp/pip-install-1ko158_p/xlearn/setup.py", line 37, in silent_call
#16 353.9         subprocess.check_output(cmd, stderr=shut_up)
#16 353.9       File "/root/conda/lib/python3.6/subprocess.py", line 356, in check_output
#16 353.9         **kwargs).stdout
#16 353.9       File "/root/conda/lib/python3.6/subprocess.py", line 423, in run
#16 353.9         with Popen(*popenargs, **kwargs) as process:
#16 353.9       File "/root/conda/lib/python3.6/subprocess.py", line 729, in __init__
#16 353.9         restore_signals, start_new_session)
#16 353.9       File "/root/conda/lib/python3.6/subprocess.py", line 1364, in _execute_child
#16 353.9         raise child_exception_type(errno_num, err_msg, err_filename)
#16 353.9     FileNotFoundError: [Errno 2] No such file or directory: 'cmake': 'cmake'
#16 353.9
#16 353.9     During handling of the above exception, another exception occurred:
#16 353.9
#16 353.9     Traceback (most recent call last):
#16 353.9       File "<string>", line 1, in <module>
#16 353.9       File "/tmp/pip-install-1ko158_p/xlearn/setup.py", line 126, in <module>
#16 353.9         url='https://github.com/aksnzhy/xlearn')
#16 353.9       File "/root/conda/lib/python3.6/site-packages/setuptools/__init__.py", line 161, in setup
#16 353.9         return distutils.core.setup(**attrs)
#16 353.9       File "/root/conda/lib/python3.6/distutils/core.py", line 148, in setup
#16 353.9         dist.run_commands()
#16 353.9       File "/root/conda/lib/python3.6/distutils/dist.py", line 955, in run_commands
#16 353.9         self.run_command(cmd)
#16 353.9       File "/root/conda/lib/python3.6/distutils/dist.py", line 974, in run_command
#16 353.9         cmd_obj.run()
#16 353.9       File "/tmp/pip-install-1ko158_p/xlearn/setup.py", line 93, in run
#16 353.9         install.run(self)
#16 353.9       File "/root/conda/lib/python3.6/site-packages/setuptools/command/install.py", line 61, in run
#16 353.9         return orig.install.run(self)
#16 353.9       File "/root/conda/lib/python3.6/distutils/command/install.py", line 545, in run
#16 353.9         self.run_command('build')
#16 353.9       File "/root/conda/lib/python3.6/distutils/cmd.py", line 313, in run_command
#16 353.9         self.distribution.run_command(command)
#16 353.9       File "/root/conda/lib/python3.6/distutils/dist.py", line 974, in run_command
#16 353.9         cmd_obj.run()
#16 353.9       File "/root/conda/lib/python3.6/distutils/command/build.py", line 135, in run
#16 353.9         self.run_command(cmd_name)
#16 353.9       File "/root/conda/lib/python3.6/distutils/cmd.py", line 313, in run_command
#16 353.9         self.distribution.run_command(command)
#16 353.9       File "/root/conda/lib/python3.6/distutils/dist.py", line 974, in run_command
#16 353.9         cmd_obj.run()
#16 353.9       File "/tmp/pip-install-1ko158_p/xlearn/setup.py", line 104, in run
#16 353.9         compile_cpp()
#16 353.9       File "/tmp/pip-install-1ko158_p/xlearn/setup.py", line 75, in compile_cpp
#16 353.9         silent_call(cmake_cmd, raise_error=True, error_msg='Please install CMake first')
#16 353.9       File "/tmp/pip-install-1ko158_p/xlearn/setup.py", line 41, in silent_call
#16 353.9         raise Exception(error_msg);
#16 353.9     Exception: Please install CMake first
#16 353.9     ----------------------------------------
#16 353.9 ERROR: Command errored out with exit status 1: /root/conda/bin/python -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-1ko158_p/xlearn/setup.py'"'"'; __file__='"'"'/tmp/pip-install-1ko158_p/xlearn/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /tmp/pip-record-bpbpxvcm/install-record.txt --single-version-externally-managed --compile --install-headers /root/conda/include/python3.6m/xlearn Check the logs for full command output.
#16 353.9
#16 353.9
#16 353.9 CondaEnvException: Pip failed
#16 353.9
------
failed to solve with frontend dockerfile.v0: failed to build LLB: executor failed running [/bin/sh -c conda env update -f base.yaml &&     conda clean -fay &&     python -m ipykernel install --user --name 'python3' --display-name 'python3']: runc did not terminate sucessfully
@r-yanyo r-yanyo added the bug Something isn't working label Jun 11, 2020
@r-yanyo
Copy link
Author

r-yanyo commented Jun 11, 2020

@gramhagen asked me to tag @yueguoguo.

@gramhagen
Copy link
Collaborator

might be related to #1143

@gramhagen
Copy link
Collaborator

ok, dug in a bit further, since gpu is pulling from a separate image it isn't getting the cmake installation from the base section of the dockerfile. I'm pushing cmake installation down to the final stage and just installing from apt-get.

@yueguoguo is there any issue using the version in aptitude (3.10.2) versus what was previously here (3.15.2)?

@yueguoguo
Copy link
Collaborator

@gramhagen I have not tried aptitude (3.10.2) for building the image. Does it still break the building by using the old version?

@gramhagen
Copy link
Collaborator

it builds / installs fine and I was able to import xlearn afterwards, so I believe it is working fine

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

4 participants