diff --git a/CODEOWNERS b/CODEOWNERS
new file mode 100644
index 0000000000..43d8c57893
--- /dev/null
+++ b/CODEOWNERS
@@ -0,0 +1,12 @@
+# Watchers and contributors to DMLC GluonNLP repo directories/packages/files
+# Please see documentation of use of CODEOWNERS file at
+# https://help.github.com/articles/about-codeowners/ and
+# https://github.com/blog/2392-introducing-code-owners
+#
+# Anybody can add themselves or a team as additional watcher or contributor
+# to get notified about changes in a specific package.
+# See https://help.github.com/articles/about-teams how to setup teams.
+
+
+# Global owners
+* @dmlc/gluon-nlp-committers @dmlc/gluon-nlp-reviewers
diff --git a/CODE_OF_CONDUCT.md b/CODE_OF_CONDUCT.md
new file mode 100644
index 0000000000..81b284a9ef
--- /dev/null
+++ b/CODE_OF_CONDUCT.md
@@ -0,0 +1,77 @@
+# Contributor Covenant Code of Conduct
+
+## Our Pledge
+
+In the interest of fostering an open and welcoming environment, we as
+contributors and maintainers pledge to making participation in our project and
+our community a harassment-free experience for everyone, regardless of age, body
+size, disability, ethnicity, sex characteristics, gender identity and expression,
+level of experience, education, socio-economic status, nationality, personal
+appearance, race, religion, or sexual identity and orientation.
+
+## Our Standards
+
+Examples of behavior that contributes to creating a positive environment
+include:
+
+* Using welcoming and inclusive language
+* Being respectful of differing viewpoints and experiences
+* Gracefully accepting constructive criticism
+* Focusing on what is best for the community
+* Showing empathy towards other community members
+
+Examples of unacceptable behavior by participants include:
+
+* The use of sexualized language or imagery and unwelcome sexual attention or
+  advances
+* Trolling, insulting/derogatory comments, and personal or political attacks
+* Public or private harassment
+* Publishing others' private information, such as a physical or electronic
+  address, without explicit permission
+* Other conduct which could reasonably be considered inappropriate in a
+  professional setting
+
+## Our Responsibilities
+
+Project maintainers are responsible for clarifying the standards of acceptable
+behavior and are expected to take appropriate and fair corrective action in
+response to any instances of unacceptable behavior.
+
+Project maintainers have the right and responsibility to remove, edit, or
+reject comments, commits, code, wiki edits, issues, and other contributions
+that are not aligned to this Code of Conduct, or to ban temporarily or
+permanently any contributor for other behaviors that they deem inappropriate,
+threatening, offensive, or harmful.
+
+## Scope
+
+This Code of Conduct applies both within project spaces and in public spaces
+when an individual is representing the project or its community. Examples of
+representing a project or community include using an official project e-mail
+address, posting via an official social media account, or acting as an appointed
+representative at an online or offline event. Representation of a project may be
+further defined and clarified by project maintainers.
+
+## Enforcement
+
+Instances of abusive, harassing, or otherwise unacceptable behavior may be
+reported by contacting the project team in GitHub issues/pull requests
+by mentioning @dmlc/gluon-nlp-committers. All
+complaints will be reviewed and investigated and will result in a response that
+is deemed necessary and appropriate to the circumstances. The project team is
+obligated to maintain confidentiality with regard to the reporter of an incident.
+Further details of specific enforcement policies may be posted separately.
+
+Project maintainers who do not follow or enforce the Code of Conduct in good
+faith may face temporary or permanent repercussions as determined by other
+members of the project's leadership.
+
+## Attribution
+
+This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4,
+available at https://www.contributor-covenant.org/version/1/4/code-of-conduct.html
+
+[homepage]: https://www.contributor-covenant.org
+
+For answers to common questions about this code of conduct, see
+https://www.contributor-covenant.org/faq
diff --git a/LICENSE b/LICENSE
new file mode 100644
index 0000000000..261eeb9e9f
--- /dev/null
+++ b/LICENSE
@@ -0,0 +1,201 @@
+                                 Apache License
+                           Version 2.0, January 2004
+                        http://www.apache.org/licenses/
+
+   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
+
+   1. Definitions.
+
+      "License" shall mean the terms and conditions for use, reproduction,
+      and distribution as defined by Sections 1 through 9 of this document.
+
+      "Licensor" shall mean the copyright owner or entity authorized by
+      the copyright owner that is granting the License.
+
+      "Legal Entity" shall mean the union of the acting entity and all
+      other entities that control, are controlled by, or are under common
+      control with that entity. For the purposes of this definition,
+      "control" means (i) the power, direct or indirect, to cause the
+      direction or management of such entity, whether by contract or
+      otherwise, or (ii) ownership of fifty percent (50%) or more of the
+      outstanding shares, or (iii) beneficial ownership of such entity.
+
+      "You" (or "Your") shall mean an individual or Legal Entity
+      exercising permissions granted by this License.
+
+      "Source" form shall mean the preferred form for making modifications,
+      including but not limited to software source code, documentation
+      source, and configuration files.
+
+      "Object" form shall mean any form resulting from mechanical
+      transformation or translation of a Source form, including but
+      not limited to compiled object code, generated documentation,
+      and conversions to other media types.
+
+      "Work" shall mean the work of authorship, whether in Source or
+      Object form, made available under the License, as indicated by a
+      copyright notice that is included in or attached to the work
+      (an example is provided in the Appendix below).
+
+      "Derivative Works" shall mean any work, whether in Source or Object
+      form, that is based on (or derived from) the Work and for which the
+      editorial revisions, annotations, elaborations, or other modifications
+      represent, as a whole, an original work of authorship. For the purposes
+      of this License, Derivative Works shall not include works that remain
+      separable from, or merely link (or bind by name) to the interfaces of,
+      the Work and Derivative Works thereof.
+
+      "Contribution" shall mean any work of authorship, including
+      the original version of the Work and any modifications or additions
+      to that Work or Derivative Works thereof, that is intentionally
+      submitted to Licensor for inclusion in the Work by the copyright owner
+      or by an individual or Legal Entity authorized to submit on behalf of
+      the copyright owner. For the purposes of this definition, "submitted"
+      means any form of electronic, verbal, or written communication sent
+      to the Licensor or its representatives, including but not limited to
+      communication on electronic mailing lists, source code control systems,
+      and issue tracking systems that are managed by, or on behalf of, the
+      Licensor for the purpose of discussing and improving the Work, but
+      excluding communication that is conspicuously marked or otherwise
+      designated in writing by the copyright owner as "Not a Contribution."
+
+      "Contributor" shall mean Licensor and any individual or Legal Entity
+      on behalf of whom a Contribution has been received by Licensor and
+      subsequently incorporated within the Work.
+
+   2. Grant of Copyright License. Subject to the terms and conditions of
+      this License, each Contributor hereby grants to You a perpetual,
+      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+      copyright license to reproduce, prepare Derivative Works of,
+      publicly display, publicly perform, sublicense, and distribute the
+      Work and such Derivative Works in Source or Object form.
+
+   3. Grant of Patent License. Subject to the terms and conditions of
+      this License, each Contributor hereby grants to You a perpetual,
+      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+      (except as stated in this section) patent license to make, have made,
+      use, offer to sell, sell, import, and otherwise transfer the Work,
+      where such license applies only to those patent claims licensable
+      by such Contributor that are necessarily infringed by their
+      Contribution(s) alone or by combination of their Contribution(s)
+      with the Work to which such Contribution(s) was submitted. If You
+      institute patent litigation against any entity (including a
+      cross-claim or counterclaim in a lawsuit) alleging that the Work
+      or a Contribution incorporated within the Work constitutes direct
+      or contributory patent infringement, then any patent licenses
+      granted to You under this License for that Work shall terminate
+      as of the date such litigation is filed.
+
+   4. Redistribution. You may reproduce and distribute copies of the
+      Work or Derivative Works thereof in any medium, with or without
+      modifications, and in Source or Object form, provided that You
+      meet the following conditions:
+
+      (a) You must give any other recipients of the Work or
+          Derivative Works a copy of this License; and
+
+      (b) You must cause any modified files to carry prominent notices
+          stating that You changed the files; and
+
+      (c) You must retain, in the Source form of any Derivative Works
+          that You distribute, all copyright, patent, trademark, and
+          attribution notices from the Source form of the Work,
+          excluding those notices that do not pertain to any part of
+          the Derivative Works; and
+
+      (d) If the Work includes a "NOTICE" text file as part of its
+          distribution, then any Derivative Works that You distribute must
+          include a readable copy of the attribution notices contained
+          within such NOTICE file, excluding those notices that do not
+          pertain to any part of the Derivative Works, in at least one
+          of the following places: within a NOTICE text file distributed
+          as part of the Derivative Works; within the Source form or
+          documentation, if provided along with the Derivative Works; or,
+          within a display generated by the Derivative Works, if and
+          wherever such third-party notices normally appear. The contents
+          of the NOTICE file are for informational purposes only and
+          do not modify the License. You may add Your own attribution
+          notices within Derivative Works that You distribute, alongside
+          or as an addendum to the NOTICE text from the Work, provided
+          that such additional attribution notices cannot be construed
+          as modifying the License.
+
+      You may add Your own copyright statement to Your modifications and
+      may provide additional or different license terms and conditions
+      for use, reproduction, or distribution of Your modifications, or
+      for any such Derivative Works as a whole, provided Your use,
+      reproduction, and distribution of the Work otherwise complies with
+      the conditions stated in this License.
+
+   5. Submission of Contributions. Unless You explicitly state otherwise,
+      any Contribution intentionally submitted for inclusion in the Work
+      by You to the Licensor shall be under the terms and conditions of
+      this License, without any additional terms or conditions.
+      Notwithstanding the above, nothing herein shall supersede or modify
+      the terms of any separate license agreement you may have executed
+      with Licensor regarding such Contributions.
+
+   6. Trademarks. This License does not grant permission to use the trade
+      names, trademarks, service marks, or product names of the Licensor,
+      except as required for reasonable and customary use in describing the
+      origin of the Work and reproducing the content of the NOTICE file.
+
+   7. Disclaimer of Warranty. Unless required by applicable law or
+      agreed to in writing, Licensor provides the Work (and each
+      Contributor provides its Contributions) on an "AS IS" BASIS,
+      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+      implied, including, without limitation, any warranties or conditions
+      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
+      PARTICULAR PURPOSE. You are solely responsible for determining the
+      appropriateness of using or redistributing the Work and assume any
+      risks associated with Your exercise of permissions under this License.
+
+   8. Limitation of Liability. In no event and under no legal theory,
+      whether in tort (including negligence), contract, or otherwise,
+      unless required by applicable law (such as deliberate and grossly
+      negligent acts) or agreed to in writing, shall any Contributor be
+      liable to You for damages, including any direct, indirect, special,
+      incidental, or consequential damages of any character arising as a
+      result of this License or out of the use or inability to use the
+      Work (including but not limited to damages for loss of goodwill,
+      work stoppage, computer failure or malfunction, or any and all
+      other commercial damages or losses), even if such Contributor
+      has been advised of the possibility of such damages.
+
+   9. Accepting Warranty or Additional Liability. While redistributing
+      the Work or Derivative Works thereof, You may choose to offer,
+      and charge a fee for, acceptance of support, warranty, indemnity,
+      or other liability obligations and/or rights consistent with this
+      License. However, in accepting such obligations, You may act only
+      on Your own behalf and on Your sole responsibility, not on behalf
+      of any other Contributor, and only if You agree to indemnify,
+      defend, and hold each Contributor harmless for any liability
+      incurred by, or claims asserted against, such Contributor by reason
+      of your accepting any such warranty or additional liability.
+
+   END OF TERMS AND CONDITIONS
+
+   APPENDIX: How to apply the Apache License to your work.
+
+      To apply the Apache License to your work, attach the following
+      boilerplate notice, with the fields enclosed by brackets "[]"
+      replaced with your own identifying information. (Don't include
+      the brackets!)  The text should be enclosed in the appropriate
+      comment syntax for the file format. We also recommend that a
+      file or class name and description of purpose be included on the
+      same "printed page" as the copyright notice for easier
+      identification within third-party archives.
+
+   Copyright [yyyy] [name of copyright owner]
+
+   Licensed under the Apache License, Version 2.0 (the "License");
+   you may not use this file except in compliance with the License.
+   You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
diff --git a/README.md b/README.md
index 2b1aae9fc8..768495f9f6 100644
--- a/README.md
+++ b/README.md
@@ -72,3 +72,14 @@ python3 -m gluonnlp.cli.preprocess help
 
 # Run Unittests
 You may go to [tests](tests) to see all how to run the unittests.
+
+
+# Use Docker
+You can use Docker to launch a JupyterLab development environment with GluonNLP installed.
+
+```
+docker pull gluonai/gluon-nlp:v1.0.0
+docker run --gpus all --rm -it -p 8888:8888 -p 8787:8787 -p 8786:8786 gluonai/gluon-nlp:v1.0.0
+```
+
+For more details, see the guide in [tools/docker](tools/docker).
diff --git a/src/gluonnlp/__init__.py b/src/gluonnlp/__init__.py
index 31e7e08557..a11355a405 100644
--- a/src/gluonnlp/__init__.py
+++ b/src/gluonnlp/__init__.py
@@ -1,4 +1,4 @@
-__version__ = '1.0.0.dev0'
+__version__ = '1.0.0'
 from . import base
 from . import data
 from . import models
diff --git a/tests/test_models_bert.py b/tests/test_models_bert.py
index a0d9a8d742..f2a2ffdfc1 100644
--- a/tests/test_models_bert.py
+++ b/tests/test_models_bert.py
@@ -16,7 +16,7 @@ def test_bert_small_cfg(compute_layout):
     cfg = BertModel.get_cfg()
     cfg.defrost()
     cfg.MODEL.vocab_size = 100
-    cfg.MODEL.units = 12 * 8
+    cfg.MODEL.units = 12 * 4
     cfg.MODEL.hidden_size = 64
     cfg.MODEL.num_layers = 2
     cfg.MODEL.num_heads = 2
@@ -31,7 +31,7 @@ def test_bert_small_cfg(compute_layout):
 
     # Sample data
     batch_size = 4
-    sequence_length = 16
+    sequence_length = 8
     num_mask = 3
     inputs = mx.np.random.randint(0, 10, (batch_size, sequence_length))
     token_types = mx.np.random.randint(0, 2, (batch_size, sequence_length))
diff --git a/tools/batch/README.md b/tools/batch/README.md
new file mode 100644
index 0000000000..716bdeb55d
--- /dev/null
+++ b/tools/batch/README.md
@@ -0,0 +1,13 @@
+# Launch AWS Jobs
+GluonNLP contributors can launch jobs via AWS Batch.
+Once you've correctly configured the AWS CLI, you may use the following command:
+
+```
+python3 submit-job.py \
+--region us-east-1 \
+--job-type p3.2x \
+--work-dir tools/batch \
+--remote https://github.com/dmlc/gluon-nlp \
+--command "python3 hello_world.py" \
+--wait
+```
diff --git a/tools/batch/docker/Dockerfile b/tools/batch/docker/Dockerfile
index a9ef4aaad4..c0b8592ca7 100644
--- a/tools/batch/docker/Dockerfile
+++ b/tools/batch/docker/Dockerfile
@@ -1,27 +1,27 @@
 FROM nvidia/cuda:10.2-cudnn7-devel-ubuntu18.04
 
- RUN apt-get update && apt-get install -y --no-install-recommends \
-     build-essential \
-     locales \
-     cmake \
-     wget \
-     subversion \
-     git \
-     curl \
-     vim \
-     unzip \
-     sudo \
-     ca-certificates \
-     libjpeg-dev \
-     libpng-dev \
-     libfreetype6-dev \
-     python3-dev \
-     python3-pip \
-     python3-setuptools \
-     libxft-dev &&\
-     rm -rf /var/lib/apt/lists/*
+RUN apt-get update && apt-get install -y --no-install-recommends \
+    build-essential \
+    locales \
+    cmake \
+    wget \
+    subversion \
+    git \
+    curl \
+    vim \
+    unzip \
+    sudo \
+    ca-certificates \
+    libjpeg-dev \
+    libpng-dev \
+    libfreetype6-dev \
+    python3-dev \
+    python3-pip \
+    python3-setuptools \
+    libxft-dev &&\
+    rm -rf /var/lib/apt/lists/*
 
- RUN pip3 install --upgrade pip && pip3 install awscli && pip3 install --pre 'mxnet-cu102' -f https://dist.mxnet.io/python
- RUN git clone https://github.com/dmlc/gluon-nlp
- WORKDIR gluon-nlp
- ADD gluon_nlp_job.sh .
+RUN pip3 install --upgrade pip && pip3 install awscli && pip3 install --pre 'mxnet-cu102' -f https://dist.mxnet.io/python
+RUN git clone https://github.com/dmlc/gluon-nlp
+WORKDIR gluon-nlp
+ADD gluon_nlp_job.sh .
diff --git a/tools/batch/hello_world.py b/tools/batch/hello_world.py
new file mode 100644
index 0000000000..f84e06d6b9
--- /dev/null
+++ b/tools/batch/hello_world.py
@@ -0,0 +1,10 @@
+from gluonnlp.data.vocab import Vocab
+import mxnet as mx
+
+
+if __name__ == '__main__':
+    vocab = Vocab(['Hello', 'World!'], unk_token=None)
+    print(vocab)
+    num_gpus = mx.context.num_gpus()
+    print('Number of GPUS:', num_gpus)
+
diff --git a/tools/batch/submit-job.py b/tools/batch/submit-job.py
index 87a447d1e1..f11fd50b64 100644
--- a/tools/batch/submit-job.py
+++ b/tools/batch/submit-job.py
@@ -20,8 +20,8 @@
                              'p3.2x', 'p3.8x', 'p3.16x', 'p3dn.24x', 'c5n.18x'],
                     default='g4dn.4x')
 parser.add_argument('--source-ref',
-                    help='ref in GluonNLP main github. e.g. numpy, refs/pull/500/head',
-                    type=str, default='numpy')
+                    help='ref in GluonNLP main github. e.g. master, refs/pull/500/head',
+                    type=str, default='master')
 parser.add_argument('--work-dir',
                     help='working directory inside the repo. e.g. scripts/preprocess',
                     type=str, default='scripts/preprocess')
@@ -47,6 +47,7 @@
 session = boto3.Session(profile_name=args.profile, region_name=args.region)
 batch, cloudwatch = [session.client(service_name=sn) for sn in ['batch', 'logs']]
 
+
 def printLogs(logGroupName, logStreamName, startTime):
     kwargs = {'logGroupName': logGroupName,
               'logStreamName': logStreamName,
@@ -70,47 +71,48 @@ def printLogs(logGroupName, logStreamName, startTime):
     return lastTimestamp
 
 
-def getLogStream(logGroupName, jobName, jobId):
-    response = cloudwatch.describe_log_streams(
-        logGroupName=logGroupName,
-        logStreamNamePrefix=jobName + '/' + jobId
-    )
-    logStreams = response['logStreams']
-    if not logStreams:
-        return ''
-    else:
-        return logStreams[0]['logStreamName']
-
 def nowInMillis():
     endTime = long(total_seconds(datetime.utcnow() - datetime(1970, 1, 1))) * 1000
     return endTime
 
+
 job_definitions = {
     'g4dn.4x': 'gluon-nlp-1-jobs:5',
     'g4dn.8x': 'gluon-nlp-1-jobs:4',
     'g4dn.12x': 'gluon-nlp-1-4gpu-jobs:1',
     'g4dn.16x': 'gluon-nlp-1-jobs:3',
-    'p3.2x': 'gluon-nlp-1-jobs:5',
+    'p3.2x': 'gluon-nlp-1-jobs:11',
     'p3.8x': 'gluon-nlp-1-4gpu-jobs:2',
     'p3.16x': 'gluon-nlp-1-8gpu-jobs:1',
     'p3dn.24x': 'gluon-nlp-1-8gpu-jobs:2',
     'c5n.18x': 'gluon-nlp-1-cpu-jobs:2',
 }
 
 
+job_queues = {
+    'g4dn.4x': 'g4dn',
+    'g4dn.8x': 'g4dn',
+    'g4dn.12x': 'g4dn-multi-gpu',
+    'g4dn.16x': 'g4dn',
+    'p3.2x': 'p3',
+    'p3.8x': 'p3-4gpu',
+    'p3.16x': 'p3-8gpu',
+    'p3dn.24x': 'p3dn-8gpu',
+    'c5n.18x': 'c5n',
+}
+
+
 def main():
     spin = ['-', '/', '|', '\\', '-', '/', '|', '\\']
     logGroupName = '/aws/batch/job'
     jobName = re.sub('[^A-Za-z0-9_\-]', '', args.name)[:128]  # Enforce AWS Batch jobName rules
     jobType = args.job_type
-    jobQueue = jobType.split('.')[0]
-    if jobQueue == 'p3dn':
-        jobQueue = 'p3'
+    jobQueue = job_queues[jobType]
     jobDefinition = job_definitions[jobType]
     command = args.command.split()
     wait = args.wait
 
-    parameters={
+    parameters = {
         'SOURCE_REF': args.source_ref,
         'WORK_DIR': args.work_dir,
         'SAVED_OUTPUT': args.saved_output,
@@ -135,7 +137,6 @@ def main():
     running = False
     status_set = set()
     startTime = 0
-
     while wait:
         time.sleep(random.randint(5, 10))
         describeJobsResponse = batch.describe_jobs(jobs=[jobId])
@@ -147,10 +148,10 @@ def main():
             sys.exit(status == 'FAILED')
 
         elif status == 'RUNNING':
-            logStreamName = getLogStream(logGroupName, jobName, jobId)
+            logStreamName = describeJobsResponse['jobs'][0]['container']['logStreamName']
             if not running:
                 running = True
-                print('\rJob [{} - {}] is RUNNING.'.format(jobName, jobId))
+                print('\rJob [{}, {}] is RUNNING.'.format(jobName, jobId))
                 if logStreamName:
                     print('Output [{}]:\n {}'.format(logStreamName, '=' * 80))
             if logStreamName:
@@ -161,5 +162,6 @@ def main():
             sys.stdout.flush()
             spinner += 1
 
+
 if __name__ == '__main__':
     main()
diff --git a/tools/docker/README.md b/tools/docker/README.md
new file mode 100644
index 0000000000..ce85b13a8a
--- /dev/null
+++ b/tools/docker/README.md
@@ -0,0 +1,25 @@
+# Docker Support in GluonNLP
+We provide a [Docker](https://www.docker.com/) container with everything set up to run GluonNLP.
+With the prebuilt Docker image, there is no need to worry about the operating system or system dependencies.
+You can launch a [JupyterLab](https://jupyterlab.readthedocs.io/en/stable/) development environment
+and try using GluonNLP to solve your problem.
+
+## Run Docker
+You can run the Docker container with the following command.
+
+```
+docker pull gluonai/gluon-nlp:gpu-latest
+docker run --gpus all --rm -it -p 8888:8888 -p 8787:8787 -p 8786:8786 --shm-size=4g gluonai/gluon-nlp:gpu-latest
+```
+
+Here, we open the ports 8888, 8787, 8786, which are used for connecting to JupyterLab.
+Also, we set `--shm-size` to `4g`. This sets the shared memory storage to 4GB. Since NCCL will
+create shared memory segments, this argument is essential for Jupyter notebooks to work with NCCL.
+(See also https://github.com/NVIDIA/nccl/issues/290).
+
+## Build your own Docker Image
+To build a Docker image from the Dockerfile, you may use the following command:
+
+```
+docker build -f ubuntu18.04-devel-gpu.Dockerfile -t gluonai/gluon-nlp:gpu-latest .
+```
diff --git a/tools/docker/devel_entrypoint.sh b/tools/docker/devel_entrypoint.sh
new file mode 100644
index 0000000000..6a91eb26a3
--- /dev/null
+++ b/tools/docker/devel_entrypoint.sh
@@ -0,0 +1,5 @@
+#!/bin/bash
+
+source /start_jupyter.sh
+
+exec "$@"
diff --git a/tools/docker/start_jupyter.sh b/tools/docker/start_jupyter.sh
new file mode 100644
index 0000000000..695ad45d88
--- /dev/null
+++ b/tools/docker/start_jupyter.sh
@@ -0,0 +1,15 @@
+#!/bin/bash
+
+# Run Jupyter in foreground if $JUPYTER_FG is set to "true"
+if [[ "${JUPYTER_FG}" == "true" ]]; then
+    jupyter-lab --allow-root --ip=0.0.0.0 --no-browser --NotebookApp.token=''
+    exit 0
+else
+    nohup jupyter-lab --allow-root --ip=0.0.0.0 --no-browser --NotebookApp.token='' > /dev/null 2>&1 &
+
+    echo "Notebook server successfully started, a JupyterLab instance has been executed!"
+    echo "Make local folders visible by volume mounting to /workspace/notebook"
+    echo "To access visit http://localhost:8888 on your host machine."
+    echo 'Ensure the following arguments to "docker run" are added to expose the server ports to your host machine:
+    -p 8888:8888 -p 8787:8787 -p 8786:8786'
+fi
diff --git a/tools/docker/ubuntu18.04-devel-gpu.Dockerfile b/tools/docker/ubuntu18.04-devel-gpu.Dockerfile
new file mode 100644
index 0000000000..1040453f88
--- /dev/null
+++ b/tools/docker/ubuntu18.04-devel-gpu.Dockerfile
@@ -0,0 +1,156 @@
+FROM nvidia/cuda:10.2-cudnn7-devel-ubuntu18.04
+
+LABEL maintainer="GluonNLP Team"
+
+ARG DEBIAN_FRONTEND=noninteractive
+
+ENV PYTHONDONTWRITEBYTECODE=1 \
+    PYTHONUNBUFFERED=1 \
+    LD_LIBRARY_PATH="${LD_LIBRARY_PATH}:/usr/local/lib" \
+    PYTHONIOENCODING=UTF-8 \
+    LANG=C.UTF-8 \
+    LC_ALL=C.UTF-8
+
+ENV WORKDIR=/workspace
+ENV SHELL=/bin/bash
+
+RUN apt-get update \
+    && apt-get install -y --no-install-recommends \
+    software-properties-common \
+    build-essential \
+    ca-certificates \
+    curl \
+    emacs \
+    subversion \
+    locales \
+    cmake \
+    git \
+    libopencv-dev \
+    htop \
+    vim \
+    wget \
+    unzip \
+    libopenblas-dev \
+    ninja-build \
+    openssh-client \
+    openssh-server \
+    python3-dev \
+    python3-pip \
+    python3-setuptools \
+    libxft-dev \
+    zlib1g-dev \
+    && apt-get clean \
+    && rm -rf /var/lib/apt/lists/*
+
+RUN python3 -m pip --no-cache-dir install --upgrade \
+    pip \
+    setuptools
+
+###########################################################################
+# Horovod dependencies
+###########################################################################
+
+# Install Open MPI
+RUN mkdir /tmp/openmpi \
+    && cd /tmp/openmpi \
+    && curl -fSsL -O https://download.open-mpi.org/release/open-mpi/v4.0/openmpi-4.0.1.tar.gz \
+    && tar zxf openmpi-4.0.1.tar.gz \
+    && cd openmpi-4.0.1 \
+    && ./configure --enable-orterun-prefix-by-default \
+    && make -j $(nproc) all \
+    && make install \
+    && ldconfig \
+    && rm -rf /tmp/openmpi
+
+# Create a wrapper for OpenMPI to allow running as root by default
+RUN mv /usr/local/bin/mpirun /usr/local/bin/mpirun.real \
+    && echo '#!/bin/bash' > /usr/local/bin/mpirun \
+    && echo 'mpirun.real --allow-run-as-root "$@"' >> /usr/local/bin/mpirun \
+    && chmod a+x /usr/local/bin/mpirun
+
+RUN echo "hwloc_base_binding_policy = none" >> /usr/local/etc/openmpi-mca-params.conf \
+    && echo "rmaps_base_mapping_policy = slot" >> /usr/local/etc/openmpi-mca-params.conf
+
+ENV LD_LIBRARY_PATH=/usr/local/openmpi/lib:$LD_LIBRARY_PATH
+ENV PATH=/usr/local/openmpi/bin/:/usr/local/bin:/root/.local/bin:$PATH
+
+RUN ln -s $(which python3) /usr/local/bin/python
+
+RUN mkdir -p ${WORKDIR}
+
+# pin PyYAML to avoid conflict with latest awscli
+# python-dateutil==2.8.0 to satisfy botocore associated with latest awscli
+RUN pip3 install --no-cache --upgrade \
+    wheel \
+    numpy==1.19.1 \
+    pandas==0.25.1 \
+    pytest \
+    Pillow \
+    requests==2.22.0 \
+    scikit-learn==0.20.4 \
+    scipy==1.2.2 \
+    urllib3==1.25.8 \
+    python-dateutil==2.8.0 \
+    sagemaker-experiments==0.* \
+    PyYAML==5.3.1 \
+    mpi4py==3.0.2 \
+    jupyterlab==2.2.4 \
+    cmake \
+    awscli
+
+# Install MXNet
+RUN mkdir -p ${WORKDIR}/mxnet \
+    && cd ${WORKDIR}/mxnet \
+    && git clone --single-branch --branch master --recursive https://github.com/apache/incubator-mxnet \
+    && cd incubator-mxnet \
+    && mkdir build \
+    && cd build \
+    && cmake -DMXNET_CUDA_ARCH="3.0;5.0;6.0;7.0" -GNinja -C ../config/linux_gpu.cmake .. \
+    && cmake --build . \
+    && cd ../python \
+    && python3 -m pip install -U -e . --user
+
+# Install Horovod
+# TODO Fix once https://github.com/horovod/horovod/pull/2155 gets merged
+RUN mkdir ${WORKDIR}/horovod \
+    && cd ${WORKDIR}/horovod \
+    && git clone --single-branch --branch mx2-pr --recursive https://github.com/eric-haibin-lin/horovod \
+    && cd horovod \
+    && ldconfig /usr/local/cuda/targets/x86_64-linux/lib/stubs \
+    && HOROVOD_GPU_ALLREDUCE=NCCL HOROVOD_GPU_BROADCAST=NCCL HOROVOD_WITHOUT_GLOO=1 \
+    HOROVOD_WITH_MPI=1 HOROVOD_WITH_MXNET=1 HOROVOD_WITHOUT_PYTORCH=1 \
+    HOROVOD_WITHOUT_TENSORFLOW=1 python3 setup.py install --user \
+    && ldconfig
+
+RUN mkdir -p ${WORKDIR}/notebook
+RUN mkdir -p ${WORKDIR}/data
+RUN mkdir -p /.init
+RUN cd ${WORKDIR} \
+    && git clone https://github.com/dmlc/gluon-nlp \
+    && cd gluon-nlp \
+    && git checkout master \
+    && python3 -m pip install -U -e ."[extras]" --user
+
+COPY start_jupyter.sh /start_jupyter.sh
+COPY devel_entrypoint.sh /devel_entrypoint.sh
+RUN chmod +x /devel_entrypoint.sh
+
+EXPOSE 8888
+EXPOSE 8787
+EXPOSE 8786
+
+WORKDIR ${WORKDIR}
+
+# Debug horovod by default
+RUN echo NCCL_DEBUG=INFO >> /etc/nccl.conf
+
+# Revise default shell to /bin/bash
+RUN jupyter notebook --generate-config \
+    && echo "c.NotebookApp.terminado_settings = { 'shell_command': ['/bin/bash'] }" >> /root/.jupyter/jupyter_notebook_config.py
+
+# Add Tini
+ARG TINI_VERSION=v0.19.0
+ADD https://github.com/krallin/tini/releases/download/${TINI_VERSION}/tini /tini
+RUN chmod +x /tini
+ENTRYPOINT [ "/tini", "--", "/devel_entrypoint.sh" ]
+CMD ["/bin/bash"]
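
Note on the queue selection in `tools/batch/submit-job.py`: the patch replaces a rule that derived the AWS Batch queue from the instance family with the explicit `job_queues` table. The sketch below is a standalone illustration of how the two differ, not code from the patch. `JOB_QUEUES` copies the table from the diff; `legacy_queue` and `resolve_queue` are hypothetical helper names, and the `ValueError` for unknown job types is an assumption (the patched script indexes the dict directly).

```python
# Standalone sketch, not part of the patch.
# JOB_QUEUES mirrors the job_queues table added in tools/batch/submit-job.py.
JOB_QUEUES = {
    'g4dn.4x': 'g4dn',
    'g4dn.8x': 'g4dn',
    'g4dn.12x': 'g4dn-multi-gpu',
    'g4dn.16x': 'g4dn',
    'p3.2x': 'p3',
    'p3.8x': 'p3-4gpu',
    'p3.16x': 'p3-8gpu',
    'p3dn.24x': 'p3dn-8gpu',
    'c5n.18x': 'c5n',
}


def legacy_queue(job_type):
    """Rule removed by the patch: use the instance family, mapping p3dn to p3."""
    queue = job_type.split('.')[0]
    return 'p3' if queue == 'p3dn' else queue


def resolve_queue(job_type):
    """Table lookup as in the patched script; the ValueError is an assumption."""
    try:
        return JOB_QUEUES[job_type]
    except KeyError:
        raise ValueError('unknown job type: {!r}'.format(job_type)) from None


if __name__ == '__main__':
    # List every job type and flag the ones whose queue changes.
    for job_type in sorted(JOB_QUEUES):
        old, new = legacy_queue(job_type), resolve_queue(job_type)
        marker = '' if old == new else '  (changed)'
        print('{:10s} {:6s} -> {}{}'.format(job_type, old, new, marker))
```

With the table, multi-GPU job types such as `p3.8x` and `g4dn.12x` no longer share a queue with the single-GPU types of the same family, which the prefix rule could not express.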