stuck at building wheel #1077

Closed
neurosynapse opened this issue Aug 5, 2024 · 4 comments
Comments

@neurosynapse

pip install git+https://github.com/NVIDIA/TransformerEngine.git@stable
Defaulting to user installation because normal site-packages is not writeable
Collecting git+https://github.com/NVIDIA/TransformerEngine.git@stable
Cloning https://github.com/NVIDIA/TransformerEngine.git (to revision stable) to /tmp/pip-req-build-fa900tpa
Running command git clone --filter=blob:none --quiet https://github.com/NVIDIA/TransformerEngine.git /tmp/pip-req-build-fa900tpa
Running command git checkout -b stable --track origin/stable
Switched to a new branch 'stable'
Branch 'stable' set up to track remote branch 'stable' from 'origin'.
Resolved https://github.com/NVIDIA/TransformerEngine.git to commit 3ec998e
Running command git submodule update --init --recursive -q
Preparing metadata (setup.py) ... done
Requirement already satisfied: packaging in /usr/lib/python3/dist-packages (from transformer-engine==1.8.0+3ec998e) (21.3)
Collecting pydantic
Using cached pydantic-2.8.2-py3-none-any.whl (423 kB)
Requirement already satisfied: typing-extensions>=4.6.1 in /home/rob/.local/lib/python3.10/site-packages (from pydantic->transformer-engine==1.8.0+3ec998e) (4.12.2)
Collecting pydantic-core==2.20.1
Using cached pydantic_core-2.20.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (2.1 MB)
Collecting annotated-types>=0.4.0
Using cached annotated_types-0.7.0-py3-none-any.whl (13 kB)
Building wheels for collected packages: transformer-engine
Building wheel for transformer-engine (setup.py) ... |

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Mon_Apr__3_17:16:06_PDT_2023
Cuda compilation tools, release 12.1, V12.1.105
Build cuda_12.1.r12.1/compiler.32688072_0

Ubuntu 22.04

Python 3.12.4 | packaged by Anaconda, Inc. | (main, Jun 18 2024, 15:12:24) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.

torch 2.4.0+cu121

@neurosynapse
Author

GPU: RTX 3090 Ti

@timmoon10
Collaborator

We use Ninja to parallelize the build process and I suspect it's overwhelming your system resources. Can you try running with `MAX_JOBS=1` in your environment?
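For reference, a minimal sketch of what that could look like in a POSIX shell. The install command is the one from the original report; the `install_te` wrapper is a hypothetical name, used only so that pasting the snippet does not start the build by itself.

```shell
# Cap the build at one parallel compile job for this shell session.
export MAX_JOBS=1

# Hypothetical wrapper around the install command from the report above.
# Defining it does nothing; call `install_te` when you are ready to build.
install_te() {
    pip install git+https://github.com/NVIDIA/TransformerEngine.git@stable
}
```

Running `install_te` afterwards starts the build with the job limit applied. With a single job, expect the "Building wheel" step to take considerably longer than a parallel build.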

@timmoon10 mentioned this issue on Aug 9, 2024 · 13 tasks
@1195343015

Hm, I'd expect most systems could handle building with `MAX_JOBS=1`. I wonder if we could get more clues if you build with verbose output (`pip install -v -v .`).

Originally posted by @timmoon10 in #976 (comment)

This worked for me!
You just need to wait longer for the build to finish.
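If waiting is not enough, the verbose build mentioned in the quote can be captured to a file so there is something to attach here. This is only a sketch: `verbose_te_build` and `TE_BUILD_LOG` are hypothetical names, and it assumes you run it from the root of a TransformerEngine checkout with submodules initialized.

```shell
# Hypothetical log file location; override by setting TE_BUILD_LOG beforehand.
TE_BUILD_LOG="${TE_BUILD_LOG:-te_build.log}"

# Hypothetical wrapper: verbose single-job build from the current directory,
# showing output on screen and saving a copy to the log file.
verbose_te_build() {
    MAX_JOBS=1 pip install -v -v . 2>&1 | tee "$TE_BUILD_LOG"
}
```

Calling `verbose_te_build` prints the compiler invocations as they run, which makes it easier to tell a slow build from one that has actually stopped.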

@timmoon10
Collaborator

When building with minimal resource requirements, we now recommend setting `MAX_JOBS=1` and `NVTE_BUILD_THREADS_PER_JOB=1` in the environment. This will of course drastically slow down the build process. Setting `NVTE_CUDA_ARCHS` to your GPU's compute capability (e.g. `NVTE_CUDA_ARCHS=90` for H100) may help speed up building, but the resulting install will run into problems if you then run it on a GPU with a different compute capability.
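A sketch of those settings for the GPU in this report. The value `86` is an assumption not stated in the thread: it follows from NVIDIA listing the RTX 3090 Ti at compute capability 8.6, written in the same dot-free form as the `90` example for H100. The `cc_to_arch` helper is a hypothetical convenience, not part of Transformer Engine.

```shell
# Minimal-resource settings recommended above.
export MAX_JOBS=1
export NVTE_BUILD_THREADS_PER_JOB=1

# Hypothetical helper: turn a compute capability such as "8.6" into "86".
cc_to_arch() {
    printf '%s\n' "$1" | tr -d '.'
}

# Assumption: RTX 3090 Ti has compute capability 8.6 (check NVIDIA's GPU
# table for your card; recent drivers may also report it via
# `nvidia-smi --query-gpu=compute_cap --format=csv,noheader`).
export NVTE_CUDA_ARCHS="$(cc_to_arch 8.6)"
```

With these exported, rerunning the original `pip install` command in the same shell picks them up. If the machine has a different GPU, replace `8.6` with that card's compute capability first.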
