triton build failing to access https://tritonlang.blob.core.windows.net/llvm-builds/ with HTTP Error 409 #4527

Open
lisaong opened this issue Aug 16, 2024 · 17 comments

Comments

@lisaong

lisaong commented Aug 16, 2024

This was working earlier, so the failure started recently.

PublicAccessNotPermitted Public access is not permitted on this storage account. RequestId:372d399d-901e-0036-4086-ef20cc000000 Time:2024-08-16T02:46:44.4328589Z
[] #186 27.76 Building wheels for collected packages: triton
[] #186 27.76   Building wheel for triton (pyproject.toml): started
[] #186 28.67   error: subprocess-exited-with-error
[] #186 28.67   
[] #186 28.67   × Building wheel for triton (pyproject.toml) did not run successfully.
[] #186 28.67   │ exit code: 1
[] #186 28.67   ╰─> [7 lines of output]
[] #186 28.67       running bdist_wheel
[] #186 28.67       running build
[] #186 28.67       running build_py
[] #186 28.67       running build_ext
[] #186 28.67       downloading and extracting https://github.com/pybind/pybind11/archive/refs/tags/v2.11.1.tar.gz ...
[] #186 28.67       downloading and extracting https://tritonlang.blob.core.windows.net/llvm-builds/llvm-49af6502-ubuntu-x64.tar.gz ...
[] #186 28.67       error: HTTP Error 409: Public access is not permitted on this storage account.
[] #186 28.67       [end of output]
@prarit

prarit commented Aug 16, 2024

I'm seeing this too. The link to the tarball is now restricted.

@prarit

prarit commented Aug 16, 2024

In case anyone is looking into this: I've tried accessing https://tritonlang.blob.core.windows.net via an internal redhat.com network, a private network and via an online anonymizer. It looks like the site is locked down for some reason and access to the tarball is restricted as indicated in the initial report by @lisaong
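
For anyone who wants to repeat this check from their own network, here is a minimal sketch using only the Python standard library. The URL is the one from the build log in the initial report; the function names `describe_status` and `probe` are made up for illustration, and it is an assumption that the storage endpoint answers a HEAD request with the same status code it gives a download.

```python
import urllib.error
import urllib.request

# URL copied from the build log in the initial report.
LLVM_TARBALL_URL = (
    "https://tritonlang.blob.core.windows.net/llvm-builds/"
    "llvm-49af6502-ubuntu-x64.tar.gz"
)


def describe_status(code: int) -> str:
    """Map an HTTP status code to a short, human-readable verdict."""
    if 200 <= code < 300:
        return "reachable"
    if code == 409:
        # The status reported in this issue (PublicAccessNotPermitted).
        return "public access disabled on the storage account"
    if code in (401, 403):
        return "access denied"
    if code == 404:
        return "blob not found"
    return f"unexpected status {code}"


def probe(url: str, timeout: float = 10.0) -> int:
    """Send a HEAD request and return the HTTP status code.

    Network-level failures (DNS, timeout) still raise urllib.error.URLError.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses; the code is the answer we want.
        return err.code
```

Calling `describe_status(probe(LLVM_TARBALL_URL))` from several networks gives a quick way to tell a server-side lockdown (same status everywhere) from a local proxy or firewall problem.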

@prarit

prarit commented Aug 16, 2024

Is there a way to provide a 1-time download for the tarball? Perhaps through a direct commit to the triton repo itself?

@atalman
Collaborator

atalman commented Aug 16, 2024

We are also seeing this when trying to build triton in pytorch: https://github.com/pytorch/pytorch/actions/runs/10421322991/job/28863284428

cc @ptillet @Jokeren

@jayfurmanek
Contributor

That URL was changed back in June:
06e6799

Perhaps torch is on an older commit?
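
One way to check whether a given checkout still points at the old host is to scan its build script for the hostname. The sketch below is only an illustration: the helper name `lines_referencing` is hypothetical, and the path `python/setup.py` mentioned afterwards is an assumption about where the download URL lives in the checkout.

```python
# Host name taken from the failing URL in the build log.
OLD_HOST = "tritonlang.blob.core.windows.net"


def lines_referencing(text: str, needle: str = OLD_HOST) -> list[tuple[int, str]]:
    """Return (line_number, stripped_line) pairs for lines containing needle."""
    return [
        (number, line.strip())
        for number, line in enumerate(text.splitlines(), start=1)
        if needle in line
    ]


# Small in-memory example; a real run would read the file from the checkout.
sample = "\n".join(
    [
        "name = 'llvm'",
        "url = f'https://tritonlang.blob.core.windows.net/llvm-builds/{name}.tar.gz'",
        "print(url)",
    ]
)
hits = lines_referencing(sample)
```

For a real checkout, the same function could be run on the contents of `python/setup.py` (assumed location); an empty result would suggest the checkout already uses the newer URL.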

@ytaous

ytaous commented Aug 17, 2024

The URL is still blocked. Any updates?

@TheTrustedComputer

TheTrustedComputer commented Aug 17, 2024

I also hit this HTTP 409 error when building PyTorch 2.4 + ROCm. I think Triton fails to build because AOTriton (AMD's fork) uses an old upstream commit that still contains the now-broken URL.

@fei-xx

fei-xx commented Aug 17, 2024

I'm also seeing a similar issue. Any updates?

@cantor-set

This is still an issue; AOTriton still cannot be built.

@prarit

prarit commented Aug 18, 2024

For anyone attempting to build aotriton from ROCm: aotriton references a DIFFERENT repository than triton-lang, and I have filed an issue here: ROCm#631

pytorchmergebot added a commit to pytorch/pytorch that referenced this issue Aug 19, 2024
This reverts commit 32ed4a3.

Reverted #133454 on behalf of https://github.com/ZainRizvi due to Sorry, there's [an outage](triton-lang/triton#4527) that's preventing triton from being installed correctly, which has the side effect of breaking our docker builds. Reverting this PR since it requires a docker rebuild (which now fails) to give us more time to properly fix the docker builds. ([comment](#133454 (comment)))
pytorchmergebot referenced this issue in pytorch/pytorch Aug 19, 2024
The main purpose of this PR is to change the XPU CD to use the rolling driver, in order to support more client GPU AOT builds and enable Kineto. There is also a plan to enable Python 3.13 for XPU CD.

Works for #114850
Pull Request resolved: #133454
Approved by: https://github.com/atalman
@hyhuang00

That URL was changed back in June: 06e6799

Perhaps torch is on an older commit?

Thank you! Using the new URL solved this problem.

pytorchmergebot added a commit to mori360/pytorch that referenced this issue Aug 20, 2024
@prarit

prarit commented Aug 20, 2024

@jayfurmanek torch is indeed on an older commit. Is there any plan to provide the older llvm-49af6502-ubuntu-x64.tar.gz tarball at the new oatriton download link? If not, is there a way I and others could get a copy?

@vaenyr

vaenyr commented Aug 21, 2024

For anyone who might be in a similar situation:

I ran into the same issue recently when trying to build PyTorch for ROCm from source. I tried a couple of ideas, including upgrading LLVM in AMD's triton fork, all in vain. In the end, what worked well for me was following the "Building with custom LLVM" section of the README.

  1. Clone and checkout LLVM to commit 49af6502c6dcb4a7f7520178bd14df396f78240c.
  2. Build with AMDGPU support
  3. Set the env variables as explained in the README (I used export ...)
  4. Build PyTorch as usual

No more HTTP 409/missing blobs/version mismatch problems - everything went smoothly.
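
As a rough illustration of step 3, the sketch below derives the variables from a single build directory. It assumes the README variables are `LLVM_INCLUDE_DIRS`, `LLVM_LIBRARY_DIR` and `LLVM_SYSPATH` (the same three names that appear elsewhere in this thread) and that the build tree has the usual `include` and `lib` subdirectories. The helper names `llvm_env` and `export_lines` are hypothetical.

```python
from pathlib import PurePosixPath


def llvm_env(build_dir: str) -> dict[str, str]:
    """Return the variables that point the Triton build at a custom LLVM build.

    Assumes the conventional layout: <build_dir>/include and <build_dir>/lib.
    """
    root = PurePosixPath(build_dir)
    return {
        "LLVM_INCLUDE_DIRS": str(root / "include"),
        "LLVM_LIBRARY_DIR": str(root / "lib"),
        "LLVM_SYSPATH": str(root),
    }


def export_lines(build_dir: str) -> list[str]:
    """Render the variables as shell export lines, one per variable."""
    return [f"export {name}={value}" for name, value in llvm_env(build_dir).items()]
```

The dictionary can be merged into `os.environ` before launching the build from Python, or the export lines can be pasted into the shell that runs the build.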

@prarit

prarit commented Aug 21, 2024

Thanks. I had actually stumbled across this today and was trying it out :) It seems like it works -- I'm just worried that the AMD llvm may have additional flags we're not compiling with. Hopefully someone from the AMD ROCm team can confirm the above instructions are valid.

Stonepia added a commit to Stonepia/pytorch that referenced this issue Aug 22, 2024
XPU Triton does not need the workaround patch for setup.py. This commit skips it, and in the future could be directly removed when the triton-lang/triton#4527 is fixed.
@TheTrustedComputer

TheTrustedComputer commented Aug 22, 2024

I also had mixed results building Triton (upstream commit cb3d79a185e40c9d8a579bea07747a8a8d157d52 and AMD's triton commit 9b73a543a5545960bcaf2830900b0560eec443c5) with this particular LLVM commit (49af6502c6dcb4a7f7520178bd14df396f78240c). I cannot seem to get past this error when installing the Python package:

CMake Error at CMakeLists.txt:61 (find_package):
Found package configuration file:

    /root/triton-llvm-build/lib/cmake/mlir/MLIRConfig.cmake

but it set MLIR_FOUND to FALSE so package "MLIR" is considered to be NOT
FOUND.  Reason given by package:

The following imported targets are referenced, but are missing:
LLVMNVPTXCodeGen LLVMNVPTXDesc LLVMNVPTXInfo



-- Configuring incomplete, errors occurred

I built LLVM with CMake using the following flags:

-B /root/triton-llvm-build \
-DCMAKE_BUILD_TYPE=Release \
-DLLVM_DEFAULT_TARGET_TRIPLE="x86_64-pc-linux-gnu" \
-DLLVM_TARGETS_TO_BUILD="AMDGPU;NVPTX;Native" \
-DLLVM_ENABLE_PROJECTS="mlir" \
-DLLVM_ENABLE_ASSERTIONS=ON \
-DLLVM_CCACHE_BUILD=ON \
-G Ninja

The find command shows these targets are present, despite what CMake says.

find /root/triton-llvm-build -name "*LLVMNVPTXCodeGen*"

/root/triton-llvm-build/lib/Target/NVPTX/CMakeFiles/LLVMNVPTXCodeGen.dir
/root/triton-llvm-build/lib/libLLVMNVPTXCodeGen.a

find /root/triton-llvm-build -name "*LLVMNVPTXDesc*"

/root/triton-llvm-build/lib/Target/NVPTX/MCTargetDesc/CMakeFiles/LLVMNVPTXDesc.dir
/root/triton-llvm-build/lib/libLLVMNVPTXDesc.a

find /root/triton-llvm-build -name "*LLVMNVPTXInfo*"

/root/triton-llvm-build/lib/Target/NVPTX/TargetInfo/CMakeFiles/LLVMNVPTXInfo.dir
/root/triton-llvm-build/lib/libLLVMNVPTXInfo.a

Environment variables used:

LLVM_LIBRARY_DIR=/root/triton-llvm-build/lib
LLVM_INCLUDE_DIRS=/root/triton-llvm-build/include
LLVM_SYSPATH=/root/triton-llvm-build
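
A note on the diagnostics: the CMake message refers to imported targets, which are names declared in CMake export files, not to library files on disk, so `find` locating the `.a` archives does not by itself rule the error out. One possible next check is whether the LLVM export file in the build tree declares the three names. The sketch below is an assumption-laden illustration: the helper names are hypothetical, and the file to feed it (for example `lib/cmake/llvm/LLVMExports.cmake` under the build directory) is an assumed location.

```python
import re

# Target names copied from the CMake error message above.
WANTED = ["LLVMNVPTXCodeGen", "LLVMNVPTXDesc", "LLVMNVPTXInfo"]

# Matches declarations such as: add_library(LLVMSupport STATIC IMPORTED)
_IMPORTED_TARGET = re.compile(r"add_library\(\s*([A-Za-z0-9_.+-]+)\s+\w+\s+IMPORTED")


def declared_targets(cmake_text: str) -> set[str]:
    """Collect the names of imported library targets declared in cmake_text."""
    return set(_IMPORTED_TARGET.findall(cmake_text))


def undeclared(cmake_text: str, wanted: list[str]) -> list[str]:
    """Return the wanted target names that cmake_text does not declare."""
    declared = declared_targets(cmake_text)
    return [name for name in wanted if name not in declared]


# In-memory example standing in for the contents of an export file.
sample = (
    "add_library(LLVMSupport STATIC IMPORTED)\n"
    "add_library(LLVMNVPTXInfo STATIC IMPORTED)\n"
)
missing = undeclared(sample, WANTED)
```

If the real export file does declare all three names, the mismatch more likely lies in which LLVM or MLIR configuration the Triton build actually picks up; this sketch does not diagnose that.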

@pramenku

pramenku commented Aug 27, 2024

https://tritonlang.blob.core.windows.net/llvm-builds/llvm-ed4e505c-ubuntu-x64.tar.gz is working again; it was not working earlier.

$ wget https://tritonlang.blob.core.windows.net/llvm-builds/llvm-ed4e505c-ubuntu-x64.tar.gz
--2024-08-27 03:25:50-- https://tritonlang.blob.core.windows.net/llvm-builds/llvm-ed4e505c-ubuntu-x64.tar.gz
Resolving tritonlang.blob.core.windows.net (tritonlang.blob.core.windows.net)... 20.47.62.100, 20.47.62.112, 20.47.62.60
Connecting to tritonlang.blob.core.windows.net (tritonlang.blob.core.windows.net)|20.47.62.100|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 977588953 (932M) [application/x-tar]
Saving to: ‘llvm-ed4e505c-ubuntu-x64.tar.gz’

llvm-ed4e505c-ubuntu-x64.tar.gz 0%[ ] 31.63K 11.9KB/s

@TheTrustedComputer

I can also confirm the links are working again.

pytorch-bot bot pushed a commit to pytorch/pytorch that referenced this issue Sep 13, 2024
malfet pushed a commit to aditew01/pytorch that referenced this issue Sep 13, 2024