Time taken to data loading increased in newer builds (ARM) #124922
Comments
I think your analysis is correct, but I wonder whether builds after pytorch/builder#1787 would revert to the original behavior. @snadampal, can we add some sort of regression test to the release process to avoid such flaws in the future?
OK, taking this for myself, as pytorch/builder#1787 actually broke nightly builds, so I need to restart the process.
@malfet Thanks.
Hi @dilililiwhy, can you please try the latest nightly wheel? It has the correct OpenBLAS and also the correct libgomp. That was the main difference that went into the previous builder patch you pointed out.
Hi @malfet, yes, we need some regression tests around OpenBLAS and OpenMP. I will check what can be included in the CD test.
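Since the fix discussed above comes down to which OpenBLAS and libgomp a wheel ships, a quick local check may help when comparing two builds. The following is only a sketch, not part of the PyTorch CD tests: the helper names `loaded_shared_libraries` and `report` are hypothetical, the `/proc/self/maps` lookup is an assumption that only holds on Linux, and the name pattern is a guess at how the libraries are named on disk.

```python
import re
from pathlib import Path


def loaded_shared_libraries(pattern=r"(openblas|gomp|omp)"):
    """List shared libraries mapped into this process whose file name matches pattern.

    Assumption: Linux only. On other platforms /proc/self/maps does not
    exist and an empty list is returned.
    """
    maps = Path("/proc/self/maps")
    if not maps.exists():
        return []
    regex = re.compile(pattern, re.IGNORECASE)
    libs = set()
    for line in maps.read_text().splitlines():
        # Fields: address, perms, offset, dev, inode, pathname (optional).
        parts = line.split(maxsplit=5)
        if len(parts) == 6 and regex.search(Path(parts[5]).name):
            libs.add(parts[5])
    return sorted(libs)


def report():
    """Print the torch build configuration and the matching loaded libraries."""
    try:
        import torch  # imported lazily so the helper above works without torch
    except ImportError:
        print("torch is not installed in this environment")
    else:
        print("torch version:", torch.__version__)
        print(torch.__config__.show())           # compile-time build settings
        print(torch.__config__.parallel_info())  # OpenMP / threading settings
    for lib in loaded_shared_libraries():
        print("loaded:", lib)
```

Calling `report()` once under each wheel (for example dev20240206 and dev20240207) and diffing the output should show whether the bundled libraries differ.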
The time taken returned to normal with dev20240426.
@snadampal Thanks for your help. Will there be a nightly version of torch 2.3.1?
Great! I have raised a PR to fix this issue on the pytorch/2.3 release. I think once it is merged into the 2.3 release branch, there may be a nightly wheel.
🐛 Describe the bug
When using the ARM CPU package of torch (2.2.1/2.2.2/2.3.0), the time taken for data loading increases erratically (like "noise"). Looking back at historical builds, it seems that torch-2.3.0.dev20240207 introduced a change that affects data loading, since torch-2.3.0.dev20240206 behaves normally.
However, commits to pytorch during that period do not seem to affect data loading. Maybe the builder change for the ARM package (pytorch/builder#1696) caused this issue? I have no idea about the underlying dependencies and need help.
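To make the comparison between builds concrete, per-step loading time can be measured with a small harness. This is a minimal sketch and not the reporter's test script: the function names, the dataset shape, the batch size, and the worker count are all assumptions chosen for illustration. Only the 30-step count comes from the report.

```python
import time
from statistics import mean, pstdev


def time_loading_steps(loader, num_steps=30):
    """Return the seconds spent waiting for each of the first num_steps batches."""
    durations = []
    iterator = iter(loader)
    for _ in range(num_steps):
        start = time.perf_counter()
        try:
            next(iterator)  # only the fetch is timed, not any training work
        except StopIteration:
            break
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Return (total, mean, population std dev) of the step times in seconds."""
    if not durations:
        return 0.0, 0.0, 0.0
    return sum(durations), mean(durations), pstdev(durations)


def build_example_loader(num_workers=4):
    """Build a synthetic DataLoader; sizes here are hypothetical, not from the report."""
    import torch
    from torch.utils.data import DataLoader, TensorDataset

    data = torch.randn(3840, 3, 32, 32)
    labels = torch.zeros(3840, dtype=torch.long)
    return DataLoader(TensorDataset(data, labels), batch_size=128,
                      num_workers=num_workers)
```

Running `summarize(time_loading_steps(build_example_loader()))` under each torch build, with everything else unchanged, gives numbers that can be compared directly. A large standard deviation would correspond to the "noise" described above.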
Test Result (only the torch version changed)
For dev20240206, time taken for 30 steps
For 2.3.0, time taken for 30 steps
For dev20240207, time taken for 30 steps
Test Demo
Versions
PyTorch version: 2.3.0
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A
OS: CentOS Linux 7 (AltArch) (aarch64)
GCC version: (GCC) 10.2.1 20210130 (Red Hat 10.2.1-11)
Clang version: Could not collect
CMake version: version 3.29.0
Libc version: glibc-2.17
Python version: 3.8.19 (default, Apr 2 2024, 06:27:46) [GCC 10.2.1 20210130 (Red Hat 10.2.1-11)] (64-bit runtime)
Python platform: Linux-4.18.0-80.7.2.el7.aarch64-aarch64-with-glibc2.17
Is CUDA available: False
CUDA runtime version: No CUDA
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True
CPU:
Architecture: aarch64
Byte Order: Little Endian
CPU(s): 64
On-line CPU(s) list: 0-63
Thread(s) per core: 1
Core(s) per socket: 32
Socket(s): 2
NUMA node(s): 2
Model: 0
CPU max MHz: 2400.0000
CPU min MHz: 2400.0000
BogoMIPS: 200.00
L1d cache: 64K
L1i cache: 64K
L2 cache: 512K
L3 cache: 32768K
NUMA node0 CPU(s): 0-31
NUMA node1 CPU(s): 32-63
Flags: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma dcpop asimddp asimdfhm
Versions of relevant libraries:
[pip3] numpy==1.24.4
[pip3] torch==2.3.0
[pip3] torchvision==0.18.0
[conda] Could not collect
cc @seemethere @malfet @osalpekar @atalman @snadampal