pytorch Error in Sync batch norm #211

Open
krishnakanthnakka opened this issue Jul 9, 2019 · 1 comment

@krishnakanthnakka

Environment: CUDA 9.2, GCC 6.0

Traceback (most recent call last):
File "experiments/segmentation/demo.py", line 16, in <module>
output = model.evaluate(img)
File "/cvlabdata2/home/krishna/packages/conda2/envs/py3.6/lib/python3.6/site-packages/encoding/models/base.py", line 78, in evaluate
pred = self.forward(x)
File "/cvlabdata2/home/krishna/packages/conda2/envs/py3.6/lib/python3.6/site-packages/encoding/models/fcn.py", line 51, in forward
_, _, c3, c4 = self.base_forward(x)
File "/cvlabdata2/home/krishna/packages/conda2/envs/py3.6/lib/python3.6/site-packages/encoding/models/base.py", line 67, in base_forward
x = self.pretrained.conv1(x)
File "/home/nakka/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
result = self.forward(*input, **kwargs)
File "/home/nakka/.local/lib/python3.6/site-packages/torch/nn/modules/container.py", line 92, in forward
input = module(input)
File "/home/nakka/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
result = self.forward(*input, **kwargs)
File "/cvlabdata2/home/krishna/packages/conda2/envs/py3.6/lib/python3.6/site-packages/encoding/nn/syncbn.py", line 122, in forward
self.activation, self.slope).view(input_shape)
File "/cvlabdata2/home/krishna/packages/conda2/envs/py3.6/lib/python3.6/site-packages/encoding/functions/syncbn.py", line 95, in forward
y = lib.gpu.batchnorm_forward(x, _ex, _exs, gamma, beta, ctx.eps)
RuntimeError: cudaGetLastError() == cudaSuccess ASSERT FAILED at /cvlabdata2/home/krishna/packages/conda2/envs/py3.6/lib/python3.6/site-packages/encoding/lib/gpu/syncbn_kernel.cu:289, please report a bug to PyTorch. (BatchNorm_Forward_CUDA at /cvlabdata2/home/krishna/packages/conda2/envs/py3.6/lib/python3.6/site-packages/encoding/lib/gpu/syncbn_kernel.cu:289)
frame #0: std::function<std::string ()>::operator()() const + 0x11 (0x7f4d5b06dfe1 in /home/nakka/.local/lib/python3.6/site-packages/torch/lib/libc10.so)
frame #1: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x2a (0x7f4d5b06ddfa in /home/nakka/.local/lib/python3.6/site-packages/torch/lib/libc10.so)
frame #2: BatchNorm_Forward_CUDA(at::Tensor, at::Tensor, at::Tensor, at::Tensor, at::Tensor, float) + 0x2c7 (0x7f4d471d5788 in /cvlabdata2/home/krishna/packages/conda2/envs/py3.6/lib/python3.6/site-packages/encoding/lib/gpu/enclib_gpu.so)
frame #3: <unknown function> + 0x6fb5e (0x7f4d471afb5e in /cvlabdata2/home/krishna/packages/conda2/envs/py3.6/lib/python3.6/site-packages/encoding/lib/gpu/enclib_gpu.so)
frame #4: <unknown function> + 0x6a4f5 (0x7f4d471aa4f5 in /cvlabdata2/home/krishna/packages/conda2/envs/py3.6/lib/python3.6/site-packages/encoding/lib/gpu/enclib_gpu.so)
frame #5: <unknown function> + 0x62ce9 (0x7f4d471a2ce9 in /cvlabdata2/home/krishna/packages/conda2/envs/py3.6/lib/python3.6/site-packages/encoding/lib/gpu/enclib_gpu.so)
frame #6: <unknown function> + 0x63004 (0x7f4d471a3004 in /cvlabdata2/home/krishna/packages/conda2/envs/py3.6/lib/python3.6/site-packages/encoding/lib/gpu/enclib_gpu.so)
frame #7: <unknown function> + 0x5192c (0x7f4d4719192c in /cvlabdata2/home/krishna/packages/conda2/envs/py3.6/lib/python3.6/site-packages/encoding/lib/gpu/enclib_gpu.so)

frame #16: THPFunction_apply(_object*, _object*) + 0x581 (0x7f4d55c374d1 in /home/nakka/.local/lib/python3.6/site-packages/torch/lib/libtorch_python.so)

@zhanghang1989
Owner

Could you try installing CUDA 10.1 and then reinstalling PyTorch?
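
Before reinstalling, it may help to confirm which CUDA versions are actually in play. In the traceback above, torch is loaded from /home/nakka/.local while the encoding package comes from a conda environment, so a mismatch between the two is at least plausible. The following is a minimal diagnostic sketch, not part of this project: the helper names (`major_minor`, `parse_nvcc_release`, `toolkit_cuda_version`, `torch_cuda_version`, `report`) are hypothetical, and it assumes `nvcc` is on the PATH and that comparing major.minor versions is enough.

```python
import re
import shutil
import subprocess


def major_minor(version):
    """Reduce a version string such as '9.0.176' to 'major.minor'."""
    return ".".join(version.split(".")[:2])


def parse_nvcc_release(nvcc_output):
    """Extract the 'release X.Y' version from `nvcc --version` output, or None."""
    match = re.search(r"release (\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def toolkit_cuda_version():
    """CUDA toolkit version reported by nvcc, or None if nvcc is not on PATH."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None
    result = subprocess.run(
        [nvcc, "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return parse_nvcc_release(result.stdout)


def torch_cuda_version():
    """CUDA version PyTorch was built against, or None (no torch, or CPU-only build)."""
    try:
        import torch
    except ImportError:
        return None
    return torch.version.cuda


def report():
    """Summarize both versions and flag a major.minor mismatch."""
    built = torch_cuda_version()
    installed = toolkit_cuda_version()
    lines = [
        "PyTorch built with CUDA: {}".format(built),
        "nvcc toolkit CUDA: {}".format(installed),
    ]
    if built and installed and major_minor(built) != major_minor(installed):
        lines.append("Mismatch: extensions compiled with this nvcc may not match PyTorch.")
    return "\n".join(lines)
```

Running `print(report())` in the same environment as demo.py shows both versions side by side. If they differ, rebuilding the extension after aligning the toolkit and the PyTorch build is a reasonable next step.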
