
Compilation fails in master Cuda 10.1.105 GCC 7.4 Ubuntu 18.04 #16612

Open
larroy opened this issue Oct 24, 2019 · 16 comments

Comments

@larroy
Contributor

larroy commented Oct 24, 2019

Description


[74/524] Building NVCC (Device) object CMakeFiles/cuda_compile_1.dir/src/operator/contrib/cuda_compile_1_generated_bounding_box.cu.o
FAILED: CMakeFiles/cuda_compile_1.dir/src/operator/contrib/cuda_compile_1_generated_bounding_box.cu.o
cd /home/piotr/mxnet_master/build/CMakeFiles/cuda_compile_1.dir/src/operator/contrib && /usr/local/bin/cmake -E make_directory /home/piotr/mxnet_master/build/CMakeFiles/cuda_compile_1.dir/src/operator/contrib/. && /usr/local/bin/cmake -D verbose:BOOL=OFF -D build_configuration:STRING=Debug -D generated_file:STRING=/home/piotr/mxnet_master/build/CMakeFiles/cuda_compile_1.dir/src/operator/contrib/./cuda_compile_1_generated_bounding_box.cu.o -D generated_cubin_file:STRING=/home/piotr/mxnet_master/build/CMakeFiles/cuda_compile_1.dir/src/operator/contrib/./cuda_compile_1_generated_bounding_box.cu.o.cubin.txt -P /home/piotr/mxnet_master/build/CMakeFiles/cuda_compile_1.dir/src/operator/contrib/cuda_compile_1_generated_bounding_box.cu.o.Debug.cmake
/home/piotr/mxnet_master/include/dmlc/./thread_local.h: In instantiation of ‘static T* dmlc::ThreadLocalStore<T>::Get() [with T = std::unordered_set<std::__cxx11::basic_string<char> >]’:
/home/piotr/mxnet_master/src/operator/contrib/./../../common/utils.h:461:28:   required from here
/home/piotr/mxnet_master/include/dmlc/./thread_local.h:46:15: error: cannot call member function ‘void dmlc::ThreadLocalStore<T>::RegisterDelete(T*) [with T = std::unordered_set<std::__cxx11::basic_string<char> >]’ without object
       Singleton()->RegisterDelete(ptr);
       ~~~~~~~~^~~~~
CMake Error at cuda_compile_1_generated_bounding_box.cu.o.Debug.cmake:279 (message):
  Error generating file
  /home/piotr/mxnet_master/build/CMakeFiles/cuda_compile_1.dir/src/operator/contrib/./cuda_compile_1_generated_bounding_box.cu.o
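
For context, here is a simplified sketch of the ThreadLocalStore pattern that nvcc appears to trip over when it is instantiated from a .cu file. This is not copied from dmlc-core; only the names that appear in the error message are real, the rest of the structure is assumed.

#include <mutex>
#include <vector>

// Simplified stand-in for dmlc::ThreadLocalStore<T>.
template <typename T>
class ThreadLocalStore {
 public:
  static T* Get() {
    static thread_local T* ptr = nullptr;
    if (ptr == nullptr) {
      ptr = new T();
      // nvcc from CUDA 10.1.105 reportedly rejects this call with
      // "cannot call member function ... without object".
      Singleton()->RegisterDelete(ptr);
    }
    return ptr;
  }

 private:
  static ThreadLocalStore<T>* Singleton() {
    static ThreadLocalStore<T> inst;
    return &inst;
  }
  void RegisterDelete(T* ptr) {
    std::lock_guard<std::mutex> lock(mutex_);
    data_.push_back(ptr);
  }
  ~ThreadLocalStore() {
    for (T* p : data_) delete p;
  }
  std::mutex mutex_;
  std::vector<T*> data_;
};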

To Reproduce

Building with the following config:

rev: ef56334...

./dev_menu.py build

--- # CMake configuration
USE_CUDA: "ON" # Build with CUDA support
USE_OLDCMAKECUDA: "OFF" # Build with old cmake cuda
USE_NCCL: "ON" # Use NVidia NCCL with CUDA
USE_OPENCV: "ON" # Build with OpenCV support
USE_OPENMP: "PLATFORM" # Build with Openmp support
USE_CUDNN: "ON" # Build with cudnn support) # one could set CUDNN_ROOT for search path
USE_SSE: "ON" # Build with x86 SSE instruction support IF NOT ARM
USE_F16C: "ON" # Build with x86 F16C instruction support) # autodetects support if "ON"
USE_LAPACK: "ON" # Build with lapack support
USE_MKL_IF_AVAILABLE: "OFF" # Use MKL if found
USE_MKLML_MKL: "OFF" # Use MKLDNN variant of MKL (if MKL found) IF USE_MKL_IF_AVAILABLE AND (NOT APPLE)
USE_MKLDNN: "OFF" # Use MKLDNN variant of MKL (if MKL found) IF USE_MKL_IF_AVAILABLE AND (NOT APPLE)
USE_OPERATOR_TUNING: "ON" # Enable auto-tuning of operators IF NOT MSVC
USE_GPERFTOOLS: "ON" # Build with GPerfTools support (if found)
USE_JEMALLOC: "ON" # Build with Jemalloc support
USE_DIST_KVSTORE: "OFF" # Build with DIST_KVSTORE support
USE_PLUGINS_WARPCTC: "OFF" # Use WARPCTC Plugins
USE_PLUGIN_CAFFE: "OFF" # Use Caffe Plugin
USE_CPP_PACKAGE: "OFF" # Build C++ Package
USE_MXNET_LIB_NAMING: "ON" # Use MXNet library naming conventions.
USE_GPROF: "OFF" # Compile with gprof (profiling) flag
USE_CXX14_IF_AVAILABLE: "OFF" # Build with C++14 if the compiler supports it
USE_VTUNE: "OFF" # Enable use of Intel Amplifier XE (VTune)) # one could set VTUNE_ROOT for search path
ENABLE_CUDA_RTC: "ON" # Build with CUDA runtime compilation support
BUILD_CPP_EXAMPLES: "ON" # Build cpp examples
INSTALL_EXAMPLES: "OFF" # Install the example source files.
USE_SIGNAL_HANDLER: "ON" # Print stack traces on segfaults.
USE_TENSORRT: "OFF" # Enable inference optimization with TensorRT.
USE_ASAN: "OFF" # Enable Clang/GCC ASAN sanitizers.
ENABLE_TESTCOVERAGE: "OFF" # Enable compilation with test coverage metric output
CMAKE_BUILD_TYPE: "Debug"
CMAKE_CUDA_COMPILER_LAUNCHER: "ccache"
CMAKE_C_COMPILER_LAUNCHER: "ccache"
CMAKE_CXX_COMPILER_LAUNCHER: "ccache"

Steps to reproduce

(Paste the commands you ran that produced the error.)

What have you tried to solve it?

Environment

We recommend using our script for collecting the diagnostic information. Run the following command and paste the outputs below:

http://ix.io/1ZL6

@larroy larroy added the Bug label Oct 24, 2019
@larroy larroy changed the title from "Compilation fails in master" to "Compilation fails in master Cuda 10.1 GCC 7.4 Ubuntu 18.04" on Oct 24, 2019
@larroy
Contributor Author

larroy commented Oct 24, 2019

Looks like it could be the commit after 91bb398

@anirudh2290

@ChaiBapchya
Contributor

ChaiBapchya commented Oct 24, 2019

Was able to build it successfully for ef56334 with the following build flags:

$ python -c "from mxnet.runtime import feature_list; print(feature_list())"

[ ✔ CUDA, ✔ CUDNN, ✖ NCCL, ✔ CUDA_RTC, ✖ TENSORRT, ✔ CPU_SSE, ✔ CPU_SSE2, ✔ CPU_SSE3, ✔ CPU_SSE4_1, ✔ CPU_SSE4_2, ✖ CPU_SSE4A, ✔ CPU_AVX, ✖ CPU_AVX2, ✔ OPENMP, ✖ SSE, ✔ F16C, ✖ JEMALLOC, ✔ BLAS_OPEN, ✖ BLAS_ATLAS, ✖ BLAS_MKL, ✖ BLAS_APPLE, ✖ LAPACK, ✖ MKLDNN, ✖ OPENCV, ✖ CAFFE, ✖ PROFILER, ✖ DIST_KVSTORE, ✖ CXX14, ✔ INT64_TENSOR_SIZE, ✔ SIGNAL_HANDLER, ✔ DEBUG, ✖ TVM_OP]

Something off because of NCCL?

@anirudh2290
Member

It's probably because of the gcc version not supporting the __thread construct. Looking into this.
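
For reference, __thread is the GCC/Clang extension keyword for thread-local storage that dmlc-core may use on GCC; a minimal, hypothetical illustration (not taken from the codebase):

// GCC/Clang extension keyword, predating C++11:
static __thread int per_thread_counter = 0;
// Standard C++11 equivalent:
static thread_local int portable_counter = 0;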

@anirudh2290
Member

anirudh2290 commented Oct 25, 2019

My earlier theory related to __thread was wrong. I am not able to reproduce it with:

mkdir build && cd build && cmake -DVERBOSE=1 -DUSE_CUDA=ON -DUSE_CUDNN=ON -DUSE_OPENMP=ON -DCMAKE_BUILD_TYPE=Debug -DUSE_DIST_KVSTORE=0 -DUSE_OPENCV=0 -DCUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda-10.1 -DCUDNN_ROOT=/usr/local/cuda-10.1 -DUSE_MKLDNN=0 -DUSE_MKL_IF_AVAILABLE=0 -DUSE_MKLML_MKL=0 -DUSE_ASAN=0 -GNinja -DUSE_OPERATOR_TUNING=1 -DUSE_CPP_PACKAGE=ON -DCUDA_ARCH_NAME=Auto -DUSE_INT64_TENSOR_SIZE=OFF -DUSE_TENSORRT=OFF -DUSE_NCCL=ON ..
ninja -v

Since CI passed without issues and my local build also passed with g++ 7.4 (Ubuntu 18.04, CUDA 10.1), I suspect some issue with your setup. Can you omit ccache and run the build directly? Did you do the submodule update?

@larroy
Contributor Author

larroy commented Oct 25, 2019

Could be related to nvcc:

piotr@54-198-120-41:0:~/mxnet_master ((ef5633448...))+$ /usr/local/cuda-10.1/bin/nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Fri_Feb__8_19:08:17_PST_2019
Cuda compilation tools, release 10.1, V10.1.105

My environment is here:

https://github.com/larroy/ec2_launch_scripts

Tried gcc 6 and it happened as well.

Also tried without ccache. The instance is pretty clean.

@anirudh2290 anirudh2290 added Build and removed Bug labels Oct 25, 2019
@anirudh2290
Member

I have removed the "Bug" label for now. This requires more evidence to be classified as an "MXNet" bug. I have requested an AMI from @larroy offline to reproduce the issue on the specific version of nvcc. So far I have tried to build with CUDA 10.0 and CUDA 10.1 on Ubuntu 18.04 with g++ 7.4 and haven't been able to reproduce it. Also, this issue wasn't reproduced on our supported environments checked by CI.

@larroy
Contributor Author

larroy commented Oct 30, 2019

Hi @anirudh2290 you can repro the environment in AWS EC2 with the following:

https://github.com/larroy/ec2_launch_scripts

Just execute launch.py; the whole environment is scripted and comes from NVidia. Let me know if you have any issues.

Would be great if we could fix this issue.

@anirudh2290
Member

Hi @larroy, I currently don't have the time to debug custom scripts and custom environments. You provided the CUDA version, gcc version, and Ubuntu version. I tried a cmake build with this configuration and haven't been able to reproduce the issue. Also, the issue has not been reproduced in the CI builds. With the current evidence, I strongly suspect the issue is specific to something in your environment.

Having said that, I can continue my work even if #16526 is reverted, though it may mean slightly more work for frontend developers building on top of #16654. So, if you can convince a committer about this revert, I won't block it. Also, if this is going to be reverted, a CI stage should be added in the future that would make #16526 fail the build.

@larroy
Contributor Author

larroy commented Nov 1, 2019

Thanks for your help @anirudh2290. I think this could be a bug in the NVCC that comes with Cuda 10.1.105, as it seems to work with 10.1.243.

@hubutui

hubutui commented Nov 4, 2019

I got a similar issue with ArchLinux, cuda 10.1.243, gcc 8.3.0, opencv 4.1.2. Here is my build log.

mxnet-buildlog.txt

@anirudh2290
Member

@hubutui Looks like your issue is unrelated. I don't see an issue related to ThreadLocalStore in your log.

@DickJC123
Contributor

Yes, I believe this is a problem present in the original cuda 10.1 release (10.1.105), fixed by 10.1 Update 1 (10.1.168). Are you able to upgrade at least to this version, or are we looking for a work-around for 10.1.105?

@larroy
Contributor Author

larroy commented Nov 4, 2019

I was able to upgrade and the problem went away with the updated CUDA.

@larroy larroy changed the title from "Compilation fails in master Cuda 10.1 GCC 7.4 Ubuntu 18.04" to "Compilation fails in master Cuda 10.1.105 GCC 7.4 Ubuntu 18.04" on Nov 4, 2019
@DickJC123
Contributor

And FYI, if you feel it's worth trying to correct this for MXNet users on the original cuda 10.1, the fix to the problematic line in dmlc-core is:

      // nvcc fails to compile 'Singleton()->' on first cuda 10.1 release, fixed with update 1.
      (*Singleton()).RegisterDelete(ptr);

Worth a PR?
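
For reference, continuing the simplified sketch of ThreadLocalStore from the issue description above, Get() in dmlc-core's include/dmlc/thread_local.h with the workaround applied would read roughly as follows (only the marked call is the actual change; the surrounding lines are paraphrased):

  static T* Get() {
    static thread_local T* ptr = nullptr;
    if (ptr == nullptr) {
      ptr = new T();
      // was: Singleton()->RegisterDelete(ptr);
      // nvcc fails to compile 'Singleton()->' on first cuda 10.1 release, fixed with update 1.
      (*Singleton()).RegisterDelete(ptr);
    }
    return ptr;
  }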

@larroy
Contributor Author

larroy commented Nov 5, 2019

I think it would be user-friendly to spare users obscure compilation errors if we can avoid them. In other words, I think it would be best to open a PR.

@anirudh2290
Member

I agree, it would be worth opening a PR to dmlc-core. Thanks @DickJC123 !
