py-faster-rcnn build Issue. In the making process. (make -j8 && make pycaffe) #509

Closed
yuanzhenjie opened this issue Mar 2, 2017 · 16 comments

Comments

@yuanzhenjie

yuanzhenjie commented Mar 2, 2017

Could anybody help me fix this? Thanks a lot.

I have replaced the cuDNN files with the new version.
CUDA 8.0, cuDNN v5

[root@dl-gpu caffe-fast-rcnn]# make -j8 && make pycaffe 
CXX/LD -o .build_release/tools/caffe.bin
CXX/LD -o .build_release/tools/extract_features.bin
CXX/LD -o .build_release/examples/cpp_classification/classification.bin
/usr/bin/ld: warning: libhdf5_hl.so.8, needed by /usr/local/lib/libcaffe.so, may conflict with libhdf5_hl.so.10
/usr/bin/ld: warning: libhdf5.so.8, needed by /usr/local/lib/libcaffe.so, may conflict with libhdf5.so.10
.build_release/tools/extract_features.o: In function ‘int feature_extraction_pipeline<float>(int, char**)’:
extract_features.cpp:(.text._Z27feature_extraction_pipelineIfEiiPPc[_Z27feature_extraction_pipelineIfEiiPPc]+0xe6): undefined reference to ‘caffe::Net<float>::Net(std::string const&, caffe::Phase, caffe::Net<float> const*)’
collect2: error: ld returned 1 exit status
make: *** [.build_release/tools/extract_features.bin] Error 1
make: *** Waiting for unfinished jobs....
/usr/bin/ld: warning: libhdf5_hl.so.8, needed by /usr/local/lib/libcaffe.so, may conflict with libhdf5_hl.so.10
/usr/bin/ld: warning: libhdf5.so.8, needed by /usr/local/lib/libcaffe.so, may conflict with libhdf5.so.10
.build_release/examples/cpp_classification/classification.o: In function ‘Classifier::Classifier(std::string const&, std::string const&, std::string const&, std::string const&)’:
classification.cpp:(.text+0x29a1): undefined reference to ‘caffe::Net<float>::Net(std::string const&, caffe::Phase, caffe::Net<float> const*)’
collect2: error: ld returned 1 exit status
make: *** [.build_release/examples/cpp_classification/classification.bin] Error 1
/usr/bin/ld: warning: libhdf5_hl.so.8, needed by /usr/local/lib/libcaffe.so, may conflict with libhdf5_hl.so.10
/usr/bin/ld: warning: libhdf5.so.8, needed by /usr/local/lib/libcaffe.so, may conflict with libhdf5.so.10
.build_release/tools/caffe.o: In function ‘test()’:
caffe.cpp:(.text+0xdad): undefined reference to ‘caffe::Net<float>::Net(std::string const&, caffe::Phase, caffe::Net<float> const*)’
.build_release/tools/caffe.o: In function ‘train()’:
caffe.cpp:(.text+0x1a8f): undefined reference to ‘caffe::P2PSync<float>::P2PSync(boost::shared_ptr<caffe::Solver<float> >, caffe::P2PSync<float>*, caffe::SolverParameter const&)’
caffe.cpp:(.text+0x1aae): undefined reference to ‘caffe::P2PSync<float>::run(std::vector<int, std::allocator<int> > const&)’
caffe.cpp:(.text+0x1ab6): undefined reference to ‘caffe::P2PSync<float>::~P2PSync()’
caffe.cpp:(.text+0x203f): undefined reference to ‘caffe::P2PSync<float>::~P2PSync()’
caffe.cpp:(.text+0x20e3): undefined reference to ‘caffe::P2PSync<float>::~P2PSync()’
.build_release/tools/caffe.o: In function ‘time()’:
caffe.cpp:(.text+0x22be): undefined reference to ‘caffe::Net<float>::Net(std::string const&, caffe::Phase, caffe::Net<float> const*)’
caffe.cpp:(.text+0x25cc): undefined reference to ‘caffe::Layer<float>::Lock()’
caffe.cpp:(.text+0x26d1): undefined reference to ‘caffe::Layer<float>::Unlock()’
collect2: error: ld returned 1 exit status
make: *** [.build_release/tools/caffe.bin] Error 1
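
One detail in the log above is worth isolating: every linker warning names /usr/local/lib/libcaffe.so, which suggests the tools are being linked against a previously installed libcaffe.so rather than the one built in this tree. That is an inference from the log, not something confirmed in this thread. A minimal Python sketch (the function name is hypothetical) that pulls those warnings out of a saved build log:

```python
import re

# Matches GNU ld warnings of the form:
#   /usr/bin/ld: warning: A, needed by B, may conflict with C
_WARNING = re.compile(
    r"warning: (?P<needed>\S+), needed by (?P<by>\S+), may conflict with (?P<other>\S+)"
)


def find_library_conflicts(log_text):
    """Return a sorted list of unique (needed, needed_by, conflicts_with) tuples."""
    found = {
        (m.group("needed"), m.group("by"), m.group("other"))
        for m in _WARNING.finditer(log_text)
    }
    return sorted(found)


# Two warning lines copied from the build log in this post.
sample = """\
/usr/bin/ld: warning: libhdf5_hl.so.8, needed by /usr/local/lib/libcaffe.so, may conflict with libhdf5_hl.so.10
/usr/bin/ld: warning: libhdf5.so.8, needed by /usr/local/lib/libcaffe.so, may conflict with libhdf5.so.10
"""

for needed, needed_by, other in find_library_conflicts(sample):
    print(needed_by, "wants", needed, "but the link also pulls in", other)
```

If the reported path is not inside the current build directory, removing or renaming the old installed library before rebuilding is one thing to try.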

The Makefile.config file:

## Refer to http://caffe.berkeleyvision.org/installation.html
# Contributions simplifying and improving our build system are welcome!

# cuDNN acceleration switch (uncomment to build with cuDNN).
 USE_CUDNN := 1

# CPU-only switch (uncomment to build without GPU support).
# CPU_ONLY := 1

# uncomment to disable IO dependencies and corresponding data layers
# USE_OPENCV := 0
# USE_LEVELDB := 0
# USE_LMDB := 0

# uncomment to allow MDB_NOLOCK when reading LMDB files (only if necessary)
#       You should not set this flag if you will be reading LMDBs with any
#       possibility of simultaneous read and write
# ALLOW_LMDB_NOLOCK := 1

# Uncomment if you're using OpenCV 3
# OPENCV_VERSION := 3

# To customize your choice of compiler, uncomment and set the following.
# N.B. the default for Linux is g++ and the default for OSX is clang++
# CUSTOM_CXX := g++

# CUDA directory contains bin/ and lib/ directories that we need.
CUDA_DIR := /usr/local/cuda
# On Ubuntu 14.04, if cuda tools are installed via
# "sudo apt-get install nvidia-cuda-toolkit" then use this instead:
# CUDA_DIR := /usr

# CUDA architecture setting: going with all of them.
# For CUDA < 6.0, comment the *_50 through *_61 lines for compatibility.
# For CUDA < 8.0, comment the *_60 and *_61 lines for compatibility.
CUDA_ARCH := -gencode arch=compute_20,code=sm_20 \
                -gencode arch=compute_20,code=sm_21 \
                -gencode arch=compute_30,code=sm_30 \
                -gencode arch=compute_35,code=sm_35 \
                -gencode arch=compute_50,code=sm_50 \
                -gencode arch=compute_52,code=sm_52 \
                -gencode arch=compute_60,code=sm_60 \
                -gencode arch=compute_61,code=sm_61 \
                -gencode arch=compute_61,code=compute_61

# BLAS choice:
# atlas for ATLAS (default)
# mkl for MKL
# open for OpenBlas
BLAS := atlas
# Custom (MKL/ATLAS/OpenBLAS) include and lib directories.
# Leave commented to accept the defaults for your choice of BLAS
# (which should work)!
 BLAS_INCLUDE := /usr/include/atlas
#/path/to/your/blas
 BLAS_LIB := /usr/lib64/atlas
#/path/to/your/blas
# Homebrew puts openblas in a directory that is not on the standard search path
# BLAS_INCLUDE := $(shell brew --prefix openblas)/include
# BLAS_LIB := $(shell brew --prefix openblas)/lib

# This is required only if you will compile the matlab interface.
# MATLAB directory should contain the mex binary in /bin.
# MATLAB_DIR := /usr/local
# MATLAB_DIR := /Applications/MATLAB_R2012b.app

# NOTE: this is required only if you will compile the python interface.
# We need to be able to find Python.h and numpy/arrayobject.h.
#PYTHON_INCLUDE := /usr/include/python2.7 \
                /usr/lib64/python2.7/site-packages/numpy/core/include/numpy/
#/usr/lib64/python2.7/site-packages/numpy/core/include/numpy/
#/usr/lib/python2.7/dist-packages/numpy/core/include
# Anaconda Python distribution is quite popular. Include path:
# Verify anaconda location, sometimes it's in root.
# ANACONDA_HOME := $(HOME)/anaconda
ANACONDA_HOME := /data/software/anaconda2
PYTHON_INCLUDE := $(ANACONDA_HOME)/include \
                  $(ANACONDA_HOME)/include/python2.7 \
                  $(ANACONDA_HOME)/lib/python2.7/site-packages/numpy/core/include
# Uncomment to use Python 3 (default is Python 2)
# PYTHON_LIBRARIES := boost_python3 python3.5m
# PYTHON_INCLUDE := /usr/include/python3.5m \
#                 /usr/lib/python3.5/dist-packages/numpy/core/include

# We need to be able to find libpythonX.X.so or .dylib.
#PYTHON_LIB := /usr/lib64
#/usr/lib
 PYTHON_LIB := $(ANACONDA_HOME)/lib

# Homebrew installs numpy in a non standard path (keg only)
# PYTHON_INCLUDE += $(dir $(shell python -c 'import numpy.core; print(numpy.core.__file__)'))/include
# PYTHON_LIB += $(shell brew --prefix numpy)/lib

# Uncomment to support layers written in Python (will link against Python libs)
 WITH_PYTHON_LAYER := 1

# Whatever else you find you need goes here.
INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include
LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib

# If Homebrew is installed at a non standard location (for example your home directory) and you use it for general dependencies
# INCLUDE_DIRS += $(shell brew --prefix)/include
# LIBRARY_DIRS += $(shell brew --prefix)/lib

# NCCL acceleration switch (uncomment to build with NCCL)
# https://github.com/NVIDIA/nccl (last tested version: v1.2.3-1+cuda8.0)
# USE_NCCL := 1

# Uncomment to use `pkg-config` to specify OpenCV library paths.
# (Usually not necessary -- OpenCV libraries are normally installed in one of the above $LIBRARY_DIRS.)
# USE_PKG_CONFIG := 1

# N.B. both build and distribute dirs are cleared on `make clean`
BUILD_DIR := build
DISTRIBUTE_DIR := distribute

# Uncomment for debugging. Does not work on OSX due to https://github.com/BVLC/caffe/issues/171
# DEBUG := 1

# The ID of the GPU that 'make runtest' will use to run unit tests.
TEST_GPUID := 0

# enable pretty build (comment to see full commands)
Q ?= @

@Microos

Microos commented Mar 4, 2017

To enable support for cuDNN v5:

1) Merge upstream Caffe into the submodule:
cd caffe-fast-rcnn
git remote add caffe https://github.com/BVLC/caffe.git
git fetch caffe
git merge -X theirs caffe/master

2) After merging, remove the following line from include/caffe/layers/python_layer.hpp:
self_.attr("phase") = static_cast<int>(this->phase_);

3) Build (in caffe-fast-rcnn):
- mkdir build && cd build
- cmake ..
- make all -j8
- make pycaffe -j8

@yxchng

yxchng commented Mar 8, 2017

@Microos Why can't I import caffe in my Python environment anymore after following your steps?

@Data-drone

Most likely you need to check your path variables.

@yuanzhenjie
Author

Thanks, all.
The environment for this issue:
CentOS Linux release 7.1.1503 (Core)
CUDA 8.0
cuDNN v5
Python 2.7.12 |Anaconda 4.2.0 (64-bit)

I cleaned everything and cloned again (py-faster-rcnn and its submodule caffe-fast-rcnn), as Microos described but without pulling in BVLC/caffe.
I changed the ATLAS library names (sudo ln -sv libsatlas.so.3.10 libcblas.so; sudo ln -sv libsatlas.so.3.10 libatlas.so). The ATLAS library has been renamed, so new symlinks are needed.
I used the Anaconda Python as the main Python path and ran "conda upgrade".
I did not use cmake; I used make -j8 && make pycaffe (cmake is not the official way).
Finally, py-faster-rcnn built successfully.

@yuanzhenjie
Author

Also, the following cuDNN-related files in py-faster-rcnn were replaced with the newer versions from BVLC/caffe:
/root/py-faster-rcnn/caffe-fast-rcnn/include/caffe/util/cudnn.hpp

include/caffe/layers/cudnn_relu_layer.hpp, src/caffe/layers/cudnn_relu_layer.cpp, src/caffe/layers/cudnn_relu_layer.cu

include/caffe/layers/cudnn_sigmoid_layer.hpp, src/caffe/layers/cudnn_sigmoid_layer.cpp, src/caffe/layers/cudnn_sigmoid_layer.cu

include/caffe/layers/cudnn_tanh_layer.hpp, src/caffe/layers/cudnn_tanh_layer.cpp, src/caffe/layers/cudnn_tanh_layer.cu

@Microos

Microos commented Mar 16, 2017

@yxchng Did you add $CAFFE_ROOT/python to your $PYTHONPATH?

@weiaicunzai

@Microos Thanks, it works for me. By the way, why does this approach solve the problem?

@Queuecumber
Contributor

Someone should probably include in the top-level readme the specific versions of CUDA/cuDNN that this is intended to work with, or instructions on how to update it, or at the very least link this issue.

@ericxian1997

@Microos I have added $CAFFE_ROOT/python to my $PYTHONPATH and Python can now find the caffe package, but it still tells me that it cannot import _caffe: ImportError: No module named _caffe
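
ImportError: No module named _caffe usually means the compiled extension is missing from the caffe package directory, for example because make pycaffe did not complete. A minimal sketch of that check, assuming the Makefile build layout where the extension ends up at $CAFFE_ROOT/python/caffe/_caffe.so (the helper name is hypothetical):

```python
import os


def diagnose_pycaffe(caffe_root):
    """Return a short message describing what is missing for `import caffe`."""
    pkg_dir = os.path.join(caffe_root, "python", "caffe")
    if not os.path.isdir(pkg_dir):
        return "no python/caffe package under " + caffe_root
    # The compiled extension is named _caffe plus a shared-library suffix.
    has_ext = any(
        name.startswith("_caffe.") and name.endswith((".so", ".pyd"))
        for name in os.listdir(pkg_dir)
    )
    if not has_ext:
        return "python/caffe exists but _caffe is not built; run make pycaffe"
    return "ok: add " + os.path.join(caffe_root, "python") + " to PYTHONPATH"


if __name__ == "__main__":
    # Falls back to the current directory when CAFFE_ROOT is not set.
    print(diagnose_pycaffe(os.environ.get("CAFFE_ROOT", ".")))
```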

@antran89

What does

git merge -X theirs caffe/master

mean? Can anyone explain?

@selinachenxi

@antran89 Just run the command. It will open an editor for the merge commit message; you can write something or simply close it. Everything will be fine.
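
For context on the question above: in git merge -X theirs, the option only affects hunks that both sides changed in conflicting ways. Those are resolved in favor of the branch being merged in (here caffe/master), while non-conflicting changes from either side are still combined as usual. The toy model below illustrates that rule in Python. It is not how git is implemented, and the hunk names and contents are made up.

```python
def merge_prefer_theirs(base, ours, theirs):
    """Three-way merge over named hunks, resolving conflicts in favor of `theirs`."""
    merged = {}
    for key in set(base) | set(ours) | set(theirs):
        b, o, t = base.get(key), ours.get(key), theirs.get(key)
        if o == t:        # both sides agree (or neither side changed)
            value = o
        elif o == b:      # only theirs changed
            value = t
        elif t == b:      # only ours changed
            value = o
        else:             # both changed differently: "-X theirs" takes theirs
            value = t
        if value is not None:
            merged[key] = value
    return merged


# Illustrative contents only; these are not the real file contents.
base = {"cudnn.hpp": "old API", "caffe.proto": "schema", "README": "text"}
ours = {"cudnn.hpp": "old API", "caffe.proto": "schema + fork fields", "README": "text"}
theirs = {"cudnn.hpp": "new API", "caffe.proto": "schema + upstream fields", "README": "text"}

print(merge_prefer_theirs(base, ours, theirs))
```

In this model the fork's edit to caffe.proto is dropped because upstream touched the same hunk, which is how fork-specific changes can be lost with this option.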

@acmiyaguchi

In case anyone is looking at this issue (since it is linked to from the README), I have some updated instructions for building this package against the latest CUDA on Ubuntu 17.10.

A copy of these instructions can be found in this gist.

Instructions

  • install cuda-9.0 (https://developer.nvidia.com/cuda-toolkit-archive)
  • install cudnn-7 (https://developer.nvidia.com/cudnn)
  • install dependencies for caffe
    • see the caffe installation guide [1] and wiki page [2]
  • install boost-1.65 or higher (http://www.boost.org/users/history/version_1_65_1.html) [3]
  • update caffe-fast-rcnn
    • Easiest way is to use a rebased branch on BVLC/master. See the notes below if you would rather merge the changes yourself.
    $ cd $FRCN_ROOT/caffe-fast-rcnn
    $ git remote add acmiyaguchi https://github.com/acmiyaguchi/caffe-fast-rcnn.git
    $ git fetch acmiyaguchi
    $ git checkout acmiyaguchi/faster-rcnn-rebased
    
    • optional: rebase against upstream master
    $ git remote add caffe https://github.com/BVLC/caffe.git
    $ git fetch caffe
    $ git rebase caffe/master
    
  • set variables in Makefile.config
    • USE_CUDNN := 1
    • WITH_PYTHON_LAYER := 1
    • OPENCV_VERSION := 3
    • CUSTOM_CXX := g++-6
      • I symlink gcc-6 and g++-6 to gcc and g++ in /usr/local/cuda-9-0/bin/ instead of this setting, but either should work
    • remove compute_20 from CUDA_ARCH since it is no longer supported in cuda-9.0
    • add the hdf5 serial paths for linking since the directories were renamed in Ubuntu 17.10
      • INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/include/hdf5/serial/
      • LIBRARY_DIRS := $(PYTHON_LIB) /usr/lib/x86_64-linux-gnu/hdf5/serial/
  • build caffe
    • make -j${N_THREADS} all && make pycaffe
  • change self.param_str_ to self.param_str in $FRCN_ROOT/lib/rpn/proposal_layer.py.
    • see rebased-caffe.patch below for the full diff
  • validate that ./tools/demo.py works
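
The CUDA_ARCH change in the list above can be scripted. A minimal sketch, assuming CUDA_ARCH is written as a backslash-continued list of -gencode flags as in the stock Makefile.config. The function name is hypothetical, and the code only transforms text; it does not touch any file.

```python
import re

# One "-gencode arch=...,code=..." flag.
GENCODE = re.compile(r"-gencode\s+arch=(\w+),code=(\w+)")


def drop_arch(cuda_arch_text, unwanted="compute_20"):
    """Rebuild the CUDA_ARCH assignment without flags whose arch is `unwanted`."""
    kept = [
        (arch, code)
        for arch, code in GENCODE.findall(cuda_arch_text)
        if arch != unwanted
    ]
    flags = ["-gencode arch=%s,code=%s" % pair for pair in kept]
    return "CUDA_ARCH := " + " \\\n                ".join(flags)


# Shortened copy of the CUDA_ARCH block from a Makefile.config.
old = """CUDA_ARCH := -gencode arch=compute_20,code=sm_20 \\
                -gencode arch=compute_20,code=sm_21 \\
                -gencode arch=compute_30,code=sm_30 \\
                -gencode arch=compute_61,code=sm_61 \\
                -gencode arch=compute_61,code=compute_61"""

print(drop_arch(old))
```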

Notes on Rebasing

BVLC/caffe and rbgirshick/caffe have diverged since this workaround [4] in March 2017. I've rebased the faster-rcnn patchset against the master caffe branch for convenience. [5]

This branch should merge cleanly into BVLC/caffe with the exception of the license.

If you would rather merge the upstream changes yourself, the patch in [6] should outline the changes you need afterwards. -X theirs may not be the best merge policy. In particular, upstream changes to the src/caffe/caffe.proto schema will overwrite the ROIPoolingParameter.

rebased-caffe.patch

diff --git a/lib/rpn/proposal_layer.py b/lib/rpn/proposal_layer.py
index b157160..e6d28cc 100644
--- a/lib/rpn/proposal_layer.py
+++ b/lib/rpn/proposal_layer.py
@@ -23,7 +23,7 @@ class ProposalLayer(caffe.Layer):
 
     def setup(self, bottom, top):
         # parse the layer parameter string, which must be valid YAML
-        layer_params = yaml.load(self.param_str_)
+        layer_params = yaml.load(self.param_str)
 
         self._feat_stride = layer_params['feat_stride']
         anchor_scales = layer_params.get('scales', (8, 16, 32))
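
If the same proposal_layer.py has to run against both the original caffe-fast-rcnn (which exposes param_str_) and a rebased Caffe (which exposes param_str), a small fallback avoids editing the file for each checkout. This is a sketch, not part of the patch above; get_param_str and the stand-in classes are hypothetical.

```python
def get_param_str(layer):
    """Return the layer parameter string under either attribute name."""
    if hasattr(layer, "param_str"):
        return layer.param_str    # attribute name after the rebase
    return layer.param_str_       # attribute name in the original fork


# Stand-ins for caffe.Layer instances, used only to exercise the helper.
class NewStyleLayer(object):
    param_str = "'feat_stride': 16"


class OldStyleLayer(object):
    param_str_ = "'feat_stride': 16"


print(get_param_str(NewStyleLayer()) == get_param_str(OldStyleLayer()))
```

In setup(), the helper would replace the direct attribute access, for example layer_params = yaml.load(get_param_str(self)).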

@hualei-hualei

@acmiyaguchi Thank you very much! This helps to solve my problems!

@guruvasuraj

Hi,
I have issues running ./demo.py with cuda-9.2 and cuDNN-7.1.2. Although I did not hit any issues while compiling the tests, running any program under the tools directory gives me the following error:

Traceback (most recent call last):
  File "run_face_detection.py", line 9, in <module>
    from fast_rcnn.test import im_detect
  File "/home/wiz/GPU_BACKUP/Face_speech_processor/Face_Detection/face-py-faster-rcnn/tools/../lib/fast_rcnn/test.py", line 17, in <module>
    from fast_rcnn.nms_wrapper import nms
  File "/home/wiz/GPU_BACKUP/Face_speech_processor/Face_Detection/face-py-faster-rcnn/tools/../lib/fast_rcnn/nms_wrapper.py", line 9, in <module>
    from nms.gpu_nms import gpu_nms
ImportError: libcudart.so.8.0: cannot open shared object file: No such file or directory

I am not sure why it is looking for cuda-8.0. Is it mandatory to have cuda 8.0? Please advise.

Regards,
Guru.
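
The traceback points at nms.gpu_nms, a compiled extension under lib/, asking for libcudart.so.8.0. One plausible reading (an assumption, not confirmed in this thread) is that the extension was built while CUDA 8.0 was installed and has not been rebuilt since. The sketch below lists unresolved libraries from ldd output; the function name is hypothetical and the addresses in the sample are made up.

```python
def unresolved_libraries(ldd_output):
    """Return library names that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        name, sep, target = line.strip().partition(" => ")
        if sep and target.strip() == "not found":
            missing.append(name)
    return missing


# Sample text in the format printed by `ldd <shared object>`.
sample = """\
\tlinux-vdso.so.1 (0x00007ffd5a9f2000)
\tlibcudart.so.8.0 => not found
\tlibc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f3c1a2b0000)
"""

print(unresolved_libraries(sample))
```

To feed it real data, capture the output of ldd on the gpu_nms shared object under lib/nms/. If libcudart.so.8.0 shows up as not found, rebuilding the extensions in lib/ against the installed CUDA version is the usual next step.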

@cewee

cewee commented Aug 20, 2018

Thanks to @acmiyaguchi, I was able to create a Dockerfile that sets up a working version.
It isn't pretty, but it might be useful for someone.

https://gist.github.com/cewee/356b941a4006a502a67f68213f1a76b5

it's a modified version of:
https://hub.docker.com/r/duydv/caffe-faster-rcnn-cuda/~/dockerfile/

@navidre

navidre commented Mar 23, 2020

Thank you @cewee.

It worked. I had to run the Docker container with the following command:

sudo docker run --gpus 8 -v /dev/null:/dev/raw1394 -it 242bdecfa254  /bin/bash

where 242bdecfa254 is the image ID and you can specify the number of GPUs.

The other thing is that the models can be downloaded with the following steps (original path has an issue):

cd data
wget https://dl.dropboxusercontent.com/s/o6ii098bu51d139/faster_rcnn_models.tgz 
tar zxvf faster_rcnn_models.tgz
