This is a Caffe implementation of a hash network (DNNH/NINH) for similarity-based visual research.
The hash network is based on this paper: Hanjiang Lai, Yan Pan, Ye Liu, and Shuicheng Yan. Simultaneous feature learning and hash coding with deep neural networks, CVPR 2015.
For more details about the motivation, approach, implementation, results and analysis, and possible improvements, please read my post. Any feedback is welcome!
- Deploy: Given the definition of the loss layer, deploy the deep hashing pipeline on Linux.
- Train: Write prototxt files to define DNNH and bash scripts to run training on the preprocessed triplet CIFAR-10 dataset.
- Test/Evaluate: Write prototxt files to encode images and bash scripts to run image retrieval. Implement the mean average precision (mAP) metric for evaluation.
- Analysis: Plot the performance of 12-bit, 24-bit and 48-bit hash codes and analyze the results.
- Presentation: Prepare slides to present my work.
Hash training needs triplet data input. Here I use the triplet CIFAR-10 dataset. To obtain it:
- You can directly download the related zip file cifar_hash_dataset.7z from BaiduYun or OneDrive and extract it into caffe-dnnh\runtime\cifar_hash_dataset.
- Or you can process the data yourself. Scripts are provided for reference in caffe-dnnh\runtime\cifar_hash_dataset_process_scripts\.
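To illustrate what "triplet data input" means, here is a minimal sketch of drawing (query, similar, dissimilar) triplets from class labels. It is not the provided preprocessing script: the `Triplet` struct, the `sample_triplets` helper and the uniform random sampling strategy are all hypothetical and only show the idea that the similar image shares the query's label while the dissimilar one does not.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// One training triplet, stored as indices into the image list.
struct Triplet {
    std::size_t query;     // anchor image
    std::size_t positive;  // different image with the same label as the query
    std::size_t negative;  // image with a different label
};

// Hypothetical helper: draw `count` random triplets from class labels.
// Assumes at least two classes and at least one class with two images;
// otherwise no valid triplet exists and the loop would never finish.
std::vector<Triplet> sample_triplets(const std::vector<int>& labels,
                                     std::size_t count, unsigned seed) {
    assert(!labels.empty());
    std::mt19937 rng(seed);
    std::uniform_int_distribution<std::size_t> pick(0, labels.size() - 1);
    std::vector<Triplet> triplets;
    while (triplets.size() < count) {
        const std::size_t q = pick(rng);
        const std::size_t p = pick(rng);
        const std::size_t n = pick(rng);
        // Keep the draw only if p is another image of the query's class
        // and n belongs to a different class.
        if (p != q && labels[p] == labels[q] && labels[n] != labels[q]) {
            triplets.push_back({q, p, n});
        }
    }
    return triplets;
}
```

For example, `sample_triplets(labels, 1000, 42)` would return 1000 index triplets for a label vector that holds one CIFAR-10 class id per image.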
You may directly download my caffe-dnnh zip and deploy it (you may need to fix errors caused by differences in environment and version). Alternatively, follow the instructions below to add the files and contents to the newest Caffe release. Here CAFFE-ROOT refers to your root Caffe directory and caffe-dnnh to mine.
1. Add file caffe-dnnh/src/caffe/layers/triplet_ranking_hinge_loss_layer.cpp to path CAFFE-ROOT/src/caffe/layers and file caffe-dnnh/include/caffe/layers/triplet_ranking_hinge_loss_layer.hpp to path CAFFE-ROOT/include/caffe/layers.
2. Modify file CAFFE-ROOT/src/caffe/proto/caffe.proto:
   - Add the following code directly.
     // Message that stores parameters used by TripletRankingHingeLossLayer
     message TripletRankingHingeLossParameter {
       // Dimension for computing
       optional int32 dim = 1 [default = 10];
       // Margin
       optional float margin = 2 [default = 1];
     }
   - Find message LayerParameter and add optional TripletRankingHingeLossParameter triplet_ranking_hinge_loss_param = 151; in it.
   - Find message V1LayerParameter and add optional TripletRankingHingeLossParameter triplet_ranking_hinge_loss_param = 43; in it.
   - Find enum LayerType in message V1LayerParameter and add TRIPLET_RANKING_HINGE_LOSS=40; in it.
Attention: the numbers above, such as 151 and 43, are IDs and must not conflict with existing ones. Search for "next available" in caffe.proto and you will find comments like // SolverParameter next available ID: 42 (last added: layer_wise_reduce) and // LayerParameter next available layer-specific ID: 147 (last added: recurrent_param). Use the next available ID and update the comment.
3. Add folder caffe-dnnh/runtime to path CAFFE-ROOT/.
4. Modify file CAFFE-ROOT/tools/caffe.cpp with reference to caffe-dnnh/tools/caffe.cpp: search for ++++++++++ in caffe-dnnh/tools/caffe.cpp and you will find what I added.
Attention: For CPU/GPU mode switch
- Check CPU_ONLY := 1 in CAFFE-ROOT/Makefile.config.
- In folder CAFFE-ROOT/runtime/: check solver_mode: GPU in all solver.prototxt files (e.g. CAFFE-ROOT/runtime/12bit/train12_solver.prototxt), and check -gpu=0 in all run_test.sh files (e.g. CAFFE-ROOT/runtime/12bit/run_test.sh).
Then follow the official Installation instructions to compile. Good luck!
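To make the role of the margin parameter of TripletRankingHingeLossParameter more concrete, here is a minimal sketch of a triplet ranking hinge loss for a single triplet, in the form max(0, margin + ||q - p||^2 - ||q - n||^2) used in the DNNH paper. It is not the code of triplet_ranking_hinge_loss_layer.cpp: the function names are hypothetical, no gradient is computed, and treating the vector length as the layer's dim is my assumption. The default margin of 1 mirrors the proto default.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Squared Euclidean distance between two equally sized feature vectors.
float squared_distance(const std::vector<float>& a,
                       const std::vector<float>& b) {
    assert(a.size() == b.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const float d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

// Hypothetical sketch of the loss for one (query, positive, negative) triplet:
//   max(0, margin + ||query - positive||^2 - ||query - negative||^2)
// The loss is zero once the negative is farther from the query than the
// positive by at least `margin` (in squared distance).
float triplet_ranking_hinge_loss(const std::vector<float>& query,
                                 const std::vector<float>& positive,
                                 const std::vector<float>& negative,
                                 float margin = 1.0f) {
    const float d_pos = squared_distance(query, positive);
    const float d_neg = squared_distance(query, negative);
    return std::max(0.0f, margin + d_pos - d_neg);
}
```

A triplet whose dissimilar image is already far enough from the query contributes no loss, so training effort concentrates on triplets that still violate the ranking.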
cd caffe-dnnh/runtime/12bit # or: 24bit, 48bit
sh ./run_train.sh # or: sh ./resume_train.sh
run_train.sh trains the deep hash neural network defined in the prototxt files, and the resulting models are stored in caffe-dnnh/runtime/model. You can modify parameters such as the maximum number of iterations and the snapshot interval in the solver prototxt. Note that tens of thousands of iterations take time, so it is recommended to train in GPU mode in the background, e.g. nohup sh ./run_train.sh &, and to check the output with tail -100 nohup.out. Read the corresponding files for more details.
cd caffe-dnnh/runtime/12bit # or: 24bit, 48bit
sh ./run_test.sh
run_test.sh uses the forward pass of the DNNH defined in test12_query.prototxt and test12_pool.prototxt to encode the query images and the pool set images. It then compiles and runs CAFFE-ROOT/runtime/evaluate_map.cpp for image retrieval evaluation. You can modify parameters (e.g. ITER in run_test.sh and top_neighbor_num in evaluate_map.cpp). Read the corresponding files for more details.
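As a rough picture of the evaluation step, the sketch below ranks pool images by Hamming distance to each query code and computes mean average precision over the top results. It is not the content of evaluate_map.cpp: all function names are hypothetical, codes are assumed to be already binarized and packed into 64-bit integers, `top_k` only plays a role similar to top_neighbor_num, and dividing by the number of relevant items found in the top results is one common convention that the actual file may not follow.

```cpp
#include <algorithm>
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Hamming distance between two hash codes packed into 64-bit words
// (wide enough for 12-, 24- and 48-bit codes).
int hamming_distance(std::uint64_t a, std::uint64_t b) {
    return static_cast<int>(std::bitset<64>(a ^ b).count());
}

// Average precision of one query over the `top_k` nearest pool items.
double average_precision(std::uint64_t query_code, int query_label,
                         const std::vector<std::uint64_t>& pool_codes,
                         const std::vector<int>& pool_labels,
                         std::size_t top_k) {
    assert(pool_codes.size() == pool_labels.size());
    std::vector<std::size_t> order(pool_codes.size());
    std::iota(order.begin(), order.end(), std::size_t{0});
    // Stable sort keeps the pool order among ties, so results are deterministic.
    std::stable_sort(order.begin(), order.end(),
                     [&](std::size_t i, std::size_t j) {
                         return hamming_distance(query_code, pool_codes[i]) <
                                hamming_distance(query_code, pool_codes[j]);
                     });
    const std::size_t limit = std::min(top_k, order.size());
    double hits = 0.0;
    double precision_sum = 0.0;
    for (std::size_t rank = 0; rank < limit; ++rank) {
        if (pool_labels[order[rank]] == query_label) {
            hits += 1.0;
            precision_sum += hits / static_cast<double>(rank + 1);
        }
    }
    // Assumed convention: normalize by the relevant items found in the top_k.
    return hits > 0.0 ? precision_sum / hits : 0.0;
}

// Mean of the per-query average precision values.
double mean_average_precision(const std::vector<std::uint64_t>& query_codes,
                              const std::vector<int>& query_labels,
                              const std::vector<std::uint64_t>& pool_codes,
                              const std::vector<int>& pool_labels,
                              std::size_t top_k) {
    assert(query_codes.size() == query_labels.size());
    if (query_codes.empty()) return 0.0;
    double total = 0.0;
    for (std::size_t i = 0; i < query_codes.size(); ++i) {
        total += average_precision(query_codes[i], query_labels[i],
                                   pool_codes, pool_labels, top_k);
    }
    return total / static_cast<double>(query_codes.size());
}
```

Sorting the whole pool per query is the simplest approach; for a large pool, a partial sort limited to `top_k` would avoid most of the work.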
I really appreciate their work!
- Dr. Tao Mei drew up an outline of this research for me.
- The triplet ranking hinge loss layer is implemented by @FuchenUSTC in his caffe repository.
- The preprocessed triplet CIFAR-10 dataset and related scripts are shared by @FuchenUSTC. Read my post#dataset for more details about the dataset structure, which helps in understanding the structure of DNNH defined in the prototxt files.
- The network structure and parameters follow codes_triplet_hashing1.zip, provided by the first author, Hanjiang Lai.