Initial commit

venkai · Apr 24, 2017 · eb74aac
commit eb74aac
Showing 33 changed files with 7,157 additions and 0 deletions.
diff --git a/.gitignore b/.gitignore
@@ -0,0 +1,57 @@
+## General
+
+# Editor temporaries
+*.swp
+*~
+
+# Sublime Text settings
+*.sublime-workspace
+*.sublime-project
+
+# OS files
+.DS_Store
+.DS_Store?
+._*
+.Spotlight-V100
+.Trashes
+ehthumbs.db
+Thumbs.db
+
+## Caffe
+caffe
+caffe_rbdn
+caffe_colorization
+resources
+training_log
+testing_log
+snapshot
+
+# Data and models
+data
+Data
+models
+results
+*.caffemodel
+*.caffemodel.h5
+*.solverstate
+*.solverstate.h5
+*.binaryproto
+*leveldb
+*lmdb
+
+## Miscellaneous
+
+# Compiled python
+*.pyc
+
+# Compiled MATLAB
+*.mex*
+
+# Archives
+*.tar
+*.tar.*
+*.tgz
+*.zip
+*.rar
+*.7z
+
diff --git a/LICENSE b/LICENSE
@@ -0,0 +1,38 @@
+Copyright (c) 2017, Venkataraman Santhanam, Vlad I. Morariu, Larry S. Davis
+All rights reserved.
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are met:
+
+* Redistributions of source code must retain the above copyright notice, this
+  list of conditions and the following disclaimer.
+
+* Redistributions in binary form must reproduce the above copyright notice,
+  this list of conditions and the following disclaimer in the documentation
+  and/or other materials provided with the distribution.
+
+* If you use the work in scientific research or as part of a larger software
+  system, you are requested to cite the use in any related publications or
+  technical documentation. The work is based upon:
+
+    Santhanam, Venkataraman, Vlad I. Morariu, and Larry S. Davis.
+    "Generalized Deep Image to Image Regression."
+    arXiv preprint arXiv:1612.03268 (2016).
+
+    @article{santhanam2016generalized,
+      title={Generalized Deep Image to Image Regression},
+      author={Santhanam, Venkataraman and Morariu, Vlad I and Davis, Larry S},
+      journal={arXiv preprint arXiv:1612.03268},
+      year={2016}
+    }
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
+FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
diff --git a/README.md b/README.md
@@ -0,0 +1,54 @@
+# RBDN (Recursively Branched Deconvolutional Network)
+![placeholder](https://github.com/venkai/RBDN/blob/gh-pages/assets/placeholder.png)
+**RBDN** is an architecture for [**Generalized Deep Image to Image Regression**](https://arxiv.org/abs/1612.03268) which features 
+* a memory-efficient recursive branched scheme with extensive parameter sharing that computes an early learnable multi-context representation of the input, 
+* end-to-end preservation of local correspondences from input to output and 
+* the ability to choose context-vs-locality based on the task, as well as to apply a per-pixel multi-context non-linearity.
+
+## Architecture
+![pipeline](https://github.com/venkai/RBDN/blob/gh-pages/assets/pipeline.png)
+**RBDN** gives state-of-the-art performance on 3 diverse *image-to-image regression* tasks: **Denoising**, **Relighting**, **Colorization**.
+
+## Installation & Usage
+
+- **Clone:** Run `git clone -b master --single-branch https://github.com/venkai/RBDN.git`
+
+- **Setup:** Go to the repository (`cd RBDN`) and run `./setup.sh`. This fetches caffe, downloads pretrained caffe models for all 3 experiments (**denoising/relighting/colorization**) along with inference data, and sets up the directory structure and symbolic links for all the training/inference scripts.
+
+- **Install Caffe:** Note that `setup.sh` pulls 2 different branches of caffe into 2 separate directories: `caffe_colorization`, used for **colorization**, and `caffe_rbdn`, used for both the **denoising** and **relighting** experiments. Both branches will eventually be merged into the master branch of [**venkai/caffe**](https://github.com/venkai/caffe). For now, however, you have to install both caffe versions separately if you want to run all 3 experiments.
+
+- **Data:** 
+
+  - Inference data is automatically downloaded by `setup.sh`.
+
+  - Training data/imglist for **relighting** experiment can be downloaded from either of these mirrors: [**[1]**](https://drive.google.com/file/d/0B3PoH3B39H2reWxzd3VDZDFVSlE/view?usp=sharing)/[**[2]**](https://drive.google.com/file/d/0B4c0dYlyY36JY3EwWUo3Y2MtNm8/view?usp=sharing)  
+  This downloads the file `multipie.tar.gz`. Move it to `./data/training` and run `tar xvzf multipie.tar.gz && rm multipie.tar.gz`
+
+  - **Denoising/colorization** experiments use the same training data/imglist: every *unresized* train & validation image from both [**ImageNet ILSVRC2012**](https://arxiv.org/abs/1409.0575) and [**MS-COCO2014**](http://mscoco.org/) whose smallest spatial dimension is greater than **128** (**~1.7 million** images in total). Download these datasets from their respective sources and place/symlink them within `./data/training/`; no preprocessing is needed. Place the appropriate imglist in `./data/training/imgset/train.txt`, with the image paths in `train.txt` relative to `./data/training`.
+
+  - Note that data folders are not tracked by git.
+
+- **Inference:** Each experiment (**denoising/relighting/colorization**) has its own folder in `./inference` that contains an experiment-specific MATLAB inference script `get_pred.m`, which uses the [**Matcaffe**](http://caffe.berkeleyvision.org/tutorial/interfaces.html#matlab) interface to evaluate pretrained models in `./models`. The script `./inference/run_matcaffe.sh` can be used to add caffe dependencies to `LD_LIBRARY_PATH` and then start MATLAB interactively.
+
+- **Training:** Each experiment (**denoising/relighting/colorization**) has its own folder in `./training` that contains 2 key experiment-specific scripts:
+  - `start_train.sh`: This starts training an **RBDN** model, either from scratch or from the most recent snapshot in the `snapshot` directory. You can pause training at any moment with `Ctrl+C`, and the most recent snapshot will be saved in `./snapshot/trn_iter_[*].solverstate`. Running `./start_train.sh` again will automatically resume from that snapshot.
+  - `run_bn.sh`: This takes the most recent snapshot in `./snapshot` and prepares it for inference by passing training data through the network and computing global mean/variance for all the *batch-normalization* layers in the network. The resulting inference-ready model is saved as `./tst_[ITER].caffemodel`, where `ITER` is the iteration corresponding to the most recent snapshot.
+
+## License & Citation
+**RBDN** is released under a variant of the [BSD 2-Clause license](https://github.com/venkai/RBDN/blob/master/LICENSE). 
+
+If you find **RBDN** useful in your research, please consider citing our paper:
+
+```
+@article{santhanam2016generalized,
+  title={Generalized Deep Image to Image Regression},
+  author={Santhanam, Venkataraman and Morariu, Vlad I and Davis, Larry S},
+  journal={arXiv preprint arXiv:1612.03268},
+  year={2016}
+}
+```
+
+## Acknowledgments
+* We would like to thank [Yangqing Jia](http://daggerfs.com/), [Evan Shelhamer](http://imaginarynumber.net/) and the [**BVLC/BAIR**](http://bair.berkeley.edu/) team for creating & maintaining [**caffe**](http://caffe.berkeleyvision.org/), [Richard Zhang](https://richzhang.github.io/) for [colorization layers in caffe](https://github.com/richzhang/colorization) and [Hyeonwoo Noh](http://cvlab.postech.ac.kr/~hyeonwoonoh/), [Seunghoon Hong](http://cvlab.postech.ac.kr/~maga33/), [Dmytro Mishkin](https://github.com/ducha-aiki) for several useful caffe layers, all of which were instrumental in creating **RBDN**. 
+
+* This research is based upon work supported by the Office of the Director of National Intelligence (ODNI), Intelligence Advanced Research Projects Activity (IARPA), via IARPA R&D Contract No. 2014-14071600012. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the ODNI, IARPA, or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright annotation thereon.
diff --git a/inference/colorization/get_pred.m b/inference/colorization/get_pred.m
@@ -0,0 +1,57 @@
+function get_pred
+
+config=create_config;
+net = caffe.Net(config.model,config.weights,'test');
+imgdir='../Data/colorization';
+resdir='./results';
+if ~isdir(resdir), mkdir(resdir); end
+% Get image list
+d=dir(strcat(imgdir,'/*'));
+d={d.name}'; d=d(3:end); % drop the first two entries (assumed to be '.' and '..')
+for i=1:length(d)
+    imgfile=d{i};
+    dt=strfind(imgfile,'.'); dt=dt(end);
+    resfile=strcat(resdir,'/',imgfile(1:dt),'png');
+    %if exist(resfile,'file')==2, continue; end;
+    I=imread(strcat(imgdir,'/',imgfile));
+    % If input image is larger than 800*800 (i.e. # of pixels > 640000),
+    % then network doesn't fit in GPU memory. Of course this depends on how much
+    % memory you have and if you are operating in CPU/GPU mode.
+    if numel(I(:,:,1))>640000
+      if size(I,1)>=size(I,2) 
+        I=imresize(I,[800,NaN],'bicubic');
+      else 
+        I=imresize(I,[NaN,800],'bicubic');
+      end
+    end
+    if size(I,3)==1, I=repmat(I,[1,1,3]); end
+    %imshow(I);
+    [h,w,~]=size(I);
+    I=padarray(I,[64,64],'symmetric');
+    I=I(1:end-rem(end,32),1:end-rem(end,32),:);
+    img_lab=rgb2lab(I);
+    img=permute(img_lab(:,:,1),[2,1,3])-50; % L channel, H*W -> W*H
+    fprintf('[%d/%d] Processing %s: [%d x %d]\n',i,length(d),imgfile,size(img,1),size(img,2));
+    net.blobs('data').reshape([size(img,1) size(img,2) 1 1]); % reshape blob 'data'
+    net.reshape(); %Reshape remaining blobs accordingly
+    cnn_input={single(img)};
+    pred=net.forward(cnn_input);
+    pred=imresize(permute(pred{1},[2,1,3]),4);
+    pred_lab=zeros(size(img_lab));
+    pred_lab(:,:,1)=img_lab(:,:,1);
+    pred_lab(:,:,2:3)=pred;
+    pred_rgb=(lab2rgb(pred_lab));
+    pred_rgb=pred_rgb(65:h+64,65:w+64,:);
+    imwrite(pred_rgb,resfile);
+end
+caffe.reset_all();
+
+function config=create_config
+config.model='./test.prototxt';
+config.weights='../../models/rbdn_colorization.caffemodel';
+config.caffe_root = '../caffe_colorization';
+fprintf('initializing caffe..\n');
+addpath(fullfile(config.caffe_root, 'matlab'));
+config.gpuNum=0; caffe.set_mode_gpu(); caffe.set_device(config.gpuNum);
+%caffe.set_mode_cpu();
+caffe.reset_all();
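
Note (not part of the commit): `get_pred.m` pads the image by 64 pixels on every side, trims each spatial dimension to a multiple of 32, and finally crops rows `65:h+64` and columns `65:w+64` to recover the original extent. The sketch below replays only that size arithmetic for a hypothetical 375 x 500 input, without caffe or any image data. The variable names (`pad`, `ph`, `pw`, `ch`, `cw`) are illustrative and do not appear in the script.

```matlab
% Illustrative only: replay the size arithmetic of get_pred.m (no caffe needed).
h = 375; w = 500;                 % hypothetical input height and width
pad = 64;                         % symmetric padding applied on every side
ph = h + 2*pad;                   % padded height: 503
pw = w + 2*pad;                   % padded width:  628
ch = ph - rem(ph, 32);            % trimmed to a multiple of 32: 480
cw = pw - rem(pw, 32);            % trimmed to a multiple of 32: 608
fprintf('network input: %d x %d\n', ch, cw);
% The final crop pred_rgb(65:h+64, 65:w+64, :) needs h+64 <= ch and w+64 <= cw.
% Trimming removes at most 31 rows/columns, fewer than the 64 padded at the
% bottom/right, so the crop always stays inside the trimmed image.
assert(h + pad <= ch && w + pad <= cw);
```

Because the trim only ever removes rows and columns from the bottom and right padding, the original image content is untouched and the final crop returns an output of exactly `h` x `w` pixels.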
diff --git a/inference/colorization/get_pred_resize.m b/inference/colorization/get_pred_resize.m
@@ -0,0 +1,51 @@
+function get_pred_resize(new_size)
+
+% Use this instead of get_pred.m for colorizing very high-res images.
+% All results in the paper, however, were generated with get_pred.m.
+
+
+if nargin < 1, new_size=[224, 224]; end;
+if numel(new_size)==1, new_size=[new_size, new_size]; end;
+
+config=create_config;
+net = caffe.Net(config.model,config.weights,'test');
+imgdir='../Data/colorization';
+resdir='./results';
+if ~isdir(resdir), mkdir(resdir); end
+% Get image list
+d=dir(strcat(imgdir,'/*'));
+d={d.name}'; d=d(3:end); % drop the first two entries (assumed to be '.' and '..')
+for i=1:length(d)
+    imgfile=d{i};
+    dt=strfind(imgfile,'.'); dt=dt(end);
+    resfile=strcat(resdir,'/',imgfile(1:dt),'png');
+    I=imread(strcat(imgdir,'/',imgfile));
+    if size(I,3)==1, I=repmat(I,[1,1,3]); end
+    img_lab=rgb2lab(I);
+    I_rz=rgb2lab(imresize(I,new_size,'bicubic'));
+    img=permute(I_rz(:,:,1),[2,1,3])-50; % H*W -> W*H
+    img=img(1:end-rem(end,32),1:end-rem(end,32));
+    fprintf('Processing %s: [%d x %d]\n',imgfile,size(img,1),size(img,2));
+    % Reshape blob 'data'
+    net.blobs('data').reshape([size(img,1) size(img,2) 1 1]);
+    net.reshape(); %Reshape remaining blobs accordingly
+    cnn_input={single(img)};
+    pred=net.forward(cnn_input);
+    pred=imresize(permute(pred{1},[2,1,3]),[size(I,1),size(I,2)]);
+    pred_lab=zeros(size(img_lab));
+    pred_lab(:,:,1)=img_lab(:,:,1);
+    pred_lab(:,:,2:3)=pred;
+    pred_rgb=(lab2rgb(pred_lab));
+    imwrite(pred_rgb,resfile);
+end
+caffe.reset_all();
+
+function config=create_config
+config.model='./test.prototxt';
+config.weights='../../models/rbdn_colorization.caffemodel';
+config.caffe_root = '../caffe_colorization';
+fprintf('initializing caffe..\n');
+addpath(fullfile(config.caffe_root, 'matlab'));
+config.gpuNum=0; caffe.set_mode_gpu(); caffe.set_device(config.gpuNum);
+%caffe.set_mode_cpu();
+caffe.reset_all();
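
Note (not part of the commit): both inference scripts list the input folder with `dir`, skip the first two entries via `d(3:end)`, and build the result name by locating the last `.` with `strfind`. This relies on `.` and `..` being listed first and on every file name containing a dot. A possible alternative, shown here only as a sketch using the standard `isdir` field of `dir` output and `fileparts`, filters directories explicitly and derives the base name without string searching. The folder names are copied from the scripts; nothing below calls caffe.

```matlab
% Illustrative alternative to d(3:end) and strfind; not part of the commit.
imgdir = '../Data/colorization';   % same input folder as the scripts
resdir = './results';              % same output folder as the scripts
d = dir(imgdir);
d = d(~[d.isdir]);                 % drop '.', '..' and any subfolders
for i = 1:length(d)
    [~, name, ~] = fileparts(d(i).name);       % base name without extension
    resfile = fullfile(resdir, [name, '.png']);
    fprintf('%s -> %s\n', d(i).name, resfile);
end
```

If the input folder is missing or empty, the loop simply does not run, so the sketch can be tried in isolation before wiring it into either script.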