Initial commit
Venkat Santhanam authored and venkai committed Apr 24, 2017
0 parents commit eb74aac
Showing 33 changed files with 7,157 additions and 0 deletions.
57 changes: 57 additions & 0 deletions .gitignore
@@ -0,0 +1,57 @@
## General

# Editor temporaries
*.swp
*~

# Sublime Text settings
*.sublime-workspace
*.sublime-project

# OS files
.DS_Store
.DS_Store?
._*
.Spotlight-V100
.Trashes
ehthumbs.db
Thumbs.db

## Caffe
caffe
caffe_rbdn
caffe_colorization
resources
training_log
testing_log
snapshot

# Data and models
data
Data
models
results
*.caffemodel
*.caffemodel.h5
*.solverstate
*.solverstate.h5
*.binaryproto
*leveldb
*lmdb

## Miscellaneous

# Compiled python
*.pyc

# Compiled MATLAB
*.mex*

# Archives
*.tar
*.tar.*
*.tgz
*.zip
*.rar
*.7z

38 changes: 38 additions & 0 deletions LICENSE
@@ -0,0 +1,38 @@
Copyright (c) 2017, Venkataraman Santhanam, Vlad I. Morariu, Larry S. Davis
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

* Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.

* Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.

* If you use the work in scientific research or as part of a larger software
system, you are requested to cite the use in any related publications or
technical documentation. The work is based upon:

Santhanam, Venkataraman, Vlad I. Morariu, and Larry S. Davis.
"Generalized Deep Image to Image Regression."
arXiv preprint arXiv:1612.03268 (2016).

@article{santhanam2016generalized,
title={Generalized Deep Image to Image Regression},
author={Santhanam, Venkataraman and Morariu, Vlad I and Davis, Larry S},
journal={arXiv preprint arXiv:1612.03268},
year={2016}
}

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
54 changes: 54 additions & 0 deletions README.md
@@ -0,0 +1,54 @@
# RBDN (Recursively Branched Deconvolutional Network)
![placeholder](https://github.com/venkai/RBDN/blob/gh-pages/assets/placeholder.png)
**RBDN** is an architecture for [**Generalized Deep Image to Image Regression**](https://arxiv.org/abs/1612.03268) which features
* a memory-efficient recursive branched scheme with extensive parameter sharing that computes an early learnable multi-context representation of the input,
* end-to-end preservation of local correspondences from input to output and
* ability to choose context-vs-locality based on task as well as apply a per-pixel multi-context non-linearity.

## Architecture
![pipeline](https://github.com/venkai/RBDN/blob/gh-pages/assets/pipeline.png)
**RBDN** gives state-of-the-art performance on 3 diverse *image-to-image regression* tasks: **Denoising**, **Relighting**, **Colorization**.

## Installation & Usage

- **Clone:** Run `git clone -b master --single-branch https://github.com/venkai/RBDN.git`

- **Setup:** Enter the repository with `cd RBDN` and run `./setup.sh`. This fetches caffe, downloads pretrained caffe models for all 3 experiments (**denoising/relighting/colorization**) along with the inference data, and sets up the directory structure and symbolic links for all the training/inference scripts.

- **Install Caffe:** Note that `setup.sh` pulls 2 different branches of caffe into 2 separate directories: `caffe_colorization`, which is used for **colorization**, and `caffe_rbdn`, which is used for both the **denoising** and **relighting** experiments. Both branches will eventually be merged with the master branch in [**venkai/caffe**](https://github.com/venkai/caffe). For now, however, you have to install both caffe versions separately if you want to run all 3 experiments.

- **Data:**

- Inference data is automatically downloaded by `setup.sh`.

- Training data/imglist for **relighting** experiment can be downloaded from either of these mirrors: [**[1]**](https://drive.google.com/file/d/0B3PoH3B39H2reWxzd3VDZDFVSlE/view?usp=sharing)/[**[2]**](https://drive.google.com/file/d/0B4c0dYlyY36JY3EwWUo3Y2MtNm8/view?usp=sharing)
This downloads the file `multipie.tar.gz`. Move it to `./data/training` and run `tar xvzf multipie.tar.gz && rm multipie.tar.gz`

- **Denoising/colorization** experiments use the same training data/imglist: every *unresized* train & validation image from both [**ImageNet ILSVRC2012**](https://arxiv.org/abs/1409.0575) and [**MS-COCO2014**](http://mscoco.org/) whose smallest spatial dimension is greater than **128** (**~1.7 million** images in total). Download these datasets from their respective sources and place/symlink them within `./data/training/`; no preprocessing is needed. Place the appropriate imglist in `./data/training/imgset/train.txt`, with the image paths in `train.txt` given relative to `./data/training`.

- Note that data folders are not tracked by git.

- **Inference:** Each experiment (**denoising/relighting/colorization**) has its own folder in `./inference` that contains an experiment-specific MATLAB inference script `get_pred.m`, which uses the [**Matcaffe**](http://caffe.berkeleyvision.org/tutorial/interfaces.html#matlab) interface to evaluate pretrained models in `./models`. The script `./inference/run_matcaffe.sh` can be used to add the caffe dependencies to `LD_LIBRARY_PATH` and then start MATLAB interactively.

- **Training:** Each experiment (**denoising/relighting/colorization**) has its own folder in `./training` that contains 2 key experiment-specific scripts:
- `start_train.sh`: This starts training an **RBDN** model, either from scratch or from the most recent snapshot in the `snapshot` directory. You can pause training at any moment with `Ctrl+C`, and the most recent snapshot will be saved in `./snapshot/trn_iter_[*].solverstate`. Running `./start_train.sh` again will automatically resume from that snapshot.
- `run_bn.sh`: This takes the most recent snapshot in `./snapshot` and prepares it for inference by passing training data through the network and computing global mean/variance for all the *batch-normalization* layers in the network. The resulting inference-ready model is saved as `./tst_[ITER].caffemodel`, where `ITER` is the iteration corresponding to the most recent snapshot.
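
The resume behaviour described for `start_train.sh` depends on locating the most recent snapshot. The script's own logic is not shown here, so the following is only a minimal sketch of one way such a lookup could be done, assuming snapshots follow the `trn_iter_[*].solverstate` naming mentioned above. The function name `latest_snapshot` is hypothetical and not part of the repository.

```shell
#!/usr/bin/env sh
# Hypothetical helper (not part of RBDN): print the snapshot with the highest
# iteration number. Assumes files are named trn_iter_<ITER>.solverstate.
# The body runs in a subshell "( ... )" so its variables stay local.
latest_snapshot() (
    dir="${1:-./snapshot}"
    best=""
    best_iter=-1
    for f in "$dir"/trn_iter_*.solverstate; do
        [ -e "$f" ] || continue              # glob matched nothing
        iter="${f##*/}"                      # drop the directory part
        iter="${iter#trn_iter_}"             # drop the prefix
        iter="${iter%.solverstate}"          # drop the extension
        case "$iter" in
            ''|*[!0-9]*) continue ;;         # skip non-numeric iteration tags
        esac
        # Compare numerically, so that 1000 ranks above 200.
        if [ "$iter" -gt "$best_iter" ]; then
            best_iter="$iter"
            best="$f"
        fi
    done
    # Print the winner; return non-zero when no snapshot was found.
    [ -n "$best" ] && printf '%s\n' "$best"
)
```

The numeric comparison matters because a plain alphabetical sort would rank `trn_iter_200` above `trn_iter_1000`. The printed path could then be handed to caffe's `--snapshot` option when resuming.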

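For the **Data** step above, image paths in `train.txt` are relative to `./data/training`, so a quick sanity check before training can catch misplaced datasets or broken symlinks. The sketch below is an illustration only: the function name `missing_images` is hypothetical, and it assumes that the image path is the first whitespace-separated field on each line of the list (paths containing spaces would not be handled).

```shell
#!/usr/bin/env sh
# Hypothetical helper (not part of RBDN): list entries of an image list that
# do not exist relative to a data root. Assumes the image path is the first
# whitespace-separated field on each line.
# The body runs in a subshell "( ... )" so its variables stay local.
missing_images() (
    root="$1"
    list="$2"
    missing=0
    # "|| [ -n "$path" ]" also handles a last line without a trailing newline.
    while read -r path _rest || [ -n "$path" ]; do
        [ -n "$path" ] || continue           # skip empty lines
        if [ ! -f "$root/$path" ]; then
            printf 'missing: %s\n' "$path"
            missing=$((missing + 1))
        fi
    done < "$list"
    # Succeed only if every listed image was found.
    [ "$missing" -eq 0 ]
)
```

A call such as `missing_images ./data/training ./data/training/imgset/train.txt` prints one line per missing file and returns a non-zero status if any are missing.
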
## License & Citation
**RBDN** is released under a variant of the [BSD 2-Clause license](https://github.com/venkai/RBDN/blob/master/LICENSE).

If you find **RBDN** useful in your research, please consider citing our paper:

```
@article{santhanam2016generalized,
title={Generalized Deep Image to Image Regression},
author={Santhanam, Venkataraman and Morariu, Vlad I and Davis, Larry S},
journal={arXiv preprint arXiv:1612.03268},
year={2016}
}
```

## Acknowledgments
* We would like to thank [Yangqing Jia](http://daggerfs.com/), [Evan Shelhamer](http://imaginarynumber.net/) and the [**BVLC/BAIR**](http://bair.berkeley.edu/) team for creating & maintaining [**caffe**](http://caffe.berkeleyvision.org/), [Richard Zhang](https://richzhang.github.io/) for [colorization layers in caffe](https://github.com/richzhang/colorization) and [Hyeonwoo Noh](http://cvlab.postech.ac.kr/~hyeonwoonoh/), [Seunghoon Hong](http://cvlab.postech.ac.kr/~maga33/), [Dmytro Mishkin](https://github.com/ducha-aiki) for several useful caffe layers, all of which were instrumental in creating **RBDN**.

* This research is based upon work supported by the Office of the Director of National Intelligence (ODNI), Intelligence Advanced Research Projects Activity (IARPA), via IARPA R&D Contract No. 2014-14071600012. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the ODNI, IARPA, or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright annotation thereon.
57 changes: 57 additions & 0 deletions inference/colorization/get_pred.m
@@ -0,0 +1,57 @@
function get_pred

config=create_config;
net = caffe.Net(config.model,config.weights,'test');
imgdir='../Data/colorization';
resdir='./results';
if ~exist(resdir,'dir'), mkdir(resdir); end
% Get image list (files only; skips '.', '..' and any subdirectories)
d=dir(fullfile(imgdir,'*'));
d={d(~[d.isdir]).name}';
for i=1:length(d)
imgfile=d{i};
[~,imgname]=fileparts(imgfile); % file name without its extension
resfile=fullfile(resdir,[imgname,'.png']);
%if exist(resfile,'file')==2, continue; end;
I=imread(strcat(imgdir,'/',imgfile));
% If input image is larger than 800*800 (i.e. # of pixels > 640000),
% then network doesn't fit in GPU memory. Of course this depends on how much
% memory you have and if you are operating in CPU/GPU mode.
if numel(I(:,:,1))>640000
if size(I,1)>=size(I,2)
I=imresize(I,[800,NaN],'bicubic');
else
I=imresize(I,[NaN,800],'bicubic');
end
end
if size(I,3)==1, I=repmat(I,[1,1,3]); end
%imshow(I);
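[h,w,~]=size(I);
% Pad 64 pixels on every side, then crop so both dimensions are multiples of 32.
I=padarray(I,[64,64],'symmetric');
I=I(1:end-rem(end,32),1:end-rem(end,32),:);
img_lab=rgb2lab(I);
% Network input: L channel, transposed from H*W to W*H and shifted by -50.
img=permute(img_lab(:,:,1),[2,1,3])-50;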
fprintf('[%d/%d] Processing %s: [%d x %d]\n',i,length(d),imgfile,size(img,1),size(img,2));
net.blobs('data').reshape([size(img,1) size(img,2) 1 1]); % reshape blob 'data'
net.reshape(); %Reshape remaining blobs accordingly
cnn_input={single(img)};
pred=net.forward(cnn_input);
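pred=imresize(permute(pred{1},[2,1,3]),4); % W*H -> H*W, then upsample by 4
pred_lab=zeros(size(img_lab));
pred_lab(:,:,1)=img_lab(:,:,1); % keep the input L channel
pred_lab(:,:,2:3)=pred; % predicted a/b channels
pred_rgb=(lab2rgb(pred_lab));
pred_rgb=pred_rgb(65:h+64,65:w+64,:); % remove the 64-pixel padding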
imwrite(pred_rgb,resfile);
end
caffe.reset_all();

function config=create_config
config.model='./test.prototxt';
config.weights='../../models/rbdn_colorization.caffemodel';
config.caffe_root = '../caffe_colorization';
fprintf('initializing caffe..\n');
addpath(fullfile(config.caffe_root, 'matlab'));
config.gpuNum=0; caffe.set_mode_gpu(); caffe.set_device(config.gpuNum);
%caffe.set_mode_cpu();
caffe.reset_all();
51 changes: 51 additions & 0 deletions inference/colorization/get_pred_resize.m
@@ -0,0 +1,51 @@
function get_pred_resize(new_size)

% Use this instead of get_pred.m for colorizing very high-res images.
% All results in the paper were however generated with get_pred.m.


if nargin < 1, new_size=[224, 224]; end;
if numel(new_size)==1, new_size=[new_size, new_size]; end;

config=create_config;
net = caffe.Net(config.model,config.weights,'test');
imgdir='../Data/colorization';
resdir='./results';
if ~exist(resdir,'dir'), mkdir(resdir); end
% Get image list (files only; skips '.', '..' and any subdirectories)
d=dir(fullfile(imgdir,'*'));
d={d(~[d.isdir]).name}';
for i=1:length(d)
imgfile=d{i};
[~,imgname]=fileparts(imgfile); % file name without its extension
resfile=fullfile(resdir,[imgname,'.png']);
I=imread(strcat(imgdir,'/',imgfile));
if size(I,3)==1, I=repmat(I,[1,1,3]); end
img_lab=rgb2lab(I);
I_rz=rgb2lab(imresize(I,new_size,'bicubic'));
img=permute(I_rz(:,:,1),[2,1,3])-50; % H*W -> W*H
img=img(1:end-rem(end,32),1:end-rem(end,32));
fprintf('Processing %s: [%d x %d]\n',imgfile,size(img,1),size(img,2));
% Reshape blob 'data'
net.blobs('data').reshape([size(img,1) size(img,2) 1 1]);
net.reshape(); %Reshape remaining blobs accordingly
cnn_input={single(img)};
pred=net.forward(cnn_input);
pred=imresize(permute(pred{1},[2,1,3]),[size(I,1),size(I,2)]);
pred_lab=zeros(size(img_lab));
pred_lab(:,:,1)=img_lab(:,:,1);
pred_lab(:,:,2:3)=pred;
pred_rgb=(lab2rgb(pred_lab));
imwrite(pred_rgb,resfile);
end
caffe.reset_all();

function config=create_config
config.model='./test.prototxt';
config.weights='../../models/rbdn_colorization.caffemodel';
config.caffe_root = '../caffe_colorization';
fprintf('initializing caffe..\n');
addpath(fullfile(config.caffe_root, 'matlab'));
config.gpuNum=0; caffe.set_mode_gpu(); caffe.set_device(config.gpuNum);
%caffe.set_mode_cpu();
caffe.reset_all();