
Improved caffe converter #6822

Merged
3 commits merged into apache:master from arikpoz:improved_caffe_converter on Jun 28, 2017
Conversation

@arikpoz (Contributor) commented Jun 26, 2017

Extended the caffe-to-mxnet converter and improved the converter test

  • Added support for networks that use batch normalization without a scale layer following the batch norm, i.e. gamma is fixed to 1 (see the sketch after this list)
  • Extended the naming conventions recognized when batch normalization is implemented in caffe
  • Added support for old caffe versions where dilation didn't exist. This is needed to convert models that depend on old caffe
  • Added support for the deconvolution layer
  • Added support for older versions of caffe where the kernel_size, pad and stride parameters were not iterable
  • Fixed a crash that happened when a bottom layer doesn't exist in the internal top_to_layers dictionary; this can happen if the name of the input is not 'data'
  • Added ignore-by-design support for converting 'Crop' layers
  • Fixed the batch norm layer comparison to take the rescaling factor into account
  • Added a check in the tester to swap the (RGB,BGR) input channels only if there are 3 or 4 channels, which is the same check the conversion does
  • Allowed comparing layers of models with no mean file
  • Added support for comparing the parameters of deconvolution layers
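
For illustration, here is a minimal sketch of two of these cases (hypothetical helper names, not the converter's actual code): building batch norm parameters when no scale layer follows, while applying caffe's rescaling factor, and reading kernel_size/pad/stride values that are repeated fields in newer caffe protos but plain scalars in older ones.

```python
import numpy as np

def batchnorm_params(bn_blobs, scale_blobs=None):
    """Build gamma/beta/mean/var arrays from a Caffe BatchNorm layer.

    bn_blobs: the BatchNorm layer's blobs [mean, variance, rescale_factor].
    scale_blobs: blobs [gamma, beta] of a following Scale layer, or None when
    the network uses BatchNorm alone (gamma is then fixed to 1).
    (Hypothetical helper, illustrating the idea only.)
    """
    rescale = float(np.asarray(bn_blobs[2]).flatten()[0])
    rescale = 1.0 / rescale if rescale != 0 else 0.0
    mean = np.asarray(bn_blobs[0]).flatten() * rescale  # un-scale stored stats
    var = np.asarray(bn_blobs[1]).flatten() * rescale
    if scale_blobs is not None:
        gamma = np.asarray(scale_blobs[0]).flatten()
        beta = np.asarray(scale_blobs[1]).flatten()
    else:
        gamma = np.ones_like(mean)  # no Scale layer: gamma = 1
        beta = np.zeros_like(mean)  # and beta = 0
    return gamma, beta, mean, var

def scalar_param(value, default=0):
    """Return a single int from a conv/deconv parameter that may be a
    repeated field (newer Caffe protos) or a plain scalar (older protos)."""
    try:
        return int(value[0]) if len(value) > 0 else default
    except TypeError:
        return int(value)  # old Caffe: already a scalar
```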

@piiswrong (Contributor)

Please revert the submodule change.

@arikpoz (Contributor, Author) commented Jun 26, 2017

Done.

@arikpoz (Contributor, Author) commented Jun 26, 2017

The build still fails. @piiswrong, can you check the log to see if this is something I can fix?

@piiswrong (Contributor)

The error is due to the submodule change: https://github.com/dmlc/mxnet/pull/6822/files

@arikpoz (Contributor, Author) commented Jun 26, 2017

But I reverted that change, or so I thought.
I think a rebase from upstream introduced it in the first place.
Can you suggest how to fix it?

@piiswrong (Contributor) commented Jun 26, 2017

Try checking out upstream master and cherry-picking your changes.

@arikpoz force-pushed the improved_caffe_converter branch from bb5af5c to c671c9e on June 26, 2017 21:02
@arikpoz (Contributor, Author) commented Jun 27, 2017

The submodule issue seems resolved.

@piiswrong (Contributor)

Is the caffe converter tested? Maybe we should set up CI?

@mli Could you have a look?

@arikpoz (Contributor, Author) commented Jun 27, 2017

The only error is due to a test timeout. Can someone check this, and maybe update the timeout?

@piiswrong requested a review from mli on June 27, 2017 17:14
@mli (Contributor) commented Jun 27, 2017

The caffe converter is tested in the CI, but we only try to convert a few common CNNs, such as vgg/resnet. Each test is expensive.

@arikpoz, do you have a better idea of how to test the converter?

@arikpoz (Contributor, Author) commented Jun 27, 2017

The current test seems good to me: it checks both end-to-end performance and layer-by-layer outputs.

What takes time is the performance test, since we run inference on many images.
To save time, we can remove the performance test and trust that if the layer-by-layer outputs are close enough, the end performance should be the same.

The layer-by-layer test is done on a single image, comparing all the outputs (and network parameters) of the caffe network and the mxnet network. Since we run inference on only one image, it is very fast to execute (see the sketch below).

my 2c.
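
As a rough illustration of what such a layer-by-layer check can look like (a hypothetical helper, not the actual test code; collecting the per-layer activations from the caffe and mxnet networks is assumed to be done by the caller):

```python
import numpy as np

def compare_layer_outputs(caffe_outputs, mxnet_outputs, rtol=1e-3, atol=1e-5):
    """Compare per-layer activations of the original Caffe network and the
    converted MXNet network, computed on the same single input image.

    Both arguments are dicts mapping layer/blob name -> numpy array; how
    they are collected is left to the caller.
    """
    mismatches = []
    for name, caffe_out in caffe_outputs.items():
        mx_out = mxnet_outputs.get(name)
        if mx_out is None:
            print('%-30s missing in the MXNet model' % name)
            continue
        max_diff = float(np.max(np.abs(caffe_out - mx_out)))
        ok = np.allclose(caffe_out, mx_out, rtol=rtol, atol=atol)
        print('%-30s %s  (max abs diff %.3e)'
              % (name, 'OK' if ok else 'MISMATCH', max_diff))
        if not ok:
            mismatches.append(name)
    return mismatches
```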

@piiswrong merged commit 8c81ee4 into apache:master on Jun 28, 2017
@arikpoz deleted the improved_caffe_converter branch on June 29, 2017 05:12
Guneet-Dhillon pushed a commit to Guneet-Dhillon/mxnet that referenced this pull request on Sep 13, 2017 (apache#6822)
