This repository contains TensorFlow 2.0 source code and models trained on the ImageNet 2012 dataset for the following paper:
@InProceedings{Li_2018_CVPR,
author = {Li, Peihua and Xie, Jiangtao and Wang, Qilong and Gao, Zilin},
title = {Towards Faster Training of Global Covariance Pooling Networks by Iterative Matrix Square Root Normalization},
booktitle = {IEEE Int. Conf. on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2018}
}
This paper concerns an iterative matrix square root normalization network (called fast MPN-COV), which is very efficient and suited to large-scale datasets, as opposed to its predecessor (i.e., MPN-COV, published in ICCV17), which performs matrix power normalization by eigen-decomposition. If you use the code, please cite this fast MPN-COV work and its predecessor (i.e., MPN-COV).
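To make the idea of iterative matrix square root normalization concrete, here is a minimal NumPy sketch of the forward pass: covariance pooling, trace pre-normalization, a coupled Newton-Schulz iteration, and post-compensation. It is an illustration under stated assumptions, not the repository's TensorFlow meta-layer; the function name `isqrt_cov` and the parameter `num_iters` are hypothetical.

```python
import numpy as np


def isqrt_cov(features, num_iters=5):
    """Approximate the matrix square root of the covariance of `features`.

    features: array of shape (d, n), d channels by n spatial positions.
    Returns a (d, d) array approximating sqrt(covariance).
    Assumes the covariance is positive definite (e.g. n > d).
    """
    d, n = features.shape
    # Sample covariance of the d channels over the n positions.
    centered = features - features.mean(axis=1, keepdims=True)
    cov = centered @ centered.T / n

    # Pre-normalization: divide by the trace so the iteration converges.
    trace = np.trace(cov)
    a = cov / trace

    # Coupled Newton-Schulz iteration: y -> sqrt(a), z -> inverse sqrt(a).
    identity = np.eye(d)
    y, z = a, identity
    for _ in range(num_iters):
        t = 0.5 * (3.0 * identity - z @ y)
        y, z = y @ t, t @ z

    # Post-compensation: undo the trace scaling.
    return np.sqrt(trace) * y
```

The loop uses only matrix products, which is why this kind of iteration maps well onto GPUs compared with an eigen-decomposition.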
Network | Dim | Top1_err/Top5_err (paper) | Top1_err/Top5_err (reproduce, tensorflow) | Top1_err/Top5_err (reproduce, pytorch) | Pre-trained models (tensorflow), GoogleDrive | Pre-trained models (tensorflow), BaiduDrive |
fast-MPN-COV-VGG-D | 32K | 26.55/8.94 | 23.98/7.12 | 23.98/7.12 | 650.4M | 650.4M |
fast-MPN-COV-ResNet50 | 32K | 22.14/6.22 | 21.57/6.14 | 21.71/6.13 | 217.3M | 217.3M |
fast-MPN-COV-ResNet101 | 32K | 21.21/5.68 | 20.50/5.45 | 20.99/5.56 | 289.9M | 289.9M |
- We converted the trained fast-MPNCOV-VGG-D model from the PyTorch framework to the TensorFlow framework.
Network | Dim | CUB (paper) | CUB (reproduce, tensorflow) | Aircraft (paper) | Aircraft (reproduce, tensorflow) | Cars (paper) | Cars (reproduce, tensorflow) |
fast-MPN-COV-VGG-D | 32K | 87.2 | 86.95 | 90.0 | 91.74 | 92.5 | 92.95 |
fast-MPN-COV-ResNet50 | 32K | 88.1 | 87.6 | 90.0 | 90.5 | 92.8 | 93.2 |
fast-MPN-COV-ResNet101 | 32K | 88.7 | 88.1 | 91.4 | 91.8 | 93.3 | 93.9 |
- Our method uses neither bounding boxes nor part annotations.
- The reproduced results are obtained by simply finetuning our pre-trained fast MPN-COV-ResNet model with a small learning rate, without the SVM step described in our paper.
Parameter setting
fast-MPNCOV-VGG-D: weightdecay=1e-4, batchsize=10, learningrate=3e-3 for all layers except the FC layer (which uses 5×learningrate); the learning rate is reduced to 3e-4 at epoch 20 (FC: 5×3e-4).
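The schedule above can be sketched as a small helper that returns the learning rate for the backbone and for the FC layer at a given epoch. This is only an illustration: the function name and arguments are hypothetical, and it assumes epochs are counted from 0, which the text does not specify.

```python
def learning_rates(epoch, base_lr=3e-3, reduced_lr=3e-4,
                   drop_epoch=20, fc_multiplier=5):
    """Return (backbone_lr, fc_lr) for the given epoch.

    Assumption: epochs are 0-indexed and the drop applies from
    `drop_epoch` onward.
    """
    lr = base_lr if epoch < drop_epoch else reduced_lr
    # The FC layer trains with a multiple of the backbone rate.
    return lr, fc_multiplier * lr


backbone_lr, fc_lr = learning_rates(epoch=0)
```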
We implement our Fast MPN-COV (i.e., iSQRT-COV) meta-layer with the TensorFlow 2.0 package. We release two versions of the code:
- The backpropagation of our meta-layer without using the autograd package;
- The backpropagation of our meta-layer using the autograd package (TODO).
So that our Fast MPN-COV meta-layer can be added to a network conveniently, we divide any network into three parts:
- feature extractor;
- global image representation;
- classifier.
As such, we can arbitrarily combine a network with our Fast MPN-COV or some other global image representation method (e.g., global average pooling, bilinear pooling, compact bilinear pooling, etc.).
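The three-part split can be illustrated with a toy NumPy sketch in which the global image representation is a swappable function. All names here (`feature_extractor`, `build_network`, and so on) are hypothetical stand-ins rather than this repository's classes, the "backbone" is just a reshape, and the covariance pooling shown omits the matrix square root normalization for brevity.

```python
import numpy as np


def feature_extractor(image):
    """Stand-in for a convolutional backbone: (H, W, C) -> (H*W, C)."""
    h, w, c = image.shape
    return image.reshape(h * w, c)


def global_average_pooling(features):
    """First-order representation: mean over positions, shape (C,)."""
    return features.mean(axis=0)


def covariance_pooling(features):
    """Second-order representation: upper triangle of the channel
    covariance, shape (C * (C + 1) / 2,)."""
    centered = features - features.mean(axis=0, keepdims=True)
    cov = centered.T @ centered / features.shape[0]
    rows, cols = np.triu_indices(cov.shape[0])
    return cov[rows, cols]


def build_network(representation, num_classes, rep_dim, seed=0):
    """Chain extractor -> representation -> linear classifier."""
    rng = np.random.default_rng(seed)
    weights = rng.standard_normal((rep_dim, num_classes)) * 0.01

    def forward(image):
        return representation(feature_extractor(image)) @ weights

    return forward


image = np.ones((7, 7, 8))
gap_net = build_network(global_average_pooling, num_classes=10, rep_dim=8)
cov_net = build_network(covariance_pooling, num_classes=10, rep_dim=36)
```

Only `representation` and `rep_dim` change between the two networks; the extractor and classifier code stay the same, which is the point of the split.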
- Install TensorFlow (2.0.0b0)
- type
git clone https://github.com/XuChunqiao/Tensorflow-Fast-MPNCOV
- prepare the dataset as follows
.
├── train
│   ├── class1
│   │   ├── class1_001.jpg
│   │   ├── class1_002.jpg
│   │   └── ...
│   ├── class2
│   ├── class3
│   ├── ...
│   ├── ...
│   └── classN
└── val
    ├── class1
    │   ├── class1_001.jpg
    │   ├── class1_002.jpg
    │   └── ...
    ├── class2
    ├── class3
    ├── ...
    ├── ...
    └── classN
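Before generating tfrecord files, it can help to confirm that the folders match the layout above. The following is a small standard-library sketch, not part of this repository; the function name `list_classes` is hypothetical, and it assumes images use a `.jpg` extension as in the tree.

```python
from pathlib import Path


def list_classes(dataset_root, splits=("train", "val")):
    """Map each split to {class folder name: number of .jpg files}."""
    summary = {}
    for split in splits:
        split_dir = Path(dataset_root) / split
        summary[split] = {
            class_dir.name: sum(
                1 for p in class_dir.iterdir() if p.suffix.lower() == ".jpg"
            )
            for class_dir in sorted(split_dir.iterdir())
            if class_dir.is_dir()
        }
    return summary
```

Comparing the class names under `train` and `val` in the returned dictionary is a quick way to spot a missing or misnamed class folder.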
cp ./trainingFromScratch/imagenet/imagenet_tfrecords.py ./
- modify the dataset path and run
python imagenet_tfrecords.py
to create tfrecord files
- modify the parameters in train.sh
sh train.sh
- modify the parameters in finetune.sh
sh finetune.sh