ShuffleNet

A TensorFlow implementation of ShuffleNet. According to the authors, ShuffleNet is a computationally efficient CNN architecture designed specifically for mobile devices with very limited computing power. They report that it outperforms Google's MobileNet by a small margin in error rate at a much lower FLOP count.

Link to the original paper: ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices

ShuffleNet Unit

Group Convolutions

The paper uses the group convolution operator. However, that operator is not implemented in the TensorFlow backend, so I implemented it using graph operations. Although this is the same operator as the one described in the paper, it runs slower than a regular convolution. To reach the performance stated in the paper, an efficient cuDNN implementation of the operator is needed. Contributions are welcome.

This issue was discussed here: Support Channel groups in convolutional layers #10482
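
As a rough illustration of what "group convolution built from ordinary operations" means, here is a minimal sketch for the 1x1 (pointwise) case. It is written in NumPy so that it runs standalone; it is not the code from layers.py, and the function name and weight layout are assumptions made for this example. In TensorFlow 1.x the analogous graph operations would be tf.split, tf.nn.conv2d, and tf.concat.

```python
import numpy as np


def grouped_pointwise_conv(x, group_weights):
    """1x1 group convolution on an NHWC array (illustrative sketch).

    x:             array of shape (N, H, W, C_in)
    group_weights: list of G arrays, each of shape (C_in / G, C_out / G)
    """
    num_groups = len(group_weights)
    if x.shape[-1] % num_groups != 0:
        raise ValueError("input channels must be divisible by the number of groups")
    # Split the input channels into G equal groups.
    group_inputs = np.split(x, num_groups, axis=-1)
    # Each group only sees its own slice of the input channels.
    group_outputs = [np.matmul(xi, wi) for xi, wi in zip(group_inputs, group_weights)]
    # Stack the per-group outputs back along the channel axis.
    return np.concatenate(group_outputs, axis=-1)


# Hypothetical sizes: 4 input channels, 2 groups, 6 output channels.
x = np.ones((1, 2, 2, 4))
weights = [np.ones((2, 3)), 2.0 * np.ones((2, 3))]
y = grouped_pointwise_conv(x, weights)
print(y.shape)  # (1, 2, 2, 6)
```

Because every group is processed by a separate small operation, this formulation adds overhead compared with a single fused kernel, which is consistent with the slowdown described above.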

Channel Shuffling

Channel Shuffling can be achieved by applying three operations:

1. Reshaping the input tensor from (N, H, W, C) into (N, H, W, G, C').
2. Transposing the last two dimensions (G, C'), which gives (N, H, W, C', G).
3. Reshaping the tensor back into (N, H, W, C).

N: Batch size, H: Feature map height, W: Feature map width, C: Number of channels, G: Number of groups, C': Number of channels / Number of groups

Note that the number of channels must be divisible by the number of groups.
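
The three operations above can be sketched in a few lines. This is a minimal NumPy version for illustration, assuming an NHWC layout; the function name is hypothetical and the repository's TensorFlow code may differ in detail.

```python
import numpy as np


def channel_shuffle(x, num_groups):
    """Shuffle the channels of an NHWC array across groups (illustrative sketch)."""
    n, h, w, c = x.shape
    if c % num_groups != 0:
        raise ValueError("number of channels must be divisible by the number of groups")
    channels_per_group = c // num_groups
    # (N, H, W, C) -> (N, H, W, G, C')
    x = x.reshape(n, h, w, num_groups, channels_per_group)
    # Swap the last two axes: (N, H, W, G, C') -> (N, H, W, C', G)
    x = x.transpose(0, 1, 2, 4, 3)
    # Flatten back to (N, H, W, C)
    return x.reshape(n, h, w, c)


# Six channels labelled 0..5, split into two groups: [0, 1, 2] and [3, 4, 5].
x = np.arange(6).reshape(1, 1, 1, 6)
print(channel_shuffle(x, num_groups=2)[0, 0, 0])  # [0 3 1 4 2 5]
```

After the shuffle, consecutive channels come from alternating groups, so the next group convolution receives inputs from every group.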

Usage

Main Dependencies

tensorflow 1.3.0
numpy 1.13.1
tqdm 4.15.0
bunch 1.0.1
matplotlib 2.0.2

Train and Test

Prepare your data, and modify the DataLoader.load_data() method in data_loader.py to load it.
Modify config/test.json to suit your needs.

Run

python main.py config/test.json

Results

The model has successfully overfitted TinyImageNet-200, the dataset presented in CS231n - Convolutional Neural Networks for Visual Recognition. I'm working on ImageNet training.

Benchmarking

The paper reports 140 MFLOPs for the vanilla version. Using the group convolution operator implemented in TensorFlow, I measured approximately 270 MFLOPs.

To calculate the FLOPs in TensorFlow, set the batch size to 1 and execute the following once the model is loaded into memory.

import tensorflow as tf

# Count floating-point operations in the default graph, grouped by scope.
tf.profiler.profile(
    tf.get_default_graph(),
    options=tf.profiler.ProfileOptionBuilder.float_operation(),
    cmd='scope')
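
For a rough sanity check of such numbers, the multiply-add count of a single convolution layer can be worked out by hand. The sketch below is only an illustration: the helper name and the layer sizes are assumptions chosen for this example, not values taken from the repository or measured with the profiler.

```python
def conv_multiply_adds(out_height, out_width, kernel_size, c_in, c_out, groups=1):
    """Multiply-adds of one convolution layer producing an out_height x out_width map."""
    if c_in % groups != 0 or c_out % groups != 0:
        raise ValueError("channel counts must be divisible by the number of groups")
    # Each output value sums over kernel_size^2 positions and c_in / groups input channels.
    per_output = kernel_size * kernel_size * (c_in // groups)
    return out_height * out_width * c_out * per_output


# Hypothetical 1x1 layer: 28x28 feature map, 240 channels in and out.
regular = conv_multiply_adds(28, 28, 1, 240, 240)
grouped = conv_multiply_adds(28, 28, 1, 240, 240, groups=3)
print(regular // grouped)  # 3
```

With G groups, each output channel reads only 1/G of the input channels, so the layer's multiply-add count drops by a factor of G.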

TODO

Training on ImageNet dataset. In progress...

Updates

Inference and training are working properly.

License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

Acknowledgments

Thanks to all who helped me with this work, and special thanks to my colleagues Mo'men Abdelrazek and Mohamed Zahran.
