LiBai531 / caffe-gdp

personal caffe version which enables global and dynamic pruning of CNN


Caffe-GDP

Caffe-GDP is a fork of Caffe that adds a small amount of code to enable global and dynamic filter pruning (GDP) on the convolution layers of typical CNN architectures, as described in the IJCAI 2018 paper Accelerating Convolutional Networks via Global & Dynamic Filter Pruning. The paper's original implementation is based on TensorFlow.

The following first introduces how GDP is implemented on top of the original Caffe framework, then gives a guide to performing GDP on a typical CNN. If you do not care about the details, feel free to skip the first part.

Implementation

The new members added to the data structures are listed below.

Blob
- vector<Dtype*> filter_contribution_2D_: the channel-wise contributions of the filters at a convolution layer
- vector<Dtype> filter_contrib_: the per-filter contributions at a convolution layer
- vector<int> filter_mask_: the filter mask at a convolution layer

Net
- vector<int> conv_layer_ids_: the IDs of the convolution layers in the net
- int num_filter_total_: the total number of filters in the net
- vector<Dtype> filter_contrib_total_: the collection of all per-filter contributions in the net

BaseConvolutionLayer
- shared_ptr<Blob<Dtype> > masked_weight_: the masked weight blob that takes part in the forward and backward passes

Solver
- is_pruning and the other newly added hyper-parameters for GDP in caffe.proto
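As a rough illustration of how a masked weight blob like masked_weight_ could enter the forward pass, here is a minimal NumPy sketch. The function name apply_filter_mask and the shapes are illustrative assumptions, not Caffe-GDP's actual C++ symbols:

```python
import numpy as np

def apply_filter_mask(weights, filter_mask):
    """Zero out pruned filters before the forward pass.

    weights: (num_filters, in_channels, kh, kw) conv weights
    filter_mask: (num_filters,) array of 0/1 entries
    Broadcasting the mask over the filter axis zeroes every weight of a
    pruned filter while leaving the others untouched.
    """
    return weights * filter_mask[:, None, None, None]

weights = np.ones((4, 3, 3, 3))       # 4 filters of shape 3x3x3
mask = np.array([1, 0, 1, 1])         # filter 1 is currently pruned
masked = apply_filter_mask(weights, mask)
```

Because the mask only zeroes the weights rather than deleting them, a pruned filter's parameters survive in the blob and can be reactivated later.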

Caffe-GDP's training iteration differs from standard Caffe's: right after backward propagation it updates the filter mask according to the global ranking of all filters' contributions, and it applies the mask to the weight blobs before the forward pass of the next iteration.
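The global ranking step described above can be sketched as follows. This is a hedged NumPy illustration, assuming a scalar contribution score per filter (e.g. norm-based); the function name and threshold logic are illustrative, not the repository's actual code:

```python
import numpy as np

def update_global_mask(filter_contribs, pruning_rate):
    """Keep the top `pruning_rate` fraction of filters across ALL conv layers.

    filter_contribs: list of 1-D arrays, one per conv layer, holding each
    filter's contribution score. Returns one 0/1 mask array per layer.
    Because the threshold is computed over the concatenation of all layers,
    the pruning is global; because scores are recomputed every update, a
    previously pruned filter can come back, making the pruning dynamic.
    """
    all_c = np.concatenate(filter_contribs)
    num_keep = int(round(pruning_rate * all_c.size))
    # global threshold: score of the num_keep-th largest filter
    threshold = np.sort(all_c)[::-1][num_keep - 1] if num_keep > 0 else np.inf
    return [(c >= threshold).astype(int) for c in filter_contribs]

contribs = [np.array([0.9, 0.1, 0.5]), np.array([0.7, 0.2])]
masks = update_global_mask(contribs, pruning_rate=0.6)
# keeps the 3 strongest of the 5 filters, across both layers
```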

Instructions

Below is an overview of the newly added hyper-parameters in caffe.proto.

Hyper-Parameter | Meaning | Default
is_pruning | whether to perform GDP | false
pruning_rate | the proportion of filters remaining after GDP | 1.0
mask_updating_step | total steps of the mask-updating interval | 1
mask_updating_stepsize | how often the mask-updating interval is changed | 1000
mi_policy | how the mask-updating interval changes ("exp"/"minus") | "minus"
log_type | how the mask is printed ("debug"/"release") | "debug"
log_name | the name of the log file for the printed mask | "mask.log"
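A solver prototxt that enables GDP might contain a fragment like the following. The standard solver fields and their values are illustrative; the GDP fields are those listed in the table above:

```
# illustrative solver fragment; standard fields abridged
net: "examples/mnist/lenet_train_test.prototxt"
base_lr: 0.01
max_iter: 3000
snapshot_prefix: "examples/mnist/lenet"
# GDP hyper-parameters added by Caffe-GDP
is_pruning: true
pruning_rate: 0.5
mask_updating_step: 1
mask_updating_stepsize: 1000
mi_policy: "minus"
log_type: "release"
log_name: "mask.log"
```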

The following is a guide to performing GDP on a typical CNN, taking LeNet-5 as an example.

1. First, enter the root directory of Caffe-GDP and train the net from scratch as usual:

./build/tools/caffe train -solver examples/mnist/lenet_solver.prototxt

2. Then turn on GDP in the solver prototxt, set the necessary parameters, and resume training from the trained weights:

./build/tools/caffe train -solver examples/mnist/lenet_solver_pruning.prototxt -weights examples/mnist/lenet_iter_10000.caffemodel

3. Run a Python script to cut the caffemodel automatically according to mask.log:

python ./python/auto_caffemodel_pruning.py
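The cutting step presumably amounts to the following: for each conv layer, keep only the masked-in filters, which shrinks both that layer's output channels and the next layer's input channels. A hedged NumPy sketch (names and shapes are illustrative; the real script additionally parses mask.log and rewrites the caffemodel file):

```python
import numpy as np

def prune_conv_pair(w_cur, b_cur, w_next, mask):
    """Physically remove pruned filters from a pair of adjacent conv layers.

    w_cur: (F, C, kh, kw) weights of the current conv layer
    b_cur: (F,) biases of the current conv layer
    w_next: (F2, F, kh2, kw2) weights of the following conv layer
    mask: (F,) 0/1 filter mask as recorded in mask.log
    """
    keep = np.flatnonzero(mask)
    # drop pruned output channels here, and the matching input
    # channels of the next layer
    return w_cur[keep], b_cur[keep], w_next[:, keep]

w1 = np.random.randn(4, 1, 5, 5)
b1 = np.random.randn(4)
w2 = np.random.randn(8, 4, 3, 3)
mask = np.array([1, 0, 1, 0])
w1p, b1p, w2p = prune_conv_pair(w1, b1, w2, mask)
# resulting shapes: (2, 1, 5, 5), (2,), (8, 2, 3, 3)
```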

4. (Optional) Fine-tune the pruned model:

./build/tools/caffe train -solver examples/mnist/lenet_solver_finetune.prototxt -weights examples/mnist/lenet_iter_3000_pruned.caffemodel

When GDP finishes, we get a caffemodel of about 799 kB (pruning_rate: 0.5), only 47.4% of the original 1684 kB, with an accuracy of 98.91% compared to the original 99.02%.
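A quick sanity check of the numbers quoted above:

```python
# 799 kB pruned model vs. 1684 kB original; accuracy 98.91% vs. 99.02%
ratio = 799 / 1684
acc_drop = 99.02 - 98.91
print(f"size retained: {ratio:.1%}")   # prints "size retained: 47.4%"
print(f"accuracy drop: {acc_drop:.2f} percentage points")
```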

GDP is a learnable pruning method for typical CNN architectures that makes the net much thinner and faster while maintaining close to the original accuracy.

Caffe


Caffe is a deep learning framework made with expression, speed, and modularity in mind. It is developed by Berkeley AI Research (BAIR)/The Berkeley Vision and Learning Center (BVLC) and community contributors.

Check out the project site for all the details and step-by-step examples.


Community


Please join the caffe-users group or gitter chat to ask questions and talk about methods and models. Framework development discussions and thorough bug reports are collected on Issues.

Happy brewing!

License and Citation

Caffe is released under the BSD 2-Clause license. The BAIR/BVLC reference models are released for unrestricted use.

Please cite Caffe in your publications if it helps your research:

@article{jia2014caffe,
  Author = {Jia, Yangqing and Shelhamer, Evan and Donahue, Jeff and Karayev, Sergey and Long, Jonathan and Girshick, Ross and Guadarrama, Sergio and Darrell, Trevor},
  Journal = {arXiv preprint arXiv:1408.5093},
  Title = {Caffe: Convolutional Architecture for Fast Feature Embedding},
  Year = {2014}
}
