CAS-CLab / Optimal-Ternary-Weights-Approximation

Caffe implementation of Optimal-Ternary-Weights-Approximation in "Two-Step Quantization for Low-bit Neural Networks" (CVPR2018).

Optimal Ternary Weights Approximation

Objective Function

$$\min_{\alpha,\,\hat{\mathbf{w}}}\ \left\|\mathbf{w} - \alpha\hat{\mathbf{w}}\right\|_2^2$$

where $\alpha > 0$ and $\hat{w}_i \in \{-1, 0, +1\}$.
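This objective admits an exact sorting-based solution: for any fixed number k of nonzero entries, the best choice keeps the k largest magnitudes (with their signs) and sets α to their mean, so one only has to maximize (sum of top-k magnitudes)²/k over k. Below is an illustrative NumPy sketch of that solver (a reconstruction of the idea for one weight vector, not the repo's CUDA kernel):

```python
import numpy as np

def ternary_approx(w):
    """Optimal ternary approximation of a 1-D weight vector w:
    minimize ||w - alpha * w_hat||_2^2 over alpha > 0 and
    w_hat with entries in {-1, 0, +1}.
    """
    a = np.sort(np.abs(w))[::-1]      # magnitudes, descending
    csum = np.cumsum(a)               # top-k magnitude sums
    k = np.arange(1, len(a) + 1)
    gain = csum ** 2 / k              # objective reduction for each k
    k_best = int(np.argmax(gain)) + 1
    thresh = a[k_best - 1]            # smallest magnitude kept
    w_hat = np.where(np.abs(w) >= thresh, np.sign(w), 0.0)
    alpha = csum[k_best - 1] / k_best # mean magnitude of kept entries
    return alpha, w_hat

w = np.array([0.9, -0.05, 0.4, -0.6, 0.02])
alpha, w_hat = ternary_approx(w)
```

In the layer itself this approximation would be applied per output channel over the kernel weights.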

Weight Blob

We use a temporary memory block to store the quantized weights $\alpha\hat{\mathbf{w}}$ and keep the full-precision weights $\mathbf{w}$ in `this->blobs_[0]`. During back-propagation, the gradients are accumulated into the full-precision $\mathbf{w}$, while the quantized $\alpha\hat{\mathbf{w}}$ is used to compute the bottom gradients.
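The interplay between the two weight copies can be sketched in NumPy with a 1-D dot product standing in for the convolution (an illustration of the straight-through-style update described above, with a placeholder threshold ternarizer; names and shapes are ours, not the repo's):

```python
import numpy as np

def quantize(w):
    # Placeholder ternarizer (simple threshold heuristic), standing in
    # for the optimal approximation described above.
    thresh = 0.7 * np.abs(w).mean()
    w_hat = np.where(np.abs(w) > thresh, np.sign(w), 0.0)
    k = max(int((w_hat != 0).sum()), 1)
    alpha = np.abs(w[w_hat != 0]).sum() / k
    return alpha, w_hat

def forward(x, w_full):
    # Quantize into a temporary copy; the full-precision blob is untouched.
    alpha, w_hat = quantize(w_full)
    w_q = alpha * w_hat
    return float(x @ w_q), w_q

def backward(x, top_diff, w_q):
    # Bottom gradient uses the quantized weights (they produced the output);
    # the weight gradient is accumulated into the full-precision blob.
    bottom_diff = top_diff * w_q
    weight_diff = top_diff * x
    return bottom_diff, weight_diff

w_full = np.array([0.9, -0.05, 0.4, -0.6])
x = np.array([1.0, 2.0, -1.0, 0.5])
y, w_q = forward(x, w_full)
bdiff, wdiff = backward(x, 1.0, w_q)
w_full -= 0.1 * wdiff   # SGD step on the full-precision weights
```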

How to use?

Change `type: "Convolution"` to `type: "TernaryConvolution"` in your prototxt, e.g.

```
layer {
    bottom: "pool1"
    top: "res2a_branch1"
    name: "res2a_branch1"
    type: "TernaryConvolution"
    convolution_param {
        num_output: 64
        kernel_size: 1
        pad: 0
        stride: 1
        weight_filler {
            type: "msra"
        }
        bias_term: false
    }
}
```

So far, GPU only.

2-bit Activation Quantization

Please refer to wps712.

About

License: BSD 2-Clause "Simplified" License


Languages

Cuda 53.6%, C++ 46.4%