dabnn

Enjoy binary neural networks on mobile!

Join chat at Gitter (English) or QQ Group (Chinese, 1021964010, answer: nndab)

Our ACM MM paper: https://arxiv.org/abs/1908.05858

Introduction

Binary neural networks (BNNs) have great potential on edge devices since they replace float operations by efficient bit-wise operations. However, to leverage the efficiency of bit-wise operations, the reimplmentation of convolution layer and also other layers is needed.

To our best knowledge, dabnn is the first highly-optimized binary neural networks inference framework for mobile platform. We implemented binary convolutions with ARM assembly. On Google Pixel 1, our dabnn is as 800%~2400% faster as BMXNet (the only one open-sourced BNN inference framework except dabnn to our best knowledge) on a single binary convolution, and as about 700% faster as it on binarized ResNet-18.

Benchmark and Comparison

Benchmark result on Google Pixel 1 (single thread):

2019-05-06 10:36:48
Running data/local/tmp/dabnn_benchmark
Run on (4 X 1593.6 MHz CPU s)
***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead.
--------------------------------------------------------------------
Benchmark                             Time           CPU Iterations
--------------------------------------------------------------------
dabnn_5x5_256                   3661928 ns    3638192 ns        191     <--- input: 14*14*256, kernel: 256*5*5*256, output: 14*14*256, padding: 2
dabnn_3x3_64                    1306391 ns    1281553 ns        546     <--- input: 56*56*64,  kernel: 64*3*3*64, output: 56*56*64, padding: 1
dabnn_3x3_128                    958388 ns     954754 ns        735     <--- input: 28*28*128, kernel: 128*3*3*128, output: 28*28*128, padding: 1
dabnn_3x3_256                    975123 ns     969810 ns        691     <--- input: 14*14*256, kernel: 256*3*3*256, output: 14*14*256, padding: 1
dabnn_3x3_256_s2                 268310 ns     267712 ns       2618     <--- input: 14*14*256, kernel: 256*3*3*256, output: 7*7*256, padding: 1, stride: 2
dabnn_3x3_512                   1281832 ns    1253921 ns        588     <--- input:  7* 7*512, kernel: 512*3*3*512, output:  7* 7*512, padding: 1
dabnn_bireal18_imagenet        61920154 ns   61339185 ns         10     <--- Bi-Real Net 18, 56.4% top-1 on ImageNet
dabnn_bireal18_imagenet_stem   43294019 ns   41401923 ns         14     <--- Bi-Real Net 18 with stem module (The network structure is described in detail in [our paper](https://arxiv.org/abs/1908.05858)), 56.4% top-1 on ImageNet

The following is the comparison between our dabnn and Caffe (full precision), TensorFlow Lite (full precision) and BMXNet (binary). We surprisingly observe that BMXNet is even slower than the full precision TensorFlow Lite. It suggests that the potential of binary neural networks is far from exploited until our dabnn is published.

Build

We provide pre-built onnx2bnn and also dabnn Android package. However, you need to build it if you want to deploy BNNs on non-Android ARM devices.

We use CMake build system like most C++ projects. Check out docs/build.md for the detail instructions.

Convert ONNX Model

We provide a conversion tool, named onnx2bnn, to convert an ONNX model to a dabnn model. We provide onnx2bnn pre-built binaries for all platforms in GitHub Releases. For Linux users, the onnx2bnn pre-built binary is AppImage format, see https://appimage.org for details.

Note: Binary convolution is a custom operator, so whether the ONNX model is dabnn-comptabile heavily depends on the implementation of the binary convolution in the training code. Please check out our wiki for the further information.

After conversion, the generated dabnn model can be deployed on ARM devices (e.g., mobile phones and embedded devices). For Android developer, we have provided Android AAR package and published it on jcenter, for the usage please check out example project.

Pretrained Models

We publish two pretrained binary neural network models based on Bi-Real Net on ImageNet. More pretrained models will be published in the future.

Bi-Real Net 18, 56.4% top-1 on ImageNet, 61.3ms/image on Google Pixel 1 (single thread). [dabnn] [ONNX]
Bi-Real Net 18 with Stem Module, 56.4% top-1 on ImageNet, 43.2ms/image on Google Pixel 1 (single thread). The detailed network structure is described in our paper. [dabnn] [ONNX]

Implementation Details

The Implementation of Binary Convolutions: docs/bconv.md
Model Conversion: docs/onnx2bnn.md

For more details please read our ACM MM paper.

Example project

Android app demo: https://github.com/JDAI-CV/dabnn-example

Related works using dabnn

The following two papers use dabnn to measure the latency of their binary networks on real devices:

License and Citation

BSD 3 Clause

Please cite daBNN in your publications if it helps your research:

@misc{zhang2019dabnn,
  Author = {Jianhao Zhang and Yingwei Pan and Ting Yao and He Zhao and Tao Mei},
  Title = {daBNN: A Super Fast Inference Framework for Binary Neural Networks on ARM devices},
  Year = {2019},
  Eprint = {arXiv:1908.05858},
}

yngil / dabnn