AIIABenchmark / AIIA-DNN-benchmark

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

AIIA DNN Benchmark Overview

The goal of the alliance is provide selection reference for application companies, and provide third-party evaluation results for chip companies.

The goal of AIIA DNN benchmarks is to objectively reflect the current state of AI accelerator capabilities, and all metrics are designed to provide an objective comparison dimension.

We follow the principle of continuous iteration of the version, continuous enrichment of the scene, and continues to improve the AI chip type, and finally form a evaluation environment for the training and inference including the terminal and the cloud.

Evaluation & Results

Edge / Inference

How To Use

This is a example of image classification application powered by AIIA. Please feel free to try them on your device.

​ This App based on the TensorFlow Lite engine can classify Images from your Devices.

​ Please download resources from App Resources Hub (psw: k04t)

​ Building in Android Studio with TensorFlow Lite AAR from JCenter

​ Import resource files into the device

​ Also refer to the TFLITE Models

adb shell mkdir /sdcard/Android/data/com.xintongyuan.aibench/files
adb shell mkdir /sdcard/Android/data/com.xintongyuan.aibench/files/images
adb shell mkdir /sdcard/Android/data/com.xintongyuan.aibench/files/models
adb shell mkdir /sdcard/Android/data/com.xintongyuan.aibench/files/models/tflite

adb push ./images/. /sdcard/Android/data/com.xintongyuan.aibench/files/images/
adb push ./tflite/. /sdcard/Android/data/com.xintongyuan.aibench/files/models/tflite/

2. Adding a model to run on existing architecture

​ Create a model class and inherit ImageClassifierTF

​ Dynamic binding in the main program

​ Please refer to the TensorFlow Lite example.

3. Adding a new AI frameworks

​ AIBench supports several deep learning frameworks ( SNPE, HIAI,TENGINE and TensorFlow Lite) currently, which may require the following dependencies:

​ you need to download the SNPE, HIAI, TENGINE, TensorFlow Lite, refer to the Demo and API.

​ Other content will be continuously updated.

Five typical application scenario

Test1: Object_Classification

  • Neural Network: Mobilenetv2 / Resnet101 / VGG16 / Inceptionv3
  • Image Resolution: 224 x 224 px |299 x 299 px
  • Metrics: fps / top1 / top5
  • Dataset: ImageNet (1k frames)

Test2: Object Detection

  • Neural Network: ssd_mobilenetv1 / ssd_mobilenetv2 / ssd_vgg16
  • Image Resolution: 300 x 300 px
  • Metrics: fps / mAP / mIoU
  • Dataset: PASCAL VOC2012 (1k frames)

Test3: Image_Super_Resolution

  • Neural Network: vdsr
  • Image Resolution: 256 x 256 px
  • Metrics: fps / PSNR(dB)
  • Dataset: PASCAL VOC2012 (1k frames)

Test4: Image_Segmentation

  • Neural Network: fcn
  • Image Resolution: 224 x 224 px
  • Metrics: fps / mIoU
  • Dataset: PASCAL VOC2012 (1k frames)

Test5: Face_Recognition

  • Neural Network: vgg16
  • Image Resolution: 224 x 224 px
  • Metrics: fps / Accuracy
  • Dataset: LFW (1k frames)

Benchmark Results

INT8 Inference

Product Platform Device Framework System Test1: Object_Classification Test2: Object_Detection Test3: Image_Super_Resolution Test4: Image_Segmentation Test5: Face_Recognition
mobilenet_v2 resnet101 vgg16 inception_v3 ssd_mobilenetv1 ssd_mobilenetv2 ssd_vgg16 vdsr fcn vgg16
FPS TOP1 TOP5 FPS TOP1 TOP5 FPS TOP1 TOP5 FPS TOP1 TOP5 FPS mAP mIoU FPS mAP mIoU FPS mAP mIoU FPS PSNR(dB) FPS mAP mIoU FPS Accuracy
Huawei_Mate_20 kirin_980 NPU HIAI Android 101.90 71.3% 88.3% 43.78 71.9% 88.4% 32.38 64.3% 85% 58.32 75.8% 91.5% 65.68 0.84 0.83 52.39 0.55 0.80 14.06 0.89 0.79 12.42 24.92 - - - - -
ROC_RK3399_PC CortexA72_x_2 CortexA53_x_4 CPU TENGINE Android 17.41 73.30% 91.30% 1.94 75.1% 93.1% 1.115 68.2% 89.4% 2.2 77.5% 93.5% - - - - - - - - - - - - - - - -

FLOAT16 Inference

Product Platform Device Framework System Test1: Object_Classification Test2: Object_Detection Test3: Image_Super_Resolution Test4: Image_Segmentation Test5: Face_Recognition
mobilenet_v2 resnet101 vgg16 inception_v3 ssd_mobilenetv1 ssd_mobilenetv2 ssd_vgg16 vdsr fcn vgg16
FPS TOP1 TOP5 FPS TOP1 TOP5 FPS TOP1 TOP5 FPS TOP1 TOP5 FPS mAP mIoU FPS mAP mIoU FPS mAP mIoU FPS PSNR(dB) FPS mAP mIoU FPS Accuracy
Huawei_Mate_20 kirin_980 NPU HIAI Android 54.2 70.7% 88.2% 21.98 72.3% 89.2% 13.53 66.1% 85.2% 32.93 75.7% 92.3% 35 0.86 0.84 29.97 0.62 0.78 7.276 0.96 0.84 7.64 24.92 1.39 - - - -
Cloud / Inference

Environment requirement

In order to follow the objective and fair principle in the AI chip evaluation process, the tested party is required to perform and submit a test report during the self-test according to the following requirements.

  1. Hardware environment requirements
No. Hardware requirements
1 Computing Configuration Single node & single card
2 CPU Intel(R) Xeon(R) Silver 4114 CPU @2.20GHz
3 Memory 64G DDR4
4 Storage 512G SSD
  1. Software environment requirements
No. Option requirements
1 Test data set ILSVRC2015 validation on ImageNet (50k frames )
2 application scenario
(Including but not limited to other scenarios)
Object_Classification
3 Neural Network
(Including but not limited to other models)
VGG16/Resnet50/Resnet152/MobileNet_v1 (Offered by AIIA)
4 Acceleration framework Adapt to the AI card
5 Metrics Latency Accuracy Throughput Power
Computing power per watt(frame/sec/w)
The calculation of all test indicators is based on the test data set
and can be calculated in multiple scripts
  1. Procedure requirements
No. Option requirements
1 Pre-processing Standardize with z-score (non-crop)
2 Batch size 1/2/4/8/16/32/64/128
3 Inference latency Inference time without pre-processing and post-processing
4 Power Average power during inference, excluding power of other peripheral modules
5 Program running sequence --->Task initialization (quantization model, loading model)
--->Pre-processing
--->Start monitoring power
---> Start the timer
---> Inference
--->End of time
--->End of power monitoring
---> post-processing
--->Metrics output
6 Log format ###################
processor_name:
test_name:
model_name:
batch size:
power:
latency:(ms/batch)
throughput:(batch size/latency*1000)
top1:
top5:
###################
  1. Sample results

+---------------------------------------------------------------------------------------+
|                                 Resnet50(INT8)                                      |
+---------------------------------------------------------------------------------------+
| top1/top5 | batch size | Latency(ms) | Throughput | Power(w) | 每瓦算力 (/frame/sec/w) |
|-----------|---------------------------------------------------------------------------|
|           | 1          |             |            |          |                        |
|           |---------------------------------------------------------------------------|
|           | 2          |             |            |          |                        |
|           |---------------------------------------------------------------------------|
|           | 4          |             |            |          |                        |
|           |---------------------------------------------------------------------------|
|           | 8          |             |            |          |                        |
|           |---------------------------------------------------------------------------|
|           | 16         |             |            |          |                        |
|           |---------------------------------------------------------------------------|
|           | 32         |             |            |          |                        |
|           |---------------------------------------------------------------------------|
|           | 64         |             |            |          |                        |
|           |---------------------------------------------------------------------------|
|           | 128        |             |            |          |                        |
+---------------------------------------------------------------------------------------+

License

Apache License 2.0.

About

License:Apache License 2.0


Languages

Language:Java 100.0%