onnc-tutorial

Introduction

The NVIDIA Deep Learning Accelerator (NVDLA) is a free, open intellectual property license available to anyone who wants to build a chip that uses deep neural networks for inference. Thanks to its extensive documentation and tools, many business proposals and research projects have chosen NVDLA as their inference engine design. However, the lack of extensible compiler support has become the major bottleneck to supporting more AI models and optimizations. This tutorial presents the first open-source compiler that supports NVDLA-based designs. The ONNC compiler supports more operators than the official NVDLA compiler and relieves programmers from manually handling the low-level details of models that the official compiler cannot process. It also opens up opportunities for hardware customization and proprietary optimization. We cover an overview, porting, and optimization in three subsections, each with hands-on labs that demonstrate how to run and customize the NVDLA backend in ONNC for product development and research projects.

ONNC (Open Neural Network Compiler) is a retargetable compilation framework designed specifically for proprietary deep learning accelerators. Its software architecture expedites porting ONNC to any deep learning accelerator (DLA) design that supports ONNX (Open Neural Network Exchange) operators. ONNC guarantees executability across every such DLA by transforming ONNX models into DLA-specific binary forms, leveraging the intermediate representation (IR) design of ONNX along with effective algorithms to reduce the overhead of data movement. ONNC is the first open-source compiler available for NVDLA-based hardware designs. Its NVDLA backend compiles a model into an executable NVDLA Loadable file, and integrating ONNC with the NVDLA software stack opens up opportunities for developers and researchers to explore NVDLA-based inference designs at the system level.
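Concretely, the compilation flow described above comes down to invoking the ONNC driver with the NVDLA target. The following is a minimal sketch, assuming the onnc/onnc-community Docker image and the -mquadruple target-selection flag from the ONNC documentation; the model path and directory layout are illustrative, not part of this repository.

    # Pull the prebuilt ONNC community image (assumed image name).
    docker pull onnc/onnc-community

    # Start a container, mounting a host directory that holds an ONNX model.
    docker run -it --rm -v "$(pwd)/models:/models" onnc/onnc-community

    # Inside the container: select the NVDLA backend and compile the model.
    # The backend emits an NVDLA Loadable (out.nvdla by default) that the
    # NVDLA runtime can load and execute.
    onnc -mquadruple nvdla /models/lenet/lenet.onnx

The resulting Loadable can then be fed to the NVDLA runtime, for example on the NVDLA virtual platform, to exercise the full inference flow at the system level.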

This tutorial was presented at MICRO 2019, the 52nd IEEE/ACM International Symposium on Microarchitecture, on October 12, 2019 in Columbus, Ohio.

Intended Audience

Researchers and practitioners in academia or industry looking for an open-source AI compiler for NVDLA-based neural network inference engines.

Contributors

Hands-on Labs

References

Papers

  • W. F. Lin, D. Y. Tsai, L. Tang, C. T. Hsieh, C. Y. Chou, P. H. Chang, and L. Hsu, "ONNC: A compilation framework connecting ONNX to proprietary deep learning accelerators," in IEEE International Conference on Artificial Intelligence Circuits and Systems (AICAS 2019). IEEE, 2019.

  • W. F. Lin, C. T. Hsieh, and C. Y. Chou, "ONNC-based software development platform for configurable NVDLA designs," in IEEE International Symposium on VLSI Design, Automation and Test (VLSI-DAT 2019). IEEE, 2019.

Documentation
