microsoft / nn-Meter

A DNN inference latency prediction toolkit for accurately modeling and predicting the latency on diverse edge devices.

Roadmap

mydmdm opened this issue · comments

nn-Meter is not only a latency predictor but also a critical component in hardware-aware model design. It empowers existing NAS (neural architecture search) and other efficient model design tasks to be specialized for the target hardware platform.
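
As a concrete illustration of the latency-prediction use case, below is a minimal sketch assuming nn-Meter's `load_latency_predictor` / `predict` interface; the predictor name, `model_type` value, and input shape are illustrative and may differ from the released API.

```python
# Minimal sketch of predicting inference latency with nn-Meter (assumed interface;
# the predictor name and input shape are examples, not a definitive usage).
from nn_meter import load_latency_predictor
import torchvision.models as models

# Load a pre-trained predictor for a target edge device.
predictor = load_latency_predictor("cortexA76cpu_tflite21")

# Predict end-to-end inference latency (ms) without running the model on the device.
model = models.resnet18()
latency_ms = predictor.predict(model, model_type="torch", input_shape=(1, 3, 224, 224))
print(f"predicted latency: {latency_ms:.2f} ms")
```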

Multiple aspects will be covered in this and related repositories, including:

  • latency prediction and pre-trained predictors
    • the IR converter and kernel detection tools
    • builtin kernel predictors and pre-trained weights
  • algorithm integration (mainly in NNI): the integration of latency prediction into existing NAS and compression algorithms (a rough sketch of this pattern follows this list)
  • model latency dataset: the collected latencies of thousands of model architectures, along with data loaders and an improved GNN predictor
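
For the algorithm-integration item above, the intended pattern is roughly to use the predictor as a hardware constraint inside the search loop. The sketch below is hypothetical and not the NNI integration itself: the architecture sampler and accuracy evaluator are placeholders supplied by the search algorithm, and the `predict` call assumes the interface sketched earlier.

```python
# Hypothetical sketch: a latency predictor used as a hardware constraint in a
# multi-trial NAS loop. `sample_candidate` and `evaluate_accuracy` are callables
# provided by the search algorithm; they are not nn-Meter or NNI APIs.
def latency_constrained_search(predictor, sample_candidate, evaluate_accuracy,
                               latency_budget_ms=30.0, num_trials=100):
    best = None
    for _ in range(num_trials):
        model = sample_candidate()                       # propose an architecture
        latency = predictor.predict(model, model_type="torch")
        if latency > latency_budget_ms:                  # discard over-budget candidates
            continue
        accuracy = evaluate_accuracy(model)
        if best is None or accuracy > best["accuracy"]:
            best = {"model": model, "accuracy": accuracy, "latency_ms": latency}
    return best
```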

Release Plan

version 1.0-alpha

  • Date: 2021 August
  • Latency prediction
    • basic framework and utilities for latency prediction (e.g., config management, artifacts downloading, builtin predictors)
    • basic CI workflow with integrated test
    • documentation and examples
  • Algorithm integration
    • initial multi-trial NAS example

version 1.0-beta

  • Date: 2021 November
  • Algorithm integration
    • SPOS / Proxyless NAS in NNI
    • SPOS: first integrate nn-meter in the evolution search (moved to 2.0)
    • Proxyless NAS: predict the block latencies in the search space and provide the lookup table (a rough sketch follows this list)
  • Dataset
    • make model-latency dataset public
    • reference design of an improved GNN latency predictor
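
For the Proxyless NAS item above, the lookup table maps each candidate block in the search space to its predicted latency. The sketch below shows one way such a table could be built with a loaded predictor; the candidate blocks and names are illustrative placeholders, not the shipped integration, and the `predict` arguments are assumptions.

```python
# Hypothetical sketch: build a block-latency lookup table for a ProxylessNAS-style
# search space. The candidate blocks are simple placeholders; a real search space
# would enumerate its own operator choices (e.g., MBConv variants).
import torch.nn as nn

def candidate_blocks(cin, cout, stride):
    return {
        "conv3x3": nn.Conv2d(cin, cout, 3, stride=stride, padding=1),
        "conv5x5": nn.Conv2d(cin, cout, 5, stride=stride, padding=2),
        "skip": nn.Identity(),
    }

def build_lookup_table(predictor, cin, cout, stride, input_shape):
    table = {}
    for name, block in candidate_blocks(cin, cout, stride).items():
        # Predicted latency (ms) of this block at the given feature-map shape.
        table[name] = predictor.predict(block, model_type="torch",
                                        input_shape=input_shape)
    return table

# Example: candidate-block latencies at a 1x32x56x56 input.
# table = build_lookup_table(predictor, cin=32, cout=32, stride=1,
#                            input_shape=(1, 32, 56, 56))
```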

version 2.0

  • Date: 2021 December
  • Algorithm integration
    • SPOS: first integrate nn-meter in the evolution search
  • Latency predictor building tools
    • fusion rule detection
    • adaptive data sampler (a rough sketch of the idea follows this list)
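
For context on the adaptive data sampler item above, the idea in the paper is roughly to concentrate sampling where the current kernel predictors are least accurate. The loop below is only one plausible reading of that idea, not the actual implementation: all callables (prior sampler, fine-grained sampler, on-device measurement, predictor fitting/inference) are placeholders supplied by the user.

```python
# Rough sketch of an error-guided adaptive sampling loop (illustrative only):
# sample configurations, measure their latency on the device, fit a predictor,
# then sample more densely around the configurations with the largest errors.
def adaptive_sampling(sample_prior, sample_near, measure, fit, predict,
                      rounds=5, init_size=200, top_k=20, per_config=2):
    configs = sample_prior(init_size)                  # prior-based initial samples
    latencies = [measure(c) for c in configs]          # measured on the target device
    for _ in range(rounds):
        model = fit(configs, latencies)                # (re)fit the kernel predictor
        errors = [abs(predict(model, c) - y) / y for c, y in zip(configs, latencies)]
        # Pick the configurations the predictor currently gets most wrong.
        ranked = sorted(zip(errors, configs), key=lambda pair: pair[0], reverse=True)
        new_configs = [c for _, cfg in ranked[:top_k] for c in sample_near(cfg, per_config)]
        configs += new_configs
        latencies += [measure(c) for c in new_configs]
    return fit(configs, latencies)
```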

Hello,

the paper mentions methods for:

  1. detecting the fusion rules on a device
  2. adaptive sampling for creating the latency dataset

Will these be added to the repository?
If so, do you have a rough time frame for when they will be available?

@gmimsgt Hi, we plan to add the fusion rule detection and adaptive sampling algorithms. We will start after version 1.0-beta is finished.

@Lynazhang Thanks for the quick answer.

I appreciate the effort put into polishing the code base, as it allowed me to get started quickly.
The fusion rule detection and adaptive sampling are especially interesting, as I am currently trying to predict/benchmark a new device. The paper has been very helpful in this regard, and I would love to try out the implementation.

If it is not an inconvenience, would it be possible to get the current state of the code?

Hi, I'm wondering if you would be willing to share your modifications to TFLite that implement the GPU operator-level profiling?

Hi @liuyibox, we will soon share a patch for the GPU operator-level profiling.