automl darts data-analysis data-generator data-pipeline data-science deep-learning deep-neural-networks feature-extractor machine-learning metalearning neural-architecture-search scikit-learn

     ,--,                                                       
  ,---.'|                      ,--.         ,--.                
  |   | :       ,---,.       ,--.'|       ,--.'|   ,---,        
  :   : |     ,'  .' |   ,--,:  : |   ,--,:  : |  '  .' \       
  |   ' :   ,---.'   |,`--.'`|  ' :,`--.'`|  ' : /  ;    '.     
  ;   ; '   |   |   .'|   :  :  | ||   :  :  | |:  :       \    
  '   | |__ :   :  |-,:   |   \ | ::   |   \ | ::  |   /\   \   
  |   | :.'|:   |  ;/||   : '  '; ||   : '  '; ||  :  ' ;.   :  
  '   :    ;|   :   .''   ' ;.    ;'   ' ;.    ;|  |  ;/  \   \ 
  |   |  ./ |   |  |-,|   | | \   ||   | | \   |'  :  | \  \ ,' 
  ;   : ;   '   :  ;/|'   : |  ; .''   : |  ; .'|  |  '  '--'   
  |   ,/    |   |    \|   | '`--'  |   | '`--'  |  :  :         
  '---'     |   :   .''   : |      '   : |      |  | ,'         
            |   | ,'  ;   |.'      ;   |.'      `--''           
            `----'    '---'        '---'

03.2020 - 06.2020 | zenith-lee, pyossyoung, meowpunch

Supported by the Accelerated Computing Systems Lab, Yonsei Univ.

LENNA (Latency Estimation for Neural Network Architecture) extends Differentiable Architecture Search (DARTS), a high-performing method in Neural Architecture Search (NAS), with latency estimation.

PROGRESS

All progress is recorded in Notion pages [KR]

ABSTRACT

Introduction

Research on NAS, a representative AutoML methodology that has had a major impact on the Artificial Intelligence (AI) field, is being actively pursued. However, most of this research is far from practical and focuses only on performance metrics such as accuracy. By adding hardware metrics such as latency to the loss function, we can search for a practical architecture that can be used in real life.

Related Works

ProxylessNAS searches for architectures while taking target hardware metrics into account. However, ProxylessNAS is applied to a simplified structure in which operations are arranged in parallel, and it is hard to apply to general, complex structures such as those produced by DARTS. We introduce LENNA, a Multi-Layer Perceptron model that estimates latency from basic information about a network, such as parameters and input size.

The latency term of the new DARTS loss function is estimated by LENNA.

Loss = Loss(DARTS) + λ * (expected latency)
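
As a minimal sketch of how the latency term might be combined with the DARTS loss, the snippet below adds λ times an estimated latency to a given loss value. The names `latency_aware_loss`, `toy_estimator`, and the feature list are hypothetical and used only for illustration; `toy_estimator` is a placeholder and does not reproduce the LENNA model.

```python
from typing import Callable, Sequence


def latency_aware_loss(
    darts_loss: float,
    features: Sequence[float],
    estimate_latency: Callable[[Sequence[float]], float],
    lam: float,
) -> float:
    """Return Loss(DARTS) + lambda * (expected latency)."""
    return darts_loss + lam * estimate_latency(features)


def toy_estimator(features: Sequence[float]) -> float:
    # Hypothetical stand-in for a trained latency estimator:
    # latency grows linearly with the sum of the features.
    return 0.01 * sum(features)


# Example: DARTS loss 0.5, estimated latency 2.0, lambda 0.1
total = latency_aware_loss(0.5, [100.0, 100.0], toy_estimator, lam=0.1)
```

A larger λ pushes the search toward lower-latency architectures at the cost of the original DARTS objective.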

Structure

The project includes the following:

submission
- preliminary practice using ElasticNet
- L(one block) = sum(L(op))
generate dataset
preprocessing
modeling
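
The relation L(one block) = sum(L(op)) listed above can be sketched as a plain sum over per-operation latencies. The operation names and millisecond values below are made up for illustration and are not measurements from this project.

```python
from typing import Iterable


def block_latency(op_latencies_ms: Iterable[float]) -> float:
    """Additive model: the latency of one block is the sum of its op latencies."""
    return sum(op_latencies_ms)


# Illustrative values only (not measured).
ops = {"sep_conv_3x3": 0.42, "max_pool_3x3": 0.10, "skip_connect": 0.01}
total_ms = block_latency(ops.values())
```

This additive form assumes the operations run one after another; it does not model any overlap between operations on the device.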

EXPERIMENT

Environment

CPU: AMD Ryzen 7 3700X 8-core Processor * 16
GPU: GeForce RTX 2060 SUPER * 4

Data Generator

batch_size: 64

Input X

num_layer is fixed at 5, which gives a 167-dimensional input

block type (needs to be one-hot encoded): normal(1), reduction(0)
- 16 32 32 → 16 32 32 normal
- 16 32 32 → 32 16 16 reduction
input_channel: 1~1000 (Caution: RuntimeError: CUDA out of memory)
architecture parameters: random, drawn from a uniform distribution
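
A minimal sketch of generating one random input row from the fields listed above. The function name `random_block_features`, the feature ordering, the number of architecture parameters, and the [0, 1] range of the uniform distribution are all assumptions made for illustration; the snippet does not reproduce the project's actual 167-dimensional layout.

```python
import random
from typing import List


def random_block_features(num_arch_params: int, rng: random.Random) -> List[float]:
    """Build one feature row: one-hot block type, input channel, arch params."""
    is_normal = rng.random() < 0.5
    # One-hot encoding over (normal, reduction).
    block_type = [1, 0] if is_normal else [0, 1]
    # input_channel in 1~1000; large values may exhaust GPU memory when measured.
    input_channel = rng.randint(1, 1000)
    # Architecture parameters drawn from a uniform distribution (range assumed).
    arch_params = [rng.uniform(0.0, 1.0) for _ in range(num_arch_params)]
    return block_type + [input_channel] + arch_params


rng = random.Random(0)  # fixed seed so the sketch is reproducible
row = random_block_features(num_arch_params=8, rng=rng)
```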

Init ratio for arch param [KR]

How to generate param? [KR]

Target y

ALGORITHM (about one row)

In analysis.binary_gates:

Whenever the binary gate is reset, estimate the latency 10 times and accumulate a median (40%) value of the estimates. (figure 1)
- How many times do you need to estimate latency when resetting the binary gate? [KR]
Take the average of the cumulative latency. (figure 2)
Compute the error between the current cumulative average and the previous one. (figure 3)
If the error stays below 1% for 10 consecutive times, stop and use that error.
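
The steps above can be sketched roughly as follows. This is an illustrative sketch, not the code in analysis.binary_gates: `measure_after_reset` is a hypothetical callable that resets the binary gate and returns the 10 latency estimates, the 40% value is taken by a simple rank rule on the sorted samples, and the error is assumed to be relative to the previous cumulative average.

```python
from typing import Callable, List, Tuple


def percentile_40(samples: List[float]) -> float:
    """Pick the value at the 40% position of the sorted samples (rank rule assumed)."""
    ordered = sorted(samples)
    return ordered[int(0.4 * (len(ordered) - 1))]


def estimate_until_stable(
    measure_after_reset: Callable[[], List[float]],
    tol: float = 0.01,       # stop threshold: 1% error
    patience: int = 10,      # required number of consecutive hits
    max_resets: int = 1000,  # safety limit (assumption)
) -> Tuple[float, float, int]:
    """Return (cumulative average, last error, number of resets)."""
    cumulative = 0.0
    prev_avg = None
    streak = 0
    for n in range(1, max_resets + 1):
        cumulative += percentile_40(measure_after_reset())  # figure 1
        avg = cumulative / n                                # figure 2
        if prev_avg is not None and prev_avg > 0:
            err = abs(avg - prev_avg) / prev_avg            # figure 3
            streak = streak + 1 if err < tol else 0
            if streak >= patience:
                return avg, err, n
        prev_avg = avg
    raise RuntimeError("latency estimate did not stabilize")


# Usage with a constant dummy measurement (5.0 ms, 10 samples per reset).
avg, err, resets = estimate_until_stable(lambda: [5.0] * 10)
```

With a constant measurement the error is zero from the second reset onward, so the loop stops after 11 resets; real measurements are noisy and typically need more.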

SNAPSHOT

In /latency_by_binary_gates.ipynb

figure 1) cumulative latency
figure 2) cumulative average
figure 3) cumulative error

ELSE

Modeling

Elastic Net
MLP Regressor

REFERENCES

Liu, H., Simonyan, K., and Yang, Y. DARTS: Differentiable architecture search. ICLR, 2019.
Bowen Baker, Otkrist Gupta, Nikhil Naik, and Ramesh Raskar. Designing neural network architectures using reinforcement learning. ICLR, 2017.
Jelena Luketina, Mathias Berglund, Klaus Greff, and Tapani Raiko. Scalable gradient-based tuning of continuous regularization hyperparameters. ICML, pp. 2952–2960, 2016.
Barret Zoph and Quoc V Le. Neural architecture search with reinforcement learning. ICLR, 2017.
Xiangyu Zhang, Xinyu Zhou, Mengxiao Lin, and Jian Sun. ShuffleNet: An extremely efficient convolutional neural network for mobile devices. arXiv preprint arXiv:1707.01083, 2017.
Han Cai, Ligeng Zhu, and Song Han. ProxylessNAS: Direct neural architecture search on target task and hardware. ICLR, 2019.
Liu, C., Zoph, B., Shlens, J., Hua, W., Li, L.-J., Fei-Fei, L., Yuille, A., Huang, J., and Murphy, K. Progressive neural architecture search. ECCV, 2018.
Zoph, B., Vasudevan, V., Shlens, J., and Le, Q. V. Learning transferable architectures for scalable image recognition. CVPR, 2018.
