ChamNet: Towards Efficient Network Design through Platform-Aware Model Adaptation

by Xiaoliang Dai et al.

This paper proposes an efficient neural network (NN) architecture design methodology called Chameleon that honors given resource constraints. Instead of developing new building blocks or using computationally intensive reinforcement learning algorithms, our approach leverages existing efficient network building blocks and focuses on exploiting hardware traits and adapting computational resources to fit target latency and/or energy constraints. We formulate platform-aware NN architecture search as an optimization problem and propose a novel algorithm that searches for optimal architectures with the aid of efficient accuracy and resource (latency and/or energy) predictors. At the core of our algorithm lies an accuracy predictor built atop a Gaussian Process with Bayesian optimization for iterative sampling. With a one-time cost to build the predictors, our algorithm produces state-of-the-art model architectures for different platforms under given constraints in just minutes. Our results show that adapting computational resources to building blocks is critical to model performance. Without any bells and whistles, our models achieve significant accuracy improvements over state-of-the-art hand-crafted and automatically designed architectures: 73.8% top-1 accuracy on ImageNet at 20 ms latency on a mobile CPU and DSP. At reduced latency, our models achieve up to 8.5% absolute accuracy improvements over MobileNetV2 and MnasNet, respectively, on a mobile CPU (DSP), and 2.7% gains over ResNet-152 on an Nvidia GPU (Intel CPU).



