Balancing Accuracy and Latency in Multipath Neural Networks

04/25/2021
by Mohammed Amer, et al.

The growing capacity of neural networks has strongly contributed to their success at complex machine learning tasks, and the computational demand of such large models has, in turn, stimulated significant improvements in the hardware needed to accelerate their computations. However, models with high latency are not suitable for resource-constrained environments such as hand-held and IoT devices. Hence, many deep learning techniques aim to address this problem by developing models that achieve reasonable accuracy without violating the resource constraints. In this work, we use a one-shot neural architecture search model to implicitly evaluate the performance of an intractable number of multipath neural networks. Combining this architecture search with a pruning technique and architecture sample evaluation, we can model the relationship between accuracy and latency across a spectrum of models with graded complexity. We show that our method can accurately model the relative performance between models with different latencies and predict the performance of unseen models with good precision across different datasets.
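The abstract describes the approach only at a high level. As a rough illustration, the sketch below shows one way a one-shot multipath supernet could be sampled and profiled to relate accuracy and latency. It is not the authors' implementation: the PyTorch module structure, layer sizes, and the profile_sample helper are assumptions made purely for this example.

    import time
    import torch
    import torch.nn as nn


    class MultipathBlock(nn.Module):
        # A block with several parallel convolutional paths sharing one set of
        # supernet weights; a boolean mask selects which paths are active.
        def __init__(self, channels, num_paths=4):
            super().__init__()
            self.paths = nn.ModuleList([
                nn.Sequential(
                    nn.Conv2d(channels, channels, 3, padding=1),
                    nn.BatchNorm2d(channels),
                    nn.ReLU(inplace=True),
                )
                for _ in range(num_paths)
            ])

        def forward(self, x, mask):
            # Only active paths are evaluated, so the mask changes both the
            # effective architecture and the measured latency.
            active = [path(x) for path, keep in zip(self.paths, mask) if keep]
            return sum(active) if active else x


    class MultipathSupernet(nn.Module):
        def __init__(self, channels=16, num_blocks=3, num_paths=4, num_classes=10):
            super().__init__()
            self.stem = nn.Conv2d(3, channels, 3, padding=1)
            self.blocks = nn.ModuleList(
                [MultipathBlock(channels, num_paths) for _ in range(num_blocks)]
            )
            self.head = nn.Linear(channels, num_classes)

        def forward(self, x, masks):
            x = self.stem(x)
            for block, mask in zip(self.blocks, masks):
                x = block(x, mask)
            x = x.mean(dim=(2, 3))  # global average pooling
            return self.head(x)


    @torch.no_grad()
    def profile_sample(model, masks, loader, device="cpu"):
        # Measure top-1 accuracy and mean forward latency for one sampled
        # multipath sub-network (a fixed set of path masks).
        model.eval()
        correct, total, elapsed = 0, 0, 0.0
        for images, labels in loader:
            images, labels = images.to(device), labels.to(device)
            start = time.perf_counter()
            logits = model(images, masks)
            elapsed += time.perf_counter() - start
            correct += (logits.argmax(dim=1) == labels).sum().item()
            total += labels.numel()
        return correct / total, elapsed / len(loader)

After the shared supernet weights are trained, random path masks (for example, masks = [torch.rand(4) < 0.5 for _ in model.blocks]) can be drawn and passed to profile_sample together with a validation loader to populate an accuracy-latency curve for sub-networks of graded complexity.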


Related Research

Neural Architecture Search for Improving Latency-Accuracy Trade-off in Split Computing (08/30/2022)
This paper proposes a neural architecture search (NAS) method for split ...

Efficient Neural Architecture Search with Performance Prediction (08/04/2021)
Neural networks are powerful models that have a remarkable ability to ex...

Tiered Pruning for Efficient Differentiable Inference-Aware Neural Architecture Search (09/23/2022)
We propose three novel pruning techniques to improve the cost and result...

μNAS: Constrained Neural Architecture Search for Microcontrollers (10/27/2020)
IoT devices are powered by microcontroller units (MCUs) which are extrem...

SpArSe: Sparse Architecture Search for CNNs on Resource-Constrained Microcontrollers (05/28/2019)
The vast majority of processors in the world are actually microcontrolle...

Neural Rejuvenation: Improving Deep Network Training by Enhancing Computational Resource Utilization (12/02/2018)
In this paper, we study the problem of improving computational resource ...

Adaptive Neural Networks Using Residual Fitting (01/13/2023)
Current methods for estimating the required neural-network size for a gi...
