HAPI: Hardware-Aware Progressive Inference

08/10/2020
by Stefanos Laskaridis, et al.

Convolutional neural networks (CNNs) have recently become the state of the art across a diverse range of AI tasks. Despite their popularity, CNN inference still comes at a high computational cost. A growing body of work aims to alleviate this by exploiting the difference in classification difficulty among samples and early-exiting at different stages of the network. Nevertheless, existing studies on early exiting have primarily focused on the training scheme, without considering the use-case requirements or the deployment platform. This work presents HAPI, a novel methodology for generating high-performance early-exit networks by co-optimising the placement of intermediate exits together with the early-exit strategy at inference time. Furthermore, we propose an efficient design space exploration algorithm which enables the faster traversal of a large number of alternative architectures and generates the highest-performing design, tailored to the use-case requirements and target hardware. Quantitative evaluation shows that our system consistently outperforms alternative search mechanisms and state-of-the-art early-exit schemes across various latency budgets. Moreover, it further improves the performance of highly optimised hand-crafted early-exit CNNs, delivering up to 5.11x speedup over lightweight models under imposed latency-driven SLAs for embedded devices.
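To make the early-exit idea concrete, below is a minimal, illustrative sketch of confidence-based early-exit inference: a backbone split into stages, with a lightweight classifier ("exit head") attached after each stage, and inference stopping as soon as the prediction is confident enough. This is not HAPI's actual implementation; the stage split, the `ExitHead` design, and the 0.9 confidence threshold are assumptions chosen purely for demonstration.

```python
# Illustrative sketch of confidence-based early exiting (not HAPI's code).
import torch
import torch.nn as nn
import torch.nn.functional as F


class ExitHead(nn.Module):
    """Lightweight classifier attached to an intermediate feature map."""
    def __init__(self, in_channels: int, num_classes: int):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(in_channels, num_classes)

    def forward(self, x):
        return self.fc(self.pool(x).flatten(1))


class EarlyExitNet(nn.Module):
    """Backbone split into stages, each followed by an early exit."""
    def __init__(self, stages, exit_heads, threshold: float = 0.9):
        super().__init__()
        self.stages = nn.ModuleList(stages)
        self.exits = nn.ModuleList(exit_heads)
        self.threshold = threshold  # confidence required to stop early

    @torch.no_grad()
    def forward(self, x):
        for stage, head in zip(self.stages, self.exits):
            x = stage(x)
            logits = head(x)
            conf = F.softmax(logits, dim=1).max(dim=1).values
            if conf.item() >= self.threshold:  # batch size 1 assumed
                return logits  # confident enough: exit here
        return logits  # otherwise fall through to the last exit


# Toy example: two convolutional stages, an exit head after each.
stages = [
    nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU()),
    nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU()),
]
exits = [ExitHead(16, 10), ExitHead(32, 10)]
model = EarlyExitNet(stages, exits).eval()
print(model(torch.randn(1, 3, 32, 32)).shape)  # torch.Size([1, 10])
```

In HAPI's setting, the placement of such exits along the backbone and the exit policy (e.g. the confidence threshold) are the quantities being co-optimised for the target hardware and latency budget, rather than fixed by hand as in this sketch.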

