ATHEENA: A Toolflow for Hardware Early-Exit Network Automation

04/17/2023
by Benjamin Biggs, et al.

The continued need for improvements in the accuracy, throughput, and efficiency of deep neural networks has produced a multitude of methods that exploit custom FPGA architectures, including hand-crafted networks and the use of quantization and pruning to remove extraneous parameters. However, with the potential of such static solutions already well exploited, we propose shifting the focus to the varying difficulty of individual data samples in order to further improve efficiency and reduce the average compute required for classification. Input-dependent computation allows a network to decide at runtime to finish a task early if the result already meets a confidence threshold, and Early-Exit network architectures have become an increasingly popular way to implement this behaviour in software. We present A Toolflow for Hardware Early-Exit Network Automation (ATHEENA), an automated FPGA toolflow that leverages the probability of samples exiting early from such networks to scale the resources allocated to different sections of the network. The toolflow uses the dataflow model of fpgaConvNet, extended to support Early-Exit networks, together with Design Space Exploration to optimize the generated streaming-architecture hardware, increasing throughput and reducing area while maintaining accuracy. Experimental results on three different networks demonstrate a throughput increase of 2.00× to 2.78× over an optimized baseline implementation with no early exits. The toolflow can also match the baseline's throughput with as little as 46% of the resources the baseline requires.
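The early-exit mechanism itself is simple to state in software. Below is a minimal PyTorch sketch of confidence-based early exit, not ATHEENA's actual API: every module and parameter name here (stage1, exit1, threshold, and so on) is hypothetical. A sample whose first-exit softmax confidence clears the threshold skips the remainder of the network entirely.

```python
# Minimal early-exit inference sketch (hypothetical module names, not
# ATHEENA's API): run the first stage, check the softmax confidence of
# its auxiliary classifier, and skip the rest of the network when the
# confidence clears a threshold.
import torch
import torch.nn as nn

class EarlyExitNet(nn.Module):
    def __init__(self, num_classes: int = 10, threshold: float = 0.9):
        super().__init__()
        self.stage1 = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                                    nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.exit1 = nn.Linear(16, num_classes)   # early-exit classifier
        self.stage2 = nn.Linear(16, 64)           # remaining backbone
        self.exit2 = nn.Linear(64, num_classes)   # final classifier
        self.threshold = threshold

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = self.stage1(x)
        early_logits = self.exit1(feats)
        confidence = early_logits.softmax(dim=-1).max(dim=-1).values
        if bool((confidence >= self.threshold).all()):
            return early_logits                   # confident: stage2 never runs
        return self.exit2(torch.relu(self.stage2(feats)))

model = EarlyExitNet().eval()
with torch.no_grad():
    print(model(torch.randn(1, 3, 32, 32)).shape)  # torch.Size([1, 10])
```

Because only a fraction (1 - p) of samples ever reaches the later stages, a balanced streaming pipeline can allocate proportionally fewer resources to them. The sketch below illustrates this back-of-the-envelope balancing under an assumed exit probability p; ATHEENA's actual Design Space Exploration is considerably more involved, and balanced_allocation is an illustrative helper, not part of the toolflow.

```python
# Illustrative resource balancing under an exit probability p (not
# ATHEENA's optimizer). If a fraction p of samples leaves at the first
# exit, stage 2 only sees (1 - p) of the input stream, so its share of
# the hardware budget can shrink by that factor while both stages
# sustain the same sample rate.
def balanced_allocation(work_stage1: float, work_stage2: float,
                        p_exit: float, total_resources: float):
    """Split resources so both pipeline stages sustain equal throughput."""
    effective2 = (1.0 - p_exit) * work_stage2  # expected stage-2 work per sample
    share1 = work_stage1 / (work_stage1 + effective2)
    return share1 * total_resources, (1.0 - share1) * total_resources

r1, r2 = balanced_allocation(work_stage1=1.0, work_stage2=3.0,
                             p_exit=0.7, total_resources=100.0)
print(f"stage 1: {r1:.1f}, stage 2: {r2:.1f}")  # stage 2 shrinks as p_exit grows
```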

