Faster Attention Is What You Need: A Fast Self-Attention Neural Network Backbone Architecture for the Edge via Double-Condensing Attention Condensers

08/15/2022
by   Alexander Wong, et al.

With the growing adoption of deep learning for on-device TinyML applications, there has been an ever-increasing demand for more efficient neural network backbones optimized for the edge. Recently, the introduction of attention condenser networks has resulted in low-footprint, highly efficient, self-attention neural networks that strike a strong balance between accuracy and speed. In this study, we introduce a new, faster attention condenser design called double-condensing attention condensers, which enables more condensed feature embedding. We further employ a machine-driven design exploration strategy that imposes best-practices design constraints for greater efficiency and robustness to produce the macro-micro architecture constructs of the backbone. The resulting backbone (which we name AttendNeXt) achieves significantly higher inference throughput on an embedded ARM processor when compared to several other state-of-the-art efficient backbones (>10X faster than FB-Net C at higher accuracy and speed), while having a small model size (>1.47X smaller than OFA-62 at higher speed and similar accuracy) and strong accuracy (1.1% higher top-1 accuracy at higher speed). These promising results demonstrate that exploring different efficient architecture designs and self-attention mechanisms can lead to interesting new building blocks for TinyML applications.
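To make the core idea concrete, the following is a minimal NumPy sketch of a (single) attention condenser in the general spirit described above: the input features are condensed to a lower-resolution embedding, self-attention values are computed in that condensed space, expanded back to full resolution, and used to selectively scale the input. All function and parameter names here are hypothetical illustrations, not the paper's exact double-condensing design.

```python
import numpy as np

def attention_condenser(V, W_embed, pool=2):
    """Hedged sketch of an attention condenser (not the paper's exact design).

    V       : input feature map, shape (H, W, C)
    W_embed : hypothetical pointwise embedding weights, shape (C, C)
    pool    : spatial condensation factor
    """
    H, W, C = V.shape
    # Condensation: average-pool to a lower-resolution embedding.
    Q = V.reshape(H // pool, pool, W // pool, pool, C).mean(axis=(1, 3))
    # Embedding: pointwise projection in the condensed space.
    E = Q @ W_embed
    # Self-attention values via a sigmoid gate.
    A = 1.0 / (1.0 + np.exp(-E))
    # Expansion: nearest-neighbour upsample back to full resolution.
    A_up = A.repeat(pool, axis=0).repeat(pool, axis=1)
    # Selective attention: scale the input features by the attention values.
    return V * A_up

rng = np.random.default_rng(0)
V = rng.standard_normal((8, 8, 4))
W = rng.standard_normal((4, 4)) * 0.1
out = attention_condenser(V, W)
print(out.shape)
```

The efficiency win comes from computing attention in the condensed (pooled) space, which shrinks the cost of the embedding projection by the square of the pooling factor; a double-condensing variant would chain two such condensing stages.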

Related research
04/29/2021

AttendSeg: A Tiny Attention Condenser Neural Network for Semantic Segmentation on the Edge

In this study, we introduce AttendSeg, a low-precision, highly compact d...
09/30/2020

AttendNets: Tiny Deep Image Recognition Neural Networks for the Edge via Visual Attention Condensers

While significant advances in deep learning have resulted in state-of-the...
08/10/2020

TinySpeech: Attention Condensers for Deep Speech Recognition Neural Networks on Edge Devices

Advances in deep learning have led to state-of-the-art performance acros...
12/01/2022

Semiconductor Defect Pattern Classification by Self-Proliferation-and-Attention Neural Network

Semiconductor manufacturing is on the cusp of a revolution: the Internet...
01/23/2023

PCBDet: An Efficient Deep Neural Network Object Detection Architecture for Automatic PCB Component Detection on the Edge

There can be numerous electronic components on a given PCB, making the t...
03/31/2023

Practical Conformer: Optimizing size, speed and flops of Conformer for on-Device and cloud ASR

Conformer models maintain a large number of internal states, the vast ma...
