Arithmetic Intensity Balancing Convolution for Hardware-aware Efficient Block Design

04/08/2023
by Shinkook Choi, et al.

As deep learning advances, edge devices and lightweight neural networks are becoming more important. To reduce latency on an AI accelerator, it is essential not only to reduce FLOPs but also to improve hardware performance. We propose an arithmetic intensity balancing convolution (ABConv) to address the problem that, for convolutions with a small spatial size, the overall arithmetic intensity is limited by the small weight arithmetic intensity. ABConv raises the upper bound on overall arithmetic intensity and significantly reduces latency without sacrificing accuracy. We measured the latency and hardware performance of ABConv on the Arm Ethos-U65 NPU in various configurations, and used it to replace parts of MobileNetV1 and ResNet50 for image classification on CIFAR100.
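The bottleneck described in the abstract can be illustrated with a rough calculation. The sketch below is not the paper's formulation; it assumes a standard stride-1, same-padded convolution, counts one MAC as two FLOPs, and takes arithmetic intensity to be FLOPs divided by bytes moved, split into a weight term and an activation term. The function name conv_arithmetic_intensity and its parameters are hypothetical and introduced only for illustration.

```python
def conv_arithmetic_intensity(h, w, c_in, c_out, k, bytes_per_elem=1):
    """Rough arithmetic-intensity estimate for a k x k convolution.

    Assumptions (not from the paper): stride 1, same padding so the output
    is h x w, each tensor is moved to or from memory exactly once, and
    1 MAC = 2 FLOPs.
    """
    flops = 2 * h * w * c_in * c_out * k * k
    weight_bytes = k * k * c_in * c_out * bytes_per_elem
    # Input feature map plus output feature map.
    act_bytes = (h * w * c_in + h * w * c_out) * bytes_per_elem
    return {
        "weight": flops / weight_bytes,
        "activation": flops / act_bytes,
        "overall": flops / (weight_bytes + act_bytes),
    }


# Small spatial size: the weight term is the limiting one.
small = conv_arithmetic_intensity(h=4, w=4, c_in=256, c_out=256, k=3)
# Larger spatial size, same channels: the weight term grows with h * w.
large = conv_arithmetic_intensity(h=32, w=32, c_in=256, c_out=256, k=3)
print(small)
print(large)
```

Under these assumptions the weight arithmetic intensity reduces to 2 * h * w / bytes_per_elem, so it shrinks with the spatial size regardless of channel count, and the overall intensity can never exceed the smaller of the two terms. That is the imbalance the abstract says ABConv is designed to address.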


