Optimizing Temporal Convolutional Network inference on FPGA-based accelerators

05/07/2020
by Marco Carreras, et al.

Convolutional Neural Networks are extensively used in a wide range of applications, commonly including computer vision tasks like image and video classification, recognition, and segmentation. Recent research results demonstrate that multilayer (deep) networks involving mono-dimensional convolutions and dilation can be effectively used in time series and sequence classification and segmentation, as well as in tasks involving sequence modelling. These structures, commonly referred to as Temporal Convolutional Networks (TCNs), have been demonstrated to consistently outperform Recurrent Neural Networks in terms of accuracy and training time [1]. While FPGA-based inference accelerators for classic CNNs are widespread, the literature lacks a quantitative evaluation of their usability for inference on TCN models. In this paper we present such an evaluation, considering a CNN accelerator with specific features supporting TCN kernels as a reference and a set of state-of-the-art TCNs as a benchmark. Experimental results show that, during TCN execution, operational intensity can be critical for the overall performance. We propose a convolution scheduling based on batch processing that can boost efficiency up to 96%, achieving up to 111.8 GOPS/s and a power efficiency of 33.9 GOPS/s/W on an Ultrascale+ ZU3EG (up to 10x speedup and 3x power efficiency improvement with respect to a pure software implementation).
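The dilated mono-dimensional convolutions the abstract refers to are the basic building block of a TCN: each output sample depends only on the current and past inputs, sampled at a stride set by the dilation factor. The sketch below is a minimal, illustrative NumPy implementation (not the paper's accelerator kernel); the function name and signature are hypothetical.

```python
import numpy as np

def dilated_causal_conv1d(x, w, dilation=1):
    """Illustrative dilated causal 1D convolution (TCN building block).

    x: input sequence, shape (T,)
    w: kernel weights, shape (K,)
    Computes y[t] = sum_k w[k] * x[t - k*dilation], with left zero-padding
    so the output has the same length as the input and never reads
    future samples (causality).
    """
    K = len(w)
    pad = (K - 1) * dilation          # causal left padding
    xp = np.concatenate([np.zeros(pad), np.asarray(x, dtype=float)])
    T = len(x)
    y = np.empty(T)
    for t in range(T):
        # taps at x[t], x[t-d], x[t-2d], ... (zeros before the sequence start)
        taps = xp[t + pad - np.arange(K) * dilation]
        y[t] = np.dot(w, taps)
    return y
```

For example, with kernel `[1, 1]` and dilation 2, each output adds the current sample to the one two steps earlier, so `x = [1, 2, 3, 4]` yields `[1, 2, 4, 6]`. Stacking such layers with exponentially growing dilation gives a TCN a receptive field that grows exponentially with depth, which is what makes these networks competitive with RNNs on long sequences.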
