Searching for Winograd-aware Quantized Networks

02/25/2020
by Javier Fernandez-Marques, et al.

Lightweight architectural designs of Convolutional Neural Networks (CNNs), together with quantization, have paved the way for the deployment of demanding computer vision applications on mobile devices. In parallel, alternative formulations of the convolution operation, such as FFT, Strassen and Winograd, have been adapted for use in CNNs, offering further speedups. Winograd convolutions are the fastest known algorithm for spatially small convolutions, but exploiting their full potential comes with the burden of numerical error, rendering them unusable in quantized contexts. In this work we propose a Winograd-aware formulation of convolution layers which exposes the numerical inaccuracies introduced by the Winograd transformations to the learning of the model parameters, enabling the design of competitive quantized models without impacting model size. We also address the source of the numerical error and propose a relaxation on the form of the transformation matrices, resulting in up to 10% higher classification accuracy. Finally, we propose wiNAS, a neural architecture search (NAS) framework that jointly optimizes a given macro-architecture for accuracy and latency, leveraging Winograd-aware layers. A Winograd-aware ResNet-18 optimized with wiNAS for CIFAR-10 results in a 2.66x speedup compared to im2row, one of the most widely used optimized convolution implementations, with no loss in accuracy.
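As background, the sketch below illustrates the 1-D Winograd F(2, 3) algorithm that the abstract refers to: the input tile and filter are mapped into the Winograd domain with the matrices B^T and G, multiplied element-wise, and mapped back with A^T. It is these transformation matrices, normally fixed, whose form the paper relaxes and exposes to training in its Winograd-aware layers. The NumPy code is a minimal illustration using the standard F(2, 3) matrices (Lavin and Gray), not the authors' implementation; the function name winograd_f2_3 is ours.

import numpy as np

# Standard transformation matrices for F(2, 3): two outputs of a 3-tap
# correlation computed from a 4-element input tile.
BT = np.array([[1,  0, -1,  0],
               [0,  1,  1,  0],
               [0, -1,  1,  0],
               [0,  1,  0, -1]], dtype=np.float32)
G = np.array([[1.0,  0.0, 0.0],
              [0.5,  0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0.0,  0.0, 1.0]], dtype=np.float32)
AT = np.array([[1, 1,  1,  0],
               [0, 1, -1, -1]], dtype=np.float32)

def winograd_f2_3(d, g):
    # Filter transform (G @ g), input transform (BT @ d),
    # element-wise product in the Winograd domain, output transform (AT @ ...).
    return AT @ ((G @ g) * (BT @ d))

d = np.array([1.0, 2.0, 3.0, 4.0], dtype=np.float32)   # input tile
g = np.array([0.5, -1.0, 0.25], dtype=np.float32)      # 3-tap filter
direct = np.array([np.dot(d[i:i + 3], g) for i in range(2)])  # reference correlation
print(winograd_f2_3(d, g), direct)  # the two results match up to floating-point error

In full precision the two results agree, but under quantization the error introduced by these transforms becomes significant, which is what motivates exposing them to the learning of the model parameters rather than treating Winograd as a drop-in replacement at inference time.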
