FFT Convolutions are Faster than Winograd on Modern CPUs, Here is Why

09/20/2018
by Aleksandar Zlateski, et al.

Winograd-based convolution has quickly gained traction as a preferred approach to implementing convolutional neural networks (ConvNets) on various hardware platforms because it requires fewer floating-point operations than FFT-based or direct convolutions. This paper compares three highly optimized implementations (regular FFT-, Gauss-FFT-, and Winograd-based convolutions) on modern multi- and many-core CPUs. Although all three implementations employ the same optimizations for modern CPUs, our experimental results with two popular ConvNets (VGG and AlexNet) show that the FFT-based implementations generally outperform the Winograd-based approach, contrary to popular belief. To understand the results, we use a Roofline performance model to analyze the three implementations in detail, looking at each of their computation phases and considering not only the number of floating-point operations but also the memory bandwidth and the cache sizes. The performance analysis explains why, and under what conditions, the FFT-based implementations outperform the Winograd-based one on modern CPUs.
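As a rough illustration of the Roofline reasoning the abstract refers to, the sketch below computes the standard Roofline bound: attainable throughput is the smaller of the compute peak and the memory bandwidth multiplied by arithmetic intensity (FLOPs per byte moved). The machine numbers and function names are hypothetical, chosen only for illustration; they are not taken from the paper.

```python
def attainable_gflops(peak_gflops, bandwidth_gbs, arithmetic_intensity):
    """Standard Roofline bound: min(compute roof, memory roof).

    arithmetic_intensity is in FLOPs per byte moved to/from memory.
    """
    return min(peak_gflops, bandwidth_gbs * arithmetic_intensity)


def is_memory_bound(peak_gflops, bandwidth_gbs, arithmetic_intensity):
    """A phase is memory bound when its intensity is below the ridge point."""
    ridge_point = peak_gflops / bandwidth_gbs
    return arithmetic_intensity < ridge_point


# Hypothetical machine (assumed values, not from the paper).
PEAK_GFLOPS = 1000.0
BANDWIDTH_GBS = 100.0

if __name__ == "__main__":
    for ai in (2.0, 10.0, 50.0):
        bound = attainable_gflops(PEAK_GFLOPS, BANDWIDTH_GBS, ai)
        regime = "memory" if is_memory_bound(PEAK_GFLOPS, BANDWIDTH_GBS, ai) else "compute"
        print(f"AI={ai:5.1f} FLOP/B  bound={bound:7.1f} GFLOP/s  ({regime} bound)")
```

Under this model, a phase that performs fewer floating-point operations can still run slower if its arithmetic intensity falls below the ridge point, which is why operation counts alone do not determine which convolution method is faster.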


research
02/17/2021

NEAT: A Framework for Automated Exploration of Floating Point Approximations

Much recent research is devoted to exploring tradeoffs between computati...
research
06/24/2022

Towards Effective Depthwise Convolutions on ARMv8 Architecture

Depthwise convolutions are widely used in lightweight convolutional neur...
research
12/04/2019

L3 Fusion: Fast Transformed Convolutions on CPUs

Fast convolutions via transforms, either Winograd or FFT, had emerged as...
research
05/20/2021

Indirect predicates for geometric constructions

Geometric predicates are a basic ingredient to implement a vast range of...
research
02/16/2021

Numerically more stable computation of the p-values for the two-sample Kolmogorov-Smirnov test

The two-sample Kolmogorov-Smirnov test is a widely used statistical test...
research
01/28/2018

BOPS, Not FLOPS! A New Metric, Measuring Tool, and Roofline Performance Model For Datacenter Computing

The past decades witness FLOPS (Floating-point Operations per Second), a...
research
01/28/2018

BOPS, Not FLOPS! A New Metric and Roofline Performance Model For Datacenter Computing

The past decades witness FLOPS (Floating-point Operations per Second) as...
