Improving Sound Event Classification by Increasing Shift Invariance in Convolutional Neural Networks

07/01/2021
by Eduardo Fonseca, et al.

Recent studies have called into question the commonly assumed shift invariance of convolutional networks, showing that small shifts in the input can substantially change the output predictions. In this paper, we analyze the benefits of addressing the lack of shift invariance in CNN-based sound event classification. Specifically, we evaluate two pooling methods that improve shift invariance in CNNs, based on low-pass filtering and adaptive sampling of incoming feature maps. Both are implemented as small architectural modifications inserted into the pooling layers of CNNs. We evaluate the effect of these changes on the FSD50K dataset, using models of different capacity and in the presence of strong regularization, and show that they consistently improve sound event classification in all cases considered. We also demonstrate empirically that the proposed pooling methods increase shift invariance in the network, making it more robust to time/frequency shifts in input spectrograms. These gains come at the cost of a negligible number of additional trainable parameters, which makes the methods an appealing alternative to conventional pooling layers. The result is a new state-of-the-art mAP of 0.541 on the FSD50K classification benchmark.
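To make the low-pass filtering idea concrete, the following is a minimal one-dimensional sketch, not the authors' implementation. It assumes a fixed binomial kernel `[1, 2, 1] / 4` (the abstract mentions trainable parameters, so the filters in the paper may be learned), a pooling window of 2, and circular boundaries so that `np.roll` acts as an exact one-sample shift. The function names `max_pool_1d` and `blur_pool_1d` are hypothetical and chosen only for illustration.

```python
import numpy as np


def max_pool_1d(x, stride=2):
    """Conventional pooling: max over windows of 2 (circular), then subsample."""
    dense = np.maximum(x, np.roll(x, -1))  # dense[i] = max(x[i], x[i+1])
    return dense[::stride]


def blur_pool_1d(x, stride=2):
    """Same dense max, but low-pass filtered before subsampling.

    Assumption: a fixed 3-tap binomial kernel stands in for whatever
    low-pass filter a real model would use.
    """
    dense = np.maximum(x, np.roll(x, -1))
    kernel = np.array([1.0, 2.0, 1.0])
    kernel /= kernel.sum()
    # Circular 3-tap convolution: neighbours at i-1, i, i+1.
    blurred = (
        kernel[0] * np.roll(dense, 1)
        + kernel[1] * dense
        + kernel[2] * np.roll(dense, -1)
    )
    return blurred[::stride]


# Toy "feature map" and the same signal shifted by one sample.
x = np.array([0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0, 1.0])
shifted = np.roll(x, 1)

max_change = np.abs(max_pool_1d(x) - max_pool_1d(shifted)).max()
blur_change = np.abs(blur_pool_1d(x) - blur_pool_1d(shifted)).max()

print("max pool :", max_pool_1d(x), "->", max_pool_1d(shifted))
print("blur pool:", blur_pool_1d(x), "->", blur_pool_1d(shifted))
print("largest change:", max_change, "(max pool) vs", blur_change, "(blur pool)")
```

On this toy input, a one-sample shift turns the max-pooled output from `[0, 1, 0, 1]` into `[1, 1, 1, 1]`, while the filtered version moves only from `[0.5, 1, 0.5, 1]` to `[0.75, 0.75, 0.75, 0.75]`, and its mean stays the same. The example illustrates only the low-pass filtering variant. Adaptive sampling, which selects the subsampling grid based on the incoming feature map, is not shown here.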
