SpecMix: A Mixed Sample Data Augmentation method for Training with Time-Frequency Domain Features

08/06/2021
by Gwantae Kim, et al.

A mixed sample data augmentation strategy is proposed to enhance the performance of models on audio scene classification, sound event classification, and speech enhancement tasks. While there have been several augmentation methods shown to be effective in improving image classification performance, their efficacy toward time-frequency domain features of audio is not assured. We propose a novel audio data augmentation approach named "Specmix" specifically designed for dealing with time-frequency domain features. The augmentation method consists of mixing two different data samples by applying time-frequency masks effective in preserving the spectral correlation of each audio sample. Our experiments on acoustic scene classification, sound event classification, and speech enhancement tasks show that the proposed Specmix improves the performance of various neural network architectures by a maximum of 2.7
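The mixing step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the time-frequency masks are built, so everything below is an assumption made for illustration rather than the authors' implementation: the function names `band_mask` and `mix_spectrograms`, the parameters `max_bands` and `max_width`, the use of full-width frequency bands and full-height time bands, and the CutMix-style convention of weighting the labels by the fraction of bins kept from each sample.

```python
import numpy as np


def band_mask(n_freq, n_time, rng, max_bands=3, max_width=0.2):
    """Build a binary mask: 1 keeps sample A, 0 takes sample B.

    Hypothetical helper. Bands span a whole axis so that each replaced
    region keeps complete frequency rows or complete time columns.
    """
    mask = np.ones((n_freq, n_time), dtype=np.float32)
    for axis, axis_len in ((0, n_freq), (1, n_time)):
        n_bands = int(rng.integers(0, max_bands + 1))
        for _ in range(n_bands):
            # Band width is at least 1 bin and at most max_width of the axis.
            upper = max(2, int(max_width * axis_len) + 1)
            width = int(rng.integers(1, upper))
            start = int(rng.integers(0, axis_len - width + 1))
            if axis == 0:
                mask[start:start + width, :] = 0.0  # frequency band
            else:
                mask[:, start:start + width] = 0.0  # time band
    return mask


def mix_spectrograms(spec_a, label_a, spec_b, label_b, rng):
    """Mix two equally shaped spectrograms and their label vectors."""
    if spec_a.shape != spec_b.shape:
        raise ValueError("spectrograms must have the same shape")
    mask = band_mask(spec_a.shape[0], spec_a.shape[1], rng)
    # Assumed label weighting: fraction of bins taken from sample A.
    lam = float(mask.mean())
    mixed = mask * spec_a + (1.0 - mask) * spec_b
    label = lam * label_a + (1.0 - lam) * label_b
    return mixed, label, lam


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    spec_a = rng.random((64, 100)).astype(np.float32)
    spec_b = rng.random((64, 100)).astype(np.float32)
    label_a = np.array([1.0, 0.0])
    label_b = np.array([0.0, 1.0])
    mixed, label, lam = mix_spectrograms(spec_a, label_a, spec_b, label_b, rng)
    print(mixed.shape, label, lam)
```

In this sketch the masked bands run across an entire axis, which is one plausible way to keep whole spectral rows or whole time frames from a single source; the band counts and widths are arbitrary defaults and would need tuning for a real task.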

research
01/08/2021

A Four-Stage Data Augmentation Approach to ResNet-Conformer Based Acoustic Modeling for Sound Event Localization and Detection

In this paper, we propose a novel four-stage data augmentation approach ...
research
12/14/2019

Learning discriminative and robust time-frequency representations for environmental sound classification

Convolutional neural networks (CNN) are one of the best-performing neura...
research
10/09/2021

An evaluation of data augmentation methods for sound scene geotagging

Sound scene geotagging is a new topic of research which has evolved from...
research
05/05/2021

Acoustic Scene Classification Using Multichannel Observation with Partially Missing Channels

Sounds recorded with smartphones or IoT devices often have partially unr...
research
10/12/2021

Spatial mixup: Directional loudness modification as data augmentation for sound event localization and detection

Data augmentation methods have shown great importance in diverse supervi...
research
06/12/2018

Sample Dropout for Audio Scene Classification Using Multi-Scale Dense Connected Convolutional Neural Network

Acoustic scene classification is an intricate problem for a machine. As ...
research
02/27/2020

Understanding and Enhancing Mixed Sample Data Augmentation

Mixed Sample Data Augmentation (MSDA) has received increasing attention ...