Multi-style Training for South African Call Centre Audio

02/15/2022
by Walter Heymans, et al.

Mismatch between training and test data is a challenging problem for automatic speech recognition (ASR) systems. One of the most common techniques used to address it is multi-style training (MTR), a form of data augmentation that attempts both to transform the training data so that it is more representative of the test data and to learn robust representations applicable to different conditions. This is particularly difficult when the test conditions are unknown. We explore the impact of different MTR styles on system performance when test conditions differ from training conditions, in the context of deep neural network hidden Markov model (DNN-HMM) ASR systems. We create a controlled environment using the LibriSpeech corpus, in which we isolate the effect of different MTR styles on final system performance. We evaluate our findings on a South African call centre dataset that contains noisy, WAV49-encoded audio.
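
To make the idea of an MTR "style" concrete, the following is a minimal sketch of one common augmentation, additive noise mixed at a chosen signal-to-noise ratio (SNR). It is an illustration under stated assumptions, not the authors' pipeline: the function names (`mean_power`, `mix_at_snr`, `multi_style_set`), the synthetic sine-wave "speech", the Gaussian noise, and the SNR levels are all hypothetical and chosen only for demonstration.

```python
import math
import random


def mean_power(samples):
    """Average power (mean of squared amplitudes) of a list of samples."""
    return sum(x * x for x in samples) / len(samples)


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech, scaled so the mixture has the requested SNR in dB."""
    if not speech or not noise:
        raise ValueError("speech and noise must be non-empty")
    # Loop the noise so that it covers the whole utterance.
    looped = [noise[i % len(noise)] for i in range(len(speech))]
    p_speech = mean_power(speech)
    p_noise = mean_power(looped)
    if p_speech == 0.0 or p_noise == 0.0:
        return list(speech)  # nothing to scale against; return a clean copy
    # Choose gain so that p_speech / (gain^2 * p_noise) == 10^(snr_db / 10).
    gain = math.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, looped)]


def multi_style_set(speech, noise, snr_levels_db):
    """Return the clean utterance plus one noisy copy per SNR level."""
    return [list(speech)] + [mix_at_snr(speech, noise, snr) for snr in snr_levels_db]


# Hypothetical inputs: a 1-second 220 Hz tone at 8 kHz stands in for speech,
# and seeded Gaussian noise stands in for a recorded noise sample.
rng = random.Random(0)
speech = [math.sin(2 * math.pi * 220 * t / 8000) for t in range(8000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(4000)]
styles = multi_style_set(speech, noise, [20, 10, 5])
```

In an actual MTR setup, the clean and augmented copies would all be added to the training set with the original transcript. Other styles, such as reverberation, speed perturbation, or codec simulation, would be applied in the same way as additional transformations.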


