The JHU Multi-Microphone Multi-Speaker ASR System for the CHiME-6 Challenge

06/14/2020
by   Ashish Arora, et al.

This paper summarizes the JHU team's efforts in tracks 1 and 2 of the CHiME-6 challenge for distant multi-microphone conversational speech diarization and recognition in everyday home environments. We explore multi-array processing techniques at each stage of the pipeline, such as multi-array guided source separation (GSS) for enhancement and acoustic model training data, posterior fusion for speech activity detection, PLDA score fusion for diarization, and lattice combination for automatic speech recognition (ASR). We also report results with different acoustic model architectures, and integrate other techniques such as online multi-channel weighted prediction error (WPE) dereverberation and variational Bayes-hidden Markov model (VB-HMM) based overlap assignment to deal with reverberation and overlapping speakers, respectively. As a result of these efforts, our ASR systems achieve a word error rate of 40.5% on the evaluation set. This is an improvement of 10.8% over the challenge baselines for the respective tracks.
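To make "posterior fusion for speech activity detection" concrete, here is a minimal sketch of one way per-array speech posteriors could be combined. The abstract does not specify the fusion rule, so the frame-wise mean and max rules, the 0.5 threshold, and the function names `fuse_posteriors` and `speech_frames` are all assumptions made for illustration, not the authors' implementation.

```python
from typing import List, Sequence


def fuse_posteriors(per_array: Sequence[Sequence[float]], mode: str = "mean") -> List[float]:
    """Combine frame-level speech posteriors from several microphone arrays.

    per_array[a][t] is the (hypothetical) speech posterior of array a at frame t.
    The "mean" and "max" rules are illustrative assumptions.
    """
    if not per_array:
        raise ValueError("need posteriors from at least one array")
    n_frames = len(per_array[0])
    if any(len(p) != n_frames for p in per_array):
        raise ValueError("all arrays must cover the same number of frames")

    fused = []
    for frame in zip(*per_array):  # one tuple of per-array posteriors per frame
        if mode == "mean":
            fused.append(sum(frame) / len(frame))
        elif mode == "max":
            fused.append(max(frame))
        else:
            raise ValueError(f"unknown fusion mode: {mode}")
    return fused


def speech_frames(posteriors: Sequence[float], threshold: float = 0.5) -> List[bool]:
    """Mark a frame as speech when its fused posterior reaches the threshold."""
    return [p >= threshold for p in posteriors]


if __name__ == "__main__":
    # Two arrays, three frames each (made-up values).
    arrays = [[0.0, 1.0, 0.5], [0.5, 0.5, 0.0]]
    print(speech_frames(fuse_posteriors(arrays, mode="mean")))
    print(speech_frames(fuse_posteriors(arrays, mode="max")))
```

Under these assumptions, the mean rule favors frames on which the arrays agree, while the max rule accepts a frame as speech if any single array is confident, trading more false alarms for fewer missed segments.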


