The RoyalFlush System of Speech Recognition for M2MeT Challenge

02/03/2022
by   Shuaishuai Ye, et al.

This paper describes our RoyalFlush system for the multi-speaker automatic speech recognition (ASR) track of the M2MeT challenge. We adopted a serialized output training (SOT) based multi-speaker ASR system trained with large-scale simulated data. First, we investigated a set of front-end methods, including multi-channel weighted prediction error (WPE), beamforming, speech separation, and speech enhancement, for processing the training, validation, and test sets; based on the experimental results, we kept only WPE and beamforming. Second, we invested heavily in data augmentation for multi-speaker ASR, mainly adding noise and reverberation, overlapped speech simulation, multi-channel speech simulation, speed perturbation, and front-end processing, which brought a large performance improvement. Finally, to exploit the complementarity of different model architectures, we trained a standard Conformer-based joint CTC/attention model (Conformer) and a U2++ ASR model with a bidirectional attention decoder, a modification of Conformer, and fused their results. Compared with the official baseline system, our system achieved a 12.22% absolute character error rate (CER) reduction on the validation set and 12.11% on the test set.
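To make two of the augmentation steps named in the abstract more concrete, the following is a minimal sketch of overlapped speech simulation with an SOT-style reference, plus a naive speed perturbation. It is not the authors' pipeline. The `Utterance` type, the function names, the `<sc>` separator symbol, and the linear-interpolation resampler are all assumptions made for illustration; the only property borrowed from SOT as it is commonly described is that the references of the overlapping speakers are ordered by start time and joined with a speaker-change symbol.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

# Hypothetical speaker-change symbol; the abstract does not name one.
SC_TOKEN = "<sc>"


@dataclass
class Utterance:
    samples: List[float]  # mono waveform samples
    text: str             # reference transcript of this speaker
    start: int            # start offset (in samples) inside the mixture


def mix_overlapped(utts: Sequence[Utterance]) -> Tuple[List[float], str]:
    """Sum utterances at their offsets and build a start-time-ordered label."""
    if not utts:
        raise ValueError("need at least one utterance")
    total = max(u.start + len(u.samples) for u in utts)
    mixture = [0.0] * total
    for u in utts:
        for i, s in enumerate(u.samples):
            mixture[u.start + i] += s
    # Serialize the references: earliest-starting speaker first.
    ordered = sorted(utts, key=lambda u: u.start)
    label = f" {SC_TOKEN} ".join(u.text for u in ordered)
    return mixture, label


def speed_perturb(samples: Sequence[float], factor: float) -> List[float]:
    """Naive speed change by linear interpolation (factor > 1 shortens)."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    n_out = int(len(samples) / factor)
    out = []
    for j in range(n_out):
        pos = j * factor
        i = int(pos)
        frac = pos - i
        nxt = samples[min(i + 1, len(samples) - 1)]
        out.append(samples[i] * (1.0 - frac) + nxt * frac)
    return out
```

In this sketch, mixing an utterance that starts at sample 0 with one that starts later yields a single waveform and one label in which the earlier speaker's text comes first. A real setup would also scale the sources to a target energy ratio, add noise and reverberation, and use a proper resampler.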


