M2MeT: The ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Challenge

10/14/2021
by Fan Yu et al.

Recent developments in speech signal processing, such as speech recognition and speaker diarization, have inspired numerous applications of speech technologies. The meeting scenario is one of the most valuable and, at the same time, most challenging scenarios for speech technologies. Speaker diarization and multi-speaker automatic speech recognition (ASR) in meeting scenarios have attracted increasing attention, but the lack of large public corpora of real meetings has been a major obstacle to the advancement of the field. We therefore release the AliMeeting corpus, which consists of 120 hours of real recorded Mandarin meeting data, including far-field audio collected by an 8-channel microphone array as well as near-field audio collected by each participant's headset microphone. Moreover, we will launch the Multi-channel Multi-party Meeting Transcription Challenge (M2MeT) as an ICASSP 2022 Signal Processing Grand Challenge. The challenge consists of two tracks, namely speaker diarization and multi-speaker ASR. In this paper we provide a detailed introduction of the dataset, rules, evaluation methods and baseline systems, aiming to further promote reproducible research in this field.
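
The abstract refers to evaluation methods for the two tracks without detailing them here. As an illustration only, the sketch below computes a simplified frame-level diarization error rate (DER), a metric commonly used for speaker diarization; it assumes hypothesis speaker labels are already mapped to the reference labels and ignores scoring details such as forgiveness collars, so it should not be read as the challenge's official scoring procedure. The example segments are hypothetical.

```python
# Minimal sketch of frame-level diarization error rate (DER):
#   DER = (missed speech + false alarm + speaker confusion) / total reference speech
# Assumes hypothesis speaker labels are already mapped to reference labels
# and applies no forgiveness collar (not the official challenge scoring).
from collections import defaultdict

FRAME = 0.01  # 10 ms scoring frames


def to_frames(segments, frame=FRAME):
    # segments: list of (start_sec, end_sec, speaker) tuples -> frame index -> set of active speakers
    frames = defaultdict(set)
    for start, end, spk in segments:
        for i in range(int(round(start / frame)), int(round(end / frame))):
            frames[i].add(spk)
    return frames


def der(reference, hypothesis, frame=FRAME):
    ref = to_frames(reference, frame)
    hyp = to_frames(hypothesis, frame)
    total = miss = fa = conf = 0
    for i in set(ref) | set(hyp):
        r, h = ref.get(i, set()), hyp.get(i, set())
        total += len(r)                            # reference speaker time
        miss += max(len(r) - len(h), 0)            # missed speech
        fa += max(len(h) - len(r), 0)              # false alarm
        conf += min(len(r), len(h)) - len(r & h)   # speaker confusion
    return (miss + fa + conf) / max(total, 1)


# Hypothetical two-speaker example with a short overlapped region (4.0-5.0 s).
reference = [(0.0, 5.0, "spk1"), (4.0, 9.0, "spk2")]
hypothesis = [(0.0, 4.5, "spk1"), (4.5, 9.0, "spk2")]
print(f"DER = {der(reference, hypothesis):.3f}")
```

Official scoring for meeting-style data typically relies on established tools (e.g. NIST md-eval or dscore) with an optimal reference-hypothesis speaker mapping; the sketch above only conveys the basic definition of missed speech, false alarm and speaker confusion relative to total reference speech time.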


