CASA-ASR: Context-Aware Speaker-Attributed ASR

05/21/2023
by   Mohan Shi, et al.
0

Recently, speaker-attributed automatic speech recognition (SA-ASR) has attracted a wide attention, which aims at answering the question “who spoke what”. Different from modular systems, end-to-end (E2E) SA-ASR minimizes the speaker-dependent recognition errors directly and shows a promising applicability. In this paper, we propose a context-aware SA-ASR (CASA-ASR) model by enhancing the contextual modeling ability of E2E SA-ASR. Specifically, in CASA-ASR, a contextual text encoder is involved to aggregate the semantic information of the whole utterance, and a context-dependent scorer is employed to model the speaker discriminability by contrasting with speakers in the context. In addition, a two-pass decoding strategy is further proposed to fully leverage the contextual modeling ability resulting in a better recognition performance. Experimental results on AliMeeting corpus show that the proposed CASA-ASR model outperforms the original E2E SA-ASR system with a relative improvement of 11.76

READ FULL TEXT
research
10/07/2021

Transcribe-to-Diarize: Neural Speaker Diarization for Unlimited Number of Speakers using End-to-End Speaker-Attributed ASR

This paper presents Transcribe-to-Diarize, a new approach for neural spe...
research
03/31/2022

A Comparative Study on Speaker-attributed Automatic Speech Recognition in Multi-party Meetings

In this paper, we conduct a comparative study on speaker-attributed auto...
research
10/12/2022

A context-aware knowledge transferring strategy for CTC-based ASR

Non-autoregressive automatic speech recognition (ASR) modeling has recei...
research
11/08/2020

Listen, Look and Deliberate: Visual context-aware speech recognition using pre-trained text-video representations

In this study, we try to address the problem of leveraging visual signal...
research
08/07/2023

SeACo-Paraformer: A Non-Autoregressive ASR System with Flexible and Effective Hotword Customization Ability

Hotword customization is one of the important issues remained in ASR fie...
research
11/18/2020

Context-aware RNNLM Rescoring for Conversational Speech Recognition

Conversational speech recognition is regarded as a challenging task due ...
research
10/16/2019

Transformer ASR with Contextual Block Processing

The Transformer self-attention network has recently shown promising perf...

Please sign up or login with your details

Forgot password? Click here to reset