Multi-Encoder-Decoder Transformer for Code-Switching Speech Recognition

06/18/2020
by   Xinyuan Zhou, et al.

Code-switching (CS) occurs when a speaker alternates between words of two or more languages within a single sentence or across sentences. Automatic speech recognition (ASR) of CS speech therefore has to handle two or more languages at the same time. In this study, we propose a Transformer-based architecture with two symmetric language-specific encoders that capture the attributes of each language individually and thereby improve the acoustic representation of each language. These representations are combined using a language-specific multi-head attention mechanism in the decoder module. Each encoder and its corresponding attention module in the decoder are pre-trained on a large monolingual corpus to alleviate the impact of limited CS training data. We call such a network a multi-encoder-decoder (MED) architecture. Experiments on the SEAME corpus show that the proposed MED architecture achieves 10.2 reduction on the CS evaluation sets with Mandarin and English as the matrix language respectively.
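
The data flow described above, in which one decoder query attends separately to the outputs of two language-specific encoders and the two results are then combined, can be illustrated with a toy sketch. This is not the authors' implementation: it uses a single attention head with no learned projections, the encoder outputs are hand-written lists, and averaging the two contexts is an assumed combination rule, since the abstract only says the representations are "combined". The function names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, memory):
    """Single-head scaled dot-product attention of one query over memory frames.

    Simplification: keys and values are the memory frames themselves,
    with no learned projection matrices.
    """
    dim = len(query)
    scores = [
        sum(q * k for q, k in zip(query, frame)) / math.sqrt(dim)
        for frame in memory
    ]
    weights = softmax(scores)
    return [
        sum(w * frame[i] for w, frame in zip(weights, memory))
        for i in range(dim)
    ]


def combine_language_contexts(query, mandarin_memory, english_memory):
    """Attend to each language-specific encoder output, then merge.

    Assumption: the two context vectors are merged by a plain average.
    """
    ctx_zh = attend(query, mandarin_memory)
    ctx_en = attend(query, english_memory)
    return [(a + b) / 2 for a, b in zip(ctx_zh, ctx_en)]


# Toy example: stand-ins for the outputs of the two encoders.
mandarin_memory = [[1.0, 0.0], [1.0, 0.0]]
english_memory = [[0.0, 1.0]]
decoder_query = [1.0, 0.0]
combined = combine_language_contexts(decoder_query, mandarin_memory, english_memory)
```

Keeping one attention function per encoder output mirrors the idea that each encoder and its matching attention module can be pre-trained on monolingual data before being used together; only the final merge step sees both languages.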

research
11/02/2022

Monolingual Recognizers Fusion for Code-switching Speech Recognition

The bi-encoder structure has been intensively investigated in code-switc...
research
04/06/2021

Non-autoregressive Mandarin-English Code-switching Speech Recognition with Pinyin Mask-CTC and Word Embedding Regularization

Mandarin-English code-switching (CS) is frequently used among East and S...
research
06/01/2023

Spoken Language Identification System for English-Mandarin Code-Switching Child-Directed Speech

This work focuses on improving the Spoken Language Identification (LangI...
research
06/29/2022

Language-specific Characteristic Assistance for Code-switching Speech Recognition

Dual-encoder structure successfully utilizes two language-specific encod...
research
04/11/2022

Multistream neural architectures for cued-speech recognition using a pre-trained visual feature extractor and constrained CTC decoding

This paper proposes a simple and effective approach for automatic recogn...
research
05/05/2023

Online Gesture Recognition using Transformer and Natural Language Processing

The Transformer architecture is shown to provide a powerful machine tran...
research
09/12/2022

Vision Transformer with Convolutional Encoder-Decoder for Hand Gesture Recognition using 24 GHz Doppler Radar

Transformers combined with convolutional encoders have been recently use...
