Improving CTC-based speech recognition via knowledge transferring from pre-trained language models

02/22/2022
by Keqi Deng, et al.

Recently, end-to-end automatic speech recognition models based on connectionist temporal classification (CTC) have achieved impressive results, especially when fine-tuned from wav2vec2.0 models. Due to the conditional independence assumption, CTC-based models are always weaker than attention-based encoder-decoder models and require the assistance of external language models (LMs). To solve this issue, we propose two knowledge transferring methods that leverage pre-trained LMs, such as BERT and GPT2, to improve CTC-based models. The first method is based on representation learning, in which the CTC-based models use the representation produced by BERT as an auxiliary learning target. The second method is based on joint classification learning, which combines GPT2 for text modeling with a hybrid CTC/attention architecture. Experiments on the AISHELL-1 corpus yield a character error rate (CER) of 4.2%. Compared with CTC-based models fine-tuned from the wav2vec2.0 models, our knowledge transferring method reduces CER by 16.1%.
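The first method can be pictured as a training objective with two terms: the usual CTC loss plus a penalty for straying from the BERT representation. The sketch below is illustrative only and is not the authors' implementation. The abstract names neither the distance measure nor the weighting, so the mean squared error, the `aux_weight` parameter, and the function names are all assumptions.

```python
from typing import Sequence


def mse(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean squared error between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("representations must have the same dimension")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def joint_loss(
    ctc_loss: float,
    acoustic_repr: Sequence[float],
    bert_repr: Sequence[float],
    aux_weight: float = 0.3,  # hypothetical weight, not taken from the paper
) -> float:
    """CTC loss plus a weighted auxiliary term that pulls the acoustic
    model's representation toward the BERT representation.

    The distance (MSE) and the weighting scheme are assumptions made
    for illustration; the abstract does not specify them.
    """
    return ctc_loss + aux_weight * mse(acoustic_repr, bert_repr)


# Toy numbers: CTC loss of 2.0, two-dimensional representations.
example = joint_loss(2.0, [1.0, 2.0], [1.0, 4.0], aux_weight=0.5)
```

In a setup like this the BERT target enters only through the loss, so it would be needed during training but not at inference time.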

