Learning ASR-Robust Contextualized Embeddings for Spoken Language Understanding

09/24/2019
by   Chao-Wei Huang, et al.

Employing pre-trained language models (LMs) to extract contextualized word representations has achieved state-of-the-art performance on various NLP tasks. However, applying this technique to noisy transcripts generated by an automatic speech recognizer (ASR) remains a concern. Therefore, this paper focuses on making contextualized representations more ASR-robust. We propose a novel confusion-aware fine-tuning method to mitigate the impact of ASR errors on pre-trained LMs. Specifically, we fine-tune LMs to produce similar representations for acoustically confusable words, which are obtained from word confusion networks (WCNs) produced by ASR. Experiments on the benchmark ATIS dataset show that the proposed method significantly improves the performance of spoken language understanding when applied to ASR transcripts.
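The idea of pulling the representations of acoustically confusable words together can be illustrated with a minimal sketch. It is not the authors' implementation: the use of cosine distance, the `weight` parameter, and the function names `cosine_distance`, `confusion_loss`, and `total_loss` are all assumptions made here for illustration, and the embeddings are plain lists of floats rather than real LM outputs.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def cosine_distance(u: Vector, v: Vector) -> float:
    """Return 1 - cos(u, v); 0.0 means the two vectors point the same way."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)


def confusion_loss(pairs: List[Tuple[Vector, Vector]]) -> float:
    """Average distance over embedding pairs of confusable words.

    Each pair holds the contextualized embeddings of two words that a
    word confusion network lists as alternatives for the same time slot
    (assumed input format).
    """
    if not pairs:
        return 0.0
    return sum(cosine_distance(u, v) for u, v in pairs) / len(pairs)


def total_loss(lm_loss: float,
               pairs: List[Tuple[Vector, Vector]],
               weight: float = 1.0) -> float:
    """Combine an LM objective with the confusion term (assumed weighting)."""
    return lm_loss + weight * confusion_loss(pairs)
```

Minimizing a term of this kind during fine-tuning would push the embeddings of a confusable pair toward each other, while the LM term keeps the model from collapsing all representations. In practice the pairs would come from WCN alternatives and the embeddings from the LM being fine-tuned.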


research
05/02/2022

Contrastive Learning for Improving ASR Robustness in Spoken Language Understanding

Spoken language understanding (SLU) is an essential task for machines to...
research
07/05/2022

ASR-Generated Text for Language Model Pre-training Applied to Speech Tasks

We aim at improving spoken language modeling (LM) using very large amoun...
research
07/13/2023

Adapting an ASR Foundation Model for Spoken Language Assessment

A crucial part of an accurate and reliable spoken language assessment sy...
research
07/04/2020

Robust Prediction of Punctuation and Truecasing for Medical ASR

Automatic speech recognition (ASR) systems in the medical domain that fo...
research
10/05/2021

ASR Rescoring and Confidence Estimation with ELECTRA

In automatic speech recognition (ASR) rescoring, the hypothesis with the...
research
10/14/2021

Identifying Introductions in Podcast Episodes from Automatically Generated Transcripts

As the volume of long-form spoken-word content such as podcasts explodes...
