A Deep Learning System for Domain-specific Speech Recognition

03/18/2023
by Yanan Jia, et al.

As human-machine voice interfaces provide easy access to increasingly intelligent machines, many state-of-the-art automatic speech recognition (ASR) systems have been proposed. However, commercial ASR systems usually perform poorly on domain-specific speech, especially in low-resource settings. The author works with pre-trained DeepSpeech2 and Wav2Vec2 acoustic models to develop benefit-specific ASR systems. The domain-specific data are collected using a proposed semi-supervised learning annotation method that requires little human intervention. The best performance comes from a fine-tuned Wav2Vec2-Large-LV60 acoustic model with an external KenLM language model, which surpasses the Google and AWS ASR systems on benefit-specific speech. The viability of using error-prone ASR transcriptions as part of spoken language understanding (SLU) is also investigated. Results on a benefit-specific natural language understanding (NLU) task show that the domain-specific fine-tuned ASR system can outperform the commercial ASR systems even when its transcriptions have a higher word error rate (WER), and that the results obtained from fine-tuned ASR transcriptions and from human transcriptions are similar.
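Since the abstract compares systems by word error rate, a minimal sketch of the standard WER definition may help: the word-level edit distance (substitutions + deletions + insertions) divided by the number of reference words. The function name `word_error_rate` and the example sentences are hypothetical, and the text normalization used in the paper (casing, punctuation handling) is not stated here, so plain whitespace tokenization is an assumption.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: words are separated by whitespace and compared exactly;
    any normalization (lowercasing, punctuation removal) is left to the caller.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # aligning i reference words with an empty hypothesis costs i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical benefit-related utterance with one substituted word.
    print(word_error_rate("my dental plan covers cleanings",
                          "my dental plan covers cleaning"))
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference. A lower WER also does not guarantee better downstream understanding, which is consistent with the abstract's observation about the NLU task.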


