PROCTER: PROnunciation-aware ConTextual adaptER for personalized speech recognition in neural transducers

03/30/2023
by Rahul Pandey, et al.

End-to-End (E2E) automatic speech recognition (ASR) systems used in voice assistants often have difficulty recognizing infrequent words personalized to the user, such as names and places. Rare words often have non-trivial pronunciations, and in such cases, human knowledge in the form of a pronunciation lexicon can be useful. We propose a PROnunCiation-aware conTextual adaptER (PROCTER) that dynamically injects lexicon knowledge into an RNN-T model by adding a phonemic embedding along with a textual embedding. The experimental results show that the proposed PROCTER architecture outperforms the baseline RNN-T model, improving the word error rate (WER) by 44% [...] when measured on personalized entities and personalized rare entities, respectively, while increasing the model size (number of trainable parameters) by only 1%. [...] personalized device names, we observe 7% [...] compared to only 1% [...]
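The idea of pairing a phonemic embedding with a textual embedding for each catalog entry can be illustrated with a toy attention step. This is a minimal sketch under stated assumptions, not the paper's architecture: the function name `bias_with_catalog`, the choice to combine the two embeddings by element-wise addition, the scaled dot-product scoring, and the residual injection are all hypothetical choices made for illustration, and the vectors are hand-written rather than learned.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def bias_with_catalog(query, text_embs, phone_embs):
    """Attend over catalog entries whose keys mix text and phoneme embeddings.

    query      : one intermediate model state (hypothetical stand-in)
    text_embs  : one vector per catalog entity, derived from its spelling
    phone_embs : one vector per catalog entity, derived from its lexicon
                 pronunciation
    All vectors are assumed to share the same dimension.
    """
    # Assumption: the two embeddings are combined by element-wise addition.
    keys = [[t + p for t, p in zip(te, pe)]
            for te, pe in zip(text_embs, phone_embs)]

    # Scaled dot-product attention of the query over the combined keys.
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, k) / scale for k in keys])

    # Context vector: attention-weighted sum of the combined embeddings.
    context = [sum(w * k[i] for w, k in zip(weights, keys))
               for i in range(len(query))]

    # Assumption: the context is injected back as a residual update.
    adapted = [q + c for q, c in zip(query, context)]
    return adapted, weights


# Toy catalog with two entities and 2-dimensional embeddings.
query = [1.0, 0.0]
text_embs = [[0.5, 0.0], [0.0, 0.5]]
phone_embs = [[0.5, 0.0], [0.0, 0.5]]
adapted, weights = bias_with_catalog(query, text_embs, phone_embs)
print(weights, adapted)
```

In this toy setup the first catalog entry points in the same direction as the query, so it receives the larger attention weight and dominates the update to the state.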

