Disambiguation-BERT for N-best Rescoring in Low-Resource Conversational ASR

10/05/2021

We study how to incorporate past conversational context into a CTC-based Automatic Speech Recognition (ASR) system by rescoring its N-best hypotheses with BERT language models. We introduce a data-efficient strategy to fine-tune BERT on transcript disambiguation without external data. Our results show word error rate recoveries of up to 37.2 in data domains that are low-resource in language (Norwegian), tone (spontaneous, conversational), and topic (parliament proceedings and customer service phone calls). We also show that the nature of the data greatly affects the performance of context-augmented N-best rescoring.
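As a rough illustration of what context-augmented N-best rescoring involves, the sketch below re-ranks a list of ASR hypotheses by interpolating each hypothesis's ASR score with a score that depends on the preceding utterances. It is not the method of the paper: the function names, the linear interpolation, the `weight` parameter, and the word-overlap scorer (a toy stand-in for a fine-tuned BERT model) are all assumptions made for this example, and the two scores are not on a calibrated common scale.

```python
from typing import Callable, List, Tuple

# A hypothesis is (transcript, ASR score); higher scores are better.
Hypothesis = Tuple[str, float]


def rescore_nbest(
    nbest: List[Hypothesis],
    context: List[str],
    context_scorer: Callable[[List[str], str], float],
    weight: float = 0.5,
) -> List[Hypothesis]:
    """Re-rank hypotheses by a weighted sum of ASR and context scores.

    The linear interpolation is an assumption of this sketch,
    not something stated in the abstract.
    """
    rescored = []
    for text, asr_score in nbest:
        combined = (1.0 - weight) * asr_score + weight * context_scorer(context, text)
        rescored.append((text, combined))
    # Best hypothesis first.
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


def toy_overlap_scorer(context: List[str], hypothesis: str) -> float:
    """Hypothetical stand-in for a BERT-based scorer.

    Returns the fraction of hypothesis words that also occur in the
    past utterances. A real system would query a language model here.
    """
    context_words = {w for utterance in context for w in utterance.lower().split()}
    words = hypothesis.lower().split()
    if not words:
        return 0.0
    return sum(w in context_words for w in words) / len(words)


# Invented example data: one past utterance and two competing hypotheses.
context = ["the committee will vote on the budget today"]
nbest = [("the boat is tomorrow", -1.0), ("the vote is tomorrow", -1.2)]

best_text, best_score = rescore_nbest(nbest, context, toy_overlap_scorer, weight=0.5)[0]
print(best_text)
```

With `weight=0.0` the ranking falls back to the ASR scores alone, so the effect of the context term can be checked by comparing the two orderings.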
