TEVR: Improving Speech Recognition by Token Entropy Variance Reduction

06/25/2022
by Hajo Nils Krabbenhöft, et al.

This paper presents TEVR, a speech recognition model designed to minimize the variation in token entropy with respect to the language model. The design exploits the fact that if the language model will reliably and accurately predict a token anyway, the acoustic model does not need to recognize that token accurately. We train German ASR models with 900 million parameters and show that on CommonVoice German, TEVR achieves a very competitive 3.64% word error rate, which outperforms the best reported results by a relative 16.89% reduction in word error rate. We hope that releasing our fully trained speech recognition pipeline to the community will lead to privacy-preserving offline virtual assistants in the future.
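
To make the idea of "variation in token entropy" concrete, here is a minimal sketch. It assumes that the per-token quantity of interest is the surprisal -log2 p(token | context) under the language model, and that variation is measured as the variance of those values across a sequence. The function names and the probabilities are hypothetical illustrations, not taken from the paper or its released pipeline.

```python
import math
from statistics import pvariance


def token_surprisals(probs):
    """Information content of each token in bits: -log2 p(token | context)."""
    return [-math.log2(p) for p in probs]


def surprisal_variance(probs):
    """Population variance of the per-token surprisals."""
    return pvariance(token_surprisals(probs))


# Hypothetical language-model probabilities for two tokenizations of the
# same text. Both carry 9 bits in total, but distribute them differently.
uneven_tokens = [0.5, 0.5, 1 / 64, 0.5]  # surprisals: 1, 1, 6, 1 bits
even_tokens = [1 / 8, 1 / 8, 1 / 8]      # surprisals: 3, 3, 3 bits

uneven_variance = surprisal_variance(uneven_tokens)
even_variance = surprisal_variance(even_tokens)

print("uneven tokenization variance:", uneven_variance)
print("even tokenization variance:", even_variance)
```

In the uneven case, one token carries most of the information, so the acoustic model has to get that token right while the others are nearly free for the language model. In the even case every token is equally hard to predict, which is the kind of balance a variance-reducing token design would aim for under the assumptions above.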

research
07/26/2018

Open Source Automatic Speech Recognition for German

High quality Automatic Speech Recognition (ASR) is a prerequisite for sp...
research
06/15/2021

Dialectal Speech Recognition and Translation of Swiss German Speech to Standard German Text: Microsoft's Submission to SwissText 2021

This paper describes the winning approach in the Shared Task 3 at SwissT...
research
06/13/2023

Large-scale Language Model Rescoring on Long-form Data

In this work, we study the impact of Large-scale Language Models (LLM) o...
research
10/06/2021

Internal Language Model Adaptation with Text-Only Data for End-to-End Speech Recognition

Text-only adaptation of an end-to-end (E2E) model remains a challenging ...
research
01/28/2022

Neural-FST Class Language Model for End-to-End Speech Recognition

We propose Neural-FST Class Language Model (NFCLM) for end-to-end speech...
research
11/16/2022

Streaming Joint Speech Recognition and Disfluency Detection

Disfluency detection has mainly been solved in a pipeline approach, as p...
research
08/18/2020

Complementary Language Model and Parallel Bi-LRNN for False Trigger Mitigation

False triggers in voice assistants are unintended invocations of the ass...
