BayesSpeech: A Bayesian Transformer Network for Automatic Speech Recognition

01/16/2023
by Will Rieger, et al.

Recent end-to-end deep learning models have been shown to match or exceed the performance of state-of-the-art Recurrent Neural Networks (RNNs) on Automatic Speech Recognition tasks, while being lighter weight and requiring less training time than traditional RNN-based approaches. However, these models take a frequentist approach to weight training. In theory, network weights are drawn from a latent, intractable probability distribution. We introduce BayesSpeech, a Bayesian Transformer network for end-to-end Automatic Speech Recognition in which these intractable posteriors are learned through variational inference and the local reparameterization trick, without recurrence. We show how introducing variance in the weights leads to faster training and near state-of-the-art performance on LibriSpeech-960.
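To make the variational training idea concrete, below is a minimal sketch (not the authors' implementation) of a linear layer with a factorized Gaussian weight posterior trained via the local reparameterization trick, as referenced in the abstract. The class name, initialization constants, and prior (a standard normal) are illustrative assumptions.

```python
# Minimal sketch of a Bayesian linear layer with the local reparameterization trick.
# Assumptions: factorized Gaussian posterior over weights, N(0, 1) prior, PyTorch.
import math
import torch
import torch.nn as nn
import torch.nn.functional as F


class BayesianLinear(nn.Module):
    """Linear layer whose weights follow a learned Gaussian posterior.

    The local reparameterization trick samples pre-activations directly:
    for input x, the pre-activation is Gaussian with mean x @ mu.T and
    variance x^2 @ sigma^2.T, which yields lower-variance gradients than
    sampling the weight matrix itself.
    """

    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.weight_mu = nn.Parameter(torch.empty(out_features, in_features))
        self.weight_rho = nn.Parameter(torch.empty(out_features, in_features))
        self.bias = nn.Parameter(torch.zeros(out_features))
        nn.init.kaiming_uniform_(self.weight_mu, a=math.sqrt(5))
        nn.init.constant_(self.weight_rho, -5.0)  # softplus(-5) gives a small initial std

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        sigma = F.softplus(self.weight_rho)               # posterior std of each weight
        act_mu = F.linear(x, self.weight_mu, self.bias)   # mean of pre-activations
        act_var = F.linear(x.pow(2), sigma.pow(2))        # variance of pre-activations
        eps = torch.randn_like(act_mu)                    # one noise sample per activation
        return act_mu + act_var.clamp_min(1e-12).sqrt() * eps

    def kl_to_standard_normal(self) -> torch.Tensor:
        """KL(q(w) || N(0, 1)): the complexity term added to the training objective."""
        sigma = F.softplus(self.weight_rho)
        return 0.5 * (sigma.pow(2) + self.weight_mu.pow(2) - 1.0 - 2.0 * sigma.log()).sum()
```

In a Transformer-based ASR model, layers like this could replace the deterministic projections, with the summed KL terms (scaled by the number of minibatches) added to the sequence loss; this is a generic Bayes-by-backprop style setup, not necessarily the exact configuration used by BayesSpeech.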

