Speech recognition with quaternion neural networks

11/21/2018
by Titouan Parcollet, et al.

Neural network architectures are at the core of powerful automatic speech recognition (ASR) systems. However, while recent research focuses on novel model architectures, the acoustic input features remain almost unchanged. Traditional ASR systems rely on multidimensional acoustic features, such as the Mel filter bank energies together with their first- and second-order derivatives, to characterize the time-frames that compose the signal sequence. Since these components describe three different views of the same element, neural networks have to learn both the internal relations that exist within these features and the external, or global, dependencies that exist between the time-frames. Quaternion-valued neural networks (QNNs) have recently received considerable interest from researchers as a way to process and learn such relations in multidimensional spaces. Indeed, quaternion numbers and QNNs have been shown to process multidimensional inputs efficiently as single entities, to encode internal dependencies, and to solve many tasks with up to four times fewer learnable parameters than real-valued models. We propose to investigate modern quaternion-valued models, such as convolutional and recurrent quaternion neural networks, in the context of speech recognition with the TIMIT dataset. The experiments show that QNNs always outperform equivalent real-valued models with far fewer free parameters, leading to a more efficient, compact, and expressive representation of the relevant information.
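To make the "three views as one entity" idea and the parameter saving more concrete, here is a minimal pure-Python sketch. It is not the authors' implementation. The abstract does not say how the features are packed, so placing the energy and its two derivatives in the three imaginary components with a zero real part is an assumption. The function names (`hamilton_product`, `frame_to_quaternions`, `quaternion_dense`, `parameter_counts`) are hypothetical, and biases are left out of the counts.

```python
def hamilton_product(p, q):
    """Hamilton product of two quaternions given as (r, i, j, k) tuples."""
    pr, pi, pj, pk = p
    qr, qi, qj, qk = q
    return (
        pr * qr - pi * qi - pj * qj - pk * qk,
        pr * qi + pi * qr + pj * qk - pk * qj,
        pr * qj - pi * qk + pj * qr + pk * qi,
        pr * qk + pi * qj - pj * qi + pk * qr,
    )


def frame_to_quaternions(energies, deltas, delta_deltas):
    """Pack the three views of each filter-bank channel into one quaternion.

    Assumption: the real part is 0 and the imaginary parts hold the energy,
    its first derivative, and its second derivative.
    """
    return [(0.0, e, d, dd) for e, d, dd in zip(energies, deltas, delta_deltas)]


def quaternion_dense(inputs, weights):
    """Apply a quaternion fully connected layer (no bias, no activation).

    inputs:  list of n_in quaternions
    weights: n_out rows, each a list of n_in quaternion weights
    """
    outputs = []
    for row in weights:
        acc = (0.0, 0.0, 0.0, 0.0)
        for w, x in zip(row, inputs):
            prod = hamilton_product(w, x)
            acc = tuple(a + b for a, b in zip(acc, prod))
        outputs.append(acc)
    return outputs


def parameter_counts(n_in, n_out):
    """Weight counts for layers mapping n_in quaternions to n_out quaternions.

    A real-valued layer sees 4*n_in inputs and 4*n_out outputs as unrelated
    scalars; a quaternion layer stores one 4-component weight per
    (input, output) pair and reuses it through the Hamilton product.
    """
    real_valued = (4 * n_in) * (4 * n_out)
    quaternion_valued = 4 * n_in * n_out
    return real_valued, quaternion_valued


# One time-frame with two filter-bank channels (illustrative values only).
frame = frame_to_quaternions([0.5, 1.0], [0.1, -0.2], [0.0, 0.3])
layer = [[(1.0, 0.0, 0.0, 0.0), (0.0, 1.0, 0.0, 0.0)]]  # 1 output unit
hidden = quaternion_dense(frame, layer)
real_params, quat_params = parameter_counts(n_in=40, n_out=256)
```

Because each quaternion weight is reused across all four components of its input through the Hamilton product, the weight count of the quaternion layer is one quarter of that of the real-valued layer with the same input and output sizes. This is where the "up to four times" figure comes from.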


research
06/17/2019

Real to H-space Encoder for Speech Recognition

Deep neural networks (DNNs) and more precisely recurrent neural networks...
research
06/20/2018

Quaternion Convolutional Neural Networks for End-to-End Automatic Speech Recognition

Recently, the connectionist temporal classification (CTC) model coupled ...
research
05/18/2020

Quaternion Neural Networks for Multi-channel Distant Speech Recognition

Despite the significant progress in automatic speech recognition (ASR), ...
research
09/14/2023

Understanding Vector-Valued Neural Networks and Their Relationship with Real and Hypercomplex-Valued Neural Networks

Despite the many successful applications of deep learning models for mul...
research
09/15/2023

Augmenting conformers with structured state space models for online speech recognition

Online speech recognition, where the model only accesses context to the ...
research
10/31/2018

Quaternion Convolutional Neural Networks for Heterogeneous Image Processing

Convolutional neural networks (CNN) have recently achieved state-of-the-...
