Towards efficient end-to-end speech recognition with biologically-inspired neural networks

10/04/2021
by Thomas Bohnstingl, et al.

Automatic speech recognition (ASR) is a capability that enables a program to convert human speech into written form. Recent developments in artificial intelligence (AI) have led to high-accuracy ASR systems based on deep neural networks, such as the recurrent neural network transducer (RNN-T). However, the core components of these approaches, and the operations they perform, depart from their powerful biological counterpart, i.e., the human brain. Conversely, current biologically-inspired ASR models based on spiking neural networks (SNNs) lag behind in accuracy and focus primarily on small-scale applications. In this work, we revisit the incorporation of biologically-plausible models into deep learning and substantially enhance their capabilities by taking inspiration from the diverse neural and synaptic dynamics found in the brain. In particular, we introduce neural connectivity concepts emulating axo-somatic and axo-axonic synapses. Based on these concepts, we propose novel deep learning units with enriched neuro-synaptic dynamics and integrate them into the RNN-T architecture. We demonstrate, for the first time, that a biologically realistic implementation of a large-scale ASR model can yield performance competitive with existing deep learning models. In addition, we show that such an implementation offers several advantages, such as reduced computational cost and lower latency, which are critical for speech recognition applications.


research
11/19/2019

Deep Spiking Neural Networks for Large Vocabulary Automatic Speech Recognition

Artificial neural networks (ANN) have become the mainstream acoustic mod...
research
03/08/2022

Harmonicity Plays a Critical Role in DNN Based Versus in Biologically-Inspired Monaural Speech Segregation Systems

Recent advancements in deep learning have led to drastic improvements in...
research
07/30/2023

Synaptic Plasticity Models and Bio-Inspired Unsupervised Deep Learning: A Survey

Recently emerged technologies based on Deep Learning (DL) achieved outst...
research
02/02/2023

Complex Dynamic Neurons Improved Spiking Transformer Network for Efficient Automatic Speech Recognition

The spiking neural network (SNN) using leaky-integrated-and-fire (LIF) n...
research
11/03/2019

eBrainII: A 3 kW Realtime Custom 3D DRAM integrated ASIC implementation of a Biologically Plausible Model of a Human Scale Cortex

The Artificial Neural Networks (ANNs) like CNN/DNN and LSTM are not biol...
research
05/11/2020

Exploring TTS without T Using Biologically/Psychologically Motivated Neural Network Modules (ZeroSpeech 2020)

In this study, we reported our exploration of Text-To-Speech without Tex...
research
09/28/2022

On the visual analytic intelligence of neural networks

Visual oddity task was conceived as a universal ethnic-independent analy...
