Successes and critical failures of neural networks in capturing human-like speech recognition

04/06/2022
by Federico Adolfi, et al.

Natural and artificial audition can in principle evolve different solutions to a given problem. The constraints of the task, however, can nudge the cognitive science and engineering of audition to qualitatively converge, suggesting that a closer mutual examination would improve both artificial hearing systems and process models of the mind and brain. Speech recognition, an area ripe for such exploration, is inherently robust in humans to a number of transformations at various spectrotemporal granularities. To what extent are these robustness profiles accounted for by high-performing neural network systems? We bring together experiments in speech recognition under a single synthesis framework to evaluate state-of-the-art neural networks as stimulus-computable, optimized observers. In a series of experiments, we (1) clarify how influential speech manipulations in the literature relate to each other and to natural speech, (2) show the granularities at which machines exhibit out-of-distribution robustness, reproducing classical perceptual phenomena in humans, (3) identify the specific conditions where model predictions of human performance differ, and (4) demonstrate a crucial failure of all artificial systems to perceptually recover where humans do, suggesting a key specification for theory and model building. These findings encourage a tighter synergy between the cognitive science and engineering of audition.
