The Performance Evaluation of Attention-Based Neural ASR under Mixed Speech Input

08/03/2021
by   Bradley He, et al.
0

In order to evaluate the performance of the attention based neural ASR under noisy conditions, the current trend is to present hours of various noisy speech data to the model and measure the overall word/phoneme error rate (W/PER). In general, it is unclear how these models perform when exposed to a cocktail party setup in which two or more speakers are active. In this paper, we present the mixtures of speech signals to a popular attention-based neural ASR, known as Listen, Attend, and Spell (LAS), at different target-to-interference ratio (TIR) and measure the phoneme error rate. In particular, we investigate in details when two phonemes are mixed what will be the predicted phoneme; in this fashion we build a model in which the most probable predictions for a phoneme are given. We found a 65 mixed speech signals at TIR = 0 dB and the performance approaches the unmixed scenario at TIR = 30 dB. Our results show the model, when presented with mixed phonemes signals, tend to predict those that have higher accuracies during evaluation of original phoneme signals.

READ FULL TEXT
research
02/20/2023

A Sidecar Separator Can Convert a Single-Talker Speech Recognition System to a Multi-Talker One

Although automatic speech recognition (ASR) can perform well in common n...
research
03/29/2018

Cracking the cocktail party problem by multi-beam deep attractor network

While recent progresses in neural network approaches to single-channel s...
research
09/21/2022

Assessing ASR Model Quality on Disordered Speech using BERTScore

Word Error Rate (WER) is the primary metric used to assess automatic spe...
research
12/12/2019

On Neural Phone Recognition of Mixed-Source ECoG Signals

The emerging field of neural speech recognition (NSR) using electrocorti...
research
01/14/2020

Improved Robust ASR for Social Robots in Public Spaces

Social robots deployed in public spaces present a challenging task for A...
research
09/19/2021

Model-Based Approach for Measuring the Fairness in ASR

The issue of fairness arises when the automatic speech recognition (ASR)...

Please sign up or login with your details

Forgot password? Click here to reset