Interactive spatial speech recognition maps based on simulated speech recognition experiments

by Marc René Schädler et al.

In everyday life, the speech recognition performance of human listeners is influenced by diverse factors, such as the acoustic environment, the talker and listener positions, possibly impaired hearing, and optional hearing devices. Prediction models increasingly aim to consider all of these factors simultaneously to predict individual speech recognition performance in complex acoustic environments. While such predictions may still not be sufficiently accurate for serious applications, they can already be performed and demand an accessible representation. In this contribution, an interactive representation of speech recognition performance is proposed, which focuses on the listener's head orientation and the spatial dimensions of an acoustic scene. An exemplary modeling toolchain, including an acoustic rendering model, a hearing device model, and a listener model, was used to generate a data set for demonstration purposes. Using the spatial speech recognition maps to explore this data set demonstrated the suitability of the approach for observing potentially relevant behavior. The proposed representation provides a suitable target to compare and validate different modeling approaches in ecologically relevant contexts. Eventually, it may serve as a tool to use validated prediction models in the design of spaces and devices which take speech communication into account.
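The toolchain described above composes three stages: an acoustic rendering model, a hearing device model, and a listener model, evaluated across head orientations to produce a spatial map. The following is a minimal sketch of that composition only; all function names, the toy head-shadow attenuation, and the logistic psychometric function are illustrative placeholders, not the authors' actual models.

```python
import math

def acoustic_rendering(scene, head_orientation_deg):
    """Toy stage 1: render target and noise levels (dB) for one head orientation.

    Placeholder head-shadow effect: the target level drops linearly (up to
    6 dB) as the head turns away from the target azimuth.
    """
    offset = abs(((head_orientation_deg - scene["target_azimuth_deg"]) + 180) % 360 - 180)
    target_db = scene["target_level_db"] - 6.0 * offset / 180.0
    return {"target_db": target_db, "noise_db": scene["noise_level_db"]}

def hearing_device(signals, gain_db=0.0):
    """Toy stage 2: apply a flat hearing-device gain to the target signal."""
    return {"target_db": signals["target_db"] + gain_db, "noise_db": signals["noise_db"]}

def listener_model(signals, srt_snr_db=-7.0, slope_per_db=0.6):
    """Toy stage 3: map SNR to a predicted recognition rate (0..1) via a
    logistic psychometric function centered at an assumed SRT."""
    snr_db = signals["target_db"] - signals["noise_db"]
    return 1.0 / (1.0 + math.exp(-slope_per_db * (snr_db - srt_snr_db)))

def spatial_speech_recognition_map(scene, orientations_deg, gain_db=0.0):
    """Chain the three stages for each head orientation (one 'map' slice)."""
    return {
        phi: listener_model(hearing_device(acoustic_rendering(scene, phi), gain_db))
        for phi in orientations_deg
    }

# Example scene: frontal talker in stationary noise (made-up levels).
scene = {"target_azimuth_deg": 0.0, "target_level_db": 65.0, "noise_level_db": 68.0}
srm = spatial_speech_recognition_map(scene, range(0, 360, 45))
```

In this sketch, facing the talker yields a higher predicted recognition rate than facing away, which is the kind of orientation-dependent behavior the proposed maps are meant to make explorable.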




