Perceptimatic: A human speech perception benchmark for unsupervised subword modelling

10/12/2020
by Juliette Millet et al.

In this paper, we present a data set and methods for comparing speech processing models with human behaviour on a phone discrimination task. We provide Perceptimatic, an open data set consisting of French and English speech stimuli, together with the results of 91 English-speaking and 93 French-speaking listeners. The stimuli test a wide range of French and English contrasts and are extracted directly from the corpora of natural running read speech used for the 2017 Zero Resource Speech Challenge. We provide a method for comparing humans' perceptual space with models' representational space, and we apply it to models previously submitted to the Challenge. We show that, unlike unsupervised models and supervised multilingual models, a standard supervised monolingual HMM-GMM phone recognition system, while good at discriminating phones, yields a representational space very different from that of human native listeners.
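The comparison between models and listeners rests on phone discrimination trials in which a model's distances between stimuli can be summarised as a delta value and set against human responses. The sketch below is a minimal, hypothetical illustration of computing such a delta from frame-level model representations; the DTW alignment, the cosine frame distance, and the function names are assumptions for illustration, not the benchmark's exact implementation.

```python
# Hypothetical sketch of an ABX-style delta computed from frame-level
# representations (frames x dims). Distance and alignment choices are
# illustrative assumptions, not the Perceptimatic reference code.
import numpy as np
from scipy.spatial.distance import cdist

def dtw_distance(x, y):
    """Average frame cost along a DTW alignment of two (frames x dims) arrays."""
    cost = cdist(x, y, metric="cosine")  # frame-wise cosine distances
    n, m = cost.shape
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = cost[i - 1, j - 1] + min(
                acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1]
            )
    # Normalise by the stimuli lengths so long items are not penalised.
    return acc[n, m] / (n + m)

def abx_delta(a, b, x):
    """Delta > 0 means X is closer to A than to B in the model's space.

    In a trial where X belongs to the same phone category as A, a larger
    delta indicates that the model separates the contrast more clearly.
    """
    return dtw_distance(x, b) - dtw_distance(x, a)

# Toy usage with random stand-ins for model representations:
rng = np.random.default_rng(0)
a = rng.normal(size=(12, 39))
b = rng.normal(size=(10, 39))
x = rng.normal(size=(11, 39))
print(abx_delta(a, b, x))
```

Under this kind of scheme, per-trial delta values from a model can be compared against listeners' discrimination responses on the same stimuli, which is the spirit of the human-model comparison described above.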

