Analysis of Predictive Coding Models for Phonemic Representation Learning in Small Datasets

Neural network models based on predictive coding are of interest for the computational modelling of human language acquisition, where the objective is to understand how linguistic units could be learned from speech without any labels. Although several promising predictive-coding-based learning algorithms have been proposed in the literature, it is currently unclear how well they generalise to different languages and training dataset sizes. In addition, although such models have been shown to be effective phonemic feature learners, it is unclear whether minimising their predictive loss functions also leads to optimal phoneme-like representations. The present study investigates the behaviour of two predictive coding models, Autoregressive Predictive Coding (APC) and Contrastive Predictive Coding (CPC), in a phoneme discrimination task (ABX task) for two languages with different dataset sizes. Our experiments show a strong correlation between the autoregressive loss and the phoneme discrimination scores on both datasets. However, to our surprise, the CPC model converges rapidly, already after one pass over the training data, and, on average, its representations outperform those of APC on both languages.
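
To make the ABX phoneme discrimination task concrete, the following is a minimal sketch of how an ABX error rate could be computed. It is not the evaluation code used in the study. It assumes, for simplicity, that each speech token is summarised by a single fixed-length vector compared with cosine distance, whereas ABX evaluations on speech typically compare frame sequences. The names `cosine_distance` and `abx_error_rate` are hypothetical and chosen for illustration only.

```python
import math


def cosine_distance(u, v):
    """Cosine distance between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def abx_error_rate(cat_a, cat_b, distance=cosine_distance):
    """ABX error rate for tokens X drawn from category A.

    For every triplet (A, B, X), with A and X distinct tokens of
    category A and B a token of category B, the trial counts as an
    error when X is farther from A than from B. Ties count as half
    an error. This is an illustrative simplification (assumption):
    one vector per token instead of a sequence of frames.
    """
    errors = 0.0
    trials = 0
    for i, x in enumerate(cat_a):
        for j, a in enumerate(cat_a):
            if i == j:
                continue  # A and X must be different tokens
            for b in cat_b:
                d_ax = distance(a, x)
                d_bx = distance(b, x)
                if d_ax > d_bx:
                    errors += 1.0
                elif d_ax == d_bx:
                    errors += 0.5
                trials += 1
    if trials == 0:
        raise ValueError("need at least two tokens in cat_a and one in cat_b")
    return errors / trials


if __name__ == "__main__":
    # Toy representations: two well-separated phoneme categories.
    phoneme_a = [[1.0, 0.0], [0.9, 0.1]]
    phoneme_b = [[0.0, 1.0], [0.1, 0.9]]
    print(abx_error_rate(phoneme_a, phoneme_b))
```

A lower error rate means that the representations separate the two phoneme categories better. In a study like the one described above, such a score would be computed on the learned APC or CPC features and compared against the training loss.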


research
05/17/2020

Vector-Quantized Autoregressive Predictive Coding

Autoregressive Predictive Coding (APC), as a self-supervised objective, ...
research
03/30/2022

Probing phoneme, language and speaker information in unsupervised speech representations

Unsupervised models of representations based on Contrastive Predictive C...
research
11/01/2020

Non-Autoregressive Predictive Coding for Learning Speech Representations from Local Dependencies

Self-supervised speech representations have been shown to be effective i...
research
04/11/2020

Improved Speech Representations with Multi-Target Autoregressive Predictive Coding

Training objectives based on predictive coding have recently been shown ...
research
10/23/2019

Generative Pre-Training for Speech with Autoregressive Predictive Coding

Learning meaningful and general representations from unannotated speech ...
research
04/24/2021

Aligned Contrastive Predictive Coding

We investigate the possibility of forcing a self-supervised model traine...
research
09/29/2021

Can phones, syllables, and words emerge as side-products of cross-situational audiovisual learning? – A computational investigation

Decades of research has studied how language learning infants learn to d...
