Truly unsupervised acoustic word embeddings using weak top-down constraints in encoder-decoder models

11/01/2018
by Herman Kamper, et al.

We investigate unsupervised models that can map a variable-duration speech segment to a fixed-dimensional representation. In settings where unlabelled speech is the only available resource, such acoustic word embeddings can form the basis for "zero-resource" speech search, discovery and indexing systems. Most existing unsupervised embedding methods still use some supervision, such as word or phoneme boundaries. Here we propose the encoder-decoder correspondence autoencoder (EncDec-CAE), which, instead of true word segments, uses automatically discovered segments: an unsupervised term discovery system finds pairs of words of the same unknown type, and the EncDec-CAE is trained to reconstruct one word given the other as input. We compare it to a standard encoder-decoder autoencoder (AE), a variational AE with a prior over its latent embedding, and downsampling. EncDec-CAE outperforms its closest competitor by 24% relative in average precision on two languages in a word discrimination task.
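
To make two ideas from the abstract concrete, here is a minimal sketch and not the authors' implementation. It shows (1) a downsampling baseline that maps a variable-length segment to a fixed-dimensional vector by keeping a fixed number of evenly spaced frames, and (2) how discovered word pairs could be turned into input/target examples for correspondence training. The function names, the 13-dimensional features, the choice of 10 sampled frames, and the use of each pair in both directions are all assumptions made for illustration.

```python
import numpy as np


def downsample_embedding(frames: np.ndarray, n_samples: int = 10) -> np.ndarray:
    """Map a (T, D) frame sequence to a fixed vector of length n_samples * D.

    Hypothetical helper: keeps n_samples evenly spaced frames and flattens them.
    Frames are repeated when T < n_samples.
    """
    n_frames = frames.shape[0]
    # Evenly spaced frame indices covering the whole segment.
    idx = np.linspace(0, n_frames - 1, n_samples).round().astype(int)
    return frames[idx].flatten()


def correspondence_examples(pairs):
    """Turn discovered (a, b) segment pairs into (input, target) examples.

    Assumption: each pair is used in both directions, so the model would
    reconstruct b from a and a from b.
    """
    examples = []
    for a, b in pairs:
        examples.append((a, b))
        examples.append((b, a))
    return examples


# Two stand-in segments of different durations (random values, not real speech).
rng = np.random.default_rng(0)
seg_a = rng.standard_normal((53, 13))
seg_b = rng.standard_normal((71, 13))

# Both segments map to the same fixed dimensionality despite different lengths.
emb_a = downsample_embedding(seg_a)
emb_b = downsample_embedding(seg_b)

# One discovered pair yields two training examples under the assumption above.
examples = correspondence_examples([(seg_a, seg_b)])
```

In an encoder-decoder model, the fixed-dimensional embedding would instead be the encoder's output, and each example's target would be what the decoder is trained to reconstruct.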

research
12/03/2020

A Correspondence Variational Autoencoder for Unsupervised Acoustic Word Embeddings

We propose a new unsupervised model for mapping a variable-duration spee...
research
03/28/2020

Unsupervised feature learning for speech using correspondence and Siamese networks

In zero-resource settings where transcribed speech audio is unavailable,...
research
07/26/2020

Self-Expressing Autoencoders for Unsupervised Spoken Term Discovery

Unsupervised spoken term discovery consists of two tasks: finding the ac...
research
07/27/2020

Evaluating the reliability of acoustic speech embeddings

Speech embeddings are fixed-size acoustic representations of variable-le...
research
12/14/2020

A comparison of self-supervised speech representations as input features for unsupervised acoustic word embeddings

Many speech processing tasks involve measuring the acoustic similarity b...
research
03/09/2016

Unsupervised word segmentation and lexicon discovery using acoustic word embeddings

In settings where only unlabelled speech data is available, speech techn...
research
04/03/2020

Analyzing autoencoder-based acoustic word embeddings

Recent studies have introduced methods for learning acoustic word embedd...
