Robust Training of Vector Quantized Bottleneck Models

05/18/2020
by   Adrian Łańcucki, et al.

In this paper we demonstrate methods for reliable and efficient training of discrete representations using Vector-Quantized Variational Auto-Encoder models (VQ-VAEs). Discrete latent variable models have been shown to learn nontrivial representations of speech, applicable to unsupervised voice conversion, and to reach state-of-the-art performance on unit discovery tasks. For unsupervised representation learning, they have become viable alternatives to continuous latent variable models such as the Variational Auto-Encoder (VAE). However, training deep discrete variable models is challenging due to the inherent non-differentiability of the discretization operation. In this paper we focus on VQ-VAE, a state-of-the-art discrete bottleneck model shown to perform on par with its continuous counterparts. It quantizes encoder outputs with on-line k-means clustering. We show that codebook learning can suffer from poor initialization and from the non-stationarity of the clustered encoder outputs. We demonstrate that these problems can be successfully overcome by increasing the learning rate for the codebook and by periodic data-dependent re-initialization of codewords. As a result, we achieve more robust training across different tasks and significantly increase the usage of latent codewords, even for large codebooks. This has practical benefits, for instance, in unsupervised representation learning, where large codebooks may lead to disentanglement of latent representations.
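The two remedies the abstract mentions can be sketched concretely. Below is a minimal NumPy illustration (not the authors' implementation) of a VQ bottleneck: nearest-codeword quantization, an exponential-moving-average (EMA) variant of the on-line k-means codebook update, and a data-dependent re-initialization that replaces rarely used codewords with randomly chosen encoder outputs. All function names, the decay rate, and the usage threshold are hypothetical choices for the sketch.

```python
import numpy as np

def quantize(z, codebook):
    # Nearest-codeword assignment: the VQ step of the bottleneck.
    # z: (N, D) encoder outputs; codebook: (K, D) codewords.
    d = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)  # (N, K) squared distances
    idx = d.argmin(axis=1)
    return codebook[idx], idx

def ema_update(codebook, counts, sums, z, idx, decay=0.99, eps=1e-5):
    # EMA-based on-line k-means update of the codebook.
    # counts: (K,) running cluster sizes; sums: (K, D) running cluster sums.
    K = codebook.shape[0]
    onehot = np.eye(K)[idx]                                  # (N, K) assignments
    counts[:] = decay * counts + (1 - decay) * onehot.sum(0)
    sums[:] = decay * sums + (1 - decay) * onehot.T @ z
    n = counts.sum()
    smoothed = (counts + eps) / (n + K * eps) * n            # Laplace-smoothed sizes
    codebook[:] = sums / smoothed[:, None]                   # new cluster means
    return codebook

def reinit_dead_codewords(codebook, counts, z, threshold=1.0):
    # Data-dependent re-initialization: codewords whose running usage
    # falls below a threshold are reset to random encoder outputs,
    # reviving "dead" codes and increasing codebook usage.
    dead = counts < threshold
    if dead.any():
        repl = z[np.random.choice(len(z), dead.sum())]
        codebook[dead] = repl
        counts[dead] = threshold
    return codebook
```

In a training loop these steps would run per batch: quantize, update the codebook by EMA, and periodically re-initialize under-used codewords; the straight-through gradient trick (not shown) handles the non-differentiable quantization in the encoder's backward pass.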


Related research

08/16/2020 — Unsupervised Acoustic Unit Representation Learning for Voice Conversion using WaveNet Auto-encoders
Unsupervised representation learning of speech has been of keen interest...

05/28/2022 — Improving VAE-based Representation Learning
Latent variable models like the Variational Auto-Encoder (VAE) are commo...

05/28/2018 — Theory and Experiments on Vector Quantized Autoencoders
Deep neural networks with discrete latent variables offer the promise of...

10/24/2020 — A Comparison of Discrete Latent Variable Models for Speech Representation Learning
Neural latent variable models enable the discovery of interesting struct...

04/11/2020 — Depthwise Discrete Representation Learning
Recent advancements in learning Discrete Representations as opposed to c...

02/12/2023 — Vector Quantized Wasserstein Auto-Encoder
Learning deep discrete latent presentations offers a promise of better s...

11/26/2021 — Learning source-aware representations of music in a discrete latent space
In recent years, neural network based methods have been proposed as a me...
