Variable-rate discrete representation learning

03/10/2021
by Sander Dieleman, et al.

Semantically meaningful information content in perceptual signals is usually unevenly distributed. In speech signals, for example, there are often many silences, and the speed of pronunciation can vary considerably. In this work, we propose slow autoencoders (SlowAEs) for unsupervised learning of high-level variable-rate discrete representations of sequences, and apply them to speech. We show that the resulting event-based representations automatically grow or shrink depending on the density of salient information in the input signals, while still allowing for faithful signal reconstruction. We develop run-length Transformers (RLTs) for event-based representation modelling and use them to construct language models in the speech domain, which are able to generate grammatical and semantically coherent utterances and continuations.
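
As a rough illustration of what an event-based, variable-rate representation can look like, the sketch below run-length encodes a discrete code sequence into (value, run length) events. It is not the authors' implementation: the function names and the toy code sequences are hypothetical, and it only assumes that the discrete codes change slowly, so that stretches carrying little information (such as silence) collapse into few events.

```python
from itertools import groupby


def run_length_encode(codes):
    """Collapse consecutive repeats into (value, run_length) events."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(codes)]


def run_length_decode(events):
    """Expand (value, run_length) events back into the original sequence."""
    out = []
    for value, length in events:
        out.extend([value] * length)
    return out


# Hypothetical toy sequences of equal length (12 steps each).
silence = [0] * 12                               # nothing changes
speech = [3, 3, 5, 5, 5, 2, 7, 7, 1, 1, 4, 4]    # codes change often

silence_events = run_length_encode(silence)
speech_events = run_length_encode(speech)

print(len(silence_events), len(speech_events))
```

Both inputs span the same number of steps, but the encoded length tracks how often the codes change, which is the sense in which such a representation grows or shrinks with the density of information. Decoding the events recovers the original sequence exactly.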

research
04/03/2018

Unsupervised Learning of Sequence Representations by Autoencoders

Traditional machine learning models have problems with handling sequence...
research
01/25/2019

Unsupervised speech representation learning using WaveNet autoencoders

We consider the task of unsupervised extraction of meaningful latent rep...
research
07/01/2019

Analysis by Adversarial Synthesis -- A Novel Approach for Speech Vocoding

Classical parametric speech coding techniques provide a compact represen...
research
04/30/2019

Incorporating Symbolic Sequential Modeling for Speech Enhancement

In a noisy environment, a lossy speech signal can be automatically resto...
research
10/15/2022

Learning Invariant Representation and Risk Minimized for Unsupervised Accent Domain Adaptation

Unsupervised representation learning for speech audios attained impressi...
research
06/05/2022

Variable-rate hierarchical CPC leads to acoustic unit discovery in speech

The success of deep learning comes from its ability to capture the hiera...
research
08/19/2019

Salient Speech Representations Based on Cloned Networks

We define salient features as features that are shared by signals that a...
