Speech Representation Learning Combining Conformer CPC with Deep Cluster for the ZeroSpeech Challenge 2021

07/13/2021
by Takashi Maekaku, et al.
Yahoo
Carnegie Mellon University

We present a system for the Zero Resource Speech Challenge 2021 that combines Contrastive Predictive Coding (CPC) with deep cluster. In deep cluster, we first obtain pseudo-labels by clustering the outputs of a CPC network with k-means. We then train an additional autoregressive model to classify these pseudo-labels in a supervised manner. A phoneme-discriminative representation is obtained by running a second round of clustering on the outputs of the final layer of the autoregressive model. We show that replacing a Transformer layer with a Conformer layer leads to a further gain in a lexical metric. Experimental results show a relative improvement of 35% in a syntactic metric over the CPC-small baseline, which is trained on LibriSpeech 460h data. We achieve top results in this challenge on the syntactic metric.
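
The two-round procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration under loud assumptions, not the authors' implementation: random vectors stand in for CPC frame features, a small one-hidden-layer classifier stands in for the autoregressive (Transformer or Conformer) model, and the names `kmeans`, `train_classifier`, `deep_cluster`, and the cluster counts are all hypothetical.

```python
import numpy as np

def assign(x, centroids):
    """Index of the nearest centroid (squared Euclidean) for every row of x."""
    dists = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)

def kmeans(x, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    centroids = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(iters):
        labels = assign(x, centroids)
        for j in range(k):
            members = x[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    return centroids, assign(x, centroids)

def train_classifier(x, labels, k, hidden=16, epochs=200, lr=0.5, seed=0):
    """Train a one-hidden-layer softmax classifier on pseudo-labels.

    Assumption: this tiny network is only a stand-in for the paper's
    autoregressive model. Returns a function that maps inputs to the
    hidden-layer activations.
    """
    rng = np.random.default_rng(seed)
    w1 = rng.normal(0, 0.1, (x.shape[1], hidden))
    b1 = np.zeros(hidden)
    w2 = rng.normal(0, 0.1, (hidden, k))
    b2 = np.zeros(k)
    onehot = np.eye(k)[labels]
    for _ in range(epochs):
        h = np.tanh(x @ w1 + b1)
        logits = h @ w2 + b2
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        # Gradients of the mean cross-entropy loss.
        grad_logits = (probs - onehot) / len(x)
        grad_h = (grad_logits @ w2.T) * (1 - h ** 2)
        w2 -= lr * (h.T @ grad_logits)
        b2 -= lr * grad_logits.sum(axis=0)
        w1 -= lr * (x.T @ grad_h)
        b1 -= lr * grad_h.sum(axis=0)
    return lambda inputs: np.tanh(inputs @ w1 + b1)

def deep_cluster(features, k1=4, k2=4):
    # Round 1: k-means on the (stand-in) CPC outputs gives pseudo-labels.
    _, pseudo_labels = kmeans(features, k1)
    # Supervised step: fit a model to predict those pseudo-labels.
    encode = train_classifier(features, pseudo_labels, k1)
    hidden = encode(features)
    # Round 2: cluster the trained model's internal representation.
    _, units = kmeans(hidden, k2, seed=1)
    return pseudo_labels, hidden, units

rng = np.random.default_rng(0)
features = rng.normal(size=(200, 8))  # assumed placeholder for CPC features
pseudo_labels, hidden, units = deep_cluster(features)
```

Here `units` plays the role of the discrete representation produced by the second-round clustering; in the actual system the clustered representation comes from the final layer of the autoregressive model rather than from a toy hidden layer.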

04/29/2021

The Interspeech Zero Resource Speech Challenge 2021: Spoken language modelling

We present the Zero Resource Speech Challenge 2021, which asks participa...
06/01/2022

DeepCluE: Enhanced Image Clustering via Multi-layer Ensembles in Deep Neural Networks

Deep clustering has recently emerged as a promising technique for comple...
11/23/2021

Exploring Non-Contrastive Representation Learning for Deep Clustering

Existing deep clustering methods rely on contrastive learning for repres...
01/28/2022

Hybrid Contrastive Learning with Cluster Ensemble for Unsupervised Person Re-identification

Unsupervised person re-identification (ReID) aims to match a query image...
04/08/2021

Pseudo-supervised Deep Subspace Clustering

Auto-Encoder (AE)-based deep subspace clustering (DSC) methods have achi...