Contrastive Representation Learning for Acoustic Parameter Estimation

02/22/2023
by Philipp Götz, et al.

We present a study in which contrastive learning is used to extract low-dimensional representations of the acoustic environment from single-channel, reverberant speech signals. Convolving room impulse responses (RIRs) with anechoic source signals serves as a data augmentation technique that offers considerable flexibility in the design of the upstream task. We evaluate the embeddings on three downstream tasks: regression of the reverberation time (RT60), regression of the clarity index (C50), and classification of rooms as small or large. We demonstrate that the learned representations generalize well to unseen data and perform similarly to a fully supervised baseline.
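The RIR-convolution augmentation and the C50 target mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' pipeline. The toy RIR model (exponentially decaying noise), the 16 kHz sampling rate, the pairing scheme (two different dry signals sharing one RIR form a positive pair), and the function names `synthetic_rir`, `reverberate`, and `clarity_c50` are all assumptions made for illustration; only the general idea of convolving anechoic signals with RIRs comes from the text.

```python
import numpy as np

FS = 16000  # assumed sampling rate in Hz (not stated in the abstract)

def synthetic_rir(rt60, fs=FS, length_s=1.0, seed=0):
    """Toy RIR (assumption): Gaussian noise whose amplitude decays by 60 dB at t = rt60."""
    rng = np.random.default_rng(seed)
    t = np.arange(int(fs * length_s)) / fs
    envelope = 10.0 ** (-3.0 * t / rt60)  # amplitude factor 1e-3 (-60 dB) at t = rt60
    rir = rng.standard_normal(t.size) * envelope
    rir[0] = 1.0  # place a direct-path impulse at time zero
    return rir / np.max(np.abs(rir))

def reverberate(dry, rir):
    """Convolve a dry signal with an RIR via the FFT, truncated to the dry length."""
    n = dry.size + rir.size - 1
    wet = np.fft.irfft(np.fft.rfft(dry, n) * np.fft.rfft(rir, n), n)
    return wet[: dry.size]

def clarity_c50(rir, fs=FS):
    """C50 in dB: energy in the first 50 ms over the remaining energy.

    Assumes the direct sound arrives at index 0 of the RIR.
    """
    n50 = int(0.05 * fs)
    early = np.sum(rir[:n50] ** 2)
    late = np.sum(rir[n50:] ** 2)
    return 10.0 * np.log10(early / late)

# Hypothetical pairing: two different dry signals, one shared room.
rng = np.random.default_rng(1)
dry_a = rng.standard_normal(FS)  # stand-ins for anechoic speech clips
dry_b = rng.standard_normal(FS)
room = synthetic_rir(rt60=0.4, seed=2)
other_room = synthetic_rir(rt60=1.2, seed=3)

anchor = reverberate(dry_a, room)          # same room as `positive`
positive = reverberate(dry_b, room)        # different content, same acoustics
negative = reverberate(dry_a, other_room)  # same content, different acoustics
```

Under this assumed scheme, an encoder trained to pull `anchor` and `positive` together while pushing `negative` away would have to encode the room rather than the speech content. Labels such as `clarity_c50(room)` would only be needed for the downstream regression stage.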


research
04/08/2022

Automatic Data Augmentation Selection and Parametrization in Contrastive Self-Supervised Speech Representation Learning

Contrastive learning enables learning useful audio and speech representa...
research
02/05/2023

CIPER: Combining Invariant and Equivariant Representations Using Contrastive and Predictive Learning

Self-supervised representation learning (SSRL) methods have shown great ...
research
06/01/2022

Rethinking the Augmentation Module in Contrastive Learning: Learning Hierarchical Augmentation Invariance with Expanded Views

A data augmentation module is utilized in contrastive learning to transf...
research
11/13/2021

Evaluating Contrastive Learning on Wearable Timeseries for Downstream Clinical Outcomes

Vast quantities of person-generated health data (wearables) are collecte...
research
09/01/2023

Learning Speech Representation From Contrastive Token-Acoustic Pretraining

For fine-grained generation and recognition tasks such as minimally-supe...
research
07/23/2023

SCRAPS: Speech Contrastive Representations of Acoustic and Phonetic Spaces

Numerous examples in the literature proved that deep learning models hav...
research
09/09/2019

Data Augmentation and Deep Convolutional Neural Networks for Blind Room Acoustic Parameter Estimation

Reverberation time (T60) and the direct-to-reverberant ratio (DRR) are t...
