Channel-Aware Pretraining of Joint Encoder-Decoder Self-Supervised Model for Telephonic-Speech ASR

11/03/2022
by   Vrunda N. Sukhadia, et al.

This paper proposes a novel technique to obtain better downstream ASR performance from a joint encoder-decoder self-supervised model trained on speech pooled from two different channels (narrowband and wideband). The joint encoder-decoder self-supervised model extends the HuBERT model with a Transformer decoder. HuBERT clusters the input features and predicts the cluster class of every input frame. With simple pooling, which is our baseline, the model has no way to identify the channel a frame came from. To incorporate channel information, we propose non-overlapping cluster IDs for speech from different channels. Our method gives a relative improvement of about 5% over the joint encoder-decoder self-supervised model built with simple pooling of data, which serves as our baseline.
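The idea of non-overlapping cluster IDs can be illustrated with a minimal sketch. It assumes frame-level cluster IDs in the range 0 to K-1 are already available for each utterance, and that channels are kept apart by adding a per-channel offset of K. The abstract does not say how the disjoint IDs are actually produced, so the offset scheme, the value of `NUM_CLUSTERS`, and the function names below are all hypothetical.

```python
import numpy as np

# Hypothetical number of clusters per channel; the abstract gives no value.
NUM_CLUSTERS = 100

# Assumed channel index used to shift cluster IDs into disjoint ranges.
CHANNEL_INDEX = {"narrowband": 0, "wideband": 1}


def pooled_targets(cluster_ids):
    """Baseline (simple pooling): both channels share the same ID range."""
    return np.asarray(cluster_ids, dtype=np.int64)


def channel_aware_targets(cluster_ids, channel, num_clusters=NUM_CLUSTERS):
    """Shift frame-level cluster IDs so each channel owns its own ID range.

    Narrowband frames keep IDs in [0, K), wideband frames move to [K, 2K),
    so a target ID also encodes the channel it came from.
    """
    ids = np.asarray(cluster_ids, dtype=np.int64)
    if ids.size and (ids.min() < 0 or ids.max() >= num_clusters):
        raise ValueError("cluster IDs must lie in [0, num_clusters)")
    return ids + CHANNEL_INDEX[channel] * num_clusters


if __name__ == "__main__":
    frames = [3, 3, 17, 42]
    print(pooled_targets(frames))
    print(channel_aware_targets(frames, "narrowband"))
    print(channel_aware_targets(frames, "wideband"))
```

Under this assumption the prediction layer would need 2K output classes instead of K, and the same acoustic cluster index maps to different targets depending on the channel, which is what lets the pretraining objective carry channel information.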
