Cross-attention conformer for context modeling in speech enhancement for ASR

10/30/2021
by Arun Narayanan, et al.

This work introduces the cross-attention conformer, an attention-based architecture for context modeling in speech enhancement. Because context information is often sequential and of a different length than the audio to be enhanced, we use cross-attention to summarize contextual information and merge it with the input features. Building on the recently proposed conformer model, which uses self-attention layers as building blocks, the proposed cross-attention conformer can be used to build deep contextual models. As a concrete example, we show how noise context, i.e., a short noise-only audio segment preceding an utterance, can be used to build a speech enhancement feature frontend from cross-attention conformer layers that improves the noise robustness of automatic speech recognition.
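
To make the layer structure concrete, below is a minimal sketch in PyTorch of one cross-attention conformer layer: a conformer-style block in which the attention module draws its queries from the input features and its keys and values from a separate, possibly different-length context sequence (e.g., the noise context). The module names, sub-block ordering, and hyperparameters are illustrative assumptions, not the paper's exact implementation.

```python
# A minimal sketch of one cross-attention conformer layer (PyTorch).
# Module names, sub-block ordering, and hyperparameters are illustrative
# assumptions, not the paper's implementation.
import torch
import torch.nn as nn


class FeedForwardModule(nn.Module):
    """Conformer-style feed-forward module with pre-norm and Swish."""
    def __init__(self, dim, expansion=4, dropout=0.1):
        super().__init__()
        self.net = nn.Sequential(
            nn.LayerNorm(dim),
            nn.Linear(dim, dim * expansion),
            nn.SiLU(),  # Swish activation, as in the conformer block
            nn.Dropout(dropout),
            nn.Linear(dim * expansion, dim),
            nn.Dropout(dropout),
        )

    def forward(self, x):
        return self.net(x)


class ConvModule(nn.Module):
    """Conformer-style convolution module (pointwise-GLU, depthwise, pointwise)."""
    def __init__(self, dim, kernel=15, dropout=0.1):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.pointwise1 = nn.Conv1d(dim, 2 * dim, 1)
        self.glu = nn.GLU(dim=1)
        self.depthwise = nn.Conv1d(dim, dim, kernel,
                                   padding=kernel // 2, groups=dim)
        self.bn = nn.BatchNorm1d(dim)
        self.act = nn.SiLU()
        self.pointwise2 = nn.Conv1d(dim, dim, 1)
        self.dropout = nn.Dropout(dropout)

    def forward(self, x):                       # x: (batch, frames, dim)
        y = self.norm(x).transpose(1, 2)        # (batch, dim, frames) for Conv1d
        y = self.glu(self.pointwise1(y))
        y = self.act(self.bn(self.depthwise(y)))
        y = self.dropout(self.pointwise2(y)).transpose(1, 2)
        return y


class CrossAttentionConformerLayer(nn.Module):
    """Conformer block whose attention module attends from the input
    features (queries) to a separate context sequence (keys/values),
    so the two streams may have different lengths."""
    def __init__(self, dim, num_heads=4, conv_kernel=15, dropout=0.1):
        super().__init__()
        self.ff1 = FeedForwardModule(dim, dropout=dropout)
        self.norm_q = nn.LayerNorm(dim)
        self.norm_kv = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, num_heads,
                                          dropout=dropout, batch_first=True)
        self.conv = ConvModule(dim, conv_kernel, dropout)
        self.ff2 = FeedForwardModule(dim, dropout=dropout)
        self.final_norm = nn.LayerNorm(dim)

    def forward(self, x, context):
        # x: (batch, T, dim) input features; context: (batch, S, dim),
        # e.g. noise-only frames, where S need not equal T.
        x = x + 0.5 * self.ff1(x)               # half-step FFN residual
        q, kv = self.norm_q(x), self.norm_kv(context)
        attn_out, _ = self.attn(q, kv, kv, need_weights=False)
        x = x + attn_out                        # merge summarized context
        x = x + self.conv(x)
        x = x + 0.5 * self.ff2(x)
        return self.final_norm(x)


# Usage: process 100 utterance frames given 40 frames of noise context.
layer = CrossAttentionConformerLayer(dim=256)
feats = torch.randn(2, 100, 256)                # utterance features
noise_ctx = torch.randn(2, 40, 256)             # noise-only context
out = layer(feats, noise_ctx)                   # (2, 100, 256)
```

Stacking several such layers, each attending to the same noise context, gives the kind of deep contextual enhancement frontend the abstract describes.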


Related research

09/14/2022 · A Universally-Deployable ASR Frontend for Joint Acoustic Echo Cancellation, Speech Enhancement, and Voice Separation
Recent work has shown that it is possible to train a single model to per...

07/18/2021 · DeHumor: Visual Analytics for Decomposing Humor
Despite being a critical communication skill, grasping humor is challeng...

03/22/2023 · Self-supervised Learning with Speech Modulation Dropout
We show that training a multi-headed self-attention-based deep network t...

11/10/2022 · Speech Enhancement with Fullband-Subband Cross-Attention Network
FullSubNet has shown its promising performance on speech enhancement by ...

06/02/2023 · Audio-Visual Speech Enhancement with Score-Based Generative Models
This paper introduces an audio-visual speech enhancement system that lev...

06/30/2021 · DF-Conformer: Integrated architecture of Conv-TasNet and Conformer using linear complexity self-attention for speech enhancement
Single-channel speech enhancement (SE) is an important task in speech pr...

02/06/2022 · On Using Transformers for Speech-Separation
Transformers have enabled major improvements in deep learning. They ofte...
