Multi-View Attention Transfer for Efficient Speech Enhancement

08/22/2022
by WooSeok Shin, et al.

Recent deep learning models have achieved high performance in speech enhancement; however, it remains challenging to obtain a fast, low-complexity model without significant performance degradation. Previous knowledge distillation studies on speech enhancement have not solved this problem because their output-level distillation methods do not fit the speech enhancement task in several respects. In this study, we propose multi-view attention transfer (MV-AT), a feature-based distillation method, to obtain efficient speech enhancement models in the time domain. Based on a multi-view feature extraction model, MV-AT transfers the multi-view knowledge of the teacher network to the student network without additional parameters. Experimental results show that the proposed method consistently improves the performance of student models of various sizes on the Valentini and deep noise suppression (DNS) datasets. With our method, MANNER-S-8.1GF, a lightweight model for efficient deployment, requires 15.4x fewer parameters and 4.71x fewer floating-point operations (FLOPs) than a baseline model with similar performance.
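The abstract does not reproduce the MV-AT loss itself, so the sketch below only illustrates the general family of feature-based attention-transfer losses it builds on: attention maps are derived from intermediate feature maps of the teacher and student and matched with a distance term, adding no parameters to the student. It assumes PyTorch, (batch, channels, time) features from a time-domain model, and one teacher/student feature pair per view; the function names, the channel-wise squared-sum attention definition, and the equal weighting of views are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F


def attention_map(feat: torch.Tensor) -> torch.Tensor:
    # Collapse a (batch, channels, time) feature map into a 1-D attention map
    # by summing squared activations over channels, then L2-normalize it so
    # teacher and student maps are on a comparable scale.
    attn = feat.pow(2).sum(dim=1)          # (batch, time)
    return F.normalize(attn, p=2, dim=1)


def attention_transfer_loss(teacher_feats, student_feats):
    # Mean squared distance between teacher and student attention maps,
    # averaged over the matched views (equal view weights are an assumption).
    loss = 0.0
    for t, s in zip(teacher_feats, student_feats):
        t = t.detach()                      # no gradient flows into the frozen teacher
        if s.shape[-1] != t.shape[-1]:
            # Align temporal resolution if the student features are shorter/longer.
            s = F.interpolate(s, size=t.shape[-1], mode="linear", align_corners=False)
        loss = loss + F.mse_loss(attention_map(s), attention_map(t))
    return loss / len(teacher_feats)
```

In training, such a term would typically be added to the ordinary enhancement objective, e.g. total_loss = enhancement_loss + lambda_at * attention_transfer_loss(teacher_feats, student_feats), with the weight lambda_at tuned on a validation set.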

Related research

04/02/2022
Fast Real-time Personalized Speech Enhancement: End-to-End Enhancement Network (E3Net) and Knowledge Distillation
This paper investigates how to improve the runtime speed of personalized...

03/04/2022
MANNER: Multi-view Attention Network for Noise Erasure
In the field of speech enhancement, time domain methods have difficultie...

05/29/2020
Sub-band Knowledge Distillation Framework for Speech Enhancement
In single-channel speech enhancement, methods based on full-band spectra...

11/08/2021
SEOFP-NET: Compression and Acceleration of Deep Neural Networks for Speech Enhancement Using Sign-Exponent-Only Floating-Points
Numerous compression and acceleration strategies have achieved outstandi...

08/17/2018
A study on speech enhancement using exponent-only floating point quantized neural network (EOFP-QNN)
Numerous studies have investigated the effectiveness of neural network q...

09/15/2023
Two-Step Knowledge Distillation for Tiny Speech Enhancement
Tiny, causal models are crucial for embedded audio machine learning appl...

12/07/2021
ADD: Frequency Attention and Multi-View based Knowledge Distillation to Detect Low-Quality Compressed Deepfake Images
Despite significant advancements of deep learning-based forgery detector...
