A Deep Learning Loss Function based on Auditory Power Compression for Speech Enhancement

08/26/2021
by   Tianrui Wang, et al.

Deep learning has been widely applied to speech enhancement. While testing the effectiveness of various network structures, researchers are also exploring improvements to the loss function used in network training. Although existing loss functions account for the auditory characteristics of speech or provide a reasonable expression of the signal-to-noise ratio, their correlation with auditory evaluation scores and their suitability for gradient-based optimization still need to be improved. In this paper, a signal-to-noise ratio loss function based on auditory power compression is proposed. Experimental results show that the proposed function correlates better overall with objective speech intelligibility indexes than other loss functions do. For the same speech enhancement model, training with the proposed loss also yields better results than the comparison methods.


