HiFi-GAN: High-Fidelity Denoising and Dereverberation Based on Speech Deep Features in Adversarial Networks

06/10/2020
by Jiaqi Su, et al.

Real-world audio recordings are often degraded by factors such as noise, reverberation, and equalization distortion. This paper introduces HiFi-GAN, a deep learning method to transform recorded speech to sound as though it had been recorded in a studio. We use an end-to-end feed-forward WaveNet architecture, trained with multi-scale adversarial discriminators in both the time domain and the time-frequency domain. It relies on the deep feature matching losses of the discriminators to improve the perceptual quality of enhanced speech. The proposed model generalizes well to new speakers, new speech content, and new environments. It significantly outperforms state-of-the-art baseline methods in both objective and subjective experiments.
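
The abstract mentions deep feature matching losses computed from the discriminators. As a rough illustration only, the sketch below shows one common way such a loss is formed: the mean absolute difference between a discriminator's intermediate activations for clean speech and for enhanced speech, averaged over layers. The function name `feature_matching_loss`, the flat-list representation of each layer's activations, and the per-layer normalization are assumptions made for this example; the abstract does not specify them.

```python
from typing import Sequence


def feature_matching_loss(
    real_feats: Sequence[Sequence[float]],
    fake_feats: Sequence[Sequence[float]],
) -> float:
    """Average over layers of the mean L1 distance between activations.

    real_feats[i] and fake_feats[i] are the (flattened) activations of
    discriminator layer i for the clean and the enhanced signal.
    This layout and normalization are assumptions for illustration.
    """
    if len(real_feats) != len(fake_feats) or not real_feats:
        raise ValueError("need the same, non-zero number of layers")

    total = 0.0
    for real, fake in zip(real_feats, fake_feats):
        if len(real) != len(fake) or not real:
            raise ValueError("each layer must have matching, non-empty features")
        # Mean absolute difference within this layer.
        total += sum(abs(r - f) for r, f in zip(real, fake)) / len(real)

    # Average across layers so deeper discriminators do not dominate.
    return total / len(real_feats)


if __name__ == "__main__":
    clean = [[1.0, 2.0], [0.0, 0.0, 3.0]]
    enhanced = [[1.0, 4.0], [0.0, 3.0, 3.0]]
    print(feature_matching_loss(clean, enhanced))
```

In a real training setup the activations would be tensors taken from each discriminator (time-domain and time-frequency), and this term would be added to the adversarial loss of the generator; the pure-Python lists here only keep the sketch dependency-free.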


research
09/25/2019

High Fidelity Speech Synthesis with Adversarial Networks

Generative adversarial networks have seen rapid development in recent ye...
research
11/02/2022

DSPGAN: a GAN-based universal vocoder for high-fidelity TTS by time-frequency domain supervision from DSP

Recent development of neural vocoders based on the generative adversaria...
research
04/17/2021

Multi-Metric Optimization using Generative Adversarial Networks for Near-End Speech Intelligibility Enhancement

The intelligibility of speech severely degrades in the presence of envir...
research
06/27/2018

Speech Denoising with Deep Feature Losses

We present an end-to-end deep learning approach to denoising speech sign...
research
04/22/2022

Speaking-Rate-Controllable HiFi-GAN Using Feature Interpolation

This paper presents a speaking-rate-controllable HiFi-GAN neural vocoder...
research
04/28/2020

Conditional Spoken Digit Generation with StyleGAN

This paper adapts a StyleGAN model for speech generation with minimal or...
research
10/18/2021

KaraTuner: Towards end to end natural pitch correction for singing voice in karaoke

An automatic pitch correction system typically includes several stages, ...
