VoiceFixer: A Unified Framework for High-Fidelity Speech Restoration

04/12/2022
by Haohe Liu, et al.

Speech restoration aims to remove distortions in speech signals. Prior methods mainly focus on a single type of distortion, such as speech denoising or dereverberation. However, speech signals can be degraded by several different distortions simultaneously in the real world. It is thus important to extend speech restoration models to deal with multiple distortions. In this paper, we introduce VoiceFixer, a unified framework for high-fidelity speech restoration. VoiceFixer restores speech from multiple distortions (e.g., noise, reverberation, and clipping) and can expand degraded speech (e.g., noisy speech) with a low bandwidth to 44.1 kHz full-bandwidth high-fidelity speech. We design VoiceFixer based on (1) an analysis stage that predicts intermediate-level features from the degraded speech, and (2) a synthesis stage that generates waveform using a neural vocoder. Both objective and subjective evaluations show that VoiceFixer is effective on severely degraded speech, such as real-world historical speech recordings. Samples of VoiceFixer are available at https://haoheliu.github.io/voicefixer.
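
To make the idea of several simultaneous distortions concrete, the following is a minimal sketch of how a degraded input might be simulated from a clean signal. It is not the authors' data pipeline: the function name `degrade`, its parameters, the synthetic impulse response, and the order of the distortions are all assumptions chosen for illustration, and only NumPy is used.

```python
import numpy as np


def degrade(clean, sample_rate=44100, snr_db=10.0, clip_level=0.25,
            cutoff_hz=4000.0, seed=0):
    """Apply reverberation, noise, band-limiting, and clipping in sequence.

    Hypothetical helper for illustration only; the distortion order and
    all default values are assumptions, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    n = len(clean)

    # Reverberation: convolve with a synthetic, exponentially decaying
    # noise impulse response (0.1 s long, direct path set to 1).
    ir_len = int(0.1 * sample_rate)
    decay = np.exp(-np.arange(ir_len) / (0.03 * sample_rate))
    impulse_response = 0.1 * rng.standard_normal(ir_len) * decay
    impulse_response[0] = 1.0
    reverberant = np.convolve(clean, impulse_response)[:n]

    # Additive white noise scaled to the requested signal-to-noise ratio.
    signal_power = np.mean(reverberant ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noisy = reverberant + np.sqrt(noise_power) * rng.standard_normal(n)

    # Low bandwidth: zero every frequency bin above the cutoff.
    spectrum = np.fft.rfft(noisy)
    freqs = np.fft.rfftfreq(n, d=1.0 / sample_rate)
    spectrum[freqs > cutoff_hz] = 0.0
    band_limited = np.fft.irfft(spectrum, n=n)

    # Clipping: hard-limit the amplitude.
    return np.clip(band_limited, -clip_level, clip_level)


if __name__ == "__main__":
    # A short 220 Hz tone stands in for clean speech.
    sr = 16000
    t = np.arange(sr // 4) / sr
    tone = 0.5 * np.sin(2.0 * np.pi * 220.0 * t)
    degraded = degrade(tone, sample_rate=sr)
    print(degraded.shape, float(np.max(np.abs(degraded))))
```

A restoration model in the sense of the abstract would take a signal like `degraded` as input and aim to recover the clean, full-bandwidth signal; the sketch covers only the degradation side.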


research
09/28/2021

VoiceFixer: Toward General Speech Restoration with Neural Vocoder

Speech restoration aims to remove distortions in speech signals. Prior m...
research
02/17/2022

A Two-Stage U-Net for High-Fidelity Denoising of Historical Recordings

Enhancing the sound quality of historical music recordings is a long-sta...
research
07/13/2022

A Cyclical Approach to Synthetic and Natural Speech Mismatch Refinement of Neural Post-filter for Low-cost Text-to-speech System

Neural-based text-to-speech (TTS) systems achieve very high-fidelity spe...
research
06/21/2021

Glow-WaveGAN: Learning Speech Representations from GAN-based Variational Auto-Encoder For High Fidelity Flow-based Speech Synthesis

Current two-stage TTS framework typically integrates an acoustic model w...
research
06/02/2023

HD-DEMUCS: General Speech Restoration with Heterogeneous Decoders

This paper introduces an end-to-end neural speech restoration model, HD-...
research
06/21/2021

UniTTS: Residual Learning of Unified Embedding Space for Speech Style Control

We propose a novel high-fidelity expressive speech synthesis model, UniT...
research
12/30/2022

Blind Restoration of Real-World Audio by 1D Operational GANs

Objective: Despite numerous studies proposed for audio restoration in th...
