Phase Repair for Time-Domain Convolutional Neural Networks in Music Super-Resolution

06/20/2023
by Yenan Zhang, et al.

Audio Super-Resolution (SR) is an important topic in the field of audio processing. Many models are designed in the time domain because waveform processing has advantages such as avoiding the phase problem. However, prior work has shown that Time-Domain Convolutional Neural Network (TD-CNN) approaches tend to produce annoying artifacts in their output. To confirm the source of these artifacts, we conduct an AB listening test and find phase to be the cause. We further propose Time-Domain Phase Repair (TD-PR), which improves the performance of TD-CNNs by repairing the phase of their output. In this paper, we focus on the music SR task, which is challenging due to the wide frequency response and dynamic range of music. Our proposed method can handle various input bandwidths from 2.5 kHz to 4 kHz, with a target bandwidth of 8 kHz. We conduct both objective and subjective evaluations to assess the proposed method. The objective evaluation results indicate that the proposed method performs the SR task effectively. Moreover, the proposed TD-PR obtains much higher mean opinion scores than all TD-CNN baselines, which indicates that it significantly improves perceptual quality. Samples are available on the demo page.
