Audio inpainting with generative adversarial network

03/13/2020
by P. P. Ebner, et al.

We study the ability of a Wasserstein Generative Adversarial Network (WGAN) to generate missing audio content that is, in context, statistically similar to the surrounding sound and the neighboring borders. We address the challenge of inpainting long-range audio gaps (500 ms) with WGAN models. We improve the quality of the inpainted part with a newly proposed WGAN architecture that uses both short-range and long-range neighboring borders, in contrast to the classical WGAN model. Performance was compared on two instruments (piano and guitar) and on recordings of virtuoso pianists together with a string orchestra. The objective difference grade (ODG) was used to evaluate both architectures. The proposed model outperforms the classical WGAN model and improves the reconstruction of high-frequency content. Further, we obtained better results for instruments whose frequency spectrum lies mainly in the lower range, where small noises are less annoying to the human ear and the inpainting part is more perceptible. Finally, we show that, for audio in which a particular instrument is accompanied by other instruments, better test results are reached if the network is trained only on that particular instrument, neglecting the other instruments.
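
To make the setup concrete, the following is a minimal sketch of how a 500 ms gap and its short-range and long-range neighboring borders could be cut from a waveform. The function name `split_gap_and_context`, the 16 kHz sample rate, and the border lengths (500 ms and 2000 ms) are illustrative assumptions; the abstract specifies only the 500 ms gap length.

```python
import numpy as np

def split_gap_and_context(signal, sample_rate, gap_start_s,
                          gap_ms=500.0, short_ms=500.0, long_ms=2000.0):
    """Cut out a gap and the borders on both sides of it.

    short_ms and long_ms are assumed values for illustration only;
    only the 500 ms gap length comes from the abstract.
    """
    gap_len = int(round(sample_rate * gap_ms / 1000.0))
    short_len = int(round(sample_rate * short_ms / 1000.0))
    long_len = int(round(sample_rate * long_ms / 1000.0))
    start = int(round(gap_start_s * sample_rate))
    end = start + gap_len
    if start - long_len < 0 or end + long_len > len(signal):
        raise ValueError("not enough context around the gap")
    return {
        "gap": signal[start:end],                    # target to be inpainted
        "short_left": signal[start - short_len:start],
        "short_right": signal[end:end + short_len],
        "long_left": signal[start - long_len:start],
        "long_right": signal[end:end + long_len],
    }

# Usage on a synthetic 10 s ramp signal (assumed 16 kHz sample rate).
sample_rate = 16000
signal = np.arange(10 * sample_rate, dtype=np.float32)
parts = split_gap_and_context(signal, sample_rate, gap_start_s=5.0)
```

In a sketch like this, the short-range borders would carry fine local detail next to the gap, while the long-range borders would give a generator broader musical context; how the proposed architecture actually combines them is described in the full text, not here.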

research
10/29/2018

A context encoder for audio inpainting

We studied the ability of deep neural networks (DNNs) to restore missing...
research
05/11/2020

GACELA – A generative adversarial context encoder for long audio inpainting

We introduce GACELA, a generative adversarial network (GAN) designed to ...
research
05/24/2023

Diffusion-Based Audio Inpainting

Audio inpainting aims to reconstruct missing segments in corrupted recor...
research
06/11/2022

Multi-instrument Music Synthesis with Spectrogram Diffusion

An ideal music synthesizer should be both interactive and expressive, ge...
research
07/13/2021

Timbre Classification of Musical Instruments with a Deep Learning Multi-Head Attention-Based Model

The aim of this work is to define a model based on deep learning that is...
research
06/24/2020

Face-to-Music Translation Using a Distance-Preserving Generative Adversarial Network with an Auxiliary Discriminator

Learning a mapping between two unrelated domains-such as image and audio...
research
09/19/2019

Physics-informed semantic inpainting: Application to geostatistical modeling

A fundamental problem in geostatistical modeling is to infer the heterog...
