A context encoder for audio inpainting

10/29/2018
by   Andrés Marafioti, et al.

We studied the ability of deep neural networks (DNNs) to restore missing audio content based on its context, a process usually referred to as audio inpainting. We focused on gaps in the range of tens of milliseconds, a condition that has received little attention so far. The proposed DNN structure was trained separately on audio signals containing music and on signals containing musical instruments, in both cases with 64-ms gaps. The input to the DNN was the context, i.e., the signal surrounding the gap, transformed into time-frequency (TF) coefficients. Two networks were analyzed: a DNN with complex-valued TF coefficient output and another producing magnitude TF coefficient output, both based on the same network architecture. We found significant differences in the inpainting results between the two DNNs. In particular, we discuss the observation that the complex-valued DNN fails to produce reliable results outside the low-frequency range. Further, our results were compared to those obtained from a reference method based on linear predictive coding (LPC). For instruments, our DNNs were not able to match the performance of the reference method, although the magnitude network still provided good results. For music, however, our magnitude DNN significantly outperformed the reference method, demonstrating the general usability of the proposed DNN structure for inpainting complex audio signals like music. This paves the way towards future, more sophisticated audio inpainting approaches based on DNNs.
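
The input preparation described in the abstract (cut a 64-ms gap, take the surrounding context, transform the context into TF coefficients) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the 16 kHz sampling rate, the 128-ms context length, the Hann window, the window and hop sizes, and the helper names `split_context_and_gap` and `stft` are all assumptions made for the example.

```python
import numpy as np


def split_context_and_gap(signal, sample_rate, gap_start,
                          gap_ms=64.0, context_ms=128.0):
    """Split a signal into (context before, gap, context after).

    gap_ms=64 follows the abstract; context_ms is an assumed value.
    """
    gap_len = int(round(sample_rate * gap_ms / 1000.0))
    ctx_len = int(round(sample_rate * context_ms / 1000.0))
    gap_end = gap_start + gap_len
    if gap_start - ctx_len < 0 or gap_end + ctx_len > len(signal):
        raise ValueError("not enough signal around the gap for the context")
    before = signal[gap_start - ctx_len:gap_start]
    gap = signal[gap_start:gap_end]
    after = signal[gap_end:gap_end + ctx_len]
    return before, gap, after


def stft(x, win_len=512, hop=128):
    """Plain short-time Fourier transform (assumed Hann window, no padding).

    Returns a complex array of shape (n_frames, win_len // 2 + 1).
    """
    window = np.hanning(win_len)
    n_frames = 1 + (len(x) - win_len) // hop
    frames = np.stack([x[i * hop:i * hop + win_len] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)


# Usage on a synthetic 1-second, 440 Hz tone (assumed 16 kHz sampling rate).
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
audio = np.sin(2 * np.pi * 440.0 * t)

before, gap, after = split_context_and_gap(audio, sample_rate, gap_start=8000)

# Complex TF coefficients of each context side: the kind of network input
# the abstract describes.
tf_before = stft(before)
tf_after = stft(after)

# Magnitude TF coefficients of the gap: the kind of target a
# magnitude-output network would be trained to predict.
gap_magnitude = np.abs(stft(gap))
```

With these assumed settings the gap spans 1024 samples and each context side spans 2048 samples. A magnitude-only output discards phase, so it would need a separate phase reconstruction step before the gap can be resynthesized as a waveform.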


research
05/11/2020

GACELA – A generative adversarial context encoder for long audio inpainting

We introduce GACELA, a generative adversarial network (GAN) designed to ...
research
03/13/2020

Audio inpainting with generative adversarial network

We study the ability of Wasserstein Generative Adversarial Network (WGAN...
research
10/09/2020

Audio-Visual Speech Inpainting with Deep Learning

In this paper, we present a deep-learning-based framework for audio-visu...
research
05/24/2023

Diffusion-Based Audio Inpainting

Audio inpainting aims to reconstruct missing segments in corrupted recor...
research
09/06/2017

A Comparison on Audio Signal Preprocessing Methods for Deep Neural Networks on Music Tagging

Deep neural networks (DNN) have been successfully applied for music clas...
research
10/08/2020

All for One and One for All: Improving Music Separation by Bridging Networks

This paper proposes several improvements for music separation with deep ...
research
10/08/2020

Tatum-Level Drum Transcription Based on a Convolutional Recurrent Neural Network with Language Model-Based Regularized Training

This paper describes a neural drum transcription method that detects fro...
