AMSS-Net: Audio Manipulation on User-Specified Sources with Textual Queries

04/28/2021
by Woosung Choi, et al.

This paper proposes a neural network that applies audio transformations to user-specified sources (e.g., vocals) of a given audio track according to a textual description, while preserving the sources not mentioned in the description. Audio Manipulation on a Specific Source (AMSS) is challenging because a sound object (i.e., a waveform sample or frequency bin) is 'transparent': unlike a pixel in an image, it usually carries information from multiple sources. To address this problem, we propose AMSS-Net, which extracts latent sources and selectively manipulates them while preserving irrelevant sources. We also propose an evaluation benchmark for several AMSS tasks, and we show, using objective metrics and empirical verification, that AMSS-Net outperforms baselines on several of these tasks.
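
The 'transparency' point can be illustrated with a toy sketch. This is not AMSS-Net or any part of the authors' method; it only shows why editing a mixture directly affects every source. The sine-wave "vocals" and "drums", the sample rate, and all function names are hypothetical stand-ins chosen for illustration.

```python
import math

SAMPLE_RATE = 8000  # Hz; arbitrary value assumed for this illustration


def sine(freq_hz, n_samples, amplitude=1.0):
    """Return n_samples of a sine wave as a list of floats."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * t / SAMPLE_RATE)
            for t in range(n_samples)]


def mix(*sources):
    """Sum sources sample by sample, so each mixture sample blends all sources."""
    return [sum(samples) for samples in zip(*sources)]


def apply_gain(signal, gain):
    """Scale every sample of a signal by a constant gain."""
    return [gain * x for x in signal]


n = 64
vocals = sine(440.0, n, amplitude=0.5)  # hypothetical stand-in for a vocal source
drums = sine(110.0, n, amplitude=0.3)   # hypothetical stand-in for a drum source
mixture = mix(vocals, drums)

# Naive edit: halving the mixture attenuates the drums as well as the vocals.
naive = apply_gain(mixture, 0.5)

# Desired edit: halve only the vocals. This needs the individual sources,
# which are not directly available from the mixture in practice.
target = mix(apply_gain(vocals, 0.5), drums)

# The gap between the two is exactly the drum energy the naive edit removed.
residual = [t - v for t, v in zip(target, naive)]
```

Here `residual` equals half of `drums`, sample by sample: the part of the untouched source that a mixture-level edit would have altered.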

research
04/08/2019

Audio Classification of Bit-Representation Waveform

This paper investigates waveform representation for audio signal classif...
research
05/03/2023

Diverse and Vivid Sound Generation from Text Descriptions

Previous audio generation mainly focuses on specified sound classes such...
research
10/26/2022

Audio Mosaicing with Simulation-based Inference

Mosaics and collages have been an integral part of art for decades. Part...
research
03/28/2022

Separate What You Describe: Language-Queried Audio Source Separation

In this paper, we introduce the task of language-queried audio source se...
research
05/30/2023

A Unified Audio-Visual Learning Framework for Localization, Separation, and Recognition

The ability to accurately recognize, localize and separate sound sources...
research
02/11/2021

Multichannel-based learning for audio object extraction

The current paradigm for creating and deploying immersive audio content ...
research
12/18/2018

Uniform Convergence Bounds for Codec Selection

We frame the problem of selecting an optimal audio encoding scheme as a ...
