DDSP-based Neural Waveform Synthesis of Polyphonic Guitar Performance from String-wise MIDI Input

09/14/2023
by Nicolas Jonason, et al.

We explore the use of neural synthesis for acoustic guitar from string-wise MIDI input. We propose four different systems and compare them, using both objective metrics and subjective evaluation, against natural audio and a sample-based baseline. We develop these four systems iteratively, varying the architecture and the intermediate tasks, such as predicting pitch and loudness control features. We find that formulating control feature prediction as a classification task rather than a regression task yields better results. Furthermore, we find that our simplest proposed system, which directly predicts synthesis parameters from MIDI input, performs best of the four. Audio examples are available at https://erl-j.github.io/neural-guitar-web-supplement.
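
To make the classification-versus-regression distinction concrete, here is a minimal sketch of how a pitch control feature could be framed either way. It is an illustration under stated assumptions, not the authors' implementation: the bin width, pitch range, bin count, and all function names are hypothetical and do not come from the paper.

```python
import math

# Hypothetical settings chosen for illustration; none are taken from the paper.
F0_MIN_HZ = 60.0       # lowest representable pitch
CENTS_PER_BIN = 20.0   # width of one pitch class in cents
N_BINS = 360           # 360 bins * 20 cents = 6 octaves


def hz_to_bin(f0_hz):
    """Quantize a pitch in Hz to a class index (the classification target)."""
    cents = 1200.0 * math.log2(f0_hz / F0_MIN_HZ)
    idx = int(round(cents / CENTS_PER_BIN))
    return min(max(idx, 0), N_BINS - 1)  # clamp to the valid class range


def bin_to_hz(idx):
    """Decode a class index back to the centre frequency of its bin."""
    return F0_MIN_HZ * 2.0 ** (idx * CENTS_PER_BIN / 1200.0)


def cross_entropy(logits, target_idx):
    """Classification loss: negative log-softmax at the target class."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target_idx]


def squared_error(pred_hz, target_hz):
    """Regression loss on the same feature, for comparison."""
    return (pred_hz - target_hz) ** 2
```

In this framing a model would output `N_BINS` logits per frame and be trained with `cross_entropy` against `hz_to_bin(f0)`, instead of outputting one scalar trained with `squared_error`. The cost of the classification view is a quantization error of at most half a bin (10 cents with the values assumed here) when decoding with `bin_to_hz`.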
