Multi-Step Chord Sequence Prediction Based on Aggregated Multi-Scale Encoder-Decoder Network

11/12/2019
by Tristan Carsault, et al.

This paper studies the prediction of chord progressions for jazz music using machine learning models. The motivation of our study comes from the recent success of neural networks for performing automatic music composition. Although high accuracies are obtained in single-step prediction scenarios, most models fail to generate accurate multi-step chord predictions. In this paper, we postulate that this failure stems from the multi-scale structure of musical information and propose new architectures based on an iterative temporal aggregation of input labels. Specifically, the input and ground truth labels are merged into increasingly large temporal bags, and we train one encoder-decoder network per temporal scale on these bags. In a second step, we use the pre-trained encoder bottleneck features at each scale to train a final encoder-decoder network. Furthermore, we rely on different reductions of the initial chord alphabet into three adapted chord alphabets. We perform evaluations against several state-of-the-art models and show that our multi-scale architecture outperforms existing methods in terms of accuracy and perplexity, while requiring relatively few parameters. We analyze musical properties of the results, showing the influence of downbeat position within the analysis window on accuracy, and evaluate errors using a musically-informed distance metric.
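The temporal aggregation described in the abstract can be illustrated with a minimal sketch. It assumes chords are integer label indices, that each step is encoded as a one-hot vector, and that a "temporal bag" is the sum of the one-hot vectors of consecutive steps. The abstract does not specify the aggregation operator, the scale factors, or any function names, so `one_hot`, `aggregate`, `multi_scale_bags`, and the factors `(1, 2, 4)` are all hypothetical choices made for illustration.

```python
import numpy as np


def one_hot(labels, alphabet_size):
    """Encode a sequence of integer chord labels as one-hot rows."""
    seq = np.zeros((len(labels), alphabet_size), dtype=np.float32)
    seq[np.arange(len(labels)), labels] = 1.0
    return seq


def aggregate(seq, factor):
    """Merge every `factor` consecutive steps into one bag by summing.

    Summation is an assumption of this sketch; the source only says
    that labels are merged into increasingly large temporal bags.
    """
    steps, size = seq.shape
    if steps % factor != 0:
        raise ValueError("sequence length must be divisible by factor")
    return seq.reshape(steps // factor, factor, size).sum(axis=1)


def multi_scale_bags(labels, alphabet_size, factors=(1, 2, 4)):
    """Return one aggregated view of the sequence per temporal scale."""
    seq = one_hot(labels, alphabet_size)
    return {factor: aggregate(seq, factor) for factor in factors}


# Eight steps over a toy alphabet of four chord classes.
bags = multi_scale_bags([0, 0, 1, 2, 2, 2, 3, 3], alphabet_size=4)
for factor, bag in bags.items():
    print(factor, bag.shape)
```

In a setup like the one the abstract outlines, each of these views would serve as input and target for its own encoder-decoder, whose bottleneck features would then feed the final network. That training stage is not shown here.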

research
09/03/2021

Musical Tempo Estimation Using a Multi-scale Network

Recently, some single-step systems without onset detection have shown th...
research
11/12/2019

Using musical relationships between chord labels in automatic chord extraction tasks

Recent researches on Automatic Chord Extraction (ACE) have focused on th...
research
08/04/2018

Learning Multi-scale Features for Foreground Segmentation

Foreground segmentation algorithms aim segmenting moving objects from th...
research
08/03/2021

An Empirical Evaluation of End-to-End Polyphonic Optical Music Recognition

Previous work has shown that neural architectures are able to perform op...
research
04/16/2022

GAUSS: Guided Encoder-Decoder Architecture for Hyperspectral Unmixing with Spatial Smoothness

In recent hyperspectral unmixing (HU) literature, the application of dee...
research
12/22/2021

MOSAIC: Mobile Segmentation via decoding Aggregated Information and encoded Context

We present a next-generation neural network architecture, MOSAIC, for ef...
research
07/25/2022

Effective and Interpretable Information Aggregation with Capacity Networks

How to aggregate information from multiple instances is a key question m...
