From Words to Music: A Study of Subword Tokenization Techniques in Symbolic Music Generation

04/18/2023
by Adarsh Kumar, et al.

Subword tokenization has been widely successful in text-based natural language processing (NLP) tasks with Transformer-based models. As Transformer models become increasingly popular in symbolic music research, it is important to investigate how well subword tokenization works in the symbolic music domain. In this paper, we explore subword tokenization techniques, such as byte-pair encoding (BPE), in symbolic music generation and their impact on the overall structure of generated songs. Our experiments are based on three types of MIDI datasets: single-track melody only, multi-track with a single instrument, and multi-track with multiple instruments. We apply subword tokenization on top of musical tokenization schemes and find that it enables the generation of longer songs in the same amount of time and improves the overall structure of the generated music in terms of objective metrics such as the structure indicator (SI) and pitch class entropy. We also compare two subword tokenization methods, BPE and Unigram, and observe that both lead to consistent improvements. Our study suggests that subword tokenization is a promising technique for symbolic music generation and may have broader implications for music composition, particularly in cases involving complex data such as multi-track songs.
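
To make the idea of applying BPE on top of a musical tokenization concrete, here is a minimal sketch in Python. It is not the authors' implementation: the token names (`Pitch_60`, `Dur_4`, and so on), the `+` separator for merged tokens, and the helper functions are all hypothetical and chosen only for illustration. The sketch repeatedly merges the most frequent adjacent pair of musical tokens into a single new token, which shortens each sequence.

```python
from collections import Counter


def most_frequent_pair(sequences):
    """Return the most frequent adjacent token pair across all sequences, or None."""
    counts = Counter()
    for seq in sequences:
        counts.update(zip(seq, seq[1:]))
    if not counts:
        return None
    return counts.most_common(1)[0][0]


def merge_pair(seq, pair, merged):
    """Replace every non-overlapping occurrence of `pair` in `seq` with `merged`."""
    out = []
    i = 0
    while i < len(seq):
        if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
            out.append(merged)
            i += 2
        else:
            out.append(seq[i])
            i += 1
    return out


def learn_bpe(sequences, num_merges):
    """Learn up to `num_merges` merges; return the merge list and the encoded sequences."""
    sequences = [list(s) for s in sequences]
    merges = []
    for _ in range(num_merges):
        pair = most_frequent_pair(sequences)
        if pair is None:
            break
        # Hypothetical naming convention for a merged token.
        merged = pair[0] + "+" + pair[1]
        merges.append(pair)
        sequences = [merge_pair(s, pair, merged) for s in sequences]
    return merges, sequences


# Toy corpus of hypothetical note events (pitch token followed by duration token).
songs = [
    ["Pitch_60", "Dur_4", "Pitch_62", "Dur_4", "Pitch_60", "Dur_4"],
    ["Pitch_60", "Dur_4", "Pitch_64", "Dur_2"],
]

merges, encoded = learn_bpe(songs, num_merges=1)
print(merges)
print([len(s) for s in songs], "->", [len(s) for s in encoded])
```

In this toy corpus the pair (`Pitch_60`, `Dur_4`) occurs most often, so it is merged first and both sequences become shorter. A shorter token sequence for the same music is one plausible reading of why a fixed-length model context can cover a longer song; the sketch does not reproduce the paper's datasets, vocabulary sizes, or metrics.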


