An Improved Relative Self-Attention Mechanism for Transformer with Application to Music Generation

09/12/2018
by Cheng-Zhi Anna Huang, et al.

Music relies heavily on self-reference to build structure and meaning. We explore the Transformer architecture (Vaswani et al., 2017) as a generative model for music, as self-attention has shown compelling results on tasks that require long-term structure, such as Wikipedia summary generation (Liu et al., 2018). However, timing information is critical for polyphonic music, and the Transformer does not explicitly model absolute or relative timing in its structure. To address this challenge, Shaw et al. (2018) introduced relative position representations to self-attention to improve machine translation; their formulation, however, was not scalable to longer sequences. We propose an improved formulation that reduces the memory requirements of the relative position computation from O(l^2 d) to O(l d), making it possible to train on much longer sequences and to achieve faster convergence. In experiments on symbolic music we find that relative self-attention substantially improves sample quality for unconditioned generation and can generate sequences longer than those in the training set. When primed with an initial sequence, the model generates continuations that develop the prime coherently and exhibit long-term structure. Relative self-attention can thus be instrumental in capturing richer relationships within a musical piece.
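The memory saving comes from a "skewing" procedure: rather than materializing the O(l^2 d) tensor of per-pair relative embeddings used by Shaw et al. (2018), the queries are multiplied directly against a single (l, d) table of relative position embeddings, and the resulting (l, l) matrix is rearranged by padding, reshaping, and slicing. Below is a minimal NumPy sketch for a single decoder head; the embedding layout (relative distances -(l-1) through 0) follows the paper, while the function and variable names are illustrative.

```python
import numpy as np

def relative_attention_logits(q, k, rel_emb):
    """Attention logits with relative positions in O(L*d) extra memory.

    q, k:     (L, d) queries and keys for one attention head.
    rel_emb:  (L, d) embeddings for relative distances -(L-1) .. 0.
    Returns:  (L, L) unnormalized attention logits.
    """
    L, d = q.shape
    content = q @ k.T                      # standard content-based term, (L, L)
    # Multiply queries against the (L, d) table directly, never forming
    # the (L, L, d) intermediate of the earlier formulation.
    rel = q @ rel_emb.T                    # (L, L); column r holds distance r - (L - 1)
    # Skew: pad one zero column on the left, reshape, drop the first row,
    # so that entry (i, j) ends up holding the logit for relative distance j - i.
    padded = np.pad(rel, ((0, 0), (1, 0)))     # (L, L + 1)
    skewed = padded.reshape(L + 1, L)[1:]      # (L, L)
    # Entries above the diagonal are meaningless after skewing and are
    # assumed to be removed by the decoder's causal mask before the softmax.
    return (content + skewed) / np.sqrt(d)
```

The only position-dependent activations here are (l, l) matrices, so the extra memory beyond ordinary self-attention is the (l, d) embedding table itself, which is what makes training on much longer sequences feasible.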

Related research

09/12/2018 · Music Transformer
Music relies heavily on repetition to build structure and meaning. Self-...

07/21/2021 · Melody Structure Transfer Network: Generating Music with Separable Self-Attention
Symbolic music generation has attracted increasing attention, while most...

03/06/2018 · Self-Attention with Relative Position Representations
Relying entirely on an attention mechanism, the Transformer introduced b...

07/05/2019 · A Bi-directional Transformer for Musical Chord Recognition
Chord recognition is an important task since chords are highly abstract ...

09/06/2021 · PermuteFormer: Efficient Relative Position Encoding for Long Sequences
A recent variation of Transformer, Performer, scales Transformer to long...

11/16/2019 · Music theme recognition using CNN and self-attention
We present an efficient architecture to detect mood/themes in music trac...

05/31/2023 · Monotonic Location Attention for Length Generalization
We explore different ways to utilize position-based cross-attention in s...
