Understanding Self-Attention of Self-Supervised Audio Transformers

06/05/2020
by   Shu-wen Yang, et al.

Self-supervised Audio Transformers (SAT) have enabled great success in many downstream speech applications such as ASR, yet how they work has not been widely explored. In this work, we present multiple strategies for analyzing the attention mechanisms in SAT. We categorize attention maps into explainable categories and find that each category possesses its own unique functionality. We provide a visualization tool for understanding multi-head self-attention, importance-ranking strategies for identifying critical attention heads, and attention-refinement techniques for improving model performance.
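The abstract does not detail the paper's exact ranking strategies, but the underlying objects are standard scaled dot-product attention maps. As a minimal sketch (assuming NumPy arrays for one head's query/key matrices, and using per-row attention entropy as one hypothetical importance proxy, not the paper's actual metric):

```python
import numpy as np

def self_attention_weights(q, k):
    # Scaled dot-product attention weights: softmax(Q K^T / sqrt(d)).
    # q, k: arrays of shape (T, d) for a single attention head.
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    w = np.exp(scores)
    return w / w.sum(axis=-1, keepdims=True)      # rows sum to 1

def head_entropy(weights):
    # Mean entropy of the per-position attention distributions.
    # Lower entropy = sharper (more focused) attention; such summary
    # statistics are one simple way to compare and rank heads.
    eps = 1e-12
    ent = -(weights * np.log(weights + eps)).sum(axis=-1)
    return float(ent.mean())
```

A head whose weights concentrate on a few positions (e.g. the diagonal) will score a much lower entropy than one that spreads attention uniformly, which is the kind of behavioral difference the paper's categories capture.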


Related research

06/09/2020: Hand-crafted Attention is All You Need? A Study of Attention on Self-supervised Audio Transformer. In this paper, we seek to reduce the computation complexity of transform...

04/13/2021: EAT: Enhanced ASR-TTS for Self-supervised Speech Recognition. Self-supervised ASR-TTS models suffer in out-of-domain data conditions. ...

06/16/2019: Theoretical Limitations of Self-Attention in Neural Sequence Models. Transformers are emerging as the new workhorse of NLP, showing great suc...

02/25/2021: SparseBERT: Rethinking the Importance Analysis in Self-attention. Transformer-based models are popular for natural language processing (NL...

12/07/2022: Teaching Matters: Investigating the Role of Supervision in Vision Transformers. Vision Transformers (ViTs) have gained significant popularity in recent ...

10/06/2020: Guiding Attention for Self-Supervised Learning with Transformers. In this paper, we propose a simple and effective technique to allow for ...

02/03/2023: PSST! Prosodic Speech Segmentation with Transformers. Self-attention mechanisms have enabled transformers to achieve superhuma...
