Study of positional encoding approaches for Audio Spectrogram Transformers

10/13/2021
by Leonardo Pepino, et al.

Transformers have revolutionized the world of deep learning, especially in the field of natural language processing. Recently, the Audio Spectrogram Transformer (AST) was proposed for audio classification, leading to state-of-the-art results on several datasets. However, for ASTs to outperform CNNs, pretraining with ImageNet is needed. In this paper, we study one component of the AST, the positional encoding, and propose several variants to improve the performance of ASTs trained from scratch, without ImageNet pretraining. Our best model, which incorporates conditional positional encodings, significantly improves performance on AudioSet and ESC-50 compared to the original AST.
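
For readers unfamiliar with the term, the following is a minimal sketch of one common formulation of a conditional positional encoding: a depthwise 3x3 convolution with zero padding applied over the 2D grid of patch tokens, whose output is added back to the tokens. Whether this matches the exact variant evaluated in the paper is an assumption; the function name, grid shape, and kernel values are hypothetical and chosen only for illustration, and plain Python lists stand in for tensors.

```python
# Sketch only: a conditional positional encoding (CPE) written in plain Python.
# Shapes, kernel values, and names are illustrative assumptions, not the
# paper's implementation.

def conditional_positional_encoding(grid, kernels):
    """Add a position signal computed from the tokens themselves.

    grid:    H x W x C nested lists of patch embeddings
             (for a spectrogram, e.g. frequency patches x time patches).
    kernels: C separate 3x3 kernels, one per channel (depthwise convolution).
    Returns a new H x W x C grid: token + depthwise_conv(neighbourhood).
    """
    height, width, channels = len(grid), len(grid[0]), len(grid[0][0])
    out = [[[0.0] * channels for _ in range(width)] for _ in range(height)]
    for i in range(height):
        for j in range(width):
            for c in range(channels):
                acc = 0.0
                for di in (-1, 0, 1):
                    for dj in (-1, 0, 1):
                        ni, nj = i + di, j + dj
                        # Zero padding: neighbours outside the grid contribute
                        # nothing, so border tokens receive a different signal
                        # from interior tokens.
                        if 0 <= ni < height and 0 <= nj < width:
                            acc += kernels[c][di + 1][dj + 1] * grid[ni][nj][c]
                out[i][j][c] = grid[i][j][c] + acc
    return out


# Toy usage: identical tokens on a 3x4 grid, one channel, all-ones kernel.
tokens = [[[1.0] for _ in range(4)] for _ in range(3)]
ones_kernel = [[[1.0] * 3 for _ in range(3)]]
encoded = conditional_positional_encoding(tokens, ones_kernel)
```

In the toy usage every input token is identical, yet corner, edge, and interior positions end up with different values because of the zero padding. Unlike a fixed-size table of learned embeddings, the same kernels apply to a patch grid of any size.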

research
11/23/2022

ASiT: Audio Spectrogram vIsion Transformer for General Audio Representation

Vision transformers, which were originally developed for natural languag...
research
10/11/2021

Efficient Training of Audio Transformers with Patchout

The great success of transformer-based models in natural language proces...
research
05/10/2023

XTab: Cross-table Pretraining for Tabular Transformers

The success of self-supervised learning in computer vision and natural l...
research
10/25/2022

Audio MFCC-gram Transformers for respiratory insufficiency detection in COVID-19

This work explores speech as a biomarker and investigates the detection ...
research
06/03/2021

When Vision Transformers Outperform ResNets without Pretraining or Strong Data Augmentations

Vision Transformers (ViTs) and MLPs signal further efforts on replacing ...
research
03/03/2023

Data-Efficient Training of CNNs and Transformers with Coresets: A Stability Perspective

Coreset selection is among the most effective ways to reduce the trainin...
research
03/14/2023

CAT: Causal Audio Transformer for Audio Classification

The attention-based Transformers have been increasingly applied to audio...
