FlexiAST: Flexibility is What AST Needs

07/18/2023
by Jiu Feng, et al.

The objective of this work is to give patch-size flexibility to Audio Spectrogram Transformers (ASTs). Recent ASTs have shown superior performance on various audio-based tasks. However, the performance of a standard AST degrades drastically when it is evaluated with a patch size different from the one used during training. As a result, AST models are typically re-trained to accommodate a change in patch size. To overcome this limitation, this paper proposes FlexiAST, a training procedure that gives standard AST models this flexibility without architectural changes, allowing them to work with various patch sizes at the inference stage. The proposed training approach simply combines random patch size selection with resizing of the patch and positional embedding weights. Our experiments show that FlexiAST performs similarly to standard AST models while retaining its ability to be evaluated at various patch sizes, on different datasets for audio classification tasks.
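The two ingredients named in the abstract, random patch size selection and resizing of the patch and positional embedding weights, can be illustrated with a small sketch. This is not the authors' implementation: the function names (`bilinear_resize`, `flexi_step`), the use of plain bilinear interpolation as the resizing rule, the candidate patch sizes, and the single-channel 2D weights are all assumptions made to keep the example self-contained.

```python
import random


def bilinear_resize(grid, new_h, new_w):
    """Resize a 2D list of floats to (new_h, new_w) by bilinear interpolation.

    Assumption: corner-aligned sampling; the paper may use a different rule.
    """
    old_h, old_w = len(grid), len(grid[0])

    def src_coord(i, new_n, old_n):
        # Map output index i to a (fractional) input coordinate.
        return 0.0 if new_n == 1 else i * (old_n - 1) / (new_n - 1)

    out = []
    for i in range(new_h):
        y = src_coord(i, new_h, old_h)
        y0 = int(y)
        y1 = min(y0 + 1, old_h - 1)
        wy = y - y0
        row = []
        for j in range(new_w):
            x = src_coord(j, new_w, old_w)
            x0 = int(x)
            x1 = min(x0 + 1, old_w - 1)
            wx = x - x0
            top = grid[y0][x0] * (1 - wx) + grid[y0][x1] * wx
            bottom = grid[y1][x0] * (1 - wx) + grid[y1][x1] * wx
            row.append(top * (1 - wy) + bottom * wy)
        out.append(row)
    return out


def flexi_step(patch_kernel, pos_embed_grid, spec_shape, patch_sizes, rng):
    """One hypothetical training-step setup with a randomly chosen patch size.

    patch_kernel:   2D patch-embedding weights at the base patch size
                    (one input/output channel, for illustration only).
    pos_embed_grid: 2D positional-embedding grid (one embedding dimension).
    spec_shape:     (frequency_bins, time_frames) of the input spectrogram.
    """
    p = rng.choice(patch_sizes)                 # random patch size selection
    n_freq, n_time = spec_shape[0] // p, spec_shape[1] // p
    kernel = bilinear_resize(patch_kernel, p, p)            # resize patch weights
    pos = bilinear_resize(pos_embed_grid, n_freq, n_time)   # resize position grid
    return p, kernel, pos


# Usage with made-up shapes: a 16x16 base kernel and an 8x64 position grid
# for a 128x1024 spectrogram.
rng = random.Random(0)
base_kernel = [[1.0] * 16 for _ in range(16)]
base_pos = [[float(r * 64 + c) for c in range(64)] for r in range(8)]
p, kernel, pos = flexi_step(base_kernel, base_pos, (128, 1024), [8, 16, 32], rng)
print(p, len(kernel), len(pos), len(pos[0]))
```

In a real model each weight tensor has channel and embedding dimensions as well, so the same resizing would be applied per channel, and the resized weights would be used for the forward pass of that training step.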


