Multi-Scale Self-Attention for Text Classification

12/02/2019
by Qipeng Guo, et al.

In this paper, we introduce multi-scale structure as prior knowledge into self-attention modules. We propose a Multi-Scale Transformer, which uses multi-scale multi-head self-attention to capture features at different scales. Drawing on a linguistic perspective and an analysis of a Transformer (BERT) pre-trained on a huge corpus, we further design a strategy to control the scale distribution in each layer. Results on three kinds of tasks (21 datasets) show that our Multi-Scale Transformer outperforms the standard Transformer consistently and significantly on small and moderate-size datasets.
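
As a rough illustration of what multi-scale multi-head self-attention could look like, the sketch below gives each head its own scale and lets a token attend only to positions within that many steps of itself. This is a minimal sketch under assumptions the abstract does not confirm: the symmetric window definition, the example scale values, and the omission of learned query/key/value projections are all simplifications, and the function names are hypothetical rather than taken from the paper.

```python
import math


def window_mask(seq_len, scale):
    """mask[i][j] is True when position j lies within `scale` tokens of i.

    Assumption: a symmetric local window; the abstract does not define it.
    """
    return [[abs(i - j) <= scale for j in range(seq_len)] for i in range(seq_len)]


def masked_attention(queries, keys, values, mask):
    """Scaled dot-product attention restricted to the unmasked positions."""
    dim = len(queries[0])
    outputs = []
    for i, query in enumerate(queries):
        scores = [
            sum(a * b for a, b in zip(query, key)) / math.sqrt(dim)
            if mask[i][j]
            else -math.inf
            for j, key in enumerate(keys)
        ]
        # Softmax with max-subtraction; masked positions get weight 0.
        peak = max(scores)
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        weights = [w / total for w in weights]
        outputs.append(
            [
                sum(w * value[d] for w, value in zip(weights, values))
                for d in range(len(values[0]))
            ]
        )
    return outputs


def multi_scale_attention(x, scales):
    """Run one head per scale on the same inputs and concatenate the heads.

    Assumption: learned projections are omitted, so every head uses x
    directly as queries, keys, and values.
    """
    seq_len = len(x)
    heads = [masked_attention(x, x, x, window_mask(seq_len, s)) for s in scales]
    return [sum((head[i] for head in heads), []) for i in range(seq_len)]


# Hypothetical toy input: 4 tokens with 2 features each, 3 heads.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]
mixed = multi_scale_attention(tokens, scales=[0, 1, 3])
```

With these example scales, the first head sees only the token itself, the second sees immediate neighbours, and the third sees the whole four-token sequence, so each output row concatenates a local, a medium-range, and a global view of the same position.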


