LittleBird: Efficient Faster & Longer Transformer for Question Answering

10/21/2022
by Minchul Lee, et al.

BERT has shown a great deal of success in a wide variety of NLP tasks, but its attention mechanism scales quadratically with input length, which limits it on long inputs. Longformer, ETC, and BigBird address this issue and effectively solve the quadratic dependency problem. However, we find that these models are not sufficient, and we propose LittleBird, a novel model based on BigBird with improved speed and memory footprint while maintaining accuracy. In particular, we devise a more flexible and efficient position representation method based on Attention with Linear Biases (ALiBi). We also show that replacing BigBird's method of representing global information with pack and unpack attention is more effective. The proposed model can work on long inputs even after being pre-trained on short inputs, and it can be trained efficiently by reusing an existing pre-trained language model for short inputs. This is a significant benefit for low-resource languages, where large amounts of long text data are difficult to obtain. As a result, our experiments show that LittleBird works very well in a variety of languages, achieving high performance on question answering tasks, particularly on KorQuAD 2.0, a Korean question answering dataset for long paragraphs.
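For intuition, below is a minimal PyTorch sketch of the ALiBi-style linear position bias that LittleBird builds on: instead of learned positional embeddings, each attention head subtracts a distance-proportional penalty from its attention scores, which is what lets such models run on inputs longer than those seen during pre-training. The slope schedule is the geometric sequence from the original ALiBi paper; the symmetric bidirectional distance used here is an illustrative assumption, since ALiBi was formulated for causal attention and LittleBird's actual variant differs in detail.

```python
import math
import torch

def alibi_slopes(num_heads: int) -> torch.Tensor:
    """Head-specific slopes: the geometric sequence 2^(-8/h), 2^(-16/h), ...
    from the ALiBi paper (assumes num_heads is a power of two)."""
    base = 2.0 ** (-8.0 / num_heads)
    return torch.tensor([base ** (i + 1) for i in range(num_heads)])

def alibi_bias(num_heads: int, seq_len: int) -> torch.Tensor:
    """Linear distance penalty added to attention scores. The symmetric
    |i - j| distance is an assumption for bidirectional attention; the
    original ALiBi penalizes only past positions in causal decoding."""
    pos = torch.arange(seq_len)
    dist = (pos[None, :] - pos[:, None]).abs()             # (seq, seq)
    return -alibi_slopes(num_heads)[:, None, None] * dist  # (heads, seq, seq)

def attention_with_alibi(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
    """Scaled dot-product attention with the ALiBi bias.
    q, k, v: (heads, seq, dim)."""
    heads, seq, dim = q.shape
    scores = q @ k.transpose(-2, -1) / math.sqrt(dim)
    scores = scores + alibi_bias(heads, seq)
    return torch.softmax(scores, dim=-1) @ v

if __name__ == "__main__":
    q, k, v = (torch.randn(8, 16, 32) for _ in range(3))
    print(attention_with_alibi(q, k, v).shape)  # torch.Size([8, 16, 32])
```

The pack and unpack attention mentioned above is, roughly, a two-step scheme in which a small set of pack tokens first summarizes the full sequence and ordinary tokens then attend to those summaries plus a local window; see the paper for the exact formulation.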


Related research

12/18/2021
Cascading Adaptors to Leverage English Data to Improve Performance of Question Answering for Low-Resource Languages
Transformer-based architectures have shown notable results on many down...

10/14/2019
Whatcha lookin' at? DeepLIFTing BERT's Attention in Question Answering
There has been great success recently in tackling challenging NLP tasks...

02/23/2023
VLSP2022-EVJVQA Challenge: Multilingual Visual Question Answering
Visual Question Answering (VQA) is a challenging task of natural languag...

04/18/2023
HeRo: RoBERTa and Longformer Hebrew Language Models
In this paper, we fill in an existing gap in resources available to the...

04/30/2021
Chop Chop BERT: Visual Question Answering by Chopping VisualBERT's Heads
Vision-and-Language (VL) pre-training has shown great potential on many...

12/15/2021
LongT5: Efficient Text-To-Text Transformer for Long Sequences
Recent work has shown that either (1) increasing the input length or (2)...

12/21/2022
Analyzing Semantic Faithfulness of Language Models via Input Intervention on Conversational Question Answering
Transformer-based language models have been shown to be highly effective...
