Characterizing the Efficiency vs. Accuracy Trade-off for Long-Context NLP Models

04/15/2022
by Phyllis Ang, et al.

With many real-world applications of Natural Language Processing (NLP) involving long texts, there has been a rise in NLP benchmarks that measure the accuracy of models that can handle longer input sequences. However, these benchmarks do not consider the trade-offs between accuracy, speed, and power consumption as input sizes or model sizes are varied. In this work, we perform a systematic study of this accuracy vs. efficiency trade-off on two widely used long-sequence models, Longformer-Encoder-Decoder (LED) and Big Bird, during fine-tuning and inference on four datasets from the SCROLLS benchmark. To study how this trade-off differs across hyperparameter settings, we compare the models across four sequence lengths (1024, 2048, 3072, 4096) and two model sizes (base and large) under a fixed resource budget. We find that LED consistently achieves better accuracy at lower energy costs than Big Bird. For summarization, we find that increasing model size is more energy efficient than increasing sequence length for higher accuracy. However, this comes at the cost of a large drop in inference speed. For question answering, we find that smaller models are both more efficient and more accurate due to the larger training batch sizes possible under a fixed resource budget.
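As a rough illustration of the kind of comparison the abstract describes, the sketch below times generation for a long-document model at the four studied sequence lengths using Hugging Face Transformers. This is not the authors' code: the checkpoint name, input document, and generation length are assumptions made for illustration only.

    # Minimal sketch (not the authors' code): time generation for an LED checkpoint
    # at the four input lengths studied in the paper. Checkpoint, document, and
    # generation length are illustrative assumptions.
    import time
    import torch
    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    CHECKPOINT = "allenai/led-base-16384"   # assumed LED base checkpoint
    SEQ_LENGTHS = [1024, 2048, 3072, 4096]  # sequence lengths compared in the study

    tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT)
    model = AutoModelForSeq2SeqLM.from_pretrained(CHECKPOINT).eval()

    document = "..."  # a long input document, e.g. one example from a SCROLLS task

    for max_len in SEQ_LENGTHS:
        # Truncate the document to the current input budget.
        inputs = tokenizer(document, truncation=True, max_length=max_len,
                           return_tensors="pt")
        start = time.perf_counter()
        with torch.no_grad():
            model.generate(**inputs, max_new_tokens=128)
        elapsed = time.perf_counter() - start
        print(f"seq_len={max_len}: {elapsed:.2f}s per example")
        # Energy could be tracked alongside latency, e.g. by polling GPU power
        # draw via NVML while generate() runs (not shown here).

A full replication of the study would repeat such measurements during both fine-tuning and inference, for base and large model sizes, and for Big Bird as well as LED, under the same fixed resource budget.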


research 11/15/2022
A Survey for Efficient Open Domain Question Answering
Open domain question answering (ODQA) is a longstanding task aimed at an...

research 11/18/2021
Dynamic-TinyBERT: Boost TinyBERT's Inference Efficiency by Dynamic Sequence Length
Limited computational budgets often prevent transformers from being used...

research 01/10/2022
SCROLLS: Standardized CompaRison Over Long Language Sequences
NLP benchmarks have largely focused on short texts, such as sentences an...

research 04/05/2023
To Asymmetry and Beyond: Structured Pruning of Sequence to Sequence Models for Improved Inference Efficiency
Sequence-to-sequence language models can be used to produce abstractive ...

research 11/04/2021
CoreLM: Coreference-aware Language Model Fine-Tuning
Language Models are the underpin of all modern Natural Language Processi...

research 10/11/2022
Model Cascading: Towards Jointly Improving Efficiency and Accuracy of NLP Systems
Do all instances need inference through the big models for a correct pre...

research 05/01/2020
When Ensembling Smaller Models is More Efficient than Single Large Models
Ensembling is a simple and popular technique for boosting evaluation per...
