Efficiency Pentathlon: A Standardized Arena for Efficiency Evaluation

07/19/2023
by Hao Peng, et al.

Rising computational demands of modern natural language processing (NLP) systems have raised the barrier to entry for cutting-edge research while posing serious environmental concerns. Yet progress on model efficiency has been impeded by practical challenges in model evaluation and comparison. For example, hardware is difficult to control for, because levels of access differ across institutions. Moreover, improvements in metrics such as FLOPs often fail to translate into progress in real-world applications. In response, we introduce Pentathlon, a benchmark for holistic and realistic evaluation of model efficiency. Pentathlon focuses on inference, which accounts for the majority of the compute in a model's lifecycle. It offers a strictly controlled hardware platform and is designed to mirror real-world application scenarios. It incorporates a suite of metrics that target different aspects of efficiency, including latency, throughput, memory overhead, and energy consumption. Pentathlon also comes with a software library that can be seamlessly integrated into any codebase to enable evaluation. As a standardized and centralized evaluation platform, Pentathlon can drastically reduce the workload required to make fair and reproducible efficiency comparisons. While initially focused on NLP models, Pentathlon is designed to allow flexible extension to other fields. We envision that Pentathlon will stimulate algorithmic innovation in building efficient models and foster increased awareness of the social and environmental implications of developing future generations of NLP models.
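To make the metric suite concrete, here is a minimal sketch of how per-batch latency, throughput, and peak memory might be measured around an arbitrary inference callable. This is a hypothetical illustration using only the Python standard library, not the actual Pentathlon API; the function name `evaluate_efficiency` and its report keys are assumptions, and energy consumption is omitted because it requires hardware-level instrumentation.

```python
import time
import tracemalloc
from statistics import median

def evaluate_efficiency(infer, batches, warmup=2):
    """Measure median latency, throughput, and peak memory for `infer`.

    `infer` is any callable taking one batch; `batches` is a list of
    batches, each a sequence of examples. (Hypothetical helper, not
    the Pentathlon library API.)
    """
    # Warm up to exclude one-time costs such as lazy initialization.
    for batch in batches[:warmup]:
        infer(batch)

    tracemalloc.start()
    latencies = []
    n_items = 0
    start = time.perf_counter()
    for batch in batches:
        t0 = time.perf_counter()
        infer(batch)
        latencies.append(time.perf_counter() - t0)
        n_items += len(batch)
    total = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()

    return {
        "median_latency_s": median(latencies),
        "throughput_items_per_s": n_items / total,
        "peak_memory_bytes": peak,
    }

# Toy "model" that sums token ids, standing in for a real inference call.
report = evaluate_efficiency(lambda b: [sum(x) for x in b],
                             batches=[[[1, 2], [3, 4]]] * 10)
```

A real platform would additionally pin the hardware and report energy draw, which is precisely the controlled environment the abstract argues individual labs cannot easily reproduce on their own.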

