Model Cascading: Towards Jointly Improving Efficiency and Accuracy of NLP Systems

10/11/2022
by   Neeraj Varshney, et al.

Do all instances need inference through the big models for a correct prediction? Perhaps not; some instances are easy and can be answered correctly by even small-capacity models. This provides opportunities for improving the computational efficiency of systems. In this work, we present an exploratory study of 'model cascading', a simple technique that uses a collection of models of varying capacities to output predictions accurately yet efficiently. Through comprehensive experiments in multiple task settings that differ in the number of models available for cascading (the K value), we show that cascading improves both computational efficiency and prediction accuracy. For instance, in the K=3 setting, cascading saves up to 88.93% of the computation cost and consistently achieves superior prediction accuracy, with an improvement of up to 2.18%. We also study the effect of adding more models to the cascade and show that it further increases the efficiency improvements. Finally, we hope that our work will facilitate the development of efficient NLP systems, making their widespread adoption in real-world applications possible.
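The idea of sending an instance to a larger model only when a smaller one is unsure can be illustrated with a minimal sketch. The abstract does not say how the cascade decides when to stop, so everything below is an assumption made for illustration: the name `cascade_predict`, the use of a per-model confidence score compared against a fixed threshold, and the two toy models. It is not the authors' implementation.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical model interface: input text -> (predicted label, confidence in [0, 1]).
Model = Callable[[str], Tuple[str, float]]


def cascade_predict(
    x: str,
    models: Sequence[Model],
    thresholds: Sequence[float],
) -> Tuple[str, int]:
    """Run models from smallest to largest and stop at the first confident one.

    `models` is ordered by increasing capacity (K = len(models)).
    `thresholds` holds one confidence threshold per non-final model.
    Returns the prediction and the index of the model that produced it.
    """
    if not models:
        raise ValueError("at least one model is required")
    if len(thresholds) != len(models) - 1:
        raise ValueError("need exactly one threshold per non-final model")

    last = len(models) - 1
    for i, model in enumerate(models):
        label, confidence = model(x)
        # The final (largest) model always answers; earlier models answer
        # only when their confidence clears the assumed threshold.
        if i == last or confidence >= thresholds[i]:
            return label, i
    raise AssertionError("unreachable: the last model always answers")


# Toy stand-ins for a small and a large model (illustrative only).
def small_model(x: str) -> Tuple[str, float]:
    # Pretend the small model is confident only on very short inputs.
    return ("positive", 0.95) if len(x.split()) <= 3 else ("positive", 0.55)


def large_model(x: str) -> Tuple[str, float]:
    return ("negative", 0.99)


easy = cascade_predict("great movie", [small_model, large_model], [0.9])
hard = cascade_predict(
    "a long and rather ambiguous review", [small_model, large_model], [0.9]
)
```

In this sketch the easy input is answered by the small model alone, so the large model is never called for it; that skipped call is where a cascade would save computation. The threshold value 0.9 is arbitrary and would need tuning on held-out data.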


