EBJR: Energy-Based Joint Reasoning for Adaptive Inference

10/20/2021
by Mohammad Akbari, et al.

State-of-the-art deep learning models have achieved strong performance on various benchmarks. However, this performance comes at a high computational cost. Light-weight architectures, on the other hand, achieve only moderate accuracy, but at a much more desirable latency. This paper presents a new method for jointly using large, accurate models together with small, fast ones. To this end, we propose an Energy-Based Joint Reasoning (EBJR) framework that adaptively distributes samples between shallow and deep models to achieve an accuracy close to that of the deep model, but a latency close to that of the shallow one. Our method is applicable to out-of-the-box pre-trained models, as it requires neither architecture changes nor re-training. Moreover, it is easy to use and deploy, especially for cloud services. Through a comprehensive set of experiments on different downstream tasks, we show that our method outperforms strong state-of-the-art approaches by a considerable margin. In addition, we propose specialized EBJR, an extension of our method in which we create a smaller, specialized side model that performs the target task only partially, but yields even higher accuracy and faster inference. We verify the strengths of our methods with both theoretical and experimental evaluations.
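The abstract does not spell out how samples are distributed between the two models, so the following is only a minimal sketch of one plausible reading. It assumes the routing signal is the free-energy score commonly used with energy-based models, the negative log-sum-exp of the shallow model's logits, and that a sample is passed to the deep model only when that energy exceeds a threshold. The names `energy`, `joint_predict`, `threshold`, and the toy models are all hypothetical and not taken from the paper.

```python
import math
from typing import Callable, Sequence, Tuple

Logits = Sequence[float]
Model = Callable[[Sequence[float]], Logits]


def energy(logits: Logits) -> float:
    """Assumed energy score: negative log-sum-exp of the logits.

    Lower energy means the model assigns high total score to its classes,
    which this sketch treats as "confident enough".
    """
    m = max(logits)  # subtract the max for numerical stability
    return -(m + math.log(sum(math.exp(z - m) for z in logits)))


def argmax(values: Logits) -> int:
    """Index of the largest value (first one on ties)."""
    return max(range(len(values)), key=values.__getitem__)


def joint_predict(
    x: Sequence[float], shallow: Model, deep: Model, threshold: float
) -> Tuple[int, str]:
    """Run the shallow model first; fall back to the deep model if unsure.

    Neither model is modified or re-trained; only their outputs are used.
    Returns the predicted class index and which model produced it.
    """
    shallow_logits = shallow(x)
    if energy(shallow_logits) <= threshold:
        return argmax(shallow_logits), "shallow"
    return argmax(deep(x)), "deep"


# Toy stand-ins for pre-trained models (hypothetical, for illustration only).
def toy_shallow(x: Sequence[float]) -> Logits:
    return [x[0], 0.0, 0.0]


def toy_deep(x: Sequence[float]) -> Logits:
    return [0.0, x[1], 0.0]


if __name__ == "__main__":
    for sample in ([10.0, 3.0], [0.1, 3.0]):
        print(sample, joint_predict(sample, toy_shallow, toy_deep, threshold=-5.0))
```

In this sketch the threshold is the single knob that trades accuracy against latency: a lower threshold sends more samples to the deep model, while a higher one keeps more of them on the shallow model. How the paper actually sets or analyzes this trade-off is not stated in the abstract.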

research
03/01/2022

E-LANG: Energy-Based Joint Inferencing of Super and Swift Language Models

Building huge and highly capable language models has been a trend in the...
research
06/01/2023

Can Large Pre-trained Models Help Vision Models on Perception Tasks?

The recent upsurge in pre-trained large models (e.g. GPT-4) has swept ac...
research
04/29/2020

General Purpose Text Embeddings from Pre-trained Language Models for Scalable Inference

The state of the art on many NLP tasks is currently achieved by large pr...
research
09/18/2022

Improving the Performance of DNN-based Software Services using Automated Layer Caching

Deep Neural Networks (DNNs) have become an essential component in many a...
research
07/31/2020

Diet deep generative audio models with structured lottery

Deep learning models have provided extremely successful solutions in mos...
research
10/16/2020

Wireless Localisation in WiFi using Novel Deep Architectures

This paper studies the indoor localisation of WiFi devices based on a co...
