The Costly Dilemma: Generalization, Evaluation and Cost-Optimal Deployment of Large Language Models

08/15/2023
by Abi Aryan, et al.

When deploying machine learning models in production for any product or application, three properties are commonly desired. First, the models should be generalizable, in that we can extend them to further use cases as our knowledge of the domain develops. Second, they should be evaluable, so that there are clear performance metrics and calculating those metrics in production settings is feasible. Finally, the deployment should be as cost-optimal as possible. In this paper we propose that these three objectives (i.e., generalization, evaluation and cost-optimality) can often be relatively orthogonal and that for large language models, despite their performance gains over conventional NLP models, enterprises need to carefully assess all three factors before making substantial investments in this technology. We propose a framework for generalization, evaluation and cost-modeling specifically tailored to large language models, offering insights into the intricacies of their development, deployment and management.

