Open, Closed, or Small Language Models for Text Classification?

08/19/2023
by Hao Yu, et al.

Recent advancements in large language models have demonstrated remarkable capabilities across various NLP tasks. Yet many questions remain, including whether open-source models match closed ones, why these models excel or struggle with certain tasks, and what practical procedures can improve performance. We address these questions in the context of classification by evaluating three classes of models on eight datasets spanning three distinct tasks: named entity recognition, political party prediction, and misinformation detection. While larger LLMs often lead to improved performance, open-source models can rival their closed-source counterparts through fine-tuning. Moreover, smaller supervised models, such as RoBERTa, can match or even exceed the performance of generative LLMs on many datasets. On the other hand, closed models maintain an advantage on hard tasks that demand the greatest generalizability. This study underscores the importance of selecting models based on task requirements.
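To make the supervised baseline concrete, below is a minimal sketch of fine-tuning RoBERTa for text classification with Hugging Face transformers. The dataset (ag_news), hyperparameters, and output directory are illustrative assumptions, not the paper's exact setup, which uses eight task-specific datasets.

```python
# Minimal sketch: fine-tuning RoBERTa as a supervised text classifier.
# NOTE: dataset and hyperparameters are assumptions for illustration only.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Placeholder dataset; the paper evaluates on NER, political party
# prediction, and misinformation detection datasets instead.
dataset = load_dataset("ag_news")
tokenizer = AutoTokenizer.from_pretrained("roberta-base")

def tokenize(batch):
    # Truncate/pad each example to a fixed length for batching.
    return tokenizer(batch["text"], truncation=True,
                     padding="max_length", max_length=128)

encoded = dataset.map(tokenize, batched=True)

# ag_news has 4 classes; set num_labels to match your dataset.
model = AutoModelForSequenceClassification.from_pretrained(
    "roberta-base", num_labels=4)

args = TrainingArguments(
    output_dir="roberta-clf",          # hypothetical output path
    per_device_train_batch_size=16,
    num_train_epochs=3,
    learning_rate=2e-5,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=encoded["train"],
    eval_dataset=encoded["test"],
)
trainer.train()
```

A setup along these lines is the standard way such smaller supervised models are trained, and it is the kind of task-specific fine-tuning against which the abstract compares generative LLMs.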
