Spam-T5: Benchmarking Large Language Models for Few-Shot Email Spam Detection

04/03/2023
by   Maxime Labonne, et al.
0

This paper investigates the effectiveness of large language models (LLMs) in email spam detection by comparing prominent models from three distinct families: BERT-like, Sentence Transformers, and Seq2Seq. Additionally, we examine well-established machine learning techniques for spam detection, such as Naïve Bayes and LightGBM, as baseline methods. We assess the performance of these models across four public datasets, utilizing different numbers of training samples (full training set and few-shot settings). Our findings reveal that, in the majority of cases, LLMs surpass the performance of the popular baseline techniques, particularly in few-shot scenarios. This adaptability renders LLMs uniquely suited to spam detection tasks, where labeled samples are limited in number and models require frequent updates. Additionally, we introduce Spam-T5, a Flan-T5 model that has been specifically adapted and fine-tuned for the purpose of detecting email spam. Our results demonstrate that Spam-T5 surpasses baseline models and other LLMs in the majority of scenarios, particularly when there are a limited number of training samples available. Our code is publicly available at https://github.com/jpmorganchase/emailspamdetection.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
09/13/2021

Effectiveness of Pre-training for Few-shot Intent Classification

This paper investigates the effectiveness of pre-training for few-shot i...
research
04/18/2021

Fantastically Ordered Prompts and Where to Find Them: Overcoming Few-Shot Prompt Order Sensitivity

When primed with only a handful of training samples, very large pretrain...
research
11/06/2022

Tuning Language Models as Training Data Generators for Augmentation-Enhanced Few-Shot Learning

Recent studies have revealed the intriguing few-shot learning ability of...
research
02/24/2023

Language Models are Few-shot Learners for Prognostic Prediction

Clinical prediction is an essential task in the healthcare industry. How...
research
07/19/2023

On the Origin of LLMs: An Evolutionary Tree and Graph for 15,821 Large Language Models

Since late 2022, Large Language Models (LLMs) have become very prominent...
research
03/15/2023

Large Language Model Is Not a Good Few-shot Information Extractor, but a Good Reranker for Hard Samples!

Large Language Models (LLMs) have made remarkable strides in various tas...
research
06/07/2023

Good Data, Large Data, or No Data? Comparing Three Approaches in Developing Research Aspect Classifiers for Biomedical Papers

The rapid growth of scientific publications, particularly during the COV...

Please sign up or login with your details

Forgot password? Click here to reset