Devising and Detecting Phishing: Large Language Models vs. Smaller Human Models

08/23/2023
by Fredrik Heiding, et al.

AI programs built on large language models make it possible to automatically create phishing emails from a few data points about a user. They stand in contrast to traditional phishing emails, which hackers design manually using general rules gleaned from experience. The V-Triad is an advanced set of rules for manually designing phishing emails that exploit our cognitive heuristics and biases. In this study, we compare the performance of phishing emails created automatically by GPT-4 with emails created manually using the V-Triad. We also combine GPT-4 with the V-Triad to assess their combined potential. A fourth group, exposed to generic phishing emails, was our control group. We used a factorial approach, sending emails to 112 randomly selected participants recruited for the study. The control group emails received a click-through rate between 19-28%, and the V-Triad emails between 69-79%. Each participant was asked to explain why they pressed or did not press a link in the email. These answers often contradict each other, highlighting the need for personalized content: the cues that make one person avoid phishing emails make another person fall for them. Next, we used four popular large language models (GPT, Claude, PaLM, and LLaMA) to detect the intention of phishing emails and compared the results to human detection. The language models demonstrated a strong ability to detect malicious intent, even in non-obvious phishing emails. They sometimes surpassed human detection, although they were often slightly less accurate than humans.
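The factorial design and click-through measurement described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' procedure: the arm labels, the equal-sized round-robin assignment, and the function names `assign_arms` and `click_through_rates` are all assumptions introduced here.

```python
import random
from collections import Counter

# Hypothetical arm labels for a 2x2 design:
# GPT-4 used or not, V-Triad used or not.
ARMS = ["control", "gpt4", "v_triad", "gpt4_plus_v_triad"]


def assign_arms(participant_ids, seed=0):
    """Shuffle participants and deal them round-robin into the four arms.

    Equal arm sizes are an assumption of this sketch.
    """
    rng = random.Random(seed)
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    return {pid: ARMS[i % len(ARMS)] for i, pid in enumerate(shuffled)}


def click_through_rates(assignment, clicked_ids):
    """Return the fraction of participants in each arm who clicked the link."""
    sent = Counter(assignment.values())
    clicks = Counter(assignment[pid] for pid in clicked_ids if pid in assignment)
    return {arm: clicks[arm] / sent[arm] for arm in ARMS if sent[arm]}
```

Calling `assign_arms(range(112))` places 28 participants in each arm under the equal-size assumption. Passing the resulting mapping and the IDs of participants who clicked to `click_through_rates` gives one rate per arm, which is the quantity the abstract reports as a range.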


