Targeted Honeyword Generation with Language Models

08/15/2022
by Fangyi Yu, et al.

Honeywords are fictitious passwords inserted into databases in order to identify password breaches. The major difficulty is producing honeywords that are hard to distinguish from real passwords. Although honeyword generation has been widely investigated, the majority of existing research assumes that attackers have no knowledge of the users. These honeyword generation techniques (HGTs) may fail completely if the real passwords include users' personally identifiable information (PII) and attackers exploit that PII. In this paper, we propose to build a more secure and trustworthy authentication system that employs off-the-shelf pre-trained language models, which require no further training on real passwords, to produce honeywords that retain the PII of the associated real password, thereby significantly raising the bar for attackers. We conducted a pilot experiment in which participants, given the username, were asked to distinguish real passwords from honeywords generated by two techniques: GPT-3 and a tweaking technique. Results show that it is extremely difficult to distinguish the real passwords from the artificial ones for both techniques. We speculate that a larger sample size could reveal a significant difference between the two HGTs, favouring our proposed approach.
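
To make the honeyword idea concrete, the following is a minimal sketch of a generic tweaking-style generator together with the login check that raises an alarm when a decoy is submitted. It is not the authors' implementation and does not show the GPT-3 approach. The function names (`tweak`, `make_sweetwords`, `check_login`), the symbol set, and the rule of keeping letters fixed while randomizing digits and symbols are all assumptions chosen for illustration.

```python
import random
import string

# Assumed symbol set for illustration; a real system would choose its own.
SYMBOLS = "!@#$%^&*"


def tweak(password: str, rng: random.Random) -> str:
    """Return a variant with each digit and symbol replaced at random.

    Letters are kept, so any name embedded in the password survives.
    """
    out = []
    for ch in password:
        if ch in string.digits:
            out.append(rng.choice(string.digits))
        elif ch in SYMBOLS:
            out.append(rng.choice(SYMBOLS))
        else:
            out.append(ch)
    return "".join(out)


def make_sweetwords(real: str, k: int, rng: random.Random):
    """Build k sweetwords (k - 1 honeywords plus the real password).

    Returns the shuffled list and the index of the real password.
    """
    decoys = set()
    for _ in range(1000):  # bounded retries so the loop always terminates
        if len(decoys) == k - 1:
            break
        candidate = tweak(real, rng)
        if candidate != real:
            decoys.add(candidate)
    if len(decoys) < k - 1:
        raise ValueError("password has too few tweakable characters")
    sweetwords = sorted(decoys) + [real]
    rng.shuffle(sweetwords)
    return sweetwords, sweetwords.index(real)


def check_login(submitted: str, sweetwords, real_index: int) -> str:
    """Classify a login attempt as 'accept', 'alarm', or 'reject'."""
    if submitted not in sweetwords:
        return "reject"  # ordinary wrong password
    if sweetwords.index(submitted) == real_index:
        return "accept"  # the real password
    return "alarm"  # a honeyword was used: the database has likely leaked


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed only to make the demo repeatable
    words, idx = make_sweetwords("alice1990!", 5, rng)
    print(words, idx)
    print(check_login(words[(idx + 1) % 5], words, idx))
```

In this sketch the letters, and therefore a name such as "alice", are preserved in every decoy, which is one simple way to keep PII in the honeywords. A deployed system would store only hashes of the sweetwords and would typically keep the real index on a separate service, and it would use a cryptographically secure random source rather than `random.Random`.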


