On the Possibilities of AI-Generated Text Detection

04/10/2023
by Souradip Chakraborty, et al.

Our work focuses on the challenge of distinguishing outputs generated by Large Language Models (LLMs) from those written by humans. The ability to tell the two apart is important in numerous applications. However, whether such discernment is possible at all has been a subject of debate within the community. A central question is therefore whether we can detect AI-generated text and, if so, when. In this work, we provide evidence that it should almost always be possible to detect AI-generated text unless the distributions of human- and machine-generated text are exactly the same over the entire support. This observation follows from standard results in information theory and relies on the fact that as machine-generated text becomes more human-like, more samples are needed to detect it. We derive a precise sample complexity bound for AI-generated text detection, which states how many samples are needed for detection. This raises the additional challenge of designing more sophisticated detectors that take n samples as input rather than a single one, which is left to future research on this topic. Our empirical evaluations support our claim that better detectors exist, demonstrating that AI-generated text detection should be achievable in the majority of scenarios. Our results emphasize the importance of continued research in this area.
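The claim that more samples make detection easier can be illustrated with a toy calculation. The sketch below is not the paper's experiment or its bound: it assumes, purely for illustration, that each "sample" is a single coin flip drawn from one of two nearly identical Bernoulli sources, and it computes the exact total variation (TV) distance between the two sources when n independent samples are observed. The function names and the probabilities used are hypothetical. With equal priors, the best possible detector has accuracy 1/2 + TV/2, so a growing TV means a better achievable detector.

```python
from math import exp, lgamma, log


def binomial_pmf(n, p):
    """Binomial(n, p) probabilities for k = 0..n, computed in log space.

    Assumes 0 < p < 1. Log space avoids overflow/underflow for large n.
    """
    log_p, log_q = log(p), log(1.0 - p)
    return [
        exp(lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
            + k * log_p + (n - k) * log_q)
        for k in range(n + 1)
    ]


def tv_n_samples(p_human, p_machine, n):
    """Exact TV distance between n i.i.d. draws from two Bernoulli sources.

    Toy assumption: each sample is one Bernoulli draw. The number of ones
    is a sufficient statistic, so the TV distance between the two n-fold
    product distributions equals the TV distance between the two Binomials.
    """
    human = binomial_pmf(n, p_human)
    machine = binomial_pmf(n, p_machine)
    return 0.5 * sum(abs(h - m) for h, m in zip(human, machine))


def best_detector_accuracy(tv):
    """Accuracy of the optimal detector under equal priors: 1/2 + TV/2."""
    return 0.5 + 0.5 * tv


if __name__ == "__main__":
    # Hypothetical sources that differ only slightly (0.50 vs 0.55).
    for n in (1, 10, 100, 1000):
        tv = tv_n_samples(0.50, 0.55, n)
        print(f"n={n:5d}  TV={tv:.3f}  best accuracy={best_detector_accuracy(tv):.3f}")
```

In this toy setting the single-sample TV distance is only 0.05, so a one-sample detector is barely better than chance, while the TV distance over n samples grows toward 1 as n increases. When the two sources are identical, the TV distance stays at 0 for every n, which mirrors the one case the abstract excludes.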


research
07/22/2023

The Imitation Game: Detecting Human and AI-Generated Texts in the Era of Large Language Models

The potential of artificial intelligence (AI)-based large language model...
research
03/23/2023

Paraphrasing evades detectors of AI-generated text, but retrieval is an effective defense

To detect the deployment of large language models for malicious use case...
research
07/04/2023

Generative Artificial Intelligence Consensus in a Trustless Network

We performed a billion locality sensitive hash comparisons between artif...
research
07/05/2023

Evade ChatGPT Detectors via A Single Space

ChatGPT brings revolutionary social value but also raises concerns about...
research
05/24/2023

Ghostbuster: Detecting Text Ghostwritten by Large Language Models

We introduce Ghostbuster, a state-of-the-art system for detecting AI-gen...
research
05/22/2023

G3Detector: General GPT-Generated Text Detector

The burgeoning progress in the field of Large Language Models (LLMs) her...
research
04/16/2023

ArguGPT: evaluating, understanding and identifying argumentative essays generated by GPT models

AI generated content (AIGC) presents considerable challenge to educators...
