Deepfake Text Detection in the Wild

05/22/2023
by   Yafu Li, et al.
0

Recent advances in large language models have enabled them to reach a level of text generation comparable to that of humans. These models show powerful capabilities across a wide range of content, including news article writing, story generation, and scientific writing. Such capability further narrows the gap between human-authored and machine-generated texts, highlighting the importance of deepfake text detection to avoid potential risks such as fake news propagation and plagiarism. However, previous work has been limited in that they testify methods on testbed of specific domains or certain language models. In practical scenarios, the detector faces texts from various domains or LLMs without knowing their sources. To this end, we build a wild testbed by gathering texts from various human writings and deepfake texts generated by different LLMs. Human annotators are only slightly better than random guessing at identifying machine-generated texts. Empirical results on automatic detection methods further showcase the challenges of deepfake text detection in a wild testbed. In addition, out-of-distribution poses a greater challenge for a detector to be employed in realistic application scenarios. We release our resources at https://github.com/yafuly/DeepfakeTextDetect.

READ FULL TEXT
research
05/29/2023

Multiscale Positive-Unlabeled Detection of AI-Generated Texts

Recent releases of Large Language Models (LLMs), e.g. ChatGPT, are aston...
research
07/07/2023

RADAR: Robust AI-Text Detection via Adversarial Learning

Recent advances in large language models (LLMs) and the intensifying pop...
research
01/18/2023

How Close is ChatGPT to Human Experts? Comparison Corpus, Evaluation, and Detection

The introduction of ChatGPT has garnered widespread attention in both ac...
research
10/22/2019

Automatic Extraction of Personality from Text: Challenges and Opportunities

In this study, we examined the possibility to extract personality traits...
research
07/21/2023

Is ChatGPT Involved in Texts? Measure the Polish Ratio to Detect ChatGPT-Generated Text

The remarkable capabilities of large-scale language models, such as Chat...
research
07/25/2023

FacTool: Factuality Detection in Generative AI – A Tool Augmented Framework for Multi-Task and Multi-Domain Scenarios

The emergence of generative pre-trained models has facilitated the synth...
research
08/11/2022

A Comprehensive Survey of Natural Language Generation Advances from the Perspective of Digital Deception

In recent years there has been substantial growth in the capabilities of...

Please sign up or login with your details

Forgot password? Click here to reset