Deepfake Text Detection: Limitations and Opportunities

10/17/2022
by Jiameng Pu, et al.

Recent advances in generative models for language have enabled the creation of convincing synthetic text, or deepfake text. Prior work has demonstrated the potential for misuse of deepfake text to mislead content consumers. Therefore, deepfake text detection, the task of discriminating between human and machine-generated text, is becoming increasingly critical. Several defenses have been proposed for deepfake text detection. However, we lack a thorough understanding of their real-world applicability. In this paper, we collect deepfake text from four online services powered by Transformer-based tools to evaluate the generalization ability of the defenses on content in the wild. We develop several low-cost adversarial attacks and investigate the robustness of existing defenses against an adaptive attacker. We find that many defenses show significant degradation in performance under our evaluation scenarios compared to their originally claimed performance. Our evaluation shows that tapping into the semantic information in the text content is a promising approach for improving the robustness and generalization performance of deepfake text detection schemes.
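To make the task framing concrete, below is a minimal sketch of a deepfake-text detector (a binary human vs. machine classifier) together with a low-cost character-level perturbation an adaptive attacker might try. The toy corpus, the homoglyph map, and the function names are illustrative assumptions; this is not the defenses or attacks evaluated in the paper.

```python
# Illustrative baseline only: a surface-feature detector plus a cheap evasion.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy corpus: label 1 = machine-generated, 0 = human-written (placeholder text).
texts = [
    "The quarterly report indicates a steady rise in regional demand.",
    "Honestly, the movie dragged on, but the ending caught me off guard.",
    "The system leverages synergistic paradigms to optimize stakeholder value.",
    "The model generates fluent text by sampling tokens from a learned distribution.",
]
labels = [0, 0, 1, 1]

# Simple stylometric baseline: character n-gram TF-IDF + logistic regression.
detector = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
    LogisticRegression(max_iter=1000),
)
detector.fit(texts, labels)

# A low-cost evasion idea: replace a few Latin characters with visually
# similar Unicode homoglyphs so surface-level features no longer match.
HOMOGLYPHS = {"a": "\u0430", "e": "\u0435", "o": "\u043e"}  # Cyrillic look-alikes

def perturb(text: str, budget: int = 5) -> str:
    """Swap up to `budget` characters for homoglyphs (illustrative attack)."""
    out, used = [], 0
    for ch in text:
        if used < budget and ch in HOMOGLYPHS:
            out.append(HOMOGLYPHS[ch])
            used += 1
        else:
            out.append(ch)
    return "".join(out)

sample = "The model generates fluent text by sampling tokens from a learned distribution."
print("original score :", detector.predict_proba([sample])[0][1])
print("perturbed score:", detector.predict_proba([perturb(sample)])[0][1])
```

The point of the sketch is the failure mode the abstract highlights: detectors keyed to surface or statistical features can be degraded by trivial input perturbations, whereas features tied to the semantics of the content are harder to evade without changing the message itself.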


