Towards an Understanding and Explanation for Mixed-Initiative Artificial Scientific Text Detection

04/11/2023
by Luoxuan Weng, et al.

Large language models (LLMs) have gained popularity in various fields for their exceptional ability to generate human-like text. Their potential misuse has raised social concerns about plagiarism in academic contexts. However, effective artificial scientific text detection is a non-trivial task due to several challenges, including 1) the lack of a clear understanding of the differences between machine-generated and human-written scientific text, 2) the poor generalization performance of existing methods caused by out-of-distribution issues, and 3) the limited support for human-machine collaboration with sufficient interpretability during the detection process. In this paper, we first identify the critical distinctions between machine-generated and human-written scientific text through a quantitative experiment. Then, we propose a mixed-initiative workflow that combines human experts' prior knowledge with machine intelligence, along with a visual analytics prototype to facilitate efficient and trustworthy scientific text detection. Finally, we demonstrate the effectiveness of our approach through two case studies and a controlled user study with proficient researchers. We also provide design implications for interactive artificial text detection tools in high-stakes decision-making scenarios.

research
04/12/2023

Detection of Fake Generated Scientific Abstracts

The widespread adoption of Large Language Models and publicly available ...
research
06/21/2023

Testing of Detection Tools for AI-Generated Text

Recent advances in generative pre-trained transformer large language mod...
research
12/24/2022

Real or Fake Text?: Investigating Human Ability to Detect Boundaries Between Human-Written and Machine-Generated Text

As text generated by large language models proliferates, it becomes vita...
research
11/02/2019

Human and Automatic Detection of Generated Text

With the advent of generative models with a billion parameters or more, ...
research
06/07/2023

On the Reliability of Watermarks for Large Language Models

As LLMs become commonplace, machine-generated text has the potential to ...
research
03/10/2023

ChatGPT as the Transportation Equity Information Source for Scientific Writing

Transportation equity is an interdisciplinary agenda that requires both ...
research
04/24/2023

CHEAT: A Large-scale Dataset for Detecting ChatGPT-writtEn AbsTracts

The powerful ability of ChatGPT has caused widespread concern in the aca...
