Towards Automatic Boundary Detection for Human-AI Hybrid Essay in Education

07/23/2023
by Zijie Zeng, et al.

Human-AI collaborative writing has been greatly facilitated by modern large language models (LLMs), e.g., ChatGPT. While acknowledging the convenience brought by this technological advancement, educators are also concerned that students might use LLMs to partially complete their writing assignments and pass off the human-AI hybrid text as their original work. Driven by such concerns, in this study we investigated the automatic detection of human-AI hybrid text in education, where we formalized hybrid text detection as a boundary detection problem, i.e., identifying the transition points between human-written content and AI-generated content. We constructed a hybrid essay dataset by partially removing sentences from the original student-written essays and then instructing ChatGPT to fill in the incomplete essays. We then proposed a two-step detection approach where we (1) separated AI-generated content from human-written content during the embedding learning process; and (2) calculated the distances between every two adjacent prototypes (a prototype is the mean of a set of consecutive sentences from the hybrid text in the embedding space) and assumed that the boundaries exist between the two prototypes that are furthest from each other. Through extensive experiments, we summarized the following main findings: (1) the proposed approach consistently outperformed the baseline methods across different experiment settings; (2) the embedding learning process (i.e., step 1) can significantly boost the performance of the proposed approach; (3) when detecting boundaries for single-boundary hybrid essays, the performance of the proposed approach could be enhanced by adopting a relatively large prototype size, leading to a 22% improvement (against the second-best baseline method) in the in-domain setting and an 18% improvement in the out-of-domain setting.
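Step 2 of the approach described above can be illustrated with a minimal sketch. It assumes sentence embeddings are already available as rows of an array (step 1, the embedding learning, is not shown), that adjacent prototypes are formed from the `prototype_size` sentences on either side of each candidate gap, and that distance is Euclidean. The abstract does not specify these details, and the function names `boundary_scores` and `detect_boundaries` are hypothetical.

```python
import numpy as np


def boundary_scores(embeddings, prototype_size=2):
    """Score each candidate gap by the distance between its two adjacent prototypes.

    A gap index g means "between sentence g-1 and sentence g" (0-based).
    Assumption: a prototype is the mean of `prototype_size` consecutive
    sentence embeddings, and the distance is Euclidean.
    """
    emb = np.asarray(embeddings, dtype=float)
    n = emb.shape[0]
    if prototype_size < 1 or n < 2 * prototype_size:
        raise ValueError("need at least 2 * prototype_size sentences")

    scores = {}
    for gap in range(prototype_size, n - prototype_size + 1):
        left = emb[gap - prototype_size:gap].mean(axis=0)   # prototype before the gap
        right = emb[gap:gap + prototype_size].mean(axis=0)  # prototype after the gap
        scores[gap] = float(np.linalg.norm(left - right))
    return scores


def detect_boundaries(embeddings, num_boundaries=1, prototype_size=2):
    """Return the gaps whose adjacent prototypes are furthest apart."""
    scores = boundary_scores(embeddings, prototype_size)
    ranked = sorted(scores, key=scores.get, reverse=True)
    return sorted(ranked[:num_boundaries])


if __name__ == "__main__":
    # Toy 2-D "embeddings": three sentences near the origin, three near (1, 1).
    toy = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
           [1.0, 1.0], [1.1, 1.0], [1.0, 1.1]]
    print(detect_boundaries(toy, num_boundaries=1, prototype_size=2))
```

In this sketch a larger `prototype_size` averages over more sentences, which smooths out sentence-level noise but leaves fewer candidate gaps near the start and end of the essay.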
