ChatGPT or academic scientist? Distinguishing authorship with over 99% accuracy using off-the-shelf machine learning tools

by Heather Desaire, et al.

ChatGPT has enabled access to AI-generated writing for the masses, and within just a few months, this product has disrupted the knowledge economy, initiating a culture shift in the way people work, learn, and write. The need to discriminate human writing from AI is now both critical and urgent, particularly in domains like higher education and academic writing, where AI had not been a significant threat or contributor to authorship. Addressing this need, we developed a method for discriminating text generated by ChatGPT from text written by (human) academic scientists, relying on prevalent and accessible supervised classification methods. We focused on how a particular group of humans, academic scientists, write differently than ChatGPT, and this targeted approach led to the discovery of new features for discriminating (these) humans from AI; as examples, scientists write long paragraphs and have a penchant for equivocal language, frequently using words like "but," "however," and "although." With a set of 20 features, including the aforementioned ones and others, we built a model that assigned the author, as human or AI, at well over 99% accuracy, resulting in 20 times fewer misclassified documents compared to the field-leading approach. This strategy for discriminating a particular set of humans' writing from AI could be further adapted and developed by others with basic skills in supervised classification, enabling access to many highly accurate and targeted models for detecting AI usage in academic writing and beyond.
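As a rough illustration of the kind of features the abstract mentions (paragraph length and equivocal words), the following is a minimal Python sketch. It is not the authors' implementation: the function name `paragraph_features`, the tokenization rule, and the restriction to the three words named above are assumptions made for this example, and the full set of 20 features is not reproduced here.

```python
import re

# Assumption: only the three equivocal words named in the abstract are counted.
EQUIVOCAL_WORDS = ("but", "however", "although")


def paragraph_features(paragraph: str) -> dict:
    """Return two illustrative features for a single paragraph of text."""
    # Assumption: a "word" is a run of letters or apostrophes, compared in lowercase.
    words = re.findall(r"[A-Za-z']+", paragraph.lower())
    return {
        # Paragraph length in words (the abstract notes scientists write long paragraphs).
        "word_count": len(words),
        # Total occurrences of the equivocal words listed above.
        "equivocal_count": sum(words.count(w) for w in EQUIVOCAL_WORDS),
    }
```

Feature dictionaries like these, computed per paragraph for labeled human and ChatGPT text, could then be passed to any standard supervised classifier; which classifier the authors used is not stated in this abstract.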




