Improving Detection of ChatGPT-Generated Fake Science Using Real Publication Text: Introducing xFakeBibs a Supervised-Learning Network Algorithm

08/15/2023
by Ahmed Abdeen Hamed, et al.

ChatGPT is becoming a new reality. In this paper, we show how to distinguish ChatGPT-generated publications from those produced by scientists, using a newly designed supervised machine learning algorithm. The algorithm was trained on 100 real publication abstracts, followed by a 10-fold calibration step that establishes a lower-upper bound range of acceptance. In the comparison with ChatGPT content, ChatGPT contributed merely 23% of the bigram content, which is less than 50% of any of the 10 calibrating folds. This analysis highlights a significant disparity in technical terms, where ChatGPT fell short of matching real science. When categorizing individual articles, the xFakeBibs algorithm correctly identified 98 out of 100 ChatGPT-generated publications as fake, with 2 articles incorrectly classified as real publications. Although this work introduces an algorithmic approach that detects ChatGPT-generated fake science with a high degree of accuracy, detecting all fake records remains challenging. This work is a step in the right direction to counter fake science and misinformation.
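The train-then-calibrate idea in the abstract can be illustrated with a minimal Python sketch. It is not the xFakeBibs implementation: the abstract does not say how bigram contribution is computed or how the network is built, so the sketch assumes a plain set of word bigrams and a simple overlap ratio. All function names (`bigrams`, `build_model`, `overlap_ratio`, `calibrate`, `looks_real`) are hypothetical.

```python
import re
from typing import Iterable, List, Set, Tuple

Bigram = Tuple[str, str]


def bigrams(text: str) -> Set[Bigram]:
    """Return the set of adjacent lowercase word pairs in a text."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return set(zip(tokens, tokens[1:]))


def build_model(abstracts: Iterable[str]) -> Set[Bigram]:
    """Assumed training step: pool the bigrams of all real abstracts."""
    model: Set[Bigram] = set()
    for abstract in abstracts:
        model |= bigrams(abstract)
    return model


def overlap_ratio(model: Set[Bigram], texts: Iterable[str]) -> float:
    """Fraction of the texts' distinct bigrams that also occur in the model.

    This ratio is an assumed stand-in for the paper's bigram contribution.
    """
    seen: Set[Bigram] = set()
    for text in texts:
        seen |= bigrams(text)
    if not seen:
        return 0.0
    return len(seen & model) / len(seen)


def calibrate(model: Set[Bigram], folds: List[List[str]]) -> Tuple[float, float]:
    """Assumed calibration step: lower and upper bound over held-out real folds."""
    ratios = [overlap_ratio(model, fold) for fold in folds]
    return min(ratios), max(ratios)


def looks_real(model: Set[Bigram], bounds: Tuple[float, float], text: str) -> bool:
    """Accept a text as real only if its ratio falls inside the calibrated range."""
    lower, upper = bounds
    return lower <= overlap_ratio(model, [text]) <= upper
```

In this sketch, a text whose bigrams rarely match the real-publication model falls below the calibrated lower bound and is flagged as fake, which mirrors the acceptance-range idea described above under the stated assumptions.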


