Evaluating AIGC Detectors on Code Content

04/11/2023
by Jian Wang, et al.

Artificial Intelligence Generated Content (AIGC) has garnered considerable attention for its impressive performance, with ChatGPT emerging as a leading AIGC model that produces high-quality responses across various applications, including software development and maintenance. Despite its potential, the misuse of ChatGPT poses significant concerns, especially in education and safety-critical domains. Numerous AIGC detectors have been developed and evaluated on natural language data. However, their performance on code-related content generated by ChatGPT remains unexplored. To fill this gap, we present the first empirical study evaluating existing AIGC detectors in the software domain. We created a comprehensive dataset of 492.5K samples of code-related content produced by ChatGPT, encompassing popular software activities like Q&A (115K), code summarization (126K), and code generation (226.5K). We evaluated six AIGC detectors on this dataset, three commercial and three open-source. Additionally, we conducted a human study to understand human detection capabilities and compare them with the existing AIGC detectors. Our results indicate that AIGC detectors perform worse on code-related data than on natural language data. Fine-tuning can enhance detector performance, especially for content within the same domain, but generalization remains a challenge. The human evaluation reveals that humans also find this content difficult to detect.

research
08/25/2023

SoTaNa: The Open-Source Software Development Assistant

Software development plays a crucial role in driving innovation and effi...
research
06/07/2023

Check Me If You Can: Detecting ChatGPT-Generated Academic Writing using CheckGPT

With ChatGPT under the spotlight, utilizing large language models (LLMs)...
research
06/09/2023

Towards a Robust Detection of Language Model Generated Text: Is ChatGPT that Easy to Detect?

Recent advances in natural language processing (NLP) have led to the dev...
research
05/09/2023

The Vault: A Comprehensive Multilingual Dataset for Advancing Code Understanding and Generation

We present The Vault, an open-source, large-scale code-text dataset desi...
research
09/15/2023

Exploring the Potential of ChatGPT in Automated Code Refinement: An Empirical Study

Code review is an essential activity for ensuring the quality and mainta...
research
08/18/2022

An Empirical Evaluation of Competitive Programming AI: A Case Study of AlphaCode

AlphaCode is a code generation system for assisting software developers ...
research
07/05/2023

Evade ChatGPT Detectors via A Single Space

ChatGPT brings revolutionary social value but also raises concerns about...
