HC3 Plus: A Semantic-Invariant Human ChatGPT Comparison Corpus

09/06/2023
by Zhenpeng Su, et al.

ChatGPT has attracted significant interest for its impressive performance, but concern is growing about its potential risks, particularly the difficulty of detecting AI-generated content (AIGC), which untrained humans often cannot identify. Current datasets for detecting ChatGPT-generated text center primarily on question answering and tend to disregard tasks with semantic-invariant properties, such as summarization, translation, and paraphrasing. Our primary studies demonstrate that detecting model-generated text on semantic-invariant tasks is more difficult. To fill this gap, we introduce a more extensive and comprehensive dataset that covers more types of tasks than previous work, including semantic-invariant tasks. In addition, models that have undergone instruction fine-tuning on a large number of tasks show strong performance. Building on that success, we further instruction fine-tune Tk-instruct and build a more powerful detection system. Experimental results show that our proposed detector outperforms the previous state-of-the-art RoBERTa-based detector.


