Paraphrase Identification with Deep Learning: A Review of Datasets and Methods

12/13/2022
by Chao Zhou, et al.

The rapid advancement of AI technology has made text generation tools like GPT-3 and ChatGPT increasingly accessible, scalable, and effective. If these tools are used for plagiarism, they pose a serious threat to the credibility of various forms of media, including scientific literature and news sources. Despite the development of automated methods for paraphrase identification, detecting this type of plagiarism remains a challenge due to the disparate nature of the datasets on which these methods are trained. In this study, we review traditional and current approaches to paraphrase identification and propose a refined typology of paraphrases. We also investigate how this typology is represented in popular datasets and how under-representation of certain paraphrase types impacts detection capabilities. Finally, we outline new directions for future research and datasets in the pursuit of more effective paraphrase detection using AI.
