A Novel Plagiarism Detection Approach Combining BERT-based Word Embedding, Attention-based LSTMs and an Improved Differential Evolution Algorithm

05/03/2023
by   Seyed Vahid Moravvej, et al.

Detecting plagiarism involves finding similar items in two different sources. In this article, we propose a novel plagiarism detection method based on an attention-based long short-term memory (LSTM) network and bidirectional encoder representations from transformers (BERT) word embeddings, enhanced with an optimized differential evolution (DE) method for pre-training and a focal loss function for training. BERT can be included in a downstream task and fine-tuned as a task-specific structure, while the trained BERT model is capable of detecting various linguistic characteristics. Class imbalance is one of the primary issues in plagiarism detection. To address it, we suggest a focal loss-based training technique that pays particular attention to minority-class instances. Another issue that we tackle is the training phase itself, which typically relies on gradient-based methods such as back-propagation (BP) and thus suffers from some drawbacks, including sensitivity to initialization. To initialize the BP process, we suggest a novel DE algorithm that makes use of a clustering-based mutation operator. Here, a winning cluster is identified for the current DE population, and a new updating strategy is used to produce candidate solutions. We evaluate our proposed approach on three benchmark datasets (MSRP, SNLI, and SemEval2014) and demonstrate that it performs well when compared to both conventional and population-based methods.
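To make the focal loss idea concrete, here is a minimal sketch of the widely used binary focal loss, FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t). The abstract does not give the exact formulation or hyperparameters used in the paper, so the function name `focal_loss` and the defaults `gamma=2.0` and `alpha=0.25` are assumptions chosen for illustration only.

```python
import math


def focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-12):
    """Binary focal loss for a single example (illustrative sketch).

    p     : predicted probability of the positive class, in [0, 1]
    y     : true label, 1 (positive) or 0 (negative)
    gamma : focusing parameter (assumed default, not from the paper)
    alpha : weight of the positive class (assumed default, not from the paper)
    """
    # Clip the probability so that log() never receives 0.
    p = min(max(p, eps), 1.0 - eps)
    # p_t is the probability the model assigns to the true class.
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # The factor (1 - p_t)^gamma shrinks the loss of well-classified examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# A confidently correct prediction contributes far less than a wrong one.
easy = focal_loss(0.9, 1)
hard = focal_loss(0.1, 1)
print(easy < hard)
```

With `gamma=0` the modulating factor disappears and the expression reduces to class-weighted cross-entropy, which is why larger `gamma` values shift the training signal toward hard, often minority-class, examples.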

research
01/07/2023

RLAS-BIABC: A Reinforcement Learning-Based Answer Selection Using the BERT Model Boosted by an Improved ABC Algorithm

Answer selection (AS) is a critical subtask of the open-domain question ...
research
09/20/2021

An Enhanced Differential Evolution Algorithm Using a Novel Clustering-based Mutation Operator

Differential evolution (DE) is an effective population-based metaheurist...
research
04/30/2020

Enriched Pre-trained Transformers for Joint Slot Filling and Intent Detection

Detecting the user's intent and finding the corresponding slots among th...
research
05/31/2018

Attention-Based LSTM for Psychological Stress Detection from Spoken Language Using Distant Supervision

We propose a Long Short-Term Memory (LSTM) with attention mechanism to c...
research
09/14/2023

Revisiting Supertagging for HPSG

We present new supertaggers trained on HPSG-based treebanks. These treeb...