Opinion Spam Detection: A New Approach Using Machine Learning and Network-Based Algorithms

05/26/2022
by Kiril Danilchenko, et al.

E-commerce is the fastest-growing segment of the economy. Online reviews play a crucial role in helping consumers evaluate and compare products and services. As a result, fake reviews (opinion spam) are becoming more prevalent and are harming both customers and service providers. Identifying opinion spammers automatically is hard for many reasons, including the absence of reliable labeled data. This limitation precludes an off-the-shelf application of a machine learning pipeline. We propose a new method for classifying reviewers as spammers or benign, combining machine learning with a message-passing algorithm that capitalizes on the users' graph structure to compensate for the possible scarcity of labeled data. We devise a new way of sampling the labels for the training step (active learning), replacing the typical uniform sampling. Experiments on three large real-world datasets from Yelp.com show that our method outperforms state-of-the-art active learning approaches as well as machine learning methods that use a much larger set of labeled data for training.
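To make the general idea concrete, here is a minimal sketch of propagating a few known labels over a user graph. It is not the authors' algorithm: the abstract does not specify the message-passing rule or the sampling scheme, so the neighbor-averaging update, the degree-based seed selection, and every name below (`build_graph`, `pick_seeds`, `propagate`, the toy edges) are assumptions made for illustration only.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected adjacency map from (user, user) pairs.

    What an edge means is an assumption here (for example, two users
    who reviewed the same business); the abstract does not define it.
    """
    graph = defaultdict(set)
    for u, v in edges:
        graph[u].add(v)
        graph[v].add(u)
    return graph


def pick_seeds(graph, budget):
    """Choose which users to label, non-uniformly.

    Stand-in heuristic: highest degree first, ties broken by name.
    This only illustrates replacing uniform sampling; it is not the
    sampling scheme proposed in the paper.
    """
    return sorted(graph, key=lambda u: (-len(graph[u]), u))[:budget]


def propagate(graph, seed_labels, rounds=20):
    """Spread spam scores in [0, 1] by repeated neighbor averaging.

    Labeled users stay clamped to their known score; every other user
    starts at 0.5 and takes the mean score of its neighbors each round.
    """
    scores = {u: 0.5 for u in graph}
    scores.update(seed_labels)
    for _ in range(rounds):
        updated = {}
        for u in graph:
            if u in seed_labels:
                updated[u] = seed_labels[u]
            else:
                neighbors = graph[u]
                updated[u] = sum(scores[v] for v in neighbors) / len(neighbors)
        scores = updated
    return scores


# Toy data (hypothetical): two triangles joined by the edge c-d.
edges = [("a", "b"), ("a", "c"), ("b", "c"),
         ("d", "e"), ("d", "f"), ("e", "f"),
         ("c", "d")]
graph = build_graph(edges)
seeds = pick_seeds(graph, budget=2)          # the two bridge users
seed_labels = {seeds[0]: 1.0, seeds[1]: 0.0}  # pretend an oracle labeled them
scores = propagate(graph, seed_labels)
spammers = sorted(u for u, s in scores.items() if s > 0.5)
```

With only two labeled users, the scores of the remaining four are pulled toward the label of the seed in their own cluster, which is the sense in which graph structure can compensate for scarce labels. A real pipeline would combine such scores with a trained classifier, as the abstract describes.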
