
Detecting Deceptive Reviews using Generative Adversarial Networks

In the past few years, consumer review sites have become the main target of deceptive opinion spam, where fictitious opinions or reviews are deliberately written to sound authentic. Most of the existing work on detecting deceptive reviews focuses on building supervised classifiers based on syntactic and lexical patterns of an opinion. With the successful use of neural networks in various classification applications, in this paper we propose FakeGAN, a system that for the first time augments and adopts Generative Adversarial Networks (GANs) for a text classification task, in particular, detecting deceptive reviews. Unlike standard GAN models, which have a single generator and discriminator, FakeGAN uses two discriminator models and one generative model. The generator is modeled as a stochastic policy agent in reinforcement learning (RL), and the discriminators use a Monte Carlo search algorithm to estimate the intermediate action-value and pass it as the RL reward to the generator. Providing the generator with two discriminator models avoids the mode collapse issue by learning from both the truthful and the deceptive review distributions. Indeed, our experiments show that using two discriminators makes FakeGAN highly stable, whereas instability is a known issue for GAN architectures. Although FakeGAN is built upon a semi-supervised classifier, a class of models known for lower accuracy, our evaluation on a dataset of TripAdvisor hotel reviews shows the same accuracy as state-of-the-art approaches that apply supervised machine learning. These results indicate that GANs can be effective for text classification tasks; specifically, FakeGAN is effective at detecting deceptive reviews.



I Introduction

Today, we habitually turn to the wisdom of our peers, and often complete strangers, for advice, instead of merely taking the word of an advertiser or business owner. A 2015 study by marketing research company Mintel [1] found that nearly 70 percent of Americans seek out others’ opinions online before making a purchase. Many online review platforms have sprung up to facilitate this sharing of ideas amongst users. The heavy reliance of users on review information has dramatic effects on business owners. It has been shown that an extra half-star rating on Yelp helps restaurants to sell out 19 percentage points more frequently [2].

This phenomenon has also led to a market for various kinds of fraud. In simple cases, this could be a business rewarding its customers with a discount, or outright paying them, to write a favorable review. In more complex cases, this could involve astroturfing, opinion spamming [3], or deceptive opinion spamming [4], where fictitious reviews are deliberately written to sound authentic. Figure 1 shows an example of a truthful and a deceptive review written for the same hotel. It is estimated that up to 25% of Yelp reviews are fraudulent [5, 6].

Detecting deceptive reviews is a text classification problem. In recent years, deep learning techniques based on natural language processing have been shown to be successful for text classification tasks. Recursive Neural Networks (RecursiveNN) [7, 8, 9] have shown good performance in classifying texts, while Recurrent Neural Networks (RecurrentNN) [10] better capture contextual information and are well suited to capturing the semantics of long texts. However, RecurrentNN is a biased model, where later words in a text have more influence than earlier words [11]. This is not suitable for tasks such as the detection of deceptive reviews, which depend on unbiased semantics of the entire document (review). Recently, techniques based on Convolutional Neural Networks (CNN) [12, 13] were shown to be effective for text classification. However, the effectiveness of these techniques depends on careful selection of the window size [11], which controls the parameter space.

Moreover, in general, the main problem with applying classification methods to the detection of deceptive reviews is the lack of the substantial ground truth datasets required by most supervised machine learning techniques. This problem worsens for neural-network-based methods, whose complexity requires much larger datasets to reach reasonable performance.

To address the limitations of the existing techniques, we propose FakeGAN, a technique based on Generative Adversarial Networks (GANs) [14]. GANs are a class of artificial intelligence algorithms used in unsupervised machine learning, implemented by a system of two neural networks contesting with each other in a zero-sum game framework. GANs have been used mostly for image-based applications [14, 15, 16, 17]. In this paper, for the first time, we propose the use of GANs for a text classification task, i.e., detecting deceptive reviews. Moreover, the use of a semi-supervised learning method like GAN can alleviate the problem of ground truth scarcity that in general hinders detection success [4, 18, 19].

We augment GAN models for our application in such a way that, unlike standard GAN models, which have a single generator and discriminator, FakeGAN uses two discriminator models, $D$ and $D'$, and one generative model $G$. The discriminator $D$ tries to distinguish between truthful and deceptive reviews, whereas $D'$ tries to distinguish between reviews generated by the generative model $G$ and samples from the deceptive review distribution. The discriminator $D'$ helps $G$ to generate reviews close to the deceptive review distribution, while $D$ helps $G$ to generate reviews that $D$ classifies as truthful.

Our intuition behind using two discriminators is to create a stronger generator model. If, in the adversarial learning phase, the generator gets rewards only from $D$, the GAN may face the mode collapse issue [20], as it tries to learn two different distributions (truthful and deceptive reviews). The combination of $D$ and $D'$ trains $G$ to generate better deceptive reviews, which in turn train $D$ to be a better discriminator.

Indeed, our evaluation using the hotel reviews dataset shows that the discriminator generated by FakeGAN performs on par with the state-of-the-art methods that apply supervised machine learning, with an accuracy of 89.1%. These results indicate that GANs can be effective for text classification tasks; specifically, FakeGAN is effective at detecting deceptive reviews. To the best of our knowledge, FakeGAN is the first work that uses a GAN to produce a better discriminator model (i.e., $D$), in contrast to the common GAN applications, which aim to improve the generator model.

In summary, following are our contributions:

  1. We propose FakeGAN, a deceptive review detection system based on a double discriminator GAN.

  2. We believe that FakeGAN demonstrates a good first step towards using GANs for text classification tasks.

  3. To the best of our knowledge, FakeGAN is the first system using semi-supervised neural network-based learning methods for detecting deceptive reviews.

  4. Our evaluation results demonstrate that FakeGAN is as effective as the state-of-the-art methods that apply supervised machine learning for detecting deceptive reviews.

(a) A truthful review provided by a high profile user on TripAdvisor
(b) A deceptive review written by an Amazon Mechanical Turk worker
Fig. 1: A truthful review versus a deceptive review, both written for the same hotel.

II Approach

Generative Adversarial Network (GAN) [14] is a promising framework for generating high-quality samples with the same distribution as the target dataset. FakeGAN leverages GAN to learn the distributions of truthful and deceptive reviews and to build a semi-supervised classifier using the corresponding distributions.

A GAN consists of two models: a generative model $G$, which tries to capture the data distribution, and a discriminative model $D$, which distinguishes between samples coming from the training data and samples coming from the generator $G$. These two models are trained simultaneously: $G$ tries to fool the discriminator $D$, while $D$ maximizes the accuracy of its estimate of whether a sample comes from the training data or is produced by the generator. In a nutshell, this framework corresponds to a minimax two-player game.

The feedback, or gradient update, from the discriminator model plays a vital role in the effectiveness of a GAN. In the case of text generation, it is difficult to pass the gradient update because the generative model produces discrete tokens (words), whereas the discriminative model makes a decision for a complete sequence or sentence. Inspired by SeqGAN [21], which uses a GAN model for Chinese poem generation, in this work we model the generator as a stochastic policy in reinforcement learning (RL), where the gradient update, or RL reward signal, is provided by the discriminator using Monte Carlo search. Monte Carlo search is a heuristic search algorithm for identifying the most promising moves in a game. In summary, in each state of the game, it plays out the game to the very end a fixed number of times according to a given policy. To find the most promising move, it must be provided with reward signals for a complete sequence of moves.
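The roll-out idea described above can be sketched in a few lines. This is a minimal illustration, not the FakeGAN implementation: the function `rollout_reward`, the toy vocabulary, the uniform roll-out policy, and the stand-in reward (which plays the role a discriminator would play) are all hypothetical.

```python
import random

def rollout_reward(prefix, seq_len, n_rollouts, policy, reward_fn, rng):
    """Estimate the value of a partial sequence by completing it
    n_rollouts times with `policy` and averaging the final rewards."""
    total = 0.0
    for _ in range(n_rollouts):
        seq = list(prefix)
        while len(seq) < seq_len:          # play the "game" to the very end
            seq.append(policy(seq, rng))
        total += reward_fn(seq)            # reward exists only for complete sequences
    return total / n_rollouts

VOCAB = ["good", "bad", "hotel"]           # hypothetical toy vocabulary

def uniform_policy(seq, rng):
    """Hypothetical roll-out policy: pick the next token uniformly at random."""
    return rng.choice(VOCAB)

def toy_reward(seq):
    """Stand-in for a discriminator score: fraction of 'good' tokens."""
    return seq.count("good") / len(seq)

rng = random.Random(0)
value_good = rollout_reward(["good"], 4, 200, uniform_policy, toy_reward, rng)
value_bad = rollout_reward(["bad"], 4, 200, uniform_policy, toy_reward, rng)
complete = rollout_reward(["good", "good"], 2, 5, uniform_policy, toy_reward, rng)
```

Averaging over more roll-outs reduces the variance of the estimate, at the cost of more calls to the reward function.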

All the existing applications use GANs to create a strong generator, where the main issue is the convergence of the generator model [22, 23, 20]. Mode collapse in particular is a known problem in GANs, where the complexity and multimodality of the input distribution cause the generator to produce samples from a single mode. The generator may switch between modes during the learning phase, and this cat-and-mouse game may never end [24, 20]. Although no formal proof exists for convergence, in Section III we show that FakeGAN’s discriminator converges in practice.

Unlike the typical applications of GANs, where the ultimate goal is to have a strong generator, FakeGAN leverages GAN to create a well-trained discriminator, so that it can successfully distinguish truthful and deceptive reviews. However, to avoid the stability issues inherent to GANs, we augment our network to have two discriminator models, though we use only one of them as our intended classifier. Note that leveraging samples generated by the generator makes our classifier a semi-supervised classifier.

Fig. 2: The overview of FakeGAN. The symbols + and − indicate positive and negative samples, respectively. Note that these are different from truthful and deceptive reviews.


We start by defining the symbols that are used throughout this section to describe the steps of our approach. The training dataset, $X = X_D \cup X_T$, consists of two parts, deceptive reviews $X_D$ and truthful reviews $X_T$. We use $U$ to denote the vocabulary of all tokens (i.e., words) that are available in $X$.

Our generator model $G_\alpha$, parametrized by $\alpha$, produces each review $S_{1:L}$ as a sequence of tokens of length $L$, where $S_{1:L} \in U^L$. We use $Z_G$ to indicate all the reviews generated by our generator model $G_\alpha$.

We use two discriminator models, $D$ and $D'$. The discriminator $D$ distinguishes between truthful and deceptive reviews; as such, $D(S_{1:L})$ is the probability that the sequence of tokens comes from $X_T$ rather than from $X_D$. Similarly, $D'$ distinguishes between deceptive samples in the dataset and samples generated by $G_\alpha$; consequently, $D'(S_{1:L})$ is a probability indicating how likely it is that the sequence of tokens comes from $X_D$ rather than from $Z_G$.

The discriminator $D'$ guides the generator $G_\alpha$ to produce samples similar to $X_D$, whereas $D$ guides $G_\alpha$ to generate samples that seem truthful to $D$. So in each round of training, by using the feedback from $D$ and $D'$, the generator $G_\alpha$ tries to fool $D'$ and $D$ by generating reviews that seem deceptive (not generated by $G_\alpha$) to $D'$, and truthful (neither generated by $G_\alpha$ nor coming from $X_D$) to $D$.

Figure 2 shows an overview of FakeGAN. During pre-training, we use Maximum Likelihood Estimation (MLE) to train the generator on deceptive reviews from the training dataset. We also pre-train the discriminators by minimizing the cross-entropy.

The generator is defined as a policy model in reinforcement learning. At timestep $t$, the state $s$ is the sequence of tokens produced so far, and the action $a$ is the next token. The policy model is stochastic. Furthermore, the generator is trained by using a policy gradient and Monte Carlo (MC) search on the expected end reward from the discriminative models $D$ and $D'$. Similar to [21], we consider the estimated probability as the reward. Formally, the corresponding action-value function is:
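A form consistent with this description, assuming that the rewards of the two discriminators are simply summed, would be the following. The combination rule, and the notation $S_{1:L}$ for a complete review of $L$ tokens and $G_\alpha$ for the generator, are assumptions here, adapted from SeqGAN [21].

```latex
Q^{G_\alpha}_{D,D'}\bigl(a = S_L,\; s = S_{1:L-1}\bigr) \;=\; D(S_{1:L}) + D'(S_{1:L})
```

Under this reading, the last token of a review is credited with the combined score that both discriminators assign to the finished review.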


As mentioned before, $G_\alpha$ produces a review token by token. However, the discriminators provide the reward only for a complete sequence. Moreover, $G_\alpha$ should care about the long-term reward, similar to playing chess, where players sometimes give up immediately good moves for the long-term goal of victory [25]. Therefore, to estimate the action-value function at every timestep $t$, we apply the Monte Carlo search $N$ times with a roll-out policy $G_\gamma$ to sample the undetermined last $L - t$ tokens. We define an $N$-time Monte Carlo search as

$$\mathrm{MC}^{G_\gamma}(S_{1:t}, N) = \{S^{1}_{1:L}, \dots, S^{N}_{1:L}\} \qquad (2)$$

where for $1 \le i \le N$

$$S^{i}_{1:t} = (S_1, \dots, S_t) \qquad (3)$$

and $S^{i}_{t+1:L}$ is sampled via the roll-out policy $G_\gamma$ based on the current state $S^{i}_{1:t}$. The complexity of the action-value estimation function mainly depends on the roll-out policy. While one might use a simple version (e.g., random sampling or sampling based on n-gram features) as the policy to train the GAN fast, to be more efficient, we use the same generative model ($G_\gamma = G_\alpha$ at time $t$). Note that a higher value of $N$ results in less variance and a more accurate evaluation of the action-value function. We can now define the action-value estimation function at timestep $t$ as:
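Under the assumption that the two discriminators' rewards are summed, and writing $\mathrm{MC}^{G_\gamma}(S_{1:t}, N)$ for the set of $N$ roll-outs of the prefix $S_{1:t}$ (notation assumed here, adapted from SeqGAN [21]), the estimate could take the following form:

```latex
Q^{G_\alpha}_{D,D'}\bigl(a = S_t,\; s = S_{1:t-1}\bigr) =
\begin{cases}
\dfrac{1}{N} \displaystyle\sum_{i=1}^{N} \Bigl( D(S^{i}_{1:L}) + D'(S^{i}_{1:L}) \Bigr),
  \quad S^{i}_{1:L} \in \mathrm{MC}^{G_\gamma}(S_{1:t}, N), & t < L \\[8pt]
D(S_{1:L}) + D'(S_{1:L}), & t = L
\end{cases}
```

For $t < L$ the reward is averaged over the roll-outs; for $t = L$ the sequence is already complete, so no roll-out is needed.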



where the $N$ roll-out sequences are created according to Equation 2. As there is no intermediate reward for the generator, we define the objective function for the generator (based on [26]) as producing a sequence from the start state $S_0$ that maximizes its final reward:
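Following the SeqGAN formulation [21], a sketch of such an objective is the expected end reward from the start state. The symbols $J(\alpha)$ for the objective, $R_L$ for the reward of a complete sequence, $U$ for the vocabulary, and $Q$ for the action-value function are assumptions made for this illustration.

```latex
J(\alpha) \;=\; \mathbb{E}\bigl[ R_L \mid S_0, \alpha \bigr]
\;=\; \sum_{S_1 \in U} G_\alpha(S_1 \mid S_0)\; Q^{G_\alpha}_{D,D'}\bigl(a = S_1,\; s = S_0\bigr)
```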


Consequently, the gradient of the objective function is:
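Assuming the objective is the expected end reward $J(\alpha)$ of a generator $G_\alpha$ over a vocabulary $U$ (notation assumed here), the standard policy-gradient form used by SeqGAN [21] would read:

```latex
\nabla_\alpha J(\alpha) \;=\; \sum_{t=1}^{L} \mathbb{E}_{S_{1:t-1} \sim G_\alpha}
\Bigl[ \sum_{S_t \in U} \nabla_\alpha G_\alpha(S_t \mid S_{1:t-1}) \cdot
Q^{G_\alpha}_{D,D'}\bigl(a = S_t,\; s = S_{1:t-1}\bigr) \Bigr]
```

In practice, the expectation is approximated by sampling sequences from the generator itself.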


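We update the generator’s parameters ($\alpha$) as:

$$\alpha \leftarrow \alpha + \lambda \nabla_\alpha J(\alpha) \qquad (7)$$

where $J(\alpha)$ is the generator's objective function and $\lambda$ is the learning rate.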

By dynamically updating the discriminative models, we can further improve the generator. So, after generating samples, we re-train the discriminative models $D$ and $D'$ for $d$ steps using the following objective functions, respectively:
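Since both discriminators are trained by minimizing cross-entropy, one plausible form of the two objectives is sketched below. It assumes that $D$ treats truthful reviews $X_T$ as positive and both deceptive reviews $X_D$ and generated reviews $Z_G$ as negative, and that $D'$ treats $X_D$ as positive and $Z_G$ as negative; this assignment is inferred from the description of the discriminators' roles and is not a verbatim reproduction.

```latex
\min_{D} \; -\,\mathbb{E}_{S \sim X_T}\bigl[\log D(S)\bigr]
          \;-\; \mathbb{E}_{S \sim X_D \cup Z_G}\bigl[\log\bigl(1 - D(S)\bigr)\bigr]

\min_{D'} \; -\,\mathbb{E}_{S \sim X_D}\bigl[\log D'(S)\bigr]
           \;-\; \mathbb{E}_{S \sim Z_G}\bigl[\log\bigl(1 - D'(S)\bigr)\bigr]
```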


In each of the $d$ steps, we use the generator to produce the same number of samples as the number of truthful reviews. The updated discriminators are then used to update the generator, and this cycle continues until FakeGAN converges. Algorithm 1 formally defines all the above steps.

Require: discriminators $D$ and $D'$, generator $G_\alpha$, roll-out policy $G_\gamma$, dataset $X$
  Initialize $G_\alpha$, $D$, and $D'$ with random weights.
  Load word2vec vector embeddings into the $G_\alpha$, $D$, and $D'$ models
  Pre-train $G_\alpha$ using MLE on $X_D$
  Pre-train $D$ by minimizing the cross entropy
  Generate negative examples by $G_\alpha$ for training $D'$
  Pre-train $D'$ by minimizing the cross entropy
  repeat
     for g-steps do
        Generate a sequence of tokens $S_{1:L}$ by $G_\alpha$
        for $t$ in $1:L$ do
           Compute the action-value of $S_t$ by Eq. 4
        end for
        Update $\alpha$ via policy gradient Eq. 7
     end for
     for d-steps do
        Use $G_\alpha$ to generate $Z_G$.
        Train discriminator $D$ by Eq. 8
        Train discriminator $D'$ by Eq. 9
     end for
  until $D$ reaches a stable accuracy.
Algorithm 1 FakeGAN
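The control flow of the adversarial phase of Algorithm 1 can be sketched as follows. This is only a skeleton under stated assumptions: the three callables are hypothetical placeholders for the generator update, the discriminator training, and the evaluation of $D$, and the stopping rule (accuracy varying by less than `tol` over the last `patience` rounds) is one possible reading of "until D reaches a stable accuracy".

```python
def adversarial_training(update_generator, train_discriminators, evaluate_d,
                         g_steps=1, d_steps=6, tol=1e-3, patience=3,
                         max_rounds=1000):
    """Alternate generator and discriminator updates until the accuracy
    of D has been stable for `patience` consecutive rounds."""
    history = []
    for _ in range(max_rounds):
        for _ in range(g_steps):
            update_generator()          # policy-gradient step on the generator
        for _ in range(d_steps):
            train_discriminators()      # re-train both discriminators on fresh samples
        history.append(evaluate_d())
        recent = history[-patience:]
        if len(recent) == patience and max(recent) - min(recent) < tol:
            break                       # accuracy of D is considered stable
    return history

# Toy usage with counters instead of real models (all values are made up).
calls = {"g": 0, "d": 0}
fake_accuracies = iter([0.70, 0.80, 0.85, 0.89, 0.89, 0.89, 0.89])

def fake_update_generator():
    calls["g"] += 1

def fake_train_discriminators():
    calls["d"] += 1

history = adversarial_training(fake_update_generator, fake_train_discriminators,
                               lambda: next(fake_accuracies))
```

With `g_steps=1` and `d_steps=6`, the discriminators are updated six times for every generator update.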

The Generative Model

We use RecurrentNNs (RNNs) to construct the generator. An RNN maps the input embedding representations $x_1, \dots, x_L$ of the input sequence of tokens into hidden states $h_1, \dots, h_L$ by using the following recursive function:

$$h_t = g(h_{t-1}, x_t)$$

Finally, a softmax output layer $z$ with bias vector $c$ and weight matrix $V$ maps the hidden layer neurons into the output token distribution as

$$p(y_t \mid x_1, \dots, x_t) = z(h_t) = \mathrm{softmax}(c + V h_t)$$

To deal with the common vanishing and exploding gradient problem [27] of backpropagation through time, we use Long Short-Term Memory (LSTM) cells [28].
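The two mappings above can be illustrated with a small NumPy sketch. It uses a plain tanh recurrent cell rather than an LSTM to keep the example short, so it is a simplification and not the architecture used in FakeGAN; all dimensions, weight names, and random values are hypothetical.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D vector."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

def rnn_step(h_prev, x_t, W_h, W_x, b):
    """One recursive update h_t = g(h_{t-1}, x_t), with g a tanh cell."""
    return np.tanh(W_h @ h_prev + W_x @ x_t + b)

def next_token_distribution(h_t, V, c):
    """Softmax output layer: bias vector c and weight matrix V."""
    return softmax(c + V @ h_t)

rng = np.random.default_rng(0)
emb_dim, hidden_dim, vocab_size = 4, 8, 10     # toy sizes
W_h = rng.normal(scale=0.1, size=(hidden_dim, hidden_dim))
W_x = rng.normal(scale=0.1, size=(hidden_dim, emb_dim))
b = np.zeros(hidden_dim)
V = rng.normal(scale=0.1, size=(vocab_size, hidden_dim))
c = np.zeros(vocab_size)

h = np.zeros(hidden_dim)
for x_t in rng.normal(size=(3, emb_dim)):      # three already-generated tokens
    h = rnn_step(h, x_t, W_h, W_x, b)

probs = next_token_distribution(h, V, c)
next_token = int(rng.choice(vocab_size, p=probs))   # stochastic policy: sample
```

Sampling from `probs` rather than taking the arg-max is what makes the generator a stochastic policy.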


The Discriminator Model

For the discriminators, we select CNNs because of their effectiveness for text classification tasks [29]. First, we construct the matrix of the sequence by concatenating the input embedding representations $x_1, \dots, x_L$ of the sequence of tokens as:

$$\mathcal{E}_{1:L} = x_1 \oplus x_2 \oplus \dots \oplus x_L$$

Then a kernel $w$ applies a convolution operation to a window of $l$ words by using a non-linear function $\rho$, which results in a feature map:

$$c_i = \rho(w \otimes \mathcal{E}_{i:i+l-1} + b)$$

where $\otimes$ is the inner product of two vectors, and $b$ is a bias term. Usually, various numbers of kernels with different window sizes are used in a CNN. We tune the kernel sizes by trying kernels that have been used successfully in text classification tasks by the community [13, 30, 11]. Then we apply a max-over-time pooling operation over the feature maps, which allows us to combine the outputs of different kernels. Based on [31], we add the highway architecture to improve the performance. In the end, a fully connected layer with a sigmoid activation function is used to output the class probability of the input sequence.
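The convolution, max-over-time pooling, and sigmoid output can be sketched with NumPy as below. This is a simplified illustration: it uses ReLU as the non-linear function, omits the highway layer, and all kernel values and weights are made-up toy numbers.

```python
import numpy as np

def conv_feature_map(E, w, b):
    """Slide kernel w (l x k) over the embedding matrix E (L x k) and
    apply ReLU; returns a feature map of length L - l + 1."""
    L, l = E.shape[0], w.shape[0]
    return np.array([max(0.0, float(np.sum(w * E[i:i + l]) + b))
                     for i in range(L - l + 1)])

def max_over_time(feature_map):
    """Keep the strongest response of a kernel over the whole sequence."""
    return float(np.max(feature_map))

def discriminator_score(E, kernels, w_out, b_out):
    """Pool each kernel's feature map, then apply a sigmoid output layer."""
    pooled = np.array([max_over_time(conv_feature_map(E, w, b))
                       for w, b in kernels])
    logit = float(w_out @ pooled + b_out)
    return 1.0 / (1.0 + np.exp(-logit))

E = np.ones((5, 2))                                   # 5 tokens, 2-dim embeddings
kernels = [(np.ones((2, 2)), 0.0),                    # window size 2
           (np.ones((3, 2)), -1.0)]                   # window size 3
feature_map = conv_feature_map(E, *kernels[0])
score = discriminator_score(E, kernels, w_out=np.array([0.1, -0.1]), b_out=0.0)
```

Because pooling reduces every feature map to a single number, kernels with different window sizes can be concatenated into one fixed-length vector.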

III Evaluation

We implemented FakeGAN using the TensorFlow [32] framework. We chose the dataset from [4], which has 800 reviews of 20 Chicago hotels with positive sentiment. The dataset consists of 400 truthful reviews provided by high-profile users on TripAdvisor and 400 deceptive reviews written by Amazon Mechanical Turk workers. To the best of our knowledge, this is the largest available dataset of labeled reviews, and it has been used by many related works [4, 18, 33]. Similar to SeqGAN [21], the generator in FakeGAN only creates fixed-length sentences. Since the majority of reviews in this dataset have a length of less than 200 words, we set the sequence length of FakeGAN ($L$) to 200. Reviews shorter than 200 words are padded with a fixed token to reach a length of 200, which results in 332 truthful and 353 deceptive reviews. Note that a larger dataset results in less training time: although a larger dataset makes each adversarial step slower, it provides a richer distribution of samples and thus reduces the number of adversarial steps.
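The fixed-length preprocessing can be sketched as follows. The whitespace tokenization, the `<PAD>` token name, and the choice to drop reviews longer than the sequence length are assumptions made for illustration; the text only states that shorter reviews are padded.

```python
PAD = "<PAD>"  # hypothetical padding token

def to_fixed_length(reviews, seq_len=200, pad_token=PAD):
    """Pad reviews shorter than seq_len; skip reviews that are longer
    (assumed behaviour, since the generator only handles fixed-length text)."""
    fixed = []
    for text in reviews:
        tokens = text.split()              # naive whitespace tokenization
        if len(tokens) > seq_len:
            continue
        fixed.append(tokens + [pad_token] * (seq_len - len(tokens)))
    return fixed

# Toy usage with a tiny sequence length.
sample = to_fixed_length(["great hotel downtown",
                          "this review is far too long",
                          "nice"], seq_len=5)
```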

We used k-fold cross-validation with k=5 to evaluate FakeGAN. We leveraged GloVe vectors (see “glove.6B.200d.txt”) for word representation [34]. Similar to SeqGAN [21], the convergence of FakeGAN varies with the training parameters $g$ and $d$ of the generator and discriminative models, respectively. After experimenting with different values, we observed that the values $g = 1$ and $d = 6$ are optimal. For the pre-training phase, we trained the generator and the discriminators until convergence, which took 120 and 50 steps, respectively. The adversarial learning starts after the pre-training phase. All our experiments were run on a 40-core machine, where the pre-training took one hour and the adversarial training took 11 hours, for a total of 12 hours.
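A minimal sketch of a 5-fold split is given below, using only the standard library. The function name, the fixed seed, and the use of 685 samples (332 truthful plus 353 deceptive reviews) are choices made for this illustration and do not describe the exact split used in the experiments.

```python
import random

def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]      # k nearly equal folds
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

splits = list(k_fold_indices(685, k=5))
```

Each sample appears in exactly one test fold, so every review is used for testing once and for training four times.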

III-A Accuracy of Discriminator $D$

As mentioned before, the goal of FakeGAN is to generate a highly accurate discriminator model, $D$, that can distinguish deceptive and truthful reviews. Figure 3(a) shows the accuracy trend for this model; for simplicity, the trend is shown only for the first iteration of k-fold cross-validation. The accuracy of $D$ stabilized during the pre-training phase, and we set the adversarial learning to begin at step 51. After a small decrease at the beginning, the accuracy increases and converges to a level on par with the accuracy of the state-of-the-art approach [4], which applied supervised machine learning to the same dataset. Under k-fold cross-validation the accuracy is 89.1%, and accuracy, precision, and recall all have a standard deviation of 0.5. This supports our hypothesis that adversarial training can be used for detecting deceptive reviews. Interestingly, even though FakeGAN relies on semi-supervised learning, it yields performance similar to that of a fully supervised classification algorithm.

(a) Accuracy of FakeGAN (discriminator $D$) at each step, obtained by feeding the testing dataset to $D$. Pre-training by minimizing the cross entropy converges, and the adversarial training phase then boosts the accuracy further.
(b) Accuracy of $D'$ at each step, obtained by feeding the testing dataset and samples generated by $G$ to $D'$. Similar to Figure 3(a), this plot shows that $D'$ converged after 450 steps, resulting in the convergence of FakeGAN.
Fig. 3: The accuracy of $D$ and $D'$ on the test dataset over epochs. The vertical dashed line shows the beginning of adversarial training.

III-B Accuracy of Discriminator $D'$

Figure 3(b) shows the accuracy trend for the discriminator $D'$. Similar to $D$, $D'$ converges after 450 steps. This means that at this point the generator $G$ is no longer able to make progress in fooling $D'$, and the output distribution of $G$ stays almost the same. Thus, continuing adversarial learning does not result in any improvement in the accuracy of our main discriminator, $D$.

III-C Comparing FakeGAN with the original GAN approach

To justify the use of two discriminators in FakeGAN, we tried using just one discriminator (only $D$) in two different settings. In the first case, the generator $G$ is pre-trained to learn only the truthful review distribution. Here, the accuracy of the discriminator $D$, i.e., the classifier, dropped during adversarial learning relative to the accuracy it had reached in pre-training. In the second case, the generator $G$ is pre-trained to learn only the deceptive review distribution. Unlike the first case, adversarial learning improved the performance of $D$; however, the performance is still lower than that of FakeGAN.

These results demonstrate that using two discriminators is necessary to improve the accuracy of FakeGAN.

III-D Scalability Discussion

We argue that the time complexity of our proposed augmented GAN with two discriminators is the same as that of the original GANs, because the bottleneck in both is the MC search, where the roll-out policy (which is the generator $G$ itself at the current time) generates 16 complete sequences just to help the generator output the most promising token as its current action. This happens for every token of a sequence generated by $G$. Compared to the MC search, the discriminators $D$ and $D'$ are efficient and not time-consuming.

III-E Stability Discussion

As we discussed in Section II, the stability of GANs is a known issue. We observed that the parameters $g$ and $d$ have a large effect on the convergence and performance of FakeGAN, as illustrated in Figure 4, where $g$ and $d$ are both equal to one. We believe that the instability of GANs makes hyperparameter tuning of FakeGAN a challenging task and thus prevents it from outperforming the state-of-the-art methods based on supervised machine learning. However, with the values $g = 1$ and $d = 6$, FakeGAN converges and performs on par with the state-of-the-art approach.

(a) The accuracy of $D$ fluctuates around 77%, in contrast to the stabilization seen in Figure 3(a) (with values $g = 1$ and $d = 6$).
(b) Accuracy of $D'$. Unlike in Figure 3(b), this plot shows that $D'$ is not stable.
Fig. 4: The accuracy of $D$ and $D'$ on the test dataset over epochs while both $g$ and $d$ are one.

IV Related work

Text classification has been used extensively in email spam [35] detection and link spam detection in web pages [36, 37, 38]. Over the last decade, researchers have been working on deceptive opinion spam.

Jindal et al. [3] first introduced the deceptive opinion spam problem as a widespread phenomenon and showed that it is different from other traditional spam activities. They built their ground truth dataset by considering duplicate reviews as spam reviews and the rest as non-spam reviews. They extracted features related to the review, product, and reviewer, and trained a Logistic Regression model on these features to find fraudulent reviews on Amazon. Wu et al. [39] claimed that deleting dishonest reviews will distort the popularity significantly. They leveraged this idea to detect deceptive opinion spam in the absence of ground truth data. However, the heuristics underlying both of these approaches do not necessarily hold and are not thorough.

Yoo et al. [19] instructed a group of tourism marketing students to write a hotel review from the perspective of a hotel manager. They gathered 40 truthful and 42 deceptive hotel reviews and found that truthful and deceptive reviews have different lexical complexity. Ott et al. [4] created a much larger dataset of 800 opinions by crowdsourcing (via Amazon Mechanical Turk) the job of writing fraudulent reviews for existing businesses. They combined work from psychology and computational linguistics to develop and compare three approaches (genre identification, psycholinguistic deception detection, and text categorization) for detecting deceptive opinion spam. On a similar dataset, Feng et al. [33] trained a Support Vector Machine model based on syntactic stylometry features for deception detection. Li et al. [18] also combined the ground truth dataset created by Ott et al. [4] with deceptive reviews generated by their employees (domain experts) to build a feature-based additive model for exploring the general rule for deceptive opinion spam detection. Rahman et al. [40] developed a system to detect venues that are targets of deceptive opinions. Although this eases the identification of deceptive reviews, considerable effort is still involved in identifying the actual deceptive reviews. In almost all of these works, the size of the dataset prevents the proposed model from reaching its real capacity.

To alleviate these issues with the ground truth, we use a Generative Adversarial Network, which is closer to an unsupervised learning method than to a supervised one. We start with an existing dataset and use the generator model to create the reviews needed to strengthen the classifier (discriminator).

V Future work

Contrary to the popular belief that supervised learning techniques are superior to unsupervised techniques, the accuracy of FakeGAN, a semi-supervised learning technique, is comparable to that of the state-of-the-art supervised techniques on the same dataset. We believe that this is a preliminary step, which we plan to extend by trying different architectures such as Conditional GAN [41] and better hyperparameter tuning.

VI Conclusion

In this paper, we propose FakeGAN, a technique to detect deceptive reviews using Generative Adversarial Networks (GAN). To the best of our knowledge, this is the first work to leverage GANs and semi-supervised learning methods to identify deceptive reviews. Our evaluation using a dataset of 800 reviews from 20 Chicago hotels of TripAdvisor shows that FakeGAN with an accuracy of 89.1% performed on par with the state-of-the-art models. We believe that FakeGAN demonstrates a good first step towards using GAN for text classification tasks, specifically those requiring very large ground truth datasets.


We would like to thank the anonymous reviewers for their valuable comments. This material is based on research sponsored by the Office of Naval Research under grant numbers N00014-15-1-2948, N00014-17-1-2011 and by DARPA under agreement number FA8750-15-2-0084. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright notation thereon. This work is also sponsored by a gift from Google’s Anti-Abuse group. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of DARPA or the U.S. Government.