GANs for Semi-Supervised Opinion Spam Detection

03/19/2019 ∙ by Gray Stanton, et al. ∙ University of Washington, Colorado State University

Online reviews have become a vital source of information in purchasing a service (product). Opinion spammers manipulate reviews, affecting the overall perception of the service. A key challenge in detecting opinion spam is obtaining ground truth. Though there exists a large set of reviews online, only a few of them have been labeled spam or non-spam. In this paper, we propose spamGAN, a generative adversarial network which relies on a limited set of labeled data as well as unlabeled data for opinion spam detection. spamGAN improves on the state-of-the-art GAN-based techniques for text classification. Experiments on the TripAdvisor dataset show that spamGAN outperforms existing spam detection techniques when limited labeled data is used. Apart from detecting spam reviews, spamGAN can also generate reviews with reasonable perplexity.

1 Introduction

Opinion spam is a widespread problem in e-commerce, social media, travel sites, movie review sites, etc. [Jindal et al.2010]. Statistics show that more than of the consumers read reviews before making a purchase [Hub2018]. The likelihood of purchase is also reported to increase when there are more reviews. Opinion spammers try to exploit such financial gains by providing spam reviews which influence readers and thereby affect sales. We consider the problem of identifying spam reviews as a classification problem, i.e., given a review, it needs to be classified either as spam or non-spam.

One of the main challenges in identifying spam reviews is the lack of labeled data, i.e., spam and non-spam labels [Rayana and Akoglu2015]. While there exists a corpus of online reviews, only a few of them are labeled. This is mainly because manual labeling is often time consuming, costly and subjective [Li et al.2018]. Research shows that unlabeled data, when used in conjunction with small amounts of labeled data, can produce considerable improvement in learning accuracy [Ott et al.2011]. There is very limited research on using semi-supervised learning techniques for opinion spam detection [Crawford et al.2015]. The existing semi-supervised learning approaches [Li et al.2011, Hernández et al.2013, Li et al.2014] for identifying opinion spam use a pre-defined set of features for training their classifier. In this paper, we use deep neural networks, which automatically discover the features needed for classification [LeCun et al.2015].

Deep generative models have shown promising results for semi-supervised learning [Kumar et al.2017]. Specifically, Generative Adversarial Networks (GANs) [Goodfellow et al.2014], which have the ability to generate samples very close to real data, have achieved state-of-the-art results. However, most research on GANs is for images (continuous values) and not text data (discrete values) [Fedus et al.2018].

GANs operate by training two neural networks which play a min-max game: the discriminator D tries to discriminate real training samples from fake ones, and the generator G tries to generate fake training samples to fool the discriminator. GANs have two main drawbacks for text: 1) when the data is discrete, the gradient from the discriminator may not be useful for improving the generator, because the slight change in weights brought forth by the gradients may not correspond to a suitable discrete mapping in the dictionary [Huszár2015]; 2) the discrimination is based on the entire sentence, not parts of it, giving rise to the sparse rewards problem [Yu et al.2017].
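The first drawback can be illustrated with a minimal sketch. The vocabulary, logits and nudge below are made-up toy values, not taken from the paper; the point is only that a small, gradient-sized change to the generator's outputs typically leaves the selected discrete token unchanged.

```python
def greedy_token(logits, vocab):
    """Return the vocabulary entry with the highest logit (a discrete choice)."""
    best = max(range(len(logits)), key=lambda i: logits[i])
    return vocab[best]

# Toy values, chosen only for illustration.
vocab = ["good", "bad", "clean"]
logits = [2.0, 1.0, 0.5]

# A small nudge to the logits, of the size a gradient step might produce.
nudged = [l + d for l, d in zip(logits, [-0.01, 0.02, 0.01])]

before = greedy_token(logits, vocab)
after = greedy_token(nudged, vocab)  # same token: the nudge is invisible in the output
```

Because the emitted token does not move, the discriminator sees the same sentence before and after the update, which is one motivation for the reinforcement learning formulation used later.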

Existing works on GANs for text data generation are limited by the length of the sentence that can be generated, e.g., MaskGAN [Fedus et al.2018] considers words per sentence. These approaches may not be suitable for processing most online reviews, which are relatively lengthy. For example, the TripAdvisor review dataset used in our experiments has sentences with median length . Further, GANs have also not been fully investigated for text classification tasks.

In this paper, we propose spamGAN, a semi-supervised GAN-based approach for classifying opinion spam. spamGAN uses both labeled instances and unlabeled data to correctly learn the input distribution, resulting in better prediction accuracy for comparatively longer reviews. spamGAN consists of three components, a generator, a discriminator and a classifier, which work together to not only classify spam reviews but also generate samples close to the train set. We conduct experiments on the TripAdvisor dataset and show that spamGAN outperforms existing works when using limited labeled data.

The main contributions of this paper are as follows: 1) we propose spamGAN, a semi-supervised GAN-based model to detect opinion spam. To the best of our knowledge, we are the first to explore the potential of GANs for spam detection; 2) the proposed GAN model improves on the state-of-the-art GAN-based models for semi-supervised text classification; 3) most existing research on opinion spam (other than deep learning methods) manually identifies heuristics/features for classifying spamming behavior, whereas in our GAN-based approach the features are learned by the neural network; 4) experiments show that spamGAN outperforms state-of-the-art methods in classifying spam when limited labeled data is used; 5) spamGAN can also generate spam/non-spam reviews very similar to the training set, which can be used for synthetic data generation in cases with limited ground truth.

2 Related Work

Most existing opinion spam detection techniques are supervised methods based on pre-defined features. [Jindal and Liu2008] used logistic regression with product, review and reviewer-centric features. [Ott et al.2011] used n-gram features to train Naive Bayes and SVM classifiers. [Feng et al.2012, Mukherjee et al.2013, Li et al.2015] used part-of-speech tags and context-free grammar parse trees, behavioral features, and spatio-temporal features, respectively. [Wang et al.2011, Akoglu et al.2013] used graph-based algorithms.

Neural network methods for spam detection consider the reviews as input without specific feature extraction. GRNN [Ren and Ji2017] used a gated recurrent neural network to study the contextual information of review sentences. DRI-RCNN [Zhang et al.2018] used a recurrent network for learning the contextual information of the words in the reviews. DRI-RCNN extends RCNN [Lai et al.2015] by learning embedding vectors with respect to both spam and non-spam labels for the words in the reviews. Since RCNN and DRI-RCNN use neural networks for spam classification, we use these supervised methods for comparison in our experiments.

Few semi-supervised methods for opinion spam detection exist. [Li et al.2011] used co-training with a Naive Bayes classifier on reviewer, product and review features. [Hernández et al.2013, Li et al.2014] used only positively labeled samples along with unlabeled data. [Rayana and Akoglu2015] used review features, timestamps and ratings as well as a pairwise Markov random field network of reviewers and products to build a supervised algorithm along with semi-supervised extensions. Other unsupervised methods for spam detection [Xu et al.2015] exist, but they are out of the scope of this work.

The ongoing research on GANs for text classification aims to address the drawbacks of GANs in generating sentences with respect to the gradients and the sparse rewards problem. SeqGAN [Yu et al.2017] addresses them by considering sequence generation as a reinforcement learning problem. Monte Carlo Tree Search (MCTS) is used to overcome the issue of sparse rewards; however, it is computationally intractable. StepGAN [Tuan and Lee2018] and MaskGAN [Fedus et al.2018] use the actor-critic [Konda and Tsitsiklis2000] method to learn the rewards; however, MaskGAN is limited by the length of the sequence. Further, all of them focus on sentence generation. CSGAN [Li et al.2018] deals with sentence classification, but it uses MCTS and character-level embeddings. spamGAN differs from CSGAN in using the actor-critic reinforcement learning method for sequence generation and word-level embeddings, which are suitable for longer sentences.

3 spamGAN

In this section, we will present the problem set-up, the three components of spamGAN as well as their interactions through a sequential decision making framework.

3.1 Problem Set-up

We are given a set of reviews labeled spam or non-spam (the labeled set). Given the cost of labeling, we hope to improve classification performance by also using a significantly larger set of unlabeled reviews (the unlabeled set), which includes both spam and non-spam reviews. The training data is a combination of labeled and unlabeled sentences; training (see Alg. 1) can use only the labeled set or both the labeled and unlabeled sets. Each training sentence consists of a sequence of word tokens, where each token is drawn from the corpus of tokens used. For sentences belonging to the labeled set, we also include a class label belonging to one of the two classes, spam or non-spam.

To leverage both the labeled and unlabeled data, we include three components in spamGAN: the generator G, the discriminator D, and the classifier, as shown in Fig. 1. The generator, for a given class label, learns to generate new sentences (we call them fake sentences) similar to the real sentences in the train set belonging to the same class. Fake sentences are those produced by the generator, whereas spam sentences are deceptive sentences with the spam class label; the generator can generate fake sentences belonging to either the spam or the non-spam class. The discriminator learns to differentiate between real and fake sentences, and informs the generator (via rewards) if the generated sentences are unrealistic. This competition between the generator and discriminator improves the quality of the generated sentences.

We know the class labels for the fake sentences produced by the generator as they are controlled [Hu et al.2017], i.e., constrained by the class labels. The classifier is trained using real labeled sentences from the labeled set and fake sentences produced by the generator, thus improving its ability to generalize beyond the small set of labeled sentences. The classifier’s performance on fake sentences is also used as feedback to improve the generator: better classification accuracy results in more rewards. While the discriminator and generator are competing, the classifier and generator are mutually bootstrapping. As the components of spamGAN are trained, the generator produces sentences very similar to the training set while the classifier learns the characteristics of spam and non-spam sentences in order to identify them correctly.

Figure 1: spamGAN Architecture

3.2 Generator

The generator aims to find a parameterized conditional distribution that best approximates the true joint distribution of sentences and classes in the real training set. Each generated fake sentence is conditioned on the generator's network parameters, a noise vector, and a class label; the noise vector and the class label are each sampled from a prior distribution and together make up the context vector. The context vector is concatenated to the generated sentence at every timestep [Tuan and Lee2018], ensuring that the class label of each generated fake sentence is retained.

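While sampling from the generator, the word tokens are generated auto-regressively, decomposing the distribution over token sequences into the ordered conditional sequence below, where $x_{1:T}$ denotes a sentence of $T$ tokens, $z$ the noise vector and $c$ the class label,

$$G(x_{1:T} \mid z, c) = \prod_{t=1}^{T} G(x_t \mid x_{1:t-1}, z, c) \qquad (1)$$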
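The auto-regressive decomposition can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: `next_token_probs` and `toy_next_token_probs` are hypothetical stand-ins for the generator network, and the token probabilities are invented.

```python
import math

def sequence_log_prob(tokens, context, next_token_probs):
    """Chain rule: log p(x_1..x_T | context) = sum_t log p(x_t | x_1..x_{t-1}, context)."""
    total = 0.0
    for t, tok in enumerate(tokens):
        probs = next_token_probs(tokens[:t], context)  # distribution over the next token
        total += math.log(probs[tok])
    return total

def toy_next_token_probs(prefix, context):
    """Hypothetical stand-in for the generator: the class label shifts the distribution."""
    if context["label"] == "spam":
        return {"great": 0.7, "okay": 0.3}
    return {"great": 0.4, "okay": 0.6}

lp = sequence_log_prob(["great", "okay"], {"label": "spam"}, toy_next_token_probs)
```

Passing the same context at every step mirrors the description above, where the context vector accompanies every timestep so the intended class is not forgotten.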

During pre-training, we use batches of real sentences from the training data and minimize the cross-entropy of the next token conditioned on the preceding ones. Specifically, we minimize the loss (Eqn. 2) over real sentence-class pairs from the labeled set as well as unlabeled real sentences from the unlabeled set with randomly-assigned class labels drawn from the class prior distribution.

(2)
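A minimal sketch of this pre-training objective follows, under stated assumptions: the function names, the uniform toy model and the 50/50 class prior are all hypothetical, and the loss is written as a plain mean negative log-likelihood rather than the paper's exact Eqn. 2.

```python
import math
import random

def assign_label(label, class_prior, rng):
    """Keep a known label; for an unlabeled sentence draw one from the class prior."""
    if label is not None:
        return label
    classes = list(class_prior)
    return rng.choices(classes, weights=[class_prior[c] for c in classes])[0]

def next_token_cross_entropy(tokens, context, next_token_probs):
    """Mean negative log-likelihood of each real token given the tokens before it."""
    nll = 0.0
    for t, tok in enumerate(tokens):
        nll -= math.log(next_token_probs(tokens[:t], context)[tok])
    return nll / len(tokens)

rng = random.Random(0)
prior = {"spam": 0.5, "non-spam": 0.5}           # assumed class prior
uniform = lambda prefix, context: {"a": 0.5, "b": 0.5}  # toy model, not a real generator

label = assign_label(None, prior, rng)
loss = next_token_cross_entropy(["a", "b", "a"], {"label": label}, uniform)
```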

During adversarial training, we treat sequence generation as a sequential decision making problem [Yu et al.2017]. The generator acts as a reinforcement learning agent and is trained to maximize the expected rewards using policy gradients, where the rewards are feedback obtained from the discriminator and classifier for the generated sentences (see Sec. 3.5). For the implementation, we use a unidirectional multi-layer recurrent neural network with gated recurrent units as the base cell to represent the generator.

3.3 Discriminator

The discriminator D predicts whether a sentence is real (sampled from the training data) or fake (produced by the generator) by computing a probability score that the sentence is real. Like [Tuan and Lee2018], instead of computing the score only at the end of the sentence, the discriminator produces a score at every timestep, and these scores are then averaged to produce the overall score.

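$$D(x_{1:T}) = \frac{1}{T} \sum_{t=1}^{T} Q_D(x_{1:t-1}, x_t) \qquad (3)$$

Here $x_{1:T}$ is a sentence of $T$ tokens and $Q_D(x_{1:t-1}, x_t)$ is the intermediate score for timestep $t$, which is based solely on the preceding partial sentence. In a setup reminiscent of Q-learning, we consider $Q_D(x_{1:t-1}, x_t)$ to be the estimated value for the state $x_{1:t-1}$ and action $x_t$. Thus, the discriminator provides estimates for the true state-action values without the additional computational overhead of using MCTS rollouts.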
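The averaging step is simple enough to show directly. The per-timestep scores below are invented for illustration; in spamGAN they would come from the discriminator RNN.

```python
def overall_score(step_scores):
    """Average the per-timestep scores into one sentence-level score."""
    return sum(step_scores) / len(step_scores)

# Hypothetical scores for a four-token sentence that starts plausibly and degrades.
step_scores = [0.9, 0.8, 0.4, 0.3]
score = overall_score(step_scores)
last_only = step_scores[-1]  # what a final-timestep-only discriminator would report
```

Keeping the individual `step_scores` around, rather than only the final value, is what gives the generator a signal for every token instead of a single sparse reward.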

We train the discriminator like traditional GANs by maximizing the score for real sentences and minimizing it for fake ones. This is achieved by minimizing the loss ,

(4)
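As a hedged sketch, the objective described above can be written in the standard GAN form, which is an assumption here since the exact expression of Eqn. 4 is not reproduced: reward high scores on real sentences and low scores on fake ones. The scores are toy values.

```python
import math

def discriminator_loss(real_scores, fake_scores):
    """Standard GAN discriminator loss over overall sentence scores in (0, 1)."""
    real_term = -sum(math.log(s) for s in real_scores) / len(real_scores)
    fake_term = -sum(math.log(1.0 - s) for s in fake_scores) / len(fake_scores)
    return real_term + fake_term

# A discriminator that separates the two batches well has a lower loss
# than one that outputs 0.5 for everything.
good = discriminator_loss([0.9, 0.8], [0.1, 0.2])
poor = discriminator_loss([0.5, 0.5], [0.5, 0.5])
```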

We also include a discrimination critic [Konda and Tsitsiklis2000], which is trained to approximate the score from the discriminator network for the next token based on the preceding partial sentence. The approximated score will be used to stabilize policy gradient updates for the generator during adversarial training.

(5)

The critic is trained to minimize the sequence mean-squared error between its estimate and the actual score from the discriminator.

(6)

The discriminator network is implemented as a unidirectional Recurrent Neural Network (RNN) with one dense output layer which produces the probability that a sentence is real at each timestep. For the discrimination critic, we have an additional output dense layer (different from the one that computes the discriminator score) attached to the discriminator RNN, which estimates the critic value for each timestep.
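The critic objective amounts to a mean-squared error over the timesteps of a sequence. The sketch below uses invented numbers and a hypothetical function name; it only illustrates the form of the loss, not the network that produces the estimates.

```python
def critic_loss(critic_estimates, actual_scores):
    """Mean squared error between critic estimates and actual scores over a sequence."""
    pairs = list(zip(critic_estimates, actual_scores))
    return sum((v - q) ** 2 for v, q in pairs) / len(pairs)

# Toy two-step sequence: the critic guesses 0.5 at both steps.
loss = critic_loss([0.5, 0.5], [0.7, 0.3])
perfect = critic_loss([0.7, 0.3], [0.7, 0.3])
```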

3.4 Classifier

Given a sentence, the classifier predicts whether the sentence belongs to a given class. Like the discriminator, it assigns a prediction score at each timestep for the partial sentence, which identifies the probability that the sentence belongs to that class. The intermediate scores are then averaged to produce the overall score:

(7)

The classifier loss is based on: 1) the cross-entropy loss on true labeled sentences, computed using the overall classifier sentence score; 2) the loss for the fake sentences. Fake sentences are considered as potentially-noisy training examples, so we not only minimize the cross-entropy loss but also include the Shannon entropy of the classifier's prediction.

(8)

In the loss for fake sentences, the balancing parameter controls the impact of the Shannon entropy. Including this term, for minimum entropy regularization [Hu et al.2017], allows the classifier to predict classes for generated fake sentences more confidently. This is crucial in reinforcing the generator to produce sentences of the given class during adversarial training.
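A minimal sketch of the fake-sentence term follows. It assumes the entropy is added as a penalty weighted by a balancing parameter `beta`, which is the usual form of minimum entropy regularization; the exact form and sign convention of Eqn. 8 may differ, and the probabilities are toy values.

```python
import math

def shannon_entropy(probs):
    """Entropy of a predicted class distribution (natural log)."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def fake_sentence_loss(probs, label_index, beta):
    """Cross-entropy on the generator's intended label plus beta times the entropy.

    Assumption: entropy enters as a penalty, so confident predictions cost less.
    """
    cross_entropy = -math.log(probs[label_index])
    return cross_entropy + beta * shannon_entropy(probs)

confident = fake_sentence_loss([0.95, 0.05], 0, beta=1.0)
unsure = fake_sentence_loss([0.55, 0.45], 0, beta=1.0)
```

Under this assumption, both predictions pick the intended class, but the hesitant one is penalized more, which pushes the classifier towards decisive outputs on generated sentences.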

As with the discriminator, we include a classification critic to estimate the classifier score for the next token based on the preceding partial sentence,

(9)

The implementation of the classifier is similar to the discriminator. We use a unidirectional recurrent neural network with a dense output layer producing the predicted probability distribution over classes. The classification critic is also an alternative head off the classifier RNN, with an additional dense layer estimating the critic value for each timestep. We train this classification critic by minimizing the following loss,

(10)

3.5 Reinforcement Learning Component

We consider a sequential decision making framework, in which the generator acts as a reinforcement learning agent. The current state of the agent is the sequence of tokens generated so far. The action is the next token to be generated, which is selected based on the generator's stochastic policy. The reward the agent receives for the generated sentence of a given class is determined by the discriminator and classifier. Specifically, we take the overall scores of the discriminator (Eqn. 3) and the classifier (Eqn. 7) and blend them in a manner reminiscent of the F1 score, producing the sentence reward,

(11)
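One way to read "reminiscent of the F1 score" is as a harmonic mean of the two overall scores. The sketch below makes that assumption explicit; it is a plausible form of the reward rather than a confirmed transcription of Eqn. 11, and the function name is hypothetical.

```python
def sentence_reward(disc_score, clf_score):
    """Harmonic mean of the two scores, the same form as the F1 score (assumed)."""
    if disc_score + clf_score == 0.0:
        return 0.0
    return 2.0 * disc_score * clf_score / (disc_score + clf_score)

balanced = sentence_reward(0.8, 0.8)   # realistic and on-class
lopsided = sentence_reward(1.0, 0.1)   # realistic but wrong class
```

A harmonic mean is dominated by the smaller score, so a sentence that fools the discriminator but is classified into the wrong class earns little reward.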

This reward is for the entire sentence delivered during the final timestep, with reward for every other timestep being zero [Tuan and Lee2018]. Thus, the generator agent seeks to maximize the expected reward, given by,

(12)

To maximize the expected reward, the generator parameters are updated via policy gradients [Sutton et al.2000]. Specifically, we use the advantage actor-critic method to solve for the optimal policy [Konda and Tsitsiklis2000]. The expectation in Eqn. 12 can be re-written using rewards for intermediate timesteps from the discriminator and classifier. The intermediate scores from the discriminator and the classifier are combined as shown in Eqn. 13, and the combined values serve as estimators for the expected reward of the sentence. To reduce variance in the gradient estimates, we replace this estimate by the advantage function, i.e., the estimate minus the combined critic value, where the latter is also given by Eqn. 13. In Eqn. 14 we use a weighting factor to increase the importance of initially-generated tokens while updating the generator parameters. It is a linearly-decreasing factor which corrects the relative lack of confidence in the initial intermediate scores from the discriminator and classifier.
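The advantage and the linearly-decreasing weight can be sketched as follows. The schedule `(T - t) / T` is a hypothetical choice made only to have something concrete; the paper's actual factor in Eqn. 14 is not reproduced here, and the values are toy numbers.

```python
def weighted_advantages(q_values, v_values):
    """Return alpha(t) * (Q_t - V_t), with alpha decreasing linearly over timesteps."""
    T = len(q_values)
    weighted = []
    for t, (q, v) in enumerate(zip(q_values, v_values)):
        alpha = (T - t) / T  # hypothetical schedule: 1.0 at the first token, 1/T at the last
        weighted.append(alpha * (q - v))
    return weighted

# Same raw advantage at every step; only the weight changes.
adv = weighted_advantages([0.6, 0.6, 0.6], [0.5, 0.5, 0.5])
```

Subtracting the critic value leaves only how much better a token did than expected, which is the variance-reduction step described above.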

1 Input: Labeled dataset , Unlabeled dataset
2 Parameters: Network parameters
3 Perform pre-training as described in Sec. 3.6
4 for  do
5        for  do
6               sample batch of classes from
7               generate batch of fake sequences given
8               for  do
9                      compute , using Eqn. 13
10              update using policy gradient in Eqn. 14
11              
12       for  do
13               sample batch of real sentences from ,
14               Update using MLE in Eqn. 2
15              
16       for  do
17               sample batch of real sentences from ,
18               sample batch of fake sentences from
19               update discriminator using from Eqn. 4
20               compute for fake sentences
21               update using from Eqn. 6
22              
23       for  do
24               sample batch of real sentences-class pairs from
25               sample batch of fake sentence-class pairs from
26               update classifier using from Eqn. 8
27               compute on fake sentences
28               update using from Eqn. 10
29       
Algorithm 1 spamGAN
(13)

During adversarial training, we perform gradient ascent to update the generator using the gradient equation shown below,

(14)

Method                  10% labeled   30%           50%           70%           90%           100%
spamGAN-0%              0.700 ± 0.02  0.811 ± 0.02  0.838 ± 0.01  0.845 ± 0.01  0.852 ± 0.02  0.862 ± 0.01
spamGAN-50%             0.678 ± 0.03  0.797 ± 0.03  0.839 ± 0.02  0.845 ± 0.02  0.857 ± 0.02  0.856 ± 0.01
spamGAN-70%             0.695 ± 0.05  0.780 ± 0.03  0.828 ± 0.02  0.850 ± 0.01  0.841 ± 0.02  0.844 ± 0.02
spamGAN-100%            0.681 ± 0.02  0.783 ± 0.02  0.831 ± 0.01  0.837 ± 0.01  0.843 ± 0.02  0.845 ± 0.01
Base classifier         0.722 ± 0.03  0.786 ± 0.02  0.791 ± 0.02  0.829 ± 0.01  0.824 ± 0.02  0.827 ± 0.02
DRI-RCNN                0.647 ± 0.10  0.757 ± 0.01  0.796 ± 0.01  0.834 ± 0.18  0.835 ± 0.02  0.846 ± 0.01
RCNN                    0.538 ± 0.09  0.665 ± 0.14  0.733 ± 0.09  0.811 ± 0.03  0.834 ± 0.02  0.825 ± 0.02
Co-Train (Naive Bayes)  0.655 ± 0.01  0.740 ± 0.01  0.738 ± 0.02  0.743 ± 0.01  0.754 ± 0.01  0.774 ± 0.01
PU-Learn (Naive Bayes)  0.508 ± 0.02  0.713 ± 0.03  0.816 ± 0.01  0.826 ± 0.01  0.838 ± 0.02  0.843 ± 0.02

Table 1: Accuracy (Mean ± Std) for Different % Labeled Data

3.6 Pre-Training

Before beginning adversarial training, we pre-train the different components of spamGAN. The generator is pre-trained using maximum likelihood estimation (MLE) [Grover et al.2018] by updating its parameters via Eqn. 2. Once the generator is pre-trained, we take batches of real sentences from the labeled and unlabeled datasets and fake sentences sampled from the generator to pre-train the discriminator, minimizing the loss in Eqn. 4. The classifier is pre-trained solely on real sentences from the labeled dataset; it is trained to minimize the cross-entropy loss on real sentences and their labels. The two critic networks are trained by minimizing their losses (Eqn. 6 and Eqn. 10, respectively). Such pre-training addresses the problem of mode collapse [Guo et al.2018] to a satisfactory extent.

3.7 spamGAN algorithm

Alg. 1 describes spamGAN in detail. After pre-training, we perform adversarial training (Lines 4-28). We create a batch of fake sentences using the generator by sampling classes from the class prior (Lines 6-7). We compute the combined intermediate scores and critic values using Eqn. 13 for every timestep (Line 9). The generator is then updated using the policy gradient in Eqn. 14 (Line 10). This process is repeated for the number of generator iterations set in Line 5. Like [Li et al.2017], the training robustness is greatly improved when the generator is also updated using MLE via Eqn. 2 on sentences from the training data (Lines 12-14). We then train the discriminator using real sentences from the labeled and unlabeled sets as well as fake sentences from the generator (Lines 17-18). The discriminator is updated using Eqn. 4 (Line 19). We also train the discrimination critic by computing the discriminator scores for the fake sentences and updating the critic using Eqn. 6 (Lines 20-21). This process is repeated for the number of discriminator iterations set in Line 16. We perform a similar set of operations for the classifier (Lines 23-28).
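The overall schedule can be sketched as a loop over four phases per epoch. This is only a structural outline: the phase names, step counts and no-op updates are placeholders, and the real updates would apply the policy gradient, the MLE loss, and the discriminator and classifier losses described earlier.

```python
def adversarial_training(epochs, schedule, phases):
    """Run each phase for its number of steps, once per epoch, in the order given."""
    trace = []
    for _ in range(epochs):
        for name, n_steps in schedule:
            for _ in range(n_steps):
                phases[name]()       # one update of that component
                trace.append(name)
    return trace

# Placeholder updates that do nothing; hypothetical names and step counts.
phases = {name: (lambda: None) for name in
          ("generator_pg", "generator_mle", "discriminator", "classifier")}
schedule = [("generator_pg", 1), ("generator_mle", 1),
            ("discriminator", 2), ("classifier", 2)]
trace = adversarial_training(2, schedule, phases)
```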

Method                  10% labeled   30%           50%           70%           90%           100%
spamGAN-0%              0.718 ± 0.02  0.812 ± 0.02  0.840 ± 0.01  0.848 ± 0.02  0.854 ± 0.02  0.868 ± 0.01
spamGAN-50%             0.674 ± 0.05  0.797 ± 0.03  0.843 ± 0.01  0.848 ± 0.02  0.860 ± 0.02  0.863 ± 0.01
spamGAN-70%             0.702 ± 0.05  0.784 ± 0.03  0.830 ± 0.02  0.856 ± 0.01  0.848 ± 0.02  0.854 ± 0.01
spamGAN-100%            0.684 ± 0.03  0.788 ± 0.03  0.839 ± 0.02  0.844 ± 0.01  0.846 ± 0.02  0.850 ± 0.01
Base classifier         0.731 ± 0.03  0.795 ± 0.03  0.803 ± 0.02  0.829 ± 0.01  0.832 ± 0.02  0.838 ± 0.02
DRI-RCNN                0.632 ± 0.07  0.754 ± 0.02  0.779 ± 0.00  0.812 ± 0.03  0.817 ± 0.03  0.833 ± 0.02
RCNN                    0.638 ± 0.01  0.715 ± 0.01  0.754 ± 0.02  0.776 ± 0.05  0.820 ± 0.03  0.833 ± 0.02
Co-Train (Naive Bayes)  0.637 ± 0.02  0.698 ± 0.01  0.680 ± 0.02  0.677 ± 0.01  0.712 ± 0.01  0.726 ± 0.01
PU-Learn (Naive Bayes)  0.050 ± 0.02  0.636 ± 0.05  0.815 ± 0.02  0.837 ± 0.02  0.844 ± 0.02  0.852 ± 0.01

Table 2: F1-Score (Mean ± Std) for Different % Labeled Data

4 Experiments

We use the TripAdvisor labeled dataset [Ott et al.2011] (http://myleott.com/op-spam.html), consisting of 800 truthful reviews on Chicago hotels as well as deceptive reviews obtained from Amazon Mechanical Turk. We remove a small number of duplicate truthful reviews to get a balanced labeled dataset of 1596 reviews. We augment the labeled set with unlabeled TripAdvisor reviews for Chicago hotels (http://times.cs.uiuc.edu/~wang296/Data/index.html). All reviews are converted to lower-case and tokenized at word level, with a vocabulary of . The maximum sequence length is words, close to the median review length of the full dataset.

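The vocabulary also includes four special tokens: a start token, an end token, a padding token and an unknown-word token. The start and end tokens are added to the beginning and end of each sentence. Sentences shorter than the maximum sequence length are padded with the padding token while longer ones are truncated, ensuring a consistent sentence length. The unknown-word token replaces out-of-vocabulary words.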
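The preprocessing steps can be sketched as below. The token spellings (`<start>`, `<end>`, `<pad>`, `<unk>`), the whitespace tokenizer and the choice to keep the end token after truncation are assumptions made for illustration; the paper does not specify them here.

```python
START, END, PAD, UNK = "<start>", "<end>", "<pad>", "<unk>"  # hypothetical token names

def encode(review, vocab, max_len):
    """Lower-case, tokenize on whitespace, map unknown words, then pad or truncate."""
    words = review.lower().split()
    tokens = [w if w in vocab else UNK for w in words]
    # Reserve two positions for the start and end markers (a design choice of this sketch).
    tokens = [START] + tokens[: max_len - 2] + [END]
    return tokens + [PAD] * (max_len - len(tokens))

short = encode("Great hotel", {"great", "hotel"}, 6)
long = encode("a b c d e f g", set(), 5)
```

Every encoded review has exactly `max_len` tokens, which is what allows reviews to be batched together for the recurrent networks.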

In spamGAN, the generator consists of 2 GRU layers of 1024 units each and an output dense layer providing logits for the tokens. The generator, discriminator and classifier are trained using the ADAM optimizer. All use variational dropout= between recurrent layers and word embeddings with dimension . For the generator, learning rate = , weight decay = . Gradient clipping is set to a maximum global norm of . The discriminator contains 2 GRU layers of 512 units each and a dense layer with a single scalar output and sigmoid activation. The discrimination critic is implemented as an alternative dense layer. Learning rate = and weight decay =. The classifier is similar to the discriminator. We set balancing coefficient . The train time of spamGAN using a Tesla P4 GPU was hrs.

Figure 2: Comparison of spamGAN-50 with Other Approaches

We use a train-test split on labeled data. We compare spamGAN with supervised methods which use recurrent networks: 1) DRI-RCNN [Zhang et al.2018]; 2) RCNN [Lai et al.2015] as well as semi-supervised methods: 3) Co-Training [Li et al.2011] with Naive Bayes classifier; 4) PU Learning [Hernández et al.2013] with Naive Bayes (SVM performed poorly) using only spam and unlabeled reviews.

We conduct experiments with 10%, 30%, 50%, 70%, 90% and 100% of the labeled data. To analyze the impact of unlabeled data, we show different versions: spamGAN-0 (no unlabeled data), spamGAN-50 (50% unlabeled data), spamGAN-70 (70% unlabeled data) and spamGAN-100 (100% unlabeled data). Co-Train, PU-Learn results are for unlabeled data. We also show the performance of our base classifier (without the generator and discriminator, trained only on real labeled data to minimize the cross-entropy loss). All experiments are repeated times and the mean, standard deviation are reported.

4.0.1 Influence of Labeled Data

Table 1 shows the classification accuracy of the different models on the test set. spamGAN models, in general, outperform other approaches, especially when the % of labeled data is limited. When we merely use 10% of the labeled data, spamGAN-0, spamGAN-50, spamGAN-70 and spamGAN-100 achieve accuracies of 0.700, 0.678, 0.695 and 0.681, respectively, which are higher than those of the supervised approaches DRI-RCNN (0.647) and RCNN (0.538) as well as the semi-supervised approaches Co-Train (0.655) and PU-Learn (0.508). Even without any unlabeled data, spamGAN-0 gets good results because the mutual bootstrapping between generator and classifier allows the classifier to explore beyond the small labeled training set using the fake sentences produced by the generator. The accuracy of our base classifier is 0.722, higher than that of the spamGAN models, as GANs in general need more samples to train.

The accuracy of all approaches increases with the % of labeled data. We select spamGAN-50 as a representative for comparison in Fig. 2. Though the difference in accuracy between spamGAN-50 and the others reduces as the % of labeled data increases, spamGAN-50 still performs better than the others, with an accuracy of 0.856 when all labeled data are considered.

Table 2 shows the F1-score. We can again see that spamGAN-0, spamGAN-50 and spamGAN-70 perform better than the others, especially when the % of labeled data is small.

4.0.2 Influence of Unlabeled Data

While unlabeled data is used to augment the classifier’s performance, Fig. 3 shows that F1-score slightly decreases when the % unlabeled data increases, especially for spamGAN-100. In our case, as unlabeled data is much larger than the labeled, the generator does not entirely learn the importance of the sentence classes during pre-training (when the unlabeled sentence classes are randomly assigned), which causes problems for the classifier during adversarial training. However, when no unlabeled data is used, the generator easily learns to generate sentences conditioned on classes paving way for mutual bootstrapping between classifier and generator. We can also attribute the drop in performance to the difference in distribution of data between the unlabeled TripAdvisor reviews and the handcrafted reviews from Amazon MechanicalTurk.

Figure 3: Influence of Unlabeled Data on F1-Score

4.0.3 Perplexity of Generated Sentence

We also compute the perplexity of the sentences produced by the generator (the lower the value the better). Fig. 4 shows that as the % of unlabeled data increases (spamGAN-0 to spamGAN-100), the perplexity of the sentences decreases. SpamGAN-100, SpamGAN-70 achieve a perplexity of , respectively. Fig. 3, Fig. 4 show that using unlabeled data improves the generator in producing realistic sentences but does not fully help to differentiate between the classes which again, can be attributed to the difference in the data distribution between the labeled and unlabeled data.
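For reference, perplexity is the exponential of the mean negative log-probability that the model assigns to each token. The sketch below uses invented token probabilities and does not reproduce the paper's evaluation setup.

```python
import math

def perplexity(token_probs):
    """exp of the mean negative log-probability assigned to each token."""
    nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(nll)

# A model that is uniformly unsure among four tokens has perplexity 4.
uniform_four = perplexity([0.25, 0.25, 0.25, 0.25])
sharper = perplexity([0.5, 0.5])
vaguer = perplexity([0.1, 0.1])
```

Lower values mean the model found the sentence less surprising, which is why a decreasing perplexity is read as more realistic generation.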

The following is a sample (partial) spam sentence produced by the generator: “Loved this hotel but i decided to the hotel in a establishment didnt look bad …the palmer house was anyplace that others said in the reviews..”. We notice that spam sentences use a more conservative choice of words, focusing on adjectives, the reviewer, and attributes of the hotel, while non-spam sentences speak more about the trip in general.

Figure 4: Influence of Unlabeled Data on Perplexity

5 Conclusion and Future Work

We have proposed spamGAN, an approach for detecting opinion spam with limited labeled data. spamGAN, apart from detecting spam, helps to generate reviews similar to the training set. Experiments show that spamGAN outperforms state-of-the-art supervised and semi-supervised techniques when labeled data is limited. While we use the TripAdvisor dataset here, we plan to conduct experiments on YelpZip data (overcoming the data distribution issue of the Mechanical Turk reviews). As the overall spamGAN architecture is agnostic to the implementation details of the classifier, we plan to use a more sophisticated design for the classifier than a simple recurrent network.

References

  • [Akoglu et al.2013] Leman Akoglu, Rishi Chandy, and Christos Faloutsos. Opinion fraud detection in online reviews by network effects. In AAAI-ICWSM, 2013.
  • [Crawford et al.2015] Michael Crawford, Taghi M Khoshgoftaar, Joseph D Prusa, Aaron N Richter, and Hamzah Al Najada. Survey of review spam detection using machine learning techniques. Journal of Big Data, 2(1):23, 2015.
  • [Fedus et al.2018] William Fedus, Ian Goodfellow, and Andrew M Dai. Maskgan: Better text generation via filling in the _. In ICLR, 2018.
  • [Feng et al.2012] Song Feng, Ritwik Banerjee, and Yejin Choi. Syntactic stylometry for deception detection. In ACL, 2012.
  • [Goodfellow et al.2014] Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. Generative adversarial nets. In NIPS, 2014.
  • [Grover et al.2018] Aditya Grover, Manik Dhar, and Stefano Ermon. Flow-gan: Combining maximum likelihood and adversarial learning in generative models. In AAAI, 2018.
  • [Guo et al.2018] Jiaxian Guo, Sidi Lu, Han Cai, Weinan Zhang, Yong Yu, and Jun Wang. Long text generation via adversarial training with leaked information. In AAAI, 2018.
  • [Hernández et al.2013] Donato Hernández, Rafael Guzmán, Manuel Móntes y Gomez, and Paolo Rosso. Using pu-learning to detect deceptive opinion spam. In Workshop on computational approaches to subjectivity, sentiment and social media analysis, pages 38–45, 2013.
  • [Hu et al.2017] Zhiting Hu, Zichao Yang, Xiaodan Liang, Ruslan Salakhutdinov, and Eric P Xing. Toward controlled generation of text. arXiv preprint arXiv:1703.00955, 2017.
  • [Hub2018] Crowd Learning Hub. https://learn.g2crowd.com/customer-reviews-statistics. 2018.
  • [Huszár2015] Ferenc Huszár. How (not) to train your generative model: Scheduled sampling, likelihood, adversary? arXiv preprint arXiv:1511.05101, 2015.
  • [Jindal and Liu2008] Nitin Jindal and Bing Liu. Opinion spam and analysis. In WSDM, 2008.
  • [Jindal et al.2010] Nitin Jindal, Bing Liu, and Ee-Peng Lim. Finding unusual review patterns using unexpected rules. In CIKM, 2010.
  • [Konda and Tsitsiklis2000] Vijay R Konda and John N Tsitsiklis. Actor-critic algorithms. In NIPS, 2000.
  • [Kumar et al.2017] Abhishek Kumar, Prasanna Sattigeri, and Tom Fletcher. Semi-supervised learning with gans: manifold invariance with improved inference. In NIPS, 2017.
  • [Lai et al.2015] Siwei Lai, Liheng Xu, Kang Liu, and Jun Zhao. Recurrent convolutional neural networks for text classification. In AAAI, 2015.
  • [LeCun et al.2015] Yann LeCun, Yoshua Bengio, and Geoffrey Hinton. Deep learning. Nature, 521(7553):436, 2015.
  • [Li et al.2011] Fangtao Li, Minlie Huang, Yi Yang, and Xiaoyan Zhu. Learning to identify review spam. In IJCAI, volume 22, page 2488, 2011.
  • [Li et al.2014] Huayi Li, Bing Liu, Arjun Mukherjee, and Jidong Shao. Spotting fake reviews using positive-unlabeled learning. Computación y Sistemas, 18(3):467–475, 2014.
  • [Li et al.2015] Huayi Li, Zhiyuan Chen, Arjun Mukherjee, Bing Liu, and Jidong Shao. Analyzing and detecting opinion spam on a large-scale dataset via temporal and spatial patterns. In AAAI-ICWSM, 2015.
  • [Li et al.2017] Jiwei Li, Will Monroe, Tianlin Shi, Sébastien Jean, Alan Ritter, and Dan Jurafsky. Adversarial Learning for Neural Dialogue Generation. 2017.
  • [Li et al.2018] Yang Li, Quan Pan, Suhang Wang, Tao Yang, and Erik Cambria. A generative model for category text generation. Information Sciences, 450:301–315, 2018.
  • [Mukherjee et al.2013] Arjun Mukherjee, Vivek Venkataraman, Bing Liu, and Natalie Glance. What yelp fake review filter might be doing? In AAAI-ICWSM, 2013.
  • [Ott et al.2011] Myle Ott, Yejin Choi, Claire Cardie, and Jeffrey T. Hancock. Finding deceptive opinion spam by any stretch of the imagination. In ACL, 2011.
  • [Rayana and Akoglu2015] Shebuti Rayana and Leman Akoglu. Collective opinion spam detection: Bridging review networks and metadata. In KDD, 2015.
  • [Ren and Ji2017] Yafeng Ren and Donghong Ji. Neural networks for deceptive opinion spam detection: An empirical study. Information Sciences, 385:213–224, 2017.
  • [Sutton et al.2000] Richard S Sutton, David A McAllester, Satinder P Singh, and Yishay Mansour. Policy gradient methods for reinforcement learning with function approximation. In NIPS, 2000.
  • [Tuan and Lee2018] Yi-Lin Tuan and Hung-Yi Lee. Improving conditional sequence generative adversarial networks by stepwise evaluation. arXiv:1808.05599, 2018.
  • [Wang et al.2011] Guan Wang, Sihong Xie, Bing Liu, and Philip S Yu. Review graph based online store review spammer detection. In ICDM, 2011.
  • [Xu et al.2015] Yinqing Xu, Bei Shi, Wentao Tian, and Wai Lam. A unified model for unsupervised opinion spamming detection incorporating text generality. In AAAI, 2015.
  • [Yu et al.2017] Lantao Yu, Weinan Zhang, Jun Wang, and Yong Yu. Seqgan: Sequence generative adversarial nets with policy gradient. In AAAI, 2017.
  • [Zhang et al.2018] Wen Zhang, Yuhang Du, Taketoshi Yoshida, and Qing Wang. Dri-rcnn: An approach to deceptive review identification using recurrent convolutional neural network. Information Processing & Management, 54(4):576–592, 2018.