1. Introduction
Game developers can monetize their games by selling in-game ad placements to advertisers. In-game ads can be integrated into a game either as banners in the background or as commercials during breaks (e.g., when a certain part of the game is completed). There are four main elements in the game advertising ecosystem: publishers or developers, advertisers (demand; advertisers can themselves be publishers), advertising networks, and users (supply) (Mouawi et al., 2019). Game advertising networks connect advertisers with game developers and serve billions of ads to user devices, triggering an enormous number of ad events. For example, Unity Ads reports 22.9B+ monthly global ad impressions, reaching 2B+ monthly active end-users worldwide (https://www.businesswire.com/news/home/20201013005191/en/).
There are multiple types of ad events in the real world, e.g., request, start, view, click, and install. Each type stands for one specific kind of ad-related user action happening at a specific time. A complete ad life cycle can be depicted as a temporal sequence of ad events, each of which is a tuple of an event type and the corresponding time interval. Click and install are the two kinds of ad events most commonly associated with ad revenue: Pay-Per-Click (Kapoor et al., 2016) and Pay-Per-Install (Thomas et al., 2016) are the most widely used advertising models for pricing.
Naturally, as advertisers allocate more of their budgets to this ecosystem, more fraudsters tend to abuse the advertising networks and defraud advertisers of their money (Nagaraja and Shah, 2019). Fraudulent ad activities aimed at generating illegitimate ad revenues or unearned benefits are one of the major threats to these advertising models. Common types of fraudulent activity include fake impressions (Haider et al., 2018), click bots (Haddadi, 2010; Kudugunta and Ferrara, 2018), click farms (Oentaryo et al., 2014), etc. (Zhu et al., 2017a). Fraud in the advertising ecosystem is top of mind for advertisers, developers, and advertising networks. With their reputation and integrity on the line, advertising networks have focused a huge amount of effort on fraud detection (Jianyu et al., 2017; Dong et al., 2018; Mouawi et al., 2019; Nagaraja and Shah, 2019).
Studying the behavior of temporal sequences of ad events helps to identify the intrinsic hidden patterns committed by fake or malicious users in advertising networks. Given the massive ad activity data in game advertising networks, machine learning-based approaches have become popular in the industry.
However, it is not a straightforward task to train machine learning models directly on fraudulent and benign sequences collected from ad activities (Choi and Lim, 2020). The vast majority of ad traffic is non-fraudulent, and data labeling by human experts is time-consuming, which results in low availability of labeled fraud sequences and a high class imbalance between the fraud and non-fraud training data. Simply oversampling the minority fraud class can cause significant overfitting, while undersampling the majority non-fraud class may lead to information loss and yield a tiny training dataset (Ba, 2019). To mitigate this data availability problem, in this study we present a novel data generator that learns the intrinsic hidden patterns of sequential training data and generates high-quality emulated sequences.
The main contributions of our work can be summarized as follows:

We build a data generator that generates multi-type temporal sequences with non-uniform time intervals.

We present a new application of event-based sequence GANs for fraud detection in game advertising.

We propose a new way of training sequence GANs by employing a Critic network.
2. Related Work
Generative Adversarial Networks (GANs) (Goodfellow et al., 2014) have drawn significant attention as a framework for training generative models capable of producing synthetic data with desired structures and properties (Killoran et al., 2017). Ba (2019) proposed using GANs as an augmented oversampling method to generate data that mimics training data, and used the generated data to assist the classification of credit card fraud.
Despite the remarkable success of GANs in generating real-looking data, few studies focus on generating sequence data, because exploiting GANs to generate temporal sequences with intrinsic hidden patterns is considerably more challenging. Recurrent Neural Network (RNN) solutions have become the state-of-the-art methods for modeling sequential data. Hyland et al. developed a Recurrent Conditional GAN (RCGAN) to generate real-valued multi-dimensional time series, and then used the generated series for supervised training (Esteban et al., 2017). The time series in their study were physiological signals sampled at specific fixed frequencies, whereas ad event data have a higher complexity in terms of non-uniform time intervals and discrete event types, and thus cannot be modeled as wave signals. In ad event sequences, two events with a short time interval tend to be more correlated than events with larger time intervals. Killoran et al. (2017) proposed a GAN-based generative model for DNA along with an activation maximization technique for DNA sequence data. Their experiments showed that these generative techniques can learn important structure from DNA sequences and can be used to design new DNA sequences with desired properties (Killoran et al., 2017). Like the previous study, their focus is on fixed-interval sequences. Zheng et al. (2019) adopted an LSTM-Autoencoder to encode benign users into a hidden space, and proposed a One-Class Adversarial Network (OCAN) for the GAN training process. In their framework, the discriminator is trained as a classifier to distinguish benign users, while the generator produces samples that are complementary to the representations of benign users (Zheng et al., 2019). However, since OCAN is not trained on a malicious-users dataset, it is hard to measure the quality of the generated sequences and to understand the pattern they follow.

2.1. Time-LSTM
Common LSTM cells have shown remarkable success in generating complex sequences with long-range structure in numerous domains (Graves, 2013). Recently, a combination of an LSTM cell with a dimension-reducing symbolic representation was proposed to forecast time series (Elsworth and Güttel, 2020). However, RNN models usually consider only the order of events and ignore their time intervals; thus, these models are not suitable for processing non-uniformly distributed events generated in continuous time. This major drawback of traditional recurrent models led to the development of Phased LSTM (Neil et al., 2016), an LSTM variant for modeling event-based sequences. Neil et al. (2016) proposed adding a new time gate to the traditional LSTM cell. The time gate is controlled by a parametrized oscillation with three phases: it rises in the first phase, drops in the second phase, and remains inactive in the third. Xiao et al. (2017) proposed using an intensity function modulated synergistically by an RNN (Xiao et al., 2017). Further, the Time-Aware LSTM (T-LSTM) cell was proposed in (Baytas et al., 2017) to handle non-uniform time intervals in longitudinal patient records; the authors used the T-LSTM cell in an autoencoder to learn a single representation for sequential patient records. Zhu et al. proposed an LSTM variant named Time-LSTM to model users' sequential actions, in which LSTM cells are equipped with time gates to model time intervals (Zhu et al., 2017b). In their work, Time-LSTM was used for recommending items to users. We find Time-LSTM cells useful for modeling event types and time intervals because of their ability to capture the intrinsic patterns of fraudulent sequences, and thus to understand the internal mechanisms by which fraudulent activities are generated. We implemented our own version of Time-LSTM cells in Keras and used them in the architecture of the generators and discriminators of our GAN models.
2.2. GAN for Sequence Data
When generating continuous outputs, gradient updates can be passed from the discriminator to the generator. For discrete outputs, however, backpropagation does not work properly due to the lack of differentiability. Yu et al. (2017) addressed the issue of training GAN models to generate sequences of discrete tokens. They proposed a sequence generation framework called SeqGAN, which models the data generator as a stochastic parametrized policy in Reinforcement Learning (RL) (Yu et al., 2017). Their policy gradient employs Monte-Carlo (MC) search to approximate the state values, which is a computationally expensive process in the training loop. Moreover, SeqGAN is limited to generating discrete tokens; in our work, we propose a modified version of SeqGAN, combined with Time-LSTM cells, that can generate both discrete tokens and continuous time intervals. To train the policy network efficiently, we employ a Critic network that approximates the return of a partially generated sequence, which speeds up the training process. This approach also opens the possibility of using a trained Critic network for early fraud detection from partial sequences. Zhao et al. (2020) present an application of SeqGAN in recommendation systems. They address slow convergence and unstable RL training by using an Actor-Critic algorithm instead of MC rollouts (Zhao et al., 2020). Their generator produces entire recommended sequences given the interaction history, while the discriminator learns to maximize the score of ground-truth sequences and minimize the score of generated sequences. In each step, their generator generates a token by top-k beam search based on the model distribution, whereas in our work we sample directly from the output probability distribution over tokens. While our methodologies are close, we aim for different goals: we optimize the generated data to solve the sample-imbalance problem, while they optimize for better recommendations, so different evaluation metrics are needed. Our training strategies also differ: we use a Critic network as the baseline, whereas they use Temporal-Difference bootstrap targets; they pretrain the discriminator on generated data to reduce exposure bias, while we pretrain the discriminator on the actual training data to improve the metrics used in our experiments. More importantly, they do not include time intervals as an attribute in their model, whereas our model does.
Recently, Smith and Smith (2020) proposed using two generators: a convolutional generator that transforms a single random vector into an RGB spectrogram image, and a second generator that receives the 2D spectrogram images and outputs a time series (Smith and Smith, 2020). In our work, we find the RL training process a more natural way to address the issue of generating discrete outputs.

3. Methodology
Notation. In this paper, all sequences and sets are denoted by bold letters, e.g., s. We use |s| to refer to the size/length of a sequence or set s.
In this section, we introduce a new methodology to generate multi-type sequences using SeqGAN and Time-LSTM cells.
3.1. Definitions
An original ad event sequence of length T is composed of two subsequences: the subsequence of event types (e_1, …, e_T) and the subsequence of time stamps (t_1, …, t_T). First, we transform the time stamps into time intervals (Δt_1, …, Δt_T), with Δt_m = t_m − t_{m−1} and Δt_1 = 0. Then, we combine the event types and time intervals into a joint multi-type sequence s:

(1) s_{1:T} = ((e_1, Δt_1), (e_2, Δt_2), …, (e_T, Δt_T))

where s_{i:j} denotes a partial sequence from time step i to time step j.
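The transformation above can be sketched in plain Python; the event names in the usage example are hypothetical:

```python
def to_multitype_sequence(event_types, timestamps):
    """Combine event types and raw timestamps into a joint multi-type
    sequence of (event_type, time_interval) tuples, with the first
    interval set to 0 as in the definition above."""
    intervals = [0.0]
    for m in range(1, len(timestamps)):
        intervals.append(timestamps[m] - timestamps[m - 1])
    return list(zip(event_types, intervals))
```

For example, `to_multitype_sequence(["request", "click"], [3.0, 10.0])` yields `[("request", 0.0), ("click", 7.0)]`.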
3.2. Time-LSTM
In this paper, we adopt the Time-LSTM cell from (Zhu et al., 2017b). The update equations of this Time-LSTM cell are as follows:
(2) i_m = σ_i(x_m W_{xi} + h_{m−1} W_{hi} + w_{ci} ⊙ c_{m−1} + b_i)

(3) f_m = σ_f(x_m W_{xf} + h_{m−1} W_{hf} + w_{cf} ⊙ c_{m−1} + b_f)

(4) T_m = σ_t(x_m W_{xt} + σ(Δt_m W_{tt}) + b_t)

(5) c_m = f_m ⊙ c_{m−1} + i_m ⊙ T_m ⊙ σ_c(x_m W_{xc} + h_{m−1} W_{hc} + b_c)

(6) o_m = σ_o(x_m W_{xo} + Δt_m W_{to} + h_{m−1} W_{ho} + w_{co} ⊙ c_m + b_o)

(7) h_m = o_m ⊙ σ_h(c_m)
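As a concrete reading of the update equations above, a rough NumPy sketch of one cell step follows; the dictionary-based parameter naming and the shapes are our own illustrative choices, not the paper's implementation:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def time_lstm_step(x, dt, h_prev, c_prev, p):
    """One Time-LSTM step: x is the event-type embedding, dt the scalar time
    interval, h_prev/c_prev the previous hidden and cell states, and p a dict
    of weights keyed by our own (hypothetical) names."""
    i = sigmoid(x @ p["Wxi"] + h_prev @ p["Whi"] + p["wci"] * c_prev + p["bi"])
    f = sigmoid(x @ p["Wxf"] + h_prev @ p["Whf"] + p["wcf"] * c_prev + p["bf"])
    # Time gate: the interval dt modulates how much new content enters the cell.
    t = sigmoid(x @ p["Wxt"] + sigmoid(dt * p["wtt"]) + p["bt"])
    c = f * c_prev + i * t * np.tanh(x @ p["Wxc"] + h_prev @ p["Whc"] + p["bc"])
    o = sigmoid(x @ p["Wxo"] + dt * p["wto"] + h_prev @ p["Who"] + p["wco"] * c + p["bo"])
    h = o * np.tanh(c)
    return h, c
```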
x_m is the input feature vector at time step m, which in our case is the embedding of the event type, and Δt_m is the input time interval at time step m. i_m, f_m, T_m, o_m are the activations of the input, forget, time, and output gates, respectively. c_m and h_m are the cell activation and the hidden state, and c_{m−1} is the cell state of the previous time step m−1. σ_i, σ_f, σ_t, σ_o denote the sigmoid function, and σ_c, σ_h denote the tanh function. W_{xi}, W_{hi}, W_{xf}, W_{hf}, W_{xt}, W_{tt}, W_{xc}, W_{hc}, W_{xo}, W_{to}, W_{ho} and the biases b_i, b_f, b_t, b_c, b_o are the weight parameters of the cell. w_{ci}, w_{cf}, w_{co} are peephole parameters.

3.3. RL and Policy Improvement to Train GAN
We implement a modified version of the SeqGAN model to generate multi-type temporal sequences. Time-LSTM cells are utilized in our implementations of both the generator G and the discriminator D.
The sequence generation process of our generator G can be modeled as a sequential decision process in RL. The state S_m at time step m is defined as:

(8) S_m = (T_m, h_m, s_{1:m}; θ)

where T_m is the time gate activation in (4) and h_m is the hidden state in (7). s_{1:m} is the partial sequence at time step m, and θ denotes the trainable parameters of the Time-LSTM cell.
The action a_m at time step m is a combination of two parts, a_m = (a_m^e, a_m^Δt), where a_m^e is the action choosing the next event type e_{m+1} and a_m^Δt is the action choosing the next time interval Δt_{m+1}. Thus a new partial sequence can be formed step by step, until a complete sequence of length T, as described in (1), is constructed.
To make decisions in this sequence generation process, we employ a hybrid policy to represent action spaces with both continuous and discrete dimensions (similar to the idea in (Neunert et al., 2020)). This policy is designed to choose discrete event types and continuous time intervals, assuming their action spaces are independent. We then use a categorical distribution and a Gaussian distribution to model the policy distributions for the event types and the time intervals, respectively. So the hybrid generator policy can be defined as:

(9) π_θ(a_m | S_m) = p_θ(e_{m+1} | S_m) · N(Δt_{m+1} | μ_θ(S_m), σ_θ(S_m)²)

where e_{m+1} ∈ E, and E is the set of all possible event types.
When generating a new event type and time interval at each step, we follow the generator policy π_θ, sample from the categorical and normal distributions independently, and concatenate the results to obtain the action vector a_m, which we then append to the current partial sequence s_{1:m} to obtain a new partial sequence s_{1:m+1}. Once a complete sequence s_{1:T} of length T has been generated, we pass it to the discriminator D, which predicts the probability of the sequence being real rather than generated:

(10) D_φ(s_{1:T}) = p(real | s_{1:T})
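A minimal sketch of one sampling step of such a hybrid policy; the arguments stand in for hypothetical policy-network outputs, and clipping negative Gaussian draws to zero is our own assumption, since time intervals are non-negative:

```python
import random

def sample_action(event_probs, mu, sigma, rng=random.Random(0)):
    """Sample one hybrid action: an event-type index from a categorical
    distribution and a time interval from a Gaussian. Negative interval
    draws are clipped to zero (our own illustrative choice)."""
    events = list(range(len(event_probs)))
    e = rng.choices(events, weights=event_probs, k=1)[0]
    dt = max(0.0, rng.gauss(mu, sigma))
    return e, dt
```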
The feedback from D can be used in training so that G learns to generate sequences similar to the real training data in order to deceive D. Because the discrete data is not differentiable, gradients cannot be passed back to the generator as in image-based GANs.
The original SeqGAN training uses the Policy Gradient method with Monte-Carlo (MC) rollout to optimize the policy (Yu et al., 2017). To reduce variance in the optimization process, SeqGAN runs the rollout policy from the current state to the end of the sequence multiple times and averages the returns. Here, instead of MC rollout, we use an Actor-Critic method with a Critic network to estimate the value of any state, which is computationally more efficient (Bhatnagar et al., 2007). The Critic network models a state-dependent value V(S_m) for a partially generated sequence under policy π_θ, defined as the expected future return for a complete sequence as provided by the discriminator D:

(11) V^{π_θ}(S_m) = E[D_φ(s_{1:T}) | S_m, π_θ]
The value function parameters ψ are updated during training by minimizing the mean squared error between the true return D_φ(s_{1:T}) and the value estimate V_ψ(S_m):

(12) L(ψ) = E[(D_φ(s_{1:T}) − V_ψ(S_m))²]
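The critic update above can be sketched as a plain mean-squared-error computation (the function and argument names are ours):

```python
def critic_loss(returns, values):
    """MSE between the discriminator's returns for complete sequences and
    the critic's value estimates for the corresponding partial states."""
    n = len(returns)
    return sum((r - v) ** 2 for r, v in zip(returns, values)) / n
```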
The difference between them, A_m = D_φ(s_{1:T}) − V_ψ(S_m), is called the advantage function; it is used in training and helps to reduce variance.
The goal of training G is to choose actions under a policy that maximizes the expected return. The objective function of G follows the Policy Gradient method (Sutton and Barto, 2018), whose gradient can be derived as:

(13) ∇_θ J(θ) = E_{π_θ}[ Σ_m A_m ∇_θ log π_θ(a_m | S_m) ]
Because of the independence assumption we made, the policy gradient term can be decomposed into a categorical cross-entropy and a Gaussian log-likelihood as follows:

(14) log π_θ(a_m | S_m) = log p_θ(e_{m+1} | S_m) + log N(Δt_{m+1} | μ_θ(S_m), σ_θ(S_m)²)
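Under this independence assumption, the per-step policy loss can be sketched as follows; this is a hypothetical stand-in whose inputs mimic policy-network outputs, not the exact training code:

```python
import math

def policy_loss_step(event_probs, chosen_event, mu, sigma, dt, advantage):
    """Per-step policy-gradient loss: categorical cross-entropy for the
    chosen event type plus the Gaussian negative log-likelihood of the
    sampled interval, both weighted by the advantage."""
    ce = -math.log(event_probs[chosen_event])
    gnll = 0.5 * math.log(2 * math.pi * sigma ** 2) + (dt - mu) ** 2 / (2 * sigma ** 2)
    return advantage * (ce + gnll)
```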
The goal of training D is to distinguish generated sequences from true sequences in the training data (definitions of the training data and the positive and negative datasets are given in Section 4.1). D is updated by minimizing the binary cross-entropy loss.
We train G and D alternately. The pseudocode of the entire process is shown in Algorithm 1.
4. Data Experiments
Due to concerns about data privacy laws (e.g., the GDPR (General Data Protection Regulation) and the CCPA (California Consumer Privacy Act)), and to protect confidential details of the Unity Fraud Detection service, we decided not to use real-world ad event data and patterns for this study, both to avoid data privacy issues and to prevent fraudsters from reverse-engineering the presented algorithms and rules to circumvent fraud detection systems. Instead, we conduct our experiments on a synthetic dataset emulating real-world ad events.
4.1. Synthetic Dataset
We define the synthetic dataset as S, built over a set E of event types, where E is the set of hypothetical ad event types: PAD is reserved as the padding and end token, and INI is the dummy initial token marking the beginning of a sequence, which always comes with a zero initial time interval.
Each sequence in the synthetic dataset has a uniform length T, including the dummy initial step (INI, 0). For each of the following steps, the event type e_m is randomly sampled from the non-dummy types in E with equal probability, and the time interval Δt_m is sampled from a Chi-Square distribution whose degrees of freedom k_{e_m} are conditioned on e_m, i.e.:

(15) e_m ~ Uniform(E \ {INI, PAD})

(16) Δt_m ~ χ²(k_{e_m})
A complete synthetic sequence thus consists of the dummy initial step followed by T − 1 sampled (event type, time interval) pairs.
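This sampling scheme can be sketched as follows; the event-type names are hypothetical, and using the type's index as the Chi-Square degrees of freedom is our own illustrative assumption:

```python
import random

EVENT_TYPES = ["INI", "A", "B", "C", "D"]  # hypothetical names; INI is the dummy token

def chi_square(k, rng):
    """Draw from a Chi-Square distribution with k degrees of freedom
    (sum of k squared standard normals)."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k))

def sample_sequence(length, rng=random.Random(7)):
    """Sample one synthetic sequence: the dummy INI step with a zero interval,
    then uniformly chosen event types whose intervals are Chi-Square with
    degrees of freedom conditioned on the type."""
    seq = [("INI", 0.0)]
    for _ in range(length - 1):
        k = rng.randrange(1, len(EVENT_TYPES))  # uniform over non-dummy types
        seq.append((EVENT_TYPES[k], chi_square(k, rng)))
    return seq
```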
Then, we split the synthetic dataset S into a positive dataset S+ and a negative dataset S−, using a set of human-defined rules. These rules are variants of real-world rules observed in ad activities. In this study, we intentionally avoid using real patterns or rules that appear in real fraud detection work, in order to prevent potential information leakage to fraudsters.
The rules we defined are as follows:

A sequence starts with a particular event type.

There are more than three distinct types of events after the initial token, and at least one of them is of a required type.

Each event of one designated type is paired with one and only one previous event of another designated type; each such earlier event can be paired with at most one later event.

The total number of events of one type is greater than or equal to that of the next type, for three specified pairs of event types.

The time delay between any two consecutive events of the same type is no smaller than 10.

The time delay between any two paired events is no greater than 50.
If a sequence follows more than three of the six rules, it is classified as a positive sequence in S+; otherwise, it is classified as a negative sequence in S−.
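This split can be sketched as a simple predicate count; the rule predicates themselves are hypothetical stand-ins for the six rules above:

```python
def label_sequence(seq, rules):
    """Classify a sequence as positive when it satisfies more than three of
    the human-defined rules; `rules` is a list of boolean predicates."""
    satisfied = sum(1 for rule in rules if rule(seq))
    return "positive" if satisfied > 3 else "negative"
```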
The goal of GAN training is to teach the generator to learn the intrinsic human-defined patterns in the synthetic dataset S, and to generate sequences satisfying as many of the above-mentioned rules as possible. In a real-world application, those patterns can be hidden or unknown to human experts, but a GAN is expected to learn and reproduce patterns that are not intuitive to humans.
4.2. Evaluation Metric
In the last few years, several evaluation metrics for GANs have been introduced in the literature. Among them, the Fréchet Inception Distance (FID) (Heusel et al., 2017) has been used extensively (DeVries et al., 2019). However, a single metric is not enough to show the effectiveness of our training on multi-type sequences, because our sequences consist of a discrete categorical part (event types) and a continuous numerical part (time intervals). We propose using multiple metrics: a Rule-Based Quality (RBQ) score (to check whether the sequences follow our validity rules), the Mean Absolute Deviation (MAD) metric (to check that event types are diverse), and the Maximum Mean Discrepancy (MMD) (Fortet and Mourier, 1953) (to measure the dissimilarity between distributions of event types or time intervals), in addition to FID for time intervals. The arrows (↑/↓) next to each metric indicate the direction of improvement.
RBQ ↑. The quality of a generated sequence is measured by a metric derived from the six rules we defined in Section 4.1. The general intuition behind the RBQ score is that it is less probable for a generated sequence to follow multiple human-defined rules, so a sequence with more desired patterns deserves a higher quality score. Accordingly, in RBQ, rule combinations are weighted by their length, where an individual rule is considered a combination of length 1. The six rules are considered equally important in the calculation. We employ a geometric series with common ratio q for weighting. The RBQ score for a sequence s is defined as follows:
(17) RBQ(s) = Σ_{c ∈ C_s} q^{|c|}
where C_s is the set of all non-empty rule combinations that sequence s follows, c is one rule combination, and |c| is its length. For example, if a sequence follows rule i and rule j described in Section 4.1, then it contains 3 different rule combinations ({i}, {j}, and {i, j}) and thus yields an RBQ score of 2q + q².
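The RBQ computation can be sketched as a sum over the combinations of the satisfied rules; the common ratio q = 2 used as a default here is a hypothetical choice:

```python
from itertools import combinations

def rbq(satisfied_rules, q=2.0):
    """RBQ sketch: sum q**len(c) over every non-empty combination c of the
    rules that the sequence satisfies."""
    rules = sorted(satisfied_rules)
    return sum(q ** len(c)
               for r in range(1, len(rules) + 1)
               for c in combinations(rules, r))
```

For two satisfied rules, the combinations {i}, {j}, and {i, j} give q + q + q², matching the worked example above.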
MAD ↑. We propose using MAD to measure the statistical dispersion of the categorical part of the multi-type sequences, i.e., the event types, using dispersion as a proxy for the diversity of the generated event types. We one-hot encode the event types and compute the mean absolute deviation of each sequence from the median of all sequences; the median is known to be more robust to noise than the mean and better fits categorical values. Given the diversity oracle, we compare the MAD score of any batch of sequences against the MAD score of a batch of sequences sampled from our dataset as the comparison base. MAD can be computed as:

(18) MAD(B) = (1 / (n·T)) Σ_{i=1}^{n} Σ_{m=1}^{T} | e_m^{(i)} − med_m |

where B is a batch of sequences, n is the batch size, s^{(i)} is a sequence of length T in B, e_m^{(i)} is the one-hot-encoded event type at step m of s^{(i)}, and med_m is the median of the one-hot-encoded event types at step m across the batch B.
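A NumPy sketch of this MAD computation over a batch of one-hot-encoded event types:

```python
import numpy as np

def mad(batch_onehot):
    """MAD of one-hot-encoded event types: mean absolute deviation of each
    sequence from the per-step median across the batch.
    batch_onehot has shape (batch_size, seq_len, n_event_types)."""
    med = np.median(batch_onehot, axis=0)        # per-step median over the batch
    return float(np.mean(np.abs(batch_onehot - med)))
```

A batch in which every sequence has identical event types has MAD 0; more diverse batches score higher.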
FID ↓. We use FID to evaluate the numerical part of the multi-type sequences, i.e., the time intervals. It captures desirable properties such as the quality and diversity of the generated sequences, and performs well in terms of robustness and computational efficiency (Borji, 2019). The Fréchet distance between two Gaussians is defined as:

(19) d²((m, C), (m_w, C_w)) = ||m − m_w||² + Tr(C + C_w − 2(C C_w)^{1/2})

where (m, C) and (m_w, C_w) are the means and covariances of the samples from the real data distribution and the model distribution, respectively.
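Since the time intervals are scalar, the Fréchet distance reduces to its univariate case, which can be sketched as:

```python
import math

def fid_1d(x, y):
    """Fréchet distance between univariate Gaussians fitted to two samples
    (e.g. real vs. generated time intervals); the scalar case of Eq. (19)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    vx = sum((a - mx) ** 2 for a in x) / len(x)
    vy = sum((b - my) ** 2 for b in y) / len(y)
    return (mx - my) ** 2 + vx + vy - 2.0 * math.sqrt(vx * vy)
```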
MMD ↓. We also employ MMD to evaluate the time intervals. This measure computes the dissimilarity between two probability distributions p and q using samples drawn independently from each. We use an estimator with the Radial Basis Function (RBF) kernel k, which is:

(20) MMD²(X, Y) = (1/n²) Σ_{i,j} k(x_i, x_j) − (2/(n n′)) Σ_{i,j} k(x_i, y_j) + (1/n′²) Σ_{i,j} k(y_i, y_j)

where X = {x_1, …, x_n} and Y = {y_1, …, y_{n′}} are the samples from the two distributions.
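A sketch of a biased squared-MMD estimator with an RBF kernel for scalar samples; the bandwidth gamma is a hypothetical choice:

```python
import math

def mmd2_rbf(x, y, gamma=1.0):
    """Biased squared-MMD estimator with an RBF kernel for scalar samples
    (e.g. time intervals)."""
    def k_mean(a, b):
        return sum(math.exp(-gamma * (ai - bj) ** 2)
                   for ai in a for bj in b) / (len(a) * len(b))
    return k_mean(x, x) + k_mean(y, y) - 2.0 * k_mean(x, y)
```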
FIDH ↓. This metric is a variant of FID that views the Time-LSTM hidden state in D as a continuous multivariate Gaussian distribution; the FID score is then computed between two sets of hidden states using Eq. (19). Hidden states can be viewed as representations of the input sequence, so FIDH uses information from both the discrete part and the continuous part of our multi-type sequences.
4.3. Experiment Setup
We use the S+ and S− datasets defined in Section 4.1 for model training and evaluation; both datasets contain roughly the same number of data samples. As described in Algorithm 1, we first pretrain G and D until convergence, and then start RL training from the pretrained G and D. In this section, we compare the generated sequences from the following models:

G0: Generator with initial random model parameters.

G1: Generator pretrained using MLE.

G2: Generator trained by Algorithm 1.
We monitor the training process with the metrics defined in Section 4.2 to evaluate model performance during training; the resulting curves are plotted in Figure 1.
The ratio between G training steps and D training steps is held fixed. Both G and D use the same batch size and the SGD optimizer with the same learning rate. We save models during the training process and present the best-performing models for evaluation.
As shown in Figure 1, G0 is the randomly initialized model (blue line), G1 is the model pretrained by MLE (orange line), and G2 is the generator further trained using Algorithm 1 (green line). During pretraining, RBQ and MAD increase while FID and MMD decrease. After pretraining finishes and RL training starts, RBQ keeps increasing, and FID surges dramatically after a certain point (see Figure 1), where we stop training because the surge indicates a mode-collapsed generator. Notably, the FID score measured during training is calculated between a training batch sampled from the training data and an evaluation batch generated by G, both of the same size, which differs from the FID and FIDH scores reported in Table 1.
4.4. Experiment Results
To evaluate the performance of a generator after training, we use the trained generator to create test datasets; each model generates multiple batches of a fixed size. We then performed a two-sample t-test to compare each test dataset with samples of the same size drawn from S+ and S−.

Oracle scores. Table 1 shows the mean values of the different metrics over the generated batches. In particular, the MAD score, the FID score, and the FID score for hidden units (FIDH) are each calculated using data sampled from S+ as the base for comparison. The results are presented in Table 1.
Samp.  RBQ ↑  MAD ↑  FID ↓  MMD ↓  FIDH ↓

G0  14.7290  1.1114  9719.9854  0.1563  5.4784
G1  81.0501  1.3208  103.7343  0.0002  3.3819
G2  123.5881  1.2503  187.6541  0.0002  2.6775
S+  122.1438  1.2880  0.0000  0.0000  —
S−  12.2944  1.4534  64.6673  0.0002  —
The results demonstrate that the sequences generated by G2 have a significantly higher RBQ score than those generated by the MLE-pretrained generator G1 and the randomly initialized generator G0. The RBQ score of G2 is close to that of samples drawn from S+, which shows the high quality of these sequences. It indicates that the generator is able to learn the intrinsic patterns and rules of S+ during RL training, and then generate sequences that mimic these patterns to deceive the discriminator.
From the perspective of FID and MAD, G2 scores worse than the MLE-pretrained generator G1. As discussed in Section 4.2, FID evaluates the continuous distribution of the time intervals, and MAD measures the dispersion of the discrete event types in a sequence. In each sequence from the training datasets, the two features are intrinsically correlated; however, FID evaluates only the continuous part of the sequences and MAD only the discrete part, treating them as two independent distributions without attention to their internal correlation. As a result, although G1 does better than G2 on FID for time intervals and on MAD for event types, it has a lower RBQ score than G2, because the RBQ score is calculated on the joint distribution of time intervals and event types. The same logic applies to the FIDH score: G2 has a lower FIDH score than G1, because FIDH is calculated between the hidden-state representations of two sequence sets, which utilize information from both time intervals and event types. Moreover, MMD converges quickly during training, and G1 and G2 perform very similarly with regard to MMD.

4.5. Discussion
Most metrics, such as FID, yield only a one-dimensional score and fail to distinguish between different failure cases. Given that G1 and G2 perform similarly in Table 1, we propose using Precision and Recall for Distributions (PRD) (Sajjadi et al., 2018) to explain their differences. PRD compares a distribution Q to a reference distribution P; the intuition is that precision measures how much of Q can be generated by P, while recall measures how much of P can be generated by Q. Figure 2 illustrates the comparison of the PRD curves among G0, G1, and G2. We can interpret the differences between G1 and G2 from Figure 2: up to a certain recall level, one of the two generators produces sequences closer to the training data, while for higher desired recall levels the other enjoys higher precision.
We next examine the representations learned by the discriminator that underpin its successful performance. We use t-SNE (Maaten and Hinton, 2008) to visualize the output of the Time-LSTM cell; nearby points in the representation space receive similar rewards from the discriminator.
We generate sequences from each of G0, G1, and G2 and pass them to the discriminators trained alongside the corresponding generators. Figure 3 depicts the two-dimensional t-SNE embedding of the Time-LSTM cell outputs for each discriminator. The points are colored according to the predicted rewards from the discriminator, with darker colors meaning higher rewards. We can clearly see in Figure 3 that the discriminator trained with G2 has more points with relatively darker colors than the other two discriminators, which means it returns higher rewards for the generated sequences. On the other hand, the discriminator paired with G0 returns roughly the same value for all sequences (a single color), as it is a randomly initialized discriminator. The t-SNE embeddings can also be used for feature extraction on labeled data.
Radford et al. (2015) showed a way to build high-quality representations by training a GAN model and reusing parts of the generator and discriminator networks as feature extractors for other supervised tasks (Radford et al., 2015). This is a potentially promising future direction for this work.
5. Conclusions
In this paper, we have described, trained, and evaluated a SeqGAN methodology for generating artificial sequences that mimic fraudulent user patterns in ad traffic. We have additionally employed a variant of the Time-LSTM cell to generate synthetic ad events with non-uniform time intervals between events. As this task poses new challenges, we have presented a new solution that trains SeqGAN using a combination of MLE pretraining and a Critic network. The generator proposed in this paper is capable of generating multi-type temporal sequences with non-uniform time intervals, which is one of the novelties of our methodology. We have also proposed multiple criteria to measure the quality and diversity of the generated sequences. Through numerous experiments, we have found that the generated multi-type sequences have the desired properties.
Furthermore, we compared the performance of our generator under different settings against randomly sampled data from our training datasets. We conclude that the SeqGAN-trained generator outperforms generators pretrained using MLE alone, as measured by multiple criteria, including the RBQ and FIDH scores, which are appropriate for evaluating multi-type sequences.
6. Acknowledgments
The authors would like to thank Unity for the opportunity to work on this project during Unity's HackWeek 2020.
References
 Ba, 2019. Improving detection of credit card fraudulent transactions using generative adversarial networks. arXiv preprint arXiv:1907.03355.
 Baytas et al., 2017. Patient subtyping via time-aware LSTM networks. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 65–74.
 Bhatnagar et al., 2007. Natural-gradient actor-critic algorithms. Automatica.
 Borji, 2019. Pros and cons of GAN evaluation measures. Computer Vision and Image Understanding 179, pp. 41–65.
 Choi and Lim, 2020. Identifying machine learning techniques for classification of target advertising. ICT Express.
 DeVries et al., 2019. On the evaluation of conditional GANs. arXiv preprint arXiv:1907.08175.
 Dong et al., 2018. FraudDroid: automated ad fraud detection for Android apps. In Proceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, pp. 257–268.
 Elsworth and Güttel, 2020. Time series forecasting using LSTM networks: a symbolic approach. arXiv preprint arXiv:2003.05672.
 Esteban et al., 2017. Real-valued (medical) time series generation with recurrent conditional GANs. arXiv preprint arXiv:1706.02633.
 Fortet and Mourier, 1953. Convergence de la répartition empirique vers la répartition théorique. In Annales scientifiques de l'École Normale Supérieure, Vol. 70, pp. 267–285.
 Goodfellow et al., 2014. Generative adversarial nets. In Advances in Neural Information Processing Systems, pp. 2672–2680.
 Graves, 2013. Generating sequences with recurrent neural networks. arXiv preprint arXiv:1308.0850.
 Haddadi, 2010. Fighting online click-fraud using bluff ads. ACM SIGCOMM Computer Communication Review 40(2), pp. 21–25.
 Haider et al., 2018. An ensemble learning based approach for impression fraud detection in mobile advertising. Journal of Network and Computer Applications 112, pp. 126–141.
 Heusel et al., 2017. GANs trained by a two time-scale update rule converge to a local Nash equilibrium. In Advances in Neural Information Processing Systems, pp. 6626–6637.
 Jianyu et al., 2017. Fraud detection via coding nominal attributes. In Proceedings of the 2017 2nd International Conference on Multimedia Systems and Signal Processing, pp. 42–45.
 Kapoor et al., 2016. Pay-per-click advertising: a literature review. The Marketing Review 16(2), pp. 183–202.
 Killoran et al., 2017. Generating and designing DNA with deep generative models. arXiv preprint arXiv:1712.06148.
 Kudugunta and Ferrara, 2018. Deep neural networks for bot detection. Information Sciences 467, pp. 312–322.
 Maaten and Hinton, 2008. Visualizing data using t-SNE. Journal of Machine Learning Research 9 (Nov), pp. 2579–2605.
 Mouawi et al., 2019. Crowdsourcing for click fraud detection. EURASIP Journal on Information Security 2019(1), p. 11.
 Nagaraja and Shah, 2019. Clicktok: click fraud detection using traffic analysis. In Proceedings of the 12th Conference on Security and Privacy in Wireless and Mobile Networks, pp. 105–116.
 Neil et al., 2016. Phased LSTM: accelerating recurrent network training for long or event-based sequences. In Advances in Neural Information Processing Systems, pp. 3882–3890.
 Neunert et al., 2020. Continuous-discrete reinforcement learning for hybrid control in robotics. arXiv preprint arXiv:2001.00449.
 Oentaryo et al., 2014. Detecting click fraud in online advertising: a data mining approach. The Journal of Machine Learning Research 15(1), pp. 99–140.
 Radford et al., 2015. Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv preprint arXiv:1511.06434.
 Sajjadi et al., 2018. Assessing generative models via precision and recall. In Advances in Neural Information Processing Systems, pp. 5228–5237.
 Smith and Smith, 2020. Conditional GAN for time-series generation. arXiv preprint arXiv:2006.16477.
 Sutton and Barto, 2018. Reinforcement learning: an introduction. MIT Press.
 Thomas et al., 2016. Investigating commercial pay-per-install and the distribution of unwanted software. In 25th USENIX Security Symposium (USENIX Security 16), pp. 721–739.
 Xiao et al., 2017. Joint modeling of event sequence and time series with attentional twin recurrent neural networks. arXiv preprint arXiv:1703.08524.
 Yu et al., 2017. SeqGAN: sequence generative adversarial nets with policy gradient. In Thirty-First AAAI Conference on Artificial Intelligence.
 Zhao et al., 2020. Adversarial oracular seq2seq learning for sequential recommendation. In Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, IJCAI, pp. 1905–1911.
 Zheng et al., 2019. One-class adversarial nets for fraud detection. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 33, pp. 1286–1293.
 Zhu et al., 2017a. Fraud prevention in online digital advertising. Springer.
 Zhu et al., 2017b. What to do next: modeling user behaviors by Time-LSTM. In IJCAI, Vol. 17, pp. 3602–3608.