I Introduction
Modern vehicles are equipped with an increasing number of sensing devices, such as Global Positioning System (GPS) receivers, Inertial Measurement Units (IMUs), and other sensors that communicate through the Controller Area Network (CAN bus). This real-time sensed data can be used to detect, analyze, predict, and plan for a large variety of issues such as traffic congestion, vehicle energy consumption and emissions, urban mobility, and drivers’ behavior. Multiple approaches have been developed and applied to accurately identify driving behavioral patterns, such as driver recognition [1, 2], maneuver recognition [3, 4, 5], and aggressive driving detection [6]. While an accurate classification of driving behavior can contribute to a better driving experience for the driver, there are also other applications where such classification can be useful.
Recently, there has been a growing interest from car insurance companies in designing driver behavior classification systems that could eventually be used to relate their customers’ fees to how they drive. As a part of this solution, it is of interest to accurately classify the level of aggressiveness of their customers’ recorded trips. Nevertheless, the large number of trips makes it infeasible to identify the type of driving for each one. Consequently, several works, such as [4], [7], and [8], have approached this problem with unsupervised learning. In these works, the goal is to find clusters in the recorded trip data which can be characterised by different levels of aggressiveness, without relying on labels.
Since the labels, i.e., the driving style, remain a crucial element for applying a supervised algorithm, generating realistic artificial data can be an alternative to increase the size of the training or validation datasets and possibly improve the quality of the classification. Semi-supervised learning is motivated by the availability, in different applications, of large datasets with unlabeled features in addition to labeled ones [9], [10], [11]. This lack of labeled data can be efficiently addressed through a deep learning pipeline.
Another application of interest for driving behavior classification is the development of autonomous vehicles. A better understanding of how humans drive can allow both better functioning on a technical level and, of course, minimizing errors as much as possible, in view of the safety of the users. Identifying aggressive drivers is crucial in developing safer autonomous driving techniques and advanced driving assistance systems. This problem has been extensively studied over the past decades in several works [12], [13], [14], [15]. Current autonomous driving systems use a wide range of algorithms to process sensor data. Some works, such as [16], use end-to-end approaches to make navigation decisions from sensor inputs such as camera images, LIDAR data, etc. A variety of sensors can be useful for cars to extract important information, to improve the quality of autonomous vehicles, and to learn how to drive safely and efficiently. Nevertheless, data collection can be expensive and restricted in terms of privacy. Simulating data and exploiting it in the same context as real data appears as a solution worth studying. Attention to generative models is increasing due to their capability of modelling underlying patterns in multidimensional data. However, assessing the quality of the synthetic data remains a crucial point to validate.
In this paper, we formulate the problem of generating labeled IMU signals of one-minute length, representing aggressive and normal drivers on a specific part of the road, using Recurrent Conditional GANs. The generated data is practically assessed based on its capacity to improve the classification of the semi-supervised framework.
II Related work
Since obtaining real sensor data can be costly, time-consuming, and raise privacy issues, there have recently been several studies on sensor modelling for virtual testing, e.g. [17], [18], [19], [20], which are mostly based on parametric models. In [17], a non-parametric statistical model was developed allowing for the generation of sensor position output. In [18], a radar model is proposed where noise is added to the raw signals, and then filtering is applied to model the sensor output. Further, [19] proposed a Variational Autoencoder (VAE) approach in order to model the radar sensor output given some input vector, using object lists and spatial rasters. In [21], an Autoregressive Input-Output Hidden Markov Model (AIO-HMM) was proposed, fusing sensory streams through a linear transformation of features to synthesize real-valued time series describing sensor errors based on data describing the environment.
Generative Adversarial Networks (GANs) [22] have proven to perform well in generating different types of data. Different research works, from computer vision [23], [24], [25] to natural language processing [26], have shown that this kind of generative model can provide good results. In [27], a Recurrent Conditional Generative Adversarial Network (RCGAN) was proposed for modelling real-valued time series describing sensor outputs that are used in autonomous driving applications. In [28], the authors augmented LiDAR sensor data in simulated environments by employing CycleGANs.

Evaluating GANs is a challenging task. Unlike other deep learning models, which are trained with a loss function until convergence, a GAN generator is trained jointly with a discriminator that learns to distinguish between real and fake data. The generator and the discriminator are trained together to reach an equilibrium; hence, there is no objective loss function with which the GAN generator can be trained separately. Some rely on visual assessment, i.e., on obtaining appealing results that agree with the real distribution. This shows high potential for some data, especially images. For time series, however, visual inspection remains an inconsistent method, since it relies on manually inspecting each generated sample. By evaluating a suitable distance metric between the real and fake data distributions, we can assess the trained model and infer how well it captures temporal patterns. Quantitative measures, such as reconstruction loss, Kullback-Leibler (KL) divergence, and Jensen-Shannon (JS) divergence, can be combined with visual assessment to provide a robust evaluation of GAN models. A quantitative extrinsic approach, as in [28] and [29], is also an alternative; it mainly relies on an external method to measure the quality of the generated data.

III Contribution
This paper makes two main contributions to the field of driving behavior classification. First, it addresses the problem of data augmentation for car sensors. In our study, we generate IMU signals of one-minute length on a common portion of the road, characterised by the type of driving style, which is either aggressive or normal. We use Recurrent Conditional GANs for the generation of these labeled time series. Second, we build a framework to evaluate the quality of the generated data from a practical perspective. In other words, we assess the quality of the data based on the improvement of a semi-supervised model, which identifies the type of driving, when adding different percentages of synthesised data to the classifier’s training and/or validation sets. Consequently, the paper investigates how much data should be generated and in which set it should be used to improve the accuracy of the driving behavior classification.
IV Approach
In this section, we first present the experimental setting used to collect the labeled data. Then, we describe the data preprocessing tasks, followed by the generative model used to synthesize the multidimensional time series. Finally, we present the extrinsic assessment framework used to evaluate the generated data. The entire pipeline is shown in Fig. 1.
IV-A Experimental Setting
The dataset used in this paper was collected from a vehicle simulator. The experiment consisted of drivers driving separately, using different cars, on the same circuit, which is depicted in Fig. 2. The drivers had been asked to drive both in a normal and in an aggressive way; by doing so, we have close to a ground truth about which recorded trips are normal or aggressive. The simulator collected the same signals as a real IMU unit, i.e., longitudinal acceleration, lateral acceleration, pitch, yaw, and roll. All signals had the same sampling frequency, namely 1000 Hz. In total, the dataset consists of a number of simulation drives.
IV-B Data preprocessing
For computational reasons, we downsampled the IMU signals to 1 Hz, i.e., keeping one observation per second. Although this downsampling is done mostly for computational convenience, it is also likely that in practical applications the hardware will have a more limited sampling frequency than the simulator. Since the signals may contain artifacts, we filtered them by applying a moving average filter with a sliding window of ten samples. We limited our study to the first minute of each trip, both for computational reasons and because, in practical applications, it would be favorable to classify the driver without too much history. A similar choice of time window has previously been suggested in [6].
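As an illustration, the decimation and smoothing steps could be sketched as follows (plain Python; the ten-sample window and the 1000-to-1 Hz decimation follow the text, while the input signal itself is a synthetic stand-in):

```python
def downsample(signal, factor=1000):
    """Decimate a signal by keeping every `factor`-th sample (1000 Hz -> 1 Hz)."""
    return signal[::factor]

def moving_average(signal, window=10):
    """Smooth a signal with a sliding-window moving average filter."""
    smoothed = []
    for i in range(len(signal)):
        lo = max(0, i - window + 1)     # truncate the window at the start
        chunk = signal[lo:i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed

# Synthetic stand-in for one IMU channel recorded at 1000 Hz for one minute.
raw = [float(i % 7) for i in range(60 * 1000)]
one_hz = downsample(raw)            # 60 samples, one per second
filtered = moving_average(one_hz)
```

Note that the moving average here truncates the window at the start of the series; other boundary conventions (e.g., centered windows) would change the first few samples.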
All features were normalized using a Min-Max scaler [30]. Our dataset was split into the labeled data used for training the RCGAN and the unlabeled data used in the semi-supervised part. The RCGAN was trained only on a labeled dataset of trips, equally balanced between aggressive and normal.
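A minimal sketch of the Min-Max normalization, mapping each feature to [0, 1] (the feature values here are synthetic):

```python
def min_max_scale(values):
    """Rescale a feature to [0, 1] via (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:                 # constant feature: map everything to 0
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

feature = [2.0, 4.0, 6.0, 10.0]
scaled = min_max_scale(feature)   # -> [0.0, 0.25, 0.5, 1.0]
```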
IV-C Data generation
RCGANs were originally developed and implemented in [29] for medical applications. Our paper was inspired by this work to synthesize IMU signals for normal and aggressive trips. We start this section by giving a brief introduction to Recurrent Neural Networks (RNNs). Next, we introduce long short-term memory RNNs, an extension of the RNN framework. The subsection ends with a description of the RCGAN model, which uses long short-term memory RNNs.
IV-C1 Recurrent Neural Networks
RNNs are mostly used for sequential modeling and learning. They process one element of the input data at a time and implicitly store previous information using cyclic connections of hidden units. Given a sequence of vectors $x = (x_1, \dots, x_T)$, where $x_t \in \mathbb{R}^{d_x}$, the RNN outputs a representation, that is a sequence of vectors $h = (h_1, \dots, h_T)$, where $h_t \in \mathbb{R}^{d_h}$. The sequence is determined iteratively through:

$h_t = f(W_{hx} x_t + W_{hh} h_{t-1} + b_h)$,   (1)

where $W_{hx} \in \mathbb{R}^{d_h \times d_x}$, $W_{hh} \in \mathbb{R}^{d_h \times d_h}$, and $b_h \in \mathbb{R}^{d_h}$. The function $f$ is a nonlinear mapping and often chosen as $\tanh$, applied componentwise.
The output vector $y_t$ transforms the current hidden state in a way that depends on the final task. For classification, it is computed as

$y_t = \mathrm{softmax}(W_{yh} h_t + b_y)$.   (2)

Note that $W_{yh} \in \mathbb{R}^{d_y \times d_h}$ and $b_y \in \mathbb{R}^{d_y}$ are network parameters determined through gradient descent. The scalars $d_h$, $d_x$, and $d_y$ are the dimensions of the hidden layer, the input, and the output, respectively. For example, in the case of 2-category classification, $d_y = 2$ and the probability vector $y_t$ refers to the probabilities of each input element belonging to each category.

IV-C2 Long Short-Term Memory (LSTM)
In practice, vanilla RNNs encounter numerical computation difficulties. One reason, presented in [31], is that the gradient tends to vanish or explode when computing backpropagation through time on data with long-term dependencies; vanilla RNNs therefore only capture short-term dependencies. The Long Short-Term Memory (LSTM) technique was introduced to mitigate this risk. It incorporates a memory cell $c_t$ together with an input gate $i_t$, an output gate $o_t$, and a forget gate $f_t$. The memory cell enables the network to remember its state over time, making it possible for the full network to capture long-term temporal dependencies present in the training data. The evolution of the LSTM states is determined by:
$i_t = \sigma(W_i x_t + U_i h_{t-1} + b_i)$,   (3)
$f_t = \sigma(W_f x_t + U_f h_{t-1} + b_f)$,   (4)
$o_t = \sigma(W_o x_t + U_o h_{t-1} + b_o)$,   (5)
$c_t = f_t \odot c_{t-1} + i_t \odot \tanh(W_c x_t + U_c h_{t-1} + b_c)$,   (6)
$h_t = o_t \odot \tanh(c_t)$,   (7)
where $W_\bullet$, $U_\bullet$, and $b_\bullet$ are learnable parameters. The function $\sigma$ denotes the sigmoid activation function, applied elementwise. The quantities $i_t$, $f_t$, and $o_t$ stand for the input, forget, and output gates, respectively. The output of the LSTM cell is $h_t$, and $\odot$ denotes the pointwise vector product, i.e., the Hadamard product. Fig. 3 illustrates the learning mechanism of the LSTM cell.

IV-C3 Recurrent Conditional Generative Adversarial Networks
RCGANs are generative recurrent neural networks that aim at generating real-valued time series subject to conditional information. In the RCGAN architecture, two different LSTM-RNNs are trained simultaneously, a generator $G$ and a discriminator $D$, which have conflicting objectives. The generator learns over the training data, whereas the goal of the discriminator is to discriminate between the synthetic data generated by $G$ and the real data, as depicted in Fig. 4. We denote by $d_x$, $d_c$, and $d_z$ the feature dimensions of the data, the conditional information, and the latent/noise space, respectively. Let $T$ be the length of the time series.
In practice, the min-max game problem is described as:

$\min_G \max_D \; \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\big[\log D(x \mid c)\big] + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z \mid c) \mid c)\big)\big]$,   (8)

where $p_{\mathrm{data}}$ is the distribution of the real data and $p_z$ is a prior distribution over the input noise variables. The latter, i.e., the sequences of noise points $z_1, \dots, z_T$, are drawn independently from $p_z$.
In our case, the input consists of five-dimensional time series data, i.e., the five IMU signals per trip, with a binary condition attributing the type of driving (normal/aggressive). The length of all time series is equal to $T = 60$, corresponding to the one-minute window sampled at 1 Hz. For more details about the architecture of our trained RCGAN, see Table I. Our RCGAN generates an example from a specific class; in other words, if we ask for the aggressive class, the generator produces one aggressive trip. Thus, after training the model, the number of generated trips per class should be defined. We fed the RCGAN with the training set and then generated new trips from the data.
IV-D Data evaluation
In order to evaluate the quality of the RCGAN model, we used a semi-supervised framework to classify whether a trip is aggressive or normal. Firstly, we extracted statistical features from the real and fake data. Nine statistical features were calculated for each of the five time series to measure different properties of that variable, namely: mean, median, mode, standard deviation, skewness, kurtosis, 25th percentile, 75th percentile, and interquartile range. A further description of a few of these statistical features is given below.
We denote by $x = (x_1, \dots, x_n)$ a real-valued series, with $\mu$ and $\sigma$ its mean and standard deviation, respectively.
IV-D1 Mode
The mode is the most frequently appearing value in the series.
IV-D2 Skewness
Skewness is used to measure the asymmetry of the data. Let $m_k$ be the $k$th central moment, i.e.,

$m_k = \frac{1}{n} \sum_{i=1}^{n} (x_i - \mu)^k$.   (9)

The skewness is then calculated with the third moment as $m_3 / \sigma^3$.
IV-D3 Kurtosis
Kurtosis is used to measure the peakedness of the probability distribution of the data and is calculated as $m_4 / \sigma^4$, where $m_4$ is the fourth central moment.

IV-D4 Percentiles
A percentile is the value of a variable below which a certain percent of the observations fall. In other words, the $p$th percentile is a value such that at most $p\%$ of the measurements are less than this value and $(100-p)\%$ are greater.
IV-D5 Interquartile Range
Interquartile Range (IQR) is a measure of statistical dispersion. It is defined as the difference between the 75th and the 25th percentiles, called the upper and lower quartiles.
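The nine per-signal features listed above can be computed with standard-library tools alone. A sketch on a short synthetic series (percentiles use linear interpolation between ranks, which may differ slightly from other conventions; the population standard deviation is used so that the moment-based skewness and kurtosis are consistent):

```python
import statistics

def percentile(sorted_xs, p):
    """p-th percentile with linear interpolation between ranks."""
    k = (len(sorted_xs) - 1) * p / 100.0
    f = int(k)
    c = min(f + 1, len(sorted_xs) - 1)
    return sorted_xs[f] + (sorted_xs[c] - sorted_xs[f]) * (k - f)

def features(xs):
    """The nine statistical features extracted from one time series."""
    mu = statistics.mean(xs)
    sd = statistics.pstdev(xs)                       # population std. deviation
    m3 = sum((x - mu) ** 3 for x in xs) / len(xs)    # third central moment
    m4 = sum((x - mu) ** 4 for x in xs) / len(xs)    # fourth central moment
    s = sorted(xs)
    q25, q75 = percentile(s, 25), percentile(s, 75)
    return {
        "mean": mu,
        "median": statistics.median(xs),
        "mode": statistics.mode(xs),
        "std": sd,
        "skewness": m3 / sd ** 3,
        "kurtosis": m4 / sd ** 4,
        "p25": q25,
        "p75": q75,
        "iqr": q75 - q25,
    }

f = features([1.0, 2.0, 2.0, 3.0, 4.0])
```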
The unlabeled part of the dataset was used for training an Autoencoder (AE), see Fig. 1, in order to transfer its weights and biases to the DNN classifier. The Autoencoder is a neural network that aims to reconstruct its input, i.e., the target output is the input. It is composed of two main parts: an encoder that compresses the data into a lower-dimensional space, and a decoder that reproduces the input from the bottleneck. The AE is trained to minimize the error between the real input and the reconstructed one. More formally, let
$x \in \mathbb{R}^{d}$ be the input and $h$ the compressed representation, mapped by $f_\theta$,

$h = f_\theta(x) = \phi(Wx + b)$,   (10)

where $\phi$, $W$, and $b$ are respectively the activation function of the encoder, the weight matrix, and the bias vector. The function $f_\theta$ is parameterized by $\theta = \{W, b\}$. The decoder part reconstructs the input from the hidden representation $h$ by the function $g_{\theta'}$,

$\hat{x} = g_{\theta'}(h) = \psi(W'h + b')$,   (11)

where $\psi$, $W'$, and $b'$ are respectively the activation function of the decoder, the weight matrix, and the bias vector. $g_{\theta'}$ is parameterized by $\theta' = \{W', b'\}$.
Each training input vector $x^{(i)}$ is mapped to a corresponding $h^{(i)}$, which is then mapped to a reconstruction $\hat{x}^{(i)}$ such that $\hat{x}^{(i)} \approx x^{(i)}$. The parameters $\theta$ and $\theta'$ of the model are optimized to minimize the average reconstruction error,

$\theta^*, \theta'^* = \arg\min_{\theta, \theta'} \frac{1}{n} \sum_{i=1}^{n} L\big(x^{(i)}, \hat{x}^{(i)}\big)$,   (12)

with $L$ the loss function, given by $L(x, \hat{x}) = \lVert x - \hat{x} \rVert^2$.
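The objective in (12) reduces to a mean of per-sample squared errors. A minimal sketch of that computation (the input vectors and their reconstructions below are hypothetical stand-ins for encoder/decoder outputs):

```python
def reconstruction_error(x, x_hat):
    """Squared-error loss L(x, x_hat) = ||x - x_hat||^2 for one sample."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat))

def average_reconstruction_error(batch, reconstructions):
    """Objective minimized by the autoencoder: mean of per-sample losses."""
    losses = [reconstruction_error(x, xh) for x, xh in zip(batch, reconstructions)]
    return sum(losses) / len(losses)

# Hypothetical inputs and (imperfect) reconstructions.
batch = [[1.0, 0.0], [0.0, 1.0]]
recon = [[0.9, 0.1], [0.2, 0.8]]
err = average_reconstruction_error(batch, recon)
```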
After training the Autoencoder on the unlabeled dataset, i.e., the trips from the simulator that were not used to train the RCGAN, we use its weights and biases to initialize a supervised deep neural network (DNN) model, and then fine-tune the DNN using the labeled dataset to classify the type of driving. To measure how the generated data can improve classification, we ran several groups of experiments. In the first group, which is our baseline, the classifier was trained and validated using only the real labeled dataset. In the following groups, we used all combinations of training and validation sets containing labeled real/fake/real+fake datasets. All classifiers were trained only with the selected features.
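The experiment grid over training/validation compositions can be enumerated directly. A sketch (the set names R, F, and R+F follow Table III, and the fake-to-real ratios follow the text; the baseline uses real data only):

```python
from itertools import product

sets = ["R", "F", "R+F"]          # real, fake, and mixed labeled data
ratios = ["50%", "100%", "150%"]  # amount of fake data relative to real

# Baseline: train and validate on real data only, no fake admixture.
baseline = ("R", "R", "0%")

# All remaining (training, validation) compositions, each tried at every ratio.
experiments = [
    (train, val, ratio)
    for train, val in product(sets, sets)
    if (train, val) != ("R", "R")
    for ratio in ratios
]
```

The grid yields 8 set combinations at 3 ratios each, i.e., 24 experiments beyond the baseline, matching the rows of the results table.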
We evaluate the classifier’s performance by measuring the Area Under the Receiver Operating Characteristic curve (AUROC). This criterion is one of the most widely used metrics to score the goodness of a predictor in a binary classification task. It ranges in value from 0 to 1; the higher the AUROC, the better the classifier is at predicting the classes, which in our case is the type of driving. The AUROC is computed on the test set containing all the real data.
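AUROC can be computed without external libraries from its rank interpretation: it equals the probability that a randomly chosen positive example is scored above a randomly chosen negative one. A sketch with hypothetical classifier scores:

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparisons (ties count 1/2)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical scores: label 1 = aggressive, label 0 = normal.
y_true = [1, 1, 0, 0]
y_score = [0.9, 0.4, 0.6, 0.1]
score = auroc(y_true, y_score)   # one misordered pair out of four -> 0.75
```

The quadratic pairwise loop is fine for small test sets; for large ones, a rank-sum formulation is the usual choice.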
V Results and Discussion
Fig. 5 depicts a recorded trip and a generated trip, both labeled as normal. The figure illustrates how the RCGAN was able to grasp the correlation between the different signals of a normal trip, as well as the main patterns. In order to investigate the quality of the generated fake data and see whether it can be useful on a practical level, we applied our semi-supervised framework as an extrinsic evaluation. The generated fake data were used in both the training and the validation sets of the classifier. All combinations of real and generated fake data are covered in Table II.
We ran the experiments 200 times. After each trained RCGAN, we generated different amounts of data and utilized them in the validation or training set of our classifier. In a portion of the 200 runs, the RCGAN reached an AUROC strictly higher than the baseline value for at least one combination of real and fake data. Table II shows the performance of the classifier of the semi-supervised framework, trained and validated on different sets consisting of combinations of real and fake data, for a simulation outperforming the baseline. AUROC is measured on the test set, which contains all the real trips. Bold depicts AUROC values superior to the baseline. We can see that for most simulations the AUROC exceeds the baseline, for a variety of sets and ratios of real and generated fake data.
RCGAN
Learning rate  0.001
Batch size  1
Number of epochs  5000
Generator optimizer  ADAM
Discriminator optimizer  Gradient Descent
Generator rounds  1
Discriminator rounds  1
RNNs hidden units  100
Latent dimensions  25
Smooth rate  0.1

Autoencoder
Number of epochs  100
Hidden layers  [100, 50, 100]
Activation function

Classifier
Hidden layers  [100, 50, 100]
Learning rate^{1}  0.001, 0.01, 0.1
Number of epochs^{1}  100, 200, 500
Activation function^{1}  tanh, maxout, rectifier

^{1} Grid search was based on these parameters for the classifier.
Since we varied the percentage of real and generated fake data in both the training and validation sets of the classifier, it is of interest to know how much generated fake data is needed and how it should be utilized by the classifier. Table III summarizes the simulations which outperform the baseline, i.e., the number of recorded AUROCs that exceed the baseline value, for each combination set and fake ratio.
Training Set  Validation Set  Ratio Fake/Real  AUROC^{1}
R^{2}  R  0%  0.823
R + F^{3}  R + F  50%  0.858
R + F  R + F  100%  0.851
R + F  R + F  150%  0.805
R  F  50%  0.846
R  F  100%  0.774
R  F  150%  0.845
F  R  50%  0.799
F  R  100%  0.858
F  R  150%  0.851
R + F  R  50%  0.851
R + F  R  100%  0.832
R + F  R  150%  0.844
F  F  50%  0.776
F  F  100%  0.846
F  F  150%  0.833
R  R + F  50%  0.841
R  R + F  100%  0.851
R  R + F  150%  0.813
F  R + F  50%  0.841
F  R + F  100%  0.805
F  R + F  150%  0.805
R + F  F  50%  0.849
R + F  F  100%  0.805
R + F  F  150%  0.841

^{1} This measure is computed on the test set, containing all the real trips.
^{2} The set R consists of the real trips which were used to train the RCGAN.
^{3} The set F consists of the fake data generated by the RCGAN.
At first glance, we can divide Table III into three groups, the first containing the combination sets with the highest total counts. Training on the real data while validating on both the real and the fake data seems to be the best option to ensure a better classification.
Training Set  Validation Set  Counts^{1} (50% / 100% / 150%)  Mean AUROC  SD
R+F  R+F  21 / 14 / 14  0.834  0.009
R  F  76 / 76 / 86  0.843  0.007
F  R  3 / 1 / 2  0.833  0.004
R+F  R  24 / 14 / 15  0.833  0.008
F  F  0 / 1 / 0  0.830  –
R  R+F  106 / 101 / 104  0.840  0.008
F  R+F  2 / 0 / 1  0.835  0.003
R+F  F  12 / 26 / 17  0.835  0.009

^{1} The recorded number for each set combination and percentage of fake data.
Training on the real data and validating only on the fake data can also be a good way to use the generated data. The second group contains the following combinations: training on real and fake data while validating on real data; training and validating on both fake and real data; and training on real and fake data while validating on fake data. This group is characterised by a lower number of records compared to the first one, which underlines the fact that incorporating the fake data in the training set is less likely to improve the classifier. The third group contains the remaining combinations, characterised by a training set composed only of fake data. The negligible counts of this group exclude the possibility of using only the fake data to improve the classifier’s accuracy. This result can be justified by the fact that the data generation is based on the real data; therefore, substituting the content of the classifier’s training set from real to fake data does not guarantee an improvement. The generative model had to learn from the real data in order to produce new samples close enough to the original, yet still different.
Fig. 6 shows that the classifier can still perform reasonably by training merely on the generated fake data. By training and validating on the fake data, we obtain an AUROC slightly lower than the baseline, which still yields a good prediction of the type of driving.
The first group also reached the highest average AUROC among its elements, compared to the other groups. Consequently, we capture the importance of incorporating the fake data only into the validation set.
On the other hand, we want to see whether the size of the generated data affects the performance of the classifier. Since we know from the previous results in which combination sets the fake data is worth using, we limited the scope to the first group, which trains the classifier only on the real data. We can see in Table III that increasing the fake ratio gives, overall, higher chances to improve the model. In this case, it means that synthesising more data than the size of the original dataset can give a better classification of driving behavior.
VI Conclusion
In this paper, we outlined our experiences of using Recurrent Conditional GANs for generating IMU signals, which are assessed by the improvement of a semi-supervised framework classifying the type of driving. The classification, applied on the extracted features of the real and synthetic data, was mostly improved by using the latter in the validation set. The two main contributions of this work are the generation of IMU signals and the quantitative extrinsic assessment of the synthetic data using a deep learning based approach. For future research, we plan to investigate how the parameters of the RCGAN can be improved, with the aim of finding the most convenient network architecture to ensure a close-to-optimal classification given the limited amount of labeled data.
References
 [1] S. Choi, J. Kim, D. Kwak, P. Angkititrakul, and J. H. Hansen, “Analysis and classification of driver behavior using in-vehicle CAN-bus information,” in Biennial Workshop on DSP for In-Vehicle and Mobile Systems, 2007, pp. 17–19.
 [2] J. M. McNew, “Predicting cruising speed through data-driven driver modeling,” in 2012 15th International IEEE Conference on Intelligent Transportation Systems. IEEE, 2012, pp. 1789–1796.
 [3] N. Oliver and A. P. Pentland, “Driver behavior recognition and prediction in a smartcar,” in Proc. SPIE Int. Soc. Opt. Eng., vol. 4023. Citeseer, 2000, pp. 280–290.
 [4] M. Brambilla, P. Mascetti, and A. Mauri, “Comparison of different driving style analysis approaches based on trip segmentation over GPS information,” in 2017 IEEE International Conference on Big Data (Big Data). IEEE, 2017, pp. 3784–3791.
 [5] M. Enev, A. Takakuwa, K. Koscher, and T. Kohno, “Automobile driver fingerprinting,” Proceedings on Privacy Enhancing Technologies, vol. 2016, no. 1, pp. 34–50, 2016.
 [6] J. Carmona, F. García, D. Martín, A. Escalera, and J. Armingol, “Data fusion for driver behaviour analysis,” Sensors, vol. 15, no. 10, pp. 25968–25991, 2015.
 [7] U. Fugiglando, E. Massaro, P. Santi, S. Milardo, K. Abida, R. Stahlmann, F. Netter, and C. Ratti, “Driving behavior analysis through CAN bus data in an uncontrolled environment,” IEEE Transactions on Intelligent Transportation Systems, vol. 20, no. 2, pp. 737–748, 2018.
 [8] Y. Ma, Z. Zhang, S. Chen, Y. Yu, and K. Tang, “A comparative study of aggressive driving behavior recognition algorithms based on vehicle motion data,” IEEE Access, vol. 7, pp. 8028–8038, 2019.
 [9] M. Bahi and M. Batouche, “Deep semi-supervised learning for DTI prediction using large datasets and H2O-Spark platform,” in 2018 International Conference on Intelligent Systems and Computer Vision (ISCV). IEEE, 2018, pp. 1–7.

 [10] R. Raina, A. Battle, H. Lee, B. Packer, and A. Y. Ng, “Self-taught learning: Transfer learning from unlabeled data,” in Proceedings of the 24th International Conference on Machine Learning, ser. ICML ’07. New York, NY, USA: ACM, 2007, pp. 759–766.
 [11] P. Zhang, X. Zhu, and L. Guo, “Mining data streams with labeled and unlabeled training examples,” in 2009 Ninth IEEE International Conference on Data Mining. IEEE, 2009, pp. 627–636.
 [12] J. Wei, J. M. Snider, T. Gu, J. M. Dolan, and B. Litkouhi, “A behavioral planning framework for autonomous driving,” in 2014 IEEE Intelligent Vehicles Symposium Proceedings. IEEE, 2014, pp. 458–464.
 [13] H. Horii, “Modifying autonomous vehicle driving by recognizing vehicle characteristics,” Aug. 15, 2017, US Patent 9,731,713.
 [14] T. Al-Shihabi and R. R. Mourant, “A framework for modeling human-like driving behaviors for autonomous vehicles in driving simulators,” in Proceedings of the Fifth International Conference on Autonomous Agents. ACM, 2001, pp. 286–291.
 [15] M. Bojarski, D. Del Testa, D. Dworakowski, B. Firner, B. Flepp, P. Goyal, L. D. Jackel, M. Monfort, U. Muller, J. Zhang, et al., “End to end learning for self-driving cars,” arXiv preprint arXiv:1604.07316, 2016.
 [16] E. Cheung, A. Bera, E. Kubin, K. Gray, and D. Manocha, “Classifying driver behaviors for autonomous vehicle navigation,” 2018.
 [17] N. Hirsenkorn, T. Hanke, A. Rauch, B. Dehlink, R. Rasshofer, and E. Biebl, “Virtual sensor models for real-time applications,” Advances in Radio Science, vol. 14, pp. 31–37, 2016.
 [18] S. Bernsteiner, Z. Magosi, D. Lindvai-Soos, and A. Eichberger, “Radar sensor model for the virtual development process,” ATZelektronik worldwide, vol. 10, no. 2, pp. 46–52, 2015.
 [19] T. A. Wheeler, M. Holder, H. Winner, and M. J. Kochenderfer, “Deep stochastic radar models,” in 2017 IEEE Intelligent Vehicles Symposium (IV). IEEE, 2017, pp. 47–53.
 [20] N. Hirsenkorn, H. Kolsi, M. Selmi, A. Schaermann, T. Hanke, A. Rauch, R. Rasshofer, and E. Biebl, “Learning sensor models for virtual test and development,” in Workshop Fahrerassistenz und automatisiertes Fahren, 2017.
 [21] E. L. Zec, N. Mohammadiha, and A. Schliep, “Statistical sensor modelling for autonomous driving using autoregressive input-output HMMs,” in 2018 21st International Conference on Intelligent Transportation Systems (ITSC). IEEE, 2018, pp. 1331–1336.
 [22] I. J. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, “Generative adversarial networks,” 2014.
 [23] A. Bansal, S. Ma, D. Ramanan, and Y. Sheikh, “Recycle-GAN: Unsupervised video retargeting,” in Proceedings of the European Conference on Computer Vision (ECCV), 2018, pp. 119–135.
 [24] B. Dai, S. Fidler, R. Urtasun, and D. Lin, “Towards diverse and natural image descriptions via a conditional GAN,” in Proceedings of the IEEE International Conference on Computer Vision, 2017, pp. 2970–2979.

 [25] A. Gupta, J. Johnson, L. Fei-Fei, S. Savarese, and A. Alahi, “Social GAN: Socially acceptable trajectories with generative adversarial networks,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 2255–2264.
 [26] Z. Xu, B. Liu, B. Wang, S. Chengjie, X. Wang, Z. Wang, and C. Qi, “Neural response generation via GAN with an approximate embedding layer,” in Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 2017, pp. 617–626.
 [27] H. Arnelid, “Sensor modelling with recurrent conditional GANs: Recurrent conditional generative adversarial networks for generating artificial real-valued time series,” Master’s Thesis, 2018. https://hdl.handle.net/20.500.12380/256175.
 [28] A. E. Sallab, I. Sobh, M. Zahran, and N. Essam, “LiDAR sensor modeling and data augmentation with GANs for autonomous driving,” arXiv preprint arXiv:1905.07290, 2019.
 [29] C. Esteban, S. L. Hyland, and G. Rätsch, “Real-valued (medical) time series generation with recurrent conditional GANs,” arXiv preprint arXiv:1706.02633, 2017.
 [30] S. Patro and K. K. Sahu, “Normalization: A preprocessing stage,” arXiv preprint arXiv:1503.06462, 2015.
 [31] Y. Bengio, P. Simard, P. Frasconi, et al., “Learning long-term dependencies with gradient descent is difficult,” IEEE Transactions on Neural Networks, vol. 5, no. 2, pp. 157–166, 1994.