Diverse and complex scenarios involving surrounding vehicles pose a major challenge for fully autonomous driving because of environmental uncertainty. Classifying complex scenarios and then designing a policy for each class separately seems an easy way to overcome this challenge, but the flood of available data can overwhelm human insight and analysis, since prior knowledge of complex driving scenarios is limited. End-to-end learning methods [2] do not need to fully recover the internal relationships of the interaction policies among multiple vehicles, but they require large amounts of high-quality data and fail when they encounter new scenarios that never appeared in the training dataset. On the other hand, some researchers resort to reinforcement learning because of its ability to learn from failures in new environments through exploration and exploitation. However, it would have to try all possible scenarios before it could drive successfully in any existing or forthcoming scenario. Sufficient testing in all possible scenarios is required for end-users.
There exist some public databases, but most of them do not contain multi-vehicle interaction data, such as the GPS trajectories and dynamic states of the vehicles, because of cost and technical limitations; collecting such data is also excessively time- and resource-consuming, and dangerous as well. An alternative is to develop an efficient model that generates, from the limited data on hand, new scenarios that are statistically analogous to those in the real world (Fig. 1). This procedure consists of two stages: first projecting the encounter trajectories into a disentangled space, and then generating new trajectories by sampling from this space.
For generating new samples, one suitable candidate is the generative model, for example Generative Adversarial Networks (GANs), which have been applied to image style transformation and face reconstruction [6, 7]. The Variational Autoencoder (VAE), another class of generative models, can control the characteristics of generated samples more explicitly than GANs owing to significant theoretical improvements [9, 10]. On the other hand, the convolutional neural network (CNN) is well suited to images but unsuitable for time-series data. To that end, combining recurrent neural networks (RNNs) with a GAN or VAE provides a practically tractable way to deal with time series. Most such work is in natural language processing (NLP), for example machine translation and image captioning. In these methods, Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) are usually used, since they selectively remember and forget past states.
Some existing work uses the aforementioned generative models with supervised methods to predict the spatiotemporal trajectories of human movements [15, 16, 17, 18, 19] or vehicle behavior [20, 21], given a specific trajectory. However, this approach is not applicable here because of the difficulty of modeling all moving objects in a scenario. To make the generated trajectory reasonable, the potential trajectories of nearby objects must be considered simultaneously.
Supervised learning can extract the features of interaction; however, its transformation ability is limited to the data space. Since unsupervised learning can extract intrinsic features and reconstruct scenes of multi-agent interactions, in this paper we develop an unsupervised learning framework (Fig. 2) that regenerates multi-trajectory time-series data to address the above issues. An end-to-end system is built to extract interpretable representations of driving encounters by combining an encoder (green) with a bidirectional GRU (purple). For the decoder (red), we implement two branches to process the multiple sequences separately. These sequences interact with each other through hidden states that contain information about the former samples: a hidden state of one sequence is taken as part of the input to the next state of the other sequence. In summary, the main contributions of this paper are threefold:
- We use a VAE to extract representations of driving encounters, and then generate the interaction trajectories of two vehicles by sampling from these representations.
- We propose an interactive structure that generates trajectories consistent with the spatiotemporal characteristics of real traffic trajectories.
- We develop a new disentanglement metric to comprehensively analyze and compare generative models with regard to their robustness.
The remainder of this paper is organized as follows. Section II introduces related work. Section III presents our developed methods. Section IV details the experiment procedure. Section V discusses and analyzes the experiment results. Finally, the conclusion and future work are given in Section VI.
II Related Works
II-A Generative Models
VAE and GAN have been widely used to construct generative models. VAE introduces the idea of variational inference into neural networks to calculate the posterior probability, while GAN uses the antagonism of game theory to build a generator and a discriminator. The discriminator distinguishes true samples from false ones, so the sample distribution created by the generator is gradually forced to approach the real data distribution. VAE can obtain interpretable latent representations and controllable, decoupled features. However, one of its limitations is the ambiguity of the decoder when processing images. Thus the β-VAE [9, 22] was developed to balance distribution formation against reconstruction.
Different models based on GAN have been developed. For example, the infoGAN structure was proposed to obtain interpretable representations by maximizing mutual information in combination with GAN. The main advantage of GAN is that it generates high-quality samples, but its training process is unstable and can run into the mode collapse problem [6, 7]. Because our goal is to generate trajectories rather than images, we do not expect the image-related limitation of VAE to apply. Considering the difficulty of training GAN, we selected the β-VAE as the basic framework and modified it to suit time-series information extraction and trajectory generation.
II-B Time Series Processing
Dealing with time series is challenging because adjacent states depend on each other, which requires a model with memory. LSTM and GRU are potential solutions to this issue. Existing work on time-series processing usually combines a generative model with LSTM, GRU, or their variants. For instance, RGAN and RCGAN were proposed to process medical time-series data, and VRAE and VRNN were proposed [26, 27] by combining RNNs with VAE. The factorized hierarchical variational autoencoder (FHVAE) was proposed to denoise voice data. In addition, Seq2seq is essentially a time-series-based autoencoder and is a well-known structure in the field of NLP [11, 29]. Since GRU is faster than LSTM in processing time series, we select GRU for our new model.
II-C Trajectory Generation
Besides natural language and voice data, RNNs are also commonly used to process coordinate trajectories. Most work on trajectory generation aims to predict the subsequent positions of a trajectory under supervision, for example predicting pedestrian trajectories in scenes with multiple people [15, 16, 17, 18, 19]. Alahi et al. proposed a social LSTM to increase the correlation between the implicit states within LSTMs, so that multiple agents in a neighborhood were considered simultaneously. This LSTM approach was later improved with social GAN, in which the trajectory is produced by the generator of a GAN. A new attention mechanism was added to LSTM to implement social and physical constraints. A structured LSTM was also proposed to predict pedestrian trajectories. Nevertheless, supervised learning methods can only generate results in a fixed data space, and they have no latent code with which to control the generated results.
Other examples of trajectory generation are sketch drawing and Chinese character stroke generation. These works usually use VAE and Seq2seq structures combined with the Mixture Density Network (MDN). In MDN, a Gaussian mixture model (GMM) is used to describe the generated result instead of outputting it directly. Such models create simple sketches as trajectories instead of images. However, these works all focus on a single sequence, and only a single binary code is used to determine pen-up and pen-down, without considering the relationship between multiple sequences. In contrast, our task deals with the interaction between two vehicles, which is more complicated.
Traffic scenarios are quite complex and contain great uncertainty, for example regarding driving intention [34, 35]. A traffic-primitive-based framework has been proposed to reconstruct scenarios. A very simple LSTM structure was developed for road vehicle trajectory prediction, and was then improved by considering behavior and social rules [20, 21]. These analyses of traffic scenes and signals have achieved some promising results.
III Developed Methods
In this section, we briefly introduce the basic principles of VAE and β-VAE, and then propose two simple baselines based on VAE and infoGAN. In addition, we present the MTG architecture, which outputs high-quality vehicle encounter trajectories. Finally, we introduce our developed metric for comprehensive model evaluation and analysis.
III-A VAE and β-VAE
The main difference in formulation between a VAE and an autoencoder is the additional Kullback-Leibler (KL) divergence term, $D_{KL}(q(z|x)\,\|\,p(z))$, where $x$ and $z$ represent the real data and the latent code, respectively. The generative model is defined by a standard Gaussian prior, $p(z) = \mathcal{N}(0, I)$, and $q(z|x)$ is the distribution of the representation of the real data. To allow backpropagation, we use the reparameterization trick: the neural network outputs the mean $\mu$ and variance $\sigma^2$ of the current distribution, and $z$ is expressed as follows:
$$z = \mu + \sigma \odot \epsilon, \qquad \epsilon \sim \mathcal{N}(0, I)$$
The objective function of the VAE consists of two parts, a reconstruction term and the KL divergence term:
$$\mathcal{L} = \mathbb{E}_{q(z|x)}\left[\log p(x|z)\right] - D_{KL}(q(z|x)\,\|\,p(z))$$
Essentially, there is a trade-off between the two parts: the KL divergence forces the distribution of the latent codes as close as possible to a Gaussian, while the reconstruction error forces the latent code to contain more information about the data. The two terms constrain each other during training, so the result is either poor reconstruction quality or a latent code distribution that is far from Gaussian. The β-VAE improves the vanilla VAE by adding a hyper-parameter $\beta$ that weights the KL divergence term and thereby adjusts the ratio between the two parts. As a result, we can control the training balance by adjusting a single parameter.
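The reparameterization step and the β-weighted trade-off described above can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: it assumes the network outputs a log-variance per latent dimension, uses the closed-form KL divergence between a diagonal Gaussian and the standard Gaussian, and takes MSE as the reconstruction error. All function names are hypothetical.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, I).

    Assumption: the encoder outputs log(sigma^2) for each latent dimension.
    """
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """Closed-form KL( N(mu, diag(sigma^2)) || N(0, I) )."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))


def mse(x, x_hat):
    """Mean squared reconstruction error between two flat vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def beta_vae_loss(x, x_hat, mu, log_var, beta):
    """Loss to minimize: reconstruction error plus beta times the KL term."""
    return mse(x, x_hat) + beta * kl_to_standard_normal(mu, log_var)


if __name__ == "__main__":
    rng = random.Random(0)
    mu, log_var = [0.3, -0.1], [0.0, -1.0]
    z = reparameterize(mu, log_var, rng)
    print(z, beta_vae_loss([0.5, 0.5], [0.4, 0.6], mu, log_var, beta=4.0))
```

Raising `beta` penalizes deviation from the Gaussian prior more strongly at the expense of reconstruction quality, which is the single-parameter balance discussed in the text.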
III-B Baseline of Method
Considering the encoding and generation tasks for two sequences, we first build a Seq2seq baseline system with GRU and β-VAE. The structure merges the two input sequences and processes them simultaneously. The encoder uses a single-layer GRU, and its output vectors are reparameterized into the latent code vector. The latent code is then fed to the decoder as the initial state, and the decoder outputs the sequence coordinates recurrently: the output coordinates of one state are used as the input coordinates of the next state.
For the reconstruction error, we compute the mean squared error (MSE) of the two sequences separately, and the final objective function combines these two MSE terms with the β-weighted KL divergence term.
To test the capability of GAN, we implemented a GRU version of infoGAN as another baseline. Its generator and discriminator adopt a structure similar to that of the VAE baseline.
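The recurrent decoding scheme of the baseline, in which the latent code initializes the decoder state and each output is fed back as the next input, can be sketched as follows. This is only an illustration under stated assumptions: `toy_cell` and `toy_readout` are hypothetical stand-ins for a trained GRU cell and output layer, and the all-zero start input is an assumption as well.

```python
import math


def toy_cell(inp, hidden):
    """Hypothetical stand-in for a trained GRU cell: mixes input and state."""
    drive = 0.1 * sum(inp)
    return [math.tanh(0.5 * h + drive) for h in hidden]


def toy_readout(hidden):
    """Hypothetical output layer mapping the state to (x1, y1, x2, y2)."""
    return [hidden[i % len(hidden)] for i in range(4)]


def decode(z, steps, start=(0.0, 0.0, 0.0, 0.0)):
    """Free-running decoding of two merged sequences.

    The latent code z is the initial hidden state; the coordinates produced
    at one step are the input of the next step.
    """
    hidden = list(z)
    inp = list(start)
    outputs = []
    for _ in range(steps):
        hidden = toy_cell(inp, hidden)
        inp = toy_readout(hidden)  # fed back as the next input
        outputs.append(inp)
    return outputs


if __name__ == "__main__":
    trajectory = decode([0.1, -0.2], steps=50)
    print(len(trajectory), trajectory[0])
```

Because both vehicles share one state and one output vector here, the two sequences cannot be generated independently, which is the property the MTG decoder changes.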
III-C Multi-Vehicle Trajectory Generator (MTG)
Several problems arise when generating multi-vehicle trajectories with a VAE. First, the training process is quite slow because of the improper structure of the encoder and decoder, which results in invalid or coupled latent codes and degrades the overall performance. Second, the generated sequences contain many sharp turns and circles, which are quite different from real traffic trajectories. Outputting the two sequences in parallel from a single structure makes them interact with each other, so the two sequences become dependent.
Based on this analysis, we propose a new VAE structure for modeling traffic sequences, called the multi-vehicle trajectory generator (MTG), as shown in Fig. 2. We first replace the encoder with a bidirectional GRU, since a vehicle trajectory sequence still makes sense when it is reversed. The bidirectional GRU allows the whole data sequence to be analyzed better.
In the decoder, we separate the two sequences and use the hidden state of one sequence as part of the input to the next state of the other sequence.
Since the hidden state retains the information about past positions, it can guide the generation of the other sequence. Generating the two sequences in separate branches avoids mutual interference, which makes the generated results analogous to the real observed samples.
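The cross-feeding between the two decoder branches can be sketched with a toy recurrent update. The sketch below is an assumption-laden illustration, not the MTG network: `branch_step` is a hypothetical replacement for a trained GRU cell, both branches are assumed to start from the same latent code, and the start coordinates are arbitrary.

```python
import math


def branch_step(prev_xy, own_hidden, other_hidden):
    """Hypothetical recurrent update for one decoder branch.

    The branch input is its own previous output concatenated with the
    hidden state of the other branch from the previous step.
    """
    inp = list(prev_xy) + list(other_hidden)
    drive = sum(inp) / len(inp)
    new_hidden = [math.tanh(0.6 * h + 0.4 * drive) for h in own_hidden]
    return new_hidden, (new_hidden[0], new_hidden[1])


def decode_pair(z, steps, start_a=(0.0, 0.0), start_b=(1.0, 1.0)):
    """Generate two trajectories with separate, cross-coupled branches."""
    h_a, h_b = list(z), list(z)  # assumption: both branches start from z
    xy_a, xy_b = start_a, start_b
    traj_a, traj_b = [], []
    for _ in range(steps):
        # Both updates read the other branch's state from the previous step.
        new_a, xy_a = branch_step(xy_a, h_a, h_b)
        new_b, xy_b = branch_step(xy_b, h_b, h_a)
        h_a, h_b = new_a, new_b
        traj_a.append(xy_a)
        traj_b.append(xy_b)
    return traj_a, traj_b


if __name__ == "__main__":
    a, b = decode_pair([0.2, -0.1], steps=50)
    print(a[:2], b[:2])
```

In this sketch a change in one branch reaches the other branch one step later through the hidden state, while each branch still produces its own output sequence.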
Gaussian mixture models (GMMs) have been used to describe the output of the decoder [31, 32] for data whose coordinate points are keypoints, where a slight modification of the keypoints changes the overall structure only slightly and thus yields semantically similar results. However, a vehicle trajectory should be continuous and smooth, so a GMM cannot be used directly here.
In many NLP-related works, the teacher-forcing algorithm is applied to Seq2seq models during training, wherein the ground truth is used as the input of each state. It can increase the stability of the network and shorten the training time, but it lets the previous information pass directly into the decoder through the decoder input, so the hidden states are no longer forced to contain all the information.
III-D A New Disentangled Metric
Performance evaluation plays a pivotal role in the development of disentangled models. A supervised metric was previously proposed in which one factor is fixed and the other factors are randomly selected. The latent codes are then retrieved by passing through the decoder and the encoder in turn. A simple classifier is trained to identify which factor is fixed, and the coding ability of the model is evaluated according to the identification result. However, Kim et al. claimed that this method has defects in principle, and they proposed a method that analyzes the normalized variance. The pipeline of the existing metric is shown in Algorithm 1. In practical experiments, a metric should make it possible to analyze the robustness of the model and the effect of each factor. However, the existing metric can only offer a relative evaluation of the capability of an autoencoder, rather than analyzing the independence between the factors. We elaborate on this problem in the experiment section.
To overcome this issue, we propose a comprehensive comparison and analysis metric, as shown in Algorithm 2. We divide the input samples into several groups with different variances (each group contains a fixed number of samples); the groups are indexed by the latent code being sampled and by the variance level. Unlike the existing metric, we sample one factor in each group and fix the other factors. After passing the samples through the decoder and the encoder, we obtain a more comprehensive analysis by comparing the variance of each factor in the output with the variance of the input.
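The procedure just described (sample one code with a given variance, fix the others, pass through the decoder and then the encoder, and compare output variances with the input variance) can be sketched as below. This is a hedged illustration rather than Algorithm 2 itself: the `encode` and `decode` callables are placeholders for a trained model, the fixed codes are assumed to be held at 0, and the name `variance_response` is hypothetical.

```python
import random
import statistics


def variance_response(encode, decode, dim, variances, n_samples, seed=0):
    """Output variance of every code when a single code is sampled.

    Returns a dict mapping (sampled_code_index, input_variance) to a list
    with the output variance of each of the `dim` codes.
    """
    rng = random.Random(seed)
    table = {}
    for i in range(dim):
        for var in variances:
            recovered = []
            for _ in range(n_samples):
                z = [0.0] * dim  # assumption: other codes fixed at 0
                z[i] = rng.gauss(0.0, var ** 0.5)
                recovered.append(encode(decode(z)))
            # Variance of each output code across the group.
            table[(i, var)] = [statistics.pvariance(col)
                               for col in zip(*recovered)]
    return table


if __name__ == "__main__":
    # An identity "model" stands in for a perfectly disentangled network.
    table = variance_response(lambda x: x, lambda z: z,
                              dim=3, variances=[0.5, 2.0], n_samples=500)
    for key, out_vars in sorted(table.items()):
        print(key, [round(v, 2) for v in out_vars])
```

For a well-disentangled model, only the entry of the sampled code should track the input variance, while the entries of the other codes stay near zero.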
IV Experiment and Data Processing
IV-A Dataset and Preprocessing
The driving encounter data we used were collected by the University of Michigan Transportation Research Institute (UMTRI). The dataset includes approximately 3,500 equipped vehicles. We used the latitude and longitude information, collected by on-board GPS, as coordinates. The data were collected at a sampling frequency of 10 Hz. To extract features uniformly, linear interpolation was used to resample each trajectory to a length of 50. Considering the properties of neural networks, we also normalized the coordinates before feeding them to the network.
Some examples of vehicle encounter trajectories are shown in the first column of Fig. 3. The trajectories of the two vehicles are marked in blue and red.
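The two preprocessing steps of this subsection, resampling each trajectory to 50 points by linear interpolation and normalizing the coordinates, can be sketched as follows. The interpolation is done over the sample index, which matches uniform 10 Hz sampling. The normalization shown is an assumption: a per-encounter min-max scaling to [-1, 1], chosen only for illustration because the exact formula is not available here. Function names are hypothetical.

```python
def resample(points, length=50):
    """Linearly interpolate a list of (lat, lon) samples to `length` points."""
    n = len(points)
    if n == 1:
        return [points[0]] * length
    out = []
    for k in range(length):
        t = k * (n - 1) / (length - 1)  # fractional index into the input
        i = min(int(t), n - 2)
        frac = t - i
        (x0, y0), (x1, y1) = points[i], points[i + 1]
        out.append((x0 + frac * (x1 - x0), y0 + frac * (y1 - y0)))
    return out


def normalize_pair(traj_a, traj_b):
    """Assumed scheme: min-max scale both trajectories jointly to [-1, 1]."""
    xs = [p[0] for p in traj_a + traj_b]
    ys = [p[1] for p in traj_a + traj_b]

    def scale(v, lo, hi):
        return 0.0 if hi == lo else 2.0 * (v - lo) / (hi - lo) - 1.0

    def apply(traj):
        return [(scale(x, min(xs), max(xs)), scale(y, min(ys), max(ys)))
                for x, y in traj]

    return apply(traj_a), apply(traj_b)


if __name__ == "__main__":
    a = resample([(0.0, 0.0), (4.0, 2.0), (10.0, 20.0)])
    b = resample([(5.0, 5.0), (5.0, 40.0)])
    na, nb = normalize_pair(a, b)
    print(len(na), na[0], nb[-1])
```

Scaling the two trajectories of an encounter jointly keeps their relative geometry intact, which matters when the model is meant to capture the interaction between the vehicles.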
IV-B Experiment Settings
In this paper, we compare three different architectures for generating trajectories and then evaluate them using our proposed metric. The next section presents the generation and evaluation results. To make the latent codes understandable, we use t-Distributed Stochastic Neighbor Embedding (t-SNE) to display the generated results in their feature space. We also show the benefits of our proposed evaluation metric by comparing it with an existing metric.
All the results are generated with manually controlled test codes. We set the dimension of the latent code to 10, so the test codes form 10 groups, one for each code. In each group, we varied the value of one code from -1 to 1 (ten values were chosen for display) and fixed the other codes at 0. In addition, 10 different variances (from 0.1 to 2.8) were tested for each code and are represented with different colors.
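The manually controlled test codes can be built with a few lines of Python. The sketch assumes the ten displayed values are evenly spaced between -1 and 1, which the text does not state; the function name is hypothetical.

```python
def make_test_codes(dim=10, n_values=10, low=-1.0, high=1.0):
    """Build one group of test codes per latent dimension.

    In group i, code i sweeps from `low` to `high` while all other codes
    are fixed at 0. Even spacing of the sweep is an assumption.
    """
    step = (high - low) / (n_values - 1)
    values = [low + k * step for k in range(n_values)]
    groups = []
    for i in range(dim):
        group = []
        for v in values:
            z = [0.0] * dim
            z[i] = v
            group.append(z)
        groups.append(group)
    return groups


if __name__ == "__main__":
    groups = make_test_codes()
    print(len(groups), len(groups[0]), groups[2][0])
```

Feeding each group to the decoder shows how a single code changes the generated pair of trajectories while everything else is held constant.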
V Result Analysis and Evaluation
V-A Generated Results
Fig. 3 shows the raw trajectories and the outcomes generated by three different models. The top four plots show the reconstruction results, which indicate that our proposed MTG outperforms the other two models when reconstructing real trajectories. The bottom four plots show the capability of the latent codes. For infoGAN, the last three rows are almost the same, indicating that these codes neither control any feature nor contain any information about the data, although two of its codes undeniably learn some simple rules. The third and fourth columns display the results of the original VAE and of our proposed architecture, respectively. The trajectories of the original VAE are quite complex, with circles and sharp turns that are abnormal in real traffic scenarios. In contrast, our architecture outputs more reasonable trajectories, and the last column also shows intuitively how each code controls the output as it varies from -1 to 1.
V-B Disentangled Evaluation of Models
Fig. 4 shows the results for our proposed MTG; each plot shows the result of sampling one code from a normal distribution while fixing the other codes. As shown in the histograms, only the sampled code has an increasing output variance, while the other codes stay close to zero. This indicates that there is no interference among the codes, that is, sampling one code influences only that code.
The line chart inside each plot shows the ratio of output variance to input variance for each code. A more robust model keeps the lines of the other codes close to zero and the slope of the line for the sampled code close to 1. For our proposed MTG, the slope of the sampled code is larger than 1, but the other lines are quite close to 0. This demonstrates that our proposed architecture achieves decoupled and stable performance, and generates meaningful samples with manually controllable latent codes. Figs. 5 and 6 show the results of the original VAE and infoGAN, which exhibit two new patterns compared with Fig. 4.
One appears in the 1st plot of Fig. 5. It indicates that some codes are independent of changes in the input code, with variances always around 1. Moreover, these codes output the normal distribution that the KL divergence forces them toward. Such codes are invalid and contain little information about the input data.
The other appears in the 9th plot of Fig. 6. It indicates that the output variance of the code changes along with the input variance. This is because these codes are correlated and coupled, and they may control different features dependently. Such interacting codes appear when the model is unable to factorize the latent codes.
As for the ratio of output variance to input variance for each code, some codes show a negative slope in Figs. 5 and 6. This phenomenon is consistent with the output variance of a normal distribution. From the analysis of these results, it can be concluded that our proposed MTG outperforms the original VAE and infoGAN both in decoupling and encoding the latent codes and in generating reasonable trajectories.
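The slope reading used in this subsection can be made concrete with a small sketch. It assumes the slope of each line is estimated by ordinary least squares over the tested input variances, which is an illustrative choice and not necessarily how the figures were produced; the numbers in the usage example are made up for illustration and are not results from the paper.

```python
def least_squares_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


def code_slopes(input_vars, output_vars_by_code):
    """Slope of output variance versus input variance for every code."""
    return {code: least_squares_slope(input_vars, outs)
            for code, outs in output_vars_by_code.items()}


if __name__ == "__main__":
    input_vars = [0.1, 1.0, 2.8]
    # Hypothetical output variances for three kinds of codes.
    outputs = {
        "sampled": [0.1, 1.0, 2.8],   # tracks the input: slope near 1
        "invalid": [1.0, 1.0, 1.0],   # stays around 1: slope near 0
        "quiet": [0.0, 0.0, 0.0],     # unaffected: slope 0, variance 0
    }
    print(code_slopes(input_vars, outputs))
```

A slope near 0 alone does not separate an invalid code from an unaffected one; the level of the output variance (around 1 versus around 0) is what distinguishes the two patterns described above.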
V-C Feature Space Display
We used t-SNE to display the generated results in feature space. Fig. 7 displays four t-SNE results in two dimensions. In total, 200 samples were generated with a variance of 1.0 for each code independently. We first used principal component analysis (PCA) to reduce the dimension to 5, and then applied t-SNE to obtain the results in Fig. 7. The goal of VAE is to project the data into a disentangled space, where the codes are decoupled. As shown in Fig. 7, the codes of the autoencoder are entangled and not continuous. Both infoGAN and VAE show disentangling ability for most codes, while some codes are still coupled with each other. Nevertheless, MTG demonstrates continuity and independence, i.e., its latent codes interact less with each other.
V-D Evaluation Metric
The left plot displays the standard deviation of all 10 codes. The autoencoder obtains differing standard deviations because, without the KL divergence, the elements of its embeddings are encoded in an entangled space. After normalization by the standard deviation, as mentioned before, the right plot of this figure shows the normalized output variance of all codes.
In our case, one code gets the lowest value, and this sample will be easily classified by the existing metric. When this metric was applied to the autoencoder, the autoencoder achieved a higher score than the VAE, which indicates that this disentanglement metric has a problem in evaluating the distribution of the latent codes and the interference among them. In the second plot of Fig. 8 (a), our proposed metric gives a different result for the autoencoder: the sampled code strongly affects the other codes, since most of the remaining codes obtain a large variance.
Fig. 8 (b) shows the evaluation result for the VAE. The variance of one code is much lower than the others, which ensures that the VAE gets a high score. The left plot, which uses the existing metric, cannot explain why this code is invalid; however, this is easily revealed by our proposed metric in the right plot, since the output of this code always remains around 1.
VI Conclusion
This paper proposed a way to generate the encounter trajectories of two vehicles based on VAE. To better extract features while considering the relationship between the two spatiotemporal sequences, a new network architecture was proposed. We also developed an evaluation metric capable of comprehensively analyzing the generated results and their stability. Experimental results demonstrate that our proposed architecture achieves more disentangled and stable latent codes. Moreover, our proposed method obtains more realistic encounter trajectories than the original VAE and infoGAN.
Successfully generating the trajectories of two-vehicle encounters is significant, since controllable trajectory generation could provide sufficient high-quality data at low cost for self-driving applications. We only generate a very short trajectory in this paper, but the start point could be set and such trajectories could be carefully cascaded together. In future work, we will also use the actual road coordinates as conditions, and then use a conditional GAN or a conditional VAE to generate trajectories under more constraints regarding road profiles and vehicle dynamics.
More details about the hyper-parameter settings are available at: http://www.wenhao.pub/publication/trajectory-supplyment.pdf.
References
[1] T. Appenzeller, “The scientists’ apprentice,” vol. 357, no. 6346, pp. 16–17, 2017.
[2] S. Yang, W. Wang, C. Liu, and W. Deng, “Scene understanding in deep learning based end-to-end controllers for autonomous vehicles,” IEEE Transactions on Systems, Man, and Cybernetics: Systems, 2018.
[3] W. Wang and D. Zhao, “Extracting traffic primitives directly from naturalistically logged data for self-driving applications,” IEEE Robotics and Automation Letters, vol. 3, no. 2, pp. 1223–1229, April 2018.
[4] W. Wang, C. Liu, and D. Zhao, “How much data are enough? a statistical approach with case study on longitudinal driving behavior,” IEEE Transactions on Intelligent Vehicles, vol. 2, no. 2, pp. 85–98, 2017.
[5] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, “Generative adversarial nets,” in Advances in neural information processing systems, 2014, pp. 2672–2680.
[6] M. Mirza and S. Osindero, “Conditional generative adversarial nets,” arXiv preprint arXiv:1411.1784, 2014.
[7] I. Gulrajani, F. Ahmed, M. Arjovsky, V. Dumoulin, and A. C. Courville, “Improved training of wasserstein gans,” in Advances in Neural Information Processing Systems, 2017, pp. 5767–5777.
[8] D. P. Kingma and M. Welling, “Auto-encoding variational bayes,” arXiv preprint arXiv:1312.6114, 2013.
[9] I. Higgins, L. Matthey, A. Pal, C. Burgess, X. Glorot, M. Botvinick, S. Mohamed, and A. Lerchner, “beta-vae: Learning basic visual concepts with a constrained variational framework,” 2016.
[10] H. Kim and A. Mnih, “Disentangling by factorising,” arXiv preprint arXiv:1802.05983, 2018.
[11] I. Sutskever, O. Vinyals, and Q. V. Le, “Sequence to sequence learning with neural networks,” arXiv preprint arXiv:1409.3215, 2014.
[12] O. Vinyals, A. Toshev, S. Bengio, and D. Erhan, “Show and tell: A neural image caption generator,” in
[13] S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural computation, vol. 9, no. 8, pp. 1735–1780, 1997.
[14] K. Cho, B. Van Merriënboer, C. Gulcehre, D. Bahdanau, F. Bougares, H. Schwenk, and Y. Bengio, “Learning phrase representations using rnn encoder-decoder for statistical machine translation,” arXiv preprint arXiv:1406.1078, 2014.
[15] A. Alahi, K. Goel, V. Ramanathan, A. Robicquet, L. Fei-Fei, and S. Savarese, “Social lstm: Human trajectory prediction in crowded spaces,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 961–971.
[16] N. Lee, W. Choi, P. Vernaza, C. B. Choy, P. H. Torr, and M. Chandraker, “Desire: Distant future prediction in dynamic scenes with interacting agents,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 336–345.
[17] A. Gupta, J. Johnson, L. Fei-Fei, S. Savarese, and A. Alahi, “Social gan: Socially acceptable trajectories with generative adversarial networks,” in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
[18] A. Sadeghian, V. Kosaraju, A. Sadeghian, N. Hirose, and S. Savarese, “Sophie: An attentive gan for predicting paths compliant to social and physical constraints,” arXiv preprint arXiv:1806.01482, 2018.
[19] H. Su, J. Zhu, Y. Dong, and B. Zhang, “Forecast the plausible paths in crowd … Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, IJCAI-17, 2017, pp. 2772–2778.
[20] F. Altché and A. De La Fortelle, “An lstm network for highway trajectory prediction,” in Intelligent Transportation Systems (ITSC), 2017 IEEE 20th International Conference on. IEEE, 2017, pp. 353–359.
[21] N. Deo and M. M. Trivedi, “Convolutional social pooling for vehicle trajectory prediction,” arXiv preprint arXiv:1805.06771, 2018.
[22] C. P. Burgess, I. Higgins, A. Pal, L. Matthey, N. Watters, G. Desjardins, and A. Lerchner, “Understanding disentangling in β-vae,” arXiv preprint arXiv:1804.03599, 2018.
[23] X. Chen, Y. Duan, R. Houthooft, J. Schulman, I. Sutskever, and P. Abbeel, “Infogan: Interpretable representation learning by information maximizing generative adversarial nets,” in Advances in neural information processing systems, 2016, pp. 2172–2180.
[24] T. Karras, T. Aila, S. Laine, and J. Lehtinen, “Progressive growing of gans for improved quality, stability, and variation,” arXiv preprint arXiv:1710.10196, 2017.
[25] C. Esteban, S. L. Hyland, and G. Rätsch, “Real-valued (medical) time series generation with recurrent conditional gans,” arXiv preprint arXiv:1706.02633, 2017.
[26] O. Fabius and J. R. van Amersfoort, “Variational recurrent auto-encoders,” arXiv preprint arXiv:1412.6581, 2014.
[27] J. Chung, K. Kastner, L. Dinh, K. Goel, A. C. Courville, and Y. Bengio, “A recurrent latent variable model for sequential data,” in Advances in neural information processing systems, 2015, pp. 2980–2988.
[28] W.-N. Hsu, Y. Zhang, and J. Glass, “Unsupervised learning of disentangled and interpretable representations from sequential data,” in Advances in neural information processing systems, 2017, pp. 1878–1889.
[29] I. V. Serban, A. Sordoni, R. Lowe, L. Charlin, J. Pineau, A. Courville, and Y. Bengio, “A hierarchical latent variable encoder-decoder model for generating dialogues,” 2016.
[30] T. Fernando, S. Denman, S. Sridharan, and C. Fookes, “Pedestrian trajectory prediction with structured memory hierarchies,” arXiv preprint arXiv:1807.08381, 2018.
[31] D. Ha and D. Eck, “A neural representation of sketch drawings,” arXiv preprint arXiv:1704.03477, 2017.
[32] X. Y. Zhang, F. Yin, Y. M. Zhang, C. L. Liu, and Y. Bengio, “Drawing and recognizing chinese characters with recurrent neural network,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. PP, no. 99, pp. 1–1, 2018.
[33] A. Graves, “Generating sequences with recurrent neural networks,” Computer Science, 2013.
[34] A. Zyner, S. Worrall, and E. Nebot, “Naturalistic driver intention and path prediction using recurrent neural networks,” arXiv preprint arXiv:1807.09995, 2018.
[35] W. Wang, J. Xi, and D. Zhao, “Learning and inferring a driver’s braking action in car-following scenarios,” IEEE Transactions on Vehicular Technology, vol. PP, no. 99, pp. 1–1, 2018.
[36] N. Deo and M. M. Trivedi, “Multi-modal trajectory prediction of surrounding vehicles with maneuver based lstms,” arXiv preprint arXiv:1805.05499, 2018.
[37] L. V. D. Maaten and G. Hinton, “Visualizing data using t-sne,” Journal of Machine Learning Research, vol. 9, no. 2605, pp. 2579–2605, 2008.