1 Introduction
With the great success of deep learning, deep generative models have been widely investigated. Generative adversarial net (GAN)
[1] based methods are applied in many interesting applications including image super-resolution [2][3][4], text-to-image translation [5], dialogue generation [6], etc. With the development of graphical technologies, the demand for higher-resolution images has increased significantly, yet the generation of large high-resolution images remains a challenge. Existing GAN models experience limitations when generating large images. With the growing scale of images, vanilla GAN can hardly produce high-quality natural images because it is difficult for the generator and the discriminator to achieve optimality simultaneously. When processing high-dimensional images, the computational complexity and the training time increase significantly. The challenge is that the image has too many pixels and it is hard for a single generator to learn the empirical distribution. Therefore, the traditional GAN [1] does not scale well for the generation of large images. The variants of GAN such as deep convolutional GAN (DCGAN) [7], super-resolution GAN (SRGAN) [8], Laplacian Pyramid GAN (LAPGAN) [9], and StackGAN [10]
are promising candidates for generative models in unsupervised learning. It is desirable to construct a generative model that efficiently processes data with large size and high dimensions.
Traditional GAN-based methods operate in pixel space to generate images, while tensor-based methods work in tensor space. Tensor representation [11] and its derivative methods such as tensor sparse coding [12] and tensor super-resolution give a more concise and efficient representation of images, especially for large images. They provide an alternative way of representing large images in the tensor space, instead of the traditional pixel space or frequency domain, which could help address the challenge of generating large-sized high-resolution images.
Large-sized or high-dimensional images can be realized in several possible ways. Super-resolution [13] is one of the classic methods used to construct high-resolution images from low-resolution images for better human interpretation. The key idea to achieve super-resolution is to use the non-redundant information contained in multiple low-resolution images induced by the sub-pixel shifts between them. One recent popular scheme for image super-resolution is SRGAN [8]
, which combines GAN with deep transposed convolutional neural networks (CNNs) for generating high-resolution images from low-resolution ones. The generator in SRGAN upsamples the low-resolution images to super-resolution images, which are distinguished from the original high-resolution images by the discriminator.
Dictionary learning [14][15] is another method to efficiently process large-sized or high-dimensional data. Using dictionary learning, we try to find a sparse representation of the input image data, which corresponds to the sparse coding of images. Traditional sparse coding methods encode images in matrices, while tensor-based sparse coding
[12] is more flexible with a larger representation space. Multi-dimensional tensor sparse coding uses t-linear combinations to obtain a more concise and smaller dictionary for representing the images, and the corresponding coefficients have richer physical explanations than the traditional methods. We apply the basic principles of super-resolution and tensor-based dictionary learning in our generative model. For large-sized and high-dimensional images, tensor representation is able to preserve the local proximity and capture the spatial patterns of elementary objects. Existing conventional sparse coding only captures linear correlations, which harms the spatial patterns of images. In contrast, the tensor sparse coding model can capture non-linear correlations (linear upon a sine/cosine basis), which is consistent with existing neural networks using non-linear activation functions. Tensor sparse coding replaces the conventional vectorizing process with a tensorizing process
[16][12][17][18]. For complex and high-dimensional images, the conventional sparse coding process uses vector representation, and the vectorizing process ignores the spatial structure of the data. As a result, it generates a large-sized dictionary and causes high computational complexity, which makes it infeasible for high-dimensional data applications. Tensor-based dictionary learning adopts a series of dictionaries to approximate the structures of the input data at each scale, which significantly reduces the size of the dictionaries. Besides, the circular matrix defined in Section 3.1 keeps the original image invariant after shifting; this helps to preserve the spatial structure of the images. Benefiting from tensor representation, tensor-based dictionary learning has advantages in dictionary size, shifting invariance, and rich physical explanations of the tensor coefficients [12]. In general, tensor-based methods have a more efficient representation capability for large-sized or high-dimensional data, and could therefore benefit generative models. We believe that incorporating tensor-based methods including tensor representation, tensor sparse coding, and tensor super-resolution into generative models will improve the generation of large-sized high-resolution images.
In this paper, we present a novel generative model called deep tensor generative adversarial nets (TGAN), cascading a DCGAN and tensor-based super-resolution to generate large-sized high-quality images (e.g. ). The contribution of the proposed TGAN is threefold: (i) We apply tensor representation and tensor sparse coding for image representation in generative models, which is shown to give a more concise and efficient representation of images with less loss of spatial patterns. (ii) We incorporate tensor representation into the super-resolution process, called tensor super-resolution. The tensor super-resolution is cascaded after a DCGAN with transposed convolutional layers, which generates low-resolution images directly from random distributions. (iii) The DCGAN and the tensor dictionaries in tensor super-resolution are both pretrained with a large number of high-resolution and low-resolution images. The dictionaries are smaller with tensor representation than with traditional representations, which accelerates the dictionary learning process in tensor super-resolution. Fig. 1 gives an illustration of TGAN. The generation performance of TGAN surpasses traditional generative models including adversarial autoencoders [19] in inception score [20] on test datasets, especially for large images. Our code is available at https://github.com/hust512/TensorGAN.
2 Related Work
Recently, various approaches have been developed to study deep generative models. There are two main types of generative models: the adversarial model GAN [1]
and its modifications, and probabilistic models such as the variational autoencoder (VAE)
[21] and the adversarial autoencoder (AAE) [19]. GAN is a two-player game that consists of a generator G and a discriminator D. The generator can generate realistic samples based on input random noise, while the discriminator aims to identify whether the samples come from the real sample set or the generated data set. Finally, G and D reach a Nash equilibrium and G is able to generate stable images. However, large images make this equilibrium hard to reach for G and D at the same time.
In order to generate high-resolution images from low-resolution images, SRGAN [8] was proposed to realize super-resolution of images. It uses CNNs to extract features from low-resolution images. SRGAN testifies to the strong capability of generative models in image super-resolution applications. Another popular and successful modification of GAN is DCGAN [7]
comprising transposed CNNs, especially for image-related applications in unsupervised learning for computer vision. Convolutional strides and transposed convolutions are applied for downsampling and upsampling. However, even with DCGAN, the bottleneck of GAN is easily reached for large images: increasing the complexity of the generator does not necessarily improve the image quality. Moreover, StackGAN
[10] uses a two-stage GAN to generate images of size , which are relatively large images for state-of-the-art generative models. AAE [19] is a combination of GAN and VAE. AAE utilizes only half of the autoencoder to map the original data distribution into the latent variable distribution ; then, it uses an adversarial approach to optimize
. The data sample generation is different between AAE and GAN. GAN compares the generated data distribution with the real data distribution in the discriminator and adopts a stochastic gradient descent process to optimize the entire model. On the other hand, AAE uses the discriminator to distinguish the latent variable distribution
. The discrete data that cannot be processed by GAN is mapped to continuous data in , which extends the range of acceptable data. However, image representation in pixel space may not be as efficient as assumed in traditional GANs. Tensor-representation-based methods have been adopted recently. Recent papers [22][12] apply tensor representation for dictionary learning with smaller dictionary sizes and better results than traditional methods. Theoretical analysis of tensor decomposition and its applications is provided in detail in [11]. Tensor decomposition lies at the core of tensor-based methods, which provide an alternative means of representation for data such as large images.
3 Notations and Preliminaries
3.1 Tensor Product
We use boldface capital letters to denote matrices, e.g. $\mathbf{A}$, and calligraphic letters to denote tensors, e.g. $\mathcal{A}$. An order-3 tensor is denoted as $\mathcal{A} \in \mathbb{R}^{n_1 \times n_2 \times n_3}$. The expansion of $\mathcal{A}$ along the third dimension is represented as $\{A^{(i)}\}$, where $A^{(i)} \in \mathbb{R}^{n_1 \times n_2}$ denotes the $i$-th frontal slice, for $i = 1, \dots, n_3$. The circular matrix representation of tensor $\mathcal{A}$ is defined as
$$\mathrm{circ}(\mathcal{A}) = \begin{bmatrix} A^{(1)} & A^{(n_3)} & \cdots & A^{(2)} \\ A^{(2)} & A^{(1)} & \cdots & A^{(3)} \\ \vdots & \vdots & \ddots & \vdots \\ A^{(n_3)} & A^{(n_3-1)} & \cdots & A^{(1)} \end{bmatrix}. \quad (1)$$
The tensor product [23] of two tensors $\mathcal{A}$ and $\mathcal{B}$ is defined as
$$\mathcal{C}(i,j,:) = \sum_{k=1}^{n_2} \mathcal{A}(i,k,:) \circledast \mathcal{B}(k,j,:), \quad (2)$$
where $\mathcal{C} = \mathcal{A} * \mathcal{B} \in \mathbb{R}^{n_1 \times n_4 \times n_3}$ for $\mathcal{A} \in \mathbb{R}^{n_1 \times n_2 \times n_3}$ and $\mathcal{B} \in \mathbb{R}^{n_2 \times n_4 \times n_3}$, and $\circledast$ denotes the circular convolution operation. In addition, the tensor product has an equivalent matrix-product form:
$$\mathcal{A} * \mathcal{B} = \mathrm{fold}\big(\mathrm{circ}(\mathcal{A}) \cdot \mathrm{unfold}(\mathcal{B})\big). \quad (3)$$
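As a sanity check on these definitions, the t-product in Equ. (2) can be computed either directly from the circular-convolution definition or slice-wise in the Fourier domain, which is the standard trick behind the equivalent form of Equ. (3). A minimal NumPy sketch (function names are our own):

```python
import numpy as np

def t_product(A, B):
    """t-product of order-3 tensors, computed in the Fourier domain:
    circular convolution along the third axis becomes an independent
    matrix product per frequency after an FFT."""
    n1, n2, n3 = A.shape
    assert B.shape[0] == n2 and B.shape[2] == n3
    Af = np.fft.fft(A, axis=2)
    Bf = np.fft.fft(B, axis=2)
    Cf = np.einsum('ikt,kjt->ijt', Af, Bf)  # per-frequency matrix product
    return np.real(np.fft.ifft(Cf, axis=2))

def t_product_naive(A, B):
    """Direct definition: each tube C(i,j,:) is the sum over k of the
    circular convolutions A(i,k,:) circ-conv B(k,j,:)."""
    n1, n2, n3 = A.shape
    n4 = B.shape[1]
    C = np.zeros((n1, n4, n3))
    for i in range(n1):
        for j in range(n4):
            for k in range(n2):
                for t in range(n3):
                    # (a circ-conv b)[t] = sum_s a[s] * b[(t - s) mod n3]
                    C[i, j, t] += np.dot(A[i, k, :], np.roll(B[k, j, ::-1], t + 1))
    return C
```

The FFT route reduces the t-product to $n_3$ independent matrix products, which is how tensor dictionary learning implementations typically evaluate it.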
3.2 Tensor Sparse Coding for Images
Considering input images of size , we first sample the image tensor using tensor cubes and reshape them into the input tensor block $\mathcal{X}$ (the detailed relationship of $\mathcal{X}$ with the images and the tensor cubes is shown in Section 4). $\mathcal{X}$ can be approximated with an over-complete tensor dictionary $\mathcal{D}$ as follows [12]:
$$\mathcal{X} \approx \mathcal{D} * \mathcal{C}, \quad (4)$$
where $\mathcal{C}$ is the tensor coefficient.
One of the proposed schemes for tensor sparse coding is based on the $\ell_1$ norm of the coefficient. The sparse coding problem in tensor representation is as follows:
$$\min_{\mathcal{D}, \mathcal{C}} \ \|\mathcal{X} - \mathcal{D} * \mathcal{C}\|_F^2 + \lambda \|\mathcal{C}\|_1 \quad (5)$$
$$\mathrm{s.t.} \ \|\mathcal{D}(:, j, :)\|_F^2 \le 1, \quad j = 1, \dots, m, \quad (6)$$
where the size of the tensor dictionary $\mathcal{D}$ is $n_1 \times m \times n_3$. However, traditional sparse coding on vectorized data requires a dictionary of size $n_1 n_3 \times m n_3$, which increases significantly with dimensionality, as shown in [12]. A smaller dictionary is easier to learn, making tensor sparse coding a more efficient way to encode images than traditional sparse coding methods.
4 Deep Tensor Generative Adversarial Nets Scheme
We incorporate tensor-based methods including tensor representation, tensor sparse coding, tensor dictionary learning, and tensor super-resolution into traditional generative models such as DCGAN. The proposed novel scheme is called TGAN.
The TGAN scheme can be divided into two phases: the training phase and the generation phase, as shown in Fig. 1. First of all, two-dimensional (2-D) images are transformed into the tensor space as a preprocessing step. In the generation phase, a pretrained DCGAN generates low-resolution image tensors from random distributions, and we apply tensor super-resolution to transform the low-resolution image tensors into high-resolution image tensors. High-resolution 2-D images are then derived from the obtained high-resolution image tensors. The tensor dictionaries used in the tensor super-resolution process and the DCGAN are both pretrained with large numbers of high-resolution and low-resolution image tensors in the training phase. The training phase thus precedes the generation phase in implementation.
We introduce the details of the TGAN scheme in the following subsections. Subsection 4.1 provides a basic introduction to the tensor representation applied in our TGAN scheme. In Subsection 4.2, we propose the "folding" and "unfolding" processes of data preparation for tensor dictionary learning. In Subsection 4.3, we present the training phase of the TGAN scheme, including DCGAN training and tensor dictionary learning. Subsection 4.4 provides details about the tensor super-resolution process, including theory and implementation. In Subsection 4.5, we present the generation phase of the TGAN scheme, which generates super-resolution images with the trained DCGAN and tensor dictionaries.
4.1 Tensor Representation in TGAN
Our proposed approach combines DCGAN with tensor-based super-resolution to directly generate high-resolution images. Considering its advantages of small dictionary size and shifting invariance [12], tensor sparse coding is the key technique we apply in our model. We make the assumption [24] that the inner patterns of images can be at least approximately sparsely represented with a learned dictionary, i.e., $\mathcal{X} \approx \mathcal{D} * \mathcal{C}$ with sparse $\mathcal{C}$. Therefore, tensor representation of images is necessary, and it acts as the main representation of images in our workflow.
4.2 Data Preprocess: “Folding” and “Unfolding”
We obtain the tensor input block from original images in the following manner, which we call the "folding" process. We first concatenate images shifted from the same original image with different pixel shifts, for high-resolution images or for low-resolution images (first upsampling them in the generation phase), to obtain the image representation tensor , as shown in Fig. 2. Then we sample the image tensors in all dimensions with tensor blocks of size to obtain sample blocks, where . Therefore, the size of the image representation tensor is . The tensor is reshaped into input tensor blocks , where . For the tensor dictionary learning process, the original images are 2-D images from the training set; for the image generation process with trained dictionaries, the original image is generated by DCGAN from random distributions, with in order to generate a single high-resolution image from scratch. As the tensor dictionary is independent of the number of samples , the dictionary iteratively trained with a large number of samples can naturally be used for generating a single high-resolution image.
The inverse of the above "folding" process is called the "unfolding" process, which is used to recover high-resolution 2-D images from the obtained high-resolution tensor output blocks. "Unfolding" simply inverts each step of "folding".
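As an illustration, the "folding" step can be sketched as follows; the shift directions, block size, and reshape order here are illustrative assumptions rather than the paper's exact settings:

```python
import numpy as np

def fold(image, shifts=((0, 0), (0, 1), (1, 0), (1, 1)), block=4):
    """Illustrative "folding": stack pixel-shifted copies of a 2-D image
    into an order-3 tensor, then cut the tensor into small cubes that
    become the sample columns of the input tensor block."""
    # Image representation tensor: one frontal slice per shifted copy.
    T = np.stack([np.roll(image, s, axis=(0, 1)) for s in shifts], axis=2)
    h, w, d = T.shape
    blocks = []
    for i in range(0, h - block + 1, block):
        for j in range(0, w - block + 1, block):
            # Each spatial cube is flattened into a (block*block, d) slice.
            blocks.append(T[i:i + block, j:j + block, :].reshape(block * block, d))
    # Input tensor block: (block*block) x (number of samples) x (number of shifts).
    return np.stack(blocks, axis=1)
```

"Unfolding" would reverse these steps: scatter each sample column back to its spatial cube and undo the pixel shifts.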
4.3 The Training Phase: DCGAN Training and Tensor Dictionary Learning
In our model, we first downsample the original images in the training set to high-resolution images and low-resolution images at downsampling rate , and we further transform them into tensor representations . Then we train DCGAN with to generate low-resolution tensor images from the random distribution Uniform. We refer to the adversarial loss as the utility of the game, which is formulated as a minimax problem:
$$\min_G \max_D \ \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))], \quad (7)$$
where $G$ and $D$ denote the generator and discriminator of DCGAN, and $z$ and $p_z$ denote the latent vector and the uniform distribution. The images in tensor representation and are further transformed into input tensor blocks and (as shown in the data preprocessing of Section 4.2) for training the dictionaries and in tensor super-resolution. We have the tensor product relationships in tensor sparse coding: and , where and denote the tensor sparse coefficients for high-resolution and low-resolution images, respectively. Note that, in tensor super-resolution, it is reasonable (for the reasons in Section 4.4) to set the two coefficients equal and denote them with .

4.4 Details about Tensor Super-Resolution
The goal of tensor super-resolution is to transform low-resolution images into high-resolution images through the tensor sparse coding approach. For an input tensor , tensor dictionary learning is similar (differing only in dimensions) to the tensor sparse coding in Section 3.2, where is the tensor dictionary, each slice of which is a basis, and is the tensor sparse coefficient. The first and second terms in Equ. (5) use the Frobenius norm and the $\ell_1$ norm, respectively.
Viewing the sparse coding of different-resolution images as similar patterns with respect to different bases, we can assume that high-resolution and low-resolution tensor images from the same origin have approximately the same sparse tensor coefficients . Therefore, the constraints of the two dictionaries can be combined as follows:
$$\min_{\mathcal{D}_h, \mathcal{D}_l, \mathcal{C}} \ \|\mathcal{X}_c - \mathcal{D}_c * \mathcal{C}\|_F^2 + \lambda \|\mathcal{C}\|_1, \quad (8)$$
where
$$\mathcal{X}_c = \begin{bmatrix} \frac{1}{\sqrt{N}} \mathcal{X}_h \\ \frac{1}{\sqrt{M}} \mathcal{X}_l \end{bmatrix}, \qquad \mathcal{D}_c = \begin{bmatrix} \frac{1}{\sqrt{N}} \mathcal{D}_h \\ \frac{1}{\sqrt{M}} \mathcal{D}_l \end{bmatrix}, \quad (9)$$
where $\mathcal{X}_h$ and $\mathcal{X}_l$ represent the input tensor blocks of high-resolution and low-resolution images, and $N$ and $M$ denote the numbers of samples at the two resolutions. We then apply the Lagrange dual method and a tensor-product-based iterative shrinkage-thresholding algorithm to solve for the tensor dictionaries and tensor sparse coefficients. The minimization problem can be rewritten as:
$$\min_{\mathcal{C}} \ f(\mathcal{C}) + \lambda g(\mathcal{C}), \quad (10)$$
where $f(\mathcal{C})$ stands for $\|\mathcal{X}_c - \mathcal{D}_c * \mathcal{C}\|_F^2$ and $g(\mathcal{C})$ stands for $\|\mathcal{C}\|_1$ (the coefficient $\lambda$ can be absorbed into $g$). At the $k$-th iteration,
$$\mathcal{C}^{(k)} = \arg\min_{\mathcal{C}} \ \frac{L}{2} \Big\| \mathcal{C} - \Big( \mathcal{C}^{(k-1)} - \frac{1}{L} \nabla f\big(\mathcal{C}^{(k-1)}\big) \Big) \Big\|_F^2 + \lambda g(\mathcal{C}), \quad (11)$$
where $L$ is a Lipschitz constant of $\nabla f$. Therefore,
$$\mathcal{C}^{(k)} = \mathrm{prox}_{\lambda/L}\Big( \mathcal{C}^{(k-1)} - \frac{1}{L} \nabla f\big(\mathcal{C}^{(k-1)}\big) \Big). \quad (12)$$
The Lipschitz constant $L$ can be obtained from the largest eigenvalue of $\bar{\mathcal{D}}_c^H \bar{\mathcal{D}}_c$, where $\bar{\mathcal{D}}_c$ is the discrete Fourier transform (DFT) of $\mathcal{D}_c$ along the third dimension, and the superscript $H$ denotes the conjugate transpose. In the implemented algorithm for the training process, we use the proximal operator [25] (soft-thresholding) to solve the above equations. We therefore obtain the tensor sparse coefficients by iteratively solving Equ. (12). For learning the dictionary with fixed coefficients , the optimization problem w.r.t. each of the slices of the dictionary becomes
(13)  
(14) 
Transforming the above equations into the frequency domain,
(15) 
(16) 
Therefore, with the Lagrange dual, we obtain
(17) 
Thus, the optimal formulation of satisfies:
(18) 
Therefore,
(19) 
Equ. (19) can be solved with Newton's method. Substituting the derived into Equ. (18), we can then obtain the dictionary through the inverse Fourier transform of .
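The coefficient update of Equ. (12), a gradient step on the quadratic data-fit term followed by the $\ell_1$ proximal operator (soft-thresholding), can be sketched as below. The function names are our own, and `t_prod`/`t_transpose` follow the tensor-product definitions of Section 3.1:

```python
import numpy as np

def t_prod(A, B):
    """t-product of order-3 tensors via FFT along the third axis."""
    Af, Bf = np.fft.fft(A, axis=2), np.fft.fft(B, axis=2)
    return np.real(np.fft.ifft(np.einsum('ikt,kjt->ijt', Af, Bf), axis=2))

def t_transpose(A):
    """Tensor transpose: transpose each frontal slice, reverse slices 2..n3."""
    return np.concatenate([A[:, :, :1], A[:, :, 1:][:, :, ::-1]], axis=2).transpose(1, 0, 2)

def soft_threshold(X, tau):
    """Proximal operator of tau * ||.||_1 (element-wise soft-thresholding)."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def ista_step(C, D, X, lam, L):
    """One iteration of Equ. (12): a gradient step on
    f(C) = ||X - D*C||_F^2, whose gradient is 2 D^T * (D*C - X),
    followed by the l1 prox with step 1/L."""
    grad = 2.0 * t_prod(t_transpose(D), t_prod(D, C) - X)
    return soft_threshold(C - grad / L, lam / L)
```

Iterating `ista_step` until the change in the coefficient tensor falls below a tolerance yields the sparse coefficients for a fixed dictionary.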
4.5 The Generation Phase
In the generation phase, we first generate low-resolution images with the trained DCGAN model directly from latent vectors drawn from a random distribution, and concatenate them to form image tensors . Then, we derive the tensor sparse coefficients from the relationship with the trained dictionary (here "trained" does not mean the dictionary is derived through a training process like the neural networks, but through a specific iterative algorithm; see details in Sections 4.3 and 4.4). Finally, we generate the high-resolution output tensor block with the derived dictionary . The output high-resolution 2-D images are obtained by "unfolding" the generated high-resolution tensor block.
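A toy version of this generation pipeline can be written down directly, with a slice-wise least-squares solve in the Fourier domain standing in for the actual sparse coding step (an illustrative simplification, exact only when the low-resolution dictionary has full column rank):

```python
import numpy as np

def t_prod(A, B):
    """t-product of order-3 tensors via FFT along the third axis."""
    Af, Bf = np.fft.fft(A, axis=2), np.fft.fft(B, axis=2)
    return np.real(np.fft.ifft(np.einsum('ikt,kjt->ijt', Af, Bf), axis=2))

def tensor_super_resolve(X_l, D_l, D_h):
    """Toy generation phase: recover the shared coefficient tensor C from
    the low-resolution block X_l = D_l * C, then synthesize the
    high-resolution block X_h = D_h * C. The real scheme solves a sparse
    coding problem; a per-frequency least-squares solve is used here."""
    Xf = np.fft.fft(X_l, axis=2)
    Df = np.fft.fft(D_l, axis=2)
    Cf = np.empty((D_l.shape[1], X_l.shape[1], X_l.shape[2]), dtype=complex)
    for t in range(X_l.shape[2]):
        Cf[:, :, t] = np.linalg.lstsq(Df[:, :, t], Xf[:, :, t], rcond=None)[0]
    C = np.real(np.fft.ifft(Cf, axis=2))
    return t_prod(D_h, C)
```

With coupled dictionaries trained as in Section 4.3, the same coefficients map a low-resolution block to its high-resolution counterpart.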
5 Performance Evaluation
In this section, we present the results of the proposed TGAN scheme on three datasets: MNIST [26], CIFAR-10 [27], and PASCAL2 VOC [28]. The image sizes of these three datasets applied in our model are (downscaled from the original pixels), respectively.
5.1 Experiments Setting
DCGAN neural network parameters: the generator network has one fully connected layer and three transposed convolutional layers, with the number of filter kernels decreasing by a factor of 2 from to 64 kernels and finally one channel of output images. The discriminator has three convolutional layers, with an increasing number of filter kernels consistent with the generator. We use LeakyReLU [29] with parameter , and strided convolutions of size are used in each convolutional and transposed convolutional layer to avoid max-pooling. The learning rate is set to and stochastic gradient descent is applied with a minibatch size of 32. By default, . The number of directions for pixel-shifting is . The number of iterations is . The sparsity parameter is . The parameter in the Prox method is 0.05. For MNIST data, the original images are of size (size values are set accordingly for the other two datasets), and the downsampling rate of low-resolution images compared with high-resolution images is .
5.2 Inception Score of Generation Results
We adopt the inception score (IS) metric [20][30] to compare the performance of different schemes. The metric compares three kinds of samples: our generated images, images generated by similar generative methods, and the real images. The inception score focuses on comparing the quality and diversity of the generated images. We input every generated image x into Google Inception Net and obtain the conditional label distribution p(y|x), where y denotes the predicted label. Images that contain meaningful objects should have a conditional label distribution with low entropy. The inception score is defined as $\exp\big(\mathbb{E}_x \mathrm{KL}(p(y|x) \,\|\, p(y))\big)$. The comparison results of AAE and our TGAN model are shown in Table 1. The proposed TGAN achieves better results on all three datasets, especially for the large-sized PASCAL2 images (e.g. ). Its inception score of 4.02 on PASCAL2 significantly outperforms the 3.81 of AAE.
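For reference, the inception score can be computed from the matrix of per-image label posteriors as follows; in the paper these posteriors come from Google Inception Net, while here any row-stochastic matrix serves for illustration:

```python
import numpy as np

def inception_score(p_yx, eps=1e-12):
    """Inception score IS = exp(E_x KL(p(y|x) || p(y))) from a matrix of
    label posteriors p(y|x): one row per generated image, one column per
    class. p(y) is estimated as the mean posterior over the images."""
    p_y = p_yx.mean(axis=0, keepdims=True)  # marginal label distribution
    kl = np.sum(p_yx * (np.log(p_yx + eps) - np.log(p_y + eps)), axis=1)
    return float(np.exp(kl.mean()))
```

Confident, diverse predictions (low-entropy rows, high-entropy marginal) drive the score up; its maximum equals the number of classes.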
5.3 Generated Images of TGAN
Some of the testing results on benchmark datasets are shown at the end of the paper. Fig. 3 shows a comparison of MNIST image generation with TGAN and AAE. Among 16 randomly selected images, only 2 of the TGAN-generated images are somewhat hard to recognize, compared with at least 6 of the AAE-generated ones. The TGAN model produces images with more precise features of the digits, which benefits from its concise and efficient representation in tensor space. The effect of tensor super-resolution is shown in Fig. 4 for MNIST images via ablation studies. The images generated with plain DCGAN have much coarser features without the tensor-based super-resolution process, which testifies that tensor super-resolution significantly increases image quality with more convincing details. Fig. 5 and Fig. 6 show the generation results on the PASCAL2 and CIFAR-10 datasets, both testifying to the capability of TGAN in generating images of better quality, especially for large images (e.g. ) in PASCAL2. Images generated with TGAN have more precise features and convincing details than images generated by AAE. This testifies that TGAN preserves spatial structure and local proximity information better than traditional methods. Generally, the DCGAN generates the basic shapes, structures, and colors of images, while the cascaded tensor super-resolution process refines the images with more details.
6 Conclusion
In this paper, we proposed the TGAN scheme that integrates a DCGAN model and tensor super-resolution, and is able to generate large-sized high-quality images. The proposed scheme uses the tensor representation space as the main operation space for image generation, which shows better results than traditional generative models working in image pixel space. Essentially, the adversarial process of TGAN takes place in a tensor space. Note that in the tensor super-resolution process, tensor sparse coding brings several advantages: (i) a smaller dictionary size, which accelerates the training process for deriving the representation dictionary; (ii) a more concise and efficient representation of images, which is verified by the generated images in our experiments. TGAN is superior in preserving spatial structures and local proximity information in images. Accordingly, tensor super-resolution benefits from tensor representation to generate higher-quality images, especially for large images. Our proposed cascading TGAN scheme surpasses the state-of-the-art generative model AAE on three datasets (MNIST, CIFAR-10, and PASCAL2).
References
 [1] Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio, "Generative adversarial nets," in Advances in Neural Information Processing Systems, 2014, pp. 2672–2680.

 [2] Christian Ledig, Lucas Theis, Ferenc Huszár, Jose Caballero, Andrew Cunningham, Alejandro Acosta, Andrew Aitken, Alykhan Tejani, Johannes Totz, Zehan Wang, et al., "Photo-realistic single image super-resolution using a generative adversarial network," in IEEE Conference on Computer Vision and Pattern Recognition, 2017.
 [3] Jun-Yan Zhu, Taesung Park, Phillip Isola, and Alexei A Efros, "Unpaired image-to-image translation using cycle-consistent adversarial networks," in IEEE International Conference on Computer Vision, 2017.

 [4] Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, and Alexei A Efros, "Image-to-image translation with conditional adversarial networks," in IEEE Conference on Computer Vision and Pattern Recognition, 2017.
 [5] Tao Xu, Pengchuan Zhang, Qiuyuan Huang, Han Zhang, Zhe Gan, Xiaolei Huang, and Xiaodong He, "AttnGAN: Fine-grained text to image generation with attentional generative adversarial networks," arXiv preprint arXiv:1711.10485, 2017.

 [6] Iulian Vlad Serban, Alessandro Sordoni, Ryan Lowe, Laurent Charlin, Joelle Pineau, Aaron C Courville, and Yoshua Bengio, "A hierarchical latent variable encoder-decoder model for generating dialogues," in Association for the Advancement of Artificial Intelligence, 2017, pp. 3295–3301.
 [7] Alec Radford, Luke Metz, and Soumith Chintala, "Unsupervised representation learning with deep convolutional generative adversarial networks," arXiv preprint arXiv:1511.06434, 2015.
 [8] Christian Ledig, Lucas Theis, Ferenc Huszár, Jose Caballero, Andrew Cunningham, Alejandro Acosta, Andrew P Aitken, Alykhan Tejani, Johannes Totz, Zehan Wang, et al., "Photo-realistic single image super-resolution using a generative adversarial network," in CVPR, 2017, vol. 2, p. 4.
 [9] Emily L Denton, Soumith Chintala, Rob Fergus, et al., "Deep generative image models using a Laplacian pyramid of adversarial networks," in Advances in Neural Information Processing Systems, 2015, pp. 1486–1494.
 [10] Han Zhang, Tao Xu, Hongsheng Li, Shaoting Zhang, Xiaolei Huang, Xiaogang Wang, and Dimitris Metaxas, "StackGAN: Text to photo-realistic image synthesis with stacked generative adversarial networks," in IEEE International Conference on Computer Vision, 2017, pp. 5907–5915.
 [11] Tamara G Kolda and Brett W Bader, "Tensor decompositions and applications," SIAM Review, vol. 51, no. 3, pp. 455–500, 2009.
 [12] Jiang Fei, Xiao-Yang Liu, Hongtao Lu, and Ruimin Shen, "Efficient multi-dimensional tensor sparse coding using t-linear combinations," in Association for the Advancement of Artificial Intelligence, 2018.
 [13] Jianchao Yang, John Wright, Thomas S Huang, and Yi Ma, "Image super-resolution via sparse representation," IEEE Transactions on Image Processing, vol. 19, no. 11, pp. 2861–2873, 2010.

 [14] Julien Mairal, Francis Bach, Jean Ponce, and Guillermo Sapiro, "Online dictionary learning for sparse coding," in Proceedings of the 26th Annual International Conference on Machine Learning. ACM, 2009, pp. 689–696.
 [15] Julien Mairal, Jean Ponce, Guillermo Sapiro, Andrew Zisserman, and Francis R Bach, "Supervised dictionary learning," in Advances in Neural Information Processing Systems, 2009, pp. 1033–1040.
 [16] Na Qi, Yunhui Shi, Xiaoyan Sun, and Baocai Yin, "TenSR: Multi-dimensional tensor sparse representation," in IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 5916–5925.
 [17] Nadav Cohen, Or Sharir, and Amnon Shashua, “On the expressive power of deep learning: A tensor analysis,” in International Conference on Learning Theory, 2016, pp. 698–728.
 [18] Or Sharir, Ronen Tamari, Nadav Cohen, and Amnon Shashua, “Tractable generative convolutional arithmetic circuits,” arXiv preprint arXiv:1610.04167, 2016.
 [19] Alireza Makhzani, Jonathon Shlens, Navdeep Jaitly, Ian Goodfellow, and Brendan Frey, “Adversarial autoencoders,” in International Conference on Learning Representations, 2016.
 [20] Tim Salimans, Ian Goodfellow, Wojciech Zaremba, Vicki Cheung, Alec Radford, and Xi Chen, “Improved techniques for training gans,” in Advances in Neural Information Processing Systems, 2016, pp. 2234–2242.
 [21] Diederik P Kingma and Max Welling, “Autoencoding variational bayes,” in International Conference on Learning Representations, 2014.
 [22] Shengqi Tan, Yanbo Zhang, Ge Wang, Xuanqin Mou, Guohua Cao, Zhifang Wu, and Hengyong Yu, “Tensorbased dictionary learning for dynamic tomographic reconstruction,” Physics in Medicine & Biology, vol. 60, no. 7, pp. 2803, 2015.

 [23] Ning Hao, Misha E Kilmer, Karen Braman, and Randy C Hoover, "Facial recognition using tensor-tensor decompositions," SIAM Journal on Imaging Sciences, vol. 6, no. 1, pp. 437–463, 2013.
 [24] Bin She, Yaojun Wang, Jiandong Liang, Zhining Liu, Chengyun Song, and Guangmin Hu, "A data-driven AVO inversion method via learned dictionaries and sparse representation," Geophysics, vol. 83, no. 6, pp. 1–91, 2018.
 [25] Neal Parikh, Stephen Boyd, et al., “Proximal algorithms,” Foundations and Trends® in Optimization, vol. 1, no. 3, pp. 127–239, 2014.
 [26] Yann LeCun, Léon Bottou, Yoshua Bengio, and Patrick Haffner, “Gradientbased learning applied to document recognition,” Proceedings of the IEEE, vol. 86, no. 11, pp. 2278–2324, 1998.
 [27] Alex Krizhevsky and Geoffrey Hinton, “Learning multiple layers of features from tiny images,” Tech. Rep., Citeseer, 2009.
 [28] Mark Everingham, Luc Van Gool, Christopher K. Williams, John Winn, and Andrew Zisserman, "The PASCAL visual object classes (VOC) challenge," Int. J. Comput. Vision, vol. 88, no. 2, pp. 303–338, June 2010.
 [29] Bing Xu, Naiyan Wang, Tianqi Chen, and Mu Li, “Empirical evaluation of rectified activations in convolutional network,” arXiv preprint arXiv:1505.00853, 2015.
 [30] Christian Szegedy, Vincent Vanhoucke, Sergey Ioffe, Jon Shlens, and Zbigniew Wojna, “Rethinking the inception architecture for computer vision,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 2818–2826.