Face aging, also known as age progression [Shu et al. 2015], aims at rendering a given face image with aging effects while preserving personalized features. The applications of face aging techniques range from social security to digital entertainment, such as predicting the contemporary appearance of missing children and cross-age identity verification.
Due to the practical value of face aging, many approaches have been proposed to address this problem in the last two decades [Lanitis, Taylor, and Cootes 2002; Tazoe et al. 2012; Suo et al. 2010; Tiddeman, Burt, and Perrett 2001; Kemelmacher-Shlizerman, Suwajanakorn, and Seitz 2014]. With the rapid development of deep learning, deep generative models have been adopted to synthesize aged face images [Wang et al. 2016; Duong et al. 2016; Duong et al. 2017]. However, the most critical problem of these methods is that multiple face images of the same person at different ages are required at the training stage, which is expensive to collect in practice.
To overcome this problem, most recent studies train face aging models with unpaired data [Zhang, Song, and Qi 2017; Yang et al. 2018; Li et al. 2018]. However, face aging with unpaired training data is inherently an ill-posed problem, where an input young face image may correspond to many aged candidate images. This intrinsic matching ambiguity leads to seriously flawed generation results with defects including ghosting artifacts and incorrect facial attributes. For example, as shown in Fig. 1 (a), a beard or moustache is mistakenly attached to each input female face image. This is because the model learns that growing a beard is a typical sign of aging but has no way to know that this does not happen to a woman, as face images of young women could be paired up with elderly male face images when training with unpaired aging data. As another example, in Fig. 1 (b), make-up has been put on the input faces and feminine facial features have emerged in the outputs, demonstrating the undesired effect of learning age progression from face image pairs with mismatched attributes.
Although obvious unnatural changes of facial attributes are observed in the output faces in Fig. 1, face verification still judges them to depict the same person as the corresponding input images. In other words, well maintained identity-related features do not imply reasonable aging results when training with unpaired data. Therefore, merely enforcing identity consistency, as most other recent studies do [Zhang, Song, and Qi 2017; Antipov, Baccouche, and Dugelay 2017; Yang et al. 2018], is insufficient to eliminate matching ambiguities in unpaired training data, and thus fails to achieve satisfactory face aging performance.
To solve the above-mentioned issues, in this paper, we propose a novel framework based on generative adversarial networks (GANs). Different from existing methods in the literature, we embed facial attribute vectors in the model so that the generator could be guided to consider semantic attribute information when learning age mappings from unpaired data, and output elderly face images with attributes faithful to each corresponding input.
Furthermore, to enhance aging details, we employ wavelet packet transform to extract features at multiple scales in the frequency domain, based on the observation that signs of aging are mainly represented by wrinkles, laugh lines, and eye bags, which can be treated as local textures. Moreover, a residual connection is employed in the generator so that it focuses more on simulating aging effects and is less distracted by visual content irrelevant to age progression.
The main contributions of our work could be summarized as follows:
- Facial attributes are incorporated into our method for face aging, since identity preservation alone is insufficient for generating reasonable results. To the best of our knowledge, this is the first work to incorporate facial attributes into face aging.
- Wavelet packet transform is adopted to extract features of texture details at multiple scales in the frequency domain for generating fine-grained aging effects. In addition, a residual connection is employed in the generator to make the framework concentrate on modeling the difference between young and aged faces.
- Extensive experiments demonstrate the capability of the proposed method in rendering accurate aging effects and preserving information related to identity and facial attributes. Quantitative results indicate that our method achieves state-of-the-art performance.
Related Work

In the last few decades, face aging has been a very popular research topic and a great number of algorithms have been proposed to tackle it. In general, these methods could be divided into three categories: physical model-based methods, prototype-based methods, and deep learning-based methods.
Physical model-based methods mechanically simulate the changes of facial appearance over time by modeling the anatomical structure of human faces. As an early attempt, Todd et al. [Todd et al. 1980] modeled the transition of facial appearance with a revised cardioidal strain transformation. Subsequent works investigated the problem from various biological aspects including muscles and overall facial structures [Lanitis, Taylor, and Cootes 2002; Tazoe et al. 2012]. However, physical model-based algorithms are computationally expensive, and a large number of image sequences of the same subject is required to model aging effects.
Data-driven prototyping approaches [Suo et al. 2010; Tiddeman, Burt, and Perrett 2001; Kemelmacher-Shlizerman, Suwajanakorn, and Seitz 2014] came next, where faces are divided into age groups and each group is represented by an average face (prototype) computed from the training data. Transition patterns between prototypes are then regarded as aging effects. The main problem of prototyping methods is that personalized facial features are eliminated when calculating average faces, so identity information is not well preserved.
In recent years, deep generative models with temporal architectures have been adopted to synthesize images of elderly faces [Wang et al. 2016; Duong et al. 2016; Duong et al. 2017]. However, most of these works require a face image sequence over a long age span for each subject, which limits their potential in practical use. With the success of adversarial training in generating visually appealing images, many efforts have been made to tackle face aging using GANs [Goodfellow et al. 2014]. Zhang et al. [Zhang, Song, and Qi 2017] proposed a conditional adversarial autoencoder (CAAE) to achieve age progression and regression by traversing a low-dimensional manifold. The work most similar to ours is [Yang et al. 2018], which proposes a GAN-based model with a pyramid architecture and adopts an identity loss to achieve permanence. However, besides preserving identity information during the aging process, we focus on alleviating the influence of the matching ambiguity of unpaired training samples and ensuring attribute consistency by embedding facial attribute vectors in the model.
Due to the intrinsic matching ambiguity among unpaired training face images, undesired transitions in facial attributes may occur during the aging process and cannot be eliminated by merely applying constraints on identity consistency. To solve this problem, we propose a face aging model that takes young face images and their corresponding attribute vectors as input, and generates elderly face images with attributes in agreement with the input ones. To better simulate signs of aging, a residual connection is incorporated in the generator so that the model focuses more on the difference between young and old faces, and a wavelet packet transform module is adopted in the discriminator to capture multi-scale textural features of aging effects. An overview of the proposed framework is presented in Fig. 2.
Facial Attribute Embedded Generator
Since aging data of the same subject over a long time span is expensive to collect, most existing GAN-based studies on face aging train models with unpaired samples [Zhang, Song, and Qi 2017; Yang et al. 2018; Li et al. 2018]. Although constraints on identity information and pixel values are usually imposed to restrict modifications made to input images, facial attributes may still undergo unnatural transitions (as shown in Fig. 1) due to the mismatching of unpaired training images. To solve this problem, we incorporate high-level semantic attribute information in the face aging model to regularize image translation patterns and reduce the ambiguity of mappings between unpaired young and aged faces.
An intuitive and straightforward way of including attribute information would be adding an ‘attribute loss’ parallel to the discriminator loss to reduce the gap between output and input images in attribute space. However, this method still suffers from one-to-many correspondences of unpaired training data as the generator is unable to learn what semantic information is required to be preserved.
Rather than incorporating facial attribute information as an additional loss term, we embed the attribute vector in the generator so that semantic facial information is well considered in the generation process, encouraging the model to produce face images with consistent attributes more effectively. Specifically, we employ an hourglass-shaped fully convolutional network as the generator, which has achieved success in previous image translation studies [Johnson, Alahi, and Fei-Fei 2016; Zhu et al. 2017; Yang et al. 2018]. It consists of an encoder network, a decoder network, and four residual blocks in between as the bottleneck. The input facial attribute vector is replicated and concatenated to the output blob of the last residual block, as they both contain high-level semantic features. After the combination, the decoder network transforms the concatenated feature blob back to the image space. In this way, both semantic features extracted from visual data and attribute information encoded explicitly by the attribute vector are taken into consideration in the generation process.
Since age progression could be considered as rendering aging effects conditioned on the input young face image, we add the input image to the output of the decoder to form a residual connection. Compared to learning to synthesize the whole face image, this structure automatically makes the generator focus more on modeling the difference between faces from different age groups, namely the representative signs of aging, and less likely to be distracted by visual content irrelevant to aging, such as background. Finally, the scale of the output tensor is normalized by a hyperbolic tangent (tanh) mapping to obtain the generated elderly face image.
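As a concrete illustration, the replicate-and-concatenate step and the residual output described above can be sketched in NumPy (channels-first layout; the tensor sizes and the two-entry attribute vector here are illustrative, not the actual network dimensions):

```python
import numpy as np

def embed_attributes(features, attr):
    """Spatially replicate an attribute vector and concatenate it to a
    feature map along the channel axis (channels-first layout)."""
    c, h, w = features.shape
    attr = np.asarray(attr, dtype=features.dtype)
    # Tile each attribute entry into an h x w plane.
    planes = np.broadcast_to(attr[:, None, None], (attr.size, h, w))
    return np.concatenate([features, planes], axis=0)

def residual_output(young_image, decoder_out):
    """Residual connection: the decoder only models aging changes,
    which are added to the input and squashed by tanh."""
    return np.tanh(young_image + decoder_out)

# Toy bottleneck feature map (8 channels, 4x4) and a 2-d attribute
# vector (e.g. encoding gender and race).
feat = np.random.randn(8, 4, 4)
combined = embed_attributes(feat, [1.0, 0.0])
print(combined.shape)  # (10, 4, 4)
```

The decoder then consumes `combined`, so every spatial location sees the attribute code alongside the visual features.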
Wavelet-based Discriminator

To encourage generated face images to be indistinguishable from real ones, a discriminator is employed to tell whether a given face image is generic or not. Considering that typical signs of aging, such as wrinkles, laugh lines, and eye bags, could be regarded as local image textures, we adopt wavelet packet transform (WPT, see Fig. 3) to capture age-related textural features.
Specifically, in the proposed model, multi-level WPT is performed to provide a more comprehensive analysis of textures in the given image, and wavelet coefficients at each decomposition level are fed into a convolutional pathway of the discriminator. To enable the discriminator to tell whether a generated image is faithful to the attributes of the input young image, the input attribute vector is also replicated and concatenated to the output of an intermediate convolutional block of each pathway. At the end of the discriminator, the same-sized outputs of all pathways are fused into a single tensor, and the adversarial loss is then estimated against the label tensor.
Compared to extracting multi-scale features with a sequence of convolutional layers as in [Yang et al. 2018], the advantage of using WPT is that the computational cost is significantly reduced, since calculating wavelet coefficients could be regarded as forwarding through a single convolutional layer. WPT therefore greatly reduces the number of convolutions performed in each forward pass and thus speeds up training. Although this part of the model has been simplified, it still retains the advantage of multi-scale image texture analysis, which helps improve the visual fidelity of generated images.
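To make the single-convolution view concrete, here is a minimal NumPy sketch of a multi-level wavelet packet transform using orthonormal 2x2 Haar filters (the choice of Haar basis is illustrative; the paper does not specify the wavelet). Each level is exactly a fixed-weight, stride-2 convolution with four kernels, applied recursively to every subband:

```python
import numpy as np

# Orthonormal 2x2 Haar analysis filters: approximation (LL) and
# horizontal/vertical/diagonal detail subbands (LH, HL, HH).
HAAR = {
    "LL": 0.5 * np.array([[1.0, 1.0], [1.0, 1.0]]),
    "LH": 0.5 * np.array([[1.0, 1.0], [-1.0, -1.0]]),
    "HL": 0.5 * np.array([[1.0, -1.0], [1.0, -1.0]]),
    "HH": 0.5 * np.array([[1.0, -1.0], [-1.0, 1.0]]),
}

def haar_step(img):
    """One decomposition level: four 2x2 filters applied with stride 2,
    i.e. a single fixed-weight convolutional layer with four kernels."""
    h, w = img.shape
    out = []
    for k in HAAR.values():
        sub = np.zeros((h // 2, w // 2))
        for i in range(0, h, 2):
            for j in range(0, w, 2):
                sub[i // 2, j // 2] = np.sum(img[i:i + 2, j:j + 2] * k)
        out.append(sub)
    return out

def wpt(img, levels):
    """Wavelet packet transform: unlike the plain wavelet transform,
    every subband (not only LL) is decomposed recursively."""
    bands = [img]
    for _ in range(levels):
        bands = [sub for band in bands for sub in haar_step(band)]
    return bands

img = np.random.randn(32, 32)
coeffs = wpt(img, levels=2)
print(len(coeffs), coeffs[0].shape)  # 16 (8, 8)
```

Because the Haar filters are orthonormal, the transform preserves signal energy, so no texture information is lost when feeding the subbands to the discriminator pathways.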
Overall Objective Functions
Training a GAN model simulates the process of optimizing a minimax two-player game between the generator G and the discriminator D. Unlike regular GANs [Goodfellow et al. 2014], we adopt a least squares loss instead of the negative log likelihood loss, so that the margin between generated samples and the decision boundary is minimized in the feature space, which further improves the quality of synthesized images [Mao et al. 2017]. In practice, we pair up young face images x and their corresponding attribute vectors α, denoted as (x, α), and take them as input to the model. Only generic aged faces with attributes matching the input are considered positive samples, and real young faces are regarded as negative samples to help D gain discriminating ability on aging effects.
Mathematically, the objective functions for D and G could be written as follows:

$$\mathcal{L}_{GAN\_D} = \mathbb{E}_{y \sim P_{old}}\big[(D(y, \alpha) - 1)^2\big] + \mathbb{E}_{x \sim P_{young}}\big[D(G(x, \alpha), \alpha)^2\big] + \mathbb{E}_{x \sim P_{young}}\big[D(x, \alpha)^2\big],$$

$$\mathcal{L}_{GAN\_G} = \mathbb{E}_{x \sim P_{young}}\big[(D(G(x, \alpha), \alpha) - 1)^2\big],$$

where $P_{young}$ and $P_{old}$ denote the distributions of generic face images of young and old subjects, respectively.
In addition, a pixel loss and an identity loss are adopted to maintain consistency at both the image level and the personalized feature level. Specifically, we utilize the VGG-Face descriptor [Parkhi et al. 2015], denoted by $\phi$, to extract the identity-related semantic representation of a face image. These two loss terms could be formulated as

$$\mathcal{L}_{pix} = \lVert G(x, \alpha) - x \rVert_2^2, \qquad \mathcal{L}_{id} = \lVert \phi(G(x, \alpha)) - \phi(x) \rVert_2^2.$$
In conclusion, the overall objective functions of the proposed model could be written as

$$\mathcal{L}_G = \mathcal{L}_{GAN\_G} + \lambda_i \mathcal{L}_{id} + \lambda_p \mathcal{L}_{pix}, \qquad \mathcal{L}_D = \mathcal{L}_{GAN\_D},$$

where $\lambda_i$ and $\lambda_p$ are coefficients balancing the importance of critics on identity and pixels, respectively. We optimize the model by minimizing $\mathcal{L}_G$ and $\mathcal{L}_D$ alternately until optimality is reached.
Datasets

MORPH [Ricanek and Tesafaye 2006] is a large aging dataset containing 55,000 face images of more than 13,000 subjects. Data samples in MORPH are color images of near-frontal faces exhibiting neutral expressions, captured under uniform and moderate illumination with simple backgrounds. CACD [Chen, Chen, and Hsu 2015] contains 163,446 face images of 2,000 celebrities captured in less controlled conditions. Besides large variations in pose, illumination, and expression (PIE variations), images in CACD are collected via Google Image Search, making it a very challenging dataset due to mismatches between the actual face presented in each image and the associated labels (name and age).
As for facial attributes, MORPH provides labels including age, gender, and race for each image. We choose 'gender' and 'race' as the attributes to be preserved, since these two attributes are guaranteed to remain unchanged during the natural aging process and are relatively objective compared to attributes such as 'attractive' or 'chubby'. Researchers may choose whichever attributes to preserve according to the task at hand. For CACD, since face images of races other than 'white' make up only a small portion of the entire dataset, we only select 'gender' as the attribute to preserve. Specifically, we go through the name list of the celebrities and label the corresponding images accordingly. This introduces noise in the gender labels due to the mismatching between the annotated name and the actual face presented in each image, which further increases the difficulty for our method to achieve good performance on this dataset.
All face images are cropped and aligned according to the five facial landmarks detected by MTCNN [Zhang et al. 2016]. Following the convention in [Yang et al. 2018; Li et al. 2018], we divide the face images into four age groups, i.e., 30-, 31-40, 41-50, and 51+, and only consider translations from 30- to the other three age groups. To evaluate the performance of the proposed method objectively and make quantitative results comparable with previous work, all metric measurements are conducted via the public APIs of Face++ (http://www.faceplusplus.com).
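The age grouping above is a simple binning of ages; a sketch:

```python
def age_group(age):
    """Map an integer age to one of the four groups used in this work:
    '30-', '31-40', '41-50', '51+'."""
    if age <= 30:
        return "30-"
    elif age <= 40:
        return "31-40"
    elif age <= 50:
        return "41-50"
    return "51+"

print([age_group(a) for a in (25, 30, 31, 45, 51)])
# ['30-', '30-', '31-40', '41-50', '51+']
```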
Implementation Details

We choose Adam as the optimizer of both G and D, with the batch size set to 16. The pixel-level critic is applied every 5 iterations, and D is updated at every iteration. As for the trade-off parameters, we set $\lambda_p$ to 0.1 after a rough parameter search, and $\lambda_i$ to 10 to make the identity loss of the same order of magnitude as the pixel loss. All experiments share the same hyper-parameter setting and are conducted under 5-fold cross validation on an Nvidia Titan Xp GPU.
Table 1: Deviation from the mean age of generic images (in absolute value), reported for age groups 30-, 31-40, 41-50, and 51+ on each dataset. [Table values not recovered.]
Table 2: Verification rate (%) between young and aged faces (threshold = 76.5, FAR = 1e-5); the left block of columns reports Morph, the right block CACD.

| Age group | 31-40 | 41-50 | 51+ | 31-40 | 41-50 | 51+ |
|---|---|---|---|---|---|---|
| 31-40 | - | 95.47 | 89.53 | - | 91.74 | 90.54 |
| 41-50 | - | - | 90.50 | - | - | 91.12 |
Qualitative Results of Face Aging
Sample results on Morph and CACD are shown in Fig. 4. It could be seen that our method is able to simulate translations between age groups and synthesize elderly face images with high visual fidelity. In addition, our method is robust to variations in terms of race, gender, expression, and occlusion.
Performance comparison with prior work on Morph is shown in Fig. 5. Clearly, the traditional face aging methods CONGRE [Suo et al. 2012] and HFA [Yang et al. 2016] only render subtle aging effects within a tight facial area, failing to accurately simulate the aging process. In contrast, the GAN-based methods GLCA-GAN [Li et al. 2018] and the GAN with pyramid architecture proposed in [Yang et al. 2018], referred to as PAG-GAN, achieve significant improvements in the quality of generation results. Our method generates face images of higher resolution with enhanced details compared to GLCA-GAN, and reduces ghosting artifacts compared to PAG-GAN (e.g., finer details of hair and beard).
Aging Accuracy and Identity Preservation
In this subsection, we report evaluation results on aging accuracy and identity preservation. We compare the performance of the proposed model with the previous state-of-the-art methods CAAE [Zhang, Song, and Qi 2017] and PAG-GAN [Yang et al. 2018]. For CAAE, we produce results using the code provided by the authors with default parameters; for PAG-GAN, we report the results from their paper.
Aging Accuracy: Age distributions of both generic and synthetic faces in each age group are estimated, where less discrepancy indicates a more accurate simulation of the aging effect between two age groups. On Morph and CACD, face images of age 30 or below are used as testing samples, and their corresponding aged faces in the other three age groups (31-40, 41-50, 51+) are synthesized. We estimate the apparent age of both the generated results and the natural face images in the dataset for comparison.
Age estimation results on Morph and CACD are shown in Table 1 and Fig. 6. We compare our method with previous works in terms of deviations between mean ages. On Morph, the estimated age distributions of synthetic elderly face images match those of natural images well in all age groups. Our method consistently outperforms PAG-GAN in translations to all three age groups, demonstrating its effectiveness. Signs of aging in the results of CAAE are not obvious enough, leading to large age estimation errors. On CACD, due to the mismatching between face images and associated labels, a slight performance drop could be observed. Still, the proposed method achieves results comparable to the previous state of the art. This shows that our method is relatively robust to noise in attribute labels, which lowers the requirement on the accuracy of the prior attribute detection process.
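The metric behind Table 1, the gap between mean estimated ages of synthetic and generic faces, is straightforward to compute; in this sketch the age estimator (the Face++ API in the paper) is replaced by plain illustrative numbers:

```python
import numpy as np

def mean_age_deviation(ages_synthetic, ages_generic):
    """Absolute gap between the mean estimated age of synthetic faces
    and that of generic faces from the same age group: smaller means
    more accurate aging simulation."""
    return float(abs(np.mean(ages_synthetic) - np.mean(ages_generic)))

# Illustrative estimated ages, not real measurements.
print(mean_age_deviation([42.0, 44.0], [43.0, 45.0]))  # 1.0
```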
Table 4: Ablation results on Morph: gender and race preservation rates (%) and deviations of age distributions (in absolute value), for the translations to 31-40, 41-50, and 51+.

| Variant | Gender 31-40 | Gender 41-50 | Gender 51+ | Race 31-40 | Race 41-50 | Race 51+ | Age dev. 31-40 | Age dev. 41-50 | Age dev. 51+ |
|---|---|---|---|---|---|---|---|---|---|
| woFAE / woWPT | 95.72 | 94.21 | 93.60 | 95.04 | 93.55 | 90.83 | 0.44 | 1.72 | 3.03 |
| woFAE / wWPT | 96.15 | 94.90 | 93.61 | 93.89 | 88.63 | 90.21 | 0.68 | 0.41 | 2.31 |
| wFAE / woWPT | 97.21 | 96.91 | 95.85 | 95.22 | 94.35 | 91.43 | 0.82 | 0.52 | 4.82 |
Table 5: Ablation results on Morph: verification rates (%) between young faces and their synthesized aged counterparts for each translation.

| Variant | 31-40 | 41-50 | 51+ |
|---|---|---|---|
| woFAE / woWPT | 100.00 | 100.00 | 99.92 |
| woFAE / wWPT | 100.00 | 99.88 | 98.06 |
| wFAE / woWPT | 100.00 | 100.00 | 98.86 |
Identity Preservation: Face verification experiments are conducted to investigate whether the identity information has been preserved during the face aging process. Similar to previous literature, comparisons between synthetic elderly face images from different age groups of the same subject are also conducted to inspect if the identity information is consistent among three separately trained translations.
Results of the face verification experiments are shown in Table 2. On Morph, our method achieves the highest verification rate on all three translations and outperforms other approaches by a clear margin, especially in mapping face images from 30- to 51+, demonstrating that the proposed method successfully achieves identity permanence during face aging. On the more challenging CACD dataset containing mismatched labels, the performance of our method is comparable to PAG-GAN with only minor differences. Notably, as the time interval between two face images of a single subject increases, both verification confidence and accuracy decrease, which is reasonable as greater changes in facial appearance may occur as more time elapses.
Facial Attribute Consistency
We evaluate facial attribute preservation by comparing perceived facial attributes before and after age progression; results are listed in Table 3. On Morph, the facial attributes of the majority of testing samples are well preserved during the aging process. Notably, as with identity preservation, the preservation rates of both gender and race decrease as the time interval between two face images becomes larger, indicating that it is harder to achieve consistency between data distributions with larger discrepancy. A similar trend could be observed on the CACD dataset. However, there is a clear gap in gender preservation performance between Morph and CACD, reflecting the influence of mistakenly labeled data samples.
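The preservation rate reduces to a simple match rate between attributes perceived before and after aging; a sketch (in the paper the perceived attributes come from the Face++ API, mocked here by string labels):

```python
def preservation_rate(attrs_before, attrs_after):
    """Percentage of samples whose perceived attribute (e.g. gender)
    stays the same after age progression."""
    matches = sum(a == b for a, b in zip(attrs_before, attrs_after))
    return 100.0 * matches / len(attrs_before)

# One of four samples flips gender after aging -> 75% preserved.
print(preservation_rate(["F", "M", "F", "F"], ["F", "M", "M", "F"]))  # 75.0
```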
Ablation Study

In this subsection, experiments are conducted to fully explore the contributions of facial attribute embedding (FAE) and wavelet packet transform (WPT) in simulating age translations. We investigate the impact of including/excluding attribute embedding (w/wo FAE, and similarly for WPT) on the age distribution, face verification rate, and attribute preservation rate. All experiments in this subsection are conducted only on Morph, as labels are inaccurate on the CACD dataset. Results are shown in Tables 4 and 5 (please refer to the supplemental material for visual illustrations).
According to results in Table 4, introducing attribute embedding (wFAE) increases the preservation rates for both ‘gender’ and ‘race’ under all three age translations, especially in the case of mapping to 51+. This proves the effectiveness of attribute embedding as it aligns unpaired age data in terms of facial attributes and thus reduces the intrinsic ambiguity in data mapping.
In addition, it is clear that adopting WPT reduces the discrepancies between the age distributions of generic and synthetic images. However, WPT provides little help in maintaining facial attribute consistency. This is because WPT only captures features from low-level visual data and cannot bridge the semantic gap, so the framework still suffers from mismatched data samples.
Combining the results in Tables 4 and 5, it could be seen that while attribute preservation rates still have room for improvement, verification rates are nearly perfect. This observation validates our statement that identity preservation does not guarantee that facial attributes remain stable during the aging process. Therefore, besides constraints on identity, supervision on facial attributes is also necessary to reduce the intrinsic matching ambiguity of unpaired data and achieve satisfactory face aging results.
Conclusion

In this paper, we propose a novel GAN-based framework to synthesize aged face images. Since identity constraints are ineffective in reducing the matching ambiguity of unpaired aging data, we propose to employ facial attributes to tackle this issue. Specifically, we embed facial attribute vectors into both the generator and discriminator to encourage generated images to be faithful to the facial attributes of the corresponding input image. To further improve the visual fidelity of generated face images, wavelet packet transform is introduced to efficiently extract textural features at multiple scales, and a residual connection is employed in the generator to make the model concentrate on aging effects. Extensive experiments are conducted on Morph and CACD; qualitative results demonstrate that our method could synthesize lifelike face images robust to both PIE variations and noisy labels, and quantitative results obtained via public APIs validate the effectiveness of the proposed method in aging accuracy as well as identity and attribute preservation.
References

- [Antipov, Baccouche, and Dugelay 2017] Antipov, G.; Baccouche, M.; and Dugelay, J.-L. 2017. Face aging with conditional generative adversarial networks. IEEE International Conference on Image Processing (ICIP) 2089–2093.
- [Chen, Chen, and Hsu2015] Chen, B.-C.; Chen, C.-S.; and Hsu, W. H. 2015. Face recognition and retrieval using cross-age reference coding with cross-age celebrity dataset. IEEE Transactions on Multimedia (TMM) 17(6):804–815.
- [Duong et al. 2016] Duong, C. N.; Luu, K.; Quach, K. G.; and Bui, T. D. 2016. Longitudinal face modeling via temporal deep restricted boltzmann machines. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 5772–5780.
- [Duong et al.2017] Duong, C. N.; Quach, K. G.; Luu, K.; Le, T. H. N.; and Savvides, M. 2017. Temporal non-volume preserving approach to facial age-progression and age-invariant face recognition. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), 3755–3763.
- [Goodfellow et al.2014] Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; and Bengio, Y. 2014. Generative adversarial nets. In Advances in neural information processing systems (NIPS), 2672–2680.
- [Johnson, Alahi, and Fei-Fei 2016] Johnson, J.; Alahi, A.; and Fei-Fei, L. 2016. Perceptual losses for real-time style transfer and super-resolution. In European Conference on Computer Vision (ECCV), 694–711.
- [Kemelmacher-Shlizerman, Suwajanakorn, and Seitz2014] Kemelmacher-Shlizerman, I.; Suwajanakorn, S.; and Seitz, S. M. 2014. Illumination-aware age progression. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 3334–3341.
- [Lanitis, Taylor, and Cootes2002] Lanitis, A.; Taylor, C. J.; and Cootes, T. F. 2002. Toward automatic simulation of aging effects on face images. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) 24(4):442–455.
- [Li et al.2018] Li, P.; Hu, Y.; Li, Q.; He, R.; and Sun, Z. 2018. Global and local consistent age generative adversarial networks. In International Conference on Pattern Recognition (ICPR).
- [Mao et al.2017] Mao, X.; Li, Q.; Xie, H.; Lau, R. Y.; Wang, Z.; and Smolley, S. P. 2017. Least squares generative adversarial networks. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2813–2821.
- [Parkhi et al.2015] Parkhi, O. M.; Vedaldi, A.; Zisserman, A.; et al. 2015. Deep face recognition. In British Machine Vision Conference (BMVC), 41.1–41.12.
- [Ricanek and Tesafaye2006] Ricanek, K., and Tesafaye, T. 2006. Morph: A longitudinal image database of normal adult age-progression. In the International Conference on Automatic Face and Gesture Recognition (FG), 341–345.
- [Shu et al.2015] Shu, X.; Tang, J.; Lai, H.; Liu, L.; and Yan, S. 2015. Personalized age progression with aging dictionary. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), 3970–3978.
- [Suo et al.2010] Suo, J.; Zhu, S.-C.; Shan, S.; and Chen, X. 2010. A compositional and dynamic model for face aging. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) 32(3):385–401.
- [Suo et al.2012] Suo, J.; Chen, X.; Shan, S.; Gao, W.; and Dai, Q. 2012. A concatenational graph evolution aging model. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) 34(11):2083–2096.
- [Tazoe et al.2012] Tazoe, Y.; Gohara, H.; Maejima, A.; and Morishima, S. 2012. Facial aging simulator considering geometry and patch-tiled texture. In ACM SIGGRAPH.
- [Tiddeman, Burt, and Perrett2001] Tiddeman, B.; Burt, M.; and Perrett, D. 2001. Prototyping and transforming facial textures for perception research. IEEE Computer graphics and applications 21(5):42–50.
- [Todd et al.1980] Todd, J. T.; Mark, L. S.; Shaw, R. E.; and Pittenger, J. B. 1980. The perception of human growth. Scientific American 242(2):132–145.
- [Wang et al.2016] Wang, W.; Cui, Z.; Yan, Y.; Feng, J.; Yan, S.; Shu, X.; and Sebe, N. 2016. Recurrent face aging. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2378–2386.
- [Yang et al.2016] Yang, H.; Huang, D.; Wang, Y.; Wang, H.; and Tang, Y. 2016. Face aging effect simulation using hidden factor analysis joint sparse representation. IEEE Transactions on Image Processing (TIP) 25(6):2493–2507.
- [Yang et al.2018] Yang, H.; Huang, D.; Wang, Y.; and Jain, A. K. 2018. Learning face age progression: A pyramid architecture of gans. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 31–39.
- [Zhang et al.2016] Zhang, K.; Zhang, Z.; Li, Z.; and Qiao, Y. 2016. Joint face detection and alignment using multitask cascaded convolutional networks. IEEE Signal Processing Letters 23(10):1499–1503.
- [Zhang, Song, and Qi2017] Zhang, Z.; Song, Y.; and Qi, H. 2017. Age progression/regression by conditional adversarial autoencoder. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 4352–4360.
- [Zhu et al. 2017] Zhu, J.-Y.; Park, T.; Isola, P.; and Efros, A. A. 2017. Unpaired image-to-image translation using cycle-consistent adversarial networks. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2242–2251.