FA-GANs: Facial Attractiveness Enhancement with Generative Adversarial Networks on Frontal Faces

05/17/2020 · Jingwu He, et al. · Nanjing University · Megvii Technology Limited

Facial attractiveness enhancement has been an interesting application in computer vision and graphics in recent years. It aims to generate a more attractive face by manipulating the image and geometry structure while preserving face identity. In this paper, we propose the first Generative Adversarial Networks (GANs) for enhancing facial attractiveness in both geometry and appearance aspects, which we call "FA-GANs". FA-GANs contain two branches and enhance facial attractiveness from two perspectives: facial geometry and facial appearance. Each branch consists of an individual GAN, with the appearance branch adjusting the facial image and the geometry branch adjusting the facial landmarks. Unlike traditional facial manipulation methods that learn from paired faces, which are infeasible to collect before and after enhancement of the same individual, we learn the features of attractive faces through unsupervised adversarial learning. The proposed FA-GANs are able to extract attractiveness features and impose them on the enhancement results. To better enhance faces, the geometry and appearance networks refine facial attractiveness by independently adjusting the geometry layout and the appearance of faces. To the best of our knowledge, we are the first to enhance facial attractiveness with GANs in both geometry and appearance aspects. The experimental results suggest that our FA-GANs can generate compelling perceptual results in both geometry structure and facial appearance and outperform current state-of-the-art methods.


I Introduction

Related research has shown that human faces tend to make a powerful first impression and thus may influence later social behaviors. The advantages brought by attractive faces have been proven in many scientific studies [30]. Consequently, a growing number of people enhance their facial attractiveness in daily life.

Fig. 1: Exemplar results of our FA-GANs and the mobile APP of Meitu for facial attractiveness enhancement. The original faces, the geometry adjusted faces, the appearance adjusted faces, the Meitu adjusted faces, and the FA-GANs adjusted faces are shown from left to right, respectively.

Enhancing facial attractiveness is a process of adjusting a given face in the sense of aesthetics. Besides its academic purpose, it also has a wide market in the entertainment industry as a way to beautify portraits. Many existing apps, e.g., Meitu and B612, allow people to manually adjust their portraits after taking photos, in both the appearance and geometry aspects. In this paper, we aim to automate this process, relieving people of the detailed adjustment work and enhancing attractiveness in a data-driven manner.

A key challenge in our task is the lack of paired data. It is infeasible to collect paired faces of the same individual before and after enhancement for supervised learning, due to the high cost in money and time. Furthermore, people define facial attractiveness in different ways, and their answers vary greatly when asked to make the concept explicit. However, a universal criterion does exist in some ways: numerous actresses and actors are recognized by the majority for their beauty or handsomeness. Therefore, it is feasible to explore the relationship between ordinary and attractive faces to make the enhancement, even though it might be difficult to build a set of succinct rules defining facial attractiveness.

Over the last decades, facial manipulation methods have roughly fallen into two categories. Methods in the first category resort to traditional computer graphics techniques to warp the input faces directly, and employ computer vision techniques to alter facial appearance by makeup. To enhance the attractiveness of 2D facial images, Leyvand et al. [15] introduce a method that learns the distances among key points and adjusts the facial image with a multilevel free-form deformation technique. Li et al. [16] simulate makeup through adaptations of physically-based reflectance models. For 3D face models with both geometry and texture information, Liao et al. [22] propose a sequence of manipulations concerning symmetry, frontal face proportion, and facial angular profile proportion correction. Methods in the other category generate the desired face by building generative models. The representative methods are deep generative networks, which have exhibited a remarkable capability in facial image generation. These methods mainly alter facial attractiveness with a reference image to transfer the makeup and adjust the appearance.

The geometry structure of a face plays a critical role in facial attractiveness. However, no unified framework exists for improving attractiveness in both the geometry and appearance aspects. In this paper, we propose facial attractiveness enhancement Generative Adversarial Networks (GANs), called FA-GANs, to enhance attractiveness by altering the input face in both aspects. Given the difficulty of paired data collection, we propose a data-driven approach for enhancing face attractiveness without requiring paired faces as input. We collect portraits of famous beautiful/handsome people and of ordinary people as the attractive and unattractive faces, respectively. The facial attractiveness rules are explored by learning the feature maps of attractiveness ranking networks trained on the collected faces.

FA-GANs consist of two branches used for enhancement in the geometry and appearance aspects, respectively, and both branches have the structure of GANs. For the geometry branch, we employ the facial landmarks detected by an off-the-shelf face recognition project to depict the geometry structure of a face. For the appearance branch, a face ranking module with the VGG-16 structure is pre-trained to assess facial attractiveness; this ranking module determines the attractiveness domain toward which ordinary faces should be enhanced. Then, the networks of the deep face descriptor [29] are utilized to minimize the distance between the facial images before and after enhancement, which ensures that the enhanced face and the corresponding input face can be recognized as the same individual. To the best of our knowledge, we are the first to enhance facial attractiveness with GANs in both geometry and appearance aspects.

The main contributions of our paper include:

  • We propose FA-GANs for facial attractiveness enhancement, which enhance the input face in both geometry and appearance aspects automatically.

  • The pre-trained attractiveness ranking module is embedded in FA-GANs to learn the features of attractive faces via unsupervised adversarial learning, which avoids the need for paired faces as input.

  • FA-GANs consist of two branches, and each branch can adjust faces independently in the geometry or appearance aspect.

  • The experimental results show that FA-GANs generate compelling perceptual results for face enhancement and outperform the state of the art.

II Related Works

Facial attractiveness enhancement is a problem of face attractiveness analysis and face generation. In this section, we first briefly introduce research on geometry-based and appearance-based facial image enhancement, and then review recent facial image generation methods based on generative adversarial networks.

II-A Geometry-based Facial Image Enhancement

Facial attractiveness has long been investigated in many research areas, such as computer vision, computer geometry, and cognitive psychology. Numerous empirical experiments suggest that beauty cognition does exist across ethnicity, social class, age, and gender. The shapes of beautiful faces are defined in different ways among different groups, but beautiful faces can always be recognized as attractive by individuals from other groups [13].

Enhancing facial attractiveness in geometry structure covers both 2D images and 3D face models. For 2D facial images, Leyvand et al. [15] propose a data-driven enhancement approach that adjusts the facial geometry structure by learning from the distance vectors of the landmark points. Li et al. [17] propose a deep face beautification framework that automatically modifies the geometric structure of a face to boost attractiveness. For 3D face models, Liao et al. [22] propose an enhancement approach concerning geometric symmetry, frontal face proportion, and facial angular profile proportion. They apply overall proportion optimization to the frontal face, restricted to the Neoclassical Canons and golden ratios. Combining asymmetry correction with adjustment of the ideal frontal or profile proportions achieves better results; however, research also suggests that asymmetry correction is more effective than adjusting the frontal or profile proportions [13]. Qian et al. [31] propose the Additive Focal Variational Auto-encoder (AF-VAE) to arbitrarily manipulate high-resolution face images.

II-B Appearance-based Facial Image Enhancement

Arakawa et al. [2] propose a system based on interactive evolutionary computing which removes undesirable skin components from human facial images to make the faces look beautiful. Other works enhance faces via makeup methods. Liu et al. [23] propose a fully automatic makeover recommendation and synthesis system named Beauty e-Expert. Liu et al. [24] propose an end-to-end Deep Localized Makeup Transfer Network to automatically synthesize makeup for female faces, achieving very natural-looking results. Li et al. [18] achieve high-quality makeup transfer from a given reference makeup face to a non-makeup one with their proposed BeautyGAN. Chang et al. [6] propose PairedCycleGAN, which quickly transfers the style from an arbitrary reference makeup photo to an arbitrary source photo.

II-C Generative Adversarial Networks

Our facial enhancement architecture is also a kind of deep generative model derived from GANs [7]. GANs train the generator and the discriminator in a simple and efficient way via a min-max two-player game, achieving remarkable results on unsupervised learning tasks. Recently, many modified GAN architectures [36, 4, 12, 20, 38, 10, 3, 39] have been proposed for face manipulation. For the face age progression problem, Yang et al. [37] propose a pyramid architecture of GANs that learns face age progression and achieves remarkable results. Besides, Identity-Preserved Conditional Generative Adversarial Networks (IPCGANs) have been proposed to generate a face at a target age while preserving the identity [36].

In the research area of face synthesis, Song et al. [35] propose geometry-guided GANs to remove and synthesize facial expressions. Shen et al. [33] synthesize faces while preserving identity with their proposed three-player GAN called FaceID-GAN. Bao et al. [3] propose a GAN-based framework to disentangle the identity and attributes of faces, recombining different identities and attributes for identity-preserving face synthesis in an open domain. Li et al. [20] propose an effective object completion algorithm using a deep generative model. Cao et al. [5] propose CariGANs for unpaired photo-to-caricature translation, which generate caricature images in two steps of geometry and appearance adjustment.

GANs and their variants have achieved great success in super resolution [14], image translation [41], and image synthesis [40], providing a practical solution to the problem of realistic image generation. Motivated by these, we propose our facial attractiveness enhancement framework based on GANs to generate enhanced, identity-preserved facial images.

III Our Method

Fig. 2: The systematic overview of FA-GANs. FA-GANs consist of two branches: an appearance enhancement branch and a geometry enhancement branch. The geometry enhancement consistency between the two branches is used for learning the geometry adjustment in the appearance generator.

The original GANs consist of a generator G and a discriminator D, which are trained iteratively and alternately via an adversarial process. G and D compete with each other in the min-max game of Equation (1); we aim to optimize G until D cannot distinguish real samples from the generated ones:

\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{data}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))], (1)

where z is a noise vector sampled from a prior probability distribution p_z(z), and x denotes a data item sampled from the real data distribution p_{data}(x).

Let X and Y denote the domains of unattractive and attractive faces, respectively; no paired faces exist between these two domains. Given an unattractive sample x ∈ X, our goal is to learn a mapping Φ: X → Y that transfers x to an attractive sample ỹ ∈ Y. Unlike previous works that enhance facial attractiveness in either geometry structure or appearance texture, FA-GANs make the enhancement in both aspects. Therefore, FA-GANs consist of two branches: the geometry adjustment Φ_geo and the appearance adjustment Φ_app.

The proposed framework of FA-GANs with its two branches is shown in Fig. 2, and both branches have the architecture of GANs. To generate a face adjusted in both aspects, the geometry branch is trained to learn the geometry-to-geometry translation from X to Y; the appearance branch is then trained with the help of the geometry translation to learn the adjustment in both the appearance and geometry aspects. Similar to the traditional beautification engine for 2D facial images [15], facial landmarks are used to represent the geometry structure. We denote X_g and Y_g as the domains of the geometry structures of unattractive and attractive faces, respectively. The geometry branch learns the mapping Φ_geo: X_g → Y_g to convert an unattractive geometry structure v ∈ X_g into an attractive one v' ∈ Y_g. The appearance branch learns to convert an instance x ∈ X into an intermediate instance x̃ ∈ Ỹ, where x̃ is an appearance-adjusted face with the same geometry structure as x, and Ỹ is the intermediate domain with the geometry structure of X and the appearance texture of Y. The mapping learned by the appearance branch is thus defined as Φ_app: X → Ỹ.

To combine these two branches and enhance facial attractiveness in both aspects, we further minimize an energy function that enforces consistency between the geometry structure of the intermediate face x̃ and the geometry structure generated by the geometry branch. The consistency energy is defined as:

E_{con} = \| F(\Phi_{app}(x)) - \Phi_{geo}(v) \|_2^2, (2)

where v is the distance vector of x and F denotes the pre-trained geometry analysis networks, introduced to extract the geometry structure from the face x̃ generated by the appearance branch. FA-GANs combine the geometry mapping Φ_geo and the appearance mapping Φ_app, and achieve the overall mapping Φ: X → Y by enforcing this consistency between the geometry and appearance branches. The geometry branch, the appearance branch, and their combination are described in detail in Sections III-A, III-B, and III-C, respectively.

III-A Facial Geometry Enhancement Branch

Fig. 3: Framework of our facial geometry enhancement. Geometry enhancement is performed in the domain of distance vectors, which are derived from the landmarks detected on the original face. The enhanced landmarks are then inferred from the enhanced distance vector, and finally the geometry-enhanced face is generated by mapping the original face to the enhanced landmarks.

III-A1 Geometry data

We extract 2D facial landmarks for each of the training face images. For each face, the landmarks are detected and located on the outlines of the left eyebrow, right eyebrow, left eye, right eye, nose bridge, nose tip, top lip, bottom lip, and chin. The extracted landmarks are demonstrated in Fig. 3. To normalize the scale, all faces are cropped and resized to a uniform size. Instead of learning from facial landmarks directly, we prefer to enhance facial attractiveness by adjusting the relative distances among important facial components, such as the eyes, nose, lips, and chin, since landmarks are more sensitive to small errors than relative distances. Therefore, the detected landmarks are used to construct a distance embedding mesh whose edges are given by a Delaunay triangulation. We further reduce the dimension of the distance vector to 32 with a principal component analysis (PCA) representation, which retains most of the total variance.
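A minimal sketch of this pipeline (landmarks → Delaunay edge distances → PCA) is given below. It is illustrative rather than our released code: the function names are hypothetical, the 32-dimensional PCA size follows Table II, and a single triangulation (e.g., computed on mean landmarks) is assumed to be shared by all faces so that every face yields a distance vector of the same length.

```python
# Sketch: landmarks -> Delaunay edge distances -> 32-dim PCA vector.
# Illustrative only; a shared triangulation is assumed across faces.
import numpy as np
from scipy.spatial import Delaunay
from sklearn.decomposition import PCA

def edge_list(landmarks: np.ndarray) -> list:
    """Unique edges of the Delaunay triangulation over (N, 2) landmarks."""
    tri = Delaunay(landmarks)
    edges = set()
    for simplex in tri.simplices:
        for a, b in ((0, 1), (1, 2), (0, 2)):
            i, j = sorted((int(simplex[a]), int(simplex[b])))
            edges.add((i, j))
    return sorted(edges)

def distance_vector(landmarks: np.ndarray, edges) -> np.ndarray:
    """Euclidean length of every mesh edge: the raw distance vector."""
    return np.array([np.linalg.norm(landmarks[i] - landmarks[j])
                     for i, j in edges])

# Fit PCA on the training set of raw distance vectors and keep 32 dims.
def fit_pca(train_vectors: np.ndarray, dim: int = 32) -> PCA:
    return PCA(n_components=dim).fit(train_vectors)
```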

III-A2 Geometry enhancement networks

The geometry branch aims to learn the mapping Φ_geo: X_g → Y_g, while no samples in X_g and Y_g are paired. We achieve this by constructing geometry adjustment GANs that consist of only fully connected (FC) and ReLU [28] layers. Unlike traditional GANs, which train the generator only until it fools the discriminator, the feature maps of the discriminator are also utilized to help learn the geometry mapping. The generator G takes v ∈ X_g as input and outputs the adjusted distance vector v'. Fig. 3 shows the overview of our geometry adjustment branch. Apart from the last layer, the discriminator D has the same architecture as the generator; both architectures are shown in Table I.

Module          Layer      Activation size
Shared          Input      32
Shared          FC-ReLU    -
Shared          FC-ReLU    -
Shared          FC-ReLU    -
Shared          FC-ReLU    -
Generator       FC         32
Discriminator   FC         1
TABLE I: Architecture of the geometry enhancement branch.
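To make Table I concrete, a hedged PyTorch sketch of this branch follows. The 32-dimensional input/output and the discriminator's scalar output follow the table; the hidden width and depth are assumptions.

```python
# Hedged sketch of Table I: generator and discriminator share an FC-ReLU
# trunk and differ only in the last FC layer. Hidden width is assumed.
import torch
import torch.nn as nn

def fc_relu_stack(in_dim, hidden, depth):
    layers, d = [], in_dim
    for _ in range(depth):
        layers += [nn.Linear(d, hidden), nn.ReLU(inplace=True)]
        d = hidden
    return layers

class GeoGenerator(nn.Module):
    """Maps a 32-dim PCA distance vector to its enhanced counterpart."""
    def __init__(self, dim=32, hidden=128, depth=4):
        super().__init__()
        self.net = nn.Sequential(*fc_relu_stack(dim, hidden, depth),
                                 nn.Linear(hidden, dim))
    def forward(self, v):
        return self.net(v)

class GeoDiscriminator(nn.Module):
    """Same trunk as the generator; the final FC outputs one realness
    score. Intermediate FC activations are exposed for the feature loss."""
    def __init__(self, dim=32, hidden=128, depth=4):
        super().__init__()
        self.trunk = nn.ModuleList(fc_relu_stack(dim, hidden, depth))
        self.head = nn.Linear(hidden, 1)
    def forward(self, v, return_features=False):
        feats = []
        for layer in self.trunk:
            v = layer(v)
            if isinstance(layer, nn.Linear):
                feats.append(v)   # per-FC-layer features for Eq. (5)
        score = self.head(v)
        return (score, feats) if return_features else score
```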

Following the classic GANs [7, 25], the process of training the discriminator amounts to minimizing the loss:

L_D = \mathbb{E}_{v' \sim Y_g}[(D(v') - 1)^2] + \mathbb{E}_{v \sim X_g}[D(G(v))^2], (3)

where D and G are the discriminator and the generator, respectively. Moreover, D takes both the unattractive samples and the generated samples as negative samples, and it takes the attractive samples as positive samples. The loss of the discriminator can therefore be rewritten as:

L_D = \mathbb{E}_{v' \sim Y_g}[(D(v') - 1)^2] + \mathbb{E}_{v \sim X_g}[D(G(v))^2] + \mathbb{E}_{v \sim X_g}[D(v)^2]. (4)

Considering that samples coming from the same category have similar feature maps, a feature loss is introduced, and the loss of the generator is defined as follows:

L_G = \mathbb{E}_{v \sim X_g}[(D(G(v)) - 1)^2] + \mathbb{E}\Big[\sum_i \| f_i(G(v)) - f_i(v') \|_2^2\Big], (5)

where f_i(·) denotes the feature map produced by the i-th FC layer, following the use of discriminator feature maps described above.
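Under a least-squares reading of Equations (3)-(5), the branch losses can be sketched as below, reusing the GeoDiscriminator above; pairing equally sized batches of attractive samples for the feature term is an implementation assumption.

```python
# Sketch of Eqs. (4) and (5) in LSGAN form; assumes the GeoDiscriminator
# above and equally sized batches of attractive/unattractive vectors.
import torch
import torch.nn.functional as F

def discriminator_loss(D, v_attr, v_unattr, v_gen):
    # Attractive vectors are positives; unattractive AND generated
    # vectors are both treated as negatives (Eq. (4)).
    real = D(v_attr)
    loss = F.mse_loss(real, torch.ones_like(real))
    for neg in (v_unattr, v_gen.detach()):
        out = D(neg)
        loss = loss + F.mse_loss(out, torch.zeros_like(out))
    return loss

def generator_loss(D, v_gen, v_attr, feat_weight=1.0):
    # Adversarial term plus the feature loss matching per-layer
    # discriminator activations of generated and attractive samples.
    score, gen_feats = D(v_gen, return_features=True)
    adv = F.mse_loss(score, torch.ones_like(score))
    with torch.no_grad():
        _, real_feats = D(v_attr, return_features=True)
    feat = sum(F.mse_loss(g, r) for g, r in zip(gen_feats, real_feats))
    return adv + feat_weight * feat
```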

III-A3 Geometry adjustment on faces

The geometry branch converts v to v', where v is the distance vector derived from the landmark points P. To adjust the facial geometry, the enhanced points P' corresponding to v' are estimated by minimizing the following energy function for the best fit:

E(P') = \sum_{i,j} c_{ij} \big( \| p'_i - p'_j \| - d_{ij} \big)^2, (6)

where c_{ij} is the element of the facial mesh connectivity matrix, and the distance term d_{ij} is the entry in v' corresponding to the edge (i, j). Minimization of the energy function is performed with the Levenberg-Marquardt algorithm [27]. The geometry-enhanced face is then generated by mapping the original face texture from the points P to P'.
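A compact way to realize this fit is SciPy's Levenberg-Marquardt solver. The sketch below assumes the target distances have already been recovered from v' (e.g., by inverting the PCA) and treats all connectivity weights as 1.

```python
# Sketch of the Eq. (6) fit: find landmarks P' whose mesh-edge lengths
# match the enhanced distances, via Levenberg-Marquardt ('lm' in SciPy).
import numpy as np
from scipy.optimize import least_squares

def fit_landmarks(p0, edges, target_dist):
    """p0: (N, 2) original landmarks, used as the initial guess.
    edges: list of (i, j) mesh edges; target_dist: enhanced lengths."""
    def residuals(flat):
        p = flat.reshape(-1, 2)
        cur = np.array([np.linalg.norm(p[i] - p[j]) for i, j in edges])
        return cur - target_dist
    # 'lm' needs at least as many residuals (edges) as variables (2N),
    # which a Delaunay mesh over the landmarks comfortably provides.
    sol = least_squares(residuals, p0.ravel(), method="lm")
    return sol.x.reshape(-1, 2)
```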

III-B Facial Appearance Enhancement Branch

The outstanding performance of GANs in fitting data distributions has significantly promoted many computer vision applications such as image-to-image translation [9, 41, 5, 18]. Inspired by these studies, we employ GANs to perform facial appearance enhancement while preserving identity. The appearance adjustment networks and the losses are introduced in the following subsections.

III-B1 Appearance adjustment networks

The process of appearance enhancement only requires a forward pass through the generator G, which is designed as a U-Net [32]. The discriminator D outputs indicators suggesting the probability that a face comes from the attractive category. Different from the classic GAN discriminator, we implement our discriminator with a pyramid architecture [37] to estimate high-level attractiveness-related cues in a fine-grained way.

Specifically, a facial attractiveness classifier with the VGG-16 architecture [34] is pre-trained to classify attractive and unattractive faces. The hierarchical layers of VGG-16 capture image features from the pixel level up to the semantic level. The generator is optimized until the discriminator is confused between generated and attractive faces at all pixel- and semantic-level features. Consequently, the feature maps of the 2nd, 4th, 7th, and 10th convolutional layers of VGG-16 are integrated into the discriminator for adversarial learning. The generator must not only transfer x toward the attractive domain but also preserve the identity. We achieve this by employing the deep face descriptor network [29] to measure the identity similarity between the input and the enhanced face. The deep face descriptor is trained on a large face dataset containing millions of facial images by recognizing unique individuals. We remove the classification layer and take the last FC layer as the identity-preserving output, forming an identity descriptor φ. Both the original face x and the enhanced face G(x) are fed into φ, and the margin between φ(x) and φ(G(x)) is kept small.
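The feature taps can be illustrated with torchvision's VGG-16 layout, splitting the network at the 2nd, 4th, 7th, and 10th convolutional layers named above. This is a sketch: in practice the weights of the pre-trained attractiveness ranker would be loaded rather than the placeholder initialization used here.

```python
# Sketch: expose VGG-16 feature maps after the 2nd, 4th, 7th, and 10th
# conv layers for the pyramid discriminator. Weights are placeholders.
import torch.nn as nn
from torchvision.models import vgg16

class VGGPyramid(nn.Module):
    def __init__(self):
        super().__init__()
        features = vgg16(weights=None).features
        blocks, current, conv_count = [], [], 0
        for layer in features:
            current.append(layer)
            if isinstance(layer, nn.Conv2d):
                conv_count += 1
                if conv_count in (2, 4, 7, 10):
                    blocks.append(nn.Sequential(*current))
                    current = []
                if conv_count == 10:
                    break
        self.blocks = nn.ModuleList(blocks)

    def forward(self, x):
        # x: (B, 3, 224, 224) face batch
        feats = []
        for block in self.blocks:
            x = block(x)
            feats.append(x)   # pixel-level up to semantic-level features
        return feats
```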

Layer                                 Activation size
Input                                 3 × 224 × 224
Conv-BN-ReLU (stride, padding) × 9    -
FC-ReLU                               -
FC-ReLU                               -
FC                                    32
TABLE II: Convolutional layers of the distance feature extractor.

III-B2 Loss

Four types of loss are defined for training the appearance branch: the adversarial loss, identity loss, pixel loss, and total variation loss.

a) Adversarial loss. Similar to the geometry adjustment branch, both the unattractive faces and the generated faces are deemed negative samples, and the attractive faces are deemed positive samples. We also adopt the adversarial loss of LSGAN [25] to train the appearance adjustment branch. The adversarial losses are defined as:

L_D^{app} = \mathbb{E}_{y \sim Y}[(D(y) - 1)^2] + \mathbb{E}_{x \sim X}[D(G(x))^2 + D(x)^2], (7)

L_{adv} = \mathbb{E}_{x \sim X}[(D(G(x)) - 1)^2], (8)

where G and D here denote the generator and the discriminator of the appearance branch.

b) Identity loss. To ensure that the appearance-enhanced face and the original face come from the same individual, the identity loss is defined as:

L_{id} = \| φ(x) - φ(G(x)) \|_2^2, (9)

where φ is the identity descriptor described in the subsection above.

c) Pixel loss and total variation loss. The generated face G(x) should be more attractive than the input x; however, the gap between G(x) and x in pixel space must be constrained. The pixel loss enforces the enhanced face to have a small difference from the original face in the raw-pixel space. In addition, experience from other image generation tasks suggests that ℓ1 regularization produces less blurring than ℓ2 regularization. The pixel loss is formulated as:

L_{pix} = \frac{1}{C \cdot W \cdot H} \| G(x) - x \|_1, (10)

where W and H are the width and height of the image, respectively, and C is the number of channels. The total variation loss L_{tv} is defined as a total variation regularizer [1] to encourage spatial smoothness in the enhanced face G(x).
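Both terms are straightforward to implement; a sketch (assuming the ℓ1 pixel term, per the blurring remark above) is:

```python
# Sketch of the pixel loss (Eq. (10)) and the total variation regularizer.
import torch

def pixel_loss(enhanced, original):
    # Mean over batch, channels, and pixels gives the 1/(C*W*H) scaling.
    return (enhanced - original).abs().mean()

def tv_loss(img):
    # Penalize differences between neighboring pixels (spatial smoothness).
    dh = (img[..., 1:, :] - img[..., :-1, :]).abs().mean()
    dw = (img[..., :, 1:] - img[..., :, :-1]).abs().mean()
    return dh + dw
```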

III-C Geometry Enhancement Consistency

Given a face x as well as its distance vector v, the geometry enhancement branch converts v to v' = Φ_geo(v), where v' lies in the domain Y_g. At the same time, the appearance branch converts x to x̃ = G(x), where x̃ lies in the intermediate domain Ỹ. As illustrated in Fig. 2, FA-GANs combine these two branches and output the enhanced face with the generator of the appearance enhancement branch. Geometry consistency should hold between v' and ṽ, where ṽ is the distance vector of x̃. To extract the distance vector, an extractor F is pre-trained on our dataset. The extractor is constructed with convolutional and FC layers, and each convolutional layer is followed by a Batch Normalization [8] layer and a ReLU [28] layer. The architecture of the extractor is shown in Table II.

The extracted vector is the PCA representation of the distance vector derived from the landmarks. Thus the geometry enhancement consistency is defined by the following loss:

L_{geo} = \| F(G(x)) - Φ_{geo}(v) \|_2^2. (11)

In summary, FA-GANs combine the losses of the appearance branch with the geometry enhancement consistency loss. The system training losses can therefore be written as:

L_G^{total} = λ_1 L_{adv} + λ_2 L_{id} + λ_3 L_{pix} + λ_4 L_{tv} + λ_5 L_{geo}, (12)

L_D^{total} = \mathbb{E}_{y \sim Y}[(D(y) - 1)^2] + \mathbb{E}_{x \sim X}[D(G(x))^2 + D(x)^2], (13)

where λ_1, …, λ_5 are trade-off parameters. We train G and D alternately until G learns the desired facial attractiveness transformation and D becomes a reliable estimator.
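The alternating schedule can be sketched as a single training step; the loss callables stand in for Equations (12) and (13), and all names are placeholders rather than our exact training code.

```python
# Sketch of one alternating update of D and G (Eqs. (12)-(13));
# g_loss_fn / d_loss_fn stand in for the weighted sums defined above.
def train_step(G, D, x_unattr, y_attr, g_loss_fn, d_loss_fn, opt_g, opt_d):
    # 1) Discriminator update: attractive faces are positives,
    #    unattractive and generated faces are negatives.
    fake = G(x_unattr).detach()
    opt_d.zero_grad()
    d_loss_fn(D, y_attr, x_unattr, fake).backward()
    opt_d.step()
    # 2) Generator update on the full objective of Eq. (12).
    opt_g.zero_grad()
    g_loss_fn(G, D, x_unattr).backward()
    opt_g.step()
```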

Fig. 4: Results of geometry branch adjustment. The original face, the distance vectors of the original face, the adjusted face, and the distance vectors of the adjusted face are displayed from left to right in each row.

IV Experiments

In this section, we first introduce the face dataset used in our experiments and analyze the performance of the two adjustment branches. Then, we compare them with existing geometry and appearance adjustment methods. Furthermore, we compare FA-GANs with two-step methods that apply appearance adjustment after geometry adjustment. Finally, we also compare FA-GANs with the state of the art, including the mobile APP Meitu [26].

IV-A Datasets

Facial attractiveness analysis is the basic and critical problem setting for facial image enhancement, and the attractiveness criterion is explored in a data-driven way. However, few facial attractiveness datasets are publicly available. Fortunately, Liang et al. [21] propose a diverse benchmark dataset, called SCUT-FBP5500, for multi-paradigm facial beauty prediction. The SCUT-FBP5500 dataset contains 5500 frontal faces in total with diverse properties (male/female, Asian/Caucasian, and various ages); specifically, it includes 2000 Asian females, 2000 Asian males, 750 Caucasian females, and 750 Caucasian males aged from 15 to 60 with neutral expressions. The attractiveness of each facial image is labeled by 60 different labelers with a score from 1 to 5, where 1 indicates the least attractive and 5 the most attractive. Our attractiveness learner classifies faces into two categories, attractive and unattractive. To obtain a dataset with unanimous faces, faces with conflicting scores are ruled out. Moreover, we enlarge the dataset by further collecting portraits of famous beautiful actresses as attractive faces and portraits of ordinary people as unattractive faces. In total, the collected female facial images are divided into the attractive and unattractive categories.

Fig. 5: Results of appearance branch adjustment. Each pair contains the input face and the appearance-adjusted face.
Fig. 6: Parsed faces used for [19]. The original face with its parsed version and the reference face with its parsed version are shown on the left and right, respectively.
Fig. 7: Exemplar photos of the original, geometry-adjusted, appearance-adjusted, two-step-adjusted, and FA-GANs results. Both the geometry and appearance aspects contain the results of the corresponding FA-GANs branch and of the compared methods.

IV-B Implementation Details

All faces are detected, and their landmark points are extracted to form the dataset of the geometry enhancement branch. The faces are cropped and scaled to 224 × 224 pixels as the training data of the appearance enhancement branch. To obtain a more robust attractiveness ranking module, we apply color jitter, random affine, random rotation, and random horizontal flip augmentations when training the facial attractiveness network. The trade-off parameters λ_1, λ_2, λ_3, λ_4, and λ_5 in Equation (12) are set empirically. The Adam [11] algorithm is used to optimize FA-GANs. FA-GANs are trained on a computer with a 4.20 GHz 4-core Intel i7-7700K CPU and a GTX 1080Ti GPU.

IV-C Branch Analysis of FA-GANs

To investigate the performance of each branch in FA-GANs, we perform ablation studies and analyze the effect of each branch. The two branches are variations of GANs, and the geometry branch is pre-trained before training the full FA-GANs. Hence, we explore the performance of the appearance branch by training it without the geometry enhancement consistency constraint. The adjustment results of the geometry and appearance branches are demonstrated in Figs. 4 and 5. As can be seen from Fig. 4, the geometry branch enhances facial attractiveness by generating faces with smaller cheeks and bigger eyes, suggesting that people nowadays prefer smaller faces and bigger eyes. Besides, the appearance branch adjusts faces mainly in texture rather than geometry structure, as shown in Fig. 5. It tends to generate faces with clean cheeks, red lips, and dark eyes, which reflects the popular cognition of beauty. We further analyze the attractiveness results in a qualitative evaluation and the time consumption in the following subsections.

IV-D Comparison with the State of the Art

In this section, we verify the effectiveness of FA-GANs and compare our generated results with existing methods for geometry and appearance adjustment. Specifically, we compare with the geometry adjustment method proposed in [15] and the appearance adjustment method based on photorealistic image style transfer [19]. We implement [15] with its KNN-based beautification engine, searching in the attractive domain, and adjust the geometry structure as described in Section III-A3. [19] needs a reference face to transfer the style to the input face; we choose the reference by selecting the nearest face in the attractive domain, which can be obtained from the implementation of [15]. With the help of the detected facial landmarks, the input face and the reference face are further parsed¹ for [19]; exemplar parsed faces are shown in Fig. 6. Furthermore, the attractiveness enhancement results involving both aspects are also analyzed: given a face, we first adjust its geometry structure with [15] or our geometry branch, and then adjust its appearance with our appearance branch or [19]. Moreover, we also compare FA-GANs to BeautyGAN [18] with the reference face obtained for [19].

¹https://github.com/zllrunning/face-parsing.PyTorch
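For reference, the KNN lookup at the core of our re-implementation of [15] can be sketched as below. This is a hypothetical sketch: the distance-weighted blending is one reading of a KNN-based beautification engine, and k is left as a free parameter since the exact value used is not reproduced here.

```python
# Hypothetical sketch of a KNN-based target lookup in the spirit of [15]:
# blend the k nearest attractive distance vectors into a target vector.
import numpy as np

def knn_target(query, attractive_bank, k=5):
    """query: (D,) PCA distance vector; attractive_bank: (N, D) vectors."""
    dists = np.linalg.norm(attractive_bank - query, axis=1)
    idx = np.argsort(dists)[:k]
    w = 1.0 / (dists[idx] + 1e-8)          # closer neighbors weigh more
    w = w / w.sum()
    return (attractive_bank[idx] * w[:, None]).sum(axis=0)
```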

To compare FA-GANs with existing mobile APPs, we also make a comparison with the APP Meitu, which is famous for beautifying faces. Each face is adjusted by the "One-click beauty" function in the face beautification module of Meitu with the default settings.

IV-D1 Qualitative evaluation

To analyze the comparison results in an objective way, facial attractiveness is assessed by the Face++ beauty API². It gives two scores indicating the proportion of people, in the male and female views respectively, who would generally consider the face more attractive. The comparison results are shown in Fig. 7 and are arranged into six groups: original, geometry adjustment, appearance adjustment, two-step adjustment, the state-of-the-art methods, and FA-GANs. For each method, identity preservation is also evaluated, defined as the cosine distance between φ(x) and φ(x'), where x and x' are the original and adjusted faces, respectively, and φ is the deep face descriptor.

²https://www.faceplusplus.com/beauty/

In Fig. 7, [15] adjusts the geometry structure effectively compared to the original faces, and it also tends to generate portraits with smaller cheeks and bigger eyes. [19] adjusts the appearance mainly depending on the reference image, and the generated faces look homogeneous over the face area. Groups of original faces and their adjustment results are evaluated, and the averaged attractiveness scores for each adjustment method are shown in Fig. 8.

All of these adjustment methods achieve a higher attractiveness score than the original faces. The statistical results suggest that FA-GANs achieve the best performance among all the adjustments, enhancing the attractiveness effectively in both the female and male views, and the two-step method of the geometry branch followed by the appearance branch achieves the second best, outperforming BeautyGAN and Meitu. Comparing the geometry adjustment results of the geometry branch in FA-GANs and [15], the geometry branch outperforms [15] by a small margin in both the female and male assessment views. On the other hand, the appearance branch performs better than [19], no matter whether these appearance adjustment methods are applied directly to the original faces or to geometry-adjusted results. Comparing the results of appearance and geometry adjustment, appearance always achieves a higher score than geometry, indicating that people can enhance their attractiveness more by paying attention to makeup than to facelifts.

The statistical results in Fig. 8 show that Meitu promotes the beauty scores less than FA-GANs, the appearance branch, and BeautyGAN. This is because "One-click beauty" shrinks the cheek only slightly and makes the face area look whiter while ignoring the mouth and eyebrows. As can be seen from Fig. 7, all of the instances generated by Meitu have whiter skin while the colors of the mouth and eyebrows are diluted. Compared to Meitu, our appearance branch also generates whiter skin but additionally applies a makeup style to the mouth, eyes, and eyebrows.

For identity preservation, the similarity scores of the methods listed in Fig. 8 are all high, indicating that every method preserves the identity well. Meitu preserves the identity best, our geometry branch achieves the second best, and FA-GANs also perform well.

Fig. 8: Statistical results of beauty scores. We assess the facial attractiveness of the original and adjusted faces. The assessment results in the female and male views are shown on the left and right, respectively.
Method       Extract v    Enhance v to v'    Map to face    Total
Geo branch   -            -                  -              -
[15]         -            -                  -              -
TABLE III: Time consumption analysis of geometry adjustment.

Method       Face parsing    Enhancement    Total
App branch   -               -              -
[19]         -               -              -
TABLE IV: Time consumption analysis of appearance adjustment.

Method   geo + app   geo + [19]   [15] + app   [15] + [19]   FA-GANs
Total    -           -            -            -             -
TABLE V: Time consumption analysis of combined geometry and appearance adjustment.

IV-D2 Runtime analysis

To build an effective and efficient method for enhancing the attractiveness of faces, we further compare these methods in terms of time consumption. All timing experiments are performed on the aforementioned computer. Geometry adjustment can be decomposed into three steps: extracting the distance vector v from the original face, enhancing v to v', and mapping the face to the enhanced landmarks. The time consumption of these steps is shown in Table III. As can be seen, [15] runs faster than our geometry branch on our geometry adjustment dataset, and both geometry adjustment methods spend most of their time on mapping the face to the enhanced landmarks. The appearance branch enhances attractiveness with only a forward pass through the generator, whereas [19] adjusts the appearance with an extra reference image and requires face parsing to get a better result. The time consumption of appearance adjustment is shown in Table IV, which suggests that the appearance branch needs only seconds to make an adjustment on average. Finally, we compare the time consumption of the two-step adjustment methods and FA-GANs in Table V, which demonstrates that FA-GANs perform best and adjust a face within seconds. Furthermore, we compare FA-GANs with BeautyGAN on adjusting an input facial image; FA-GANs also run faster than BeautyGAN on average.

V Conclusions

This paper proposes FA-GANs for solving the facial attractiveness enhancement problem in both the geometry and appearance aspects. FA-GANs learn implicit attractiveness rules via a pre-trained facial attractiveness ranking module, avoiding the need for paired faces before and after enhancement during training. FA-GANs contain two branches for geometry adjustment and appearance adjustment, and each can enhance attractiveness independently. A thorough analysis demonstrates that FA-GANs achieve superiority over existing geometry and appearance enhancement methods. Experiments also suggest that FA-GANs generate compelling perceptual results and enhance facial attractiveness effectively and efficiently.

References

  • [1] H. A. Aly and E. Dubois (2005-10) Image up-sampling using total-variation regularization with a new observation model. TIP 14 (10), pp. 1647–1659. External Links: Document, ISSN 1057-7149 Cited by: §III-B2.
  • [2] K. Arakawa and K. Nomoto (2005-12) A system for beautifying face images using interactive evolutionary computing. In ISPACS, Vol. , pp. 9–12. External Links: Document, ISSN Cited by: §II-B.
  • [3] J. Bao, D. Chen, F. Wen, H. Li, and G. Hua (2018-06) Towards open-set identity preserving face synthesis. In CVPR, Cited by: §II-C, §II-C.
  • [4] A. Bulat and G. Tzimiropoulos (2018-06) Super-fan: integrated facial landmark localization and super-resolution of real-world low resolution faces in arbitrary poses with gans. In CVPR, Cited by: §II-C.
  • [5] K. Cao, J. Liao, and L. Yuan (2018-12) CariGANs: unpaired photo-to-caricature translation. TOG 37 (6), pp. 244:1–244:14. External Links: ISSN 0730-0301, Document Cited by: §II-C, §III-B.
  • [6] H. Chang, J. Lu, F. Yu, and A. Finkelstein (2018-06) PairedCycleGAN: asymmetric style transfer for applying and removing makeup. In CVPR, Cited by: §II-B.
  • [7] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio (2014) Generative adversarial nets. In NIPS, pp. 2672–2680. Cited by: §II-C, §III-A2.
  • [8] S. Ioffe and C. Szegedy (2015) Batch normalization: accelerating deep network training by reducing internal covariate shift. In ICML, volume 37, ICML’15, pp. 448–456. Cited by: §III-C.
  • [9] P. Isola, J. Zhu, T. Zhou, and A. A. Efros (2017-07) Image-to-image translation with conditional adversarial networks. In CVPR, External Links: Document, ISSN 1063-6919 Cited by: §III-B.
  • [10] T. Karras, S. Laine, and T. Aila (2019-06) A style-based generator architecture for generative adversarial networks. In CVPR, Cited by: §II-C.
  • [11] D. P. Kingma and J. Ba (2014) Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980. Cited by: §IV-B.
  • [12] J. Kossaifi, L. Tran, Y. Panagakis, and M. Pantic (2018-06) GAGAN: geometry-aware generative adversarial networks. In CVPR, Cited by: §II-C.
  • [13] A. Laurentini and A. Bottino (2014) Computer analysis of face beauty: a survey. CVIU 125, pp. 184 – 199. External Links: ISSN 1077-3142, Document Cited by: §II-A, §II-A.
  • [14] C. Ledig, L. Theis, F. Huszár, J. Caballero, A. Cunningham, A. Acosta, A. Aitken, A. Tejani, J. Totz, Z. Wang, and W. Shi (2017-07) Photo-realistic single image super-resolution using a generative adversarial network. In 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Vol. . External Links: Document, ISSN 1063-6919 Cited by: §II-C.
  • [15] T. Leyvand, D. Cohen-Or, G. Dror, and D. Lischinski (2008) Data-driven enhancement of facial attractiveness. In SIGGRAPH, pp. 38:1–38:9. External Links: ISBN 978-1-4503-0112-1 Cited by: §I, §II-A, §III, §IV-D1, §IV-D1, §IV-D2, §IV-D, TABLE III, TABLE V.
  • [16] C. Li, K. Zhou, and S. Lin (2015-06) Simulating makeup through physics-based manipulation of intrinsic image layers. In CVPR, Cited by: §I.
  • [17] J. Li, C. Xiong, L. Liu, X. Shu, and S. Yan (2015) Deep face beautification. In MM, pp. 793–794. External Links: ISBN 978-1-4503-3459-4, Document Cited by: §II-A.
  • [18] T. Li, R. Qian, C. Dong, S. Liu, Q. Yan, W. Zhu, and L. Lin (2018) BeautyGAN: instance-level facial makeup transfer with deep generative adversarial network. In Proceedings of the 26th ACM International Conference on Multimedia, MM ’18, New York, NY, USA, pp. 645–653. External Links: ISBN 978-1-4503-5665-7, Document Cited by: §II-B, §III-B, §IV-D.
  • [19] Y. Li, M. Liu, X. Li, M. Yang, and J. Kautz (2018-09) A closed-form solution to photorealistic image stylization. In ECCV, Cited by: Fig. 6, §IV-D1, §IV-D1, §IV-D2, §IV-D, TABLE IV, TABLE V.
  • [20] Y. Li, S. Liu, J. Yang, and M. Yang (2017-07) Generative face completion. In CVPR, Cited by: §II-C, §II-C.
  • [21] L. Liang, L. Lin, L. Jin, D. Xie, and M. Li (2018) SCUT-FBP5500: a diverse benchmark dataset for multi-paradigm facial beauty prediction. In ICPR, Cited by: §IV-A.
  • [22] Q. Liao, X. Jin, and W. Zeng (2012-10) Enhancing the symmetry and proportion of 3d face geometry. TVCG 18 (10), pp. 1704–1716. External Links: Document, ISSN 1077-2626 Cited by: §I, §II-A.
  • [23] L. Liu, H. Xu, J. Xing, S. Liu, X. Zhou, and S. Yan (2013) ”Wow! you are so beautiful today!”. In Proceedings of the 21st ACM International Conference on Multimedia, MM ’13, New York, NY, USA, pp. 3–12. External Links: ISBN 978-1-4503-2404-5, Document Cited by: §II-B.
  • [24] S. Liu, X. Ou, R. Qian, W. Wang, and X. Cao (2016) Makeup like a superstar: deep localized makeup transfer network. In IJCAI, pp. 2568–2575. Cited by: §II-B.
  • [25] X. Mao, Q. Li, H. Xie, R. Y.K. Lau, Z. Wang, and S. Paul Smolley (2017-10) Least squares generative adversarial networks. In ICCV, Cited by: §III-A2, §III-B2.
  • [26] Meitu. Note: https://corp.meitu.com/en/ Cited by: §IV.
  • [27] J. J. Moré (1978) The levenberg-marquardt algorithm: implementation and theory. In Numerical analysis, pp. 105–116. Cited by: §III-A3.
  • [28] V. Nair and G. E. Hinton (2010) Rectified linear units improve restricted boltzmann machines. In ICML, pp. 807–814. External Links: ISBN 978-1-60558-907-7 Cited by: §III-A2, §III-C.
  • [29] O. M. Parkhi, A. Vedaldi, and A. Zisserman (2015-09) Deep face recognition. In BMVC, X. Xie, M. W. Jones, and G. K. L. Tam (Eds.), pp. 41.1–41.12. External Links: Document, ISBN 1-901725-53-7 Cited by: §I, §III-B1.
  • [30] D. I. Perrett, K. J. Lee, I. Penton-Voak, D. Rowland, S. Yoshikawa, D. M. Burt, S. Henzi, D. L. Castles, and S. Akamatsu (1998) Effects of sexual dimorphism on facial attractiveness. Nature 394 (6696), pp. 884. Cited by: §I.
  • [31] S. Qian, K. Lin, W. Wu, Y. Liu, Q. Wang, F. Shen, C. Qian, and R. He (2019-10) Make a face: towards arbitrary high fidelity face manipulation. In The IEEE International Conference on Computer Vision (ICCV), Cited by: §II-A.
  • [32] O. Ronneberger, P. Fischer, and T. Brox (2015) U-net: convolutional networks for biomedical image segmentation. In Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015, N. Navab, J. Hornegger, W. M. Wells, and A. F. Frangi (Eds.), Cham, pp. 234–241. External Links: ISBN 978-3-319-24574-4 Cited by: §III-B1.
  • [33] Y. Shen, P. Luo, J. Yan, X. Wang, and X. Tang (2018-06) FaceID-gan: learning a symmetry three-player gan for identity-preserving face synthesis. In CVPR, Cited by: §II-C.
  • [34] K. Simonyan and A. Zisserman (2014) Very deep convolutional networks for large-scale image recognition. CoRR. Cited by: §III-B1.
  • [35] L. Song, Z. Lu, R. He, Z. Sun, and T. Tan (2018) Geometry guided adversarial facial expression synthesis. In MM, pp. 627–635. External Links: ISBN 978-1-4503-5665-7, Document Cited by: §II-C.
  • [36] Z. Wang, X. Tang, W. Luo, and S. Gao (2018-06) Face aging with identity-preserved conditional generative adversarial networks. In CVPR, Cited by: §II-C.
  • [37] H. Yang, D. Huang, Y. Wang, and A. K. Jain (2018-06) Learning face age progression: a pyramid architecture of gans. In CVPR, Cited by: §II-C, §III-B1.
  • [38] R. Yi, Y. Liu, Y. Lai, and P. L. Rosin (2019-06) APDrawingGAN: generating artistic portrait drawings from face photos with hierarchical gans. In CVPR, Cited by: §II-C.
  • [39] X. Yin, X. Yu, K. Sohn, X. Liu, and M. Chandraker (2017-10) Towards large-pose face frontalization in the wild. In The IEEE International Conference on Computer Vision (ICCV), Cited by: §II-C.
  • [40] H. Zhang, T. Xu, H. Li, S. Zhang, X. Wang, X. Huang, and D. Metaxas (2017-10) StackGAN: text to photo-realistic image synthesis with stacked generative adversarial networks. In 2017 IEEE International Conference on Computer Vision (ICCV), Vol. , pp. 5908–5916. External Links: Document, ISSN 2380-7504 Cited by: §II-C.
  • [41] J. Zhu, T. Park, P. Isola, and A. A. Efros (2017-10) Unpaired image-to-image translation using cycle-consistent adversarial networks. In 2017 IEEE International Conference on Computer Vision (ICCV), Vol. , pp. 2242–2251. External Links: Document, ISSN 2380-7504 Cited by: §II-C, §III-B.