3D Face Synthesis Driven by Personality Impression

09/27/2018 ∙ by Yining Lang, et al. ∙ Beijing Institute of Technology

Synthesizing 3D faces that give certain personality impressions is commonly needed in computer games, animations, and virtual world applications for producing realistic virtual characters. In this paper, we propose a novel approach to synthesize 3D faces based on personality impression for creating virtual characters. Our approach consists of two major steps. In the first step, we train classifiers using deep convolutional neural networks on a dataset of images with personality impression annotations, which are capable of predicting the personality impression of a face. In the second step, given a 3D face and a desired personality impression type as user inputs, our approach optimizes the facial details against the trained classifiers, so as to synthesize a face which gives the desired personality impression. We demonstrate our approach for synthesizing 3D faces giving desired personality impressions on a variety of 3D face models. Perceptual studies show that the perceived personality impressions of the synthesized faces agree with the target personality impressions specified for synthesizing the faces. Please refer to the supplementary materials for all results.


Introduction

A face conveys a lot of information about a person. People usually form an impression of another person in less than a second, mainly by looking at that person’s face. Researchers in psychology, cognitive science, and biometrics have conducted many studies to explore how facial appearance may influence personality impression [Willis and Todorov2006, Hassin and Trope2000]. Some researchers investigated the relationship between personality impressions and specific facial features [Eisenthal, Dror, and Ruppin2006]. There have also been attempts to train machine learning models for predicting personality impressions based on facial features [Gray et al.2010, Joo, Steen, and Zhu2015].

To create realistic 3D faces for computer games, digital entertainment, and virtual reality applications, prior work has addressed generating realistic faces [Hu et al.2017], vivid animation [Sohre et al.2018], natural expressions [Marsella et al.2013], and so on. Yet synthesizing 3D faces that give a certain personality impression, which is one of the most important considerations during the creative process, has not been explored. For example, the main characters in games and animations are usually designed to look confident and smart, whereas the “bad guys” are usually designed to look hostile. While there are automatic tools for synthesizing human faces of different ethnicities and genders, the problem of synthesizing 3D faces with respect to personality impressions is still unsolved. We propose a data-driven optimization approach to solve this problem.

As the personality impression of a face depends a lot on its subtle details, under the current practice, creating a face to give a certain personality impression is usually done through a “trial-and-error” approach: a designer creates several faces; asks for people’s feedback on their impressions of the faces; and then modifies the faces accordingly. This process iterates until a satisfactory face is created. This design process involves substantial tuning efforts by a designer and is not scalable. Manual creation of faces could also be very challenging if the objectives are abstract or sophisticated. For example, while it could be relatively easy to create a face to give an impression of being friendly, it could be hard to create a face to give an impression of being friendly but silly, which could be desirable for a certain virtual character.

We propose a novel approach to automate this face creation process. Our approach leverages Convolutional Neural Networks (CNNs) to learn the non-trivial mapping between low-level subtle details of a face and high-level personality impressions. The trained networks can then be applied to synthesize a 3D face that gives a desired personality impression via an optimization process. We demonstrate that our approach can automatically synthesize a variety of 3D faces to give different personality impressions, hence overcoming the current scalability bottleneck. The synthesized faces could find practical uses in virtual world applications (e.g., synthesizing a gang of hostile-looking guys to be used as enemies in a game).

The major contributions of our paper include:

  • Introducing a novel problem of synthesizing 3D faces based on personality impressions.

  • Proposing a learning-based optimization approach and a data-driven MCMC sampler for synthesizing faces with desired personality impressions.

  • Demonstrating the practical uses of our approach for novel face editing, virtual reality, and digital entertainment applications.

Related Work

Faces and personality impressions. Personality impression is an active research topic in psychology and cognitive science. Researchers are interested in studying how different factors (e.g., face, body, profile, motion) influence the personality impression a person makes on others [Naumann et al.2009]. Recent work [Over and Cook2018] suggests that facial appearances play an important role in giving personality impressions.

Some works focused on examining which facial features influence personality impression. Vernon et al. (2014) modeled the relationship between physical facial features extracted from images and impressions of social traits. Zell et al. (2015) studied the roles of face geometry and texture in affecting the perception of computer-generated faces. Some findings were adopted to predict human-related attributes based on a face. Xu et al. (2015) proposed a cascaded fine-tuning deep learning model to predict facial attractiveness. Joo et al. (2015) proposed an approach to infer the personality traits of a person from his or her face.

Motivated by these findings, we use deep learning techniques to learn the relationship between facial appearances and personality impressions based on a collected face dataset with personality impression annotations. The learned relationship is then applied to guide an optimization that synthesizes 3D faces giving desired personality impressions.

Face Modeling and Exaggeration. Some commercial 3D modeling software, such as Character Generator, MakeHuman, and Fuse, can be used by designers for creating 3D virtual characters with rich facial details. These tools provide a variety of controls over a 3D face model, including geometry and texture (e.g., adjusting the shape of the nose, changing skin color). However, to create or modify a face to give a certain personality impression, a designer has to manually tune many low-level facial features, which could be very tedious and difficult.

Another line of work closely relevant to ours is face exaggeration, which refers to generating a facial caricature with exaggerated face features. Suwajanakorn et al. (2015) proposed an approach for creating a controllable 3D face model of a person from a large photo collection of that person captured on different occasions. Le et al. (2011) performed exaggeration differently by using primitive shapes to locate the face components, followed by deforming these shapes to generate an exaggerated face. They empirically found that specific combinations of primitive shapes tend to establish certain personality stereotypes. Recently, Tian and Xiao (2016) proposed an approach for face exaggeration on 2D face images based on a number of shape and texture features related to personality traits.

Compared to these works, our learning-based optimization approach provides high-level controls for 3D face modeling, by which designers can synthesize faces with respect to specified personality impressions conveniently.

Figure 1: Overview of our approach.

Data-Driven 3D Modeling. Data-driven techniques have been successfully applied for 3D modeling [Kalogerakis et al.2012, Talton et al.2011]. Huang et al. (2017) devised deeply-learned generative models for 3D shape synthesis. Ritchie et al. (2015) used Sequential Monte Carlo to guide the procedural generation of 3D models in an efficient manner. Along the direction of face modeling, Saito et al. (2016) used deep neural networks trained with a high-resolution face database to automatically infer a high-fidelity texture map of an input face image.

Modeling the relationships between low-level facial features and high-level personality impressions is difficult. In addition, directly searching in such a complex and high-dimensional space is inefficient and unstable. In our work, we apply data-driven techniques to model the relationship between facial appearances and personality impressions. Furthermore, we speed up face synthesis by formulating a data-driven sampling approach to facilitate the optimization.

Overview

Figure 1 shows an overview of our approach. Given an input 3D face, our approach optimizes the face geometry and texture such that the optimized face gives the desired personality impression specified by the user. To achieve this goal, we present an automatic face synthesis framework driven by personality impression, which consists of two stages: learning and optimization.

In the learning stage, we define a set of personality impression types. Then we learn a CNN personality impression classifier for each type. To train the CNN classifiers, we collected images from the CASIA WebFace database [Yi et al.2014] and annotated them with the corresponding personality impressions. We also learn an end-to-end metric to evaluate the similarity between the synthesized face and the input one. The metric plays the role of constraining 3D face deformation.

In the optimization stage, our approach modifies the face geometry and texture iteratively. The resulting face is then evaluated by the personality impression cost function, as well as the similarity cost function. To speed up our optimization, we devise a data-driven sampling approach based on the learned priors. The optimization continues until a face giving the desired personality impression is synthesized.

Problem Formulation

Personality Impression Types. In our experiments, we use four pairs of personality impression types: a) smart/silly; b) friendly/hostile; c) humorous/boring; and d) confident/unconfident. These personality impression types are commonly used in psychology [Mischel2013].

3D Face Representation. To model a 3D face, we use a multi-linear PCA approach to represent the face geometry and the face texture based on the morphable face models [Blanz and Vetter1999], akin to the representation of [Hu et al.2017]. Our approach operates on a textured 3D face mesh model. We represent a face by its geometry $S$, which is a vector containing the 3D coordinates of the vertices of the face mesh, as well as its texture $T$, which is a vector containing the RGB values of the pixels of its texture image.

Each face is divided into eight regions (eyes, jaw, nose, chin, cheeks, mouth, eyebrows and face contour) as depicted in the supplementary materials. For each face region, we learn two Principal Component Analysis (PCA) models for representing its geometry and texture in low-dimensional spaces. The PCA models are learned using 3D faces from the Basel Face Model database [Paysan et al.2009].

First, we manually segment each face into the eight regions. Then, for each region, we perform a PCA on the geometry and a PCA on the texture to compute the averages and the sets of eigenvectors. In our implementation, when doing the PCAs for the $r$-th region, for all vertices in $S$ and all pixels in $T$ that do not belong to the $r$-th region, we just set their values to zero so that all regions have the same dimensionality and can be linearly combined to form the whole face $(S, T)$:

$$S = \sum_{r=1}^{8} \left( \bar{S}_r + U_r \alpha_r \right), \qquad T = \sum_{r=1}^{8} \left( \bar{T}_r + V_r \beta_r \right). \tag{1}$$

Here $r$ is the index of a face region; $\bar{S}_r$ and $\bar{T}_r$ denote the average geometry and average texture for the $r$-th face region; $U_r$ and $V_r$ are matrices whose columns are respectively the eigenvectors of the geometry and texture. We use a fixed number of eigenvectors in our experiments. $\alpha_r$ and $\beta_r$ are vectors whose entries are the coefficients corresponding respectively to the eigenvectors of the geometry and texture. This representation allows our approach to manipulate an individual face region by modifying its coefficients $\alpha_r$ and $\beta_r$. Based on the PCA models of the face regions, a 3D face is parameterized as a tuple $F = (\alpha_1, \beta_1, \ldots, \alpha_8, \beta_8)$ containing the coefficients.
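The regional combination described for Eq. (1) can be sketched in a few lines. This is only an illustration under stated assumptions: the function names (`mat_vec`, `assemble`) and the toy four-dimensional "face" with two regions are hypothetical, and each region's mean and eigenvectors are zero outside that region, as the text describes.

```python
from typing import List

Vector = List[float]
Matrix = List[Vector]  # row-major: one row per output dimension


def mat_vec(matrix: Matrix, coeffs: Vector) -> Vector:
    """Multiply a (dim x k) eigenvector matrix by a length-k coefficient vector."""
    return [sum(m * c for m, c in zip(row, coeffs)) for row in matrix]


def assemble(means: List[Vector], bases: List[Matrix], coeffs: List[Vector]) -> Vector:
    """Sum the per-region reconstructions (region mean + basis * coefficients)."""
    dim = len(means[0])
    whole = [0.0] * dim
    for mean, basis, c in zip(means, bases, coeffs):
        offset = mat_vec(basis, c)
        for i in range(dim):
            whole[i] += mean[i] + offset[i]
    return whole


# Toy data (hypothetical): region 0 owns entries 0-1, region 1 owns entries 2-3.
means = [[1.0, 2.0, 0.0, 0.0], [0.0, 0.0, 3.0, 4.0]]
bases = [
    [[1.0], [0.0], [0.0], [0.0]],  # one eigenvector, zero outside region 0
    [[0.0], [0.0], [0.0], [2.0]],  # one eigenvector, zero outside region 1
]
coeffs = [[0.5], [-1.0]]
face = assemble(means, bases, coeffs)
print(face)
```

Because every region vector is zero-padded to the full dimensionality, changing the coefficients of one region leaves the entries of all other regions untouched.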

Facial Attributes. Although different faces can be synthesized by changing the geometry coefficients $\alpha_r$ and texture coefficients $\beta_r$ of a face, in general these coefficients do not correspond to geometry and texture facial attributes that can be intuitively controlled by a human modeler for changing a face’s outlook. It would be desirable to devise a number of facial attributes in accordance with human language (e.g., “changing the mouth to be wider”), to facilitate designers in interactively modifying a 3D face, and to allow our optimizer to learn from and mimic human artists on the tasks of modifying a face with respect to personality impression.

We describe how the effect of changing a facial attribute can be captured and subsequently applied for modifying a face. For simplicity, we assume that each facial attribute is defined only in one face region rather than across regions. Based on a set of exemplar faces $(S_i, T_i)$ from the Basel Face Model database with assigned facial attribute $a$, we compute the sums:

$$\Delta S_a = \frac{1}{\eta} \sum_{i} \mu_i \left( S_i - \bar{S} \right), \qquad \Delta T_a = \frac{1}{\eta} \sum_{i} \mu_i \left( T_i - \bar{T} \right), \tag{2}$$

where $\bar{S}$ and $\bar{T}$ are the average geometry and average texture computed over the whole Basel Face Model dataset. $\mu_i$ is the markedness of the attribute $a$ in face $i$, which is manually assigned. $\eta$ is the normalization factor. Given a face $(S, T)$, the result of changing facial attribute $a$ on this face is given by $(S + \kappa \Delta S_a, T + \kappa \Delta T_a)$, where $\kappa$ is a parameter for controlling the extent of applying facial attribute $a$.
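A minimal sketch of the attribute computation around Eq. (2), assuming the change vector is a markedness-weighted sum of exemplar deviations from the dataset average, divided by a normalization factor. The function names and toy numbers are hypothetical, and the same routine would apply to either geometry or texture vectors.

```python
from typing import List, Sequence

Vector = List[float]


def attribute_delta(
    exemplars: Sequence[Vector],
    markedness: Sequence[float],
    mean: Vector,
    norm: float,
) -> Vector:
    """Weighted sum of (exemplar - mean), divided by a normalization factor."""
    delta = [0.0] * len(mean)
    for face, mu in zip(exemplars, markedness):
        for i in range(len(mean)):
            delta[i] += mu * (face[i] - mean[i])
    return [d / norm for d in delta]


def apply_attribute(face: Vector, delta: Vector, strength: float) -> Vector:
    """Move a face along the attribute direction by the given strength."""
    return [x + strength * d for x, d in zip(face, delta)]


# Toy data (hypothetical): two exemplars in a 2-D space, dataset mean at the origin.
mean = [0.0, 0.0]
exemplars = [[2.0, 0.0], [4.0, 2.0]]
markedness = [1.0, 0.5]  # manually assigned per exemplar
delta = attribute_delta(exemplars, markedness, mean, norm=2.0)
edited = apply_attribute([1.0, 1.0], delta, strength=0.5)
print(delta, edited)
```

A negative strength moves the face in the opposite direction, for example narrowing a mouth instead of widening it.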

In total, we devise facial attributes. Each attribute is modeled by example faces. We demonstrate the effect of each attribute on an example face. It is worth noting that the representation of a 3D face can be replaced by other 3D face representations that provide controls of a face. Please find the corresponding results in the supplementary material.

Optimization Objectives. We synthesize a 3D face to give a desired personality impression by an optimization process, which considers two factors: (1) Personality Impression: how likely the synthesized face is to give the desired personality impression. (2) Similarity Metric: how similar the synthesized face is to the input face.

Given an input 3D face and a desired personality impression type, our approach synthesizes a 3D face which gives the desired personality impression by minimizing a total cost function:

$$C(F) = C_p(I_F, P) + \lambda \, C_s(I_F, I_0), \tag{3}$$

where $F$ contains the face coefficients for synthesizing a 3D face. $C_p(I_F, P)$ is the personality impression cost term for evaluating the image $I_F$ of the face synthesized from $F$ with regard to the desired personality impression type $P$. The face image is rendered using the frontal view of the face. Lambertian surface reflectance is assumed and the illumination is approximated by second-order spherical harmonics [Ramamoorthi and Hanrahan2001]. $C_s(I_F, I_0)$ is the similarity cost term, which measures the similarity between the image $I_F$ of the synthesized face and the image $I_0$ of the input face, constraining the deformation of the input face during the optimization. $\lambda$ is a trade-off parameter to balance the costs of personality impression and similarity.
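The trade-off between the two cost terms in Eq. (3) can be illustrated as follows. The weighted-sum form, the function names, and the two toy cost functions are assumptions made for illustration. In the approach itself both terms are computed by trained networks on rendered images.

```python
from typing import Callable, Sequence

Coefficients = Sequence[float]


def total_cost(
    coeffs: Coefficients,
    impression_cost: Callable[[Coefficients], float],
    similarity_cost: Callable[[Coefficients], float],
    trade_off: float,
) -> float:
    """Personality impression cost plus a weighted similarity cost."""
    return impression_cost(coeffs) + trade_off * similarity_cost(coeffs)


# Hypothetical stand-ins for the two learned cost terms.
def toy_impression(coeffs: Coefficients) -> float:
    return abs(coeffs[0] - 1.0)  # the target impression "prefers" coeffs[0] near 1


def toy_similarity(coeffs: Coefficients) -> float:
    return abs(coeffs[0])  # the input face is assumed to have coeffs[0] = 0


low_weight = total_cost([1.0], toy_impression, toy_similarity, trade_off=0.1)
high_weight = total_cost([1.0], toy_impression, toy_similarity, trade_off=2.0)
print(low_weight, high_weight)
```

With a larger trade-off weight, the same deviation from the input face is penalized more heavily, so the optimizer is pushed toward faces that stay closer to the input.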

Personality Impression Classification

To compute the personality impression cost for a synthesized face in each iteration of the optimization, we leverage a modern, high-performing deep CNN and train a classifier for each personality impression type, which provides a score for the synthesized face with regard to the personality impression type. To achieve this, we create a face image dataset annotated with personality impression labels based on the CASIA WebFace database [Yi et al.2014], which consists of face images covering both genders and different ethnicities. Then, we fine-tune GoogLeNet [Szegedy et al.2015] with a personality impression classification task on the dataset. Please refer to our supplementary material for more details about the database.

Learning. We construct our network based on the original GoogLeNet with pre-trained parameters. The network is layers deep with average pooling layers. It has a fully connected layer with units and rectified linear activation. During the fine-tuning process, the images with the corresponding labels in the personality impression dataset are fed to the network and an average classification loss is applied. Please find the details about the training process and the visualization of the network in our supplementary material.

Evaluation. We evaluate our approach with real face images on personality impression classification. We compare the fine-tuned GoogLeNet of our approach (CNN-R) with the approach of using landmark features [Zhu and Ramanan2012] and being trained by a standard SVM classifier (LM-R). Both approaches are based on the same splitting strategy of the dataset ( for training and for testing). CNN-R attains an average accuracy of across all personality impression types, whereas LM-R attains an average accuracy of . Please refer to the supplementary material for more quantitative comparison results.

Face Similarity Metric

To constrain the synthesized face to look similar to the input face, we evaluate the similarity between the image of the synthesized face and image of the original input face in the optimization. To achieve this, we train a Siamese network [Chopra, Hadsell, and LeCun2005], an end-to-end network, to evaluate whether a pair of face images correspond to the same face. The network learns a feature extractor which takes face images and outputs feature vectors, such that the feature vectors corresponding to images of the same face are close to each other, while those corresponding to images of different faces are far away from each other.

We train the Siamese network using the LFW dataset [Huang et al.2007]. The training dataset is constructed as a set of triplets $(I_1, I_2, y)$, where $I_1$ and $I_2$ are any two images from the LFW dataset, and $y$ is the label indicating whether $I_1$ and $I_2$ are from the same face.

The Siamese network consists of two identical Convolutional Networks that share the same set of weights $W$. The training process learns the weights $W$ by minimizing a loss function with two terms defined on the distance between the mapped features $G_W(I_1)$ and $G_W(I_2)$, where $G_W(I)$ denotes the mapped features of an input face image $I$ computed by the learned Convolutional Network. By minimizing the loss function, the distance between the mapped features of $I_1$ and $I_2$ is driven by one term to be small if $I_1$ and $I_2$ correspond to the same face, and is driven by the other term to be large otherwise. The loss function includes a constant whose value is fixed during training. The parameters are learned by standard cross entropy loss and back-propagation of the error.

Cost Functions

Given a textured 3D face model and a desired personality impression type as the input, our approach employs a data-driven MCMC sampler to update the face coefficients $F$ iteratively so as to modify the face. In each iteration, the synthesized face represented by $F$ is evaluated by the total cost $C(F)$. The optimization continues until a face giving the desired personality impression is synthesized. We discuss the personality impression cost and the similarity cost in the following.

Personality Impression Cost. The image $I_F$ of the face synthesized by face coefficients $F$ is evaluated with respect to the desired personality impression type $P$ in the cost function $C_p(I_F, P)$, defined based on the fine-tuned GoogLeNet:

$$C_p(I_F, P) = 1 - \frac{\exp(x_1)}{\exp(x_1) + \exp(x_2)}, \tag{4}$$

where $[x_1, x_2] = w^{\top} g$ is the output of the fully connected layer of the fine-tuned network. $x_1$ and $x_2$ reflect the likelihood of the image $I_F$ belonging to the personality impression type $P$ or not, respectively. $g$ is the face feature vector of $I_F$ that is fed to the fully connected layer; $w$ contains the parameters of the fully connected layer, which map the feature vector to a 2D vector (our fine-tuned network is a two-category classifier).

A low cost value means the synthesized face image gives the desired type of personality impression, according to the classifier trained on face images annotated with personality impression labels.
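As an illustration of how a two-category classifier output can be turned into a cost, the sketch below maps two scores to one minus the softmax probability of the target class. This particular form is an assumption, and `impression_cost` and its arguments are hypothetical names; the scores stand in for the two outputs of the fully connected layer.

```python
import math


def impression_cost(score_target: float, score_other: float) -> float:
    """One minus the softmax probability of the target class (numerically stable)."""
    shift = max(score_target, score_other)  # avoid overflow in exp
    e_target = math.exp(score_target - shift)
    e_other = math.exp(score_other - shift)
    return 1.0 - e_target / (e_target + e_other)


undecided = impression_cost(0.0, 0.0)
confident_match = impression_cost(10.0, -10.0)
confident_mismatch = impression_cost(-10.0, 10.0)
print(undecided, confident_match, confident_mismatch)
```

The cost approaches 0 when the classifier strongly favors the target type and approaches 1 when it strongly favors the opposite.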

Figure 2: Effects of the trade-off parameter $\lambda$ when optimizing an example face to give a hostile personality impression. (a) Input; (b), (c) results with different values of $\lambda$. A larger $\lambda$ constrains the synthesized face to resemble the input face more closely.

Similarity Cost. We want to constrain the synthesized face to look similar to the input face. To achieve this, we apply the Siamese network trained for evaluating the similarity between a pair of face images to define a similarity cost as a soft constraint of the optimization:

$$C_s(I_F, I_0) = \frac{1}{Z} \left\| G_W(I_F) - G_W(I_0) \right\|, \tag{5}$$

where $G_W(I_F)$ and $G_W(I_0)$ are the feature vectors of the image $I_F$ of the synthesized face and the image $I_0$ of the input face computed by the Siamese network. $Z$ is a normalization factor computed over all face images from the LFW dataset. A low cost value means that the synthesized face image $I_F$ is similar to the input face image $I_0$.
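A minimal sketch of a feature-distance similarity cost in the spirit of Eq. (5). The Euclidean norm, the function name, and the toy feature vectors are assumptions; in the approach the feature vectors would come from the trained Siamese network.

```python
import math
from typing import Sequence


def similarity_cost(
    feat_synth: Sequence[float],
    feat_input: Sequence[float],
    norm: float,
) -> float:
    """Euclidean distance between two feature vectors, divided by a normalizer."""
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(feat_synth, feat_input)))
    return dist / norm


same_face = similarity_cost([1.0, 2.0], [1.0, 2.0], norm=10.0)
changed_face = similarity_cost([0.0, 0.0], [3.0, 4.0], norm=10.0)
print(same_face, changed_face)
```

Dividing by a dataset-wide normalizer keeps this term on a scale comparable to the personality impression cost, so a single trade-off weight can balance the two.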

To demonstrate how the similarity cost term affects the face synthesis results during optimization, we show an example of optimizing a face model with the personality impression type of hostile in Figure 2. When the trade-off parameter $\lambda$ is set to a smaller value, the face is optimized to become more hostile-looking, yet it differs from the input face significantly. When $\lambda$ is set to a larger value, the face is optimized to look somewhat hostile and it resembles the input face more closely. In our experiments, we fix $\lambda$ to a default value.

Face Synthesis by Optimization

We use a Markov chain Monte Carlo (MCMC) sampler to explore the space of face coefficients efficiently. As the top-down nature of MCMC sampling makes it slow due to the initial “burn-in” period, we devise a data-driven MCMC sampler for our problem. We propose two types of data-driven Markov chain dynamics: Region-Move and Prior-Move, corresponding to local refinement and global reconfiguration of the face.

Region-Move. We want to learn from how human artists modify faces to give a certain personality impression, so as to enable our sampler to mimic such modification process during an optimization. Considering that each face region’s contribution to a specified personality impression is different, we devise a Region-Move which modifies a face according to “important” face regions likely to be associated with the specified personality impression in training data.

Our training data is created based on face models. We recruited artists who are familiar with face modeling (with to years of experience in avatar design and 3D modeling). Each artist was asked to modify each of the face models to give the personality impression types by controlling the facial attributes. After the manual modifications, we project the original face models and all the manually modified face models into the PCA spaces, so that each face can be represented by its face coefficients .

For each personality type, we compute, for each face region, the sums of face coefficient differences. For the $r$-th face region, the sum of differences of the geometry coefficients is accumulated over all pairs $(\alpha_r, \alpha'_r)$, where $\alpha_r$ and $\alpha'_r$ are the geometry coefficients of the original face model and a face model modified by an artist respectively. The sum of differences of the texture coefficients is similarly defined. Suppose the current face has face coefficients $F$. During sampling, a face region $r$ is selected with a probability that increases with its sums of differences. Then a facial attribute $a$ in face region $r$ is randomly selected and modified so as to create a new face with new face coefficients $F'$, obtained by applying to $F$ the geometry and texture changes of attribute $a$. These changes are learned in Section Facial Attributes for each facial attribute $a$.

Essentially, a face region that is more commonly modified by artists to achieve the target personality impression type is modified by our sampler with a higher probability.
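The region selection step of the Region-Move can be sketched as weighted sampling. Choosing a region with probability proportional to its summed coefficient differences is an assumption made here for illustration, and the three toy regions and their sums are hypothetical.

```python
import random
from typing import List, Sequence


def region_probabilities(diff_sums: Sequence[float]) -> List[float]:
    """Normalize per-region difference sums into selection probabilities."""
    total = sum(diff_sums)
    return [d / total for d in diff_sums]


def pick_region(diff_sums: Sequence[float], rng: random.Random) -> int:
    """Draw a region index with probability proportional to its difference sum."""
    return rng.choices(range(len(diff_sums)), weights=diff_sums, k=1)[0]


# Toy data (hypothetical): artists changed region 0 the most and region 2 the least.
diff_sums = [6.0, 3.0, 1.0]
probs = region_probabilities(diff_sums)

rng = random.Random(0)
counts = [0, 0, 0]
for _ in range(3000):
    counts[pick_region(diff_sums, rng)] += 1
print(probs, counts)
```

Over many draws, the regions that artists modified most for a given personality impression are proposed most often, which is the behavior the Region-Move is meant to reproduce.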

Figure 3: Total costs over iterations in optimizing a face using a data-driven sampler and a baseline sampler.

Prior-Move. We also leverage the personality impression dataset to learn a prior distribution of the face coefficients for each personality impression, so as to guide our sampler to sample face coefficients near the prior face coefficients, which likely induce a similar personality impression.

Figure 4: Results of synthesizing faces with different personality impression types.

For each personality impression type $P$, we estimate a prior distribution with the following steps:

  • Select images in the personality impression dataset which are annotated with the personality impression type $P$ to form a subset.

  • Reconstruct the corresponding 3D face model for each image in the subset by the implementation of [Blanz and Vetter1999, Blanz and Vetter2003]. These 3D face models are projected onto the PCA spaces and are represented using face coefficients. Thus, we form a set of face coefficients.

  • Fit a normal distribution for each of the geometry and texture coefficients ($\alpha_r$ and $\beta_r$) of each face region based on this set of face coefficients.

Given the prior distribution, our sampler draws a value from the normal distribution of each of the geometry and texture coefficients, to generate new face coefficients $F'$.
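The prior fitting and sampling steps above can be sketched as follows, assuming an independent normal distribution per coefficient. The function names and the two toy coefficient vectors are hypothetical; in the approach the coefficient vectors would come from faces reconstructed from annotated images.

```python
import random
import statistics
from typing import List, Sequence, Tuple


def fit_priors(coefficient_sets: Sequence[Sequence[float]]) -> List[Tuple[float, float]]:
    """Fit one (mean, standard deviation) pair per coefficient position."""
    columns = list(zip(*coefficient_sets))
    return [(statistics.mean(col), statistics.stdev(col)) for col in columns]


def prior_move(priors: Sequence[Tuple[float, float]], rng: random.Random) -> List[float]:
    """Draw each coefficient independently from its fitted normal distribution."""
    return [rng.gauss(mu, sigma) for mu, sigma in priors]


# Toy data (hypothetical): two reconstructed faces with two coefficients each.
coefficient_sets = [[0.0, 1.0], [2.0, 1.0]]
priors = fit_priors(coefficient_sets)
sample = prior_move(priors, random.Random(0))
print(priors, sample)
```

Unlike a Region-Move, which perturbs one attribute, this move replaces all coefficients at once, which matches its role as a global reconfiguration of the face.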

Optimization. We apply simulated annealing with a Metropolis-Hastings state-searching step to search for face coefficients $F$ that minimize the total cost function $C$. In each iteration of the optimization, one type of move is selected and applied to propose new face coefficients $F'$, which are evaluated by the total cost function $C$. The Region-Move and Prior-Move are selected with probabilities $p$ and $1 - p$ respectively. In our experiments, we set $p$ to a fixed default value. The proposed face coefficients $F'$ generated by the move are accepted according to the Metropolis criterion:

$$\Pr(F' \mid F) = \min\left(1, \frac{f(F')}{f(F)}\right), \tag{6}$$

where $f(F) = \exp\left(-C(F) / \tau\right)$ is a Boltzmann-like objective function and $\tau$ is the temperature parameter of the annealing process. We empirically set an initial value of $\tau$ and decrease it at regular intervals of iterations until it reaches zero. We terminate the optimization if the absolute change in the total cost value falls below a threshold over a window of past iterations. In our experiments, a full optimization takes about iterations (about seconds) to finish.
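A compact sketch of simulated annealing with a Metropolis acceptance test of the kind described for Eq. (6). All numeric settings (initial temperature, cooling step, interval, iteration budget), the quadratic toy cost, and the Gaussian proposal are hypothetical placeholders. The data-driven Region-Move and Prior-Move would take the place of `propose`.

```python
import math
import random
from typing import Callable, List, Tuple


def accept_probability(cost_old: float, cost_new: float, temperature: float) -> float:
    """min(1, f(new) / f(old)) with f(x) = exp(-cost(x) / temperature)."""
    if cost_new <= cost_old:
        return 1.0
    if temperature <= 0.0:
        return 0.0
    return math.exp(-(cost_new - cost_old) / temperature)


def anneal(
    initial: List[float],
    cost: Callable[[List[float]], float],
    propose: Callable[[List[float], random.Random], List[float]],
    rng: random.Random,
    temperature: float = 1.0,
    cooling: float = 0.05,
    cool_every: int = 50,
    max_iters: int = 2000,
) -> Tuple[List[float], float]:
    """Return the lowest-cost coefficients seen during the annealing run."""
    current, current_cost = initial, cost(initial)
    best, best_cost = current, current_cost
    for iteration in range(1, max_iters + 1):
        candidate = propose(current, rng)
        candidate_cost = cost(candidate)
        if rng.random() < accept_probability(current_cost, candidate_cost, temperature):
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = current, current_cost
        if iteration % cool_every == 0:
            temperature = max(0.0, temperature - cooling)  # linear cooling
    return best, best_cost


# Toy problem (hypothetical): pull two coefficients toward a target vector.
target = [1.0, -2.0]


def toy_cost(coeffs: List[float]) -> float:
    return sum((c - t) ** 2 for c, t in zip(coeffs, target))


def toy_propose(coeffs: List[float], rng: random.Random) -> List[float]:
    new = list(coeffs)
    new[rng.randrange(len(new))] += rng.gauss(0.0, 0.3)  # perturb one coefficient
    return new


start = [0.0, 0.0]
best, best_cost = anneal(start, toy_cost, toy_propose, random.Random(0))
print(best, best_cost)
```

At high temperature, worse proposals are sometimes accepted, which helps the search escape local minima. Once the temperature reaches zero, only improvements are accepted and the search becomes greedy.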

Figure 3 shows an example of optimizing a face using the proposed data-driven sampler and using a baseline sampler which randomly picks one of the facial attributes and resets its value randomly. By using more effective moves, the data-driven optimization converges faster to obtain a solution with a lower cost value.

Experiments

We conducted experiments on a Linux machine equipped with an Intel i7-5930K CPU, 32GB of RAM and an NVIDIA GTX 1080 graphics card. The optimization and learning components of our approach were implemented in C++.

Results and Discussion

Different Faces. We test our approach by synthesizing different faces to give different personality impressions. Figure 4 shows two groups of the input faces and the synthesized faces. For each input face, a face is synthesized using each of the impression types. Please refer to our supplementary material for more results on faces of different races.

We observe some interesting features that may result in the corresponding personality impressions. For instance, comparing the results of confident and unconfident faces, we observe that the confident faces usually have a higher nose bridge and bigger eyes. In addition, the eyebrows look sharp and slightly slanted, which makes a person look as if in a state of concentration. The mouth corners lift slightly and the mouths show a subtle smile. As for the unconfident faces, the eyebrows are generally dropping or furrowed, showing a subtle sign of nervousness. The eye corners are also dropping, and the eyes look tired. The cheeks generally look bonier. The mouths are also drooping, which could be perceived as a sign of frustration.

We observe that usually a combination of facial features accounts for the personality impression of a face. As there are many facial attributes, it is rather hard to manually tune these attributes to model a face. The CNN classifiers effectively learn the relationships between facial features and a personality impression type, such that they can drive face synthesis by personality impression automatically.

Multiple Personality Impressions. We also apply our approach to synthesize faces with respect to multiple personality impressions. Such faces could be useful in movies and games. For example, it is common to have antagonists who look smart and hostile. Our approach can be easily extended to synthesize such faces, by optimizing a face with respect to multiple personality impression costs, each of which corresponds to one personality impression type. Please refer to our supplementary material for details.

Figure 5: Characters with faces synthesized by our approach. Left: a silly-looking man, a smart-looking lady and a confident-looking boss in an office. Right: a boring-looking man, a hostile-looking man and an unconfident-looking lady on a street.

Generating Crowds. Crowds of virtual characters showing certain personality impressions are often needed in movies and computer games. Our approach makes it easy and convenient to create such virtual characters. Figure 5 shows two examples, an office scene and a street scene, with virtual character faces synthesized with personality impressions. Our optimization approach could be employed for automatically synthesizing virtual character faces (e.g., generating random hostile-looking enemies in a 3D game) to enhance the realism of a virtual world.

Remodeling 3D-reconstructed Faces. Our approach can also be used for remodeling 3D-reconstructed faces to give different types of personality impression, which can help the user generate personalized characters for 3D games and movies. We reconstructed the 3D faces of real persons based on their face images [Blanz and Vetter1999]. We then applied our approach to remodel their faces with respect to different types of personality impressions. In all cases, the similarity cost constrains the synthesized faces to resemble the original faces of the real persons. We show some examples in our supplementary material.

Perceptual Studies

We conducted perceptual studies to evaluate the quality of our results. The major goal is to verify whether the perceived personality impressions of the synthesized faces match the target personality impression types. We recruited participants from different countries via Amazon Mechanical Turk. They are evenly distributed by gender and span a range of ages. Each participant was shown some synthesized faces and was asked about the personality impression they perceived. Definitions of the personality impression types from a dictionary were shown as reference. Our supplementary material contains all original data and the results of t-tests, discussed as follows.

Recognizing Face Personality Impression. In this study, we want to verify if the personality impression types of the synthesized faces agree with human impressions. We used the faces from Figure 4. Each of these faces was synthesized using a single personality impression type and was voted by human participants. In voting for the personality impression type of a face, a participant needed to choose out of the personality impression types used in our approach. In total, we obtained votes for faces.

Figure 6 shows the results as a confusion matrix. The average accuracy is above the chance-level classification accuracy: for each personality impression type, the matching type gets the highest number of votes, as shown by the diagonal.

“Friendly” and “hostile” receive a relatively high accuracy, probably because the facial features leading to such personality impressions are more prominent and easily recognizable. For example, participants usually perceive a face as hostile-looking when they see a dense moustache, slanted eyebrows and a drooping mouth. For other personality impressions such as humorous and boring, the accuracy is relatively lower, probably because the facial features leading to such personality impressions are less apparent, or because the participants do not have a strong association between facial features and such personality impressions.

The facial features associated with some personality impressions overlap, so a face may give several different but similar impressions. For instance, a smart-looking person may also look confident. Thus, participants may choose a similar personality impression type instead of the target one, which reduces the overall accuracy.

Figure 6: Accuracy of determining a single personality impression type of faces synthesized in the perceptual study. Percentages of votes are shown.

We also investigate how human participants form the personality impressions of faces synthesized with two personality impression types in our supplementary material.

Influence of Expression. Next we want to investigate whether facial expression changes will affect the personality impression of the synthesized faces. For example, does a face optimized to be hostile-looking still look hostile with a happy smile? Such findings could bring interesting insights for designing virtual character faces. We conducted an empirical study to investigate the effects of expressions on the synthesized faces. Please also refer to our supplementary material for details.

The average accuracy is about (a drop from on synthesized faces without any expression). Facial expressions do have an impact on some personality impressions. For example, with an angry expression, a face optimized to be friendly-looking may appear hostile. The accuracy of the friendly (angry) face is ; compared to the accuracy of on the friendly face without any expression (Figure 6), the accuracy drops by . However, the personality impression on confident-looking faces seems to be relatively unaffected by facial expressions. For instance, even with an angry expression, a face optimized to look confident still has votes for confident. This is probably because people have strong associations between certain facial features and “confident”, and those facial features are still apparent under facial expression changes.
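The comparison above reduces to a difference between two accuracies measured for the same target type. The following sketch, using hypothetical vote counts rather than the study data, shows that arithmetic for a face voted on once without expression and once with an expression.

```python
from typing import Tuple


def accuracy(matching_votes: int, total_votes: int) -> float:
    """Percentage of votes that match the target personality impression type."""
    if total_votes <= 0:
        raise ValueError("total_votes must be positive")
    return 100.0 * matching_votes / total_votes


def accuracy_drop(neutral: Tuple[int, int], with_expression: Tuple[int, int]) -> float:
    """Drop in percentage points; each argument is (matching_votes, total_votes)."""
    return accuracy(*neutral) - accuracy(*with_expression)


if __name__ == "__main__":
    # Hypothetical counts: 16 of 20 votes match without expression,
    # 6 of 20 match when the same face shows an angry expression.
    print(f"drop: {accuracy_drop((16, 20), (6, 20)):.1f} percentage points")
```

A positive result means the expression reduced agreement with the target type; a value near zero corresponds to the behaviour reported for confident-looking faces.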

Though this study is not comprehensive, it gives some good insights about the effects of expressions on personality impression. We believe that a more comprehensive perceptual study will be an interesting avenue for future research.

Summary

Limitations. To stay focused on the face’s geometry and texture, we do not consider the influence of hair, accessories or clothing (e.g., hair style, hair color, hats, glasses) on personality impression. Besides, speech and facial movements, as well as head and body poses, can also influence the impression of one’s personality, just as experienced actors can change the personality impressions they make by controlling speech, facial expression and body movements. While we only focus on static facial features in this work, we refer the reader to recent interesting efforts on adding personality to human motion [Durupinar et al.2017].

Future Work. Our approach could be extended to consider more personality impression types, to handle other high-level perceptual factors, or to synthesize faces of cartoon characters that give certain personality impressions. Our approach follows a discriminative criterion to train the personality impression classifier. For future work, it would be interesting to investigate applying a deep generative network for synthesizing 3D faces, as the adversarial training approach (GAN) [Goodfellow et al.2014] has achieved good success in 2D image generation.

References

  • [Blanz and Vetter1999] Blanz, V., and Vetter, T. 1999. A morphable model for the synthesis of 3d faces. In ACM SIGGRAPH, 187–194. ACM Press/Addison-Wesley Publishing Co.
  • [Blanz and Vetter2003] Blanz, V., and Vetter, T. 2003. Face recognition based on fitting a 3d morphable model. IEEE PAMI 25(9):1063–1074.
  • [Chopra, Hadsell, and LeCun2005] Chopra, S.; Hadsell, R.; and LeCun, Y. 2005. Learning a similarity metric discriminatively, with application to face verification. In CVPR. IEEE.
  • [Durupinar et al.2017] Durupinar, F.; Kapadia, M.; Deutsch, S.; Neff, M.; and Badler, N. I. 2017. PERFORM: Perceptual approach for adding OCEAN personality to human motion using Laban movement analysis. TOG 36(1):6.
  • [Eisenthal, Dror, and Ruppin2006] Eisenthal, Y.; Dror, G.; and Ruppin, E. 2006. Facial attractiveness: Beauty and the machine. Neural Computation 18(1):119–142.
  • [Goodfellow et al.2014] Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; and Bengio, Y. 2014. Generative adversarial nets. In NIPS. Curran Associates, Inc. 2672–2680.
  • [Gray et al.2010] Gray, D.; Yu, K.; Xu, W.; and Gong, Y. 2010. Predicting facial beauty without landmarks. ECCV 434–447.
  • [Hassin and Trope2000] Hassin, R., and Trope, Y. 2000. Facing faces: studies on the cognitive aspects of physiognomy. Journal of Personality and Social Psychology 78(5):837.
  • [Hu et al.2017] Hu, L.; Saito, S.; Wei, L.; Nagano, K.; Seo, J.; Fursund, J.; Sadeghi, I.; Sun, C.; Chen, Y.-C.; and Li, H. 2017. Avatar digitization from a single image for real-time rendering. TOG 36(6):195.
  • [Huang et al.2007] Huang, G. B.; Ramesh, M.; Berg, T.; and Learned-Miller, E. 2007. Labeled faces in the wild: A database for studying face recognition in unconstrained environments. Technical Report 07-49.
  • [Huang et al.2017] Huang, H.; Kalogerakis, E.; Yumer, E.; and Mech, R. 2017. Shape synthesis from sketches via procedural models and convolutional networks.
  • [Joo, Steen, and Zhu2015] Joo, J.; Steen, F. F.; and Zhu, S.-C. 2015. Automated facial trait judgment and election outcome prediction: Social dimensions of face. In ICCV.
  • [Kalogerakis et al.2012] Kalogerakis, E.; Chaudhuri, S.; Koller, D.; and Koltun, V. 2012. A probabilistic model for component-based shape synthesis. TOG 31(4):55.
  • [Le, Why, and Ashraf2011] Le, N.; Why, Y.; and Ashraf, G. 2011. Shape stylized face caricatures. Advances in Multimedia Modeling 536–547.
  • [Marsella et al.2013] Marsella, S.; Xu, Y.; Lhommet, M.; Feng, A.; Scherer, S.; and Shapiro, A. 2013. Virtual character performance from speech. In Proceedings of Eurographics Symposium on Computer Animation, 25–35. ACM.
  • [Mischel2013] Mischel, W. 2013. Personality and assessment. Psychology Press.
  • [Naumann et al.2009] Naumann, L. P.; Vazire, S.; Rentfrow, P. J.; and Gosling, S. D. 2009. Personality judgments based on physical appearance. Personality and Social Psychology Bulletin 35(12):1661–1671.
  • [Over and Cook2018] Over, H., and Cook, R. 2018. Where do spontaneous first impressions of faces come from? Cognition 170:190–200.
  • [Paysan et al.2009] Paysan, P.; Knothe, R.; Amberg, B.; Romdhani, S.; and Vetter, T. 2009. A 3d face model for pose and illumination invariant face recognition. In Advanced video and signal based surveillance, 2009. AVSS’09. Sixth IEEE International Conference on, 296–301. IEEE.
  • [Ramamoorthi and Hanrahan2001] Ramamoorthi, R., and Hanrahan, P. 2001. An efficient representation for irradiance environment maps. In SIGGRAPH, 497–500. ACM.
  • [Ritchie et al.2015] Ritchie, D.; Mildenhall, B.; Goodman, N. D.; and Hanrahan, P. 2015. Controlling procedural modeling programs with stochastically-ordered sequential monte carlo. TOG 34(4):105.
  • [Saito et al.2017] Saito, S.; Wei, L.; Hu, L.; Nagano, K.; and Li, H. 2017. Photorealistic facial texture inference using deep neural networks. In CVPR. IEEE.
  • [Sohre et al.2018] Sohre, N.; Adeagbo, M.; Helwig, N.; Lyford-Pike, S.; and Guy, S. 2018. Pvl: A framework for navigating the precision-variety trade-off in automated animation of smiles. AAAI.
  • [Suwajanakorn, Seitz, and Kemelmacher-Shlizerman2015] Suwajanakorn, S.; Seitz, S. M.; and Kemelmacher-Shlizerman, I. 2015. What makes tom hanks look like tom hanks. In ICCV.
  • [Szegedy et al.2015] Szegedy, C.; Liu, W.; Jia, Y.; Sermanet, P.; Reed, S.; Anguelov, D.; Erhan, D.; Vanhoucke, V.; and Rabinovich, A. 2015. Going deeper with convolutions. In CVPR, 1–9.
  • [Talton et al.2011] Talton, J. O.; Lou, Y.; Lesser, S.; Duke, J.; Měch, R.; and Koltun, V. 2011. Metropolis procedural modeling. TOG 30(2):11.
  • [Tian and Xiao2016] Tian, L., and Xiao, S. 2016. Facial feature exaggeration according to social psychology of face perception. Computer Graphics Forum 35(7):391–399.
  • [Vernon et al.2014] Vernon, R. J.; Sutherland, C. A.; Young, A. W.; and Hartley, T. 2014. Modeling first impressions from highly variable facial images. Proceedings of the National Academy of Sciences 111(32):E3353–E3361.
  • [Willis and Todorov2006] Willis, J., and Todorov, A. 2006. First impressions making up your mind after a 100-ms exposure to a face. Psychological Science 17(7):592–598.
  • [Xu et al.2015] Xu, J.; Jin, L.; Liang, L.; Feng, Z.; and Xie, D. 2015. A new humanlike facial attractiveness predictor with cascaded fine-tuning deep learning model. arXiv:1511.02465.
  • [Yi et al.2014] Yi, D.; Lei, Z.; Liao, S.; and Li, S. Z. 2014. Learning face representation from scratch. arXiv:1411.7923.
  • [Zell et al.2015] Zell, E.; Aliaga, C.; Jarabo, A.; Zibrek, K.; Gutierrez, D.; McDonnell, R.; and Botsch, M. 2015. To stylize or not to stylize?: the effect of shape and material stylization on the perception of computer-generated faces. TOG 34(6):184.
  • [Zhu and Ramanan2012] Zhu, X., and Ramanan, D. 2012. Face detection, pose estimation, and landmark localization in the wild. In CVPR, 2879–2886.