Walk and Learn: Facial Attribute Representation Learning from Egocentric Video and Contextual Data

04/21/2016
by Jing Wang, et al.

The way people look in terms of facial attributes (ethnicity, hair color, facial hair, etc.) and the clothes or accessories they wear (sunglasses, hat, hoodies, etc.) is highly dependent on geo-location and weather conditions, respectively. This work explores, for the first time, the use of this contextual information, as people with wearable cameras walk across different neighborhoods of a city, in order to learn a rich feature representation for facial attribute classification, without the costly manual annotation required by previous methods. By tracking the faces of casual walkers in more than 40 hours of egocentric video, we are able to cover tens of thousands of different identities and automatically extract nearly 5 million image pairs, either connected by the same face track or drawn from different tracks, along with their weather and location context, under pose and lighting variations. These image pairs are then fed into a deep network that preserves the similarity of images connected by the same track, in order to capture identity-related attribute features, and that optimizes for location and weather prediction to capture additional facial attribute features. Finally, the network is fine-tuned with manually annotated samples. We perform an extensive experimental analysis on wearable data and on two standard benchmark datasets of web images (LFWA and CelebA). Our method outperforms a network trained from scratch by a large margin. Moreover, even without using the manually annotated identity labels that previous methods require for pre-training, our approach achieves results better than the state of the art.
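To make the training setup concrete, the sketch below shows one plausible way to combine the two objectives the abstract describes: a contrastive (similarity-preserving) loss over image pairs from face tracks, plus cross-entropy heads for location and weather prediction on a shared encoder. This is a minimal illustration in PyTorch, not the paper's actual architecture; the class and parameter names (WalkAndLearnNet, embed_dim, num_locations, num_weather, margin) and all layer sizes are assumptions for illustration only.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class WalkAndLearnNet(nn.Module):
    """Hypothetical sketch: a shared face encoder trained with (a) a
    contrastive objective on face-track image pairs and (b) classification
    heads for location and weather context. Layer sizes and head
    definitions are illustrative, not taken from the paper."""

    def __init__(self, embed_dim=256, num_locations=10, num_weather=5):
        super().__init__()
        # Shared convolutional encoder (stand-in for the paper's CNN).
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 64, 5, stride=2, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
            nn.Linear(64 * 4 * 4, embed_dim),
        )
        # Context heads: predict where / in what weather a face was seen.
        self.loc_head = nn.Linear(embed_dim, num_locations)
        self.weather_head = nn.Linear(embed_dim, num_weather)

    def forward(self, x):
        z = self.encoder(x)
        return z, self.loc_head(z), self.weather_head(z)


def multitask_loss(model, img_a, img_b, same_track, loc, weather, margin=1.0):
    """Contrastive term pulls together embeddings of images connected by
    the same face track (same_track == 1) and pushes apart images from
    different tracks (same_track == 0); cross-entropy terms supervise the
    location and weather context heads."""
    z_a, loc_logits, weather_logits = model(img_a)
    z_b, _, _ = model(img_b)
    d = F.pairwise_distance(z_a, z_b)
    contrastive = torch.mean(
        same_track * d.pow(2)
        + (1 - same_track) * F.relu(margin - d).pow(2)
    )
    context = (F.cross_entropy(loc_logits, loc)
               + F.cross_entropy(weather_logits, weather))
    return contrastive + context
```

After pre-training with this combined loss on the automatically extracted pairs, the encoder would be fine-tuned on manually annotated attribute samples, as the abstract describes; the relative weighting of the contrastive and context terms is another free choice not specified here.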


