Knowing When to Quit: Selective Cascaded Regression with Patch Attention for Real-Time Face Alignment

08/01/2021
by   Gil Shapira, et al.

Facial landmark (FLM) estimation is a critical component in many face-related applications. In this work, we aim to optimize for both accuracy and speed and explore the trade-off between them. Our key observation is that not all faces are created equal: frontal faces with neutral expressions converge faster than faces with extreme poses or expressions. To differentiate among samples, we train our model to predict the regression error after each iteration. If the current iteration is accurate enough, we stop iterating, which saves redundant iterations while keeping accuracy in check. We also observe that, because neighboring patches overlap, we can infer all FLMs from only a small number of patches without a major sacrifice in accuracy. Architecturally, we offer a multi-scale, patch-based, lightweight feature extractor with a fine-grained local patch attention module, which computes a weight for each patch from the information in the patch itself and enhances the expressive power of the patch features. We analyze the patch attention data to infer where the model attends when regressing facial landmarks and compare it to face attention in humans. Our model runs in real time on a mobile device GPU with 95 Mega Multiply-Add (MMA) operations, outperforming all state-of-the-art methods under 1000 MMA, with a normalized mean error of 8.16 on the 300W challenging dataset.
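The early-stopping idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' model: the names `selective_cascade`, `make_toy_stage`, and `error_threshold` are hypothetical, and the toy stage simply moves each landmark halfway toward a known target and reports the true remaining distance in place of a learned error prediction. It only shows the control flow, in which each iteration returns a refined estimate together with a predicted error and the cascade stops as soon as that error is small enough.

```python
from typing import Callable, List, Tuple

Landmarks = List[Tuple[float, float]]
# A stage maps the current estimate to (refined estimate, predicted error).
Stage = Callable[[Landmarks], Tuple[Landmarks, float]]


def selective_cascade(
    initial: Landmarks, stages: List[Stage], error_threshold: float
) -> Tuple[Landmarks, float, int]:
    """Run stages in order and stop once the predicted error is low enough."""
    landmarks = initial
    predicted_error = float("inf")
    used = 0
    for stage in stages:
        landmarks, predicted_error = stage(landmarks)
        used += 1
        if predicted_error <= error_threshold:
            break  # accurate enough: skip the remaining iterations
    return landmarks, predicted_error, used


def make_toy_stage(target: Landmarks, step: float = 0.5) -> Stage:
    """Hypothetical stand-in for a learned regressor and error predictor."""

    def stage(current: Landmarks) -> Tuple[Landmarks, float]:
        # Move every landmark a fixed fraction of the way toward the target.
        refined = [
            (x + step * (tx - x), y + step * (ty - y))
            for (x, y), (tx, ty) in zip(current, target)
        ]
        # Report the mean Euclidean distance to the target as the "prediction".
        error = sum(
            ((x - tx) ** 2 + (y - ty) ** 2) ** 0.5
            for (x, y), (tx, ty) in zip(refined, target)
        ) / len(target)
        return refined, error

    return stage


target = [(0.0, 0.0), (1.0, 0.0)]
stages = [make_toy_stage(target) for _ in range(5)]

# An "easy" sample starts close to the target, a "hard" one starts far away.
_, _, easy_iterations = selective_cascade([(0.1, 0.0), (1.1, 0.0)], stages, 0.06)
_, _, hard_iterations = selective_cascade([(0.8, 0.0), (1.8, 0.0)], stages, 0.06)
print(easy_iterations, hard_iterations)
```

In this toy setting the easy sample exits after fewer iterations than the hard one, which mirrors the claim that frontal, neutral faces converge faster. Raising or lowering `error_threshold` trades accuracy against the average number of iterations.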


