LOTR: Face Landmark Localization Using Localization Transformer

09/21/2021
by Ukrit Watchareeruetai, et al.

This paper presents a novel Transformer-based facial landmark localization network named Localization Transformer (LOTR). The proposed framework is a direct coordinate regression approach leveraging a Transformer network to better utilize the spatial information in the feature map. An LOTR model consists of three main modules: 1) a visual backbone that converts an input image into a feature map, 2) a Transformer module that improves the feature representation from the visual backbone, and 3) a landmark prediction head that directly predicts the landmark coordinates from the Transformer's representation. Given cropped-and-aligned face images, the proposed LOTR can be trained end-to-end without requiring any post-processing steps. This paper also introduces the smooth-Wing loss function, which addresses the gradient discontinuity of the Wing loss, leading to better convergence than standard loss functions such as L1, L2, and Wing loss. Experimental results on the JD landmark dataset provided by the First Grand Challenge of 106-Point Facial Landmark Localization indicate the superiority of LOTR over the existing methods on the leaderboard and two recent heatmap-based approaches.
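The gradient discontinuity mentioned above can be seen in the standard Wing loss, w ln(1 + |x|/ε) for |x| < w and |x| - C otherwise, whose slope jumps from -w/ε to +w/ε as the error crosses zero. The sketch below compares it with one possible smoothed variant that swaps in a quadratic piece for |x| < t. The abstract does not give the smooth-Wing definition, so the function `smooth_wing_loss`, the inner threshold `t`, the constants `s`, `c1`, `c2`, and all default values are assumptions for illustration only, and this variant smooths only the kink at zero.

```python
import math


def wing_loss(x, w=10.0, eps=2.0):
    """Standard Wing loss for a scalar residual x (w and eps are illustrative)."""
    a = abs(x)
    if a < w:
        return w * math.log(1.0 + a / eps)
    # C joins the logarithmic and linear pieces at |x| = w.
    c = w - w * math.log(1.0 + w / eps)
    return a - c


def smooth_wing_loss(x, w=10.0, eps=2.0, t=0.01):
    """Hypothetical smoothed variant: quadratic for |x| < t.

    This is an assumed form, not necessarily the paper's definition.
    """
    a = abs(x)
    # s makes the quadratic's slope equal the log piece's slope at |x| = t.
    s = w / (2.0 * t * (eps + t))
    # c1 and c2 keep the loss value continuous at |x| = t and |x| = w.
    c1 = w * math.log(1.0 + t / eps) - s * t * t
    c2 = w - w * math.log(1.0 + w / eps) + c1
    if a < t:
        return s * a * a
    if a < w:
        return w * math.log(1.0 + a / eps) - c1
    return a - c2


def slope_near_zero(loss, h=1e-6):
    """One-sided finite-difference slope of a loss just to the right of zero."""
    return (loss(h) - loss(0.0)) / h


if __name__ == "__main__":
    print("Wing slope near 0:       ", slope_near_zero(wing_loss))
    print("smoothed slope near 0:   ", slope_near_zero(smooth_wing_loss))
```

Under these assumptions, the Wing slope just right of zero stays near w/ε, while the smoothed variant's slope shrinks toward zero with the error, so small residuals no longer receive a large constant-magnitude gradient.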

research
01/28/2018

Joint Voxel and Coordinate Regression for Accurate 3D Facial Landmark Localization

3D face shape is more expressive and viewpoint-consistent than its 2D co...
research
07/08/2022

RePFormer: Refinement Pyramid Transformer for Robust Facial Landmark Detection

This paper presents a Refinement Pyramid Transformer (RePFormer) for rob...
research
10/19/2020

A Backbone Replaceable Fine-tuning Network for Stable Face Alignment

Heatmap regression based face alignment algorithms have achieved promine...
research
09/23/2022

Transformer-Based Microbubble Localization

Ultrasound Localization Microscopy (ULM) is an emerging technique that e...
research
07/13/2015

Unconstrained Facial Landmark Localization with Backbone-Branches Fully-Convolutional Networks

This paper investigates how to rapidly and accurately localize facial la...
research
09/22/2022

Colonoscopy Landmark Detection using Vision Transformers

Colonoscopy is a routine outpatient procedure used to examine the colon ...
research
03/19/2022

Multi-Domain Multi-Definition Landmark Localization for Small Datasets

We present a novel method for multi image domain and multi-landmark defi...
