Extract-and-Adaptation Network for 3D Interacting Hand Mesh Recovery

09/05/2023
by JoonKyu Park, et al.

Understanding how two hands interact with each other is a key component of accurate 3D interacting hand mesh recovery. However, recent Transformer-based methods struggle to learn the interaction between two hands because they directly use the two hand features as input tokens, which results in the distant token problem. The distant token problem arises when input tokens lie in heterogeneous spaces, causing the Transformer to fail to capture the correlation between them. Previous Transformer-based methods suffer from this problem especially when the poses of the two hands are very different, because they project features from a backbone into separate left-hand and right-hand features. We present EANet, an extract-and-adaptation network, whose main component is EABlock. Rather than directly using the two hand features as input tokens, EABlock uses two complementary types of novel tokens, SimToken and JoinToken. Both tokens are derived from a combination of the separated two hand features; hence, they are much more robust to the distant token problem. Using the two types of tokens, EABlock effectively extracts the interaction feature and adapts it to each hand. The proposed EANet achieves state-of-the-art performance on 3D interacting hands benchmarks. The code is available at https://github.com/jkpark0825/EANet.
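
The abstract does not say how SimToken and JoinToken are actually constructed, so the following is only a minimal sketch of the general idea it describes: building tokens from a combination of the left-hand and right-hand features, so that the tokens passed on to later attention layers share one space. Every name here (`concat_mix`, `attention_mix`, the feature sizes, the random projection) is hypothetical and is not taken from the EANet code.

```python
import math
import random

def matmul(a, b):
    # Plain (rows x inner) @ (inner x cols) matrix product on nested lists.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def softmax_rows(m):
    # Numerically stable softmax applied to each row.
    out = []
    for row in m:
        mx = max(row)
        exps = [math.exp(v - mx) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out

def cross_attention(query, key_value):
    # Scaled dot-product attention without learned weights (illustration only).
    d = len(query[0])
    scores = matmul(query, transpose(key_value))
    scores = [[v / math.sqrt(d) for v in row] for row in scores]
    return matmul(softmax_rows(scores), key_value)

def concat_mix(left, right, proj):
    # Assumption: one way to combine the hands is to concatenate their
    # features along the channel axis and project back to C channels.
    joined = [l + r for l, r in zip(left, right)]
    return matmul(joined, proj)

def attention_mix(left, right):
    # Assumption: another way is to let each hand attend to the other
    # and sum the two results, which makes the token symmetric in the hands.
    l2r = cross_attention(left, right)
    r2l = cross_attention(right, left)
    return [[a + b for a, b in zip(x, y)] for x, y in zip(l2r, r2l)]

# Toy features: N tokens per hand, C channels per token (sizes are arbitrary).
N, C = 4, 3
rng = random.Random(0)
left = [[rng.uniform(-1, 1) for _ in range(C)] for _ in range(N)]
right = [[rng.uniform(-1, 1) for _ in range(C)] for _ in range(N)]
proj = [[rng.uniform(-1, 1) for _ in range(C)] for _ in range(2 * C)]

concat_tokens = concat_mix(left, right, proj)
attn_tokens = attention_mix(left, right)
```

In both variants each resulting token depends on both hands, in contrast to feeding the left and right features to a Transformer as separate tokens. A real model would use learned projections in place of the fixed random matrix used here.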

