DeePoint: Pointing Recognition and Direction Estimation From A Fixed View

04/14/2023
by Shu Nakamura, et al.

In this paper, we realize automatic visual recognition and direction estimation of pointing. We introduce the first neural pointing understanding method, which rests on two key contributions. The first is a first-of-its-kind large-scale dataset for pointing recognition and direction estimation, which we refer to as the DP Dataset. The DP Dataset consists of more than 2 million frames of over 33 people pointing in various styles, with each frame annotated with pointing timings and 3D directions. The second is DeePoint, a novel deep network model for joint recognition and 3D direction estimation of pointing. DeePoint is a Transformer-based network that fully leverages the spatio-temporal coordination of the body parts, not just the hands. Through extensive experiments, we demonstrate the accuracy and efficiency of DeePoint. We believe the DP Dataset and DeePoint will serve as a sound foundation for visual human intention understanding.


