Temporal alignment of fine-grained human actions in videos is important ...
State-of-the-art methods for self-supervised sequential action alignment...
Narrated instructional videos often show and describe manipulations of s...
We present, for the first time, a comprehensive framework for egocentric...
Mixed reality headsets, such as the Microsoft HoloLens 2, are powerful s...
Modeling hand-object manipulations is essential for understanding how hu...
The lack of large-scale real datasets with annotations makes transfer lea...
We present a unified framework for understanding 3D hand and object inte...
We propose a single-shot approach for simultaneously detecting an object...
Most recent approaches to monocular 3D human pose estimation rely on Dee...
We propose an efficient approach to exploiting motion information from c...