Human actions in egocentric videos are often hand-object interactions co...
One promising use case of AI assistants is to help with complex procedur...
We present AssemblyHands, a large-scale benchmark dataset with accurate ...
Temporal action segmentation from videos aims at the dense labeling of v...
Assembly101 is a new procedural activity dataset featuring 4321 videos o...
Modeling the visual changes that an action brings to a scene is critical...
Can we teach a robot to recognize and make predictions for activities th...
This technical report extends our work presented in [9] with more experi...
Future prediction requires reasoning from current and past observations ...
The task of temporally detecting and segmenting actions in untrimmed vid...
When judging style, a key question that often arises is whether or not a...
How can we teach a robot to predict what will happen next for an activit...
This paper presents a new method for unsupervised segmentation of comple...
This paper is motivated by a young boy's capability to recognize an il...