EgoDistill: Egocentric Head Motion Distillation for Efficient Video Understanding

01/05/2023
by   Shuhan Tan, et al.

Recent advances in egocentric video understanding models are promising, but their heavy computational expense is a barrier for many real-world applications. To address this challenge, we propose EgoDistill, a distillation-based approach that learns to reconstruct heavy egocentric video clip features by combining the semantics from a sparse set of video frames with the head motion from lightweight IMU readings. We further devise a novel self-supervised training strategy for IMU feature learning. Our method leads to significant improvements in efficiency, requiring 200x fewer GFLOPs than equivalent video models. We demonstrate its effectiveness on the Ego4D and EPIC-Kitchens datasets, where our method outperforms state-of-the-art efficient video understanding methods.
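The distillation idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify the architecture, feature sizes, or loss, so the single linear fusion layer, the mean-squared-error objective, the dimensions, and every name below (`fuse`, `sgd_step`, `frame_feat`, `imu_feat`, `teacher_feat`) are assumptions chosen only to show a lightweight student regressing onto a heavy teacher's clip feature.

```python
import random

def fuse(frame_feat, imu_feat, W, b):
    """Hypothetical student: concatenate one frame's feature with an IMU
    feature and apply a single linear layer (an assumed stand-in)."""
    x = frame_feat + imu_feat  # list concatenation
    return [sum(w * v for w, v in zip(row, x)) + bi for row, bi in zip(W, b)]

def mse(pred, target):
    """Mean squared error, used here as an assumed distillation loss."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)

def sgd_step(frame_feat, imu_feat, teacher_feat, W, b, lr=0.05):
    """One gradient step pulling the student output toward the teacher
    feature. Updates W and b in place; returns the loss before the update."""
    x = frame_feat + imu_feat
    pred = fuse(frame_feat, imu_feat, W, b)
    n = len(teacher_feat)
    for i in range(n):
        grad_out = 2.0 * (pred[i] - teacher_feat[i]) / n
        for j in range(len(x)):
            W[i][j] -= lr * grad_out * x[j]
        b[i] -= lr * grad_out
    return mse(pred, teacher_feat)

# Made-up toy inputs: in practice these would come from an image encoder,
# an IMU encoder, and a heavy video model (the teacher), respectively.
random.seed(0)
frame_feat = [random.uniform(-1, 1) for _ in range(4)]
imu_feat = [random.uniform(-1, 1) for _ in range(2)]
teacher_feat = [random.uniform(-1, 1) for _ in range(3)]

W = [[0.0] * 6 for _ in range(3)]  # 3 outputs, 4 + 2 inputs
b = [0.0] * 3

losses = [sgd_step(frame_feat, imu_feat, teacher_feat, W, b) for _ in range(50)]
```

In this toy setting the recorded loss shrinks across steps, which mirrors the goal stated in the abstract: at inference time only the cheap frame and IMU inputs are needed, while the expensive video model serves solely as a training target.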


