
Learning Omnidirectional Flow in 360-degree Video via Siamese Representation

by Keshav Bhandari, et al.

Optical flow estimation in omnidirectional videos faces two significant issues: the lack of benchmark datasets and the challenge of adapting perspective video-based methods to the omnidirectional setting. This paper proposes the first perceptually natural-synthetic omnidirectional benchmark dataset with a 360-degree field of view, FLOW360, with 40 different videos and 4,000 video frames. We conduct a comprehensive characteristic analysis and comparison between our dataset and existing optical flow datasets, which demonstrates its perceptual realism, uniqueness, and diversity. To accommodate the omnidirectional nature of the data, we present a novel Siamese representation Learning framework for Omnidirectional Flow (SLOF). We train our network in a contrastive manner with a hybrid loss function that combines contrastive loss and optical flow loss. Extensive experiments verify the proposed framework's effectiveness and show up to 40% performance improvement over state-of-the-art approaches. Our FLOW360 dataset and code are available at
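The hybrid loss mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything below is an assumption: the flow term is taken to be the average endpoint error, the contrastive term is stood in for by a negative cosine similarity between two feature vectors from the Siamese branches, and the names `endpoint_error`, `negative_cosine`, `hybrid_loss`, and the `weight` parameter are hypothetical.

```python
import numpy as np

def endpoint_error(flow_pred, flow_gt):
    """Mean Euclidean distance between predicted and ground-truth flow.

    Both arrays have shape (H, W, 2): one 2-D displacement per pixel.
    """
    return float(np.mean(np.linalg.norm(flow_pred - flow_gt, axis=-1)))

def negative_cosine(feat_a, feat_b, eps=1e-8):
    """Negative cosine similarity between two 1-D feature vectors.

    Assumption: a stand-in for the contrastive term, not the paper's loss.
    """
    a = feat_a / (np.linalg.norm(feat_a) + eps)
    b = feat_b / (np.linalg.norm(feat_b) + eps)
    return float(-np.dot(a, b))

def hybrid_loss(flow_pred, flow_gt, feat_a, feat_b, weight=0.5):
    """Flow term plus a weighted similarity term (hypothetical weighting)."""
    flow_term = endpoint_error(flow_pred, flow_gt)
    similarity_term = negative_cosine(feat_a, feat_b)
    return flow_term + weight * similarity_term

# Toy inputs: every predicted vector is off by (3, 4), so the flow term is 5.
flow_gt = np.zeros((4, 8, 2))
flow_pred = np.tile(np.array([3.0, 4.0]), (4, 8, 1))
features = np.array([1.0, 2.0, 3.0])
loss = hybrid_loss(flow_pred, flow_gt, features, features)
```

With identical feature vectors the similarity term approaches -1, so the combined value is lower than the flow term alone; the relative weight of the two terms is a design choice the abstract does not specify.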




Optical Flow Estimation in 360° Videos: Dataset, Model and Application

Optical flow estimation has been a long-lasting and fundamental problem ...

GlobalFlowNet: Video Stabilization using Deep Distilled Global Motion Estimates

Videos shot by laymen using hand-held cameras contain undesirable shaky ...

Motion-Focused Contrastive Learning of Video Representations

Motion, as the most distinct phenomenon in a video to involve the change...

PanoFlow: Learning Optical Flow for Panoramic Images

Optical flow estimation is a basic task in self-driving and robotics sys...

Revisiting Optical Flow Estimation in 360 Videos

Nowadays 360 video analysis has become a significant research topic in t...

Self-supervised Video Representation Learning with Cross-Stream Prototypical Contrasting

Instance-level contrastive learning techniques, which rely on data augme...

From Third Person to First Person: Dataset and Baselines for Synthesis and Retrieval

First-person (egocentric) and third person (exocentric) videos are drast...