FlowFormer++: Masked Cost Volume Autoencoding for Pretraining Optical Flow Estimation

03/02/2023
by Xiaoyu Shi, et al.

FlowFormer introduces a transformer architecture into optical flow estimation and achieves state-of-the-art performance. The core component of FlowFormer is the transformer-based cost-volume encoder. Inspired by the recent success of masked autoencoding (MAE) pretraining in unleashing transformers' capacity to encode visual representations, we propose Masked Cost Volume Autoencoding (MCVA) to enhance FlowFormer by pretraining the cost-volume encoder with a novel MAE scheme. First, we introduce a block-sharing masking strategy to prevent masked information leakage, as the cost maps of neighboring source pixels are highly correlated. Second, we propose a novel pretext reconstruction task, which encourages the cost-volume encoder to aggregate long-range information and ensures pretraining-finetuning consistency. We also show how to modify the FlowFormer architecture to accommodate masks during pretraining. Pretrained with MCVA, FlowFormer++ ranks 1st among published methods on both the Sintel and KITTI-2015 benchmarks. Specifically, FlowFormer++ achieves 1.07 and 1.94 average end-point error (AEPE) on the clean and final passes of the Sintel benchmark, corresponding to 7.76% and 7.18% error reductions relative to FlowFormer. FlowFormer++ obtains 4.52 F1-all on the KITTI-2015 test set, improving on FlowFormer by 0.16.
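
The block-sharing idea can be illustrated with a minimal sketch. The abstract states only that masks are shared to avoid leakage between the highly correlated cost maps of neighboring source pixels. Everything else here is an assumption made for illustration: the function name `block_sharing_masks`, the square block size, uniform random masking of cost-map cells, and the `mask_ratio` parameter are hypothetical and are not taken from the paper.

```python
import random


def block_sharing_masks(src_h, src_w, map_h, map_w, block=2, mask_ratio=0.5, seed=0):
    """Return one cost-map mask per source pixel (hypothetical helper).

    Source pixels that fall in the same `block` x `block` tile share the
    identical mask, so a masked cost value cannot simply be copied from the
    (highly correlated) cost map of a neighboring source pixel.
    True marks a masked cost-map cell.
    """
    rng = random.Random(seed)
    n_cells = map_h * map_w
    n_masked = int(round(mask_ratio * n_cells))

    block_masks = {}  # one mask per tile of source pixels
    masks = [[None] * src_w for _ in range(src_h)]
    for y in range(src_h):
        for x in range(src_w):
            key = (y // block, x // block)
            if key not in block_masks:
                # Assumption: cells are masked uniformly at random.
                masked = set(rng.sample(range(n_cells), n_masked))
                block_masks[key] = tuple(i in masked for i in range(n_cells))
            masks[y][x] = block_masks[key]
    return masks


if __name__ == "__main__":
    masks = block_sharing_masks(4, 4, 4, 4, block=2, mask_ratio=0.5)
    # Pixels (0, 0) and (1, 1) lie in the same 2x2 tile and share a mask.
    print(masks[0][0] == masks[1][1])
```

With independent per-pixel masks, a cell hidden in one cost map would likely be visible in an adjacent pixel's cost map; sharing the mask within a tile removes that shortcut, which is the leakage the abstract refers to.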

research
06/08/2023

FlowFormer: A Transformer Architecture and Its Masked Cost Volume Autoencoding for Optical Flow

This paper introduces a novel transformer-based network architecture, Fl...
research
03/30/2022

FlowFormer: A Transformer Architecture for Optical Flow

We introduce Optical Flow TransFormer (FlowFormer), a transformer-based ...
research
07/31/2023

SAMFlow: Eliminating Any Fragmentation in Optical Flow with Segment Anything Model

Optical flow estimation aims to find the 2D dense motion field between t...
research
01/16/2021

Optical Flow Estimation via Motion Feature Recovery

Optical flow estimation with occlusion or large displacement is a proble...
research
04/17/2023

LLA-FLOW: A Lightweight Local Aggregation on Cost Volume for Optical Flow Estimation

Lack of texture often causes ambiguity in matching, and handling this is...
research
12/20/2022

CGCV:Context Guided Correlation Volume for Optical Flow Neural Networks

Optical flow, which computes the apparent motion from a pair of video fr...
research
12/23/2019

The Five Elements of Flow

In this work we propose five concrete steps to improve the performance o...