Estimating Extreme 3D Image Rotation with Transformer Cross-Attention

03/05/2023
by Shay Dekel, et al.

The estimation of large and extreme image rotations plays a key role in multiple computer vision domains, where the rotated images are related by limited or non-overlapping fields of view. Contemporary approaches apply convolutional neural networks to compute a 4D correlation volume, from which the relative rotation between an image pair is estimated. In this work, we propose a cross-attention-based approach that uses CNN feature maps and a Transformer-Encoder to compute the cross-attention between the activation maps of the image pair, which is shown to be an improved equivalent of the 4D correlation volume used in previous works. In the suggested approach, higher attention scores are associated with image regions that encode visual cues of rotation. Our approach is end-to-end trainable and optimizes a simple regression loss. It is experimentally shown to outperform contemporary state-of-the-art schemes on commonly used image rotation datasets and benchmarks, establishing a new state-of-the-art accuracy on these datasets. We make our code publicly available.
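To make the link between cross-attention and a correlation volume concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. It assumes single-head attention with identity query, key, and value projections, and toy feature maps already flattened to one vector per spatial location. The names `correlation_volume`, `softmax`, and `cross_attention` are hypothetical. Under these assumptions, the attention weights are a row-wise softmax of the scaled all-pairs dot products, that is, a normalized form of the flattened 4D correlation volume.

```python
import math


def correlation_volume(feats_a, feats_b):
    """All-pairs dot products between two flattened feature maps.

    feats_a: list of N_a feature vectors (one per spatial location, length C).
    feats_b: list of N_b feature vectors of the same length C.
    Returns an N_a x N_b matrix; reshaping it to (H_a, W_a, H_b, W_b)
    would give the usual 4D correlation volume.
    """
    return [[sum(x * y for x, y in zip(a, b)) for b in feats_b] for a in feats_a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(feats_a, feats_b):
    """Single-head cross-attention with identity projections (an assumption).

    Queries come from image A; keys and values come from image B.
    Returns (attended features, attention weights).
    """
    dim = len(feats_a[0])
    scores = correlation_volume(feats_a, feats_b)
    # Scale by sqrt(C) and normalize each query's scores over image B.
    weights = [softmax([s / math.sqrt(dim) for s in row]) for row in scores]
    # Each output vector is a weighted average of image B's feature vectors.
    attended = [
        [sum(w * b[c] for w, b in zip(row, feats_b)) for c in range(dim)]
        for row in weights
    ]
    return attended, weights


# Toy 2x2 feature maps with C = 3 channels, flattened to 4 locations each.
feats_a = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 1.0, 0.0]]
feats_b = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [1.0, 1.0, 1.0], [0.0, 0.0, 2.0]]

attended, weights = cross_attention(feats_a, feats_b)
```

Each row of `weights` sums to one and peaks at the location in image B whose features correlate most strongly with the query location in image A. A trained model would add learned projections, multiple heads, and positional encodings on top of this.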


research
03/20/2021

Paying Attention to Multiscale Feature Maps in Multimodal Image Matching

We propose an attention-based approach for multimodal image patch matchi...
research
12/06/2022

AbHE: All Attention-based Homography Estimation

Homography estimation is a basic computer vision task, which aims to obt...
research
04/28/2021

Extreme Rotation Estimation using Dense Correlation Volumes

We present a technique for estimating the relative 3D rotation of an RGB...
research
05/17/2022

POViT: Vision Transformer for Multi-objective Design and Characterization of Nanophotonic Devices

We solve a fundamental challenge in semiconductor IC design: the fast an...
research
03/21/2021

Paying Attention to Activation Maps in Camera Pose Regression

Camera pose regression methods apply a single forward pass to the query ...
research
10/30/2018

Multimodal matching using a Hybrid Convolutional Neural Network

In this work we propose a novel Convolutional Neural Network (CNN) archi...
research
05/24/2017

Deep Rotation Equivariant Network

Recently, learning equivariant representations has attracted considerabl...
