Where in the World is this Image? Transformer-based Geo-localization in the Wild

04/29/2022
by Shraman Pramanick, et al.

Predicting the geographic location of a single ground-level RGB image taken anywhere in the world (geo-localization) is a very challenging problem. The challenges include the huge diversity of images across environmental scenarios and the drastic variation in the appearance of the same location with time of day, weather, and season; more importantly, the prediction must be made from a single image that may contain only a few geo-locating cues. For these reasons, most existing works are restricted to specific cities, imagery, or worldwide landmarks. In this work, we focus on developing an efficient solution to planet-scale single-image geo-localization. To this end, we propose TransLocator, a unified dual-branch transformer network that attends to tiny details over the entire image and produces robust feature representations under extreme appearance variations. TransLocator takes an RGB image and its semantic segmentation map as inputs, exchanges information between its two parallel branches after each transformer layer, and simultaneously performs geo-localization and scene recognition in a multi-task fashion. We evaluate TransLocator on four benchmark datasets (Im2GPS, Im2GPS3k, YFCC4k, and YFCC26k) and obtain 5.5%, 14.1%, 4.9%, and 9.9% continent-level accuracy improvements over the state of the art. TransLocator is also validated on real-world test images and found to be more effective than previous methods.
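
Geo-localization benchmarks of this kind are typically scored by the fraction of test images whose predicted coordinates fall within a given great-circle distance of the ground truth. The sketch below illustrates that metric under stated assumptions: the threshold names and values (1, 25, 200, 750, and 2500 km) follow the convention common in Im2GPS-style evaluations rather than anything stated in the abstract, and the function names `great_circle_km` and `accuracy_at_thresholds` are hypothetical, not taken from the authors' code.

```python
import math

# Assumed distance thresholds in km (common Im2GPS-style convention,
# not specified in the abstract).
THRESHOLDS_KM = {
    "street": 1,
    "city": 25,
    "region": 200,
    "country": 750,
    "continent": 2500,
}

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    # Clamp to 1.0 to guard against floating-point overshoot near antipodes.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def accuracy_at_thresholds(predictions, ground_truth, thresholds=THRESHOLDS_KM):
    """Fraction of predictions within each distance threshold of the ground truth.

    predictions, ground_truth: equal-length lists of (lat, lon) tuples in degrees.
    """
    if not predictions or len(predictions) != len(ground_truth):
        raise ValueError("need two non-empty lists of equal length")
    distances = [
        great_circle_km(p[0], p[1], g[0], g[1])
        for p, g in zip(predictions, ground_truth)
    ]
    return {
        name: sum(d <= km for d in distances) / len(distances)
        for name, km in thresholds.items()
    }


if __name__ == "__main__":
    # Toy data: one exact hit, one prediction off by one degree of latitude.
    preds = [(48.8566, 2.3522), (40.0, -74.0)]
    truth = [(48.8566, 2.3522), (41.0, -74.0)]
    print(accuracy_at_thresholds(preds, truth))
```

In the toy data, the second prediction is off by one degree of latitude (roughly 111 km), so it counts as correct at the region, country, and continent scales but not at the street or city scales.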
