Where We Are and What We're Looking At: Query Based Worldwide Image Geo-localization Using Hierarchies and Scenes

03/07/2023
by Brandon Clark, et al.

Determining the exact latitude and longitude at which a photo was taken is a useful and widely applicable task, yet it remains exceptionally difficult despite the accelerated progress of other computer vision tasks. Most previous approaches learn a single representation of the query image, which is then classified at different levels of geographic granularity. These approaches fail to exploit the different visual cues that give context to different hierarchies, such as the country, state, and city levels. To this end, we introduce an end-to-end transformer-based architecture that exploits the relationship between different geographic levels (which we refer to as hierarchies) and the corresponding visual scene information in an image through hierarchical cross-attention. We achieve this by learning a query for each geographic hierarchy and scene type. Furthermore, we learn a separate representation for each type of environmental scene, as different scenes in the same location are often defined by completely different visual features. We achieve state-of-the-art street-level accuracy on four standard geo-localization datasets: Im2GPS, Im2GPS3k, YFCC4k, and YFCC26k. We also qualitatively demonstrate how our method learns different representations for different visual hierarchies and scenes, which previous methods have not demonstrated. These existing testing datasets mostly consist of iconic landmarks or images taken from social media, which makes the task either one of memorization or biased towards certain places. To address this issue, we introduce a much harder testing dataset, Google-World-Streets-15k, comprising images taken from Google Street View that cover the whole planet, and present state-of-the-art results on it. Our code will be made available in the camera-ready version.
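
As a rough illustration of what accuracy at a given geographic granularity can mean, the sketch below scores predicted coordinates by great-circle distance to the ground truth. The threshold values (1 km for street level up to 2500 km for continent level) are an assumption based on common practice in geo-localization benchmarks and are not stated in the abstract; the function names `haversine_km` and `accuracy_at_thresholds` are hypothetical and are not taken from the authors' code.

```python
import math

# Assumed distance thresholds in km. These follow common practice in
# geo-localization evaluation; the abstract itself does not state them.
THRESHOLDS_KM = {
    "street": 1,
    "city": 25,
    "region": 200,
    "country": 750,
    "continent": 2500,
}

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    # Clamp to 1.0 to guard against floating-point overshoot near antipodes.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def accuracy_at_thresholds(predictions, ground_truth):
    """Fraction of predictions that fall within each distance threshold.

    Both arguments are equal-length lists of (lat, lon) tuples in degrees.
    """
    if not predictions or len(predictions) != len(ground_truth):
        raise ValueError("need two non-empty lists of equal length")
    distances = [haversine_km(*p, *g) for p, g in zip(predictions, ground_truth)]
    return {
        name: sum(d <= km for d in distances) / len(distances)
        for name, km in THRESHOLDS_KM.items()
    }
```

Under these assumed thresholds, a prediction counts as correct at street level only if it lies within 1 km of the true location, which is why that figure is the strictest of the five.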


