Google Landmarks Dataset v2 – A Large-Scale Benchmark for Instance-Level Recognition and Retrieval

04/03/2020
by Tobias Weyand, et al.

While image retrieval and instance recognition techniques are progressing rapidly, there is a need for challenging datasets that accurately measure their performance while posing novel challenges relevant to practical applications. We introduce the Google Landmarks Dataset v2 (GLDv2), a new benchmark for large-scale, fine-grained instance recognition and image retrieval in the domain of human-made and natural landmarks. GLDv2 is the largest such dataset to date by a large margin, including over 5M images and 200k distinct instance labels. Its test set consists of 118k images with ground truth annotations for both the retrieval and recognition tasks. The ground truth construction involved over 800 hours of human annotator work. Our new dataset has several challenging properties inspired by real-world applications that previous datasets did not consider: an extremely long-tailed class distribution, a large fraction of out-of-domain test photos, and large intra-class variability. The dataset is sourced from Wikimedia Commons, the world's largest crowdsourced collection of landmark photos. We provide baseline results for both recognition and retrieval tasks based on state-of-the-art methods as well as competitive results from a public challenge. We further demonstrate the suitability of the dataset for transfer learning by showing that image embeddings trained on it achieve competitive retrieval performance on independent datasets. The dataset images, ground truth, and metric scoring code are available at https://github.com/cvdfoundation/google-landmark.
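To make the embedding-based retrieval setting concrete, the following is a minimal sketch of ranking index images against a query by cosine similarity. It is an illustration under stated assumptions, not the method or scoring code from the paper: the function names `cosine_similarity` and `rank_index` are hypothetical, and the three-dimensional vectors are toy values rather than GLDv2 embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_index(query, index):
    """Return positions of index embeddings, most similar to the query first.

    Hypothetical helper for illustration; a real system would typically use
    an approximate nearest-neighbor library instead of a full scan.
    """
    scored = [(cosine_similarity(query, emb), i) for i, emb in enumerate(index)]
    scored.sort(key=lambda pair: -pair[0])  # highest similarity first
    return [i for _, i in scored]


# Toy embeddings (made up for this example, not taken from the dataset).
index_embeddings = [
    [1.0, 0.0, 0.0],  # same direction as the query's dominant axis
    [0.0, 1.0, 0.0],  # nearly orthogonal to the query
    [0.9, 0.1, 0.0],  # close to the query, slightly rotated
]
query_embedding = [2.0, 0.1, 0.0]

ranking = rank_index(query_embedding, index_embeddings)
print(ranking)
```

Because cosine similarity ignores vector length, the query's larger magnitude does not affect the ranking; only its direction relative to each index embedding matters.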


