Diverse, Difficult, and Odd Instances (D2O): A New Test Set for Object Classification

01/29/2023
by Ali Borji, et al.

Test sets are an integral part of evaluating models and gauging progress in object recognition, and more broadly in computer vision and AI. Existing test sets for object recognition, however, suffer from shortcomings such as bias towards ImageNet characteristics and idiosyncrasies (e.g., ImageNet-V2), being limited to certain types of stimuli (e.g., indoor scenes in ObjectNet), and underestimating model performance (e.g., ImageNet-A). To mitigate these problems, we introduce a new test set, called D2O, which is sufficiently different from existing test sets. The images are a mix of generated images and images crawled from the web. They are diverse, unmodified, and representative of real-world scenarios, and they cause state-of-the-art models to misclassify them with high confidence. To emphasize generalization, our dataset by design does not come paired with a training set. It contains 8,060 images spread across 36 categories, out of which 29 appear in ImageNet. The best Top-1 accuracy on our dataset is around 60%, which is much lower than the best Top-1 accuracy on ImageNet. We find that popular vision APIs perform very poorly in detecting objects over D2O categories such as “faces”, “cars”, and “cats”. Our dataset also comes with a “miscellaneous” category, over which we test the image tagging models. Overall, our investigations demonstrate that the D2O test set contains a mix of images with varied levels of difficulty and is predictive of the average-case performance of models. It can challenge object recognition models for years to come and can spur more research in this fundamental area.


