ROBIN : A Benchmark for Robustness to Individual Nuisances in Real-World Out-of-Distribution Shifts

11/29/2021
by Bingchen Zhao, et al.

Enhancing the robustness of vision algorithms in real-world scenarios has proven very challenging. One reason is that existing robustness benchmarks are limited: they either rely on synthetic data or simply measure robustness as generalization between datasets, and hence ignore the effects of individual nuisance factors. In this work, we introduce ROBIN, a benchmark dataset for diagnosing the robustness of vision algorithms to individual nuisances in real-world images. ROBIN builds on 10 rigid categories from the PASCAL VOC 2012 and ImageNet datasets and includes out-of-distribution (OOD) examples with respect to the object's 3D pose, shape, texture, context, and weather conditions. ROBIN is richly annotated to enable benchmarking of models for image classification, object detection, and 3D pose estimation. We provide results for a number of popular baselines and make several interesting observations: 1. Some nuisance factors have a much stronger negative effect on performance than others. Moreover, the negative effect of an OOD nuisance depends on the downstream vision task. 2. Current approaches to enhancing OOD robustness with strong data augmentation have only marginal effects in real-world OOD scenarios, and sometimes even reduce OOD performance. 3. We do not observe any significant differences between convolutional and transformer architectures in terms of OOD robustness. We believe our dataset provides a rich testbed for studying the OOD robustness of vision algorithms and will help to significantly push forward research in this area.
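The idea of diagnosing robustness per nuisance factor, rather than as a single cross-dataset score, can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the function names, the `"iid"` label for in-distribution test images, and the toy records are hypothetical and are not taken from the ROBIN dataset or its evaluation code.

```python
from collections import defaultdict


def accuracy_by_nuisance(records):
    """Group predictions by nuisance label and return accuracy per group.

    `records` is an iterable of (nuisance_label, is_correct) pairs.
    This record format is a hypothetical choice for the sketch.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for nuisance, is_correct in records:
        totals[nuisance] += 1
        hits[nuisance] += int(is_correct)
    return {name: hits[name] / totals[name] for name in totals}


def accuracy_drop(per_nuisance, baseline="iid"):
    """Accuracy lost on each OOD split relative to the in-distribution split."""
    base = per_nuisance[baseline]
    return {
        name: base - acc
        for name, acc in per_nuisance.items()
        if name != baseline
    }


# Made-up toy predictions, NOT results from the paper.
toy_records = [
    ("iid", True), ("iid", True), ("iid", True), ("iid", False),
    ("weather", True), ("weather", False),
    ("texture", True), ("texture", True), ("texture", True), ("texture", False),
]

per_nuisance = accuracy_by_nuisance(toy_records)
drops = accuracy_drop(per_nuisance)
```

Reporting one drop per nuisance (pose, shape, texture, context, weather) is what makes it possible to see that some factors hurt far more than others, which an aggregate OOD accuracy would hide.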


