Shortcut Removal for Improved OOD-Generalization

11/24/2022
by Nicolas M. Müller, et al.

Machine learning is a data-driven discipline, and learning success depends largely on the quality of the underlying datasets. However, it is becoming increasingly clear that even high performance on held-out test data does not necessarily mean that a model generalizes or learns anything meaningful at all. One reason for this is the presence of machine learning shortcuts, i.e., hints in the data that are predictive but accidental and semantically unconnected to the problem. We present a new approach to detect such shortcuts and a technique to automatically remove them from datasets. Using an adversarially trained lens, any small and highly predictive clues in images can be detected and removed. We show that this approach 1) does not degrade model performance in the absence of these shortcuts, and 2) reliably identifies and neutralizes shortcuts from different image datasets. In our experiments, we are able to recover up to 93.8% of model performance in the presence of different shortcuts. Finally, we apply our model to a real-world dataset from the medical domain consisting of chest x-rays and identify and remove several types of shortcuts that are known to hinder real-world applicability. Thus, we hope that our proposed approach fosters real-world applicability of machine learning.


