On Image Classification: Correlation v.s. Causality

08/22/2017
by   Zheyan Shen, et al.

Image classification is one of the fundamental problems in computer vision. Owing to the availability of large image datasets like ImageNet and YFCC100M, a plethora of research has been conducted on high-precision image classification, and many remarkable achievements have been made. The success of most existing methods hinges on the basic hypothesis that the testing image set has the same distribution as the training image set. However, in many real applications, we cannot guarantee the validity of this i.i.d. hypothesis, since the testing image set is unseen. It is thus desirable to learn an image classifier that performs well even in non-i.i.d. situations. In this paper, we propose a novel Causally Regularized Logistic Regression (CRLR) algorithm that addresses the non-i.i.d. problem, without any knowledge of the testing data, by searching for causal features. Causal features are the characteristics that truly determine whether a particular object belongs to a category or not. Algorithmically, we propose a causal regularizer for causal feature identification and optimize it jointly with a logistic loss term. With the causal regularizer, we can estimate the causal contribution (causal effect) of each focal image feature (viewed as a treatment variable) by sample reweighting, which ensures that the distributions of all remaining image features are close between images with different levels of the focal feature. The resulting classifier is based on the estimated causal contributions of the features, rather than on traditional correlation-based contributions. To validate the effectiveness of our CRLR algorithm, we manually construct a new image dataset from YFCC100M that simulates various non-i.i.d. situations in the real world, and conduct extensive experiments on image classification. Experimental results clearly demonstrate that our CRLR algorithm outperforms the state-of-the-art methods.
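The reweighting idea in the abstract can be illustrated with a small sketch. This is not the paper's implementation: it assumes binary features, measures balance as the squared difference between weighted feature means, and adds the two terms with a hypothetical trade-off parameter `lam`. The function names and the toy data are likewise assumptions made for illustration.

```python
import math

def weighted_mean(values, weights):
    """Weighted average of `values`; the weights must not sum to zero."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

def balancing_loss(X, w):
    """For each focal (treatment) feature j, compare the weighted means of
    every other feature between samples with X[i][j] == 1 and X[i][j] == 0,
    and sum the squared differences. Zero means all features are balanced."""
    n, p = len(X), len(X[0])
    loss = 0.0
    for j in range(p):
        w_treated = [w[i] * X[i][j] for i in range(n)]
        w_control = [w[i] * (1 - X[i][j]) for i in range(n)]
        if sum(w_treated) == 0 or sum(w_control) == 0:
            continue  # one group is empty under these weights; skip feature j
        for k in range(p):
            if k == j:
                continue
            col = [X[i][k] for i in range(n)]
            diff = weighted_mean(col, w_treated) - weighted_mean(col, w_control)
            loss += diff ** 2
    return loss

def weighted_logistic_loss(X, y, w, beta):
    """Sample-weighted logistic loss for labels y in {0, 1}."""
    loss = 0.0
    for xi, yi, wi in zip(X, y, w):
        z = sum(b * v for b, v in zip(beta, xi))
        m = -(2 * yi - 1) * z
        # Numerically stable log(1 + exp(m))
        loss += wi * (max(m, 0.0) + math.log1p(math.exp(-abs(m))))
    return loss

def joint_objective(X, y, w, beta, lam=1.0):
    """Assumed combination: weighted logistic loss plus lam * balancing loss."""
    return weighted_logistic_loss(X, y, w, beta) + lam * balancing_loss(X, w)

# Toy data (made up): two binary features that tend to co-occur.
X = [[1, 1], [1, 1], [1, 0], [0, 1], [0, 0], [0, 0]]
y = [1, 1, 1, 0, 0, 0]
uniform = [1.0] * 6
balanced = [0.5, 0.5, 1.0, 1.0, 0.5, 0.5]  # hand-picked to equalize the groups

print(balancing_loss(X, uniform))   # positive: the features are confounded
print(balancing_loss(X, balanced))  # zero: each feature is balanced across groups
```

In a full method the sample weights and the regression coefficients would be optimized jointly rather than chosen by hand. The sketch only shows how a set of weights can remove the association between a focal feature and the remaining ones.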


research
10/20/2022

Hypothesis Testing using Causal and Causal Variational Generative Models

Hypothesis testing and the usage of expert knowledge, or causal priors, ...
research
06/07/2019

NICO: A Dataset Towards Non-I.I.D. Image Classification

The I.I.D. hypothesis between training data and testing data is the basi...
research
10/22/2021

The Causal Loss: Driving Correlation to Imply Causation

Most algorithms in classical and contemporary machine learning focus on ...
research
03/22/2022

Out-of-distribution Generalization with Causal Invariant Transformations

In real-world applications, it is important and desirable to learn a mod...
research
11/07/2021

Positivity Validation Detection and Explainability via Zero Fraction Multi-Hypothesis Testing and Asymmetrically Pruned Decision Trees

Positivity is one of the three conditions for causal inference from obse...
research
12/17/2015

Unsupervised Feature Construction for Improving Data Representation and Semantics

Feature-based format is the main data representation format used by mach...
research
04/26/2022

Causal Transportability for Visual Recognition

Visual representations underlie object recognition tasks, but they often...
