Counterfactual Maximum Likelihood Estimation for Training Deep Networks

06/07/2021
by Xinyi Wang, et al.

Although deep learning models have driven state-of-the-art performance on a wide array of tasks, they are prone to learning spurious correlations that should not serve as predictive cues. To mitigate this problem, we propose a causality-based training framework that reduces the spurious correlations caused by observable confounders. We give a theoretical analysis of the underlying general Structural Causal Model (SCM) and propose to perform Maximum Likelihood Estimation (MLE) on the interventional distribution instead of the observational distribution, an approach we call Counterfactual Maximum Likelihood Estimation (CMLE). Because the interventional distribution is, in general, hidden from the observational data, we derive two different upper bounds on the expected negative log-likelihood and propose two general algorithms, Implicit CMLE and Explicit CMLE, for causal prediction with deep learning models using observational data. We conduct experiments on two real-world tasks: Natural Language Inference (NLI) and Image Captioning. The results show that CMLE methods outperform regular MLE in out-of-domain generalization and in reducing spurious correlations, while maintaining comparable performance on the regular evaluations.
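To make the distinction between the observational and the interventional distribution concrete, here is a minimal sketch on a toy discrete model with one observable binary confounder. It is not the paper's Implicit or Explicit CMLE algorithm. It uses the standard backdoor adjustment, and the variable names (`Z` for the confounder, `T` for the input feature, `Y` for the label) and all probabilities are made-up assumptions for illustration.

```python
# Toy structural causal model (hypothetical numbers, for illustration only):
#   Z -> T and Z -> Y, but T has NO causal effect on Y.
P_Z = {0: 0.5, 1: 0.5}                 # P(Z = z)
P_T1_GIVEN_Z = {0: 0.1, 1: 0.9}        # P(T = 1 | Z = z)
P_Y1_GIVEN_TZ = {                      # P(Y = 1 | T = t, Z = z); depends on z only
    (0, 0): 0.2, (1, 0): 0.2,
    (0, 1): 0.8, (1, 1): 0.8,
}


def p_t_given_z(t, z):
    """P(T = t | Z = z) for binary t."""
    p1 = P_T1_GIVEN_Z[z]
    return p1 if t == 1 else 1.0 - p1


def observational(t):
    """P(Y = 1 | T = t): weights each z by the posterior P(z | t)."""
    joint = {z: p_t_given_z(t, z) * P_Z[z] for z in P_Z}
    norm = sum(joint.values())
    return sum(P_Y1_GIVEN_TZ[(t, z)] * joint[z] / norm for z in P_Z)


def interventional(t):
    """P(Y = 1 | do(T = t)) via backdoor adjustment: weights each z by P(z)."""
    return sum(P_Y1_GIVEN_TZ[(t, z)] * P_Z[z] for z in P_Z)


if __name__ == "__main__":
    for t in (0, 1):
        print(f"t={t}: observational={observational(t):.2f}, "
              f"interventional={interventional(t):.2f}")
```

In this toy setup the observational conditional differs by 0.48 between `T = 1` and `T = 0`, even though `T` does not influence `Y` at all, while the interventional quantity is 0.5 for both values. A model fit by MLE on observational data could pick up that spurious gap, which is the kind of effect that training against the interventional distribution is meant to reduce.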

