
Least informative distributions in maximum q-log-likelihood estimation
We use maximum q-log-likelihood estimation for least informative dis...

Efficiency of maximum likelihood estimation for a multinomial distribution with known probability sums
For a multinomial distribution, suppose that we have prior knowledge of ...

Finite-sample risk bounds for maximum likelihood estimation with arbitrary penalties
The MDL two-part coding index of resolvability provides a finite-sampl...

Some computational aspects of maximum likelihood estimation of the skew-t distribution
Since its introduction, the skew-t distribution has received much attent...

Learning Causal Models Online
Predictive models – learned from observational data not covering the com...

Softmax Q-Distribution Estimation for Structured Prediction: A Theoretical Interpretation for RAML
Reward augmented maximum likelihood (RAML), a simple and effective learn...

Implicit MLE: Backpropagating Through Discrete Exponential Family Distributions
Integrating discrete probability distributions and combinatorial optimiz...
Counterfactual Maximum Likelihood Estimation for Training Deep Networks
Although deep learning models have driven state-of-the-art performance on a wide array of tasks, they are prone to learning spurious correlations that should not be learned as predictive clues. To mitigate this problem, we propose a causality-based training framework to reduce the spurious correlations caused by observable confounders. We give theoretical analysis on the underlying general Structural Causal Model (SCM) and propose to perform Maximum Likelihood Estimation (MLE) on the interventional distribution instead of the observational distribution, namely Counterfactual Maximum Likelihood Estimation (CMLE). As the interventional distribution, in general, is hidden from the observational data, we then derive two different upper bounds of the expected negative log-likelihood and propose two general algorithms, Implicit CMLE and Explicit CMLE, for causal predictions of deep learning models using observational data. We conduct experiments on two real-world tasks: Natural Language Inference (NLI) and Image Captioning. The results show that CMLE methods outperform the regular MLE method in terms of out-of-domain generalization performance and reducing spurious correlations, while maintaining comparable performance on the regular evaluations.
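For orientation, here is a minimal sketch of the ordinary observational-MLE baseline that CMLE modifies: fitting a Bernoulli parameter by minimizing the average negative log-likelihood over a parameter grid. The toy data and grid are illustrative assumptions, not from the paper; CMLE itself would instead minimize an upper bound on the expected negative log-likelihood under the interventional distribution.

```python
import math

def avg_neg_log_likelihood(p, data):
    """Average negative log-likelihood of i.i.d. Bernoulli(p) observations."""
    return -sum(x * math.log(p) + (1 - x) * math.log(1 - p) for x in data) / len(data)

# Toy observational data: binary outcomes (3 successes, 1 failure).
data = [1, 1, 1, 0]

# Grid search over candidate parameters; for a Bernoulli model,
# the MLE is the sample mean, so the grid minimum lands there.
candidates = [i / 100 for i in range(1, 100)]
p_hat = min(candidates, key=lambda p: avg_neg_log_likelihood(p, data))
print(p_hat)  # 0.75, the sample mean
```

Regular MLE of this kind can absorb any correlation present in the observational data, including spurious ones introduced by confounders; the paper's contribution is to target the interventional distribution so that such correlations are discounted.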