Log In Sign Up

On the benefits of maximum likelihood estimation for Regression and Forecasting

by   Pranjal Awasthi, et al.

We advocate for a practical Maximum Likelihood Estimation (MLE) approach for regression and forecasting, as an alternative to the typical approach of Empirical Risk Minimization (ERM) for a specific target metric. This approach is better suited to capture inductive biases such as prior domain knowledge in datasets, and can output post-hoc estimators at inference time that can optimize different types of target metrics. We present theoretical results to demonstrate that our approach is always competitive with any estimator for the target metric under some general conditions, and in many practical settings (such as Poisson Regression) can actually be much superior to ERM. We demonstrate empirically that our method instantiated with a well-designed general purpose mixture likelihood family can obtain superior performance over ERM for a variety of tasks across time-series forecasting and regression datasets with different data distributions.


page 1

page 2

page 3

page 4


Stochastic Maximum Likelihood Optimization via Hypernetworks

This work explores maximum likelihood optimization of neural networks th...

Maximum Likelihood Estimation in Gaussian Process Regression is Ill-Posed

Gaussian process regression underpins countless academic and industrial ...

Exploiting Evidence in Probabilistic Inference

We define the notion of compiling a Bayesian network with evidence and p...

Practical Coreset Constructions for Machine Learning

We investigate coresets - succinct, small summaries of large data sets -...

Some New Results for Poisson Binomial Models

We consider a problem of ecological inference, in which individual-level...

Quantifying and Reducing Bias in Maximum Likelihood Estimation of Structured Anomalies

Anomaly estimation, or the problem of finding a subset of a dataset that...

Regression-based Network Reconstruction with Nodal and Dyadic Covariates and Random Effects

Network (or matrix) reconstruction is a general problem which occurs if ...