
Quantifying and Reducing Bias in Maximum Likelihood Estimation of Structured Anomalies
Anomaly estimation, or the problem of finding a subset of a dataset that differs from the rest of the dataset, is a classic problem in machine learning and data mining. In both theoretical work and in applications, the anomaly is assumed to have a specific structure defined by membership in an anomaly family. For example, in temporal data the anomaly family may be time intervals, while in network data the anomaly family may be connected subgraphs. The most prominent approach for anomaly estimation is to compute the Maximum Likelihood Estimator (MLE) of the anomaly. However, it was recently observed that for some anomaly families, the MLE is an asymptotically biased estimator of the anomaly. Here, we demonstrate that the bias of the MLE depends on the size of the anomaly family. We prove that if the number of sets in the anomaly family that contain the anomaly is subexponential, then the MLE is asymptotically unbiased. At the same time, we provide empirical evidence that the converse is also true: if the number of such sets is exponential, then the MLE is asymptotically biased. Our analysis unifies a number of earlier results on the bias of the MLE for specific anomaly families, including intervals, submatrices, and connected subgraphs. Next, we derive a new anomaly estimator using a mixture model, and we empirically demonstrate that our estimator is asymptotically unbiased regardless of the size of the anomaly family. We illustrate the benefits of our estimator on both simulated disease outbreak data and a real-world highway traffic dataset.
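The dependence of the bias on the size of the anomaly family can be illustrated with a small simulation. The sketch below is not the paper's code. It assumes a Gaussian mean-shift model (observations are N(mu, 1) inside the anomaly and N(0, 1) outside), under which the MLE over a family maximizes the normalized sum of the observations in a candidate set. It compares two families: all subsets (exponentially many sets contain the anomaly) and intervals (only polynomially many do). The function names, the parameter values, and the use of estimated anomaly size as a rough proxy for bias are all illustrative assumptions.

```python
import numpy as np


def mle_all_subsets(x):
    """MLE over the family of all subsets (assumed Gaussian mean-shift model).

    For a fixed size k the best subset is the k largest values, so it is
    enough to scan the sorted prefix sums and maximize sum / sqrt(k).
    """
    order = np.argsort(x)[::-1]              # indices sorted by decreasing value
    prefix_sums = np.cumsum(x[order])
    sizes = np.arange(1, len(x) + 1)
    scores = prefix_sums / np.sqrt(sizes)
    best = int(np.argmax(scores))
    return set(order[: best + 1].tolist())


def mle_intervals(x):
    """MLE over the family of contiguous intervals, by brute-force search."""
    n = len(x)
    prefix = np.concatenate(([0.0], np.cumsum(x)))
    best_score, best = -np.inf, (0, 1)
    for i in range(n):
        for j in range(i + 1, n + 1):
            score = (prefix[j] - prefix[i]) / np.sqrt(j - i)
            if score > best_score:
                best_score, best = score, (i, j)
    return set(range(best[0], best[1]))


def simulate(n=200, frac=0.05, mu=3.0, trials=20, seed=0):
    """Average estimated anomaly size under each family (hypothetical setup)."""
    rng = np.random.default_rng(seed)
    k = int(frac * n)                        # true anomaly: the first k indices
    sizes_all, sizes_int = [], []
    for _ in range(trials):
        x = rng.standard_normal(n)
        x[:k] += mu                          # shift the mean inside the anomaly
        sizes_all.append(len(mle_all_subsets(x)))
        sizes_int.append(len(mle_intervals(x)))
    return k, float(np.mean(sizes_all)), float(np.mean(sizes_int))


if __name__ == "__main__":
    k, avg_all, avg_int = simulate()
    print(f"true size: {k}")
    print(f"mean MLE size, all subsets: {avg_all:.1f}")
    print(f"mean MLE size, intervals:   {avg_int:.1f}")
```

In this toy setting the all-subsets MLE tends to absorb large noise values and so overestimates the anomaly size, while the interval MLE stays close to the true size. That is consistent with the claim in the abstract, but a simulation this small is only suggestive.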