An Automatic Finite-Sample Robustness Metric: Can Dropping a Little Data Change Conclusions?

11/30/2020
by   Tamara Broderick, et al.

We propose a method to assess the sensitivity of econometric analyses to the removal of a small fraction of the sample. Analyzing all possible data subsets of a certain size is computationally prohibitive, so we provide a finite-sample metric to approximately compute the number (or fraction) of observations that has the greatest influence on a given result when dropped. We call our resulting metric the Approximate Maximum Influence Perturbation. Our approximation is automatically computable and works for common estimators (including OLS, IV, GMM, MLE, and variational Bayes). We provide explicit finite-sample error bounds on our approximation for linear and instrumental variables regressions. At minimal computational cost, our metric provides an exact finite-sample lower bound on sensitivity for any estimator, so any non-robustness our metric finds is conclusive. We demonstrate that the Approximate Maximum Influence Perturbation is driven by a low signal-to-noise ratio in the inference problem, is not reflected in standard errors, does not disappear asymptotically, and is not a product of misspecification. Several empirical applications show that even 2-parameter linear regression analyses of randomized trials can be highly sensitive. While we find some applications are robust, in others the sign of a treatment effect can be changed by dropping less than 1
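
As a rough illustration of the idea described in the abstract, the sketch below applies a first-order (influence-based) approximation to the slope of a 2-parameter linear regression: it scores each observation by the derivative of the slope with respect to that observation's weight, drops the highest-scoring fraction, and then refits to see the actual change. This is a minimal sketch under stated assumptions, not the authors' implementation. The function names `fit_line` and `approx_max_slope_decrease`, the parameter `alpha`, and the simulated data are all hypothetical and chosen only for illustration.

```python
import random


def fit_line(xs, ys):
    """Least-squares intercept and slope for y = a + b * x."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return ybar - slope * xbar, slope


def approx_max_slope_decrease(xs, ys, alpha):
    """Pick at most floor(alpha * n) points whose removal is predicted,
    to first order, to decrease the slope the most (hypothetical helper)."""
    n = len(xs)
    intercept, slope = fit_line(xs, ys)
    xbar = sum(xs) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    # Influence score: derivative of the slope with respect to the weight
    # of observation i, evaluated with all weights equal to 1.
    infl = [(x - xbar) * (y - intercept - slope * x) / sxx
            for x, y in zip(xs, ys)]
    m = int(alpha * n)
    ranked = sorted(range(n), key=lambda i: infl[i], reverse=True)[:m]
    # Dropping point i changes the slope by roughly -infl[i], so only
    # points with positive influence help push the slope down.
    drop = [i for i in ranked if infl[i] > 0]
    approx = slope - sum(infl[i] for i in drop)
    # Refit without the selected points to get the actual change.
    dropped = set(drop)
    kept_x = [x for i, x in enumerate(xs) if i not in dropped]
    kept_y = [y for i, y in enumerate(ys) if i not in dropped]
    refit = fit_line(kept_x, kept_y)[1]
    return slope, approx, refit, drop


# Simulated data with a weak signal relative to the noise (an assumption).
rng = random.Random(0)
xs = [rng.gauss(0.0, 1.0) for _ in range(200)]
ys = [0.1 * x + rng.gauss(0.0, 1.0) for x in xs]
slope, approx, refit, drop = approx_max_slope_decrease(xs, ys, alpha=0.05)
print("full-sample slope:", slope)
print("approximate slope after dropping:", approx)
print("refit slope after dropping:", refit)
```

Because the refit uses a concrete subset of the data, the change it shows is one that dropping that many points really does produce, so it bounds the worst-case change from below. That matches the abstract's point that any non-robustness found this way is conclusive.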

research
09/01/2020

Finite sample breakdown point of multivariate regression depth median

Depth induced multivariate medians (multi-dimensional maximum depth esti...
research
09/29/2022

How good is your Gaussian approximation of the posterior? Finite-sample computable error bounds for a variety of useful divergences

The Bayesian Central Limit Theorem (BCLT) for finite-dimensional models,...
research
03/18/2022

Finite-sample analysis of identification of switched linear systems with arbitrary or restricted switching

This work aims to derive a data-independent finite-sample error bound fo...
research
10/29/2018

Complier stochastic direct effects: identification and robust estimation

Mediation analysis is critical to understanding the mechanisms underlyin...
research
06/29/2021

Bounds for the chi-square approximation of the power divergence family of statistics

It is well-known that each statistic in the family of power divergence o...
research
05/25/2023

Finite sample rates for logistic regression with small noise or few samples

The logistic regression estimator is known to inflate the magnitude of i...
research
05/28/2022

Provably Auditing Ordinary Least Squares in Low Dimensions

Measuring the stability of conclusions derived from Ordinary Least Squar...
