On Binscatter

02/25/2019
by Matias D. Cattaneo, et al.

Binscatter is very popular in applied microeconomics. It provides a flexible yet parsimonious way of visualizing and summarizing large data sets in regression settings, and it is often used for informal evaluation of substantive hypotheses such as linearity or monotonicity of the regression function. This paper presents a foundational, thorough analysis of binscatter: we give an array of theoretical and practical results that aid both in understanding current practices (i.e., their validity or lack thereof) and in offering theory-based guidance for future applications. Our main results include principled selection of the number of bins, confidence intervals and bands, hypothesis tests for parametric and shape restrictions of the regression function, and several other new methods, applicable to canonical binscatter as well as to its higher-order polynomial, covariate-adjusted, and smoothness-restricted extensions. In particular, we highlight important methodological problems related to covariate adjustment methods used in current practice. We also discuss extensions to clustered data. Our results are illustrated with simulated and real data throughout. Companion general-purpose software packages for Stata and R are provided. Finally, from a technical perspective, new theoretical results for partitioning-based series estimation are obtained that may be of independent interest.
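For readers unfamiliar with the technique, canonical binscatter can be sketched as follows: split the covariate into quantile-spaced bins and plot the mean of the outcome within each bin against the mean of the covariate. The Python snippet below is only a minimal illustration under that assumption. It is not the companion Stata/R software, the function name `binscatter` and the parameter `n_bins` are hypothetical, and the number of bins is fixed by hand rather than chosen by the data-driven rule the paper develops.

```python
import numpy as np


def binscatter(x, y, n_bins=10):
    """Illustrative canonical binscatter: within-bin means on quantile-spaced bins.

    Hypothetical helper for exposition only; it assumes x has few ties, so
    that every bin contains at least one observation.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # Bin edges at the empirical quantiles of x (n_bins + 1 edges in total).
    edges = np.quantile(x, np.linspace(0.0, 1.0, n_bins + 1))

    # Map each observation to a bin index in 0..n_bins-1 using the interior edges.
    idx = np.searchsorted(edges[1:-1], x, side="right")

    # Within-bin sample sizes and means of x and y.
    counts = np.bincount(idx, minlength=n_bins)
    x_means = np.bincount(idx, weights=x, minlength=n_bins) / counts
    y_means = np.bincount(idx, weights=y, minlength=n_bins) / counts
    return x_means, y_means
```

Plotting `y_means` against `x_means` gives the familiar binned scatter plot. The sketch omits the confidence bands, hypothesis tests, and covariate adjustment that the paper analyzes.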

research
02/25/2019

Binscatter Regressions

We introduce the Stata (and R) package Binsreg, which implements the bin...
research
05/08/2023

Isotonic subgroup selection

Given a sample of covariate-response pairs, we consider the subgroup sel...
research
09/17/2017

Nonparametric Shape-restricted Regression

We consider the problem of nonparametric regression under shape constrai...
research
09/11/2018

Regression Discontinuity Designs Using Covariates

We study regression discontinuity designs when covariates are included i...
research
10/15/2021

Covariate Adjustment in Regression Discontinuity Designs

The Regression Discontinuity (RD) design is a widely used non-experiment...
research
01/09/2023

Optimal Subsampling Design for Polynomial Regression in one Covariate

Improvements in technology lead to increasing availability of large data...
research
06/01/2017

Data Analysis in Multimedia Quality Assessment: Revisiting the Statistical Tests

Assessment of multimedia quality relies heavily on subjective assessment...
