Blocked Clusterwise Regression

01/29/2020
by Max Cytrynbaum, et al.

A recent literature in econometrics models unobserved cross-sectional heterogeneity in panel data by assigning each cross-sectional unit a one-dimensional, discrete latent type. Such models have been shown to allow estimation and inference by regression clustering methods. This paper is motivated by the finding that the clustered heterogeneity models studied in this literature can be badly misspecified, even when the panel has significant discrete cross-sectional structure. To address this issue, we generalize previous approaches to discrete unobserved heterogeneity by allowing each unit to have multiple, imperfectly correlated latent variables that describe its response type to different covariates. We give inference results for a k-means style estimator of our model and develop information criteria to jointly select the number of clusters for each latent variable. Monte Carlo simulations confirm our theoretical results and give intuition about the finite-sample performance of estimation and model selection. We also contribute to the theory of clustering with an over-specified number of clusters and derive new convergence rates for this setting. Our results suggest that over-fitting can be severe in k-means style estimators when the number of clusters is over-specified.
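
To make the phrase "k-means style estimator" concrete, here is a minimal sketch of the baseline setting the abstract starts from: each unit has a single discrete latent type, and estimation alternates between refitting a regression within each cluster and reassigning each unit to the cluster whose coefficients fit it best. This is a generic Lloyd-style illustration, not the paper's blocked estimator, which allows several latent variables per unit. The function name `clusterwise_regression`, its arguments, and the random initialization are assumptions made for this example.

```python
import numpy as np

def clusterwise_regression(y, X, n_clusters, n_iter=50, seed=0):
    """Alternating (k-means style) clusterwise regression sketch.

    y: array of shape (N, T), outcomes for N units over T periods.
    X: array of shape (N, T, p), covariates.
    Returns (labels, betas) with shapes (N,) and (n_clusters, p).
    """
    rng = np.random.default_rng(seed)
    N, T, p = X.shape
    labels = rng.integers(n_clusters, size=N)  # random initial types
    betas = np.zeros((n_clusters, p))
    for _ in range(n_iter):
        # Update step: pooled least squares within each current cluster.
        for g in range(n_clusters):
            members = labels == g
            if not members.any():
                # Empty cluster: refit it on one randomly chosen unit.
                members = np.zeros(N, dtype=bool)
                members[rng.integers(N)] = True
            Xg = X[members].reshape(-1, p)
            yg = y[members].reshape(-1)
            betas[g] = np.linalg.lstsq(Xg, yg, rcond=None)[0]
        # Assignment step: each unit moves to the cluster with the
        # smallest sum of squared residuals over its T observations.
        fitted = np.einsum("ntp,gp->ngt", X, betas)      # (N, G, T)
        ssr = ((y[:, None, :] - fitted) ** 2).sum(axis=2)  # (N, G)
        new_labels = ssr.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments are stable, so the iteration has converged
        labels = new_labels
    return labels, betas
```

Like k-means itself, this iteration only finds a local optimum, so in practice it would typically be run from several starting values. Setting `n_clusters` above the true number of types is the over-specified case whose over-fitting behavior the abstract discusses.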


research
09/14/2020

Spatial Differencing for Sample Selection Models with Unobserved Heterogeneity

This paper derives identification, estimation, and inference results usi...
research
07/14/2021

Bayesian Lifetime Regression with Multi-type Group-shared Latent Heterogeneity

Products manufactured from the same batch or utilized in the same region...
research
08/04/2019

Estimating Unobserved Individual Heterogeneity Using Pairwise Comparisons

We propose a new method for studying environments with unobserved indivi...
research
01/11/2021

A Degradation Performance Model With Mixed-type Covariates and Latent Heterogeneity

Successful modeling of degradation performance data is essential for acc...
research
12/21/2021

Shared Frailty Models Based on Cancer Data

Traditional survival analysis techniques focus on the occurrence of fail...
research
01/14/2020

Nonparametric regression for multiple heterogeneous networks

We study nonparametric methods for the setting where multiple distinct n...
research
06/09/2023

An introduction and tutorial to model-based clustering in education via Gaussian mixture modelling

Heterogeneity has been a hot topic in recent educational literature. Sev...
