Adaptive Discrete Smoothing for High-Dimensional and Nonlinear Panel Data

12/30/2019
by   Xi Chen, et al.

In this paper we develop a data-driven smoothing technique for high-dimensional and nonlinear panel data models. We allow for individual-specific (nonlinear) functions, each estimated with econometric or machine learning methods using weighted observations from other individuals. The weights are determined in a data-driven way: they depend on the similarity between the corresponding functions, measured from initial estimates. The key feature of this procedure is that it clusters individuals based on the distance or similarity between them, estimated in a first stage. Our estimation method can be combined with various statistical estimation procedures, in particular modern machine learning methods, which are especially fruitful in the high-dimensional case and with complex, heterogeneous data. The approach can be interpreted as “soft clustering”, in contrast to traditional “hard clustering”, which assigns each individual to exactly one group. We conduct a simulation study which shows that prediction can be greatly improved by using our estimator. Finally, we apply the method to a large data set from didichuxing.com, a leading company in the transportation industry, to analyze and predict the gap between supply and demand based on a large set of covariates. Our estimator clearly outperforms existing linear panel data estimators in out-of-sample prediction.
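The two-stage idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a linear model per individual as a stand-in for the individual-specific functions, a Gaussian kernel on the Euclidean distance between first-stage coefficients as the similarity measure, and a hand-picked `bandwidth`. The function names `initial_estimates`, `similarity_weights`, and `smoothed_estimates` are hypothetical and chosen only for this illustration.

```python
import numpy as np


def initial_estimates(X, y):
    """Stage 1: separate least-squares fit per individual.

    X has shape (n, T, p) and y has shape (n, T); returns (n, p).
    A linear model is an assumption made for this sketch only.
    """
    return np.stack(
        [np.linalg.lstsq(X[i], y[i], rcond=None)[0] for i in range(X.shape[0])]
    )


def similarity_weights(beta, bandwidth):
    """Turn pairwise distances between initial estimates into weights.

    Uses a Gaussian kernel (an assumed choice); each row sums to one.
    """
    d2 = ((beta[:, None, :] - beta[None, :, :]) ** 2).sum(axis=-1)
    w = np.exp(-d2 / (2.0 * bandwidth**2))
    return w / w.sum(axis=1, keepdims=True)


def smoothed_estimates(X, y, weights):
    """Stage 2: weighted least squares for each individual.

    Individual i is fitted on everyone's observations, where all
    observations of individual j carry the weight weights[i, j].
    """
    n, T, p = X.shape
    X_all = X.reshape(n * T, p)
    y_all = y.reshape(n * T)
    out = np.empty((n, p))
    for i in range(n):
        # Square-root weights turn weighted LS into ordinary LS.
        sw = np.sqrt(np.repeat(weights[i], T))
        out[i] = np.linalg.lstsq(X_all * sw[:, None], y_all * sw, rcond=None)[0]
    return out


# Small simulated panel with two latent groups (illustrative data only).
rng = np.random.default_rng(0)
n, T, p = 20, 20, 2
group_a, group_b = np.array([1.0, -1.0]), np.array([-3.0, 3.0])
truth = np.where(np.arange(n)[:, None] < n // 2, group_a, group_b)
X = rng.normal(size=(n, T, p))
y = np.einsum("itp,ip->it", X, truth) + rng.normal(size=(n, T))

beta_init = initial_estimates(X, y)
W = similarity_weights(beta_init, bandwidth=1.0)
beta_smooth = smoothed_estimates(X, y, W)
```

Because every individual receives a full row of nonnegative weights rather than a single group label, the sketch behaves like soft clustering: individuals with similar first-stage estimates contribute heavily, dissimilar ones contribute almost nothing. The second-stage learner could in principle be replaced by any estimator that accepts observation weights.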


research
09/07/2020

Doubly Robust Semiparametric Difference-in-Differences Estimators with High-Dimensional Data

This paper proposes a doubly robust two-stage semiparametric difference-...
research
12/01/2020

Evaluating (weighted) dynamic treatment effects by double machine learning

We consider evaluating the causal effects of dynamic treatments, i.e. of...
research
09/23/2021

Joint Estimation and Inference for Multi-Experiment Networks of High-Dimensional Point Processes

Modern high-dimensional point process data, especially those from neuros...
research
06/14/2018

Data-Driven Analytics for Benchmarking and Optimizing Retail Store Performance

Growing competitiveness and increasing availability of data is generatin...
research
03/06/2018

Invariant Smoothing on Lie Groups

In this paper we propose a (non-linear) smoothing algorithm for group-af...
research
07/28/2020

Collective Spectral Density Estimation and Clustering for Spatially-Correlated Data

In this paper, we develop a method for estimating and clustering two-dim...
research
01/21/2011

A fast and recursive algorithm for clustering large datasets with k-medians

Clustering with fast algorithms large samples of high dimensional data i...
