Simultaneous semi-parametric estimation of clustering and regression

12/28/2020
by Matthieu Marbac, et al.

We investigate parameter estimation for regression models with fixed group effects when the group variable is missing but group-related variables are available. This problem involves clustering, to infer the missing group variable from the group-related variables, and regression, to model the target variable given the group and possibly additional variables. It can therefore be formulated as modeling the joint distribution of the target and the group-related variables. The usual estimation strategy for this joint model is a two-step approach: first learn the group variable (clustering step), then plug its estimate into the regression model fit (regression step). However, this approach is suboptimal, and in particular yields biased regression estimates, because it does not use the target variable for clustering. We therefore argue for simultaneous estimation of the clustering and the regression in a semi-parametric framework. Numerical experiments illustrate the benefits of our proposal across a wide range of distributions and regression models. The relevance of the new method is illustrated on real data concerning high blood pressure prevention.
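
The bias of the two-step approach can be seen in a small simulation. The sketch below is only an illustration of that baseline and does not implement the simultaneous semi-parametric estimator proposed in the paper. Everything in it is an assumption made for the example: two groups, one Gaussian group-related variable `x`, a target `y` that depends only on the group, a plain one-dimensional 2-means clustering step, and the hypothetical helper names `simulate`, `two_means_1d` and `group_effects`.

```python
import numpy as np


def simulate(n=5000, seed=0):
    """Simulate a latent group z, a group-related variable x and a target y.

    Assumed toy model (not taken from the paper):
      z ~ Uniform{0, 1},  x | z ~ N(2z - 1, 1),  y | z ~ N(2z, 0.5^2).
    """
    rng = np.random.default_rng(seed)
    z = rng.integers(0, 2, size=n)
    x = rng.normal(loc=2.0 * z - 1.0, scale=1.0)
    y = 2.0 * z + rng.normal(scale=0.5, size=n)
    return z, x, y


def two_means_1d(x, n_iter=50):
    """Clustering step: Lloyd iterations for 2-means on a 1-D variable."""
    centers = np.array([x.min(), x.max()])  # cluster 0 starts as the lower one
    for _ in range(n_iter):
        # Assign each point to its nearest center.
        labels = (np.abs(x - centers[1]) < np.abs(x - centers[0])).astype(int)
        # Recompute each center as the mean of its assigned points.
        centers = np.array([x[labels == 0].mean(), x[labels == 1].mean()])
    return labels


def group_effects(y, labels):
    """Regression step: with only a group factor, OLS reduces to group means."""
    return np.array([y[labels == 0].mean(), y[labels == 1].mean()])


z, x, y = simulate()
z_hat = two_means_1d(x)              # step 1: cluster on x only, ignoring y

oracle = group_effects(y, z)         # uses the true (normally missing) groups
plug_in = group_effects(y, z_hat)    # step 2: plug in the estimated groups

oracle_gap = oracle[1] - oracle[0]
plug_in_gap = plug_in[1] - plug_in[0]
print(f"true gap between group effects: 2.0")
print(f"oracle estimate of the gap:     {oracle_gap:.3f}")
print(f"two-step estimate of the gap:   {plug_in_gap:.3f}")
```

Because the clusters overlap in `x`, some observations are assigned to the wrong group, and the plug-in group effects are pulled toward each other. In this toy setting the target `y` separates the groups better than `x` does, which is the information a simultaneous approach can exploit and the two-step approach discards.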
