DeepAI AI Chat
Log In Sign Up

Clustered regression with unknown clusters

03/23/2011
by   Kishor Barman, et al.
T.I.F.R.
0

We consider a collection of prediction experiments, which are clustered in the sense that groups of experiments ex- hibit similar relationship between the predictor and response variables. The experiment clusters as well as the regres- sion relationships are unknown. The regression relation- ships define the experiment clusters, and in general, the predictor and response variables may not exhibit any clus- tering. We call this prediction problem clustered regres- sion with unknown clusters (CRUC) and in this paper we focus on linear regression. We study and compare several methods for CRUC, demonstrate their applicability to the Yahoo Learning-to-rank Challenge (YLRC) dataset, and in- vestigate an associated mathematical model. CRUC is at the crossroads of many prior works and we study several prediction algorithms with diverse origins: an adaptation of the expectation-maximization algorithm, an approach in- spired by K-means clustering, the singular value threshold- ing approach to matrix rank minimization under quadratic constraints, an adaptation of the Curds and Whey method in multiple regression, and a local regression (LoR) scheme reminiscent of neighborhood methods in collaborative filter- ing. Based on empirical evaluation on the YLRC dataset as well as simulated data, we identify the LoR method as a good practical choice: it yields best or near-best prediction performance at a reasonable computational load, and it is less sensitive to the choice of the algorithm parameter. We also provide some analysis of the LoR method for an asso- ciated mathematical model, which sheds light on optimal parameter choice and prediction performance.

READ FULL TEXT

page 1

page 2

page 3

page 4

02/02/2022

VC-PCR: A Prediction Method based on Supervised Variable Selection and Clustering

Sparse linear prediction methods suffer from decreased prediction accura...
07/18/2022

Analyzing Clustered Continuous Response Variables with Ordinal Regression Models

Continuous response variables often need to be transformed to meet regre...
11/03/2020

Spatially Clustered Regression

Spatial regression or geographically weighted regression models have bee...
07/05/2016

Algorithms for Generalized Cluster-wise Linear Regression

Cluster-wise linear regression (CLR), a clustering problem intertwined w...
06/24/2014

Automatic Dimension Selection for a Non-negative Factorization Approach to Clustering Multiple Random Graphs

We consider a problem of grouping multiple graphs into several clusters ...
12/07/2022

Multi-Randomized Kaczmarz for Latent Class Regression

Linear regression is effective at identifying interpretable trends in a ...
11/20/2019

Mixtures of multivariate generalized linear models with overlapping clusters

With the advent of ubiquitous monitoring and measurement protocols, stud...