Novel Prediction Techniques Based on Clusterwise Linear Regression

04/28/2018
by Igor Gitman, et al.

In this paper we explore different regression models based on Clusterwise Linear Regression (CLR). CLR aims to find a partition of the data into k clusters such that linear regressions fitted to each cluster minimize the overall mean squared error on the whole data set. The main obstacle to using the resulting regression models for prediction on unseen test points is the absence of a reasonable way to obtain CLR cluster labels when the values of the target variable are unknown. We propose two novel approaches to this problem. The first approach, predictive CLR, builds a separate classification model to predict test CLR labels. The second approach, constrained CLR, utilizes a set of user-specified constraints that force certain points into the same cluster. Assuming the constraint values are known for the test points, they can be used directly to assign CLR labels. We evaluate these two approaches on three UCI ML datasets as well as on a large corpus of health insurance claims. We show that both of the proposed algorithms significantly improve over the known CLR-based regression methods. Moreover, predictive CLR consistently outperforms linear regression and random forest, and shows comparable performance to support vector regression on the UCI ML datasets. The constrained CLR approach achieves the best performance on the health insurance dataset, while requiring only ≈ 20 times more computation time than linear regression.
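
The idea in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes one-dimensional inputs, fits CLR by alternating between per-cluster least-squares fits and reassigning each point to the line with the smallest squared error, and uses a 1-nearest-neighbour rule as a stand-in for the "separate classification model" of predictive CLR. All function names (`fit_line`, `clr`, `predict`) and the toy data are hypothetical.

```python
import random


def fit_line(points):
    """Least-squares fit of y = slope * x + intercept to (x, y) pairs."""
    n = len(points)
    if n == 0:
        return 0.0, 0.0  # empty cluster: arbitrary placeholder model
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in points)
    if var_x == 0.0:
        return 0.0, mean_y  # all x equal: fall back to the mean of y
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = cov_xy / var_x
    return slope, mean_y - slope * mean_x


def fit_models(points, labels, k):
    """Fit one line per cluster to the points currently assigned to it."""
    return [fit_line([p for p, l in zip(points, labels) if l == c])
            for c in range(k)]


def sq_error(model, point):
    """Squared residual of a single (x, y) point under a (slope, intercept) model."""
    slope, intercept = model
    x, y = point
    return (y - (slope * x + intercept)) ** 2


def clr(points, k, n_iter=100, seed=0):
    """Alternate between fitting lines and reassigning points (a local search)."""
    rng = random.Random(seed)
    labels = [rng.randrange(k) for _ in points]  # random initial partition
    for _ in range(n_iter):
        models = fit_models(points, labels, k)
        new_labels = [min(range(k), key=lambda c: sq_error(models[c], p))
                      for p in points]
        if new_labels == labels:  # no point changed cluster: converged
            break
        labels = new_labels
    return labels, fit_models(points, labels, k)


def predict(x_new, points, labels, models):
    """Predictive step (assumption: 1-NN on x stands in for the label classifier)."""
    nearest = min(range(len(points)), key=lambda i: abs(points[i][0] - x_new))
    slope, intercept = models[labels[nearest]]
    return slope * x_new + intercept


# Toy data: two linear regimes, so the cluster label is predictable from x.
points = [(float(x), 2.0 * x + 1.0) for x in range(10)]
points += [(float(x), 30.0 - x) for x in range(10, 20)]

labels, models = clr(points, k=2)
clr_sse = sum(sq_error(models[l], p) for p, l in zip(points, labels))
baseline_sse = sum(sq_error(fit_line(points), p) for p in points)
prediction = predict(4.5, points, labels, models)
```

Because the alternating procedure only finds a local optimum, the result depends on the random initial partition; in practice one would likely rerun it with several seeds and keep the partition with the lowest training error. On the training data, the clusterwise error can never exceed that of a single regression line fitted to all points.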

research
06/07/2022

Certifying Data-Bias Robustness in Linear Regression

Datasets typically contain inaccuracies due to human error and societal ...
research
04/14/2018

Constrained maximum likelihood estimation of clusterwise linear regression models with unknown number of components

We consider an equivariant approach imposing data-driven bounds for the ...
research
01/10/2017

Efficient Image Set Classification using Linear Regression based Image Reconstruction

We propose a novel image set classification technique using linear regre...
research
10/16/2018

Hunting for Discriminatory Proxies in Linear Regression Models

A machine learning model may exhibit discrimination when used to make de...
research
08/29/2019

A Concert-planning Tool for Independent Musicians by Machine Learning Models

Our project aims at helping independent musicians to plan their concerts...
research
03/23/2011

Clustered regression with unknown clusters

We consider a collection of prediction experiments, which are clustered ...
