Convergence of Parameter Estimates for Regularized Mixed Linear Regression Models

03/21/2019
by   Taiyao Wang, et al.

We consider Mixed Linear Regression (MLR), where training data have been generated from a mixture of distinct linear models (or clusters) and we seek to identify the corresponding coefficient vectors. We introduce a Mixed Integer Programming (MIP) formulation for MLR subject to regularization constraints on the coefficient vectors. We establish that, in the absence of noise, the MIP solution converges to the true coefficient vectors as the number of training samples grows large. Subject to slightly stronger assumptions, we also establish that the MIP identifies the clusters from which the training samples were generated. In the special case where training data come from a single cluster, we establish that the corresponding MIP yields a solution that converges to the true coefficient vector even when training data are perturbed by (martingale difference) noise. We provide a counterexample indicating that, in the presence of noise, the MIP may fail to produce the true coefficient vectors when there is more than one cluster. We also provide numerical results testing the MIP solutions on synthetic examples with noise.
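
To make the noiseless setting concrete, the following is a minimal sketch and not the paper's MIP formulation. It assumes a squared-error objective, omits the regularization constraints, and replaces the integer program with exhaustive enumeration of cluster assignments, which is only feasible for a handful of samples. The function name `brute_force_mlr` and the toy data are hypothetical and chosen purely for illustration.

```python
import itertools
import numpy as np

def brute_force_mlr(X, y, n_clusters=2):
    """Try every cluster assignment; fit least squares within each cluster.

    Returns (loss, coefficient matrix, labels) for the assignment with the
    smallest total squared residual. Illustrative stand-in for a MIP solver.
    """
    n, d = X.shape
    best = (np.inf, None, None)
    for labels in itertools.product(range(n_clusters), repeat=n):
        labels = np.asarray(labels)
        total, coefs = 0.0, []
        for k in range(n_clusters):
            mask = labels == k
            if mask.sum() < d:
                # Too few samples to pin down a coefficient vector: skip.
                total = np.inf
                break
            beta, *_ = np.linalg.lstsq(X[mask], y[mask], rcond=None)
            total += float(np.sum((y[mask] - X[mask] @ beta) ** 2))
            coefs.append(beta)
        if total < best[0]:
            best = (total, np.array(coefs), labels)
    return best

# Toy noiseless data: two linear models (intercept, slope), four samples each.
t = np.arange(8.0)
X = np.column_stack([np.ones(8), t])
true_betas = np.array([[1.0, 2.0], [-3.0, 0.5]])
true_labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])
y = np.einsum("ij,ij->i", X, true_betas[true_labels])

loss, coefs, labels = brute_force_mlr(X, y)
print("loss:", loss)
print("recovered coefficients:\n", coefs)
print("recovered labels:", labels)
```

In this toy instance, only the true partition of the samples (up to relabeling the two clusters) fits the noiseless data exactly, so the search returns coefficient vectors that match the generating ones. The enumeration grows exponentially with the number of samples, which is why a formulation that can be handed to a MIP solver is of interest.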


research
06/07/2017

Outlier Detection Using Distributionally Robust Optimization under the Wasserstein Metric

We present a Distributionally Robust Optimization (DRO) approach to outl...
research
02/10/2019

Iterative Least Trimmed Squares for Mixed Linear Regression

Given a linear regression setting, Iterative Least Trimmed Squares (ILTS...
research
06/02/2022

Sparse Mixed Linear Regression with Guarantees: Taming an Intractable Problem with Invex Relaxation

In this paper, we study the problem of sparse mixed linear regression on...
research
12/06/2021

Bayesian Estimation Approach for Linear Regression Models with Linear Inequality Restrictions

Univariate and multivariate general linear regression models, subject to...
research
11/13/2018

Spectral Deconfounding and Perturbed Sparse Linear Models

Standard high-dimensional regression methods assume that the underlying ...
research
09/20/2023

GLM Regression with Oblivious Corruptions

We demonstrate the first algorithms for the problem of regression for ge...
research
11/15/2019

Penalized k-means algorithms for finding the correct number of clusters in a dataset

In many applications we want to find the number of clusters in a dataset...
