Regression with Label Permutation in Generalized Linear Model

06/23/2022
by Guanhua Fang, et al.

The assumption that the response and the predictor belong to the same statistical unit may be violated in practice. Unbiased estimation and recovery of the true label ordering from unlabeled data are challenging tasks that have attracted increasing attention in the recent literature. In this paper, we present a relatively complete analysis of the label permutation problem for the generalized linear model with multivariate responses. The theory is established under three scenarios: with knowledge of the true parameters, with partial knowledge of the underlying label permutation matrix, and without any such knowledge. Our results remove the stringent conditions required by the current literature and are further extended to the missing observation setting, which has never been considered in the label permutation literature. On the computational side, we propose two methods, a "maximum likelihood estimation" algorithm and a "two-step estimation" algorithm, to accommodate different settings. When the proportion of permuted labels is moderate, both methods work effectively. Multiple numerical experiments are provided, and they corroborate our theoretical findings.
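To make the "known parameters" scenario concrete, here is a minimal sketch for a much simpler case than the one the paper treats: a scalar response with a canonical link. It is not the authors' algorithm. It assumes the log-likelihood depends on the permutation only through the cross term sum_i y[pi[i]] * eta[i], which the rearrangement inequality maximizes by pairing sorted responses with sorted linear predictors. The Gaussian model, the names `match_by_sorting` and `cross_term`, and all numeric settings are illustrative assumptions.

```python
import random

def match_by_sorting(y_shuffled, eta):
    """Return pi_hat, where pi_hat[i] is the index in y_shuffled assigned to row i.

    Pairs the k-th smallest response with the k-th smallest linear predictor,
    which maximizes sum_i y_shuffled[pi[i]] * eta[i] over all permutations.
    """
    order_eta = sorted(range(len(eta)), key=lambda i: eta[i])
    order_y = sorted(range(len(y_shuffled)), key=lambda j: y_shuffled[j])
    pi_hat = [0] * len(eta)
    for i, j in zip(order_eta, order_y):
        pi_hat[i] = j
    return pi_hat

def cross_term(y_shuffled, eta, pi):
    """Permutation-dependent part of a canonical-link log-likelihood."""
    return sum(y_shuffled[pi[i]] * eta[i] for i in range(len(eta)))

# Hypothetical simulation settings (not taken from the paper).
rng = random.Random(0)
n, beta, noise_sd, n_moved = 200, 2.0, 0.1, 40

x = [rng.gauss(0.0, 1.0) for _ in range(n)]
eta = [beta * xi for xi in x]                      # known true parameter
y = [e + rng.gauss(0.0, noise_sd) for e in eta]    # Gaussian GLM, identity link

# Permute the labels of a moderate fraction of the rows.
true_pi = list(range(n))
moved = rng.sample(range(n), n_moved)
targets = moved[:]
rng.shuffle(targets)
for i, j in zip(moved, targets):
    true_pi[i] = j

# Row i's response is observed at position true_pi[i].
y_shuffled = [0.0] * n
for i in range(n):
    y_shuffled[true_pi[i]] = y[i]

pi_hat = match_by_sorting(y_shuffled, eta)
frac_correct = sum(p == t for p, t in zip(pi_hat, true_pi)) / n
print(f"fraction of rows matched to their true response: {frac_correct:.2f}")
```

The sorting shortcut applies only to scalar responses. With multivariate responses, as in the paper's setting, matching under known parameters would instead be posed as an assignment problem over a pairwise cost matrix, and exact recovery depends on the noise level relative to the spacing of the linear predictors.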

research
07/16/2019

A Two-Stage Approach to Multivariate Linear Regression with Sparsely Mismatched Data

A tacit assumption in linear regression is that (response, predictor)-pa...
research
08/09/2016

Linear Regression with an Unknown Permutation: Statistical and Computational Limits

Consider a noisy linear observation model with an unknown permutation, b...
research
09/05/2019

Permutation Recovery from Multiple Measurement Vectors in Unlabeled Sensing

In "Unlabeled Sensing", one observes a set of linear measurements of an ...
research
06/08/2015

Convergence Rates of Active Learning for Maximum Likelihood Estimation

An active learner is given a class of models, a large set of unlabeled e...
research
05/14/2016

Generalized Linear Models for Aggregated Data

Databases in domains such as healthcare are routinely released to the pu...
research
12/16/2020

A connection between the pattern classification problem and the General Linear Model for statistical inference

A connection between the General Linear Model (GLM) in combination with ...
research
11/02/2021

Regularization for Shuffled Data Problems via Exponential Family Priors on the Permutation Group

In the analysis of data sets consisting of (X, Y)-pairs, a tacit assumpt...
