Minimax Supervised Clustering in the Anisotropic Gaussian Mixture Model: A new take on Robust Interpolation

11/13/2021
by Stanislav Minsker, et al.

We study supervised clustering in the two-component anisotropic Gaussian mixture model, in high dimensions and in the non-asymptotic setting. We first derive a lower bound and a matching upper bound for the minimax risk of clustering in this framework. We also show that, in the high-dimensional regime, the linear discriminant analysis (LDA) classifier is sub-optimal in the minimax sense. Next, we precisely characterize the risk of ℓ_2-regularized supervised least squares classifiers. We deduce that the interpolating solution may outperform the regularized classifier under mild assumptions on the covariance structure of the noise. Our analysis also shows that interpolation can be robust to corruption in the covariance of the noise when the signal is aligned with the "clean" part of the covariance, for a properly defined notion of alignment. To the best of our knowledge, this phenomenon has not yet been investigated in the rapidly growing literature on interpolation. We conclude that interpolation is not only benign but can also be optimal and, in some cases, robust.


research
12/19/2018

Sharp optimal recovery in the Two Component Gaussian Mixture Model

In this paper, we study the problem of clustering in the Two component G...
research
12/19/2018

Sharp optimal recovery in the Two Gaussian Mixture Model

In this paper, we study the non-asymptotic problem of exact recovery in ...
research
10/25/2022

Interpolating Discriminant Functions in High-Dimensional Gaussian Latent Mixtures

This paper considers binary classification of high-dimensional features ...
research
11/01/2017

A Large Dimensional Analysis of Regularized Discriminant Analysis Classifiers

This article carries out a large dimensional analysis of standard regula...
research
06/29/2020

Sharp Statistical Guarantees for Adversarially Robust Gaussian Classification

Adversarial robustness has become a fundamental requirement in modern ma...
research
02/17/2023

Are Gaussian data all you need? Extents and limits of universality in high-dimensional generalized linear estimation

In this manuscript we consider the problem of generalized linear estimat...
