Perturbative construction of mean-field equations in extensive-rank matrix factorization and denoising

by Antoine Maillard, et al.

Factorization of matrices where the rank of the two factors diverges linearly with their sizes has many applications in diverse areas such as unsupervised representation learning, dictionary learning, and sparse coding. We consider a setting where the two factors are generated from known component-wise independent prior distributions, and the statistician observes a (possibly noisy) component-wise function of their matrix product. In the limit where the dimensions of the matrices tend to infinity while their ratios remain fixed, we expect to be able to derive closed-form expressions for the optimal mean squared error on the estimation of the two factors. However, this remains a very involved mathematical and algorithmic problem. A related, but simpler, problem is extensive-rank matrix denoising, where one aims to reconstruct a matrix with extensive but usually small rank from noisy measurements. In this paper, we approach both of these problems using high-temperature expansions at fixed order parameters. This allows us to clarify how previous attempts at solving these problems failed to find an asymptotically exact solution. We provide a systematic way to derive the corrections to these existing approximations, taking into account the structure of correlations particular to the problem. Finally, we illustrate our approach in detail on the case of extensive-rank matrix denoising. We compare our results with known optimal rotationally-invariant estimators, and show how exact asymptotic calculations of the minimal error can be performed using extensive-rank matrix integrals.
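To make the setting concrete, the following is a minimal sketch of the observation model described above, not the method of the paper. It assumes standard Gaussian priors on both factors, an identity output function with additive Gaussian noise, and a 1/sqrt(m) normalization of the product; the names `sample_factorization`, `entrywise_shrinkage`, `alpha`, `psi` and `noise_var` are hypothetical and chosen only for illustration. The entrywise shrinkage is a naive baseline that ignores the low-rank structure entirely; it is neither a rotationally-invariant estimator nor the optimal one.

```python
import numpy as np

def sample_factorization(n=200, alpha=0.5, psi=1.0, noise_var=0.5, seed=0):
    """Draw factors F (n x m), X (m x p) and a noisy observation of their product.

    Assumptions (not taken from the paper): i.i.d. standard Gaussian priors,
    rank m = alpha * n growing linearly with n, and a 1/sqrt(m) scaling so
    that each entry of the signal has unit variance.
    """
    rng = np.random.default_rng(seed)
    m = int(alpha * n)  # extensive rank: proportional to the matrix size
    p = int(psi * n)    # number of columns, also proportional to n
    F = rng.standard_normal((n, m))
    X = rng.standard_normal((m, p))
    signal = F @ X / np.sqrt(m)
    Y = signal + np.sqrt(noise_var) * rng.standard_normal((n, p))
    return F, X, signal, Y

def entrywise_shrinkage(Y, noise_var):
    """Linear shrinkage of each entry, treating entries as unit-variance and independent."""
    return Y / (1.0 + noise_var)

def mse(estimate, truth):
    """Mean squared error per matrix entry."""
    return float(np.mean((estimate - truth) ** 2))

if __name__ == "__main__":
    F, X, signal, Y = sample_factorization()
    print("MSE of raw observation:", mse(Y, signal))
    print("MSE of entrywise shrinkage:", mse(entrywise_shrinkage(Y, 0.5), signal))
```

Any estimator that exploits the spectral structure of the signal would be compared against baselines of this kind; the gap between such a baseline and the minimal achievable error is what the asymptotic analysis aims to characterize.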




Phase transitions and sample complexity in Bayes-optimal matrix factorization

We analyse the matrix factorization problem. Given a noisy measurement o...

Optimal denoising of rotationally invariant rectangular matrices

In this manuscript we consider denoising of large rectangular matrices: ...

Rank-one partitioning: formalization, illustrative examples, and a new cluster enhancing strategy

In this paper, we introduce and formalize a rank-one partitioning learni...

Mismatched Estimation of rank-one symmetric matrices under Gaussian noise

We consider the estimation of an n-dimensional vector s from the noisy e...

Rank-one matrix estimation with groupwise heteroskedasticity

We study the problem of estimating a rank-one matrix from Gaussian obser...

Statistical limits of dictionary learning: random matrix theory and the spectral replica method

We consider increasingly complex models of matrix denoising and dictiona...

Fast and Robust Fixed-Rank Matrix Recovery

We address the problem of efficient sparse fixed-rank (S-FR) matrix deco...