Supervised Quantile Normalization for Low-rank Matrix Approximation

02/08/2020
by   Marco Cuturi, et al.
0

Low rank matrix factorization is a fundamental building block in machine learning, used for instance to summarize gene expression profile data or word-document counts. To be robust to outliers and differences in scale across features, a matrix factorization step is usually preceded by ad-hoc feature normalization steps, such as tf-idf scaling or data whitening. We propose in this work to learn these normalization operators jointly with the factorization itself. More precisely, given a d× n matrix X of d features measured on n individuals, we propose to learn the parameters of quantile normalization operators that can operate row-wise on the values of X and/or of its factorization UV to improve the quality of the low-rank representation of X itself. This optimization is facilitated by the introduction of a differentiable quantile normalization operator built using optimal transport, providing new results on top of existing work by Cuturi et al. (2019). We demonstrate the applicability of these techniques on synthetic and genomics datasets.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
01/23/2014

Matrix factorization with Binary Components

Motivated by an application in computational biology, we consider low-ra...
research
08/25/2017

Structured Low-Rank Matrix Factorization: Global Optimality, Algorithms, and Applications

Recently, convex formulations of low-rank matrix factorization problems ...
research
06/07/2016

Expectile Matrix Factorization for Skewed Data Analysis

Matrix factorization is a popular approach to solving matrix estimation ...
research
05/28/2019

Differentiable Sorting using Optimal Transport:The Sinkhorn CDF and Quantile Operator

Sorting an array is a fundamental routine in machine learning, one that ...
research
05/02/2019

Compression of Acoustic Event Detection Models with Low-rank Matrix Factorization and Quantization Training

In this paper, we present a compression approach based on the combinatio...
research
06/01/2017

Supervised Quantile Normalisation

Quantile normalisation is a popular normalisation method for data subjec...
research
06/18/2012

Clustering by Low-Rank Doubly Stochastic Matrix Decomposition

Clustering analysis by nonnegative low-rank approximations has achieved ...

Please sign up or login with your details

Forgot password? Click here to reset