Learning Low-Rank Document Embeddings with Weighted Nuclear Norm Regularization

10/27/2020
by Katharina Morik, et al.

Recently, neural embeddings of documents have shown success in various language processing tasks. These low-dimensional, dense feature vectors of text documents capture semantic similarities better than traditional methods. However, the underlying optimization problem is non-convex and usually solved using stochastic gradient descent. Hence, solutions are most likely sub-optimal and not reproducible, as they are the result of a randomized algorithm. We present an alternative formulation for learning low-rank representations based on convex optimization. Instead of explicitly learning low-dimensional features, we compute a low-rank representation implicitly by regularizing full-dimensional solutions. Our approach uses the weighted nuclear norm, a regularizer that penalizes the singular values of a matrix. We optimize the regularized objective using accelerated proximal gradient descent. We apply the approach to learn embeddings of documents; these embeddings are guaranteed to converge to a global optimum in a deterministic manner. We show that our convex approach outperforms traditional convex approaches in a numerical study. Furthermore, we demonstrate that the embeddings are useful for detecting similarities on a standard dataset. We then apply our approach in an interdisciplinary research project to detect topics in religious online discussions. The topic descriptions obtained from a clustering of the embeddings are coherent and insightful.

An earlier version of this work stated that the weighted nuclear norm is a convex regularizer. This is wrong; the weighted nuclear norm is non-convex, even though the name falsely suggests that it is a matrix norm.
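The page includes no code, but the optimization scheme the abstract describes can be sketched. The following is a minimal, hypothetical illustration of accelerated proximal gradient descent with a weighted nuclear norm regularizer, assuming a simple squared-error fit of the embedding matrix X to a document-term matrix A; the loss, weight schedule, step size, and all variable names are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (not the authors' code): accelerated proximal gradient
# descent for  min_X  0.5 * ||X - A||_F^2 + lam * sum_i w_i * sigma_i(X),
# where the regularizer is the weighted nuclear norm.
import numpy as np

def prox_weighted_nuclear(Y, weights, tau):
    """Weighted singular value thresholding: shrink each singular value
    sigma_i by tau * w_i and truncate at zero."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    s_shrunk = np.maximum(s - tau * weights[: len(s)], 0.0)
    return U @ np.diag(s_shrunk) @ Vt

def apg_low_rank_embedding(A, weights, lam=1.0, step=1.0, iters=200):
    """FISTA-style accelerated proximal gradient descent.
    For the quadratic data term above, the gradient is 1-Lipschitz,
    so step = 1.0 is a valid step size."""
    X = np.zeros_like(A)
    Z = X.copy()          # extrapolated point
    t = 1.0               # momentum parameter
    for _ in range(iters):
        grad = Z - A                              # gradient of the smooth part at Z
        Y = Z - step * grad                       # gradient step
        X_new = prox_weighted_nuclear(Y, weights, step * lam)
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        Z = X_new + ((t - 1.0) / t_new) * (X_new - X)   # Nesterov extrapolation
        X, t = X_new, t_new
    return X

# Toy usage with a random "document-term" matrix. The weights increase for
# smaller singular values (a hypothetical schedule), which pushes the
# solution toward a low effective rank.
rng = np.random.default_rng(0)
A = rng.random((100, 50))
w = 1.0 / (np.linspace(1.0, 0.1, min(A.shape)) + 1e-3)
X_low_rank = apg_low_rank_embedding(A, w, lam=5.0)
```

A known property of the weighted nuclear norm proximal subproblem is that, when the weights are non-decreasing in the order of the (descending) singular values, weighted singular value thresholding solves it exactly; for general weights the regularizer is non-convex, which is exactly the point of the correction at the end of the abstract.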

Related research

09/14/2018
Efficient Rank Minimization via Solving Non-convex Penalties by Iterative Shrinkage-Thresholding Algorithm
Rank minimization (RM) is a widely investigated task of finding solution...

08/01/2017
Large-Scale Low-Rank Matrix Learning with Nonconvex Regularizers
Low-rank modeling has many important applications in computer vision and...

12/03/2015
Fast Low-Rank Matrix Learning with Nonconvex Regularization
Low-rank modeling has a lot of important applications in machine learnin...

01/07/2019
Double Weighted Truncated Nuclear Norm Regularization for Low-Rank Matrix Completion
Matrix completion focuses on recovering a matrix from a small subset of ...

08/14/2018
A Comprehensive Survey for Low Rank Regularization
Low rank regularization, in essence, involves introducing a low rank or ...

02/07/2019
Matrix Cofactorization for Joint Representation Learning and Supervised Classification -- Application to Hyperspectral Image Analysis
Supervised classification and representation learning are two widely use...

03/23/2020
Accurate Optimization of Weighted Nuclear Norm for Non-Rigid Structure from Motion
Fitting a matrix of a given rank to data in a least squares sense can be...
