Spectral Algorithm for Low-rank Multitask Regression

by Yotam Gigi, et al.

Multitask learning, i.e., taking advantage of the relatedness of individual tasks in order to improve performance on all of them, is a core challenge in machine learning. We focus on matrix regression tasks where the rank of the weight matrix is constrained to reduce sample complexity. We introduce the common mechanism regression (CMR) model, which assumes a shared left low-rank component across all tasks but allows an individual per-task right low-rank component. This dramatically reduces the number of samples needed for accurate estimation. The problem of jointly recovering the common and the local components has a non-convex bilinear structure. We overcome this hurdle and provide a provably beneficial non-iterative spectral algorithm. Appealingly, the solution has favorable behavior as a function of the number of related tasks and the small number of samples available for each one. We demonstrate the efficacy of our approach on the challenging task of remote river discharge estimation across multiple river sites, where data for each task is naturally scarce. In this scenario, sharing a low-rank component between the tasks translates to a shared spectral reflection of the water, which is a true underlying physical model. We also show the benefit of the approach in the markedly different setting of image classification, where the common component can be interpreted as the shared convolution filters.
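To make the model concrete, the following is a minimal sketch and not the paper's algorithm. It assumes each task t has weight matrix W_t = A B_t^T with a shared left factor A, i.i.d. standard Gaussian inputs X, and responses y = <X, W_t> + noise. Under those assumptions E[y X] = A B_t^T, so one simple non-iterative estimate of span(A) takes the top left singular vectors of the stacked per-task moment matrices. The function names (simulate_cmr, estimate_shared_subspace, subspace_error) and all dimensions are hypothetical.

```python
import numpy as np


def simulate_cmr(num_tasks, n_per_task, d1, d2, rank, noise=0.01, seed=0):
    """Draw synthetic data with a shared left factor A and per-task right factors B_t."""
    rng = np.random.default_rng(seed)
    # Shared left factor with orthonormal columns (assumed "common mechanism").
    A, _ = np.linalg.qr(rng.standard_normal((d1, rank)))
    tasks = []
    for _ in range(num_tasks):
        B = rng.standard_normal((d2, rank))            # per-task right factor
        X = rng.standard_normal((n_per_task, d1, d2))  # Gaussian design (assumption)
        W = A @ B.T                                    # rank-constrained weight matrix
        y = np.einsum("nij,ij->n", X, W) + noise * rng.standard_normal(n_per_task)
        tasks.append((X, y))
    return A, tasks


def estimate_shared_subspace(tasks, rank):
    """Spectral estimate of span(A) from per-task moments M_t = mean_i y_i X_i."""
    moments = [np.einsum("n,nij->ij", y, X) / len(y) for X, y in tasks]
    stacked = np.concatenate(moments, axis=1)  # shape (d1, num_tasks * d2)
    U, _, _ = np.linalg.svd(stacked, full_matrices=False)
    return U[:, :rank]


def subspace_error(A, A_hat):
    """Spectral norm of the difference between the two orthogonal projectors."""
    return np.linalg.norm(A @ A.T - A_hat @ A_hat.T, 2)


if __name__ == "__main__":
    # 20 samples per task is fewer than the d1 * d2 = 40 entries of each W_t.
    A, tasks = simulate_cmr(num_tasks=400, n_per_task=20, d1=8, d2=5, rank=2)
    A_hat = estimate_shared_subspace(tasks, rank=2)
    print("subspace error:", subspace_error(A, A_hat))
```

In this toy setting each task alone is underdetermined, yet pooling the moments across tasks still localizes the shared subspace, which illustrates the kind of dependence on the number of tasks that the abstract describes.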








1 Introduction

Figure 1: An RGB satellite image of a portion of the Ganges river in India (left), along with its thresholded CMR approximation () that accurately identifies the water pixels (right). The generated parameters are and correspond to the 11 spectral bands of the satellite LANDSAT8.
Figure 2: Illustration of CMR usage in image multi-class classification: the first step is a convolution-like reshaping that transforms an image to a down-sampled version with multiple channels (bands). Next, the bands are up-lifted to higher dimensions via random non-linear mappings. A common mechanism is then applied in the bands dimension to construct the important features. Finally, individual regressions are performed per binary classification task. Graphics were generated via http://alexlenail.me/NN-SVG/
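The pipeline in Figure 2 can be sketched roughly as follows. This is an illustrative assumption, not the authors' implementation: the convolution-like reshaping is modeled as a non-overlapping patch rearrangement (space-to-depth), the random non-linear mapping as a random projection followed by a ReLU, and the names space_to_depth, make_lift, featurize and cmr_scores are hypothetical. The shared factor A acts on the (lifted) bands dimension, and each binary task t has its own factor B_t over spatial locations.

```python
import numpy as np


def space_to_depth(image, patch):
    """Rearrange an (H, W) image into (locations, bands) using non-overlapping patches."""
    h, w = image.shape
    assert h % patch == 0 and w % patch == 0, "patch must divide both image sides"
    blocks = image.reshape(h // patch, patch, w // patch, patch)
    blocks = blocks.transpose(0, 2, 1, 3)  # (row block, col block, patch row, patch col)
    return blocks.reshape((h // patch) * (w // patch), patch * patch)


def make_lift(in_dim, out_dim, seed=0):
    """Fixed random projection, shared by all images (assumed form of the lifting)."""
    rng = np.random.default_rng(seed)
    return rng.standard_normal((in_dim, out_dim)) / np.sqrt(in_dim)


def featurize(image, patch, proj):
    """Down-sample into bands, then lift the bands with a random ReLU mapping."""
    bands = space_to_depth(image, patch)
    return np.maximum(bands @ proj, 0.0)  # shape (locations, lifted bands)


def cmr_scores(features, A, B_per_task):
    """Score every task: <F, B_t A^T> = trace(B_t^T F A), with A shared across tasks."""
    shared = features @ A  # common mechanism applied along the bands dimension
    return np.array([np.sum(shared * B) for B in B_per_task])


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    image = rng.standard_normal((8, 8))
    proj = make_lift(in_dim=4, out_dim=6)
    F = featurize(image, patch=2, proj=proj)             # (16, 6)
    A = rng.standard_normal((6, 2))                      # shared factor, rank 2
    B_per_task = [rng.standard_normal((16, 2)) for _ in range(3)]
    print(cmr_scores(F, A, B_per_task))
```

Computing features @ A once and reusing it for every task reflects the idea that the common mechanism builds the important features, while the per-task factors only perform the final individual regressions.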

2 Related work

3 Common Mechanism Regression (CMR)

4 Theoretical Analysis of CMR

5 Experimental Evaluation

6 Summary and Future Work

In this work, we tackled the challenge of leveraging few data points from multiple related regression tasks to improve predictive performance across all tasks. We proposed a common mechanism regression model and a corresponding spectral optimization algorithm for doing so. We proved that, despite the non-convex nature of the learning objective, it is possible to reconstruct the common mechanism even when there are not enough samples to estimate the per-task component of the model. In particular, we characterized a favorable dependence on the number of related tasks and the number of samples for each task. We also demonstrated the efficacy of the approach on simple visual recognition scenarios using random convolution-like and nonlinear features, as well as on a more challenging remote river discharge estimation task.

On the modeling front, it would be useful to generalize our CMR approach to also allow for robust and task-normalized loss functions. In terms of the theoretical analysis, it would be interesting to also consider the conditions under which the restricted isometry property (RIP) holds in the CMR model.

7 Acknowledgements

This work was supported in part by the Israel Science Foundation (ISF) under Grant 1339/15.


Appendix A - Main theorem proof

Appendix B - Extra synthetic experiments