A General Scoring Rule for Randomized Kernel Approximation with Application to Canonical Correlation Analysis

10/11/2019
by   Yinsong Wang, et al.

Random features have been widely used for kernel approximation in large-scale machine learning. A number of recent studies have explored data-dependent sampling of features, modifying the stochastic oracle from which random features are drawn. While the techniques proposed in this realm improve the approximation, each is tied to a specific learning task. In this paper, we propose a general scoring rule for sampling random features, which can be employed for various applications with some adjustments. We first observe that our method recovers a number of data-dependent sampling methods (e.g., leverage scores and energy-based sampling). Then, we restrict our attention to a ubiquitous problem in statistics and machine learning, namely Canonical Correlation Analysis (CCA). We provide a principled guide for finding the sampling distribution that maximizes the canonical correlations, resulting in a novel data-dependent method for sampling features. Numerical experiments verify that our algorithm consistently outperforms other sampling techniques in the CCA task.
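To make the setting concrete, below is a minimal sketch of the pipeline the abstract describes: random Fourier features for an RBF kernel, a data-dependent score used to resample the features, and canonical correlations computed from the resampled features. This is not the paper's algorithm; the per-feature scoring rule (cross-view correlation), the synthetic data, and the helper names (rff, canonical_correlations, scores) are illustrative assumptions.

```python
# Illustrative sketch only: score-based resampling of random Fourier features
# followed by (ridge-regularized) CCA. The scoring rule here is a placeholder,
# not the general scoring rule proposed in the paper.
import numpy as np

rng = np.random.default_rng(0)

def rff(X, W, b):
    """Random Fourier features z(x) = sqrt(2/D) * cos(xW + b) for an RBF kernel."""
    D = W.shape[1]
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)

def canonical_correlations(Zx, Zy, reg=1e-3, k=2):
    """Top-k canonical correlations between two feature views (ridge-regularized)."""
    Zx, Zy = Zx - Zx.mean(0), Zy - Zy.mean(0)
    n = Zx.shape[0]
    Cxx = Zx.T @ Zx / n + reg * np.eye(Zx.shape[1])
    Cyy = Zy.T @ Zy / n + reg * np.eye(Zy.shape[1])
    Cxy = Zx.T @ Zy / n
    Wx = np.linalg.inv(np.linalg.cholesky(Cxx))  # whitening for view X
    Wy = np.linalg.inv(np.linalg.cholesky(Cyy))  # whitening for view Y
    return np.linalg.svd(Wx @ Cxy @ Wy.T, compute_uv=False)[:k]

# Two synthetic views sharing a low-dimensional latent signal.
n, dx, dy = 500, 5, 4
latent = rng.normal(size=(n, 2))
X = latent @ rng.normal(size=(2, dx)) + 0.5 * rng.normal(size=(n, dx))
Y = latent @ rng.normal(size=(2, dy)) + 0.5 * rng.normal(size=(n, dy))

# Oversample D0 candidate features per view, then keep D of them.
D0, D, sigma = 2000, 200, 1.0
Wx_, bx = rng.normal(scale=1.0 / sigma, size=(dx, D0)), rng.uniform(0, 2 * np.pi, D0)
Wy_, by = rng.normal(scale=1.0 / sigma, size=(dy, D0)), rng.uniform(0, 2 * np.pi, D0)
Zx, Zy = rff(X, Wx_, bx), rff(Y, Wy_, by)

def scores(Z, other):
    """Placeholder data-dependent score: strength of each candidate feature's
    correlation with the other view's raw variables, normalized to a distribution."""
    Zc, Oc = Z - Z.mean(0), other - other.mean(0)
    c = np.abs(Zc.T @ Oc).sum(axis=1)
    return c / c.sum()

idx_x = rng.choice(D0, size=D, replace=False, p=scores(Zx, Y))
idx_y = rng.choice(D0, size=D, replace=False, p=scores(Zy, X))

rho_scored = canonical_correlations(Zx[:, idx_x], Zy[:, idx_y])
rho_plain = canonical_correlations(Zx[:, :D], Zy[:, :D])
print("canonical correlations (scored sampling):", np.round(rho_scored, 3))
print("canonical correlations (plain sampling): ", np.round(rho_plain, 3))
```

The sketch keeps the plain (data-independent) sampling baseline for comparison; the paper's contribution is precisely the choice of the scoring distribution that the placeholder `scores` function stands in for.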


Related research

03/20/2019
On Sampling Random Features From Empirical Leverage Scores: Implementation and Theoretical Guarantees
Random features provide a practical framework for large-scale kernel app...

09/13/2006
A kernel method for canonical correlation analysis
Canonical correlation analysis is a technique to extract common features...

12/19/2017
On Data-Dependent Random Features for Improved Generalization in Supervised Learning
The randomized-feature approach has been successfully employed in large-...

08/16/2018
Context-Aware DFM Rule Analysis and Scoring Using Machine Learning
To evaluate the quality of physical layout designs in terms of manufactu...

02/05/2016
On Column Selection in Approximate Kernel Canonical Correlation Analysis
We study the problem of column selection in large-scale kernel canonical...

02/12/2020
A Random-Feature Based Newton Method for Empirical Risk Minimization in Reproducing Kernel Hilbert Space
In supervised learning using kernel methods, we encounter a large-scale ...

08/28/2023
Improved learning theory for kernel distribution regression with two-stage sampling
The distribution regression problem encompasses many important statistic...
