Kernels on Sample Sets via Nonparametric Divergence Estimates

02/01/2012
by Dougal J. Sutherland, et al.

Most machine learning algorithms, such as those for classification or regression, treat the individual data point as the object of interest. Here we consider extending machine learning algorithms to operate on groups of data points. We suggest treating a group of data points as an i.i.d. sample set from an underlying feature distribution for that group. Our approach employs kernel machines with a kernel on i.i.d. sample sets of vectors. We define certain kernel functions on pairs of distributions, and then use a nonparametric estimator to consistently estimate those functions from sample sets. Projecting the estimated Gram matrix onto the cone of symmetric positive semi-definite matrices enables us to use kernel machines for classification, regression, anomaly detection, and low-dimensional embedding in the space of distributions. We present several numerical experiments on both real and simulated datasets to demonstrate the advantages of our new approach.
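
The projection step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the projection is taken in the Frobenius norm, in which case it reduces to symmetrizing the matrix and clipping negative eigenvalues to zero; the abstract does not say which projection the authors use, so this is one standard choice and not necessarily theirs. The function name `project_to_psd` and the toy matrix `K_est` are hypothetical and chosen only for illustration.

```python
import numpy as np


def project_to_psd(gram):
    """Return the Frobenius-nearest symmetric PSD matrix to `gram`.

    Assumption: eigenvalue clipping is used as the projection; the
    source abstract does not specify the exact procedure.
    """
    # Estimated kernel values may be slightly asymmetric, so symmetrize first.
    sym = (gram + gram.T) / 2.0
    # eigh is the eigendecomposition for symmetric matrices.
    eigvals, eigvecs = np.linalg.eigh(sym)
    # Negative eigenvalues are what make the matrix indefinite; set them to 0.
    clipped = np.clip(eigvals, 0.0, None)
    # Rebuild V diag(clipped) V^T.
    return (eigvecs * clipped) @ eigvecs.T


# Toy stand-in for an estimated Gram matrix: symmetric but indefinite.
K_est = np.array([
    [1.0, 0.9, -0.8],
    [0.9, 1.0, 0.9],
    [-0.8, 0.9, 1.0],
])

K_psd = project_to_psd(K_est)
min_eig_before = np.linalg.eigvalsh(K_est).min()
min_eig_after = np.linalg.eigvalsh(K_psd).min()
```

After projection, `K_psd` can be passed to any kernel machine that accepts a precomputed Gram matrix. Here `min_eig_before` is negative, while `min_eig_after` is zero up to floating-point error.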
