Variable Selection for Kernel Two-Sample Tests

02/15/2023
by Jie Wang, et al.

We consider the variable selection problem for two-sample tests: selecting the most informative features for distinguishing samples drawn from two groups. We propose a kernel maximum mean discrepancy (MMD) framework for this problem and derive equivalent mixed-integer programming formulations for linear, quadratic, and Gaussian kernel functions. The framework combines computational efficiency with statistical guarantees: (i) For the linear kernel, we provide a closed-form solution. For the quadratic kernel, the problem is NP-hard, but we provide an exact mixed-integer semi-definite programming formulation, which in turn motivates the development of exact and approximation algorithms. For the Gaussian kernel, we propose a convex-concave procedure that finds critical points. (ii) We provide non-asymptotic uncertainty quantification for the proposed formulation under both the null and alternative scenarios. Experimental results demonstrate good performance of our framework.
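To make the idea of MMD-based variable selection concrete, here is a minimal Python sketch. It is not the authors' formulation: it assumes a hard 0/1 feature mask of size `k` and uses the standard fact that, for the linear kernel, the squared MMD between the empirical means decomposes into one squared mean difference per coordinate, so the best mask simply keeps the `k` largest terms. The function names (`select_top_k`, `gaussian_mmd2_unbiased`) and the `bandwidth` parameter are hypothetical, chosen only for illustration.

```python
import numpy as np


def linear_mmd2_per_feature(X, Y):
    """Per-coordinate contribution to the (biased) linear-kernel MMD^2.

    With k(x, y) = x^T y, MMD^2 between the empirical distributions is
    ||mean(X) - mean(Y)||^2, which is a sum of per-feature terms.
    """
    diff = X.mean(axis=0) - Y.mean(axis=0)
    return diff ** 2


def select_top_k(X, Y, k):
    """Assumed selection rule: keep the k features with the largest
    squared mean difference (maximizes the masked linear-kernel MMD^2)."""
    scores = linear_mmd2_per_feature(X, Y)
    top = np.argsort(scores)[::-1][:k]
    return np.sort(top)


def gaussian_mmd2_unbiased(X, Y, features, bandwidth=1.0):
    """Unbiased MMD^2 estimate with a Gaussian kernel, restricted to
    the given feature indices."""
    Xs, Ys = X[:, features], Y[:, features]

    def gram(A, B):
        # Pairwise squared distances, clipped at 0 against round-off.
        sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
        return np.exp(-np.maximum(sq, 0.0) / (2.0 * bandwidth ** 2))

    n, m = len(Xs), len(Ys)
    Kxx, Kyy, Kxy = gram(Xs, Xs), gram(Ys, Ys), gram(Xs, Ys)
    # Exclude diagonal terms for the within-sample averages.
    term_x = (Kxx.sum() - np.trace(Kxx)) / (n * (n - 1))
    term_y = (Kyy.sum() - np.trace(Kyy)) / (m * (m - 1))
    return term_x + term_y - 2.0 * Kxy.mean()
```

A typical use would be to call `select_top_k(X, Y, k)` on two sample matrices with matching columns, then compare `gaussian_mmd2_unbiased` on the selected features against an arbitrary subset of the same size; a larger value on the selected subset suggests those features carry more of the distributional difference. The quadratic and Gaussian cases described in the abstract require the mixed-integer and convex-concave machinery of the paper and are not reproduced here.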
