Post Selection Inference with Incomplete Maximum Mean Discrepancy Estimator

02/17/2018
by Makoto Yamada, et al.

Measuring the divergence between two distributions is essential in machine learning and statistics, with applications including binary classification, change point detection, and two-sample testing. Furthermore, in the era of big data, designing a divergence measure that is interpretable and can handle high-dimensional, complex data has become extremely important. In this paper, we propose a post selection inference (PSI) framework for divergence measures, which can select a set of statistically significant features that discriminate between two distributions. Specifically, we employ an additive variant of maximum mean discrepancy (MMD) over features and introduce a general hypothesis test for PSI. We also propose and theoretically analyze a novel MMD estimator based on incomplete U-statistics, which is asymptotically normally distributed (under mild assumptions) and gives high detection power in PSI. Through synthetic and real-world feature selection experiments, we show that the proposed framework can successfully detect statistically significant features. Finally, we propose a sample selection framework for analyzing different members of the Generative Adversarial Networks (GANs) family.
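
To make the idea of an incomplete U-statistic estimator concrete, the following is a minimal sketch, not the authors' implementation. It assumes paired samples of equal size, a Gaussian kernel, and a design that draws index pairs uniformly at random with replacement. The names `gaussian_kernel` and `incomplete_mmd` and the parameters `n_pairs`, `bandwidth`, and `seed` are hypothetical choices made for illustration. The estimator averages the usual MMD U-statistic kernel h over a random subset of index pairs instead of over all pairs.

```python
import math
import random


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian kernel between two equal-length vectors (sequences of floats)."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def incomplete_mmd(x, y, n_pairs, bandwidth=1.0, seed=0):
    """Estimate squared MMD by averaging the U-statistic kernel
    h((x_i, y_i), (x_j, y_j)) = k(x_i, x_j) + k(y_i, y_j) - k(x_i, y_j) - k(x_j, y_i)
    over n_pairs randomly drawn index pairs with i != j.

    Assumption: pairs are sampled uniformly with replacement; other
    designs for choosing the subset of pairs are possible.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")
    n = len(x)
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_pairs):
        i = rng.randrange(n)
        # A nonzero offset modulo n guarantees j != i.
        j = (i + rng.randrange(1, n)) % n
        total += (
            gaussian_kernel(x[i], x[j], bandwidth)
            + gaussian_kernel(y[i], y[j], bandwidth)
            - gaussian_kernel(x[i], y[j], bandwidth)
            - gaussian_kernel(x[j], y[i], bandwidth)
        )
    return total / n_pairs
```

Under these assumptions the cost grows with `n_pairs` rather than with the square of the sample size. Calling the function on a single column of the data at a time gives one score per feature, which is one simple way to read the additive, feature-wise use of MMD described in the abstract.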

research
02/15/2018

Selecting the Best in GANs Family: a Post Selection Inference Framework

"Which Generative Adversarial Networks (GANs) generates the most plausib...
research
10/12/2016

Post Selection Inference with Kernels

We propose a novel kernel based post selection inference (PSI) algorithm...
research
03/12/2020

Asymptotic normality of a generalized maximum mean discrepancy estimator

In this paper, we propose an estimator of the generalized maximum mean d...
research
09/30/2021

Two Sample Testing in High Dimension via Maximum Mean Discrepancy

Maximum Mean Discrepancy (MMD) has been widely used in the areas of mach...
research
12/19/2014

Empirically Estimable Classification Bounds Based on a New Divergence Measure

Information divergence functions play a critical role in statistics and ...
research
05/22/2016

Interpretable Distribution Features with Maximum Testing Power

Two semimetrics on probability distributions are proposed, given as the ...
research
05/25/2023

Detecting Adversarial Data by Probing Multiple Perturbations Using Expected Perturbation Score

Adversarial detection aims to determine whether a given sample is an adv...