Equitability, interval estimation, and statistical power

05/09/2015
by Yakir A. Reshef, et al.

For analysis of a high-dimensional dataset, a common approach is to test a null hypothesis of statistical independence on all variable pairs using a non-parametric measure of dependence. However, because this approach attempts to identify any non-trivial relationship no matter how weak, it often identifies too many relationships to be useful. What is needed is a way of identifying a smaller set of relationships that merit detailed further analysis. Here we formally present and characterize equitability, a property of measures of dependence that aims to overcome this challenge. Notionally, an equitable statistic is a statistic that, given some measure of noise, assigns similar scores to equally noisy relationships of different types [Reshef et al. 2011]. We begin by formalizing this idea via a new object called the interpretable interval, which functions as an interval estimate of the amount of noise in a relationship of unknown type. We define an equitable statistic as one with small interpretable intervals. We then draw on the equivalence of interval estimation and hypothesis testing to show that under moderate assumptions an equitable statistic is one that yields well-powered tests for distinguishing not only between trivial and non-trivial relationships of all kinds but also between non-trivial relationships of different strengths. This means that equitability allows us to specify a threshold relationship strength x_0 and to search for relationships of all kinds with strength greater than x_0. Thus, equitability can be thought of as a strengthening of power against independence that enables fruitful analysis of datasets with a small number of strong, interesting relationships and a large number of weaker ones. We conclude with a demonstration of how our two equivalent characterizations of equitability can be used to evaluate the equitability of a statistic in practice.
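
To make the idea of an interpretable interval concrete, the following is a minimal simulation sketch, not the authors' procedure. It assumes that relationship strength is measured by R^2 between Y and a noiseless function f(X), that the family of relationships consists of just a line and a parabola, and that the statistic under study is squared Pearson correlation. All function names, the grid of R^2 levels, the sample size, and the number of replicates are hypothetical choices made for illustration. For each relationship type and strength, the code estimates a central acceptance region of the statistic, then inverts those regions to obtain the range of strengths compatible with an observed score.

```python
import numpy as np

def simulate(f, r2, n, rng):
    """Draw n points from Y = f(X) + Gaussian noise, scaled so that
    R^2 between f(X) and Y is approximately r2 (assumed noise measure)."""
    x = rng.uniform(0.0, 1.0, n)
    fx = f(x)
    # Choose Var(noise) so that Var(f) / (Var(f) + Var(noise)) = r2.
    noise_sd = np.sqrt(np.var(fx) * (1.0 - r2) / r2)
    y = fx + rng.normal(0.0, noise_sd, n)
    return x, y

def pearson_sq(x, y):
    """Squared Pearson correlation, the statistic evaluated here."""
    return np.corrcoef(x, y)[0, 1] ** 2

def acceptance_regions(stat, funcs, r2_grid, n=200, reps=200, alpha=0.1, seed=0):
    """Estimate, for each (relationship type, strength) pair, the central
    (1 - alpha) range of the statistic over repeated samples."""
    rng = np.random.default_rng(seed)
    regions = {}
    for name, f in funcs.items():
        for r2 in r2_grid:
            vals = [stat(*simulate(f, r2, n, rng)) for _ in range(reps)]
            lo, hi = np.quantile(vals, [alpha / 2, 1 - alpha / 2])
            regions[(name, float(r2))] = (lo, hi)
    return regions

def interpretable_interval(score, regions):
    """Smallest interval of strengths whose acceptance region, for at least
    one relationship type, contains the observed score (None if empty)."""
    hits = [r2 for (_, r2), (lo, hi) in regions.items() if lo <= score <= hi]
    return (min(hits), max(hits)) if hits else None

# Hypothetical family of relationships and grid of strengths.
R2_GRID = np.linspace(0.1, 0.9, 9)
FUNCS = {
    "linear": lambda x: x,
    "parabola": lambda x: 4.0 * (x - 0.5) ** 2,
}

regions = acceptance_regions(pearson_sq, FUNCS, R2_GRID)
print("interval at score 0.5:  ", interpretable_interval(0.5, regions))
print("interval at score 0.005:", interpretable_interval(0.005, regions))
```

Under these assumptions, a moderately high score pins down the strength fairly tightly, because only the linear relationship can produce it. A score near zero is compatible with the parabola at almost any noise level, so its interval is wide. In the abstract's terms, squared Pearson correlation would not be equitable on this family, and by the duality with hypothesis testing it would also give poorly powered tests for separating weak from strong parabolic relationships.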
