Towards "simultaneous selective inference": post-hoc bounds on the false discovery proportion

03/19/2018
by Eugene Katsevich, et al.

The false discovery rate (FDR) has some pitfalls as an error criterion for multiple testing of n hypotheses: (a) committing to an error level q in advance limits its use in exploratory data analysis, and (b) controlling the false discovery proportion (FDP) on average provides no guarantee on its variability. We take a step towards overcoming these barriers using a new perspective we call "simultaneous selective inference." Many FDR procedures (such as Benjamini-Hochberg) can be viewed as carving out a path of potential rejection sets ∅ = R_0 ⊆ R_1 ⊆ ... ⊆ R_n ⊆ [n], assigning some algorithm-dependent estimate \widehat{FDP}(R_k) to each one. Then, they choose k^* = max{k: \widehat{FDP}(R_k) ≤ q}. We prove that for all these algorithms, given independent null p-values and a confidence level α, either the same \widehat{FDP} or a minor variant thereof bounds the unknown FDP to within a small explicit (algorithm-dependent) constant factor c_alg(α), uniformly across the entire path, with probability 1-α. Our bounds open up a middle ground between fully simultaneous inference (guarantees for all 2^n possible rejection sets) and fully selective inference (guarantees only for R_{k^*}). They allow the scientist to spot one or more suitable rejection sets (Select Post-hoc On the algorithm's Trajectory), by picking data-dependent sizes or error levels, after examining the entire path of \widehat{FDP}(R_k) and the uniform upper band on FDP. The price for the additional flexibility of spotting is small, for example the multiplier for BH corresponding to 95
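
The path view of Benjamini-Hochberg described in the abstract can be illustrated with a minimal sketch. It assumes the standard BH estimate n · p_(k) / k for the set R_k of the k smallest p-values, which the abstract does not spell out. The function names `bh_path` and `bh_select` and the example p-values are hypothetical, and the sketch does not implement the paper's uniform bound or its constant c_alg(α).

```python
def bh_path(p_values):
    """Return (order, fdp_hat) describing the BH path.

    order[:k] holds the indices of the k smallest p-values, i.e. R_k.
    fdp_hat[k-1] = n * p_(k) / k is the standard BH estimate for R_k
    (an assumption here; the abstract only says "algorithm-dependent").
    """
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    fdp_hat = [n * p_values[i] / k for k, i in enumerate(order, start=1)]
    return order, fdp_hat


def bh_select(p_values, q):
    """Reject R_{k*}, where k* is the largest k whose estimate is at most q."""
    order, fdp_hat = bh_path(p_values)
    # The estimates need not be monotone in k, so take the maximum, not the first failure.
    k_star = max((k for k, est in enumerate(fdp_hat, start=1) if est <= q), default=0)
    return sorted(order[:k_star])


# Hypothetical p-values, for illustration only.
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print(bh_select(p, 0.05))
print(bh_select(p, 0.09))
```

In this example the estimate for R_3 exceeds 0.09 while the estimate for R_5 does not, which is why k^* is defined through a maximum over the whole path rather than by stopping at the first k that fails.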
