Approximate selective inference via maximum likelihood
We consider an approximate version of the conditional approach to selective inference, which exploits randomization for a more efficient use of the information in the data at inference. The approximation bypasses potentially expensive MCMC sampling from conditional distributions in moderate dimensions. In the current paper, we address the problem of computationally tractable inference in the many practical scenarios where more than one exploratory query is conducted on the data to define, and perhaps redefine, models and their associated parameters. At the core of our maximum-likelihood-based method is a convex optimization problem, motivated by a large-deviations bound from Panigrahi (2016). The solution to this optimization yields an approximate pivot that delivers valid post-selective inference across a wide range of signal regimes. Orders of magnitude more efficient than MCMC sampling, our proposal adjusts for selection after multiple exploratory queries by solving only a single, tractable optimization, which takes a separable form across queries. An especially appealing feature of our method is that it allows the data analyst to pose several questions of the data before forming a target of interest, with the questions drawn from a very general class of convex learning programs. Through an in-depth simulation analysis, we illustrate the promise of our approach and provide comparisons with other post-selective methods in both randomized and non-randomized paradigms of inference.
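To give a flavor of the conditional approach with randomization, the following is a minimal toy sketch, not the paper's actual algorithm: a single Gaussian mean is reported only when a randomized statistic clears a threshold, and the selective MLE then maximizes the conditional likelihood, which (in this toy setting) is a one-dimensional convex program. All names, parameter values, and the golden-section solver are illustrative assumptions.

```python
import math

# Toy illustration (an assumption, not the paper's method): y ~ N(mu, sigma^2)
# is "selected" only when y + omega > t, with randomization omega ~ N(0, tau^2).
# Conditional on selection, the log-likelihood gains a correction term
# -log P_mu(y + omega > t); since log Phi is concave and tau > 0, the negative
# selective log-likelihood below is strictly convex in mu.

def norm_cdf(x):
    # Standard normal CDF via the error function.
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def selective_neg_loglik(mu, y, sigma, tau, t):
    # -log N(y; mu, sigma^2) + log P_mu(y + omega > t), up to constants.
    nll = 0.5 * ((y - mu) / sigma) ** 2
    s = math.sqrt(sigma ** 2 + tau ** 2)      # scale of y + omega
    sel_prob = norm_cdf((mu - t) / s)         # selection probability under mu
    return nll + math.log(sel_prob)

def selective_mle(y, sigma=1.0, tau=1.0, t=0.0, lo=-10.0, hi=10.0, iters=200):
    # Golden-section search on the convex 1-D objective.
    g = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    for _ in range(iters):
        c, d = b - g * (b - a), a + g * (b - a)
        if selective_neg_loglik(c, y, sigma, tau, t) < \
           selective_neg_loglik(d, y, sigma, tau, t):
            b = d
        else:
            a = c
    return 0.5 * (a + b)

y_obs = 1.5   # observed value, given that selection (y + omega > 0) occurred
mle = selective_mle(y_obs)
# The selection adjustment pulls the estimate below the naive MLE y_obs,
# correcting the winner's-curse bias induced by conditioning on selection.
print(round(mle, 3))
```

With multiple queries, the paper's objective separates across queries, so each query contributes its own low-dimensional correction term of this kind; the toy above shows only the single-query shape of that correction.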