Approximate co-sufficient sampling with regularization

09/14/2023
by Wanrong Zhu, et al.

In this work, we consider the problem of goodness-of-fit (GoF) testing for parametric models – for example, testing whether observed data follows a logistic regression model. This testing problem involves a composite null hypothesis, due to the unknown values of the model parameters. In some special cases, co-sufficient sampling (CSS) can remove the influence of these unknown parameters via conditioning on a sufficient statistic – often, the maximum likelihood estimator (MLE) of the unknown parameters. However, many common parametric settings (including logistic regression) do not permit this approach, since conditioning on a sufficient statistic leads to a powerless test. The recent approximate co-sufficient sampling (aCSS) framework of Barber and Janson (2022) offers an alternative, replacing sufficiency with an approximately sufficient statistic (namely, a noisy version of the MLE). This approach recovers power in a range of settings where CSS cannot be applied, but can only be applied in settings where the unconstrained MLE is well-defined and well-behaved, which implicitly assumes a low-dimensional regime. In this work, we extend aCSS to the setting of constrained and penalized maximum likelihood estimation, so that more complex estimation problems can now be handled within the aCSS framework, including examples such as mixtures-of-Gaussians (where the unconstrained MLE is not well-defined due to degeneracy) and high-dimensional Gaussian linear models (where the MLE can perform well under regularization, such as an ℓ_1 penalty or a shape constraint).
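To make the phrase "a noisy version of the MLE" concrete, the following is a minimal sketch of a penalized maximum likelihood estimate with a random linear perturbation added to the objective, for a small logistic regression model. It is an illustration under stated assumptions, not the authors' implementation: the ridge penalty, the noise scaling w ~ N(0, I_d/d), the plain gradient-descent solver, and all function and parameter names (`noisy_penalized_mle`, `lam`, `sigma`) are choices made for this example only.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def perturbed_gradient(theta, X, y, lam, sigma, w):
    # Gradient of the perturbed, penalized negative log-likelihood
    #   sum_i [log(1 + exp(x_i . theta)) - y_i * x_i . theta]
    #   + (lam / 2) * ||theta||^2 + sigma * w . theta
    d = len(theta)
    grad = [lam * theta[j] + sigma * w[j] for j in range(d)]
    for xi, yi in zip(X, y):
        residual = sigmoid(sum(a * b for a, b in zip(xi, theta))) - yi
        for j in range(d):
            grad[j] += residual * xi[j]
    return grad


def noisy_penalized_mle(X, y, lam, sigma, w, n_iter=2000):
    # Gradient descent with step 1/L, where L upper-bounds the Hessian norm.
    # The ridge term (lam > 0) makes the objective strongly convex, so the
    # minimizer exists and is unique even when the plain MLE would not.
    d = len(X[0])
    lipschitz = 0.25 * sum(sum(a * a for a in xi) for xi in X) + lam
    step = 1.0 / lipschitz
    theta = [0.0] * d
    for _ in range(n_iter):
        grad = perturbed_gradient(theta, X, y, lam, sigma, w)
        theta = [t - step * g for t, g in zip(theta, grad)]
    return theta


# Simulated data from a logistic model (hypothetical example values).
rng = random.Random(0)
n, d = 100, 3
true_theta = [1.0, -0.5, 0.0]
X = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
y = [
    1.0 if rng.random() < sigmoid(sum(a * b for a, b in zip(xi, true_theta))) else 0.0
    for xi in X
]

# Perturbation direction; the N(0, I_d / d) scaling is an assumption here.
w = [rng.gauss(0.0, 1.0 / math.sqrt(d)) for _ in range(d)]

theta_plain = noisy_penalized_mle(X, y, lam=1.0, sigma=0.0, w=w)
theta_noisy = noisy_penalized_mle(X, y, lam=1.0, sigma=5.0, w=w)
print("penalized MLE:      ", theta_plain)
print("noisy penalized MLE:", theta_noisy)
```

Setting `sigma=0.0` recovers the ordinary ridge-penalized estimate, so comparing the two outputs shows how much the perturbation moves the statistic that would be conditioned on. Generating the conditional copies of the data given this statistic is the harder part of the procedure and is not attempted here.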

research
07/20/2020

Testing goodness-of-fit and conditional independence with approximate co-sufficient sampling

Goodness-of-fit (GoF) testing is ubiquitous in statistics, with direct t...
research
06/10/2019

The Impact of Regularization on High-dimensional Logistic Regression

Logistic regression is commonly used for modeling dichotomous outcomes. ...
research
07/13/2016

Multiple-Instance Logistic Regression with LASSO Penalty

In this work, we consider a manufactory process which can be described b...
research
10/07/2020

High dimensional asymptotics of likelihood ratio tests in Gaussian sequence model under convex constraint

In the Gaussian sequence model Y=μ+ξ, we study the likelihood ratio test...
research
07/14/2023

Bounded-memory adjusted scores estimation in generalized linear models with large data sets

The widespread use of maximum Jeffreys'-prior penalized likelihood in bi...
research
06/08/2022

Identification testing via sample splitting – an application to Structural VAR models

In this article, a novel identification test is proposed, which can be a...
research
03/13/2018

Takeuchi's Information Criteria as a form of Regularization

Takeuchi's Information Criteria (TIC) is a linearization of maximum like...