
Non-parametric targeted Bayesian estimation of class proportions in unlabeled data

by Iván Díaz, et al.
Cornell University

We introduce a novel Bayesian estimator for the class proportion in an unlabeled dataset, based on the targeted learning framework. Our procedure requires the specification of a prior (and outputs a posterior) only for the target of inference, instead of the prior (and posterior) on the full-data distribution employed by classical non-parametric Bayesian methods. When the scientific question can be characterized by a low-dimensional parameter functional, focusing on a prior and posterior distribution for that parameter is more aligned with Bayesian subjectivism than focusing on entire data distributions. We prove a Bernstein-von Mises-type result for our proposed Bayesian procedure, which guarantees that the posterior distribution converges to the distribution of an efficient, asymptotically linear estimator. In particular, the posterior is Gaussian, doubly robust, and efficient in the limit, under the only assumption that certain nuisance parameters are estimated at slow rates. We perform numerical studies illustrating the frequentist properties of the method. We also illustrate its use in a motivating application to estimate the proportion of embolic strokes of undetermined source arising from occult cardiac sources or large-artery atherosclerotic lesions. Though we focus on the motivating example of the proportion of cases in an unlabeled dataset, the procedure is general and can be adapted to estimate any pathwise differentiable parameter in a non-parametric model.
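
To make the Bernstein-von Mises-type statement concrete, here is a generic sketch of what such a result typically says. The notation is assumed for illustration and is not taken from the paper: $\psi_0$ is the target parameter (here, the class proportion), $\hat\psi_n$ is an asymptotically linear estimator based on observations $O_1,\dots,O_n$, $D$ is its influence function, and $\Pi_n(\cdot \mid O_1,\dots,O_n)$ is the posterior distribution for the target parameter. The paper's exact mode of convergence and regularity conditions may differ.

```latex
% Asymptotic linearity of the estimator (generic notation, assumed for illustration)
\sqrt{n}\,(\hat\psi_n - \psi_0)
  = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} D(O_i) + o_P(1),
\qquad
\sigma^2 = \operatorname{Var}\{D(O)\}.

% Bernstein-von Mises-type statement: the posterior is asymptotically Gaussian,
% centered at the estimator, with variance sigma^2 / n
\left\| \Pi_n(\cdot \mid O_1,\dots,O_n) - N\!\left(\hat\psi_n,\ \sigma^2/n\right) \right\|_{\mathrm{TV}}
  \xrightarrow{\;P\;} 0.
```

When $D$ is the efficient influence function, $\sigma^2$ equals the non-parametric efficiency bound, so posterior credible intervals of this form agree asymptotically with Wald-type confidence intervals built around an efficient estimator.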



