BNPdensity: Bayesian nonparametric mixture modeling in R

by Julyan Arbel, et al.

Robust statistical data modelling under potential model misspecification often requires leaving the parametric world for the nonparametric one. In the latter, parameters are infinite-dimensional objects such as functions, probability distributions or infinite vectors. In the Bayesian nonparametric approach, prior distributions are designed for these parameters, which provides a handle for managing the complexity of nonparametric models in practice. However, most modern Bayesian nonparametric models often seem out of reach for practitioners, as inference algorithms need careful design to deal with the infinite number of parameters. The aim of this work is to facilitate the journey by providing computational tools for Bayesian nonparametric inference. The article describes a set of functions available in the R package BNPdensity for carrying out density estimation with an infinite mixture model, including for all types of censored data. The package provides access to a large class of such models based on normalized random measures, which generalize the popular Dirichlet process mixture. One striking advantage of this generalization is that it offers much more robust priors on the number of clusters than the Dirichlet process. Another crucial advantage is complete flexibility in specifying the prior for the scale and location parameters of the clusters, because conjugacy is not required. Inference is performed using a theoretically grounded approximate sampling methodology known as the Ferguson-Klass algorithm. The package also offers several goodness-of-fit diagnostics, such as QQ-plots and a cross-validation criterion, the conditional predictive ordinate. The proposed methodology is illustrated on a classical ecological risk assessment method called the Species Sensitivity Distribution (SSD) problem, showcasing the benefits of the Bayesian nonparametric framework.
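The remark about the prior on the number of clusters can be made concrete with a small sketch. Under a Dirichlet process with concentration parameter alpha (the symbol and the function name below are assumptions for illustration, not taken from the article), the prior expected number of clusters among n observations is the sum of alpha / (alpha + i) for i = 0, ..., n - 1. The Python snippet below only evaluates this standard formula; it is not part of BNPdensity and does not use its API. It suggests how strongly a single parameter drives the prior on the number of clusters in the Dirichlet case.

```python
def expected_clusters_dp(n: int, alpha: float) -> float:
    """Prior expected number of clusters among n observations under a
    Dirichlet process with concentration parameter alpha (hypothetical helper)."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    # Observation i + 1 opens a new cluster with probability alpha / (alpha + i).
    return sum(alpha / (alpha + i) for i in range(n))


if __name__ == "__main__":
    # Illustrative values only: vary alpha for a fixed sample size.
    for alpha in (0.1, 1.0, 10.0):
        k = expected_clusters_dp(100, alpha)
        print(f"alpha={alpha:>4}: expected clusters for n=100 is {k:.2f}")
```

The expected count grows with alpha for a fixed n, so the choice of alpha alone largely determines the prior on the number of clusters in this special case.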




