Variational Inference for Semiparametric Bayesian Novelty Detection in Large Datasets

12/04/2022
by Luca Benedetti, et al.
After being trained on a fully labeled training set, in which the observations are grouped into a certain number of known classes, novelty detection methods aim to classify the instances of an unlabeled test set while allowing for the presence of previously unseen classes. These models are valuable in many areas, ranging from social network and food adulteration analyses to biology, where an evolving population may be present. In this paper, we focus on a two-stage Bayesian semiparametric novelty detector, known as Brand, recently introduced in the literature. Leveraging a model-based mixture representation, Brand assigns each test observation either to one of the known training terms or to a single novelty term. The novelty term is itself modeled with a Dirichlet Process mixture model to flexibly capture any departure from the known patterns. Brand was originally estimated using MCMC schemes, which are prohibitively costly when applied to high-dimensional data. To scale Brand to large datasets, we propose a variational Bayes approach, which provides an efficient algorithm for posterior approximation. Thorough simulation studies demonstrate a significant gain in efficiency together with excellent classification performance. Finally, to showcase its applicability, we perform a novelty detection analysis on the openly available Statlog dataset, a large collection of satellite imaging spectra, to search for novel soil types.
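
The mixture representation described above can be illustrated with a minimal sketch. It is not the authors' algorithm: it only shows how an observation could be scored against a density made of known class terms plus one novelty term, where the novelty term is a truncated stick-breaking mixture standing in for the Dirichlet Process mixture. Univariate Gaussian components, fixed parameters, and all names (`stick_breaking`, `novelty_probability`, the numeric values) are assumptions made for illustration; no MCMC or variational fitting is performed.

```python
import math


def normal_pdf(y, mean, sd):
    """Density of a univariate Gaussian at y."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def stick_breaking(fractions):
    """Convert stick-breaking fractions into mixture weights.

    The last weight takes whatever is left of the stick, so the
    truncated weights sum to one.
    """
    weights, remaining = [], 1.0
    for frac in fractions:
        weights.append(frac * remaining)
        remaining *= 1.0 - frac
    weights.append(remaining)
    return weights


def novelty_probability(y, known, pi_known, pi_novel, novel_weights, novel_atoms):
    """Probability that y belongs to the novelty term.

    known / novel_atoms are lists of (mean, sd) pairs (illustrative values).
    """
    known_terms = [p * normal_pdf(y, m, s) for p, (m, s) in zip(pi_known, known)]
    novel_density = sum(
        w * normal_pdf(y, m, s) for w, (m, s) in zip(novel_weights, novel_atoms)
    )
    novel_term = pi_novel * novel_density
    return novel_term / (sum(known_terms) + novel_term)


# Two known classes learned from training data (hypothetical parameters).
known = [(0.0, 1.0), (5.0, 1.0)]
pi_known = [0.45, 0.45]
pi_novel = 0.10

# Novelty term: three components from a truncated stick-breaking prior.
novel_weights = stick_breaking([0.5, 0.5])
novel_atoms = [(-6.0, 1.0), (12.0, 1.0), (20.0, 2.0)]

for y in (0.0, 5.0, 12.0):
    p = novelty_probability(y, known, pi_known, pi_novel, novel_weights, novel_atoms)
    print(f"y = {y:5.1f}  P(novelty | y) = {p:.4f}")
```

In this toy setting, points near the known class means receive a novelty probability close to zero, while a point near one of the novelty atoms is assigned almost entirely to the novelty term.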

research
06/16/2020

A Two-Stage Bayesian Nonparametric Model for Novelty Detection with Robust Prior Information

Standard novelty detection methods aim at bi-partitioning the test units...
research
02/28/2018

Novelty Detection with GAN

The ability of a classifier to recognize unknown inputs is important for...
research
04/09/2019

Generative Models for Novelty Detection: Applications in abnormal event and situational change detection from data series

Novelty detection is a process for distinguishing the observations that ...
research
03/05/2019

Probabilistic Modeling for Novelty Detection with Applications to Fraud Identification

Novelty detection is the unsupervised problem of identifying anomalies i...
research
04/21/2016

Novelty Detection in MultiClass Scenarios with Incomplete Set of Class Labels

We address the problem of novelty detection in multiclass scenarios wher...
research
11/17/2020

Measuring the Novelty of Natural Language Text Using the Conjunctive Clauses of a Tsetlin Machine Text Classifier

Most supervised text classification approaches assume a closed world, co...
research
03/25/2021

Margin-free classification and new class detection using finite Dirichlet mixtures

We present a margin-free finite mixture model which allows us to simulta...
