Learning beyond Predefined Label Space via Bayesian Nonparametric Topic Modelling

10/10/2019
by   Changying Du, et al.
14

In real world machine learning applications, testing data may contain some meaningful new categories that have not been seen in labeled training data. To simultaneously recognize new data categories and assign most appropriate category labels to the data actually from known categories, existing models assume the number of unknown new categories is pre-specified, though it is difficult to determine in advance. In this paper, we propose a Bayesian nonparametric topic model to automatically infer this number, based on the hierarchical Dirichlet process and the notion of latent Dirichlet allocation. Exact inference in our model is intractable, so we provide an efficient collapsed Gibbs sampling algorithm for approximate posterior inference. Extensive experiments on various text data sets show that: (a) compared with parametric approaches that use pre-specified true number of new categories, the proposed nonparametric approach can yield comparable performance; and (b) when the exact number of new categories is unavailable, i.e. the parametric approaches only have a rough idea about the new categories, our approach has evident performance advantages.

READ FULL TEXT

page 1

page 2

page 3

page 4

page 6

page 9

page 11

page 15

research
09/22/2016

Nonparametric Bayesian Topic Modelling with the Hierarchical Pitman-Yor Processes

The Dirichlet process and its extension, the Pitman-Yor process, are sto...
research
06/20/2012

Nonparametric Bayes Pachinko Allocation

Recent advances in topic models have explored complicated structured dis...
research
01/16/2015

Bayesian Nonparametrics in Topic Modeling: A Brief Tutorial

Using nonparametric methods has been increasingly explored in Bayesian h...
research
10/27/2016

Geometric Dirichlet Means algorithm for topic inference

We propose a geometric algorithm for topic learning and inference that i...
research
06/29/2018

Hierarchical Dirichlet Process-based Open Set Recognition

In this paper, we proposed a novel hierarchical dirichlet process-based ...
research
06/29/2018

Nonparametric learning from Bayesian models with randomized objective functions

Bayesian learning is built on an assumption that the model space contain...
research
07/18/2017

Cooperative Hierarchical Dirichlet Processes: Superposition vs. Maximization

The cooperative hierarchical structure is a common and significant data ...

Please sign up or login with your details

Forgot password? Click here to reset