Accurate parameter estimation for Bayesian Network Classifiers using Hierarchical Dirichlet Processes

08/25/2017
by Francois Petitjean, et al.

This paper introduces a novel parameter estimation method for the probability tables of Bayesian Network Classifiers (BNCs), using Hierarchical Dirichlet Processes (HDPs). The main result of this paper is to show that proper parameter estimation allows BNCs to outperform leading learning methods such as Random Forest for both 0-1 loss and RMSE, albeit only on categorical datasets. As data assets grow larger and enter the hyped world of big data, accurate classification requires three main elements: (1) low-bias classifiers that can capture the fine detail of large datasets; (2) out-of-core learners that can learn from data without having to hold it all in main memory; and (3) models that can classify new data very efficiently. The latest Bayesian Network Classifiers satisfy these requirements. Their bias can be controlled easily by increasing the number of parents of the nodes in the graph. Their structure can be learned out of core with a limited number of passes over the data. However, as the bias is lowered to model classification tasks more accurately, the accuracy of their parameter estimates drops as well. In this paper, we introduce the use of Hierarchical Dirichlet Processes for accurate parameter estimation of BNCs. We conduct an extensive set of experiments on 68 standard datasets and demonstrate that our resulting classifiers perform very competitively with Random Forest in terms of prediction, while keeping the out-of-core capability and superior classification time.
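To make the estimation problem concrete, the sketch below shows one way a probability-table entry can borrow strength from less specific contexts. It is not the paper's HDP estimator: it swaps in a much simpler recursive m-estimate (back-off smoothing) purely for illustration. The function name `backoff_estimate`, the smoothing weight `m`, and the toy data are all assumptions, not taken from the paper.

```python
def backoff_estimate(rows, value, context, n_values, m=1.0):
    """Estimate P(x = value | context) with recursive m-estimate back-off.

    Illustrative stand-in only, NOT the HDP method from the paper.

    rows     : list of (x, ctx) pairs, where ctx is a tuple of parent values
    context  : tuple of parent values (a prefix of the full parent tuple)
    n_values : number of distinct values x can take
    m        : hypothetical smoothing weight given to the less specific estimate
    """
    k = len(context)
    # Keep only the rows whose first k parent values match the context.
    matching = [x for x, ctx in rows if ctx[:k] == context]
    count_value = sum(1 for x in matching if x == value)

    if k == 0:
        # No parents left: back off to a uniform prior over the values of x.
        prior = 1.0 / n_values
    else:
        # Drop the last parent and use the smoothed estimate as the prior.
        prior = backoff_estimate(rows, value, context[:-1], n_values, m)

    return (count_value + m * prior) / (len(matching) + m)


# Toy data: x takes values "a"/"b"; each row has two parent values.
rows = [
    ("a", ("c1", "p1")),
    ("a", ("c1", "p1")),
    ("b", ("c1", "p2")),
    ("a", ("c2", "p1")),
]

seen = backoff_estimate(rows, "a", ("c1", "p1"), n_values=2)
unseen = backoff_estimate(rows, "a", ("c3", "p9"), n_values=2)
root = backoff_estimate(rows, "a", (), n_values=2)
```

Adding parents splits the data across more contexts, so many table entries end up with few or no observations. In this sketch, a context that never occurs in the data simply inherits the estimate of its less specific ancestor instead of producing an undefined or zero probability.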
