Bayesian Nonparametric Classification for Incomplete Data With a High Missing Rate: an Application to Semiconductor Manufacturing Data

07/30/2021
by Sewon Park, et al.

Predicting yield is an important problem in semiconductor manufacturing: detecting defective products early in the manufacturing process can save substantial production costs. The data generated by the semiconductor manufacturing process are highly non-normal, have complicated missing patterns, and have a high missing rate, all of which complicate yield prediction. We propose the Dirichlet process-naive Bayes model (DPNB), a classification method based on mixtures of Dirichlet processes and the naive Bayes model. Because the DPNB is based on mixtures of Dirichlet processes and learns the joint distribution of all variables involved, it can handle highly non-normal data and can make predictions for a test dataset with any missing pattern. The DPNB also performs well at high missing rates because it uses all the information in the observed components. Experiments on various real datasets, including semiconductor manufacturing data, show that the DPNB performs better than MICE and MissForest at predicting missing values as the percentage of missing values increases.
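To illustrate why a mixture of naive Bayes components can predict under any missing pattern, the following is a minimal sketch and not the authors' implementation. It assumes a finite set of already-fitted mixture components with Gaussian feature kernels; the function names (`normal_logpdf`, `class_posterior`) and all parameter values are hypothetical. Because features are independent within a component, a missing feature integrates out to 1, so the class posterior only needs the observed features.

```python
import math


def normal_logpdf(x, mean, sd):
    """Log density of a univariate normal distribution."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)


def class_posterior(x, weights, means, sds, class_probs):
    """Class posterior p(y | observed part of x) under a naive Bayes mixture.

    x           : list of feature values, with None marking a missing entry
    weights     : mixture weight of each component (assumed already fitted)
    means, sds  : per-component, per-feature Gaussian parameters (assumed)
    class_probs : per-component class probabilities p(y | component)
    """
    # Log responsibility of each component, using observed features only.
    log_resp = []
    for w, mu, sd in zip(weights, means, sds):
        lp = math.log(w)
        for xj, m, s in zip(x, mu, sd):
            if xj is not None:  # missing features contribute a factor of 1
                lp += normal_logpdf(xj, m, s)
        log_resp.append(lp)

    # Normalize the responsibilities in a numerically stable way.
    top = max(log_resp)
    resp = [math.exp(lp - top) for lp in log_resp]
    total = sum(resp)
    resp = [r / total for r in resp]

    # Mix the per-component class probabilities by the responsibilities.
    n_classes = len(class_probs[0])
    return [
        sum(r * cp[c] for r, cp in zip(resp, class_probs))
        for c in range(n_classes)
    ]


# Hypothetical two-component, two-feature, two-class model.
weights = [0.6, 0.4]
means = [[0.0, 0.0], [5.0, 5.0]]
sds = [[1.0, 1.0], [1.0, 1.0]]
class_probs = [[0.9, 0.1], [0.2, 0.8]]

# Second feature is missing; the prediction uses the first feature alone.
posterior = class_posterior([0.0, None], weights, means, sds, class_probs)
```

In this sketch a record with every feature missing falls back to the prior class probabilities implied by the mixture weights, and each additional observed feature only sharpens the component responsibilities. The Dirichlet process prior, which lets the number of components be learned from data, is not modeled here.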

research
12/02/2017

Efficient Bayesian Nonparametric Inference for Categorical Data with General High Missingness

Missingness in categorical data is a common problem in various real appl...
research
11/22/2017

Variational Bayesian Inference For A Scale Mixture Of Normal Distributions Handling Missing Data

In this paper, a scale mixture of Normal distributions model is develope...
research
12/17/2014

The supervised hierarchical Dirichlet process

We propose the supervised hierarchical Dirichlet process (sHDP), a nonpa...
research
06/20/2009

Automatic Defect Detection and Classification Technique from Image: A Special Case Using Ceramic Tiles

Quality control is an important issue in the ceramic tile industry. On t...
research
12/12/2002

Data Engineering for the Analysis of Semiconductor Manufacturing Data

We have analyzed manufacturing data from several different semiconductor...
research
02/22/2021

PCB-Fire: Automated Classification and Fault Detection in PCB

Printed Circuit Boards are the foundation for the functioning of any ele...
research
06/07/2023

Causally Learning an Optimal Rework Policy

In manufacturing, rework refers to an optional step of a production proc...
