A Data Mining Approach to the Diagnosis of Tuberculosis by Cascading Clustering and Classification

08/04/2011
by Asha. T, et al.

In this paper, a methodology for the automated detection and classification of Tuberculosis (TB) is presented. Tuberculosis is a disease caused by Mycobacterium, which spreads through the air and readily infects people with weakened immune systems. Our methodology is based on clustering and classification, and it classifies TB into two categories: Pulmonary Tuberculosis (PTB) and retroviral PTB (RPTB), that is, PTB in patients with Human Immunodeficiency Virus (HIV) infection. Initially, K-means clustering is used to group the TB data into two clusters, and a class is assigned to each cluster. Subsequently, several different classification algorithms are trained on the resulting labeled set to build the final classifier model using the K-fold cross-validation method. The methodology is evaluated on 700 raw TB records obtained from a city hospital. The best obtained accuracy was 98.7%. The proposed approach helps doctors in their diagnosis decisions and in planning treatment for the different categories.
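The cascade described in the abstract (cluster first, treat the cluster assignments as class labels, then train and cross-validate a classifier) can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the authors' implementation: the data are synthetic two-dimensional points rather than the hospital records, a nearest-centroid classifier stands in for the classifiers compared in the paper, and every function name (`kmeans`, `fit_centroids`, `kfold_accuracy`) is hypothetical.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(rows):
    """Column-wise mean of a non-empty list of feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def nearest(p, centroids):
    """Index of the centroid closest to point p."""
    return min(range(len(centroids)), key=lambda c: sq_dist(p, centroids[c]))

def kmeans(points, k=2, iters=50, seed=0):
    """Plain Lloyd's k-means; returns one cluster index per point."""
    centroids = random.Random(seed).sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        labels = [nearest(p, centroids) for p in points]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            updated.append(mean(members) if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return labels

def fit_centroids(X, y):
    """Stand-in classifier (assumption): one mean vector per class."""
    return {c: mean([x for x, t in zip(X, y) if t == c]) for c in set(y)}

def predict(model, x):
    """Predict the class whose mean vector is closest to x."""
    return min(model, key=lambda c: sq_dist(x, model[c]))

def kfold_accuracy(X, y, k=10, seed=0):
    """Overall accuracy across k held-out folds."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    correct = 0
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        model = fit_centroids([X[i] for i in train], [y[i] for i in train])
        correct += sum(predict(model, X[i]) == y[i] for i in fold)
    return correct / len(X)

# Synthetic stand-in data (assumption): two well-separated groups.
rng = random.Random(42)
data = ([[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(100)] +
        [[rng.gauss(8, 1), rng.gauss(8, 1)] for _ in range(100)])

# Stage 1: cluster into two groups; the cluster index becomes the class label.
cluster_labels = kmeans(data, k=2)

# Stage 2: train and evaluate a classifier on those labels with 10-fold CV.
accuracy = kfold_accuracy(data, cluster_labels, k=10)
print(f"10-fold accuracy on cluster-derived labels: {accuracy:.3f}")
```

Note that the accuracy measured this way reflects how well the classifier reproduces the cluster-derived labels. Mapping the two cluster indices to clinical categories such as PTB and RPTB is a separate step that this sketch does not attempt.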

05/21/2021

An Explainable Classification Model for Chronic Kidney Disease Patients

Currently, Chronic Kidney Disease (CKD) is experiencing a globally incre...
10/24/2019

Characterization and Development of Average Silhouette Width Clustering

The purpose of this paper is to introduce a new clustering methodology....
10/17/2018

A Disease Diagnosis and Treatment Recommendation System Based on Big Data Mining and Cloud Computing

It is crucial to provide compatible treatment schemes for a disease acco...
09/23/2015

Predicting Climate Variability over the Indian Region Using Data Mining Strategies

In this paper an approach based on expectation maximization (EM) cluster...
10/10/2018

Introducing a hybrid model of DEA and data mining in evaluating efficiency. Case study: Bank Branches

The banking industry is very important for an economic cycle of each cou...
08/27/2020

reval: a Python package to determine the best number of clusters with stability-based relative clustering validation

Determining the number of clusters that best partitions a dataset can be...
01/19/2022

HPCGen: Hierarchical K-Means Clustering and Level Based Principal Components for Scan Path Genaration

In this paper, we present a new approach for decomposing scan paths and ...