A generalized flow for multi-class and binary classification tasks: An Azure ML approach

03/26/2016
by   Matthew Bihis, et al.

The constant growth of present-day real-world databases poses computational challenges for a single computer. Cloud-based platforms, on the other hand, can handle large-volume data manipulation tasks, which makes them necessary for computations on large real-world data sets. This work focuses on creating a novel Generalized Flow within the cloud-based computing platform Microsoft Azure Machine Learning Studio (MAMLS) that accepts multi-class and binary classification data sets alike and processes them to maximize the overall classification accuracy. First, each data set is split into training and testing data sets. Then, linear and nonlinear classification model parameters are estimated using the training data set. Data dimensionality reduction is then performed to maximize classification accuracy. For multi-class data sets, data-centric information is used to further improve overall classification accuracy by reducing the multi-class classification to a series of hierarchical binary classification tasks. Finally, the performance of the optimized classification model thus obtained is evaluated and scored on the testing data set. The classification characteristics of the proposed flow are comparatively evaluated on 3 public data sets and a local data set with respect to existing state-of-the-art methods. On the 3 public data sets, the proposed flow achieves 78-97.5% classification accuracy. On the local data set, created using information about the presence of Diabetic Retinopathy lesions in fundus images, the flow achieves 85.3-95.7% classification accuracy, which is higher than the existing methods. Thus, the proposed generalized flow can be useful for a wide range of application-oriented "big data sets".
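The split, hierarchical binary reduction, and scoring steps described in the abstract can be illustrated with a minimal, standard-library Python sketch. Everything in it is an assumption made for illustration: the nearest-centroid rule stands in for the linear and nonlinear classifiers the abstract mentions, the synthetic three-class data and the class order `["a", "b", "c"]` are hypothetical, and none of the function names correspond to MAMLS modules. Dimensionality reduction is omitted to keep the sketch short.

```python
import random
from statistics import mean

def centroid(rows):
    """Component-wise mean of a list of feature vectors."""
    return [mean(col) for col in zip(*rows)]

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def train_test_split(X, y, test_fraction=0.3, seed=0):
    """Shuffle sample indices and split them into training and testing sets."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    n_test = round(len(idx) * test_fraction)
    test, train = idx[:n_test], idx[n_test:]
    return ([X[i] for i in train], [y[i] for i in train],
            [X[i] for i in test], [y[i] for i in test])

def fit_hierarchy(X, y, class_order):
    """Fit one binary stage (target class vs. all later classes) per level.

    Assumption: each stage is a nearest-centroid rule, used here only as a
    stand-in for whatever binary classifier a real flow would train.
    """
    stages = []
    for level, target in enumerate(class_order[:-1]):
        rest = set(class_order[level + 1:])
        pos = [x for x, label in zip(X, y) if label == target]
        neg = [x for x, label in zip(X, y) if label in rest]
        stages.append((target, centroid(pos), centroid(neg)))
    # The last class is whatever remains after every binary stage says "rest".
    return stages, class_order[-1]

def predict(model, x):
    """Walk the binary stages in order and stop at the first one that accepts x."""
    stages, fallback = model
    for target, pos_c, neg_c in stages:
        if sq_dist(x, pos_c) <= sq_dist(x, neg_c):
            return target
    return fallback

def accuracy(model, X, y):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(predict(model, x) == label for x, label in zip(X, y)) / len(y)

# Hypothetical toy data: three well-separated 2-D clusters, 40 samples each.
rng = random.Random(42)
centers = {"a": (0.0, 0.0), "b": (5.0, 5.0), "c": (10.0, 0.0)}
X, y = [], []
for label, (cx, cy) in centers.items():
    for _ in range(40):
        X.append([rng.gauss(cx, 0.5), rng.gauss(cy, 0.5)])
        y.append(label)

X_train, y_train, X_test, y_test = train_test_split(X, y)
model = fit_hierarchy(X_train, y_train, class_order=["a", "b", "c"])
print("test accuracy:", accuracy(model, X_test, y_test))
```

The order in which classes are peeled off is a design choice. The abstract says data-centric information guides the reduction, but it does not state the rule, so the fixed order above is only a placeholder.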

research
11/16/2019

An "outside the box" solution for imbalanced data classification

A common problem of the real-world data sets is the class imbalance, whi...
research
03/07/2023

Scatter-based common spatial patterns – a unified spatial filtering framework

The common spatial pattern (CSP) approach is known as one of the most po...
research
03/26/2016

Classification of Large-Scale Fundus Image Data Sets: A Cloud-Computing Framework

Large medical image data sets with high dimensionality require substanti...
research
06/09/2022

Multi-class Classification with Fuzzy-feature Observations: Theory and Algorithms

The theoretical analysis of multi-class classification has proved that t...
research
11/30/2017

On the importance of normative data in speech-based assessment

Data sets for identifying Alzheimer's disease (AD) are often relatively ...
research
09/06/2022

Multi-class Classifier based Failure Prediction with Artificial and Anonymous Training for Data Privacy

This paper proposes a novel non-intrusive system failure prediction tech...
research
01/30/2020

Better Multi-class Probability Estimates for Small Data Sets

Many classification applications require accurate probability estimates ...
