Data Separability for Neural Network Classifiers and the Development of a Separability Index

05/27/2020
by Shuyue Guan, et al.

In machine learning, the performance of a classifier depends on both the classifier model and the dataset. For a specific neural network classifier, the training process varies with the training set used: some training data make the training accuracy converge quickly to high values, while other data lead to slow convergence to lower accuracy. To quantify this phenomenon, we created the Distance-based Separability Index (DSI), which is independent of the classifier model, to measure the separability of datasets. In this paper, we consider the situation in which data from different classes are mixed together in the same distribution to be the most difficult for classifiers to separate, and we show that the DSI can indicate whether data belonging to different classes have similar distributions. When we compare our proposed approach with several existing separability/complexity measures on synthetic and real datasets, the results show that the DSI is an effective separability measure. We also discuss possible applications of the DSI in the fields of data science, machine learning, and deep learning.
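
The abstract does not give the DSI formula, so the following is only a rough sketch of how a distance-based, classifier-independent separability score might be computed. It assumes (this is an assumption, not a statement from the abstract) that for each class the distribution of intra-class distances is compared with the distribution of distances to all other points using a two-sample Kolmogorov-Smirnov statistic, and that the per-class values are averaged. The names `pairwise_distances`, `ks_statistic`, and `separability_index` are hypothetical helpers introduced for illustration.

```python
import math
import random
from itertools import combinations


def pairwise_distances(a, b=None):
    """Euclidean distances within `a` (if b is None) or between `a` and `b`."""
    if b is None:
        return [math.dist(p, q) for p, q in combinations(a, 2)]
    return [math.dist(p, q) for p in a for q in b]


def ks_statistic(xs, ys):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    xs, ys = sorted(xs), sorted(ys)
    i = j = 0
    gap = 0.0
    while i < len(xs) and j < len(ys):
        v = min(xs[i], ys[j])
        # Advance both samples past the current value so ties are handled together.
        while i < len(xs) and xs[i] <= v:
            i += 1
        while j < len(ys) and ys[j] <= v:
            j += 1
        gap = max(gap, abs(i / len(xs) - j / len(ys)))
    return gap


def separability_index(points, labels):
    """Assumed form of a distance-based index: mean per-class KS statistic
    between intra-class distances and distances to all other classes.
    Each class needs at least two points, and at least two classes must exist."""
    scores = []
    for c in sorted(set(labels)):
        inside = [p for p, y in zip(points, labels) if y == c]
        outside = [p for p, y in zip(points, labels) if y != c]
        intra = pairwise_distances(inside)
        between = pairwise_distances(inside, outside)
        scores.append(ks_statistic(intra, between))
    return sum(scores) / len(scores)


def gaussian_cloud(rng, center, n):
    """Draw n 2-D points around `center` with unit standard deviation."""
    return [(rng.gauss(center[0], 1.0), rng.gauss(center[1], 1.0)) for _ in range(n)]


rng = random.Random(0)
n = 40
labels = [0] * n + [1] * n

# Two classes far apart: intra-class and between-class distances differ sharply.
separated = gaussian_cloud(rng, (0, 0), n) + gaussian_cloud(rng, (10, 10), n)
# Two classes drawn from the same distribution: the hardest case described above.
mixed = gaussian_cloud(rng, (0, 0), n) + gaussian_cloud(rng, (0, 0), n)

separated_score = separability_index(separated, labels)
mixed_score = separability_index(mixed, labels)
print(f"separated: {separated_score:.3f}  mixed: {mixed_score:.3f}")
```

Under these assumptions, a score near 1 suggests that the classes occupy distinct regions, while a score near 0 suggests that the classes follow similar distributions, which matches the mixed case the abstract identifies as most difficult.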


research
09/11/2021

A Novel Intrinsic Measure of Data Separability

In machine learning, the performance of a classifier depends on both the...
research
11/10/2022

A classification performance evaluation measure considering data separability

Machine learning and deep learning classification models are data-driven...
research
12/05/2012

Making Early Predictions of the Accuracy of Machine Learning Applications

The accuracy of machine learning systems is a widely studied research to...
research
02/06/2015

Unsupervised Fusion Weight Learning in Multiple Classifier Systems

In this paper we present an unsupervised method to learn the weights wit...
research
10/28/2020

Predicting Classification Accuracy when Adding New Unobserved Classes

Multiclass classifiers are often designed and evaluated only on a sample...
research
08/04/2014

Multithreshold Entropy Linear Classifier

Linear classifiers separate the data with a hyperplane. In this paper we...
research
11/15/2018

Exploiting Class Learnability in Noisy Data

In many domains, collecting sufficient labeled training data for supervi...
