A classification performance evaluation measure considering data separability

11/10/2022
by   Lingyan Xue, et al.

Machine learning and deep learning classification models are data-driven, so their classification performance is determined jointly by the model and the data. Evaluating a model solely by classifier accuracy while ignoring data separability gives a biased picture: a model may exhibit excellent accuracy simply because it was tested on highly separable data. Most current data separability measures are defined in terms of the distance between sample points, an approach that has been shown to fail in several circumstances. In this paper, we propose a new separability measure, the rate of separability (RS), which is based on the data coding rate. We validate its effectiveness as a complement to existing separability measures by comparing it with four distance-based measures on synthetic datasets. We then demonstrate a positive correlation between the proposed measure and recognition accuracy in a multi-task scenario constructed from a real dataset. Finally, we discuss methods for evaluating the classification performance of machine learning and deep learning models that take data separability into account.
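The abstract does not give the definition of RS, so the following is only an illustrative sketch of the kind of quantity a coding-rate-based measure builds on, not the paper's measure. It assumes the standard lossy coding rate R(X, eps) = 1/2 log det(I + d/(n eps^2) X^T X) for n samples in d dimensions, and compares the rate of the whole dataset with the size-weighted rates of the individual classes. The function names `coding_rate` and `coding_rate_gap` and the default `eps=0.5` are hypothetical choices made for this example.

```python
import numpy as np


def coding_rate(X, eps=0.5):
    """Lossy coding rate of the rows of X (shape n x d) at distortion eps.

    Uses the uncentered second-moment matrix X^T X; eps is an assumed default.
    """
    n, d = X.shape
    scaled = (d / (n * eps ** 2)) * (X.T @ X)
    # slogdet is numerically safer than log(det(...)) for large values.
    _, logdet = np.linalg.slogdet(np.eye(d) + scaled)
    return 0.5 * logdet


def coding_rate_gap(X, y, eps=0.5):
    """Rate of the whole dataset minus the size-weighted per-class rates.

    Illustrative stand-in only; NOT the RS measure defined in the paper.
    """
    n = X.shape[0]
    per_class = sum(
        (np.sum(y == c) / n) * coding_rate(X[y == c], eps)
        for c in np.unique(y)
    )
    return coding_rate(X, eps) - per_class


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 200
    y = np.repeat([0, 1], n)
    # Two classes spread along different axes.
    X_sep = np.vstack([
        rng.normal(size=(n, 2)) + [5.0, 0.0],
        rng.normal(size=(n, 2)) + [0.0, 5.0],
    ])
    # Two classes drawn from the same distribution.
    X_mix = rng.normal(size=(2 * n, 2)) + [5.0, 0.0]
    print("gap, classes along different axes:", coding_rate_gap(X_sep, y))
    print("gap, identical class distributions:", coding_rate_gap(X_mix, y))
```

Because this sketch uses uncentered second moments, the gap responds to classes occupying different directions in feature space rather than to the distance between class means, and it stays close to zero when the classes share one distribution.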


