How Complex is your classification problem? A survey on measuring classification complexity

08/10/2018
by Ana C. Lorena, et al.

Extracting characteristics from the training datasets of classification problems has proven effective in a number of meta-analyses. Among these characteristics, measures of classification complexity estimate how difficult it is to separate the data points into their expected classes. Existing measures include descriptors of the spatial distribution of the data and estimates of the shape and size of the decision boundary. This information can support the formulation of new data-driven pre-processing and pattern recognition techniques, which can in turn focus on the challenging characteristics of the problems. This paper surveys and analyzes measures that can be extracted from training datasets to characterize the complexity of the respective classification problems. Their use in recent literature is also reviewed and discussed, which makes it possible to identify opportunities for future work in the area. Finally, the paper describes an R package named Extended Complexity Library (ECoL), which implements a set of complexity measures and is publicly available.
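
To make the idea of a complexity measure concrete, the sketch below computes a two-class Fisher discriminant ratio per feature, (mean_a - mean_b)^2 / (var_a + var_b), and takes the largest value over all features. This is a classic measure from the complexity literature, chosen here only as an illustration; the abstract does not specify any formula, and the function names, the toy data, and the choice of population variance are assumptions, not the ECoL API or the authors' exact definition.

```python
from statistics import mean, pvariance


def fisher_ratio(feature_a, feature_b):
    """Two-class Fisher discriminant ratio for a single feature.

    Assumption: population variance is used; other formulations exist.
    """
    num = (mean(feature_a) - mean(feature_b)) ** 2
    den = pvariance(feature_a) + pvariance(feature_b)
    if den == 0:
        # Both classes are constant on this feature: the classes are either
        # perfectly separated (different constants) or indistinguishable.
        return float("inf") if num > 0 else 0.0
    return num / den


def max_fisher_ratio(class_a, class_b):
    """Largest per-feature ratio. Rows are samples, columns are features."""
    cols_a = list(zip(*class_a))  # transpose rows into feature columns
    cols_b = list(zip(*class_b))
    return max(fisher_ratio(a, b) for a, b in zip(cols_a, cols_b))


# Hypothetical toy data: the first feature separates the classes well,
# the second one barely does.
class_a = [(1.0, 5.0), (2.0, 6.0), (3.0, 7.0)]
class_b = [(7.0, 5.5), (8.0, 6.5), (9.0, 7.5)]
score = max_fisher_ratio(class_a, class_b)
```

A larger value suggests that at least one feature separates the two classes well on its own, so the problem is likely simpler along that axis. Measures of the decision boundary or of the spatial distribution of the data would need different computations.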

research
07/05/2016

Divisive-agglomerative algorithm and complexity of automatic classification problems

An algorithm of solution of the Automatic Classification (AC for brevity...
research
10/23/2018

OCAPIS: R package for Ordinal Classification And Preprocessing In Scala

Ordinal Data are those where a natural order exist between the labels. T...
research
08/08/2017

Data-driven Advice for Applying Machine Learning to Bioinformatics Problems

As the bioinformatics field grows, it must keep pace not only with new d...
research
08/07/2020

Neural Complexity Measures

While various complexity measures for diverse model classes have been pr...
research
04/16/2018

conformalClassification: A Conformal Prediction R Package for Classification

The conformalClassification package implements Transductive Conformal Pr...
research
07/14/2022

problexity – an open-source Python library for binary classification problem complexity assessment

The classification problem's complexity assessment is an essential eleme...
research
06/13/2018

Benchmarks for Image Classification and Other High-dimensional Pattern Recognition Problems

A good classification method should yield more accurate results than sim...
