Characterizing instance hardness in classification and regression problems

12/04/2022
by Gustavo P. Torquette, et al.

Recent work in the Machine Learning (ML) literature has demonstrated the usefulness of assessing which observations in a dataset have labels that are hardest to predict accurately. Identifying such instances allows one to inspect them for quality issues that should be addressed, and to devise learning strategies based on the difficulty level of the observations. This paper presents a set of meta-features, also known as instance hardness measures, that aim to characterize which instances of a dataset are hardest to predict accurately and why. Both classification and regression problems are considered. Synthetic datasets with different levels of complexity are built and analyzed. A Python package containing all implementations is also provided.
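
To make the idea of an instance hardness measure concrete, here is a minimal sketch of k-Disagreeing Neighbors (kDN), a measure commonly used in the instance hardness literature for classification. The abstract does not list the measures it proposes, so the choice of kDN is an assumption made for illustration, and the function name `k_disagreeing_neighbors` is hypothetical rather than part of the authors' package.

```python
import numpy as np


def k_disagreeing_neighbors(X, y, k=3):
    """Illustrative kDN score per instance (hypothetical helper, not the paper's API).

    For each instance, return the fraction of its k nearest neighbors
    (Euclidean distance, the instance itself excluded) whose label differs
    from its own. Scores near 1 suggest a hard instance, near 0 an easy one.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)

    # Pairwise squared Euclidean distances; adequate for small datasets.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.einsum("ijk,ijk->ij", diff, diff)

    # An instance must not count as its own neighbor.
    np.fill_diagonal(dist, np.inf)

    # Indices of the k nearest neighbors of each instance.
    neighbors = np.argsort(dist, axis=1)[:, :k]

    # Fraction of neighbors whose label disagrees with the instance's label.
    return (y[neighbors] != y[:, None]).mean(axis=1)


# Toy data: two well-separated 1-D clusters, plus one point (index 8)
# that sits inside the class-0 cluster but carries label 1.
X_toy = [[0.0], [1.0], [2.0], [3.0], [10.0], [11.0], [12.0], [13.0], [1.5]]
y_toy = [0, 0, 0, 0, 1, 1, 1, 1, 1]
scores = k_disagreeing_neighbors(X_toy, y_toy, k=3)
hardest = int(np.argmax(scores))  # index of the instance with the highest kDN
```

In this toy setting the point placed inside the opposite class receives the highest score, which is the kind of instance one would inspect for a labeling or quality issue. The dense distance matrix is a simplification; a tree-based neighbor search would be preferable for larger datasets.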


research
09/29/2021

PyHard: a novel tool for generating hardness embeddings to support data-centric analysis

For building successful Machine Learning (ML) systems, it is imperative ...
research
05/05/2023

Data Complexity: A New Perspective for Analyzing the Difficulty of Defect Prediction Tasks

Defect prediction is crucial for software quality assurance and has been...
research
11/16/2022

Features for the 0-1 knapsack problem based on inclusionwise maximal solutions

Decades of research on the 0-1 knapsack problem led to very efficient al...
research
09/26/2022

Prayatul Matrix: A Direct Comparison Approach to Evaluate Performance of Supervised Machine Learning Models

Performance comparison of supervised machine learning (ML) models are wi...
research
07/29/2020

Boosting Ant Colony Optimization via Solution Prediction and Machine Learning

This paper introduces an enhanced meta-heuristic (ML-ACO) that combines ...
research
03/29/2022

HardVis: Visual Analytics to Handle Instance Hardness Using Undersampling and Oversampling Techniques

Despite the tremendous advances in machine learning (ML), training with ...
research
04/07/2022

Learning to Solve Travelling Salesman Problem with Hardness-adaptive Curriculum

Various neural network models have been proposed to tackle combinatorial...
