Large Deviations for Classification Performance Analysis of Machine Learning Systems

01/16/2023
by Paolo Braca, et al.

We study the performance of machine learning binary classification techniques in terms of error probabilities. The statistical test is based on the Data-Driven Decision Function (D3F), which is learned in the training phase and is the quantity thresholded to produce the final binary decision. Using large deviations theory, we show that, under appropriate conditions, the classification error probabilities vanish exponentially, as ∼exp(-nI + o(n)), where I is the error rate and n is the number of observations available for testing. We also propose two approximations for the error probability curves: one based on a refined asymptotic formula (often referred to as exact asymptotics), and one based on the central limit theorem. Finally, the theoretical findings are tested on the popular MNIST dataset.
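To make the two scalings mentioned in the abstract concrete, here is a minimal Python sketch. It is not the authors' method. It assumes, purely for illustration, that the test statistic is the sample mean of n i.i.d. per-observation D3F outputs compared against a threshold gamma lying above their mean. The function names (`log_mgf`, `rate_function`, `ld_approx`, `clt_approx`) and the grid of tilt values are hypothetical. The rate I is estimated as the Legendre transform of the empirical log-moment generating function, the o(n) term is simply dropped, and the exact-asymptotics refinement is not included.

```python
import math
import numpy as np


def log_mgf(samples, t):
    """Empirical log-moment generating function: log(mean(exp(t * x)))."""
    z = t * np.asarray(samples, dtype=float)
    m = z.max()  # subtract the maximum for numerical stability
    return m + math.log(np.mean(np.exp(z - m)))


def rate_function(samples, gamma, t_grid=None):
    """Estimate I(gamma) = sup_{t >= 0} [t * gamma - log_mgf(t)] on a grid.

    The grid is an assumption of this sketch; it must contain the maximizer.
    """
    if t_grid is None:
        t_grid = np.linspace(0.0, 3.0, 301)
    return max(t * gamma - log_mgf(samples, t) for t in t_grid)


def ld_approx(samples, gamma, n):
    """Leading-order large deviations approximation exp(-n * I)."""
    return math.exp(-n * rate_function(samples, gamma))


def clt_approx(samples, gamma, n):
    """Central limit theorem approximation of P(sample mean > gamma)."""
    mu = float(np.mean(samples))
    sigma = float(np.std(samples))
    x = math.sqrt(n) * (gamma - mu) / sigma
    return 0.5 * math.erfc(x / math.sqrt(2.0))  # Gaussian tail Q(x)


if __name__ == "__main__":
    # Synthetic stand-in for D3F outputs under one hypothesis (assumption).
    rng = np.random.default_rng(0)
    d3f_outputs = rng.normal(0.0, 1.0, 200_000)
    for n in (10, 20, 40):
        print(n, ld_approx(d3f_outputs, 0.5, n), clt_approx(d3f_outputs, 0.5, n))
```

In this synthetic Gaussian case the estimated rate should be close to gamma^2 / 2, and the exponential term acts as a Chernoff-type upper bound on the tail probability, so it decays at the right rate but overestimates the probability at finite n. Real D3F outputs would replace the synthetic samples.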

