Studying the Interplay between Information Loss and Operation Loss in Representations for Classification

12/30/2021
by   Jorge F. Silva, et al.

Information-theoretic measures have been widely adopted in the design of features for learning and decision problems. Inspired by this, we study the relationship between i) a weak form of information loss in the Shannon sense and ii) the operation loss in the minimum probability of error (MPE) sense, for a family of lossy continuous representations (features) of a continuous observation. We present several results that shed light on this interplay. Our first result offers a lower bound on a weak form of information loss as a function of the corresponding operation loss when a discrete lossy representation (quantization) is adopted instead of the original raw observation. Building on this, our main result shows that a specific form of vanishing information loss (a weak notion of asymptotic informational sufficiency) implies a vanishing MPE loss (i.e., asymptotic operational sufficiency) for a general family of lossy continuous representations. Our theoretical findings support the observation that selecting feature representations that aim to capture informational sufficiency is appropriate for learning, but that this is a rather conservative design principle if the intended goal is achieving the MPE in classification. Supporting this last point, and under some structural conditions, we show that an alternative notion of informational sufficiency (strictly weaker than pure sufficiency in the mutual information sense) suffices to achieve operational sufficiency in learning.
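To fix ideas, here is a minimal formalization of the two losses contrasted in the abstract. The notation below is ours, chosen to match standard usage; the paper's precise definitions may differ in detail. For a class label $Y$ and a continuous observation $X$, a representation $U = \phi(X)$ incurs

\[
\Delta I(\phi) \;=\; I(X;Y) \,-\, I(\phi(X);Y) \qquad \text{(information loss)},
\]
\[
\Delta \ell(\phi) \;=\; \ell(\phi(X)) \,-\, \ell(X) \qquad \text{(operation loss)},
\]

where $I(\cdot\,;\cdot)$ denotes Shannon mutual information and $\ell(Z) = \min_{g} \Pr\{g(Z) \neq Y\}$ is the minimum probability of error (Bayes error) achievable from $Z$. By the data-processing inequality, both quantities are non-negative. In these terms, a sequence of representations $(\phi_n)$ is asymptotically informationally sufficient when $\Delta I(\phi_n) \to 0$ and asymptotically operationally sufficient when $\Delta \ell(\phi_n) \to 0$; the paper's main result shows that, under its stated conditions, a weak form of the former implies the latter.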
