
Evaluating representations by the complexity of learning low-loss predictors

by William F. Whitney et al.

We consider the problem of evaluating representations of data for use in solving a downstream task. We propose to measure the quality of a representation by the complexity of learning a predictor on top of the representation that achieves low loss on a task of interest, and introduce two methods: surplus description length (SDL) and ε sample complexity (εSC). In contrast to prior methods, which measure the amount of information about the optimal predictor that is present in a specific amount of data, our methods measure the amount of information needed from the data to recover an approximation of the optimal predictor up to a specified tolerance. We present a framework to compare these methods based on plotting the validation loss versus training set size (the "loss-data" curve). Existing measures, such as mutual information and minimum description length probes, correspond to slices and integrals along the data axis of the loss-data curve, while ours correspond to slices and integrals along the loss axis. We provide experiments on real data to compare the behavior of each of these methods over datasets of varying size, along with a high-performance open source library for representation evaluation.
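To make the loss-axis view concrete, here is a minimal sketch of how the two measures can be read off a sampled loss-data curve. It assumes the curve is given as a list of validation losses, where the n-th entry is the loss of a probe trained on n examples; the function names and the list-based representation are illustrative, not the paper's library API. SDL accumulates the excess loss above the tolerance ε across training-set sizes (an integral along the loss axis), while εSC is the smallest training-set size whose loss falls below ε (a slice along the loss axis).

```python
def surplus_description_length(losses, eps):
    """SDL: total validation loss in excess of tolerance eps,
    summed over training-set sizes n = 1..len(losses)."""
    return sum(max(0.0, loss - eps) for loss in losses)


def eps_sample_complexity(losses, eps):
    """εSC: smallest training-set size n whose validation loss
    is at most eps, or None if the curve never reaches eps."""
    for n, loss in enumerate(losses, start=1):
        if loss <= eps:
            return n
    return None


# Example loss-data curve: validation loss after training on n examples.
curve = [2.0, 1.0, 0.5, 0.2, 0.1]
print(surplus_description_length(curve, eps=0.5))  # 1.5 + 0.5 = 2.0
print(eps_sample_complexity(curve, eps=0.5))       # n = 3
```

A data-axis measure would instead fix a training-set size n and report the loss (a vertical slice of the same curve); the point of the loss-axis measures is that they do not depend on an arbitrary choice of n.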





Code Repositories


A library for evaluating representations.
