Towards an Analytical Definition of Sufficient Data

02/07/2022
by Adam Byerly, et al.

We show that, for each of five datasets of increasing complexity, certain training samples are more informative of class membership than others. These samples can be identified a priori to training by analyzing their positions in reduced dimensional space relative to their classes' centroids. Specifically, we demonstrate that samples nearer the classes' centroids are less informative than those that are farthest from them. For all five datasets, we show that there is no statistically significant difference between training on the entire training set and training with up to 2 of the samples nearest each class's centroid excluded.
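The selection criterion the abstract describes can be sketched in a few lines: project the training data into a reduced-dimensional space, compute each class's centroid there, and rank samples by their distance to their own class centroid. This is an illustrative sketch only; the function and parameter names are ours, PCA-via-SVD stands in for whatever dimensionality reduction the paper actually uses, and the toy data is synthetic.

```python
import numpy as np

def rank_by_centroid_distance(X, y, n_components=2):
    """Rank training samples by distance to their class centroid in a
    PCA-reduced space (illustrative sketch; names are not from the paper).
    Returns sample indices sorted from nearest to farthest from the centroid."""
    # Center the data and project onto the top principal components via SVD.
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    Z = Xc @ Vt[:n_components].T
    # Distance of each sample to its own class centroid in the reduced space.
    dists = np.empty(len(X))
    for c in np.unique(y):
        mask = y == c
        centroid = Z[mask].mean(axis=0)
        dists[mask] = np.linalg.norm(Z[mask] - centroid, axis=1)
    # Ascending order: the earliest indices are the least informative candidates.
    return np.argsort(dists)

# Toy usage: two Gaussian classes; drop the samples nearest their centroids.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 10)), rng.normal(3, 1, (50, 10))])
y = np.array([0] * 50 + [1] * 50)
order = rank_by_centroid_distance(X, y)
keep = order[10:]  # exclude the 10 samples closest to their class centroids
```

Because the ranking is computed from the data's geometry alone, the exclusion set is fixed before any model is trained, which is what makes the criterion "a priori" in the abstract's sense.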


Related research

- Few-Shot Learning with Intra-Class Knowledge Transfer (08/22/2020): We consider the few-shot classification task with an unbalanced dataset, ...
- Synthesizing Informative Training Samples with GAN (04/15/2022): Remarkable progress has been achieved in synthesizing photo-realistic im...
- Membership inference attack with relative decision boundary distance (06/07/2023): Membership inference attack is one of the most popular privacy attacks i...
- Data Representation for Deep Learning with Priori Knowledge of Symmetric Wireless Tasks (05/18/2020): Deep neural networks (DNNs) have been applied to address various wireles...
- Polarity is all you need to learn and transfer faster (03/29/2023): Natural intelligences (NIs) thrive in a dynamic world - they learn quick...
- Reconstructing Training Data from Multiclass Neural Networks (05/05/2023): Reconstructing samples from the training set of trained neural networks ...
- A Deterministic Self-Organizing Map Approach and its Application on Satellite Data based Cloud Type Classification (08/24/2018): A self-organizing map (SOM) is a type of competitive artificial neural n...
