On Learning from Label Proportions

02/24/2014
by Felix X. Yu, et al.

Learning from Label Proportions (LLP) is a learning setting in which the training data is provided in groups, or "bags", and only the proportion of each class in each bag is known. The task is to learn a model that predicts the class labels of individual instances. LLP has broad applications in political science, marketing, healthcare, and computer vision. This work addresses the fundamental question of when and why LLP is possible by introducing a general framework, Empirical Proportion Risk Minimization (EPRM). EPRM learns an instance label classifier whose predictions match the given label proportions on the training data. Our result is based on a two-step analysis. First, we provide a VC bound on the generalization error of the bag proportions, and show that the bag sample complexity is only mildly sensitive to the bag size. Second, we show that, under some mild assumptions, good bag proportion prediction guarantees good instance label prediction. Together, these results provide a formal guarantee that individual labels can indeed be learned in the LLP setting. We discuss applications of the analysis, including the justification of LLP algorithms, learning with population proportions, and a paradigm for learning algorithms with privacy guarantees. We also demonstrate the feasibility of LLP through a case study in a real-world setting: predicting income from census data.
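
To make the EPRM idea concrete, the following is a minimal sketch on synthetic data, not the authors' implementation. It assumes one-dimensional instances, a threshold classifier, and a mean absolute gap between predicted and given bag proportions as the proportion risk. The function names, the toy data, and the grid search are all hypothetical choices made for illustration; the abstract does not specify a loss or an optimizer.

```python
import random


def predicted_proportion(bag, threshold):
    """Fraction of instances in a bag that the classifier `x > threshold` labels positive."""
    return sum(x > threshold for x in bag) / len(bag)


def empirical_proportion_risk(bags, proportions, threshold):
    """Mean absolute gap between predicted and given bag proportions.

    The absolute gap is an assumed loss for this sketch, not one taken from the paper.
    """
    gaps = [abs(predicted_proportion(bag, threshold) - p)
            for bag, p in zip(bags, proportions)]
    return sum(gaps) / len(gaps)


# Toy data (assumption): the hidden label of x is positive when x > 0.5.
rng = random.Random(0)
TRUE_THRESHOLD = 0.5
bags = [[rng.uniform(-2.0, 3.0) for _ in range(50)] for _ in range(20)]

# Only the per-bag proportions are handed to the learner, never the instance labels.
proportions = [sum(x > TRUE_THRESHOLD for x in bag) / len(bag) for bag in bags]

# Minimize the proportion risk by grid search over candidate thresholds.
candidates = [-2.0 + 0.05 * i for i in range(101)]
risks = [empirical_proportion_risk(bags, proportions, t) for t in candidates]
best_threshold = candidates[risks.index(min(risks))]

# Instance-level error of the selected classifier, computable here only
# because the toy data comes with known hidden labels.
instances = [x for bag in bags for x in bag]
instance_error = sum((x > best_threshold) != (x > TRUE_THRESHOLD)
                     for x in instances) / len(instances)

print(f"selected threshold: {best_threshold:.2f}")
print(f"proportion risk: {min(risks):.4f}, instance error: {instance_error:.4f}")
```

In this toy setting the classifier is selected using bag proportions alone, and the instance-level error is computed afterwards purely as a check. It illustrates the two-step argument in miniature and says nothing about the conditions under which the paper's guarantees hold.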

research
05/29/2019

Address Instance-level Label Prediction in Multiple Instance Learning

Multiple Instance Learning (MIL) is concerned with learning from bags of...
research
03/24/2022

Risk Consistent Multi-Class Learning from Label Proportions

This study addresses a multiclass learning from label proportions (MCLLP...
research
05/16/2023

Learning from Aggregated Data: Curated Bags versus Random Bags

Protecting user privacy is a major concern for many machine learning sys...
research
10/07/2021

Fast learning from label proportions with small bags

In learning from label proportions (LLP), the instances are grouped into...
research
10/24/2018

Label Propagation for Learning with Label Proportions

Learning with Label Proportions (LLP) is the problem of recovering the u...
research
06/12/2020

Learning from Label Proportions: A Mutual Contamination Framework

Learning from label proportions (LLP) is a weakly supervised setting for...
research
06/30/2016

Ballpark Learning: Estimating Labels from Rough Group Comparisons

We are interested in estimating individual labels given only coarse, agg...
