A Mutual Contamination Analysis of Mixed Membership and Partial Label Models

02/19/2016
by Julian Katz-Samuels, et al.
Many machine learning problems can be characterized by mutual contamination models. In these problems, one observes several random samples, each drawn from a different convex combination of a set of unknown base distributions. The goal is to decontaminate the model, i.e., to recover the base distributions either exactly or up to a permutation. This paper considers the general setting where the base distributions are defined on arbitrary probability spaces. We examine the decontamination problem in two mutual contamination models that describe popular machine learning tasks: recovering the base distributions up to a permutation in a mixed membership model, and recovering the base distributions exactly in a partial label model for classification. We give necessary and sufficient conditions for identifiability of both mutual contamination models, present algorithms for both problems in the infinite and finite sample cases, and introduce novel proof techniques based on affine geometry.
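
To make the observation model concrete, the following is a minimal sketch of how contaminated samples could arise. It assumes a finite outcome space, whereas the paper allows arbitrary probability spaces. The names `base`, `weights`, `mix`, and `sample`, and all numeric values, are hypothetical and chosen only for illustration. The sketch covers the data-generating side only, not the paper's decontamination algorithms.

```python
import random

def mix(base, weights):
    """Return contaminated distributions.

    Row i of the result is sum_j weights[i][j] * base[j], i.e. a convex
    combination of the base distributions (assumed discrete here).
    """
    n_outcomes = len(base[0])
    return [
        [sum(w * p[k] for w, p in zip(row, base)) for k in range(n_outcomes)]
        for row in weights
    ]

def sample(dist, n, rng):
    """Draw n outcome indices from a discrete distribution."""
    return rng.choices(range(len(dist)), weights=dist, k=n)

# Two hypothetical base distributions over three outcomes.
base = [[0.7, 0.2, 0.1],
        [0.1, 0.3, 0.6]]

# Hypothetical mixing weights; each row is nonnegative and sums to 1.
weights = [[0.8, 0.2],
           [0.3, 0.7]]

contaminated = mix(base, weights)

# The learner sees only these samples, not `base` or `weights`.
rng = random.Random(0)
samples = [sample(d, 1000, rng) for d in contaminated]
```

In this toy setup, decontamination would mean recovering the rows of `base` from `samples` alone, either exactly or up to a reordering of the rows.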
