Quantifying With Only Positive Training Data

04/22/2020
by Denis dos Reis, et al.

Quantification is the research field that studies the task of counting how many data points in an unlabeled sample belong to each class. Traditionally, researchers in this field assume the availability of labeled training observations for all classes when inducing quantification models. Although quantification methods usually estimate counts for every class, we are often interested only in the count for a single target class. In this context, we have proposed a novel setting, One-class Quantification (OCQ), in which reliable training data is available only for the target class. On the other hand, Positive and Unlabeled Learning (PUL), another branch of Machine Learning, has offered solutions that can be applied to OCQ, even though quantification is not the focal point of PUL. In this article, we close the gap between PUL and OCQ and bring both areas together under a unified view. We compare our methods, Passive Aggressive Threshold (PAT) and One Distribution Inside (ODIn), against PUL methods and show that PAT is generally the fastest and most accurate algorithm. Unlike PUL methods, PAT and ODIn can also induce quantification models that can be reapplied to quantify different samples of data. We additionally introduce Exhaustive TIcE (ExTIcE), an improved version of the PUL algorithm Tree Induction for c Estimation (TIcE), and show that it quantifies more accurately than PAT and the other algorithms in scenarios where a considerable number of negative observations are identical to positive observations.
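To make the quantification task concrete, the sketch below illustrates the classic "classify and count" approach with the standard adjustment for classifier error. It is not the PAT or ODIn algorithm from the article; the scores, threshold, and the true/false positive rates are illustrative values standing in for what a one-class scorer and a validation step would provide.

```python
# Hypothetical sketch: Adjusted Classify & Count quantification.
# Any scorer trained on positive data only could produce `scores`;
# tpr/fpr would normally be estimated on held-out positive data.

def classify_and_count(scores, threshold):
    """Raw proportion of observations scored at or above the threshold."""
    return sum(s >= threshold for s in scores) / len(scores)

def adjusted_count(cc, tpr, fpr):
    """Correct the raw count using the classifier's known error rates."""
    p = (cc - fpr) / (tpr - fpr)
    return min(1.0, max(0.0, p))  # clip to a valid proportion

scores = [0.9, 0.8, 0.75, 0.4, 0.3, 0.85, 0.2, 0.6]
cc = classify_and_count(scores, threshold=0.5)     # 5 of 8 -> 0.625
estimate = adjusted_count(cc, tpr=0.9, fpr=0.1)    # (0.625-0.1)/0.8
```

The adjustment step is what separates quantification from merely counting classifier outputs: it inverts the classifier's confusion rates to recover the true class prevalence, which is the quantity the methods compared in this article estimate.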


