Individualized Group Learning

by Chencheng Cai, et al.
Rutgers University

Many massive data sets are assembled by collecting information on a large number of individuals in a population. The analysis of such data, especially for individualized inferences and solutions, has the potential to create significant value in practical applications. Traditionally, inference for an individual in the data set relies either solely on that individual's information or on a summary of the information about the whole population. However, with the availability of big data, we have the opportunity, as well as a unique challenge, to make a more effective individualized inference that takes into account both the population information and the individual discrepancy. To deal with possible heterogeneity within the population while providing effective and credible inferences for individuals in a data set, this article develops a new approach called individualized group learning (iGroup). The iGroup approach uses local nonparametric techniques to generate an individualized group by pooling other entities in the population that share similar characteristics with the target individual. Three general cases of iGroup are discussed, and their asymptotic performance is investigated. Both theoretical results and empirical simulations reveal that, by applying iGroup, the performance of statistical inference at the individual level is ensured and can be substantially improved over inference based on either solely individual information or entire population information. The method has a broad range of applications. Two examples, in financial statistics and maritime anomaly detection, are presented.
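
To make the pooling idea concrete, the following is a minimal sketch of one way an individualized group could be formed. It is not the authors' estimator: the Gaussian kernel, the single scalar covariate, the bandwidth value, the simulated data, and the names `igroup_estimate` and `demo` are all assumptions made for illustration. Each individual has an observed covariate `z` and a noisy individual-level estimate `x` of its own parameter; the estimate for a target individual is a weighted average of `x` over individuals whose covariates are close to the target's.

```python
import numpy as np


def igroup_estimate(z, x, target, bandwidth):
    """Kernel-weighted average of the individual estimates x.

    Weights depend on how close each individual's covariate z[j] is to the
    covariate of the target individual z[target] (Gaussian kernel, assumed).
    """
    z = np.asarray(z, dtype=float)
    x = np.asarray(x, dtype=float)
    scaled_distance = (z - z[target]) / bandwidth
    weights = np.exp(-0.5 * scaled_distance**2)
    return float(np.sum(weights * x) / np.sum(weights))


def demo(n=2000, n_targets=200, bandwidth=0.3, seed=0):
    """Compare three estimators on simulated heterogeneous data (hypothetical setup).

    Returns mean squared errors for: individual-only, population-mean,
    and the kernel-pooled individualized group estimate.
    """
    rng = np.random.default_rng(seed)
    z = rng.uniform(-3.0, 3.0, size=n)       # observed characteristics
    theta = np.sin(z)                         # true individual parameters
    x = theta + rng.normal(0.0, 1.0, size=n)  # noisy individual estimates

    targets = np.arange(n_targets)
    pooled = np.array([igroup_estimate(z, x, t, bandwidth) for t in targets])

    mse_individual = float(np.mean((x[targets] - theta[targets]) ** 2))
    mse_population = float(np.mean((x.mean() - theta[targets]) ** 2))
    mse_igroup = float(np.mean((pooled - theta[targets]) ** 2))
    return mse_individual, mse_population, mse_igroup


if __name__ == "__main__":
    ind, pop, ig = demo()
    print(f"individual only : {ind:.3f}")
    print(f"population mean : {pop:.3f}")
    print(f"pooled (sketch) : {ig:.3f}")
```

In this simulated setting the pooled estimate borrows strength from similar individuals without averaging over the whole heterogeneous population. The bandwidth controls that trade-off: a very small value approaches the individual-only estimate, and a very large value approaches the population mean.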


