Aggregating Predictions on Multiple Non-disclosed Datasets using Conformal Prediction

06/11/2018
by Ola Spjuth, et al.

Conformal Prediction is a machine learning methodology that produces valid prediction regions under mild conditions. In this paper, we explore how to make predictions over multiple data sources of different sizes without disclosing data between the sources. We propose that each data source independently applies a transductive conformal predictor to its local data, and that the individual predictions are then aggregated to form a combined prediction region. We demonstrate the method on several data sets and show that it produces conservatively valid predictions and reduces the variance of the aggregated predictions. We also study the effect that the number of data sources and the size of each source have on the aggregated predictions, as compared with equally sized sources and pooled data.
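
The workflow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the nonconformity measure or the aggregation rule, so the distance-to-class-centroid score and the median of p-values used here are assumptions chosen for brevity, and all function names are hypothetical. Each source computes transductive conformal p-values on its own data, and only those p-values leave the source.

```python
import numpy as np

def tcp_p_values(X, y, x_new, labels):
    """Transductive conformal p-values for one test object at one data source.

    Nonconformity score (an assumption, not taken from the paper):
    Euclidean distance of each object to the centroid of its own class.
    """
    p = {}
    for label in labels:
        # Tentatively extend the local data with (x_new, candidate label).
        X_aug = np.vstack([X, x_new])
        y_aug = np.append(y, label)
        scores = np.empty(len(y_aug))
        for c in labels:
            mask = y_aug == c
            if mask.any():
                centroid = X_aug[mask].mean(axis=0)
                scores[mask] = np.linalg.norm(X_aug[mask] - centroid, axis=1)
        # Fraction of objects at least as nonconforming as the test object
        # (the test object itself is included in the count).
        p[label] = float(np.mean(scores >= scores[-1]))
    return p

def aggregate_p_values(per_source, labels):
    """Combine per-source p-values; the median is an assumed aggregation rule."""
    return {c: float(np.median([p[c] for p in per_source])) for c in labels}

def prediction_region(p, epsilon):
    """Labels whose (aggregated) p-value exceeds the significance level."""
    return {c for c, value in p.items() if value > epsilon}

def make_source(rng, n_per_class):
    """Synthetic two-class source; stands in for a private local dataset."""
    X0 = rng.normal(0.0, 1.0, size=(n_per_class, 2))
    X1 = rng.normal(5.0, 1.0, size=(n_per_class, 2))
    X = np.vstack([X0, X1])
    y = np.array([0] * n_per_class + [1] * n_per_class)
    return X, y

rng = np.random.default_rng(0)
labels = [0, 1]
# Two sources of different sizes; raw data never leaves a source.
sources = [make_source(rng, 20), make_source(rng, 50)]
x_new = np.array([0.0, 0.0])

per_source = [tcp_p_values(X, y, x_new, labels) for X, y in sources]
combined = aggregate_p_values(per_source, labels)
region = prediction_region(combined, epsilon=0.1)
print(combined, region)
```

Only the dictionaries in `per_source` are shared between parties in this sketch, which is what allows aggregation without disclosing the underlying data. Whether a given aggregation rule preserves validity is a question the paper studies empirically; the sketch makes no such guarantee.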

