A clusterwise supervised learning procedure based on aggregation of distances

09/20/2019
by Sothea Has, et al.

Nowadays, many machine learning procedures are available off the shelf and can easily be used to calibrate predictive models on supervised data. However, when the input data consist of more than one unknown cluster, and when each cluster follows a different underlying predictive model, fitting a model is a more challenging task. In this paper, we propose a three-step procedure to solve this problem automatically. The KFC procedure aggregates different models adaptively on the data. The first step aims to capture the clustering structure of the input data, which may be characterized by several statistical distributions. It provides several partitions, depending on the assumptions made about the distributions. For each partition, the second step fits a specific predictive model on the data in each cluster. In the third step, the overall model is computed by a consensual aggregation of the models corresponding to the different partitions. A comparison of performance on several simulated and real data sets demonstrates the excellent performance of our method in a wide variety of prediction problems.
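The three steps can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation, and everything specific in it is an assumption made for illustration: scalar inputs, two clusters, two distance choices (squared Euclidean and generalized Kullback-Leibler, both of which have the cluster mean as centroid), one least-squares line per cluster, and a simplified consensus rule standing in for the consensual aggregation. All function names are hypothetical.

```python
import math
import random

def sq_euclid(x, c):
    return (x - c) ** 2

def gkl(x, c):
    # Generalized Kullback-Leibler divergence; requires x > 0 and c > 0.
    return x * math.log(x / c) - x + c

def kmeans_1d(xs, k, dist, iters=50):
    """Step 1 (assumed form): Lloyd-style clustering of scalars under `dist`."""
    s = sorted(xs)
    # Deterministic initialisation at evenly spaced quantiles.
    centers = [s[(2 * j + 1) * len(s) // (2 * k)] for j in range(k)]
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: dist(x, centers[j])) for x in xs]
        for j in range(k):
            members = [x for x, lab in zip(xs, labels) if lab == j]
            if members:
                centers[j] = sum(members) / len(members)
    return centers

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx > 0 else 0.0
    return my - b * mx, b

def fit_clusterwise(xs, ys, k, dist):
    """Step 2 (assumed form): one local line per cluster of one partition."""
    centers = kmeans_1d(xs, k, dist)

    def assign(x):
        return min(range(k), key=lambda j: dist(x, centers[j]))

    default = fit_line(xs, ys)  # fallback if a cluster ends up empty
    models = {}
    for j in range(k):
        pts = [(x, y) for x, y in zip(xs, ys) if assign(x) == j]
        if pts:
            models[j] = fit_line(*zip(*pts))

    def predict(x):
        a, b = models.get(assign(x), default)
        return a + b * x

    return predict

def combine(predictors, xs, ys, x_new, eps):
    """Step 3 (simplified stand-in): average the training responses whose
    candidate predictions all lie within eps of the predictions at x_new."""
    p_new = [f(x_new) for f in predictors]
    votes = [y for x, y in zip(xs, ys)
             if all(abs(f(x) - p) <= eps for f, p in zip(predictors, p_new))]
    # If no training point reaches consensus, fall back to the plain average.
    return sum(votes) / len(votes) if votes else sum(p_new) / len(p_new)

# Synthetic data with two regimes (an assumption made for this example only).
rng = random.Random(0)
xs = ([rng.uniform(0.5, 2.0) for _ in range(100)]
      + [rng.uniform(5.0, 8.0) for _ in range(100)])
ys = [(1 + 2 * x if x < 3 else 20 - x) + rng.gauss(0, 0.1) for x in xs]

predictors = [fit_clusterwise(xs, ys, 2, d) for d in (sq_euclid, gkl)]
pred_low = combine(predictors, xs, ys, 1.0, eps=0.5)
pred_high = combine(predictors, xs, ys, 6.0, eps=0.5)
print(pred_low, pred_high)
```

On this toy data each candidate model follows its own regime, so the combined predictions at `x = 1.0` and `x = 6.0` should land near the noiseless values 3 and 14. A single global line fitted to all points could not track both regimes at once.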


research
09/20/2019

Consensual aggregation of clusters based on Bregman divergences to improve predictive models

A new procedure to construct predictive models in supervised learning pr...
research
06/05/2021

Cluster Analysis via Random Partition Distributions

Hierarchical and k-medoids clustering are deterministic clustering algor...
research
08/22/2021

The Exploitation of Distance Distributions for Clustering

Although distance measures are used in many machine learning algorithms,...
research
08/31/2022

Bayesian order identification of ARMA models with projection predictive inference

Auto-regressive moving-average (ARMA) models are ubiquitous forecasting ...
research
02/20/2020

Predictive Inference Is Free with the Jackknife+-after-Bootstrap

Ensemble learning is widely used in applications to make predictions in ...
research
11/30/2019

Crime in Philadelphia: Bayesian Clustering with Particle Optimization

Accurate estimation of the change in crime over time is a critical first...
research
07/20/2012

Fast nonparametric classification based on data depth

A new procedure, called DDa-procedure, is developed to solve the problem...
