greed: An R Package for Model-Based Clustering by Greedy Maximization of the Integrated Classification Likelihood

04/29/2022
by   Etienne Côme, et al.

The greed package implements the general and flexible framework of arXiv:2002.11577 for model-based clustering in the R language. It directly maximizes the exact Integrated Classification Likelihood with respect to the partition, which allows clustering and selection of the number of groups to be performed jointly. This combinatorial problem is handled by an efficient hybrid genetic algorithm, while a final hierarchical step gives access to coarser partitions and extracts an ordering of the clusters. The methodology applies to a wide variety of latent variable models and can therefore handle various data types as well as heterogeneous data. Classical models for continuous, count, categorical and graph data are implemented, and new models can be incorporated thanks to S4 class abstraction. This paper introduces the package, presents the design choices that guided its development, and illustrates its usage on practical use cases.
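
For readers unfamiliar with the criterion named above, the exact Integrated Classification Likelihood can be sketched as follows. The notation is an assumption made here for illustration and is not taken from this page: X denotes the observed data, Z the partition, \theta the model parameters, and \beta the prior hyperparameters.

```latex
% Exact ICL: the log of the complete-data likelihood with the
% parameters \theta integrated out under a prior p(\theta \mid \beta).
\mathrm{ICL}_{\mathrm{ex}}(Z)
  = \log p(X, Z \mid \beta)
  = \log \int p(X, Z \mid \theta)\, p(\theta \mid \beta)\, \mathrm{d}\theta

% Clustering as direct maximization over partitions; the number of
% groups is read off as the number of non-empty clusters of \hat{Z}.
\hat{Z} = \operatorname*{arg\,max}_{Z} \; \mathrm{ICL}_{\mathrm{ex}}(Z)
```

Because the criterion depends only on the partition once the parameters are integrated out, searching over Z also selects the number of groups, which is why the two tasks can be performed jointly.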

research
02/26/2020

Hierarchical clustering with discrete latent variable models and the integrated classification likelihood

In this paper, we introduce a two step methodology to extract a hierarch...
research
10/12/2022

Model-based clustering in simple hypergraphs through a stochastic blockmodel

We present a new hypergraph stochastic blockmodel and an associated infe...
research
07/28/2022

Model based clustering of multinomial count data

We consider the problem of inferring an unknown number of clusters in re...
research
05/06/2019

Hybrid Density- and Partition-based Clustering Algorithm for Data with Mixed-type Variables

Clustering is an essential technique for discovering patterns in data. T...
research
04/30/2018

eggCounts: a Bayesian hierarchical toolkit to model faecal egg count reductions

This is a vignette for the R package eggCounts version 2.0. The package ...
research
11/03/2021

Selecting the number of clusters, clustering models, and algorithms. A unifying approach based on the quadratic discriminant score

Cluster analysis requires many decisions: the clustering method and the ...
research
07/20/2023

Sparse model-based clustering of three-way data via lasso-type penalties

Mixtures of matrix Gaussian distributions provide a probabilistic framew...
