greed: An R Package for Model-Based Clustering by Greedy Maximization of the Integrated Classification Likelihood
The greed package implements the general and flexible framework of arXiv:2002.11577 for model-based clustering in the R language. By directly maximizing the exact Integrated Classification Likelihood with respect to the partition, it performs clustering and selects the number of groups jointly. This combinatorial problem is handled by an efficient hybrid genetic algorithm, while a final hierarchical step gives access to coarser partitions and extracts an ordering of the clusters. The methodology applies to a wide variety of latent variable models and can therefore handle various data types as well as heterogeneous data. Classical models for continuous, count, categorical and graph data are implemented, and new models may be incorporated thanks to S4 class abstraction. This paper introduces the package and the design choices that guided its development, and illustrates its usage on practical use cases.