Clustering Hierarchies via a Semi-Parametric Generalized Linear Mixed Model: a statistical significance-based approach

02/23/2023
by Alessandra Ragni, et al.

We introduce a novel statistical significance-based approach for clustering hierarchical data using semi-parametric generalized linear mixed-effects models designed for responses with laws in the exponential family (e.g., Poisson and Bernoulli). Within the family of semi-parametric mixed-effects models, a latent clustering structure of the highest-level units can be identified by assuming that the random effects follow a discrete distribution with an unknown number of support points. We achieve this by computing α-level confidence regions for the estimated support points and identifying statistically different clusters. At each iteration of a tailored Expectation Maximization algorithm, the two closest estimated support points whose confidence regions overlap are collapsed into one. Unlike related state-of-the-art methods, which rely on arbitrary thresholds to decide when close discrete masses are merged, the proposed approach relies on conventional statistical confidence levels, thereby avoiding discretionary tuning parameters. To demonstrate the effectiveness of our approach, we apply it to data from the Programme for International Student Assessment (PISA - OECD) to cluster countries based on the innumeracy rate in schools. Additionally, a simulation study and a comparison with classical parametric and state-of-the-art models are provided and discussed.
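The collapsing step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: the random effects are one-dimensional, the confidence regions are symmetric Wald intervals built from supplied standard errors, and two collapsed points are replaced by their weight-averaged location. The function names `wald_interval` and `collapse_closest_overlapping` are hypothetical.

```python
from statistics import NormalDist


def wald_interval(point, se, level=0.95):
    """Symmetric Wald interval (an assumption; the paper's regions may differ)."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return (point - z * se, point + z * se)


def collapse_closest_overlapping(points, ses, weights, level=0.95):
    """Collapse the two closest support points whose intervals overlap.

    Returns (new_points, new_weights, merged). Standard errors are not
    returned: in an EM scheme they would be re-estimated at the next iteration.
    """
    intervals = [wald_interval(p, s, level) for p, s in zip(points, ses)]
    best = None  # (distance, i, j) of the closest overlapping pair so far
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            lo_i, hi_i = intervals[i]
            lo_j, hi_j = intervals[j]
            overlap = lo_i <= hi_j and lo_j <= hi_i
            dist = abs(points[i] - points[j])
            if overlap and (best is None or dist < best[0]):
                best = (dist, i, j)
    if best is None:
        # All support points are statistically distinct at this level.
        return list(points), list(weights), False
    _, i, j = best
    w = weights[i] + weights[j]
    merged_point = (weights[i] * points[i] + weights[j] * points[j]) / w
    new_points = list(points)
    new_weights = list(weights)
    new_points[i], new_weights[i] = merged_point, w  # merged mass replaces i
    del new_points[j], new_weights[j]                # and j is dropped
    return new_points, new_weights, True


if __name__ == "__main__":
    pts, wts, merged = collapse_closest_overlapping(
        [-1.0, -0.9, 2.0], [0.1, 0.1, 0.1], [0.3, 0.3, 0.4]
    )
    print(pts, wts, merged)
```

In this sketch the only user-chosen quantity is the confidence level, which mirrors the abstract's point that a conventional significance level replaces an arbitrary distance threshold.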


