Journal of the Royal Statistical Society. Series A: Statistics in Society

A statistical significance-based approach for clustering grouped data via generalized linear model with discrete random effects

Alessandra Ragni 1
Chiara Masci 1
Francesca Ieva 1, 2
Anna Maria Paganoni 1
1 MOX, Department of Mathematics, Politecnico di Milano, Piazza Leonardo da Vinci 32, Milan 20133
2 Health Data Science Research Center, Human Technopole, Viale Rita Levi-Montalcini 1, Milan 20157
Publication type: Journal Article
Publication date: 2025-03-07
Scimago: Q1
WoS: Q2
SJR: 0.775
CiteScore: 2.9
Impact factor: 1.5
ISSN: 0964-1998, 1467-985X
Abstract

Identifying distinct subgroups within correlated data is essential for tailoring policies to specific needs, ensuring optimal outcomes for each group. In the context of model-based clustering, we introduce an innovative approach for clustering grouped data using generalized linear mixed models with discrete random effects and exponential family responses (e.g. Poisson or Bernoulli). Our method uncovers the latent clustering structure, net of fixed effects, by assuming that random effects follow a discrete distribution with an a priori unknown number of support points. We refine this process within a modified Expectation–Maximization algorithm, collapsing support points of the discrete distribution with overlapping estimated confidence intervals or regions, derived from the asymptotic properties of maximum likelihood estimators. This approach offers a transparent interpretation of the latent structure, distinct from existing tools for discrete random effects, which often rely on discretionary tuning parameters or predetermined cluster counts. Through simulation studies, we compare our approach with traditional parametric methods and state-of-the-art techniques, demonstrating its effectiveness. We apply our model to real-world data from the Programme for International Student Assessment, aiming to classify countries based on their impact on low-achieving student rates in schools. Our methodology provides valuable insights for effective policy formulation.
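
The collapsing step described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it covers only the univariate case (intervals, not regions), runs once instead of inside an Expectation–Maximization loop, and assumes Wald-type intervals of the form estimate ± z × standard error. The function name `collapse_overlapping`, its arguments, and the weighted-mean merge rule are hypothetical choices made for illustration.

```python
from statistics import NormalDist


def collapse_overlapping(points, std_errors, weights, level=0.95):
    """Merge support points whose Wald-type confidence intervals overlap.

    Assumption (not from the paper): intervals are estimate +/- z * se,
    and neighbouring points (after sorting) are chained together whenever
    their intervals overlap. A merged point is the weighted mean of its
    members, and its weight is the sum of their weights.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)  # e.g. about 1.96 for 95%
    order = sorted(range(len(points)), key=lambda i: points[i])

    groups = []  # each group is a list of original indices
    for i in order:
        if groups:
            j = groups[-1][-1]  # right-most member of the current group
            upper_prev = points[j] + z * std_errors[j]
            lower_cur = points[i] - z * std_errors[i]
            if lower_cur <= upper_prev:  # intervals overlap: same cluster
                groups[-1].append(i)
                continue
        groups.append([i])

    merged = []
    for members in groups:
        total = sum(weights[i] for i in members)
        value = sum(weights[i] * points[i] for i in members) / total
        merged.append({"value": value, "weight": total, "members": members})
    return merged


# Illustrative inputs (made up): three support points, two of them close.
result = collapse_overlapping(
    points=[-1.0, -0.9, 2.0],
    std_errors=[0.1, 0.1, 0.1],
    weights=[0.3, 0.2, 0.5],
)
for cluster in result:
    print(cluster)
```

With these made-up inputs the first two support points are merged and the third stays separate, so the number of clusters is determined by the interval overlap and the chosen confidence level, with no preset cluster count.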
