Robust Clustering with Subpopulation-specific Deviations

11/10/2017
by Briana Stephenson, et al.

The National Birth Defects Prevention Study (NBDPS) was a case-control study of birth defects conducted across 10 U.S. states. Researchers are interested in characterizing the etiologic role of maternal diet, using data collection tools such as the food frequency questionnaire. Maternal diet and behaviors have been shown to influence the development of congenital malformations. In a large, heterogeneous population, traditional clustering methods used to estimate dietary patterns, such as latent class analysis, can produce a large number of clusters due to a variety of factors, including study size and regional diversity. The resulting patterns lose interpretability, because many of them differ only by minor changes in consumption. Based on an adaptation of the local partition process, we propose a new method, Robust Profile Clustering, to handle these data complexities. Here, participants may be clustered at two levels: (1) globally, where women are assigned to an overall population-level cluster via an overfitted mixture model, and (2) locally, where regional variations in diet are accommodated via a beta-Bernoulli process dependent on subpopulation differences. We use our method to analyze the NBDPS data, deriving pre-pregnancy dietary patterns for women in the NBDPS while accounting for regional variability.
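To make the two-level structure concrete, the following is a minimal generative sketch in Python, not the authors' implementation. It simulates categorical food-item responses in which each item for each participant follows either her global cluster profile or a profile specific to her subpopulation, with the choice made by a beta-Bernoulli draw. The function name `simulate_two_level`, all hyperparameter values, and the simplification of a single local profile per subpopulation are assumptions made for illustration.

```python
import numpy as np


def simulate_two_level(n_per_sub=50, n_sub=3, n_items=8, n_levels=4,
                       k_global=10, seed=0):
    """Simulate data from a simplified global/local clustering model.

    Hypothetical illustration only; names and hyperparameters are assumed.
    """
    rng = np.random.default_rng(seed)

    # Overfitted mixture: more components than expected, with a small
    # Dirichlet concentration so that most weights shrink toward zero.
    weights = rng.dirichlet(np.full(k_global, 1.0 / k_global))

    # Global profiles: one response distribution per (cluster, item).
    global_profiles = rng.dirichlet(np.ones(n_levels), size=(k_global, n_items))

    # Local profiles: one response distribution per (subpopulation, item).
    # Assumption: a single local profile per subpopulation, for brevity.
    local_profiles = rng.dirichlet(np.ones(n_levels), size=(n_sub, n_items))

    # Beta-Bernoulli step: nu[s, j] is the probability that item j in
    # subpopulation s follows the global profile rather than the local one.
    nu = rng.beta(1.0, 1.0, size=(n_sub, n_items))

    subpop = np.repeat(np.arange(n_sub), n_per_sub)
    n = subpop.size
    cluster = rng.choice(k_global, size=n, p=weights)
    use_global = rng.random((n, n_items)) < nu[subpop]

    data = np.empty((n, n_items), dtype=int)
    for i in range(n):
        for j in range(n_items):
            if use_global[i, j]:
                p = global_profiles[cluster[i], j]
            else:
                p = local_profiles[subpop[i], j]
            data[i, j] = rng.choice(n_levels, p=p)
    return data, cluster, subpop, use_global
```

In this sketch, items with `use_global` equal to `False` carry the regional deviation, so two women in the same global cluster can still differ on those items if they belong to different subpopulations. Fitting such a model to real data would require a posterior sampler, which is beyond what this sketch shows.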


