Visual Subpopulation Discovery and Validation in Cohort Study Data

11/26/2017
by Shiva Alemzadeh, et al.

Epidemiology aims to identify subpopulations of cohort participants who share common characteristics (e.g., alcohol consumption) in order to explain risk factors of diseases in cohort study data. These data contain information about the participants' health status gathered from questionnaires, medical examinations, and image acquisition. Because epidemiological data are growing in both volume and heterogeneity, discovering meaningful subpopulations is challenging. Subspace clustering can be leveraged to find subpopulations in large and heterogeneous cohort study datasets. In our collaboration with epidemiologists, we recognized their need for a tool to validate discovered subpopulations: an identified subpopulation should be searched for in independent cohorts to check whether the findings apply there as well. In this paper, we describe S-ADVIsED, our interactive Visual Analytics framework for SubpopulAtion Discovery and Validation In Epidemiological Data. S-ADVIsED enables epidemiologists to explore and validate findings derived from subspace clustering. We provide a coordinated multiple view system, which includes a summary view of all subpopulations, detail views, and statistical information. Users can assess the quality of subspace clusters visually by considering different criteria. Furthermore, at the suggestion of epidemiologists, the intervals of the variables involved in a subspace cluster can be adjusted. We investigated the replication of a selected subpopulation with multiple variables in another population by considering different measurements. As a specific result, we observed that study participants exhibiting high liver fat accumulation deviate strongly from other subpopulations and from the total study population with respect to age, body mass index, thyroid volume, and thyroid-stimulating hormone.
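
The idea of describing a subpopulation by intervals over a few variables, adjusting those intervals, and then looking for the same subpopulation in an independent cohort can be illustrated with a minimal Python sketch. This is not the S-ADVIsED implementation: the function names, the variable names (`age`, `bmi`), the interval bounds, and the toy participant records are all hypothetical and chosen only for illustration, and the share of matching participants stands in for whatever measurements the framework actually compares.

```python
from typing import Dict, List, Tuple

# A subpopulation is assumed here to be a set of closed intervals,
# one per variable of the subspace (hypothetical representation).
Interval = Tuple[float, float]
Participant = Dict[str, float]


def in_subpopulation(p: Participant, intervals: Dict[str, Interval]) -> bool:
    """True if the participant has a value inside every interval."""
    return all(
        var in p and lo <= p[var] <= hi
        for var, (lo, hi) in intervals.items()
    )


def share(cohort: List[Participant], intervals: Dict[str, Interval]) -> float:
    """Fraction of the cohort that falls into the subpopulation."""
    if not cohort:
        return 0.0
    members = [p for p in cohort if in_subpopulation(p, intervals)]
    return len(members) / len(cohort)


# Toy data, invented for this example only.
discovery_cohort = [
    {"age": 62, "bmi": 31.0},
    {"age": 45, "bmi": 24.5},
    {"age": 58, "bmi": 29.5},
    {"age": 70, "bmi": 27.0},
]
replication_cohort = [
    {"age": 66, "bmi": 33.2},
    {"age": 39, "bmi": 22.1},
    {"age": 52, "bmi": 28.0},
    {"age": 61, "bmi": 30.4},
    {"age": 57, "bmi": 35.0},
]

# Intervals as they might come out of a subspace clustering step.
intervals = {"age": (55, 75), "bmi": (29.0, 40.0)}

# The analyst widens the BMI interval; the age interval stays unchanged.
adjusted = dict(intervals, bmi=(27.0, 40.0))

discovery_share = share(discovery_cohort, intervals)
replication_share = share(replication_cohort, intervals)
adjusted_discovery_share = share(discovery_cohort, adjusted)

print(discovery_share, replication_share, adjusted_discovery_share)
```

Comparing the two shares is only the simplest possible check; a fuller validation would also compare outcome distributions within the matched participants of each cohort.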


