Mitigating Health Data Poverty: Generative Approaches versus Resampling for Time-series Clinical Data

10/25/2022
by Raffaele Marchesi, et al.

Several approaches have been developed to mitigate algorithmic bias stemming from health data poverty, where minority groups are underrepresented in training datasets. Augmenting the minority class by resampling (for example, with SMOTE) is widely used because the algorithms are simple. However, these algorithms decrease data variability and may introduce correlations between samples, which has motivated generative approaches based on GANs. Generating high-dimensional, time-series, authentic data that provides wide coverage of the real data distribution remains challenging for both resampling and GAN-based approaches. In this work we propose the CA-GAN architecture, which addresses some of the shortcomings of current approaches, and we provide a detailed comparison with both SMOTE and WGAN-GP*, using a high-dimensional, time-series, real dataset of 3343 hypotensive Caucasian and Black patients. We show that our approach is better at both generating authentic data of the minority class and remaining within the original distribution of the real data.
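
To make the resampling baseline concrete, the following is a minimal sketch of SMOTE-style interpolation. It is not the implementation used in the paper. The function name `smote_like_oversample`, its parameters, and the toy data are hypothetical, and it assumes each patient's time series has already been flattened into a fixed-length vector. It illustrates why such methods stay close to existing samples: every synthetic point lies on a line segment between two real minority points.

```python
import random

def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (SMOTE-style).
    Illustrative only; names and defaults are assumptions."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        # The new point lies on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic

# Toy minority class: four 2-dimensional samples (hypothetical data).
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
synthetic = smote_like_oversample(minority, n_new=20)
```

Because each synthetic sample is a convex combination of two real samples, the generated data cannot leave the region spanned by the minority class. This is consistent with the reduced variability and sample correlations noted in the abstract.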

