Data Encoding For Healthcare Data Democratisation and Information Leakage Prevention

05/05/2023
by Anshul Thakur, et al.

Two obstacles hinder the development and acceptance of robust deep learning-based healthcare solutions: the lack of data democratization and information leakage from trained models. This paper argues that irreversible data encoding can address both, achieving data democratization without violating the privacy constraints imposed on healthcare data and clinical models. An ideal encoding framework transforms the data into a new space where it is imperceptible to manual or computational inspection. At the same time, the encoded data should preserve the semantics of the original data so that deep learning models can still be trained on it effectively. This paper hypothesizes the characteristics of the desired encoding framework and then exploits random projections and random quantum encoding to realize this framework for dense data and for longitudinal (time-series) data. Experimental evaluation highlights that models trained on encoded time-series data effectively uphold the information bottleneck principle and hence leak less information.
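To make the random-projection idea concrete, here is a minimal sketch of a generic Gaussian random projection applied to dense records. It is not the paper's implementation: the function names (`make_projection`, `encode`), the dimensions, the toy data, and the use of a seed as a secret key are all assumptions made for illustration. The map sends 8 features to 4, so it is many-to-one and cannot be inverted exactly, while remaining linear.

```python
import math
import random


def make_projection(in_dim, out_dim, seed):
    """Draw an out_dim x in_dim Gaussian matrix, scaled by 1/sqrt(out_dim).

    The seed is treated here as a secret key (an assumption of this sketch).
    """
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(out_dim)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(in_dim)]
            for _ in range(out_dim)]


def encode(record, projection):
    """Project one record (a list of floats) into the encoded space."""
    return [sum(w * x for w, x in zip(row, record)) for row in projection]


# Hypothetical toy data: three records with eight features each.
data_rng = random.Random(0)
records = [[data_rng.uniform(0.0, 1.0) for _ in range(8)] for _ in range(3)]

# out_dim < in_dim, so distinct inputs can share an encoding.
projection = make_projection(in_dim=8, out_dim=4, seed=42)
encoded = [encode(r, projection) for r in records]
```

With a sufficiently large output dimension, projections of this kind approximately preserve pairwise distances (the Johnson-Lindenstrauss lemma), which is one reason a model can still learn from the encoded records. Whether a given projection meets the privacy requirements described above is a separate question that this sketch does not address.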


