HySec-Flow: Privacy-Preserving Genomic Computing with SGX-based Big-Data Analytics Framework

07/26/2021
by   Chathura Widanage, et al.

Trusted execution environments (TEEs) such as Intel's Software Guard Extensions (SGX) have been widely studied to boost security and privacy protection for the computation of sensitive data such as human genomic data. However, SGX often imposes a performance hurdle, especially because of the small enclave memory. In this paper, we propose a new Hybrid Secured Flow framework (called "HySec-Flow") for large-scale genomic data analysis using SGX platforms. Here, the data-intensive computing tasks can be partitioned into independent subtasks that are deployed into distinct secured and non-secured containers, allowing for parallel execution while alleviating the limited size of the Enclave Page Cache (EPC) memory in each enclave. We illustrate our contributions using a workflow supporting indexing, alignment, dispatching, and merging the execution of SGX-enabled containers. We provide details regarding the architecture of the trusted and untrusted components and the underlying SCONE and Graphene support as generic shielding execution frameworks to port legacy code. We thoroughly evaluate the performance of our privacy-preserving reads mapping algorithm using real human genome sequencing data. The results demonstrate that partitioning the time-consuming genomic computation into subtasks improves performance compared to the conventional execution of the data-intensive reads mapping algorithm in a single enclave. The proposed HySec-Flow framework is made available as open source and can be adapted to the data-parallel computation of other large-scale genomic tasks requiring security and scalable computational resources.
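The partition, dispatch, and merge pattern described in the abstract can be illustrated with a minimal Python sketch. It is not the HySec-Flow implementation: the function names, the byte budget standing in for the EPC limit, the thread pool standing in for SGX-enabled containers, and the exact-match lookup standing in for reads mapping are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple

# Hypothetical per-subtask memory budget in bytes. Real EPC capacity
# depends on the SGX platform and is not taken from the paper.
CHUNK_BUDGET_BYTES = 64


def partition_reads(reads: List[str], budget: int = CHUNK_BUDGET_BYTES) -> List[List[str]]:
    """Greedily group reads into chunks whose total length fits the budget."""
    chunks: List[List[str]] = []
    current: List[str] = []
    used = 0
    for read in reads:
        size = len(read)
        # Start a new chunk when adding this read would exceed the budget.
        if current and used + size > budget:
            chunks.append(current)
            current, used = [], 0
        current.append(read)
        used += size
    if current:
        chunks.append(current)
    return chunks


def map_chunk(reference: str, chunk: List[str]) -> List[Tuple[str, int]]:
    """Stand-in for the work done inside one secured container.

    Uses exact substring search; -1 means the read was not found.
    """
    return [(read, reference.find(read)) for read in chunk]


def run_workflow(reference: str, reads: List[str],
                 budget: int = CHUNK_BUDGET_BYTES, workers: int = 4) -> Dict[str, int]:
    """Partition the reads, process chunks in parallel, and merge the results."""
    chunks = partition_reads(reads, budget)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_results = list(pool.map(lambda c: map_chunk(reference, c), chunks))
    merged: Dict[str, int] = {}
    for part in partial_results:
        for read, position in part:
            merged[read] = position
    return merged
```

Because each chunk is processed independently, the per-subtask working set is bounded by the chosen budget rather than by the size of the full read set, which is the property the abstract relies on to ease the enclave memory limit.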

research
10/04/2021

AsymML: An Asymmetric Decomposition Framework for Privacy-Preserving DNN Training and Inference

Leveraging parallel hardware (e.g. GPUs) to conduct deep neural network ...
research
06/19/2019

Efficient privacy preservation of big data for accurate data mining

Computing technologies pervade physical spaces and human lives, and prod...
research
12/08/2022

HyperEnclave: An Open and Cross-platform Trusted Execution Environment

A number of trusted execution environments (TEEs) have been proposed by ...
research
04/09/2019

Enabling Privacy-Preserving, Compute- and Data-Intensive Computing using Heterogeneous Trusted Execution Environment

There is an urgent demand for privacy-preserving techniques capable of s...
research
07/10/2020

COBRA: Compression via Abstraction of Provenance for Hypothetical Reasoning

Data analytics often involves hypothetical reasoning: repeatedly modifyi...
research
10/20/2019

Micro-level Modularity of Computation-intensive Programs in Big Data Platforms: A Case Study with Image Data

With the rapid advancement of Big Data platforms such as Hadoop, Spark, ...
research
04/07/2021

Contingency Analysis Based on Partitioned and Parallel Holomorphic Embedding

In the steady-state contingency analysis, the traditional Newton-Raphson...
