Scientific Workflows in Heterogeneous Edge-Cloud Computing: A Data Placement Strategy Based on Reinforcement Learning

05/14/2022
by Xin Du, et al.

The heterogeneous edge-cloud computing paradigm can provide a better solution for deploying scientific workflows than cloud computing or other traditional distributed computing environments. Because scientific datasets differ in size and some of them raise privacy concerns, it is essential to find a data placement strategy that minimizes data transmission time. Some state-of-the-art data placement strategies combine edge computing and cloud computing to distribute scientific datasets. However, dynamically distributing newly generated datasets to appropriate datacenters and removing spent datasets remain a challenge during workflow execution. To address this challenge, this study constructs a data placement model that includes datasets shared within individual workflows and among multiple workflows across various geographical regions, and proposes a two-stage data placement strategy (DYM-RL-DPS). First, during the build-time stage of workflows, we use the discrete particle swarm optimization algorithm with differential evolution to pre-allocate initial datasets to proper datacenters. Then, we reformulate the dynamic dataset distribution problem as a Markov decision process and provide a reinforcement learning-based approach to learn the optimal strategy in the runtime stage of scientific workflows. We designed comprehensive experiments in simulated heterogeneous edge-cloud computing environments to demonstrate the superiority of DYM-RL-DPS. The results show that our strategy effectively reduces data transmission time compared with other strategies.
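
To make the runtime stage more concrete, the following is a minimal sketch of how a placement problem could be cast as a Markov decision process and solved with tabular Q-learning. It is not the DYM-RL-DPS algorithm: the abstract does not specify the state, action, or reward definitions, so everything here is an assumption for illustration. The state is taken to be the index of the newly generated dataset, the action is the datacenter that receives it, and the reward is the negative of a hypothetical transmission time `times[d][c]`. The function name `learn_placement` and all parameters are likewise hypothetical.

```python
import random


def learn_placement(times, episodes=2000, alpha=0.1, gamma=0.9,
                    epsilon=0.1, seed=0):
    """Toy tabular Q-learning for dataset placement (illustrative only).

    times[d][c] is an assumed transmission time for sending dataset d
    to datacenter c. Datasets are placed one after another, so the
    state is the dataset index and the action is the datacenter index.
    """
    rng = random.Random(seed)
    n_datasets = len(times)
    n_centers = len(times[0])
    # Q-table: one row per dataset (state), one column per datacenter.
    q = [[0.0] * n_centers for _ in range(n_datasets)]

    for _ in range(episodes):
        for d in range(n_datasets):
            # Epsilon-greedy choice of a datacenter for dataset d.
            if rng.random() < epsilon:
                c = rng.randrange(n_centers)
            else:
                c = max(range(n_centers), key=lambda i: q[d][i])
            # Reward is the negative transmission time (assumption).
            reward = -times[d][c]
            # Value of the next state; zero after the last dataset.
            future = max(q[d + 1]) if d + 1 < n_datasets else 0.0
            # Standard Q-learning update.
            q[d][c] += alpha * (reward + gamma * future - q[d][c])

    # Greedy policy: the best-valued datacenter for each dataset.
    return [max(range(n_centers), key=lambda i: q[d][i])
            for d in range(n_datasets)]


if __name__ == "__main__":
    # Hypothetical transmission times: 3 datasets, 3 datacenters.
    example_times = [[4.0, 1.0, 6.0],
                     [2.0, 5.0, 3.0],
                     [7.0, 8.0, 0.5]]
    print(learn_placement(example_times))
```

In this toy setting each placement is independent of the others, so the learned policy simply recovers the fastest datacenter per dataset. A realistic model would also need to track state such as datacenter capacity, privacy constraints on where a dataset may reside, and datasets shared across workflows, none of which is modeled here.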

research
04/13/2021

Optimal Data Placement for Data-Sharing Scientific Workflows in Heterogeneous Edge-Cloud Computing Environments

The heterogeneous edge-cloud computing paradigm can provide a more optim...
research
01/22/2019

A Time-driven Data Placement Strategy for a Scientific Workflow Combining Edge Computing and Cloud Computing

Compared to traditional distributed computing environments such as grids...
research
02/09/2018

Heterogeneous and Multidimensional Clairvoyant Dynamic Bin Packing for Virtual Machine Placement

Although the public cloud still occupies the largest portion of the tota...
research
01/18/2023

HLC2: a highly efficient cross-matching framework for large astronomical catalogues on heterogeneous computing environments

Cross-matching operation, which is to find corresponding data for the sa...
research
05/23/2023

Towards Optimal Serverless Function Scaling in Edge Computing Network

Serverless computing has emerged as a new execution model which gained a...
research
04/16/2017

Learn-Memorize-Recall-Reduce A Robotic Cloud Computing Paradigm

The rise of robotic applications has led to the generation of a huge vol...
research
02/21/2022

Hybrid Learning for Orchestrating Deep Learning Inference in Multi-user Edge-cloud Networks

Deep-learning-based intelligent services have become prevalent in cyber-...
