A Time-driven Data Placement Strategy for a Scientific Workflow Combining Edge Computing and Cloud Computing

01/22/2019
by Bing Lin, et al.

Compared to traditional distributed computing environments such as grids, cloud computing provides a more cost-effective way to deploy scientific workflows. Each task of a scientific workflow requires several large datasets that are located in different datacenters of the cloud computing environment, which causes serious data transmission delays. Edge computing reduces these delays and supports storing a workflow's private datasets at fixed locations, but its storage capacity is a bottleneck. The challenge is to combine the advantages of edge computing and cloud computing so that the datasets of a scientific workflow are placed sensibly and the data transmission time across datacenters is reduced. Traditional data placement strategies maintain load balancing across a given number of datacenters, which results in long data transmission times. This study proposes a self-adaptive discrete particle swarm optimization algorithm with genetic algorithm operators (GA-DPSO) to optimize the data transmission time when placing data for a scientific workflow. The approach accounts for the characteristics of data placement in a combined edge and cloud environment, as well as the factors that affect transmission delay, such as the bandwidth between datacenters, the number of edge datacenters, and the storage capacity of edge datacenters. The crossover and mutation operators of the genetic algorithm are adopted to avoid the premature convergence of traditional particle swarm optimization; they increase the diversity of population evolution and effectively reduce the data transmission time. Experimental results show that the data placement strategy based on GA-DPSO effectively reduces the data transmission time during workflow execution in a combined edge and cloud environment.
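
To make the idea in the abstract more concrete, the following is a minimal Python sketch of a discrete particle swarm whose update step uses genetic mutation and crossover. It is not the authors' implementation: the abstract gives no formulas, so the encoding (one datacenter index per dataset), the cost model (dataset size divided by bandwidth), the fixed operator probabilities `w`, `c1`, `c2`, and every function name here are assumptions made for illustration. The self-adaptive parameter tuning mentioned in the abstract is left out.

```python
import random

def transfer_time(placement, sizes, tasks, bandwidth):
    """Total time to move each needed dataset to the datacenter running the task.
    Assumed cost model: size / bandwidth, zero when the data is already local."""
    total = 0.0
    for task_dc, needed in tasks:
        for d in needed:
            src = placement[d]
            if src != task_dc:
                total += sizes[d] / bandwidth[src][task_dc]
    return total

def feasible(placement, sizes, capacity):
    """True if no datacenter stores more than its capacity."""
    used = [0.0] * len(capacity)
    for d, dc in enumerate(placement):
        used[dc] += sizes[d]
    return all(u <= c for u, c in zip(used, capacity))

def fitness(placement, sizes, tasks, bandwidth, capacity):
    # Tuples compare left to right, so every feasible placement (False, t)
    # ranks ahead of every infeasible one (True, t).
    return (not feasible(placement, sizes, capacity),
            transfer_time(placement, sizes, tasks, bandwidth))

def mutate(p, n_dc, rng):
    """Stand-in for the inertia term: move one random dataset."""
    child = list(p)
    child[rng.randrange(len(child))] = rng.randrange(n_dc)
    return child

def crossover(p, guide, rng):
    """Stand-in for the cognitive/social terms: copy a segment from a guide."""
    i, j = sorted(rng.sample(range(len(p) + 1), 2))
    return p[:i] + guide[i:j] + p[j:]

def ga_dpso(sizes, tasks, bandwidth, capacity,
            particles=20, iters=100, w=0.5, c1=0.7, c2=0.7, seed=0):
    rng = random.Random(seed)
    n_dc, n_ds = len(capacity), len(sizes)
    score = lambda p: fitness(p, sizes, tasks, bandwidth, capacity)
    swarm = [[rng.randrange(n_dc) for _ in range(n_ds)] for _ in range(particles)]
    pbest = [list(p) for p in swarm]
    gbest = min(pbest, key=score)
    for _ in range(iters):
        for k, p in enumerate(swarm):
            if rng.random() < w:
                p = mutate(p, n_dc, rng)
            if rng.random() < c1:
                p = crossover(p, pbest[k], rng)   # pull toward personal best
            if rng.random() < c2:
                p = crossover(p, gbest, rng)      # pull toward global best
            swarm[k] = p
            if score(p) < score(pbest[k]):
                pbest[k] = p
        gbest = min(pbest + [gbest], key=score)
    return gbest, score(gbest)
```

A call such as `ga_dpso(sizes, tasks, bandwidth, capacity)` returns the best placement found together with its `(infeasible, transfer_time)` score, where `tasks` is a list of `(datacenter_index, [dataset_indices])` pairs and a cloud datacenter can be given capacity `float("inf")`.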

