A Time-driven Data Placement Strategy for a Scientific Workflow Combining Edge Computing and Cloud Computing

by Bing Lin, et al.

Compared to traditional distributed computing environments such as grids, cloud computing provides a more cost-effective way to deploy scientific workflows. Each task of a scientific workflow requires several large datasets that are located in different datacenters of the cloud computing environment, which causes serious data transmission delays. Edge computing reduces these delays and supports fixed-location storage of a workflow's private datasets, but its storage capacity is a bottleneck. The challenge is to combine the advantages of edge computing and cloud computing so that scientific workflow data is placed sensibly and the data transmission time across datacenters is minimized. Traditional data placement strategies maintain load balancing across a given number of datacenters, which results in a large data transmission time. This study proposes a self-adaptive discrete particle swarm optimization algorithm with genetic algorithm operators (GA-DPSO) to optimize the data transmission time when placing data for a scientific workflow. The approach considers the characteristics of data placement in a combined edge and cloud environment, as well as the factors that affect transmission delay, such as the bandwidth between datacenters, the number of edge datacenters, and the storage capacity of edge datacenters. The crossover and mutation operators of the genetic algorithm are adopted to avoid the premature convergence of the traditional particle swarm optimization algorithm; they increase the diversity of population evolution and effectively reduce the data transmission time. The experimental results show that the data placement strategy based on GA-DPSO can effectively reduce the data transmission time during workflow execution in a combined edge and cloud computing environment.
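To make the idea concrete, the following is a minimal Python sketch of a discrete PSO whose particle update uses GA-style mutation and crossover, in the spirit of the approach described above. The abstract does not give the encoding, the fitness function, or the update rule, so everything here is an assumption for illustration: a particle is a list mapping each dataset to a datacenter, each task runs at a fixed datacenter, transfer time is dataset size divided by bandwidth, and placements that exceed a storage capacity are treated as infeasible. The names `transfer_time`, `crossover`, `mutate`, `ga_dpso`, and the probabilities `w`, `c1`, `c2` are hypothetical, and the probabilities are fixed rather than self-adaptive.

```python
import random

def transfer_time(placement, sizes, tasks, bandwidth, capacity):
    """Total transfer time of a placement; inf if any capacity is exceeded."""
    used = [0.0] * len(capacity)
    for d, dc in enumerate(placement):
        used[dc] += sizes[d]
    if any(u > c for u, c in zip(used, capacity)):
        return float("inf")
    total = 0.0
    for task_dc, needed in tasks:
        for d in needed:
            src = placement[d]
            if src != task_dc:  # data must move to where the task runs
                total += sizes[d] / bandwidth[src][task_dc]
    return total

def crossover(particle, guide, rng):
    """GA-style crossover: copy a random contiguous segment of guide."""
    i, j = sorted(rng.sample(range(len(particle) + 1), 2))
    return particle[:i] + guide[i:j] + particle[j:]

def mutate(particle, n_dc, rng):
    """GA-style mutation: reassign one random dataset to a random datacenter."""
    child = particle[:]
    child[rng.randrange(len(child))] = rng.randrange(n_dc)
    return child

def ga_dpso(sizes, tasks, bandwidth, capacity,
            swarm=20, iters=100, w=0.3, c1=0.6, c2=0.6, seed=0):
    """Assumed update: mutate (inertia), then cross with pbest and gbest."""
    rng = random.Random(seed)
    n_d, n_dc = len(sizes), len(capacity)

    def fit(p):
        return transfer_time(p, sizes, tasks, bandwidth, capacity)

    particles = [[rng.randrange(n_dc) for _ in range(n_d)] for _ in range(swarm)]
    pbest = [p[:] for p in particles]
    gbest = min(pbest, key=fit)[:]
    for _ in range(iters):
        for k, p in enumerate(particles):
            if rng.random() < w:
                p = mutate(p, n_dc, rng)
            if rng.random() < c1:
                p = crossover(p, pbest[k], rng)
            if rng.random() < c2:
                p = crossover(p, gbest, rng)
            particles[k] = p
            if fit(p) < fit(pbest[k]):
                pbest[k] = p[:]
                if fit(p) < fit(gbest):
                    gbest = p[:]
    return gbest, fit(gbest)

# Toy instance (all numbers invented): datacenters 0 and 1 are edge nodes
# with limited storage, datacenter 2 is a cloud with unlimited storage.
sizes = [4, 6, 8, 3]                            # dataset sizes
tasks = [(0, [0, 1]), (1, [2, 3]), (0, [3])]    # (task datacenter, datasets)
bandwidth = [[1, 10, 1], [10, 1, 1], [1, 1, 1]]  # diagonal entries are unused
capacity = [10, 12, float("inf")]

best, cost = ga_dpso(sizes, tasks, bandwidth, capacity)
print(best, cost)
```

In this toy instance, placing every dataset in the cloud costs 24 time units, while keeping datasets on the edge nodes that consume them costs far less, which is the trade-off between edge storage limits and transmission time that the abstract describes.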






