Enabling Dynamic and Intelligent Workflows for HPC, Data Analytics, and AI Convergence

04/20/2022
by Jorge Ejarque et al.

The evolution of High-Performance Computing (HPC) platforms enables the design and execution of progressively larger and more complex workflow applications. The complexity comes not only from the number of elements that compose the workflows but also from the type of computations they perform. While traditional HPC workflows target simulations and modelling of physical phenomena, current needs also require data analytics (DA) and artificial intelligence (AI) tasks. However, the development of these workflows is hampered by the lack of proper programming models and environments that support the integration of HPC, DA, and AI, as well as the lack of tools to easily deploy and execute the workflows on HPC systems. To progress in this direction, this paper presents use cases where complex workflows are required and investigates the main issues to be addressed for HPC/DA/AI convergence. Based on this study, the paper identifies the challenges of a new workflow platform to manage complex workflows. Finally, it proposes a development approach for such a workflow platform that addresses these challenges in two directions: first, by defining a software stack that provides the functionalities to manage these complex workflows; and second, by proposing the HPC Workflow as a Service (HPCWaaS) paradigm, which leverages the software stack to facilitate the reusability of complex workflows in federated HPC infrastructures. The proposals presented in this work are subject to study and development as part of the EuroHPC eFlows4HPC project.
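The core of the HPCWaaS paradigm described above is a split between two roles: workflow developers, who register a workflow once, and end users, who invoke it by name without needing to know the underlying HPC deployment details. The following minimal Python sketch models that register-once/invoke-many pattern with an in-memory registry; all names are hypothetical illustrations, not the actual eFlows4HPC API.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict

# Hypothetical sketch of the HPCWaaS idea: a developer-facing register()
# publishes a reusable workflow; a user-facing invoke() runs it by name.
# In a real federated setting the registry would be a service in front of
# HPC schedulers, not an in-process dictionary.


@dataclass
class WorkflowRegistry:
    _workflows: Dict[str, Callable[..., Any]] = field(default_factory=dict)

    def register(self, name: str, workflow: Callable[..., Any]) -> None:
        """Developer-facing: publish a workflow under a stable name."""
        self._workflows[name] = workflow

    def invoke(self, name: str, **params: Any) -> Any:
        """User-facing: execute a published workflow with parameters,
        without exposing where or how it is deployed."""
        if name not in self._workflows:
            raise KeyError(f"unknown workflow: {name}")
        return self._workflows[name](**params)


registry = WorkflowRegistry()
registry.register("simulate", lambda steps: f"ran {steps} steps")
print(registry.invoke("simulate", steps=10))  # -> ran 10 steps
```

The point of the indirection is that `invoke` is the only surface an end user sees; the registered callable can hide job submission, data staging, and site selection behind it.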


