O2ATH: An OpenMP Offloading Toolkit for the Sunway Heterogeneous Manycore Platform

09/10/2023
by   Haoran Lin, et al.

The next-generation Sunway supercomputer employs the SW26010pro processor, which features a specialized on-chip heterogeneous architecture. Applications with significant hotspots can benefit from the greatly increased compute capacity of the Sunway many-core architecture, but only through intensive manual many-core parallelization. However, some legacy projects with large codebases, such as CESM, ROMS and WRF, do not have significant hotspots, and the cost of manually porting them to the Sunway architecture is almost unaffordable. To overcome this challenge, we have developed a toolkit named O2ATH. O2ATH forwards GNU OpenMP runtime library calls to Sunway's Athread library, which greatly simplifies parallelization work on the Sunway architecture. O2ATH enables users to write both MPE and CPE code in a single file, and parallelization is achieved with OpenMP directives and attributes. In practice, O2ATH has helped us port two large projects, CESM and ROMS, to the CPEs of the next-generation Sunway supercomputers via the OpenMP offload method. In our experiments, kernel speedups range from 3 to 15 times, resulting in whole-application speedups of 3 to 6 times. Furthermore, O2ATH requires significantly fewer code modifications than manually crafting CPE functions. This indicates that O2ATH can greatly enhance development efficiency when porting or optimizing large software projects on Sunway supercomputers.
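
To illustrate the single-file, directive-based style the abstract describes, the following is a minimal sketch in standard OpenMP C. It is not taken from the paper: the function name axpy_offload and the particular combination of directive clauses are assumptions chosen for illustration, and the abstract does not state which OpenMP constructs or attributes O2ATH actually accepts. The sketch only shows how a loop in host code can be marked for offload without writing a separate accelerator function.

```c
#include <stddef.h>

/* Hypothetical kernel (name and signature are illustrative assumptions):
 * computes y[i] = a * x[i] + y[i] for i in [0, n). */
void axpy_offload(long n, double a, const double *x, double *y)
{
    /* Standard OpenMP offload directive. With a toolkit such as O2ATH,
     * the runtime calls generated for a region like this would be the
     * ones forwarded to the accelerator library; the exact set of
     * supported clauses is an assumption here. Without OpenMP enabled,
     * the pragma is ignored and the loop runs serially on the host. */
    #pragma omp target teams distribute parallel for \
        map(to: x[0:n]) map(tofrom: y[0:n])
    for (long i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

The loop body stays in the same file as the calling code, which is the property the abstract highlights; a hand-written port would instead move the body into a separate CPE function and launch it explicitly.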
