Design and Implementation of ShenWei Universal C/C++
The ShenWei many-core series processors powering multiple cutting-edge supercomputers are equipped with their unique on-chip heterogeneous architecture. They have long required programmers to write separate codes for the control part on Management Processing Element (MPE) and accelerated part on Compute Processing Element (CPE), which is similar to open standards like OpenCL. Such a programming model results in shattered code and bad maintainability, and also make it hard to migrate existing projects targeting commodity processors. Borrowing the experience from CUDA and DPC++ and leveraging the unique unified main memory on ShenWei many-core architecture, we propose ShenWei Universal C/C++ (SWUC), a language extension to C/C++ that enables fluent programming acrossing the boundary of MPE and CPE. Through the use of several new attributes and compiler directives, users are able to write codes running on MPE and CPE in a single file. In case of C++, SWUC further support lambda expressions on CPEs, making it possible to have the code flow better matching the logical design. SWUC also manages to make the Athread library interfaces available, easing the learning curve for original ShenWei users. These powerful features together ensures SWUC to simplify the programming on ShenWei many-core processors and migration of existing C/C++ applications.
READ FULL TEXT