Challenges and Opportunities for RISC-V Architectures towards Genomics-based Workloads

06/27/2023
by Gonzalo Gomez-Sanchez, et al.

Large-scale supercomputing architectures are a hard requirement for scientific Big-Data applications. One example is genomics analytics, where millions of data transformations and tests per patient are needed to find relevant clinical indicators. To ensure open and broad access to high-performance technologies, governments and academia are therefore pushing toward the introduction of novel computing architectures in large-scale scientific environments. This is the case of RISC-V, an open and royalty-free instruction-set architecture. To evaluate such technologies, we present the Variant-Interaction Analytics use case, together with its benchmarking suite and datasets. In this use case, we search for possible genetic interactions using computational and statistical methods, which provides a representative case of heavy ETL (Extract, Transform, Load) data processing. Current implementations run on x86-based supercomputers (e.g., MareNostrum-IV at the Barcelona Supercomputing Center (BSC)), and future steps propose RISC-V as part of the next MareNostrum generations. Here we describe the Variant Interaction use case, highlight the characteristics that leverage high-performance computing, and indicate the caveats and challenges for upcoming RISC-V developments and designs, based on a first comparison between x86 and RISC-V architectures running real Variant Interaction executions on real hardware implementations.


