Defining Big Data Analytics Benchmarks for Next Generation Supercomputers

11/06/2018
by Drew Schmidt, et al.

The design and construction of high performance computing (HPC) systems rely on exhaustive performance analysis and benchmarking. Traditionally, this activity has been geared exclusively towards simulation scientists, who, unsurprisingly, have been the primary customers of HPC for decades. However, a large and growing volume of data science work requires these large-scale resources, and calls for inclusion of and investment in data for HPC have been increasing as a result. Designing a next generation HPC platform therefore requires HPC-amenable big data analytics benchmarks. In this paper, we propose a set of big data analytics benchmarks and sample codes designed for testing the capabilities of current and next generation supercomputers.

research
02/01/2018

Big Data Dwarfs: Towards Fully Understanding Big Data Analytics Workloads

Though the big data benchmark suites like BigDataBench and CloudSuite ha...
research
01/23/2018

Task-parallel Analysis of Molecular Dynamics Trajectories

Different frameworks for implementing parallel data analytics applicatio...
research
08/23/2017

Big Data Meets HPC Log Analytics: Scalable Approach to Understanding Systems at Extreme Scale

Today's high-performance computing (HPC) systems are heavily instrumente...
research
02/11/2019

Scaling Big Data Platform for Big Data Pipeline

Monitoring and Managing High Performance Computing (HPC) systems and env...
research
12/16/2020

An Integrated Platform for Collaborative Data Analytics

While collaboration among data scientists is a key to organizational pro...
research
08/15/2017

GARDENIA: A Domain-specific Benchmark Suite for Next-generation Accelerators

This paper presents the Graph Analytics Repository for Designing Next-ge...
