Analyzing Performance Properties Collected by the PerSyst Scalable HPC Monitoring Tool
The ability to understand how a scientific application is executed on a large HPC system is of great importance in allocating resources within the HPC data center. In this paper, we describe how we used system performance data to identify: execution patterns, possible code optimizations and improvements to the system monitoring. We also identify candidates for employing machine learning techniques to predict the performance of similar scientific codes.
READ FULL TEXT