SCOPE: C3SR Systems Characterization and Benchmarking Framework

09/18/2018 ∙ by Carl Pearson, et al. ∙ IBM ∙ University of Illinois at Urbana-Champaign

This report presents the design of the Scope infrastructure for extensible and portable benchmarking. Improvements in high-performance computing systems rely on coordination across different levels of system abstraction. Developing and defining accurate performance measurements is necessary at all levels of the system hierarchy, and should be as accessible as possible to developers with different backgrounds. The Scope project aims to lower the barrier to entry for developing performance benchmarks by providing a software architecture that allows benchmarks to be developed independently, by providing useful C/C++ abstractions and utilities, and by providing a Python package for generating publication-quality plots of resulting measurements.


I Motivation

The IBM-ILLINOIS Center for Cognitive Computing Systems Research (C3SR) is a long-term collaboration between IBM Research and the University of Illinois at Urbana-Champaign focused on developing advanced Cognitive Computing and Artificial Intelligence (AI) systems that are optimized across the vertical stacks of AI solutions, software, and hardware systems. In particular, C3SR aims to develop technologies to improve cognitive application developers’ productivity on heterogeneous infrastructure [1].

This effort requires system characterization and performance measurement at all levels of computer system abstraction, including processors (such as x86, POWER, and ARM cores), system communication links (such as PCIe, NVLink, and CAPI/OpenCAPI), special system accelerators (both discrete ones such as FPGAs and integrated ones such as Tensor Cores), libraries (such as CUDA and cuDNN), and frameworks (such as TensorFlow, Caffe, and PyTorch).

Instead of developing ad hoc performance measurement solutions for each task, we decided to develop a common system characterization and benchmarking infrastructure and tooling for all of our tasks, providing uniformity across them.

Over time, this infrastructure has proven not only useful but also a significant boost to our productivity. A number of interesting systems projects at C3SR have already benefited from having such an infrastructure and tooling. Therefore, we decided to open-source this infrastructure so that other research teams interested in systems work can also benefit from it. Hence the genesis of this C3SR project: SCOPE.

II Design Philosophy

This whitepaper describes the motivation and design of the initial release (v1.0) of the Scope benchmarking infrastructure. Scope is designed around three primary goals:

  • extensibility: It should be easy for external groups to develop independent benchmarks without requiring centralized coordination with the Scope project. This allows different teams to develop their own measurement tools to fit their own needs.

  • portability: The Scope infrastructure should support as many different systems as possible. Though individual benchmarks may have specific requirements, Scope itself should not be a barrier to running benchmarks on a particular system. Scope has been tested on POWER8/POWER9- and x86_64-based systems, but should support any system that has a C++11 compiler and the CUDA [2] toolkit.

  • development silos: New groups of benchmarks should be able to be open- or closed-source. Each group of benchmarks may have its own software dependencies, compiler feature requirements, or other necessities, and those requirements should not be globally propagated to all benchmark code. This allows Scope to remain system-agnostic and useful to the widest possible audience.

The Scope infrastructure consists of three kinds of software components. First, the SCOPE repository (Section III) manages configuration and compilation and provides shared utility functions for the benchmarks. Second, scopes (Section IV) define groups of benchmarks, with their own optional dependencies and utilities. Third, the ScopePlot project (Section V) provides a Python package for plotting and manipulating Scope results. Figure 1 shows the relationship between Scope infrastructure components.

Fig. 1: Diagram of Scope architecture. The Scope infrastructure is divided into three components. The SCOPE repository contains code to initialize and run registered scopes and contains the main function (the entry point) of the SCOPE binary. Scopes are separate structured directories, usually Git repositories, that contain the actual benchmark code. The ScopePlot repository holds the Python code that implements the ScopePlot Python package. Running the SCOPE binary produces a benchmark data file, which can be consumed by ScopePlot to produce plots of results.

SCOPE and ScopePlot make heavy use of existing tools and libraries, notably Git [3] source control, CMake [4] compilation configuration, the Google Benchmark library [5], matplotlib [6], bokeh [7], and pandas [8].

II-A Licensing, Hosting, and Contributing

The Scope infrastructure is free and open source, licensed under the Apache 2.0 license [9]. Scope welcomes contributions. See CONTRIBUTING.md in the Scope source tree for up-to-date information about contributing. Table I lists the URLs for Scope infrastructure components.

Component URL
Scope https://github.com/c3sr/scope
ScopePlot https://github.com/c3sr/scope_plot
TABLE I: Hosting Locations

III SCOPE Repository

The SCOPE repository (github.com/c3sr/scope) is the entry point for building and running benchmarks. SCOPE is maintained as a Git repository to provide an open revision history and broad accessibility. SCOPE itself does not contain any benchmark code; instead, it has the following responsibilities:

  • retrieve benchmark code

  • configure the SCOPE binary compilation

  • fetch dependencies

  • provide common utilities

  • provide initialization hooks

Fig. 2: Stages of building Scope. (a) The download stage, where SCOPE code is downloaded. (b) The configuration stage, where scopes are downloaded and enabled, and dependencies are downloaded and built. (c) The build stage, where the SCOPE binary is produced. (d) The run stage, where the benchmarks are run.

III-A Retrieving Code

During the download stage, Figure 2(a), the SCOPE source code is retrieved.

III-B Configuring the SCOPE Binary Compilation and Fetching Dependencies

Since the SCOPE repository does not contain any benchmark code, a user will typically provide benchmark code by including additional scopes (Section IV) during the build process. Scopes are semi-independent packages that contain benchmark code. For development isolation, scopes are maintained as separate projects and are included in SCOPE as CMake-aware projects in the scopes directory of the SCOPE source tree.

SCOPE includes several completed scopes (Table IV) as Git submodules. When scopes are added as submodules, it is easy to associate each SCOPE release with a specific revision of benchmark code, and the correct version of each scope can be automatically downloaded when downloading the whole SCOPE project. Other scopes can be manually added to the scopes directory.

Scope uses CMake [4] to configure the Scope binary compilation during the configuration stage, shown in Figure 2(b). CMake simplifies the process of supporting a variety of compilers and systems as well as fetching dependencies. Through CMake, the Git submodules in SCOPE are downloaded and the fixed revisions are checked out. SCOPE then invokes CMake’s add_subdirectory command on all directories in the scopes directory to add those scopes to the SCOPE binary build. Each scope exports a CMake Object Library [10], which is a CMake target that represents a set of object files containing the implementation of that scope, as well as associated information about required dependencies for those object files.

Since benchmarks may only be compatible with particular systems, Scope allows conditional compilation of benchmarks. By selectively including or excluding object libraries during configuration, Scope may selectively include benchmark code and associated dependencies. Each scope’s CMakeLists.txt may define options for enabling or disabling the scope. For example, cmake -DENABLE_EXAMPLE=ON scope-source-directory directs CMake to include object files and dependencies from ExampleScope (Section IV-C) when building the Scope binary.

Scope relies on Hunter [11], a package manager for CMake, to fetch dependencies. Some dependencies, such as spdlog, are present in Scope so that all benchmarks may use them. Table II shows the libraries included in Scope v1.0.0. Other dependencies are only required for certain benchmarks. For example, CommScope requires libnuma [12] for pinning allocations or processes to memory regions on Linux systems, even though Scope itself does not.

Hunter downloads, configures, and builds the C++ library dependencies that SCOPE relies on. Individual scopes may also use Hunter to retrieve their dependencies, or they may require the user to provide them in some other way (for example, as libraries present on the system). If a scope uses Hunter, the scope’s dependencies will only be retrieved if that scope was enabled in the configuration step. All downloads through Hunter occur at the end of the configuration step, so that the required library and header files are present for the build.

Library
Google Benchmark v1.4.0 [5]
spdlog v0.16.3 [13]
fmt v4.1.0 [14]
clara v1.1.4 [15]
TABLE II: Scope Library Dependencies

III-C Compiling SCOPE

During the build stage, Figure 2(c), object files for all enabled scopes are produced and linked into a single binary.

In this step, every source file that was included in a scope submodule object library is compiled into an object file. The source files that implement the Scope infrastructure helpers are also compiled into object files, and then all of the objects are linked together in a single step to form the scope binary.

III-D Running SCOPE

Finally, during the run stage, Figure 2(d), running the produced SCOPE binary allows the user to select any subset of the included benchmarks and write the resulting data file to any location.
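As a rough illustration of the run stage, the sketch below assembles a command line for the SCOPE binary from Python. It assumes that the binary accepts Google Benchmark's standard --benchmark_filter, --benchmark_out, and --benchmark_out_format flags; the binary path ./scope, the benchmark name pattern, and the helper name scope_command are hypothetical.

```python
import shlex


def scope_command(binary, name_regex, out_path):
    """Build an argv list that runs only benchmarks matching name_regex.

    Assumption: the binary forwards Google Benchmark's standard flags.
    """
    return [
        binary,
        "--benchmark_filter=" + name_regex,
        "--benchmark_out=" + out_path,
        "--benchmark_out_format=json",
    ]


if __name__ == "__main__":
    # Hypothetical binary path, name pattern, and output location.
    argv = scope_command("./scope", "Comm_.*", "results/comm.json")
    print(" ".join(shlex.quote(arg) for arg in argv))
```

The returned list can be passed directly to subprocess.run, which avoids shell quoting issues in the regular expression.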

III-E Providing Common Utilities

Scope provides a set of utilities that scopes may use. The interfaces for these utilities are available to the scopes through C++ header files. When scopes include those headers, the implementations will be available when all objects are linked together.

  • The entire Google Benchmark library is provided to configure and register the benchmark code.

  • CUDA error checking is provided, as most extant benchmarks are CUDA benchmarks.

  • Logging is provided so that scopes may have a consistent output mechanism.

  • C++ functions are provided for declaring new command line options.

  • A C++ function is provided for executing pre-benchmark initialization code.

  • Convenience CMake functions are provided for integrating with SCOPE.

Scope also provides a tools/generate_sugar_files.py Python script for generating Sugar-compatible [16] CMake files in each scope source tree. Scope uses Sugar to read the sugar.cmake files, which tell CMake where the CUDA and C++ source files for Scope and the benchmarks are. This script is provided so that the sugar.cmake files can be quickly regenerated during development of each scope. The script will create sugar.cmake files that export the CMake variables described in Table III. <VAR> is replaced by the string passed to the --var flag.

Variable Contents
<VAR>_SOURCES C/C++ source files
<VAR>_HEADERS C/C++ header files
<VAR>_CUDA_SOURCES CUDA source files
<VAR>_CUDA_HEADERS CUDA header files
TABLE III: CMake variables created by sugar.cmake files
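The grouping that such a script performs can be sketched in Python as follows. This is a minimal illustration, not the actual tools/generate_sugar_files.py: the extension-to-category mapping and the classify helper are assumptions, and only the variable name suffixes come from Table III.

```python
import os

# Assumed extension-to-category mapping; the real script may differ.
SUFFIXES = {
    "SOURCES": (".c", ".cc", ".cpp"),
    "HEADERS": (".h", ".hpp"),
    "CUDA_SOURCES": (".cu",),
    "CUDA_HEADERS": (".cuh",),
}


def classify(var, filenames):
    """Group filenames under <VAR>_SOURCES, <VAR>_HEADERS, and so on."""
    groups = {var + "_" + kind: [] for kind in SUFFIXES}
    for name in sorted(filenames):
        ext = os.path.splitext(name)[1].lower()
        for kind, exts in SUFFIXES.items():
            if ext in exts:
                groups[var + "_" + kind].append(name)
    return groups


if __name__ == "__main__":
    # Hypothetical file names; README.md matches no category and is skipped.
    files = ["init.cpp", "init.hpp", "kernel.cu", "kernel.cuh", "README.md"]
    for variable, members in classify("example", files).items():
        print(variable, members)
```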

III-F Providing CMake Functions

SCOPE provides several CMake functions to help scopes integrate with the SCOPE CMake build. The scope_add_library function is a wrapper around add_library, which also sets the SCOPE_NEW_TARGET variable so that SCOPE can link against the CMake Object Library defined in the scope. The target_include_scope_directories function causes SCOPE’s utility include directories to be added to the compilation of the scope so that the SCOPE utilities can be used. SCOPE also provides scope_status, scope_warning, and scope_fatal, which scopes may use in their CMakeLists.txt files to print messages that will be visible at configure time.

III-G Initialization Hooks and Command-line Options

SCOPE provides the ability for benchmarks to run arbitrary initialization when the binary is executed. Scopes may register clara::Opts to create new command line arguments accepted by the SCOPE binary. Scopes may also register arbitrary code to be executed before command line arguments are parsed, or after arguments are parsed but before any benchmarks are executed. Through these routines, benchmarks may do any unguided or user-directed initialization desired before benchmark execution.

IV Design of SCOPE Submodules

SCOPE does not include any benchmark code; all benchmarks are provided through individual scopes. Scopes are structured directories in the Scope source tree, included in Scope through the CMake add_subdirectory command (Section III-B). As of this writing, eight scopes are either completed or in development, to measure various aspects of system compute and communication performance at different levels of abstraction. Table IV describes these scopes.

Abstraction    Name         Status       Description
Hardware       TCUScope     Completed    Nvidia GPU tensor cores
Data Transfer  CommScope    Released     Nvidia CPU-GPU communication [17]
Data Transfer  I/OScope     In Progress  Disk I/O operations
Data Transfer  NCCLScope    In Progress  Nvidia’s NCCL library
Compute        cdDNNScope   Released     Neural-network operations
Compute        InstrScope   In Progress  Instruction latencies and throughput
Compute        HistoScope   In Progress  Nvidia GPU histogramming
Compute        LinAlgScope  In Progress  Linear algebra operations
TABLE IV: Completed or in-progress scopes

IV-A Defining a CMake Object Library

Each scope needs to include a CMakeLists.txt at the top level of its directory. This allows it to be included in the Scope configuration through the add_subdirectory command in the Scope CMakeLists.txt. Fundamentally, each scope only needs to define a CMake object library target through the object form of the add_library command [10]. ExampleScope defines the example_scope target. Scope suggests using Sugar and the provided tooling (Section III-E) to automate the process of providing source files to the scope_add_library command. The scope target may have its own dependencies or other constraints. ExampleScope marks itself as requiring C++11 through the target_compile_features CMake command. This tells CMake that objects in ExampleScope should be built with C++11 support in the compiler. ExampleScope also adds include directories, a CUDA language standard, and a requirement to link against the Google Benchmark library to the example_scope target. These requirements will be propagated to the entire Scope build as required.

IV-B Integration with Scope through Git Submodules

For development isolation, scopes are maintained as independent directories. If the scope is maintained as a Git repository, the scope can have its own versioning and revision history. Additionally, this allows the scope to be included in the SCOPE repository as a Git submodule. Git submodules can be automatically downloaded alongside Scope when Scope is downloaded, and pinned to the appropriate version.

IV-C Example Scope

The Scope project provides a template scope ExampleScope [18] that demonstrates how a new scope can be structured. ExampleScope is available at https://github.com/c3sr/example_scope. ExampleScope demonstrates the following required or suggested structures:

IV-C1 CMakeLists.txt (required) and Sugar (optional)

This file defines a CMake object library and uses Sugar to parse the sugar.cmake files in the ExampleScope source tree.

IV-C2 Code Structure (optional)

ExampleScope places all of its source files in the src directory.

IV-C3 Benchmark Library (required)

All benchmarks are registered through the Google Benchmark library. This enables the Scope binary to filter, run, and report results in a consistent way.

IV-C4 Documentation (optional)

ExampleScope contains Markdown files in the docs directory of its source tree that describe each benchmark, its algorithm, and its implementation.

IV-C5 Initialization and Command Line Flags (optional)

ExampleScope uses clara::Opts to declare two new command-line arguments, and uses the SCOPE initialization hooks to cause SCOPE to exit during initialization if those options are used. Initialization code is placed in src/init.

V Scope Utilities

V-A ScopePlot Python Package

ScopePlot is a Python package that helps plot and manipulate results in the JSON files produced by SCOPE. These files (hereafter referred to as JSON files) are unmodified from the format produced by the Google Benchmark library, so ScopePlot is compatible with other tools that use that library. ScopePlot is freely available on the Python Package Index (PyPI) at https://pypi.org/project/scope-plot/. ScopePlot is also open-source, hosted at https://github.com/rai-project/scope_plot. ScopePlot uses Python’s distutils to manage installation. When ScopePlot is installed, it provides the scope_plot command-line executable. The rest of this section describes notable scope_plot subcommands.

V-A1 spec Subcommand

scope_plot spec generates an arbitrary plot from a YAML [19] specification file (hereafter referred to as a spec file). Examples of valid specification files can be found in the ScopePlot source tree. The specification file controls the plot type (line with error bars, bar plot, linear regression plot with error bars), the source JSON file for each data series, filters to extract the desired data from the JSON file, per-series data transformations, and plot styling and formatting.

Fig. 3: Example line plot with error bars generated from ScopePlot.

V-A2 deps Subcommand

scope_plot deps can be used to help integrate the scope_plot command with GNU Makefiles. Similar to how make can be used to partially recompile code when certain files have changed, make can be used to regenerate plots when the underlying Benchmark JSON files have been updated. deps scans a spec file and emits the paths of the benchmark JSON files it depends on in make format, so that make target dependencies can be automatically generated as part of a build process.
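The idea behind deps can be sketched as follows. This is a minimal illustration under stated assumptions, not ScopePlot's implementation: the spec is represented as an already-parsed dictionary, and the series and input_file keys, as well as the make_rule helper, are hypothetical names chosen for the example.

```python
def make_rule(target, spec):
    """Return a make rule listing the JSON files that a spec depends on.

    Assumption: each series names its data file under "input_file".
    """
    deps = []
    for series in spec.get("series", []):
        path = series.get("input_file")
        # Keep the first occurrence of each path, in spec order.
        if path and path not in deps:
            deps.append(path)
    return target + ": " + " ".join(deps)


if __name__ == "__main__":
    # Hypothetical spec with two data series.
    spec = {"series": [{"input_file": "a.json"}, {"input_file": "b.json"}]}
    print(make_rule("plot.pdf", spec))
```

A Makefile could include the emitted rule so that the plot target is rebuilt whenever one of the listed JSON files changes.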

V-A3 bar Subcommand

scope_plot bar generates a bar plot from a Benchmark JSON file. It has a subset of the functionality of the spec subcommand, without requiring a full spec file. Command line options allow the user to specify the fields used for the x- and y-axis data, and the plot title.

V-A4 cat Subcommand

scope_plot cat is inspired by the Linux/Unix “cat” command, but specialized to Google Benchmark JSON files. When passed one or more Benchmark JSON files, it concatenates the benchmarks field of those files and dumps the content to the standard output stream. In this way, it preserves the structure of the JSON files when they are concatenated, where the standard “cat” would simply append the JSON contents together, yielding a malformed result.
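A minimal sketch of this kind of structure-preserving concatenation is shown below. It relies only on the top-level context and benchmarks fields of Google Benchmark JSON output; keeping the context of the first file is an assumption made for the example, and cat_benchmarks is a hypothetical helper rather than ScopePlot's own function.

```python
import json


def cat_benchmarks(documents):
    """Merge parsed Google Benchmark JSON documents into one document.

    The context of the first document is kept; this choice is an assumption.
    """
    merged = {"context": {}, "benchmarks": []}
    for index, doc in enumerate(documents):
        if index == 0:
            merged["context"] = doc.get("context", {})
        merged["benchmarks"].extend(doc.get("benchmarks", []))
    return merged


if __name__ == "__main__":
    # Two small hypothetical result files, already read as strings.
    first = json.loads('{"context": {"num_cpus": 4}, "benchmarks": [{"name": "A/1"}]}')
    second = json.loads('{"context": {"num_cpus": 4}, "benchmarks": [{"name": "B/1"}]}')
    print(json.dumps(cat_benchmarks([first, second]), indent=2))
```

Because the result is a single JSON object, it can be parsed again by any tool that reads Google Benchmark output.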

V-A5 filter_name Subcommand

scope_plot filter_name filters the benchmark outputs in the Benchmark JSON file to only keep benchmarks with a name that matches a provided regular expression.
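The filtering step can be sketched in a few lines of Python. This is an illustration only: the filter_name function below is a hypothetical stand-in for the subcommand, and it assumes that a match anywhere in the benchmark name is sufficient.

```python
import re


def filter_name(doc, pattern):
    """Return a copy of doc keeping only benchmarks whose name matches pattern.

    Assumption: a match anywhere in the name counts (re.search semantics).
    """
    regex = re.compile(pattern)
    kept = [b for b in doc.get("benchmarks", []) if regex.search(b.get("name", ""))]
    result = dict(doc)  # shallow copy so the input document is not modified
    result["benchmarks"] = kept
    return result


if __name__ == "__main__":
    # Hypothetical benchmark names.
    doc = {"context": {}, "benchmarks": [{"name": "Comm_Copy/8"}, {"name": "Example/1"}]}
    print(filter_name(doc, r"^Comm_"))
```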

V-A6 Using ScopePlot as a Library

ScopePlot may also be used as a Python library to develop other JSON file manipulation and analysis tools. ScopePlot has an object model for JSON files and various methods for filtering them and converting them to pandas DataFrames.

V-B Scope Docker Images

Dockerfile                  Docker Hub Image              Description
amd64.cuda75.Dockerfile     c3sr/scope:amd64-cuda75-tag   x86_64, CUDA 7.5
amd64.cuda80.Dockerfile     c3sr/scope:amd64-cuda80-tag   x86_64, CUDA 8.0
amd64.cuda92.Dockerfile     c3sr/scope:amd64-cuda92-tag   x86_64, CUDA 9.2
ppc64le.cuda80.Dockerfile   c3sr/scope:amd64-cuda80-tag   POWER, CUDA 8.0
ppc64le.cuda92.Dockerfile   c3sr/scope:amd64-cuda92-tag   POWER, CUDA 9.2
TABLE V: Dockerfiles and Docker Images

Scope includes several Docker [20] images, which can be used to run benchmarks. This allows users on supported platforms to run pre-packaged versions of the benchmark code without having to compile or configure it. Table V lists the Dockerfiles and corresponding Docker images publicly available on Docker Hub.

VI SCOPE Development and Maintenance

Fig. 4: SCOPE and ScopePlot development flow. Once source code is pushed to GitHub, it is immediately available there. Travis-CI is used to build Scope and to start Docker4POWER CI jobs that build POWER Docker images. Travis-CI is also used to test ScopePlot. After successful builds or tests, the relevant artifacts are deployed to Docker Hub and the Python Package Index.

Scope and ScopePlot rely on continuous integration for testing and building Docker images. The continuous integration system is centered around Travis-CI [21]. Whenever modifications to Scope or ScopePlot are pushed to GitHub, Travis-CI starts a series of parallel jobs. Figure 4 summarizes the continuous integration flow for Scope and ScopePlot.

VI-A Scope

Whenever a push is made to the Scope repository, Travis-CI is configured to start a series of parallel builds:

  • An x86-64 CUDA 9.2 build.

  • An x86-64 Docker CUDA 7.5 build.

  • An x86-64 Docker CUDA 8.0 build.

  • An x86-64 Docker CUDA 9.0 build.

Each of these builds incorporates CommScope and ExampleScope, the two completed scopes at the time of writing.

All of Travis’ build hardware is x86-64, so Travis is used directly to create x86-64-compatible Docker images. To generate POWER-compatible Docker images, Scope uses rai [22], a separate job submission system. Two additional Travis jobs are started, each of which simply submits a POWER build to rai on Oregon State University’s PowerCI [23] infrastructure.

  • POWER Docker CUDA 8.0 build.

  • POWER Docker CUDA 9.2 build.

Each Docker build that succeeds is pushed to Docker Hub, where it is immediately available to the public. An image corresponding to each tag is retained indefinitely. Furthermore, an image for the most recent commit on each branch is available.

VI-B ScopePlot

Whenever a push is made to the ScopePlot repository, Travis-CI is configured to test the ScopePlot package against Python 2.7, 3.4, 3.5, 3.6, and 3.7. If all of those tests pass, and the commit has a corresponding tag, a new version of ScopePlot is made available on the Python Package Index (PyPI) for installation with pip install scope_plot.

VII Conclusion

The Scope project arose out of a desire to lower the barrier to entry for system benchmarking in the IBM / University of Illinois Urbana-Champaign Center for Cognitive Computing Systems Research (C3SR). Scope does this by incorporating common libraries and providing convenience functions for writing new system benchmarks, easily supporting compilation on x86 and POWER platforms, and by providing a command-line tool for managing and plotting results. Scope is free and open-source, and welcomes contributors and collaborators.

Acknowledgments

The authors would like to acknowledge contributions and insight from the following people: I-Hsin Chung (IBM T. J. Watson Research Center), and Sarah Hashash and Andrew Schuh (University of Illinois at Urbana-Champaign).

References