Supporting High-Performance and High-Throughput Computing for Experimental Science

10/06/2018 · E. A. Huerta et al. · University of Illinois at Urbana-Champaign and Rutgers University

The advent of experimental science facilities, instruments and observatories, such as the Large Hadron Collider (LHC), the Laser Interferometer Gravitational Wave Observatory (LIGO), and the upcoming Large Synoptic Survey Telescope (LSST), has brought about challenging, large-scale computational and data processing requirements. Traditionally, the computing infrastructures that supported these facilities’ requirements were organized separately: some supported their high-throughput computing needs, others their high-performance computing needs. We argue that in order to enable and accelerate scientific discovery at the scale and sophistication that is now needed, this separation between High-Performance Computing (HPC) and High-Throughput Computing (HTC) must be bridged and an integrated, unified infrastructure must be provided. In this paper, we discuss several case studies where such infrastructures have been implemented. These case studies span different science domains, software systems, and application requirements, as well as levels of sustainability. A further aim of this paper is to provide a basis for determining the common characteristics and requirements of such infrastructures, and to begin a discussion of how best to support the computing requirements of existing and future experimental science facilities.




1 Introduction

To discuss high performance computing (HPC) and high throughput computing (HTC), we first need to distinguish between “computing modes” and “computing infrastructure”, as the terms HTC and HPC are often used for both. HTC as a computing mode is typically used for workloads that are primarily characterized by the number of tasks associated with the workload. HTC workloads are comprised of tasks that are typically independent of each other; that is, the tasks can start or complete in any order. Furthermore, while a task (defined as a unit of work) is most often equated to a job (defined as an entity submitted to a regular batch queue), a single task does not need to be mapped to a job; multiple tasks might also be mapped into a single job. In contrast, an HPC workload is characterized by a metric such as its scalability or some other measure of performance (e.g., number of flops). Typically an HPC workload comprises a single task that is executed as a single job; however, HPC workloads might comprise multiple tasks with dependencies and still be packed as a single job. These distinctions are important for appreciating the different workloads covered in this paper, and as the astute reader will notice, some workloads defy reduction into one category or the other.
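As a concrete (and deliberately minimal) sketch of the task/job distinction, the following Python fragment, with entirely hypothetical task identifiers, packs many independent tasks into a smaller number of batch jobs:

```python
def pack_tasks_into_jobs(task_ids, tasks_per_job):
    """Group independent HTC tasks into batch jobs.

    A task is a unit of work; a job is the entity submitted to the
    batch queue.  Packing several short tasks into one job reduces
    scheduler overhead without introducing dependencies between tasks.
    """
    jobs = []
    for i in range(0, len(task_ids), tasks_per_job):
        jobs.append(task_ids[i:i + tasks_per_job])
    return jobs

# 10 independent tasks packed into jobs of at most 4 tasks each.
print(pack_tasks_into_jobs(list(range(10)), 4))
# → [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```

Because the tasks are independent, the jobs (and the tasks within each job) may complete in any order without affecting the result.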

Furthermore, there are computing infrastructures that are designed to primarily support one type of workload. The canonical example is Condor-based systems whose design point is to maximize the number of tasks per unit time, known as throughput. On the other hand, most supercomputers and high-performance clusters that have high-performance interconnects and memory are typically designed for HPC workloads. Traditionally, HTC workloads have not been executed on such HPC infrastructures, but as this paper illustrates, there have been many recent attempts to run HTC workloads on HPC infrastructures. This paper chronicles several examples and the software systems used to do so.

The pathway for convergence started over a decade ago when ATLAS, CMS, and the Laser Interferometer Gravitational wave Observatory (LIGO) DII:2016 ; LSC:2015 became stakeholders of the Open Science Grid (OSG), a project that began as a community effort to address the computing needs of US researchers utilizing the Large Hadron Collider (LHC) at CERN pordes2007open .

The adoption of OSG policies and good practices by disparate scientific communities, combined with the emergence of HPC container solutions, has paved the way for the development of HTC workloads that can seamlessly use both HTC and HPC infrastructures. This paradigm has benefited the scientific and HPC/HTC communities, and has played a central role in pushing the boundaries of our knowledge in high energy physics and gravitational wave astronomy, leading to remarkable discoveries that have been recognized with the Nobel Prizes in Physics in 2013 and 2017.

This article is organized around a set of case studies. Each study, in Section 2, briefly describes the science problem and the rationale for going beyond available HTC infrastructures. Section 3 describes how HPC has been incorporated at both the middleware/system and application levels, and what the impact of the work will be on the science. In Section 4 we compare and briefly analyze the different approaches. In Section 5 we discuss the future of this program in the context of exascale computing and emergent trends in large-scale computing and data analytics that leverage advances in machine and deep learning.

2 Case Studies

This section presents a brief overview of two large-scale projects, gravitational wave astrophysics and high energy physics, that initially met their data analysis needs with workloads that were tailored to use HTC infrastructures. We discuss how the computational needs of these science missions led to the construction of a unified HTC-HPC infrastructure, and the role of OSG and containers to facilitate and streamline this process.

2.1 Gravitational wave astrophysics: from theoretical insights to scientific discovery

According to Einstein’s theory of general relativity, gravity is a manifestation of spacetime curvature. Gravitational waves are generated when masses are accelerated to velocities close to the speed of light. Gravitational waves remove energy from a system of orbiting masses, which translates into a rapid shrinkage of the orbital separation between the masses and culminates in a cataclysmic collision accompanied by a burst of gravitational radiation.

Over the last two years, LIGO and its European partner Virgo have made five gravitational wave detections that are consistent with the merger of two black holes DI:2016 ; secondBBH:2016 ; thirddetection ; fourth:2017 ; GW170608 . The trail of discovery has also led to the first direct detection of two colliding neutron stars bnsdet:2017 , which was observed with two cosmic messengers: gravitational waves and light. This multimessenger observation has provided evidence that collisions of neutron stars are the central engines that trigger short gamma ray bursts, the most energetic electromagnetic explosions in the Universe after the Big Bang, and the cosmic factories where about half of all elements heavier than iron are produced bnsdet:2017 ; mma:2017arXiv ; 2017arXiv171005836T .

To understand the physics of gravitational wave sources and enable their discovery, a worldwide, decades-long research program was pursued to develop numerical methods to solve Einstein’s general relativity equations in realistic astrophysical settings smarr ; 1989fnr ; 1993PhRvLA ; 1995Sci270941M . These numerical relativity simulations are computationally expensive, requiring large amounts of computing power. The lack of such computing power to address this physics problem is one of the elements that led to the foundation of the US National Science Foundation (NSF) supercomputer centers, including the National Center for Supercomputing Applications (NCSA).

The first numerical evolutions of two orbiting black holes that inspiral into each other and eventually merge were reported in 2005 preto . This breakthrough was reproduced independently by other groups shortly thereafter using entirely different software stacks baker:2006 ; camp:2006 . From that point onwards, numerical relativists embarked on a vigorous program to produce mature software that could be used to routinely simulate the merger of black holes in astrophysically motivated settings. Figure 1 shows still images of black hole collisions, representing a sample of the black hole mergers detected by the LIGO detectors, which we numerically simulated using the open source Einstein Toolkit naka:1987 ; shiba:1995 ; baum:1998 ; baker:2006 ; camp:2006 ; Lama:2011 ; wardell_barry_2016_155394 ; ETL:2012CQGra ; Ansorg:2004ds ; Diener:2005tn ; Schnetter:2003rb ; Thornburg:2003sf community software on the Blue Waters supercomputer bluewaters:web ; Kramer2015 ; 2017arXiv170300924J .

Figure 1: Visualization of the event horizons and gravitational waves emitted by the first DI:2016 and fourth fourth:2017 pair of merging black holes detected by LIGO. These gravitational waves induce changes in the arm length of the LIGO and Virgo detectors that are smaller than the diameter of a proton.

While the numerical modeling of black hole collisions has evolved rapidly over the last decade, the modeling of astrophysical objects that involve matter, such as neutron star collisions, has progressed at a slower pace 2015CQGra..32q5009E ; 2016PhRvD..93l4062H ; 2014CQGra..31a5005M ; ETL:2012CQGra ; 2017JCoPh.335…84K . The different timescales involved in these complex systems, and the need to couple Einstein’s equations with magneto-hydrodynamics and microphysics, make this a challenging endeavor. Recent efforts to cross-validate the physics described by different software stacks are an important step towards the development of mature software that can be routinely used to simulate these events. This research program is timely and relevant given that LIGO, Virgo, and several astronomical facilities are coordinating efforts to identify new multimessenger events in the upcoming LIGO-Virgo gravitational wave discovery campaign, known as O3. Figure 2 shows one of the numerical relativity simulations we produced to numerically model the neutron star collision detected by the LIGO and Virgo detectors. The simulation was produced with the GRHydro numerical relativity code 2014CQGra..31a5005M on the Blue Waters supercomputer.

Figure 2: Visualization of the merger of two neutron stars. This simulation is consistent with the astrophysical properties of the two colliding neutron stars detected by the LIGO and Virgo detectors.

Available catalogs of numerical relativity simulations Mroue:2013 have been used to calibrate semi-analytical waveform models that are utilized during gravitational wave discovery campaigns Bohe:2016gbl ; husacv:2016PhRvD ; khan:2016PhRvD ; Tara:2014 . This is because generating numerical relativity waveforms takes between several days (black hole mergers) and several months (neutron star mergers) on HPC infrastructures. However, since gravitational wave detection requires low-latency analyses of gravitational wave data, numerical relativity catalogs are used to calibrate models that can generate simulated waveform signals in tens of milliseconds.

Once a new gravitational wave trigger is identified, numerical relativity catalogs that actually reproduce the signals extracted from LIGO and Virgo data are created. To inform this analysis, and constrain the region of interest in the 8-dimensional parameter space that describes gravitational wave sources, LIGO and Virgo data is carefully analyzed using robust Bayesian algorithms bambiann:2015PhRvD . With these catalogs of numerical relativity waveforms, it is possible to infer the astrophysical origin and environments of gravitational wave sources.
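As a toy illustration of this kind of Bayesian parameter estimation, reduced to a single parameter with synthetic Gaussian data (the numbers and setup below are ours, and bear no relation to the actual LIGO-Virgo analysis codes), a grid-based posterior can be computed as follows:

```python
import math
import random

random.seed(1)

# Synthetic "observations" of a single source parameter with true
# value 2.5 and known noise level (purely illustrative).
true_value, sigma = 2.5, 0.5
data = [random.gauss(true_value, sigma) for _ in range(50)]

def log_likelihood(theta):
    # Gaussian log-likelihood of the data given parameter theta.
    return sum(-0.5 * ((d - theta) / sigma) ** 2 for d in data)

# Flat prior on a grid over [0, 5]; posterior ∝ prior × likelihood.
grid = [i * 0.01 for i in range(501)]
logs = [log_likelihood(t) for t in grid]
m = max(logs)
weights = [math.exp(l - m) for l in logs]   # subtract max to avoid underflow
total = sum(weights)
posterior = [w / total for w in weights]

best = grid[posterior.index(max(posterior))]
print(round(best, 2))   # posterior peak, close to the true value
```

Real analyses face the same logic in an 8-dimensional parameter space, where grid evaluation is infeasible and stochastic samplers are used instead.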

In conclusion, the numerical modeling of gravitational wave sources, and the validation of new discoveries with numerical relativity waveforms, depends critically on HPC infrastructure.

Gravitational Wave Detection with HTC workloads on HTC infrastructures

The choice of signal-processing techniques for gravitational wave detection has produced workloads that are computationally expensive and poorly scalable. These algorithms sift through gravitational wave data, looking for a high correlation with modeled waveform templates, which are calibrated with numerical relativity waveforms. If this template matching method finds a noise trigger with high significance, then it is followed up using a plethora of statistical algorithms to ensure that it is not a noise anomaly, but an actual gravitational wave candidate that is observed in several gravitational wave detectors.

In a typical discovery campaign, LIGO utilizes template banks of distinct modeled waveforms. Each segment of gravitational wave data is matched-filtered against every one of these template waveforms.
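The matched-filtering step can be illustrated with a heavily simplified, time-domain sketch in Python (the chirp-like template, noise level, and injection offset are all invented for illustration; production searches filter in the frequency domain against large template banks):

```python
import math
import random

def matched_filter_snr(data, template):
    """Slide a unit-norm template across the data and return the
    normalized correlation at each offset -- a toy time-domain
    stand-in for the matched filtering used in real searches."""
    norm = math.sqrt(sum(x * x for x in template))
    return [
        sum(data[i + j] * template[j] for j in range(len(template))) / norm
        for i in range(len(data) - len(template) + 1)
    ]

# Inject a chirp-like template into noisy synthetic "data" at offset 30.
random.seed(0)
template = [math.sin(0.3 * t * t) for t in range(20)]
data = [random.gauss(0.0, 0.2) for _ in range(100)]
for j, v in enumerate(template):
    data[30 + j] += v

snr = matched_filter_snr(data, template)
print(snr.index(max(snr)))  # the filter peaks at the injection offset
```

A high correlation peak at some offset is what the text calls a trigger; it is then followed up with consistency checks across detectors before being treated as a candidate.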

LIGO employs two separate pipelines, PyCBC 2016CQGra..33u5004U and gstLAL 2012ApJ…748..136C , to perform matched-filtering based gravitational wave searches. Both pipelines use HTCondor condor-practice as a workflow management system. To provide computing power for these computationally intensive searches, LIGO maintains its own computing infrastructure in the form of the LIGO Data Grid (LDG), which supplied the majority of the tens of millions of core hours of computing time used in the two previous observation campaigns. The LDG provides a homogeneous compute environment for gravitational wave data analysis. It consists of a single version of CentOS Linux as the operating system, and provides a software stack mandated by LIGO.

The observation of gravitational waves with an international network of gravitational wave detectors was accomplished in LIGO-Virgo’s second observing run, known as O2. In the upcoming third observing run O3, the Japanese KAGRA detector will further expand this network. Ongoing improvements to the sensitivity of these observatories, coupled with longer discovery campaigns, will exacerbate the need for computational resources, in particular for low-latency (order of seconds to minutes) gravitational wave searches. Anticipating this scenario, LIGO has expanded its LDG to exploit additional resources. In the following section we describe recent deliverables of this effort.

2.2 High-energy Particle Physics

The goal of particle physics is to understand the universe at its most fundamental level, including the constituents of matter and their interactions. Our best theory of nature—the standard model (SM)—is a quantum field theory (QFT) that describes the strong, electromagnetic (EM) and weak interactions among fundamental particles, which are described as fields. In the SM, the weak and EM forces have the same strength at very high energy (as existed in the early Universe), described by a single electroweak interaction, and particles must be massless to preserve gauge invariance, in which different configurations of the fields lead to identical physics results. Gauge invariance is a required ingredient of any QFT describing nature; otherwise, calculated values of physically measurable quantities, such as the probability of particles scattering with one another at high energy, can be infinite. Since we know through observation that the EM force is much stronger than the weak force, electroweak symmetry is a broken symmetry. We also know that most fundamental particles have mass, including the weak force carriers, which are massive and have a short-ranged interaction. Exactly how this symmetry is broken and how fundamental particles acquire their mass without violating gauge invariance is one of the most important questions in particle physics.

The SM provides a mechanism that answers both of these questions simultaneously. Particle masses arise when the electroweak symmetry is spontaneously broken by the interaction of massless fields with the Higgs field, an invisible, spinless field that permeates all space and has a non-zero value everywhere, even in its lowest energy state. A would-be massless particle that interacts with the Higgs field is slowed down from the speed-of-light due to this interaction and consequently acquires a non-zero mass. This Higgs mechanism makes the remarkable prediction that a single massive, neutral, spinless particle called the “Higgs boson”—a quantum excitation of the Higgs field—must exist PhysRevLett.13.321 ; HIGGS1964132 ; PhysRevLett.13.508 ; PhysRevLett.13.585 ; PhysRev.145.1156 ; PhysRev.155.1554 .

Equally remarkable is that we have progressed technologically to be able to produce Higgs bosons in the laboratory. While the Higgs boson mass is not predicted by the SM, it must be less than 1000 times the mass of the proton to avoid the infinities previously mentioned. In Einstein’s special relativity, a mass is equivalent to an energy content through the relation E = mc² 1952E . Particle accelerators can impart energy to particles to form an equivalent amount of mass when they are collided. The Large Hadron Collider (LHC) at CERN in Geneva, Switzerland is the world’s most powerful particle collider and was built with the primary goal of either discovering the Higgs boson or refuting its existence. At the LHC, counter-rotating bunches of ~10^11 protons are accelerated inside a 27-km circular ring and focused to collide at a rate of 40 MHz with a (design) center-of-mass energy of 14 trillion electron-volts. This is the energy equivalent of the rest mass of about 14,000 protons, which is sufficient to excite the Higgs field to produce Higgs bosons higgs1 ; higgs2 ; higgs3 ; higgs4 . Higgs bosons decay almost immediately, and sophisticated detectors surrounding the collision region are used to detect and measure their decay products, enabling physicists to piece them together to search for Higgs boson production within the LHC.

In 2012, the ATLAS and CMS collaborations announced the discovery of a Higgs boson at the LHC HIGG-2012-27 ; Chatrchyan:2012xdj . This discovery led to the 2013 Nobel Prize in Physics for the theory of the Higgs mechanism and the prediction of the Higgs boson. Since this discovery, the properties of this new particle—its mass, spin, couplings to other SM particles, and certain symmetry properties—have been measured with increasing precision and found to agree with the SM prediction.

The development of the SM is a triumph of 20th century physics, with the last piece of the puzzle put in place by the discovery of the Higgs boson. However, this is far from the end of the story of particle physics. Even without including gravity, we know that the SM is an incomplete description of nature and leaves many open questions to be answered. For example, the SM does not include the non-zero neutrino masses and mixing observed in solar and atmospheric neutrino experiments Ahmad:2001an ; Fukuda:1998mi , nor does it account for the predominance of matter over antimatter. Moreover, the SM accounts for only 5% of the known mass-energy content of the universe and does not describe the dark matter or dark energy that comprises the rest.

The LHC will continue to provide a unique window into the subatomic world to pursue answers to these questions and study processes that took place only a tiny instant of time after the Big Bang. The next phase of this global scientific endeavor will be the High-Luminosity LHC (HL-LHC) which will collect data starting circa 2026 and continue into the 2030s. The goal is to search for physics beyond the SM and, should it be discovered, to study its details and implications. During the HL-LHC era, the ATLAS and CMS experiments will record 10 times as much data from 100 times as many collisions as was used to discover the Higgs boson, raising the prospect of exciting discoveries during the HL-LHC era.

ATLAS Data Analysis with HTC workloads on HTC infrastructures

The ATLAS detector ATLASDetector is a multi-purpose particle detector at the LHC with a forward-backward symmetric cylindrical geometry and nearly complete solid angle coverage of the LHC collision region. The ATLAS detector is eight stories tall, weighs 7000 tonnes, and comprises 100 million electronics channels. At a proton bunch crossing rate of 40 MHz, there are 1 billion proton-proton interactions per second occurring within the ATLAS detector. The rate of data generated by the detector is far too high to collect all of these collisions, so a sophisticated trigger system is employed to decide which events are sufficiently interesting for offline analysis. On average, only 1 in every 100,000 collisions is archived for offline analysis. A first-level trigger is implemented in hardware and uses a subset of the detector information to reduce the accepted rate to a peak value of 70 kHz. This is followed by a software-based trigger run on a computing cluster that reduces the average recorded collision rate to 1 kHz.

With vast volumes of data generated annually by the LHC experiments, processing, analyzing, and sharing the data with thousands of physicists around the world is an enormous challenge. To translate the observed data into insights about fundamental physics, the important quantum mechanical processes and the response of the detector to them need to be simulated to a high level of detail and with a high degree of accuracy.

Historically, the ATLAS experiment has used a geographically distributed grid of approximately 200,000 cores continuously (250,000 cores at peak) to process, simulate, and analyze its data. The ATLAS experiment currently uses about 1,000 million core-hours per year for processing, simulation, and analysis of data, with more than 300 PB of active data. In spite of these capabilities, the unprecedented needs of ATLAS have led to contention for computing resources. The shortfall became particularly acute in 2016-17, as the LHC delivered about 50% more data than planned, and it continues to do so. The shortfall will not be met by growth from Moore’s Law, or simply more dollars to buy resources. Furthermore, once the HL-LHC starts producing data, this gap will only widen.

A partial response to these challenges has been to move from utilizing infrastructure that was exclusively distributed and that only supported the HTC mode, to an infrastructure mix that also includes HPC infrastructure such as the Blue Waters and Titan supercomputers. This expansion of infrastructure types has been primarily motivated by the very practical need to alleviate the “resource scarcity” as the requirements of ATLAS have continued to grow. This has required addressing both the intellectual and technical challenges of using HPC infrastructure in an HTC mode. Section 3.2 discusses these challenges and the ATLAS project’s response to them.

3 Open Science Grid and containers pave the way to create a unified HTC and HPC infrastructure

OSG provides federated access to compute resources for data-intensive research across science domains, and it is primarily used in physics. Workloads that best use this large pool of resources need to meet clearly defined criteria:

  • They consist of loosely coupled jobs that require a few cores to at most one node.

  • Furthermore, since compute resources are not owned by OSG, and jobs may be killed and re-started at different sites when higher priority jobs enter the system, workflow managers that can preempt a job without losing the work the job has already accomplished should be used.

  • Additionally, jobs should be single-threaded, require less than 2 GB of memory per invocation, and run for no more than twelve hours.

  • Input and output data for each job is limited to 10 GB.
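A minimal sketch of these criteria as a validation routine (the limit names, the 8-core cap used as a proxy for "a few cores", and the job-description format are our own assumptions, not an OSG API):

```python
# Hypothetical limits mirroring the OSG-friendly criteria listed above.
OSG_LIMITS = {
    "max_cores": 8,          # loosely coupled: a few cores, at most one node
    "max_memory_gb": 2,      # per invocation
    "max_runtime_hours": 12,
    "max_io_gb": 10,         # input plus output per job
}

def fits_osg(job):
    """Return True if a job description satisfies every limit."""
    return (
        job["cores"] <= OSG_LIMITS["max_cores"]
        and job["memory_gb"] < OSG_LIMITS["max_memory_gb"]
        and job["runtime_hours"] <= OSG_LIMITS["max_runtime_hours"]
        and job["io_gb"] <= OSG_LIMITS["max_io_gb"]
    )

print(fits_osg({"cores": 1, "memory_gb": 1.5, "runtime_hours": 8, "io_gb": 4}))   # True
print(fits_osg({"cores": 1, "memory_gb": 1.5, "runtime_hours": 48, "io_gb": 4}))  # False
```

In practice these constraints are expressed in HTCondor submit descriptions and enforced by the sites, not by user code.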

Another important consideration is that OSG resources do not typically have the same software ecosystem required by LIGO or ATLAS workloads. This has led to the development of software stacks that seamlessly run on disparate compute resources. For desktop and server applications, Docker containers have become one of the preferred solutions to address this problem of encapsulating all required software dependencies of an application and providing a uniform way to share these packages.

The High Energy Physics community has made extensive use of containers to run ATLAS workloads on disparate OSG compute resources. Shifter, and more recently Singularity, have been adopted as container solutions by the OSG project. Similarly, LIGO scientists containerized their most compute-intensive HTC workload to run seamlessly on OSG resources. As a result, OSG contributed about 10% of the compute time consumed during LIGO-Virgo’s first and second discovery campaigns. Both Shifter and Singularity are currently used to containerize LIGO’s software stacks.

ATLAS and LIGO scientists soon realized that the approach used to connect their HTC infrastructure to the OSG could also be used to construct a unified HTC-HPC infrastructure. To accomplish this, containers were deployed on HPC infrastructures, which were then configured as OSG compute elements. This approach enabled seamless use of HPC infrastructures for ATLAS and LIGO large-scale data analyses. In the following section we describe how these two milestones were accomplished, highlighting the similarities and differences between these approaches.

3.1 Scaling gravitational wave discovery with OSG and Shifter: connecting the LDG to Blue Waters

In the last two LIGO and Virgo observing runs, the LDG benefited from adopting OSG as a universal adapter to external resources, increasing its pool of compute resources to include campus and regional clusters, the NSF funded Extreme Science and Engineering Discovery Environment (XSEDE) 2017Weitzel , and opportunistic cycles from US Department of Energy Laboratories and High Energy Physics clusters.

In order to connect the LDG to Blue Waters, the NSF-supported, leadership-class supercomputer operated by NCSA, the authors of this article spearheaded the unification of OSG, Shifter, and Blue Waters BOSS:2017 ; shifter .

Since Shifter is supported natively by Blue Waters, LIGO used it to encapsulate a full analysis software stack and to use Blue Waters as a computing resource during O2 BOSS:2017 , adding gravitational wave science to the portfolio of science enabled by Blue Waters. Figure 3 shows the setup used to start Shifter jobs.

To further leverage existing efforts to use cycles on HPC clusters for HTC workloads, LIGO decided to base its efforts on the existing OSG infrastructure and to create a container that can be used on any OSG resource, along with methods to use Blue Waters as an OSG resource provider.

The solution we have developed to use Blue Waters, or any other HPC infrastructure, is as follows (see also Figure 4) BOSS:2017 :

  • LIGO data analysis jobs are submitted to the HTCondor scheduler running at an LDG site, which oversees the workload and schedules work items on either the local compute resources or on remote resources.

  • Glidein sfiligoi2009pilot pilots are submitted as regular jobs to the HPC cluster’s job scheduler to reserve a number of compute nodes for use by OSG. This creates a virtual private batch system that reports back to the Glidein Workload Management System at the LDG site to temporarily become part of the HTCondor worker pool.

  • Jobs are run on the Glidein workers until all have completed or the pilot jobs expire.
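The pilot lifecycle sketched in these steps can be caricatured in a few lines of Python (a schematic simulation with invented walltime units; real glideins join an HTCondor pool rather than draining a local queue):

```python
from collections import deque

def run_pilot(job_queue, walltime, job_cost):
    """Toy glidein pilot: pull jobs from the central queue until the
    queue is empty or the pilot's walltime allocation expires."""
    completed, remaining = [], walltime
    while job_queue and remaining >= job_cost:
        completed.append(job_queue.popleft())   # take the next waiting job
        remaining -= job_cost                   # consume pilot walltime
    return completed

queue = deque(range(10))                 # 10 queued analysis jobs
done = run_pilot(queue, walltime=6, job_cost=1)
print(len(done), len(queue))             # → 6 4
```

The four jobs left in the queue are simply picked up by the next pilot (or by local LDG resources), which is what makes the pilot model robust to expiring allocations.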

Figure 3: The components involved in starting a Shifter job on Blue Waters. Jobs are submitted to the workload manager on Blue Waters’ login nodes, which launches jobs on the compute nodes. For jobs requesting the use of containers, the workload manager first instructs the Shifter runtime environment to pull an up-to-date copy of the container image from Docker Hub. The container image is repackaged as a user-defined image using a regular squashfs disk image. Finally, the disk image is loop-mounted by the jobs on the compute nodes during their prologue and unmounted in the epilogue after the job ends.
Figure 4: Interaction between the LIGO Data Grid, the Open Science Grid, and PyCBC jobs during a detection run. Pilot jobs are started on Blue Waters compute nodes, which register themselves with OSG and request compute jobs. The LDG-hosted Condor submission host supplies compute jobs to OSG to be executed by the Blue Waters workers. Once a worker starts up, it requests data to be processed from the data hub hosted at Nebraska Supercomputer center and returns results to LIGO.

Figure 5: Left panel: gravitational wave astrophysics (LIGO and NANOGrav) and high energy physics (ATLAS) projects that use HPC containers to seamlessly exploit the unique computing capabilities of the Blue Waters supercomputer. Right panel: Open Science Grid compute resources used for large-scale gravitational wave data analysis. The chart shows the first time Blue Waters was used at scale as an Open Science Grid compute element, which corresponds to the gravitational wave discovery of two colliding neutron stars by the LIGO and Virgo detectors.

As shown in Figure 5, the first time this framework was used for a production-scale analysis was for the validation of the first gravitational wave detection of two colliding neutron stars by the LIGO and Virgo detectors, an event that marked the beginning of multimessenger astronomy bnsdet:2017 ; mma:2017arXiv ; grb:2017ApJ .

This work has multifold implications. First, it provides an additional pool of computational resources that has already been used to promptly validate major scientific discoveries. In the near future, Blue Waters will continue to provide resources to accelerate gravitational wave searches, and to enable computational analyses beyond the core investigations that may lead to new insights through follow-up analyses. This work will also improve cluster utilization without affecting the network performance of HPC jobs. More importantly, this success clearly exhibits the interoperability of NSF cyberinfrastructure resources, and marks a significant step toward the goals of the US National Strategic Computing Initiative, i.e., to foster the convergence of data analytic computing, modeling, and simulation. Supporting high-throughput LIGO data analysis workloads concurrently with highly parallel numerical relativity simulations and many other complex workloads is the most recent and most complex example of achieving convergence on leadership-class computers like Blue Waters, much earlier than was expected to be possible.

3.2 OSG and containers in Blue Waters for High Energy Physics

Figure 6: [Top to bottom] CPU core utilization, number of LHC collision events processed, ATLAS data consumed and produced (simulated and refined detector data) during a 33 day period starting on April 26, 2018.

To simulate and process the large amounts of data from the LHC, Blue Waters has been integrated into the ATLAS production processing environment by leveraging OSG CONNECT and MWT2 services. ATLAS jobs require a specific environment on the target site to execute properly. This includes a variant of the CentOS6 operating system, numerous RPM packages, and the distribution of ATLAS software libraries via CernVM File System (CVMFS) repositories. Blue Waters compute nodes themselves do not provide the required environment for ATLAS jobs, as they run an older SUSE OS variant and lack many of the needed RPM packages. Docker images are delivered via Shifter to create an environment on Blue Waters nodes that is compatible with the ATLAS job payload. Though CVMFS cannot be used directly due to the lack of FUSE availability on Blue Waters, access to on-disk copies of the repositories is made available via a softlink from the required CVMFS root to the location of the local repositories created by an rsync-based CVMFS replication service. To comply with Blue Waters’ two-factor authentication, the RSA One Time Password (OTP) authentication system is used to create a proxy valid for 11 days. The OTP-based proxy is renewed on a weekly basis using MyProxy MyProxy . The OpenSSH client (gsissh) uses this proxy to ssh into a Blue Waters login node and start up SSH glideins for an HTCondor overlay that is used to schedule ATLAS jobs on Blue Waters.
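The softlink workaround described above can be sketched as follows (all paths are temporary stand-ins created for the example; the real setup links the CVMFS mount point to rsync-replicated repositories on shared disk):

```python
import os
import tempfile

# Stand-in directories: "replica" plays the role of the rsync'ed
# on-disk copy of a CVMFS repository; "cvmfs" is where jobs expect it.
root = tempfile.mkdtemp()
replica = os.path.join(root, "replica", "atlas.cern.ch")
os.makedirs(replica)
open(os.path.join(replica, "setup.sh"), "w").close()   # dummy repo file

link = os.path.join(root, "cvmfs")                     # expected CVMFS root
os.symlink(os.path.dirname(replica), link)             # softlink to replica tree

# Jobs resolving the expected path transparently reach the local copy.
print(os.path.exists(os.path.join(link, "atlas.cern.ch", "setup.sh")))  # → True
```

The same file is reachable through both paths, so job payloads need no modification even though no FUSE-mounted CVMFS client is present.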

ATLAS jobs begin flowing into Blue Waters once glideins, submitted within a container on Blue Waters, contact a Production and Distributed Analysis (PanDA) workload management system Maeno:2011zz at CERN to obtain an ATLAS payload. The glideins also pull all the necessary data and files using the Local Site Mover (LSM) at the University of Chicago. To minimize network transfer on stage-in, data is cached on the Blue Waters local Lustre file system. When the ATLAS jobs run, they read their input from the staged-in data and write their output back to the Lustre scratch disk. Once the workload is complete, the output data is transferred to the data storage system at the University of Chicago.
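The stage-in caching pattern above can be sketched as follows. The function name and paths are illustrative, not the actual LSM interface; the point is only that a file is transferred once and subsequent jobs reuse the cached copy on Lustre.

```python
import os
import shutil
import tempfile

def stage_in(source_path: str, cache_dir: str) -> str:
    """Return a cached local copy of an input file, fetching it only once.

    Mirrors the stage-in pattern in the text: inputs are pulled over the
    network, but cached on the local (Lustre) file system so repeated
    jobs do not re-transfer the same files.
    """
    cached = os.path.join(cache_dir, os.path.basename(source_path))
    if not os.path.exists(cached):          # cache miss: transfer once
        os.makedirs(cache_dir, exist_ok=True)
        shutil.copy(source_path, cached)    # stands in for a network transfer
    return cached

# Demonstration with temporary directories standing in for remote storage and Lustre.
remote = tempfile.mkdtemp(prefix="remote_")
lustre = tempfile.mkdtemp(prefix="lustre_")
src = os.path.join(remote, "events.root")
with open(src, "w") as f:
    f.write("detector data")
first = stage_in(src, lustre)
second = stage_in(src, lustre)   # hits the cache; no second transfer
print(first == second, os.path.exists(first))
```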

Figure 6 shows a particular one-month period in 2018 in which a peak of 35k Blue Waters cores was utilized to process 35M collision events. The job output was made available to the rest of the ATLAS collaboration for use in analysis of the LHC data, both to improve Standard Model (SM) measurements and to search for new physics beyond the SM.

3.3 ATLAS & PanDA & Titan

The computing systems used by LHC experiments have historically consisted of federations of hundreds to thousands of distributed resources, ranging from small to mid-size. In spite of the impressive scale of the existing distributed computing solutions, the federation of small to mid-size resources has proven insufficient to meet current and projected future demands.

Figure 7: Schematic showing the primary stages in the execution of ATLAS workloads on Titan using the BigPanDA workload management system. PanDA’s broker acts on jobs (as opposed to tasks), and uses their description to determine how best to aggregate and shape them into existing backfill slots on Titan. Although not used to submit to non-Oak Ridge Leadership Computing Facility sites, in principle the PanDA broker could send jobs to another resource (Site A).

The ATLAS experiment has embraced Titan, a US Department of Energy (DOE) leadership facility, in conjunction with traditional distributed high-throughput computing to reach sustained production scales approaching 100M core-hours a year (in 2017), and easily surpassing 100M in 2018. Underpinning these efforts has been the PanDA workload management system, which was extended to support the execution of ATLAS workloads on Titan htchpc2017converging , as shown in Figure 7. This work critically evaluated the design and operational considerations needed to support the sustained, scalable and production usage of Titan for ATLAS workloads in a high-throughput mode using the “backfill” operational mode. It also provided a preliminary characterization of a next-generation executor for PanDA to support new workloads and advanced execution modes, and outlined early lessons on how current and future experimental and observational systems can be integrated with production supercomputers and other infrastructures in a general and extensible manner.

As shown in Figure 7, ATLAS payloads use Titan compute resources as follows. PanDA pilots run on Titan’s data transfer nodes (DTNs), which is advantageous since DTNs can communicate with the PanDA server through a fast (10 GB/s) network connection. Furthermore, the worker nodes on Titan and the DTNs share a file system, which allows the pilots to stage in data and files needed by the payload, and to stage out data products once the payload is completed. PanDA pilots query Titan’s Moab scheduler to check whether available resources are suitable for PanDA jobs, and transfer this information to the PanDA server, which then prepares a list of jobs that can be submitted on Titan. Thereafter, the pilot transfers all the necessary input data from Brookhaven National Laboratory, a Tier 1 ATLAS computing center.
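The pilot's matching of pending jobs to available backfill resources can be sketched as follows. This greedy packing is illustrative only, with hypothetical job tuples; PanDA's actual brokerage logic is more involved.

```python
def jobs_for_backfill(slot_nodes, slot_minutes, pending_jobs):
    """Select pending jobs that fit an available backfill slot.

    A simplified version of the pilot logic in the text: the pilot learns
    the size and duration of idle resources from the scheduler, then the
    server decides which jobs can run there. Jobs are
    (name, nodes_needed, minutes_needed) tuples.
    """
    chosen, nodes_left = [], slot_nodes
    for name, nodes, minutes in pending_jobs:
        # A job fits if it needs no more nodes than remain free and
        # finishes before the backfill window closes.
        if nodes <= nodes_left and minutes <= slot_minutes:
            chosen.append(name)
            nodes_left -= nodes
    return chosen

# Example: a 300-node, 90-minute backfill window.
pending = [("sim-A", 128, 60), ("sim-B", 96, 45), ("sim-C", 64, 120)]
print(jobs_for_backfill(300, 90, pending))  # → ['sim-A', 'sim-B']
```

sim-C is rejected even though nodes are free, because its 120-minute runtime would overrun the 90-minute window.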

4 Analysis of the case studies

In this section we discuss similarities and differences between the case studies. We start by identifying similarities in the approaches followed by the high energy physics and gravitational wave communities to run HTC-type workloads on the Blue Waters supercomputer.

4.1 Similarities between case studies

We have identified the following common features between LIGO and ATLAS workloads that utilize Blue Waters:

  1. CVMFS and/or XRootD are used for the global distribution of software and data

  2. Shifter is used as a container solution for both software stacks

  3. ATLAS and LIGO workloads are planned targeting Blue Waters as an OSG compute element

  4. Jobs submitted to the OSG will start flowing into Blue Waters when glideins are started within a Shifter container in Blue Waters

  5. These workloads use HTCondor to schedule jobs. LIGO workloads also use Pegasus as a workflow management system

  6. The workloads use temporary certificates to comply with two-factor authentication

  7. The OSG is used as a global adapter to connect ATLAS and LIGO compute resources to Blue Waters

  8. These workloads use the backfill operational mode to maximize cluster utilization without loss of overall quality-of-service

4.2 Differences between case studies

In this section we focus on the ATLAS workload designed to run at production scale on the Titan supercomputer. The differences between this LHC workload and those discussed in the previous section are:

  1. Instead of HTCondor, this workload uses PanDA as the workload management system

  2. It targets Titan, a US DOE leadership-class supercomputer, to reach sustained production of 51M core-hours per year

  3. PanDA brokers were deployed on Titan to enable distributed computing at scale

  4. The PanDA Broker pulls jobs’ input files from the Brookhaven National Laboratory Data Center to the Oak Ridge Leadership Computing Facility (OLCF) Lustre file system. In contrast, the LIGO and ATLAS workloads that utilize Blue Waters transfer data at scale from Nebraska and the University of Chicago, respectively

  5. PanDA Brokers are deployed on DTNs because these nodes are part of the OLCF infrastructure and can access Titan without RSA SecurID authentication. DTNs are not Titan worker nodes and, therefore, are not used to execute Titan’s jobs

  6. The PanDA Broker queries Titan’s Moab scheduler about the currently available backfill slot, and creates an MPI script wrapping enough ATLAS job payloads to fit the slot. Thereafter, the PanDA Broker submits the MPI script to Titan’s PBS batch system, as shown in Figure 7. In contrast, ATLAS and LIGO workloads on Blue Waters use the COMMTRANSPARENT flag, so that each task can be placed anywhere within the torus network without affecting the network performance of other jobs, thereby increasing overall system utilization

  7. Once each MPI script finishes, the PanDA Broker transfers the data products to Brookhaven National Laboratory. In contrast, LIGO data products on Blue Waters are transferred back to the host LDG cluster, and ATLAS workloads using Blue Waters resources stage out data products to the data storage space at the University of Chicago
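The MPI-script packing described in item 6 can be sketched as follows. The PBS directives, `aprun` invocation, and payload command names are illustrative placeholders under the assumption of one payload per node; PanDA's real submission scripts differ.

```python
def mpi_wrapper_script(slot_nodes, payloads):
    """Compose a batch script that packs multiple payloads into one MPI job.

    Sketches the broker behavior in item 6: size a single wrapper job to
    the backfill slot, one payload per node, and submit it as one PBS job
    instead of many independent HTC jobs.
    """
    chosen = payloads[:slot_nodes]          # take only as many as fit the slot
    lines = [
        "#!/bin/bash",
        f"#PBS -l nodes={len(chosen)}",
        "#PBS -q batch",
        # aprun fans the wrapper out across nodes; each MPI rank then
        # selects its own payload from the arguments below.
        f"aprun -n {len(chosen)} ./payload_wrapper.sh \\",
    ]
    lines += [f"    {cmd} \\" for cmd in chosen[:-1]] + [f"    {chosen[-1]}"]
    return "\n".join(lines)

# A 2-node backfill slot: only the first two payloads are wrapped.
script = mpi_wrapper_script(2, ["run_evgen_001", "run_evgen_002", "run_evgen_003"])
print(script)
```

The leftover payloads stay queued for the next backfill opportunity, which is what lets the broker continuously shape work to whatever slots Moab reports.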

5 Exascale Computing: Scope and future applications

In early 2019, the LIGO, Virgo and KAGRA detectors will gather data concurrently for the first time. This one-year campaign will benefit from ongoing commissioning work at the LIGO and Virgo sites. The implications of this are manifold. First, more sensitive detectors mean that gravitational wave signals will spend more time in the detectors’ sensitive frequency range. In turn, effective searches will require many more waveforms that are significantly longer than in previous campaigns.

Additionally, more sensitive detectors can probe a larger volume of the Universe, which will boost the number of sources that are detected. From a data analysis perspective, this means that a significant increase in the pool of computational resources will be required to maintain the same detection-to-publication cycle. If the detection rate increases by at least a factor of two, this level of activity will become unsustainable. Requiring that new data become publicly available with a six-month latency also implies that compute power will be devoted to core data analysis activities, at the expense of high-risk, high-reward science investigations that may lead to groundbreaking discoveries.

This situation is not unique to gravitational wave data analysis. For HEP, the data volumes to be processed by 2022 (Run 3), and once the HL-LHC starts producing data (Run 4), will increase by a factor of 10-100 compared to existing volumes (Run 2). It is acknowledged in the HPC community that there is a growing disconnect between commercial clouds and HPC infrastructures, where computing power and data storage are concentrated, and edge environments, which are experiencing the largest increase in data volumes but lack the infrastructure needed to cope with it. In this scenario, LIGO and ATLAS represent edge environments that will generate very large datasets in the very near future, and will require access to ever-increasing pools of computational power.

A range of opportunities to alleviate these challenges have been discussed in the HPC community. These include processing data as close as possible to its sources, and logically centralized, cloud-like processing. The use of containers will continue to play a significant role in seamlessly running compute-intensive workloads on commercial clouds, HPC infrastructures, and computing resources deployed in the edge environments where the datasets are generated. The development of a common interface for containerization would facilitate convergence across the entire ecosystem of applications that scientific cyberinfrastructure has to address.


As the data revolution continues to evolve, new paradigms will emerge to support compute-intensive and data-intensive work both in HPC centers and in edge environments. Recommendations from the HPC community for edge environments include the development of new algorithms to compress datasets by one or more orders of magnitude, and an understanding of how to use lossy compression. Furthermore, next-generation workloads may include not only classical HPC-type applications, but also machine and deep learning applications, which require a new level of abstraction between software and hardware to run these hybrid workloads. As HPC and the big data revolution continue to develop and converge, new needs and opportunities will arise, including the use of HPC math libraries for high-end data analysis, the development of new standards for shared memory, and interoperability between programming models and data formats.
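The lossy-compression trade-off mentioned above can be illustrated with a toy uniform quantizer. This is purely didactic, with made-up helper names: mapping each float onto a small number of integer levels cuts storage (e.g., float64 to uint8 is an 8x reduction) in exchange for a bounded reconstruction error; production scientific compressors are far more sophisticated.

```python
def quantize(samples, bits=8):
    """Lossily encode floats as integers in [0, 2**bits - 1], plus scale metadata."""
    lo, hi = min(samples), max(samples)
    levels = (1 << bits) - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((s - lo) / scale) for s in samples]
    return codes, lo, scale

def dequantize(codes, lo, scale):
    """Reconstruct approximate floats from the integer codes."""
    return [lo + c * scale for c in codes]

data = [0.0, 0.1, 0.5, 0.9, 1.0]
codes, lo, scale = quantize(data, bits=8)
restored = dequantize(codes, lo, scale)
# Rounding to the nearest level bounds the error by half a quantization step.
max_err = max(abs(a - b) for a, b in zip(data, restored))
print(max_err <= scale / 2 + 1e-12)
```

Halving `bits` doubles the compression but also doubles the error bound, which is exactly the kind of accuracy-versus-volume decision edge environments will have to make.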

The data revolution has already initiated a paradigm shift in gravitational wave astrophysics and high-energy physics. Deep learning algorithms have been used to show that gravitational wave detection can be carried out faster than real time, while also increasing the depth and speed of established LIGO detection algorithms and enabling the detection of new classes of gravitational wave sources geodf:2017a ; geodf:2017b ; rebei:2018 ; hshen:2017 ; georgenoise ; hall . Deep learning approaches to the search for new physics at the LHC started around 2012 and have since been applied to many challenges, including simulation, particle identification, and event characterization Guest:2018yhq . These algorithms have been developed by combining HPC, innovative hardware architectures, and deep learning techniques. The potential of this new wave of innovation as an alternative to combining HTC and HPC to cope with the ever-increasing computational demands of edge environments will be discussed in future work.

This research is part of the Blue Waters sustained-petascale computing project, which is supported by the National Science Foundation (awards OCI-0725070 and ACI-1238993) and the State of Illinois. Blue Waters is a joint effort of the University of Illinois at Urbana-Champaign and its National Center for Supercomputing Applications. We thank Brett Bode, Greg Bauer, HonWai Leong and William Kramer for useful interactions.


  • (1) Blue Waters, Sustained Petascale Computing.
  • (2) Abbott, B.P., Abbott, R., Abbott, T.D., Abernathy, M.R., Acernese, F., Ackley, K., Adams, C., Adams, T., Addesso, P., Adhikari, R.X., et al.: GW150914: The Advanced LIGO Detectors in the Era of First Discoveries. Physical Review Letters 116(13), 131103 (2016). DOI 10.1103/PhysRevLett.116.131103
  • (3) Abbott, B.P., Abbott, R., Abbott, T.D., Abernathy, M.R., Acernese, F., Ackley, K., Adams, C., Adams, T., Addesso, P., Adhikari, R.X., et al.: GW151226: Observation of Gravitational Waves from a 22-Solar-Mass Binary Black Hole Coalescence. Physical Review Letters 116(24), 241103 (2016). DOI 10.1103/PhysRevLett.116.241103
  • (4) Abbott, B.P., Abbott, R., Abbott, T.D., Abernathy, M.R., Acernese, F., Ackley, K., Adams, C., Adams, T., Addesso, P., Adhikari, R.X., et al.: Observation of Gravitational Waves from a Binary Black Hole Merger. Physical Review Letters 116(6), 061102 (2016). DOI 10.1103/PhysRevLett.116.061102
  • (5) Abbott, B.P., Abbott, R., Abbott, T.D., Abernathy, M.R., Acernese, F., Ackley, K., Adams, C., Adams, T., Addesso, P., Adhikari, R.X., et al.: GW170104: Observation of a 50-Solar-Mass Binary Black Hole Coalescence at Redshift 0.2. Physical Review Letters 118, 221101 (2017). DOI 10.1103/PhysRevLett.118.221101
  • (6) Abbott, B.P., Abbott, R., Abbott, T.D., Acernese, F., Ackley, K., Adams, C., Adams, T., Addesso, P., Adhikari, R.X., Adya, V.B., et al.: Estimating the Contribution of Dynamical Ejecta in the Kilonova Associated with GW170817. Astrophys. J. Lett 850, L39 (2017). DOI 10.3847/2041-8213/aa9478
  • (7) Abbott, B.P., Abbott, R., Abbott, T.D., Acernese, F., Ackley, K., Adams, C., Adams, T., Addesso, P., Adhikari, R.X., Adya, V.B., et al.: Gravitational Waves and Gamma-Rays from a Binary Neutron Star Merger: GW170817 and GRB 170817A. Astrophys. J. Lett 848, L13 (2017). DOI 10.3847/2041-8213/aa920c
  • (8) Abbott, B.P., Abbott, R., Abbott, T.D., Acernese, F., Ackley, K., Adams, C., Adams, T., Addesso, P., Adhikari, R.X., Adya, V.B., et al.: GW170814: A Three-Detector Observation of Gravitational Waves from a Binary Black Hole Coalescence. Physical Review Letters 119(14), 141101 (2017). DOI 10.1103/PhysRevLett.119.141101
  • (9) Abbott, B.P., Abbott, R., Abbott, T.D., Acernese, F., Ackley, K., Adams, C., Adams, T., Addesso, P., Adhikari, R.X., Adya, V.B., et al.: GW170817: Observation of Gravitational Waves from a Binary Neutron Star Inspiral. Physical Review Letters 119(16), 161101 (2017). DOI 10.1103/PhysRevLett.119.161101
  • (10) Abbott, B.P., Abbott, R., Abbott, T.D., Acernese, F., Ackley, K., Adams, C., Adams, T., Addesso, P., Adhikari, R.X., Adya, V.B., et al.: Multi-messenger Observations of a Binary Neutron Star Merger. Astrophys. J. Lett 848, L12 (2017). DOI 10.3847/2041-8213/aa91c9
  • (11) Ahmad, Q.R., et al.: Measurement of the rate of interactions produced by solar neutrinos at the Sudbury Neutrino Observatory. Phys. Rev. Lett. 87, 071301 (2001). DOI 10.1103/PhysRevLett.87.071301
  • (12) Anninos, P., Hobill, D., Seidel, E., Smarr, L., Suen, W.M.: Collision of two black holes. Physical Review Letters 71, 2851–2854 (1993). DOI 10.1103/PhysRevLett.71.2851
  • (13) Ansorg, M., Brügmann, B., Tichy, W.: A single-domain spectral method for black hole puncture data. Phys. Rev. D 70, 064011 (2004). DOI 10.1103/PhysRevD.70.064011
  • (14) ATLAS Collaboration: The ATLAS experiment at the CERN Large Hadron Collider. JINST 3, S08003 (2008). DOI 10.1088/1748-0221/3/08/S08003
  • (15) Baker, J.G., Centrella, J., Choi, D.I., Koppitz, M., van Meter, J.: Gravitational-Wave Extraction from an Inspiraling Configuration of Merging Black Holes. Physical Review Letters 96(11), 111102 (2006). DOI 10.1103/PhysRevLett.96.111102
  • (16) Baumgarte, T.W., Shapiro, S.L.: Numerical integration of Einstein’s field equations. Phys. Rev. D 59(2), 024007 (1998). DOI 10.1103/PhysRevD.59.024007
  • (17) Belkin, M., Haas, R., Arnold, G.W., Leong, H.W., Huerta, E.A., Lesny, D., Neubauer, M.: Container solutions for HPC Systems: A Case Study of Using Shifter on Blue Waters. ArXiv e-prints (2018)
  • (18) Bohé, A., Shao, L., Taracchini, A., Buonanno, A., Babak, S., Harry, I.W., Hinder, I., Ossokine, S., Pürrer, M., Raymond, V., Chu, T., Fong, H., Kumar, P., Pfeiffer, H.P., Boyle, M., Hemberger, D.A., Kidder, L.E., Lovelace, G., Scheel, M.A., Szilágyi, B.: Improved effective-one-body model of spinning, nonprecessing binary black holes for the era of gravitational-wave astrophysics with advanced detectors. Phys. Rev. D 95(4), 044028 (2017). DOI 10.1103/PhysRevD.95.044028
  • (19) Campanelli, M., Lousto, C.O., Marronetti, P., Zlochower, Y.: Accurate Evolutions of Orbiting Black-Hole Binaries without Excision. Physical Review Letters 96(11), 111101 (2006). DOI 10.1103/PhysRevLett.96.111101
  • (20) Cannon, K., Cariou, R., Chapman, A., Crispin-Ortuzar, M., Fotopoulos, N., Frei, M., Hanna, C., Kara, E., Keppel, D., Liao, L., Privitera, S., Searle, A., Singer, L., Weinstein, A.: Toward Early-warning Detection of Gravitational Waves from Compact Binary Coalescence. Astrophys. J. 748, 136 (2012). DOI 10.1088/0004-637X/748/2/136
  • (21) de Florian, D., Grojean, C., Maltoni, F., Mariotti, C., Nikitenko, A., Pieri, M., Savard, P., Schumacher, M., Tanaka, R., Aggleton, R., et al.: Handbook of LHC Higgs Cross Sections: 4. Deciphering the Nature of the Higgs Sector. ArXiv e-prints (2016)
  • (22) Diener, P., Dorband, E.N., Schnetter, E., Tiglio, M.: New, efficient, and accurate high order derivative and dissipation operators satisfying summation by parts, and applications in three-dimensional multi-block evolutions. J. Sci. Comput. 32, 109–145 (2007). DOI 10.1007/s10915-006-9123-7
  • (23) Einstein, A.: Does the inertia of a body depend upon its energy-content?, pp. 67–71 (1952)
  • (24) Englert, F., Brout, R.: Broken symmetry and the mass of gauge vector mesons. Phys. Rev. Lett. 13, 321–323 (1964). DOI 10.1103/PhysRevLett.13.321
  • (25) Etienne, Z.B., Paschalidis, V., Haas, R., Mösta, P., Shapiro, S.L.: IllinoisGRMHD: an open-source, user-friendly GRMHD code for dynamical spacetimes. Classical and Quantum Gravity 32(17), 175009 (2015). DOI 10.1088/0264-9381/32/17/175009
  • (26) Fukuda, Y., et al.: Evidence for oscillation of atmospheric neutrinos. Phys. Rev. Lett. 81, 1562–1567 (1998). DOI 10.1103/PhysRevLett.81.1562
  • (27) G. Aad et al. [ATLAS Collaboration]: Observation of a new particle in the search for the Standard Model Higgs boson with the ATLAS detector at the LHC. Phys. Lett. B 716, 1 (2012). DOI 10.1016/j.physletb.2012.08.020
  • (28) George, D., Huerta, E.A.: Deep Learning for real-time gravitational wave detection and parameter estimation: Results with Advanced LIGO data. Physics Letters B 778, 64–70 (2018). DOI 10.1016/j.physletb.2017.12.053
  • (29) George, D., Huerta, E.A.: Deep neural networks to enable real-time multimessenger astrophysics. Phys. Rev. D 97, 044039 (2018). DOI 10.1103/PhysRevD.97.044039
  • (30) George, D., Shen, H., Huerta, E.A.: Classification and unsupervised clustering of LIGO data with deep transfer learning. Phys. Rev. D 97, 101501 (2018). DOI 10.1103/PhysRevD.97.101501
  • (31) Guest, D., Cranmer, K., Whiteson, D.: Deep Learning and its Application to LHC Physics (2018)
  • (32) Guralnik, G.S., Hagen, C.R., Kibble, T.W.B.: Global conservation laws and massless particles. Phys. Rev. Lett. 13, 585–587 (1964). DOI 10.1103/PhysRevLett.13.585
  • (33) Haas, R., Ott, C.D., Szilagyi, B., Kaplan, J.D., Lippuner, J., Scheel, M.A., Barkett, K., Muhlberger, C.D., Dietrich, T., Duez, M.D., Foucart, F., Pfeiffer, H.P., Kidder, L.E., Teukolsky, S.A.: Simulations of inspiraling and merging double neutron stars using the Spectral Einstein Code. Phys. Rev. D 93(12), 124062 (2016). DOI 10.1103/PhysRevD.93.124062
  • (34) Higgs, P.: Broken symmetries, massless particles and gauge fields. Physics Letters 12(2), 132 – 133 (1964). DOI
  • (35) Higgs, P.W.: Broken symmetries and the masses of gauge bosons. Phys. Rev. Lett. 13, 508–509 (1964). DOI 10.1103/PhysRevLett.13.508
  • (36) Higgs, P.W.: Spontaneous symmetry breakdown without massless bosons. Phys. Rev. 145, 1156–1163 (1966). DOI 10.1103/PhysRev.145.1156
  • (37) Hobill, D.W., Smarr, L.L.: Supercomputing and numerical relativity: a look at the past, present and future., pp. 1–17 (1989)
  • (38) Huerta, E.A., George, D., Zhao, Z., Allen, G.: Real-time regression analysis with deep convolutional neural networks. ArXiv e-prints (2018)
  • (39) Huerta, E.A., Haas, R., Fajardo, E., Katz, D.S., Anderson, S., Couvares, P., Willis, J., Bouvet, T., Enos, J., Kramer, W.T.C., Leong, H.W., Wheeler, D.: BOSS-LDG: A Novel Computational Framework that Brings Together Blue Waters, Open Science Grid, Shifter and the LIGO Data Grid to Accelerate Gravitational Wave Discovery. In: 2017 IEEE 13th International Conference on e-Science (e-Science) (2017). DOI 10.1109/eScience.2017.47
  • (40) Husa, S., Khan, S., Hannam, M., Pürrer, M., Ohme, F., Forteza, X.J., Bohé, A.: Frequency-domain gravitational waves from nonprecessing black-hole binaries. I. New numerical waveforms and anatomy of the signal. Phys. Rev. D 93(4), 044006 (2016). DOI 10.1103/PhysRevD.93.044006
  • (41) Jones, M.D., White, J.P., Innus, M., DeLeon, R.L., Simakov, N., Palmer, J.T., et al.: Workload Analysis of Blue Waters. ArXiv e-prints (2017)
  • (42) Khan, S., Husa, S., Hannam, M., Ohme, F., Pürrer, M., Forteza, X.J., Bohé, A.: Frequency-domain gravitational waves from nonprecessing black-hole binaries. II. A phenomenological model for the advanced detector era. Phys. Rev. D 93(4), 044007 (2016). DOI 10.1103/PhysRevD.93.044007
  • (43) Kibble, T.W.B.: Symmetry breaking in non-abelian gauge theories. Phys. Rev. 155, 1554–1561 (1967). DOI 10.1103/PhysRev.155.1554
  • (44) Kidder, L.E., Field, S.E., Foucart, F., Schnetter, E., Teukolsky, S.A., Bohn, A., Deppe, N., Diener, P., Hébert, F., Lippuner, J., Miller, J., Ott, C.D., Scheel, M.A., Vincent, T.: SpECTRE: A task-based discontinuous Galerkin code for relativistic astrophysics. Journal of Computational Physics 335, 84–114 (2017). DOI 10.1016/
  • (45) Kramer, W., Butler, M., Bauer, G., Chadalavada, K., Mendes, C.: Blue Waters Parallel I/O Storage Sub-system. In: Prabhat, Q. Koziol (eds.) High Performance Parallel I/O, pp. 17–32. CRC Publications, Taylor and Francis Group (2015)
  • (46) LHC Higgs Cross Section Working Group, Dittmaier, S., et al.: Handbook of LHC Higgs Cross Sections: 1. Inclusive Observables. ArXiv e-prints (2011)
  • (47) LHC Higgs Cross Section Working Group, Dittmaier, S., et al.: Handbook of LHC Higgs Cross Sections: 2. Differential Distributions. ArXiv e-prints (2012)
  • (48) Löffler, F., Faber, J., Bentivegna, E., Bode, T., Diener, P., Haas, R., Hinder, I., Mundim, B.C., Ott, C.D., Schnetter, E., Allen, G., Campanelli, M., Laguna, P.: The Einstein Toolkit: a community computational infrastructure for relativistic astrophysics. Classical and Quantum Gravity 29(11), 115001 (2012). DOI 10.1088/0264-9381/29/11/115001
  • (49) Maeno, T., De, K., Wenaus, T., Nilsson, P., Stewart, G.A., Walker, R., Stradling, A., Caballero, J., Potekhin, M., Smith, D.: Overview of ATLAS PanDA workload management. J. Phys. Conf. Ser. 331, 072024 (2011). DOI 10.1088/1742-6596/331/7/072024
  • (50) Matzner, R.A., Seidel, H.E., Shapiro, S.L., Smarr, L., Suen, W.M., Teukolsky, S.A., Winicour, J.: Geometry of a Black Hole Collision. Science 270, 941–947 (1995). DOI 10.1126/science.270.5238.941
  • (51) Mösta, P., Mundim, B.C., Faber, J.A., Haas, R., Noble, S.C., Bode, T., Löffler, F., Ott, C.D., Reisswig, C., Schnetter, E.: GRHydro: a new open-source general-relativistic magnetohydrodynamics code for the Einstein toolkit. Classical and Quantum Gravity 31(1), 015005 (2014). DOI 10.1088/0264-9381/31/1/015005
  • (52) Mroué, A.H., Scheel, M.A., Szilágyi, B., Pfeiffer, H.P., Boyle, M., Hemberger, D.A., Kidder, L.E., Lovelace, G., Ossokine, S., Taylor, N.W., Zenginoğlu, A., Buchman, L.T., Chu, T., Foley, E., Giesler, M., Owen, R., Teukolsky, S.A.: Catalog of 174 Binary Black Hole Simulations for Gravitational Wave Astronomy. Physical Review Letters 111(24), 241104 (2013). DOI 10.1103/PhysRevLett.111.241104
  • (53) Nakamura, T., Oohara, K., Kojima, Y.: General Relativistic Collapse to Black Holes and Gravitational Waves from Black Holes. Progress of Theoretical Physics Supplement 90, 1–218 (1987). DOI 10.1143/PTPS.90.1
  • (54) Novotny, J., Tuecke, S., Welch, V.: An online credential repository for the grid: Myproxy. In: Proceedings 10th IEEE International Symposium on High Performance Distributed Computing, pp. 104–111 (2001). DOI 10.1109/HPDC.2001.945181
  • (55) Oleynik, D., Panitkin, S., Turilli, M., Angius, A., Oral, S., De, K., Klimentov, A., Wells, J.C., Jha, S.: High-throughput computing on high-performance platforms: A case study. In: 2017 IEEE 13th International Conference on e-Science (e-Science), pp. 295–304 (2017). DOI 10.1109/eScience.2017.43
  • (56) Pollney, D., Reisswig, C., Schnetter, E., Dorband, N., Diener, P.: High accuracy binary black hole simulations with an extended wave zone. Phys. Rev. D 83(4), 044045 (2011). DOI 10.1103/PhysRevD.83.044045
  • (57) Pordes, R., Petravick, D., Kramer, B., Olson, D., Livny, M., Roy, A., Avery, P., Blackburn, K., Wenaus, T., Würthwein, F., et al.: The open science grid. In: Journal of Physics: Conference Series, vol. 78, p. 012057. IOP Publishing (2007)
  • (58) Pretorius, F.: Evolution of Binary Black-Hole Spacetimes. Physical Review Letters 95(12), 121101 (2005). DOI 10.1103/PhysRevLett.95.121101
  • (59) Rebei, A., Huerta, E.A., Wang, S., Habib, S., Haas, R., Johnson, D., George, D.: Fusing numerical relativity and deep learning to detect higher-order multipole waveforms from eccentric binary black hole mergers. ArXiv e-prints (2018)
  • (60) S. Chatrchyan et al. [CMS Collaboration]: Observation of a new boson at a mass of 125 GeV with the CMS experiment at the LHC. Phys. Lett. B 716, 30–61 (2012). DOI 10.1016/j.physletb.2012.08.021
  • (61) Schnetter, E., Hawley, S.H., Hawke, I.: Evolutions in 3-D numerical relativity using fixed mesh refinement. Class. Quantum Grav. 21, 1465–1488 (2004). DOI 10.1088/0264-9381/21/6/014
  • (62) Sfiligoi, I., Bradley, D.C., Holzman, B., Mhashilkar, P., Padhi, S., Wurthwein, F.: The pilot way to grid resources using glideinwms. In: Computer Science and Information Engineering, 2009 WRI World Congress on, vol. 2, pp. 428–432. IEEE (2009)
  • (63) Shen, H., George, D., Huerta, E.A., Zhao, Z.: Denoising Gravitational Waves using Deep Learning with Recurrent Denoising Autoencoders. ArXiv e-prints (2017)
  • (64) Shibata, M., Nakamura, T.: Evolution of three-dimensional gravitational waves: Harmonic slicing case. Phys. Rev. D 52, 5428–5444 (1995). DOI 10.1103/PhysRevD.52.5428
  • (65) Smarr, L.: Numerical Construction of Space-Time. Royal Society of London Proceedings Series A 368, 15–16 (1979). DOI 10.1098/rspa.1979.0109
  • (66) Taracchini, A., Buonanno, A., Pan, Y., Hinderer, T., Boyle, M., Hemberger, D.A., Kidder, L.E., Lovelace, G., Mroué, A.H., Pfeiffer, H.P., Scheel, M.A., Szilágyi, B., Taylor, N.W., Zenginoglu, A.: Effective-one-body model for black-hole binaries with generic mass ratios and spins. Phys. Rev. D 89(6), 061502 (2014). DOI 10.1103/PhysRevD.89.061502
  • (67) Thain, D., Tannenbaum, T., Livny, M.: Distributed computing in practice: the condor experience. Concurrency - Practice and Experience 17(2-4), 323–356 (2005)
  • (68) The LHC Higgs Cross Section Working Group, Heinemeyer, S., et al.: Handbook of LHC Higgs Cross Sections: 3. Higgs Properties. ArXiv e-prints (2013)
  • (69) The LIGO Scientific Collaboration, Aasi, J., et al.: Advanced LIGO. Classical and Quantum Gravity 32(7), 074001 (2015). DOI 10.1088/0264-9381/32/7/074001
  • (70) The LIGO Scientific Collaboration, the Virgo Collaboration, Abbott, B.P., Abbott, R., Abbott, T.D., Acernese, F., Ackley, K., Adams, C., Adams, T., Addesso, P., et al.: GW170608: Observation of a 19-solar-mass Binary Black Hole Coalescence. ArXiv e-prints (2017)
  • (71) Thornburg, J.: A Fast Apparent-Horizon Finder for 3-Dimensional Cartesian Grids in Numerical Relativity. Class. Quantum Grav. 21, 743–766 (2004). DOI 10.1088/0264-9381/21/2/026
  • (72) Usman, S.A., Nitz, A.H., Harry, I.W., Biwer, C.M., Brown, D.A., Cabero, M., Capano, C.D., Dal Canton, T., Dent, T., Fairhurst, S., Kehl, M.S., Keppel, D., Krishnan, B., Lenon, A., Lundgren, A., Nielsen, A.B., Pekowsky, L.P., Pfeiffer, H.P., Saulson, P.R., West, M., Willis, J.L.: The PyCBC search for gravitational waves from compact binary coalescence. Classical and Quantum Gravity 33(21), 215004 (2016). DOI 10.1088/0264-9381/33/21/215004
  • (73) Veitch, J., Raymond, V., Farr, B., Farr, W., Graff, P., Vitale, S., Aylott, B., Blackburn, K., Christensen, N., Coughlin, M., Del Pozzo, W., Feroz, F., Gair, J., Haster, C.J., Kalogera, V., Littenberg, T., Mandel, I., O’Shaughnessy, R., Pitkin, M., Rodriguez, C., Röver, C., Sidery, T., Smith, R., Van Der Sluys, M., Vecchio, A., Vousden, W., Wade, L.: Parameter estimation for compact binaries with ground-based gravitational-wave observations using the LALInference software library. Phys. Rev. D 91(4), 042003 (2015). DOI 10.1103/PhysRevD.91.042003
  • (74) Wardell, B., Hinder, I., Bentivegna, E.: Simulation of GW150914 binary black hole merger using the Einstein Toolkit (2016). DOI 10.5281/zenodo.155394
  • (75) Weitzel, D., Bockelman, B., Brown, D.A., Couvares, P., Würthwein, F., Fajardo Hernandez, E.: Data Access for LIGO on the OSG. ArXiv e-prints (1705.06202 [cs.DC]) (2017)