I. Executive Summary
In this article we discuss advances in deep learning research that we have pioneered and successfully applied for real-time classification and regression of time-series data, both in simulated and in non-stationary, non-Gaussian data [1, 2]. We also describe a novel combination of deep learning and transfer learning that we have developed and applied for the classification and clustering of noise anomalies [3, 4]. We discuss the general applicability of these new methodologies across science domains to tackle computational grand challenges.
II. Key Challenges
The key challenges we address in this paper concern the application of deep neural networks (DNNs) to enable real-time classification and regression of time-series data that span a higher-dimensional parameter space, and which are embedded in non-Gaussian and non-stationary noise; and the combination of deep learning and transfer learning to develop deep neural networks for the classification and clustering of noise anomalies using small training datasets. To highlight the relevance and timeliness of these research themes we consider two science cases: gravitational wave astrophysics and large-scale astronomical surveys.
The most sensitive algorithms that enabled the discovery of gravitational waves target only a 4-dimensional parameter space out of the 9-dimensional space that describes the gravitational wave sources available to the LIGO detectors. A limiting factor of these matched-filtering algorithms is their computational expense and lack of scalability for real-time regression analyses [5, 6, 7, 8]. This issue is exacerbated when matched filtering is combined with fully Bayesian algorithms, leading to full production analyses that take from several hours to months using LIGO computing centers and state-of-the-art high performance computing facilities such as the Blue Waters supercomputer [9].
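To make the scaling concrete, the core of a matched-filtering search can be sketched in a few lines: the data are correlated against every template in a bank, so the cost grows linearly with the bank size. In this minimal sketch the templates are toy sinusoids rather than physical waveforms, and the bank size, data length, and noise model are all illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy detector output: white Gaussian noise standing in for real data.
n = 1024
data = rng.normal(size=n)

# Toy template bank. Realistic banks contain on the order of 1e5-1e6
# templates, and each one must be correlated against every data segment.
bank = [np.sin(2 * np.pi * f * np.arange(n))
        for f in np.linspace(0.01, 0.1, 32)]

def snr_series(d, h):
    """Matched-filter output for white noise: correlation of the data
    against the unit-norm template."""
    h = h / np.sqrt(np.sum(h ** 2))
    return np.correlate(d, h, mode="same")

# One full pass over the bank; cost is linear in the number of templates.
peaks = [np.max(np.abs(snr_series(data, h))) for h in bank]
best_template = int(np.argmax(peaks))
```

This is the bottleneck the text describes: covering the full 9-dimensional source parameter space would require a bank far larger than any of the toy figures above, which is what makes real-time matched-filtering searches intractable.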
This computational grand challenge is ubiquitous across science domains. In the context of image processing, the extraction of specific signatures from telescope images will become a major challenge once the Large Synoptic Survey Telescope (the most sensitive astronomical camera to take snapshots of the southern hemisphere) starts operating by the end of this decade. This astronomical facility will generate TB-size datasets on a nightly basis, releasing tens of thousands of images every minute that will contain an unprecedented amount of information about the nearby Universe. Key information embedded in these images will have to be processed in real-time to enable groundbreaking scientific discoveries, a task that is not feasible with existing algorithms.
III. New Research Directions
There are ongoing efforts to alleviate the lack of scalability of matched-filtering algorithms [6]. Other approaches involve the development of new signal processing techniques using machine learning [10, 11, 12, 13, 14, 15, 3, 16]. While these traditional machine learning techniques, including shallow artificial neural networks (ANNs), require “hand-crafted” features extracted from the data as inputs rather than the raw noisy data itself, DNNs are capable of extracting these features automatically.
In the context of image classification, we have applied deep learning for the classification of noise anomalies, with spectrogram images as inputs to convolutional neural networks (CNNs) [17, 15, 3], and for unsupervised clustering of transients [3]. Using images as inputs is advantageous for two reasons: (i) there are well-established architectures of 2D CNNs which have been shown to work (GoogLeNet [18], VGG [19], ResNet [20]); and (ii) pre-trained weights are available for them, which can significantly speed up the training process via transfer learning while also providing higher accuracy even for small datasets [3].
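The transfer-learning step can be sketched minimally: freeze a pre-trained feature extractor and retrain only the final classifier on a small labeled set. The frozen random projection below is merely a stand-in for a pre-trained convolutional base such as ResNet, and the dataset and labels are synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a pre-trained convolutional base (e.g. ResNet features):
# a frozen random projection followed by a ReLU. Frozen = never updated.
W_frozen = rng.normal(size=(64, 16)) / 8.0

def features(x):
    return np.maximum(x @ W_frozen, 0.0)

# Small synthetic "glitch" dataset; labels are made linear in the frozen
# features so the sketch is well-posed.
X = rng.normal(size=(40, 64))
true_w = rng.normal(size=16)
y = (features(X) @ true_w > 0).astype(float)

# Transfer learning: train ONLY the final logistic classifier by
# gradient descent, leaving the feature extractor untouched.
F = features(X)
w, b, lr = np.zeros(16), 0.0, 0.5
for _ in range(1000):
    p = 1.0 / (1.0 + np.exp(-(F @ w + b)))  # sigmoid output
    grad = p - y                             # logistic-loss gradient
    w -= lr * F.T @ grad / len(y)
    b -= lr * grad.mean()

accuracy = ((F @ w + b > 0).astype(float) == y).mean()
```

Because only the small final layer is trained, good accuracy can be reached with far less data and compute than training the full network from scratch, which is the advantage point (ii) above refers to.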
In the context of real-time regression, our new deep learning method performs very well when we consider a 2-dimensional parameter space. However, we want to explore its performance in higher-dimensional parameter spaces. A key feature that will prove critical in that scenario is the scalability of deep learning, i.e., all the intensive computation is diverted to the one-time training stage, after which the training datasets can be discarded.
These two problems share common themes for future research: i) development of optimal strategies to train deep neural nets using TB-size datasets, e.g., using genetic algorithms; ii) development of statistical techniques to increase the sensitivity of neural nets to extract low signal-to-noise ratio signals from noisy time-series; iii) systematic exploration to elucidate why deep convolutional neural networks outperform machine learning classifiers such as Random Forest, Support Vector Machine, k-Nearest Neighbors, Hidden Markov Model, Shallow Neural Networks, etc. [1], and exploration of whether this property holds in higher-dimensional signal manifolds; and iv) assessment of whether deep learning point parameter estimation results are consistent with maximum likelihood Bayesian results, and thus useful as seeds to accelerate existing Bayesian formulations.
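One concrete strategy for theme ii) can be sketched as a curriculum over the signal-to-noise ratio: generate training sets in which templates are injected into noise at progressively lower SNR, so the network gradually learns to dig weaker signals out of the noise. The chirp below is a toy waveform rather than a physical template, and all parameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

def toy_chirp(n=512, f0=0.01, f1=0.1):
    """Toy chirp-like waveform standing in for a template; not a
    physical gravitational waveform."""
    t = np.arange(n)
    phase = 2 * np.pi * (f0 * t + (f1 - f0) * t ** 2 / (2 * n))
    return np.sin(phase)

def inject(signal, snr):
    """Embed a signal in unit-variance white Gaussian noise, scaled so
    the optimal matched-filter SNR equals `snr` (white-noise case)."""
    noise = rng.normal(size=signal.size)
    scale = snr / np.sqrt(np.sum(signal ** 2))
    return scale * signal + noise

# Curriculum: start training at high SNR, then lower it at each stage.
for snr in [10.0, 5.0, 2.0, 1.0]:
    batch = np.stack([inject(toy_chirp(), snr) for _ in range(8)])
    # ... train the network on `batch` at this SNR stage ...
```

The staged schedule here is one plausible instantiation of such a statistical training technique; the specific SNR ladder, batch size, and waveform family would all be tuned for the real problem.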
IV. State of the Art
The key challenge in carrying out real-time regression analysis is that most deep neural network algorithms output point parameter estimation values. Ideally, we would like to also provide statistical information, as is customarily done in Bayesian studies. Recent work has started to shed light in this direction [21]. Being able to carry out real-time regression with deep neural networks that provide statistical information would be a remarkable achievement with far-reaching consequences.
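One pragmatic route toward attaching statistical information to a network's point estimates, sketched here under strong simplifications, is Monte Carlo dropout: keep dropout active at inference time and treat repeated stochastic forward passes as samples from an approximate predictive distribution. The weights below are random placeholders for a trained regression network; only the inference-time procedure is the point.

```python
import numpy as np

rng = np.random.default_rng(2)

# Stand-in for a trained regression network: one hidden layer with
# dropout. In practice these weights would come from training.
W1 = rng.normal(size=(32, 64)) / 8.0
W2 = rng.normal(size=(64, 1)) / 8.0

def forward(x, drop_p=0.5):
    h = np.maximum(x @ W1, 0.0)
    mask = rng.random(h.shape) > drop_p  # dropout stays ON at test time
    h = h * mask / (1.0 - drop_p)        # inverted-dropout scaling
    return (h @ W2).ravel()

x = rng.normal(size=(1, 32))  # one input to be regressed

# Many stochastic passes yield a predictive distribution rather than a
# single point estimate; its spread is an uncertainty proxy.
samples = np.array([forward(x)[0] for _ in range(200)])
mean, std = samples.mean(), samples.std()
```

The resulting mean and spread are not a full Bayesian posterior, but they provide exactly the kind of statistical summary alongside the point estimate that the text calls for.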
To accomplish this work, we would have to generalize to higher-dimensional signal manifolds the work we introduced in [1, 2]. To tackle this problem, it will be necessary to quantify whether the parameter space can be compactified, thereby removing parameter space degeneracies and accelerating the training time and hyperparameter optimization of neural nets. It will also be necessary to assess what type of neural nets are optimal for the problem at hand; i.e., we have found that recurrent neural nets are ideal for denoising time-series [16], whereas deep convolutional neural nets are optimal for regression and classification [1, 2].
In the gravitational wave detection scenario, this would imply that a single deep learning algorithm, running on a dedicated inference GPU, would suffice to process the lightweight data (2 MB/second) that is generated in low latency by gravitational wave detectors. Similarly, if a similar framework is applied to process images and extract specific signatures embedded in noise, such as the images to be generated by LSST [3, 22], then both time-series data and images could be post-processed simultaneously in real-time, facilitating the observation of astrophysical phenomena using multimessenger astronomy, i.e., contemporaneous observations with gravitational waves, light, neutrinos and cosmic rays.
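Operationally, this low-latency processing mode amounts to sliding a fixed-length window over the incoming stream and running one inexpensive forward pass per step, instead of a full matched-filter bank. In the sketch below the stream, sampling rate, and window parameters are all illustrative, and the classifier is a placeholder statistic rather than a trained network.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical detector stream: 1 second of data at an illustrative
# 4096 Hz; the 2 MB/second figure in the text refers to the real
# low-latency feed.
stream = rng.normal(size=4096)

def classify(window):
    """Placeholder for a trained DNN's forward pass; a simple energy
    statistic keeps the sketch self-contained."""
    return float(np.mean(window ** 2))

# Real-time processing: one cheap inference per hop over the stream.
window_len, hop = 1024, 256
scores = [classify(stream[i:i + window_len])
          for i in range(0, len(stream) - window_len + 1, hop)]
```

Because each step is a single fixed-cost forward pass, the total cost per second of data is constant regardless of how large the training parameter space was, which is what makes a single dedicated inference GPU plausible.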
V. Maturity and Uniqueness
Deep learning is uniquely poised to overcome what is known as the curse of dimensionality [23, 24], since it is known to be highly scalable. This intrinsic ability of DNNs to take advantage of large datasets is a unique feature to enable classification and regression analyses over a higher-dimensional parameter space that is beyond the reach of existing algorithms.
Furthermore, DNNs are excellent at generalizing or extrapolating to new data. In the context of gravitational wave astronomy, our preliminary results indicate that DNNs trained with only signals from a 2-dimensional parameter space were able to detect and reconstruct the parameters of signals that span up to a 4-dimensional signal manifold, and which currently may go unnoticed with established detection algorithms [25, 26, 27, 28]. With existing computational resources on supercomputers such as Blue Waters, we estimate that it is feasible to train DNNs that target a 9-dimensional parameter space within a few weeks.
Furthermore, DNN algorithms require minimal preprocessing. CNNs are capable of automatically learning to perform band-pass filtering on raw time-series inputs [29], and they are excellent at suppressing highly non-stationary colored noise [30], especially when incorporating real-time noise characteristics [31]. This suggests that manually devised preprocessing and whitening steps may be eliminated and raw data can be fed to DNNs. This would be particularly advantageous since it is known that Fourier transforms are the bottlenecks of matched-filtering based algorithms.
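The claim that convolutional layers can subsume band-pass preprocessing can be illustrated directly: a 1-D convolution with a windowed-sinc kernel is a band-pass filter, and a trained convolutional layer can realize kernels of exactly this form. In the sketch below the signal frequency, band edges, and filter length are all illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)

# Noisy time-series: a weak narrowband signal buried in white noise.
n = 2048
t = np.arange(n)
f_sig = 0.05  # signal frequency in cycles/sample
x = 0.3 * np.sin(2 * np.pi * f_sig * t) + rng.normal(size=n)

# Windowed-sinc band-pass kernel around f_sig; a learned 1-D conv
# filter can take exactly this shape.
taps = 101
k = np.arange(taps) - taps // 2
low, high = 0.04, 0.06
bp = 2 * high * np.sinc(2 * high * k) - 2 * low * np.sinc(2 * low * k)
bp *= np.hamming(taps)

y = np.convolve(x, bp, mode="same")

def band_power(sig, f, width=0.005):
    """Fraction of total power within +/- width of frequency f."""
    spec = np.abs(np.fft.rfft(sig)) ** 2
    freqs = np.fft.rfftfreq(sig.size)
    band = (freqs > f - width) & (freqs < f + width)
    return spec[band].sum() / spec.sum()

in_before = band_power(x, f_sig)
in_after = band_power(y, f_sig)  # much larger: noise is suppressed
```

Since gradient descent can discover such kernels from raw inputs, the explicit band-passing stage can in principle be folded into the first convolutional layer, in line with [29].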
VI. Novelty
The deep learning algorithms we pioneered in [1, 2] constitute the first demonstration that deep convolutional neural networks can be applied for real-time classification and regression of weak signals embedded in non-stationary and non-Gaussian noise. It is also the first time that DNNs were shown to exhibit features similar to Gaussian Process Regression [32, 33, 34], and to generalize to signals beyond the templates used for training. Furthermore, our DNNs can be evaluated faster than real-time with a single CPU, and very intensive searches over a broader range of signals can be easily carried out with one dedicated GPU. These results have sparked a keen interest in the gravitational wave community, and have led to a plethora of independent studies within the gravitational wave physics and computer science communities.
Acknowledgement
This research is part of the Blue Waters sustained-petascale computing project, which is supported by the National Science Foundation (awards OCI-0725070 and ACI-1238993) and the State of Illinois. Blue Waters is a joint effort of the University of Illinois at Urbana-Champaign and its National Center for Supercomputing Applications.
References
 [1] D. George and E. A. Huerta, “Deep neural networks to enable real-time multimessenger astrophysics,” Phys. Rev. D, vol. 97, p. 044039, Feb. 2018.
 [2] D. George and E. A. Huerta, “Deep Learning for real-time gravitational wave detection and parameter estimation: Results with Advanced LIGO data,” Physics Letters B, vol. 778, pp. 64–70, Mar. 2018.
 [3] D. George, H. Shen, and E. A. Huerta, “Deep Transfer Learning: A new deep learning glitch classification method for advanced LIGO,” ArXiv e-prints, June 2017.
 [4] D. George, H. Shen, and E. A. Huerta, “Glitch Classification and Clustering for LIGO with Deep Transfer Learning,” ArXiv e-prints, Nov. 2017.
 [5] B. J. Owen and B. S. Sathyaprakash, “Matched filtering of gravitational waves from inspiraling compact binaries: Computational cost and template placement,” Phys. Rev. D , vol. 60, p. 022002, July 1999.
 [6] N. Indik, H. Fehrmann, F. Harke, B. Krishnan, and A. B. Nielsen, “Reducing the number of templates for aligned-spin compact binary coalescence gravitational wave searches,” ArXiv e-prints, Dec. 2017.
 [7] I. Harry, S. Privitera, A. Bohé, and A. Buonanno, “Searching for gravitational waves from compact binaries with precessing spins,” Phys. Rev. D , vol. 94, p. 024012, July 2016.
 [8] R. Smith, S. E. Field, K. Blackburn, C.-J. Haster, M. Pürrer, V. Raymond, and P. Schmidt, “Fast and accurate inference on gravitational waves from precessing compact binaries,” Phys. Rev. D , vol. 94, p. 044031, Aug. 2016.
 [9] E. A. Huerta, R. Haas, E. Fajardo, D. S. Katz, S. Anderson, P. Couvares, J. Willis, T. Bouvet, J. Enos, W. T. C. Kramer, H. W. Leong, and D. Wheeler, “BOSS-LDG: A Novel Computational Framework that Brings Together Blue Waters, Open Science Grid, Shifter and the LIGO Data Grid to Accelerate Gravitational Wave Discovery,” ArXiv e-prints, Sept. 2017.

 [10] P. Graff, F. Feroz, M. P. Hobson, and A. Lasenby, “BAMBI: blind accelerated multimodal Bayesian inference,” MNRAS, vol. 421, pp. 169–180, Mar. 2012.
 [11] N. Mukund, S. Abraham, S. Kandhasamy, S. Mitra, and N. S. Philip, “Transient classification in LIGO data using difference boosting neural network,” Phys. Rev. D, vol. 95, p. 104059, May 2017.
 [12] J. Powell et al., “Classification methods for noise transients in advanced gravitational-wave detectors II: performance tests on Advanced LIGO data,” Classical and Quantum Gravity, vol. 34, p. 034002, Feb. 2017.
 [13] J. Powell, D. Trifirò, E. Cuoco, I. S. Heng, and M. Cavaglià, “Classification methods for noise transients in advanced gravitational-wave detectors,” Classical and Quantum Gravity, vol. 32, p. 215012, Nov. 2015.
 [14] M. Zevin, S. Coughlin, S. Bahaadini, E. Besler, N. Rohani, S. Allen, M. Cabero, K. Crowston, A. Katsaggelos, S. Larson, T. K. Lee, C. Lintott, T. Littenberg, A. Lundgren, C. Oesterlund, J. Smith, L. Trouille, and V. Kalogera, “Gravity Spy: Integrating Advanced LIGO Detector Characterization, Machine Learning, and Citizen Science,” ArXiv e-prints, Nov. 2016.
 [15] S. Bahaadini, N. Rohani, S. Coughlin, M. Zevin, V. Kalogera, and A. K. Katsaggelos, “Deep Multi-view Models for Glitch Classification,” ArXiv e-prints, Apr. 2017.

 [16] H. Shen, D. George, E. A. Huerta, and Z. Zhao, “Denoising Gravitational Waves using Deep Learning with Recurrent Denoising Autoencoders,” ArXiv e-prints, Nov. 2017.
 [17] M. Zevin, S. Coughlin, S. Bahaadini, E. Besler, N. Rohani, S. Allen, M. Cabero, K. Crowston, A. Katsaggelos, S. Larson, T. K. Lee, C. Lintott, T. Littenberg, A. Lundgren, C. Oesterlund, J. Smith, L. Trouille, and V. Kalogera, “Gravity Spy: Integrating Advanced LIGO Detector Characterization, Machine Learning, and Citizen Science,” ArXiv e-prints, Nov. 2016.

 [18] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, “Going deeper with convolutions,” in The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2015.
 [19] K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” CoRR, vol. abs/1409.1556, 2014.
 [20] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” CoRR, vol. abs/1512.03385, 2015.
 [21] J. Lee, Y. Bahri, R. Novak, S. S. Schoenholz, J. Pennington, and J. Sohl-Dickstein, “Deep Neural Networks as Gaussian Processes,” ArXiv e-prints, Oct. 2017.
 [22] N. Sedaghat and A. Mahabal, “Effective Image Differencing with ConvNets for Real-time Transient Hunting,” ArXiv e-prints, Oct. 2017.
 [23] I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning. MIT Press, 2016.
 [24] Y. Bengio and Y. LeCun, “Scaling learning algorithms towards AI,” in Large Scale Kernel Machines (L. Bottou, O. Chapelle, D. DeCoste, and J. Weston, eds.), MIT Press, 2007.
 [25] E. A. Huerta, P. Kumar, B. Agarwal, D. George, H.-Y. Schive, H. P. Pfeiffer, R. Haas, W. Ren, T. Chu, M. Boyle, D. A. Hemberger, L. E. Kidder, M. A. Scheel, and B. Szilagyi, “A complete waveform model for compact binaries on eccentric orbits,” Phys. Rev. D , vol. 95, p. 024038, Jan. 2017.
 [26] V. Tiwari, S. Klimenko, N. Christensen, E. A. Huerta, S. R. P. Mohapatra, A. Gopakumar, M. Haney, P. Ajith, S. T. McWilliams, G. Vedovato, M. Drago, F. Salemi, G. A. Prodi, C. Lazzaro, S. Tiwari, G. Mitselmakher, and F. Da Silva, “Proposed search for the detection of gravitational waves from eccentric binary black holes,” Phys. Rev. D , vol. 93, p. 043007, Feb. 2016.
 [27] E. A. Huerta, P. Kumar, S. T. McWilliams, R. O’Shaughnessy, and N. Yunes, “Accurate and efficient waveforms for compact binaries on eccentric orbits,” Phys. Rev. D , vol. 90, p. 084016, Oct. 2014.
 [28] E. A. Huerta and D. A. Brown, “Effect of eccentricity on binary neutron star searches in advanced LIGO,” Phys. Rev. D , vol. 87, p. 127501, June 2013.
 [29] W. Dai, C. Dai, S. Qu, J. Li, and S. Das, “Very deep convolutional neural networks for raw waveforms,” CoRR, vol. abs/1610.00087, 2016.
 [30] Y. Xu, J. Du, L. R. Dai, and C. H. Lee, “A regression approach to speech enhancement based on deep neural networks,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 23, pp. 7–19, Jan 2015.
 [31] A. Kumar and D. Florêncio, “Speech enhancement in multiple-noise conditions using deep neural networks,” CoRR, vol. abs/1605.02427, 2016.
 [32] D. J. C. MacKay, Information Theory, Inference and Learning Algorithms. Cambridge University Press, Oct. 2003.
 [33] C. J. Moore, C. P. L. Berry, A. J. K. Chua, and J. R. Gair, “Improving gravitational-wave parameter estimation using Gaussian process regression,” Phys. Rev. D , vol. 93, p. 064001, Mar. 2016.
 [34] C. J. Moore and J. R. Gair, “Novel Method for Incorporating Model Uncertainties into Gravitational Wave Parameter Estimates,” Physical Review Letters, vol. 113, p. 251101, Dec. 2014.