Real-time regression analysis with deep convolutional neural networks

05/07/2018 ∙ by E. A. Huerta, et al. ∙ University of Illinois at Urbana-Champaign

We discuss the development of novel deep learning algorithms to enable real-time regression analysis for time series data. We showcase the application of this new method with a timely case study, and then discuss the applicability of this approach to tackle similar challenges across science domains.


I Executive Summary

In this article we discuss advances in deep learning research that we have pioneered and successfully applied for real-time classification and regression of time-series data, both in simulated data and in non-stationary, non-Gaussian data [1, 2]. We also describe a novel combination of deep learning and transfer learning that we have developed and applied for the classification and clustering of noise anomalies [3, 4]. We discuss the general applicability of these new methodologies across science domains to tackle computational grand challenges.

II Key Challenges

The key challenges we address in this paper are twofold: (i) the application of deep neural networks (DNNs) to enable real-time classification and regression of time-series data that span a higher-dimensional parameter space and are embedded in non-Gaussian and non-stationary noise; and (ii) the combination of deep learning and transfer learning to develop deep neural networks for classification and clustering of noise anomalies using small training datasets. To highlight the relevance and timeliness of these research themes, we consider two science cases: gravitational wave astrophysics and large-scale astronomical surveys.

The most sensitive algorithms that enabled the discovery of gravitational waves target only a 4-dimensional parameter space out of the 9-dimensional space that describes the gravitational wave sources available to the LIGO detectors. The limiting factors of these matched-filtering algorithms are their computational expense and lack of scalability for real-time regression analyses [5, 6, 7, 8]. This issue is exacerbated when matched-filtering is combined with fully Bayesian algorithms, leading to full production analyses that take from several hours to months using LIGO computing centers and state-of-the-art high performance computing facilities such as the Blue Waters supercomputer [9].
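To make the scaling argument concrete, the following minimal sketch slides every template in a small bank over a noisy data stretch and keeps the best match. It uses a plain time-domain correlation rather than the noise-weighted, frequency-domain filter of production searches, and the sinusoidal templates, function names, and toy data are all assumptions made for illustration. The point is only that the work grows as (number of templates) × (data length) × (template length), and the number of templates needed grows quickly with the dimension of the parameter space.

```python
import math
import random


def make_template(freq, n=64):
    """Unit-norm sinusoid standing in for a waveform template (illustrative only)."""
    wave = [math.sin(2.0 * math.pi * freq * i / n) for i in range(n)]
    norm = math.sqrt(sum(w * w for w in wave))
    return [w / norm for w in wave]


def matched_filter_peak(data, template):
    """Largest absolute correlation of the template over all time offsets."""
    n = len(template)
    best = 0.0
    for offset in range(len(data) - n + 1):
        corr = sum(data[offset + i] * template[i] for i in range(n))
        best = max(best, abs(corr))
    return best


def search_bank(data, bank):
    """Brute-force search: one full pass over the data per template."""
    scores = {freq: matched_filter_peak(data, tmpl) for freq, tmpl in bank.items()}
    return max(scores, key=scores.get), scores


# Toy data: weak Gaussian noise with one template injected at offset 100.
random.seed(0)
bank = {freq: make_template(freq) for freq in (2, 3, 5, 7)}
data = [random.gauss(0.0, 0.05) for _ in range(256)]
for i, value in enumerate(bank[5]):
    data[100 + i] += 3.0 * value

best_freq, scores = search_bank(data, bank)
operations = len(bank) * (len(data) - 64 + 1) * 64  # multiply-adds performed
```

Here `best_freq` identifies the injected template, and `operations` counts the multiply-adds spent on a bank of only four templates; adding a parameter dimension multiplies the bank size, and hence the cost, accordingly.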

This computational grand challenge is ubiquitous across science domains. In the context of image processing, the extraction of specific signatures from telescope images will become a major challenge once the Large Synoptic Survey Telescope (LSST), the most sensitive astronomical camera to take snapshots of the southern hemisphere, starts operating by the end of this decade. This astronomical facility will generate TB-size datasets on a nightly basis, releasing tens of thousands of images every minute that will contain an unprecedented amount of information about the nearby Universe. Key information embedded in these images will have to be processed in real-time to enable groundbreaking scientific discoveries, a task that is not feasible with existing algorithms.

III New Research Directions

There are ongoing efforts to alleviate the lack of scalability of matched-filtering algorithms [6]. Other approaches involve the development of new signal processing techniques using machine learning [10, 11, 12, 13, 14, 15, 3, 16]. While these traditional machine learning techniques, including shallow artificial neural networks (ANNs), require “handcrafted” features extracted from the data as inputs rather than the raw noisy data itself, DNNs are capable of extracting these features automatically.

In the context of image classification, we have applied deep learning for the classification of noise anomalies with spectrogram images as inputs to convolutional neural networks (CNNs) [17, 15, 3] and unsupervised clustering of transients [3]. Using images as inputs is advantageous for two reasons: (i) there are well established architectures of 2D CNNs which have been shown to work (GoogLeNet [18], VGG [19], ResNet [20]); and (ii) pre-trained weights are available for them, which can significantly speed up the training process via transfer learning while also providing higher accuracy even for small datasets [3].
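The idea behind transfer learning can be sketched in a few lines: keep a previously trained feature extractor frozen and train only a new classification head on the small target dataset. The sketch below is not the architecture used in [3]; the three hand-written filters merely stand in for pre-trained convolutional weights, the 4-pixel "images" are toy data, and every name is hypothetical.

```python
import math
import random

# Fixed filters standing in for pre-trained convolutional weights (hypothetical).
PRETRAINED_FILTERS = [
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [1.0, -1.0, 1.0, -1.0],
]


def frozen_features(image):
    """Frozen feature extractor: linear filters followed by ReLU, never updated."""
    return [max(0.0, sum(w * p for w, p in zip(f, image))) for f in PRETRAINED_FILTERS]


def predict(features, weights, bias):
    """New classification head: logistic regression on the frozen features."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def mean_loss(samples, weights, bias):
    """Average binary cross-entropy over (features, label) pairs."""
    total = 0.0
    for feats, label in samples:
        p = min(max(predict(feats, weights, bias), 1e-12), 1.0 - 1e-12)
        total -= label * math.log(p) + (1 - label) * math.log(1.0 - p)
    return total / len(samples)


def train_head(samples, lr=0.1, epochs=200):
    """Full-batch gradient descent on the head only."""
    n = len(samples[0][0])
    weights, bias = [0.0] * n, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n, 0.0
        for feats, label in samples:
            err = predict(feats, weights, bias) - label
            grad_w = [g + err * x for g, x in zip(grad_w, feats)]
            grad_b += err
        weights = [w - lr * g / len(samples) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(samples)
    return weights, bias


# Small toy dataset: energy on the left (class 0) or on the right (class 1).
random.seed(1)
samples = []
for k in range(20):
    label = k % 2
    base = [0.0, 0.0, 1.0, 1.0] if label else [1.0, 1.0, 0.0, 0.0]
    image = [b + random.gauss(0.0, 0.05) for b in base]
    samples.append((frozen_features(image), label))

initial_loss = mean_loss(samples, [0.0, 0.0, 0.0], 0.0)
weights, bias = train_head(samples)
final_loss = mean_loss(samples, weights, bias)
accuracy = sum((predict(f, weights, bias) > 0.5) == bool(l) for f, l in samples) / len(samples)
```

Because only the handful of head parameters is fitted, twenty labelled examples are enough in this toy setting; with a real pre-trained 2D CNN the same division of labour is what makes training on small datasets practical.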

In the context of real-time regression, our new deep learning method performs very well when we consider a 2-dimensional parameter space. However, we want to explore its performance in higher-dimensional parameter spaces. A key feature that will prove critical in that scenario is the scalability of deep learning, i.e., all the intensive computation is diverted to the one-time training stage, after which the datasets can be discarded.

These two problems share common themes for future research: i) development of optimal strategies to train deep neural nets using TB-size datasets, e.g., using genetic algorithms; ii) development of statistical techniques to increase the sensitivity of neural nets to extract low signal-to-noise ratio signals from noisy time-series; iii) systematic exploration to elucidate why deep convolutional neural networks outperform machine learning classifiers such as Random Forest, Support Vector Machine, k-Nearest Neighbors, Hidden Markov Model, and Shallow Neural Networks [1], and to explore whether this property holds in higher-dimensional signal manifolds; and iv) assessment of whether deep learning point parameter estimation results are consistent with maximum likelihood Bayesian results, and thus useful as seeds to accelerate existing Bayesian formulations.

IV State of the Art

The key challenge in carrying out real-time regression analysis is that most deep neural network algorithms output only point parameter estimates. Ideally, we would also like to provide statistical information, as is customarily done in Bayesian studies. Recent work has started to shed light in this direction [21]. Being able to carry out real-time regression with deep neural networks that provide statistical information would be a remarkable achievement with far-reaching consequences.

To accomplish this work, we would have to generalize the work we introduced in [1, 2] to higher-dimensional signal manifolds. To tackle this problem, it will be necessary to quantify whether the parameter space can be compactified, thereby removing parameter space degeneracies and accelerating the training and hyperparameter optimization of neural nets. It will also be necessary to assess what type of neural net is optimal for the problem at hand, e.g., we have found that recurrent neural nets are ideal for de-noising time-series [16], whereas deep convolutional neural nets are optimal for regression and classification [1, 2].

In the gravitational wave detection scenario, this would imply that a single deep learning algorithm, running on a dedicated inference GPU, would suffice to process the lightweight data (2MB/second) that is generated in low latency by gravitational wave detectors. Likewise, if a similar framework is applied to process images and extract specific signatures embedded in noise, such as the images to be generated by LSST [3, 22], then both time-series data and images could be post-processed simultaneously in real-time, facilitating the observation of astrophysical phenomena using multimessenger astronomy, i.e., contemporaneous observations with gravitational waves, light, neutrinos and cosmic rays.
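The data rate quoted above translates into a modest daily volume, and the real-time requirement reduces to a simple inequality. The short calculation below is only a sketch: it takes 2 MB to mean 2 × 10^6 bytes, and the helper name `keeps_up` is hypothetical.

```python
# Data rate quoted in the text: about 2 MB per second (taken here as 2e6 bytes).
DATA_RATE_BYTES_PER_S = 2_000_000
SECONDS_PER_DAY = 24 * 60 * 60

bytes_per_day = DATA_RATE_BYTES_PER_S * SECONDS_PER_DAY
gigabytes_per_day = bytes_per_day / 1e9  # decimal gigabytes


def keeps_up(segment_seconds, inference_seconds):
    """Real-time condition: processing a segment takes no longer than recording it."""
    return inference_seconds <= segment_seconds
```

Under these assumptions the stream amounts to 172.8 GB per day. Any pipeline for which `keeps_up(segment, inference_time)` holds for every segment processes the stream without falling behind; the actual inference time has to be measured on the hardware in question.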

V Maturity and Uniqueness

Deep learning is uniquely poised to overcome what is known as the curse of dimensionality [23, 24], since it is known to be highly scalable. This intrinsic ability of DNNs to take advantage of large datasets is a unique feature that enables classification and regression analyses over a higher-dimensional parameter space that is beyond the reach of existing algorithms.

Furthermore, DNNs are excellent at generalizing or extrapolating to new data. In the context of gravitational wave astronomy, our preliminary results indicate that DNNs trained only with signals from a 2-dimensional parameter space were able to detect and reconstruct the parameters of signals that span up to a 4-dimensional signal manifold, and which currently may go unnoticed with established detection algorithms [25, 26, 27, 28]. With existing computational resources on supercomputers, such as Blue Waters, we estimate that it is feasible to train DNNs that target a 9D parameter space within a few weeks.

In addition, DNN algorithms require minimal pre-processing. CNNs are capable of automatically learning to perform band-pass filtering on raw time-series inputs [29], and they are excellent at suppressing highly non-stationary colored noise [30], especially when incorporating real-time noise characteristics [31]. This suggests that manually devised pre-processing and whitening steps may be eliminated and raw data can be fed to DNNs. This would be particularly advantageous since it is known that Fourier transforms are the bottlenecks of matched-filtering based algorithms.
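To see why a convolutional layer can take over the role of a band-pass filter, note that each 1D convolution kernel is a finite impulse response filter applied directly in the time domain, with no Fourier transform involved. The sketch below uses a hand-written kernel (a 4-sample average minus a 64-sample average) rather than learned weights; the kernel, the test signals, and the function names are assumptions chosen so that the behaviour is easy to check.

```python
import math


def conv1d(signal, kernel):
    """Valid-mode 1D cross-correlation, the core operation of a convolutional layer."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def rms(values):
    """Root-mean-square amplitude of a sequence."""
    return math.sqrt(sum(v * v for v in values) / len(values))


n = 512
offset = [1.0] * n                                                # constant level
in_band = [math.sin(2.0 * math.pi * 8 * i / n) for i in range(n)]    # 64-sample period
too_fast = [math.sin(2.0 * math.pi * 128 * i / n) for i in range(n)]  # 4-sample period

# Short average minus long average: the kernel sums to zero, so constant
# levels are rejected, and the 4-sample average cancels the 4-sample period.
kernel = [(0.25 if j < 4 else 0.0) - 1.0 / 64 for j in range(64)]

rms_offset = rms(conv1d(offset, kernel))
rms_in_band = rms(conv1d(in_band, kernel))
rms_too_fast = rms(conv1d(too_fast, kernel))
```

The constant level and the fastest oscillation are suppressed to numerical zero, while the intermediate frequency passes almost unchanged. A CNN trained on raw time series is free to learn kernels of this kind in its first layers, which is the behaviour reported in [29].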

VI Novelty

The deep learning algorithms we pioneered in [1, 2] constitute the first demonstration that deep convolutional neural networks can be applied for real-time classification and regression of weak signals embedded in non-stationary and non-Gaussian noise. It is also the first time that DNNs were shown to exhibit features similar to Gaussian Process Regression [32, 33, 34], and to generalize to signals beyond the templates used for training. Furthermore, our DNNs can be evaluated faster than real-time with a single CPU, and very intensive searches over a broader range of signals can be easily carried out with one dedicated GPU. These results have sparked a keen interest in the gravitational wave community, and have led to a plethora of independent studies within the gravitational wave physics and computer science communities.

Acknowledgement

This research is part of the Blue Waters sustained-petascale computing project, which is supported by the National Science Foundation (awards OCI-0725070 and ACI-1238993) and the State of Illinois. Blue Waters is a joint effort of the University of Illinois at Urbana-Champaign and its National Center for Supercomputing Applications.

References

  • [1] D. George and E. A. Huerta, “Deep neural networks to enable real-time multimessenger astrophysics,” Phys. Rev. D, vol. 97, p. 044039, Feb. 2018.
  • [2] D. George and E. A. Huerta, “Deep Learning for real-time gravitational wave detection and parameter estimation: Results with Advanced LIGO data,” Physics Letters B, vol. 778, pp. 64–70, Mar. 2018.
  • [3] D. George, H. Shen, and E. A. Huerta, “Deep Transfer Learning: A new deep learning glitch classification method for advanced LIGO,” ArXiv e-prints, June 2017.
  • [4] D. George, H. Shen, and E. A. Huerta, “Glitch Classification and Clustering for LIGO with Deep Transfer Learning,” ArXiv e-prints, Nov. 2017.
  • [5] B. J. Owen and B. S. Sathyaprakash, “Matched filtering of gravitational waves from inspiraling compact binaries: Computational cost and template placement,” Phys. Rev. D, vol. 60, p. 022002, July 1999.
  • [6] N. Indik, H. Fehrmann, F. Harke, B. Krishnan, and A. B. Nielsen, “Reducing the number of templates for aligned-spin compact binary coalescence gravitational wave searches,” ArXiv e-prints, Dec. 2017.
  • [7] I. Harry, S. Privitera, A. Bohé, and A. Buonanno, “Searching for gravitational waves from compact binaries with precessing spins,” Phys. Rev. D, vol. 94, p. 024012, July 2016.
  • [8] R. Smith, S. E. Field, K. Blackburn, C.-J. Haster, M. Pürrer, V. Raymond, and P. Schmidt, “Fast and accurate inference on gravitational waves from precessing compact binaries,” Phys. Rev. D, vol. 94, p. 044031, Aug. 2016.
  • [9] E. A. Huerta, R. Haas, E. Fajardo, D. S. Katz, S. Anderson, P. Couvares, J. Willis, T. Bouvet, J. Enos, W. T. C. Kramer, H. W. Leong, and D. Wheeler, “BOSS-LDG: A Novel Computational Framework that Brings Together Blue Waters, Open Science Grid, Shifter and the LIGO Data Grid to Accelerate Gravitational Wave Discovery,” ArXiv e-prints, Sept. 2017.
  • [10] P. Graff, F. Feroz, M. P. Hobson, and A. Lasenby, “BAMBI: blind accelerated multimodal Bayesian inference,” MNRAS, vol. 421, pp. 169–180, Mar. 2012.
  • [11] N. Mukund, S. Abraham, S. Kandhasamy, S. Mitra, and N. S. Philip, “Transient classification in LIGO data using difference boosting neural network,” Phys. Rev. D, vol. 95, p. 104059, May 2017.
  • [12] J. Powell et al., “Classification methods for noise transients in advanced gravitational-wave detectors II: performance tests on Advanced LIGO data,” Classical and Quantum Gravity, vol. 34, p. 034002, Feb. 2017.
  • [13] J. Powell, D. Trifirò, E. Cuoco, I. S. Heng, and M. Cavaglià, “Classification methods for noise transients in advanced gravitational-wave detectors,” Classical and Quantum Gravity, vol. 32, p. 215012, Nov. 2015.
  • [14] M. Zevin, S. Coughlin, S. Bahaadini, E. Besler, N. Rohani, S. Allen, M. Cabero, K. Crowston, A. Katsaggelos, S. Larson, T. K. Lee, C. Lintott, T. Littenberg, A. Lundgren, C. Oesterlund, J. Smith, L. Trouille, and V. Kalogera, “Gravity Spy: Integrating Advanced LIGO Detector Characterization, Machine Learning, and Citizen Science,” ArXiv e-prints, Nov. 2016.
  • [15] S. Bahaadini, N. Rohani, S. Coughlin, M. Zevin, V. Kalogera, and A. K. Katsaggelos, “Deep Multi-view Models for Glitch Classification,” ArXiv e-prints, Apr. 2017.
  • [16] H. Shen, D. George, E. A. Huerta, and Z. Zhao, “Denoising Gravitational Waves using Deep Learning with Recurrent Denoising Autoencoders,” ArXiv e-prints, Nov. 2017.
  • [17] M. Zevin, S. Coughlin, S. Bahaadini, E. Besler, N. Rohani, S. Allen, M. Cabero, K. Crowston, A. Katsaggelos, S. Larson, T. K. Lee, C. Lintott, T. Littenberg, A. Lundgren, C. Oesterlund, J. Smith, L. Trouille, and V. Kalogera, “Gravity Spy: Integrating Advanced LIGO Detector Characterization, Machine Learning, and Citizen Science,” ArXiv e-prints, Nov. 2016.
  • [18] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, “Going deeper with convolutions,” in The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2015.
  • [19] K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” CoRR, vol. abs/1409.1556, 2014.
  • [20] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” CoRR, vol. abs/1512.03385, 2015.
  • [21] J. Lee, Y. Bahri, R. Novak, S. S. Schoenholz, J. Pennington, and J. Sohl-Dickstein, “Deep Neural Networks as Gaussian Processes,” ArXiv e-prints, Oct. 2017.
  • [22] N. Sedaghat and A. Mahabal, “Effective Image Differencing with ConvNets for Real-time Transient Hunting,” ArXiv e-prints, Oct. 2017.
  • [23] I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning. MIT Press, 2016.
  • [24] Y. Bengio and Y. LeCun, “Scaling learning algorithms towards AI,” in Large Scale Kernel Machines (L. Bottou, O. Chapelle, D. DeCoste, and J. Weston, eds.), MIT Press, 2007.
  • [25] E. A. Huerta, P. Kumar, B. Agarwal, D. George, H.-Y. Schive, H. P. Pfeiffer, R. Haas, W. Ren, T. Chu, M. Boyle, D. A. Hemberger, L. E. Kidder, M. A. Scheel, and B. Szilagyi, “A complete waveform model for compact binaries on eccentric orbits,” Phys. Rev. D, vol. 95, p. 024038, Jan. 2017.
  • [26] V. Tiwari, S. Klimenko, N. Christensen, E. A. Huerta, S. R. P. Mohapatra, A. Gopakumar, M. Haney, P. Ajith, S. T. McWilliams, G. Vedovato, M. Drago, F. Salemi, G. A. Prodi, C. Lazzaro, S. Tiwari, G. Mitselmakher, and F. Da Silva, “Proposed search for the detection of gravitational waves from eccentric binary black holes,” Phys. Rev. D, vol. 93, p. 043007, Feb. 2016.
  • [27] E. A. Huerta, P. Kumar, S. T. McWilliams, R. O’Shaughnessy, and N. Yunes, “Accurate and efficient waveforms for compact binaries on eccentric orbits,” Phys. Rev. D, vol. 90, p. 084016, Oct. 2014.
  • [28] E. A. Huerta and D. A. Brown, “Effect of eccentricity on binary neutron star searches in advanced LIGO,” Phys. Rev. D, vol. 87, p. 127501, June 2013.
  • [29] W. Dai, C. Dai, S. Qu, J. Li, and S. Das, “Very deep convolutional neural networks for raw waveforms,” CoRR, vol. abs/1610.00087, 2016.
  • [30] Y. Xu, J. Du, L. R. Dai, and C. H. Lee, “A regression approach to speech enhancement based on deep neural networks,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 23, pp. 7–19, Jan. 2015.
  • [31] A. Kumar and D. Florêncio, “Speech enhancement in multiple-noise conditions using deep neural networks,” CoRR, vol. abs/1605.02427, 2016.
  • [32] D. J. C. Mackay, Information Theory, Inference and Learning Algorithms. Oct. 2003.
  • [33] C. J. Moore, C. P. L. Berry, A. J. K. Chua, and J. R. Gair, “Improving gravitational-wave parameter estimation using Gaussian process regression,” Phys. Rev. D, vol. 93, p. 064001, Mar. 2016.
  • [34] C. J. Moore and J. R. Gair, “Novel Method for Incorporating Model Uncertainties into Gravitational Wave Parameter Estimates,” Physical Review Letters, vol. 113, p. 251101, Dec. 2014.