Lu Lu

verfied profile

1 follower

  • DeepXDE: A deep learning library for solving differential equations

    Deep learning has achieved remarkable success in diverse applications; however, its use in solving partial differential equations (PDEs) has emerged only recently. Here, we present an overview of physics-informed neural networks (PINNs), which embed a PDE into the loss of the neural network using automatic differentiation. The PINN algorithm is simple, and it can be applied to different types of PDEs, including integro-differential equations, fractional PDEs, and stochastic PDEs. Moreover, PINNs solve inverse problems as easily as forward problems. We propose a new residual-based adaptive refinement (RAR) method to improve the training efficiency of PINNs. For pedagogical reasons, we compare the PINN algorithm to a standard finite element method. We also present a Python library for PINNs, DeepXDE, which is designed to serve both as an education tool to be used in the classroom as well as a research tool for solving problems in computational science and engineering. DeepXDE supports complex-geometry domains based on the technique of constructive solid geometry, and enables the user code to be compact, resembling closely the mathematical formulation. We introduce the usage of DeepXDE and its customizability, and we also demonstrate the capability of PINNs and the user-friendliness of DeepXDE for five different examples. More broadly, DeepXDE contributes to the more rapid development of the emerging Scientific Machine Learning field.

    07/10/2019 ∙ by Lu Lu, et al. ∙ 33 share

    read it

  • Quantifying the generalization error in deep learning in terms of data distribution and neural network smoothness

    The accuracy of deep learning, i.e., deep neural networks, can be characterized by dividing the total error into three main types: approximation error, optimization error, and generalization error. Whereas there are some satisfactory answers to the problems of approximation and optimization, much less is known about the theory of generalization. Most existing theoretical works for generalization fail to explain the performance of neural networks in practice. To derive a meaningful bound, we study the generalization error of neural networks for classification problems in terms of data distribution and neural network smoothness. We introduce the cover complexity (CC) to measure the difficulty of learning a data set and the inverse of modules of continuity to quantify neural network smoothness. A quantitative bound for expected accuracy/error is derived by considering both the CC and neural network smoothness. We validate our theoretical results by several data sets of images. The numerical results verify that the expected error of trained networks scaled with the square root of the number of classes has a linear relationship with respect to the CC. In addition, we observe a clear consistency between test loss and neural network smoothness during the training process.

    05/27/2019 ∙ by Pengzhan Jin, et al. ∙ 31 share

    read it

  • Gated Multiple Feedback Network for Image Super-Resolution

    The rapid development of deep learning (DL) has driven single image super-resolution (SR) into a new era. However, in most existing DL based image SR networks, the information flows are solely feedforward, and the high-level features cannot be fully explored. In this paper, we propose the gated multiple feedback network (GMFN) for accurate image SR, in which the representation of low-level features are efficiently enriched by rerouting multiple high-level features. We cascade multiple residual dense blocks (RDBs) and recurrently unfolds them across time. The multiple feedback connections between two adjacent time steps in the proposed GMFN exploits multiple high-level features captured under large receptive fields to refine the low-level features lacking enough contextual information. The elaborately designed gated feedback module (GFM) efficiently selects and further enhances useful information from multiple rerouted high-level features, and then refine the low-level features with the enhanced high-level information. Extensive experiments demonstrate the superiority of our proposed GMFN against state-of-the-art SR methods in terms of both quantitative metrics and visual quality. Code is available at

    07/09/2019 ∙ by Qilei Li, et al. ∙ 5 share

    read it

  • Machine Learning for Vehicular Networks

    The emerging vehicular networks are expected to make everyday vehicular operation safer, greener, and more efficient, and pave the path to autonomous driving in the advent of the fifth generation (5G) cellular system. Machine learning, as a major branch of artificial intelligence, has been recently applied to wireless networks to provide a data-driven approach to solve traditionally challenging problems. In this article, we review recent advances in applying machine learning in vehicular networks and attempt to bring more attention to this emerging area. After a brief overview of the major concept of machine learning, we present some application examples of machine learning in solving problems arising in vehicular networks. We finally discuss and highlight several open issues that warrant further research.

    12/19/2017 ∙ by Hao Ye, et al. ∙ 0 share

    read it

  • Noncoherent Detection for Physical Layer Network Coding

    This paper investigates noncoherent detection in a two-way relay channel operated with physical layer network coding (PNC), assuming FSK modulation and short-packet transmissions. For noncoherent detection, the detector has access to the magnitude but not the phase of the received signal. For conventional communication in which a receiver receives the signal from a transmitter only, the phase does not affect the magnitude, hence the performance of the noncoherent detector is independent of the phase. PNC, however, is a multiuser system in which a receiver receives signals from multiple transmitters simultaneously. The relative phase of the signals from different transmitters affects the received signal magnitude through constructive-destructive interference. In particular, for good performance, the noncoherent detector in PNC must take into account the influence of the relative phase on the signal magnitude. Building on this observation, this paper delves into the fundamentals of PNC noncoherent detector design. To avoid excessive overhead, we do away from preambles. We show how the relative phase can be deduced directly from the magnitudes of the received data symbols. Numerical results show that our detector performs nearly as well as a "fictitious" optimal detector that has perfect knowledge of the channel gains and relative phase.

    03/13/2018 ∙ by Zhaorui Wang, et al. ∙ 0 share

    read it

  • Collapse of Deep and Narrow Neural Nets

    Recent theoretical work has demonstrated that deep neural networks have superior performance over shallow networks, but their training is more difficult, e.g., they suffer from the vanishing gradient problem. This problem can be typically resolved by the rectified linear unit (ReLU) activation. However, here we show that even for such activation, deep and narrow neural networks will converge to erroneous mean or median states of the target function depending on the loss with high probability. We demonstrate this collapse of deep and narrow neural networks both numerically and theoretically, and provide estimates of the probability of collapse. We also construct a diagram of a safe region of designing neural networks that avoid the collapse to erroneous states. Finally, we examine different ways of initialization and normalization that may avoid the collapse problem.

    08/15/2018 ∙ by Lu Lu, et al. ∙ 0 share

    read it

  • Quantifying total uncertainty in physics-informed neural networks for solving forward and inverse stochastic problems

    Physics-informed neural networks (PINNs) have recently emerged as an alternative way of solving partial differential equations (PDEs) without the need of building elaborate grids, instead, using a straightforward implementation. In particular, in addition to the deep neural network (DNN) for the solution, a second DNN is considered that represents the residual of the PDE. The residual is then combined with the mismatch in the given data of the solution in order to formulate the loss function. This framework is effective but is lacking uncertainty quantification of the solution due to the inherent randomness in the data or due to the approximation limitations of the DNN architecture. Here, we propose a new method with the objective of endowing the DNN with uncertainty quantification for both sources of uncertainty, i.e., the parametric uncertainty and the approximation uncertainty. We first account for the parametric uncertainty when the parameter in the differential equation is represented as a stochastic process. Multiple DNNs are designed to learn the modal functions of the arbitrary polynomial chaos (aPC) expansion of its solution by using stochastic data from sparse sensors. We can then make predictions from new sensor measurements very efficiently with the trained DNNs. Moreover, we employ dropout to correct the over-fitting and also to quantify the uncertainty of DNNs in approximating the modal functions. We then design an active learning strategy based on the dropout uncertainty to place new sensors in the domain to improve the predictions of DNNs. Several numerical tests are conducted for both the forward and the inverse problems to quantify the effectiveness of PINNs combined with uncertainty quantification. This NN-aPC new paradigm of physics-informed deep learning with uncertainty quantification can be readily applied to other types of stochastic PDEs in multi-dimensions.

    09/21/2018 ∙ by Dongkun Zhang, et al. ∙ 0 share

    read it

  • Bayesian Sequential Design Based on Dual Objectives for Accelerated Life Tests

    Traditional accelerated life test plans are typically based on optimizing the C-optimality for minimizing the variance of an interested quantile of the lifetime distribution. The traditional methods rely on some specified planning values for the model parameters, which are usually unknown prior to the actual tests. The ambiguity of the specified parameters can lead to suboptimal designs for optimizing the intended reliability performance. In this paper, we propose a sequential design strategy for life test plans based on considering dual objectives. In the early stage of the sequential experiment, we suggest to allocate more design locations based on optimizing the D-optimality to quickly gain precision in the estimated model parameters. In the later stage of the experiment, we can allocate more samples based on optimizing the C-optimality to maximize the precision of the estimated quantile of the lifetime distribution. We compare the proposed sequential design strategy with existing test plans considering only a single criterion and illustrate the new method with an example on fatigue testing of polymer composites.

    11/30/2018 ∙ by Lu Lu, et al. ∙ 0 share

    read it

  • Dying ReLU and Initialization: Theory and Numerical Examples

    The dying ReLU refers to the problem when ReLU neurons become inactive and only output 0 for any input. There are many empirical and heuristic explanations on why ReLU neurons die. However, little is known about its theoretical analysis. In this paper, we rigorously prove that a deep ReLU network will eventually die in probability as the depth goes to infinite. Several methods have been proposed to alleviate the dying ReLU. Perhaps, one of the simplest treatments is to modify the initialization procedure. One common way of initializing weights and biases uses symmetric probability distributions, which suffers from the dying ReLU. We thus propose a new initialization procedure, namely, a randomized asymmetric initialization. We prove that the new initialization can effectively prevent the dying ReLU. All parameters required for the new initialization are theoretically designed. Numerical examples are provided to demonstrate the effectiveness of the new initialization procedure.

    03/15/2019 ∙ by Lu Lu, et al. ∙ 0 share

    read it

  • How to Host a Data Competition: Statistical Advice for Design and Analysis of a Data Competition

    Data competitions rely on real-time leaderboards to rank competitor entries and stimulate algorithm improvement. While such competitions have become quite popular and prevalent, particularly in supervised learning formats, their implementations by the host are highly variable. Without careful planning, a supervised learning competition is vulnerable to overfitting, where the winning solutions are so closely tuned to the particular set of provided data that they cannot generalize to the underlying problem of interest to the host. This paper outlines some important considerations for strategically designing relevant and informative data sets to maximize the learning outcome from hosting a competition based on our experience. It also describes a post-competition analysis that enables robust and efficient assessment of the strengths and weaknesses of solutions from different competitors, as well as greater understanding of the regions of the input space that are well-solved. The post-competition analysis, which complements the leaderboard, uses exploratory data analysis and generalized linear models (GLMs). The GLMs not only expand the range of results we can explore, they also provide more detailed analysis of individual sub-questions including similarities and differences between algorithms across different types of scenarios, universally easy or hard regions of the input space, and different learning objectives. When coupled with a strategically planned data generation approach, the methods provide richer and more informative summaries to enhance the interpretation of results beyond just the rankings on the leaderboard. The methods are illustrated with a recently completed competition to evaluate algorithms capable of detecting, identifying, and locating radioactive materials in an urban environment.

    01/16/2019 ∙ by Christine M. Anderson-Cook, et al. ∙ 0 share

    read it