Physics-Guided Machine Learning for Scientific Discovery: An Application in Simulating Lake Temperature Profiles

Physics-based models of dynamical systems are often used to study engineering and environmental systems. Despite their extensive use, these models have several well-known limitations due to simplified representations of the physical processes being modeled or challenges in selecting appropriate parameters. While state-of-the-art machine learning models can sometimes outperform physics-based models given an ample amount of training data, they can produce results that are physically inconsistent. This paper proposes a physics-guided recurrent neural network model (PGRNN) that combines RNNs and physics-based models to leverage their complementary strengths and improve the modeling of physical processes. Specifically, we show that a PGRNN can improve prediction accuracy over that of physics-based models, while generating outputs consistent with physical laws. An important aspect of our PGRNN approach lies in its ability to incorporate the knowledge encoded in physics-based models. This allows training the PGRNN model with very few true observations while also ensuring high prediction accuracy. Although we present and evaluate this methodology in the context of modeling the dynamics of temperature in lakes, it is applicable more widely to a range of scientific and engineering disciplines where physics-based (also known as mechanistic) models are used, e.g., climate science, materials science, computational chemistry, and biomedicine.




1. Introduction

Physics-based models have been widely used to study engineering and environmental systems in domains such as hydrology, climate science, materials science, agriculture, and computational chemistry. Despite their extensive use, these models have several well-known limitations due to simplified representations of the physical processes being modeled or challenges in selecting appropriate parameters. There is a tremendous opportunity to systematically advance modeling in these domains by using machine learning (ML) methods. However, capturing this opportunity is contingent on a paradigm shift in data-intensive scientific discovery, since the "black box" use of ML often leads to serious false discoveries in scientific applications (Lazer and others, 2014; Karpatne et al., 2017a). In this paper, we present a novel methodology for combining physics-based models with state-of-the-art deep learning methods to leverage their complementary strengths.

Even though physics-based models are based on known physical laws that govern relationships between input and output variables, the majority of physics-based models are necessarily approximations of reality due to incomplete knowledge of certain processes, which introduces bias. In addition, they often contain a large number of parameters whose values must be estimated with the help of limited observed data. A standard approach for calibrating these parameters is to exhaustively search the space of parameter combinations and choose the combination that results in the best performance on training data. Besides its computational cost, this approach is also prone to over-fitting due to heterogeneity in the underlying processes in both space and time. The limitations of physics-based models cut across discipline boundaries and are well known in the scientific community; e.g., see a series of debate papers in hydrology (Lall, 2014; Gupta and others, 2014; McDonnell and Beven, 2014).

ML models, given their tremendous success in several commercial applications (e.g., computer vision and natural language processing), are increasingly being considered as promising alternatives to physics-based models by the scientific community. State-of-the-art (SOA) ML models (e.g., Recurrent Neural Networks (RNN) and Long Short-Term Memory (LSTM) networks), given enough data, can often perform better than traditional empirical models (e.g., regression-based models) used by science communities as an alternative to physics-based models (Graham-Rowe et al., 2008; Goh et al., 2017). However, direct application of black-box ML models to a scientific problem encounters three major challenges: (1) They require a lot of training data, which is scarce in most practical settings. (2) Empirical models (including the SOA ML models) simply identify statistical relations between inputs and the system variables of interest (e.g., the temperature profile of the lake) without taking into account any physical laws (e.g., conservation of energy or mass), and thus can produce results that are inconsistent with physical laws. (3) Relationships produced by empirical models can at best be valid only for the set of variable combinations present in the training data and are unable to generalize to scenarios unseen in the training data. For example, an ML model trained for today's climate may not be accurate for future warmer climate scenarios.

The goal of this work is to improve the modeling of engineering and environmental systems. Effective representation of physical processes in such systems will require development of novel abstractions and architectures. In addition, the optimization process to produce an ML model will have to consider not just accuracy (i.e., how well the output matches the observations) but also its ability to provide physically consistent results. In particular, we present Physics-Guided Recurrent Neural Network models (PGRNN) as a general framework for modeling physical phenomena with potential applications for many disciplines. The PGRNN model has a number of novel aspects:

1. Many temporal processes in environmental/engineering systems involve complex long-term temporal dependencies that cannot be captured by a plain neural network or a simple temporal model such as a standard RNN. In contrast, in PGRNN we use advanced ML models such as LSTM, which have the potential to mimic complex temporal processes in physical systems.

2. The proposed PGRNN can incorporate explicit physical laws such as energy conservation or mass conservation. This is done by introducing additional variables in the recurrent structure to keep track of physical states that can be used to check for consistency with physical laws. In addition, we generalize the loss function to include a physics-based penalty (Karpatne et al., 2017a). Thus, the overall training loss is

Loss = Loss_supervised + λ · Loss_physics,

where the first term on the right-hand side represents the supervised training loss between the predicted outputs and the observed outputs (e.g., RMSE in regression or cross-entropy in classification), and the second term represents the physical-consistency-based penalty. In addition to favoring physically consistent solutions, another major side benefit of including the physics-based penalty in the loss function is that it can be applied even to instances for which output (observed) data is not available, since the physics-based penalty can be computed as long as input (driver) data is available. Note that in the absence of the physics-based penalty, the training loss can be computed only on those time steps where observed output is available. Inclusion of the physics-based loss term allows much more robust training, especially in situations where observed output is available at only a small number of time steps.

3. Physics-based/mechanistic models contain a lot of domain knowledge that goes well beyond what can be captured as constraints such as conservation laws. To leverage this knowledge, we generate a large amount of "synthetic" observation data by executing physics-based models for a variety of input drivers (that are easily available) and use the synthetic observations to pre-train the ML model. The idea here is that training from synthetic data generated by imperfect physical models may allow the ML model to get close enough to the target solution, so that only a small amount of observed data (ground-truth labels) is needed to further refine the model. In addition, the synthetic data is guaranteed to be physically consistent because the process model is founded on physical principles.
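The combined objective described in item 2 above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name, mask layout, and default weight are assumptions, and the physics penalty is taken as a precomputed scalar.

```python
import numpy as np

def combined_loss(y_pred, y_obs, obs_mask, physics_penalty, lam=0.01):
    """Supervised loss on observed entries plus a weighted physics penalty.

    y_pred, y_obs: arrays of shape (T,); obs_mask marks time steps that have
    observations. physics_penalty is computed from input drivers alone, so it
    is defined even at time steps where y_obs is missing.
    """
    sup = np.mean((y_pred[obs_mask] - y_obs[obs_mask]) ** 2)  # MSE on labeled steps
    return sup + lam * physics_penalty
```

Because the second term does not touch `y_obs`, every time step with driver data contributes to training, which is the source of the robustness noted above.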

Our proposed Physics-Guided Recurrent Neural Network model (PGRNN) is developed for the purpose of predicting lake water temperatures at various depths at the daily scale. The temperature of water in a lake is known to be an ecological "master factor" (Magnuson and others, 1979) that controls the growth, survival, and reproduction of fish (Roberts and others, 2013). Warming water temperatures can increase the occurrence of aquatic invasive species (Rahel and Olden, 2008; Roberts and others, 2017), which may displace fish and native aquatic organisms and result in more harmful algal blooms (HABs) (Harris and Graham, 2017; Paerl and Huisman, 2008). Understanding temperature change and the resulting biotic "winners and losers" is timely science that can also be directly applied to inform priority action for natural resources. Given the importance of this problem, the aquatic science community has developed numerous models for the simulation of temperature, including the General Lake Model (GLM) (Hipsey et al., 2019), which simulates the physical processes involved (e.g., vertical mixing, and the warming or cooling of water via energy lost or gained from fluxes such as solar radiation and evaporation). As is typical for any such model, GLM is only an approximation of the physical reality, and has a number of parameters (e.g., water clarity, mixing efficiency, and wind sheltering) that often need to be calibrated using observations.

We evaluate the proposed PGRNN method on a real-world system, Lake Mendota (Wisconsin), which is one of the most extensively studied lake systems in the world. We chose this lake because it has plenty of observed data that can be used to evaluate the performance of any new approach. In particular, we can measure the performance of different algorithms by varying the amount of observations used for training. This helps test the effectiveness of the proposed methods in data-scarce scenarios, which is important since most real-world lakes have very few observations or are not observed at all (they usually have less than 1% of the observations available for Mendota). In addition, Lake Mendota is large and deep enough that it shows a variety of temperature patterns (e.g., stratified temperature patterns in warmer seasons and well-mixed patterns in colder seasons). This allows us to test the capacity of ML models in capturing such complex temperature patterns.

Our main contributions are as follows. We show that it is possible to effectively model the temporal dynamics of temperature in lakes using LSTMs provided that enough observed data is available for training. We show that traditional LSTMs can be augmented to take energy conservation into account and track the balance of energy loss and gain relative to temperature change (a physical law of thermodynamics). Including such components in models to make the output consistent with physical laws can make them more acceptable for use by scientists and also may improve the prediction performance. We also studied the benefit of pre-training this model using synthetic data (i.e., the output of an uncalibrated physics-based model) and then refining it using only a small amount of observation data. The results show that such pre-trained models can easily outperform the state-of-the art physics-based model by using a small amount of observed data. Moreover, we show that such pre-training is useful even if it uses simulated data from lakes that are very different in geometry, clarity or climate than the lake being studied. These results confirm that the PGRNN can leverage the strengths of physics-based models while also filling in knowledge gaps by overlaying features learned from data.

The organization of the paper is as follows: In Section 2, we describe the preliminary knowledge and the setting of our problem. Section 3 presents the proposed PGRNN model in detail. In Section 4, we extensively evaluate the proposed method on a real-world dataset. We then review related work in Section 5 before concluding in Section 6. A preliminary version of this work appeared in (Jia et al., 2019).

2. Preliminaries

2.1. Problem Formulation

Our goal is to simulate the temperature of water y_{d,t} in the lake at each depth d and on each date t, given physical variables governing the dynamics of lake temperature. This problem is referred to as 1D modeling of temperature (depth being the single dimension). Specifically, x_{d,t} represents the input physical variables at depth d on a specific date t, which include meteorological recordings at the surface of the water, such as the amount of solar radiation (in W/m², for short-wave and long-wave), wind speed (in m/s), air temperature (in °C), relative humidity (0-100%), rain (in cm), and a snow indicator (True or False), as well as the value of depth (in m) and day of year (1-366). These chosen features are known to be the primary drivers of lake thermodynamics (Hipsey et al., 2019). Given these input drivers and a depth level d, we aim to predict water temperature y_{d,t} at this depth over the entire study period. For simplicity, we use x and y to represent x_{d,t} and y_{d,t} in the paper when it causes no ambiguity. During the training process, we are given sparse ground-truth observed temperature profiles on certain dates and at certain depths, captured by in-water sensors (more dataset description is provided in Section 4.1).

Figure 1. A pictorial description of the physical processes simulated by the General Lake Model (Hipsey and others, 2014). These processes govern the dynamics of temperature in a lake.

2.2. General Lake Model (GLM)

The physics-based GLM captures a variety of physical processes governing the dynamics of water temperature in a lake, including the heating of the water surface due to incoming short-wave radiation, the attenuation of radiation beneath the water surface, the mixing of layers with varying thermal energy at different depths, and the loss of heat from the surface of the lake via evaporation or outgoing long-wave radiation (shown in Fig. 1). We use GLM as our preferred physics-based model for lake temperature modeling due to its model performance and wide use among the lake modeling community.

The GLM has a number of parameters (e.g., parameters related to vertical mixing, wind sheltering, and water clarity) that are often calibrated specifically to individual lakes if training data are available. The basic calibration method (common to a wide range of scientific and engineering problems) is to run the model for many combinations of parameter values and select the parameter set that minimizes model error. This calibration process can be both labor- and computationally-intensive. Furthermore, the calibration process, applied even in the presence of ample training data, is still limited by simplifications and rigid formulations in these physics-based models.
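The exhaustive calibration loop described above can be sketched as follows. This is a generic grid-search skeleton, not GLM's actual calibration tooling; `run_model` stands in for a full simulation run, and the parameter names are placeholders.

```python
import itertools
import numpy as np

def calibrate(run_model, param_grid, y_obs):
    """Exhaustive grid search: run the model for every combination of
    parameter values and keep the set with the lowest RMSE against the
    observations. run_model(**params) must return predictions aligned
    with y_obs."""
    best_params, best_rmse = None, np.inf
    keys = sorted(param_grid)
    for values in itertools.product(*(param_grid[k] for k in keys)):
        params = dict(zip(keys, values))
        y_sim = run_model(**params)
        rmse = float(np.sqrt(np.mean((y_sim - y_obs) ** 2)))
        if rmse < best_rmse:
            best_params, best_rmse = params, rmse
    return best_params, best_rmse
```

The computational cost is the number of grid points times the cost of one simulation, which is why this approach becomes expensive for models with many parameters.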

2.3. Sequential Machine Learning Model

Sequential machine learning models for dynamical systems aim to learn a black-box transformation from the input series {x_t} to the target variables {y_t}. In this work, we train a single sequential model that applies to different depths. This allows utilizing observation data from different depths in the training process. Later, in Section 4, we will show that the model trained using the data from all the depths can still capture the temporal dynamics at each depth separately very well.

We also use area-depth profile as additional information to compute energy constraints (see Section 3.2). Since we train machine learning models that are specific to a target lake, the area-depth profile remains the same on different days and thus we do not include it in the input features.

3. Method

In this section, we will discuss the proposed PGRNN model in detail. First, we describe how to train an LSTM to model temperature dynamics using sparse observed data. Second, we describe how to combine the energy conservation law and the standard recurrent neural networks model. Then, we further utilize a pre-training method to improve the learning performance even with limited training data.

3.1. Recurrent Neural Networks and Long-Short Term Memory Networks

Recent advances in deep learning models enable automatic extraction of representative patterns from multivariate input temporal data to better predict the target variable. As one of the most popular temporal deep learning models, RNN models have shown success in a broad range of applications. The power of the RNN model lies in its ability to combine the input data at the current and previous time steps to extract an informative hidden representation h_t. In an RNN, the hidden representation is generated using the following equation:

h_t = tanh(W_h h_{t-1} + W_x x_t),

where W_h and W_x represent the weight matrices that connect h_{t-1} and x_t, respectively. Here the bias terms are omitted as they can be absorbed into the weight matrices.
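A single RNN transition can be written out directly; this sketch assumes a plain tanh cell with an explicit bias term.

```python
import numpy as np

def rnn_step(h_prev, x_t, W_h, W_x, b):
    """One vanilla-RNN transition: h_t = tanh(W_h h_{t-1} + W_x x_t + b).

    h_prev: hidden state of size H; x_t: input of size D;
    W_h: (H, H); W_x: (H, D); b: (H,)."""
    return np.tanh(W_h @ h_prev + W_x @ x_t + b)
```

Unrolling this step over a sequence gives the full recurrent computation; the tanh keeps every hidden unit bounded in (-1, 1).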

While RNN models can model transitions across time, they gradually lose the connections to long histories as time progresses (Bengio et al., 1994). Therefore, the RNN-based method may fail to grasp long-term patterns that are common in scientific applications. For example, the seasonal and yearly patterns that commonly exist in environmental systems can last for many time steps if we use data at a daily scale. The standard RNN fails to memorize long-term temporal patterns because it does not explicitly generate a long-term memory to store previous information but only captures the transition patterns between consecutive time steps. It is well known (Chen and Billings, 1992; Pan and Duraisamy, 2018) that such memory issues are a major difficulty in the study of dynamical systems.

As an extended version of the RNN, the LSTM is better at modeling long-term dependencies where each time step needs more contextual information from the past. The difference between the LSTM and the RNN lies in the generation of the hidden representation h_t. In essence, the LSTM model defines a transition relationship for the hidden representation through an LSTM cell. Each LSTM cell contains a cell state c_t, which serves as a memory and forces the hidden variables to preserve information from the past.

Specifically, the LSTM first generates a candidate cell state c̄_t by combining x_t and h_{t-1}, as:

c̄_t = tanh(W_c h_{t-1} + U_c x_t).

The LSTM then generates a forget gate f_t, an input gate i_t, and an output gate o_t via the sigmoid function σ(·), as:

f_t = σ(W_f h_{t-1} + U_f x_t),
i_t = σ(W_i h_{t-1} + U_i x_t),
o_t = σ(W_o h_{t-1} + U_o x_t).

The forget gate is used to filter the information inherited from c_{t-1}, and the input gate is used to filter the candidate cell state at time t. Then we compute the new cell state c_t and the hidden representation h_t as:

c_t = f_t ⊗ c_{t-1} + i_t ⊗ c̄_t,
h_t = o_t ⊗ tanh(c_t),

where ⊗ denotes the entry-wise product.
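The gate equations above correspond to the following single-step sketch of a standard LSTM cell. The dict-based weight layout is an illustrative convention, not the paper's code.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(h_prev, c_prev, x_t, W, U, b):
    """One LSTM transition. W, U, b are dicts keyed by 'f', 'i', 'o', 'c'
    holding hidden-to-gate weights, input-to-gate weights, and biases."""
    f = sigmoid(W['f'] @ h_prev + U['f'] @ x_t + b['f'])      # forget gate
    i = sigmoid(W['i'] @ h_prev + U['i'] @ x_t + b['i'])      # input gate
    o = sigmoid(W['o'] @ h_prev + U['o'] @ x_t + b['o'])      # output gate
    c_bar = np.tanh(W['c'] @ h_prev + U['c'] @ x_t + b['c'])  # candidate state
    c = f * c_prev + i * c_bar   # new cell state: filtered memory + filtered candidate
    h = o * np.tanh(c)           # new hidden representation
    return h, c
```

Because `c_prev` enters the new state through an element-wise product rather than a squashing nonlinearity, information can persist over many time steps, which is what gives the LSTM its long-term memory.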

As we wish to conduct regression for continuous values, we generate the predicted temperature ŷ_t at each time step via a linear combination of the hidden units, as:

ŷ_t = W_y h_t + b_y.
We also apply the LSTM model for each depth separately to generate predictions ŷ_{d,t} for every depth d and for every date t. Then, given the true observations y_{d,t} for the set T_obs of dates and depths where the sparse observed data is available, our training loss is defined as:

L_RNN = (1/|T_obs|) Σ_{(d,t)∈T_obs} (y_{d,t} − ŷ_{d,t})².
It is noteworthy that even if the training loss is only defined on the time steps where the observed data is available, the transition modeling (Eqs. 2-5) can be applied to all the time steps. Hence, the time steps without observed data can still contribute to learning temporal patterns by using their input drivers.

Figure 2. The flow of the PGRNN model. The model includes the standard RNN flow (black arrows) and the energy flow (blue arrows) in the recurrent process.

3.2. Energy conservation over time

The law of energy conservation states that the change of thermal energy of a lake system over time is equivalent to the net gain of heat energy fluxes, which is the difference between incoming energy fluxes and any energy losses from the lake (see Fig. 3). The explicit modeling of energy conservation is critical for capturing temperature dynamics since a mismatch in losses and gains results in a temperature change. Specifically, more incoming heat fluxes than outgoing heat fluxes will warm the lake, and more outgoing heat fluxes than incoming heat fluxes will cool the lake.

The total thermal energy U_t of the lake at time t can be computed as follows:

U_t = Σ_d ρ_{d,t} · c_w · A_d · Δz · y_{d,t},

where y_{d,t} is the temperature at depth d at time t, c_w the specific heat of water (4,186 J/(kg·°C)), A_d the cross-sectional area of the water column (m²) at depth d, ρ_{d,t} the water density (kg/m³) at depth d at time t, and Δz the thickness (m) of the layer at depth d. In this work, we simulate water temperature for every 0.5 m and thus we set Δz = 0.5. The computation of U_t requires the output of temperature through a feed-forward process for all the depths, as well as the cross-sectional area A_d, which is available as input.
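The layer-wise energy sum above translates directly into code; this is a minimal sketch with per-depth arrays at a single time step, using the constants stated in the text.

```python
import numpy as np

def lake_thermal_energy(temps, areas, densities, cw=4186.0, dz=0.5):
    """Total thermal energy (J) of the lake at one time step:
    U_t = sum over depths of rho * c_w * A * dz * T.

    temps: per-depth temperatures (degrees C); areas: cross-sectional
    areas (m^2); densities: water densities (kg/m^3); dz: layer
    thickness (m), 0.5 m in this work."""
    return float(np.sum(densities * cw * areas * dz * temps))
```

Evaluating this on the model's predicted temperature profile at consecutive time steps gives the energy change ΔU_t used in the conservation check.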

The balance between incoming heat fluxes (F_in) and outgoing heat fluxes (F_out) results in a change in the thermal energy (ΔU_t) of the lake. The consistency between lake energy and energy fluxes can be expressed as:

ΔU_t ≈ F_in,t − F_out,t,

where ΔU_t = U_{t+1} − U_t. More details about computing heat fluxes are described in the appendix. All the involved energy components are in W/m².

In Fig. 2, we show the flow of the proposed PGRNN model, which integrates the energy conservation flow into the recurrent process. While the recurrent flow in the standard RNN can capture data dependencies across time, the modeling of the energy flow ensures that the change of the lake environment and the predicted temperature conform to the law of energy conservation. Traditional LSTM models utilize the LSTM cell to implicitly encode useful information at each time step and pass it to the next time step. In contrast, the energy flow in PGRNN explicitly captures the key factor that leads to temperature change in dynamical systems: the heat energy fluxes that are transferred from one time step to the next. Further, even though the input drivers and temperature profiles follow different distributions in different years and different seasons, they all conform to the universal law of energy conservation. Therefore, by complying with this universal law, PGRNN has a better chance of learning patterns that generalize to unseen scenarios (Read et al., 2019).

Figure 3. The heat energy fluxes and the lake thermal energy that are modeled in PGRNN.

We define a loss function term for energy conservation and combine it with the training objective of the standard LSTM model in the following equation:

L = L_RNN + λ_EC · (1/T_fl) Σ_t ReLU(|ΔU_t − (F_in,t − F_out,t)| − τ_EC),

where T_fl represents the length of the ice-free period and the sum runs over ice-free time steps. Here we consider energy conservation only for ice-free periods, since the lake exhibits drastically different reflectance and energy-loss dynamics when covered in ice and snow, and the modeling of ice and snow was considered out of scope for this study. We provide more details about how to compute the energy fluxes F_in and F_out from input data in the appendix. The value τ_EC is a threshold for the energy conservation loss. This threshold is introduced because physical processes can be affected by unknown, less important factors which are not included in the model, or by observation errors in the meteorological data. The ReLU function is adopted so that only differences larger than the threshold count towards the penalty. In our implementation, the threshold is set as the largest value of the daily energy-balance residual in the GLM model. The hyper-parameter λ_EC controls the balance between the loss of the standard RNN and the energy conservation loss.
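The thresholded penalty described above can be sketched as a hinge on the per-step energy-balance residual; the function name and array layout are illustrative.

```python
import numpy as np

def energy_conservation_loss(delta_U, flux_in, flux_out, tau):
    """Hinge-style energy-conservation penalty over ice-free time steps.

    Only violations of the balance delta_U ~= flux_in - flux_out that
    exceed the tolerance tau contribute; all terms are in W/m^2."""
    residual = np.abs(delta_U - (flux_in - flux_out))
    return float(np.mean(np.maximum(residual - tau, 0.0)))
```

The ReLU-like `maximum(..., 0)` implements the tolerance: a model that balances energy to within tau at every step pays no penalty at all.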

Note that the modeling of energy flow using the procedure described above does not require any input of true labels/observations. According to Eqs. 11-13, the heat fluxes and lake energy are computed using only input drivers and predicted temperature. In light of these observations, we can apply this model for semi-supervised training for lake systems which have only a few labeled data points.

3.3. Pre-training using physical simulations

In real-world environmental systems, observed data is limited. For example, amongst the lakes being studied by USGS, less than 1% of lakes have 100 or more days of temperature observations and less than 5% of lakes have 10 or more days of temperature observations (Read and others, 2017). Given their complexity, the RNN-based models trained with limited observed data can lead to poor performance.

To address this issue, we propose to pre-train the PGRNN model using the simulated data produced by a generic (uncalibrated) GLM that uses default values for parameters. In particular, given the input drivers, we run the generic GLM to predict temperature at every depth and on every day. These simulated temperature data from the generic GLM are imperfect, but they provide a synthetic realization of the physical responses of a lake to a given set of meteorological drivers. Hence, pre-training a neural network using simulations from the generic GLM allows the network to emulate a synthetic but physically realistic phenomenon. This process results in a more accurate and physically consistent initialization for the learning model. When applying the pre-trained model to a real system, we fine-tune the model using true observations. Here our hypothesis is that the pre-trained model is much closer to the optimal solution and thus requires less observed data to train a good-quality model. In our experiments, we show that such pre-trained models can achieve high accuracy given only a few observed data points.
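The two-stage procedure above can be sketched as follows. This is a schematic, not the paper's training script: `model` is assumed to expose a Keras-style `fit`, and all names are illustrative.

```python
def pretrain_then_finetune(model, drivers, glm_simulations, observations,
                           pretrain_epochs=50, finetune_epochs=20):
    """Two-stage training: first fit the network to abundant (but imperfect)
    simulated labels from the uncalibrated GLM, then refine it on the few
    true observations."""
    # Stage 1: pre-train on synthetic labels generated by the generic GLM.
    model.fit(drivers, glm_simulations, epochs=pretrain_epochs)
    # Stage 2: fine-tune on sparse real observations.
    model.fit(drivers, observations, epochs=finetune_epochs)
    return model
```

The design choice is the same as transfer learning: the synthetic stage moves the weights near a physically plausible solution, so the second stage needs far fewer labeled points.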

4. Experiment

In this section, we conduct extensive evaluations of the proposed method. We first show that the RNN model with the LSTM cell can capture the dynamics of lake systems. Then we build the RNN model incorporating energy conservation and demonstrate its effectiveness in maintaining physical consistency while also reducing prediction error. Moreover, we show that the pre-training method can leverage complex knowledge hidden in a physics-based model. In particular, pre-training the RNN model, even using the simulated data of a lake that is very different from the target lake (in terms of geometry, clarity, and climate conditions), is able to reduce the number of observations needed to train a good-quality model.

Figure 4. The distribution of observed data from April 02, 1980 to December 30, 2014.

4.1. Dataset

Our dataset was collected from Lake Mendota in Wisconsin, USA. This lake system is reasonably large (40 km² in area) and exhibits large changes in water temperatures in response to seasonal and sub-seasonal weather patterns. Observations of lake temperature were collected from the North Temperate Lakes Long-Term Ecological Research Program. These temperature observations vary in their distribution across depths and time. There are certain days when observations are available at multiple depths, while only a few or no observations are available on some other days.

The input drivers that describe prevailing meteorological conditions are available on a continuous daily basis from April 02, 1980 to December 30, 2014. Specifically, we used a set of seven drivers as input variables, which include short-wave and long-wave radiation, air temperature, relative humidity, wind speed, frozen and snowing indicators. In contrast, observed data for training and testing the models is not uniform, as measurements were made at varying temporal and spatial (depth) resolutions. In total, 13,158 observations were used for the study period, as shown in Fig. 4.

We use the observed data from April 02, 1980 to October 31, 1991 and from June 01, 2003 to December 30, 2014 as training data (in total 8,037 observations). We then apply the trained model to predict the temperature at different depths for the period from November 01, 1991 to May 31, 2003 (in total 5,121 observations).

4.2. Model setup

We implement the proposed method using TensorFlow on a Tesla P100 GPU. The recurrent modeling structure uses 21 hidden units. The threshold value τ_EC is set as 24, which is equivalent to the largest value of the daily energy-balance residual in the GLM model. The hyper-parameter λ_EC is set to 0.01. The value of λ_EC is selected to balance the supervised training loss and the conservation of energy. A smaller value of λ_EC results in a lower training loss at the expense of conservation of energy, and vice versa. Note that when λ_EC > 0 (and thus energy conservation is part of the loss function), the model has a better chance of learning general patterns that reduce the test error (compared with the test error using λ_EC = 0). Also note that the energy conservation term is not fully accurate, since certain minor physical processes are not captured by the energy conservation loss. Hence, a much larger value of λ_EC can also result in sub-optimal performance by forcing the model to conform to approximate physical relationships. The model is trained with a learning rate of 0.005.

4.3. Performance: prediction accuracy and energy consistency

First, we aim to evaluate how energy conservation helps improve the prediction accuracy and maintain energy consistency. In our experiments, we use RNN to denote the RNN model with the LSTM cell, and RNN_EC to denote the LSTM-RNN network after incorporating energy conservation over the entire study period. We assess the performance of each model based on prediction accuracy (see Section 4.3.1) and physical consistency (see Section 4.3.2). Some sensitivity tests regarding hyper-parameters can be found in our previous work (Jia et al., 2019).

4.3.1. Prediction accuracy

Here we compare RNN, RNN_EC, and GLM in terms of their prediction RMSE. (We do not include the basic neural network or the standard RNN model without the LSTM cell, since the basic neural network produces an RMSE of 1.88 and the standard RNN produces an RMSE of 1.60 using 100% of the observed data, which is far higher than the models under discussion.) To test whether each model can perform well using reduced observed data, we randomly select different proportions of data from the training period. For example, to select 20% of the training data, we remove every observation in our training period with probability 0.8. The test data stays the same regardless of the training-data selection. We repeat each test 10 times and report the mean RMSE and standard deviation.
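The random subsampling used to build the reduced training sets can be sketched as a Bernoulli mask over the observed entries; the function name and mask representation are illustrative.

```python
import numpy as np

def subsample_mask(obs_mask, keep_frac, rng):
    """Randomly keep each observation with probability keep_frac.

    For example, keep_frac=0.2 removes every observation with probability
    0.8. The test set is untouched because only the training mask is
    filtered."""
    keep = rng.random(obs_mask.shape) < keep_frac
    return obs_mask & keep
```

Repeating this with different random seeds and retraining gives the mean RMSE and standard deviation reported in Table 1.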

From Table 1, we make several observations: 1) RNN_EC consistently outperforms RNN. The gap is especially obvious when using smaller subsets of observed data (e.g., 0.2% or 2%). However, given plenty of observed data, the RNN model can achieve performance similar to the RNN_EC model. 2) Both RNN and RNN_EC get close to their best performance using over 20% of the observed data. 3) RNN_EC using 20% of the observed data outperforms the fully calibrated GLM (using 100% of the observed data).

Method  0%         0.2%          2%            20%           100%
GLM     2.950(NA)  2.616(0.499)  2.422(0.423)  2.318(0.368)  1.836(NA)
RNN     -          4.615(0.173)  2.311(0.240)  1.531(0.083)  1.489(0.091)
RNN_EC  -          4.107(0.181)  2.149(0.163)  1.489(0.115)  1.471(0.077)
Table 1. RMSE of RNN, RNN_EC, and GLM with access to different amounts of observed data.

4.3.2. Energy consistency

To visualize how RNN_EC contributes to a physically consistent solution, we verify whether the gap between incoming and outgoing heat-energy fluxes matches the change in lake energy over time. Specifically, we train RNN and RNN_EC using observed data from the first ten years. We then plot the gap between incoming and outgoing heat fluxes and the change of lake energy over time for a selected period (Fig. 5). These two curves should be well aligned (in the ice-free period) if the learned model follows the law of energy conservation. We also summarize the average gap between these two curves on the test data (in the ice-free period) as the energy inconsistency. In Fig. 6, we show the RMSE and the energy inconsistency of RNN, RNN_EC, and the calibrated GLM over the entire test period. Here each model is trained using 100% of the observed data (the last column in Table 1).
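The energy-inconsistency metric just described (the average gap between the two curves over ice-free test steps) can be computed along these lines; this is a sketch with hypothetical array inputs, not the authors' code.

```python
import numpy as np

def energy_inconsistency(flux_sum, energy_change, ice_free_mask):
    """Mean absolute gap between the net heat flux and the lake-energy change,
    evaluated only over ice-free time steps (where the balance should hold)."""
    gap = np.abs(flux_sum - energy_change)
    return float(np.mean(gap[ice_free_mask]))
```

A perfectly energy-conserving model would score 0; larger values indicate a larger violation of the (approximate) energy balance.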

(a) RNN

(b) RNN_EC

(c) The calibrated GLM

Figure 5. The sum of heat fluxes and the lake energy change generated by (a) RNN, (b) RNN_EC, and (c) the calibrated GLM, from April 02, 1980 to October 22, 1980.

From Fig. 5, we observe that RNN_EC produces a better match between the energy fluxes and the lake-energy change, while RNN leads to a large difference between the two curves. This confirms that adding the energy-conservation term to the loss function used to train RNN_EC results in a model that preserves energy conservation on the test data. Note that the match between the blue and yellow curves for RNN_EC is not as good as that for the calibrated GLM. RNN_EC could obtain a closer match between these two curves simply by using a larger energy-conservation weight during training. However, the energy-conservation formula used in Eqs. 9 and 10 (in the Appendix) captures only a subset of physical processes and ignores certain minor processes that are challenging to model precisely (Read et al., 2019); thus, strict compliance with the simplified energy-conservation term in the loss function of RNN_EC can reduce prediction accuracy on unseen data. Finally, from Figure 6 (and also from Table 1), we see that RNN_EC has an even lower RMSE than RNN (which focuses only on reducing RMSE during training). This shows that a more physically realistic model can also be more generalizable.

Figure 6. The performance of RNN, RNN_EC, and the calibrated GLM in terms of RMSE and energy inconsistency.

4.4. Leveraging the knowledge hidden in the physics-based model via pre-training

Here we show the power of pre-training to improve the prediction accuracy of the model even with small amounts of training data. A basic premise of pre-training our models is that GLM simulations, though imperfect, provide a synthetic realization of the physical responses of a lake to a given set of meteorological drivers. Hence, pre-training a neural network on GLM simulations allows the network to emulate such a synthetic realization of physical phenomena. Our hypothesis is that a pre-trained model requires fewer labeled samples to achieve good generalization performance, even if the GLM simulations do not match the observations. To test this hypothesis, we conduct an experiment in which we generate GLM simulations with input drivers from Lake Mendota. These simulations were created using a GLM with generic parameter values that are not calibrated for Lake Mendota, resulting in large errors in the modeled temperature profiles with respect to real observations on Lake Mendota (RMSE=2.950). Nevertheless, these simulated data are physically consistent, and by using them for pre-training we can demonstrate the power of our ML models to work with limited observed data while leveraging the complex physical knowledge inherent in the physics-based model.

We fine-tune the pre-trained models with different amounts of observed data and report the performance in Table 2. We use the notation RNN_{EC,p} to refer to the RNN model with energy conservation that is first pre-trained using simulation data during 1981-2013 and then fine-tuned using observed data from the training period. The comparison between RNN_EC and RNN_{EC,p} shows that pre-training can significantly improve the performance. The improvement is especially large given a small amount of observed data. For example, even with 0.2% of the observed data (16 observations), RNN_{EC,p} achieves an RMSE of 2.056, which is much smaller than that obtained by RNN or RNN_EC using ten times as much observed data. Moreover, we find that training RNN and RNN_EC commonly takes 150-200 epochs to converge, while training the pre-trained models takes only 30-50 epochs. These improvements demonstrate that pre-training can indeed provide a better initialized state for learning a good-quality model.
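The pretrain-then-fine-tune schedule can be illustrated with a toy stand-in for the RNN: a linear model trained by gradient descent, first on abundant biased "simulation" labels and then on a handful of "observations". All names and data here are synthetic; this only sketches the training schedule, not the paper's model.

```python
import numpy as np

def train(w, X, y, lr=0.01, epochs=200):
    """Plain least-squares gradient descent; stands in for RNN training."""
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w = w - lr * grad
    return w

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
true_w = np.array([1.0, -2.0, 0.5])
y_obs = X @ true_w                     # "true observations"
y_sim = X @ (true_w + 0.3)             # biased physics-model "simulations"

w0 = np.zeros(3)
w_pre = train(w0, X, y_sim)            # pre-train on abundant simulated data
few = rng.choice(200, size=8, replace=False)  # only 8 observed samples
w_ft = train(w_pre, X[few], y_obs[few], epochs=100)  # fine-tune

rmse = lambda w: float(np.sqrt(np.mean((X @ w - y_obs) ** 2)))
# Fine-tuning with a few observations improves on the pre-trained weights:
# rmse(w_ft) < rmse(w_pre) < rmse(w0).
```

In the same spirit as the paper's results, training from the pre-trained state typically converges in far fewer steps than training from scratch, because the biased simulations already place the weights near the right region.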

Now we wish to better understand how fine-tuning improves performance using only limited observations. In Fig. 7, we show the predictions at 10 m depth by the generic GLM (GLM-gnr), the pre-trained RNN_{EC,p} without fine-tuning, and the pre-trained RNN_{EC,p} fine-tuned with 2% of the data. We include the distribution of the randomly selected 2% training data in the appendix. We make the following observations: 1) The generic GLM exhibits a large bias with respect to true observations. 2) RNN_{EC,p} without fine-tuning produces predictions similar to the generic GLM, since it is pre-trained to emulate the generic GLM; note that it roughly captures the temperature dynamics even without using any observed data. 3) After fine-tuning with just 2% of the observed data, RNN_{EC,p} largely closes the gap between the pre-trained model and the true observations.

Method 0% 0.2% 2% 20% 100%
GLM 2.950 2.616(0.499) 2.422(0.423) 2.318(0.368) 1.836(NA)
RNN - 4.615(0.173) 2.311(0.240) 1.531(0.083) 1.489(0.091)
RNN_EC - 4.107(0.181) 2.149(0.163) 1.489(0.115) 1.471(0.077)
RNN_{EC,p} 2.455(0.169) 2.056(0.180) 1.590(0.162) 1.402(0.106) 1.380(0.078)
Table 2. Performance of the pre-trained model (RNN_{EC,p}) after fine-tuning with access to different amounts of observed data. The performance of GLM, RNN, and RNN_EC from Table 1 is included for easier comparison.
Figure 7. The predictions by the generic GLM (GLM-gnr), the pre-trained RNN_{EC,p} model without fine-tuning, and the pre-trained RNN_{EC,p} model fine-tuned using 2% of the training observations, at 10 m depth from November 01, 1991 to February 13, 1995.

4.5. The RMSE profile across depths and seasons

Here we further analyze the prediction results to understand the limitations of physics-based GLM models and how our proposed method can overcome them. Specifically, we analyze the results from two perspectives: across depths and across seasons. Each perspective provides unique insight into the underlying differences between GLM and the proposed method in modeling lake temperature dynamics.

4.5.1. Error across depths:

In Fig. 8, we show the error of the RNN_{EC,p} model (pre-trained and fine-tuned with 100% of the data) and the GLM models (the generic GLM and the GLM calibrated using 100% of the data) across different depths.

It can be seen that at shallow depths (< 6 m), the RNN_{EC,p} model achieves performance similar to the generic GLM, but has larger errors than the calibrated GLM. This is because a single RNN model is trained to optimize performance across all depths. If we separately train an RNN only for shallow depths, its performance can be close to that of the calibrated GLM.

The generic GLM has much larger errors than RNN_{EC,p} at depths greater than 6 m, especially at intermediate depths (between 6 m and 16 m). The reason for such depth-dependent differences between GLM and RNN_{EC,p} is that GLM includes complex processes to model the dynamics of thermal stratification, i.e., the density-based separation of surface and bottom waters. Specifically, the GLM is designed to capture the location of this temperature transition and the strength of the gradient. However, predicting the dynamics of stratification from the underlying processes is very challenging for any model, including the GLM (Hipsey et al., 2019), and thus we observe an increase in the errors of the generic GLM at depths below 6 m.

The calibrated GLM has much smaller errors than the generic GLM at intermediate depths. This shows that the generic GLM simulates complex processes that cannot easily generalize to specific lake systems without calibration. After the GLM is calibrated using true observations, it can better locate the temperature transition in this specific lake and consequently reduce the errors at intermediate depths. Note that the calibrated GLM still has larger errors than RNN_{EC,p} at lower depths. This is potentially the result of challenges in the physics-based formulation of stratification dynamics. In contrast, the ML models approach the prediction problem without making any assumptions about the stratification processes, and are able to perform much better at intermediate and lower depths by learning patterns from the training data.
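The per-depth error profile underlying Fig. 8 can be computed as follows, including the filter that skips sparsely observed depths; `rmse_by_depth` is a hypothetical helper name, and the grouping by exact depth value is an assumption about how observations are binned.

```python
import numpy as np

def rmse_by_depth(depths, y_pred, y_true, min_obs=50):
    """Test RMSE grouped by depth, skipping depths with fewer than
    `min_obs` observations (as done for the depth-profile figure)."""
    out = {}
    for d in np.unique(depths):
        sel = depths == d
        if sel.sum() >= min_obs:  # only report well-observed depths
            out[float(d)] = float(np.sqrt(np.mean((y_pred[sel] - y_true[sel]) ** 2)))
    return out
```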

Figure 8. The testing RMSE at different depths. The errors are calculated only at depths where more than 50 observations are available. RNN_{EC,p} represents the RNN model that is pre-trained with simulated data and then fine-tuned with 100% of the observed data. GLM-gnr and GLM-calib represent the generic GLM and the fully calibrated GLM (using 100% of the observed data), respectively.

4.5.2. Error across seasons:

We show the overall error in each season in Fig. 9. In spring, RNN_{EC,p} and the calibrated GLM have similar errors; in summer and fall, RNN_{EC,p} outperforms the calibrated GLM by a considerable margin; and in winter, the calibrated GLM offers an improvement over RNN_{EC,p}. This implies a bias in GLM's modeling of certain physical processes that are active during the warmer seasons.

To better understand the difference between our proposed method and GLM across seasons, we separately plot the error-depth relation for each season (see Fig. 10). The error-depth profiles in summer and fall are similar to that in Fig. 8. The gap between RNN_{EC,p} and the calibrated GLM is especially pronounced in summer and fall because these two seasons are dominated by stronger stratification and/or rapid changes in stratification as the lake cools. The influence of stratification on model performance in spring and winter is weaker than in summer and fall. Hence, the difficulty of modeling stratification, in addition to the increased range of temperatures, is likely responsible for GLM's worse performance relative to RNN_{EC,p} in the warmer seasons.
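The seasonal breakdown can be sketched the same way as the depth breakdown; `rmse_by_season` is a hypothetical helper, and the month-to-season mapping below is an assumed meteorological split, since the paper does not state its exact season boundaries.

```python
import numpy as np

# Assumed meteorological seasons (Dec-Feb winter, etc.).
SEASONS = {12: "winter", 1: "winter", 2: "winter",
           3: "spring", 4: "spring", 5: "spring",
           6: "summer", 7: "summer", 8: "summer",
           9: "fall", 10: "fall", 11: "fall"}

def rmse_by_season(months, y_pred, y_true):
    """Overall test RMSE per season, given the calendar month of each sample."""
    labels = np.array([SEASONS[m] for m in months])
    out = {}
    for s in ("spring", "summer", "fall", "winter"):
        sel = labels == s
        if sel.any():
            out[s] = float(np.sqrt(np.mean((y_pred[sel] - y_true[sel]) ** 2)))
    return out
```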

Figure 9. The overall error of RNN_{EC,p} and the calibrated GLM in different seasons.
Figure 10. The error-depth relationship in (a) spring, (b) summer, (c) fall, and (d) winter.

4.6. Can a pre-trained ML model perform better than its teacher?

As observed from Table 2, the performance of the pre-trained RNN-based models with no fine-tuning is better than the accuracy of the outputs from the generic GLM (RMSE=2.950) on which RNN_{EC,p} is pre-trained. GLM tracks temperature at various depth layers that grow and shrink, split, or combine based on prevailing conditions (this is referred to as a Lagrangian layer model, since the vertical layers are not fixed in time). As adjacent layers split or combine, prediction artifacts that are not representative of the real-world lake system are introduced, often resulting in additional variability at lower depths. The resulting temperature variability can be overly sensitive for Lake Mendota and can increase GLM error. In contrast, the pre-trained RNN, as an imperfect emulator of GLM, does not fully capture such complexity, and instead predicts smoother and often more accurate temperature dynamics compared with the simulated data. To verify that GLM can introduce unnecessary variability or temperature-change artifacts at lower depths that are comparatively muted in the pre-trained model, Fig. 11 shows the error profile of GLM and the pre-trained model at different depths when no observations are used for refinement, i.e., RNN_{EC,p} with 0% fine-tuning data. We observe that the pre-trained model and GLM achieve similar performance near the surface, but the pre-trained model has much lower RMSE than GLM at lower depths.

Figure 11. The error of GLM and the pre-trained model (RNN_{EC,p}) at different depths. No data is used for refinement.

To better illustrate this, we pre-train RNN_{EC,p} using data from two different depth layers: the surface (0 m) and 9 m. We then measure the error of each model with respect to the GLM-simulated data and the true observations at the depth where the model is trained (Table 3). The error with respect to the GLM outputs is much higher at 9 m than at the surface, which shows that the ML models cannot fully mimic the complexity of GLM at lower depths. However, since these complex processes are not necessarily good representations of Lake Mendota temperature dynamics, the ML model achieves a lower RMSE with respect to true observations than GLM at 9 m (4.752 versus 5.333) by learning a simpler temporal process that is closer to reality.

Surface 9 m
Method Simulation error Observation error Simulation error Observation error
GLM-gnr - 1.875 - 5.333
RNN_{EC,p} 0.854 1.932 1.498 4.752
Table 3. The error of GLM and the pre-trained models with respect to GLM simulations and observation data at different depths.

4.7. Ability to pre-train using lakes that are very different from the target lake

In practice, the GLM may not have access to the true values of parameters (e.g., lake geometry, water clarity, and climate conditions), and therefore can only generate simulations based on default, inaccurate assumptions about the parameters that influence lake temperature dynamics. Here we show the power of pre-training using simulated data from a physics-based model built on different lake geometries, water clarities, and climate conditions. Our assumption is that the simulations produced by physics-based models still represent physical responses that strictly follow known physical laws. Hence, the pre-trained model should be able to capture these physical relationships and reach a physically consistent initialized state. In our experiments, we show that pre-training with even a wrong set of lake parameters, or with weather drivers very different from those of the target lake, can still significantly reduce the number of observations required to train a good-quality model.

Specifically, we pre-train RNN_{EC,p} using data simulated by GLM under specific conditions (geometry, clarity, and climate). We then verify whether these pre-trained models still achieve superior performance after being fine-tuned with a small amount of observations.

Lake geometry:

We generate GLM simulations for three synthetic lakes with different geometric structures: cone, barrel, and martini. The cone shape is closest to the true geometry of Lake Mendota (see Fig. 12), while both the barrel and martini shapes are very different from the true geometry. We first pre-train using the GLM outputs for each geometric structure, and then fine-tune using true observations. The performance is shown in Table 4.

When adapted to Lake Mendota, the model learned from the cone shape works well even with no observed data. In contrast, the models learned from the barrel and martini shapes have much larger errors when directly applied to Lake Mendota. However, these errors are significantly reduced after fine-tuning with only 2% of the data. This shows that a model learned from a different geometric structure can still capture temporal patterns that are physically consistent and applicable to the target system.

Method 0% 0.2% 2% 20% 100%
RNN_EC - 4.107(0.181) 2.149(0.163) 1.489(0.115) 1.471(0.077)
RNN_{EC,p} 2.455(0.169) 2.056(0.180) 1.590(0.162) 1.402(0.106) 1.380(0.078)
RNN_{EC,p}^{cone} 2.469(0.168) 2.056(0.184) 1.595(0.097) 1.452(0.113) 1.374(0.074)
RNN_{EC,p}^{barrel} 3.239(0.098) 2.060(0.144) 1.617(0.090) 1.401(0.098) 1.383(0.078)
RNN_{EC,p}^{martini} 5.340(0.110) 3.033(0.104) 2.216(0.141) 1.485(0.092) 1.459(0.059)
Table 4. Performance of the models pre-trained with different geometric structures (cone, barrel, and martini) after fine-tuning with different amounts of observed data from Lake Mendota.

When comparing the performance across pre-trained geometric structures, we notice that the model pre-trained with the martini shape has a much larger error (RMSE 5.340) than the other two, and the cone shape has the smallest error (see the first column of Table 4). This agrees with the fact that the cone shape is closest to the true geometry of Lake Mendota; consequently, the GLM simulations using the cone shape should be closest to reality and those using the martini shape furthest from the true observations. We verify this by measuring the RMSE of the GLM simulations with respect to true observations: {cone simulation=2.792, martini simulation=5.950, barrel simulation=3.864}. Even though the GLM simulations can have large errors under a wrong geometric structure, the pre-trained models obtain lower errors than their teachers (first column of Table 4: {cone 2.469, martini 5.340, barrel 3.239}), showing that the machine learning models are less sensitive to the change of geometric structure. Moreover, even though the models pre-trained with the wrong geometric structure have relatively large errors after pre-training, they quickly recover to reasonable performance when fine-tuned with a small amount of true observations (e.g., 2% of the data).

Figure 12. The geometry of Lake Mendota.

Lake clarity:

Similarly, we generate GLM simulations for three synthetic lakes with different levels of clarity: normal (Kw=0.45), dark (Kw=1.20), and clear (Kw=0.25). Here we fix the lake geometry to the cone shape. The clarity level affects the penetration of radiation into deeper water. We wish to verify how a model learned from a different clarity level can be fine-tuned to fit Lake Mendota. The performance is shown in Table 5.

Even though Lake Mendota has a clarity level close to the normal setting (Kw=0.45), the models pre-trained under both the "dark" and "clear" clarity levels can be well adapted to Lake Mendota after fine-tuning. We also note that the fine-tuned models from different clarity levels perform similarly given as little as 0.2% of the observations. This suggests that the clarity level has less impact than lake geometry on learning an accurate predictive model for lake systems.

Method 0% 0.2% 2% 20% 100%
RNN_EC - 4.107(0.181) 2.149(0.163) 1.489(0.115) 1.471(0.077)
RNN_{EC,p} 2.455(0.169) 2.056(0.180) 1.590(0.162) 1.402(0.106) 1.380(0.078)
RNN_{EC,p}^{normal} 2.469(0.168) 2.056(0.184) 1.595(0.097) 1.452(0.113) 1.374(0.074)
RNN_{EC,p}^{dark} 2.776(0.124) 2.067(0.155) 1.601(0.078) 1.393(0.091) 1.380(0.068)
RNN_{EC,p}^{clear} 2.518(0.135) 2.050(0.120) 1.648(0.128) 1.399(0.088) 1.371(0.076)
Table 5. Performance of the models pre-trained under different clarity levels (normal, dark, and clear) after fine-tuning with different amounts of observed data from Lake Mendota.

The water clarity mainly determines how rapidly sunlight is attenuated with depth. It therefore affects the gradient of the temperature transition and the warming rates of deeper waters. To further analyze this impact, we measure the error across depths for models pre-trained under different clarity levels, as shown in Fig. 13 (a). The model pre-trained under "dark" clarity has much higher error at depths of 6-12 m, where the temperature changes most rapidly. This confirms that a different clarity level can negatively impact water-temperature modeling across depths. However, when we fine-tune the models with a small amount of true observed data, e.g., 2% of the data, they quickly recover to reasonable performance, as shown in Fig. 13 (b): the model pre-trained under "dark" clarity achieves performance similar to the models pre-trained under the other clarity levels across all depths.

Figure 13. Error over different depths for pre-trained models under different clarity conditions and fine-tuned with (a) no observed data and (b) 2% observed data in Lake Mendota.

Climate conditions:

Next, we generate GLM simulations for a synthetic lake with input drivers from Florida (very different from the typically much colder conditions in Wisconsin) and pre-train the RNN using the resulting simulated data. We show the performance of the pre-trained model (RNN_{EC,p}^{FL}) in Table 6. Note that the model trained using these Florida input drivers and simulations performs very poorly when directly applied to Lake Mendota (RMSE=9.106). This is not surprising given the large temperature difference between Wisconsin (where Lake Mendota is located) and Florida. It is more interesting that, with just 2% of the observations, the learned model becomes much better after fine-tuning.

Method 0% 0.2% 2% 20% 100%
RNN_EC - 4.107(0.181) 2.149(0.163) 1.489(0.115) 1.471(0.077)
RNN_{EC,p} 2.455(0.169) 2.056(0.180) 1.590(0.162) 1.402(0.106) 1.380(0.078)
RNN_{EC,p}^{FL} 9.106(0.172) 2.601(0.177) 1.759(0.147) 1.470(0.091) 1.394(0.071)
Table 6. Performance of the model pre-trained using Florida drivers (RNN_{EC,p}^{FL}) after fine-tuning with different amounts of observed data from Lake Mendota. The performance of RNN_EC and RNN_{EC,p} is included for easier comparison.

5. Related Work

Various components proposed in this work, including generalizing the loss function to include physical constraints, addressing the imperfection of existing physical models, and training ML models using the outputs from physical models, have been studied in different contexts.

As discussed in (Karpatne et al., 2017a), the idea of including an additional term in the loss function to prefer solutions consistent with domain-specific knowledge is finding extensive use in many applications. In addition to favoring physically consistent solutions, this also allows training in the absence of labels, since the physics-based loss can be computed without them. Recent applications of this approach to combining physical knowledge with machine learning include computer vision (Sturmfels et al., 2018; Shrivastava et al., 2012), natural language processing (Kotzias et al., 2015), object tracking (Stewart and Ermon, 2017), pose estimation (Ren and others, 2018), and image restoration (Pan et al., 2018; Li et al., 2019). To the best of our knowledge, our work demonstrates for the first time that an ML framework can be adapted to incorporate an energy-conservation constraint, a universal law that applies to many dynamical systems.

In the context of directly addressing the imperfection of physics-based models, which is the focus of this paper, the most common approach is residual modeling, where an ML model is learned to predict the errors made by a physics-based model. This ML model can be learned with standard supervised learning techniques as long as some observations are available (which can be used to compute the errors made by the physics model). Once learned, the ML model is used to correct the output of the physics model. Most work on residual modeling, going back several decades, has used plain regression models (Forssell and Lindskog, 1997; Xu and Valocchi, 2015), although some recent works (Wan et al., 2018) have used LSTMs. A key limitation of such approaches is that they cannot enforce physics-based constraints, because they model the error made by a physics model rather than predicting a physical quantity. Recently, Karpatne et al. introduced a novel hybrid ML-physics model in which the output of a physics model is fed into an ML model along with the inputs that drive the physics model (Karpatne et al., 2017b). This hybrid model learns to use the output of the physics model as the final output for input drivers where the physics model performs well, and to make corrections where it makes mistakes. Since the output of this hybrid model is a physical quantity, physics-based constraints can now be enforced, allowing for label-free learning. However, such approaches cannot be used to initialize the ML model using only synthetic outputs from the physics model (which are essentially free to obtain), since they require observations to be available during training.
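To make the contrast concrete, a minimal residual-modeling example might look like the following, with an ordinary least-squares fit standing in for the ML component; the data and the physics "model" are entirely synthetic.

```python
import numpy as np

# Residual modeling: fit a model to the *error* of the physics model,
# then correct the physics output at prediction time.
rng = np.random.default_rng(2)
X = rng.normal(size=(300, 2))
y_true = 3.0 * X[:, 0] + np.sin(X[:, 1])
y_phys = 3.0 * X[:, 0]                 # synthetic physics model missing the sin term

# Linear least squares on the residual stands in for the ML component.
residual = y_true - y_phys
A = np.c_[X, np.ones(len(X))]          # features plus intercept
coef, *_ = np.linalg.lstsq(A, residual, rcond=None)

y_corrected = y_phys + A @ coef        # physics output plus learned correction
rmse = lambda y: float(np.sqrt(np.mean((y - y_true) ** 2)))
# In-sample, the least-squares correction cannot increase the error:
# rmse(y_corrected) <= rmse(y_phys).
```

Note that `y_corrected` is an error-corrected output, not a predicted physical quantity, which is exactly why physics-based constraints cannot be imposed on the residual model itself.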

Machine learning models are increasingly used to emulate physics-based models, since an ML model is typically much faster to execute once trained (Butler et al., 2018; Ojika et al., 2017; McGregor et al.). Since these ML models are trained on synthetic outputs generated by physics-based models, the availability of training data is not a limitation, which makes it possible to train even highly complex ML models. However, such emulators (if well trained) can, in general, be expected to do only as well as the physics models used to generate the training data. In particular, they cannot correct the errors made by physics-based models due to missing physics or incorrect parameterization. The PGRNN approach presented in this paper, in contrast, can be used to develop emulators that are physically consistent and thus likely to be more robust and generalizable to out-of-sample scenarios.

Another technique for fusing physical models with machine learning is to replace a part of the physical model that is costly or inaccurate with a data-driven solution (Yao et al., 2018; Tartakovsky et al., 2018). In (Hamilton et al., 2017), a subset of the mechanistic model's equations is replaced with data-driven nonparametric methods to improve prediction beyond the baseline process model. As another example, from the domain of fluid dynamics, (Raissi et al., 2018) uses neural networks to approximate latent quantities of interest such as velocity and pressure in the Navier-Stokes equations. This creates a much more generalizable fluid-dynamics framework that does not depend as heavily on careful specification of the geometry or of initial and boundary conditions. Such approaches are orthogonal to the ones discussed in our work, as the ML models used as surrogates can be made "physics-guided" using the framework described in this paper.

There also exists extensive literature on the data-driven discovery of governing equations or mathematical forms that underlie complex dynamical systems (P. Crutchfield and S. McNamara, 1987; Bongard and Lipson, 2007; J Majda and Harlim, 2012; Sugihara et al., 2012; Brunton et al., 2016; Raissi et al., 2017, 2018), and even on discovering underlying physical laws expressed as partial differential equations from data (Raissi, 2018). For example, Rudy et al. (Rudy et al., 2017) present a sparse regression method for identifying governing PDEs from a large library of candidate functions and spatio-temporal measurements of a dynamical system. Such approaches can be very valuable for analyzing and understanding complex systems for which analytical descriptions are not available (e.g., in epidemiology, finance, or neuroscience). In contrast, the focus of our work is on systems where the dominant governing equations and laws are already known, but the physics-based models contain inherent biases, as they are necessarily approximations of reality.

6. Conclusion

The PGRNN approach presented in this paper is unique in that it provides a powerful framework for modeling spatial and temporal physical processes while incorporating energy conservation. We also studied the ability to pre-train these models using simulated data to deal with the scarcity of observed data. Using simulated data from a poorly parameterized physics-based model, the PGRNN achieves higher prediction accuracy with far fewer observations used for refinement than a physics-based model calibrated on a large number of observations. Thus, the PGRNN can leverage the strengths of physics-based models while filling in knowledge gaps by employing state-of-the-art predictive frameworks that learn from data.

The PGRNN framework incorporates energy conservation by adding additional states whose values are computed from physical equations. This allows the use of a rich set of constraints beyond those that can be enforced by considering only the output of the model. In particular, it can be used to model other important physical laws in dynamical systems, such as the law of mass conservation. The PGRNN framework can also be viewed as a transfer learning method that transfers knowledge from physical processes to ML models. Future research needs to determine the types of dynamical-system models for which such an approach will be effective. It is entirely possible that new architectural enhancements will need to be made to the traditional LSTM framework to incorporate different types of physical laws and to model underlying physical processes that may interact at different spatial and temporal scales. The proposed framework can be applied to a variety of scientific problems, such as nutrient exchange in lake systems and analysis of crop field production, as well as engineering problems such as auto-vehicle refueling design. We therefore anticipate this work to be an important stepping-stone toward applying machine learning to problems traditionally solved by physics-based models.

This work was supported by NSF and USGS. We thank North Temperate Lakes Long-Term Ecological Research (NSF DEB-1440297) for temperature and lake metadata. Access to computing facilities was provided by Minnesota Supercomputing Institute.


  • Y. Bengio, P. Simard, and P. Frasconi (1994) Learning long-term dependencies with gradient descent is difficult. IEEE transactions on neural networks 5 (2), pp. 157–166. Cited by: §3.1.
  • J. Bongard and H. Lipson (2007) Automated reverse engineering of nonlinear dynamical systems. Proceedings of the National Academy of Sciences 104 (24), pp. 9943–9948. Note: bibtex*[publisher=National Academy of Sciences;eprint=] External Links: ISSN 0027-8424, Link, Document Cited by: §5.
  • L. C. Bruce et al. (2018) A multi-lake comparative analysis of the general lake model (glm): stress-testing across a global observatory network. Environmental Modelling & Software 102, pp. 274–291. Cited by: footnote 2.
  • S. L. Brunton, J. L. Proctor, and J. N. Kutz (2016) Discovering governing equations from data by sparse identification of nonlinear dynamical systems. Proceedings of the National Academy of Sciences 113 (15), pp. 3932–3937. Cited by: §5.
  • K. T. Butler, D. W. Davies, H. Cartwright, O. Isayev, and A. Walsh (2018) Machine learning for molecular and materials science. Nature 559 (7715), pp. 547. Cited by: §5.
  • S. CHEN and S. A. BILLINGS (1992) Neural networks for nonlinear dynamic system modelling and identification. International Journal of Control 56 (2), pp. 319–346. External Links: Document, Link Cited by: §3.1.
  • U. Forssell and P. Lindskog (1997) Combining semi-physical and neural network modeling: an example ofits usefulness. IFAC Proceedings Volumes. Cited by: §5.
  • G. B. Goh, N. O. Hodas, and A. Vishnu (2017) Deep learning for computational chemistry. Journal of computational chemistry 38 (16), pp. 1291–1307. Cited by: §1.
  • D. Graham-Rowe, D. Goldston, C. Doctorow, M. Waldrop, C. Lynch, F. Frankel, R. Reid, S. Nelson, D. Howe, S. Rhee, et al. (2008)

    Big data: science in the petabyte era

    Nature 455 (7209), pp. 8–9. Cited by: §1.
  • H. V. Gupta et al. (2014) Debates—the future of hydrological sciences: a (common) path forward? using models and data to learn: a systems theoretic perspective on the future of hydrological science. WRR. Cited by: §1.
  • F. Hamilton, A. L. Lloyd, and K. B. Flores (2017) Hybrid modeling and prediction of dynamical systems. PLoS computational biology 13 (7), pp. e1005655. Cited by: §5.
  • T. D. Harris and J. L. Graham (2017) Predicting cyanobacterial abundance, microcystin, and geosmin in a eutrophic drinking-water reservoir using a 14-year dataset. Lake and reservoir management. Cited by: §1.
  • B. Hicks (1972) Some evaluations of drag and bulk transfer coefficients over water bodies of different sizes. Boundary-Layer Meteorology 3 (2), pp. 201–213. Cited by: Appendix A.
  • M. R. Hipsey, L. C. Bruce, C. Boon, B. Busch, C. C. Carey, D. P. Hamilton, P. C. Hanson, J. S. Read, E. De Sousa, M. Weber, et al. (2019) A general lake model (GLM 3.0) for linking with high-frequency sensor data from the Global Lake Ecological Observatory Network (GLEON). Cited by: Appendix A, Appendix A, §1, §2.1, §4.5.1.
  • M. Hipsey et al. (2014) GLM-general lake model: model overview and user information. Cited by: Figure 1.
  • A. J. Majda and J. Harlim (2012) Physics constrained nonlinear regression models for time series. Nonlinearity 26, pp. 201. External Links: Document Cited by: §5.
  • X. Jia, J. Willard, A. Karpatne, J. Read, J. Zwart, M. Steinbach, and V. Kumar (2019) Physics guided rnns for modeling dynamical systems: a case study in simulating lake temperature profiles. In Proceedings of the 2019 SIAM International Conference on Data Mining, pp. 558–566. Cited by: §1, §4.3.
  • A. Karpatne, G. Atluri, J. H. Faghmous, M. Steinbach, A. Banerjee, A. Ganguly, S. Shekhar, N. Samatova, and V. Kumar (2017a) Theory-guided data science: a new paradigm for scientific discovery from data. IEEE Transactions on Knowledge and Data Engineering 29 (10), pp. 2318–2331. Cited by: §1, §1, §5.
  • A. Karpatne, W. Watkins, J. Read, and V. Kumar (2017b) Physics-guided neural networks (pgnn): an application in lake temperature modeling. arXiv preprint arXiv:1710.11431. Cited by: §5.
  • D. Kotzias, M. Denil, N. De Freitas, and P. Smyth (2015) From group to individual labels using deep features. In Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 597–606. Cited by: §5.
  • U. Lall (2014) Debates—the future of hydrological sciences: a (common) path forward? one water. one world. many climes. many souls. WRR. Cited by: §1.
  • D. Lazer et al. (2014) The parable of google flu: traps in big data analysis. Science. Cited by: §1.
  • R. Li, L. F. Cheong, and R. T. Tan (2019) Heavy rain image restoration: integrating physics model and conditional adversarial learning. arXiv preprint arXiv:1904.05050. Cited by: §5.
  • J. J. Magnuson et al. (1979) Temperature as an ecological resource. American Zoologist 19 (1), pp. 331–343. Cited by: §1.
  • J. J. McDonnell and K. Beven (2014) Debates—the future of hydrological sciences: a (common) path forward? a call to action aimed at understanding velocities, celerities and residence time distributions of the headwater hydrograph. WRR. Cited by: §1.
  • S. McGregor, D. Dhuri, A. Berea, and A. Muñoz-Jaramillo (2017) FlareNet: a deep learning framework for solar phenomena prediction. In Workshop on Deep Learning for Physical Sciences (DLPS 2017), NIPS 2017. Cited by: §5.
  • D. Ojika, D. Acosta, A. Gordon-Ross, A. Carnes, and S. Gleyzer (2017) Accelerating high-energy physics exploration with deep learning. In Proceedings of the Practice and Experience in Advanced Research Computing 2017 on Sustainability, Success and Impact, pp. 37. Cited by: §5.
  • J. P. Crutchfield and B. S. McNamara (1987) Equations of motion from a data series. Complex Systems 1. Cited by: §5.
  • H. W. Paerl and J. Huisman (2008) Blooms like it hot. Science 320 (5872), pp. 57–58. Cited by: §1.
  • J. Pan, Y. Liu, J. Dong, J. Zhang, J. Ren, J. Tang, Y. Tai, and M. Yang (2018) Physics-Based Generative Adversarial Models for Image Restoration and Beyond. arXiv e-prints, pp. arXiv:1808.00605. External Links: 1808.00605 Cited by: §5.
  • S. Pan and K. Duraisamy (2018) Long-time predictive modeling of nonlinear dynamical systems using neural networks. Complexity 2018, pp. 1–26. External Links: Document Cited by: §3.1.
  • F. J. Rahel and J. D. Olden (2008) Assessing the effects of climate change on aquatic invasive species. Conservation biology 22 (3), pp. 521–533. Cited by: §1.
  • M. Raissi, P. Perdikaris, and G. E. Karniadakis (2017) Inferring solutions of differential equations using noisy multi-fidelity data. Journal of Computational Physics 335, pp. 736–746. Cited by: §5.
  • M. Raissi, P. Perdikaris, and G. E. Karniadakis (2018) Multistep Neural Networks for Data-driven Discovery of Nonlinear Dynamical Systems. arXiv e-prints, pp. arXiv:1801.01236. External Links: 1801.01236 Cited by: §5.
  • M. Raissi, A. Yazdani, and G. E. Karniadakis (2018) Hidden fluid mechanics: a navier-stokes informed deep learning framework for assimilating flow visualization data. arXiv preprint arXiv:1808.04327. Cited by: §5.
  • M. Raissi (2018) Deep Hidden Physics Models: Deep Learning of Nonlinear Partial Differential Equations. arXiv:1801.06637 [cs, math, stat] (en). Note: arXiv: 1801.06637 External Links: Link Cited by: §5.
  • E. K. Read et al. (2017) Water quality data for national-scale aquatic research: the water quality portal. Water Resources Research. Cited by: §3.3.
  • J. S. Read, X. Jia, J. Willard, A. P. Appling, J. A. Zwart, S. K. Oliver, A. Karpatne, G. J. Hansen, P. C. Hanson, W. Watkins, et al. (2019) Process-guided deep learning predictions of lake water temperature. Water Resources Research. Cited by: §3.2, §4.3.2.
  • H. Ren et al. (2018) Learning with weak supervision from physics and data-driven constraints.. AI Magazine. Cited by: §5.
  • J. J. Roberts et al. (2013) Fragmentation and thermal risks from climate change interact to affect persistence of native trout in the colorado river basin. Global Change Biology. Cited by: §1.
  • J. J. Roberts et al. (2017) Nonnative trout invasions combined with climate change threaten persistence of isolated cutthroat trout populations in the southern rocky mountains. North American Journal of Fisheries Management. Cited by: §1.
  • S. H. Rudy, S. L. Brunton, J. L. Proctor, and J. N. Kutz (2017) Data-driven discovery of partial differential equations. Science Advances 3 (4), pp. e1602614. Cited by: §5.
  • A. Shrivastava, S. Singh, and A. Gupta (2012) Constrained semi-supervised learning using attributes and comparative attributes. In Proceedings of the 12th European Conference on Computer Vision - Volume Part III, ECCV’12, Berlin, Heidelberg, pp. 369–383. External Links: ISBN 978-3-642-33711-6, Link, Document Cited by: §5.
  • R. Stewart and S. Ermon (2017) Label-free supervision of neural networks with physics and domain knowledge.. In AAAI, Vol. 1, pp. 1–7. Cited by: §5.
  • P. Sturmfels, S. Rutherford, M. Angstadt, M. Peterson, C. Sripada, and J. Wiens (2018) A domain guided cnn architecture for predicting age from structural brain images. arXiv preprint arXiv:1808.04362. Cited by: §5.
  • G. Sugihara, R. May, H. Ye, C. Hsieh, E. Deyle, M. Fogarty, and S. Munch (2012) Detecting Causality in Complex Ecosystems. Science 338 (6106), pp. 496–500. Note: bibtex*[publisher=American Association for the Advancement of Science;eprint=] External Links: ISSN 0036-8075, Link, Document Cited by: §5.
  • S. Tabata (1973) A simple but accurate formula for the saturation vapor pressure over liquid water. Journal of Applied Meteorology 12 (8), pp. 1410–1411. Cited by: Appendix A.
  • A. M. Tartakovsky, C. O. Marrero, D. Tartakovsky, and D. Barajas-Solano (2018) Learning parameters and constitutive relationships with physics informed deep neural networks. arXiv preprint arXiv:1808.03398. Cited by: §5.
  • Z. Y. Wan, P. Vlachas, P. Koumoutsakos, and T. Sapsis (2018) Data-assisted reduced-order modeling of extreme events in complex dynamical systems. PloS one 13 (5), pp. e0197704. Cited by: §5.
  • T. Xu and A. J. Valocchi (2015) Data-driven methods to improve baseflow prediction of a regional groundwater model. Computers & Geosciences. Cited by: §5.
  • K. Yao, J. E. Herr, D. W. Toth, R. Mckintyre, and J. Parkhill (2018) The tensormol-0.1 model chemistry: a neural network augmented with long-range physics. Royal Society of Chemistry. External Links: Link Cited by: §5.

Appendix A Energy conservation

Figure 14. The heat energy fluxes that are modeled in PGRNN. For short-wave radiation ($R_{SW}$) and long-wave radiation ($R_{LW_{in}}$), a portion of the energy is reflected by the lake surface.

In Fig. 14, we show the major incoming and outgoing heat fluxes that affect the lake energy budget. The incoming heat fluxes are terrestrial long-wave radiation ($R_{LW_{in}}$) and incoming short-wave radiation ($R_{SW}$). The lake loses heat mainly through the outward fluxes of back radiation ($R_{LW_{out}}$), sensible heat fluxes ($H$), and latent evaporative heat fluxes ($E$)².

²Here the latent heat fluxes are related to changes in phase between liquid, gas, and solid, while the sensible heat fluxes are related to changes in temperature with no change in phase (Bruce et al., 2018).

We now expand Eq. 10 with more detailed energy fluxes. The consistency between the lake energy and the detailed energy fluxes can be expressed as:

$$\Delta U_t = R_{SW}(1-\alpha_{SW}) + R_{LW_{in}}(1-\alpha_{LW}) - R_{LW_{out}} - H - E,$$

where $\alpha_{SW}$ is the short-wave albedo (the fraction of short-wave energy reflected by the lake surface) and $\alpha_{LW}$ is the long-wave albedo. In our implementation, we set $\alpha_{SW}$ to 0.07 and $\alpha_{LW}$ to 0.03, which are generally accepted values for lakes from previous scientific studies (Hipsey et al., 2019). All energy components are in W m$^{-2}$. By comparing this with Eq. 10, we can see that $F_{in} = R_{SW}(1-\alpha_{SW}) + R_{LW_{in}}(1-\alpha_{LW})$ and $F_{out} = R_{LW_{out}} + H + E$. In this work, we ignore the smaller flux terms such as the sediment heat flux and the advected energy from surface inflows and groundwater.
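As a minimal illustration of this balance (hypothetical function and argument names, not the authors' implementation), the mismatch between the modeled change in lake thermal energy and the net surface heat flux over a time step can be computed as:

```python
def energy_imbalance(delta_u, r_sw, r_lw_in, r_lw_out, h, e,
                     alpha_sw=0.07, alpha_lw=0.03):
    """Absolute mismatch (W/m^2) between the change in lake thermal
    energy and the net surface heat flux over the same time step."""
    # Incoming flux: short-wave and long-wave radiation, less the
    # fractions reflected by the lake surface (the albedos).
    flux_in = (1.0 - alpha_sw) * r_sw + (1.0 - alpha_lw) * r_lw_in
    # Outgoing flux: back radiation, sensible and latent heat fluxes.
    flux_out = r_lw_out + h + e
    return abs(delta_u - (flux_in - flux_out))
```

A term of this form can be penalized during training to encourage predictions that respect energy conservation.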

Estimation of Heat Fluxes and Lake Thermal Energy: We now describe how the energy fluxes are estimated in our implementation.

Terrestrial long-wave radiation ($R_{LW_{in}}$) is emitted from the atmosphere and depends on prevailing local conditions such as air temperature and cloud cover. Incoming short-wave radiation ($R_{SW}$) is affected mainly by latitude (solar angle), time of year, and cloud cover. Both fluxes are included in the input drivers.

As for the outgoing energy fluxes, we estimate $R_{LW_{out}}$, $H$, and $E$ separately using the input drivers and the modeled surface temperature.

The sensible heat flux $H$ and the latent evaporative heat flux $E$ can be computed following (Hipsey et al., 2019):

$$H = -\rho_a\, c_a\, C_H\, U_{10}\,(T_s - T_a),$$
$$E = -\rho_a\, C_E\, \lambda\, U_{10}\, \frac{\kappa}{p}\,\big(e_s[T_s] - e_a[T_a]\big),$$

where $C_H$ is the bulk aerodynamic coefficient for sensible heat transfer and $C_E$ is the bulk aerodynamic coefficient for latent heat transfer. Both coefficients are estimated from Hicks’ collection of ocean and lake data (Hicks, 1972). The coefficient $\kappa$ is the ratio of the molecular mass of water to the molecular mass of dry air ($\kappa = 0.622$), $\lambda$ is the latent heat of vaporization ($\lambda = 2.453\times 10^{6}$ J kg$^{-1}$), and $c_a$ is the specific heat capacity of air ($c_a = 1005$ J kg$^{-1}$ K$^{-1}$). The variable $T_a$ is the air temperature and $U_{10}$ is the wind speed (m s$^{-1}$) above the lake referenced to a 10 m height; both variables are included in or can be derived from the input drivers. $T_s$ is the surface water temperature in Kelvin obtained through the feed-forward process. The air density is computed as $\rho_a = 0.348\,\frac{1+\omega}{1+1.61\,\omega}\,\frac{p}{T_a}$, where $p$ is the air pressure (hPa) and $\omega$ is the water vapour mixing ratio (both derived from the input drivers). The vapour pressures $e_s[T_s]$ and $e_a[T_a]$ are calculated by the formula from Tabata (Tabata, 1973):

$$e_s[T_s] = 10^{\,9.28603523 - 2322.37885/T_s}, \qquad e_a[T_a] = \frac{c_{rh}\,RH}{100}\; 10^{\,9.28603523 - 2322.37885/T_a},$$

where $c_{rh}$ is the relative humidity scaling factor ($c_{rh} = 1$, obtained by calibrating the GLM model) and $RH$ is the relative humidity (included in the input drivers).
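These flux estimates can be sketched as follows. This is a minimal illustration with hypothetical function names; the constants are the values stated in the text, and the Tabata saturation vapour pressure coefficients are those commonly used in GLM-family implementations, so treat them as assumptions rather than the authors' exact code:

```python
KAPPA = 0.622      # ratio of molecular mass of water to dry air
LAMBDA = 2.453e6   # latent heat of vaporization (J/kg)
C_A = 1005.0       # specific heat capacity of air (J/(kg K))

def tabata_vapour_pressure(temp_k):
    """Saturation vapour pressure (hPa); temperature in Kelvin."""
    return 10.0 ** (9.28603523 - 2322.37885 / temp_k)

def air_density(pressure_hpa, mixing_ratio, air_temp_k):
    """Moist-air density (kg/m^3) from pressure, vapour mixing ratio,
    and air temperature (Kelvin)."""
    return (0.348 * (1.0 + mixing_ratio) / (1.0 + 1.61 * mixing_ratio)
            * pressure_hpa / air_temp_k)

def sensible_heat_flux(rho_a, c_h, u10, surf_temp_k, air_temp_k):
    """H = -rho_a * c_a * C_H * U10 * (T_s - T_a), in W/m^2."""
    return -rho_a * C_A * c_h * u10 * (surf_temp_k - air_temp_k)

def latent_heat_flux(rho_a, c_e, u10, surf_temp_k, air_temp_k,
                     pressure_hpa, rel_humidity, c_rh=1.0):
    """E = -rho_a * C_E * lambda * U10 * (kappa / p) * (e_s - e_a), in W/m^2."""
    e_s = tabata_vapour_pressure(surf_temp_k)
    e_a = c_rh * rel_humidity / 100.0 * tabata_vapour_pressure(air_temp_k)
    return -rho_a * c_e * LAMBDA * u10 * KAPPA / pressure_hpa * (e_s - e_a)
```

With a water surface warmer than the overlying air, both fluxes come out negative, i.e., the lake loses heat to the atmosphere, which matches the sign convention of the equations above.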

The back radiation $R_{LW_{out}}$ is estimated as:

$$R_{LW_{out}} = \varepsilon\,\sigma\,T_s^{4},$$

where $\varepsilon$ is the emissivity of the water surface ($\varepsilon = 0.97$) and $\sigma$ is the Stefan–Boltzmann constant ($\sigma = 5.6697\times 10^{-8}$ W m$^{-2}$ K$^{-4}$).
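For completeness, the back-radiation term as a one-line sketch (hypothetical function name; constants as stated in the text):

```python
SIGMA = 5.6697e-8   # Stefan-Boltzmann constant (W m^-2 K^-4)
EPSILON = 0.97      # emissivity of the water surface

def back_radiation(surf_temp_k):
    """Long-wave back radiation eps * sigma * T_s^4 (W/m^2), T_s in Kelvin."""
    return EPSILON * SIGMA * surf_temp_k ** 4
```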

Appendix B Distribution of training data

In Fig. 15, we show the distribution of the randomly selected 2% of observed data across different depths and dates.

Figure 15. Distribution of 2% observed data used for training.