Disease Informed Neural Networks

10/11/2021
by   Sagi Shaier, et al.

Abstract

We introduce Disease Informed Neural Networks (DINNs): neural networks capable of learning how diseases spread, forecasting their progression, and finding their unique parameters (e.g. death rate). Here, we used DINNs to identify the dynamics of 11 highly infectious and deadly diseases. These systems vary in their complexity, ranging from 3D to 9D ODEs, and from a few parameters to over a dozen. The diseases include COVID, Anthrax, HIV, Zika, Smallpox, Tuberculosis, Pneumonia, Ebola, Dengue, Polio, and Measles. Our contribution is threefold. First, we extend the recent physics informed neural networks (PINNs) approach to a large number of infectious diseases. Second, we perform an extensive analysis of the capabilities and shortcomings of PINNs on diseases. Lastly, we show the ease with which one can use DINN to effectively learn COVID’s spread dynamics and forecast its progression a month into the future from real-life data. Code and data can be found here: https://github.com/Shaier/DINN.

1 Introduction

Diseases can differ vastly in which organisms they affect, their symptoms, and the speed at which they spread. There has been extensive research on modeling diseases, and one common method is using differential equation compartmental models [Compartment_models]. Such models decompose a system into a number of interacting subsystems called compartments, where each compartment corresponds to a particular group (e.g. infected, susceptible, removed). However, creating such equations can be extremely time consuming, difficult, and often unfruitful.

Lately there have been enormous advancements in the field of artificial intelligence. These range from computer vision [goodfellow2014generative], [NIPS2012_c399862d], [redmon2016look], [tan2020efficientdet] and natural language processing [devlin2019bert], [vaswani2017attention] to audio [baevski2020wav2vec], [li2019jasper] and many more. The dominant algorithm associated with these advancements is the neural network (NN), and a main reason for this is its behavior as a universal function approximator [HORNIK1989359]: for any continuous function on a compact domain, there exists a neural network that approximates it to arbitrary accuracy. However, this field relies largely on huge amounts of data and computational resources.

Recent approaches [RAISSI2019686] have been shown to be successful in combining the best of both fields. That is, they use neural networks to model nonlinear systems, but reduce the required data by constraining the model’s search space with prior knowledge such as differential equation compartmental models. Adopting a similar approach, we used generated and real-life data of various complex, highly infectious, and deadly diseases, and developed a simple method to learn such systems (including their possibly unknown parameters) in the hope of slowing down and potentially preventing the next epidemic.

The paper is structured as follows. In Section 2 we review a collection of relevant works that have previously appeared in the literature. Section 3 presents the necessary background information, our technical approach, and our experiments. Lastly, we conclude with a summary in Section 4.

2 Related Work

There have been several recent works showing how differential equations can be learned from data. For example, [osti_1333570] used a deep neural network to model the Reynolds stress anisotropy tensor, [E_2017] solved for parabolic PDEs and backward stochastic differential equations using reinforcement learning, and [hagge2017solving] solved ODEs using a recurrent neural network. In comparison, we used a simple feed forward network. Additionally, [Raissi_2018] and [RAISSI2019686] developed physics informed models and used neural networks to estimate the solutions of such equations. We adopt a similar approach to [RAISSI2019686], but while that work focused on physics laws which were largely composed of one and two spatial dimensions, we extend the method to higher dimensional systems and focus on diseases. While [Kharazmi2021.04.05.21254919] focused solely on various COVID systems using a similar approach, we experiment with a much larger set of complex diseases, perform an extensive analysis of the approach itself as it relates to diseases, and use it successfully to predict future COVID progression.

There has also been work on differentiating through ODE and PDE solvers. The Stan Math Library [carpenter2015stan] performs reverse-mode automatic differentiation and estimates the partial derivatives using sensitivity analysis. The framework by [koryagin2019pydens] solves various PDEs, including the heat equation and the wave equation. The dolfin-adjoint library [doi:10.1137/120873558] solves for ODE and PDE solutions by implementing adjoint computation. While these methods work by backpropagating through the individual operations of the forward solver, we use automatic differentiation to differentiate our function approximators with respect to their input coordinates and model parameters.

3 Disease Informed Neural Network

In this section, we present our algorithm DINN (a sample architecture can be seen in figure 1). Subsections 3.1 and 3.2 briefly discuss background information (SIR models and neural networks, respectively). Subsection 3.3 provides an overview of the DINN approach and explains the algorithm, loss function, and training information. Lastly, subsection 3.4 reviews our experiments and analyses.

Figure 1: A Simple Disease Informed Neural Network Architecture. In DINN, the input (time) has a size of 1, while the output varies in size (S,I,R / S,I,D,R / etc.).

3.1 SIR Model

One method to model the spread of diseases is the SIR model [sir]. SIR stands for S, I and R, which represent the number of susceptible, infected, and recovered individuals in the population, respectively. The SIR model can generally be written as follows:

$$\frac{dS}{dt} = -\beta S I, \qquad \frac{dI}{dt} = \beta S I - \mu I, \qquad \frac{dR}{dt} = \mu I$$

where $\beta$ represents the transmission rate and $\mu$ represents the removal or recovery rate.
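To make the compartment dynamics concrete, the following is a minimal sketch that integrates an SIR system of the form dS/dt = -beta*S*I, dI/dt = beta*S*I - mu*I, dR/dt = mu*I (the form used in the net_f listing of Section 3.3). It uses a fixed-step fourth-order Runge-Kutta integrator in plain Python rather than LSODA, and the function names, initial fractions, and the values beta = 0.3 and mu = 0.1 are illustrative assumptions, not values from the paper.

```python
def sir_rhs(state, beta, mu):
    """Right-hand side of the SIR system for state = (S, I, R)."""
    S, I, R = state
    return (-beta * S * I, beta * S * I - mu * I, mu * I)


def rk4_step(state, dt, beta, mu):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    def shifted(base, slope, h):
        return tuple(b + h * s for b, s in zip(base, slope))

    k1 = sir_rhs(state, beta, mu)
    k2 = sir_rhs(shifted(state, k1, dt / 2), beta, mu)
    k3 = sir_rhs(shifted(state, k2, dt / 2), beta, mu)
    k4 = sir_rhs(shifted(state, k3, dt), beta, mu)
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate_sir(initial, beta, mu, dt, steps):
    """Return the list of (S, I, R) states, including the initial one."""
    trajectory = [initial]
    for _ in range(steps):
        trajectory.append(rk4_step(trajectory[-1], dt, beta, mu))
    return trajectory


# Illustrative (assumed) values: population fractions and made-up rates.
trajectory = simulate_sir((0.99, 0.01, 0.0), beta=0.3, mu=0.1, dt=0.5, steps=200)
```

Because the three right-hand sides sum to zero, S + I + R stays constant along the trajectory, which is a quick sanity check for any data generator of this kind.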

3.2 Neural Networks

Briefly speaking, a neural network is an attempt to mimic the way the human brain operates. The general fully connected model is organized into layers of nodes (i.e. neurons), where each node in a layer is connected to every node in the following layer (except for the output layer, which has no following layer), and each connection has a particular weight. The idea is that deeper layers capture richer structures [eldan2016power]. A neuron takes the sum of weighted inputs from each incoming connection (plus a bias term), applies an activation function (i.e. a nonlinearity), and passes the output to all the neurons in the next layer. Mathematically, each neuron’s output looks as follows:

$$\text{output} = \sigma\left(\sum_{i=1}^{n} w_i x_i + b\right)$$

where $n$ represents the number of incoming connections, $x_i$ the value of each incoming neuron, $w_i$ the weight on each connection, $b$ is a bias term, and $\sigma$ the activation function (for example ReLU, Tanh, Swish, etc.).
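As a small worked instance of the neuron described above (weighted sum of inputs, plus bias, passed through an activation), the sketch below evaluates a single neuron in plain Python. The function names and the numeric inputs are hypothetical and chosen only so the arithmetic is easy to follow by hand.

```python
import math


def relu(z):
    """ReLU activation: pass positive values through, clamp the rest to zero."""
    return max(0.0, z)


def neuron_output(inputs, weights, bias, activation=math.tanh):
    """Apply the activation to the weighted sum of inputs plus the bias."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)


# Weighted sum: 0.5 * 1.0 + (-0.25) * 2.0 + 0.25 = 0.25, and ReLU keeps it.
out_relu = neuron_output([1.0, 2.0], [0.5, -0.25], bias=0.25, activation=relu)

# Same inputs with zero bias give a weighted sum of 0, and tanh(0) is 0.
out_tanh = neuron_output([1.0, 2.0], [0.5, -0.25], bias=0.0)
```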

3.3 DINNs Approach

Given a disease, we would normally gather data from the environment to try to understand it. For illustration purposes, we instead generated data by solving the systems of disease ODEs with the LSODA algorithm [lsoda], using the initial conditions and the true parameters corresponding to each disease (e.g. death rate) from the literature. These small datasets take the form of the above SIR compartments. To make our neural networks disease informed, once the data was gathered we introduced it to our neural network without the disease parameters (not to be confused with the NN’s parameters). It is worth noting that in this formulation there are no training, validation, and test sets as in most common neural network training; the data is simply how the disease spreads over time. The model then learned the systems and predicted the parameters that generated them. Since in many of these systems there exists a large set of parameters that can generate them, we restricted our parameters to a certain range around the true value. We did so to show that our model can in fact identify the systems and one set of parameters that matches the literature they came from. However, our method is flexible in the sense that adding, modifying, or removing such restrictions can be done with one simple line of code. Additionally, we used nearly a year’s worth of real data aggregated over every US state and accurately predicted a month into the future of COVID transmission.

DINN takes the form

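import torch

# neural_network (a torch.nn.Module) and the learnable disease parameters
# beta and mu are assumed to be defined elsewhere.

def net_sir(time_array):
    # Map time steps of shape (N, 1) to predicted compartments of shape (N, 3).
    return neural_network(time_array)

def net_f(time_array):
    # time_array needs requires_grad=True so autograd can differentiate w.r.t. time.
    S, I, R = net_sir(time_array).unbind(dim=1)
    ones = torch.ones_like(S)
    dSdt = torch.autograd.grad(S, time_array, ones, create_graph=True)[0].squeeze(1)
    dIdt = torch.autograd.grad(I, time_array, ones, create_graph=True)[0].squeeze(1)
    dRdt = torch.autograd.grad(R, time_array, ones, create_graph=True)[0].squeeze(1)
    # Residuals of the SIR equations; each is zero when the ODE is satisfied.
    f1 = dSdt - (-beta * S * I)
    f2 = dIdt - (beta * S * I - mu * I)
    f3 = dRdt - (mu * I)
    return f1, f2, f3, S, I, R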

The input of the network net_sir is a batch of time steps, and the output is a tensor (e.g. [S,I,R]) that represents what the network believes the disease’s compartments look like at each time step. Here, net_f constrains the network by forcing it to match the environment’s conditions, i.e. the disease’s differential equations: the residuals f1, f2, and f3 are zero exactly when the predicted compartments satisfy the system.

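The parameters of the neural network net_sir and the network net_f can be learned by minimizing the mean squared error loss

$$MSE = MSE_{net\_sir} + MSE_{net\_f}$$

where

$$MSE_{net\_sir} = \frac{1}{N}\sum_{i=1}^{N} \left| net\_sir(t_i) - SIR_i \right|^2$$

and

$$MSE_{net\_f} = \frac{1}{N}\sum_{i=1}^{N} \left| net\_f(t_i) \right|^2$$

That is, minimizing the loss

$$loss = mean\big((S_{actual} - S_{predict})^2\big) + mean\big((I_{actual} - I_{predict})^2\big) + mean\big((R_{actual} - R_{predict})^2\big) + mean(f_1^2) + mean(f_2^2) + mean(f_3^2)$$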

Here, “actual” and “predict” refer to the actual data that the model was given and the prediction the model outputted, respectively. As seen from the above, we leveraged the automatic differentiation that neural networks are trained with to get the derivatives of each of S, I, and R with respect to time. The neural networks themselves are fairly simple, consisting of 8 fully connected layers with either 20 or 64 neurons each, depending on the complexity of the system, with ReLU activations in between. Since the data is relatively small, our batch contained the entire time array. The networks were trained on an Intel(R) Xeon(R) CPU @ 2.30GHz. Depending on the complexity of the system, the training time to learn both a system and its unknown parameters ranged from 30 minutes to 58 hours, which could be accelerated on GPUs and TPUs. However, if the parameters are known, the training time to learn the system alone can be as short as 3 minutes. We used the Adam optimizer and PyTorch’s CyclicLR as our learning rate scheduler, with mode = “exp_range”, a min_lr that varied with the complexity of the system, a fixed max_lr, gamma = 0.85, and step_size_up = 1000. In the next sections we refer to “min_lr” simply as “learning rate”. It is important to note that some diseases’ systems were much more difficult for DINN to learn (e.g. Anthrax), and further exploration of the training setup, such as a larger or smaller learning rate or longer training, may be needed to achieve better performance.
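The training setup described above can be sketched end to end in PyTorch as follows. This is an illustrative sketch, not the authors' implementation: the network is much smaller than the 8-layer models described, the names `net` and `dinn_loss` are hypothetical, the target values are toy placeholders rather than disease data, and the learning rate bounds are made-up numbers (PyTorch's CyclicLR calls the lower bound `base_lr`). `cycle_momentum=False` is set because momentum cycling is not wanted with Adam here.

```python
import torch

torch.manual_seed(0)

# Hypothetical small network: time (N, 1) -> compartments (N, 3).
net = torch.nn.Sequential(
    torch.nn.Linear(1, 20), torch.nn.ReLU(),
    torch.nn.Linear(20, 20), torch.nn.ReLU(),
    torch.nn.Linear(20, 3),
)
# Learnable disease parameters with illustrative starting values.
beta = torch.nn.Parameter(torch.tensor(0.1))
mu = torch.nn.Parameter(torch.tensor(0.1))


def dinn_loss(t, sir_actual):
    """Data MSE per compartment plus MSE of the three SIR residuals."""
    t = t.clone().requires_grad_(True)
    pred = net(t)
    S, I, R = pred.unbind(dim=1)
    ones = torch.ones_like(S)
    dS, dI, dR = (
        torch.autograd.grad(c, t, ones, create_graph=True)[0].squeeze(1)
        for c in (S, I, R)
    )
    f1 = dS - (-beta * S * I)
    f2 = dI - (beta * S * I - mu * I)
    f3 = dR - (mu * I)
    data_loss = ((sir_actual - pred) ** 2).mean(dim=0).sum()
    residual_loss = (f1 ** 2).mean() + (f2 ** 2).mean() + (f3 ** 2).mean()
    return data_loss + residual_loss


# Toy placeholder targets (not disease data): S falls while I and R rise.
t_data = torch.linspace(0.0, 1.0, 50).unsqueeze(1)
sir_data = torch.cat([1.0 - 0.2 * t_data, 0.1 * t_data, 0.1 * t_data], dim=1)

optimizer = torch.optim.Adam(list(net.parameters()) + [beta, mu], lr=1e-5)
scheduler = torch.optim.lr_scheduler.CyclicLR(
    optimizer, base_lr=1e-5, max_lr=1e-3, step_size_up=1000,
    mode="exp_range", gamma=0.85, cycle_momentum=False,
)

losses = []
for _ in range(20):  # the paper trains for far longer
    optimizer.zero_grad()
    loss = dinn_loss(t_data, sir_data)
    loss.backward()
    optimizer.step()
    scheduler.step()
    losses.append(loss.item())
```

Putting `beta` and `mu` in the optimizer alongside the network weights is what lets a single loss drive both the fit to the data and the estimate of the disease parameters.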

3.4 Experiments

The following experiments were done on the COVID model [Anastassopoulou2019]:

3.4.1 Parameters Ranges

As mentioned above, there may be a large set of parameters that can generate a certain system. Hence, we restricted our parameters to a certain range to show that our model can learn the set that was used in the literature. Here we experimented with various parameter ranges to see the effect they had on the model. In the following we used a 4 layer neural network with 20 neurons each, learning rate, 100 data points, and the models were trained for 700k iterations (roughly 30 minutes). In our experiments we report two kinds of relative MSE loss errors. The first, “Error NN”, is the error on the neural network’s predicted system. The second, “Error learnable parameters”, is the error on the system that was generated from the learnable parameters, that is, the system obtained by running the LSODA algorithm with the disease parameters the neural network found.

As an example, if the actual parameter’s value was , a search range would simply be , a range would be . Further ranges are multiplications of those: , , and so on. Table 1 (left) shows the parameters, their actual values, the range DINN was searching in, and the parameter values that were found. Table 1 (right) shows the error of the neural network and of the LSODA generation of the system from the parameters. That is, it shows the effect that the search range had on how well the neural networks learned the parameters. As seen from table 1 and figure 2 (for the remaining tables and figures see appendix 6.1), at least in the case of the COVID model DINN managed to find an extremely close set of parameters in every range we tested. Additionally, the systems were learned almost perfectly, though there was some variation in the relative error between experiments.
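The text does not say how the range restriction is implemented. One common way to keep a learnable parameter inside an interval, assumed here purely for illustration, is to learn an unconstrained value and squash it with tanh. The function name `bounded_parameter` and the interval below are hypothetical.

```python
import math


def bounded_parameter(raw, low, high):
    """Map an unconstrained value to [low, high] via tanh squashing."""
    midpoint = (low + high) / 2.0
    half_width = (high - low) / 2.0
    return midpoint + half_width * math.tanh(raw)


# Whatever the unconstrained value is, the result stays inside the range.
samples = [bounded_parameter(raw, -1.0, 1.0) for raw in (-100.0, -1.0, 0.0, 0.5, 100.0)]
```

In a PyTorch model the same idea would apply `torch.tanh` to a learnable tensor, so widening or removing the restriction amounts to changing that one expression.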

Table 1: 1000% Search Range. Left-hand side shows the parameters, their ranges, the values found after training, and the relative error percentage. Right-hand side shows 2 errors: “Error NN” is the relative MSE loss error from the system that the neural network outputted (what DINN believes the systems’ dynamics look like), and “Error Learnable Parameters” is the relative MSE loss error from the LSODA generated system using the parameter values found.

[100000% Search Range — NN Output] [100000% Search Range — LSODA Generation From Learnable Parameters]

Figure 2: 100000% range. Left image shows the effect that the search range had on the neural networks’ outputs. Right image shows the effect that the search range had on how well the neural networks learned the parameters.

3.4.2 Noise

To show the robustness of DINNs, we generated various amounts of uncorrelated Gaussian noise. The models were trained for million iterations (roughly 1 hour), using parameter ranges of 1000% variation and similar learning parameters (e.g. learning rate) as in the above section. We used a 4 layer neural network with 20 neurons each, and 100 data points. The experiments show that even with a very high amount of noise such as 20%, DINN achieves accurate results, with a maximum relative error of 0.143 on learning the system. That being said, the exact parameters were harder to learn with that amount of noise. It appears that the models may need further training to stabilize the parameters, as accuracy did not vary consistently with the amount of noise. Figure 3 shows DINN’s predictions on 20% uncorrelated Gaussian noise. For the remaining figures and tables on various amounts of uncorrelated Gaussian noise see appendix 6.2.

Figure 3: DINN’s prediction on 20% uncorrelated Gaussian noise. Better seen in color.
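The noise experiment can be mimicked with a few lines of plain Python. The paper does not define what an "x% noise" level means precisely; the sketch below assumes it means independent Gaussian noise whose standard deviation is that fraction of each data point's magnitude. The function name and this interpretation are assumptions.

```python
import random


def add_gaussian_noise(values, noise_fraction, seed=0):
    """Add independent zero-mean Gaussian noise scaled to each value's magnitude."""
    rng = random.Random(seed)  # seeded so runs are reproducible
    return [v + rng.gauss(0.0, noise_fraction * abs(v)) for v in values]


clean = [1.0, 2.0, 3.0]          # toy placeholder values
noisy_20 = add_gaussian_noise(clean, 0.20, seed=1)
```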

3.4.3 Data Variability

In another exercise to show robustness, we trained our models with various amounts of data: 10, 20, 50, 100, and 1000 points. The models, consisting of 4 layers with 20 neurons each, were trained for 700k iterations, and learning rate. Our analysis shows that there was a big increase in parameter accuracy from 10 points to 20 points. The model that was trained on 1000 data points performed the best, followed by 100 points, 20 points, 50 points, and lastly 10 points. It may be the case that further training will stabilize the results and the 50 data points model will outperform the 20 points one. Still, even with 20 data points the model learns the system very well (table 2). See appendix 6.3 for the remaining tables and figures.

[10 points — Neural Network’s System] [20 points — Neural Network’s System]
[50 points — Neural Network’s System] [100 points — Neural Network’s System]

Figure 4: Various Data Points
Table 2: 20 Data Points. Left-hand side shows the parameters and the values found after training. Right-hand side shows 2 errors: “Error NN” is the relative MSE loss error from the system that the neural network outputted (what DINN believes the systems’ dynamics look like), and “Error Learnable Parameters” is the relative MSE loss error from the LSODA generated system using the parameter values found.

3.4.4 Neural Network Architectures

Here we examined the effect that a wider or deeper neural network architecture has on DINN. The models were trained on 100 data points, using parameter ranges of 1000%, learning rate, and 700k iterations. Tables 3 and 4 show a clear decrease in error as one increases the number of neurons per layer. There also seems to be a clear decrease in error as the number of layers increases. However, the error seems to stabilize around 8 layers, with only a very minor performance gain at 12 layers.

Layers | 10 neurons per layer | 20 neurons per layer | 64 neurons per layer
2 | (0.030, 0.109, 0.027, 0.027) | (0.024, 0.15, 0.022, 0.022) | (0.005, 0.038, 0.004, 0.004)
4 | (0.002, 0.027, 0.004, 0.004) | (0.001, 0.007, 0.001, ) | (, 0.004, , )
8 | (0.001, 0.008, 0.002, 0.002) | (, 0.002, , ) | (, 0.001, , )
12 | (0.001, 0.008, 0.002, 0.002) | (, 0.002, , ) | (, 0.001, , )
Table 3: Architecture Variation: (S,I,D,R) error from the neural network’s output. Neural network architecture variations (depth and width). Relative MSE errors are reported on the predicted NN system.
Layers | 10 neurons per layer | 20 neurons per layer | 64 neurons per layer
2 | (0.132, 0.519, 0.088, 0.111) | (0.106, 0.423, 0.083, 0.077) | (0.001, 0.009, 0.019, 0.011)
4 | (0.038, 0.148, 0.026, 0.029) | (0.064, 0.256, 0.045, 0.050) | (0.009, 0.044, 0.010, 0.008)
8 | (0.036, 0.138, 0.033, 0.024) | (0.027, 0.107, 0.018, 0.022) | (0.057, 0.234, 0.045, 0.043)
12 | (0.036, 0.138, 0.033, 0.024) | (0.022, 0.091, 0.015, 0.019) | (0.017, 0.076, 0.017, 0.017)
Table 4: Architecture Variation: (S,I,D,R) error from LSODA generation of the learnable parameters. Neural network architecture variations (depth and width). Relative MSE errors are reported on the LSODA generated system using the parameter values found.

3.4.5 Learning Rates

We found that quickly increasing the learning rate and then quickly decreasing it to a steady value allows the network to learn well. One such learning rate schedule is PyTorch’s CyclicLR learning rate scheduler. To show the importance of the learning rate for the amount of training time needed, we trained DINNs with several values (1e-5, 1e-6, 1e-8), as well as a different step size for each one (100, 1000, 10000). We used 4 layers with 20 neurons each, and 100 data points. Time was measured from the moment the network started training until the loss dropped below a fixed threshold, which usually corresponds to learning the system almost perfectly. As can be seen from the results (Table 5), both the minimum learning rate and the step size play an important role in learning the system. Reducing the learning rate to a small value too quickly may result in hours of training time instead of minutes. In hindsight, this might be the reason why most of the systems took so long to train (>10 hrs), while the COVID system took <25 minutes.

Learning Rate | Step Size Up 100 | Step Size Up 1000 | Step Size Up 10000
1e-5 | 2min 31s | 2min 57s | 3min 16s
1e-6 | 21min 11s | 20min 59s | 18min 43s
1e-8 | >8hrs | >8hrs | >8hrs
Table 5: Learning Rate & Step Size vs Training Time
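To see why this schedule rises briefly and then settles at the minimum learning rate, the sketch below re-implements the "exp_range" policy in plain Python, following our reading of how PyTorch's CyclicLR computes it (a triangular wave between the lower and upper bounds, scaled by gamma raised to the iteration count). The lower and upper bounds used here are hypothetical; gamma = 0.85 and a step size of 1000 are the values given in Section 3.3.

```python
import math


def cyclic_lr_exp_range(iteration, base_lr, max_lr, step_size_up, gamma):
    """Learning rate at a given iteration under a symmetric exp_range cycle."""
    total = 2 * step_size_up                    # steps up plus steps down
    cycle = math.floor(1 + iteration / total)
    x = 1.0 + iteration / total - cycle         # position in the cycle, in [0, 1)
    # Triangular wave: 0 -> 1 over the first half, 1 -> 0 over the second.
    scale = x / 0.5 if x <= 0.5 else (x - 1.0) / (0.5 - 1.0)
    return base_lr + (max_lr - base_lr) * scale * gamma ** iteration


# Hypothetical bounds; gamma and step size follow the text.
schedule = [cyclic_lr_exp_range(i, 1e-6, 1e-3, 1000, 0.85) for i in range(100)]
peak_iteration = max(range(100), key=lambda i: schedule[i])
```

With gamma = 0.85 the exponential factor shrinks much faster than the triangular wave grows, so under this reading the rate peaks within the first ten iterations and is then effectively equal to the lower bound, which is consistent with the text referring to min_lr simply as the learning rate.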

3.4.6 Missing Data

So far we assumed that we have all the data for each compartment. However, what would happen if we don’t? In this experiment we explored that exact question. We tested DINN on 3 systems: COVID with missing data on I, tuberculosis with missing data on L and I, and ebola with missing data on H. The models were trained on 100 data points, were given the known parameters from the literature, and were only given the initial conditions for the missing data. The COVID model was trained with learning rate for 1 million iterations (roughly 1 hour). The tuberculosis model was trained with learning rate for 100k iterations. The ebola model was trained with learning rate for 800k iterations. Our results show that DINN can in fact learn systems even when given partial data. However, it is important to note that a compartment with missing data should appear in the equation of at least one other compartment in order to get good results. For example, when we tried to remove the COVID recovered compartment (i.e. R), DINN learned S, I, and D nearly perfectly but did not do very well on R, because R does not appear in any of the other equations. We report the neural networks’ system outputs and their losses. See appendix 6.4 (figure 12) for figures.
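The text does not spell out the loss used when a compartment is unobserved. One plausible formulation, assumed here for illustration, keeps the full mean squared error for observed compartments and only the initial-condition error for missing ones, while the ODE residual terms (not shown) still cover every compartment. The function name and the toy numbers are hypothetical.

```python
def partial_data_loss(predicted, observed, missing):
    """Data loss when some compartments only have an initial condition.

    predicted: dict mapping compartment name to the list of predicted values.
    observed:  dict mapping compartment name to the available data; for a
               missing compartment only the first entry (initial value) is used.
    missing:   set of compartment names without time-series data.
    """
    total = 0.0
    for name, pred in predicted.items():
        if name in missing:
            # Only the initial condition constrains a missing compartment.
            total += (pred[0] - observed[name][0]) ** 2
        else:
            squared = [(p - o) ** 2 for p, o in zip(pred, observed[name])]
            total += sum(squared) / len(squared)
    return total


# Toy placeholder values: S is fully observed, I has only its initial value.
example_loss = partial_data_loss(
    predicted={"S": [1.0, 0.5], "I": [0.0, 0.5]},
    observed={"S": [1.0, 0.0], "I": [0.5]},
    missing={"I"},
)
```

Under this formulation a missing compartment is tied to the data only through the equations it shares with observed compartments, which matches the observation that R, appearing in no other equation, was learned poorly.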

COVID system [Anastassopoulou2019]:

Tuberculosis system [Castillo-Chavez1997]:

Ebola system [Ebola]:

Relative MSE Error
Table 6: COVID: Missing Data On I
Relative MSE Error
Table 7: Tuberculosis: Missing Data On L,I
Relative MSE Error
Table 8: Ebola: Missing Data On H

3.4.7 11 Diseases Summary

Expanding on the relatively simple COVID model that was used for experimentation so far, here DINN was applied to 10 other highly infectious diseases, namely Anthrax, HIV, Zika, Smallpox, Tuberculosis, Pneumonia, Ebola, Dengue, Polio, and Measles. These diseases vary in their complexity, ranging from 3D to 9D ODEs, and from a few parameters to over a dozen. Table 9 provides a summary of our analysis. See subsection 6.5 in the appendix for the remaining diseases analyses.

Disease Best Worst Median
COVID 0.2 1.151 1.02
Anthrax 0.5754 6.0459 2.4492
HIV 0.007515 3.811689 0.829756
Zika 0.0588 5.8748 0.7261
Smallpox 0.0882 10.8598 4.9239
Tuberculosis 0.5424 11.0583 3.8952
Pneumonia 0.0005 11.6847 1.6372
Ebola 0.2565 9.6403 1.1244
Dengue 0.2696 9.7723 0.8796
Polio 0 0.4168 0.3587
Measles 2.9999 12.704 3.1453
Table 9: 11 Diseases Summary. Each disease and its best, worst, and median parameter estimate error.
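The percentages in this summary appear consistent with a relative error of |found - actual| / |actual| × 100, although the formula is an inference and is not stated in the text. The sketch below applies it to the three COVID parameter pairs (actual value, value found) listed in Table 14 of the appendix; the function and variable names are hypothetical.

```python
import statistics


def relative_error_percent(actual, found):
    """Relative error of a parameter estimate, as a percentage."""
    return abs(found - actual) / abs(actual) * 100.0


# (actual value, value found) for the three COVID parameters in Table 14.
covid_parameters = [(0.191, 0.1932), (0.05, 0.0501), (0.0294, 0.0297)]
errors = [relative_error_percent(a, f) for a, f in covid_parameters]

best, worst, median = min(errors), max(errors), statistics.median(errors)
```

The results agree with the COVID row (best 0.2, worst 1.151, median 1.02) up to rounding of the reported parameter values.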

3.4.8 Real-Life Data

Lastly, to verify that DINN is in fact as reliable as it appears, we used 310 days (04-12-2020 to 02-16-2021) of real-life US data from [JHU]. We trained a neural network that learned the cumulative cases of susceptible, infected, dead, and recovered, and predicted the cases for a future month. Specifically, out of those 310 days we gave the network 280 days’ worth of data and asked it to predict each compartment’s progression a month (30 days) into the future. The network received 31 data points (1 per 10 days), was trained for 100k epochs (roughly 5 minutes), had 8 layers with 20 neurons each, a 1000% parameter variation, and learning rate.

Our results suggest that the learnable parameters found by both networks were quite different from the parameters in the literature (for the cumulative and daily networks respectively: and instead of , and instead of , and and instead of ). This may imply either that the data was different from the initial data distribution used in the literature [Anastassopoulou2019], or that, as other authors have mentioned, these are time-varying parameters rather than constant ones. As seen from figure 5, the cumulative cases had less data variation and were fairly easy to learn. Additionally, DINN appears to have accurately predicted the future month for each compartment. The daily cases had much more data variation and were more difficult. That being said, DINN managed to learn the relative number of cases each day.

Figure 5: DINN’s output on COVID real-life cumulative cases over 310 days

4 Conclusion

We introduced Disease Informed Neural Networks (DINNs), neural networks capable of learning a number of diseases: how they spread, forecasting their progression, and finding their unique parameters. Here we used DINN to model 11 deadly infectious diseases and showed its simplicity, efficacy, and generalization. These diseases were modeled as various systems of differential equations with varying numbers of learnable parameters. We found that DINN can quite easily learn systems with a low number of parameters and dimensions (e.g. COVID), and when the parameters are already known the training time can drop from 50 hours to a few minutes. Moreover, it appears that the number of dimensions does not affect the performance as much as the number of learnable parameters (e.g. see pneumonia vs ebola). From the anthrax model results we see that it is far more difficult for DINN to learn systems which have numerous quick and sharp oscillations. That being said, the polio and zika model results show that it is not impossible, but rather more time consuming (both in training and in hyperparameter search). Also, the measles, tuberculosis, and smallpox model results show that a low number of sharp oscillations is relatively easy to learn. Several systems were particularly interesting: anthrax, which DINN appeared to struggle with; zika, with the highest number of dimensions, parameters, and oscillations; and COVID, which could be predicted nearly perfectly in roughly 3 minutes.

5 Acknowledgements

This work is supported in part by the Computational Mathematics program at the National Science Foundation through grant DMS 2031027. We also gratefully acknowledge the support of NVIDIA Corporation and Lambda Labs.

6 Appendix

The following subsections provide additional information mainly in the form of figures and tables.

6.1 Parameters Ranges

This subsection shows the remaining figures (6, 7, 8, 9) and table (10) for the various parameter search ranges we trained DINN on.

[0% Search Range — NN Output] [0% Search Range — LSODA Generation From Learnable Parameters]

Figure 6: 0% range. Left image shows the effect that the search range had on the NN outputs. Right column shows the effect that the search ranges had on how well the NN learned the parameters.

[100% Search Range — NN Output] [100% Search Range — LSODA Generation From Learnable Parameters]

Figure 7: 100% search range

[1000% Search Range — NN Output] [1000% Search Range — LSODA Generation From Learnable Parameters]

Figure 8: 1000% range. Left image shows the effect that the search range had on the NN outputs. Right column shows the effect that the search ranges had on how well the NN learned the parameters.

[10000% Search Range — NN Output] [10000% Search Range — LSODA Generation From Learnable Parameters]

Figure 9: 10000% range.
Table 10: Various Search Ranges. Left-hand side shows the parameters, their ranges, the values found after training, and the relative error percentage. Right-hand side shows 2 errors: “Error NN” is the relative MSE loss error from the system that the neural network outputted (what DINN believes the systems’ dynamics look like), and “Error Learnable Parameters” is the relative MSE loss error from the LSODA generated system using the parameter values found.

6.2 Noise

This subsection shows the remaining figure (10) and table (11) for the various amounts of uncorrelated Gaussian noise we trained DINN on.

[1% — Neural Network’s System] [5% — Neural Network’s System]
[10% — Neural Network’s System] [20% — Neural Network’s System]

Figure 10: Uncorrelated Gaussian Noise
Table 11: Various Gaussian Noise. Left-hand side shows the parameters, their actual values, the values found after training, and the relative error percentage. Right-hand side shows 2 errors: “Error NN” is the relative MSE loss error from the system that the neural network outputted (what DINN believes the systems’ dynamics look like), and “Error Learnable Parameters” is the relative MSE loss error from the LSODA generated system using the parameter values found.

6.3 Data Variability

This subsection shows the remaining figure (11) and table (12) for the various data points settings we trained DINN on.

[20 points — Neural Network’s System] [50 points — Neural Network’s System]
[100 points — Neural Network’s System] [1000 points — Neural Network’s System]

Figure 11: Various Data Points
Table 12: Various Data Points. Left-hand side shows the parameters, their actual values, the values found after training, and the relative error percentage. Right-hand side shows 2 errors: “Error NN” is the relative MSE loss error from the system that the neural network outputted (what DINN believes the systems’ dynamics look like), and “Error Learnable Parameters” is the relative MSE loss error from the LSODA generated system using the parameter values found.

6.4 Missing Data

This subsection shows the remaining figure (12) for the various systems (COVID, Tuberculosis, and Ebola) with missing data DINN was trained on.

[COVID: Missing Data On I — NN Output] [Tuberculosis: Missing Data On L,I — NN Output]
[Ebola: Missing Data On H — NN Output]

Figure 12: Missing data

6.5 Diseases

6.5.1 COVID

The DINN COVID model had 8 layers with 20 neurons per layer, min learning rate, and was trained for 400k iterations (about 20 minutes). Figure 13 and tables 13, 14 show our results.

System:

(S, I, D, R) Error
(0.022, 0.082, 0.022, 0.014)
Table 13: COVID: relative error from LSODA generation of the learnable parameters
Figure 13: COVID: Neural Network Output
Parameter Actual Value Range Parameter Found % Relative Error
0.191 (-1,1) 0.1932 1.151
0.05 (-1,1) 0.0501 0.2
0.0294 (-1,1) 0.0297 1.02
Table 14: COVID: parameters, their values, the parameters search range that DINN was trained on, the parameters found after training, and the relative error percentage

6.5.2 HIV

The DINN HIV model had 8 layers with 20 neurons per layer, min learning rate, and was trained for 25mil iterations (about 22 hours). Figure 14 and tables 15, 16 show our results.

System:

(T, I, V) Error
(0.008, 0.002, 0.003)
Table 15: HIV: relative error from LSODA generation of the learnable parameters
Figure 14: HIV: Neural Network Output
Parameter Actual Value Range Parameter Found % Relative Error
10 (9.9,10.1) 10.000751 0.007515
0.02 (0.018,0.022) 0.020762 3.811689
0.26 (0.255,0.265) 0.261271 0.488758
0.24 (0.235,0.245) 0.241747 0.727760
2.4 (2.5,2.3) 2.419914 0.829756
0.03 (0.029,0.031) 0.030605 2.015910
250 (247.5,252.5) 249.703094 0.118762
1500 (1485,1515) 1506.543823 0.436255
0.000246 2.447948
0.000203 1.599052
Table 16: HIV: parameters, their values, the parameters search range that DINN was trained on, the parameters found after training, and the relative error percentage

6.5.3 Smallpox

The DINN Smallpox model had 8 layers with 20 neurons per layer, min learning rate, and was trained for 12mil iterations (about 14 hours). Figure 15 and tables 17, 18 show our results.

System:

(S, En, Ei, Ci, I, Q, U, V) Error
(0.033, 0.053, 0.045, 0.060, 0.014, 0.036, 0.027, 0.021)
Table 17: Smallpox: relative error from LSODA generation of the learnable parameters
Figure 15: Smallpox: Neural Network Output
Parameter Actual Value Range Parameter Found % Relative Error
0.06 (9.9,10.1) 0.0554 7.7222
0.04 (0.036,0.044) 0.0380 4.9239
0.975 (0.86,1.04) 0.9839 0.9089
0.3 (0.27,0.33) 0.2841 5.2848
0.975 (0.86,1.04) 0.9759 0.0882
0.95 (0.86,1.04) 0.9050 4.7371
0.068 (0.061,0.075) 0.0626 8.5490
0.11 (0.10,0.12) 0.1034 10.8598
Table 18: Smallpox: parameters, their values, the parameters search range that DINN was trained on, the parameters found after training, and the relative error percentage

6.5.4 Tuberculosis

The DINN Tuberculosis model had 8 layers with 20 neurons per layer, min learning rate, and was trained for 10mil iterations (about 12 hours). Figure 16 and tables 19, 20 show our results.

System:

(S, L, I, T) Error
(0.030, 0.034, 0.034, 0.008)
Table 19: Tuberculosis: relative error from LSODA generation of the learnable parameters
Figure 16: Tuberculosis: Neural Network Output
Parameter Actual Value Range Parameter Found % Relative Error
500 (480,520) 509.4698 1.8587
13 (9,15) 12.5441 3.6341
1 (-1,3) 1.0405 3.8952
0.143 (0.1,0.3) 0.1474 3.0142
0.5 (0,1) 0.5396 7.3433
2 (1,3) 1.9892 0.5424
1 (-1,3) 1.1243 11.0583
13 (9,15) 13.7384 5.3746
0 (-0.4,0.4) -0.0421 0
Table 20: Tuberculosis: parameters, their values, the parameters search range that DINN was trained on, the parameters found after training, and the relative error percentage

6.5.5 Pneumonia

The DINN Pneumonia model had 8 layers with 64 neurons per layer, min learning rate, and was trained for 25mil iterations (about 41 hours). Figure 17 and tables 21, 22 show our results.

System:

(S, V, C, I, R) Error
(0.020, 0.039, 0.034, 0.019, 0.023)
Table 21: Pneumonia: relative error from LSODA generation of the learnable parameters
Figure 17: Pneumonia: Neural Network Output
Parameter Actual Value Range Parameter Found % Relative Error
0.01 (0.0099,0.011) 0.0098 2.0032
0.1 (0.09,0.11) 0.0990 0.9622
0.5 (0.49,0.51) 0.5025 0.5083
0.002 (0.001,0.003) 0.0022 11.6847
0.89 (0.87,0.91) 0.8912 0.1309
0.0025 (0.0023,0.0027) 0.0027 7.4859
0.001 (0.0009,0.0011) 0.0011 6.7374
0.2 (0.19, 0.21) 0.2033 1.6372
0.008 (0.0075,0.0085) 0.0084 4.8891
0.01 (0.009,0.011) 0.0092 8.4471
0.057 (0.056,0.058) 0.0570 0.0005
0.05 (0.049,0.051) 0.0508 1.5242
0.0115 (0.0105,0.0125) 0.0122 5.8243
0.2 (0.19,0.21) 0.2023 1.1407
0.5 (0.49,0.51) 0.4960 0.8003
0.1 (0.09,0.11) 0.1038 3.7502
Table 22: Pneumonia: parameters, their values, the parameters search range that DINN was trained on, the parameters found after training, and the relative error percentage

6.5.6 Ebola

The DINN Ebola model had 8 layers with 20 neurons per layer, min learning rate, and was trained for 20mil iterations (about 33 hours). Figure 18 and tables 23, 24 show our results.

System:

(S, E, I, H, F, R) Error
(0.023, 0.050, 0.044, 0.062, 0.049, 0.005)
Table 23: Ebola: relative error from LSODA generation of the learnable parameters
Figure 18: Ebola: Neural Network Output
Parameter Actual Value Range Parameter Found % Relative Error
3.532 (3.5,3.56) 3.5589 0.7622
0.012 (0.011,0.013) 0.0129 7.8143
0.462 (0.455,0.465) 0.4638 0.3976
1/12 (0.072,0.088) 0.0866 3.9320
1/4.2 (0.22,0.28) 0.2471 3.7853
0.65 (0.643,0.657) 0.6523 0.3477
0.1 (0.099,0.11) 0.0904 9.6403
0.47 (0.465,0.475) 0.4712 0.2565
1/8 (0.118,0.122) 0.1205 3.6124
0.42 (0.415,0.425) 0.4247 1.1244
0.5 (0.45,0.55) 0.5196 3.9246
0.082 (0.081,0.083) 0.0811 1.0932
0.07 (0.069,0.071) 0.0710 0.7563
Table 24: Ebola: parameters, their values, the parameters search range that DINN was trained on, the parameters found after training, and the relative error percentage

6.5.7 Dengue

The DINN Dengue model had 8 layers with 20 neurons per layer, min learning rate, and was trained for 35mil iterations (about 58 hours). Figure 19 and tables 25, 26 show our results.

System:

(Sh, Eh, Ih, Rh, Sv, Ev, Iv) Error
(0.003, 0.012, 0.030, 0.054, 0.001, 0.001, 0.002)
Table 25: Dengue: relative error from LSODA generation of the learnable parameters
Figure 19: Dengue: Neural Network Output
Parameter Actual Value Range Parameter Found % Relative Error
10 (9.9,10.1) 9.9317 0.6832
30 (29.7,30.3) 29.8542 0.4859
0.055 (0.054,0.056) 0.0552 0.2696
0.05 (0.049,0.051) 0.0506 1.2876
0.99 (0.9,1.1) 0.9643 2.5967
0.057 (0.056,0.058) 0.0567 0.5294
0.0195 (0.0194,0.0196) 0.0194 0.3835
0.016 (0.015,0.017) 0.0159 0.8796
0.53 (0.52,0.54) 0.5372 1.3567
0.2 (0.19,0.21) 0.1989 0.5483
0.1 (0.05,0.15) 0.0902 9.7723
Table 26: Dengue: parameters, their values, the parameters search range that DINN was trained on, the parameters found after training, and the relative error percentage

6.5.8 Anthrax

The DINN Anthrax model had 8 layers with 64 neurons per layer, min learning rate, and was trained for 55mil iterations (about 91 hours). Figure 20 and tables 27, 28 show our results.

System:

(S, I, A, C) Error
(0.052, 0.144, 0.171, 0.171)
Table 27: Anthrax: relative error from LSODA generation of the learnable parameters
Figure 20: Anthrax: Neural Network Output
Parameter Actual Value Range Parameter Found % Relative Error
r 1/300 (0.003,0.0036) 0.0034 1.2043
1/600 (0.0014,0.0018) 0.0017 0.5754
0.1 (0.09,0.11) 0.1025 2.5423
0.5 (0.49,0.51) 0.5035 0.7022
0.1 (0.09,0.11) 0.1024 2.4492
0.01 (0.009,0.011) 0.0106 6.0459
0.1 (0.09,0.11) 0.0976 2.4492
1/7 (0.13,0.15) 0.1444 1.0542
1/64 (0.03,0.07) 0.0512 2.3508
100 (98,102) 100.6391 0.6391
0.02 (0.0018,0.0022) 0.0021 6.5466
0.1 (0.09,0.11) 0.1051 5.1029
Table 28: Anthrax: parameters, their values, the parameters search range that DINN was trained on, the parameters found after training, and the relative error percentage

6.5.9 Polio

The DINN Polio model had 8 layers with 64 neurons per layer, min learning rate, and was trained for 40mil iterations (about 66 hours). Figure 21 and tables 29, 30 show our results.

System:

(Sc, Sa, Ic, Ia, Rc, Ra) Error
(0.001, 0.001, 0.017, 0.021, 0.004, 0.001)
Table 29: Polio: relative error from LSODA generation of the learnable parameters
Parameter Actual Value Range Parameter Found % Relative Error
0.02 (0.018,0.022) 0.0200 0
0.5 (0.495,0.505) 0.5018 0
18 (17.9,18.1) 18.0246 0.4168
36 (35.8,36.2) 36.0701 0.3587
40 (39,41) 40.2510 0
90 (89,91) 90.6050 0
0 (-0.001,0.001) 0.0002 0
0 (-0.001,0.001) 0.0004 0
Table 30: Polio: parameters, their values, the parameters search range that DINN was trained on, the parameters found after training, and the relative error percentage
Figure 21: Polio: Neural Network Output

6.5.10 Measles

The DINN Measles model had 8 layers with 64 neurons per layer, min learning rate, and was trained for 17mil iterations (about 28 hours). Figure 22 and tables 31, 32 show our results.

System:

(S, E, I) Error
(0.017, 0.058, 0.059)
Table 31: Measles: relative error from LSODA generation of the learnable parameters
Figure 22: Measles: Neural Network Output
Parameter Actual Value Range Parameter Found % Relative Error
0.02 (0.01,0.03) 0.0225 12.704
0.28 (0.27,0.37) 0.2700 3.5704
100 (97,103) 97.0001 2.9999
35.84 (33,37) 34.7127 3.1453
Table 32: Measles: parameters, their values, the parameters search range that DINN was trained on, the parameters found after training, and the relative error percentage
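Two of the found values in Table 32 (0.2700 and 97.0001) sit at the lower edge of their search ranges, which would be consistent with each learnable parameter being confined to its range during training. One common way to impose such a range, shown here as an assumption rather than as the authors' implementation, is to map an unconstrained variable through tanh onto the interval. The function name is hypothetical.

```python
import math

def bounded_parameter(raw, low, high):
    """Map an unconstrained value onto the interval [low, high] via tanh."""
    center = 0.5 * (low + high)
    half_width = 0.5 * (high - low)
    # tanh lies in [-1, 1], so the result cannot leave the search range.
    return center + half_width * math.tanh(raw)

# Search range (97, 103) taken from Table 32; the raw values are illustrative.
for raw in (-5.0, 0.0, 5.0):
    print(raw, bounded_parameter(raw, 97.0, 103.0))
```

With this kind of mapping, a strongly negative raw value saturates near the lower bound, so a found value that hugs the edge of its range may indicate that the optimizer wanted to move further than the range allowed.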

6.5.11 Zika

The DINN Zika model had 8 layers with 64 neurons per layer, min learning rate, and was trained for 8mil iterations (about 13 hours). Figure 23 shows only a selection of the compartments to reduce visual clutter. Figure 23 and tables 33, 34 show our results.

System:

() Error
(, 0.017, 0.014, 0.003, 0.024, 0.091, 0.005, 0.012, 0.018, 0.018)
Table 33: Zika: relative error from LSODA generation of the learnable parameters
Parameter Actual Value Range Parameter Found % Relative Error
a 0.5 (0.49,0.51) 0.4997 0.0588
b 0.4 (0.39,0.41) 0.4033 0.8297
c 0.5 (0.49,0.51) 0.5015 0.3086
0.1 (0.09,0.11) 0.0999 0.0687
0.05 (0.0495,0.0505) 0.0498 0.4098
0.6 (0.594,0.606) 0.6033 0.5486
0.3 (0.27,0.33) 0.2902 3.2565
18 (17.8,18.2) 17.9669 0.1838
m 5 (4.5,5.5) 5.2937 5.8748
1/5 (0.198,0.202) 0.1996 0.1798
10 (9.9,10.1) 10.0170 0.1700
1/5 (0.18,0.22) 0.1991 0.4651
1/64 (0.045,0.055) 0.0504 0.7261
1/7 (0.139,0.141) 0.1406 1.5967
1/14 (0.063,0.077) 0.0723 1.1806
Table 34: Zika: parameters, their values, the parameters search range that DINN was trained on, the parameters found after training, and the relative error percentage
Figure 23: Zika: Neural Network Output