Forecasting the Spread of Covid-19 Under Control Scenarios Using LSTM and Dynamic Behavioral Models

05/24/2020 ∙ by Seid Miad Zandavi, et al. ∙ UNSW

To accurately predict the regional spread of Covid-19 infection, this study proposes a novel hybrid model which combines a long short-term memory (LSTM) artificial recurrent neural network with dynamic behavioral models. Several factors and control strategies affect the virus spread, and the uncertainty arising from confounding variables underlying the spread of the Covid-19 infection is substantial. The proposed model considers the effect of multiple factors to enhance the accuracy in predicting the number of cases and deaths across the top ten most-affected countries and Australia. The results show that the proposed model closely replicates test data. It not only provides accurate predictions but also estimates the daily behavior of the system under uncertainty. The hybrid model outperforms the LSTM model when only limited data are available. The parameters of the hybrid models were optimized using a genetic algorithm for each country to improve the prediction power while considering regional properties. Since the proposed model can accurately predict Covid-19 spread under consideration of containment policies, it can be used for policy assessment, planning and decision-making.




I Introduction

The outbreak of coronavirus disease 2019 (Covid-19) has exposed the world to great challenges and is a serious concern for public health. The outbreak started in Wuhan, China, in December 2019 [1, 18] and within a few weeks it spread across the globe. This caused policy changes regarding the control of the spread. There is a lack of information and considerable uncertainty about this outbreak, making it important to understand its dynamic behavior. Forecasting the outbreak’s behavior over time can provide useful insights into the epidemiological situation [6] and determine whether the pandemic has been brought under control by mitigation measures [26, 10]. Research is currently forecasting changes in infectious diseases [29], predicting the international spread of outbreaks [7], and assessing the impacts of alternative interventions during pandemics [17].

However, research is faced with many challenges, particularly related to time. There may be delays in the presentation of symptoms due to the incubation cycle and delays in verifying detection and testing events. Delays and uncertainties can be taken into account by models, especially those stemming from normal infection histories and reporting processes [24]. Besides, some aspects of outbreak dynamics can be biased, incomplete, or only reported by individual data sources. There is evidence that synthesis approaches will permit a more robust estimate of the dynamics underlying the transmission based on noisy data [4, 2].

In order to determine the potential trajectory of the disease in accordance with the evidence, a dynamic model can be used. Such models predict, for example, how an infection progresses, how the number of cases/deaths is affected, or how long an outbreak lasts. In traditional approaches, SEIR models (representing the population groups susceptible, exposed, infectious and recovered) have been used to analyze the spread of Covid-19 [18, 12]. Such models include feedback that regulates endogenous changes in contact rate, testing, diagnostics, and reporting in response to risk perception and other relevant factors. In these compartmental models [18], populations are divided into compartments and people are modelled by (ordinary) differential equations within the four SEIR groups.

Generative models reflect another broad variety of models with causal effects (using hidden states and parameters). For example, a generative model simulates the dynamics of effects in a group or population (i.e., new Covid-19 cases over time) [9]. These methods can measure the impact of policies (e.g., social distancing) and demographic variations (e.g., public immunity) in order to anticipate what might happen in a particular area under various conditions [8]. Accordingly, SEIR-like models have been used to imitate disease outbreaks by, for example, estimating the parameters of Bayesian Markov-Chain Monte Carlo (MCMC) models [5, 21] or comprehensively discussing scenarios [27, 25]. This family of models also has recently played a dominant role in studying the overall outbreak of the coronavirus, from inference [18, 19] to scenario prediction [1] under control strategies [32].

Here, we propose a model of time-dependent spreading movement. The time dependence is determined through potential changes representing human and system behaviors. A hybrid dynamic model is therefore proposed to obtain a robust estimate based on regional properties. It includes three main parts: a task model, a facility model, and a dynamic motion model. Together, the models estimate the behavior of the Covid-19 outbreak in a particular region. The task model represents general public behavior, such as public knowledge and compliance with rules. The facility model considers facilities such as hospitals, including emergency and inpatient departments, and medical staff and their knowledge, rules, and skills. The dynamic motion model predicts the time-varying cases and deaths in the nominated country. It uses an artificial neural network based on long short-term memory (LSTM) to update itself in line with stochastic behaviors and uncertainties in the proposed framework. Our framework is structured to assess the efficacy of past interventions and to analyze future scenarios under uncertainty. This model can be easily adapted to any country or region. The top-ten most-affected countries and Australia are studied with data collected between 31 December 2019 and 19 April 2020, obtained from the European Centre for Disease Prevention and Control (ECDC). Furthermore, 90% of this data is used to train the hybrid dynamic model and 10% is used to analyze the performance of the model in the proposed framework.
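The 90/10 division of the data can be illustrated with a short sketch. It assumes the split is chronological (earlier days for training, the most recent days for testing), which is the usual choice for time series but is not stated explicitly in the text; the function name and the sample values are hypothetical.

```python
def chronological_split(series, train_fraction=0.9):
    """Split a time-ordered sequence into train and test parts without shuffling."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be between 0 and 1")
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]


# Hypothetical daily case counts (illustrative values, not ECDC data).
daily_cases = list(range(100, 200))  # 100 days
train, test = chronological_split(daily_cases)
print(len(train), len(test))
```

Keeping the order intact means the test set always lies in the future relative to the training set, so the evaluation mimics a real forecast.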

This paper is organized as follows. Section II describes the methods. The problem formulation is explained and introduced in Section III. The results and discussion are presented in Section IV. Finally, the paper ends with a conclusion.

II Method

Recently, deep learning has garnered the attention of many researchers in different areas. A deep learning model usually defines multiple layers in its architecture and uses a stochastic optimization algorithm to calculate the weight and bias parameters for each layer. Because the architecture is designed to perform machine learning tasks, the depth (i.e., the number of hidden layers) is directly correlated with the learning ability. In particular, LSTM, a form of recurrent neural network (RNN), is able to update during the sequence of learning as it has feedback connections, unlike a feedforward neural network. This is the key point in using LSTM to forecast with time series data. For example, it has been applied to many areas such as image captioning [30, 15, 22], natural language translation [28], and speech recognition [13]. Many of these active areas focus on classification, and applications to forecasting/regression models are relatively limited.

This paper aims to make a hybrid model based on LSTM and a dynamic behavioral model to forecast the spread of Covid-19 across the world from a dynamic modelling perspective. The dynamic behavioral model is introduced to describe and predict the interactions between multiple components of a phenomenon that are viewed as a system, which includes many inputs and outputs interacting over time. The dynamic model focuses on the mechanism of how the components and system evolve across time. Therefore, dynamic modeling allows us to bridge the gap between conceptualizing the phenomena of dynamic behavior and particular phenomena. Dynamic system modeling is used in many academic fields, originating in mathematics and physics before being adopted in the life, social, and behavioral sciences. It is clear that dynamic system models combined with machine learning techniques can play an essential role in data analysis and the way theories are conceived and developed.

In order to forecast Covid-19, the proposed hybrid model consists of three main modules: 1) time series/sequence learning (i.e., LSTM) with time-variant dynamics, 2) public behavior and 3) a behavioral model of the system. Figure 1 shows that Covid-19 forecasting includes three main sections. First, the task model describes external conditions and reference inputs during the Covid-19 epidemic. This module can describe factors such as social distancing, social knowledge, self-isolation, etc. People decide whether to conform to tasks or not, observing both external conditions and the current situation. Additionally, the reference inputs are the ideal states of public behavior during the Covid-19 epidemic, that is, how people should behave to control the epidemic. Second, the facility model describes Covid-19 control behavior during the outbreak. This model focuses on technologies, supplies, professional personnel, etc. Such facilities can reduce the number of deaths and increase the number of discharged patients. In this model, personnel are modeled based on human features such as workload, fatigue, and conditions that affect the performance of medical staff. Lastly, dynamic motion is modelled using an artificial neural network based on LSTM to predict numbers of cases and deaths. The details of each model are introduced in the following sections.

Fig. 1: Hybrid Model of Covid-19 forecast

II-A LSTM Model

LSTM is a particular form of RNN capable of learning long-term dependencies, and has fundamental differences to a conventional feedforward neural network. LSTMs are sequence-based models that are able to establish the temporal correlations between previous information and current circumstances. In time series problems, like forecasting the spread of Covid-19, using a sequence-based model such as an LSTM means that the decision made at time $t$ affects the decision it will make at the next time step, $t+1$. This feature (i.e., feedback connections) plays an important role in imitating the system’s dynamic motion, since it takes daily information into account when the subsequent information is entered.

Under back-propagation through time, RNNs struggle with long-range dependencies because of vanishing and exploding gradients [16]. Gradient vanishing in an RNN refers to the norm of the gradient for long-term components decreasing exponentially fast to zero, which limits the model’s ability to learn long-term temporal correlations, while gradient exploding refers to the opposite event. LSTM was introduced to address this issue [14], and the forget gate in the LSTM architecture further boosts the performance of the model [20]. This feature opens a new avenue for many sequence-learning applications.

Here, the general structure of LSTM is described with naming similar to that of Ref. [11]. Let $\{x_1, x_2, \ldots, x_T\}$ represent a sequence input for an LSTM model (the general structure is illustrated in Fig. 2), where $x_t \in \mathbb{R}^k$ is a $k$-dimensional real vector at the $t$-th time step.

In order to establish temporal connections, the LSTM defines and maintains an internal memory cell state $s_t$ throughout the whole life cycle, which is the most important element of the LSTM structure. The memory cell state interacts with the intermediate output $h_{t-1}$ and the subsequent input $x_t$ to determine which elements of the internal state vector should be updated, maintained or erased based on the outputs of the previous time step and the inputs of the present time step. In addition to the internal state, the LSTM structure also defines an input node $g_t$, input gate $i_t$, forget gate $f_t$, and output gate $o_t$. Equations 1 to 6 give the formulations for all nodes in an LSTM structure.

$$g_t = \phi(W_{gx} x_t + W_{gh} h_{t-1} + b_g) \tag{1}$$
$$i_t = \sigma(W_{ix} x_t + W_{ih} h_{t-1} + b_i) \tag{2}$$
$$f_t = \sigma(W_{fx} x_t + W_{fh} h_{t-1} + b_f) \tag{3}$$
$$o_t = \sigma(W_{ox} x_t + W_{oh} h_{t-1} + b_o) \tag{4}$$
$$s_t = g_t \odot i_t + s_{t-1} \odot f_t \tag{5}$$
$$h_t = \phi(s_t) \odot o_t \tag{6}$$

where $W_{gx}$, $W_{gh}$, $W_{ix}$, $W_{ih}$, $W_{fx}$, $W_{fh}$, $W_{ox}$, and $W_{oh}$ are weight parameters for the corresponding inputs of the network activation functions, $b_g$, $b_i$, $b_f$, and $b_o$ are the corresponding biases, $\odot$ denotes element-wise multiplication, and $\sigma$ and $\phi$ are a sigmoid function and $\tanh$, respectively. The sigmoid function, with an output range of $[0, 1]$, works as a soft switch for the forget gate ($f_t$), input gate ($i_t$), and output gate ($o_t$), while the input node ($g_t$) is squashed by $\tanh$. This means that each gate is a decision-making point determining whether the signal/sequencing data should pass or not. For example, if the output of the sigmoid function is zero, no signal passes on to the prediction. Thus, all nodes (forget gate, input gate, input node and output gate) depend directly on the current input $x_t$ and the previous output $h_{t-1}$.

The input gate decides what to maintain in the internal state, while the forget gate decides how much of the previous state ($s_{t-1}$) is carried forward. Once the internal state has been updated, the output gate determines which part of the internal state passes as the LSTM output $h_t$. This process then repeats for the next time step. All the weights and biases are learned by minimizing the differences between the LSTM outputs and actual training samples. In addition, information from the current time step can be stored and maintained to affect the LSTM output of future time steps. Here, LSTM is designed to estimate the movement of Covid-19’s spread with consideration of uncertainties. The stochastic behavior of the system is modeled as a dynamic behavioral model to update the LSTM states over time.
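The gate computations described in this section can be sketched in plain Python. This is a minimal illustration of one forward time step of a standard LSTM cell, not the authors' implementation. The names `g`, `i`, `f`, `o`, `s`, and `h` (input node, input gate, forget gate, output gate, internal state, output) follow a common convention and are assumptions, as are the weight shapes and the random initialization.

```python
import math
import random


def sigmoid(z):
    """Logistic function with output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def affine(W_x, x, W_h, h, b):
    """Compute W_x @ x + W_h @ h + b using plain lists."""
    return [
        sum(w * v for w, v in zip(row_x, x))
        + sum(w * v for w, v in zip(row_h, h))
        + bias
        for row_x, row_h, bias in zip(W_x, W_h, b)
    ]


def lstm_step(x, h_prev, s_prev, params):
    """Advance one time step; params maps 'g', 'i', 'f', 'o' to (W_x, W_h, b)."""

    def node(name, activation):
        W_x, W_h, b = params[name]
        return [activation(z) for z in affine(W_x, x, W_h, h_prev, b)]

    g = node("g", math.tanh)  # input node: candidate values
    i = node("i", sigmoid)    # input gate: how much of g to write
    f = node("f", sigmoid)    # forget gate: how much of s_prev to keep
    o = node("o", sigmoid)    # output gate: how much of the state to expose
    s = [gk * ik + sk * fk for gk, ik, sk, fk in zip(g, i, s_prev, f)]
    h = [math.tanh(sk) * ok for sk, ok in zip(s, o)]
    return h, s


def random_params(n_in, n_hidden, rng):
    """Small random weights and zero biases (illustrative initialization only)."""

    def matrix(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    return {
        name: (matrix(n_hidden, n_in), matrix(n_hidden, n_hidden), [0.0] * n_hidden)
        for name in "gifo"
    }


rng = random.Random(0)
params = random_params(n_in=2, n_hidden=3, rng=rng)
h, s = [0.0] * 3, [0.0] * 3
for x in ([0.1, 0.2], [0.3, 0.1], [0.5, 0.4]):  # toy three-step input sequence
    h, s = lstm_step(x, h, s, params)
print(h)
```

The state `s` is carried from one step to the next, which is how information from earlier days can influence later outputs. In practice a deep learning library would be used for training; the list-based arithmetic here only keeps the sketch dependency-free.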

Fig. 2: General Structure of LSTM

II-B Dynamic Behavioral Model

The dynamic behavioral model describes the behavior of a system in terms of how its elements interact with each other. These interactions provide the functionality of the system and apply to both systems and subsystems. A behavioral model represents the temporal behavior between different subsystems as each entity interacts over time. Thus, the interactions among subsystems, the functionality of each entity over time, and the setting of roles are modeled as dynamic behavior, because the system performance is determined by the engagement of each module over time.

Generally, systems and subsystems interact to accomplish a purpose by exchanging information to submit roles with expectations, from which functions are presented to ensure that action is taken. Also, an interaction over time involves communication (i.e., signals in time-series interactions), transfer of knowledge, receiving and collecting data. Thus, data manipulation/changing system behavior over time is introduced as a dynamic role. A dynamic behavioral model is constructed to demonstrate one or more interactions within the system that are responsible for task accomplishment. Accordingly, general public behavior, skilled medical staff, and high-quality hospitals cooperate as a dynamic system to manage the spread of Covid-19 in an area.

III Problem Formulation

The problem is to forecast the number of cases and deaths during the Covid-19 pandemic. The hybrid model using LSTM and the dynamic behavioral model is introduced to achieve good predictions. Covid-19 forecasts help direct attention to periods when the number of cases is increasing. In this section, the structure of the proposed framework (see Fig. 1) is introduced. Also, the formulations of the modules involved, such as the task model, medical staff model, hospital as facility model, and dynamic motion model, are presented in detail.

III-A Task Model

In the proposed framework, the task model describes the behavior of the public during the Covid-19 outbreak. It is categorized into two groups: an external condition and a reference input. The external condition provides all the environmental factors related to people deciding whether to maintain social distancing or not. The reference input describes the ideal state in the country, such as public knowledge, keeping updated with the latest news, and government rules. The task model is highly uncertain because it depends on public behavior. Thus, the uncertainties are modeled by stochastic colored noise.

The colored noise was generated from Gaussian-distributed white noise with zero mean. Therefore, the dynamics are presented in Eq. 7.



where is the task model and

is the white noise. Colored noise is the spectral density calculated by the Fourier-transform of the auto-correlation of white-noise (

). The auto-correlation is formulated in Eq. 8


where and are the noise intensity and correlation time, respectively. The Laplace transform of the auto-correlation of introduced colored noise is expressed as Eq. 9.


where and . The transfer function used to model the task model is . It is obvious that the colored noise determines the uncertain behavior in terms of the physical phenomena of the motional system.
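As a hedged illustration of how white noise can be shaped into colored noise, the sketch below assumes an exponentially correlated (Ornstein-Uhlenbeck) process with noise intensity `D` and correlation time `tau`, for which the stationary auto-correlation is $(D/\tau)\,e^{-|t-t'|/\tau}$. This is a common textbook form; whether it matches the paper's Eqs. 7 to 9 exactly cannot be confirmed from the text, and all parameter values are hypothetical.

```python
import math
import random


def colored_noise(n_steps, dt, intensity, corr_time, seed=0):
    """Euler-Maruyama simulation of d(eta)/dt = -eta/tau + sqrt(2D)/tau * xi(t)."""
    rng = random.Random(seed)
    eta = 0.0
    path = []
    # Standard deviation of the white-noise kick accumulated over one time step.
    kick = math.sqrt(2.0 * intensity * dt) / corr_time
    for _ in range(n_steps):
        eta += -eta / corr_time * dt + kick * rng.gauss(0.0, 1.0)
        path.append(eta)
    return path


samples = colored_noise(n_steps=1000, dt=0.01, intensity=1.0, corr_time=0.5)
print(samples[:3])
```

The first-order filter corresponds to a transfer function with a constant numerator and a first-order denominator, in line with the "colored noise numerator" and "colored noise denominator" parameters listed in Table I.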

III-B Medical Staff Model

According to McRuer’s crossover model [23], humans in a dynamic system behave in a way that results in an open-loop transfer function, which is formulated in Laplace form in Eq. 10.


where and represent human and plant transfer functions, respectively. is the crossover frequency that describes human operations and adaptation during a compensatory situation.

Figure 3 shows that the human behavior model is formulated with three elements: delay in performance, equalization form, and medical staff (according to their knowledge, rules, and skills). Thus, the model is generalized and adjusted by using a describing function form and a mitigating set of rules of human characteristics, such as knowledge, rules, and skills.

Fig. 3: Human Behavior Block Diagram

The generalized/adjusted human function, according to McRuer’s crossover model, is formulated in Eq. 11.


where , and are the gain, time-lead and time-lag constant in the equalization form of the model shown in Fig. 3. Here, represents the delay in staff response, which is described by the time-constant (). is the transfer function of medical staff according to their knowledge, rules, and skills.
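The crossover model and the equalization form can be explored numerically. The sketch below assumes the commonly quoted forms: an open loop $\omega_c e^{-\tau s}/s$ for the crossover model, and a human describing function $K\,(T_L s + 1)/(T_I s + 1)\,e^{-\tau s}$ with gain, time-lead, time-lag and delay. The symbols, function names and values are assumptions for illustration, and the additional medical staff transfer function is omitted.

```python
import cmath


def human_describing_function(omega, gain, t_lead, t_lag, delay):
    """Evaluate K (T_L s + 1) / (T_I s + 1) * exp(-tau s) at s = j*omega."""
    s = 1j * omega
    return gain * (t_lead * s + 1.0) / (t_lag * s + 1.0) * cmath.exp(-delay * s)


def crossover_open_loop(omega, omega_c, delay):
    """Evaluate the crossover-model open loop omega_c * exp(-tau s) / s at s = j*omega."""
    s = 1j * omega
    return omega_c * cmath.exp(-delay * s) / s


# Illustrative values only; none are taken from the paper.
omega_c = 2.0  # crossover frequency in rad/s
print(abs(crossover_open_loop(omega_c, omega_c, delay=0.2)))
print(abs(human_describing_function(0.0, gain=3.0, t_lead=0.5, t_lag=0.1, delay=0.2)))
```

At the crossover frequency the open-loop magnitude is one, and a pure delay changes only the phase, not the magnitude, which is why the delay can be tuned separately from the gain.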

Having generalized and adjusted the human model, knowledge, rules, and skill play important roles in staff performance. In this regard, the model consists of four modules: human senses such as visual and audio recognition, a cognitive model, plans and rules for different forms of tasks, and medical staff’s physical actions and performance efficiency. Figure 4 illustrates the general model of medical staff.

Fig. 4: General Model for Knowledge, Rules and Skills

Human behavior and performance are highly correlated with visual and auditory recognition. These human senses are heavily dependent on workload, fatigue, and working hours. In the time-domain, the function of the human senses is modeled as an exponential function (see Eq. 12).


where , and are constant parameters in the proposed model. Laplace transform is applied to fully describe the behavior of the system. Thus, the Laplace transform of the human sense model (i.e., transfer function of human senses) is calculated as Eq. 13.


To put Eq. 13 in a familiar form, some substitutions are made (see Eqs. 14 to 16).


Alternatively, the familiar form of the human sense transfer function can be simplified as in Eq. 17.


where is the gain in the system, which describes the quality of human senses while submitting jobs. and represents the time-lead and time-lag in the proposed module. The signal output from the human sense module passes through the cognitive model, bridging the gap between understanding what to preserve and what to perform under some circumstances. As per the system defined in Fig. 4, an expression of the cognitive module is related to system outputs and inputs representing the conservation-of-mass principle. Besides, the cognitive model is formulated as per Eq. 18.



is the probability of understanding the task correctly in uncertain circumstances. The term

represents the length of the cognitive process in the human mind, which can be formulated as the input coming from the human senses with the delay in output considering the length of . Therefore, the formulation and transfer function of the cognitive model are as presented in Eq. 19 and Eq. 20.


The Laplace transform is as below.


The transfer function is simplified in Eq. 21.


After the cognitive model has been defined, medical staff need to make a plan for their decision. Here, they need to plan and obey the rules, which limits what they can do in their jobs. To capture this limit, a saturation function in Laplace form is considered.

A saturation function is introduced to define a threshold in the system’s response when the input exceeds the limit. At this time, the output becomes constant at the highest level of the threshold. Thus, the plans and rules module in the proposed system can be mathematically expressed as in Eq. 22 .


where is the transfer function of the plans and rules module represented in Fig. 4.
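In the time domain, the saturation behavior described above amounts to clamping a signal between limits. The following sketch uses hypothetical bounds and a hypothetical workload signal; the paper's actual saturation parameters are those tuned per country.

```python
def saturate(u, lower, upper):
    """Clamp u to the interval [lower, upper]."""
    if lower > upper:
        raise ValueError("lower must not exceed upper")
    return max(lower, min(upper, u))


# Hypothetical workload signal limited to at most 8 units.
workload = [2.0, 6.5, 9.0, 12.0]
print([saturate(u, 0.0, 8.0) for u in workload])  # [2.0, 6.5, 8.0, 8.0]
```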

Finally, the performance and physical actions of medical staff are like a second-order dynamic system. According to the McRuer crossover theorem, the proposed model can be formulated by Eq. 23.


where , , , are constant parameters related to the nature of the system. is the performance response when the plans and rules are followed by the staff. The Laplace transform of this proposed model is calculated as per Eq. 24.


The transfer function is replaced with a similar form as that in Eq. 25.


where and are the natural frequency and damping ratio, respectively. These parameters define how the dynamic model can behave in the system.
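To see how the natural frequency and damping ratio shape the response, the sketch below integrates the standard second-order system $\omega_n^2 / (s^2 + 2\zeta\omega_n s + \omega_n^2)$ for a unit step input. The explicit Euler scheme and the parameter values are assumptions chosen for illustration.

```python
def second_order_step(omega_n, zeta, t_end=10.0, dt=0.001):
    """Unit-step response of x'' + 2*zeta*omega_n*x' + omega_n**2 * x = omega_n**2 * u."""
    x, v = 0.0, 0.0
    trajectory = []
    for _ in range(round(t_end / dt)):
        accel = omega_n ** 2 * (1.0 - x) - 2.0 * zeta * omega_n * v
        x += v * dt      # position update uses the velocity from the previous step
        v += accel * dt  # velocity update uses the acceleration computed above
        trajectory.append(x)
    return trajectory


response = second_order_step(omega_n=2.0, zeta=0.7)
print(response[-1], max(response))
```

With a damping ratio below one the response overshoots the target before settling; a larger damping ratio gives a slower response without overshoot.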

III-C Hospital Model

When viewing a hospital as a system, it can be divided into two main departments, emergency and inpatient, which are the key departments that affect the quality and timeliness of patient care. Emergency capacity must be flexible throughout the day as patients arrive according to a non-homogeneous arrival pattern. Also, the inpatient sector usually focuses on maintaining bed occupancy levels to improve efficiency in terms of utilizing resources. The proposed model describes the maximum occupancy level and planned capacity.

The proposed hospital model consists of three patient arrival sources: medical direct admissions, emergency walk-in patients, and emergency ambulance arrivals. When the emergency department is crowded with a considerable number of walk-in and ambulance arrival patients, ambulance arrivals might be diverted. Under emergency diversion, the number of patient arrivals is reduced as ambulances are rerouted to nearby alternative hospitals. At this point, emergency walk-in patients must still be registered. Besides, medical direct admission patients are sent directly to the inpatient department upon arrival at the hospital. Similar to ambulance arrivals, accepting patients to the inpatient department is heavily dependent on the availability of beds. In this regard, diversion might be performed when crowding becomes a problem. Note that the inpatient department includes many individual sectors/units, such as the intensive care unit (ICU), telemetry units, medical/surgical units, etc. However, this work focuses on the performance of hospitals with high-demand units under conditions of a Covid-19 outbreak. Figure 5 presents the general model of a hospital during a Covid-19 pandemic with consideration of high-demand departments such as emergency and inpatient departments.

Fig. 5: General Model of Hospital

According to the conservation-of-mass principle, continuous-time patient flow in the proposed hospital model can be formulated using differential equations according to disturbances and manipulated variables to measured outputs. Equations 26 to 31 are formulated to describe changes and hospital tasks. Ambulance arrivals are correlated with hospital decisions of accepting or diverting some or all patients to nearby hospitals. This matter is formulated as Eq. 26.


where is ambulance diversion and describes the decision between ambulance arrival or ambulance diversion. Additionally, the emergency queue is a module that makes decisions about transferring patients from the emergency queue to service delivery for each individual patient. The mathematical expression is formulated in Eq. 27.


where is the emergency queue and and are the time-varying rates of ambulance and walk-in patient arrivals, respectively. and are modeled as colored noise due to a lack of information about the rate of arrivals. is similar to and represents the decision to transfer patients in emergency sector to visiting patients or delivering services.

After completing emergency arrival, it is time to perform services in the emergency department. Here, the length of treatment is a crucial consideration. Equation 28 shows the duration of progress in emergency treatment.


where and are the emergency services and length of treatment, respectively. Emergency holding is a decision about whether to hold patients in the emergency department or transfer them to inpatient beds. At this point, medical staff decide to transfer emergency patients directly to the inpatient department for further care. Equation 29 describes the emergency holding decision.


where represents emergency holding and is the decision to transfer a patient in the holding sector to the inpatient department. is the probability that emergency patients are admitted directly to the inpatient sector. Also, the rate of medical admission is modeled as a decision to deliver services directly to the patient (see Eq. 30).


where and are the medical diversion and direct admission decisions, respectively.

Finally, the hospital model describes the number of patients being treated in the inpatient department. The length of treatment in this sector is of primary importance. Therefore, controlling inpatient stays and outpatient services directly influences the number of Covid-19 cases. In this regard, the model is formulated as per Eq. 31.


where represent inpatient services. and are the time-varying rates of direct patient admissions (i.e., modeled as colored noise) to the inpatient section and the length of treatment in the inpatient department, respectively. The Laplace transform of each module is represented in Eq. 32 to Eq. 37.
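The conservation-of-mass idea behind the patient-flow equations can be sketched with a deliberately simplified two-compartment balance. This is not a reproduction of Eqs. 26 to 31: it collapses the emergency queue, services and holding into one compartment, uses constant arrival rates instead of colored noise, and every name and value is hypothetical.

```python
def simulate_hospital(days, dt, amb_rate, walk_rate, direct_rate,
                      divert_frac, p_admit, t_emergency, t_inpatient):
    """Euler integration of a two-compartment patient-flow balance (patients per day)."""
    emergency, inpatient = 0.0, 0.0
    for _ in range(round(days / dt)):
        inflow = amb_rate * (1.0 - divert_frac) + walk_rate  # arrivals after diversion
        treated = emergency / t_emergency                     # leaving emergency per day
        discharged = inpatient / t_inpatient                  # leaving inpatient per day
        emergency += (inflow - treated) * dt
        inpatient += (p_admit * treated + direct_rate - discharged) * dt
    return emergency, inpatient


emergency, inpatient = simulate_hospital(
    days=200, dt=0.01, amb_rate=20.0, walk_rate=30.0, direct_rate=5.0,
    divert_frac=0.25, p_admit=0.2, t_emergency=0.5, t_inpatient=10.0,
)
print(emergency, inpatient)
```

In steady state each compartment holds its inflow multiplied by the length of treatment, so shortening the length of treatment or diverting arrivals directly lowers occupancy. This is the lever that the length-of-treatment and transfer-probability parameters in Table I represent.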


III-D Dynamic Motion

The dynamic motion model is introduced to describe time-varying cases and deaths. After complete modeling of the public tasks and facilities, such as hospitals and medical staff behavior, movement is of importance in the proposed model. The time-varying movement depends on the proposed task and facility models (see Fig. 1). The proposed framework is determined by how public tasks, facilities, and medical staff behavior affect the numbers of cases and deaths during a Covid-19 outbreak. In this dynamic motion module, LSTM is utilized as a dynamic model in the proposed system. LSTM is a time-series model, which is able to estimate the temporal correlations acquired by the simulation model and real data from previous and current circumstances. Therefore, the decision the LSTM makes at time $t$ affects the decision it will make at the next time step, $t+1$. The LSTM’s feedback connections imitate the system’s dynamic motion, since it takes daily information into account when the subsequent information is entered. Here, the proposed LSTM architecture is shown in Fig. 6.

Fig. 6: LSTM Architecture in Hybrid Model

IV Results and Discussion

In this section, the performance of the proposed hybrid model is evaluated using the latest available public data on Covid-19. From the worldwide distribution of Covid-19 cases, the top-ten most-affected countries, together with Australia, were chosen as a case study. The regional properties used in the proposed model are fundamentally different from one another because of the different infrastructure of each region. The model properties, listed in Table I, can be measured by two approaches: first, by extracting the impact of each property using a heuristic optimization algorithm [31]; second, by utilizing actual data and the observed behavior of the whole system, if available. In this paper, a genetic algorithm (GA) was applied to actual Covid-19 distribution data to determine the impacts of the different modules in the proposed hybrid model. Dynamic motion is represented by the LSTM, whose performance depends strongly on the sequence length used in training. The sequence length plays an essential role in the prediction of Covid-19’s spread because mitigation measures and policies are being improved daily. In this regard, sequence lengths of 1 to 10 days are used as the primary parameters in training the LSTM.
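The role of the sequence length can be made concrete with a sliding-window sketch: each training sample consists of the previous `seq_len` days, and the target is the following day. The function name and the toy series are hypothetical, and the paper does not specify how its windows were built.

```python
def make_windows(series, seq_len):
    """Turn a time series into (input window, next value) training pairs."""
    if seq_len < 1 or seq_len >= len(series):
        raise ValueError("seq_len must be at least 1 and shorter than the series")
    inputs, targets = [], []
    for start in range(len(series) - seq_len):
        inputs.append(series[start:start + seq_len])
        targets.append(series[start + seq_len])
    return inputs, targets


# Toy series of ten daily values (illustrative only).
series = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
X, y = make_windows(series, seq_len=7)
print(len(X), X[0], y[0])  # 3 [1, 2, 3, 4, 5, 6, 7] 8
```

A longer sequence length gives the network more context per sample but leaves fewer samples to train on, which matters when only a few months of data are available.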

Variable Description
Colored noise numerator
Colored noise denominator
Gain in equalization
Time-lead in equalization
Time-lag in equalization
Delay time in staff response
Gain in the human sense model
Time-lead in the human sense model
Time-lag in the human sense model
Probability of understanding task
Delay time in cognitive model
Saturation parameters in plans and rules model
Natural frequency of medical staff behavior
Damping ratio of medical staff behavior
Length of treatment in the emergency department
Length of treatment in the inpatient department
Probability of emergency patients being transferred to the inpatient department
TABLE I: The parameters of the hybrid model

The boxplot (see Fig. 7) shows the distribution of RMSEs across the different countries for each sequence length. A seven-day sequence length was found to provide lower variability and less extreme outliers than other sequence lengths when both cases and deaths are considered. Thus, a seven-day sequence length was selected to analyze the impacts of the different modules in the proposed hybrid model.

Fig. 7: The Impact of Sequence Length Across the Top Ten Most-Affected Countries

Table II lists 11 different combinations, from pure LSTM to the whole hybrid model. Generally, the proposed framework includes three main models: the Task Model (TM), Facilities Model (FM), and Dynamic Motion Model (DM). The TM includes external conditions and reference input, while the FM consists of two main sections: medical staff and hospital models. The medical staff model considers human senses, a cognitive model, plans and rules, and performance and physical actions, while the hospital is modeled in terms of direct medical registration, ambulance arrivals, emergency walk-ins, and emergency services. The DM represents dynamic movement over time as modelled by LSTM. Here, the USA, the most-affected country, was chosen to investigate the importance of the modules in the hybrid model.

A genetic algorithm (GA) was used to find the best parameters for each model. Accordingly, the GA was performed to minimize the root mean square error (RMSE) between the predicted and observed data. The RMSE is calculated as follows:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}$$

where $y_i$ and $\hat{y}_i$ are the observed and predicted values, respectively, and $n$ is the number of observations.
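The RMSE objective and a GA-style search can be sketched together. The GA below is a minimal real-coded variant (tournament selection, blend crossover, Gaussian mutation, elitism) fitted to a toy exponential-growth model. The operators, population size, number of generations and the toy model are all assumptions for illustration and do not reproduce the paper's GA settings.

```python
import math
import random


def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    if not observed or len(observed) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


def genetic_minimize(fitness, bounds, pop_size=30, generations=60, seed=0):
    """Tiny real-coded GA: tournament selection, blend crossover, mutation, elitism."""
    rng = random.Random(seed)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    for _ in range(generations):
        next_pop = [min(pop, key=fitness)[:]]  # elitism: carry over the current best
        while len(next_pop) < pop_size:
            parent_a = min(rng.sample(pop, 3), key=fitness)
            parent_b = min(rng.sample(pop, 3), key=fitness)
            w = rng.random()
            child = []
            for a, b, (lo, hi) in zip(parent_a, parent_b, bounds):
                gene = w * a + (1.0 - w) * b
                if rng.random() < 0.2:  # mutate roughly one gene in five
                    gene += rng.gauss(0.0, 0.1 * (hi - lo))
                child.append(max(lo, min(hi, gene)))
            next_pop.append(child)
        pop = next_pop
    return min(pop, key=fitness)


# Toy problem: recover (a, r) of y = a * exp(r * t) from noise-free synthetic data.
days = list(range(30))
observed = [5.0 * math.exp(0.1 * t) for t in days]


def model_rmse(params):
    a, r = params
    return rmse(observed, [a * math.exp(r * t) for t in days])


best = genetic_minimize(model_rmse, bounds=[(0.0, 10.0), (0.0, 0.3)])
print(best, model_rmse(best))
```

Because the best individual is always copied into the next generation, the best RMSE never gets worse from one generation to the next.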

To assess the statistical significance of the reduction in RMSE in hybrid models compared to LSTM, each module was evaluated 500 times after hyper-parameter tuning, and the corresponding RMSE distribution was used to estimate the confidence interval (CI) and the t-test p-values for the differences between stage 11 and the other stages. Table II and Fig. 8 clearly show the cumulative effect. As seen in Fig. 8, adding TM (stage 2) significantly improved the accuracy (i.e., better than LSTM alone). Then, adding more modules (i.e., greater similarity to the real environment) on top of TM (i.e., stages 3 to 11) improved the model’s performance in comparison with the previous stage.
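A comparison of two RMSE distributions along these lines can be sketched with the standard library. The sketch assumes a Welch-type t statistic and, because each sample has 500 runs, a normal approximation for both the 95% CI and the two-sided p-value. The paper does not state which t-test variant was used, and the synthetic RMSE values below are placeholders, not results.

```python
import math
import random
import statistics


def mean_ci95(samples):
    """Sample mean with a normal-approximation 95% confidence interval."""
    m = statistics.fmean(samples)
    half = 1.96 * statistics.stdev(samples) / math.sqrt(len(samples))
    return m, (m - half, m + half)


def welch_t_pvalue(a, b):
    """Welch t statistic with a large-sample (normal) two-sided p-value."""
    var_a = statistics.variance(a) / len(a)
    var_b = statistics.variance(b) / len(b)
    t = (statistics.fmean(a) - statistics.fmean(b)) / math.sqrt(var_a + var_b)
    p = math.erfc(abs(t) / math.sqrt(2.0))
    return t, p


# Synthetic stand-ins for 500 repeated RMSE evaluations of two stages.
rng = random.Random(0)
lstm_rmse = [rng.gauss(10.0, 1.0) for _ in range(500)]
hybrid_rmse = [rng.gauss(9.0, 1.0) for _ in range(500)]

print(mean_ci95(hybrid_rmse))
print(welch_t_pvalue(lstm_rmse, hybrid_rmse))
```

For small samples the normal approximation should be replaced by the t distribution, for example through a statistics library.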

Stage | Model | Module | RMSE, Case (Mean, Std, 95% CI, p-value) | RMSE, Death (Mean, Std, 95% CI, p-value)
1 | DM | LSTM | |
2 | TM, DM | External Condition, Reference Input, LSTM | |
3 | TM, FM, DM | TM, Human Sense, LSTM | |
4 | TM, FM, DM | TM, Human Sense, Cognitive Model, LSTM | |
5 | TM, FM, DM | TM, Human Sense, Cognitive Model, Plans and Rules, LSTM | |
6 | TM, FM, DM | TM, Human Sense, Cognitive Model, Plans and Rules, Human Performance, LSTM | |
7 | TM, FM, DM | TM, Medical Staff, Ambulance Arrivals, LSTM | |
8 | TM, FM, DM | TM, Medical Staff, Ambulance Arrivals, Emergency Walk-in, LSTM | |
9 | TM, FM, DM | TM, Medical Staff, Ambulance Arrivals, Emergency Walk-in, Medical Direct Registration, LSTM | |
10 | TM, FM, DM | TM, Medical Staff, Ambulance Arrivals, Emergency Walk-in, Medical Direct Registration, Emergency Services, LSTM | |
11 | TM, FM, DM | TM, FM (Medical Staff, Hospital), LSTM | p-value: - | p-value: -

TABLE II: Cumulative effect of the modules on accuracy, with CI estimates and t-test p-values comparing stage 11 with the other stages

The cumulative results show that greater similarity to the real environment yields a more accurate model. Figure 8 demonstrates that the whole modeled system, encompassing everything from public behavior to hospital performance, not only reaches significant accuracy but also imitates changes in Covid-19 over time. For instance, Fig. 9 compares LSTM with the proposed model and shows that the hybrid model estimates real ambient behavior more accurately.

Fig. 8: Cumulative Module Effect
Fig. 9: Performance of Hybrid Model for USA

The parameters of the hybrid model were tuned by the GA for the ten most-affected countries, namely the United States of America (USA), Spain (ESP), Italy (ITA), France (FRA), Germany (DEU), United Kingdom (GBR), Turkey (TUR), Iran (IRN), China (CHN), and Russia (RUS), and for Australia (AUS). The tuned parameters are listed in Table III. The results show that the hybrid model outperforms the LSTM model. The performance of the hybrid model in testing and forecasting is presented in Figs. 10-13.
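The tuning step can be illustrated with a minimal real-valued GA that minimizes RMSE. This is a sketch under stated assumptions and not the authors' implementation: `toy_model` is a hypothetical two-parameter stand-in for the hybrid model, the data are synthetic, and the population size, generation count, elitist selection, uniform crossover, and Gaussian mutation are all illustrative choices.

```python
import math
import random


def rmse(observed, predicted):
    return math.sqrt(
        sum((y - p) ** 2 for y, p in zip(observed, predicted)) / len(observed)
    )


def toy_model(params, t):
    # Hypothetical stand-in for the hybrid model: a straight line in time.
    slope, intercept = params
    return slope * t + intercept


def fitness(params, times, observed):
    # Lower is better: RMSE between model output and observations.
    return rmse(observed, [toy_model(params, t) for t in times])


def genetic_search(times, observed, pop_size=40, generations=80, seed=0):
    rng = random.Random(seed)
    population = [
        [rng.uniform(-10, 10), rng.uniform(-10, 10)] for _ in range(pop_size)
    ]
    for _ in range(generations):
        population.sort(key=lambda p: fitness(p, times, observed))
        elite = population[: pop_size // 4]  # selection: keep the best quarter
        children = []
        while len(elite) + len(children) < pop_size:
            mother, father = rng.sample(elite, 2)
            # Uniform crossover: take each gene from one of the two parents.
            child = [rng.choice(pair) for pair in zip(mother, father)]
            # Gaussian mutation on every gene.
            child = [gene + rng.gauss(0, 0.3) for gene in child]
            children.append(child)
        population = elite + children
    return min(population, key=lambda p: fitness(p, times, observed))


# Synthetic observations generated from slope 2 and intercept 1.
times = list(range(10))
observed = [2.0 * t + 1.0 for t in times]
best = genetic_search(times, observed)
print(best, fitness(best, times, observed))
```

Running one such search per country, each on that country's own series, mirrors the idea of region-specific parameters; keeping the elite unchanged between generations ensures the best RMSE never gets worse.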


TABLE III: The tuned parameters of the hybrid model

After its parameters were tuned, the hybrid model was evaluated on test data. Table IV compares the performance of the hybrid model with that of LSTM, reporting the RMSEs obtained for cases and deaths in each country. The results show that the hybrid model performs significantly better than LSTM.

Country | RMSE, LSTM: Case | RMSE, LSTM: Death | RMSE, Hybrid Model: Case | RMSE, Hybrid Model: Death
TABLE IV: Comparison of Hybrid model with LSTM

Figures 10 to 13 illustrate that the hybrid model predicted more accurately than the LSTM model in all case studies. The model not only predicts the behavior and trend but also provides accurate estimates of the numbers of cases and deaths under considerable uncertainty. These figures also provide forecasts for the next 20 days. According to the forecasts, Spain, France, Germany, China, and Australia are able to bring the number of new cases down to zero per day. However, the numbers of cases in the USA and Turkey over the next 20 days are forecast at around 25,000 and 2,000 per day, respectively. Therefore, these countries need greater restrictions and improved public knowledge and behavior if they are to decrease the number of cases. During this period, control is insufficient in Iran and the United Kingdom. As the model is sensitive to large fluctuations in cases and deaths, care should be taken to maintain a stable trend. In addition, Fig. 12 and Fig. 13 show that LSTM breaks down when estimating cases in Australia and France because of negative case values.

Fig. 10: LSTM and Hybrid Model for USA, ESP and ITA
Fig. 11: LSTM and Hybrid Model for FRA, DEU and GBR
Fig. 12: LSTM and Hybrid Model for TUR, IRN and CHN
Fig. 13: LSTM and Hybrid Model for RUS and AUS

V Conclusion

A novel hybrid model using LSTM and dynamic behavioral models was proposed to forecast the spread of Covid-19 in the most-affected countries. Many factors affect the spread of this virus, so accurately predicting cases and deaths is very difficult. To address this, the LSTM and dynamic behavioral models were combined to model the dynamic system with a high level of fidelity. The results show that the hybrid model can accurately predict the spread of Covid-19 based on real data.

The proposed hybrid model provides robust estimates using only regional properties. Adding more modules and using real data for the different modules can substantially improve the model's performance. In this paper, a TM, an FM (including medical staff and hospitals), and a DM were constructed to forecast the spread of Covid-19 in the top ten most-affected countries and Australia. Public knowledge and behavior can directly impact the spread of Covid-19. Additionally, skilled medical staff and high-quality hospitals can help control the outbreak.


  • [1] R. M. Anderson, H. Heesterbeek, D. Klinkenberg, and T. D. Hollingsworth (2020) How will country-based mitigation measures influence the course of the covid-19 epidemic?. The Lancet 395 (10228), pp. 931–934.
  • [2] M. Baguelin, S. Flasche, A. Camacho, N. Demiris, E. Miller, and W. J. Edmunds (2013) Assessing optimal target populations for influenza vaccination programmes: an evidence synthesis and modelling study. PLoS medicine 10 (10).
  • [3] Y. Bengio, A. Courville, and P. Vincent (2013) Representation learning: a review and new perspectives. IEEE transactions on pattern analysis and machine intelligence 35 (8), pp. 1798–1828.
  • [4] P. J. Birrell, D. De Angelis, A. M. Presanis, et al. (2018) Evidence synthesis for stochastic epidemic models. Statistical Science 33 (1), pp. 34–43.
  • [5] T. Britton and P. D. O’Neill (2002) Bayesian inference for stochastic epidemics in populations with random social structure. Scandinavian Journal of Statistics 29 (3), pp. 375–390.
  • [6] A. Camacho, A. Kucharski, Y. Aki-Sawyerr, M. A. White, S. Flasche, M. Baguelin, T. Pollington, J. R. Carney, R. Glover, E. Smout, et al. (2015) Temporal changes in ebola transmission in sierra leone and implications for control requirements: a real-time modelling study. PLoS currents 7.
  • [7] B. S. Cooper, R. J. Pitman, W. J. Edmunds, and N. J. Gay (2006) Delaying the international spread of pandemic influenza. PLoS medicine 3 (6).
  • [8] J. Dehning, J. Zierenberg, F. P. Spitzner, M. Wibral, J. P. Neto, M. Wilczek, and V. Priesemann (2020) Inferring covid-19 spreading rates and potential change points for case number forecasts. arXiv preprint arXiv:2004.01105.
  • [9] K. J. Friston, T. Parr, P. Zeidman, A. Razi, G. Flandin, J. Daunizeau, O. J. Hulme, A. J. Billig, V. Litvak, R. J. Moran, et al. (2020) Dynamic causal modelling of covid-19. arXiv preprint arXiv:2004.04463.
  • [10] S. Funk, I. Ciglenecki, A. Tiffany, E. Gignoux, A. Camacho, R. M. Eggo, A. J. Kucharski, W. J. Edmunds, J. Bolongei, P. Azuma, et al. (2017) The impact of control strategies and behavioural changes on the elimination of ebola from lofa county, liberia. Philosophical Transactions of the Royal Society B: Biological Sciences 372 (1721), pp. 20160302.
  • [11] F. A. Gers, J. Schmidhuber, and F. Cummins (1999) Learning to forget: continual prediction with lstm.
  • [12] N. Ghaffarzadegan and H. Rahmandad (2020) Simulation-based estimation of the spread of covid-19 in iran. medRxiv.
  • [13] A. Graves and N. Jaitly (2014) Towards end-to-end speech recognition with recurrent neural networks. In International conference on machine learning, pp. 1764–1772.
  • [14] S. Hochreiter and J. Schmidhuber (1997) Long short-term memory. Neural computation 9 (8), pp. 1735–1780.
  • [15] A. Karpathy and L. Fei-Fei (2015) Deep visual-semantic alignments for generating image descriptions. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 3128–3137.
  • [16] W. Kong, Z. Y. Dong, Y. Jia, D. J. Hill, Y. Xu, and Y. Zhang (2017) Short-term residential load forecasting based on lstm recurrent neural network. IEEE Transactions on Smart Grid 10 (1), pp. 841–851.
  • [17] A. J. Kucharski, A. Camacho, F. Checchi, R. Waldman, R. F. Grais, J. Cabrol, S. Briand, M. Baguelin, S. Flasche, S. Funk, et al. (2015) Evaluation of the benefits and risks of introducing ebola community care centers, sierra leone. Emerging infectious diseases 21 (3), pp. 393.
  • [18] A. J. Kucharski, T. W. Russell, C. Diamond, Y. Liu, J. Edmunds, S. Funk, R. M. Eggo, F. Sun, M. Jit, J. D. Munday, et al. (2020) Early dynamics of transmission and control of covid-19: a mathematical modelling study. The lancet infectious diseases.
  • [19] R. Li, S. Pei, B. Chen, Y. Song, T. Zhang, W. Yang, and J. Shaman (2020) Substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (sars-cov2). Science.
  • [20] Z. C. Lipton, J. Berkowitz, and C. Elkan (2015) A critical review of recurrent neural networks for sequence learning. arXiv preprint arXiv:1506.00019.
  • [21] J. Lourenco, M. M. de Lima, N. R. Faria, A. Walker, M. U. Kraemer, C. J. Villabona-Arenas, B. Lambert, E. M. de Cerqueira, O. G. Pybus, L. C. Alcantara, et al. (2017) Epidemiological and ecological determinants of zika virus transmission in an urban setting. Elife 6, pp. e29820.
  • [22] J. Mao, W. Xu, Y. Yang, J. Wang, Z. Huang, and A. Yuille (2014) Deep captioning with multimodal recurrent neural networks (m-rnn). arXiv preprint arXiv:1412.6632.
  • [23] D. T. McRuer and E. S. Krendel (1974) Mathematical models of human pilot behavior. Technical report, Advisory Group for Aerospace Research and Development, Neuilly-sur-Seine (France).
  • [24] H. Nishiura, D. Klinkenberg, M. Roberts, and J. A. Heesterbeek (2009) Early epidemiological assessment of the virulence of emerging infectious diseases: a case study of an influenza pandemic. PLoS One 4 (8).
  • [25] A. Pandey, K. E. Atkins, J. Medlock, N. Wenzel, J. P. Townsend, J. E. Childs, T. G. Nyenswah, M. L. Ndeffo-Mbah, and A. P. Galvani (2014) Strategies for containing ebola in west africa. Science 346 (6212), pp. 991–995.
  • [26] S. Riley, C. Fraser, C. A. Donnelly, A. C. Ghani, L. J. Abu-Raddad, A. J. Hedley, G. M. Leung, L. Ho, T. Lam, T. Q. Thach, et al. (2003) Transmission dynamics of the etiological agent of sars in hong kong: impact of public health interventions. Science 300 (5627), pp. 1961–1966.
  • [27] B. Shulgin, L. Stone, and Z. Agur (1998) Pulse vaccination strategy in the sir epidemic model. Bulletin of mathematical biology 60 (6), pp. 1123–1148.
  • [28] I. Sutskever, O. Vinyals, and Q. V. Le (2014) Sequence to sequence learning with neural networks. In Advances in neural information processing systems, pp. 3104–3112.
  • [29] C. Viboud, K. Sun, R. Gaffey, M. Ajelli, L. Fumanelli, S. Merler, Q. Zhang, G. Chowell, L. Simonsen, A. Vespignani, et al. (2018) The rapidd ebola forecasting challenge: synthesis and lessons learnt. Epidemics 22, pp. 13–21.
  • [30] O. Vinyals, A. Toshev, S. Bengio, and D. Erhan (2015) Show and tell: a neural image caption generator. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 3156–3164.
  • [31] S. M. Zandavi, V. Y. Y. Chung, and A. Anaissi (2019) Stochastic dual simplex algorithm: a novel heuristic optimization algorithm. IEEE Transactions on Cybernetics.
  • [32] V. Zlatić, I. Barjašić, A. Kadović, H. Štefančić, and A. Gabrielli (2020) Bi-stability of sudr+ k model of epidemics and test kits applied to covid-19. arXiv preprint arXiv:2003.08479.