An approximate dynamic programming approach to food security of communities following hazards

by Saeed Nozhati, et al.

Food security can be threatened by extreme natural hazard events for households of all social classes within a community. To address food security issues following a natural disaster, the recovery of several elements of the built environment within a community, including its building portfolio, must be considered. Building portfolio restoration is one of the most challenging elements of recovery owing to the complexity and dimensionality of the problem. This study introduces a stochastic scheduling algorithm for the identification of optimal building portfolio recovery strategies. The proposed approach provides a computationally tractable formulation to manage multi-state, large-scale infrastructure systems. A testbed community modeled after Gilroy, California, is used to illustrate how the proposed approach can be implemented efficiently and accurately to find the near-optimal decisions related to building recovery following a severe earthquake.



1 Testbed Case Study

As an illustration, this study considers the building portfolio of Gilroy, California, USA. The City of Gilroy is a moderately sized, growing city in southern Santa Clara County, California, with a population of 48,821 at the time of the 2010 census. To define the properties of the community, the study area, which covers 42 km² and has a population of 47,905, is divided into a grid of 36 rectangular regions. Household units are growing at a faster pace in Gilroy than in Santa Clara County and the State of California (Harnish (2014)). The average number of people per household in Gilroy in 2010 was 3.4, greater than the state and county averages. Approximately 95% of Gilroy’s housing units are occupied. A heat map of household units in the grid is shown in Figure 1. The age distribution of Gilroy is given in Table 1.

Figure 1: Housing units over the defined grids.

| Age Group | Percent |
|---|---|
| Children (0-17 years) | 30.60 |
| Adults (18-64 years) | 61.00 |
| Senior Citizens (65+ years) | 8.40 |

Table 1: Age distribution of Gilroy (Harnish (2014)).

2 Seismic Hazard and Damage Assessment

Seismic hazard is a dominant hazard in California. Hence, we consider a seismic event of moment magnitude $M_w$ that occurs at one of the closest points on the San Andreas Fault to downtown Gilroy, with an epicentral distance of approximately 12 km. We used the Abrahamson et al. (2013) ground motion prediction equation (GMPE) to evaluate the Intensity Measures (IM) and associated uncertainties, including the intra-event (within-event) and inter-event (event-to-event) uncertainties, at the sites of each of the 14,702 buildings in Gilroy. We assessed the damage to household units and food retailers with the seismic fragility curves presented in HAZUS-MH (FEMA (2003)). We considered repair vehicles, crews, and tools as available resource units (RUs) to restore the buildings following the hazard. One RU is required to repair each damaged building (Masoomi (2018)). We adopted the synthesized restoration time from HAZUS-MH.
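The damage assessment step can be illustrated with a minimal sketch. It assumes lognormal fragility curves, the functional form used in HAZUS-MH, and samples one damage state for a building given its intensity measure. The medians and dispersions in `FRAGILITY` are hypothetical placeholders, not values from the HAZUS-MH tables, and the function names are illustrative only.

```python
import math
import random

# Hypothetical lognormal fragility parameters for one building class:
# (damage state, median IM in g, log-standard deviation beta).
# Placeholder values, NOT taken from the HAZUS-MH tables.
FRAGILITY = [
    ("Minor", 0.20, 0.64),
    ("Moderate", 0.40, 0.64),
    ("Major", 0.80, 0.64),
    ("Collapse", 1.60, 0.64),
]


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def exceedance_probs(im):
    """P(damage >= ds | IM = im) for each damage state, from lognormal curves."""
    return [normal_cdf(math.log(im / median) / beta) for _, median, beta in FRAGILITY]


def sample_damage_state(im, rng):
    """Draw one damage state by comparing a single uniform draw against the
    nested exceedance probabilities; the most severe exceeded state wins."""
    u = rng.random()
    state = "None"  # undamaged
    for (name, _, _), p in zip(FRAGILITY, exceedance_probs(im)):
        if u < p:
            state = name
    return state


rng = random.Random(42)
portfolio_im = [0.15, 0.35, 0.90]  # hypothetical IM values at three building sites
damage = [sample_damage_state(im, rng) for im in portfolio_im]
```

Because all curves share one dispersion in this sketch, they do not cross, so the exceedance probabilities decrease with severity and a single uniform draw yields a consistent damage state.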

3 Markov Decision Process Framework

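We provide a brief description of MDPs; additional details of MDPs are available elsewhere (Puterman (2014)). An MDP is defined by the five-tuple $(X, A, P, R, \gamma)$, where $X$ denotes the state space, $A$ denotes the action space, $P(y \mid x, a)$ is the probability of transitioning from state $x \in X$ to state $y \in X$ when action $a$ is taken, $R(x, a)$ is the reward obtained when action $a$ is taken in state $x \in X$, and $\gamma$ is the discount factor. A policy $\pi : X \to A$ is a mapping from states to actions, and $\Pi$ denotes the set of policies. The objective is then to find the optimal policy, denoted by $\pi^*$, that maximizes the total reward (or minimizes the total cost) over the time horizon, i.e.,

$$\pi^* := \arg\sup_{\pi \in \Pi} V^{\pi}(x), \tag{1}$$

where

$$V^{\pi}(x) := \mathbb{E}\left[\sum_{t=0}^{\infty} \gamma^{t} R\big(x_t, \pi(x_t)\big) \,\middle|\, x_0 = x\right] \tag{2}$$

is called the value function for a fixed policy $\pi$, and $\gamma \in (0, 1)$ is the discount factor (Puterman (2014)). The optimal value function for a given state $x$ is denoted by $V^{\pi^*}(x)$ and is given by

$$V^{\pi^*}(x) := \sup_{\pi \in \Pi} V^{\pi}(x). \tag{3}$$

Bellman’s optimality principle (Bertsekas (2005)) is useful for defining the Q-value function, which plays a pivotal role in the description of the rollout algorithm. Bellman’s optimality principle states that $V^{\pi^*}(x)$ satisfies

$$V^{\pi^*}(x) = \sup_{a \in A} \left\{ R(x, a) + \gamma \sum_{y \in X} P(y \mid x, a)\, V^{\pi^*}(y) \right\}. \tag{4}$$

The Q-value function associated with the optimal policy $\pi^*$ is defined as

$$Q^{\pi^*}(x, a) := R(x, a) + \gamma \sum_{y \in X} P(y \mid x, a)\, V^{\pi^*}(y), \tag{5}$$

which is the inner term on the right-hand side of Eq. (4).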
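To make the Bellman relation and the Q-value function concrete, the following sketch runs value iteration on a toy MDP with two states and two actions. It is a minimal illustration, not the community model: the transition probabilities in `P`, the rewards in `R`, and the discount factor 0.9 are all hypothetical.

```python
GAMMA = 0.9            # hypothetical discount factor for the toy problem
STATES = [0, 1]
ACTIONS = [0, 1]

# P[x][a][y]: probability of moving from state x to state y under action a.
P = {
    0: {0: [0.8, 0.2], 1: [0.1, 0.9]},
    1: {0: [0.0, 1.0], 1: [0.0, 1.0]},   # state 1 is absorbing
}
# R[x][a]: reward for taking action a in state x.
R = {0: {0: 0.0, 1: -1.0}, 1: {0: 1.0, 1: 1.0}}


def q_value(x, a, V):
    """Immediate reward plus discounted expected value of the next state."""
    return R[x][a] + GAMMA * sum(p * V[y] for y, p in enumerate(P[x][a]))


def value_iteration(tol=1e-10):
    """Apply the Bellman optimality update until the values stop changing."""
    V = [0.0 for _ in STATES]
    while True:
        V_new = [max(q_value(x, a, V) for a in ACTIONS) for x in STATES]
        if max(abs(v - w) for v, w in zip(V, V_new)) < tol:
            return V_new
        V = V_new


V_star = value_iteration()
# Greedy policy: in each state, pick the action with the largest Q-value.
policy = [max(ACTIONS, key=lambda a, x=x: q_value(x, a, V_star)) for x in STATES]
```

In state 0 the greedy action pays a cost of 1 up front to reach the rewarding absorbing state sooner. Exhaustive sweeps like this one are exactly what becomes infeasible when the state and action spaces are large.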


The optimal policy $\pi^*$ can be computed with linear programming or dynamic programming (DP). However, exact methods are not feasible for real-world problems that have large state and action spaces, like the community-level optimization problem considered herein, owing to the curse of dimensionality; thus, an approximation technique is essential to obtain the solution. In the realm of approximate dynamic programming (ADP) techniques, a model-based, direct simulation approach for policy evaluation is used (Sarkale et al. (2018)). This approach is called “rollout.” Briefly, an estimate $\hat{Q}^{\pi}(x, a)$ of the Q-value function is calculated by Monte Carlo simulations (MCS) in the rollout algorithm as follows: we first simulate $N_{MC}$ trajectories, where each trajectory is generated using the policy $\pi$ (called the base policy), has length $K$, and starts from the pair $(x, a)$; then, $\hat{Q}^{\pi}(x, a)$ is the average of the sample functions along these trajectories:

$$\hat{Q}^{\pi}(x, a) = \frac{1}{N_{MC}} \sum_{i=1}^{N_{MC}} \left[ R(x, a) + \sum_{k=1}^{K} \gamma^{k} R\big(x_{i,k}, \pi(x_{i,k})\big) \right]. \tag{6}$$

For each trajectory $i$, we fix the first state-action pair to $(x, a)$; the next state $x_{i,1}$ is calculated when the current action $a$ in state $x$ is completed. Thereafter, we choose actions using the base policy $\pi$. A more complete description of the rollout algorithm can be found in Bertsekas (2005) and Nozhati et al. (2019).
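The rollout estimate described above can be sketched on a toy repair problem. This is a simplified stand-in for the study's simulator: three damaged buildings, one RU, and a repair that finishes in any given step with a fixed probability. The occupant counts `POP`, the completion probability `P_DONE`, the fixed-order base policy, and all function names are assumptions made for illustration.

```python
import random

POP = (5, 50, 20)   # hypothetical occupants per building
P_DONE = 0.5        # hypothetical chance that a repair finishes in one step
GAMMA = 0.99        # discount factor


def actions(state):
    """Indices of buildings that are still damaged (1 = damaged, 0 = repaired)."""
    return [i for i, damaged in enumerate(state) if damaged]


def step(state, action, rng):
    """Simulate one decision epoch; returns (next_state, reward)."""
    if rng.random() < P_DONE:
        nxt = tuple(0 if i == action else d for i, d in enumerate(state))
        return nxt, float(POP[action])
    return state, 0.0


def base_policy(state):
    """Naive base policy: repair damaged buildings in index order."""
    return actions(state)[0]


def rollout_q(state, action, n_traj, horizon, rng):
    """Monte Carlo estimate of the Q-value of (state, action) under the base policy."""
    total = 0.0
    for _ in range(n_traj):
        # The first state-action pair is fixed; the base policy takes over afterwards.
        x, ret = step(state, action, rng)
        for k in range(1, horizon + 1):
            if not actions(x):
                break               # everything repaired, no further reward
            x, r = step(x, base_policy(x), rng)
            ret += GAMMA ** k * r
        total += ret
    return total / n_traj


def rollout_action(state, n_traj=2000, horizon=50, rng=None):
    """Rollout policy: pick the action with the largest estimated Q-value."""
    rng = rng or random.Random(0)
    return max(actions(state), key=lambda a: rollout_q(state, a, n_traj, horizon, rng))


best = rollout_action((1, 1, 1))
```

Under these assumptions the rollout policy improves on the base policy by sending the RU to the most populated building first, even though the base policy itself never does so.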

4 Building Portfolio Recovery

Each household unit and retailer building remains undamaged or exhibits one of the damage states (i.e., Minor, Moderate, Major, and Collapse) based on the level of intensity measure and the seismic fragility curves. There is a limited number of RUs (defined earlier) available to the decision maker for the repair of the buildings in the community. In this study, we also limit the number of RUs for each urban grid so that the number of available RUs for each grid is 20 percent of the number of damaged buildings in each region of the grid. Therefore, the number of RUs varies over the community in proportion to the density of the damaged buildings.
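The per-region resource limit can be written down directly. The sketch below applies the 20 percent rule to a list of damaged-building counts; rounding up to a whole RU is an assumption, since the text does not state a rounding rule, and the counts themselves are hypothetical.

```python
RU_PERCENT = 20  # from the text: RUs per region = 20% of its damaged buildings


def allocate_rus(damaged_per_region, percent=RU_PERCENT):
    """Number of RUs per grid region.

    Rounds up to a whole RU (an assumption, not stated in the text), using
    integer arithmetic to avoid floating-point rounding surprises.
    """
    return [(n * percent + 99) // 100 for n in damaged_per_region]


# Hypothetical damaged-building counts for five of the grid regions.
damaged = [0, 1, 5, 12, 100]
rus = allocate_rus(damaged)   # -> [0, 1, 1, 3, 20]
```

Rounding up guarantees that any region with at least one damaged building receives at least one RU; rounding down would leave lightly damaged regions with none.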

Let $x_t$ be the state of the damaged structures of the building portfolio at time $t$; $x_t$ is a vector, where each element represents the damage state of a building in the portfolio based on the level of the intensity measure and the seismic fragility curves. Let $a_t^i$ denote the repair action to be carried out on the damaged structures in the $i$-th region of the grid at time $t$; each element of $a_t^i$ is either zero or one, where zero means do not repair and one means carry out repair. Note that the sum of the elements of $a_t^i$ is equal to $N_i$, the number of RUs available in region $i$. The repair action for the entire community at time $t$, $a_t$, is the stack of the repair actions $a_t^1, \ldots, a_t^{36}$. The assignment of RUs to damaged locations is non-preemptive in the sense that the decision maker cannot preempt the assigned RUs from completing their work and reassign them to different locations at every decision epoch $t$. This type of scheduling is more suitable when the decision maker deals with non-central stakeholders and private owners, which is the case for a typical building portfolio. We wish to plan decisions optimally so that a maximum number of inhabitants have safe household unit structures per unit of time (day in our case). Therefore, the reward function embeds two objectives as follows:

$$R(x_t, a_t) = \frac{r}{t_{rep}}, \tag{7}$$

where $r$ is the number of people benefited from household units after the completion of $a_t$, and $t_{rep}$ is the total repair time to reach $x_{t+1}$ from any initial state $x_0$. Note that the reward function is stochastic because the outcome of the repair action is stochastic. In this study, we set the discount factor $\gamma$ to be 0.99, implying that the decision maker is “far-sighted” in the consideration of the future rewards.
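A small sketch may help fix the ideas in this paragraph. It assumes that the reward is the ratio of people benefited to repair time, as the "per unit of time" objective suggests, and it builds a binary repair vector for one region. The function names and the numbers in the comparison are hypothetical.

```python
GAMMA = 0.99  # discount factor stated in the text


def make_region_action(n_damaged, repair_indices):
    """Binary repair vector for one region: 1 = carry out repair, 0 = do not repair."""
    action = [0] * n_damaged
    for i in repair_indices:
        action[i] = 1
    return action


def repair_reward(people_benefited, repair_time_days):
    """People rehoused per day of repair time (assumed ratio form of the reward)."""
    if repair_time_days <= 0:
        raise ValueError("repair time must be positive")
    return people_benefited / repair_time_days


def discount_weight(t, gamma=GAMMA):
    """Weight gamma**t applied to a reward received t decision epochs ahead."""
    return gamma ** t


# A region with five damaged buildings and two RUs: repair buildings 0 and 3.
a_region = make_region_action(5, [0, 3])   # -> [1, 0, 0, 1, 0]

# Hypothetical comparison of two candidate repairs.
reward_a = repair_reward(12, 30)    # 12 people rehoused after 30 days
reward_b = repair_reward(40, 120)   # 40 people rehoused after 120 days
```

Under this ratio, the smaller but faster repair scores higher (0.4 versus about 0.33 people per day), which is how the reward trades off the two objectives. With a discount factor of 0.99, a reward 100 epochs ahead still carries a weight of roughly 0.37, which is what makes the decision maker far-sighted.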

We simulated a number of trajectories, $N_{MC}$, sufficient to reach a low dispersion (0.1 in this study) in the estimate of Eq. (6). As Eq. (6) shows, we addressed mean-based optimization, which is suited to risk-neutral decision makers. However, the approach can be adapted to different risk-aversion behaviors. Figure 2 shows the total number of people with inhabitable structures (undamaged or repaired) over the community. We also computed the numbers of children, adults, and senior citizens that have safe buildings over the recovery period. Different age groups have different levels of vulnerability to food insecurity; for example, children are a vulnerable group and warrant more attention during the recovery process.
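One way to choose the number of trajectories against a dispersion target is to keep sampling until the estimate is stable enough. The sketch below assumes that "dispersion" means the coefficient of variation of the Monte Carlo mean (standard error divided by the mean); the text does not define the measure, so this reading, the batch size, and the stand-in sampler are all assumptions.

```python
import math
import random
import statistics


def sample_until_cov(sample_return, target_cov=0.1, batch=50, max_n=100_000):
    """Draw trajectory returns in batches until the coefficient of variation
    of the sample mean drops to target_cov (or max_n samples are drawn).

    Returns (mean estimate, number of samples used).
    """
    returns = []
    mean = 0.0
    while len(returns) < max_n:
        returns.extend(sample_return() for _ in range(batch))
        mean = statistics.fmean(returns)
        std_err = statistics.stdev(returns) / math.sqrt(len(returns))
        if mean != 0 and std_err / abs(mean) <= target_cov:
            break
    return mean, len(returns)


# Stand-in for one simulated trajectory return: a noisy draw with mean 10.
rng = random.Random(0)
q_hat, n_used = sample_until_cov(lambda: rng.gauss(10.0, 20.0))
```

A risk-averse variant could keep the same sampling loop and replace the mean with a lower quantile of the sampled returns; that choice is likewise an illustration rather than something the text specifies.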

Figure 2: Different numbers of people based on age with inhabitable structures.

Figure 3 depicts the spatio-temporal evolution of the number of people with structurally safe, inhabitable household units across the community. This figure shows that for urban grids with a high density of damaged structures, complete recovery is prolonged despite the availability of additional RUs. The spatio-temporal analysis is informative for policy makers, who can use it to identify the vulnerable areas of the community over time.

Figure 3: Number of people with inhabitable houses a) following the earthquake b) after 100 days c) after 600 days.

5 Conclusion and Future Work

Building portfolio restoration is one of the most challenging elements of addressing food security issues in the aftermath of disasters. Our stochastic dynamic optimization approach, based on the method of rollout, plans a near-optimal building portfolio recovery following a hazard. The approach shows how to overcome the curse of dimensionality in optimizing large-scale building portfolio recovery post-disaster. In future work, we will consider several aspects of a community, from infrastructure systems to social systems, along with their interdependencies. We will also explore how to fuse meta-heuristics with our solution to supervise the stochastic search that determines the most promising actions (Nozhati et al. (2018)).

6 References


  • Abrahamson et al. (2013) Abrahamson, N. A., Silva, W. J., and Kamai, R. (2013). “Update of the AS08 ground-motion prediction equations based on the NGA-West2 data set.”
  • Bertsekas (2005) Bertsekas, D. P. (2005). Dynamic programming and optimal control, Vol. 1. Athena Scientific, Belmont, MA.
  • FAO (2001) FAO (2001). “The state of food insecurity in the world.” FAO, Rome.
  • FEMA (2003) FEMA (2003). “Multi-hazard loss estimation methodology, earthquake model.” Federal Emergency Management Agency, Washington, DC, USA.
  • Harnish (2014) Harnish, M. (2014). “2015-2023 housing element policy document and background report.”
  • Lin and Wang (2017) Lin, P. and Wang, N. (2017). “Stochastic post-disaster functionality recovery of community building portfolios I: Modeling.” Structural Safety, 69, 96–105.
  • Masoomi (2018) Masoomi, H. (2018). “A resilience-based decision framework to determine performance targets for the built environment.” Ph.D. thesis, Colorado State University.
  • Nozhati et al. (2018) Nozhati, S., Sarkale, Y., Ellingwood, B., Chong, E., and Mahmoud, H. (2018). “A modified approximate dynamic programming algorithm for community-level food security following disasters.” Proceedings of the 9th International Congress on Environmental Modelling and Software (iEMSs 2018), Fort Collins, CO, June.
  • Nozhati et al. (2019) Nozhati, S., Sarkale, Y., Ellingwood, B., Chong, E. K., and Mahmoud, H. (2019). “Near-optimal planning using approximate dynamic programming to enhance post-hazard community resilience management.” Reliability Engineering & System Safety, 181, 116–126.
  • Oliveira (2017) Oliveira, V. (2017). “The food assistance landscape: FY 2016 annual report.”
  • Paci-Green and Berardi (2015) Paci-Green, R. and Berardi, G. (2015). “Do global food systems have an Achilles heel? The potential for regional food systems to support resilience in regional disasters.” Journal of Environmental Studies and Sciences, 5(4), 685–698.
  • Puterman (2014) Puterman, M. (2014). Markov decision processes: Discrete stochastic dynamic programming. John Wiley and Sons.
  • Research and Center (2017) Food Research and Action Center (2017). “An advocate’s guide to the Disaster Supplemental Nutrition Assistance Program.” Food Research and Action Center (FRAC).
  • Sarkale et al. (2018) Sarkale, Y., Nozhati, S., Chong, E., Ellingwood, B., and Mahmoud, H. (2018). “Solving Markov decision processes for network-level post-hazard recovery via simulation optimization and rollout.” Proceedings of the 14th IEEE International Conference on Automation Science and Engineering (CASE 2018), Munich, Germany.
  • Seekell et al. (2017) Seekell, D., Carr, J., Dell’Angelo, J., D’Odorico, P., Fader, M., Gephart, J., Kummu, M., Magliocca, N., Porkka, M., Puma, M., et al. (2017). “Resilience in the global food system.” Environmental Research Letters, 12(2), 025010.