## 1 Abstract

Representing the reservoir as a network of discrete compartments with neighbor and non-neighbor connections is a fast, yet accurate method for analyzing oil and gas reservoirs. Automatic and rapid detection of coarse-scale compartments with distinct static and dynamic properties is an integral part of such high-level reservoir analysis. In this work, we present a hybrid framework specific to reservoir analysis for an automatic detection of clusters in space using spatial and temporal field data, coupled with a physics-based multiscale modeling approach.

A novel and rigorous non-local formulation for flow in porous media is presented, in which the reservoir is represented by an adjacency matrix describing the connectivities of comprising compartments. We automatically divide the reservoir into a number of distinct compartments, in which the direction-dependent multiphase flow communication is a function of non-local phase potential differences. Our proposed clustering framework begins with a mixed-type raw dataset which can be categorical/numerical, spatial/temporal, and discrete/continuous. The dataset can contain noisy/missing values of different data types including but not limited to well production/injection history, well location, well type, geological features, PVT measurements, perforation data, etc. Unsupervised clustering techniques suited to the input data types (e.g. k-prototypes, spectral, Gaussian Mixtures, and hierarchical clustering), and appropriate distance measures (such as Euclidean distance, soft dynamic time warping, and mode) are used. The input data is standardized, and upon convergence check, the best clustering representation is obtained. Finally, Support-Vector-Machine technique is utilized in the kernel space to trace a demarcating hyperplane for the clusters.

The proposed framework is successfully applied to more than five mature fields in the Middle East, South and North America, each with more than a thousand wells. In a specific case study reported here, the proposed workflow is applied to a major field with a couple of hundreds of wells with more than 40 years of production history. Leveraging the fast forward model, an efficient ensemble-based history matching framework is applied to reduce the uncertainty of the global reservoir parameters such as inter-blocks and aquifer-reservoir communications, fault transmissibilities, and block-based oil in place. The ensemble of history matched models are then used to provide a probabilistic forecast for different field development scenarios. In addition, the clustering framework enables us to treat missing data and use the augmented dataset for improving the clustering accuracy.

In summary, in this work a novel hybrid approach is presented in which we couple a physics-based non-local modeling framework with data-driven clustering techniques to provide a fast and accurate multiscale modeling of compartmentalized reservoirs. This research also adds to the literature by presenting a comprehensive work on spatio-temporal clustering for reservoir studies applications that well considers the clustering complexities, the intrinsic sparse and noisy nature of the data, and the interpretability of the outcome.

keywords: Artificial Intelligence; Machine Learning; Spatio-Temporal Clustering; Physics-Based Data-Driven Formulation; Multiscale Modeling

## 2 Introduction

In subsurface flow modeling, subsurface formations are characterized by the heterogeneity over multiple length scales, which can have a strong impact on the flow and transport (Bonnet et al., 2001; Abu-Al-Saud et al., 2016, 2019; Cinar and Riaz, 2014). To account for this multiscale nature of the subsurface, high-resolution reservoir simulation is widely used to model the complex physics present in oil recovery processes. Although the recent increase in computational resources makes the flow simulation of large-scale models feasible, simulations with high-resolution models are still computationally expensive and prohibitive.

In the long term oil recovery processes, steps such as field development planning, resource allocation, and production optimization are of great importance. During these steps, reservoir simulation models are extensively used for objective function calculations and this can pose computational challenges as the forward models have to be run thousands of times (Salehi et al., 2019a; Brown et al., 2017; Olalotiti et al., 2019; Shirangi and Durlofsky, 2016).

In reservoir analysis, calibrating techniques are used for estimating unknown properties of a reservoir such as porosity and permeability from historical measured data as a type of inverse problem where the historical data are usually taken at the production wells and might consist of pressure and/or flow data. The model is considered to be calibrated if it is able to reproduce the historical data of the reservoir it represents. This calibration process is called history matching, and this is the most time-consuming phase in any reservoir simulation study since it requires forward run of a fine-scale reservoir simulation model in an often exhaustive iterative process

(Gharib Shirangi, 2014; Tahmasebi et al., 2018).On the other hand, reservoir simulation models are subject to uncertainty in a great variety of input parameters caused by geological variability, measurement error, petrophysical-modeling uncertainty, and structural-modeling uncertainty, to name just a few. Furthermore, a correct analysis of uncertainty propagation through the model increases the quality and robustness of the decision-making process of field management. Consequently, an accurate analysis of uncertainty requires a large number of simulations (Camacho et al., 2017; Lee et al., 2014; Yeh et al., 2014; Yang et al., 2007). Furthermore, the computational time of large-scale refined models becomes a bottleneck in the decision-making process and assimilating real-time data into a model (Salehi et al., 2013; Salehi, 2016; Gildin and Lopez, 2011; Ghasemi et al., 2012).

In all of the reservoir studies areas mentioned above, there have always been attempts to accelerate simulations in order to overcome their prohibitive costs and make them feasible for real-world applications while honoring the accuracy and reliability of the outcome. For example, using reduced order modeling (ROM), high-order reservoir models can be reorganized and reduced into (non-)linear low-order models using theoretical concepts like modal decomposition, balanced realization, proper orthogonal decomposition (POD), and system identification (Bistrian and Navon, 2017; Fujimoto and Scherpen, 2010; Ghasemi et al., 2014; Cardoso and Durlofsky, 2009; Udy et al., 2017; Markovinvić et al., 2002). ROM has received significant attention in the recent years particularly for use in optimization procedures in many areas of research, where a forward model must be run many times in order to determine the optimal design or the operating parameters (Esmaeilzadeh and Alam, 2019). ROM procedures were first applied by Vermeulen et al. (2004) for subsurface flow problems. It has been observed that the standard POD procedures are not very robust and/or require more basis functions as the models become very nonlinear in the presence of strong gravitational effects with highly nonlinear relative permeability functions where the speed-up offered by standard POD techniques will lose its significance (Cardoso et al., 2009). As another ROM technique, in the modal decomposition approach, the reduction that can be achieved is restricted by the low-order model and is not anymore able to reconstruct the behaviour of the high-order model accurately, as soon as more than half of the eigenmodes get truncated during the model reduction (Heijn et al., 2004).

On the other hand, in order to accelerate reservoir simulations, there have been attempts to decrease the number of grid cells in a simulation model by some coarsening techniques (upscaling methods) [(Salehi et al., 2017)(Salehi et al., 2019b)]. Using homogenization during the coarsening process, a heterogeneous property region consisting of fine grid cells is replaced with an equivalent homogeneous region made up of a single coarse grid cell with an effective property value that needs to be calculated [see refs. (Gautier et al., 1999; Fredrik and Tveito, 1998; Durlofsky et al., 1996)]. A wide range of methods have been proposed in the literature to effectively carry out this coarsening (e.g. local, extended local, global, and quasi-global techniques for the upscaling of permeability, permeability upscaling in the vicinity of wells, flow-based grid generation techniques) which all try to increase the simulation speed while preserving a level of model accuracy [see ref. (Wen and Gómez-Hernández, 1998; Holden and Nielsen, 2000; Yeten et al., 2003; Barker and Thibeau, 1997; Rabahy and Ignarra, 2002; Durlofsky, 1998)].

As an alternative approach to upscaling, a number of so-called multiscale methods have also been developed in order to accelerate the reservoir simulation process while achieving a higher level of accuracy. The key idea of the multiscale methods is to construct a set of operators that map between the unknowns associated with cells in a fine grid (holding the petrophysical properties of the geological reservoir model) and unknowns on a coarser grid used for dynamic simulation. The operators are computed numerically by solving localized flow problems in the same way as for flow-based upscaling methods. Unlike effective parameters, the multiscale basis functions have subscale resolution, which ensure that fine-scale heterogeneity is correctly accounted for in a systematic manner [see refs. (Hajibeygi et al., 2012; Lee et al., 2008; Tchelepi et al., 2007; Darve, 2000; Forsum et al., 1978; Hajibeygi and Jenny, 2009, 2011)]

As another way to accelerate reservoir simulations, Matringe et al. (2018) proposed an approach where a reservoir is treated as a combination of multiple interconnected compartments which under a range of uncertainty can capture the reservoir’s response during a recovery process. In this work, we extend the approach proposed by Matringe et al. (2018) to represent a reservoir in a multiscale form consisting of multiple interconnected segments. To identify segments of the reservoir, we use the state-of-the-art spatial, temporal, and spatio-temporal unsupervised data-mining clustering techniques. Then, a novel non-local formulation for flow in porous media is presented, in which the reservoir is represented by an adjacency matrix describing the neighbor and non-neighbor connections of comprising compartments. We automatically divide the reservoir into distinct compartments, in which the direction-dependent multiphase flow communication is a function of non-local phase potential differences. Having the segmented reservoir and the uncertain parameters, we use a robust history matching technique to reproduce reservoir’s historical response. Finally, for the segmented and history matched reservoir, we carry out forecasting that can be used in reservoir management and field development applications.

The manuscript is ordered as the following: First, we describe the methods and frameworks. We start by describing the spatio-temporal clustering framework, then we provide an overview of the compartmentalized reservoir simulation framework where we explain multi-tank history material balance and multi-tank predictive material balance. Finally, we describe the history matching approach. Afterwards, we present the results of clustering, multi-tank material balance, and history matching for a real-world reservoir.

## 3 Methods and Frameworks

### 3.1 Clustering.

Clustering data-mining process is an interactive and iterative procedure that involves many steps, such as data extraction and cleaning, feature selection, algorithm design, and post-processing analysis of the output when an algorithm is applied to the dataset. The goal of clustering data-mining is to automate the discoveries of patterns and information, which can be examined by domain experts for further validation and verification. Clustering is one of the most fundamental tasks in spatial data-mining. It is the process of aggregating objects into different groups known as clusters such that the objects within the same cluster are similar to each other but dissimilar from those in other clusters. Clusters are generated on the basis of a

similarity criterion, which is used to determine the relationship between each pair of objects in the dataset. In this section, we present a general framework for clustering in subsurface applications as an unsupervised machine learning approach. We carry out clustering at the well-level and find separating hyperplanes of wells in the physical domain. The data can be either spatial or temporal, categorical or numerical. Well coordinates and PVT measurements are examples of spatial numerical data; well type, fault-regions, and well states are examples of spatial categorical data; and well bottomhole pressure, production/injection rates, cumulative production, gas/oil ratio, water/oil ratio are examples of temporal numerical data.### 3.2 Spatial Clustering.

During the spatial clustering, we deal with features that are static (do not change over time) and can either be categorical (discrete categorical labels) or numerical (continuous real values). We build the feature vector using both categorical and numerical spatial characteristics of wells. The common approach for clustering numerical feature vectors is k-means, which clusters numerical data based on Euclidean distance (Jain, 2010; Hartigan and Wong, 1979; Steinley, 2006; Alsabti et al., 1997). For categorical data the k-modes method is used, which defines clusters based on the number of matching categories between data points (Huang and K. Ng, 1999; Huang and Ng, 2003; Chaturvedi et al., 2001). However, for the case of having feature vectors comprised of mixed categorical and numerical data neither of the two approaches could be used. For this scenario, k-prototypes first proposed by Huang (1997), and Huang (1998) got further studied later on by Ahmad and Dey (2007) and Ji et al. (2012).

In the k-prototypes approach, the cost function for clustering mixed datasets with data objects and attributes ( numeric attributes, categorical attributes, ) is defined as:

(1) |

where is the distance of a data object from the closest cluster center . is defined as:

(2) |

where are values of numeric attributes and are values of categorical attributes for data object . Here represents the cluster center for cluster . represents the most common value (i.e. mode) for categorical attributes and class . represents mean of numeric attribute and cluster . For categorical attributes, when and when . is a weight for categorical attributes of cluster . Cost function is minimized for clustering mixed datasets. Equation (1) is a combined similarity measure on both numeric and categorical attributes between objects and cluster prototypes. The similarity measure for numeric attributes is the squared Euclidean distance, whereas the similarity measure for categorical attributes is the number of mismatches between cluster prototypes and objects. Weight

is introduced to avoid favoring either type of the attributes and is considered as a tuning hyperparameter.

The k-prototypes algorithm can be described as the following: (1). select initial prototypes from the dataset , one for each cluster; (2). allocate each object in to a cluster whose prototype is the nearest to it according to the cost function in Equation (1) and update the prototype of the cluster after each allocation; (3). after all objects have been allocated to a cluster, recalculate the similarity of objects against the current prototypes. If an object is found such that its nearest prototype should be another cluster rather than its current one, reallocate the object to that cluster and update the prototypes of both clusters; (4). repeat (3) until no object has changed clusters after a full cycle test of . The number of clusters here is an input parameter and is determined using the elbow criterion (Rousseeuw, 1987; Yu et al., 2014) during the clustering process. Although, using other approaches such as density-based spatial clustering (aka DBSCAN) (Borah and Bhattacharyya, 2004) for finding out the value of a priori to clustering has been proposed, elbow criterion is reported to be a robust and independent approach (Mingjin, 2005; Sugar and James, 2003).

### 3.3 Adaptive Temporal Clustering.

In this part, we explain our approach on how we carry out clustering in an adaptive way based on temporal (dynamically changing with time) properties including but not limited to quantities such as pressure profile, oil, gas, and water production rates, water and gas injection rates, gas/oil ratio, and water/oil ratio. We perform temporal clustering as explained in the following of this part. After carrying out temporal clustering, for cluster adaptiveness, we examine each temporally clustered group of wells and if the internal variation of the signals is larger than a pre-specified threshold, we sub-cluster those clusters again. Our approach can easily be used for any temporal quantity in a similar scenario as explained below.

There have been different approaches proposed for temporal proximity-based clustering using similarity measures such as Euclidean distance, Pearson correlation coefficient, Kullback-Liebler (KL) divergence, and dynamic time warping (DTW), and using clustering algorithms such as k-means, k-medoids, hierarchical, and kernel DBScan (Aghabozorgi et al., 2015; Rani and Sikka, 2012). In this work, we use dynamic time warping (DTW) as the similarity measure of our time series due to its advantages over other similarity counterparts such as being able to give the position of alignment between two sequences, and being able to compare sequences with different lengths. DTW also can exploit the temporal distortions and compare shifted or distorted evolution profiles whose time sampling can even be irregular due to its optimal alignment. DTW is also able to find optimal global alignment between sequences and accordingly is the most commonly used measure to quantify the dissimilarity between sequences (Kruskall and Liberman, 1983; Aach and Church, 2001; Bar-Joseph et al., 2002; Gavrila and Davis, 1995; Rath and Manmatha, ). DTW also provides an overall real number that quantifies the similarity between two temporal sequences and makes it possible to find the best global alignment between them. As the clustering approach, here for the illustration purposes, we have implemented and used k-means together with DTW as the similarity measure, however, any other clustering approach can be used and evaluated in a similarly consistent way.

DTW was first introduced by Sakoe and Chiba (1971) and Hiroaki and Seibi (1978), with applications to speech recognition. It finds the optimal alignment (or coupling) between two sequences of numerical values and captures flexible similarities by aligning the coordinates inside both sequences.

In the following, we briefly describe how DTW works on two given sequences.
Given two sequences and with different lengths (), a warping path is an alignment between and , involving one-to-many mapping for each pair of elements. The cost of a warping path is calculated by the sum of each mapping pair cost. Furthermore, the warping path contains 3 constraints: (1). Endpoint constraint, which states that the alignment starts at pair and ends at pair ; (2). Monotonicity constraint, which states that the order of elements in the path for both and should be preserved same as the original order in and , respectively; (3). Step size constraint, which states that the difference of index for both and between two adjacent pairs in the path need to be no more than one step (i.e. pair can be followed by three possible pairs including , and
). As DTW is a distance measure that
searches the optimal warping path between two series, in DTW we firstly construct a cost matrix where each element is a cost of the pair specified by using Euclidean, Manhattan, or other distance functions. The initial step of DTW algorithm is defined as:

(3) |

The recursive function of DTW is defined as:

(4) |

where are weights for horizontal, vertical, and diagonal directions, respectively. denotes distance or cost between two subsequences and , and indicates the total cost of the optimal warping path. DTW is calculated based on dynamic programming of the order O(). In an equally weighted case, , the recursive function has the preference for diagonal alignment direction because the diagonal alignment takes one-step cost while the combination of a vertical and a horizontal alignment takes two-steps cost where this preference can be counterbalanced if needed by, for instance, setting . One potential issue of using this DTW definition is that the longer the two sequences are, the larger their DTW value will be, so its absolute value may not truly reflect the difference of the two sequences. Thus, we use the normalized DTW, defined by dividing the original DTW by the sum of the lengths of two sequences as:

(5) |

Each alignment in the warping path has a corresponding weight, selected from and the sum of the weights for all alignments equals to the sum of the lengths of two sequences i.e. . Therefore, the normalized DTW evaluates the average distance of alignments in the warping path for two sequences.

### 3.4 Spatio-temporal Clustering.

Once the quantities of interest for clustering are chosen, depending on the type of features, we use spatial clustering and/or temporal clustering as presented before for clustering a given field at the well-level. In the next step, we want to divide a given reservoir into separate zones in the physical space based on the well-level clusters obtained. There have been similar attempts in a few previous works as so-called spatially constrained multivariate clustering (Coppi et al., 2010; Chatzis, 2008; Legendre, 1987; Patil et al., 2006), however, these attempts in enforcing spatial coherence as constraints in the clustering framework have not been successful in the cases with multiple interrelated mixed type features. Also, in the spatially constrained multivariate clustering approaches, defining the proper constraints and satisfying them at the cluster level is a challenging step. In this work, we propose an alternative approach that is not only robust while dealing with mixed type temporal and spatial features but also easy to implement and tune.

Once the clusters at the well-level have been formed using spatial and/or temporal clustering, we find the separating hyperplanes between well-level clusters in order to divide a given field into separate zones. We shape this problem as a supervised learning and use

support vector machine (SVM)approach as a discriminant classifier

(Suykens and Vandewalle, 1999; Scholkopf et al., 1998). We consider wells as individual samples, the cluster labels of the wells as categorical target variables, and the wells’ head coordinates as numerical features. Forming the problem this way, SVM as a classifier predicts the separating hyperplanes of the clustered wells. To account for separations in higher feature space using the kernel trick we consider different kernel functions such as linear polynomial and exponential, and also account for regularization to avoid overfitting. For the details on support vector machines see (Scholkopf and Smola, 2001; Wang, 2005; Lin and Chih-Jen, 2012).### 3.5 Compartmentalized Reservoir Modeling.

In the following sections, we explain our proposed model to study a compartmentalized reservoir with a non-local formulation where the reservoir is represented by an adjacency matrix describing the neighbor and non-neighbor connections of the comprising compartments. We will look into the reservoir’s history using a so called material balance network (MatNet) methodology and also propose a forecast approach using a predictive material balance network.

### 3.6 Reservoir’s History - Material Balance Network (MatNet).

In this section, we present the generalized material balance equation that relies on the following assumptions: (1). the reservoir is under an isothermal transformation and is at equilibrium at any point in time; (2). gas component can be dissolved in the oil phase; (3). oil component can be volatile in the gas phase; (4). and oil, water, and rock are slightly compressible. We start by expressing the initial volume of gas component in the reservoir () as sum of the volumes present as the gas component in the gas phase and as the dissolved gas in the oil phase at the reservoir conditions. Similarly, the original oil in place () can be expressed as sum of the volumes present as the oil component in the oil phase and as the volatile oil in the gas phase at the reservoir conditions as:

(6a) | |||

(6b) |

After some time of recovery from the reservoir, the remaining oil/gas in place can be expressed as the difference between the initial volume of oil/gas in place and the cumulative volume of oil/gas produced as:

(7a) | |||

(7b) |

Using Equations Eq_GN11, Eq_GN12,Eq_GN21,Eq_GN22 we get the volume of gas component in the gas phase () and the volume of oil component in the oil phase () as:

(8a) | |||

(8b) |

We use the original reservoir volume as a control volume to write the material balance equation as:

(9a) | |||

(9b) | |||

(9c) |

where is the local volume change within a given block of the reservoir due to internal changes in pressure or local injections and productions leading to changes in oil, gas, water, and rock volumes, and is the change of volume due to non-local volumetric fluxes of different phases from other blocks of the reservoir (i.e. neighbor or non-neighbor connected blocks). here represents the difference between the initial and the current volumes at the reservoir conditions. The current volume of oil, gas, and water at reservoir conditions can be written as:

(10a) | |||

(10b) | |||

(10c) |

with the original volume of oil, gas, and water at the reservoir conditions being , , and , respectively. The difference between the initial and the current oil/gas/water volumes at the reservoir conditions becomes:

(11a) | |||

(11b) | |||

(11c) |

with , , and being the initial pore volume, the water compressibility, and the change of reservoir pressure, respectively. Finally, the pressure depletion applied to the rock phase creates a shrinkage of the reservoir pore volume () as:

(12) |

where represents the formation (rock) compressibility. Finally, the volume change due to contributions from adjacent compartments (i.e. neighboring communications) can be expressed as:

(13) |

By substituting Equations Eq_Vo,Eq_Vg,Eq_Vw,Eq_Vr,Eq_Neigh in Equation (9a) and expressing the initial pore volume as a function of the initial reservoir fluid volumes as we get:

(14) |

Finally, using Equations (8a) and (8b) in Equation (14) and by some rearrangements and simplifications we get the general material balance equation as:

(15) |

In order to solve the Equation (15) as a coupled non-linear system of equations for all the compartments, we write it in the Residual form and after building the Jacobian matrix, we solve for the unknown pressure of each compartment as:

(16a) | |||

(16b) | |||

(16c) | |||

(16d) |

For an arbitrary block , we split the Residual into two parts of local and non-local, according to the Equation (9a) as where the local and non-local Residuals, respectively, are:

(17) |

(18) |

To calculate the Jacobian matrix in Equation (16c), we split the derivative of the Residual of an arbitrary block into two parts, one with respect to the compartment and one with respect to the connected compartment to , as:

(19) |

where the derivative of the Residual terms in Equation (19) can be calculated, respectively, as:

(20) |

(21) |

(22) |

where Equations Eq_jacob1,Eq_jacob2 and (22) are, respectively, the diagonal and off-diagonal terms in the Jacobian matrix.

As the aquifer model, we use Fetkovich (1971) and Ahmed (2006)’s approach to define the cumulative volume of water encroached from the aquifer as a function of the reservoir’s pressure drop where the cumulative encroached water volume is computed recursively at each time step as:

(23) |

where with being the total aquifer’s compressibility as . Rearranging Equation (23), we get the aquifer’s cumulative encroached water volume at each time step as:

(24) |

Assuming that , Equation (24) can be written as:

(25) |

We can rewrite Equation (25) as:

(26) |

where and . For a few time steps, Equation (26) can be written as:

(27a) | |||

(27b) | |||

(27c) | |||

(27d) |

As , can be written as:

(28) |

The derivative of with respect to reservoir’s pressure can also be calculated as:

(29) |

Finally, using the Fetkovich’s approach, the aquifer terms in the Residual and Jacobian Equations Eq_ResLoc1,Eq_jacob1 can be calculated using the above Equations Eq_wen2, Eq_we_jacob.

### 3.7 Reservoir’s Forecast - Predictive Material Balance Network.

The material balance network approach presented in the previous section is used for analysis of the reservoir based on the available historical data. In this section, we extend the formulation to forecast the reservoir’s performance based on the historical data and the planned schedule of recovery in the future. In particular, a new algorithm is developed for three-phase flows in a mixed-drive reservoir, capable of predicting flow rates and pressure, assuming pressure constraints on the producing wells. Existing methods in the literature by Tarner (1944), Muskat (1945), and Tracy (1995) are designed for solution-gas drive reservoirs with two-phase flows. In the historical material balance network analysis, the flow rates corresponding to each phase are available based on the historical data, however, in the predictive mode, the rates are forecasted and therefore, the constraint on the producer wells is of flowing bottomhole pressure type.

At each time step, there are four unknowns as pressure (), cumulative oil production (), cumulative gas production (), and cumulative water production (). Therefore, four equations are required at each time step to solve for the four unknowns.

The first equation is the general material balance equation (15) as:

(30) |

As the second equation, we consider the flowing bottomhole pressure (FBHP) as the operating parameter. In this case the second equation describes the inflow performance relation (IPR). This equation relates the total liquid production rate with the flowing bottomhole pressure and the average reservoir pressure at each time step. Here, we consider the Vogel’s equation (Ahmed, 2006) as:

(31) |

where is the total liquid (oil and water) production rate and is the maximum liquid flow rate at zero FBHP, also called absolute open flow (AOF). is the flowing BHP for an average well, is the average reservoir pressure, is the number of active producers at each time step, and and are the cumulative volume of water and oil production during one time step respectively.

In the third equation, the relationship between field measurements of water/oil ratio () and fluid and rock-fluid characteristics including , viscosity, and relative permeability information will be considered as:

(32) |

The fourth equation relates the field measurements of gas/oil ratio (GOR) to characteristic properties of oil, gas, and rock including viscosity, PVT, and relative permeability as:

(33) |

In Equations Eq_forecast3, Eq_forecast4 above, , , and are the relative permeabilities of rock to gas, oil, and water in the presence of the three phases respectively, and , , and , respectively represent the viscosity of oil, gas, and water () at the current reservoir pressure. Likewise, , , and correspond to the formation volume factors (RB/STB) of oil, gas, and water at the current reservoir pressure respectively.

It is worth mentioning that the relative permeability ratios in both Equations Eq_forecast3, Eq_forecast4 are different from the ones measured in the lab using cores. The former also depends on the development and operation history of the field, whereas the latter only depends on the rock and fluid properties and their interaction.

In order to solve the Equations Eq_forecast1,Eq_forecast2,Eq_forecast3,Eq_forecast4 as a non-linear system of equations, we write them in the linear Residual form and after building the Jacobian matrix, we solve for the unknowns vector () as:

(34a) | |||

(34b) | |||

(34c) | |||

(34d) | |||

(34e) |

Here, we build the unknowns vector over all the four unknowns (i.e. ) and over all the blocks (i.e. ) in a block-based order (i.e. ). Ordering the unknowns this way leads to a less sparse block-diagonal Jacobian matrix and leads to a smoother convergence of the linear solver once solving Equation (34d) to find . The Jacobian matrix has diagonal blocks (local Jacobian elements) and some non-diagonal non-zero elements due to non-local connectivity of the blocks in the reservoir and pressure derivatives with respect to the adjacent blocks.

First, we present the Equations Eq_forecast1,Eq_forecast2,Eq_forecast3,Eq_forecast4 in a Residual form. The Residual form of Equation (30) is similar to Equations Eq_ResLoc1, Eq_ResLoc2 presented before. The Residual form of Equation (31) would be:

(35) |

The Residual form of Equation (32) can be written as:

(36) |

And finally, the Residual form of the Equation (33) becomes:

(37) |

The corresponding Jacobian equations for the Residual terms can be derived in a similar way as Equation (19).

### 3.8 Global History Matching.

We use an efficient hierarchical history matching approach to update the model parameters. Our history matching approach is a global history matching where large-scale features such as transmissibility between the blocks, aquifer size and strength, average properties of each block, and original oil in plae per block are adjusted to match fluid production and bottomhole pressure data using ensemble smoother robust Levenberg-Marquardt (ES-rLM) (Ma et al., 2017)

history matching workflow. Rather than a long process of finding the minimum objective function value using for instance a genetic algorithm, the ES-rLM is a Bayesian approach that accounts for probabilistic characteristics, which requires much fewer simulation runs.

Salehi et al. (2019a) have recently applied this algorithm for solving a full-physics history matching probelm.Although the ES-rLM algorithms have been discussed elsewhere (Ma et al., 2017), we provide a brief review of the main equations here for completeness. The ensemble equation derived based on the Levenberg-Marquardt algorithm can be written as:

(38) |

where is an ensemble, at the iteration. The relationship between the ensembles and the model parameters can be described as:

(39) |

where is the linearized forward model around the ensemble mean, and the mean of the predicated data is:

(40) |

and,