Plain Language Summary
Modeling for engineering applications that focus on multiphysics processes in porous media is usually facilitated through mathematical equations. Due to non-uniform material properties and non-linear relations, the solution often requires numerical methods. However, these methods require substantial computational resources for large and complex problems. Consequently, model order reduction techniques have been developed as an alternative to accelerate our practical solutions to these multiphysics processes. Although they work well in many cases, it is not easy to parameterize spatially heterogeneous material properties. This work develops a new data-driven framework to take given heterogeneous fields as input and then deliver approximated quantities of interest (e.g., pressure and solid deformation). This framework could provide a fast solution, and two-dimensional demonstration cases show at least 10,000 times speed-up compared to a traditional numerical method (i.e., finite element model). Moreover, it gives a solution with a relative root mean squared error of less than 2 %, which can be an acceptable accuracy given the parameter uncertainty and measurement errors. This framework would allow us to perform field-scale inverse problems and optimization in realistic, complex problems.
Many subsurface applications in porous media ranging from groundwater and contaminant hydrology to sequestration rely on physical models [middleton2015shale, choo2018cracking, yu2020poroelastic, kadeethum2019investigation, bouklas2012swelling, newell2017investigation]
. These models often seek the solution of the governing partial differential equations (PDEs) in a domain of interest. For instance, coupled hydro-mechanical (HM) processes in porous media, in which fluid flow and solid deformation tightly interact in a time-dependent fashion, could be described by Biot’s formulation of linear poroelasticity[biot1941general]. These PDEs are often solved numerically using various techniques such as finite volume or finite element methods [wheeler2014coupling, deng2017locally, liu2018lowest], which is referred to as full order model (FOM) approaches. However, computational methods to handle field-scale systems require substantial computational resources, especially when discontinuities or nonlinearities arise in the solution [hansen2010discrete, hesthaven2016certified]. Therefore, in some instances, the FOM is not directly suitable to handle large-scale inverse problems, optimization, or even concurrent multiscale calculations in which an extensive set of simulations are required to be explored [ballarin2019pod, hesthaven2016certified].
A reduced order model (ROM) that aims to produce a low-dimensional representation of FOM could be an alternative to handling field-scale inverse problems, optimization, or real-time reservoir management [schilders2008model, amsallem2015design, choi2019accelerating, choi2020gradient, mcbane2021component, yoon2009environmental, yoon2009numerical]. The ROM methodology primarily relies on the parameterization of a problem (i.e., repeated evaluations of a problem depending on parameters), which could correspond to physical properties, geometric characteristics, or boundary conditions [ballarin2019pod, venturi2019weighted, hesthaven2016certified, hoang2021domain, copeland2021reduced, choi2020sns, choi2021space, carlberg2018conservative]. However, it is difficult to parameterize heterogeneous spatial fields of PDE coefficients such as heterogeneous material properties by a few parameters. Coupled processes such as HM processes commonly involve complex subsurface structures [flemisch2018benchmarks, jia2017comprehensive, chang2020hydromechanical, chang2020operational, chang2021mitigating] where the corresponding spatially distributed material properties can span several orders of magnitude and include discontinuous features (e.g., fractures, vugs, or channels). Consequently, traditional projection-based ROM approaches might not be suitable for this type of problem as they commonly employ a proper orthogonal decomposition (POD) approach, which in turn will require a high dimensional reduced basis to capture most of the information at the expense of the computational cost [kadeethum2021framework].
kadeethum2021framework present a data-driven generative adversarial networks (GAN) framework that can parameterize heterogeneous PDEs coefficients (i.e., heterogeneous fields) and work for both forward and inverse problems. This GAN-based work could be considered as an extension of regression in subsurface physics through GAN model such as zhong2019predicting, in which heterogeneous fields are also parameterized through GAN model [chan2020parametrization] and subsequently used to predict the state variables (pressure and displacement responses). In addition, this GAN framework based on conditional GAN (cGAN) [mirza2014conditional, isola2017image]
is not conditioned by categorical data (e.g., cat or dog), but by a whole heterogeneous field (for both generator and critic as well as its loss function). Hence, given a highly heterogeneous permeability field, it can approximate fluid pressure and solid deformation for forward modeling. Using noisy and sparsely available pressure data as an input for inverse modeling, the model can accurately approximate a complex permeability field. kadeethum2021framework also show that by utilizing Earth mover’s distance through Wasserstein loss (W loss) and gradient penalty[arjovsky2017wasserstein, gulrajani2017improved], the model accuracy and training stability can be improved significantly. This is because the Earth mover’s distance enforces the framework to approximate the training data distribution rather than a point-to-point mapping. Hence, the model can predict physical quantities depending upon heterogeneous fields well. However, the framework developed by kadeethum2021framework is limited to only steady-state solutions of given PDEs.
Recently, ding2020continuous developed continuous conditional generative adversarial networks (CcGAN) to condition the GAN model with continuous variables such as the weight of each animal. The CcGAN concept can extend the conditioned information from categorical data (e.g., cat or dog) to more quantitative measures (e.g., the weight of each animal). For PDE problems, the concept of CcGAN can also be extended to quantify continuous variables (e.g., time domain), enabling the solution of time-dependent PDEs.
In this work, we extend our previous work [kadeethum2021framework] by extending the CcGAN concept to solve for time-dependent PDEs. This new framework is developed by utilizing element-wise addition or conditional batch normalization [de2017modulating] to incorporate the time domain in both training and prediction processes. The CcGAN approach to solving time-dependent PDEs is uniquely employed to develop a data-driven surrogate model given highly heterogeneous permeability fields generated from two known distributions. The CcGAN performance will be evaluated by comparing the CcGAN-based results with FOM-based solutions, highlighting the computational accuracy and efficiency in heterogeneous permeability fields. The rest of the manuscript is summarized as follows. We begin with the model description and CcGAN architecture in Section 2. The two variants of the framework, including element-wise addition or conditional batch normalization to incorporate the time domain, are also discussed. We then illustrate our ROM framework using three numerical examples in Section 4. Lastly, we conclude our findings in Section 5.
We first present a general framework with a parameterized system of time-dependent PDEs and then as a demonstration case we focus on the linear poroelasticity to represent a coupled solid deformation and fluid diffusion in porous media with highly heterogeneous permeability. A parameterized system of time-dependent PDEs are as following
where corresponds to system of time dependent PDEs, () denotes the computational domain, and denote the Dirichlet and Neumann boundaries, respectively. and are prescribed values on and , respectively. is an initial value of . The time domain is partitioned into subintervals such that , We denote as th time-step, . is a set of scalar (
) or tensor valued (e.g.) generalized primary variables. For the parameter domain , it composes of members, i.e., , , , , , and could be any instances of given . could be scalar () or tensor valued (e.g. ) generalized parameters. We want to emphasize that is an exact solution of and not an approximation through FOM. In general, could correspond to physical properties, geometric characteristics, or boundary conditions. In this study, we limit our interest in approximate primary variables for a solution of the system of PDEs given heterogeneous fields for the generalized parameters at any given time , and note that represents heterogeneous permeability fields that are constant through time ).
2.1 Framework development
A conceptual schematic is illustrated in Figure 1. We train our framework using obtained from FOM. Our target here is to deliver with acceptable accuracy but much lower computational cost compared to . The proposed framework based on the offline-online paradigm is presented in Figure 1b. The offline phase comprises the first to third steps, starting from data generation of 1. permeability fields and 2. to 3. training of CcGAN. The fourth step is the online phase or prediction of .
2.1.1 Offline stages
The first step is an initialization of training, validation, and test sets of parameters (, , and ) of cardinality , , and , respectively. In Figure 1, we illustrate only and for the sake of brevity. We want to emphasize that , , and . These , , and here represent any physical properties, but it could also serve as geometric characteristics or boundary conditions. In this work, we follow kadeethum2021framework and focus on using , , and to represent collections of spatially heterogeneous scalar coefficients - more specifically heterogeneous permeability fields - see Section 3 for details of how we generate these data.
In the second step, we query the FOM, which can provide a solution in a finite-dimensional setting for each parameter (i.e., ) in the training set. Throughout this study, for the sake of simplicity, we use a uniform time-step which leads to each query of the FOM having the same number of . In cases where adaptive time-stepping is required, for instance, advection-diffusion problems, the procedures proposed in kadeethum2021nonTH could be utilized. The same operations follow for each and in the validation and test sets. This work focuses on the linear poroelasticity equations and demonstrates our proposed framework with highly heterogeneous permeability fields. The FOM is used to approximate primary variables , which correspond to bulk displacement () and fluid pressure () fields at each time-step given the field of parameters , in this case - permeability field, as input. Please find the detailed description in Problem description, model geometry, and boundaries section in Supplementary information.
In the third step, ROM is constructed by training the data generated from the FOM where the inputs to the model are and , and the output is or with given and . In this study, we build a separate model for each primary variable ( and
), although both primary variables can be trained together with a single model. A key aspect of this work is to apply the CcGAN image-to-image translation framework for time-dependent PDEs by adapting the concept of a naive label input (NLI) and an improved label input (ILI) proposed by ding2020continuous to the framework developed by kadeethum2021framework. The proposed framework in this work consists of agenerator and critic where two types of architecture for the generator with a similar critic architecture are presented (Figure 2).
The first one uses the NLI concept (i.e., NLI model) by introducing a temporal term () to the generator’s bottleneck using element-wise addition. The details of the architecture can be found in Table S1. The second one adopts the ILI concept (i.e., ILI model) by injecting the temporal term to all layers inside the generator through conditional batch normalization [de2017modulating]. However, in contrast to ding2020continuous,de2017modulating, our is not categorical data (i.e., not a tag of number ranging from zero to nine), but continuous variable (i.e.,
. Hence, we replace embedded layers with an artificial neural network (ANN). To elaborate, each conditional batch normalization composes of one ANN and one batch normalization layer. Each ANN composes of one input, three hidden, and one output layers. Each hidden layer and the input layer are subjected to the tanh activation function. In this way, we can inform eachto each layer of our generator through a conditional batch normalization concept. The details of the ILI architecture can be found in Table S2.
The critic, similar for both NLI and ILI, uses time , parameter , and primary variable ( or ) as its inputs. The output is a patch score added with an inner product calculated using two linear layers and output from the last contracting block ( contracting block) of the critic. To elaborate, parameter () and primary variable ( or ) are fed into the convolutional layer of the critic while time () is injected into the model using linear layer shown in Figure 2. The output from the contracting block of the critic is then passed through the linear layer shown in Figure 2 and performed an inner product operation with the output from the linear layer. The result of this inner product is then added (element-wise) to the patch score presented in Figure 2. The architecture of the critic can be found in Table S3, and kadeethum2021framework,demir2018patch. We also provide an implementation of these models, NLI and ILI, and critic (see Code Availability).
To train both generator and critic, we normalize , , and primary variables ( or, in this paper, and ) to be in a range of . The Wasserstein loss (W loss) [arjovsky2017wasserstein, gulrajani2017improved] is used since it has been shown to provide the best result when dealing with building data-driven frameworks for PDEs (e.g., a series of numerical experiments have been illustrated in kadeethum2021framework). In short, this implementation enforces the model to approximate the training data distribution instead of aiming to do the point-to-point mapping. This improves our model generalization, which is essential as we deal with heterogeneous permeability fields, which can have virtually infinite realizations. The W loss uses the following constraint
Here, and are short for generator and critic, respectively, and is the Earth mover’s distance defined as
where is the final score of the critic with given , , and or shown in Figure 2, and is a batch size, which is set as . The Earth mover’s distance is used to measure the distance between output of the model and the training data. This way helps our model to better generalize its prediction. Additionally, is a user-defined penalty constant that we set at , and as a reconstruction error term is given by
denotes a gradient penalty constant set to 10 throughout this study; is the gradient penalty regularization. The latter is used to enforce Lipschitz continuity of the weight matrices (), i.e., Euclidean norm of discriminator’s gradient is at most one, and it reads as
where is L2 or Euclidean norm. This term helps to improve the stability of the training by limiting the step we can take in updating our trainable parameters (weight matrices () and biases (). The term
is an interpolation betweenand , which is defined by
We randomly select for each
from a uniform distribution on the interval ofkingma2014adam] to train the framework (i.e., updating a set of weight matrices () and biases ()). The learning rate () is calculated as [loshchilov2016sgdr]
where is a learning rate at , is the minimum learning rate (), is the maximum or initial learning rate (), is the current step, and is the final step. We note that each step refers to each time we perform back-propagation, including updating both generator and critic’s parameters ( and ).
2.1.2 Online stage
For the fourth step, we use the trained generator to predict given and or for the instances of the parameters belonging to the validation and test sets. To avoid over-fitting, we first use
to evaluate our framework as a function of epoch. Subsequently, we select the model (fixedand at certain epoch) that provides the best accuracy for to test the . To elaborate, we train our model for 50 epochs. We then test our model against our validation set and observe the model performance as a function of the epoch. We then select a set of and at the epoch that has the best accuracy. Other hyper-parameters, number of layers, number of hidden layers, convolutional layers’ parameters, and initialization of the framework are used based on the neural architecture search performed in kadeethum2021framework.
3 Data generation
We utilize a discontinuous Galerkin finite element (FE) model of linear poroelasticity developed in kadeethum2020enriched, kadeethum2020finite to generate training, validation, and test sets, see Figure 1 - initialization. The geometry, boundary conditions, and input parameters are similar to that used in kadeethum2021framework where a steady-state solution of linear poroelasticity is studied, but, in this work, the temporal output of pressure and displacement are investigated, resulting in the dynamic behavior of pressure (), displacement () as well as pore volume. The mesh and boundary conditions over the square domain used in this work are presented in Figure S1. We enforce constant pressures of 0 and 1000 at the top and bottom boundaries, respectively, to allow fluid to flow from the bottom to the top while no-flow boundary on both left and right sides. Furthermore, we compress the medium with normal traction of 1000 applied at the top boundary. We fix the normal displacement to zero for the left, right, and bottom boundaries. The initial pressure is 1000 , and initial displacement is calculated based on the equilibrium state.
To obtain a set of parameters corresponding to heterogeneous fields, we focus on two types of highly heterogeneous fields generated using: (1) a Zinn & Harvey transformation [zinn2003good], and (2) a a bimodal transformation [muller2020]. The details of generation of fields is available in kadeethum2021framework. Briefly, the field from the Zinn & Harvey transformation has a wider range of values with thinner high permeability pathways. This feature represents highly heterogeneous sedimentary aquifers with preferential flow pathways, such as the MADE site in Mississippi [zinn2003good] and the Culebra dolomite developed for the Waste Isolation Pilot Plant (WIPP) project in New Mexico [yoon2013parameter]. In contrast, the field from the bimodal transformation has a narrow range values with wider high permeability pathways, which is a good representation of sandstone reservoirs with an iron inclusion, for example, Chadormalu reservoirs in Yazd province, Iran [daya2015application]. A few examples of fields from both transformations are shown in Figures 3.
In this work, three examples of fields are used. Two examples are from Zinn & Harvey and bimodal distributions. For Example 3, these two fields are used together. Note that we employ unstructured grids in the finite element solver. However, our framework in this study requires a structured data set. Thus, we interpolate the FE result to structured grids using cubic spline interpolation. We then replace the FOM dimension , associated with the unstructured grid, with a pair , corresponding to the structured grid. The same procedures are carried out for the displacement field .
For simplicity, the FE solver employs a fixed number of for all . Our time domain is set as , and , which leads to . The total size of data set is calculated by where is the number of training, validation, or test sets. While we investigate the effect of on the data-driven model accuracy, is fixed. The samples of the test set of , , and are presented in Figure 3. We note that the difference (DIFF) between solutions produced by the FOM (FEM in this case) and ROM (CcGAN in this case) is calculated by
To reiterate, represents and , represents and , and is field. We also use the relative root mean square error (relative RMSE) read
to evaluate our model. Here, and are the ground truth (FOM result) and approximated values (ROM result), respectively.
4 Numerical results
4.1 Example 1: Zinn & Harvey transformation
The first Example test cases from the Zinn & Harvey transformation are shown in Figures 3a-b, e-f, including fields, FOM and ROM results, and DIFF fields for pressure and displacement fields, respectively. The box plots of relative RMSE values of pressure () during training with the validation set are presented for NLI and ILI models in Figures S2 and S3, respectively, over epochs for different numbers of . As expected, the model performance is improved as the number of training samples increases. Besides, the accuracy improves as the training progresses over epochs but reaches a plateau around epochs 32-34. Figures S2 and S3 show that the ILI model performs slightly better than the NLI model. The behavior of the results of is similar to that of , but the relative RMSE is lower.
The best-trained model is tested against the test set. The distributions of and for the selected test cases are shown in Figures 3a-b, e-f, respectively. The DIFF values are very low (i.e., less than one percent of the average field values). The relative RMSE results of both and of the test set are provided in Table 1. The relative RMSE values of the test set in Table 1 are very close to those of the validation set (Figures S2 and S3). As in training, model performance improves with increasing . Besides, ILI always performs better than NLI. The relative RMSE of displacement () is generally lower than that of pressure (), which attributes to the relatively uniform response of displacement fields, compared to the pressure field. Hence, the ROM can learn the solution easier.
Example 3: a total number of is the sum of training data from both Examples 1 and 2.
4.2 Example 2: bimodal transformation
The second Example presents the model performance using fields from the bimodal transformation, which has a narrow range of values with wider high permeability pathways. As in Example 1, selected test cases of fields, FOM and ROM results, and DIFF fields are presented in Figures 3c-d, g-h. The box plot of relative RMSE values of the validation set is presented in Figures S4 and S5. Similar to Example 1, the model performance improves with increasing . Moreover, the higher the number of epochs, the model tends to provide more accurate results. Although there are some fluctuations of the relative RMSE values at a later training stage, the model accuracy tends to improve as the training progresses. Except for the case where , the ILI model provides better results than the NLI.
For the testing set, the ILI model provides better accuracy than the NLI except for (Table 1). It is noted that ILI always performs better than NLI for the state variable . Moreover, the relative RMSE results of is always lower than those of , which are similar to Example 1.
The relative RMSE of Example 2 is slightly lower than that of Example 1, which is consistent with the steady-state results shown in kadeethum2021framework. In addition, the trend of the relative RMSE values between NLI and ILI is similar in both Examples 1 and 2. We will discuss the performance of NLI and ILI later. Since fields from the bimodal transformation have a narrower range and wider permeable pathways with less contrast compared to those from the Zinn & Harvey transformation (see Figure 3, the corresponding pressure field may have similar features. This can be seen in the DIFF distribution where the DIFF values are larger along high-pressure gradient regions in all pressure cases (Figure 3a-d). At in Example 2, high DIFF values are mostly located near the top boundary where the pressure boundary is set to zero after the initial pressure of 1000 Pa. Over time the DIFF distribution propagates as the pressure contrast migrates along the high permeability regions (e.g., the DIFF fields at ). Compared to Example 1, where pressure gradients tend to be slightly gradual (e.g., wider transition) and the high DIFF values are distributed in larger areas, Example 2 cases show the higher contrast of pressure values along with the pressure transition, and the high DIFF values tend to be distributed more locally (Figure 3a-d).
Although this comparison qualitatively shows the dependency of the model performance on input and output characteristics, it is also well-known that deep neural networks often need complex neural network architecture to extract and learn high-frequency features such as high permeability contrast and high-pressure gradients in this work [xu2019frequency]. Recent work by kim2021connectivity transformed physical connectivity information of the high contrast drainage network into multiple binary matrices, which improved the network generation using deep convolutional GAN. However, the success rates of the drainage network with proper connectivity were relatively low. CcGAN developed in this work shows that although it is still challenging to improve the prediction accuracy, the increase in the training data sets may provide a potential solution to this challenging problem. However, the increase of training data sets also increases required computational resources. This aspect will be discussed later. For displacement results (Figure 3e-h) the relative RMSE results in Example 2 (Table 1) follow the same trend in the pressure results. The lower relative RMSE values of displacement than pressure also stem from the smooth displacement fields compared to pressure fields. It is noted that the relative RMSE values of the test set are similar to those of the validation set.
4.3 Example 3: Combined Zinn & Harvey and bimodal transformations
In this example, permeability fields from both Zinn & Harvey and bimodal transformations are used to test the generalization ability of the proposed approach (i.e., Figures 3a-h). As discussed earlier, Example 3 can represent different types of heterogeneity with high permeable pathways. The relative RMSE of the validation set for pressure fields () is presented in Figures S6 and S7. Similar to Examples 1 and 2, the model accuracy improves with increasing , and we did not observe any over-fitting behavior.
As presented in Table 1, for both pressure and displacement fields, the ILI model performs better than the NLI, and the model with a higher (i.e., more training data) provides better accuracy. Note that Example 3 has two times higher than the other two cases. Overall, the relative RMSE values in Example 3 are closer to those in Example 1 rather than Example 2 for the same number of . This indicates that more challenging fields for ML training predominantly govern the model performance with combined fields. In addition, Example 3 tends to have higher relative RMSE values than Examples 1 and 2 for the lower number of (e.g., = 2500 and 5000 for pressure and displacement, respectively). As the number of
increases, however, Example 3 has lower RMSE values than Example 1 and gets closer to Example 2, indicating that more training data sets improve the ML model to a certain degree. Although there may be more optimal hyperparameters to train both fields better, these results demonstrate the general learning capability of the proposed models. The inclusion of a more challenging data set will increase the generalization capability of the trained model.
4.4 Computational costs
A summary of the wall time used for each operation (i.e., steps 2 to 4 in Figure 1) is presented in Table S4. The time for step 1 or initialization was negligible compared to the other steps; hence, we leave out the time used for step 1 from Table S4. The finite element model (FOM) was run using a single AMD Ryzen Threadripper 3970X, while training and testing of CcGAN or ROM models were carried out with single GPU (NVIDIA Quadro RTX 6000). With the fixed number of throughout this study, each FOM simulation (per ) takes about 40 seconds. Consequently, if we select as an example, it would take about 400,000 seconds (111 hrs) to build the training set. To train the model using , it takes approximately 30 hours. During the online or the prediction phase, however, the trained model could provide its result within 0.001 seconds per testing . It should be noted that the ROM as the trained model in this work is not required to have the number of in which FOM uses. Assuming the ROM also uses , it still provides a speed-up by 10,000 times. One advantage of the ROM framework compared to the FOM is that it can also deliver the solution at any time, including times that do not exist in the training snapshots since it is simply a nonlinear interpolation in output space. This characteristic is an asset of our ROM because it is not bound by any time-stepping constraints and can evaluate our quantities of interest at any time required. For instance, we may be interested in pressure and displacement fields at one, two, and three hours with given different . To achieve this with FOM, it may need to go through many steps in between. However, ROM enables us to evaluate it at those three times only.
4.5 Prediction accuracy and geophysical applications
With three examples presented, we observe that the ILI model performs better than the NLI for all cases except one particular case (Table 1). This finding is in good agreement with classification problems in ding2020continuous. ding2020continuous speculates that the ILI overcomes the label inconsistency of the classification problems while the NLI could not. Our reasoning for the outperformance of the INI stems from continuous conditional batch normalization, which can carry out temporal information more consistently than the element-wise addition in the NLI model. In addition to the skip connections that are essential in transferring multiscale features between input (permeability fields) and output (pressure and displacement fields) (see kadeethum2021framework), CcGAN provides the framework to account for temporal features over time, resulting in better representation of temporal patterns.
One key aspect of many subsurface energy applications is a coupled process that involves hydrogeology and geomechanics. Although ML-based data-driven modeling has been increasingly studied for reservoir modeling, most are still limited to uncoupled processes (e.g., lange2020machine,miah2020predictive) or relatively homogeneous fields (e.g., zounemat2021ensemble,zhao2021reduced). The CcGAN approach proposed in this study demonstrates its capability to handle coupled hydro-mechanical processes with relative RMSE less than 2% of the transient pressure and displacement responses in the worst case. The results also show that the relative RMSE of the ILI model can be improved to a scale of 1% with more training data sets, which can be acceptable given the uncertainty and observational error in the subsurface systems. Additionally, our framework for model prediction (i.e., online stage) achieves up to
10,000x speed-up compared to the FE solver. Note that the computational advantage of ML-driven ROM models will increase further with the increasing degree of freedom in the FE solver (e.g., three-dimensional and longer transient problems). This computational advantage and accuracy will enable us to achieve real-time reservoir management and robust uncertainty quantification even for vast parameter space. At the same time, the ROM can be updated offline as necessary. It should be noted that the method presented here can be extended to incorporate more continuous variables into the system. For instance, besides the time domain, we could also add Young’s modulus and Poisson ratio into the CcGAN model. Furthermore, since this model isdata-driven, it is not limited to only coupled hydro-mechanical processes presented in this manuscript but also applicable to other coupled processes such as thermal-hydrological-mechanical-chemical (THMC) processes.
This work presents a data-driven framework for solving a system of time-dependent partial differential equations, more explicitly focusing on coupled hydro-mechanical processes in heterogeneous porous media. This framework can be used as a proxy for time-dependent coupled processes in heterogeneous porous media, which is challenging in classical model order reduction. Our framework is developed upon continuous conditional generative adversarial networks (CcGAN) composed of the U-Net generator and patch-based critic. The model has two variations: (1) the time domain is introduced to only the generator’s bottleneck using element-wise addition (i.e., NLI), and (2) the time domain is injected into all layers inside the generator through conditional batch normalization (i.e., ILI). The critic is similar for both models. Our approach is desirable because it does not require any cumbersome modifications of FOM source codes and can be applied to any existing FOM platforms. In this regard, the CcGAN approach to solve time-dependent PDEs is uniquely employed to develop a data-driven surrogate model given highly heterogeneous permeability fields. We illustrate that our framework could efficiently and accurately approximate finite element results given a wide variety of permeability fields. Our results have a relative root mean square error of less than 2% with 10,000 training samples for training. Additionally, it could speed up at least 10,000 times compared to a forward finite element solver. ILI delivers slightly better results than NLI without any observable additional computational costs. To this end, this framework will enable us to do a large-scale operation of real-time reservoir management or uncertainty quantification with complex heterogeneous permeability fields, which are practically very difficult to do with FOM and traditional model order reductions.