Laboratory study of reservoir rock samples of a geologic formation (Core Analysis) is the direct way to determine reservoir properties and to provide accurate input data for geological models (Andersen M. A., 2013). Geoscientists have developed a variety of approaches for measuring properties of reservoir rocks, such as porosity, permeability, residual oil saturation, and many others. The information obtained from core analysis aids in formation evaluation, reservoir development, and reservoir engineering (McPhee et al, 2015; Mahzari et al, 2018). It can be used to calibrate log and seismic measurements and to help with well placement, completion design, and other aspects of reservoir production.
Common applications of Core Analysis include (Gaafar et al, 2015):
definitions of porosity and permeability, residual fluid saturations, lithology and prediction of possible production of gas, condensate, oil or water;
definition of spatial distributions of porosity, permeability and lithology to characterize a reservoir in macro scale;
definition of fluids distribution in a reservoir (estimation of fluids contacts, transition zones);
performing special core analysis tests to define the most effective field development plan to maximize oil recovery and profitability.
Unfortunately, core analysis is expensive and tedious. Laboratory studies require careful planning to obtain data with minimum uncertainty (Ottesen and Hjelmeland, 2008). Reliable results of basic laboratory tests provide the reservoir management team with vital information for the further development and production strategy.
Core analysis is generally categorized into two groups: conventional or routine core analysis (RCA) and special core analysis (SCAL) (Dandekar, 2006). RCA generally refers to the measurements for defining porosity, grain density, absolute permeability, fluid saturations, and a lithologic description of the core. Samples for conventional core analysis are usually collected three to four times per meter (Monicard, 1980). Fine stratification features and spatial variations in lithology may require more frequent sampling.
Probably the most prominent SCAL tests are two-phase or three-phase fluid flow experiments in rock samples for defining relative permeability, wettability, and capillary pressure. In addition, SCAL tests may include measurements of electrical and mechanical properties, petrographic studies and formation damage tests (Orlov et al, 2018). Petrographic and mineralogical studies include imaging of the formation rock samples through thin-section analysis, X-ray diffraction, scanning electron microscopy (SEM), and computed tomography (CT) scanning in order to obtain a better visualization of the pore space (Dandekar, 2006; Liu et al, 2017; Soulaine and Tchelepi, 2016). SCAL is a detailed study of rock characteristics, but it is time-consuming and expensive. As a result, the number of SCAL measurements is much smaller than the number of RCA measurements (5-30% of RCA tests). Consequently, SCAL data have to be correctly extended or extrapolated to the data space covered by RCA. To make this possible, the set of core samples used in SCAL tests should be highly representative: it should contain all the rock types and cover a wide range of permeability and porosity (Stewart, 2011). Even then, it is sometimes difficult to establish correlations between conventional and special core analysis results and to extend SCAL data to the available RCA dataset. There are a few common approaches to extending SCAL data to the RCA data space:
typification (defining rock types with typical SCAL characteristics in certain ranges of RCA parameters);
petrophysical models (SCAL characteristics included as parameters in functional dependencies between RCA characteristics);
prediction models based on machine learning (RCA parameters used as features to predict SCAL characteristics).
The first approach leads to a significant simplification of reservoir characterization and is based on subjective conclusions. Petrophysical models allow predicting only a few of SCAL characteristics (basically capillary curves and residual saturations). The last approach looks more promising as it accounts for all the available features (measurements) and builds implicit correlations among the features (Meshalkin et al, 2018; Tahmasebi et al, 2018).
The purpose of this research is to demonstrate how Machine Learning (ML) can maximize the value obtained from RCA and SCAL data. Machine Learning is a subarea of artificial intelligence based on the idea that an intelligent algorithm can learn from data, identify patterns and make decisions with minimal human intervention (Kotsiantis et al, 2007).
A common feature of fields in Eastern Siberia is the presence of salts (ionic compounds that can be formed by the neutralization reaction of an acid and a base) in the pore volume of the deposits. The distribution of salts in the reservoirs depends on a complex of sedimentation processes. Thus, the key challenge of this work is to develop prediction models that characterize the quantity of soluble rock matrix components (sodium chloride and other ionic compounds) and the increase of porosity and permeability after reservoir desalination due to drilling mud or water injection (ablation).
One of the main challenges for geoscientists is forecasting salts distribution in productive horizons together with porosity and permeability alteration due to the salts ablation. It is very important for:
estimations of original porosity and permeability in wells as water-based drilling muds can change pore structure during wellbore drilling and coring,
RCA and SCAL data validation and correction due to pore structure alteration during core sample preparation and consequent measurements,
reservoir engineering (IOR&EOR based on water injection).
In this work, salts content and the alteration of porosity and permeability after desalination can be considered SCAL measurements because they are expensive and time-consuming to obtain. The procedure for estimating the alteration includes porosity and permeability measurements before and after water injection into the core samples and their desalination during long-term single-phase water filtration. Porosity and permeability before desalination, sample density, and the lithology and texture description are the RCA input data for our predictive models.
The significant benefit of ML predictive models is that one does not have to perform SCAL measurements for all the core samples but can predict the results instead (Unsal et al, 2005). Once a predictive model of any SCAL result is trained, it can be effectively used for future forecasting. There are many ML algorithms for building a predictive model (Hastie et al, 2001). In our work we used the following algorithms: linear regression (with and without regularization) (Boyd and Vandenberghe, 2004; Freedman, 2009), decision tree (Quinlan, 1986), random forest (Ho, 1995), gradient boosting (Friedman, 2000), neural network (Haykin, 1994) and support vector machines (Cortes and Vapnik, 1995). The choice of algorithm strongly depends on the considered problem, the data quality and the size of the dataset. For example, it would be unnecessary to build a convolutional or recursive NN for our problem because of the small dataset size and its structure. However, the simpler algorithms mentioned above can be adopted for the discussed cases.
Accordingly, we have two goals in our research. The first is to develop a predictive model of salts concentration using RCA information and some additional data about the coring depth and the top and bottom depths of the productive horizons. The second is to develop relevant predictive models of porosity and permeability.
The main innovation elements of the research are:
special experimental investigation of the porosity and permeability increase in core waterflooding tests;
validation of predictive algorithms to define the best predictive model;
accounting for 10 features of core samples to predict porosity and permeability after rock desalination and 9 features to predict salts content;
the high quality of the models for prediction of porosity and permeability after rock desalination and the rather good quality of the model for evaluation of rock salinity.
2 Materials and Methods
2.1 Hydrocarbon reservoir characterization
The Chayandinskoye oil and gas condensate field is located in the Lensk district of Sakha (Yakutia) Republic in Russia and hosted towards the south of the Siberian platform within the Nepa arch. The field belongs to the Nepa-Botuobinsky oil and gas area, which contains rich hydrocarbons reserves. The main gas and oil resources are associated with the Vendian terrigenous deposits (Talakh, Khamakin and Botuobinsk horizons) which are overlapped by a thick series of the salt-bearing sediments.
The Chayandinskoye field is characterized by a complex geological structure and special thermobaric formation conditions (reservoir pressure of 36-38 MPa, overburden stress of 50 MPa, temperature of 11-17 °C). The Vendian deposits consist predominantly of quartz sandstones and aleurolites with a low level of cementation and development of indentation and incorporation of grains. Another essential feature of the field is the presence of salts in the pore volume of the deposits. The distribution of salts in the reservoir is exceptionally irregular due to various sedimentation processes: changes in thermobaric conditions during regional uplifts and erosional destruction of deposits, paleoclimate cooling and glaciation, and filtration of brines through rock faults and fractured zones (Ryzhov et al, 2014). The most common salt in the rock matrix is sodium chloride (NaCl), but many other salts occur in varying smaller quantities. The same holds for the brine composition according to TDS analysis (measurement of the total ionic concentration of dissolved minerals in water). Rock analysis demonstrates that highly salinized formations are coarse-grained, poorly sorted rocks with salts mass concentration ranging from 4 to 30%. The porosity of the salted rocks is 1-8% (seldom 10%). After core desalination, permeability can increase up to 60 times and porosity up to 2.5 times.
We included the following features to the dataset for our prediction models:
measurements of salts mass concentrations for core samples with various values of initial porosity and permeability, lithology, depth, horizons’ ID and wells’ ID;
measurements of porosity and absolute permeability before and after desalination.
All tests were performed on 102 cylindrical core samples with 30 mm radius and 30 mm length. Sample preparation included delicate extraction in an alcohol-benzene mixture at room temperature (to avoid premature desalination) and drying to constant weight. Absolute permeability was measured at ambient conditions in the steady-state regime of nitrogen flow. Porosity was determined by a gas-volumetric method based on Boyle’s Law (RP40, 1998).
For salts ablation, we injected more than 10 pore volumes of low-salinity brine (30 mg/cc) into each core sample. To enhance ablation, we also performed an additional extraction in the alcohol-benzene mixture to remove oil films preventing salts dissolution. After that, the samples were dried to constant weight to measure porosity and permeability after desalination. The results of the porosity and permeability measurements (before and after salts ablation) are presented in figures (a) and (b).
Salts mass concentrations of the core samples were determined by weighing each sample before and after desalination. The salts concentration is defined as:
$$C_{salt} = \frac{\Delta m}{m_0} \cdot 100\% \tag{1}$$
where $\Delta m$ is the sample mass difference before and after ablation (g) and $m_0$ is the sample mass before desalination (g). All the measurements were made on oven-dry samples. The measured salts mass concentrations are presented in figure (c).
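The weighing-based definition above (mass lost during ablation divided by the initial oven-dry mass) can be sketched in a few lines. The function name and the sample masses are hypothetical and serve only as an illustration; expressing the result in percent is an assumption consistent with the 4 to 30% range quoted for the field.

```python
def salt_mass_concentration(mass_before_g: float, mass_after_g: float) -> float:
    """Return the salts mass concentration in percent of the initial dry mass."""
    if mass_before_g <= 0:
        raise ValueError("mass before desalination must be positive")
    delta_m = mass_before_g - mass_after_g  # mass removed by ablation, g
    return 100.0 * delta_m / mass_before_g


# Hypothetical oven-dry masses (g) before and after desalination.
c_salt = salt_mass_concentration(50.0, 45.0)
```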
For salts concentration predictive model, we have used 9 features: formation top depth, formation bottom depth, initial (before desalination) porosity and permeability, sample depth adjusted to log depth, sample density (before desalination), average grain size (by lithology and texture description), sample colour and horizon ID. Average grain size was quantified from textual lithology description in the following way:
Gravel: 1 mm;
Coarse sand: 0.5 mm;
Medium sand: 0.25 mm;
Fine sand: 0.1 mm;
Coarse silt: 0.05 mm;
Fine silt: 0.01 mm;
Clay: 0.005 mm.
For sample colour and horizon type, we have used a classification scheme containing 6 colour types and 3 horizon types. Each colour type and each horizon type is a separate binary feature: we mark “1” if the sample belongs to the given type and “0” otherwise.
For the porosity and permeability predictive models we have used the 9 previously described features plus salts concentration. All 10 features used in the machine learning algorithms are presented in table 1.
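The encoding described above can be illustrated with a minimal sketch. The grain sizes come from the list above. The three horizon names are taken from the field description and are assumed to be the three horizon types. The six colour labels are purely hypothetical placeholders, because the text does not list them, and `encode_sample` is an illustrative helper, not part of the original workflow.

```python
# Average grain size (mm) assigned to each textual lithology class.
GRAIN_SIZE_MM = {
    "gravel": 1.0, "coarse sand": 0.5, "medium sand": 0.25, "fine sand": 0.1,
    "coarse silt": 0.05, "fine silt": 0.01, "clay": 0.005,
}
# Assumed horizon types (names from the field description).
HORIZONS = ("Talakh", "Khamakin", "Botuobinsk")
# Hypothetical colour labels: the six colour types are not named in the text.
COLOURS = ("grey", "dark grey", "brown", "red", "green", "white")


def encode_sample(lithology: str, colour: str, horizon: str) -> list:
    """Return [grain size, 6 colour indicators, 3 horizon indicators]."""
    grain_size = GRAIN_SIZE_MM[lithology.lower()]
    colour_flags = [1 if colour == c else 0 for c in COLOURS]
    horizon_flags = [1 if horizon == h else 0 for h in HORIZONS]
    return [grain_size] + colour_flags + horizon_flags


encoded = encode_sample("Fine sand", "grey", "Khamakin")
```

Each sample thus contributes one numeric grain-size value and a set of 0/1 indicators with exactly one "1" per categorical attribute.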
|No.||Feature||Unit|
|2||formation top depth||m|
|3||formation bottom depth||m|
|4||porosity before desalination||%|
|5||absolute permeability before desalination||mD|
|8||average grain size||mm|
2.3 Prediction models
We have compared the predictive power of 9 models: linear regression (simple, with L1 and with L2 regularization), decision tree, random forest, gradient boosting (two different implementations, with and without regularization), neural network and support vector machines.
Linear regression. The predicted parameter is modelled as a linear function of the features:
$$\hat{y}(x) = w^{T} x + w_0 \tag{2}$$
where $\hat{y}$ is the predicted parameter; $x$ is the vector of features; $w$ is the vector of optimized coefficients (weights) and $w_0$ is the intercept.
The optimization problem for regression is given by the expression:
$$\min_{w,\, w_0} \sum_{i=1}^{N} \left( y_i - w^{T} x_i - w_0 \right)^2 \tag{3}$$
The coefficients $w$, $w_0$ are determined from a training set of data ($y_i$ is the actual value of the predicted parameter for the $i$-th sample; $N$ is the size of the training set). The linear regression model with the fitted coefficients can then be applied to new data. This algorithm is implemented in the LinearRegression() class of the Python scikit-learn library (Pedregosa et al, 2012).
Sometimes regression with regularization works better than simple regression. When there are many features, the linear regression procedure can lead to overfitting: enormous weights w that fit the training data very well but predict future data poorly. “Regularization” means modifying the optimization problem to prefer small weights. To avoid the numerical instability of the Least Squares procedure, regression with L2 and L1 regularization is often applied (Hastie et al, 2001).
Linear regression with L2 regularization (Ridge) (Boyd and Vandenberghe, 2004). This approach is based on Tikhonov regularization, which addresses the numerical instability of the matrix inversion and subsequently produces lower variance models:
$$\min_{w,\, w_0} \sum_{i=1}^{N} \left( y_i - w^{T} x_i - w_0 \right)^2 + \lambda \|w\|_2^2 \tag{4}$$
All variables have the same meanings as in the linear regression case. The optimal regularization parameter $\lambda$ is chosen so as to obtain the best model fit while keeping the weights w small. This algorithm is implemented in the Ridge() class of the Python scikit-learn library (Pedregosa et al, 2012).
Linear regression with L1 regularization (Lasso) (Boyd and Vandenberghe, 2004). While L2 regularization is an effective approach to achieving numerical stability and increasing predictive performance, it does not address another problem with Least Squares estimates: parsimony of the model and interpretability of the coefficient values (Tibshirani, 2011). Another trend has been to replace the L2-norm with an L1-norm:
$$\min_{w,\, w_0} \sum_{i=1}^{N} \left( y_i - w^{T} x_i - w_0 \right)^2 + \lambda \|w\|_1 \tag{5}$$
L1 regularization has many of the beneficial properties of L2 regularization but yields sparse models that are more easily interpreted (Hastie et al, 2001). The L1 regularization algorithm is implemented in the Lasso() class of the Python scikit-learn library (Pedregosa et al, 2012).
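The difference between the three linear models can be illustrated on synthetic data. The sketch below is not the study's dataset: the 200 samples, the 10 random features (only two of which carry signal) and the regularization strengths are assumptions chosen for illustration. In scikit-learn the regularization strength is passed as `alpha`, and the Lasso objective is scaled by the number of samples, so the numerical values are not directly comparable between the two estimators.

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression, Ridge

rng = np.random.default_rng(0)
n_samples, n_features = 200, 10
X = rng.normal(size=(n_samples, n_features))
# Only the first two features influence the target; the rest are noise.
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=n_samples)

ols = LinearRegression().fit(X, y)
ridge = Ridge(alpha=10.0).fit(X, y)   # L2 penalty shrinks all weights
lasso = Lasso(alpha=0.1).fit(X, y)    # L1 penalty can set weights exactly to zero

ols_norm = np.linalg.norm(ols.coef_)
ridge_norm = np.linalg.norm(ridge.coef_)
n_nonzero_ols = np.count_nonzero(ols.coef_)
n_nonzero_lasso = np.count_nonzero(lasso.coef_)
```

Comparing `ridge_norm` with `ols_norm` shows the shrinkage effect of the L2 penalty, while `n_nonzero_lasso` shows how many features the L1 penalty keeps in the model.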
Decision Tree (Quinlan, 1986). A decision tree is a tree representation of a partition of the feature space. There are a number of different types of tree algorithms, but here we consider only the CART (Classification and Regression Trees) approach (Breiman, 2017). A classification tree is a decision tree which returns a categorical answer (class, text, colour and others), while a regression tree is a decision tree which returns a numerical value. Figure (a) demonstrates a simple example of a decision tree in the case of a two-dimensional space based on two features X1 and X2.
A decision tree consists of sets of nodes and leaves. One may build a very detailed, deep tree with many nodes; however, such a tree will suffer from overfitting, so the maximum depth of the tree is usually restricted. To build a tree, recursive partitioning is applied until a tree of sufficient size is obtained. The splitting criterion is usually the Gini index or information gain in the classification case and the mean squared error or mean absolute error in the regression case. These functions measure the quality of a split and are used to choose the optimal partition point. Tree construction may not involve all input variables, so a tree can show which variables are relatively important, but it cannot rank the input variables. The decision tree algorithm used in this work is implemented in the DecisionTreeRegressor() class of the Python scikit-learn library (Pedregosa et al, 2012).
Random Forest (Ho, 1995; Breiman, 2001). The main idea of this method is to build many independent decision trees (an ensemble of trees), train each of them on a data subset and combine their predictions. The algorithm uses bootstrap re-sampling to prevent overfitting. Bootstrapping is re-sampling with replacement: each bootstrap set is drawn from the initial data, so that some samples are left out and others are repeated. Each tree is built on an individual bootstrap set (so, for N tree estimators, we need N different bootstrap sets). Consequently, all trees are different, as they are built on different datasets, and give different predictions. After training, all trees are aggregated, and the final prediction is obtained by averaging (in the case of regression) the predictions of the individual trees. One useful feature of the Random Forest algorithm is that it can rank input features. It is implemented in the RandomForestRegressor() class of the Python scikit-learn library (Pedregosa et al, 2012).
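The bootstrap-and-average idea can be sketched by hand with plain regression trees. This is a simplified illustration on synthetic data, not the RandomForestRegressor() implementation itself, which additionally randomizes the features considered at each split. The data, the number of trees and the tree depth are assumptions.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(120, 2))
y = np.sin(4.0 * X[:, 0]) + X[:, 1] + rng.normal(scale=0.1, size=120)

n_trees = 25
trees = []
for _ in range(n_trees):
    # Bootstrap set: draw len(X) row indices with replacement.
    idx = rng.integers(0, len(X), size=len(X))
    trees.append(DecisionTreeRegressor(max_depth=4, random_state=0).fit(X[idx], y[idx]))

# Regression ensemble: average the predictions of the individual trees.
per_tree_pred = np.stack([tree.predict(X) for tree in trees])
ensemble_pred = per_tree_pred.mean(axis=0)

ensemble_mse = np.mean((y - ensemble_pred) ** 2)
mean_single_tree_mse = np.mean((y - per_tree_pred) ** 2)
```

Because the squared error is convex, the error of the averaged prediction cannot exceed the average error of the individual trees, which is the motivation for aggregating them.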
Gradient Boosting (Friedman, 2000). This method uses "boosting" of an ensemble of weak learners (often decision trees). The boosting algorithm combines trees sequentially in such a way that the next estimator (tree) learns from the error of the previous one: the method is iterative, and each subsequent tree is built as a regression on pseudo-residuals. Like any other ML algorithm, Gradient Boosting minimizes a loss function, and gradient descent is applied to minimize the error (loss function) associated with adding a new estimator. The final model is obtained by combining the initial estimate with all subsequent estimates with appropriate weights. The gradient boosting method used in this study is implemented in the GradientBoostingRegressor() class of the scikit-learn library (Pedregosa et al, 2012). The XGBoost library (Chen and Guestrin, 2016) was also considered because it allows adding regularization to the model.
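The sequential fitting on pseudo-residuals can be sketched for the squared-error loss, where the pseudo-residual is simply the difference between the actual value and the current prediction. The loop below is a bare-bones illustration on synthetic data with an assumed learning rate, tree depth and number of iterations; it omits the subsampling and regularization options of GradientBoostingRegressor() and XGBoost.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(150, 3))
y = 2.0 * X[:, 0] + np.sin(6.0 * X[:, 1]) + rng.normal(scale=0.1, size=150)

learning_rate = 0.1
prediction = np.full(len(y), y.mean())       # initial estimate: the mean
initial_mse = np.mean((y - prediction) ** 2)

train_mse = []
for _ in range(50):
    residuals = y - prediction               # pseudo-residuals for squared loss
    tree = DecisionTreeRegressor(max_depth=2, random_state=0).fit(X, residuals)
    prediction += learning_rate * tree.predict(X)   # add the weighted weak learner
    train_mse.append(np.mean((y - prediction) ** 2))
```

The training error stored in `train_mse` decreases from one iteration to the next, which is exactly the "learning from the error of the previous estimator" described above.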
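Support Vector Machine (SVM) (Cortes and Vapnik, 1995). The idea of the support vector machine (in the regression case) is to find a function that approximates the data in the best possible way: the function deviates from the actual training values by at most $\varepsilon$ and is as flat as possible. Such a linear function can be expressed as:
$$f(x) = \langle w, x \rangle + b \tag{6}$$
The optimization problem can be formulated in the following form:
$$\min_{w,\, b} \ \frac{1}{2}\|w\|^2 \quad \text{subject to} \quad \left| y_i - \langle w, x_i \rangle - b \right| \le \varepsilon, \quad i = 1, \dots, N \tag{7}$$
Often, some errors beyond $\varepsilon$ are allowed, which requires introducing slack variables $\xi_i$, $\xi_i^*$ into the problem:
$$\min_{w,\, b,\, \xi,\, \xi^*} \ \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{N} \left( \xi_i + \xi_i^* \right) \quad \text{subject to} \quad \begin{cases} y_i - \langle w, x_i \rangle - b \le \varepsilon + \xi_i, \\ \langle w, x_i \rangle + b - y_i \le \varepsilon + \xi_i^*, \\ \xi_i,\ \xi_i^* \ge 0 \end{cases} \tag{8}$$
The solution of the SVR problem is usually given in a dual form (Hsieh et al, 2008), which includes the calculation of Lagrange multipliers $\alpha_i$, $\alpha_i^*$. In this formulation the solution looks as follows:
$$f(x) = \sum_{i=1}^{N} \left( \alpha_i - \alpha_i^* \right) K(x_i, x) + b \tag{9}$$
where $K(x_i, x)$ is a kernel (Crammer and Singer, 2001).
In this work, the standard Gaussian kernel has been applied:
$$K(x, x') = \exp\left( -\gamma \|x - x'\|^2 \right) \tag{10}$$
The SVM algorithm is implemented in the SVR() class of the Python scikit-learn library (Pedregosa et al, 2012).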
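The dual-form solution with a Gaussian kernel can be checked numerically against scikit-learn. In the sketch below the synthetic data and the values of `C`, `epsilon` and `gamma` are assumptions for illustration. The fitted `SVR` object exposes the support vectors, the differences of the Lagrange multipliers (`dual_coef_`) and the bias (`intercept_`), so the prediction can be rebuilt by hand as a kernel-weighted sum over the support vectors.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(80, 3))
y = X[:, 0] ** 2 + 0.5 * X[:, 1] + rng.normal(scale=0.05, size=80)

gamma = 0.5
svr = SVR(kernel="rbf", C=10.0, epsilon=0.05, gamma=gamma).fit(X, y)

X_new = rng.uniform(-1.0, 1.0, size=(5, 3))
# Gaussian kernel between every support vector and every new point.
K = rbf_kernel(svr.support_vectors_, X_new, gamma=gamma)
# dual_coef_ stores (alpha_i - alpha_i^*) for the support vectors only.
manual_pred = (svr.dual_coef_ @ K + svr.intercept_).ravel()
library_pred = svr.predict(X_new)
```

Only the support vectors enter the sum, because the multipliers of all other training points are zero.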
Neural Network (Haykin, 1994; Schmidhuber, 2014). An artificial neural network (ANN) is a mathematical representation of a biological neural network (figure (b)). It consists of several layers of units that are connected by links (McCulloch and Pitts, 1943). Each link has an associated weight and activation level. Each node has an input value, an activation function and an output. In an ANN, information propagates (forward pass) from the first (not hidden) layer with the inputs to the first hidden layer and then to further hidden layers until the output layer is reached. The value in each node of the first hidden layer is obtained by applying the activation function to the dot product of the inputs and the weights. The next hidden layer receives the output of the previous one and passes its dot product with the weights to the activation function, and so on. Initially, all weights for all nodes are assigned randomly. The ANN calculates the first output with random weights, compares it to the real value, calculates the error and adjusts the weights to obtain a smaller error on the next iteration via backpropagation (the ANN training algorithm) (Rumelhart et al, 1986). After training, all weights are tuned, and one may make a prediction for new data by passing them into the ANN, which calculates the output via forward propagation through all activations. The scikit-learn library provides an ANN implementation in the MLPRegressor() class (Pedregosa et al, 2012). This implementation allows specifying the number of hidden layers, the number of nodes in each layer, the activation function, the learning rate and some other parameters.
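The forward pass described above can be reproduced by hand from the weights of a trained `MLPRegressor`. The sketch uses synthetic data and an assumed small architecture (two hidden layers with 2 and 4 nodes, tanh activation, L-BFGS solver); these settings are for illustration only. The fitted weights and biases are available as `coefs_` and `intercepts_`, and the output layer of a regressor uses the identity activation.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
y = X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.1, size=100)

mlp = MLPRegressor(hidden_layer_sizes=(2, 4), activation="tanh",
                   solver="lbfgs", max_iter=500, random_state=0).fit(X, y)

# Forward pass: dot product with the weights, add the bias, apply the activation.
layer_output = X
for weights, bias in zip(mlp.coefs_[:-1], mlp.intercepts_[:-1]):
    layer_output = np.tanh(layer_output @ weights + bias)
# Output layer: linear (identity activation) for regression.
manual_pred = (layer_output @ mlp.coefs_[-1] + mlp.intercepts_[-1]).ravel()
```

The manually propagated values coincide with `mlp.predict(X)`, which confirms that prediction is nothing more than the chain of dot products and activations.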
2.4 Methodology of using machine learning algorithms
Metrics. To evaluate the accuracy of applied methods and compare them between each other, the following metrics have been exploited.
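The coefficient of determination R2 is the proportion of the total (corrected) sum of squares of the dependent variable “explained” by the independent variables in the model. In other words, the R2 score is the fraction of the variance of the dependent variable that is predictable from the independent variables:
$$R^2 = 1 - \frac{\sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^2}{\sum_{i=1}^{N} \left( y_i - \bar{y} \right)^2} \tag{11}$$
where $y_i$ are real values, $\hat{y}_i$ are predicted values and $\bar{y}$ is the mean value.
Mean squared error corresponds to quadratic errors:
$$MSE = \frac{1}{N} \sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^2 \tag{12}$$
Mean absolute error corresponds to absolute errors:
$$MAE = \frac{1}{N} \sum_{i=1}^{N} \left| y_i - \hat{y}_i \right| \tag{13}$$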
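As a small worked example, the three metrics can be computed directly from their standard definitions. The four actual and predicted values below are hypothetical numbers chosen so that the arithmetic is easy to follow.

```python
import numpy as np

y_true = np.array([1.0, 2.0, 3.0, 4.0])   # hypothetical measured values
y_pred = np.array([1.5, 2.0, 2.5, 4.0])   # hypothetical model predictions

errors = y_true - y_pred
mse = np.mean(errors ** 2)                       # (0.25 + 0 + 0.25 + 0) / 4
mae = np.mean(np.abs(errors))                    # (0.5 + 0 + 0.5 + 0) / 4
ss_res = np.sum(errors ** 2)                     # residual sum of squares
ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total (corrected) sum of squares
r2 = 1.0 - ss_res / ss_tot
```

Here the residual sum of squares is 0.5 and the total sum of squares is 5, so R2 = 0.9, MSE = 0.125 and MAE = 0.25.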
Cross-validation. Since the amount of data in this study is limited, cross-validation (CV) has been applied. There are several cross-validation techniques. In k-fold CV, for example, the whole dataset is split into k parts; the first part is used for testing the performance of the ML model after training it on the other k-1 parts, then the second part is used for testing and the remaining parts for training, and so on k times. In the end we have k different values of the metrics R2, MAE and MSE (eqs. (11) - (13)), one for each fold. So, via cross-validation, we can obtain the mean values of the metrics and their standard deviations. Another cross-validation technique, called random permutation, randomly divides the data into a train and a test set; the data are then shuffled again to obtain a new division into test and train sets. This process is repeated n times, and the metrics are calculated each time. Similarly, in the end, we can evaluate the mean values of the metrics and their standard deviations. So, cross-validation allows us not only to calculate R2 (or MAE and MSE) for the test set but to do it several times with independent splits of the data into test and train sets. Since in our task the amount of data is restricted, cross-validation was used several times (for hyperparameter tuning, for estimation of the model performance and for making predictions to plot predicted results versus real data). Model building and estimation were done in 3 steps:
Hyperparameter tuning was done with an exhaustive grid search (Bergstra and Bengio, 2012). This process searches through given ranges of each hyperparameter and finds the optimal values, which lead to the best R2 (or MAE, MSE etc.) scores. It is implemented in the GridSearchCV() class of the scikit-learn Python library (Pedregosa et al, 2012), which simply calculates the CV score for each combination of hyperparameters in a given range. The random permutation approach with 10 repeats was chosen as the CV iterator. GridSearchCV() not only finds the best hyperparameters but also calculates the metrics at the optimal point. However, 10 repeats are not enough for an accurate final estimate of the mean value and deviation, while more repeats require more computational time. Therefore, the final estimation with known hyperparameters is done in the next step.
Evaluation of the ML model with the optimal hyperparameters (defined in the first step) was also done via CV with the random permutation approach. Here, however, we took 100 repeats, which is enough for a fair result.
To plot predictions versus real values, we applied k-fold CV with k equal to the number of samples (leave-one-out). First, we train the model on all data except one point and predict the value at this point (testing) with the trained model. Next, we remove the second point from the dataset, train the model from scratch and obtain the prediction at that point. This process was repeated for all data points (102 times). It provides predictions for all points, which can be compared with the initial data visually.
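The three steps above map onto scikit-learn as sketched below. This is an illustration under stated assumptions rather than the authors' script: the data are synthetic (102 samples, 10 features), Lasso stands in for any of the models, the hyperparameter grid is hypothetical, and the random permutation iterator is assumed to correspond to `ShuffleSplit`. The 35% test share is the value reported in the results section.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso
from sklearn.model_selection import (GridSearchCV, LeaveOneOut, ShuffleSplit,
                                     cross_val_predict, cross_val_score)

# Synthetic stand-in for the core dataset: 102 samples, 10 features.
X, y = make_regression(n_samples=102, n_features=10, noise=10.0, random_state=0)

# Step 1: exhaustive grid search with 10 random permutations.
alpha_grid = [0.001, 0.01, 0.1, 1.0, 10.0]          # hypothetical search range
tuning_cv = ShuffleSplit(n_splits=10, test_size=0.35, random_state=0)
search = GridSearchCV(Lasso(max_iter=10000), {"alpha": alpha_grid},
                      scoring="r2", cv=tuning_cv).fit(X, y)
best_alpha = search.best_params_["alpha"]

# Step 2: performance estimate with 100 random permutations.
eval_cv = ShuffleSplit(n_splits=100, test_size=0.35, random_state=1)
r2_scores = cross_val_score(Lasso(alpha=best_alpha, max_iter=10000), X, y,
                            scoring="r2", cv=eval_cv)
r2_mean, r2_std = r2_scores.mean(), r2_scores.std()

# Step 3: leave-one-out predictions for the predicted-versus-actual plot.
loo_pred = cross_val_predict(Lasso(alpha=best_alpha, max_iter=10000), X, y,
                             cv=LeaveOneOut())
```

Replacing `scoring="r2"` with `"neg_mean_absolute_error"` or `"neg_mean_squared_error"` gives the MAE and MSE estimates in the same way.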
Some researchers (Choubineh et al, 2019) suggest another way to validate the quality of ML models. The data records are divided into training, validation and testing subsets: the model is trained on the training set, the validation set is used for hyperparameter tuning, and the test set is used for the final model evaluation. Unfortunately, a single random partition of the data into subsets may not be enough for a correct model estimation because of the non-uniformity of the dataset: another random partition will give other values of the metrics (R2, MAE, MSE and others). A single partition of the data is reasonable only for a large dataset.
Normalization of data. Some machine learning methods (SVM and Neural Network) require normalized data to work correctly, so in these cases the data have been normalized using the mean and standard deviation of the training set:
$$\tilde{x} = \frac{x - \mu}{\sigma} \tag{14}$$
where $x$ is the feature vector, $\mu$ is the mean value of the feature vector and $\sigma$ is its standard deviation.
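A minimal sketch of this normalization is given below. The point it illustrates is that the mean and standard deviation are computed on the training set only and then reused for the test set. The random feature matrices and their sizes are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
X_train = rng.normal(loc=5.0, scale=3.0, size=(66, 4))   # hypothetical training features
X_test = rng.normal(loc=5.0, scale=3.0, size=(36, 4))    # hypothetical test features

mu = X_train.mean(axis=0)       # per-feature mean of the training set
sigma = X_train.std(axis=0)     # per-feature standard deviation of the training set

X_train_norm = (X_train - mu) / sigma
X_test_norm = (X_test - mu) / sigma   # test data reuse the training statistics
```

In scikit-learn the same behaviour inside cross-validation can be obtained by placing `StandardScaler` and the estimator in a `Pipeline`, so that the statistics are refitted on each training split.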
3 Results & Discussion
In this section the results of porosity, permeability and salts concentration predictions are presented, analyzed and compared. We made predictions of these reservoir properties on the basis of the features described in section 2.2 with the algorithms described in section 2.3. However, only linear regression with L1 regularization is reported out of the 3 linear regression algorithms (the algorithms are very similar, are implemented within the same library and give very close results, with L1 regularization performing slightly better in our task). Two different libraries were applied for gradient boosting (they are reported separately) because the XGBoost library allows regularization while scikit-learn does not. So, we report results for 7 different algorithms. Each algorithm was tuned on the experimental data to get the best performance by means of the cross-validation procedure.
Since our dataset is not large from the Big Data point of view and contains various measurement errors, the proportion in which it is split in the cross-validation procedure turns out to be important. We found via grid search that, for our small dataset, it is optimal to leave 35% of the data for testing on each cross-validation pass to obtain an estimate of algorithm performance.
3.1 Porosity prediction
The porosity model was the first ML model we built. The material balance equation defines a specific dependence between porosity and salts concentration:
$$\phi = \phi_0 + C_{salt}\, \frac{\rho_0}{\rho_{salt}} \tag{15}$$
where $\phi$ is porosity after desalinization; $\phi_0$ is porosity before desalinization; $C_{salt}$ is the mass salts concentration; $\rho_{salt}$ is salt density; $\rho_0$ is core sample density before desalinization. Equation 15 makes it possible to assess the performance of the ML models. In this case, data-driven algorithms should find hidden correlations between the parameters and demonstrate appropriate predictive abilities. The existence of a physical model provides an opportunity to test the ML tools thoroughly.
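The material balance can be turned into a short numerical check. The sketch assumes that all dissolved salt volume becomes pore volume, so the porosity gain equals the salts mass concentration multiplied by the ratio of the sample density to the salt density. This form is a reconstruction from the listed variables and should be treated as an assumption. The sample values are hypothetical; 2.16 g/cc is the density of crystalline NaCl.

```python
def porosity_after_desalination(porosity_before_pct: float,
                                salt_concentration_pct: float,
                                sample_density: float,
                                salt_density: float = 2.16) -> float:
    """Porosity (%) after all salt is dissolved; densities in g/cc.

    Assumes the removed salt volume is converted entirely into pore volume.
    """
    salt_volume_fraction_pct = salt_concentration_pct * sample_density / salt_density
    return porosity_before_pct + salt_volume_fraction_pct


# Hypothetical sample: 5 % porosity, 10 % salts by mass, bulk density 2.5 g/cc.
phi_after = porosity_after_desalination(5.0, 10.0, 2.5)
```

For this hypothetical sample the porosity rises from 5% to roughly 16.6%, which illustrates why salts concentration is expected to be a dominant feature in the porosity models.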
For the predictive models of porosity we took into account the influence of all 10 characteristics of the rock samples (Table 1). Almost all the surveyed methods (except the decision tree) have demonstrated promising results and high values of the determination coefficient in the case of porosity prediction. Table 2 shows the results of model evaluation via the cross-validation process: mean value and standard deviation of each metric. Here and below we report linear regression only with L1 regularization, because the other implementations (without regularization and with L2 regularization) show very close results, while this algorithm works slightly better. In general, the highest value of the R2 metric corresponds to the lowest values of the MSE and MAE metrics. In the porosity case SVM, Neural network, Gradient boosting and Linear regression have the best scores. The best two models are Support Vector Machines (R2 = 0.855 ± 0.144, Table 2) and Neural network.
|No.||Model||R2 (mean)||R2 (std)||MAE (mean)||MAE (std)||MSE (mean)||MSE (std)|
|1||Linear regression with L1 regularization||0.792||0.116||1.035||0.178||2.361||1.110|
|5||Gradient boosting (XGBoost)||0.782||0.081||1.112||0.186||2.526||0.840|
|6||Support Vector Machines||0.855||0.144||0.816||0.194||1.634||1.472|
Grid search gave an optimal regularization value of 0.001 for the L1 linear regression. In a similar search process, the optimal depth of the decision tree was found to be 7, and the optimal number of estimators (trees) for the random forest was 150 with a maximum depth of each tree of 8. For the Gradient Boosting model the following parameters were selected: 16000 estimators (trees); maximum depth of each tree 2; subsample 0.7 (each tree takes only 70% of the initial data to fit, the next tree takes another random 70%, etc., which helps to prevent overfitting); max-features 0.9 (the concept is similar to subsample, the only difference being that the algorithm takes a part of the features, instead of a part of the samples, to fit each tree); regularization 0.001. The Neural Network architecture was also defined with the help of grid search, since the dataset is small and several architectures can be evaluated quickly. It has 2 hidden layers with 2 and 4 nodes, respectively. The SVM was built with the Gaussian kernel, which has two parameters to tune: C and gamma. The exhaustive search gave an optimal gamma of 0.0001 and an optimal C of 40000.
The performance of the SVR and MLPRegressor models can be demonstrated by plotting the predicted values (obtained via cross-validation) of porosity after ablation versus the actual values (figure 3). One can see that the data points are located along the mean line (bisectrix).
Feature importance provides a score (referred to as the F score) that indicates how useful each feature was in the construction of the boosted decision trees within the model. This metric shows how many times the feature was used to split a tree (Freedman, 2009). The feature importance is then averaged across all of the decision trees within the model.
In figure 4 one can see that porosity and permeability before desalination and salts concentration have the most significant influence on the porosity prediction results, which is in good agreement with the geometric correlation between porosity and salts concentration (equation 15). Core sample density before desalinization is also among the five most influential features in the ML algorithms (Figure 4). These observations demonstrate that the predictive ML models reproduce the same correlations as the physical model. The strong influence of the permeability before desalinization on the porosity after desalinization can be explained by a strong correlation between permeability and porosity (as, for example, in the Kozeny-Carman equation (Carman, 1956)). The next few features in Figure 4 that also significantly influence the porosity increase are the depths of the sample, the formation top and the formation bottom. This fact confirms that the salts are distributed non-uniformly in the formation and that the distribution strongly depends on the geological conditions of the reservoir. The colour features and the horizon types have the lowest influence on the prediction models.
3.2 Permeability prediction
For permeability prediction all the methods also look promising and demonstrate high values of the R2 metric (table 3). In this case Support Vector Machines, Neural network and linear regressions show the best performance. The highest scores were obtained for SVR and for Linear regression.
|1||Linear regression with L1 regularization||0.852||0.074||118||20||40864||16434|
|5||Gradient boosting (XGBoost)||0.809||0.093||137||36||57345||36071|
|6||Support Vector Machines||0.856||0.078||105||20||39957||17906|
As in the previous section, the optimal hyperparameters of the algorithms for permeability prediction were determined via grid search (Bergstra and Bengio, 2012). For the Linear regression model we used L1 regularization with a regularization parameter equal to 10. The decision tree was built with a maximum depth of 10. For the Random Forest algorithm, 25 trees with a maximum depth of 8 were selected. For the Gradient boosting model we obtained the following optimal values: 300 estimators with a maximum depth of 2 for each tree, 80% of the samples and 90% of the features used to fit each tree, and a regularization parameter of 0.1. The Neural network had 2 hidden layers with 77 and 102 nodes, respectively. The SVM used the Gaussian kernel with gamma equal to 0.0001 and C equal to 50000.
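The exhaustive search over hyperparameter combinations can be sketched as follows. This is a minimal illustration under stated assumptions: the scoring function `toy_cv_score` is a hypothetical stand-in for a cross-validated R2 and is not fitted to any data, and the grid values are chosen only so that the toy optimum coincides with the Gradient boosting values quoted above.

```python
from itertools import product

def grid_search(param_grid, score_fn):
    """Evaluate every combination in param_grid and keep the best-scoring one."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

def toy_cv_score(params):
    """Hypothetical stand-in for a cross-validated R2 (not a real model)."""
    depth_penalty = 0.01 * abs(params["max_depth"] - 2)
    size_penalty = 0.0001 * abs(params["n_estimators"] - 300)
    return 1.0 - depth_penalty - size_penalty

param_grid = {"n_estimators": [100, 300, 1000], "max_depth": [2, 4, 8]}
best_params, best_score = grid_search(param_grid, toy_cv_score)
print(best_params, best_score)
```

In practice a library routine such as `GridSearchCV` from Scikit-learn could play the role of `grid_search`, with the score computed by cross-validation on the training data.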
The performance of the SVR and Linear regression models is demonstrated in the mean line plots in figure 5. As before, the prediction was made via cross-validation for all points. The data points are mainly located along the bisectrix, but in general the match between observed and predicted permeability is weaker than in the porosity case.
The XGBoost method was also used to rank the features by their influence on the predictive model. In figure 6 one can see that porosity and permeability before desalination and salts concentration have the greatest influence on the permeability prediction results. This is very similar to the results of feature selection in the porosity model. We also found that the next features that significantly influence the permeability increase are connected with the geological conditions of the reservoir (sample depth, formation top and bottom depths). The colour features and the horizon type are again among the least influential features.
3.3 Salts concentration prediction
The last part of the research is devoted to the prediction of salts concentration. Here the models perform worse, with the R2 metric hardly reaching 0.6. Only a few methods look promising and demonstrate reasonable values of the R2 metric (Table 4). The best algorithms are Neural network, Gradient boosting and Random forest. The Linear regression and Decision tree models are unacceptable, with very small R2 values. R2 for Support Vector Regression reached almost 0.5. The two models with the highest scores are Neural network (the MLPRegressor model of Scikit-learn) and Gradient boosting. In this case the standard deviations of the R2, MSE and MAE metrics are also very high (up to 100%). This could be explained by the non-uniformity of the experimental data.
|1||Linear regression with L1 regularization||0.014||0.385||1.579||0.174||5.747||1.996|
|5||Gradient boosting (XGBoost)||0.565||0.299||0.950||0.189||2.276||1.275|
|6||Support Vector Machines||0.484||0.411||0.951||0.1909||2.608||1.221|
By analogy with porosity and permeability, we determined the optimal hyperparameters of the algorithms via grid search (Bergstra and Bengio, 2012). The regularization parameter for the L1 regression is equal to 1.0. The optimal decision tree has a depth of 9. Random Forest performs better with 10 estimators of depth 1. The Gradient Boosting model runs with 300 estimators of depth 10; 95% of the samples and 50% of the features were used to train each tree, and the regularization parameter is small, equal to 0.00001. The neural network contains 3 layers with 55, 10 and 86 nodes, respectively. The SVM was run with gamma equal to 0.1 and C equal to 25.
The performance of the Gradient boosting and Neural network models is demonstrated in the mean line plots (figure 7). The data points are only partially located along the mean line. Accordingly, the correlation between observed and predicted values is much weaker than in the porosity and permeability cases.
The results of the XGBoost feature ranking are shown in figure 8. As one can see, porosity before desalination has the most substantial influence on the salts concentration prediction results. The next two features affecting the prediction results are sample depth and permeability before desalination. The results of this feature rating differ from those obtained for the porosity and permeability models. We can state that the prediction of porosity and permeability alteration is primarily controlled by the initial values of these properties and the amount of salts in the pore volume. Salts concentration, in turn, strongly depends not only on the initial porosity and permeability but also on the formation pattern characteristics, which are linked with post-sedimentation processes. Therefore, the prediction model attempts to learn from the training dataset where and how strongly these processes are developed in particular reservoir beds of various depth and location in the oilfield (through the formation top and bottom depths).
3.4 Comparison of the predictive models with traditional approaches
All the obtained R2 scores, with their variances, are presented for all algorithms in figure 9. The worst results are associated with the Decision tree method, for which we obtained not only the lowest values of the R2 metric but also the largest standard deviation of R2. Support Vector Machines and Linear regression demonstrate good results only for porosity and permeability prediction and are inappropriate for salts concentration prediction. The best machine learning method for the prediction of all three petrophysical characteristics is the Neural network in the MLPRegressor implementation. This algorithm demonstrates the highest values of the R2 metric and the smallest standard deviation. Gradient boosting and Random forest can also be recommended as effective methods for the prediction of salts concentration and of the permeability and porosity alteration due to salts ablation.
The benefits of using machine learning models to estimate rock properties, in comparison with the standard one-feature approximation, are clear. By standard one-feature approximation we mean the following approach. When the law (or physical model) controlling the correlation between core sample characteristics and the target rock property is unknown, the simplest and fastest way to build a petrophysical model of the property is sequential single-variable function analysis: finding, one variable at a time, the functional dependence of the target rock property on each variable (characteristic of the rock). The best single-variable correlation obtained in this way can be considered the one-feature approximation model. In contrast to this expert approach, ML algorithms allow building multi-feature approximations, which are more relevant to the real correlations between rock properties. Using one-feature approximation analysis, we found that porosity and permeability after ablation have the strongest correlation with salts concentration; the corresponding dependencies are shown in figures 10 and 11. However, this does not mean that the other core sample characteristics are useless. On the contrary, ML algorithms that account for all 10 characteristics should demonstrate better results. In our case, the one-feature approximation is a cubic polynomial that describes the dependency of the porosity (permeability) alteration on salt content.
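To compare the ML methods with the one-feature approximation approach, predictions of porosity and permeability alterations were performed in three ways:
one-feature approximation (cubic polynomial of salt content);
ML model with measured salts concentration;
ML model with salts concentration predicted by another ML model.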
The last approach is a two-step procedure. First, we estimate the salts concentrations with the corresponding prediction model; second, we use these predicted values in the porosity and permeability predictions. This approach is applicable when experimental measurements of salts concentrations are not available.
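The two-step procedure can be sketched as a simple chaining of two models. The sketch below is illustrative only: `toy_salts_model` and `toy_porosity_model` are hypothetical stand-ins with arbitrary linear rules (they are not the fitted models of this study), and the feature names and sample values are assumptions.

```python
def predict_two_step(sample, salts_model, porosity_model):
    """Step 1: estimate salts concentration from the routine features.
    Step 2: add the estimate as an extra feature and predict porosity."""
    salts_hat = salts_model(sample)
    features = dict(sample, salts_concentration=salts_hat)  # copy, input left intact
    return porosity_model(features), salts_hat

def toy_salts_model(sample):
    # Hypothetical rule, used only to make the sketch runnable.
    return 0.5 * sample["porosity_before"]

def toy_porosity_model(features):
    # Hypothetical rule, used only to make the sketch runnable.
    return features["porosity_before"] + 0.8 * features["salts_concentration"]

sample = {"porosity_before": 10.0, "permeability_before": 50.0, "sample_depth": 1800.0}
porosity_after, salts_hat = predict_two_step(sample, toy_salts_model, toy_porosity_model)
print(salts_hat, porosity_after)
```

The same chaining applies to permeability by passing a permeability model as the second argument.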
A more detailed comparison of the machine learning models and the standard one-feature approximation is presented in figures 12 and 13. Here, the experimentally measured porosity (figure 12) and permeability (figure 13) are compared with the values predicted by ML (with and without salt content measurements) and by the one-feature approximation. These plots demonstrate the difference between the three approaches applied to estimate porosity and permeability after desalination. The one-feature approximations were originally obtained from the dependency of the porosity (permeability) alteration on salt content (figures 10, 11). These alterations were then used to obtain the values of porosity and permeability after desalination. Blue triangles in figures 12 and 13 relate to the ML model with known salt content. These points are located near the plot diagonal (the ideal case). Black points relate to the one-feature approximation, and these points are the most distant from the diagonal. Yellow squares are the approximation by ML with salt content preliminarily predicted by ML. These predictions were made using the approach described in section 2.4, with cross-validation as a third step (k-fold cross-validation with k equal to the number of samples). This method makes it possible to compare the approaches to permeability evaluation with each other and to depict them on one plot. It does not evaluate the performance of the ML algorithms well (the evaluation of the methods is given in tables 2, 3 and 4), because we have only one value of R2 (equation 11) and no confidence intervals. From figures 12 and 13 one can see that the machine learning models work better.
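Cross-validation with k equal to the number of samples (leave-one-out) can be sketched as follows. To keep the example self-contained, a one-feature least-squares line is used as the model; this is an assumption made for illustration and not one of the models of this study, and the toy data are invented.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = a * x + b for a single feature."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x

def leave_one_out_predictions(xs, ys):
    """Predict every sample with a model fitted on all the other samples."""
    preds = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        a, b = fit_line(train_x, train_y)
        preds.append(a * xs[i] + b)
    return preds

# Toy data (invented values).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
preds = leave_one_out_predictions(xs, ys)
for observed, predicted in zip(ys, preds):
    print(observed, round(predicted, 3))
```

Each sample receives exactly one out-of-sample prediction, which is why all points can be shown on a single observed-versus-predicted plot, but only one overall R2 value is available.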
A quantitative comparison of the models is presented in Table 5. The negative value of R2 for the one-feature approximation of permeability was obtained because of several points which, after recalculation (from normalized permeability to absolute values), lie very far from the experimental data and add a huge error to the R2 calculation (equation 11). One could remove these points and obtain a much better estimate, but the ML models work with satisfactory accuracy at these points, so we have left the results as they are to demonstrate the superiority of ML over the old method. Table 5 confirms that simple polynomial regression, which takes into account only one feature at a time, does not work as well as machine learning models that consider many different features. We can also see that restricting the dataset (the case without salts concentration measurements) does not strongly affect the prediction quality. This makes it possible to predict porosity and permeability alterations using only formation and core sample depths, initial porosity and permeability, rock density and lithology description. Feature ranking for the salts concentration, permeability alteration and porosity alteration models with the XGBoost Python package demonstrates that sample colour and horizon have a feeble influence on the predictive models and could be excluded from the feature list in further applications.
|No.||Metric||ML (salts is known)||ML (salts is unknown)||one-feature approx.|
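How a few distant points can drive R2 below zero can be checked with a short sketch. It assumes the standard definition of the coefficient of determination (one minus the ratio of the residual to the total sum of squares), which is taken here to correspond to equation 11; the function name `r2` and the numbers are toy values, not data from this study.

```python
def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (standard definition)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

y_true = [10.0, 20.0, 30.0, 40.0]
close_fit = [11.0, 19.0, 31.0, 39.0]      # every prediction is off by 1
with_outlier = [11.0, 19.0, 31.0, 140.0]  # one prediction is off by 100

print(r2(y_true, close_fit))
print(r2(y_true, with_outlier))
```

A single badly predicted point makes the residual sum of squares larger than the total sum of squares, so the score becomes negative even though the remaining predictions are unchanged.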
In this paper the applicability of various Machine Learning algorithms for the prediction of some rock properties was tested. We demonstrated that three special properties of salted reservoirs of the Chayandinskoye field can be predicted based only on routine core analysis data. The target properties were:
alteration of open porosity,
alteration of absolute permeability,
salts mass concentration.
After core desalination, permeability can increase up to 60 times and porosity up to 2.5 times. These characteristics are usually outside the RCA scope because their measurement is time-consuming and performed only occasionally. When this type of data is lacking, predictive models are very useful for reservoir development planning. Porosity and permeability before desalination, sample density, and lithology and texture description are the RCA input data for our predictive models.
To build relevant predictive models, a dataset with the results of 100+ laboratory experiments was formed. The main 9 features were: formation top depth, formation bottom depth, initial (before desalination) porosity and permeability, sample depth adjusted to log depth, sample density (before desalination), average grain size (by lithology and texture description), sample colour and horizon ID. These features were used to build the salts concentration predictive model. For the porosity and permeability alteration prediction we additionally used a 10th feature, the salts concentration. From a technical point of view, it does not matter whether these concentrations are measured or predicted with another ML model. We reported 7 algorithms:
Linear regression with L1 regularization;
Support vector machine;
Artificial neural network.
The two best algorithms for porosity and permeability alteration prediction were Support Vector Machines and Neural network. For permeability, the Linear regression with regularization also showed good results. The best models demonstrate a determination coefficient R2 of 0.85+ for porosity and permeability. The high precision of the developed models should help to decrease geological uncertainties in the modelling of salted reservoirs. It was shown that porosity and permeability before water intrusion, along with the matrix density, sample depth and salts content, are the features with the greatest influence on permeability and porosity alteration.
The predictive model of salts concentration has been developed using the results of routine core analysis and data on core depth and the top and bottom depths of productive horizons. The best algorithms here were Gradient boosting and Neural network. The highest coefficient of determination R2 for salts concentration in rocks equals 0.66. The precision of the salts model is lower than the precision of the porosity and permeability models. Nevertheless, the developed models make it possible to estimate the salts content in rocks without special experiments.
Combining all three models, it is also possible to make precise predictions of porosity and permeability alterations using only a minimal volume of routine core analysis data: formation and core sample depths, initial porosity and permeability, rock density and lithology description. Accordingly, with these instruments geoscientists and reservoir engineers can estimate the porosity and permeability alteration at waterflooding conditions having RCA measurements only.
It was shown that different algorithms work better in different models. However, the best machine learning method for the prediction of all three parameters was the two-hidden-layer Neural network in the MLPRegressor implementation. This algorithm gave the highest values of the R2 metric and the smallest standard deviation. Gradient boosting and Random forest can also be recommended as alternative methods for predictions, but with lower precision.
Finally, this work showed that machine learning methods can be applied to the prediction of rock properties whose laboratory measurements are time-consuming and expensive.
- Andersen M. A. (2013) Andersen M A MR Duncan B (2013) Core truth in formation evaluation. Oilfield review 82(2):16–25
- Bergstra and Bengio (2012) Bergstra J, Bengio Y (2012) Random search for hyper-parameter optimization. Journal of Machine Learning Research 13(Feb):281–305
- Boyd and Vandenberghe (2004) Boyd S, Vandenberghe L (2004) Convex Optimization. Cambridge University Press
- Breiman (2001) Breiman L (2001) Random forests. Machine Learning 45(1):5–32
- Breiman (2017) Breiman L (2017) Classification and regression trees. Routledge
- Carman (1956) Carman PC (1956) Flow of gases through porous media. Academic press
- Chen and Guestrin (2016) Chen T, Guestrin C (2016) Xgboost: A scalable tree boosting system. In: Proceedings of the 22nd acm sigkdd international conference on knowledge discovery and data mining, ACM, pp 785–794
- Choubineh et al (2019) Choubineh A, Helalizadeh A, Wood DA (2019) Estimation of minimum miscibility pressure of varied gas compositions and reservoir crude oil over a wide range of conditions using an artificial neural network model. Advances in Geo-Energy Research 3(1):52–66
- Cortes and Vapnik (1995) Cortes C, Vapnik V (1995) Support vector networks 20:273–297
- Crammer and Singer (2001) Crammer K, Singer Y (2001) On the algorithmic implementation of multiclass kernel-based vector machines. Journal of machine learning research 2(Dec):265–292
- Dandekar (2006) Dandekar A (2006) Petroleum Reservoir Rock and Fluid Properties. Taylor & Francis
- Freedman (2009) Freedman DA (2009) Statistical models: theory and practice. cambridge university press
- Friedman (2000) Friedman J (2000) Greedy function approximation: A gradient boosting machine 29
- Gaafar et al (2015) Gaafar G, Tewari R, Md Zain Z (2015) Overview of advancement in core analysis and its importance in reservoir characterisation for maximising recovery
- Hastie et al (2001) Hastie T, Tibshirani R, Friedman J (2001) The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer series in statistics, Springer
- Haykin (1994) Haykin S (1994) Neural networks: a comprehensive foundation. Prentice Hall PTR
- Ho (1995) Ho TK (1995) Random decision forests. In: Document analysis and recognition, 1995., proceedings of the third international conference on, IEEE, vol 1, pp 278–282
- Hsieh et al (2008) Hsieh CJ, Chang KW, Lin CJ, Keerthi SS, Sundararajan S (2008) A dual coordinate descent method for large-scale linear svm. In: Proceedings of the 25th international conference on Machine learning, ACM, pp 408–415
- Kotsiantis et al (2007) Kotsiantis SB, Zaharakis I, Pintelas P (2007) Supervised machine learning: A review of classification techniques. Emerging artificial intelligence applications in computer engineering 160:3–24
- Liu et al (2017) Liu Z, Herring A, Arns C, Berg S, Armstrong RT (2017) Pore-scale characterization of two-phase flow using integral geometry. Transport in Porous Media 118(1):99–117
- Mahzari et al (2018) Mahzari P, AlMesmari A, Sohrabi M (2018) Co-history matching: A way forward for estimating representative saturation functions. Transport in Porous Media 125(3):483–501
- McCulloch and Pitts (1943) McCulloch WS, Pitts W (1943) A logical calculus of the ideas immanent in nervous activity. The bulletin of mathematical biophysics 5(4):115–133
- McPhee et al (2015) McPhee C, Reed J, Zubizarreta I (2015) Core Analysis: A Best Practice Guide. Developments in Petroleum Science, Elsevier Science
- Meshalkin et al (2018) Meshalkin Y, Koroteev D, Popov E, Chekhonin E, Popov Y (2018) Robotized petrophysics: Machine learning and thermal profiling for automated mapping of lithotypes in unconventionals. Journal of Petroleum Science and Engineering
- Monicard (1980) Monicard RP (1980) Properties of Reservoir Rocks: Core Analysis, vol 5. Editions Technip
- Orlov et al (2018) Orlov D, Koroteev D, Sitnikov A (2018) Self-colmatation in terrigenic oil reservoirs of eastern siberia. Journal of Petroleum Science and Engineering
- Ottesen and Hjelmeland (2008) Ottesen B, Hjelmeland O (2008) The value added from proper core analysis. In: International Symposium of the Society of Core Analysts, Weatherfordlabs
- Pedregosa et al (2012) Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, Blondel M, Prettenhofer P, Weiss R, Dubourg V, Vanderplas J, Passos A, Cournapeau D, Brucher M, Perrot M, Duchesnay E, Louppe G (2012) Scikit-learn: Machine learning in python 12
- Quinlan (1986) Quinlan JR (1986) Induction of decision trees. Machine Learning 1(1):81–106
- RP40 (1998) RP40 A (1998) Recommended practices for core analysis. Washington, DC: API
- Rumelhart et al (1986) Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. nature 323(6088):533
- Ryzhov et al (2014) Ryzhov AE, Grigoriev BA, Orlov DM (2014) Improving fluid filtration to saline reservoir rocks. In: Book of abstracts of International Gas Union Research Conference (IGRC-2014)
- Schmidhuber (2014) Schmidhuber J (2014) Deep learning in neural networks: An overview 61
- Soulaine and Tchelepi (2016) Soulaine C, Tchelepi HA (2016) Micro-continuum approach for pore-scale simulation of subsurface processes. Transport in Porous Media 113(3):431–456
- Stewart (2011) Stewart G (2011) Well Test Design & Analysis. PennWell
- Tahmasebi et al (2018) Tahmasebi P, Sahimi M, Shirangi MG (2018) Rapid learning-based and geologically consistent history matching. Transport in Porous Media 122(2):279–304
- Tibshirani (2011) Tibshirani R (2011) Regression shrinkage selection via the lasso 73:273–282
- Unsal et al (2005) Unsal E, Dane J, Dozier GV (2005) A genetic algorithm for predicting pore geometry based on air permeability measurements. Vadose Zone Journal 4(2):389–397