Abstract

In recent years, long short-term memory (LSTM) networks have made significant contributions in various fields, and combining intelligent optimization algorithms with LSTM is one of the most effective ways to remedy model shortcomings and increase classification accuracy. Reservoir identification is a key and difficult step in well logging, so using LSTM to identify reservoirs is of great importance. To improve the logging reservoir identification accuracy of LSTM, an improved equilibrium optimizer algorithm (TAFEO) is proposed in this paper to optimize the number of neurons and the training parameters of LSTM. The TAFEO algorithm employs tent chaotic mapping to enhance population diversity, introduces a convergence factor to better balance local and global search, and adopts a premature-perturbation strategy to overcome the tendency to stall at local minima. The optimization performance of TAFEO is evaluated on 16 benchmark test functions, with a Wilcoxon rank-sum test applied to the optimization results. The improved algorithm is superior to many intelligent optimization algorithms in accuracy and convergence speed and shows good robustness. The receiver operating characteristic (ROC) curve is used to evaluate the performance of the optimized LSTM model. Simulations and comparisons on UCI datasets show that the performance of the TAFEO-based LSTM model is significantly improved, with a maximum area under the ROC curve of 99.43%. In practical logging applications, the LSTM based on the improved equilibrium optimizer is effective in well-logging reservoir identification, reaching a peak recognition accuracy of 95.01%, better than other existing identification methods.

1. Introduction

With the development of logging technology, the associated interpretation techniques are gradually moving from qualitative, manual processing to quantitative processing by machine. Traditional reservoir identification relies mainly on expert experience, the construction of crossplots, and similar methods, which are subject to many human factors. An increasing number of scholars therefore propose artificial neural networks to solve the reservoir identification problem, effectively avoiding such errors and improving production efficiency [1–3]; such studies also provide a basis for reservoir quality and oil-bearing evaluation of, for example, terrestrial shale reservoirs. However, these methods ignore the time-series nature of logging data and do not conform to practical geological thinking or the logic of traditional geological analysis.

Reservoir data are temporal in nature, with strong forward and backward correlation, so long short-term memory networks are considered for processing reservoir data. Long short-term memory (LSTM) is a special type of recurrent neural network (RNN) that alleviates the gradient explosion and gradient vanishing problems that arise when training on long sequences, and it therefore performs better on long sequences. Some scholars already use LSTM to identify reservoirs. For example, Zhou et al. [4] established a Bi-LSTM network model that can accurately identify the different types of strata developed in storage space and significantly improve the accuracy of reservoir identification. Chen et al. [5] constructed a multilayer LSTM for fine reservoir parameter prediction; the results show that the multilayer LSTM has better robustness and accuracy in prediction.

In recent years, LSTM has made breakthroughs in many fields, and combining intelligent optimization algorithms with LSTM is one of the most effective ways to remedy model shortcomings and increase classification accuracy. For example, Xie et al. [6] used an enhanced grey wolf optimization (GWO) algorithm for a CNN-LSTM time-series prediction model and showed that classification accuracy was improved. Peng et al. [7] applied the fruit fly optimization algorithm (FOA) to tune the hyperparameters of an LSTM neural network, and the results showed that the prediction accuracy of the FOA-LSTM model was greatly improved. Yang et al. [8] built an improved lion swarm optimization (LSO) algorithm to tune the hyperparameters of an LSTM model, and the results showed that the enhanced model has strong generalization ability and higher prediction accuracy. When the LSTM model is used for reservoir identification, its parameters must be selected manually, which limits identification accuracy. This paper therefore proposes a strategy that uses an improved equilibrium optimizer to select the number of neurons and the gradient-descent hyperparameters of the LSTM model.

With the rapid development of intelligent algorithms, more and more of them are available. For example, in 2017, Mohamed et al. [9] proposed the moth swarm algorithm; Yang et al. [10] proposed the hunger games search algorithm in 2021; and in 2022, Ahmadianfar et al. [11] proposed the weighted-mean-of-vectors (INFO) algorithm. The equilibrium optimizer (EO), proposed in 2020, is a new optimization algorithm inspired by the physics of control-volume mass balance and is characterized by strong search capability and few parameters [12], but in practice it still tends to fall into local optima and to converge slowly. It is therefore necessary to improve the equilibrium optimizer to ensure its stability and effectiveness. Wang et al. [13] used backpropagation neural networks to predict additional output data, achieving more efficient optimization and more reasonable fitness functions; Fan et al. [14] proposed a definition of certain particle concentrations based on opposition-based learning (OBL), a new nonlinear time-control strategy, a novel population update, and a chaos-based strategy; Fu et al. [15] combined strategies from the modal algorithm and fused EO with thermal exchange optimization (TEO) to obtain a new hybrid equilibrium optimizer (HEO); and Gupta et al. [16] used Gaussian mutation and an additional exploratory search mechanism based on population partitioning and reconstruction to improve convergence speed and obtain more accurate solutions. Although these methods achieved good results, the EO algorithm still needs improvement in convergence speed and accuracy.

This paper proposes the TAFEO algorithm to increase convergence speed, improve convergence accuracy, and avoid falling into local optima. To verify the effectiveness of the improved algorithm, an LSTM model based on the improved equilibrium optimizer is then constructed and applied to well-logging reservoir identification, where it achieves desirable practical results.

2. Materials and Methods

2.1. The Equilibrium Optimizer Algorithm and Its Improvements
2.1.1. Equilibrium Optimizer Algorithm

The equilibrium optimizer (EO) is a physics-inspired optimization algorithm based on dynamic mass balance in a well-mixed control volume. The mass balance equation embodies the physical processes of mass entering, leaving, and being generated in the control volume and is generally described by a first-order differential equation, as shown in equation (1):

$$V \frac{dC}{dt} = Q C_{eq} - Q C + G \quad (1)$$

where $V$ is the control volume; $C$ is the concentration in the control volume; $Q$ is the volumetric flow rate into or out of the control volume; $C_{eq}$ is the concentration within the control volume in the absence of mass generation (i.e., at equilibrium); and $G$ is the mass generation rate within the control volume.

Solving the differential equation described by equation (1) yields

$$C = C_{eq} + (C_0 - C_{eq})\, F, \quad F = e^{-\lambda (t - t_0)} \quad (2)$$

where $F$ is the exponential term factor, $\lambda = Q/V$ is the flow (turnover) rate, and $C_0$ is the initial concentration of the control volume at time $t_0$.
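As a quick numerical illustration of equation (2), the following MATLAB sketch evaluates the concentration trajectory for illustrative values of $C_{eq}$, $C_0$, $\lambda$, and $t_0$ (none of which are specified in the source):

    % Sketch: evaluating the mass-balance solution of equation (2).
    % All values are illustrative, not from the paper.
    Ceq    = 1.0;                 % equilibrium concentration
    C0     = 0.2;                 % initial concentration at time t0
    lambda = 0.5;                 % turnover rate, lambda = Q/V
    t0     = 0;
    t      = linspace(0, 10, 100);
    F = exp(-lambda .* (t - t0)); % exponential term factor
    C = Ceq + (C0 - Ceq) .* F;    % concentration approaches Ceq as t grows
    plot(t, C); xlabel('t'); ylabel('C');

The exponential decay of $F$ toward zero is what drives each candidate solution toward the current equilibrium state during the search.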

2.2. Improvement of the Equilibrium Optimizer Algorithm

At present, the equilibrium optimizer performs well among intelligent algorithms and has been applied to specific problems with significant effect, but it still suffers from slow convergence, insufficient initial population diversity, and a tendency to fall into local extrema. This paper proposes three strategies to improve the equilibrium optimizer: tent chaotic mapping to enhance population diversity; a convergence factor to accelerate convergence in the early stage while preserving local search ability in the late stage; and a premature-perturbation strategy to strengthen the algorithm's ability to jump out of local optima.

2.2.1. Improvement Strategies

(1) Tent Chaotic Mapping. The traversal of the tent map is uniform and random, which helps the algorithm escape from local optimal solutions, maintaining population diversity while improving global search ability. Therefore, to obtain good initial solution positions with greater probability and to speed up population convergence, this paper adopts tent chaotic mapping, which has good traversal uniformity and fast iteration, to improve the coverage of the initial solutions. It is calculated as shown in equation (4):

$$x_{i+1} = \begin{cases} x_i / \alpha, & 0 \le x_i < \alpha \\ (1 - x_i)/(1 - \alpha), & \alpha \le x_i \le 1 \end{cases} \quad (4)$$

where $x_i \in [0, 1]$ and the chaos parameter $\alpha \in (0, 1)$ (commonly $\alpha = 0.5$).
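A minimal MATLAB sketch of tent-map initialization follows, assuming $\alpha = 0.5$ and illustrative bounds; the function name and interface are our own (save as tentInit.m):

    % Sketch: tent chaotic initialization of an N-by-dim population.
    % lb/ub are 1-by-dim bound vectors; alpha is the chaos parameter.
    function X = tentInit(N, dim, lb, ub, alpha)
        x = rand(1, dim);                 % random chaotic seed in (0, 1)
        X = zeros(N, dim);
        for i = 1:N
            idx = x < alpha;
            x(idx)  = x(idx) ./ alpha;               % tent map, left branch
            x(~idx) = (1 - x(~idx)) ./ (1 - alpha);  % tent map, right branch
            X(i, :) = lb + x .* (ub - lb);   % map chaotic value into search space
        end
    end

For example, tentInit(30, 10, -100*ones(1,10), 100*ones(1,10), 0.5) produces a 30-individual population whose points are spread more uniformly over the search space than plain uniform sampling of a single seed trajectory.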

(2) Convergence Factor. The exponential term coefficient F, which balances local and global search in the equilibrium optimizer, uses constant coefficient weights, so the resulting coefficients tend to change at a constant rate, which does not match the nonlinear search behavior required over the course of the iterations. To address the slow convergence and low precision of the EO algorithm, a nonlinearly decreasing strategy is proposed to balance the local and global search capabilities: the algorithm takes large enough steps to search spatially dispersed populations in the early iterations and reduces the step size to facilitate local search in the late iterations. The convergence factor A is defined as shown in equation (5), where $t$ is the current iteration number and $T$ is the maximum number of iterations.

(3) Premature-Perturbation Strategy. To address the equilibrium optimizer's tendency to fall into local optima, a condition is set to determine whether the particles have stagnated at a local optimum, and the particle positions are updated when that condition is satisfied. This premature-perturbation strategy is given in equations (6) and (7): when the condition in equation (6) is satisfied, the particle positions are reset so that they are randomly distributed around the global best solution (gbest), thereby jumping out of the local optimum, where $f(\text{gbest})$ is the fitness value corresponding to the global optimum of the current generation and $r$ is a random number in $[-1, 1]$.
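As a hedged illustration of this reset, the MATLAB fragment below scatters particles around gbest with a random factor in [−1, 1]. The stagnation counter (stallCount/stallMax) is a stand-in for the condition in equation (6), which is not reproduced here, and the multiplicative reset form is an assumption for illustration:

    % Hedged sketch of the premature-perturbation idea; all names illustrative.
    N = 30; dim = 10; stallMax = 20;
    X = rand(N, dim);             % current population (placeholder)
    gbest = rand(1, dim);         % current global best (placeholder)
    stallCount = stallMax;        % pretend stagnation has just been detected
    if stallCount >= stallMax
        r = 2 * rand(N, dim) - 1;               % random factors in [-1, 1]
        X = repmat(gbest, N, 1) .* (1 + r);     % redistribute around gbest
        stallCount = 0;                         % reset the stagnation counter
    end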

2.2.2. The Improved Algorithm

Based on the above three improvement strategies, the specific procedure and parameters of the improved algorithm are designed as follows.

(1) Initialization. The algorithm initializes randomly within the upper and lower bounds of each optimization variable:

$$\vec{C}_i^{\,initial} = \vec{C}_{min} + \vec{r}_i\,(\vec{C}_{max} - \vec{C}_{min}), \quad i = 1, 2, \ldots, N$$

where $\vec{C}_{min}$ and $\vec{C}_{max}$ are the lower and upper bound vectors of the optimization variables, respectively, and $\vec{r}_i$ is a vector of random numbers for individual $i$ whose dimension equals that of the optimization space, with each element drawn uniformly from 0 to 1. The initial solutions are then regenerated with the tent chaotic map of equation (4).

(2) Equilibrium State Pool. To improve the global search capability of the algorithm and avoid falling into low-quality local optima, the equilibrium state (i.e., the optimal individual) in equation (9) is selected from the five currently best candidate solutions, which constitute the equilibrium state pool:

$$\vec{C}_{eq,pool} = \{\vec{C}_{eq(1)}, \vec{C}_{eq(2)}, \vec{C}_{eq(3)}, \vec{C}_{eq(4)}, \vec{C}_{eq(ave)}\}$$

where $\vec{C}_{eq(1)}, \ldots, \vec{C}_{eq(4)}$ are the four best solutions found up to the current iteration and $\vec{C}_{eq(ave)}$ is their average. Each of the five candidates is chosen with the same probability of 0.2.

(3) Exponential Term Coefficient. To better balance the local and global search of the algorithm, equation (3) is improved as follows:

$$\vec{F} = a_1 \, \mathrm{sign}(\vec{r} - 0.5)\left(e^{-\vec{\lambda} t} - 1\right)$$

where $a_1$ is the constant weight coefficient of the global search; $\mathrm{sign}(\cdot)$ is the sign function; $\vec{r}$ and $\vec{\lambda}$ are vectors of random numbers whose dimensions equal that of the optimization space, with each element drawn uniformly from 0 to 1; and $t$ decreases nonlinearly with the iteration count as in the original EO algorithm. The constant weight $a_1$ is then replaced by the convergence factor A of equation (5).

(4) Mass Generation Rate. To enhance the local search capability of the algorithm, the generation rate is designed as

$$\vec{G} = \overrightarrow{GCP}\,(\vec{C}_{eq} - \vec{\lambda} \odot \vec{C})\,\vec{F}, \quad \overrightarrow{GCP} = \begin{cases} 0.5\, \vec{r}_1, & r_2 \ge GP \\ 0, & r_2 < GP \end{cases}$$

where $\overrightarrow{GCP}$ is the vector of generation rate control parameters; $\vec{r}_1$ is a vector of random numbers whose dimension equals that of the optimization space, with each element drawn uniformly from 0 to 1; and $r_2$ is a random number from 0 to 1.

(5) Solution Update. For the optimization problem, the individual solutions are updated as

$$\vec{C} = \vec{C}_{eq} + (\vec{C} - \vec{C}_{eq}) \odot \vec{F} + \frac{\vec{G}}{\vec{\lambda} V} \odot (1 - \vec{F})$$

Equation (6) is then used to determine whether the current solution has stagnated at a local optimum; if it has, equation (7) resets the particle positions.

In summary, the improved algorithm incorporating the three strategies is named TAFEO, and the specific steps of the TAFEO algorithm are shown in Table 1.
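To make the flow of Table 1 concrete, the following MATLAB skeleton sketches one possible implementation of the TAFEO main loop. It is a sketch under stated assumptions: the objective, bounds, and the nonlinear convergence-factor form A = 2(1 − it/T)^2 are illustrative (the paper's equation (5) is not reproduced here), the constants GP and a2 follow the original EO paper, and the tent initialization and perturbation step would slot in where noted:

    % Skeleton of the TAFEO search loop (sketch; standard EO update plus the
    % three strategies above, with assumed constants where unspecified).
    N = 30; dim = 10; T = 1000; V = 1; GP = 0.5; a2 = 1;
    lb = -100 * ones(1, dim); ub = 100 * ones(1, dim);
    fobj = @(x) sum(x.^2);                  % example objective (sphere)
    X = lb + rand(N, dim) .* (ub - lb);     % tent initialization would go here
    for it = 1:T
        fit = arrayfun(@(i) fobj(X(i, :)), (1:N)');
        [~, order] = sort(fit);             % ascending: minimization
        Ceq  = X(order(1:4), :);            % four best solutions
        pool = [Ceq; mean(Ceq, 1)];         % equilibrium pool (five candidates)
        A = 2 * (1 - it / T)^2;             % assumed convergence-factor form
        t = (1 - it / T)^(a2 * it / T);     % iteration-dependent time (EO paper)
        for i = 1:N
            Ce  = pool(randi(5), :);        % pick a pool member, prob. 0.2 each
            lam = rand(1, dim); r = rand(1, dim);
            F   = A * sign(r - 0.5) .* (exp(-lam * t) - 1);
            GCP = 0.5 * rand() * (rand() >= GP);     % generation-rate control
            G   = GCP * (Ce - lam .* X(i, :)) .* F;
            X(i, :) = Ce + (X(i, :) - Ce) .* F + (G ./ (lam * V)) .* (1 - F);
            X(i, :) = min(max(X(i, :), lb), ub);     % keep within bounds
            % premature-perturbation reset around gbest would be checked here
        end
    end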

2.3. Simulation Experiments and Analysis of Results

The computer configuration used for the simulation experiments was an Intel Core i7 6700HQ with 3.6 GHz main frequency, 16 GB of memory, and a 64-bit operating system, with MATLAB R2020b as the computing environment. In the following experiments, the number of evaluations was set to M = N × T = 30,000: the population size of all intelligent optimization algorithms was set to N = 30 and the maximum number of iterations to T = 1000. The internal parameter settings of each basic algorithm are shown in Table 2.

To verify the effectiveness and generality of the TAFEO algorithm, 16 international standard test functions are used: F1–F10 [17] are chosen from the common benchmark functions, and F11–F16 are from CEC2017 [18]. The details of the test functions are shown in Table 3.

2.3.1. Performance Comparison of Various Improved EO Algorithms

To verify the effectiveness of the improved TAFEO algorithm, its search iterations were compared with those of the basic EO algorithm, the m-EO algorithm [16], and the MDSGEO algorithm [18] on the test functions. m-EO [16] is an improved equilibrium optimizer using Gaussian mutation and population partitioning and reconstruction, and MDSGEO [18] is an enhanced equilibrium optimizer using a sinusoidal pooling strategy and an adaptive preferential gravity strategy. The sixteen benchmark test functions of Table 3 were used to test the four algorithms. The iteration curves of the simulated optimization runs are shown in Figure 1.

As shown in Figure 1, the improved TAFEO algorithm performs well on all test functions: its convergence accuracy is better than that of all the other algorithms, and its convergence speed is faster on all test functions except F8, F12, and F14. This comparative analysis shows that the improvement strategy of this paper is feasible.

Table 4 shows the results of the different algorithms. To demonstrate the repeatability of each algorithm, the optimal solutions and standard deviations in Table 4 are computed from 30 optimization runs of each algorithm on the sixteen benchmark test functions.

As can be seen from Table 4, the convergence accuracy of the improved TAFEO algorithm is significantly better than that of the original EO algorithm and the two improved EO algorithms on the sixteen benchmark functions. The comparative analysis proves that the enhanced TAFEO algorithm of this paper is effective.

2.3.2. Performance Comparison of the Improved EO with Various Intelligent Algorithms

The TAFEO algorithm was compared with the whale optimization algorithm (WOA) [19], the marine predators algorithm (MPA) [20], the grey wolf optimizer (GWO) [21], and the EO algorithm to verify its performance.

To observe the performance of the five algorithms more clearly, sixteen benchmark test functions were chosen for the iterative graph of the simulation search, and the results are shown in Figure 2.

In Figure 2, the improved TAFEO has the highest convergence accuracy on all test functions except F12 and F16, and its convergence speed is faster than the other algorithms on all test functions. Table 5 shows the results of the different algorithms.

To demonstrate the repeatability of each algorithm, the optimal solutions and standard deviations in Table 5 are computed from 30 optimization runs of each algorithm on the sixteen benchmark test functions.

As can be seen in Table 5, TAFEO outperformed the other four algorithms in both search accuracy and stability on the benchmark function tests. This verifies that the convergence accuracy and convergence speed of TAFEO are higher than those of the other four algorithms.

In summary, the improvements to the equilibrium optimizer algorithm in this paper are highly effective and TAFEO outperforms several other intelligent algorithms tested for the benchmark functions.

2.3.3. Wilcoxon Rank-Sum Test

To examine the statistical difference between the TAFEO algorithm and the other algorithms, the Wilcoxon rank-sum test [22] was applied to the results. TAFEO is compared with the EO, WOA, MPA, GWO, m-EO, and MDSGEO algorithms on the 16 test functions of Table 3 to verify its statistical superiority. If the p-value is greater than 0.05, or is NaN, TAFEO is not statistically significantly different on that function. The Wilcoxon rank-sum test results are shown in Table 6.
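For reference, a minimal MATLAB sketch of this test using the built-in ranksum function (Statistics and Machine Learning Toolbox); the result vectors here are placeholders for the recorded best-fitness values of 30 runs on one function:

    % Sketch: Wilcoxon rank-sum test between two algorithms on one function.
    resTAFEO = rand(30, 1) * 1e-8;   % placeholder: 30 runs of TAFEO
    resEO    = rand(30, 1) * 1e-4;   % placeholder: 30 runs of EO
    [p, h] = ranksum(resTAFEO, resEO);
    if isnan(p) || p > 0.05
        disp('No statistically significant difference at the 5% level.');
    else
        disp('Statistically significant difference at the 5% level.');
    end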

The bold entries in Table 6 indicate p-values greater than 0.05 or NaN. From Table 6, the result of the MDSGEO test on F1 is NaN because both TAFEO and MDSGEO reach the theoretical optimum, so there is no statistical difference. There is no significant difference between TAFEO and m-EO on F7; between TAFEO and MPA on F7 and F15; or between TAFEO and GWO on F12 and F16.

In all other cases, there are statistical differences between TAFEO and the other algorithms in the Wilcoxon rank-sum test, which shows that the TAFEO algorithm has a clear statistical advantage in the benchmark function results and verifies the robustness of the algorithm.

3. Improvements to the LSTM

3.1. LSTM Principle

The long short-term memory (LSTM) network is a recurrent neural network that models the dependencies between observations in a time series and is therefore commonly used for forecasting. Because logging attribute data are time-series data, this paper uses LSTM as the classification model for reservoir identification. The cell structure of a basic LSTM neural network is shown in Figure 3.

An LSTM cell includes a forget gate $f_t$, an input gate $i_t$, and an output gate $o_t$, which protect and control the cell state. The forget gate determines which information is discarded from the cell state, the input gate determines which new information is stored in the cell state and how the state is updated, and the output gate determines the value passed on to the next LSTM cell.

The equations for the variables of the LSTM network are

$$f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f)$$
$$i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i)$$
$$\tilde{C}_t = \tanh(W_C \cdot [h_{t-1}, x_t] + b_C)$$
$$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t$$
$$o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o)$$
$$h_t = o_t \odot \tanh(C_t)$$

where $W_f$, $W_i$, $W_C$, and $W_o$ are weight matrices; $b_f$, $b_i$, $b_C$, and $b_o$ are bias vectors; $\sigma$ is the sigmoid activation function, taking values in [0, 1]; and tanh is the hyperbolic tangent activation function, taking values in [−1, 1].
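To make the gate equations concrete, the following MATLAB sketch evaluates a single LSTM cell step with random illustrative weights (the sizes d and h, and all variable names, are our own):

    % Sketch of one LSTM cell step implementing the gate equations above.
    d = 5; h = 8;                                % input size, hidden size
    x_t = randn(d, 1); h_prev = randn(h, 1); c_prev = randn(h, 1);
    W_f = randn(h, h + d); b_f = zeros(h, 1);    % forget-gate parameters
    W_i = randn(h, h + d); b_i = zeros(h, 1);    % input-gate parameters
    W_c = randn(h, h + d); b_c = zeros(h, 1);    % candidate-state parameters
    W_o = randn(h, h + d); b_o = zeros(h, 1);    % output-gate parameters
    sigmoid = @(z) 1 ./ (1 + exp(-z));
    z   = [h_prev; x_t];                         % concatenated [h_{t-1}, x_t]
    f_t = sigmoid(W_f * z + b_f);                % forget gate
    i_t = sigmoid(W_i * z + b_i);                % input gate
    c_tilde = tanh(W_c * z + b_c);               % candidate cell state
    c_t = f_t .* c_prev + i_t .* c_tilde;        % updated cell state
    o_t = sigmoid(W_o * z + b_o);                % output gate
    h_t = o_t .* tanh(c_t);                      % hidden state passed onward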

3.2. Research on the TAFEO-LSTM Model
3.2.1. Improvement Strategies

In the basic LSTM neural network, the number of neurons in the hidden layer is usually chosen randomly or empirically, which leads to low classification accuracy and unstable classification performance. When the batch size (Batchsize) is too large, memory consumption increases and the gradient-descent direction barely changes, so the model easily falls into a local optimum and accuracy drops. The number of training epochs (Maxepoch) determines whether the model fits properly: when Maxepoch is too small, the model underfits, and when it is too large, the model overfits. Therefore, the TAFEO algorithm can be used to optimize the hyperparameters and the number of neurons of the LSTM network to improve classification accuracy and speed. In this paper, TAFEO is combined with the LSTM network to optimize the number of hidden-layer neurons and the gradient-descent parameters Batchsize and Maxepoch of the LSTM model.
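A hedged sketch of the fitness function such an optimizer would minimize, built on MATLAB's Deep Learning Toolbox: a candidate vector [numHidden, batchSize, maxEpoch] trains an LSTM classifier, and the validation error serves as the fitness. The function and variable names are our own; XTrain/XVal are cell arrays of numFeatures-by-length sequences and YTrain/YVal are categorical labels (save as lstmFitness.m):

    % Sketch: fitness of one candidate hyperparameter vector for TAFEO-LSTM.
    function err = lstmFitness(params, XTrain, YTrain, XVal, YVal, ...
                               numFeatures, numClasses)
        numHidden = round(params(1));            % candidate hidden-layer size
        layers = [ ...
            sequenceInputLayer(numFeatures)
            lstmLayer(numHidden, 'OutputMode', 'last')
            fullyConnectedLayer(numClasses)
            softmaxLayer
            classificationLayer];
        options = trainingOptions('adam', ...
            'MaxEpochs', round(params(3)), ...   % candidate Maxepoch
            'MiniBatchSize', round(params(2)), ...  % candidate Batchsize
            'Verbose', false);
        net = trainNetwork(XTrain, YTrain, layers, options);
        YPred = classify(net, XVal, 'MiniBatchSize', round(params(2)));
        err = 1 - mean(YPred == YVal);           % misclassification rate
    end

TAFEO would then search over the bounded space of (numHidden, Batchsize, Maxepoch), calling this function as its objective.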

The improved TAFEO-LSTM model is shown in Figure 4.

The algorithm table for the improved TAFEO-LSTM model is shown in Table 7.

3.3. UCI Dataset Simulation Experiments

To test the superiority of the model optimized by the improved algorithm, six international general-purpose UCI binary classification datasets were selected for comparison. The comparison models are LSTM, EO-LSTM, and TAFEO-LSTM, and the six datasets are banknote, blood, climate simulation, Indian, Pima, and WDBC. The details of the six datasets are shown in Table 8.

To evaluate the performance of the model more accurately, the ROC [23] curve was introduced as an additional metric for model evaluation in addition to the accuracy.

The vertical coordinate of the ROC curve is the true-positive rate (TPR), and the horizontal coordinate is the false-positive rate (FPR). The true-positive rate is the proportion of positive samples correctly predicted as positive among all positive samples, as shown in equation (15), and the false-positive rate is the proportion of negative samples incorrectly predicted as positive among all negative samples, as shown in equation (16):

$$TPR = \frac{TP}{TP + FN} \quad (15)$$

where $TP$ is the number of positive examples whose labels are correctly classified and $FN$ is the number of positive examples incorrectly classified as negative;

$$FPR = \frac{FP}{FP + TN} \quad (16)$$

where $TN$ is the number of negative examples whose labels are correctly classified and $FP$ is the number of negative examples incorrectly classified as positive.

In the ROC curve, model performance is usually evaluated by the AUC (area under the ROC curve): the larger the AUC value, the better the generalization performance of the model. The AUC is calculated as

$$AUC = \frac{\sum_{i \in \text{positive}} \mathrm{rank}_i - \frac{M(M+1)}{2}}{M \cdot N}$$

where $M$ and $N$ are the numbers of positive and negative samples, respectively, and $\mathrm{rank}_i$ is the rank of the $i$-th positive sample when all samples are sorted by predicted score.
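A minimal MATLAB sketch of the ROC/AUC computation with the built-in perfcurve function (Statistics and Machine Learning Toolbox); the labels and scores below are placeholders for the LSTM's positive-class probabilities:

    % Sketch: ROC curve and AUC from classifier scores.
    labels = [ones(50, 1); zeros(50, 1)];                  % 1 = positive class
    scores = [rand(50, 1) * 0.6 + 0.4; rand(50, 1) * 0.6]; % placeholder scores
    [fpr, tpr, ~, auc] = perfcurve(labels, scores, 1);
    plot(fpr, tpr); xlabel('False-positive rate'); ylabel('True-positive rate');
    title(sprintf('ROC curve (AUC = %.4f)', auc));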

In the simulation experiments, each UCI dataset was divided into a 70% training set and a 30% test set. The experimental results were averaged over ten experiments. The AUC values of classification results are shown in Table 9, and the ROC prediction curves are shown in Figure 5.

From Figure 5, it can be seen that using TAFEO to search for the number of neurons and the hyperparameters Batchsize and Maxepoch of the LSTM model is effective: the AUC values improve over both the original LSTM model and the EO-optimized LSTM. As can be seen from Table 9, the accuracy of the LSTM model optimized by TAFEO is also improved.

4. Logging Reservoir Identification

4.1. Logging Dataset

To verify the effectiveness of the TAFEO-LSTM model proposed in this paper in oil logging data mining, data from two actual oil and gas field wells (D1 and D2) were used for validation.

Attribute reduction of well D1 yielded five attributes (AC, GR, RT, RXO, and SP), and attribute reduction of well D2 yielded 13 attributes (GR, DT, SP, WQ, LLD, LLS, DEN, NPHI, PE, U, TH, K, and CALI). The reduced attributes of the D1 and D2 wells were used in testing, and the data of the training well segments were divided into a 70% training set and a 30% test set. The attribute data from the D1 and D2 wells were normalized, and five main attributes from each well were selected to draw their logging curves, as shown in Figures 6 and 7.
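The paper does not specify the normalization scheme; assuming the common min-max scaling of each logging curve to [0, 1], the step amounts to the following MATLAB sketch (the attribute matrix is a placeholder, with rows as depth samples and columns as attributes such as AC, GR, RT, RXO, and SP):

    % Sketch: per-attribute min-max normalization of logging curves to [0, 1].
    logs = rand(1000, 5);                             % placeholder matrix
    logsNorm = (logs - min(logs)) ./ (max(logs) - min(logs));

Scaling each curve to a common range prevents attributes with large numeric ranges (e.g., resistivity) from dominating the LSTM's gradient updates.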

To verify the performance of the TAFEO-LSTM model, five models, LSTM, EO-LSTM, MPA-LSTM, WOA-LSTM, and GWO-LSTM, are constructed for comparison experiments. The parameter settings in each model are shown in Table 10.

Information on the data from the two selected wells is shown in Table 11.

4.2. Reservoir Identification Results

Figure 8 compares the formation results of each algorithm model in the test well section of well D1 with the actual oil-testing results, where vertical coordinate 2 represents an oil layer and vertical coordinate 1 represents a non-oil layer.

From Figure 8, the TAFEO-LSTM model has higher accuracy, and its classification results are closer to the actual oil layer distribution than those of LSTM, EO-LSTM, MPA-LSTM, WOA-LSTM, and GWO-LSTM. Accuracy was used to assess model performance, and the recognition accuracy reported is the average over 30 runs of each algorithm. The results are shown in Table 12.

From Table 12, the classification performance of the LSTM optimized by the intelligent algorithms is significantly improved; the recognition accuracy of the oil layer reaches 94.04% with the TAFEO-LSTM model, which indicates that applying TAFEO to optimize the LSTM for oil layer identification is feasible and effective.

Similar to D1, accuracy was used to assess model performance and the average of results after 30 runs is shown in Table 13.

From Table 13, the classification performance of the LSTM optimized by the intelligent algorithms is significantly improved; the recognition accuracy of the gas layer reaches 95.01% with the TAFEO-LSTM model, which indicates that applying the TAFEO-optimized LSTM to gas layer identification is feasible and effective.

Figure 9 compares the formation results of each algorithm model in the test well section of well D2 with the actual gas-testing results, where vertical coordinate 2 represents a gas layer and vertical coordinate 1 represents a non-gas layer.

As can be seen from Figure 9, compared with LSTM, EO-LSTM, WOA-LSTM, GWO-LSTM, and MPA-LSTM, the TAFEO-LSTM model has a higher accuracy rate, and its classification results are closer to the real gas layer distribution.

4.3. Comparison of Reservoir Identification Models

To better verify the validity of the improved model, several models that have been applied to logging are compared with the one proposed in this paper: the Fisher discriminant approach, the BP neural network, ELM [24], SVM [25], and CNN [26]. These models were used to identify the logging data of well D2. The comparison results are shown in Table 14.

From Table 14, the TAFEO-LSTM model is more accurate in identifying reservoirs than other models because it takes into account the temporal characteristics of logging data.

In summary, the TAFEO-optimized LSTM classification model is effective in practical logging reservoir identification.

5. Conclusions

(1) An improved equilibrium optimizer algorithm, TAFEO, is proposed. Tent mapping is introduced to increase population diversity; a convergence factor effectively accelerates convergence and balances local and global search; and a premature-perturbation strategy prevents the algorithm from falling into local optima. In simulations on 16 benchmark functions, with Wilcoxon rank-sum tests on the optimization results, the improved algorithm outperforms various intelligent optimization algorithms in accuracy and convergence speed and shows good robustness.

(2) The TAFEO algorithm is applied to LSTM parameter optimization, and a TAFEO-LSTM reservoir identification model is established. Simulation experiments on the UCI datasets demonstrate that the improved model has strong generalization ability and a high recognition rate. The TAFEO-LSTM model is then applied to reservoir identification, and the results are compared with those of five models: LSTM, EO-LSTM, WOA-LSTM, GWO-LSTM, and MPA-LSTM. The proposed TAFEO-LSTM model identifies reservoirs more accurately than the other five models, with a highest accuracy of 94.04% for the oil layer and 95.01% for the gas layer. TAFEO-LSTM also performs well compared with the other models used for reservoir identification. The improved model is thus effective in reservoir identification and has broad application prospects.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest in the publication of this paper.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (no. 42075129), the Hebei Province Natural Science Foundation (no. E2021202179), and the Key Research and Development Project of Hebei Province (nos. 19210404D, 20351802D, and 21351803D).