Abstract
As new ways to solve partial differential equations (PDEs), physics-informed neural network (PINN) algorithms have received widespread attention and have been applied in many fields of study. However, the standard PINN framework lacks sufficient seepage head data, and the method is difficult to apply effectively in seepage analysis with complex boundary conditions. In addition, the differential-type Neumann boundary makes the solution more difficult. This study proposes an improved prediction method based on a PINN with the aim of calculating PDEs with complex boundary conditions such as Neumann boundary conditions, in which spatial distribution characteristic information is added through a small amount of measured data and the loss equation is dynamically adjusted by loss weighting coefficients. The measured data are converted into a quadratic regular term and added to the loss function as feature data to guide the update process for the weight and bias coefficients of each neuron in the neural network. A typical geotechnical problem concerning seepage phreatic line determination in a rectangular dam is analyzed to demonstrate the efficiency of the improved method. Compared with the standard PINN algorithm, due to the addition of measurement data and dynamic loss weighting coefficients, the improved PINN algorithm has better convergence and can handle more complex boundary conditions. The results show that the improved method makes it convenient to predict the phreatic line in seepage analysis for geotechnical engineering projects with measured data.
1. Introduction
Deep learning-based neural networks are widely used in industrial artificial intelligence and have rapidly promoted the development of computer image recognition, autonomous driving, and computer-aided medical diagnosis. Furthermore, with the discovery that multilayer feedforward neural networks are universal function approximators, deep learning methods have also been used to solve various partial differential equations (PDEs) in the field of mathematics [1–12]. Zhu and Zabaras solved PDEs with deep convolutional neural networks, which are commonly applied in image regression analysis [13]. Sirignano and Spiliopoulos proposed the deep Galerkin method (DGM) via the deduction of the Galerkin approximation, which proves that a neural network can approximate a class of quasilinear PDEs [14]. Furthermore, Gaussian processes have been incorporated into the computation of linear neural network operators and can accurately predict parameters from sparse observational data [15, 16]. However, the Gaussian process has limitations in dealing with nonlinear problems, where nonlinear terms must be locally linearized, which significantly reduces its prediction accuracy in nonlinear regions. Therefore, Raissi et al. adopted the Runge–Kutta method to solve the above problem and proposed a neural network method for solving PDEs that considers physical information, which is called a physics-informed neural network (PINN); PINNs are widely applied to prediction and inversion problems [15–23]. The PINN method is suitable for dealing with high-dimensional problems and can determine the inverses of equation parameters with limited data; therefore, this algorithm has attracted deep attention and is under continuous improvement [24]. Wu et al. combined the standard PINN with conservation laws, which enhances the physical model constraint capability of the PINN [25]. Fang et al. 
proposed a strongly constrained physics-informed neural network (SCPINN) by adding the information of complex derivatives to the PINN, which effectively predicts a series of nonlinear dynamical equations [26]. Ramabathiran and Ramachandran proposed a method that mixes a PINN with traditional meshless numerical algorithms to improve the interpretability of the PINN, and this method can also solve the discontinuity problem [27]. Compared with traditional numerical methods such as the finite difference method and finite-element method (FEM), methods based on neural networks can avoid grid division and discretized equations, and once the neural networks are trained, results can be predicted anywhere in the training area.
Currently, the artificial intelligence method has been extended to geotechnical engineering as a new computational method and a method for generating data to compensate for the lack of measured data [28–31]. Yang et al. proposed an excavation and support deformation prediction method based on a gated recurrent unit neural network [32]. Wang applied convolutional neural networks to geotechnical reliability analysis [33]. Wang et al. [34] proposed a novel convolutional neural network to analyze dam surface seepage images collected by drones. In addition, Wang et al. [35] proposed a solution method for groundwater seepage differential equations based on deep convolutional residual networks, which has high efficiency in solving high-dimensional stochastic PDEs. Daolun et al. [36] created a new special neuron containing pressure gradient information model and proposed an algorithm called the signpost neural network (SNN), which can greatly improve the attained solution accuracy of unsteady seepage PDEs.
However, most of the above studies solved seepage differential equations under simple Dirichlet boundary conditions, and these methods can hardly be used in practical engineering cases when solving PDEs with complex boundary conditions, including Dirichlet boundary conditions and Neumann boundary conditions. For the seepage of an earth-rock dam, the corresponding seepage control equation has complex boundary conditions such as the phreatic line to be solved, which requires an iterative algorithm even for traditional numerical methods such as the FEM. Hence, it is necessary to carry out more studies on solutions for seepage differential equations with complex boundary conditions using neural networks.
In this study, an improved method for solving the differential equations of seepage based on a PINN is proposed, which aims at the complex free surface solution in seepage analysis. PINN-MD contains the spatial distribution characteristic information added by a small amount of measured data. These data were added to the loss equation of the neural network to improve the prediction accuracy of PINN-MD and reduce the induction error. Meanwhile, the addition of dynamic loss weight coefficients effectively improves the convergence performance of PINN-MD. The prediction method is verified by a two-dimensional numerical seepage example. The results show that the developed method can predict the two-dimensional steady-state free surface of seepage with high accuracy based on some measured data. Therefore, the method can be effectively adopted for computational problems with scarce measured data and has wide application prospects in geotechnical engineering.
2. Methodology
2.1. PDEs Solution Based on PINNs
Traditional solution methods for PDEs, such as the FEM, require PDE discretization to form a nonlinear system of equations. Then, the system of equations is solved with an iterative method. As precision requirements increase, traditional methods need progressively finer grids, which greatly increases their computing times. In contrast, neural networks take a very different approach to solving PDE problems. A PINN computes higher-order differentials by the backward automatic differentiation method [37], so there is no need to discretize the governing equations or to form and refine a mesh. Because PINNs are rooted in deep neural networks (DNNs), the basic idea of a DNN is briefly introduced as follows.
As shown in Figure 1, a basic DNN consists of three parts: an input layer, hidden layers, and an output layer. The hidden layers form the training part and consist of multiple layers of fully connected neurons. Assuming that there are m neurons in the (n-1)th layer and that the input data have k dimensions, the output of the jth neuron in the nth layer is as follows:

$$a_j^{(n)} = \sigma\left(\sum_{i=1}^{m} w_{ji}^{(n)} a_i^{(n-1)} + b_j^{(n)}\right),$$

where $w_{ji}^{(n)}$ and $b_j^{(n)}$ are the weight and bias of the jth neuron in the nth layer, respectively, and $\sigma$ is the activation function. $\theta$ is taken as the collection of the weight matrix and bias vector and is written as $\theta = \{W, b\}$. When n = 2, the corresponding input layer is the k-dimensional input x. W and b are continuously updated during training until the output conditions are met. The neural network also contains parameters that cannot be obtained by training, called hyperparameters, including the initial learning rate $l_r$, the number of neurons in each layer m, and the number of hidden layers n.
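The layer formula above can be sketched in a few lines of code. This is a minimal illustration, not the paper's implementation: the helper name `dense_layer` and the toy sizes are assumptions, and tanh is used as the activation function $\sigma$ because the study adopts it later (Section 2.3.2).

```python
import numpy as np

def dense_layer(a_prev, W, b):
    # Output of one fully connected layer: sigma(W @ a + b),
    # with tanh standing in for the activation function sigma.
    return np.tanh(W @ a_prev + b)

# Toy layer: k = 2 input dimensions feeding 3 neurons.
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 2))   # weight matrix of the layer
b = np.zeros(3)               # bias vector
out = dense_layer(np.array([0.5, -1.0]), W, b)
print(out.shape)              # (3,)
```

Stacking several such layers, each fed the previous layer's output, yields the fully connected DNN described above.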
Through DNN improvements, Raissi et al. [16] provided a new network structure, containing physical information constraints, that can solve the following PDE:

$$u_t + \mathcal{N}[u] = 0, \quad x \in \Omega,\ t \in [0, T],$$

where $\mathcal{N}[\cdot]$ is a nonlinear differential operator and $\Omega$ is a subset of $\mathbb{R}$. The function $f$ is defined as follows:

$$f := u_t + \mathcal{N}[u].$$
The loss function can be defined as the mean squared error (MSE) as shown in the following equation:

$$MSE = MSE_f + MSE_b. \tag{4}$$
During the process of solving a PDE, the loss function is used to characterize the gap between the predictions and the actual data. In the meantime, the loss function represents the robustness of the prediction model.
$MSE_f$ is the residual of the governing equation as follows:

$$MSE_f = \frac{1}{N_f} \sum_{i=1}^{N_f} \left| f\!\left(x_f^i, t_f^i\right) \right|^2.$$
$MSE_b$ is the residual of the initial and boundary conditions as follows:

$$MSE_b = \frac{1}{N_b} \sum_{i=1}^{N_b} \left| \hat{u}\!\left(x_b^i, t_b^i\right) - u^i \right|^2,$$

where $u$ is the solution of the equation, $\hat{u}$ is the output of the neural network, and $N_f$ and $N_b$ are the numbers of training samples in the equation domain $\Omega$ and on the boundary $\partial\Omega$, respectively. When the loss function is reduced to the control value, the output of the neural network approximates the solution of the corresponding PDE.
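As a minimal numeric sketch of how the composite loss is assembled (the helper name `pinn_loss` is hypothetical, and the residual arrays are assumed to have been computed already by the network and automatic differentiation):

```python
import numpy as np

def pinn_loss(f_residuals, u_pred_b, u_true_b):
    # MSE = MSE_f + MSE_b: mean squared PDE residual at the N_f
    # collocation points plus mean squared boundary mismatch at
    # the N_b boundary points.
    mse_f = np.mean(np.square(f_residuals))
    mse_b = np.mean(np.square(u_pred_b - u_true_b))
    return mse_f + mse_b

# Tiny check with made-up residuals: MSE_f = 1, MSE_b = 4, total = 5.
loss = pinn_loss(np.array([1.0, -1.0]), np.array([2.0]), np.array([0.0]))
print(loss)  # 5.0
```

Driving this scalar toward zero with an optimizer is what pushes the network output toward the PDE solution.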
As shown in Figure 2, the randomly generated input data x and t are trained in the hidden layers to produce output data. The output data are converted into governing equations by automatic differentiation. Then, the residuals of the governing equation and the boundary equation form a loss function. The parameters are continuously updated until the iteration count Iters reaches the specified threshold, at which point the loop ends. Finally, the solution of the PDE is output.

From the above, it can be seen that the training data in the standard PINN algorithm are randomly generated, so the data themselves contain no additional information. This often makes the training process too long or even causes it to fail. Similar to the training process of a machine vision network, labeled data are also necessary when predicting differential equations. Additionally, the introduction of physical information makes the optimization of the loss function highly nonconvex, which also causes randomly generated data to easily fall into local minima during training [38–40]. In addition, equation (4) shows that the loss function of the PINN method is a weighted combination of its loss terms. However, because the loss terms share the same weight, they compete with one another, and this competition adversely affects the solution accuracy [41].
2.2. PINN Optimized by Measurement Data (PINN-MD Method)
In previous PDE studies, the PINN method could obtain a high prediction accuracy with exact Dirichlet boundary conditions [31]. However, these studies are not applicable to problems with nonlinear boundaries such as seepage with phreatic lines because the exact spatial distribution of the solutions is required to provide the necessary feature information. The authors found that the accuracy of the PDE solution is greatly reduced when the exact boundary conditions are removed. Therefore, it is necessary to introduce new feature information to enhance the convergence of PINN.
When complex boundary conditions such as the Neumann boundary condition are present in the phreatic line equation, it is difficult for the PINN to converge, and it easily falls into local minima. To help the network converge, an improved method is proposed that augments the loss function with some measured data, thereby increasing the spatial distribution characteristics available to the neural network. The loss function for the measurement data is regularized as follows:

$$MSE_d = \frac{1}{N_d} \sum_{i=1}^{N_d} \left| \hat{h}\!\left(x_d^i, y_d^i\right) - h_d^i \right|^2,$$

where $N_d$ is the number of measurement data and $\hat{h}(x_d^i, y_d^i)$ and $h_d^i$ are the predicted and measured values at the measurement points $(x_d^i, y_d^i)$, respectively.
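The quadratic regular term over the measurement points is straightforward to compute. A minimal sketch, with a hypothetical helper name and made-up head values:

```python
import numpy as np

def mse_measured(h_pred, h_meas):
    # Quadratic regular term over the N_d measurement points:
    # MSE_d = (1 / N_d) * sum_i (h_pred_i - h_meas_i)^2
    h_pred = np.asarray(h_pred, dtype=float)
    h_meas = np.asarray(h_meas, dtype=float)
    return np.mean(np.square(h_pred - h_meas))

# Three hypothetical head measurements; only the last is mispredicted.
mse_d = mse_measured([4.0, 3.5, 3.0], [4.0, 3.5, 5.0])
print(mse_d)  # 4/3 = 1.333...
```

Because this term vanishes only when the network reproduces the measured heads, it injects the spatial distribution features of the measured data into every parameter update.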
In addition, when training the PINN, as the total loss continues to decrease, an imbalance arises among the residuals of the components because they have different convergence speeds [42]. The residuals of the three components are therefore given different weighting coefficients during training, and these coefficients are continuously changed during the training process to regulate the convergence rates of the residuals of each component. To ensure that all terms in the loss function have the same convergence speed, a loss weighting coefficient that changes with the loss residual is added to the loss function. To prevent any one residual term from being given too much weight, the weight of the largest residual term is set to 1, and the rest are determined according to the ratios of their residual values to the largest residual value. The rules for assigning weights are as follows:

$$\lambda_f = \frac{MSE_f}{\max(MSE_f, MSE_b, MSE_d)}, \quad \lambda_b = \frac{MSE_b}{\max(MSE_f, MSE_b, MSE_d)}, \quad \lambda_d = \frac{MSE_d}{\max(MSE_f, MSE_b, MSE_d)},$$

where $\lambda_f$, $\lambda_b$, and $\lambda_d$ are the weight correction coefficients for $MSE_f$, $MSE_b$, and $MSE_d$, respectively. Then, the improved loss function is as follows:

$$Loss = \lambda_f \, MSE_f + \lambda_b \, MSE_b + \lambda_d \, MSE_d.$$
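The weight-assignment rule can be sketched in a few lines (helper name and the residual values are illustrative assumptions, not the paper's code):

```python
def dynamic_weights(mse_f, mse_b, mse_d):
    # The largest residual term gets weight 1; the others get the
    # ratio of their residual value to the largest residual value.
    largest = max(mse_f, mse_b, mse_d)
    return mse_f / largest, mse_b / largest, mse_d / largest

# Hypothetical residuals at one training step.
lam_f, lam_b, lam_d = dynamic_weights(0.5, 2.0, 1.0)
loss = lam_f * 0.5 + lam_b * 2.0 + lam_d * 1.0  # improved weighted loss
print(lam_f, lam_b, lam_d)  # 0.25 1.0 0.5
```

Recomputing the three coefficients at each step is what makes the weighting "dynamic": as the residual magnitudes drift apart, the weights rebalance their contributions to the total loss.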
Because measured data and dynamic loss weighting coefficients are added to the network to participate in the network parameter updates, this new method is named the PINN algorithm considering measurement data with dynamic loss weighting coefficients (PINN-MD).
Figure 3 shows the schematic diagram of PINN-MD. Compared to the PINN algorithm, in PINN-MD, the residuals of the measurement data $MSE_d$ are regularized and added to the loss equation as a new loss term. Spatial distribution features are added to the neural network through this loss term to participate in parameter updating and guide the neural network to converge. The dynamic loss weight coefficients help the neural network converge accurately. Consequently, PINN-MD has a faster convergence rate while having better adaptability to seepage equations with nonlinear boundary conditions.

2.3. Application of PINN-MD in a 2D Seepage PDE with a Phreatic Line
2.3.1. Two-Dimensional Seepage PDE
According to Darcy’s law, the two-dimensional differential seepage equation is expressed as follows:

$$v_x = -k_x \frac{\partial h}{\partial x}, \quad v_y = -k_y \frac{\partial h}{\partial y}. \tag{10}$$

In addition, the continuity equation of steady flow is as follows:

$$\frac{\partial v_x}{\partial x} + \frac{\partial v_y}{\partial y} = 0. \tag{11}$$

By combining Darcy’s law (10) with flow equation (11), the seepage control equation can be obtained as follows:

$$\frac{\partial}{\partial x}\left(k_x \frac{\partial h}{\partial x}\right) + \frac{\partial}{\partial y}\left(k_y \frac{\partial h}{\partial y}\right) = 0, \tag{12}$$

where h is the waterhead function and $k_x$ and $k_y$ are the permeability coefficients of the porous medium in the x and y directions. When the permeability of the soil is isotropic, namely, $k_x = k_y$, equation (12) reduces to the following Laplace equation:

$$\frac{\partial^2 h}{\partial x^2} + \frac{\partial^2 h}{\partial y^2} = 0. \tag{13}$$
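The residual of equation (13) is what the PINN drives to zero at its collocation points. A minimal check, using central finite differences in place of the automatic differentiation that the actual PINN employs (the helper name and evaluation point are illustrative):

```python
def laplace_residual(h, x, y, eps=1e-4):
    # Residual of equation (13), h_xx + h_yy, approximated here with
    # central finite differences; the PINN itself obtains these
    # second derivatives by automatic differentiation.
    h_xx = (h(x + eps, y) - 2.0 * h(x, y) + h(x - eps, y)) / eps**2
    h_yy = (h(x, y + eps) - 2.0 * h(x, y) + h(x, y - eps)) / eps**2
    return h_xx + h_yy

# h = x^2 - y^2 is harmonic, so its residual should be (numerically) zero.
res = laplace_residual(lambda x, y: x**2 - y**2, 1.3, 0.7)
print(abs(res) < 1e-3)  # True
```

Any candidate head field that leaves this residual large at interior points cannot be a solution of the steady isotropic seepage problem.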
The seepage phreatic line model is shown in Figure 4. For a rectangular homogeneous earth-rock dam of length L, the upstream and downstream water levels are h1 and h2, respectively, and equation (13) is the governing equation for describing steady-state seepage in geotechnical engineering. In an earth-rock dam, the phreatic line is the seepage flow line with zero pore pressure [43]. The height and shape of the phreatic line in the dam body have great influences on the stress of the dam and the stability of the dam slope. Therefore, the determination of its position is an important element in seepage and stability analyses of earth-rock dams.

2.3.2. Procedure for Solving 2D Seepage PDEs with PINN-MD
The process of solving a PDE with the PINN-MD method is based on two data components: one part is the generated featureless data, which enter the hidden layer through the input layer, and the other part is the measured spatial feature data, which affect the hidden layer through the loss function. Simultaneously, the structural hyperparameters of the neural network (the number of neurons per layer and the number of hidden layers) are set according to the PDE. The neural network in this study can start with a single layer and gradually increase in size [44]. A fully connected neural network is used as the basic network form. In addition, the initial learning rate lr also needs to be set. The learning rate lr controls the speed of the parameter update process and the learning speed of the model, and the relationship is expressed as follows:

$$\theta_{i+1} = \theta_i - l_r \, g_i,$$

where $\theta_i$ is the parameter value after the ith training iteration and $g_i$ is the gradient value after the ith training step. lr is a very sensitive parameter that influences the performance of the model in two ways: via the size of the learning rate and the change plan of the learning rate. The initial learning rate always has an optimal value, and the commonly used values of lr range from 0.001 to 0.01. For the transformation strategy of the learning rate, the authors use the Adam optimizer of TensorFlow to automatically adjust lr.
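The update rule above is plain gradient descent; a one-step sketch (hypothetical helper name and toy values, whereas the study itself lets TensorFlow's Adam optimizer adapt the step size):

```python
def gradient_step(theta, grad, lr):
    # One update of theta_{i+1} = theta_i - lr * g_i for every
    # trainable parameter.
    return [t - lr * g for t, g in zip(theta, grad)]

theta = [1.0, -2.0]   # parameter values after the ith iteration
grad = [1.0, -1.0]    # gradient of the loss w.r.t. each parameter
theta = gradient_step(theta, grad, lr=0.5)
print(theta)  # [0.5, -1.5]
```

The sensitivity of lr is visible even here: too large a value overshoots the minimum, while too small a value makes each step negligible, which is why adaptive schemes such as Adam are preferred in practice.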
Then, the training parameters are initialized. Before training, initial values are set for the parameters, but different initialization methods may have different training effects. The available parameter initialization strategies mainly include random initialization from the normal distribution and Xavier initialization. Random value initialization can break the symmetry of the neural network very well, but if the random values are too large or too small, the final converged loss value will be large. The basic idea of Xavier initialization is that if the input and output of each network layer can maintain a normal distribution with similar variances, then the outputs can be prevented from tending to 0, thus avoiding gradient dispersion.
For the activation function, tanh is used for each layer due to its good applicability in solving seepage PDEs [36].
Finally, the hyperparameters are adjusted according to the results of each training iteration, and the neural network is continuously optimized until the solution is obtained.
The general process of PINN-MD is as follows.
(1) Two-dimensional input data x and y are randomly generated within the study area, and the neural network hyperparameters, including the initial learning rate lr, the number of neurons in each layer m, and the number of hidden layers n, together with the neural network parameters, are initialized.
(2) The boundary conditions and measurement data are added to the loss function through regularization. The PDE is added to the loss function through the backward automatic differentiation method. The training process of the neural network is started.
(3) During training, the residuals of the loss terms $MSE_f$, $MSE_b$, and $MSE_d$ are monitored, and the loss weight coefficients $\lambda_f$, $\lambda_b$, and $\lambda_d$ are adjusted according to the changes in the loss terms.
(4) The hyperparameters are adjusted according to each training result so that the neural network can achieve better PDE solution accuracy. Under the control of the iteration threshold, the neural network keeps updating the parameters until it finishes training and outputs the result.
3. Numerical Experiments
The two-dimensional Laplace equation is used to examine the performance of PINN-MD with fully connected neural networks of different depths, widths, and initial learning rates. The computational program is written in Python and implemented with the open-source machine learning platform TensorFlow 2.0 for automatic differentiation. All training is controlled by the iteration parameter Iters = 20000.
3.1. Laplace Seepage Equation
The phreatic line of glycerin seepage through a rectangular dam with a height of 6 m and a width of 4 m is taken as an example to demonstrate the feasibility and accuracy of neural network prediction methods [45]. The seepage governing equation in equation (13) is added to the loss function as physical information. In addition to the Dirichlet boundary conditions for the upstream and downstream water levels, the Neumann boundary condition at the impervious base of the dam is as follows:

$$\left.\frac{\partial h}{\partial y}\right|_{y=0} = 0.$$
As shown in Figure 5, the standard PINN with 5 layers of neurons and 20 neurons per layer is first used to predict the seepage field and phreatic line. Because the standard PINN does not consider measurement data and dynamic weight coefficients, it is difficult for the neural network to converge under such boundary conditions, resulting in obvious errors in the phreatic line. In contrast, once PINN-MD is used, adding measured data in place of randomly generated data for training, the error is greatly reduced. Part of the experimental data presented in the literature [45] was used as the training set.
The left side of Figure 5 shows the comparison between the predicted curves obtained under different network shapes and the experimental curves. The experimental data in the upstream interval shown in the figure are used as training data to help the neural network converge and improve its prediction accuracy. The red line is the free surface of seepage fitted according to all experimental data. Even three layers with 5 neurons in each layer can clearly predict the downward trend of the free surface. As the network shape becomes more complex, the prediction accuracy gradually increases. The upstream region shows high prediction accuracy due to the addition of measured data, and the prediction error mainly occurs downstream. In the right part of Figure 5, the measured data are moved from the upstream part to the middle section. The prediction curve fluctuates slightly near the theoretical line. As the complexity of the network structure increases, the prediction effect gradually improves. When the network structure contains 3 layers with 10 neurons in each layer or becomes more complex, the prediction curve obtained using the measured data from the middle section becomes closer to the accurate value than the prediction curve obtained using the upstream measured data. Table 1 shows the prediction errors induced using measured data from the upstream and middle sections. For the most complex network structure in the table, the prediction accuracy attained with the middle section measurement data is 68% higher than that attained with the upstream measurement data. In addition, Figure 6 shows that PINN-MD has better convergence performance: the dynamic weighting coefficients improve the convergence efficiency of PINN-MD and reduce the final converged loss value, reflecting the significant advantage of PINN-MD.

Figure 7 shows the prediction errors of the exudation points for the different network structure curves in Figure 5. The prediction error of the exudation point decreases as the neural network deepens; for the most complex network structure using the middle section measurement data, the error is as small as 1.2%.

In general, the measured data in the middle of the dam contain more obvious seepage pressure variation characteristics, and using the middle section measurement data as the spatial distribution feature information for learning and training can significantly improve the prediction accuracy achieved for the phreatic line and seepage exudation point.
3.2. Hyperparameter Analysis
The settings of the neural network hyperparameters directly impact the prediction effect of the PINN-MD algorithm. Figure 8 shows the effects of the hyperparameters on prediction accuracy using middle section measurement data. With the decrease in the initial learning rate, the error exhibits a rapid decreasing trend and stabilizes below 1%. The relative error reaches its minimum value when the learning rate is 0.001. However, a further decrease in the learning rate generates additional errors. With the deepening of the neural network, the overall prediction accuracy shows an upward trend. When the number of network layers is 3, better prediction results are obtained, and after this point, the accuracy improvement is not obvious. With 5 layers and 20 neurons in each layer, the prediction error of the model can reach 0.35%. Although continuing to increase the number of neural network layers can further improve the prediction accuracy, the benefits of increasing the number of layers are lower than the cost induced by the extra time consumption. Because differential equation (13) has a lower level of complexity than machine vision tasks [12], the change in the number of neuron layers has a different effect on the prediction accuracy than the change in the number of neurons per layer. In this study, once the number of layers reaches 5, continuing to increase the number of neuron layers yields limited improvement in prediction accuracy. However, an increase in the number of neurons per layer steadily improves the accuracy of the prediction. When the number of neurons in each layer is small, increasing the number of network layers is still ineffective, but when the number of network layers is low, increasing the number of neurons can effectively improve the prediction accuracy. The prediction accuracy improves rapidly until the number of neurons reaches 40 and then increases only slowly as more neurons are added.
The PINN-MD method can perform well with sparse data by using only a small amount of measured data. At the same time, the use of fewer neural network parameters allows the deep learning process to converge faster.
In summary, too many or too few neurons in the hidden layer can lead to overfitting or underfitting. When a neural network has too many nodes or too much information processing power, the limited amount of information contained in the training set is not sufficient for training all the neurons in the hidden layers; thus, overfitting occurs. Even if the training data contain enough information, the use of too many neurons in the hidden layer increases the training time, making it difficult to achieve the desired effect. Therefore, choosing an appropriate number of neurons is crucial.
4. Discussion
The most important improvement provided by PINN-MD is to introduce a reasonable regularization approach for spatial distribution information, which enables neural networks to better learn solutions to PDEs from fewer measurement data. In this study, spatial distribution information regularization is achieved through the automatic differentiation of the TensorFlow framework. PINN-MD can handle seepage PDEs with more complex boundary conditions, including the Dirichlet boundary conditions and Neumann boundary conditions. Furthermore, the improved dynamic loss function of PINN-MD has better convergence.
Although the method proposed in this article has many advantages, such as not requiring discretization of the PDEs, it also faces problems: neural networks for solving PDEs rely heavily on training data, and when the quality of the given training data is poor, more training time is often needed.
5. Conclusion
An improved method for predicting the free surface of seepage using neural networks is proposed, in which measured data are added to the neural network, and the boundary conditions, initial value conditions, and seepage control equations together constitute the loss function. The convergence of the loss function is regulated by dynamic loss weighting coefficients. Different neural network structures and initial learning rates are used for training the neural network. The main conclusions are as follows.
(1) The PINN-MD method is characterized by introducing a small amount of measured data, which provides additional feature inputs for the neural network, and by dynamic loss weight coefficients. The loss function of the seepage equation for a rectangular homogeneous dam is established, which includes the characteristics of the spatial distribution and the dynamic balance of each loss term.
(2) Numerical experiments show that only minor errors are produced between the predicted results and the experimental data, thus verifying the effectiveness of the proposed method. A comparison among measurement data taken from different positions illustrates that measurement data closer to the center of the phreatic line area can improve the attained prediction accuracy.
(3) The neural network training is sensitive to the hyperparameter choices, including the number of network layers, the number of neurons, and the initial learning rate. The influence of the number of network layers on the prediction accuracy is greater than that of the number of neurons in each layer. For the two-dimensional steady-state seepage equation, fewer hidden layers can achieve a good learning effect.
Compared with traditional numerical calculation methods, the PINN-MD method does not need to apply strict boundary conditions, requiring only a small number of measurements to complete the prediction process. Additionally, it solves similar types of equations quickly after training, thus effectively reducing the workload of engineering detection. The method can be further improved to provide a valuable approach for solving geotechnical engineering problems.
Data Availability
The data used to support the findings of this study are included within the article.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
The authors acknowledge the financial support received from the National Natural Science Foundation of China (Grant no. U1965203) and ‘The research on support time and deformation warning of surrounding rock of large underground cavern group under extremely high stress condition of Shuangjiangkou Hydropower’ (Grant no. A147 SG).