Abstract
To improve the accuracy of theoretical energy loss calculation in low-voltage distribution areas (LV-areas), this paper proposes a prediction method based on variational mode decomposition (VMD) and a least squares support vector machine (LSSVM) optimized by particle swarm optimization (PSO). Firstly, the main factors influencing energy loss in the LV-area are determined by the grey correlation method; this data-driven selection keeps the prediction results objective and helps the calculation model generalize. Secondly, the daily energy loss series of the LV-area is decomposed by VMD into a trend component and fluctuation components. The set of main influencing factors is used as the input of the LSSVM, and the VMD components of the energy loss series are used as the output, which establishes the training and calculation model for theoretical energy loss in the LV-area. Compared with the traditional calculation model, this model is more accurate because it takes into account the characteristics of energy losses in different frequency bands. PSO is used to optimize the LSSVM parameters and further improve accuracy. Finally, a case study of 252 LV-areas in a city in northern China verifies the validity of the proposed method. The results indicate that the proposed method is more accurate than PSO-LSSVM without VMD and LSSVM without parameter optimization.
1. Introduction
In recent years, intelligent distribution networks have been developed with construction and energy saving pursued simultaneously. At the same time, higher accuracy and timeliness are required of energy loss calculation. Because the LV-area is the part of the distribution network directly connected to the load, the analysis and calculation of its power loss has always been an important research topic [1]. The traditional energy loss calculation method needs a large amount of real-time operation data and very detailed basic system data; however, these data are not easily available, so it is difficult to obtain accurate results with the traditional strategy [2–5]. Artificial intelligence algorithms, which have been successfully applied in load forecasting and other fields [6, 7], provide a new way to solve this problem because they can predict energy loss accurately from limited data.
In the past decades, researchers have proposed many methods to improve the accuracy of energy loss prediction. For example, Bai and Liu [8] proposed an evaluation method based on a BP neural network improved by particle swarm optimization. Zhong and Chen [9] selected eleven energy loss features from LV-area sample data, constructed a fairly complete electrical index evaluation system for the LV-area, took the features as the input variables of the training model, and proposed a line loss calculation method based on deep learning. However, the actual distribution network is extremely complex, and it is extremely difficult to acquire such a large amount of distribution operation data from LV-area users. Qin and Liu [10] applied extreme learning machines (ELM) to line loss prediction and used particle swarm optimization to optimize the hidden layer parameters of the ELM. Some methods based on artificial intelligence algorithms achieve accurate distribution network energy loss calculation with limited data, and data-driven feature selection methods such as the grey correlation method provide a solution to the problem of limited data. Yao and Zhu [11] applied a gradient boosting decision tree model to line loss prediction in the LV-area and demonstrated that the proposed method performs better than traditional subjective feature selection methods. Data-driven feature extraction makes the calculation model more accurate, can eliminate the coupling between feature quantities, and reduces subjective influence, which makes the prediction model more universal and generalizable [12, 13].
The energy loss data series of the LV-area also have significant frequency characteristics, but past studies have often ignored them. If the characteristics of the energy loss sequence in different frequency bands are fully considered, more accurate calculation results can be obtained. Fan et al. [6] applied the empirical mode decomposition (EMD) method to electric load signal decomposition. The authors divided the data into eight groups, a trend term and seven high-frequency terms, all of which were used in the prediction model. The results showed that modelling the different frequency characteristics separately gives a more accurate predicted value. VMD evolved from EMD and can decompose the source signal into physically meaningful subsignals in different frequency bands, which makes it more suitable for power signal analysis.
The least squares support vector machine is an improved prediction algorithm based on the support vector machine (SVM) [14]. It retains the advantages of structural risk minimization and suitability for small samples, and it greatly reduces the complexity of the solution. LSSVM has been widely used in fields such as load prediction [15–18], but it is seldom used for energy loss prediction. The parameters of a prediction model have a great influence on its accuracy, so optimizing them is of great significance. Researchers have put forward various methods to optimize model parameters, most of which are based on metaheuristics [19–21]. Many researchers have applied such methods to parameter optimization for SVM and obtained good prediction results. Deng et al. [22] proposed a method for allocating airport gate resources based on a quantum evolutionary algorithm improved by PSO. The results showed that the algorithm optimized by PSO achieves a good scheduling effect. PSO is simple to operate, converges quickly, and has been widely applied in many fields.
Based on this, this paper proposes an energy loss calculation method for the LV-area based on variational mode decomposition and a least squares support vector machine optimized by particle swarm optimization. Firstly, the factors affecting the energy losses of the LV-area are listed, and the main influencing factors are determined by the grey correlation method. Then, the daily energy loss series of the LV-area is decomposed by VMD into a trend component and fluctuation components with different frequency characteristics. The main influencing factors obtained by the grey correlation method are used as the input variables of the LSSVM, the components obtained from the VMD of the energy loss series are taken as the output, and the theoretical energy loss calculation model of the LV-area is established. To improve the accuracy of the LSSVM, PSO is used to optimize its parameters. Finally, the effectiveness of the proposed method is verified on 252 LV-areas in a city in northern China.
The structure of this paper is as shown in Figure 1. Sections 2–4 introduce the theoretical principles of the grey correlation method, VMD, and PSO optimization of LSSVM, respectively. Section 5 introduces the establishment of the energy loss prediction model based on the above methods. In Section 6, a numerical example is used to analyse the validity of the model.

2. Grey Relational Principal Component Analysis of Influencing Factors of Energy Losses
The structure of the distribution network is often complex, and many factors affect the LV-area energy loss. They mainly include line length, insulation rate of the distribution network, cable rate, line cross-section, active energy supply, reactive energy supply, power factor, distribution transformer load rate, working voltage of the distribution transformer, and the number of users (equal to the number of meters). Some of these factors play a major role in the change of the energy loss rate. In this paper, the grey correlation method is used to find the five main factors affecting energy losses.
The energy loss rate in the LV-area is taken as the reference sequence, and the factors affecting the energy losses in the LV-area are taken as the comparison sequences. They are expressed as follows:

$$X_0 = \{x_0(k) \mid k = 1, 2, \ldots, n\}, \qquad X_i = \{x_i(k) \mid k = 1, 2, \ldots, n\}, \quad i = 1, 2, \ldots, m, \tag{1}$$

where $X_0$ denotes the real energy loss rate sequence; $X_i$ denotes the sequence of the i-th influencing factor; and $k$ indexes the data at the k-th measuring moment.
Because the influencing factors have different measurement units, the data cannot be compared directly and must be made dimensionless before the grey correlation method is applied. In this paper, the initial value operator is used to normalize the influencing factor series and the LV-area energy loss series; that is, each series is divided by its first value.
The correlation coefficient between the influencing factor series and the daily energy loss rate series at each moment can be expressed as follows [13]:

$$\xi_i(k) = \frac{\min_i \min_k |x_0(k) - x_i(k)| + \rho \max_i \max_k |x_0(k) - x_i(k)|}{|x_0(k) - x_i(k)| + \rho \max_i \max_k |x_0(k) - x_i(k)|}, \tag{2}$$

where $\rho$ is the resolution coefficient, which lies between 0 and 1 and is usually set to 0.5.
The correlation coefficient represents the degree to which an influencing factor affects the daily LV-area energy loss at one moment, and the average value of the correlation coefficients is taken as the correlation degree [13]:

$$r_i = \frac{1}{n} \sum_{k=1}^{n} \xi_i(k). \tag{3}$$

Here, $r_i$ is the grey correlation degree of the i-th influencing factor series with respect to the daily energy loss rate series. The closer $r_i$ is to 1, the greater the influence of this factor on the energy loss in the LV-area. The influencing factors are ranked by grey correlation degree, and the top 50% are selected as the input data of the theoretical energy loss calculation. In this paper, five of the above 10 influencing factors are selected: active energy supply, reactive energy supply, transformer working voltage, transformer power factor, and transformer load rate.
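The grey correlation ranking described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in this paper: the function names `grey_relational_degrees` and `select_top_half` are hypothetical, the initial value operator is taken to mean division of each series by its first value, and any numbers passed in are the caller's own.

```python
def grey_relational_degrees(reference, factors, rho=0.5):
    """Return one grey correlation degree per factor series.

    reference: list of energy loss rates (reference sequence).
    factors:   list of factor series, each as long as `reference`.
    rho:       resolution coefficient, usually 0.5.
    """
    # Initial value operator: divide every series by its first value.
    x0 = [v / reference[0] for v in reference]
    xs = [[v / series[0] for v in series] for series in factors]

    # Absolute differences between the reference and each factor series.
    diffs = [[abs(a - b) for a, b in zip(x0, x)] for x in xs]
    d_min = min(min(row) for row in diffs)
    d_max = max(max(row) for row in diffs)
    if d_max == 0.0:
        # All factor series coincide with the reference after normalization.
        return [1.0] * len(factors)

    degrees = []
    for row in diffs:
        coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in row]
        degrees.append(sum(coeffs) / len(coeffs))  # mean coefficient
    return degrees


def select_top_half(names, degrees):
    """Keep the names of the top 50% of factors by correlation degree."""
    ranked = sorted(zip(names, degrees), key=lambda item: item[1], reverse=True)
    keep = max(1, len(ranked) // 2)
    return [name for name, _ in ranked[:keep]]
```

With ten candidate factors, `select_top_half` returns five names, which matches the selection rule stated in the text. Series whose first value is zero would need a different normalization.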
3. Variational Mode Decomposition
VMD algorithm can find the centre frequency and bandwidth of each modal component adaptively, and VMD has good frequency separation ability. In this paper, by means of VMD, a group of energy loss rate subsequences representing different frequency components are obtained [23–25].
3.1. Realization of Variational Mode Decomposition
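The VMD algorithm consists of three steps: constructing a constrained variational model, Lagrange transformation, and alternating optimization.
Step 1: a constrained variational model is constructed. The original sequence $f(t)$ is decomposed into K modal subsequences by VMD, and $u_k(t)$ is the modal component with a specific centre frequency $\omega_k$. Under the condition that the sum of all modal components equals the original sequence, the constrained variational expression is obtained by minimizing the sum of the estimated bandwidths of the modes [26]:

$$\min_{\{u_k\},\{\omega_k\}} \left\{ \sum_{k=1}^{K} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t) \right] e^{-j\omega_k t} \right\|_2^2 \right\} \quad \text{s.t.} \quad \sum_{k=1}^{K} u_k(t) = f(t), \tag{4}$$

where K is the number of modes to be decomposed (K = 4 in this paper); $u_k$ and $\omega_k$ are the k-th (k = 1, 2, …, K) mode sequence and its centre frequency after VMD; $\delta(t)$ is the Dirac distribution; $*$ denotes convolution; and $f(t)$ is the original daily energy loss rate sequence.
Step 2: Lagrange transformation. The constrained variational problem above is transformed into an unconstrained one by introducing a Lagrange multiplier. The variational model represented by the augmented Lagrange function is as follows [26]:

$$L(\{u_k\},\{\omega_k\},\lambda) = \alpha \sum_{k=1}^{K} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t) \right] e^{-j\omega_k t} \right\|_2^2 + \left\| f(t) - \sum_{k=1}^{K} u_k(t) \right\|_2^2 + \left\langle \lambda(t),\, f(t) - \sum_{k=1}^{K} u_k(t) \right\rangle, \tag{5}$$

where $\lambda$ is the Lagrange multiplier and $\alpha$ is the quadratic penalty factor, which reduces the interference of Gaussian noise.
Step 3: alternating optimization. The variational model (5) in Step 2 can be solved with the following two iterative equations [26]:

$$\hat{u}_k^{n+1}(\omega) = \frac{\hat{f}(\omega) - \sum_{i \neq k} \hat{u}_i(\omega) + \hat{\lambda}(\omega)/2}{1 + 2\alpha (\omega - \omega_k^{n})^2}, \tag{6}$$

$$\omega_k^{n+1} = \frac{\int_0^{\infty} \omega \, |\hat{u}_k^{n+1}(\omega)|^2 \, d\omega}{\int_0^{\infty} |\hat{u}_k^{n+1}(\omega)|^2 \, d\omega}, \tag{7}$$

where $\hat{u}_k^{n+1}(\omega)$ is the Wiener-filtered spectrum of the k-th modal sequence; the modal sequence $u_k(t)$ in the time domain is the real part of the inverse Fourier transform of $\hat{u}_k(\omega)$; $\omega_k^{n+1}$ is the centre frequency of the corresponding modal sequence; and n is the number of iterations. When the accuracy meets the convergence criterion after iteration, the feasible solution of the constrained variational model is obtained.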
3.2. Variational Mode Decomposition of Daily Energy Loss Series
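The VMD process of the daily energy loss series is summarized as follows:
Step 1: input the daily LV-area energy loss series $f(t)$ to be decomposed, together with the number of modes $K$ and the penalty factor $\alpha$.
Step 2: initialize $\{\hat{u}_k^{1}\}$, $\{\omega_k^{1}\}$, and $\hat{\lambda}^{1}$, and set n to zero.
Step 3: update $\hat{u}_k$ and $\omega_k$ according to equations (6) and (7).
Step 4: update $\hat{\lambda}$ according to the following equation, where $\tau$ is the update step of the Lagrange multiplier:

$$\hat{\lambda}^{n+1}(\omega) = \hat{\lambda}^{n}(\omega) + \tau \left( \hat{f}(\omega) - \sum_{k=1}^{K} \hat{u}_k^{n+1}(\omega) \right). \tag{8}$$

Step 5: set n = n + 1 and repeat Steps 3 and 4 until the following criterion is satisfied, where $\varepsilon$ is the convergence tolerance [26]:

$$\sum_{k=1}^{K} \frac{\left\| \hat{u}_k^{n+1} - \hat{u}_k^{n} \right\|_2^2}{\left\| \hat{u}_k^{n} \right\|_2^2} < \varepsilon. \tag{9}$$

Once the criterion in equation (9) is met, the energy loss subsequences are obtained: the modal functions are $u_k$ and the centre frequencies are $\omega_k$, k = 1, 2, …, K. According to the VMD principle, when K increases beyond a certain value, the curve of the average instantaneous frequency bends downward noticeably, and this critical value of K is taken as the optimal number of modes. If the number of modes is too large, the components become intermittent, which lowers the average instantaneous frequency of the high-frequency components and causes the downward bend. Tests confirm that the best number of modes K for the energy loss series is 4.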
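The iteration in Steps 1 to 5 can be illustrated with a compact, dependency-free Python sketch. It is a simplified version under several assumptions and not the code used in this paper: it works on the non-negative half of a plain discrete Fourier transform without the boundary mirroring used in reference implementations, it expects an even-length series, the centre frequencies are normalised to the range 0 to 0.5 cycles per sample, and the names `dft_half`, `idft_half`, and `vmd` as well as the default values of `alpha`, `tau`, and `tol` are illustrative choices.

```python
import cmath
import math


def dft_half(x):
    """DFT of a real sequence, bins 0..N/2 only (non-negative frequencies)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * m * t / n) for t in range(n))
            for m in range(n // 2 + 1)]


def idft_half(spec, n):
    """Rebuild a real sequence of even length n from its half spectrum."""
    out = []
    for t in range(n):
        acc = 0.0
        for m, c in enumerate(spec):
            weight = 1.0 if m in (0, n // 2) else 2.0  # paired negative bins
            acc += weight * (c * cmath.exp(2j * math.pi * m * t / n)).real
        out.append(acc / n)
    return out


def vmd(signal, K=4, alpha=2000.0, tau=0.1, tol=1e-7, max_iter=500):
    """Return (modes, centre_frequencies) for a real, even-length signal."""
    n = len(signal)
    f_hat = dft_half(signal)
    bins = range(len(f_hat))
    freqs = [m / n for m in bins]             # normalised frequency
    u_hat = [[0j] * len(f_hat) for _ in range(K)]
    omega = [0.5 * k / K for k in range(K)]   # evenly spread initial centres
    lam = [0j] * len(f_hat)
    for _ in range(max_iter):
        change = 0.0
        for k in range(K):
            old = u_hat[k]
            new = []
            for m in bins:
                others = sum(u_hat[i][m] for i in range(K) if i != k)
                wiener = 1.0 + 2.0 * alpha * (freqs[m] - omega[k]) ** 2
                new.append((f_hat[m] - others + lam[m] / 2.0) / wiener)
            power = [abs(c) ** 2 for c in new]
            if sum(power) > 0.0:
                # Centre frequency: power-weighted mean frequency of the mode.
                omega[k] = sum(w * p for w, p in zip(freqs, power)) / sum(power)
            change += (sum(abs(a - b) ** 2 for a, b in zip(new, old))
                       / (sum(abs(b) ** 2 for b in old) + 1e-12))
            u_hat[k] = new
        # Dual ascent on the Lagrange multiplier.
        lam = [lam[m] + tau * (f_hat[m] - sum(u_hat[k][m] for k in range(K)))
               for m in bins]
        if change < tol:
            break
    return [idft_half(u, n) for u in u_hat], omega
```

Calling `vmd(series, K=4)` on a daily series returns four subsequences and their centre frequencies. The pure-Python transform costs on the order of N² operations, which is acceptable for short series such as 24 hourly points; longer series would call for an FFT library.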
4. Particle Swarm Optimization for Least Squares Support Vector Machine
In this paper, the kernel width and the penalty coefficient of the least squares support vector machine (LSSVM) are determined by iterative particle swarm optimization (PSO). This approach retains the advantages of the LSSVM, such as a small sample size requirement, fast calculation, and high accuracy, and also takes advantage of the fast search speed of PSO.
4.1. Least Squares Support Vector Machine and Its Regression Process
The least squares support vector machine is an improved prediction algorithm based on SVM. It uses a least squares loss function, which turns the inequality constraints of the SVM into equality constraints. This simplifies the solution of the Lagrange multipliers and transforms the original quadratic programming problem into a system of linear equations. LSSVM keeps the structural risk minimization of SVM and its suitability for small samples while greatly reducing the complexity of the solution.
The regression process of LSSVM is as follows:
Step 1: for a given sample set $\{(x_i, y_i)\}_{i=1}^{N}$, $x_i$ are the training inputs of the LSSVM and $y_i$ are the corresponding target values. For training samples whose relationship is nonlinear, the inputs can be mapped to a higher dimensional space through the function $\varphi(\cdot)$, and the optimal linear regression function is expressed as follows:

$$f(x) = w^{T} \varphi(x) + b, \tag{10}$$

where $w$ is the weight coefficient vector in the feature space and $b$ is the bias.
Step 2: according to the structural risk minimization criterion, the problem of determining equation (10) is transformed into the following optimization problem:

$$\min_{w, b, e} J(w, e) = \frac{1}{2} w^{T} w + \frac{\gamma}{2} \sum_{i=1}^{N} e_i^2 \quad \text{s.t.} \quad y_i = w^{T} \varphi(x_i) + b + e_i, \tag{11}$$

where $e_i$ is the error variable, $\gamma$ is the penalty coefficient, and $i = 1, 2, \ldots, N$.
Step 3: to solve the above optimization problem, the Lagrange function is constructed as follows:

$$L(w, b, e, a) = J(w, e) - \sum_{i=1}^{N} a_i \left[ w^{T} \varphi(x_i) + b + e_i - y_i \right], \tag{12}$$

where $a_i$ is the Lagrange multiplier.
Step 4: equation (12) is solved with the Karush–Kuhn–Tucker conditions to obtain $a_i$ and $b$, and the regression function is obtained:

$$f(x) = \sum_{i=1}^{N} a_i K(x, x_i) + b. \tag{13}$$

In formula (13), $K(x, x_i)$ is the kernel function. The commonly used kernel functions are the linear kernel in formula (14) and the RBF kernel in formula (15):

$$K(x, x_i) = x^{T} x_i, \tag{14}$$

$$K(x, x_i) = \exp\left( -\frac{\| x - x_i \|^2}{2\sigma^2} \right), \tag{15}$$

where $\sigma$ is the width of the kernel function.
The penalty coefficient $\gamma$ and the kernel width $\sigma$ have a great influence on the accuracy of the LSSVM prediction model. As $\gamma$ decreases, the generalization ability of the model increases, but the training error also increases. The smaller $\sigma$ is, the higher the complexity of the model, while a larger $\sigma$ leads to low learning efficiency. A reasonable selection of $\gamma$ and $\sigma$ can improve the accuracy of the LSSVM. In this paper, the particle swarm optimization algorithm is used to optimize these two parameters.
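Because the equality constraints reduce LSSVM training to a system of linear equations, the regression can be sketched without any optimization library. The following Python sketch solves the standard LSSVM system, in which the first row enforces that the multipliers sum to zero and the remaining rows add 1/γ to the diagonal of the kernel matrix. It is an illustration under assumptions, not this paper's implementation: the function names are hypothetical, `gamma` denotes the penalty coefficient and `sigma` the kernel width, and a small Gaussian elimination routine stands in for a numerical library.

```python
import math


def rbf_kernel(a, b, sigma):
    """RBF kernel with width sigma."""
    d2 = sum((p - q) ** 2 for p, q in zip(a, b))
    return math.exp(-d2 / (2.0 * sigma ** 2))


def linear_kernel(a, b, sigma=None):
    """Linear kernel; sigma is ignored and kept only for a uniform signature."""
    return sum(p * q for p, q in zip(a, b))


def solve(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [rhs[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def lssvm_fit(X, y, gamma, sigma, kernel=rbf_kernel):
    """Return (b, a): the bias and the Lagrange multipliers."""
    n = len(X)
    A = [[0.0] + [1.0] * n]                      # row enforcing sum(a) = 0
    for i in range(n):
        row = [1.0]
        for j in range(n):
            k_ij = kernel(X[i], X[j], sigma)
            row.append(k_ij + (1.0 / gamma if i == j else 0.0))
        A.append(row)
    sol = solve(A, [0.0] + list(y))
    return sol[0], sol[1:]


def lssvm_predict(x, X, b, a, sigma, kernel=rbf_kernel):
    """Evaluate the regression function at a new input x."""
    return b + sum(a_i * kernel(x, x_i, sigma) for a_i, x_i in zip(a, X))
```

Passing `kernel=linear_kernel` gives the linear variant, and the default gives the RBF variant. A larger `gamma` weights the training error more heavily, so the fitted curve follows the samples more closely.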
4.2. Particle Swarm Optimization
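The particle swarm optimization algorithm is based on the study of the foraging behavior of birds. The parameters $\gamma$ and $\sigma$ that need to be optimized are regarded as the position of a search individual in the whole search region, that is, a particle. In the iterative process, each particle dynamically adjusts its velocity and position according to its own optimal position pbest and the population optimal position gbest:

$$v_i^{t+1} = \omega v_i^{t} + c_1 r_1 \left( p_{best,i} - x_i^{t} \right) + c_2 r_2 \left( g_{best} - x_i^{t} \right), \tag{16}$$

$$x_i^{t+1} = x_i^{t} + \eta \, v_i^{t+1}, \tag{17}$$

$$\omega = \omega_{\max} - \left( \omega_{\max} - \omega_{\min} \right) \frac{t}{t_{\max}}. \tag{18}$$

Equations (16) and (17) are the update formulas of particle velocity and particle position, respectively, where $v_i^{t}$ and $x_i^{t}$ are the velocity and position of the i-th particle at iteration t. Equation (18) is the inertia weight update formula, where $\omega$ is the inertia weight, $\omega_{\max}$ and $\omega_{\min}$ are its upper and lower limits, and $t_{\max}$ is the maximum number of iterations. The larger the value of $\omega$, the stronger the global optimization ability of the particles and the weaker their local optimization ability, and vice versa. $c_1$ and $c_2$ are acceleration constants: $c_1$ reflects the individual's own cognitive ability, and $c_2$ reflects the individual's cognition of the whole group. $r_1$ and $r_2$ are random numbers between 0 and 1, and $\eta$ is the constraint factor.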
4.3. Least Squares Support Vector Machine Optimized by Particle Swarm Optimization
The parameters of the LSSVM model are optimized by the particle swarm optimization algorithm: $\gamma$ and $\sigma$ are adjusted by continuous iteration to minimize the objective function, as follows:

$$F = \frac{1}{N} \sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^2, \tag{19}$$

where $y_i$ is the actual output of the i-th sample, $\hat{y}_i$ is the output calculated by the LSSVM from the corresponding inputs, and N is the number of samples.
The process of optimizing the $\gamma$ and $\sigma$ parameters of the LSSVM with PSO is as follows:
Step 1: initialize the PSO parameters, including the number of particles L, the individual extreme acceleration coefficient $c_1$, the global acceleration coefficient $c_2$, and the maximum number of iterations $t_{\max}$.
Step 2: initialize all particle position vectors randomly.
Step 3: select formula (19) as the fitness function, train the LSSVM with the current particle position vector, calculate the fitness value, and update the individual optimal value of each particle and the global optimal value.
Step 4: determine the optimal position of the swarm and update the velocity and position of each particle.
Step 5: check whether the end-of-iteration condition is met. If the fitness value meets the convergence criterion or the number of iterations reaches the maximum set value, the iteration ends; the global optimal solution is taken from the feasible solutions obtained in the iterations, and the optimal values of $\gamma$ and $\sigma$ are output. Otherwise, go back to Step 3 and continue the iteration.
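The search loop of Steps 1 to 5 can be sketched in Python as follows. This is a minimal illustration, not the authors' code, and several choices are assumptions: the inertia weight decreases linearly from `w_max = 0.9` to `w_min = 0.4` (these limits are not given in the text), the constraint factor `eta` is set to 1, positions are clipped to the search bounds, and `objective` is a placeholder for the LSSVM fitness, so any function of the two parameters can be passed in. The defaults for the number of particles, the acceleration coefficients, and the iteration count follow the values reported in the numerical example.

```python
import random


def pso(objective, bounds, n_particles=30, n_iter=100, c1=1.5, c2=1.7,
        w_max=0.9, w_min=0.4, eta=1.0, seed=0):
    """Minimise `objective` over the box `bounds`; return (best_pos, best_val)."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial positions inside the bounds, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for t in range(n_iter):
        w = w_max - (w_max - w_min) * t / n_iter   # assumed linear decrease
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Position update, clipped to the search bounds.
                pos[i][d] = min(max(pos[i][d] + eta * vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:                 # update individual best
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:                # update global best
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val
```

To tune an LSSVM, `objective` would train the model with the candidate penalty coefficient and kernel width and return the resulting prediction error on the training or validation samples. Each fitness evaluation then costs one model fit, which is why the particle count and iteration limit dominate the running time.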
5. Energy Loss Calculation Method Based on Grey Correlation Method and VMD-PSO-LSSVM Model
A theoretical energy loss calculation method for the LV-area based on the grey correlation method and the VMD-PSO-LSSVM model is designed in this paper. Firstly, the energy loss rate sequence is decomposed by VMD, and four groups of subsequences reflecting different frequency components are obtained. Then, each of the four subsequences is paired with the set of energy loss influencing factors (obtained by the grey correlation method) to train one of four different LSSVM calculation models. Finally, the results of the four LSSVM models are added to get the final result. In short, to obtain more accurate results, this method designs a separate energy loss calculation model for each frequency component of the energy loss rate series.
The calculation process of this method is shown in Figure 2, and the detailed steps are as follows:
Step 1: the main influencing factors of energy losses are determined by the grey correlation method, including active energy supply, reactive energy supply, transformer voltage, transformer power factor, and transformer load rate.
Step 2: the training samples are the hourly energy loss rates of the LV-area over 24 hours a day for 30 days, and they are used as the input data of the VMD model.
Step 3: the daily energy loss rate data series are decomposed by VMD with K = 4, and four groups of daily energy loss rate series with different frequency components are obtained, each expressed as a daily energy loss rate matrix of 30 rows and 24 columns (30 × 24).
Step 4: the five main influencing factors determined by the grey correlation method are combined with the four matrices obtained from the VMD of the daily energy loss rate of the training samples, and each combination is input into one of four different PSO-optimized LSSVM models for energy loss calculation.
Step 5: the four groups of energy loss calculation results at different frequency scales are summed to get the theoretical energy loss calculation result.
The time interval of the daily energy loss rate data and the influencing factor data is 1 hour.

The mean absolute percentage error (MAPE) and the maximum relative error (Emax) are used to evaluate the accuracy of the energy loss calculation. The indexes are calculated as follows [6]:

$$\mathrm{MAPE} = \frac{1}{N} \sum_{i=1}^{N} \left| \frac{y_i - \hat{y}_i}{y_i} \right| \times 100\%, \tag{20}$$

$$E_{\max} = \max_{1 \le i \le N} \left| \frac{y_i - \hat{y}_i}{y_i} \right| \times 100\%, \tag{21}$$

where $y_i$ and $\hat{y}_i$ are the actual value and the calculated value of the theoretical energy loss of the i-th test sample, respectively, and N is the number of test samples.
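The two error indexes are simple to compute. The short Python sketch below assumes the usual definitions, with the actual value in the denominator and the result expressed in percent; the function names `mape` and `max_relative_error` are illustrative.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


def max_relative_error(actual, predicted):
    """Largest absolute relative error over all samples, in percent."""
    return 100.0 * max(abs((a - p) / a) for a, p in zip(actual, predicted))
```

Both functions divide by the actual value, so they are undefined when an actual energy loss rate is exactly zero; such samples would need to be excluded or handled separately.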

6. Numerical Example and Analysis
6.1. An Overview of the Numerical Example
(1) Data source: the algorithm is verified with the energy loss data of 252 LV-areas in a city in northern China. Bad data in these records have already been processed. Here, the energy loss estimation for one of the distribution transformers is introduced in detail.
(2) Input data: the data cover May 1, 2020, to May 31, 2020, and consist of the influencing factors of energy loss obtained by the grey correlation method (active energy supply, reactive energy supply, transformer voltage, transformer power factor, and transformer load rate) and the daily energy loss rate at the low-voltage side of the distribution transformer. A total of 31 groups of experimental data are selected. The first 30 days of energy loss rate data and influencing factors are used as the training set, and the energy loss rate data of the 31st day are used as the test set; some of the data are shown in Table 1. Taking the daily energy loss rate series on May 31, 2020, as an example, the series is decomposed into four 1 × 24 subsequences by the VMD method. The original series of the daily energy loss rate and the four subsequences decomposed by VMD are shown in Figure 3. As can be seen from Figure 3, the four subsequences represent different frequency components obtained from VMD. Subsequence 1 is the trend component, which follows the change trend of the actual daily energy loss rate; subsequences 2 to 4 are the fluctuation components, of which subsequence 2 reflects the shape of the curve, and subsequences 3 and 4 reflect its random fluctuation details. Because subsequences 1 and 2 fluctuate more smoothly than the others, LSSVM1 and LSSVM2 use the linear kernel function (formula (14)), while LSSVM3 and LSSVM4 use the RBF kernel function (formula (15)) because subsequences 3 and 4 fluctuate more strongly.
(3) Calculation model: to analyse the effectiveness of the model, this paper compares three algorithms: VMD-PSO-LSSVM (the algorithm of this paper), PSO-LSSVM without variational mode decomposition (PSO-LSSVM), and LSSVM without any optimization algorithm (LSSVM).
(4) Model parameters: in the VMD-PSO-LSSVM algorithm, the number of modes K is 4; the number of particles L is 30; the individual extreme acceleration coefficient is 1.5; the global acceleration coefficient is 1.7; the maximum number of iterations is 100; and the initial parameters of the four LSSVM models are 30 and 4. The PSO-LSSVM and LSSVM parameters are set the same as the VMD-PSO-LSSVM parameters.
(5) Calculation objective: the daily energy loss rate series on May 31, 2020, is calculated from the daily energy loss rate series and its influencing factors from May 1 to May 30, 2020, together with the influencing factors of energy loss on May 31, 2020.
6.2. Comparative Analysis of Calculation Results
The daily energy loss rate on May 31, 2020, is obtained with the above three calculation models, as shown in Figure 4. The prediction results are shown in Table 2. It can be seen from Figure 4 that the results calculated by VMD-PSO-LSSVM are closer to the actual values than those of the other two calculation methods, indicating that VMD-PSO-LSSVM gives the more accurate calculation results.

6.3. Error Comparison and Efficiency Analysis
The average calculation error and average forecast time of the above three calculation models on May 31, 2020, are shown in Table 3.
Comparing the calculation results of the three models, the error indexes MAPE and Emax of VMD-PSO-LSSVM are the smallest, which indicates that this method has the highest theoretical energy loss calculation accuracy. The error indexes of VMD-PSO-LSSVM are smaller than those of PSO-LSSVM, which verifies the necessity of including the VMD algorithm. In terms of calculation time, VMD-PSO-LSSVM takes the longest of the three models, but its accuracy is significantly better than that of the other two.
Statistical hypothesis testing is also used to verify the accuracy advantage of the model. In this paper, the two-tailed t-test is used to test whether the prediction errors of the three methods differ significantly, so as to judge whether VMD and PSO significantly improve the prediction accuracy of the model. The results of hypothesis testing are shown in Table 4.
It can be seen from Table 4 that the prediction accuracy of the VMD-PSO-LSSVM model is significantly different from that of the other two models. At the same time, the LSSVM model optimized by PSO also has significant accuracy improvement compared with the model without parameter optimization. The results demonstrate that the VMD and PSO methods greatly improve the prediction accuracy of the prediction model.
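A two-tailed t-test on prediction errors can be sketched as follows. The text does not state whether the test is paired or which significance level is used, so this Python sketch assumes a paired test on the per-hour absolute errors of two models; the function name `paired_t_statistic` is hypothetical.

```python
import math


def paired_t_statistic(errors_a, errors_b):
    """Return (t, degrees_of_freedom) for paired error samples."""
    diffs = [a - b for a, b in zip(errors_a, errors_b)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)  # sample variance
    if var == 0.0:
        raise ValueError("all paired differences are identical")
    return mean / math.sqrt(var / n), n - 1
```

The absolute value of the returned statistic is compared with the two-tailed critical value for the given degrees of freedom. For 24 hourly errors (23 degrees of freedom) and a 5% significance level, that critical value is about 2.069, so a larger absolute statistic would indicate a significant difference between the two models under these assumptions.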
Using the same method to estimate the energy losses of other 251 LV-areas, the average calculation error MAPE is 1.203%. The relative advantages and disadvantages of the three methods are similar to those in Table 3, which shows that the method proposed in this paper is effective.
7. Conclusion
In this paper, an energy loss calculation method of LV-area based on variational mode decomposition and PSO optimized LSSVM is proposed.
Firstly, the grey correlation method is used to analyse a series of factors that may affect the calculation of energy losses, and five main factors are determined to simplify the calculation. This data-driven feature selection method makes the model more objective and enhances the universality of the calculation model. The authors have verified that this method can be used not only for energy loss prediction in the LV-area but also for energy loss prediction in the 10 kV distribution network.
Secondly, the subsequences of the daily energy loss rate decomposed by VMD are combined with the influencing factors and used to build four different LSSVM calculation models, and the calculation results of the models are added to get the final result. Because the energy loss sequence has different frequency characteristics, it is decomposed into components in different frequency bands, and each component is predicted by a matched model. Energy loss calculation models designed for the different frequency components of the energy loss rate series make the calculation more refined. Compared with the traditional algorithm, the prediction model proposed in this paper has better adaptability and calculation accuracy.
Data Availability
The data used to support the findings of this study are included within the article.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Authors’ Contributions
Keyan Liu conceptualized the study, designed the methodology, conducted the formal analysis, and wrote the original draft. Dongli Jia curated the data, administered the project, provided resources, and wrote, reviewed, and edited the article. Fengzhan Zhao supervised the project. Qicheng Zhang was responsible for the software. Shuai Hao conducted the investigation. Shuai Zhang validated the findings.
Acknowledgments
This research was funded by the Science and Technology Project of State Grid Corporation of China (SGTJDK00DWJS1800014): “Lean Line Loss Management Technology and Application of Distribution Network Based on Big Data.”