Abstract

To estimate the power system state accurately and identify anomalies in real time, an anomaly identification method for power system state estimation based on the fuzzy C-means (FCM) algorithm is proposed. Considering the scale and redundancy of power system measurement data, the effective measurement data of the power system are extracted by principal component analysis. On this basis, a power system state estimation model built with a particle swarm optimization support vector machine is used to judge the operational state of the power system. The FCM algorithm is then used to cluster the measured data and identify anomalies in the power system. The experimental results show that this method can accurately estimate the state of the power system and has the highest anomaly identification accuracy among the compared methods. When the equivalent measurement data are affected by noise, the identification delay of this method is 1 s, which indicates good real-time performance.

1. Introduction

The energy management system (EMS) of the power system dispatching center is the key to ensuring the safe and economic operation of the power grid. As an important part of the EMS, state estimation provides complete and reliable real-time data on system operation and serves other advanced application software [1]. With the rapid development of the economy, the demand for power is increasing, and the structure and working mode of power systems are becoming increasingly complex. To ensure the stability, economy, and security of power system operation, the automation level of the power system dispatching center needs to be constantly improved [2]. Power system state estimation is one of the most important applications in energy management systems, and its performance directly affects the reliability and correctness of the results of many other advanced applications [3]. The power system dispatching center makes decisions according to the system operating conditions provided by state estimation, so state estimation is directly related to the safe operation of the power grid. Obtaining state estimation software with superior performance has been a goal of engineering and academic circles for many years [4–6].

To understand the operation of a power system, it is necessary to obtain as much system measurement information as possible [7]. Such measurement information can be used to estimate the power system state. At present, there is considerable research on power system state estimation and on anomaly detection and identification.

Reference [8] comprehensively considered the circuit correlation between measurement data and between measurement data and network parameters and proposed a forward-backward tracking identification method for complex state estimation anomaly detection. This method can effectively identify parameter errors, bad measurement data, and topology errors, but it does not consider the scale and redundancy of measurement data. In practical applications, in large-scale measurements, too much redundant measurement data will affect the identification effect of anomaly detection. Reference [9] proposed a discrimination method using GPU parallel acceleration and the maximum value of the maximum measurement point. This method mainly uses the anomaly detection identification method of graphics processor parallel acceleration to complete the anomaly detection identification. The method has good identification ability for anomaly detection, and because the calculation time is short, it can meet the needs of production. Reference [10] proposed a new improved state transition algorithm to identify the parameters of the chaotic power system. This method first analyzed the mathematical model and chaotic state of a second-order chaotic power system and fourth-order chaotic power system, respectively, and then used the state transition algorithm to identify the unknown parameters. Among them, a local optimal judgment mechanism was set to judge whether the state was premature. The reverse transfer mechanism was added to make the state jump out of the precocious state in order to avoid falling into the local optimal solution and improve the diversity of the population; and finally, the improved state transition algorithm was applied to the parameter identification of second-order and fourth-order chaotic power systems, respectively, which is proved to have the identification ability. 
However, when the method is applied in practice, the power system contains many components, and it is difficult to use the state transition algorithm to identify the unknown parameters, which reduces the identification efficiency. Reference [11] proposed a wind farm anomaly data recognition method based on composite machine learning. Using the horizontal and vertical quartile methods, the abnormal points of wind speed and power data are identified from the bilateral relationship between wind speed and power output, and the cleaning effects of different methods on the data are compared. The vertical quartile method and composite machine learning are then combined to identify the abnormal points of wind speed and power data of wind farms. Reference [12] proposed a data-driven detection technology to detect abnormal attack data in the power grid. In that work, a sparse identification method based on nonlinear dynamics and a neural network based on physical knowledge are used to detect the attack parameters in the power grid, and simulations on IEEE 6 and other systems show that the method can effectively detect abnormal attack data. Reference [13] proposed a method for detecting and recognizing abnormal data of distributed Internet of Things monitoring terminals based on space-time correlation. The time correlation of the monitoring terminal data is used to calculate the distance matrix between data, and an improved DBSCAN algorithm is combined with it to extract the geometric features of the spatial correlation of terminal nodes, which completes the detection of abnormal data. However, these three methods do not consider the impact of data scale and data redundancy on the power system state when identifying bad data, and their recognition effect needs to be improved.

Therefore, this paper proposes a method for identifying anomaly detection of the power system state estimation based on fuzzy C-means (FCM) algorithm. The effective measurement data of the power system is extracted by the principal component analysis method, which is used as training samples and input into the particle swarm optimization-support vector machine (PSO-SVM) estimation model to estimate the power system state. Considering the complexity of power system, the FCM algorithm is improved by the locust algorithm and Lagrange multiplier method to solve the optimal clustering center and realize power system state estimation anomaly detection.

2. Anomaly Identification in Power System State Estimation

2.1. Effective Measurement Data Extraction of the Power System Based on Principal Component Analysis
2.1.1. Power System State Measurement Matrix and Its Centralization

Principal component analysis can be applied to any engineering or scientific measurement data [14]. Assume that the number of power system measurement data samples is $n$ and that each sample contains $m$ types of data. A measurement matrix is constructed by arranging each type of data.

For the convenience of calculation, the measurement matrix is expressed in terms of the measurement vectors of the power system state, namely,

$$X = [x_1, x_2, \ldots, x_m] \tag{1}$$

Among them, $x_j$ is the measurement vector of the $j$th type under the power system state, with entries $x_{ij}$, $i = 1, \ldots, n$. The sample mean of the power system state measurement matrix is calculated as follows:

$$\bar{x}_j = \frac{1}{n}\sum_{i=1}^{n} x_{ij} \tag{2}$$

Principal component analysis usually requires the matrix to have zero sample mean, i.e., to be centralized, so that the measurement matrix after centralization is as follows:

$$\tilde{x}_{ij} = x_{ij} - \bar{x}_j \tag{3}$$

Among them, $\tilde{x}_{ij}$ is the measurement data sample of the power system state after centralization.

2.1.2. Singular Value and Singular Value Decomposition of Power System Measurement Matrix

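Let the power system state measurement matrix $X$ be an $n \times m$ rectangular matrix, let $\{v_1, \ldots, v_m\}$ be a unit orthogonal basis of $\mathbb{R}^m$ consisting of eigenvectors of the symmetric matrix $X^{T}X$, and let the corresponding state eigenvalues be $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_m$. Then, we obtain

$$\|Xv_i\|^2 = (Xv_i)^{T}(Xv_i) = v_i^{T}X^{T}Xv_i = \lambda_i \tag{4}$$

where the superscript $T$ indicates transposition and $v_i$ is the $i$th unit orthogonal basis vector in $\mathbb{R}^m$. The singular values of the power system measurement matrix are set as

$$\sigma_i = \sqrt{\lambda_i} \tag{5}$$

with $\lambda_i$ the $i$th state eigenvalue. $\sigma_1$ is the maximum value of $\|Xv\|$ over all unit vectors $v$, and this maximum is attained at $v_1$, which corresponds to the eigenvalue $\lambda_1$.

Assuming that the rank of the power system state measurement matrix is $r$, a diagonal matrix is constructed as follows:

$$\Sigma = \begin{bmatrix} D & 0 \\ 0 & 0 \end{bmatrix} \tag{6}$$

In formula (6), $D$ is a diagonal matrix of order $r$, $\sigma_1 \geq \sigma_2 \geq \cdots \geq \sigma_r > 0$, and the elements on the main diagonal of $D$ are the first $r$ singular values of $X$ in descending order. At this time, an $n \times n$ orthogonal matrix $U$ and an $m \times m$ orthogonal matrix $V$ can be used to perform the singular value decomposition of the power system state measurement matrix $X$:

$$X = U\Sigma V^{T} \tag{7}$$

Unitize $Xv_i$, calculate $u_i = Xv_i/\sigma_i$ for $i = 1, \ldots, r$, and then construct the orthogonal matrix $U$.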
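The construction in this subsection can be checked numerically. The following is a minimal sketch, assuming NumPy and a small random matrix in place of real measurement data; the function name `svd_via_eigen` and the variable names are illustrative and do not come from the paper. It builds the singular values from the eigenvalues of the symmetric matrix formed from the measurement matrix and obtains the left vectors by unitizing the transformed right vectors.

```python
import numpy as np

def svd_via_eigen(X):
    """Reduced SVD of X built from the eigen-decomposition of X^T X."""
    # Eigenvalues and orthonormal eigenvectors of the symmetric matrix X^T X.
    eigvals, V = np.linalg.eigh(X.T @ X)
    # eigh returns ascending order; reorder to descending.
    order = np.argsort(eigvals)[::-1]
    eigvals, V = eigvals[order], V[:, order]
    # Singular values are the square roots of the (non-negative) eigenvalues.
    sigma = np.sqrt(np.clip(eigvals, 0.0, None))
    # Numerical rank: keep singular values that are not negligibly small.
    r = int(np.sum(sigma > 1e-10 * sigma[0]))
    # Unitize X v_i to obtain the left singular vectors u_i = X v_i / sigma_i.
    U_r = (X @ V[:, :r]) / sigma[:r]
    return U_r, sigma[:r], V[:, :r]

# Hypothetical stand-in for a measurement matrix (6 samples, 4 data types).
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 4))
U_r, sigma, V_r = svd_via_eigen(X)
X_rebuilt = U_r @ np.diag(sigma) @ V_r.T
```

In practice `numpy.linalg.svd` would be called directly; the explicit route is shown only because it mirrors the steps described in the text.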

2.1.3. Effective Measurement Data Extraction

In principal component analysis, an orthogonal matrix $P$ is constructed and used to perform an orthogonal transformation on the centralized power system measurement matrix [15]. For a centralized measurement vector $\tilde{x}$, the transformation is $y = P^{T}\tilde{x}$. The components of the new measurement vector after transformation are uncorrelated, and their variances are arranged in descending order. At this time, the new observation vector is obtained from the original measurement vector:

$$y = P^{T}\tilde{x} = [y_1, y_2, \ldots, y_m]^{T} \tag{8}$$

For the covariance matrix $C$ of the power system measurement matrix, the unit eigenvector corresponding to the power system state measurement eigenvalue $\lambda_i$ is $p_i$, and $y_i = p_i^{T}\tilde{x}$ $(i = 1, 2, \ldots, m)$ are the first principal component, the second principal component, and so on up to the $m$th principal component of the measurement matrix. The size of the eigenvalue reflects the degree of dispersion of the corresponding principal component samples. The principal components to be retained can be determined by the cumulative contribution rate [16]. The cumulative contribution rate of the first $k$ principal components is as follows:

$$\eta_k = \frac{\sum_{i=1}^{k}\lambda_i}{\sum_{i=1}^{m}\lambda_i} \tag{9}$$

When estimating the power system state, the measurement data with the largest cumulative contribution rate are used as the effective measurement data sample.
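As an illustration of the extraction step, the sketch below centralizes a data matrix, diagonalizes its covariance matrix, and keeps the leading principal components until the cumulative contribution rate reaches a threshold. It assumes NumPy; the 0.95 threshold, the function name `pca_select`, and the synthetic redundant data are assumptions made for the example and are not taken from the paper.

```python
import numpy as np

def pca_select(X, threshold=0.95):
    """Keep the leading principal components up to a cumulative contribution rate."""
    Xc = X - X.mean(axis=0)                      # centralization (zero sample mean)
    cov = Xc.T @ Xc / (Xc.shape[0] - 1)          # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)       # symmetric eigen-decomposition
    order = np.argsort(eigvals)[::-1]            # descending eigenvalues
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    contrib = np.cumsum(eigvals) / np.sum(eigvals)   # cumulative contribution rate
    # Smallest k whose cumulative contribution reaches the threshold.
    k = min(int(np.searchsorted(contrib, threshold)) + 1, len(eigvals))
    scores = Xc @ eigvecs[:, :k]                 # data projected on k components
    return scores, contrib, k

# Synthetic redundant data: 8 measurement types driven by 3 latent factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 8))
X = latent @ mixing + 0.01 * rng.normal(size=(200, 8))
scores, contrib, k = pca_select(X)
```

Because the eight columns are nearly linear combinations of three factors, very few components are needed to reach the threshold, which is the kind of redundancy reduction the section describes.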

2.2. Application of PSO-SVM in Power System State Estimation

The effective measurement data samples extracted in Section 2.1 are taken as the input samples of power system state estimation, and a power system state assessment model is constructed based on PSO-SVM, that is, a least squares support vector machine (SVM) whose parameters are tuned by particle swarm optimization (PSO), to estimate whether there is an abnormality in the power system state.

In the measurement sample extracted in subsection 2.1, if the dimension measurement vector is , the power system state estimation steps are as follows.

The sampling time data before time is used as the state estimation sample set of the power system. The sample set is divided into training and test samples. Firstly, the corresponding relationship among the state values of the power system is established at the previous time and the power system state value at the previous time, and then move the training window back to obtain the measurement data input by the least squares support vector machine prediction model and the output matrix of the state estimation as and are as follows:
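The moving training window can be pictured with a small sketch. It assumes NumPy and a one-dimensional series of state values; the window length of 3, the toy series, and the function name `make_windows` are illustrative choices, not values from the paper. Each input row holds the values at the previous sampling times and the target is the value at the next time.

```python
import numpy as np

def make_windows(series, window):
    """Build rows of `window` past values and the following value as target."""
    series = np.asarray(series, dtype=float)
    n_rows = len(series) - window
    # Row i contains series[i], ..., series[i + window - 1].
    X = np.stack([series[i:i + window] for i in range(n_rows)])
    # The target of row i is the value immediately after its window.
    y = series[window:]
    return X, y

states = np.arange(10.0)            # toy state values 0, 1, ..., 9
X, y = make_windows(states, window=3)

# Chronological split into training and test samples (split point assumed).
split = 5
X_train, y_train = X[:split], y[:split]
X_test, y_test = X[split:], y[split:]
```

A chronological split is used here so that the test samples always follow the training samples in time.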

The test sample set is and . The test sample set is input into the least squares support vector machine prediction model, and the power system state estimation problem is converted into the objective function with the minimum power system state estimation error: , and are penalty factors and relaxation factors in turn, and is the weight. The reasonable setting of and can ensure the minimum error of power system state estimation results. Based on this, the particle swarm optimization algorithm is adopted to set and as follows:

(1) Particle swarm optimization initialization. Initialize the parameters of the particle swarm optimization algorithm, and set the number of iterations, the population size, the weight, and the learning factor. The penalty factor and relaxation factor are regarded as particles, and the initial velocity and initial position of the particles are randomly generated. The initialization range of the penalty factor and the kernel width is and , respectively. The current position of each particle is set as , and the current position of the best particle in the population is set as .

The least squares SVM is trained with the training samples to build a prediction model, and the measurement samples are fed into the prediction model to obtain the power system state estimation value. At this time, the individual fitness value of each particle representing the penalty factor and relaxation factor is as follows:

Among them, represents the number of samples of the measured data. Under the estimation of the th measurement sample, the actual value and the predicted value of the power system state are and , respectively.

(2) Fitness comparison. Compare the fitness value calculated for each particle with the fitness value of the current individual optimal solution of that particle. If is less than , the particle replaces the current individual optimal solution. The best result of each particle is then compared with the best value of the current population optimal solution. If is less than , the original population optimal solution is replaced with this particle.

(3) Iteration number analysis. Determine whether the number of iterations has reached the maximum. If not, update the velocity and position of each particle and return to the fitness evaluation. If the number of iterations has reached the maximum, map the global optimal particle to the penalty factor and the kernel width of the least squares support vector machine.

(4) State estimation. The normalized penalty factor and kernel width obtained in the previous step and the measured sample data of the power system are used to train the least squares support vector machine, and then the measured data samples are imported into the prediction model to estimate the power system state at the time of .
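The tuning loop in steps (1) to (4) can be sketched as follows. This is a minimal illustration under several assumptions that are not stated in the paper: NumPy only, a textbook least squares SVM regression with an RBF kernel, mean squared error on held-out samples as the fitness, a noisy sine curve as stand-in data, and hypothetical search ranges and PSO coefficients. The two tuned quantities are the penalty factor (`gamma`) and the kernel width (`sigma`).

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    """Gaussian (RBF) kernel matrix between the rows of A and B."""
    d2 = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma**2))

def lssvm_fit(X, y, gamma, sigma):
    """Solve the LS-SVM linear system for the bias and the dual weights."""
    n = len(y)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = rbf_kernel(X, X, sigma) + np.eye(n) / gamma
    sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
    return sol[0], sol[1:]

def lssvm_predict(X_train, bias, alpha, sigma, X_new):
    return rbf_kernel(X_new, X_train, sigma) @ alpha + bias

def pso_minimize(fitness, lower, upper, n_particles=12, n_iter=20,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Basic particle swarm optimization that minimizes `fitness`."""
    rng = np.random.default_rng(seed)
    span = upper - lower
    pos = lower + rng.random((n_particles, len(lower))) * span   # random positions
    vel = (rng.random(pos.shape) - 0.5) * 0.1 * span              # random velocities
    pbest = pos.copy()
    pbest_fit = np.array([fitness(p) for p in pos])
    g = int(np.argmin(pbest_fit))
    gbest, gbest_fit = pbest[g].copy(), float(pbest_fit[g])
    history = [gbest_fit]
    for _ in range(n_iter):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        vel = inertia * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, lower, upper)
        fit = np.array([fitness(p) for p in pos])
        better = fit < pbest_fit                 # update individual optima
        pbest[better] = pos[better]
        pbest_fit[better] = fit[better]
        g = int(np.argmin(pbest_fit))
        if pbest_fit[g] < gbest_fit:             # update population optimum
            gbest, gbest_fit = pbest[g].copy(), float(pbest_fit[g])
        history.append(gbest_fit)
    return gbest, gbest_fit, history

# Stand-in data: a noisy sine curve split into training and validation samples.
rng = np.random.default_rng(1)
x = np.sort(rng.uniform(0.0, 2.0 * np.pi, 60))
y = np.sin(x) + 0.05 * rng.normal(size=x.size)
X_train, y_train = x[::2, None], y[::2]
X_val, y_val = x[1::2, None], y[1::2]

def fitness(params):
    gamma, sigma = params
    bias, alpha = lssvm_fit(X_train, y_train, gamma, sigma)
    pred = lssvm_predict(X_train, bias, alpha, sigma, X_val)
    return float(np.mean((y_val - pred) ** 2))

lower, upper = np.array([0.1, 0.1]), np.array([100.0, 5.0])   # assumed ranges
best_params, best_mse, history = pso_minimize(fitness, lower, upper)
```

The recorded `history` holds the best fitness after each iteration, so it can be used to check that the population optimum never gets worse as the swarm moves.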

2.3. Application of Improved FCM Algorithm in Anomaly Detection Identification

After the power system state is estimated in Section 2.2, the operation state of the power system is known. If the operation state is abnormal, the anomaly must be identified quickly so as to provide a basis for handling it. Because there are many components and lines in the power system, there are many types of measurement data. The FCM algorithm is a partition-based clustering algorithm that uses membership degrees to determine the degree to which each data point belongs to each cluster. It does not need human intervention during execution, and its calculation is simple, fast, intuitive, and easy to implement on a computer. Based on this, the FCM algorithm is used to identify the anomalies of each component or line through data clustering.

Fuzzy clustering is a good data identification method, but its shortcomings are as follows:
(1) The number of anomaly detection clusters and the selection of cluster centers are accidental, which seriously affects the clustering performance;
(2) It is necessary to balance the efficiency and quality of the algorithm;
(3) Due to the large scale of measurement data, the objective function easily falls into a local optimum.

To overcome the above problems, this paper uses the locust algorithm to determine the number of anomaly detection clusters and the cluster centers of the FCM clustering algorithm. The clustering results are then solved and the upper and lower limits of the measurement feasible region are calculated; when an actual value exceeds the upper or lower limit of the feasible region, it is regarded as an anomaly.

Anomaly detection clustering refers to the use of the clustering algorithm to classify historical measurement data. The algorithm can be described as follows.

For the sample set $\{x_1, x_2, \ldots, x_n\}$ containing $n$ samples, each sample can be represented by a set of feature vectors. Assuming that these samples can be divided into $c$ clusters, the weighted sum of squared intraclass distances between the samples and their cluster centers is minimized as the objective function of the algorithm, namely,

$$J(U, V) = \sum_{i=1}^{c}\sum_{j=1}^{n} u_{ij}^{q}\, d_{ij}^{2} \tag{12}$$

Among them, $U$ is the fuzzy partition matrix of the measurement data types in the power system state estimation results; $V$ is the cluster center matrix of the measurement data types; $c$ is the number of measurement data type clusters; $u_{ij}$ is the membership degree of the $j$th measurement data sample in the $i$th cluster; $d_{ij}$ is the distance from the $j$th sample to the $i$th cluster center; and $q > 1$ is the weighting exponent.

Membership meets the following criteria:

$$\sum_{i=1}^{c} u_{ij} = 1, \qquad u_{ij} \in [0, 1], \qquad j = 1, 2, \ldots, n \tag{13}$$

In formula (12) as the number of clusters of measurement data decreases monotonously, there is an inflection point . In this way, when the number of measured samples changes, the number of iterations of the algorithm will change, and the minimum value is taken near the inflection point , namely,

According to the Lagrangian multiplier method, when the objective function attains its minimum value, the membership degrees and the measurement data cluster centers shall meet the following requirements:

$$u_{ij} = \frac{1}{\sum_{k=1}^{c}\left(d_{ij}/d_{kj}\right)^{2/(q-1)}} \tag{15}$$

$$v_i = \frac{\sum_{j=1}^{n} u_{ij}^{q}\, x_j}{\sum_{j=1}^{n} u_{ij}^{q}} \tag{16}$$

Among them, $d_{ij}$ is the distance from the $j$th sample $x_j$ to the $i$th cluster center $v_i$, and $q > 1$ is the weighting exponent of the objective function.

Set formula (12) as the objective function and formulas (13), (15), and (16) as the constraints. When the difference between the objective function values of two successive iterations is less than a given positive number, the clustering ends.
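The alternating update that follows from the Lagrangian conditions can be sketched in a few lines. This is a minimal illustration of standard fuzzy C-means, assuming NumPy, a weighting exponent of 2, and a number of clusters fixed by hand (the paper selects it with the locust algorithm, which is not reproduced here); the function name `fcm` and the two synthetic blobs are hypothetical.

```python
import numpy as np

def fcm(X, c, q=2.0, tol=1e-6, max_iter=100, seed=0):
    """Standard fuzzy C-means: returns memberships U (c x n), centers V, objective J."""
    rng = np.random.default_rng(seed)
    U = rng.random((c, X.shape[0]))
    U /= U.sum(axis=0)                     # memberships of each sample sum to 1
    prev_J = np.inf
    for _ in range(max_iter):
        Uq = U ** q
        # Cluster centers: membership-weighted means of the samples.
        V = (Uq @ X) / Uq.sum(axis=1, keepdims=True)
        # Distances from every sample to every center, shape (c, n).
        d = np.linalg.norm(X[None, :, :] - V[:, None, :], axis=2)
        d = np.fmax(d, 1e-12)              # avoid division by zero
        # Membership update from the Lagrangian stationarity condition.
        inv = d ** (-2.0 / (q - 1.0))
        U = inv / inv.sum(axis=0)
        J = float(np.sum((U ** q) * d ** 2))
        # Stop when the objective changes by less than the given positive number.
        if abs(prev_J - J) < tol:
            break
        prev_J = J
    return U, V, J

# Two well-separated synthetic groups of measurement samples.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, size=(30, 2)),
               rng.normal(5.0, 0.3, size=(30, 2))])
U, V, J = fcm(X, c=2)
labels = U.argmax(axis=0)                  # hard label = cluster of largest membership
```

Taking the cluster of largest membership as a hard label is one simple way to read the result; thresholds on the membership values could be used instead.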

Like other metaheuristic algorithms, the locust algorithm is prone to falling into a local optimum and lacks global search ability. To address the random population initialization and the linearly decreasing parameter of the locust algorithm, this paper uses reverse learning and the Cauchy distribution, respectively.

(1) Population initialization. The initial solution of the locust algorithm is randomly generated when optimizing the number of anomaly detection clusters and the cluster centers, so the solutions generated by the algorithm may not be evenly distributed in the search space, which limits the solving efficiency and reduces the performance of the algorithm. Reverse learning is a machine learning strategy suitable for population optimization. In each iteration of the algorithm, it obtains the reverse solutions of the current solutions and selects the solutions that are conducive to evolution from the current and reverse solutions, which reduces the blindness of the algorithm, expands the search space, and improves the overall performance of the algorithm. The process is as follows.

(a) Randomly generate the initial population of the number of anomaly detection clusters and the cluster centers, where the solution of the locust individual corresponding to the number of clusters and the cluster centers is calculated as follows:

$$x_{ij} = lb_j + \mathrm{rand}(0, 1)\,(ub_j - lb_j) \tag{17}$$

Among them, $x_{ij}$ represents the $i$th individual in the $j$th dimension, that is, the number and center of the anomaly detection clusters in the $j$th dimension; the range of $i$ is $[1, N]$ and the range of $j$ is $[1, D]$; $ub_j$ and $lb_j$ represent the upper and lower bounds of the number of anomaly detection clusters and the cluster centers, respectively; $N$ and $D$ are the size and dimension of the locust population used for optimization.

(b) Solve the reverse solution. Find the reverse population, where each individual $x'_{ij}$ represents the reverse solution of the number of anomaly detection clusters and the cluster centers and is given in the following way:

$$x'_{ij} = lb_j + ub_j - x_{ij} \tag{18}$$

(c) Select the best locust individual. First, select the locust individual with the best objective function value from , and then calculate the mean value of the solution in .

Among them, is the current solution of the number and center of anomaly detection clusters. It can be seen from formula (19) that the reverse learning method enables the algorithm to search for the number of anomaly detection clusters and the cluster centers in a larger search space. At the same time, the method can also guide individual optimization and accelerate the global convergence of the algorithm.

(2) Linear decline coefficient optimization. The linear decline parameter is mainly used to balance the local and global search capabilities. Except for the iteration variable , the rest of its expression consists of fixed values. Therefore, the value of the linear decline parameter decreases gradually with the number of iterations. If the value of the parameter is too small or too large, it is not conducive to the balance of local and global capabilities. Therefore, it is necessary to optimize the decreasing coefficient so that the two capabilities are well balanced. The Cauchy distribution is a distribution whose two tails are symmetric, and the formula is as follows:

The Cauchy distribution is introduced into the linear decline expression for the number of iterations, which can better balance the local and global search abilities. Therefore, the linear decline parameter expression is optimized as follows:where is the maximum number of iterations; and are the maximum and minimum values of the linear decreasing parameter in turn.
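The two ingredients discussed above can be illustrated separately. The sketch below assumes NumPy and shows (i) a common form of reverse (opposition-based) learning initialization, which keeps the best individuals from the union of a random population and its reverse population, and (ii) the Cauchy probability density together with the plain linearly decreasing parameter. How the paper combines the Cauchy distribution with the decreasing parameter is not recoverable from the text, so that combination is deliberately not reproduced; the selection rule, the function names, the default parameter values, and the sphere objective are all assumptions made for the example.

```python
import numpy as np

def opposition_init(objective, lower, upper, n_pop, seed=0):
    """Random population plus its reverse population; keep the n_pop best."""
    rng = np.random.default_rng(seed)
    # Random initial solutions inside the bounds.
    pop = lower + rng.random((n_pop, len(lower))) * (upper - lower)
    # Reverse solutions: mirror each coordinate inside its bounds.
    reverse = lower + upper - pop
    both = np.vstack([pop, reverse])
    scores = np.array([objective(x) for x in both])
    keep = np.argsort(scores)[:n_pop]          # smaller objective is better
    return both[keep], pop

def cauchy_pdf(x, x0=0.0, gamma=1.0):
    """Cauchy density with location x0 and scale gamma."""
    return gamma / (np.pi * ((x - x0) ** 2 + gamma ** 2))

def linear_decline(t, t_max, c_max=1.0, c_min=1e-5):
    """Plain linearly decreasing parameter (before any Cauchy modification)."""
    return c_max - t * (c_max - c_min) / t_max

def sphere(x):
    """Toy objective used only for the demonstration."""
    return float(np.sum((x - 1.0) ** 2))

lower, upper = np.array([-5.0, -5.0]), np.array([5.0, 5.0])
selected, random_pop = opposition_init(sphere, lower, upper, n_pop=10)
```

Because the selected individuals are the best of a set that contains the random population, their average objective value cannot be worse than that of the random population alone.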

In this paper, the improved locust algorithm is used in the FCM algorithm, which is mainly used to optimize the number of anomaly detection clusters and cluster centers, set locust individuals to represent the number of anomaly detection clusters and cluster centers, code locust individuals, combine the optimization task with the location of locust individuals, and use natural number coding method for locust individuals. This method is easy to understand and can reduce the amount of calculation. There are kinds of anomaly detection identification tasks, which are divided into subtasks. The number of measured data is , so the locust code is expressed as dimension measurement vector, and each one-dimensional ordinate of the locust individual represents the number of a feasible solution. In the locust algorithm, the fitness value is used as the condition for selecting individuals. With the increase of the fitness value, its individual quality will also be improved. On the contrary, their individual advantages and disadvantages will also be reduced. Therefore, the locust with a good fitness value will have a greater role in improving the optimization efficiency. Let be the position of the th locust obtained through iteration, and be the feasible solution function corresponding to the individual of the locust, then the fitness is as follows:

When the locust algorithm is used to solve the cluster number and cluster centers of the FCM clustering algorithm, the steps are as follows:
(1) Initialize the locust algorithm and other relevant parameters, map the feasible solutions of the number of anomaly detection clusters and the cluster centers one-to-one to the locust individuals, and set the maximum number of iterations.
(2) Initialize the population of the locust algorithm according to reverse learning.
(3) Update the decreasing coefficient of the locust algorithm by the Cauchy distribution.
(4) Calculate the fitness of the task corresponding to each locust individual.
(5) Compare the fitness of the task with that in the previous iteration. If the result is smaller than in the previous iteration, it replaces the original locust position. Otherwise, the position remains unchanged.
(6) When the number of iterations is less than the maximum number of iterations, turn to step (3); otherwise turn to step (7).
(7) Output the position of the locust individual with the best fitness as the best solution for the number of anomaly detection clusters and the cluster centers.

Based on the above contents, the power system state estimation anomaly detection flow chart is established, as shown in Figure 1.

3. Experimental Analysis

To analyze the performance and efficiency of the method in this paper, simulation experiments are carried out on the IEEE 14 node (Figure 2) and IEEE 30 node (Figure 3) transmission systems on the MATLAB platform. The experimental operating system is Ubuntu 18.04, the software is CUDNN7.0, the hardware is NVIDIA GTX 1080Ti, the CPU frequency is 3.0 GHz, and the GPU model is FP32. The experimental conditions are as follows:
(1) The typical section load (generation) data of the power system at a certain time are taken as the active load and reactive load of each node at the first time, and the load power factor of each node is calculated. This paper uses the MATPOWER toolbox to extract the measurement information of the IEEE 14 node [17] and IEEE 30 node [18] systems as the power system state estimation samples.
(2) The power factor of each node load of the system is kept unchanged, and a linear increase and a sinusoidal change are superimposed to simulate the fluctuation of the system load rate.
(3) The system measurement information of 100 continuous times is simulated and generated.
(4) The system measurement information of the 100 continuous times is input into the least squares support vector machine for state estimation, and the system state estimation results are obtained.
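Condition (2) can be illustrated with a short sketch. It assumes NumPy; the base level, ramp size, sinusoid amplitude and period, and the three example node loads are hypothetical values chosen for the illustration, not the settings used in the experiments. Scaling the active and reactive load of a node by the same rate keeps its power factor unchanged.

```python
import numpy as np

def load_rate_profile(n_steps=100, base=1.0, ramp=0.2, amp=0.05, period=25):
    """Load rate = base + linear increase + sinusoidal change (all values assumed)."""
    t = np.arange(n_steps)
    linear = ramp * t / (n_steps - 1)                 # rises from 0 to `ramp`
    sinusoid = amp * np.sin(2.0 * np.pi * t / period)
    return base + linear + sinusoid

def scale_loads(p0, q0, rate):
    """Scale active and reactive loads by the same rate at every time step."""
    rate = np.asarray(rate)[:, None]
    return rate * p0, rate * q0

# Hypothetical initial active (MW) and reactive (MVAr) loads of three nodes.
p0 = np.array([21.7, 94.2, 47.8])
q0 = np.array([12.7, 19.0, -3.9])

rate = load_rate_profile()
P, Q = scale_loads(p0, q0, rate)
power_factor = P / np.sqrt(P ** 2 + Q ** 2)           # shape (100, 3)
```

Each row of `P` and `Q` would then serve as the load setting for one power flow run that produces the measurement information of that time step.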

The abnormal conditions set for the power system are shown in Table 1. The state estimation results of the IEEE 14 node and IEEE 30 node systems obtained by this method are shown in Figures 4 and 5.

As shown in Figures 4 and 5, the state estimation results of this method for the IEEE 14 node and IEEE 30 node systems agree with the true values, which verifies that this method is capable of power system state estimation.

In the IEEE 14 node system, a high temperature overheating fault is set on fault feeder line 1-2 at 6 m at the different times of 2 d, 4 d, 6 d, and 8 d. The method in this paper and the methods of references [8–10] are used to identify the anomaly under the same conditions. The identification mainly locates the abnormal line position by clustering. The anomaly identification results of the different methods are shown in Figure 6.

Figure 6 shows that, under the high temperature overheating fault of the IEEE 14 node system, when there is an exception at the 5 m position of fault feeder 1-2, the position given by this method is consistent with the real situation, whereas the identification results of the methods of references [8–10] contain errors. The results show that this method is the most effective in identifying the anomaly.

In the IEEE 30 node system, a surface discharge fault caused by a high energy discharge fault is set at the 2 m, 4 m, 6 m, 8 m, and 10 m positions of fault feeder line 27–29. The method in this paper and the methods of references [8–10] are used to identify the anomaly. The identification errors of the various methods are shown in Figure 7.

Figure 7 shows that, under the high energy discharge fault of the IEEE 30 node system, when the surface discharge fault occurs at the 2 m, 4 m, 6 m, 8 m, and 10 m positions of fault feeder 27–29, the anomaly identification error of this method is 0–0.1 m, while the identification errors of the methods of references [8–10] are larger, which verifies the high identification accuracy of this method.

Interference data with a noise intensity of 0.5 dB are introduced into the measurement data. Tables 2 and 3 show the real-time test results of anomaly identification in the state estimation of the IEEE 14 bus and IEEE 30 bus power systems, respectively.

According to Tables 2 and 3, after interference data with a noise intensity of 0.5 dB are introduced into the measured data, the delay in identifying anomalies in the IEEE 30 power system is 1 s. The results show the robustness of the method in this paper. Because principal component analysis efficiently extracts the useful measurement data from the power system, anomalies in the state estimation can still be identified.

To compare the identification ability of different methods, anomaly identification is performed on the system measurement information of 100 consecutive moments. The shorter the identification time, the better the identification ability of the method. The experimental results are shown in Figure 8.

It can be seen from Figure 8 that the identification time of the method in this paper is always below 0.05 ms and is lower than that of the comparison methods, which verifies that the method in this paper has good identification ability.

4. Discussion

State estimation and anomaly identification serve the safe operation of the power grid, which is a necessary condition for protecting people’s lives and property and for social and economic development. Relay protection is an important measure to ensure the safe operation of the power grid [19–22]. The security of the power system directly affects people’s daily life and the stability of the whole power grid, and it is an important basic guarantee for a stable supply to power users. The development and application of power system relay protection can not only quickly and effectively eliminate power failures but also provide an important security guarantee for national economic production and people’s lives [23, 24]. The relationship between relay protection and electricity is just like that between the army and the country. In other words, relay protection monitors and controls the working conditions of the power system. In case of a minor short circuit fault, the power supply system does not interrupt its work but issues a warning and handles the fault accordingly. In case of a major short circuit accident, relay protection can be used for reliable, accurate, and rapid troubleshooting. For this reason, in terms of power system security protection, this paper suggests using relay protection, combined with the power system operation state estimated by this method and the anomalies identified, to take appropriate measures to protect the normal and stable operation of the power system [25].

In modern society, electricity is an important form of energy. Whether the power system can operate safely and reliably is directly related to people’s lives and social development. Relay protection is also the guarantee of safe and reliable operation of the power system. Taking effective measures to give full play to the role of relay protection is the premise of reliable power system operation. To this end, advanced equipment, sound management systems, and qualified staff are essential. Power enterprises should be flexible in actual operation [26–30].

First of all, advanced equipment is the material basis for the full play of relay protection. Electric power and production enterprises should inspect the relay protection devices and equipment they use, both regularly and at irregular intervals, and deal with problems in a timely manner. At the same time, they should promptly replace equipment that has low performance or has been ordered to be eliminated, and constantly improve the equipment to ensure that each circuit has sufficient time protection [31]. Investment should also be increased to ensure that new technologies and equipment are put into use.

Secondly, the establishment and implementation of a sound management system is the guarantee for the normal work of relay protection. Electric power enterprises should pay attention to the formulation of the system and formulate regulations for routine operations [32–35]. The person in charge of each link in relay protection management should strictly follow the system, earnestly carry it out, establish a strict assessment mechanism, and improve the overall level of relay protection management of the power system [36].

Moreover, the professional level of management and operation personnel also plays a key role. Managers are the commanders of power system relay protection management in power enterprises, while operators carry out the specific work. Where the quality and professional level of relay protection managers and operators are low, electric power enterprises should organize regular education and training to enhance their professional quality and skill level. In addition, according to their own conditions, enterprises should introduce teams with strong professional ability in a timely manner to promote the improvement of the overall technical level of relay protection personnel.

Since it is difficult to obtain the actual operation data of the power system, the method proposed in this paper performs identification after state estimation. In future research, it is necessary to continuously strengthen the ability to identify abnormal data of the power system so as to provide a theoretical basis for the safe operation of the power system.

5. Conclusion

Power system state estimation and anomaly identification are among the core issues in power system research. This paper proposes an anomaly detection and identification method for power system state estimation based on the FCM algorithm and tests the application effect of this method through experiments. The main conclusions of the experiments are:
(1) The state estimation results of the IEEE 14 bus and IEEE 30 bus systems are consistent with the actual situation;
(2) Compared with the other methods, the method in this paper can accurately identify anomalies after the state estimation of the IEEE 14 bus and IEEE 30 bus systems, and its identification accuracy is the highest;
(3) After interference data with a noise intensity of 0.5 dB are introduced into the measurement data, the identification delay of this method for anomalies in power systems with the two structures is 1 s, so anomalies under various abnormal states of power systems are identified with good real-time performance.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon reasonable request.

Conflicts of Interest

The authors declare that there are no conflicts of interest.

Acknowledgments

The authors confirm that all listed authors have made a significant scientific contribution to the research in the manuscript, approved its claims, and agreed to be an author.