Abstract

With the development of intelligent technology, the distribution network has entered a new round of upgrading and transformation, in which new technologies, equipment, and platforms bring fresh momentum to the network. This paper proposes a data-mining-based settlement management method for distribution network projects. An improved K-means method is used to estimate the coefficients of each link of the project investment. When a coefficient is found to deviate substantially, the investment is adjusted immediately so that the deviation does not affect the overall investment. The method can save infrastructure construction costs. Through deeper data mining, it forms a standard, intelligent analysis method, achieves a rapid conversion from basic data acquisition to the application of output results, supports forecasting of material and equipment prices and fast estimation of project construction costs, provides more scientific and accurate support for project decision-making, and makes cost management across the whole project process more accurate.

1. Introduction

Since the “Thirteenth Five-Year Plan,” with the implementation of policies such as poverty alleviation and rural revitalization, the prominent contradiction in power construction has shifted from “utilizing electricity” to “making good use of electricity.” The power distribution network is responsible for distributing electricity; high hopes have been placed on it, and investment in its engineering has increased substantially. Faced with large and varied distribution network construction data, traditional manual management and control methods, which lack an overall “digital” analysis, are inefficient, and decision-makers have insufficient analysis and response capabilities, making it difficult to achieve the goals of distribution network construction during the 14th Five-Year Plan period [1–4]. In recent years, with the rapid development of data science, the application of data mining technology in the power grid has flourished. Data mining can extract value from massive data and provide strong support for the construction and development of electric power enterprises [5]. Therefore, to meet the power grid construction tasks and the lean management and control requirements of the new era, it is urgent to integrate data mining technology into the management and control of distribution network engineering to achieve “digital” management and scientific decision-making [6, 7].

With the continuous deepening of power system reform and the sustained, rapid growth of the national economy, China's power grid construction has made remarkable achievements. However, as the scale of investment increases, the impact of project cost levels on the development of the power grid has become more and more significant and has attracted much attention. Whole-process cost management has demonstrated the importance of the preproject cost to the entire project implementation: scientific and accurate determination of the preproject cost, together with its effective control and management, is the key to achieving construction project management goals [8]. In practice, however, the cost in the early and middle stages of a project has received little attention. The distribution network distributes electric energy and stabilizes power supply in the power system. Its safe and reliable operation is a prerequisite for the stable development of society, and the key to ensuring the safe operation of the power grid lies in the construction quality of the distribution network. Therefore, it is an inevitable trend to deepen research on key management and control technologies for the entire process of distribution network engineering, to stabilize the construction and implementation quality of distribution network engineering through scientific means, to lay a solid foundation for the safe and reliable operation of the distribution network, and to provide the public with safe and reliable electric energy, which in turn provides a strong guarantee for the stable development of society [9–12]. Distribution network projects have many sites, cover wide areas, and face an extremely complex external environment, and their management and control elements have multilevel logical relationships. These intricate relationships make the identification, analysis, and control of key management and control elements in distribution network construction both the top priority and the greatest difficulty of overall project control. In recent years, the concepts of “Internet of Things” and “cloud computing” have come to the fore, and big data and mobile Internet technologies have grown vigorously. The information systems of the power industry contain massive amounts of data, whose value is increasingly being tapped. These abundant data can reflect the status of all aspects of a distribution network construction project in real time, and data mining technology makes the transformation of “data-information-knowledge-wisdom” possible [13, 14]. Therefore, it is imperative to integrate data mining technology into the management of the whole process of distribution network construction; it is an important cornerstone for making whole-process management and control of the distribution network better, stronger, and more precise [15].

With the popularization and application of new technologies and new equipment in power grid engineering, the proportion of equipment and material purchase costs in project investment is increasing, and so is the impact of equipment and material prices on power grid engineering investment. Collecting the equipment and material price information released by the State Grid helps power grid companies analyze price fluctuation trends and improve the accuracy of project estimates and estimated investment. A basic data set of equipment and material prices is created, a neural network analysis method is used to build an intelligent analysis data model, and iterative analysis and calculation with this model yield calculated and predicted equipment and material prices. The predicted material prices are fed back to the historical engineering cost data set for further adjustment, and the established rapid evaluation algorithm and model are used to make more accurate cost estimates for the projects to be invested in. The cost of a proposed project is estimated with a fuzzy-mathematics method: the degree of closeness is calculated through the fuzzy relationship, so that the similarity between completed projects and the project to be estimated is quantified; then, following the principle of selecting the nearest, the completed project with the largest degree of closeness is used to estimate the cost of the proposed project.
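The nearest-selection principle can be illustrated with a minimal Python sketch. It assumes that each project is described by a vector of fuzzy membership degrees in [0, 1] and that closeness is measured with the Hamming closeness degree, N(A, B) = 1 − (1/n) Σ|a_i − b_i|; the paper does not state which closeness measure is used, and the project names, feature meanings, and costs below are hypothetical.

```python
def hamming_closeness(a, b):
    """Hamming closeness of two fuzzy membership vectors with values in [0, 1]."""
    if len(a) != len(b) or not a:
        raise ValueError("vectors must be non-empty and of equal length")
    return 1.0 - sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def estimate_cost(target, completed):
    """Return (name, closeness, cost) of the completed project closest to target.

    `completed` maps a project name to (membership vector, cost).
    """
    best = max(completed, key=lambda name: hamming_closeness(target, completed[name][0]))
    features, cost = completed[best]
    return best, hamming_closeness(target, features), cost


# Hypothetical membership degrees, e.g. terrain difficulty, line length, equipment grade.
completed_projects = {
    "project_a": ([0.2, 0.5, 0.9], 410.0),
    "project_b": ([0.7, 0.6, 0.4], 530.0),
    "project_c": ([0.6, 0.8, 0.5], 495.0),
}
proposed = [0.7, 0.7, 0.4]

name, closeness, cost = estimate_cost(proposed, completed_projects)
print(name, round(closeness, 3), cost)
```

In this toy data set, project_b has the largest degree of closeness to the proposed project, so its cost would serve as the estimate. In practice the reference cost would usually be scaled by project size before being reused.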

With the development of intelligent technology, the distribution network has entered a new round of upgrading and transformation, which will introduce new technologies, new equipment, and new platforms and bring fresh momentum to the distribution network [16]. The construction of distribution network projects is characterized by large investment amounts, multiple nodes, small unit scale, and long fund recovery periods, and it involves many uncertain factors. Over the whole process of project implementation, changes in certain factors often lead to changes in funding, which leaves a large gap between the budget and the actual project investment, sometimes up to 40%, of which government influence factors may account for about 10%. Such high deviation wastes a great deal of power grid investment and prevents construction funds from being used rationally [17–20].

This article aims to analyze the problems and causes of cost control in each stage of a distribution network project and to explore solutions and management measures that, while ensuring the quality of project construction, control costs reasonably and effectively, save funds and resources, and improve corporate investment efficiency. The deviation is to be controlled within 20% (the target is 10%). To this end, this article draws on the massive power grid engineering cost data involved in historical project cost data management, technical and economic analysis, and the application of technical and economic results within the jurisdiction, including the massive cost data held in infrastructure information. These data are managed in a unified management system and mined more deeply to form standard and intelligent analysis methods, to achieve rapid conversion from basic data collection to the application of output results, and to forecast material and equipment prices and estimate project engineering costs quickly and accurately. This provides more scientific and accurate support for project decisions and makes the whole process of the project more accurate.
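As an illustration of the deviation thresholds mentioned above, the following Python sketch computes the relative deviation between budgeted and actual investment for each stage and compares it with the 20% limit and the 10% target. The definition of deviation as |actual − budget| / budget, the stage names, and the amounts are assumptions made for the example and are not taken from the project data.

```python
def relative_deviation(budget, actual):
    """Relative gap between actual and budgeted investment (assumed definition)."""
    if budget <= 0:
        raise ValueError("budget must be positive")
    return abs(actual - budget) / budget


def classify(deviation, limit=0.20, target=0.10):
    """Compare a deviation with the 10% target and the 20% limit."""
    if deviation <= target:
        return "within target"
    if deviation <= limit:
        return "within limit"
    return "adjust investment"


# Hypothetical (budget, actual) pairs per stage, in arbitrary currency units.
stages = {
    "design": (100.0, 105.0),
    "procurement": (400.0, 460.0),
    "construction": (500.0, 700.0),
}

report = {name: classify(relative_deviation(b, a)) for name, (b, a) in stages.items()}
for name, status in report.items():
    print(name, status)
```

Here only the construction stage, with a 40% deviation, would trigger an investment adjustment.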

2. Analysis on Influencing Factors of Distribution Network Project Settlement

2.1. Environmental Impact

Natural environmental factors are the most unpredictable. Weather, natural disasters, and other force majeure events can directly interrupt construction and affect the construction progress. A long-lasting disaster will seriously affect the construction period, and operation and maintenance costs will also rise. Natural disasters can severely damage the construction site and construction property. Sometimes, plots and routes planned in advance need to be reselected because the geographical environment has been damaged, and different construction methods must be adopted to continue. All of these can lead to serious budget overruns [21–23].

2.2. Human Factors

The technical ability of personnel is the core of project implementation and runs through the entire project cycle. In design consultation, the design plan should fully consider the rationality of the scheme and include an emergency mechanism, so that when a change occurs, construction can continue in the most economical way and with reduced difficulty. In the procurement stage, personnel rely on their professional ability to select cost-effective equipment and materials and to control materials accurately, so that later construction is not affected. In the implementation process, construction technology and contractor capabilities can be evaluated effectively, and improper behavior, such as violations, can be found and avoided as early as possible.

2.3. Policy Factors

The unified promotion and construction of the distribution network is a project activity arranged by the State Grid. It is mandatory under certain industry needs, but this mandate binds only the State Grid system internally, whereas distribution network construction falls within the scope of national infrastructure projects. Construction site selection and the construction period require documents issued by government departments, and the approval period can seriously affect the project implementation period [24]. Moreover, national policies take precedence over the development requirements of the State Grid, and a sudden higher-level policy can directly overturn the early-stage work of a project, which is a heavy blow to the project. It is therefore necessary to investigate thoroughly in the early stage of the project, gather information on all relevant aspects, and reduce losses caused by policy changes.

2.4. Social Factors

The distribution network is a “last mile” network, so it is necessary to communicate directly with owners before construction. However, disputes over land acquisition compensation often occur, and obstruction by villagers, interference from surrounding enterprises, and government intervention hinder the smooth progress of construction. Such social events delay the implementation cycle. Negotiation time should be strictly controlled, reasonable negotiation terms should be adopted, and coordination should be carried out as far as possible.

2.5. Management Norm Factors

The implementation of distribution network projects follows the same process as ordinary projects, and each stage involves investment, so investment funds must be strictly controlled at each stage. At present, funds are wasted at every stage through negligence or accidents, and timely countermeasures are lacking. Strict management specifications should be formulated to avoid incomplete design considerations, deviations in project costs, and unreasonable project implementation cycles, and to keep equipment and construction costs within the lowest acceptable range.

3. Analysis of the Effect of the Application of the Distribution Network Planning Data

The cost of a distribution network project runs through the entire life cycle of the project, and each stage involves substantial costs. Effective cost control methods should be used to strictly control every investment detail. The project cycle comprises planning and project approval, design consultation, bidding and procurement, installation and construction, completion acceptance, and project maintenance, and investment changes in any of these stages affect the total investment. The investment details include installation project fees, bank loan interest, equipment purchase fees, and equipment testing fees. For the project cost, the most important element is the real-time price mechanism. Therefore, a distribution network project must formulate a market-based price mechanism under the national macropolicy, so as to inform macroinvestment decision-making and improve the project economy to a certain extent. This paper adopts a cost control method for distribution network engineering that takes price control and the whole life cycle as its main line [25].

The share of distribution network investment in total power grid investment has gradually increased, from 10% in the past to 45% now, which shows the importance the State Grid attaches to the distribution network. From the perspective of technological development, the construction of backbone power grids has reached a bottleneck, and breakthroughs may take some time to accumulate in operation. Given its polymorphism and diversity, the distribution network is bound to develop much faster. At the end of the power grid, it is integrated with multiple departments to form a platform that integrates business and distribution data, and some of this involves repeated investment. The distribution network investment data should be sent to the grid big data platform as sample data, and its impact on each application system should be calculated separately, so that the data reach each business system and the investment scale of each system can be adjusted, forming a virtuous cycle with optimal use of capital.

The comprehensive project cost of the cross-business platform is shown in Figure 1. The big data platform stores the investment data of each business department of the power grid, and the investment data of each link of the distribution network are input as separate sample sets. Data mining algorithms such as support vector machines, Apriori, expectation maximization, and decision trees can be selected. The distribution network data corresponding to each business platform are selected for training, and the resulting auxiliary factors are used to further optimize the engineering investment parameters of each business department. The goal is to maximize the intensive investment of each department.

4. Data Mining Concepts and Methods

Data mining consists of several steps: data cleaning, integration, selection, transformation, mining, model evaluation, and knowledge representation. By refining, analyzing, and transforming a large amount of data, the key target value is finally obtained. Its value here lies in using data mining to improve the forecast model.

4.1. Research on Data Mining Technology

Knowledge discovery in databases (KDD) is a process in which raw data are transformed, through a series of processing steps, into useful information that people can visualize. Data mining (DM), the core technology of KDD, emerged in the second half of the twentieth century. Its essence is filtering and mining, that is, screening massive data, removing superficial information, and extracting high-value information. Broadly speaking, data mining is a data processing method that takes big data as its object of analysis and integrates artificial intelligence, statistics, and other theories in order to discover internal rules and explore internal value in massive, irregular data. In a narrow sense, data mining is a data processing method that results from combining these techniques with database technology. Data mining frequently appears in the Internet of Things and financial fields, and in recent years it has been widely used in the power industry. Data mining integrates ideas and theories from various disciplines, such as statistics and AI, as shown in Figure 2.

Because data mining integrates multiple disciplines, its methods are numerous. Common methods include regression analysis, trend prediction, feature recognition, association analysis, cluster analysis, and anomaly detection, which mine data from different perspectives. Many links and factors affect the cost of distribution network projects, so a categorical statistical method should be selected to analyze and evaluate the factors of each link separately. This article mainly uses the clustering method.

4.2. Clustering Method

Clustering is one of the most commonly used techniques in data mining. The main idea is to use similarity measures to group similar samples together. There are many clustering algorithms, such as the partition-based K-means algorithm, the hierarchical BIRCH algorithm, the density-based DBSCAN algorithm, and the grid-based STING algorithm; factor-based approaches also exist. Each has its own characteristics, and the algorithm must be selected according to the application conditions. Cluster analysis is a mature statistical method for engineering data, and its various subalgorithms have been applied successfully to many engineering projects. It is well suited to this type of project evaluation: adaptive classification makes it possible to follow the project cost in real time, and when a large deviation occurs in a certain link, remedial measures can be taken immediately. The K-means algorithm is simple and fast and gives satisfactory clustering results, so it has been widely used, although its predictions are not especially accurate. For a large investment project such as the distribution network, the cost accuracy requirement is comparatively low, so K-means is a suitable choice. However, K-means is sensitive to its random initialization, and the number of clusters is difficult to determine in advance. Therefore, the K-means algorithm needs to be improved slightly.

This paper proposes a K-means clustering algorithm based on the contour coefficient (also known as the silhouette coefficient). In the traditional K-means algorithm, determining the class of each point requires repeatedly computing distances between that point and the points in each cluster to locate it within a cluster, so the calculation efficiency is low. The contour coefficient method instead determines the position by comparing each node with the contour coefficient, which greatly improves the speed. The contour coefficient is an optimization coefficient proposed on the basis of the hierarchical clustering algorithm; it uses similarity to determine the number of clusters and improves the overall efficiency of the algorithm. It quantifies the similarity between any object in the data set and the other objects in its own cluster, as well as the similarity between that object and the objects in other clusters, and combines the two quantities in a certain form to obtain an evaluation criterion for the clustering. The detailed process of the improved K-means clustering algorithm is as follows.

(1) Initialization: randomly select $k$ objects as the initial cluster centers, where $n$ is the total number of objects in the distribution network engineering cost data sample set, $t$ is the number of iterations, and $t < n$.

(2) Calculation of the contour coefficient: for the $i$-th object, calculate the average distance from this object to all other objects in its own cluster, denoted $a_i$. For the $i$-th object and each cluster that does not contain it, calculate the average distance from the object to all objects in that cluster, take the minimum over all such clusters, and denote it $b_i$. The contour coefficient $y_i$ of the $i$-th object is then calculated as

$$y_i = \frac{b_i - a_i}{\max(a_i, b_i)}.$$

The value of the contour coefficient lies between −1 and 1. When $a_i < b_i$, the contour coefficient of the $i$-th object is positive; when $a_i > b_i$, it is negative. Averaging over all objects gives the cluster contour coefficient $y$, and the number of clusters $k$ is taken as the value for which $y$ is best.

(3) Calculate the distance from each point $x_i$ to each cluster center, and assign each object to the most similar cluster.

(4) Repeat the above operations until the centers no longer change and the squared error function converges.
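The procedure can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the implementation used in the experiments: it computes the contour (silhouette) coefficient $y_i = (b_i - a_i)/\max(a_i, b_i)$ for each object, averages it, and keeps the number of clusters with the highest average. To keep the example reproducible, the random initialization of step (1) is replaced here by a deterministic farthest-point initialization; the helper names and the two-dimensional sample points are hypothetical.

```python
import math


def farthest_point_init(points, k):
    """Deterministic stand-in for the random initialization (an assumption)."""
    centers = [points[0]]
    while len(centers) < k:
        # Next center: the point farthest from all centers chosen so far.
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    return centers


def kmeans(points, k, max_iter=100):
    """Plain K-means; returns one cluster label per point."""
    centers = farthest_point_init(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        if new_labels == labels:
            break  # assignments are stable, so the centers no longer change
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels


def mean_silhouette(points, labels):
    """Average of y_i = (b_i - a_i) / max(a_i, b_i) over all points."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        return 0.0  # undefined for a single cluster; treated as 0 in this sketch
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # common convention: a singleton contributes 0
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != lab)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


def choose_k(points, k_values):
    """Return the k with the highest mean silhouette, plus all scores."""
    scores = {k: mean_silhouette(points, kmeans(points, k)) for k in k_values}
    return max(scores, key=scores.get), scores


# Hypothetical two-dimensional cost coefficients forming three groups.
points = [(1.0, 1.1), (1.2, 0.9), (0.9, 1.0),
          (5.0, 5.1), (5.2, 4.9), (4.9, 5.0),
          (9.0, 1.0), (9.1, 1.2), (8.9, 0.9)]
best_k, scores = choose_k(points, range(2, 6))
print(best_k)
```

For these three well-separated groups the highest average coefficient is obtained at k = 3. The silhouette computation shown here is quadratic in the number of objects, so for larger data sets a sampled or library implementation would be preferable.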

The improved method can greatly reduce the number of repeated calculations, thereby improving the efficiency of the algorithm.

5. Project Settlement Forecast of Distribution Network Based on Data Mining

Following settlement data preparation, classification, index establishment, and data conversion, intelligent algorithms are used to establish a prediction model. The key indicators of the preliminary design of a new project serve as input, and the output is a budget estimate that provides a reference for preparing and reviewing estimates and promotes more reasonable budget preparation. Applying data mining to settlement data to estimate and forecast power grid construction projects proceeds in the following order: settlement data collection, data classification, data conversion, data processing, intelligent algorithms, model establishment, engineering indicators, and budget estimates and forecasts. A large amount of settlement data for power grid construction projects is analyzed to mine the internal laws of settlement changes. With the settlement data as the research object, data preprocessing techniques such as statistical analysis, data conversion, and data denoising are combined with forecasting methods such as neural networks, fuzzy mathematics, genetic algorithms, and support vector machines to establish models that forecast the estimates effectively.
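Two of the preprocessing steps named above, data denoising and data conversion, can be illustrated with a short Python sketch. The specific techniques are assumptions chosen for the example, since the paper does not state them: outliers are removed with the modified z-score based on the median and the median absolute deviation, and the remaining values are rescaled to [0, 1] by min-max normalization. The settlement values are hypothetical.

```python
import statistics


def remove_outliers(values, threshold=3.5):
    """Drop values whose modified z-score (median/MAD based) exceeds threshold."""
    median = statistics.median(values)
    mad = statistics.median(abs(v - median) for v in values)
    if mad == 0:
        return list(values)  # no spread to measure against; keep everything
    return [v for v in values if 0.6745 * abs(v - median) / mad <= threshold]


def min_max_scale(values):
    """Rescale values linearly to the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical settlement amounts; 480.0 plays the role of singular noise data.
settlement = [102.0, 98.0, 105.0, 101.0, 99.0, 480.0]

cleaned = remove_outliers(settlement)
scaled = min_max_scale(cleaned)
print(cleaned)
```

After cleaning, the scaled values can be passed to whichever forecasting model is chosen.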

The effectiveness of the evaluation algorithm is verified with experimental simulation data. The investment data of the entire life cycle of a city power distribution project, more than 3000 records in total, are selected as the simulation samples, and the simulation is carried out in MATLAB 7.0. The experiment divides the project cycle data into 6 stages for verification and calculates the squared error for each stage. The statistical results are shown in Table 1.
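A per-stage error of this kind can be computed as in the following Python sketch. It assumes that the error is the mean squared difference between estimated and actual values within each stage, which the paper does not spell out; the stage names and the normalized values are hypothetical and do not reproduce Table 1.

```python
from collections import defaultdict


def stage_mean_squared_error(records):
    """Mean squared error per stage from (stage, estimated, actual) records."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for stage, estimated, actual in records:
        sums[stage] += (estimated - actual) ** 2
        counts[stage] += 1
    return {stage: sums[stage] / counts[stage] for stage in sums}


# Hypothetical normalized investment values for two of the stages.
records = [
    ("consulting design", 0.50, 0.60),
    ("consulting design", 0.40, 0.30),
    ("engineering construction", 0.80, 0.40),
    ("engineering construction", 0.20, 0.60),
]

errors = stage_mean_squared_error(records)
worst = max(errors, key=errors.get)
print(worst)
```

The stage with the largest error is the one where cost control would be examined first.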

It can be seen from Table 1 that the average error is 0.235 for project construction cost, 0.052 for consulting design, 0.042 for bidding and procurement, 0.209 for engineering construction, 0.075 for acceptance and completion, and 0.098 for operation and maintenance. The higher error value of the project cost in the construction phase and the operation and maintenance phase indicates that unexpected situations are more frequent in these parts, and the probability of occurrence is the highest in phase 3. Cost control should therefore be strengthened in these parts, and the technical ability of employees should be improved. A variety of alternatives should be provided as emergency measures in the engineering design, with the investment of each option listed and analyzed in detail as the basis for selection during later on-site construction.

6. Conclusion

This paper proposes a data processing method based on data mining and classification calculations. An improved K-means method is used to estimate the coefficients of each link of the project investment, and when a coefficient is found to deviate substantially, the investment is adjusted immediately so that the deviation does not affect the overall investment. The method saves infrastructure construction costs, carries out deeper data mining, forms a standard and intelligent analysis method, achieves rapid conversion from basic data collection to the application of output results, and enables rapid and accurate estimation of material and equipment prices and project engineering costs. It provides more scientific and accurate support for project decision-making and makes the whole process of the project more accurate. Life-cycle management of distribution network project cost is a dynamic process that runs through project planning, design and bidding, construction implementation, completion acceptance, and later maintenance, and it demands professional, technical, and comprehensive management. The project management unit should carry out precontrol, process management, and postevent closed-loop management; strictly control every risk point in the project cycle that may affect the project cost; strengthen awareness of prevention; and take timely countermeasures when the cost changes, so as to keep the project cost within a reasonable range and improve the lean management of distribution network project cost. This promotes the transformation and upgrading of distribution network construction, supports the modernization of power enterprises, and better serves society and people's livelihood. In budget estimation and forecasting for power grid construction projects, the large number of engineering indicators and the complex relationships among them make estimation and forecasting difficult. Through data mining on the settlement data of power grid construction projects, the original indicators are merged and reduced in dimensionality to obtain key indicators, singular noise data are removed, the data are cleaned, and intelligent algorithms are used to establish predictive models that yield reasonable estimates. In this way, project investment is kept within a reasonable range, and the rate by which the final settlement falls below the estimate is likewise kept within a reasonable range.

Data Availability

The dataset can be accessed upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest.

Acknowledgments

The authors thank the Science & Technology Project of State Grid Fujian Electric Power Co., Ltd., “Research on distribution network 3D auxiliary planning and design and engineering cost control tool based on real scene 3D reconstruction technology” (No. 52135020000P).