Abstract

With the rapid development of information technology, especially the extensive use of databases and computer networks, the amount of data held by enterprises is growing rapidly. Given the current shortage of human resources and the pressure of talent competition, the evaluation of human resource management in enterprises is particularly important. Creating measures to attract talent, innovating the talent management system, formulating a standard talent evaluation system, avoiding improper employment accidents, rationally allocating human resources, and cultivating employees’ loyalty to the enterprise have become challenges that more and more companies must face. In response to these problems, this paper conducts an in-depth study of strategic human resource management evaluation using data mining technology. Because of the huge amount of data in the human resource management system, data mining algorithms such as the ID3 algorithm, the GBDT algorithm, and the Bayesian network are proposed to classify and evaluate the data. Based on these data mining techniques, the strategic human resource management evaluation algorithm is researched and tested. First, the decision tree algorithm is used to build a decision tree from the educational background, identity, and other information of a company’s staff in order to classify the employees. The factor analysis results show that after deleting a single similar factor, the variance contribution rate of each factor influencing human resource management evaluation increases to a certain extent: employee engagement and organizational culture rose by 2 percent, while employee satisfaction, organizational learning, and organizational capability each rose by 1 percent. Therefore, the decision tree algorithm in data mining technology is effective for strategic human resource management evaluation.

1. Introduction

Enterprise managers and researchers struggle to identify a set of indicators to gauge the effectiveness of strategic human resource management, despite the fact that it can add value to the enterprise’s core competitiveness. This paper discusses the study and application of the human resource management evaluation process using data mining technology. A decision tree approach is proposed to address the massive amount of data in the strategic human resource management system, and its core algorithms, the ID3 algorithm and the GBDT algorithm, are used to carry out experiments on human resource management evaluation. First, employee data are categorized using the decision tree method. According to the factor analysis experiments, after removing a single similar factor, employee satisfaction, organizational learning, and organizational ability each increase by 1%, while employee engagement and organizational culture increase by 2%. We then verify the speedup ratios of the ID3 algorithm and the GBDT algorithm for various numbers of nodes, and observe the relationship between data size and running time of the two algorithms in the performance test of the decision tree algorithm. In the speedup verification, it is found that, for the same data cluster, the speedup ratios of the ID3 algorithm to the GBDT algorithm are 4 : 3.2 and 5 : 4.2 at the two cluster sizes tested, and the speedup of the ID3 algorithm increases faster than that of the GBDT algorithm. It is also found that, when the sample data are 50, the rate of the GBDT algorithm is 34.2 percent higher than that of the ID3 algorithm. As a result, the GBDT algorithm performs better for the strategic human resource management system in terms of running time, while the speedup of the ID3 algorithm grows faster. Both algorithms are therefore useful for implementing human resource management evaluation.

Data mining is increasingly in demand in the context of big data, and this field is receiving more and more research attention. Zhang studied badminton, which holds a significant place in Chinese sports; the sports industry is gradually embracing data mining technology, and that paper covers the definition of data mining technology, its steps and methodologies, and its application in a tactical analysis system for badminton courts [1]. In order to assess the effectiveness of acupuncture in treating patients with cervical spondylosis (CS) and neck pain, Huo et al. investigated the value of data mining models. To assess the viability of the data mining model, they compared the data processed by the model with the clinical data; experiments showed that the processing efficiency of the data mining model and the support-vector machine model reached maximum values of 81.48 percent and 82.64 percent, respectively [2]. Liu et al. explored drug compatibility rules using data mining techniques such as frequency analysis, the Apriori association rule algorithm, and an improved mutual information method, offering references and fresh ideas for their clinical application and for new drug research [3]. Wang et al. used historical data to uncover pertinent information about extraction parameters and provided advice on the factors influencing the orthogonal experiment and the relative importance of each factor; according to the experimental results, the results of the ID3 algorithm can be clearly interpreted, while the support-vector classification algorithm has high accuracy [4].

Human resource management is receiving more and more attention because company employees are one of the essential elements of enterprise organization. Amalnick and Zarrin proposed a thorough framework for evaluating and analyzing human resource performance that takes into account the standards of the European Foundation for Quality Management (EFQM), a renowned business excellence model, together with the components of the health, safety, environment, and ergonomics (HSEE) management system. According to the findings, the EFQM model has a significantly greater impact on company performance than the HSEE management system [5]. Lidinska and Jablonsky examined the function of employee performance reviews in management consulting firms using the analytic hierarchy process (AHP); the suggested AHP model combines relative and absolute measures, making it possible to quickly and simply calculate an employee’s overall performance score using a straightforward MS Excel tool [6]. Ellitan used the case analysis approach to provide basic answers to some enterprise resource management-related questions about how an enterprise can gain a competitive edge and ensure its survival, concluding that there are various ways to gauge financial and operational performance [7]. The studies above cover data mining and human resource analysis thoroughly and are a valuable source of information for the in-depth description that follows.

Based on the role of the decision tree classification algorithm in data mining for human resource management evaluation, this paper examines and studies the strategic human resource evaluation algorithm. The evaluation algorithm’s influence on system performance leads to an improved tool for evaluating the effectiveness of human resource management. By applying Bayesian network theory and the AHP method, a risk assessment model is created that combines a set of rational and scientific risk assessment indicators with Bayesian network technology. In order to provide guidance for enterprise human resource management, the paper also performs a risk assessment of the human resource management system using a Bayesian network toolkit. The model makes visible the association between strategic human resource management and strategic objectives, as well as the variable-level relationship between the two. This paper focuses on analyzing the content and implementation steps of the model, which lays the foundation for practical application.

2. Implementation Method of Strategic Human Resource Management Evaluation Algorithm Based on Data Mining Technology

2.1. Data Mining Technology

Classification learning [8], cluster analysis, association rule mining, prediction, time-series mining [9, 10], and deviation analysis are all examples of data mining techniques used to provide a specific model for mining objects in the mining process. The data mining process is depicted in Figure 1. It consists of five key steps: selecting the mining object, gathering the mining data, preprocessing the data, mining the data, and interpreting the resulting information. Different implementations are used depending on the knowledge mined; these classification methods distinguish knowledge with different feature types from knowledge with the same attributes but different feature types [11]. Classification is a crucial technique in data mining, and classification results can be explained and used for prediction through language rules. The k-nearest neighbor algorithm, decision tree algorithm, Bayesian classification, and support-vector machine are common classification algorithms, and decision tree classifiers are among those most frequently used in large-scale data mining.

The technical foundation of data mining technology is artificial intelligence [12–14]: a large number of artificial intelligence algorithms are applied to the data to provide users with useful information. In a sense, data mining applies fairly mature artificial intelligence techniques (such as neural networks [15], genetic algorithms, and decision trees) within a specific application system, but at greatly reduced scale and difficulty. The data mining step mainly consists of building a model from the data, using data mining algorithms and techniques, and choosing the algorithm appropriate for the specific task so that the algorithm model performs best. Visualization and expert knowledge interpretation are used to interpret the mining results and enhance their understandability, so that the data mining results can be better applied in practice.

2.2. Decision Tree Algorithm
2.2.1. ID3 Algorithm

In the process of decision-making, the decision tree starts from the root node and reaches the leaf node through the classification of multiple attribute nodes. The decision tree construction process does not rely on domain knowledge, but divides different groups by attributes, and builds the topology between attributes. The decision-making process of the decision tree is intuitive, interpretable, and easy to understand, so it is widely used in customer analysis, market research, sales decision-making, market segmentation, market forecasting, and other fields.

A decision tree model is a machine learning model with a tree structure. It primarily consists of a root node, child nodes, and leaf nodes. The root node is at the top of the tree and the child nodes are in the middle; a node may have multiple children, each representing a split attribute of the sample set. The classification effect is best when the topmost child node, also known as the best split attribute node, is used. The leaf nodes at the bottom of the tree represent the categories of the sample set; each leaf node corresponds to a classification rule, and there may be more than one path from the root node to a leaf. When determining the category of a sample, the sample set is first classified at the root node, then once at each child node along the path, and finally at the leaf node. Figure 2 shows a typical decision tree structure whose purpose is to determine, based on the weather, whether a user will go outside to play. The root node at the top holds the mining target, Outlook; the middle nodes represent weather attributes; each branch represents a judgment rule; and the bottom leaf nodes represent the data category, “yes” or “no.”

Of these, ID3 is the earliest decision tree algorithm based on an attribute selection criterion; it uses information gain for feature selection. When determining the best split point, the information gain of each attribute is computed, and the attribute with the greatest information gain is chosen as the best segmentation criterion. Information gain measures the degree of uncertainty associated with an attribute: the higher the information gain and the smaller the uncertainty after splitting on the attribute, the more suitable the attribute is as a segmentation node. Because the average path from nonleaf nodes to descendant leaf nodes is then the shortest, the average depth of the generated decision tree is minimized, increasing the accuracy of the classification tree.

We define the dataset according to the sample classification; the information entropy of the dataset W with respect to a classification into k states is defined as follows:
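In standard form, assuming \(p_i\) denotes the proportion of samples of W belonging to class \(i\), this entropy can be written as
\[
\mathrm{Info}(W) = -\sum_{i=1}^{k} p_i \log_2 p_i .
\]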

The entropy value of W reflects the stability of the system: the larger the entropy, the more unstable the system; when the stability is poor, the system is more chaotic, and vice versa.

If F is chosen as the best segmentation attribute, the dataset D is divided into subsets corresponding to the values of F, each associated with a child node. The entropy of D after it is split on the test attribute F is then defined as follows:
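A standard form of this split entropy, assuming attribute \(F\) partitions \(D\) into subsets \(D_1, \dots, D_v\), is
\[
\mathrm{Info}_F(D) = \sum_{j=1}^{v} \frac{|D_j|}{|D|}\, \mathrm{Info}(D_j).
\]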

When the weight values in the dataset change, the smaller the entropy of the samples in a subset, the higher the purity of the division; the expected information of the subsets is then as follows:

The probability of each sample attribute can be obtained from the above formula, and the information gain obtained by splitting on a sample attribute is expressed as follows:
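In the standard ID3 formulation, this gain is the reduction in entropy achieved by splitting on \(F\):
\[
\mathrm{Gain}(F) = \mathrm{Info}(D) - \mathrm{Info}_F(D).
\]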

In the ID3 algorithm, the division of attributes during decision tree construction is a recursive process: every attribute in the dataset is eventually evaluated and participates in building the tree. The recursion stops, and the decision tree construction is complete, when there are no attributes left to compute or when the depth of the decision tree or the number of attribute divisions reaches a preset threshold.

The amount of information required to make a correct class judgment is as follows:

The ID3 method evaluates each attribute, chooses the attribute A with the greatest information gain (mutual information), creates a decision tree node for it, and creates branches according to the different values of the attribute, repeating this process until the instances of a subset all fall under the same category.
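As an illustration only, the following Python sketch implements the selection step just described; the record layout, the attribute names, and the class labels are assumptions made for this example rather than the paper’s implementation.

```python
import math
from collections import Counter

def entropy(labels):
    """Info(D): entropy of the class labels in a dataset."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def info_gain(rows, labels, attribute):
    """Gain(F): entropy reduction from splitting on one attribute."""
    total = len(labels)
    split_entropy = 0.0
    for value in set(row[attribute] for row in rows):
        subset = [lab for row, lab in zip(rows, labels) if row[attribute] == value]
        split_entropy += (len(subset) / total) * entropy(subset)
    return entropy(labels) - split_entropy

def best_split_attribute(rows, labels, attributes):
    """ID3 selection step: pick the attribute with the largest information gain."""
    return max(attributes, key=lambda a: info_gain(rows, labels, a))

# Hypothetical employee records and class labels, invented for illustration.
rows = [{"education": "bachelor", "tenure": "5-8"},
        {"education": "college",  "tenure": "0-4"},
        {"education": "master",   "tenure": "5-8"},
        {"education": "college",  "tenure": "5-8"}]
labels = ["high", "low", "high", "low"]
print(best_split_attribute(rows, labels, ["education", "tenure"]))
```

In this toy example, splitting on education separates the labels perfectly, so it yields the larger gain and would become the next split node.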

The ID3 algorithm has both benefits and drawbacks. Its benefits are as follows: the decision result can automatically generate a rule set, the decision-making process is simple and easy to understand, the algorithm can handle both numeric and general attributes, and little data preprocessing is needed early in its application. Because it constructs the decision tree by maximizing information gain, the number of nodes stays small and the discrimination accuracy is high. Its main drawback stems from using information gain as the criterion for attribute judgment: when the dataset differences between branches are small, the effect is good, but when those differences are large, the calculation of information gain tends to favor attributes with more branches, the so-called multivalue preference problem.

The accuracy of the classifier is evaluated with the holdout method on a separate dataset: the data are divided into two groups, two-thirds are used to build the decision tree, and the remaining third is classified by it; the resulting correct rate is taken as the accuracy of the decision tree [16]. The holdout method is shown in Figure 3.
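A minimal sketch of such a holdout evaluation, using scikit-learn and synthetic data in place of the paper’s employee records (the feature generation and the two-thirds/one-third split here are assumptions for illustration):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Synthetic stand-in for employee records; the real study uses HR data.
X, y = make_classification(n_samples=600, n_features=6, random_state=0)

# Holdout: two-thirds to build the tree, one-third to measure accuracy.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=1/3, random_state=0)

tree = DecisionTreeClassifier(criterion="entropy", random_state=0).fit(X_train, y_train)
print("holdout accuracy:", accuracy_score(y_test, tree.predict(X_test)))
```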

2.2.2. GBDT Algorithm

The gradient boosting decision tree algorithm, referred to as the GBDT algorithm, mainly uses decision trees to classify data, and the data on the same node are classified into the same category. A GBDT model is generally composed of hundreds of regression trees, but each constructed decision tree is relatively shallow [17]. When the GBDT algorithm makes predictions, each input data sample passes through every decision tree, the model’s output is revised once per tree, and the prediction result is then obtained. In the gradient boosting decision tree, ensemble learning is performed through the iterative boosting method, that is, learning proceeds serially through multiple decision trees.

First, we define the model and get the following:

After each iteration, the error in the system data is further reduced. Gradient descent is an effective method to reduce the error, and the gradient descent is applied in the iterative process to obtain the negative gradient of the loss function as follows:
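A standard expression for this negative gradient, assuming \(L\) is the loss function and \(F_{g-1}\) is the model after \(g-1\) iterations, is
\[
r_{gi} = -\left[\frac{\partial L\big(y_i, F(x_i)\big)}{\partial F(x_i)}\right]_{F(x) = F_{g-1}(x)}.
\]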

When the fitting target in the above process is this negative gradient direction, which is a continuous floating-point quantity, it can be fitted using the mean square error as follows:

The error is reduced by the mean square error, and then, the Taylor expansion of the loss function is performed to obtain the following:

If the mean squared error function is considered to have fitted a negative gradient, then

Combining the above formula, it can be obtained as follows:

After the gth iteration of the model, the UI and the optimal Rg can be obtained as follows:

For the above formula, we first define some symbols and use the Taylor expansion method to get the following:

Finally, the second-order Taylor expansion is performed to solve the following:

Finally, the fitted extreme points of this function are obtained as follows:

Through the above solution, the gradient descent method can reach the convergence point in a relatively short time and reduce the fitting error.
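To make the boosting iteration concrete, the following is an illustrative Python sketch of gradient boosting with squared-error loss, in which the negative gradient reduces to the residual and each shallow regression tree is fitted to it; the data, learning rate, and tree depth are assumptions for this example, not the paper’s configuration.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def fit_gbdt(X, y, n_trees=100, learning_rate=0.1, max_depth=3):
    """Minimal gradient boosting for squared-error loss: each shallow tree
    is fitted to the current residuals (the negative gradient)."""
    prediction = np.full(len(y), y.mean())   # initial constant model
    trees = []
    for _ in range(n_trees):
        residuals = y - prediction           # negative gradient of 1/2 * (y - F)^2
        tree = DecisionTreeRegressor(max_depth=max_depth).fit(X, residuals)
        prediction += learning_rate * tree.predict(X)
        trees.append(tree)
    return y.mean(), trees

def predict_gbdt(model, X, learning_rate=0.1):
    base, trees = model
    return base + learning_rate * sum(t.predict(X) for t in trees)

# Toy data standing in for numeric HR features and an evaluation score.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = X[:, 0] * 2.0 + rng.normal(scale=0.1, size=200)
model = fit_gbdt(X, y)
print("training MSE:", np.mean((predict_gbdt(model, X) - y) ** 2))
```

Because each tree only corrects the residual error of the current ensemble, the error shrinks gradually over the iterations, which mirrors the serial boosting process described above.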

2.2.3. Bayesian Network

Different risks may materialize in the human resource management system. The Bayesian network, which combines a graphical framework with a probability model, forms the foundation for risk assessment in human resource management. The Bayesian formula is used to revise the prior probability and calculate the posterior probability based on the internal causal relationships between node variables. Finally, the prior probability distribution of the parent nodes and the conditional probability distributions of the node variables at each intermediate level can be used to determine the probability value of the required node variable. The Bayesian network topology takes into account the role of various risk factors in controlling human resources, their causes, and the likelihood that the risk factors will materialize. Finally, the degree to which the factors of various risk events influence the overall risk is obtained [18].

A Bayesian network is a complex network of nodes and connections in which each node is a random variable. A variable can represent some kind of phenomenon, state, or attribute, and the state of the node corresponds to the probability of its occurrence. Between nodes, directed connecting arcs represent dependency or causality; if there is no arc between the variables represented by two nodes, the variables are conditionally independent.

Given the conditional independence relations among the node variables, the joint probability can be obtained according to the chain rule as follows:
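In standard Bayesian network notation, assuming node variables \(X_1, \dots, X_n\) with \(\mathrm{pa}(X_i)\) denoting the parents of \(X_i\), the chain rule factorization is
\[
P(X_1, X_2, \dots, X_n) = \prod_{i=1}^{n} P\big(X_i \mid \mathrm{pa}(X_i)\big).
\]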

The posterior probability can be obtained according to the set of conditional probability distributions of nodes in the Bayesian network and the chain rule as follows:
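For a queried risk variable \(X\) and observed evidence \(E\) (symbols chosen here for illustration), the posterior follows Bayes’ formula:
\[
P(X \mid E) = \frac{P(E \mid X)\, P(X)}{P(E)}.
\]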

In the Bayesian network structure, when the probability of a newly observed event becomes available, the original probability is revised and the posterior probability is finally obtained [19]. Figure 4 shows a Bayesian network model of human resource management risk assessment.
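As a toy illustration of this prior-to-posterior revision, consider a single hypothetical risk node (“staffing risk”) with one evidence node (“high turnover observed”); all probabilities below are invented for the example and are not taken from the paper.

```python
# Hypothetical two-node risk model: Risk -> Turnover.
p_risk = 0.20                      # prior P(staffing risk)
p_turnover_given_risk = 0.70       # P(high turnover | risk)
p_turnover_given_no_risk = 0.10    # P(high turnover | no risk)

# Evidence: high turnover is observed. Apply Bayes' formula.
p_turnover = (p_turnover_given_risk * p_risk
              + p_turnover_given_no_risk * (1 - p_risk))
posterior = p_turnover_given_risk * p_risk / p_turnover
print(f"posterior P(risk | high turnover) = {posterior:.3f}")
```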

2.3. AHP to Determine the Weight of the Indicator System

The analytic hierarchy process (AHP) combines qualitative and quantitative analysis, decomposes a problem into several levels, and analyzes it step by step at a simpler level, thereby stratifying and quantifying decision-making problems [20]. The reliability and validity of the evaluation can be greatly improved through quantitative analysis of the questions, so the method is well suited to determining the weights of effect evaluation indicators. According to the goal to be achieved, the elements involved are divided into different levels, generally three: the top layer is the target layer, the middle is the criterion layer, and the bottom is the scheme layer [21]. Figure 5 shows the relationship between the hierarchical structure and its factors.

A judgment matrix is formed by comparing the elements of the upper and lower levels one by one, and its largest characteristic root is denoted W. Because the matrix is built from people’s subjective perceptions and qualitative judgments, some inaccuracies are inevitable, so it must undergo the necessary consistency test; the procedure is as follows:

Here, m is the order of the judgment matrix and CI is the consistency index; the two are positively correlated. For the same m, the larger the CI, the greater the inconsistency of the judgment matrix; when CI = 0, the judgment matrix is completely consistent.
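A standard definition of the consistency index, using the notation above (W for the largest characteristic root and m for the order of the judgment matrix), is
\[
CI = \frac{W - m}{m - 1}.
\]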

We calculate the composite weight vector for the overall decision target between various levels, assuming that the entire level is divided into three layers: the first layer has one element, and the second and third layers have k and elements, respectively. Then, the weights of the second layer and the third layer to the first layer and the second layer can be obtained as follows:

We calculate the weight between each level according to the above formula, rank the elements within each level by importance, and select the best. Figure 6 shows a structural diagram of the AHP.

Using a hierarchical approach to quantify the role of human resource control can overcome the limitations of individual judgment, and the evaluation result is more rational and objective [22]. At the same time, the analysis of these measurements can serve as an analysis of concrete results and can provide technical and practical advice for future work.

The indicator system includes four levels: human resource management tasks, staffing and behavior, program results, and performance. Different levels contribute differently to the implementation of the process and therefore produce different results. To calculate the implementation impact of the human resource management system, it is necessary to determine the weights for each level and each indicator [23].

The AHP divides complex decision-making problems into multiple factors and decomposes them into multiple levels. Pairwise comparisons are used to determine the weight of each factor within the same layer, so that the factors can be ranked. Finally, qualitative and quantitative analysis is carried out by decision-makers to obtain the best decision-making scheme. The AHP makes the decision-making process more mathematical while requiring relatively little quantitative data, and it provides a simple decision-making method for complex problems that are multiobjective, multicriteria, or unstructured in nature. Its entire decision-making process reflects how people decompose, judge, and synthesize decisions.
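The following is an illustrative Python sketch of this procedure: weights are derived from a pairwise judgment matrix via its principal eigenvector, and the consistency index and ratio are computed as described above. The example matrix is invented for illustration, and the random-index table contains standard textbook values rather than figures from this paper.

```python
import numpy as np

# Standard average random index (RI) values for matrix orders 1..9.
RI = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}

def ahp_weights(judgment_matrix):
    """Weights from the principal eigenvector, plus consistency index and ratio."""
    A = np.asarray(judgment_matrix, dtype=float)
    m = A.shape[0]
    eigvals, eigvecs = np.linalg.eig(A)
    k = np.argmax(eigvals.real)                  # principal eigenvalue
    weights = np.abs(eigvecs[:, k].real)
    weights /= weights.sum()
    lam_max = eigvals[k].real
    ci = (lam_max - m) / (m - 1)                 # consistency index
    cr = ci / RI[m]                              # consistency ratio
    return weights, ci, cr

# Hypothetical 3x3 judgment matrix comparing three criteria pairwise.
A = [[1,   3,   5],
     [1/3, 1,   2],
     [1/5, 1/2, 1]]
weights, ci, cr = ahp_weights(A)
print("weights:", np.round(weights, 3), "CI:", round(ci, 4), "CR:", round(cr, 4))
```

A consistency ratio below 0.1 is conventionally taken to mean the judgment matrix is acceptably consistent; otherwise the pairwise comparisons should be revised.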

3. Experiment on Resolving and Implementation of Strategic Human Resource Management Evaluation Algorithm Based on Data Mining Technology

3.1. Implementation of Decision Tree in Human Resource Management Evaluation

When conducting the analysis, pertinent database data are used to examine elements such as an employee’s title, gender, level of education, and age. Different levels contribute differently to the execution of a strategy and produce different outcomes, so weights must be set for each level and each indicator in order to evaluate the implementation effect of strategic human resource management. The decision tree ID3 algorithm is used to establish an employee classification model after data selection, data cleaning, data induction, data conversion, and other preprocessing steps, and classification rules are then derived. Organization, work, and individual are the three components that make up the subject of strategic human resource management. A decision tree is constructed using the ID3 algorithm to categorize employees based on their educational background, identity, and other factors. The basic information statistics are shown below.

As can be seen from Table 1, the company has a total of 3700 employees. From the analysis of educational level, the number of people with a master’s degree or above is relatively small, 97 in total, accounting for 2.62% of the total. The number of people below the junior college level is the largest, accounting for 62.7% of the total, indicating that the company has a large number of middle-level and grassroots workers. There are 753 people with a junior college degree and 530 with a bachelor’s degree, accounting for 20.35% and 14.32%, respectively. These data reflect the distribution across ordinary positions, which helps avoid serious representativeness problems.

According to the length of service of employees in Table 2, the number of employees who have worked for 5–8 years is the largest, with 1130 employees, accounting for 30.54% of the total number. The above parameters are helpful for studying the comprehensive composition of the enterprise’s human resource management system. After analysis, the manpower situation of the enterprise can be systematically understood.

Combined with the classification function of the decision tree, the learning and growth aspects of each employee are analyzed, and appropriate factor analysis is carried out according to the unit matrix obtained by classification. This paper extracts six factors of employee learning and growth for analysis, namely, employee satisfaction, employee engagement, organizational learning, organizational ability, organizational culture, and work environment, and analyzes the status of each employee according to the factors.

It can be seen from Figure 7 that the variance contribution rates of the factors of employee learning and growth differ before and after the decision tree classification. The results show that after deleting a single similar factor, the variance contribution rate of each factor influencing human resource management evaluation increases to a certain extent: employee engagement and organizational culture rose by 2 percent, employee satisfaction, organizational learning, and organizational capability each rose by 1 percent, and the variance contribution rate of the work environment was unchanged. Therefore, the decision tree algorithm in data mining technology is effective for strategic human resource management evaluation.

3.2. Performance Test of Decision Tree Algorithm in Data Mining

In the strategic human resource management system, random datasets are parallelized and modeled using decision trees. Since a small amount of data cannot adequately support a performance test of human resource management, the original data are expanded to 100, 200, 300, 500, and 600 records for the experiments. First, the data scale is increased, and the operating efficiency of the ID3 algorithm and the GBDT algorithm is compared at the same data scale. Figure 8 shows the relationship between data size and the running time of the ID3 algorithm and the GBDT algorithm.

Figure 8 shows that, when tested on the same sampled data, the processing times of the two algorithms decrease as the number of samples increases. When the data sample is 50, the rate of the GBDT algorithm is 34.2 percent higher than that of the ID3 algorithm, with the ID3 algorithm running in 210 seconds and the GBDT algorithm in 415 seconds. When the sample data for the two algorithms are increased to 500, the ID3 algorithm reaches its fastest speed, and the GBDT algorithm also speeds up, all other conditions remaining the same. The GBDT algorithm maintains a relatively short running time and performs its operations with a higher level of efficiency.

Both of the aforementioned algorithms require communication between their internal parts in order to function. A random forest algorithm builds trees in parallel across many nodes, which significantly reduces the time required and flattens the growth of the algorithm’s running time. The speedup ratios of the ID3 algorithm and the GBDT algorithm are verified using sample datasets of different sizes, as shown in Figure 9.

It can be seen from Figure 9 that the algorithms integrated on the distributed cluster achieve a good speedup ratio, and that the speedup ratio is related to the dataset size and the cluster size. For a given dataset, as the data partitions in the cluster continue to grow, the partition speed also gradually increases. The GBDT algorithm follows a similar trend, but because it contains many serial steps, its speedup ratio does not increase as much. When the cluster size is 50, the speedup ratio achieved by the ID3 algorithm with the highest number of nodes is 4 and that of the GBDT algorithm is 3.2; when the cluster size is 100, the speedup ratio of the GBDT algorithm is 4.2, while the ID3 algorithm reaches 5, increasing faster than the GBDT algorithm.

4. Discussion

This paper uses data mining technology to classify and manage the data of the human resource management system, focusing on potential issues in the original system. We create a decision tree classification model to separate employees based on key algorithms such as the ID3 algorithm, the GBDT algorithm, and the Bayesian network. The Bayesian network model is created to quantify the impact of risk events in order to address potential risks to human resource management. The weights of the human resource management indicator system and the enhancement of the evaluation process are both assessed using the analytic hierarchy process. Testing the human resource management system’s performance and comparing the ID3 algorithm and the GBDT algorithm was found to improve system performance. The performance of the GBDT algorithm demonstrates the significance of data mining technology in resource management.

5. Conclusions

This article introduces three kinds of data mining algorithms in the methods section, with emphasis on the classification application of the decision tree algorithm. The ID3 algorithm and the GBDT algorithm are the core decision tree algorithms discussed; the ID3 algorithm builds a decision tree model by subdividing and dispersing the data. By applying the ID3 algorithm to the research on the strategic human resource management evaluation algorithm, the obvious influencing factors of the subdivided talent system can be seen intuitively. The GBDT algorithm is also a data classification algorithm; it mainly predicts the input data, revises the model according to the data, and then obtains the result. This paper uses the decision tree algorithms to test system performance and finds that, for the same amount of data, the GBDT algorithm is faster and more efficient. Comparing the speedup ratios of the two, it is found that the speedup ratio is related to the size of the dataset and the scale of the cluster: the larger the cluster, the higher the speedup ratio. Based on the classification produced by the decision tree, the factors and variance contribution rates of employee learning and growth before and after classification are analyzed, and it is found that the variance contribution rate of the human resource management evaluation factors increases to a certain extent. Although several data mining techniques are proposed in this article for realizing human resource management evaluation algorithms, the available data are still limited; if the approach is to be applied to large enterprises, there is still much room for improvement.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The author declares that there are no conflicts of interest.