Abstract

The mining and analysis of student achievement data is of great importance to teaching management. Taking data from college students’ physical fitness tests and sports performance as the object of research, a BP neural network model is developed to predict performance. Based on the BP neural network, an algorithm for predicting students’ endurance performance is proposed and applied to the sunshine long-distance intelligent sports testing system at Hangzhou Dianzi University. The nonlinear relationship between students’ performance in sunshine running and their endurance performance is determined, and students’ performance in sunshine running is used to predict their endurance performance the following year. Experimental results indicate that the accuracy of the model is above 85%. The prediction results are also combined with Internet of Things technology to produce a student sports prescription management system, which sets different sunshine running parameters for students with different predicted results and provides personalized sports prescriptions for students with different physical conditions. The system has extensive and far-reaching application value.

1. Introduction

In light of the advancement of information technology, teachers are obligated to investigate new ways of learning and teaching methods, as well as the creation of digital campuses [1]. All teachers are faced with the challenge of utilizing information technology in a reasonable, appropriate, and effective manner to address some issues in conventional teaching. In China, many colleges and universities are creating digital campuses and, in recent years, an increasing number of primary and secondary schools are also actively promoting the establishment of digital campuses [2]. Nowadays, digital campus construction has become a synonym for information education, a new development direction in school construction, and a symbol of school running level and conditions. Numerous schools have increased office funding, equipped teachers with office machines, and are gradually realizing the paperless office. With the expansion of the paperless office, school departments have accumulated a great deal of information, which is increasingly stored on the server. How to extract and locate useful information from such vast amounts of data to serve as a resource for education and teaching is a new problem that must be resolved immediately. In fact, this problem has been studied for quite some time. As early as 1995, a new technology for data analysis was proposed: data mining. After 15 years of research, data mining technology has rapidly evolved and is now widely used in various industries. Various data mining software systems have been developed, resulting in substantial economic benefits for society [3]. As an important discipline of modern nonlinear science, neural networks have undergone rapid development in recent years and are now widely used in aerospace, automatic control, construction, medicine, the military, electronics, and machinery, among other industries, with significant breakthrough results. 
However, fewer applications exist in sports science research, and these are mainly in sports mechanics and biochemistry.

With the advancement of computer and network technology, the popularization of student information management systems, teaching management systems, and teaching platforms, and the rapid growth of campus card data, schools have accumulated large amounts of data, yet there have been only a few successful cases of combining data mining technology with education. At present, data use is still limited to simple query and selection, and the processing of students’ scores often involves only simple operations such as the average score, highest score, and mean square deviation, so the relationships between scores cannot be found [4]. For example, the factors that influence students’ performance, and which of them are key, have not been thoroughly explored; furthermore, the future development trend of students cannot be predicted from their current performance and behavior. Student performance is an important indicator of teaching quality. If students’ past performance and other information could be used to predict their future trend accurately, the way students are cultivated could be improved and teaching quality raised [5]. In practice, however, the hidden value of these teaching data has not been fully exploited. Educational data mining helps address this underuse of stored data: it applies data mining technology to the data generated during the education process in order to discover the teaching patterns and knowledge hidden within them and to serve as a resource for teaching. Educational data mining methods can be roughly divided into classification, clustering, regression, association rules, and so on [6].
A neural network is an information processing system that abstracts, simplifies, and simulates the structure, function, and a number of basic characteristics of the biological neural network of the human brain. It is a complex network system composed of a large number of very simple processing units (neurons) that are widely interconnected, and it reflects many fundamental properties of human brain function. Compared with digital computers, neural network systems are capable of collective computation and adaptive learning.

At present, the main application of data mining in universities is the analysis of students’ scores, but this analysis is limited to queries of the entered scores and some conventional statistics, such as the average score, pass rate, excellent rate, and highest score [7]. Such analyses cannot find the relationships and rules in the data and cannot predict students’ performance and future development trends from the existing data. For a school, student performance assessment is an important basis for evaluating teaching quality; if the factors that influence results can be identified from grades and other available student data, and some predictions about student development can be made, this will greatly help to improve the cultivation of students and strengthen teaching quality [8]. In order to further study the physical fitness of college students and the influence of physical education and extracurricular physical exercise on students’ overall physical fitness, a learning sample was constructed to exploit the strong approximation ability of neural networks, and a neural-network-based physical fitness assessment model for male college students was established. This is shown in Table 1.

2. State of the Art

Sports prediction refers to the forecaster’s prediction and inference of various unknown factors in the field of sports, and it serves as a crucial foundation for decision makers. The predicted data can be used by decision makers to offer scientific advice to sports practitioners regarding sports health [9].

Jin used the expert judgment method to analyze the strength of the major nations’ divers at the Olympic Games in Sydney in 2000 and to predict China’s diving results [10]. The qualitative prediction method plays an irreplaceable role in the quantitative resolution of the problem, but it cannot effectively reveal the relationship between the factors in each system [11].

Liang used the independent and dependent variables of athletes’ physical performance to establish a single regression equation to predict the results [12]. Results can be predicted quantitatively, but the method of linearizing nonlinear problems affects the prediction accuracy to some extent [13].

Zhao et al. used the grey system model to predict the women’s freestyle national record [14]. It offers a good solution to a large number of grey problems in the field of sports, but the model requires a small sample size and has limitations when dealing with large sample data [15].

After more than ten years of development, the research on data mining has developed from the initial surface and isolated problems to the systematic and comprehensive direction. Generally speaking, the research on data mining is mainly focused on three aspects: data mining technology and algorithm, data mining theory, and data mining application. Many school researchers have started to study data mining technology used in teaching and management of schools as it has matured and expanded in the field. Examples include application of the student information management, analysis of teaching evaluation in colleges and universities, student achievement and exam system, and enrollment system. These applications have all played significant roles in raising the level of school teaching management [16]. This paper focuses on the application of data mining technology. RBF networks have strong function approximation capabilities, making them a potential and effective modeling tool for complex functions and nonlinear phenomena in in vitro research projects.

In this study, the application of data mining technology to grade evaluation is discussed in depth using students’ test results, and the results are analyzed and predicted with neural network technology, which will provide a basis for teachers’ future teaching work [17]. A physical fitness assessment model for male college students is also established in order to provide a reference for applied research in sports-related fields. The specific research framework of this paper is shown in Figure 1.

3. Methodology

3.1. Introduction to Principal Component Analysis

When too many inputs are fed to the front end of a BP neural network, training becomes slower and less efficient and the network converges poorly; therefore, input preprocessing is required. In order to simplify the data while preserving the accuracy of the results, this paper uses principal component analysis, a dimension reduction method, to explain the multivariate variance with a small number of principal components while retaining as much of the original variable information as possible [18].

The total mileage, frequency, and average speed of boys’ sunshine running in the second half of 2014 were taken as the inputs $X_1$, $X_2$, and $X_3$, respectively, denoted as $X = (X_1, X_2, X_3)^T$. Let $\Sigma$ be the covariance matrix of $X$, with eigenvalues $\lambda_1 \ge \lambda_2 \ge \lambda_3 \ge 0$ and corresponding orthonormal eigenvectors $e_1$, $e_2$, and $e_3$. The $i$-th principal component of $X$ is as follows:

$$Y_i = e_i^T X, \quad i = 1, 2, 3.$$

The contribution rate of the $k$-th principal component $Y_k$ is $\lambda_k / \sum_{i=1}^{3} \lambda_i$. The calculated eigenvalues are 2.0829, 0.9011, and 0.0160, respectively. The corresponding eigenvectors and variance contribution rates are shown in Table 2.

The contribution rate of the first principal component is the largest, indicating that it has the strongest ability to synthesize the information contained in the original variables. Since the cumulative variance contribution rate of the first two principal components is 0.9947, well above 90%, two principal components are selected. They replace the original variables, namely, total mileage, frequency, and average speed, which reduces the dimension and speeds up model training without losing much of the information in the original variables. The principal component scores, obtained by multiplying the eigenvectors by the original sample data, are shown in Table 3 [19].
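The selection rule described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the eigenvalues are the three reported in the text, while the function names, the 0.90 threshold argument, and the use of the correlation matrix of standardized variables in `pca_scores` are assumptions made for the example (the reported eigenvalues sum to 3, which is consistent with standardized inputs).

```python
import numpy as np

# Eigenvalues reported in the text for total mileage, frequency, and average speed.
eigenvalues = np.array([2.0829, 0.9011, 0.0160])

def contribution_rates(eigvals):
    """Return per-component and cumulative variance contribution rates."""
    rates = eigvals / eigvals.sum()
    return rates, np.cumsum(rates)

def n_components_for(cumulative, threshold=0.90):
    """Smallest number of components whose cumulative rate reaches the threshold."""
    k = int(np.searchsorted(cumulative, threshold)) + 1
    return min(k, len(cumulative))

def pca_scores(samples, k):
    """Project standardized samples onto the top-k eigenvectors.

    Assumption: variables are standardized, so the correlation matrix is used.
    """
    z = (samples - samples.mean(axis=0)) / samples.std(axis=0, ddof=1)
    vals, vecs = np.linalg.eigh(np.corrcoef(z, rowvar=False))  # ascending eigenvalues
    order = np.argsort(vals)[::-1][:k]                         # largest first
    return z @ vecs[:, order]

rates, cumulative = contribution_rates(eigenvalues)
k = n_components_for(cumulative)
print(k, round(float(cumulative[1]), 4))
```

With the reported eigenvalues, the first two components account for about 0.9947 of the variance, so the rule keeps two components.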

Through the principal component analysis of the factors related to learning difficulties of motor skills of students, the results show that the factors related to learning difficulties of motor skills of college students are mainly composed of physical, psychological, and other factors. The main components of related physical factors are sports quality factors (sensitivity and speed), body shape factors (weight, etc.), and body function factors (heart rate). The main components of related psychological factors are intelligence factor, physical symptoms, learning anxiety, anxiety about people, allergic tendency, and impulse factor. The principal components of other related factors are composed of sports achievement factor, sports attitude factor, and cultural achievement factor. Therefore, educators should fully understand and consider the main factors affecting students’ motor skills learning disabilities in the process of PE teaching and help students overcome the motor skills learning disabilities, so that students can strengthen their physique, complete their studies, and improve their sports performance. At the same time, sports and psychological workers should also pay attention to the theoretical research on motor skills learning disabilities, so as to enrich and perfect the theory of learning disabilities [20].

3.2. Model Building Based on BP Neural Network Algorithm

An artificial neural network is an information processing technology that imitates the neural network of the human brain. Artificial neural networks are suitable for solving nonlinear problems for which no suitable analytical model exists, and they have been widely used in the financial field, for example in stock prediction. The BP neural network has a strong nonlinear mapping ability. Theoretically, a BP network with three or more layers can approximate a nonlinear function with arbitrary accuracy as long as the number of hidden layer neurons is sufficient, so the experiment in this paper uses a BP neural network [21].

The neural network model of the hidden layer is in effect a linear or nonlinear regression model [22]. The number of nodes in the hidden layer depends not only on the number of input and output nodes but also on the complexity of the problem to be solved, the type of transfer function, and the characteristics of the sample data. The number of nodes in the hidden layer is calculated as follows:

$$N = \sqrt{n_i + n_0} + c,$$

where $N$ is the number of nodes in the hidden layer, $n_i$ is the number of input nodes, $n_0$ is the number of output nodes, and $c$ is a constant between 0 and 10.
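As a rough illustration of how such a rule narrows the search, the sketch below enumerates candidate hidden-layer sizes. It assumes the widely used empirical rule N = sqrt(n_i + n_0) + c with integer c from 0 to 10; the function name and the rounding to the nearest integer are choices made for this example only.

```python
import math

def hidden_node_candidates(n_in, n_out, c_min=0, c_max=10):
    """Candidate hidden-layer sizes from N = sqrt(n_in + n_out) + c (assumed rule)."""
    base = math.sqrt(n_in + n_out)
    return sorted({round(base + c) for c in range(c_min, c_max + 1)})

# Two principal components as inputs and one endurance score as output.
candidates = hidden_node_candidates(2, 1)
print(candidates)
```

For two inputs and one output the candidates run from 2 to 12, so the final size still has to be chosen experimentally within that range.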

In this study, the activation functions of the hidden layer and output layer are both log-sigmoid (logsig) S-type transfer functions:

$$f(x) = \frac{1}{1 + e^{-x}}.$$

The sigmoid function is differentiable and strictly increasing and strikes a good balance between linear and nonlinear behavior, so it can realize a nonlinear mapping between input and output. It is suitable for medium- and long-term prediction and has the advantages of good approximation, fast calculation, and high precision. It also has a solid theoretical foundation, a rigorous derivation, a symmetric resulting formula, and strong nonlinear fitting ability. It reflects the saturation characteristics of neurons, and the range of the function can be set by researchers according to actual needs. When the magnitude of the input is small, the function has a relatively large gain; when the magnitude of the input is large, the gain is relatively small, which helps prevent the network from entering the saturation state.
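The gain behavior described above can be checked numerically. The sketch below is illustrative only: it implements the standard log-sigmoid and its derivative f(x)(1 − f(x)), and the function names are assumptions. The two-branch form is used purely to avoid overflow for large negative inputs.

```python
import math

def logsig(x):
    """Log-sigmoid 1 / (1 + exp(-x)), written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def logsig_gain(x):
    """Derivative (gain) of the log-sigmoid: f(x) * (1 - f(x))."""
    y = logsig(x)
    return y * (1.0 - y)

for x in (0.0, 1.0, 5.0):
    print(x, logsig(x), logsig_gain(x))
```

The gain peaks at 0.25 for an input of zero and shrinks toward zero as the input magnitude grows, which is why inputs are usually scaled into a narrow range before training.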

3.3. Introduction to BP Neural Network Learning Algorithm

An artificial neural network is an information processing system composed of many simple neurons joined by complex connections. Its function is usually determined by the connection mode between neurons. An artificial neural network thus simulates the structure and function of the neural network of the human brain and so has the ability to process information. Artificial neural networks are suitable for parallel distributed computation, nonlinear mapping, adaptation, and integration, and they can deal with incomplete and noisy data.

The data preprocessed by the principal component analysis above (scaled to between −1 and 1) were taken as the input. The expected error was set as ε, and the network parameters are the connection weights between layers and the node thresholds θi and θt. The expected output of the nodes in the output layer was ti, η represented the step size (learning rate), k was the iteration count, and α(k) was the momentum factor, which was between 0 and 1. Training proceeds as follows: (1) the outputs of the hidden layer nodes and output layer nodes of the BP neural network are calculated; (2) the weight errors of the connections to the output layer and hidden layer nodes are calculated; (3) the connection weights and thresholds of the BP neural network are updated; (4) samples of a new period are input until the network error reaches the predetermined requirement, with the samples in each period presented in random order during training.

The BP algorithm is iterative: the process is repeated until the error satisfies the requirements and network training succeeds.
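The training loop described in this section can be sketched as follows. This is a generic single-hidden-layer BP network with a momentum term, written under stated assumptions rather than a reproduction of the authors' MATLAB model: the weight initialization, the values of η and α, the use of additive biases in place of thresholds, the mean squared error as the stopping measure, and the synthetic data are all hypothetical choices for illustration.

```python
import numpy as np

def logsig(x):
    """Log-sigmoid transfer function, applied elementwise."""
    return 1.0 / (1.0 + np.exp(-x))

def forward(params, X):
    """Return hidden-layer and output-layer activations."""
    W1, b1, W2, b2 = params
    hidden = logsig(X @ W1 + b1)
    return hidden, logsig(hidden @ W2 + b2)

def train_bp(X, t, n_hidden=7, eta=0.1, alpha=0.5, eps=4e-4,
             max_epochs=1500, seed=0):
    """Online backpropagation with momentum; stops at error eps or max_epochs."""
    rng = np.random.default_rng(seed)
    n_in, n_out = X.shape[1], t.shape[1]
    params = [rng.uniform(-0.5, 0.5, (n_in, n_hidden)), np.zeros(n_hidden),
              rng.uniform(-0.5, 0.5, (n_hidden, n_out)), np.zeros(n_out)]
    steps = [np.zeros_like(p) for p in params]      # previous updates (momentum)
    history = [float(np.mean((t - forward(params, X)[1]) ** 2))]
    for _ in range(max_epochs):
        for i in rng.permutation(len(X)):           # random sample order per epoch
            x, target = X[i], t[i]
            h, y = forward(params, x)
            delta_out = (target - y) * y * (1.0 - y)             # output-layer error
            delta_hid = (params[2] @ delta_out) * h * (1.0 - h)  # hidden-layer error
            grads = [np.outer(x, delta_hid), delta_hid,
                     np.outer(h, delta_out), delta_out]
            for p, s, g in zip(params, steps, grads):
                s *= alpha          # momentum: keep a fraction of the last update
                s += eta * g        # add the new gradient step
                p += s              # apply the update in place
        history.append(float(np.mean((t - forward(params, X)[1]) ** 2)))
        if history[-1] <= eps:
            break
    return params, history

# Synthetic stand-in data: two inputs in [-1, 1], one target in (0, 1).
rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, (60, 2))
t = (0.2 + 0.6 * logsig(2.0 * X[:, 0] - X[:, 1])).reshape(-1, 1)
params, history = train_bp(X, t, max_epochs=200)
print(history[0], history[-1])
```

Both error terms are computed before any weight is changed, so the hidden-layer error uses the same output weights as the forward pass. The first entry of `history` is the error before training, which makes it easy to see how far the loop has reduced it.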

4. Result Analysis and Discussion

4.1. Student Performance Analysis and Sample Data Collection

In order to verify the correctness of the neural network model and to reflect the physical fitness of college students more objectively, supervisors at Hangzhou Dianzi University conducted research on 5391 male students in the first and second years. In comparison with the students’ physical health test data in 2019, it was found that 80% of students who participated in sunshine running improved their endurance quality. As shown in Figure 2, among the 5391 male students in the first and second years of Hangzhou Dianzi University, 19% failed in endurance performance, and only 9% scored more than 80 points. Figure 3 shows that among the 970 students who failed in endurance performance, 816 had run fewer than 32 times, while 233 students who scored more than 80 points had run more than 32 times. This indicates a strong correlation between students’ daily sunshine running and their endurance scores in the final physical fitness test (800 m for girls and 1000 m for boys), as shown in Figures 2 and 3.

Due to differences in physiology and physical fitness between male and female students, the current national physical training standard for college students’ endurance quality is 1000 m for men and 800 m for women. This paper takes male students’ data as the main object of analysis, and 1100 male students in grade one and grade two at Hangzhou Dianzi University were selected as the research subjects. The data include the frequency, total mileage, average speed, and endurance performance at the end of 2014, as well as the performance of 100 male students in March and April 2019.

Data preprocessing refers to the process of deriving valuable and meaningful data from the disorganized and difficult data obtained. This paper discusses the influence of student achievement and student behavior information data on students’ graduation achievement, and the hidden valuable information between them.

4.2. BP Neural Network Training Simulation Results

In this paper, the number of hidden layer nodes and the number of neural network iterations are determined experimentally. The experimental results are shown in Figure 4. With the other parameters held fixed, the number of hidden layer nodes is adjusted. After many simulations and rounds of debugging, the average output error over multiple experiments is compared to determine the optimal number of nodes. As shown in the figure, when the number of hidden layer neurons exceeds 6, the prediction error of the neural network is within 10%, and further increasing the number of hidden layer nodes does not greatly change the prediction accuracy, so the number of hidden layer nodes is set to 7 in this paper.

Next, the other parameters are fixed and the number of iterations is adjusted to obtain the relationship between the number of iterations and the prediction accuracy. The figure shows that when the maximum number of learning iterations reaches 1500, the network achieves a higher prediction accuracy. In this experiment, therefore, the number of iterations is set to 1500, and the learning accuracy is 0.0004.

With all functions and parameters determined, a neural network is created from the inputs and outputs and trained to predict students’ endurance performance in the 2019 physical fitness test. The training results and analysis, obtained with MATLAB, are shown in Figure 5.

Then, the error percentage is calculated, as shown in Figure 6, and the error rate is within 15%. It shows that it is feasible to use BP neural network algorithm to predict students’ endurance score in the final physical fitness test. Then, the data of 100 groups of boys’ sunshine running from March to April 2019 were input to predict their endurance performance, as shown in Figure 7. Therefore, it can be seen that in this random group of 100 students, some students are likely to fail or fail to achieve excellent endurance performance.
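One way to compute an error percentage of the kind reported here is sketched below. The text does not define the measure, so the relative absolute error used in this example is an assumption, as are the function names and the sample scores, which are made up purely to exercise the code.

```python
def error_percentages(predicted, actual):
    """Relative absolute error of each prediction, in percent (assumed definition)."""
    return [abs(p - a) / a * 100.0 for p, a in zip(predicted, actual)]

def share_within(errors, limit=15.0):
    """Fraction of predictions whose error percentage does not exceed the limit."""
    return sum(e <= limit for e in errors) / len(errors)

# Hypothetical endurance scores, for illustration only.
predicted = [66.0, 80.0, 72.0, 55.0]
actual = [60.0, 80.0, 75.0, 70.0]
errors = error_percentages(predicted, actual)
print(errors, share_within(errors))
```

Reporting the share of predictions inside a tolerance alongside the largest error gives a fuller picture than a single bound.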

The predicted results are shown in Table 4. To test the accuracy of the model more fully, the sample size was varied; the resulting error percentages are shown in Table 5. For different numbers of test samples, the model reaches more than 80% accuracy. Therefore, the BP neural network prediction model is highly feasible for applications in sports.

The experimental error data show that the error rate of the BP neural network without principal component processing is larger than that of the network with principal component processing, as shown in Table 6. The error rate for data processed by principal component analysis is less than 15%, whereas the error rate for data not processed by principal component analysis is less than 20%.

Building on the above prediction of students’ end-of-term endurance performance, this paper adds an extension function, namely, a student group management mode, to the original intelligent sunshine running software system and provides different sports prescription suggestions for different groups. The students who participated in sunshine running were first divided by gender, and the predicted results were then imported through the background database. The students were divided into five groups by predicted score: below 60 points, 60 to less than 70 points, 70 to less than 80 points, 80 to less than 90 points, and 90 points or above. Parameter adjustments and personalized guidance are provided to the different groups. For example, if the score is below 60, the minimum effective distance, running frequency, and total distance will be increased, so that these students practice more in their usual long-distance running, which will help improve their endurance performance.
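The grouping step can be sketched as a simple lookup. This example assumes the five bands are below 60, 60 to under 70, 70 to under 80, 80 to under 90, and 90 or above; the labels, function names, and student identifiers are hypothetical, and no prescription parameters are modeled because the text does not give their values.

```python
from bisect import bisect_right

# Lower edges of the upper four bands; scores below 60 fall into the first band.
BAND_EDGES = [60, 70, 80, 90]
BAND_LABELS = ["below 60", "60 to <70", "70 to <80", "80 to <90", "90 and above"]

def score_band(predicted_score):
    """Map a predicted endurance score to one of the five groups."""
    return BAND_LABELS[bisect_right(BAND_EDGES, predicted_score)]

def group_students(predictions):
    """Group student ids by band; predictions maps id -> predicted score."""
    groups = {label: [] for label in BAND_LABELS}
    for student_id, score in predictions.items():
        groups[score_band(score)].append(student_id)
    return groups

# Hypothetical predictions, for illustration only.
groups = group_students({"s01": 58.5, "s02": 60.0, "s03": 79.9, "s04": 93.0})
print(groups)
```

Using `bisect_right` places a score that sits exactly on an edge, such as 60, in the higher band, which matches the "60 to less than 70" reading of the ranges.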

Statistical methods can be used to compare the prediction results obtained with and without principal component analysis. First, the results are assumed to follow a normal distribution. The prediction results for 30 students are then extracted for each of the two methods, and the prediction errors for these 30 students are calculated for each method. The calculation gives a prediction error variance of 1.23 with principal component analysis and 3.12 without it, with a mean of 0 in both cases; the result is shown in Figure 8. As can be seen from the chart, after principal component analysis the mean prediction error remains unchanged but the variance is smaller, indicating that applying principal component analysis before neural network prediction is worthwhile.
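The comparison of the two error variances can be expressed as a variance ratio. The sketch below uses the two variances reported in the text; the function names are assumptions, and treating the ratio as an F statistic is a common approach for comparing two normal samples rather than a method the text itself names.

```python
from statistics import variance

def error_variance(errors):
    """Sample variance of a list of prediction errors (needs at least two values)."""
    return variance(errors)

def variance_ratio(var_a, var_b):
    """Ratio of the larger variance to the smaller one."""
    larger, smaller = max(var_a, var_b), min(var_a, var_b)
    return larger / smaller

# Variances reported in the text: 3.12 without PCA, 1.23 with PCA.
ratio = variance_ratio(3.12, 1.23)
print(round(ratio, 4))
```

The error variance without principal component analysis is about 2.5 times the variance with it. To judge significance, this ratio could be compared against an F distribution with 29 and 29 degrees of freedom, given 30 students per method.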

In this chapter, the BP neural network algorithm is used to establish the prediction model. For the activation function, the Sigmoid function and its derivatives are continuous and smooth, and the function is robust, so the Sigmoid function is selected. In order to improve training speed and sensitivity and to avoid the saturation zone of the Sigmoid function, the input data are generally required to lie between 0 and 1. The momentum term affects the performance of the gradient descent method, in particular the speed at which the parameters move toward their optimal values. Cross-validation was carried out on the training set; dividing the data many times greatly reduces the chance effects caused by a single random division, and training many times allows the model to accommodate all types of data, thereby enhancing its generalizability. The neural network model can better reflect the mapping relationship between the indicators of the physical fitness test evaluation of college students and the evaluation results and is a more reasonable physical fitness assessment model for male college students.

5. Conclusion

In this paper, a BP neural network is applied to physical test data to establish a statistical model of the relationship between long-distance running management parameters (such as average speed and total mileage) and endurance performance. Furthermore, applying the model in software enables evaluation and prediction of the quality and effect of students’ participation in sunshine sports and provides decision support for sports training guidance. This project has important theoretical and practical significance for carrying out mass sports scientifically and rationally, effectively improving students’ physique and endurance level, and developing sports teaching research and intelligent sports training systems.

Building on the theoretical foundation laid by scholars at home and abroad for predicting students’ grades, and using BP neural network technology, students’ future graduation grades are predicted in two ways: one uses only data on students’ past grades, and the other uses data on both past grades and student behavior. Comparing the two, the prediction accuracy obtained when past achievements and behaviors are both treated as influencing factors is higher than when only past achievements are used, and the effect is relatively ideal. This makes up for the low prediction accuracy of relying on previous achievements alone. In addition, the association rule algorithm is used to mine and analyze the relationships between courses, between courses and graduation scores, and between student behavior and graduation scores. The analysis shows that different courses are correlated to different degrees and that students’ course scores are affected by the order in which courses are arranged. The results of one or a few courses may affect the study of other courses, and students’ daily behavior has a certain impact on the final graduation result.

Data Availability

The labeled data set used to support the findings of this study is available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by the Basic Teaching Department of Hebei Women’s Vocational College and Sports Department of Cangzhou Normal University.