Abstract

Physics plays an irreplaceable role in cultivating the scientific literacy and creativity of middle school students. It occupies an essential position in middle school teaching and has long been a focus of both academia (academic concern) and the public (public concern). The relationship between academic and public concern for middle school physics has therefore become an important research issue. To study this relationship, this article proposes the Pearson correlation-linear prediction and ARIMA (autoregressive integrated moving average) (PCLA) framework, which consists of four steps. Firstly, PCLA obtains the public concern and academic concern for middle school physics. Secondly, the Pearson correlation between public concern and academic concern is analysed, and a linear model between the two is established. Thirdly, to better predict public concern, evaluation indices are used to choose the better of the ARIMA model and the SARIMA (seasonal ARIMA) model. Finally, academic concern is predicted from the result of the third step and the linear model. The results show that the PCLA framework reflects the relationship between academic concern and public concern for middle school physics and can accurately predict both.

1. Introduction

Middle school physics has received extensive academic and public attention. In academic research, scholars identify research hotspots and trends by analysing the volume and development of a topic across a large body of literature. This requires an accurate prediction of the number of published articles, that is, of academic attention. However, the long publishing cycle, slow updating, and other factors make academic attention difficult to predict, which motivates seeking solutions through big data.

In the Internet age, the number of Internet users has increased. According to CNNIC (China Internet Network Information Center), by December 2021 the total number of netizens in China had reached 1.032 billion and the Internet penetration rate had reached 73.0%. Online platforms have gradually replaced traditional channels as a reflection of public opinion [1], so netizens’ concerns can express the public’s concerns. Through search behaviour, the public leaves records on the Internet that show its attention to an event. For example, the physics class taught by the Shenzhou 13 astronauts at the Chinese space station attracted a great deal of attention. On the day of the class, the Baidu index of “space teaching” and “Tiangong class” reached 158519, the peak search volume for the two keywords. This suggests that the Baidu index can reflect public concern. Netizens enter keywords in Baidu and generate search records, and Baidu analyses the search volume of each keyword to form the Baidu index. Based on big data, the index is used to mine market demand, user features, and other aspects, and it provides a measure of public attention for all walks of life. In middle school physics, the needs and concerns of the public guide academic research. Studying the relevance between public concern and academic concern is therefore significant for the reform and development of middle school physics.

In recent years, some scholars have studied public concern. For example, Meng and Zhao [2] studied search behaviour and public attention on the Internet and found that it was feasible to use search data to measure public concern. By analysing the impact of public concern on environmental governance, Wu et al. [3] noted that public attention improved the regulation of and investment in environmental protection. Liu [4] found that public attention to government accounting information played a supervisory role and improved the quality of public information. Wu et al. [5] reported that public concern about haze in new media helped alleviate smog pollution. Ruan and Xiao [6] found that public attention positively affects technological innovation in enterprises under certain conditions. Zhang and Wang [7] studied public concern about historical celebrities and proposed that public attention can reflect their influence. Some scholars noted that public concern directly affects the prices of grain and livestock products [8, 9]. Ginsberg et al. [10] showed that public attention can predict disease outbreaks. Ripberger [11] proposed that the spread of infectious diseases can be tracked through public concern. Peng et al. [12] analysed public attention to vocational education and proposed that its public influence should be increased.

Research on public concern focuses on environmental protection, enterprise innovation, health care, and market regulation. Few scholars have studied the relationship between public concern and academic concern, although some have mentioned it in their research. For example, Wang [13] analysed the difference between public and academic attention to left-behind children and pointed out that the relationship between the two is complex. By studying the transmission of buzzwords, Sun [14] argued that public concern attracts academic concern. Xu and Xiao [15] proposed that public attention can improve the precision of expert decisions. Zhao et al. [16] studied public attention to “AI + education” and proposed strengthening research on intelligent education.

Summarizing existing studies, we can see that few studies on public concern involve the field of education and few scholars study the relationship between public concern and academic concern. Compared to previous studies, this article presents three main innovations:
(1) We propose a PCLA framework to research the relationship between public concern and academic concern in middle school physics.
(2) By constructing the prediction model, the trend of public concern and academic concern is effectively predicted.
(3) This study broadens the research ideas of physics in middle school.

3. PCLA Framework

The research process of this article is shown in Figure 1. Firstly, we extract the public and academic concerns of middle school physics (MSP). Secondly, the correlation between public attention and academic attention is analysed and a linear prediction model of academic attention is established. Thirdly, the prediction performance of the ARIMA model and the SARIMA model is compared to determine the better one. Finally, the public and academic attention to MSP is predicted.

3.1. Step 1: Obtaining the Academic Concern and Public Concern of MSP

Academic attention is determined by counting the number of articles published in academic journals of the CNKI (China National Knowledge Infrastructure). Public attention is obtained through the following steps:
(1) Count all the keywords of articles about middle school physics in CNKI.
(2) Get search keywords.
(3) Determine the Baidu index of search keywords and get the public attention of MSP.

3.2. Step 2: Prediction Model of Academic Concern in MSP
3.2.1. Correlation Analysis

The correlation coefficient shows the relationship between public attention and academic attention. It is calculated as follows:

$$r = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2}\,\sqrt{\sum_{i=1}^{n}(y_i-\bar{y})^2}} \tag{1}$$

Here, $r$ represents the correlation coefficient, $n$ denotes the sample size, $x_i$ represents the observed value of variable $x$, and $y_i$ represents the observed value of variable $y$. $\bar{x}$ and $\bar{y}$ represent the average values of $x$ and $y$, respectively. If the absolute value of $r$ is less than 0.5, there is no significant correlation. If the absolute value of $r$ is between 0.5 and 0.8, there is a significant correlation. If the absolute value of $r$ is between 0.8 and 1, the variables are highly correlated. A negative value of $r$ indicates a negative correlation, and a positive value indicates a positive correlation.
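The coefficient and the strength bands described above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function names `pearson_r` and `describe` and the sample values are assumptions made for the example and are not the paper's data. The text does not say which band the boundary value 0.8 belongs to, so the sketch places it in the "high" band.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Numerator: sum of products of deviations from the two means.
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    # Denominator: square root of the product of summed squared deviations.
    ss_x = sum((xi - mean_x) ** 2 for xi in x)
    ss_y = sum((yi - mean_y) ** 2 for yi in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / sqrt(ss_x * ss_y)


def describe(r):
    """Map r to the strength bands used in the text (0.8 counted as high)."""
    strength = abs(r)
    if strength < 0.5:
        label = "no significant correlation"
    elif strength < 0.8:
        label = "significant correlation"
    else:
        label = "high correlation"
    direction = "positive" if r > 0 else "negative" if r < 0 else "none"
    return label, direction


# Made-up illustrative values, not the paper's data.
index_values = [10.0, 12.0, 15.0, 19.0, 24.0]
paper_counts = [50.0, 54.0, 61.0, 70.0, 80.0]
r = pearson_r(index_values, paper_counts)
print(round(r, 3), describe(r))
```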

3.2.2. Linear Model

The multiple linear regression model studies the linear relationship between a dependent variable and two or more independent variables. In this article, it reflects the linear relationship between the academic attention and the public attention of middle school physics. The regression equation is as follows:

$$y = b_0 + m_1 x_1 + m_2 x_2 + \cdots + m_k x_k \tag{2}$$

Here, $y$ is the estimated value calculated from the independent variables $x_1, x_2, \ldots, x_k$, $b_0$ is a constant term, and $m_1, m_2, \ldots, m_k$ are called partial regression coefficients. The validation method is LOOCV (leave-one-out cross-validation), which is suited to small samples of about 10 observations [17]. The principle of LOOCV is that, given $n$ samples, the model is trained on $n-1$ samples and validated on the remaining one. This operation is carried out $n$ times to obtain $n$ models. Finally, the parameters of all models are averaged.
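The leave-one-out procedure described above can be sketched as follows. To keep the example short, it assumes a single independent variable rather than the several used in the article; `fit_line` and `loocv` are hypothetical helper names, and the data are made up for illustration.

```python
def fit_line(x, y):
    """Ordinary least squares for y = b0 + m1 * x; returns (b0, m1)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    m1 = sxy / sxx
    b0 = mean_y - m1 * mean_x
    return b0, m1


def loocv(x, y):
    """Leave-one-out cross-validation with parameter averaging.

    Fits n models, each on n-1 samples, and predicts the held-out sample.
    Returns the averaged (b0, m1) and the list of held-out predictions.
    """
    n = len(x)
    params, held_out_preds = [], []
    for i in range(n):
        # Drop sample i from the training data.
        x_train = x[:i] + x[i + 1:]
        y_train = y[:i] + y[i + 1:]
        b0, m1 = fit_line(x_train, y_train)
        params.append((b0, m1))
        held_out_preds.append(b0 + m1 * x[i])
    avg_b0 = sum(p[0] for p in params) / n
    avg_m1 = sum(p[1] for p in params) / n
    return (avg_b0, avg_m1), held_out_preds


# Made-up illustrative data with ten observations.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0, 19.9]
(avg_b0, avg_m1), preds = loocv(x, y)
print(round(avg_b0, 3), round(avg_m1, 3))
```

The held-out predictions can then be compared with the observed values to estimate the error of the model on unseen data.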

The model test indices used in this article include $R^2$ (goodness of fit), RMSE (root mean square error), and MAPE (mean absolute percentage error). Their formulas are as follows:

$$R^2 = 1 - \frac{\sum_{i=1}^{n}(y_i-\hat{y}_i)^2}{\sum_{i=1}^{n}(y_i-\bar{y})^2} \tag{3}$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(y_i-\hat{y}_i)^2} \tag{4}$$

$$\mathrm{MAPE} = \frac{100\%}{n}\sum_{i=1}^{n}\left|\frac{y_i-\hat{y}_i}{y_i}\right| \tag{5}$$

Here, $y_i$ is the real value, $\hat{y}_i$ denotes the predicted value, $\bar{y}$ is the mean of the real values, and $n$ represents the sample size. In formula (3), $R^2$ represents the goodness of fit of the model. The value range of $R^2$ is 0 to 1, and it can be used as a criterion to evaluate regression analysis in any scientific field [18]. In formula (4), RMSE measures the accuracy of prediction in terms of absolute error, expressed in the units of the data, so what counts as a small RMSE depends on the sample. The smaller the value, the higher the accuracy of the prediction. In formula (5), MAPE measures the relative error of the prediction as a percentage. In general, when the MAPE is less than 10%, the prediction accuracy is high [19].
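The three indices can be computed directly from paired lists of real and predicted values. The sketch below assumes the usual definitions of the three measures and reports MAPE in percent; the function names and the sample numbers are illustrative assumptions, not results from this study.

```python
from math import sqrt


def r_squared(actual, predicted):
    """Goodness of fit: 1 minus residual sum of squares over total sum of squares."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def rmse(actual, predicted):
    """Root mean square error, in the same units as the data."""
    n = len(actual)
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    n = len(actual)
    return 100.0 / n * sum(abs((a - p) / a) for a, p in zip(actual, predicted))


# Made-up illustrative values.
actual = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]
print(r_squared(actual, predicted), rmse(actual, predicted), mape(actual, predicted))
```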

3.3. Step 3: Prediction Model of Public Concern in MSP

In order to predict public attention, we first collect the Baidu index of the search keywords in MSP. Then, the ARIMA model and the SARIMA model are fitted. Finally, the errors of the two models are compared, and the model with better performance is selected to predict the public attention of middle school physics.

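The ARIMA model is represented by ARIMA (p, d, q). The equation for ARIMA (p, d, q) is as follows:

$$\left(1-\sum_{i=1}^{p}\phi_i L^i\right)(1-L)^d X_t = \left(1+\sum_{i=1}^{q}\theta_i L^i\right)\varepsilon_t \tag{6}$$

Here, $X_t$ is the value of the time series at time $t$, $\phi_i$ is an autoregressive coefficient, $\theta_i$ represents a moving average coefficient, $p$ is the autoregressive order, $q$ is the moving average order, $d$ is the number of differences needed to make the time series stationary, $L$ is the lag operator, and $\varepsilon_t$ is white noise.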

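To handle seasonal time series, Box and Jenkins [20] proposed the seasonal ARIMA (SARIMA) model. The SARIMA model is shown as follows:

$$\phi_p(L)\,\Phi_P(L^m)\,(1-L)^d(1-L^m)^D X_t = \theta_q(L)\,\Theta_Q(L^m)\,\varepsilon_t \tag{7}$$

Here, $\phi_p(L) = 1-\sum_{i=1}^{p}\phi_i L^i$ and $\theta_q(L) = 1+\sum_{i=1}^{q}\theta_i L^i$ are the nonseasonal autoregressive and moving average polynomials in the lag operator $L$, and $\Phi_P(L^m)$ and $\Theta_Q(L^m)$ are the corresponding seasonal polynomials of orders $P$ and $Q$ in $L^m$. Compared with the ARIMA model, the SARIMA model adds four new parameters, where $P$ represents the periodic autoregressive order, $D$ represents the periodic difference order, $Q$ represents the periodic moving average order, and $m$ represents the periodic time interval.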

The steps for building the ARIMA model and the SARIMA model are as follows:
(1) Check the stationarity of the original data.
(2) Difference the data until the series is stationary.
(3) Model order determination. We determine the model parameters by minimizing the AIC (Akaike information criterion).
(4) Model residual test. We use the Ljung–Box test to judge whether the residual is white noise. If the p-value of the test is less than 0.05, the residual is not white noise [21]. Generally, the QQ chart, the residual chart, and the histogram are also observed.
(5) Model prediction.
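The white noise check in step (4) can be sketched without external libraries. The code below computes the Ljung–Box statistic $Q = n(n+2)\sum_{k=1}^{h}\hat{\rho}_k^2/(n-k)$ and its p-value from a chi-square distribution with $h$ degrees of freedom. As a simplifying assumption it only accepts an even number of lags, for which the chi-square tail has a closed form; the function names are hypothetical, and the test series is made up.

```python
from math import exp


def autocorr(series, lag):
    """Sample autocorrelation of the series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((v - mean) ** 2 for v in series)
    num = sum((series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n))
    return num / denom


def chi2_sf_even(x, df):
    """Upper tail probability of a chi-square variable; exact for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    # Sum of half**j / j! for j = 0 .. df/2 - 1 (the j = 0 term is already in total).
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return exp(-half) * total


def ljung_box(series, lags=10):
    """Return (Q, p_value). A p-value below 0.05 rejects the white noise hypothesis."""
    n = len(series)
    q = n * (n + 2) * sum(autocorr(series, k) ** 2 / (n - k) for k in range(1, lags + 1))
    return q, chi2_sf_even(q, lags)


# A series with an obvious trend is clearly not white noise.
trend = [float(t) for t in range(60)]
q_stat, p_value = ljung_box(trend, lags=10)
print(q_stat, p_value, p_value < 0.05)
```

When the test is applied to the residuals of a fitted model, the degrees of freedom are usually reduced by the number of estimated ARMA parameters, which this sketch does not do.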

3.4. Step 4: Predicting Public and Academic Concern of MSP

The model selected in Step 3 is used to predict public concern about MSP. Combined with the linear model in Step 2, the predicted value of the academic concern of MSP is obtained.

4. Experiment

4.1. Data Source

The data come from the Baidu index website and CNKI. Public attention is measured by the Baidu index from the Baidu index website. To make the data representative, we included the overall Baidu index of PC (personal computer) and mobile terminals in the study. Academic attention to MSP is obtained from CNKI. These data were retrieved on October 10, 2021. The research period is from 2011 to 2020. The main reasons for selecting this period are as follows:
(1) The Baidu index began to provide mobile data in 2011.
(2) In 2011, China’s Ministry of Education issued a new physics curriculum standard for the compulsory education stage, which opened another chapter in middle school physics.

4.2. Academic and Public Concern
4.2.1. Academic Concern

Considering the availability and comprehensiveness of the data, this study counted the number of papers published each year from 2011 to 2020 by retrieving “junior high school physics” or “high school physics” under the topic of academic journals in CNKI. This count represents the academic attention of middle school physics.

4.2.2. Public Concern

Public concern is represented by the Baidu index of keywords related to middle school physics. Determining the search keywords is a key part of studying the relationship between public attention and academic attention. The specific steps to obtain the keywords are as follows:
(1) The keywords are extracted from academic journals from 2011 to 2020 under the theme of “high school physics” or “junior high school physics” on CNKI. NoteExpress software is used to count the keywords.
(2) The initial keywords are obtained by counting keyword frequency and combining the result with the subject (see Table 1).
(3) The initial keywords that are not included on the Baidu index website or that are noisy during the search are filtered out. A keyword is noisy when its search focus is not what we want.
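The counting and screening steps above can be sketched as follows. The article performs the counting with NoteExpress and the screening by hand; the Python below is only an illustrative equivalent, and the function names, the frequency threshold, and the sample records (including the "Strategy" keyword) are all assumptions made for the example.

```python
from collections import Counter


def keyword_frequencies(articles):
    """Count in how many articles each keyword appears (case-insensitive)."""
    counts = Counter()
    for keywords in articles:
        # A set ensures a keyword is counted once per article.
        normalised = {kw.strip().lower() for kw in keywords if kw.strip()}
        counts.update(normalised)
    return counts


def screen_keywords(counts, min_count, indexed, noisy):
    """Keep frequent keywords that have an index series and are not flagged as noisy."""
    kept = [
        kw for kw, count in counts.items()
        if count >= min_count and kw in indexed and kw not in noisy
    ]
    # Most frequent first; alphabetical order breaks ties deterministically.
    return sorted(kept, key=lambda kw: (-counts[kw], kw))


# Made-up keyword lists, one list per article.
articles = [
    ["High School Physics", "Physics Teaching"],
    ["high school physics", "Efficient Classroom"],
    ["Physics Teaching", "Strategy"],
    ["High School Physics", "Strategy"],
]
counts = keyword_frequencies(articles)
indexed = {"high school physics", "physics teaching", "strategy"}  # hypothetical availability
noisy = {"strategy"}  # hypothetical manual flag
print(screen_keywords(counts, 2, indexed, noisy))
```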

The search keywords finally screened by the above steps are “High School Physics (HP),” “Physics Teaching (PT),” “Junior high school physics (JP),” “Physics Knowledge (PK),” “Efficient Classroom (EC),” and “Scientific Investigation (SI).” We regard the search behaviour for these six keywords as a form of public concern for middle school physics. By retrieving the Baidu index values of the six keywords from 2011 to 2020, the public attention of middle school physics can be obtained. Table 2 shows the public attention and academic attention of middle school physics from 2011 to 2020.

4.3. Correlation Analysis between Academic Concern and Public Concern

To examine the relationship between public attention and academic attention, we use the Pearson correlation method, carried out in SPSS. Table 3 shows the relation between them. The search keywords “High School Physics,” “Junior high school physics,” “Efficient classroom,” and “Physics Knowledge” are negatively correlated with the academic attention of MSP, and the other keywords are positively correlated with it.

4.4. Prediction Model of Academic Concern

Using the data in Table 2, academic attention is taken as the dependent variable, and HP, PT, JP, PK, EC, and SI are taken as the independent variables. To eliminate the effect of covariance, the following linear model is established:

Here, b is a constant term and λ1, λ2, …, λ6 are partial regression coefficients. Because the sample is small, LOOCV is selected to prevent overfitting. The final model is obtained (see the following formula):

To assess the accuracy of prediction, the model is tested. The results show that $R^2$ is equal to 0.83, RMSE is 0.05, and MAPE is equal to 0.56, so the model error is small.

4.5. Prediction Model of Public Concern

Prediction of public interest in middle school physics is an important part of this study. From the official website of the Baidu index, we collected the daily Baidu index of the six search keywords for middle school physics from January 1, 2011 to December 31, 2020. Because some daily values are missing (recorded as 0), we take the average of the daily data in each month as the monthly data and use the monthly data as the research sample. Owing to space limitations, this article takes the keyword “Scientific Investigation” as an example to analyse and predict public attention.
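The aggregation from daily to monthly data can be sketched as below. The text does not state whether the zero values are excluded before averaging, so the sketch makes that an explicit option and, as an assumption, excludes them by default. The function name and the sample values are illustrative.

```python
from collections import defaultdict
from datetime import date


def monthly_means(daily, zero_is_missing=True):
    """Average daily (date, value) pairs into {(year, month): mean}.

    With zero_is_missing=True, zero values are treated as missing and skipped
    (an assumption; the article does not specify this). A month whose values
    are all skipped does not appear in the result.
    """
    buckets = defaultdict(list)
    for day, value in daily:
        if zero_is_missing and value == 0:
            continue
        buckets[(day.year, day.month)].append(value)
    return {key: sum(vals) / len(vals) for key, vals in sorted(buckets.items())}


# Made-up daily index values.
daily = [
    (date(2011, 1, 1), 10),
    (date(2011, 1, 2), 0),   # missing day recorded as 0
    (date(2011, 1, 3), 20),
    (date(2011, 2, 1), 30),
]
print(monthly_means(daily))
print(monthly_means(daily, zero_is_missing=False))
```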

4.5.1. Analysis Data

The ARIMA and SARIMA models require the data to be stationary. First, a unit root test is performed on the original data. From Table 4, we can see that the test statistic of the original data is greater than the critical values at all three significance levels and that the p-value is well above 0.05, indicating that the data have a unit root and the original data are nonstationary.

Because the original data are nonstationary, first-order differencing is needed. We then test the stationarity of the first-order difference data. According to Table 5, the test statistic is less than the critical value at the 1% significance level, so the unit root hypothesis is rejected at that level. Moreover, if a series is stationary, its mean and variance do not change appreciably over time. Therefore, the mean and variance of the first-order difference series are examined to determine whether it satisfies the stationarity requirement. Figure 2 shows the first-order difference sequence and its mean and variance. It can be seen from Figure 2(a) that the sequence exhibits strong independence, suggesting that it is nearly stationary. In Figure 2(b), both the red line (mean) and the black line (variance) fluctuate only slightly and tend to be constant. According to the above analyses, the first-order difference sequence is stationary.
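The differencing step and the mean and variance check can be sketched as follows. The sketch assumes that the mean and variance curves are rolling-window statistics, which the text does not state explicitly; the function names, the window length, and the series are illustrative assumptions.

```python
def difference(series, lag=1):
    """Lagged difference: element t minus element t - lag."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def rolling_mean_var(series, window):
    """Rolling mean and (population) variance over a fixed-length window."""
    means, variances = [], []
    for end in range(window, len(series) + 1):
        chunk = series[end - window:end]
        mean = sum(chunk) / window
        means.append(mean)
        variances.append(sum((v - mean) ** 2 for v in chunk) / window)
    return means, variances


# Made-up series: a linear trend plus a small alternating component.
raw = [2.0 * t + (1.0 if t % 2 == 0 else -1.0) for t in range(24)]
first_diff = difference(raw)
means, variances = rolling_mean_var(first_diff, window=12)

# For a stationary series both curves should stay roughly flat.
print(max(means) - min(means), max(variances) - min(variances))
```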

4.5.2. ARIMA Model

Before establishing the model, we apply the Ljung–Box test to the first-order difference series. The p-value of the white noise test is 0.0001, less than 0.05, indicating that the first-order difference series is not white noise. At this point, the model can be established according to the previous steps. We choose the model with the lowest AIC value and determine the final model as ARIMA (11, 1, 4). To ensure that the model is appropriate for prediction, we then test it.

According to Figure 3, the ordered distribution of the residual (blue dots) follows the linear trend of the standard normal distribution (red line), indicating that the residual is approximately normally distributed and that the information in the series has been sufficiently extracted. According to the above analysis, the model passes the residual test.

4.5.3. SARIMA Model

To improve the accuracy of the predictions of public concern, we also studied the seasonal ARIMA model. Figure 4 shows the decomposition of the original data. As shown in Figure 4(a), the original data have an upward trend and are nonstationary. Figure 4(b) shows that the data are seasonal with a period of 12 months. It can be seen in Figure 4(c) that the residual is unevenly distributed. To eliminate the trend and periodic components in the data, the first-order difference and the 12-step difference are used to obtain the difference sequence. Then, the white noise test is performed on the difference sequence; the p-value is 1.82e-05, less than 0.05. Therefore, the difference sequence is not white noise and can be modelled. The parameters of the final model are obtained according to the AIC minimum principle, and the model is SARIMA (0, 1, 1) (0, 1, 1) 12.
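Written out, a SARIMA (0, 1, 1) (0, 1, 1) model with a period of 12 takes the form below. This is a sketch under an assumed sign convention, in which the moving average terms enter with a plus sign; $X_t$ denotes the monthly series, $L$ the lag operator, and $\theta_1$ and $\Theta_1$ the nonseasonal and seasonal moving average coefficients, whose fitted values are not reported in the text.

```latex
\begin{aligned}
(1 - L)(1 - L^{12})\,X_t &= (1 + \theta_1 L)(1 + \Theta_1 L^{12})\,\varepsilon_t \\
X_t - X_{t-1} - X_{t-12} + X_{t-13} &= \varepsilon_t + \theta_1 \varepsilon_{t-1}
  + \Theta_1 \varepsilon_{t-12} + \theta_1 \Theta_1 \varepsilon_{t-13}
\end{aligned}
```

The left-hand side is the first-order difference followed by the 12-step difference described above, so the model has only two coefficients to estimate.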

After the model is determined, its residual error is examined to ensure that the predicted values are reasonable. Figure 5 shows the SARIMA model diagnosis diagram. Figure 5(a) shows that the residual error fluctuates around zero and resembles white noise. In Figure 5(b), the kernel density estimate (KDE) (red) is very close to N (0, 1) (standard normal distribution) (green). In Figure 5(c), the ordered distribution of the residual error (blue dots) shows a linear trend, indicating that the residual error is normally distributed. It can be seen from Figure 5(d) that the residual error has a very low correlation with its own lagged values. The above analyses show that the model passes the test.

4.5.4. Model Selection

Both the ARIMA model and the SARIMA model have passed the tests, which shows that they can be used for prediction. To find the better model, this article takes the data from 2011 to 2017 as the training set and the data from 2018 to 2020 as the test set. We calculated the RMSE and MAPE values of the two models for all keywords. Table 6 shows the model and its performance for each keyword. The errors of the SARIMA model are smaller than those of the ARIMA model. Therefore, the SARIMA model is selected to predict public concern.
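The hold-out comparison can be sketched as follows. Fitting ARIMA and SARIMA models needs a statistics library, so this sketch substitutes two simple stand-in forecasters (a last-value forecast and a seasonal repeat of the last 12 months) purely to show the selection procedure: 84 training months, 36 test months, and ranking by RMSE with MAPE reported alongside. The forecasters, function names, and synthetic series are assumptions, not the article's models or data.

```python
from math import sqrt, sin, pi


def rmse(actual, predicted):
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mape(actual, predicted):
    return 100.0 / len(actual) * sum(abs((a - p) / a) for a, p in zip(actual, predicted))


def naive_forecast(train, horizon):
    """Stand-in for a nonseasonal model: repeat the last observed value."""
    return [train[-1]] * horizon


def seasonal_naive_forecast(train, horizon, period=12):
    """Stand-in for a seasonal model: repeat the last full seasonal cycle."""
    last_cycle = train[-period:]
    return [last_cycle[h % period] for h in range(horizon)]


def select_model(train, test, candidates):
    """Score each candidate on the test set and return the one with the lowest RMSE."""
    scores = {}
    for name, forecaster in candidates.items():
        preds = forecaster(train, len(test))
        scores[name] = (rmse(test, preds), mape(test, preds))
    best = min(scores, key=lambda name: scores[name][0])
    return best, scores


# Synthetic monthly series with a 12-month cycle: 10 years = 120 months.
series = [100.0 + 20.0 * sin(2.0 * pi * t / 12.0) for t in range(120)]
train, test = series[:84], series[84:]  # 7 years for training, 3 for testing
best, scores = select_model(
    train, test,
    {"naive": naive_forecast, "seasonal_naive": seasonal_naive_forecast},
)
print(best, scores)
```

On strongly seasonal data the seasonal stand-in wins, which mirrors the reason a seasonal model is preferred for monthly search data.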

4.6. Prediction of Academic and Public Concern
4.6.1. Prediction of Public Concern

The SARIMA model for SI in Table 6 is used to predict public attention. Figure 6 shows the prediction for SI. In Figure 6, the blue data are the original data, the red data represent the fitted data from 2018 to 2020, and the green data are the forecast data from 2021 to 2023. The fitted data largely coincide with the original data. In Table 6, the error between the predicted value and the real value is very small, indicating that the prediction performs well.

We use the SARIMA model for each keyword in Table 6 to predict the monthly data for all keywords. Figure 7 shows the forecast trend for all keywords. Figure 7(a) shows the trend of three keywords, namely “High School Physics,” “Junior high school physics,” and “Efficient classroom.” Figure 7(b) shows the prediction for the other keywords. It can be seen from Figure 7 that all keywords exhibit trends and seasonality.

Public attention is represented by the annual data of the Baidu index. To maintain the correspondence with the annual academic data, the monthly data of each year are averaged to give the annual data. In Figure 8, we can see that the public pays high attention to the keywords “high school physics” and “junior high school physics,” indicating that the research scope of middle school physics covers high school and junior high school. The efficient classroom, physics teaching, and physics knowledge are the hot spots of physics research in middle school. This is consistent with the research of Zhang and Wang on the focus and expectation in middle school physics [22, 23]. Figure 8 also shows that attention to “High School Physics” and “Junior high school physics” is trending downward, attention to “efficient classroom” and “physics knowledge” is stable, and attention to “Physics Teaching” has increased by a small margin.

4.6.2. Prediction of Academic Concern

Substituting the predicted values of public attention into equation (9) gives the predicted academic attention. Figure 9 shows the academic attention of middle school physics from 2011 to 2023.
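The last step of the pipeline, averaging the monthly forecasts into annual values and passing them through the linear model, can be sketched as below. The intercept and coefficients in the example are placeholders chosen for illustration; they are not the fitted values of the article's model, and the function names and forecast numbers are likewise assumptions.

```python
def annual_means(monthly):
    """Average {(year, month): value} into {year: mean of that year's months}."""
    by_year = {}
    for (year, _month), value in monthly.items():
        by_year.setdefault(year, []).append(value)
    return {year: sum(vals) / len(vals) for year, vals in sorted(by_year.items())}


def predict_academic(intercept, coefficients, public):
    """Linear model: intercept plus the sum of coefficient * keyword value."""
    missing = set(coefficients) - set(public)
    if missing:
        raise KeyError(f"no public attention value for: {sorted(missing)}")
    return intercept + sum(coefficients[k] * public[k] for k in coefficients)


# Made-up monthly forecasts for two keywords (only two months shown per year).
forecast_hp = {(2021, 1): 10.0, (2021, 2): 20.0, (2022, 1): 30.0, (2022, 2): 30.0}
forecast_pt = {(2021, 1): 4.0, (2021, 2): 4.0, (2022, 1): 6.0, (2022, 2): 2.0}
hp_annual = annual_means(forecast_hp)
pt_annual = annual_means(forecast_pt)

# Placeholder parameters, NOT the fitted values from the article.
intercept = 1.0
coefficients = {"HP": 2.0, "PT": -1.0}

for year in hp_annual:
    public = {"HP": hp_annual[year], "PT": pt_annual[year]}
    print(year, predict_academic(intercept, coefficients, public))
```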

In Figure 9, the orange data are the original values and the purple data are the forecast values. From 2011 to 2017, the academic attention of middle school physics showed a fluctuating upward trend, rising from 573 in 2011 to a peak of 836 in 2017. The reasons are as follows: (1) In 2017, the first batch of reform provinces implemented the new college entrance examination. (2) Beijing, Tianjin, Hainan, Shandong, and the other provinces in the second batch of reform announced the launch of the new college entrance examination in 2017. In the new college entrance examination, physics, as one of the preferred subjects, has become the focus of many scholars’ research, which raised the academic attention of middle school physics. From 2018 to 2020, the academic attention of middle school physics showed a downward trend, with an average annual decline of 6.0%. Over the next three years, it is predicted to increase steadily, reaching 754 in 2022, slightly higher than in 2020. The latest version of the physics curriculum standard for junior high school, issued in 2022, is expected to increase the academic attention of physics in middle school.

The hot spots of middle school physics in China have moved from scientific inquiry and scientific methods to physics teaching and key literacy. The current focus is the deep integration of information technology and physics teaching [24]. This article focuses on the public search behaviour related to middle school physics and uses big data from the network to predict academic attention.

5. Conclusions

This article studies the relationship between the public and academic concern of middle school physics, which provides accurate data for academic research and better reflects the practical significance of academic concern in meeting the urgent needs of the public. This study draws the following conclusions:
(1) There is a significant correlation between public attention and academic attention in middle school physics. The Baidu index of search keywords under middle school physics is strongly correlated with the number of papers on CNKI, which suggests that public attention affects academic attention. In theory, the issues of public concern are those that receive the most attention in society, and this social attention raises academic attention.
(2) Compared with the ARIMA model, the SARIMA model performs better in predicting public attention to middle school physics. This article provides a reference for the prediction of public concern.
(3) The public attention to middle school physics can predict academic attention. Research in middle school physics is concentrated in academia, but with the advent of big data, public concern is increasingly valued. This study shows that public attention based on the Baidu index can linearly predict academic attention. The results show that the model provides precise predictions, with an $R^2$ value of 0.83, an RMSE value of 0.05, and a MAPE value of 0.56.

Although the results of this study show that public attention in middle school physics can reflect academic attention, there are some deficiencies:
(1) It is unclear whether the search keywords of middle school physics are comprehensive.
(2) The public concern of middle school physics provides only a quantitative prediction of academic concern.
In future research, on the one hand, we can explore the differences between the research hotspots of public concern and academic concern in middle school physics. On the other hand, we can investigate the relevance between public concern and academic concern in other disciplines.

Data Availability

The data are available from the corresponding author upon reasonable request.

Conflicts of Interest

The authors declare that there are no conflicts of interest.

Acknowledgments

This research was supported by the National Natural Science Foundation of China (No. 61862051), the Science and Technology Foundation of Guizhou Province (No. [2019]1447), the Philosophy and Social Science Planning Youth Project of Guizhou Province (No. 18GZQN36), the Top-Notch Talent Program of Guizhou Province (No. KY[2018]080), the Nature Science Foundation of Educational Department (Nos. [2022]094 and [2022]100), the Nature Science Foundation of Qiannan Normal University for Nationalities (Nos. 2020qnsyzd03, QNSY2018JS013, QNSYRC201714, and QNSYRC201715), and the Postgraduate Project of Qiannan Normal University for Nationalities (No. 21yjszz013).