Abstract

Sci-Tech journals have long served as platforms for academic communication and the collision of ideas, facilitating advanced inventions and major discoveries in science. The speed of development and future prospects of a field can often be read from the quality and quantity of cutting-edge papers published in that field's Sci-Tech journals. Currently, the impact factor is a widely recognized journal evaluation index that comprehensively reflects the quality and influence of the journals under evaluation. However, traditional journal evaluation methods based on statistical formulas, while relatively simple and fast, have certain limitations: they are not comprehensive enough and do not support comparisons between journals from different disciplines. More recently, researchers have explored comprehensive journal evaluation with multiple suitable indicators, such as the rank sum ratio, attempting to understand the role each indicator plays in the evaluation process. Our paper presents a new dataset constructed from journal data across various fields obtained from the China Wanfang Literature Platform. We explore a series of novel journal evaluation methods based on machine learning, including deep learning models. With these nine methods, we aim to determine the contribution of 17 journal evaluation indicators to the impact factor and to identify important factors that can further enhance the quality and influence of Sci-Tech journals, which has great guiding significance for the future development of journals.

1. Introduction

For a long time, Sci-Tech journals have acted as carriers of academic content, and their influence within their fields is typically shaped by their quality. This has driven researchers to continually seek out indicators for objectively and scientifically evaluating these journals. Over time, however, the sheer number of indicators can lead us to overlook the more valuable ones, so it is necessary to research methods for identifying them. Our hypothesis is that these indicators reflect various aspects of a journal and that some or most of them are correlated with the impact factor. Our goal is to explore the relationship between these indicators and the impact factor and to identify which indicators are most important in predicting it, so as to analyze which characteristics of a journal contribute most to its impact. The specific correlation cannot be determined simply by checking whether two indicators share the same calculation factors, so we use the various methods we propose to observe it from multiple perspectives, which is undoubtedly meaningful. In our paper, we aim to identify, among the 17 available journal evaluation indicators, those that have a greater impact on a journal's influence and quality, and to determine their contribution to that influence. We use the journal impact factor [1] as a reflection of a journal's influence because the impact factor is widely accepted and utilized to measure the level of influence and citation impact of a journal's published papers within the academic community. Our ultimate goal is therefore to identify the more valuable indicators among the 17 available ones by assessing their respective contributions to predicting the journal impact factor. Each journal evaluation indicator reflects a certain characteristic of a journal from a certain aspect. We can use the journal evaluation indicators with high contributions to understand which journal characteristics are more conducive to improving the quality and influence of the journal.

In recent years, algorithms such as multiple linear regression, decision trees [2], and support vector machines [3] in machine learning, and fully connected neural networks [4] and convolutional neural networks [5] in deep learning, have demonstrated favorable outcomes across a range of regression prediction tasks, including housing price prediction [6] and stock market analysis. Their rapid development has allowed them to occupy a mainstream position in related tasks across various fields. Predicting the journal impact factor is itself a regression task, and we aim to leverage the successful performance of machine learning and deep learning in regression prediction to approach our task from a new perspective.

Based on the whole content, this paper makes the following contributions: (1) In the field of journal evaluation, different from traditional statistical methods such as the rank sum ratio (RSR), our paper establishes a task for predicting the impact factor in order to train machine learning models, including deep learning models, so that we can obtain the contribution of each evaluation indicator to the impact factor by implementing methods based on the trained models. Through this work, we can study how to improve the quality of journals. (2) A new dataset is proposed, which contains the values of 18 evaluation indicators of Sci-Tech journals from 2015 to 2020. The data has a large time span, comprehensive indicator types, and a rich variety of journals, giving it good research value. (3) Nine methods based on machine learning models, including deep learning models, are implemented to obtain the contribution of each evaluation indicator to the journal impact factor; the experimental results are then compared and summarized to find the characteristics that help boost the quality and influence of journals.

2.1. Evaluation Method and System of Sci-Tech Journals Based on Statistical Methods

Since the Renaissance, science has developed rapidly and differentiated into increasingly specialized fields. On this basis, a large number of specialized and comprehensive Sci-Tech journals have emerged. With such huge numbers, journal quality inevitably varies, so how to scientifically and objectively evaluate the quality of Sci-Tech journals becomes crucial.

Since Pinski [7] proposed the journal evaluation rules, various methods and indicators for evaluating the academic quality of Sci-Tech journals have emerged [8]. Traditional methods include citation analysis, quantitative analysis based on fuzzy mathematical theory, and the analytic hierarchy process [9]. Traditional journal evaluation indicators include the impact factor, 5-year impact factor [10], average number of citations, cited half-life, number of papers, and immediacy index [11]. With the continuous development of science and technology and the popularization of the Internet, new evaluation indicators and evaluation systems for Sci-Tech journals keep emerging.

The emergence of a large number of journal evaluation indicators allows researchers to conduct a comprehensive evaluation of a journal from various angles [12], but a large number of indicators sometimes makes researchers ignore the truly valuable characteristics of a journal when evaluating. At this time, it is meaningful to find the more important indicators among the multiple indicators for the journal evaluation [13].

2.1.1. H-Index

The H-index [14] signifies that a scholar has a maximum of h papers that have been cited at least h times. Simultaneously, it can also be utilized to assess the quality of journals. Concerning a journal, within all the papers it publishes, if there are a maximum of h papers that have been cited at least h times, then the value of h becomes the journal’s H-index.

In general, it is considered more scientific and objective to employ the H-index for evaluating the quality and influence of a Sci-Tech journal in its respective field. It not only quantifies the number of papers published in Sci-Tech journals but also gauges the quality of these publications. The H-index can provide valuable insights into a journal’s capacity for original innovation and its enduring impact. However, it should be noted that it has certain limitations [15], particularly when used for comparing journals in interdisciplinary fields.

In our view, it is unrealistic to use the H-index alone to evaluate a journal; the H-index is extremely unfavorable to journals that are newly launched but of high quality. If we can instead find several of the more important indicators among the many journal evaluation indicators and use them for a more comprehensive evaluation of the journal, we can make full use of the meaning behind the indicators while minimizing the impact of each indicator's limitations, which is undoubtedly meaningful. The dataset constructed in our paper incorporates the H-index among the journal evaluation indicators, allowing it to participate, as one of the 17 indicators, in the ranking of contributions to the journal impact factor.

2.1.2. RSR

The RSR [16] is a comprehensive evaluation method introduced by Chinese statistician Professor Tian Fengtiao, and it has found applications in journal assessment [17]. The fundamental principle of this method involves obtaining dimensionless quantitative statistics, called RSR, through rank transformation within a matrix with n rows and m columns. Building upon this foundation, statistical techniques are employed to analyze the distribution of RSR, facilitating a comprehensive evaluation. This method can also be applied to the assessment of Sci-Tech journals.

RSR is a set of comprehensive statistical analysis methods that solely rely on the dataset itself, without the need for complex data transformations and various processes. It can rapidly indicate the significance of each indicator in the evaluation. Compared to other traditional evaluation methods, its principles are easy to understand, and the calculation process is straightforward. It not only allows for comparisons of the strengths and weaknesses of various journals but also facilitates the comparison of a journal’s development status across different years. However, the RSR method does have certain limitations, owing to potential data loss during the data-to-rank conversion process, which may result in the underutilization of indicator information.

In the application of RSR in the field of journal evaluation, researchers have been exploring the role that each indicator plays in journal assessment, which holds guiding significance for the future development of journals.

However, when utilizing RSR to evaluate journals, the indicators considered in the evaluation are often chosen manually. At this point, the ability to select the most appropriate indicators from a multitude of options to participate in the evaluation directly impacts the accuracy of the comprehensive assessment following the use of RSR. Therefore, it is more meaningful to identify indicators that contribute more significantly to the impact factor.

2.1.3. Principal Component Analysis

With the continuous development of the evaluation system for Sci-Tech journals, the number of evaluation indicators is on the rise, and they reflect various characteristics. It is undoubtedly time-consuming, labor-intensive, and inefficient to use all evaluation indicators for comprehensive journal assessment. Selecting only one journal evaluation indicator for assessment would result in incomplete and inaccurate conclusions. In reality, due to the reuse of original values in the calculation of various journal evaluation indicators, there is a certain correlation between these indicators, and the characteristic information they reflect overlaps to some extent. In light of this situation, scholars consider using principal component analysis to comprehensively assess journal quality.

Today, principal component analysis [18] is extensively employed in the field of comprehensive journal evaluation. Jin et al. [19] applied the principal component analysis method and discovered that this approach can amalgamate multiple indicators into as few comprehensive indicators as possible, thereby reducing the overlap of original indicator information without sacrificing the original data.

2.2. Deep Learning

The most primitive form of deep learning is the artificial neural network [20]. Deep learning is a subset of machine learning that attempts to emulate the human brain and automatically extract data features through more complex structures. Machine learning is one of the pathways to achieving artificial intelligence, involving disciplines such as probability theory, statistics, approximation theory, and convex analysis. Its fundamental concept is to learn automatically from training data and apply that knowledge to predict unknown data. Traditional statistical algorithms used in machine learning include the linear regression model, logistic regression model [21], KNN algorithm [22], random forest algorithm [23], support vector machine, and others.

With the rise of big data and the emergence of high-performance GPUs, the training of more complex model networks has become more accessible. This has further propelled the development of deep learning and its applications in areas such as image recognition, speech recognition, and natural language processing. The remarkable performance in research also motivates scholars to delve into deeper and broader aspects of deep learning. Deep learning continues to flourish in various fields.

In our paper, we try to establish a prediction task for the journal impact factor. Under this task, we train various machine learning models (including deep learning models) and indirectly obtain the contribution of each indicator to the impact factor by extracting model weights or other information during the models' training.

3. Wanfang Journal Evaluation Indicator Dataset (WFJEI)

3.1. Data Collection

The dataset used in our paper is constructed by collecting the indicator data of Sci-Tech journals over the years from China Wanfang Data Knowledge Service Platform, which is a well-known academic database in China. The dataset contains 45,775 pieces of data about Chinese Sci-Tech journals. The data covers the period from 2015 to 2020, with a long time span, a wide range of fields, and a wide range of regions. It basically records the relevant data of a majority of journals in China. The dataset contains 18 journal evaluation indicators collected from 5,425 journals over the past 5 years. In order to ensure the diversity of data sources in the dataset, we collected the indicators of journals for 5 consecutive years in 152 subdivisions and divided them into 12 major disciplines. Among them, the 12 categories are medicine, engineering, basic subjects, applied science, social science, natural science, pedagogy, economics, agriculture, materials science, transportation, and others. Their details are shown in Figure 1.

For each journal, the dataset collected 18 indicators, basically covering all quantitative evaluation indicators for journals. These 18 indicators are impact factor, ratio of international paper, ratio of funded paper, average number of authors, average number of citations, citing half-life, H-index, other-cite ratio, immediacy index, disciplinary impact index, disciplinary diffusion index, citation num, total citations, cited half-life, literature selection rate, number of institutions, number of papers, and number of regions [24]. Among the 18 indicators, the impact factor has become the most common journal evaluation indicator in the world. It is not only an indicator of the usefulness and visibility of journals but also an important indicator of journal quality, so it serves as the true label for the regression tasks below. The names and meanings of the remaining 17 indicators are shown in Table 1. Our dataset covers all aspects of the quantitative evaluation of journals. It contains a large volume of data and spans a significant period, making it highly suitable for analyzing the contributions of various indicators to the impact factor of journals. Its time dimension can also help researchers dynamically analyze the development of journals. It thus has guiding significance for the future development direction of journals.

3.2. Data Preprocessing

In order to adapt the data to our experiments, we preprocess the structured data as follows. First, we used the Python library Pandas to read the data, obtaining 45,775 records covering 18 indicators for most journals in China from 2015 to 2020. Studying the contribution of each journal indicator requires models trained on a regression task, which in turn requires separating the dataset into a feature matrix and a dependent variable. In our dataset, the impact factor is recognized as an important indicator for measuring journal quality, so it is used as the dependent variable, and the other 17 indicators form the feature matrix of the regression task. We used the iloc method in Pandas to extract the feature matrix and the dependent variable.

The next step in preprocessing structured data involves handling missing data in the dataset. In our experiments, we employed two distinct methods to address missing data. The first method entails identifying specific rows with missing data and subsequently removing those rows. Given the dataset’s ample sample size, this data removal does not compromise experiment accuracy. The second method involves addressing missing data by computing the mean value. During the experiment, we calculated the mean value for each data type, essentially the average values within each column. These mean values were then used to fill in the missing data. This strategy proves effective for data with numerical characteristics.
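As a concrete illustration, the following is a minimal Pandas sketch of the steps above; the file name wfjei.csv and the column layout (impact factor in the first column) are hypothetical placeholders, not the actual names used in the WFJEI.

```python
import pandas as pd

# Hypothetical file name; the real dataset has 45,775 rows and 18 indicators.
df = pd.read_csv("wfjei.csv")

# Impact factor as the dependent variable, the other 17 indicators as the
# feature matrix (column positions assumed here for illustration).
y = df.iloc[:, 0]
X = df.iloc[:, 1:18]

# Method 1: remove every row that contains missing data.
X_dropped = df.dropna().iloc[:, 1:18]

# Method 2: fill missing values with the mean of each column.
X_filled = X.fillna(X.mean())
```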

After handling the missing data in the structured dataset, we need to perform feature scaling, which is especially important in multiple linear regression but not required for random forest and XGBoost. After feature scaling, the 17 indicators all lie in the same range, and the distribution of each indicator remains consistent with its distribution before scaling. This ensures that, in the multiple linear regression experiment, no single variable dominates the others when the Euclidean distance is calculated. Here, we apply a normalization [25] method (z-score standardization) to the dataset with the following formula:

$$x' = \frac{x - \text{mean}}{\text{std}},$$

where $x$ is the original value, mean is the mean of the data participating in the standardization, std is the standard deviation of the data participating in the standardization, and $x'$ is the data after standardization.

Finally, we split the dataset into training and testing sets. Although the focus of our research is not to obtain the accuracy of the model in a regression task but to analyze the contribution of each indicator when determining the value of the impact factor, the accuracy of the model is related to the reliability of the contribution ranking, so we believe that the existence of the test set is still necessary. Overall, 80% of the dataset is used as the training set, and the remaining 20% is used as the test set.
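A minimal scikit-learn sketch of the scaling and split described above; fitting the scaler on the training set only is a standard precaution and our assumption, as the paper does not specify this detail.

```python
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# 80% training set, 20% test set.
X_train, X_test, y_train, y_test = train_test_split(
    X_filled, y, test_size=0.2, random_state=42)

# z-score scaling: x' = (x - mean) / std, as in the formula above.
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
```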

4. Models and Methods

In this section, we implement 9 methods based on machine learning and deep learning to obtain the contribution of the 17 journal evaluation indicators to the journal impact factor, so as to identify important factors that can further improve the quality and influence of journals. All 9 methods follow the basic process shown in Figure 2.

First, we take the 17 journal evaluation indicators as the input matrix and the journal impact factor as the dependent variable to construct a regression task for predicting the journal impact factor, and we train traditional machine learning methods (such as XGBoost [26]), a simple neural network, and a deeper neural network on this task. Then, we use different methods to obtain the contribution of the 17 journal evaluation indicators to the journal impact factor from the trained models and the WFJEI.

4.1. Method Based on Traditional Machine Learning Model
4.1.1. Absolute Weight Method Based on Multiple Linear Regression Model

First, we choose multiple linear regression as the training model for the regression task; multiple linear regression generalizes simple linear regression. Here, we take the impact factor of each journal as the dependent variable and the other 17 journal indicators as independent variables. After standardizing the data, we train the multiple linear regression model to fit the following multiple linear function:

$$\hat{y}_i = b + \sum_{n=1}^{17} w_n x_{i,n},$$

where $\hat{y}_i$ is the predicted value of the impact factor of the $i$th journal, $b$ is the function bias, $w_n$ is the weight of the $n$th evaluation indicator of the journal, and $x_{i,n}$ is the $n$th evaluation indicator value of the $i$th journal.

In multiple linear regression, we use the Euclidean distance [27] between predicted and true values as the objective function; when the objective function is minimized, the fitting of the multiple linear function is complete, and so is the training of the model. At this point, we can obtain the weight of each evaluation indicator and use its absolute value as the contribution of that indicator to the impact factor. Among these weights, a positive value indicates a positive correlation and a negative value a negative correlation. Sorting the absolute values of the weights directly yields the indicator contribution ranking. The contribution calculation formula is as follows:

$$C_i = \left| w_i \right|,$$

where $C_i$ is the contribution of the $i$th journal evaluation indicator to the journal impact factor, and $w_i$ is the weight of the $i$th journal evaluation indicator in the multiple linear regression model.
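A minimal sketch of the absolute weight method with scikit-learn, using the preprocessed arrays from Section 3.2 (LinearRegression minimizes the least-squares objective, consistent with the Euclidean-distance objective above):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

model = LinearRegression()
model.fit(X_train, y_train)

contribution = np.abs(model.coef_)        # C_i = |w_i|
ranking = np.argsort(contribution)[::-1]  # indicator indices, largest first
```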

4.1.2. Node Importance Method Based on Random Forest Model

We chose the random forest method as the training model for the regression task; it is available as a class in the Python library scikit-learn (sklearn). Random forest is an ensemble algorithm of the bagging type. It combines multiple weak learners and averages the results of all of them to obtain the prediction, so that the model achieves both high accuracy and good generalization performance. Table 2 shows the experimental setting of the random forest model.

In our experiment, the random forest uses classification and regression trees (CART) as weak learners. When generating a tree, bootstrap sampling [28] is used to draw a random sub-dataset from the training set, and a small number of journal indicators are randomly selected as the input of the decision tree, which ensures the randomness of the features. The number of selected journal indicators is the square root of the total number of indicators. It is worth noting that the input data does not need to be standardized here. During training, each tree first generates a root node and then checks whether the stopping condition is satisfied (the number of training samples under the node is less than a predetermined threshold, or the node impurity is less than a predetermined threshold). If the stopping condition is not met, the tree traverses the selected journal evaluation indicators and their values as candidate segmentation variables and segmentation points, judging the effect of each segmentation by calculating $G(x_i, v_{i,j})$. The tree then chooses the segmentation variable and segmentation point with the best segmentation effect for the node, generates new left and right subtrees accordingly, and repeats the node-generation procedure on each subtree. The calculation formula of $G(x_i, v_{i,j})$ is as follows:

$$G(x_i, v_{i,j}) = \frac{n_{\text{left}}}{N_s} H\!\left(X_{\text{left}}\right) + \frac{n_{\text{right}}}{N_s} H\!\left(X_{\text{right}}\right),$$

where $x_i$ is a segmentation variable, that is, the $i$th journal evaluation indicator; $v_{i,j}$ is a segmentation value of the segmentation variable, that is, the value of the $i$th journal evaluation indicator for the $j$th journal; $n_{\text{left}}$ and $n_{\text{right}}$ are the numbers of training samples of the left and right child nodes after segmentation; $N_s$ is the number of all training samples of the current node; and $X_{\text{left}}$ and $X_{\text{right}}$ are the training sample sets of the left and right child nodes, respectively. $H(X)$ is a function measuring the impurity of a node; in the regression task it is calculated as the mean absolute error $H(\Omega)$:

$$H(\Omega) = \frac{1}{N} \sum_{i=1}^{N} \left| y_i - \bar{y} \right|,$$

where $\Omega$ is the sample set on the node, $N$ is the total number of samples in $\Omega$, $\bar{y}$ is the average impact factor of the training samples of the current node, and $y_i$ is the impact factor value of the $i$th sample in the sample set.

When a regression decision tree stops growing, its training is complete. A method called node importance is then used to rank the contributions of journal evaluation indicators to the journal impact factor. First, for a node $k$, its importance is calculated as follows:

$$I_k = p_k H_k - p_{\text{left}} H_{\text{left}} - p_{\text{right}} H_{\text{right}},$$

where $p_k$, $p_{\text{left}}$, and $p_{\text{right}}$ are the ratios of the number of training samples in node $k$ and its left and right child nodes to the total number of training samples, and $H_k$, $H_{\text{left}}$, and $H_{\text{right}}$ are the impurities of node $k$ and its left and right child nodes, respectively. After obtaining the importance of each node, we can obtain the importance of the $i$th journal evaluation indicator through the following formula:

$$C_i = \frac{\sum_{k \in \Omega_i} I_k}{\sum_{k \in \text{all nodes}} I_k},$$

where $\Omega_i$ is the set of nodes that use the $i$th indicator as the segmentation variable, and "all nodes" refers to all nodes in the random forest.
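In scikit-learn, this node importance computation is exposed as feature_importances_; a minimal sketch follows (the hyperparameters of Table 2 are omitted, and criterion="absolute_error" requires scikit-learn 1.0 or later):

```python
from sklearn.ensemble import RandomForestRegressor

# criterion="absolute_error" matches the H(Ω) impurity defined above.
rf = RandomForestRegressor(criterion="absolute_error", random_state=42)
rf.fit(X_train, y_train)  # trees do not require standardized inputs

importance = rf.feature_importances_  # one value per indicator, sums to 1
```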

4.1.3. Division Ratio Method Based on XGBoost Model

We chose the XGBoost model as the training model for the journal evaluation regression task; it is available as a class with a scikit-learn-compatible interface in the Python library xgboost. XGBoost is also an ensemble algorithm, but unlike random forest, it belongs to the boosting type. XGBoost does not use bootstrap sampling to train each tree on a different random sub-dataset; it trains on the full training set from start to finish. Both XGBoost and random forest use the classification and regression tree (CART) as the weak learner, but the latter takes the average of the independent prediction scores of the weak learners as the prediction score of the strong learner, while the former sums the prediction scores of the weak learners to obtain the prediction score of the strong learner. Table 3 shows the experimental setting of XGBoost.

In XGBoost, the training of the $n$th regression tree depends on the previous $n-1$ trees. For example, when we use the journal evaluation indicators of the $i$th journal to train the $n$th regression tree, if the true impact factor is $y_i$ and the prediction of the first $n-1$ trees for this journal is $\hat{y}_i^{(n-1)}$, then the target value for the $n$th tree is the residual

$$y_i^{(n)} = y_i - \hat{y}_i^{(n-1)}.$$

We use $y_i^{(n)}$ in the impurity calculation when nodes are generated in this CART; the calculation formula is similar to the impurity calculation above.

After the training of the XGBoost model is completed, we use a method called the division ratio to rank the contributions of journal evaluation indicators to the journal impact factor. The idea is that, for each journal evaluation indicator, we count the total number of times it is used as a segmentation variable across all CARTs: the more times, the higher the contribution of that indicator. For the $i$th journal evaluation indicator, the contribution calculation formula is as follows:

$$C_i = \sum_{t \in \Omega} \frac{d_{i,t}}{N_t},$$

where $d_{i,t}$ is the number of times the $i$th evaluation indicator is used as a segmentation variable in the $t$th tree, $N_t$ is the total number of nodes of the $t$th tree, and $\Omega$ is the set of classification and regression trees that contain a node with the $i$th evaluation indicator as the segmentation variable.
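In the xgboost library, the "weight" importance type counts how often each feature is used as a segmentation variable, which corresponds to the counting idea above (though without the per-tree normalization by node count); a minimal sketch:

```python
from xgboost import XGBRegressor

xgb = XGBRegressor(random_state=42)  # Table 3 hyperparameters omitted here
xgb.fit(X_train, y_train)

# Split counts per feature, e.g. {"f0": 120, "f7": 310, ...}.
split_counts = xgb.get_booster().get_score(importance_type="weight")
```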

4.2. Method Based on Simple Neural Network

In a recent study [29], the author proposed four feature importance ranking methods based on simple neural networks, and we adapt these methods here. Their original purpose was to improve learning speed and to reduce the number of feature variables in the input data so as to simplify it. To rank the contributions of journal evaluation indicators, we likewise use a relatively simple neural network as the training model for the regression task of predicting the journal impact factor. After training, we apply the four methods to this model to obtain the contribution ranking of the journal evaluation indicators. Table 4 shows the experimental setting of the model.

Following the original author, in the model training stage we use the journal impact factor as the ground truth and the other 17 journal evaluation indicators as the input feature matrix of the model. The model has one input layer, four hidden layers, and one output layer. The numbers of neurons in the hidden layers are 200, 100, 50, and 25, respectively, and the output layer has a single neuron with no softmax layer; because this is a regression task, it outputs only the predicted impact factor. The model adopts Adaptive Moment Estimation (Adam) [30] as its optimizer. Adam not only uses Momentum [31] but also adaptively changes the learning rate, effectively preventing problems such as gradient oscillation and saddle point stagnation. Following the author's suggestion, the initial learning rate of the model is set to 0.01, and the batch size is set to 32.
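A minimal PyTorch sketch of this network; the choice of ReLU activations is an assumption, as the activation function is not specified here.

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(17, 200), nn.ReLU(),
    nn.Linear(200, 100), nn.ReLU(),
    nn.Linear(100, 50), nn.ReLU(),
    nn.Linear(50, 25), nn.ReLU(),
    nn.Linear(25, 1),  # single output, no softmax: a regression head
)
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
loss_fn = nn.L1Loss()  # MAE, the loss used in Section 4.2.1
```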

After completing the training of the model, we use the following four methods to obtain the contribution of each journal evaluation indicator to the journal evaluation impact factor and perform a top-five ranking.

4.2.1. Input Perturbation Method

The core idea of this method is that if a journal evaluation indicator contributes more to the prediction of the journal evaluation impact factor, then its perturbation will make the model’s prediction of the journal evaluation impact factor less accurate. Based on this idea, we can obtain the contribution of each journal evaluation indicator to the journal evaluation impact factor by using the trained simple neural network and the WFJEI.

First, after data preprocessing we apply no special treatment to the journal evaluation indicators and use the journal impact factor and the other 17 indicators as the ground truth and input feature matrix of the neural network model, respectively. After feeding in the feature matrix and running the model's forward propagation, we obtain the impact factor predicted by the model for each journal. Then, we calculate the loss from the true and predicted values; the loss function is the mean absolute error (MAE), calculated as follows:

$$\text{MAE} = \frac{1}{m} \sum_{i=1}^{m} \left| \hat{y}_i - y_i \right|,$$

where $m$ is the total number of journals in the dataset, $\hat{y}_i$ is the predicted value of the journal impact factor of the $i$th journal, and $y_i$ is the true value of the journal impact factor of the $i$th journal.

Afterward, we need to perturb the input of each journal evaluation indicator in turn and calculate the new MAE. Usually, an input journal evaluation indicator is perturbed by deleting or shuffling it. Here, we do not delete all the data of a journal evaluation indicator, because we fixed the input dimension of the model structure during the training phase. The input dimension is (batch size, 17), where the batch size is 32 and 17 is the number of input journal evaluation indicators. If we directly deleted a certain journal evaluation indicator, the input dimension would become 16, and the model we trained before would lose its effect. Training a new model just to measure the contribution of a single journal evaluation indicator is unnecessary and extremely wasteful of resources and time. Therefore, we choose to shuffle the data of each journal evaluation indicator in turn, that is, to shuffle, one at a time, each column to be evaluated in the input feature matrix.

We store the MAE calculated after each perturbation. When the cycle is complete, we can use the following formula to calculate the contribution of each journal evaluation indicator to the journal impact factor:

$$C_i = \frac{\text{MAE}_i}{\sum_{j=1}^{m} \text{MAE}_j},$$

where $C_i$ is the contribution of the $i$th journal evaluation indicator to the journal impact factor, $m$ is the total number of journal evaluation indicators, and $\text{MAE}_i$ is the mean absolute error after perturbing the $i$th journal evaluation indicator.
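A minimal sketch of the input perturbation loop, reusing the PyTorch model sketched in Section 4.2; the normalization of the perturbed MAEs follows the formula above.

```python
import numpy as np
import torch

def perturbation_contributions(model, X, y):
    X = np.asarray(X, dtype=np.float32)
    y = np.asarray(y, dtype=np.float32)
    maes = []
    for i in range(X.shape[1]):
        X_pert = X.copy()
        np.random.shuffle(X_pert[:, i])  # shuffle indicator i only
        with torch.no_grad():
            pred = model(torch.from_numpy(X_pert)).squeeze(1).numpy()
        maes.append(np.mean(np.abs(pred - y)))
    maes = np.array(maes)
    return maes / maes.sum()  # C_i = MAE_i / sum_j MAE_j
```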

4.2.2. Correlation Coefficient Method

This method relies entirely on the WFJEI. Its core idea is that the larger the correlation coefficient between a journal evaluation indicator and the journal impact factor, the greater that indicator's contribution to the journal impact factor. This approach is also used in the hybrid approach below. In our experiment, we use the Pearson correlation coefficient [32] between the journal evaluation indicator and the journal impact factor, with the following formula:

$$\text{correlation}_t = \frac{\sum_{i=1}^{n} \left( x_{i,t} - \bar{x}_t \right) \left( y_i - \bar{y} \right)}{\sqrt{\sum_{i=1}^{n} \left( x_{i,t} - \bar{x}_t \right)^2} \sqrt{\sum_{i=1}^{n} \left( y_i - \bar{y} \right)^2}},$$

where $\text{correlation}_t$ is the Pearson correlation coefficient, $n$ is the total number of journals in the dataset, $x_{i,t}$ is the value of the $t$th journal indicator in the $i$th journal, $y_i$ is the journal impact factor of the $i$th journal, $\bar{x}_t$ is the average of the $t$th journal indicator over all journals, and $\bar{y}$ is the average journal impact factor over all journals.

After calculating the Pearson correlation coefficient between every journal evaluation indicator and the journal impact factor, we obtain the contribution of each indicator through the following formula:

$$C_i = \frac{\left| \text{correlation}_i \right|}{\sum_{j=1}^{m} \left| \text{correlation}_j \right|},$$

where $C_i$ is the contribution of the $i$th journal evaluation indicator to the journal impact factor, $m$ is the total number of journal evaluation indicators, and $\text{correlation}_i$ is the Pearson correlation coefficient between the $i$th journal evaluation indicator and the journal impact factor.
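A minimal Pandas sketch, using the feature matrix X and impact factor series y from Section 3.2; taking absolute values before normalizing follows the formula as reconstructed above.

```python
# Pearson correlation of every indicator column with the impact factor.
corr = X.corrwith(y, method="pearson")

contribution = corr.abs() / corr.abs().sum()
print(contribution.sort_values(ascending=False).head(5))
```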

4.2.3. Square Weight Method

The core idea of this method is that when a journal evaluation indicator is used as an input of the network, if the weight assigned to the input is greater, the contribution of this journal evaluation indicator to the journal evaluation impact factor is larger. Therefore, this method relies entirely on the neural network model that has been trained, as shown in Figure 3.

We can see that I1, I2, and I3 are three of the 17 input journal evaluation indicators, and O is the predicted journal impact factor. When calculating the weight of I1, we consider the solid arrows in Figure 3. The values of the solid arrows between the input layer and hidden layer 1 represent the weights between I1 and every neuron in hidden layer 1; they are squared and then summed to obtain the total weight of I1 to the hidden layer. Next, we calculate the contribution of each journal evaluation indicator to the journal impact factor according to the following formula:

$$C_i = \frac{W_i}{\sum_{j=1}^{m} W_j},$$

where $C_i$ is the contribution of the $i$th journal evaluation indicator to the journal impact factor, $m$ is the total number of journal evaluation indicators, and $W_i$ is the total weight of the $i$th journal evaluation indicator to the hidden layer.
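For the PyTorch model sketched in Section 4.2, the square weight computation reduces to a few lines; indexing the first Linear layer as model[0] assumes the nn.Sequential layout of that sketch.

```python
import torch

W1 = model[0].weight.detach()        # shape (200, 17): neurons x inputs
total_weight = (W1 ** 2).sum(dim=0)  # W_i: squared weights summed per input
contribution = total_weight / total_weight.sum()
```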

4.2.4. Hybrid Method

Among the first three methods, each depends either only on the dataset or only on the trained model. Therefore, a hybrid method is implemented here, which combines the first three methods to calculate the contribution of the journal evaluation indicators to the journal impact factor. This method introduces a parameter $d$, which affects the weights of the values calculated by the correlation coefficient and input perturbation methods when they participate in the final contribution calculation. The calculation formula of the parameter $d$ is as follows:

$$d = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( P_i - \bar{P} \right)^2},$$

where $n$ is the total number of journal evaluation indicators, $P_i$ is the contribution of the $i$th journal evaluation indicator to the journal impact factor calculated under the input perturbation method, and $\bar{P}$ is the mean contribution of all journal evaluation indicators to the journal impact factor under the input perturbation method.

After calculating the parameter $d$, we calculate the final contribution of each journal evaluation indicator according to the following formula:

$$C_i = S_i + d \, P_i + (1 - d) \, R_i,$$

where $S_i$ is the contribution of the $i$th journal evaluation indicator to the journal impact factor calculated under the square weight method, $P_i$ is the contribution calculated under the input perturbation method, and $R_i$ is the contribution calculated under the correlation coefficient method.
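A minimal sketch of the hybrid combination, assuming the reconstruction of $d$ and the combination rule given above:

```python
import numpy as np

def hybrid_contributions(s, p, r):
    """s, p, r: contribution arrays from the square weight, input
    perturbation, and correlation coefficient methods, respectively."""
    d = np.sqrt(np.mean((p - p.mean()) ** 2))  # dispersion of p
    return s + d * p + (1 - d) * r
```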

4.3. Method Based on Deeper Neural Network

In a recent study [33], the author implements two methods to obtain and rank the importance of the input features of a deeper neural network, and we adapt these methods here. To rank the contributions of journal evaluation indicators, we likewise use a deeper neural network as the training model for the regression task of predicting the journal impact factor. After training, we use two methods based on the trained model to obtain the contribution of the journal evaluation indicators to the journal impact factor. The experimental settings of the model are shown in Table 5.

Following the original author, we built a deeper neural network, taking the journal impact factor as the ground truth and the other 17 journal evaluation indicators as the input feature matrix of the model. The numbers of neurons in the hidden layers of the model are 50, 1,024, 2,048, 4,096, 2,048, 1,024, and 50, and the output layer has a single neuron; no softmax activation function is needed because we are performing a regression task rather than a classification task. In the model, each fully connected layer is followed by a BatchNormalization layer [34], and DropOut [35] is applied to prevent overfitting. The model adopts Adam as the optimizer, with all parameters consistent with those proposed by the original author.

After completing the training, we use two methods based on the trained model to obtain the contribution of journal evaluation indicators to the journal impact factor.

4.3.1. VIANN Method

During the training of a deeper neural network, the weights of the model are continuously optimized by the gradient descent algorithm, iteration after iteration, until the loss function is as small as possible, at which point optimization stops and the model is fitted. The core idea of this method is that the more significantly the input weight of a journal evaluation indicator changes during training, the greater that indicator's contribution to the journal impact factor. Therefore, in this method, during the model training process, we constantly monitor the weight of each input feature variable after every iteration and calculate its variance. The larger the variance, the more obvious the change, and the greater the contribution of the journal evaluation indicator to the journal impact factor.

However, if we stored the weights from all iterations to calculate the variance, it would incur huge computational overhead and consume a lot of resources and time. Therefore, the original author adopts a method proposed by Welford together with a quantity called the running variance. In this way, the running variance can be updated whenever the weight is updated at the end of each iteration, which saves a great deal of time and storage space.

One weight generates a list of size $n$ after $n$ iterations, which we call list 1, and a list of size $n-1$ after $n-1$ iterations, which we call list 2. List 1 has one more value than list 2, namely $w_n$, the weight updated in the $n$th iteration. From Welford's derivation, we can obtain the following three formulas:

$$\bar{w}_n = \bar{w}_{n-1} + \frac{w_n - \bar{w}_{n-1}}{n},$$

$$M_n = M_{n-1} + \left( w_n - \bar{w}_{n-1} \right) \left( w_n - \bar{w}_n \right),$$

$$\sigma_n^2 = \frac{M_n}{n},$$

where $\bar{w}_n$ is the mean value of list 1, $\bar{w}_{n-1}$ is the mean value of list 2, $M_n$ and $M_{n-1}$ are the sums of squared deviations of list 1 and list 2 (so that $\sigma_{n-1}^2 = M_{n-1}/(n-1)$ is the variance of list 2), and $\sigma_n^2$ is the running variance of the weight after $n$ iterations.

When all iterations are over, we obtain all the weights between the input layer and the first hidden layer after the last update and use them to calculate the contribution of each journal evaluation indicator to the journal impact factor. The calculation formula is as follows:

$$C_i = \sum_{t \in \Omega} \sigma_{i,t}^2 \left| w_{i,t} \right|,$$

where $C_i$ is the contribution of the $i$th journal evaluation indicator to the journal impact factor, $\Omega$ is the set of first-hidden-layer neurons connected to the $i$th input journal evaluation indicator, $\sigma_{i,t}^2$ is the running variance of the weight between the $i$th journal evaluation indicator and the $t$th neuron in the hidden layer, and $w_{i,t}$ is the value of that weight after the last update.
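A minimal sketch of Welford's running variance as a weight tracker; in practice, update() would be called on the first-layer weight tensor at the end of every training iteration.

```python
import torch

class RunningVariance:
    """Welford's online mean/variance for a tensor of weights."""
    def __init__(self, shape):
        self.n = 0
        self.mean = torch.zeros(shape)
        self.m = torch.zeros(shape)  # running sum of squared deviations

    def update(self, w):
        self.n += 1
        delta = w - self.mean
        self.mean += delta / self.n
        self.m += delta * (w - self.mean)  # Welford's M update

    @property
    def var(self):
        return self.m / self.n

# After training, with W1 the final first-layer weight of shape (50, 17):
# contribution = (tracker.var * W1.abs()).sum(dim=0)
```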

4.3.2. Garson Method

This method is relatively simple to implement; it only needs the weight matrix $W_1$ between the input layer and the first hidden layer and the weight matrix $W_2$ between the last hidden layer and the output layer of the deep neural network. After obtaining these two matrices, we calculate a vector of contributions of the journal evaluation indicators to the journal impact factor according to the following formula:

$$\text{Importance} = \text{std}\!\left( W_1 W_2 \right),$$

where std() is a normalization function. Since the size of $W_1$ is (17, 50) and the size of $W_2$ is (50, 1), the size of the vector Importance is (17, 1); its 17 values represent the contributions of the 17 journal evaluation indicators to the journal impact factor, and their sum is 1.
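A minimal sketch of the Garson computation; deep_model is assumed to be a PyTorch nn.Sequential whose first Linear layer maps the 17 inputs to 50 neurons and whose last Linear layer maps 50 neurons to 1 output. Taking absolute values before multiplying follows Garson's original algorithm and is an assumption here.

```python
import torch

W1 = deep_model[0].weight.detach().t()    # (17, 50): input -> first hidden
W2 = deep_model[-1].weight.detach().t()   # (50, 1): last hidden -> output

raw = W1.abs() @ W2.abs()                 # (17, 1)
importance = raw / raw.sum()              # normalized so the sum is 1
```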

5. Experimental Results and Analysis

Table 6 presents a comparison of two evaluation metrics (MAE and mean square error (MSE)) for five trained models used in regression tasks. Smaller values of MSE and MAE indicate higher predictive accuracy of the models. It is evident that traditional machine learning achieves significantly higher prediction accuracy compared to simple neural networks. Even after training, the deep neural network’s predictive accuracy approaches that of the linear regression model, but there is still a noticeable gap compared to XGBoost. This discrepancy arises because, in regression tasks based on structured data, the underlying rules that need to be “learned” are not overly complex, and the models do not need to “learn” intricate and incomprehensible rules as in the domains of image recognition and natural language processing. Therefore, the performance of traditional machine learning shines in this task, and its contribution ranking is more dependable.

5.1. Methods Based on Traditional Machine Learning Model

Table 7 presents the ranking of journal evaluation indicators’ contributions to the journal impact factor obtained using three methods: XGBoost, random forest, and linear regression, respectively. It is evident that under all three methods, the H-index and immediacy index consistently rank among the top two contributors to the journal impact factor.

The H-index not only takes into account the quantity of papers published by the journal but also places importance on the quality of those papers. When paper quantities are equal, a higher number of citations per paper signifies higher quality and a higher H-index. Given the impact factor’s correlation with citation numbers, the substantial contribution of the H-index to the journal impact factor is justified.

The immediacy index of a journal indicates the immediate response rate of the journal, primarily reflecting citations of journal papers within the year of publication. In contrast, the journal impact factor directly captures citations of journal papers in the year following publication. These two indicators exhibit a strong correlation; higher citations in the current year typically translate to higher citations in the following year. Therefore, the substantial contribution of the immediacy index to the journal impact factor is also reasonable.

The number of papers ranks among the top five contributors across the three methods, albeit with slight variations in ranking.

In the methods based on the XGBoost model and the random forest model, the contributions of the average number of citations and the number of institutions entered the top five. In fact, both of these models use CART as the weak learner, so it is reasonable that the two methods yield similar results. Finally, we can observe that in the linear regression method, the contributions of the disciplinary impact index and the ratio of funded papers entered the top five.

In Figure 4, the x-axis is the journal evaluation indicator, and the y-axis is its contribution to the journal impact factor. It can be seen that when x = 7 and x = 8, the contribution is much greater than for the other indicators. When x = 7, the journal evaluation indicator is the H-index, and when x = 8, it is the immediacy index. Therefore, it can be seen from the figure that the H-index and immediacy index contribute far more to the journal impact factor than the other journal evaluation indicators.

5.2. Methods Based on Simple Neural Network

Table 8 displays the ranking of journal evaluation indicators' contributions to the journal impact factor obtained through four methods based on a simple neural network. First, under the correlation coefficient, square weight, and hybrid methods, we can observe that the H-index and immediacy index continue to secure the top two positions. The remaining top-five positions are filled by citation num, total citations, cited half-life, and average number of citations. These indicators are all related to citations and exhibit a strong positive correlation with the impact factor. Total citations underscores the importance of evaluating a journal's development achievements over the years. The disciplinary impact index also appears in the top five twice, indicating that, for journals, comprehensiveness is undoubtedly important, but so is expertise in their respective fields.

In the input perturbation method, we found that the top five indicators in this method are quite different from those in other methods, because the main idea of this method is to recalculate the loss by perturbing the order of data of the journal evaluation indicator. Then, the method generates a contribution ranking in descending order of loss, which is still applicable in deeper neural networks. Given that deeper neural networks offer higher accuracy compared to simple neural networks, we intend to revisit this method in the next subsection, focusing on deeper neural networks. In this context, we plan to visually represent the contribution values of each indicator instead of presenting them in a ranking format.

5.3. Methods Based on Deeper Neural Network

Table 9 presents the rankings of journal evaluation indicators’ contributions to the journal impact factor as obtained through two methods based on deeper neural networks. The VIANN method calculates contributions by assessing the variance of continuously updated weights during the training process, while the Garson method only considers the weights updated in the final iteration to determine contributions. Both methods rely entirely on the model’s own weights, with the dataset being used solely for model training.

Under these two methods, the H-index and immediacy index maintain their positions in the top three contributions. The fourth and fifth positions in terms of contribution are occupied by the other-cite ratio and literature selection rate. Additionally, it is worth noting that cited half-life, a journal evaluation indicator absent from the top five in previous methods, ranks in the top two in terms of contribution under these two methods. This suggests that emphasizing the cutting-edge nature of a journal’s papers plays a pivotal role in enhancing the journal’s quality and impact.

In the input perturbation method using simple neural networks, we observed that the ranking of each journal evaluation metric’s contribution to the journal impact factor differs significantly from other approaches. Since this method remains applicable with deeper neural networks, and deeper networks yield higher accuracy compared to simpler ones, we applied the input perturbation method with the deeper neural network to recalculate contributions. At this point, we no longer discuss ranking contributions under this method but directly assess their values. Table 10 illustrates the loss incurred for each indicator in the input perturbation method based on the deeper neural network. We calculated their MSE and MAE, respectively. Notably, after data perturbation, the loss increases across all conditions. What is most striking is that the losses resulting from perturbing journal evaluation indicators are quite similar. Hence, ranking them by loss becomes meaningless at this stage. This also underscores the significance of all 17 journal evaluation indicators in assessing journal quality, as they collectively influence the journal impact factor and comprehensively evaluate a journal’s academic stature and influence.

6. Conclusions and Future Work

Sci-Tech journals have long been the carrier of academic content, and their quality usually determines their influence in the various fields of science and technology. Researchers have therefore long been trying to devise indicators to evaluate journals scientifically and objectively, and the journal impact factor has become an important factor in evaluating journal quality. We try to find, among the 17 journal evaluation indicators, those that affect the journal impact factor more, and to obtain their contributions to the journal impact factor. Every journal evaluation indicator reflects a certain characteristic of a journal from a certain aspect. We can use the journal evaluation indicators with high contributions to understand which characteristics of a journal are more conducive to improving its quality and influence.

Table 11 shows each indicator that ranks in the top five more than three times under the eight methods other than the input perturbation method, together with the number of times it does so.

It can be seen intuitively that the H-index and immediacy index are the highest, appearing eight times each. The H-index requires not only a high output of journal papers but also high paper quality. To increase the number of journal papers, a journal must first attract more researchers to submit, so it is necessary to improve the academic ability of reviewers, shorten the review cycle, and return more constructive and comprehensive revision comments. The more attractive the journal is to researchers, the more papers are available for inclusion. If, in addition, the reviewers review manuscripts with a serious and responsible attitude and return revision comments, the average number of citations of the journal's papers will naturally increase, thereby improving the quality of the journal's papers and increasing the influence of the journal. The immediacy index reflects the "vitality" of the journal; to a certain extent, it reflects whether the journal's content is sufficiently cutting-edge and innovative, which also affects the quality and influence of the journal. We can also see that the number of papers and the average number of citations appear in Table 11. The appearance of these two indicators indicates that improving the quality and influence of a journal also requires contributors with a solid theoretical foundation in related fields, which again requires journals to improve their attractiveness to contributors. In the final analysis, to improve its quality and influence, a journal needs both attractiveness to contributors and a strong ability to review papers.

Based on the content of the whole paper, our paper proposes a new dataset, which contains the values of 18 evaluation indicators of Sci-Tech journals from 2015 to 2020. The data has a large time span, comprehensive indicator types, and rich types of journals, which have good research significance. On the basis of this dataset, our paper attempts to establish a task for predicting impact factor in order to train machine learning models, including deep learning models, and obtain the contribution of each evaluation indicator to impact factor by implementing nine methods based on the models after training under this task. By doing this work, we can research how to improve the quality of journals.

Among these 17 indicators, the calculation formulas may include the number of article citations or the number of articles published; however, even when indicators share these two calculation factors, the time ranges and field ranges they consider differ, and new calculation factors are added, so the information they reflect differs from the information reflected by the impact factor. In our final experimental conclusions, we can also see that the disciplinary diffusion index is calculated using the number of articles published and total citations is calculated using the number of article citations, yet the contributions of these two indicators to the impact factor rank last. Therefore, using these two calculation factors does not by itself lead to a greater contribution to the impact factor. Our research is meaningful because we have used various methods to explore in depth the contribution of the 17 journal evaluation indicators to the impact factor.

We still believe that our research is valuable because it provides us with a new perspective to understand the relationship between the impact factor and other journal indicators. We hope these results can provide some guidance for the development of journals and help them formulate strategies to improve their quality.

In the future, we can use another journal evaluation indicator as the prediction target in the regression task and explore the contribution of the 17 indicators to this new indicator. For example, we can use the 5-year impact factor, which is similar to the impact factor but calculated over a 5-year window. Compared with the impact factor, the 5-year impact factor can better judge the quality and influence of journals in fields with a long citation life cycle (such as mathematics). At the same time, we can also try to draw on the advantages of traditional journal evaluation methods based on statistical formulas and combine them with our methods based on deep learning to obtain a new and more comprehensive journal evaluation method.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

We declare that there are no conflicts of interest regarding the publication of this paper.

Authors’ Contributions

Conceptualization was done by Ma Y., Han Y., and Zeng H.; methodology was done by Zeng H. and Ma Y.; software-related task was done by Zeng H.; validation was performed by Zeng H.; formal analysis was performed by Zeng H.; investigation was done by Zeng H.; resources were provided by Ma Y., Han Y., and Ma L.; data curation was done by Zeng H., Ma Y., and Han Y.; writing—original draft preparation was done by Zeng H.; writing—review and editing was done by Zeng H.; visualization was performed by Zeng H.; supervision was performed by Ma Y., Han Y., and Ma L.; project administration was done by Ma Y., Han Y., and Ma L.; funding acquisition was done by Ma Y., Han Y., and Ma L. All authors have read and agreed to the published version of the manuscript.

Acknowledgments

This research was funded by the Scientific Project of State Grid Shandong Electric Power Research Institute, grant number ZY-2022-07, and Scientific Project of State Grid, grant number 520626230079.