Abstract
To improve the recommendation performance of online teaching resources in colleges and universities and the learning efficiency of users, this paper studies a recommendation method for online teaching resources that considers user preference factors. Different user preference factors are selected and preference keywords are extracted, and on this basis a keyword projection model for different user preference resources is built. A Markov model is used to construct a keyword probability model, and the TF-IDF algorithm is introduced to refine online teaching resources. Resources are then recommended according to the calculated closeness of the resources required by the user. The experimental results show that the method reaches high values on each recommendation performance index: the highest recommendation accuracy rate is 82.2%, the highest recall rate is 78.7%, the highest comprehensive average value is 83.5%, and the average absolute error is lower than 0.56. The successful recommendation rate of the method is as high as 97.6%.
1. Introduction
With the continuous development of the information age, the Internet is playing an increasingly important role in people’s lives. The way people acquire and process information has gradually shifted from offline to online, and the shortcomings of offline learning have gradually emerged. The advantages of online learning, such as rich content and easy dissemination, facilitate users’ learning [1–3]. At the same time, the sudden increase of massive data also brings about the problem of information overload; in particular, the uneven quality of network information makes it more difficult for users to obtain effective information. On the one hand, users need to spend more time and energy filtering the information they need, and even then the ideal results may not be obtained [4, 5]. On the other hand, users exhaust their patience making choices on the platform and are prone to fatigue, thus losing confidence in the platform and looking for other ways [6, 7]. Therefore, it is increasingly important to provide users with personalized information recommendations so that they can quickly find suitable information in the massive data. At present, information recommendation has been widely used in e-commerce, digital libraries, news and tourism websites, and other fields and is constantly developing and improving [8–11]. The scale of online education continues to expand, and it is gradually becoming the main force for knowledge sharing and talent training. However, with the development of big data, various educational resources continue to accumulate, and this large volume of resources has brought problems such as “information overload” to users. How to let learners quickly find educational resources that they like and that are also useful is becoming more and more important [12–14].
The online education platform is knowledge intensive, rich in resource types, and systematic in its resources. Due to the influence of academic background, major interests, and other factors, users differ in their mastery of existing knowledge and in their ability to accept new knowledge. Faced with massive learning resources, they also differ in their ability to find interesting and useful resources. Only through data analysis can a complete user preference recommendation system be established, so that resources that are really suitable for users can be recommended to them. This can greatly improve users’ learning efficiency, provide users with good services for online learning, and increase users’ loyalty to the platform. In related research, some scholars proposed to use the CF-IRST method, that is, combining the collaborative filtering algorithm and the Intel Rapid Storage Technology driver, to design an online recommendation method for teaching resources. This method finds the nearest neighbor set of resources according to the similarity between the vectors formed by multiple users’ ratings of items, and it can better realize the recommendation of “cross-type” resources. However, user satisfaction with the recommended resources needs to be further improved. Some scholars have proposed a resource recommendation method based on the DeepCoNN model. This method extends the DeepCoNN model by introducing an additional latent layer representing the target user-target item pair. This layer is then regularized at training time to resemble another latent representation of the target item by the target user, which completes the resource recommendation. However, the recommendation accuracy of this method needs to be further improved. Some scholars have also proposed a personalized learning full-path recommendation model based on an LSTM neural network. Based on a learner feature similarity measure, a set of learners is first clustered, and a long short-term memory (LSTM) model is trained to predict their learning paths and performance. The personalized learning full path is then selected from the path prediction results. Finally, a suitable learning full path is recommended specifically to test learners. However, this method gives insufficient consideration to user preferences and can be further improved.
In order to address the problems of recommendation accuracy and user satisfaction in the recommendation of teaching resources, this paper proposes a method for recommending online teaching resources in colleges and universities that considers different user preference factors. The method selects different user preference factors and extracts preference keywords to construct a resource preference model. It then introduces the TF-IDF algorithm, calculates the closeness of the resources required by the user, and completes the teaching resource recommendation.
2. Selection of Different User Preference Factors and Keyword Extraction
The composition of users in colleges and universities is relatively simple, which makes it relatively straightforward to provide personalized services. The main personalized recommendation service is to recommend interesting and valuable teaching resources for each user [15]. Therefore, based on an analysis of the user reading data generated by a university library in 2020, and according to the differences in the reading needs of university users, this paper divides users into three types of groups. The aim is to explore the potential needs and interests of each group and to construct the user preference model for the different groups more accurately.
According to the composition of the university population and the different learning and reading needs of different groups, users can be roughly divided into three categories: faculty groups, undergraduate groups, and postgraduate groups [16]. The faculty group accounts for the smallest proportion of the university population and mainly includes university teachers, administrative staff, nonstaff personnel, advanced students, and retirees. It is mainly responsible for scientific research, teaching, management, and party building. Therefore, online teaching resources need to provide targeted and personalized services for faculty user groups [17]. Undergraduate students account for the largest proportion, and a large number of freshmen are enrolled every year, so the group that reads the most online is usually undergraduates. Undergraduate users read resources mainly for graduation theses, professional final exams, various certificate exams, and personal hobbies, and the online resources of colleges and universities should mainly recommend knowledge of this kind. Postgraduate users mainly include master’s and doctoral students, whose main tasks are specialized research, dissertations, and journal articles. They are also responsible for scientific research writing and project reporting. In addition to the recommendation services described above, the personalized service of teaching resources in colleges and universities should also provide postgraduate groups with books in professional fields, as well as recommendations of guidance resources on research topics.
2.1. Selection of Different User Preference Factors and Construction of Keyword Projection Model
Research on the user preference model is an important part of the personalized recommendation service for online teaching resources in colleges and universities. According to their different reading needs, users have been subdivided into three categories: undergraduate groups, postgraduate groups, and faculty groups. Through the reading data of college users, the preference factors of different groups of users can be mined, and user portrait models of different factors can be constructed [18–20]. In the group user portraits of university libraries, the explicit features are mainly obtained through the university campus card, such as student number, college, major, grade, gender, name, and other information. The implicit features of users constitute the dimensions of user preference, and users can be analyzed and divided through multidimensional user preferences. The implicit features also tend to reflect the real needs of users and their subconscious demand for resources more faithfully. The implicit preferences of users include preferences in four different dimensions: reading duration, popularity, reading frequency, and reading characteristics.
2.1.1. Reading Time
There is a certain relationship between a user’s interest in a book and the length of the borrowing time: the longer the reading time, the higher the interest in the book [21]. In this article, noisy data such as reading time during vacation and overdue reading time are ignored. In the reading records, the reading time is represented as a percentage, calculated from the start time at which a user starts to read an online teaching resource and the time at which the user finishes reading that resource.
2.1.2. Popularity
Users’ reading behavior is often affected by the popularity of books. The more popular books and journals tend to be sought after by a large number of readers [22]. Factors such as the evaluation details of a book on the Douban app and other applications, recent popular TV series adapted from a book, and the recommendations of influential institutions all affect the recent popularity of a book and thereby subtly influence users’ reading needs in the near future. The popularity of a resource is obtained by applying a counting method (count) to the reads of that resource over the user set.
2.1.3. Reading Frequency
The frequency with which users of a given major read books in a certain professional field also potentially affects their reading interest. These books feature new technology or new ideas, or are representative and authoritative in the professional field, and are often sought after and read by users. The book rankings in the professional field affect users’ recent reading needs and thereby change their recent reading behavior [23–25]. According to each user’s major and college, all the reading history data of the users in that major are cleaned, the books read within the major are counted and sorted, and the top ten books are recommended to the user first. The reading frequency is calculated by a top method that generates the top 10 online resources from the online resource reading data of each college major.
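The popularity count and the per-major top-10 ranking described above can be sketched as follows. This is a minimal illustration rather than the paper's implementation: the record layout `(user_id, major, resource_id)`, the function names, and the sample data are all assumptions introduced here.

```python
from collections import Counter, defaultdict

# Hypothetical reading log: (user_id, major, resource_id) tuples.
records = [
    ("u1", "cs", "r1"), ("u2", "cs", "r1"), ("u3", "cs", "r2"),
    ("u4", "math", "r3"), ("u5", "math", "r3"), ("u5", "math", "r1"),
]

def popularity(records):
    """Count the distinct users who read each resource."""
    readers = defaultdict(set)
    for user, _major, resource in records:
        readers[resource].add(user)
    return {resource: len(users) for resource, users in readers.items()}

def top_resources_by_major(records, n=10):
    """Rank resources by read count within each major and keep the top n."""
    counts = defaultdict(Counter)
    for _user, major, resource in records:
        counts[major][resource] += 1
    return {
        major: [resource for resource, _ in counter.most_common(n)]
        for major, counter in counts.items()
    }

pop = popularity(records)
top = top_resources_by_major(records, n=10)
```

Counting distinct users rather than raw reads is one possible reading of "count over the user set"; counting every read event would be an equally simple variant.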
2.1.4. Reading Feature
By analyzing the text features of the books a reader has recently read, the reader’s recent reading interest preferences can be collected. Each book in the collection basically covers valid information such as book title, CLC number, ISBN, and author, and the feature vector of the book can be extracted from this information. The feature vector is composed of book information feature items and their weights. The weight calculation involves the text feature weight, the text set, the word frequency of the feature item in the text, the total number of texts, and the number of texts in the text set that contain the feature item.
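The quantities listed above (word frequency, total number of texts, number of texts containing the feature item) match the ingredients of a TF-IDF weight. The sketch below assumes the common form w(t, d) = tf(t, d) · log(N / n_t); the exact weighting used in the paper is not recoverable from the text, and the function name and sample data are illustrative.

```python
import math
from collections import Counter

def tfidf_weights(docs):
    """Return one {term: weight} dict per document.

    Assumed weighting: w(t, d) = tf(t, d) * log(N / n_t), where tf is the
    relative frequency of term t in document d, N is the number of
    documents, and n_t is the number of documents containing t.
    """
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights

# Hypothetical book records already split into feature terms.
books = [
    ["neural", "network", "research"],
    ["graph", "theory", "research"],
    ["neural", "coding", "research"],
]
weights = tfidf_weights(books)
```

A term that appears in every text (here "research") receives weight zero, while a term unique to one text receives the largest weight.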
2.2. Keyword Projection Model for Different User Preference Resources
In order to improve the accuracy of online resource recommendation in colleges and universities, a keyword projection model of different user preferences is constructed. The user’s preference rating data is described in matrix form, and the matrix is used as the input of the keyword projection model. The selection relationship between users and resources can reflect the interest trend among users [26–28], and the relationship between resource characteristics and user preference characteristics in college online resources is represented by a dual-mode network. For a specific user group, when a user is associated with multiple network activities, multidimensional connections are generated between users in the group, forming a multimodal network of selection relationships between users and resources. Combined with the complex attributes of the multidimensional network model, a resource keyword projection model is constructed, which is described as follows:
(1) Extension: the initial network contains a given number of resource keywords and a given number of users. Every time a new user enters, the user builds a connection relationship with the current keywords in the network.
(2) Local scope definition: resource keywords are arbitrarily extracted from the resource network to define the local scope connected to the user.
(3) Connection prioritization: a newly entered user connects to the local scope with one probability and to a keyword node outside the local scope with another probability. The connection probability of a node is expressed in terms of its node degree.
From this, an expression for the preferential connection probability based on the local scope is derived. This expression involves the similarity between user preferences and keywords, which is computed from the shared neighborhood of two users, the average score of each user, and the scores the users give. On this basis, the random connection probability outside the local scope is constructed as a function of the time interval.
After the above process, the construction of the keyword projection model is complete. The model can directly generate a multidimensional keyword network: by projecting new users from the multimodal network onto the user nodes as they enter, it avoids the complex process of updating the multidimensional network model.
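The three construction steps can be illustrated with a simplified simulation. The sketch below is an assumption-laden stand-in, not the paper's model: it uses degree-proportional preferential connection inside a randomly sampled local scope and a uniform random connection outside it, and it omits the similarity term between user preferences and keywords. The function name, parameters, and sample keywords are hypothetical.

```python
import random

def connect_new_user(degrees, scope_size, p_local, rng):
    """Pick the keyword node that a newly entered user connects to.

    degrees:    {keyword: current node degree}
    scope_size: number of keywords sampled into the local scope
    p_local:    probability of connecting inside the local scope
    """
    keywords = list(degrees)
    scope = rng.sample(keywords, scope_size)
    outside = [k for k in keywords if k not in scope]
    if outside and rng.random() >= p_local:
        # Random connection outside the local scope.
        return rng.choice(outside)
    # Preferential connection: probability proportional to degree + 1,
    # so that zero-degree keywords can still be chosen.
    node_weights = [degrees[k] + 1 for k in scope]
    return rng.choices(scope, weights=node_weights, k=1)[0]

rng = random.Random(0)
degrees = {"python": 3, "statistics": 1, "algebra": 0, "networks": 2}
for _ in range(100):  # 100 new users enter one by one
    chosen = connect_new_user(degrees, scope_size=2, p_local=0.8, rng=rng)
    degrees[chosen] += 1
```

Each new user adds exactly one connection in this sketch; a fuller version would weight the preferential step by the user-keyword similarity as well as by node degree.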
3. Recommendation of Online Teaching Resources in Colleges and Universities Based on User Preferences
3.1. Resource Recommendation Profile considering User Preferences
The method designed in this paper mainly relies on the user’s preference information for recommendation, and for the newly registered users in the system, they have not yet generated any preference information and behavioral characteristics, which makes it impossible to recommend to new users. Therefore, for new users of the personalized recommendation system for online resources in colleges and universities, users need to improve their personal information after logging in to the system, including name, gender, major, grade, and department, in addition to the above basic information. In order to distinguish user groups, the user type must also be selected. If it is a graduate student or a faculty member, other information needs to be provided, including scientific research direction, scientific research projects, participation in topics, and teaching tasks. With the basic information provided by users, especially scientific research projects, we can calculate the similarity matrix between users by extracting keywords, and obtain the preferences of neighboring users for resources, so as to recommend them to new users. Therefore, based on the information provided by multidimensional user portraits, this paper proposes a keyword-based content recommendation algorithm.
For compound phrases composed of multiple words, such as university project names or scientific research project names, how to extract keywords is the key research content of this section. The premise of extracting keywords is to perform Chinese word segmentation on these complex titles. Through the NLPIR Chinese word segmentation technology of the Chinese Academy of Sciences, the complex names of the user’s subject or project are segmented, and meaningless words such as “is” and “based on” are removed as stop words.
Keywords are obtained through the keyword extraction algorithm and then assigned different weights according to their calculated importance. The purpose is to distinguish each word that appears in the subject or project. Words that summarize or distinguish the topics or projects are assigned higher weights, and less representative words are assigned lower weights. In summary, the similarity between users can be calculated from the keywords with higher weights, so as to optimize and improve the content-based recommendation algorithm.
The TF-IDF algorithm is one of the most commonly used algorithms in the field of keyword extraction. When a feature word appears only in a certain subject area, it can directly represent the specific research content of that type of subject, such as “convolutional neural network.” If a keyword is frequently used in most subject areas (such as “research”), it cannot distinguish the content of each university research field.
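The cold-start idea above, computing user similarity from weighted project keywords, can be sketched as follows. Several parts are assumptions made for illustration: whitespace splitting of English titles stands in for NLPIR Chinese word segmentation, the stop-word list is invented, the weight is term count times log(N / n_t), and cosine similarity is one reasonable choice that the text does not name.

```python
import math
from collections import Counter

STOP_WORDS = {"is", "based", "on", "of", "the", "for"}  # illustrative list

def extract_keywords(title):
    """Lowercase a project title, split on whitespace, drop stop words."""
    return [w for w in title.lower().split() if w not in STOP_WORDS]

def weight_keywords(token_lists):
    """Weight each keyword by count * log(N / n_t) across all users."""
    n = len(token_lists)
    doc_freq = Counter(t for tokens in token_lists for t in set(tokens))
    return [
        {t: c * math.log(n / doc_freq[t]) for t, c in Counter(tokens).items()}
        for tokens in token_lists
    ]

def cosine(a, b):
    """Cosine similarity between two {keyword: weight} dicts."""
    dot = sum(w * b[k] for k, w in a.items() if k in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical project titles supplied by three new users.
projects = {
    "u1": "Research on image recognition based on convolutional neural network",
    "u2": "Convolutional neural network for speech recognition research",
    "u3": "Research on the history of medieval trade",
}
users = list(projects)
vectors = dict(zip(users, weight_keywords([extract_keywords(projects[u]) for u in users])))
sim_12 = cosine(vectors["u1"], vectors["u2"])
sim_13 = cosine(vectors["u1"], vectors["u3"])
```

Because "research" occurs in every title, its weight is zero and it contributes nothing to the similarity, so the first and third users share no distinguishing keywords even though both titles contain that word.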
In addition to solving the cold start of the system by obtaining personal basic information, this paper will also provide users with a way to select their favorite resource tags to obtain user interest preferences. When the user selects the resource-tag category information, the university system will recommend books based on the resource preference category selected by the user, rather than the recommendation result obtained based on the user’s basic information. At the same time, the content-based recommendation module can be directly connected to the real-time recommendation service, and similar books can be calculated by the user’s rating of the current book, thereby realizing content-based real-time recommendation.
At the same time, new resources are added to the online teaching resources on the campus network at regular intervals, and these new resources do not yet have loan data. Content-based recommendation algorithms are therefore also suitable for newly updated resources. The similarity between a new resource and the user’s preferred keywords is then calculated according to the resource keyword projection model.
3.2. Calculation of the Correlation between User Preferences and Keywords
Since the keywords are only literally related to a certain extent, it is very likely that the recommended resources deviate significantly from the original topic. Therefore, it is necessary to build a keyword probability model that can analyze the latent semantics of resources and to combine the correlation between user preferences, latent keywords and tags.
According to the Markov hypothesis, keywords and preferences are considered to form lattices. Each label in the lattice has a preference observation value and a keyword probability feature. Each label in the preference lattice is connected only with its corresponding keyword and its adjacent preferences, so that, given known topic information, the statistical information of the corresponding preferences can be determined. Figure 1 is a schematic diagram of the keyword probability model under the Markov model.

It can be seen from Figure 1 that preference nodes are represented by black dots and keyword nodes by white dots. The compatibility between a preference node and a keyword node captures what the known preferences require of the keyword features; it is similar to the likelihood function of the Bayesian framework. The compatibility between two keyword nodes captures the statistical features of the keywords themselves; it is set according to the prior probability function, with the two labels in adjacent states. The joint probability of the preference nodes and topic nodes is calculated from these two compatibilities.
The keyword node corresponding to a given label is then determined by the maximum posterior probability, which in turn requires solving the marginal probability of the topic information.
In practical application, the number of resources and keyword nodes is constantly increasing, and the calculation amount of Equation (11) also increases greatly. Therefore, the belief propagation algorithm is used to improve the real-time performance and accuracy of recommendation.
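To make the maximum a posteriori step concrete, the sketch below runs max-product message passing on a simple chain of labels instead of the lattice described above. It is a simplified illustration under stated assumptions: `phi` plays the role of the preference-keyword compatibility (likelihood-like term), `psi` the keyword-keyword compatibility (prior-like term), and all names and numbers are hypothetical.

```python
def map_keywords(phi, psi):
    """MAP keyword assignment on a chain by max-product message passing.

    phi[i][s]: compatibility of keyword state s with the preference
               observed at label i (likelihood-like term).
    psi[s][t]: compatibility of adjacent keyword states s and t
               (prior-like term).
    """
    n, k = len(phi), len(psi)
    best = [list(phi[0])]   # best[i][s]: best score of a path ending in state s
    back = []               # back[i - 1][s]: best predecessor of s at label i
    for i in range(1, n):
        scores, pointers = [], []
        for t in range(k):
            prev = max(range(k), key=lambda s: best[-1][s] * psi[s][t])
            scores.append(best[-1][prev] * psi[prev][t] * phi[i][t])
            pointers.append(prev)
        best.append(scores)
        back.append(pointers)
    # Trace the best path backwards from the best final state.
    state = max(range(k), key=lambda s: best[-1][s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]

# Three labels, two candidate keyword states per label (illustrative values).
phi = [[0.9, 0.1], [0.4, 0.6], [0.2, 0.8]]
psi = [[0.7, 0.3], [0.3, 0.7]]
assignment = map_keywords(phi, psi)  # -> [0, 1, 1]
```

On a chain the messages give the exact maximizer in time linear in the number of labels; on a lattice with loops, belief propagation is generally an approximation.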
3.3. Refinement and Recommendation of Online Education Recommendation Resources
In order to improve the accuracy of online education resource recommendation, the TF-IDF algorithm is introduced. It is assumed that the model consists of the top-level node set, the bottom-level node set, and the online education network connection set. Nodes within the same set are not directly connected. Given the number of recommended users and the number of targets, a weight matrix is created whose order is determined by these numbers.
If a user has scored a label, the corresponding entry in the weight matrix is 1; otherwise, it is 0. For a label that a user has not scored, an initial amount of resources is allocated to the keyword label. First, the online teaching resources are sent from the tag to the users connected to it, and then the resources are sent from those users to all keyword tags. The resource obtained by any tag is calculated from the number of users who have scored the keyword tag and the number of keyword tags that each user has scored.
After formula (10) is iteratively calculated, the resource vectors of all labels are obtained, and the closeness of the resources required by users between an unrated label and any other label is computed from these vectors.
The closeness describes the degree of resource sharing between two tags and ranges from 0 to 1; the larger the value, the higher the degree of sharing. According to the resource closeness between the unrated tag and the remaining tags, the user’s predicted score for the unrated tag is obtained from the user’s real scores for the rated tags. After the initial resource scoring matrix is filled in this way, a relatively dense scoring matrix is obtained.
Resource clustering is completed based on the similarity matrix of user preferences and keywords, and each topic is a candidate category representative. For any two topics, the Euclidean distance is used to measure the similarity between them.
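The two-step flow of resources (tag to users, then users back to tags) resembles the standard resource-allocation process on a bipartite network. The sketch below assumes that standard form, in which each node splits its resource equally among its neighbors; whether the paper uses exactly this weighting cannot be confirmed from the text, and the function name and rating matrix are illustrative.

```python
def allocation_weights(ratings):
    """Two-step resource allocation weights between tags.

    ratings[u][t] is 1 if user u has scored tag t, else 0.
    weights[i][j] is the share of tag j's resource that reaches tag i
    after flowing tag -> users -> tags, with equal splitting at each step.
    """
    n_users, n_tags = len(ratings), len(ratings[0])
    user_degree = [sum(row) for row in ratings]  # tags scored by each user
    tag_degree = [sum(ratings[u][t] for u in range(n_users)) for t in range(n_tags)]
    weights = [[0.0] * n_tags for _ in range(n_tags)]
    for i in range(n_tags):
        for j in range(n_tags):
            if tag_degree[j] == 0:
                continue  # an unscored tag has no resource to send
            shared = sum(
                ratings[u][i] * ratings[u][j] / user_degree[u]
                for u in range(n_users) if user_degree[u]
            )
            weights[i][j] = shared / tag_degree[j]
    return weights

# Three users and three keyword tags (illustrative 0/1 rating matrix).
ratings = [
    [1, 1, 0],
    [1, 0, 1],
    [0, 1, 1],
]
w = allocation_weights(ratings)
```

Under equal splitting, every column of the weight matrix sums to 1 for a tag that has at least one rating, which reflects that the total amount of resource is conserved.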
Given a set of topics and the pairwise similarities between them, attraction information and attribution information are propagated among the topics in order to find, for each category, its member topics and its representative topic. The iterative process of the belief propagation algorithm alternates between updating these two kinds of information. The attraction degree points from one keyword to a candidate representative keyword and reflects the accumulated evidence for how well suited the candidate is to serve as the category representative of that keyword. For any keyword, the sum of the attraction degree and the attribution degree is computed over all candidates, and the candidate that maximizes this sum is taken as the category representative of the keyword.
The diagonal elements of the similarity matrix are assumed to take the same value, and the attraction and attribution are initialized. The attraction and attribution information is then updated iteratively.
The category center of each topic is obtained in this way. The algorithm stops when the number of iterations exceeds the maximum number of iterations, or when the change in information is smaller than a fixed threshold and the selected category centers remain stable over several consecutive iterations. By arranging the user preference keyword category centers in descending order, the final recommendation of online teaching resources is realized.
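The alternating update of attraction and attribution information matches the structure of affinity propagation clustering, where the two messages are usually called responsibility and availability. The sketch below assumes the standard affinity propagation update rules with damping and a negative squared Euclidean distance as similarity; these specifics, the parameter values, and the sample points are assumptions rather than details confirmed by the paper.

```python
def affinity_propagation(points, preference, damping=0.5, max_iter=200, stable_iters=15):
    """Return, for each point, the index of its category representative."""
    n = len(points)
    # Similarity: negative squared Euclidean distance; diagonal = preference.
    s = [[-sum((a - b) ** 2 for a, b in zip(points[i], points[k]))
          for k in range(n)] for i in range(n)]
    for i in range(n):
        s[i][i] = preference
    r = [[0.0] * n for _ in range(n)]  # attraction (responsibility)
    a = [[0.0] * n for _ in range(n)]  # attribution (availability)
    exemplars, unchanged = None, 0
    for _ in range(max_iter):
        # Attraction: r(i,k) = s(i,k) - max over k' != k of a(i,k') + s(i,k').
        for i in range(n):
            for k in range(n):
                rival = max(a[i][j] + s[i][j] for j in range(n) if j != k)
                r[i][k] = damping * r[i][k] + (1 - damping) * (s[i][k] - rival)
        # Attribution: evidence that k is a good representative for i.
        for k in range(n):
            positive = [max(0.0, r[j][k]) for j in range(n)]
            total = sum(positive)
            for i in range(n):
                if i == k:
                    new = total - positive[k]
                else:
                    new = min(0.0, r[k][k] + total - positive[i] - positive[k])
                a[i][k] = damping * a[i][k] + (1 - damping) * new
        current = [max(range(n), key=lambda k: a[i][k] + r[i][k]) for i in range(n)]
        unchanged = unchanged + 1 if current == exemplars else 0
        exemplars = current
        if unchanged >= stable_iters:  # centers stable over consecutive iterations
            break
    return exemplars

# Two well-separated groups of hypothetical keyword feature vectors.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
exemplars = affinity_propagation(points, preference=-50.0)
```

The shared diagonal value (`preference`) controls how many category centers emerge: values closer to zero yield more centers, and more negative values yield fewer.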
4. Experiment and Analysis
4.1. Dataset
In order to verify the effectiveness of the method, we crawled the historical data of learners on NetEase Cloud Classroom, Geek Academy, Love Course, and MOOC. Each dataset is divided into a training set, a validation set, and a test set, with 10% of the data used as the test set. The validation set is used to choose appropriate parameters, and the test set is used to evaluate the performance of the proposed method. The topics and sizes of the learning resources in the 4 datasets are different. Only 4 fields are used from each dataset: the learner ID, the learning resource ID, the learner’s rating of the learning resource (1~5 points), and the text of the learner’s comments on the learning resource. The specific dataset information is shown in Table 1.
4.2. Experimental Indicators
According to the scalar attributes of the user feedback results, the average absolute error, precision, recall, and comprehensive average value are used to evaluate the performance of the recommendation algorithm.
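Assuming that $p_{u,i}$ is user $u$’s predicted rating of teaching resource $i$, and $r_{u,i}$ is the user’s actual rating, the mean absolute error between the two over the $N$ rated user-resource pairs is defined as
$$\mathrm{MAE} = \frac{1}{N} \sum_{(u,i)} \left| p_{u,i} - r_{u,i} \right|.$$
Knowing the recommended list $R$ obtained from the training set and the actual list $T$ obtained from the test set, the precision rate $P$, the recall rate $Q$, and the comprehensive average $F$ are derived as
$$P = \frac{|R \cap T|}{|R|}, \qquad Q = \frac{|R \cap T|}{|T|}, \qquad F = \frac{2PQ}{P + Q}.$$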
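The four indicators can be computed as in the following sketch. It assumes the usual definitions (mean absolute error over rated pairs, set-overlap precision and recall, and their harmonic mean as the comprehensive average); the function names and sample values are illustrative.

```python
def mean_absolute_error(predicted, actual):
    """Average |predicted - actual| over the (user, resource) pairs in `actual`."""
    return sum(abs(predicted[key] - actual[key]) for key in actual) / len(actual)

def precision_recall_f(recommended, relevant):
    """Precision, recall, and their harmonic mean for one recommendation list."""
    recommended, relevant = set(recommended), set(relevant)
    hits = len(recommended & relevant)
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    f_value = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f_value

# Hypothetical predictions and test-set ratings keyed by (user, resource).
predicted = {("u1", "r1"): 4.0, ("u1", "r2"): 3.0}
actual = {("u1", "r1"): 5.0, ("u1", "r2"): 3.0}
mae = mean_absolute_error(predicted, actual)

# Hypothetical recommended list versus the resources the user actually chose.
precision, recall, f_value = precision_recall_f(
    ["r1", "r2", "r3", "r4"], ["r1", "r3", "r5"]
)
```

In this example two of the four recommended resources are relevant, so precision is 0.5, recall is 2/3, and the harmonic mean is 4/7.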
4.3. Contrast Method
In order to verify the effectiveness of the method in this paper, two methods, CF-IRST and DeepCoNN, were selected for comparison. The purpose of this experiment is twofold: first, to verify whether the method improves the performance of learning resource push, and second, to verify whether the method can further reduce the prediction error compared with other learning resource recommendation methods based on deep learning models.
(1) CF-IRST: this method uses the learner’s rating data to obtain the learner’s interests and preferences and is the current representative learning resource recommendation method based on collaborative filtering.
(2) DeepCoNN: a deep collaborative neural network model that takes as input the review set divided into a learner review set and a learning resource review set. It uses two parallel CNNs to obtain the learner preference vector and the learning resource feature vector from the review text, concatenates them, and then uses FM to predict the score. It is a representative deep learning recommendation method.
4.4. Results and Analysis
The CF-IRST method, the DeepCoNN method, and the algorithm in this paper are used for personalized recommendation of teaching resources, and the effectiveness and feasibility of the algorithm in this paper are verified by comparing the index data of each algorithm. The experimental data comparison results are shown in Figures 2–5.




It can be seen from the curve trends in Figures 2–5 that, compared with CF-IRST and DeepCoNN, the method in this paper has significant advantages. In Figure 2, the accuracy of each algorithm increases with the number of nearest neighbors. Because the algorithm in this paper builds the user preference keyword projection model, it suppresses the interference of outliers and the effect of noisy data on recommendation once the number of nearest neighbors reaches a certain value. The accuracy of the comparison methods gradually stabilizes, although a small increase remains. In the comparison of recall rates shown in Figure 3, the fluctuations of each algorithm are small, but the curve of the algorithm in this paper is always at a high level, which indicates that the algorithm recommends the teaching resources that users are interested in with a high probability. The trend of the weighted harmonic mean value in Figure 4 shows that the algorithm in this paper has good comprehensive performance and a relatively ideal recommendation effect. The average absolute error of each algorithm in Figure 5 shows that the error of the proposed algorithm is small and keeps declining, which indicates that the algorithm is feasible to a certain extent.
After counting the number of successful recommendations of each method, the successful recommendation rate statistics table is obtained, as shown in Table 2.
According to the data in Table 2, as the number of clicks increases, the system acquires more resources, which leads to an increase in the successful recommendation rate. Owing to the resource refinement in this method, the recommendation success rate improves significantly when the number of clicks reaches 3,000. This shows that the method can recommend teaching resources relatively quickly according to user demand information.
To sum up, the highest recommendation accuracy rate of this method is 82.2%, the highest recall rate is 78.7%, the highest comprehensive average value is 83.5%, and the average absolute error is lower than 0.56. The successful recommendation rate is as high as 97.6%. The method not only meets the standard of accurate resource recommendation and user satisfaction but also maintains a high level of resource recommendation under a large number of clicks, which achieves the purpose of this research. Because this paper considers different user preference factors and extracts preference keywords, the preference resource keyword projection model improves the user’s satisfaction with the recommended resources. The method also uses the TF-IDF algorithm to refine the online teaching resources, so as to achieve accurate recommendation of resources.
5. Conclusion
In order to improve the recommendation performance of online teaching resources in colleges and universities and the learning efficiency of users, this paper designs a recommendation method for online teaching resources that considers user preferences. Different user preference factors are selected and preference keywords are extracted, and on this basis a keyword projection model of different user preference resources is constructed. A Markov model is used to build a keyword probability model, and the TF-IDF algorithm is introduced to refine online teaching resources. Resources are then recommended according to the calculated closeness of the resources required by the user. The experimental results show that the proposed method has a highest recommendation accuracy rate of 82.2%, a highest recall rate of 78.7%, and a highest comprehensive average value of 83.5%, and its average absolute error is lower than 0.56. This is because the method considers different user preference factors and extracts the preference keywords, which improves the user’s satisfaction with the recommended resources. The successful recommendation rate of the method is as high as 97.6%, which is the result of introducing the TF-IDF algorithm to refine online teaching resources. The next stage of research will mainly address the phenomenon of “negative transfer” of teaching resources; how to solve this problem needs further study.
Data Availability
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.
Conflicts of Interest
The authors declared that they have no conflicts of interest regarding this work.