Abstract

To improve the efficiency and accuracy of personalized news recommendation, an event network-oriented personalized news recommendation algorithm is proposed. First, the event network is used to analyze and predict users’ interests and preferences and to actively push content that meets users’ personalized needs, which yields a personalized news recommendation model. Then, in the mobile Internet setting, the position of each sentence in the document, its similarity to the title, and other features are combined to calculate the sentence weight. Finally, sentences are extracted according to the weight ranking to generate the news summary, which completes the event network-oriented personalized news recommendation algorithm. The experimental results show that the proposed algorithm achieves high recall and coverage with a short response time, indicating a good recommendation effect and strong recommendation performance.

1. Introduction

With the continuous updating of network technology, traditional reading no longer plays the major role, and the public increasingly reads online [1–3]. News recommendation is a means of news filtering and user positioning. It can recommend news topics that may interest users based on their historical reading habits, ensure that users quickly and effectively obtain the information they need, reduce the cost of reading, avoid information overload, and provide high-quality, personalized services to users [4–6]. At present, recommendation systems that address information overload are applied in e-commerce, while there are few personalized recommendation systems for news. However, news is an indispensable part of daily life, and network news is updated extremely fast, so users cannot accurately find the information they need in the large volume available [7–9]. Therefore, personalized recommendation for news is of great significance.

Scholars in related fields have studied news recommendation algorithms. Reference [10] proposed an improved news recommendation algorithm based on text similarity and developed a blockchain-based distributed collaborative recommendation protocol. Considering the characteristics of the news industry, the proposed TF-N algorithm is better suited to stabilizing public opinion and has both positive and negative control effects on outbreaks of online public opinion. On the experimental data set, the algorithm outperforms TF-IDF, the traditional information retrieval and text mining technique, in both the time dimension and the emotional dimension, and it does not compromise citizens’ privacy rights. Reference [11] proposed a personalized news recommendation framework and a hybrid personalized news recommendation algorithm that combines collaborative filtering and content-based filtering. The system extracts news sets from multiple press releases and presents the recommended news to users. The framework aims to improve the accuracy of news recommendation by solving the scalability problems caused by large news corpora, enriching users’ personal information, representing the exact attributes and characteristics of news items, and recommending different sets of news items. However, the above methods still suffer from poor recommendation effect and weak recommendation performance.

To solve these problems, a personalized news recommendation algorithm for event networks is proposed. The event network is used to analyze and predict users’ interest preferences and to actively push content that meets users’ personalized needs, which yields a personalized news recommendation model. In the mobile Internet setting, the position of each sentence in the document, its similarity to the title, and other features are combined to calculate the sentence weight, and sentences are extracted in weight order to generate news summaries, so as to realize personalized news recommendation. Weight is a relative concept defined with respect to a given index: the weight of an index is its relative importance in the overall evaluation. Weights quantify how importance is distributed over the different aspects of the evaluated object, so that each evaluation factor plays a different role in the overall evaluation. An evaluation without such a focus is not an objective evaluation. The experiments show that the algorithm has a good recommendation effect and strong recommendation performance.

2. Personalized Recommendation Algorithm for Web News under Event Network

First, the user’s microblog content is matched to news in the network media. The news text is then represented as an event network, and community discovery is performed on the text event network to obtain collections of events, which are represented as topics. Using the semantic information of the topics, the user’s interests and preferences are predicted, content is actively pushed to suit the differentiated needs of different users, and a personalized news recommendation model is built. The overall process of personalized recommendation of web news under event network is shown in Figure 1.

As Figure 1 suggests, a good news recommendation system can process news resources in a timely manner, recommend personalized news to users, greatly increase user satisfaction, and bring large user traffic and income to news websites. News recommendation has therefore become a hot topic on the Internet, and major companies in the industry compete in this field, hoping to gain user traffic and monetize it.

2.1. Event Network Model

Text is the carrier of information, so text representation is a basic problem in text information processing. An event network is a semantic model with events as the basic unit. This paper uses the event network model to represent text, so as to realize the semantic information processing of text [12–14]. The model is composed of many units that are interwoven into a network. It takes the event class unit as the core and supports knowledge expression through both event units and concept units. An event is something that objectively occurs in a specific time and space, involves multiple entity roles, and is characterized by action, behavior, and change. Event granularity is divided differently in different fields. For example, in linguistics, the window granularity can be the sentence, the paragraph, or the chapter.

2.1.1. Event Definition

The concept of the event originated in cognitive science and now plays an important role in natural language and information processing. Events are the basic unit of human understanding and memory of information. Based on this understanding from cognitive science, events are regarded as the basic unit of knowledge representation. Through induction and reasoning, events can be expressed formally, so that the syntactic and semantic analysis of event sentences can be realized and applied to various fields [15–17].

An event is defined as something that occurs in a specific environment and time, involves different roles, and exhibits several differentiated action characteristics. An event $e$ can be represented as a six-tuple:

$$e = (A, O, T, V, P, L) \tag{1}$$

In formula (1), the six event elements of the tuple express the action, objects, time, environment, assertions, and language representation of the event:
(1) $A$ (action element): it represents the process and characteristics of the change that the event brings about and describes the degree, manner, method, and tool of the action.
(2) $O$ (object element): it is the collection of role objects participating in the event, which can be divided into subjects (the actors of the action) and objects (the recipients of the action).
(3) $T$ (time element): it represents the time period of the event, that is, from the beginning to the end of the event, including relative and absolute time periods.
(4) $V$ (environment element): it indicates the place where the event occurred and the characteristics of the environment.
(5) $P$ (assertion element): it includes the preassertion, intermediate assertion, and postassertion of the event. The preassertion is the set of constraints or trigger conditions satisfied by each element at the beginning of the event; the intermediate assertion is the set of conditions satisfied by each element while the event occurs; the postassertion is the set of conditions satisfied by each element at the end of the event.
(6) $L$ (language representation element): it describes the linguistic regularities of the event, including the collection of core words, the expression of core words, and the collocation of core words. The core words are the words commonly used to mark the event. The expression of core words describes the positional relationship in a sentence between the representation of each element and the core word. The collocation of core words describes how core words combine with related words.
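The six-tuple above maps naturally onto a small record type. The following Python sketch is illustrative only: the class name `Event`, the field names, and the sample values are assumptions made for this example and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Event:
    """One event as a six-tuple; all field names are hypothetical."""
    action: str                        # action element: what happens and how
    objects: Tuple[str, ...]           # object element: participating roles
    time: str                          # time element: absolute or relative span
    environment: str                   # environment element: place and conditions
    assertions: Tuple[str, ...] = ()   # pre-, intermediate, and postassertions
    language: Tuple[str, ...] = ()     # core words and their collocations

    def as_tuple(self):
        """Return the elements in the order used by the six-tuple definition."""
        return (self.action, self.objects, self.time,
                self.environment, self.assertions, self.language)


# Made-up sample event, used only to show how the fields are filled.
collapse = Event(
    action="collapse",
    objects=("earthquake", "house"),
    time="day 1, morning",
    environment="city center",
    assertions=("house standing", "ground shaking", "house destroyed"),
    language=("collapsed", "struck"),
)
```

Making the record immutable (`frozen=True`) lets events be used as graph nodes or dictionary keys, which is convenient when they are later assembled into an event network.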

2.1.2. Event Ontology

As an event-oriented representation model, event ontology is more in line with the laws of objective reality and of human cognition [18–20]. The history of the world is a long history of the evolution and change of events. To describe a constantly changing objective world, events and their relationships must be described reasonably. Event ontology can represent rich event information, so that a rich event knowledge base can be constructed as the basis for computer processing of event semantics.

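Events with the same characteristics are collected into an event class. An event class $EC$ can be represented as:

$$EC = (E, C_A, C_O, C_T, C_V, C_P, C_L), \quad C_i = \{c_{i1}, c_{i2}, \ldots, c_{im}, \ldots\} \quad (i \in \{A, O, T, V, P, L\}) \tag{2}$$

In formula (2), the event set $E$ is the extension of the event class. $C_i$ is called the connotation of the event class; it is the set of common features of element $i$ over all events in $E$, and $c_{im}$ is one common feature of element $i$ shared by every event in the event class.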

An ontology is an explicit formal specification of a shared conceptual model. Accordingly, an event ontology can be defined as a formal specification of an objectively existing, shared system model of event classes. The event ontology $EO$ is represented as a triple:

$$EO = (ECs, R, Rules) \tag{3}$$

In formula (3), $ECs$ is the set of all event classes, $R$ is the set of relations between event classes, including classification and nonclassification relations, and $Rules$ is the set of rules expressed in a logical language, which can be used for event transformation and reasoning.

The classification relationship between events, also called the parent–child or superordinate–subordinate relationship, is an inheritance relationship and is static. Nonclassification relationships include composition, causality, following, accompanying, and concurrency, among others:
(1) Composition: if a large event can be divided into a number of smaller events, the large event is said to be composed of the smaller ones. For example, “building a house” is composed of “laying foundations,” “laying bricks,” “painting,” and so on.
(2) Causality: if the occurrence of one event is caused by another, the two events are causally related. Causality is an important relationship between events because it reflects both how the events are related and the order in which they occur. For example, “earthquake” and “house collapse” are causally related: “earthquake” is the cause, and “house collapse” is the result.
(3) Following: the following relationship reflects the order in which two events occur, but neither event necessarily causes the other, as with “get up” and “brush your teeth.”
(4) Concurrency: two or more events occur at short intervals, almost simultaneously. Such events are sometimes a series triggered by the same event, as with “wind” and “rain.”

2.1.3. Event Network

Events are the basic unit of human cognition, understanding, and memory. Text representation should take events as the starting point and develop towards semantics. The six elements of the event itself have rich semantic information and can describe the event information in detail. The event network uses higher granularity events as the basic unit to represent the text content, which is more in line with the laws of the objective world and the laws of human cognition. The event is regarded as the feature item of text representation, and the event network is used as the text representation model, and the properties and operations of the event network are studied, so as to lay the foundation for the semantic processing of text.

An event network is a directed acyclic graph composed of many different event nodes. Each node stands for an independent event, and each edge represents the relationship between the two events it connects. The event network $EN$ is a two-tuple:

$$EN = (E, R) \tag{4}$$

In formula (4), $E$ is the set of events, and $R$ is the set of event relationships. There are many kinds of event relationships; the common ones are composition, inheritance, following, accompanying, and causality. An event network is a special kind of graph in which every node and every edge carries information, and there can be multiple edges between two nodes. Analyzing the structure of the event network reveals the importance of the events in the network, their characteristics, and their dynamics. An event network therefore has the properties of a graph, and its nodes and edges also carry semantic information. By combining this information with properties and operations from graph theory, classification, clustering, reduction, expansion, merging, and even mining and reasoning can be realized on event networks. Event networks thus have unique properties and computing methods that differ from those of general graphs. The event network hierarchy is shown in Figure 2.
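As a minimal sketch of the structure just described, the following Python class stores events as nodes and keeps a list of relation labels per ordered node pair, so that two events can be linked by more than one relation. The class name `EventNetwork`, its methods, and the out-strength measure are assumptions made for illustration; the paper does not specify an implementation.

```python
from collections import defaultdict


class EventNetwork:
    """Directed multigraph: nodes are events, edges carry a relation label."""

    def __init__(self):
        self.events = set()                 # set of event nodes
        self.relations = defaultdict(list)  # (source, target) -> relation labels

    def add_relation(self, source, target, relation):
        """Add one labeled edge; the same pair may hold several relations."""
        self.events.update((source, target))
        self.relations[(source, target)].append(relation)

    def successors(self, event, relation=None):
        """Events one edge away from `event`, optionally filtered by relation."""
        return sorted(
            tgt for (src, tgt), labels in self.relations.items()
            if src == event and (relation is None or relation in labels)
        )

    def out_strength(self, event):
        """Number of outgoing edges, a simple proxy for node importance."""
        return sum(len(labels) for (src, _), labels in self.relations.items()
                   if src == event)


net = EventNetwork()
net.add_relation("earthquake", "house collapse", "causality")
net.add_relation("earthquake", "house collapse", "following")
net.add_relation("building a house", "laying foundations", "composition")
net.add_relation("get up", "brush your teeth", "following")
```

Here `net.successors("earthquake", "causality")` returns the events caused by the earthquake, while `net.out_strength` counts labeled edges rather than distinct neighbors, which is one possible way to let multiple relations raise the importance of a node.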

Many problems can be solved by applying mathematical methods at the abstract level of the event network. In a specific application of the event network model, events represented by different event models and methods can each be abstracted into an event node. Ignoring the internal structure of an event and focusing only on its external features helps in analyzing the flow of events in the objective world from the outside in. On the basis of network operations, the model supports the division of text into logical levels, classification and clustering, automatic summarization, topic detection and tracking, and so on. For example, nodes can be clustered with node strength analysis from complex network theory, and important nodes can be mined with node association methods to realize key sentence discovery and automatic text summarization. In addition, community structure detection algorithms from complex networks can be applied to the event network to obtain event clusters, which enables the detection and tracking of text topics.

2.2. Community Detecting Algorithm

The community detection algorithm is used to process the event network semantically [21–23]. The event network has the characteristics of a complex network, so community discovery algorithms for complex networks can be used to divide its event nodes. In the resulting division, events that belong to the same community are highly similar, and events that belong to different communities have low similarity.

To describe network communities quantitatively, the community structure must be clearly divided and a modularity function $Q$ must be defined. Modularity is the difference between the fraction of edges that connect nodes inside the same community and the expected value of that fraction in a random network with the same node degrees. The modularity function is:

$$Q = \frac{1}{2m} \sum_{i,j} \left[ A_{ij} - \frac{k_i k_j}{2m} \right] \delta(c_i, c_j) \tag{5}$$

In formula (5), $c_i$ and $c_j$ are the communities to which nodes $i$ and $j$ belong, $A_{ij}$ is an element of the adjacency matrix of the network, $m$ is the number of edges in the network graph, and $k_i$ is the degree of node $i$. The function $\delta(c_i, c_j)$ equals 1 when $c_i = c_j$ and 0 otherwise.
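To make the modularity function concrete, the following Python sketch evaluates it for an undirected, unweighted graph. It uses the standard per-community form, which is algebraically equivalent to the pairwise sum over nodes; the function name `modularity` and the toy graph of two triangles joined by one edge are assumptions made for this example.

```python
def modularity(edges, community):
    """Modularity of an undirected, unweighted graph.

    edges: list of (u, v) pairs, each undirected edge listed once.
    community: dict mapping every node to its community label.
    """
    m = len(edges)
    if m == 0:
        return 0.0
    degree = {}     # node -> degree k_i
    internal = {}   # community -> number of edges with both ends inside it
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
        if community[u] == community[v]:
            c = community[u]
            internal[c] = internal.get(c, 0) + 1
    degree_sum = {}  # community -> total degree of its nodes
    for node, k in degree.items():
        c = community[node]
        degree_sum[c] = degree_sum.get(c, 0) + k
    # Per-community form: sum over c of l_c / m - (d_c / 2m)^2.
    return sum(internal.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree_sum.items())


# Two triangles joined by a single edge.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"), ("c", "d")]
split = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
merged = {node: 0 for node in split}
```

For this toy graph, the partition into the two triangles gives a modularity of 5/14, while placing all nodes in one community gives 0, so a community detection algorithm that maximizes modularity would prefer the split.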

2.3. Building a Personalized News Recommendation Model

The goal of this research is to recommend targeted, personalized news to users effectively. Personalized news recommendation provides each user with information that matches the user’s preferences and interests, so that users acquire information actively and filter it efficiently. The personalized news recommendation model calculates the vector similarity between the news content model and the user interest model and decides whether to recommend a news item by comparing the similarity with a predefined threshold. News on topics for which the user has not yet shown interest is also recommended, which prevents the recommended topics from converging excessively and increases the diversity and novelty of recommendation. For a news item A that the user is visiting, the “nearest neighbor” news items B similar to A are found. The larger the similarity value, the more similar A and B are, and the greater the weight of B’s score in recommendation prediction. The cosine similarity algorithm is also used to calculate the similarity between news items. The results are sorted by weight from high to low, and the top three are inserted into the user’s recommendation list together with the news that matches the user’s interests. If the user is interested in this kind of news, the user’s topic preference changes with the user’s access behavior.

The content-based personalized news recommendation model finds a list of news similar to the news the user has browsed in the past. This method avoids the item cold-start problem: when news is published on the network, its personalized recommendation score can be obtained by calculating the similarity between the user’s historical search information and the news. As the user’s browsing records accumulate, the recommendations become more and more precise. Usually, the keywords of the news in the user’s reading history are extracted as the user interest preference model. The content-based personalized news recommendation model is shown in Figure 3.

As Figure 3 shows, the recommendation algorithm derives several keywords from the historical search records that reflect the user’s interests and uses them as the user’s interest model. It then calculates the similarity between each candidate news item and the user’s interest preference to produce a recommendation list. At present, the most commonly used text feature representation is the vector space model, which constructs a keyword table by extracting keywords. The user’s interest features and the news features are mapped to high-dimensional vectors, and the similarity between user interests and items is measured by the similarity of the vectors.

Suppose that a user interest preference or the features of a news item can be expressed as a vector of the form:

$$V = (w_1, w_2, \ldots, w_n) \tag{6}$$

In formula (6), $w_i$ represents the weight of feature $i$, that is, the user’s preference for this keyword. The weights are computed with the TF-IDF algorithm:

$$w_{i,d} = tf_{i,d} \times idf_i \tag{7}$$

In formula (7), $tf_{i,d}$ is the number of occurrences of word $i$ in document $d$, and $idf_i$ is the inverse document frequency of word $i$. The inverse document frequency indicates whether a word is a stop word or another word that contributes nothing to the meaning of the sentence. In theory, the more often a word appears in an article, the more important it is, but stop words also appear very often. The inverse document frequency is introduced to correct for this: a word that is repeated across many documents is treated as unimportant, which offsets the keyword extraction error caused by stop words. The content-based personalized news recommendation model is implemented in three steps. First, the range of items to be recommended is determined, and the features of each item are extracted as its feature representation. Second, the user’s interest distribution model is extracted from the user’s historical browsing records. Finally, the similarity between the user’s interest features and each candidate item is calculated, and the documents to recommend are chosen by maximum similarity. For a user interest vector $U = (u_1, \ldots, u_n)$ and a news feature vector $D = (d_1, \ldots, d_n)$ of the form in formula (6), the cosine similarity is:

$$\mathrm{sim}(U, D) = \frac{\sum_{i=1}^{n} u_i d_i}{\sqrt{\sum_{i=1}^{n} u_i^2} \, \sqrt{\sum_{i=1}^{n} d_i^2}} \tag{8}$$

The advantage of the content-based personalized news recommendation model is that it solves the item cold-start problem. For example, when a new news item is released, its text feature vector can be extracted and its similarity with the user’s interest features can be calculated to generate recommendations. On this basis, the main structure of the personalized news recommendation model aims to improve the expression of user interests, express the latent meaning of news text topics more accurately, meet users’ differentiated needs for news recommendation services, and improve both the accuracy of personalized news recommendation and user satisfaction.
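The three steps of the content-based model (item features, user profile, similarity ranking) can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the inverse document frequency is taken as log(N / df), which is one common variant that the paper does not confirm, the user profile is simply the vector of one previously read item, and the names `tfidf_vectors`, `cosine`, and `recommend`, the threshold value, and the toy corpus are all hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """docs: dict name -> list of tokens. Returns name -> {term: tf * idf}."""
    n_docs = len(docs)
    doc_freq = Counter(t for tokens in docs.values() for t in set(tokens))
    vectors = {}
    for name, tokens in docs.items():
        counts = Counter(tokens)
        # Assumed idf variant: log(N / df); terms in every document get weight 0.
        vectors[name] = {t: c * math.log(n_docs / doc_freq[t])
                         for t, c in counts.items()}
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recommend(user_vec, news_vecs, threshold=0.1, top_k=3):
    """Keep candidates at or above the threshold and return the top_k names."""
    scored = [(cosine(user_vec, vec), name) for name, vec in news_vecs.items()]
    kept = [(s, name) for s, name in scored if s >= threshold]
    return [name for s, name in sorted(kept, key=lambda p: (-p[0], p[1]))[:top_k]]


docs = {
    "n1": ["stock", "market", "rally"],
    "n2": ["stock", "market", "crash"],
    "n3": ["football", "cup", "final"],
    "n4": ["market", "bank", "rate"],
}
vectors = tfidf_vectors(docs)
user_profile = vectors["n1"]  # the user has read n1
candidates = {k: v for k, v in vectors.items() if k != "n1"}
```

Calling `recommend(user_profile, candidates)` keeps only items whose similarity to the profile reaches the threshold. Because a new item needs only its own text to be vectorized, it can be scored immediately, which is the cold-start advantage described above.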

3. News Summarization Algorithm Based on Combined Feature LDA

3.1. Mobile Internet and Its Characteristics

Mobile Internet refers to the use of mobile devices by users to access the Internet through mobile communication networks, which is the product of the combination of mobile communication networks and the Internet. The main online behaviors of current mobile users include mobile news reading, mobile search, mobile web browsing, mobile music and video playback, mobile application download and use, mobile social services, mobile network office, mobile e-commerce, and so on.

The mobile Internet is characterized by mobility, context awareness, and the personalization of mobile devices. Mobile users in the Internet environment have more specific and detailed identities. For example, basic attributes of mobile users can be obtained from the information they fill in when registering for network access or through machine learning and data mining techniques. Other information, such as the user’s geographic location, can be obtained through GPS. Mobile terminal devices come in many brands and models and are therefore more personalized and diversified. Because mobile devices have small screens, the display and description of recommendation results must be adapted to mobile users’ browsing behavior to improve the user experience. To work within the limitations of mobile terminals, applications on the mobile Internet are generally small programs. With the rapid development of network information technology, future mobile devices will be more intelligent and more capable. In addition, the development and wide application of cloud storage and cloud computing have contributed to the rapid spread of smart mobile devices in various fields. Relying on advanced wireless Internet technology, most users can conveniently use the network according to their own needs, search for the data they need, and receive news pushed in real time. The news topics that mobile users follow also differ across time periods.

3.2. Latent Dirichlet Allocation (LDA) Topic Model

As a widely applied topic model, LDA is a three-layer Bayesian probability model that connects documents with words through latent topics. Like many probabilistic models, LDA adopts the bag-of-words assumption: it considers only word frequency and ignores word order and grammar [24–26]. The LDA model describes a probabilistic sampling process that generates the words in documents from latent topics. The generation steps are:
(1) Generate a topic distribution $\theta_d \sim \mathrm{Dir}(\alpha)$ for each document $d$ in the corpus $D$;
(2) To generate the $n$-th word $w_{d,n}$ in document $d$: generate a topic $z_{d,n} \sim \mathrm{Mult}(\theta_d)$, generate a discrete distribution $\varphi_z \sim \mathrm{Dir}(\beta)$ for topic $z = z_{d,n}$, and generate the word such that $w_{d,n} \sim \mathrm{Mult}(\varphi_z)$.
Here, $\alpha$ represents the prior weight distribution of the topics before sampling, and $\beta$ represents the prior distribution of each topic over words. The probabilistic model of the LDA generation process is shown in Figure 4.

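In Figure 4, the outer rectangle represents the topic distribution $\theta$ that is sampled repeatedly for every document in the set $D$, and the inner rectangle represents the words that are repeatedly sampled from the topic distribution to generate document $d$. The probability model of the LDA generation process is:

$$p(\theta, z, w \mid \alpha, \beta) = p(\theta \mid \alpha) \prod_{n=1}^{N} p(z_n \mid \theta) \, p(w_n \mid z_n, \beta) \tag{9}$$

In formula (9), $w$ is the observed variable, $\theta$ and $z$ are hidden variables, and $N$ is the number of words in the document.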
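The generative process can be simulated directly, which helps clarify which quantities are sampled per corpus, per document, and per word. The Python sketch below uses only the standard library and draws Dirichlet samples by normalizing Gamma draws. It follows the standard formulation in which the word distribution of each topic is drawn once per topic; the function names, the symmetric prior values, and the toy vocabulary are assumptions made for illustration, and the sketch covers generation only, not inference.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw from Dirichlet(alpha) by normalizing independent Gamma(alpha_i, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [x / total for x in draws]


def generate_corpus(n_docs, doc_len, vocab, n_topics, alpha=0.5, beta=0.5, seed=0):
    """Generate documents as lists of words following the LDA generative story."""
    rng = random.Random(seed)
    # One word distribution per topic, drawn from a symmetric Dirichlet(beta).
    phi = [sample_dirichlet([beta] * len(vocab), rng) for _ in range(n_topics)]
    corpus = []
    for _ in range(n_docs):
        # One topic distribution per document, drawn from a symmetric Dirichlet(alpha).
        theta = sample_dirichlet([alpha] * n_topics, rng)
        words = []
        for _ in range(doc_len):
            z = rng.choices(range(n_topics), weights=theta)[0]  # topic of this word
            w = rng.choices(vocab, weights=phi[z])[0]           # word given the topic
            words.append(w)
        corpus.append(words)
    return corpus


vocab = ["market", "stock", "bank", "match", "goal", "team"]
corpus = generate_corpus(n_docs=3, doc_len=8, vocab=vocab, n_topics=2)
```

Small values of the priors concentrate each document on few topics and each topic on few words, which is why the generated documents tend to repeat a handful of vocabulary items.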

3.3. Basic Features of Sentences

Basic features characterize the importance of sentences in documents. They include sentence length, position, and title similarity.
(1) Length feature $F_{len}(s)$: to avoid biasing the summary toward long sentences, a weight is applied to the sentence length. The length feature of a sentence is defined in formula (10) in terms of the length of the sentence and the average sentence length in the document set.
(2) Location feature $F_{pos}(s)$: suppose a document has $n$ sentences and $s_i$ is the $i$-th sentence in it. The location feature of sentence $s_i$ is defined in formula (11).
(3) Similarity feature $F_{sim}(s)$: the title of a document is regarded as its most important sentence. The title and each sentence in the document are vectorized, and the similarity between them is calculated.

In this paper, the extraction method based on these three basic features is used as the baseline. The basic feature score of a sentence $s$ in the baseline system is the weighted sum of the three feature values:

$$F_{base}(s) = \lambda_1 F_{len}(s) + \lambda_2 F_{pos}(s) + \lambda_3 F_{sim}(s)$$

where $\lambda_1$, $\lambda_2$, and $\lambda_3$ are the weights of the length, location, and title similarity features.

News documents differ from ordinary text documents. The summary sentences and key sentences of a news item generally appear early, at the beginning of a paragraph. The position of a sentence is therefore a better indicator of whether it can serve as a summary sentence, and the location feature receives a larger weight.
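The baseline score can be sketched as follows. The exact forms of the length and location features are not recoverable from the text, so this Python sketch substitutes simple stand-ins that are assumptions only: a length feature that peaks at the average sentence length, a location feature that decreases linearly from the first sentence, and a cosine similarity over word counts for the title feature. The weights (0.2, 0.5, 0.3) are likewise hypothetical and merely give the location feature the largest weight, as the paragraph above suggests.

```python
import math
from collections import Counter


def length_feature(sentence_len, avg_len):
    """Assumed form: 1.0 at the average length, lower for shorter or longer sentences."""
    if sentence_len == 0 or avg_len == 0:
        return 0.0
    return min(sentence_len, avg_len) / max(sentence_len, avg_len)


def position_feature(index, n_sentences):
    """Assumed form: 1.0 for the first sentence (index 0), decreasing linearly."""
    return 1.0 - index / n_sentences


def title_similarity(title_tokens, sentence_tokens):
    """Cosine similarity between bag-of-words count vectors."""
    t, s = Counter(title_tokens), Counter(sentence_tokens)
    dot = sum(c * s[w] for w, c in t.items())
    norm = (math.sqrt(sum(c * c for c in t.values()))
            * math.sqrt(sum(c * c for c in s.values())))
    return dot / norm if norm else 0.0


def baseline_scores(title, sentences, weights=(0.2, 0.5, 0.3)):
    """Weighted sum of the three basic features for each tokenized sentence."""
    w_len, w_pos, w_sim = weights
    avg_len = sum(len(s) for s in sentences) / len(sentences)
    return [
        w_len * length_feature(len(s), avg_len)
        + w_pos * position_feature(i, len(sentences))
        + w_sim * title_similarity(title, s)
        for i, s in enumerate(sentences)
    ]


title = ["rates", "rise"]
sentences = [["rates", "rise"], ["markets", "react"]]
scores = baseline_scores(title, sentences)
```

In this toy input the first sentence matches the title and comes first, so it receives the maximum score of 1.0, while the second sentence scores lower on both the location and similarity features.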

3.4. LDA Topic Probability Features of Sentences

To represent the relationship between words, sentences, and documents, this paper adopts a four-layer LDA model to extract the topic probability distributions of documents and sentences separately and obtains the topic similarity feature between the sentence model and the document model.
(1) Sentence weight calculation: in general, the more similar the topic expressed by a sentence is to the topic expressed by its document, the more likely the sentence is to be selected as a summary sentence. The similarity between the topic probability distributions of the sentence and the document is therefore calculated when judging the importance of a sentence. The higher the similarity, the higher the LDA feature score of the sentence. The similarity of two probability distributions can be measured by the KL divergence. In formula (14), $p(z_k \mid s)$ is the probability of sentence $s$ under topic $z_k$, and $p(z_k \mid d)$ is the probability of document $d$ under topic $z_k$. $D_{KL}(P \,\|\, Q)$ denotes the divergence between two probability distributions $P$ and $Q$:

$$D_{KL}(P \,\|\, Q) = \sum_{k} P(k) \log \frac{P(k)}{Q(k)} \tag{15}$$

(2) Calculation of topic probability distribution: LDA is used to model the document collection, and the news documents are divided into $K$ preset topics. A document can therefore be represented by a $K$-dimensional vector $(p(z_1 \mid d), p(z_2 \mid d), \ldots, p(z_K \mid d))$, where $z_k$ is the $k$-th topic and the component $p(z_k \mid d)$ is the probability that the document belongs to topic $z_k$. Similarly, a sentence is represented by a $K$-dimensional vector $(p(z_1 \mid s), p(z_2 \mid s), \ldots, p(z_K \mid s))$, where the component $p(z_k \mid s)$ is the probability that the sentence belongs to topic $z_k$.

The topic distribution of each document can be obtained from $\theta$ in the LDA model. A sentence usually consists of several words, so its topic distribution can be obtained from the topic probabilities of the words it contains. For a sentence $s = \{w_1, w_2, \ldots, w_n\}$, where $w_i$ is a word in the sentence, the topic distribution of the sentence is computed from the topic probabilities of its words.

The combined features above define the weight of each sentence in a news document. The sentences are ranked by weight and the top-ranked ones are extracted to form the news summary, which serves the purpose of recommending personalized news to users.
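The topic-based feature and the final extraction step can be illustrated with a short Python sketch. The KL divergence follows its standard definition. How the paper converts divergence into a score is not recoverable from the text, so the mapping 1 / (1 + KL) used here is an assumption chosen only because it yields higher scores for more similar distributions. The function names and the sample weights are hypothetical.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """Sum over k of p_k * log(p_k / q_k); eps guards against zero probabilities."""
    return sum(pk * math.log((pk + eps) / (qk + eps))
               for pk, qk in zip(p, q) if pk > 0)


def topic_score(sentence_topics, doc_topics):
    """Assumed mapping from divergence to a score: identical distributions give 1.0."""
    return 1.0 / (1.0 + kl_divergence(sentence_topics, doc_topics))


def extract_summary(sentences, weights, k=2):
    """Pick the k highest-weighted sentences, then restore document order."""
    top = sorted(range(len(sentences)), key=lambda i: -weights[i])[:k]
    return [sentences[i] for i in sorted(top)]


doc_topics = [0.5, 0.5]
on_topic = topic_score([0.5, 0.5], doc_topics)
off_topic = topic_score([1.0, 0.0], doc_topics)
summary = extract_summary(["s0", "s1", "s2"], [0.2, 0.9, 0.5], k=2)
```

Restoring document order after selection keeps the extracted summary readable, since the highest-weighted sentence is not necessarily the one that should be read first.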

4. Experimental Analysis

To verify the proposed algorithm, two sources are used as experimental data: the data set of the DUC2007 news summarization evaluation task and news reports crawled from the Netease website. DUC2007 contains 45 document collections; each collection contains 25 documents on a common topic or related topics, and the documents are split into sentences by software. The experiment establishes an LDA model for each document set. In addition, 1000 users are randomly selected from Caixin.com, a well-known domestic financial news website, and all of their news browsing records are extracted, 116237 records in total. Every record contains four fields: news ID, browsing time, news text content, and user ID. The user IDs have been anonymized to protect user privacy. The experimental data are divided into two parts: 1/5 of the data form the test set, and the remaining 4/5 form the training set. Because news recommendation is time-sensitive, the data are divided by time to test the strengths and weaknesses of the algorithm. From the 1000 registered users, 500 were selected, and about 50,000 pieces of behavior data were used for the experiments.

To test the personalized news recommendation efficiency of this algorithm, the recall rate is used as the evaluation index. The recall rate is the ratio of the number of recommended news items that meet the user’s needs to the total number of news items that the user is interested in. The higher the recall rate, the better the personalized news recommendation effect of the method. It is calculated as follows:

$$\mathrm{Recall} = \frac{N_{hit}}{N_{int}} \times 100\% \tag{17}$$

In formula (17), $N_{hit}$ is the number of recommended news items that the user really likes, and $N_{int}$ is the number of news items that the user is interested in. The algorithm in this study is compared with the algorithms in references [10, 11]; the recall rates of personalized news recommendation are compared in Figure 5.

According to Figure 5, when the number of news items reaches 50,000, the average recall rate of personalized news recommendation is 78.6% for the algorithm of reference [10] and 69.2% for the algorithm of reference [11], while the average recall rate of the algorithm in this study is as high as 95.4%. The recall rate of the personalized recommendation algorithm for web news under event network is therefore clearly higher, which indicates that the algorithm has good recommendation efficiency.

To verify the personalized recommendation performance of the proposed algorithm, coverage is selected as the evaluation index. Coverage refers to the ratio of the number of all recommended news items to the total number of news items that the user is interested in. The higher the coverage, the stronger the personalized news recommendation performance of the method.

In formula (18), the numerator is the total number of recommended news items. The algorithm in reference [10], the algorithm in reference [11], and the proposed algorithm are compared, and the coverage results of personalized news recommendation are shown in Figure 6.

According to Figure 6, when the total number of news items reaches 50,000, the average coverage rate of personalized news recommendation is 84.8% for the algorithm in reference [10] and 74.6% for the algorithm in reference [11], while the average coverage rate of the personalized recommendation algorithm for web news under the event network is 97.5%. The coverage rate of the proposed algorithm is therefore higher than that of the other two algorithms, which shows that the algorithm in this study has strong personalized news recommendation performance.
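Coverage as defined in this section can be sketched in the same way. The code follows the verbal definition literally (all recommended items over all items of interest); the function name and the sample news IDs are hypothetical.

```python
def coverage(recommended, interested):
    """Ratio of the number of recommended items to the number of items of interest."""
    recommended, interested = set(recommended), set(interested)
    if not interested:
        return 0.0  # Assumption: coverage is reported as 0 for an empty interest set.
    return len(recommended) / len(interested)


# Hypothetical example: 4 recommended items against 5 items of interest.
recommended = ["n1", "n2", "n3", "n4"]
interested = ["n2", "n3", "n5", "n6", "n7"]
coverage_value = coverage(recommended, interested)
print(coverage_value)
```

Unlike recall, this ratio does not require the recommended items to be ones the user is interested in, so under this definition it can exceed 1 when more items are recommended than the user is interested in.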

On this basis, the time consumed by personalized recommendation of web news is compared across the different algorithms. The algorithm in this study is compared with the algorithms in references [10, 11], and the response times of personalized news recommendation are shown in Table 1.

According to Table 1, as the total amount of web news increases, the response time of personalized news recommendation increases for all of the algorithms. When the total number of news items reaches 50,000, the response times of the algorithms in reference [10] and reference [11] are 2.85 s and 2.48 s, respectively, while the response time of the algorithm in this study is 1.2 s, less than half of the response time of either reference algorithm.
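A response-time comparison of this kind can be set up along the following lines. This is a sketch under stated assumptions: `toy_recommend` is a hypothetical stand-in for a recommender, and the pool sizes other than 50,000 are illustrative rather than taken from Table 1.

```python
import time


def mean_response_time(recommend, user_id, news_pool, repeats=5):
    """Average wall-clock time in seconds of one recommendation call."""
    start = time.perf_counter()
    for _ in range(repeats):
        recommend(user_id, news_pool)
    return (time.perf_counter() - start) / repeats


def toy_recommend(user_id, news_pool, k=10):
    # Hypothetical recommender: scans the whole pool and keeps the first k items.
    scored = [(len(news_id), news_id) for news_id in news_pool]
    return [news_id for _, news_id in scored[:k]]


timings = {}
for size in (10_000, 30_000, 50_000):
    pool = [f"n{i}" for i in range(size)]
    timings[size] = mean_response_time(toy_recommend, "u1", pool)

for size, seconds in timings.items():
    print(f"{size} news items: {seconds:.6f} s")
```

Averaging over several calls with `time.perf_counter` reduces the effect of one-off delays on the reported figure.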

In conclusion, when the total number of news items reaches 50,000, the average recall rate of personalized news recommendation of the proposed algorithm is 95.4%, the average coverage rate is 97.5%, and the response time is only 1.2 s. These results indicate that the proposed algorithm has a good recommendation effect, strong personalized news recommendation performance, and a short recommendation response time.

5. Conclusion

This paper studies a personalized recommendation algorithm for web news under the event network, in which the event network is used to analyze users' interests and construct a personalized news recommendation model. Under mobile Internet technology, sentence features are combined to calculate sentence weights, and sentences are extracted in order of weight to generate news summaries, thereby achieving personalized news recommendation. The proposed algorithm has a good recommendation effect and strong recommendation performance. However, the algorithm does not take user experience into account. Therefore, in future research, the acquired interest preferences of mobile users will be used to predict new points of interest for users based on the similarity between mobile users, in order to improve user experience.

Data Availability

The raw data supporting the conclusions of this article can be obtained from the author upon request, without undue reservation.

Conflicts of Interest

The author declares that there are no conflicts of interest regarding this work.