Abstract
The Internet has become one of the most important channels through which users obtain information and knowledge. It is crucial to identify users’ personalized requirements accurately and efficiently from the huge volume of network document resources. Group recommendation serves a group of users who take part in a common activity by recommending items that match the shared interests of all members. This paper proposes a group recommendation system for network document resource exploration that uses the knowledge graph and LSTM in edge computing and can effectively alleviate the problems of information overload and resource trek. An extensive system test has been carried out in the field of big data applications in the packaging industry. The experimental results show that, by using the knowledge graph and LSTM in edge computing, the proposed system recommends network document resources more accurately and further improves recommendation quality. Therefore, it can meet users’ personalized resource needs more effectively.
1. Introduction
With the popularity of the Internet, network resources have become people’s first choice for finding information. Network document resources are a special kind of Internet resource, and their rapid growth makes the problems of “information overload” and “resource trek” increasingly serious, preventing people from collecting and obtaining information efficiently. For example, the keyword “recommendation system” returns more than 19 million query results in Baidu Library. So much information is presented at once that it is difficult for people to make correct and efficient choices and to obtain the resources they are really looking for. As an essential means of information filtering, the recommendation system is one of the most effective methods to solve the current “information overload” and “resource trek” problems [1]. However, most existing recommendation systems support only a single user. In recent years, with the development of social networks and online communities, and because human beings are social animals, users with similar interests have formed groups and taken part in activities together. Group recommendation systems have been successfully applied to learning [2], academic knowledge [3], audio and video services [4], travel [5], communications [6], and other fields, and they have gradually become one of the research hotspots in the field of recommendation systems. Methods, theories, and applications of group recommendation systems have been studied in depth abroad, while research on group recommendation systems has just begun in China. In a group recommendation system, the recommendations depend on the selected preference fusion strategy.
With the deepening of the era of big data [7, 8], the application of deep learning [9] and the knowledge graph [10] in the recommendation system has received increasing attention from academia and industry. On the one hand, recommendation systems based on deep learning have become a hot research topic; applying deep learning to the mining of comment text corpora can effectively improve recommendation accuracy. On the other hand, the knowledge graph can better enrich and represent the semantics of resources and provide more comprehensive and relevant information. The emergence of the knowledge graph provides an effective way to design recommendation systems in big data environments. It can enhance the semantic accuracy of the data to further improve recommendation accuracy and can also alleviate the data sparsity problem of recommendation technology.
However, current recommendation systems personalize their results by using the user’s interactions with items as features, and these interactions actually occur on the client. To obtain the user’s behavior characteristics, the recommendation model must wait for the data on the client to be sent to the server, which introduces a delay. Because the user’s behavior is not perceived in real time, the resources presented to the user cannot keep up with changes in the user’s interests. Edge computing [11] offers real-time perception and real-time feedback, which can compensate for the insufficient real-time perception and feedback capabilities of recommendation systems built on the current client-server architecture.
Based on traditional recommendation technology, this paper proposes a group recommendation system for network document resource discovery based on the knowledge graph and LSTM in edge computing. The system proactively works out the target information that matches users’ needs and alleviates the “information overload” and “resource trek” problems. The main contributions of this paper are as follows:
(1) A group recommendation system based on the knowledge graph and LSTM in edge computing is proposed. With data processed through LSTM in edge computing, the proposed system combines group recommendation, collaborative filtering-based recommendation, and knowledge-based content recommendation. The recommendation results are individually adjustable and balance the needs of the similar-interest group and of the single user, which improves recommendation accuracy.
(2) Recommendation quality is improved. The proposed system takes advantage of group recommendation and the knowledge graph to compensate for data sparsity and to increase the precision rate and recall rate of recommendation. Therefore, the proposed system has practical value for personalized recommendation systems of network document resources.
(3) A system test has been carried out in the field of big data applications in the packaging industry. The experimental results suggest that the group recommendation results meet real users’ needs with higher efficiency.
The remainder of the paper is organized as follows: Section 2 discusses the background and related work, Section 3 introduces the preliminaries of the group recommendation system, Section 4 presents the design of the group recommendation system, Section 5 reports the experiments and test results, and Section 6 gives the conclusions and future work.
2. Related Work
2.1. Recommendation System
With the accelerating growth of information on the Internet, serious “information overload” and “resource trek” problems have emerged. As a solution, the recommendation system has received wide attention in academia and business. The recommendation system is a subclass of the information-filtering system that predicts the user’s likely preferences and makes recommendations based on user preferences, habits, personalized needs, and the characteristics of information or objects (such as movies, TV shows, music, books, news, photos, and web pages); it helps users make quick decisions and improves user satisfaction [12, 13]. With the continuous development of the recommendation system in recent years, recommendation algorithms can be classified by method as follows: demographic-based recommendation [14], content-based recommendation [15], collaborative filtering-based recommendation [16], knowledge-based recommendation [17], model-based recommendation [18], association rule mining for recommendation [19], social-based recommendation [20], hybrid recommendation [21], group recommendation [22], and so on. As the application forms and scenarios of the recommendation system multiply, its research and application face some important issues, such as the cold start problem, the user niche problem, and the interpretability of personalized recommendations. Existing research focuses on constructing a personalized recommendation service based on a data model that reflects the user’s interest characteristics. Hu [23] proposed a recommendation algorithm based on user interest and the topic model to solve the problems of data sparsity, cold start, and user interest acquisition. Hu et al. [24] proposed an enhanced group recommendation method based on preference aggregation that incorporates the advantages of two existing aggregation methods and effectively improves recommendation accuracy. The authors of [25–27] proposed utility-based recommendation, which relies on the user’s own attributes to satisfy the user’s preferences, and applied it to the recommendation of papers, music, and electronic products. The goal is to maximize the user’s interests, improve accuracy, and ensure the quality of recommendation services.
2.2. Deep Learning
Deep learning-based recommendation methods can incorporate multisource heterogeneous data, including explicit or implicit feedback data from users, user portrait and item content data, and user-generated content. These methods take multisource heterogeneous data as input and train prediction models automatically in an end-to-end fashion, which integrates the data effectively into the recommendation system, thereby alleviating the data sparsity and cold start problems of traditional recommendation systems and improving recommendation capability. The application of deep learning to corpus mining is a research hotspot. After Hinton’s publication in 2006, deep learning was widely pursued by scholars in the artificial intelligence community. It is based on the neural network model but is more complex than a simple neural model, and the problems it deals with are more complex and diverse. Deep learning methods have been successfully used in many applications in the computer field, including speech recognition, speech search, natural language understanding, information retrieval, and robotics. Mokri et al. [28] applied a neural network model to retrieve highly relevant Slovak documents, processing the parts of speech of keywords, and greatly improved accuracy and recall. Exploiting the highly nonlinear characteristics of neural network algorithms and using a BP network to optimize and continually revise the weight of each parameter in the network, Xu et al. [29] constructed an information retrieval model based on users’ personalized behavior. Guezouli [30] used the correlation characteristics of neighbor nodes in a neural network to combine all documents into one network and retrieve the most relevant documents for a query. Wang et al. [31] designed a data collection and preprocessing scheme based on deep learning, which adopted a semisupervised learning algorithm with data augmentation and label guessing. A recommendation system that collects and processes data with deep learning can improve recommendation accuracy. Therefore, this direction has important research significance and practical value and has become one of the most active branches of recommendation system research.
2.3. Knowledge Graph
The knowledge graph emerged to enrich and represent the semantics of resources. It was proposed by Google in 2012 to describe the various entities or concepts that exist in the real world and the relations between them. The knowledge graph is not a substitute for ontology. Ontology describes the data schema of the knowledge graph; that is, establishing a data schema for a knowledge graph is equivalent to establishing its ontology. The knowledge graph enriches and expands on ontology, mainly at the entity level, and describes the various relationships in the real world more accurately. The knowledge graph greatly promotes the semantic annotation of digital resources and the efficient acquisition of knowledge and information. At present, Google, Sogou cubic, Baidu bosom, Microsoft Probase, etc. have preliminarily applied knowledge graph systems in industry. Most of them are general knowledge graphs, which emphasize the breadth of knowledge and include more entities. For such graphs it is difficult to have a complete and global ontology layer for unified management, and they are mainly used in search services, which do not have high accuracy requirements. There are also industry knowledge graphs, which have high accuracy requirements and rich, strict data schemas and are used to support complex decision-making. The authors of [32, 33] reviewed knowledge graph technology in academia. Hu [34] studied application-oriented knowledge graph construction. Li et al. [35] proposed an automatic knowledge graph establishment method and established a knowledge graph of the packaging industry. Chang et al. [36] summarized the application of the knowledge graph in the recommendation system. To provide semantic support for searching, understanding, analyzing, and mining, Wu et al. [37] proposed a more convenient method, based on the domain knowledge graph, for annotating network documents automatically. Jiang et al. [38] focused on graph-based trust evaluation models in Online Social Networks (OSNs). The recommendation system based on the knowledge graph enhances the semantic information of the data by connecting users with users, users with items, and items with items to further improve recommendation accuracy. Therefore, it has important research significance and practical value and has gradually become one of the most active branches of recommendation system research.
2.4. Edge Computing
In response to the difficulties faced by cloud computing, edge computing has been proposed as a new computing paradigm and has gradually become an emerging computing model that meets the needs of Internet of Everything applications. The edge devices in the edge computing model have computing and analysis capabilities and provide computing power for application developers and service providers by performing calculations at the edge of the network. Edge computing uses a distributed computing architecture to sink major applications, services, and data storage to the edge of the network, thereby bringing computing closer to the source of data. It decomposes the large tasks originally processed at the central node into multiple smaller, more manageable subtasks, which are placed close to the data source or user service terminal to provide intelligent edge services nearby. This reduces the delay of network communication and service delivery, relieves pressure on the cloud, and yields faster network service response, meeting the industry’s key requirements for real-time business, intelligent applications, security, and privacy protection. Research on edge computing has received increasing attention; it involves application fields such as smart education, smart manufacturing, smart transportation, and smart medical care and has important research significance and practical value [39–45]. Jiang et al. [46] introduced the concepts and characteristics of cloud computing and fog computing and compared the cooperation between them. The advantage of edge computing is that edge nodes have the ability to “think independently,” so that some decisions and calculations no longer depend on the cloud and the end-side can give results in a more real-time and more strategic manner. In particular, with the advent of the 5G era, low latency greatly reduces the interaction time between the end and the cloud and makes it easier to use end intelligence for lower-cost decision-making and rapid response. The cloud and the end-side are more closely integrated, and the end-side can perceive the user’s intention within seconds to make decisions. A recommendation system based on edge computing improves service performance in terms of low network latency, real-time interaction, service stability, and security. Therefore, edge computing can promote the development of the recommendation system and has become a research hotspot with great application value in practical business.
3. Preliminaries
3.1. The Computation Model of Word Vector
TF-IDF is an important concept and method in information retrieval and data mining. The TF-IDF value of a word is proportional to the frequency with which the word appears in a document and inversely proportional to the number of documents in the whole collection that contain it. Here, however, the document collection is assembled from all attribute features of the instances, including basic attributes and domain attributes, and the traditional TF-IDF model fails to reflect the contribution of the different attributes to the instance word vectors. Hence, this paper calculates the word vector with the upgraded, contribution-based TF-IDF model from the literature [37].
CTF is short for contribution of term frequency, which is defined as follows:
CIDF, short for contribution of inverse document frequency, is defined as follows:
Here, CTF(wi) denotes the contribution-based term frequency of the word wi, C(wij) denotes the frequency of wi in the text of attribute j, W(Attrij) denotes the weight of attribute j, and CIDF(wi) denotes the contribution-based inverse document frequency of wi.
The upgraded TF-IDF value is then calculated as follows:
A network document is an article such as an information item, a news item, or a paper. Its format can be structured, such as TXT and XML, or unstructured, such as WORD and PDF. In this paper, a network document is represented with the upgraded contribution-based TF-IDF word model:
In this paper, the system has a set of users, and each user has his or her own hobbies and interests. Interests are grouped by subject. Users’ interests are collected in both manual and automatic modes. In the manual mode, users add keywords themselves; in the automatic mode, the system obtains keywords by processing user access records, updates them adaptively during interaction in edge computing, and puts them into the word library. The user interest model is likewise represented with the upgraded contribution-based TF-IDF word model:
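As a rough illustration of the contribution-based weighting described in this subsection, the following Python sketch computes an attribute-weighted term frequency and combines it with an inverse document frequency. The exact forms of CTF, CIDF, and the upgraded TF-IDF are those of [37]; the weighted sum, the smoothed logarithm, the product of the two, and all function names and sample data below are assumptions made only for illustration.

```python
import math
from collections import Counter

def ctf(word, attr_texts, attr_weights):
    """Assumed CTF: sum over attributes j of C(w_ij) * W(Attr_ij)."""
    return sum(
        Counter(tokens)[word] * attr_weights[attr]
        for attr, tokens in attr_texts.items()
    )

def cidf(word, instances):
    """Assumed CIDF: smoothed log inverse of the number of instances containing the word."""
    containing = sum(
        any(word in tokens for tokens in inst.values()) for inst in instances
    )
    return math.log((1 + len(instances)) / (1 + containing)) + 1.0

def ctf_idf_vector(instance, instances, attr_weights):
    """Map each word of one instance to its assumed CTF * CIDF value."""
    vocab = {w for tokens in instance.values() for w in tokens}
    return {w: ctf(w, instance, attr_weights) * cidf(w, instances) for w in vocab}

# Hypothetical data: two instances, each with a "title" and a "body" attribute.
weights = {"title": 2.0, "body": 1.0}
docs = [
    {"title": ["packaging"], "body": ["packaging", "design"]},
    {"title": ["patent"], "body": ["design"]},
]
vec = ctf_idf_vector(docs[0], docs, weights)
```

In this sketch a word that appears in a heavily weighted attribute and in few instances ("packaging") receives a larger value than a word that appears in every instance ("design").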
3.2. LSTM
Long short-term memory (LSTM) [47] is an improved recurrent neural network (RNN). The block cell of LSTM is shown in Figure 1. Compared with traditional recurrent networks, LSTM has an additional cell state for storing long-distance information, which alleviates the vanishing gradient problem caused by excessively long sequences. The repeating module of LSTM has a different structure, containing four interacting layers, and its specially designed gate structure enables the model to decide which information to discard, which cells to update, and how to update the cell state. Because of these design characteristics, LSTM is well suited to modeling sequential data such as text. It captures long-distance dependencies better because it learns during training what information to remember and what information to forget.

The memory unit module is composed of three “gate” structures (input gate, forget gate, and output gate) and a loop connection unit. The internal parameters of the LSTM unit structure can be expressed as follows: assume that at time $t$ the input of a memory unit module is $x_t$, the output is $h_t$, and the unit status is $C_t$; then the forget gate, input gate, input conversion, unit status update, output gate, and hidden layer output of the memory unit module are given by equations (6) to (11), respectively.
(1) Forget gate: it can choose to forget certain past information and decides what information should be discarded or retained:
$$f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f) \tag{6}$$
(2) Input gate: it can remember some current information and update the unit status:
$$i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i) \tag{7}$$
$$\tilde{C}_t = \tanh(W_c \cdot [h_{t-1}, x_t] + b_c) \tag{8}$$
(3) Merge and update:
$$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t \tag{9}$$
(4) Output gate: it can determine the value of the next hidden state. The hidden state contains the related information of the previous input and can also be used for prediction:
$$o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o) \tag{10}$$
$$h_t = o_t \odot \tanh(C_t) \tag{11}$$
where $\sigma$ is the sigmoid function; $\tanh$ is the hyperbolic tangent function; $\odot$ denotes element-wise multiplication; $i_t$, $f_t$, $o_t$, and $\tilde{C}_t$ are the values of the input gate, forget gate, output gate, and input conversion of the unit; $W_i$, $W_f$, $W_o$, and $W_c$ are the weight matrices of the input gate, forget gate, output gate, and input conversion applied to $x_t$ and $h_{t-1}$; and $b_i$, $b_f$, $b_o$, and $b_c$ are the offset vectors of the input gate, forget gate, output gate, and input conversion, respectively.
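The gate computations described above can be sketched as a single time step in plain Python. This is a minimal illustration of the standard LSTM formulation, not the system's actual implementation; the helper names, the dictionary layout of the parameters, and the all-zero weights in the usage lines are assumptions.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(W, v, b):
    """Compute W . v + b for a matrix W given as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + bi for row, bi in zip(W, b)]

def gate(params, name, z, activation):
    """Apply one gate: activation(W . z + b) with (W, b) = params[name]."""
    W, b = params[name]
    return [activation(a) for a in affine(W, z, b)]

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM step; params maps 'f', 'i', 'c', 'o' to (W, b) pairs."""
    z = h_prev + x_t                          # concatenation [h_{t-1}, x_t]
    f_t = gate(params, "f", z, sigmoid)       # forget gate
    i_t = gate(params, "i", z, sigmoid)       # input gate
    c_hat = gate(params, "c", z, math.tanh)   # input conversion (candidate state)
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_hat)]
    o_t = gate(params, "o", z, sigmoid)       # output gate
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t

# Hypothetical sizes: hidden size 2, input size 1, all parameters zero.
zero = ([[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]], [0.0, 0.0])
params = {name: zero for name in "fico"}
h, c = lstm_step([1.0], [0.0, 0.0], [1.0, -1.0], params)
```

With all-zero parameters every gate evaluates to 0.5 and the candidate state to 0, so the new unit status is simply half of the previous one, which makes the role of the forget gate easy to see.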
3.3. Knowledge Graph Construction Method
The framework of the knowledge graph construction method is shown in Figure 2. It covers the lifecycle of the domain knowledge graph, which consists of five processes: ontology definition, knowledge extraction, knowledge fusion, knowledge storage, and knowledge application. Each process has its own methods and tasks. For example, in knowledge extraction, D2RQ is used to transform the atomic entity table and the atomic relation table into RDF; in knowledge fusion, tasks such as entity merging, entity linking, and attribute merging are completed according to predefined fusion rules while knowledge is extracted with D2R and Wrappers.

In this paper, the authors obtain a knowledge graph for semantic annotation. Semantic annotation helps the generation of sentence text and eliminates the ambiguity of natural language text. The entities in the knowledge graph can be used as a word segmentation dictionary. The semantics of entities, attributes, and relationships provide synonymy, inclusion, and similar relations and remove ambiguity, thus providing standard, concise, and comprehensive knowledge information.
4. The Design of Group Recommendation System
This paper designs a personalized group recommendation system for network document resources based on the knowledge graph and LSTM in edge computing. Through the knowledge graph and LSTM in edge computing, strings are given semantic meaning, the document set is associated with document features, and the knowledge related to each keyword is organized systematically, so that recommendation quality is improved.
4.1. System Architecture
Based on the knowledge graph and LSTM, and combining the content-based recommendation algorithm with the collaborative filtering algorithm, this paper presents a group recommendation system for network document resources in edge computing. Its data flow consists of five parts: data collection, data mining, data fusion, data computing, and data application. Figure 3 shows the architecture of the proposed system.

It mainly includes the following parts:
(1) Data collection: data sources include user behavior data, user interest data, system historical data, resource evaluation data, and network data. These data are the basis for building knowledge graphs and making recommendations. They are processed in the data mining part.
(2) Data mining: the collected data are cleaned and analyzed. Corpus learning and entity naming are carried out through LSTM. The processed data are then aggregated into the data fusion part.
(3) Data fusion: data from different sources are integrated as heterogeneous data under the same framework specification and stored in different types of databases for use in the data computing part.
(4) Data computing: text classification is run based on LSTM in edge computing, and the document set associated with document features is obtained. Personalized recommendation results are obtained through the user interest graph, topic association interest recommendation, semantic annotation-based content recommendation, and knowledge graph-based group recommendation and are transmitted to the data application part.
(5) Data application: the recommended results are displayed to users according to the topic of the network document resources. At the same time, relevant data are fed back to the data collection part.
4.2. The Description of Personalized Recommendation Algorithm
After corpus learning, entity naming, and text classification have been performed through LSTM, the domain knowledge graph and the document set associated with document features are ready, and recommendation can proceed.
4.2.1. Topic Recommendations Based on Interest Graph
Through the user’s interest graph, the system finds other users associated with the target user’s interests, combines them with the users who have acted on the documents that the target user has acted on, and forms a similar-interest user set U1. At the same time, through the interest graph, the user’s interests are extended to the topic layer; content-based recommendation is then performed, the documents that the target user has already acted on are removed, and the corresponding document set L1 is obtained.
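The two steps of this subsection can be sketched with plain set operations. This is only an illustration under stated assumptions: the interest graph is reduced to an adjacency mapping, "combining" the two user groups is taken to be a set union, and the data structures, function names, and sample data are hypothetical.

```python
def similar_interest_users(target, interest_graph, actions):
    """U1: users linked to the target in the interest graph, plus users who
    acted on at least one document the target has acted on (assumed union)."""
    linked = set(interest_graph.get(target, ()))
    target_docs = actions.get(target, set())
    co_actors = {
        user for user, docs in actions.items()
        if user != target and docs & target_docs
    }
    return linked | co_actors

def topic_candidates(target, user_topics, topic_docs, actions):
    """L1: documents under the target's topics that the target has not acted on."""
    docs = set()
    for topic in user_topics.get(target, ()):
        docs |= topic_docs.get(topic, set())
    return docs - actions.get(target, set())

# Hypothetical sample data.
interest_graph = {"u1": ["u2"]}
actions = {"u1": {"d1"}, "u2": {"d2"}, "u3": {"d1", "d3"}}
user_topics = {"u1": ["printing"]}
topic_docs = {"printing": {"d1", "d4"}}

U1 = similar_interest_users("u1", interest_graph, actions)
L1 = topic_candidates("u1", user_topics, topic_docs, actions)
```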
4.2.2. Content Recommendations Based on Semantic Annotation
In the process of constructing the domain knowledge graph, the documents and instances are semantically annotated to obtain an annotation library of <document, instance, similarity> triples. Then, based on the instances in the graph that the user pays attention to, content-based recommendation is performed, the documents that the target user has already acted on are removed, and the corresponding document set L2 is obtained.
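A minimal sketch of how the triple annotation library could drive this step is given below. The similarity threshold `min_sim`, the ranking by best similarity, the function name, and the sample triples are assumptions for illustration; the paper does not specify them.

```python
def annotation_candidates(followed_instances, annotations, acted_docs, min_sim=0.5):
    """L2: documents annotated with an instance the user follows, excluding
    documents already acted on, ordered by their best similarity."""
    best = {}
    for doc, instance, sim in annotations:
        if instance in followed_instances and sim >= min_sim and doc not in acted_docs:
            best[doc] = max(sim, best.get(doc, 0.0))
    # Sort by descending similarity, then by document id for a stable order.
    return sorted(best, key=lambda d: (-best[d], d))

# Hypothetical <document, instance, similarity> triples.
annotations = [
    ("d1", "carton", 0.9),
    ("d2", "carton", 0.7),
    ("d3", "ink", 0.8),
    ("d2", "label", 0.95),
    ("d4", "label", 0.3),
    ("d5", "label", 0.6),
]
L2 = annotation_candidates({"carton", "label"}, annotations, acted_docs={"d1"})
```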
4.2.3. Group Recommendation Based on Knowledge Graph
(1) Computing User Interest Similarity. In the system, the user interest similarity sim is measured as the cosine similarity between two user interest vectors q and d:
$$\mathrm{sim}(q, d) = \frac{\sum_{i=1}^{n} w_{i,q}\, w_{i,d}}{\sqrt{\sum_{i=1}^{n} w_{i,q}^{2}}\,\sqrt{\sum_{i=1}^{n} w_{i,d}^{2}}} \tag{12}$$
where $w_{i,q}$ represents the weight of interest keyword i for user q, $w_{i,d}$ represents the weight of interest keyword i for user d, and n is the number of keywords in the user interest set. Computing the cosine similarity for every pair of user interest vectors yields the user interest similarity matrix (see Table 1). The similar-interest user set U2 is obtained from the user interest similarity.
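The cosine similarity between two user interest vectors can be sketched as follows, with each vector stored as a keyword-to-weight mapping. The function name, the sparse dictionary representation, and the sample weights are assumptions.

```python
import math

def interest_similarity(q, d):
    """Cosine similarity of two sparse interest vectors (keyword -> weight)."""
    dot = sum(weight * d.get(keyword, 0.0) for keyword, weight in q.items())
    norm_q = math.sqrt(sum(w * w for w in q.values()))
    norm_d = math.sqrt(sum(w * w for w in d.values()))
    if norm_q == 0.0 or norm_d == 0.0:
        return 0.0  # a user with no interests is similar to no one
    return dot / (norm_q * norm_d)

# Hypothetical interest vectors.
alice = {"carton": 3.0, "ink": 4.0}
bob = {"carton": 3.0, "ink": 4.0}
carol = {"patent": 2.0}
```

Identical vectors give a similarity of 1, and vectors with no shared keywords give 0.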
(2) Predicting User Document Behavior Evaluation. In the system, each user’s behavior toward a document is evaluated. Six behavioral characteristics are selected to express the user’s interest in a document, and the highest behavioral score is used in the prediction. The implicit scoring principle [48] is used to reduce the degree of user participation required: downloading the document scores 1 point, sharing it 0.8, commenting on it 0.6, collecting it 0.4, clicking it 0.2, and only browsing it 0.1; a document that the user has not browsed is marked “/”. All users and documents together form a behavior evaluation matrix (see Table 2).
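The implicit scoring rule above maps directly to a lookup table. The sketch below builds a sparse behavior evaluation matrix from a log of (user, document, behavior) records; the behavior labels, the log format, and the use of a missing entry in place of “/” are assumptions.

```python
# Scores taken from the implicit scoring rule described above.
BEHAVIOR_SCORES = {
    "download": 1.0,
    "share": 0.8,
    "comment": 0.6,
    "collect": 0.4,
    "click": 0.2,
    "browse": 0.1,
}

def behavior_score(behaviors):
    """Highest score among the observed behaviors, or None (the "/" mark) if none."""
    scores = [BEHAVIOR_SCORES[b] for b in behaviors]
    return max(scores) if scores else None

def build_matrix(log):
    """Turn (user, doc, behavior) records into {user: {doc: score}}."""
    observed = {}
    for user, doc, behavior in log:
        observed.setdefault(user, {}).setdefault(doc, set()).add(behavior)
    return {
        user: {doc: behavior_score(bs) for doc, bs in docs.items()}
        for user, docs in observed.items()
    }

# Hypothetical access log.
log = [("u1", "d1", "click"), ("u1", "d1", "share"), ("u2", "d1", "browse")]
matrix = build_matrix(log)
```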
Based on the K users whose interests are most similar to the target user’s, the system finds documents that these K users like but the target user has not touched, predicts the target user’s interest in each document using equation (13), sorts the documents by degree of interest, and finally obtains the document set L3:
$$P(u, i) = \sum_{v \in S(u,K) \cap N(i)} \mathrm{sim}(u, v)\, r_{vi} \tag{13}$$
where sim(u, v) represents the interest similarity of two users u and v, $r_{vi}$ represents the interest weight of user v in document i, S(u,K) represents the K users whose interests are most similar to those of user u, and N(i) represents the set of users who have acted on document i.
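The prediction step can be sketched as user-based collaborative filtering. The sketch assumes the common form in which the predicted interest is the similarity-weighted sum of the neighbours' behavior scores over the users in S(u,K) who acted on the document; the function names, tie-breaking by user id, and sample data are hypothetical.

```python
def top_k_similar(u, sim, k):
    """S(u, K): the K users most similar to u (ties broken by user id)."""
    others = [v for v in sim[u] if v != u]
    return sorted(others, key=lambda v: (-sim[u][v], v))[:k]

def predict_interest(u, i, sim, ratings, k):
    """Assumed P(u, i): sum of sim(u, v) * r_vi over v in S(u, K) that acted on i."""
    total = 0.0
    for v in top_k_similar(u, sim, k):
        r_vi = ratings.get(v, {}).get(i)  # None means v has not acted on i
        if r_vi is not None:
            total += sim[u][v] * r_vi
    return total

def recommend(u, sim, ratings, k, n):
    """L3: up to n documents unseen by u, sorted by predicted interest."""
    seen = set(ratings.get(u, {}))
    candidates = set()
    for v in top_k_similar(u, sim, k):
        candidates |= set(ratings.get(v, {})) - seen
    scored = {d: predict_interest(u, d, sim, ratings, k) for d in candidates}
    return sorted(scored, key=lambda d: (-scored[d], d))[:n]

# Hypothetical similarities and behavior scores.
sim = {"u1": {"u2": 0.9, "u3": 0.5, "u4": 0.1}}
ratings = {
    "u1": {"d1": 1.0},
    "u2": {"d1": 0.8, "d2": 0.6},
    "u3": {"d2": 1.0, "d3": 0.2},
    "u4": {"d4": 1.0},
}
L3 = recommend("u1", sim, ratings, k=2, n=5)
```

With K = 2, user u4 is outside the neighbourhood, so d4 never becomes a candidate even though u4 scored it highly.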
(3) Performing Group Recommendation. There is a consensus function [49] between the user group and the document in the system. The consensus function quantifies the utility of a candidate item to the group from two aspects: the degree of preference of the entire group for the item and the degree of preference difference among group members. The group recommendation is formalized in this paper as follows:
(1) Group prediction score: the predicted score GP(G,i) of group G for document i is obtained by fusing the predicted scores P(u,i) of the users in the group.
(2) Group divergence: the degree of divergence dis(G,i) of the group for the document indicates the degree of difference in the prediction scores of the users of group G for document i; it is computed with respect to mean(G,i), the average of the prediction scores of the users of group G for document i.
(3) Consensus function: the consensus function combines the group prediction score and the group divergence.
Here, w1 and w2 represent the weights of the group prediction score and the group divergence in the consensus function, respectively, with w1 + w2 = 1. The procedure is given in Algorithm 1.
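To make the roles of the group prediction score, the group divergence, and the two weights concrete, the following sketch shows one possible instantiation. It is not the paper's definition: averaging as the fusion strategy, variance as the divergence, the form w1 * GP + w2 * (1 - dis), the default weights, and the assumption that scores lie in [0, 1] are all illustrative choices.

```python
def group_prediction(scores):
    """Assumed fusion strategy for GP(G, i): average of the members' P(u, i)."""
    return sum(scores) / len(scores)

def group_divergence(scores):
    """Assumed dis(G, i): variance of the members' scores around mean(G, i)."""
    mean = sum(scores) / len(scores)
    return sum((s - mean) ** 2 for s in scores) / len(scores)

def consensus(scores, w1=0.8, w2=0.2):
    """Assumed consensus: w1 * GP(G, i) + w2 * (1 - dis(G, i)), with w1 + w2 = 1."""
    return w1 * group_prediction(scores) + w2 * (1.0 - group_divergence(scores))

# Two hypothetical groups with the same average score for a document.
agreeing = [0.5, 0.5]
split = [1.0, 0.0]
```

Under these assumptions, a document on which the members agree ranks above a document with the same average score on which they are split, which is the behavior the consensus function is meant to capture.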
5. Experiment and Evaluation
To verify the feasibility of the proposed system and its services, we conducted experiments on the big data knowledge graph platform of the China packaging industry (URL: http://58.20.192.198:8090). The data in the experiments were collected from government information, business information, industry information, academic papers, global packaging patents, and other data resources, which add up to more than 3 million articles. We chose 28150 document resources for the experiment. The offline experiments targeted 10 users, whose web access logs were preprocessed.
5.1. Experiment Environment Configuration
The experiment environment configuration is shown in Table 3. We built a packaging knowledge base covering information, policies, conferences, standards, papers, patents, companies, products, universities, institutions, and experts. The instances in the knowledge graph are stored in MongoDB as key-value pairs. The semantic annotation library is stored in ES as triples. Network document resources are also stored in ES.
5.2. Constructing Packaging Knowledge Graph
As shown in Figure 4, a packaging knowledge graph [35] is constructed. The knowledge graph includes basic concepts such as “packaging knowledge point,” “company,” “product,” “organization,” “patent,” “paper,” and “event.” Major relations include “has product,” “upstream,” “downstream,” “has patent,” and “executive.”

Figure 4: The packaging knowledge graph, panels (a) and (b).
5.3. Algorithm Evaluation
In this paper, we evaluate the algorithms by the precision rate, recall rate, F-measure value, and real-time interaction response time T.
The precision rate calculation formula is as follows:
The recall rate calculation formula is as follows:
The F-measure value calculation formula is as follows:
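As a sketch of how these three metrics are typically computed, the following function assumes the standard set-based definitions: precision is the share of recommended documents that are relevant, recall is the share of relevant documents that are recommended, and the F-measure is their harmonic mean. The function name and the sample data are hypothetical.

```python
def evaluate(recommended, relevant):
    """Return (precision, recall, F-measure) for one user's recommendation list."""
    recommended, relevant = set(recommended), set(relevant)
    hits = len(recommended & relevant)
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)

# Hypothetical example: four recommended documents, three relevant ones.
p, r, f = evaluate(["d1", "d2", "d3", "d4"], {"d2", "d4", "d5"})
```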
Through the system implementation, part of the user interest similarity matrix and part of the user document behavior evaluation matrix are listed in Table 1 and Table 2, respectively. We compare the recommendation results of traditional content-based recommendation, traditional collaborative filtering recommendation, personalized collaborative filtering recommendation based on the knowledge graph [50], and the proposed group recommendation based on the knowledge graph and LSTM. The experimental results are shown in Tables 4 and 5.
From the results in Table 4, we can see that the proposed personalized group recommendation algorithm has the highest precision rate, recall rate, and F-measure value among the compared methods, indicating that the use of the domain knowledge graph and LSTM helps to enhance the semantic information of the data and improve the quality of recommendation and that the group recommendation system effectively alleviates the cold start problem. These results suggest that the proposed personalized group recommendation system has higher accuracy as well as high availability in real systems.
From the results in Table 5, we can see that the proposed personalized group recommendation algorithm achieves the most timely real-time interaction response among the compared methods, indicating that the use of edge computing helps to enhance the quality of recommendation, because the end-side results can be presented in a more real-time and strategic manner. These results suggest that the proposed personalized group recommendation system provides better interactivity in real systems.
6. Conclusion
With the explosive growth of information on the Internet, the mining of massive multisource heterogeneous data is a key issue in the recommendation system. The emergence of the knowledge graph and deep learning brings a new opportunity for the integrated processing of multisource heterogeneous data in the recommendation system. Therefore, the recommendation system based on the knowledge graph and LSTM in edge computing has become a new research field. In this paper, based on the packaging industry knowledge graph, the authors provide a technical implementation scheme for the group recommendation system of network document resources in edge computing by combining the content-based and collaborative filtering-based recommendation algorithms. The proposed method takes into account the personalized demands of both the group and the single user. The experimental results show that the proposed system improves the quality of recommendation.
Future research includes applying BiLSTM to mine and learn text feature vectors, optimizing the prediction algorithm based on the deep learning model at the end-side in edge computing, improving the group recommendation algorithm, analyzing personalized recommendation reviews based on emotion weight, conducting experiments on a packaging evaluation corpus, and constructing a complete packaging big data recommendation system.
Data Availability
The data used to support the findings of this study are not available because the data interface cannot currently provide external access.
Disclosure
The work of packaging knowledge graph construction is published in Cyberspace Data and Intelligence and Cyber-Living Syndrome and Health. The authors have extended this research, and modified and optimized the algorithms.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
This work was supported in part by the National Key R&D Program Funded Project of China under grant no. 2018YFB1700200, in part by the Hunan Provincial Key Research and Development Project of China under grant no. 2019GK2133, in part by the Scientific Research Project of Hunan Provincial Department of Education under grant nos. 19B147, 18K077, and 17C0479, in part by the Project of China Packaging Federation under funding support no. 17ZBLWT001KT010, and in part by the Intelligent Information Perception and Processing Technology Hunan Province Key Laboratory under grant no. 2017KF07.