Abstract
In recent years, with the continuous development of Internet platforms, China has gradually entered an era of nationwide online shopping, and more and more e-commerce platforms and stores have adopted intelligent recommendation systems to increase transaction rates. It is not easy for consumers to filter the products they want out of a large amount of information, and intelligent recommendation systems make it much more convenient to find personalized products that match their own characteristics. However, the algorithms used in traditional recommendation technology focus on the single-machine environment and do not consider the performance of the recommendation method when distributed parallel processing is required in the big data environment, so they cannot meet the personalized needs of users in that environment. To address the new requirements for e-commerce intelligent recommendation technology in the big data environment, this paper uses big data processing technology based on cloud computing and focuses on the implementation of e-commerce intelligent recommendation algorithms and on a comprehensive evaluation method for recommendation systems in the big data environment. A prototype personalized intelligent recommendation system based on cloud computing has been developed. It helps meet the needs of personalized e-commerce recommendation in the big data environment, improves the effectiveness, scale, and real-time performance of the recommendation system, and raises the level of personalized precision marketing, which is of theoretical significance and economic value.
1. Introduction
At present, information on the Internet is growing at an alarming rate. Users who look for information online do not feel that information is insufficient; rather, they feel that they are drowning in a vast sea of it. Although users can get some help from search engines, directories, manual editing, and other tools, services dedicated to collecting, collating, comparing, and flagging information for individual enterprises or people still rely mainly on a large amount of manual work. With search engines, for example, people have to search for the information they need again and again, and the results are often out of date and fail to meet their needs [1].
Internet-backed e-commerce faces this problem in particular. E-commerce companies are in unprecedented contact with users through the Internet, and all marketing strategies are closely focused on the needs of users. A company that is first to grasp rich user information and understand user needs can accurately follow market changes and take the lead in competition [2]. In addition, because e-commerce sites are built on the Internet and benefit from the convenience of obtaining information through it, each site can provide very personalized, transparent, and user-friendly recommendations [3]. For these two reasons, personalized service in e-commerce is receiving more and more attention. Personalized service requires all-round contact with customers: obtaining complete and continuous customer information and linking it together to implement cross-marketing, one-to-one marketing, customized customer service, and timely customer care, so that customers feel valued and customer satisfaction and support improve. At the same time, enterprises can maximize customer lifetime value and achieve a win-win situation for enterprises and customers. The e-commerce recommendation system analyzes the user’s purchase habits, provides valuable product recommendations according to user needs, and helps users find the goods they need and thus complete the purchase process. In this way, the recommendation system not only provides users with personalized recommendation services; merchants can also use it to establish long-term and stable relationships with users, so as to effectively retain users and rebuild customer relationships [4].
The search function of a search engine reduces the amount of data that users have to filter, improves the practical value of the data, and thus reduces the user’s filtering workload [5]. However, searching requires the user to type a keyword for the item sought; if the user cannot accurately enter a keyword for the item, the search engine cannot provide effective help [6]. In addition, the results returned by a keyword search are limited to the specific information determined by the keyword, and the potential needs of users cannot be mined from the characteristics of their own behavior. The emergence of the recommendation system has changed this situation. The recommendation system builds a personalized preference model for each user by collecting the user’s historical behavior, and during shopping it combines this model with the user’s current needs to predict the products the user may be interested in. This prediction not only solves the problem that the amount of data the user has to screen is too large, but also provides users with more feasible choices, so that they have a good consumer experience in e-commerce activities [7].
In academia, most traditional recommendation techniques pursue only the final effect of the algorithm, without considering its cost or how long it will run [8]. Moreover, the algorithms used in traditional recommendation technology focus on the stand-alone environment and do not consider the performance of the recommendation method when distributed parallel processing is required in the big data environment, so they cannot meet the individual needs of users in that environment [9]. In view of the new requirements for e-commerce intelligent recommendation technology in the big data environment, this paper uses big data processing technology based on cloud computing to study the implementation of e-commerce intelligent recommendation algorithms and a comprehensive evaluation method for recommendation systems, and develops a prototype personalized intelligent recommendation system based on cloud computing. The system is intended to meet the needs of personalized e-commerce recommendation in the big data environment and to improve the effectiveness, scale, and real-time performance of personalized intelligent recommendation. Improving the level of personalized precision marketing has important theoretical significance and economic value [10].
2. State of the Art
With the popularization of the Internet and the development of e-commerce, the commodity recommendation system has gradually become an important research topic in e-commerce information technology and has received more and more attention from researchers. At present, almost all large e-commerce systems, such as Amazon, CDNOW, eBay, and the Dangdang online bookstore, use commodity recommendation systems of various forms and to varying degrees [11].
The key to building a product recommendation system is to build a user model, and building a user model requires choosing the recommendation algorithm [12]. To generate reasonable recommendations while meeting the real-time requirements of the recommendation system and the application requirements of different fields, researchers have proposed a variety of recommendation algorithms, such as collaborative filtering algorithms, Bayesian network technology, clustering technology, association rule technology, and graph-based Horting graph technology [13].
In current recommendation system research, the mainstream is the improvement of recommendation algorithms. Research mainly addresses three types of algorithms: (1) collaborative filtering algorithms; (2) content-based algorithms; (3) hybrid algorithms. Its hotspots are the accuracy of recommendation results, data problems such as sparsity, and algorithm complexity that makes real-time recommendation difficult. Existing recommendation systems basically consider only a single recommendation algorithm, rarely consider the application setting of a specific recommendation algorithm, and rarely discuss the impact of user feedback on improving the algorithm. Most research on the “recommendation system” likewise focuses on improving the recommendation algorithm and rarely discusses the design and implementation of a recommendation system from the perspective of the system.
Tapestry is the first recommendation system based on collaborative filtering; in it, the target user needs to explicitly identify other users whose behavior is similar to their own. GroupLens is an automated collaborative filtering recommendation system based on user ratings for recommending movies and news [14]. The Ringo recommendation system and the Video Recommender system recommend music and movies by e-mail, respectively. Breese et al. conducted an in-depth analysis of various collaborative filtering recommendation algorithms and their improvements [15].
User-based collaborative filtering generates the final recommendation from the user’s nearest neighbors, whereas item-based collaborative filtering first calculates the correlation between items and then predicts the user’s rating of an unrated item from the user’s ratings of related items [16].
Bayesian network technology uses a training set to create a model represented as a decision tree, in which nodes and edges represent user information. The resulting model is very small, so applying it is very fast. This method is suitable for situations in which the user’s interests and hobbies change slowly [17].
Clustering technology assigns users with similar interests to the same cluster and, after clustering, predicts the target user’s evaluation of a product from the ratings given to that product by the other users in the cluster. Since the clustering process takes place offline, the online recommendation algorithm generates recommendations relatively quickly [18].
Association rule technology has been widely used in the retail industry, where association rule mining can find correlations between different goods in the sales process [19]. A recommendation algorithm based on association rules generates recommendations for the user from the mined association rule model and the user’s current purchase behavior. The association rule model can be generated offline, so that the real-time requirements of the recommendation system can be effectively met. In practice, the collaborative filtering algorithm is the most widely used [20].
3. Methodology
3.1. Overview of E-Commerce Intelligent Recommendation System
The intelligent recommendation system mainly refers to the analysis of consumer behavior characteristics by means of collection, statistics, and analysis, so that the recommendation algorithm can study consumers’ purchasing behavior preferences. The biggest feature of the intelligent recommendation system is that it can update the behavior data of consumers in a timely manner and actively push product information that meets their needs to consumers [21].
The core purpose of the application of the e-commerce intelligent recommendation system is to accurately tap the personalized needs of consumers, establish a personalized marketing strategy based on consumer consumption characteristics, and then meet consumers’ behavioral preferences. The application of the intelligent recommendation system has changed the overall architecture of the e-commerce platform, the exposure of its product information has been further increased, and more online stores have been visited and browsed, which has fundamentally improved the marketing capabilities of the e-commerce platform [22].
Whether an e-commerce platform’s intelligent recommendation system meets the needs of consumers depends on the recommendation technology it uses. The recommendation system is at the heart of the entire e-commerce platform and is a key technology on which the platform must rely to function properly [23].
3.2. Traditional Collaborative Filtering Recommendation Algorithms
3.2.1. Basic Principles
Collaborative filtering algorithms can be divided into User-based Collaborative Filtering (UserCF) and Item-based Collaborative Filtering (ItemCF) [110]. The basic principles of the two are the same. Take the user-based collaborative filtering algorithm as an example: its main idea is to use the behavior or opinions of a known user base to predict the most likely preferences of the current user. First, given the ID of the current user and a rating dataset, the algorithm finds in the dataset the users who have rated the same items similarly to the current user; these users are called the nearest neighbors. Then, for each item on which the current user has no recorded behavior, the current user’s predicted score is calculated from the scores of the nearest neighbors, the predicted values are sorted from high to low, and the Top-N items are recommended to the current user. The algorithm assumes that the user’s interests do not change over time; that is, the user’s previous preferences remain the same for a period of time [24].
E-commerce systems use collaborative filtering recommendations widely. For example, when a customer browses a Kingston 8 GB USB flash drive, the system analyzes the behavior of other customers and shows “customers who viewed the product also browsed…” on the page [25].
The choice of the nearest neighbor is a core part of the UserCF recommendation algorithm, and the effect and efficiency of the neighbor user similarity calculation largely determine the effect and efficiency of the UserCF algorithm. The more commonly used calculation methods are cosine similarity, modified cosine similarity, and Pearson correlation coefficient, which will be described in detail below [26].
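As an illustration of two of the similarity measures named above, the following is a minimal Python sketch rather than the paper's implementation. It assumes that each user's ratings are stored as a dictionary from item to score, that cosine similarity treats unrated items as 0, and that the Pearson correlation is computed only over items rated by both users. The function names and the sample users are hypothetical.

```python
from math import sqrt

def cosine_similarity(u, v):
    """Cosine of two rating dicts; items a user has not rated count as 0."""
    dot = sum(u[i] * v[i] for i in u.keys() & v.keys())
    norm_u = sqrt(sum(r * r for r in u.values()))
    norm_v = sqrt(sum(r * r for r in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def pearson_similarity(u, v):
    """Pearson correlation over the items rated by both users."""
    common = u.keys() & v.keys()
    if len(common) < 2:
        return 0.0  # not enough co-rated items to measure correlation
    mean_u = sum(u[i] for i in common) / len(common)
    mean_v = sum(v[i] for i in common) / len(common)
    cov = sum((u[i] - mean_u) * (v[i] - mean_v) for i in common)
    var_u = sum((u[i] - mean_u) ** 2 for i in common)
    var_v = sum((v[i] - mean_v) ** 2 for i in common)
    return cov / sqrt(var_u * var_v) if var_u and var_v else 0.0

# Hypothetical ratings on a 1-5 scale.
alice = {"A": 5, "B": 3, "C": 4}
bob = {"A": 4, "B": 2, "C": 3}    # same taste as alice, one point lower
carol = {"A": 1, "B": 5}          # opposite taste on the co-rated items

print(pearson_similarity(alice, bob), pearson_similarity(alice, carol))
```

Pearson correlation removes each user's average rating level, which is why alice and bob, whose scores differ by a constant, are treated as perfectly similar.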
3.2.2. Business Algorithms
User Collaborative Filtering (User CF) is a collaborative filtering algorithm based on the user. “When a user A needs a personalized recommendation, you can first find other users who have similar interests to him and then recommend to A those items that the user likes but that user A has not heard of.”
The basic steps of the User CF recommendation algorithm are as follows. (Algorithms 1–3)
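The three basic steps (find the nearest neighbors, predict scores for unrated items, return the Top-N) can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the paper's Algorithms 1–3: it uses cosine similarity, a similarity-weighted average for prediction, and hypothetical parameter names `k` and `top_n`.

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity of two rating dicts (unrated items count as 0)."""
    dot = sum(u[i] * v[i] for i in u.keys() & v.keys())
    nu = sqrt(sum(r * r for r in u.values()))
    nv = sqrt(sum(r * r for r in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def user_cf_recommend(ratings, target, k=2, top_n=2):
    # Step 1: similarity between the target and every other user; keep the k nearest.
    sims = [(cosine(ratings[target], r), u) for u, r in ratings.items() if u != target]
    neighbors = sorted(sims, reverse=True)[:k]
    # Step 2: predict a score for each item the target has not rated,
    # as the similarity-weighted average of the neighbors' scores.
    num, den = {}, {}
    for sim, u in neighbors:
        if sim <= 0:
            continue
        for item, score in ratings[u].items():
            if item not in ratings[target]:
                num[item] = num.get(item, 0.0) + sim * score
                den[item] = den.get(item, 0.0) + sim
    predicted = {item: num[item] / den[item] for item in num}
    # Step 3: sort predictions from high to low and return the Top-N.
    return sorted(predicted.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]

# Hypothetical rating data.
ratings = {
    "u1": {"A": 5, "B": 4},
    "u2": {"A": 5, "B": 4, "C": 5, "D": 1},
    "u3": {"A": 1, "B": 1, "D": 5},
}
recommended = user_cf_recommend(ratings, "u1")
print(recommended)
```

Here u2 is much closer to u1 than u3 is, so u2's high score for item C dominates and C is ranked ahead of D.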
3.3. Collaborative Filter Recommendation Algorithm Based on Cloud Computing
Because the rating data of a traditional recommendation system is sparse, the similarity calculation can rely on only a small number of commonly rated items; the cloud model, with its conversion between concepts and quantitative values, has been proposed to address this problem. The overall characteristics of the concept expressed by a cloud model are reflected by the numerical characteristics of the cloud, which characterizes a concept as a whole with three values: expectation, entropy, and hyperentropy. This paper improves the UserCF algorithm using MapReduce; the result is referred to as the MR-UserCF algorithm. An analysis of the UserCF algorithm shows that its core consists of two steps. The first step calculates the similarity between users from the rating matrix and finds each user’s nearest neighbors. The second step calculates a predicted score for the target user’s unrated items from the nearest neighbors. Because the second step builds on the first, the two steps are sequential, serial tasks. When the task is decomposed according to the design principle of recommendation algorithms based on cloud computing, it turns out that, in the first step, the similarity calculation for any two users is an independent, parallel process that can be completed as one MapReduce job. When predicting a user’s ratings of unrated items, the prediction for each user is likewise a separate, parallel process that is calculated by another MapReduce job. The two MapReduce jobs run serially.
On further analysis, in the first MapReduce process, which calculates user similarity, the input of the algorithm is <null, (user, item, rating)> and the output is <(user 1, user 2), similarity>. Working backwards from this output, it is first necessary to obtain the scores that user 1 and user 2 gave to all items, that is, a set of user-item mappings of the form <(user 1, user 2), (user 1’s score for item 1, user 2’s score for item 1), …>. Therefore, the MapReduce job that calculates user similarity needs to be decomposed into two MapReduce subtasks.
The specific algorithm flow is shown in Figure 1.

The MR-UserCF recommendation algorithm runs the above three MapReduce tasks in sequence, and each task is a parallelizable process. Since the mathematical principle of the MR-UserCF recommendation algorithm is unchanged, its recommendation results are generally the same as those of the traditional UserCF algorithm. Its advantage is that, for large-scale data sets, distributed parallel computing provides much greater processing power, which improves the execution efficiency of the algorithm.
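The three MapReduce tasks can be imitated on a single machine to make the data flow concrete. The Python sketch below is an assumption-laden illustration and not the paper's implementation: `map_reduce` is a hypothetical in-memory stand-in for a real MapReduce runtime, similarity is taken to be the cosine over co-rated items, and the third job reads the similarity table as side data.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt

def map_reduce(records, mapper, reducer):
    """One in-memory MapReduce round: map, group by key, reduce."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return [out for key in sorted(groups) for out in reducer(key, groups[key])]

# Job 1: <null, (user, item, rating)> -> <(user1, user2), (rating1, rating2)>
def job1_map(rec):
    user, item, rating = rec
    yield item, (user, rating)

def job1_reduce(item, user_ratings):
    for (u, ru), (v, rv) in combinations(sorted(user_ratings), 2):
        yield (u, v), (ru, rv)

# Job 2: <(user1, user2), co-ratings> -> <(user1, user2), similarity>
def job2_map(rec):
    yield rec  # identity map: the key is already the user pair

def job2_reduce(pair, scores):
    dot = sum(a * b for a, b in scores)
    na = sqrt(sum(a * a for a, _ in scores))
    nb = sqrt(sum(b * b for _, b in scores))
    yield pair, (dot / (na * nb) if na and nb else 0.0)

# Job 3: predict scores for unrated items, one reducer call per target user.
def make_job3(sims, rated):
    def job3_map(rec):
        user, item, rating = rec
        for (u, v), s in sims.items():
            other = v if u == user else u if v == user else None
            if other is not None and s > 0 and item not in rated[other]:
                yield other, (item, s, rating)
    def job3_reduce(user, triples):
        num, den = defaultdict(float), defaultdict(float)
        for item, s, rating in triples:
            num[item] += s * rating
            den[item] += s
        yield user, sorted(((num[i] / den[i], i) for i in num), reverse=True)
    return job3_map, job3_reduce

# Hypothetical ratings: u1 has not rated item C.
ratings = [("u1", "A", 5), ("u1", "B", 4), ("u2", "A", 5), ("u2", "B", 4),
           ("u2", "C", 5), ("u3", "A", 1), ("u3", "B", 5), ("u3", "C", 1)]
pairs = map_reduce(ratings, job1_map, job1_reduce)
sims = dict(map_reduce(pairs, job2_map, job2_reduce))
rated = defaultdict(set)
for user, item, _ in ratings:
    rated[user].add(item)
predictions = dict(map_reduce(ratings, *make_job3(sims, rated)))
print(sims, predictions)
```

Each reducer call touches only one key (an item, a user pair, or a target user), which is what allows the corresponding work to be spread across nodes.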
3.4. Content Filter Recommendation Algorithm
From the previous section, it can be seen that the collaborative filter recommendation is irrelevant to the specific recommendation object; that is, the application of the collaborative filter recommendation does not need to know any information about the recommended object. This avoids the cost of providing detailed and real-time updated product description information to the recommendation system. However, if the attribute information for these products is already available, it is also a good idea to take advantage of content-filtered recommendations.
3.5. Traditional Content Filtering Recommendation Algorithm
The CBR algorithm recommends objects with similar attributes based on the user’s selection, as shown in Figure 2. A user watches movie A, and the recommendation system can recommend movie B to that user based on the attributes of movie A (Hollywood, action movie), while the domestic romance movie C will not be recommended.

In film recommendation, the system first analyzes what the films that the user has seen and rated highly have in common (actors, directors, genres, etc.) and then recommends the films most similar in these attributes, which may share a star, share a director, or be similar in a more comprehensive way. The recommendation process based on content filtering is shown in Figure 3. It is mainly divided into two steps. First, from the user’s consumption history, ratings, and other business information, the content attribute features of the recommended objects are extracted and the user’s consumption preference model is mined. Then, the similarity between the user’s consumption preference model and the commodity resource representation model is calculated, and the most similar products among the candidate objects are recommended to the user.

For the commodity resource representation model and the user consumption preference model, note that e-commerce involves many commodity types and that each category has its own characteristic attributes. All categories of commodities are therefore represented as a set $\{R_t\}$, where $R_t$ is the $t$-class commodity, represented by pairs $\langle d_i, V_i \rangle$: $d_i$ is the $i$th attribute of the $t$-class commodity, and $V_i$ is the set of values of attribute $d_i$. The user’s preferences over all categories of goods are expressed as a set $\{U_t\}$, where $U_t$ is the user’s preference on the $t$-class products. $U_t$ is represented by triples $\langle a_i, v_i, w_i \rangle$, $i = 1, \dots, m$, where $a_i$ is the $i$th attribute of the $t$-class goods. If the value of attribute $a_i$ is numeric, $v_i$ is the value that the user prefers on the attribute; if the value of attribute $a_i$ is nonnumeric, $v_i$ is the set of attribute values that the user prefers on the attribute. $w_i$ is the user’s weight on attribute $a_i$, which describes the extent to which the user pays attention to the attribute and satisfies $\sum_{i=1}^{m} w_i = 1$; $m$ is the number of attributes of the $t$-class commodity.
3.5.1. Recommendation Algorithm Based on User Consumption Preferences
Calculate the similarity between the user consumption preferences, given as (attribute, preferred value, weight) triples $\langle a_i, v_i, w_i \rangle$, and the commodity, given as (attribute, value set) pairs $\langle d_i, V_i \rangle$; match consumption preferences with commodities; rank the commodities by degree of matching to generate a recommendation list; and recommend the top-N products to the user. The recommendation process is as follows:
Step 1: The attributes, attribute values, and attribute weights of the t-class commodities are obtained from the current user consumption preference model U and the commodity model R, respectively.
Step 2: For each attribute $a_i$, the similarity between the user’s consumption preference and the commodity on the $i$th attribute is computed, with one measure used when the value of $a_i$ is numeric and another when it is nonnumeric. The per-attribute similarities are then combined into the overall similarity between the user’s consumption preferences and the product.
Step 3: Sort the products from high to low according to the similarity, and return the top-N product list to the user.
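The matching procedure can be made concrete with a small Python sketch. The per-attribute formulas are not given in the text above, so everything specific here is an assumption made only for illustration: numeric similarity is taken as one minus the absolute difference scaled by a hypothetical value range, nonnumeric similarity is 1 when the product's value lies in the user's preferred set and 0 otherwise, and the overall similarity is the weight-weighted sum of the per-attribute values.

```python
def attribute_similarity(preferred, actual, value_range=None):
    """Similarity on one attribute (assumed formulas, see lead-in)."""
    if isinstance(preferred, (set, frozenset)):
        # Nonnumeric attribute: does the product's value fall in the preferred set?
        return 1.0 if actual in preferred else 0.0
    # Numeric attribute: closeness to the preferred value, clipped at 0.
    return max(0.0, 1.0 - abs(preferred - actual) / value_range)

def match(preference, product, ranges):
    """Overall similarity as a weighted sum over attributes (assumed)."""
    return sum(
        weight * attribute_similarity(value, product[attr], ranges.get(attr))
        for attr, (value, weight) in preference.items()
    )

def recommend(preference, products, ranges, top_n):
    """Step 3: sort by similarity, high to low, and return the top-N names."""
    scored = [(match(preference, attrs, ranges), name) for name, attrs in products.items()]
    return [name for _, name in sorted(scored, key=lambda s: (-s[0], s[1]))[:top_n]]

# Hypothetical preference model: attribute -> (preferred value, weight); weights sum to 1.
preference = {"genre": ({"action"}, 0.6), "price": (20.0, 0.4)}
ranges = {"price": 50.0}  # hypothetical scale for the numeric attribute
products = {
    "B": {"genre": "action", "price": 25.0},
    "C": {"genre": "romance", "price": 20.0},
    "D": {"genre": "action", "price": 60.0},
}
top = recommend(preference, products, ranges, top_n=2)
print(top)
```

With these assumed formulas, product B matches on genre and is close in price, so it ranks first; product C matches only on price and is left out of the top two.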
3.6. Cloud-Based Content Filtering Recommendation Algorithm
Similar to the analysis of the MR-UserCF algorithm, the two main steps of the CBR algorithm are to calculate the user’s consumption preference model from the user’s consumption data and to calculate the similarity with other goods according to that model. These two steps are sequential, serial steps. Under the MapReduce framework, the MR-CBR algorithm first calculates the user’s consumption preferences; the calculation of the consumption preference model for any one user is an independent, parallel process, which can be completed as one MapReduce job. In calculating the similarity between the user consumption preference model and the commodity representation model, the similarity calculation for each user is also an independent, parallel process, which is performed by another MapReduce job. The two MapReduce jobs run serially.
On further analysis, in the first MapReduce process, which calculates the user’s consumption preference model, the input of the algorithm is <null, (user, item)> and the output is <user, ((a, v, w)…)>, where a is an attribute, v is the preferred value, and w is the weight. Working backwards in the same way, to obtain this output we first need to convert each item into key-value pairs with attribute a as the key and the attribute value as the value and then calculate the weight according to the algorithm of Section 3.3, so another MapReduce job is required for data conversion. The MR-CBR algorithm flow is shown in Figure 4.
Step 1: The first MapReduce task, data source transformation, collects the consumption history of each user. The Map stage receives the raw input <null, (User ID, Item ID)> and converts it into key-value pairs with User ID as the key and Item ID as the value. The Reduce function combines the items of the same user and outputs key-value pairs with User ID as the key and list(Item ID) as the value.
Step 2: The second MapReduce task calculates the user’s consumption preference model. The Map stage takes the key-value pairs output by the Reduce stage of Step 1 and converts each Item ID into several key-value pairs with attribute a as the key and the attribute value as the value, outputting key-value pairs of the form <User ID, list(a, v)>. The Reduce stage processes list(a, v): it calculates the weights according to the type of each attribute, outputs the user consumption preference model in the form <User ID, list(a, v, w)>, and persists the model to a distributed file system.
Step 3: The third MapReduce task calculates the user’s similarity to all items and makes recommendations. The Map stage receives all items in the form <Item ID, list(a, v)>, reads the user consumption preference model persisted to the file system in Step 2, calculates the similarity between the user and each item according to the algorithm in Section 3.3.1, and outputs the similarity results as <User, list(Item, sim)>. The Reduce stage collects the results, sorts them by similarity, and returns a list of Top-N items as the recommended items.
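The three MapReduce tasks of this flow can be traced with a single-machine Python sketch. It is a hedged illustration, not the paper's code: `run` is a hypothetical in-memory stand-in for a MapReduce runtime, `CATALOG` is invented sample data with nonnumeric attributes only, and the weighting rule in `model_reduce` (share of the most frequent value, normalized to sum to 1) is an assumption chosen so that the pipeline runs end to end.

```python
from collections import Counter, defaultdict

CATALOG = {  # hypothetical item attributes
    "m1": {"genre": "action", "origin": "US"},
    "m2": {"genre": "action", "origin": "CN"},
    "m3": {"genre": "romance", "origin": "CN"},
    "m4": {"genre": "action", "origin": "US"},
}

def run(records, mapper, reducer):
    """One in-memory MapReduce round: map, group by key, reduce."""
    groups = defaultdict(list)
    for rec in records:
        for key, value in mapper(rec):
            groups[key].append(value)
    return [out for key in sorted(groups) for out in reducer(key, groups[key])]

# Step 1: <null, (user, item)> -> <user, list(item)>
def collect_map(rec):
    user, item = rec
    yield user, item

def collect_reduce(user, items):
    yield user, sorted(items)

# Step 2: <user, list(item)> -> <user, {attribute: (preferred values, weight)}>
def model_map(rec):
    user, items = rec
    for item in items:
        for attr, value in CATALOG[item].items():
            yield user, (attr, value)

def model_reduce(user, pairs):
    counts = defaultdict(Counter)
    for attr, value in pairs:
        counts[attr][value] += 1
    # Assumed weighting rule: how concentrated the history is on one value.
    raw = {a: c.most_common(1)[0][1] / sum(c.values()) for a, c in counts.items()}
    total = sum(raw.values())
    yield user, {a: (set(counts[a]), raw[a] / total) for a in counts}

# Step 3: <item, attributes> -> <user, Top-N items by similarity>
def make_recommend_job(models, history, top_n):
    def sim_map(rec):
        item, attrs = rec
        for user, model in models.items():
            if item not in history[user]:
                sim = sum(w for a, (vals, w) in model.items() if attrs.get(a) in vals)
                yield user, (sim, item)
    def sim_reduce(user, scored):
        ranked = sorted(scored, key=lambda s: (-s[0], s[1]))
        yield user, [item for _, item in ranked[:top_n]]
    return sim_map, sim_reduce

purchases = [("u1", "m1"), ("u1", "m2")]
history = dict(run(purchases, collect_map, collect_reduce))
models = dict(run(history.items(), model_map, model_reduce))
recs = dict(run(CATALOG.items(), *make_recommend_job(models, history, top_n=1)))
print(models, recs)
```

In this sample the user has only watched action films from two origins, so the assumed rule gives genre twice the weight of origin, and the unseen action film m4 is preferred over the romance film m3.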

3.7. Association Rule Recommendation Algorithm
3.7.1. Traditional Association Rules Recommendation Algorithm
A recommendation algorithm based on association rules is a technique for identifying association rules among items in large-scale transaction data. The following is an example of the association recommendation provided by Amazon’s site. Figure 5 shows the interface that an Amazon user sees when trying to purchase the Touch Screen 4 GB. The user selects Touch Screen 4G from the product catalog provided by Amazon to enter the purchase interface. There, the user can see the description of the product, customer reviews, and other information. Amazon also provides several modules that recommend other products to the user, one of which recommends products that are often purchased together. As shown in Figure 6, Amazon recommends a memory card to users who intend to purchase Touch Screen 4G, which is used to expand the memory of the Touch.


Since e-commerce sites are all built on the Internet and benefit from the convenience of obtaining information through it, each site can provide very personalized, transparent, and user-friendly recommendations. In recent years, knowledge discovery has received more and more attention in the field of artificial intelligence, and one of its most important research directions is the mining of association rules. Association rule mining is also known as “shopping basket” analysis, because its main research object is the e-commerce order database and its main purpose is to discover correlated combinations of transaction items in that database. The most commonly used core algorithm in association rule mining is the Apriori algorithm, whose core idea is to first find the frequent item sets of size K and then derive the frequent item sets of size K + 1 from them. The algorithm has two main steps:
(1) Generate all frequent K-item sets. First, scan the transaction database to obtain the 1-item sets, calculate the support of each item set (that is, the number of times the item set X appears in the transaction database), remove the item sets whose support is less than the preset minimum support threshold, and retain the frequent 1-item sets. Then, construct the 2-item sets, that is, the pairwise combinations of the frequent 1-item sets, calculate their support, and retain the item sets whose support is not less than the preset minimum support. The frequent K-item sets are mined in the same way.
(2) Generate trusted association rules from the frequent K-item sets. Calculate the confidence of each rule derived from the mined frequent item sets (that is, the ratio of the number of transactions in the transaction database D that contain both X and Y to the number of transactions in D that contain X). The minimum confidence threshold is the boundary that decides whether a generated rule is a trusted association rule, and the confidence value indicates the extent to which the rule can be relied upon.
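The two steps of the Apriori algorithm described above can be sketched in a few lines of Python. This is a minimal illustration rather than an optimized implementation; support is counted as a number of transactions, as in the text, and the shopping-basket data and function names are hypothetical.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Step (1): return {item set: support count} for all frequent item sets."""
    transactions = [frozenset(t) for t in transactions]
    current = {frozenset([i]) for t in transactions for i in t}  # candidate 1-item sets
    frequent, k = {}, 1
    while current:
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        k += 1
        # Join: unions of frequent (k-1)-item sets that have exactly k items.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune: every (k-1)-subset of a candidate must itself be frequent.
        current = {c for c in candidates
                   if all(frozenset(s) in level for s in combinations(c, k - 1))}
    return frequent

def rules(frequent, min_confidence):
    """Step (2): rules X -> Y with confidence = support(X and Y) / support(X)."""
    out = []
    for itemset, support in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in map(frozenset, combinations(itemset, r)):
                confidence = support / frequent[lhs]
                if confidence >= min_confidence:
                    out.append((set(lhs), set(itemset - lhs), confidence))
    return out

# Hypothetical order database.
baskets = [
    {"bread", "milk"},
    {"bread", "diaper", "beer"},
    {"milk", "diaper", "beer"},
    {"bread", "milk", "diaper", "beer"},
    {"bread", "milk", "diaper"},
]
frequent = apriori(baskets, min_support=3)
strong = rules(frequent, min_confidence=0.9)
print(strong)
```

Note that every pass of the `while` loop rescans all transactions to count the candidates, which is the repeated-scan cost discussed in the text.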
From the above analysis, it can be seen that the Apriori algorithm looks for frequent item sets iteratively, so its biggest problem is that it scans the database many times and produces a large number of candidate item sets. This is inefficient and performs poorly, especially on massive data. Since R. Agrawal and others [27] first proposed association rules, many experts and scholars in the field of data mining at home and abroad have conducted in-depth research on association rule mining and proposed many algorithms. Most of them are improved extensions built around the Apriori algorithm, such as hash-based algorithms, incremental update algorithms, and parallel algorithms. Some scholars have also explored new methods that differ from the Apriori algorithm in order to avoid its inherent shortcomings, such as the FP-Growth (Frequent Pattern Growth) algorithm proposed by J. Han and others [28]. An example of the steps of this algorithm is shown in Figure 5.
(1) In the first scan of the transaction database DB, the set F of frequent items and the support of each frequent item are obtained. F is sorted in descending order of support to give the list L, and the header table is constructed accordingly.
(2) In the second scan of the transaction database DB, the frequent items in each transaction read are sorted according to L from step (1). With null as the root node, each sorted transaction is inserted as a path in the FP tree, and the count of every item on the path is increased by 1. During insertion, the corresponding item is looked up in the header table and a pointer to the tree node is established. The other transactions are inserted in the same way until the complete FP tree is built.
(3) The association rules of transaction items are mined from the FP tree built above. The frequent items are traversed from the end of the header table upwards, and the FP tree is accessed through the pointers in the header table to obtain the conditional pattern base of each item (that is, the set of prefix paths in the FP tree that occur together with the suffix pattern; the conditional pattern base of p in Figure 5 is {fcam : 2, cb : 1}). The conditional FP tree of each frequent item is then constructed from its conditional pattern base (the conditional FP tree of p in Figure 5 is {(c : 3)}|p).
(4) Each conditional FP tree is mined recursively, with the single-branch and multibranch cases handled separately, until the conditional FP tree contains only one path; the patterns obtained from these paths yield the mined association rules.
It can be seen from the above steps that the FP-Growth algorithm only needs to scan the database twice, far fewer times than the Apriori algorithm, which greatly improves the execution efficiency of the algorithm. In addition, the FP-Growth algorithm does not have to produce candidate sets as the Apriori algorithm does but adopts a divide-and-conquer strategy. Compressing the database that provides the frequent item sets into an FP tree from which association rules are generated also greatly improves the efficiency of the algorithm.
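Steps (1) to (3) can be illustrated with a short Python sketch that builds an FP tree and reads a conditional pattern base from the header table. It is a sketch under assumptions: the five transactions are the standard textbook example commonly used with this algorithm and are assumed, not confirmed, to be the data behind the figure; the tie-breaking order `fcabmp` is passed in explicitly; and the names `Node`, `build_fp_tree`, and `conditional_pattern_base` are hypothetical.

```python
from collections import Counter, defaultdict

class Node:
    """One FP-tree node: an item, a count, a parent link, and child links."""
    def __init__(self, item, parent):
        self.item, self.parent, self.count, self.children = item, parent, 0, {}

def build_fp_tree(transactions, min_support, order=None):
    # Scan 1: count support and keep frequent items in descending order (list L).
    support = Counter(i for t in transactions for i in set(t))
    if order is None:
        order = sorted(support, key=lambda i: (-support[i], i))
    order = [i for i in order if support[i] >= min_support]
    rank = {item: r for r, item in enumerate(order)}
    # Scan 2: insert each transaction, sorted by L, as a path from the null root.
    root, header = Node(None, None), defaultdict(list)
    for t in transactions:
        node = root
        for item in sorted((i for i in set(t) if i in rank), key=rank.get):
            if item not in node.children:
                node.children[item] = Node(item, node)
                header[item].append(node.children[item])  # header-table pointer
            node = node.children[item]
            node.count += 1
    return root, header, order

def conditional_pattern_base(header, item):
    """Prefix paths that end just above each node of `item`, with that node's count."""
    base = {}
    for node in header[item]:
        path, parent = [], node.parent
        while parent is not None and parent.item is not None:
            path.append(parent.item)
            parent = parent.parent
        if path:
            base[tuple(reversed(path))] = node.count
    return base

# Assumed example database (standard textbook transactions), minimum support 3.
db = [list("facdgimp"), list("abcflmo"), list("bfhjo"), list("bcksp"), list("afcelpmn")]
root, header, L = build_fp_tree(db, min_support=3, order=list("fcabmp"))
base_p = conditional_pattern_base(header, "p")
print(L, base_p)
```

On this assumed data the conditional pattern base of p comes out as fcam with count 2 and cb with count 1, matching the value quoted in step (3).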
3.7.2. Association Rule Recommendation Algorithm Based on Cloud Computing
The FP-growth algorithm builds on the Apriori algorithm but uses a more advanced data structure to reduce the number of database scans, which greatly accelerates execution. It has two main steps: the first builds the FP tree, and the second mines frequent patterns from the FP tree. In this paper, the FP-growth algorithm is adapted to the MapReduce programming framework to obtain the distributed MR-FP algorithm, which consists of two MapReduce tasks. The first MapReduce task corresponds to the first step of the FP-growth algorithm: it traverses the database in a distributed fashion to count the frequency of each transaction item and sorts the items by support. The second MapReduce task, which is the core of the distributed algorithm, computes and collects the frequent item sets. Mining the conditional FP tree of each frequent item is an independent task that can run in parallel, so these tasks are distributed to multiple nodes and their results are merged to obtain the final frequent item sets, from which the association rules are derived. The flow of the distributed FP-growth algorithm is shown in Figure 6.
To illustrate Figure 6: in the Map stage of Step 1, the mapper receives key-value pairs with null as the key and a transaction T as the value. It splits T into its items and, for each item, outputs an intermediate key-value pair with the item as the key and 1 as the value. These pairs are sent to the reducers in key order through the Merge stage of MapReduce, so that all pairs with the same key are received by the same reducer node. In the Reduce phase, each reducer accumulates the values it receives for each key; the total is the frequency of the corresponding item.
In the Map procedure of Step 2, the mapper takes the same input as in Step 1. It obtains the list of frequent items computed in Step 1, sorted by support, L = Get Sorted item List(), and sorts each transaction according to L, t = Sort(T, L). It then iterates through the frequent items in the transaction, finds the transaction fragment t needed to construct the conditional FP tree of each frequent item, and saves the pairs <item, t> to the distributed file system as key-value pairs. In the Reduce procedure, the results of the Map stage are read from the distributed file system as input, and the transaction fragments belonging to each frequent item are iterated over and built into the conditional FP tree of that item. From the conditional FP tree, the frequent item sets containing the item are obtained and output with null as the key and the frequent item set as the value. Finally, a single node gathers all frequent item sets to obtain the final association rules.
The MR-FP algorithm differs from the traditional FP-growth algorithm in that it does not construct an FP tree for the entire transaction set. Instead, it sends the transactions needed to construct the conditional FP tree of each frequent item to a compute node, which builds that conditional FP tree. In the distributed algorithm, the frequent items of the header table are indexed as keys, which improves retrieval efficiency.
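The first MapReduce task can be imitated on a single machine to make the key-value flow concrete. The sketch below is plain Python, not Hadoop code: the functions map_items, shuffle, reduce_count, and frequent_item_list, the toy transactions, and the alphabetical tie-break are assumptions introduced here, and the in-memory shuffle only stands in for the framework's merge stage.

```python
from collections import defaultdict


def map_items(_key, transaction):
    """Mapper: the key is ignored (null); emit <item, 1> for every item."""
    for item in set(transaction):
        yield item, 1


def shuffle(pairs):
    """Stand-in for the merge stage: group values by key, in key order."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(sorted(groups.items()))


def reduce_count(item, ones):
    """Reducer: the sum of the 1s is the frequency of the item."""
    return item, sum(ones)


def frequent_item_list(transactions, min_support):
    """Run the simulated job and return frequent items sorted by support."""
    pairs = (kv for t in transactions for kv in map_items(None, t))
    counts = [reduce_count(k, vs) for k, vs in shuffle(pairs).items()]
    frequent = [(i, c) for i, c in counts if c >= min_support]
    # Descending support; ties broken alphabetically (assumed convention).
    return sorted(frequent, key=lambda ic: (-ic[1], ic[0]))


# Toy transactions (an assumption for illustration, not the paper's data).
transactions = ["facdgimp", "abcflmo", "bfhjo", "bcksp", "afcelpmn"]
print(frequent_item_list(transactions, min_support=3))
```

Because each mapper only needs its own transaction and each reducer only needs the values of its own keys, this step parallelizes without any shared state.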
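The second task can be sketched in the same simulated style. Here the mapper emits, for each frequent item in a sorted transaction, the fragment of items that precede it, and each reducer mines the fragments of one item independently. Two simplifications are assumed: the shuffle is an in-memory dictionary, and the reducer mines its conditional pattern base by recursive projection instead of building an explicit conditional FP tree. The names map_fragments, mine, reduce_item, and mr_fp_step2, as well as the toy data, are hypothetical.

```python
from collections import Counter, defaultdict


def map_fragments(transaction, rank):
    """Mapper: sort the frequent items by rank, emit <item, prefix fragment>."""
    t = sorted((i for i in set(transaction) if i in rank), key=rank.get)
    for pos, item in enumerate(t):
        yield item, tuple(t[:pos])


def mine(base, suffix, min_support):
    """Grow patterns from a conditional pattern base {prefix: count}."""
    counts = Counter()
    for prefix, n in base.items():
        for item in prefix:
            counts[item] += n
    results = {}
    for item, support in counts.items():
        if support < min_support:
            continue
        itemset = frozenset(suffix | {item})
        results[itemset] = support
        # Project the base onto the items that precede `item`.
        projected = Counter()
        for prefix, n in base.items():
            if item in prefix:
                projected[prefix[:prefix.index(item)]] += n
        results.update(mine(projected, itemset, min_support))
    return results


def reduce_item(item, prefixes, min_support):
    """Reducer: mine every frequent item set that ends in `item`."""
    results = {frozenset([item]): len(prefixes)}
    results.update(mine(Counter(prefixes), frozenset([item]), min_support))
    return results


def mr_fp_step2(transactions, rank, min_support):
    groups = defaultdict(list)  # simulated shuffle: item -> fragments
    for t in transactions:
        for item, prefix in map_fragments(t, rank):
            groups[item].append(prefix)
    frequent = {}
    for item, prefixes in groups.items():  # each group is independent
        frequent.update(reduce_item(item, prefixes, min_support))
    return frequent


# Toy data and item ranking (assumptions, as if produced by the first task).
transactions = ["facdgimp", "abcflmo", "bfhjo", "bcksp", "afcelpmn"]
rank = {item: r for r, item in enumerate("cfabmp")}
for itemset, support in sorted(mr_fp_step2(transactions, rank, 3).items(),
                               key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print("".join(sorted(itemset)), support)
```

Since every fragment contains only items ranked before its key, each frequent item set is produced by exactly one reducer, so the final merge is a plain union.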
4. Result Analysis and Discussion
4.1. System Test Scheme
4.1.1. Experimental Objective
Taking the massive behavior data of an e-commerce shopping website as an example, the experiment verifies the effect of the personalized recommendation algorithms based on cloud computing, using the designed distributed recommendation system framework and algorithms.
4.1.2. Experimental Content
The data source is preprocessed according to the requirements of the algorithms. The recommendation algorithm based on content filtering, the recommendation algorithm based on user collaborative filtering, and the recommendation algorithm based on association rules are then run on the Hadoop platform to measure the efficiency improvement obtained in the cloud computing environment, and the recommendation system evaluation indicators in Chapter 4 are applied to evaluate and analyze the results.
4.1.3. Experimental Steps
(1) Configure the experimental environment, build the distributed environment, and develop the software platform; (2) Prepare the data source: the test data set used in this experiment is the Amazon data collected by Stanford University (https://snap.stanford.edu/data/web-Amazon.html), which covers more than 6.6 million users and more than 2.4 million products, with more than 34 million user purchase reviews in total. The data is 33.3 GB after decompression and contains the product ID, product name, product price, user ID, user rating, time, comments, and other information; (3) Implement the three personalized recommendation algorithms based on cloud computing in turn; (4) Deploy the system, test its overall performance, and improve it.
4.2. System Operation Effect
The e-commerce intelligent recommendation prototype system based on cloud computing uses J2EE technology to simulate a web-side recommendation system, which makes the recommendation results easy to view. The system uses the cloud computing platform to mine customers' personalized consumption preferences, supports the intelligent mining and aggregation of commodity resources, and provides intelligent e-commerce recommendations tailored to customers' personalized needs. It also provides an interface for the e-commerce operation platform to improve the accuracy of customer-facing e-commerce recommendations.
After logging in, the user sees the recommendation functions listed on the left side of the system, including the latest resource recommendation, popular resource recommendation, content-based recommendation, recommendation based on collaborative filtering (including user-based collaborative filtering and item-based collaborative filtering), and recommendation based on association rules. Clicking a recommendation function displays the list of recommendations for that user on the right.
The system settings section provides functions such as the adjustment of system parameters and the viewing of users' historical consumption, historical scores, and interest models.
4.3. Evaluation of System Effects
Since the distributed e-commerce intelligent recommendation system in this paper is built on the Hadoop platform, the number of nodes of the Hadoop platform can be flexibly increased or decreased. To compare the performance of the distributed recommendation system with that of a recommendation system in the stand-alone environment, the acceleration ratio R is defined as
R = T / Te,
where T is the runtime of the recommendation algorithm in a stand-alone environment (i.e., the number of nodes on the distributed platform is 1) and Te is the runtime of the recommendation algorithm in a distributed environment (i.e., the number of nodes on the distributed platform is at least 2).
According to the formula, the execution times of the three recommendation algorithms were measured for different numbers of Hadoop cluster nodes, and the resulting acceleration ratio curves of the three algorithms are shown in Figure 7.
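As a small worked illustration of how such a curve is obtained from measured runtimes, the sketch below assumes the acceleration ratio is the usual speedup, that is, the stand-alone runtime divided by the distributed runtime. The function names and the runtimes in the example are hypothetical placeholders and are not measurements from this experiment.

```python
def acceleration_ratio(t_single, t_cluster):
    """Speedup of a cluster run relative to the single-node run."""
    if t_single <= 0 or t_cluster <= 0:
        raise ValueError("runtimes must be positive")
    return t_single / t_cluster


def acceleration_curve(runtimes):
    """Map {node count: runtime} to {node count: acceleration ratio}.

    The entry for 1 node is the stand-alone baseline; only clusters
    with at least 2 nodes appear in the result.
    """
    baseline = runtimes[1]
    return {n: acceleration_ratio(baseline, t)
            for n, t in sorted(runtimes.items()) if n >= 2}


# Hypothetical runtimes in seconds, for illustration only.
example = {1: 100.0, 2: 60.0, 4: 40.0}
for nodes, ratio in acceleration_curve(example).items():
    print(nodes, round(ratio, 2))
```

A ratio that keeps rising but by smaller steps as nodes are added corresponds to a curve whose slope decreases.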

Comparison of the experimental data: compared with the traditional algorithms, the three distributed algorithms show a clear advantage in execution time. For the data volume of this experiment, the acceleration ratio of the three recommendation algorithms gradually increases as the number of cluster nodes grows, indicating that the distributed algorithms continue to benefit from additional nodes. This is especially true of the MR-UserCF algorithm: when the cluster size reaches 10, its execution efficiency is more than 7 times that of the stand-alone environment, and the MR-CBR and MR-FP algorithms reach more than 4 times and 3 times, respectively, in the best case. However, limited by the scale of the experimental data, once the acceleration ratio approaches its peak, adding further Hadoop nodes yields smaller gains: the slope of the curve decreases, indicating that the growth rate slows as the number of nodes increases. In general, however, increasing the number of nodes effectively reduces the running time of the system.
In addition, according to the evaluation index system in Chapter 4, the three recommendation engines implemented by this system are compared and analyzed, and the conclusions are drawn as shown in Table 2.
As far as the accuracy index is concerned, because the three recommendation engines of the system (collaborative filtering, content filtering, and association rules) all produce Top-N recommendations, five-fold cross-validation is adopted, and the final evaluation result is the average of the precision, recall, and F1 values over the five experiments. The F1 values of the three engines were 7.2%, 8.4%, and 5.6%, respectively, so the content filtering recommendation engine leads in recommendation accuracy. In terms of efficiency, 5000 users are randomly selected for offline calculation, and the running times of the three engines are 12 s, 23 s, and 42 s, respectively, which shows that the collaborative filtering recommendation engine is the best in this regard. As far as coverage is concerned, it is measured as the proportion of all items that are recommended to at least one user, and the coverage rates of the three engines are 45.6%, 68.4%, and 72.1%, respectively. In terms of diversity, the recommendation lists produced by each engine for its users at a given time are taken, and the Hamming distance between the lists of each user pair is calculated; the diversity values of the three engines are 74.6%, 86.2%, and 64.5%, respectively. In terms of novelty, the average popularity of the items in the recommendation lists of the three engines is calculated, and the novelty values of the three are 15.2%, 28.6%, and 15.6%, respectively, according to the formula.
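For readers who want to reproduce this kind of comparison, the sketch below shows commonly used definitions of three of the indicators: Top-N precision, recall, and F1; catalog coverage; and Hamming-distance diversity between user lists. These definitions are assumptions and may differ in detail from the formulas of the evaluation index system used in this paper; the function names and the toy data are hypothetical.

```python
from itertools import combinations


def precision_recall_f1(recommended, relevant):
    """Top-N accuracy over all users.

    recommended: {user: list of recommended items}
    relevant:    {user: set of items the user actually chose (test set)}
    """
    hits = sum(len(set(items) & relevant.get(u, set()))
               for u, items in recommended.items())
    n_rec = sum(len(items) for items in recommended.values())
    n_rel = sum(len(relevant.get(u, set())) for u in recommended)
    precision = hits / n_rec if n_rec else 0.0
    recall = hits / n_rel if n_rel else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def coverage(recommended, all_items):
    """Share of the catalog that appears in at least one list."""
    covered = set()
    for items in recommended.values():
        covered.update(items)
    return len(covered & set(all_items)) / len(all_items)


def hamming_diversity(recommended, n):
    """Mean Hamming distance 1 - overlap/N over all user pairs."""
    pairs = list(combinations(recommended, 2))
    if not pairs:
        return 0.0
    distances = [1 - len(set(recommended[u]) & set(recommended[v])) / n
                 for u, v in pairs]
    return sum(distances) / len(distances)


# Toy data for illustration only.
recommended = {"u1": ["a", "b"], "u2": ["b", "c"]}
relevant = {"u1": {"a"}, "u2": {"c", "d"}}
print(precision_recall_f1(recommended, relevant))
print(coverage(recommended, ["a", "b", "c", "d", "e"]))
print(hamming_diversity(recommended, n=2))
```

In a five-fold cross-validation, these functions would be evaluated once per fold and the five results averaged.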
5. Conclusion
Judging from the intelligent recommendation systems used by multiple e-commerce platforms, intelligent recommendation has significant advantages over traditional passive recommendation technology: it expands the sales volume of the platform while creating a better shopping experience for users and helping them find the goods they need in the shortest possible time. At the same time, many e-commerce platforms actively apply big data and cloud computing technologies, which further improves the accuracy of intelligent recommendations.
Although some e-commerce platforms have realized the integration of traditional recommendation technology and intelligent recommendation technology and used cloud computing and big data analysis, there are still many flaws. For example, the matching degree between intelligent recommendation and user needs after big data analysis is still relatively low, which affects the further improvement of users’ shopping experience to a certain extent. Therefore, the intelligent recommendation system of e-commerce platforms still needs to be further improved.
With the popularization of the Internet and the development of e-commerce, the recommendation system has gradually become an important research topic in e-commerce information technology and has received more and more attention from researchers. At present, research on recommendation systems focuses mainly on improving recommendation algorithms, and few researchers discuss the methods and principles of designing and implementing a recommendation system from the perspective of the information system, which is at odds with the strongly practical application background of recommendation systems. Starting from an analysis of the requirements of e-commerce recommendation systems, this paper analyzes and designs a prototype e-commerce recommendation system and aims to propose some common methods and principles for the design of e-commerce recommendation systems, in order to provide theoretical guidance for the development of actual systems.
Data Availability
The labeled dataset used to support the findings of this study is available from the corresponding author upon request.
Conflicts of Interest
The author declares that there are no conflicts of interest.