Abstract
Mobile edge computing (MEC) alleviates the high latency of cloud computing by offloading tasks to edge servers. Because edge resources are limited, the efficiency of computation offloading must be improved. However, existing methods involve a large amount of redundant data transmission between MEC servers and users, and this additional transmission increases the task processing delay. To reduce the total delay, a new cache-assisted computation offloading strategy is proposed. To handle the large number of similar requests from users, a new cache management mechanism is designed. Through an approximate matching method, this mechanism selects reusable calculation results in the cache space more accurately and improves the cache hit ratio. Then, to address the problem of offloading efficiency, the delay optimization problem is transformed into an optimal path problem: a cost function is defined to determine the optimal offloading position, and an improved path planning method is used to plan the optimal offloading path. The simulation results indicate that, compared with other standard schemes, the proposed scheme improves the cache hit ratio and reduces the total processing delay of tasks.
1. Introduction
With the development of the Internet, the increase in the number of network devices generates huge data traffic [1]. It is expected that in the next few years cloud data centers will be under tremendous pressure. Mobile edge computing (MEC) [2] has been proposed as a new computing paradigm to deal with this problem. In an MEC system, MEC servers are deployed at the base stations (BSs) and can execute delay-sensitive applications in close proximity to end users [3]. The edge servers can provide computation and storage resources to nearby IoT devices and offer data processing services [4]. Unlike traditional mobile cloud computing (MCC) [5], MEC extends cloud computing power and services to the edge of the network for two reasons: on the one hand, MEC ensures that data processing relies primarily on local devices rather than cloud servers; on the other hand, it usually does not need to interact with a remote cloud server and can satisfy the requirements of local users directly [6, 7]. Edge systems support fine-grained access to different dimensions of data [8]. In addition, for local processing, mobile devices face problems such as limited battery energy, limited resources, and limited computing capacity [9]. Therefore, computation offloading has emerged, which can optimize the transmission delay of tasks and reduce the user's computing burden [10]. However, if users offload all application tasks to the MEC server, the server is likely to be overloaded. Besides computation offloading, the service cache is also an important topic for MEC [11]. The service cache prestores the database or library related to an application and allows the corresponding task to be offloaded. Due to the limited resources of edge servers, caching decisions must be made carefully to maximize system performance [12]. How to use the cache to reduce service delay while maximizing storage utilization is still a key issue in edge networks, and the heterogeneity of edge networks and the uneven distribution of users make it difficult for the system to balance caching and offloading. To deal with this problem, this paper proposes a cache-assisted offloading method together with a target-server matching strategy based on task cost, which determines the appropriate target server to meet users' needs.
In brief, the main contributions of this paper are threefold: (1) This paper considers an MEC server cooperative cache system. To reduce the access delay and the computation overhead, we design a new cache management strategy based on dynamic data approximate matching. Through an approximate matching algorithm based on sample distance, a data set similar to the input data is selected from the collaborative cache space; by retrieving the corresponding calculation result for reuse, the cache hit ratio can be improved. (2) To improve offloading efficiency, this paper proposes a new computation offloading method. According to time sensitivity and communication cost, the optimal target server is estimated; the computation offloading problem is then transformed into an optimal path planning problem, and finally the optimal offloading path is planned. (3) We evaluate the effectiveness of the proposed heuristic cache-assisted method (HCAM) through simulation experiments.
The rest of the paper is organized as follows. Section 2 summarizes the most related work. Section 3 introduces the system model in detail. In Section 4, we describe the cache-assisted offloading strategy. Section 5 introduces the efficiency evaluation. The simulation results are reported in Section 6. Finally, the conclusion and the future work are discussed in Section 7.
2. Related Works
In recent years, tasks such as augmented reality have imposed higher real-time requirements on data processing, and mobile data traffic continues to grow. It is not difficult to find that the data requested by users is highly repetitive, which leads to a large amount of redundant data transmission. The caching problem has therefore attracted the attention of researchers as a way to address the delay problem [13, 14]. Caching is a new strategy to improve the performance and service quality of mobile edge networks. It includes offloading tasks to the mobile edge cloud and storing computation results in local storage located at the edge of the network. This technology avoids redundant, repetitive processing of the same task, thereby simplifying the offloading process and improving the utilization of network resources [15, 16]. As a new method to alleviate the unprecedented network traffic, mobile edge caching has been widely used in the wired Internet, and it has been shown to reduce delay and energy consumption [17]. To date, many research works have focused on optimizing caching methods to address the delay and energy consumption problems in computation offloading. In [18], the authors considered the horizontal cooperation between mobile edge nodes for joint caching and proposed a new transformation method to solve the edge caching problem and improve the cache hit rate of the network. In [19], the authors designed a heterogeneous collaborative edge cache framework by jointly optimizing node selection and cache replacement in mobile networks; the joint optimization problem was expressed as a Markov Decision Process (MDP), and a Deep Q Network (DQN) was used to solve it, which alleviates the offloading traffic load. In [20], the problem of edge cache optimization in fog radio access networks (F-RANs) was studied, and a distributed edge cache scheme was proposed, which reduced the service delay and the traffic load. In [21], the authors combined the user's context behavior to optimize the cache and modeled the problem of maximizing the click-through rate of the content as a knapsack problem; in the MEC paradigm, a heuristic intelligent caching algorithm was proposed, which achieved a better cache hit rate, better stability, and lower overhead. In [22], the problem of vehicle edge caching in practical vehicle scenarios was studied. To obtain a higher hit ratio, the service process was modeled as a joint process of vehicle movement and parking through approximation theory, and a practical vehicle edge cache solution realized the trade-off between hit ratio and interrupt request ratio. In [23], the computation offloading of cached data was studied, and a new cache-aware computation offloading strategy was proposed; the goal was to minimize the equivalent weighted response time of all tasks under constraints on computational power and caching capacity. In [24], the authors designed the underlying structure of cache causality and a task dependency model and devised an alternate minimization technique that reduces complexity by alternately updating the cache placement and the offloading decisions. In [25], the authors considered a complex scenario in which multiple moving mobile devices (MDs) share multiple heterogeneous MEC servers, and formulated a minimum energy consumption problem in a deadline-aware MEC system.
Some research works have concentrated on introducing the concept of edge caching into different systems, proposing new frameworks or models to solve the optimization problem during offloading. In [26], a cooperative offloading and buffering model was designed, an optimization problem containing two independent subproblems was constructed, and a resource management algorithm was developed to guide a BS to jointly schedule offloaded computation and allocate data buffers; the total communication delay of the system can be minimized through the optimal offloading and caching decisions. In [27], the authors proposed a collaborative edge caching scheme, defined the joint optimization problem as a Dual-Time-Scale Markov Decision Process (DTS-MDP), and proposed a framework based on Deep Deterministic Policy Gradient (DDPG). In [28], in view of the high link load of edge caching and the small storage space of servers, a cloud-edge collaborative cache model based on a greedy algorithm was proposed. In [29], the problem of edge caching in optical fiber computing networks was analyzed, and a capacity-aware edge caching framework was proposed; the average download time minimization problem was described as a multiclass processor queuing process, and an algorithm based on the Alternating Direction Method of Multipliers (ADMM) was proposed. In [30], a new intelligent edge was defined, which combines a heterogeneous IoT architecture with edge computing, caching, and communication. In [31], an offloading framework that enables task caching was proposed in edge computing to jointly optimize the response delay and the energy consumption of roadside units. In [32], to minimize the total delay of tasks, the authors jointly considered computation offloading, content caching, and resource allocation in an integrated model, designed an asymmetric search tree, and improved the branch and bound method to obtain a set of accurate decision-making and resource allocation strategies. Summarizing the research on computation offloading with cached data in MEC, we can conclude that the combination of edge caching and computation offloading has made progress in meeting users' requirements and improving the user experience.
In summary, most existing works do not take into account the influence of cache management and do not fully investigate the collaboration among MEC servers. Thus, when the MEC environment changes dramatically, bursts of requests can suddenly increase the computation load on MEC servers, and the edge network links in certain regions can become congested, significantly degrading the efficiency of computation offloading. Accordingly, we make full use of the characteristics of the edge cache and propose a cache-assisted computation offloading method, which improves the cache hit ratio and the offloading efficiency.
3. Network Model
3.1. System Architecture
Computation offloading is a proven and successful technique for enabling resource-intensive applications on mobile devices. Efficient data sharing extends the collaboration capabilities of the edge system [33]. For emerging mobile collaboration applications, when multiple users issue requests at a similar distance, offloaded tasks may be duplicated. It is therefore necessary to design a collaborative offloading scheme and to cache popular calculation results that may be reused by other mobile users. In multi-access mobile edge computing, tasks offloaded from users are usually associated with specific services, and these services need to be cached in MEC nodes to execute the tasks. Deciding which services to cache and which tasks to perform on each MEC node with limited resources is critical to maximizing offloading efficiency [34]. This section considers an optimized regional collaborative cache system architecture. Table 1 presents the key notations of the optimization model and the corresponding descriptions.
As shown in Figure 1, we consider a distributed multiuser MEC system consisting of multiple MEC servers connected via backhaul links, each of which provides computation and storage capacity to meet the delay-sensitive requirements of tasks. This article assumes that only one task is generated per user. Let $\mathcal{U} = \{u_1, u_2, \dots, u_N\}$ denote the set of users, and let $T_i = (d_i, c_i, t_i)$ denote the task randomly generated by user $u_i$, where $d_i$ represents the size of the task, $c_i$ denotes the amount of computation resource needed to execute the task (quantified in CPU cycles), and $t_i$ represents the time required to perform the task. In this system model, each BS is equipped with an MEC server to handle offloading requests. According to their own needs, users can choose to perform tasks locally or offload them to the edge servers. Assuming that each task occupies only one virtual machine, the user-generated request determines whether the virtual machine is occupied based on the offloading decisions. In this paper, an optimized cache management model and an offloading strategy are designed to reduce the user's request delay; they are introduced in the following sections.
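To make the system model concrete, the following sketch represents a task and an edge server as simple Python records; the field names (size_bits, cpu_cycles, deadline_s, cpu_hz, cache_bytes) and the example values are illustrative choices rather than notation taken from the model.

```python
from dataclasses import dataclass, field

@dataclass
class Task:
    """One task generated by a user: input size, required CPU cycles, and time budget."""
    user_id: int
    size_bits: float       # data volume transmitted when offloading
    cpu_cycles: float      # computation demand
    deadline_s: float      # time allowed to perform the task
    offload: bool = False  # binary offloading decision (False = execute locally)

@dataclass
class MECServer:
    """An edge server co-located with a base station."""
    server_id: int
    cpu_hz: float                               # computation capacity (cycles per second)
    cache_bytes: float                          # storage available for cached results
    queue: list = field(default_factory=list)   # tasks waiting to be executed (FCFS)

# Example: a 2 Mbit task needing 1 Gcycles within 0.5 s, offloaded to server 0.
task = Task(user_id=1, size_bits=2e6, cpu_cycles=1e9, deadline_s=0.5, offload=True)
server = MECServer(server_id=0, cpu_hz=10e9, cache_bytes=40e9)
server.queue.append(task)
```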

3.2. Problem Formulation
Latency reflects the efficiency with which the system executes user requests and is a direct evaluation criterion of the user experience. In this paper, the delay of a task consists of four parts: the communication time between the MEC server and the user, $D_i^{com}$; the computing time for the server to execute the task, $D_i^{exe}$; the waiting time caused by other tasks, $D_i^{wait}$; and the time for the BS to forward the task to the target MEC server, $D_i^{fwd}$.
Defining the offloading decision variable $x_i$, a binary variable is used to represent whether task $T_i$ is executed locally or offloaded to the edge server:
$$x_i = \begin{cases} 0, & \text{task } T_i \text{ is executed locally,} \\ 1, & \text{task } T_i \text{ is offloaded to the edge server.} \end{cases}$$
Assuming that the channel adopts a microwave link and the communication mode is full duplex, the communication rate between the MEC server and the user is calculated as
$$r_i = B \log_2\!\left(1 + \frac{p_i h_i}{I + \sigma^2}\right),$$
where $B$ is the channel bandwidth, $p_i$ is the transmission power of the user, $h_i$ is the channel gain, $I$ is the channel interference, and $\sigma^2$ is the Gaussian white noise power. Therefore, the communication time $D_i^{com}$ is
$$D_i^{com} = \frac{d_i}{r_i}.$$
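As a rough illustration of this rate model, the sketch below evaluates a Shannon-type uplink rate from bandwidth, transmit power, channel gain, interference, and noise, and derives the resulting transmission delay; all numeric values are placeholders.

```python
import math

def uplink_rate(bandwidth_hz, tx_power_w, channel_gain, interference_w, noise_w):
    """Achievable uplink rate (bit/s) under the SINR model described above."""
    sinr = (tx_power_w * channel_gain) / (interference_w + noise_w)
    return bandwidth_hz * math.log2(1.0 + sinr)

def transmission_delay(size_bits, rate_bps):
    """Time to push the task input over the wireless link."""
    return size_bits / rate_bps

# Placeholder numbers: 10 MHz channel, 0.1 W transmit power.
r = uplink_rate(bandwidth_hz=10e6, tx_power_w=0.1, channel_gain=1e-6,
                interference_w=1e-9, noise_w=1e-10)
print(f"rate = {r/1e6:.2f} Mbit/s, comm delay = {transmission_delay(2e6, r)*1e3:.2f} ms")
```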
The computing time $D_i^{exe}$ is calculated as
$$D_i^{exe} = \frac{c_i}{f_j},$$
where $f_j$ is the computation capacity (CPU frequency) of the node executing the task.
Because the MEC server accepts user requests on a first-come, first-served basis, the waiting time $D_i^{wait}$ is
$$D_i^{wait} = \sum_{k=1}^{n_i} D_k^{exe},$$
where the sum is taken over the tasks queued ahead of task $T_i$.
$n_i$ is the number of tasks transmitted before task $T_i$. When the serving MEC server cannot process task $T_i$, the BS chooses to forward task $T_i$ to the target MEC server, and the forwarding time $D_i^{fwd}$ is calculated as
$$D_i^{fwd} = \frac{d_i}{r^{BS}}.$$
$r^{BS}$ is the forwarding transmission rate of the BS. To sum up, the total delay $D_i$ of task $T_i$ is defined as the sum of $D_i^{com}$, $D_i^{exe}$, $D_i^{wait}$, and $D_i^{fwd}$:
$$D_i = D_i^{com} + D_i^{exe} + D_i^{wait} + D_i^{fwd}.$$
The total processing delay $D$ of all tasks is calculated as follows:
$$D = \sum_{i=1}^{N} D_i.$$
This problem is equivalent to assigning task $T_i$ to a resource node in a different region so as to minimize the total processing time of all tasks. When the limited cache capacity of the edge servers is relaxed, it can be transformed into a classic transportation problem [35]. The optimization problem is as follows:
$$\min_{\{x_i\}} \; D = \sum_{i=1}^{N} D_i \quad \text{s.t.} \quad x_i \in \{0, 1\}, \; D_i \le t_i, \; \forall i,$$
subject additionally to the cache capacity constraint of each edge server.
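A minimal sketch of the objective is given below, assuming that an offloaded task incurs the four delay components defined above while a locally executed task incurs only its local computing time; the example figures are arbitrary.

```python
def task_delay(offload, comm_s, comp_s, wait_s, fwd_s, local_comp_s):
    """Total processing delay of one task under a given offloading decision."""
    if not offload:
        return local_comp_s          # executed on the device, no transmission
    return comm_s + comp_s + wait_s + fwd_s

def total_delay(decisions, components):
    """Objective of the optimization problem: sum of all task delays."""
    return sum(task_delay(x, *c) for x, c in zip(decisions, components))

# Two tasks: (comm, comp, wait, fwd, local) delays in seconds.
components = [(0.010, 0.020, 0.005, 0.002, 0.120),
              (0.015, 0.030, 0.000, 0.000, 0.080)]
print(total_delay([True, False], components))   # offload task 0, run task 1 locally
```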
4. Optimal Offloading Solution
4.1. Content Update Method
In the MEC distributed architecture, a dynamic probabilistic caching scenario is designed according to the time-varying content, and it can adapt to the time-varying content popularity without knowing the popularity in advance. In the server-collaboration environment, users with different needs and different requests cluster into areas of a given radius; different regions are connected by optical fiber, so the collaboration areas of MEC servers can realize content sharing. In the case of limited cache capacity, the MEC server must cache proactively. The BS can parse part of the content requests and place the cached content without fetching the result over the backhaul link, which relieves the pressure on the communication link. However, the popularity of content changes over time, and dynamic probabilistic caching can adapt to the time-varying instantaneous content popularity and improve the instantaneous cache hit rate. In this article, the probability that a user randomly requests the task (content) ranked $k$ obeys Zipf's law and is therefore calculated as
$$P_k = \frac{k^{-\gamma}}{\sum_{m=1}^{M} m^{-\gamma}},$$
where $M$ is the total number of contents and $\gamma$ is the value of the Zipf distribution exponent.
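The content update model can be illustrated with a short sketch that builds the Zipf request probabilities and draws random content requests; the catalogue size and exponent are placeholder values.

```python
import random

def zipf_probabilities(num_contents, gamma):
    """Request probability of each content rank under Zipf's law."""
    weights = [rank ** (-gamma) for rank in range(1, num_contents + 1)]
    total = sum(weights)
    return [w / total for w in weights]

probs = zipf_probabilities(num_contents=1000, gamma=0.8)
# Draw 5 random content requests following the popularity profile.
requests = random.choices(range(1, 1001), weights=probs, k=5)
print(requests)
```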
4.2. Optimal Cache Management Strategy
To further improve the cache hit ratio, this paper adopts an optimized cache management strategy on the basis of the above content update method. Assuming that the area near each BS is divided according to empirical values, the number of edge servers in each area is kept approximately the same. Collaboration between servers can integrate the caches at the edge of the network; in a collaborative environment, the requested content can be transferred from one MEC server to another. Because the computation capacity of edge servers is limited, repeated calculation of the same request wastes computing resources and increases the waiting delay of end users [36]. This process faces two challenges. On the one hand, since there are almost never two completely identical images or voice samples in image recognition and speech recognition scenarios, only the most similar data can be found rather than completely identical data, so the traditional cache selection strategy based on exact matching is no longer applicable [37]. On the other hand, users generate a large amount of data every day, it takes a lot of time to search for the same or similar data in such massive data, and the search becomes more difficult as the data dimensionality increases. To address these problems, we propose a new cache management strategy based on dynamic data approximate matching, as given below.
Among spatial index data structure construction methods, the Baton-tree [38] is the most effective, and the complexity of other methods is affected by the dimensionality of the data. An approximate data lookup with the Baton-tree returns a set of data similar to the input data; the usual approach is then to traverse this similar data set, find the entry closest to the input data, and return the result, but the search accuracy of that method is low. To improve the search accuracy, the KNN [39] algorithm can be used to filter the data in the similar data set.
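The sketch below illustrates only the second, KNN filtering stage: given a candidate set such as a Baton-tree lookup would return, it keeps the k candidates closest to the input vector by Euclidean distance. The index lookup itself is not implemented here.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_filter(query, candidates, k):
    """Keep the k candidates closest to the query (refines the index lookup result)."""
    return sorted(candidates, key=lambda c: euclidean(query, c))[:k]

# Candidate set as it might be returned by the spatial index; 3-dimensional features.
candidates = [(0.9, 0.1, 0.3), (0.8, 0.2, 0.4), (0.1, 0.9, 0.7), (0.85, 0.15, 0.35)]
query = (0.88, 0.12, 0.33)
print(knn_filter(query, candidates, k=2))
```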
4.2.1. Matching Method Based on the Distance Threshold of Cache Data
The existing KNN search algorithm generally ignores the influence of distance on the accuracy of the algorithm and assumes that all approximate data carry the same weight regardless of distance [40]. In fact, the distance between a data item in the set and the input data determines their similarity. In this paper, a distance-based matching algorithm is proposed to search the similar data set acquired by the Baton-tree algorithm more accurately, so as to effectively improve the accuracy of data selection.
When defining the weight of each data item, the matching method based on the distance threshold takes into account the Euclidean distance between each similar data item and the input data: the larger the Euclidean distance between the approximate data and the input data, the smaller the weight. Defining the Euclidean distance as $dist(q, s_j)$, the formula is
$$dist(q, s_j) = \sqrt{\sum_{k=1}^{m} (q_k - s_{j,k})^2},$$
where $q$ denotes the input data and $s_j$ denotes an approximate data item of the input data $q$; $q_k$ represents the $k$-th coordinate of $q$, and $s_{j,k}$ represents the $k$-th coordinate of $s_j$. Given the input data $q$ and the approximate data set $S = \{s_1, s_2, \dots, s_K\}$, $w_j$ is used to indicate the weight value of the approximate data $s_j$; it can be calculated using the following:
$$w_j = \frac{1 / dist(q, s_j)}{\sum_{l=1}^{K} 1 / dist(q, s_l)}.$$
$w_{th}$ is the weight threshold.
In this paper, the discriminant between $q$ and $s_j$ is expressed through their similarity. Let $\vec{q}$ and $\vec{s_j}$ denote the vectors formed by the coordinates of $q$ and $s_j$, respectively. Then the similarity between the input data and a data item in the cache space is defined as the cosine of the angle between $\vec{q}$ and $\vec{s_j}$:
$$sim(q, s_j) = \cos(\vec{q}, \vec{s_j}) = \frac{\vec{q} \cdot \vec{s_j}}{\|\vec{q}\| \, \|\vec{s_j}\|}.$$
$sim_{th}$ is the similarity threshold. For the input data $q$, the similarity values of the data set obtained by the Baton-tree algorithm at different distances form the set
$$SIM(q) = \{ sim(q, s_j) \mid s_j \in S \}.$$
The paper takes the maximum value among the cosine values in $SIM(q)$ and denotes the corresponding data item as $s^{*}$, with weight $w^{*}$. If $sim(q, s^{*}) \ge sim_{th}$ and $w^{*} \ge w_{th}$, then $s^{*}$ and its corresponding cached calculation result are returned as the approximate match of the input data $q$; otherwise, Null is returned and the query fails. Figure 2 depicts the cache management mechanism based on approximate matching.
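A minimal sketch of this matching rule is shown below. It assumes normalized inverse-distance weights and cosine similarity with fixed thresholds, which are illustrative stand-ins for the exact formulas above, and returns the cached result on a hit or None on a miss.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def approximate_match(query, cached, weight_th=0.2, sim_th=0.95):
    """Return the cached result reusable for `query`, or None if the lookup fails.
    cached: list of (feature_vector, stored_result) pairs from the candidate set."""
    if not cached:
        return None
    # Normalized inverse-distance weights (an assumed form): closer entries weigh more.
    inv = [1.0 / (1e-9 + euclidean(query, vec)) for vec, _ in cached]
    weights = [w / sum(inv) for w in inv]
    # Pick the entry with the highest cosine similarity to the query.
    best = max(range(len(cached)), key=lambda i: cosine(query, cached[i][0]))
    if weights[best] >= weight_th and cosine(query, cached[best][0]) >= sim_th:
        return cached[best][1]      # cache hit: reuse the stored computation result
    return None                     # cache miss: the task must be computed

cache = [((0.9, 0.1, 0.3), "result_A"), ((0.1, 0.9, 0.7), "result_B")]
print(approximate_match((0.88, 0.12, 0.31), cache))
```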

4.3. Problem Transformation
According to Dijkstra's shortest-path method [41], the problem of finding the appropriate edge cache node can be transformed into a shortest path planning problem. This paper assumes that the transmission rates between directly connected MEC servers, denoted $r_0$, are all equal. Let $r_{m,n}$ be the equivalent transmission rate of the shortest route between servers $m$ and $n$, and let $h_{m,n}$ be the number of channels (hops) on that route. If server $m$ is connected to server $n$ through $h_{m,n}$ channels, there is the following relationship among $r_{m,n}$, $r_0$, and $h_{m,n}$:
$$\frac{d_i}{r_{m,n}} = h_{m,n} \cdot \frac{d_i}{r_0}. \tag{16}$$
It can be deduced from (16) that
$$r_{m,n} = \frac{r_0}{h_{m,n}}. \tag{17}$$
According to (16) and (17), the transfer delay grows linearly with the number of channels, which proves that the shortest path means the least number of channels.
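Because all inter-server links are assumed to have the same rate, the least-delay route is the one with the fewest hops. The sketch below finds it with Dijkstra's algorithm on a unit-weight server graph (a plain breadth-first search would work equally well); the topology is hypothetical.

```python
import heapq

def fewest_hop_path(adjacency, source, target):
    """Dijkstra on a unit-weight graph: returns (hop_count, path) between two servers."""
    dist = {source: 0}
    prev = {}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == target:
            path = [node]
            while node in prev:
                node = prev[node]
                path.append(node)
            return d, path[::-1]
        if d > dist.get(node, float("inf")):
            continue                      # stale heap entry
        for nxt in adjacency[node]:
            if d + 1 < dist.get(nxt, float("inf")):
                dist[nxt], prev[nxt] = d + 1, node
                heapq.heappush(heap, (d + 1, nxt))
    return float("inf"), []

# Hypothetical topology: server 0 reaches server 4 in 2 hops via server 3.
adjacency = {0: [1, 3], 1: [0, 2], 2: [1, 4], 3: [0, 4], 4: [2, 3]}
print(fewest_hop_path(adjacency, 0, 4))
```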
4.4. Offloading Location Confirmation
Some researchers have proposed different cost estimation methods for task execution, the most common of which are based on task time sensitivity [42]. However, they do not consider the computing resource usage of the MEC server. Therefore, this paper proposes a new method for estimating the cost of task execution. The congestion degree of a server's communication link is obtained through
$$\rho_a = \frac{n_a}{\bar{n}}, \qquad \bar{n} = \frac{1}{A} \sum_{a=1}^{A} n_a, \tag{18}$$
where $A$ is the number of divided areas, $n_a$ is the number of nodes in area $a$, and $\bar{n}$ is the average number of nodes over all areas. Combined with formula (8), the execution cost function of task $T_i$ is
$$C_i = \alpha D_i + \beta \rho_a.$$
$\alpha$ and $\beta$ are the weights given to the two objectives, respectively, with $\alpha + \beta = 1$. After the estimated execution time of the edge server and the congestion level of the communication link are returned to the terminal, the terminal device decides whether to perform computation offloading and where to offload. This article uses the steps described in Algorithm 1 to confirm the offloading position.
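The sketch below illustrates how the offloading position could be confirmed under such a cost function: each candidate server reports an estimated execution delay and the congestion of its area, and the terminal selects the server with the lowest weighted cost or keeps the task local. The weights, the congestion measure, and the normalization of the delay term are illustrative assumptions, not the exact form of Algorithm 1.

```python
def congestion(nodes_in_area, avg_nodes_per_area):
    """Relative load of the candidate server's area (assumed congestion measure)."""
    return nodes_in_area / avg_nodes_per_area

def execution_cost(delay_ratio, congestion_level, alpha=0.6, beta=0.4):
    """Weighted cost of the two objectives (alpha + beta = 1)."""
    return alpha * delay_ratio + beta * congestion_level

def choose_offloading_target(candidates, local_delay_s, alpha=0.6, beta=0.4):
    """candidates: (server_id, estimated_delay_s, nodes_in_area, avg_nodes_per_area).
    Returns the min-cost server id, or None if local execution is cheaper."""
    best_id, best_cost = None, execution_cost(1.0, 0.0, alpha, beta)  # local baseline
    for server_id, est_delay, nodes, avg_nodes in candidates:
        # Delay is normalized by the local delay so both objectives are comparable.
        cost = execution_cost(est_delay / local_delay_s, congestion(nodes, avg_nodes),
                              alpha, beta)
        if cost < best_cost:
            best_id, best_cost = server_id, cost
    return best_id

candidates = [(0, 0.040, 12, 10.0), (1, 0.025, 18, 10.0), (2, 0.030, 8, 10.0)]
print(choose_offloading_target(candidates, local_delay_s=0.120))   # -> 2
```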
Avoiding resource conflicts: after the MEC server executes the search algorithm, it finds a server that meets the user's needs. Before resources are assigned to users, because each user has a different task request and servers have different processing times, resource conflicts may arise: when a server is idle, multiple users compete for its cache and computing resources at the same time. This article considers that the server receives the users' resource requests and sets a dynamic priority according to the execution time and order of the requests. Here, dynamic priority means that a user obtains an initial priority when applying for resources and the priority is continuously modified while the resources are in use. In this way, conflicts in resource usage are avoided. A sketch of this mechanism is given below.
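One possible realization of this dynamic priority, sketched below, keeps pending requests in a min-heap keyed on (priority, arrival order); the specific rule of favoring shorter remaining execution times is an illustrative assumption.

```python
import heapq
import itertools

class ResourceScheduler:
    """Grants cache/compute resources one request at a time, by dynamic priority."""
    def __init__(self):
        self._heap = []
        self._order = itertools.count()   # arrival order breaks priority ties

    def request(self, user_id, exec_time_s):
        # Initial priority when the user applies for resources: shorter jobs first.
        heapq.heappush(self._heap, (exec_time_s, next(self._order), user_id))

    def update_priority(self, user_id, remaining_time_s):
        # Users modify their priority while using resources (re-insert with new key).
        self._heap = [e for e in self._heap if e[2] != user_id]
        heapq.heapify(self._heap)
        heapq.heappush(self._heap, (remaining_time_s, next(self._order), user_id))

    def grant_next(self):
        return heapq.heappop(self._heap)[2] if self._heap else None

sched = ResourceScheduler()
sched.request(user_id=7, exec_time_s=0.08)
sched.request(user_id=3, exec_time_s=0.02)
print(sched.grant_next())   # -> 3 (shorter job served first, avoiding the conflict)
```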
4.5. Offloading Path Identification
Within each area, there are complex routes between edge cache nodes, and the least costly path needs to be found once the appropriate target server has been determined. This paper designs a path planning algorithm based on task cost; by using problem information to guide the search, the cost of the system search is reduced and the throughput is improved. The procedure is shown in Algorithm 2.
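Algorithm 2 gives the actual procedure; as a rough sketch of the idea, the code below performs a uniform-cost search in which each link's weight combines its transfer delay with a congestion penalty for the node it enters. The weighting factor and the topology are assumptions for illustration.

```python
import heapq

def plan_offloading_path(links, congestion, source, target, beta=0.1):
    """links: {node: [(neighbor, transfer_delay_s), ...]};
    congestion: per-node penalty added when the path enters that node."""
    best = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == target:
            path = [node]
            while node in prev:
                node = prev[node]
                path.append(node)
            return cost, path[::-1]
        for nxt, delay in links[node]:
            new_cost = cost + delay + beta * congestion[nxt]
            if new_cost < best.get(nxt, float("inf")):
                best[nxt], prev[nxt] = new_cost, node
                heapq.heappush(heap, (new_cost, nxt))
    return float("inf"), []

# Hypothetical topology: the direct-looking route A-B-D is avoided because B is congested.
links = {"A": [("B", 0.01), ("C", 0.02)], "B": [("D", 0.01)],
         "C": [("D", 0.005)], "D": []}
congestion = {"A": 0.0, "B": 0.8, "C": 0.1, "D": 0.2}
print(plan_offloading_path(links, congestion, "A", "D"))
```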
5. Efficiency Evaluation
Figure 3 describes the performance comparison between the improved distance search algorithm (IDSA) proposed in this paper and the distance search algorithm (DSA). By increasing the number of edge nodes, the total delay of the two algorithms is compared. When the number of edge nodes is less than 4, the performance of the two algorithms is close. Due to the increase in the number of requesting users, the waiting, transmission, and calculation delays all increase, resulting in different degrees of increase in the delay of the two algorithms. However, it can be seen from the figure that when the number of servers is greater than 6, the performance gap between IDSA and DSA gradually widens, mainly because the algorithm proposed in this paper can quickly plan the offloading path and reduce the total delay of users. Therefore, the proposed algorithm is significantly better than DSA in terms of delay performance.

6. Simulation Experiment
In this section, we evaluate the performance of the proposed scheme through simulation. The proposed scheme is compared with the following four schemes: (1) random cache (RC) [43], which randomly caches popular content; (2) greedy cache (GC) [44], which only caches popular content within its own area; (3) fair cache (FC) [45], in which each collaboration area proportionally caches popular content; and (4) collaborative edge cache offloading (CECO) [46], which only performs collaborative caching between edge servers. The cache-assisted offloading method based on approximate matching proposed in this paper is denoted as (5) the heuristic cache-assisted method (HCAM).
Experimental simulation parameters are shown in Table 2.
Figure 4 shows the relationship between the number of tasks and the total delay under the same task processing method and different cache schemes. When the number of tasks is between 100 and 200, the system can process user requests in time, and the performance of all schemes is very close. When the number of tasks is greater than 200, the local and edge nodes cannot process all tasks within the time required by the user, the task access and waiting delays increase, and the total delay therefore grows with the number of tasks. Compared with the greedy, fair, and random cache schemes, HCAM achieves a lower total task delay, and the gap widens as the number of tasks increases; when the number of tasks is 400, the gap is largest, with a task delay of 0.053 s for HCAM versus 0.27 s for GC. Although the performance of HCAM and CECO is relatively close, as the number of tasks increases, the delay of CECO always stays above that of HCAM. It can be seen from the figure that HCAM keeps the task delay below 0.1 s; its performance is better than that of the other four schemes. This is mainly because the scheme adopts the principle of approximate matching to improve the cache hit rate when processing user requests at the edge and can reduce transmission over the backhaul link, thereby reducing the user's waiting delay.

In Figure 5, when the number of user tasks is small, the four methods all show a good optimization effect. When the number of tasks is around 100, user requests can be processed locally in time, which reduces task transmission and calculation time, so the local method performs better than offloading, CECO, and HCAM. When the number of tasks is between 100 and 200, the performance of HCAM and CECO is close to local execution, and these three methods are better than plain offloading. However, due to limited resources, the total delay of the four schemes increases with the number of user tasks. When the number of tasks is greater than 200, the delay performance of the four methods begins to diverge. Finally, when the number of tasks reaches 500, HCAM is significantly better than the other three methods, and the performance gap is largest. When there are many computing tasks, both HCAM and CECO use cache resources to reduce the transmission delay of tasks, so their performance is better than local execution and plain offloading. But when the cache is not hit, HCAM forwards tasks to the appropriate MEC server for processing in time and thereby reduces the total delay by approximately 11.5% compared with CECO.

To verify the effectiveness of the proposed method, Figure 6 compares the cache hit ratio of HCAM with that of GC, RC, FC, and CECO; the cache hit ratio is used as one of the performance criteria for evaluating the method proposed in this article. In the case of limited cache space, the higher the cache hit ratio, the lower the overall task processing delay. It can be seen from the figure that when the number of tasks is relatively small, all methods increase their cache hit ratio as the cache space increases. As the number of tasks increases, the method proposed in this paper outperforms the other methods. The main reason is that HCAM optimizes the management and allocation of the cache space through the new cache management mechanism. Compared with CECO, it adopts the approximate matching principle on top of the edge collaborative cache, which improves the cache hit ratio.

Figure 7 shows the impact of cache size on the average system delay. Since the five schemes adopt different cache management and allocation strategies, their performance differs as the cache grows. As the cache space increases, more hot content is cached, which increases the cache hit ratio. When a user generates a request, the edge server can directly send the cached content to the user; users no longer need to wait for tasks to be offloaded to the target server, which reduces the transmission and calculation delays. Because this article adopts an optimized cache management mechanism and a cooperative cache model, even if the content requested by the user is not cached, the request can be processed within the edge system as far as possible. Compared with the other four methods, the average processing delay of the system is reduced to a certain extent. When the cache size is set to 40 GB, the average delay of HCAM is the smallest. As the cache space increases, the GC method only satisfies the requests of a few users, and its system latency is the highest. Although FC has a higher cache hit ratio and better performance than GC, the average system delays of the two are very close. The RC method further improves the cache hit ratio and is better than the FC and GC methods. Both HCAM and CECO use cooperative caching, so their average delays are close to each other, but HCAM uses the principle of approximate matching to increase the cache hit ratio, thereby reducing user access latency. When the cache space is 200 GB, the performance of the HCAM method is optimal, about 1%, 24%, 36%, and 42% better than that of CECO, RC, FC, and GC, respectively.

7. Conclusion and Future Work
In this paper, we focus on a cache-assisted computation offloading strategy. To reduce the processing delay, we design a new cache management strategy based on dynamic data approximate matching, and then a new cache-assisted offloading mechanism for edge servers is proposed. To improve the efficiency of offloading, the problem of offloading location confirmation is transformed into an optimal path planning problem, and a heuristic algorithm based on task cost is introduced to determine the optimal server. The simulation results show that our scheme can reduce the total delay compared with GC, FC, RC, and CECO.
In the future, we will further optimize the cache strategy. Besides, we will study the computation offloading method for situations with dependent jobs. In addition, we will explore algorithms suitable for task priorities.
Data Availability
The (data type) data used to support the findings of this study are available from the corresponding author upon request.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
This work was supported by the National Natural Science Foundation of China under grants 61771289 and 61832012, the Major Basic Research Program of the Natural Science Foundation of Shandong Province under grant ZR2019ZD10, and the Key Research and Development Program of Shandong Province under grant 2019GGX101050.