Abstract
Performance evaluation of modern cloud data centers has attracted considerable research attention among both cloud providers and cloud customers. In this paper, we investigate the heterogeneity of modern data centers and the service process used in these heterogeneous data centers. Using queuing theory, we construct a complex queuing model composed of two concatenated queuing systems and present this as an analytical model for evaluating the performance of heterogeneous data centers. Based on this complex queuing model, we analyze the mean response time, the mean waiting time, and other important performance indicators. We also conduct simulation experiments to confirm the validity of the complex queuing model. We further conduct numerical experiments to demonstrate that the traffic intensity (or utilization) of each execution server, as well as the configuration of server clusters, in a heterogeneous data center will impact the performance of the system. Our results indicate that our analytical model is effective in accurately estimating the performance of the heterogeneous data center.
1. Introduction
Cloud computing is a popular paradigm for providing services to users via three fundamental models: Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS) [1, 2]. Cloud computing providers offer computing resources (e.g., servers, storage, networks, development platforms, and applications) to users either elastically or dynamically, according to user demand and form of payment (e.g., pay-as-you-go) [3]. A modern data center is considered a heterogeneous environment because it contains many generations of servers with hardware configurations that may differ, especially in terms of the speed and capacity of the processors. Generally, these servers have been added to the data center gradually and provisioned to replace the existing (or “legacy”) machines already in the data center’s infrastructure [4, 5]. The heterogeneity of this mix of machine platforms affects the performance of data centers and the execution of cloud computing applications.
The ability to ensure the desired Quality of Service (QoS), which is a standard element of the Service Level Agreement (SLA) established between consumers and cloud providers, is one of the most important business considerations for a cloud computing provider [6, 7]. A typical QoS outlines a set of critical performance indicators: the mean response time, the mean queue length, the mean waiting time, the mean throughput, the task capacity of the system, the blocking probability, and the probability of immediate service. All of these indicators can be analyzed and described using queuing theory [8].
A task completed by the cloud computing center follows several steps: a customer’s request is transformed into a cloud computing task and sent to the task queue (the first-level queue) maintained by the main scheduler server; the scheduler allocates computing resources for the task and dispatches it to a node (execution server); the node takes the task from its second-level queue and allocates a CPU core (the main computing resource) to complete it; the system outputs the result, and the task leaves the system. In this process, blocking occurs if a task leaves the system without having been executed (or is abandoned by the system) because of the capacity constraint at the first-level queue. A queuing model for a heterogeneous cloud computing data center can analyze the relationship among three factors: the arrival rate of tasks, the core utilization, and the probability of a node (execution server) being dispatched. Cloud providers can utilize this information to analyze their systems’ management and optimize their allocation of computing resources [9]. As we show in Section 2, much existing research has discussed the problem of how to analyze data center performance. However, these studies have not considered the following important issues:
(i) A cloud center can include a large number of servers of different generations with different CPU speeds and processor capacities. Existing research rarely considers cloud center heterogeneity; rather, studies generally assume that every node operates at the same speed.
(ii) A cloud system has at least two levels of scheduling (service) processes: the main scheduler, which allocates computing resources, and the second-level schedulers (the nodes, or execution servers), which execute tasks. The arrival rate of tasks at the main scheduler differs from that at a node, and this must be taken into account in the queuing analysis.
(iii) Node servers with different speeds and processor capacities have different probabilities of being dispatched, depending on the situation. Each node maintains its own task queue, which is tracked by a separate queuing model included in the overall system queuing model.
(iv) A state-of-the-art resource allocation scheme based on a utilization threshold controls which nodes are added to or removed from a cloud system [10].
To fill these gaps, in this paper we employ a complex queuing model to investigate the performance of a cloud center and the relationships among various performance indicators in a heterogeneous data center. We aim to make the following contributions:
(1) We model the heterogeneous data center as a complex queuing system to characterize the service process. Two concatenated queuing systems compose the complex queuing model: the first is the finite-capacity scheduler queuing system, which models the main scheduling server; the second is the execution queuing system, which models each multicore server processor as multiple servers.
(2) Based on this complex queuing model, and using queuing theory, we analyze the real output rate of the main scheduling server (that is, the real arrival rate of the execution queuing systems), the mean response time of the system, the blocking probability, and other performance indicators, and we use this analysis to evaluate the performance of, and the relationships among, these indicators in a cloud center.
(3) We conduct extensive discrete event simulations to evaluate and validate the results of using the complex queuing model to analyze the performance of heterogeneous data centers in cloud computing.
The remainder of this paper is organized as follows. Section 2 reviews and discusses the related scholarship on queuing models for cloud centers. Section 3 discusses our analytical model in detail. All of the performance metrics obtained by using this complex queuing model are presented in Section 4. In Section 5, we show the simulation results and compare them with our analytical results to evaluate and validate the applicability of the complex queuing model. Finally, conclusions and future work are presented in Section 6.
2. Related Work
Although much research has been conducted on the performance analysis of cloud centers based on queuing models [11–16], factors such as the loosely coupled architectures of cloud computing and the heterogeneous, dynamic infrastructure of a cloud center have not been considered.
In [11], the authors used a multi-server queuing system to evaluate a cloud computing center and obtained a complete probability distribution of the response time, the number of tasks in the system, and other important performance metrics. They also compared the mean waiting times of heterogeneous and homogeneous services under the same system conditions. Based on queuing theory and open Jackson networks, a combination of two queuing systems was presented to model the cloud platform for QoS requirements in [12]. In [13], the authors presented a queuing system and proposed a synthesized optimization of model, function, and strategy to optimize the performance of services in the cloud center. In [14, 15], the authors introduced a queuing model consisting of three concatenated queuing systems (the schedule queue, the computation queue, and the transmission queue) to characterize the service process in a multimedia cloud and to investigate the resource optimization problems of multimedia cloud computing. In [16], a cloud center was modeled as a queuing system to discuss cloud service performance under fault recovery.
In many existing methodologies [17–20], queuing theory has been used to analyze or optimize factors such as power allocation, load distribution, and profit control.
In [17], the authors presented an energy-proportional model, which treats a server as a queuing system, as a strategy to improve performance efficiency. In [18], a queuing model that considers factors such as the service requirements and the configuration of a multiserver system is used to optimize the multiserver configuration for profit maximization in cloud computing. In [19], the authors modeled a cloud server farm as a queuing system with finite capacity for profit optimization. Another study presents a queuing model for a group of heterogeneous multicore servers with different sizes and speeds to optimize power allocation and load distribution in a cloud computing environment [20]. In [17], the authors also used a queuing system to model a cloud center with different priority classes to analyze the general problem of resource deployment within cloud computing.
However, none of the above literature considers that the main scheduler and the execution servers each maintain their own task queues, that execution servers with different speeds have different processing times, and that each execution server has a different probability of being dispatched under different task arrival rates. A queuing model used to analyze the performance of a heterogeneous data center should therefore characterize both the service process within the cloud center and the infrastructural architecture of the data center.
3. Queuing Model for Performance Analysis
In this section, we present a complex queuing model with two concatenated queuing systems to characterize the service process and the heterogeneity of the infrastructural architecture in order to analyze the performance of a heterogeneous data center.
3.1. Cloud Center Architecture and the Service Model
In the cloud computing environment, a cloud computing provider builds a data center as the infrastructure to provide services for customers [21]. Figure 1 illustrates the architecture of a heterogeneous data center and its service model. The heterogeneous data center consists of a large number of servers of different generations. It includes the master server, which works as the main scheduler to allocate computing resources for a task or to dispatch a node to execute a task, and large-scale computing nodes, which are multicore servers with different speeds that work as execution servers.

All customers submit requests for services to the data center through the Internet; these services have been deployed by the users or supplied by the cloud providers. When the tasks (user requests) arrive at the data center, the main scheduler allocates resources for each task or dispatches nodes; the task is then distributed to an execution server to be executed. The main scheduler has a waiting buffer with finite capacity, in which it saves waiting tasks that cannot be scheduled immediately. Because the buffer’s capacity is finite, some requests cannot enter the waiting buffer and instead leave the system immediately after arriving. Each execution server receives the requests dispatched to it by the main scheduler and maintains its own task queue.
After a task has been completed, it leaves the system and the result is sent back to the customer.
3.2. Other Queuing Models for Reference
3.2.1. Multiple-Server Queue with the Same Service Rate
A queuing model containing multiple servers with the same service rate is shown in Figure 2(a). Using this type of queuing model, the authors treated a cloud computing data center as a multi-server queuing system in [11], in [19], and in [12, 13, 18]. Because each server has the same service rate, this queuing model treats every server in the data center identically and assigns each the same dispatching probability.

[Figure 2: (a) a multiple-server queue with the same service rate; (b) a multiple-server queue with different service rates.]
This queuing model simplifies the analysis and makes it easier to derive equations for the important performance metrics (e.g., the mean response time, the mean number of tasks in the system, and the mean waiting time). However, the process presented in this model does not match the service process in a cloud data center: it ignores the scheduling activity of the main server and the different speeds of execution servers of different generations.
3.2.2. Multiple-Server Queue with Different Service Rates
A queuing model including multiple servers with different service rates is shown in Figure 2(b). We can use this queuing system to model a heterogeneous data center that includes multicore servers with different speeds. Each execution server in the data center has its own service rate and dispatching probability. We assume two execution servers ($m = 2$) and a task arrival rate $\lambda$. The state-transition-probability diagram for $m = 2$ is shown in Figure 3.

When a task arrives and both servers are idle, the scheduler selects one of them to execute the task. The selection probability of the number 1 server is $q_1$ and that of the number 2 server is $q_2$ ($q_1 + q_2 = 1$). The state $(0)$ indicates that there is no task in the queuing system; $(1,1)$ indicates that the task has been distributed to the number 1 server while the number 2 server is idle; $(1,2)$ indicates that the task has been distributed to the number 2 server while the number 1 server is idle; $(2)$ indicates that there are two tasks in the system, each distributed to one of the two servers. States $(n)$, $n > 2$, indicate that there are $n$ tasks in the system, two of which are in the two execution servers. Let $\mu_1$ and $\mu_2$ denote the service rates of the number 1 and number 2 servers, respectively. The system’s service rate is $\mu = \mu_1 + \mu_2$. Figure 3 shows how this model calculates the state probabilities when $m = 2$ and $\lambda < \mu_1 + \mu_2$ [22]; the balance equations are
$$\lambda p_0 = \mu_1 p_{11} + \mu_2 p_{12},$$
$$(\lambda + \mu_1)\,p_{11} = \lambda q_1 p_0 + \mu_2 p_2,$$
$$(\lambda + \mu_2)\,p_{12} = \lambda q_2 p_0 + \mu_1 p_2,$$
$$(\lambda + \mu_1 + \mu_2)\,p_2 = \lambda (p_{11} + p_{12}) + (\mu_1 + \mu_2)\,p_3,$$
$$p_{n+1} = \frac{\lambda}{\mu_1 + \mu_2}\, p_n, \quad n \ge 2.$$
From these balance equations, we can see that when $m = 2$ they suffice to calculate the probability of each state in Figure 3. The state-transition-probability diagram for $m > 2$ will be considerably more complex.
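To make the balance equations concrete, the following Python sketch solves a truncated version of this chain numerically and checks the geometric tail. The rates, selection probabilities, and truncation level are illustrative assumptions, not values from the paper.

```python
import numpy as np

# Numerical check of the heterogeneous two-server chain (assumed parameters).
lam, mu1, mu2 = 3.0, 2.0, 5.0   # arrival and service rates (illustrative)
q1, q2 = 0.4, 0.6               # selection probabilities, q1 + q2 = 1
N = 200                          # truncation level for the infinite chain

# State indices: 0 -> empty, 1 -> (1,1), 2 -> (1,2), k >= 3 -> k-1 tasks.
Q = np.zeros((N, N))
Q[0, 1], Q[0, 2] = lam * q1, lam * q2      # arrival to an empty system
Q[1, 0], Q[1, 3] = mu1, lam                # (1,1): completion or arrival
Q[2, 0], Q[2, 3] = mu2, lam                # (1,2): completion or arrival
Q[3, 1], Q[3, 2], Q[3, 4] = mu2, mu1, lam  # two tasks: either server finishes
for k in range(4, N):
    Q[k, k - 1] = mu1 + mu2                # both servers busy
    if k < N - 1:
        Q[k, k + 1] = lam
np.fill_diagonal(Q, -Q.sum(axis=1))

# Solve pi * Q = 0 with sum(pi) = 1 (least squares on the truncated chain).
A = np.vstack([Q.T, np.ones(N)])
b = np.zeros(N + 1); b[-1] = 1.0
pi = np.linalg.lstsq(A, b, rcond=None)[0]
print("p0 =", pi[0])
print("tail ratio p5/p4 =", pi[5] / pi[4], "vs rho =", lam / (mu1 + mu2))
```

The printed tail ratio matches $\lambda/(\mu_1+\mu_2)$, confirming the geometric tail $p_{n+1} = \rho p_n$ for $n \ge 2$.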
Although it considers the heterogeneity of data centers, this model still does not accurately describe the service process in the cloud center; moreover, if $m > 2$, the performance analysis becomes much more difficult.
3.3. Queuing Model for Performance Analysis
A queuing model of heterogeneous data centers with a large number of multicore execution servers of different generations is shown in Figure 4. The two concatenated queuing systems (the scheduler queuing system and the execution queuing system) make up the complex queuing model that characterizes the heterogeneity of the data center and of the service processes involved in cloud computing. This model uses a monitor to obtain performance metrics, including the number of tasks in each queue, the mean waiting time of a task, and the utilization of each execution server. The model then sends this information to the balance controller, which optimizes performance by proposing an optimal strategy based on these metrics. This is a project we will consider in future work.

The complex queuing model is presented as follows.
The master server works as the main scheduler and maintains the scheduler queue for all requests from all users. Since the process of allocating resources for tasks in the cloud computing environment should consider all resources in the data center, the master server is treated as a single-server M/M/1 queuing system with finite capacity $K$.
Each node in the data center works as an execution server: a multicore server $S_i$ that has $m_i$ identical cores with core execution speed $s_i$ (measured in giga instructions per second, GIPS). Since each multicore execution server can process multiple tasks in parallel, it is treated as an M/M/$m_i$ queuing system.
The stream of tasks is a Poisson process with arrival rate $\lambda$ (the task interarrival times are independent and identically distributed exponential random variables with rate $\lambda$). The related notation for the system parameters is listed in the Notations.
Since arrivals are independent of the queuing state and the scheduler queuing system has finite capacity, the blocking probability of the scheduling system satisfies $P_b \ge 0$ and the effective arrival rate satisfies $\lambda' \le \lambda$, yielding the following equations:
$$\rho = \frac{\lambda}{\mu}, \tag{1}$$
$$\mu_i = \frac{s_i}{\bar r}, \tag{2}$$
$$\lambda_i = p_i \lambda', \qquad \sum_{i=1}^{n} p_i = 1, \tag{3}$$
$$\rho_i = \frac{\lambda_i}{m_i \mu_i} < 1. \tag{4}$$
4. Performance Metrics of the Queuing Model
In this section, we will use the complex queuing model with the two concatenated queuing systems proposed in Section 3.3 to analyze the performance problems of a heterogeneous data center in cloud computing. We consider the performance metrics of the scheduler queuing system, the execution queuing systems, and the entire queuing system.
4.1. The Scheduler Queuing System
Since the management of computing resources and the scheduling of tasks across the entire data center and cloud environment are done by a unified resource management platform (e.g., Mesos [23] and YARN [24]), and the system has finite capacity, the master server is treated as an M/M/1 queuing system with finite capacity $K$ (i.e., an M/M/1/K queue). The state-transition-probability diagram for the scheduler queuing model is shown in Figure 5.

If the scheduler utilization is $\rho = \lambda/\mu$ with $\rho \neq 1$, then the probability that $k$ tasks are in the queuing system is [22]
$$P_k = \frac{(1-\rho)\rho^k}{1-\rho^{K+1}}, \quad k = 0, 1, \ldots, K. \tag{5}$$
If there are already $K$ tasks in the system, an arriving task request cannot enter the queuing system. The blocking probability is
$$P_b = P_K = \frac{(1-\rho)\rho^K}{1-\rho^{K+1}}. \tag{6}$$
From (1) and (6), the master server throughput rate (the effective arrival rate of the execution queuing systems) is
$$\lambda' = \lambda\,(1 - P_b). \tag{7}$$
The mean numbers of tasks waiting in the queue and of tasks being scheduled by the main scheduler are, respectively,
$$N_q = \sum_{k=1}^{K}(k-1)P_k, \qquad N_s = 1 - P_0. \tag{8}$$
From (8), the mean number of tasks (both waiting in the queue and being scheduled by the main scheduler) in the scheduler queuing system is
$$N = N_q + N_s = \sum_{k=0}^{K} k\,P_k. \tag{9}$$
Using (7) and (9) and applying Little’s formula (response time = number of tasks in the system / effective task arrival rate), the mean task response time of the scheduler queuing system is
$$T_s = \frac{N}{\lambda'} = \frac{N}{\lambda\,(1-P_b)}. \tag{10}$$
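As a minimal sketch, the following Python function computes the scheduler-side metrics from the standard M/M/1/K formulas above; the function name and the example values at the end are ours, not the paper’s.

```python
# M/M/1/K scheduler metrics (Section 4.1): blocking probability (6),
# effective arrival rate (7), mean number in system (9), response time (10).
def scheduler_metrics(lam: float, mu: float, K: int):
    rho = lam / mu
    if abs(rho - 1.0) < 1e-12:
        probs = [1.0 / (K + 1)] * (K + 1)          # degenerate case rho = 1
    else:
        norm = 1 - rho ** (K + 1)
        probs = [(1 - rho) * rho ** k / norm for k in range(K + 1)]
    P_b = probs[K]                                  # blocking probability
    lam_eff = lam * (1 - P_b)                       # effective arrival rate
    N = sum(k * p for k, p in enumerate(probs))     # mean number in system
    T_s = N / lam_eff                               # Little's law
    W_s = T_s - 1.0 / mu                            # mean waiting time in queue
    return P_b, lam_eff, T_s, W_s

# Illustrative values (not taken from the paper's tables):
print(scheduler_metrics(lam=180.0, mu=200.0, K=100))
```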
4.2. The Execution Queuing Systems
Assume that there are $n$ execution servers in the data center. Each execution server $S_i$ has $m_i$ cores and can be treated as an M/M/$m_i$ queuing system.
The utilization of one core in the $i$th execution server is
$$\rho_i = \frac{\lambda_i}{m_i \mu_i} = \frac{p_i \lambda' \bar r}{m_i s_i}.$$
From the state-transition-probability diagram for an M/M/$m_i$ queuing system [22], let $P_{i,k}$ denote the probability that there are $k$ task requests (waiting in the queue or being executed) in the execution queuing system of execution server $S_i$.
Under the stability condition, the constraints are
$$\rho_i < 1, \qquad \sum_{k=0}^{\infty} P_{i,k} = 1.$$
We can derive the probability that there are $k$ task requests in the execution queuing system of $S_i$:
$$P_{i,0} = \left[\sum_{k=0}^{m_i-1}\frac{(m_i\rho_i)^k}{k!} + \frac{(m_i\rho_i)^{m_i}}{m_i!\,(1-\rho_i)}\right]^{-1},$$
$$P_{i,k} = \begin{cases}\dfrac{(m_i\rho_i)^k}{k!}\,P_{i,0}, & 0 \le k < m_i,\\[2mm] \dfrac{m_i^{m_i}\rho_i^{k}}{m_i!}\,P_{i,0}, & k \ge m_i.\end{cases}$$
The probability that a newly arrived task request sent from the main scheduler to the $i$th execution server will be executed immediately (i.e., finds an idle core) is
$$P_{im,i} = \sum_{k=0}^{m_i-1} P_{i,k}.$$
Averaged over the execution servers, the mean probability that a newly arrived task request entering the execution queuing systems is executed immediately is
$$P_{im} = \sum_{i=1}^{n} p_i\,P_{im,i}.$$
In the $i$th execution server, the mean numbers of tasks waiting in the queue and of tasks being executed are, respectively,
$$N_{q,i} = \sum_{k=m_i}^{\infty}(k-m_i)P_{i,k} = \frac{\rho_i\,(m_i\rho_i)^{m_i}}{m_i!\,(1-\rho_i)^2}\,P_{i,0}, \qquad N_{s,i} = m_i\rho_i. \tag{16}$$
From (16), the mean number of tasks (both waiting in the queue and being executed) in the $i$th execution server is
$$N_i = N_{q,i} + m_i\rho_i. \tag{17}$$
Applying Little’s formulas, the mean task response time of the $i$th execution queuing system and the mean waiting time for execution of a task in the $i$th execution queuing system are, respectively,
$$T_i = \frac{N_i}{\lambda_i}, \tag{18}$$
$$W_i = \frac{N_{q,i}}{\lambda_i} = T_i - \frac{1}{\mu_i}. \tag{19}$$
The mean response time of the execution queuing systems over the group of execution servers in the data center is
$$T_e = \sum_{i=1}^{n} p_i\,T_i.$$
The mean waiting time of the execution queuing systems over the group of execution servers in the data center is
$$W_e = \sum_{i=1}^{n} p_i\,W_i.$$
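The execution-side metrics above are the standard M/M/m (Erlang-C) results. A sketch under that assumption, with our own function names, follows; `servers` holds $(\mu_i, m_i)$ pairs and `p` the dispatch probabilities $p_i$.

```python
import math

# M/M/m metrics for one execution server S_i (Section 4.2).
def execution_metrics(lam: float, mu: float, m: int):
    rho = lam / (m * mu)                       # per-core utilization rho_i
    assert rho < 1, "stability condition violated"
    a = lam / mu                               # offered load m * rho
    P0 = 1.0 / (sum(a ** k / math.factorial(k) for k in range(m))
                + a ** m / (math.factorial(m) * (1 - rho)))
    Pq = a ** m / (math.factorial(m) * (1 - rho)) * P0  # P(task must wait)
    Nq = Pq * rho / (1 - rho)                  # mean number waiting, as in (16)
    W = Nq / lam                               # mean waiting time (19)
    T = W + 1.0 / mu                           # mean response time (18)
    return T, W, 1.0 - Pq                      # 1 - Pq = immediate execution

# Dispatch-weighted means over a group of servers.
def group_means(lam_eff, servers, p):          # servers: list of (mu_i, m_i)
    stats = [execution_metrics(p_i * lam_eff, mu_i, m_i)
             for p_i, (mu_i, m_i) in zip(p, servers)]
    T_e = sum(p_i * t for p_i, (t, w, im) in zip(p, stats))
    W_e = sum(p_i * w for p_i, (t, w, im) in zip(p, stats))
    P_im = sum(p_i * im for p_i, (t, w, im) in zip(p, stats))
    return T_e, W_e, P_im
```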
4.3. The Performance Metrics of the Entire Queuing System
In order to analyze the performance of heterogeneous data centers in cloud computing, we consider the main performance metrics of a complex queuing system, which consists of two concatenated queuing systems, as follows.
(i) Response Time. Based on the analysis of the two concatenated queuing systems, the equilibrium response time in a heterogeneous data center is the sum of the response times of the two phases. The response time can be formulated as
$$T = T_s + T_e. \tag{21}$$
(ii) Waiting Time. The mean waiting time for a task request is
$$W = W_s + W_e, \tag{22}$$
where $W_s = T_s - 1/\mu$ is the mean waiting time in the scheduler queue.
(iii) Loss (Blocking) Probability. Since the scheduler queuing system has finite capacity, blocking will occur:
$$P_{loss} = P_b = \frac{(1-\rho)\rho^K}{1-\rho^{K+1}}. \tag{23}$$
From (6), the loss (blocking) probability of the system is related to the finite capacity $K$ of the scheduler queuing system and the scheduling rate $\mu$ of the main scheduler server. The parameters $K$ and $\mu$ can be adjusted to obtain different loss (blocking) probabilities.
(iv) Probability of Immediate Execution. Here, the probability of immediate execution is the probability that a task request is executed immediately by an execution server after being scheduled by the scheduler server and sent to that execution server:
$$P_{im} = \sum_{i=1}^{n} p_i\,P_{im,i}. \tag{24}$$
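Combining the two sketches above gives the end-to-end metrics (21)–(24); this reuses `scheduler_metrics` and `group_means` from the earlier code, so the composition itself is the only new piece.

```python
# End-to-end metrics of the complex queuing model.
def system_metrics(lam, mu, K, servers, p):
    P_b, lam_eff, T_s, W_s = scheduler_metrics(lam, mu, K)
    T_e, W_e, P_im = group_means(lam_eff, servers, p)
    return (T_s + T_e,   # (21) mean response time: sum of the two phases
            W_s + W_e,   # (22) mean waiting time
            P_b,         # (23) loss (blocking) probability
            P_im)        # (24) probability of immediate execution
```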
(v) Execution Server Utilization Optimization Model Based on the Arrival Rate $\lambda'$. Our analytical model can be used to optimize the performance of a data center; this is one of its significant applications.
Based on the utilization threshold of the execution servers, we can determine the number of execution servers and the utilization of each execution server in the system by using this calculation model, which is based on our analytical model, to configure the execution servers in the data center for tasks arriving at rate $\lambda'$. The goal is to minimize the mean response time of the system.
Assume FCFS as the scheduling strategy, and let the heterogeneous data center be a group of $n$ heterogeneous multicore execution servers with execution speeds $s_1, s_2, \ldots, s_n$. The execution server utilization optimization model based on the arrival rate $\lambda'$ can be formulated as follows:
$$\min_{\rho_1,\ldots,\rho_n} \; T_e = \sum_{i=1}^{n} p_i\,T_i \quad \text{s.t.} \quad \sum_{i=1}^{n}\lambda_i = \lambda', \quad \lambda_i = \rho_i m_i \mu_i, \quad 0 \le \rho_i < 1. \tag{25}$$
The utilization of the execution servers determines the mean response time, the occupied time, and the resource capacity; therefore, the execution server utilization optimization model based on the arrival rate $\lambda'$ can also be used to analyze the problems of optimal resource cost and energy efficiency. We will pursue this in future work.
In this paper, we consider only the special case $m_i = 1$. For a small instance, we can use a numerical method to obtain the values of $\rho_i$ that give optimal performance. Under the condition $m_i = 1$, each execution queuing system reduces to a single-server queue with $T_i = 1/(\mu_i - \lambda_i)$, and (25) can be rewritten as follows:
$$\min_{\lambda_1,\ldots,\lambda_n} \; \sum_{i=1}^{n}\frac{\lambda_i}{\lambda'}\cdot\frac{1}{\mu_i-\lambda_i} \quad \text{s.t.} \quad \sum_{i=1}^{n}\lambda_i=\lambda', \quad 0 \le \lambda_i < \mu_i. \tag{26}$$
5. Numerical Validation
In order to validate the performance analysis equations presented above, we built a heterogeneous cloud computing data center farm on the CloudSim platform and a task generator using the discrete event tool in MATLAB. In the numerical examples and simulation experiments we present, the parameter values are illustrative and can be changed to suit the cloud computing environment.
5.1. Performance Simulation
In this section, we describe the simulation experiments that we conducted to validate the performance analysis equations. In all cases, the mean number of instructions of a task to be executed by an execution server was $\bar r$ (giga instructions). In CloudSim, we considered a heterogeneous data center with a group of multicore execution servers Host#1, Host#2, …, Host#10 and a main scheduler server Ms. The traffic intensity of the main scheduler server was $\rho$, and the finite capacity of the scheduler was $K$. The different settings of Host#$i$ ($i = 1, \ldots, 10$) are shown in Table 1. The allocation policy of the VM scheduler is space-shared, which places one task in one VM. The total computing ability (computing power) of the two groups of execution servers with different settings was the same.
In Table 1, the heterogeneous data centers ST1 and ST2 are configured with 10 PMs and 110 VMs, and 10 PMs and 58 VMs, respectively. The total computing power of both ST1 and ST2 is 220 GIPS. The hardware environment is 2× Dell PowerEdge T720 (2× CPU: Xeon 6-core E5-2630 2.3 GHz, 12 cores in total).
The interarrival times of task requests were independent and followed an exponential distribution with arrival rate $\lambda$ (the number of task requests per second). The cloudlet information is created by the task generator, which generates the task arrival interval time, the task scheduling time, and the task length.
In the experiments, we assigned each multicore execution server a dispatch probability, according to either its core speed or a set traffic intensity, to control the tasks it receives. The rules for setting the dispatch probability and the traffic intensity, which can be changed by administrators, are as follows.
(i) Setting the Dispatched Probability. Consider
$$p_i = \frac{m_i s_i}{\sum_{j=1}^{n} m_j s_j}. \tag{27}$$
Under this pattern, the traffic intensity of every execution server is the same and may exceed the utilization threshold range. For the two groups of execution servers (ST1 and ST2), the dispatch probability of each execution server was set using (27).
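Assuming (27) takes the power-proportional form shown above, its implementation is a one-liner; making the probability proportional to each host’s computing power $m_i s_i$ is exactly what equalizes the per-host utilization.

```python
# Dispatch probabilities per (27): proportional to computing power m_i * s_i.
def dispatch_probabilities(m, s):       # m: cores per host, s: GIPS per core
    power = [mi * si for mi, si in zip(m, s)]
    total = sum(power)
    return [w / total for w in power]
```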
Simulation Experiments 1. We have the following:
(1) Task number: cloudletsNo = 100000; task length (giga instructions): $\bar r$; and $\lambda$: 157.5, 160.5, 163.5, 166.5, 169.5, 172.5, 175.5, 178.5, 181.5, 184.5, 187.5.
(2) The main scheduler scheduling rate: $\mu$; host (PM) and VM settings: ST1 and ST2 (shown in Table 1); and the host (PM) dispatch probability: set by (27).
(3) Process: 30 simulation runs. In each run, we recorded the number of tasks lost at the main scheduler, the number of tasks executed immediately (arrival time equal to start time), and the response time and waiting time of each host. Over the 30 runs, we calculated the average values of the loss (blocking) probability $P_b$, the probability of immediate execution $P_{im}$, the mean response time $T$, and the mean waiting time $W$. We then calculated the 95% confidence interval (95% C.I.) of each metric to validate the corresponding metric in the analytical model.
(4) The analytical results are calculated from the analytical model with (21)–(24), using the same parameter settings: $\lambda$ = 157.5, 160.5, …, 187.5; $\mu$, $K$, and $\bar r$ as above; $p_i$ from (27); and $m_i$ and $s_i$ as shown in Table 1. A sketch of this analytical sweep follows this list.
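For reference, here is how the analytical side of this sweep can be computed with the functions sketched earlier. The host set, scheduler rate $\mu = 200$, capacity $K = 100$, and task length $\bar r = 1$ GI below are illustrative stand-ins, not the Table 1 values.

```python
# Analytical sweep for Experiment 1 (assumed parameters throughout).
m = [11] * 10                    # cores per host (assumed)
s = [2.0] * 10                   # core speed in GIPS (assumed; total 220 GIPS)
r = 1.0                          # mean task length in giga instructions (assumed)
p = dispatch_probabilities(m, s)
servers = [(si / r, mi) for mi, si in zip(m, s)]  # (mu_i, m_i) pairs

for lam in [157.5 + 3.0 * k for k in range(11)]:
    T, W, P_b, P_im = system_metrics(lam, mu=200.0, K=100, servers=servers, p=p)
    print(f"lam={lam:6.1f}  T={T:.4f}s  W={W:.4f}s  P_b={P_b:.2e}  P_im={P_im:.3f}")
```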
The simulation and analytical results for each arrival rate $\lambda$ are shown in Tables 2–5.
The comparisons between the analytical and simulation results ($\lambda$: 157.5, 160.5, 163.5, 166.5, 169.5, 172.5, 175.5, 178.5, 181.5, 184.5, 187.5) are shown in Figures 6–9.




By comparing the simulation results obtained from CloudSim with the analytical results calculated from the model, we can observe that the analytical results for $P_b$, $P_{im}$, $T$, and $W$ all fall within the 95% C.I., as shown in Tables 2–5 and Figures 6–9. This demonstrates that the analytical results agree with the CloudSim simulation results and confirms that the metrics $P_b$, $P_{im}$, $T$, and $W$ in the analytical model can be trusted at the 95% confidence level.
Simulation Experiments 2. We used the same dataset (task number: 100000; ST1 and ST2) as in simulation experiments 1 to complete 30 runs under a fixed arrival rate $\lambda$. In this experiment, we recorded the response and waiting times of each host in ST1 and ST2 ($T_i$ and $W_i$). We then computed the 95% confidence interval (95% C.I.) of $T_i$ and $W_i$ to validate these metrics in the model.
The analytical results are obtained by using (18) and (19).
In Tables 6 and 7, we present the simulation results and the 95% C.I. of $T_i$ and $W_i$ ($i = 1, \ldots, 10$) for every host in ST1 and ST2 under a fixed task request arrival rate $\lambda$.
The comparisons between the analytical and simulation results of the hosts are shown in Figures 10 and 11.


As shown in Tables 6 and 7 and Figures 10 and 11, the analytical results of $T_i$ and $W_i$ for all hosts in ST1 and ST2 fall within the 95% C.I. This demonstrates that the analytical results agree with the CloudSim simulation results and that the hosts’ analytical model can be trusted at the 95% confidence level.
After comparing the results, we also conclude the following:
(1) the relative errors between the simulation and the analytical results are less than 3.5%;
(2) under the same arrival rate $\lambda$, the performance metrics differ across different settings despite the same total computing ability.
(ii) Setting the Traffic Intensity. The traffic intensity/utilization $\rho_i$ of each execution server was set within a given range. Under this pattern, the task request arrival rate of the $i$th execution server is $\lambda_i = \rho_i m_i \mu_i$ according to (4). A group of $n$ multicore execution servers can then handle the effective arrival rate $\lambda'$:
$$\lambda' = \sum_{i=1}^{n} \lambda_i = \sum_{i=1}^{n} \rho_i m_i \mu_i. \tag{28}$$
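A small helper, under the definitions in (4) and (28), converts a chosen utilization setting into the per-host and total arrival rates; the function name is ours.

```python
# Per (4) and (28): lam_i = rho_i * m_i * mu_i, and lam_eff is their sum.
def arrival_from_utilizations(rho, m, mu):
    lam_i = [r_i * m_i * mu_i for r_i, m_i, mu_i in zip(rho, m, mu)]
    return sum(lam_i), lam_i     # (lam_eff, per-host arrival rates)
```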
Simulation Experiments 3. Let us consider using the same host set to deal with the same effective arrival rate $\lambda'$ given by (28), under designated traffic intensity/utilization ($\rho_i$) settings. The two ($\rho_i$) sets are shown in Table 8 as TIs1 and TIs2, respectively.
We completed 30 runs with these settings and took the mean values. The mean response time results of the execution queuing systems ($T_e$) are shown in Table 8.
From Table 8, we can see that the maximum relative error between the simulation and the analysis is 0.51%. By setting the traffic intensities of TIs1 and TIs2 for each host in the ST1 group, we obtain different mean response times for the execution queuing systems: $T_e$ falls from 0.571203 s under TIs1 to 0.5607632 s under TIs2, an improvement of 1.8%. This shows that the same hardware resources yield different performance under different traffic intensity settings for each host.
Comparing the results of our analytical calculation with the simulation data shown in Tables 2–8 and Figures 6–11, all of the analytical results have relative errors of less than 3.5% and fall within the 95% C.I. of the corresponding metrics. This shows that our analysis agrees with our simulation results, which confirms the validity of our analytical model.
5.2. Numerical Examples
We present some numerical examples to analyze the relationships among the performance metrics using the analytical model.
(i) Numerical Example 1. We use the two groups of execution servers (hosts in ST1 and ST2) shown in Table 1, which have the same total computing ability, to handle an arrival rate $\lambda$. The execution servers in ST1 and ST2 are sorted by computing ability (computing power). In the experiments, we adjust the traffic intensity of each execution server so that we can analyze the mean response time and the mean waiting time of the system. We select 16 sets of traffic intensities, each containing 10 values ($\rho_1, \ldots, \rho_{10}$) to fit the 10 execution servers.
The results of the experiments are shown in Figures 12 and 13. In the 16 experiments, we set similar traffic intensities for the $i$th execution servers in ST1 and ST2. Although ST1 and ST2 have the same total computing ability, the mean response time and the mean waiting time of the two systems differ significantly.


In Figures 12 and 13, we can see that adjusting the traffic intensity (utilization) of each execution server improves the response and waiting times. In ST1, the mean response time decreases from 0.707159 s to 0.554670 s, an improvement of 21.56%, and the mean waiting time decreases from 0.201998 s to 0.049509 s, an improvement of 75.49%. In ST2, the mean response time decreases from 0.523104 s to 0.333339 s, an improvement of 36.27%, and the mean waiting time decreases from 0.240814 s to 0.067228 s, an improvement of 72.08%. The traffic intensity (utilization) of each execution server thus has a significant impact on the performance of the heterogeneous data center.
The configuration of a server cluster in a heterogeneous data center is another important factor that greatly impacts the system’s performance. Again, Figures 12 and 13 show the results of our 16 experiments: the mean response time of ST2 is better than that of ST1 under similar traffic intensity (utilization) settings, improving by 35.18% on average and by up to 39.9%. However, the mean waiting time of ST2 is worse than that of ST1, degrading by 24.95% on average and by up to 29%. It is important for a cloud provider to configure the server cluster into a reasonable structure to provide services for customers. This conclusion is confirmed in numerical example 2.
(ii) Numerical Example 2. Let us consider a third group of execution servers, named ST3, that includes 10 servers, each with an identical configuration of cores and core speed ($i = 1, \ldots, 10$). The other conditions are the same as in simulation experiments 1. Assume that the arrival rate is set from 150 to 187.5, which keeps the traffic intensity of each execution server within the utilization threshold range.
The performance results ($T$, $W$, and $P_{im}$) of the three groups (ST1, ST2, and ST3) of execution servers are shown in Figures 14–16. Different configurations of server clusters yield different performance results. ST3 gives the best mean response time and ST1 the worst, as shown in Figure 14. ST1 gives the best mean waiting time and ST3 the worst, as shown in Figure 15. In Figure 16, ST1 has the maximum immediate execution probability, while ST3 has the minimum. We can use this queuing model to estimate the performance of a server cluster with a given configuration under different $\lambda$. To improve the mean response time, configuring the server cluster in a reasonable structure is an efficient method.



In practice, users will present some constraints when they ask cloud computing providers for services, such as:
(1) the task request arrival rate $\lambda$;
(2) the mean response time $T$;
(3) the mean waiting time $W$ or the immediate execution probability $P_{im}$.
Cloud computing providers could configure the server cluster to serve customers under certain constraints by using our queuing model. In our future work, we will use the complex queuing model to further research dynamic cluster technology in a heterogeneous data center to learn how it handles different tasks with different arrival rates under various constraints.
(iii) Numerical Example 3. One application of our analytical model is to optimize system performance by finding a reasonable traffic intensity setting for each execution server. Numerical example 3 shows the optimization under the condition $m_i = 1$ to obtain the values of $\rho_i$.
Assume that three single-core execution servers with different speeds are used to handle the arrival rate $\lambda'$. Equations (25) and (26), the execution server utilization optimization model based on the arrival rate $\lambda'$, can be solved by a numerical method to obtain the minimum mean response time of this system under the condition $m_i = 1$. As Figure 17 shows, the mean response time is minimized at a particular setting of $\rho_1$, $\rho_2$, and $\rho_3$. Controlling the traffic intensity (or utilization) of each execution server allows for the control of the dispatch probability of each execution server and the optimization of system performance. A numerical sketch of this search is given below.
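The following brute-force sketch searches the feasible load splits of (26) for three single-core servers; since the example’s exact speeds and arrival rate are not reproduced here, the values below are illustrative assumptions.

```python
# Brute-force search for the optimal load split of (26) with m_i = 1.
mu = [40.0, 60.0, 100.0]   # per-server service rates (assumed)
lam_eff = 150.0            # effective arrival rate lambda' (assumed)

def T_mix(lams):
    # Objective of (26): response-time mixture over M/M/1 servers.
    if any(l < 0 or l >= m for l, m in zip(lams, mu)):
        return float("inf")                    # infeasible split
    return sum((l / lam_eff) / (m - l) for l, m in zip(lams, mu))

best = (float("inf"), [0.0, 0.0, 0.0])
steps = 400
for i in range(steps + 1):
    for j in range(steps + 1 - i):
        l1 = lam_eff * i / steps
        l2 = lam_eff * j / steps
        split = [l1, l2, lam_eff - l1 - l2]    # enforces sum(lams) = lam_eff
        best = min(best, (T_mix(split), split))

T_min, split = best
print(f"T_min = {T_min:.4f} s,",
      ["rho_%d = %.3f" % (k + 1, l / m) for k, (l, m) in enumerate(zip(split, mu))])
```

A finer grid (or a constrained solver) refines the optimum; the point is that the search over $\rho_i$ is cheap for a small number of servers.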

The traffic intensity (or utilization) of each execution server will impact the response time, computing resource allocation, and energy usage of the system; therefore, we will further optimize the execution server utilization optimization model in future work.
6. Conclusions and Future Work
Performance analysis of heterogeneous data centers is a crucial aspect of cloud computing for both cloud providers and cloud service customers. Based on an analysis of the characteristics of heterogeneous data centers and their service processes, we propose a complex queuing model composed of two concatenated queuing systems—the main schedule queue and the execution queue—to evaluate the performance of heterogeneous data centers. We theoretically analyzed mean response time, mean waiting time, and other important performance indicators. We also conducted simulation experiments to validate the complex queuing model. The simulation results and the calculated values demonstrate that the complex queuing model provides results with a high degree of accuracy for each performance metric and allows for a sophisticated analysis of heterogeneous data centers.
We have further conducted some numerical examples to analyze factors such as the traffic intensity (or utilization) of each execution server and the configuration of server clusters in a heterogeneous data center; these are factors that significantly impact the performance of the system. Based on this complex queuing model, we plan to extend our analytical model to the study of dynamic cluster technology in a heterogeneous data center. In doing so, we aim to help service providers optimize resource allocation using the execution server utilization optimization model.
Notations
$K$: The finite capacity of the scheduler queuing system
$\mu$: The scheduling rate of the master server
$\lambda'$: The master server throughput rate / the effective arrival rate of the execution queuing systems
$p_i$: The probability that execution server $S_i$ is dispatched to execute a task / the probability that a task request is sent to execution server $S_i$
$\lambda_i$: The arrival rate of task requests distributed to execution server $S_i$
$S_i$: The $i$th node / the $i$th execution server
$m_i$: The number of cores in the $i$th execution server
$\mu_i$: The execution rate of a core in the $i$th execution server
$s_i$: The core execution speed of the $i$th execution server (measured in giga instructions per second (GIPS))
$\bar r$: The mean number of instructions in a task to be executed on an execution server (giga instructions)
$\rho_i$: The utilization / traffic intensity of the $i$th execution server
$n$: The number of execution servers in the data center.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
This project is supported by the Funds of Core Technology and Emerging Industry Strategic Project of Guangdong Province (Project nos. 2011A010801008, 2012A010701011, and 2012A010701003), the Guangdong Provincial Treasury Project (Project no. 503-503054010110), and the Technology and Emerging Industry Strategic Project of Guangzhou (Project no. 201200000034).