A Two-Step Job Scheduling Algorithm Based on Priority for Cloud Computing

  • cc icon
  • ABSTRACT

    Cloud systems are popular computing environment because they can provide easy access to computing resources for users as well as efficient use of resources for companies. The resources of cloud computing are heterogeneous and jobs have various characteristics. One such issue is effective job scheduling. Scheduling in the cloud system may be defined as a multiple criteria decision model. To address this issue, this paper proposes a priority-based two-step job scheduling algorithm. On the first level, jobs are classified based on preference. Resources are dedicated to a job if a deadline failure would cause severe results or critical business losses. In case of only minor discomfort or slight functional impairment, the job is scheduled using a best effort approach. On the second level, jobs are allocated to adequate resources through their priorities that are calculated by the analytic hierarchic process model. We then analyze the proposed algorithm and make a scheduling example to confirm its efficiency.


  • KEYWORD

    Analytic hierarchic process , Cloud computing , Job Scheduling , Priority

  • III. TWO-STEP JOB SCHEDULING SCHEME BASED ON PRIORITY

    The proposed scheduling scheme in this paper is composed of two steps. In the first step, the priority of a job is calculated based on the degree of importance of each job. In the second step, the algorithm selects a VM on which to run the job. Every candidate VM is weighted using the AHP model.

      >  A. First Step: Job Classification Based on Its Degree of Importance

    As all resources in cloud computing are dedicated to jobs, it is advantageous for a cloud system to secure the resources needed in advance. In the case of data-oriented jobs, this resource provisioning will be an especially important factor in performance improvement because the required storage as well as bandwidth increases with time passing. Thus, our scheme classifies all jobs into one of two categories: resource provisioning or best effort.

    In this step, the jobs for which the user pays a high cost or that are of higher importance belong to the resource provisioning case. If scheduling failures of jobs may bring about severe business loss, the jobs are classified to the first level of priority or the highest level. If the failures may cause considerable after-effects, these jobs belong to the second level of priority. Jobs for which the user pays regular or low costs or are of medium importance belong to the best effort case. If scheduling failures of jobs may bring about a trivial inconvenience to a small number of users, the jobs are classified into the third level of priority. If the failures may cause minor functional impairments, these jobs belong to the fourth level or the lowest priority. The classified jobs are queued in order in the cloud system scheduler. The sequences are fixed, and the jobs are executed in a nonpreemptive way. The jobs with the same priority are served in a round-robin manner.

      >  B. Second Step: Job Allocation to VM Based on the AHP Model

    The AHP is a kind of MCDM that helps to choose one of the alternatives. Each selection has a related attribute attached to it, and the weights of each attribute are set. Therefore, the AHP model can select the best choice out of the list of alternatives. The merit of the AHP is that it considers variable parameters for many alternatives and generates the result that best matches the parameters [10]. The proposed scheme uses an AHP model for the VM selection.

    As Fig. 1 shows, the objective level in the proposed algorithm is to place a job in a VM that it best matches under many different parameters. The attribute level in our algorithm is composed of user requirements or decision criteria such as response time, system utilization, and cost, and the alternative level represents all VMs in a cloud system.

    As each parameter in the AHP model has a preference, the proposed scheme also has a preference from 1 to 9. The preference 9 is the highest one. The values 2, 4, 6, and 8 are intermediate values for the preference, and the inverse number represents the counterpart preference. Therefore, these preference values are used in representing and calculating the requirements of the job as well as parameters of each VM, such as the response time, system utilization, and cost.

    Suppose that a set of jobs, which are scheduling objects in cloud environment, is ζ = {J1,J2,…,Jm}, and the criteria set is ψ = {C1,C2,…,Cn}, and set of VMs is ξ = {VM1, VM2,…,VMv}.

    The VM allocation is done in two steps. In the first step, the algorithm calculates the weight vector by pairwise comparison and does a consistency check for both of the levels. In the second step, the best VM is selected by multiplying the two weight vectors of each level. The process is explained in detail below.

    Sub-step 1: Calculation of the weight matrix and consistency index between the objective and attribute level.

    image

    Suppose that the pairwise comparison matrix between the objective level and attribute level is named “pairwise comparison matrix (PCM)” in this paper. This PCM represents the preference for a job under all criteria such as responsiveness, system utilization, and cost. The PCM is an n by n matrix like Eq. (1). If the element of PCM, (i,j) is 5, the preference of PCM(j,i) will be 1/5. There are n pairwise comparison matrices for all criteria, which are created according to the priority of the decision criteria. For each of the comparison matrices, the scheme should compute a priority vector (vector of weights). The priority vector can be obtained by solving Eq. (2). The λmax is the principal eigenvalue of PCM and is denoted by the corresponding eigenvector ωcriteria. With any arbitrary comparison of PCM, the model can produce a vector of weights such as ωcriteria = {ω1, ω2, …, ωn}. An essential step in this model is to obtain a vector of weights. The vector of weights can be computed through Eq. (2). A positive n by n matrix has the ratio form PCM = (wi/wj), i,j = 1, ..., n, if, and only if, it is consistent. The matrix of ratios PCM = (wi/wj) is consistent if and only if n is its principal eigenvalue and PCM?ωcriteria = λmax?ωcriteria. Further, ωcriteria > 0 is unique to within a multiplicative constant.

    image

    Saaty [10] has defined the consistency ratio (CR) as Eq. (3).

    image

    In Eq. (3), RI is the random index, which is randomly calculated based on the rank of the comparison matrix. Eq. (3) also uses the RI values, which were calculated by Saaty [10]. If CR < 0.1, then the PCM should be considered consistent.

    Sub-step 2: Calculation of the weight matrix and consistency index between the attribute and alternative level.

    The next step is also to calculate the weight matrix and consistency index for each decision criterion to all VMs. Like Eqs. (1)?(3), this scheme determines the pairwise comparison matrix and calculates the weight matrix and consistency ratios for each of the decision criteria to each VM.

    Suppose that the weight vector for each VM is ωvm={ω12,…,ωv}, and ωvm is a v by n matrix. The v is the number of the VMs and the n is the number of the decision criteria. We then obtain the final score for each VM by multiplying the weight vector of each sub-step 1 by the weight matrix of the sub-step 2. The number of elements of the score vector in Eq. (4) is v. The index of maximum value means a VM that will execute the job.

    image

    where {ωvm is v by n, ωcriteria is n by 1} .

    IV. ANALYSIS AND EXAMPLE OF THE PROPOSED SCHEDULING ALGORITHM

    First, we analyze the proposed scheme by time complexity. As the priority classification of the first step is determined by job characteristics or requirements, the time complexity may be trivial. The computation of the pairwise comparison matrix and CR occupies a huge portion of complexity in the proposed scheme, such as Eq. (5).

    image

    In Eq. (6), n is the number of the decision criteria and v is the number of the VMs, and d denotes the number of the eliminated matrices because of inconsistency while che-cking CR. In Eq. (6), n2.38 are the additions and the multiplications in the calculation of the weight vector from the objective level to the attribute level. The n×v2.38 is the arithmetic operations in the computation of the weight matrix from the attributes level to the alternative level. Furthermore, if the candidate pairwise matrix is rejected in the consistency check, the matrix must be recalculated in d×n2.38 times. Thus, the final time complexity may be determined by Eq. (6).

    image

    The time complexity for multiplying n by n matrices requires O(n3) multiplications that are of worst case complexity. The fastest known algorithm, devised by Don Coppersmith and Shmuel Winograd, runs in O(n2.38) time [11]. Most researchers believe that an optimal algorithm will run in essentially O(n2) time, yet until recently, no further progress has been made in finding one [12]. Therefore, the best case complexity is O(n2.38), and the worst case com-plexity of the proposed scheme is O(n3), and a general case of time complexity is n2.38 or v2.38. However, as the number of VMs is generally greater than the decision criteria, the complexity of the proposed algorithm should be described as n×v2.38.

    The following is an example of the proposed scheduling algorithm. Table 1 is a sample preference matrix of a job.

    The following Eq. (7) is a PCM and weight vector.

    image

    For obtaining the CR, we first multiply the preference matrix by the eigenvalue of the weight vector as shown below Eq. (8):

    image

    From Eqs. (2) and (3), we now can get the λmax=3.004 and CR is 0.002. As the CR is smaller than 0.1, this preference matrix must be consistent.

    Finally, the score of each VM is calculated by multiplying the weight matrix (ωvm) in the alternative level by the weight vector (ωcriteria) in the attribute level through Eqs. (4)?(9), which follows, is the score vector for each VM. Because VM3 recorded the maximum score (0.375), the proposed scheduler should select VM3 to execute the job.

    image

    Next, we calculate the weight vector and CR of each VM for the response time, system utilization, and cost. Tables 24 show these weight vectors and CRs. As all CRs are less than 0.1, all preference matrices must be consistent.

    V. CONCLUSION

    Job scheduling is an important problem in cloud computing, which is naturally composed of heterogeneous resources. We defined this issue as MCDM. To settle this issue, this paper introduced a two-step job scheduling scheme based on priority, with consideration of both the job characteristics and the MCDM. In the first step, the priority of a job was classified in 1 of 4 levels, which are determined by importance attributes. In the second step, we applied the AHP process in job allocations to the VM to tackle the MCDM issue. We adopted responsibility, system utilization, and cost of jobs as decision criteria. We then analyzed the time complexity of the proposed scheme and confirmed that the proposed algorithm showed acceptable complexity. In the future, work will be carried out aiming to minimize the complexity and to implement a real scheduler in a sample cloud system.

  • 1. Baraglia R., Capannini G., Dazzi P., Pagano G. 2013 “A multicriteria job scheduling framework for large computing farms” [Journal of Computer and System Sciences] Vol.79 P.230-244 google doi
  • 2. Deelman E. 2010 “Grids and clouds: making workflow applications work in heterogeneous distributed environments” [International Journal of High Performance Computing Applications] Vol.24 P.284-298 google doi
  • 3. Yu J., Buyya R., Tham C. K. 2005 “Cost-based scheduling of scientific workflow applications on utility grids” [in Proceedings of the 1st IEEE International Conference on e-Science and Grid Computing] P.5-8 google
  • 4. Yu J., Buyya R. 2006 “A budget constrained scheduling of workflow applications on utility grids using genetic algorithms” [in Proceedings of the 15th IEEE International Symposium on High Performance Distributed Computing] P.19-23 google
  • 5. Padala P., Shin K. G., Zhu X., Uysal M., Wang Z., Singhal S., Merchant A., Salem K. 2007 “Adaptive control of virtualized resources in utility computing environments” [in Proceedings of the 2nd ACM SIGOPS/EuroSys European Conference on Computer Systems] P.289-302 google
  • 6. Yu Z., Shi W. 2008 “A planner-guided scheduling strategy for multiple workflow applications” [in Proceedings of the 37th International Conference on Parallel Processing Workshops] P.1-8 google
  • 7. Xu M., Cui L., Wang H., Bi Y. 2009 “A multiple QoS constrained scheduling strategy of multiple workflows for cloud computing” [in Proceedings of IEEE International Symposium on Parallel and Distributed Processing with Applications] P.629-634 google
  • 8. Kosinska J., Kosinski J., Zielinski K. 2010 “The concept of application clustering in cloud computing environments: the need for extending the capabilities of virtual networks” [in Proceedings of the 5th International Multi-Conference on Computing in the Global Information Technology] P.139-145 google
  • 9. Ghanbaria S., Othman M. 2012 “A priority based job scheduling algorithm in cloud computing” [in Proceedings of the International Conference on Advances Science and Contemporary Engineering 2012] P.778-785 google
  • 10. Saaty T. L. 2012 Decision Making for Leaders: The Analytical Hierarchy Process for Decisions in a Complex World. google
  • 11. Strassen V. 1969 “Gaussian elimination is not optimal” [Numerische Mathematik] Vol.13 P.354-356 google doi
  • 12. Robinson S. 2005 “Toward an optimal algorithm for matrix multiplication” [SIAM News] Vol.38 google
  • [Fig. 1.] The analytic hierarchic process model to place jobs on the virtual machine (VM).
    The analytic hierarchic process model to place jobs on the virtual
machine (VM).
  • [Table 1.] Preference matrix example of a job
    Preference matrix example of a job
  • [Table 2.] λmax and consistency ratio (CR) for response time
    λmax and consistency ratio (CR) for response time
  • [Table 3.] λmax and consistency ratio (CR) for system utilization
    λmax and consistency ratio (CR) for system utilization
  • [Table 4.] λmax and consistency ratio (CR) for cost
    λmax and consistency ratio (CR) for cost