An Efficient Scheduling Method for Grid Systems Based on a Hierarchical Stochastic Petri Net

  • cc icon
  • ABSTRACT

    This paper addresses the problem of resource scheduling in a grid computing environment. One of the main goals of grid computing is to share system resources among geographically dispersed users, and schedule resource requests in an efficient manner. Grid computing resources are distributed, heterogeneous, dynamic, and autonomous, which makes resource scheduling a complex problem. This paper proposes a new approach to resource scheduling in grid computing environments, the hierarchical stochastic Petri net (HSPN). The HSPN optimizes grid resource sharing, by categorizing resource requests in three layers, where each layer has special functions for receiving subtasks from, and delivering data to, the layer above or below. We compare the HSPN performance with the Min-min and Max-min resource scheduling algorithms. Our results show that the HSPN performs better than Max-min, but slightly underperforms Min-min.


  • KEYWORD

    Grid computing , Hierarchical stochastic Petri net (HSPN) , Resource scheduling , Resource allocation , Modeling

  • I. INTRODUCTION

    Recent trends in parallel processing system design and deployment have been toward the development of distributed network systems [1]. In particular, grid computing systems, in which networks of computers from multiple administrative domains are integrated to create large-scale virtual computers, are becoming more prevalent. Many different groups have been involved in a collaborative effort to establish so-called virtual organizations (VOs). A VO may be established to perform a single task, and then disappear just as quickly [2,3]. Grid computing has evolved beyond its roots in academic science, and is currently on the threshold of mainstream commercial adoption. One of the promises of grid computing is that it will enable applications to be executed across multiple sites. Grid computing technology is developing rapidly, and many have begun to run large-scale parallel applications on these systems [4]. Some of the applications require coordinated access to resources that are managed by autonomous entities. This coordinated access is known as resource scheduling.

    In this paper, we address the problem of resource scheduling for global resource sharing in a grid computing environment. Resource scheduling is a crucial issue for the development of stable and effective production grid computing infrastructures [5]. However, since grid computing resources are distributed, heterogeneous, dynamic, and autonomous, deciding how to allocate the grid resources to meet the requirements of applications is a challenging and complex issue.

    We propose a new approach to grid computing resource scheduling, the hierarchical stochastic Petri net (HSPN), to optimize grid resource sharing. Stochastic Petri nets are modeling tools that can easily be used to analyze and evaluate complex models of discrete event dynamic systems (DEDSs) for performance and reliability. Probabilistic models are automatically constructed, based on a set of results underlying the dynamic behaviors of these nets, from which the theory of untimed Petri nets is derived [6]. The HSPN categorizes resource requests in three layers, where each layer has special functions to receive subtasks from, and deliver data to, the layer above or below. This hierarchical model reduces the complexity of scheduling [7]. We compare the performance of the HSPN with the Min-min and Max-min resource scheduling algorithms, and our experiments show that the HSPN performs better than Max-min, but slightly underperforms Min-min.

    The rest of this paper is organized as follows. Section II presents related work on grid computing resource scheduling, focusing on Petri nets. Section III outlines the grid resource scheduling problem, and Section IV discusses the HSPN, our proposed approach. Section V evaluates the performance of the HSPN, and compares it with the performance of Min-min and Max-min. Finally, Section VI presents our conclusions, including the advantages and disadvantages of the HSPN, and discusses future research directions.

    II. RELATED WORK

    Scheduling and resource allocation are important issues for grid computing. Considerable research over the last decade has addressed the problem of job scheduling for parallel and distributed systems. Abawajy and Dandamudi [7,8] propose an online dynamic scheduling policy that manages multiple job streams, with the aim of improving the mean response time and system utilization. Yu and Buyya [9] present a general framework to facilitate directed acyclic graph (DAG) scheduling in grid systems. However, their research does not account for dynamic changes in grid computing resources. Several Petri net models, such as extended time Petri nets, colored Petri nets (CPNs), and stochastic Petri nets (SPNs), have been developed to address dynamic changes, and are considered to be effective tools for scheduling and resource allocation in grid computing systems [10-12].

    Generally, Petri nets assign tasks to grid resources, using either a distributed scheme, or a hierarchical scheme [13,14]. In a distributed scheme, a broker considers resources to be distributed states for each request, sends the request to the sites that contain the distributed resources, and receives results from the distributed sites’ coordinators. In a hierarchical scheme, requests for resources from a site are arranged hierarchically. Resources are classified based on their geographical location in relation to the site making the request, and sites have coordinators at different levels. Tasks are sent through the hierarchical brokers, to sites that have the needed resources. A distributed scheme is better than a hierarchical scheme, when all tasks are locally requested on sites; a request is sent directly to the brokers on sites where the resource is available. Otherwise, a hierarchical scheme is better for resource allocation and responding to requests. In a hierarchical scheme, scheduling occurs in three layers, and each site has one broker and scheduler, to control and manage allocation of its resources.

    A three-layer model based on a hierarchical time Petri net (HTPN) is presented in [10], with different Petri net models constructed for each level. The Petri net models that the HTPN uses for grid resource scheduling are different from the models in our proposed method. Also, the HTPN focuses on independent tasks, and does not consider dependent tasks. Dependent task scheduling is presented in [11], using an extended time Petri net. The grid resource scheduling model based on Petri nets is extended in [12], which includes a discussion of existing studies of grid scheduling using Petri nets, and proposes a four-level scheduling algorithm that considers independent tasks. In the present paper, we propose the HSPN, a hierarchical scheme that uses SPNs to schedule and allocate resources, based on the hierarchical and distributed schemes presented in [15]. Our preliminary version of this paper, which is published in [16,23], introduced the algorithm, without considering the models in detail.

    III. THE GRID RESOURCE SCHEDULING PROBLEM

    The efficiency of our proposed algorithm for independent task scheduling problem depends on minimizing makespan. When scheduling tasks in a grid network, the broker should map n tasks T = T1, T2, …, Tn to m available resources R = R1, R2, …, Rm in a way that minimizes the makespan, which is the maximum completion time for these tasks, as defined in Equation (1).

    image

    where the ETC matrix gives each task’s execution time for each resource. It is assumed that every task executes on some resource, and is not mapped to another machine, until its execution ends.

    As an example, given two resources {R1, R2} and eight tasks {T1, T2, T3, T4, T5, T6, T7, T8}, ETC might be as shown in Equation (2). The values show the runtime of each task on each resource, without considering the resource’s queue. When we also take into account the fact that each resource has several tasks in its queue, we arrive at the makespan, as defined above. Each resource has a certain execution rate per one million instructions (MI) so it is better to separate tasks into resource groups based on their time and cost consumption. Each task includes a certain number of instructions, which is used in Equation (3) to calculate the task’s execution time on each resource. The number of instructions for each task comes from a uniform distribution on [100 MI, 110 MI].

    image

    For each (i,j):

    image

    If we sort the tasks in the present example in order of their execution times, they will fall into the two groups

    shown in Table 1. In this table, each task’s deadline is chosen at random.

    The queue for R1 in Table 1 has T2 as the first task and T8 as the last task, while the queue for R2 has T5 as the first task and T4 as the last task. Based on Equation (1), the makespan for these tasks is shown in the series of equations in (4).

    image

    Providers will charge customers based on the time and cost of each task per second (makespan).

    IV. OUR PROPOSED METHOD (HSPN)

    In this section, we introduce the HSPN, our proposed three-level grid resource scheduling model.

    Fig. 1 shows the high-level structure of the HSPN. Resources are connected through a three-level hierarchical network. The first level is a wide area network (WAN) that connects local area networks (LANs). The second level is composed of the LANs that connect computing resources (personal computers and high performance computers), and the third level connects storage resources and other resources. In this three-level scheduling scheme, unlike the hierarchical and distributed schemes, users submit all tasks to a home scheduler at their own site, rather than to a grid scheduler, which preserves the autonomy of grid resources, and makes it convenient for users to submit and supervise tasks. In addition, the threelevel scheme adds a local scheduler between the home scheduler and the grid scheduler, which not only removes some pressure from the grid scheduler, but also makes it possible for tasks to be executed locally.

    Table 2 explains the notation for the HSPN model, showing the places and transitions that we use to simulate our hierarchical scheduling.

    Each request for a grid resource is assessed and analyzed by the resource’s local scheduler. Each task can be divided into several independent subtasks. Users submit tasks to their home scheduler with processing requirements, such as estimated processing time, estimated communication time, deadline, and degree of parallelism.

    We have modeled these schedulers in the HSPN. The processes of task submission and assignment are as follows:

    A user submits a task to the home scheduler through the home machine. The home scheduler analyzes the submitted task. If the task can be completed within

    the deadline on the home machine, then the task will be executed on the home machine. Otherwise, the task is sent to the local scheduler.

    When the local scheduler receives the task submitted by the home scheduler, it decides whether the task can be completed within its deadline in the local area network. If so, the local scheduler assigns subtasks of the task to machines in the local area network, according to some algorithm. Otherwise, the task is sent to the grid scheduler.

    When the grid scheduler receives a task submitted by a local scheduler, it inserts the task into its queue of tasks.

    The HSPN model, which depends on hierarchical scheduling, as shown in Fig. 1, is illustrated in Fig. 2. First, the home scheduler queue is searched. If the requested resource is not found on the home machine, then the request is sent to the local site coordinator, to try to find the best resource in the site, based on the QoS specified by the task parameters. The local scheduler searches the machines in its site. If the best resource is not found in that site, the coordinator sends a request to the site’s broker, to contact brokers at other sites, to search for the resource in their sites. In this way, the grid scheduler enables a task to be allocated resources in other sites.

    We implemented this model using GridSim [17], a resource scheduling simulator for grid systems. In Grid- Sim, entities use events for both service requests, and service deliveries. An event can be raised by any entity for either immediate delivery, or delivery with a specified delay, to itself or other entities. Events that originate from the same entity to which they are delivered, are called internal events, and those that originate from external entities, are called external events. GridSim protocols are used to define entity services. An event is called synchronous, when the event source entity waits until the event destination entity performs all the actions associated with the event (i.e., until the full service is delivered). An event is called asynchronous, when the event source entity that raises the event continues with other activities, without waiting for its completion. When the destination entity receives events or service requests, it responds by sending back one or more events, which can then take appropriate actions. External events can be synchronous or asynchronous, but internal events must be raised only as asynchronous events, to avoid deadlocks [17].

    V. EXPERIMENTAL RESULTS

    In this section, we evaluate the performance of our proposed approach, and compare it to the performance of two existing approaches.

      >  A. Existing Scheduling Algorithms

    We compare the HSPN with the Min-min and Maxmin algorithms [18-21], briefly explained as follows.

    1) Min-min computes the minimum completion time for each task over all machines. The task with the minimum of these completion times is selected, and assigned to the corresponding machine. The newly mapped task is removed from its waiting queue, and the process repeats, until all tasks are mapped.

    2) Max-min is very similar to Min-min. The minimum completion time is calculated for each task, but the task with the maximum of these completion times is selected, and assigned to the corresponding machine.

    We use identical uniform parameters for the Min-min and Max-min algorithms, when comparing them with the HSPN.

      >  B. Testing Environment

    We implemented the HSPN using GridSim package in NetBeans IDE 6.0 [22], each of which has four homogenous processors and a 1,000 GB storage unit. However, the machine at each site is different from the machines at the other sites, so that each site is heterogeneous with respect to the others.

    There are 100 tasks that originate on each machine, and each task has an associated execution time, and execution cost. We used 100 tasks for each machine, because of the constraints of the underlying system environment; a greater number of tasks would not leave enough space in Java’s heap for the grids. Table 3 shows the task states: there are four states, which are based on the resources that need to be run. Each task has a certain number of

    instructions, selected as a uniform distribution on [100 MI, 110 MI]. There are eight resources, and the resources each task needs to execute its instructions are defined by a uniform distribution on the range [1,9], rounded up to an integer. Each resource has a queue of tasks that need that resource, and tasks are added to these queues, along with the number of instructions that comprise the tasks. So, when tasks are assigned, they are added to the resources that are needed. This means that each resource has a queue of tasks each which has an instruction number. All tasks are executed 100 times, to determine the best way to assign resources to tasks. The thinnest requests, for which it is easy to find resources, need only one or two resources, while the thickest requests need seven or eight.

    The 100 tasks originating from each machine are separated into four groups, as shown in Table 4. The requests for each group of 25 are ordered, from the thinnest request to the thickest request. The four groups represent the four different states that tasks can be in, and are ordered based on their ETC values. This provides material for four tests in the HSPN, with 100 iterations to provide the best solutions.

    The following section presents the results.

      >  C. Performance Metrics

    We used average response time and average response cost as metrics to evaluate the performance. These parameters are defined as follows:

    Average response time per request (ART): This metric evaluates the intervals between when a resource allocation request is received, and when the response is sent. The response time in HSPN is calculated as the sum of the times needed to pass through the three layers, and depends on the request; we can say that ART is the makespan for each task.

    Average response cost per request (ARC): We define the response cost as the delivered data cost or resource cost for a task. All resource scheduling algorithms consider this metric. They usually consider the closest resources to the task site whose costs are less than the cost indicated for the task. However, it is possible for some of these resources to have costs that are slightly greater than the indicated cost. For example, if T1 wants a resource that costs less than $300, we search for resources that cost less than $300 + ((1/100) × 300), or $303. The ARC is the cost that the schedulers find for a response accepted by the user. In the HSPN, the ARC is the sum of the costs of passing through the three layers, and depends on the request. Some requests do not need to be sent to the grid layer or the local layer, so their cost is less than the cost for other methods.

      >  D. Discussion of Results

    Fig. 3 illustrates the ART per request coming from each user. The plot shows that Min-min is better than both Max-min and the HSPN for the first 25 tasks, with the HSPN falling between Min-min and Max-min. We considered the maximum time to be 36,000 seconds, and the maximum cost to be $3,000. The times and costs increase from the 26th task to the 100th. As Fig. 3 shows, HSPN is near Min-min in ART, and it is better than Minmin for thick requests from the fourth class of the 25th task group. Moreover, the HSPN decreases the ART for

    Max-min by approximately 11%?12%. These results reflect the fact that some tasks found better resources, when searching the grid category.

    Fig. 4 presents the average allotted time for each task for the three algorithms, and shows that the HSPN consumes an average of more than 150 units of time for the tasks, compared to Min-min; this is due to the context switches between layers. However, the HSPN consumes over 1,500 units less than Max-min, and hence decreases its average time allocation by approximately 10%.

    Fig. 5 illustrates the ARC for each user’s requests. The plot shows that for the first 25 tasks, which do not mention cost and time, all three algorithms deliver high value for each task; but for the second group of tasks, we considered a maximum cost of $3,000. The scheduling should involve only the machines whose cost is less than the cost indicated for each task, which is Cost (Task) + ((1/100) ×

    Cost (Task)). As we can see, the HSPN decreases the cost by bounding the resources, but the decrease is not better than that obtained by Min-min. However, it is better than the cost obtained by Max-min. For thick requests, the performance of the HSPN is similar to Min-min, but sometimes it is better.

    This is clearer in Fig. 6, which shows the average costs for tasks in the three algorithms. As can be seen, the average cost for the HSPN is about 170 units higher than for Min-min; this is related to the HSPN’s context switch between layers. On the other hand, the HSPN decreases Max-min’s average cost for 100 tasks by approximately 200 units, or approximately 11%.

    VI. CONCLUSIONS AND FUTURE RESEARCH DIRECTIONS

    No method for scheduling and allocating resources is optimal for grid computing systems, because these systems are heterogeneous, dynamic, and expanding worldwide. This paper presented a three-level resource scheduling scheme for grid computing environments. The HSPN’s resource scheduling scheme is modeled and analyzed by a hierarchical SPN, with different SPN models scheduling the resources of different layers (i.e., the home scheduler, the local scheduler and the grid scheduler).

    We compared the HSPN with two major resource scheduling algorithms that are applicable in grid systems: Min-min and Max-min, and we used netbeans 6.0 IDE to analyze the results in a GridSim Jar file We used three queues for the schedulers, with Gridlet brokers sending requests between the layers. We evaluated the HSPN for 100 tasks, and found that it can decrease the execution time for tasks in on the order of 10%?11%, compared to Max-min. Moreover, the HSPN can be used for scheduling and allocating resources to tasks or subtasks. The HSPN can separate tasks based on their requested resources, but the other two methods cannot do so. Hence, the HSPN could see widespread use for scheduling in grid computing.

    In the future, we plan to extend and implement our method for dependent tasks based on artificial intelligence algorithms, in order to decrease the scheduling time, and the time needed for resource allocation. We plan to improve the HSPN, to provide better results in comparison to the Min-min method.

  • 1. Javadi B., Abawajy J. H., Akbari M. K. 2006 “Modelling and analysis of heterogeneous loosely-coupled distributed systems” google
  • 2. Koole G., Righter R. 2008 “Resource allocation in grid computing” [Journal of Scheduling] Vol.11 P.163-173 google
  • 3. Foster I., Sakellarius R. 2001 “The anatomy of the grid: enabling scalable virtual organizations,” Euro-Par 2001 Parallel Processing, Lecture Notes in Computer Science vol. 2150 P.1-4 google
  • 4. Wei X., Ding Z., Xing S., Yuan Y., Li W. W. 2009 “VJM: a novel grid resource co-scheduling model for parallel jobs” [International Journal of Grid and Distributed Computing] Vol.2 P.1-12 google
  • 5. Abawajy J. H. 2009 “Adaptive hierarchical scheduling policy for enterprise grid computing systems” [Journal of Network and Computer Applications] Vol.32 P.770-779 google
  • 6. Balbo G., Brinksma E. 2001 ”Introduction to stochastic Petri nets,” Lectures on Formal Methods and Performance Analysis, Lecture Notes in Computer Science vol. 2090 P.84-155 google
  • 7. Abawajy J. H., Dandamudi S. P. 2003 “Scheduling parallel jobs with CPU and I/O resource requirements in cluster computing systems” [in Proceedings of the 11th IEEE/ACM International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunications Systems] P.336-343 google
  • 8. Abawajy J. H., Dandamudi S. P. 2003 “Parallel job scheduling on multicluster computing systems” [in Proceedings of the IEEE International Conference on Cluster Computing] P.11-18 google
  • 9. Yu J., Buyya R. 2006 “A budget constrained scheduling of workflow applications on utility grids using genetic algorithms” [in Proceedings of the Workshop on Workflows in Surpport of Large-Scale Science] P.1-10 google
  • 10. Han Y., Jiang C. J., Luo S. 2005 “Resource scheduling model for grid computing based on sharing synthesis of Petri net” [in Proceedings of the 9th International Conference on Computer Supported Cooperative Work in Design] P.367-372 google
  • 11. Han Y., Luo X. 2006 “Modelling and performance analysis of grid task scheduling based on composition and reduction of Petri nets” [in Proceedings of the 5th International Conference on Grid and Cooperative Computing] P.331-334 google
  • 12. Zhao X., Wang B., Xu L. 2007 “Grid application scheduling model based on Petri net with changeable structure” [in Proceeding of 6th International Conference on Grid and Cooperative Computing] P.733-736 google
  • 13. Han Y., Jiang C., Luo X., Chen G. 2005 “Resource scheduling scheme for grid computing and its Petri net model and analysis,” Parallel and Distributed Processing and Applications, Lecture Notes in Computer Science vol. 3759 P.530-539 google
  • 14. Hu Z., Hu R., Gui W., Chen J., Chen S. 2005 “General scheduling framework in computational grid based on Petri net” [Journal of Central South University of Technology] Vol.12 P.232-237 google
  • 15. Subramani V., Kettimuthu R., Srinivasan S., Sadayappan P. 2002 “Distributed job scheduling on computational grids using multiple simultaneous requests” [in Proceedings of 11th IEEE International Symposium on High Performance Distributed Computing] P.359-366 google
  • 16. Shojafar M., Barzegar S., Meybodi M. R. 2010 “A new method on resource scheduling in grid systems based on hierarchical stochastic Petri net” [in Proceedings of the 3rd International Conference on Computer and Electrical Engineering] P.175-180 google
  • 17. Buyya R., Murshed M. 2002 “GridSim: a toolkit for the modeling and simulation of distributed resource management and scheduling for grid computing” [Concurrency and Computation: Practice and Experience] Vol.14 P.1175-1220 google
  • 18. Senthil Kumar B., Chitra P., Prakash G. 2009 “Robust task scheduling on heterogeneous computing systems using segmented MaxR-MinCT” [International Journal of Recent Trends in Engineering] Vol.1 P.63-65 google
  • 19. Wu M. Y., Shu W. W., Zhang H. 2000 “Segmented min-min: a static mapping algorithm for meta-tasks on heterogeneous computing systems” [in Proceedings of the 9th Heterogeneous Computing Workshop] P.375-385 google
  • 20. Braun T. D., Siegel H. J., Beck N., Boloni L. L., Maheswaran M. 2001 “A comparison of eleven static heuristics for mapping a class of independent tasks onto heterogeneous distributed computing system” [Journal of Parallel and Distributed Computing] Vol.61 P.810-837 google
  • 21. Armstrong R., Hensgen D. A., Kidd T. 1998 “The relative performance of various mapping algorithms is independent of sizable variances in run-time predictions” [in Proceedings of the 7th IEEE Heterogeneous Computing Workshop] P.79-87 google
  • 22.
  • 23. Shojafar M., Barzegar S., Maybodi M. R. 2011 “Time optimizing in Economical Grid Using Adaptive Stochastic Petri Net Based on Learning Automata” [in Proceedings of International Conference on Grid Computing & Applications (GCA), Worldcomp] P.67-73 google
  • [Table 1.] Tasks assigned to two separate groups, for allocation to the resources
    Tasks assigned to two separate groups, for allocation to the resources
  • [Fig. 1.] The three layers for assigning tasks and subtasks.
    The three layers for assigning tasks and subtasks.
  • [Table 2.] Notation for the hierarchical stochastic Petri net model
    Notation for the hierarchical stochastic Petri net model
  • [Fig. 2.] Hierarchical stochastic Petri net model for resource scheduling.
    Hierarchical stochastic Petri net model for resource scheduling.
  • [Table 3.] Task states
    Task states
  • [Table 4.] Budget/time deadlines for the 100 tasks assumed in simulation and tests
    Budget/time deadlines for the 100 tasks assumed in simulation and tests
  • [Fig. 3.] Average response time (ART, or makespan) per request. HSPN: hierarchical stochastic Petri net.
    Average response time (ART, or makespan) per request. HSPN: hierarchical stochastic Petri net.
  • [Fig. 4.] Average time allocated for each task. HSPN: hierarchical stochastic Petri net.
    Average time allocated for each task. HSPN: hierarchical stochastic Petri net.
  • [Fig. 5.] Average response cost (ARC) per request. HSPN: hierarchical stochastic Petri net.
    Average response cost (ARC) per request. HSPN: hierarchical stochastic Petri net.
  • [Fig. 6.] Average cost for each task. HSPN: hierarchical stochastic Petri net.
    Average cost for each task. HSPN: hierarchical stochastic Petri net.