SSDRM: SemiPartitioned Scheduling Based on Delayed Rate Monotonic on Multiprocessor Platforms
 DOI : 10.5626/JCSE.2014.8.1.43
 Author: Senobary Saeed, Naghibzadeh Mahmoud
 Organization: Senobary Saeed; Naghibzadeh Mahmoud
 Publish: Journal of Computing Science and Engineering Volume 8, Issue1, p43~56, 30 March 2014

ABSTRACT
Semipartitioned scheduling is a new approach for allocating tasks on multiprocessor platforms. By splitting some tasks between processors, semipartitioned scheduling is used to improve processor utilization. In this paper, a new semipartitioned scheduling algorithm called SSDRM is proposed for multiprocessor platforms. The scheduling policy used in SSDRM is based on the delayed rate monotonic algorithm, which is a modified version of the rate monotonic algorithm that can achieve higher processor utilization. This algorithm can safely schedule any system composed of two tasks with total utilization less than or equal to that on a single processor. First, it is formally proven that any task which is feasible under the rate monotonic algorithm will be feasible under the delayed rate monotonic algorithm as well. Then, the existing allocation method is extended to the delayed rate monotonic algorithm. After that, two improvements are proposed to achieve more processor utilization with the SSDRM algorithm than with the rate monotonic algorithm. According to the simulation results, SSDRM improves the scheduling performance compared with previous work in terms of processor utilization, the number of required processors, and the number of created subtasks.

KEYWORD
Realtime systems , Scheduling algorithms , Delayed rate monotonic , Semipartitioned technique

I. INTRODUCTION
One of the significant issues in multiprocessor scheduling is how to better utilize processors. The scheduling algorithms are classified as global scheduling and partitioned scheduling. In global scheduling, only one queue exists for the entire system, and tasks can run on different processors. This scheduling algorithm has high overhead. On the other hand, in partitioned scheduling, each processor has a separate queue, and each task may only run on one processor.
The problem of partitioned scheduling is very similar to binpacking, which is known to be NPhard [1]. In recent years, a new approach called semipartitioned scheduling has been proposed. In this approach, most tasks are assigned to a single processor, while a few tasks are allowed to be split into several subtasks and assigned to different processors to improve utilization.
The delayed rate monotonic algorithm is a modified version of the rate monotonic algorithm, which can achieve higher processor utilization than the rate monotonic algorithm [2, 3]. In this algorithm, a new state called the delayed state is added to the system. All tasks except for the one with lowest priority enter this state as they make a request. It is proven that the delayed rate monotonic can safely schedule any system composed of two tasks with total utilization less than or equal to that on a single processor.
In this paper, the problem of scheduling tasks with fixed priority on multiprocessor platforms is studied. The proposed approach can increase the utilization of some processors to 100% based on the delayed rate monotonic policy. Our novel contributions are as follows:
First, it is formally proven that any task which is feasible under the rate monotonic algorithm will be feasible under the delayed rate monotonic algorithm as well. Then, we propose a new semipartitioned scheduling algorithm based on the delayed rate monotonic algorithm called SSDRM, which adapts a previous allocation method [4] with the delayed rate monotonic algorithm. The scheduling policy [4] is based on the rate monotonic algorithm. The use of the delayed rate monotonic algorithm makes SSDRM perform better against possible overload compared to the rate monotonic algorithm. After that, two improvements are proposed to change the allocation method of SSDRM in order to improve the processor utilization by using a new preassignment function and a new splitting function. Based on these two improvements, SSDRM can achieve full processor utilization in some special cases. Another advantage of the new preassignment function is the number of created subtasks that can be safely scheduled. The simulation results demonstrate that SSDRM improves the scheduling performance compared with previous work in terms of processor utilization, the number of required processors, and the number of created subtasks.
The rest of this paper is organized as follows. In Section II, related works are discussed, while in Section III, the system model is introduced. Sections IV and V are related to the allocation method and SSDRM. Finally, an evaluation and conclusions are discussed in Sections VI and VII.
II. RELATED WORKS
The first semipartitioned technique was used with the EDF algorithm for soft realtime systems [5]. Then, some studies worked on periodic tasks [4, 68]. An approach called PDMS_HPTS selects and splits the highest priority task in each processor [8]. The scheduler of processors in this paper is based on the deadline monotonic algorithm, with which the utilization is 60%.
In some other research, the processor utilization is raised up to the wellknown L&L bound (
L&L bound =n *(2^{1/n} – 1)) [7, 9, 10]. The distinctive points of these papers are the allocation method and definition of tasks. For instance, a semipartitioned scheduling algorithm called SPA2 was proposed [7] for periodic tasks, with which the processor utilization reaches the L&L bound under a rate monotonic policy. The allocation method used in this paper is a worstfit method, which selects a processor with the least total utilization of tasks assigned to it so far.Increasing the processor utilization beyond the L&L bound has also been studied in some papers. An approach called pCOMPATS was proposed [6] for multicore platforms. As the number of cores increases, this approach can achieve 100% utilization. When the utilization of each task does not exceed 0.5, this processor utilization is achieved. Therefore, tasks are divided into two groups: light tasks with utilization of lower than 0.5 and heavy tasks with utilization more than or equal to 0.5. Light tasks are assigning by a firstfit method. First, tasks are sorted in ascending order of their periods, and then allocation is done from the front of the queue towards the end. Heavy tasks are assigned like PDMS_HPTS. It means that the highest priority task is selected and then split. The least upper bound to the processor utilization for heavy tasks is 72%.
The scheduler of processors is based on the rate monotonic algorithm, and the scheduleability test used in this paper is Rbound [11]. This scheduleability test was proposed by Lauzac et al. [11] in 2003. Their approach is sufficient for the scheduleability test with good approximation.
The RMTS approach increases the processor utilization beyond the L&L bound [4]. As in another study [7], the worstfit method is used in the allocation of RMTS, but admission control in RMTS is based on response time analysis, so the processor utilization can be improved. There are also some other works which studied semipartitioned scheduling based on the earliest deadline first algorithm [12, 13].
The approach of this paper can also increase the processor utilization more than the L&L bound based on delayed rate monotonic policy. The overhead of task splitting is the same as in previous approaches [6, 8]. Based on these studies, the overhead of context switching due to task splitting is negligible. There are also some other approaches that are similar to delayed rate monotonic approach which improves the scheduleability of the rate monotonic algorithm. This group of scheduling algorithms studied the problem of fixed priority scheduling with a limitedpreemptive approach. For instance, an approach was proposed that improves the feasibility of tasks, so the processor utilization could be enhanced [14].
III. SYSTEM MODEL
In the proposed system, tasks are periodic, and their deadline parameters (i.e., relative deadline) are assumed to be equal to their periods. A request of task
_{i} (τ i = {1, ,n }) is called a job. Every taskτ _{i} is modeled by two parameters:Ci = the worstcase execution time required by task τi on each job. Ti = the time interval between two jobs of task τi.
Tasks are considered hard realtime (as opposed to soft realtime [3]). Every job of task
τ _{i} should be completed before the next job of the same task arrives. The response time of a job is the time span from the job arriving up to its execution completion.Liu and Layland [15] proved that the worst response time of task
τ _{i} occurs when it simultaneously requests with all higher priority tasks. Taskτ _{i} is feasible if its worst response time is lower than or equal to its period. The utilization of taskτ _{i} is defined by . In this system, we want to assign task set Γ = {(C _{1},T _{1}), (C _{2},T _{2}), ..., (C_{n} ,T_{n} )} to m processors, {M _{1}, ...,M_{m} }.DEFINITION 1 .The total utilization of task set Γ is defined by: DEFINITION 2. The processor utilization of M_{i}, i = {1, , m} is the total utilization of task set Γ', which is assigned to M_{i}. DEFINITION 3. The average processor utilization of task set Γ with m processors is: A semipartitioned scheduling includes two phases: partitioning and scheduling. In the partitioning phase, all tasks are assigned to the processors. If a task cannot be entirely accommodated by a processor, it is split into two subtasks. These tasks are called splittasks. The other tasks which are entirely assigned to a processor are called nonsplit tasks. Nonsplit tasks always run on a single processor.
The second phase of a semipartitioned scheduling is the policy to determine how to schedule assigned tasks on each processor. The scheduler used within each processor is based on the delayed rate monotonic algorithm, which is an improved version of the rate monotonic algorithm [15]. In the rate monotonic algorithm, every task with a lower period has higher priority. Further, this algorithm is preemptive. So, the lowest priority task has the highest probability for overrun. In the delayed rate monotonic algorithm, all tasks have a secure delay except for the task with the lowest priority. Therefore, in this scheduler, these two queues are defined for tasks:
Ready: a task with the lowest priority directly enters this queue, and other tasks enter when their delays are over. Delay: all tasks except for the lowest priority one enter this queue as they make a request.
A task in the ready queue has higher priority over all tasks in the delay queue. If no tasks exist in the ready queue, then a task is selected from the delay queue to run next. Tasks in both ready and delay queues are scheduled based on the rate monotonic algorithm. Suppose tasks are sorted in ascending order of their periods. Now, Formula (3) calculates the delay of each task.
For
τ _{1}, with the shortest period, the delay is equal to:T _{1} –C _{1}For
τ _{i} ∀ 2 ≤i ≤n – 1, the delay is equal to:Parameter
α is a constant used to adjust the amount of delay imposed on tasks. It can be a value varying between [0, 1]. The delay of tasks is the maximum ifα is equal to one.It should be mentioned that the concept of delay used by the delayed rate monotonic algorithm is different from zero laxity [16]. If
T_{i,k} is the period of thek th job of taskτ _{i} andC_{i,k} (t ) is the remaining worstcase execution time of taskτ _{i} in timet , then the laxity of taskτ _{i} in timet is defined asT_{i,k} – (t +C_{i,k} (t )). When this laxity is equal to zero, it is called zero laxity. A task with zero delay (i.e., a task whose state has to change from delay to ready) is not necessarily a zerolaxity task.IV. ALLOCATION METHOD
An approach called RMTS was introduced in another study [4] for allocating tasks on multiprocessor platforms with a semipartitioned technique and the rate monotonic algorithm. In this paper, the scheduleability test is based on the response time analysis (RTA) [17]. The worst response time of a task under the rate monotonic algorithm can be calculated by Formula (4). Suppose tasks are sorted in ascending order of their periods. Now, the worst response time of task
τ _{i} is:This iteration is terminated when then task
τ _{i} (C_{i} ,T_{i} ) is feasible under the rate monotonic algorithm. In the following, to check the feasibility of tasks, we demonstrate an example.EXAMPLE 1. Task set Γ includes three tasks with the parameters shown in Table 1. Now, we want to check whether this task set is schedulable under the rate monotonic algorithm or not.● First,
τ _{1} (30, 125).● Second,
τ _{2} (48, 130).● Third,
τ _{3} (92, 275).The worst response time of all tasks is lower than their periods. Therefore, task set Γ is schedulable under the rate monotonic algorithm.
Tasks in RMTS are classified into two groups.
● Light: task
τ _{i} is light when● Heavy: task
τ _{i} is heavy whenΘ ( )is the L&L bound for task setτ .τ > A. Preassignment Function for Heavy Tasks
First, tasks are sorted in descending order of their periods, and then a preassignment function is done from the end of the queue towards the front. In the preassignment function, the heavy tasks which satisfy Condition (5) should be separated and assigned to processors in the preassignment phase. These processors are called preassign processors. Suppose
τ_{i} (C_{i} ,T_{i} ) is a heavy task.Λ(
τ ) in Condition (5) determines the parametric utilization bound (e.g., L&L is one of the parametric utilization bounds for rate monotonic algorithm). Condition (5) mentions that taskτ_{i} (C_{i} ,T_{i} ) can be preassigned if all lower priority tasks can be accommodated by all normal processors. Normal processors are processors which do not have a preassign task.> B. Assigning Normal Tasks
After the preassignment phase, it is time for the remaining tasks. These tasks are called normal tasks. Based on the sorting method, the lowest priority task is in front of the queue. Now, a processor is chosen among normal processors with the worstfit method. It means the processor with the lowest load factor is chosen for assigning the current task. If all tasks (including the current task) in this processor can be feasible and all can meet their deadlines, we can add the entire task to this processor. As mentioned, such a task is called a nonsplit task. If an infeasible task exists, then the current task should be split. This task is called a splittask. The first part is added to the processor, and the second part is put back in front of the queue. After adding the first part to a processor, no other task or subtask is assigned to this processor.
If no normal processor exists, then a preassign processor is chosen. In this case, the preassign processor which has a task with the largest period is chosen for assigning.
This process continues. If the queue of tasks is empty, we can say that all tasks are successfully assigned to processors. Algorithm 1 shows the pseudocode of the allocation method of RMTS.
It is easy to derive the following property from Algorithm 1.
LEMMA 1. Suppose task τ_{i} is split into several subtasks Now, each subtask ,j={1, , q–1} is the highest priority task in its own processor. The proof follows from the method of splitting tasks.Proof. Now, under the rate monotonic algorithm, the first subtask of a splittask will start its execution as soon as it makes a request, and it will run to the end without any interruption. Other subtasks of a splittask should wait until the prior subtasks are completed before starting to execute. For this concept, the term of release time is used [4]. It means that the release time of each subtask of a splittask is equal to the total worst response time of prior subtasks. This release time is calculated as shown in Formula (6).
Suppose task
τ_{i} (C_{i} ,T_{i} ) is split into several subtasks and each subtask is assigned to a different processor. Now, the release time of subtask is:These subtasks need to be synchronized to execute correctly. For example, subtask cannot start its execution until prior subtasks, are finished. According to Lemma 1, the worst response time of a subtask is equal to its worstcase execution time. However, for the last subtask, this may not hold.
A binary search can be done for finding the maximum value for the worstcase execution time of the first subtask of a splittask. With this allocation method, it was proven that all tasks including nonsplit tasks and splittasks are feasible under the rate monotonic algorithm [4].
V. SSDRM
In this section, we propose a new semipartitioned scheduling algorithm based on a delayed rate monotonic algorithm called SSDRM, which adapts the allocation method of RMTS to the delayed rate monotonic algorithm. In the delayed rate monotonic algorithm, all tasks except for the lowest priority one enter the delay queue before moving to the ready queue. Each task stays for a different time interval in the delay queue. To prove that the feasibility of tasks is exactly the same as in the rate monotonic algorithm, the delay of tasks has been changed in this proposal.
As mentioned, tasks in RMTS are divided into several groups. In a general classification, we have nonsplit tasks and splittasks.
Nonsplit tasks: these tasks are each entirely assigned to a processor. Splittasks: these tasks are split into several subtasks and each assigned to a different processor. Considering Formula (6) and Lemma 1, the release time of a subtask is equal to the total worstcase execution time of prior subtasks.
Both heavy tasks and light tasks belong to the same groups, i.e., nonsplit tasks. It means a nonsplit task can be heavy or light.
Now, three types of delay are introduced in SSDRM
For τn, with the largest period, delay is equal to zero. For τi ∀ 1 ≤ i ≤ n – 1
If
τ_{i} is a nonsplit task, then delay is equal to:where
R_{i} is the worst response time of taskτ_{i} and is calculated by Formula (4). Suppose task set Γ = {τ _{1}, …,τ _{n} } is safely scheduled by the rate monotonic algorithm. Now, the delay of taskτ_{i} ,i = {2, …,n – 1} rarely becomes zero. On the other hand, the worst response time of taskτ_{i} is rarely equal to its period. The delay of taskτ _{1} with the highest priority is not equal to zero unlessU _{1} = 1.Otherwise,
τ_{i} is thej th subtask of a splittask, and the delay is equal to:Considering Algorithm 1, processors are occupied in parallel, so we can express Lemma 2.
LEMMA 2. The lowest priority task in each processor is a nonsplit task. Considering Phase 1 of Algorithm 1, a heavy task is entirely assigned to a processor if all lower priority tasks can be accommodated by all normal processors. Therefore, preassign tasks are the lowest priority tasks in their own processors. Further, they are nonsplit tasks.Proof. According to Phase 2 of Algorithm 1, normal processors are chosen by a worstfit method and accommodated in parallel, so one nonsplit task exists in each normal processor. The nonsplit task is the lowest priority task in each normal processor, because of the sorting method. Therefore, a nonsplit task exists in each processor in which it is the lowest priority task.
Now, by Lemma 3, we show that nonsplit tasks which are feasible under the rate monotonic algorithm will be feasible under the SSDRM algorithm as well.
LEMMA 3. Suppose task τ_{i} (C_{i}, T_{i}) is a nonsplit task and it is feasible under the rate monotonic algorithm. This task is feasible under the SSDRM algorithm as well. According to Lemma 2, the lowest priority task in each processor is a nonsplit task. The lowest priority task in SSDRM has no delay and directly enters the ready queue. We assumed in Lemma 3 that taskProof. τ_{i} is feasible under the rate monotonic algorithm, and it means thatR_{i} ≤T_{i} . In SSDRM, scheduling in both the ready and delay queues is done based on the rate monotonic algorithm. Therefore, ifτ_{i} is the lowest priority task, then it is feasible under SSDRM, because it directly enters the ready queue.For other nonsplit tasks, the delay is considered as in Formula (7). As mentioned earlier, scheduling in the ready queue is based on the rate monotonic algorithm, so the worst response time of task
τ_{i} in the ready queue is equal toR_{i} . Therefore, the maximum response time of taskτ_{i} under SSDRM,R_{iSS DRM} , is equal toDelay_{i} +R_{i} .τi is feasible under rate monotonic → Ri ≤ Ti Delayi = Ti –Ri RiSS–DRM = Ti –Ri+Ri
The worst response time of task
τ_{i} is at most equal to its period. So, this task is feasible under the SSDRM algorithm.Therefore, it is proven that all nonsplit tasks, which are feasible under the rate monotonic algorithm, are feasible under SSDRM as well. So, the response time analysis can be used for admission control of SSDRM.
In SSDRM, based on the delayed rate monotonic algorithm, a nonsplit task which is in the delay queue can run next if the ready queue is empty. According to Formula (8), the delay of a splittask is equal to the total worstcase execution time of prior subtasks. It means that with the delay of subtasks, we express the concept of release time used in another study [4]. Now, a splittask in SSDRM can run only if its delay is completely finished and it has entered the ready queue.
This is done to synchronize subtasks of a splittask. In this work, the same behavior for a splittask as it is regarded in RMTS is considered. More details on the feasibility of the splittasks are presented in a previous study [4]. In this work, to use the advantages of the delayed rate monotonic algorithm, such as its resilience against possible overload, the allocation method of RMTS is adapted to this algorithm.
> A. Changing Assigning Method
To improve the processor utilization under the SSDRM algorithm, we express two improvements. First, Theorem 1 is stated.
THEOREM 1. Two tasks, τ_{1} and τ_{2}, where τ_{1} has higher priority and τ_{2} is not the first subtask of a splittask, are feasible under SSDRM if and only if U_{1} + U_{2} ≤ 1. As mentioned earlier, the delayed rate monotonic algorithm can safely schedule any system composed of two tasks with total utilization less than or equal to one on a single processor [2]. The proof is based on the fact that taskProof. τ _{1} (C _{1},T _{1}) is at most delayed forT _{1} –C _{1}. With SSDRM, three types of systems composed of two tasks exist.First, a system composed of two nonsplit tasks. Suppose two nonsplit tasks τ1(C1, T1) and τ2(C2, T2) are assigned to a processor, T2 > T1 and U1 + U2 ≤ 1. According to Relation (7), the delay of task τ1 under SSDRM is equal to T1 – R1. Considering Formula (4), the worst response time of the highest priority task is equal to its worstcase execution time, so the delay of task τ1 is equal to T1 – C1, and this system is scheduled safely under SSDRM. The second type of system is composed of two tasks and includes a nonsplit task τ1 (C1, T1) and a splittask in which is the last subtask of splittask τ2.
To illustrate this system, suppose
τ _{2} is split into two subtasks and The first subtask is assigned to a processor. Now, we try to assign the second subtask to processorM_{a} . The processorM_{a} consists of a nonsplit taskτ _{1}(C _{1},T _{1}), which is assigned earlier. According to Relation (8), the delay of subtask in processorM_{a} is equal to , which is less thanWe know that and in realtime systems the worstcase execution time of a task is at most equal to its period. Now, we have:
Therefore, the delay of subtask in processor
M_{a} is lower than , which is considered in a previous study [2]. So, based on the delayed rate monotonic algorithm, two tasksτ _{1} and are feasible under SSDRM, if and only ifIn Theorem 1, it is expressed that because of the third type of system, the task
τ _{2} should not be a first subtask of a splittask.● The third type of a system composed of two tasks is a system which consists of a nonsplit task
τ _{1}(C _{1},T _{1}) and the first subtask of a splittask,.The delay of this subtask is equal to zero and it directly enters the ready queue. The fact that their total utilization is less than one does not guarantee the feasibility of these tasks under SSDRM. Therefore, this state is separated in Theorem 1.
Now, we use Theorem 1 to express two improvements in allocation method of SSDRM. These improvements make SSDRM achieve higher processor utilization than RMTS. The first improvement is a new preassignment function, and the second improvement is a new splitting function. In the new preassignment function of SSDRM, we try to fully utilize the existing processors by assigning two tasks with total utilization in a specific bounds to each processor.
For each task
τ_{i} ,i = {1, ...,n } withU_{i} ≥ 0.5, if another task exists, i.e.,τ_{j} ,j = {1, ...,n } −i , which satisfies Condition (9), then tasksτ_{i} andτ_{j} are entirely assigned to a processor, and no other task or subtask is assigned to this processor.Condition (9) determines the specific bounds of processor utilization of preassign processors. The lower bound, δ, is
Θ (2) = 2 * (2^{0.5} – 1) ≈ 0.82. The best value for threshold δ in our evaluations is equal to 0.95. It should be mentioned that if very many tasksτ_{j} exist in whichδ ≤U_{i} +U_{j} ≤ 1, then the best result which has the largest total utilization is chosen. Algorithm 2 illustrates the pseudocode of the allocation method of SSDRM which uses the new preassignment function.First, we try to find some groups of two tasks, each with total utilization satisfying Condition (9). Then, each group is assigned to a single processor. These processors are removed from the queue of processors, and hence, no other task or subtask is assigned to these processors.
After the preassignment function, it is time for remaining tasks. These tasks are assigned to the remaining processors, just like in the assignment method used in RMTS.
From Algorithm 2, it is easy to derive the following property.
LEMMA 4. The composition of our new preassignment function and the allocation method of RMTS preserves the feasibility of the system, similarly to a previous study [4]. Therefore, Theorem 2 can be stated.
THEOREM 2. If all tasks in task set Γ are successfully partitioned by the allocation method of SSDRM on m processors and scheduled using the delayed rate monotonic algorithm policy, then all tasks can meet their deadlines. Theorem 2 can be proven by noting that the feasibility of two tasks with total utilization less than or equal to one are guaranteed based on Theorem 1 and Type 1, since these tasks are entirely assigned to the processors and considered as nonsplit tasks. The feasibility of other tasks which are assigned using the allocation method of RMTS are guaranteed based on the exact timing analysis method [4].Proof. Therefore, using this preassignment function, some processors exist in which the processor utilization is raised up to 100%. Another advantage of the new preassignment function is the number of created subtasks. The number of created subtasks is at most m – 1 [4]. Based on the new preassignment function, after assigning two tasks to one processor, no other task or subtask is assigned to this processor. Therefore, the number of created subtasks in SSDRM is smaller than that of RMTS, and due to the number of created subtasks, the additional overhead of the system is reduced.
The second improvement proposed for the allocation method of SSDRM in order to improve the processor utilization is a new task splitting function. This function uses the second type of system composed of two tasks, which is expressed in Theorem 1. Two assumptions exist in our splitting function.
Assumption 1. The current task in front of the queue of unassigned tasks is a subtask, i.e., ,j ∈ {2, ...,q – 1}, is the last subtask of splittaskτ_{i} , and this subtask cannot be entirely accommodated by the host processor.Assumption 2. The host processor includes only one nonsplit taskτ_{x} , which is assigned earlier.If these two assumptions hold, then tasks
τ_{x} and are not feasible under the rate monotonic algorithm. Therefore, considering Algorithm 1 and line 25, the subtaskj ∈{2, ...,q – 1} , should be split into two subtask and .Now, in the new splitting function of SSDRM, and such a situation, if the total utilization of two tasks,
τ_{x} and , is larger than one, then the worstcase execution time of subtask is obtained in which the processor utilization of the host processor is equal to one, Otherwise, when the total utilization of two tasks,τ_{x} and , is lower than one, two tasks,τ_{x} and , are feasible under SSDRM based on Theorem 1, so is returned.Algorithm 3 demonstrates the pseudocode of the new splitting function of SSDRM. Three parameters are inputs of this algorithm: first, the current task; second, a task set Γ' which illustrates all assigned tasks to the host processor; and third, the delay of the current task.
As mentioned earlier, the delay of nonsplit tasks is calculated by Formula (7). This delay is calculated in the scheduling phase when all tasks are clearly assigned to the processors. Therefore, in the partitioning phase, the delay of nonsplit tasks is assumed to be zero (line 2 of Algorithm 1). On the other hand, for splittasks, the delay is calculated in the partitioning phase, so their delay is not equal to zero (line 29 of Algorithm 1).
The longest value for the worstcase execution time of the current task by which all tasks in Γ' are feasible is the output of this algorithm. Based on this function, in some processors, SSDRM can raise the processor utilization to 100%.
In line 6 of Algorithm 3, we check two assumptions. First, only one task exists in the host processor;
n′ = 1. Second, the current task is not the first subtask of a split task;Delayi ≠ 0. If these two assumptions are correct andU (Γ' ) +U_{i} > 1, then the worstcase execution time of subtask is calculated, by which the processor utilization of the host processor is equal to one.In the following, we illustrate an example to show the performance of our splitting function.
EXAMPLE 2. There are two processors,M _{1} andM _{2}, and three tasks, as shown in Table 2. All of these tasks are heavy, but only two tasksτ _{2} (36, 64) andτ _{1} (60, 100) are preassigned to processorsM _{1} andM _{2}, respectively. Now, taskτ _{3} is in front of the queue of unassigned tasks. No normal processor exists, so a preassign processor is selected to assign the current task. Therefore, a processor which has a task with the largest period is chosen for assigning taskτ _{3}, and hence, processorM _{2} is selected.According to row two of Table 3, the current task,
τ _{3}, cannot be entirely accommodated by processorM _{2}, and we should split taskτ _{3} into two subtasks andFor finding a suitable value for , the splitting function is called. This subtask is the first subtask of splittask
τ _{3}, and its delay is equal to zero. We express this state as the third type of system in Theorem 1. As mentioned, this state is separated from Theorem 1, and only a simple binary search is done to find the worstcase execution time of subtask . According to the binary search, this value is equal to 18. Therefore, subtask (18, 48) is assigned to the processorM _{2} and no other task or subtask is assigned to this processor. After this step, subtask (22, 48) is in front of the queue of unassigned tasks. The processorM _{1} is chosen for assigning this subtask. Taskτ _{1} (36, 64) exists in this processor. Considering Table 3 and row four, this subtask cannot entirely be accommodated byM _{1}.Therefore, it should be split into two subtasks ( ,
T _{3} ) and , and the splitting function is called again to find the suitable value for . This subtask is the second subtask ofτ _{3}, and its delay is equal to 18. So, the second type of system in Theorem 1 is obtained here. It means that two tasksτ _{1} (36, 64) and ( ,T _{3} ) are feasible under SSDRM if and only if Therefore:The output of the splitting function of SSDRM is equal to 21. It means the value of the worstcase execution time of is obtained as 21, and the processor utilization of
M _{1} is equal to one.If we use a simple binary search which has to be used for RMTS, the worstcase execution time of reduces to 14. In this case, the processor utilization of
M _{1} increases from 0.854 to 1 using the new splitting function and SSDRM algorithm.VI. EVALUATION
In this section, we investigate the performance of SSDRM compared with three prior works. In our evaluations, three different kinds of task sets are randomly generated. The distribution of our random function is uniform. The details of each task set are shown in Table 4.
The total utilization of each task set,
U (Γ), is a random value between[
L&L bound(m)_{m→∞} * V_{i}, V_{i} ].We assume that:
L&L bound(m)_{m→∞} = 0.7where
V_{i} determines the maximum total utilization of each task set and its value will be one of the following values of the following set:V_{i} ∈{4, 8, 16, 32, 64, 128}For instance, when
V_{i} is equal to 4, it means that a task set = {(C _{1},T _{1}), (C _{2},T _{2}), , (C_{n} ,T_{n} )} is generated in which As shown in Table 4, for eachV_{i} , 5000 task sets with random utilization are generated. The number of processors is assumed to be variable so that these task sets can completely be assigned by all scheduling algorithms. Indeed, by assigning these task sets to lower processors, higher utilization can be achieved.We define another parameter called the average processor utilization to evaluate each algorithm. This parameter is calculated by Formula (10).
First, we compare SSDRM with SPA [7] and RMTS [4]. In the first row of Table 4, the features of generated task sets are illustrated. For instance, the utilization of each task is randomly selected from 0.01 to 1 and consists of both light tasks and heavy tasks.
The total number of required processors in the system are demonstrated in Fig. 1(a) and Table 5 through the three aforementioned algorithms. Fig. 1(b) and Table 6 show the total number of subtasks which are created by these three algorithms. As shown in Fig. 1(a) and Table 5, by increasing the value of
V_{i} , the difference of the total number of required processors in SSDRM compared with RMTS or SPA is decreasing. On the other hand, the difference of created subtasks is increasing as well. For instance, whenV_{i} is equal to 64, then the total number of required processors in SSDRM is 0.43% lower than that of RMTS. On the other hand, the total number of created subtasks under SSDRM, whenV_{i} is equal to 64, is 94% lower than RMTS.Considering Fig. 1(b) and Table 6, the difference of the total number of created subtasks under SSDRM compared with RMTS or SPA is remarkable. As mentioned earlier, the overhead of task splitting is the same as previous works. Therefore, by creating a lower number of subtasks, the overhead of task splitting is reduced in comparison to RMTS and SPA. This difference occurs because of the preassignment function of SSDRM. As
V_{i} is increasing, the usage of the preassignment function of SSDRM is increasing as well. For instance, whenV_{i} is equal to 8 and 32, then 32% and 55% of existing processors are preassigned, respectively. These processors include two nonsplit tasks in which the processor utilization is between [0.95, 1], and they are scheduled safely with the SSDRM algorithm. In these evaluations, the threshold δ is assumed to be 0.95.Table 7 illustrates the average processor utilization of three algorithms. It can be observed that SSDRM significantly achieves better average processor utilization in comparison to SPA and RMTS. The average processor utilization of SSDRM is about 0.58% and 10% more than RMTS and SPA, respectively.
The average number of callings of our new splitting function in all simulations is about 1.1% of the total number of processors. It means that the processor utilization of 1.1% of the processors is almost equal to one. These processors have two tasks: a nonsplit task and a splittask. The splittask is not the first part of the task which is split. Their safety is assured by Theorem 1.
The second row in Table 4 shows tasks with random utilization between 0.01 and 0.49 in order for SSDRM to be comparable with pCOMPATS [6]. As mentioned earlier, tasks in pCOMPATS are divided into two groups: light and heavy. For each group, a separate approach is proposed. A firstfit method is used for allocating light tasks (tasks with utilization lower than 0.5).
In this approach, the lowest priority task is split. pCOMPATS uses RBound for the scheduleability test to cluster compatible tasks together. Table 8 shows the average processor utilization, and Fig. 2(a) shows the total number of required processors. As shown in Fig. 2(a), the performance of pCOMPATS is a little better than SSDRM. Since the utilization of each task is less than 0.5, it is hard to find a system composed of two tasks in which the total utilization of a system is near one. Therefore, the preassignment function of SSDRM is not used much. But the largest number of required processors in pCOMPATS is more than the largest number of required processors in SSDRM. Details of the largest number of required processors for each approach are shown in Table 9.
NT_{Largest} is the number of task sets which are successfully partitioned onP_{Largest} processors. For instance, the range of required processors whenV_{i} is equal to 128 is illustrated in Fig. 3.The total number of created subtasks by two approaches is shown in Fig. 4(a). Considering Fig. 4(a), the total number of created subtasks by SSDRM is more than pCOMPATS, since pCOMPATS uses the firstfit method to allocate compatible tasks to one processor. The disability of pCOMPATS to partition tasks with utilization larger than 0.5 is the main challenge of this approach. The third group of generated task sets is used to compare SSDRM with pCOMPATSHT. In [6], pCOMPATSHT is proposed to assign heavy tasks (tasks with
U_{i} ≥ 0.5).In pCOMPATSHT, tasks are sorted in descending order of their periods. First, one task is assigned to each empty processor. If no empty processor is found, and the current processor is not full, then the current task is split into two subtasks. According to the sorting method, the highest priority task is split. Fig. 2(b) shows the total number of required processors under SSDRM and pCOMPATS HT.
As shown in Fig. 2(b), the difference of the total number of required processors between two approaches is remarkable. The total number of created subtasks is more proof which indicates the superiority of SSDRM compared to pCOMPATSHT. The total number of created subtasks by two approaches is demonstrated in Fig. 4(b).
According to the utilization of each task, most processors have two tasks. The total utilization of two tasks is at least equal to one. Therefore, the preassignment function of SSDRM is not useful here. On the other hand, the splitting function of SSDRM can be used to split a task in order to fully utilize processors. For instance, when
V_{i} is equal to 32, the processor utilization of almost 20% of the processors is raised up to 100% using the splitting function of SSDRM. Table 10 illustrates the average processor utilization. Considering Table 10, the average processor utilization is increased by about 11% using SSDRM.For evaluation of the delayed rate monotonic algorithm against overload, we randomly generated 1000 task sets for each test. These task sets are generated so that they can be completely assigned to the processors. The allocation method of RMTS is used for assigning tasks. For producing overload, a task is randomly selected, and its execution time is somewhat increased. Then, the executions of these task sets are simulated under both the rate monotonic algorithm and the delayed rate monotonic algorithm.
For the first evaluation, a task is randomly selected, and its execution time is enlarged so that its overload factor is increased by 0.1, 0.2, and 0.3 for three different tests. At the end, the success ratio average of these tests is computed for the corresponding number of processors. Fig. 5(a) shows the result of this evaluation. As shown in Fig. 5(a), the success ratio of two algorithms when the number of processors is high becomes closer. This happened because the number of processors which are not overloaded is increased as well.
In the second evaluation, for each processor, a task is randomly selected, and the overload factor is added to its worstcase execution time. Fig. 5(b) shows the result of this evaluation for 32 processors. The success ratio of two algorithms converges together with increasing overload factor, because the number of processors with utilization larger than one is increased as well.
VII. CONSLUSION
In this paper, a new scheduling algorithm called SSDRM was proposed by combining the delayed rate monotonic algorithm with a semipartitioned technique. SSDRM can safely schedule any task set which is feasible under the rate monotonic algorithm.
Our proposed approach can increase the processor utilization beyond the wellknown L&L bound. SSDRM can also be more resilient against possible overload compared with the rate monotonic algorithm. Further, it achieves the same utilization. Then, we expressed two improvements to achieve higher processor utilization in some special cases under the SSDRM algorithm. In our future works, further improvements will be sought to change the allocation method of SSDRM to improve the processor utilization.

[Table 1.] Period and worstcase execution time of tasks

[Table 2.] Period and worstcase execution time of tasks

[Table 3.] Check feasibility of tasks using response time analysis

[Table 4.] Features of each test

[Fig. 1.] Experimental result based on variable number of processors for investigating (a) the total number of required processors (b) the total number of created subtasks.

[Table 5.] Total number of required processors

[Table 6.] Total number of created subtasks

[Table 7.] Average processor utilization

[Table 8.] Average processor utilization

[Fig. 2.] Evaluation of SSDRM based on the number of required processors in comparison with (a) pCOMPATS (b) pCOMPATSHT.

[Table 9.] The largest number of required processors

[Fig. 3.] The range of required processors when Vi is equal to 128.

[Fig. 4.] Evaluation of SSDRM based on the number of created subtasks in comparison with (a) pCOMPATS (b) pCOMPATSHT.

[Table 10.] Average processor utilization

[Fig. 5.] Resistance of two algorithms against overload in which producing overload on (a) one task from the entire system (b) one task from each processor.