1 Introduction
Due to the higher computational power demands of modern computing systems, e.g. sensor networks, satellites, multi-robot systems and personal embedded electronic devices, a system's power consumption has increased significantly. An efficient energy management protocol is therefore required to balance the power consumption and workload requirements of a system. With this motivation, this paper aims to use an optimization-based approach to develop an algorithm for a multiprocessor real-time scheduling problem with the goal of minimizing energy consumption.
The dynamic power consumption of a CMOS processor is often represented as a function of both clock frequency and supply voltage [1]. Thus, a dynamic voltage and frequency scaling (DVFS) scheme is often applied to reduce processor power consumption. DVFS is now commonly implemented in various computing systems, both at hardware and software levels. Examples include Intel’s SpeedStep technology, AMD’s PowerNow! technology and the Linux Kernel Power Management Scheme.
1.1 Terminologies and Definitions
This section provides basic terminologies and definitions used throughout the paper.
Speed s: The speed of a processor is defined as the ratio between the operating frequency f and the maximum system frequency f_max, i.e. s = f / f_max. We denote with s_min the minimum execution speed of a processor.
Task τ_i: A task τ_i is defined as a triple (C_i, D_i, T_i); C_i is the required number of processor cycles, D_i is the task's deadline and T_i is the task's period. If the task's deadline is equal to its period (D_i = T_i), the task is said to have an 'implicit deadline'. The task is considered to have a 'constrained deadline' if its deadline is not larger than its period, i.e. D_i ≤ T_i. In the case that the task's deadline can be less than, equal to, or greater than its period, it is said to have an 'arbitrary deadline'.
Job τ_{i,j}: A job τ_{i,j} is defined as the j-th instance of task τ_i, where j ∈ ℕ. τ_{i,j} arrives at time (j−1)T_i, has the required execution cycles C_i and a deadline at time (j−1)T_i + D_i.
Taskset: The taskset is defined as a set composed of n real-time tasks {τ_1, …, τ_n}.
Minimum Execution Time c_i^min: The minimum execution time of a task is its execution time when the task is executed at the maximum speed s = 1, i.e. c_i^min = C_i / f_max.
Task Density δ_i(s): The task density of a task executed at a speed s is defined as the ratio between the task execution time and the minimum of its deadline and its period, i.e. δ_i(s) = (c_i^min / s) / min(D_i, T_i). When all tasks are assumed to have an implicit deadline, this quantity is often called 'task utilization'.
Taskset Density δ(s): The taskset density is defined as the summation of all task densities in the taskset, i.e. δ(s) = Σ_{i=1}^{n} δ_i(s), where n is the number of tasks in the taskset. The minimum taskset density is given by δ_min = δ(1).
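As an illustration of the definitions above, the following Python sketch computes task and taskset densities. The `Task` fields and the normalised `F_MAX` constant are our own naming for this illustration, not notation from the paper.

```python
from dataclasses import dataclass

@dataclass
class Task:
    cycles: float    # required execution cycles C_i
    deadline: float  # deadline D_i
    period: float    # period T_i

F_MAX = 1.0  # normalised maximum frequency, so speed s lies in (0, 1]

def min_execution_time(task: Task) -> float:
    """Minimum execution time c^min: cycles executed at the maximum speed s = 1."""
    return task.cycles / F_MAX

def task_density(task: Task, speed: float) -> float:
    """Execution time at `speed` divided by min(deadline, period)."""
    exec_time = min_execution_time(task) / speed
    return exec_time / min(task.deadline, task.period)

def taskset_density(tasks, speed: float) -> float:
    """Sum of task densities; at speed 1 this is the minimum taskset density."""
    return sum(task_density(t, speed) for t in tasks)

tasks = [Task(cycles=2.0, deadline=5.0, period=10.0),
         Task(cycles=1.0, deadline=10.0, period=10.0)]
# Minimum taskset density at s = 1: 2/5 + 1/10 = 0.5
d_min = taskset_density(tasks, 1.0)
```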
Feasibility Optimal: A scheduling algorithm for a homogeneous multiprocessor system is feasibility optimal when the upper bound on the minimum taskset density δ_min, for which the algorithm is able to construct a valid schedule such that no deadlines are missed, equals the total number of processors M in the system.
Scheduling Scheme: Multiprocessor scheduling can be classified according to the task migration scheme: a 'global scheduling scheme' allows task migration between processors, while a 'partitioned scheduling scheme' does not. (A task migration occurs when a task's execution is suspended on one processor and continues on another processor, whereas a preemption occurs when the execution of a task on a processor is suspended in order to start executing another task.)

1.2 Related Work
There are at least two well-known feasibility optimal homogeneous multiprocessor scheduling algorithms for implicit-deadline tasksets that are based on a fluid scheduling model: Proportionate-fair (Pfair) [2] and Largest Local Remaining Execution Time First (LLREF) [3]. Both Pfair and LLREF are global scheduling algorithms. A fluid model, shown in Figure 1, is the ideal schedule path of a task, where the remaining minimum execution time is represented by a straight line whose slope is the task execution speed.
By introducing the notion of fairness, Pfair ensures that at any instant no task is one or more quanta (time intervals) away from the task’s fluid path. However, the Pfair algorithm suffers from a significant runtime overhead, because tasks are split into several segments, incurring frequent algorithm invocations and task migrations.
To overcome these disadvantages, the LLREF algorithm preempts a task at only two scheduling events within each time interval [3]: one occurs when the remaining execution time of an executing task reaches zero, and the other when the difference between a task's deadline and its remaining execution time reaches zero (i.e. its local laxity is zero).
By incorporating a DVFS scheme into the fluid model, [4] proposed the real-time static voltage and frequency scaling (RT-SVFS) algorithm, which allows the slope of a fluid schedule to vary between 0 and 1. To improve the performance of the open-loop static algorithm of [4], a closed-loop dynamic algorithm was proposed in [5], which accounts for uncertainty in task execution times. By extending [4], an energy-efficient scheduling algorithm for real-time tasks with unknown arrival times, known as sporadic real-time tasks, was proposed in [6].
Deadline Partitioning (DP) [7] is a technique that partitions time into intervals bounded by two successive task deadlines; each task is then allocated a workload in, and scheduled within, each time interval. A simple optimal scheduling algorithm, called DP-WRAP, was presented in [7]. The DP-WRAP algorithm partitions time according to the DP technique and, at each time interval, schedules the tasks using McNaughton's wrap-around algorithm [8]. McNaughton's wrap-around algorithm aligns all task workloads along a number line, starting at zero, then splits the workloads into chunks of length 1 and assigns each chunk to its own processor.
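A minimal sketch of McNaughton's wrap-around algorithm as described above (function and variable names are ours): workloads are packed along a line and wrapped at interval boundaries, so a task split at the boundary migrates between two processors but its two chunks never overlap in time.

```python
def mcnaughton_wrap_around(workloads, interval, num_procs):
    """McNaughton's wrap-around schedule for one time interval.

    Each workload must not exceed `interval` and their sum must not exceed
    num_procs * interval.  Returns, per processor, a list of
    (task_index, start, end) execution chunks.
    """
    assert all(w <= interval + 1e-12 for w in workloads)
    assert sum(workloads) <= num_procs * interval + 1e-12
    schedule = [[] for _ in range(num_procs)]
    proc, t = 0, 0.0
    for i, w in enumerate(workloads):
        remaining = w
        while remaining > 1e-12:
            chunk = min(remaining, interval - t)   # fill the current processor
            schedule[proc].append((i, t, t + chunk))
            remaining -= chunk
            t += chunk
            if t >= interval - 1e-12:              # wrap to the next processor
                proc, t = proc + 1, 0.0
    return schedule

# Three workloads on two processors over an interval of length 1;
# the second task is split across the wrap boundary and migrates.
sched = mcnaughton_wrap_around([0.6, 0.8, 0.5], 1.0, 2)
```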
However, the algorithms based on the fairness notion [3, 4, 5, 7, 9, 10] are schedulability optimal, but have rarely been applied in real systems, since they suffer from high scheduling overheads, i.e. frequent task preemptions and migrations. Recently, two schedulability optimal algorithms that are not based on the notion of fairness have been proposed. One is the RUN algorithm [11], which uses a dualization technique to reduce the multiprocessor scheduling problem to a series of uniprocessor scheduling problems. The other is U-EDF [12], which generalises the earliest deadline first (EDF) algorithm to multiprocessors by reducing the problem to EDF on a uniprocessor.
Alternatively, the multiprocessor scheduling problem can also be formulated as an optimization problem. However, since the problem is NP-hard in general [13], polynomial-time heuristic methods are often used. Examples of this approach can be found in [14], which addresses the energy-aware multiprocessor partitioning problem, and [15], which proposes an energy- and feasibility-optimal global framework for a two-type multicore platform. In general, the tasks are partitioned among the set of processors, and the running frequency is then computed. Among all feasible assignments, the one with optimal energy consumption is chosen by solving a mathematical optimization problem whose objective is to minimize energy, subject to constraints ensuring that all tasks meet their deadlines and that each task is assigned to exactly one processor. In partitioned scheduling algorithms such as [14], once a task is assigned to a specific processor, the multiprocessor scheduling problem reduces to a set of uniprocessor scheduling problems, which are well studied [16]. However, a partitioned scheduling method cannot provide an optimal schedule.

1.3 Contribution
In this paper, we propose three mathematical optimization problems to solve a periodic hard real-time task scheduling problem on homogeneous multiprocessor systems with DVFS capabilities. The first is an MINLP, which adopts the fluid model used in [2, 3, 4] to represent the scheduling dynamic. The MINLP relies on the optimal control of a suitably-defined dynamic to globally solve for a valid schedule, with solutions obtained by solving each instance over the time intervals obtained using the DP technique [3, 4, 7]. By determining the fraction of each task's execution time and the operating speed, rather than explicit task assignments, the same scheduling problem can be formulated as an NLP. Lastly, we propose an LP for systems with discrete speed levels. Our work presents homogeneous multiprocessor scheduling algorithms that are both feasibility optimal and energy optimal. Furthermore, our formulations are capable of handling any periodic taskset, i.e. with implicit, constrained or arbitrary deadlines.
1.4 Outline of Paper
This paper is organized as follows: Section 2 defines our scheduling problem in detail. Three mathematical optimization formulations to solve the same multiprocessor scheduling problem are proposed in Section 3. The simulation setup and results are presented in Section 4. Finally, conclusions and future work are discussed in Section 5.
2 Problem Formulation
2.1 Task and Processor Model
We consider a set of n periodic real-time tasks that are to be partitioned onto M identical processors, where each processor's voltage/speed can be adjusted individually. All tasks are assumed to start at the same time. The hyperperiod L is defined as the least common multiple of all task periods. The tasks can be preempted at any time, do not share resources and do not have any precedence constraints. It is assumed that δ_min ≤ M in order to guarantee the existence of a valid schedule.
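The hyperperiod can be computed directly as the least common multiple of the task periods; a small sketch, assuming integer periods:

```python
import math
from functools import reduce

def hyperperiod(periods):
    """Hyperperiod L: least common multiple of all (integer) task periods."""
    return reduce(math.lcm, periods)

# Tasks with periods 4, 6 and 10 repeat jointly every 60 time units.
L = hyperperiod([4, 6, 10])
```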
Below, we will refer to index sets over the tasks, jobs and processors. The remaining minimum execution time of job τ_{i,j} at time t will be denoted by c_{i,j}(t).
2.2 Energy Consumption Model
For CMOS processors, the total power consumption is often simply expressed as an increasing function of the form

P(s) = P_d(s) + P_s = α s^β + P_s,   (1)

where P_d(s) is the dynamic power consumption due to the charging and discharging of CMOS gates, α and β are hardware-dependent constants and the static power consumption P_s, which is mostly due to leakage current, is assumed to be either constant or zero [1].
The total energy consumption from executing a task can be expressed as the sum of the active and idle energy consumed, i.e. E = E_a + E_i, where E_a is the energy consumed while the processor is busy executing the task and E_i is the energy consumed while the processor is idle. The energy consumed by executing and completing a task at a constant speed s within a period T is

E(s) = E_a(s) + E_i(s)   (2a)
     = P_a(s) (c^min / s) + P_i (T − c^min / s)   (2b)
     = (P_a(s) − P_i) (c^min / s) + P_i T,   (2c)

where P_a(s) is the power consumption in the active interval and P_i is the power consumption during the idle period; P_i and P_s will be assumed to be constants. Note that P_a(s) − P_i is strictly greater than zero and that the last term P_i T is not a function of the speed.
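To make the energy model concrete, here is a sketch of the active-plus-idle energy computation for one job run at a constant speed. The cubic coefficients are illustrative values loosely based on the XScale numbers in Table I (about 1600 mW at full speed, 40 mW idle), not fitted constants from the paper.

```python
def active_power(s, alpha=1600.0, beta=3.0, static=0.0):
    """Active power at speed s for the generic model P(s) = alpha*s**beta + static."""
    return alpha * s**beta + static

def task_energy(c_min, period, s, idle_power=40.0):
    """Energy of one job at constant speed s: active part plus idle part.
    c_min is the minimum execution time, so the busy interval is c_min / s."""
    busy = c_min / s
    assert busy <= period, "speed too low to fit the job in its period"
    return active_power(s) * busy + idle_power * (period - busy)

# Slowing down from s = 1.0 to s = 0.5 cuts the cubic active energy
# even though the busy interval doubles.
e_fast = task_energy(c_min=1.0, period=4.0, s=1.0)
e_slow = task_energy(c_min=1.0, period=4.0, s=0.5)
```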
2.3 Scheduling as an Optimal Control Problem
The objective is to minimize the total energy consumption of executing a periodic taskset within the hyperperiod L. The scheduling problem can therefore be formulated as the following infinite-dimensional continuous-time optimal control problem:
minimize over s(·), a(·):  ∫_0^L Σ_{i,j} Σ_{m=1}^{M} [P_a(s_{i,j}(t)) − P_i] a_{i,j,m}(t) dt   (3a)
subject to
c_{i,j}((j−1)T_i) = c_i^min, for all i, j   (3b)
c_{i,j}((j−1)T_i + D_i) = 0, for all i, j   (3c)
ċ_{i,j}(t) = −s_{i,j}(t) Σ_{m=1}^{M} a_{i,j,m}(t)   (3d)
Σ_{m=1}^{M} a_{i,j,m}(t) ≤ 1, for all i, j   (3e)
Σ_{i,j} a_{i,j,m}(t) ≤ 1, for all m   (3f)
s_min ≤ s_{i,j}(t) ≤ 1, for all i, j   (3g)
a_{i,j,m}(t) ∈ {0, 1}, for all i, j, m   (3h)
where s_{i,j}(t) is the execution speed of job τ_{i,j} at time t and a_{i,j,m}(t) is used to indicate processor assignment, i.e. a_{i,j,m}(t) = 1 if and only if job τ_{i,j} is active on processor m at time t.
The initial conditions on the minimum execution time of all jobs are specified in (3b) and job deadline constraints are specified by (3c). The fluid model of the scheduling dynamic is given in (3d), where the state is c_{i,j} and the control inputs are s_{i,j} and a_{i,j,m}. The constraint that each job is assigned to at most one processor at a time is ensured by (3e) and (3f) enforces that each processor is assigned to at most one job at a time. Upper and lower bounds on the processor speed are given in (3g). The binary nature of the job assignment variables is enforced by (3h).
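The fluid dynamic (3d) can be illustrated by a simple forward-Euler simulation of one job's remaining minimum execution time; the function names and step size below are ours, chosen for illustration only.

```python
def simulate_fluid(c0, speed, active, dt, steps):
    """Forward-Euler integration of the fluid dynamic c' = -s(t)*a(t):
    the remaining minimum execution time decreases at the execution
    speed whenever the job is assigned to a processor."""
    c = c0
    trace = [c]
    for k in range(steps):
        c -= speed(k * dt) * active(k * dt) * dt
        trace.append(c)
    return trace

# A job with 1.0 time unit of remaining work, run at speed 0.5 and
# active over the whole horizon of 2.0 time units: it finishes exactly
# at the end, tracing the straight fluid path of Figure 1.
trace = simulate_fluid(1.0, lambda t: 0.5, lambda t: 1, dt=0.01, steps=200)
```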
3 Solving the Scheduling Problem with Finite-dimensional Mathematical Optimization
This section provides details on three mathematical optimization problems to solve the same scheduling problem defined in Section 2.3. The original problem (3) will be discretized by introducing piecewise constant constraints on the control inputs s and a.
Let {t_0, t_1, …, t_K}, which we will refer to as the major grid, be the set of time instances corresponding to the distinct arrival times and deadlines of all jobs within the hyperperiod L, where 0 = t_0 < t_1 < ⋯ < t_K = L.
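A sketch of constructing the major grid by Deadline Partitioning, assuming integer deadlines and periods and tasks given as (C, D, T) triples, with jobs of task i arriving every T_i time units starting at time 0:

```python
import math
from functools import reduce

def major_grid(tasks):
    """Sorted distinct job arrival times and deadlines in [0, L], where each
    task is a (C, D, T) triple and job j of a task arrives at (j-1)*T with
    absolute deadline (j-1)*T + D."""
    L = reduce(math.lcm, [t[2] for t in tasks])
    points = {0, L}
    for _, d, period in tasks:
        for j in range(L // period):
            points.add(j * period)       # arrival of job j+1
            points.add(j * period + d)   # its absolute deadline
    return sorted(points)

# Two tasks (C, D, T) with constrained deadlines split [0, L] = [0, 12]
# into intervals bounded by successive arrival/deadline instants.
grid = major_grid([(1, 3, 4), (1, 6, 6)])
```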
3.1 Mixed-integer Nonlinear Program (MINLP-DVFS)
The above scheduling problem, subject to piecewise constant constraints on the control inputs, is most naturally formulated as an MINLP, as defined below.
Though it has been observed in [3] that tasks often have to be split in order to make the entire taskset schedulable, the runtime overheads caused by the resulting context switches can jeopardize performance. Therefore, a variable discretization time step method [17] is applied on a so-called minor grid, so that the solution to our scheduling problem does not depend on the size of the discretization time step. The minor grid is a set of time instances within each major grid interval [t_k, t_{k+1}], and these instances are to be determined by solving an appropriately-defined optimization problem.
Let s_{i,j,k} and a_{i,j,m,k} be shorthand for s_{i,j}(t_k) and a_{i,j,m}(t_k), and define the discretized state sequence c_{i,j,k} := c_{i,j}(t_k).
If s_{i,j}(·) and a_{i,j,m}(·) are constant in between time instances on the minor grid, i.e.

s_{i,j}(t) = s_{i,j,k}, for all t ∈ [t_k, t_{k+1})   (5a)
a_{i,j,m}(t) = a_{i,j,m,k}, for all t ∈ [t_k, t_{k+1})   (5b)

then it is easy to show that the solution of the scheduling dynamic (3d) is given by

c_{i,j,k+1} = c_{i,j,k} − h_k s_{i,j,k} Σ_{m=1}^{M} a_{i,j,m,k},   (6a)
where
h_k := t_{k+1} − t_k.   (6b)
Let denote the set of all jobs within hyperperiod , i.e. . Define a function by such that and a function by such that .
3.2 Continuous Nonlinear Program (NLP-DVFS)
This section proposes an NLP formulation without integer variables to solve the problem in Section 2.3. The idea is to relax the binary constraints in (6i) so that the assignment variable can be interpreted as the fraction of a time interval during which a job is executed on a processor.
Moreover, the number of variables can be reduced compared to problem (6), since the processor assignment information does not help in finding the task execution order; hence the processor index is dropped from the subscripts in the notation. That is, partitioning time using only the major grid (i.e. without a minor grid) is enough to find a feasible schedule if a solution exists to the original problem (3).
Consider now the following finitedimensional NLP:
(7a)  
subject to  
(7b)  
(7c)  
(7d)  
(7e)  
(7f)  
(7g) 
where x_{i,j,k} is the fraction of the time interval [t_k, t_{k+1}] for which job τ_{i,j} is executing on a processor at speed s_{i,j,k}, and (7e) specifies that the total workload in each time interval should be less than or equal to the system capacity.
Theorem 2: Given a solution to the NLP (7), a valid schedule with the same (optimal) energy consumption can be constructed.
Proof:
A similar argument to Theorem 1 can be used to prove the existence of a solution. The simplest valid schedule can be constructed using McNaughton's wrap-around algorithm [8] for each time interval. During each interval, the taskset is schedulable since (i) the total density of the taskset does not exceed the total number of processors, which is guaranteed by constraint (7e), (ii) no task workload is greater than the interval length, which is ensured by constraint (7g), and (iii) task migration is allowed, which is our assumption. The optimal energy consumption relies on the fact that the total energy consumption does not depend on the order in which tasks are scheduled, but rather on the taskset's density, i.e. the objective value stays the same regardless of the number of discretization steps on the minor grid.
3.3 Linear Program (LP-DVFS)
Suppose now that the set of speed levels is finite, as is the case for a real processor. We denote by s^(l) the processor speed at level l ∈ {1, …, N_s}, where N_s is the total number of speed levels.
Consider now the following finitedimensional LP:
(8a)  
subject to  
(8b)  
(8c)  
(8d)  
(8e)  
(8f)  
(8g) 
where x_{i,j,k,l} is the fraction of the time interval [t_k, t_{k+1}] for which job τ_{i,j} is executing on a processor at speed level l.
Constraint (8e) ensures that a task will not run on more than one processor at a time. Constraint (8f) guarantees that a processor's workload will not exceed its capacity. Lastly, constraint (8g) provides upper and lower bounds on the fraction of a job's execution time at each speed level.
Note that, given a solution to the optimization problem (8), one could also employ McNaughton's wrap-around algorithm [8] to construct a valid schedule of processor assignments with the same energy consumption. In other words, Theorem 2 can be applied here as well to prove the existence of a valid schedule given a solution to problem (8).
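The LP itself needs a solver, but the structure it exploits can be illustrated with a known property of discrete-speed DVFS under a convex power function: a job's workload is best served by mixing (at most) two speed levels adjacent to the ideal continuous speed. The following enumeration sketch (not the paper's LP; the function names and the cubic power model are illustrative) finds the cheapest two-level mix that finishes a job exactly within its window.

```python
def best_two_level_split(c_min, window, levels, power):
    """Finish exactly c_min worth of work (measured at s = 1) within `window`
    using a mix of two discrete speed levels, minimising active energy.

    Tries every level pair (run x time units at s_a and window - x at s_b,
    with s_a*x + s_b*(window - x) = c_min) and keeps the cheapest feasible one.
    """
    best = None
    for i, sa in enumerate(levels):
        for sb in levels[i:]:
            if sa == sb:
                if abs(sa * window - c_min) > 1e-9:
                    continue            # single level only works if it fits exactly
                x = window
            else:
                # solve sa*x + sb*(window - x) = c_min for x
                x = (sb * window - c_min) / (sb - sa)
                if x < -1e-9 or x > window + 1e-9:
                    continue            # infeasible split
            energy = power(sa) * x + power(sb) * (window - x)
            if best is None or energy < best[0]:
                best = (energy, sa, sb, x)
    return best

levels = [0.15, 0.4, 0.6, 0.8, 1.0]   # XScale-like speed levels (Table I)
cubic = lambda s: 1600.0 * s**3       # illustrative cubic active power model
energy, sa, sb, x = best_two_level_split(c_min=1.0, window=2.0,
                                         levels=levels, power=cubic)
# The ideal continuous speed is 0.5, so the optimum mixes the two
# adjacent levels 0.4 and 0.6.
```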
4 Simulation Results
4.1 System, Processor and Task Models
The performance of the above optimization problems is evaluated on models of two commercial processors, namely the XScale and the PowerPC 405LP. The power consumption details of the two processors, which have also been used in [18, 19, 20], are given in Table I. The active power consumption models of the XScale and PowerPC 405LP shown in Table II were obtained by polynomial curve fitting to the generic form (1) (details are given in Appendix B). Plots of the actual data versus the fitted models are shown in Fig. 4.
Table I: Processor frequency, voltage and power consumption details

Processor type       XScale [21]                      PowerPC 405LP [22]
Frequency (MHz)      150   400   600   800   1000    33    100   266   333
Speed                0.15  0.4   0.6   0.8   1.0     0.1   0.3   0.8   1.0
Voltage (V)          0.75  1.0   1.3   1.6   1.8     1.0   1.0   1.8   1.9
Active Power (mW)    80    170   400   900   1600    19    72    600   750
Idle Power (mW)      40 [18]                         12

Table II: Fitted active power consumption models

Processor type    Active Power Model    MAPE (%)
XScale                                  1.1236
PowerPC 405LP                           5.2323
4.2 Comparison between Algorithms
For a system with a continuous speed range, four algorithms were compared: (i) MINLP-DVFS, (ii) NLP-DVFS, (iii) GP-SVFS, which represents a global energy/feasibility-optimal workload partitioning with a constant frequency scaling scheme and (iv) GP-NoDVFS, which is a global workload allocation scheme without frequency scaling. For a system with discrete speed levels, three algorithms were compared: (i) LP-DVFS, (ii) GP-NoDVFS and (iii) GP-SDiscrete, which represents a global energy/feasibility-optimal workload allocation with constant discrete frequency scaling. Note that the formulations of GP-SVFS and GP-SDiscrete are similar to [14, 15]. Specifically, GP-SVFS is based on constant frequency scaling with a global scheduling scheme, while [14] is a partitioning-based formulation and [15] is a generalized formulation for a two-type heterogeneous multicore system. GP-SDiscrete is an extension to systems with discrete speed levels. Details on GP-SVFS, GP-NoDVFS and GP-SDiscrete are given below.
GP-SVFS/GP-NoDVFS: Solve
(9a)  
(9b)  
(9c)  
(9d)  
(9e) 
where δ_p is the task density assigned to processor p and s_p is the static execution speed of processor p. A task will not be executed on more than one processor at the same time due to (9b), and the assigned workload will not exceed processor capacity due to (9c). The difference between GP-SVFS and GP-NoDVFS lies in the restriction on the operating speed: it is either a continuous variable (9d) or fixed at the maximum speed (9e).
GP-SDiscrete: Determine the fraction of each task's workload executed at each speed level and a speed level selection for each processor such that:
(10a)  
(10b)  
(10c)  
(10d)  
(10e)  
(10f)  
(10g) 
where w_{i,p,l} represents the fraction of the workload of task i executed on processor p at speed level l and z_{p,l} is a speed level selection variable for processor p, i.e. z_{p,l} = 1 if speed level l is selected and z_{p,l} = 0 otherwise. The constraints in (10b) guarantee that the total workload of each task is fully assigned. The constraints in (10c) ensure that only one speed level is selected per processor, (10d) ensures that a task will be executed on only one processor at a time, (10f) ensures that a processor's workload capacity is not violated and (10g) states that the speed level selection variable is binary. Note that GP-SVFS and GP-NoDVFS are NLPs and GP-SDiscrete is an MINLP.
Table III: Simulation tasksets, indexed by minimum taskset density (each task given as (c_min, D, T))

δ_min   τ1            τ2            τ3           τ4
0.4     (0.75,5,10)   (0.75,5,10)   (0.5,10,10)  (0.5,10,10)
0.6     (1,5,10)      (1,5,10)      (1,10,10)    (1,10,10)
0.8     (1.5,5,10)    (1.5,5,10)    (1,10,10)    (1,10,10)
1.0     (2,5,10)      (2,5,10)      (1,10,10)    (1,10,10)
1.2     (2.5,5,10)    (2.5,5,10)    (1,10,10)    (1,10,10)
1.4     (3,5,10)      (3,5,10)      (1,10,10)    (1,10,10)
1.6     (3,5,10)      (3,5,10)      (2,10,10)    (2,10,10)
1.8     (3.5,5,10)    (3.5,5,10)    (2,10,10)    (2,10,10)
2.0     (4,5,10)      (4,5,10)      (2,10,10)    (2,10,10)

Note: The first parameter of each task is the minimum execution time c_min; the cycle count C can be obtained by multiplying c_min by f_max.
4.3 Simulation Setup and Results
For simplicity, we consider the case where four independent periodic real-time tasks with constrained deadlines need to be scheduled onto two homogeneous processors. The total energy consumption of each taskset in Table III was evaluated. For the MINLP-DVFS, NLP-DVFS and LP-DVFS implementations, the major grid has only two intervals, because all tasksets in the simulation have only two distinct deadlines. For the MINLP-DVFS, we chose the minor grid discretization step accordingly, since there are at most four jobs in each major grid interval, which implies that there will be at most four preemptions within each major grid interval. All of our formulations were modelled using ZIMPL [23] and solved by SCIP [24].
Simulation results are shown in Figures 7 and 10. The minimum taskset density is represented on the horizontal axis. The vertical axis represents the total energy consumption normalised by the energy used by GP-NoDVFS, i.e. a value less than 1 means that an algorithm does better than GP-NoDVFS. It can be seen from the plots that, for a time-varying workload such as a constrained-deadline taskset, the results from solving MINLP-DVFS, NLP-DVFS and LP-DVFS are energy optimal, while GP-SVFS, GP-SDiscrete and GP-NoDVFS are not. This is because our formulations incorporate time, which is beneficial when solving a scheduling problem with either a time-varying workload (constrained-deadline taskset) or a constant workload (implicit-deadline taskset). In general, compared to a constant speed profile, the saving from a time-varying speed profile increases as the minimum taskset density increases. It can also be noticed that the percentage saving of the XScale is nonlinear compared to that of the PowerPC. This is because the power consumption model of the XScale is cubic, while the PowerPC's is quadratic. It has to be mentioned, however, that the energy saving percentage varies with the taskset, i.e. the numbers shown on the plots may vary, but the significant outcomes stay the same. Lastly, the computation times to solve NLP-DVFS, LP-DVFS, GP-SVFS and GP-SDiscrete are very short, i.e. milliseconds on a general-purpose desktop PC with off-the-shelf optimization solvers, while the time to solve MINLP-DVFS can be up to an hour in some cases.
5 Conclusions
Three mathematical optimization problems were proposed to solve a homogeneous multiprocessor scheduling problem with a periodic real-time taskset. Though our MINLP and NLP formulations are both energy optimal and feasibility optimal, their computation time is high compared with heuristic algorithms. However, our LP formulation is computationally tractable and suitable for a system with a discrete set of speed levels. Moreover, we have shown via simulations that our formulations are able to solve a more general class of scheduling problems than existing work in the literature, due to the incorporation of a scheduling dynamic model in the formulations as well as allowing for a time-varying execution speed profile. The simulation results illustrate that a time-varying speed profile can save up to 70% energy compared to a constant speed profile. Possible future work includes developing numerically efficient methods to solve the various mathematical optimization problems defined in this paper. One could also extend the ideas presented here to solve a dynamic scheduling problem with uncertainty in task execution times and to include slack reclamation for further energy reduction.
References
 [1] J. M. Rabaey, A. P. Chandrakasan, and B. Nikolic, Digital integrated circuits : a design perspective, 2nd ed., ser. Prentice Hall electronics and VLSI series. Pearson Education, Jan. 2003.

[2] S. K. Baruah, N. K. Cohen, C. G. Plaxton, and D. A. Varvel, “Proportionate progress: A notion of fairness in resource allocation,” in Proceedings of the Twenty-fifth Annual ACM Symposium on Theory of Computing, ser. STOC ’93. New York, NY, USA: ACM, 1993, pp. 345–354. [Online]. Available: http://doi.acm.org/10.1145/167088.167194
 [3] H. Cho, B. Ravindran, and E. Jensen, “An optimal real-time scheduling algorithm for multiprocessors,” in Real-Time Systems Symposium, 2006. RTSS ’06. 27th IEEE International, Dec 2006, pp. 101–110.
 [4] K. Funaoka, S. Kato, and N. Yamasaki, “Energy-efficient optimal real-time scheduling on multiprocessors,” in Object Oriented Real-Time Distributed Computing (ISORC), 2008 11th IEEE International Symposium on, May 2008, pp. 23–30.
 [5] K. Funaoka, A. Takeda, S. Kato, and N. Yamasaki, “Dynamic voltage and frequency scaling for optimal real-time scheduling on multiprocessors,” in Industrial Embedded Systems, 2008. SIES 2008. International Symposium on, June 2008, pp. 27–33.
 [6] D.-S. Zhang, F.-Y. Chen, H.-H. Li, S.-Y. Jin, and D.-K. Guo, “An energy-efficient scheduling algorithm for sporadic real-time tasks in multiprocessor systems,” in High Performance Computing and Communications (HPCC), 2011 IEEE 13th International Conference on, Sept 2011, pp. 187–194.
 [7] G. Levin, S. Funk, C. Sadowski, I. Pye, and S. Brandt, “DP-Fair: A simple model for understanding optimal multiprocessor scheduling,” in Real-Time Systems (ECRTS), 2010 22nd Euromicro Conference on, July 2010, pp. 3–13.

[8] R. McNaughton, “Scheduling with deadlines and loss functions,” Management Science, vol. 6(1), pp. 1–12, October 1959.
 [9] S. Funk, V. Berten, C. Ho, and J. Goossens, “A global optimal scheduling algorithm for multiprocessor low-power platforms,” in Proceedings of the 20th International Conference on Real-Time and Network Systems, ser. RTNS ’12. New York, NY, USA: ACM, 2012, pp. 71–80. [Online]. Available: http://doi.acm.org/10.1145/2392987.2392996
 [10] F. Wu, S. Jin, and Y. Wang, “A simple model for the energy-efficient optimal real-time multiprocessor scheduling,” in Computer Science and Automation Engineering (CSAE), 2012 IEEE International Conference on, vol. 3, May 2012, pp. 18–21.
 [11] P. Regnier, G. Lima, E. Massa, G. Levin, and S. Brandt, “RUN: Optimal multiprocessor real-time scheduling via reduction to uniprocessor,” in Real-Time Systems Symposium (RTSS), 2011 IEEE 32nd, Nov 2011, pp. 104–115.
 [12] G. Nelissen, V. Berten, V. Nelis, J. Goossens, and D. Milojevic, “U-EDF: An unfair but optimal multiprocessor scheduling algorithm for sporadic tasks,” in Real-Time Systems (ECRTS), 2012 24th Euromicro Conference on, July 2012, pp. 13–23.
 [13] E. Lawler, “Recent results in the theory of machine scheduling,” in Mathematical Programming: The State of the Art, A. Bachem, B. Korte, and M. Grötschel, Eds. Springer Berlin Heidelberg, 1983, pp. 202–234. [Online]. Available: http://dx.doi.org/10.1007/9783642688744_9
 [14] H. Aydin and Q. Yang, “Energy-aware partitioning for multiprocessor real-time systems,” in Parallel and Distributed Processing Symposium, 2003. Proceedings. International, April 2003, pp. 9 pp.–.
 [15] H. S. Chwa, J. Seo, H. Yoo, J. Lee, and I. Shin, “Energy and feasibility optimal global scheduling framework on big.LITTLE platforms,” Department of Computer Science, KAIST and Department of Computer Science and Engineering, Sungkyunkwan University, Republic of Korea, Tech. Rep., 2014. [Online]. Available: https://cs.kaist.ac.kr/upload_files/report/1407392146.pdf
 [16] J.-J. Chen and C.-F. Kuo, “Energy-efficient scheduling for real-time systems on dynamic voltage scaling (DVS) platforms,” in Embedded and Real-Time Computing Systems and Applications, 2007. RTCSA 2007. 13th IEEE International Conference on, Aug 2007, pp. 28–38.
 [17] M. Gerdts, “A variable time transformation method for mixed-integer optimal control problems,” Optimal Control Applications and Methods, vol. 27, no. 3, pp. 169–182, 2006. [Online]. Available: http://dx.doi.org/10.1002/oca.778
 [18] R. Xu, C. Xi, R. Melhem, and D. Mossé, “Practical PACE for embedded systems,” in Proceedings of the 4th ACM International Conference on Embedded Software, ser. EMSOFT ’04. New York, NY, USA: ACM, 2004, pp. 54–63. [Online]. Available: http://doi.acm.org/10.1145/1017753.1017767
 [19] G. Zeng, T. Yokoyama, H. Tomiyama, and H. Takada, “Practical energy-aware scheduling for real-time multiprocessor systems,” in Embedded and Real-Time Computing Systems and Applications, 2009. RTCSA ’09. 15th IEEE International Conference on, Aug 2009, pp. 383–392.
 [20] H.-C. Wang, I. Woungang, C.-W. Yao, A. Anpalagan, and M. S. Obaidat, “Energy-efficient tasks scheduling algorithm for real-time multiprocessor embedded systems,” J. Supercomput., vol. 62, no. 2, pp. 967–988, Nov. 2012. [Online]. Available: http://dx.doi.org/10.1007/s1122701207710
 [21] I. X. M. Benchmarks, 2005, http://web.archive.org/web/20050326232506/developer.intel.com/design/intelxscale/benchmarks.htm.
 [22] C. Rusu, R. Xu, R. Melhem, and D. Mossé, “Energyefficient policies for requestdriven soft realtime systems,” in RealTime Systems, 2004. ECRTS 2004. Proceedings. 16th Euromicro Conference on, June 2004, pp. 175–183.
 [23] T. Koch, “Rapid mathematical prototyping,” Ph.D. dissertation, Technische Universität Berlin, 2004.
 [24] T. Achterberg, “SCIP: Solving constraint integer programs,” Mathematical Programming Computation, vol. 1, no. 1, pp. 1–41, July 2009, http://mpc.zib.de/index.php/MPC/article/view/4.
Appendix A Mean Absolute Percentage Error (MAPE)

MAPE = (100/N) Σ_{k=1}^{N} |ŷ(x_k) − y_k| / |y_k|   (11)

where |ŷ(x_k) − y_k| / |y_k| is the magnitude of the relative error in the k-th measurement, ŷ is the estimated function, x_k is the input data, y_k is the actual data and N is the total number of fitted points.

Appendix B Curve Fitting
The following shows the curve fitting formulation used to obtain the active power consumption functions of the XScale and PowerPC 405LP, with the objective of minimizing the MAPE:
subject to  
(12a)  
(12b)  
(12c) 
where α and β are the parameters to be fitted and N_s is the total number of speed levels of a processor. The speed levels and the measured active power consumption of each processor are the data shown in Table I.
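As a rough check of the fitting procedure, the following pure-Python sketch fits the cubic form P(s) = α s³ + P_s to the XScale measurements from Table I by plain least squares (the paper minimises the MAPE instead, so the resulting coefficients differ somewhat from the fit in Table II) and then evaluates the MAPE of (11):

```python
def fit_cubic_power(speeds, powers):
    """Least-squares fit of P(s) = alpha * s**3 + p_static.
    The model is linear in s**3, so a closed-form two-parameter
    linear regression suffices (least squares, not MAPE minimisation)."""
    xs = [s**3 for s in speeds]
    n = len(xs)
    sx, sy = sum(xs), sum(powers)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, powers))
    alpha = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    p_static = (sy - alpha * sx) / n
    return alpha, p_static

def mape(actual, predicted):
    """Mean absolute percentage error, as in (11)."""
    return 100.0 / len(actual) * sum(abs(p - a) / abs(a)
                                     for a, p in zip(actual, predicted))

# XScale measurements from Table I
speeds = [0.15, 0.4, 0.6, 0.8, 1.0]
powers = [80.0, 170.0, 400.0, 900.0, 1600.0]
alpha, p_static = fit_cubic_power(speeds, powers)
fitted = [alpha * s**3 + p_static for s in speeds]
err = mape(powers, fitted)
```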