
Optimising energy and overhead for large parameter space simulations

Many systems require optimisation over multiple objectives, where objectives are characteristics of the system such as energy consumed or increase in time to perform the work. Optimisation is performed by selecting the ‘best’ set of input parameters to elicit the desired objectives. However, the parameter search space can often be far larger than can be searched in a reasonable time. Additionally, the objectives are often mutually exclusive – leading to a decision being made as to which objective is more important or optimising over a combination of the objectives. This work applies a Genetic Algorithm to identify the Pareto frontier of optimal parameter sets for all combinations of objectives. A Pareto frontier can be used to identify the sets of optimal parameters for which each is the ‘best’ for a given combination of objectives – thus allowing decisions to be made with full knowledge. We demonstrate this approach for the HTC-Sim simulation system in the case where a Reinforcement Learning scheduler is tuned for the two objectives of energy consumption and task overhead, showing that it can reduce the energy consumed by approximately 36% without significantly increasing the overhead.





I Introduction

There is a strong desire to model real-world systems through computer simulation – a software system which replicates the salient features of the real-world system. This permits ‘what if’ analysis, where one desires to know how the real-world system will be affected by changes in environment or policy. This is especially important when the proposed changes to the real system would be unpalatable to perform – such as costing too much, having significant impact or potentially causing a degradation of service.

In recent years the concept of the ‘digital twin’, a simulation of a specific instance rather than a generic type of system, has emerged. This allows ‘what if’ analysis to be performed on the digital twin, with the findings then applied to the real system – for example, optimising the parameters controlling how the system performs. Traditionally this would be very difficult to perform on the real system due to fears that changes could have unforeseen detrimental impacts. However, by making the changes to the digital twin we remove this risk and can perform many simulations faster-than-real-time in order to identify the ‘optimal’ set of parameters.

One may assume that finding the optimal set of parameters, where we are optimising over a single output metric – referred to as an objective – is just the process of running the simulation many times until we find the ‘best’ set. Unfortunately, far too often, this is not the case. The range over which parameters can vary and the number of possible parameters can make the search space far larger than what can be feasibly (or economically) searched. One may conclude that each individual parameter may be optimised in isolation. However, if the relationship between parameters and the objective is complex, then the optimal value for one parameter in isolation may not be part of the global optimum.

This complexity can be compounded when one wishes to optimise for multiple objectives, for example the energy used by a system and the increase in time to perform the work – overhead. If one is fortunate, these objectives are mutually constructive and this degrades to a single optimisation case. However, in most cases multiple objectives are mutually destructive. In our example using only the lowest energy consuming computers could minimise energy consumption, though at the expense of delaying work completion when the number of available low-energy computers is insufficient.

In order to deal with optimising over multiple objectives one may choose to optimise for one objective over the other(s) or to optimise for a combination of them. However, this removes full transparency of the interplay between optimising for the different objectives – diminishing the ability for decisions to be made from full knowledge.

We overcome the search space and multiple objectives problems by applying a Genetic Algorithm (GA) [14] to generate a Pareto frontier [16, 18] using the Non-Dominated Sorting Genetic Algorithm II (NSGA-II) [5]. Using a GA allows us to quickly identify those parameters which lead to optimal output by selecting parameter sets which are mutations of the best sets identified in previous generations. Using NSGA-II allows us to identify those parameters which lead to objectives which lie along the Pareto frontier – a curve which identifies those points for which there is no (yet identified) combination of parameters which would improve one of the objectives without diminishing the others. Note that, as we do not try every combination of parameters, it may be possible to improve on these points.

We exemplify this for the digital twin tailored from the HTC-Sim [7] simulation system of a high-throughput computing (HTC) setup – specifically HTCondor [12] at Newcastle University. Our scheduler, which chooses which resources should run which tasks, employs a Reinforcement Learning [19] (RL) approach based on the work by McGough et al. [13]. (Here a task, sometimes referred to as a job, is a single executable which is run on one computer within the HTCondor system.) We identify twenty parameters from this work which can be used to configure the RL scheduler and consider the two objectives of energy consumed by the system and average task overhead – the difference between time in the system and execution time.

The rest of this paper is set out as follows. In Section II we motivate the need for a GA along with Pareto frontier for the HTCondor RL scheduler. Related work is presented in Section III followed by a discussion of the optimisation method in Section IV. Section V presents the simulation environment. Results are presented in Section VI. We offer conclusions and identify future directions in Section VII.

II Motivation

Parameter spaces for simulations rapidly become large as the number of parameters and valid values increase. This is perhaps why in their work McGough et al. [13] only ever vary two input parameters at a time and even then only consider a maximum of eleven different values for each of these parameters – leading to 121 different simulations.

Continuous value parameters are the hardest to deal with as selecting the size of discretisation is vitally important – too small will lead to excessively large numbers of simulations, whilst too large means one is more likely to miss the optimal value. Integer values are similar in complexity but there is a minimum level of discretisation – the unit value.

Let us assume here that we wish to perform a parameter sweep over just six continuous values for a simulation which takes just five minutes per run. If we discretise each of the continuous parameters to one hundred values then we would require

$100^6 = 10^{12}$

simulation runs, which is just over 9.5 million years of execution time. If we were to restrict the discretisation to just ten values per parameter this would reduce our parameter space to one million simulation runs and 9.5 years of execution time. Either case is far in excess of what can be performed – and would be a significant energy drain in its own right. Thus the use of a machine learning optimisation approach is highly desirable here.
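The arithmetic behind these figures can be checked with a short sketch. The function name `sweep_cost` and the 365-day year are assumptions made for illustration; the five-minute run time and the discretisation levels come from the text.

```python
MINUTES_PER_YEAR = 60 * 24 * 365  # assumes a 365-day year

def sweep_cost(num_params, values_per_param, minutes_per_run=5.0):
    """Return (number of runs, total years) for an exhaustive grid sweep."""
    runs = values_per_param ** num_params
    years = runs * minutes_per_run / MINUTES_PER_YEAR
    return runs, years

# Six parameters at one hundred values each, then at ten values each.
fine_runs, fine_years = sweep_cost(6, 100)
coarse_runs, coarse_years = sweep_cost(6, 10)
print(f"{fine_runs:.0e} runs, about {fine_years / 1e6:.2f} million years")
print(f"{coarse_runs:.0e} runs, about {coarse_years:.2f} years")
```

Because the cost grows as a power of the number of parameters, each additional parameter multiplies the sweep time by the number of values it can take.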

We use Figure 1, adapted from [13], to illustrate the motivation for identifying a Pareto frontier. The variation in colours represents the different learning rates of the RL approach whilst the spread of each colour represents variations in how much importance is placed on the computers selected. It can be seen from this figure that there is no global optimum which minimises both energy and average overhead. This figure demonstrates that a Pareto frontier is present, though, due to the small number of sample points, this is most likely not the actual Pareto frontier. For illustrative purposes we add here (black points) the Pareto frontier identified in our work, demonstrating the savings which can be made by identifying more optimal parameter sets.

Fig. 1: Overhead vs Energy, adapted from [13], along with Pareto frontier

III Related Work

Multi-objective optimisation problems are commonplace, with applications ranging from electoral zone design [17] to generation expansion planning [10]. Here we review various applications that have utilized multi-objective optimization.

Multi-objective optimization has been used in many different fields, and many multi-objective problems have been solved with Non-Dominated Sorting Genetic Algorithm II (NSGA-II) [5]. There are, however, other algorithms which are used such as Multi-Objective Genetic Algorithm [15].

Ponsich et al. apply NSGA-II to electoral zone design [17]. The criteria in which the geographical units must be aggregated are population equality, compactness and contiguity. They found that NSGA-II obtains promising results when compared with simulated annealing, producing better-distributed solutions over a wider-spread front.

Kannan et al. used NSGA-II for the generation expansion planning problem [10]. They sought to identify which generating units should be commissioned and when they should become available over the long-term planning horizon. They optimised two pairs of trade-off objectives: minimising cost against minimising the sum of normalized constraint violations, and minimising investment cost against minimising outage cost. They were able to find a Pareto front with high computational efficiency.

Wei et al. used NSGA-II to optimize energy consumption and indoor environment thermal performance [21], using simulation data containing energy consumption and indoor thermal comfort. Their fitness function for NSGA-II comprised a back propagation network, optimised by a genetic algorithm, which characterizes building behaviour.

Guha et al. used multi-objective optimization to design a ship hull [8]. As their objective functions were not smooth, they found evolutionary techniques the most practical. They tested a number of different algorithms, and found that Sequential Quadratic Programming, Pattern Search and Interior-Point were very sensitive to the initial guess and prone to getting stuck in local minima. The genetic algorithm and particle swarm optimisation proved to be more robust and able to determine the global minima in most trials.

IV Optimization methods

Classical optimization methods, such as non-linear programming, find single solutions per simulation run. However, many real-world problems naturally have multiple objectives to optimise. Traditionally, such problems are handled by converting them into a single-objective problem. However, this does not take into account the various trade-offs between equally optimal (Pareto-optimal) solutions. It is therefore important to find multiple Pareto-optimal solutions. A Pareto frontier is made up of many Pareto-optimal solutions. These can be displayed graphically, allowing a user to choose between various solutions and trade-offs.

Classical methods require multiple applications of an optimization algorithm, with various scalings between rewards to achieve a single reward. The population approach of genetic algorithms, however, enables the Pareto frontier to be found in relatively few simulation runs. NSGA-II is a multi-objective genetic algorithm and is used here.

IV-A Genetic Algorithms

GAs [9] are a class of evolutionary algorithms. We detail the workings of genetic algorithms in this section.

An initial population of structures $P(0)$, for generation 0, is generated and each individual is evaluated for fitness. A subset of individuals, $C(t)$, are chosen for mating, selected proportional to their fitness. ‘Fitter’ individuals have a higher chance of reproducing to create the offspring group $C'(t)$. The offspring $C'(t)$ have characteristics dependent on the genetic operators: crossover and mutation. The genetic operators are an implementation decision [2].

Once the offspring have been created, the new population $P(t)$ is formed by merging individuals from $C'(t)$ and $P(t-1)$. See Algorithm 1 for detailed pseudocode.

$t = 0$
initialise $P(t)$
evaluate structures in $P(t)$
while termination condition not satisfied do
     $t = t + 1$
     select reproduction $C(t)$ from $P(t-1)$
     recombine and mutate structures in $C(t)$ forming $C'(t)$
     evaluate structures in $C'(t)$
     select each individual for $P(t)$ from $C'(t)$ or $P(t-1)$
end while
Algorithm 1 Genetic algorithm [2]
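To make the loop concrete, the following is a minimal sketch of a single-objective GA on bit strings. It is not the implementation used in this work: the one-point crossover, bit-flip mutation, elitist merge and all names and default values are assumptions chosen for illustration, and fitness is assumed to be non-negative so that it can be used as a selection weight.

```python
import random

def genetic_algorithm(fitness, genome_length, pop_size=20, generations=40,
                      mutation_rate=0.02, seed=0):
    """Toy GA over bit strings, following the select/recombine/replace loop."""
    rng = random.Random(seed)
    # Initialise the first population with random bit strings.
    population = [[rng.randint(0, 1) for _ in range(genome_length)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Select parents proportional to fitness (+1 avoids all-zero weights).
        weights = [fitness(ind) + 1 for ind in population]
        parents = rng.choices(population, weights=weights, k=pop_size)
        # Recombine pairs of parents and mutate the children.
        offspring = []
        for a, b in zip(parents[0::2], parents[1::2]):
            cut = rng.randrange(1, genome_length)  # one-point crossover
            for child in (a[:cut] + b[cut:], b[:cut] + a[cut:]):
                offspring.append([bit ^ 1 if rng.random() < mutation_rate else bit
                                  for bit in child])
        # Merge parents and offspring, keeping the fittest pop_size individuals.
        population = sorted(population + offspring, key=fitness, reverse=True)[:pop_size]
    return max(population, key=fitness)

# Example: maximise the number of ones in a 16-bit string.
best = genetic_algorithm(fitness=sum, genome_length=16)
initial_best = genetic_algorithm(fitness=sum, genome_length=16, generations=0)
```

Because the merge step keeps the fittest individuals from both groups, the best fitness in the population can never decrease from one generation to the next.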

IV-B NSGA-II

NSGA-II is efficient for multi-objective optimization on a number of benchmark problems and finds a better spread of solutions than Pareto Archived Evolution Strategy (PAES) [11] and Strength Pareto EA (SPEA) [22] when approximating the true Pareto-optimal front [5].

The majority of multi-objective optimization algorithms use the concept of domination during population selection [4]. A non-dominated genetic algorithm seeks the Pareto-optimal solutions, so no solution in the final set should dominate another. An individual solution $x_1$ is said to dominate another $x_2$ if and only if there is no objective of $x_1$ that is worse than the same objective of $x_2$ and at least one objective of $x_1$ is better than the same objective of $x_2$ [3]. Non-dominated sorting is the process of finding a set of solutions which do not dominate each other and make up the Pareto front. A Pareto front contains solutions that are not dominated by any other solution; compared with any other solution on the front, each is better in at least one objective. See Figure 2a for a visual representation, where $f_1$ and $f_2$ are two objectives to minimise.

We define a process to determine which solutions to keep:

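IV-B1 Non-dominated sorting

We assume that there are $M$ objective functions to minimise, and that $x_1$ and $x_2$ are two solutions. $f_j(x_1) < f_j(x_2)$ implies solution $x_1$ is better than solution $x_2$ on objective $j$. A solution $x_1$ is said to dominate the solution $x_2$ if the following conditions are true:

  1. The solution $x_1$ is no worse than $x_2$ in every objective. I.e. $f_j(x_1) \leq f_j(x_2) \; \forall j \in \{1, \ldots, M\}$.

  2. The solution $x_1$ is better than $x_2$ in at least one objective. I.e. $\exists j \in \{1, \ldots, M\}$ such that $f_j(x_1) < f_j(x_2)$.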
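The two conditions translate directly into a short predicate. This is an illustrative sketch: the function name `dominates` and the example (energy, overhead) values are hypothetical, and all objectives are assumed to be minimised.

```python
def dominates(a, b):
    """True if objective vector a dominates b (all objectives minimised)."""
    no_worse = all(x <= y for x, y in zip(a, b))       # condition 1
    strictly_better = any(x < y for x, y in zip(a, b))  # condition 2
    return no_worse and strictly_better

# Hypothetical (energy, overhead) pairs.
print(dominates((1.0, 2.0), (2.0, 2.0)))  # better on energy, equal on overhead
print(dominates((1.0, 3.0), (2.0, 2.0)))  # a trade-off: neither dominates
```

Two identical objective vectors do not dominate each other, since the second condition fails.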

Fig. 2: a) Schematic of non-dominated sorting with solution layering b) Schematic of the NSGA-II procedure

Once the solutions are calculated for the objective functions, the solutions are sorted according to their level of non-domination. An example of layering of levels is shown in Figure 2a. Here, $f_1$ and $f_2$ are the objective functions to be minimized. The Pareto front is the first front, which contains solutions that are not dominated by any other solution. The solutions in layer 1 are dominated only by those in the Pareto front, and are not dominated by those in layer 2 or layer 3.

The solutions are then ranked according to their layer. Solutions in the Pareto front are given a fitness rank ($i_{rank}$) of 1, solutions in layer 1 have $i_{rank}$ of 2, etc.
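The layering can be sketched by repeatedly peeling off the solutions that nothing else in the remaining set dominates. This is a simple quadratic-per-layer version for illustration, not the bookkeeping-based fast sort of [5]; the function name and sample points are assumptions.

```python
def non_dominated_sort(points):
    """Return a list of fronts; each front is a sorted list of indices into points.

    fronts[0] is the Pareto front (rank 1), fronts[1] is layer 1 (rank 2), and so on.
    """
    def dominates(a, b):
        return (all(x <= y for x, y in zip(a, b))
                and any(x < y for x, y in zip(a, b)))

    remaining = set(range(len(points)))
    fronts = []
    while remaining:
        # A point joins the current front if no other remaining point dominates it.
        front = sorted(i for i in remaining
                       if not any(dominates(points[j], points[i])
                                  for j in remaining if j != i))
        fronts.append(front)
        remaining -= set(front)
    return fronts

# Hypothetical objective pairs, both to be minimised.
points = [(1, 5), (2, 2), (5, 1), (3, 3), (4, 4), (6, 6)]
fronts = non_dominated_sort(points)
rank = {i: r + 1 for r, front in enumerate(fronts) for i in front}
```

Here the first three points trade off against each other and share rank 1, while each remaining point is dominated by the one before it and so falls into its own layer.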

IV-B2 Density Estimation

The crowding distance ($i_{distance}$) is computed for each solution $i$ as the average distance between the two closest points to the solution in question, and is an estimate of the largest cuboid which contains only $i$ and no other points.
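A minimal sketch of the density estimate for a single front follows. It uses the per-objective normalised form from the NSGA-II paper [5], in which boundary solutions receive an infinite distance so that they are always retained; the function name and sample front are assumptions.

```python
def crowding_distance(front):
    """Crowding distance for each objective vector in one front."""
    n = len(front)
    distance = [0.0] * n
    if n == 0:
        return distance
    for m in range(len(front[0])):
        # Order the solutions along objective m.
        order = sorted(range(n), key=lambda i: front[i][m])
        span = front[order[-1]][m] - front[order[0]][m]
        # Boundary solutions are given infinite distance so they are kept.
        distance[order[0]] = distance[order[-1]] = float("inf")
        if span == 0:
            continue
        for k in range(1, n - 1):
            # Gap between the two neighbours, normalised by the objective range.
            gap = front[order[k + 1]][m] - front[order[k - 1]][m]
            distance[order[k]] += gap / span
    return distance

front = [(1.0, 5.0), (2.0, 2.0), (5.0, 1.0)]
distances = crowding_distance(front)
```

A larger value means the solution sits in a sparser region of the front, which is what the selection step favours.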

IV-B3 Crowded comparison operator

The crowded comparison operator ($\prec_n$) is used to ensure that the final frontier is an evenly spread out Pareto-optimal front. Each solution $i$ has two attributes: $i_{rank}$ and $i_{distance}$. We can then define a partial order:
$i \prec_n j$ if $i_{rank} < j_{rank}$, or $i_{rank} = j_{rank}$ and $i_{distance} > j_{distance}$ [5].

That is, a point with a lower rank is preferred, and if two points have the same rank the point which is located in a less dense area is preferred.
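The ordering can be written as a small predicate over (rank, crowding distance) pairs. The pair representation and the function name are assumptions made for this sketch.

```python
def crowded_less(a, b):
    """True if solution a is preferred to b; each is a (rank, distance) pair."""
    rank_a, dist_a = a
    rank_b, dist_b = b
    return rank_a < rank_b or (rank_a == rank_b and dist_a > dist_b)

# Sorting by (rank, -distance) places the preferred solutions first.
candidates = [(2, 9.0), (1, 0.5), (1, 2.0)]
ordered = sorted(candidates, key=lambda s: (s[0], -s[1]))
```

Rank always takes priority, so a crowded solution on a better front is still preferred to an isolated one on a worse front.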

IV-B4 Main loop

As with a standard GA, a random population is created. This is then sorted according to non-domination. Binary tournament selection, recombination and mutation operators are used to create a child population of size $N$. Tournament selection is a process of evaluating and comparing the fitness of various individuals in a population. Binary tournament selection begins by selecting two individuals at random, evaluating their fitnesses, and selecting the individual with the better solution [1].

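$R_t = P_t \cup Q_t$ combine parent and child population
$F$ = fast-non-dominated-sort($R_t$) where $F = (F_1, F_2, \ldots)$
$P_{t+1} = \emptyset$, $i = 1$
while $|P_{t+1}| < N$ do
     Calculate the crowding distance of ($F_i$)
     $P_{t+1} = P_{t+1} \cup F_i$, $i = i + 1$
end while
Sort($P_{t+1}$, $\prec_n$) sort in descending order using $\prec_n$
$P_{t+1} = P_{t+1}[1:N]$ select the first $N$ elements of $P_{t+1}$
$Q_{t+1}$ = make-new-population($P_{t+1}$) using selection, crossover and mutation to create the new population
Algorithm 2 NSGA-II main loop [5]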

After the first population the procedure changes (see Algorithm 2). Initially, a combined population $R_t = P_t \cup Q_t$ is formed of size $2N$. $R_t$ is sorted according to non-domination. A new population $P_{t+1}$ is now formed, adding solutions from each front level until the size of $P_{t+1}$ exceeds $N$. The solutions of the last accepted level are then sorted according to $\prec_n$, and a total of $N$ solutions are chosen, rejecting those from the last layer that have a smaller crowding distance [5].
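The survivor selection step can be sketched as follows, taking already-ranked fronts and crowding distances as inputs. It truncates only the last front that does not fit, which yields the same survivors as sorting the over-full population by rank and then crowding distance. The function name, identifiers and sample values are assumptions.

```python
def select_survivors(fronts, distance, n):
    """Pick n survivors from ranked fronts.

    fronts: list of lists of solution ids, best front first.
    distance: dict mapping solution id to its crowding distance.
    """
    survivors = []
    for front in fronts:
        if len(survivors) + len(front) <= n:
            survivors.extend(front)  # the whole front fits
        else:
            # Partial front: keep the solutions in the least crowded regions.
            by_spread = sorted(front, key=lambda i: distance[i], reverse=True)
            survivors.extend(by_spread[: n - len(survivors)])
            break
    return survivors

# Hypothetical combined population of seven solutions, keeping five.
fronts = [[0, 1, 2], [3, 4, 5], [6]]
distance = {0: float("inf"), 1: 1.0, 2: float("inf"),
            3: 0.4, 4: float("inf"), 5: 1.2, 6: float("inf")}
survivors = select_survivors(fronts, distance, n=5)
```

In this example the first front is kept whole, and solution 3 is rejected from the second front because it has the smallest crowding distance there.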

The entire process is shown in Figure 2b, and is repeated until the termination condition is met. Termination conditions could be: no significant improvement over iterations or a specified number of iterations have been performed.

V Simulation environment

V-A HTC-Sim

HTC-Sim is a trace-driven simulation framework for energy consumption in High Throughput Computing systems [7]. The simulation handles two types of users – interactive users, who sit down in front of a computer and use it, and high-throughput users, who submit multiple tasks through a batch submission system which uses the computers when they are idle. Interactive users will evict HTC tasks, requiring the task to be rerun. The computers in the system are considered at three logical levels – the whole system, a cluster of computers (a number of computers in a distinct location) and individual computers.

The model characterises each computer through a set of parameters. These describe the resource in terms of operating system, architecture type, memory size, performance metrics (such as number of cores, CPU speed, MIPS), along with an energy profile. The model is extensible, allowing practitioners to define their own custom parameters.

The workload of the HTC system is comprised of a set of high throughput tasks, submitted either independently or together as part of a batch. A task submitted to the system is initially placed into a queue. If an appropriate computer is available the task is allocated to that resource – the task is now in the running state. If no appropriate computer is available the task will remain queued until an appropriate computer is available. If an interactive user logs into the computer whilst a high throughput task is running, the task will relinquish the resource either by entering a suspended state (if possible) or re-entering the queue to be re-run later. Tasks that remain in a suspended state for longer than a pre-determined threshold are evicted and re-enter the queue.

An ordered set of all interactive sessions is used to replay the interactive user activity across the computers within an organisation. The data used to exemplify the system is trace data obtained from December 2009 through December 2010. These traces are indicative of current system usage and analysis that has been ongoing since 2010.

We are primarily concerned with two objectives (metrics):

Average task overhead – the difference between the time a task spends in the system, from entering ($q_j$) to departing ($f_j$), and the actual task execution time ($e_j$), averaged over a set of tasks $T$:

$\frac{1}{|T|} \sum_{j \in T} (f_j - q_j - e_j)$

Energy consumption – the total energy consumed by the HTC workload. Fine-grained energy consumption is recorded per computer, cluster and system, for each state, e.g. sleep, idle, active (HTC and/or interactive user). The total energy consumption is then calculated as follows:

$\sum_{c=1}^{n} \sum_{p=1}^{m} t_{c,p} E_{c,p}$

where $n$ is the number of computers, $m$ is the number of power states, $t_{c,p}$ is the time spent by computer $c$ in state $p$ and $E_{c,p}$ is the power consumption rate of computer $c$ in state $p$. For non-HTC states, $E_{c,p} = 0$.
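Both metrics are simple aggregations, as the following sketch shows. The function names, data layout, units (hours and kW, giving kWh) and sample values are all assumptions for illustration and are not taken from HTC-Sim.

```python
def average_overhead(tasks):
    """tasks: iterable of (enter_time, depart_time, execution_time) tuples."""
    tasks = list(tasks)
    total = sum(depart - enter - execute for enter, depart, execute in tasks)
    return total / len(tasks)

def total_energy(time_in_state, power_in_state):
    """Sum time multiplied by power over every computer and power state.

    Both arguments hold one list per computer with one entry per power state.
    Times are assumed to be in hours and powers in kW, so the result is in kWh.
    """
    return sum(t * e
               for times, powers in zip(time_in_state, power_in_state)
               for t, e in zip(times, powers))

# Two hypothetical tasks: 40 and 0 time units of overhead respectively.
overhead = average_overhead([(0, 100, 60), (10, 50, 40)])

# Two hypothetical computers with two states each; the first state draws no
# power attributable to the HTC workload.
energy = total_energy([[2.0, 1.0], [0.5, 3.0]], [[0.0, 0.1], [0.0, 0.2]])
```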

V-B Reinforcement Learning Scheduler

Reinforcement Learning [19] (RL) is a machine learning technique used to learn how to react to an environment. An agent observes an environment which is often represented by a state space. For each state in the state space there is a corresponding action vector representing every action which can be taken in that state. Initially each action has the same probability of being selected. When the agent observes a specific state it chooses an action from the action vector based on either an explorative or exploitative policy, with the explorative policy selected at random with probability $\epsilon$. If an explorative policy is chosen then the action is selected at random from the action vector, whilst if an exploitative policy is in force then the action which has seen the ‘best’ historical reward is selected. Once the action is completed and it is known if the action was good or bad then the reward value for the action is updated – rewarding good actions (increasing the reward) and punishing bad actions (decreasing the reward).
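The explore/exploit choice can be sketched as a small tabular agent. This is a generic epsilon-greedy illustration rather than the scheduler of [13]: the class name, the additive reward update and the tie-breaking rule (the first of equally rewarded actions is taken when exploiting) are assumptions.

```python
import random

class EpsilonGreedyAgent:
    def __init__(self, num_states, num_actions, epsilon, seed=0):
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        # Every action starts with the same reward estimate.
        self.reward = [[0.0] * num_actions for _ in range(num_states)]

    def choose(self, state):
        actions = self.reward[state]
        if self.rng.random() < self.epsilon:
            # Explore: pick any action uniformly at random.
            return self.rng.randrange(len(actions))
        # Exploit: pick the action with the best historical reward.
        return max(range(len(actions)), key=actions.__getitem__)

    def update(self, state, action, delta):
        """Positive delta rewards a good action; negative delta punishes a bad one."""
        self.reward[state][action] += delta

greedy = EpsilonGreedyAgent(num_states=2, num_actions=3, epsilon=0.0)
greedy.update(state=0, action=2, delta=1.0)
first_choice = greedy.choose(0)   # action 2 now has the best reward
greedy.update(state=0, action=2, delta=-2.0)
second_choice = greedy.choose(0)  # action 2 has been punished below the others
```

With `epsilon` set to 0 the agent never explores, and with `epsilon` set to 1 it always picks at random; values in between trade off the two behaviours.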

The RL scheduler by McGough et al. [13] has a state space which is a combination of whether computers within the HTC system are free for use and the hour of the day when the request to schedule a task is made. The granularity of the action space can be varied in size from representing each computer individually through to only selecting the cluster on which to place a task or placing in the queue, to only selecting between allocating the tasks to a computer or queueing the task. Likewise the hour of the day could be for any day (24 actions) or for each hour within a week (168 actions).

V-C Parameters for RL

We present here the parameters that the GA will search over in order to identify the optimal policies. Further details can be found in [13].

The exploration versus exploitation of a RL approach is potentially the most significant factor in optimising the approach. Too small a value of $\epsilon$ will lead to the system not searching the possible outcome space and hence performing little better than randomly choosing actions. Likewise, too large a value of $\epsilon$ will mean the RL is spending more time searching for optimal solutions than actually using the ones that it has found already. However, having a single value of $\epsilon$ for the whole RL process can be too restrictive. We therefore allow $\epsilon$ to decrease as the simulation progresses. Below we present parameters which control the action space, the reward computation and how the value of $\epsilon$ is varied:

  • Week: is the state space for the RL – day or week {boolean}. As weekends have a different usage pattern to week days this could allow the RL to adapt to this.

  • Entity level: is the component of action space in terms of computer granularity – {(computer, cluster, whole)}. Fine grained actions could be better, but at the expense of needing far more examples to train on.

  • $\epsilon$-policy: What is $\epsilon$ changed on? {(days, previous, ratio, hit)}. Note that this has an impact on the meaning of many of the parameters below.

  •  ranges: The date range on which to change $\epsilon$ {[0,999999], …}. Although this can be an arbitrarily long list we limit this to three values for this work.

  •  reward boundaries: reward values over which the $\epsilon$ value will be changed {(0,1], …}.

  • : The amount of influence the computer energy efficiency has on RL reward {[0,1]}.

  • days: Change $\epsilon$ based on the number of days of RL which have been performed {[0, 365]}, 0 = don’t use.

  • : Increase by if current day into RL days {[0,1]}.

  • history: The number of previous tasks to consider when computing the action {[-1,999999]}, -1 = all tasks.

  • gaussian: Do we apply a gaussian decay over the task history when computing the action? {boolean}.

  • prior: If all prior actions gave a negative reward then increase $\epsilon$ by 0.1 {boolean}.

  • threshold: If the ratio of best reward to average reward is less than threshold, use the previous value {[0,1]}.

  • defer: Defer running a task during the same hour that the computer is due to be rebooted {boolean}.

If we assume here, conservatively, that each continuous parameter is discretised into one hundred values then this creates a search space of some . Again, assuming that the simulation takes five minutes to run for each parameter combination this is years for a full parameter space search.

VI Results

TABLE I: Parameters for optimal objectives

We first evaluate whether the NSGA-II approach will lead to a Pareto frontier for total energy consumed and average overhead. Figure 3 illustrates progressive iterations of the NSGA-II algorithm with successive iterations in different colours. The initial (red) points are scattered widely whilst the final iteration (magenta) forms a sharp edge closest to the two axes. It is interesting to note that although there is no global optimum for both energy and overhead, the Pareto front is ‘sharp’ in the bottom left corner, indicating that there is a good compromise for both objectives. There is also a separate region of points with lower energy consumption but substantially higher overheads. These appear to be cases where tasks are only allowed to run when the task is almost definitely going to finish – at the expense of significantly increasing overhead. For all parameter sets along the Pareto front defer was true, demonstrating that not running tasks during the hour when a computer will be rebooted was the best policy. We therefore consider defer no further.

Fig. 3: Progress towards the Pareto Frontier

Table I presents the parameter sets and objective values for minimum overhead, minimum energy and ‘optimal’ combination of overhead and energy. The optimal combination was attained by first scaling average overhead and total power consumed between 0 and 100 using min-max scaling. Next, we summed the scaled overhead and total power consumed and chose the combination with the minimum value. This enabled us to choose the minimum combination of both objectives with equal weighting. The scaling, however, could be changed to suit individual preferences of power consumption or average overhead. As the following parameters were identical for all cases, we present them here rather than in the table: $\epsilon$-policy was previous, gaussian was false, was 0, ratio was 0.9375 and was -0.5842.
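The equal-weight selection described here can be sketched in a few lines. The function names and the sample objective values are hypothetical; only the procedure (min-max scale each objective to 0 to 100, sum, take the minimum) follows the text.

```python
def min_max_scale(values, low=0.0, high=100.0):
    """Linearly rescale values so the smallest maps to low and the largest to high."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [low for _ in values]
    return [low + (v - lo) * (high - low) / (hi - lo) for v in values]

def best_compromise(overheads, energies):
    """Index of the point minimising scaled overhead plus scaled energy."""
    scaled_overhead = min_max_scale(overheads)
    scaled_energy = min_max_scale(energies)
    totals = [o + e for o, e in zip(scaled_overhead, scaled_energy)]
    return min(range(len(totals)), key=totals.__getitem__)

# Hypothetical points along a Pareto front.
overheads = [10.0, 12.0, 30.0, 90.0]
energies = [120.0, 90.0, 80.0, 60.0]
chosen = best_compromise(overheads, energies)
```

Multiplying one of the scaled objectives by a weight before summing would bias the choice towards the other objective, which is one way to express a preference between energy and overhead.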

The energy consumption here is far better than those presented in the paper by McGough et al. [13], reducing the energy consumption by MWh for effectively the same average overhead whilst also being able to beat their best energy case by MWh, again for no appreciable change in average overhead. We can save over MWh of energy over their lowest energy case; however, this is at the cost of massively increasing the overhead. It should be noted that this was achieved solely through the tuning of the simulation parameters with the use of NSGA-II, as both sets of results run the same underlying code.

To better understand the parameters which affect the Pareto Front we fit a Lasso regression [20] to distinct clusters of the Pareto Front (optimisation dominant (-1), central (1) and energy dominant (0) – Figure 4a). Lasso regression is a linear regression technique which steers the coefficients for insignificant parameters to zero, allowing for the identification of important parameters and their significance. The parameters were all scaled between 1 and 100, allowing direct comparison between parameters.

Fig. 4: Objective clustering a) Clusters, b) Overhead impact

We clustered the data using the unsupervised learning technique DBSCAN. This technique was chosen due to its effectiveness at clustering data points which are close together. This yields better results for our dataset than a method such as k-means clustering, which partitions the data into Voronoi cells. The results, Figure 4b, show a large negative coefficient for , and when predicting average overhead, though only for the significant overhead case (cluster 0). This suggests that looking at the energy efficiency of computers in these cases is detrimental – potentially as this cluster favours queueing tasks rather than running them.

The coefficients for total energy, Figure 5, show that lower values of and history have the best impact on reducing energy for cluster -1 – low overhead cases. This is against the naive assumption that taking energy efficiency into account would reduce overall energy consumption – potentially as this could reduce clarity for which computer to use. Shorter history would suggest that the system changes over time and hence only recent history should be considered.

Figure 6 displays the distribution of parameters for , , , change and threshold for the final population. It can be seen that for , the parameters converge to a high value for , low value for with between these two values. This is to be expected as the reward boundaries should decrease. and are both bimodal, whereas is trimodal. converges to an increasing relationship with successive , however, it is less defined than , with a bimodal relationship for and .

Fig. 5: Parameters which impact energy consumption
Fig. 6: Violin distribution of parameters
Fig. 7: value and days impact on
Fig. 8: a) Granularity of RL action space with respect to computer- and cluster-level. b) Day / Week granularity for action/state space.
Fig. 9: a) -policy, b) Reward history window size and c) Using a gaussian decay over the reward history window
Fig. 10: a) Impact of negative prior results, b) Impact of energy efficiency of computer and c) Threshold impact

Figure 7 demonstrates the impact of adding to if the RL trainer has been running for less than the prescribed number of days. Here the GA has learnt to use large values of when energy consumption is more important, suggesting a more explorative approach favours energy efficiency – perhaps due to the fact that over the simulation period the state varies significantly. By contrast, the number of days shows no clear pattern. Though as the value is often very small this may have little if any impact.

The granularity of the RL action space with respect to computers is presented in Figure 8a. In almost all cases cluster level is the best choice. This is most likely a consequence of the fact that it is a compromise between fine-grained computer level and coarse-grained whole system level. Interestingly, for minimum energy, whole system becomes more optimal. This is perhaps a consequence of most tasks being held in a queue rather than executed, hence more fine-grained knowledge no longer helps.

The other aspect of action/state space – day or week – is presented in Figure 8b. Here all but the most extreme overhead cases are optimal with the day case. This would suggest that, although there is a difference based on the day of the week, this can only be exploited in the case where energy reduction is key.

The $\epsilon$-policy is compared in Figure 9a. In almost all cases the ‘Previous’ policy is optimal. This indicates that basing $\epsilon$ on the average reward of the previous day is the best policy. The ratio of best reward to average reward makes up most of the remaining points, indicating that for both of these cases an adaptive policy which can move between explorative and exploitative modes over time is the best approach – a consequence of the state of the system changing as time progresses. Only one ‘static’ policy is seen as optimal – where the value of $\epsilon$ changes by the number of days the RL has been running.

The size of the reward history and the application of Gaussian decay over that history are presented in Figures 9b and 9c. History size seems to be bimodal, with the extremes of overhead and energy having a value around 840,000 whilst most of the low overhead values are in the region of 500,000. This suggests that forgetting history more quickly favours lower overheads, but at the expense of higher energy consumption. By contrast, the choice of when to use a Gaussian decay is less obvious, suggesting that other factors are at play.
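To make the two history parameters concrete, the following is a minimal sketch of a bounded reward history with an optional Gaussian decay. It is an illustration under stated assumptions, not the implementation used in HTC-Sim: the function name `gaussian_weighted_mean`, the width parameter `sigma`, and the choice to weight entries by their age in the history are all hypothetical.

```python
import math
from collections import deque


def gaussian_weighted_mean(rewards, sigma):
    """Mean of `rewards` (ordered oldest -> newest) under a Gaussian decay.

    An entry that is `age` steps older than the newest one receives the
    weight exp(-age^2 / (2 * sigma^2)), so recent rewards dominate.
    `sigma` is a hypothetical decay width measured in history entries.
    """
    rewards = list(rewards)
    if not rewards:
        raise ValueError("reward history is empty")
    newest = len(rewards) - 1
    weights = [
        math.exp(-((newest - i) ** 2) / (2.0 * sigma ** 2))
        for i in range(len(rewards))
    ]
    return sum(w * r for w, r in zip(weights, rewards)) / sum(weights)


# A bounded history: once `maxlen` entries are stored, the oldest is dropped.
# The size 5 is illustrative only and far smaller than the values reported.
history = deque(maxlen=5)
for reward in [1.0, 1.0, 1.0, 1.0, 1.0, 0.0]:
    history.append(reward)

plain_mean = sum(history) / len(history)
decayed_mean = gaussian_weighted_mean(history, sigma=1.0)
```

In this toy history the most recent reward is the poor one, so the decayed mean falls below the plain mean. A smaller history or a narrower decay both make the estimate react faster to recent rewards, which is the sense in which history is forgotten more quickly.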

Increasing by 0.1 when prior rewards are negative is a mixed case for central points (Figure 10a). For lowest overhead and lowest energy, however, the best approach appears to be disabled and enabled respectively, again suggesting that a more explorative approach favours lower energy. The impact of taking computer energy efficiency into account when computing the reward is presented in Figure 10b. One would assume that taking energy efficiency into account would be important for low overall energy usage; however, the opposite seems to be the case. This would suggest that for extremely low energy cases whether the task is launched or not is most important. The threshold on the ratio of best reward to average reward is presented in Figure 10c. Results are variable, but lower thresholds tend to be better.

VII Conclusions

In this paper we have demonstrated the potential of genetic algorithms, specifically NSGA-II, for efficient design space exploration of the operating policies of digital twin simulations. We apply the approach to parameterise the operating policies of a target system, a digital twin simulation of a high-throughput computing infrastructure. We evaluate the system with respect to energy consumption and task overhead. Through this approach we are able to reduce the energy consumed by an HTC system, by optimising the parameters of an RL scheduler, with only a negligible increase in overheads. This allows us to tune the parameter sets more efficiently in situations where there are more parameter combinations than can feasibly be searched, and multiple objectives over which to optimise.
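As an illustration of what identifying the Pareto frontier means for the two objectives, the sketch below filters a set of candidate results down to the non-dominated ones. It is a minimal example and not NSGA-II itself, which additionally ranks successive fronts and applies crowding-distance selection within a genetic algorithm. The names `dominates` and `pareto_front` and the candidate values are hypothetical and do not come from our experiments.

```python
from typing import List, Tuple

# One candidate result: (energy, overhead). Both objectives are minimised.
Point = Tuple[float, float]


def dominates(a: Point, b: Point) -> bool:
    """True if `a` is no worse than `b` on every objective and strictly
    better on at least one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Point]) -> List[Point]:
    """Return the points that no other point dominates (O(n^2) scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Made-up (energy, overhead) pairs, for illustration only.
candidates = [
    (100.0, 5.0),
    (80.0, 9.0),
    (120.0, 4.0),
    (90.0, 9.5),   # dominated by (80.0, 9.0)
    (110.0, 6.0),  # dominated by (100.0, 5.0)
]
front = pareto_front(candidates)
```

Each point remaining in `front` is the best available trade-off for some weighting of energy against overhead, which is what allows the choice between objectives to be made with full knowledge of the alternatives.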

In future work we plan to optimise over an increased number of objectives such as turnaround time for individual users and maximum waiting time per task. We also hope to implement our findings in a real HTCondor system.


  • [1] R. Abd Rahman, R. Ramli, Z. Jamari, and K. R. Ku-Mahamud (2016) Evolutionary Algorithm with Roulette-Tournament Selection for Solving Aquaculture Diet Formulation. Mathematical Problems in Engineering 2016, pp. 1–10. External Links: Document, ISSN 1024-123X Cited by: §IV-B4.
  • [2] T. Back, D. B. Fogel, and Z. Michalewicz (2009) Evolutionary Computation 1 Basic Algorithms and Operators. Comprehensive Chemometrics, pp. ix – x. External Links: Document, ISBN 978-0-444-52701-1 Cited by: §IV-A, Algorithm 1.
  • [3] C. Bao, L. Xu, E. D. Goodman, and L. Cao (2017) A novel non-dominated sorting algorithm for evolutionary multi-objective optimization. Journal of Computational Science 23, pp. 31–43. External Links: Document, ISSN 18777503 Cited by: §IV-B.
  • [4] E. K. Burke and K. Graham (2014) Search methodologies: Introductory tutorials in optimization and decision support techniques, second edition. External Links: Document, ISBN 9781461469407 Cited by: §IV-B.
  • [5] K. Deb, S. Agrawal, A. Pratap, and T. Meyarivan (2000) A fast elitist non-dominated sorting genetic algorithm for multi-objective optimisation: NSGA-II. CEUR Workshop Proceedings 1133, pp. 850–857. External Links: ISSN 16130073 Cited by: §I, §III, §IV-B3, §IV-B4, §IV-B, Algorithm 2.
  • [6] M. Ester, H. Kriegel, J. Sander, and X. Xu (1996) A tdensity-based algorithm for discovering clusters in large spatial databases with noise. In Proceedings of the Second International Conference on Knowledge Discovery and Data Mining, KDD’96, pp. 226–231. Cited by: §VI.
  • [7] M. Forshaw, A.S. McGough, and N. Thomas (2016) HTC-sim: a trace-driven simulation framework for energy consumption in high-throughput computing systems. 28 (12), pp. 3260–3290. Note: cpe.3804 External Links: ISSN 1532-0634, Document Cited by: §I, §V-A.
  • [8] A. Guha and J. Falzaranoa (2015) Application of multi objective genetic algorithm in ship hull optimization. Ocean Systems Engineering 5 (2), pp. 91–107. External Links: Document, ISSN 2093-6702 Cited by: §III.
  • [9] J. H. Holland (1975) Search methodologies: Introductory tutorials in optimization and decision support techniques, 2nd ed. External Links: ISBN 9780262082136 Cited by: §IV-A.
  • [10] S. Kannan, S. Baskar, J. D. McCalley, and P. Murugan (2009) Application of NSGA-II algorithm to generation expansion planning. IEEE Transactions on Power Systems 24 (1), pp. 454–461. External Links: Document, ISSN 08858950 Cited by: §III, §III.
  • [11] J. Knowles and D. Corne (1999) The Pareto archived evolution strategy: A new baseline algorithm for Pareto multiobjective optimisation. Proceedings of the 1999 Congress on Evolutionary Computation, CEC 1999 1, pp. 98–105. External Links: Document, ISBN 0780355369 Cited by: §IV-B.
  • [12] M. Litzkow, M. Livney, and M. W. Mutka (1988) Condor-a hunter of idle workstations. In ICDCS, Cited by: §I.
  • [13] A. S. McGough and M. Forshaw (2014) Reduction of wasted energy in a volunteer computing system through reinforcement learning. Sustainable Computing: Informatics and SystemsConcurrency and Computation: Practice and Experience 4 (4), pp. 262 – 275. External Links: ISSN 2210-5379, Document Cited by: §I, Fig. 1, §II, §II, §V-B, §V-C, §VI.
  • [14] M. Mitchell (1995) Genetic algorithms: an overview. Complexity 1 (1), pp. 31–39. Cited by: §I.
  • [15] T. Murata and H. Ishibuchi (1995) MOGA: Multi-objective genetic algorithms. (November), pp. 289–294. Cited by: §III.
  • [16] V. Pareto and A. S. (. Schwier (1927) Manual of political economy Tr. by Ann S. Schwier. Macmillan, London. External Links: ISBN 333135458 Cited by: §I.
  • [17] A. Ponsich, E. A. R. García, R. A. M. Gutiérrez, S. G. de-los-Cobos Silva, M. A. G. Andrade, and P. L. Velázquez (2017) Solving electoral zone design problems with NSGA-II. pp. 159–160. External Links: Document, ISBN 9781450349390 Cited by: §III, §III.
  • [18] W. Stadler (1979) A survey of multicriteria optimization or the vector maximum problem, part I: 1776-1960. Journal of Optimization Theory and Applications 29 (1), pp. 1–52. External Links: Document, ISBN 0022-3239, ISSN 00223239 Cited by: §I.
  • [19] R.S. Sutton and A.G. Barto (1998) Reinforcement learning: an introduction. A Bradford book, Bradford Book. External Links: ISBN 9780262193986, LCCN 97026416 Cited by: §I, §V-B.
  • [20] R. Tibshirani (1996) Regression Shrinkage and Selection Via the Lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58 (1), pp. 267–288. External Links: Document Cited by: §VI.
  • [21] W. Yu, B. Li, H. Jia, M. Zhang, and D. Wang (2015) Application of multi-objective genetic algorithm to optimize energy efficiency and thermal comfort in building design. Energy and Buildings 88, pp. 135–143. External Links: Document, ISSN 03787788 Cited by: §III.
  • [22] E. Zitzler and L. Thiele (2006) Multiobjective optimization using evolutionary algorithms — A comparative case study. pp. 292–301. External Links: Document Cited by: §IV-B.