
D4.7 – Multi-Objective Dynamic Optimizer (b)

Copyright © AllScale Consortium Partners 2015

H2020 FETHPC-1-2014

An Exascale Programming, Multi-objective Optimisation and Resilience Management Environment Based on Nested Recursive Parallelism

Project Number 671603

D4.7 – Multi-Objective Dynamic Optimizer (b)

WP4: Unified runtime system for extreme scales

Version: 1.0
Author(s): Kostas Katrinis (IBM)
Date: 29/06/2018


Due date: PM30
Submission date: 29/06/2018
Project start date: 01/10/2015
Project duration: 36 months
Deliverable lead organization: IBM
Version: 1.0
Status: Final
Author(s): Kostas Katrinis (IBM)
Reviewer(s): QUB, KTH
Dissemination level: PU (Public)

Disclaimer

This deliverable has been prepared by the responsible Work Package of the Project in accordance with the Consortium Agreement and the Grant Agreement Nr 671603. It solely reflects the opinion of the parties to such agreements on a collective basis in the context of the Project and to the extent foreseen in such agreements.


Acknowledgements

The work presented in this document has been conducted in the context of the EU Horizon 2020 programme. AllScale is a 36-month project that started on October 1st, 2015 and is funded by the European Commission.

The partners in the project are UNIVERSITÄT INNSBRUCK (UIBK), FRIEDRICH-ALEXANDER-UNIVERSITÄT ERLANGEN NÜRNBERG (FAU), THE QUEEN'S UNIVERSITY OF BELFAST (QUB), KUNGLIGA TEKNISKA HÖGSKOLAN (KTH), NUMERICAL MECHANICS APPLICATIONS INTERNATIONAL SA (NUMECA), IBM IRELAND LIMITED (IBM).

The content of this document is the result of extensive discussions within the AllScale Consortium as a whole.

More information

Public AllScale reports and other information pertaining to the project are available through the AllScale public Web site under http://www.allscale.eu.

Version History

Version  Date      Comments, Changes, Status                    Authors, contributors, reviewers
0.1      20/04/18  First draft                                  Kostas Katrinis
0.2      25/04/18  First review                                 Pierre Lemarinier
0.3      10/05/18  Second draft, addressing Pierre's feedback   Kostas Katrinis
0.5      27/06/18  Final version                                Kostas Katrinis


Table of Contents

Executive Summary
1 Introduction
2 Design of Multi-Objective Dynamic Optimizer
2.1 Multi-objective optimization user semantics and interface
2.2 Scheduler Design and Techniques
2.3 The Strategic Scheduler Implementation
2.4 The Tactical Scheduler Implementation
3 Experimental Results
4 Conclusions
5 References


Executive Summary

In line with the Grant Agreement, this deliverable explores and utilizes the capabilities of the other deliverables to research strategies that effectively steer applications towards fulfilling user-specified trade-offs among (conflicting) objectives. Specifically, this document presents the structure and workings of the second iteration of the AllScale multi-objective dynamic optimizer. The work builds upon the concepts and artefacts presented in the first iteration of this deliverable, namely “D4.6 - Multi-Objective Dynamic Optimizer (a)”. New in this deliverable are experimental results evaluating the efficacy of the multi-objective optimizer; these have been obtained by executing one of the AllScale target applications (IPIC3D) on the AllScale stack.


1 Introduction

Deliverable D4.7 is part of task T4.6 within WP4. T4.6 covers the design and implementation of a multi-objective dynamic optimizer component within the AllScale architecture. The objective of T4.6 is to research efficient ways of utilizing the available capabilities for steering applications towards fulfilling flexibly customizable trade-offs among a variety of (conflicting) optimization objectives, such as execution time, power consumption and/or resource utilization. This report covers the implementation of the multi-objective optimizer that has matured in the project, fully utilizing the multi-versioning feature of the AllScale environment.

At a high level, the distinguishing novel features of the optimizer are:

a) the ability to fully control task granularity, either by splitting tasks into sub-tasks or by processing them without splitting. From the runtime point of view, splitting a task means selecting a split variant of the work item, and processing a task means selecting a process variant of the work item;

b) the ability to dynamically adjust server-specific parameters at runtime, namely the number of hardware threads assigned to the runtime and the scaling of core frequencies, to achieve near-optimal trade-offs among conflicting optimization objectives, including balancing of tasks among runtime instances.

The design and implementation of the multi-objective dynamic optimizer are aligned with the API specification of deliverable D4.1 and extend the simple prototype scheduler defined in D4.2 and further qualified in D4.6.

This deliverable is structured as follows: Section 2 presents the standing design and implementation of the multi-objective optimizer, both from a user and a developer perspective. Section 3 presents evaluation results of the optimizer on the IPIC3D AllScale application. Section 4 concludes the deliverable.

2 Design of Multi-Objective Dynamic Optimizer

2.1 Multi-objective optimization user semantics and interface

The main responsibility of the dynamic optimizer is to optimize against a set of objectives, hence the term multi-objective. The set of objectives and their respective importance can be provided by the user through application command line options. Specifically, the user can optimize against the following objectives:

1. Application execution time
2. Power consumed for application execution
3. Resources (quantified as average number of hardware threads) consumed for application execution

In the first iteration of this series of deliverables (D4.6), we supported the ability of the user to specify up to two preferred objectives (among time, power and resource usage), while hiding from the user the exact weighting assigned internally to the various objectives. In this matured version of the multi-objective optimizer, the user can specify up to all three target objectives, together with their respective priorities and weights, and thus has full control. Specifically, the user specifies target objectives as an ordered set of <objective:margin> pairs, with the following rules:

- objective = time|energy|resource.
- Margin is a fractional number in [0,1].
- The order in which the objectives are listed is significant, i.e. if objective A is listed before objective B, then A is given priority over B during dynamic optimization.
- The set can contain one, two or three pairs, with the constraint that all specified objectives are pairwise different.

The margin argument in each specified pair has the following semantics. Unlike single-objective optimization, optimization of multiple objectives can lead to conflicts among the objectives being optimized. For instance, an application execution that prioritizes (minimizing) power consumption over execution time could heavily penalize the latter objective (time), e.g. due to aggressive frequency scaling or suspension of hardware threads as a means to save power. In certain cases, this can defeat the purpose of multi-objective optimization. Margins provide a degree of freedom in this regard, creating some slack in optimizing against higher priority objectives to make room for optimization against lower priority objectives too.

More formally, the multi-objective optimizer interprets each margin parameter (a fractional number in [0,1]) within each <objective:margin> pair as a fraction of the distance between the minimum and maximum objective values, measured from the objective minimum value. For instance, if the pair set “<energy:0.1> <time:0.5>” is specified by the user, the optimizer will allow a 10% slack in minimizing power consumption (which takes priority over time, as it is specified first), to allow for a larger degree of freedom in trying to minimize time. In an example execution where the maximum observed power is 1000W and optimization against power finds a solution that consumes 500W (minimum), the 0.1 margin allows the optimizer to seek time-optimizing solutions that consume up to 500W + 0.1 × (1000W - 500W) = 550W.

In the standing implementation, the user specifies the set of desired objectives, with the preferred priority order and margins, using the dedicated command line parameter “--hpx:ini=allscale.objective!”. For instance, a user can execute the IPIC3D AllScale application by issuing the following command (single line):

>ipic3d_allscalecc :U:1000000 --hpx:threads=160 --hpx:ini=allscale.objective!="resource:0.1 energy:0.4 time:0.4"

The above command triggers the optimizer to adjust runtime parameters during IPIC3D execution so that minimizing resource consumption takes priority over minimizing power consumption, which in turn takes priority over minimizing execution time, within the margins specified.
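The pair rules and the margin semantics of this section can be illustrated with a minimal Python sketch. It is not the AllScale implementation: the names `parse_objectives` and `margin_bound` are hypothetical, and the sketch only assumes what this section states, namely an ordered, space-separated list of objective:margin pairs and a tolerated upper bound of minimum + margin × (maximum - minimum).

```python
VALID_OBJECTIVES = ("time", "energy", "resource")


def parse_objectives(spec):
    """Parse e.g. 'resource:0.1 energy:0.4 time:0.4' into an ordered list.

    Hypothetical helper: it enforces the rules listed in Section 2.1
    (known objective names, margins in [0,1], one to three distinct pairs).
    """
    pairs = []
    for token in spec.split():
        name, sep, margin_text = token.partition(":")
        if not sep:
            raise ValueError(f"expected objective:margin, got {token!r}")
        if name not in VALID_OBJECTIVES:
            raise ValueError(f"unknown objective {name!r}")
        margin = float(margin_text)
        if not 0.0 <= margin <= 1.0:
            raise ValueError(f"margin {margin} outside [0,1]")
        if any(name == seen for seen, _ in pairs):
            raise ValueError(f"objective {name!r} listed twice")
        pairs.append((name, margin))
    if not 1 <= len(pairs) <= 3:
        raise ValueError("expected one, two or three pairs")
    return pairs  # list order encodes priority, highest first


def margin_bound(minimum, maximum, margin):
    """Largest value still tolerated for a higher-priority objective."""
    return minimum + margin * (maximum - minimum)


objectives = parse_objectives("resource:0.1 energy:0.4 time:0.4")
power_limit = margin_bound(500.0, 1000.0, 0.1)  # the 500W/1000W example above
```

Keeping the pairs in a list rather than a dictionary preserves the listing order, which is what carries the priority information.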


2.2 Scheduler Design and Techniques

The scheduler is organized in a two-level hierarchical design:

1) A top layer, the strategic scheduler: this layer interprets the user-provided optimization objectives (as discussed in Section 2.1) and periodically adjusts configurable machine-specific resources (optimization parameters) to arrive at a parameter set that optimizes against the user-preferred goals. In the standing implementation, the optimizer actively supports two optimization parameters, namely core frequency scaling and the number of hardware threads assigned to the AllScale runtime thread pools. In addition, the strategic scheduler can distribute tasks for execution to remote nodes, as a means of distributed load balancing. The implementation is also ready for experimentation with dynamic core offlining. Note that the strategic scheduler runs asynchronously to the application being executed.

The core of the intelligence of the strategic scheduler lies in its multi-objective optimization technique. The employed technique derives from the ANGEL auto-tuning approach [1], with specific deviations to make it efficient within the context of the AllScale environment. The implemented technique starts with the highest priority objective, employing the Nelder-Mead Simplex single-objective algorithm to locate minimization points. Once minimization of the first objective has converged, the technique continues to the next objective, following the same single-objective optimization approach. However, for any objective but the highest priority one, the search space is constrained to solutions that do not violate the margins of higher priority objectives, as specified by user input. Implementation features of the technique that are specific to AllScale include the use of linear approximations to specify feasible solutions within margins (unlike the penalty approach employed in [1]), as well as tuning of the per-objective iterations to match the relatively short time scales of the tasks induced by AllScale applications and benchmarks.

2) A bottom layer, the tactical scheduler: this layer is responsible for effectively utilizing the resources provided by the top layer. Deciding whether to split or process work items, selecting work item variants and assigning them to compute units, and moving data item fragments are all responsibilities of this layer. Each node has its own tactical scheduler, which mostly works independently of other nodes, except when it needs to move data across nodes for load-balancing purposes or to satisfy the requirements imposed by the resiliency manager. Figure 1 represents the interaction of the two layers.


Figure 1. High-level Block Diagram of the Multi-Objective Dynamic Optimizer
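As a rough illustration of the prioritized, margin-constrained search that Section 2.2 describes for the strategic scheduler, the following Python sketch optimizes the objectives one at a time in priority order. It is a simplification under stated assumptions: exhaustive enumeration over a small grid stands in for the Nelder-Mead Simplex search, margins are applied to the values observed over the still-feasible candidates, and the cost model `toy_costs`, the parameter grid and all names are invented for illustration rather than taken from the AllScale code.

```python
from itertools import product


def toy_costs(threads, freq_ghz):
    """Invented cost model: (time, energy, resource) for one setting."""
    time = 1000.0 / (threads * freq_ghz)            # more threads/GHz: faster
    power = threads * (0.5 + 0.4 * freq_ghz ** 2)   # power grows with frequency
    return {"time": time, "energy": power * time, "resource": float(threads)}


def prioritized_search(candidates, objectives, cost_fn):
    """Minimize objectives in priority order, keeping earlier ones within margin."""
    feasible = [(cfg, cost_fn(*cfg)) for cfg in candidates]
    best = None
    for name, margin in objectives:
        values = [costs[name] for _, costs in feasible]
        lowest, highest = min(values), max(values)
        # Best setting for the current objective among still-feasible settings.
        best = min(feasible, key=lambda item: item[1][name])
        # Slack: lower-priority objectives may use any setting within the margin.
        bound = lowest + margin * (highest - lowest)
        feasible = [item for item in feasible if item[1][name] <= bound]
    return best


# Hypothetical grid: 8..160 threads in steps of 8, four core frequencies in GHz.
grid = list(product(range(8, 161, 8), (2.0, 2.5, 3.0, 3.5)))
config, costs = prioritized_search(
    grid, [("resource", 0.1), ("energy", 0.4), ("time", 0.4)], toy_costs
)
```

With this toy model, the resource margin first restricts the search to the smallest thread counts, the energy margin then rules out the higher frequencies, and time is minimized only over what remains.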

2.3 The Strategic Scheduler Implementation

As the AllScale project is based on the HPX runtime, the strategic scheduler uses the HPX APIs underneath to implement the desired functionality. To implement a scheduling policy that can find an efficient trade-off among execution time, power consumption and total resource utilization, the runtime must provide a meaningful set of performance counter measurements through its monitoring interface. Depending on the value of the utilized performance counter, and using the hardware reconfiguration facility of the AllScale environment, the dynamic optimizer can suspend or resume hardware threads (or, respectively, increase or reduce core frequency) while trying to find an optimal balance among objectives, as mandated by the multi-objective optimization technique presented in the previous section.

The performance counter currently in use is provided by the application. More precisely, after each iteration of the application, its execution time is exported as an HPX performance counter (D5.2). The scheduler reads this performance counter periodically and, depending on its value, suspends or resumes hardware threads. The number of threads to suspend or resume at each step is governed by the multi-objective optimization technique, as it searches for better optimization parameter settings. The thread to suspend is selected randomly, and the thread to resume is always the first one in the list of suspended threads. It is worth mentioning that the design is not limited to any specific performance counter; any meaningful performance counter can be used for this purpose.

The implementation of the strategic scheduler within the AllScale Runtime source code repository comprises the following source code files:


src/components/scheduler_component.cpp
allscale/components/scheduler_component.hpp
src/components/localoptimizer.cpp
allscale/components/localoptimizer.hpp
src/components/nmsimplex_bbincr.cpp
allscale/components/nmsimplex_bbincr.hpp
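The thread selection policy described in Section 2.3 (a random thread is suspended, and the first thread in the suspended list is resumed) can be modelled in a few lines of Python. This is only a toy model for illustration: `ThreadPoolModel` is a hypothetical class, it does not call HPX, and it does not reflect the contents of the files listed above.

```python
import random


class ThreadPoolModel:
    """Toy stand-in for a pool of worker threads; not an HPX or AllScale API."""

    def __init__(self, total_threads, seed=None):
        self.active = list(range(total_threads))
        self.suspended = []            # kept in suspension order
        self._rng = random.Random(seed)

    def suspend(self, count):
        """Suspend up to `count` randomly chosen active threads."""
        for _ in range(min(count, len(self.active))):
            victim = self._rng.choice(self.active)
            self.active.remove(victim)
            self.suspended.append(victim)

    def resume(self, count):
        """Resume up to `count` threads, always taking the head of the list."""
        for _ in range(min(count, len(self.suspended))):
            self.active.append(self.suspended.pop(0))


pool = ThreadPoolModel(total_threads=160, seed=1)
pool.suspend(40)                      # optimizer step asks for fewer threads
first_suspended = pool.suspended[0]
pool.resume(1)                        # brings back the earliest suspended one
```

In this model the number passed to `suspend` or `resume` would come from the optimization step; the model itself only decides which threads are affected.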

2.4 The Tactical Scheduler Implementation

The implementation of the tactical scheduler depends on HPX performance counters as well. In this case, two performance counters, namely the length of the queue on each worker thread and the idle rate of each worker thread, are used to decide when to split or process work items. If there are few tasks in the system but the idle rate is high, or if there are many tasks in the system but the idle rate is low, then the work item is split; otherwise it is processed without being split. The implementation can be found in the do_split method in the scheduler_component.cpp file.
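The split-or-process rule stated above can be written as a small predicate. The sketch below is an illustration in Python, not a transcription of do_split: the function name `should_split` and both threshold values are assumptions, since the text does not say what counts as "few" tasks or a "high" idle rate.

```python
def should_split(queue_length, idle_rate, queue_threshold=4, idle_threshold=0.5):
    """Return True when the rule in Section 2.4 asks for a split.

    Both thresholds are illustrative assumptions, not values from do_split.
    """
    few_tasks = queue_length < queue_threshold
    high_idle = idle_rate > idle_threshold
    # Split when (few tasks, high idle rate) or (many tasks, low idle rate).
    return (few_tasks and high_idle) or (not few_tasks and not high_idle)


decisions = {
    "few tasks, idle workers": should_split(queue_length=1, idle_rate=0.9),
    "many tasks, busy workers": should_split(queue_length=20, idle_rate=0.1),
    "few tasks, busy workers": should_split(queue_length=1, idle_rate=0.1),
    "many tasks, idle workers": should_split(queue_length=20, idle_rate=0.9),
}
```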

3 Experimental Results

We have executed a rigorous set of experiments to evaluate the AllScale multi-objective optimizer. Throughout, we used the IPIC3D AllScale application with 1 million particles being simulated. The experiments have been conducted on several IBM Power8 8-core/1TB DDR4 machines (S822LC High Performance Computing server [2]).

Initially, to establish a baseline against which to compare the optimizer results, we ran an exhaustive search over executions of the IPIC3D application. For that, we constructed executions over all possible combinations of optimization parameters (number of runtime HPX threads assigned to the application, constant core frequency) with reasonable step sizes between successive parameter values. The results of this trial are presented in the 3D scatter plot shown in Figure 2A. In this figure, the calculated points forming the Pareto frontier of the search space are shown distinctly (using triangular red markers); these effectively form the optimal frontier in the multi-objective optimization sense.

Subsequently, we executed a set of IPIC3D application trials, using the AllScale environment and its multi-objective optimizer for dynamic parameter tuning. Throughout the experimentation, the AllScale runtime was started with 160 threads in total (or at maximum, given that the dynamic optimizer can suspend/resume threads at runtime). In Figure 2B, C and D we present the results of executions (five iterations of application execution per configuration; raw results are shown) corresponding to energy, resource and time as first priority objectives respectively, as given on the command line during execution (exact margins are given in the figure).


Figure 2 - A) Results of execution of IPIC3D using exhaustive search for space exploration (Pareto Frontier is shown with red triangle markers), B) Results obtained by the AllScale Multi-Objective Dynamic Optimizer with input <energy:0.1 time:0.4 resource:0.4>, C) Results obtained by the AllScale Multi-Objective Dynamic Optimizer with input <resource:0.1 energy:0.4 time:0.4> and D) Results obtained by the AllScale Multi-Objective Dynamic Optimizer with input <time:0.1 energy:0.4 resource:0.4>
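For readers unfamiliar with the term, the Pareto frontier marked in panel A consists of the measured points that no other point beats in every objective at once. The following Python sketch shows one straightforward way to extract such points when all objectives are minimized. The sample tuples are made up for illustration and are not measurements from these experiments, and the quadratic pairwise comparison is chosen for clarity rather than speed.

```python
def pareto_frontier(points):
    """Return the points not dominated by any other point (all minimized)."""

    def dominates(a, b):
        # a dominates b: no worse in every objective, strictly better in one.
        return all(x <= y for x, y in zip(a, b)) and any(
            x < y for x, y in zip(a, b)
        )

    return [p for p in points if not any(dominates(q, p) for q in points)]


# Made-up (time, energy, threads) tuples, for illustration only.
samples = [
    (10.0, 900.0, 160),
    (12.0, 700.0, 120),
    (12.0, 750.0, 120),   # dominated by the point above
    (20.0, 500.0, 40),
    (25.0, 650.0, 80),    # dominated by (20.0, 500.0, 40)
]
frontier = pareto_frontier(samples)
```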

By first comparatively inspecting the results in Figure 2B, C and D, we observe that the optimizer indeed honors the objective priorities specified by the user, especially when prioritizing energy conservation and resource conservation. In the time prioritization experiment, we observed a non-negligible number of outliers. Closer inspection revealed this to be a Nelder-Mead optimization artifact: the single-objective optimization frequently reached the maximum number of iterations because it could not detect convergence, despite having reached a near-optimal parameter setting for the purpose (high number of active threads, almost maximum core frequency).

Of equal importance is the comparison of the results against Figure 2A, and specifically against the Pareto frontier points. For both energy and resource prioritization, the optimizer manages to quickly bring the operating parameters close to Pareto points, in line with the overall objective of having runtime optimization approach hand-tuned configurations for a specific application (IPIC3D in this case).


4 Conclusions

This deliverable is the second and final iteration on constructing, integrating and evaluating a multi-objective dynamic optimizer for the AllScale environment. The design and implementation have been presented, with details on the workings of the standing artefact and pointers to the source code embodying the design. A significant effort has been devoted to integrating the optimizer for use with AllScale applications, with IPIC3D results reported in this deliverable. The results demonstrate the ability of the optimizer to incorporate multiple conflicting objectives gracefully while observing user priorities, and to yield objective values that stay close to dominating multi-objective value points. Towards the final demonstrators of the project, we are continuing to tune the optimizer, with the main priority being adaptive convergence detection during Nelder-Mead steps (addressing the time-optimization outliers).

5 References

[1] Ray S. Chen and Jeffrey K. Hollingsworth, “ANGEL: A Hierarchical Approach to Multi-Objective Online Auto-Tuning”, In Proceedings of the 5th International Workshop on Runtime and Operating Systems for Supercomputers (ROSS '15). ACM, New York, NY, USA, 2015.

[2] https://www.ibm.com/us-en/marketplace/high-performance-computing/specifications#product-header-top

