Page 1: High Performance Computing and the FLAME Framework

High Performance Computing and the FLAME Framework

Prof C Greenough, LS Chin and Dr DJ Worth
STFC Rutherford Appleton Laboratory

Prof M Holcombe and Dr S Coakley
Computer Science, Sheffield University

Page 2: High Performance Computing and the FLAME Framework

Why High Performance Computing?

The application cannot be run on a conventional computing system:
– insufficient memory
– insufficient compute power

High Performance Computing (HPC) now generally means:
– large multi-processor systems
– complex communications hardware
– specialised attached processors
– GRID/Cloud computing

Page 3: High Performance Computing and the FLAME Framework

Issues in High Performance Computing

Parallel systems are in constant development and their hardware architectures are ever changing:
– simple distributed memory across multiple processors
– shared memory between multiple processors
– hybrid systems: clusters of shared-memory multi-processor nodes, or clusters of multi-core systems
– the processors often have a multi-level cache system

Page 4: High Performance Computing and the FLAME Framework

Issues in High Performance Computing

Most systems have high-speed, multi-level communication switches.

GRID architectures are now being used for very large simulations:
– many large high-performance systems
– loosely coupled together over the internet

Performance can be improved by optimising for a specific architecture, but the code can then very easily become architecture dependent.

Page 5: High Performance Computing and the FLAME Framework


The FLAME Framework

Page 6: High Performance Computing and the FLAME Framework

Characteristics of FLAME

FLAME is based on X-Machines. Agents:
– have memory
– have states
– communicate through messages

Structure of an application:
– embedded in XML and C code
– application generation driven by the state graph
– agent communication managed by a library

Page 7: High Performance Computing and the FLAME Framework

Characteristics of FLAME

The data load:
– the size of each agent's internal memory
– the number and size of message boards

The computational load:
– work performed in any state change
– any I/O performed

The FLAME framework:
– program generator (serial and parallel)
– provides control of states
– provides the communications network

Page 8: High Performance Computing and the FLAME Framework

Initial Parallel Implementation

The parallel implementation is based on:
– the distribution of agents (the computational load)
– the distribution of message boards (the data load)

Agents only communicate via message boards (MBs).
Cross-node message information is made available to agents by message board synchronisation.
Communication between nodes is minimised through:
– halo regions
– message filtering

Page 9: High Performance Computing and the FLAME Framework

Geometric Partitioning

Figure: the simulation domain is partitioned geometrically across processors P1 to P12; each partition is surrounded by a halo region of a given radius.
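To make the partitioning concrete, here is a minimal, purely illustrative sketch (not FLAME's actual partitioning code). It assumes a one-dimensional strip decomposition along x and a halo width equal to the interaction radius r; the function names owning_partition and in_halo_of are hypothetical.

```c
/* Illustrative sketch only -- not FLAME source code.
 * Assumes a 1-D strip decomposition of the x-axis across nparts
 * partitions: an agent belongs to the strip containing its x
 * coordinate, and is also copied into a neighbour's halo when it
 * lies within radius r of that neighbour's strip. */

int owning_partition(double x, double xmin, double xmax, int nparts)
{
    double width = (xmax - xmin) / nparts;
    int p = (int)((x - xmin) / width);
    if (p < 0) p = 0;                    /* clamp agents on the boundary */
    if (p >= nparts) p = nparts - 1;
    return p;
}

/* Returns 1 if the agent at x must also be sent to partition q's halo. */
int in_halo_of(double x, int q, double xmin, double xmax, int nparts, double r)
{
    double width = (xmax - xmin) / nparts;
    double lo = xmin + q * width;        /* left edge of partition q  */
    double hi = lo + width;              /* right edge of partition q */
    int p = owning_partition(x, xmin, xmax, nparts);
    if (p == q) return 0;                /* already owned by q        */
    return (x >= lo - r && x <= hi + r); /* within r of q's strip     */
}
```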

Page 10: High Performance Computing and the FLAME Framework


Parallelism in FLAME

Page 11: High Performance Computing and the FLAME Framework

Issues with HPC and FLAME

Parallelism is hidden in the XML model and the C code, in terms of agent locality or groupings.

Communications are captured in the XML:
– in agent function descriptions
– in message descriptions

The states are the computational load:
– the weight is not known until run time
– it could be fine or coarse grained

The initial distribution is based on a static analysis; the final distribution may be based on dynamic behaviour.

Page 12: High Performance Computing and the FLAME Framework

Parallelism in FLAME

Agents are grouped on parallel nodes.
Messages are synchronised between nodes.
The message board library allows both serial and parallel versions to work.
Implementation details are hidden from modellers.
The system automatically manages the simulation.

Page 13: High Performance Computing and the FLAME Framework

Message Boards

The message board library is decoupled from the FLAME framework.
It has a well-defined Application Program Interface (API).
It includes functions for creating, deleting, managing and accessing information on the message boards.
Details such as internal data representations, memory management and communication strategies are hidden.
It uses multi-threading for work and communications.

Page 14: High Performance Computing and the FLAME Framework


FLAME & the Message Boards

Page 15: High Performance Computing and the FLAME Framework

Message Board API

MB management:
– create, delete, add message, clear board

Access to message information (iterators):
– plain, filtered, sorted, randomised

MB synchronisation:
– moving information between nodes
– full data replication (very expensive)
– filtered information using tagging
– overlapped with computation

Page 16: High Performance Computing and the FLAME Framework

The MB Environment

Message board management (a usage sketch follows this list):
– MB_Env_Init – initialises the MB environment
– MB_Env_Finalise – finalises the MB environment
– MB_Create – creates a new message board object
– MB_AddMessage – adds a message to a message board
– MB_Clear – clears a message board
– MB_Delete – deletes a message board
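A minimal sketch of the board lifecycle built from the calls above. The argument types (MBt_Board, a user-defined message struct) and the MB_SUCCESS return code follow common libmboard usage but should be checked against the installed mboard.h; the location_msg struct is a made-up example message.

```c
/* Sketch of the message board lifecycle; types and return codes are
 * assumptions based on typical libmboard usage -- check mboard.h. */
#include "mboard.h"

typedef struct {
    double x, y;     /* position of the sending agent */
    double radius;   /* range of influence            */
} location_msg;

int run_board_lifecycle(void)
{
    MBt_Board board;
    location_msg msg = { 1.0, 2.0, 0.5 };

    if (MB_Env_Init() != MB_SUCCESS) return 1;   /* start the MB environment */
    if (MB_Create(&board, sizeof(location_msg)) != MB_SUCCESS) return 1;

    MB_AddMessage(board, &msg);   /* post one message to the local board  */

    MB_Clear(board);              /* empty the board between iterations   */
    MB_Delete(&board);            /* destroy the board                    */
    return MB_Env_Finalise();     /* shut down the MB environment         */
}
```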

Page 17: High Performance Computing and the FLAME Framework

The Message Board API (2)

Message selection and reading – iterators (a reading sketch follows this list):
– MB_Iterator_Create – creates an iterator
– MB_Iterator_CreateSorted – creates a sorted iterator
– MB_Iterator_CreateFiltered – creates a filtered iterator
– MB_Iterator_Delete – deletes an iterator
– MB_Iterator_Rewind – rewinds an iterator
– MB_Iterator_Randomise – randomises an iterator
– MB_Iterator_GetMessage – returns the next message
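A sketch of reading messages through an iterator, reusing the location_msg struct from the lifecycle sketch above. As before, the exact signatures and the end-of-iteration convention (a NULL message pointer) are assumptions drawn from typical libmboard usage.

```c
/* Sketch only: iterate over a snapshot of the local board. */
#include <stdio.h>
#include "mboard.h"

void print_locations(MBt_Board board)
{
    MBt_Iterator iter;
    location_msg *m;   /* message struct from the lifecycle sketch */

    MB_Iterator_Create(board, &iter);              /* snapshot of the board   */
    MB_Iterator_GetMessage(iter, (void **)&m);     /* first message (or NULL) */
    while (m != NULL) {
        printf("agent at (%f, %f)\n", m->x, m->y);
        /* whether the returned message must be freed depends on the
         * libmboard version -- check the library documentation */
        MB_Iterator_GetMessage(iter, (void **)&m); /* next message            */
    }
    MB_Iterator_Delete(&iter);                     /* release the iterator    */
}
```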

Page 18: High Performance Computing and the FLAME Framework

The Message Board API (3)

Message synchronisation: synchronisation of boards involves propagating message data out across the processing nodes, as required by the agents on each node (a sketch of overlapping synchronisation with computation follows this list):
– MB_SyncStart – starts synchronising a message board
– MB_SyncTest – tests for synchronisation completion
– MB_SyncComplete – completes the synchronisation
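A sketch of overlapping synchronisation with local computation using the three calls above. do_local_agent_work is a hypothetical placeholder, and the polling pattern is an assumption based on typical non-blocking usage.

```c
#include "mboard.h"

/* Hypothetical placeholder for computation that does not yet need the
 * incoming remote messages. */
void do_local_agent_work(void);

void synchronise_with_overlap(MBt_Board board)
{
    int done = 0;

    MB_SyncStart(board);            /* begin propagating messages to other nodes */

    while (!done) {
        do_local_agent_work();      /* overlap useful work with communication    */
        MB_SyncTest(board, &done);  /* poll for completion without blocking      */
    }

    MB_SyncComplete(board);         /* finish the synchronisation; remote        */
                                    /* messages are now visible to new iterators */
}
```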

Page 19: High Performance Computing and the FLAME Framework

The Message Board API (4)

MB synchronisation:
– the simplest form is full replication of message data, which is very expensive in communication and memory
– the MB library uses message tagging to reduce the volume of data being transferred and stored
– tagging uses message FILTERs to select the message information to be transferred
– FILTERs are specified in the model file (XMML)

Page 20: High Performance Computing and the FLAME Framework

The Message Board API (5)

Selection is based on filters, and filters are defined in the XMML. Filters can be used:
– in creating iterators, to reduce the local message list (see the filtered-iterator sketch below)
– during synchronisation, to minimise cross-node communications
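A sketch of a filtered iterator that keeps only messages within a given range of a point, reusing location_msg from the earlier sketches. The filter-callback signature (message pointer plus a user parameter block, returning non-zero to accept) is an assumption based on common libmboard usage; the FILTERs declared in the XMML play a similar role in the generated code.

```c
/* Sketch only: filtered iteration over a message board. */
#include <math.h>
#include "mboard.h"

typedef struct { double x, y, range; } range_filter_params;

/* Return non-zero to accept a message, zero to reject it (assumed). */
static int within_range(const void *msg, const void *params)
{
    const location_msg *m = (const location_msg *)msg;
    const range_filter_params *p = (const range_filter_params *)params;
    double dx = m->x - p->x, dy = m->y - p->y;
    return (sqrt(dx * dx + dy * dy) <= p->range);
}

void read_nearby(MBt_Board board, double x, double y, double range)
{
    MBt_Iterator iter;
    range_filter_params p = { x, y, range };

    MB_Iterator_CreateFiltered(board, &iter, within_range, &p);
    /* ... read messages with MB_Iterator_GetMessage as before ... */
    MB_Iterator_Delete(&iter);
}
```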

Page 21: High Performance Computing and the FLAME Framework

MB Iterators (1)

Iterators are objects used for traversing message board content. They give users access to messages while isolating them from the internal data representation of the boards.

Creating an iterator generates a list of the messages available within the board against a specific criterion. This list is a snapshot of the content of the local board.

Page 22: High Performance Computing and the FLAME Framework


MB Iterators (2)

Page 23: High Performance Computing and the FLAME Framework

Porting to Parallel Platforms

FLAME has been successfully ported to various HPC systems:
– SCARF – 360 × 2.2 GHz AMD Opteron cores, 1.3 TB total memory
– HAPU – 128 × 2.4 GHz Opteron cores, 2 GB memory per core
– NW-Grid – 384 × 2.4 GHz Opteron cores, 2 or 4 GB memory per core
– HPCx – 2560 × 1.5 GHz Power5 cores, 2 GB memory per core
– Legion (Blue Gene/P) – 1026 × 850 MHz PowerPC, 4096 cores
– Leviathan (UNIBI) – 3 × Intel Xeon E5355 (quad core), 24 cores

Page 24: High Performance Computing and the FLAME Framework

Test Models

Circles Model:
– very simple agents
– all have position data: x, y, fx, fy, radius in memory
– repulsion from neighbours
– 1 message type
– domain decomposition

C@S Model:
– mix of agents: Malls, Firms, People
– a mixture of state complexities
– all have position data
– agents have a range of influence
– 9 message types
– domain decomposition
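As an illustration of the Circles model's workload (this is a hypothetical sketch, not the actual model code), each agent reads its neighbours' position messages and accumulates a repulsive force from any neighbour that overlaps it:

```c
/* Hypothetical sketch of the kind of pairwise repulsion the Circles
 * model computes -- not the actual FLAME model code. */
#include <math.h>

typedef struct { double x, y, fx, fy, radius; } circle_agent;

void add_repulsion(circle_agent *a, double nx, double ny, double nradius)
{
    double dx = a->x - nx;
    double dy = a->y - ny;
    double dist = sqrt(dx * dx + dy * dy);
    double overlap = (a->radius + nradius) - dist;

    if (dist > 0.0 && overlap > 0.0) {
        /* push the agent away along the line joining the two centres */
        a->fx += (dx / dist) * overlap;
        a->fy += (dy / dist) * overlap;
    }
}
```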

Page 25: High Performance Computing and the FLAME Framework


Circles Model

Page 26: High Performance Computing and the FLAME Framework


C@S Model

Page 27: High Performance Computing and the FLAME Framework


Bielefeld Model

Page 28: High Performance Computing and the FLAME Framework

Dynamic Load Balancing

Work has only just started. The goal is to move agents between compute nodes in order to:
– reduce the overall elapsed time
– increase parallel efficiency

There is an interaction between computational efficiency and the overall elapsed time.
The requirements of communications and load may conflict!

Page 29: High Performance Computing and the FLAME Framework

Balance - Load vs. Communication

Distribution 1:
– P1: 13 agents
– P2: 3 agents
– P2 <--> P1: 1 channel

Distribution 2:
– P1: 9 agents
– P2: 7 agents
– P1 <--> P2: 6 channels

Figure: the same set of agents split across processors P1 and P2 in two ways (labelled Distribution A and Distribution B); agents communicate frequently within their own group and only occasionally across the P1/P2 boundary.
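A toy cost model (entirely hypothetical, for illustration only) makes the tradeoff concrete: take the per-step time on the slowest node as proportional to its agent count and add a fixed cost per cross-node channel. Depending on that channel cost, either distribution can come out ahead.

```c
/* Toy cost model, hypothetical and for illustration only.
 * With cost_per_channel = 0.5 agent-units:
 *   Distribution 1: max(13, 3) + 1 * 0.5 = 13.5
 *   Distribution 2: max(9, 7)  + 6 * 0.5 = 12.0
 * so the better-balanced distribution still wins here, but a larger
 * channel cost would reverse the ordering. */
#include <stdio.h>

double step_cost(int agents_p1, int agents_p2, int channels, double cost_per_channel)
{
    int max_load = agents_p1 > agents_p2 ? agents_p1 : agents_p2;
    return max_load + channels * cost_per_channel;
}

int main(void)
{
    printf("Distribution 1: %.1f\n", step_cost(13, 3, 1, 0.5));
    printf("Distribution 2: %.1f\n", step_cost(9, 7, 6, 0.5));
    return 0;
}
```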

Page 30: High Performance Computing and the FLAME Framework

Problem of Load Imbalance

Moving the wrong agents could increase the elapsed time.

Figure: work done by agents and elapsed time (in seconds) against the number of partitions, comparing a geometric distribution with a round-robin distribution.

Page 31: High Performance Computing and the FLAME Framework

HPC Issues in CLIMACE

Size of the agent population.

Granularity of the agents:
– is there a large computational load?
– how often do they communicate?

Inherent parallelism (locality) in the model:
– are the agents in groups?
– do they have short-range communication?

Size of the initial data.
Size of the outputs.

Page 32: High Performance Computing and the FLAME Framework

HPC Challenges for ABM

Effective initial static distributions.
Effective dynamic agent-migration algorithms.

Sophisticated communication strategies:
– to reduce the number of communications
– to reduce synchronisations
– to reduce communication volumes
– pre-tagging information to allow pre-fetching

Overlapping of computation with communications.
Efficient use of multi-core nodes on large systems.
Efficient use of attached processors.

