UNC-Charlotte’s Grid Computing “Seeds” framework

© 2011 Jeremy Villalobos / B. Wilkinson. Fall 2011 Grid computing course. Slides10-1.ppt. Modification date: Nov 16, 2011

Acknowledgment

The “Seeds” framework was developed by:

Jeremy Villalobos, BS, MS, PhD (UNC-Charlotte)

as part of his PhD work at UNC-Charlotte.

The following slides are based upon materials provided by Jeremy Villalobos.

Challenges in Running Parallel Applications on The Grid

Heterogeneity
Connectivity
Security
Programmability
Performance and Scalability

The Seeds framework particularly addresses programmability and scalability

MPI and OpenMP

OpenMP (shared memory): easy to program, high performance, does not scale.

MPI (distributed memory): hard to program, high performance, can scale.

“Seeds” Distributed Computing Framework

Developed at UNC-Charlotte by Jeremy Villalobos

Raises the level of programming, for ease of programmability, by basing it on pattern programming.

Programmer first identifies an appropriate parallel pattern or patterns to solve the problem - the patterns in this context being for example workpool, pipeline, synchronous, mesh/stencil, … .


Workpool pattern

“Seeds” Distributed Computing Framework

Programmer then simply implements certain Java interface methods, principally:

data diffuse

computation, and

data gather

and framework automatically distributes tasks across distributed computers and processor cores.

Key aspects:

Programmer does not write programs at the low level of message-passing MPI or thread-based APIs (not initially anyway).

Patterns are implemented automatically and code is executed on a distributed computing platform.
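
The diffuse / compute / gather split described above can be sketched without Seeds at all. The following is a minimal local illustration, assuming a thread pool stands in for the distributed workers; `SimpleWorkpool`, `WorkpoolSketch`, `run`, and `sumOfSquares` are hypothetical names invented for this sketch and are not part of the Seeds API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for a workpool pattern; NOT the Seeds interface.
interface SimpleWorkpool<I, O> {
    I diffuse(int segment);          // master: produce the input for one task
    O compute(I input);              // worker: process one task
    void gather(int segment, O out); // master: collect one result
    int dataCount();                 // total number of tasks
}

class WorkpoolSketch {
    // Runs every segment on a local thread pool and gathers results in segment order.
    static <I, O> void run(SimpleWorkpool<I, O> job, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<O>> pending = new ArrayList<>();
            for (int s = 0; s < job.dataCount(); s++) {
                final I input = job.diffuse(s);                      // master side
                pending.add(pool.submit(() -> job.compute(input))); // worker side
            }
            for (int s = 0; s < pending.size(); s++) {
                job.gather(s, pending.get(s).get());                 // master side
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("workpool task failed", e);
        } finally {
            pool.shutdown();
        }
    }

    // Example job: one number per segment, squared by a worker, summed by the master.
    static long sumOfSquares(final int n, int workers) {
        final long[] total = {0};
        run(new SimpleWorkpool<Integer, Long>() {
            public Integer diffuse(int segment) { return segment + 1; }
            public Long compute(Integer x) { return (long) x * x; }
            public void gather(int segment, Long out) { total[0] += out; }
            public int dataCount() { return n; }
        }, workers);
        return total[0];
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(10, 4)); // prints 385
    }
}
```

The programmer-facing part is only the four-method anonymous class; everything in `run` is what a framework would be expected to hide.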

“Skeletons” and “Patterns”

Note: In the literature, the term “skeleton” is sometimes used to describe “patterns”, especially directed acyclic graphs with a source, a computation, and a sink.

Skeletons

Skeletons are:
Functional programming constructs
Stateless
Data-parallel
Resemble trees

Patterns

Patterns are:
Stateful
Synchronous loop-parallel
Also data-parallel
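
One possible reading of the skeleton/pattern distinction on these two slides is sketched below. The first method applies a stateless function to independent inputs (skeleton-like); the second keeps state across iterations and completes each iteration before the next starts (pattern-like, here a simple 1-D stencil run sequentially). The class and method names are hypothetical, and the choice of examples is an assumption made for illustration, not taken from the Seeds material.

```java
import java.util.Arrays;

class SkeletonVsPattern {
    // Skeleton-like: a stateless function mapped over independent inputs.
    static long[] mapSquares(long[] in) {
        return Arrays.stream(in).parallel().map(x -> x * x).toArray();
    }

    // Pattern-like: the cell array is state that persists across iterations.
    // Within one iteration every interior cell update is independent of the
    // others, but iteration k+1 may only start once iteration k has finished.
    static double[] relax(double[] cells, int iterations) {
        double[] cur = cells.clone();
        for (int it = 0; it < iterations; it++) {
            double[] next = cur.clone();               // boundary cells stay fixed
            for (int i = 1; i < cur.length - 1; i++) {
                next[i] = (cur[i - 1] + cur[i + 1]) / 2.0;
            }
            cur = next;                                // synchronization point
        }
        return cur;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(mapSquares(new long[] {1, 2, 3})));
        System.out.println(Arrays.toString(relax(new double[] {0, 0, 0, 8}, 2)));
    }
}
```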


Skeletons/Patterns

Advantages:
Implicit parallelization
Avoid deadlocks
Avoid race conditions
Reduction in code size [3]
Abstracts the Grid/Cloud environment

Disadvantages:
Takes away some of the freedom from the user programmer
New approach to learn
Performance toll (5% for skeletons on top of MPI [PASM])

Hands-on session

Deploy a Seeds workpool skeleton to compute π using the Monte Carlo method. All code is given.

Just local computer used for this session although remote computers could be used.

Needs Seeds installed. For the session, will also need Java, ant, and Eclipse installed.

First, a prepared ant script is used to deploy the workpool.

Then the Eclipse IDE is used to run the same program.

Later, an Eclipse example for numerical integration: code partially given. Will need to fill in the missing parts.

The basis of Monte Carlo calculations is the use of random selections.

In this case, a circle is formed within a square.

Points within square chosen randomly

Fraction of points within circle = π/4

π calculation

Can limit calculation to one quadrant and get same result

Actually computes an integral
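
The quadrant version can be written out explicitly. Assuming a unit square and a quarter circle of radius 1 (the slide does not state the radius, so this is an assumption), the fraction of random points that fall inside the quarter circle approximates a ratio of areas, which is also the value of a definite integral:

```latex
\[
\frac{\text{points inside quarter circle}}{\text{points in square}}
\;\approx\;
\frac{\text{area of quarter circle}}{\text{area of square}}
= \frac{\pi \cdot 1^2 / 4}{1^2}
= \frac{\pi}{4}
\]
\[
\int_0^1 \sqrt{1 - x^2}\,dx = \frac{\pi}{4}
\qquad\Longrightarrow\qquad
\pi \approx 4 \cdot \frac{N_{\text{inside}}}{N_{\text{total}}}
\]
```

A point (x, y) with both coordinates in [0, 1] lies inside the quarter circle exactly when x² + y² ≤ 1.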

Seeds Skeletons/Patterns

public abstract class Workpool extends BasicLayerInterface {
    /**
     * This function is used to request from the user the
     * chunks of data that will be sent
     * through the network.
     * @param segment
     * @return
     */
    public abstract Data DiffuseData(int segment);

    /**
     * Main computation method.
     * @param input
     * @return
     */
    public abstract Data Compute(Data input);

    /**
     * GatherData does the opposite of DiffuseData().
     * @param segment
     * @param dat
     */
    public abstract void GatherData(int segment, Data dat);

    /**
     * This function should tell the framework what is the
     * total number of pieces the user decided to divide
     * the input data into.
     * @return
     */
    public abstract int getDataCount();

    public String getHostingTemplate() {
        return WorkpoolTemplate.class.getName();
    }
}

Code: initial part

package edu.uncc.grid.example.workpool;

import java.util.Random;
import java.util.logging.Level;
import edu.uncc.grid.pgaf.datamodules.Data;
import edu.uncc.grid.pgaf.datamodules.DataMap;
import edu.uncc.grid.pgaf.interfaces.basic.Workpool;
import edu.uncc.grid.pgaf.p2p.Node;

public class MonteCarloPiModule extends Workpool {

    private static final long serialVersionUID = 1L;
    private static final int DoubleDataSize = 1000;

    double total;
    int random_samples;
    Random R;

    public MonteCarloPiModule() {
        R = new Random();
    }

    @Override
    public void initializeModule(String[] args) {
        total = 0;
        // reduce verbosity for logging information
        Node.getLog().setLevel(Level.WARNING);
        // set number of random samples
        random_samples = 3000;
    }

Compute method

    public Data Compute(Data data) {
        // input gets the data produced by DiffuseData()
        DataMap<String, Object> input = (DataMap<String, Object>) data;
        // output will emit the partial answers done by this method
        DataMap<String, Object> output = new DataMap<String, Object>();
        Long seed = (Long) input.get("seed"); // get random seed
        Random r = new Random();
        r.setSeed(seed);
        Long inside = 0L;
        for (int i = 0; i < DoubleDataSize; i++) {
            double x = r.nextDouble();
            double y = r.nextDouble();
            double dist = x * x + y * y;
            if (dist <= 1.0) {
                ++inside;
            }
        }
        output.put("inside", inside); // store partial answer to return to GatherData()
        return output;
    }

Diffuse and gather methods

    public Data DiffuseData(int segment) {
        DataMap<String, Object> d = new DataMap<String, Object>();
        d.put("seed", R.nextLong());
        return d; // returns a random seed for each job unit
    }

    public void GatherData(int segment, Data dat) {
        DataMap<String, Object> out = (DataMap<String, Object>) dat;
        Long inside = (Long) out.get("inside");
        total += inside; // aggregate answer from all the worker nodes.
    }

    public double getPi() {
        // returns value of pi based on the job done by all the workers
        double pi = (total / (random_samples * DoubleDataSize)) * 4;
        return pi;
    }

    public int getDataCount() {
        return random_samples;
    }
}
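
To sanity-check the arithmetic of the module without a Seeds installation, the same calculation can be run sequentially. The sketch below mirrors the structure of the slides' code (one seed per segment, a fixed number of samples per segment, a running total) but uses only `java.util.Random`. The names `MonteCarloPiCheck`, `countInside`, `estimatePi`, and `masterSeed` are hypothetical, and the fixed master seed is an assumption added for reproducibility; the original module seeds `R` from the default constructor.

```java
import java.util.Random;

class MonteCarloPiCheck {
    // Counts random points in the unit square that land inside the quarter circle.
    // Plays the role of Compute() for one segment.
    static long countInside(long seed, int samples) {
        Random r = new Random(seed);
        long inside = 0;
        for (int i = 0; i < samples; i++) {
            double x = r.nextDouble();
            double y = r.nextDouble();
            if (x * x + y * y <= 1.0) {
                inside++;
            }
        }
        return inside;
    }

    // Sequential stand-in for DiffuseData + Compute + GatherData + getPi.
    static double estimatePi(long masterSeed, int segments, int samplesPerSegment) {
        Random seeds = new Random(masterSeed);   // plays the role of R in DiffuseData()
        long total = 0;
        for (int s = 0; s < segments; s++) {
            total += countInside(seeds.nextLong(), samplesPerSegment);
        }
        return 4.0 * total / ((double) segments * samplesPerSegment);
    }

    public static void main(String[] args) {
        // Same sizes as the slides: 3000 segments of 1000 samples each.
        System.out.println(estimatePi(42L, 3000, 1000));
    }
}
```

With 3000 segments of 1000 samples each, three million points are drawn in total, so the estimate should land close to π, though the exact digits depend on the seed.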


Running example using ant script

Get the prepared package PiApprox.zip. It contains everything needed:
Seeds
π code
Numerical integration code

Run the ant script. It will compile all software and run the PiApprox program.

Using Eclipse

Advantage of using the ant script first is everything is deployed (seed folder, etc.)


Additional Features of Framework


Seeds Specifications

Topology: Network Overlay/Hybrid Network
Connectivity: Direct, Through NAT, and Shared memory
Programming Style: Skeleton/Pattern Based
Parallel Memory Management: Message-passing with use of shared memory if available
Self-deployment: Using Java CoG Kit and Globus
Load-balancing: expected on the patterns

Seeds Development Layers

Basic: Intended for programmers that have a basic parallel computing background. Based on skeletons and patterns.

Advanced: Used to add or extend functionality, like:
Create new patterns
Optimize existing patterns
Adapt an existing pattern to non-functional requirements specific to the application

Expert: Used to provide basic services:
Deployment
Security
Communication/Connectivity
Changes in the environment


Nested Pattern Structures

It may be that a single pattern will not suffice for a larger problem.

Framework offers nested patterns and facilities to break a single pattern into multiple patterns.


Pattern Operators

Also have pattern operators that can take two patterns and merge them into a single pattern.


Adding a Stencil (left) to an all-to-all (center) to produce hybrid pattern (right).

Some results


Bubble sort using a pipeline on shared memory with Seeds framework

Pipeline pattern for Bubblesort, tested on the Dell 900 server with four quad-core processors and 64GB shared memory (coit-grid05.uncc.edu).

Static test: the program was executed using a fixed number of cores, as given. Dynamic test: the number of cores was changed dynamically during execution to improve performance.

Questions
