
On-line Parallel Tomography (Shava Smallen, UCSD)

Date post: 18-Jan-2016

On-line Parallel Tomography

Shava Smallen

UCSD


Talk Outline

I) Introduction to On-line Parallel Tomography
II) Tunable On-line Parallel Tomography
III) User-directed application-level scheduler
IV) Experiments
V) Conclusion


What is tomography?

• A method for reconstructing the interior of an object from its projections

• At the National Center for Microscopy and Imaging Research (NCMIR), tomography is applied to electron microscopy to study specimens at the cellular and subcellular level


Example

[Figure: tomogram of spiny dendrite (images courtesy of Steve Lamont)]


Parallel Tomography at NCMIR

• Embarrassingly parallel

[Figure: specimen with X, Y, and Z axes, showing a projection, its scanlines, and the corresponding slice]
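Because the reconstruction is embarrassingly parallel, slices can be handed out to workers with no coordination between them. The sketch below is only an illustration of that idea: the function name `partition_slices` and the contiguous, near-equal split are assumptions, not the allocation strategy used at NCMIR.

```python
def partition_slices(num_slices, num_workers):
    """Split slice indices 0..num_slices-1 into contiguous, near-equal ranges.

    Hypothetical helper: each range could be reconstructed by a separate
    worker, since no slice depends on another.
    """
    base, extra = divmod(num_slices, num_workers)
    ranges, start = [], 0
    for w in range(num_workers):
        size = base + (1 if w < extra else 0)  # first `extra` workers get one more
        ranges.append(range(start, start + size))
        start += size
    return ranges


print(partition_slices(10, 3))
# [range(0, 4), range(4, 7), range(7, 10)]
```

An even split ignores differences in machine speed and load; it is shown only to make the independence of slices concrete.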


NCMIR Usage Scenarios

Off-line parallel tomography (off-line PT)

– Data resides somewhere on secondary storage

– Single, high quality tomogram

– Reduce turnaround time

– Previous work (HCW’00)

On-line parallel tomography (on-line PT)

– Data streamed from the electron microscope

• long makespan, configuration errors, etc.

– Iteratively computed tomogram

– Soft real-time execution


On-line PT

• Real-time feedback on quality of data acquisition
1) First projection acquired from microscope
2) Generate coarse tomogram
3) Iteratively refine tomogram using subsequent projections (refresh)
• Update each voxel value
• Size of tomogram is constant
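The refinement loop above can be sketched for a single slice. This is a minimal illustration, assuming plain unfiltered backprojection with nearest-neighbour lookup; the slides do not say which reconstruction method is used, and the names `backproject` and `online_reconstruct` are hypothetical. It shows the two properties listed: every voxel is updated by each projection, and the slice never changes size.

```python
import math


def backproject(slice_xz, scanline, theta):
    """Add one scanline's contribution to every voxel of the slice, in place.

    Assumption: unfiltered backprojection, nearest detector bin.
    """
    n = len(slice_xz)
    c = (n - 1) / 2.0  # rotate about the centre of the slice
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    for z in range(n):
        for x in range(n):
            t = (x - c) * cos_t + (z - c) * sin_t + c
            i = int(round(t))
            if 0 <= i < len(scanline):
                slice_xz[z][x] += scanline[i]


def online_reconstruct(projections, n, r=1):
    """Yield a snapshot of the n x n slice after every r projections (a refresh)."""
    slice_xz = [[0.0] * n for _ in range(n)]  # size stays constant throughout
    for count, (scanline, theta) in enumerate(projections, start=1):
        backproject(slice_xz, scanline, theta)
        if count % r == 0:
            yield [row[:] for row in slice_xz]


# Two identical scanlines at tilt 0, refreshed after every projection.
stream = [([1, 2, 3], 0.0), ([1, 2, 3], 0.0)]
for snapshot in online_reconstruct(stream, n=3, r=1):
    print(snapshot[0])
```

The first snapshot plays the role of the coarse tomogram; later snapshots are the refreshes.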


NCMIR Target Platform

• Multi-user, heterogeneous resources
– NCMIR cluster
• SGI Indigo2, SGI Octane, SUN ULTRA, SUN Enterprise
• IRIX, Solaris
– Meteor cluster
• Pentium III dual proc
• Linux, PBS
– Blue Horizon
• AIX, LoadLeveler, Maui Scheduler

[Figure: resources connected through a network]


On-line PT Architecture

[Figure: data-flow diagram with the labels projection, preprocessor, scanlines, ptomo (five instances), slices, writer, and tomogram]


On-line PT Design

1) Frame on-line parallel tomography as a tunable application
– Resource limitations / dynamic
– Availability of alternate configurations [Chang, et al.]
• each configuration corresponds to different output quality and resource usage

2) Coupled with user-directed application-level scheduler (AppLeS)
– adaptive scheduler
– promote application performance


On-line PT Configuration

• Triple: (f, r, su)

• Reduction factor (f)
– Reduce resolution of data → reduce both computation and communication

• Projections per refresh (r)
– Reduce refinement frequency → reduce communication

• Service units (su)
– Increase cost of execution → increase computational power


User Preferences

• Best configuration (f, r, su) = (1, 1, 0)

• Several possible configurations → user specifies bounds
– projections should be at least size 256x256
• 1 ≤ f ≤ 4 or 1 ≤ f ≤ 8
– user could tolerate up to a 10 minute time wait
• 1 ≤ r ≤ 13
– reasonable upper bound
• 0 ≤ su ≤ (50 x acquisition period x c)
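The f and r bounds above can be reproduced with a little arithmetic. The sketch below is a guess at how they arise, not something the slides state: it assumes f divides each side of a square projection, that the wait for a refresh is r acquisition periods, that projections are 1024 or 2048 pixels on a side, and that the acquisition period is 45 seconds. The input values were picked because they reproduce the bounds on the slide; the function names are hypothetical.

```python
import math

MIN_SIDE = 256        # from the slide: projections at least 256x256
MAX_WAIT_S = 10 * 60  # from the slide: up to a 10 minute wait


def max_reduction_factor(side_pixels, min_side=MIN_SIDE):
    """Largest f such that side_pixels / f is still at least min_side."""
    return side_pixels // min_side


def max_projections_per_refresh(acquisition_period_s, max_wait_s=MAX_WAIT_S):
    """Largest r such that r acquisition periods fit in the tolerated wait."""
    return math.floor(max_wait_s / acquisition_period_s)


print(max_reduction_factor(1024))       # 4  (assumed 1024-pixel projections)
print(max_reduction_factor(2048))       # 8  (assumed 2048-pixel projections)
print(max_projections_per_refresh(45))  # 13 (assumed 45 s acquisition period)
```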


User-directed

• Feasible?
– Use dynamic load information
– if work allocation found

• Better?
– e.g. (reduction factor, projections per refresh, service units)
1. (1, 6, 4) - best f
2. (2, 2, 8) - good su/r
3. (2, 1, 20) - best r


User-directed AppLeS

[Figure: flowchart of the interaction between the User and the User-directed AppLeS. Steps: generate request, process request, find work allocation, display triples, review triples, adjust request, execute on-line PT. Edge labels: feasible, infeasible, accepts one, rejects all.]
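One way to read the flowchart is as a negotiation loop between the user and the scheduler. The sketch below is an interpretation of the step and edge labels only: the exact wiring of the diagram is not recoverable from this transcript, and every name here (`negotiate`, `find_feasible_triples`, `choose`, `adjust`, `max_rounds`) is hypothetical.

```python
def negotiate(request, find_feasible_triples, choose, adjust, max_rounds=10):
    """Loop until the user accepts a triple or gives up.

    find_feasible_triples(request) -> list of (f, r, su); empty means infeasible
    choose(triples) -> one accepted triple, or None for "rejects all"
    adjust(request) -> a revised request
    """
    for _ in range(max_rounds):
        triples = find_feasible_triples(request)  # process request, find work allocation
        if triples:                               # feasible: display triples
            picked = choose(triples)              # user reviews triples
            if picked is not None:                # accepts one
                return picked                     # caller would execute on-line PT
        request = adjust(request)                 # infeasible or rejects all
    return None


# Toy usage: the scheduler only finds a triple once r may be at least 2.
def toy_scheduler(req):
    return [(1, 2, 1)] if req["max_r"] >= 2 else []


picked = negotiate(
    {"max_r": 1},
    toy_scheduler,
    choose=lambda triples: triples[0],
    adjust=lambda req: {"max_r": req["max_r"] + 1},
)
print(picked)  # (1, 2, 1)
```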


Triple Search

• Search parameter space
– If triple satisfies constraints → feasible

• Constrained optimization problem based on soft real-time execution
– compute constraint
– transfer constraint

• Heuristics to reduce search space
– e.g. assume user will always choose (1,2,1) over (1,2,4)
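The heuristic in the last bullet can be generalised to a dominance filter. The sketch below assumes that lower is better for all three parameters (consistent with the best configuration being (1, 1, 0)) and discards any triple that another triple matches or beats on every parameter. Whether the scheduler prunes in exactly this way is not stated in the slides; `dominates` and `prune` are hypothetical names.

```python
def dominates(a, b):
    """True if triple a is no worse than b in f, r and su, and differs from b."""
    return a != b and all(x <= y for x, y in zip(a, b))


def prune(triples):
    """Keep only triples that no other triple dominates."""
    return [t for t in triples if not any(dominates(o, t) for o in triples)]


print(prune([(1, 2, 1), (1, 2, 4)]))
# [(1, 2, 1)]
print(prune([(1, 6, 4), (2, 2, 8), (2, 1, 20)]))
# [(1, 6, 4), (2, 2, 8), (2, 1, 20)]
```

In the second call none of the triples dominates another, so all three would still have to be shown to the user for a decision.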


Work Allocation

[Figure: diagram relating the work allocation to user constraints, cost, compute constraints, transfer constraints, cpu availability, processor availability, ptomo-to-writer bandwidth, and subnet-to-writer bandwidth]

• Multiple mixed-integer programs → approximate solution


Experiments

• Impact of dynamic information on scheduler performance

• Usefulness of tunability in Grid environments

• Scheduling latency


Dynamic Information

• We fix the triple and let schedulers determine work allocation

                 Infinite bandwidth    Dynamic bandwidth
Dedicated cpu    wwa                   wwa+bw
Dynamic cpu      wwa+cpu               AppLeS


Simulation

• Evaluate schedulers
– Repeatability
– Long makespan
– several resource environments

• Simgrid (Casanova [CCGrid’2001])
– API for evaluating scheduling algorithms
• tasks
• resources modeled using traces
– E.g. Parameter sweep applications [HCW’00]

• Simtomo


Performance Metric

• Relative refresh lateness

[Equation not recoverable from this transcript; surviving labels: relative refresh lateness, expected refresh period, actual refresh period]


NCMIR experiments

• Traces (8 machines)
– 8 hour work day on March 8th, 2001

• Ran simulations throughout day at 10 minute intervals

[Figure: timeline from 8:00 am to 4:00 pm]


Perfect Load Predictions

[Figure: mean relative refresh lateness (log scale, 10^0 to 10^4) versus hours since 3/8/2001 - 8:00 PST (0 to 8); one curve per scheduler: wwa, wwa+cpu, wwa+bw, AppLeS]


Imperfect Load Predictions

[Figure: mean relative refresh lateness (log scale, 10^0 to 10^4) versus hours since 3/8/2001 - 8:00 PST (0 to 8); one curve per scheduler: wwa, wwa+cpu, wwa+bw, AppLeS]


Synthetic Grids

• Bandwidth predictability
– Average prediction error
– p_i ∈ {L, M, H}
– p1 p2 p3
• e.g. LMH
– 27 types
– 2510 Grids x 4 schedulers
– 10,040 simulations

[Figure: topology with labels writer, cluster1, cluster2, cluster3, and p1, p2, p3]
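The counts on this slide follow from simple enumeration. The sketch below assumes a Grid type is a string of three independent levels, one per position p1, p2, p3; the helper name `grid_types` is hypothetical.

```python
from itertools import product

LEVELS = "LMH"  # the three predictability categories from the slide


def grid_types(num_positions):
    """All strings of the given length over the alphabet {L, M, H}."""
    return ["".join(p) for p in product(LEVELS, repeat=num_positions)]


types3 = grid_types(3)
print(len(types3))         # 27
print("LMH" in types3)     # True
print(2510 * 4)            # 10040
print(len(grid_types(5)))  # 243
```

The last line applies the same enumeration to five positions, which matches the 243 Grid types quoted for the tunability experiments.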


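Relative Scheduler Performance

[Figure: bar chart of number of runs (0 to 3000) per scheduler (wwa, wwa+cpu, wwa+bw, AppLeS), with bars for 1st, 2nd, 3rd, and 4th place; the slide also prints the four values 705.89, 658.91, 127.10, 1.07, whose label is not preserved in this transcript]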


Partial Ordering

• Performance vs. bandwidth predictability

• Grid predictability
– Partial orders using p1 p2 p3
– Comparable/Not Comparable
• e.g. HML is comparable to HLL
• e.g. HLM is not comparable to LHM

• HHH, HMM, HLM, MLM, LLM, LLL form one such order, with HHM between HHH and HMM
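The comparability examples are consistent with a component-wise order on the three letters. The sketch below assumes L < M < H and that one Grid type is at least another when it is at least as high in every position; the slides do not spell out the definition, so treat this as one reading that matches the examples. The names `RANK`, `at_least`, and `comparable` are hypothetical.

```python
RANK = {"L": 0, "M": 1, "H": 2}  # assumed ordering of the categories


def at_least(a, b):
    """True if type a is >= type b in every position."""
    return all(RANK[x] >= RANK[y] for x, y in zip(a, b))


def comparable(a, b):
    """Two types are comparable if one is component-wise >= the other."""
    return at_least(a, b) or at_least(b, a)


print(comparable("HML", "HLL"))  # True
print(comparable("HLM", "LHM"))  # False

chain = ["HHH", "HHM", "HMM", "HLM", "MLM", "LLM", "LLL"]
print(all(at_least(a, b) for a, b in zip(chain, chain[1:])))  # True
```

The last check confirms that the seven types listed on the slide form a chain under this ordering, so every pair in that list is comparable.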


Example Partial Order

[Figure: relative refresh lateness in seconds (log scale, 10^0 to 10^4) for Grid types HHH, HHM, HMM, HLM, MLM, LLM, LLL; one series per scheduler: wwa, wwa+cpu, wwa+bw, AppLeS]


Tunability Experiments

• How useful is tunability?
– variability

• Fixed topology
– categorized traces
• L, M, H
– v1 v2 v3 v4 v5
– 243 Grid types

[Figure: fixed topology with labels cluster1, cluster2, supercomputer, writer, and v1 to v5]


Tunability Experiments

• Run over a 2 day period
– back-to-back
– assume single user model
• f, r, su

• Set of triples chosen
– T = {1,…,61}

[Figure: 3-D plot with axes f, r, and su]


Tunability Results

[Figure: fraction of changes (0 to 1) for each of the parameters f, r, and su]

• Count how many times a triple changed per 2-day simulation

• e.g.
– 12.9%
– 25.7%


Scheduling Latency

[Figure: histogram of number of experiments (0 to 7000) versus seconds (0 to 10)]

• Time to search for feasible triples
• e.g.
– 88% under 1 sec
– 63% under 1 sec


Conclusions and Future Work

• Grid-enabled version of on-line parallel tomography
– Tunable application
• Tunability is useful in Grid environments
– User-directed AppLeS
• Importance of bandwidth predictability
– e.g. rescheduling
• Scheduling latency is nominal
• Production use

