Page 1: Sadiq M. Sait June 15, 2003 King Fahd University of Petroleum & Minerals Dhahran

Parallelization of Iterative Heuristics for Performance Driven Low-Power VLSI

Standard Cell Placement

Research Committee Project Seminar

Sadiq M. Sait

June 15, 2003

King Fahd University of Petroleum & Minerals, Dhahran

Page 2:

Copyright © 2003 Parallelization of Iterative Heuristics 2

Outline
- Introduction
- Parallel CAD & Environments
- Problem Formulation & Cost Functions
- Fuzzification of Cost
- Parallel Iterative Heuristics
- Benchmarks & Experimental Setup
- Tools
- Objectives, Tasks
- Conclusion

Page 3:

Introduction & Motivation
- Problem: VLSI cell placement is NP-hard
- Motivation for multi-objective VLSI optimization: can include new issues such as Power, Delay, and Wirelength
- Motivation for using non-deterministic iterative heuristics (Simulated Annealing, Genetic Algorithm, Simulated Evolution, Tabu Search, Stochastic Evolution):
  - Stable
  - Hill climbing
  - Convergence
  - If engineered properly, the longer the CPU time (or the more CPU power available), the better the quality of the solution (due to search-space exploration)

Page 4:

Motivation for Parallel CAD
- Faster runtimes to achieve the same results
- Larger problem sizes with greater CPU & memory requirements can be handled
- Better quality results
- Cost-effective technology compared to a large multiprocessor supercomputer
- Utilization of the abundant computing power of PCs that remain idle for a large fraction of the time
- Availability and affordability of high-speed transmission links, on the order of Gigabits/sec

Page 5:

Environments
Consists of:
- Programming Paradigm
- Software
- Hardware Platform

Need to choose:
- A hardware platform that is locally available
- A programming paradigm that is suitable for the selected platform

Page 6:

Environments—Software
Four possible programming paradigms:
- Explicit message-passing using communication libraries:
  - Parallel Virtual Machine (PVM): C/C++/Fortran
  - Message Passing Interface (MPI): C/C++/Fortran
- Data-parallel languages:
  - Message-passing implemented through a parallel language
  - Compiler directives for specification of data distribution
  - Example: High Performance Fortran (HPF)
- Shared-memory multiprocessing:
  - Loop-level (fine-grained) parallelism, identified through compiler directives
  - Example: OpenMP standard
- Lightweight threads (that use OS threads for parallelizing):
  - OS-level threads used for hiding latencies (e.g., memory stalls)
  - Examples: POSIX (Portable Operating System Interface) threads and Java threads

Page 7:

Environments—Software (Cont.)

Our choice: MPI and/or multithreading with C/C++
- Explicit message-passing: better control, higher performance
- Multithreading: useful to hide CPU stalls
- Cluster of PCs (cost-effective): affordable & abundantly available, with freely available software tools

Our choice: cluster of Pentium III or IV PCs, connected through a GigE switch

Page 8:

Problem Formulation & Cost Functions
Cell placement problem:
- Placement consists of finding suitable locations for each cell on the entire layout
- By suitable locations we mean those that minimize a given objective, subject to certain constraints (width)

In this work we target the minimization of:
- Power
- Delay
- Wirelength
subject to a constraint on the width of the layout

Page 9:

Cost function for Power
In standard CMOS technology, power dissipation is a function of the clocking frequency, the supply voltage, and the capacitances in the circuit (recall that static power is zero except for leakage):

    Ptotal = β · Vdd² · fclk · Σi (pi · Ci)

where
- Ptotal is the total power dissipated
- pi is the switching probability of gate i
- Ci represents the capacitive load of gate i
- fclk is the clock frequency
- Vdd is the supply voltage
- β is a technology-dependent constant
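This power estimate can be sketched directly in code; the gate data below is hypothetical, and the formula is the standard dynamic-power form implied by the definitions above:

```python
# Sketch of the slide's power cost, assuming the standard dynamic-power form:
# Ptotal = beta * Vdd^2 * fclk * sum_i(p_i * C_i). Gate data is hypothetical.

def total_power(gates, f_clk, v_dd, beta):
    """gates: iterable of (switching_probability, load_capacitance) pairs."""
    switched_cap = sum(p_i * c_i for p_i, c_i in gates)
    return beta * (v_dd ** 2) * f_clk * switched_cap

# Example: three gates with assumed probabilities (unitless) and loads (farads)
gates = [(0.5, 10e-15), (0.2, 25e-15), (0.1, 5e-15)]
p_total = total_power(gates, f_clk=100e6, v_dd=3.3, beta=1.0)
```

In a placement loop this would be re-evaluated per candidate solution, since moving cells changes the interconnect part of each Ci.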

Page 10:

Cost function for Delay
The delay of any given long path is computed as the summation of the delays of the nets v1, v2, ..., vk belonging to that path and the switching delays of the cells driving these nets. The delay of a given path is

    Tpath = Σi=1..k (CDvi + IDvi)

where
- CDvi is the switching delay of the driving cell
- IDvi is the interconnection delay, given by the product of the load factor of the driving cell and the capacitance of the interconnection net: IDvi = LFvi · Cvi

We target only a selected set of long paths.
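The path-delay computation can be sketched directly from these definitions (all numbers are hypothetical):

```python
# Sketch of the slide's path delay: sum over the nets of a path of the driving
# cell's switching delay CD plus the interconnect delay ID = LF * C.
# The paths and their numeric values below are hypothetical.

def path_delay(path):
    """path: iterable of (CD, LF, C) triples, one per net on the path."""
    return sum(cd + lf * c for cd, lf, c in path)

# Two hypothetical long paths; the delay cost is the worst over the
# selected set of long paths
path_a = [(0.8, 0.4, 1.2), (1.1, 0.3, 2.0)]
path_b = [(0.5, 0.5, 0.6)]
delay_cost = max(path_delay(p) for p in (path_a, path_b))
```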

Page 11:

Cost function for Wirelength
A Steiner tree approximation is used for each multi-pin net (spanning the n cells contributing to the net), and the sum of all the Steiner trees is taken as the estimated wirelength of the proposed solution.

The per-net estimate is

    Xnet = B + Σj=1..k Dj

where B is the length of the bisecting line, k is the number of cells contributing to the net, and Dj is the perpendicular distance from cell j to the bisecting line.
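The bisecting-line estimate can be sketched as follows; placing a horizontal bisecting line at the median y-coordinate of the net's pins is an assumption made here for illustration:

```python
# Sketch of the bisecting-line Steiner estimate described on the slide:
# per-net wirelength ~ B + sum_j(D_j), where B is the length of the bisecting
# line and D_j the perpendicular distance of cell j from it. Choosing a
# horizontal bisecting line at the median y-coordinate is an assumption.

def net_wirelength(pins):
    """pins: list of (x, y) cell-pin coordinates for one net."""
    ys = sorted(y for _, y in pins)
    y_bis = ys[len(ys) // 2]                   # bisecting line's y-position
    xs = [x for x, _ in pins]
    b = max(xs) - min(xs)                      # length B of the bisecting line
    d = sum(abs(y - y_bis) for _, y in pins)   # perpendicular distances D_j
    return b + d

def total_wirelength(nets):
    # Sum the per-net estimates over all nets of the placement
    return sum(net_wirelength(net) for net in nets)
```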

Page 12:

Width Cost
- The cost of the width is given by the maximum of all the row widths in the layout
- The layout width is constrained not to exceed the average row width wavg by more than a certain positive ratio α
- This can be expressed as

      Width ≤ (1 + α) · wavg
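A check of this width constraint can be sketched as follows; the row widths are hypothetical, and the exact form Width ≤ (1 + α) · wavg is an assumption consistent with the slide's wording:

```python
# Sketch of the layout-width constraint: the width cost is the maximum row
# width, and it must not exceed the average row width w_avg by more than the
# ratio alpha, i.e. Width <= (1 + alpha) * w_avg (assumed form).

def width_ok(row_widths, alpha):
    width = max(row_widths)                    # width cost of the layout
    w_avg = sum(row_widths) / len(row_widths)  # average row width
    return width <= (1.0 + alpha) * w_avg

rows = [95.0, 100.0, 105.0, 102.0]             # hypothetical row widths
fits = width_ok(rows, alpha=0.1)
```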

Page 13:

Fuzzification of Cost
- Three objectives are optimized simultaneously, subject to the width constraint. This is done using fuzzy logic to integrate the multiple objectives into a scalar cost function
- This is translated as an OWA (ordered weighted average) fuzzy operator, and the membership µ(x) of a solution x in the fuzzy set of acceptable solutions is given by the OWA combination of the per-objective memberships
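The OWA aggregation can be sketched as below. The particular blend used here, a γ-weighted combination of the minimum and the mean of the per-objective memberships, is one common OWA form and is an assumption, as are the membership values:

```python
# Sketch of aggregating per-objective memberships (power, delay, wirelength)
# into one scalar via an ordered-weighted-average (OWA) operator. The blend
# mu = gamma * min_j(mu_j) + (1 - gamma) * mean_j(mu_j) is one common OWA
# form, assumed here; real memberships would come from the per-objective
# membership functions of the candidate placement.

def owa_membership(memberships, gamma=0.7):
    n = len(memberships)
    return gamma * min(memberships) + (1.0 - gamma) * sum(memberships) / n

# Hypothetical memberships of one solution in the three objective fuzzy sets
mu = owa_membership([0.8, 0.6, 0.9], gamma=0.7)
```

A larger γ biases the search toward balanced solutions (the worst objective dominates), while a smaller γ behaves more like a weighted average.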

Page 14:

Review of Parallel Iterative Heuristics

For combinatorial optimization, three types of parallelization strategies are reported in the literature:
- The operations within an iteration of the solution method are parallelized (for example, cost computation in Tabu Search)
- The search space (problem domain) is decomposed
- Multi-search threads (for example, the built-in library in PVM generates threads and distributes them to different processors)

Page 15:

Parallelization of Tabu Search
Crainic et al. classify parallel Tabu Search heuristics based on a taxonomy along three dimensions:
- Control cardinality: 1-control, p-control (in 1-control there is one master in a master-slave arrangement; in p-control every processor has some control over the execution of the algorithm, such as maintaining its own Tabu list, etc.)
- Control and communication type:
  - Rigid Synchronization (RS): the slave just executes the master's instructions and synchronizes
  - Knowledge Synchronization (KS): more information is sent from the master to the slave
  - Collegial (C): asynchronous type; the power to execute is given to all processors
  - Knowledge Collegial (KC): similar to the above, but the master gives more information to the slaves

Page 16:

Parallelization of Tabu Search
- Search differentiation:
  - SPSS (single point, single strategy)
  - SPDS (single point, different strategies)
  - MPSS (multiple points, single strategy)
  - MPDS (multiple points, different strategies)
- Parallel Tabu Search can also be classified as:
  - Synchronous: 1-control RS, or 1-control KS
  - Asynchronous: p-control C, or p-control KC

Page 17:

Related work (Parallel TS)
- Yamani et al. present Heterogeneous Parallel Tabu Search for VLSI cell placement using PVM
- They parallelized the algorithm on two levels simultaneously:
  - On the higher level, the master starts a number of Tabu Search Workers (TSWs) & provides them with the same initial solution
  - On the lower level, the candidate list is constructed, where each TSW starts a Candidate List Worker (CLW)
- Based on the aforementioned taxonomy, the algorithm can be classified as:
  - Higher level: p-control, Rigid Sync, MPSS
  - Lower level: 1-control, Rigid Sync, MPSS

Page 18:

Parallel Genetic Algorithm
- Most GA parallelization techniques exploit the fact that GAs work with a population of chromosomes (solutions)
- The population is partitioned into sub-populations which evolve independently using a sequential GA
- Interaction among the smaller communities is occasionally allowed
- The parallel execution model is a more realistic simulation of natural evolution
- The reported parallelization strategies are:
  - Island model
  - Stepping stone model
  - Neighborhood model

Page 19:

Parallelization strategies for GA
- Island model:
  - Each processor runs a sequential GA
  - Periodically, subsets of elements are migrated
  - Migration is allowed between all subpopulations
- Stepping stone model:
  - Similar to the island model, but communication is restricted to neighboring processors only
  - This model defines the fitness of individuals relative to other individuals in the local subpopulation
- Neighborhood model:
  - Every individual has its own neighborhood, defined by some diameter
  - Suitable for massively parallel machines, where each processor is assigned one element
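The island model described above can be sketched in a few lines. Everything concrete here (the bit-string encoding, the count-of-ones objective, one-bit mutation, and all-to-all migration of a single best individual every five generations) is a hypothetical stand-in for a real placement GA:

```python
# Island-model sketch: each island evolves independently with a sequential-GA
# step; periodically the best of every island migrates to all other islands.
import random

random.seed(0)
N_ISLANDS, POP, GENES, MIGRATE_EVERY = 4, 10, 16, 5

def fitness(ind):
    return sum(ind)  # toy objective: count of 1-bits

def ga_step(pop):
    # One sequential-GA generation: keep the better half, refill with
    # one-bit-mutated copies of the survivors
    pop = sorted(pop, key=fitness, reverse=True)
    survivors = pop[: POP // 2]
    children = []
    for parent in survivors:
        child = parent[:]
        child[random.randrange(GENES)] ^= 1
        children.append(child)
    return survivors + children

islands = [[[random.randint(0, 1) for _ in range(GENES)] for _ in range(POP)]
           for _ in range(N_ISLANDS)]

for gen in range(1, 31):
    islands = [ga_step(pop) for pop in islands]
    if gen % MIGRATE_EVERY == 0:
        # Island model: migration allowed between all subpopulations; the
        # incoming individuals displace the weakest members of each island
        bests = [max(pop, key=fitness) for pop in islands]
        for i, pop in enumerate(islands):
            incoming = [b[:] for j, b in enumerate(bests) if j != i]
            pop.sort(key=fitness, reverse=True)
            pop[-len(incoming):] = incoming

best = max((ind for pop in islands for ind in pop), key=fitness)
```

Restricting `incoming` to adjacent islands only would turn this into the stepping stone model.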

Page 20:

Related work (Parallel GA)
- Mohan & Mazumder report VLSI standard cell placement using a network of workstations
- Programs were written in C, using UNIX's rexec (remote execution) and socket functions
- They summarized their findings as:
  - The result quality obtained by the serial and parallel algorithms was the same
  - Statistical observations showed that, for any desired final result, the parallel version could provide close to linear speed-up
  - Parameter settings for migration and crossover were identified for an optimal communication pattern
  - Static and dynamic load-balancing schemes were implemented to cope with network heterogeneity

Page 21:

Benchmarks
ISCAS benchmark circuits are used (S-prefixed circuits from ISCAS-89, C3540 from ISCAS-85):

Circuit   No. of gates   No. of paths
S298      136            150
S386      172            205
S641      433            687
S832      310            240
S953      440            583
S1196     561            600
S1238     540            661
S1488     667            557
S1494     661            558
C3540     1753           668
S9234     5844           512
S13207    8744           708
S15850    10470          512

Page 22:

Experimental Setup
Consists of:
- Hardware:
  - Cluster of 8 machines, x86 architecture, Pentium 4, 2 GHz clock speed, 256 MB of memory
  - Cluster of 8 machines, x86, Pentium III, 600 MHz clock speed, 64 MB memory
- Connectivity:
  - Gigabit Ethernet switch
  - 100 Mbit/sec Ethernet switch
- OS: Linux
- Available parallel environments: PVM, MPI

Page 23:

Tools
- Profiling tools:
  - Intel's VTune Performance Analyzer
  - Gprof (built-in Unix tool)
  - VAMPIR (an MPI profiling tool)
  - upshot (tool bundled with the MPICH implementation)
- Debugging tools:
  - TotalView (from Etnus)
  - GNU debugger (gdb)

Page 24:

Objectives
- The proposed research work will build upon our previous attempts, which focused on designing and engineering sequential iterative algorithms for VLSI standard cell placement
- The primary objective of the present work is to accelerate the exploration of the search space by employing the available computational resources, for the same problem and with the same test cases

Page 25:

Tasks outline
1. Setting up the cluster environment
2. Porting of code, followed by review and analysis of the sequential implementation to identify performance bottlenecks
3. Literature review of work related to parallelization of iterative heuristics
4. Investigation of different acceleration strategies
5. Collection of data and tools for implementation, analysis, and performance evaluation
6. Design and implementation of parallel heuristics
7. Compilation of results and comparison
8. Documentation of the developed software

Page 26:

Current Status
- Two clusters ready, switches procured; initial experiments indicate that the existing infrastructure is adequate for starting the project
- All serial code initially developed on PCs/Windows is now being ported to the cluster environment running Linux
- Insight into all the code is being obtained, and bottlenecks are being identified using profiling tools
- Initial experiments conducted with Tabu Search show promising results. There is a lot to be done.
- Some experiments are being carried out using Genetic Algorithms.

Page 27:

Simple Trivial Implementation
- The first synchronous parallel Tabu Search implementation is built according to the 1-control, Rigid Synchronization (RS), SPSS strategy
- In this implementation, the master process executes the Tabu Search algorithm, while the candidate list (neighborhood of size N) is divided among p slaves
- Each slave evaluates the cost of its (N/p) moves (swaps of cells) and reports the best of them to the master
- The master collects the solutions from all slaves and accepts the best among them as the initial solution for the next iteration
- There is communication and exchange of information between the master and the slaves after every iteration
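The scheme above can be sketched as follows. A thread pool stands in for the p MPI slaves, and the placement encoding, cost function, and iteration count are hypothetical; the tabu list and aspiration criteria of a full Tabu Search are omitted for brevity:

```python
# Sketch of the 1-control, RS, SPSS scheme: the master splits the candidate
# list of N swap moves into p chunks, each slave returns the best move of its
# chunk, and the master keeps the overall best as the next initial solution.
from concurrent.futures import ThreadPoolExecutor
import itertools
import random

random.seed(1)

def cost(placement):
    # Toy stand-in cost: how far each cell sits from its ideal slot. A real
    # implementation would use the fuzzified power/delay/wirelength cost.
    return sum(abs(cell - slot) for slot, cell in enumerate(placement))

def best_in_chunk(args):
    # "Slave": evaluate the N/p swap moves of one chunk, return the best
    placement, chunk = args
    def after(move):
        i, j = move
        p = placement[:]
        p[i], p[j] = p[j], p[i]          # swap two cells
        return p
    return min((after(m) for m in chunk), key=cost)

def parallel_ts_iteration(placement, p_slaves=4):
    # "Master": build the candidate list, scatter it, gather the chunk
    # winners, and keep the global best as the next initial solution
    moves = list(itertools.combinations(range(len(placement)), 2))
    chunks = [moves[k::p_slaves] for k in range(p_slaves)]
    with ThreadPoolExecutor(max_workers=p_slaves) as pool:
        winners = list(pool.map(best_in_chunk, [(placement, c) for c in chunks]))
    return min(winners, key=cost)

placement = list(range(8))
random.shuffle(placement)
best_seen = placement[:]
for _ in range(20):                      # master and slaves sync every iteration
    placement = parallel_ts_iteration(placement)
    if cost(placement) < cost(best_seen):
        best_seen = placement[:]
```

In the actual implementation each chunk would go to a separate MPI slave process, with a gather at the master after every iteration, exactly as the slide describes.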

Page 28:

Two test circuits (c3540 & s9234)

Page 29:

Observation
Observations of the 1-control, RS, SPSS strategy:
- The total runtime decreased as the number of processors increased while achieving the same quality of solution (except for the c3540 circuit with 2 processors, where communication was more than computation)
- The speed-up in terms of time, for the large circuit (s9234), was:
  - 1.42 for 1 slave processor
  - 1.82 for 2 slave processors
  - 2.72 for 3 slave processors
  - 3.50 for 4 slave processors
  - 4.46 for 5 slave processors
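Dividing each reported speed-up by its slave count gives a rough per-slave efficiency figure (the speed-ups below are the ones listed above):

```python
# Per-slave efficiency of the reported s9234 speed-ups: speed-up divided by
# the number of slave processors. The figures are taken from the slide.
speedups = {1: 1.42, 2: 1.82, 3: 2.72, 4: 3.50, 5: 4.46}
efficiency = {p: s / p for p, s in speedups.items()}
# e.g. with 5 slaves: 4.46 / 5 = 0.892
```

The roughly flat efficiency beyond two slaves is consistent with the slide's claim that runtime keeps dropping at the same solution quality as processors are added.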

Page 30:

Utility value of the Project
- It is expected that our findings will help in addressing other NP-hard engineering problems
- The results can be used by other investigators in academia and industry to enhance existing methods
- A lab environment setup will provide support for future research and enable utilization of existing idle CPU resources. The environment will also help train graduate students in VLSI physical design research, iterative algorithms, and parallelization aspects.

Page 31:

Conclusion
- The project aims at accelerating the runtime of iterative non-deterministic heuristics by exploiting available unused CPU resources.
- It is expected that insight into how the implementations of these heuristics behave in a parallel environment will lead to newer parallelization strategies.
- Initial experiments indicate that the available environment can provide sufficient support for this type of work.

Page 32:

Thank you

Page 33:

Environments—Hardware
Choices for parallel computing hardware:
- Shared-memory multiprocessors:
  - Too expensive
  - Require specialized software tools
  - None available locally
- Cluster of PCs:
  - Affordable & abundantly available
  - Freely available software tools

Our choice: cluster of Pentium III or IV PCs, connected through a GigE switch

