Introduction
Project Purpose
What is a Genetic Algorithm?
Serial Algorithm
Modes of Parallelization
Parallel Sort
My All to All
Example Problems
Results
Future Direction
Friday, August 12, 11
Project Purpose
Learn more about Genetic Algorithm (GA) construction
Develop a Generic GA for use with various optimization problems
Study various methods of Parallelizing GAs
Develop a GA system which runs well even with expensive fitness function evaluations
What is a Genetic Algorithm?
An “optimization” system
Find good, but maybe not optimal, solutions to difficult problems
Often used on NP-Hard problems
Combinatorial Optimization
What is a Genetic Algorithm?
Requirements
Solution(s) to the problem represented as a string
A fitness function
Takes as input the solution string
Outputs the desirability of the solution
A method of combining solution strings to generate new solutions
More Details on Genetic Algorithms
Find solutions to problems by Darwinian Evolution
Potential solutions are thought of as living entities in a population
The strings are the genetic codes of the individuals
Individuals are evaluated for their fitness
The fittest individuals are allowed to live and “sexually” reproduce
There may be some mutation
Parents die and kids start the next generation
The Genetic Algorithm
Generate a set of potential solutions (genes)
Repeat until done
Find the fitness of the various members of the population
Sort the population
Allow the bottom half to die
Allow the top half to reproduce, replacing old population
Possibly have mutations
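The loop above can be sketched in a few lines of Python. This is only an illustration, not the project's code: the function name `run_ga`, its parameters, the single-point recombination, and the per-character mutation are all assumptions made for the example.

```python
import random


def run_ga(fitness, alphabet, gene_len, pop_size, generations, mut_rate, seed=0):
    """Serial GA sketch. Assumes gene_len >= 2 and pop_size >= 4."""
    rng = random.Random(seed)
    # Generate a set of potential solutions (genes) as random strings.
    pop = ["".join(rng.choice(alphabet) for _ in range(gene_len))
           for _ in range(pop_size)]
    for _ in range(generations):
        # Find the fitness of each member and sort, best first.
        pop.sort(key=fitness, reverse=True)
        # Allow the bottom half to die.
        parents = pop[: pop_size // 2]
        # Allow the top half to reproduce; children replace the old population.
        children = []
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, gene_len)
            child = a[:cut] + b[cut:]
            # Possibly mutate: each character is replaced with probability mut_rate.
            child = "".join(rng.choice(alphabet) if rng.random() < mut_rate else c
                            for c in child)
            children.append(child)
        pop = children
    return max(pop, key=fitness)


# Toy problem (an assumption for the demo): maximize the number of "1" characters.
best = run_ga(lambda g: g.count("1"), "01", 16, 40, 30, 0.01)
print(best, best.count("1"))
```

Because the parents die each generation, the best fitness in this sketch is not guaranteed to improve monotonically.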
How do solutions reproduce?
One method:
Given 2 solutions
    abcdefg
    1234567
Split the two at some arbitrary location
    abc defg
    123 4567
Recombine the pieces
    abc4567
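The split-and-recombine step on this slide can be written as a short function. The name `crossover` and the explicit `cut` argument are assumptions for illustration; in a real run the cut location would be chosen at random.

```python
def crossover(parent_a, parent_b, cut):
    """Single-point recombination: head of parent_a joined to tail of parent_b."""
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must have the same length")
    if not 0 < cut < len(parent_a):
        raise ValueError("cut must fall strictly inside the string")
    return parent_a[:cut] + parent_b[cut:]


# The example from the slide: split after the third character.
print(crossover("abcdefg", "1234567", 3))  # abc4567
```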
Primary Modes of Parallelization
Manager keeps all of the population
Population is distributed across all processors
Manager keeps all of the Population
Pass out genes with request for fitness function evaluation
Workers (and manager) do fitness function evaluations
Keep fitness value along with index into array of solutions
Discard genes
Do a parallel sort with results returned to manager
Manager reproduces top half
Comments on this strategy
Good load balance if fitness function is expensive
Could use multiple processors for a single fitness function evaluation
Works well if time for fitness function is unpredictable
Easy to implement
Bad memory balance
Large amounts of communication
Boring
Population is distributed across all Processors
Each processor does evaluation on its sub-population
Each processor does own reproduction
Many interesting variations on this theme
Variation 1
Allow processors to work independently for I generations
Do a global parallel sort of all of the fitness values
The global top half of all genes are redistributed
Comments on Variation 1
Allows anything from an uncoupled to a tightly coupled algorithm
See effect of various mutation rates
Allow separate evolutions
Simulate sequential algorithm
Requires All to All personalized communication which is difficult to set up
Send half of the population, less communication
Each processor receives same number of genes
Each processor sends a different number to various processors
Good load balance
Variation 2
Allow processors to work independently for J generations
Exchange sub-population left-right, up-down
Comments
Allows the migration of solutions across topology
Easy to implement
See the effect of various mutation rates
Somewhat artificial
Variation 3
Assign an aggression factor to each processor
Allow processors to work independently for K generations
Aggressive processors force a portion of their population onto other processors
Comments
Requires two All to All personalized communications
Tell how many are coming
Send the Genes
Variation 4
Variation 1 + Variation 2 + Variation 3
Amount of each variation is controlled by input
A goal of the project is to study the effect of combining these variations
Parallel Sort
Used to sort values of the fitness function
Results are on the manager
Each processor sorts its subset of the fitness function values
Uses a parallel gather routine to merge the sorted lists
The Sort/Merge
active = 1
while (2*active < p) active = 2 * active
if (myid >= active) then
    send(data, myid-active)
    return
endif
if (myid + active < p) then
    recv(new_data, myid+active)
    data = merge(data, new_data)
endif
while (active > 1)
    active = active / 2
    if (myid >= active) then
        send(data, myid-active)
        return
    else
        recv(new_data, myid+active)
        data = merge(data, new_data)
    endif
endwhile
Input is a sorted list from each processor
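The communication pattern of the sort/merge can be checked without MPI by simulating all ranks in one process. The sketch below is an assumption-laden stand-in: `tree_merge` is a hypothetical name, each "send" is modeled as merging one list into another, and `heapq.merge` plays the role of the merge routine. A rank that has sent its data takes no further part.

```python
from heapq import merge


def tree_merge(sorted_lists):
    """Simulate the tree merge; returns the fully merged list held by rank 0."""
    if not sorted_lists:
        raise ValueError("need at least one processor")
    data = [list(x) for x in sorted_lists]
    p = len(data)
    # Largest power of two that is less than p (or 1 when p <= 2).
    active = 1
    while 2 * active < p:
        active *= 2
    while active >= 1:
        # Ranks active .. 2*active-1 (those that exist) send to rank - active.
        for rank in range(active, min(2 * active, p)):
            dest = rank - active
            data[dest] = list(merge(data[dest], data[rank]))
            data[rank] = []  # the sender is finished
        active //= 2
    return data[0]


# Five "processors", each holding an already sorted list.
print(tree_merge([[1, 9], [2, 8], [3, 7], [4, 6], [5]]))  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

The simulation works for any processor count, not only powers of two.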
The Sort/Merge
[Figure: merge tree for 8 processors, numbered 0 to 7, combining the sorted lists in three steps labeled Phase 1, Phase 2, and Phase 3]
The source for this algorithm is in the “Hybrid” examples
My All to All Personalized Communication
Used primarily to redistribute genes
Different processors send/recv different amounts of data
At the time MPI_Alltoallv did not work correctly
My algorithm is based on a Hypercube algorithm
Does not require power of 2 processors
Iterate up to power of 2 -1 processors
Check to see if you are sending to a valid processor
Uses simple trick to avoid nonblocking send/receive
If Myid < partner send first
If Myid > partner recv first
My ALLtoALLv
find n2, the power of two >= numnodes
do i=1,n2-1
    do xor to find the processor xchng
    xchng=xor(i,myid)
    if(xchng <= (numnodes-1))then
        if(myid < xchng)then
            send from myid to xchng
            recv from xchng to myid
        else
            recv from xchng to myid
            send from myid to xchng
        endif
    else
        skip this stage
    endif
enddo
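The pairing produced by the XOR loop can be tabulated with a small script. This is a sketch of the schedule only, with no actual communication; the names `exchange_schedule` and `send_first` are assumptions for illustration.

```python
def exchange_schedule(numnodes):
    """For each stage i = 1 .. n2-1, list each node's partner (None = skip)."""
    # n2 is the smallest power of two >= numnodes.
    n2 = 1
    while n2 < numnodes:
        n2 *= 2
    schedule = []
    for i in range(1, n2):
        stage = []
        for myid in range(numnodes):
            xchng = i ^ myid  # partner for this stage
            stage.append(xchng if xchng <= numnodes - 1 else None)
        schedule.append(stage)
    return schedule


def send_first(myid, xchng):
    """The lower-numbered node of a pair sends first, the other receives first."""
    return myid < xchng


# Stage 1 for 5 nodes: 0 pairs with 1, 2 pairs with 3, node 4 skips.
print(exchange_schedule(5)[0])  # [1, 0, 3, 2, None]
```

Since XOR with a fixed `i` is its own inverse, both members of a pair always agree on who their partner is, and over all stages every pair of valid nodes meets exactly once.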
Algorithm with 5 Nodes
Stage  Node 0  Node 1  Node 2  Node 3  Node 4
1a     0 to 1  0 to 1  2 to 3  2 to 3  skip
1b     1 to 0  1 to 0  3 to 2  3 to 2  skip
2a     0 to 2  1 to 3  0 to 2  1 to 3  skip
2b     2 to 0  3 to 1  2 to 0  3 to 1  skip
3a     0 to 3  1 to 2  1 to 2  0 to 3  skip
3b     3 to 0  2 to 1  2 to 1  3 to 0  skip
4a     0 to 4  skip    skip    skip    0 to 4
4b     4 to 0  skip    skip    skip    4 to 0
5a     skip    1 to 4  skip    skip    1 to 4
5b     skip    4 to 1  skip    skip    4 to 1
6a     skip    skip    2 to 4  skip    2 to 4
6b     skip    skip    4 to 2  skip    4 to 2
7a     skip    skip    skip    3 to 4  3 to 4
7b     skip    skip    skip    4 to 3  4 to 3
An Example Problem
Problem
Given an input signal put on a transmission line (Amplitude Only)
Given the distorted output signal on the other end of the line (Amplitude Only)
Assume the distortion is from change of phase of Fourier Components
Find the change in phase of each of the Fourier Components
Use the changes found to “correct” the signal
1d version of an atmospheric propagation problem
Fitness function
Find FFT of reference signal
The gene is an integer vector of phase variations
Add the phase variations to FFT of reference signal
Inverse FFT
Compare measured signal to generated signal
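The steps above can be sketched in pure Python. Several details here are assumptions, since the slides do not give them: a naive DFT stands in for the FFT, each integer gene value g is mapped to a phase of 2πg/levels, and the comparison is a sum of squared amplitude differences, so lower values mean a better match.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (stand-in for an FFT)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
        out.append(s / n if inverse else s)
    return out


def apply_phases(reference, gene, levels=64):
    """Add the gene's phase variations to the reference spectrum, then invert.

    Returns amplitudes only, as on the transmission line.
    """
    spectrum = dft(reference)
    shifted = [c * cmath.exp(2j * math.pi * g / levels)
               for c, g in zip(spectrum, gene)]
    return [abs(v) for v in dft(shifted, inverse=True)]


def fitness(gene, reference, measured, levels=64):
    """Squared amplitude error between generated and measured signals."""
    generated = apply_phases(reference, gene, levels)
    return sum((g - m) ** 2 for g, m in zip(generated, measured))


# Tiny demo: distort an impulse with a known gene, then score two candidates.
reference = [1.0, 0.0, 0.0, 0.0]
true_gene = [0, 0, 2, 0]
measured = apply_phases(reference, true_gene, levels=4)
print(fitness(true_gene, reference, measured, levels=4))     # close to 0
print(fitness([0, 0, 0, 0], reference, measured, levels=4))  # close to 1
```

A GA that sorts best-first would either minimize this error or use its negative as the fitness value.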
Input for an Example Run
16 processors
Population = 2000 genes
1000 generations
global exchange every 50 generations
Force genes on to other processors every 5 generations
Shift every 12 generations
Graph of Ideal and Measured Signals
[Figure: Ideal and Measured Signals. Amplitude (-0.1 to 1.1) versus Time (0 to 64); curves: Measured, Ideal]
Graph of Ideal and Corrected Signals
[Figure: Ideal and Corrected Signals. Amplitude (-0.1 to 1.1) versus Time (0 to 64); curves: Corrected, Ideal]
Timings
node  genes sent in sort  fitness time  sort time  total time
0     1217                39.919        1.328      50.688
1     1176                39.868        1.258      50.862
2     1127                39.979        1.033      50.863
3     1202                39.844        0.952      50.861
4     1279                39.834        0.903      50.825
5     1202                39.937        1.062      50.841
6     1188                39.981        0.966      50.861
7     1151                39.755        0.905      50.827
8     1206                40.002        0.849      50.866
9     1158                40.054        0.855      50.860
10    1218                39.814        0.880      50.825
11    1185                39.721        0.868      50.828
12    1218                39.864        0.881      50.862
13    1190                39.993        0.867      50.827
14    1192                39.963        0.859      50.827
15    1243                39.925        0.850      50.825
Another application
Given a Dielectric (carbon fiber) cylinder of fixed shape (a wing) and an incident electrical field (radar wave)
Find the material properties that minimize (or maximize) the returned signal in a particular direction
Why of interest
Design of low radar cross section bodies (Stealth Technology)
Similar to the design of high-tech car headlights
Represents a class of problems
Find Returned Signal
Create a complex matrix (set of linear equations)
Geometrical Properties (expensive but only done one time)
Material Properties (many times)
Solve linear equations (many times)
Use solution to obtain returned signal in a particular direction (minimize)
Fitness Function Steps
Create a complex matrix (set of linear equations)
Solve linear equations , find X
The returned signal is a function of X, the solution of the set of equations
C × X = Y
F = f(R(X))
Our MxM Matrix
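[Equation image not recovered: the matrix entries are defined separately for the cases n = m and n ≠ m]
Annotations on the equation: constant, only depends on geometry; changes for every fitness function evaluation; symmetric; only depends on n
ε is a vector of length M that describes the material properties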
Solve the Linear Equations
[tkaiser@mio darwin]$ man cgesv
SYNOPSIS
    SUBROUTINE CGESV( N, NRHS, A, LDA, IPIV, B, LDB, INFO )
    INTEGER INFO, LDA, LDB, N, NRHS
    INTEGER IPIV( * )
    COMPLEX A( LDA, * ), B( LDB, * )
PURPOSE
    CGESV computes the solution to a complex system of linear equations A * X = B, where A is an N-by-N matrix and X and B are N-by-NRHS matrices.
    The LU decomposition with partial pivoting and row interchanges is used to factor A as A = P * L * U, where P is a permutation matrix, L is unit lower triangular, and U is upper triangular. The factored form of A is then used to solve the system of equations A * X = B.
    .......
LAPACK version 3.0    15 June 2000    CGESV(l)
We use the LAPACK/MKL routine cgesv
The Link Command
mpif90 charles.o darwin.o ga_list_mod.o global.o init.o \
    mods.o more_mpi.o mpi.o numz.o unique.o wtime.o \
    laser_new.o \
    /opt/intel/Compiler/11.1/069/mkl/lib/em64t/libmkl_intel_lp64.a \
    /opt/intel/Compiler/11.1/069/mkl/lib/em64t/libmkl_core.a \
    /opt/intel/Compiler/11.1/069/mkl/lib/em64t/libmkl_sequential.a \
    /opt/intel/Compiler/11.1/069/mkl/lib/em64t/libmkl_core.a \
    /opt/intel/Compiler/11.1/069/mkl/lib/em64t/libmkl_sequential.a \
    /opt/intel/Compiler/11.1/069/mkl/lib/em64t/libmkl_core.a \
    -lpthread \
    -o darwin
Final Problem
Given our 4 materials
material(0)=cmplx(3,-1)
material(1)=cmplx(2,-2)
material(2)=cmplx(4,-3)
material(3)=cmplx(5,-4)
Minimize the average return over 0° to 30°
Small Signals (< -20) don’t count
GA statistics
Ran on 64 Mio processors
Population size 5120
Terminated after 1313 generations because of stagnation
Total run time 8008 seconds
Matrix size 799x799
Matrix inversions 6,722,560
111156041192735111550042035163793410789548086280344598298545557734384207674422880597461956440434923543714883802852623459362767641138209948290962541754032025298906462696551039941826764245862717597589826903187709568163513990163545434131712589472245284979067605823117125785074692904056218529694448947230242574226153830004788278417156172355372230514994709570664303225740550623006686851674536764546411912524900750974104
60830484002104118443786127982311697061639884742739324040156591161344
Sample Space = 4^799 (the value printed above)
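The two large counts on this slide can be reproduced with integer arithmetic. The sketch assumes that each of the 5120 genes is evaluated once in each of the 1313 generations, which is consistent with the reported number of matrix inversions, and that the sample space is 4 material choices for each of the 799 gene positions.

```python
# Number of candidate solutions: 4 materials at each of 799 positions.
sample_space = 4 ** 799
print(len(str(sample_space)))  # number of decimal digits

# One linear solve per gene per generation (assumption stated above).
evaluations = 5120 * 1313
print(evaluations)  # 6722560
```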
Input
[tkaiser@mio darwin]$ cat darwin.in
&DARWIN_DAT POP_SIZE = 5120 , GENERATIONS = 2000, STAGNATE = 100,
 GENE_SIZE = 799, SEED = -1650, INVERT_RATE = 0.01 MUT_RATE = 0.075,
 QUIT_VALUE = 20.0, SHIFT_RATE = 5, SHIFT_NUM = 10, GLOBAL_RATE = 12,
 PRINTING = 0, HAWK_RATE = 0, HAND_OUT = 0, MAXTIME=17100.0,
 THE_TOP = T, DO_ONE = T /
Another classic example
Map 4 color
Given a map of a country divided into states
Given 4 colors
Find a coloring of the map so that no neighboring states have the same color
In our example we use the 22 western US states
Known to be NP-hard with 4**22=17 trillion potential solutions
This turned out to be too easy but still interesting
Can’t be done with 3 colors (Consider Colorado)
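A fitness function for map coloring might look like the sketch below. It is an illustration under stated assumptions: the gene is taken to be a list of color indices (0 to 3), one per state; the fitness is assumed to be the fraction of bordering pairs that receive different colors, so 1.0 means a valid coloring; and only Colorado and its six neighbors are included to keep the border list short.

```python
# Seven-state subset (assumption: the real run uses all 22 states).
STATES = ["co", "ks", "ne", "nm", "ok", "ut", "wy"]
COLORS = ["red", "green", "blue", "yellow"]

# Pairs of states that share a border segment.
BORDERS = [
    ("co", "wy"), ("co", "ne"), ("co", "ks"), ("co", "ok"),
    ("co", "nm"), ("co", "ut"), ("wy", "ne"), ("ne", "ks"),
    ("ks", "ok"), ("ok", "nm"), ("ut", "wy"),
]


def decode(gene):
    """Map a list of color indices to a {state: color name} dictionary."""
    return {state: COLORS[c] for state, c in zip(STATES, gene)}


def fitness(gene):
    """Fraction of bordering pairs whose two states have different colors."""
    coloring = decode(gene)
    good = sum(1 for a, b in BORDERS if coloring[a] != coloring[b])
    return good / len(BORDERS)


# Colors taken from the reported results for these seven states:
# co green, ks red, ne yellow, nm yellow, ok blue, ut red, wy blue.
print(fitness([1, 0, 3, 3, 2, 0, 2]))  # 1.0
print(fitness([0, 0, 0, 0, 0, 0, 0]))  # 0.0
```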
Input and Results
&DARWIN_DAT POP_SIZE=200, GENERATIONS=100, STAGNATE=100,
GENE_SIZE=22, SEED=-12345, INVERT_RATE=0.1E-01, MUT_RATE=0.75E-01,
QUIT_VALUE=1.0, SHIFT_RATE=5, SHIFT_NUM=20, GLOBAL_RATE=10, PRINTING=0, HAWK_RATE=0,
HAND_OUT=0, THE_TOP=T, DO_ONE=T /
node  genes sent in sort  fitness cpu time  sort cpu time  total cpu time
0     41                  0.025             0.008          0.126
1     116                 0.025             0.007          0.122
2     46                  0.026             0.006          0.124
3     97                  0.025             0.006          0.124
the global best fitness is 1.0 for generations= 30
Our Coloring
results
 1 ar yellow
 2 az blue
 3 ca red
 4 co green
 5 ia red
 6 id green
 7 ks red
 8 la red
 9 mn blue
10 mo green
11 mt yellow
12 nd red
13 ne yellow
14 nm yellow
15 nv yellow
16 ok blue
17 or blue
18 sd green
19 tx green
20 ut red
21 wa red
22 wy blue