Date post: | 05-Apr-2018 |
Upload: | rachit-goel |
8/2/2019 CP0270_12-Apr-2012_RM01
Genetic Algorithms
Overview
Genetic Algorithms: a gentle introduction
  What are GAs?
  How do they work, and why?
  Critical issues
Use in Data Mining
  GAs and statistics
  decile performance maximization
  multi-objective models
Natural Genetics to AI
Computational models inspired by biological evolution
  survival of the fittest
  reproduction through cross-breeding
Genetic Algorithms
Population-based search (parallel)
  simultaneous search from multiple points in the search space
  useful in complex, unstructured search spaces (less prone to local failures)
Population members: potential solutions
Population of solutions evolves from one generation to the next
Genetic Algorithms
Search objective
  fitness score for population members (fitness function)
Survival of the fittest
  selection
Generating new solutions
  mating and reproduction of individuals (crossover, mutation)
Basic Operation
[Figure: one generation step, from Generation t to Generation t+1]
Generation t (string, fitness): String1 (f1), String2 (f2), String3 (f3), String4 (f4), ..., StringN (fN)
After selection: String1, String2, String2, String4, ..., Stringx
After recombination (crossover, mutation), Generation t+1: Offspring1(1,4), Offspring2(1,4), Offspring3(2,7), Offspring4(2,7), ..., OffspringN(x,y)
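The generation step above (score, select, recombine, mutate) can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's implementation: the bit-counting fitness function, the one-point crossover, and all parameter values are assumptions chosen to keep the example small. It also records best and average fitness per generation, the two curves a typical GA run is usually plotted with.

```python
import random

def one_max(bits):
    """Toy fitness function (an assumption for illustration): number of 1-bits."""
    return sum(bits)

def select(population, scores, rng):
    """Fitness-proportionate selection of one parent."""
    # The small constant keeps the weights valid if every score is zero.
    return rng.choices(population, weights=[s + 1e-9 for s in scores], k=1)[0]

def crossover(a, b, rng):
    """One-point crossover producing two offspring."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:], b[:cut] + a[cut:]

def mutate(bits, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - g if rng.random() < rate else g for g in bits]

def run_ga(n_bits=20, pop_size=30, generations=40, mutation_rate=0.02, seed=0):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    history = []  # (best, average) fitness of each generation
    for _ in range(generations):
        scores = [one_max(ind) for ind in population]
        history.append((max(scores), sum(scores) / len(scores)))
        next_population = []
        while len(next_population) < pop_size:
            p1 = select(population, scores, rng)
            p2 = select(population, scores, rng)
            c1, c2 = crossover(p1, p2, rng)
            next_population += [mutate(c1, mutation_rate, rng),
                                mutate(c2, mutation_rate, rng)]
        population = next_population[:pop_size]
    return population, history

population, history = run_ga()
print("first generation (best, avg):", history[0])
print("last generation (best, avg):", history[-1])
```

Swapping `one_max` for a problem-specific fitness function is the only change needed to reuse the loop on another bit-string problem.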
GAs: Parallel Search
[Figure: fitness plotted against x, with search points marked X and a hill climber]
GAs: Basic Principles
Representation of individuals
  string of parameters (genes): chromosome
  e.g. optimize a function F(p,q,r,s,t)
  population members: p q r s t
  genotype and phenotype
Binary representation?
Population members as bit strings
F(p,q,r,s,t) as:
1 0 0 1 1 0 1 0 1 1 0 1 1 0 0 1 1 0 1 0
p q r s t
early theory in terms of binary strings (schema theorem)
unnecessary perversity?
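To make the bit-string encoding concrete, the sketch below decodes the 20-bit string from this slide into the five parameters. The slide does not say how the bits are divided, so the even split into 4 bits per parameter is an assumption, as are the helper names `decode` and `scale`.

```python
# The 20 bits shown on the slide, grouped here 4 per parameter (an assumption).
BITS = [1, 0, 0, 1,  1, 0, 1, 0,  1, 1, 0, 1,  1, 0, 0, 1,  1, 0, 1, 0]
NAMES = ["p", "q", "r", "s", "t"]

def decode(bits, names, bits_per_gene=4):
    """Genotype -> phenotype: read each group of bits as an unsigned integer."""
    if len(bits) != len(names) * bits_per_gene:
        raise ValueError("bit string length does not match the parameter layout")
    values = {}
    for i, name in enumerate(names):
        chunk = bits[i * bits_per_gene:(i + 1) * bits_per_gene]
        values[name] = int("".join(str(b) for b in chunk), 2)
    return values

def scale(value, low, high, bits_per_gene=4):
    """Map an integer gene onto the real interval [low, high]."""
    return low + (high - low) * value / (2 ** bits_per_gene - 1)

params = decode(BITS, NAMES)
print(params)
print({name: scale(v, 0.0, 1.0) for name, v in params.items()})
```

With only 4 bits per gene, each parameter can take 16 distinct values, which is one reason the slide questions whether binary coding is always the natural choice.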
GAs: Basic Principles
Survival of the fittest (fitness function)
  numerical figure of merit/utility measure of an individual
  tradeoff among multiple evaluation criteria
  efficient evaluation
GAs: Basic Principles
Iterative search
  population evolves over generations
Convergence
  progression towards uniformity in population
  premature convergence? (local optima)
Typical GA Run
[Figure: fitness plotted against generations, with curves for best and average fitness]
Operators: Selection
Fitness proportionate selection (f_i / f_avg, fitness relative to the population average)
  number of reproductive trials for individuals
Selection
Roulette-wheel selection (stochastic sampling with replacement)
  wheel spaced in proportion to fitness values
  N (pop size) spins of the wheel
Stochastic universal sampling
  N equally spaced pins on wheel
  single turn of the wheel
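The two sampling schemes can be compared with a short sketch. It assumes non-negative fitness values with a positive sum; the function names are illustrative and not taken from any library. `expected_trials` computes each individual's fitness divided by the average fitness, the quantity that fitness-proportionate selection aims for.

```python
import random
from bisect import bisect_right
from itertools import accumulate

def expected_trials(fitness):
    """Expected number of reproductive trials: fitness divided by average fitness."""
    avg = sum(fitness) / len(fitness)
    return [f / avg for f in fitness]

def roulette_wheel(fitness, n, rng):
    """n independent spins of a wheel whose slots are proportional to fitness."""
    cumulative = list(accumulate(fitness))
    total = cumulative[-1]
    # min() guards against floating-point round-off at the very end of the wheel.
    return [min(bisect_right(cumulative, rng.random() * total), len(fitness) - 1)
            for _ in range(n)]

def stochastic_universal_sampling(fitness, n, rng):
    """One spin of a wheel carrying n equally spaced pointers."""
    cumulative = list(accumulate(fitness))
    step = cumulative[-1] / n
    start = rng.random() * step
    return [min(bisect_right(cumulative, start + i * step), len(fitness) - 1)
            for i in range(n)]

fitness = [1.0, 2.0, 3.0, 4.0]
rng = random.Random(0)
print("expected trials:", expected_trials(fitness))
print("roulette picks: ", roulette_wheel(fitness, 4, rng))
print("SUS picks:      ", stochastic_universal_sampling(fitness, 4, rng))
```

Both functions return indices into the population. Because the pointers in stochastic universal sampling are evenly spaced, each individual is picked either the floor or the ceiling of its expected number of trials, whereas independent roulette spins can deviate much further.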
Selection
Premature convergence
  Fitness scaling: f' = f - (2*avg. - max.)
  Ranked fitness
Elitism
Steady-state selection
Demetic grouping
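A small sketch of three of these remedies follows. The scaling function applies the shift given on the slide; clipping negative results to zero is an added assumption, made so the scaled values remain usable as selection weights. The ranking and elitism helpers are generic illustrations with hypothetical names.

```python
def scale_fitness(fitness):
    """Shift from the slide, f' = f - (2*avg - max); clipping at zero is an assumption."""
    avg = sum(fitness) / len(fitness)
    shift = 2 * avg - max(fitness)
    return [max(f - shift, 0.0) for f in fitness]

def rank_fitness(fitness):
    """Replace raw fitness by rank: the worst individual gets 1, the best gets N."""
    order = sorted(range(len(fitness)), key=lambda i: fitness[i])
    ranks = [0] * len(fitness)
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    return ranks

def elite(population, fitness, k=1):
    """The k fittest individuals, to be copied unchanged into the next generation."""
    order = sorted(range(len(fitness)), key=lambda i: fitness[i], reverse=True)
    return [population[i] for i in order[:k]]

raw = [8, 9, 10, 13]
print(scale_fitness(raw))
print(rank_fitness(raw))
print(elite(["A", "B", "C", "D"], raw, k=1))
```

When nothing is clipped, the shift leaves the best individual with exactly twice the scaled average fitness, so under fitness-proportionate selection it expects two reproductive trials regardless of how far it stood out in the raw scores.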
Operators: Crossover
Parent 1: axpsqvqbtpihd
Parent 2: qzxxaycgbtphw
crossover sites
Offspring 1: azpsavcbtpphd
Offspring 2: qxxxqyqgbtihw
(Uniform crossover)
combining good building blocks
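The uniform crossover example can be reproduced with a swap mask: at each position the two parents either keep or exchange their genes. In the sketch below, the mask positions were read off by comparing the slide's parents with its offspring; the function names are illustrative.

```python
import random

def uniform_crossover(parent1, parent2, swap_mask):
    """Exchange genes between the parents wherever swap_mask is True."""
    child1, child2 = [], []
    for g1, g2, swap in zip(parent1, parent2, swap_mask):
        if swap:
            g1, g2 = g2, g1
        child1.append(g1)
        child2.append(g2)
    return "".join(child1), "".join(child2)

def random_mask(length, rng, p_swap=0.5):
    """Draw an independent swap decision for every position."""
    return [rng.random() < p_swap for _ in range(length)]

P1 = "axpsqvqbtpihd"
P2 = "qzxxaycgbtphw"
# 0-based positions where the slide's offspring differ from their own parent.
MASK = [i in {1, 4, 6, 10} for i in range(len(P1))]

print(uniform_crossover(P1, P2, MASK))
print(uniform_crossover(P1, P2, random_mask(len(P1), random.Random(0))))
```

The first call yields the two offspring strings shown on the slide; the second shows the usual case where the mask is drawn at random.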
Operators: Mutation
alters each gene with small probability
x 1 y x 0 y 0 y y 0 x y x y
x 1 y x 0 y 1 y y 0 x x x y
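Per-gene mutation can be sketched as follows. The slide's strings mix the symbols 0/1 and x/y; treating each symbol as having exactly one alternative (0 with 1, x with y) is an assumption made for this illustration, as is the mutation rate.

```python
import random

# Assumed alternatives for each symbol on the slide.
FLIP = {"0": "1", "1": "0", "x": "y", "y": "x"}

def mutate(genes, rate, rng):
    """Replace each gene by its alternative with probability `rate`."""
    return [FLIP[g] if rng.random() < rate else g for g in genes]

BEFORE = "x 1 y x 0 y 0 y y 0 x y x y".split()
AFTER = "x 1 y x 0 y 1 y y 0 x x x y".split()

# 0-based positions that differ between the slide's two strings.
changed = [i for i, (a, b) in enumerate(zip(BEFORE, AFTER)) if a != b]
print("positions mutated on the slide:", changed)
print("random mutation at rate 0.1:   ", " ".join(mutate(BEFORE, 0.1, random.Random(1))))
```

Two of the slide's fourteen genes change, which is consistent with a small per-gene probability.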
Non-Binary Representations
Integer, real-number, order-based, rules, ...
Binary or real-valued?
  real representations give faster, more consistent, more accurate results
High-level representation
  intuitive, can utilize specialized operators
  effective search over complex spaces
Real-valued representation
Parent1: 3.45 0.56 6.78 0.976 2.5
Parent2: 0.98 1.06 4.20 0.34 1.8
Offspring1: 3.22 0.56 6.78 0.65 2.12
Offspring2: 1.43 1.06 4.20 0.41 1.93
(Arithmetic crossover)
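One common form of arithmetic crossover blends the parents' values gene by gene with a weight alpha. The sketch below is an assumption about the operator, not a reconstruction of the slide's exact numbers: it blends only the genes selected by a mask (the first, fourth and fifth, which are the ones that change on the slide) and copies the rest. The value of alpha is arbitrary.

```python
def arithmetic_crossover(parent1, parent2, alpha, blend_mask=None):
    """Blend selected genes as weighted averages of the parents; copy the others."""
    if blend_mask is None:
        blend_mask = [True] * len(parent1)
    child1, child2 = [], []
    for a, b, blend in zip(parent1, parent2, blend_mask):
        if blend:
            child1.append(alpha * a + (1 - alpha) * b)
            child2.append((1 - alpha) * a + alpha * b)
        else:
            child1.append(a)
            child2.append(b)
    return child1, child2

P1 = [3.45, 0.56, 6.78, 0.976, 2.5]
P2 = [0.98, 1.06, 4.20, 0.34, 1.8]
MASK = [True, False, False, True, True]  # genes that differ from the parents on the slide

c1, c2 = arithmetic_crossover(P1, P2, alpha=0.7, blend_mask=MASK)
print([round(v, 3) for v in c1])
print([round(v, 3) for v in c2])
```

Every blended gene lands between the two parental values, which is the property the slide's offspring also show; unlike bit-level crossover, the operator can produce values that neither parent carried.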
High-level representation
Parent1: {(1.2 <= x1 <= 3.4) (5.8 <= x2 <= 6.0) (0.2 <= x7 <= 0.61)}
Parent2: {(2.3 <= x6 <= 4.1) (3.6 <= x2 <= 5.1) (5.1 <= x4 <= 5.61) (0.3 <= x3 <= 1.1) (2.2 <= x9 <= 2.7)}
Offspring1: {(1.2 <= x1 <= 3.4) (2.2 <= x9 <= 2.7) (5.1 <= x4 <= 5.61)}
Offspring2: {(2.3 <= x6 <= 4.1) [(3.6 <= x2 <= 5.1) (5.8 <= x2 <= 6.0)] (0.3 <= x3 <= 1.1) (0.2 <= x7 <= 0.61)}
High-level representation
{(0.3 <= x3 <= 1.1) (2.2 <= x9 <= 2.7)}
{(0.3 <= x3 <= 1.1) (2.2 <= x9 <= 2.7) (5.1 <= x4 <= 6.2)}
Generalize/Specialize
{(0.3 <= x3 <= 1.1) (2.2 <= x9 <= 2.7)}
{(0.45 <= x3 <= 0.9) (1.9 <= x9 <= 2.9)}
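The rule examples can be modelled as a set of interval conditions on numbered variables. The sketch below reads each condition as a closed interval on one variable, which is an assumption because the relation symbols are not legible in the source. The dictionary layout and all function names are hypothetical.

```python
# A rule maps a variable index to a (low, high) interval; all conditions must hold.
def matches(rule, example):
    """True if the example satisfies every interval condition of the rule."""
    return all(low <= example[var] <= high for var, (low, high) in rule.items())

def add_condition(rule, var, low, high):
    """Specialize a rule by adding one more condition."""
    new_rule = dict(rule)
    new_rule[var] = (low, high)
    return new_rule

def resize(rule, var, low, high):
    """Change the bounds of an existing condition (narrower or wider)."""
    if var not in rule:
        raise KeyError(var)
    return add_condition(rule, var, low, high)

def is_more_general(a, b):
    """True if every example matched by rule b is also matched by rule a."""
    return all(var in b and lo <= b[var][0] and b[var][1] <= hi
               for var, (lo, hi) in a.items())

base = {3: (0.3, 1.1), 9: (2.2, 2.7)}
specialized = add_condition(base, 4, 5.1, 6.2)
resized = resize(resize(base, 3, 0.45, 0.9), 9, 1.9, 2.9)

print(is_more_general(base, specialized))
print(is_more_general(base, resized), is_more_general(resized, base))
```

Adding the x4 condition makes the rule strictly more specific. In the resized rule the x3 interval narrows while the x9 interval widens, so neither rule covers the other.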
Tree-structured representation (GP)
[Figure: parse tree for (x * log(y)) / 5, with nodes /, *, log, x, y, 5]
Automated learning of programs (originally)
  parse tree expressions
Non-linear interaction terms
Function set: internal nodes
  {+, -, *, /, log}
Terminal set: leaf nodes
  {constants, variables}
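A parse tree of this kind can be represented with nested tuples and evaluated recursively. The sketch below uses the slide's function set for internal nodes and treats strings as variables and numbers as constants; the tuple encoding and the `evaluate` helper are assumptions for illustration, not a genetic programming library.

```python
import math
import operator

# Function set (internal nodes): name -> (callable, number of arguments).
FUNCTIONS = {
    "+": (operator.add, 2),
    "-": (operator.sub, 2),
    "*": (operator.mul, 2),
    "/": (operator.truediv, 2),
    "log": (math.log, 1),
}

def evaluate(node, env):
    """Evaluate a tree: tuples are function nodes, strings variables, numbers constants."""
    if isinstance(node, tuple):
        name, *args = node
        func, arity = FUNCTIONS[name]
        if len(args) != arity:
            raise ValueError(f"{name} expects {arity} argument(s)")
        return func(*(evaluate(arg, env) for arg in args))
    if isinstance(node, str):
        return env[node]
    return node

# The slide's expression, (x * log(y)) / 5.
TREE = ("/", ("*", "x", ("log", "y")), 5)
print(evaluate(TREE, {"x": 2.0, "y": math.e}))
```

Crossover in this representation exchanges subtrees between two parents, which is how non-linear interaction terms such as x * log(y) can emerge without being specified in advance.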
Tree-structured representation
Representing complex patterns
x 2
If (y2) then 0 else 2x+y
Genetic search: Issues
Coding scheme, fitness function critical
  the art in GA design!
General mechanism so robust that, within reasonable margins, parameter settings are not critical.
Representation to match problem, domain
  utilizing domain knowledge
  problem-specific crossover, mutation, selection
Flexibility in fitness function formulation
  modeling business objectives
Genetic search: Issues
Stochastic search
  initial populations, probabilistic operators
  multiple runs with different random streams
Initializing population with known solutions
  seeding initial population with solutions from multiple, independent runs
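The two practices on this slide, repeating a run with different random streams and seeding a new population with earlier results, can be sketched independently of any particular GA. In the example below, `toy_search` is a deliberately simple stand-in for one stochastic run (it is not a GA), and all names and parameters are assumptions.

```python
import random

def toy_search(seed, n_bits=16, samples=50):
    """Stand-in for one stochastic run (NOT a GA): best of `samples` random bit strings."""
    rng = random.Random(seed)
    candidates = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(samples)]
    return max(candidates, key=sum)

def independent_runs(search, seeds):
    """Run the same search once per seed, each with its own random stream."""
    return [search(seed) for seed in seeds]

def seeded_population(known_solutions, pop_size, n_bits, rng):
    """Start from known solutions and fill the rest of the population at random."""
    population = [list(s) for s in known_solutions[:pop_size]]
    while len(population) < pop_size:
        population.append([rng.randint(0, 1) for _ in range(n_bits)])
    return population

winners = independent_runs(toy_search, seeds=[1, 2, 3])
initial = seeded_population(winners, pop_size=10, n_bits=16, rng=random.Random(99))
print([sum(w) for w in winners])
print(len(initial))
```

Fixing the seed of each run makes results reproducible, and comparing the winners across seeds gives a rough sense of how much of the outcome is due to chance.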
Genetic search: Issues
Guarantees optimality? But...
GAs and traditional techniques
  especially useful where traditional approaches fail
  in conjunction with traditional techniques
Parallelizable for large data
  multi-processor, networked machines
Using GAs?
When to use a GA?
GA and traditional techniques
How long does it take?
Will it perform better?
Using GAs
population size
mutation, crossover rates
how many generations
multiple runs
Is it a black-box?
? Huh?
Data characteristics
Fitness function
GA parameters
GA Application Examples
Function optimizers
  difficult, discontinuous, multi-modal, noisy functions
Combinatorial optimization
  layout of VLSI circuits, factory scheduling, traveling salesman problem
Design and Control
  bridge structures, neural networks, communication networks design; control of chemical plants, pipelines
GA Application Examples
Machine learning
  classification rules, economic modeling, scheduling strategies
Portfolio design, optimized trading models, direct marketing models, sequencing of TV advertisements, adaptive agents, data mining, etc.