
Introduction to AI (part two)

Tim Watson

G6.71

tw@dmu.ac.uk

Overview

• Three weeks on Genetic Algorithms (GAs)

• Two weeks on Symbolic AI

• One week of revision

• Labs will be used to run experiments with a simple GA program

• Coursework and slides will be available on L:\tw\IntroAI

What is a GA?

• Evolutionary Algorithm
– Create a random population of possible solutions in the form of bitstrings
– See how good each one is (test fitness)
– Produce the next generation (fitter individuals are more likely to get into the next generation)
– Crossover, mutate and make the next generation the current one (a minimal sketch of this loop follows)
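A minimal sketch of this loop in Python (the parameter values and helper names are illustrative, not those of the lab GA program):

```python
import random

POP_SIZE, CHROM_LEN, GENS = 20, 8, 50
MUT_RATE = 0.01  # illustrative values, not the lab program's defaults

def fitness(bits):
    """Placeholder fitness: count of 1s (the onemax example below)."""
    return sum(bits)

def select(pop, fits):
    """Fitness-proportionate selection of one parent (epsilon avoids all-zero weights)."""
    return random.choices(pop, weights=[f + 1e-9 for f in fits], k=1)[0]

def crossover(a, b):
    """One-point crossover producing a single child."""
    point = random.randint(1, CHROM_LEN - 1)
    return a[:point] + b[point:]

def mutate(bits):
    """Flip each bit independently with probability MUT_RATE."""
    return [1 - b if random.random() < MUT_RATE else b for b in bits]

# 1. Create a random population of candidate bitstrings
pop = [[random.randint(0, 1) for _ in range(CHROM_LEN)] for _ in range(POP_SIZE)]

for gen in range(GENS):
    # 2. Test fitness; 3-4. reproduce, crossover and mutate the next generation
    fits = [fitness(ind) for ind in pop]
    pop = [mutate(crossover(select(pop, fits), select(pop, fits)))
           for _ in range(POP_SIZE)]

print(max(fitness(ind) for ind in pop))  # best individual after GENS generations
```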

What can they do?

• Schedule Barcelona Olympics
• Aircraft Design
• Dynamic Routing in Networks
• Robot Arm Trajectory Planning
• Lab Task Scheduling for US Navy
• Aircraft Missile Evasion
• Evolving ANN Architecture
• Parameter Tuning for Sonar Systems
• Conformational Analysis of DNA

Example: Onemax

1. Initial population: 100, 000, 110

2. Calculate fitness: 100 → 2, 000 → 1, 110 → 3

3. Reproduce: 110, 100, 110

4. Crossover & mutate: 110, 101, 010
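In code, the fitness used here is just a count of 1s plus one (the +1 matches the fitness values above and the 1–17 range in the prediction test later, though plain onemax usually omits it):

```python
def onemax(bits: str) -> int:
    """Onemax fitness as used in this example: number of 1s, plus 1."""
    return bits.count("1") + 1

for s in ["100", "000", "110"]:
    print(s, onemax(s))  # 100 2, 000 1, 110 3
```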

Biological Terminology

• Genes etc.
– Gene
– Locus (plural loci)
– Allele (also called gene value)
– (Pleiotropic gene)

• Genotype (Chromosome)

• Phenotype

• Fitness Landscape

GA Theory

• Schema Theorem
– GA searches schemata in parallel
– 10 represents 10, 1#, #0 and ## (see the sketch after this list)
– The theorem is rubbish!

• Building Block Hypothesis
– Good, small sequences are found and recombined to form good solutions

• No Free Lunch Theorem
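A schema is a template over {0, 1, #}, where # matches either bit value. A small sketch of how a 2-bit string is an instance of four schemata (the function names are mine):

```python
from itertools import product

def matches(schema: str, bits: str) -> bool:
    """True if the bitstring is an instance of the schema."""
    return all(s in ("#", b) for s, b in zip(schema, bits))

def schemata_of(bits: str) -> list:
    """Every schema the string is an instance of: fix or wildcard each position."""
    return ["".join(p) for p in product(*[(b, "#") for b in bits])]

print(schemata_of("10"))    # ['10', '1#', '#0', '##']
print(matches("1#", "10"))  # True
```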

GA Parameters

• Population Size
– Static or Dynamic?
• Chromosome Size
– Fixed or Variable?
• Crossover Rate
– One-point, two-point or uniform?
• Mutation Rate
– Fixed or Variable?
• Fitness Function
• Termination Criteria

Prediction Test!

• What happens to the population statistics in a standard GA with random fitness, no crossover, no mutation and chromsize = 16?
– Best, Worst, Mean, Std. Dev., column counts

• Best=17, Worst=1, Mean=9ish, Std. Dev. constant-ish; the population converges randomly (a simulation sketch follows)
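A sketch of that experiment, assuming the lab program draws each fitness uniformly from 1 to chromsize+1 (so 1 to 17 here); with selection only, each column count drifts to 0 or popsize at random:

```python
import random

POP_SIZE, CHROM = 50, 16
pop = [[random.randint(0, 1) for _ in range(CHROM)] for _ in range(POP_SIZE)]

for gen in range(500):
    # Random fitness in 1..17, unrelated to chromosome contents
    fits = [random.randint(1, CHROM + 1) for _ in pop]
    # Selection only: no crossover, no mutation
    pop = random.choices(pop, weights=fits, k=POP_SIZE)

# Column counts: number of 1s at each locus; drift pushes each to 0 or 50
print([sum(ind[i] for ind in pop) for i in range(CHROM)])
```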

Reproduction in GAs

• Need selective pressure for reproduction to improve the population fitness
– None leads to a random walk (slow)
– Some leads to geometric growth of the best (fast)

• Infinite populations select individuals on relative fitness: fit/mean(fit)

• Finite populations also affected by how many copies are already present
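The relative-fitness rule above is an expected-copies calculation; a quick worked example:

```python
fits = [1, 2, 3, 6]
mean = sum(fits) / len(fits)     # 3.0
print([f / mean for f in fits])  # [0.33.., 0.66.., 1.0, 2.0]
# The fittest individual expects 2 of the slots in the next generation
```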

Types of Selection

• Fitness-Proportionate
– Fitness scaling based on raw fitness, where if fit_a < fit_b then scaled(fit_a) ≤ scaled(fit_b)

– Scaling can be altered dynamically

• Rank-Order

• Tournament (a sketch follows this list)

• Elitism
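Tournament selection needs no fitness scaling at all: sample k individuals uniformly and keep the best. A minimal sketch (tournament size 2 is a common choice, but that default is mine):

```python
import random

def tournament_select(pop, fitness, k=2):
    """Return the fittest of k individuals sampled without replacement."""
    return max(random.sample(pop, k), key=fitness)

pop = ["100", "000", "110"]
print(tournament_select(pop, fitness=lambda s: s.count("1")))
```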

Initialising the Population

• Uniformly at random

• Best Guesses

• Converged to Best Known

• From Real World

• Hybrid

Mutation

• Goal of selection: survival of the fittest
• Goal of mutation: explore lost or never-seen alleles
• Random reset versus bit flip
– a random reset changes the bit only half the time, so it is equivalent to bit-flipping at half its rate (see the sketch below)
• Mutation as a spring
• In infinite time every possible population is visited an infinite number of times
• Alternative to mutation: complement
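A sketch of the two operators; note that the reset only changes a bit half the time, which is where the factor of a half comes from:

```python
import random

def bit_flip(bits, rate):
    """Flip each bit independently with the given probability."""
    return [1 - b if random.random() < rate else b for b in bits]

def random_reset(bits, rate):
    """Overwrite each bit with a fresh random value with the given probability;
    half of those resets leave the bit unchanged."""
    return [random.randint(0, 1) if random.random() < rate else b for b in bits]
```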

Crossover

• Goal: to try out different combinations of good bits of individuals

• Crossover point
– One-point
– Two-point
– N-point
– Uniform (sketches of these follow)
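Minimal sketches of three of these (uniform crossover here swaps each gene independently with probability ½):

```python
import random

def one_point(a, b):
    p = random.randint(1, len(a) - 1)
    return a[:p] + b[p:], b[:p] + a[p:]

def two_point(a, b):
    p, q = sorted(random.sample(range(1, len(a)), 2))
    return a[:p] + b[p:q] + a[q:], b[:p] + a[p:q] + b[q:]

def uniform(a, b):
    """Swap genes between the parents independently with probability 0.5."""
    c1, c2 = list(a), list(b)
    for i in range(len(a)):
        if random.random() < 0.5:
            c1[i], c2[i] = c2[i], c1[i]
    return c1, c2

print(one_point("11011", "00100"))  # e.g. ('11000', '00111')
```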

Crossover (2)

• Closer genes are less likely to be split by crossover
– AB###### : probability of split with one-point crossover = 1/7
– A######B : probability of split = 1

• Local maxima can occur (e.g. for onemax)
– Parents 11011 (fit=4) and 00100 (fit=1)
– All children have lower fitness than the fitter parent
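The claim is easy to check by enumerating every one-point crossover of the two parents; no child reaches the fitter parent's score of 4:

```python
a, b = "11011", "00100"
fit = lambda s: s.count("1")
children = [c for p in range(1, 5) for c in (a[:p] + b[p:], b[:p] + a[p:])]
print(max(fit(c) for c in children))  # 3, so the pair is stuck below 11111
```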

Design Decisions

• If genes are linked then the representation of an individual ought to keep them close together

• Get the balance right:
– Popsize too small → premature convergence
– Popsize too large → too slow to compute
– Mutation rate too low → not enough exploring
– Mutation rate too high → too much noise

Messy GAs

• Developed by David Goldberg et al. in the late 1980s and early 1990s

• Goal of messy GAs: improve function optimisation performance by explicitly building up increasingly longer, highly fit strings from well-tested shorter building blocks

• Nature started with simple life-forms and built up complex ones

Representation

• Each bit is tagged with its ‘real’ locus but not all loci have to be specified. For example, in a four-bit problem:

{(1,0), (2,0), (4,1), (4,0)}
{(3,1), (3,0), (3,1), (4,0), (4,1), (3,1)}

• Overspecification: conflicting specification at a locus
– left-to-right, first come first served
• Underspecification: locus not specified
– can be thought of as a schema, e.g. 00#1
– replace with a random value?
– use hill-climbing to find a local optimum and use the value from that
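A sketch of decoding a messy chromosome, using left-to-right first-come-first-served for overspecified loci and the random-value option for underspecified ones:

```python
import random

def decode(genes, length):
    """genes is a list of (locus, value) pairs with 1-based loci."""
    bits = [None] * length
    for locus, value in genes:
        if bits[locus - 1] is None:   # first come, first served
            bits[locus - 1] = value
    # Underspecified loci: here filled at random (hill-climbing is the alternative)
    return [b if b is not None else random.randint(0, 1) for b in bits]

print(decode([(1, 0), (2, 0), (4, 1), (4, 0)], 4))  # [0, 0, ?, 1]: locus 3 is random
```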

Outline of Algorithm

• Two phases: primordial and juxtapositional
• Primordial phase
– Goal: enrich population with small, promising candidate schemata
– guess the order k of the smallest relevant schemata
– initial population contains all possible schemata of order k, e.g. for k=3 and solution bitlength=4 there are 8 different 3-bit patterns and 4 ways of choosing 3 bits, for a popsize of 32
– use selection only to produce the next generation
– regularly halve popsize and stop after n generations
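That initial population size is C(bitlength, k) × 2^k; a quick check reproduces the figure above and the "too large" criticism later:

```python
from math import comb

def primordial_popsize(bitlength, k):
    """All 2**k bit patterns over every choice of k loci out of bitlength."""
    return comb(bitlength, k) * 2**k

print(primordial_popsize(4, 3))   # 32
print(primordial_popsize(64, 8))  # 1133098334208, approx. 1 trillion
```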

Outline of Algorithm (cont.)

• Juxtapositional phase
– popsize remains fixed, selection continues
– no mutation
– two new operators: cut and splice

• Cut takes a string and cuts it at a random point to produce two new strings:

{(1,1), (2,0), (1,0), (4,1), (1,1)} → {(1,1), (2,0), (1,0)} and {(4,1), (1,1)}

• Splice joins two strings together (sketches of both operators follow)
• Goldberg argued that the primordial phase should produce, in enough numbers, all the building blocks necessary to construct an optimal solution, and that the juxtapositional phase should then construct it

Problems with Messy GAs

• Based on the Schema Theorem, which isn’t correct
• Need to know (or guess) k, when to halve popsize and how many generations to run the primordial phase

• Initial size of population easily becomes too large (e.g. for k=8, bitlength=64, popsize approx. 1 trillion)

• Goldberg et al. announced that messy GAs were ready for real-world problems and called for their immediate application to difficult combinatorial problems of practical importance; no results have yet been forthcoming