
Game Playing

Date post: 15-Jan-2016
Upload: oke
Transcript
Page 1: Game Playing

Genetic Programming
Game Playing

Evolve a strategy for two-person zero-sum games.

Help the user to determine the next move.

Constructing a game tree:
  Each node represents a state in the game.
  Each arc represents a legal move.

The minimax algorithm
Alpha-beta pruning

Page 2: Game Playing

Example: Minimax Algorithm

Game Tree: We want to maximize player X's score. A value of 1 indicates a win for player X and a loss for player O. A value of 0 indicates a win for player O and a loss for player X.

Leaf payoffs:               1 0 0 0 1 0 0 1
X to move (max of pairs):   1 0 1 1
O to move (min of pairs):   0 1
X to move at root (max):    1
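The backing-up of values in this example can be sketched in a few lines of Python. This is a minimal illustration, not code from the slides: the function name `minimax_levels` is hypothetical, and it assumes a complete binary tree whose leaves are listed left to right, with players alternating and X (the maximizer) at the root.

```python
from typing import List, Sequence


def minimax_levels(leaves: Sequence[int], root_is_max: bool = True) -> List[List[int]]:
    """Back leaf payoffs up a complete binary tree, one level at a time.

    Returns every level from the leaves up to the root (assumed layout:
    leaves listed left to right, players alternating by depth).
    """
    depth = (len(leaves) - 1).bit_length()
    if len(leaves) != 1 << depth:
        raise ValueError("number of leaves must be a power of two")

    levels = [list(leaves)]
    current = list(leaves)
    level_depth = depth - 1  # depth of the nodes directly above the leaves
    while len(current) > 1:
        # Even depths belong to the root player, odd depths to the opponent.
        is_max = (level_depth % 2 == 0) == root_is_max
        pick = max if is_max else min
        current = [pick(current[i], current[i + 1]) for i in range(0, len(current), 2)]
        levels.append(current)
        level_depth -= 1
    return levels


if __name__ == "__main__":
    for level in minimax_levels([1, 0, 0, 0, 1, 0, 0, 1]):
        print(level)
```

Calling it on the eight win/loss leaves of this slide reproduces the values shown in the tree, ending with 1 at the root.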

Page 3: Game Playing

Heuristics

Generating the entire game tree is not viable, so heuristics are used to evaluate positions.

Example: Tic-Tac-Toe
  Heuristic value = number of possible wins for X minus number of possible wins for O.

[Figure: two Tic-Tac-Toe boards, scored 8 - 5 = 3 and 4 - 5 = -1.]
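The heuristic can be made concrete with a short sketch. A "possible win" is taken here to mean a row, column, or diagonal that contains no opponent mark. The two board positions below are assumptions chosen because they reproduce the arithmetic on the slide (X alone in a corner, then X in a corner with O in the centre); the transcript does not preserve the actual boards.

```python
# The eight winning lines of a 3x3 board, cells indexed 0..8 row by row.
LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),  # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),  # columns
    (0, 4, 8), (2, 4, 6),             # diagonals
]


def open_lines(board: str, player: str) -> int:
    """Count the lines that the opponent of `player` has not blocked."""
    opponent = "O" if player == "X" else "X"
    return sum(1 for line in LINES if all(board[i] != opponent for i in line))


def heuristic(board: str) -> int:
    """Possible wins for X minus possible wins for O."""
    return open_lines(board, "X") - open_lines(board, "O")


if __name__ == "__main__":
    print(heuristic("X........"))  # X in a corner, O not yet placed (assumed board)
    print(heuristic("X...O...."))  # X in a corner, O in the centre (assumed board)
```

Under these assumed positions the two calls evaluate to 8 - 5 = 3 and 4 - 5 = -1, matching the figures on the slide.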

Page 4: Game Playing

Example: Minimax Algorithm

Leaf payoffs:               32 31 15 16 7 8 24 23
X to move (max of pairs):   32 16 8 24
O to move (min of pairs):   16 8
X to move at root (max):    16

Page 5: Game Playing

Game Tree

Root, X to move (max):   12
O to move (min):         12 (over subtrees 1 and 2), 10 (over subtrees 3 and 4)

Each O node leads to two subtrees of the same shape:

Subtree   Leaf payoffs              X (max)       O (min)   X (max)
1         32 31 15 16 7 8 24 23     32 16 8 24    16 8      16
2         3 4 20 19 28 27 11 12     4 20 28 12    4 12      12
3         1 2 18 17 26 25 9 10      2 18 26 10    2 10      10
4         30 29 13 14 5 6 22 21     30 14 6 22    14 6      14
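Alpha-beta pruning, listed on the first slide, can be tried on this 32-leaf tree. The sketch below is an illustration under stated assumptions: the leaves are read left to right from the figure, X maximizes at the root, and the tree is addressed as index ranges of a flat list. The `alphabeta` signature is hypothetical, not taken from the slides.

```python
from typing import List

# Leaf payoffs read left to right from the figure (assumed ordering).
LEAVES = [
    32, 31, 15, 16, 7, 8, 24, 23,
    3, 4, 20, 19, 28, 27, 11, 12,
    1, 2, 18, 17, 26, 25, 9, 10,
    30, 29, 13, 14, 5, 6, 22, 21,
]


def alphabeta(lo: int, hi: int, maximizing: bool,
              alpha: float, beta: float, visited: List[int]) -> float:
    """Minimax value of the subtree spanning LEAVES[lo:hi], with pruning."""
    if hi - lo == 1:
        visited.append(LEAVES[lo])  # record which leaves were actually examined
        return LEAVES[lo]
    mid = (lo + hi) // 2
    if maximizing:
        value = float("-inf")
        for a, b in ((lo, mid), (mid, hi)):
            value = max(value, alphabeta(a, b, False, alpha, beta, visited))
            alpha = max(alpha, value)
            if alpha >= beta:  # the minimizer above will never allow this branch
                break
        return value
    value = float("inf")
    for a, b in ((lo, mid), (mid, hi)):
        value = min(value, alphabeta(a, b, True, alpha, beta, visited))
        beta = min(beta, value)
        if alpha >= beta:  # the maximizer above already has something better
            break
    return value


if __name__ == "__main__":
    seen: List[int] = []
    root = alphabeta(0, len(LEAVES), True, float("-inf"), float("inf"), seen)
    print("root value:", root)
    print("leaves examined:", len(seen), "of", len(LEAVES))
```

The root value agrees with plain minimax (12), while some leaves are never examined, which is the point of the pruning.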

Page 6: Game Playing

Operators

Terminals: the legal moves, i.e. L (left) and R (right)
Functions: CXM1, CXM2, COM1, COM2

XM1: first move made by player X
XM2: second move made by player X
OM1: first move made by player O
OM2: second move made by player O

Each function takes three arguments, labelled on the slide as follows:

Function   Arg1    Arg2    Arg3
CXM1       XM1=U   XM1=R   XM1=L
CXM2       XM2=U   XM2=R   XM2=L
COM1       OM1=U   OM1=R   OM1=L
COM2       OM2=U   OM2=R   OM2=L
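A strategy built from these functions can be interpreted with a small recursive evaluator. The sketch below makes two assumptions that the slides do not spell out: U is read as "this move has not been made yet", and each function node keeps its three branches in a dictionary keyed by U, L and R, so that the example does not depend on the positional order of Arg1 to Arg3. The tuple representation and the name `evaluate` are hypothetical.

```python
from typing import Dict, Tuple, Union

# A tree is either a terminal move ("L" or "R") or a pair
# (function name, {"U": subtree, "L": subtree, "R": subtree}).
Tree = Union[str, Tuple[str, Dict[str, "Tree"]]]


def evaluate(tree: Tree, history: Dict[str, str]) -> str:
    """Return the move ("L" or "R") that the strategy tree selects.

    `history` maps move names such as "XM1" or "OM1" to "L" or "R";
    a move that is missing from it is treated as undefined ("U").
    """
    if isinstance(tree, str):
        return tree  # terminal: a legal move
    name, branches = tree
    tested_move = name[1:]  # "CXM1" tests "XM1", "COM2" tests "OM2", ...
    state = history.get(tested_move, "U")
    return evaluate(branches[state], history)


if __name__ == "__main__":
    # Hypothetical individual: play L until O has moved, then copy O's first move.
    copy_first = ("COM1", {"U": "L", "L": "L", "R": "R"})
    print(evaluate(copy_first, {}))                        # X's first move
    print(evaluate(copy_first, {"XM1": "L", "OM1": "R"}))  # X's second move
```

The same tree is evaluated again at each of X's decision points, with the history filled in by the moves made so far.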

Page 7: Game Playing

Fitness Cases

The fitness cases consist of the possible combinations of L and R for the moves that O can make.

Format: XM1, OM1, XM2, OM2

[Figure: the game tree from Page 5, repeated.]

Fitness cases: LLLL, LRRR, LLLR, LRRL
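The four move strings listed here can be regenerated by letting X follow the minimax choice at each of its turns while O runs through every combination of its two moves. The sketch below assumes the leaf ordering of the game tree figure, that L means the left branch, and that X moves first, third and fifth; the helper names are hypothetical.

```python
from itertools import product
from typing import Tuple

# Leaf payoffs read left to right from the game tree (assumed ordering).
LEAVES = [
    32, 31, 15, 16, 7, 8, 24, 23,
    3, 4, 20, 19, 28, 27, 11, 12,
    1, 2, 18, 17, 26, 25, 9, 10,
    30, 29, 13, 14, 5, 6, 22, 21,
]


def value(lo: int, hi: int, x_to_move: bool) -> int:
    """Minimax value of the subtree spanning LEAVES[lo:hi]."""
    if hi - lo == 1:
        return LEAVES[lo]
    mid = (lo + hi) // 2
    pick = max if x_to_move else min
    return pick(value(lo, mid, not x_to_move), value(mid, hi, not x_to_move))


def descend(lo: int, hi: int, move: str) -> Tuple[int, int]:
    """Follow the left (L) or right (R) branch of the current subtree."""
    mid = (lo + hi) // 2
    return (lo, mid) if move == "L" else (mid, hi)


def minimax_x_move(lo: int, hi: int) -> str:
    """Move X makes at an X node when it follows the minimax strategy."""
    mid = (lo + hi) // 2
    return "L" if value(lo, mid, False) >= value(mid, hi, False) else "R"


def play(om1: str, om2: str) -> Tuple[str, int]:
    """Return (XM1 OM1 XM2 OM2 as a string, final payoff) for one fitness case."""
    lo, hi = 0, len(LEAVES)
    moves = []
    for o_move in (om1, om2):
        x_move = minimax_x_move(lo, hi)
        lo, hi = descend(lo, hi, x_move)
        lo, hi = descend(lo, hi, o_move)
        moves += [x_move, o_move]
    lo, hi = descend(lo, hi, minimax_x_move(lo, hi))  # X's third and last move
    return "".join(moves), LEAVES[lo]


if __name__ == "__main__":
    for om1, om2 in product("LR", repeat=2):
        print(play(om1, om2))
```

Under these assumptions the four strings produced are exactly the ones on the slide, which suggests that the listed cases show minimax play by X against each of O's four move combinations.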

Page 8: Game Playing

Evaluation

The raw fitness of an individual is the sum of the payoffs for each fitness case.

The hits ratio is the number of fitness cases for which the individual receives a payoff at least as good as the minimax strategy.

What is the raw fitness and hits ratio of the following individuals?

Individual 1: COM1 with arguments L, L, L
Individual 2: COM1 with arguments L, L, R
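The two measures defined on this slide can be computed mechanically once a strategy is expressed as a function from the move history to a move. The sketch below is illustrative only. It assumes the leaf ordering of the game tree figure, that X also makes a third move after OM2, and that "at least as good as the minimax strategy" means a payoff of at least the game's minimax value of 12. The two sample strategies are hypothetical.

```python
from itertools import product
from typing import Callable, Dict, Tuple

# Leaf payoffs read left to right from the game tree (assumed ordering).
LEAVES = [
    32, 31, 15, 16, 7, 8, 24, 23,
    3, 4, 20, 19, 28, 27, 11, 12,
    1, 2, 18, 17, 26, 25, 9, 10,
    30, 29, 13, 14, 5, 6, 22, 21,
]
MINIMAX_VALUE = 12  # assumed threshold for a hit

Strategy = Callable[[Dict[str, str]], str]


def payoff(strategy: Strategy, om1: str, om2: str) -> int:
    """Play one fitness case: X follows `strategy`, O plays om1 then om2."""
    history: Dict[str, str] = {}
    index = 0
    turns = (("XM1", None), ("OM1", om1), ("XM2", None), ("OM2", om2), ("XM3", None))
    for name, o_move in turns:
        move = strategy(history) if o_move is None else o_move
        history[name] = move
        index = index * 2 + (1 if move == "R" else 0)  # L = left, R = right
    return LEAVES[index]


def evaluate(strategy: Strategy) -> Tuple[int, int]:
    """Return (raw fitness, hits) over O's four move combinations."""
    payoffs = [payoff(strategy, a, b) for a, b in product("LR", repeat=2)]
    raw_fitness = sum(payoffs)
    hits = sum(1 for p in payoffs if p >= MINIMAX_VALUE)
    return raw_fitness, hits


def always_left(history: Dict[str, str]) -> str:
    """Hypothetical strategy: ignore the history and always move left."""
    return "L"


def copy_last_o_move(history: Dict[str, str]) -> str:
    """Hypothetical strategy: repeat O's latest move, or L if O has not moved."""
    return history.get("OM2", history.get("OM1", "L"))


if __name__ == "__main__":
    print(evaluate(always_left))
    print(evaluate(copy_last_o_move))
```

With these assumptions, a strategy that always moves left collects payoffs 32, 15, 3 and 20, so its raw fitness is 70 with 3 hits, while the copying strategy reaches 88 with 4 hits.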

Page 9: Game Playing

GP Parameters

Population size: 500
Max. no. of generations: 51
Initial population generation: the ramped half-and-half method with an initial tree depth of six, and a depth limit of seventeen on the size of trees created by the genetic operators.
Method of selection: fitness-proportionate selection
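Fitness-proportionate selection picks each individual with probability equal to its share of the population's total fitness. A minimal sketch using Python's standard `random.Random.choices` follows. The function name `select` and the toy population are hypothetical, and the sketch assumes that larger fitness is better, as with the payoff sums used for the game strategies.

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def select(population: Sequence[T], fitness: Sequence[float],
           k: int, rng: random.Random) -> List[T]:
    """Draw k individuals, each with probability fitness / total fitness."""
    if len(population) != len(fitness):
        raise ValueError("one fitness value per individual is required")
    if any(f < 0 for f in fitness) or sum(fitness) <= 0:
        raise ValueError("fitness values must be non-negative and not all zero")
    # random.Random.choices samples with replacement according to the weights.
    return rng.choices(list(population), weights=list(fitness), k=k)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    picks = select(["a", "b", "c"], [0.0, 1.0, 3.0], 1000, rng)
    print({name: picks.count(name) for name in "abc"})
```

An individual with zero fitness is never drawn, and "c" is expected to be drawn about three times as often as "b".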

Page 10: Game Playing

Evolved Solution

[Figure: evolved strategy tree built from the function nodes com2, com1, cxm1, com1, cxm2, cxm1 and cxm1, with L and R moves at the leaves.]

Page 11: Game Playing

Simplified Solution

[Figure: simplified strategy tree with the function nodes com2 and com1 and the leaves R, L, L, R, L.]

Page 12: Game Playing

Pursuer - Evader

[Figure: pursuer P at (0,0) and evader E at (x,y).]

Page 13: Game Playing

Game Parameters

The payoff for the pursuer is the time it takes to catch the evader.

The payoff of the evader is the time it remains free.

The information available at each stage of the game is the position of the pursuer and the evader.

A game-playing strategy will specify the angle at which the pursuer must move in order to catch the evader.

Page 14: Game Playing

Terminals and Functions

T = {X, Y, R}
  X: x-coordinate of the position of the evader
  Y: y-coordinate of the position of the evader
  R: ephemeral constant in the range [-1, 1]

F = {+, -, /, EXP, IFLTZ}
  EXP: the exponential function
  IFLTZ: evaluates its first argument if its second argument is less than zero; otherwise it evaluates its third argument
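An individual built from these terminals and functions is an expression in X and Y whose value is read as the pursuer's angle. The evaluator below is a sketch under stated assumptions. Expressions are nested tuples (a hypothetical representation). Division by zero returns 1.0 and the argument of EXP is clamped at 50, both protective conventions assumed here rather than taken from the slides. IFLTZ follows the wording on the slide, testing its second argument.

```python
import math
from typing import Tuple, Union

Expr = Union[str, float, Tuple]


def evaluate(expr: Expr, x: float, y: float) -> float:
    """Evaluate an expression for an evader located at (x, y)."""
    if expr == "X":
        return x
    if expr == "Y":
        return y
    if isinstance(expr, (int, float)):
        return float(expr)  # ephemeral random constant R
    op, *args = expr
    if op == "IFLTZ":
        first, second, third = args
        # As worded on the slide: test the second argument and evaluate only
        # the branch that is selected.
        if evaluate(second, x, y) < 0:
            return evaluate(first, x, y)
        return evaluate(third, x, y)
    values = [evaluate(arg, x, y) for arg in args]
    if op == "+":
        return values[0] + values[1]
    if op == "-":
        return values[0] - values[1]
    if op == "/":
        # Assumption: protected division, returning 1.0 for a zero divisor.
        return values[0] / values[1] if values[1] != 0 else 1.0
    if op == "EXP":
        # Assumption: clamp the argument to avoid floating-point overflow.
        return math.exp(min(values[0], 50.0))
    raise ValueError(f"unknown function: {op}")


if __name__ == "__main__":
    individual = ("IFLTZ", ("/", "Y", "X"), "X", ("EXP", 0.5))
    print(evaluate(individual, -1.0, 2.0))
    print(evaluate(individual, 1.0, 2.0))
```

The sample individual is arbitrary and only exercises the primitives. It is not an evolved solution.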

Page 15: Game Playing

Evaluation

The fitness cases consist of 20 different positions of the evader on the plane, i.e. a set of (X, Y) coordinate values.

The raw fitness of an individual is the average time required to catch the evader over the 20 fitness cases.

An upper limit is set on the maximum time permitted. The hits ratio is the number of fitness cases for which this time limit is not exceeded.
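The evaluation described above can be sketched as a small simulation. Everything numeric in it is an assumption made for illustration: the speeds, capture radius, time step, time limit, the random fitness cases, and an evader that always flees directly away from the pursuer. A pursuer strategy is any function that maps the evader's position relative to the pursuer to an angle in radians.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple

# All constants below are illustrative assumptions, not values from the slides.
PURSUER_SPEED = 1.0
EVADER_SPEED = 0.67
CAPTURE_RADIUS = 0.05
DT = 0.01
TIME_LIMIT = 10.0

Strategy = Callable[[float, float], float]  # (X, Y) -> pursuer angle in radians


def time_to_capture(strategy: Strategy, x: float, y: float) -> float:
    """Simulate one case; (x, y) is the evader's position relative to the pursuer."""
    for step in range(int(TIME_LIMIT / DT)):
        dist = math.hypot(x, y)
        if dist <= CAPTURE_RADIUS:
            return step * DT
        angle = strategy(x, y)
        # Assumed evader behaviour: flee directly away from the pursuer.
        ex, ey = x / dist, y / dist
        x += (EVADER_SPEED * ex - PURSUER_SPEED * math.cos(angle)) * DT
        y += (EVADER_SPEED * ey - PURSUER_SPEED * math.sin(angle)) * DT
    return TIME_LIMIT  # not caught within the time limit


def evaluate(strategy: Strategy, cases: Sequence[Tuple[float, float]]) -> Tuple[float, int]:
    """Return (raw fitness = average capture time, hits = cases within the limit)."""
    times: List[float] = [time_to_capture(strategy, x, y) for x, y in cases]
    raw_fitness = sum(times) / len(times)
    hits = sum(1 for t in times if t < TIME_LIMIT)
    return raw_fitness, hits


def direct_pursuit(x: float, y: float) -> float:
    """Head straight for the evader's current position."""
    return math.atan2(y, x)


if __name__ == "__main__":
    rng = random.Random(0)
    cases = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(20)]
    print(evaluate(direct_pursuit, cases))
    print(evaluate(lambda x, y: 0.0, cases))  # a poor strategy: always head east
```

For the pursuer a smaller raw fitness is better, since it is an average capture time. Direct pursuit is used only as a reference strategy and is not claimed to be what the slides evolved.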

