double AlphaBeta(state, depth, alpha, beta)
begin
    if depth <= 0 then return evaluation(state)  // evaluated from the point of view of the player to move
    for each action "a" possible from state
        nextstate = performAction(a, state)
        rval = -AlphaBeta(nextstate, depth-1, -beta, -alpha)
        if (rval >= beta) then return rval
        if (rval > alpha) then alpha = rval
    endfor
    return alpha
end
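The negamax-style pseudocode above can be sketched as runnable Python. The state representation here is an assumption for illustration only: a state is either a plain number (a leaf, whose evaluation is the number itself, from the side to move) or a tuple of successor states.

```python
# Minimal runnable sketch of the negamax alpha-beta routine above.
# Assumed toy representation: a leaf is a number (its own evaluation,
# from the player to move); an inner node is a tuple of child states.

def alpha_beta(state, depth, alpha, beta):
    # Depth cut-off or leaf: return the static evaluation.
    if depth <= 0 or not isinstance(state, tuple):
        return state if not isinstance(state, tuple) else evaluation(state)
    for nextstate in state:                       # "each action a from state"
        rval = -alpha_beta(nextstate, depth - 1, -beta, -alpha)
        if rval >= beta:                          # fail high: prune remaining moves
            return rval
        if rval > alpha:                          # raise the lower bound
            alpha = rval
    return alpha

def evaluation(state):
    # Hypothetical evaluation for a cut-off at an inner node of the toy tree.
    return max(-alpha_beta(c, 0, float("-inf"), float("inf")) for c in state)

# Root is a MAX node; its minimax value for this tree is 3.
tree = ((3, 12, 8), (2, 4, 6), (14, 5, 2))
print(alpha_beta(tree, 3, float("-inf"), float("inf")))  # → 3
```

Note the sign flips on the recursive call: because values are always from the current player's point of view, one routine serves both MAX and MIN.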
Meta-Reasoning for Search
• problem: even with only one legal move (or one clear favourite), alpha-beta search will still generate a (possibly large) search tree
• similar: symmetrical situations
• idea: compute the utility of expanding a node before expanding it
• meta-reasoning (reasoning about reasoning): reason about how to spend computing time
Where We Are: Chess

Technology                                                     Search depth   Level of play
minimax, evaluation function, cut-off test with quiescence
search, large transposition table, speed: 1 million nodes/sec  5 ply          novice
as above, but with alpha-beta search                           10 ply         expert
as above, plus additional pruning, databases of openings
and end games, supercomputer                                   14 ply         grand master
Deep Blue
• algorithm:
  – iterative-deepening alpha-beta search, transposition table, databases incl. openings, grandmaster games (700,000), endgames (all with 5 pieces, many with 6)
• hardware:
  – 30 IBM RS/6000 processors: software search at the high levels of the tree
  – 480 custom chess processors: hardware search deep in the tree, move generation and ordering, position evaluation (8000 features)
• average performance:
  – 126 million nodes/sec, 30 billion positions/moves generated per move, search depth 14 (but up to 40 plies)
Samuel’s Checkers Program (1952)
• learned an evaluation function by self-play (see: machine learning)
• beat its creator after several days of self-play
• hardware: IBM 704
  – 10 kHz processor
  – 10,000 words of memory
  – magnetic tape for long-term storage
Chinook: Checkers World Champion
• simple alpha-beta search (running on PCs)
• database of 444 billion positions with eight or fewer pieces
• problem: Marion Tinsley
  – world checkers champion for over 40 years
  – lost only three games in all that time
• 1990: Tinsley vs. Chinook: 20.5–18.5
  – Chinook won two games!
• 1994: Tinsley retires (for health reasons)
Backgammon
• TD-GAMMON
  – searches only to depth 2 or 3
  – evaluation function:
    • machine learning techniques (see Samuel’s checkers program)
    • neural network
  – performance:
    • ranked amongst the top three players in the world
    • the program’s opinions have altered received wisdom
Go
• most popular board game in Asia
• 19x19 board: initial branching factor 361
  – too much for search methods
• best programs: Goemate/Go4++
  – pattern-recognition techniques (rules)
  – limited search (locally)
• performance: 10 kyu (weak amateur)
A Dose of Reality: Chance
• unpredictability:
  – in real life: the norm; often caused by external events that are not predictable
  – in games: added as a random element, e.g. throwing dice or shuffling cards
• games with an element of chance are less of a “toy problem”
Search Trees with Chance Nodes
• problem:
  – MAX knows its own legal moves
  – MAX does not know MIN’s possible responses
• solution: introduce chance nodes
  – between all MIN and MAX nodes
  – with n children if there are n possible outcomes of the random element, each labelled with:
    • the result of the random element
    • the probability of this outcome
Example: Search Tree for Backgammon

[Figure: a backgammon game tree with alternating MAX, CHANCE, MIN, CHANCE layers. MAX and MIN branches are labelled with moves; each chance node’s branches are labelled with a dice outcome and its probability, e.g. 1/36 for the doubles 1-1 and 6-6, and 1/18 for the non-doubles 1-2 and 5-6.]
Optimal Decisions for Games with Chance Elements
• aim: pick the move that leads to the best position
• idea: calculate the expected value over all possible outcomes of the random element
  ⇒ the expectiminimax value
Example: Simple Tree
0.9 0.90.1 0.1
2 2 13 43 1 4
2 3 1 4
0.9 × 2 + 0.1 × 3 = 2.1 0.9 × 1 + 0.1 × 4 = 1.3
2.1 1.3
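The simple-tree computation can be checked with a short expectiminimax sketch. The node encoding is an assumption for illustration: a plain number is a leaf, and inner nodes are tagged tuples.

```python
# Hedged sketch of expectiminimax on the toy tree above. Assumed encoding:
# a leaf is a number; inner nodes are ("max", children), ("min", children),
# or ("chance", [(probability, child), ...]).

def expectiminimax(node):
    if isinstance(node, (int, float)):       # leaf: static value
        return node
    kind, children = node
    if kind == "max":
        return max(expectiminimax(c) for c in children)
    if kind == "min":
        return min(expectiminimax(c) for c in children)
    # chance node: probability-weighted average over the random outcomes
    return sum(p * expectiminimax(c) for p, c in children)

# The simple tree from the slide: MAX over two chance nodes whose
# MIN-node values are (2, 3) and (1, 4), with probabilities 0.9/0.1.
tree = ("max", [
    ("chance", [(0.9, 2), (0.1, 3)]),   # 0.9*2 + 0.1*3 = 2.1
    ("chance", [(0.9, 1), (0.1, 4)]),   # 0.9*1 + 0.1*4 = 1.3
])
print(expectiminimax(tree))   # → 2.1
```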
Complexity of Expectiminimax
• time complexity: O(b^m n^m)
  – b: maximal number of possible moves
  – n: number of possible outcomes of the random element
  – m: maximal search depth
• example: backgammon
  – average b is around 20 (but can be up to 4000 for doubles)
  – n = 21
  – only about three ply of search depth is feasible
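The backgammon numbers above can be checked by a quick calculation: since b^m · n^m = (bn)^m, the tree grows by a factor of about 420 per ply, which is why roughly three ply is the practical limit.

```python
# Growth of the O(b^m * n^m) expectiminimax tree for backgammon-like
# numbers (b = 20 average moves, n = 21 dice outcomes, from the slide).
b, n = 20, 21
for m in (1, 2, 3, 4):
    print(m, (b * n) ** m)   # (b*n)^m nodes, roughly
# Three ply is about 7.4e7 nodes; four ply already exceeds 3e10.
```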