Post on 21-Dec-2015
Combinatorial Games
Martin Müller
Contents
• Combinatorial game theory
• Thermographs
• Go and Amazons as combinatorial games
Combinatorial Games
• Basics
• Example: Domineering
• Simplifying games
• Sums of games
• Hot games
What is a Game?
• 2 players, Left and Right
• Set of positions, starting position
• Moves defined by rules
• Alternating moves
• Player who cannot move loses (no draws)
Conway's plan: find the simplest possible definition
Properties of Games
• Complete information
• Perfect information
• No random element (no dice, coin throws, …)
Definition of a Game
• Move options of players
• Each move leads to a game
• Player who cannot move loses
[Figure: game with Left options A, B, C and Right options D, E]
{ A,B,C | D,E }
G = { L1,…,Ln | R1,…,Rm }
Creating Games
G = { L1,…,Ln | R1,…,Rm }
• Simplest possible game: { | }
• Next step: {{ | } | }, { | { | }}, {{ | } | { | }}
• Continue...
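The "continue" step can be sketched in Python. This is a minimal illustration, assuming a game is stored as a pair of tuples (Left options, Right options); the names ZERO, show, subsets, and next_day are hypothetical and not from the slides. Each step forms every pair of option sets drawn from the games built so far. The results are game forms, and many of them are equal in value.

```python
from itertools import chain, combinations

ZERO = ((), ())  # the simplest game { | }: neither player has a move

def show(g):
    """Render a game in the slides' brace notation."""
    lefts, rights = g
    return "{" + ",".join(map(show, lefts)) + "|" + ",".join(map(show, rights)) + "}"

def subsets(items):
    """All subsets of items, each as a tuple."""
    return chain.from_iterable(combinations(items, k) for k in range(len(items) + 1))

def next_day(games):
    """Every game form whose Left and Right options are drawn from games."""
    return [(l, r) for l in subsets(games) for r in subsets(games)]

day1 = next_day([ZERO])
day2 = next_day(day1)
print([show(g) for g in day1])  # { | } again, plus the three "next step" games
print(len(day2))                # number of forms built from the day-1 games
```

The first step yields the three games listed above together with { | } itself; the second step already yields 256 forms.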
Games and Numbers
• Insight: some games represent a number of free moves for one player
0 = { | }
1 = { 0 | }
-1 = { | 0 }
2 = { 1 | }
-2 = { | -1 }
Infinite Games
• Recursion: option leads back to game
G = { A,B | C }
A = { | G }
[Figure: graph of G with options A, B, C; A leads back to G]
The Domineering Game
[Figure: Domineering board showing a move for Right (R) and a move for Left (L)]
Domineering Examples
Inverse Game
• Swap all Left and Right moves
• Compute inverse for all options recursively
• G = { L1,…,Ln | R1,…,Rm }
• Inverse: -G = { -R1,…,-Rm | -L1,…,-Ln }
• Property of inverses: -(-G) = G
Examples of Inverses
-(0) = -({ | }) = { | } = 0
-(1) = -({0 | }) = { | -0} = { | 0} = -1
-({0|0}) = {-0 | -0} = {0|0}
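The recursive inverse can be sketched in a few lines of Python. The representation (a pair of tuples of Left and Right options) and the names neg, ZERO, ONE, MINUS_ONE, and STAR are assumptions made for this illustration.

```python
ZERO = ((), ())
ONE = ((ZERO,), ())          # { 0 | }
MINUS_ONE = ((), (ZERO,))    # { | 0 }
STAR = ((ZERO,), (ZERO,))    # { 0 | 0 }

def neg(g):
    """Swap the roles of Left and Right, recursively in every option."""
    lefts, rights = g
    return (tuple(neg(r) for r in rights), tuple(neg(l) for l in lefts))

print(neg(ZERO) == ZERO)        # -(0) = 0
print(neg(ONE) == MINUS_ONE)    # -(1) = -1
print(neg(STAR) == STAR)        # -({0|0}) = {0|0}
print(neg(neg(ONE)) == ONE)     # -(-G) = G
```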
Domineering Example
• Inverse of domineering position: rotate by 90˚
[Figure: position G and its inverse -G, obtained by rotating G by 90˚]
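The rotation rule can be checked by brute force on small boards. The sketch below assumes that Left places vertical dominoes and Right horizontal ones (the slides do not state which is which), stores a board as a frozenset of (row, column) cells, and uses hypothetical helper names. If rotating gives the inverse, then G placed next to its rotated copy must be a win for the second player.

```python
from functools import lru_cache

def moves(cells, vertical):
    """Positions reachable by placing one domino on the free cells."""
    dr, dc = (1, 0) if vertical else (0, 1)
    for r, c in cells:
        other = (r + dr, c + dc)
        if other in cells:
            yield cells - {(r, c), other}

@lru_cache(maxsize=None)
def mover_wins(cells, left_to_move):
    """True if the player to move wins; a player who cannot move loses."""
    # Assumption: Left plays vertical dominoes, Right plays horizontal ones.
    return any(not mover_wins(nxt, not left_to_move)
               for nxt in moves(cells, vertical=left_to_move))

def rotate(cells):
    """Rotate by 90 degrees: vertical pairs become horizontal and vice versa."""
    return frozenset((c, -r) for r, c in cells)

def side_by_side(a, b):
    """Sum of two boards: put b to the right of a, one empty column apart."""
    shift = max(c for _, c in a) + 2 - min(c for _, c in b)
    return a | frozenset((r, c + shift) for r, c in b)

column = frozenset({(0, 0), (1, 0), (2, 0)})      # a 3x1 vertical strip
both = side_by_side(column, rotate(column))       # G + (rotated G)

print(mover_wins(column, True), mover_wins(column, False))  # Left wins either way
print(mover_wins(both, True), mover_wins(both, False))      # whoever starts loses
```

The same mover_wins function reports that on an empty 2x2 board whoever moves first wins.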
Classification of Games
G > 0   Left wins
G < 0   Right wins
G = 0   Second player wins
G || 0  First player wins
Classification Examples
0 = { | } First player loses
{ 0 | 0 } First player wins
{ 0 | { 0 | 0 } } Left always wins
{{ 0 | 0 } | 0 } Right always wins
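The four classes can be computed directly from the rule "player who cannot move loses". This is a minimal sketch for finite, loopfree games stored as pairs of option tuples; the names mover_wins and classify are illustrative.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def mover_wins(g, left_to_move):
    """True if the player to move wins g; a player with no move loses."""
    options = g[0] if left_to_move else g[1]
    return any(not mover_wins(x, not left_to_move) for x in options)

def classify(g):
    left_first, right_first = mover_wins(g, True), mover_wins(g, False)
    if left_first and right_first:
        return "G || 0"   # first player wins
    if left_first:
        return "G > 0"    # Left wins, whoever starts
    if right_first:
        return "G < 0"    # Right wins, whoever starts
    return "G = 0"        # second player wins

ZERO = ((), ())
STAR = ((ZERO,), (ZERO,))                   # { 0 | 0 }
examples = [ZERO, STAR, ((ZERO,), (STAR,)), ((STAR,), (ZERO,))]
for g in examples:
    print(classify(g))
```

Run on the four example games above, it prints the classes in the same order as the slide.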
Comparing Games
• G > H if G - H > 0: Left wins difference game
• G < H if G - H < 0: Right wins difference game
• G = H if G - H = 0: Second player wins difference game
• G || H if G - H || 0: First player wins difference game
Canonical Form of Games
• Loopfree games have canonical form
• Two operations:
– Delete dominated options
– Reverse reversible options
• Apply as long as possible
• End result: unique canonical form
Deleting Dominated Options
• Example: {2, -5, 6, 3 | -2, 6, 13, -8} = {6|-8}
• General problem: compare games
• Complete algorithm implemented in David Wolfe's games package
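When every option is a plain number, as in the example above, deleting dominated options reduces to keeping Left's largest and Right's smallest option. The sketch below covers only that special case; the function name is hypothetical, and it is not the general algorithm, which needs a full game comparison.

```python
def delete_dominated(lefts, rights):
    """Numbers only: Left keeps the maximum option, Right keeps the minimum."""
    best_left = [max(lefts)] if lefts else []
    best_right = [min(rights)] if rights else []
    return best_left, best_right

print(delete_dominated([2, -5, 6, 3], [-2, 6, 13, -8]))  # ([6], [-8])
```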
Sums of Games
• Two games, G and H
• Choice: play either in G or in H
G+H = { G+H^L, G^L+H | G+H^R, G^R+H }
• Example: -5+3 = { -5+3^L, (-5)^L+3 | -5+3^R, (-5)^R+3 }
= { -5+2 | -4+3 } = { -3 | -1 } = -2
(-5 has no Left option and 3 has no Right option, so those terms drop out)
Sum of Domineering Positions
Fractions
• Example: {0|1} + {0|1} = 1
[Figure: options with values -1, 0 and 1]
{-1,0|1} = {0|1} = 1/2
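The sum rule and the comparison by difference game from the earlier slides can be combined to check these identities mechanically. This is a brute-force sketch for small finite games, assuming games are stored as pairs of option tuples; integer, neg, add, mover_wins, and compare are illustrative names.

```python
from functools import lru_cache

ZERO = ((), ())

def integer(n):
    """n free moves for Left (n > 0) or for Right (n < 0)."""
    g = ZERO
    for _ in range(abs(n)):
        g = ((g,), ()) if n > 0 else ((), (g,))
    return g

def neg(g):
    return (tuple(neg(r) for r in g[1]), tuple(neg(l) for l in g[0]))

def add(g, h):
    """A move in G+H is a move in G or a move in H."""
    lefts = tuple(add(x, h) for x in g[0]) + tuple(add(g, y) for y in h[0])
    rights = tuple(add(x, h) for x in g[1]) + tuple(add(g, y) for y in h[1])
    return (lefts, rights)

@lru_cache(maxsize=None)
def mover_wins(g, left_to_move):
    options = g[0] if left_to_move else g[1]
    return any(not mover_wins(x, not left_to_move) for x in options)

def compare(g, h):
    """Classify the difference game G - H."""
    d = add(g, neg(h))
    left_first, right_first = mover_wins(d, True), mover_wins(d, False)
    if left_first and right_first:
        return "||"
    if left_first:
        return ">"
    if right_first:
        return "<"
    return "="

half = ((ZERO,), (integer(1),))                               # {0|1}
print(compare(add(integer(-5), integer(3)), integer(-2)))     # -5 + 3 vs -2
print(compare(add(half, half), integer(1)))                   # {0|1}+{0|1} vs 1
print(compare(((integer(-1), ZERO), (integer(1),)), half))    # {-1,0|1} vs {0|1}
print(compare(half, ZERO), compare(half, integer(1)))         # 0 < 1/2 < 1
```

The first three comparisons come out as "=", and the last line shows {0|1} strictly between 0 and 1.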
Hot Games
• First player gets extra moves
• Both are eager to play
• Example: {1|-1}
The 2x2 square is hot
Sums of Hot Games
• Can be much more complex than summands
• Example: a = {1|-1}, b = {2|-2}, c = {3|-3}, d = {4|-4}
• Sums:
a+b = {{3|1}|{-1|-3}}
a+b+c = {{{6|4}|{2|0}}|{{0|-2}|{-4|-6}}}
a+b+c+d = { {{{10|8}|{6|4}}|{{4|2}|{0|-2}}} | {{{2|0}|{-2|-4}}|{{-4|-6}|{-8|-10}}} }
Mean
• Mean μ
• Average outcome
• Means add
Examples:
μ(4|-4) = 0
μ(6|-4) = 1
μ(4|{-4|-10}) = -3/2
μ(4|{-4|-20}) = -4
Theorem: μ(a+b) = μ(a) + μ(b)
Temperature
• Measures urgency of move
• Sum does not become hotter
Examples:
temp(4|-4) = 4
temp(4|{-4|-10}) = 11/2
temp(4|{-4|-20}) = 8
temp(4|{-4|-100}) = 8
temp(a+b) ≤ max(temp(a), temp(b))
Example
• a = 4|-4, b = 5|-5, c = 5|{-4|-6}
• temp(a) = 4, temp(b) = 5, temp(c) = 5
• temp(a + b) = 5
• temp(b + c) = 1
• temp(b + b) = 0
Leftscore and Rightscore
• Also called LeftStop and RightStop
• Minimax values of game if left (right) plays first
• Assumption: play stops in numbers
• Base points of thermograph (see next slides)
Thermograph
[Figure: thermograph with axes score and t, showing the left scaffold, right scaffold, mean, and temperature]
Thermograph (TG)
• Consists of left and right scaffold
• May coincide in a mast
• Leaf node: TG of numbers are masts
• Constructed from TG of followers
– Tax right scaffold of left follower by t
– Tax left scaffold of right follower by -t
– Compute max (min) over all left (right) followers
– Cut off above intersection of left, right, add mast
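The construction steps above can be sketched numerically. This is a simplified illustration, not the implementation behind these slides: it assumes every leaf is a number whose thermograph is a mast starting at t = 0 (no sub-zero part), that every non-number position has options for both players, and it finds the intersection by bisection instead of exact piecewise-linear arithmetic. All function names are hypothetical. The add helper relies on the standard fact that one never needs to move in a number when a non-number is present.

```python
from functools import lru_cache

def game(lefts, rights):
    """Build { lefts | rights }; options are numbers or other games."""
    return (tuple(lefts), tuple(rights))

def is_number(g):
    return isinstance(g, (int, float))

def raw_left(g, t):
    # Right scaffolds of the left followers, taxed by t; Left takes the max.
    return max(right_wall(x, t) for x in g[0]) - t

def raw_right(g, t):
    # Left scaffolds of the right followers, taxed by -t; Right takes the min.
    return min(left_wall(x, t) for x in g[1]) + t

@lru_cache(maxsize=None)
def mean_temp(g):
    """(mean, temperature): the point where the two scaffolds meet."""
    if is_number(g):
        return float(g), 0.0                 # leaf: a mast (assumed from t = 0)
    gap = lambda t: raw_left(g, t) - raw_right(g, t)
    if gap(0.0) < 0:
        raise ValueError("looks like a number; needs sub-zero thermography")
    if gap(0.0) == 0:
        return raw_left(g, 0.0), 0.0
    lo, hi = 0.0, 1.0                        # invariant: gap(lo) > 0
    while gap(hi) > 0:
        lo, hi = hi, 2 * hi
    for _ in range(60):                      # bisect down to the intersection
        mid = (lo + hi) / 2
        if gap(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (raw_left(g, hi) + raw_right(g, hi)) / 2, hi

def left_wall(g, t):
    if is_number(g):
        return float(g)
    mean, temp = mean_temp(g)
    return mean if t >= temp else raw_left(g, t)    # mast above the intersection

def right_wall(g, t):
    if is_number(g):
        return float(g)
    mean, temp = mean_temp(g)
    return mean if t >= temp else raw_right(g, t)

def add(g, h):
    """Sum of games; numbers add directly and are never moved in."""
    if is_number(g) and is_number(h):
        return g + h
    gl, gr = ((), ()) if is_number(g) else g
    hl, hr = ((), ()) if is_number(h) else h
    return (tuple(add(x, h) for x in gl) + tuple(add(g, y) for y in hl),
            tuple(add(x, h) for x in gr) + tuple(add(g, y) for y in hr))

a, b = game([4], [-4]), game([5], [-5])
c = game([5], [game([-4], [-6])])
print(mean_temp(game([4], [game([-4], [-10])])))    # (mean, temperature)
for s in (add(a, b), add(b, c), add(b, b)):
    print(mean_temp(s)[1])                          # temperatures of the sums
```

On the mean and temperature examples from the earlier slides, this sketch reproduces the listed values up to floating-point error.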
Sente and Gote Thermographs
• Three examples
– Gote
– One-sided sente
– Double sente
• All examples: leftScore - rightScore = 4
• Appear the same to a local minimax search
• But they are very different!
Gote
• Game: 4|0
• leftScore 4
• rightScore 0
• Mean: 2
• Temperature: 2
One-sided Sente
• Game: 22|4||0
• leftScore 4
• rightScore 0
• Mean: 4
• Temperature: 4
Double Sente
• Game: 12|3 || -1|-11.5
• leftScore 3
• rightScore -1
• Mean: 0.5
• Temperature: 7
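The leftScore and rightScore of the three examples can be reproduced with a plain minimax that stops at numbers. This is a minimal sketch; the game helper and function names are illustrative.

```python
def game(lefts, rights):
    """Build { lefts | rights }; options are numbers or other games."""
    return (tuple(lefts), tuple(rights))

def left_stop(g):
    """Minimax value when Left moves first; play stops at numbers."""
    if isinstance(g, (int, float)):
        return g
    return max(right_stop(x) for x in g[0])

def right_stop(g):
    """Minimax value when Right moves first; play stops at numbers."""
    if isinstance(g, (int, float)):
        return g
    return min(left_stop(x) for x in g[1])

examples = {
    "gote": game([4], [0]),                                        # 4|0
    "one-sided sente": game([game([22], [4])], [0]),               # 22|4||0
    "double sente": game([game([12], [3])], [game([-1], [-11.5])]),
}
for name, g in examples.items():
    print(name, left_stop(g), right_stop(g), left_stop(g) - right_stop(g))
```

All three games show the same difference of 4, which is why a local minimax search cannot tell them apart.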
Extensions (1)
• Sub-zero thermography
– Problem: hard to check when game is number
– extend TG to range [-1..0]
– “colored ground” rule for zugzwang-like games
– Can now construct TG from options in a uniform way
– TG = makeTG(left-option-TGs,right-option-TGs)
Extensions (2)
• TG for games including loops
– Defined in Berlekamp's paper "The Economist's View of Combinatorial Games"
– I did the first practical algorithm and implementation
– Much more complex…
– Caves, hills, bent masts, backward masts,…
Some Wild Ko Thermographs
Stable and Unstable Positions
• Position H in game G is called stable if its temperature is lower than that of all of its ancestors
• H is unstable if it has an ancestor with lower temperature
• H is semistable if it is not unstable and has an ancestor of the same temperature
Subtree of Stable Followers
• Root of a game tree is stable by definition
• Find first stable node on each line of play
• Go on recursively
• This subtree of stable followers is a (very good) small summary of the whole game
Mainlines and Sidelines
• Given G, play n copies of G optimally
• Let n go to infinity
• Some lines of play will be played more and more often
– Mainlines
• Other lines played only finitely often
– Sidelines
Stable Followers in Mainlines
• Stable mainline gote position: has two stable followers, one for each color
• Stable mainline one-sided sente position:
– Only stable follower of one color (sente)
• In a “rich environment” (e.g. coupon stack), play follows mainlines.
Playing Sum Games
• Choose one subgame
• Choose move in that subgame
• Brute force algorithm:
– Compute sum
– Find move retaining minimax value
– Problem: computing sum is slow
Fast Approximate Methods
• Goal: identify good move without computing sum
• Two parameters: mean and temperature
• Hottest games usually most urgent
• Refinement: Thermostrat
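A minimal sketch of the simplest such method: estimate the value of the sum by adding the means, and play in the subgame with the highest temperature. It uses the mean and temperature values of the gote and sente examples from the earlier slides; the function names are hypothetical, and the Thermostrat refinement is not implemented here.

```python
def estimated_value(subgames):
    """Means add, so the sum of the means approximates the sum of the games."""
    return sum(mean for mean, _temp in subgames.values())

def pick_hottest(subgames):
    """Choose the subgame with the highest temperature."""
    return max(subgames, key=lambda name: subgames[name][1])

# name -> (mean, temperature), taken from the example slides
subgames = {
    "gote": (2, 2),
    "one-sided sente": (4, 4),
    "double sente": (0.5, 7),
}
print(estimated_value(subgames))   # 6.5
print(pick_hottest(subgames))      # double sente
```

Neither function ever computes the sum game itself, which is the point of the approximation.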