Computing equilibria in extensive form games Andrew Gilpin Advanced AI – April 7, 2005.

Computing equilibria in extensive form games

Andrew Gilpin

Advanced AI – April 7, 2005

This talk

• Extensive form games
  – Representation
  – Computing equilibrium

• Poker AI
  – History of poker research
  – Current research

Extensive form representation

1. I = {0, 1, …, n} – players
2. (V, E), terminals Z – tree
3. P: V \ Z → H – controlling player
4. H = {H0, …, Hn} – information sets
5. A = {A0, …, An} – actions
6. u: Z → R^n – payoffs
7. p – chance probabilities

Perfect recall assumption: Players never forget information.

Game from: Bernhard von Stengel. Efficient Computation of Behavior Strategies. In Games and Economic Behavior 14:220-246, 1996.
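The seven-part representation above can be held in a small tree data structure. The following is only a sketch in Python: the class name `Node`, its field names, and the toy tree are assumptions made for illustration and do not come from the talk.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class Node:
    """One vertex of the game tree (V, E); all names here are illustrative."""
    player: Optional[int] = None      # controlling player (0 = chance); None at terminals
    infoset: Optional[str] = None     # label of the information set containing this node
    children: Dict[str, "Node"] = field(default_factory=dict)      # action -> child
    chance_probs: Dict[str, float] = field(default_factory=dict)   # p, used when player == 0
    payoff: Optional[Tuple[float, ...]] = None                     # u(z), set only at terminals

    def is_terminal(self) -> bool:
        return not self.children


def terminals(node):
    """Yield every terminal node at or below `node`."""
    if node.is_terminal():
        yield node
    else:
        for child in node.children.values():
            yield from terminals(child)


def leaf(u):
    """Zero-sum terminal: player 1 gets u, player 2 gets -u."""
    return Node(payoff=(u, -u))


# Hypothetical toy tree: chance picks "a" or "b", then player 1 moves.
root = Node(player=0, chance_probs={"a": 0.5, "b": 0.5}, children={
    "a": Node(player=1, infoset="v", children={"l": leaf(0), "r": leaf(1)}),
    "b": Node(player=1, infoset="v'", children={"L": leaf(-2), "R": leaf(1)}),
})
n_leaves = sum(1 for _ in terminals(root))
```

Storing the information-set label on each node keeps the tree (V, E) and the partition H in one structure; a real implementation might index nodes by information set instead.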

Computing equilibria via normal form

• The normal form is exponential in the size of the game tree, both in the worst case and in practice (e.g. poker)

Sequence form

• Instead of a move for every information set, consider the choices necessary to reach each information set and each leaf

• These choices are sequences and constitute the pure strategies in the sequence form

S1 = {{}, l, r, L, R}
S2 = {{}, c, d}

Realization plans

• Players' strategies are specified as realization plans over sequences.

• Prop. Realization plans are equivalent to behavior strategies.
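The equivalence can be made concrete with a short sketch. It assumes the standard definition of a realization plan (the empty sequence has weight 1, weights are nonnegative, and the weight of the sequence leading to an information set equals the sum of the weights of its extensions). The layout in `INFOSETS`, where player 1's two information sets are both reached by the empty sequence, is a hypothetical choice that matches S1 and is not stated in the talk.

```python
from fractions import Fraction

# Assumed layout for player 1: information sets v and v', each reached by the
# empty sequence "", with choices {l, r} and {L, R} (hypothetical, matching S1).
INFOSETS = {"v": ("", ["l", "r"]), "v'": ("", ["L", "R"])}


def is_realization_plan(x):
    """Check x[""] == 1, x >= 0, and that each parent weight splits among its choices."""
    if x[""] != 1 or any(w < 0 for w in x.values()):
        return False
    return all(x[parent] == sum(x[c] for c in choices)
               for parent, choices in INFOSETS.values())


def to_behavior(x):
    """Behavior probability of choice c at h is x[c] / x[parent sequence of h]."""
    beta = {}
    for h, (parent, choices) in INFOSETS.items():
        if x[parent] > 0:  # information sets that are never reached stay unspecified
            beta[h] = {c: x[c] / x[parent] for c in choices}
    return beta


x = {"": Fraction(1), "l": Fraction(1, 3), "r": Fraction(2, 3),
     "L": Fraction(1), "R": Fraction(0)}
beta = to_behavior(x)
```

Going the other way, a behavior strategy yields a realization plan by multiplying the behavior probabilities along each sequence.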

Computing equilibria via sequence form

• Players 1 and 2 have realization plans x and y

• Realization constraint matrices E and F specify constraints on realizations

[Figure: realization constraint matrices E and F, with labels {} l r L R, {} c d, {} v v’, {} u]
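As a rough illustration of what the constraint matrices do, the sketch below checks E x = e and F y = f for candidate plans. The matrix entries are an assumption: they are chosen to be consistent with the sequence sets S1 and S2 and with every information set being reached by the empty sequence, since the original figure is not available here.

```python
# Assumed constraint data (columns of E: {}, l, r, L, R; columns of F: {}, c, d).
E = [[1, 0, 0, 0, 0],
     [-1, 1, 1, 0, 0],
     [-1, 0, 0, 1, 1]]
e = [1, 0, 0]
F = [[1, 0, 0],
     [-1, 1, 1]]
f = [1, 0]


def matvec(M, v):
    """Plain matrix-vector product on nested lists."""
    return [sum(a * b for a, b in zip(row, v)) for row in M]


def satisfies(M, rhs, plan):
    """A plan is valid when it is nonnegative and M @ plan equals rhs."""
    return all(w >= 0 for w in plan) and matvec(M, plan) == rhs


x = [1, 0.5, 0.5, 1, 0]    # player 1 mixes l/r evenly and always plays L
y = [1, 0.25, 0.75]        # player 2 plays c with probability 1/4
ok_x = satisfies(E, e, x)
ok_y = satisfies(F, f, y)
```

The first row of each matrix fixes the weight of the empty sequence to 1; every other row states that the weight flowing into an information set equals the weight flowing out of it.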

Computing equilibria via sequence form

• Payoffs for players 1 and 2 are x^T A y and x^T B y for suitable matrices A and B

• Creating the payoff matrix:
  – Initialize each entry to 0
  – For each leaf, there is a (unique) pair of sequences corresponding to an entry in the payoff matrix
  – Add the leaf's payoff to that entry, weighted by the product of chance probabilities along the path from the root to the leaf

[Figure: payoff matrix with labels {} c d and {} l r L R]
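The construction of the payoff matrix can be sketched as follows. The leaf list is hypothetical (made-up payoffs and chance probabilities), and the function name `payoff_matrix` is likewise an assumption; only the accumulate-per-leaf procedure comes from the slide.

```python
from collections import defaultdict
from fractions import Fraction


def payoff_matrix(leaves):
    """leaves: iterable of (seq1, seq2, chance_prob, payoff_to_player_1)."""
    A = defaultdict(Fraction)           # every entry starts at 0
    for seq1, seq2, prob, u in leaves:
        A[(seq1, seq2)] += prob * u     # leaves sharing a sequence pair accumulate
    return A


half = Fraction(1, 2)
# Hypothetical leaves; "" stands for the empty sequence {}.
leaves = [
    ("l", "", half, 0),
    ("r", "c", half, 2),
    ("r", "d", half, -2),
    ("L", "c", half, -4),
    ("L", "d", half, 8),
    ("R", "", half, 2),
]
A = payoff_matrix(leaves)
```

A sparse dictionary is used because most pairs of sequences do not correspond to any leaf, so the matrix has about as many nonzero entries as the tree has leaves.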

Computing equilibria via sequence form

[Figure: primal and dual LPs. Holding x fixed, compute best response; holding y fixed, compute best response.]

Computing equilibria via sequence form: An example

min p1
subject to
  x1: p1 - p2 - p3 >= 0
  x2: 0y1 + p2 >= 0
  x3: -y2 + y3 + p2 >= 0
  x4: 2y2 - 4y3 + p3 >= 0
  x5: -y1 + p3 >= 0
  q1: -y1 = -1
  q2: y1 - y2 - y3 = 0
bounds
  y1 >= 0
  y2 >= 0
  y3 >= 0
  p1 Free
  p2 Free
  p3 Free
end
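To see what this LP computes without an LP solver, the sketch below exploits its small size: rows q1 and q2 force y1 = 1 and y2 + y3 = 1, and for any fixed y the smallest feasible p1 follows directly from rows x1 to x5. The grid search is a stand-in chosen for illustration, not the method used in the talk, and the function names are assumptions.

```python
from fractions import Fraction


def min_p1(y1, y2, y3):
    """Smallest p1 satisfying rows x1..x5 for a fixed plan (y1, y2, y3)."""
    p2 = max(0, y2 - y3)             # rows x2 and x3 bound p2 from below
    p3 = max(y1, 4 * y3 - 2 * y2)    # rows x5 and x4 bound p3 from below
    return p2 + p3                   # row x1: p1 >= p2 + p3


def grid_search(steps=100):
    """Scan y2 = t, y3 = 1 - t with y1 = 1 (rows q1, q2) over a uniform grid."""
    candidates = []
    for k in range(steps + 1):
        t = Fraction(k, steps)
        candidates.append((min_p1(Fraction(1), t, 1 - t), t))
    return min(candidates)


value, y2 = grid_search()
```

On this grid the minimum is p1 = 1, attained at y2 = y3 = 1/2. For anything larger than this toy example, the LP text above would be handed to an LP solver instead.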

Sequence form summary

• Poly-time algorithm for computing Nash equilibria in 2-player zero-sum games

• Poly-size linear complementarity problem (LCP) for computing Nash equilibria in 2-player general-sum games

• Major shortcomings:
  – Not well understood when there are more than two players
  – Sometimes, polynomial is still slow (e.g. poker)

Poker

• Poker is a wildly popular card game
  – This year’s World Series of Poker is expected to have prizes totaling almost $50 million

• Challenges
  – Incomplete information
  – Risk assessment
  – Deception and counter-deception

• Sequence form does not directly apply
  – Two-player Texas Hold’em has ~10^18 nodes

Hold’em Poker

• Every player receives hole cards

• Some cards are placed on the table (flop, turn, river)

• Betting rounds after each deal of cards
  – Players can bet, raise, check, fold, call

• At the end of the game, the player with the best hand takes the pot

Previous work in poker research

• Rule-based
• Simulation/Learning
• Game-theoretic
  – Manual abstraction
    • “Approximating Game-Theoretic Optimal Strategies for Full-scale Poker”, Billings, Burch, Davidson, Holte, Schaeffer, Schauenberg, Szafron, IJCAI-03. Distinguished Paper Award.
  – Automated abstraction

Finding equilibria in large sequential games of incomplete information

(Joint with Tuomas Sandholm, 2005)

• Outline:
  – Extensive game isomorphism
  – Restricted game isomorphic abstraction transformation
  – GameShrink – automatically shrinking games
  – Application to poker
  – Approximation methods

Extensive game isomorphism: example

Extensive game isomorphism: definition

• Let G = (I, V, E, P, H, A, u, p) and G’ = (I’, V’, E’, P’, H’, A’, u’, p’) be given. A bijection f: V → V’ is an extensive game isomorphism if:

1. f induces a graph isomorphism between (V,E) and (V’,E’)

2. For each information set h in G, f induces a bijection between the nodes of h and some h’ in G’

3. P(x) = P’(f(x)) for all x in V \ Z

4. u(x) = u’(f(x)) for all x in Z

5. p(h,a) = p’(f(h), f(a)) for all h in H0

Restricted game isomorphic abstraction transformation

• The restricted game Gx is obtained from G by removing all nodes except x and its descendants.

• (Gx, Gy) is contractible within G if:
  1. x and y are in the same information set
  2. Every node in that information set has the same parent, and the parent is either in a singleton information set or a chance node
  3. Gx and Gy are extensive game isomorphic

• For (Gx,Gy) contractible, the restricted game isomorphic abstraction transformation is the game where Gx and Gy are “merged”

Restricted game isomorphic abstraction transformation: example



Main equilibrium result

• Thm. Let G be a sequential game with observable actions, let G’ be obtained by one application of the restricted game isomorphic abstraction transformation, and let s’ be a Nash equilibrium for G’. Then the corresponding s for G is a Nash equilibrium.

Computing ExtensiveGameIsomorphic?(x,y)

1. If x and y are both leaves, return u(x) == u(y)

2. If x and y have a different number of children, or if a different player controls them, return false

3. Construct bipartite graph Gx,y (see next slide).

4. Return true if Gx,y has a perfect matching; otherwise return false.

Constructing Gx,y

• Each vertex corresponds to an information set containing a child node.

• Edges connect information sets where there exists a bijection between extensive game isomorphic vertices (extensive game isomorphic information sets)
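The perfect-matching test in step 4 of ExtensiveGameIsomorphic? can be sketched with a simple augmenting-path routine. The talk does not say which matching algorithm is used, so the choice below (often called Kuhn's algorithm) is an assumption, as are the function name and the adjacency-list encoding: left vertices stand for information sets below x, right vertices for information sets below y, and an edge means the two information sets are extensive game isomorphic.

```python
def has_perfect_matching(adj, n_right):
    """adj[i] lists the right-side vertices adjacent to left vertex i.

    Returns True when every left vertex and every right vertex can be matched.
    """
    if len(adj) != n_right:
        return False                   # unequal sides can never match perfectly
    match_right = [None] * n_right     # match_right[j] = left vertex matched to j

    def try_augment(i, seen):
        """Try to match left vertex i, re-routing earlier matches if needed."""
        for j in adj[i]:
            if j in seen:
                continue
            seen.add(j)
            if match_right[j] is None or try_augment(match_right[j], seen):
                match_right[j] = i
                return True
        return False

    return all(try_augment(i, set()) for i in range(len(adj)))


# Hypothetical graphs: the first has a perfect matching, the second does not.
good = has_perfect_matching([[0, 1], [0], [2]], 3)
bad = has_perfect_matching([[0], [0], [1, 2]], 3)
```

Because a node has few children in practice, this simple routine is usually adequate; a faster algorithm such as Hopcroft-Karp could be substituted without changing the interface.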

GameShrink: Efficiently computing restricted game isomorphic abstraction transformations

1. Bottom-up pass: Compute the ExtensiveGameIsomorphic relation for each pair of equal depth nodes.

2. Top-down pass: For i from 0 to height(G):
   • For each information set h at level i whose nodes share a common parent:
     • Apply the restricted game isomorphic abstraction transformation to each applicable x and y in h

Enhancements

• Disjoint-set data structure for storing isomorphisms

• Implicit enumeration of game tree nodes

• Necessary conditions for extensive game isomorphism

• Payoff histogram database
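The first enhancement refers to a disjoint-set (union-find) structure. A minimal version is sketched below, under the assumption that nodes found to be isomorphic are merged into one class so that later queries reduce to comparing representatives. The class and method names are illustrative and do not come from GameShrink itself.

```python
class DisjointSet:
    """Union-find with path compression and union by rank."""

    def __init__(self):
        self.parent = {}
        self.rank = {}

    def find(self, a):
        """Return the representative of a's class, adding a if it is new."""
        self.parent.setdefault(a, a)
        self.rank.setdefault(a, 0)
        root = a
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[a] != root:          # path compression
            self.parent[a], a = root, self.parent[a]
        return root

    def union(self, a, b):
        """Merge the classes of a and b."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        if self.rank[ra] < self.rank[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra                   # attach the shallower tree
        if self.rank[ra] == self.rank[rb]:
            self.rank[ra] += 1

    def same(self, a, b):
        return self.find(a) == self.find(b)


# Hypothetical node labels: x ~ y and y ~ z imply x ~ z without a direct check.
ds = DisjointSet()
ds.union("x", "y")
ds.union("y", "z")
merged = ds.same("x", "z")
separate = ds.same("x", "w")
```

Since isomorphism is an equivalence relation, storing classes rather than pairs avoids keeping a quadratic table of pairwise results.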

Application to poker

• Theorem. In poker, isomorphisms can be computed by considering only the card tree.

[Figure: example card tree with node labels J1, J2, K, KK and leaf values 0, -1, -1, 0, 1, 1]

Rhode Island Hold’em

• Invented as a testbed for AI research [Shi & Littman 2001]

• More than 3.1 billion game tree nodes

• Applying sequence form:
  – LP has 91 million rows and columns

• Applying GameShrink:
  – LP has 1.2 million rows and columns
  – Solvable in about 1 week
  – GameShrink itself takes less than 1 second; the LP solve still dominates

Future poker research

• More difficult games
  – Multi-player
    • LP only handles two players
    • Possible mapping of n-player strategy to (n+1)-player strategy
  – Tournament
    • Size of bankroll changes aggressiveness of players

• Maximally vs. Optimally
  – Opponent modeling
