
Computing equilibria in extensive form games Andrew Gilpin Advanced AI – April 7, 2005.


Computing equilibria in extensive form games

Andrew Gilpin

Advanced AI – April 7, 2005


This talk

• Extensive form games
  – Representation
  – Computing equilibrium

• Poker AI
  – History of poker research
  – Current research


Extensive form representation

1. I = {0, 1, …, n} – players
2. (V,E), terminals Z – tree
3. P: V \ Z → H – controlling player
4. H = {H0, …, Hn} – information sets
5. A = {A0, …, An} – actions
6. u: Z → R^n – payoffs
7. p – chance probabilities

Perfect recall assumption: Players never forget information

Game from: Bernhard von Stengel. Efficient Computation of Behavior Strategies. In Games and Economic Behavior 14:220-246, 1996.


Computing equilibria via normal form

• The normal form is exponential in the size of the game tree, both in the worst case and in practice (e.g. poker)


Sequence form

• Instead of a move for every information set, consider choices necessary to reach each information set and each leaf

• These choices are sequences and constitute the pure strategies in the sequence form

S1 = {{}, l, r, L, R}
S2 = {{}, c, d}
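As a rough illustration of why the sequence form is smaller than the normal form, the sketch below counts unreduced pure strategies and sequences for a player with perfect recall. The dictionary encoding (information set name to available actions) and the information set names are assumptions made for this example, chosen so that the sequence count matches S1.

```python
from math import prod

def count_pure_strategies(info_sets):
    """Unreduced normal form: one action is chosen at every information set."""
    return prod(len(actions) for actions in info_sets.values())

def count_sequences(info_sets):
    """Sequence form under perfect recall: the empty sequence plus one per action."""
    return 1 + sum(len(actions) for actions in info_sets.values())

# Hypothetical player with two information sets and two actions at each.
player1 = {"v": ["l", "r"], "v'": ["L", "R"]}
print(count_pure_strategies(player1), count_sequences(player1))  # 4 5

# With 20 binary information sets the gap is already large.
big = {f"h{i}": ["a", "b"] for i in range(20)}
print(count_pure_strategies(big), count_sequences(big))  # 1048576 41
```

The pure strategy count grows as a product over information sets, while the sequence count grows as a sum.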


Realization plans

• Players’ strategies are specified as realization plans over sequences

• Prop. Realization plans are equivalent to behavior strategies.
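
A hedged statement of the standard definition, following the von Stengel paper cited earlier. The symbols σ_h (the sequence leading to information set h, unique under perfect recall) and A(h) (the actions available at h) are notation assumed here, not taken from the slide. A realization plan x assigns a weight to each of a player's sequences such that:

```latex
x(\emptyset) = 1, \qquad
\sum_{a \in A(h)} x(\sigma_h a) = x(\sigma_h) \quad \text{for every information set } h \text{ of the player}, \qquad
x(\sigma) \ge 0 \ \text{ for every sequence } \sigma .
```

The corresponding behavior strategy plays action a at h with probability x(σ_h a) / x(σ_h) whenever x(σ_h) > 0, which is the sense in which the two representations are equivalent.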


Computing equilibria via sequence form

• Players 1 and 2 have realization plans x and y

• Realization constraint matrices E and F specify constraints on realizations

[Figure: matrix E with columns {}, l, r, L, R and rows {}, v, v’; matrix F with columns {}, c, d and rows {}, u]
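
A minimal sketch of what these constraints look like for the example sequences. The entries of E and F below are inferred from the constraints of the LP example later in this talk, so they are an assumption rather than a copy of the slide's matrices. The helper `is_realization_plan` is a hypothetical name; it checks Ex = e (or Fy = f) together with nonnegativity.

```python
# Rows: root constraint, then one row per information set (v, v' for player 1; u for player 2).
# Columns follow the sequence order {} l r L R (player 1) and {} c d (player 2).
E = [[ 1, 0, 0, 0, 0],
     [-1, 1, 1, 0, 0],
     [-1, 0, 0, 1, 1]]
e = [1, 0, 0]
F = [[ 1, 0, 0],
     [-1, 1, 1]]
f = [1, 0]

def is_realization_plan(M, rhs, plan, tol=1e-9):
    """True if M @ plan equals rhs (within tol) and every entry of plan is nonnegative."""
    if any(p < -tol for p in plan):
        return False
    return all(abs(sum(m * p for m, p in zip(row, plan)) - r) <= tol
               for row, r in zip(M, rhs))

x = [1, 0.25, 0.75, 1 / 3, 2 / 3]  # play l with prob. 0.25 at v, L with prob. 1/3 at v'
y = [1, 0.5, 0.5]                  # play c and d with equal probability at u
print(is_realization_plan(E, e, x), is_realization_plan(F, f, y))  # True True
print(is_realization_plan(E, e, [1, 1, 1, 0, 1]))                  # False
```

The last vector fails because the weights on l and r sum to 2 rather than to the weight of the empty sequence.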


Computing equilibria via sequence form

• Payoffs for players 1 and 2 are x^T A y and x^T B y for suitable matrices A and B

• Creating payoff matrix:
  – Initialize each entry to 0
  – For each leaf, there is a (unique) pair of sequences corresponding to an entry in the payoff matrix
  – Add the leaf’s payoff to that entry, weighted by the product of chance probabilities along the path from the root to the leaf

[Figure: payoff matrix with rows {}, l, r, L, R and columns {}, c, d]
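
The construction steps above can be sketched as follows. The function name `build_payoff_matrix`, the tuple layout for leaves, and the leaf list itself are all assumptions made for illustration; the leaves are hypothetical and are not the game drawn on the slide.

```python
def build_payoff_matrix(seqs1, seqs2, leaves):
    """leaves: iterable of (seq1, seq2, chance_prob, payoff) tuples, one per leaf."""
    row = {s: i for i, s in enumerate(seqs1)}
    col = {s: j for j, s in enumerate(seqs2)}
    A = [[0.0] * len(seqs2) for _ in seqs1]  # initialize each entry to 0
    for s1, s2, chance_prob, payoff in leaves:
        # Several leaves can share a sequence pair (they differ only in chance moves),
        # so weighted payoffs are accumulated rather than overwritten.
        A[row[s1]][col[s2]] += chance_prob * payoff
    return A

S1 = ["", "l", "r", "L", "R"]  # "" stands for the empty sequence {}
S2 = ["", "c", "d"]
# Hypothetical leaves: (player 1 sequence, player 2 sequence, chance probability, payoff).
leaves = [
    ("l", "",  0.5,   0.0),
    ("r", "c", 0.5,   2.0),
    ("r", "d", 0.5,  -2.0),
    ("L", "c", 0.5,  -4.0),
    ("L", "d", 0.5,   8.0),
    ("R", "",  0.25,  4.0),  # two leaves reached by the same sequence pair
    ("R", "",  0.25,  0.0),
]
A = build_payoff_matrix(S1, S2, leaves)
for s, r in zip(S1, A):
    print(repr(s), r)
```

The resulting matrix is sparse: only sequence pairs that actually lead to a leaf get a nonzero entry.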


Computing equilibria via sequence form

• Holding x fixed, compute best response (primal and dual LPs)

• Holding y fixed, compute best response (primal and dual LPs)
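
For reference, a hedged reconstruction of these linear programs in the standard sequence-form notation of the von Stengel paper cited earlier, for the zero-sum case with payoff matrix A. Here e and f are the right-hand sides of the realization constraints Ex = e and Fy = f, and p is a vector of dual variables; these symbols are assumptions, not recovered from the slide. Player 1's best response to a fixed y and its dual are:

```latex
\text{Primal:}\quad \max_{x}\; x^{\top}(Ay) \quad \text{s.t.}\quad Ex = e,\; x \ge 0
\qquad\qquad
\text{Dual:}\quad \min_{p}\; e^{\top}p \quad \text{s.t.}\quad E^{\top}p \ge Ay
```

Treating y as a variable in the dual, subject to its own realization constraints, gives a single LP whose solution is an equilibrium strategy for player 2:

```latex
\min_{y,\,p}\; e^{\top}p \quad \text{s.t.}\quad -Ay + E^{\top}p \ge 0,\quad -Fy = -f,\quad y \ge 0
```

The case with x held fixed is symmetric, with the roles of E and F exchanged.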


Computing equilibria via sequence form: An example

min p1
subject to
  x1: p1 - p2 - p3 >= 0
  x2: 0y1 + p2 >= 0
  x3: -y2 + y3 + p2 >= 0
  x4: 2y2 - 4y3 + p3 >= 0
  x5: -y1 + p3 >= 0
  q1: -y1 = -1
  q2: y1 - y2 - y3 = 0
bounds
  y1 >= 0
  y2 >= 0
  y3 >= 0
  p1 Free
  p2 Free
  p3 Free
end
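
A small sanity check of this LP, written in plain Python rather than with an LP solver (a choice made here only to keep the sketch dependency-free). Constraints q1 and q2 force y1 = 1 and y2 + y3 = 1, so the only freedom is t = y2. For each t, the smallest feasible p2, p3 and p1 can be read off from constraints x1 to x5, and a grid scan over t approximates the optimum. The function name `objective` and the grid resolution are assumptions.

```python
def objective(t):
    """Smallest feasible p1 for y = (1, t, 1 - t), read off from constraints x1..x5."""
    y1, y2, y3 = 1.0, t, 1.0 - t
    p2 = max(0.0 * y1, y2 - y3)     # x2: p2 >= 0*y1,  x3: p2 >= y2 - y3
    p3 = max(-2 * y2 + 4 * y3, y1)  # x4: p3 >= -2*y2 + 4*y3,  x5: p3 >= y1
    return p2 + p3                  # x1: p1 >= p2 + p3

grid = [i / 1000 for i in range(1001)]
best_t = min(grid, key=objective)
print(best_t, objective(best_t))  # 0.5 1.0
```

Under this reading of the LP, player 2 mixes c and d equally, and the optimal objective value is 1.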


Sequence form summary

• Poly-time algorithm for computing Nash equilibria in 2-player zero-sum games

• Poly-size linear complementarity problem (LCP) for computing Nash equilibria in 2-player general-sum games

• Major shortcomings:
  – Not well understood when there are more than two players
  – Sometimes, polynomial is still slow (e.g. poker)


Poker

• Poker is a wildly popular card game
  – This year’s World Series of Poker is expected to have prizes totaling almost $50 million

• Challenges
  – Incomplete information
  – Risk assessment
  – Deception and counter-deception

• Sequence form does not directly apply
  – Two-player Texas Hold’em has ~10^18 nodes


Hold’em Poker

• Every player receives hole cards

• Some cards are placed on the table (flop, turn, river)

• Betting rounds after each deal of cards
  – Players can bet, raise, check, fold, call

• At end of the game, player with best hand takes the pot


Previous work in poker research

• Rule-based
• Simulation/Learning
• Game-theoretic
  – Manual abstraction
    • “Approximating Game-Theoretic Optimal Strategies for Full-scale Poker”, Billings, Burch, Davidson, Holte, Schaeffer, Schauenberg, Szafron, IJCAI-03. Distinguished Paper Award.
  – Automated abstraction


Finding equilibria in large sequential games of incomplete information

(Joint with Tuomas Sandholm, 2005)

• Outline:
  – Extensive game isomorphism
  – Restricted game isomorphic abstraction transformation
  – GameShrink – automatically shrinking games
  – Application to poker
  – Approximation methods


Extensive game isomorphism: example


Extensive game isomorphism: definition

• Let G=(I,V,E,P,H,A,u,p) and G’=(I’,V’,E’,P’,H’,A’,u’,p’) be given. A bijection f: V → V’ is an extensive game isomorphism if:

1. f induces a graph isomorphism between (V,E) and (V’,E’)

2. For each information set h in G, f induces a bijection between the nodes of h and some h’ in G’

3. P(x) = P’(f(x)) for all x in V \ Z

4. u(x) = u’(f(x)) for all x in Z

5. p(h,a) = p’(f(h), f(a)) for all h in H0


Restricted game isomorphic abstraction transformation

• The restricted game Gx is obtained from G by removing all nodes except x and its descendants.

• (Gx,Gy) is contractible within G if
  1. x and y are in the same information set
  2. Every node in that information set has the same parent, and the parent is either in a singleton information set or a chance node
  3. Gx and Gy are extensive game isomorphic

• For (Gx,Gy) contractible, the restricted game isomorphic abstraction transformation is the game where Gx and Gy are “merged”


Restricted game isomorphic abstraction transformation: example


Main equilibrium result

• Thm. Let G be a sequential game with observable actions, let G’ be obtained by one application of the restricted game isomorphic abstraction transformation, and let s’ be a Nash equilibrium for G’. Then the corresponding s for G is a Nash equilibrium.


Computing ExtensiveGameIsomorphic?(x,y)

1. If x and y are both leaves, return u(x) == u(y)

2. If x and y have a different number of children, or if a different player controls them, return false

3. Construct bipartite graph Gx,y (see next slide).

4. Return true if Gx,y has a perfect matching; otherwise return false.
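
The four steps above can be sketched as follows, under a simplifying assumption that is not in the slides: every node is its own information set (perfect information), so the vertices of the bipartite graph are child nodes rather than information sets. The tuple encoding of nodes, integer player ids, and the function names are hypothetical, and the matching routine is the standard augmenting-path method (Kuhn's algorithm), swapped in here as one way to test for a perfect matching.

```python
def perfect_matching_exists(adj, n):
    """Augmenting-path test on a bipartite graph with n left and n right vertices."""
    match_right = [-1] * n  # match_right[j] = left vertex matched to right vertex j

    def try_augment(i, seen):
        for j in adj[i]:
            if j not in seen:
                seen.add(j)
                if match_right[j] == -1 or try_augment(match_right[j], seen):
                    match_right[j] = i
                    return True
        return False

    return all(try_augment(i, set()) for i in range(n))

def isomorphic(x, y):
    """Nodes are ('leaf', payoff) or (player, [children]) with integer player ids."""
    x_leaf, y_leaf = x[0] == "leaf", y[0] == "leaf"
    if x_leaf and y_leaf:
        return x[1] == y[1]                       # step 1: compare payoffs
    if x_leaf or y_leaf:
        return False
    (px, cx), (py, cy) = x, y
    if px != py or len(cx) != len(cy):            # step 2: player and child count
        return False
    # step 3: edge i-j whenever child i of x is isomorphic to child j of y
    adj = [[j for j, b in enumerate(cy) if isomorphic(a, b)] for a in cx]
    return perfect_matching_exists(adj, len(cx))  # step 4

g1 = (1, [("leaf", 0), (2, [("leaf", 1), ("leaf", -1)])])
g2 = (1, [(2, [("leaf", -1), ("leaf", 1)]), ("leaf", 0)])  # same game, children reordered
g3 = (1, [("leaf", 0), (2, [("leaf", 1), ("leaf", 2)])])
print(isomorphic(g1, g2), isomorphic(g1, g3))  # True False
```

This sketch recomputes isomorphism of subtrees on every call; a bottom-up pass that caches results per node pair would avoid the repeated work.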


Constructing Gx,y

• Each vertex corresponds to an information set containing a child node.

• Edges connect information sets where there exists a bijection between extensive game isomorphic vertices (extensive game isomorphic information sets)


GameShrink: Efficiently computing restricted game isomorphic abstraction transformations

1. Bottom-up pass: Compute the ExtensiveGameIsomorphic relation for each pair of equal depth nodes.

2. Top-down pass: For i from 0 to height(G):
   • For each information set h at level i whose nodes share a common parent:
     • Apply the restricted game isomorphic abstraction transformation to each applicable x and y in h


Enhancements

• Disjoint-set data structure for storing isomorphisms

• Implicit enumeration of game tree nodes

• Necessary conditions for extensive game isomorphism

• Payoff histogram database
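
To illustrate the first enhancement: isomorphism is an equivalence relation, so a disjoint-set (union-find) structure can record which nodes belong to the same class without storing every pair. The sketch below is a generic union-find with path halving and union by size. How GameShrink actually uses the structure is not described in the slides, so the node ids and usage are assumptions.

```python
class DisjointSet:
    """Union-find with path halving and union by size."""

    def __init__(self):
        self.parent = {}
        self.size = {}

    def find(self, a):
        if a not in self.parent:  # lazily create singleton classes
            self.parent[a] = a
            self.size[a] = 1
        while self.parent[a] != a:
            self.parent[a] = self.parent[self.parent[a]]  # path halving
            a = self.parent[a]
        return a

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra  # attach the smaller class under the larger one
        self.size[ra] += self.size[rb]

    def same(self, a, b):
        return self.find(a) == self.find(b)

classes = DisjointSet()
classes.union("x1", "x2")  # hypothetical node ids found to be isomorphic
classes.union("x2", "x3")
print(classes.same("x1", "x3"), classes.same("x1", "x4"))  # True False
```

Transitivity comes for free: x1 and x3 end up in the same class although they were never compared directly.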


Application to poker

• Theorem. In poker, isomorphisms can be computed by considering only the card tree.

[Figure: example card tree over the cards J1, J2 and K, with leaf payoffs 0, -1, -1, 0, 1, 1]


Rhode Island Hold’em

• Invented as a testbed for AI research [Shi & Littman 2001]

• More than 3.1 billion game tree nodes

• Applying sequence form:
  – LP has 91 million rows and columns

• Applying GameShrink:
  – LP has 1.2 million rows and columns
  – Solvable in about 1 week
  – GameShrink itself takes less than 1 second; the LP solve still dominates


Future poker research

• More difficult games
  – Multi-player
    • LP only handles two players
    • Possible mapping of n-player strategy to (n+1)-player strategy
  – Tournament
    • Size of bankroll changes aggressiveness of players

• Maximally vs. Optimally
  – Opponent modeling

