Chapter 14 Game Theory to accompany Operations Research: Applications and Algorithms 4th edition by...

Chapter 14

Game Theory

to accompany

Operations Research: Applications and Algorithms

4th edition

by Wayne L. Winston

Copyright (c) 2004 Brooks/Cole, a division of Thomson Learning, Inc.

2

14.1 Two-Person Zero-Sum and Constant-Sum Games: Saddle Points

Characteristics of Two-Person Zero-Sum Games

There are two players (called the row player and the column player).

The row player must choose 1 of m strategies.

Simultaneously, the column player must choose 1 of n strategies.

If the row player chooses her ith strategy and the column player chooses her jth strategy, then the row player receives a reward of aij and the column player loses an amount aij. Thus, we may think of the row player’s reward of aij as coming from the column player.

Such a game is called a two-person zero-sum game, which can be represented by a reward matrix.

3

A zero-sum game is a game in which the payoffs for the players always add up to zero.

A two-person zero-sum game is played according to the following basic assumption: Each player chooses a strategy that enables him/her to do the best he/she can, given that his/her opponent knows the strategy he/she is following.

A two-person zero-sum game has a saddle point if and only if

$$\max_{\text{all rows}} (\text{row minimum}) = \min_{\text{all columns}} (\text{column maximum}) \tag{1}$$

4

If a two-person zero-sum game has a saddle point:

The row player should choose any strategy (row) attaining the maximum on the left side of (1).

The column player should choose any strategy (column) attaining the minimum on the right side of (1).

An easy way to spot a saddle point is to observe that the reward for a saddle point must be the smallest number in its row and the largest number in its column.

A saddle point can also be thought of as an equilibrium point in that neither player can benefit from a unilateral change in strategy.
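The saddle-point test can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `find_saddle_points` and the 3x3 matrix are assumptions chosen for the example. It uses the rule that a saddle-point entry is the smallest number in its row and the largest number in its column.

```python
def find_saddle_points(reward):
    """Return (row, column) index pairs of all saddle points of a reward matrix."""
    row_minima = [min(row) for row in reward]
    column_maxima = [max(col) for col in zip(*reward)]
    # A saddle point is the smallest entry in its row and the largest in its column.
    return [
        (i, j)
        for i, row in enumerate(reward)
        for j, entry in enumerate(row)
        if entry == row_minima[i] == column_maxima[j]
    ]


# Illustrative reward matrix (an assumption for this sketch), rewards to the row player.
example = [
    [4, 4, 10],
    [2, 3, 1],
    [6, 5, 7],
]
print(find_saddle_points(example))  # [(2, 1)]
```

Indices are zero-based, so `(2, 1)` means the third row and second column. An empty list means the game has no saddle point.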

5

A two-person constant-sum game is a two-player game in which, for any choice of both players’ strategies, the row player’s reward and the column player’s reward add up to a constant value c.

6

14.2 Two-Person Zero-Sum Games: Randomized Strategies, Domination and Graphical Solution

It has been shown that not all two-person zero-sum games have saddle points.

How can the value and optimal strategies for a two-person zero-sum game that does not have a saddle point be found?

7

Example 2: Odds and Evens

Two players (called Odd and Even) simultaneously choose the number of fingers (1 or 2) to put out.

If the sum of the fingers put out by both players is odd, then Odd wins $1 from Even.

If the sum of the fingers put out by both players is even, then Even wins $1 from Odd.

Consider the row player to be Odd and the column player to be Even.

Determine whether this game has a saddle point.

8

Example 2: Solution

This is a zero-sum game with the reward matrix shown.

Observe that for any choice of strategies by both players, there is a player who can benefit by unilaterally changing his or her strategy.

Thus, no choice of strategies by the players is stable, and the game has no saddle point.

Reward matrix for Odds and Evens (rewards to the row player, Odd; the column player is Even)

                   Even: 1 Finger   Even: 2 Fingers   Row minimum
Odd: 1 Finger            -1               +1              -1
Odd: 2 Fingers           +1               -1              -1
Column maximum           +1               +1

9

To further the analysis of the Example, the set of allowable strategies for each player must be expanded to include randomized strategies.

Any mixed strategy (x1, x2, … , xm) for the row player is a pure strategy if any of the xi equals 1.

In Odds and Evens, any mixed strategy for the row player may be written as (x1, 1 - x1), and it suffices to determine the value of x1.

As functions of x1, the expected rewards can be drawn as line segments on a graph.

10

The common value of the floor and ceiling is called the value of the game to the row player.

Any mixed strategy for the row player that guarantees that the row player gets an expected reward at least equal to the value of the game is an optimal strategy for the row player.
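For a 2x2 game without a saddle point, the graphical solution amounts to finding where the two expected-reward lines cross. The sketch below is a hedged illustration: the helper name `solve_2x2_without_saddle` is hypothetical, and the closed-form intersection is valid only under the assumption that the game has no saddle point. It is applied to the Odds and Evens reward matrix.

```python
from fractions import Fraction


def solve_2x2_without_saddle(a):
    """Row player's optimal x1 and the game value for a 2x2 game with no saddle point."""
    (a11, a12), (a21, a22) = a
    # Expected reward against column j is a1j*x1 + a2j*(1 - x1); equate j = 1 and j = 2.
    denominator = a11 - a12 - a21 + a22
    x1 = Fraction(a22 - a21, denominator)
    value = a11 * x1 + a21 * (1 - x1)
    return x1, value


# Odds and Evens: rows are Odd's choices, columns are Even's choices.
odds_and_evens = [[-1, 1], [1, -1]]
x1, value = solve_2x2_without_saddle(odds_and_evens)
print(x1, value)  # 1/2 0
```

Under these assumptions, Odd should put out 1 finger half of the time, and the value of the game to the row player is 0.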

11

14.3 Linear Programming and Zero-Sum Games

Linear programming can be used to find the value and optimal strategies for any two-person zero-sum game.

12

Example 4: Stone, Paper, Scissors

Two players must simultaneously utter one of the three words stone, paper, or scissors and show corresponding hand signs.

If both players utter the same word, then the game is a draw.

Otherwise, one player wins $1 from the other player according to these rules:

Scissors defeats (cuts) paper.

Paper defeats (covers) stone.

Stone defeats (breaks) scissors.

Find the optimal strategies for this two-person zero-sum game.

13

Example 4: Solution

The reward matrix is

                   Column Player
Row Player         Stone   Paper   Scissors   Row minimum
Stone                0      -1       +1          -1
Paper               +1       0       -1          -1
Scissors            -1      +1        0          -1
Column maximum      +1      +1       +1

Define:

x1 = probability that row player chooses stone

x2 = probability that row player chooses paper

x3 = probability that row player chooses scissors

y1 = probability that column player chooses stone

y2 = probability that column player chooses paper

y3 = probability that column player chooses scissors

14

If the row player chooses the mixed strategy (x1, x2, x3), then her optimal strategy can be found by solving the LP

max z = v
s.t. v ≤ x2 – x3 (Stone constraint)
     v ≤ -x1 + x3 (Paper constraint)
     v ≤ x1 – x2 (Scissors constraint)
     x1 + x2 + x3 = 1
     x1, x2, x3 ≥ 0; v urs (unrestricted in sign)

If the column player chooses the mixed strategy (y1, y2, y3), then her optimal strategy can be found by solving the LP

min z = w
s.t. w ≥ -y2 + y3 (Stone constraint)
     w ≥ y1 – y3 (Paper constraint)
     w ≥ -y1 + y2 (Scissors constraint)
     y1 + y2 + y3 = 1
     y1, y2, y3 ≥ 0; w urs (unrestricted in sign)

15

The column player’s LP is the dual of the row player’s LP.

Thus, the row player’s floor equals the column player’s ceiling. This result is often known as the Minimax Theorem.

The common value of v and w is the value of the game to the row player.

16

Ex. 4 – Solution continued

The most negative element in the Stone, Paper, Scissors reward matrix is -1.

Therefore, we add |-1| to each element of the reward matrix.

This yields a constant-sum game, and the LPs for each player are modified accordingly.

Stone, Paper, Scissors appears to be a fair game, so we conjecture that v = w = 0.

This solution is dual feasible. Thus, a primal feasible and dual feasible solution has been found.

17


The value of Stone, Paper, Scissors is v’ – 1 = 0, where v’ = 1 is the value of the modified constant-sum game.

The optimal strategy for the row player is ( 1/3, 1/3, 1/3).

The optimal strategy for the column player is ( 1/3, 1/3, 1/3).
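The claimed solution can be checked without an LP solver. The sketch below is an illustration under stated assumptions (the helper names `row_floor` and `column_ceiling` are hypothetical): it computes the floor guaranteed by the row strategy and the ceiling guaranteed by the column strategy. When the two agree, both strategies are optimal and the common number is the value of the game.

```python
from fractions import Fraction

# Reward matrix for Stone, Paper, Scissors (rows and columns in that order).
REWARD = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]


def row_floor(x):
    """Smallest expected reward of row strategy x over the column player's pure strategies."""
    return min(sum(x[i] * REWARD[i][j] for i in range(3)) for j in range(3))


def column_ceiling(y):
    """Largest expected reward conceded by column strategy y over the row player's pure strategies."""
    return max(sum(REWARD[i][j] * y[j] for j in range(3)) for i in range(3))


third = Fraction(1, 3)
x = y = (third, third, third)
floor, ceiling = row_floor(x), column_ceiling(y)
print(floor, ceiling)  # 0 0
```

By contrast, a pure strategy such as always playing stone, `row_floor((1, 0, 0))`, guarantees only -1.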

18

LINDO or LINGO can be used to solve for the value and the optimal strategies in a two-person zero-sum game.

Simply type in either the row or column player’s problem.

19

14.4 Two-Person Nonconstant-Sum Games

Most game-theoretic models of business situations are not constant-sum games, because it is unusual for business competitors to be in total conflict.

As in a two-person zero-sum game, a choice of strategy by each player (prisoner) is an equilibrium point if neither player can benefit from a unilateral change in strategy.

20

Example 7: Prisoner’s Dilemma

Two prisoners who escaped and participated in a robbery have been recaptured and are awaiting trial for their new crime.

Although they are both guilty, the district attorney is not sure he has enough evidence to convict them.

To entice them to testify against each other, the district attorney tells each prisoner

“If only one of you confesses and testifies against your partner, the person who confesses will go free while the person who does not confess will surely be convicted and given a 20-year jail sentence. If both of you confess, then you will both be convicted and sent to prison for 5 years. Finally, if neither of you confess, I can convict you both of a misdemeanor and you will each get 1 year in prison.”

What should each prisoner do?

21

Example 7: Solution

Assuming that the prisoners cannot communicate with each other, the strategies and rewards for each are shown.

This is not a constant-sum two-player game.

For each prisoner, the “confess” strategy dominates the “don’t confess” strategy.

Each prisoner seeks to eliminate any dominated strategies from consideration.

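Rewards are listed as (Prisoner 1, Prisoner 2), in years of prison written as negative numbers.

                   Prisoner 2
Prisoner 1         Confess      Don’t Confess
Confess            (-5, -5)     (0, -20)
Don’t Confess      (-20, 0)     (-1, -1)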

22


On the other hand, if each prisoner chooses the dominated “don’t confess” strategy, then each prisoner will spend only 1 year in prison.

If each prisoner chooses his dominated strategy, both are better off than if each prisoner chooses his undominated strategy.
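The equilibrium reasoning can be checked by brute force. The following is a minimal sketch, with names such as `is_equilibrium` chosen for illustration only. It encodes the sentences from the district attorney's offer as negative rewards and tests every pair of strategies for a profitable unilateral change.

```python
# Rewards (prisoner 1, prisoner 2) in years of prison, written as negative numbers.
CONFESS, DONT = "confess", "don't confess"
REWARDS = {
    (CONFESS, CONFESS): (-5, -5),
    (CONFESS, DONT): (0, -20),
    (DONT, CONFESS): (-20, 0),
    (DONT, DONT): (-1, -1),
}
STRATEGIES = (CONFESS, DONT)


def is_equilibrium(s1, s2):
    """True if neither prisoner gains by a unilateral change of strategy."""
    r1, r2 = REWARDS[(s1, s2)]
    best_for_1 = all(REWARDS[(alt, s2)][0] <= r1 for alt in STRATEGIES)
    best_for_2 = all(REWARDS[(s1, alt)][1] <= r2 for alt in STRATEGIES)
    return best_for_1 and best_for_2


equilibria = [(s1, s2) for s1 in STRATEGIES for s2 in STRATEGIES if is_equilibrium(s1, s2)]
print(equilibria)  # [('confess', 'confess')]
```

The only equilibrium point is for both prisoners to confess, even though both would be better off if neither confessed.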

23

A general prisoner’s dilemma reward matrix, with rewards listed as (Player 1, Player 2):

                   Player 2
Player 1           NC         C
NC                 (P, P)     (T, S)
C                  (S, T)     (R, R)

where:

NC = noncooperative action

C = cooperative action

P = punishment for not cooperating

S = payoff to person who is double-crossed

R = reward for cooperating if both players cooperate

T = temptation for double-crossing opponent

24

14.5 Introduction to n-Person Game Theory

In many competitive situations, there are more than two competitors.

Any game with n players is an n-person game and is specified by the game’s characteristic function.

Let N = {1, 2, …, n} be the set of players. For each subset S of N, the characteristic function v of a game gives the amount v(S) that the members of S can be sure of receiving if they act together and form a coalition.

v(S) can be determined by calculating the amount that members of S can get without help from players who are not in S.

25

Example 11: The Drug Game

Joe Willie has invented a new drug.

Joe cannot manufacture the drug himself but can sell the drug’s formula to company 2 or company 3.

The lucky company will split a $1 million profit with Joe Willie.

Find the characteristic function for this game.

26

Example 11: Solution

Letting Joe Willie be player 1, company 2 be player 2, and company 3 be player 3, the characteristic function for this game is

v({ }) = v({1}) = v({2}) = v({3}) = v({2,3}) = 0

v({1,2 }) = v({1,3}) = v({1, 2,3}) = $1,000,000

27

Consider any two subsets A and B of N such that A and B have no players in common (A ∩ B = ∅).

The characteristic function must satisfy the following inequality:

$$v(A \cup B) \ge v(A) + v(B)$$

This property is called superadditivity.

There are many solution concepts for n-person games.

A solution concept should indicate the reward that each player will receive.
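Superadditivity can be verified for a small game by enumerating all pairs of disjoint coalitions. The sketch below is illustrative: the function names are assumptions, and `drug_game_v` encodes the characteristic function of the drug game, where a coalition is worth $1,000,000 exactly when it contains player 1 and at least one company.

```python
from itertools import combinations

PLAYERS = (1, 2, 3)


def drug_game_v(coalition):
    """Characteristic function of the drug game: $1,000,000 if player 1 has a partner."""
    s = frozenset(coalition)
    return 1_000_000 if 1 in s and len(s) >= 2 else 0


def all_coalitions(players):
    """Every subset of the players, including the empty set."""
    return [frozenset(c) for r in range(len(players) + 1) for c in combinations(players, r)]


def is_superadditive(v, players):
    """Check v(A | B) >= v(A) + v(B) for every pair of disjoint coalitions A and B."""
    subsets = all_coalitions(players)
    return all(v(a | b) >= v(a) + v(b) for a in subsets for b in subsets if not (a & b))


print(is_superadditive(drug_game_v, PLAYERS))  # True
```

Enumeration grows exponentially with the number of players, so this approach is only practical for small n.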

28

Let x = (x1, x2, …, xn) be a vector such that player i receives a reward xi. This is called a reward vector.

A reward vector x = (x1, x2, …, xn) is not a reasonable candidate for a solution unless x satisfies

$$v(N) = \sum_{i=1}^{n} x_i \quad \text{(Group rationality)}$$

$$x_i \ge v(\{i\}) \quad \text{for each } i \in N \quad \text{(Individual rationality)}$$

If x satisfies both, it is said that x is an imputation.
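The two rationality conditions translate directly into a check. This is a minimal sketch using the drug game; the function name `is_imputation` and the two sample reward vectors are hypothetical, chosen only to show one vector that passes and one that fails.

```python
PLAYERS = (1, 2, 3)


def drug_game_v(coalition):
    """Characteristic function of the drug game."""
    s = frozenset(coalition)
    return 1_000_000 if 1 in s and len(s) >= 2 else 0


def is_imputation(x, v, players):
    """x maps each player to a reward; check group and individual rationality."""
    group_rational = sum(x[i] for i in players) == v(players)
    individually_rational = all(x[i] >= v({i}) for i in players)
    return group_rational and individually_rational


# Hypothetical reward vectors for illustration.
fair_split = {1: 500_000, 2: 250_000, 3: 250_000}
negative_share = {1: 1_200_000, 2: -200_000, 3: 0}
print(is_imputation(fair_split, drug_game_v, PLAYERS))      # True
print(is_imputation(negative_share, drug_game_v, PLAYERS))  # False
```

The second vector is group rational but fails individual rationality, because player 2 would receive less than v({2}) = 0.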

29

14.6 The Core of an n-Person Game

An important solution concept for an n-person game is the core.

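The concept of domination: Given an imputation x = (x1, x2, …, xn), we say that the imputation y = (y1, y2, …, yn) dominates x through a coalition S (written $y >^{S} x$) if

$$\sum_{i \in S} y_i \le v(S) \quad \text{and, for all } i \in S, \quad y_i > x_i$$

The core of an n-person game is the set of all undominated imputations.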
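The domination test has two parts: the coalition must be able to attain what y promises its members, and every member must strictly prefer y to x. The sketch below illustrates this on the drug game; the function name `dominates` and the two imputations are assumptions made for the example, and the coalition is assumed to be nonempty.

```python
def drug_game_v(coalition):
    """Characteristic function of the drug game."""
    s = frozenset(coalition)
    return 1_000_000 if 1 in s and len(s) >= 2 else 0


def dominates(y, x, coalition, v):
    """True if imputation y dominates imputation x through the given nonempty coalition."""
    attainable = sum(y[i] for i in coalition) <= v(coalition)
    all_prefer_y = all(y[i] > x[i] for i in coalition)
    return attainable and all_prefer_y


# Hypothetical imputations for illustration.
x = {1: 400_000, 2: 300_000, 3: 300_000}
y = {1: 500_000, 2: 500_000, 3: 0}
print(dominates(y, x, (1, 2), drug_game_v))  # True
print(dominates(x, y, (1, 3), drug_game_v))  # False
```

Here players 1 and 2 can secure $1,000,000 on their own and both do better under y, so x is dominated and cannot be in the core.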

30

Theorem 1

An imputation x = (x1, x2, …, xn) is in the core of an n-person game if and only if for each subset S of N

$$\sum_{i \in S} x_i \ge v(S)$$

31


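Find the core of the drug game.

For this game, x = (x1, x2, x3) is an imputation if and only if

x1 ≥ 0, x2 ≥ 0, x3 ≥ 0
x1 + x2 + x3 = $1,000,000

Theorem 1 shows that x = (x1, x2, x3) will be in the core if and only if x1, x2, x3 also satisfy the following

x1 + x2 ≥ $1,000,000
x1 + x3 ≥ $1,000,000
x2 + x3 ≥ 0

Adding the first two of these inequalities gives 2x1 + x2 + x3 ≥ $2,000,000; subtracting the equality gives x1 ≥ $1,000,000, so x2 = x3 = 0.

The core of the game is the single imputation ($1,000,000, $0, $0), which emphasizes the importance of player 1.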
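Theorem 1 gives a direct membership test for the core. The following sketch, with the hypothetical helper `in_core`, checks group rationality and then the coalition inequality for every nonempty subset of players in the drug game.

```python
from itertools import combinations

PLAYERS = (1, 2, 3)


def drug_game_v(coalition):
    """Characteristic function of the drug game."""
    s = frozenset(coalition)
    return 1_000_000 if 1 in s and len(s) >= 2 else 0


def in_core(x, v, players):
    """Theorem 1 test: x is group rational and every coalition S receives at least v(S)."""
    if sum(x[i] for i in players) != v(players):
        return False
    for size in range(1, len(players) + 1):
        for coalition in combinations(players, size):
            if sum(x[i] for i in coalition) < v(coalition):
                return False
    return True


print(in_core({1: 1_000_000, 2: 0, 3: 0}, drug_game_v, PLAYERS))      # True
print(in_core({1: 500_000, 2: 500_000, 3: 0}, drug_game_v, PLAYERS))  # False
```

The second vector fails because players 1 and 3 together receive only $500,000, less than v({1,3}) = $1,000,000.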

32

14.7 The Shapley Value

An alternative solution concept for n-person games is the Shapley value, which in general gives more equitable solutions than the core does.

Theorem 2

Given any n-person game with the characteristic function v, there is a unique reward vector x = (x1, x2, …, xn) satisfying the axioms. The reward of the ith player (xi) is given by

$$x_i = \sum_{\text{all } S \text{ for which } i \text{ is not in } S} p_n(S)\,[v(S \cup \{i\}) - v(S)]$$

where

$$p_n(S) = \frac{|S|!\,(n - |S| - 1)!}{n!}$$

and |S| is the number of players in S, and for n ≥ 1, n! = n(n – 1) ··· 2(1) (0! = 1).

33


Find the Shapley value for the drug game.

To compute x1, the reward that player 1 should receive, list all coalitions S for which player 1 is not a member.

For each of these coalitions, compute v(S ∪ {1}) – v(S) and p3(S).
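The procedure above can be carried out mechanically. The sketch below is an illustration under stated assumptions: the function name `shapley_value` is hypothetical, and exact fractions are used to avoid rounding. It applies the Theorem 2 formula to each player of the drug game.

```python
from fractions import Fraction
from itertools import combinations
from math import factorial


def drug_game_v(coalition):
    """Characteristic function of the drug game."""
    s = frozenset(coalition)
    return 1_000_000 if 1 in s and len(s) >= 2 else 0


def shapley_value(i, v, players):
    """Sum p_n(S) * [v(S with i added) - v(S)] over all coalitions S that exclude player i."""
    n = len(players)
    others = [p for p in players if p != i]
    total = Fraction(0)
    for size in range(n):
        # p_n(S) depends only on the size of S.
        p_n = Fraction(factorial(size) * factorial(n - size - 1), factorial(n))
        for s in combinations(others, size):
            total += p_n * (v(s + (i,)) - v(s))
    return total


players = (1, 2, 3)
values = [shapley_value(i, drug_game_v, players) for i in players]
print([float(value) for value in values])
```

The three rewards add up to v(N) = $1,000,000, with player 1 receiving two-thirds of the total and each company one-sixth.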
