PNAS-1953-Shapley-1095-100

Transcript

8/19/2019 PNAS-1953-Shapley-1095-100

http://slidepdf.com/reader/full/pnas-1953-shapley-1095-100 1/6

MATHEMATICS: L. S. SHAPLEY

STOCHASTIC GAMES*

By L. S. SHAPLEY

PRINCETON UNIVERSITY

Communicated by J. von Neumann, July 17, 1953

Introduction.-In a stochastic game the play proceeds by steps from position to position, according to transition probabilities controlled jointly by the two players. We shall assume a finite number, $N$, of positions, and finite numbers $m_k$, $n_k$ of choices at each position; nevertheless, the game may not be bounded in length. If, when at position $k$, the players choose their $i$th and $j$th alternatives, respectively, then with probability $s^k_{ij} > 0$ the game stops, while with probability $p^{kl}_{ij}$ the game moves to position $l$. Define

$$s = \min_{k,i,j} s^k_{ij}.$$

Since $s$ is positive, the game ends with probability 1 after a finite number of steps, because, for any number $t$, the probability that it has not stopped after $t$ steps is not more than $(1 - s)^t$.

Payments accumulate throughout the course of play: the first player takes $a^k_{ij}$ from the second whenever the pair $i, j$ is chosen at position $k$. If we define the bound $M$:

$$M = \max_{k,i,j} |a^k_{ij}|,$$

then we see that the expected total gain or loss is bounded by

$$M + (1 - s)M + (1 - s)^2 M + \ldots = M/s. \tag{1}$$
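As a quick numerical check of the bound (1), the partial sums of the series can be compared with $M/s$. The sketch below is illustrative only: the function name `partial_bound` and the sample values of `M` and `s` are assumptions introduced here, not part of the original text.

```python
def partial_bound(M: float, s: float, t: int) -> float:
    """Sum M + (1-s)M + ... + (1-s)^(t-1) M, bounding the payments in the first t steps."""
    total = 0.0
    survive = 1.0  # upper bound on the probability that play has not yet stopped
    for _ in range(t):
        total += survive * M
        survive *= 1.0 - s
    return total


if __name__ == "__main__":
    M, s = 3.0, 0.25  # hypothetical payoff bound and minimum stop probability
    for t in (1, 10, 100):
        print(t, partial_bound(M, s, t), "<=", M / s)
```

The partial sums increase with `t` and approach `M / s` from below, which is the content of (1).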

The process therefore depends on $N^2 + N$ matrices

$$P^{kl} = (p^{kl}_{ij} \mid i = 1, 2, \ldots, m_k;\ j = 1, 2, \ldots, n_k)$$

$$A^k = (a^k_{ij} \mid i = 1, 2, \ldots, m_k;\ j = 1, 2, \ldots, n_k),$$

with $k, l = 1, 2, \ldots, N$, with elements satisfying

$$p^{kl}_{ij} \ge 0, \quad |a^k_{ij}| \le M, \quad \sum_{l=1}^{N} p^{kl}_{ij} = 1 - s^k_{ij} \le 1 - s < 1.$$
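The constraints on the $N^2 + N$ matrices can be checked mechanically. The following is a minimal sketch, assuming a nested-list layout `A[k][i][j]` and `P[k][l][i][j]` with zero-based indices; the layout and the function names are hypothetical choices made for illustration.

```python
def stop_probabilities(P, k, m, n):
    """Return s[i][j] = 1 - sum_l P[k][l][i][j] for position k."""
    N = len(P)
    return [[1.0 - sum(P[k][l][i][j] for l in range(N)) for j in range(n)]
            for i in range(m)]


def validate_game(A, P):
    """Check p >= 0 and s > 0 everywhere; return (s, M) as defined in the text."""
    N = len(A)
    s_min, M = 1.0, 0.0
    for k in range(N):
        m, n = len(A[k]), len(A[k][0])
        for l in range(N):
            for i in range(m):
                for j in range(n):
                    if P[k][l][i][j] < 0:
                        raise ValueError("negative transition probability")
        for row in stop_probabilities(P, k, m, n):
            s_min = min(s_min, min(row))
        M = max(M, max(abs(a) for row in A[k] for a in row))
    if s_min <= 0:
        raise ValueError("every stop probability must be positive")
    return s_min, M
```

For a game that passes the check, the returned pair gives the constants $s$ and $M$ that appear in the bound (1).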

By specifying a starting position we obtain a particular game $\Gamma^k$. The term "stochastic game" will refer to the collection $\Gamma = \{\Gamma^k \mid k = 1, 2, \ldots, N\}$.

The full sets of pure and mixed strategies in these games are rather cumbersome, since they take account of much information that turns out to be irrelevant. However, we shall have to introduce a notation only

VOL. 39, 1953  1095


Then we have

$$\|T\vec\beta - T\vec\alpha\| = \max_k \left|\mathrm{val}[A^k(\vec\beta)] - \mathrm{val}[A^k(\vec\alpha)]\right| \le \max_{k,i,j} \left|\sum_l p^{kl}_{ij}\beta^l - \sum_l p^{kl}_{ij}\alpha^l\right| \tag{3}$$

$$\le \max_{k,i,j} \left|\sum_l p^{kl}_{ij}\right| \max_l |\beta^l - \alpha^l| = (1 - s)\|\vec\beta - \vec\alpha\|,$$

using (2). In particular, $\|T^2\vec\alpha - T\vec\alpha\| \le (1 - s)\|T\vec\alpha - \vec\alpha\|$. Hence the sequence $\vec\alpha_{(0)}, T\vec\alpha_{(0)}, T^2\vec\alpha_{(0)}, \ldots$ is convergent. The limit vector $\vec\phi$ has the property $\vec\phi = T\vec\phi$. But there is only one such vector, for $\vec\psi = T\vec\psi$ implies

$$\|\vec\psi - \vec\phi\| = \|T\vec\psi - T\vec\phi\| \le (1 - s)\|\vec\psi - \vec\phi\|,$$

by (3), whence $\|\vec\psi - \vec\phi\| = 0$. Hence $\vec\phi$ is the unique fixed point of $T$ and is independent of $\vec\alpha_{(0)}$.

To show that $\phi^k$ is the value of the game $\Gamma^k$, we observe that by following an optimal strategy of the finite game $\Gamma^k_{(t)}$ for the first $t$ steps and playing arbitrarily thereafter, the first player can assure himself an amount within $\epsilon_t = (1 - s)^t M/s$ of the value of $\Gamma^k_{(t)}$; likewise for the other player. Since $\epsilon_t \to 0$ and the value of $\Gamma^k_{(t)}$ converges to $\phi^k$, we conclude that $\phi^k$ is indeed the value of $\Gamma^k$. Summing up:

THEOREM 1. The value of the stochastic game $\Gamma$ is the unique solution $\vec\phi$ of the system

$$\phi^k = \mathrm{val}[A^k(\vec\phi)], \quad k = 1, 2, \ldots, N.$$
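The fixed-point characterization suggests computing the value vector by repeatedly applying $T$. Below is a minimal sketch under several assumptions that are not in the text above: every position has exactly two choices per player, $A^k(\vec\alpha)$ is taken to be the matrix with entries $a^k_{ij} + \sum_l p^{kl}_{ij}\alpha^l$ (the terminal payment used in the proof of Theorem 2), and the data layout `A[k][i][j]`, `P[k][l][i][j]` and all function names are hypothetical. The stopping rule uses the standard contraction estimate $\|T^n\vec\alpha - \vec\phi\| \le \frac{1-s}{s}\|T^n\vec\alpha - T^{n-1}\vec\alpha\|$.

```python
def val_2x2(B):
    """Value of the 2x2 zero-sum matrix game B (row player maximizes)."""
    (a, b), (c, d) = B
    lo = max(min(a, b), min(c, d))   # best pure guarantee for the row player
    hi = min(max(a, c), max(b, d))   # best pure guarantee for the column player
    if lo == hi:                     # pure saddle point
        return lo
    return (a * d - b * c) / (a + d - b - c)  # completely mixed case


def shapley_iteration(A, P, s, tol=1e-12, max_iter=10_000):
    """Iterate alpha -> T(alpha), starting from 0, until within tol of the fixed point."""
    N = len(A)
    alpha = [0.0] * N
    for _ in range(max_iter):
        beta = []
        for k in range(N):
            Ak = [[A[k][i][j] + sum(P[k][l][i][j] * alpha[l] for l in range(N))
                   for j in range(2)] for i in range(2)]
            beta.append(val_2x2(Ak))
        delta = max(abs(b - a) for a, b in zip(alpha, beta))
        alpha = beta
        # Contraction estimate: ||alpha - phi|| <= (1 - s) / s * delta
        if (1.0 - s) / s * delta <= tol:
            break
    return alpha
```

For larger matrices the closed-form `val_2x2` would have to be replaced by a general matrix-game solver, for example a linear program.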

Our next objective is to prove the existence of optimal strategies.

THEOREM 2. The stationary strategies $\vec{x}^*, \vec{y}^*$, where $x^l \in X[A^l(\vec\phi)]$, $y^l \in Y[A^l(\vec\phi)]$, $l = 1, 2, \ldots, N$, are optimal for the first and second players respectively in every game $\Gamma^k$ belonging to $\Gamma$.

Proof: Let a finite version of $\Gamma^k$ be defined by agreeing that on the $t$th step the play shall stop, with the first player receiving the amount $a^h_{ij} + \sum_l p^{hl}_{ij}\phi^l$ instead of just $a^h_{ij}$. Clearly, the stationary strategy $\vec{x}^*$ assures the first player the amount $\phi^k$ in this finite version. In the original game $\Gamma^k$, if the first player uses $\vec{x}^*$, his expected winnings after $t$ steps will be at least

$$\phi^k - (1 - s)^{t-1} \max_{h,i,j} \sum_l p^{hl}_{ij}\phi^l,$$


and hence at least

$$\phi^k - (1 - s)^t \max_l \phi^l.$$

His total expected winnings are therefore at least

$$\phi^k - (1 - s)^t \max_l \phi^l - (1 - s)^t M/s.$$

Since this is true for arbitrarily large values of $t$, it follows that $\vec{x}^*$ is optimal in $\Gamma^k$ for the first player. Similarly, $\vec{y}^*$ is optimal for the second player.
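Theorem 2 can be read constructively: once $\vec\phi$ is known, an optimal stationary strategy is obtained by solving one matrix game per position. The sketch below assumes that $X[B]$ denotes the set of optimal strategies of the first player in the matrix game $B$, that every position is 2x2, and that $\vec\phi$ has already been computed; the function names and the layout `A[l][i][j]`, `P[l][h][i][j]` are hypothetical.

```python
def optimal_row_strategy_2x2(B):
    """One optimal mixed strategy [x1, x2] for the maximizing row player of B."""
    (a, b), (c, d) = B
    row_floor = [min(a, b), min(c, d)]
    lo = max(row_floor)
    hi = min(max(a, c), max(b, d))
    if lo == hi:  # pure saddle point: play a row that attains the maximin
        return [1.0, 0.0] if row_floor[0] == lo else [0.0, 1.0]
    denom = a + d - b - c
    # Mixed case: weights that make the column player indifferent
    return [(d - c) / denom, (a - b) / denom]


def stationary_strategy(A, P, phi):
    """Pick x^l from X[A^l(phi)] for every position l (2x2 positions only)."""
    N = len(A)
    xs = []
    for l in range(N):
        Al = [[A[l][i][j] + sum(P[l][h][i][j] * phi[h] for h in range(N))
               for j in range(2)] for i in range(2)]
        xs.append(optimal_row_strategy_2x2(Al))
    return xs
```

The second player's strategy $\vec{y}^*$ follows in the same way by applying the routine to the negated transpose of each $A^l(\vec\phi)$.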

Reduction to a Finite-Dimensional Game.-The non-linearity of the "val" operator often makes it difficult to obtain exact solutions by means of Theorems 1 and 2. It therefore becomes desirable to express the payoff directly in terms of stationary strategies. Let $\bar\Gamma = \{\bar\Gamma^k\}$ denote the collection of games whose pure strategies are the stationary strategies of $\Gamma$. Their payoff functions $\mathfrak{K}^k(\vec x, \vec y)$ must satisfy

$$\mathfrak{K}^k(\vec x, \vec y) = x^k A^k y^k + \sum_l x^k P^{kl} y^k\, \mathfrak{K}^l(\vec x, \vec y),$$

for $k = 1, 2, \ldots, N$. This system has a unique solution; indeed, for the linear transformation $T_{\vec x \vec y}$:

$$T_{\vec x \vec y}\,\vec\alpha = \vec\beta, \quad \text{where } \beta^k = x^k A^k y^k + \sum_l x^k P^{kl} y^k \alpha^l,$$

we have at once

$$\|T_{\vec x \vec y}\,\vec\beta - T_{\vec x \vec y}\,\vec\alpha\| = \max_k \left|\sum_l x^k P^{kl} y^k (\beta^l - \alpha^l)\right| \le (1 - s)\|\vec\beta - \vec\alpha\|,$$

corresponding to (3) above. Hence, by Cramer's rule,

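$$\mathfrak{K}^k(\vec x, \vec y) = \frac{\begin{vmatrix} x^1P^{11}y^1 - 1 & x^1P^{12}y^1 & \cdots & -x^1A^1y^1 & \cdots & x^1P^{1N}y^1 \\ x^2P^{21}y^2 & x^2P^{22}y^2 - 1 & \cdots & -x^2A^2y^2 & \cdots & x^2P^{2N}y^2 \\ \vdots & \vdots & & \vdots & & \vdots \\ x^NP^{N1}y^N & x^NP^{N2}y^N & \cdots & -x^NA^Ny^N & \cdots & x^NP^{NN}y^N - 1 \end{vmatrix}}{\begin{vmatrix} x^1P^{11}y^1 - 1 & x^1P^{12}y^1 & \cdots & x^1P^{1k}y^1 & \cdots & x^1P^{1N}y^1 \\ x^2P^{21}y^2 & x^2P^{22}y^2 - 1 & \cdots & x^2P^{2k}y^2 & \cdots & x^2P^{2N}y^2 \\ \vdots & \vdots & & \vdots & & \vdots \\ x^kP^{k1}y^k & x^kP^{k2}y^k & \cdots & x^kP^{kk}y^k - 1 & \cdots & x^kP^{kN}y^k \\ \vdots & \vdots & & \vdots & & \vdots \\ x^NP^{N1}y^N & x^NP^{N2}y^N & \cdots & x^NP^{Nk}y^N & \cdots & x^NP^{NN}y^N - 1 \end{vmatrix}}$$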
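The determinant ratio can be evaluated directly for small $N$. The following is a minimal sketch, assuming stationary strategies stored as lists `x[k]`, `y[k]` of probabilities and game data as `A[k][i][j]`, `P[k][l][i][j]`; these names, and the choice of Laplace expansion for the determinant, are illustrative assumptions rather than anything prescribed by the text.

```python
def det(Mx):
    """Determinant by Laplace expansion along the first row (fine for small N)."""
    if len(Mx) == 1:
        return Mx[0][0]
    total = 0.0
    for j, entry in enumerate(Mx[0]):
        minor = [row[:j] + row[j + 1:] for row in Mx[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def bilinear(x, B, y):
    """x B y for a probability vector x, matrix B, and probability vector y."""
    return sum(x[i] * B[i][j] * y[j] for i in range(len(x)) for j in range(len(y)))


def stationary_payoffs(A, P, x, y):
    """Payoff of the stationary pair (x, y) from every starting position, by Cramer's rule."""
    N = len(A)
    # D = C - I with C[k][l] = x^k P^{kl} y^k; b[k] = x^k A^k y^k
    D = [[bilinear(x[k], P[k][l], y[k]) - (1.0 if k == l else 0.0)
          for l in range(N)] for k in range(N)]
    b = [bilinear(x[k], A[k], y[k]) for k in range(N)]
    d0 = det(D)
    payoffs = []
    for k in range(N):
        # Replace column k of D by the entries -b[r], as in the numerator
        Dk = [row[:k] + [-b[r]] + row[k + 1:] for r, row in enumerate(D)]
        payoffs.append(det(Dk) / d0)
    return payoffs
```

For larger $N$, Gaussian elimination on the same linear system would be the more practical route; the determinant form is kept here only because it mirrors the formula.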

THEOREM 3. The games $\bar\Gamma^k$ possess saddle points:

$$\min_{\vec y} \max_{\vec x} \mathfrak{K}^k(\vec x, \vec y) = \max_{\vec x} \min_{\vec y} \mathfrak{K}^k(\vec x, \vec y), \tag{4}$$

PROC. N. A. S.  1098


for $k = 1, 2, \ldots, N$. Any stationary strategy which is optimal for all $\Gamma^k \in \Gamma$ is an optimal pure strategy for all $\bar\Gamma^k \in \bar\Gamma$, and conversely. The value vectors of $\Gamma$ and $\bar\Gamma$ are the same.

The proof is a simple argument based on Theorem 2. It should be pointed out that a strategy $\vec x$ may be optimal for one game $\Gamma^k$ (or $\bar\Gamma^k$) and not optimal for other games belonging to $\Gamma$ (or $\bar\Gamma$). This is due to the possibility that $\Gamma$ might be "disconnected"; however if none of the $p^{kl}_{ij}$ are zero this possibility does not arise.

It can be shown that the sets of optimal stationary strategies for $\Gamma$ are closed, convex polyhedra. A stochastic game with rational coefficients does not necessarily have a rational value. Thus, unlike the minimax theorem for bilinear forms, the equation (4) is not valid in an arbitrary ordered field.
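The remark about irrational values can be illustrated with a small example constructed here for that purpose (it is not taken from the original). Assume $N = 1$ and two choices per player, payoffs $a_{11} = a_{22} = 1$, $a_{12} = a_{21} = 0$, and continuation probabilities $p_{11} = 1/2$, $p_{12} = p_{21} = p_{22} = 0$, so that the stop probabilities are $1/2, 1, 1, 1$. Taking $A(\phi)$ to have entries $a_{ij} + p_{ij}\phi$, the matrix has no pure saddle point whenever $1 + \phi/2 > 0$, and the fixed-point equation of Theorem 1 becomes a quadratic:

```latex
\phi = \operatorname{val}\begin{pmatrix} 1 + \tfrac{1}{2}\phi & 0 \\ 0 & 1 \end{pmatrix}
     = \frac{1 + \tfrac{1}{2}\phi}{2 + \tfrac{1}{2}\phi}
\;\Longrightarrow\; \phi^2 + 3\phi - 2 = 0
\;\Longrightarrow\; \phi = \frac{\sqrt{17} - 3}{2} \approx 0.5616 .
```

The other root, $(-3 - \sqrt{17})/2$, violates $1 + \phi/2 > 0$ and is not a fixed point, so the value of this game with rational data is irrational.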

Examples and Applications.-1. When $N = 1$, $\Gamma$ may be described as a simple matrix game $A$ which is to be replayed according to probabilities that depend on the players' choices. The payoff function of $\bar\Gamma$ is

$$\mathfrak{K}(x, y) = \frac{xAy}{xSy},$$

where $S$ is the matrix of (non-zero) stop probabilities. The minimax theorem (4) for rational forms of this sort was established by von Neumann;³ an elementary proof was subsequently given by Loomis.⁴
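The ratio form has a direct reading: under fixed mixed strategies each round pays $xAy$ on average and stops with probability $xSy$, so the expected number of rounds is $1/(xSy)$. A minimal sketch of the computation, with a hypothetical function name and list-of-lists matrices assumed for illustration:

```python
def replay_payoff(x, A, S, y):
    """Payoff xAy / xSy of a single-position game replayed until it stops."""
    num = sum(x[i] * A[i][j] * y[j] for i in range(len(x)) for j in range(len(y)))
    den = sum(x[i] * S[i][j] * y[j] for i in range(len(x)) for j in range(len(y)))
    return num / den


if __name__ == "__main__":
    A = [[1.0, 0.0], [0.0, 1.0]]   # hypothetical payoff matrix
    S = [[0.5, 0.5], [0.5, 0.5]]   # hypothetical stop probabilities, all non-zero
    print(replay_payoff([0.5, 0.5], A, S, [0.5, 0.5]))
```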

2. By setting all the stop probabilities $s^k_{ij}$ equal to $s > 0$, we obtain a model of an indefinitely continuing game in which future payments are discounted by a factor $(1 - s)^t$. In this interpretation the actual transition probabilities are $q^{kl}_{ij} = p^{kl}_{ij}/(1 - s)$. By holding the $q^{kl}_{ij}$ fixed and varying $s$, we can study the influence of interest rate on the optimal strategies.
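The discounting interpretation amounts to a rescaling of the transition data. The sketch below converts continuing-game probabilities $q^{kl}_{ij}$ into the form $p^{kl}_{ij} = (1 - s)\,q^{kl}_{ij}$ used in this paper. The layout `Q[k][l][i][j]` and both function names are hypothetical, and the link between $s$ and an interest rate $r$ through $1 - s = 1/(1 + r)$ is an assumed convention, not something stated in the text.

```python
def to_shapley_form(Q, s):
    """Scale transition probabilities q by (1 - s) to obtain p = (1 - s) q."""
    return [[[[(1.0 - s) * q for q in row] for row in Qkl] for Qkl in Qk] for Qk in Q]


def interest_rate_to_stop_probability(r):
    """Stop probability s for which 1 - s = 1 / (1 + r) (assumed convention)."""
    return r / (1.0 + r)


if __name__ == "__main__":
    Q = [[[[0.25]], [[0.75]]], [[[1.0]], [[0.0]]]]  # hypothetical 2-position data
    for r in (0.05, 0.25, 1.0):
        s = interest_rate_to_stop_probability(r)
        print(r, s, to_shapley_form(Q, s)[0][1][0][0])
```

Holding `Q` fixed and sweeping `r` gives the family of games referred to in the last sentence of the example.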

3. A stochastic game does not have perfect information, but is rather a "simultaneous game," in the sense of Kuhn and Thompson. However, perfect information can be simulated within our framework by putting either $m_k$ or $n_k$ equal to 1, for all values of $k$. Such a stochastic game of perfect information will of course have a solution in stationary pure strategies.
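When one player has a single choice at every position, the matrix-game value reduces to a maximum over rows or a minimum over columns, and the fixed point of Theorem 1 can be found with plain max/min updates. The following is a minimal sketch under assumed conventions: data stored as `A[k][i][j]` and `P[k][l][i][j]`, $A^k(\vec\phi)$ taken to have entries $a^k_{ij} + \sum_l p^{kl}_{ij}\phi^l$, and a hypothetical function name. The returned choices are greedy with respect to the computed values, which is expected to give optimal pure stationary strategies when no ties are nearly exact.

```python
def solve_perfect_information(A, P, s, tol=1e-12, max_iter=10_000):
    """Value iteration when every A[k] has a single row or a single column.

    Returns (phi, choice), where choice[k] is the index of the pure
    alternative picked by the one player who has a real choice at k.
    """
    N = len(A)
    phi = [0.0] * N
    choice = [0] * N
    for _ in range(max_iter):
        new_phi = []
        for k in range(N):
            m, n = len(A[k]), len(A[k][0])
            if m > 1 and n > 1:
                raise ValueError("position %d is not of perfect information" % k)
            cont = [[A[k][i][j] + sum(P[k][l][i][j] * phi[l] for l in range(N))
                     for j in range(n)] for i in range(m)]
            if n == 1:   # first player moves alone: maximize over rows
                column = [row[0] for row in cont]
                choice[k] = max(range(m), key=column.__getitem__)
                new_phi.append(column[choice[k]])
            else:        # second player moves alone: minimize over columns
                choice[k] = min(range(n), key=cont[0].__getitem__)
                new_phi.append(cont[0][choice[k]])
        delta = max(abs(a - b) for a, b in zip(phi, new_phi))
        phi = new_phi
        if (1.0 - s) / s * delta <= tol:  # contraction estimate of the remaining error
            break
    return phi, choice
```

With $n_k = 1$ at every position the same routine performs the maximization of a single-player model.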

4. If we set $n_k = 1$ for all $k$, effectively eliminating the second player, the result is a "dynamic programming" model.⁵ Its solution is given by any set of integers $\vec{i} = \{i_1, i_2, \ldots, i_N \mid 1 \le i_k \le m_k\}$ which maximizes the expression
