Date post: 16-Jan-2016
Page 1: Efficient Solution Algorithms for Factored MDPs by Carlos Guestrin, Daphne Koller, Ronald Parr, Shobha Venkataraman Presented by Arkady Epshteyn.

Efficient Solution Algorithms for Factored MDPs

by Carlos Guestrin, Daphne Koller, Ronald Parr, Shobha Venkataraman

Presented by Arkady Epshteyn

Problem with MDPs

• Exponential number of states
• Example: Sysadmin Problem
• 4 computers: M1, M2, M3, M4
• Each machine is working or has failed.
• State space: $2^4 = 16$ states
• 8 actions: whether to reboot each machine or not
• Reward: depends on the number of working machines
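A minimal sketch of why the state count blows up, assuming each machine is a binary variable (the function name `enumerate_states` is made up for illustration):

```python
from itertools import product

def enumerate_states(n_machines):
    """Return every joint state; True means the machine is working."""
    return list(product([False, True], repeat=n_machines))

states = enumerate_states(4)
print(len(states))  # 16 joint states for 4 machines

# The count doubles with every machine that is added.
for n in (4, 10, 20):
    print(n, 2 ** n)
```

Any algorithm that loops over `states` inherits this exponential cost, which is what the factored methods later in the talk try to avoid.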

Factored Representation

• Transition model: DBN
• Reward model:

$$R(x) = \sum_{j=1}^{k} r_j(x)$$
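As a rough illustration of the factored reward, the sketch below assumes one local term per machine that pays 1.0 when that machine is working. The helper names and the payoff values are assumptions, chosen only to match "reward depends on the number of working machines".

```python
# Hypothetical local rewards: r_j pays 1.0 when machine j is working.
def make_local_reward(j):
    return lambda x: 1.0 if x[j] else 0.0

local_rewards = [make_local_reward(j) for j in range(4)]

def reward(x):
    """R(x) = sum_j r_j(x); each r_j reads a single variable of x here."""
    return sum(r(x) for r in local_rewards)

print(reward((True, False, True, True)))  # 3.0
```

Each `r_j` touches only one variable, so the reward is stored in space linear in the number of machines rather than as a table over all joint states.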

Approximate Value Function

• Linear value function:

$$V(x) = \sum_{j=1}^{k} w_j h_j(x)$$

• Basis functions:

$h_i(X_i = \text{true}) = 1$

$h_i(X_i = \text{false}) = 0$

$h_0 = 1$
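A small sketch of evaluating this linear value function with the indicator basis functions from the slide. The weights are illustrative numbers, not values from the talk, and the constant basis $h_0$ is included as index 0 of the weight list.

```python
def h(i, x):
    """Basis functions from the slide: h_0 = 1, h_i(x) = 1 iff X_i is true."""
    if i == 0:
        return 1.0
    return 1.0 if x[i - 1] else 0.0

def value(w, x):
    """V(x) = sum_j w_j * h_j(x) for weights w = [w_0, w_1, ..., w_n]."""
    return sum(w_j * h(j, x) for j, w_j in enumerate(w))

w = [0.5, 1.0, 2.0, 3.0, 4.0]   # illustrative weights (assumption)
x = (True, False, True, False)  # M1 and M3 working
print(value(w, x))  # 0.5 + 1.0 + 3.0 = 4.5
```

Five weights stand in for a table of 16 values; the gap widens exponentially as machines are added.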

Markov Decision Processes

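For fixed policy $\pi$:

$$V_\pi(x) = R_{\pi(x)}(x) + \gamma \sum_{x'} P_{\pi(x)}(x' \mid x)\, V_\pi(x')$$

The optimal value function $V^*$:

$$V^*(x) = \max_a \Big[ R_a(x) + \gamma \sum_{x'} P_a(x' \mid x)\, V^*(x') \Big]$$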

Solving MDP
Method 1: Policy Iteration

• Value determination

$$V_t(x) = R_{\pi_t(x)}(x) + \gamma \sum_{x'} P_{\pi_t(x)}(x' \mid x)\, V_t(x')$$

• Policy Improvement

$$\pi_{t+1}(x) = \arg\max_a \Big[ R_a(x) + \gamma \sum_{x'} P_a(x' \mid x)\, V_t(x') \Big]$$

• Polynomial in the number of states N
• Exponential in the number of variables K
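The two steps can be sketched on a toy one-machine MDP. All transition probabilities, rewards, and the discount below are made-up numbers, and value determination is done by repeated backups rather than by solving the linear system exactly.

```python
GAMMA = 0.9
STATES = [0, 1]    # 0 = failed, 1 = working
ACTIONS = [0, 1]   # 0 = wait, 1 = reboot
# P[a][x][x2] = P_a(x2 | x); R[a][x] = R_a(x). Illustrative numbers only.
P = {0: [[1.0, 0.0], [0.1, 0.9]], 1: [[0.0, 1.0], [0.0, 1.0]]}
R = {0: [0.0, 1.0], 1: [-0.5, 0.5]}

def q_value(x, a, V):
    """R_a(x) + gamma * sum_{x'} P_a(x'|x) V(x')."""
    return R[a][x] + GAMMA * sum(P[a][x][x2] * V[x2] for x2 in STATES)

def evaluate(policy, sweeps=2000):
    """Value determination by repeated backups under a fixed policy."""
    V = [0.0 for _ in STATES]
    for _ in range(sweeps):
        V = [q_value(x, policy[x], V) for x in STATES]
    return V

def policy_iteration():
    policy = [0 for _ in STATES]
    while True:
        V = evaluate(policy)
        # Policy improvement: act greedily with respect to V.
        improved = [max(ACTIONS, key=lambda a: q_value(x, a, V)) for x in STATES]
        if improved == policy:
            return policy, V
        policy = improved

policy, V = policy_iteration()
print(policy)  # reboot when failed, wait when working
```

Every sweep touches every state, which is why the cost is polynomial in N but exponential in the number of state variables.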

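Solving MDP
Method 2: Linear Programming

$$\begin{aligned}
\text{Variables:}\quad & V_1, \ldots, V_N \\
\text{Minimize:}\quad & \sum_{x_i} \alpha(x_i)\, V_i, \qquad \alpha(x_i) > 0 \;\; \forall x_i \\
\text{Subject to:}\quad & V_i \ge R_a(x_i) + \gamma \sum_{j} P_a(x_j \mid x_i)\, V_j \qquad \forall x_i, a
\end{aligned}$$

Intuition: compare with the fixed point of V(x):

$$V^*(x) = \max_a \Big[ R_a(x) + \gamma \sum_{x'} P_a(x' \mid x)\, V^*(x') \Big]$$

• Polynomial in the number of states N
• Exponential in the number of variables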
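To make the intuition concrete without an LP solver, the sketch below computes $V^*$ by value iteration on a toy two-state MDP (all numbers are assumptions) and checks it against the LP constraints: every constraint holds, one constraint per state is tight, and lowering $V$ uniformly breaks feasibility.

```python
GAMMA = 0.9
STATES, ACTIONS = [0, 1], [0, 1]
# P[a][x][x2] = P_a(x2 | x); R[a][x] = R_a(x). Illustrative numbers only.
P = {0: [[1.0, 0.0], [0.1, 0.9]], 1: [[0.0, 1.0], [0.0, 1.0]]}
R = {0: [0.0, 1.0], 1: [-0.5, 0.5]}

def backup(x, a, V):
    return R[a][x] + GAMMA * sum(P[a][x][x2] * V[x2] for x2 in STATES)

def value_iteration(sweeps=2000):
    V = [0.0, 0.0]
    for _ in range(sweeps):
        V = [max(backup(x, a, V) for a in ACTIONS) for x in STATES]
    return V

def slacks(V):
    """Slack of each LP constraint V_i >= R_a(x_i) + gamma * sum_j P_a(x_j|x_i) V_j."""
    return {(x, a): V[x] - backup(x, a, V) for x in STATES for a in ACTIONS}

V_star = value_iteration()
s = slacks(V_star)
feasible = all(v >= -1e-9 for v in s.values())
tight = all(min(abs(s[(x, a)]) for a in ACTIONS) < 1e-9 for x in STATES)
lowered = [v - 0.1 for v in V_star]   # push V below the fixed point
print(feasible, tight, min(slacks(lowered).values()) < 0)
```

The LP has one variable per state and one constraint per state-action pair, so writing it down is already exponential in the number of state variables.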

Value Function Approximation

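$$\begin{aligned}
\text{Variables:}\quad & w_1, \ldots, w_k \\
\text{Minimize:}\quad & \sum_{x} \alpha(x) \sum_{i=1}^{k} w_i h_i(x), \qquad \alpha(x) > 0 \;\; \forall x \\
\text{Subject to:}\quad & \sum_{i} w_i h_i(x) \ge R_a(x) + \gamma \sum_{x'} P_a(x' \mid x) \sum_{i} w_i h_i(x') \qquad \forall x, a
\end{aligned}$$

Compare with the exact LP:

$$\begin{aligned}
\text{Variables:}\quad & V_1, \ldots, V_N \\
\text{Minimize:}\quad & \sum_{x_i} \alpha(x_i)\, V_i, \qquad \alpha(x_i) > 0 \;\; \forall x_i \\
\text{Subject to:}\quad & V_i \ge R_a(x_i) + \gamma \sum_{j} P_a(x_j \mid x_i)\, V_j \qquad \forall x_i, a
\end{aligned}$$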

Objective function

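$$\begin{aligned}
\text{Variables:}\quad & w_1, \ldots, w_k \\
\text{Minimize:}\quad & \sum_{x} \alpha(x) \sum_{i} w_i h_i(x), \qquad \alpha(x) > 0 \;\; \forall x \\
\text{Subject to:}\quad & \sum_{i} w_i h_i(x) \ge R_a(x) + \gamma \sum_{x'} P_a(x' \mid x) \sum_{i} w_i h_i(x') \qquad \forall x, a
\end{aligned}$$

• Objective function polynomial in the number of basis functions

$$\begin{aligned}
\sum_{x} \alpha(x) \sum_{i} w_i h_i(x) &= \sum_{i} w_i \sum_{x} \alpha(x)\, h_i(x) \\
&= \sum_{i} w_i \sum_{c_i \in C_i} \alpha(c_i)\, h_i(c_i), \qquad \text{where } \alpha(c_i) = \sum_{x \text{ consistent with } c_i} \alpha(x)
\end{aligned}$$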
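A sketch of the saving, assuming uniform state relevance weights $\alpha$ and single-variable indicator basis functions (both are assumptions for illustration). The brute-force coefficient sums over all $2^N$ states; the factored one only sums over the two values of the variable in the basis function's scope.

```python
from itertools import product

N = 4
STATES = list(product([False, True], repeat=N))
alpha = {x: 1.0 / len(STATES) for x in STATES}  # assumed uniform weights

def h(i, x):
    """Indicator basis over the single variable X_i (0-based index)."""
    return 1.0 if x[i] else 0.0

def coeff_brute_force(i):
    """sum_x alpha(x) h_i(x): touches all 2^N states."""
    return sum(alpha[x] * h(i, x) for x in STATES)

def coeff_factored(i, marginal_true=0.5):
    """sum_{c_i} alpha(c_i) h_i(c_i): touches only the 2 values of X_i.
    Under uniform alpha the marginal alpha(X_i = true) is 0.5."""
    marginal = {True: marginal_true, False: 1.0 - marginal_true}
    return sum(marginal[c] * (1.0 if c else 0.0) for c in (False, True))

for i in range(N):
    print(i, coeff_brute_force(i), coeff_factored(i))
```

With a non-uniform $\alpha$ the marginals would have to be supplied or computed from a factored form of $\alpha$; the uniform case just keeps the example short.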

Each Constraint: Backprojection

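$$\begin{aligned}
\text{Variables:}\quad & w_1, \ldots, w_k \\
\text{Minimize:}\quad & \sum_{x} \alpha(x) \sum_{i} w_i h_i(x), \qquad \alpha(x) > 0 \;\; \forall x \\
\text{Subject to:}\quad & \sum_{i} w_i h_i(x) \ge R_a(x) + \gamma \sum_{x'} P_a(x' \mid x) \sum_{i} w_i h_i(x') \qquad \forall x, a
\end{aligned}$$

$$\sum_{x'} P_a(x' \mid x) \sum_{i} w_i h_i(x') = \sum_{i} w_i \sum_{x'} P_a(x' \mid x)\, h_i(x')$$

$$E[h_i(x') \mid x] = E[h_i(c_i) \mid x] = E[h_i(c_i) \mid pa(c_i)]$$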
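A sketch of backprojection for an indicator basis function, under an assumed DBN: machines sit on a ring, and each machine's next state depends only on its own state, its left neighbour, and whether it is rebooted. The probabilities and function names are invented for illustration.

```python
from itertools import product

N = 4
STATES = list(product([False, True], repeat=N))

def p_working_next(i, x, a):
    """Hypothetical DBN factor P_a(X_i' = true | X_i, X_{i-1}) on a ring.
    Action a is the index of the rebooted machine, or None."""
    if a == i:
        return 1.0
    if not x[i]:
        return 0.05
    return 0.9 if x[i - 1] else 0.6

def transition_prob(x_next, x, a):
    """Full P_a(x' | x) as the product of the per-variable DBN factors."""
    p = 1.0
    for i in range(N):
        q = p_working_next(i, x, a)
        p *= q if x_next[i] else 1.0 - q
    return p

def backproject_brute_force(i, x, a):
    """sum_{x'} P_a(x'|x) h_i(x'), with h_i(x') = 1 iff X_i' is true."""
    return sum(transition_prob(xn, x, a) for xn in STATES if xn[i])

def backproject_factored(i, x, a):
    """Same quantity, computed from the parents of X_i' only."""
    return p_working_next(i, x, a)

x = (True, False, True, True)
print(backproject_brute_force(2, x, None), backproject_factored(2, x, None))
```

The factored version never enumerates next states, and its result depends only on the parents of the variables in the basis function's scope.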

Representing Exponentially Many Constraints

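$$\begin{aligned}
\text{Variables:}\quad & w_1, \ldots, w_k \\
\text{Minimize:}\quad & \sum_{x} \alpha(x) \sum_{i} w_i h_i(x), \qquad \alpha(x) > 0 \;\; \forall x \\
\text{Subject to:}\quad & \sum_{i} w_i h_i(x) \ge R_a(x) + \gamma \sum_{x'} P_a(x' \mid x) \sum_{i} w_i h_i(x') \qquad \forall x, a
\end{aligned}$$

$$\begin{aligned}
\sum_{i} w_i h_i(x) &\ge R_a(x) + \gamma \sum_{x'} P_a(x' \mid x) \sum_{i} w_i h_i(x') && \forall x, a \\
0 &\ge \sum_{i} w_i \Big[ \gamma \sum_{x'} P_a(x' \mid x)\, h_i(x') - h_i(x) \Big] + R_a(x) && \forall x, a \\
0 &\ge \max_{x} \Big\{ \sum_{i} w_i \Big[ \gamma \sum_{x'} P_a(x' \mid x)\, h_i(x') - h_i(x) \Big] + R_a(x) \Big\} && \forall a
\end{aligned}$$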

Restricted Domain

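$$\begin{aligned}
0 &\ge \max_{x} \Big\{ \sum_{i} w_i \Big[ \gamma \sum_{x'} P_a(x' \mid x)\, h_i(x') - h_i(x) \Big] + R_a(x) \Big\} \qquad \forall a \\
&= \max_{x} \Big\{ \sum_{i} w_i \big[ \gamma\, g_i^a(x) - h_i(x) \big] + R_a(x) \Big\} \\
&= \max_{x} \Big\{ \sum_{i} w_i f_i(x) + \sum_{j} r_j(x) \Big\}
\end{aligned}$$

Terms in the second line:

1. Backprojection $g_i^a(x)$ - depends on few variables
2. Basis function $h_i(x)$
3. Reward function $R_a(x)$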

Variable Elimination

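$$\begin{aligned}
& \max_{x} \Big[ \sum_{i} w_i f_i(x) + \sum_{j} r_j(x) \Big] \\
&= \max_{x_1, x_2, x_3, x_4} \big[ w_1 f_1(x_1, x_2) + w_2 f_2(x_1, x_3) + r_1(x_2, x_4) + r_2(x_3, x_4) \big] \\
&= \max_{x_1, x_2, x_3} \big[ w_1 f_1(x_1, x_2) + w_2 f_2(x_1, x_3) + \max_{x_4} [ r_1(x_2, x_4) + r_2(x_3, x_4) ] \big] \\
&= \max_{x_1, x_2, x_3} \big[ w_1 f_1(x_1, x_2) + w_2 f_2(x_1, x_3) + e_1(x_2, x_3) \big] \\
& \text{where } e_1(x_2, x_3) = \max_{x_4} [ r_1(x_2, x_4) + r_2(x_3, x_4) ]
\end{aligned}$$

- similar to Bayesian Networks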
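The elimination steps above can be sketched for fixed numeric weights. The local tables below are arbitrary stand-ins for $w_1 f_1$, $w_2 f_2$, $r_1$, $r_2$ (assumptions, not values from the talk), and the result is checked against brute-force enumeration.

```python
from itertools import product

def table(scope, fn):
    """Tabulate fn over all binary assignments to the variables in scope."""
    return (scope, {vals: fn(*vals) for vals in product([0, 1], repeat=len(scope))})

def eliminate(factors, var):
    """Max out var: replace every factor mentioning it by one new factor e."""
    touching = [f for f in factors if var in f[0]]
    rest = [f for f in factors if var not in f[0]]
    scope = tuple(sorted({v for s, _ in touching for v in s if v != var}))
    new_table = {}
    for values in product([0, 1], repeat=len(scope)):
        assign = dict(zip(scope, values))
        best = float("-inf")
        for val in (0, 1):
            assign[var] = val
            best = max(best, sum(t[tuple(assign[v] for v in s)] for s, t in touching))
        new_table[values] = best
    return rest + [(scope, new_table)]

def max_by_elimination(factors, order):
    for var in order:
        factors = eliminate(factors, var)
    return sum(t[()] for _, t in factors)  # only empty-scope factors remain

def max_by_enumeration(factors, variables):
    best = float("-inf")
    for values in product([0, 1], repeat=len(variables)):
        assign = dict(zip(variables, values))
        best = max(best, sum(t[tuple(assign[v] for v in s)] for s, t in factors))
    return best

factors = [
    table((1, 2), lambda a, b: 2.0 * a - 1.0 * b),   # stands in for w1 * f1(x1, x2)
    table((1, 3), lambda a, c: 1.5 * (a != c)),      # stands in for w2 * f2(x1, x3)
    table((2, 4), lambda b, d: 1.0 * (b and d)),     # stands in for r1(x2, x4)
    table((3, 4), lambda c, d: 0.5 * c + 0.25 * d),  # stands in for r2(x3, x4)
]
print(max_by_elimination(factors, [4, 3, 2, 1]))
print(max_by_enumeration(factors, [1, 2, 3, 4]))
```

Eliminating $x_4$ first creates exactly the intermediate factor $e_1(x_2, x_3)$ from the derivation; the cost of each step depends on the size of the intermediate scope, not on the total number of states.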

Maximization as Linear Constraints

$$e_1(x_2, x_3) = \max_{x_4} [ r_1(x_2, x_4) + r_2(x_3, x_4) ]$$

Equivalent to constraints:

$$e_1(x_2, x_3) \ge r_1(x_2, x_4) + r_2(x_3, x_4) \qquad \text{one constraint for every assignment to } x_2, x_3, x_4$$

• Exponential in the size of each function’s domain, not the number of states
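A sketch of the equivalence for fixed numeric local functions (the definitions of `r1` and `r2` are assumptions). It generates one constraint per assignment to $x_2, x_3, x_4$ and shows that the smallest $e_1$ satisfying all of them is the maximum over $x_4$.

```python
from itertools import product

def r1(x2, x4):
    return 1.0 if (x2 and x4) else 0.0   # hypothetical local function

def r2(x3, x4):
    return 0.5 * x3 + 0.25 * x4          # hypothetical local function

def constraints():
    """One constraint e[(x2, x3)] >= r1(x2, x4) + r2(x3, x4) per assignment."""
    return [((x2, x3), r1(x2, x4) + r2(x3, x4))
            for x2, x3, x4 in product([0, 1], repeat=3)]

def tightest_e(cons):
    """Smallest e satisfying all constraints: the largest right-hand side per entry."""
    e = {}
    for key, rhs in cons:
        e[key] = max(e.get(key, float("-inf")), rhs)
    return e

cons = constraints()
e1 = tightest_e(cons)
print(len(cons))  # 8 constraints: 2^3 assignments to (x2, x3, x4)
print(e1)
```

In the factored LP the right-hand sides contain the weights $w_i$, so the entries of $e_1$ become LP variables, but the number of constraints is still governed by the local scope size.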

Factored LP: Scaling

Rule-based Representation

Approximate Value Function

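$$V(x) = \sum_{j=1}^{k} w_j h_j(x) = \sum_{j=1}^{k} w_j h_j(x_1, x_2, x_3, x_4) = \sum_{j=1}^{k} w_j \sum_{Rule_i \in h_j} Rule_i(x_1, x_2, x_3, x_4)$$

h1: [Figure: decision tree that tests x1 at the root; one branch is the leaf 0, the other tests x3 and ends in the leaves 5 and 0.6.]

Rule_1: (assignment to x1) : 0
Rule_2: (assignment to x1, x3) : 5
Rule_3: (assignment to x1, x3) : 0.6

Notice: compact representation (2/4 variables, 3/16 rules)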
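A sketch of evaluating a rule-based basis function. The branch polarities below (which truth value of x1 and x3 leads to which leaf) are an assumption made only so the example runs; the values 0, 5 and 0.6 come from the slide.

```python
from itertools import product

# Each rule is (context, value). Polarities are assumed for illustration.
RULES_H1 = [
    ({"x1": False}, 0.0),
    ({"x1": True, "x3": True}, 5.0),
    ({"x1": True, "x3": False}, 0.6),
]

def consistent(context, x):
    """A rule applies when x agrees with every assignment in its context."""
    return all(x[var] == val for var, val in context.items())

def evaluate_rules(rules, x):
    """Sum the values of all rules whose context matches x."""
    return sum(value for context, value in rules if consistent(context, x))

x = {"x1": True, "x2": False, "x3": True, "x4": False}
print(evaluate_rules(RULES_H1, x))  # 5.0

# Three rules cover all 16 joint assignments, one rule per assignment.
all_states = [dict(zip(("x1", "x2", "x3", "x4"), vals))
              for vals in product([False, True], repeat=4)]
print(all(sum(consistent(c, s) for c, _ in RULES_H1) == 1 for s in all_states))
```

The rule list never mentions x2 or x4, which is the "2/4 variables, 3/16 rules" saving noted on the slide.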

Summing Over Rules

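$$V(x) = \sum_{j=1}^{k} w_j \sum_{Rule_i \in h_j} Rule_i(x_1, x_2, x_3, x_4)$$

[Figure: h1(x), a rule tree over x1 and x3 with leaves u1, u2, u3, plus h2(x), a rule tree over x2 and x1 with leaves u4, u5, u6, equals a single tree over x1, x2 and x3 whose leaves are sums of the original leaves (u1+u4, u5+u1, u2+u6, u3+u6, u2+u4, u3+u4).]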

Multiplying over Rules

• Analogous construction

$$0 \ge \max_{x} \Big\{ \sum_{i} w_i \Big[ \gamma \sum_{x'} P_a(x' \mid x)\, h_i(x') - h_i(x) \Big] + R_a(x) \Big\} \qquad \forall a$$

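Rule-based Maximization

$$0 \ge \max_{x} \Big\{ \sum_{i} w_i \Big[ \gamma \sum_{x'} P_a(x' \mid x)\, h_i(x') - h_i(x) \Big] + R_a(x) \Big\} \qquad \forall a$$

[Figure: rule tree that tests x1, then x2, then x3, with leaves u1, u2, u3, u4.]

Eliminate x2

[Figure: resulting rule tree that tests x1, then x3, with leaves u1, max(u2,u3), max(u2,u4).]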
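A sketch that checks the elimination result numerically. The branch polarities (which truth value leads to which leaf) and the leaf values are assumptions; only the tree shape and the leaves max(u2,u3), max(u2,u4) come from the slide.

```python
from itertools import product

def original_tree(x1, x2, x3, u):
    """Assumed layout: x1 true -> u1; else x2 true -> u2; else x3 picks u3 or u4."""
    u1, u2, u3, u4 = u
    if x1:
        return u1
    if x2:
        return u2
    return u3 if x3 else u4

def max_out_x2(x1, x3, u):
    """Eliminate x2 by brute force: maximise over both of its values."""
    return max(original_tree(x1, x2, x3, u) for x2 in (False, True))

def tree_after_elimination(x1, x3, u):
    """The tree from the slide: leaves u1, max(u2, u3), max(u2, u4)."""
    u1, u2, u3, u4 = u
    if x1:
        return u1
    return max(u2, u3) if x3 else max(u2, u4)

u = (1.0, 2.0, 3.0, 0.5)  # illustrative leaf values
for x1, x3 in product([False, True], repeat=2):
    print(x1, x3, max_out_x2(x1, x3, u), tree_after_elimination(x1, x3, u))
```

In the LP the leaves are linear expressions in the weights rather than numbers, so each `max` leaf would be replaced by a new variable with one linear constraint per argument.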

Rule-based Linear Program

• Backprojection, objective function – handled in a similar way

• All the operations (summation, multiplication, maximization) – keep rule representation intact

• $\sum_{Rule_i \in h_j} w_j\, Rule_i(x_1, x_2, x_3, x_4)$ is a linear function of the weights $w_j$

Conclusions

• Compact representation can be exploited to solve MDPs with exponentially many states efficiently.

• Still NP-complete in the worst case.

• Factored solution may increase the size of LP when the number of states is small (but it scales better).

• Success depends on the choice of the basis functions for value approximation and the factored decomposition of rewards and transition probabilities.

