Subramanian Ramamoorthy School of Informatics 22 March, 2013


Decision Making in Robots and Autonomous Agents

Preferences and Elicitation (Based on material from C. Boutilier et al.)

Subramanian Ramamoorthy, School of Informatics

22 March, 2013

Preference Elicitation in AI

Shopping for a Car: Luggage capacity? Two door? Cost? Engine size? Color? Options?

[Figure: decision-support diagram. Labels: Intelligence; Readiness; Objective conditions; Political situation; Budget limitations; Campaign-level goals; Experience; Beliefs; Strategic and tactical intentions and preferences; Outcome evaluation for alternative decisions; Proposals of preferable courses of action; Better decision; Beliefs and preference refinement through incremental questioning; Monitor and filter information; Adapt presentation.]

[Source: R. Brafman]

The Preference Bottleneck

• Preference elicitation: the process of determining a user’s preferences/utilities to the extent necessary to make a decision on her behalf

• Why a bottleneck?
– preferences vary widely
– large (multiattribute) outcome spaces
– quantitative utilities (the “numbers”) difficult to assess

[Figure: table of clothing outcomes (attributes include Trousers and Jacket; colours black, white and red), with one outcome marked “best”.]

Automated Preference Elicitation

• The interesting questions:
– decomposition of preferences
– what preference info is relevant to the task at hand?
– when is the elicitation effort worth the improvement it offers in terms of decision quality?
– what decision criterion to use given partial utility info?

Overview

• Preference Elicitation in A.I.

– Constraint-based Optimization

– Factored Utility Models

– Types of Uncertainty

– Types of Queries

• Single User Elicitation


Constraint-based Decision Problems

• Constraint-based optimization (CBO):

– outcomes over variables X = {X1 … Xn}

– constraints C over X spell out feasible decisions

– generally compact structure, e.g., $X_1 \wedge X_2 \Rightarrow \neg X_3$

– add a utility function $u: Dom(X) \to \mathbb{R}$

– preferences over configurations

Constraint-based Decision Problems

• Must express u compactly, like C
– generalized additive independence (GAI)
    • model proposed by Fishburn (1967)
    • nice generalization of additive linear models
– given by graphical model capturing independence

Factored Utilities: GAI Models

• Set of K factors fk over subset of vars X[k]
– “local” utility for each local configuration

$u(\mathbf{x}) = \sum_{k \le K} f_k(\mathbf{x}[k])$

• [Fishburn67] u in this form exists iff
– lotteries p and q are equally preferred whenever p and q have the same marginals over each X[k]

[Figure: graphical model over A, B, C with factor tables f1(A): a: 3, ā: 1; f2(B): b: 3, b̄: 1; f3(BC): bc: 12, bc̄: 2, …]

u(abc) = f1(a) + f2(b) + f3(bc)
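The additive decomposition above can be sketched in a few lines of Python. This is only an illustration: it assumes the second entry of each factor table on the slide is the negated value (written `~a`, `~b`, `~c` here), and it includes only the f3 entries that the slide lists before its "…". The names `FACTORS` and `gai_utility` are hypothetical.

```python
# Factor tables from the slide's example. "~a" stands for the negated value of A
# (an assumption about the slide's notation). Entries of f3 elided on the slide
# ("…") are deliberately left out.
FACTORS = [
    (("A",), {("a",): 3, ("~a",): 1}),
    (("B",), {("b",): 3, ("~b",): 1}),
    (("B", "C"), {("b", "c"): 12, ("b", "~c"): 2}),
]


def gai_utility(outcome, factors=FACTORS):
    """Sum the local utility f_k(x[k]) of each factor's projection of the outcome."""
    total = 0
    for scope, table in factors:
        local = tuple(outcome[var] for var in scope)  # x[k]: restriction to the factor's variables
        total += table[local]
    return total


print(gai_utility({"A": "a", "B": "b", "C": "c"}))  # f1(a) + f2(b) + f3(bc) = 3 + 3 + 12 = 18
```

Each factor only ever sees its own variables, which is what keeps the representation compact when the full outcome space is large.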


Optimization with GAI Models

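• Optimize using simple Integer Program
– number of variables linear in size of GAI model

$u(\mathbf{x}) = \sum_k f_k(\mathbf{x}[k])$

$\max_{\{I_{x[k]},\, X_i\}} \sum_k \sum_{x[k] \in Dom(X[k])} u_{x[k]}\, I_{x[k]}$ subj. to A, C

[Figure: graphical model over A, B, C with factor tables f1(A): a: 3, ā: 1; f2(B): b: 3, b̄: 1; f3(BC): bc: 12, bc̄: 2, …]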
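To make the optimization problem concrete, here is a minimal sketch that replaces the integer program with plain enumeration of feasible configurations. That is only viable for tiny models and is not the IP formulation from the slide. The factor values, the variable names and the helper names are all hypothetical; the constraint has the "X1 and X2 implies not X3" form used as an example earlier in the deck.

```python
from itertools import product

# Hypothetical GAI model over three Boolean variables (values are illustrative only).
FACTORS = [
    (("X1",), {(True,): 3, (False,): 1}),
    (("X2",), {(True,): 3, (False,): 1}),
    (("X2", "X3"), {(True, True): 12, (True, False): 2,
                    (False, True): 4, (False, False): 0}),
]
VARIABLES = ["X1", "X2", "X3"]


def feasible(x):
    # Feasibility constraint: X1 and X2 together rule out X3.
    return not (x["X1"] and x["X2"] and x["X3"])


def utility(x):
    # GAI utility: sum of local factor values.
    return sum(table[tuple(x[v] for v in scope)] for scope, table in FACTORS)


def best_feasible():
    # Enumerate every assignment, keep the feasible ones, return the utility maximiser.
    candidates = (dict(zip(VARIABLES, vals))
                  for vals in product([True, False], repeat=len(VARIABLES)))
    return max((x for x in candidates if feasible(x)), key=utility)


best = best_feasible()
print(best, utility(best))
```

The unconstrained optimum (all three variables true) violates the constraint, so the search settles on the best configuration that remains feasible.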


Difficulties in CBO

• Utility elicitation: how do we assess individual user preferences?
– need to elicit GAI model structure (independence)
– need to elicit (constraints on) GAI parameters
– need to make decisions with imprecise parameters

Strict Utility Function Uncertainty

• User’s actual utility u unknown
• Assume feasible set $F \subseteq U = [0,1]^n$
– allows for unquantified or “strict” uncertainty
– e.g., F a set of linear constraints on GAI terms

• How should one make a decision? Elicit information?

u(red,2door,280hp) > 0.4
u(red,2door,280hp) > u(blue,2door,280hp)

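Strict Uncertainty Representation

[Figure: utility function with interval bounds on each factor parameter. f1(L): two values of L with bounds [7,11] and [2,5]; f2(L,N): two settings of (L,N) with bounds [2,4] and [1,2].]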

Bayesian Utility Function Uncertainty

• User’s actual utility u unknown
• Assume density P over $U = [0,1]^n$

• Given belief state P, EU of decision x is: $EU(x, P) = \int_U (p_x \cdot u)\, P(u)\, du$

• Decision making is easy, but elicitation harder?
– must assess expected value of information in query
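A small sketch of why decision making is "easy" under a Bayesian belief: with a finite belief state the integral becomes a weighted sum. The belief, the decisions and the function names below are hypothetical; `p_x` is taken to be the distribution over outcomes induced by decision x, and `u` a vector of outcome utilities.

```python
def expected_utility(p_x, belief):
    """EU(x, P) = sum_j P(u_j) * (p_x . u_j) for a finite belief [(prob, u_vector), ...]."""
    return sum(prob * sum(p * u for p, u in zip(p_x, u_vec))
               for prob, u_vec in belief)


def meu_decision(decisions, belief):
    """Return the name of the decision with maximum expected utility under the belief."""
    return max(decisions, key=lambda name: expected_utility(decisions[name], belief))


# Hypothetical belief: two candidate utility vectors over three outcomes.
belief = [(0.7, [0.0, 0.5, 1.0]), (0.3, [1.0, 0.5, 0.0])]
# Hypothetical decisions, each given as a distribution over the three outcomes.
decisions = {"x1": [1.0, 0.0, 0.0], "x2": [0.0, 1.0, 0.0], "x3": [0.2, 0.0, 0.8]}

print(meu_decision(decisions, belief))
```

Choosing a decision is a single maximisation; the harder part, as the slide notes, is deciding which query to ask next.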


Bayesian Representation

[Figure: utility function with a probability density over each parameter of the factors f1(L) and f2(L,N).]

Query Types

• Comparison queries (is x preferred to x’ ?)
– impose linear constraints on parameters
    • $\sum_k f_k(x[k]) > \sum_k f_k(x'[k])$
– Interpretation is straightforward

Query Types

• Bound queries (is fk(x[k]) > v ?)
– response tightens bound on specific utility parameter
– can be phrased as a local standard gamble query
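How a bound-query response tightens an interval can be sketched as follows. The parameter name and the helper `update_bound` are hypothetical; the starting interval [7, 11] is borrowed from the strict-uncertainty figure purely as an illustration, and answers are assumed to be noise-free.

```python
def update_bound(bounds, param, v, answer_yes):
    """Tighten [lo, hi] for one parameter after asking "is f_k(x[k]) > v?".

    A "yes" raises the lower bound to v; a "no" lowers the upper bound to v.
    Returns a new dict and leaves the input untouched.
    """
    lo, hi = bounds[param]
    new_interval = (max(lo, v), hi) if answer_yes else (lo, min(hi, v))
    return {**bounds, param: new_interval}


bounds = {"f1(l)": (7, 11)}
print(update_bound(bounds, "f1(l)", 9, True))   # {'f1(l)': (9, 11)}
print(update_bound(bounds, "f1(l)", 9, False))  # {'f1(l)': (7, 9)}
```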


Overview

– Foundations of Local queries

– Bayesian Elicitation

– Regret-based Elicitation


Difficulties with Bound Queries

• Bound queries focus on local factors
– but values cannot be fixed without reference to others!
– seemingly “different” local prefs correspond to same u

u(Color,Doors,Power) = u1(Color,Doors) + u2(Doors,Power)

u(red,2door,280hp) = u1(red,2door) + u2(2door,280hp)

u(red,4door,280hp) = u1(red,4door) + u2(4door,280hp)

[Slide annotation: the values 10, 6 and 4 label the terms of a decomposition, i.e. 10 = 6 + 4.]
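The point that different-looking local preferences can encode the same u can be checked numerically. In the sketch below, an arbitrary amount that depends only on the shared variable (Doors) is moved from u2 into u1, and every global utility is unchanged. All numbers are hypothetical, chosen so that one outcome decomposes as 10 = 6 + 4; `shift` and `total` are illustrative helper names.

```python
# Hypothetical local factor values for u(Color, Doors, Power) = u1(Color, Doors) + u2(Doors, Power).
u1 = {("red", "2door"): 6, ("red", "4door"): 3}
u2 = {("2door", "280hp"): 4, ("4door", "280hp"): 5}


def shift(u1, u2, delta):
    """Move delta[doors] from u2 into u1; the sum u1 + u2 is unchanged for every outcome."""
    v1 = {(c, d): val + delta[d] for (c, d), val in u1.items()}
    v2 = {(d, p): val - delta[d] for (d, p), val in u2.items()}
    return v1, v2


def total(f1, f2, color, doors, power):
    return f1[(color, doors)] + f2[(doors, power)]


v1, v2 = shift(u1, u2, {"2door": 3, "4door": -2})
print(v1[("red", "2door")], v2[("2door", "280hp")])  # 9 and 1: different local values
print(total(u1, u2, "red", "2door", "280hp"),
      total(v1, v2, "red", "2door", "280hp"))        # 10 and 10: same global utility
```

This is why a bound query on a single factor value has no meaning until the values of the other factors are pinned down in some way.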


Local Queries

• We wish to avoid queries on whole outcomes
– can’t ask about purely local outcomes
– but can condition on a subset of default values

• Conditioning set C(f) for factor fi(Xi):
– variables that share factors with Xi
– setting default outcomes on C(f) renders Xi independent of remaining variables
– enables local calibration of factor values

Local Standard Gamble Queries

• Local std. gamble queries
– use “best” and “worst” (anchor) local outcomes, conditioned on default values of conditioning set
– bound queries on other parameters relative to these
– gives local value function v(x[i]) (e.g., v(ABC))

• Hence we can legitimately ask local queries

• But local value functions are not enough:
– must calibrate: requires global scaling

Global Scaling

• Elicit utilities of anchor outcomes wrt global best and worst outcomes
– the 2*m “best” and “worst” outcomes for each factor
– these require global std gamble queries (note: same is true for pure additive models)

Bound Query Strategies

• Identify conditioning sets Ci for each factor fi

• Decide on “default” outcome
• For each fi identify top and bottom anchors
– e.g., the most and least preferred values of factor i
– given default values of Ci

• Queries available:
– local std gambles: use anchors for each factor, C-sets
– global std gambles: gives bounds on anchor utilities

Overview


Partial preference information: Bayesian uncertainty

• Probability distribution p over utility functions
• Maximize expected (expected) utility
– MEU decision $x^* = \arg\max_x E_p[u(x)]$
• Consider:
– elicitation costs
– values of possible decisions
– optimal tradeoffs between elicitation effort and improvement in decision quality

Query Selection

• At each step of elicitation process, we can
– obtain more preference information
– make or recommend a terminal decision

Bayesian Approach: Myopic EVOI

[Figure: one-step lookahead tree rooted at MEU(p); responses r1,1, r1,2, r2,1, r2,2 lead to the posterior values MEU(p1,1), MEU(p1,2), MEU(p2,1), MEU(p2,2).]

Expected value of information

• $MEU(p) = E_p[u(x^*)]$

• Expected posterior utility: $EPU(q,p) = E_{r|q,p}[MEU(p_r)]$
• Expected value of information of query q: $EVOI(q) = EPU(q,p) - MEU(p)$

[Figure: lookahead tree rooted at MEU(p); queries q1, q2, ... branch into responses r1,1, r1,2, r2,1, r2,2, with leaves MEU(p1,1), MEU(p1,2), MEU(p2,1), MEU(p2,2), ...]
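The EVOI definitions can be traced on a toy problem. The sketch below assumes a finite belief over utility vectors (one utility per decision), a bound query of the form "is u(x_i) > v?", and noise-free answers. None of these modelling choices come from the slides, and the names `meu` and `evoi` are illustrative.

```python
def meu(belief, n_decisions):
    """MEU(p) = max_x E_p[u(x)] for belief = [(prob, utility_vector), ...]."""
    return max(sum(p * u[x] for p, u in belief) for x in range(n_decisions))


def evoi(belief, n_decisions, i, v):
    """Myopic EVOI of the query "is u(x_i) > v?", assuming noise-free answers."""
    epu = 0.0
    for answer in (True, False):
        # Utility functions consistent with this answer, and their total probability.
        part = [(p, u) for p, u in belief if (u[i] > v) == answer]
        mass = sum(p for p, _ in part)
        if mass > 0:
            posterior = [(p / mass, u) for p, u in part]  # Bayes update p_r
            epu += mass * meu(posterior, n_decisions)     # P(r) * MEU(p_r)
    return epu - meu(belief, n_decisions)


# Hypothetical belief over two utility functions for two decisions.
belief = [(0.5, [1.0, 0.0]), (0.5, [0.0, 0.8])]
print(evoi(belief, 2, 0, 0.5))   # informative query: separates the two hypotheses
print(evoi(belief, 2, 0, -1.0))  # uninformative query: every hypothesis answers "yes"
```

A query that cannot change the posterior has zero value, which is why myopic selection compares EVOI against the cost of asking.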


Bayesian Approach: Myopic EVOI

• Ask query with highest EVOI minus cost
• [Chajewska et al ’00]
– Global standard gamble queries (SGQ) “Is u(oi) > l?”
– Multivariate Gaussian distributions over utilities
• [Braziunas and Boutilier ’05]
– Local SGQ over utility factors
– Mixture-of-uniforms distributions over utilities

Local elicitation in GAI models [Braziunas and Boutilier ’05]

• Local elicitation procedure
– Bayesian uncertainty over local factors
– Myopic EVOI query selection

• Local comparison query: “Is local value of factor setting xi greater than l?”
– Binary comparison query
– Requires yes/no response
– query point l can be optimized analytically

Experiments

• Car rental domain: 378 parameters [Boutilier et al. ’03]
– 26 variables, 2-9 values each, 13 factors

• 2 strategies
– Semi-random query
    • Query factor and local configuration chosen at random
    • Query point set to the mean of local value function
– EVOI query
    • Search through factors and local configurations
    • Query point optimized analytically

Experiments

[Figure: percentage utility error (w.r.t. true max utility) against number of queries (0 to 100); series: Mixture of Uniforms, Uniform, Gaussian.]

Overview

• Why Minimax Regret (MMR)?

• Decision making with MMR

• Elicitation with MMR


Minimax Regret: Utility Uncertainty

• Regret of x w.r.t. u: $R(x,u) = EU(x^*_u, u) - EU(x, u)$

• Max regret of x w.r.t. F: $MR(x,F) = \max_{u \in F} R(x,u)$

• Decision with minimax regret w.r.t. F: $x^*_F = \arg\min_{x \in Feas(X)} MR(x,F)$; $MMR(F) = MR(x^*_F, F)$
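The three definitions can be exercised on a finite example. The sketch assumes F is a finite list of utility vectors with one entry per feasible decision, so the best decision under u is simply the largest entry. Real feasible sets are polytopes over GAI parameters, so this is only an illustration, and the function names are hypothetical.

```python
def regret(x, u):
    """R(x, u): utility of the best decision under u minus the utility of x under u."""
    return max(u) - u[x]


def max_regret(x, F):
    """MR(x, F): worst-case regret of x over all utility functions in F."""
    return max(regret(x, u) for u in F)


def minimax_regret(F, decisions):
    """Return (x*_F, MMR(F)): the decision minimising max regret, and that regret."""
    x_star = min(decisions, key=lambda x: max_regret(x, F))
    return x_star, max_regret(x_star, F)


# Hypothetical feasible set: two utility functions over three decisions.
F = [[10, 6, 0], [0, 6, 10]]
print(minimax_regret(F, [0, 1, 2]))  # (1, 4)
```

Decisions 0 and 2 are each optimal under one utility function but lose 10 under the other; decision 1 is never optimal yet never loses more than 4, so it is the minimax-regret choice.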


Why Minimax Regret?*

• Appealing decision criterion for strict uncertainty
– contrast maximin, etc.
– not often used for utility uncertainty

[Figure: decisions x and x’ compared across utility functions u1 to u6.]

Why Minimax Regret?

• Minimizes regret in presence of adversary
– provides bound on worst-case loss
– robustness in the face of utility function uncertainty

• In contrast to Bayesian methods:
– useful when priors not readily available
– can be more tractable
– effective elicitation even if priors available

Overview


Computing Max Regret

• Max regret MR(x,F) computed as an IP
– number of variables linear in GAI model size
– number of (precomputed) constants (i.e., local regret terms) quadratic in GAI model size
– $r(x[k], x'[k]) = u^{\uparrow}(x'[k]) - u^{\downarrow}(x[k])$

$MR(x,F) = \max_{\{I_{x'[k]},\, X'_i\}} \sum_k \sum_{x'[k]} r(x[k], x'[k])\, I_{x'[k]}$ subj. to A, C
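A brute-force sketch of the max-regret computation under interval bounds follows. It enumerates adversarial configurations x' instead of solving the IP, so it only suits tiny models. It reads the local regret term as "upper bound for x'[k] minus lower bound for x[k]", and additionally assumes that a factor on which x and x' agree contributes zero, since both refer to the same unknown parameter. The bounds and all names are hypothetical.

```python
from itertools import product

# Hypothetical (lower, upper) bounds on each local utility parameter.
BOUNDS = [
    (("A",), {(0,): (1, 2), (1,): (3, 5)}),
    (("A", "B"), {(0, 0): (0, 1), (0, 1): (2, 6), (1, 0): (1, 2), (1, 1): (0, 3)}),
]
VARIABLES = ["A", "B"]


def local_regret(table, xk, xk_adv):
    """r(x[k], x'[k]): upper bound of x'[k] minus lower bound of x[k]; 0 if they coincide (assumption)."""
    if xk == xk_adv:
        return 0
    return table[xk_adv][1] - table[xk][0]


def pairwise_regret(x, x_adv):
    # Sum the local regret terms over all factors.
    return sum(local_regret(table,
                            tuple(x[v] for v in scope),
                            tuple(x_adv[v] for v in scope))
               for scope, table in BOUNDS)


def max_regret(x):
    """Return (MR(x, F), witness) by enumerating every adversarial configuration."""
    configs = [dict(zip(VARIABLES, vals))
               for vals in product([0, 1], repeat=len(VARIABLES))]
    witness = max(configs, key=lambda xa: pairwise_regret(x, xa))
    return pairwise_regret(x, witness), witness


print(max_regret({"A": 1, "B": 0}))  # (4, {'A': 0, 'B': 1})
```

Because the local regret terms depend only on the bounds, they can be precomputed once, which is what makes the IP objective linear in the indicator variables.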


Minimax Regret in Graphical Models

• We convert minimax to min (standard trick)
– obtain a MIP with one constraint per feasible config
– linearly many vars (in utility model size)

• Key question: can we avoid enumerating all x’ ?

Constraint Generation

• Very few constraints will be active in solution

• Iterative approach:
– solve relaxed IP (using a subset of constraints)
– solve for maximally violated constraint
– if any, add it and repeat; else terminate
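The iterative scheme can be sketched without a solver by treating both the decision set and the adversary's choices as small finite lists. The "relaxed IP" becomes a minimisation against the generated adversary choices only, and the "maximally violated constraint" is the adversary choice with the largest regret against the current solution. This is a stand-in for the MIP described above, not the formulation itself; the function and variable names are hypothetical.

```python
def constraint_generation(decisions, adversaries, regret, tol=1e-9):
    """Minimax regret by lazily adding adversary constraints.

    Master problem: minimise over x the max regret against the generated subset.
    Subproblem: find the adversary choice that most violates the master's bound.
    """
    generated = [adversaries[0]]
    while True:
        x = min(decisions, key=lambda d: max(regret(d, w) for w in generated))
        bound = max(regret(x, w) for w in generated)
        witness = max(adversaries, key=lambda w: regret(x, w))
        if regret(x, witness) <= bound + tol:
            return x, bound, generated  # no violated constraint: x is minimax optimal
        generated.append(witness)       # add the violated constraint and repeat


# Hypothetical adversary choices: utility vectors over three decisions.
F = [[10, 6, 0], [0, 6, 10], [5, 6, 5]]
x_star, mmr, used = constraint_generation(
    [0, 1, 2], F, lambda x, u: max(u) - u[x])
print(x_star, mmr, len(used))  # 1 4 2
```

In this run only two of the three adversary choices are ever generated, which mirrors the observation that few constraints are active at the solution.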


Constraint Generation Performance

• Key properties:
– aim: graphical structure permits practical solution
– convergence (usually very fast, few constraints)
– very nice anytime properties
– considerable scope for approximation
– produces solution x* as well as witness xw

Overview


Regret-based Elicitation [Boutilier, Patrascu, Poupart, Schuurmans IJCAI05; AIJ 06]

• Minimax optimal solution may not be satisfactory
• Improve quality by asking queries
– new bounds on utility model parameters
• Which queries to ask?
– what will reduce regret most quickly?
– myopically? sequentially?
• Closed form solution seems infeasible
– to date people have looked only at heuristic elicitation

Elicitation Strategies I

• Halve Largest Gap (HLG)
– ask if parameter with largest gap > midpoint
– MMR(U) ≤ maxgap(U), hence $n \log(maxgap(U)/\varepsilon)$ queries needed to reduce regret to $\varepsilon$
– bound is tight

[Figure: bounds on the parameters of factors f1(a,b) and f2(b,c).]
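The Halve Largest Gap strategy is simple enough to simulate directly. The sketch below assumes a simulated user whose true parameter values are known to the simulation and who answers without noise, and it stops once every gap is at most a tolerance `eps`. The parameter names, bounds and true values are hypothetical.

```python
def halve_largest_gap(bounds, true_values, eps):
    """Repeatedly ask "is parameter > midpoint?" about the widest interval.

    Stops when every gap is at most eps; returns the final bounds and the query count.
    """
    bounds = dict(bounds)
    queries = 0
    while True:
        param = max(bounds, key=lambda k: bounds[k][1] - bounds[k][0])
        lo, hi = bounds[param]
        if hi - lo <= eps:
            return bounds, queries
        mid = (lo + hi) / 2
        # Simulated noise-free response from the user's true utility.
        bounds[param] = (mid, hi) if true_values[param] > mid else (lo, mid)
        queries += 1


final, n_queries = halve_largest_gap(
    {"f1(a,b)": (0, 8), "f2(b,c)": (0, 4)},
    {"f1(a,b)": 5, "f2(b,c)": 1},
    eps=1,
)
print(final, n_queries)
```

Each query halves one interval, so no optimization is needed to pick it. The price is that HLG may spend queries on parameters that have no bearing on which decision is best.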

Elicitation Strategies II

• Current Solution (CS)
– only ask about parameters of optimal solution x* or regret-maximizing witness xw
– intuition: focus on parameters that contribute to regret
    • reducing u.b. on xw or increasing l.b. on x* helps
– use early stopping to get regret bounds (CS-5sec)

[Figure: bounds on the parameters of factors f1(a,b) and f2(b,c).]

Elicitation Strategies III

• Optimistic-pessimistic (OP)
– query largest-gap parameter in one of:
    • optimistic solution xo
    • pessimistic solution xp

• Computation:
– CS needs minimax optimization
– OP needs standard optimization
– HLG needs no optimization

• Termination:
– CS easy
– Others ?

Results (Small Random)

10 vars; < 5 vals

10 factors, at most 3 vars

Avg 45 trials


Results (Car Rental, Unif)

26 vars; 61 billion configs

36 factors, at most 5 vars; 150 parameters

Avg 45 trials


Results (Real Estate, Unif)

20 vars; 47 million configs

29 factors, at most 5 vars; 100 parameters

Avg 45 trials


Results (Large Rand, Unif)

25 vars; < 5 vals

20 factors, at most 3 vars

Avg 45 trials

Summary of Results

• CS works best on test problems
– time bounds (CS-5): little impact on query quality
– always know max regret (or bound) on solution
– time bound adjustable (use bounds, not time)

• OP competitive on most problems
– computationally faster (e.g., 0.1s vs 14s on RealEst)
– no regret computed so termination decisions harder

• HLG much less promising

Interpretation

• HLG:
– provable regret reduced very quickly

• But:
– true regret faster (often to optimality)
– OP and CS restricted to feasible decisions
– CS focuses on relevant parameters

Concluding Remarks

• Local parameter elicitation
– Theoretically sound
– Computationally practical
– Easier to answer

• Bayesian EVOI / Regret-based elicitation
– Good guides for elicitation
– Integrated in computationally tractable algorithms

• Questions for Future Work:
– Sequential reasoning
