Preference Handling—An Introductory Tutorial

Ronen I. Brafman and Carmel Domshlak

We present a tutorial introduction to the area of preference handling—one of the core issues in the design of any system that automates or supports decision making. The main goal of this tutorial is to provide a framework, or perspective, within which current work on preference handling—representation, reasoning, and elicitation—can be understood. Our intention is not to provide a technical description of the diverse methods used but rather to provide a general perspective on the problem and its varied solutions and to highlight central ideas and techniques.

Copyright © 2009, Association for the Advancement of Artificial Intelligence. All rights reserved. ISSN 0738-4602

Our preferences guide our choices. Hence an understanding of the various aspects of preference handling should be of great relevance to anyone attempting to build systems that act on behalf of users or simply support their decisions. This could be a shopping site that attempts to help us identify the most preferred item, an information search and retrieval engine that attempts to provide us with the most preferred pieces of information, or more sophisticated embedded agents such as robots, personal assistants, and so on. Each of these applications strives to take actions that lead to a more preferred state for us: better product, most appropriate articles, best behavior, and so on.

Early work in AI focused on the notion of a goal—an explicit target that must be achieved—and this paradigm is still dominant in AI problem solving. But as application domains become more complex and realistic, it is apparent that the dichotomic notion of a goal, while adequate for certain puzzles, is too crude in general. The problem is that in many contemporary application domains, for example, information retrieval from large databases or the web, or planning in complex domains, the user has little knowledge about the set of possible solutions or feasible items, and what she or he typically seeks is the best that’s out there. But since the user does not know what is the best achievable plan or the best available document or product, he or she typically cannot characterize it or its properties specifically. As a result, the user will end up either asking for an unachievable goal, getting no solution in response, or asking for too little, obtaining a solution that can be substantially improved. Of course, the user can gradually adjust the stated goals. This, however, is not a very appealing mode of interaction because the space of alternative solutions in such applications can be combinatorially huge, or even infinite. Moreover, such incremental goal refinement is simply infeasible when the goal must be supplied offline, as in the case of autonomous agents (whether on the web or on Mars). Hence, what we really want is for the system to understand our preferences over alternative choices (that is, plans, documents, products, and so on), and home in on the best achievable choices for us.

But what are preferences? Semantically, the answer is simple. Preferences over some domain of possible choices order these choices so that a more desirable choice precedes a less desirable one. We shall use the term outcome to refer to the elements in this set of choices.1 Naturally, the set of outcomes changes from one domain to another. Examples of sets of possible outcomes could be possible flights, vacation packages, cameras, or end results of some robotic application (such as pictures sent from Mars, time taken to complete the DARPA Grand Challenge, and so on). Orderings can have slightly different properties; they can be total (that is, making any pair of outcomes comparable) or partial, strict (that is, no two outcomes are equally preferred) or weak—the basic definitions follow—but that is about it. One can find various discussions in the literature as to when and whether total or weak orderings are appropriate (for an entry point, see, for example, Hansson (2001b)), but this debate is mostly inconsequential from our perspective, and we make no commitment to one or the other.

Definition 1. A preference relation ⪰ over a set Ω is a transitive binary relation over Ω. If for every o, o′ ∈ Ω either o ⪰ o′ or o′ ⪰ o, then ⪰ is a total order. Otherwise, it is a partial order (that is, some outcomes are not comparable). A preference relation is strong if it is antisymmetric (that is, if o ⪰ o′ then o′ ⋡ o); otherwise it is weak.

Given a weak ordering ⪰, we can define the induced strong ordering ≻ as follows: o ≻ o′ iff o ⪰ o′ but o′ ⋡ o. Finally, if o ⪰ o′ and o′ ⪰ o, then o ∼ o′, that is, o and o′ are equally preferred.
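To make these definitions concrete, here is a minimal sketch in Python (the three flight outcomes and the relation itself are invented for illustration) of how the strict preference and indifference relations are derived from an explicitly given weak ordering:

```python
# A weak preference relation over a small outcome set, given explicitly
# as a set of (o1, o2) pairs meaning "o1 is weakly preferred to o2".
outcomes = {"flight_a", "flight_b", "flight_c"}
weakly_prefers = {
    ("flight_a", "flight_b"), ("flight_b", "flight_a"),   # a and b tie
    ("flight_a", "flight_c"), ("flight_b", "flight_c"),   # both beat c
} | {(o, o) for o in outcomes}                             # reflexive pairs

def strictly_prefers(o1, o2):
    # o1 > o2 iff o1 >= o2 but not o2 >= o1
    return (o1, o2) in weakly_prefers and (o2, o1) not in weakly_prefers

def indifferent(o1, o2):
    # o1 ~ o2 iff o1 >= o2 and o2 >= o1
    return (o1, o2) in weakly_prefers and (o2, o1) in weakly_prefers

assert indifferent("flight_a", "flight_b")
assert strictly_prefers("flight_a", "flight_c")
```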

Unfortunately, while the semantics of preference relations is pretty clear, working with preferences can be quite difficult, and there are a number of reasons for that. The most obvious issue is the cognitive difficulty of specifying a preference relation. Cognitive effort is not an issue when all we care about is a single, naturally ordered attribute. For instance, when one shops for a particular iPhone model, all the shopper cares about might be the price of the phone. When one drives to work, all the driver might care about in the choice of route is the driving time. However, typically our preferences are more complicated, and even in these two examples we often care about the warranty, shipping time, scenery, smoothness of driving, and so on. Once multiple aspects of an outcome matter to us, ordering even two outcomes can be cognitively difficult because of the need to consider trade-offs and interdependencies between various attributes, as many readers surely know from their own experience. As the size of the outcome space increases, our cognitive burden increases, and additional computational and representational issues come in. How do we order a large set of outcomes? How can the user effectively and efficiently communicate such an ordering to the application at hand? How do we store and reason with this ordering efficiently?

Work in preference handling attempts to address these issues in order to supply tools that help us design systems acting well on our behalf, or help us act well. Preference elicitation methods aim at easing the cognitive burden of ordering a set of outcomes, or finding an optimal item. These methods include convenient languages for expressing preferences, questioning strategies based on simple questions, and more. Preference representation methods seek ways of efficiently representing preference information. The two are closely related because a compact representation requires less information to be specified, making its elicitation easier. Preference handling algorithms attempt to provide efficient means of answering common queries on preference models.

For the remainder of this article, we consider the overall process of preference handling, focusing on the basic questions involved and the key concepts. A number of basic concepts play a central role in our discussion: models, queries, languages, interpretation, representation, and model selection. Two other central concepts are structure and independence. These concepts will be introduced gradually through concrete examples. They are by no means new or specific to the area of preference handling, but using them properly is crucial for a good understanding and a useful discourse. The choice of examples is admittedly somewhat biased toward our own work, but we believe that these choices serve the purpose of this tutorial well.

When evaluating the ideas described in this tutorial, the reader should bear in mind that there are many types of decision contexts in which preference handling is required. Each such context brings different demands, and one solution may be great for one context but less so in another. We believe preference-handling settings can be, very roughly, divided into three classes. The first involves online applications with lay users, best exemplified by various online product-selection problems. Here users are reluctant to provide much information, time, and effort; there is no room for user “training”; information is typically qualitative; and no technical knowledge should be expected of the user. Preference-handling techniques in this area must make the most of natural-language-like statements, and be very wise in their selection of questions. The second involves system configuration and design. In these applications preferences drive the behavior of some system the user cares about. Here we can expect more effort from the user, and the input language may be a bit more technical. Preference-handling techniques for such problems can be more sophisticated but must still focus on enabling the user to naturally express his domain knowledge. Finally, there are complex real-world decision problems that require the assistance of a decision analyst. Here, we require tools such as probabilistic modeling and utility functions. Preference-handling techniques should provide the analyst tools that will make it easier for him or her to elicit these complex inputs and efficiently work with them.

Structurally, this tutorial is divided into two parts. The first and central part focuses on preference representation methods. It considers different complete preference models, languages that describe them, and some representational issues. In many applications, however, we either don’t need or won’t get a complete preference model from the user. Thus, the second part deals with the question of what to do when the system is given only partial information about the preference model, and how to obtain the most important parts of the model from the user. At the end of each subsection we provide both bibliographic notes on the material described in it and pointers to related topics.

Further Reading

Work on preference models, algorithms, and representation appears in many areas. Discussions of preference models are abundant in the philosophical literature (Hansson 2001a; Krantz et al. 1971; von Wright 1963; Hallden 1957), in economics and game theory (Arrow and Raynaud 1986; Debreu 1954; von Neumann and Morgenstern 1947; Fishburn 1969), in mathematics (Birkhoff 1948; Davey and Priestley 2002), in operations research and decision analysis (Fishburn 1974; Wald 1950; Bouyssou et al. 2006), in psychology (Tversky 1967, 1969; Kahneman and Tversky 1979, 1984), and in various disciplines of computer science: AI (Doyle and Thomason 1999; Doyle 2004), databases (Agrawal and Wimmers 2000; Kießling 2002; Chomicki 2002), HCI (Gajos and Weld 2005; Chen and Pu 2007), electronic commerce (Sandholm and Boutilier 2006), and more.

Of particular interest to AI researchers are the fields of decision analysis and multicriteria decision making. Practitioners in these areas face diverse problems, often of great complexity, which call for a wide range of tools for modeling preferences, classifying, and sorting options. Some of their applications call for more esoteric mathematical models of preference that differ from the transitive binary relations we focus on here. See, for example, Fishburn (1999) and Öztürk, Tsoukiàs, and Vincke (2005).

Modeling Preferences

Two key questions arise when one approaches a modeling task, namely (1) what is the model? and (2) what questions or queries do we want to ask about this model? To achieve clarity and proper focus, these questions should be answered early on.

A model is a mathematical structure that attempts to capture the properties of the physical, mental, or cognitive paradigm in question. The purpose of the model is to provide a concrete proxy to the, sometimes abstract, paradigm of interest. As such, the model must have intuitive appeal, because it lends meaning to the query answers we compute. Returning to the topic of our discussion, a natural model for preferences is pretty clear—it is an ordering over the set of possible outcomes, Ω. The answer to the second question varies with our application. Typical examples of queries of interest are finding the optimal outcome, finding an optimal outcome satisfying certain constraints, comparing two outcomes, ordering the set of outcomes, aggregating preferences of multiple users, finding a preferred allocation of goods, and so on. Once we have a model and a query, we need algorithms that compute an answer to the query given the model. In our tutorial, algorithms will take the back seat, but of course, as computer scientists, we realize that ultimately we must provide them.

The key point is that a preference model must be specified for each new user or each new application (that is, a new set of possible outcomes). Obtaining, or eliciting, the information specifying the model is thus one of the major issues in the process of preference handling. One simple, but ineffective, approach to preference elicitation is to ask the user to explicitly describe the model, that is, an ordering. Clearly, this is impractical unless Ω is very small, and thus we must find an alternative tool that implicitly specifies the model in a compact way.2 This tool is the language—a collection of possible statements that, hopefully, make it easy to describe common, or natural, models. To give meaning to these statements we need an interpretation function mapping sets of such statements to models.

Putting things together, we now have the five basic elements of our metamodel: the model, which describes the structure we really care about; the queries, which are questions about the model we need to answer; the algorithms to compute answers to these queries; the language, which lets us implicitly specify models; and an interpretation function that maps expressions in the language into models. These concepts are depicted and interconnected graphically in figure 1, in which the semantics of the directed arcs is “choice dependence.” For instance, as we discuss later, the choice of language depends heavily on our assumptions about the preference models of the users. Similarly, the algorithms developed for handling queries about the preferences are typically tailored to the specifics of the language in use.

Figure 1. The Metamodel. (Models: total or partial, strict or weak orders of outcomes. Language: statements such as “Outcome X is preferred to outcome Y,” “Outcome Z is good,” or “Value of outcome W is 52.” Queries: find the optimal outcome, find an optimal feasible outcome, order a set of outcomes, and so on. Algorithms compute answers to the queries.)

In what follows, we discuss a few concrete instances of this metamodel. This discussion relates the metamodel to some specific approaches to preference handling and, in doing so, relates different approaches to each other.

Value Functions, Structure, and Independence

Let us start with the simplest concretization of the metamodel, in which the model is a weak total order ⪰, and the language is simply the model itself. On the positive side, the model is nice because it is always possible to answer common questions such as “what is the optimal outcome?” or “which of two outcomes is better?” In addition, this combination of model and language makes the choice of interpretation entirely trivial, as it should simply be set to the identity function. On the negative side, however, the choice of language here is obviously poor. For instance, imagine having to order hundreds of digital cameras accessible through some online electronics store with their multitude of attributes.

So we are going to gradually try to improve the language. First, rather than specify an explicit ordering of the outcomes, we can ask the user to associate a real value with each outcome. In other words, the language now takes the form of a value function V : Ω → ℝ, the interpretation function is still simple, notably o ⪰ o′ ⇔ V(o) ≥ V(o′), and the overall concretization of the metamodel this way is depicted in figure 2. At first, this appears an ill-motivated choice. Indeed, having to associate a real value with each outcome seems hardly easier than ordering the outcomes. In fact, while sometimes one may find it more convenient to, for example, assign a monetary value to each outcome rather than order Ω explicitly, most people will probably consider “outcome quantification” to be a more difficult task. However, the main utility of this move is conceptual, as it gives us a clue as to how things might be really improved: while the value function V can always be specified explicitly in the form of a table, there may be more convenient, compact forms for V. If such a compact form exists, and this form is intuitive to the user, then the complexity pitfall of eliciting user preferences on an outcome-by-outcome basis can be eliminated.

Figure 2. Weak Total Order. (Metamodel instance: the model is a total weak order of outcomes; the language consists of value assignments such as V(o) = 0.5 and V(o′) = 1.7; the interpretation is o ≻ o′ ⇔ V(o) > V(o′).)
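As a minimal illustration of this interpretation function (Python; the camera names and values are invented), an explicit table for V immediately yields comparison and ordering of outcomes:

```python
# A value function given as an explicit table.
V = {"camera23": 91.0, "camera12": 92.0, "camera75": 100.0}

def weakly_prefers(o1, o2):
    # o1 >= o2 iff V(o1) >= V(o2)
    return V[o1] >= V[o2]

# Ordering a set of outcomes from most to least preferred.
ranked = sorted(V, key=V.get, reverse=True)
print(ranked)                                  # ['camera75', 'camera12', 'camera23']
print(weakly_prefers("camera12", "camera23"))  # True
```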

But why would V have a compact and intuitive form? Of course, if we treat outcomes as monolithic objects, for example, camera23, camera75, and so on, then such a compact form of V is not something we would expect. However, as the digital camera example suggests, outcomes typically have some inherent structure, and this structure is given by a set of attributes. For example, flights have departure and arrival times, airline, cost, number of stopovers, and so on. When we purchase a flight ticket, what we care about are the values of these attributes. In the remainder of this tutorial, unless stated otherwise, we assume that we are working with such outcomes. That is, each outcome has some set of attributes, and our choice among these outcomes is driven by the values of these attributes. Formally, we assume the existence and common knowledge of a set of attributes X = {X1, …, Xn} such that Ω = X = Dom(X1) × ⋯ × Dom(Xn), where Dom(Xi) is the domain of attribute Xi. The domains of the attributes are assumed to be finite, although handling infinite yet naturally ordered domains is typically very much similar.

Preferential Independence. Although the ability to describe an outcome by means of some preference-affecting attributes X is a necessary condition for a value function to be compactly representable, in itself it is not enough. An additional working assumption we need is that typical user preferences exhibit much regularity with respect to the given attributes X. The informal notion of “preference regularity” is formally captured by the notion of preferential independence.

Let Y, Z be a partition of X, that is, Y and Z are disjoint subsets of attributes whose union is X. We say that Y is preferentially independent of Z if for all y1, y2 ∈ Dom(Y) and for every z ∈ Dom(Z),

y1z ⪰ y2z implies that for all z′ ∈ Dom(Z): y1z′ ⪰ y2z′.

Intuitively, this means that preferences over Y-values do not depend on the value of Z, as long as that value is fixed. That is, if, when we fix Z to z, I weakly prefer y1z to y2z, then when we fix Z to some other value z′, I still weakly prefer y1z′ to y2z′. For example, if my preference for a car’s color is the same regardless of the brand, for example, I always prefer blue to red to white, then we can say that the color of the car is preferentially independent of its brand. (Here we assumed that color and brand are the only two preference-affecting attributes.) Similarly, for most people, price is preferentially independent of all other attributes. That is, given two outcomes that are similar except for their price, most people would prefer the cheaper one, no matter what fixed concrete value the other preference-affecting attributes take in these outcomes. Importantly, note that when Y is preferentially independent of Z we can meaningfully talk about the projection of ⪰ to Y. Even more importantly, preferential independence of Y from Z allows the user to express his or her preferences between many pairs of outcomes in Ω in terms of Y only. For example, if my preference for a car’s color is the same regardless of the brand, preference of blue to red implies preference of a blue Toyota to a red Toyota, a blue BMW to a red BMW, and so on.

Unlike probabilistic independence, preferential independence is a directional notion. If Y is preferentially independent of Z, it does not follow that Z is preferentially independent of Y. To see this, suppose that cars have two attributes: price (cheap or expensive) and brand (BMW or Toyota). I always prefer a cheaper car to an expensive one for any fixed brand; that is, I prefer a cheap Toyota to an expensive Toyota, and a cheap BMW to an expensive BMW. However, among cheap cars I prefer Toyotas, whereas among expensive cars I prefer BMWs. In this case, the ordering modeling my preferences is

(cheap Toyota) ≻ (cheap BMW) ≻ (expensive BMW) ≻ (expensive Toyota).

While here price is preferentially independent of brand, brand is not preferentially independent of price.
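The directionality in this example can be verified mechanically. The following sketch (Python; the encoding of outcomes as price–brand pairs is ours) checks preferential independence in both directions against the explicit four-outcome ordering above:

```python
from itertools import product

# Ranks encode (cheap Toyota) > (cheap BMW) > (expensive BMW) > (expensive Toyota);
# lower rank means more preferred.
rank = {("cheap", "Toyota"): 0, ("cheap", "BMW"): 1,
        ("expensive", "BMW"): 2, ("expensive", "Toyota"): 3}

def prefers(o1, o2):
    return rank[o1] <= rank[o2]

def independent(values_y, values_z, outcome):
    # Y is preferentially independent of Z iff the comparison of any two
    # Y-values is the same for every fixed Z-value.
    for y1, y2 in product(values_y, repeat=2):
        answers = {prefers(outcome(y1, z), outcome(y2, z)) for z in values_z}
        if len(answers) > 1:   # the answer flipped when Z changed
            return False
    return True

prices, brands = ["cheap", "expensive"], ["Toyota", "BMW"]
print(independent(prices, brands, lambda p, b: (p, b)))  # True:  price indep. of brand
print(independent(brands, prices, lambda b, p: (p, b)))  # False: brand not indep. of price
```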

A weaker notion of independence is that of conditional preferential independence. Let Y, Z, C be a partition of X. We say that Y is conditionally preferentially independent of Z given C = c if, when we project the ordering to outcomes in which C = c, we have that Y is preferentially independent of Z. More explicitly, for all y1, y2 ∈ Dom(Y) and for every z ∈ Dom(Z),

y1zc ⪰ y2zc implies that for all z′ ∈ Dom(Z): y1z′c ⪰ y2z′c.

We say that Y is preferentially independent of Z given C if for every c ∈ Dom(C) we have that Y is preferentially independent of Z given C = c. Thus, in the case of conditional preferential independence, our preference over Y values does not depend on Z values for any fixed value of C. Note that the preference ordering over Y values can be different given different values of C, but for any choice of C value, the ordering is independent of the value of Z. As an example, suppose that Y, Z, C are, respectively, brand (BMW or Toyota), mileage (low or high), and mechanical inspection report (good or bad). Given a good inspection report, I prefer BMW to Toyota, regardless of the mileage, perhaps because I generally like BMWs, but feel there is no point in a BMW whose mechanical condition is not good. Thus, I have (BMW, low, good) ≻ (Toyota, low, good) and (BMW, high, good) ≻ (Toyota, high, good). On the other hand, given a bad mechanical inspection report, I prefer Toyota to BMW, as they are cheaper to fix and more reliable. Thus, (Toyota, low, bad) ≻ (BMW, low, bad) and (Toyota, high, bad) ≻ (BMW, high, bad). We see that once the value of the mechanical inspection report is fixed, my preference for brand is independent of mileage, yet the actual preference may change depending on the value of the mechanical inspection report.

Compact Value Functions and Additivity. The strong intuition obtained from the examples in the previous section is that user preferences in many domains exhibit much preferential independence of one kind or another. The question now is how we can leverage such structure. In what follows we’ll see at least two answers to this question, and here we’d like to focus on the effect of preferential independence on the structure of the value function.

First, given an attributed representation of the outcome space Ω = Dom(X1) × ⋯ × Dom(Xn), we can focus on value functions V : Dom(X1) × ⋯ × Dom(Xn) → ℝ. As we noted earlier, this in itself buys us a language for specifying V, but not the compactness—the size of an explicit tabular representation of V would still be |Dom(X1)| × ⋯ × |Dom(Xn)|. However, depending on the actual preference ordering of the user, more compact forms of V are possible. Probably the best-known such form is that of a sum of functions over single attributes, that is,

V(X1, …, Xn) = V1(X1) + ⋯ + Vn(Xn).

Functions V that have this form are said to be additively independent. Intuitively, it seems that such a factored form corresponds to some form of independence. Indeed, suppose that we know that a user preference ordering is captured by V(Color, Brand) = V1(Color) + V2(Brand). Clearly, for any fixed value of color, the preference of the user over brands is identical, because the choice of color simply fixes V1. Thus, brand is preferentially independent of color, and similarly, color is preferentially independent of brand. In fact, if

V(X1, …, Xn) = V1(X1) + ⋯ + Vn(Xn),

then for any attribute subset Y of X, Y is preferentially independent of X \ Y. This strong form of independence within X is called mutual independence.

Value functions in a “factored” form offer one major advantage: they are much easier to specify, and thus much easier to elicit from users. To see the potentially huge benefit here, note that, if V is known to be additively independent over n Boolean attributes, its specification requires only 2n values, as opposed to 2ⁿ values for a tabular, unstructured representation. Some forms of reasoning, too, are much simpler. For instance, finding the optimal outcome with respect to an additively independent V can be done in time O(n) by independently maximizing each of the subfunctions of V. In contrast, the same task with respect to the same function in an unstructured, tabular form requires exhaustive search in time O(2ⁿ). But the additively independent form is very limiting, too. Basically, it implies that our preferences are unconditional; that is, the preference ordering over attribute values is identical no matter what values the other attributes have. Thus, if you prefer your coffee with milk, then you must prefer your tea with milk, too. Or, if you like a manual transmission in your sports car, you must like it in your minivan, too. Even worse, the strength of preference over attribute values is also fixed—informally, if the added value of milk in your coffee (as opposed to none) is greater than the added value of sugar (as opposed to none), then the same goes for tea.
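A small sketch of this complexity gap (Python; the attributes and subvalues are invented): with an additively independent V, the optimum is obtained by maximizing each subfunction separately, with no search over the joint outcome space:

```python
# Sub-value functions of an additively independent V, one table per attribute.
V_sub = {
    "brand": {"Toyota": 3.0, "BMW": 5.0},
    "color": {"red": 2.0, "blue": 1.0, "white": 0.5},
    "trans": {"automatic": 1.0, "manual": 0.0},
}

def V(outcome):
    # Value of a complete outcome = sum of per-attribute sub-values.
    return sum(V_sub[attr][val] for attr, val in outcome.items())

# O(n) optimization: pick the best value of each attribute independently.
optimum = {attr: max(table, key=table.get) for attr, table in V_sub.items()}
print(optimum)     # {'brand': 'BMW', 'color': 'red', 'trans': 'automatic'}
print(V(optimum))  # 8.0
```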

Generalized Additivity. While additively independent value functions seem a limiting choice of language for preference specification, it does seem that our preferences satisfy many independence properties. For example, the only attributes that affect whether I prefer an aisle seat to a window seat on the plane are the duration of the flight and the route. Or, the only attribute that affects how much milk I want in my hot drink is the type of drink, not the amount of sugar, the glass, and so on. To capture such (weaker than additive) forms of independence, we have to shift our language to more expressive functional forms of value functions.

A more general form of additive independence that allows us to be as general as we wish, but not more than that, is that of generalized additive independence (GAI). The generalized additive form of a value function V is

V(X1, …, Xn) = V1(Z1) + ⋯ + Vk(Zk),

where Z1, …, Zk ⊆ X is a cover of the preference-affecting attributes X. In other words, V is a sum of k additive factors Vi, where each factor depends on some subset of the attributes X, and the attribute subsets associated with different factors need not be disjoint. By allowing factors containing more than one attribute we can capture preferentially dependent attributes. By allowing factors to be nondisjoint we might substantially reduce factor sizes by enabling different factors to be influenced by the same variable, without having to combine their associated variables into a single, larger factor. For example, the value of a vacation may depend on the quality of the facilities and the location. How I assess each may depend on the season—in the summer I prefer a place near the beach, while in the winter I prefer a major city; in the summer I want large outdoor pools and terraces, while in the winter I prefer indoor pools, saunas, and large interiors. By using the language of GAI value functions, V(Location, Facilities, Season) can now be given in the form V1(Location, Season) + V2(Facilities, Season).
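A minimal sketch of this vacation example (Python; the factor tables are invented): the two factors overlap on Season without being merged into one three-attribute table:

```python
# GAI factors V1(Location, Season) and V2(Facilities, Season).
V1 = {("beach", "summer"): 5, ("beach", "winter"): 1,
      ("city", "summer"): 2, ("city", "winter"): 4}
V2 = {("outdoor pools", "summer"): 3, ("outdoor pools", "winter"): 0,
      ("indoor pools", "summer"): 1, ("indoor pools", "winter"): 3}

def V(location, facilities, season):
    # The overall value is the sum of the (overlapping) factors.
    return V1[(location, season)] + V2[(facilities, season)]

print(V("beach", "outdoor pools", "summer"))  # 8
print(V("city", "indoor pools", "winter"))    # 7
```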

Note that, in contrast to the language of additively independent functions, the language of GAI functions is fully general. If we choose k = 1 and Z1 = X, we can represent any value function. At the other extreme, if we choose k = n and Zi = {Xi}, we obtain the language of additively independent value functions. Considering the user effort required at the stage of preference elicitation, the useful trade-off between expressiveness and compactness seems to lie somewhere in between, with the factor sizes (that is, |Zi|) being relatively small and the number of factors, k, also being not too large. However, even with this sublanguage of GAI value functions, answering queries may become much harder than with the language of fully additive value functions. This issue brings us to the final component of our metamodel, namely the representation of the user-provided information.

The representation is a structure that captures all the information in the model and is intended to be used directly by our query-answering algorithms. A natural choice for the representation is the language itself—after all, the language is designed to facilitate efficient, concise characterization of the model. Indeed, in many cases the language is used as the representation, and the algorithms operate on it. Much work in knowledge representation and reasoning in AI seeks to find various classes of languages for which efficient inference or optimization algorithms exist. Naturally, these classes cannot describe every model concisely, but often they can describe some important, or natural, models concisely. Sometimes, however, it is useful to take the input expression and slightly transform or augment it to obtain a representation that provides more intuitive means for analyzing the complexity of query answering and developing efficient query-answering algorithms. Looking across various reasoning tasks in AI, such a representation typically comes in the form of an annotated graph describing dependencies between different components of the problem. Indeed, graphical representations have been found useful for various preference models as well. These representations are compiled from the language (or, in some cases, they augment the language as input devices), and inference and optimization algorithms make strong use of the properties of these graphical structures. In particular, the complexity of the algorithms can often be bounded by some graph-topological parameters of these graphs.

GAI value functions V provide us with the first example of this type. The language in this case is a set of tables describing the subfunctions of V. Using these tables, that is, using the language as the actual representation manipulated by the algorithms, it is easy to compute the value of any outcome and thus to compare any two outcomes with respect to the preferences of the user. However, computing the optimal outcome, that is, finding the value of X that maximizes V, is more difficult. To deal with such queries, it is useful to maintain a GAI-network (Gonzales and Perny 2004). A GAI-network is an annotated graph whose nodes correspond to the attributes in X. An edge connects the nodes corresponding to a pair of attributes Xi, Xj ∈ X if Xi and Xj occur jointly in some factor, that is, {Xi, Xj} ⊆ Zl for some l ∈ {1, …, k}. Assuming no redundant factors, each of the cliques in this graph corresponds to a factor in V, and it is annotated with a table describing the values of different instantiations of this factor. An example of the graphical core of a GAI-network appears in figure 3.

V(X1, …, X6) = g1(X1, X2, X3) + g2(X2, X4, X5) + g3(X5, X6)

Figure 3. Example of a GAI Value Function and the Induced GAI-Network. (The network has nodes X1, …, X6 and cliques {X1, X2, X3}, {X2, X4, X5}, and {X5, X6}, one per factor.)

On the left side of figure 3 we see a GAI value function, and on the right-hand side we see the corresponding GAI-network. In the constraint optimization literature, this structure is called a cost network (Dechter 2003), and the problem of finding an assignment with maximal cost for such a network is well studied. Importantly, the algorithms used to solve this problem utilize the graphical structure of the network. This problem is equivalent to that of computing an optimal assignment to X with respect to a GAI value function. Consequently, the same algorithms can be used here, and they benefit from a representation of the value function in the form of a cost, or GAI, network. Putting things together, in the case of GAI value functions, we see that it is better to store the input in an intermediate representation as a cost or GAI network. The updated metamodel with the components associated with GAI value functions is depicted in figure 4.

Figure 4. Metamodel for GAI Value Functions. (Model: a total weak order of outcomes; language: factor values; representation: cost networks; interpretation: o ≻ o′ ⇔ f(g1(o[Y1]), …) > f(g1(o′[Y1]), …).)

Further Reading

Value functions originate in the decision theory literature, for example, Wald (1950). The notion of preferential independence is discussed in depth by Keeney and Raiffa (1976), who also consider its relation to additive forms. Generalized additive independence was introduced by Fishburn (1969) and was later made popular in the AI community by Bacchus and Grove (1995). GAI value functions are essentially identical to weighted CSPs (Dechter 2003), that is, CSPs in which a value is associated with each possible assignment to a constrained set of variables. In fact, the optimization problem of finding the optimal assignment to a GAI value function/cost network/weighted CSP is identical to that of finding the most probable explanation (MPE) in graphical probabilistic models such as Bayes nets or join-trees (Pearl 1988; Lauritzen and Spiegelhalter 1988). Indeed, there is, in general, a close correspondence between probabilistic reasoning and preferences. Technically, a joint probability distribution over random variables x1, …, xn also describes an ordering over the set of possible assignments. While the joint probability form is multiplicative, its logarithm is additive and looks much like an additive value function. Value (and utility) functions can be normalized to look like probabilities, but there are no natural corresponding notions of marginal probability and conditional probability in the case of preferences (although see Shoham [1997] for a slightly different perspective).

Value functions map tuples to integers or reals. More generally, mappings to other structures can be defined, as has been done in the framework of soft constraints (Bistarelli, Montanari, and Rossi 1997; Bistarelli et al. 1999). This formalism can be viewed as generalizing the notion of GAI, as it allows both for new ranges for the function and for aggregation and comparison operators other than addition and maximization.

Value functions over the Cartesian product of a set of attribute domains are one way of combining preferences for different aspects of the problem. One can think of each factor in a value function as representing one criterion for selecting elements. Alternatively, one way to induce attributes on an object is to rate it according to multiple criteria. Multicriteria decision making is an important field of operations research and decision analysis that is concerned with the question of how to work with multiple, possibly conflicting criteria without necessarily combining them into a single value function (or, more generally, a total order). See Bouyssou et al. (2006) for a detailed description of techniques in this area.

Partial Orders and Qualitative Languages

The language provided by GAI value functions for describing total orders is clearly much more convenient than an explicit ordering of the set of outcomes. However, the effort required to specify them is still too large for many applications, and the quantitative information the users are required to provide is typically not very intuitive for them. Therefore, while system designers may be willing to spend the time and effort required to specify a value function, casual, lay users, such as shoppers at online stores, are highly unlikely to do so.

So what properties would we like a language to have so that lay users will feel comfortable expressing their preferences or answering preference queries?3 First, such a language must be based upon information that users find cognitively easy to reflect upon and express. Second, the information communicated in such a language should be reasonably easy to interpret. Third, the language should allow a quick (and therefore compact) specification of “natural” preference orderings. Finally, it should be possible to efficiently answer queries about the user preferences. Considering this list of expectations, it appears unavoidable that such a language be based on pieces of qualitative information about the model. Indeed, while quantitative information appears much harder to introspect upon, natural language statements, such as “I prefer SUVs to minivans,” or “I like vanilla,” appear quite often in our daily conversations (as opposed, say, to “vanilla is worth $5.2 to me”).

So what types of qualitative statements of preference can we expect users to provide? One simple class of statements corresponds to explicit comparisons between alternatives, such as “I like this car more than that one,” or “I prefer this flight to that flight.” Similar statements are example-critiquing statements of the form “I want a car like this, but in blue.” These statements are very easy to interpret, in part because they indicate an ordering relation between two alternative, completely specified outcomes. However, because these statements are very specific, they provide only a small amount of information about the preference model of the user. Another class of qualitative preference statements are generalizing statements, such as “In a minivan I prefer automatic transmission to manual transmission,” or perhaps a more extreme “I prefer any red car to any blue car.” Such statements are as natural in our daily speech as the more concrete comparison statements, yet they provide much more information because a single generalizing statement implicitly encodes many comparisons between concrete outcomes. Syntactically, generalizing preference expressions take the form

S = {s1, …, sm} = {ϕ1 R1 ψ1, …, ϕm Rm ψm},

where each si = ϕi Ri ψi is a single preference statement, with ϕi, ψi being some logical formulas over X; Ri ∈ {≻, ⪰, ∼}; and ≻, ⪰, and ∼ have the standard semantics of strong preference, weak preference, and preferential equivalence, respectively.

For an illustration, the statements

s1: “SUV is at least as good as a minivan”

and

s2: “In a minivan, I prefer automatic transmission to manual transmission”

can be written as

(Xtype = SUV) ⪰ (Xtype = minivan)

and

(Xtype = minivan ∧ Xtrans = automatic) ≻ (Xtype = minivan ∧ Xtrans = manual).

By adding such generalizing statements to our language, we facilitate more compact communication of the preference model. This benefit, however, can come with a price at the level of language interpretation, because the precise meaning that a user puts into generalizing statements is not always clear. While the statement “I prefer any vegetarian dish to any meat dish” is easy to interpret, the statement “In a minivan I prefer automatic transmission to manual transmission” can be understood in numerous ways. For instance, under what is called the totalitarian semantics, this statement gets interpreted as “I always prefer a minivan with an automatic transmission to one with a manual transmission” (that is, regardless of the other properties the two cars may have). While sometimes reasonable, in general this interpretation assumes too much about the information communicated by the statement. Another possible interpretation is “I prefer a typical/normal minivan with automatic transmission to a typical/normal minivan with a manual transmission.” Of course, we now have to define “typical” and “normal,” a thorny issue by itself. Yet another possible interpretation, under what is called the ceteris paribus (“all else being equal”) semantics, is “I prefer a minivan with automatic transmission to one with a manual transmission, provided all other properties are the same.” This is probably the most conservative natural interpretation of such generalizing statements, and as such, it is also the weakest. That is, it provides the weakest constraints on the user’s preference model. But because it is conservative, it is also unlikely to reach unwarranted conclusions, which makes it probably the most agreed-upon general scheme for statement interpretation. However, the list of alternative interpretations that have been suggested in the literature on preferences and in the area of nonmonotonic reasoning is much longer, and the only solid conclusion we can probably draw from this situation is that the role of the interpretation function becomes very important in metamodels relying on qualitative statements of preference.

With this picture in mind, in the remainder of this section we consider a specific instance of a complete metamodel in which a qualitative language is used. Specifically, we consider the language of statements of preference over the values of single attributes. This specific language underlies the preference specification approach called CP-networks (Boutilier et al. 2004a), and the components of this metamodel instance are described in figure 5.

Figure 5. Metamodel for CP-Nets. (Models: partial strict/weak orders of outcomes; language: sets of statements of (conditional) preference over single attributes; interpretation: ceteris paribus; representation: CP-nets.)

The model of user preferences assumed in the CP-networks approach is that of partial orders over outcomes. The reason is practical. Total orders are difficult to specify. Ultimately, to specify a total order, we need information that allows us to compare any two outcomes. That is quite a lot of information, and generally it is unlikely that casual users will be ready to provide it. In addition, in practice we often don’t need to compare every pair of outcomes. For instance, this information might not be essential for the tasks of identifying the most preferred outcomes, ordering reasonably well a given set of outcomes, and so on.

The language employed in the CP-networks approach is the language of (possibly conditioned) preference statements on the values of single attributes. For instance, in the statement “I prefer black sports cars to red sports cars,” the preference is expressed over the attribute Ext-Color, with the value of Category conditioning the applicability of the statement only to pairs of sports cars. Formally, the language of CP-networks consists of preference expressions of the form

S ⊆ { y ∧ xi ≻ y ∧ xj | X ∈ X, Y ⊆ X \ {X}, y ∈ Dom(Y), xi, xj ∈ Dom(X) }.

Note that Y can be any subset of attributes excluding the referent attribute X, and each statement y ∧ xi ≻ y ∧ xj in the expression says that, in some context captured by assigning y to Y, the value xi is preferred to xj for X.

The interpretation function used in CP-networks corresponds to the ceteris paribus interpretation of individual statements, followed by the transitive closure of the preference orderings induced by these statements. Specifically, a statement y ∧ xi ≻ y ∧ xj is interpreted as communicating a preference ordering in which outcome o is preferred to outcome o′ whenever o and o′ agree on all attributes in X \ {X}, they both assign the value y to Y, and o assigns xi to X while o′ assigns xj to X. For example, consider the statement s1 in figure 6, which can be written as: Xcategory = minivan ∧ Xext-color = red ≻ Xcategory = minivan ∧ Xext-color = white. s1 induces the following preference ordering on the tuples in the table on the top right: t1 ≻ t3 and t2 ≻ t4. These relationships follow from the ceteris paribus semantics because both t1, t3 and t2, t4 are minivans, and each pair assigns the same value to all other attributes (Int-Color in this case); however, t1’s Ext-Color is preferred to t3’s. In contrast, t1 ≻ t4 does not follow from s1 because t1 and t4 assign different values to Int-Color. Considering now a set of statements as a single coherent preference expression, the interpretation function in the CP-networks approach takes the transitive closure of the orderings induced by each of the statements si. For example, we saw that s1 induces t2 ≻ t4. From s4 we can similarly deduce t1 ≻ t2, because these are two red cars whose other attributes (Category in this case) are the same. Thus, the preference ordering induced by the overall expression provides us with t1 ≻ t4.
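The ceteris paribus interpretation of a single statement can be sketched directly (Python; the attribute encoding is ours). Two outcomes are ordered by a statement only if they match its context and agree on every attribute other than the referent:

```python
# Outcomes t1-t4 from figure 6, as attribute dictionaries.
t1 = {"category": "minivan", "ext_color": "red",   "int_color": "bright"}
t2 = {"category": "minivan", "ext_color": "red",   "int_color": "dark"}
t3 = {"category": "minivan", "ext_color": "white", "int_color": "bright"}
t4 = {"category": "minivan", "ext_color": "white", "int_color": "dark"}

def cp_prefers(o1, o2, context, attr, better, worse):
    """Does o1 > o2 follow, ceteris paribus, from 'in context: better > worse for attr'?"""
    same_elsewhere = all(o1[a] == o2[a] for a in o1 if a != attr)
    in_context = all(o1[a] == v and o2[a] == v for a, v in context.items())
    return same_elsewhere and in_context and o1[attr] == better and o2[attr] == worse

# s1: for minivans, red exterior is preferred to white.
s1 = ({"category": "minivan"}, "ext_color", "red", "white")
print(cp_prefers(t1, t3, *s1))  # True:  t1 and t3 differ only on Ext-Color
print(cp_prefers(t2, t4, *s1))  # True
print(cp_prefers(t1, t4, *s1))  # False: t1 and t4 also differ on Int-Color
```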

The last conceptual component of the metamodel is the representation of the input information. In the CP-networks approach, the preference expression is represented using an annotated digraph (which gives the name to the approach). In this digraph, nodes correspond to outcome attributes, and edges capture the conditioning structure of the preference statements. Specifically, for each statement y ∧ xi ≻ y ∧ xj we add an edge from each node representing an attribute Y′ ∈ Y to the node representing attribute X, because our preference for X values depends on the value of Y′. Finally, each node is annotated with all statements that describe preferences for its value. Figure 6 depicts the CP-network induced by statements {s1, …, s5}. For example, because of statement s1, in which the preference for exterior color depends on category, Category is a parent of Ext-Color in the network. In that figure we can also see the relevant preference statements in the table of each node. On the lower right side of the figure we see the induced preference relation. Note that for clarity we introduced only the ordering relations that follow directly from the statements and just a few of the relations obtained by transitive closure.

Recall that earlier we emphasized the notion of independence. How does it come into play in CP-networks? The answer is that it is a by-product of the interpretation semantics. Notice how the ceteris paribus semantics and the notion of conditional preferential independence are closely related. In both, some context is fixed, and then the order over values of some attribute set is identical regardless of the (fixed) value of the remaining attributes. In CP-networks, the ceteris paribus interpretation forces the attribute associated with a node to be preferentially independent of all other attributes, given the value of the parent attributes.

Outcome space (attributes Category, Ext-Color, Int-Color): t1 = (minivan, red, bright), t2 = (minivan, red, dark), t3 = (minivan, white, bright), t4 = (minivan, white, dark), t5 = (SUV, red, bright), t6 = (SUV, red, dark), t7 = (SUV, white, bright), t8 = (SUV, white, dark).

Preference expression: s1: I prefer red minivans to white minivans. s2: I prefer white SUVs to red SUVs. s3: In white cars I prefer a dark interior. s4: In red cars I prefer a bright interior. s5: I prefer minivans to SUVs.

CP-net: Category → Ext-Color → Int-Color, with node tables Cmv ≻ Csuv; given Cmv: Er ≻ Ew, given Csuv: Ew ≻ Er; given Er: Ib ≻ Id, given Ew: Id ≻ Ib.

Figure 6. CP-Net Example: Statements, CP-Net, Induced Preference Relation.

But what does the graphical structure of CP-networks really buy us? While users need not be aware of this structure, it was shown to play an important role in the computational analysis of various queries about the preference model. First, the graph structure helps us ensure statement consistency. In general, a preference statement is inconsistent if it induces a cycle of strict preferences in the preference ordering. In that case, we can prove that o ≻ o for some outcome o. In the case of CP-networks, we know that if the CP-net is acyclic, then the preference statement it represents must be consistent. Note that the acyclicity of the network can be recognized in time linear in |X|—this is probably the simplest example of a query that can be efficiently answered by using the graphical representation of the input expression. Second, when the network is acyclic, we can topologically sort its nodes. This topological ordering is a standard subroutine in various algorithms that operate on CP-networks. For instance, this ordering can be used to find an optimal outcome in time linear in |X|.4 Third, it can be shown that the complexity of comparison queries, that is, answering the query “does o ≻ o′ hold,” varies depending on the structure of the network from polynomial time to PSPACE-complete.
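The linear-time optimization is a forward sweep: visit the attributes in a topological order of the network and pick the most preferred value of each attribute given the already chosen values of its parents. A sketch (Python) for the acyclic CP-net of figure 6, keeping only the most preferred value per parent context:

```python
# CP-net of figure 6: Category -> Ext-Color -> Int-Color.
parents = {"category": (), "ext_color": ("category",), "int_color": ("ext_color",)}
best = {  # most preferred value of each attribute, per parent assignment
    "category": {(): "minivan"},                            # s5
    "ext_color": {("minivan",): "red", ("SUV",): "white"},  # s1, s2
    "int_color": {("red",): "bright", ("white",): "dark"},  # s3, s4
}

# Sweep the attributes in a topological order of the network.
outcome = {}
for attr in ["category", "ext_color", "int_color"]:
    context = tuple(outcome[p] for p in parents[attr])
    outcome[attr] = best[attr][context]

print(outcome)
# {'category': 'minivan', 'ext_color': 'red', 'int_color': 'bright'}, i.e., t1
```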

Finally, given the complexity of comparisons, it would appear that finding a preferentially consistent total order of a given set of outcomes should be difficult as well. By a consistent total order we mean a total order such that if o ≻ o′ holds, then o will appear before o′ in that order. Surprisingly, perhaps, this is not the case, and we can consistently order a set of outcomes quickly. The reason is that we have more flexibility here. For two given outcomes o, o′, a comparison query asks whether o ≻ o′ holds. In contrast, an “ordering query” asks for a consistent ordering of o and o′. To order o before o′, we need only know that o′ ⊁ o, which is easier to check. In fact, an ordering query needs only establish that one of o ⊁ o′ or o′ ⊁ o holds. When the CP-network is acyclic we can do this in time linear in |X|. Of course, it is not obvious that, given that ordering two outcomes is easy, ordering an arbitrary number of outcomes is easy as well, as there may be some global rather than local constraints involved. But fortunately, the same ideas apply, and we can order a set of N items in O(N log(N) · |X|) time.

Altogether, we see that the graphical representation in CP-nets holds much useful information about the induced preference order. This example provides good evidence that representations deserve careful attention as an important component of the preference handling metamodel.

Further Reading

CP-nets were introduced in Boutilier et al. (2004a, 2004b) and have generated various extensions, enhancements, and analyses, such as Wilson (2004); Rossi, Venable, and Walsh (2004); Goldsmith et al. (2005); and Brafman, Domshlak, and Shimony (2006). Preference languages and logics have been discussed by philosophers (Hansson 2001a) and AI researchers (Brewka, Niemela, and Truszczynski 2003; Tan and Pearl 1994; Doyle and Wellman 1994; Boutilier 1994). In fact, the best-known interpretation for nonmonotonic logics is known as the preference semantics (Shoham 1987), and thus most nonmonotonic logics, such as those of Kraus, Lehmann, and Magidor (1990); McCarthy (1980); and Reiter (1980), can be viewed as expressing preference statements, usually associated with aggressive interpretations that attempt to deduce as much information as possible.

Graphical models play an important role in many areas of AI, most notably probabilistic reasoning (Pearl 1988) and constraint satisfaction (Dechter 2003). An early use in preference modeling appeared in Bacchus and Grove (1995), where a graphical model was used to capture independence in utility functions. More recent models include expected-utility networks (La Mura and Shoham 1999), which combine preferences and probabilities; GAI-networks, which model GAI utility functions (Gonzales and Perny 2004); UCP-nets (Boutilier, Bacchus, and Brafman 2001), which provide a directed graphical model for value and utility functions; and CUI networks, which build on the notion of conditional utility independence to provide more compact representations for certain utility functions (Engel and Wellman 2006).

Preference Compilation

So far, we have seen two general approaches to preference specification. The more quantitative value-function-based specifications are very simple to interpret and support efficient comparison and ordering operations. However, they are based on an input language that few users will find natural. The qualitative models that utilize generalizing statements attempt to provide users with intuitive input expressions; however, it is not always clear how to interpret these statements (for example, “I like Chinese restaurants”), comparisons may be difficult, and depending on the class of input statements, ordering may be difficult too. Things can become particularly hairy if we want to allow diverse types of input statements. On the one hand, this is clearly desirable, as it gives more flexibility to the users. On the other hand, the interpretation of and interaction between heterogeneous statements may not be easy to specify. Likewise, developing efficient query-answering algorithms for such heterogeneous expressions becomes much more problematic.

Preference compilation techniques attempt to make the best of both worlds by mapping diverse statements into a single, well-understood representation, notably that of compact value functions. As outlined in figure 7, this mapping translates qualitative statements into constraints on the space of possible value functions. As a simple example, if I say “car A is preferred to car B,” then this puts a constraint on V in the form of V(A) > V(B). Typically these constraints are consistent with multiple value functions, and depending on the method, we then answer queries by using one of the consistent value functions. Note that by working with one specific consistent value function, we are committing to one of many total orders that extend the partial order described by the input; that is, we are making additional, not necessarily warranted, assumptions about the user’s preferences. Depending on the application and the types of queries we want to answer, this may or may not be appropriate. We return to this issue later in this tutorial.

Figure 7. Preference Compilation. (Models: partial strict/weak orders of outcomes; language: sets of qualitative preference statements; representation: compact value functions, obtained from the language by compilation.)

In what follows we describe two classes of techniques for value function compilation. One is more canonical, following the classical works on conjoint measurement, whereas the other is based on more novel ideas, as well as on some computational techniques used extensively in the area of discriminative machine learning.

Structure-Based Compilation. One of the advantages of generalizing statements is that a single statement can express much information. Thus, with a compact structure, such as a CP-network, we can express much information about the ordering of outcomes. If we were to compile a set of such statements into an arbitrary value function, we would lose this conciseness—we might get a structure that is exponentially large in |X|, and therefore expensive to store and work with. Structure-based compilation attempts to maintain the beneficial compact structure of input statements in the resulting value function so that time and space complexity remains low. Another reason for maintaining a compact target function is that it possesses many independence properties. These properties seem desirable, as they are usually implicitly implied by the input statements.

Figure 7. Preference Compilation. (The figure depicts the compilation pipeline: sets of qualitative preference statements are interpreted as a partial strict/weak order of outcomes, which is then compiled into a representation as compact value functions supporting algorithms and queries.)

The general paradigm of structure-based compilation works as follows. We would like to show that some class of statements can be mapped into some class of value functions and that this mapping can be done efficiently. Let's consider a very simple example to illustrate this idea. For that, we restrict ourselves to a simple language of unconditional preference statements over single attributes such as "I prefer red cars to blue cars." Next, we specify a target class of value functions, and in our case this will be the class of additively independent value functions. Note that this class of functions satisfies the desired goal of mapping into compact value functions—each additively independent value function requires only O(n) space, as we need to associate a numeric value with (and only with) every attribute value. For example, if we have an attribute Color with values blue, red, and black, we will have a function V_Color associated with this attribute, and we must specify V_Color(blue), V_Color(red), and V_Color(black). Finally, we must specify a mapping from our preference expressions to the selected target class of functions. The mapping usually corresponds to mapping each statement to a set of constraints on the value function. For example, "I prefer red cars to blue cars" maps into V_Color(blue) < V_Color(red). Once we generate the entire set of constraints, any assignment of values to the parameters that satisfies these constraints is a valid value function. For example, V_Color(blue) = 10, V_Color(red) = 15 is a valid value function with respect to the sole statement noted above.
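To make this concrete, here is a minimal sketch in Python of the compilation step just described. The attribute names, the statement format, and the brute-force search over small integer parameters are our own illustrative assumptions (a real system would use linear programming); only the mapping from statements to constraints follows the scheme in the text.

```python
from itertools import product

# Attributes and their domains (hypothetical example).
domains = {"Color": ["blue", "red", "black"], "Category": ["sedan", "minivan"]}

# Unconditional statements "I prefer <better> to <worse>" over single attributes.
statements = [("Color", "red", "blue"), ("Category", "sedan", "minivan")]

def compile_constraints(statements):
    """Map each statement to a constraint V_attr(worse) < V_attr(better)."""
    return [(attr, worse, better) for attr, better, worse in statements]

def find_value_function(domains, constraints, values=range(3)):
    """Brute-force search for per-value parameters satisfying all constraints.
    (Illustrative only; a real implementation would solve a linear program.)"""
    slots = [(a, v) for a in domains for v in domains[a]]
    for assignment in product(values, repeat=len(slots)):
        V = dict(zip(slots, assignment))
        if all(V[(a, worse)] < V[(a, better)] for a, worse, better in constraints):
            return V
    return None

V = find_value_function(domains, compile_constraints(statements))
# The value of a full outcome is the sum of its per-attribute values.
outcome = {"Color": "red", "Category": "sedan"}
print(sum(V[(a, outcome[a])] for a in outcome))
```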

At this point we want to show that this mapping works in general. That is, if we have a consistent set of statements S in the given language, then there exists an assignment to the appropriate set of parameters that yields a value function V representing the preference order induced by S.5 For instance, in this particular case, it is trivial to show that given consistent orderings over the values of individual attributes, we can come up with an additively independent value function that is consistent with these orderings. In general, representation theorems of interest usually (1) define a concrete class of input statements; (2) define the structure of the resulting value function (as a function of some properties of the input statements), and show that it is not too large; and (3) prove that, for consistent input expressions in the chosen class, there exist value functions consistent with these expressions and obeying the selected structure. Finally, we also need a compilation theorem, that is, a theorem that shows that this mapping process can be performed efficiently.

Naturally, there are more interesting results than the very simple example considered here. For example, Brafman and Domshlak (2008) show that it is possible to compile to compact GAI value functions rather complicated, heterogeneous expressions consisting of (1) conditional preference statements over single attributes, for example, "In sports cars from model year 2008 and beyond, I prefer the color red to blue"; (2) conditional statements of attribute importance, for example, "Given that I'm flying in business class at night, the airline choice is more important than the seating assignment"; and (3) pairwise comparisons of concrete outcomes, for example, "Your car is better than mine." The compilation scheme for this language is already much more involved, and thus we won't get into its detailed discussion. In general, however, the area of structure-based preference compilation is still largely unexplored, and we believe that many interesting results in this direction are still to be discovered.

Structure-Free Compilation. While the amount of concrete results on structure-based preference compilation is still limited, it appears that the structure-based approach has two important limitations in general. The first is that a particular target structure for the value function is assumed. As we know, such structure is intimately tied with the notion of independence. Thus, in essence, the structure-based approach attempts to augment the user expression with additional independence assumptions. These assumptions may not be valid. They may even be inconsistent with the user's statements. The second limitation is that our ability to provably model heterogeneous types of statements is limited. That is, the languages for which we have representation and compilation theorems are limited.

The two limitations are closely related. Because we insist on a special structure with its related independence assumptions, we lose our modeling flexibility, and consequently the expressive power of the input statements we can handle. However, we insisted on such structure for a reason—we wanted a compactly representable value function. The structure-free compilation approach removes the inherent independence assumptions of the structure-based approach while maintaining our ability to efficiently compute the value of each outcome. It does this by utilizing kernel-based methods, popular in machine learning (Muller et al. 2001). Of course, there is no entirely free lunch, and some other assumptions are hidden behind these techniques—we will try to understand them later on.

The basic idea behind structure-free compilation is as follows: starting with the original n attributes X, we schematically define a much larger (exponential in n) space of new attributes. These attributes, called features in what follows, correspond to all possible combinations of attribute values. The key property of the high-dimensional space defined by these features is that any value function defined in terms of the original set of attributes has an additively independent decomposition over the new features. The value of an outcome corresponds to the dot product between a vector of weights, one associated with each feature, and the vector of feature values provided by the outcome. Computing this dot product explicitly is infeasible, as these vectors' length is exponential in n. However, under certain conditions, dot products can be computed efficiently using kernel functions, and this is exactly what is exploited here. We now provide a more detailed explanation.

The structure-free approach works by taking a first step that is rather unintuitive—it maps the original problem, where we work with n given attributes, into a problem in which we have exp(n) attributes. Assuming here that the attributes X are all binary valued, the outcomes Ω described in terms of X are schematically mapped into a space F = R^{4^n} using a certain mapping Φ : Ω → F. The mapping Φ relates the attributes X and the dimensions of F as follows. Let F = {f_1, …, f_{4^n}} be a labeling of the dimensions of F, and let D = ∪_{i=1…n} Dom(X_i) = {x_1, x̄_1, …, x_n, x̄_n} be the union of the attribute domains in X. Let val : F → 2^D be a bijective mapping from the dimensions of F onto the power set of D, uniquely associating each dimension f_i with a subset val(f_i) ⊆ D, and vice versa. Let Var(f_i) ⊆ X denote the subset of attributes "instantiated" by val(f_i). For example, if val(f_i) = {x_2, x̄_3, x_17}, then Var(f_i) = {X_2, X_3, X_17}. Given that, for each outcome x ∈ Ω and f_i ∈ F,

$$\Phi(\mathbf{x})[i] = \begin{cases} 1, & val(f_i) \neq \emptyset \,\wedge\, val(f_i) \subseteq \mathbf{x} \\ 0, & \text{otherwise.} \end{cases}$$

That is, geometrically, Φ maps each n-dimensional vector x describing an outcome to the 4^n-dimensional vector in F that uniquely encodes the set of all projections of x onto the subspaces of X. For example, suppose that we have three attributes describing cars: Color (red, blue), Category (sedan, minivan), New (new, used). In this case, the features f_i will correspond to red, red ∧ new, sedan ∧ used, and so on, and we would map a car (red, used, sedan) into a truth assignment to features that assigns true to the features associated with {red}, {used}, {sedan}, {red, used}, {red, sedan}, {used, sedan}, {red, used, sedan}.

Let us now consider the compilation of preference expressions in terms of constraints on additively independent value functions in the new space F. Assuming here that the referents of the statements in S correspond to propositional logic formulas over X, consider an arbitrary comparative statement ϕ ≻ ψ. Let X_ϕ ⊆ X (and similarly X_ψ) be the variables involved in ϕ, and let M(ϕ) ⊆ Dom(X_ϕ) be the set of all ϕ's models in the subspace of Ω defined by X_ϕ. For instance, if X = {X_1, …, X_10} and ϕ = X_1 ∨ X_2, then X_ϕ = {X_1, X_2} and M(ϕ) = {x_1 x_2, x_1 x̄_2, x̄_1 x_2}. Given that, each statement ϕ ≻ ψ is compiled into a set of |M(ϕ)| × |M(ψ)| linear constraints

$$\forall m \in M(\varphi),\; \forall m' \in M(\psi): \sum_{f_i \,:\, val(f_i) \in 2^{m}} w_i \;>\; \sum_{f_j \,:\, val(f_j) \in 2^{m'}} w_j,$$

where 2^m denotes the set of all nonempty value subsets of the local model m. For example, the statement (X_1 ∨ X_2) ≻ (¬X_3) (for example, "It is more important that the car is powerful or fast than not having had an accident") is compiled into

$$\begin{aligned} w_{x_1} + w_{x_2} + w_{x_1 x_2} &> w_{\bar{x}_3} \\ w_{x_1} + w_{\bar{x}_2} + w_{x_1 \bar{x}_2} &> w_{\bar{x}_3} \\ w_{\bar{x}_1} + w_{x_2} + w_{\bar{x}_1 x_2} &> w_{\bar{x}_3}. \end{aligned}$$

Putting together such constraints resulting from compiling all the individual preference statements in the expression, we obtain a constraint system C corresponding to the entire expression S. At first view, solving C poses numerous complexity issues. First, though this constraint system is linear, it is linear in the exponential space R^{4^n}. Second, the description size of C, and, in fact, of each individual constraint in C, can be exponential in n. Interestingly, it turns out that these complexity issues can be overcome by using some duality techniques from optimization theory (Bertsekas, Nedic, and Ozdaglar 2003) and reproducing kernel Hilbert spaces (RKHS) (Kimeldorf and Wahba 1971). The bottom line is that these techniques allow for computing a solution for C (corresponding to the desired value function V : F → R), and subsequently computing the values of V on the outcomes Ω, without requiring any explicit computation in R^{4^n}.6
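The kernel shortcut mentioned above can be made tangible with a small sketch. The code is ours, not the authors'; it assumes binary attributes encoded as 0/1 tuples. The closed form 2^a – 1 follows from the definition of Φ: the features active in both outcomes are exactly the nonempty subsets of the literals on which the two outcomes agree.

```python
from itertools import chain, combinations

def powerset_nonempty(items):
    return chain.from_iterable(combinations(items, r) for r in range(1, len(items) + 1))

def phi_dot(x, y):
    """Explicit dot product <Phi(x), Phi(y)>: count the features (nonempty
    literal sets) that hold in both outcomes. An outcome's literal set is
    {(attribute index, value)} for each of its attributes."""
    common = set(enumerate(x)) & set(enumerate(y))
    return sum(1 for _ in powerset_nonempty(sorted(common)))

def kernel(x, y):
    """Closed form: 2^a - 1, where a = number of attributes on which x, y agree."""
    a = sum(1 for xi, yi in zip(x, y) if xi == yi)
    return 2 ** a - 1

x, y = (1, 0, 1, 1), (1, 1, 1, 0)   # agree on attributes 0 and 2
assert phi_dot(x, y) == kernel(x, y) == 3
print(kernel(x, y))
```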

Further Reading

The idea of finding factored value functions that represent some preference relation is discussed at length in Keeney and Raiffa (1976). Conjoint measurement theory (Green and Rao 1971; Green, Krieger, and Wind 2001) specifically considers the problem of generating a value function describing a user (or a population) preference based on answers to preference queries. Most of this work seeks an additive independent model, though some extensions exist (Bouyssou and Pirlot 2005). Compilation methods were investigated in AI by McGeachie and Doyle (2004), Domshlak and Joachims (2007), and Brafman and Domshlak (2008). Much recent work in the machine-learning community is devoted to rank learning problems, where we attempt to learn a ranking function on some input based on examples. See, for example, Crammer and Singer (2003), Radlinski and Joachims (2007), and Burges et al. (2005).

Uncertainty and Utility Functions

In our discussion so far, we assumed that our preferences were over outcomes that we can obtain with certainty. For example, when we ordered digital cameras (for example, based on their zoom, number of megapixels, and cost), our implicit assumption was that we actually have a choice among concrete cameras. This may sound completely obvious, but consider the following situation. My company offers a new perk—each week it will buy me a lottery ticket, but I have to choose which lottery to participate in (for example, California Lottery, Nevada, British Columbia, Upper Saxon, and so on). So I'm faced with the following decision problem: determine the most desirable lottery from a set of possible lotteries. Each lottery offers a certain set of possible prizes, and to make things simple, we assume that the odds for each prize are known. Alternatively, suppose that I'm spending the weekend in Las Vegas, and I love playing roulette-style games. I know that different hotels have different such games with different prizes and different odds, and I want to choose the best place to gamble.

Gambling and lotteries may sound like insignificant applications of preference theories. It turns out, however, that the concept of a lottery, or a gamble, is far more general than "real" lotteries.


Basically, a lottery represents a choice (possibly available to us) that can lead to various concrete outcomes, but where we do not know ahead of time which outcome will prevail. For example, consider the problem of selecting a digital camera. If the attributes we care about are indeed megapixels, zoom, and cost, then no lottery is involved, because we can directly obtain the outcome we select. But what if we also care about reliability, defined as the length of time until the camera starts to malfunction? Needless to say, we cannot really select the value of this attribute. Instead, we might be able to associate with this attribute a probability distribution over its values. Similarly, what if what we care about in a flight is its actual (in contrast to the published!) arrival time? Again, we cannot select a concrete arrival time simply because we don't control it. What we could do is, for example, choose the airline based on its past record. Thus, a flight is a lottery over arrival times (among other things), and by selecting different flights we are selecting different lotteries. Similarly, consider an actuation command sent to one of NASA's rovers on Mars. We don't really know ahead of time how it will affect the rover. This depends on the complex dynamics of the rover itself, the terrain, and the current weather conditions on Mars. None of these parameters are known precisely. Hence, when we choose to send this command, we're actually choosing a lottery over various concrete changes of the rover's state.

Thus, lotteries over concrete outcomes are a general model of choice under uncertainty, that is, the problem of choosing among actions with uncertain outcomes. But this begs the question: how can we work with preferences over such choices? Clearly, if we're able to order the set of possible lotteries, we're done. However, it's really unclear how even to approach this problem. This is where von Neumann and Morgenstern's theory of utilities (von Neumann and Morgenstern 1947) comes in. This seminal theory explains how we can associate a real value with each concrete outcome—its utility—such that one lottery is preferred to another if and only if the expected utility of its outcomes is greater than that of the other.

This result seems magical, and in some sense it is, but it does come with a price. First, note that simply ordering the outcomes is not going to be enough. We will have to work harder and come up with something like a value function. However, whereas in a value function we really care only about the relative ordering of outcomes' values, in a utility function the actual value plays a significant role. In addition, the whole technique works only when the target ordering over lotteries satisfies certain properties. On the positive side, these properties make a lot of sense; that is, you would expect a "reasonable" decision maker to satisfy these postulates. On the negative side, it is well known that people's orderings often do not satisfy these properties, and a lot depends on framing issues. There's been a lot of work on this topic, especially by economists, who have attempted to come up with weaker or simply different properties (and there's a very nice and accessible textbook on this topic (Kreps 1988)). Here, however, we'll stick to the classical result.

Formally, our model here has the following elements:

Ω—the set of possible concrete outcomes

𝓛 = Π(Ω)—the set of possible lotteries (that is, probability distributions) over Ω

L ⊆ 𝓛—the set of available lotteries over Ω (that is, possible choices or actions)

If l ∈ L and o ∈ Ω, we use l(o) to denote the probability that lottery l will result in outcome o. Our first step will be to assume that we seek an ordering over the entire family Π(Ω). This makes things simpler mathematically. The actual elicitation technique is affected by the size of Ω only.

Our second step is to define the notion of a complex lottery. Imagine that my employer has now added a new metalottery to its list of offered lotteries. In this lottery, we win lottery tickets. Say, with probability 0.4 we get a California lottery ticket denoted c, and with probability 0.6 we get a Nevada lottery ticket denoted n. A fundamental assumption we're going to make is that this new complex lottery, denoted 0.4c + 0.6n, is equivalent to a simple lottery. For example, suppose the California lottery gives out $1,000,000 with probability 0.1, and nothing otherwise, while the Nevada lottery gives out $2,000,000 with probability 0.05, and nothing otherwise. We can transform the new complex lottery into a simple lottery by looking at the joint distribution. Thus, the probability that we get $1,000,000 is 0.1 * 0.4 = 0.04, the probability that we get $2,000,000 is 0.05 * 0.6 = 0.03, and the probability that we get nothing is 0.93.
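This reduction is just the law of total probability; a short sketch with the example's numbers (function and variable names are ours) makes it mechanical:

```python
from collections import defaultdict

def reduce_compound(meta):
    """Reduce a compound lottery [(prob of ticket, simple lottery), ...]
    to a simple lottery over prizes via the joint distribution."""
    simple = defaultdict(float)
    for p_ticket, lottery in meta:
        for prize, p_prize in lottery.items():
            simple[prize] += p_ticket * p_prize
    return dict(simple)

california = {1_000_000: 0.1, 0: 0.9}
nevada     = {2_000_000: 0.05, 0: 0.95}

# The meta-lottery 0.4c + 0.6n from the text.
print(reduce_compound([(0.4, california), (0.6, nevada)]))
# ≈ {1000000: 0.04, 0: 0.93, 2000000: 0.03}
```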

Next, we present the three properties that preferences over lotteries must satisfy for the von Neumann–Morgenstern theorem to hold:

Axiom 1. Order. ⪰ is a total weak order: for every l, l′ ∈ 𝓛, at least one of l ⪰ l′ or l′ ⪰ l holds.

Axiom 2. Independence/Substitution. For every lottery p, q, r and every a ∈ [0, 1], if p ⪰ q then ap + (1 – a)r ⪰ aq + (1 – a)r.

Axiom 3. Archimedean/Continuity. If p, q, r are lotteries such that p ≻ q ≻ r, then ∃a, b ∈ (0, 1) such that ap + (1 – a)r ≻ q ≻ bp + (1 – b)r.

Let's take a closer look at each one of these axioms. The first is pretty straightforward: we must agree that, in principle, any two lotteries are comparable. The second axiom is a bit harder to parse, yet it formalizes a very intuitive assumption about preferences over lotteries. Basically, the second axiom says that if you prefer oranges at least as much as apples then, no matter how you feel about carrots, you'll like a lottery that gives you an orange with probability p and a carrot with probability 1 – p at least as much as a lottery that gives you an apple with probability p and a carrot with probability 1 – p. Thus, in both lotteries we get a carrot with the same probability, but with the remaining probability, the first lottery gives us a more preferred outcome, and so we like it better.

To understand the third axiom, imagine that you strictly prefer oranges to apples and apples to carrots. Imagine that we slightly "dilute" your oranges by making a lottery that gives you oranges with probability 1 – a and carrots with probability a. The axiom says that there is always such a dilution (with a possibly very small, but still positive, a) for which you still prefer this lottery to getting apples for sure. It is not surprising that this is called the continuity axiom. Similarly, if we take the carrots and only slightly improve them by giving you some chance of getting an orange, there is always such a prospect that is still worse than getting apples for sure. Intuitively, this axiom implies that nothing can be too bad (for example, hell) or too good (for example, heaven), so that just a little bit of it could transform a good thing into an arbitrarily bad one, or a bad thing into an arbitrarily good one.

It turns out that these three axioms together buy us a lot. Specifically, the von Neumann–Morgenstern theorem states that a binary relation ⪰ over 𝓛 satisfies axioms 1–3 if and only if there exists a function U : Ω → R such that

$$p \succeq q \iff \sum_{o \in \Omega} U(o)p(o) \ge \sum_{o \in \Omega} U(o)q(o).$$

Moreover, U is unique up to affine (= linear) transformations.

U is called a utility function, and the aforementioned condition says that a lottery p is at least as preferred as a lottery q iff the expected utility of the possible outcomes associated with p is as high as the expected utility of the outcomes associated with q.

Once again, the metamodel instantiated for the setup of preferences over lotteries is depicted in figure 8. Note that, in practice, the existence of the utility function should be somehow translated to a method for obtaining that function. One way to do that is as follows. First, we order the set of concrete outcomes. Because utilities are unique up to affine transformations, we can fix an arbitrary value for the best and worst outcomes. We'll use 1 for the best and 0 for the worst. Next, we need to assign a value to every other outcome. Let o_b denote the best outcome and o_w denote the worst outcome. Consider an arbitrary outcome o. The utility of o is given by the value p determined in response to the following question: For what value p is it the case that you are indifferent between getting o for sure and a lottery in which you obtain o_b with probability p and o_w with probability 1 – p?

For example, suppose that we're ordering food. Restaurants are characterized by two attributes: whether they sell junk food or healthy food, and whether the food is spicy or not. Here are the steps we take to generate the utility function:

Step One: Order outcomes best to worst: (unspicy, healthy) ≻ (spicy, junk food) ≻ (spicy, healthy) ≻ (unspicy, junk food).

Step Two: Assign utility to best and worst outcomes: U(unspicy, healthy) := 1, U(unspicy, junk food) := 0.

Step Three: Ask for values p and q such that (a) (spicy, healthy) ∼ p(unspicy, healthy) + (1 – p)(unspicy, junk food) and (b) (spicy, junk food) ∼ q(unspicy, healthy) + (1 – q)(unspicy, junk food).

Step Four: Assign U(spicy, healthy) := p, U(spicy, junk food) := q.
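A sketch of this procedure in code, with hypothetical indifference answers hard-wired in place of real user responses, shows how the elicited numbers immediately support expected-utility comparisons:

```python
# Standard-gamble elicitation: fix U(best) = 1, U(worst) = 0; every other
# outcome gets the probability at which the user is indifferent between
# it and the lottery p * best + (1 - p) * worst.
best, worst = ("unspicy", "healthy"), ("unspicy", "junk")
answers = {("spicy", "healthy"): 0.4, ("spicy", "junk"): 0.7}  # hypothetical

U = {best: 1.0, worst: 0.0}
U.update(answers)

def expected_utility(lottery):
    """Lottery = mapping from outcomes to probabilities."""
    return sum(p * U[o] for o, p in lottery.items())

l1 = {("spicy", "healthy"): 0.5, ("spicy", "junk"): 0.5}
l2 = {best: 0.5, worst: 0.5}
print(expected_utility(l1), expected_utility(l2))  # ≈ 0.55 vs 0.5: prefer l1
```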

Although we see that there is a clear methodology for assigning utilities to outcomes, it is apparent that we can't expect lay users to be able to come up with these answers without much help. Indeed, specifying a utility function is much harder, cognitively, than specifying an ordering, and direct utility elicitation is not likely to be very useful in online settings. Current research in the area attempts to cope with this limitation in at least three different ways. First, it is possible to use here, as well, structural assumptions such as generalized additive independence to decompose the specification of complex utility functions. Second, by using data from previous users, one can learn about typical utility functions and later calibrate them using information about the current user. Finally, one can exploit knowledge about the current decision problem to characterize only the information about the user's utility function required to address the current decision problem. We shall consider some of these issues in the next section.

Further Reading

The theory of expected utility appeared in John von Neumann and Oskar Morgenstern's seminal book Theory of Games and Economic Behavior (1947). Since then, many economists have tried to provide similar axiomatizations of choice behavior, and this topic is discussed by Kreps (1988) and Fishburn (1982). One of the most beautiful results in this area is presented in Leonard Savage's The Foundations of Statistics (1972). Axiomatizations are important because they expose the basic assumptions of a model. We can also check the validity of a model by verifying, directly or indirectly, whether a user's preferences satisfy the relevant set of axioms. In particular, experiments conducted by behavioral economists show that people's behavior often violates the von Neumann and Morgenstern axioms (Kahneman and Tversky 1979).

Preference Elicitation

Our discussion so far has focused on models and languages for representing preferences. Some of the languages discussed were motivated by the need to make the preference specification process more rapid and more accessible to lay users. However, in many problem domains we can't expect to obtain perfect information about the user's preferences because of the time and effort involved. Thus, the first question that arises is what we can do with partial information. Next, given that we can expect only partial information, we would like to obtain the most useful information possible given the user's time and effort limitations. Thus, the second question is how to make the elicitation process effective.

Figure 8. Metamodel for Utility Functions. (The figure instantiates the metamodel for preferences over lotteries: the model is a total weak order over lotteries; typical queries are finding an optimal lottery or ordering a set of lotteries; the representation is a utility function U : Ω → R, interpreted by p ⪰ q ⇔ Σ_{o∈Ω} U(o)p(o) ≥ Σ_{o∈Ω} U(o)q(o).)

Working with Partial Specifications

The abstract problem of working with a partial specification has close similarities to the basic problem of statistical machine learning, in particular to classification. In classification problems, we must classify objects based on partial knowledge of the true model underlying the classification. In the standard setting of the classification problem, this partial knowledge corresponds to a set of properly classified objects from the space of all possible objects. Given such partial knowledge, the process of classification typically boils down to inferring either a single classification model, or a (possibly weighted) set of such models, from the space of all models in the chosen model class. The latter set of possible models is called a hypothesis space. Our situation is somewhat similar—our hypothesis space corresponds to the set of all possible preference model instances. For example, if our models are total orders, then these would be all possible total orders over the relevant set of outcomes. Or if our model is some GAI value function with a particular set of factors, then the hypothesis space includes all possible factor values. We, too, are given only partial information, and have to make decisions based on this information only. However, our situation with respect to the partial information is not exactly the same as in the standard setting of statistical machine learning. The main difference stems from the form in which the partial knowledge is obtained. In statistical machine learning the partial knowledge is given by examples of the concept to be learned, and the examples are assumed to be sampled from independent and identically distributed random variables reflecting the topology of the object space. In contrast, our partial knowledge is typically given by a set of more global constraints on the original hypothesis space, with the constraints corresponding to some restrictions on values, some relationships between attractiveness of outcomes, generalizing statements, and so on. In that sense, while being very similar to the inference process in statistical machine learning, our inference process is closer to more general settings of inference via constraint optimization such as, for example, inference about probability distributions in Shore and Johnson (1980).

Figure 9. Model Selection Process. (The figure maps this discussion onto the metamodel: user statements in the language are encoded into a hypothesis space of candidate preference models, and a hypothesis is decoded from that space to answer queries.)

Figure 9 relates this discussion and the concepts involved to our metamodel. When we have a complete specification, the inference process leaves us with a single model, as shown in figure 10. However, in most practical situations, multiple models are consistent with the data, and we must specify how information is mapped into the model space. One of the basic choices we must make is whether a prior distribution is defined over the hypothesis space or not.

The Maximum-Likelihood Approach. Suppose we have a prior distribution over models. In many situations, this is quite natural. Consider, for example, an e-commerce site visited by many consumers. As data from multiple users accumulates, the site can form a reasonable probabilistic model of user preferences. This distribution can be used as a prior over the hypothesis space. Now, two standard methods for selecting model parameters can be used. In the maximum-likelihood method, we start with a prior distribution over the preference models and update this distribution using the user statements. Typically, the user statements simply rule out certain models, and the probability mass is divided among the remaining models. At this point, the most likely model consistent with the user statements is selected and used for the decision process.

For example, suppose that we are helping the user select a digital camera and the user cares only about two features: the number of megapixels M and its weight W. Suppose that our hypothesis class contains only additive value functions of the form c_m · M + c_w · W. We have a prior distribution p_{m,w} over c_m and c_w. Suppose that all we know is that the user prefers some camera with parameters (m_1, w_1) to another camera with (m_2, w_2). From this we can conclude that c_m m_1 + c_w w_1 > c_m m_2 + c_w w_2, that is,


$$c_m(m_1 - m_2) > c_w(w_2 - w_1).$$

The value function selected according to the maximum-likelihood principle will have the coefficients c_m*, c_w* such that

$$(c_m^*, c_w^*) = \operatorname*{argmax}_{c_m, c_w \,:\, c_m(m_1 - m_2) > c_w(w_2 - w_1)} p_{m,w}(c_m, c_w).$$

Let's consider some of the models we discussed earlier in light of the maximum-likelihood approach, starting with CP-nets. One of the special features of CP-nets is that there is always a default model—namely, the model that satisfies all and only the preference relations derived from the input statements. Thus, in CP-nets, the prior distribution could be any distribution, while the posterior distribution over partial orders is

$$p(\prec) \sim \begin{cases} 1, & \prec \text{ assumes all and only the information in the CP-net } N \\ 0, & \text{otherwise.} \end{cases}$$

Evidently, this is not a very interesting application of the maximum-likelihood principle.

Next, consider the structured value function compilation. Suppose that p(V) is our prior distribution over value functions. Then, the posterior distribution is obtained by removing elements that are inconsistent with the constraints and renormalizing (that is, doing standard probabilistic conditioning). The situation with the structureless value compilation approach is the same, except that it requires a concrete prior

$$p(V_{\mathbf{w}}) \sim e^{-\|\mathbf{w}\|^2},$$

where w is the vector of coefficients of a high-dimensional linear value function. The picture we have in both of the compilation approaches can be illustrated by figure 11. That is, we start with a large weighted hypothesis space (that is, a prior distribution over models), obtain information from the user that usually rules out some of the models, ending up with a subset of the original hypothesis space, from which we need to select a single model—the model with highest weight, possibly requiring some other selection criteria to break ties.
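As a sketch of maximum-likelihood selection, here is a discretized toy version of the camera example; the candidate coefficient pairs, the prior weights, and the observed cameras are all hypothetical:

```python
# Hypothetical discrete prior over additive models V = c_m * M + c_w * W.
prior = {(2.0, -0.01): 0.2, (0.2, -0.001): 0.3,
         (1.0, -0.1): 0.4, (0.1, -0.01): 0.1}

# Observation: the user preferred camera (m1, w1) to camera (m2, w2),
# here a camera with more megapixels despite its extra weight.
m1, w1, m2, w2 = 12, 500, 10, 300

# Rule out models inconsistent with c_m*m1 + c_w*w1 > c_m*m2 + c_w*w2,
# renormalize, and pick the maximum-likelihood model.
consistent = {c: p for c, p in prior.items()
              if c[0] * m1 + c[1] * w1 > c[0] * m2 + c[1] * w2}
Z = sum(consistent.values())
posterior = {c: p / Z for c, p in consistent.items()}
c_star = max(posterior, key=posterior.get)
print(c_star, posterior[c_star])  # (0.2, -0.001) with posterior 0.6
```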

The Bayesian Approach. The Bayesian approach is identical to the maximum-likelihood approach, except that instead of selecting the most probable model and answering queries using it, this method maintains the entire posterior probability distribution and uses all models to answer a query, weighted by their respective probability. Thus, considering the picture above, we work with the entire intermediate set of consistent models, giving different weight to different elements.

As a simple example, let O be a set of total orders—our hypothesis space—and let p : O → [0, 1] be the posterior distribution over O obtained after updating our prior with the user's input. Suppose that we want to answer the query "o1 ≺ o2?" We answer positively if p({≺ ∈ O : o1 ≺ o2}) > p({≺ ∈ O : o1 ⊀ o2}). That is, we check whether p places more weight on models in which o1 is less preferred than o2 than on models in which this is not the case.

Figure 10. Selection with Full Information. (The figure shows that, given full information, total orderings over outcomes collapse to a single ordering represented by a value function, and total orderings over lotteries collapse to a single ordering represented by a utility function.)

Although conceptually elegant, the Bayesian approach often requires the specification of another element to answer queries. The problem is that it is not always clear how to use the posterior probability to answer queries. For example, suppose we seek an optimal outcome. One approach we might take would be to assign to each outcome the sum of probabilities of all models in which this outcome is ranked highest, and return the outcome with the highest score. Thus, intuitively, the total value associated with an outcome is the probability that it is the best one.

Now, imagine that we have n possible outcomes, and our posterior model puts all of its weight on n – 1 equally likely different orderings. The top element in each of these orderings is different, but the second element is always the same outcome o. Thus, o's weight in this case would be 0, but intuitively it appears to be the best choice. To address this problem we can add a scoring function. The scoring function assigns a score to each possible answer within each model. The best answer is then the one with the highest expected score. In this example, we implicitly used a score of 1 for the outcome ranked highest and 0 for all other outcomes. We could have used a different score, such as assigning n – j + 1 to the outcome ranked jth, obtaining a different result.
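A sketch of this scoring idea, with a hypothetical posterior chosen to mirror the scenario just described (three equally likely orderings, each with a different top element but with o always second): under the top-1 score o never wins, while under the rank score n – j + 1 it does.

```python
# Hypothetical posterior over total orders of four outcomes (best first).
orderings = [("a", "o", "b", "c"), ("b", "o", "a", "c"), ("c", "o", "a", "b")]
posterior = [1 / 3, 1 / 3, 1 / 3]
outcomes = {"a", "b", "c", "o"}
n = len(outcomes)

def expected_score(outcome, score):
    """Expected score of answering `outcome`, averaged over the posterior."""
    return sum(p * score(order.index(outcome), n)
               for p, order in zip(posterior, orderings))

top1 = lambda j, n: 1 if j == 0 else 0   # score 1 only for the top item
rank = lambda j, n: n - j                # equals n - j + 1 with j 1-based

for score in (top1, rank):
    best = max(outcomes, key=lambda o: expected_score(o, score))
    print(best, expected_score(best, score))  # o wins only under rank scoring
```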

Let's consider the earlier example of the purchase of a digital camera. Recall, we have a distribution p_{m,w} over c_m and c_w, and a constraint due to our single observation that c_m(m_1 – m_2) > c_w(w_2 – w_1). Let's use m_C and w_C to denote the number of megapixels and the weight of some camera C. The score associated with camera C in a particular model with coefficients c_m, c_w is c_m m_C + c_w w_C. Thus, the expected score of a camera is

$$\alpha \int_{\{c_m, c_w \,:\, c_m(m_1 - m_2) > c_w(w_2 - w_1)\}} p_{m,w}(c_m, c_w)\,(c_m m_C + c_w w_C)\, dc_m\, dc_w,$$

where α is some normalizing constant. Thus, the best camera is the one that maximizes this value, that is,

$$\operatorname*{argmax}_{C \text{ is a camera}} \int_{\{c_m, c_w \,:\, c_m(m_1 - m_2) > c_w(w_2 - w_1)\}} p_{m,w}(c_m, c_w)\,(c_m m_C + c_w w_C)\, dc_m\, dc_w.$$

Figure 11. Model Selection in Value Function Compilation. (The figure shows a set of statements S = {s1, …, sm} interpreted as constraints that carve out a subset of the weighted hypothesis space of possible models, from which a single value function V is selected as the representation.)

A natural scoring function when our hypothesis class is utility functions is the utility of an outcome. Consider the case of a probability distribution over utility functions, and suppose we seek the optimal choice. Each choice is associated with a value in each model—its expected utility within that model. But we also have a distribution over models; thus, we can use the expected expected utility, that is, replacing

$$p \succeq q \iff \sum_{o \in \Omega} U(o)p(o) \ge \sum_{o \in \Omega} U(o)q(o)$$

with

$$p \succeq q \iff \sum_{U} p(U) \sum_{o \in \Omega} U(o)p(o) \ge \sum_{U} p(U) \sum_{o \in \Omega} U(o)q(o).$$

Regret-Based Scores. Although prior distributions over preference models are natural in some cases, they are difficult to obtain in others. Without them, we need to replace the notion of the "most probable model" by some notion of "best" model. One attractive way to do this is using the concept of regret.

Again, suppose that our target class is the class of value or utility functions. We have a set of possible candidates, as in the structure-based compilation approach, and we need to select one. Suppose that the user's true (and unknown) value function is V, and suppose that we select some function V*. How bad can this be? This depends on V, V*, and the query we are trying to answer. Suppose that we are trying to find the most preferred outcome. If both V and V* have the same most preferred outcome, then we're very happy. But suppose that o is the most preferred outcome according to V, and o* is the most preferred outcome according to V*. In that case, we're going to recommend o* to the user, instead of o. How unhappy would this make the user? Well, the user values o at V(o) and o* at V(o*), so we could quantify the user's regret at getting o* instead of o by V(o) – V(o*). Thus, we can also view this as the regret associated with selecting V* instead of V.

As an example, consider table 1, which lists four value functions for a domain with four possible outcomes. Suppose that V1 is the true value function, but we select V2. V2 assigns the highest value to o1. The true value of o1 is 1. The true optimal value is 4, assigned to o4. Our regret is thus 4 – 1 = 3.

When we select V* as the value function, we don't know what the true value function V is, so we can't compute V(o) – V(o*). A cautious estimate would consider the worst-case scenario. This is called the maximal regret. More formally, if 𝒱 is the set of candidate value functions, then

$$\mathrm{Regret}(V^* \mid \mathcal{V}) = \max_{V \in \mathcal{V}} \Big( V\big(\operatorname*{argmax}_{o} V(o)\big) - V\big(\operatorname*{argmax}_{o'} V^*(o')\big) \Big),$$

that is, the worst case across all choices of V of the difference between the value according to V of the best V outcome and the best V* outcome. For instance, considering our example above, we see that our maximal regret at having chosen V2 would be attained if the true value function is V1. Given the notion of regret, an obvious choice of a value function among the set of alternative possible value functions is the one that minimizes regret. Specifically, the minmax regret value function V_mr satisfies

$$V_{mr} = \operatorname*{argmin}_{V \in \mathcal{V}} \mathrm{Regret}(V \mid \mathcal{V}).$$

In our example, V3 and V4 minimize max regret because we can see that the maximal regret associated with V1, V2, V3, V4 is 3, 3, 2, 2, respectively. Thus, V4 would be a choice that minimizes regret. Notice that in our case, this is equivalent to saying that selecting outcome o3 is the choice that minimizes regret. Indeed, rather than talk about the regret associated with a value function, we can simply (and more generally) talk about the regret associated with a particular answer to the query at hand. This regret is a function of the set of candidate value functions.

Regret-based criteria can be used in different contexts. When a probability distribution over possible models is not given, we can use minmax regret to select a single model or select an optimal answer. When a probability distribution over models is available, we still need to associate a score with each possible answer to a query. The maximal regret can be used to score an answer in a model, and expected regret can be used to rate different answers. Thus, in our above example, if we associate a distribution of (0.5, 0.1, 0.1, 0.3) with {V1, V2, V3, V4}, we can compute the expected regret associated with each outcome. For example, the regret associated with the choice o1 is 3, 0, 2, 2 under V1, V2, V3, V4, respectively. Taking expectation, we obtain 2.3.
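These regret notions reduce to a few lines of code. The sketch below uses small hypothetical candidate value functions of our own (not Table 1's) and computes the max regret of each possible choice, the minmax-regret choice, and the expected regret under a belief:

```python
V = {  # hypothetical candidate value functions over three outcomes
    "V1": {"o1": 3, "o2": 1, "o3": 2},
    "V2": {"o1": 1, "o2": 3, "o3": 2},
}

def regret(choice, v):
    """Regret of recommending `choice` if the true value function is v."""
    return max(v.values()) - v[choice]

def max_regret(choice, candidates):
    return max(regret(choice, v) for v in candidates.values())

outcomes = ["o1", "o2", "o3"]
minmax = min(outcomes, key=lambda o: max_regret(o, V))
print({o: max_regret(o, V) for o in outcomes}, "-> choose", minmax)  # o3

belief = {"V1": 0.7, "V2": 0.3}
expected = {o: sum(p * regret(o, V[m]) for m, p in belief.items())
            for o in outcomes}
print(expected)
```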

Finally, a word of caution. Although minmax regret is an intuitive choice criterion, its semantics is problematic, as there is no clear meaning associated with measures of difference in value or utility, except for special cases of what are known as measurable value functions (Krantz et al. 1971). Thus, it is best thought of as an intuitive heuristic method.

      V1   V2   V3   V4
o1     1    4    2    2
o2     2    3    4    3
o3     3    3    3    4
o4     4    1    1    3

Table 1. Four Feasible Value Functions over Four Outcomes.

Preference Elicitation

Our discussion so far on preference specification took a user-driven perspective—the information flow was considered to be unidirectional: from the user to the system. Under the assumption of eventual acquisition of the complete model of user preferences, there is no real difference between purely user-driven and mixed-initiative preference specification. This is not the case, however, with partial model acquisition, and in what follows we consider preference specification in that setting more closely.

Clearly, if we can get only part of the model, then which part we get is quite important. In fact, to answer certain queries accurately, a partial model suffices, provided it has the right information. Preference elicitation is a process driven by the system that aims at improving the quality of the information it has about the user's preferences while decreasing the user's cognitive burden as much as possible.

The main benefit of the system-driven setting is that questions can be asked sequentially and conditionally. The questions we ask at each point take into account the answers we received earlier, as well as the query. There are a number of ways to approach this problem. When, as is often the case, the task is to recognize an optimal outcome of interest, k-item queries are often used. In this case, the user is shown k alternative choices and is asked either to select the best one among them or to rank them. When the user selects an item out of k, he or she is basically giving an answer to k – 1 explicit comparisons (that is, telling us that the chosen outcome is at least as preferred as the k – 1 other outcomes). Each such answer eliminates all orderings inconsistent with these answers.

A simple way of implementing this approach is by considering all possible total orders as possible hypotheses, and then ruling them out as information is obtained. For example, if the choice is among five items o1, …, o5 and k = 2, we simply ask the user to compare pairs of items. If, for example, the user indicates that o2 ≻ o3 and o1 ≻ o4, orderings of the form …, o4, …, o1, … and …, o3, …, o2, … are eliminated.
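The hypothesis-elimination view is easy to prototype over a deliberately tiny outcome set (real implementations avoid enumerating all orders explicitly):

```python
from itertools import permutations

outcomes = ["o1", "o2", "o3", "o4"]
# Start with every total order as a live hypothesis.
hypotheses = list(permutations(outcomes))

def consistent(order, better, worse):
    """True if `order` (best first) ranks `better` above `worse`."""
    return order.index(better) < order.index(worse)

# Answers from 2-item queries: o2 preferred to o3, and o1 preferred to o4.
for better, worse in [("o2", "o3"), ("o1", "o4")]:
    hypotheses = [h for h in hypotheses if consistent(h, better, worse)]

print(len(hypotheses))  # 6 of the original 24 orders survive
```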

However, this is a slow process that is likely to require many iterations. A more efficient process utilizes a smaller hypothesis space, for example, one that is based on some GAI decomposition. In that case, each query further constrains the possible values of the value function factors, and some generalization takes place; that is, one can rule out additional orderings. For example, suppose that a vacation choice depends on two attributes, location and facility, and I learned that the user prefers a spa in Barcelona to a spa in Madrid. Without further assumptions, only orderings that are inconsistent with this choice can be ruled out. However, if we assume that the value function is additive, or if we assume preferential independence, then we can also rule out orderings in which a rented apartment in Madrid is preferred to an otherwise similar apartment in Barcelona.

The aforementioned process appears somewhat ad hoc, and a more principled approach is desirable. In particular, even if we decide to use k-item queries, it is not clear how to select which query to ask. This problem can be approached from a decision-theoretic perspective. As we noted earlier, preference elicitation is a sequential process. We start with some target query and (possibly empty) knowledge of the agent's preferences, and we can ask the user various questions. Typically, we don't know ahead of time how many questions the user will agree to answer, and so we must have a response to the original query at each point.

To have a well-posed sequential decision problem, we need to model the state of the system and to have a valuation for alternative states as a start. The most natural way to model the system's state is in terms of some beliefs over a set of possible preference orderings. This takes us back to issues we considered in the context of partial models. In that setting, too, we needed to represent a set of possible models, possibly with some probability distribution defined over them, and to assign a valuation to each such set. The same answers we gave there can be used here, too. Thus, the system's state could be modeled as a set of possible orderings or value functions. We can further refine this model by introducing a distribution over preference models to capture our uncertainty. The value of a model depends on the query at hand, but again, considering the problem of optimal choice, we can use measures such as expected utility loss or maximal regret.

At this point, we can start using standard ideas from decision theory to analyze the problem of preference elicitation. Let's start with a context in which we can pose a single preference query. What is the most appropriate query? This problem is analogous to the problem of selecting an observation in a decision context. That is, we have some beliefs about the current state of the world and we need to make a decision. The value of this decision depends on the actual state of the world, yet before we make the decision, we can observe the value of some feature of the current state. Which feature should we observe?

The answer given in decision theory is that we should compute the value of information associated with each feature we could observe, and observe the feature with maximal value of information.


The value of information of a feature is a measure of how much "happier" we would be if we learned the value of this feature. Considering a setting where our beliefs are captured using a distribution p over the state of the world, the value of information of feature f is computed as follows. Let s denote some state of the world, and let V(s, a) denote the value of doing a at state s. Recall that for us s would denote some preference model, and a would be a possible response to the query at hand (for example, "what is the best item?"). Given the distribution p, V(p, a) is the expected value of a according to p, that is, E_p[V(s, a)]. Let a_p denote the action with the highest expected value, that is, a_p = argmax_a E_p[V(s, a)]. Now we can define the value of query q given belief state p. Suppose that q has k possible answers, r_1, …, r_k. Let p_i denote the probability p conditioned on r_i. Let a_i denote the best action according to p_i. The value of information of query q is

$$VI(q) = \sum_{i} p(r_i)\, V(p_i, a_i) - V(p, a_p).$$

Thus, the value of a query is some aggregate of the value of the world after we get an answer to that query. The value of the world, in our case, would be how happy we would be making the best decision in that world. For example, when we assume a probability distribution over possible preference models, and we evaluate such a distribution in terms of its minmax regret value, then the best query is the one that will minimize the expected minmax regret value of the resulting state.

We illustrate these ideas using the example we used earlier. Consider the four possible value functions over four objects as in table 1. Our current beliefs are: pr(V1) = 0.5, pr(V2) = 0.1, pr(V3) = 0.1, pr(V4) = 0.3. The score we give each outcome in each state is simply its value. In our current state (as captured by our current beliefs) the outcomes have the following expected values: 1.7, 2.6, 3.3, 3.1. Thus, the best choice is o3 with value 3.3. Suppose that we can ask a query of the form "what is the value of outcome o?" and let's see what the value is of asking "what is v(o4)?" There are 3 possible answers, with the following probabilities according to our current beliefs: 4 with probability 0.5, 1 with probability 0.2, and 3 with probability 0.3. If we get the answer "v(o4) = 4," then we conclude that V1 is the only possible value function. The best action at that point (that is, the best answer to the query) would be to choose o4 with a value of 4. If we learn that "v(o4) = 1," then our posterior belief will assign V2 and V3 probability of 0.5 each. At this point, the best action or answer would be o2 with an expected value of 3.5. Finally, if we learn that "v(o4) = 3," we conclude that the value function is V4, choose o3, and obtain a value of 4. We can now compute the expected value of our state following the observation of v(o4), that is, 0.5 ⋅ 4 + 0.2 ⋅ 3.5 + 0.3 ⋅ 4 = 3.9. Thus the value of the query "what is v(o4)?" is 3.9 – 3.3 = 0.6. You can verify for yourself that this query has the highest value of information (among queries of this structure).
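The arithmetic of this example is easy to reproduce. The sketch below encodes Table 1 and the beliefs (0.5, 0.1, 0.1, 0.3) and recovers the value 3.9 of the posterior state and the value of information 0.6 for the query "what is v(o4)?":

```python
# Table 1: value of each outcome under each candidate value function.
V = {"V1": {"o1": 1, "o2": 2, "o3": 3, "o4": 4},
     "V2": {"o1": 4, "o2": 3, "o3": 3, "o4": 1},
     "V3": {"o1": 2, "o2": 4, "o3": 3, "o4": 1},
     "V4": {"o1": 2, "o2": 3, "o3": 4, "o4": 3}}
belief = {"V1": 0.5, "V2": 0.1, "V3": 0.1, "V4": 0.3}

def best_value(p):
    """Expected value of the best single choice under belief p."""
    return max(sum(p[m] * V[m][o] for m in p) for o in V["V1"])

prior_value = best_value(belief)  # 3.3, achieved by o3

# Query "what is v(o4)?": group models by their answer, condition, re-score.
answers = {}
for m, p in belief.items():
    answers.setdefault(V[m]["o4"], {})[m] = p

posterior_value = 0.0
for ans, models in answers.items():
    z = sum(models.values())                      # probability of this answer
    cond = {m: p / z for m, p in models.items()}  # conditioned belief
    posterior_value += z * best_value(cond)

print(prior_value, posterior_value - prior_value)  # ≈ 3.3 and VOI ≈ 0.6
```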

In the aforementioned example we assumed a distribution over possible models. However, similar ideas (though perhaps less well founded) can be applied when we do not use a prior. In that case, a natural metric is minimal regret loss. Asking, as above, for the value of o4: if we learn that it is 4 or that it is 3, we know the precise value function. In that case, our regret is 0. If we learn that v(o4) = 1, then both V2 and V3 are possible. In that case, our minimal regret is 1 (obtained by suggesting item o2). Thus, recalling from an earlier discussion that our minimal regret when all four value functions were possible was 2, the minimal regret loss associated with the question "what is v(o4)?" is 2 – 1 = 1.

Using the idea of value of information, we can decide which query to ask, but this answer has two weaknesses. It deals with a single query, rather than a sequence of queries, and it ignores the cognitive effort required to answer this query. For both problems we can use well-known solutions from decision theory. The second problem is relatively easy to deal with, provided we have some way of assigning cost to each query. Then, rather than talk about value of information, we can talk about net value of information (NVI), where

NVI(q) = VI(q) – Cost(q).

For the first problem there are two standard solutions. The first is to act greedily, or myopically, and always pose the query with maximal NVI. This myopic behavior might not lead to an optimal sequence. For example, suppose we have two queries to ask. There may be one query that has large VI, say 1, and high cost, say 0.4, but following which most queries have little value, almost 0. On the other hand, there may be a query with moderate VI, say 0.5, but no cost, following which we can pose another similar query. However, it is computationally relatively cheap to compute the myopic selection. The other option is to try to compute the optimal sequential choice. Typically, this requires knowing the number of queries ahead of time (although this can be overcome), and while an optimal querying strategy results, the computational costs are exponential in the number of steps.

Finally, there is a very elegant model that captures all the above considerations nicely, although it comes with a heavy computational price tag. According to this model, the problem of preference elicitation is best modeled using a partially observable Markov decision process (POMDP). A POMDP has four key elements: a set S of possible states of the world, a set A of possible actions, a set Ω of possible observations, and a reward function R. Intuitively, we are modeling a decision maker that at each decision point can select an action from A. This action affects the state of the world, but that state is not directly observable to the agent. Instead, the agent can observe an element in Ω, which is a possibly noisy feature of the current state. Actions can have cost or generate rewards, and that cost or reward may depend on the state of the world in which they are executed. Because the agent cannot directly observe the state of the world, what it does observe induces a distribution over the state of the world, called its belief state.

The POMDP framework is really perfect for modeling the intricacies of the elicitation process. The possible states of the world correspond to possible preference models in our hypothesis space, for example, value functions, or value functions with a particular factorization. The actions can be divided into two types: queries, which model the queries we can ask the agent, and actions, which can be used to model the actions we will eventually take on behalf of that agent. These actions could model a final selection of an item of choice, or they could model more intricate choices that the system might face whose evaluation requires knowledge of the agent's value function. The observations correspond to responses the user might make to the system's queries. Finally, the reward function models the cost of some of the queries (capturing their cognitive cost, for example) and the possible value to the agent of actions such as selecting one particular item. Naturally, the value of an item depends on the agent's preferences, that is, on the current state of the world. We can also model the probability that the agent will no longer agree to answer questions by adjusting the state space. Once a POMDP model is formulated, there are standard techniques for solving it. However, exact solution methods are impractical, and approximate solution methods work well for only moderately large state spaces. Thus, to apply this idea, we will have to limit ourselves to a few hundred possible preference models, at least if we are to rely on the current state of the art.

Starting with an initial belief state, that is, a distribution over the preference models, the policy generated by solving the POMDP will tell us what query to ask next. In some precise sense, this is the best query we could ask at this point, taking all issues into consideration. Once the agent responds to this query, our distribution over models will be updated, and again, we can use the model to decide on our next query.
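While solving the full POMDP is beyond a snippet, the belief-update step it relies on is plain Bayesian conditioning; here is a sketch with a hypothetical noise model for user responses:

```python
def update_belief(belief, likelihood, response):
    """Condition the belief over preference models on an observed response.
    `likelihood[m](response)` = probability the user gives `response`
    when m is the true model (a hypothetical noise model)."""
    posterior = {m: p * likelihood[m](response) for m, p in belief.items()}
    z = sum(posterior.values())
    return {m: p / z for m, p in posterior.items()}

# Two candidate models; the user is asked "do you prefer o1 to o2?" and
# answers in accordance with the true model with probability 0.9.
belief = {"m_prefers_o1": 0.5, "m_prefers_o2": 0.5}
likelihood = {
    "m_prefers_o1": lambda r: 0.9 if r == "yes" else 0.1,
    "m_prefers_o2": lambda r: 0.1 if r == "yes" else 0.9,
}
print(update_belief(belief, likelihood, "yes"))
# {'m_prefers_o1': 0.9, 'm_prefers_o2': 0.1}
```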

A good preference elicitation strategy balances the expected effort required by the user with the expected quality of the final choice made by the system based on the user's responses. The POMDP model is the theoretically best motivated, but the other options can also lead to good strategies. Making these approaches practical is an ongoing research issue that involves many aspects of the problem. These start with modeling the initial belief state. This initial distribution is often obtained by learning the preferences of a large population of users. Next, there is the problem of representing this distribution compactly. In general, the number of possible preference models is very large; thus, some parametric model is desirable. Next, there is the issue of modeling the cognitive burden of each query. Finally, computational techniques for obtaining approximately optimal elicitation strategies are needed. For example, the preference elicitation POMDP has special structure and properties that could be exploited by the solution algorithm.

Further Reading

Preference elicitation and value of information were first studied in the areas of decision analysis and psychology, where they remain a topic of great importance (Tversky 1972; Keeney and Raiffa 1976; Howard and Matheson 1984; French 1986). This line of research has been extended in artificial intelligence, with a focus on automating the process of preference elicitation (Ha and Haddawy 1997, 1999; Torrens, Faltings, and Pu 2002; Pu and Faltings 2004; Faltings, Torrens, and Pu 2004; Braziunas and Boutilier 2005; Payne, Bettman, and Johnson 1993; Smith and McGinty 2003). Casting preference elicitation for policy optimization as a properly defined decision process was first suggested in the papers by Chajewska, Koller, and Parr (2000) and Chajewska et al. (1998), and then extended in Boutilier (2002), which suggested the POMDP-based formulation. Preference elicitation under the minmax-regret model selection criterion has been studied in Boutilier et al. (2006) and Braziunas and Boutilier (2007). Note that here our discussion is focused on handling user preferences in "single-agent" settings; for an overview of recent work on preference elicitation in multiagent settings such as (combinatorial) auctions, see Sandholm and Boutilier (2006).

Conclusion

The importance of preference handling techniques for many areas of artificial intelligence and decision support systems is apparent. This area poses conceptual, cognitive, computational, and representational challenges, and a large body of work on it has accumulated. But there is ample room for additional ideas and techniques. Indeed, aside from the classical work of von Neumann and Morgenstern and techniques in the area of conjoint measurement theory, which basically deal with eliciting additive value functions, most of the ideas described in this tutorial have yet to filter into real-world applications.

In this context, it is important to recall our three rough categories of applications. In the case of the online consumer world, we believe the technology for more sophisticated online sales assistants is ripe. Although it may not be universally applicable, we believe that there are markets in which more sophisticated online assistants would be highly appreciated. Naturally, many issues affect their acceptance beyond the sheer power of the preference elicitation technology they provide. In the case of application design, we believe that more work on tools, and much education, are required for these ideas to filter through. Finally, in the area of decision analysis, some techniques for better elicitation of GAI utility functions, as well as qualitative techniques for preparatory analysis, could play an important role.

Acknowledgements

We would like to thank Alexis Tsoukiàs for useful discussions and detailed comments. Partial support for Ronen Brafman was provided by the Paul Ivanier Center for Robotics Research and Production Management and by the Lynn and William Frankel Center for Computer Science. Partial support for Carmel Domshlak was provided by BSF Award 2004216 of the United States–Israel Binational Science Foundation. Both authors are supported by COST Action IC0602.

Notes

1. For readers familiar with decision theory, this term comes with some baggage, and so we will note that at this stage, we focus on choice under certainty.

2. One possibility is to elicit only a partial model and use it to answer queries. See Working with Partial Specifications.

3. Note that the term preference query denotes queries made to users regarding their preferences, while just queries denote the questions we wish to answer using the preference model.

4. When the CP-net is fully specified, that is, an ordering over the domain of each attribute is specified for every possible assignment to the parents, we know that a single most preferred assignment exists. When the CP-net is not fully specified, or when we have additional hard constraints limiting the feasible assignments, then a number of Pareto-optimal assignments may exist, that is, assignments o such that no other feasible assignment o′ satisfies o′ ≻ o.

5. Note that, technically, a concrete representation theorem would require some definition of consistency at the level of the input statements.

6. Presenting the computational machinery here is simply infeasible, and thus the reader is referred to Domshlak and Joachims (2007).

References

Agrawal, R., and Wimmers, E. L. 2000. A Framework for Expressing and Combining Preferences. In Proceedings of the ACM SIGMOD International Conference on Management of Data, 297–306. New York: Association for Computing Machinery.

Arrow, K. J., and Raynaud, H. 1986. Social Choice and Multicriterion Decision Making. Cambridge, MA: The MIT Press.

Bacchus, F., and Grove, A. 1995. Graphical Models for Preference and Utility. In Proceedings of the Eleventh Annual Conference on Uncertainty in Artificial Intelligence, 3–10. San Francisco: Morgan Kaufmann Publishers.

Bertsekas, D.; Nedic, A.; and Ozdaglar, A. 2003. Convex Analysis and Optimization. Nashua, NH: Athena Scientific.

Birkhoff, G. 1948. Lattice Theory, volume 25. Providence, RI: American Mathematical Society.

Bistarelli, S.; Fargier, H.; Montanari, U.; Rossi, F.; Schiex, T.; and Verfaillie, G. 1999. Semiring-Based CSPs and Valued CSPs: Frameworks, Properties, and Comparison. Constraints 4(3): 275–316.

Bistarelli, S.; Montanari, U.; and Rossi, F. 1997. Semiring-Based Constraint Solving and Optimization. Journal of the ACM 44(2): 201–236.

Boutilier, C. 1994. Toward a Logic for Qualitative Decision Theory. In Proceedings of the Third Conference on Knowledge Representation (KR–94), 75–86. San Francisco: Morgan Kaufmann Publishers.

Boutilier, C. 2002. A POMDP Formulation of Preference Elicitation Problems. In Proceedings of the Eighteenth National Conference on Artificial Intelligence, 239–246. Menlo Park, CA: AAAI Press.

Boutilier, C.; Bacchus, F.; and Brafman, R. 2001. UCP-Networks: A Directed Graphical Representation of Conditional Utilities. In Proceedings of the Seventeenth Annual Conference on Uncertainty in Artificial Intelligence, 56–64. San Francisco: Morgan Kaufmann Publishers.

Boutilier, C.; Brafman, R.; Domshlak, C.; Hoos, H.; and Poole, D. 2004a. CP-nets: A Tool for Representing and Reasoning about Conditional Ceteris Paribus Preference Statements. Journal of Artificial Intelligence Research 21: 135–191.

Boutilier, C.; Brafman, R.; Domshlak, C.; Hoos, H.; and Poole, D. 2004b. Preference-Based Constrained Optimization with CP-Nets. Computational Intelligence 20(2): 137–157. (Special Issue on Preferences in AI and CP.)

Boutilier, C.; Patrascu, R.; Poupart, P.; and Schuurmans, D. 2006. Constraint-Based Optimization and Utility Elicitation Using the Minimax Decision Criterion. Artificial Intelligence 170(8–9): 686–713.

Bouyssou, D., and Pirlot, M. 2005. Following the Traces: An Introduction to Conjoint Measurement without Transitivity and Additivity. European Journal of Operational Research 163(2): 287–337.

Bouyssou, D.; Marchant, T.; Pirlot, M.; Tsoukias, A.; and Vincke, P. 2006. Evaluation and Decision Models with Multiple Criteria: Stepping Stones for the Analyst. Berlin: Springer.

Brafman, R. I., and Domshlak, C. 2008. Graphically Structured Value-Function Compilation. Artificial Intelligence 172(2–3).

Brafman, R. I.; Domshlak, C.; and Shimony, S. E. 2006. On Graphical Modeling of Preference and Importance. Journal of Artificial Intelligence Research 25: 389–424.

Braziunas, D., and Boutilier, C. 2005. Local Utility Elicitation in GAI Models. In Proceedings of the Twenty-first Conference on Uncertainty in Artificial Intelligence, 42–49. Arlington, VA: AUAI Press.

Braziunas, D., and Boutilier, C. 2007. Minimax Regret Based Elicitation of Generalized Additive Utilities. In Proceedings of the Twenty-third Conference on Uncertainty in Artificial Intelligence, 25–32. Arlington, VA: AUAI Press.

Brewka, G.; Niemela, I.; and Truszczynski, M. 2003. Answer Set Optimization. In Proceedings of the Eighteenth International Joint Conference on Artificial Intelligence. San Francisco: Morgan Kaufmann Publishers.

Burges, C. J. C.; Shaked, T.; Renshaw, E.; Lazier, A.; Deeds, M.; Hamilton, N.; and Hullender, G. N. 2005. Learning to Rank Using Gradient Descent. In Proceedings of the International Conference on Machine Learning, 89–96. New York: Association for Computing Machinery.

Chajewska, U.; Getoor, L.; Norman, J.; and Shahar, Y. 1998. Utility Elicitation as a Classification Problem. In Proceedings of the Fourteenth Annual Conference on Uncertainty in Artificial Intelligence, 79–88. San Francisco: Morgan Kaufmann Publishers.

Chajewska, U.; Koller, D.; and Parr, R. 2000. Making Rational Decisions Using Adaptive Utility Elicitation. In Proceedings of the Seventeenth National Conference on Artificial Intelligence, 363–369. Menlo Park, CA: AAAI Press.

Chen, L., and Pu, P. 2007. Preference-Based Organization Interface: Aiding User Critiques in Recommender Systems. In Proceedings of the Eleventh International Conference on User Modeling, 77–86. Berlin: Springer-Verlag.

Chomicki, J. 2002. Querying with Intrinsic Preferences. In Proceedings of the Eighth International Conference on Extending Database Technology, 34–51, LNCS 2287. Berlin: Springer.

Crammer, K., and Singer, Y. 2003. A Family of Additive Online Algorithms for Category Ranking. Journal of Machine Learning Research 3: 1025–1058.

Davey, B. A., and Priestley, H. A. 2002. Introduction to Lattices and Order. Cambridge, UK: Cambridge University Press.

Debreu, G. 1954. Representation of a Preference Ordering by a Numerical Function. In Decision Processes, ed. R. Thrall, C. Coombs, and R. Davis, 159–166. New York: John Wiley.

Dechter, R. 2003. Constraint Processing. San Francisco: Morgan Kaufmann.

Domshlak, C., and Joachims, T. 2007. Efficient and Nonparametric Reasoning over User Preferences. User Modeling and User-Adapted Interaction 17(1–2): 41–69. (Special Issue on Statistical and Probabilistic Methods for User Modeling.)

Doyle, J. 2004. Prospects for Preferences. Computational Intelligence 20(2): 111–136.

Doyle, J., and Thomason, R. H. 1999. Background to Qualitative Decision Theory. AI Magazine 20(2): 55–68.

Doyle, J., and Wellman, M. 1994. Representing Preferences as Ceteris Paribus Comparatives. In Decision-Theoretic Planning: Papers from the AAAI Spring Symposium, 69–75, AAAI Technical Report SS-94-06. Menlo Park, CA: AAAI Press.

Engel, Y., and Wellman, M. P. 2006. CUI Networks: A Graphical Representation for Conditional Utility Independence. In Proceedings of the Twenty-First National Conference on Artificial Intelligence. Menlo Park, CA: AAAI Press.

Faltings, B.; Torrens, M.; and Pu, P. 2004. Solution Generation with Qualitative Models of Preferences. International Journal of Computational Intelligence and Applications 7(2): 246–264.

Fishburn, P. C. 1969. Utility Theory for Decision Making. New York: John Wiley & Sons.

Fishburn, P. 1974. Lexicographic Orders, Utilities, and Decision Rules: A Survey. Management Science 20(11): 1442–1471.

Fishburn, P. C. 1982. The Foundations of Expected Utility. Dordrecht, Holland: Reidel.

Fishburn, P. 1999. Preference Structures and Their Numerical Representations. Theoretical Computer Science 217(2): 359–383.

French, S. 1986. Decision Theory. New York: Halsted Press.

Gajos, K., and Weld, D. 2005. Preference Elicitation for Interface Optimization. In Proceedings of the Eighteenth Annual ACM Symposium on User Interface Software and Technology (UIST). New York: Association for Computing Machinery.

Goldsmith, J.; Lang, J.; Truszczynski, M.; and Wilson, N. 2005. The Computational Complexity of Dominance and Consistency in CP-Nets. In Proceedings of the Nineteenth International Joint Conference on Artificial Intelligence, 144–149. Menlo Park, CA: AAAI Press.

Gonzales, C., and Perny, P. 2004. GAI Networks for Utility Elicitation. In Proceedings of the Ninth International Conference on the Principles of Knowledge Representation and Reasoning, 224–234. Menlo Park, CA: AAAI Press.

Green, P. E., and Rao, V. R. 1971. Conjoint Measurement for Quantifying Judgmental Data. Journal of Marketing Research 8: 355–363.

Green, P. E.; Krieger, A. M.; and Wind, Y. 2001. Thirty Years of Conjoint Analysis: Reflections and Prospects. Interfaces 31(3): 56–73.

Ha, V., and Haddawy, P. 1997. Problem-Focused Incremental Elicitation of Multiattribute Utility Models. In Proceedings of the Thirteenth Annual Conference on Uncertainty in Artificial Intelligence, 215–222. San Francisco: Morgan Kaufmann Publishers.

Ha, V., and Haddawy, P. 1999. A Hybrid Approach to Reasoning with Partially Elicited Preference Models. In Proceedings of the Fifteenth Annual Conference on Uncertainty in Artificial Intelligence. San Francisco: Morgan Kaufmann Publishers.

Hallden, S. 1957. On the Logic of Better. Lund, Sweden: Gleerup.

Hansson, S. O. 2001a. Preference Logic. In Handbook of Philosophical Logic, volume 4, second edition, ed. D. M. Gabbay and F. Guenthner, 319–394. Dordrecht, Holland: Kluwer.

Hansson, S. O. 2001b. The Structure of Values and Norms. Cambridge, UK: Cambridge University Press.

Howard, R. A., and Matheson, J. E. 1984. Readings on the Principles and Applications of Decision Analysis. Menlo Park, CA: Strategic Decision Group.

Kahneman, D., and Tversky, A. 1979. Prospect Theory: An Analysis of Decisions under Risk. Econometrica 47(2): 313–327.

Kahneman, D., and Tversky, A. 1984. Choices, Values, and Frames. American Psychologist 39: 341–350.

Keeney, R. L., and Raiffa, H. 1976. Decisions with Multiple Objectives: Preferences and Value Tradeoffs. New York: Wiley.

Kießling, W. 2002. Foundations of Preferences in Database Systems. In Proceedings of the 28th International Conference on Very Large Data Bases (VLDB). San Francisco: Morgan Kaufmann Publishers.

Kimeldorf, G., and Wahba, G. 1971. Some Results on Tchebycheffian Spline Functions. Journal of Mathematical Analysis and Applications 33: 82–95.

Krantz, D. H.; Luce, R. D.; Suppes, P.; and Tversky, A. 1971. Foundations of Measurement. New York: Academic Press.

Kraus, S.; Lehmann, D.; and Magidor, M. 1990. Nonmonotonic Reasoning, Preferential Models, and Cumulative Logics. Artificial Intelligence 44: 167–207.

Kreps, D. M. 1988. Notes on the Theory of Choice. Boulder, CO: Westview Press.

La Mura, P., and Shoham, Y. 1999. Expected Utility Networks. In Proceedings of the Fifteenth Annual Conference on Uncertainty in Artificial Intelligence, 367–373. San Francisco: Morgan Kaufmann Publishers.

Lauritzen, S. L., and Spiegelhalter, D. J. 1988. Local Computations with Probabilities on Graphical Structures and Their Application to Expert Systems (with Discussion). Journal of the Royal Statistical Society, Series B 50.

McCarthy, J. 1980. Circumscription: A Form of Nonmonotonic Reasoning. Artificial Intelligence 13(1–2): 27–39.

McGeachie, M., and Doyle, J. 2004. Utility Functions for Ceteris Paribus Preferences. Computational Intelligence 20(2): 158–217. (Special Issue on Preferences in AI.)

Muller, K. R.; Mika, S.; Ratsch, G.; Tsuda, K.; and Scholkopf, B. 2001. An Introduction to Kernel-Based Learning Algorithms. IEEE Transactions on Neural Networks 12(2): 181–201.

Oztürk, M.; Tsoukiàs, A.; and Vincke, P. 2005. Preference Modeling. In Multiple Criteria Decision Analysis: State of the Art Surveys, ed. J. Figueira, S. Greco, and M. Ehrgott, 27–72. Berlin: Springer Verlag.

Payne, J.; Bettman, J.; and Johnson, E. 1993. The Adaptive Decision Maker. Cambridge, UK: Cambridge University Press.

Pearl, J. 1988. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. San Mateo, CA: Morgan Kaufmann.

Pu, P., and Faltings, B. 2004. Decision Tradeoff Using Example Critiquing and Constraint Programming. Constraints: An International Journal 9(4): 289–310.

Radlinski, F., and Joachims, T. 2007. Active Exploration for Learning Rankings from Clickthrough Data. In Proceedings of the Thirteenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 570–579. New York: Association for Computing Machinery.

Reiter, R. 1980. A Logic for Default Reasoning. Artificial Intelligence 13: 81–132.

Rossi, F.; Venable, K. B.; and Walsh, T. 2004. mCP Nets: Representing and Reasoning with Preferences of Multiple Agents. In Proceedings of the Nineteenth National Conference on Artificial Intelligence, 729–734. Menlo Park, CA: AAAI Press.

Sandholm, T., and Boutilier, C. 2006. Preference Elicitation in Combinatorial Auctions. In Combinatorial Auctions, ed. P. Cramton, Y. Shoham, and R. Steinberg, chapter 10, 233–264. Cambridge, MA: MIT Press.

Savage, L. 1972. The Foundations of Statistics, 2nd ed. New York: Dover.

Shoham, Y. 1987. A Semantics Approach to Nonmonotonic Logics. In Proceedings of the Tenth International Joint Conference on Artificial Intelligence, 388–392. Los Altos, CA: William Kaufmann, Inc.

Shoham, Y. 1997. A Symmetric View of Probabilities and Utilities. In Proceedings of the Fifteenth International Joint Conference on Artificial Intelligence, 1324–1329. San Francisco: Morgan Kaufmann Publishers.

Shore, J. E., and Johnson, R. W. 1980. Axiomatic Derivation of the Principle of Maximum Entropy and the Principle of Minimum Cross-Entropy. IEEE Transactions on Information Theory 26(1): 26–37.

Smith, B., and McGinty, L. 2003. The Power of Suggestion. In Proceedings of the Eighteenth International Joint Conference on Artificial Intelligence, 127–132. San Francisco: Morgan Kaufmann Publishers.

Tan, S. W., and Pearl, J. 1994. Qualitative Decision Theory. In Proceedings of the Twelfth National Conference on Artificial Intelligence, 928–933. Menlo Park, CA: AAAI Press.

Torrens, M.; Faltings, B.; and Pu, P. 2002. SmartClients: Constraint Satisfaction as a Paradigm for Scaleable Intelligent Information Systems. Constraints 7(1): 49–69.

Tversky, A. 1967. A General Theory of Polynomial Conjoint Measurement. Journal of Mathematical Psychology 4(1): 1–20.

Tversky, A. 1969. Intransitivity of Preferences. Psychological Review 76: 31–48.

Tversky, A. 1972. Elimination by Aspects: A Theory of Choice. Psychological Review 79: 281–299.

von Neumann, J., and Morgenstern, O. 1947. Theory of Games and Economic Behavior, 2nd ed. Princeton, NJ: Princeton University Press.

von Wright, G. H. 1963. The Logic of Preference: An Essay. Edinburgh, Scotland: Edinburgh University Press.

Wald, A. 1950. Statistical Decision Functions. New York: John Wiley.

Wilson, N. 2004. Extending CP-Nets with Stronger Conditional Preference Statements. In Proceedings of the Nineteenth National Conference on Artificial Intelligence, 735–741. Menlo Park, CA: AAAI Press.

Ronen Brafman is an associate professor in the Department of Computer Science at Ben-Gurion University in Israel. He received his Ph.D. in computer science from Stanford University in 1996 and was a postdoctoral fellow at the University of British Columbia. His research focuses on various aspects of decision making and decision support, including preference handling, classical and decision-theoretic planning, and reinforcement learning. He serves as an associate editor for the Journal of AI Research and is a member of the editorial board of the Artificial Intelligence Journal.

Carmel Domshlak is a senior lecturer at the Faculty of Industrial Engineering and Management at the Technion. His research interests are in modeling and reasoning about preferences, automated planning and reasoning about action, and knowledge-based information systems. He received his Ph.D. in computer science from Ben-Gurion University in 2002 for his work on preference representation models and was a postdoctoral fellow at the Intelligent Information Systems Institute at Cornell University. He is a member of the editorial board of the Journal of AI Research.
