The AIPS-98 planning competition


In 1998, the international planning community was invited to take part in the first planning competition, hosted by the Artificial Intelligence Planning Systems Conference, to provide a new impetus for empirical evaluation and direct comparison of automatic domain-independent planning systems. This article describes the systems that competed in the event, examines the results, and considers some of the implications for the future of the field.

The International Artificial Intelligence Planning Systems Conference (AIPS-98), held at Carnegie Mellon University in Pittsburgh in June 1998, played host to the first world planning competition. Competitors were invited to come and compete on a collection of domains and associated problems, sight unseen, using whatever planning technology they wanted. Tracks were offered for STRIPS, ADL, and HTN planning, but in the event, only STRIPS and ADL were entered. Indeed, ADL saw only two competitors: IPP (Koehler et al. 1997) and SGP (Anderson and Weld 1998). IPP gave a convincingly superior performance over SGP, the only Lisp-based planner in the competition, in a single-round playoff. The STRIPS track originally attracted some 9 or 10 declarations of intent to take part. This article gives an account of the competition as seen by the competitors who actually arrived in Pittsburgh and took part in the two tracks.

Competitions as a way to evaluate and promote progress in various fields have precedents, such as the Message Understanding Competition (MUC) series, sponsored by the Defense Advanced Research Projects Agency (DARPA); the Text Retrieval Competition (TREC) series, sponsored by the National Institute of Standards and Technology and DARPA; and the Turing Competition. These competitions have stimulated work, but they also represent a serious investment in effort for the competition organizers and competitors. It is a formidable task to create a collection of tasks that is realistically within the reach of the existing technology, that represents an adequate challenge, and that points the way for the field to develop. Administrative problems represent a huge overhead to this task: A common language must be developed that allows problems to be specified and results evaluated, scoring mechanisms must be determined, and the environment must be selected and competitors forewarned. Tribute should be paid to Drew McDermott for the role he played in almost single-handedly executing all these tasks, with support from the competition committee.

For the competitors, the competition represents a challenge to the robustness of their software and demands work in meeting the specifications for both input and output formats, while they continue to develop and enhance the basic functions of their systems. The development of PDDL (McDermott and AIPS 1998) as the common language for the competition problem specifications was an important step in the progress of the competition. A challenge to the competitors was to

Articles

SUMMER 2000 13

The AIPS-98 Planning Competition

Edited by Derek Long, with Henry Kautz and Bart Selman (BLACKBOX team); Blai Bonet and Hector Geffner (HSP team); Jana Koehler, Michael Brenner, Joerg Hoffmann, and Frank Rittinger (IPP team); Corin R. Anderson, Daniel S. Weld, and David E. Smith (SGP team); and Maria Fox and Derek Long (STAN team)

Copyright © 2000, American Association for Artificial Intelligence. All rights reserved. 0738-4602-2000 / $2.00

AI Magazine Volume 21 Number 2 (2000) (© AAAI)

adapt to the many minor changes in this language as it steadily stabilized. Another important problem was anticipating the demands of the competition domains. All the planners that eventually competed are domain-independent planners that require little or no manual guidance in selecting run-time behavior; however, the performance of all the planners can be affected dramatically by the design of the encoding of the domain and problem specifications. The criteria by which success was to be judged were also volatile: Trade-offs between the planning time and the optimality of the plan produced were a controversial balancing act. Optimality was measured purely by the number of steps in the plan for the STRIPS track; so, planners producing optimal parallel plans might well find themselves significantly outperformed by planners concentrating on sequential plan optimality. Furthermore, the selection of domains to be used in the competition was hard. All the competitors had some collection of favored domains that showcased the characteristics of their planners. In principle, all competitors had the opportunity to propose domains for use, but the constraints on the time available for the competition made it impossible to use all of them.

These various challenges thinned the field so that the competition eventually hosted four STRIPS planners and two ADL planners (IPP took part in both tracks). It was apparent that several others would have liked to participate but were unable to meet the input and output criteria within the time scales that were imposed by the organization of the competition.

This article presents the competitors' impressions of the competition and a summary of the planning technology with which they competed. The Competing Planners includes a series of short subsections that highlight features of the individual planners that took part and briefly explores the issues that have led to the differences in the systems that competed. Review of Results examines the results of the competition, analyzing the issues they raise and pointing out some of the questions that they leave unanswered. Finally, The Future considers the future of the competition and the role it might play in pushing forward planning research.

The Competing Planners

The competition highlighted several facets of the current state of the planning research field. First, even within the last couple of years, the technology has advanced dramatically in terms of the size of problems in standard benchmark domains that planners can realistically be expected to handle. All the planners in the competition were solving some instances of problems in times measured in milliseconds, including nontrivial instances. Second, the collection of planners that are sufficiently robust for release into the general research community, requiring no additional special-purpose domain encoding beyond STRIPS or ADL, and that can tackle significant instances in reasonable time is still small. The qualifications are important, however, because there are clearly many more planners being actively worked on that are either still insufficiently robust to be entered into a competition event or require too much careful encoding of their domains to be able to compete automatically across large numbers of problems, sight unseen. It is hoped that this situation changes as the competition is repeated in future years, with a widening of the field and a broadening of the range of technology being exposed. This latter point is particularly poignant because at AIPS-98, three of the five planners involved were directly built on GRAPHPLAN (Blum and Furst 1997) (IPP, SGP, and STAN), one exploited GRAPHPLAN technology (BLACKBOX), and only one was independent of GRAPHPLAN in every way (HSP). The role of GRAPHPLAN in shaping the domain-independent planning research field at the time of the competition is so significant that a short summary of the features of the system is included in this section.

Interestingly, all five planners used a full instantiation of actions as a prelude to search, explaining in part the difficulties all the planners experienced with problems that involved huge numbers of ground-operator instances. This point is discussed further in the next section. It is unfortunate that no constraint-posting planner, avoiding the initial grounding of action schemas, took part to compare performance. Some informal experiments with PRODIGY (Fink and Veloso 1996) during the competition suggested that it might have performed significantly better than the other planners in some domains, but it was completely outperformed in others. This pattern hints at one of the most interesting issues that the competition raised (discussed in more detail in the next section): Different planning technologies, despite being domain independent, are actually highly sensitive to both domain and problem instance in determining actual performance.

This section proceeds with a summary of GRAPHPLAN before moving on to an examination of each of the competing systems in turn. The organization of this sequence reflects the



relationship between each system and GRAPHPLAN. IPP, SGP, and STAN are essentially extensions of GRAPHPLAN, each exploring different directions. BLACKBOX exploits the core GRAPHPLAN data structure but carries out its search for a plan in an entirely different way. Finally, HSP is an entirely different approach to planning.

GRAPHPLAN

GRAPHPLAN is an exceptionally influential planning system devised by Blum and Furst (1997). Since it was first developed, there have been several insightful papers that seek to explore its relationship with other planning approaches and with constraint solving (Kambhampati 1999; Kambhampati, Lambrecht, and Parker 1997). There has also been considerable activity devoted to extending and improving the underlying algorithms.

GRAPHPLAN constructs plans in three phases: First, the action schemas are fully instantiated to identify the complete set of ground actions for the planning problem. Second, a fundamental data structure, called the plan graph, is constructed. Third, the plan graph is searched in an attempt to identify a plan structure within it. If this process fails, the graph is extended, and a new search is initiated. This process iterates until a plan is found or certain terminating conditions are encountered, proving that there is no plan. The interleaved graph-construction and graph-search processes amount to an iterated depth-first search for a plan, ensuring a systematic search guaranteed to find the shortest plan structure.
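As a concrete illustration, the interleaved construction-and-search loop can be sketched as follows. This is a deliberately simplified model, not GRAPHPLAN itself: mutex reasoning and delete effects are omitted, actions are hypothetical (name, preconditions, adds) triples of sets, and plan extraction simply picks one achiever per goal.

```python
def expand(facts, actions):
    """One graph extension: facts persist (noops), and every action whose
    preconditions hold contributes its positive effects."""
    new = set(facts)
    for _name, pre, add in actions:
        if pre <= facts:
            new |= add
    return frozenset(new)

def extract_plan(layers, goals, actions):
    """Goal-directed backward pass: at each layer, find an achiever for
    every goal not already available one layer earlier."""
    plan, goals = [], set(goals)
    for i in range(len(layers) - 1, 0, -1):
        for g in sorted(goals - layers[i - 1]):
            name, pre, _add = next(a for a in actions
                                   if g in a[2] and a[1] <= layers[i - 1])
            plan.append(name)
            goals.discard(g)
            goals |= pre        # the achiever's preconditions become goals
    return list(reversed(plan))

def plan_search(init, goals, actions, max_layers=20):
    """Iterate: try to extract a plan, otherwise extend the graph; a fix
    point reached without the goals proves the problem unsolvable."""
    layers = [frozenset(init)]
    while len(layers) <= max_layers:
        if set(goals) <= layers[-1]:
            return extract_plan(layers, goals, actions)
        nxt = expand(layers[-1], actions)
        if nxt == layers[-1]:   # graph has levelled off
            return None
        layers.append(nxt)
    return None

ACTIONS = [("boil", frozenset({"pot"}), frozenset({"water"})),
           ("brew", frozenset({"water", "leaves"}), frozenset({"tea"}))]
print(plan_search({"pot", "leaves"}, {"tea"}, ACTIONS))  # ['boil', 'brew']
```

In the real algorithm, of course, the backward pass must respect mutex relations and may backtrack; the sketch only shows the layer-by-layer control structure.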

The key to understanding GRAPHPLAN, then, is to understand the plan-graph data structure. This structure is built as a series of layers of nodes. Nodes in alternate layers represent ground facts and ground actions. Arcs are created between nodes that represent important relationships. Fact nodes are connected to action nodes in the immediately succeeding layer when they represent preconditions of the actions. Actions are connected to the facts in the immediately succeeding layer that they achieve as positive effects. They are also connected (by differently labeled edges) to their negative effects (facts they delete). GRAPHPLAN also inserts special actions, called noops, at each action layer, which have the role of causing facts at the preceding layer to persist into the next layer. Using a noop to achieve a goal is equivalent to achieving the goal early and allowing it to persist while other actions occur. Finally, and most significantly, pairs of actions in the same layer, and pairs of facts in the same layer, are connected if they are mutually inconsistent (or mutually exclusive, "mutex"). Two actions are considered mutually exclusive if the negative effects of one intersect with the preconditions or positive effects of the other or if they have mutually exclusive preconditions. Facts are mutually exclusive if all pairs of actions that achieve the two facts are themselves mutually exclusive. The graph is seeded with an initial layer representing the facts true in the initial state. The graph is constructed until a layer of facts includes all the plan goals as a collection of nonpairwise mutually exclusive nodes in a layer.
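These two rules can be stated compactly in code. The sketch below assumes a hypothetical representation of ground actions as (preconditions, adds, deletes) triples of sets and a set of already-known mutex fact pairs; it illustrates only the rules just described, not GRAPHPLAN's actual data structures.

```python
def actions_mutex(a, b, fact_mutex):
    """Action mutex test: one action deletes the other's preconditions or
    positive effects, or some pair of their preconditions is mutex."""
    pre_a, add_a, del_a = a
    pre_b, add_b, del_b = b
    if del_a & (pre_b | add_b) or del_b & (pre_a | add_a):
        return True
    return any(frozenset({p, q}) in fact_mutex
               for p in pre_a for q in pre_b)

def facts_mutex(p, q, achievers, fact_mutex):
    """Fact mutex test: every pair of achievers of the two facts (noops
    included among the achievers) is itself mutually exclusive."""
    return all(actions_mutex(a, b, fact_mutex)
               for a in achievers[p] for b in achievers[q])
```

For example, an action that deletes fact x is mutex with any action that requires or adds x in the same layer.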

Search is goal directed, starting with the last layer of the graph. The goals are identified in the final layer, and a collection of nonpairwise mutually exclusive actions in the preceding action layer is found that includes all the goals as positive effects. The preconditions of these actions are then used as the seed for a recursive search from the immediately preceding layer of facts. The process successfully identifies a plan if the search reaches a collection of facts in the initial layer. It should be noted that because more than one action can be used in a given layer of the graph, the plan can include parallel actions. Therefore, GRAPHPLAN produces parallel-optimal plans that are not always sequentially optimal. If the search fails to produce a plan (including the recursively initiated searches in preceding layers), the collection of goals that cannot be solved is recorded as unsolvable at the corresponding layer. During subsequent searches, goals are checked against previously recorded unsolvable sets, and if they contain such a set as a subset, then no search is initiated, but a failure is reported immediately. This approach can save a huge amount of wasted search.
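The memoization of failed goal sets is the part of this machinery that is easiest to isolate. The scaffolding below is a hypothetical sketch: `solve_at_layer` stands in for the real recursive search, and `nogoods` maps each layer to its recorded unsolvable goal sets.

```python
def solvable(goals, layer, nogoods, solve_at_layer):
    """Consult the memo before searching a layer; record new failures so
    that any superset of a failed goal set is rejected without search."""
    gs = frozenset(goals)
    if any(bad <= gs for bad in nogoods.setdefault(layer, set())):
        return False                  # contains a known-unsolvable subset
    ok = solve_at_layer(gs, layer)
    if not ok:
        nogoods[layer].add(gs)        # remember the failure
    return ok

nogoods = {2: {frozenset({"a"})}}
print(solvable({"a", "b"}, 2, nogoods, lambda gs, i: True))  # False: memo hit
```

Note that the memo hit short-circuits before the search callback is ever invoked, which is exactly the saving the text describes.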

If the search fails to find a plan for a top-level goal set, then the graph is extended by another action and fact layer, and the search recommences. It has been proved (Blum and Furst 1997) that certain signals in the data structure will infallibly demonstrate a problem to be unsolvable, so that the algorithm is not only sound and complete but also terminating.

GRAPHPLAN has had a dramatic impact on the planning community, effectively displacing the earlier nonlinear planning technology as the foundation of fast and effective planners. Nevertheless, it suffers from significant problems. The need to instantiate actions, the storage of the plan graph itself, and the records of unsolvable goal sets together represent a formidable demand on memory. It is not uncommon to require over 100 megabytes of random-access memory (RAM) to solve significant problems, and memory constraints represent one of the barriers to solving much larger planning problems with this technology. Further, the backward search through ground actions forces the planner to make decisions about parts of a plan long before the information is available to determine how to make the decision intelligently. These issues, among others, remain to be resolved if GRAPHPLAN is to have longevity as a planning solution.

IPP

The IPP (1998) planning system extends the GRAPHPLAN system in several significant ways. In particular, IPP was developed to demonstrate the extension of the GRAPHPLAN approach to allow it to handle a larger subset of ADL (Pednault 1989). These extensions include the use of negative preconditions and conditional effects, allowing IPP to enter the ADL track of the competition. To help in handling the ADL extensions in PDDL, IPP uses a preprocessing phase based on the proposal first documented by Gazen and Knoblock (1997) that translates PDDL first-order formulas into the disjunctive normal form required by IPP but, in contrast to that approach, does not expand out conditional action effects. Instead, IPP uses a special extension of the GRAPHPLAN machinery to handle conditional effects directly. In addition, IPP represented a first complete reimplementation of GRAPHPLAN and incorporated many major improvements in the underlying data structures. These efforts give IPP a dramatically better time and space performance than the original GRAPHPLAN implementation. A particular improvement is that IPP exploits a new data structure to support the recording of goal sets for which search has failed (Hoffmann and Koehler 1999). This data structure makes it possible for a full subset test to be used efficiently when checking to see if a new goal set includes a subset already recorded as unsolvable, allowing IPP to make more effective use of these records.

A further innovation introduced in the IPP system is the use of a filter to remove facts and actions from the plan graph that are considered irrelevant to the planning problem. This process has the potential to improve the performance of the planning system dramatically, reducing both storage requirements and search time. This system, RIFO (Nebel, Dimopoulos, and Koehler 1997), is heuristic and can be controlled to prune the graph more or less aggressively. The competition version of IPP introduces a control harness that uses various parameters measured from the domain and problem instance to determine how aggressively to apply the RIFO mechanism.

The RIFO Metastrategy

The competition version of IPP exploits a metastrategy, "wrapped" around the planner (figure 1), that is activated when a planning problem exceeds two threshold parameters: (1) the number of objects and (2) the number of ground actions. These parameters can be set by hand or can be preset. The strategy configures the RIFO subsystem within IPP.

[Figure 1. IPP Architecture. Domain and problem inputs pass through a PDDL parser and PDDL preprocessing, then the RIFO metastrategy, into IPP 3.3, which produces the plan as PDDL output.]

RIFO was developed to cope with the problem of irrelevant information in plan graphs. Usually, when building a plan graph, a GRAPHPLAN-based planner will add a great many facts and actions that are totally irrelevant to solving the planning problem at hand. By building a tree of possible actions backwards from the goal set, ignoring negative effects of actions, RIFO can identify what information is or might be relevant to a problem. This identification allows the system to exclude irrelevant information. The pruning can be applied at various degrees of strength but at the price of increasing danger of incompleteness: Overaggressive pruning can lead to too many actions being ruled out, making the planning problem unsolvable. The metastrategy runs RIFO on the problem input using the strongest reduction heuristic, which selects a minimal set of relevant initial facts and objects and keeps only those ground actions that were used in the tree of actions built from the goals to the initial state. The remainder of the initial state facts, objects (and facts and actions including them), and other actions are pruned from the graph to yield a reduced problem to be exposed to search.
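The backward relevance sweep that RIFO performs can be approximated in a few lines. The sketch below uses hypothetical (name, preconditions, adds) action triples and implements only the core idea, chaining backwards from the goals while ignoring delete effects; the real RIFO heuristics are considerably more selective.

```python
def backward_relevance(goals, actions):
    """A fact is relevant if it is a goal or a precondition of a relevant
    action; an action is relevant if it adds a relevant fact."""
    relevant_facts = set(goals)
    relevant_actions = set()
    frontier = set(goals)
    while frontier:
        fact = frontier.pop()
        for name, pre, add in actions:
            if fact in add and name not in relevant_actions:
                relevant_actions.add(name)
                frontier |= pre - relevant_facts
                relevant_facts |= pre
    return relevant_facts, relevant_actions

ACTIONS = [("load",  frozenset({"at-depot"}), frozenset({"loaded"})),
           ("drive", frozenset({"has-fuel"}), frozenset({"at-depot"})),
           ("wave",  frozenset({"has-arm"}),  frozenset({"amused"}))]
facts, acts = backward_relevance({"loaded"}, ACTIONS)
print(sorted(acts))  # ['drive', 'load'] -- 'wave' is pruned as irrelevant
```

Everything not reached by the sweep can be discarded before the plan graph is built, which is the source of the storage and search savings described above.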

If IPP determines the reduced problem to be unsolvable, which it usually does quickly because of the effective termination test inherent to GRAPHPLAN, the search space is enlarged by applying RIFO again using a weaker pruning heuristic, restoring all the previously excluded actions that use objects that are newly judged to be potentially relevant. If IPP still cannot solve the problem, the original search space is fully restored to retain completeness of the planner, in the worst case leading to some overhead if both RIFO heuristics turn a solvable planning problem into an unsolvable one or if a completely unsolvable problem is investigated.
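This control policy, try the strongest reduction first and fall back toward the full problem, is a generic completeness-preserving pattern. A minimal sketch, with a hypothetical `plan_fn` solver and a list of pruning functions ordered strongest first:

```python
def solve_with_escalation(problem, plan_fn, pruners):
    """Try increasingly permissive reductions of the problem; the final
    identity 'pruner' restores the full problem, so completeness is kept."""
    for prune in list(pruners) + [lambda p: p]:
        plan = plan_fn(prune(problem))
        if plan is not None:
            return plan
    return None  # unsolvable even without any pruning

# A toy solver that needs element 3, which the aggressive pruner discards:
solver = lambda p: sorted(p) if 3 in p else None
strong = lambda p: {x for x in p if x % 2 == 0}
print(solve_with_escalation({1, 2, 3}, solver, [strong]))  # [1, 2, 3]
```

The worst-case overhead the text mentions corresponds to the wasted calls on the pruned problems before the identity step finally runs.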

Handling of Conditional Effects in IPP

Probably the most important ADL extension handled by IPP is the use of conditional effects. Gazen and Knoblock (1997) observe that all the other commonly handled ADL extensions can be converted into STRIPS in reasonable time but that conditional effects lead to a potentially crippling combinatorial cost if an attempt is made to deal with them in this way.

IPP represents conditional effects directly in planning graphs and also propagates mutual exclusion relationships between facts occurring in conditional effects and their effect conditions. However, it does not extend the notion of mutual exclusion between actions, as defined in GRAPHPLAN. This contrasts with the approach taken in SGP, discussed later, where a propagation of mutual exclusion relations over actions is proposed.

Actions with conditional effects are still treated as atomic units in IPP, but SGP creates components similar to GRAPHPLAN's STRIPS actions that represent each conditional effect in the graph. Thus, there is the possibility of marking component nodes as exclusive as well as, it is claimed, a performance benefit. A detailed discussion of the relative merits of the two approaches is beyond the scope of this article but remains an interesting area of debate in the GRAPHPLAN community.

Interestingly, the movie domain, which was designed to explore the differences in the approaches to conditional effects, failed to achieve this objective because of an additional feature of IPP: the exploitation of inertia, or unchanging facts in the domain, to carry out constant folding among the grounded actions and reduce them to simpler forms. This process represents an important element of the collection of techniques being developed to control the size of the plan graph for GRAPHPLAN-based approaches to planning.

SGP

SENSORY GRAPHPLAN (SGP) is a sound, complete planner also built following GRAPHPLAN. SGP includes support for conditional effects, universal and existential quantification, uncertainty, and sensing actions. SGP entered only the ADL track of the competition, to demonstrate these extensions to the underlying GRAPHPLAN machinery. SGP was the only planner written in Lisp to enter the competition and, despite the performance limitations from the language, competed well.

SGP has a number of features that set it apart from the other GRAPHPLAN-based planners. These unique features, described in the following subsections, are factored expansion for handling conditional effects and the ability to handle uncertainty and sensing actions. SGP is written in well-documented Lisp code and can be downloaded from the SGP home page (SGP 1998). SGP continues to be supported by the University of Washington, with new bug fixes and new releases being produced as appropriate.

Factored Expansion of Conditional Effects

Many of the expressive features of ADL are easy to implement in GRAPHPLAN, but handling conditional effects is surprisingly tricky. Conditional effects allow the description of a single action with context-dependent effects. The basic idea is simple: A special when clause is introduced into the syntax of action effects. When takes two arguments: (1) an antecedent and (2) a consequent. Execution of the action will have the consequent's effect just in case the antecedent is true immediately before execution (much as the preconditions of the action determine whether execution itself is legal). This is exactly why handling actions with conditional effects is tricky: The outcomes of the actions depend on the state of the world when the action is executed.
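The semantics of when can be made concrete with a small state-transition sketch. Effects are represented here as hypothetical (antecedent, adds, deletes) triples, with an empty antecedent for unconditional effects; crucially, every antecedent is tested against the state as it was before execution, and then all firing effects are applied together.

```python
def apply_action(state, preconds, effects):
    """Apply an action with (possibly) conditional effects to a state."""
    if not preconds <= state:
        raise ValueError("action preconditions not satisfied")
    adds, deletes = set(), set()
    for antecedent, add, delete in effects:
        if antecedent <= state:        # the 'when' clause fires
            adds |= add
            deletes |= delete
    return (state - deletes) | adds

# A movie-inspired example: rewinding always leaves the tape rewound, and
# additionally clears the counter only when it currently shows a value.
rewind = (frozenset({"have-tape"}),
          [(frozenset(), {"rewound"}, set()),
           (frozenset({"counter-at-7"}), set(), {"counter-at-7"})])
state = apply_action({"have-tape", "counter-at-7"}, *rewind)
print(sorted(state))  # ['have-tape', 'rewound']
```

The same rewind action applied in a state without the counter fact simply skips the conditional branch, which is the context dependence the text describes.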

Factored expansion was first described in Anderson and Weld (1998), where it was compared to full expansion (Gazen and Knoblock 1997) and the conditional-effects handling method of IPP. The idea behind factored expansion is twofold: First, all actions that contain conditional effects are factored into components, with a single effect (conditional or unconditional) in each component. Each component can then be treated, more or less, just like a STRIPS action. Second, when the components are being examined for graph expansion and backtracking search, the interaction between components from the same action is considered. In particular, there is the new concept of component induction: Component Cn induces Ck at level i if it is impossible to execute Cn without executing Ck.

Component induction leads to changes in both the GRAPHPLAN mutual-exclusion rules and in the backtracking search for a solution. The change to the mutual-exclusion rules is the addition of a new rule that can identify more mutual-exclusion relations: Two components Cn and Cm at level i are mutually exclusive if there exists a third component Ck that is mutually exclusive with Cm, and Ck is induced by Cn at level i. The change to the backtracking search requires that when a component is selected to establish a goal, all other components from the same action instance must also be considered.

An example of a domain where the factored expansion approach shows its strength is the movie domain. The goal of the movie domain is to collect snacks and then watch a movie. Before the movie can be "watched," the movie must be rewound, and the video cassette recorder's (VCR) counter must be set to zero. Consider what the planning graph looks like after one level has been built: The VCR's counter can be reset, the tape can be rewound, and snacks can be fetched. The key thing, however, is that the actions to rewind the movie and to reset the VCR's counter are mutually exclusive only if their conditional effects are considered. The IPP method of handling conditional effects does not find this mutual-exclusion relationship, and thus, IPP must perform an exhaustive search of the planning graph to determine that no plan yet exists.

SGP, however, can identify the induced mutual-exclusion relationship between the two actions. Thus, SGP immediately knows that no plan yet exists and can proceed to the next planning graph level.

Uncertainty and Sensing Actions

The second novel feature in SGP is its ability to deal with uncertainty and sensing actions (this work is detailed in Smith and Weld [1998a] and Weld, Anderson, and Smith [1998]). Although this part of SGP was not used at the AIPS-98 planning contest, it is discussed here in some detail because it was the main motivation for work on SGP and is the characteristic that distinguishes SGP from the other planners.

Classical planners assume complete knowledge of the world at plan time and assume that the outcome of actions is certain. Although this assumption simplifies the planning process, it is an unrealistic one. SGP relaxes the assumption that the agent knows the state of the entire world a priori.

The first step in relaxing this assumption is to allow the initial state to include propositions whose truth values are uncertain (either true or false). Internally, SGP represents this sort of uncertainty by keeping track of each possible world. For example, if both propositions P and Q are uncertain, then there are four possible worlds: (1) P and Q, (2) P and not Q, (3) not P and Q, and (4) not P and not Q. For each possible world, SGP builds and maintains a separate planning graph. This task is complicated by the fact that, because of conditional effects, the outcome of an action might be different depending on the world it is executed in. It should be noted here that SGP makes the assumption that an uncertain proposition might be true or false but does not make any assumptions about probabilities (SGP simply records that a proposition is uncertain and does not associate a numeric weight with each possibility).

The second step in relaxing the complete-knowledge assumption is to limit the agent's observational powers to explicit sensing actions. Sensing actions are actions in the domain whose effects include a special sense statement. Sense statements are used to let the agent query the world for the truth value of a proposition. Based on the sensed truth value, the agent can then refine the subset of possible worlds it is in (for example, if a proposition P is sensed to be true, then the agent knows that it is in one of the possible worlds in which P is true).

Given the set of planning graphs (one for each possible world) and the set of actions, including sensing actions, SGP builds a plan that will achieve the goals in all possible worlds. To guarantee success in each possible world, the plan can include actions that are to be executed only in some of the possible worlds. Thus, the plan might have branch points contingent on the consequences of sensory actions. The plan is a directed acyclic graph, with the possibility of branches rejoining each other. The contingencies in the plan are based on which possible world the agent is in at plan-execution time. To resolve which possible world the agent is in, the results of the sensing actions are used. An important feature of SGP is that the planner determines exactly which sensing actions need to be taken to resolve the uncertainty. There are two important outcomes here: First, the planner does not assume that the agent has full observational power. Instead, SGP allows only the terms sensed by the sensing actions to be used in resolving uncertainty. Second, the agent does not have to resolve exactly which possible world it is in but, rather, which of a set of possible worlds it is in. This distinction is important because with SGP, the agent need not necessarily resolve all the uncertainty but simply "enough."
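The possible-worlds bookkeeping is straightforward to sketch. The helpers below (hypothetical names, standard library only) enumerate the worlds induced by a set of uncertain propositions and show how a sensed value prunes the set, mirroring the two relaxation steps described above; SGP's per-world planning graphs are not modeled.

```python
from itertools import product

def possible_worlds(known_true, uncertain):
    """All assignments of the uncertain propositions, layered on top of
    the facts known to be true; no probabilities are attached."""
    worlds = []
    for values in product((True, False), repeat=len(uncertain)):
        extra = {p for p, v in zip(uncertain, values) if v}
        worlds.append(frozenset(known_true) | extra)
    return worlds

def refine(worlds, prop, sensed_true):
    """Keep only the worlds consistent with a sensed truth value."""
    return [w for w in worlds if (prop in w) == sensed_true]

worlds = possible_worlds({"at-home"}, ["P", "Q"])
print(len(worlds))                     # 4 worlds for two uncertain facts
print(len(refine(worlds, "P", True)))  # 2 worlds remain after sensing P
```

The exponential growth in `len(worlds)` with the number of uncertain propositions is the cost that limits this representation, since each world carries its own planning graph.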

STAN

STAN (state-analyzing planner) (1998) is a third system based on GRAPHPLAN that extends its planning technology in several ways: First, a more sophisticated data structure is used to store the graph and aid in its construction than was used in earlier versions of GRAPHPLAN. Second, more analysis is carried out on the graph and the problem it encodes to reduce search branches, including goal-ordering analysis, symmetry analysis, and resource analysis. Third, a wave front is exploited at the fix point of the plan graph to avoid unnecessary additional work. Fourth, state-analysis techniques, implemented as a module of STAN that can be exploited as a planner-independent system, TIM, are used to acquire information about a domain and problem encoding that is exploited in instantiation and filtering of the plan graph.

The implementation of the plan graph in STAN is based on a careful reuse of much of the central graph structure, to avoid repeatedly copying layer-independent information, together with bit-level operations to support graph construction and careful filtering of the layer-dependent structures to reduce the retesting carried out at subsequent layers. This implementation, the most recent version of which is described in full in Long and Fox (1999), is efficient and relatively compact, although the competition version of STAN (STAN 1.0) still allowed unnecessary wastage of space. A similar approach, although introduced with a slightly different purpose (to extend GRAPHPLAN to handle actions with duration), is discussed in Smith and Weld (1999, 1998b).

The plan graph is carefully analyzed during construction to produce several auxiliary relationships between goals and between actions at each layer. In particular, STAN identifies ordering relations between pairs of goals reflecting the order in which they must be satisfied so that both are true at a given level. Thus, a considerable reduction in the search space is allowed by automatically selecting noops during search for goals that must be satisfied earlier than the current layer. Furthermore, certain chains of ordered goals can show that an entire goal set cannot be achieved at a given layer, without any search at all.

STAN also identifies some goal sets as unachievable when they exceed certain resource limitations imposed by the domain. Resources include numbers of objects in certain configurations, the rate at which certain objects can be put into key configurations during the plan execution, and other factors, as well as the physical resources available to the planning agent (such as grippers, fuel, and containers). Limits on the availability of abstract resources, such as the rate at which objects can be configured, arise from the limits imposed by the physical resources of the agent and are expressed, for example, in terms of constraints on the number of plan-graph levels that must have been built before a certain goal configuration can be achieved. The resource analysis performed by STAN can dramatically reduce search, by identifying the minimum number of graph layers that must be built, in some domains, including the traveling salesman problem discussed later.

Finally, STAN exploits structural features of the problem domain, such as symmetry, to reduce search. The competition version of STAN performed only a preliminary symmetry analysis in which symmetric objects (those that are indistinguishable and, hence, do not form interestingly different action instantiations) were identified to reduce the number of action instantiations produced. Although STAN was able to detect certain forms of symmetry, the competition version was not able to exploit it fully. A more advanced treatment has been developed since the competition (Fox and Long 1999).

One of the most important features of STAN that distinguishes it from other GRAPHPLAN-based planners is its use of a highly efficient implicit representation of the graph beyond the fix point. STAN avoids both construction and search of the plan graph beyond the fix point, where the graph is static. Instead, search is built on a collection of candidate goal sets that are generated as failure sets at the fix point and that are pushed one layer forward to be retried. This process, in which failed goal sets are promoted forward, forms a kind of rolling collection of goal sets at the fix point, called a wave front (Long and Fox 1999). The efficiency gains this mechanism offers can be dramatic in certain problems and can also significantly reduce the memory demands during the process of constructing a solution.
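The control flow of the wave-front mechanism can be caricatured in a few lines. The sketch below shows only the bookkeeping idea (failed goal sets are retried one virtual layer later instead of extending the static graph); `try_to_achieve` and its interface are invented for illustration and are not STAN's implementation.

```python
# Sketch of search beyond the plan-graph fix point: the static part of the
# graph is never built; instead, goal sets that fail at the fix point are
# kept in a rolling "wave front" and retried one virtual layer later.

def search_beyond_fixpoint(initial_goal_set, try_to_achieve, max_virtual_layers):
    """try_to_achieve(goal_set) returns ('plan', plan) on success or
    ('failed', subgoal_sets) with the goal sets that could not be satisfied.
    Both the callback and its protocol are hypothetical."""
    wave_front = {frozenset(initial_goal_set)}
    for _layer in range(max_virtual_layers):
        next_front = set()
        for goals in wave_front:
            status, result = try_to_achieve(goals)
            if status == 'plan':
                return result
            # promote the failed goal sets one virtual layer forward
            next_front.update(frozenset(g) for g in result)
        wave_front = next_front
    return None
```

A goal set that fails repeatedly is simply carried forward by the front, which is where the memory savings over explicitly extending the graph come from.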

The exploitation of the results of state analyses of various kinds is an important feature of STAN. State analyses can be done in a preprocessing stage, using techniques that are planner independent, and the results fed into the planning process and used to reduce, and even eliminate, some of the more resource-intensive aspects of planning. Symmetry analysis, in which symmetric objects and actions are identified to prune redundant search, is one such form of state analysis. Another technique, currently being integrated with STAN 3, is the type and state-invariant inference analysis performed by the TIM module (Fox and Long 1998). The competition version of STAN was not fully integrated with TIM and therefore could not make use of the inferred state invariants. It could exploit the type structure inferred by TIM, but the competition domains had types supplied; hence, STAN did not have as large an advantage over the other competitors as can be achieved with some domain encodings. More recently, TIM has been extended to allow the automatic identification of certain generic domain behaviors, such as objects that traverse a map of locations and objects that can be carried around such a network, together with the objects responsible for carrying them. This analysis can be exploited in a variety of ways, but an initial use is in filtering useless action instances from the initial action-instantiation phase common to GRAPHPLAN-based planners.

Articles

SUMMER 2000 19

Development of STAN

During the development of the competition version of STAN, a pattern was established of exploring the behavior of the planner on a family of problems to understand what made them hard, followed by constructing techniques to tackle the source of the difficulty in a domain-independent way. An important lesson learned from this process, and from the competition itself, is that problems are hard for a wide variety of reasons and that these different sources of difficulty can lead to the development of techniques that are powerful in certain contexts but are simply useless overhead in others. Of particular interest were problems that appeared hard for STAN yet were easy for other planners or are intuitively easy.

One of the reasons problems can be hard, affecting GRAPHPLAN-style planners in particular, is that domains can contain huge collections of instantiated actions that have no useful role in the plan. This results in both a large cost in the construction phase and, often, a large cost in the search phase, when many branches must be explored. These branches often express the same fundamental planning decisions but, because of the search in grounded actions, differ in details that are insignificant from the planning point of view.

STAN attempts to deal with this problem in several ways: type inference, which leads to a potential reduction in the number of instantiated actions; the use of static conditions in instantiation; and some preliminary work on filtering to remove some objects in some domains.

Much work remains to be done in this area: RIFO offers IPP a huge benefit in many problems by filtering out many irrelevant objects, actions, and facts, but at the price of possible incompleteness. As can be seen in Review of Results, all the planners were confounded when confronted with problem instances containing large numbers of ground-operator instances. The ability to reduce these numbers by intelligent filtering appears to be a critical element of successful planning.

In contrast, some problems are hard not because of the cost of constructing the graph but because of the multiplicity of search paths in the problem. This multiplicity is particularly problematic in cases in which the problem appears to be solvable long before it actually is (the graph contains the goals, non-pairwise mutually exclusive, long before the solution layer). An example of such a problem is the complete-graph traveling salesman problem (TSP), in which the graph is completely connected, so the traveler is unconstrained in the order of visits he makes. In this case, all the destinations could be visited within a single step, and any pair could be visited after two steps. Thus, the problem appears solvable after two steps but is not actually solvable until n steps have passed, where n is the number of sites to be visited. However, as n grows, the number of possible paths the traveler might have taken explodes exponentially, so the search problem quickly becomes intractable. STAN uses resource analysis to determine that only one destination can be visited at each step, so it does not search for a plan until n layers are constructed, and it can also ensure that every layer is used for a visit to a hitherto unvisited site.

Unfortunately, there are other problems of similar character that are not yet adequately tackled by STAN. The gripper problem used in the competition is an example: STAN correctly determines that no more than two balls can be deposited at each time step. However, it does not allow for the fact that the two balls must also be picked up and transported, requiring an additional two steps (and a third to return to the source for another load), so the resource analysis does not help as much as one might hope.

GRAPHPLAN-style planners also suffer during search from a problem of premature commitment. This problem occurs because the use of ground-action instances forces the selection of objects to play particular roles in a plan, often before the constraints that would govern their choice become apparent. Constraint-solving planners can cope better with these problems by making choices in the highly constrained parts of the plan and propagating them into the less constrained parts. Forward- and backward-chaining planners do not direct their planning strategies by exploiting the most highly constrained parts of a plan structure, which can lead to costly mistaken choices that must be retracted.

Some problems are hard because they have an inherent combinatorial cost; others are, in principle, easy yet prove hard with current planning technology, or at least with some current planning strategies. The gripper domain is a good example of the latter: a domain in which problems are trivial for human problem solvers, yet an optimal plan for this domain eluded all the planners in the competition for instances larger than a dozen balls. More recent work on STAN has explored the exploitation of symmetry in problems such as gripper, which can reduce the difficulty dramatically (Fox and Long 1999). Nevertheless, much work clearly remains to be done in recognizing and exploiting features of problems that humans appear to identify with ease. Furthermore, the representation of solutions to problems such as these remains an issue: no human would represent the solution to problems in the gripper domain as an explicit sequence of steps but rather as a method for generating these steps during execution ("Transport the first two balls from room a to room b, then return and repeat until all balls are transported." The fact that the first two balls are not identified by name is an indication of the role of symmetry for human problem solvers).

BLACKBOX

It has often been observed that the classical AI planning problem (that is, planning with complete and certain information) is a form of logical deduction. Because early attempts to use general theorem provers to solve planning problems proved impractical, research became focused on specialized planning algorithms. However, the belief that planning required such specialized reasoning algorithms was challenged by the work of Kautz and Selman (1996, 1992) on planning as propositional satisfiability testing. SATPLAN showed that a general propositional theorem prover could be competitive with some of the best specialized planning systems. The success of SATPLAN can be attributed to two factors: (1) the use of a logical representation that has good computational properties and (2) the use of powerful new general reasoning algorithms such as WALKSAT (Selman, Kautz, and Cohen 1994). Both the fact that SATPLAN uses propositional logic instead of first-order logic and the particular conventions we suggested for representing time and actions are significant: different declarative representations that are semantically equivalent can have distinct computational profiles. The use of general reasoning algorithms offers an important benefit because many researchers in different areas of computer science devise new algorithms and implementations for SAT testing each year and freely share ideas and source code. SAT is the general problem of determining whether there is a way to set a collection of identified variables each to either true or false so that a given logical expression involving these variables is made true. As a result of this shared work and the size of the interested community, at any point in time, the best general SAT engines tend to be faster (in terms of raw inferences a second) than the best specialized planning engines.
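The SAT problem as just defined can be stated concretely in a few lines. The following toy checker is purely illustrative (exponential brute force, nothing like WALKSAT or SATZ); the clause representation is an assumption made here for clarity.

```python
# A miniature statement of SAT: a CNF formula is a list of clauses, each
# clause a list of literals (sign, variable); brute-force enumeration tries
# every true/false assignment and returns one that makes the formula true.
from itertools import product

def satisfiable(variables, clauses):
    for values in product([False, True], repeat=len(variables)):
        assign = dict(zip(variables, values))
        if all(any(assign[v] if positive else not assign[v]
                   for positive, v in clause)
               for clause in clauses):
            return assign          # a satisfying assignment
    return None                    # no assignment satisfies all clauses
```

For example, the formula (x or y) and (not x or y) and (x or not y) is satisfied only by setting both x and y to true.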

Interestingly, the GRAPHPLAN approach to planning shares a number of features with the SATPLAN strategy. Comparisons with SATPLAN show that neither algorithm is strictly superior. For example, SATPLAN is faster on a complex logistics domain; they are comparable on the blocks world; and on several other domains, GRAPHPLAN is faster.

GRAPHPLAN bears an important similarity to SATPLAN: both systems work in two phases, first creating a propositional structure (in GRAPHPLAN, a plan graph; in SATPLAN, a formula in conjunctive normal form [CNF]) and, second, searching this structure. The propositional structure corresponds to a fixed plan length, and the search reveals whether a plan of this length exists. Furthermore, Kautz and Selman (1996) show that the plan graph has a direct translation to a CNF formula and that the form of the resulting formula is close to the original conventions for SATPLAN. It is hypothesized that the differences in performance between the two systems can be explained by the fact that GRAPHPLAN uses a better algorithm for instantiating the propositional structure, whereas SATPLAN uses more powerful search algorithms.

SATPLAN fully instantiates a problem instance before passing it to a simplifier and a solver. By contrast, GRAPHPLAN interleaves plan-graph instantiation and simplification. Furthermore, GRAPHPLAN uses a powerful planning-specific simplification algorithm: the computation of the mutual-exclusion relations between pairs of actions and facts.
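The translation of these mutual-exclusion relations into CNF is direct: each mutex pair at a level becomes one negative binary clause. The sketch below shows the idea only; the variable and clause encodings are assumptions made here, not BLACKBOX's actual conventions.

```python
# Sketch of the plan-graph mutex-to-CNF step: an action (or fact) a at level
# i becomes a propositional variable (a, i), and a mutex pair (a, b) at level
# i becomes the negative binary clause (not a_i or not b_i).

def mutex_clauses(mutexes_by_level):
    """mutexes_by_level: dict mapping level -> list of mutex (a, b) pairs.
    Returns clauses as lists of (sign, variable) literals."""
    clauses = []
    for level, pairs in mutexes_by_level.items():
        for a, b in pairs:
            clauses.append([(False, (a, level)), (False, (b, level))])
    return clauses
```

Binary clauses of this kind propagate cheaply under unit resolution, which is one reason they can make the resulting formula easier for many SAT engines.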

These observations have led to the creation of a new system that combines the best features of GRAPHPLAN and SATPLAN. This system, called BLACKBOX, works in three phases. First, a planning problem (specified in a standard STRIPS notation) is converted to a plan graph. Second, the plan graph is converted into a CNF formula. Third, the formula is solved by any of a variety of fast SAT engines.

The formula generated from the plan graph can be considerably smaller than one generated by translating STRIPS operators to axioms in the most direct way, as was done by the earlier MEDIC system of Ernst, Millstein, and Weld (1997). Furthermore, the mutual-exclusion relations computed in the plan graph can be translated directly into negative binary clauses, which can make the formula easier to solve for many kinds of SAT engines.

The competition version of BLACKBOX included the local-search SAT solver WALKSAT and the systematic SAT solver SATZ (Li and Anbulagan 1997), as well as the original GRAPHPLAN engine (which searches the plan graph instead of the CNF form). To have robust coverage over a variety of domains, the system can use a schedule of different solvers. For example, it can run GRAPHPLAN for 30 seconds; then WALKSAT for 2 minutes; and, if still no solution is found, SATZ for 5 minutes.

The BLACKBOX system introduces new SAT technology as well, namely, the use of randomized complete search methods. As shown in Gomes, Selman, and Kautz (1998), systematic solvers in combinatorial domains often exhibit a "heavy tail" behavior, whereby they get "stuck" on particular instances. Adding a small amount of randomization to the search heuristic and rapidly restarting the algorithm after a fixed number of backtracks can dramatically decrease the average solution time, often from hours to seconds.

This randomization-restart technique was applied to the version of SATZ used by BLACKBOX. The variable-choice heuristic for SATZ chooses to split on a variable that maximizes a particular function of the number of unit propagations that would be performed if that variable were chosen (see Li and Anbulagan [1997] for details). The BLACKBOX version, SATZ-RAND, randomly selects among the set of variables whose scores are within 40 percent of the best score. The solver schedule used in the competition was to run the GRAPHPLAN engine for 2 seconds, then to convert the problem to CNF and run SATZ-RAND for 10 restarts with a cutoff of 100 backtracks. If one of the solvers showed that no solution existed, the plan graph was extended by one layer and the schedule repeated. Note that no one cutoff value is ideal for all domains. One way to address the problem is to specify a sequence of increasing cutoff values in the solver schedule.

The newest version of BLACKBOX includes an additional solver, RELSAT (Bayardo and Schrag 1997), based on dependency-directed backtracking, as well as a technique for reducing the size of the CNF encodings by suppressing the generation of clauses that are logically redundant.

BLACKBOX was designed to improve the plan-extraction phase of GRAPHPLAN (that is, the search within the plan graph for a solution). In the majority of the problems solved in the competition, plan extraction was comparatively easy once the plan graph was constructed. Systems that reduce the plan-graph structure and thereby reduce the cost of constructing it enjoyed a relative advantage over BLACKBOX in this context. Problems that have a relatively much harder plan-extraction phase are included in the BLACKBOX (1998) distribution.

HSP: Heuristic Search Planner

HSP is a planner based on the well-established idea of heuristic search. Heuristic search algorithms perform forward search from an initial state to a goal state using a heuristic function that provides an estimate of the distance to the goal. The 8-puzzle is the standard example of heuristic search and is treated in most AI textbooks (Pearl 1983; Nilsson 1980). The main difference between the 8-puzzle and the approach to planning adopted in HSP is in the heuristic function. Although in domain-specific tasks such as the 8-puzzle the heuristic function is given (for example, as the sum of the Manhattan distances), in domain-independent planning, it has to be derived from the high-level representation of the actions and goals.

A common way to derive a heuristic function h(s) for a problem P is by relaxing P into a simpler problem P' whose optimal solution can be computed efficiently. Then, the optimal cost for solving P' can be used as a heuristic for solving P (Pearl 1983). For example, if P is the 8-puzzle, P' can be obtained from P by allowing the tiles to move into any neighboring position. The optimal cost function of the relaxed problem is precisely the Manhattan-distance heuristic.

In STRIPS planning, the heuristic values for a planning problem P can be obtained by considering the relaxed planning problem P' in which all delete lists are ignored. In other words, P' is like P except that delete lists are assumed empty. As a result, actions can add new atoms but not remove existing ones, and a sequence of actions solves P' when all goal atoms have been generated. As with all the other competition planners, action schemas are first grounded in an initial instantiation phase, so variables do not occur in actions.

It is not difficult to show that for any initial state s, the optimal cost h'(s) to reach a goal in P' is a lower bound on the optimal cost h*(s) to reach a goal in the original problem P. The heuristic function h(s) could therefore be set to h'(s) to obtain an informative and admissible (nonoverestimating) heuristic. The problem, however, is that computing h'(s) is still NP-hard (as first observed by Bernhard Nebel). Therefore, an approximation is used: the heuristic values h(s) are set to an estimate of the optimal values h'(s) of the relaxed problem. These estimates are computed as follows:

Starting with s0 = s and i = 0, si is expanded into a (possibly) larger set of atoms si+1 by combining the atoms in si with the atoms that can be generated by the actions whose preconditions hold in si. Every time an action that asserts an atom p is applied, a measure g_s(p) is updated that is intended to estimate the difficulty (number of steps) involved in achieving p from s. For atoms p in s, this measure is initialized to 0, but for all other atoms, g_s(p) is initialized to infinity. Then, when an action with preconditions C = r_1, r_2, ..., r_n that asserts p is applied, g_s(p) is updated as

    g_s(p) := min[ g_s(p), 1 + \sum_{i=1}^{n} g_s(r_i) ]

The expansions and updates continue until these measures do not change. When all preconditions involve a single atom, this is a Bellman-Ford procedure for computing costs in a graph from a given set of sources: the nodes are the atoms, the sources are the atoms p such that g_s(p) = 0, and an edge from p to q with cost 1 exists when an action with precondition p asserts q.

The heuristic function h(s) used by HSP is then defined as

    h(s) =def \sum_{p in G} g_s(p)

where G stands for the set of goal atoms. This definition assumes, like decompositional planners, that subgoals are independent. The added value of the heuristic approach is that subgoals are weighted by a difficulty measure that makes it possible to regard certain decompositions as better than others. A result of this assumption is that the heuristic function h(s) is not admissible. However, it is often quite informative and can be computed reasonably fast.

The Search Algorithm

The heuristic function defined previously allows us to deal with any STRIPS planning problem as a problem of heuristic search. Thus, planning could be carried out using algorithms such as A*. A*, however, might take exponential memory and approaches the goal too cautiously. In HSP, where the heuristic is recomputed from scratch at every node, it is necessary to use algorithms that can get to the goal with as few evaluations as possible. For this reason, HSP uses a form of hill-climbing search. Surprisingly, hill climbing works well in many problems and often produces good plans fast. Sometimes, however, it gets stuck in local minima. To tackle this problem, search proceeds until a fixed number of impasses has been encountered, restarting the search if necessary, up to some specified maximum number of times. The algorithm used in the competition is a variation of this idea that also uses memory to keep track of the states that are visited. Current effort is directed toward identifying ways to speed up the evaluation of the heuristic so that more systematic search algorithms can be used.
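The heuristic estimate described above can be written down compactly. The following is an illustrative sketch only (HSP itself is a C implementation with many refinements); the data representation for states and actions is an assumption made here.

```python
# Sketch of HSP's additive heuristic: ignore delete lists, propagate the
# difficulty measures g_s(p) to a fix point, then sum them over the goals.
INF = float('inf')

def h_add(state, goals, actions):
    """state: set of ground atoms true in s; goals: set of goal atoms G;
    actions: list of (preconditions, add_list) pairs of atom sets."""
    g = {p: 0 for p in state}              # g_s(p) = 0 for atoms in s
    changed = True
    while changed:                         # iterate until measures stabilize
        changed = False
        for pre, adds in actions:
            cost = sum(g.get(r, INF) for r in pre)
            if cost == INF:                # some precondition unreachable
                continue
            for p in adds:                 # g_s(p) := min[g_s(p), 1 + sum_i g_s(r_i)]
                if 1 + cost < g.get(p, INF):
                    g[p] = 1 + cost
                    changed = True
    return sum(g.get(p, INF) for p in goals)   # h(s) = sum over p in G
```

On a toy chain where atom a enables b and b enables c, the estimate for reaching c from {a} is 2, and summing over the goal set {b, c} gives 3, illustrating the independence assumption (and hence the inadmissibility) discussed above.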

HSP is implemented in C. In contrast to all the other competition systems, a preprocessor is used to convert a STRIPS problem in PDDL into a C program that is then compiled, linked, and executed. This process usually means a time overhead on the order of a second or two in small planning problems but pays off in larger ones.

Related Work

HSP is based on the planner reported in Bonet, Loerincs, and Geffner (1997). This planner, called ASP, uses the same heuristic function but a different search algorithm based on Korf's (1990) LRTA*, which was designed for real-time planning. An independent proposal that also formulates planning as heuristic search was developed by McDermott (1996).

The bottleneck in HSP is the computation of the heuristic values, which are obtained afresh in every new state. Work carried out since the competition has led to a solution to this problem, reported in Bonet and Geffner (1999). A related proposal is owed to Refanidis and Vlahavas (1999). The work reported in Bonet and Geffner (1999) also suggests that there is a close relation between heuristic search planning and GRAPHPLAN planning that might be worth further investigation.

Further details on HSP and code can be found at the HSP (1998) web site.

Review of Results

One of the tasks the competition committee faced was to determine a strategy for evaluating planner performance. Before the competition, a formula was proposed that combined weighted values for plan length and planning time, adjusted to reflect the relative performance of different planners on the same problem (to give due credit to planners that quickly solved problems that defeated many of the others). In the event, this formula was judged to give counterintuitive results, and it was essentially abandoned, leaving a void in the final evaluation of performance; in the STRIPS track, it remains a difficult task to assess the relative performances of the planners. A summary of the results was presented at the event, but it was crude in that it failed to differentiate between good performance on simple problems and good performance on hard problems. This masking is amplified by the fact that each of the planners faced minor problems because of program bugs that made what would otherwise have been simple problems appear hard for these planners. In addition, in the mystery domain, several problems were unsolvable (and proven such by some of the planners), but these problems were ignored in the results summary. Overall, the results of the individual planners all showed strengths and weaknesses, and it is not surprising that a simple direct comparison proved unsatisfactory in the competition. This challenge remains unsolved for future competition events, and the community must be wary of setting up targets (in whatever form) that oversimplify the objectives of planners. The problem that evaluation represented suggests that these objectives remain a complex and clouded issue.

It is worth emphasizing that three of the planners running in the STRIPS track (BLACKBOX, IPP, and STAN) all produce parallel optimal plans (although IPP does not guarantee to do so when using its RIFO machinery), which, when linearized, will not always lead to optimal sequential plans. This can explain the discrepancies in plan lengths discovered by these planners. In general, the length of the linearized plan is difficult to control when using a mechanism that produces optimal parallel plans. HSP produces linearized plans but does not support claims for optimality. An interesting example to consider is the seventh logistics problem in the first round of the competition, where HSP produced the only plan found by any of the planners. This plan was 112 (sequential) steps long. STAN has subsequently demonstrated that there is a (sequential) 37-step plan!

The First Round: STRIPS Track

The competition involved the use of five domains in the first round and three in the second. The first round used the gripper, the movie, the logistics, the mystery, and the mystery prime domains. The last two are variations on transportation domains with limited fuel. In the mystery prime domain, the fuel can be piped between nodes in the transportation network, but in the mystery domain, the fuel is held at its starting depots. In the first round, 30 problems were presented of each type, except for gripper, in which just 20 problems were presented.

The characteristics of the last three domains are similar in that they are all transportation problems that involve moving objects around a network of locations as efficiently as possible. Interestingly, these domains all had a similar performance limit for all the planners: no planner could solve a logistics problem with more than 10,000 ground-action instances, and all the problems with fewer than 10,000 ground-action instances were solved by at least 1 of the planners. Where reported, the numbers of ground-action instances have been computed using STAN, with all filtering mechanisms turned off. STAN uses the TIM subsystem to generate a type structure for each domain, which can result in fewer ground-action instances being generated than would be the case with raw instantiation. The mystery prime domain proved more tractable, with problems including as many as 24,290 ground-action instances being solved by some planners (although the problem instance with 24,290 actions involved producing a plan with only 4 steps). However, problems with over 10,000 ground-action instances still proved hard in general, with several planners failing on these large problems and inconsistent performance being demonstrated between the threshold of 10,000 ground-action instances and the largest solvable problems. More than half the mystery problems presented in the competition were under 10,000 ground-action instances in size. Of the 13 problems that exceeded this size, 5 were unsolvable, and 3 were solved in the competition by at least 1 of the planners. The other five problems were all solvable, at least with STAN, although buffer sizes were set too small in the competition configuration for it to solve them. STAN uses an object-filtering mechanism that worked successfully in both the mystery and mystery prime domains to cut the numbers of ground-action instances, so that in the mystery domain, none of the problems presented actually produced more than 8,000 ground-action instances.

Interestingly, of the 30 mystery-domain problems presented in the competition, 11 were proved unsolvable by at least 1 planner, rather than simply proposed unsolvable because of a lack of resources. This distinction was not used in the competition (presumably because of the difficulty of distinguishing an accurate claim that a problem is unsolvable from a lucky guess when resources run out), but the three GRAPHPLAN-based planners used in the competition are capable of identifying problems of this kind (at least in principle); BLACKBOX, however, can identify only some unsolvable problems (those in which some of the goals are unreachable or are pairwise unreachable). STAN was fastest in demonstrating 10 of the 11 problems to be unsolvable (on average, it was 15 times faster than its nearest rival at showing these problems unsolvable), and IPP was fastest on the remaining problem.

The movie domain presented no difficulty to any of the planners, and performance times were so small that they cannot really be usefully compared. This domain was included to consider its effect in the ADL track, as was discussed in the section on SGP. The gripper domain is peculiar in that it is intuitively easy to solve (the problems present no difficulty for a human planner), yet only HSP was able to solve instances involving more than 12 balls. Performance of all the other planners deteriorated exponentially with increasing problem size. IPP used RIFO in this problem, allowing it to solve more problems than would otherwise have been possible by excluding one gripper from consideration. HSP produced similarly suboptimal solutions by carrying only one ball on each trip. This problem is so hard because there are so many ways in which the actions can almost solve the problem with a shorter sequence than is actually required to completely solve it, and these garden-path sequences increase exponentially with two grippers, despite the fact that even the hardest problem instance presented in this domain contains only 340 ground-action instances! One of the reasons this problem appears so simple for a human planner is that a human can see the essential symmetry of the problem and can exploit it to simplify the problem to the extent that it becomes trivial. This problem highlights a critical weakness of the current fast-planning technology.
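The "method" a human would use for gripper (transport two balls, return, repeat) can be made concrete as a plan generator. This is an illustrative sketch with invented action names, not a competition encoding or any planner's output.

```python
# Sketch of the human gripper "method": carry two balls per trip from room A
# to room B, returning only while balls remain to be moved.

def gripper_plan(n_balls):
    plan, moved = [], 0
    while moved < n_balls:
        load = min(2, n_balls - moved)            # two grippers -> two balls
        for i in range(load):
            plan.append(f'pick ball{moved + i + 1} roomA')
        plan.append('move roomA roomB')
        for i in range(load):
            plan.append(f'drop ball{moved + i + 1} roomB')
        moved += load
        if moved < n_balls:
            plan.append('move roomB roomA')       # return for the next load
    return plan
```

For an even number n of balls, this method uses 3n - 1 steps (for example, 11 steps for 4 balls), and the generator runs in time linear in n, which is exactly the gap between a human's symmetric "method" and the exhaustive step-by-step search that defeated the competition planners on instances beyond a dozen balls.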

In round one, of the 140 problems presented, 98 were either solved or proved unsolvable by at least 1 of the planners. At least eight of the remaining problems have been solved by one or more of the planners since the competition. The problem domain that proved difficult for all the planners was the logistics domain, accounting for 25 of the unsolved problems.

The Second Round: STRIPS Track

In the second round, a new domain was introduced: the grid domain. IPP managed to solve three of the five problems presented, using strong RIFO pruning and producing suboptimal plans. The other planners managed only one problem in this domain. The problem sizes ranged from 2,609 ground-action instances for the simplest through 16,239 (unsolved by any planner). IPP solved an instance with 11,150 ground-action instances.

The other domains used were logistics and mystery prime. All but one of the problems presented in these domains were solved by at least one planner. All the logistics problems were under 5,000 ground-action instances (although the 2 hardest of these involved few planes, and in 1, many trucks made for a big search space). All but one of the mystery prime problems were under 6,000 ground-action instances, and the exception contained 19,730 ground-action instances. This instance defeated all the planners in the competition, although at least one of the planners has since generated a 33-step plan that solves it. With the exception of a single instance (traced to a trivial program bug), all the other problems were solved by all the planners. Thus, of the 15 problems presented in round 2, 12 were solved in the competition, and at least 1 further problem has been solved since. The grid domain proved the least tractable, with only IPP making much headway, and even then only producing suboptimal plans. A critical problem in this domain appears to be that the plans are comparatively long, with no parallel steps; thus, graph construction is an expensive process if there is no pruning. Failure to complete the graph-construction phase and so reach a single search was the reason all the GRAPHPLAN-related planners failed to solve these problems.

In light of the observations already made about the sizes of the problems, measured in terms of the numbers of ground-action instances, it is interesting to consider the behavior of IPP using the RIFO subsystem, which filters some objects and action instances from domains prior to planning. RIFO was not used by IPP in the first round of the competition, except in the gripper domain, in which it was turned on by hand. In this domain, RIFO identifies one gripper hand as irrelevant, so that only one ball is carried at a time, and plans become much longer.

In the second round, IPP was run using the RIFO metastrategy, discussed earlier, which significantly reduced the search space for the planner. As table 1 shows, only eight problems can be solved without RIFO, but with the metastrategy, three more solutions are found. The metastrategy combines only a small subset of possible RIFO pruning heuristics. Discovering which combinations of heuristics work for which domains and problems is still a matter of experimentation. In the competition, no such experimentation was possible; therefore, the selection of the heuristics was done long before the competition, following the intuition that one should try the strongest-possible heuristic first because it leads to the smallest search space and then relax it by allowing more actions if no solution is found. The threshold parameters that decide whether RIFO is activated at all were set to 3,500 actions and 35 objects after a few trials on some of the competition problems.

The problem domain that proved difficult for all the planners was the logistics domain, accounting for 25 of the unsolved problems.

Articles

SUMMER 2000 25

26 AI MAGAZINE

Problem    Original         RIFO Strong     RIFO Weak
log1       571/25 (13)      —               —
log2       502/21 (20)      —               —
log3       958/26 (27)      —               —
log4       3561/42 (—)      189/30 (—)      —
log5       4985/53 (—)      119/26 (31)     —
mprime1    7809/36 (5)      7/9 (†)         11/9 (†)
mprime2    3281/32 (8)      —               —
mprime3    97259/68 (—)     —               —
mprime4    8485/42 (5)      7/10 (4)        —
mprime5    6773/22 (6)      —               —
grid1      2610/38 (14)     —               80/11 (20)
grid2      4501/50 (—)      69/19 (†)       260/19 (31)
grid3      7256/64 (—)      315/21 (†)      557/21 (—)
grid4      11151/80 (—)     135/24 (47)     —
grid5      16240/98 (—)     1481/49 (—)     —

Table 1. This Table Shows the Number of Ground Actions and Objects in the Original Problem Descriptions and, in Brackets, the Length of the Plan (If IPP Could Find One Given a 10-Minute Central Processing Unit [CPU] Time Limit and a 120-Megabyte Memory Limit on a SPARC 1.170). It also shows the number of selected ground actions and objects after the stronger and weaker RIFO pruning processes. † shows the planning problem became unsolvable, — means RIFO was inactive, and (—) means no plan was found because the planner either exceeded the CPU time or memory limit.

The ADL Track

In the ADL track, the same collection of domains and problems was used as in the first round of the STRIPS track. Of course, the domain encodings were reconstructed to exploit the ADL features. Only SGP and IPP competed in this track.

IPP solved all problems from the movie domain, the first 5 problems from the gripper domain containing 20 test problems, 13 of 30 problems in the mystery and mystery prime domains, and 3 of 30 problems in the logistics domain. In total, it solved 69 problems in approximately 20 minutes, including all the 38 problems SGP solved. RIFO would not have improved the performance of IPP in this round because on most of the problems, it makes the planner incomplete; so, the planner would have had to find the solution in the original search space after all reduction attempts had failed. In the movie domain, however, the metastrategy using the weaker heuristic succeeds in determining that only a handful (between five and nine) of the initial facts in each problem are relevant.

Commentary

Throughout the competition, no planner successfully solved any problem instance that involved more than 60,000 ground actions. In fact, without filtering techniques to reduce the number of ground actions, no planner solved problems with more than 25,000 ground actions, and reliable performance was restricted to problems with fewer than 10,000, or so, ground-action instances. In domains with harder search-space growth problems, even fewer action instances could be handled. Although the number of ground-action instances is not an infallible guide to the difficulty of problems, it is clearly an important indicator and strongly suggests that, at least for planners that work with ground actions during plan construction, there is much work to be done on the filtering process that could remove unnecessary actions from the problem space. It is interesting to observe that problem 15 of the first round in the logistics domain lies beyond the scope of the competition planners even with a manual filtering of the domain objects, removing irrelevant packages and trucks, leaving as few as 3,006 ground actions. The difficulty of this problem is not easy to understand because there appears to be no pressure on the aircraft resources (with six aircraft available in just three cities), although the fact that the cities each have six locations could well be significant.

Although the number of ground actions represents an important element in determining planner performance, the number of domain states is also a factor. Indeed, for planning systems that do not instantiate operators before planning, the number of states might be a more important feature of the problems. It is not easy to compute the numbers of states for some problems (particularly the mystery and mystery prime domains) because reachable states are nontrivial to determine. However, for the logistics and gripper domains, it is straightforward. In the logistics problems that were solved in round 1, the state spaces contained between 10^10 states (problem 5) and 8 x 10^25 states (problem 11). These are clearly large state spaces. By contrast, gripper problems define state spaces containing a mere 256 states (problem 1) to 68,608 states (problem 4) to more than 4 x 10^15 states in the largest (problem 20), solved only by HSP. The 376,832-state problem (problem 5) was beyond all the planners but HSP, and none solved it optimally. This analysis gives an indication of the dramatic contrast between the problems in which the planning technology is performing well and the problems where it demonstrates fundamental weaknesses.
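The gripper counts quoted here can be reproduced by direct enumeration. The sketch below assumes two rooms, two one-ball grippers, and a problem-size convention of 2k + 2 balls in problem k (4 balls in problem 1, 10 in problem 4, 12 in problem 5), which matches all the figures quoted:

```python
from math import comb, perm

def gripper_states(balls: int) -> int:
    # A state fixes the robot's room (2 choices) and a position for every
    # ball: room A, room B, or one of the two grippers, where each gripper
    # holds at most one ball ("held" balls occupy distinct grippers).
    placements = sum(
        comb(balls, held) * perm(2, held) * 2 ** (balls - held)
        for held in range(3)
    )
    return 2 * placements

print([gripper_states(n) for n in (4, 10, 12)])  # [256, 68608, 376832]
```

Under the same assumption, problem 20 has 42 balls, and gripper_states(42) comes to roughly 4.2 x 10^15, consistent with the "more than 4 x 10^15" figure in the text.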

As a final summary of the results from the STRIPS track, figures 2, 3, 4, and 5 attempt to give a broad indication of the relative performances of the planners. Figure 2 shows the cumulative numbers of problems solved with increasing time (in milliseconds). It should be noted that HSP solved more problems than any other planner (91 problems solved), but the graph has been drawn with a 15-second cutoff to allow a clearer view of the important data: Twenty-three of the problems solved by HSP took it between 15 seconds and 14 minutes. Similarly, IPP required more than 15 seconds on 8 additional problems, and STAN required more than 15 seconds on an additional 3 problems. The graph demonstrates that the planners had remarkably similar performances, solving the bulk of problems in less than 10 seconds. The first 30 problems, or so, are from the movie domain, where it can be seen that the compilation overhead paid by HSP gives it comparatively poorer performance. Figure 3 shows a similar plot, indicating cumulative numbers of problems solved in increasing numbers of plan steps. In this case, no cutoff has been applied, and HSP is alone in solving the last 17 problems. Once again, the graph emphasizes the similarities in performance, although BLACKBOX appears to generate slightly better-quality plans. The significant step at seven steps is the result of the movie domain, where all plans are seven steps long.

Figure 4 shows plan lengths plotted against times taken to produce them, revealing that plan length has surprisingly limited impact on the time taken to solve a problem, with comparatively short plans often proving as challenging to produce as longer plans. The trail of points curving above the main cluster is the sequence of results for HSP in the gripper domain.

Although there is no strong correlation between plan length and planner performance, there is a more suggestive correlation (valued at 0.69 across all the planners) between problem size and planner performance, as can be seen in figure 5. Of course, the specific domain has a significant impact on the difficulty of a problem (as shown by the gripper problems particularly), but the size of a problem file measured purely by byte count is a good indicator of difficulty. For example, the logistics problems that were solved were under 4,500 bytes in size, with the hardest logistics problem that was solved the only one over 4,000 bytes (round 1, problem 7, solved only by HSP). Similarly, mystery prime problems under 5,000 bytes were all solved, but problems between 5,000 and 6,000 bytes proved hard, and those above 6,000 bytes were unsolvable for these planners. Because problem size measured in this way is a good indicator of the number of objects in a problem and, therefore, of the number of ground-action instances, these data add further evidence to support the view that the number of ground actions is a key factor in determining the performance of these planners.
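The 0.69 figure is an ordinary Pearson correlation coefficient between the two measurements. For concreteness, a self-contained sketch of the computation follows; the (file size, solve time) pairs here are invented stand-ins, not the competition measurements:

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Invented illustrative pairs: problem file size in bytes, solve time in ms.
data = [(1200, 40), (2500, 180), (4100, 950), (5200, 2400), (6000, 7100)]
r = pearson([s for s, _ in data], [t for _, t in data])
```

A coefficient near 1 indicates that larger problem files reliably take longer to solve; 0.69 indicates a clear but far from deterministic relationship.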

Other Challenges

Although the size of domains, particularly measured by numbers of instantiated ground actions, represents a critical challenge to planning technology, the domains in the competition and others explored independently by the competitors have revealed other important problems that must be addressed.
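As a rough illustration of how these ground-action counts arise, the number of instances of each operator is the product of the numbers of objects available for each of its typed parameters. The operator and object names below are invented, and real counts are further reduced by type structure and static-fact pruning:

```python
def count_ground_actions(operators, objects_by_type):
    """Upper bound on ground-action instances: for each operator, multiply
    the number of candidate objects for each typed parameter."""
    total = 0
    for name, param_types in operators.items():
        instances = 1
        for t in param_types:
            instances *= len(objects_by_type[t])
        total += instances
    return total

# Hypothetical logistics-style fragment:
ops = {"load-truck": ["package", "truck", "location"],
       "drive-truck": ["truck", "location", "location", "city"]}
objs = {"package": ["p1", "p2"], "truck": ["t1"],
        "location": ["l1", "l2", "l3"], "city": ["c1"]}
print(count_ground_actions(ops, objs))  # 2*1*3 + 1*3*3*1 = 15
```

Because every parameter contributes a multiplicative factor, adding a handful of objects can push a problem from the reliable range (under roughly 10,000 instances) well past the limits observed in the competition.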

For example, the gripper domain highlights the combinatorial costs of exploring a large (and largely redundant) search space. The search must be reduced when possible, and in the gripper domain in particular, there is a huge potential for reduction in search costs. HSP demonstrated that heuristic search in this space can offer dramatic benefits. HSP solved all the gripper problems, where other planners managed at most four or five. This domain alone accounts for three-quarters of HSP's significant lead over the other competitors in the number of problems solved. HSP's heuristic effectively ignored the possibility of transporting the balls in pairs and solved the problems by transporting one ball at a time between the rooms. IPP, which succeeded in solving five of these problems, used its RIFO machinery, which caused it to ignore one of the grippers, also leading to solutions in which only one ball was transported at a time. None of the planners could exploit the incredible degree of symmetry in the problem to cut the search space from its exponential size to reflect the trivial underlying problem.

Figure 2. Cumulative Numbers of Problems Solved against Time (Data Cut Off at 15 Seconds).

The logistics domain has the interesting property that the hardest part of the problem instances usually lies in the middle of the plan. The problems of transporting packages to and from airports, which sandwich the problem of flying packages between cities, are relatively easy but often generate large collections of redundant search paths. A planner that can tackle the core problem, the flying of packages between cities, and propagate necessary constraints outward to the simpler ends of the plan will have the advantage. This phenomenon represents a single instance of a more general issue: Many planning problems are not uniformly hard. A planner that can identify the hardest parts of a planning problem and concentrate on solving those parts first, propagating constraints toward the easier, initially less constrained, parts of the problem, will perform far better than a planner that always tackles the problems from the same place.

The mystery domain is a fascinating variation on the transportation theme, introducing resource limits on carrier capacity and fuel as well as an underlying route-planning problem. This domain (and the mystery prime variation) has the potential to produce problems that are hard for a wide variety of reasons. The lack of resources can make the route-planning problem dramatically more complex because it interacts with the transportation of multiple packages. Similarly, the capacity limits can interact with fuel shortages to make it necessary to carefully coordinate the actions of carriers to cooperate in the transportation of objects. By varying the size of fuel dumps, the problems can range from simple route planning (with abundant fuel) to complex scheduling of interacting carrier actions (with limited fuel).
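The gripper symmetry noted above can, in principle, be factored out by abstracting away ball identities: because the balls are interchangeable, what matters is only how many sit in each room and how many are held. The following hypothetical sketch shows the abstraction; it is an illustration of the idea, not a technique any competitor implemented:

```python
def canonical(robot_room, ball_rooms, held):
    """Collapse symmetric gripper states: a state is characterized by the
    robot's room, the count of (unheld) balls in each room, and the number
    of balls currently held, not by which particular ball is where."""
    in_a = sum(1 for r in ball_rooms if r == "A")
    in_b = sum(1 for r in ball_rooms if r == "B")
    return (robot_room, in_a, in_b, held)

# Two states that differ only by swapping ball identities collapse together:
s1 = canonical("A", ["A", "B", "B", "A"], 0)
s2 = canonical("A", ["B", "A", "A", "B"], 0)
print(s1 == s2)  # True
```

Searching over canonical states shrinks the space from exponential in the number of balls to a number of count combinations that grows only polynomially, reflecting the trivial underlying problem.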

The grid domain, used in the second round of the competition, represents a further transportation domain on a grid-shaped network but with constraints on the access to certain locations based on keys of appropriate shapes for the corresponding locks. This domain represents a difficult search domain, primarily because the domain forces the planner to use only one useful action at each layer. Tackling this problem requires an effective filter to remove ground actions, partly to reduce the cost of manipulating the domain itself but mainly to reduce the number of redundant search paths, corresponding to multiple actual paths through the grid itself.

Figure 3. Cumulative Numbers of Problems Solved against Plan Length.

Figure 4. Plan Length against Planning Time (Movie Domain Ignored and Data Cut Off at 100 Seconds).

The Future

The first planning competition proved an extremely stimulating event for the planning community. It has brought into sharp focus the state of the art in domain-independent planning and has offered the opportunity to identify several essential issues for the planning community to address. The first of these issues is the development of a common planning domain-description language, currently taking the form of PDDL. Although PDDL must be considered still under development, the effort already invested in its development is an important step toward allowing planners to be usefully compared and in constructing a generally useful repository of planning-domain problems. Perhaps the closest the community has come to having such a resource in the past is the collections of problems included with particular planner releases, where those planners were widely used (UCPOP being an obvious example [Penberthy and Weld 1992]).

PDDL has not yet been put to the test in its provision of HTN expressiveness, and questions remain about the ADL components of the language. In particular, it has been proposed that nested conditional effects are an unnecessary element of the language. Provision for the expression of resource-constrained planning problems also remains untested.

These observations highlight a second issue for the community to confront: It remains difficult to compare planners on an equal footing. Planners can differ widely in terms of the expressiveness of the domain description language they handle, the expressiveness of the plans they produce, the speed of planning, and the range of domains they can successfully tackle. To make coherent progress in the field, it is necessary to be able to compare potential advances, attempt to coalesce different but compatible approaches, and avoid repeated redevelopment of the same basic planning software tools at dozens of different sites. A common domain description language is only one step in addressing this issue. It also requires that components of planning systems be made available to the community, particularly in stable and adaptable forms. PDDL parsers are already being made available, and some of the code used in the competition is available as source to be modified, extended, or adapted. These steps are essential in supporting the efforts of the community to advance beyond the current state of the art.

A third issue arises from the hope to push the boundaries of the current state of the art in planning: The benchmark domains that are used to establish the current levels of performance and set targets for the next generation of planners must be chosen with care. Many of the standard benchmark domains were designed with specific agendas. Often, they were designed to showcase specific expressive features of particular languages and are uninteresting when expressed in STRIPS (the movie domain is one example and Pednault's [1989] BRIEFCASE WORLD another). Others are designed to showcase particular planning strategies (the rocket domain used for GRAPHPLAN, for example [Blum and Furst 1997]) or demonstrate flaws in certain planning strategies (for example, the gripper domain). Although these domains retain some interest for these reasons, it is important for the planning community to look beyond these "simple" problems and identify more significant benchmarks that represent tasks that demonstrate planning's coming of age to the wider research and applications community. The greatest challenge for the community, then, is to take the lessons learned from the competition and from the research that is current and show how planning can move on from these problem domains.

Figure 5. Problem File Size against Time (Plot Only Shows Problems Solved by All the Planners).

References

Anderson, C., and Weld, D. 1998. Conditional Effects in GRAPHPLAN. In Proceedings of Artificial Intelligence Planning Systems, 44–53. Menlo Park, Calif.: AAAI Press.

Bayardo, R. J., Jr., and Schrag, R. C. 1997. Using CSP Look-Back Techniques to Solve Real-World SAT Instances. In Proceedings of the Fourteenth National Conference on Artificial Intelligence, 203–208. Menlo Park, Calif.: American Association for Artificial Intelligence.

BLACKBOX. 1998. BLACKBOX Web Site. AT&T Research Laboratories, Florham Park, New Jersey. Available at www.research.att.com/~kautz/blackbox.

Blum, A., and Furst, M. 1997. Fast Planning through Planning Graph Analysis. Artificial Intelligence 90(1–2): 279–298.

Bonet, B., and Geffner, H. 2000. Planning as Heuristic Search: New Results. In Proceedings of the Fifth European Conference on Planning (ECP'99). New York: Springer-Verlag. Forthcoming.

Bonet, B.; Loerincs, G.; and Geffner, H. 1997. A Robust and Fast Action Selection Mechanism for Planning. In Proceedings of the Fourteenth National Conference on Artificial Intelligence, 714–719. Menlo Park, Calif.: American Association for Artificial Intelligence.

Ernst, M. D.; Millstein, T. D.; and Weld, D. S. 1997. Automatic SAT Compilation of Planning Problems. In Proceedings of the Fifteenth International Joint Conference on Artificial Intelligence, 1169–1177. Menlo Park, Calif.: International Joint Conferences on Artificial Intelligence.

Fink, E., and Veloso, M. 1996. Formalizing the PRODIGY Planning Algorithm. In New Directions in Artificial Intelligence Planning, eds. M. Ghallab and A. Milani, 261–272. Amsterdam, The Netherlands: IOS.

Fox, M., and Long, D. 1998. The Automatic Inference of State Invariants in TIM. Journal of Artificial Intelligence Research 9: 367–422.

Fox, M., and Long, D. 1999. The Detection and Exploitation of Symmetry in Planning Domains. In Proceedings of the Sixteenth International Joint Conference on Artificial Intelligence, 956–961. Menlo Park, Calif.: International Joint Conferences on Artificial Intelligence.

Gazen, B., and Knoblock, C. 1997. Combining the Expressivity of UCPOP with the Efficiency of GRAPHPLAN. In Proceedings of the Fourth European Conference on Planning (ECP-97), 221–233. Berlin: Springer-Verlag.

Gomes, C. P.; Selman, B.; and Kautz, H. 1998. Boosting Combinatorial Search through Randomization. Paper presented at the Fifteenth National Conference on Artificial Intelligence, 26–30 July, Madison, Wisconsin.

Hoffmann, J., and Koehler, J. 1999. A New Method to Index and Query Sets. In Proceedings of the Sixteenth International Joint Conference on Artificial Intelligence, 462–467. Menlo Park, Calif.: International Joint Conferences on Artificial Intelligence.

HSP. 1998. HSP Web Site. Universidad Simon Bolivar, Caracas, Venezuela. Available at www.ldc.usb.ve/~hector.

IPP. 1998. IPP Web Site. University of Freiburg, Freiburg, Germany. Available at www.informatik.uni-freiburg.de/~koehler/ipp.html.

Kambhampati, S. 1999. Improving GRAPHPLAN's Search with EBL and DDB Techniques. In Proceedings of the Sixteenth International Joint Conference on Artificial Intelligence, 982–987. Menlo Park, Calif.: International Joint Conferences on Artificial Intelligence.

Kambhampati, S.; Lambrecht, E.; and Parker, E. 1997. Understanding and Extending GRAPHPLAN. In Proceedings of the Fourth European Conference on Planning (ECP-97), 260–272. Berlin: Springer-Verlag.

Kautz, H., and Selman, B. 1992. Planning as Satisfiability. In Proceedings of the Tenth European Conference on Artificial Intelligence (ECAI-92), 359–363. Vienna: Wiley.

Kautz, H., and Selman, B. 1996. Pushing the Envelope: Planning, Propositional Logic, and Stochastic Search. Paper presented at the Fourteenth National Conference on Artificial Intelligence, 27–31 July, Providence, Rhode Island.

Koehler, J.; Nebel, B.; Hoffmann, J.; and Dimopoulos, Y. 1997. Extending Planning Graphs to an ADL Subset. In Proceedings of the Fourth European Conference on Planning (ECP-97), 273–285. Berlin: Springer-Verlag.

Korf, R. 1990. Real-Time Heuristic Search. Artificial Intelligence 42(2–3): 189–211.

Li, C. M., and Anbulagan. 1997. Heuristics Based on Unit Propagation for Satisfiability Problems. In Proceedings of the Fifteenth International Joint Conference on Artificial Intelligence, 366–371. Menlo Park, Calif.: International Joint Conferences on Artificial Intelligence.

Long, D., and Fox, M. 1999. The Efficient Implementation of the Plan Graph in STAN. Journal of Artificial Intelligence Research 10: 87–115.

McDermott, D. 1996. A Heuristic Estimator for Means-Ends Analysis in Planning. In Proceedings of the Third International Conference on AI Planning Systems (AIPS-96), 142–149. Menlo Park, Calif.: AAAI Press.

McDermott, D., and the AIPS Planning Competition Committee. 1998. PDDL—The Planning Domain Definition Language. Available at ftp.cs.yale.edu/pub/mcdermott/software/pddl.bar.gz.

Nebel, B.; Dimopoulos, Y.; and Koehler, J. 1997. Ignoring Irrelevant Facts and Operators in Plan Generation. In Proceedings of the Fourth European Conference on Planning (ECP-97), 338–350. Berlin: Springer-Verlag.

Nilsson, N. 1980. Principles of Artificial Intelligence. San Francisco, Calif.: Morgan Kaufmann.

Pearl, J. 1983. Heuristics. San Francisco, Calif.: Morgan Kaufmann.

Pednault, E. 1989. ADL: Exploring the Middle Ground between STRIPS and the Situation Calculus. In Proceedings of the First International Conference on Principles of Knowledge Representation and Reasoning, 324–332. San Francisco, Calif.: Morgan Kaufmann.

Penberthy, J., and Weld, D. S. 1992. UCPOP: A Sound and Complete Partial-Order Planner for ADL. Paper presented at the Third International Conference on Principles of Knowledge Representation and Reasoning (KR-92), October, Cambridge, Massachusetts.

Refanidis, I., and Vlahavas, I. 2000. GRT: A Domain-Independent Heuristic for STRIPS Worlds Based on Greedy Regression Tables. In Proceedings of the Fifth European Conference on Planning (ECP'99). Berlin: Springer-Verlag. Forthcoming.

Selman, B.; Kautz, H.; and Cohen, B. 1994. Noise Strategies for Local Search. In Proceedings of the Twelfth National Conference on Artificial Intelligence, 337–343. Menlo Park, Calif.: American Association for Artificial Intelligence.

SGP. 1998. SGP Web Site. University of Washington, Seattle, Washington. Available at www.cs.washington.edu/research/projects/ai/www/sgp.html.

Smith, D., and Weld, D. 1999. Temporal GRAPHPLAN with Mutual Exclusion Reasoning. In Proceedings of the Sixteenth International Joint Conference on Artificial Intelligence, 326–337. Menlo Park, Calif.: International Joint Conferences on Artificial Intelligence.

Smith, D. E., and Weld, D. S. 1998a. Conformant GRAPHPLAN. In Proceedings of the Fifteenth National Conference on Artificial Intelligence, 889–896. Menlo Park, Calif.: American Association for Artificial Intelligence.

Smith, D. E., and Weld, D. S. 1998b. Incremental GRAPHPLAN. Technical Report, TR 98-09-06, University of Washington.

STAN. 1998. STAN Web Site. University of Durham, Durham, United Kingdom. Available at www.dur.ac.uk/CompSci/research/stanstuff/planpage.html.

Weld, D. S.; Anderson, C. R.; and Smith, D. E. 1998. Extending GRAPHPLAN to Handle Uncertainty and Sensing Actions. In Proceedings of the Fifteenth National Conference on Artificial Intelligence, 897–904. Menlo Park, Calif.: American Association for Artificial Intelligence.

Derek Long is a lecturer in computer science at Durham University. He joined the department in 1995 after lecturing at University College London for six years. He obtained his doctorate at the Programming Research Group, Oxford University, in 1989, with work exploring reasoning by analogy. Since then, he has pursued interests in reasoning techniques in general and planning in particular. He has worked in close collaboration with Maria Fox for the last 10 years. He is currently exploring automatic domain analysis techniques and the automatic extraction of planning domain features. His e-mail address is D.P.Long@dur.ac.uk.

Henry Kautz is an associate professor with the Department of Computer Science and Engineering at the University of Washington. He was previously department head of the AI Principles Research Group at AT&T Bell Labs. He is a fellow of the American Association for Artificial Intelligence (AAAI) and a member of the AAAI Executive Council and has received the International Joint Conferences on Artificial Intelligence Computers and Thought Award. He is known for his work on planning and plan recognition, efficient deduction and search, and the logical foundations of AI.

Bart Selman is an associate professor of computer science at Cornell University. He previously was a principal scientist at AT&T Bell Laboratories. He holds a Ph.D. and an M.Sc. in computer science from the University of Toronto and an M.Sc. in physics from Delft University of Technology. His research has covered many areas in AI and computer science, including tractable inference, knowledge representation, natural language understanding, stochastic search methods, theory approximation, knowledge compilation, planning, default reasoning, and the connections between computer science and physics (phase-transition phenomena). He has received four best paper awards at the American and Canadian national AI conferences and at the International Conference on Knowledge Representation. He holds a National Science Foundation Career Award and is an Alfred P. Sloan research fellow.

Blai Bonet received an engineering and master degree in computer science from Universidad Simon Bolivar in Venezuela. Among his interests are planning and scheduling with complete and incomplete information. He is currently studying for his Ph.D. at the University of California at Los Angeles. His e-mail address is bonet@cs.ucla.edu.

Hector Geffner received his Ph.D. from the University of California at Los Angeles with a dissertation that was co-winner of the 1990 Association for Computing Machinery Dissertation Award. He then worked as a staff research member at the IBM T. J. Watson Research Center in New York for two years before returning to the Universidad Simon Bolivar in Caracas, Venezuela, where he currently teaches. He is interested in models of reasoning, action, planning, and learning.

Jana Koehler is a project manager at the Schindler Lifts Ltd. Technology Management Center in Switzerland. She is currently developing elevator control software based on AI planning techniques. Before joining Schindler, she worked at the German Research Center for AI and held an assistant professorship at the University of Freiburg, Germany, where she developed the IPP planning system. Currently, her interests are centered on real-time planning and scheduling and software verification.

Michael Brenner worked on IPP from spring 1997 until fall 1998, when he went to Paris, France, where he received a Master's in cognitive science. He is now back at the University of Freiburg as a member of the Graduate School on Human and Machine Intelligence, working on multiagent planning and plan execution in dynamic environments.

Joerg Hoffmann was part of the IPP team during the entire project period, from February 1997 until August 1999. He received a Master's in computer science in March 1999. Currently, he is a member of the Graduate School on Human and Machine Intelligence at the University of Freiburg, where he is developing a new planning system based on heuristic forward search.

Frank Rittinger was a student member of the IPP team for the entire project period. Currently, he is working on his Master's thesis at the chair for software engineering at the University of Freiburg, where he is investigating the security aspects of CORBA with formal methods.

Corin Anderson is a fourth-year Ph.D. candidate in the Computer Science and Engineering Department at the University of Washington. He earned Bachelors of Science in computer science and mathematics from the University of Washington in 1996, and he received a Master's in computer science from the same university in 1998. Anderson's primary interests include web site management, planning and scheduling algorithms, and intelligent systems.

Daniel S. Weld received Bachelor's degrees in computer science and biochemistry at Yale University in 1982. He received a Ph.D. from the Massachusetts Institute of Technology Artificial Intelligence Lab in 1988 and immediately joined the Department of Computer Science and Engineering at the University of Washington, where he is now professor. Weld received a Presidential Young Investigator's Award in 1989 and an Office of Naval Research Young Investigator's Award in 1990 and is a fellow of the American Association for Artificial Intelligence. Weld is on the advisory board of the Journal of AI Research, has been guest editor for Computational Intelligence and Artificial Intelligence, and was program chair of the 1996 National Conference on Artificial Intelligence. Weld founded Netbot, Inc., which developed the JANGO comparison-shopping agent (now part of the Excite Shopping Channel); AdRelevance, Inc., an online competitive monitoring service for internet advertisements; and Nimble.com, which develops XML query-processing technology. Weld has published about 100 technical papers on AI, planning, data integration, and software agents.

David E. Smith is head of the planning and scheduling group in the Computational Sciences Division at NASA Ames Research Center, where he is involved in contingency planning for rover operations and research on temporal planning techniques. Prior to joining NASA, he was a senior scientist at the Rockwell Science Center, where his work led to a commercially fielded system for newspaper imposition planning and to the DESIGN SHEET conceptual design system in use throughout Rockwell and Boeing. He received his Ph.D. from Stanford University in 1985. Current research interests include temporal planning, planning under uncertainty, constraint-satisfaction approaches to planning, and preprocessing techniques for planning.

Maria Fox is a reader in computer science at Durham University. She joined the university in 1995 after 6 years at University College London. She obtained her doctorate in 1989 and has worked in aspects of planning, both theoretical and algorithmic, since that time. Currently, her primary interests lie in the development of automatic domain analysis techniques and their exploitation in efficient planning systems.

