Page 1: Lookahead pathology in real-time pathfinding

Lookahead pathology in real-time pathfinding

Mitja LuštrekJožef Stefan Institute, Department of Intelligent Systems

Vadim BulitkoUniversity of Alberta, Department of Computer Science

Page 2: Lookahead pathology in real-time pathfinding

Introduction

Problem

Explanation

Page 3: Lookahead pathology in real-time pathfinding

Agent-centered search (LRTS)

Current state

Goal state

Lookahead area

Lookahead depth d

Page 4: Lookahead pathology in real-time pathfinding

Agent-centered search (LRTS)

Frontier state

True shortest distance g

Estimated shortest distance h

f = g + h

Page 5: Lookahead pathology in real-time pathfinding

Agent-centered search (LRTS)

Frontier state with the lowest f (f_opt)

Page 6: Lookahead pathology in real-time pathfinding

Agent-centered search (LRTS)

Page 7: Lookahead pathology in real-time pathfinding

Agent-centered search (LRTS)

h = f_opt
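
The pieces above combine into one planning step: search the lookahead area to depth d, pick the frontier state with the lowest f, raise h of the current state to f_opt, and move. Here is a minimal sketch in Python, assuming unit-cost moves, a hypothetical neighbors(state) function, and the heuristic kept in a dict h; the real LRTS algorithm has further parameters (e.g., a learning quota) that are omitted here.

```python
from collections import deque

def lrts_step(current, h, neighbors, d):
    """One LRTS-style step: depth-d lookahead, heuristic update, move.

    `neighbors` and the dict `h` are assumed helpers, not the original
    algorithm's interface; moves are assumed to cost 1, and the area is
    assumed large enough to have a frontier (goal handling omitted).
    """
    # Breadth-first expansion of the lookahead area around the agent.
    g = {current: 0}              # true in-area distance from `current`
    parent = {current: None}
    frontier = []                 # states exactly d moves away
    queue = deque([current])
    while queue:
        s = queue.popleft()
        if g[s] == d:
            frontier.append(s)    # rim of the lookahead area
            continue
        for n in neighbors(s):
            if n not in g:
                g[n] = g[s] + 1
                parent[n] = s
                queue.append(n)

    # f = g + h for each frontier state; head for the most promising one.
    best = min(frontier, key=lambda s: g[s] + h[s])
    f_opt = g[best] + h[best]

    # Learning: raise the current state's heuristic to f_opt
    # (max() keeps the update monotone, never lowering h).
    h[current] = max(h[current], f_opt)

    # In-area path to the chosen frontier state; the agent follows it
    # (up to d moves) before searching again.
    path = []
    s = best
    while s != current:
        path.append(s)
        s = parent[s]
    path.reverse()
    return path
```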

Page 8: Lookahead pathology in real-time pathfinding

Agent-centered search (LRTS)

Page 9: Lookahead pathology in real-time pathfinding

Lookahead pathology

It is generally believed that larger lookahead depths produce better solutions.
Solution-length pathology: larger lookahead depths produce worse solutions.

Lookahead depth: 1  2  3  4  5  6  7
Solution length: 11 10 8  10 7  8  7

Degree of pathology = 2
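
Reading the degree of pathology off such a table is mechanical: count the adjacent depth increases that make the measure worse. A small sketch (the list indexing is my convention, not the paper's):

```python
def degree_of_pathology(values):
    """Number of times the measure worsens as depth grows by one.

    values[i] is the solution length (or error) at lookahead depth
    i + 1; for the table above, [11, 10, 8, 10, 7, 8, 7] gives 2.
    """
    return sum(1 for a, b in zip(values, values[1:]) if b > a)
```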

Page 10: Lookahead pathology in real-time pathfinding

Lookahead pathology

Pathology can also be measured on states that do not form a path.
Error pathology: larger lookahead depths produce more suboptimal decisions.

Multiple states:

Depth: 1    2    3    4    5    6    7
Error: 0.31 0.25 0.21 0.24 0.18 0.23 0.12

Degree of pathology = 2

One state:

Depth:    1          2          3       4       5       6          7
Decision: suboptimal suboptimal optimal optimal optimal suboptimal suboptimal

There is pathology

Page 11: Lookahead pathology in real-time pathfinding

Introduction

Problem

Explanation

Page 12: Lookahead pathology in real-time pathfinding

Our setting

HOG – Hierarchical Open Graph [Sturtevant et al.]

Maps from commercial computer games (Baldur’s Gate, Warcraft III)

Initial heuristic: octile distance (true distance assuming an empty map; see the sketch below)

1,000 problems (map, start state, goal state)
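
The octile distance is the shortest-path length on an empty eight-connected grid, where straight moves cost 1 and diagonal moves cost √2; as a quick sketch:

```python
import math

def octile_distance(x1, y1, x2, y2):
    """Shortest-path length on an empty 8-connected grid."""
    dx, dy = abs(x1 - x2), abs(y1 - y2)
    # Go diagonally min(dx, dy) steps, then straight for the rest.
    return max(dx, dy) + (math.sqrt(2) - 1) * min(dx, dy)
```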

Page 13: Lookahead pathology in real-time pathfinding

On-policy experiments

The agent follows a path from the start state to the goal state, updating the heuristic along the way.
Solution length and error over the whole path are computed for each lookahead depth -> pathology

[Diagram: paths followed at lookahead depths d = 1, 2, 3]

Page 14: Lookahead pathology in real-time pathfinding

Off-policy experiments

The agent spawns in a number of states
It takes one move towards the goal state
Heuristic not updated
Error is computed from these first moves -> pathology

[Diagram: first moves from several spawn states at lookahead depths d = 1, 2, 3]

Page 15: Lookahead pathology in real-time pathfinding

Basic on-policy experiment

A lot of pathology – over 60%!

First explanation: a lot of states are intrinsically pathological (off-policy mode)
Not true: only 3.9% are
If the topology of the maps is not at fault, perhaps the algorithm is to blame?

Degree of pathology:  0    1    2    3    4    ≥ 5
Length (problems %):  38.1 12.8 18.2 16.1 9.5  5.3
Error (problems %):   38.5 15.1 20.3 17.0 7.6  1.5

Page 16: Lookahead pathology in real-time pathfinding

Off-policy experiment on 188 states

Not much less pathology than on-policy: 42.2% vs. 61.5%

Degree of pathology: 0    1    2   3   ≥ 4
Problems %:          57.8 31.4 9.4 1.4 0.0

The comparison is not fair:
On-policy: pathology from error over a number of states
Off-policy: whether single states are pathological

Fair: off-policy error over the same number of states as on-policy – 188 (chosen randomly)

Can use only error – no solution length off-policy

Page 17: Lookahead pathology in real-time pathfinding

Tolerance

The first off-policy experiment showed little pathology, the second one quite a lot
Perhaps off-policy pathology is caused by minor differences in error – noise
Introduce tolerance t:

an increase in error counts towards the pathology only if error(d1) > t ∙ error(d2)
set t so that the pathology in the off-policy experiment on 188 states is < 5%: t = 1.09
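
In terms of the counting sketch from earlier, tolerance changes the comparison in one place; a sketch (function name and indexing are my assumptions):

```python
def tolerant_degree_of_pathology(errors, t=1.09):
    """Count an increase from one depth to the next as pathological
    only if the larger depth's error exceeds t times the smaller's."""
    return sum(1 for e2, e1 in zip(errors, errors[1:]) if e1 > t * e2)
```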

Page 18: Lookahead pathology in real-time pathfinding

Experiments with t = 1.09

On-policy changes little vs. t = 1: 57.7% vs. 61.9%
Apparently on-policy pathology is more severe than off-policy
Investigate why!
From here on, these two serve as the basic on-policy experiment and the basic off-policy experiment

Degree of pathology:  0    1    2    3    4   ≥ 5
On-policy (prob. %):  42.3 19.7 21.2 12.9 3.6 0.3
Off-policy (prob. %): 95.7 3.7  0.6  0.0  0.0 0.0

Page 19: Lookahead pathology in real-time pathfinding

Introduction

Problem

Explanation

Page 20: Lookahead pathology in real-time pathfinding

Hypothesis 1

LRTS tends to visit pathological states with an above-average frequency

Test: compute pathology from the states visited on-policy instead of 188 random states

Degree of pathology: 0    1   2   3   ≥ 4
Problems %:          93.6 5.3 0.9 0.2 0.0

More pathology than in random states: 6.3% vs. 4.3%
Much less pathology than basic on-policy: 6.3% vs. 57.7%
Hypothesis 1 is correct, but it is not the main reason for on-policy pathology

Page 21: Lookahead pathology in real-time pathfinding

Is learning the culprit?

There is learning (updating the heuristic) on-policy, but not off-policy
Learning is necessary on-policy, otherwise the agent gets caught in infinite loops

Test: traverse paths in the normal on-policy manner, but measure error without learning

Degree of pathology: 0    1    2   3   4   ≥ 5
Problems %:          79.8 14.2 4.5 1.2 0.3 0.0

Less pathology than basic on-policy: 20.2% vs. 57.7%
Still more pathology than basic off-policy: 20.2% vs. 4.3%
Learning is a reason, although not the only one

Page 22: Lookahead pathology in real-time pathfinding

Hypothesis 2

Larger fraction of updated states at smaller depths

[Diagram: an updated state inside the current lookahead area]

Page 23: Lookahead pathology in real-time pathfinding

Hypothesis 2

Smaller lookahead depths benefit more from learning
This makes their decisions better than the mere depth suggests
Thus they are closer to larger depths
If they are closer to larger depths, cases where a larger depth happens to be worse than a smaller depth are more common

Test: equalize the depths by learning as much as possible in the whole lookahead area – uniform learning (see the sketch below)
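
One plausible reading of “learning as much as possible in the whole lookahead area” is a Dijkstra sweep from the frontier back into the area, raising every interior state’s heuristic to its cheapest cost through the frontier. A sketch under the same unit-cost assumptions as the earlier lrts_step sketch:

```python
import heapq
import itertools

def uniform_update(area, frontier, h, neighbors):
    """Raise h for every state in `area` to the cheapest
    distance-plus-heuristic value obtainable through the frontier.

    `area` is the set of expanded states, `frontier` its rim; the
    names follow the earlier sketch and are assumptions.
    """
    tie = itertools.count()   # tie-breaker so the heap never compares states
    best = {s: h[s] for s in frontier}
    heap = [(best[s], next(tie), s) for s in frontier]
    heapq.heapify(heap)
    while heap:
        cost, _, s = heapq.heappop(heap)
        if cost > best[s]:
            continue          # stale heap entry
        for n in neighbors(s):
            if n in area and cost + 1 < best.get(n, float("inf")):
                best[n] = cost + 1
                heapq.heappush(heap, (best[n], next(tie), n))
    # Monotone update: raise, never lower, the stored heuristic.
    for s in area:
        if s in best:
            h[s] = max(h[s], best[s])
```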

Pages 24–32: Lookahead pathology in real-time pathfinding

Uniform learning

[Animation: search and update steps alternate while the heuristic is raised across the whole lookahead area]

Page 33: Lookahead pathology in real-time pathfinding

Pathology with uniform learning

Even more pathology than basic on-policy: 59.1% vs. 57.7%
Is Hypothesis 2 wrong?

Let us look at the volume of heuristic updates encountered per state generated during search

This seems to be the best measure of the benefit of learning

Degree of pathology: 0    1    2    3    4   ≥ 5
Problems %:          40.9 20.2 22.1 12.3 4.2 0.3

Page 34: Lookahead pathology in real-time pathfinding

Volume of updates encountered

[Chart: update volume encountered per state generated vs. lookahead depth (1–10), for basic on-policy, on-policy with uniform learning, and basic off-policy]

Hypothesis 2 is correct after all

Page 35: Lookahead pathology in real-time pathfinding

Hypothesis 3

On-policy: one search every d moves, so fewer searches at larger depths
Off-policy: one search every move

Page 36: Lookahead pathology in real-time pathfinding

Hypothesis 3

The difference between depths in the amount of search is smaller on-policy than off-policy

This makes the depths closer on-policy

If they are closer, cases where a larger depth happens to be worse than a smaller depth are more common

Test: search every move on-policy
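
In terms of the earlier lrts_step sketch, this test simply takes one move from the returned path and re-plans, instead of following all d moves:

```python
def step_searching_every_move(current, h, neighbors, d):
    """On-policy variant that searches before every single move,
    matching the off-policy search frequency."""
    path = lrts_step(current, h, neighbors, d)
    return path[0]    # take one move, then search again
```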

Page 37: Lookahead pathology in real-time pathfinding

Pathology when searching every move

Less pathology than basic on-policy: 13.1% vs. 57.7%
Still more pathology than basic off-policy: 13.1% vs. 4.3%
Hypothesis 3 is correct; the remaining pathology is due to Hypotheses 1 and 2

Further test: number of states generated per move

Degree of pathology: 0    1   2   3   4   ≥ 5
Problems %:          86.9 9.0 3.3 0.6 0.2 0.0

Page 38: Lookahead pathology in real-time pathfinding

States generated / move

[Chart: states generated per move vs. lookahead depth (1–10), for basic on-policy, on-policy searching every move, and basic off-policy]

Hypothesis 3 confirmed again

Page 39: Lookahead pathology in real-time pathfinding

Summary of explanation

On-policy pathology is caused by different lookahead depths being closer to each other in terms of the quality of decisions than the mere depths would suggest:
due to the volume of heuristic updates encountered per state generated
due to the number of states generated per move

LRTS tends to visit pathological states with an above-average frequency

Page 40: Lookahead pathology in real-time pathfinding

Thank you.

Questions?

