L04 - SEARCH (MORE SEARCH STRATEGIES)
Outline of this Lecture: Informed Search Strategies
- Best-first search
- Greedy best-first search
- A* search
- Heuristics
- Local search algorithms
- Hill-climbing search
- Simulated annealing search
- Beam search
- Local beam search
Informed Search
Relies on additional knowledge about the problem or domain, frequently expressed through heuristics ("rules of thumb")
Used to distinguish more promising paths towards a goal; may be misled, depending on the quality of the heuristic
In general, performs much better than uninformed search, but is frequently still exponential in time and space for realistic problems
Review: Tree search
Tree search algorithm:
function TREE-SEARCH(problem, fringe) returns a solution, or failure
    fringe ← INSERT(MAKE-NODE(INITIAL-STATE[problem]), fringe)
    loop do
        if fringe is empty then return failure
        node ← REMOVE-FRONT(fringe)
        if GOAL-TEST[problem] applied to STATE(node) succeeds then return node
        fringe ← INSERTALL(EXPAND(node, problem), fringe)
A search strategy is defined by picking the order of node expansion
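The tree-search loop above can be sketched in Python. This is a minimal illustration, not the lecture's own code; `Problem` is a hypothetical interface with `initial`, `goal`, and `successors` attributes (names assumed for illustration), and a FIFO fringe is used, which makes this instance breadth-first:

```python
from collections import deque

class Problem:
    """Minimal problem interface (names assumed for illustration)."""
    def __init__(self, initial, goal, successors):
        self.initial, self.goal, self.successors = initial, goal, successors

def tree_search(problem):
    """Generic tree search with a FIFO fringe (i.e. breadth-first);
    swapping the fringe discipline yields the other strategies."""
    fringe = deque([problem.initial])
    while fringe:
        node = fringe.popleft()                  # Remove-Front
        if node == problem.goal:                 # Goal-Test
            return node
        fringe.extend(problem.successors(node))  # InsertAll(Expand(...))
    return None                                  # failure: fringe exhausted

# tiny example: reach 5 from 0 by repeatedly adding 1 or 2
p = Problem(0, 5, lambda n: [n + 1, n + 2] if n < 5 else [])
```

The only difference between the uninformed strategies and the informed ones below is the order in which the fringe hands back nodes.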
Best-First search
Relies on an evaluation function f(n) that estimates the "desirability" of expanding each node; the most desirable unexpanded node is expanded first
A family of search methods with various evaluation functions; the evaluation function usually gives an estimate of the distance to the goal, often referred to as a heuristic in this context
The node with the lowest value is expanded first
The name is a little misleading: the node with the lowest value for the evaluation function is not necessarily one that is on an optimal path to a goal; if we really knew which node is the best, there would be no need to search
The following is an algorithm for Best-First search:
function BEST-FIRST-SEARCH(problem, EVAL-FN) returns solution
    fringe ← queue with nodes ordered by EVAL-FN
    return TREE-SEARCH(problem, fringe)
Special cases: greedy best-first search, A* search
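A minimal Python sketch of best-first search using a priority-queue fringe; the toy graph and heuristic values below are illustrative assumptions, not taken from the lecture:

```python
import heapq

def best_first_search(start, goal, successors, eval_fn):
    """Always expand the unexpanded node with the lowest eval_fn value."""
    fringe = [(eval_fn(start), start, [start])]   # entries: (f, state, path)
    while fringe:
        f, state, path = heapq.heappop(fringe)    # node with lowest f
        if state == goal:
            return path
        for succ in successors(state):
            heapq.heappush(fringe, (eval_fn(succ), succ, path + [succ]))
    return None                                   # failure

# toy graph and heuristic (illustrative values only)
graph = {'A': ['B', 'C'], 'B': ['D'], 'C': ['D'], 'D': []}
h = {'A': 3, 'B': 2, 'C': 1, 'D': 0}
path = best_first_search('A', 'D', graph.__getitem__, h.__getitem__)
```

With these values the search prefers C (h = 1) over B (h = 2) and returns the path A → C → D.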
Romania with step costs in km
Greedy Best-First search
Minimizes the estimated cost to a goal: expands the node that appears to be closest to a goal
Utilizes a heuristic function as evaluation function: f(n) = h(n) = estimated cost from the current node to a goal
Heuristic functions are problem-specific; for route-finding and similar problems, often the straight-line distance is used
For example, hSLD(n) = straight-line distance from n to Bucharest
Greedy best-first search is often better than depth-first search, although its worst-case time and space complexities are equal or worse
The following is an algorithm for Greedy Best-First search:
function GREEDY-SEARCH(problem) returns solution
    return BEST-FIRST-SEARCH(problem, h)
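Greedy search on the Romania route-finding problem can be sketched as follows. The road connections and straight-line distances to Bucharest are those of the standard AIMA Romania map (only a subset of cities is included here); a visited set is added to avoid the loop problem discussed below:

```python
import heapq

# Romania road-map subset and straight-line distances (km) to Bucharest,
# as in the standard AIMA map
roads = {
    'Arad': ['Sibiu', 'Timisoara', 'Zerind'],
    'Sibiu': ['Arad', 'Fagaras', 'Oradea', 'Rimnicu Vilcea'],
    'Fagaras': ['Sibiu', 'Bucharest'],
    'Rimnicu Vilcea': ['Sibiu', 'Pitesti', 'Craiova'],
    'Pitesti': ['Rimnicu Vilcea', 'Bucharest', 'Craiova'],
    'Timisoara': ['Arad', 'Lugoj'],
    'Zerind': ['Arad', 'Oradea'],
    'Oradea': ['Zerind', 'Sibiu'],
    'Craiova': ['Rimnicu Vilcea', 'Pitesti'],
    'Lugoj': ['Timisoara'],
    'Bucharest': [],
}
h_sld = {'Arad': 366, 'Sibiu': 253, 'Fagaras': 176, 'Rimnicu Vilcea': 193,
         'Pitesti': 100, 'Timisoara': 329, 'Zerind': 374, 'Oradea': 380,
         'Craiova': 160, 'Lugoj': 244, 'Bucharest': 0}

def greedy_search(start, goal):
    """Best-first search with f(n) = h(n): always expand the city
    that appears closest (straight-line) to the goal."""
    fringe = [(h_sld[start], start, [start])]
    visited = set()                    # guards against Iasi-Neamt-style loops
    while fringe:
        _, city, path = heapq.heappop(fringe)
        if city == goal:
            return path
        if city in visited:
            continue
        visited.add(city)
        for nxt in roads[city]:
            heapq.heappush(fringe, (h_sld[nxt], nxt, path + [nxt]))
    return None

route = greedy_search('Arad', 'Bucharest')
```

The search follows Arad → Sibiu → Fagaras → Bucharest, which is not optimal: the route through Rimnicu Vilcea and Pitesti is shorter, illustrating why greedy best-first search is not optimal.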
Greedy best-first search example – first step
Greedy best-first search example – second step
Greedy best-first search example – third step
Greedy best-first search example – fourth step
Properties of greedy best-first search
Complete? No – can get stuck in loops, e.g., Iasi → Neamt → Iasi → Neamt
Time? O(b^m), but a good heuristic can give dramatic improvement
Space? O(b^m) – keeps all nodes in memory
Optimal? No
A* search
Idea: avoid expanding paths that are already expensive
Combines greedy and uniform-cost search to find the (estimated) cheapest path through the current node
Evaluation function: f(n) = g(n) + h(n), the estimated total cost of the path through n to the goal
g(n) = cost so far to reach n (path cost up to n)
h(n) = estimated cost from n to the goal
The heuristic must be admissible: h(n) ≤ h*(n), where h*(n) is the true cost from n to the goal (also require h(n) ≥ 0, so h(G) = 0 for any goal G); it must never overestimate the cost to reach the goal (e.g., hSLD(n) never overestimates the actual road distance)
It is a very good search method, but with complexity problems
The following is an algorithm for A* search:
function A*-SEARCH(problem) returns solution
    return BEST-FIRST-SEARCH(problem, g + h)
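A* on the same Romania problem can be sketched as below. Road distances and straight-line distances are those of the standard AIMA map (subset of cities only); the `best_g` table is an implementation detail added here to skip re-expansions along worse paths:

```python
import heapq

# Road distances (km) and straight-line distances to Bucharest,
# as in the standard AIMA Romania map (subset)
roads = {
    'Arad': [('Sibiu', 140), ('Timisoara', 118), ('Zerind', 75)],
    'Sibiu': [('Arad', 140), ('Fagaras', 99), ('Oradea', 151),
              ('Rimnicu Vilcea', 80)],
    'Fagaras': [('Sibiu', 99), ('Bucharest', 211)],
    'Rimnicu Vilcea': [('Sibiu', 80), ('Pitesti', 97), ('Craiova', 146)],
    'Pitesti': [('Rimnicu Vilcea', 97), ('Bucharest', 101), ('Craiova', 138)],
    'Timisoara': [('Arad', 118), ('Lugoj', 111)],
    'Zerind': [('Arad', 75), ('Oradea', 71)],
    'Oradea': [('Zerind', 71), ('Sibiu', 151)],
    'Craiova': [('Rimnicu Vilcea', 146), ('Pitesti', 138)],
    'Lugoj': [('Timisoara', 111)],
    'Bucharest': [],
}
h_sld = {'Arad': 366, 'Sibiu': 253, 'Fagaras': 176, 'Rimnicu Vilcea': 193,
         'Pitesti': 100, 'Timisoara': 329, 'Zerind': 374, 'Oradea': 380,
         'Craiova': 160, 'Lugoj': 244, 'Bucharest': 0}

def a_star(start, goal):
    """Expand nodes in order of f(n) = g(n) + h(n)."""
    fringe = [(h_sld[start], 0, start, [start])]   # (f, g, state, path)
    best_g = {}                                    # cheapest g seen per city
    while fringe:
        f, g, city, path = heapq.heappop(fringe)
        if city == goal:
            return path, g
        if g >= best_g.get(city, float('inf')):
            continue                               # already reached cheaper
        best_g[city] = g
        for nxt, cost in roads[city]:
            g2 = g + cost
            heapq.heappush(fringe, (g2 + h_sld[nxt], g2, nxt, path + [nxt]))
    return None, float('inf')

route, cost = a_star('Arad', 'Bucharest')
```

Unlike greedy search, A* finds the optimal route Arad → Sibiu → Rimnicu Vilcea → Pitesti → Bucharest (418 km), even though the Fagaras path looks closer at first.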
A* search example – first step
A* search example – second step
A* search example – third step
A* search example – fourth step
A* search example – fifth step
A* search example – sixth step
Admissible heuristics
A heuristic h(n) is admissible if for every node n, h(n) ≤ h*(n), where h*(n) is the true cost to reach the goal state from n.
An admissible heuristic never overestimates the cost to reach the goal, i.e., it is optimistic Example: hSLD(n) (never overestimates the actual road distance)
Theorem: If h(n) is admissible, A* using TREE-SEARCH is optimal
Optimality of A* (proof)
Suppose some suboptimal goal G2 has been generated and is in the fringe. Let n be an unexpanded node in the fringe such that n is on a shortest path to an optimal goal G.
We then have:
f(G2) = g(G2)    since h(G2) = 0
g(G2) > g(G)     since G2 is suboptimal
f(G) = g(G)      since h(G) = 0
f(G2) > f(G)     from the above
Also, since h is admissible, h(n) ≤ h*(n), and since n lies on a shortest path to G, g(n) + h*(n) = g(G); therefore
f(n) = g(n) + h(n) ≤ g(n) + h*(n) = g(G) = f(G)
Hence f(G2) > f(G) ≥ f(n), and A* will never select G2 for expansion
Consistent heuristics
A heuristic is consistent if for every node n and every successor n' of n generated by any action a,
h(n) ≤ c(n, a, n') + h(n')
If h is consistent, we have
f(n') = g(n') + h(n') = g(n) + c(n, a, n') + h(n') ≥ g(n) + h(n) = f(n)
that is, f(n) is non-decreasing along any path.
Theorem: If h(n) is consistent, A* using GRAPH-SEARCH is optimal
Optimality of A* search
A* expands nodes in order of increasing f value
Gradually adds "f-contours" of nodes: contour i contains all nodes with f = fi, where fi < fi+1
A* will find the optimal solution: the first solution found is the optimal one
A* is optimally efficient: no other algorithm is guaranteed to expand fewer nodes than A*
A* is not always "the best" algorithm: optimality refers to the expansion of nodes, and other criteria might be more relevant
A* generates and keeps all nodes in memory; this is improved in variations of A*
Complexity of A*
The number of nodes within the goal contour of the search space is still exponential in the length of the solution; better than other algorithms, but still problematic
Frequently, space complexity is more severe than time complexity: A* keeps all generated nodes in memory
Properties of A* search
The value of f never decreases along any path starting from the initial node
This is also known as monotonicity of the function; almost all admissible heuristics show monotonicity, and those that don't can be modified through minor changes
This property can be used to draw contours: regions where the f-cost is below a certain threshold
With uniform-cost search (h = 0), the contours are circular; the better the heuristic h, the narrower the contour around the optimal path
Complete? Yes (unless there are infinitely many nodes with f ≤ f(G))
Time? Exponential, O(b^d)
Space? Keeps all nodes in memory, O(b^d)
Optimal? Yes
Admissible heuristics
For example, for the 8-puzzle:
h1(n) = number of misplaced tiles
h2(n) = total Manhattan distance (that is, the sum of the horizontal and vertical distances of each tile from its desired location)
h1(S) = 8
h2(S) = 3+1+2+2+2+3+3+2 = 18
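Both heuristics are easy to compute directly. The start state S below is assumed to be the standard AIMA 8-puzzle example (read row by row, with 0 marking the blank), which yields exactly the values above:

```python
def h1(state, goal):
    """Number of misplaced tiles (the blank is not counted)."""
    return sum(1 for s, g in zip(state, goal) if s != 0 and s != g)

def h2(state, goal):
    """Total Manhattan distance of each tile from its goal square."""
    total = 0
    for i, tile in enumerate(state):
        if tile == 0:
            continue                      # skip the blank
        j = goal.index(tile)              # goal position of this tile
        total += abs(i // 3 - j // 3) + abs(i % 3 - j % 3)
    return total

# start state S and goal configuration (standard AIMA example, assumed)
S    = (7, 2, 4, 5, 0, 6, 8, 3, 1)
goal = (0, 1, 2, 3, 4, 5, 6, 7, 8)
```

For this state, h1(S) = 8 and h2(S) = 18, matching the figures quoted above.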
Dominance
If h2(n) ≥ h1(n) for all n (both admissible) then h2 dominates h1
h2 is better for search
Typical search costs (average number of nodes expanded):
d = 12: IDS = 3,644,035 nodes; A*(h1) = 227 nodes; A*(h2) = 73 nodes
d = 24: IDS = too many nodes; A*(h1) = 39,135 nodes; A*(h2) = 1,641 nodes
Relaxed problems
A problem with fewer restrictions on the actions is called a relaxed problem
The cost of an optimal solution to a relaxed problem is an admissible heuristic for the original problem
If the rules of the 8-puzzle are relaxed so that a tile can move anywhere, then h1(n) gives the shortest solution
If the rules are relaxed so that a tile can move to any adjacent square, then h2(n) gives the shortest solution
Heuristics for Searching
For many tasks, a good heuristic is the key to finding a solution: it prunes the search space and guides the search towards the goal
Relaxed problems have fewer restrictions on the successor function (operators); the exact solution of a relaxed problem may be a good heuristic for the original problem
8-Puzzle Heuristics
Level of difficulty: around 20 steps for a typical solution, with a branching factor of about 3
Exhaustive search would examine about 3^20 ≈ 3.5 × 10^9 states, although there are only 9!/2 = 181,440 distinct reachable states (distinct arrangements of the 9 squares)
Candidates for heuristic functions: number of tiles in the wrong position; sum of distances of the tiles from their goal positions (city-block or Manhattan distance)
Generation of heuristics is possible from formal specifications
Local Search and Optimization
For some problem classes, it is sufficient to find a solution; the path to the solution is not relevant
Memory requirements can be dramatically relaxed by modifying the current state: all previous states can be discarded
Since only information about the current state is kept, such methods are called local
Local search algorithms
In many optimization problems, the path to the goal is irrelevant; the goal state itself is the solution
State space = set of "complete" configurations
Find a configuration satisfying constraints, e.g., n-queens
In such cases, we can use local search algorithms: keep a single "current" state and try to improve it
Example: n-queens – put n queens on an n × n board with no two queens on the same row, column, or diagonal
Iterative Improvement Search
For some problems, the state description provides all the information required for a solution; path costs become irrelevant
A global maximum or minimum corresponds to the optimal solution
Iterative improvement algorithms start with some configuration and try modifications to improve its quality
8-queens: number of un-attacked queens; VLSI layout: total wire length
Analogy: state space as a landscape with hills and valleys
Hill-climbing search
"Like climbing Everest in thick fog with amnesia"
Continually moves uphill, i.e., in the direction of increasing value of the evaluation function; gradient descent search is a variation that moves downhill
Very simple strategy with low space requirements: stores only the current state and its evaluation, no search tree
There are some problems:
local maxima – the algorithm can't go higher, but is not at a satisfactory solution
plateaus – areas where the evaluation function is flat
ridges – the search may oscillate and advance only slowly
General problem: depending on the initial state, hill climbing can get stuck in local maxima
Hill-climbing search: 8-queens problem
h = number of pairs of queens that are attacking each other, either directly or indirectly
h = 17 for the above state
A local minimum with h = 1
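Hill climbing on the 8-queens problem, using h = number of attacking pairs, can be sketched as below. This is an illustrative steepest-descent variant (we minimize h rather than maximize its negation); the board representation, one queen per column, is an assumption:

```python
import random

def attacking_pairs(board):
    """h = number of pairs of queens attacking each other.
    board[c] is the row of the queen in column c (one queen per column)."""
    n, h = len(board), 0
    for c1 in range(n):
        for c2 in range(c1 + 1, n):
            if (board[c1] == board[c2] or               # same row
                    abs(board[c1] - board[c2]) == c2 - c1):  # same diagonal
                h += 1
    return h

def hill_climb(board):
    """Steepest descent on h: move one queen within its column to the
    neighbour with the lowest h; stop at a local (possibly non-global)
    minimum, keeping only the current state in memory."""
    board = list(board)
    while True:
        current = attacking_pairs(board)
        best, best_move = current, None
        for col in range(len(board)):
            original = board[col]
            for row in range(len(board)):
                if row == original:
                    continue
                board[col] = row                        # try this move
                h = attacking_pairs(board)
                if h < best:
                    best, best_move = h, (col, row)
            board[col] = original                       # undo trial move
        if best_move is None:
            return board, current     # stuck: local minimum (or a solution)
        board[best_move[0]] = best_move[1]

random.seed(0)
start = [random.randrange(8) for _ in range(8)]
final, h_final = hill_climb(start)
```

From a random start the loop strictly decreases h at every step and therefore always terminates, but it may stop at a state like the h = 1 local minimum mentioned above rather than at h = 0.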
Simulated annealing search
Idea: escape local maxima by allowing some "bad" moves, but gradually decrease their frequency
Similar to hill-climbing, but allows some downhill movement: a random move is made instead of the best move
Depends on two parameters: ΔE, the energy difference between moves, and T, the temperature; the temperature is slowly lowered, making bad moves less likely
Analogy to annealing: the gradual cooling of a liquid until it freezes
Will find the global optimum if the temperature is lowered slowly enough
Applied to routing and scheduling problems
Properties of simulated annealing search
One can prove: If T decreases slowly enough, then simulated annealing search will find a global optimum with probability approaching 1
Widely used in VLSI layout, airline scheduling, etc
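The acceptance rule and cooling schedule can be sketched as below. The toy energy landscape, the geometric cooling schedule, and the parameter values (t0, cooling rate, t_min) are all illustrative assumptions; real applications tune these carefully:

```python
import math
import random

def simulated_annealing(state, energy, neighbour, t0=10.0, cooling=0.995,
                        t_min=1e-3):
    """Accept every downhill move; accept an uphill move with
    probability e^(-dE/T), where the temperature T is slowly lowered."""
    current = best = state
    t = t0
    while t > t_min:
        nxt = neighbour(current)
        delta_e = energy(nxt) - energy(current)   # dE < 0 means improvement
        if delta_e < 0 or random.random() < math.exp(-delta_e / t):
            current = nxt                          # move (possibly uphill)
        if energy(current) < energy(best):
            best = current                         # remember best state seen
        t *= cooling                               # geometric cooling schedule
    return best

# toy 1-D energy landscape: a local minimum near x = 2, global near x = -3
def energy(x):
    return (x - 2) ** 2 * (x + 3) ** 2 + 0.5 * x

def neighbour(x):
    return x + random.uniform(-0.5, 0.5)           # small random move

random.seed(1)
best = simulated_annealing(0.0, energy, neighbour)
```

Early on, when T is large, almost any move is accepted, letting the search wander out of the basin around x = 2; as T falls the process degenerates into hill climbing around whatever basin it occupies.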
Beam search
Beam search is a heuristic search algorithm that is an optimization of best-first search that reduces its memory requirement. Best-first search is a graph search which orders all partial solutions (states) according to some heuristic which attempts to predict how close a partial solution is to a complete solution (goal state). In beam search, only a predetermined number of best partial solutions are kept as candidates.
Beam search uses breadth-first search to build its search tree. At each level of the tree, it generates all successors of the states at the current level, sorting them in order of increasing heuristic values. However, it only stores a predetermined number of states at each level (called the beam width). The smaller the beam width, the more states are pruned. Therefore, with an infinite beam width, no states are pruned and beam search is identical to breadth-first search. The beam width bounds the memory required to perform the search, at the expense of risking completeness (possibility that it will not terminate) and optimality (possibility that it will not find the best solution). The reason for this risk is that the goal state could potentially be pruned.
The beam width can either be fixed or variable. In a fixed beam width, a maximum number of successor states is kept. In a variable beam width, a threshold is set around the current best state. All states that fall outside this threshold are discarded. Thus, in places where the best path is obvious, a minimal number of states is searched. In places where the best path is ambiguous, many paths will be searched.
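A fixed-width beam search over a toy state space can be sketched as follows; the successor function and heuristic are illustrative assumptions, and a visited set is added so states are not regenerated:

```python
import heapq

def beam_search(start, goal, successors, h, beam_width=2):
    """Breadth-first, level by level, keeping only the beam_width states
    with the lowest heuristic values at each level."""
    level = [start]
    visited = {start}
    while level:
        if goal in level:
            return goal
        candidates = []
        for state in level:
            for s in successors(state):
                if s not in visited:
                    visited.add(s)
                    candidates.append(s)
        # prune: keep only the beam_width most promising successors
        level = heapq.nsmallest(beam_width, candidates, key=h)
    return None    # the goal may have been pruned: beam search is incomplete

# toy example: reach 0 from 9 by subtracting 1, 2 or 3; h = distance to 0
found = beam_search(9, 0,
                    lambda n: [n - 1, n - 2, n - 3] if n > 0 else [],
                    h=abs, beam_width=2)
```

With an infinite `beam_width` the pruning step keeps everything and this reduces to breadth-first search, as described above.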
Local beam search
A variation of beam search: a path-based method that looks at several paths "around" the current one
Keeps track of k states rather than just one; information between the states can be shared, so the search moves to the most promising areas
Start with k randomly generated states
At each iteration, all the successors of all k states are generated
If any one is a goal state, stop; otherwise select the k best successors from the complete list and repeat
Stochastic local beam search selects the k successor states randomly, with a probability determined by the evaluation function
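The k-state loop above can be sketched as follows; the toy problem (climb from random integers towards 10), the value function, and the iteration cap are illustrative assumptions:

```python
import random

def local_beam_search(k, random_state, successors, value, goal_test,
                      max_iters=100):
    """Keep k states; each iteration pools all successors of all k states
    and keeps the k best, so information is shared through the pool."""
    states = [random_state() for _ in range(k)]   # k random starting states
    for _ in range(max_iters):
        for s in states:
            if goal_test(s):
                return s
        pool = [s2 for s in states for s2 in successors(s)]
        if not pool:
            break                                  # no successors anywhere
        pool.sort(key=value, reverse=True)         # best successors first
        states = pool[:k]                          # keep the k best overall
    return None

# toy: reach exactly 10 from random starts by adding 1 or 2;
# value rewards states closest to 10
random.seed(2)
found = local_beam_search(
    k=3,
    random_state=lambda: random.randint(0, 5),
    successors=lambda n: [n + 1, n + 2] if n < 10 else [],
    value=lambda n: -abs(n - 10),
    goal_test=lambda n: n == 10,
)
```

The crucial difference from running k independent hill climbs is the shared pool: all k slots can migrate into the single most promising region. The stochastic variant would replace the deterministic `pool[:k]` selection with sampling weighted by `value`.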