Section 4.1. Informed (Heuristic) Search Strategies 99
the goal. All we can do is choose the node that appears to be best according to the evaluation function. If the evaluation function is exactly accurate, then this will indeed be the best node; in reality, the evaluation function will sometimes be off, and can lead the search astray.
Nevertheless, we will stick with the name “best-first search,” because “seemingly-best-first search” is a little awkward.
There is a whole family of BEST-FIRST-SEARCH algorithms with different evaluation functions.¹ A key component of these algorithms is a heuristic function,² denoted $h(n)$:
$h(n)$ = estimated cost of the cheapest path from node $n$ to a goal node.
For example, in Romania, one might estimate the cost of the cheapest path from Arad to Bucharest via the straight-line distance from Arad to Bucharest.
Heuristic functions are the most common form in which additional knowledge of the problem is imparted to the search algorithm. We will study heuristics in more depth in Section 4.2. For now, we will consider them to be arbitrary problem-specific functions, with one constraint: if $n$ is a goal node, then $h(n) = 0$. The remainder of this section covers two ways to use heuristic information to guide search.
Greedy best-first search
Greedy best-first search³ tries to expand the node that is closest to the goal, on the grounds that this is likely to lead to a solution quickly. Thus, it evaluates nodes by using just the heuristic function: $f(n) = h(n)$.
Let us see how this works for route-finding problems in Romania, using the straight-line distance heuristic, which we will call $h_{SLD}$. If the goal is Bucharest, we will need to know the straight-line distances to Bucharest, which are shown in Figure 4.1. For example, $h_{SLD}(In(Arad)) = 366$. Notice that the values of $h_{SLD}$ cannot be computed from the problem description itself. Moreover, it takes a certain amount of experience to know that $h_{SLD}$ is correlated with actual road distances and is, therefore, a useful heuristic.
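Because the straight-line distances have to be supplied from outside the problem description, one simple way to provide them is a lookup table. The following is a minimal sketch, not a prescribed implementation: the names `H_SLD` and `h_sld` are hypothetical, and the city-to-value pairing is assumed to follow the table in Figure 4.1.

```python
# Straight-line distances (km) to Bucharest, assumed to match Figure 4.1.
H_SLD = {
    "Arad": 366, "Bucharest": 0, "Craiova": 160, "Dobreta": 242,
    "Eforie": 161, "Fagaras": 176, "Giurgiu": 77, "Hirsova": 151,
    "Iasi": 226, "Lugoj": 244, "Mehadia": 241, "Neamt": 234,
    "Oradea": 380, "Pitesti": 100, "Rimnicu Vilcea": 193, "Sibiu": 253,
    "Timisoara": 329, "Urziceni": 80, "Vaslui": 199, "Zerind": 374,
}


def h_sld(city: str) -> int:
    """Estimated cost from `city` to Bucharest (hypothetical helper)."""
    return H_SLD[city]


# The one constraint placed on a heuristic: it is zero at a goal node.
assert h_sld("Bucharest") == 0

print(h_sld("Arad"))  # 366
```

A table like this is tied to one goal: a search for a different destination would need its own set of straight-line distances.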
City             h_SLD    City             h_SLD
Arad               366    Mehadia            241
Bucharest            0    Neamt              234
Craiova            160    Oradea             380
Dobreta            242    Pitesti            100
Eforie             161    Rimnicu Vilcea     193
Fagaras            176    Sibiu              253
Giurgiu             77    Timisoara          329
Hirsova            151    Urziceni            80
Iasi               226    Vaslui             199
Lugoj              244    Zerind             374

Figure 4.1 Values of $h_{SLD}$: straight-line distances to Bucharest.
¹ Exercise 4.3 asks you to show that this family includes several familiar uninformed algorithms.
² A heuristic function $h(n)$ takes a node as input, but it depends only on the state at that node.
³ Our first edition called this greedy search; other authors have called it best-first search. Our more general usage of the latter term follows Pearl (1984).
c 2002 by Russell and Norvig. DRAFT---DO NOT DISTRIBUTE
100 Chapter 4. Informed Search and Exploration
[Figure 4.2: search trees at four stages, each node labeled with its straight-line distance to Bucharest.
(a) The initial state: Arad (366).
(b) After expanding Arad: Sibiu (253), Timisoara (329), Zerind (374).
(c) After expanding Sibiu: Arad (366), Fagaras (176), Oradea (380), Rimnicu Vilcea (193).
(d) After expanding Fagaras: Sibiu (253), Bucharest (0).]
Figure 4.2 Stages in a greedy best-first search for Bucharest using the straight-line distance heuristic $h_{SLD}$. Nodes are labeled with their $h$-values.
Figure 4.2 shows the progress of a greedy best-first search using $h_{SLD}$ to find a path from Arad to Bucharest. The first node to be expanded from Arad will be Sibiu, because it is closer to Bucharest than either Zerind or Timisoara. The next node to be expanded will be Fagaras, because it is closest. Fagaras in turn generates Bucharest, which is the goal.
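The expansion order just described can be reproduced with a small priority-queue sketch. This is one possible rendering under stated assumptions, not the definitive algorithm: the names `H_SLD`, `NEIGHBORS`, and `greedy_best_first` are hypothetical, the neighbor lists cover only the three cities expanded in Figure 4.2, and the `max_expansions` guard is an addition, since a tree search like this one can revisit states indefinitely on other problems.

```python
import heapq
from itertools import count

# Straight-line distances to Bucharest for the cities in Figure 4.2.
H_SLD = {"Arad": 366, "Bucharest": 0, "Fagaras": 176, "Oradea": 380,
         "Rimnicu Vilcea": 193, "Sibiu": 253, "Timisoara": 329, "Zerind": 374}

# Partial map (assumption): successors of the cities expanded in Figure 4.2.
NEIGHBORS = {
    "Arad": ["Sibiu", "Timisoara", "Zerind"],
    "Sibiu": ["Arad", "Fagaras", "Oradea", "Rimnicu Vilcea"],
    "Fagaras": ["Sibiu", "Bucharest"],
}


def greedy_best_first(start, goal, max_expansions=1000):
    """Tree search that always expands the frontier node with the lowest h."""
    tie = count()  # tie-breaker so the heap never has to compare paths
    frontier = [(H_SLD[start], next(tie), [start])]
    expanded = []
    while frontier and len(expanded) < max_expansions:
        _, _, path = heapq.heappop(frontier)
        city = path[-1]
        if city == goal:
            return path, expanded
        expanded.append(city)
        for nxt in NEIGHBORS.get(city, []):
            # f(n) = h(n): the cost of the path so far is ignored.
            heapq.heappush(frontier, (H_SLD[nxt], next(tie), path + [nxt]))
    return None, expanded


path, expanded = greedy_best_first("Arad", "Bucharest")
print(path)      # ['Arad', 'Sibiu', 'Fagaras', 'Bucharest']
print(expanded)  # ['Arad', 'Sibiu', 'Fagaras']
```

Storing whole paths on the frontier keeps the sketch short; a fuller implementation would more likely keep parent pointers in node objects.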
For this particular problem, greedy best-first search using $h_{SLD}$ finds a solution without ever expanding a node that is not on the solution path; hence, its search cost is minimal. The solution is not optimal, however: the path via Sibiu and Fagaras to Bucharest is 32 kilometers longer than the path through Rimnicu Vilcea and Pitesti. This shows why the algorithm is called “greedy”: it prefers to take the biggest bite possible out of the remaining cost to reach the goal. It ignores the cost of getting to a node in the first place.
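The 32-kilometer gap can be checked by adding up road distances along the two routes. In the sketch below, the road lengths are an assumption taken from the standard road map used with this example (they are not stated in this section), and `ROAD_KM` and `path_cost` are hypothetical names.

```python
# Road distances in km (assumption: from the standard Romania road map
# for this example; not given in this section).
ROAD_KM = {
    ("Arad", "Sibiu"): 140,
    ("Sibiu", "Fagaras"): 99,
    ("Fagaras", "Bucharest"): 211,
    ("Sibiu", "Rimnicu Vilcea"): 80,
    ("Rimnicu Vilcea", "Pitesti"): 97,
    ("Pitesti", "Bucharest"): 101,
}


def path_cost(path):
    """Sum the road distances between consecutive cities on the path."""
    return sum(ROAD_KM[(a, b)] for a, b in zip(path, path[1:]))


greedy_path = ["Arad", "Sibiu", "Fagaras", "Bucharest"]
cheaper_path = ["Arad", "Sibiu", "Rimnicu Vilcea", "Pitesti", "Bucharest"]

print(path_cost(greedy_path))                            # 450
print(path_cost(cheaper_path))                           # 418
print(path_cost(greedy_path) - path_cost(cheaper_path))  # 32
```

At Sibiu, Fagaras looks better than Rimnicu Vilcea by straight-line distance (176 against 193), which is exactly the comparison that leads the greedy search onto the longer route.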
Minimizing $h(n)$ is susceptible to false starts. Consider the problem of getting from