Planning
In order to solve most nontrivial problems, it is necessary to combine some of the basic problem-solving strategies discussed in Chapter 3 with one or more of the knowledge representation mechanisms that have just been presented. It is often also useful to divide the problem to be solved into smaller pieces and to solve those pieces separately, to the extent that is possible. In this chapter, we describe several techniques for doing this in order to construct plans for solving hard problems.
In Chapter 2, we described the process of problem solving as a search through a state space in which each point corresponded to a situation that might arise. The search started with an initial situation and performed a sequence of allowable operations until a situation corresponding to a goal was reached. Then, in Chapter 3, we described a variety of ways of moving through such a search space in an attempt to find a solution to a particular problem. For example, the A* algorithm provides a way of conducting a best-first search through a graph representing a problem space. Each node that is examined in the A* algorithm represents a description of a complete problem state, and each operator describes a way of changing the total state description. For simple problems, such as, say, the 8-puzzle, manipulating the complete state description at one time is easy and reasonable.
However, for more complicated problem domains, it becomes important to be able to work on small pieces of a problem separately and then to combine the partial solutions at the end into a complete problem solution. Unless we can do this, the number of combinations of the states of the components of a problem becomes too large to handle in the amount of time available. There are two ways in which it is important to be able to perform this decomposition.
First of all, we must avoid having to recompute the entire problem state when we move from one state to the next. Instead, we want to consider only that part of the state that may have changed. For example, if we move from one room to another, this does not affect the locations of the doors and the windows in the two rooms. The frame problem, which is the problem of how to determine which things change and which do not, becomes increasingly important as the complexity of the problem state increases. It is not difficult to figure out how the state of the 8-puzzle should change after every move, nor is it a lot of work to record explicitly a new copy of the state with the appropriate changes made. Our rules for moving from one state to another can simply describe how one entire board position should be transformed into another.
But if we are considering the problem of guiding a robot around an ordinary house, the situation is much more complex. The description of a single state is very large since it must describe the location of each object in the house as well as that of the robot. A given action on the part of the robot will change only a small part of the total state. If the robot pushes a table across the room, then the locations of the table and all of the objects that were on it will change. But the locations of the other objects in the house will not. Instead of writing rules that describe transformations of one entire state into another, we would like to write rules that describe only the affected parts of the state description. The rest of the description can then be assumed to stay constant.
The second important way in which decomposition can make the solution of hard problems easier is the division of a single difficult problem into several, hopefully easier, subproblems. The AO* algorithm provides a way of doing this when it is possible to decompose the original problem into completely separate subproblems. Although this is sometimes possible, it often is not. Instead, many problems can be viewed as nearly decomposable [Simon, 1981], by which we mean that they can be divided into subproblems that have only a small amount of interaction. For example, suppose that we want to move all the furniture out of a room. This problem can be decomposed into a set of smaller problems, each involving moving one piece of furniture out of the room. Within each of these subproblems, considerations such as removing drawers can be addressed separately for each piece of furniture. But if there is a bookcase behind a couch, then we must move the couch before we can move the bookcase. To solve such nearly decomposable problems, we would like a method that enables us to work on each subproblem separately, using techniques such as the ones we have already studied, and then to record potential interactions among subproblems and to handle them appropriately.
Several methods for doing these two kinds of decomposition have been proposed and we investigate them in this chapter. These methods focus on ways of decomposing the original problem into appropriate subparts and on ways of recording and handling interactions among the subparts as they are detected during the problem-solving process. The use of these methods is often called planning.
In everyday usage, the word planning refers to the process of computing several steps of a problem-solving procedure before executing any of them. When we describe computer problem-solving behavior, the distinction between planning and doing fades a bit since rarely can the computer actually do much of anything besides plan. In solving the 8-puzzle, for example, it cannot actually push any tiles around. So when we discussed the computer solution of the 8-puzzle problem, what we were really doing was outlining the way the computer might generate a plan for solving it. For problems such as the 8-puzzle, the distinction between planning and doing is unimportant.
But in other situations, the distinction may be critical. Recall that in Chapter 2 one of the problem characteristics we discussed was whether solution steps could be ignored or undone if they prove unwise. If they can, then the process of planning a complete solution can proceed just as would an attempt to find a solution by actually trying particular actions. If a dead-end path is detected, then a new one can be explored by backtracking to the last choice point. So, for example, in solving the 8-puzzle, a computer could look for a solution plan in the same way as a person who was actually trying to solve the problem by moving tiles on a board. If solution steps in the real world cannot be ignored or undone, though, planning becomes extremely important. Although real world steps may be irrevocable, computer simulation of those steps is not. So we can circumvent the constraints of the real world by looking for a complete solution in a simulated world in which backtracking is allowed. After we find a solution, we can execute it in the real world.
The success of this approach, however, hinges on another characteristic of a problem's domain: Is its universe predictable? If we look for a solution to a problem by actually carrying out sequences of operations, then at any step of the process we can be sure of the outcome of that step; it is whatever happened. But in an unpredictable universe, we cannot know the outcome of a solution step if we are only simulating it by computer. At best, we can consider the set of possible outcomes, possibly in some order according to the likelihood of the outcomes occurring. But then when we produce a plan and attempt to execute it, we must be prepared in case the actual outcome is not what we expected. If the plan included paths for all possible outcomes of each step, then we can simply traverse the paths that turn out to be appropriate. But often there are a great many possible outcomes, most of which are highly unlikely. In such situations, it would be a great waste of effort to formulate plans for all contingencies.
Instead, we have two choices. We can just take things one step at a time and not really try to plan ahead. This is the approach that is taken in reactive systems, which we will describe in Section 13.7. Our other choice is to produce a plan that is likely to succeed. But then what should we do if it fails? One possibility is simply to throw away the rest of the plan and start the planning process over, using the current situation as the new initial state. Sometimes, this is a reasonable thing to do.
But often the unexpected consequence does not invalidate the entire rest of the plan. Perhaps a small change, such as an additional step, is all that is necessary to make it possible for the rest of the plan to be useful. Suppose, for example, that we have a plan for baking an angel food cake. It involves separating some eggs. While carrying out the plan, we turn out to be slightly clumsy and one of the egg yolks falls into the dish of whites. We do not need to create a completely new plan (unless we decide to settle for some other kind of cake). Instead, we simply redo the egg-separating step until we get it right and then continue with the rest of the plan. This is particularly true for decomposable or nearly decomposable problems. If the final plan is really a composite of many smaller plans for solving a set of subproblems, then if one step of the plan fails, the only part of the remaining plan that can be affected is the rest of the plan for solving that subproblem. The rest of the plan is unrelated to that step. If the problem was only partially decomposable, then any subplans that interact with the affected one may also be affected. So, just as it was important during the planning process to keep track of interactions as they arise, it is important to record information about interactions along with the final plan so that if unexpected events occur at execution time, the interactions can be considered during replanning.
Hardly any aspect of the real world is completely predictable. So we must always be prepared to have plans fail. But, as we have just seen, if we have built our plan by decomposing our problem into as many separate (or nearly separate) subproblems as possible, then the impact on our plan of the failure of one particular step may be quite local. Thus we have an additional argument in favor of the problem-decomposition approach to problem solving. In addition to reducing the combinatorial complexity of the problem-solving process, it also reduces the complexity of the dynamic plan revision process that may be required during the execution of a plan in an unpredictable world (such as the one in which we live).
In order to make it easy to patch up plans if they go awry at execution time, we will find that it is useful during the planning process not only to record the steps that are to be performed but also to associate with each step the reasons why it must be performed. Then, if a step fails, it is easy, using techniques for dependency-directed backtracking, to determine which of the remaining parts of the plan were dependent on it and so may need to be changed. If the plan-generation process proceeds backward from the desired goal state, then it is easy to record this dependency information. If, on the other hand, it proceeded forward from the start state, determining the necessary dependencies may be difficult. For this reason and because, for most problems, the branching factor is smaller going backward, most planning systems work primarily in a goal-directed mode in which they search backward from a goal state to an achievable initial state.
In the next several sections, a variety of planning techniques are presented. All of them, except the last, are problem-solving methods that rely heavily on problem decomposition. They deal (to varying degrees of success) with the inevitable interactions among the components that they generate.
The techniques we are about to discuss can be applied in a wide variety of task domains, and they have been. But to make it easy to compare the variety of methods we consider, we shall find it useful to look at all of them in a single domain that is complex enough that the need for each of the mechanisms is apparent yet simple enough that easy-to-follow examples can be found. The blocks world is such a domain. There is a flat surface on which blocks can be placed. There are a number of square blocks, all the same size. They can be stacked one upon another. There is a robot arm that can manipulate the blocks. The actions it can perform include:

UNSTACK(A, B): Pick up block A from its current position on block B. The arm must be empty and block A must have no blocks on top of it.
STACK(A, B): Place block A on block B. The arm must already be holding A, and the surface of B must be clear.
PICKUP(A): Pick up block A from the table and hold it. The arm must be empty and there must be nothing on top of block A.
PUTDOWN(A): Put block A down on the table. The arm must have been holding block A.
Notice that in the world we have described, the robot arm can hold only one block at a time. Also, since all blocks are the same size, each block can have at most one other block directly on top of it. [Footnote: Actually, by careful alignment, two blocks could be placed on top of one, but we ignore that possibility.]
In order to specify both the conditions under which an operation may be performed and the results of performing it, we need to use the following predicates:

ON(A, B): Block A is on block B.
ONTABLE(A): Block A is on the table.
CLEAR(A): There is nothing on top of block A.
HOLDING(A): The arm is holding block A.
ARMEMPTY: The arm is holding nothing.
Various logical statements are true in this blocks world. For example,
[∃x : HOLDING(x)] → ¬ARMEMPTY
∀x : ONTABLE(x) → ¬∃y : ON(x, y)
∀x : [¬∃y : ON(y, x)] → CLEAR(x)
The first of these statements says simply that if the arm is holding anything, then it is not empty. The second says that if a block is on the table, then it is not also on another block. The third says that any block with no blocks on it is clear.
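To make the examples in this chapter concrete, it helps to have an encoding in mind. The following Python fragment is a minimal sketch of our own (not taken from any particular planner): a state is a set of ground predicates, each written as a tuple, and CLEAR can be derived in the manner of the third axiom above.

```python
# A minimal sketch (illustrative only): a blocks-world state as a set of
# ground predicates, each written as a Python tuple.
initial_state = {
    ("ON", "A", "B"),     # block A is on block B
    ("ONTABLE", "B"),     # block B is on the table
    ("CLEAR", "A"),       # nothing is on top of A
    ("ARMEMPTY",),        # the arm is holding nothing
}

def is_clear(state, block):
    """Mirror the third axiom: a block is clear if no ON(y, block) holds.
    (The state above also records CLEAR(A) explicitly, as in Figure 13.1.)"""
    return not any(f[0] == "ON" and f[2] == block for f in state)

print(is_clear(initial_state, "A"))   # True
print(is_clear(initial_state, "B"))   # False: ON(A, B) holds
```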
In problem-solving systems based on the elementary techniques discussed in Chapter 3, it was necessary to perform each of the following functions:

1. Choose the best rule to apply next based on the best available heuristic information.
2. Apply the chosen rule to compute the new problem state that arises from its application.
3. Detect when a solution has been found.
4. Detect dead ends so that they can be abandoned and the system's effort directed in more fruitful directions.

In the more complex systems we are about to explore, techniques for doing each of these tasks are also required. In addition, a fifth operation is often important:

5. Detect when an almost correct solution has been found and employ special techniques to make it totally correct.
Before we discuss specific planning methods, we need to look briefly at the ways in which each of these five things can be done.
Choosing Rules to Apply
The most widely used technique for selecting appropriate rules to apply is first to isolate a set of differences between the desired goal state and the current state and then to identify those rules that are relevant to reducing those differences. If several rules are found, a variety of other heuristic information can be exploited to choose among them. This technique is based on the means-ends analysis method (recall Chapter 3). For example, if our goal is to have a white fence around our yard and we currently have a brown fence, we would select operators whose result involves a change of color of an object. If, on the other hand, we currently have no fence, we must first consider operators that involve constructing wooden objects.
Applying Rules
In the simple systems we have previously discussed, applying rules was easy. Each rule simply specified the problem state that would result from its application. Now, however, we must be able to deal with rules that specify only a small part of the complete problem state. There are many ways of doing this.
One way is to describe, for each action, each of the changes it makes to the state description. In addition, some statement that everything else remains unchanged is also necessary. An example of this approach is described in Green [1969]. In this system, a given state was described by a set of predicates representing the facts that were true in that state. Each distinct state was represented explicitly as part of the predicate. For example, Figure 13.1 shows how a state, called S0, of a simple blocks world problem could be represented.
The manipulation of these state descriptions was done using a resolution theorem prover. So, for example, the effect of the operator UNSTACK(x, y) could be described by the following axiom. (In all the axioms given in this section, all variables are universally quantified unless otherwise indicated.)
[CLEAR(x, s) ^ ON(x, y, s)] →
    [HOLDING(x, DO(UNSTACK(x, y), s)) ^
     CLEAR(y, DO(UNSTACK(x, y), s))]
Here, DO is a function that specifies, for a given state and a given action, the new state that results from the execution of the action. The axiom states that if CLEAR(x) and ON(x, y) both hold in state s, then HOLDING(x) and CLEAR(y) will hold in the state that results from DOing an UNSTACK(x, y), starting in state s.
If we execute UNSTACK(A, B) in state S0 as defined above, then we can prove, using our assertions about S0 and our axiom about UNSTACK, that in the state that results from the unstacking operation (we call this state S1),
HOLDING(A, S1) ^ CLEAR(B, S1)
But what else do we know about the situation in state S1? Intuitively, we know that B is still on the table. But with what we have so far, we cannot derive it. To enable us to do so, we need also to provide a set of rules, called frame axioms, that describe components of the state that are not affected by each operator. So, for example, we need to say that
ONTABLE(z, s) → ONTABLE(z, DO(UNSTACK(x, y), s))
This axiom says that the ONTABLE relation is never affected by the UNSTACK operator. We also need to say that the ON relation is only affected by the UNSTACK operator if the blocks involved in the ON relation are the same ones involved in the UNSTACK operation. This can be said as
[ON(m, n, s) ^ ¬EQUAL(m, x)] → ON(m, n, DO(UNSTACK(x, y), s))
The advantage of this approach is that a single mechanism, resolution, can perform all the operations that are required on state descriptions. The price we pay for this, however, is that the number of axioms that are required becomes very large if the problem-state descriptions are complex. For example, suppose that we are interested not only in the positions of our blocks but also in their color. Then, for every operation (except possibly PAINT), we would need an axiom such as the following:
COLOR(x, c, s) → COLOR(x, c, DO(UNSTACK(y, z), s))
To handle complex problem domains, we need a mechanism that does not require a large number of explicit frame axioms. One such mechanism is that used by the early robot problem-solving system STRIPS [Fikes and Nilsson, 1971] and its descendants. In this approach, each operator is described by a list of new predicates that the operator causes to become true and a list of old predicates that it causes to become false. These two lists are called the ADD and DELETE lists, respectively. A third list must also be specified for each operator. This PRECONDITION list contains those predicates that must be true for the operator to be applied. The frame axioms of Green's system are specified implicitly in STRIPS. Any predicate not included in either the ADD or DELETE list of an operator is assumed to be unaffected by it. This means that, in specifying each operator, we need not consider aspects of the domain that are unrelated to it. Thus we need say nothing about the relationship of UNSTACK to COLOR. Of course, this means that some mechanism other than simple theorem proving must be used to compute complete state descriptions after operations have been performed.
STRIPS-style operators that correspond to the blocks world operations we have been discussing are shown in Figure 13.2. Notice that for simple rules such as these, the PRECONDITION list is often identical to the DELETE list. In order to pick up a block, the robot arm must be empty; as soon as it picks up a block, it is no longer empty. But preconditions are not always deleted. For example, in order for the arm to pick up a block, the block must have no other blocks on top of it. After it is picked up, it still has no blocks on top of it. This is the reason that the PRECONDITION and DELETE lists must be specified separately.
By making the frame axioms implicit, we have greatly reduced the amount of information that must be provided for each operator. This means, among other things, that when a new attribute that objects might possess is introduced into the system, it is not necessary to go back and add a new axiom for each of the existing operators. But how can we actually achieve the effect of the use of the frame axioms in computing complete state descriptions? The first thing we notice is that for complex state descriptions, most of the state remains unchanged after each operation. But if we represent the state as an explicit part of each predicate, as was done in Green's system, then all that information must be deduced all over again for each state. To avoid that, we can drop the explicit state indicator from the individual predicates and instead simply update a single database of predicates so that it always describes the current state of the world. For example, if we start with the situation shown in Figure 13.1, we would describe it as
ON(A, B) ^ ONTABLE(B) ^ CLEAR(A)
After applying the operator UNSTACK(A, B), our description of the world would be
ONTABLE(B) ^ CLEAR(A) ^ CLEAR(B) ^ HOLDING(A)
This is derived using the ADD and DELETE lists specified as part of the UNSTACK operator.
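In the set-of-predicates encoding sketched earlier, applying a STRIPS-style operator is just set arithmetic. The fragment below is our own sketch (the dictionary layout and function names are illustrative); the operator definition follows the PRECONDITION, ADD, and DELETE lists described in the text.

```python
# Sketch: STRIPS-style operator application as set arithmetic.
def unstack(x, y):
    return {
        "name":   ("UNSTACK", x, y),
        "pre":    {("ON", x, y), ("CLEAR", x), ("ARMEMPTY",)},
        "delete": {("ON", x, y), ("ARMEMPTY",)},
        "add":    {("HOLDING", x), ("CLEAR", y)},
    }

def apply_op(state, op):
    """Return the successor state, or None if a precondition fails.
    Any predicate not named in ADD or DELETE carries over unchanged;
    this is the implicit statement of the frame axioms."""
    if not op["pre"] <= state:
        return None
    return (state - op["delete"]) | op["add"]

s0 = {("ON", "A", "B"), ("ONTABLE", "B"), ("CLEAR", "A"), ("ARMEMPTY",)}
s1 = apply_op(s0, unstack("A", "B"))
# s1 now contains ONTABLE(B), CLEAR(A), CLEAR(B), and HOLDING(A)
```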
Simply updating a single state description works well as a way of keeping track of the effects of a given sequence of operators. But what happens during the process of searching for the correct operator sequence? If one incorrect sequence is explored, it must be possible to return to the original state so that a different one can be tried. But this is possible even if the global database describes the problem state at the current node of the search graph. All we need to do is record at each node the changes that were made to the global database as we passed through the node. Then, if we backtrack through that node, we can undo the changes. But the changes are described exactly in the ADD and DELETE lists of the operators that have been applied to move from one node to another. So we need only record, along each arc of the search graph, the operator that was applied. Figure 13.3 shows a small example of such a search tree and the corresponding global database. The initial state is the one shown in Figure 13.1 and described in STRIPS form above. Notice that we must specify not just the operator (e.g., UNSTACK) but also its arguments in order to be able to undo the changes later.
Now suppose that we want to explore a path different from the one we have just shown. First we backtrack through node 3 by adding each of the predicates in PUTDOWN's DELETE list to the global database and deleting each of the elements of PUTDOWN's ADD list. After doing that, the database contains
ONTABLE(B) ^ CLEAR(A) ^ CLEAR(B) ^ HOLDING(A)
As we expected, this description is identical to the one we previously computed as the result of applying UNSTACK to the initial situation. If we repeat this process using the ADD and DELETE lists of UNSTACK, we derive a description identical to the on e with which we started.
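Undoing a move during backtracking is the mirror image of applying it. A sketch, continuing the earlier fragment (and assuming, as holds in this simple domain, that every ADD fact was false and every DELETE fact true before the operator ran):

```python
# Sketch: undo an operator by swapping the roles of its ADD and DELETE
# lists, exactly as described in the text.
def undo_op(state, op):
    return (state - op["add"]) | op["delete"]

assert undo_op(s1, unstack("A", "B")) == s0   # back where we started
```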
Because an implicit statement of the frame axioms is so important in complex problem domains, all the techniques we look at exploit STRIPS-style descriptions of the available operators.
Detecting a Solution
A planning system has succeeded in finding a solution to a problem when it has found a sequence of operators that transforms the initial problem state into the goal state. How will it know when this has been done? In simple problem-solving systems, this question is easily answered by a straightforward match of the state descriptions. But if entire states are not represented explicitly but rather are described by a set of relevant properties, then this problem becomes more complex. The way it can be solved depends on the way that state descriptions are represented. For any representational scheme that is used, it must be possible to reason with representations to discover whether one matches another. Recall that in Part II we discussed a variety of ways that complex objects could be represented as well as reasoning mechanisms for each representation. Any of those representations (or some combination of them) could be used to describe problem states. Then the corresponding reasoning mechanisms could be used to discover when a solution had been found.
One representational technique has served as the basis for many of the planning systems that have been built. It is predicate logic, which is appealing because of the deductive mechanisms that it provides. Suppose that, as part of our goal, we have the predicate P(x). To see whether P(x) is satisfied in some state, we ask whether we can prove P(x) given the assertions that describe that state and the axioms that define the world model (such as the fact that if the arm is holding something, then it is not empty). If we can construct such a proof, then the problem-solving process terminates. If we cannot, then a sequence of operators that might solve the problem must be proposed. This sequence can then be tested in the same way as the initial state was by asking whether P(x) can be proved from the axioms and the state description that was derived by applying the operators.
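In the simple set encoding we have been using, the "proof" step degenerates to a membership test plus whatever derived predicates we support. The sketch below is our own simplification (a real system would call a theorem prover at the marked point); it reuses is_clear and s0 from the earlier fragments.

```python
# Sketch: detect a solution by checking that every goal proposition
# holds in, or is derivable from, the current state description.
def satisfied(state, goals):
    for g in goals:
        if g in state:
            continue
        if g[0] == "CLEAR" and is_clear(state, g[1]):
            continue                  # derived fact, standing in for a proof
        return False
    return True

print(satisfied(s0, [("ON", "A", "B"), ("CLEAR", "A")]))   # True
```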
Detecting Dead Ends
As a planning system is searching for a sequence of operators to solve a particular problem, it must be able to detect when it is exploring a path that can never lead to a solution (or at least appears unlikely to lead to one). The same reasoning mechanisms that can be used to detect a solution can often be used for detecting a dead end.
If the search process is reasoning forward from the initial state, it can prune any path that leads to a state from which the goal state cannot be reached. For example, suppose we have a fixed supply of paint: some white, some pink, and some red. We want to paint a room so that it has light red walls and a white ceiling. We could produce light red paint by adding some white paint to the red. But then we could not paint the ceiling white. So this approach should be abandoned in favor of mixing the pink and red paints together. We can also prune paths that, although they do not preclude a solution, appear to be leading no closer to a solution than the place from which they started.
If the search process is reasoning backward from the goal state, it can also terminate a path either because it is sure that the initial state cannot be reached or because little progress is being made. In reasoning backward, each goal is decomposed into subgoals. Each of them, in turn, may lead to a set of additional subgoals. Sometimes it is easy to detect that there is no way that all the subgoals in a given set can be satisfied at once. For example, the robot arm cannot be both empty and holding a block. Any path that is attempting to make both of those goals true simultaneously can be pruned immediately. Other paths can be pruned because they lead nowhere. For example, if, in trying to satisfy goal A, the program eventually reduces its problem to the satisfaction of goal A as well as goals B and C, it has made little progress. It has produced a problem even harder than its original one, and the path leading to this problem should be abandoned.
Repairing an Almost Correct Solution
The kinds of techniques we are discussing are often useful in solving nearly decomposable problems. One good way of solving such problems is to assume that they are completely decomposable, proceed to solve the subproblems separately, and then check that when the subsolutions are combined, they do in fact yield a solution to the original problem. Of course, if they do, then nothing more need be done. If they do not, however, there are a variety of things that we can do. The simplest is just to throw out the solution, look for another one, and hope that it is better. Although this is simple, it may lead to a great deal of wasted effort.
A slightly better approach is to look at the situation that results when the sequence of operations corresponding to the proposed solution is executed and to compare that situation to the desired goal. In most cases, the difference between the two will be smaller than the difference between the initial state and the goal (assuming that the solution we found did some useful things). Now the problem-solving system can be called again and asked to find a way of eliminating this new difference. The first solution can then be combined with this second one to form a solution to the original problem.
An even better way to patch up an almost correct solution is to appeal to specific knowledge about what went wrong and then to apply a direct patch. For example, suppose that the reason that the proposed solution is inadequate is that one of its operators cannot be applied because at the point it should have been invoked, its preconditions were not satisfied. This might occur if the operator had two preconditions and the sequence of operations that makes the second one true undid the first one. But perhaps, if an attempt were made to satisfy the preconditions in the opposite order, this problem would not arise.
A still better way to patch up incomplete solutions is not really to patch them up at all but rather to leave them incompletely specified until the last possible moment. Then, when as much information as possible is available, complete the specification in such a way that no conflicts arise. This approach can be thought of as a least-commitment strategy. It can be applied in a variety of ways. One is to defer deciding on the order in which operations will be performed. So, in our previous example, instead of arbitrarily choosing one order in which to satisfy a set of preconditions, we could leave the order unspecified until the very end. Then we would look at the effects of each of the subsolutions to determine the dependencies that exist among them. At that point, an ordering can be chosen.
Goal Stack Planning

One of the earliest techniques to be developed for solving compound goals that may interact was the use of a goal stack. This was the approach used by STRIPS. In this method, the problem solver makes use of a single stack that contains both goals and operators that have been proposed to satisfy those goals. The problem solver also relies on a database that describes the current situation and a set of operators described as PRECONDITION, ADD, and DELETE lists. To see how this method works, let us carry it through for the simple example shown in Figure 13.4.
When we begin solving this problem, the goal stack is simply
ON(C, A) ^ ON(B, D) ^ ONTABLE(A) ^ ONTABLE(D)
But we want to separate this problem into four subproblems, one for each component of the original goal. Two of the subproblems, ONTABLE(A) and ONTABLE(D), are already true in the initial state. So we will work on only the remaining two. Depending on the order in which we want to tackle the subproblems, there are two goal stacks that could be created as our first step, where each line represents one goal on the stack and OTAD is an abbreviation for ONTABLE(A) ^ ONTABLE(D):
ON(C, A)
ON(B, D)
ON(C, A) ^ ON(B, D) ^ OTAD
[1]

ON(B, D)
ON(C, A)
ON(C, A) ^ ON(B, D) ^ OTAD
[2]
At each succeeding step of the problem-solving process, the top goal on the stack will be pursued. When a sequence of operators that satisfies it is found, that sequence is applied to the state description, yielding a new description. Next, the goal that is then at the top of the stack is explored and an attempt is made to satisfy it, starting from the situation that was produced as a result of satisfying the first goal. This process continues until the goal stack is empty. Then, as one last check, the original goal is compared to the final state derived from the application of the chosen operators. If any components of the goal are not satisfied in that state (which they might not be if they were achieved at one point and then undone later), then those unsolved parts of the goal are reinserted onto the stack and the process resumed.
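The loop just described can be sketched compactly, reusing apply_op and satisfied from the earlier fragments. Everything interesting hides in choose_operator, a hypothetical stand-in for the rule-selection heuristics discussed earlier; a real system would also backtrack over its choices and over failed operator applications.

```python
# Bare-bones goal-stack loop: a sketch of the method just described,
# not the actual STRIPS code.
def goal_stack_plan(state, goal, choose_operator):
    stack = [("AND",) + tuple(goal)] + list(goal)
    plan = []
    while stack:
        top = stack.pop()
        if isinstance(top, dict):             # an operator: apply it
            state = apply_op(state, top)
            plan.append(top["name"])
        elif top[0] == "AND":                 # combined goal: recheck,
            stack.extend(g for g in top[1:]   # reinsert any undone parts
                         if not satisfied(state, [g]))
        elif not satisfied(state, [top]):     # single unsatisfied goal
            op = choose_operator(top, state)  # hypothetical selector
            stack.append(op)
            stack.append(("AND",) + tuple(op["pre"]))
            stack.extend(op["pre"])           # preconditions pop first
    return plan
```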
To continue with the example we started above, let us assume that we choose first to explore alternative 1. Alternative 2 will also lead to a solution. In fact, it finds one so trivially that it is not very interesting. Exploring alternative 1, we first check to see whether ON(C, A) is true in the current state. Since it is not, we check for operators that could cause it to be true. Of the four operators we are considering, there is only one, STACK, and it would have to be called with C and A. So we place STACK(C, A) on the stack in place of ON(C, A), yielding
STACK(C, A)
ON(B, D)
ON(C, A) ^ ON(B, D) ^ OTAD
STACK(C, A) replaced ON(C, A) because after performing the STACK we are guaranteed that ON(C, A) will hold. But in order to apply STACK(C, A), its preconditions must hold, so we must establish them as subgoals. Again we must separate a compound goal
CLEAR(A) ^ HOLDING(C)
into its components and choose an order in which to work on them. At this point, it is useful to exploit some heuristic knowledge. HOLDING(x) is very easy to achieve. At most, it is necessary to put down something else and then to pick up the desired object. But HOLDING is also very easy to undo. In order to do almost anything else, the robot will need to use the arm. So if we achieve HOLDING first and then try to do something else, we will most likely end up with HOLDING no longer true. So we exploit the heuristic that if HOLDING is one of several goals to be achieved at once, it should be tackled last. This produces the new goal stack
CLEAR(A)
HOLDING(C)
CLEAR(A) ^ HOLDING(C)
STACK(C, A)
ON(B, D)
ON(C, A) ^ ON(B, D) ^ OTAD
This kind of heuristic information could be contained in the precondition list itself by stating the predicates in the order in which they should be achieved.
Next we check to see if CLEAR(A) is true. It is not. The only operator that could make it true is UNSTACK(B, A). So we will attempt to apply it. This produces the goal stack
ON(B, A)
CLEAR(B)
ARMEMPTY
ON(B, A) ^ CLEAR(B) ^ ARMEMPTY
UNSTACK(B, A)
HOLDING(C)
CLEAR(A) ^ HOLDING(C)
STACK(C, A)
ON(B, D)
ON(C, A) ^ ON(B, D) ^ OTAD
This time, when we compare the top element of the goal stack, ON(B, A), to the world model, we see that it is satisfied. So we pop it off and consider the next goal, CLEAR(B). It, too, is already true in the world model, although it was not stated explicitly as one of the initial predicates. But from the initial predicates and the blocks world axiom that says that any block with no blocks on it is clear, a theorem prover could derive CLEAR(B). So that goal, too, can be popped from the stack. The third precondition for UNSTACK(B, A) remains. It is ARMEMPTY, and it is also true in the current world model, so it can be popped off the stack. The next element on the stack is the combined goal representing all of the preconditions for UNSTACK(B, A). We check to make sure it is satisfied in the world model. It will be unless we undid one of its components in attempting to satisfy another. In this case, there is no problem and the combined goal can be popped from the stack.
Now the top element of the stack is the operator UNSTACK(B, A). We are now guaranteed that its preconditions are satisfied, so it can be applied to produce a new world model from which the rest of the problem-solving process can continue. This is done using the ADD and DELETE lists specified for UNSTACK. Meanwhile we record that UNSTACK(B, A) is the first operator of the proposed solution sequence. At this point, the database corresponding to the world model is
ONTABLE(A) ^ ONTABLE(C) ^ ONTABLE(D) ^ HOLDING(B) ^ CLEAR(A)
The goal stack now is
HOLDING(C)
CLEAR(A) ^ HOLDING(C)
STACK(C, A)
ON(B, D)
ON(C, A) ^ ON(B, D) ^ OTAD
We now attempt to satisfy the goal HOLDING(C). There are two operators that might make HOLDING(C) true: PICKUP(C) and UNSTACK(C, x), where x could be any block from which C could be unstacked. Without looking ahead, we cannot tell which of these operators is appropriate, so we create two branches of the search tree, corresponding to the following goal stacks:
ONTABLE(C)
CLEAR(C)
ARMEMPTY
ONTABLE(C) ^ CLEAR(C) ^ ARMEMPTY
PICKUP(C)
CLEAR(A) ^ HOLDING(C)
STACK(C, A)
ON(B, D)
ON(C, A) ^ ON(B, D) ^ OTAD
[1]

ON(C, x)
CLEAR(C)
ARMEMPTY
ON(C, x) ^ CLEAR(C) ^ ARMEMPTY
UNSTACK(C, x)
CLEAR(A) ^ HOLDING(C)
STACK(C, A)
ON(B, D)
ON(C, A) ^ ON(B, D) ^ OTAD
[2]
Notice that for alternative 2, the goal stack now contains a variable x, which appears in three places. Although any block could be substituted for x, it is important that the same one be matched to each of the x's. Thus it is important that each time a variable is introduced into the goal stack, it be given a name distinct from any other variables already in the stack. And whenever a candidate object is chosen to match a variable, the binding must be recorded so that other occurrences of the same variable will be bound to the same object.
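A simple binding table is enough to enforce this. The sketch below (names are ours) records a variable's first binding and rejects any later, conflicting one:

```python
# Sketch: consistent variable bindings across a goal stack.
bindings = {}

def bind(var, value):
    if bindings.get(var, value) != value:
        raise ValueError(f"{var} is already bound to {bindings[var]}")
    bindings[var] = value

def resolve(term):
    return bindings.get(term, term)   # unbound terms stand for themselves

bind("x", "C")
print(resolve("x"))   # 'C': every occurrence of x now refers to the same block
```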
How should our program choose now between alternative 1 and alternative 2? We can tell that picking up C (alternative 1) is better than unstacking it because it is not currently on anything. So to unstack it, we would first have to stack it. Although this could be done, it would be a waste of effort. But how could a program know that? Suppose we decided to pursue alternative 2 first. To satisfy ON(C, x), we would have to STACK C onto some block x. The goal stack would then be
CLEAR(x)
HOLDING(C)
CLEAR(x) ^ HOLDING(C)
STACK(C, x)
CLEAR(C)
ARMEMPTY
ON(C, x) ^ CLEAR(C) ^ ARMEMPTY
UNSTACK(C, x)
CLEAR(A) ^ HOLDING(C)
STACK(C, A)
ON(B, D)
ON(C, A) ^ ON(B, D) ^ OTAD
But now notice that one of the preconditions of STACK is HOLDING(C). This is what we were trying to achieve by applying UNSTACK, which required us to apply STACK so that the precondition ON(C, x) would be satisfied. So we are back to our original goal. In fact, we now have additional goals as well, since other predicates have also been added to the stack. At this point, this path can be terminated as unproductive. If, however, block C had been on another block in the current state, ON(C, x) would have been satisfied immediately with no need to do a STACK and this path would have led to a good solution.
Now we must return to alternative 1, which used PICKUP to get the arm holding C. The top element on the goal stack is ONTABLE(C), which is already satisfied, so we pop it off. The next element is CLEAR(C), which is also satisfied, so we pop it off. The remaining precondition of PICKUP(C) is ARMEMPTY, which is not satisfied since HOLDING(B) is true. There are two operators that could be applied to make ARMEMPTY true: STACK(B, x) and PUTDOWN(B). In other words, we can either put B on the table or we can put it on another block. Which should we choose? If we look ahead a bit, we see that we ultimately want to get B onto D. It would be most efficient simply to put it there now. Our program could figure this out by comparing the elements of the ADD lists of the competing operators to the rest of the goal stack. If one of the operators has the fortuitous effect of making any of those goals true, it should be chosen. So we choose to apply STACK(B, D) by binding D to x in the STACK operator. This makes the goal stack
CLEAR(D)
HOLDING(B)
CLEAR(D) ^ HOLDING(B)
STACK(B, D)
ONTABLE(C) ^ CLEAR(C) ^ ARMEMPTY
PICKUP(C)
CLEAR(A) ^ HOLDING(C)
STACK(C, A)
ON(B, D)
ON(C, A) ^ ON(B, D) ^ OTAD
CLEAR(D) and HOLDING(B) are both true. Now the operation STACK(B, D) can be performed, producing the world model
ONTABLE(A) ^ ONTABLE(C) ^ ONTABLE(D) ^
ON(B, D) ^ ARMEMPTY
All of the preconditions for PICKUP(C) are now satisfied so it, too, can be executed. Then all of the preconditions of STACK(C, A) are true, so it can be executed.
Now we can begin work on the second part of our original goal, ON(B, D). But it has already been satisfied by the operations that were used to satisfy the first subgoal. This happened because when we had a choice of ways to get rid of the arm holding B, we scanned back down the goal stack to see if one of the operators would have other useful side effects and we found that one did. So we now pop ON(B, D) off the goal stack. We then do one last check of the combined goal ON(C, A) ^ ON(B, D) ^ ONTABLE(A) ^ ONTABLE(D) to make sure that all four parts still hold, which, of course, they do here. The problem solver can now halt and return as its answer the plan
1. UNSTACK(B, A)
2. STACK(B, D)
3. PICKUP(C)
4. STACK(C, A)
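The look-ahead that chose STACK(B, D) over PUTDOWN(B) can be captured in a few lines. In this sketch (the function name is ours), we prefer any candidate operator whose ADD list makes some goal elsewhere on the stack true as a side effect:

```python
# Sketch: prefer operators with fortuitous side effects.
def prefer_helpful(candidates, goal_stack):
    pending = {g for g in goal_stack if isinstance(g, tuple)}
    for op in candidates:
        if op["add"] & pending:    # e.g. STACK(B, D) adds ON(B, D),
            return op              # which is a later goal on the stack
    return candidates[0]           # otherwise the choice is arbitrary
```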
In this simple example, we saw a way in which heuristic information can be applied to guide the search process, a way in which an unprofitable path could be detected, and a way in which considering some interaction among goals could help produce a good overall solution. But for problems more difficult than this one, these methods are not adequate.
To see why this method may fail to find a good solution, we attempt to solve the problem shown in Figure 13.5. [Footnote: This problem is often called the Sussman Anomaly because it was carefully studied in Sussman [1975].] There are two ways that we could begin solving this problem, corresponding to the goal stacks
ON(A, B)
ON(B, C)
ON(A, B) ^ ON(B, C)
[1]

ON(B, C)
ON(A, B)
ON(A, B) ^ ON(B, C)
[2]
Suppose that we choose alternative 1 and begin trying to get A on B. We will eventually produce the goal stack shown in Figure 13.6.
We can then pop off the stack goals that have already been satisfied, until we reach the ARMEMPTY precondition of PICKUP(A). To satisfy it, we need to PUTDOWN(C). Then we can continue popping until the goal stack is
ON(B, C)
ON(A, B) ^ ON(B, C)
Then the current state is
ONTABLE(B) ^ ON(A, B) ^ ONTABLE(C) ^ ARMEMPTY
The sequence of operators applied so far is
1. UNSTACK(C, A)
2. PUTDOWN(C)
3. PICKUP(A)
4. STACK(A, B)
Now we can begin to work on satisfying ON(B, C). Without going through all the detail, we can see that our algorithm will attempt to achieve this goal by stacking B on C. But to do that, it has to unstack A from B. By the time we have achieved the goal ON(B, C) and popped it off the stack, we will have executed the following additional sequence of operators:
5. UNSTACK(A, B)
6. PUTDOWN(A)
7. PICKUP(B)
8. STACK(B, C)
The problem state will be
ON(B, C) ^ ONTABLE(A) ^ ONTABLE(C) ^ ARMEMPTY
But now when we check the remaining goal on the stack,
ON(A, B) ^ ON(B, C)
we discover that it is not satisfied. We have undone ON(A, B) in the process of achieving ON(B, C). The difference between the goal and the current state is ON(A, B), which is now added to the stack so that it can be achieved again. This time, the sequence of operators
9. PICKUP(A)
10. STACK(A, B)
is found. Now the combined goal is again checked, and this time it is satisfied. The complete plan that has been discovered is
1. UNSTACK(C, A)
2. PUTDOWN(C)
3. PICKUP(A)
4. STACK(A, B)
5. UNSTACK(A, B)
6. PUTDOWN(A)
7. PICKUP(B)
8. STACK(B, C)
9. PICKUP(A)
10. STACK(A, B)
Although this plan will achieve the desired goal, it does not do so very efficiently. A similar situation would have occurred if we had examined the two major subgoals in the opposite order. The method we are using is not capable of finding an efficient way of solving this problem.
There are two approaches we can take to the question of how a good plan can be found. One is to look at ways to repair the plan we already have to make it more efficient. In this case, that is fairly easy to do. We can look for places in the plan where we perform an operation and then immediately undo it. If we find any such places, we can eliminate both the doing and the undoing steps from the plan. Applying this rule to our plan, we eliminate steps 4 and 5. Once we do that, we can also eliminate steps 3 and 6. The resulting plan
1. UNSTACK(C, A)
2. PUTDOWN(C)
3. PICKUP(B)
4. STACK(B, C)
5. PICKUP(A)
6. STACK(A, B)
contains, in fact, the minimum number of operators needed to solve this problem. But for more complex tasks, the interfering operations may be farther apart in the plan and thus much more difficult to detect. In addition, we wasted a good deal of problem-solving effort producing all the steps that were later eliminated. It would be better if there were a plan-finding procedure that could construct efficient plans directly. In the next section, we present a technique for doing this.
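The do-then-undo repair rule is easy to state over the operator sequence itself. In this sketch (the inverse table and function are our own), a stack-based scan cancels adjacent inverse pairs, so removing steps 4 and 5 automatically brings steps 3 and 6 together and cancels them too:

```python
# Sketch: plan repair by deleting operations that are immediately undone.
INVERSES = {"STACK": "UNSTACK", "UNSTACK": "STACK",
            "PICKUP": "PUTDOWN", "PUTDOWN": "PICKUP"}

def compress(plan):
    out = []
    for step in plan:                        # step is e.g. ("STACK", "A", "B")
        if (out and out[-1][1:] == step[1:]
                and INVERSES[out[-1][0]] == step[0]):
            out.pop()                        # the pair cancels out
        else:
            out.append(step)
    return out

# Applied to the ten-step plan above, this yields the six-step plan.
```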
Nonlinear Planning Using Constraint Posting

The goal-stack planning method attacks problems involving conjoined goals by solving the goals one at a time, in order. A plan generated by this method contains a sequence of operators for attaining the first goal, followed by a complete sequence for the second goal, etc. But as we have seen, difficult problems cause goal interactions. The operators used to solve one subproblem may interfere with the solution to a previous subproblem. Most problems require an intertwined plan in which multiple subproblems are worked on simultaneously. Such a plan is called a nonlinear plan because it is not composed of a linear sequence of complete subplans.
As an example of the need for a nonlinear plan, let us return to the Sussman anomaly described in Figure 13.5. A good plan for the solution of this problem is the following:

1. Begin work on the goal ON(A, B) by clearing A, thus putting C on the table.
2. Achieve the goal ON(B, C) by stacking B on C.
3. Complete the goal ON(A, B) by stacking A on B.
This section explores some heuristics and algorithms for tackling nonlinear problems such as this one.
Many ideas about nonlinear planning were present in HACKER [Sussman, 1975], an automatic programming system. The first true nonlinear planner, though, was NOAH [Sacerdoti, 1975]. NOAH was further improved upon by the NONLIN program [Tate, 1977]. The goal stack algorithm of STRIPS was transformed into a goal set algorithm by Nilsson [1980]. Subsequent planning systems, such as MOLGEN [Stefik, 1981b] and TWEAK [Chapman, 1987], used constraint posting as a central technique.
The idea of constraint posting is to build up a plan by incrementally hypothesizing operators, partial orderings between operators, and bindings of variables within operators. At any given time in the problem-solving process, we may have a set of useful operators but perhaps no clear idea of how those operators should be ordered with respect to each other. A solution is a partially ordered, partially instantiated set of operators; to generate an actual plan, we convert the partial order into any of a number of total orders. Figure 13.7 shows the difference between the constraint posting method and the planning methods discussed in earlier sections.
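Converting the partial order into a total order is an ordinary topological sort. A sketch using Python's standard library (the step names and the particular constraints are just an illustration):

```python
# Sketch: linearizing a constraint-posted plan with a topological sort.
from graphlib import TopologicalSorter   # standard library, Python 3.9+

steps = ["UNSTACK(C,A)", "PUTDOWN(C)", "PICKUP(B)",
         "STACK(B,C)", "PICKUP(A)", "STACK(A,B)"]
before = [("UNSTACK(C,A)", "PUTDOWN(C)"), ("PUTDOWN(C)", "PICKUP(B)"),
          ("PICKUP(B)", "STACK(B,C)"), ("PICKUP(B)", "PICKUP(A)"),
          ("PICKUP(A)", "STACK(A,B)")]

ts = TopologicalSorter({s: set() for s in steps})
for earlier, later in before:
    ts.add(later, earlier)                # 'later' depends on 'earlier'
print(list(ts.static_order()))            # one consistent total order
```

With these constraints, STACK(B,C) is not ordered with respect to PICKUP(A) or STACK(A,B), so several total orders are consistent; static_order() simply returns one of them.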
We now examine several operations for nonlinear planning in a constraint-posting environment, although many of the operations themselves predate the use of the technique in planning.
Let's incrementally generate a nonlinear plan to solve the Sussman anomaly problem. We begin with the null plan, i.e., a plan with no steps. Next we look at the goal state and posit steps for achieving that goal. Means-ends analysis tells us to choose two steps with respective postconditions ON(A, B) and ON(B, C):

CLEAR(B)  *HOLDING(A)
    STACK(A, B)
ARMEMPTY  ON(A, B)  ¬CLEAR(B)  ¬HOLDING(A)

CLEAR(C)  *HOLDING(B)
    STACK(B, C)
ARMEMPTY  ON(B, C)  ¬CLEAR(C)  ¬HOLDING(B)
Each step is written with its preconditions above it and its postconditions below it. Delete postconditions are marked with a negation symbol (¬). Notice that, at this point, the steps are not ordered with respect to each other. All we know is that we want to execute both of them eventually. Neither can be executed right away because some of their preconditions are not satisfied. An unachieved precondition is marked with a star (*). Both of the *HOLDING preconditions are unachieved because the arm holds nothing in the initial problem state.
Introducing new steps to achieve goals or preconditions is called step addition, and it is one of the heuristics we will use in generating nonlinear plans. Step addition is a very basic method dating back to GPS [Newell and Simon, 1963], where means-ends analysis was used to pick operators with postconditions corresponding to desired states. Figure 13.8 lists step addition along with other heuristics we use throughout this example.
To achieve the preconditions of the two steps above, we can use step addition again:

ONTABLE(A)  *CLEAR(A)  *ARMEMPTY
    PICKUP(A)
¬ONTABLE(A)  ¬ARMEMPTY  HOLDING(A)

ONTABLE(B)  *CLEAR(B)  *ARMEMPTY
    PICKUP(B)
¬ONTABLE(B)  ¬ARMEMPTY  HOLDING(B)
Adding these PICKUP steps is not enough to satisfy the *HOLDING preconditions of the STACK steps. This is because there are no ordering constraints present among the steps. If, in the eventual plan, the PICKUP steps were to follow the STACK steps, then the *HOLDING preconditions would need to be satisfied by some other set of steps. We solve this problem by introducing ordering constraints whenever we employ step addition. In this case, we want to say that each PICKUP step should precede its corresponding STACK step: [Footnote: S1 < S2 means that step S1 must precede step S2 in the eventual plan.]
PICKUP(A) < STACK(A, B)
PICKUP(B) < STACK(B, C)
We now have four (partially ordered) steps in our plan and four unachieved preconditions. *CLEAR(A) is unachieved because block A is not clear in the initial state. *CLEAR(B) is unachieved because although B is clear in the initial state, there exists a step STACK(A, B) with postcondition ¬CLEAR(B), and that step might precede the step with *CLEAR(B) as a precondition. To achieve precondition CLEAR(B), we use a second heuristic known as promotion. Promotion, first used by Sussman in his HACKER program [Sussman, 1975], amounts to posting a constraint that one step must precede another in the eventual plan. We can achieve CLEAR(B) by stating that the PICKUP(B) step must come before the STACK(A, B) step:
PICKUP(B) < STACK(A, B)
Let's now turn to the two unachieved *ARMEMPTY preconditions [we deal with *CLEAR(A) a little later]. While the initial state has an empty arm, each of the two PICKUP operators contains a ¬ARMEMPTY postcondition. Either operator could prevent the other from executing. We can use promotion to achieve at least one of the two preconditions:
PICKUP(B) < PICKUP(A)
Since the initial situation contains an empty arm, and no step preceding PICKUP(B) could make it unempty, the preconditions of PICKUP(B) are all satisfied.
A third heuristic, called declobbering, can help achieve the *ARMEMPTY precondition in the PICKUP(A) step. PICKUP(B) asserts ¬ARMEMPTY, but if we can insert another step between PICKUP(B) and PICKUP(A) to reassert ARMEMPTY, then the precondition will be achieved. The STACK(B, C) step does the trick, so we post another constraint:
PICKUP(B) < STACK(B, C) < PICKUP(A)
The step PICKUP(B) is said to "clobber" PICKUP(A)'s precondition. STACK(B, C) is said to "declobber" it. Declobbering was first used in the NOAH planner [Sacerdoti, 1975], and then in NONLIN. NOAH was the first nonlinear planner to make use of the heuristics we are discussing here. NOAH also used many other heuristics and was able to solve a number of difficult nonlinear planning problems. Still, there were some natural problems that NOAH could not solve. In particular, NOAH's inability to backtrack prevented it from finding many solutions. The NONLIN program included backtracking, but it also failed to solve many hard problems.
Back in our example, the only unachieved precondition left is *CLEAR(A), from the PICKUP(A) step. We can use step addition to achieve it:

*ON(x, A)  *CLEAR(x)  *ARMEMPTY
    UNSTACK(x, A)
¬ON(x, A)  ¬ARMEMPTY  CLEAR(A)  HOLDING(x)
We introduce the variable x because the only postcondition we are interested in is CLEAR(A). Whatever block is on top of A is irrelevant. Constraint posting allows us to create plans that are incomplete with respect to the order of the steps. Variables allow us to avoid committing to particular instantiations of operators.
Unfortunately, we now have three new unachieved preconditions. We can achieve ON(x, A) easily by constraining the value of x to be block C. This works because block C is on block A in the initial state. This heuristic is called simple establishment, and in its most general form, it allows us to state that two different propositions must be ultimately instantiated to the same proposition. In our case:
x = C in step UNSTACK(x, A)
There are still steps that deny the preconditions CLEAR(C) and ARMEMPTY, but we can use promotion to take care of them:
UNSTACK(x, A) < STACK(B, C)
UNSTACK(x, A) < PICKUP(A)
UNSTACK(x, A) < PICKUP(B)
Among the heuristics we have looked at so far, adding a new step is the most problematic because we must always check whether the new step clobbers some precondition of a later, already existing step. This has actually happened in our example. The step PICKUP(B) requires ARMEMPTY, but this is denied by the new UNSTACK(x, A) step. One way to solve this problem is to add a new declobbering step to the plan:

HOLDING(x)
    PUTDOWN(x)
¬HOLDING(x)  ONTABLE(x)  ARMEMPTY

ordered as
UNSTACK(x, A) < PUTDOWN(C) < PICKUP(B)
Notice that we have seen two types of declobbering, one in which an existing step is used to declobber another and one in which a new declobbering step is introduced. Fortunately, the precondition of our newest PUTDOWN step is satisfied. In fact, all preconditions of all steps are satisfied, so we are done. All that remains is to use the plan ordering and variable binding constraints to build a concrete plan:
1. UNSTACK(C, A)
2. PUTDOWN(C)
3. PICKUP(B)
4. STACK(B, C)
5. PICKUP(A)
6. STACK(A, B)
This is the same plan we found at the end of Section 13.4. We used four different heuristics to synthesize it: step addition, promotion, declobbering, and simple establishment. (These are sometimes called plan modification operations.) Are these four operations, applied in the correct order, enough to solve any nonlinear planning problem? Almost. We require one more, called separation. Separation is like simple establishment, in that it concerns variable bindings, but it is used in a decl obbering fashion. Suppose step C1 possibly precedes step C2 and C1 possibly denies a precondition of C2. We say "possibly" because the propositions may contain variables. Separation allows us to state a constraint that the two propositions must not be instantiated in the same way in the eventual plan.
Work on the TWEAK planner presented formal definitions of the five plan modification operations and proved that they were sufficient for solving any solvable nonlinear planning problem. In this manner, TWEAK cleaned up the somewhat ad hoc, heuristic results in nonlinear planning research. The algorithm to exploit the plan modification operations is quite simple.
Algorithm: Nonlinear Planning (TWEAK)

1. Initialize S to be the set of propositions in the goal state.
2. Remove some unachieved proposition P from S.
3. Achieve P by using step addition, promotion, declobbering, simple establishment, or separation.
4. Review all the steps in the plan, including any new steps introduced by step 3, to see if any of their preconditions are unachieved. Add to S the new set of unachieved preconditions.
5. If S is empty, complete the plan by converting the partial order of steps into a total order and instantiating any variables as necessary.
6. Otherwise, go to step 2.
Of course, not every sequence of plan modification operations leads to a solution. For instance, we could use step addition ad infinitum without ever converging to a useful plan. The nondeterminism of steps 2 and 3 must be implemented as some sort of search procedure. This search can be guided by heuristics; for example, if promotion and step addition will both do the job, it is probably better to try promotion first. TWEAK uses breadth-first dependency-directed backtracking, as well as ordering heuristics.
The example above used most of the plan modification operations, but not in their full generality. We will now be more specific about these operations and how they relate to finding correct plans. The core notion is one of making a proposition necessarily true in some state. The modal truth criterion tells us exactly when a proposition is true:

A proposition P is necessarily true in a state s if and only if two conditions hold: there is a state t, equal or necessarily previous to s, in which P is necessarily asserted; and for every step C possibly before s and every proposition Q, possibly codesignating with P, which C denies, there is a step W necessarily between C and s which asserts R, a proposition such that R and P codesignate whenever P and Q codesignate.
Roughly, this means that P has to be asserted in the initial state or by some previous step and that there can be no clobbering steps without corresponding declobbering steps to save the day. The relationship between the modal truth criterion and the five plan modification operations is shown in Figure 13.9. The figure is simply a logical parse tree of the criterion, from which we can see how the plan modification operations are used to enforce the truth of various parts of the criterion. In the figure, the expression C1 < C2 means step (or state) C1 necessarily precedes step (or state) C2. The expression P = Q means P and Q codesignate.
The development of a provably correct planner was a noteworthy achievement in the formal (or "neat") style of AI. It cleaned up the complicated, ill-defined planning notions that preceded it and made available a reliable (if not efficient) planner. Now, however, a new round of more informal (or "scruffy") research must follow, concentrating on
the weaknesses of such planners. Efficiency is of critical concern in large systems: assured correctness is nice, but a slow planner can be less useful than an incorrect one. Typically, search-based programs can be made faster through the use of heuristic knowledge. Another efficiency issue has to do with the length of the plans produced by a planner. Current planners can, unfortunately, generate highly inefficient plans.
Representational issues are just as important as efficiency issues, and the two are closely intertwined. The representation of operators and plans used by TWEAK is at the same time too powerful and too weak. Chapman [1987] proved that even with simple STRIPS-style operators, planning in general is not even decidable, although it is semidecidable: If there is a plan that solves a problem, a planner can find it, but if there is no such plan, the planner may never halt. NP-completeness results suggest that planning is exponentially hard. But it is of no use to look for a simpler representation that might allow for more efficient plan construction; if anything, most domains seem to require operators that are much more complex than the operators used by TWEAK. For example, it is natural to express many preconditions using quantifiers and embedded negation and also to have postconditions with different effects depending on the state of the world. Figure 13.10 depicts a more complex operator structure, of the type used in the PRODIGY planning system [Minton et al., 1989]. As our representation becomes more expressive, the idea of a provably correct, efficient, domain-independent planner becomes more unlikely, and we must again turn to knowledge-intensive heuristic methods.
In order to solve hard problems, a problem solver may have to generate long plans. In order to do that efficiently, it is important to be able to eliminate some of the details of
the problem until a solution that addresses the main issues is found. Then an attempt can be made to fill in the appropriate details. Early attempts to do this involved the use of macro-operators, in which larger operators were built from smaller ones [Fikes and Nilsson,1971]. But in this approach, no details were eliminated from the actual descriptions of the operators. A better approach was developed in the ABSTRIPS system [Sacerdoti,1974], which actually planned in a hierarchy of abstraction spaces, in each of which preconditions at a lower level of abstraction were ignored.
As an example, suppose you want to visit a friend in Europe, but you have a limited amount of cash to spend. It makes sense to check air fares first, since finding an affordable flight will be the most difficult part of the task. You should not worry about getting out of your driveway, planning a route to the airport, or parking your car until you are sure you have a flight.
The ABSTRIPS approach to problem solving is as follows: First solve the problem completely, considering only preconditions whose criticality value is the highest possible. These values reflect the expected difficulty of satisfying the precondition. To do this, do exactly what STRIPS did, but simply ignore preconditions of lower than peak criticality. Once this is done, use the constructed plan as the outline of a complete plan and consider preconditions at the next-lowest criticality level. Augment the plan with operators that satisfy those preconditions. Again, in choosing operators, ignore all preconditions whose criticality is less than the level now being considered. Continue this process of considering less and less critical preconditions until all of the preconditions of the original rules have been considered. Because this process explores entire plans at one level of detail before it looks at the lower-level details of any one of them, it has been called length-first search.
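The level-by-level loop is easy to sketch. Below is a minimal sketch of length-first planning by criticality, assuming a STRIPS-style base planner and operator objects with a with_preconds method; the criticality table and all helper names are ours, not ABSTRIPS's:

# A minimal sketch (ours) of ABSTRIPS-style length-first planning,
# assuming a base planner in the STRIPS mold.  The criticality table
# maps each precondition predicate to an integer difficulty value.

def visible_preconds(op, level, criticality):
    # Keep only preconditions at or above the current criticality level.
    return [p for p in op.preconds if criticality[p.predicate] >= level]

def abstrips_plan(initial, goals, operators, criticality, base_planner):
    levels = sorted(set(criticality.values()), reverse=True)
    plan = None
    for level in levels:
        # Abstract each operator by hiding its low-criticality preconditions.
        abstract_ops = [op.with_preconds(visible_preconds(op, level, criticality))
                        for op in operators]
        # At the top level, plan from scratch; below, refine the previous
        # plan by inserting steps for the newly visible preconditions.
        plan = base_planner(initial, goals, abstract_ops, skeleton=plan)
        if plan is None:
            return None   # no plan even at this abstraction level
    return plan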
Clearly, the assignment of appropriate criticality values is crucial to the success of this hierarchical planning method. Those preconditions that no operators can satisfy are clearly the most critical. For example, if we are trying to solve a problem involving a robot moving around in a house and we are considering the operator PUSH-THROUGH-DOOR, the precondition that there exist a door big enough for the robot to get through is of high criticality, since there is (in the normal situation) nothing we can do about it if it is not true. But the precondition that the door be open is of lower criticality if we have the operator OPEN-DOOR. In order for a hierarchical planning system to work with STRIPS-like rules, it must be told, in addition to the rules themselves, the appropriate criticality value for each term that may occur in a precondition. Given these values, the basic process can function in very much the same way that nonhierarchical planning does. But effort will not be wasted filling in the details of plans that do not even come close to solving the problem.
So far, we have described a deliberative planning process, in which a plan for completing an entire task is constructed prior to action. There is a very different way, though, that we could approach the problem of deciding what to do. The idea of reactive systems [Brooks,1986; Agre and Chapman,1987; Kaelbling,1987] is to avoid planning altogether and instead to use the observable situation as a clue to which one can simply react.
A reactive system must have access to a knowledge base of some sort that describes what actions should be taken under what circumstances. A reactive system is very different from the other kinds of planning systems we have discussed because it chooses actions one at a time; it does not anticipate and select an entire action sequence before it does the first thing.
One of the very simplest reactive systems is a thermostat. The job of a thermostat is to keep the temperature constant inside a room. One might imagine a solution to this problem that requires significant amounts of planning, taking into account how the external temperature rises and falls during the day, how heat flows from room to room, and so forth. But a real thermostat uses a simple pair of situation-action rules of the following form:

1. If the temperature falls below the set point, turn the heater on.
2. If the temperature rises above the set point, turn the heater off.
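The same two rules can be written down directly. Here is a minimal sketch (ours) of the thermostat as a reactive system; the sensor and switch functions are assumed stubs, and the hysteresis band is our addition:

# A minimal sketch (ours) of the thermostat as a reactive system: two
# situation-action rules and no model of the world.  read_temperature
# and set_heater are assumed stubs supplied by the environment.

def thermostat_step(read_temperature, set_heater, target, band=1.0):
    # Apply exactly one situation-action rule per perception cycle.
    temp = read_temperature()
    if temp < target - band:      # Rule 1: too cold -> heat
        set_heater(True)
    elif temp > target + band:    # Rule 2: too warm -> stop heating
        set_heater(False)
    # Otherwise do nothing: no plan, no stored state, just reaction.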
It turns out that reactive systems are capable of surprisingly complex behaviors, especially in real-world tasks such as robot navigation. We discuss robot tasks in more detail in Chapter 21. The main advantage reactive systems have over traditional planners is that they operate robustly in domains that are difficult to model completely and accurately. Reactive systems dispense with modeling altogether and base their actions directly on their perception of the world. In complex and unpredictable domains, the ability to plan an exact sequence of steps ahead of time is of questionable value. Another advantage of reactive systems is that they are extremely responsive, since they avoid the combinatorial explosion involved in deliberative planning. This makes them attractive for real-time tasks such as driving and walking.
Of course, many AI tasks do require significant deliberation, which is usually implemented as internal search. Since reactive systems maintain no model of the world and no explicit goal structures, their performance in these tasks is limited. For example, it seems unlikely that a purely reactive system could ever play expert chess. It is possible to provide a reactive system with rudimentary planning capability, but only by explicitly storing whole plans along with the situations that should trigger them. Deliberative planners need not rely on pre-stored plans; they can construct a new plan for each new problem.
Nevertheless, inquiry into reactive systems has served to illustrate many of the shortcomings of traditional planners. For one thing, it is vital to interleave planning and plan execution. Planning is important, but so is action. An intelligent system with limited resources must decide when to start thinking, when to stop thinking, and when to act. Also, goals arise naturally when the system interacts with the environment. Some mechanism for suspending plan execution is needed so that the system can turn its attention to high-priority goals. Finally, some situations require immediate attention and rapid action. For this reason, some deliberative planners [Mitchell,1990] compile out reactive subsystems (i.e., sets of situation-action rules) based on their problem-solving experiences. Such systems learn to be reactive over time.
Other planning techniques that we have not discussed include the following.
Consider again the problem of swapping the contents of two registers using the ASSIGN operator:

ASSIGN(x, v, lv, ov)
    P: CONTAINS(lv, v) ∧ CONTAINS(x, ov)
    D: CONTAINS(x, ov)
    A: CONTAINS(x, v)
Assume that there is at least one additional register, C, available.
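To make the P, D, and A lists concrete, the following is a minimal sketch (ours, not the text's code) of ASSIGN's STRIPS-style semantics, applied to the three-step plan that swaps A and B through C; the fact encoding and function names are our own:

# A minimal sketch (ours) of ASSIGN's STRIPS-style semantics: check the
# precondition list P, remove the delete list D, add the add list A.
# States are modeled as sets of CONTAINS facts.

def assign(state, x, v, lv, ov):
    P = {("CONTAINS", lv, v), ("CONTAINS", x, ov)}
    D = {("CONTAINS", x, ov)}
    A = {("CONTAINS", x, v)}
    if not P <= state:
        raise ValueError("ASSIGN precondition violated")
    return (state - D) | A

# Swap the contents of A and B through the spare register C.
state = {("CONTAINS", "A", 1), ("CONTAINS", "B", 2), ("CONTAINS", "C", 0)}
state = assign(state, "C", 1, "A", 0)   # save A's old value in C
state = assign(state, "A", 2, "B", 1)   # move B's value into A
state = assign(state, "B", 1, "C", 2)   # move the saved value into B
assert {("CONTAINS", "A", 2), ("CONTAINS", "B", 1)} <= state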