In Chapter 1, we discussed the role that knowledge plays in AI systems. In succeeding chapters up until now, though, we have paid little attention to knowledge and its importance as we instead focused on basic frameworks for building search-based problem-solving programs. These methods are sufficiently general that we have been able to discuss them without reference to how the knowledge they need is to be represented. For example, in discussing the best-first search algorithm, we hid all the references to domain-specific knowledge in the generation of successors and the computation of the h' function. Although these methods are useful and form the skeleton of many of the methods we are about to discuss, their problem-solving power is limited precisely because of their generality. As we look in more detail at ways of representing knowledge, it becomes clear that particular knowledge representation models allow for more specific, more powerful problem-solving mechanisms that operate on them. In this part of the book, we return to the topic of knowledge and examine specific techniques that can be used for representing and manipulating knowledge within programs.
In order to solve the complex problems encountered in artificial intelligence, one needs both a large amount of knowledge and some mechanisms for manipulating that knowledge to create solutions to new problems. A variety of ways of representing knowledge (facts) have been exploited in AI programs. But before we can talk about them individually, we must consider the following point that pertains to all discussions of representation, namely that we are dealing with two different kinds of entities:

- Facts: truths in some relevant world. These are the things we want to represent.
- Representations of facts in some chosen formalism. These are the things we will actually be able to manipulate.
One way to think of structuring these entities is as two levels:

- the knowledge level, at which facts (including each agent's behaviors and current goals) are described
- the symbol level, at which representations of objects at the knowledge level are defined in terms of symbols that can be manipulated by programs
See Newell [1982] for a detailed exposition of this view in the context of agents and their goals and behaviors. In the rest of our discussion here, we will follow a model more like the one shown in Figure 4.1. Rather than thinking of one level on top of another, we will focus on facts, on representations, and on the two-way mappings that must exist between them. We will call these links representation mappings. The forward representation mapping maps from facts to representations. The backward representation mapping goes the other way, from representations to facts.
One representation of facts is so common that it deserves special mention: natural language (particularly English) sentences. Regardless of the representation for facts that we use in a program, we may also need to be concerned with an English representation of those facts in order to facilitate getting information into and out of the system. In this case, we must also have mapping functions from English sentences to the representation we are actually going to use and from it back to sentences. Figure 4.1 shows how these three kinds of objects relate to each other.
Let's look at a simple example using mathematical logic as the representational formalism. Consider the English sentence:
Spot is a dog.
The fact represented by that English sentence can also be represented in logic as:
dog(Spot)
Suppose that we also have a logical representation of the fact that all dogs have tails:
∀x : dog(x) → hastail(x)
Then, using the deductive mechanisms of logic, we may generate the new representation object:
hastail(Spot)
Using an appropriate backward mapping function, we could then generate the English sentence:
Spot has a tail.
Or we could make use of this representation of a new fact to cause us to take some appropriate action or to derive representations of additional facts.
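To make these mappings concrete, here is a minimal sketch in Lisp. The table-driven mappings and the function names are our assumptions for illustration only; as discussed below, real representation mappings are many-to-many relations, not simple tables.

(defparameter *sentence-table*
  '(("Spot is a dog."   . (dog Spot))
    ("Spot has a tail." . (hastail Spot))))

(defun english->rep (sentence)
  "Forward representation mapping: English sentence -> logical term."
  (cdr (assoc sentence *sentence-table* :test #'string=)))

(defun rep->english (rep)
  "Backward representation mapping: logical term -> English sentence."
  (car (rassoc rep *sentence-table* :test #'equal)))

(defun apply-dog-rule (fact)
  "One deductive step with the rule  ∀x : dog(x) → hastail(x)."
  (when (and (consp fact) (eq (first fact) 'dog))
    (list 'hastail (second fact))))

;; (rep->english (apply-dog-rule (english->rep "Spot is a dog.")))
;;   => "Spot has a tail."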
It is important to keep in mind that usually the available mapping functions are not one-to-one. In fact, they are often not even functions but rather many-to-many relations. (In other words, each object in the domain may map to several elements in the range, and several elements in the domain may map to the same element of the range.) This is particularly true of the mappings involving English representations of facts. For example, the two sentences "All dogs have tails" and "Every dog has a tail" could both represent the same fact, namely that every dog has at least one tail. On the other hand, the former could represent either the fact that every dog has at least one tail or the fact that each dog has several tails. The latter may represent either the fact that every dog has at least one tail or the fact that there is a tail that every dog has. As we will see shortly, when we try to convert English sentences into some other representation, such as logical propositions, we must first decide what facts the sentences represent and then convert those facts into the new representation.
The starred links of Figure 4.1 are key components of the design of any knowledge-based program. To see why, we need to understand the role that the internal representation of a fact plays in a program. What an AI program does is to manipulate the internal representations of the facts it is given. This manipulation should result in new structures that can also be interpreted as internal representations of facts. More precisely, these structures should be the internal representations of facts that correspond to the answer to the problem described by the starting set of facts.
Sometimes, a good representation makes the operation of a reasoning program not only correct but trivial. A well-known example of this occurs in the context of the mutilated checkerboard problem, which can be stated as follows:

Consider a normal checkerboard from which two squares, in opposite corners, have been removed. The task is to cover all the remaining squares exactly with dominoes, each of which covers two squares. No overlapping, either of dominoes on top of each other or of dominoes over the boundary of the mutilated board, is allowed. Can this task be done?
One way to solve this problem is to try to enumerate, exhaustively, all possible tilings to see if one works. But suppose one wants to be more clever. Figure 4.2 shows three ways in which the mutilated checkerboard could be represented (to a person). The first
representation does not directly suggest the answer to the problem. The second may; the third does, when combined with the single additional fact that each domino must cover exactly one white square and one black square. Even for human problem solvers a representation shift may make an enormous difference in problem-solving effectiveness. Recall that we saw a slightly less dramatic version of this phenomenon with respect to a problem-solving program in Section 1.3.1, where we considered two different ways of representing a tic-tac-toe board, one of which was as a magic square.
Figure 4.3 shows an expanded view of the starred part of Figure 4.1. The dotted line across the top represents the abstract reasoning process that a program is intended to model. The solid line across the bottom represents the concrete reasoning process that a particular program performs. This program successfully models the abstract process to the extent that, when the backward representation mapping is applied to the program's output, the appropriate final facts are actually generated. If either the program's operation or one of the representation mappings is not faithful to the problem that is being modeled, then the final facts will probably not be the desired ones. The key role that is played by the nature of the representation mapping is apparent from this figure. If no good mapping can be defined for a problem, then no matter how good the program to solve the problem is, it will not be able to produce answers that correspond to real answers to the problem.
It is interesting to note that Figure 4.3 looks very much like the sort of figure that might appear in a general programming book as a description of the relationship between an abstract data type (such as a set) and a concrete implementation of that type (e.g., as a linked list of elements). There are some differences, though, between this figure and the formulation usually used in programming texts (such as Aho et al. [1983]). For example, in data type design it is expected that the mapping that we are calling the backward representation mapping is a function (i.e., every representation corresponds to only one fact) and that it is onto (i.e., there is at least one representation for every fact). Unfortunately, in many AI domains, it may not be possible to come up with such a representation mapping, and we may have to live with one that gives less ideal results. But the main idea of what we are doing is the same as what programmers always do, namely to find concrete implementations of abstract concepts.
A good system for the representation of knowledge in a particular domain should possess the following four properties:

- Representational adequacy: the ability to represent all of the kinds of knowledge that are needed in that domain.
- Inferential adequacy: the ability to manipulate the representational structures in such a way as to derive new structures corresponding to new knowledge inferred from old.
- Inferential efficiency: the ability to incorporate into the knowledge structure additional information that can be used to focus the attention of the inference mechanisms in the most promising directions.
- Acquisitional efficiency: the ability to acquire new information easily.
Unfortunately, no single system that optimizes all of the capabilities for all kinds of knowledge has yet been found. As a result, multiple techniques for knowledge representation exist. Many programs rely on more than one technique. In the chapters that follow, the most important of these techniques are described in detail. But in this section, we provide a simple, example-based introduction to the important ideas.
Simple Relational Knowledge
The simplest way to represent declarative facts is as a set of relations of the same sort used in database systems. Figure 4.4 shows an example of such a relational system.
The reason that this representation is simple is that standing alone it provides very weak inferential capabilities. But knowledge represented in this form may serve as the input to more powerful inference engines. For example, given just the facts of Figure 4.4, it is not possible even to answer the simple question, "Who is the heaviest player?" But if a procedure for finding the heaviest player is provided, then these facts will enable the procedure to compute an answer. If, instead, we are provided with a set of rules for deciding which hitter to put up against a given pitcher (based on right- and left-handedness, say), then this same relation can provide at least some of the information required by those rules.
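As a minimal sketch of this division of labor, the relation below is inert data and the inferential power lives entirely in the procedure. The rows are illustrative, not the actual contents of Figure 4.4.

(defparameter *players*
  '((:name Hank-Aaron   :weight 180 :bats Right)
    (:name Babe-Ruth    :weight 215 :bats Left)
    (:name Ted-Williams :weight 205 :bats Left)))

(defun heaviest-player (rows)
  "A procedure that the bare facts alone cannot replace: sort the
rows by weight and report the name of the heaviest."
  (getf (first (sort (copy-list rows) #'>
                     :key (lambda (row) (getf row :weight))))
        :name))

;; (heaviest-player *players*) => BABE-RUTH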
Providing support for relational knowledge is what database systems are designed to do. Thus we do not need to discuss this kind of knowledge representation structure further here. The practical issues that arise in linking a database system that provides this kind of support to a knowledge representation system that provides some of the other capabilities that we are about to discuss have already been solved in several commercial products.
Inheritable Knowledge
The relational knowledge of Figure 4.4 corresponds to a set of attributes and associated values that together describe the objects of the knowledge base. Knowledge about objects, their attributes, and their values need not be as simple as that shown in our example. In particular, it is possible to augment the basic representation with inference mechanisms that operate on the structure of the representation. For this to be effective, the structure must be designed to correspond to the inference mechanisms that are desired. One of the most useful forms of inference is property inheritance, in which elements of specific classes inherit attributes and values from more general classes in which they are included.
In order to support property inheritance, objects must be organized into classes and classes must be arranged in a generalization hierarchy. Figure 4.5 shows some additional baseball knowledge inserted into a structure that is so arranged. Lines represent attributes. Boxed nodes represent objects and values of attributes of objects. These values can also be viewed as objects with attributes and values, and so on. The arrows on the lines point from an object to its value along the corresponding attribute line. The structure shown in the figure is a slot-and-filler structure. It may also be called a semantic network or a collection of frames. In the latter case, each individual frame represents the collection of attributes and values associated with a particular node. Figure 4.6 shows the node for baseball player displayed as a frame.
Do not be put off by the confusion in terminology here. There is so much flexibility in the way that this (and the other structures described in this section) can be used to solve particular representation problems that it is difficult to reserve precise words for particular representations. Usually the use of the term frame systems implies somewhat more structure on the attributes and the inference mechanisms that are available to apply to them than does the term semantic network.
In Chapter 9 we discuss structures such as these in substantial detail. But to get an idea of how these structures support inference using the knowledge they contain, we discuss them briefly here. All of the objects and most of the attributes shown in this example have been chosen to correspond to the baseball domain, and they have no general significance. The two exceptions to this are the attribute isa, which is being used to show class inclusion, and the attribute instance, which is being used to show class membership. These two specific (and generally useful) attributes provide the basis for property inheritance as an inference technique. Using this technique, the knowledge base can support retrieval both of facts that have been explicitly stored and of facts that can be derived from those that are explicitly stored.
An idealized form of the property inheritance algorithm can be stated as follows.
Algorithm: Property Inheritance
To retrieve a value V for attribute A of an instance object O:

1. Find O in the knowledge base.
2. If there is a value there for the attribute A, report that value.
3. Otherwise, see if there is a value for the attribute instance. If not, then fail.
4. Otherwise, move to the node corresponding to that value and look for a value for the attribute A. If one is found, report it.
5. Otherwise, do until there is no value for the isa attribute or until an answer is found:
(a) Get the value of the isa attribute and move to that node.
(b) See if there is a value for the attribute A. If there is, report it.
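In Lisp, a minimal sketch of this procedure might look as follows. The node format and the particular attribute values are our assumptions, loosely modeled on Figure 4.5; only single instance and isa values are handled.

(defparameter *kb*
  '((person             :handed Right)
    (adult-male         :isa person)
    (baseball-player    :isa adult-male :batting-average 0.252)
    (pitcher            :isa baseball-player :batting-average 0.106)
    (three-finger-brown :instance pitcher)))

(defun node-plist (name) (cdr (assoc name *kb*)))

(defun inherit (object attribute)
  "Retrieve a value for ATTRIBUTE of OBJECT, climbing instance/isa links."
  (let ((plist (node-plist object)))
    (or (getf plist attribute)                    ; value stored at the node itself
        (let ((parent (or (getf plist :instance)  ; class membership
                          (getf plist :isa))))    ; class inclusion
          (when parent (inherit parent attribute))))))

;; (inherit 'three-finger-brown :batting-average) => 0.106  (from pitcher)
;; (inherit 'three-finger-brown :handed)          => RIGHT  (from person)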
This procedure is simplistic. It does not say what we should do if there is more than one value of the instance or isa attribute. But it does describe the basic mechanism of inheritance. We can apply this procedure to our example knowledge base to derive answers to queries such as the team or the height of a particular player, even when those values are not stored directly with that player.
Inferential Knowledge
Property inheritance is a powerful form of inference, but it is not the only useful form. Sometimes all the power of traditional logic (and sometimes even more than that) is necessary to describe the inferences that are needed. Figure 4.7 shows two examples of the use of first-order predicate logic to represent additional knowledge about baseball.
Of course, this knowledge is useless unless there is also an inference procedure that can exploit it (just as the default knowledge in the previous example would have been useless without our algorithm for moving through the knowledge structure). The required inference procedure now is one that implements the standard logical rules of inference. There are many such procedures, some of which reason forward from given facts to conclusions, others of which reason backward from desired conclusions to given facts. One of the most commonly used of these procedures is resolution, which exploits a proof by contradiction strategy. Resolution is described in detail in Chapter 5.
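As a minimal sketch of the forward direction, the rule format below is our assumption, restricted for brevity to rules with a single one-place antecedent; real logical inference, including resolution, is far more general.

(defparameter *rules*
  '((dog . hastail)              ; ∀x : dog(x) → hastail(x)
    (hastail . has-body-part)))  ; ∀x : hastail(x) → has-body-part(x)

(defun forward-chain (facts)
  "Apply every applicable rule, repeating until no new fact is produced."
  (let ((new (remove-duplicates
              (append facts
                      (loop for (ante . conse) in *rules*
                            append (loop for f in facts
                                         when (eq (first f) ante)
                                           collect (list conse (second f)))))
              :test #'equal :from-end t)))
    (if (equal new facts) facts (forward-chain new))))

;; (forward-chain '((dog Spot)))
;;   => ((DOG SPOT) (HASTAIL SPOT) (HAS-BODY-PART SPOT))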
Recall that we hinted at the need for something besides stored primitive values with the bats attribute of our previous example. Logic provides a powerful structure in which to describe relationships among values. It is often useful to combine this, or some other powerful description language, with an isa hierarchy. In general, in fact, all of the techniques we are describing here should not be regarded as complete and incompatible ways of representing knowledge. Instead, they should be viewed as building blocks of a complete representational system.
So far, our examples of baseball knowledge have concentrated on relatively static, declarative facts. But another, equally useful, kind of knowledge is operational, or procedural, knowledge that specifies what to do when. Procedural knowledge can be
represented in programs in many ways. The most common way is simply as code (in some programming language such as LISP) for doing something. The machine uses the knowledge when it executes the code to perform a task. Unfortunately, this way of representing procedural knowledge gets low scores with respect to the properties of inferential adequacy (because it is very difficult to write a program that can reason about another program's behavior) and acquisitional efficiency (because the process of updating and debugging large pieces of code becomes unwieldy).
As an extreme example, compare the representation of the way to compute the value of bats shown in Figure 4.6 to the one in LISP shown in Figure 4.8. Although the LISP one will work, given a particular way of storing attributes and values in a list, it does not lend itself to being reasoned about in the same straightforward way as the representation of Figure 4.6 does. The LISP representation is slightly more powerful since it makes explicit use of the name of the node whose value for handed is to be found. But if this matters, the simpler representation can be augmented to do this as well.
Because of this difficulty in reasoning with LISP, attempts have been made to find other ways of representing procedural knowledge so that it can relatively easily be manipulated both by other programs and by people.
The most commonly used technique for representing procedural knowledge in AI programs is the use of production rules. Figure 4.9 shows an example of a production rule that represents a piece of operational knowledge typically possessed by a baseball player.
Production rules, particularly ones that are augmented with information on how they are to be used, are more procedural than are the other representation methods discussed in this chapter. But making a clean distinction between declarative and procedural knowledge is difficult. Although at an intuitive level such a distinction makes some sense, at a formal level it disappears, as discussed in Section 6.1. In fact, as you can see, the structure of the declarative knowledge of Figure 4.7 is not substantially different from that of the operational knowledge of Figure 4.9. The important difference is in how the knowledge is used by the procedures that manipulate it.
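A minimal sketch of how such a rule might be encoded and interpreted follows. The condition and action names only paraphrase the flavor of Figure 4.9's rule and are our assumptions.

(defparameter *walk-the-batter-rule*
  (list :if   (list (lambda (s) (eql (getf s :inning) 9))
                    (lambda (s) (getf s :first-base-vacant))
                    (lambda (s) (getf s :batter-dangerous)))
        :then :walk-the-batter))

(defun fire-rule (rule situation)
  "Return the rule's action if every condition holds in SITUATION."
  (when (every (lambda (test) (funcall test situation))
               (getf rule :if))
    (getf rule :then)))

;; (fire-rule *walk-the-batter-rule*
;;            '(:inning 9 :first-base-vacant t :batter-dangerous t))
;;   => :WALK-THE-BATTER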
Before embarking on a discussion of specific mechanisms that have been used to represent various kinds of real-world knowledge, we need briefly to discuss several issues that cut across all of them:

- Are there any attributes that occur in many different types of problem?
- Are there any important relationships that exist among attributes of objects?
- At what level should knowledge be represented? Is there a good set of primitives into which all knowledge can be broken down?
- How should sets of objects be represented?
- Given a large amount of knowledge stored in a database, how can relevant parts be accessed when they are needed?
We will talk about each of these questions briefly in the next five sections.
There are two attributes that are of very general significance, and we have already seen their use: instance and isa. These attributes are important because they support property inheritance. They are called a variety of things in AI systems, but the names do not matter. What does matter is that they represent class membership and class inclusion and that class inclusion is transitive. In slot-and-filler systems, such as those described in Chapters 9 and 10, these attributes are usually represented explicitly in a way much like that shown in Figures 4.5 and 4.6. In logic-based systems, these relationships may be represented this way or they may be represented implicitly by a set of predicates describing particular classes. See Section 5.2 for some examples of this.
The attributes that we use to describe objects are themselves entities that we represent. What properties do they have independent of the specific knowledge they encode? There are four such properties that deserve mention here:

- Inverses
- Existence in an isa hierarchy
- Techniques for reasoning about values
- Single-valued attributes
Entities in the world are related to each other in many different ways. But as soon as we decide to describe those relationships as attributes, we commit to a perspective in which we focus on one object and look for binary relationships between it and others. Attributes are those relationships. So, for example, in Figure 4.5, we used the attributes instance, isa, and team. Each of these was shown in the figure with a directed arrow, originating at the object that was being described and terminating at the object representing the value of the specified attribute. But we could equally well have focused on the object representing the value. If we do that, then there is still a relationship between the two entities, although it is a different one since the original relationship was not symmetric (although some relationships, such as sibling, are). In many cases, it is important to represent this other view of relationships. There are two good ways to do this.
The first is to represent both relationships in a single representation that ignores focus. Logical representations are usually interpreted as doing this. For example, the assertion:
team(Pee-Wee-Reese,Brooklyn-Dodgers)
can equally easily be interpreted as a statement about Pee Wee Reese or about the Brooklyn Dodgers. How it is actually used depends on the other assertions that a system contains.
The second approach is to use attributes that focus on a single entity but to use them in pairs, one the inverse of the other. In this approach, we would represent the team information with two attributes:

- one associated with Pee Wee Reese: team = Brooklyn-Dodgers
- one associated with the Brooklyn Dodgers: team-members = Pee-Wee-Reese, ...
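A minimal sketch of keeping such a pair consistent follows; the storage scheme is an assumption. A single assertion routine records both directions at once, so the two attributes can never disagree.

(defparameter *facts* (make-hash-table :test #'equal))

(defun assert-team (player team)
  "Record team(player) and, at the same time, its inverse team-members(team)."
  (setf (gethash (list player 'team) *facts*) team)
  (push player (gethash (list team 'team-members) *facts*)))

;; After (assert-team 'Pee-Wee-Reese 'Brooklyn-Dodgers):
;;   (gethash '(Pee-Wee-Reese team) *facts*)            => BROOKLYN-DODGERS
;;   (gethash '(Brooklyn-Dodgers team-members) *facts*) => (PEE-WEE-REESE)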
Just as there are classes of objects and specialized subsets of those classes, there are attributes and specializations of attributes. Consider, for example, the attribute height. It is actually a specialization of the more general attribute physical-size, which is, in turn, a specialization of physical-attribute. These generalization-specialization relationships are important for attributes for the same reason that they are important for other concepts: they support inheritance. In the case of attributes, they support inheriting information about such things as constraints on the values that the attribute can have and mechanisms for computing those values.
Sometimes values of attributes are specified explicitly when a knowledge base is created. We saw several examples of that in the baseball example of Figure 4.5. But often the reasoning system must reason about values it has not been given explicitly. Several kinds of information can play a role in this reasoning, including:

- Information about the type of the value.
- Constraints on the value, often stated in terms of related values.
- Rules for computing the value when it is needed. Such rules are called backward rules.
- Rules that describe actions that should be taken if a value ever becomes known. Such rules are called forward rules.
We discuss forward and backward rules again in Chapter 6, in the context of rule-based knowledge representation.
A specific but very useful kind of attribute is one that is guaranteed to take a unique value. For example, a baseball player can, at any one time, have only a single height and be a member of only one team. If there is already a value present for one of these attributes and a different value is asserted, then one of two things has happened. Either a change has occurred in the world or there is now a contradiction in the knowledge base that needs to be resolved. Knowledge-representation systems have taken several different approaches to providing support for single-valued attributes, including:

- Introduce an explicit notation for temporal intervals. If two different values are ever asserted for the same temporal interval, signal a contradiction automatically.
- Assume that the only temporal interval that is of interest is now. So if a new value is asserted, replace the old value.
- Provide no explicit support, but allow knowledge-base builders to add axioms that state that if an attribute has one value then it is known not to have all other values.
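A minimal sketch of the second approach, replacement with a warning when a conflict is noticed, might look like this; the function is our assumption, not a standard mechanism.

(defun assert-single-valued (plist attribute value)
  "Assert ATTRIBUTE = VALUE on the property list PLIST, replacing any
old value; flag the conflict so it can be investigated."
  (let ((old (getf plist attribute)))
    (when (and old (not (eql old value)))
      (warn "~a was ~a; replacing it with ~a" attribute old value))
    (setf (getf plist attribute) value)
    plist))

;; (assert-single-valued (list :team 'Brooklyn-Dodgers) :team 'Boston-Braves)
;;   warns, then => (:TEAM BOSTON-BRAVES)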
Regardless of the particular representation formalism we choose, it is necessary to answer the question "At what level of detail should the world be represented?" Another way this question is often phrased is "What should be our primitives?" Should there be a small number of low-level ones or should there be a larger number covering a range of granularities? A brief example illustrates the problem. Suppose we are interested in the following fact:
John spotted Sue.
We could represent this as [Footnote: The arguments agent and object are usually called cases. They represent roles involved in the event. This semantic way of analyzing sentences contrasts with the probably more familiar syntactic approach in which sentences have a surface subject, indirect object, and so forth. We will discuss case grammar [Fillmore, 1968] and its use in natural language understanding in Section 15.3.2. For the moment, you can safely assume that the cases mean what their names suggest.]
spotted(agent(John),
object(Sue))
Such a representation would make it easy to answer questions such as:
Who spotted Sue?
But now suppose we want to know:
Did John see Sue?
The obvious answer is "yes," but given only the one fact we have, we cannot discover that answer. We could, of course, add other facts, such as

spotted(x, y) → saw(x, y)
We could then infer the answer to the question.
An alternative solution to this problem is to represent the fact that spotting is really a special type of seeing explicitly in the representation of the fact. We might write something such as
saw(agent(John),
object(Sue),
timespan(briefly))
In this representation, we have broken the idea of spotting apart into more primitive concepts of seeing and timespan. Using this representation, the fact that John saw Sue is immediately accessible. But the fact that he spotted her is more difficult to get to.
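A small sketch of this trade-off follows; the stored structure mirrors the representation above, and the predicate names are our assumptions. The primitive-level fact answers the seeing question by direct inspection, while spotting must be reconstructed from its parts.

(defparameter *fact*
  '(saw (agent John) (object Sue) (timespan briefly)))

(defun saw-p (fact agent object)
  "Immediately accessible: just inspect the stored structure."
  (and (eq (first fact) 'saw)
       (equal (second fact) (list 'agent agent))
       (equal (third fact) (list 'object object))))

(defun spotted-p (fact agent object)
  "Harder to get to: spotting = seeing + a brief timespan."
  (and (saw-p fact agent object)
       (equal (fourth fact) '(timespan briefly))))

;; (saw-p *fact* 'John 'Sue)     => T
;; (spotted-p *fact* 'John 'Sue) => T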
The major advantage of converting all statements into a representation in terms of a small set of primitives is that the rules that are used to derive inferences from that knowledge need be written only in terms of the primitives rather than in terms of the many ways in which the knowledge may originally have appeared. Thus what is really being argued for is simply some sort of canonical form. Several AI programs, including those described by Schank and Abelson [1977] and Wilks [1972], are based on knowledge bases described in terms of a small number of low-level primitives.
There are several arguments against the use of low-level primitives. One is that simple high-level facts may require a lot of storage when broken down into primitives. Much of that storage is really wasted since the low-level rendition of a particular high-level concept will appear many times, once for each time the high-level concept is referenced. For example, suppose that actions are being represented as combinations of a small set of primitive actions. Then the fact that John punched Mary might be represented as shown in Figure 4.10(a). The representation says that there was physical contact between John's fist and Mary. The contact was caused by John propelling his fist toward Mary, and in order to do that John first went to where Mary was. [Footnote: The representation shown in this example is called conceptual dependency and is discussed in detail in Section 10.1.] But suppose we also know that Mary punched John. Then we must also store the structure shown in Figure 4.10(b). If, however, punching were represented simply as punching, then most of the detail of both structures could be omitted from the structures themselves. It could instead be stored just once in a common representation of the concept of punching.
A second but related problem is that if knowledge is initially presented to the system in a relatively high-level form, such as English, then substantial work must be done to reduce the knowledge into primitive form. Yet, for many purposes, this detailed primitive representation may be unnecessary. Both in understanding language and in interpreting the world that we see, many things appear that later turn out to be irrelevant. For the sake of efficiency, it may be desirable to store these things at a very high level and then to analyze in detail only those inputs that appear to be important.
A third problem with the use of low-level primitives is that in many domains, it is not at all clear what the primitives should be. And even in domains in which there may be an obvious set of primitives, there may not be enough information present in each use of the high-level constructs to enable them to be converted into their primitive components. When this is true, there is no way to avoid representing facts at a variety of granularities. The classical example of this sort of situation is provided by kinship terminology [Lindsay, 1963]. There exists at least one obvious set of primitives: mother, father, son, daughter, and possibly brother and sister. But now suppose we are told that Mary is Sue's cousin. An attempt to describe the cousin relationship in terms of the primitives could produce any of the following interpretations:
If we do not already know that Mary is female, then of course there are four more possibilities as well. Since in general we may have no way of choosing among these representations, we have no choice but to represent the fact using the nonprimitive relation cousin.
The other way to solve this problem is to change our primitives. We could use the set: parent, child, sibling, male, and female. Then the fact that Mary is Sue's cousin could be represented as
But now the primitives incorporate some generalizations that may or may not be appropriate. The main point to be learned from this example is that even in very simple domains, the correct set of primitives is not obvious.
In less well-structured domains, even more problems arise. For example, given just the fact
a program would not be able to decide if John's actions consisted of the primitive sequence:
or the sequence:
or the single action:
or the single action:
As these examples have shown, the problem of choosing the correct granularity of representation for a particular body of knowledge is not easy. Clearly, the lower the level we choose, the less inference required to reason with it in some cases, but the more inference required to create the representation from English and the more room it takes to store, since many inferences will be represented many times. The answer for any particular task domain must come to a large extent from the domain itself: to what use is the knowledge to be put?
One way of looking at the question of whether there exists a good set of low-level primitives is that it is a question of the existence of a unique representation. Does there exist a single, canonical way in which large bodies of knowledge can be represented independently of how they were originally stated? Another, closely related, uniqueness question asks whether individual objects can be represented uniquely and independently of how they are described. This issue is raised in the following quotation from Quine [1961] and discussed in Woods [1975]:
In order for a program to be able to reason as did the Babylonian, it must be able to handle several distinct representations that turn out to stand for the same object.
We discuss the question of the correct granularity of representation, as well as issues involving redundant storage of information, throughout the next several chapters, particularly in the section on conceptual dependency, since that theory explicitly proposes that a small set of low-level primitives should be used for representing actions.
It is important to be able to represent sets of objects for several reasons. One is that there are some properties that are true of sets that are not true of the individual members of a set. As examples, consider the assertions that are being made in the sentences "There are more sheep than people in Australia" and "English speakers can be found all over the world." The only way to represent the facts described in these sentences is to attach assertions to the sets representing people, sheep, and English speakers, since, for example, no single English speaker can be found all over the world. The other reason that it is important to be able to represent sets of objects is that if a property is true of all (or even most) elements of a set, then it is more efficient to associate it once with the set rather than to associate it explicitly with every element of the set. We have already looked at ways of doing that, both in logical representations through the use of the universal quantifier and in slot-and-filler structures, where we used nodes to represent sets and inheritance to propagate set-level assertions down to individuals. As we consider ways to represent sets, we will want to consider both of these uses of set-level representations. We will also need to remember that the two uses must be kept distinct. Thus if we assert something like large(Elephant), it must be clear whether we are asserting some property of the set itself (i.e., that the set of elephants is large) or some property that holds for individual elements of the set (i.e., that anything that is an elephant is large).
There are three obvious ways in which sets may be represented. The simplest is just by a name. This is essentially what we did in Section 4.2 when we used the node named Baseball-Player in our semantic net and when we used predicates such as Ball and Batter in our logical representation. This simple representation does make it possible to associate predicates with sets. But it does not, by itself, provide any information about the set it represents. It does not, for example, tell how to determine whether a particular object is a member of the set or not.
There are two ways to state a definition of a set and its elements. The first is to list the members. Such a specification is called an extensional definition. The second is to provide a rule that, when a particular object is evaluated, returns true or false depending on whether the object is in the set or not. Such a rule is called an intensional definition. For example, an extensional description of the set of our sun's planets on which people live is {Earth}. An intensional description is
{x : sun-planet(x) ∧ human-inhabited(x)}
For simple sets, it may not matter, except possibly with respect to efficiency concerns, which representation is used. But the two kinds of representations can function differently in some cases.
One way in which extensional and intensional representations differ is that they do not necessarily correspond one-to-one with each other. For example, the extensionally defined set {Earth} has many intensional definitions in addition to the one we just gave. Others include:
{x : sun-planet(x) ∧ nth-farthest-from-sun(x, 3)}
{x : sun-planet(x) ∧ nth-biggest(x, 5)}
Thus, while it is trivial to determine whether two sets are identical if extensional descriptions are used, it may be very difficult to do so using intensional descriptions.
Intensional representations have two important properties that extensional ones lack, however. The first is that they can be used to describe infinite sets and sets not all of whose elements are explicitly known. Thus we can describe intensionally such sets as prime numbers (of which there are infinitely many) or kings of England (even though we do not know who all of them are or even how many of them there have been). The second thing we can do with intensional descriptions is to allow them to depend on parameters that can change, such as time or spatial location. If we do that, then the actual set that is represented by the description will change as a function of the value of those parameters. To see the effect of this, consider the sentence, "The president of the United States used to be a Democrat," uttered when the current president is a Republican. This sentence can mean two things. The first is that the specific person who is now president was once a Democrat. This meaning can be captured straightforwardly with an extensional representation of "the president of the United States." We just specify the individual. But there is a second meaning, namely that there was once someone who was the president and who was a Democrat. To represent the meaning of "the president of the United States" given this interpretation requires an intensional description that depends on time. Thus we might write president(t), where president is some function that maps instances of time onto instances of people, namely U.S. presidents.
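A minimal sketch of the difference follows; the planet list and the date table are illustrative. An extensional set is a list of members; an intensional set is a predicate, which can also take parameters such as time.

(defparameter *inhabited-planets* '(Earth))   ; extensional: list the members

(defun sun-planet-p (x)
  (member x '(Mercury Venus Earth Mars Jupiter Saturn Uranus Neptune Pluto)))

(defun human-inhabited-p (x)
  (eq x 'Earth))                              ; a stub standing in for a real test

(defun inhabited-planet-p (x)                 ; intensional: a membership rule
  (and (sun-planet-p x) (human-inhabited-p x)))

(defun president (year)
  "An intensional description parameterized by time, president(t)."
  (cond ((<= 1981 year 1988) 'Reagan)
        ((<= 1989 year 1992) 'Bush)))

;; (inhabited-planet-p 'Mars) => NIL
;; (president 1985)           => REAGAN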
Recall that in Chapter 2, we briefly touched on the problem of matching rules against state descriptions during the problem-solving process. This same issue now rears its head with respect to locating appropriate knowledge structures that have been stored in memory.
For example, suppose we have a script (a description of a class of events in terms of contexts, participants, and subevents) that describes the typical sequence of events in a restaurant. [Footnote: We discuss such a script in detail in Chapter 10.] This script would enable us to take a text such as
John went to Steak and Ale last night. He ordered a large rare steak, paid his bill, and left.
and answer "yes" to the question
Notice that nowhere in the story was John's eating anything mentioned explicitly. But the fact that when one goes to a restaurant one eats will be contained in the restaurant script. If we know in advance to use the restaurant script, then we can answer the question easily. But in order to be able to reason about a variety of things, a system must have many scripts for everything from going to work to sailing around the world. How will it select the appropriate one each time? For example, nowhere in our story was the word "restaurant" mentioned.
In fact, in order to have access to the right structure for describing a particular situation, it is necessary to solve all of the following problems. [Footnote: This list is taken from Minsky [1975]].
There is no good, general-purpose method for solving all these problems. Some knowledge-representation techniques solve some of them. In this section we survey some solutions to two of these problems: how to select an initial structure to consider and how to find a better structure if that one turns out not to be a good match.
Selecting candidate knowledge structures to match a particular problem-solving situation is a hard problem; there are several ways in which it can be done. Three important approaches are the following:
Another problem with this approach is that it is only useful when there is an English description of the problem to be solved.
None of these proposals seems to be the complete answer to the problem. It often turns out, unfortunately, that the more complex the knowledge structures are, the harder it is to tell when a particular one is appropriate.
Once we find a candidate knowledge structure, we must attempt to do a detailed match of it to the problem at hand. Depending on the representation we are using, the details of the matching process will vary. It may require variables to be bound to objects. It may require attributes to have their values compared. In any case, if values that satisfy the required restrictions as imposed by the knowledge structure can be found, they are put into the appropriate places in the structure. If no appropriate values can be found, then a new structure must be selected. The way in which the attempt to instantiate this first structure failed may provide useful cues as to which one to try next. If, on the other hand, appropriate values can be found, then the current structure can be taken to be appropriate for describing the current situation. But, of course, that situation may change. Then information about what happened (for example, we walked around the room we were looking at) may be useful in selecting a new structure to describe the revised situation.
As was suggested above, the process of instantiating a structure in a particular situation often does not proceed smoothly. When the process runs into a snag, though, it is often not necessary to abandon the effort and start over. Rather, there are a variety of things that can be done:
So far in this chapter, we have seen several methods for representing knowledge that would allow us to form complex state descriptions for a search program. Another issue concerns how to represent efficiently sequences of problem states that arise from a search process.
Consider the world of a household robot. There are many objects and relationships in the world, and a state description must somehow include facts like on(Plant12, Table34), under(Table34, Window13), and in(Table34, Room15). One strategy is to store each state description as a list of such facts. But what happens during the problem-solving process if each of those descriptions is very long? Most of the facts will not change from one state to another, yet each fact will be represented once at every node, and we will quickly run out of memory. Furthermore, we will spend the majority of our time creating these nodes and copying these facts, most of which do not change often, from one node to another. For example, in the robot world, we could spend a lot of time recording above(Ceiling, Floor) at every node. All of this is, of course, in addition to the real problem of figuring out which facts should be different at each node.
This whole problem of representing the facts that change as well as those that do not is known as the frame problem [McCarthy and Hayes, 1969]. In some domains, the only hard part is representing all the facts. In others, though, figuring out which ones change is nontrivial. For example, in the robot world, there might be a table with a plant on it under the window. Suppose we move the table to the center of the room. We must also infer that the plant is now in the center of the room too but that the window is not.
To support this kind of reasoning, some systems make use of an explicit set of axioms called frame axioms, which describe all the things that do not change when a particular operator is applied in state n to produce state n + 1. (The things that do change must be mentioned as part of the operator itself.) Thus, in the robot domain, we might write axioms such as
color(x, y, s1) ∧ move(x, s1, s2) → color(x, y, s2)

which can be read as, "If x has color y in state s1 and the operation of moving x is applied in state s1 to produce state s2, then x still has color y in state s2." Unfortunately, in any complex domain, a huge number of these axioms becomes necessary. An alternative approach is to make the assumption that the only things that change are the things that must. By "must" here we mean that the change is either required explicitly by the axioms that describe the operator or that it follows logically from some change that is asserted explicitly. This idea of circumscribing the set of unusual things is a very powerful one; it can be used as a partial solution to the frame problem and as a way of reasoning with incomplete knowledge. We return to it in Chapter 7.
But now let's return briefly to the problem of representing a changing problem state. We could do it by simply starting with a description of the initial state and then making changes to that description as indicated by the rules we apply. This solves the problem of the wasted space and time involved in copying the information for each node. And it works fine until the first time the search has to backtrack. Then, unless all the changes that were made can simply be ignored (as they could be if, for example, they were simply additions of new theorems), we are faced with the problem of backing up to some earlier node. But how do we know what changes in the problem state description need to be undone? For example, what do we have to change to undo the effect of moving the table to the center of the room? There are two ways this problem can be solved:

- Do not modify the initial state description at all. At each node, store an indication of the specific changes that should be made at that node. Whenever it is necessary to refer to the description of the current problem state, compute it by starting with the initial description and applying the changes recorded along the path.
- Modify the state description as appropriate, but also record at each node an indication of what to undo if the path through that node is ever abandoned.
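A minimal sketch of the second option follows; the fact format mirrors the robot example above, and the undo mechanism is our assumption. A single state description is modified in place, while a stack records how to reverse each change.

(defparameter *state*
  (list '(on Plant12 Table34) '(under Table34 Window13)))
(defparameter *undo-stack* '())

(defun apply-change (added removed)
  "Update *state* and remember how to undo the change."
  (setf *state* (append added (set-difference *state* removed :test #'equal)))
  (push (list removed added) *undo-stack*))

(defun backtrack ()
  "Reverse the most recent change: restore what was removed, drop what was added."
  (destructuring-bind (re-add drop) (pop *undo-stack*)
    (setf *state* (append re-add (set-difference *state* drop :test #'equal)))))

;; (apply-change '((in Table34 Room-Center)) '((under Table34 Window13)))
;; (backtrack)   ; *state* is as it was before the move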
Sometimes, even these solutions are not enough. We might want to remember, for example, in the robot world, that before the table was moved, it was under the window and after being moved, it was in the center of the room. This can be handled by adding to the representation of each fact a specific indication of the time at which that fact was true. This indication is called a state variable. But to apply the same technique to a real-world problem, we need, for example, separate facts to indicate all the times at which the Statue of Liberty is in New York.
There is no simple answer either to the question of knowledge representation or to the frame problem. Each of them is discussed in greater depth later in the context of specific problems. But it is important to keep these questions in mind when considering search strategies, since the representation of knowledge and the search process depend heavily on each other.
The purpose of this chapter has been to outline the need for knowledge in reasoning programs and to survey issues that must be addressed in the design of a good knowledge-representation structure. Of course, we have not covered everything. In the chapters that follow we describe some specific representations and look at their relative strengths and weaknesses.
The following collections all contain further discussions of the fundamental issues in knowledge representation, along with specific techniques to address these issues: Bobrow [1975], Winograd [1978], Brachman and Levesque [1985], and Halpern [1986]. For especially clear discussions of specific issues on the topic of knowledge representation and use see Woods [1975] and Brachman [1985].