The slot-and-filler structures described in the previous chapter are very general. Individual semantic networks and frame systems may have specialized links and inference procedures, but there are no hard and fast rules about what kinds of objects and links are good in general for knowledge representation. Such decisions are left up to the builder of the semantic network or frame system.
The three structures discussed in this chapter, conceptual dependency, scripts, and CYC, on the other hand, embody specific notions of what types of objects and relations are permitted. They constitute powerful theories of how AI programs can represent and use knowledge about common situations.
Conceptual dependency (often nicknamed CD) is a theory of how to represent the kind of knowledge about events that is usually contained in natural language sentences. The goal is to represent the knowledge in a way that facilitates drawing inferences from the sentences and is independent of the language in which the sentences were originally stated.
Because of the two concerns just mentioned, the CD representation of a sentence is built not out of primitives corresponding to the words used in the sentence, but rather out of conceptual primitives that can be combined to form the meanings of words in any particular language. The theory was first described in Schank [1973] and was further developed in Schank [1975]. It has since been implemented in a variety of programs that read and understand natural language text. Unlike semantic nets, which provide only a structure into which nodes representing information at any Ievel can be placed, conceptual dependency provides both a structure and a specific set of primitives, at a particular level of granularity, out of which representations of particular pieces of information can be constructed.
As a simple example of the way knowledge is represented in CD, the event represented by the sentence
would be represented as shown in Figure 10.1.
In CD, representations of actions are built from a set of primitive acts. Although there are slight differences in the exact set of primitive actions provided in the various sources on CD, a typical set is the following, taken from Schank and Abelson [1977]:
ATRANS Transfer of an abstract relationship (e.g., give)
PTRANS Transfer of the physical location of an object (e.g., go)
PROPEL Application of physical force to an object (e.g., push)
MOVE Movement of a body part by its owner (e.g., kick)
GRASP Grasping of an object by an actor (e.g., clutch)
INGEST Ingestion of an object by an animal (e.g., eat)
EXPEL Expulsion of something from the body of an animal (e.g., cry)
MTRANS Transfer of mental information (e.g., tell)
MBUILD Building new information out of old (e.g., decide)
SPEAK Production of sounds (e.g., say)
ATTEND Focusing of a sense organ toward a stimulus (e.g., listen)
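To make the primitives concrete, here is a minimal sketch of how a program might encode a CD conceptualization. The names `Act` and `Event`, and the slot names used here, are our own illustrative choices, not part of any published CD implementation:

```python
from dataclasses import dataclass
from enum import Enum, auto

class Act(Enum):
    """The eleven primitive ACTs of conceptual dependency."""
    ATRANS = auto()  # transfer of an abstract relationship (give)
    PTRANS = auto()  # transfer of physical location (go)
    PROPEL = auto()  # application of physical force (push)
    MOVE = auto()    # movement of a body part by its owner (kick)
    GRASP = auto()   # grasping of an object by an actor (clutch)
    INGEST = auto()  # ingestion of an object by an animal (eat)
    EXPEL = auto()   # expulsion from the body of an animal (cry)
    MTRANS = auto()  # transfer of mental information (tell)
    MBUILD = auto()  # building new information out of old (decide)
    SPEAK = auto()   # production of sounds (say)
    ATTEND = auto()  # focusing a sense organ toward a stimulus (listen)

@dataclass
class Event:
    """A conceptualization: an actor performs a primitive ACT on an object,
    possibly with a source and destination, plus a conceptual tense marker."""
    actor: str
    act: Act
    obj: str = None
    source: str = None
    destination: str = None
    tense: str = "nil"   # nil = present; p = past; f = future; etc.

# "John gave Mary a book": an ATRANS of the book from John to Mary, past tense
gave = Event(actor="John", act=Act.ATRANS, obj="book",
             source="John", destination="Mary", tense="p")
```

Note that the verb "gave" itself appears nowhere in the structure; only the primitive ATRANS does.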
A second set of CD building blocks is the set of allowable dependencies among the conceptualizations described in a sentence. There are four primitive conceptual categories from which dependency structures can be built. These are
ACTs Actions
PPs Objects (picture producers)
AAs Modifiers of actions (action aiders)
PAs Modifiers of PPs (picture aiders)
In addition, dependency structures are themselves conceptualizations and can serve as components of larger dependency structures.
The dependencies among conceptualizations correspond to semantic relations among the underlying concepts. Figure 10.2 lists the most important ones allowed by CD. [Footnote: The table shown in the figure is adapted from several tables in Schank [1973].] The first column contains the rules; the second contains examples of their use; and the third contains an English version of each example. The rules shown in the figure can be interpreted as follows:
Conceptualizations representing events can be modified in a variety of ways to supply information normally indicated in language by the tense, mood, or aspect of a verb form. The use of the modifier p to indicate past tense has already been shown. The set of conceptual tenses proposed by Schank [1973] includes
p Past
f Future
t Transition
ts Start transition
tf Finished transition
k Continuing
? Interrogative
/ Negative
nil Present
delta Timeless
c Conditional
As an example of the use of these tenses, consider the CD representation shown in Figure 10.3 (taken from Schank [1973]) of the sentence
The vertical causality link indicates that smoking kills one. Since it is marked c, however, we know only that smoking can kill one, not that it necessarily does. The horizontal causality link indicates that it is this first causal relationship that made me stop smoking. The qualification tfp attached to the dependency between I and INGEST indicates that the smoking (an instance of INGESTING) has stopped and that the stopping happened in the past.
There are three important ways in which representing knowledge using the conceptual dependency model facilitates reasoning with the knowledge:
Each of these points merits further discussion.
The first argument in favor of representing knowledge in terms of CD primitives rather than in the higher-level terms in which it is normally described is that using the primitives makes it easier to describe the inference rules by which the knowledge can be manipulated. Rules need only be represented once for each primitive ACT rather than once for every word that describes that ACT. For example, all of the following verbs involve a transfer of ownership of an object:
If any of them occurs, then inferences about who now has the object and who once had the object (and thus who may know something about it) may be important. In a CD representation, those possible inferences can be stated once and associated with the primitive ACT ATRANS.
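This "one rule per primitive" economy can be sketched as follows. The function and table names (`infer`, `VERB_TO_ACT`, `INFERENCES`) are illustrative, and the verb list is only a sample:

```python
# Many verbs map onto one primitive, so inference rules are written once
# per ACT rather than once per verb.
VERB_TO_ACT = {"give": "ATRANS", "sell": "ATRANS", "donate": "ATRANS",
               "steal": "ATRANS", "eat": "INGEST"}

def atrans_inferences(actor, obj, source, destination):
    """Inferences shared by every verb describing a transfer of ownership."""
    return [f"{destination} now has {obj}",
            f"{source} once had {obj}",
            f"{source} may know something about {obj}"]

# One rule set per primitive ACT
INFERENCES = {"ATRANS": atrans_inferences}

def infer(verb, actor, obj, source, destination):
    act = VERB_TO_ACT[verb]
    rule = INFERENCES.get(act)
    return rule(actor, obj, source, destination) if rule else []

# "Sue sold Bill the car" and "Sue gave Bill the car" trigger
# exactly the same inferences, because both reduce to ATRANS.
assert infer("sell", "Sue", "car", "Sue", "Bill") == \
       infer("give", "Sue", "car", "Sue", "Bill")
```

Adding a new ownership-transfer verb requires only one new entry in `VERB_TO_ACT`; the inference rules themselves are untouched.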
A second argument in favor of the use of CD representation is that to construct it, we must use not only the information that is stated explicitly in a sentence but also a set
of inference rules associated with the specific information. Having applied these rules once, we store these results as part of the representation and they can be used repeatedly without the rules being reapplied. For example, consider the sentence
The CD representation of the information contained in this sentence is shown in Figure 10.4. (For simplicity, believe is shown as a single unit. In fact, it must be represented in terms of primitive ACTs and a model of the human information processing system.) It says that Bill informed John that he (Bill) will do something to break John's nose. Bill did this so that John will believe that if he (John) does some other thing (different from what Bill will do to break his nose), then Bill will break John's nose. In this representation, the word "believe" has been used to simplify the example. But the idea behind believe can be represented in CD as an MTRANS of a fact into John's memory. The actions do1 and do2 are dummy placeholders that refer to some as yet unspecified actions.
A third argument for the use of the CD representation is that unspecified elements of the representation of one piece of information can be used as a focus for the understanding of later events as they are encountered. So, for example, after hearing that
we might expect to find out what action Bill was trying to prevent John from performing. That action could then be substituted for the dummy action represented in Figure 10.4 as do2 . The presence of such dummy objects provides clues as to what other events or objects are important for the understanding of the known event.
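The substitution of a later-encountered action for a dummy placeholder can be sketched very simply. The structure below is a loose, illustrative rendering of the threat in Figure 10.4, not the actual CD diagram:

```python
# A loose rendering of "Bill threatened John": if John does some
# unspecified action do2, Bill will break John's nose.
threat = {"actor": "Bill", "act": "MTRANS",
          "content": {"if": "do2", "then": "break John's nose"}}

def fill_dummy(structure, dummy, action):
    """Return a copy of the structure with the dummy placeholder
    replaced by the action learned from a later sentence."""
    if structure == dummy:
        return action
    if isinstance(structure, dict):
        return {k: fill_dummy(v, dummy, action) for k, v in structure.items()}
    return structure

# A later sentence reveals what Bill was trying to prevent.
resolved = fill_dummy(threat, "do2", "John tells the police")
```

The unfilled `do2` slot is what tells the understander to watch for a subsequent event that can fill it.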
Of course, there are also arguments against the use of CD as a representation formalism. For one thing, it requires that all knowledge be decomposed into fairly low-level primitives. In Section 4.3.3 we discussed how this may be inefficient or perhaps even impossible in some situations. As Schank and Owens [1987] put it,
Thus, although there are several arguments in favor of the use of CD as a model for representing events, it is not always completely appropriate to do so, and it may be worthwhile to seek out higher-level primitives.
Another difficulty with the theory of conceptual dependency as a general model for the representation of knowledge is that it is only a theory of the representation of events. But to represent all the information that a complex program may need, it must be able to represent other things besides events. There have been attempts to define a set of primitives, similar to those of CD for actions, that can be used to describe other kinds of knowledge. For example, physical objects, which in CD are simply represented as atomic units, have been analyzed in Lehnert [1978]. A similar analysis of social actions is provided in Schank and Carbonell [1979]. These theories continue the style of representation pioneered by CD, but they have not yet been subjected to the same amount of empirical investigation (i.e., use in real programs) as CD.
We have discussed the theory of conceptual dependency in some detail in order to illustrate the behavior of a knowledge representation system built around a fairly small set of specific primitive elements. But CD is not the only such theory to have been developed and used in AI programs. For another example of a primitive-based system, see Wilks [1972].
CD is a mechanism for representing and reasoning about events. But rarely do events occur in isolation. In this section, we present a mechanism for representing knowledge about common sequences of events.
A script is a structure that describes a stereotyped sequence of events in a particular context. A script consists of a set of slots. Associated with each slot may be some information about what kinds of values it may contain as well as a default value to be used if no other information is available. So far, this definition of a script looks very similar to that of a frame given in Section 9.2, and at this level of detail, the two structures are identical. But now, because of the specialized role to be played by a script, we can make some more precise statements about its structure. Figure 10.5 shows part of a typical script, the restaurant script (taken from Schank and Abelson [1977]). It illustrates the important components of a script:
Scripts are useful because, in the real world, there are patterns to the occurrence of events. These patterns arise because of causal relationships between events. Agents will perform one action so that they will then be able to perform another. The events described in a script form a giant causal chain. The beginning of the chain is the set of entry conditions which enable the first events of the script to occur. The end of the chain is the set of results which may enable later events or event sequences (possibly described by other scripts) to occur. Within the chain, events are connected both to earlier events that make them possible and to later events that they enable.
If a particular script is known to be appropriate in a given situation, then it can be very useful in predicting the occurrence of events that were not explicitly mentioned. Scripts can also be useful by indicating how events that were mentioned relate to each other. For example, what is the connection between someone's ordering steak and someone's eating steak? But before a particular script can be applied, it must be activated (i.e., it must be selected as appropriate to the current situation). There are two ways in which it may be useful to activate a script, depending on how important the script is likely to be:
The headers of a script (its preconditions, its preferred locations, its props, its roles, and its events) can all serve as indicators that the script should be activated. In order to cut down on the number of times a spurious script is activated, it has proved useful to require that a situation contain at least two of a script's headers before the script will be activated.
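The two-header activation heuristic can be sketched in a few lines. The header set below is a hypothetical fragment of the restaurant script, not the full set from Figure 10.5:

```python
# A fragment of a script's headers: cues that suggest the script applies.
RESTAURANT_SCRIPT = {
    "name": "restaurant",
    "headers": {"restaurant", "menu", "waiter", "table", "order", "check"},
}

def activated(script, situation_cues, threshold=2):
    """Activate a script only when the situation mentions at least
    `threshold` of its headers, to cut down on spurious activations."""
    return len(script["headers"] & situation_cues) >= threshold

# "John passed a restaurant" mentions only one header: not enough.
assert not activated(RESTAURANT_SCRIPT, {"restaurant"})
# "John went to a restaurant and ordered lobster" mentions two.
assert activated(RESTAURANT_SCRIPT, {"restaurant", "order"})
```

The threshold of two is exactly the heuristic described above; a real system would also weight headers by how strongly each one predicts the script.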
Once a script has been activated, there are, as we have already suggested, a variety of ways in which it can be useful in interpreting a particular situation. The most important of these is the ability to predict events that have not explicitly been observed. Suppose, for example, that you are told the following story:
If you were then asked the question
you would almost certainly respond that he did, even though you were not told so explicitly. By using the restaurant script, a computer question-answerer would also be able to infer that John ate dinner, since the restaurant script could have been activated. Since all of the events in the story correspond to the sequence of events predicted by the script, the program could infer that the entire sequence predicted by the script occurred normally. Thus it could conclude, in particular, that John ate. In their ability to predict unobserved events, scripts are similar to frames and to other knowledge structures that represent stereotyped situations. Once one of these structures is activated in a particular situation, many predictions can be made.
A second important use of scripts is to provide a way of building a single coherent interpretation from a collection of observations. Recall that a script can be viewed as a giant causal chain. Thus it provides information about how events are related to each other. Consider, for example, the following story:
Now consider the question
The script provides two possible answers to that question:
A third way in which a script is useful is that it focuses attention on unusual events. Consider the following story:
The important part of this story is the place in which it departs from the expected sequence of events in a restaurant. John did not get mad because he was shown to his table. He did get mad because he had to wait to be served. Once the typical sequence of events is interrupted, the script can no longer be used to predict other events. So, for example, in this story, we should not infer that John paid his bill. But we can infer that he saw a menu, since reading the menu would have occurred before the interruption. For a discussion of SAM, a program that uses scripts to perform this kind of reasoning, see Cullingford [1981].
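The reasoning about interrupted scripts can be sketched as follows. The event list is a simplified, hypothetical version of the restaurant script's event sequence:

```python
# A simplified event sequence for the restaurant script.
RESTAURANT_EVENTS = ["enter", "be seated", "read menu", "order",
                     "be served", "eat", "pay", "leave"]

def inferable_events(observed, interrupted_at=None):
    """Events up to an interruption may be assumed to have occurred
    normally; events after it can no longer be predicted."""
    events = RESTAURANT_EVENTS
    if interrupted_at is not None:
        events = events[:events.index(interrupted_at)]
    return set(events) - set(observed)

# John got mad while waiting to be served: we may infer that he read
# a menu, but not that he paid his bill.
inferred = inferable_events(observed={"enter", "order"},
                            interrupted_at="be served")
```

Here `inferred` contains "read menu" and "be seated" but not "pay" or "eat", mirroring the inferences sanctioned and blocked in the story above.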
From these examples, we can see how information about typical sequences of events, as represented in scripts, can be useful in interpreting a particular, observed sequence of events. The usefulness of a script in some of these examples, such as the one in which unobserved events were predicted, is similar to the usefulness of other knowledge structures, such as frames. In other examples, we have relied on specific properties of the information stored in a script, such as the causal chain represented by the events it contains. Thus although scripts are less general structures than are frames, and so are not suitable for representing all kinds of knowledge, they can be very effective for representing the specific kinds of knowledge for which they were designed.
CYC [Lenat and Guha, 1990] is a very large knowledge base project aimed at capturing human commonsense knowledge. Recall that in Section 5.1, our first attempt to prove that Marcus was not loyal to Caesar failed because we were missing the simple fact that all men are people. The goal of CYC is to encode the large body of knowledge that is so obvious that it is easy to forget to state it explicitly. Such a knowledge base could then be combined with specialized knowledge bases to produce systems that are less brittle than most of the ones available today.
Like CD, CYC represents a specific theory of how to describe the world, and like CD, it can be used for AI tasks such as natural language understanding. CYC, however, is more comprehensive; while CD provided a specific theory of representation for events, CYC contains representations of events, objects, attitudes, and so forth. In addition, CYC is particularly concerned with issues of scale, that is, what happens when we build knowledge bases that contain millions of objects.
Why should we want to build large knowledge bases at all? There are many reasons, among them:
we easily decide that "bank" means a financial institution, and not a river bank. To do this, we apply fairly deep knowledge about what a financial institution is, what it means to withdraw money, etc. Unfortunately, for a program to assimilate the knowledge contained in an encyclopedia, that program must already know quite a bit about the world.
The approach taken by CYC is to hand-code (what its designers consider to be) the ten million or so facts that make up commonsense knowledge. It may then be possible to bootstrap into more automatic methods.
CYC's knowledge is encoded in a representation language called CYCL. CYCL is a frame-based system that incorporates most of the techniques described in Chapter 9 (multiple inheritance, slots as full-fledged objects, transfers-through, mutually-disjoint-with, etc.). CYCL generalizes the notion of inheritance so that properties can be inherited along any link, not just isa and instance. Consider the two statements:
We can easily encode the first fact using standard inheritance: any frame with Bird on its instance slot inherits the value 2 on its legs slot. The second fact can be encoded in a similar fashion if we allow inheritance to proceed along the friend relation: any frame with Mary on its friend slot inherits the value Spanish on its languagesSpoken slot. CYC further generalizes inheritance to apply to a chain of relations, allowing us to express facts like "All the parents of Mary's friends are rich," where the value Rich is inherited through a composition of the friend and parentOf links.
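Generalized inheritance along arbitrary link chains can be sketched as a path-following rule. The frames, rule format, and the names `reaches` and `get` are our own illustrative choices, not CYCL syntax:

```python
# Frames as dictionaries of slots; slot values are lists.
frames = {
    "Bird":   {"legs": [2]},
    "Tweety": {"instance": ["Bird"]},
    "Mary":   {},
    "Bob":    {"friend": ["Mary"]},      # Bob is a friend of Mary
    "Alice":  {"parentOf": ["Bob"]},     # Alice is a parent of Bob
}

# A rule (path, target, slot, value): if following the chain of links
# in `path` from a frame reaches `target`, the frame inherits `value`.
RULES = [
    (("instance",), "Bird", "legs", 2),                     # birds have 2 legs
    (("friend",), "Mary", "languagesSpoken", "Spanish"),    # Mary's friends speak Spanish
    (("parentOf", "friend"), "Mary", "netWorth", "Rich"),   # parents of Mary's friends are rich
]

def reaches(frame, path, target):
    """Does following the chain of links in `path` from `frame` reach `target`?"""
    if not path:
        return frame == target
    return any(reaches(nxt, path[1:], target)
               for nxt in frames.get(frame, {}).get(path[0], []))

def get(frame, slot):
    """A frame's own slot values plus values inherited along rule paths."""
    own = frames.get(frame, {}).get(slot, [])
    inherited = [v for path, target, s, v in RULES
                 if s == slot and reaches(frame, path, target)]
    return own + inherited

assert get("Tweety", "legs") == [2]               # ordinary isa/instance inheritance
assert get("Bob", "languagesSpoken") == ["Spanish"]  # inheritance along friend
assert get("Alice", "netWorth") == ["Rich"]       # inheritance along parentOf o friend
```

Ordinary instance inheritance is just the special case where the path is a single instance link.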
In addition to frames, CYCL contains a constraint language that allows the expression of arbitrary first-order logical expressions. For example, Figure 10.6 shows how we can express the fact "Mary likes people who program solely in Lisp." Mary has a constraint called lispConstraint, which restricts the values of her likes slot. The slotValueSubsumes attribute of lispConstraint ensures that Mary's likes slot will be filled with at least those individuals that satisfy the logical condition, namely that they program in LispLanguage and in no other language.
The time at which the default reasoning is actually performed is determined by the direction of the slotValueSubsumes rules. If the direction is backward, the rule is an if-needed rule, and it is invoked whenever someone inquires as to the value of Mary's likes slot. (In this case, the rule infers that Mary likes Bob but not Jane.) If the direction is forward, the rule is an if-added rule, and additions are automatically propagated to Mary's likes slot. For example, after we place Lisp on Bob's programsIn slot, the system quickly places Bob on Mary's likes slot for us. A truth maintenance system (see Chapter 7) ensures that if Bob ceases to be a Lisp programmer (or if he starts using Pascal), then he will also cease to appear on Mary's likes slot.
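The forward/backward distinction, with a toy truth-maintenance step, can be sketched as follows. The class and method names are hypothetical; this is not CYCL's actual machinery:

```python
class LikesLispers:
    """A sketch of Mary's lispConstraint: she likes anyone who programs
    solely in Lisp. direction='forward' propagates on assertion (if-added);
    direction='backward' computes on query (if-needed)."""

    def __init__(self, direction):
        self.direction = direction
        self.programs_in = {}   # person -> set of languages
        self.likes = set()      # cache maintained only in forward mode

    def add_language(self, person, lang):
        self.programs_in.setdefault(person, set()).add(lang)
        if self.direction == "forward":
            self._recompute(person)      # if-added: propagate immediately

    def _satisfies(self, person):
        return self.programs_in.get(person) == {"Lisp"}

    def _recompute(self, person):
        # toy truth maintenance: retract when the justification fails
        if self._satisfies(person):
            self.likes.add(person)
        else:
            self.likes.discard(person)

    def mary_likes(self, person):
        if self.direction == "backward":
            return self._satisfies(person)   # if-needed: compute on query
        return person in self.likes

kb = LikesLispers("forward")
kb.add_language("Bob", "Lisp")     # Bob appears on Mary's likes slot
kb.add_language("Bob", "Pascal")   # ...and is retracted when he strays
```

After the second assertion, `kb.mary_likes("Bob")` is false: the retraction mirrors what the truth maintenance system does when Bob starts using Pascal.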
While forward rules can be very useful, they can also require substantial time and space to propagate their values. If a rule is entered as backward, then the system defers reasoning until the information is specifically requested. CYC maintains a separate background process for accomplishing forward propagations. A knowledge engineer can continue entering knowledge while its effects are propagated during idle keyboard time. [Footnote: Another idea is to have the system do forward propagation of knowledge during periods of infrequent use, such as at night.]
Now let us return to the constraint language itself. Recall that it allows for the expression of facts as arbitrary logical expressions. Since first-order logic is much more powerful than CYC's frame language, why does CYC maintain both? The reason is that frame-based inference is very efficient, while general logical reasoning is computationally hard. CYC actually supports about twenty types of efficient inference mechanisms (including inheritance and transfers-through), each with its own truth maintenance facility. The constraint language allows for the expression of facts that are too complex for any of these mechanisms to handle.
The constraint language also provides an elegant, abstract layer of representation. In reality, CYC maintains two levels of representation: the epistemological level (EL) and the heuristic level (HL). The EL contains facts stated in the logical constraint language, while the HL contains the same facts stored using efficient inference templates. There is a translation program for automatically converting an EL statement into an efficient HL representation. The EL provides a clean, simple functional interface to CYC so that users and computer programs can easily insert and retrieve information from the knowledge base. The EL/HL distinction represents one way of combining the formal neatness of logic with the computational efficiency of frames.
In addition to frames, inference mechanisms, and the constraint language, CYCL performs consistency checking (e.g., detecting when an illegal value is placed on a slot) and conflict resolution (e.g., handling cases where multiple inference procedures assign incompatible values to a slot).
Recall our discussion of control knowledge in Chapter 6, where we saw how to take information about control out of a production system interpreter and represent it declaratively using rules. CYCL strives to accomplish the same thing with frames. We have already seen how to specify whether a fact is propagated in the forward or backward direction; this is a type of control information. Associated with each slot is a set of inference mechanisms that can be used to compute values for it. For any given problem, CYC's reasoning is constrained to a small range of relevant, efficient procedures. A query in CYCL can be tagged with a level of effort. At the lowest level of effort, CYC merely checks whether the fact is stored in the knowledge base. At higher levels, CYC will invoke backward reasoning and even entertain metaphorical chains of inference. As the knowledge base grows, it will become necessary to use control knowledge to restrict reasoning to the most relevant portions of the knowledge base. This control knowledge can, of course, be stored in frames.
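A level-of-effort query can be sketched as a lookup that escalates to backward chaining only when asked to. The function `ask` and the rule format are hypothetical, chosen only to illustrate the idea of effort levels:

```python
def ask(kb, fact, effort=0, backward_rules=()):
    """Effort 0: check whether the fact is stored in the knowledge base.
    Effort 1: also try backward chaining through the given rules.
    (Higher levels in CYC may even entertain metaphorical inference.)"""
    if fact in kb:
        return True
    if effort >= 1:
        return any(fact == head and
                   all(ask(kb, p, effort, backward_rules) for p in premises)
                   for head, premises in backward_rules)
    return False

kb = {"bird(Tweety)"}
rules = [("flies(Tweety)", ("bird(Tweety)",))]

# At the lowest effort the fact is simply not found;
# at a higher effort, backward reasoning derives it.
assert not ask(kb, "flies(Tweety)", effort=0)
assert ask(kb, "flies(Tweety)", effort=1, backward_rules=rules)
```

Tagging each query with an effort level lets the caller trade completeness for speed, which matters as the knowledge base grows to millions of facts.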
In the tradition of its predecessor RLL (Representation Language Language) [Greiner and Lenat, 1980], many of the inference mechanisms used by CYC are stored explicitly as EL templates in the knowledge base. These templates can be modified like any other frames, and a user can create a new inference template by copying and editing an old one. CYC generates Lisp code to handle the various aspects of an inference template. These aspects include recognizing when an EL statement can be transformed into an instance of the template, storing justifications of facts that are deduced (and retracting those facts when the justifications disappear), and applying the inference mechanism efficiently. As with production systems, we can build a more flexible, reflective system by moving inference procedures into a declarative representation.
It should be clear that many of the same control issues exist for frames and rules. Unlike numerical heuristic evaluation functions, control knowledge often has a commonsense, "knowledge about the world" flavor to it. It therefore begins to bridge the gap between two usually disparate types of knowledge: knowledge that is typically used for search control and knowledge that is typically used for natural language disambiguation.
Ontology is the philosophical study of what exists. In the AI context, ontology is concerned with which categories we can usefully quantify over and how those categories relate to each other. All knowledge-based systems refer to entities in the world, but in order to capture the breadth of human knowledge, we need a well-designed global ontology that specifies at a very high level what kinds of things exist and what their general properties are. As mentioned above, such a global ontology should provide a more solid foundation for domain-specific AI programs and should also allow them to communicate with each other.
The highest level concept in CYC is called Thing. Everything is an instance of Thing. Below this top-level concept, CYC makes several distinctions, including:
These are but a few of the ontological decisions that the builders of a large knowledge base must make. Other problems arise in the representation of space, causality, structures, and the persistence of objects through time. We return to some of these issues in Chapter 19.
CYC is a multi-user system that provides each knowledge enterer with a textual and graphical interface to the knowledge base. Users' modifications to the knowledge base are transmitted to a central server, where they are checked and then propagated to other users.
We do not yet have much experience with the engineering problems of building and maintaining very large knowledge bases. In the future, it will be necessary to have tools that check consistency in the knowledge base, point out areas of incompleteness, and ensure that users do not step on each others' toes.
How does this representation make it possible to answer the question
John slapped Bill.
John punched Bill.
Bill drank his Coke.
Bill slurped his Coke.
Sue likes Dickens.
Sue adores Dickens.
How could the restaurant script be invoked by the contents of this story? Trace the process throughout the story. Might any other scripts also be invoked? For example, how would you answer the question, "Did Jane pay for her dinner?"