CHAPTER 15: NATURAL LANGUAGE PROCESSING

Language is meant for communicating about the world. By studying language, we can come to understand more about the world. We can test our theories about the world by how well they support our attempt to understand language. And, if we can succeed at building a computational model of language, we will have a powerful tool for communicating about the world. In this chapter, we look at how we can exploit knowledge about the world, in combination with linguistic facts, to build computational natural language systems.

Throughout this discussion, it is going to be important to keep in mind that the difficulties we will encounter do not exist out of perversity on the part of some diabolical designer. Instead, what we see as difficulties when we try to analyze language are just the flip sides of the very properties that make language so powerful. Figure 15.1 shows some examples of this. As we pursue our discussion of language processing, it is important to keep the good sides in mind since it is because of them that language is a significant enough phenomenon to be worth all the trouble.

By far the largest part of human linguistic communication occurs as speech. Written language is a fairly recent invention and still plays a less central role than speech in most activities. But processing written language (assuming it is written in unambiguous characters) is easier, in some ways, than processing speech. For example, to build a program that understands spoken language, we need all the facilities of a written language understander as well as enough additional knowledge to handle all the noise and ambiguities of the audio signal.[Footnote: Actually, in understanding spoken language, we take advantage of clues, such as intonation and the presence of pauses, to which we do not have access when we read. We can make the task of a speech-understanding program easier by allowing it, too, to use these cues, but to do so, we must know enough about them to incorporate into the program knowledge of how to use them.] Thus it is useful to divide the entire language-processing problem into two tasks:

• Processing written text, using lexical, syntactic, and semantic knowledge of the language as well as the required real-world information.
• Processing spoken language, using all the information needed above plus additional knowledge about phonology and about the ambiguities that arise only in speech.

The Problem: English sentences are incomplete descriptions of the information that they are intended to convey.

The Good Side: Language allows speakers to be as vague or as precise as they like. It also allows speakers to leave out things they believe their hearers already know.

-------------------------------------------------------

The Problem: The same expression means different things in different contexts.

The Good Side: Language lets us communicate about an infinite world using a finite (and thus learnable) number of symbols.

-------------------------------------------------------

The Problem: No natural language program can be complete because new words, expressions, and meanings can be generated quite freely.

The Good Side: Language can evolve as the experiences that we want to communicate about evolve.

-------------------------------------------------------

The Problem: There are lots of ways to say the same thing.

The Good Side: When you know a lot, facts imply each other. Language is intended to be used by agents who know a lot.

-------------------------------------------------------
Figure 15.1: Features of Language That Make It Both Difficult and Useful

In Chapter 14 we described some of the issues that arise in speech understanding, and in Section 21.2.2 we return to them in more detail. In this chapter, though, we concentrate on written language processing (usually called simply natural language processing).

Throughout this discussion of natural language processing, the focus is on English. This happens to be convenient and turns out to be where much of the work in the field has occurred. But the major issues we address are common to all natural languages. In fact, the techniques we discuss are particularly important in the task of translating from one natural language to another.

Natural language processing includes both understanding and generation, as well as other tasks such as multilingual translation. In this chapter we focus on understanding, although in Section 15.5 we will provide some references to work in these other areas.

15.1 Introduction

Recall that in the last chapter we defined understanding as the process of mapping from an input form into a more immediately useful form. It is this view of understanding that we pursue throughout this chapter. But it is useful to point out here that there is a formal sense in which a language can be defined simply as a set of strings without reference to any world being described or task to be performed. Although some of the ideas that have come out of this formal study of languages can be exploited in parts of the understanding process, they are only the beginning. To get the overall picture, we need to think of language as a pair (source language, target representation), together with a mapping between elements of one and elements of the other. The target representation will have been chosen to be appropriate for the task at hand. Often, if the task has clearly been agreed on and the details of the target representation are not important in a particular discussion, we talk just about the language itself, but the other half of the pair is really always present.

One of the great philosophical debates throughout the centuries has centered around the question of what a sentence means. We do not claim to have found the definitive answer to that question. But once we realize that understanding a piece of language involves mapping it into some representation appropriate to a particular situation, it becomes easy to see why the questions "What is language understanding?" and "What does a sentence mean?" have proved to be so difficult to answer. We use language in such a wide variety of situations that no single definition of understanding is able to account for them all. As we set about the task of building computer programs that understand natural language, one of the first things we have to do is define precisely what the underlying task is and what the target representation should look like. In the rest of this chapter, we assume that our goal is to be able to reason with the knowledge contained in the linguistic expressions, and we exploit a frame language as our target representation.

15.1.1 Steps in the Process

Before we go into detail on the several components of the natural language understanding process, it is useful to survey all of them and see how they fit together. Roughly, we can break the process down into the following pieces:

• Morphological analysis: individual words are analyzed into their components, and nonword tokens, such as punctuation, are separated from the words.
• Syntactic analysis: linear sequences of words are transformed into structures that show how the words relate to each other.
• Semantic analysis: the structures created by the syntactic analyzer are assigned meanings, i.e., a mapping is made between the syntactic structures and objects in the task domain.
• Discourse integration: the meaning of an individual sentence may depend on the sentences that precede it and may influence the meanings of the sentences that follow it.
• Pragmatic analysis: the structure representing what was said is reinterpreted to determine what was actually meant.

The boundaries between these five phases are often very fuzzy. The phases are sometimes performed in sequence, and they are sometimes performed all at once. If they are performed in sequence, one may need to appeal for assistance to another. For example, part of the process of performing the syntactic analysis of the sentence "Is the glass jar peanut butter?" is deciding how to form two noun phrases out of the four nouns at the end of the sentence (giving a sentence of the form "Is the x y?"). All of the following constituents are syntactically possible: glass, glass jar, glass jar peanut, jar peanut butter, peanut butter, butter. A syntactic processor on its own has no way to choose among these, and so any decision must be made by appealing to some model of the world in which some of these phrases make sense and others do not. If we do this, then we get a syntactic structure in which the constituents "glass jar" and "peanut butter" appear. Thus although it is often useful to separate these five processing phases to some extent, they can all interact in a variety of ways, making a complete separation impossible.

Specifically, to make the overall language understanding problem tractable, it will help if we distinguish between the following two ways of decomposing a program:

• The processes (and the knowledge they require) that must be carried out to perform the task.
• The global control structure that is imposed on those processes, i.e., the order in which they are performed and the ways in which they may interact.

In this chapter, we focus primarily on the first of these issues. It is the one that has received the most attention from people working on this problem. We do not completely ignore the second issue, although considerably less of substance is known about it. For an example of this kind of discussion that talks about interleaving syntactic and semantic processing, see Lytinen [1986].

With that caveat, let's consider an example to see how the individual processes work. In this example, we assume that the processes happen sequentially. Suppose we have an English interface to an operating system and the following sentence is typed:

• I want to print Bill's .init file.

Morphological Analysis

Morphological analysis must do the following things:

• Pull apart the word "Bill's" into the proper noun "Bill" and the possessive suffix "'s".
• Recognize the sequence ".init" as a file extension that is functioning as an adjective in the sentence.

In addition, this process will usually assign syntactic categories to all the words in the sentence. This is usually done now because interpretations for affixes (prefixes and suffixes) may depend on the syntactic category of the complete word. For example, consider the word "prints." This word is either a plural noun (with the "-s" marking plural) or a third person singular verb (as in "he prints"), in which case the "-s" indicates both singular and third person. If this step is done now, then in our example, there will be ambiguity since "want," "print," and "file" can all function as more than one syntactic category.
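
To make this concrete, here is a small Python sketch of the kind of processing involved; the toy lexicon, the suffix rules, and the category labels are our own illustrative assumptions, not those of any particular system.

# A minimal sketch of morphological analysis for the example sentence.
LEXICON = {
    "i": ["pronoun"],
    "want": ["verb", "noun"],
    "to": ["infinitive-marker", "preposition"],
    "print": ["verb", "noun"],
    "bill": ["proper-noun", "noun", "verb"],
    "file": ["noun", "verb"],
    ".init": ["file-extension"],     # functions here like an adjective
}

def analyze_word(word):
    """Separate simple suffixes and punctuation; list candidate syntactic categories."""
    w = word.lower()
    affixes = []
    if w.endswith(".") and len(w) > 1:             # separate sentence-final punctuation
        w = w[:-1]
        affixes.append("punctuation .")
    if w.endswith("'s"):                           # possessive, as in "Bill's"
        w = w[:-2]
        affixes.append("possessive 's")
    elif w.endswith("s") and w[:-1] in LEXICON:    # "prints": plural noun or 3rd-person verb
        w = w[:-1]
        affixes.append("suffix -s")
    return w, affixes, LEXICON.get(w, ["unknown"])

for token in "I want to print Bill's .init file.".split():
    print(token, "->", analyze_word(token))
# e.g.  Bill's -> ('bill', ["possessive 's"], ['proper-noun', 'noun', 'verb'])

Note how "want," "print," and "file" each come back with more than one candidate category; resolving that ambiguity is left to the later phases.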

Syntactic Analysis

Syntactic analysis must exploit the results of morphological analysis to build a structural description of the sentence. The goal of this process, called parsing, is to convert the flat list of words that forms the sentence into a structure that defines the units that are represented by that flat list. For our example sentence, the result of parsing is shown in Figure 15.2. The details of this representation are not particularly significant; we describe alternative versions of them in Section 15.2. What is important here is that a flat sentence has been converted into a hierarchical structure and that that structure has been designed to correspond to sentence units (such as noun phrases) that will correspond to meaning units when semantic analysis is performed. One useful thing we have done here, although not all syntactic systems do, is create a set of entities we call reference markers. They are shown in parentheses in the parse tree. Each one corresponds to some entity that has been mentioned in the sentence. These reference markers are useful later since they provide a place in which to accumulate information about the entities as we get it. Thus although we have not tried to do semantic analysis (i.e., assign meaning) at this point, we have designed our syntactic analysis process so that it will find constituents to which meaning can be assigned.
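
The reference markers themselves can be represented very simply. The following Python sketch (the class and the naming scheme are illustrative assumptions) shows one way a parser might attach a fresh marker to each constituent that introduces an entity.

import itertools

_marker_counter = itertools.count(1)

def new_reference_marker():
    """Generate a fresh reference marker: RM1, RM2, ..."""
    return f"RM{next(_marker_counter)}"

class Constituent:
    """A parse-tree node; entity-introducing nodes (e.g., NPs) carry a reference marker."""
    def __init__(self, category, children, introduces_entity=False):
        self.category = category
        self.children = children
        self.marker = new_reference_marker() if introduces_entity else None

np = Constituent("NP", ["Bill's", ".init", "file"], introduces_entity=True)
print(np.category, np.marker)   # e.g. NP RM1

Later phases can then hang accumulated information (ownership, the file extension, the referent finally chosen) off the marker rather than off the words themselves.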

Figure 15.2: The Result of Syntactic Analysis of "I want to print Bill's .init file."

Semantic Analysis

Semantic analysis must do two important things:

• It must map individual words into appropriate objects in the knowledge base or database.
• It must create the correct structures to correspond to the way the meanings of the individual words combine with each other.

For this example, suppose that we have a frame-based knowledge base that contains the units shown in Figure 15.3. Then we can generate a partial meaning, with respect to that knowledge base, as shown in Figure 15.4. Reference marker RM1 corresponds to the top-level event of the sentence. It is a wanting event in which the speaker (denoted by "I") wants a printing event to occur in which the same speaker prints a file whose extension is ".init" and whose owner is Bill.

Discourse Integration

At this point, we have figured out what kinds of things this sentence is about. But we do not yet know which specific individuals are being referred to. Specifically, we do not know to whom the pronoun "I" or the proper noun "Bill" refers. To pin down these references requires an appeal to a model of the current discourse context, from which we can learn that the current user (who typed the word "I") is User068 and that the only

Figure 15.3: A Knowledge Base Fragment

Figure 15.4: A Partial Meaning for a Sentence

person named "Bill" about whom we could be talking is User073. Once the correct referent for Bill is known, we can also determine exactly which file is being referred to: F1 is the only file with the extension ".init" that is owned by Bill.

Pragmatic Analysis

We now have a complete description, in the terms provided by our knowledge base, of what was said. The final step toward effective understanding is to decide what to do as a result. One possible thing to do is to record what was said as a fact and be done with it. For some sentences, whose intended effect is clearly declarative, that is precisely the correct thing to do. But for other sentences, including this one, the intended effect is different. We can discover this intended effect by applying a set of rules that characterize cooperative dialogues. In this example, we use the fact that when the user claims to want something that the system is capable of performing, then the system should go ahead and do it. This produces the final meaning shown in Figure 15.5.

The final step in pragmatic processing is to translate, when necessary, from the knowledge-based representation to a command to be executed by the system. In this case, this step is necessary, and we see that the final result of the understanding process is

Figure 15.5: Representing the Intended Meaning

where "lpr" is the operating system's file print command.

Summary

At this point, we have seen the results of each of the main processes that combine to form a natural language system. In a complete system, all of these processes are necessary in some form. For example, it may have seemed that we could have skipped the knowledge-based representation of the meaning of the sentence since the final output of the understanding system bore no relationship to it. But it is that intermediate knowledge-based representation to which we usually attach the knowledge that supports the creation of the final answer.

All of the processes we have described are important in a complete natural language understanding system. But not all programs are written with exactly these components. Sometimes two or more of them are collapsed, as we will see in several sections later in this chapter. Doing that usually results in a system that is easier to build for restricted subsets of English but one that is harder to extend to wider coverage. In the rest of this chapter we describe the major processes in more detail and talk about some of the ways in which they can be put together to form a complete system.

15.2 Syntactic Processing

Syntactic processing is the step in which a flat input sentence is converted into a hierarchical structure that corresponds to the units of meaning in the sentence. This process is called parsing. Although there are natural language understanding systems that skip this step (for example, see Section 15.3.3), it plays an important role in many natural language understanding systems for two reasons:

• Semantic processing must operate on sentence constituents. If there is no syntactic parsing step, then the semantic system must decide on its own constituents; when parsing is done first, it constrains the number of constituents that semantics must consider.
• Syntactic parsing is computationally less expensive than semantic processing (which may require substantial inference), so it can play a significant role in reducing overall system complexity.

Although there are many ways to produce a parse, almost all the systems that are actually used have two main components:

• A declarative representation, called a grammar, of the syntactic facts about the language.
• A procedure, called a parser, that compares the grammar against input sentences to produce parsed structures.

15.2.1 Grammars and Parsers

The most common way to represent grammars is as a set of production rules. Although details of the forms that are allowed in the rules vary, the basic idea remains the same and is illustrated in Figure 15.6, which shows a simple context-free, phrase structure grammar for English. Read the first rule as, "A sentence is composed of a noun phrase followed by a verb phrase." In this grammar, the vertical bar should be read as "or." The e denotes the empty string. Symbols that are further expanded by rules are called nonterminal symbols. Symbols that correspond directly to strings that must be found in an input sentence are called terminal symbols.
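
One convenient way to see what such a grammar says is to write it down as data and generate sentences from it. The following Python sketch does this for a small fragment in the spirit of Figure 15.6; the particular rules and words are illustrative, not the figure's exact grammar.

import random

RULES = {
    "S":    [["NP", "VP"]],
    "NP":   [["the", "NP1"], ["PN"]],
    "NP1":  [["ADJS", "N"]],
    "ADJS": [[], ["ADJ"]],               # [] plays the role of the empty string e
    "VP":   [["V"], ["V", "NP"]],
    "N":    [["file"], ["printer"]],
    "PN":   [["Bill"]],
    "ADJ":  [["long"], ["short"]],
    "V":    [["printed"], ["created"]],
}

def generate(symbol):
    """Expand nonterminals recursively; terminal symbols (words) are returned as-is."""
    if symbol not in RULES:
        return [symbol]
    expansion = random.choice(RULES[symbol])
    return [word for sub in expansion for word in generate(sub)]

for _ in range(3):
    print(" ".join(generate("S")))
# e.g.  "Bill created"    "the long file printed the printer"

Every string this procedure can produce is, by definition, a sentence of the language the grammar describes.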

Grammar formalisms such as this one underlie many linguistic theories, which in turn provide the basis for many natural language understanding systems. Modern linguistic theories include: the government-binding theory of Chomsky [1981; 1986], GPSG [Gazdar et al., 1985], LFG [Bresnan, 1982], and categorial grammar [Ades and Steedman, 1982; Oehrle et al., 1987]. The first three of these are also discussed in Sells [1986]. We should point out here that there is general agreement that pure, context-free grammars are not effective for describing natural languages. [Footnote: There is, however, still some debate on whether context-free grammars are formally adequate for describing natural languages (e.g., Gazdar [1982]).] As a result, natural language processing systems have less in common with computer language processing systems (such as compilers) than you might expect.

Regardless of the theoretical basis of the grammar, the parsing process takes the rules of the grammar and compares them against the input sentence. Each rule that matches adds something to the complete structure that is being built for the sentence.

Figure 15.6: A Simple Grammar for a Fragment of English

The simplest structure to build is a parse tree, which simply records the rules and how they are matched. Figure 15.7 shows the parse tree that would be produced for the sentence "Bill printed the file" using this grammar. Figure 15.2 contained another example of a parse tree, although some additions to this grammar would be required to produce it.

Notice that every node of the parse tree corresponds either to an input word or to a nonterminal in our grammar. Each level in the parse tree corresponds to the application of one grammar rule. As a result, it should be clear that a grammar specifies two things about a language:

• The set of sentences that are contained in the language, i.e., which strings of words count as grammatical sentences.
• The structure (or structures) to be assigned to each sentence of the language.

So far, we have shown the result of parsing to be exactly a trace of the rules that were applied during it. This is not always the case, though. Some grammars contain additional information that describes the structure that should be built. We present an example of such a grammar in Section 15.2.2.

But first we need to look at two important issues that define the space of possible parsers that can exploit the grammars we write.

Top-Down versus Bottom-Up Parsing

To parse a sentence, it is necessary to find a way in which that sentence could have been generated from the start symbol. There are two ways that this can be done:

• Top-down parsing: begin with the start symbol and apply the grammar rules forward until the symbols at the terminals of the tree correspond to the words of the sentence being parsed.
• Bottom-up parsing: begin with the words of the sentence and apply the grammar rules backward until a single tree whose terminals are the words of the sentence and whose top node is the start symbol has been produced.

Figure 15.7: A Parse Tree for a Sentence

The choice between these two approaches is similar to the choice between forward and backward reasoning in other problem-solving tasks. The most important consideration is the branching factor. Is it greater going backward or forward? Another important issue is the availability of good heuristics for evaluating progress. Can partial information be used to rule out paths early? Sometimes these two approaches are combined into a single method called bottom-up parsing with top-down filtering. In this method, parsing proceeds essentially bottom-up (i.e., the grammar rules are applied backward). But using tables that have been precomputed for a particular grammar, the parser can immediately eliminate constituents that can never be combined into useful higher-level structures.
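
As a concrete illustration of the top-down strategy, here is a minimal backtracking recursive-descent parser in Python. The grammar fragment is an illustrative stand-in for Figure 15.6, not its exact rules, and the generator-based control structure is just one simple way to explore the alternatives one at a time.

GRAMMAR = {
    "S":    [["NP", "VP"]],
    "NP":   [["the", "NP1"], ["PN"], ["NP1"]],
    "NP1":  [["ADJS", "N"]],
    "ADJS": [[], ["ADJ", "ADJS"]],        # [] is the empty string
    "VP":   [["V"], ["V", "NP"]],
    "N":    [["file"], ["printer"]],
    "PN":   [["bill"]],
    "ADJ":  [["long"], ["short"]],
    "V":    [["printed"], ["created"]],
}

def parse(symbol, words):
    """Yield (tree, remaining_words) for every way symbol can match a prefix of words."""
    if symbol not in GRAMMAR:                 # terminal symbol: must match the next word
        if words and words[0] == symbol:
            yield symbol, words[1:]
        return
    for production in GRAMMAR[symbol]:        # nonterminal: try each rule, top-down
        yield from parse_sequence(production, words, (symbol,))

def parse_sequence(symbols, words, tree):
    if not symbols:
        yield tree, words
        return
    for subtree, rest in parse(symbols[0], words):
        yield from parse_sequence(symbols[1:], rest, tree + (subtree,))

for tree, rest in parse("S", "bill printed the file".split()):
    if not rest:                              # a complete parse uses up all the words
        print(tree)
# -> ('S', ('NP', ('PN', 'bill')),
#          ('VP', ('V', 'printed'), ('NP', 'the', ('NP1', ('ADJS',), ('N', 'file')))))

A bottom-up parser would instead start from the words and look for rules whose right-hand sides match them; which direction has the smaller branching factor depends on the grammar, as noted above.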

Finding One Interpretation or Finding Many

As several of the examples above have shown, the process of understanding a sentence is a search process in which a large universe of possible interpretations must be explored to find one that meets all the constraints imposed by a particular sentence. As for any search process, we must decide whether to explore all possible paths or, instead, to explore only a single most likely one and to produce only the result of that one path as the answer.

Suppose, for example, that a sentence processor looks at the words of an input sentence one at a time, from left to right, and suppose that so far, it has seen:

There are two paths that the processor could be following at this point:

There are several ways of handling sentences such as these, including the following widely used parsing frameworks:

• Chart parsers [Winograd, 1983], which provide a way of avoiding backup by storing intermediate constituents so that they can be reused along alternative parsing paths.
• Definite clause grammars [Pereira and Warren, 1980], in which grammar rules are written as PROLOG clauses and the PROLOG interpreter is used to perform top-down, depth-first parsing.
  • Augmented transition networks (or ATNs) [Woods,1970], in which the parsing process is described as the transition from a start state to a final state in a transition network that corresponds to a grammar of English.

    We do not have space here to go into all these methods. In the next section, we illustrate the main ideas involved in parsing by working through an example with an ATN. After this, we look at one way of parsing with a more declarative representation.

    15.2.2 Augmented Transition Networks

An augmented transition network (ATN) is a top-down parsing procedure that allows various kinds of knowledge to be incorporated into the parsing system so it can operate efficiently. Since the early use of the ATN in the LUNAR system [Woods, 1973], which provided access to a large database of information on lunar geology, the mechanism has been exploited in many language-understanding systems. The ATN is similar to a finite state machine in which the class of labels that can be attached to the arcs that define transitions between states has been augmented. Arcs may be labeled with an arbitrary combination of the following:

• Specific words, such as "in."
• Word categories, such as noun.
• Pushes to other networks that recognize significant components of a sentence, such as a noun phrase or a prepositional phrase.
• Procedures that perform arbitrary tests on both the current input and on sentence components that have already been identified.
• Procedures that build structures that will form part of the final parse.

Figure 15.8 shows an example of an ATN in graphical notation. Figure 15.9 shows the top-level ATN of that example in a notation that a program could read. To see how an ATN works, let us trace the execution of this ATN as it parses the following sentence:

• The long file has printed.

    This execution proceeds as follows:

    1. Begin in state S.
    2. Push to NP
3. Do a category test to see if "the" is a determiner.
    4. This test succeeds, so set the DETERMINER register to DEFINITE and go to state Q6.
    5. Do a category test to see if "long" is an adjective.

      Figure 15.8: An ATN Network for a Fragment of English
    6. This test succeeds, so append "long" to the list contained in the ADJS register. (This list was previously empty.) Stay in state Q6.
    7. Do a category test to see if "file" is an adjective. This test fails.
    8. Do a category test to see if "file" is a noun. This test succeeds, so set the NOUN register to "file" and go to state Q7.
    9. Push to PP
    10. Do a category test to see if "has" is a preposition. This test fails, so pop and signal failure.
    11. There is nothing else that can be done from state Q7, so pop and return the structure

      (NP (FILE (LONG) DEFINITE))

The return causes the machine to be in state Q1, with the SUBJ register set to the structure just returned and the TYPE register set to DCL.

    12. Do a category test to see if "has" is a verb. This test succeeds, so set the AUX register to NIL and set the V register to "has." Go to state Q4.
13. Push to state NP. Since the next word, "printed," is not a determiner or a proper noun, NP will pop and return failure.
    14. The only other thing to do in state Q4 is to halt. But more input remains, so a complete parse has not been found. Backtracking is now required.

      Figure 15.9: An ATN Grammar in List Form
15. The last choice point was at state Q1, so return there. The registers AUX and V must be unset.
    16. Do a category test to see if "has" is an auxiliary. This test succeeds, so set the AUX register to "has" and go to state Q3.
    17. Do a category test to see if "printed" is a verb. This test succeeds, so set the V register to "printed." Go to state Q4.
18. Now, since the input is exhausted, Q4 is an acceptable final state. Pop and return the structure

(S DCL (NP (FILE (LONG) DEFINITE))
       HAS
       (VP PRINTED))

      This structure is the output of the parse.

This example grammar illustrates several interesting points about the use of ATNs. A single subnetwork need only occur once even though it is used in more than one place. A network can be called recursively. Any number of internal registers may be used to contain the result of the parse. The result of a network can be built, using the function BUILDQ, out of values contained in the various system registers. A single state may be both a final state, in which a complete sentence has been found, and an intermediate state, in which only a part of a sentence has been recognized. And, finally, the contents of a register can be modified at any time.
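
The following Python sketch shows one way the machinery used in the trace above might be represented; it is a deliberate simplification (only the NP subnetwork, and only category-test and pop arcs), and the data structures are our own invention rather than the list notation of Figure 15.9.

CATEGORIES = {"the": "det", "a": "det",
              "long": "adj", "short": "adj",
              "file": "noun", "printer": "noun"}

# Each state maps to a list of arcs: (arc type, category tested, action, next state).
NP_NETWORK = {
    "NP": [("cat", "det",
            lambda w, r: r.update(DETERMINER="DEFINITE" if w == "the" else "INDEFINITE"),
            "Q6")],
    "Q6": [("cat", "adj",  lambda w, r: r["ADJS"].append(w.upper()), "Q6"),
           ("cat", "noun", lambda w, r: r.update(NOUN=w.upper()), "Q7")],
    "Q7": [("pop", None,
            lambda w, r: ("NP", (r["NOUN"], tuple(r["ADJS"]), r["DETERMINER"])),
            None)],
}

def run_np(words):
    """Traverse the NP network, taking the first arc that applies in each state."""
    state, i, registers = "NP", 0, {"ADJS": []}
    while True:
        for arc_type, category, action, next_state in NP_NETWORK[state]:
            if arc_type == "cat" and i < len(words) and CATEGORIES.get(words[i]) == category:
                action(words[i], registers)          # set or update a register
                state, i = next_state, i + 1
                break
            if arc_type == "pop":                    # return the structure built so far
                return action(None, registers), words[i:]
        else:
            return None, words[i:]                   # no arc applies: fail

print(run_np(["the", "long", "file", "has", "printed"]))
# -> (('NP', ('FILE', ('LONG',), 'DEFINITE')), ['has', 'printed'])

A full ATN interpreter would also need push arcs (to call subnetworks such as PP), arbitrary tests, and backtracking over choice points, as steps 14 through 16 of the trace illustrate.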

In addition, there are a variety of ways in which ATNs can be used that are not shown in this example.

    15.2.3 Unification Grammars

ATN grammars have substantial procedural components. The grammar describes the order in which constituents must be built. Variables are explicitly given values, and they must already have been assigned a value before they can be referenced. This procedurality limits the effectiveness of ATN grammars in some cases, for example, in speech processing, where some later parts of the sentence may have been recognized clearly while earlier parts are still unknown (for example, suppose we had heard, "The long * * * file printed."), or in systems that want to use the same grammar to support both understanding and generation (e.g., Appelt [1987], Shieber [1988], and Barnett et al. [1990]). Although there is no clear distinction between declarative and procedural representations (as we saw in Section 6.1), there is a spectrum and it often turns out that more declarative representations are more flexible than more procedural ones are. So in this section we describe a declarative approach to representing grammars.

When a parser applies grammar rules to a sentence, it performs two major kinds of operations:

• Matching: constituents (either individual words or larger structures that have already been built) are matched against the patterns given in the grammar rules.
• Structure building: the constituents that matched a rule are combined into the larger structure that the rule specifies.

    Now think back to the unification operation that we described in Section 5.4.4 as part of our theorem-proving discussion. Matching and structure building are operations that unification performs naturally. So an obvious candidate for representing grammars is some structure on which we can define a unification operator. Directed acyclic graphs (DAGs) can do exactly that.

    Each DAG represents a set of attribute-value pairs. For example, the graphs corresponding to the words "the" and "file" are:

[CAT: DET
 LEX: the]

[CAT: N
 LEX: file
 NUMBER: SING]

    Both words have a lexical category (CAT) and a lexical entry. In addition, the word "file" has a value (SING) for the NUMBER attribute. The result of combining these two words to form a simple NP can also be described as a graph:

[NP: [DET: the
      HEAD: file
      NUMBER: SING]]

The rule that forms this new constituent can also be represented as a graph, but to do so we need to introduce a new notation. Until now, all our graphs have actually been trees. To describe graphs that are not trees, we need a way to label a piece of a graph and then point to that piece elsewhere in the graph. So let {n} for any value of n be a label, which is to be interpreted as a label for the next constituent following it in the graph. Sometimes, the constituent is empty (i.e., there is not yet any structure that is known to fill that piece of the graph). In that case, the label functions very much like a variable and will be treated like one by the unification operation. It is this degenerate kind of label that we need in order to describe the NP rule that combines a determiner with a noun.

    We can write this rule as the following graph:

[CONSTITUENT1: [CAT: DET
                LEX: {1}]
 CONSTITUENT2: [CAT: N
                LEX: {2}
                NUMBER: {3}]
 BUILD: [NP: [DET: {1}
              HEAD: {2}
              NUMBER: {3}]]]

This rule should be read as follows: Two constituents, described in the subgraphs labeled CONSTITUENT1 and CONSTITUENT2, are to be combined. The first must be of CAT DET. We do not care what its lexical entry is, but whatever it is will be bound to the label {1}. The second constituent must be of CAT N. Its lexical entry will be bound to the label {2}, and its number will be bound to the label {3}. The result of combining these two constituents is described in the subgraph labeled BUILD. This result will be a graph corresponding to an NP with three attributes: DET, HEAD, and NUMBER. The values for all these attributes are to be taken from the appropriate pieces of the graphs that are being combined by the rule.

Now we need to define a unification operator that can be applied to the graphs we have just described. It will be very similar to logical unification. Two graphs unify if, recursively, all their subgraphs unify. The result of a successful unification is a graph that is composed of the union of the subgraphs of the two inputs, with all bindings made as indicated. This process bottoms out when a subgraph is not an attribute-value pair but is just a value for an attribute. At that point, we must define what it means for two values to unify. Identical values unify. Anything unifies with a variable (a label with no attached structure) and produces a binding for the label. The simplest thing to do is then to say that any other situation results in failure. But it may be useful to be more flexible. So some systems allow a value to match with a more general one (e.g., PROPER-NOUN matches NOUN). Others allow values that are disjunctions [e.g., (MASCULINE V FEMININE)], in which case unification succeeds whenever the intersection of the two values is not empty.

    There is one other important difference between logical unification and graph unification. The inputs to logical unification are treated as logical formulas. Order matters, since, for example, f(g(a), h(b)) is a different formula than f(h(b), g(a)). The inputs to graph unification, on the other hand, must be treated as sets, since the order in which attribute-value pairs are stated does not matter. For example, if a rule describes a constituent as

[CAT: DET
 LEX: {1}]

    we want to be able to match a constituent such as

[LEX: the
 CAT: DET]

Algorithm: Graph-Unify(G1, G2)

1. If either G1 or G2 is an attribute that is not itself an attribute-value pair, then
  1. If the attributes conflict (as defined above), then fail.
  2. If either is a variable, then bind it to the value of the other and return that value.
  3. Otherwise, return the most general value that is consistent with both the original values. Specifically, if disjunction is allowed, then return the intersection of the values.
    2. Otherwise, do:
      1. Set variable NEW to empty.
      2. For each attribute A that is present (at the top level) in either G1 or G2 do
        1. If A is not present at the top level in the other input, then add A and its value to NEW.
        2. If it is, then call Graph-Unify with the two values for A. If that fails, then fail. Otherwise, take the new value of A to be the result of that unification and add A with its value to NEW.
      3. If there are any labels attached to G1 or G2, then bind them to NEW and return NEW.

A simple parser can use this algorithm to apply a grammar rule by unifying CONSTITUENT1 with a proposed first constituent. If that succeeds, then CONSTITUENT2 is unified with a proposed second constituent. If that also succeeds, then a new constituent corresponding to the value of BUILD is produced. If there are variables in the value of BUILD that were bound during the matching of the constituents, then those bindings will be used to build the new constituent.
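
A direct Python rendering of Graph-Unify is short; in the sketch below, graphs are dictionaries, the labels {1}, {2}, {3} are instances of a small Var class, and bindings are made destructively (a real parser would need to undo them when it backtracks). These representation choices are ours, not part of any standard formalism.

class Var:
    """A label with no attached structure; behaves like a logic variable."""
    def __init__(self, name):
        self.name = name
        self.value = None                     # filled in when the variable is bound

def resolve(x):
    """Follow bindings down to a concrete value (or an unbound Var)."""
    while isinstance(x, Var) and x.value is not None:
        x = x.value
    return x

def graph_unify(g1, g2):
    """Unify two feature graphs; return the combined graph, or None on failure."""
    g1, g2 = resolve(g1), resolve(g2)
    if isinstance(g1, Var):                   # anything unifies with a variable
        g1.value = g2
        return g2
    if isinstance(g2, Var):
        g2.value = g1
        return g1
    if not isinstance(g1, dict) or not isinstance(g2, dict):
        return g1 if g1 == g2 else None       # identical atomic values unify
    new = {}
    for attr in set(g1) | set(g2):
        if attr in g1 and attr in g2:
            result = graph_unify(g1[attr], g2[attr])
            if result is None:
                return None                   # the values for attr conflict
            new[attr] = result
        else:                                 # present in only one input: copy it over
            new[attr] = g1.get(attr, g2.get(attr))
    return new

# Apply the NP rule of the text to "the" and "file".
lex1, lex2, num = Var("1"), Var("2"), Var("3")
rule = {"CONSTITUENT1": {"CAT": "DET", "LEX": lex1},
        "CONSTITUENT2": {"CAT": "N", "LEX": lex2, "NUMBER": num},
        "BUILD": {"NP": {"DET": lex1, "HEAD": lex2, "NUMBER": num}}}
the_entry  = {"CAT": "DET", "LEX": "the"}
file_entry = {"CAT": "N", "LEX": "file", "NUMBER": "SING"}

if graph_unify(rule["CONSTITUENT1"], the_entry) is not None and \
   graph_unify(rule["CONSTITUENT2"], file_entry) is not None:
    print({attr: resolve(v) for attr, v in rule["BUILD"]["NP"].items()})
# -> {'DET': 'the', 'HEAD': 'file', 'NUMBER': 'SING'}

Allowing a value to match a more general one, or allowing disjunctive values, would only require changing the atomic-value case marked "identical atomic values unify."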

    There are many possible variations on the notation we have described here. There are also a variety of ways of using it to represent dictionary entries and grammar rules. See Shieber [1986] and Knight [1989] for discussions of some of them.

Although we have presented unification here as a technique for doing syntactic analysis, it has also been used as a basis for semantic interpretation. In fact, there are arguments for using it as a uniform representation for all phases of natural language understanding. There are also arguments against doing so, primarily involving system modularity, the noncompositionality of language in some respects (see Section 15.3.4), and the need to invoke substantial domain reasoning. We will not say any more about this here, but to see how this idea could work, see Allen [1989].

    15.3 Semantic Analysis

Producing a syntactic parse of a sentence is only the first step toward understanding it. We must still produce a representation of the meaning of the sentence. Because understanding is a mapping process, we must first define the language into which we are trying to map. There is no single, definitive language in which all sentence meanings can be described. All of the knowledge representation systems that were described in Part II are candidates, and having selected one or more of them, we still need to define the vocabulary (i.e., the predicates, frames, or whatever) that will be used on top of the structure. In the rest of this chapter, we call the final meaning representation language, including both the representational framework and the specific meaning vocabulary, the target language. The choice of a target language for any particular natural language understanding program must depend on what is to be done with the meanings once they are constructed. There are two broad families of target languages that are used in NL systems, depending on the role that the natural language system is playing in a larger system (if any).

When natural language is being considered as a phenomenon on its own, as, for example, when one builds a program whose goal is to read text and then answer questions about it, a target language can be designed specifically to support language processing. In this case, one typically looks for primitives that correspond to distinctions that are usually made in language. Of course, selecting the right set of primitives is not easy. We discussed this issue briefly in Section 4.3.3, and in Chapter 10 we looked at two proposals for a set of primitives, conceptual dependency and CYC.

When natural language is being used as an interface language to another program (such as a database query system or an expert system), then the target language must be a legal input to that other program. Thus the design of the target language is driven by the backend program. This was the case in the simple example we discussed in Section 15.1.1. But even in this case, it is useful, as we showed in that example, to use an intermediate knowledge-based representation to guide the overall process. So, in the rest of this section, we assume that the target language we are building is a knowledge-based one.

Although the main purpose of semantic processing is the creation of a target language representation of a sentence's meaning, there is another important role that it plays. It imposes constraints on the representations that can be constructed, and, because of the structural connections that must exist between the syntactic structure and the semantic one, it also provides a way of selecting among competing syntactic analyses. Semantic processing can impose constraints because it has access to knowledge about what makes sense in the world. We already mentioned one example of this, the sentence, "Is the glass jar peanut butter?" There are other examples in the rest of this section.

    Lexical Processing

    The first step in any semantic processing system is to look up the individual words in a dictionary (or lexicon) and extract their meanings. Unfortunately, many words have several meanings, and it may not be possible to choose the correct one just by looking at the word itself. For example, the word "diamond" might have the following set of meanings:

    • A geometrical shape with four equal sides
    • A baseball field
    • An extremely hard and valuable gemstone

        To select the correct meaning for the word "diamond" in the sentence,

        • Joan saw Susan's diamond shimmering from across the room.

        it is necessary to know that neither geometrical shapes nor baseball fields shimmer, whereas gemstones do.

Unfortunately, if we view English understanding as mapping from English words into objects in a specific knowledge base, lexical ambiguity is often greater than it seems in everyday English. For example, consider the word "mean." This word is ambiguous in at least three ways: it can be a verb meaning "to signify"; it can be an adjective meaning "unpleasant" or "cheap"; and it can be a noun meaning "statistical average." But now imagine that we have a knowledge base that describes a statistics program and its operation. There might be at least two distinct objects in that knowledge base, both of which correspond to the "statistical average" meaning of "mean." One object is the statistical concept of a mean; the other is the particular function that computes the mean in this program. To understand the word "mean" we need to map it into some concept in our knowledge base. But to do that, we must decide which of these concepts is meant. Because of cases like this, lexical ambiguity is a serious problem, even when the domain of discourse is severely constrained.

The process of determining the correct meaning of an individual word is called word sense disambiguation or lexical disambiguation. It is done by associating, with each word in the lexicon, information about the contexts in which each of the word's senses may appear. Each of the words in a sentence can serve as part of the context in which the meanings of the other words must be determined.

Sometimes only very straightforward information about each word sense is necessary. For example, the baseball field interpretation of "diamond" could be marked as a LOCATION. Then the correct meaning of "diamond" in the sentence "I'll meet you at the diamond" could easily be determined if the fact that "at" requires a TIME or a LOCATION as its object were recorded as part of the lexical entry for "at." Such simple properties of word senses are called semantic markers. Other useful semantic markers are

    • PHYSICAL-OBJECT
    • ANIMATE-OBJECT
• ABSTRACT-OBJECT

    Using these markers, the correct meaning of "diamond" in the sentence "I dropped my diamond" can be computed. As part of its lexical entry, the verb "drop" will specify that its object must be a PHYSICAL-OBJECT. The gemstone meaning of "diamond" will be marked as a PHYSICAL-OBJECT. So it will be selected as the appropriate meaning in this context.
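
The mechanics of this kind of check are simple enough to show directly; in the Python sketch below, the markers attached to each sense of "diamond" and the requirement attached to each verb are invented for illustration.

# Each word sense carries a set of semantic markers.
SENSES = {
    "diamond": [
        {"meaning": "four-sided geometrical shape", "markers": {"ABSTRACT-OBJECT"}},
        {"meaning": "baseball field",               "markers": {"LOCATION"}},
        {"meaning": "gemstone",                     "markers": {"PHYSICAL-OBJECT"}},
    ],
}

# What each verb (or preposition) requires of its object.
OBJECT_REQUIREMENT = {
    "drop":    "PHYSICAL-OBJECT",
    "meet-at": "LOCATION",
}

def disambiguate(noun, governor):
    """Keep only the senses of noun whose markers satisfy the governor's requirement."""
    required = OBJECT_REQUIREMENT[governor]
    return [s["meaning"] for s in SENSES[noun] if required in s["markers"]]

print(disambiguate("diamond", "drop"))      # -> ['gemstone']
print(disambiguate("diamond", "meet-at"))   # -> ['baseball field']

The sentence about Susan's diamond shows where this breaks down: to handle it we would need a marker like SHIMMERABLE on exactly one of the three senses, which is the proliferation problem discussed below.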

This technique has been extended by Wilks [1972; 1975a; 1975b] in his preference semantics, which relies on the notion that requirements, such as the one described above for an object that is a LOCATION, are rarely hard-and-fast demands. Rather, they can best be described as preferences. For example, we might say that verbs such as "hate" prefer a subject that is animate. Thus we have no difficulty in understanding the sentence

    • Pop hates the cold.

    as describing the feelings of a man and not those of soft drinks. But now consider the sentence

    • My lawn hates the cold.

Now there is no animate subject available, and so the metaphorical reading, in which the lawn acts as an animate object, should be accepted.

Unfortunately, to solve the lexical disambiguation problem completely, it becomes necessary to introduce more and more finely grained semantic markers. For example, to interpret the sentence about Susan's diamond correctly, we must mark one sense of diamond as SHIMMERABLE, while the other two are marked NONSHIMMERABLE. As the number of such markers grows, the size of the lexicon becomes unmanageable. In addition, each new entry into the lexicon may require that a new marker be added to each of the existing entries. The breakdown of the semantic marker approach when the number of words and word senses becomes large has led to the development of other ways in which correct senses can be chosen. We return to this issue in Section 15.3.4.

    Sentence-Level Processing

    Several approaches to the problem of creating a semantic representation of a sentence have been developed, including the following:

    • Semantic grammars, which combine syntactic, semantic, and pragmatic knowledge into a single set of rules in the form of a grammar. The result of parsing with such a grammar is a semantic, rather than just a syntactic, description of a sentence.
    • Case grammars, in which the structure that is built by the parser contains some semantic information, although further interpretation may also be necessary.
    • Conceptual parsing, in which syntactic and semantic knowledge are combined into a single interpretation system that is driven by the semantic knowledge. In this approach, syntactic parsing is subordinated to semantic interpretation, which is usually used to set up strong expectations for particular sentence structures.
• Approximately compositional semantic interpretation, in which semantic processing is applied to the result of performing a syntactic parse. This can be done either incrementally, as constituents are built, or all at once, when a structure corresponding to a complete sentence has been built.

    In the following sections, we discuss each of these approaches.

    15.3.1 Semantic Grammars

A semantic grammar [Burton, 1976; Hendrix et al., 1978; Hendrix and Lewis, 1981] is a context-free grammar in which the choice of nonterminals and production rules is governed by semantic as well as syntactic function. In addition, there is usually a semantic action associated with each grammar rule. The result of parsing and applying all the associated semantic actions is the meaning of the sentence. This close coupling of semantic actions to grammar rules works because the grammar rules themselves are designed around key semantic concepts.

    Figure 15.10: A Semantic Grammar

An example of a fragment of a semantic grammar is shown in Figure 15.10. This grammar defines part of a simple interface to an operating system. Shown in braces under each rule is the semantic action that is taken when the rule is applied. The term "value" is used to refer to the value that is matched by the right-hand side of the rule. The dotted notation x.y should be read as the y attribute of the unit x. The result of a successful parse using this grammar will be either a command or a query.
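
To give the flavor of such a grammar without reproducing Figure 15.10, here is a small Python sketch in which the nonterminals are semantic categories and each rule carries an action that builds the meaning directly. The rule names, the actions, and the assumption that the input has already been lowercased and tokenized are all our own simplifications.

RULES = [
    # (left-hand side, right-hand side, action applied to the matched sub-values)
    ("PRINT-REQUEST", ["i", "want", "to", "print", "FILE-DESCRIPTION"],
     lambda vals: {"command": "PRINT", "file": vals[4]}),
    ("FILE-DESCRIPTION", ["OWNER", "EXTENSION", "file"],
     lambda vals: {"owner": vals[0], "extension": vals[1]}),
    ("OWNER", ["bill's"], lambda vals: "Bill"),
    ("EXTENSION", [".init"], lambda vals: ".init"),
]

def parse(symbol, words):
    """Return (meaning, remaining words) for the first rule of symbol that matches, else None."""
    for lhs, rhs, action in RULES:
        if lhs != symbol:
            continue
        values, rest = [], words
        for item in rhs:
            if item.isupper():                     # a semantic nonterminal
                sub = parse(item, rest)
                if sub is None:
                    break
                value, rest = sub
                values.append(value)
            elif rest and rest[0] == item:         # a literal word
                values.append(rest[0])
                rest = rest[1:]
            else:
                break
        else:
            return action(values), rest            # whole right-hand side matched
    return None

print(parse("PRINT-REQUEST", "i want to print bill's .init file .".split()))
# -> ({'command': 'PRINT', 'file': {'owner': 'Bill', 'extension': '.init'}}, ['.'])

Notice that syntactic and semantic processing are indistinguishable here, which is exactly the point (and, as discussed below, also the limitation) of the approach.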

A semantic grammar can be used by a parsing system in exactly the same ways in which a strictly syntactic grammar could be used. Several existing systems that have used semantic grammars have been built around an ATN parsing system, since it offers a great deal of flexibility.

Figure 15.11 shows the result of applying this semantic grammar to the sentence

    • I want to print Bill's .init file.

    Figure 15.11: The Result of Parsing with a Semantic Grammar

Notice that in this approach, we have combined into a single process all five steps of Section 15.1.1 with the exception of the final part of pragmatic processing in which the conversion to the system's command syntax is done.

    The principal advantages of semantic grammars are the following:

• When the parse is complete, the result can be used immediately without the additional stage of processing that would be required if a semantic interpretation had not already been performed during the parse.
    • Many ambiguities that would arise during a strictly syntactic parse can be avoided since some of the interpretations do not make sense semantically and thus cannot be generated by a semantic grammar. Consider, for example, the sentence "I want to print stuff.txt on printer3." During a strictly syntactic parse, it would not be possible to decide whether the prepositional phrase, "on printer3" modified "want" or "print." But using our semantic grammar, there is no general notion of a prepositional phrase and there is no attachment ambiguity.
• Syntactic issues that do not affect the semantics can be ignored. For example, using the grammar shown above, the sentence, "What is the extension of .lisp file?" would be parsed and accepted as correct.
    There are, however, some drawbacks to the use of semantic grammars:
    • The number of rules required can become very large since many syntactic generalizations are missed.
    • Because the number of grammar rules may be very large, the parsing process may be expensive.

    After many experiments with the use of semantic grammars in a variety of domains, the conclusion appears to be that for producing restricted natural language interfaces quickly, they can be very useful. But as an overall solution to the problem of language understanding, they are doomed by their failure to capture important linguistic generalizations.

    15.3.2 Case Grammars

Case grammars [Fillmore, 1968; Bruce, 1975] provide a different approach to the problem of how syntactic and semantic interpretation can be combined. Grammar rules are written to describe syntactic rather than semantic regularities. But the structures the rules produce correspond to semantic relations rather than to strictly syntactic ones. As an example, consider the two sentences and the simplified forms of their conventional parse trees shown in Figure 15.12.

    Although the semantic roles of "Susan" and "the file" are identical in these two sentences, their syntactic roles are reversed. Each is the subject in one sentence and the object in another.

    Using a case grammar, the interpretations of the two sentences would both be

(printed (agent Susan)
         (object File))

Figure 15.12: Syntactic Parses of an Active and a Passive Sentence

Figure 15.13: Syntactic Parses of Two Similar Sentences

    Now consider the two sentences shown in Figure 15.13.

The syntactic structures of these two sentences are almost identical. In one case, "Mother" is the subject of "baked," while in the other "the pie" is the subject. But the relationship between Mother and baking is very different from that between the pie and baking. A case grammar analysis of these two sentences reflects this difference. The first sentence would be interpreted as

(baked (agent Mother)
       (timeperiod 3-hours))

The second would be interpreted as

(baked (object Pie)
       (timeperiod 3-hours))

In these representations, the semantic roles of "Mother" and "the pie" are made explicit. It is interesting to note that this semantic information actually does intrude into the syntax of the language. While it is allowed to conjoin two parallel sentences (e.g., "the pie baked" and "the cake baked" become "the pie and the cake baked"), this is only possible if the conjoined noun phrases are in the same case relation to the verb. This accounts for the fact that we do not say, "Mother and the pie baked."

Notice that the cases used by a case grammar describe relationships between verbs and their arguments. This contrasts with the grammatical notion of surface case, as exhibited, for example, in English, by the distinction between "I" (nominative case) and "me" (objective case). A given grammatical, or surface, case can indicate a variety of semantic, or deep, cases.

    There is no clear agreement on exactly what the correct set of deep cases ought to be, but some obvious ones are the following:

• (A) Agent-Instigator of the action (typically animate)
    • (I) Instrument-Cause of the event or object used in causing the event (typically inanimate)
    • (D) Dative-Entity affected by the action (typically animate)
    • (F) Factitive-Object or being resulting from the event
    • (L) Locative-Place of the event
    • (S) Source-Place from which something moves
    • (G) Goal-Place to which something moves
    • (B) Beneficiary-Being on whose behalf the event occurred (typically animate)
    • (T) Time-Time at which the event occurred
    • (O) Object-Entity that is acted upon or that changes, the most general case

    The process of parsing into a case representation is heavily directed by the lexical entries associated with each verb. Figure 15.14 shows examples of a few such entries. Optional cases are indicated in parentheses.

Languages have rules for mapping from underlying case structures to surface syntactic forms. For example, in English, the "unmarked subject" [Footnote: The unmarked subject is the one that is used by default; it signals no special focus or emphasis in the sentence.] is generally chosen by the following rule:

    • If A is present, it is the subject. Otherwise, if I is present, it is the subject. Else the subject is O.

    These rules can be applied in reverse by a parser to determine the underlying case structure from the superficial syntax.

    Parsing using a case grammar is usually expectation-driven. Once the verb of the sentence has been located, it can be used to predict the noun phrases that will occur and to determine the relationship of those phrases to the rest of the sentence.
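
A small Python sketch of this expectation-driven style is shown below; the case frames, the animacy feature, and the way the unmarked-subject rule is run in reverse are simplified inventions in the spirit of Figure 15.14.

# For each verb, the deep cases its frame allows.
CASE_FRAMES = {
    "printed": {"agent", "instrument", "object"},
    "baked":   {"agent", "object", "timeperiod"},
}

def deep_case_of_subject(verb, subject):
    """Apply the unmarked-subject rule in reverse to recover the subject's deep case."""
    frame = CASE_FRAMES[verb]
    if "agent" in frame and subject.get("animate"):
        return "agent"
    if "instrument" in frame and subject.get("tool"):
        return "instrument"
    return "object"

def interpret(verb, subject, syntactic_object=None):
    """Build a case-grammar style interpretation of a simple active sentence."""
    cases = {deep_case_of_subject(verb, subject): subject["word"]}
    if syntactic_object is not None:
        cases["object"] = syntactic_object["word"]
    return (verb, cases)

# "Mother baked the pie."  versus  "The pie baked."
print(interpret("baked", {"word": "Mother", "animate": True}, {"word": "pie"}))
print(interpret("baked", {"word": "pie"}))
# -> ('baked', {'agent': 'Mother', 'object': 'pie'})
# -> ('baked', {'object': 'pie'})

Passive sentences, marked cases signaled by prepositions, and the optional cases of Figure 15.14 would all require further rules, but the basic expectation-driven flow is the same.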

ATNs provide a good structure for case grammar parsing. Unlike traditional parsing algorithms in which the output structure always mirrors the structure of the grammar rules that created it, ATNs allow output structures of arbitrary form. For an example of their use, see Simmons [1973], which describes a system that uses an ATN parser to translate English sentences into a semantic net representing the case structures of sentences. These semantic nets can then be used to answer questions about the sentences.

    Figure 15.14: Some Verb Case Frames

The result of parsing in a case representation is usually not a complete semantic description of a sentence. For example, the constituents that fill the case slots may still be English words rather than true semantic descriptions stated in the target representation. To go the rest of the way toward building a meaning representation, we still require many of the steps that are described in Section 15.3.4.

    15.3.3 Conceptual Parsing

Conceptual parsing, like semantic grammars, is a strategy for finding both the structure and the meaning of a sentence in one step. Conceptual parsing is driven by a dictionary that describes the meanings of words as conceptual dependency (CD) structures.

Parsing a sentence into a conceptual dependency representation is similar to the process of parsing using a case grammar. In both systems, the parsing process is heavily driven by a set of expectations that are set up on the basis of the sentence's main verb. But because the representation of a verb in CD is at a lower level than that of a verb in a case grammar (in which the representation is often identical to the English word that is used), CD usually provides a greater degree of predictive power. The first step in mapping a sentence into its CD representation involves a syntactic processor that extracts the main noun and verb. It also determines the syntactic category and aspectual class of the verb (i.e., stative, transitive, or intransitive). The conceptual processor then takes over. It makes use of a verb-ACT dictionary, which contains an entry for each environment in which a verb can appear. Figure 15.15 (taken from Schank [1973]) shows the dictionary entries associated with the verb "want." These three entries correspond to the three

    Figure 15.15: The Verb-ACT Dictionary

    Figure 15.16: A CD Structure

    kinds of wanting:

    • Wanting something to happen
    • Wanting an object
    • Wanting a person

Once the correct dictionary entry is chosen, the conceptual processor analyzes the rest of the sentence looking for components that will fit into the empty slots of the verb structure. For example, if the stative form of "want" has been found, then the conceptual processor will look for a conceptualization that can be inserted into the structure. So, if the sentence being processed were

    • John wanted Mary to go to the store.

    the structure shown in Figure 15.16 would be built.

The conceptual processor examines possible interpretations in a well-defined order. For example, if a phrase of the form "with PP" (recall that a PP is a picture producer) occurs, it could indicate any of the following relationships between the PP and the conceptualization of which it is a part:

    1. Object of the instrumental case
    2. Additional actor of the main ACT
    3. Attribute of the PP just preceding it
    4. Attribute of the actor of the conceptualization

    Suppose that the conceptual processor were attempting to interpret the prepositional phrase in the sentence

    • John went to the park with the girl.

First, the system's immediate memory would be checked to see if a park with a girl has been mentioned. If so, a reference to that particular object is generated and the process terminates. Otherwise, the four possibilities outlined above are investigated in the order in which they are presented. Can "the girl" be an instrument of the main ACT (PTRANS) of this sentence? The answer is no, because only MOVE and PROPEL can be instruments of a PTRANS and their objects must be either body parts or vehicles. "Girl" is neither of these. So we move on to consider the second possibility. In order for "girl" to be an additional actor of the main ACT, it must be animate. It is. So this interpretation is chosen and the process terminates. If, however, the sentence had been

    • John went to the park with the fountain.

the process would not have stopped since a fountain is inanimate and cannot move. Then the third possibility would have been considered. Since parks can have fountains, it would be accepted and the process would terminate there. For a more detailed description of the way a conceptual processor based on CD works, see Schank [1973], Rieger [1975], and Riesbeck [1975].
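
The ordered search described above can be sketched directly in Python; the test predicates below are crude stand-ins for the much richer world knowledge a real conceptual processor would need.

def can_be_instrument(pp, main_act):
    # Only MOVE and PROPEL can be instruments of a PTRANS, and their objects
    # must be body parts or vehicles.
    return main_act == "PTRANS" and pp.get("kind") in ("body-part", "vehicle")

def can_be_additional_actor(pp):
    return pp.get("animate", False)

def can_be_attribute_of(pp, preceding_pp):
    return pp["word"] in preceding_pp.get("can_contain", ())

def interpret_with_pp(pp, main_act, preceding_pp, actor):
    """Try the four possible roles of a "with PP" phrase in the order given above."""
    if can_be_instrument(pp, main_act):
        return ("instrument-of", main_act)
    if can_be_additional_actor(pp):
        return ("additional-actor-of", main_act)
    if can_be_attribute_of(pp, preceding_pp):
        return ("attribute-of", preceding_pp["word"])
    return ("attribute-of", actor["word"])

park = {"word": "park", "can_contain": ("fountain", "peacock")}
john = {"word": "John", "animate": True}

# "John went to the park with the girl."     -> additional actor of the PTRANS
# "John went to the park with the fountain." -> attribute of the park
print(interpret_with_pp({"word": "girl", "animate": True}, "PTRANS", park, john))
print(interpret_with_pp({"word": "fountain"}, "PTRANS", park, john))
# -> ('additional-actor-of', 'PTRANS')
# -> ('attribute-of', 'park')

As the discussion that follows shows, property checks this crude inevitably misfire on some sentences.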

    This example illustrates both the strengths and the weaknesses of this approach to sentence understanding. Because a great deal of semantic information is exploited in the understanding process, sentences that would be ambiguous to a purely syntactic parser can be assigned a unique interpretation. Unfortunately, the amount of semantic information that is required to do this job perfectly is immense. All simple rules have exceptions. For example, suppose the conceptual processor described above were given the sentence

    • John went to the park with the peacocks.

    Since peacocks are animate, they would be acceptable as additional actors of the main verb, "went." Thus, the interpretation that would be produced would be that shown in Figure 15.17(a), while the more likely interpretation, in which John went to a park containing peacocks, is shown in Figure 15.17(b). But if the possible roles for a prepositional phrase introduced by "with" were considered in the order necessary for

    Figure 15.17: Two CD Interpretations of a Sentence

this sentence to be interpreted correctly, then the previous example, involving the phrase "with the girl," would have been misunderstood.

    The problem is that the simple check for the property ANIMATE is not sufficient to determine acceptability as an additional actor of a PTRANS. Additional knowledge is necessary. Some more knowledge can be inserted within the framework we have described for a conceptual processor. But to do a very good job of producing correct semantic interpretations of sentences requires knowledge of the larger context in which the sentence appears. Techniques for exploiting such knowledge are discussed in the next section.

    15.3.4 Approximately Compositional Semantic Interpretation

The final approach to semantics that we consider here is one in which syntactic parsing and semantic interpretation are treated as separate steps, although they must mirror each other in well-defined ways. This is the approach to semantics that we looked at briefly in Section 15.1.1 when we worked through the example sentence "I want to print Bill's .init file."

    If a strictly syntactic parse of a sentence has been produced, then a straightforward way to generate a semantic interpretation is the following:

1. Look up each word in a lexicon that contains one or more definitions for the word, each stated in terms of the chosen target representation. These definitions must describe how the idea that corresponds to the word is to be represented, and they may also describe how the idea represented by this word may combine with the ideas represented by other words in the sentence.
    2. Use the structure information contained in the output of the parser to provide additional constraints, beyond those extracted from the lexicon, on the way individual words may combine to form larger meaning units.

We have already discussed the first of these steps (in Section 15.3). In the rest of this section, we discuss the second.

    Montague Semantics

Recall that we argued in Section 15.1.1 that the reason syntactic parsing was a good idea was that it produces structures that correspond to the structures that should result from semantic processing. If we investigate this idea more closely, we arrive at a notion called compositional semantics. The main idea behind compositional semantics is that, for every step in the syntactic parsing process, there is a corresponding step in semantic interpretation. Each time syntactic constituents are combined to form a larger syntactic unit, their corresponding semantic interpretations can be combined to form a larger semantic unit. The necessary rules for combining semantic structures are associated with the corresponding rules for combining syntactic structures. We use the word "compositional" to describe this approach because it defines the meaning of each sentence constituent to be a composition of the meanings of its constituents with the meaning of the rule that was used to create it. The main theoretical basis for this approach is modern (i.e., post-Fregean) logic; the clearest linguistic application is the work of Montague [Dowty et al., 1981; Thomason, 1974].

    As an example of this approach to semantic interpretation, let's return to the example that we began in Section 15.1.1. The sentence is

    • I want to print Bill's .init file.

    The output of the syntactic parsing process was shown in Figure 15.2, and a fragment of the knowledge base that is being used to define the target representation was shown in Figure 15.3. The result of semantic interpretation was also shown there in Figure 15.4. Although the exact form of semantic mapping rules in this approach depends on the way that the syntactic grammar is defined, we illustrate the idea of compositional semantic rules in Figure 15.18.

    The first two rules are examples of verb-mapping rules. Read these rules as saying that they map from a partial syntactic structure containing a verb, its subject, and its object, to some unit with the attributes instance, agent, and object. These rules do two things. They describe the meaning of the verbs ("want" or "print") themselves in terms of events in the knowledge base. They also state how the syntactic arguments of the verbs (their subjects and objects) map into attributes of those events. By the way, do not get confused by the use of the term "object" in two different senses here. The syntactic object of a sentence and its semantic object are two different things. For historical reasons (including the standard usage in case grammars as described in Section 15.3.2), they are often called the same thing, although this problem is sometimes avoided by using some other name, such as affected-entity, for the semantic object. Alternatively, in some knowledge bases, much more specialized names, such as printed-thing, are sometimes used as attribute names.

    The third and fourth rules are examples of modifier rules. Like the verb rules, they too must specify both their own constituent's contribution to meaning as well as how it combines with the meaning of the noun phrase or phrases to which it is attached.

    Figure 15.18: Some Semantic Interpretation Rules

    The last two rules are simpler. They define the meanings of nouns. Since nouns do not usually take arguments, these rules specify only single-word meanings; they do not need to describe how the meanings of larger constituents are derived from their components.

    One important thing to remember about these rules is that since they define mappings from words into a knowledge base, they implicitly make available to the semantic processing system all the information contained in the knowledge base itself. For example, Figure 15.19 contains a description of the semantic information that is associated with the word "want" after applying the semantic rule associated with the verb and retrieving semantic constraints associated with wanting events in the knowledge base. Notice that we now know where to pick up the agent for the wanting (RMI) and we know some property that the agent must have. The semantic interpretation routine will reject any interpretation that does not satisfy all these constraints.
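
    As an illustration, and only as an illustration (the frame and attribute names below are assumptions made for this sketch, not the book's notation), the following Python fragment applies rules of the kind shown in Figure 15.18 bottom-up, mirroring the parse: noun rules contribute single-word meanings, and a clause rule combines them with the meaning of the verb.

    # A minimal sketch of compositional interpretation: each syntactic rule has a
    # paired semantic rule that builds a frame from the frames of its constituents.
    LEXICON = {
        # Noun/pronoun rules: single-word meanings, no arguments.
        "I":     {"instance": "Speaker"},
        "file":  {"instance": "File-Struct"},
        # Verb rules: name the event; subject and object map onto its attributes.
        "print": {"event": "Printing-Event"},
        "want":  {"event": "Wanting-Event"},
    }

    def interpret_np(word):
        """Semantic rule paired with NP -> pronoun/noun."""
        return dict(LEXICON[word])

    def interpret_clause(verb, subject_sem, object_sem):
        """Semantic rule paired with S -> NP VP: the syntactic subject fills the
        agent slot and the syntactic object fills the (semantic) object slot."""
        return {"instance": LEXICON[verb]["event"],
                "agent": subject_sem,
                "object": object_sem}

    # "I want (to print the file)", built bottom-up as the parse is built:
    inner = interpret_clause("print", interpret_np("I"), interpret_np("file"))
    outer = interpret_clause("want", interpret_np("I"), inner)
    print(outer)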

    This compositional approach to defining semantic interpretation has proved to be a very powerful idea. (See, for example, the Absity system described in Hirst [1987].) Unfortunately, there are some linguistic constructions that cannot be accounted for naturally in a strictly compositional system. Quantified expressions have this property. Consider, for example, the sentence

    • Every student who hadn't declared a major took an English class.

    Figure 15.19: Combining Mapping Knowledge with the Knowledge Base

    There are several ways in which the relative scopes of the quantifiers in this sentence can be assigned. In the most likely reading, both existential quantifiers are within the scope of the universal quantifier. But, in other readings, they are not. These include readings corresponding to, "There is a major such that every student who had not declared it took an English class," and "There is an English class such that every student who had not declared some major took it." In order to generate these meanings compositionally from the parse, it is necessary to produce a separate parse for each scope assignment. But there is no syntactic reason to do that, and it requires substantial additional effort. An alternative is to generate a single parse and then to use a noncompositional algorithm to generate as many alternative scopes as desired.
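
    The ambiguity can be made explicit by writing out the candidate scopings. The following renderings are one plausible first-order formalization; the predicate names are ours, chosen only for this illustration:

    1. ∀s [Student(s) ∧ ¬∃m (Major(m) ∧ Declared(s, m)) → ∃c (EnglishClass(c) ∧ Took(s, c))]
       (the likely reading: both existentials inside the universal)
    2. ∃m [Major(m) ∧ ∀s (Student(s) ∧ ¬Declared(s, m) → ∃c (EnglishClass(c) ∧ Took(s, c)))]
       ("there is a major such that every student who had not declared it took an English class")
    3. ∃c [EnglishClass(c) ∧ ∀s (Student(s) ∧ ¬∃m (Major(m) ∧ Declared(s, m)) → Took(s, c))]
       ("there is an English class such that every student who had not declared some major took it")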

    As a second example, consider the sentence, "John only eats meat on Friday and Mary does too." The syntactic analysis of this sentence must include the verb phrase constituent, "only eats meat on Friday," since that is the constituent that is picked up by the elliptical expression "does too." But the meaning of the first clause has a structure more like

    • only(meat, {x | John eats x on Friday})

    which can be read as, "Meat is the only thing that John eats on Friday."

    Extended Reasoning with a Knowledge Base

    A significant amount of world knowledge may be necessary in order to do semantic interpretation (and thus, sometimes, to get the correct syntactic parse). Sometimes the knowledge is needed to enable the system to choose among competing interpretations. Consider, for example, the sentences

    1. John made a huge wedding cake with chocolate icing.
    2. John made a huge wedding cake with Bill's mixer.
    3. John made a huge wedding cake with a giant tower covered with roses.
    4. John made a cherry pie with a giant tower covered with roses.

    Let's concentrate on the problem of deciding to which constituent the prepositional phrase should be attached and of assigning a meaning to the preposition "with." We have two main choices: either the phrase attaches to the action of making the cake and "with" indicates the instrument relation, or the prepositional phrase attaches to the noun phrase describing the dessert that was made, in which case "with" describes an additional component of the dessert. The first two sentences are relatively straightforward if we imagine that our knowledge base contains the following facts:

    • Foods can be components of other foods.
    • Mixers are used to make many kinds of desserts.

    But now consider the third sentence. A giant tower is neither a food nor a mixer. So it is not a likely candidate for either role. What is required here is the much more specific (and culturally dependent) fact that

    • Wedding cakes often have towers and statues and bridges and flowers on them.

    The highly specific nature of this knowledge is illustrated by the fact that the last of these sentences does not make much sense to us since we can find no appropriate role for the tower, either as part of a pie or as an instrument used during pie making.
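
    A minimal sketch of this kind of reasoning appears below. The facts and property names (is-food, instrument-for, decorates) are illustrative assumptions; the point is only that prepositional-phrase attachment falls out of asking the knowledge base which role the "with" object can sensibly fill.

    # Illustrative only: use knowledge-base facts to decide whether "with X"
    # attaches to the making-action (instrument) or to the thing made (component).
    KB = {
        "chocolate icing": {"is-food": True},
        "mixer":           {"instrument-for": {"dessert"}},
        "giant tower":     {"decorates": {"wedding cake"}},
        "wedding cake":    {"kind-of": "dessert"},
        "cherry pie":      {"kind-of": "dessert"},
    }

    def attach_with_phrase(made_thing, with_object):
        obj = KB.get(with_object, {})
        made = KB.get(made_thing, {})
        if obj.get("is-food"):                                   # foods can be components of foods
            return ("noun phrase", "component")
        if made.get("kind-of") in obj.get("instrument-for", set()):
            return ("action", "instrument")                      # mixers are used to make desserts
        if made_thing in obj.get("decorates", set()):            # the culture-specific decoration fact
            return ("noun phrase", "component")
        return (None, None)                                      # no sensible role: the sentence is odd

    print(attach_with_phrase("wedding cake", "chocolate icing"))  # ('noun phrase', 'component')
    print(attach_with_phrase("wedding cake", "mixer"))            # ('action', 'instrument')
    print(attach_with_phrase("wedding cake", "giant tower"))      # ('noun phrase', 'component')
    print(attach_with_phrase("cherry pie", "giant tower"))        # (None, None)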

    Another use for knowledge is to enable the system to accept meanings that it has not been explicitly told about. Consider the following sentences as examples:

    1. Sue likes to read Joyce.
    2. Washington backed out of the summit talks.
    3. The stranded explorer ate squirrels.

    Suppose our system has only the following meanings for the words "Joyce," "Washington," and "squirrel" (actually we give only the relevant parts of the meanings):

    1. Joyce-instance: Author; last-name: Joyce
    2. Washington-instance: City; name: Washington
    3. squirrel-isa: Rodent; ...

    But suppose that we also have only the following meanings for the verbs in these sentences:

    1. read-isa: Mental-Event; object: must be a <text>
    2. back out-isa: Mental-Event; agent: must be an <animate-agent>
    3. eat-isa: Ingestion-Event; object: must be a <food>

    The problem is that it is not possible to construct coherent interpretations for any of these sentences with these definitions. An author is not a <text>. A city is not an <animate-agent>. A rodent is not a <food>. One solution is to create additional dictionary entries for the nouns: Joyce as a set of literary works, Washington as the people who run the U.S. government, and a squirrel as a food. But a better solution is to use general knowledge to derive these meanings when they are needed. By better, here we mean that since less knowledge must be entered by hand, the resulting system will be less brittle. The general knowledge that is necessary to handle these examples is given in the following list; a small sketch of how such coercion rules might be applied follows it:

    • The name of a person can be used to refer to things the person creates. Authoring is a kind of creating.
    • The name of a place can be used to stand for an organization headquartered in that place if the association between the organization and the place is salient in the context. An organization can in turn stand for the people who run it. The headquarters of the U.S. government is in Washington.
    • Food (meat) can be made out of almost any animal. Usually the word for the animal can be used to refer to the meat made from the animal.
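
    Here is the promised sketch. It is illustrative only: the type names (Text, Animate-Agent, Food) and the coercion rules are assumptions made for this example, stated so that they line up with the verb constraints given above.

    # Deriving a coerced meaning for a noun when its literal meaning violates a
    # verb's selectional constraint.  All names here are illustrative.
    LEXICON = {
        "Joyce":      {"isa": "Author"},
        "Washington": {"isa": "City"},
        "squirrel":   {"isa": "Rodent"},
    }
    CONSTRAINTS = {"read": "Text", "back out": "Animate-Agent", "eat": "Food"}
    COERCIONS = [
        # (from-type, to-type, gloss of the general rule)
        ("Author", "Text",          "the works the person created"),
        ("City",   "Animate-Agent", "the people of an organization based there"),
        ("Rodent", "Food",          "meat made from the animal"),
    ]

    def interpret_filler(verb, noun):
        required = CONSTRAINTS[verb]
        literal = LEXICON[noun]["isa"]
        if literal == required:
            return (noun, literal)                 # the literal meaning already fits
        for from_type, to_type, gloss in COERCIONS:
            if literal == from_type and to_type == required:
                return (noun, to_type, gloss)      # coerce via a general rule
        return None                                # no coherent interpretation

    print(interpret_filler("read", "Joyce"))     # ('Joyce', 'Text', 'the works the person created')
    print(interpret_filler("eat", "squirrel"))   # ('squirrel', 'Food', 'meat made from the animal')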

    Of course, this problem can become arbitrarily complex. For example, metaphors are a rich source for linguistic expressions [Lakoff and Johnson, 1980]. And the problem becomes even more complex when we move beyond single sentences and attempt to extract meaning from texts and dialogues. We delve briefly into those issues in Section 15.4.

    The Interaction between Syntax and Semantics

    If we take a compositional approach to semantics, then we apply semantic interpretation rules to each syntactic constituent, eventually producing an interpretation for an entire sentence. But making a commitment about what to do implies no specific commitment about when to do it. To implement a system, however, we must make some decision on how control will be passed back and forth between the syntactic and the semantic processors. Two extreme positions are:

    • Every time a syntactic constituent is formed, apply semantic interpretation to it immediately.
    • Wait until the entire sentence has been parsed, and then interpret the whole thing.

    There are arguments in favor of each approach. The theme of most of the arguments is search control and the opportunity to prune dead-end paths. Applying semantic processing to each constituent as soon as it is produced allows semantics to rule out right away those constituents that are syntactically valid but that make no sense. Syntactic processing can then be informed that it should not go any further with those constituents. This approach would pay off, for example, for the sentence, "Is the glass jar peanut butter?" But this approach can be costly when syntactic processing builds constituents that it will eventually reject as being syntactically unacceptable, regardless of their semantic acceptability. The sentence, "The horse raced past the barn fell down," is an example of this. There is no point in doing a semantic analysis of the sentence "The horse raced past the barn," since that constituent will not end up being part of any complete syntactic parse. There are also additional arguments for waiting until a complete sentence has been parsed to do at least some parts of semantic interpretation. These arguments involve the need for large constituents to serve as the basis of those semantic actions, such as the ones we discussed in Section 15.3.4, that are hard to define completely compositionally. There is no magic solution to this problem. Most systems use one of these two extremes or a heuristically driven compromise position.
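
    The contrast between the two strategies can be sketched as follows. The data are toy stand-ins, not a real parser; the constituents and the "nonsense" test are assumptions made only for this illustration.

    def interpret(constituent):
        """Stand-in semantic interpreter: returns None for constituents that make no sense."""
        nonsense = {"glass jar peanut butter"}
        return None if constituent in nonsense else {"meaning-of": constituent}

    def parse_with_immediate_semantics(proposed_constituents):
        # Strategy 1: interpret each constituent as soon as syntax builds it,
        # so semantically anomalous constituents are pruned right away.
        surviving = []
        for c in proposed_constituents:
            meaning = interpret(c)
            if meaning is not None:
                surviving.append((c, meaning))
        return surviving

    def parse_then_interpret(proposed_constituents, complete_parses):
        # Strategy 2: run syntax to completion, then interpret only the
        # constituents that survive into a complete parse.
        return [(c, interpret(c)) for c in complete_parses]

    constituents = ["the glass jar", "glass jar peanut butter", "peanut butter"]
    print(parse_with_immediate_semantics(constituents))
    print(parse_then_interpret(constituents, ["the glass jar", "peanut butter"]))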

    15.4 Discourse and Pragmatic Processing

    To understand even a single sentence, it is necessary to consider the discourse and pragmatic context in which the sentence was uttered (as we saw in Section 15.1.1). These issues become even more important when we want to understand texts and dialogues, so in this section we broaden our concern to these larger linguistic units. There are a number of important relationships that may hold between phrases and parts of their discourse contexts, including:

    • Identical entities. Consider the text
      • Bill had a red balloon.
      • John wanted it.

      The word "it" should be identified as referring to the red balloon. References such as this are called anaphoric references or anaphora.

    • Parts of entities. Consider the text
      • Sue opened the book she just bought.
      • The title page was torn.

      The phrase "the title page" should be recognized as being part of the book that was just bought.

    • Parts of actions. Consider the text
      • John went on a business trip to New York.
      • He left on an early morning flight.

      Taking a flight should be recognized as part of going on a trip.

    • Entities involved in actions. Consider the text
      • My house was broken into last week.
      • They took the TV and the stereo.

      The pronoun "they" should be recognized as referring to the burglars who broke into the house.

    • Elements of sets. Consider the text
      • The decals we have in stock are stars, the moon, and a flag.
      • I'll take two moons.

      The moons in the second sentence should be understood to be some of the moons mentioned in the first sentence. Notice that to understand the second sentence at all requires that we use the context of the first sentence to establish that the word "moons" means moon decals.

    • Names of individuals. Consider the text
      • Dave went to the movies.

      Dave should be understood to be some person named Dave. Although there are many, the speaker had one particular one in mind and the discourse context should tell us which.

    • Causal chains. Consider the text
      • There was a big snow storm yesterday.
      • The schools were closed today.

      The snow should be recognized as the reason that the schools were closed.

    • Planning sequences. Consider the text
      • Sally wanted a new car.
      • She decided to get a job.

      Sally's sudden interest in a job should be recognized as arising out of her desire for a new car and thus for the money to buy one.

    • Illocutionary force. Consider the sentence
      • It sure is cold in here.

      In many circumstances, this sentence should be recognized as having, as its intended effect, that the hearer should do something like close the window or turn up the thermostat.

    • Implicit presuppositions. Consider the query
      • Did Joe fail CS101?

      The speaker's presuppositions, including the fact that CS101 is a valid course, that Joe is a student, and that Joe took CS101, should be recognized so that if any of them is not satisfied, the speaker can be informed.

    In order to be able to recognize these kinds of relationships among sentences, a great deal of knowledge about the world being discussed is required. Programs that can do multiple-sentence understanding rely either on large knowledge bases or on strong constraints on the domain of discourse so that only a more limited knowledge base is necessary. The way this knowledge is organized is critical to the success of the understanding program. In the rest of this section, we discuss briefly how some of the knowledge representations described in Chapters 9 and 10 can be exploited by a language-understanding program. In particular, we focus on the use of the following kinds of knowledge:

    • The current focus of the dialogue
    • A model of each participant's current beliefs
    • The goal-driven character of dialogue
    • The rules of conversation shared by all participants

    Although these issues are complex, we discuss them only briefly here. Most of the hard problems are not peculiar to natural language processing. They involve reasoning about objects, events, goals, plans, intentions, beliefs, and likelihoods, and we have discussed all these issues in some detail elsewhere. Our goal in this section is to tie those reasoning mechanisms into the process of natural language understanding.

    15.4.1 Using Focus in Understanding

    There are two important parts of the process of using knowledge to facilitate understanding:

    • Focus on the relevant part(s) of the available knowledge base.
    • Use that knowledge to resolve ambiguities and to make connections among things that were said.

    The first of these is critical if the amount of knowledge available is large. Some techniques for handling this were outlined in Section 4.3.5, since the problem arises whenever knowledge structures are to be used.

    The linguistic properties of coherent discourse, however, provide some additional mechanisms for focusing. For example, the structure of task-oriented discourses typically mirrors the structure of the task. Consider the following sequence of (highly simplified) instructions:

    • To make the torte, first make the cake, then, while the cake is baking, make the filling. To make the cake, combine all ingredients. Pour them into the pans, and bake for 30 minutes. To make the filling, combine the ingredients. Mix until light and fluffy. When the cake is done, alternate layers of cake and filling.

    This task decomposes into three subtasks: making the cake, making the filling, and combining the two components. The structure of the paragraph of instructions is: overall sketch of the task, instructions for step 1, instructions for step 2, and then instructions for step 3.

    A second property of coherent discourse is that dramatic changes of focus are usually signaled explicitly with phrases such as "on the other hand," "to return to an earlier topic," or "a second issue is."

    Assuming that all this knowledge has been used successfully to focus on the relevant part(s) of the knowledge base, the second issue is how to use the focused knowledge to help in understanding. There are as many ways of doing this as there are discourse phenomena that require it. In the last section, we presented a sample list of those phenomena. To give one example, consider the problem of finding the meaning of definite noun phrases. Definite noun phrases are ones that refer to specific individual objects, for example, the first noun phrase in the sentence, "The title page was torn." The title page in question is assumed to be one that is related to an object that is currently in focus. So the procedure for finding a meaning for it involves searching for ways in which a title page could be related to a focused object. Of course, in some sense, almost any object in a knowledge base relates somehow to almost any other. But some relations are far more salient than others, and they should be considered first. Highly salient relations include physical-part-of, temporal-part-of, and element-of. In this example, physical-part-of relates the title page to the book that is in focus as a result of its mention in the previous sentence.
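
    A minimal sketch of this procedure, with an illustrative knowledge base and relation names, is:

    # Resolving a definite noun phrase ("the title page") against entities in
    # focus, trying the most salient relations first.  Illustrative data only.
    KB = {
        ("title page", "physical-part-of", "book"),
        ("chapter",    "physical-part-of", "book"),
    }
    SALIENT_RELATIONS = ["physical-part-of", "temporal-part-of", "element-of"]

    def resolve_definite_np(description, focus_stack):
        """Find a focused entity to which the described object can be related."""
        for relation in SALIENT_RELATIONS:          # most salient relations first
            for focused in focus_stack:
                if (description, relation, focused) in KB:
                    return (description, relation, focused)
        return None                                 # fall back to a wider search

    # "Sue opened the book she just bought.  The title page was torn."
    print(resolve_definite_np("title page", ["book", "Sue"]))
    # ('title page', 'physical-part-of', 'book')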

    Other ways of using focused information also exist. We examine some of them in the remaining parts of this section.

    15.4.2 Modeling Beliefs

    In order for a program to be able to participate intelligently in a dialogue, it must be able to represent not only its own beliefs about the world, but also its knowledge of the other dialogue participant's beliefs about the world, that person's beliefs about the computer's beliefs, and so forth. The remark "She knew I knew she knew I knew she knew" [Footnote: From Kingsley Amis' Jake's Thing.] may be a bit extreme, but we do that kind of thinking all the time. To make computational models of belief, it is useful to divide the issue into two parts: those beliefs that can be assumed to be shared among all the participants in a linguistic event and those that cannot.

    Modeling Shared Beliefs

    Shared beliefs can be modeled without any explicit notion of belief in the knowledge base. All we need to do is represent the shared beliefs as facts, and they will be accessed whenever knowledge about anyone's beliefs is needed. We have already discussed techniques for doing this. For example, much of the knowledge described in Chapter 10 is exactly the sort that people presume is shared by other people they are communicating with. Scripts, in particular, have been used extensively to aid in natural language understanding. Recall that scripts record commonly occurring sequences of events. There are two steps in the process of using a script to aid in language understanding:

    • Select the appropriate script(s) from memory.
    • Use the script(s) to fill in unspecified parts of the text to be understood.

    Both of these aspects of reasoning with scripts have already been discussed in Section 10.2. The story-understanding program SAM [Cullingford, 1981] demonstrated the usefulness of such reasoning with scripts in natural language understanding. To understand a story, SAM first employed a parser that translated the English sentences into their conceptual dependency representation. Then it built a representation of the entire text using the relationships indicated by the relevant scripts.
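
    The second of the two steps can be sketched very simply. The script below is a toy stand-in, not SAM's actual representation:

    # Using a stored script to fill in events that a text leaves unstated.
    RESTAURANT_SCRIPT = ["enter", "be seated", "order", "eat", "pay", "leave"]

    def fill_in(stated_events, script):
        """Return the full event sequence the script implies, marking which
        events were stated in the text and which are inferred from the script."""
        return [(event, "stated" if event in stated_events else "inferred")
                for event in script]

    # "John went to a restaurant.  He ordered lobster.  He paid and left."
    story = {"enter", "order", "pay", "leave"}
    for event, status in fill_in(story, RESTAURANT_SCRIPT):
        print(event, "-", status)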

    Modeling Individual Beliefs

    As soon as we decide to represent individual beliefs, we need to introduce some explicit predicate(s) to indicate that a fact is believed. Up until now, belief has been indicated only by the presence or absence of assertions in the knowledge base. To model belief, we need to move to a logic that supports reasoning about belief propositions. The standard approach is to use a modal logic such as that defined in Hintikka [1962]. Standard, or "classical," logic deals with the truth or falsehood of different statements as they are. Modal logic, on the other hand, concerns itself with the different "modes" in which a statement may be true. Modal logics allow us to talk about the truth of a set of propositions not only in the current state of the real world, but also about their truth or falsehood in the past or the future (these are called temporal logics), and about their truth or falsehood under circumstances that might have been, but were not (these are sometimes called conditional logics). We have already used one idea from modal logic, namely the notion of necessary truth. We used it in Section 13.5, when we talked about nonlinear planning in TWEAK.

    Modal logics also allow us to talk of the truth or falsehood of statements concerning the beliefs, knowledge, desires, intentions, and obligations of people and robots, which may, in fact be, respectively, false, unjustified, unsatisfiable, irrational, or mutually contradictory. Modal logics thus provide a set of powerful tools for understanding natural language utterances, which often involve reference to other times and circumstances, and to the mental states of people.

    In particular, to model individual belief we define a modal operator BELIEVE that enables us to make assertions of the form BELIEVE(A, P), which is true whenever A believes P to be true. Notice that this can occur even if P is believed by someone else to be false or even if P is false.

    Another useful modal operator is KNOW:

    • BELIEVE(A, P) ∧ P → KNOW(A, P)

    A third useful modal operator is KNOW-WHAT(A, P), which is true if A knows the value of the function P. For example, we might say that A knows the value of his age.

    An alternative way to represent individual beliefs is to use the idea of knowledge base partitioning that we discussed in Section 9.1. Partitioning enables us to do two things:

    1. Represent efficiently the large set of beliefs shared by the participants. We discussed one way of doing this above.
    2. Represent accurately the smaller set of beliefs that are not shared.

    Requirement 1 makes it imperative that shared beliefs not be duplicated in the representation. This suggests that a single knowledge base must be used to represent the beliefs of all the participants. But requirement 2 demands that it be possible to separate the beliefs of one person from those of another. One way to do this is to use partitioned semantic nets. Figure 15.20 shows an example of a partitioned belief space, and a small sketch of the same idea follows the list below.

    Three different belief spaces are shown:

    • S1 believes that Mary hit Bill.

      Figure 15.20: A Partitioned Semantic Net Showing Three Belief Spaces
    • S2 believes that Sue hit Bill.
    • S3 believes that someone hit Bill. It is important to be able to handle incomplete beliefs of this kind, since they frequently serve as the basis for questions, such as, in this case, "Who hit Bill?"
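
    The following is a minimal sketch of the idea, not the figure's notation: shared beliefs are stored once, and each agent's private beliefs live in a separate partition. The agent names and facts are taken from the figure; everything else is illustrative.

    SHARED = {("Bill", "isa", "Person"), ("Mary", "isa", "Person"),
              ("Sue", "isa", "Person")}

    BELIEF_SPACES = {
        "S1": {("Mary", "hit", "Bill")},
        "S2": {("Sue", "hit", "Bill")},
        "S3": {("?x", "hit", "Bill")},   # incomplete belief: someone hit Bill
    }

    def believes(agent, fact):
        """An agent believes the shared facts plus the facts in its own partition."""
        return fact in SHARED or fact in BELIEF_SPACES.get(agent, set())

    print(believes("S1", ("Mary", "hit", "Bill")))    # True
    print(believes("S2", ("Mary", "hit", "Bill")))    # False
    print(believes("S2", ("Bill", "isa", "Person")))  # True (shared)
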
    15.4.3 Using Goals and Plans for Understanding

    Consider the text

    • John was anxious to get his daughter's new bike put together before Christmas Eve. He looked high and low for a screwdriver.

    To understand this story, we need to recognize that John had

    1. A goal, getting the bike put together.
    2. A plan, which involves putting together the various subparts until the bike is complete. At least one of the resulting subplans involves using a screwdriver to screw two parts together.

    Some of the common goals that can be identified in stories of all sorts (including children's stories, newspaper reports, and history books) are

    • Satisfaction goals, such as sleep, food, and water.
    • Enjoyment goals, such as entertainment and competition.
    • Achievement goals, such as possession, power, and status.
    • Preservation goals, such as health and possessions.
    • Pleasing goals, which involve satisfying some other kind of goal for someone else.
    • Instrumental goals, which enable preconditions for other, higher-level goals.

    To achieve their goals, people exploit plans. In Chapter 13, we talked about several computational representations of plans. These representations can be used to support natural language processing, particularly if they are combined with a knowledge base of operators and stored plans that describe the ways that people often accomplish common goals. These stored operators and plans enable an understanding system to form a coherent representation of a text even when steps have been omitted, since they specify things that must have occurred in the complete story. For example, to understand this simple text about John, we need to make use of the fact that John was exploiting the operator USE (by A of P to perform G), which can be described as:

    USE(A, P, G):

    precondition: KNOW-WHAT(A, LOCATION(P))

    NEAR(A, P)

    HAS-CONTROL-OF(A, P)

    READY(P)

    postcondition: DONE(G)

    In other words, for A to use P to perform G, A must know the location of P, A must be near P, A must have control of P (for example, I cannot use a screwdriver that you are holding and refuse to give to me), and P must be ready for use (for example, I cannot use a broken screwdriver).

    In our story, John's plan for constructing the bike includes using a screwdriver. So he needs to establish the preconditions for that use. In particular, he needs to know the location of the screwdriver. To find that out, he makes use of the operator LOOK-FOR:

    LOOK-FOR(A, P):

    precondition: CAN-RECOGNIZE(A, P)

    postcondition: KNOW-WHAT(A, LOCATION(P))

    A story understanding program can connect the goal of putting together the bike with the activity of looking for a screwdriver by recognizing that John is looking for a screwdriver so that he can use it as part of putting the bike together.

    Often there are alternative operators or plans for achieving the same goal. For example, to find out where the screwdriver was, John could have asked someone. Thus the problem of constructing a coherent interpretation of a text or a discourse may involve considering many partial plans and operators.
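
    The connection between the observed action and the goal can be sketched as a simple chaining check over the two operators. This is a simplification of the definitions above: it drops variable binding entirely, and a real plan recognizer would also have to weigh alternative plans.

    OPERATORS = {
        "LOOK-FOR": {"pre":  ["CAN-RECOGNIZE(A, P)"],
                     "post": ["KNOW-WHAT(A, LOCATION(P))"]},
        "USE":      {"pre":  ["KNOW-WHAT(A, LOCATION(P))", "NEAR(A, P)",
                              "HAS-CONTROL-OF(A, P)", "READY(P)"],
                     "post": ["DONE(G)"]},
    }

    def explains(observed_op, goal_op):
        """Does a postcondition of the observed operator establish a
        precondition of an operator that would achieve the goal?"""
        return any(p in OPERATORS[goal_op]["pre"]
                   for p in OPERATORS[observed_op]["post"])

    # Looking for the screwdriver is explained by the plan to use it.
    print(explains("LOOK-FOR", "USE"))   # True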

    Plan recognition has served as the basis for many understanding programs. PAM [Wilensky, 1981] is an early example; it translated stories into a CD representation. Another such program was BORIS [Dyer, 1983]. BORIS used a memory structure called the Thematic Abstraction Unit to organize knowledge about plans, goals, interpersonal relationships, and emotions. For other examples, see Allen and Perrault [1980] and Sidner [1985].

    15.4.4 Speech Acts

    Language is a form of behavior. We use it as one way to accomplish our goals. In essence, we make communicative plans in much the same sense that we make plans for anything else [Austin,1962]. In fact, as we just saw in the example above, John could have achieved his goal of locating a screwdriver by asking someone where it was rather than by looking for it. The elements of communicative plans are called speech acts [Searle,1969]. We can axiomatize speech acts just as we axiomatized other operators in the previous section, except that we need to make use of modal operators that describe states of belief, knowledge, wanting, etc. For example, we can define the basic speech act A INFORM B of P as follows:

    INFORM(A, B, P)

    precondition: BELIEVE(A, P)

    KNOW-WHAT(A, LOCATION(B))

    postcondition: BELIEVE(B, BELIEVE(A, P))

    BELIEVE-IN(B, A) → BELIEVE(B, P)

    To execute this operation, A must believe P and A must know where B is. The result of this operator is that B believes that A believes P, and if B believes in the truth of what A says, then B also believes P.

    We can define other speech acts similarly. For example, we can define ASK-WHAT (in which A asks B the value of some predicate P):

    ASK-WHAT(A, B, P):

    precondition: KNOW-WHAT(A, LOCATION(B))

    KNOW-WHAT(B, P)

    WILLING-TO-PERFORM(B, INFORM(B, A, P))

    postcondition: KNOW-WHAT(A, P)

    This is the action that John could have performed as an alternative way of finding a screwdriver.

    We can also define other speech acts, such as A REQUEST B to perform R:

    REQUEST(A, B, R)

    precondition: KNOW-WHAT(A, LOCATION(B))

    CAN-PERFORM(B, R)

    WILLING-TO-PERFORM(B, R)

    postcondition: WILL(PERFORM(B, R))
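
    The same operators can be written down directly as data for a planner. The sketch below simplifies the definitions above (it drops variable binding entirely) and is meant only to show how an understander or generator could select a speech act by matching a goal against postconditions.

    SPEECH_ACTS = {
        "INFORM":   {"pre":  ["BELIEVE(A, P)", "KNOW-WHAT(A, LOCATION(B))"],
                     "post": ["BELIEVE(B, BELIEVE(A, P))"]},
        "ASK-WHAT": {"pre":  ["KNOW-WHAT(A, LOCATION(B))", "KNOW-WHAT(B, P)",
                              "WILLING-TO-PERFORM(B, INFORM(B, A, P))"],
                     "post": ["KNOW-WHAT(A, P)"]},
        "REQUEST":  {"pre":  ["KNOW-WHAT(A, LOCATION(B))", "CAN-PERFORM(B, R)",
                              "WILLING-TO-PERFORM(B, R)"],
                     "post": ["WILL(PERFORM(B, R))"]},
    }

    def acts_achieving(goal):
        """Which speech acts could a planner select to establish a goal condition?"""
        return [act for act, op in SPEECH_ACTS.items() if goal in op["post"]]

    # John's alternative plan for locating the screwdriver: ask someone.
    print(acts_achieving("KNOW-WHAT(A, P)"))   # ['ASK-WHAT']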

    15.4.5 Conversational Postulates

    Unfortunately, this analysis of language is complicated by the fact that we do not always say exactly what we mean. Instead, we often use indirect speech acts, such as "Do you know what time it is?" or "It sure is cold in here." Searle [1975] presents a linguistic theory of such indirect speech acts. Computational treatments of this phenomenon usually rely on models of the speaker's goals and of ways that those goals might reasonably be achieved by using language. See, for example, Cohen and Perrault [1979].

    Fortunately, there is a certain amount of regularity in people's goals and in the way language can be used to achieve them. This regularity gives rise to a set of conversational postulates, which are rules about conversation that are shared by all speakers. Usually these rules are followed. Sometimes they are not, but when this happens, the violation of the rules communicates something in itself. Some of these conversational postulates are:

    • Sincerity Conditions-For a request by A of B to do R to be sincere, A must want B to do R, A must assume B can do R, A must assume B is willing to do R, and A must believe that B would not have done R anyway. If A attempts to verify one of these conditions by asking a question of B, that question should normally be interpreted by B as equivalent to the request R. For example,
      • A: Can you open the door?
    • Reasonableness Conditions-For a request by A of B to do R to be reasonable, A must have a reason for wanting R done, A must have a reason for assuming that B can do R, A must have a reason for assuming that B is willing to do R, and A must have a reason for assuming that B was not already planning to do R. Reasonableness conditions often provide the basis for challenging a request. Together with the sincerity conditions described above, they account for the coherence of the following interchange:
      • A: Can you open the door?
      • B: Why do you want it open?
    • Appropriateness Conditions-For a statement to be appropriate, it must provide the correct amount of information, it must accurately reflect the speaker's beliefs, it must be concise and unambiguous, and it must be polite. These conditions account for A's response in the following interchange:
      • A: Who won the race?
      • B: Someone with long, dark hair.
      • A: I thought you knew all the runners.

      A inferred from B's incomplete response that B did not know who won the race, because if B had known she would have provided a name.

    Of course, sometimes people "cop out" of these conventions. In the following dialogue, B is explicitly copping out:

    • A: Who is going to be nominated for the position?
    • B: I'm sorry, I cannot answer that question.

    But in the absence of such a cop out, and assuming a cooperative relationship between the parties to a dialogue, the shared assumption of these postulates greatly facilitates communication. For a more detailed discussion of conversational postulates, see Grice [1975] and Gordon and Lakoff [1975].

    We can axiomatize these conversational postulates by augmenting the preconditions for the speech acts that we have already defined. For example, we can describe the sincerity conditions by adding the following clauses to the precondition for REQUEST(A, B, R):

    WANT(A, PERFORM(B, R))

    BELIEVE(A, CAN-PERFORM(B, R))

    BELIEVE(A, WILLING-TO-PERFORM(B, R))

    BELIEVE(A, ¬WILL(PERFORM(B, R)))

    If we assume that each participant in a dialogue is following these conventions, then it is possible to infer facts about the participants' belief states from what they say. Those facts can then be used as a basis for constructing a coherent interpretation of a discourse as a whole.
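
    As one small illustration (the table below is a toy; a real system would derive these mappings from the speech-act definitions and the postulates rather than listing them), a question that merely verifies a sincerity condition can be mapped to the request it stands in for:

    # "Can you open the door?"  A yes/no question about a sincerity condition of
    # a request is normally interpreted as the request itself.
    STANDS_FOR = {
        "CAN-PERFORM(B, open(door))":        "REQUEST(A, B, open(door))",
        "WILLING-TO-PERFORM(B, open(door))": "REQUEST(A, B, open(door))",
    }

    def interpret_utterance(literal_meaning):
        """Map a literal question about a sincerity condition to the indirect request."""
        return STANDS_FOR.get(literal_meaning, literal_meaning)

    print(interpret_utterance("CAN-PERFORM(B, open(door))"))
    # REQUEST(A, B, open(door))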

    To summarize, we have just described several techniques for representing knowledge about how people act and talk. This knowledge plays an important role in text and discourse understanding, since it enables an understander to fill in the gaps left by the original writer or speaker. It turns out that many of these same mechanisms, in particular those that allow us to represent explicitly the goals and beliefs of multiple agents, will also turn out to be useful in constructing distributed reasoning systems, in which several (at least partially independent) agents interact to achieve a single goal. We come back to this topic in Section 16.3.

    15.5 Summary

    In this chapter, we presented a brief introduction to the surprisingly hard problem of language understanding. Recall that in Chapter 14, we showed that at least one understanding problem, line labeling, could effectively be viewed as a constraint satisfaction problem. One interesting way to summarize the natural language understanding problem that we have described in this chapter is to view it too as a constraint satisfaction problem. Unfortunately, many more kinds of constraints must be considered, and even when they are all exploited, it is usually not possible to avoid the guess-and-search part of the constraint satisfaction procedure. But constraint satisfaction does provide a reasonable framework in which to view the whole collection of steps that together create a meaning for a sentence. Essentially each of the steps described in this chapter exploits a particular kind of knowledge that contributes a specific set of constraints that must be satisfied by any correct final interpretation of a sentence.

    Syntactic processing contributes a set of constraints derived from the grammar of the language. It imposes constraints such as:

    • Word order, which rules out, for example, the constituent, "manager the key," in the sentence, "I gave the apartment manager the key."
    • Number agreement, which keeps "trial run" from being interpreted as a sentence in "The first trial run was a failure."
    • Case agreement, which rules out, for example, the constituent, "me and Susan gave one to Bob," in the sentence, "Mike gave the program to Alan and me and Susan gave one to Bob."

    Semantic processing contributes an additional set of constraints derived from the knowledge it has about entities that can exist in the world. It imposes constraints such as:

    • Specific kinds of actions involve specific classes of participants. We thus rule out the baseball field meaning of the word "diamond" in the sentence, "John saw Susan's diamond shimmering from across the room."
    • Objects have properties that can take on values from a limited set. We thus rule out Bill's mixer as a component of the cake in the sentence, "John made a huge wedding cake with Bill's mixer."

    Discourse processing contributes a further set of constraints that arise from the structure of coherent discourses. These include:

    • The entities involved in the sentence must either have been introduced explicitly or they must be related to entities that were. Thus the word "it" in the discourse "John had a cold. Bill caught it," must refer to John's cold. This constraint can propagate through other constraints. For example, in this case, it can be used to determine the meaning of the word "caught" in this discourse, in contrast to its meaning in the discourse, "John threw the ball. Bill caught it."
    • The overall discourse must be coherent. Thus, in the discourse, "I needed to deposit some money, so I went down to the bank," we would choose the financial institution reading of bank over the river bank reading. This requirement can even cause a later sentence to impose a constraint on the interpretation of an earlier one, as in the discourse, "I went down to the bank. The river had just flooded, and I wanted to see how bad things were."
    And finally, pragmatic processing contributes yet another set of constraints. For example,
    • The meaning of the sentence must be consistent with the known goals of the speaker. So, for example, in the sentence, "Mary was anxious to get the bill passed this session, so she moved to table it." we are forced to choose the (normally British) meaning of table (to put it on the table for discussion) over the (normally American) meaning (to set it aside for later).

    There are many important issues in natural language processing that we have barely touched on here. To learn more about the overall problem, see Allen [1987], Cullingford [1986], Dowty et al. [1985], and Grosz et al. [1986]. For more information on syntactic processing, see Winograd [1983] and King [1983]. See Joshi et al. [1981] for more discussion of the issues involved in discourse understanding. Also, we have restricted our discussion to natural language understanding. It is often useful to be able to go the other way as well, that is, to begin with a logical description and render it into English. For discussions of natural language generation systems, see McKeown and Swartout [1987] and McDonald and Bolc [1988]. By combining understanding and generation systems, it is possible to attack the problem of machine translation, by which we understand text written in one language and then generate it in another language. See Slocum [1988], Nirenburg [1987], Lehrberger and Bourbeau [1988], and Nagao [1989] for discussions of a variety of approaches to this problem.

    15.6 Exercises

    1. Consider the sentence

      The old man's glasses were filled with sherry.

      What information is necessary to choose the correct meaning for the word "glasses"? What information suggests the incorrect meaning?

    2. For each of the following sentences, show a parse tree. For each of them, explain what knowledge, in addition to the grammar of English, is necessary to produce the correct parse. Expand the grammar of Figure 15.6 as necessary to do this.
      • John wanted to go to the movie with Sally.
      • John wanted to go to the movie with Robert Redford.
      • I heard the story listening to the radio.
      • I heard the kids listening to the radio.
      • All books and magazines that deal with controversial topics have been removed from the shelves.
      • All books and magazines that come out quarterly have been removed from the shelves.
    3. In the following paragraph, show the antecedents for each of the pronouns. What knowledge is necessary to determine each?
      • John went to the store to buy a shirt. The salesclerk asked him if he could help him. He said he wanted a blue shirt. The salesclerk found one and he tried it on. He paid for it and left.
    4. Consider the following sentence:
      • Put the red block on the blue block on the table.
      1. Show all the syntactically valid parses of this sentence. Assume any standard grammatical formalism you like.
      2. How could semantic information and world knowledge be used to select the appropriate meaning of this command in a particular situation?

      After you have done this, you might want to look at the discussion of this problem in Church and Patil [1982].

    5. Each of the following sentences is ambiguous in at least two ways. Because of the type of knowledge represented by each sentence, different target languages may be useful to characterize the different meanings. For each of the sentences, choose an appropriate target language and show how the different meanings would be represented:
      • Everyone doesn't know everything.
      • John saw Mary and the boy with a telescope.
      • John flew to New York.
    6. Write an ATN grammar that recognizes verb phrases involving auxiliary verbs. The grammar should handle such phrases as
      • "went"
      • "should have gone"
      • "had been going"
      • "would have been going"
      • "would go"

      Do not expect to produce an ATN that can handle all possible verb phrases. But do design one with a reasonable structure that handles most common ones, including the ones above. The grammar should create structures that reflect the structures of the input verb phrases.

    7. Show how the ATN of Figures 15.8 and 15.9 could be modified to handle passive sentences.
    8. Write the rule "S → NP VP" in the graph notation that we defined in Section 15.2.3. Show how unification can be used to enforce number agreement between the subject and the verb.
    9. Consider the problem of providing an English interface to a database of employee records.
      1. Write a semantic grammar to define a language for this task.
      2. Show a parse, using your grammar, of each of the two sentences What is Smith's salary? Tell me who Smith's manager is.
      3. Show parses of the two sentences of part (b) using a standard syntactic grammar of English. Show the fragment of the grammar that you use.
      4. How do the parses of parts (b) and (c) differ? What do these differences say about the differences between syntactic and semantic grammars?
    10. How would the following sentences be represented in a case structure:
      1. The plane flew above the clouds.
      2. John flew to New York.
      3. The co-pilot flew the plane.
    11. Both case grammar and conceptual dependency produce representations of sentences in which noun phrases are described in terms of their semantic relationships to the verb. In what ways are the two approaches similar? In what ways are they different? Is one a more general version of the other? As an example, compare the representation of the sentence
      • John broke the window with a hammer.

      in the two formalisms.

    12. Use compositional semantics and a knowledge base to construct a semantic interpretation of each of the following sentences:
      1. A student deleted my file.
      2. John asked Mary to print the file.

      To do this, you will need to do all the following things:

      • Define the necessary knowledge base objects.
      • Decide what the output of your parser will be assumed to be.
      • Write the necessary semantic interpretation rules.
      • Show how the process proceeds.
    13. Show how conversational postulates can be used to get to the most common, coherent interpretation of each of the following discourses:
      1. A: Do you have a comb?
      2. A: Would Jones make a good programmer?

        B: He's a great guy. Everyone likes him.

      3. A (in a store): Do you have any money?

        B (A's friend): What do you want to buy?

    14. Winograd and Flores [1986] present an argument that it is wrong to attempt to make computers understand language. Analyze their arguments in light of what was said in this chapter.