Chapter 16 - Parallel and Distributed AI

Recent years have seen significant advances in parallel computation and distributed systems. What are the implications of these advances for AI? There are three main areas in which parallel and distributed architectures can contribute to the study of intelligent systems:

  1. Psychological modeling, since the brain itself is a massively parallel device
  2. Improving the efficiency of AI programs, which consume substantial time and space resources
  3. Providing new ways of organizing problem-solving systems as collections of cooperating, communicating modules

These areas are often overlapping and complementary. For example, consider the production system model that we described in Chapter 2. The ideas of short-term and long-term memory, independently operating productions, matching, and so forth first arose in the psychological literature. When researchers began building AI systems based on these principles, they realized that parallel computers might be used to increase significantly the speed at which the systems could run. Even on single-processor systems, however, the production system architecture turned out to have many benefits over conventional programming. One benefit is better modularity. When rules operate more or less independently, it is easy to add, delete, or modify them without changing the structure of the entire program. In this chapter, we discuss all these issues. First we briefly discuss psychological modeling. Then, in the following two sections, we present some specific techniques that can be exploited in constructing parallel and distributed reasoning systems.

16.1 Psychological Modeling

The production system was originally proposed as a model of human information processing, and it continues to play a role in psychological modeling. Some production system models stress the sequential nature of production systems, i.e., the manner in which short-term memory is modified over time by the rules. Other models stress the parallel aspect, in which all productions match and fire simultaneously, no matter how many there are. Both types of models have been used to explain timing data from experiments on human problem solving.

SOAR [Laird et al., 1987] is the production system architecture that we mentioned in Chapter 6. SOAR has a dual mission. On the one hand, it is intended as an architecture for building integrated AI systems; on the other hand, it is intended as a model of human intelligence [Newell, 1991]. SOAR incorporates both sequential and parallel aspects of production systems by operating in cycles. In the elaboration phase of the processing cycle, productions fire in parallel. In the decision phase, operators and states are chosen, and working memory is modified, thus setting the stage for another elaboration phase. By tying these phases to particular timings, SOAR accounts for a number of psychological phenomena.

Another approach to psychological modeling draws its inspiration from the physical organization of the human brain itself. While individual neurons are quite slow compared to digital computer circuits, there are vast numbers of these richly interconnected components, and they all operate concurrently. If we wish to model the brain or use it as a source of ideas for AI, we must consider the powers and constraints imposed by the brain's architecture at the neural level. Unfortunately, we do not understand very well how neurons are wired in the brain, so modeling at this level is difficult. But we return to this idea in Chapter 18, where we describe the use of neural networks as a way of representing and using knowledge.

16.2 Parallelism in Reasoning Systems

AI programs consume significant time and space resources. It is therefore important that AI algorithms make use of advances in parallel computation. In this section, we describe several ways of doing this without substantially changing the programs that we write. Then, in the next section, we explore ways in which techniques from parallel and distributed computing can be used in the overall design of AI systems.

16.2.1 Parallelizing AI Architectures

As we mentioned above, production systems have both sequential and parallel aspects. The question arises, how much speedup can we expect from parallel processing? There are several sources of parallel speedup in production systems:

  1. Task-level parallelism, in which independent lines of reasoning (for example, reasoning about different patients) proceed simultaneously
  2. Match-level parallelism, in which the work of matching productions against working memory is divided among many processors

The amount of task-level parallelism available is completely dependent on the nature of the task. In a medical diagnosis system, for example, each production firing might be dependent on the previous production firing, thus enabling a long, sequential chain of reasoning to occur. However, if the system were diagnosing five patients simultaneously, productions involving different patients would not interact with one another and could be executed in parallel.

Match-level parallelism is more widely applicable. Since production systems spend nearly all of their time in the matching phase, it was expected early on that match-level parallelism would lead to vast speedups. In a system with a thousand productions, for example, one processor could be assigned to every production, possibly speeding up every match cycle by a factor of a thousand. However, as Gupta [1985] showed, having n processors does not lead to an n-fold speedup. Some reasons for this effect are:

  1. Only a few productions are affected by each change in working memory. With some bookkeeping to save state information, sequential implementations such as RETE [Forgy, 1982] (Section 6.4.2) can avoid processing large numbers of productions. Parallel implementations must be judged with respect to the speedups they offer over efficient sequential algorithms, not inefficient ones.
  2. Some productions are very expensive to match, while others are cheap. This means that many processors may sit idle waiting for others to finish. When processors are idle, the speedup available from parallel processing diminishes.
  3. Overhead resulting from communication costs among multiple processors can further reduce the benefits of parallelism.
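To make these considerations concrete, here is a minimal sketch (in Python, purely for illustration; the rules, facts, and three-process pool are invented and not drawn from any actual production system) of match-level parallelism, in which a pool of worker processes each matches a share of the productions against working memory. If some productions are far more expensive to match than others, most workers finish early and sit idle, which is one reason the speedup falls short of the number of processors.

    from multiprocessing import Pool

    # Hypothetical working memory: a set of (attribute, object) facts.
    WORKING_MEMORY = {("patient", "p1"), ("fever", "p1"), ("rash", "p1")}

    # Hypothetical productions: name -> conditions that must all hold.
    PRODUCTIONS = {
        "r1": {("fever", "p1"), ("rash", "p1")},
        "r2": {("fever", "p1"), ("cough", "p1")},
        "r3": {("patient", "p1")},
    }

    def match(item):
        # Return the production's name if all its conditions are in working memory.
        name, conditions = item
        return name if conditions <= WORKING_MEMORY else None

    if __name__ == "__main__":
        # Each worker process matches a share of the productions in parallel.
        with Pool(processes=3) as pool:
            results = pool.map(match, PRODUCTIONS.items())
        conflict_set = [name for name in results if name is not None]
        print("Productions eligible to fire:", conflict_set)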

Other architectures behave differently with respect to parallel implementation. The brain-style architectures mentioned above are naturally parallel; in fact, simulating them on sequential machines is often prohibitively expensive because of the high degree of parallelism they assume. In Section 16.3, we discuss some other parallel AI architectures.

16.2.2 Parallelizing AI Programming Languages

In the last section, we discussed the benefits of parallelizing a particular kind of program, namely a production system interpreter. Other frequently used interpreters in AI include those for the programming languages LISP and PROLOG.

Writing parallel programs is a difficult task for humans, and there is some hope that parallel implementations of these languages (perhaps augmented with parallel programming constructs) will make effective speedups more practical. Parallel LISP models include Multilisp [Halstead, 1988], QLISP [Gabriel and McCarthy, 1988], and the Paralation Model [Sabot, 1988]. Parallel PROLOG models include Concurrent PROLOG [Shapiro, 1987], PARLOG [Clark and Gregory, 1986], and Guarded Horn Clauses [Ueda, 1985].

Research into parallel logic programming languages was an important focus of the Japanese Fifth Generation project [ICOT, 1984]. Languages like PROLOG immediately suggest two types of parallelism. In OR-parallelism, multiple paths to the same goal are taken in parallel. For example, suppose we have the following clauses:

uncle(X,Y) :- mother(Z,Y), sibling(X,Z).

uncle(X,Y) :- father(Z,Y), sibling(X,Z).

Then the query ?- uncle(John,Bill)

could be satisfied in two different ways since John could be the sibling of Bill's mother or of Bill's father. A sequential implementation would try to satisfy the first condition, and then, if that failed, try the second condition. There is no reason, however, why these two paths could not be pursued in parallel.[Footnote: In PROLOG, clauses are matched sequentially from top to bottom. If PROLOG programmers write code that depends on this behavior, OR-parallelism may yield undesired results.]
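The same idea can be sketched outside PROLOG. In the following Python fragment (a minimal illustration with invented family facts, not a real PROLOG implementation), the two clauses for uncle are pursued concurrently, and the query succeeds if either branch does.

    from concurrent.futures import ThreadPoolExecutor

    # Invented facts for illustration.
    MOTHER = {("Mary", "Bill")}       # mother(Mary, Bill)
    FATHER = {("Tom", "Bill")}        # father(Tom, Bill)
    SIBLING = {("John", "Mary")}      # sibling(John, Mary)

    def via_mother(x, y):
        # uncle(X,Y) :- mother(Z,Y), sibling(X,Z).
        return any((x, z) in SIBLING for (z, yy) in MOTHER if yy == y)

    def via_father(x, y):
        # uncle(X,Y) :- father(Z,Y), sibling(X,Z).
        return any((x, z) in SIBLING for (z, yy) in FATHER if yy == y)

    # OR-parallelism: the two clauses for the same goal are pursued concurrently.
    with ThreadPoolExecutor() as pool:
        branches = [pool.submit(via_mother, "John", "Bill"),
                    pool.submit(via_father, "John", "Bill")]
        print("uncle(John, Bill):", any(b.result() for b in branches))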

In AND-parallelism, the portions of a conjunctive goal are pursued in parallel. Consider the clause:

infieldfly(X) :- fly(X), infieldcatchable(X),

occupiedbase(first), outs(zero).

Here, the four conditions can be checked in parallel, possibly leading to a four-fold speedup in processing infieldfly queries. Such AND-parallelism is not so straightforward when variables are shared across goals, as in:

uncle(X,Y) :- mother(Z,Y), sibling(X,Z).

The mother(Z,Y) and sibling(X,Z) conditions cannot be satisfied independently, since they must instantiate the variable Z in the same manner.
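One way such shared variables are commonly handled, sketched below in Python (an invented illustration rather than the mechanism of any particular parallel PROLOG system), is to evaluate the conjuncts in parallel for their candidate bindings and then join the results on the shared variable Z, so that both conjuncts end up agreeing on its value.

    from concurrent.futures import ThreadPoolExecutor

    # Invented facts: mother(Z, Y) and sibling(X, Z).
    MOTHER = {("Mary", "Bill"), ("Sue", "Ann")}
    SIBLING = {("John", "Mary"), ("Pat", "Sue")}

    def mothers_of(y):
        # Candidate bindings for Z from mother(Z, y).
        return {z for (z, yy) in MOTHER if yy == y}

    def sibling_pairs():
        # Candidate (X, Z) bindings from sibling(X, Z).
        return set(SIBLING)

    with ThreadPoolExecutor() as pool:
        # AND-parallelism: evaluate both conjuncts concurrently...
        zs = pool.submit(mothers_of, "Bill")
        xz = pool.submit(sibling_pairs)
        # ...then join on the shared variable Z so both conjuncts use the same binding.
        uncles = {x for (x, z) in xz.result() if z in zs.result()}

    print("uncle(X, Bill) solutions:", uncles)   # {'John'}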

Research on parallel logic programming shares the same goal as that on parallel production systems: to permit the efficient execution of high-level, easily written code for AI systems.

16.2.3 Parallelizing AI Algorithms

Some problems are more amenable to parallel solutions than others. While nine authors may be able to write a book much faster than one author (if they each write separate chapters), nine women cannot bear a child any faster than one can. Likewise, throwing more processors at an AI problem may not bring the desired benefits. One example of an inherently sequential problem in AI is unification (recall Section 5.4.4). While multiple processors can help somewhat [Vitter and Simons, 1986], formal arguments [Dwork et al., 1984] show that vast speedups in the unification of large terms are not possible.

Many problems can be solved efficiently by parallel methods, but it is not always a simple matter to convert a sequential algorithm into an efficient parallel one. Some AI algorithms whose parallel aspects have been studied are best-first search [Kumar et al., 1988], alpha-beta pruning [Hsu, 1989], constraint satisfaction [Kasif, 1986], natural language parsing [Thompson, 1989], resolution theorem proving [Cheng and Juang, 1987], and property inheritance [Fahlman, 1979].

16.2.4 Custom Parallel Hardware

Finally, we must ask how these parallel algorithms can be implemented in hardware. One approach is to code an algorithm in a programming language supported by a general-purpose parallel computer. Another approach is to build custom parallel hardware that directly implements a single algorithm. This last approach has led to striking performance increases, as demonstrated in the SPHINX [Lee and Hon, 1988] speech recognition system, where real-time performance was achieved through the use of a beam search accelerator [Bisiani et al., 1989], and in the DEEP THOUGHT chess machine [Hsu, 1989], which uses a parallel tree-search algorithm for searching game trees.

16.3 Distributed Reasoning Systems

In all of our discussions of problem-solving systems until now, we have focused on the design of single systems. In this section, we expand our view and look at distributed reasoning systems. We define a distributed reasoning system to be one that is composed of a set of separate modules (often called agents since each module is usually expected to act as a problem-solving entity in its own right) and a set of communication paths between them. This definition is intentionally very vague. It admits systems everywhere along a spectrum that ranges from tightly coupled systems in which there is a completely centralized control mechanism and a shared knowledge base to ones in which both control and knowledge are fully distributed. In fact, of course, most real distributed reasoning systems lie somewhere in the middle. This definition also includes systems that are distributed at varying levels of granularity, although we do not intend it to include systems with very fine granularity (such as connectionist systems in which the individual nodes do not perform reasoning in the same sense that we have been using the term).

For many kinds of applications, distributed reasoning systems have significant advantages over large monolithic systems. These advantages can include:

  1. System Modularity-It is easier to build and maintain a collection of quasi-independent modules than one huge one. [Footnote: In this respect, reasoning programs are no different from other large programs [Dijkstra, 1972].]
  2. Efficiency-Not all knowledge is needed for all tasks. By modularizing it, we gain the ability to focus the problem-solving system's efforts in ways that are most likely to pay off.
  3. Fast Computer Architectures-As problem solvers get more complex, they need more and more cycles. Although machines continue to get faster, the real speedups are beginning to come not from a single processor with a huge associated memory, but from clusters of smaller processors, each with its own memory. Distributed systems are better able to exploit such architectures.
  4. Heterogeneous Reasoning-The problem-solving techniques and knowledge representation formalisms that are best for working on one part of a problem may not be the best for working on another part.
  5. Multiple Perspectives-The knowledge required to solve a problem may not reside in the head of a single person. It is very difficult to get many diverse people to build a single, coherent knowledge base, and sometimes it is impossible because their models of the domain are actually inconsistent.
  6. Distributed Problems-Some problems are inherently distributed. For example, there may be different data available in each of several distinct physical locations.
  7. Reliability-If a problem is distributed across agents on different systems, problem solving can continue even if one system fails.

An architecture for distributed reasoning must provide:

  1. A mechanism for ensuring that the activities of the various agents in the system are coordinated so that the overall problem-solving system achieves its goal(s).
  2. A communication structure that enables information to be passed back and forth among agents.
  3. Distributed versions of the necessary reasoning techniques. These mechanisms are likely to differ from their monolithic counterparts since they will be presumed to operate on a set of local knowledge bases rather than on a global one that can be assumed to possess a set of global properties (such as consistency).

In the rest of this section, we address each of these issues.

16.3.1 Coordination and Cooperation

The biggest issue that needs to be faced in the design of any distributed reasoning system is how the actions of the individual agents can be coordinated so that they work together effectively. There are a variety of approaches that can be taken here, including the following:

  1. A single agent plans for the group and then distributes the pieces of the plan to the other agents for execution (multi-agent planning).
  2. A single agent decomposes the problem but then negotiates with the other agents to decide who will take on which subtasks (for example, via contract nets).
  3. Control is fully distributed: each agent plans its own actions, with or without communicating with the others, and no centralized controller exists.

Although these approaches differ considerably, there is one modification to a simple, single agent view of reasoning that is necessary to support all of them in anything other than a trivial way. We need a way to represent models of agents, including what they know, what they can do, and what their goals are. Fortunately, what we need is exactly the set of mechanisms that we introduced in Section 15.4. But now, instead of using modal operators and predicates (such as BELIEVE, KNOW, KNOW-WHAT, CAN-PERFORM, and WILLING-TO-PERFORM) to model writers and speakers, we use them to model agents in a distributed system. Using such operators, it is possible for each agent to build a model of both itself and the other agents with which it must interact. The self-descriptive model is necessary to enable the agent to know when it should get help from others and to allow it to represent itself accurately to other agents who may wish to get help from it. The model of other agents is necessary to enable an agent to know how best to get help from them.
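A minimal sketch of such an agent model appears below (Python; the field names mirror the operators BELIEVE, KNOW, CAN-PERFORM, and WILLING-TO-PERFORM, but the representation and the sample tasks are invented for illustration). The self-model lets an agent decide when it needs help; the model of another agent lets it decide whom to ask.

    from dataclasses import dataclass, field

    @dataclass
    class AgentModel:
        # What one agent believes about some agent (possibly itself).
        name: str
        believes: set = field(default_factory=set)             # propositions: BELIEVE
        knows: set = field(default_factory=set)                # propositions: KNOW
        can_perform: set = field(default_factory=set)          # tasks: CAN-PERFORM
        willing_to_perform: set = field(default_factory=set)   # tasks: WILLING-TO-PERFORM
        goals: set = field(default_factory=set)

    def should_ask_for_help(self_model, other_model, task):
        # Ask another agent when we cannot do the task ourselves but believe
        # the other agent both can and is willing to do it.
        return (task not in self_model.can_perform
                and task in other_model.can_perform
                and task in other_model.willing_to_perform)

    me = AgentModel("A1", can_perform={"plan-route"})
    helper = AgentModel("A2", can_perform={"book-hotel"},
                        willing_to_perform={"book-hotel"})
    print(should_ask_for_help(me, helper, "book-hotel"))   # True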

Planning for Multi-agent Execution

The least distributed form of distributed reasoning is that in which a single agent:

  1. Decomposes the goal into subgoals, and
  2. Assigns the subgoals to the various other agents

This kind of reasoning is usually called multi-agent planning. The first step, problem decomposition, is essentially the same as it is for single-agent planning systems. Ideally, the decomposition results in a set of subproblems that are mutually independent. This is often not possible, however, so various of the techniques that we described in Chapter 13 must be exploited.

Once a decomposition has been produced, the subproblems must be allocated to the available agents for execution. At this point, distributed planning differs from single-agent planning in the following important ways:

  1. Subproblems must be assigned to agents that have the knowledge and the resources to solve them.
  2. Each agent may need to be told something about the other agents with which it will have to interact.
  3. The execution of the resulting subplans must be synchronized, since the agents operate independently of one another.

Let's consider this last issue in a bit more detail. Suppose the task is to do spelling correction on a document with several chapters, and then to print it. We can distribute this among several spelling-correcting agents and one printing agent. But to get the desired result, we need to ensure that the printing agent does not begin printing any chapter until spelling correction on that chapter is complete. Distributed reasoning systems exploit a wide variety of synchronization techniques to guarantee this, ranging from simple ones (e.g., in which the printing process does not begin until all the spelling correctors are done and have so informed the master) to more sophisticated ones in which the slave processes communicate directly with each other (e.g., the spelling correctors each inform the printer when they have finished). These more sophisticated techniques require that each slave agent be told some information about the other slaves at the time that it is given its task.
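The more sophisticated scheme just described, in which each spelling corrector informs the printer directly, can be sketched as follows (Python; the chapters, the agents, and the use of threads within a single process are all invented for illustration). Each corrector signals an event when its chapter is done, and the printer waits for that event before printing the chapter.

    import threading

    CHAPTERS = ["ch1", "ch2", "ch3"]
    corrected = {ch: threading.Event() for ch in CHAPTERS}

    def spelling_corrector(chapter):
        # ... correct the spelling in `chapter` here ...
        corrected[chapter].set()          # inform the printer directly

    def printer():
        for chapter in CHAPTERS:
            corrected[chapter].wait()     # do not print a chapter until it is corrected
            print("printing", chapter)

    workers = [threading.Thread(target=spelling_corrector, args=(ch,)) for ch in CHAPTERS]
    printer_thread = threading.Thread(target=printer)
    printer_thread.start()
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    printer_thread.join()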

For relatively simple tasks, such as the one we just described, the various agents can communicate effectively with each other just by announcing when operations have been completed. For other kinds of tasks, though, it is not enough to know when an agent has completed its task. It may also be necessary to know what state the system is in during task execution. For example, suppose that there is a single resource that the various agents share, such as an input or output device. Then one agent may want to know whether any other is currently using the device. To support this kind of interaction, it is useful to introduce a state-based model, such as that described by Georgeff [1983; 1984]. In this kind of a model, each available action is characterized as a sequence of state changes that it effects. The various agents may share a single model, which they all update as necessary, or they may each have their own model, in which case they must also inform all other relevant agents whenever they make a change to their internal state that could be important externally.

Planning and Negotiation: Contract Nets

A slightly more distributed kind of reasoning occurs when a single agent performs the problem decomposition but then negotiates with the other agents to determine who will take on which subtasks. The contract net mechanism [Davis and Smith, 1983] supports this kind of interaction. In a contract net, there are two roles that the agents can assume:

  1. Manager, who decomposes a problem, looks for contractors to attack pieces of the problem, and monitors the problem's execution.
  2. Contractor, who executes a subtask, possibly by actually doing the job and possibly by recursively becoming a manager and subcontracting subparts of the job to other contractors.
Managers and contractors find each other through a process of bidding:
  1. A manager announces a task.
  2. Contractors evaluate the task with respect to their own abilities and the resource requirements necessary to accomplish it.
  3. Contractors make bids to the manager.
  4. The manager chooses a single contractor and waits for the result.

Thus managers and contractors select each other by communicating in a completely distributed fashion. A node can be both a manager and a contractor simultaneously; rather than sit idle waiting for results from its contractors, a manager can take on work in the meantime.
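The bidding cycle can be sketched as follows (Python; the task format, the bids, and the cost measure are invented for illustration, and everything runs in one process rather than across a real network of nodes). A manager announces a task, contractors that believe they can handle it submit bids, and the manager awards the contract to the most attractive bidder and waits for the result.

    class Contractor:
        def __init__(self, name, skills, load):
            self.name, self.skills, self.load = name, skills, load

        def bid(self, task):
            # Evaluate the announced task against our own abilities and resources.
            if task["skill"] in self.skills:
                return {"contractor": self, "cost": self.load}
            return None                       # decline to bid

        def execute(self, task):
            return f"{self.name} finished {task['name']}"

    class Manager:
        def announce(self, task, contractors):
            bids = [b for c in contractors if (b := c.bid(task)) is not None]
            if not bids:
                return None                   # no one can take the task
            winner = min(bids, key=lambda b: b["cost"])["contractor"]
            return winner.execute(task)       # award the contract and wait for the result

    contractors = [Contractor("C1", {"parse"}, load=5),
                   Contractor("C2", {"parse", "print"}, load=2)]
    task = {"name": "parse-chapter-1", "skill": "parse"}
    print(Manager().announce(task, contractors))   # C2 finished parse-chapter-1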

Distributed Control and Communication

So far, we have focused on systems in which there is a single agent who maintains control of the overall problem-solving process. In this section, we look at the problem of distributed planning, in which there is no such centralized controller. In the extreme form of such systems, we can make no assumptions about how the various agents will behave. But without any such assumptions, it is impossible to construct problem-solving algorithms. So we start by assuming that each agent is rational. We can define rationality as follows:

An agent is rational if, given its beliefs about the world, it always acts in the way that it expects will best achieve its goals.

Unfortunately, in a complex world, an agent may not have enough processing power to behave optimally. This leads to a slightly weaker, but more useful, notion of bounded rationality [Simon, 1957]:

An agent is boundedly rational if it acts in the way that it expects will best achieve its goals, given its beliefs and the limited computational resources it has available for deciding what to do.

Bounded rationality is akin to the notion of satisficing that we discussed in Chapter 2.

Using these ideas, we can define techniques that an individual agent can use to attain its goals, taking into account what will probably happen as a result of what the other agents in its environment are likely to do. Sometimes, the other agents are cooperating to achieve the same goal. Sometimes they are working on other goals, which may be competitive or simply orthogonal. We consider two classes of approaches to this problem:

  1. Approaches in which the agents can communicate with each other during problem solving
  2. Approaches in which the agents cannot communicate and must instead predict each other's behavior

The first approach is one in which the agents can communicate freely with each other during problem solving. In this case, the agents can each create their own plans, which are composed both of problem-solving actions and of communication actions of the sort we described in Section 15.4. Sometimes the communication actions are addressed to a specific other agent who is believed to be able to satisfy a request (either for information or to perform some other task). In other systems, the agents do not know explicitly about each other. Instead, each agent can broadcast to a shared memory structure, which other agents can be counted on to read. Each agent can then reply to those messages to which it chooses to pay attention. We describe one specific way of implementing such a broadcast structure as a blackboard system in Section 16.3.2.

One specific technique that several communicating agents can use is called the functionally accurate, cooperative (FA/C) approach [Lesser and Corkill, 1981] to distributed problem solving. Each agent begins by forming a tentative, incomplete plan. These plans are then shared among the agents, who are able to help refine each other's plans by adding information that they possess. Ideally, the entire system converges on a complete plan.

Figure 16.1: A Payoff Matrix for Two Agents and Two Actions

The second approach is one in which we assume that the agents cannot communicate. This may seem to be a very serious restriction, but it is useful to consider it both because it does sometimes arise in the extreme (perhaps because the agents are geographically isolated) and because it often arises at small granularity levels where the cost of constant communication may come to dominate the cost of actual problem solving.

If we assume that the agents cannot communicate and that they are all rational, then we can use many of the standard notions of game theory to describe how each of them should act. The most basic technique is that of a payoff matrix, such as the one shown in Figure 16.1. We assume that there are only two agents, P and Q, and that there are only two actions that each of them can perform (a and b for P and c and d for Q). Then the matrix shows the payoff for each of them for each of the possible joint actions. The number in the lower left of each box is the payoff for P; the number in the upper right is the payoff for Q. Each agent's goal is to maximize its own payoff. For example, P comes out best if it makes move b and Q makes move d. On the other hand, Q comes out best if P makes move a and Q makes move c.

Of course, no one of the agents can force such a dual move. Each must make its own decision independently. In this case, for example, P should choose move a (rather than b, even though the best case for P included move b). Why? The answer is that P should assume that Q will behave rationally. In this matrix, the c column dominates the d column for Q, by which we mean that in every row, the payoff for Q is higher in the c column than in the d column. Thus Q can be predicted to choose c, and P should plan accordingly. Given that Q will choose c, P sees that it does better to choose move a than move b.
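Since the numbers in Figure 16.1 are not reproduced here, the following sketch (Python) uses an invented payoff matrix with the same qualitative structure as the one just discussed: P's best single outcome involves move b, Q's involves move c, and the c column dominates the d column for Q. The code finds Q's dominant move and then P's best response to it.

    # payoff[(p_move, q_move)] = (payoff to P, payoff to Q)
    # Invented numbers chosen to match the situation described in the text.
    payoff = {("a", "c"): (3, 4), ("a", "d"): (2, 1),
              ("b", "c"): (1, 3), ("b", "d"): (5, 2)}

    P_MOVES, Q_MOVES = ["a", "b"], ["c", "d"]

    def dominates(q1, q2):
        # q1 dominates q2 for Q if it is at least as good in every row of the matrix.
        return all(payoff[(p, q1)][1] >= payoff[(p, q2)][1] for p in P_MOVES)

    # Q can be predicted to choose its dominant move (assumed here to exist).
    q_choice = next(q for q in Q_MOVES if all(dominates(q, other) for other in Q_MOVES))

    # Given Q's predicted move, P picks its own best response.
    p_choice = max(P_MOVES, key=lambda p: payoff[(p, q_choice)][0])

    print("Q chooses", q_choice, "; P's best response is", p_choice)   # c, a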

We can now view our discussion of game-playing programs (Chapter 12) from a different perspective, that of noncommunicating agents trying to solve their own goals. Both payoff matrices and tree-search algorithms can be generalized to more than two players (e.g., Korf [1989]), but there are some important differences. In board games, players usually take turns making moves, whereas payoff matrices model the kind of simultaneous decision making common in the real world. Also, games are usually zero-sum, meaning that one player's gain is another player's loss. Payoff matrices are sometimes zero-sum, but need not be. See Genesereth et al. [1987] and Rosenschein and Breese [1989] for more substantial discussions of operations on payoff matrices.

16.3.2 Communication: Blackboards and Messages

The specific communication architectures that have been proposed to support distributed reasoning fall into two classes with respect to communication structure:

  1. Blackboard systems, in which agents communicate by posting hypotheses to, and reading them from, a shared global data structure
  2. Message-passing systems, in which agents send messages directly to other agents

Although on the surface these two techniques appear quite different, they turn out in practice to offer essentially the same support for distributed reasoning. In fact, they can be used to simulate each other, as we see below. In the rest of this section, we describe examples of each approach.

Blackboard Systems

The blackboard approach to the organization of large AI programs was first developed in the context of the HEARSAY-II speech-understanding project [Erman et al., 1980]. The idea behind the blackboard approach is simple. The entire system consists of:

  1. A set of independent knowledge sources (KSs), each of which embodies expertise about some part of the problem
  2. A blackboard, a shared data structure on which the KSs post and modify hypotheses
  3. A control mechanism that determines which KS to activate next

To see how these pieces work together, let's look at the HEARSAY-II system. Here, the KSs correspond to the levels of knowledge about speech, language (syllables, words, phrases, and sentences), and the task being discussed. The blackboard contains hypotheses about interpretations at each of these levels. Control is performed by a specialized knowledge source that reasons about such factors as cost of execution and likelihood of achieving a result.

When a KS is activated (as described below), it examines the current contents of the blackboard and applies its knowledge either to create a new hypothesis and write it on the blackboard, or to modify an existing one. Although the execution of the entire HEARSAY-II system consists of the asynchronous execution of a collection of KSs, the execution of an individual KS is a sequential process. Once a KS is activated, it executes without being interrupted until it is finished.

Figure 16.2: A Snapshot of a HEARSAY-II Blackboard

The hypotheses on the blackboard are arranged along two dimensions: level (from small, low-level hypotheses about individual sounds to large, high-level hypotheses about the meaning of an entire sentence) and time (corresponding to periods of the utterance being analyzed). The goal of the system is to create a single hypothesis that represents a solution to a problem. For HEARSAY-II, such a solution would be an acceptable interpretation of an entire utterance. Figures 16.2 and 16.3 show a snapshot of a HEARSAY-II blackboard. Figure 16.2 shows the lowest three levels of the blackboard, and Figure 16.3 shows the top three. The levels are the following:

  1. The waveform corresponding to the sentence "Are any by Feigenbaum and Feldman?"
  2. The correct words shown just for reference
  3. The sound segments
  4. The syllable classes
  5. The words as created by one word KS
  6. The words as created by a second word KS
  7. Word sequences
  8. Phrases

Figure 16.3: The Rest of a HEARSAY-II Blackboard

Associated with each KS is a set of triggers that specify conditions under which the KS should be activated. These triggers are an example of the general idea of a demon, which is, conceptually, a procedure that watches for some condition to become true and then activates an associated process.[Footnote: Of course, demons usually are not actually implemented as processes that watch for things; rather, the things they are watching for are set up to activate them when appropriate.]

When a trigger fires, it creates an activation record describing the KS that should be activated and the specific event that fired the trigger. This latter information can be used to focus the attention of the KS when it is actually activated. Of course, a single event, such as the addition of a particular kind of hypothesis to the blackboard, could cause several triggers to fire at once, causing several activation records to be created. The KS that caused the triggering event to occur need not know about any of these subsequent activations. The actual determination of which KS should be activated next is done by a special KS, called the scheduler, on the basis of its knowledge about how best to conduct the search in the particular domain. The scheduler uses ratings supplied to it by each of the independent KSs. If the scheduler ever discovers that there are no activation records pending, then the system's execution terminates. For more information on the HEARSAY-II scheduler, see Hayes-Roth and Lesser [1977].
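A much-simplified version of this trigger-and-schedule cycle is sketched below (Python; the hypothesis levels, ratings, and knowledge sources are invented and do not correspond in detail to HEARSAY-II). Posting a hypothesis creates an activation record for every KS whose trigger matches it, and the scheduler repeatedly runs the highest-rated pending record until none remain.

    import heapq

    class Blackboard:
        def __init__(self, knowledge_sources):
            self.kss, self.agenda, self.hypotheses = knowledge_sources, [], []

        def post(self, level, hypothesis):
            self.hypotheses.append((level, hypothesis))
            # Every KS whose trigger matches the new hypothesis gets an activation record.
            for ks in self.kss:
                if ks["trigger"] == level:
                    heapq.heappush(self.agenda, (-ks["rating"], ks["name"], hypothesis))

        def run(self):
            # The scheduler picks the highest-rated pending activation record;
            # execution terminates when no records are pending.
            while self.agenda:
                rating, name, hyp = heapq.heappop(self.agenda)
                print(f"activating {name} (rating {-rating}) on {hyp!r}")

    kss = [{"name": "syllable-to-word", "trigger": "syllable", "rating": 7},
           {"name": "word-to-phrase",   "trigger": "word",     "rating": 5}]
    bb = Blackboard(kss)
    bb.post("syllable", "FIY-GEN")
    bb.post("word", "Feigenbaum")
    bb.run()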

The techniques developed in HEARSAY-II have since been generalized in several multipurpose blackboard systems, including HEARSAY-III [Balzer et al., 1980; Erman et al., 1981], GBB [Corkill et al., 1987], and BB1 [Hayes-Roth, 1985; Hayes-Roth and Hewett, 1989]. For example, the use of time as an explicit dimension on the blackboard is not appropriate in all domains, so it has been removed from these more general systems.

But these new blackboard systems also provide facilities that HEARSAY-II lacked. In HEARSAY-II, control was data-driven. This worked well for speech understanding. But for other kinds of problem solving, other kinds of control are more appropriate. Examples include control that is driven either by goals or by plans. The newer blackboard systems provide explicit support for these other control mechanisms. One important way in which they do that is to allow the use of multiple blackboards. Although this idea can also be exploited as a way to modularize domain reasoning, one of its important uses is to exploit one blackboard for reasoning in the problem domain and another for controlling that reasoning. In addition, these systems provide a goal-structured agenda mechanism that can be used in the control space to allow problem solving to be driven by an explicit goal structure. See Englemore and Morgan [1989] and Jagannathan et al. [1989] for further descriptions of these systems and some applications that have been built on top of them.

Message-Passing Systems

Message-passing systems provide an alternative way for agents in a distributed reasoning system to communicate with each other. In such a framework, the agents tend to know more about each other than they do in a blackboard system. This knowledge enables them to direct their messages to those agents who are most likely to be able to do what needs to be done. As an example of a message-passing distributed system, we describe MACE [Gasser et al., 1987], which provides a general architecture for distributed reasoning systems (in the same sense that systems such as BB1 provide a general architecture for blackboard systems). A MACE system is composed of five kinds of components:

  1. Problem-solving agents, which are specialized to a problem domain
  2. System agents, which provide such facilities as command interpretation, error handling, tracing
  3. Facilities, which are built-in functions that agents can use for such things as pattern matching and simulation
  4. A description database, which maintains descriptions of the agents
  5. Kernels, of which there is one for each processor, which handle such functions as message routing and I/O transfers

A MACE problem-solving agent maintains models of other agents. A model that an agent P has of some other agent A contains the following information:

  1. Name: A's name
  2. Class: A's class
  3. Address: A's location
  4. Role: A's relationship to P. This relationship can be identity, creator, or member of an organization.
  5. Skills: P's knowledge about what A can do
  6. Goals: P's beliefs about A's goals
  7. Plans: P's beliefs about A's plans for achieving its goals
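Such a model is essentially a record; a minimal sketch follows (Python; the field values and the particular skills shown are invented for illustration, and a real MACE agent model is considerably richer).

    from dataclasses import dataclass, field

    @dataclass
    class AcquaintanceModel:
        # P's model of some other agent A, following the fields listed above.
        name: str                                   # A's name
        cls: str                                    # A's class
        address: str                                # A's location
        role: str                                   # A's relationship to P
        skills: set = field(default_factory=set)    # what P believes A can do
        goals: set = field(default_factory=set)     # P's beliefs about A's goals
        plans: list = field(default_factory=list)   # P's beliefs about A's plans

    # P's model of a hypothetical word-recognition agent.
    a_model = AcquaintanceModel(name="word-ks", cls="problem-solver",
                                address="node-3", role="member",
                                skills={"recognize-word"})
    print("can A recognize words?", "recognize-word" in a_model.skills)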

This architecture supports many of the kinds of distributed reasoning systems that we have been discussing. Let's consider a few.

First, suppose we want to build a system in which a controlling agent will decompose the problem and then negotiate with other agents to perform subtasks using a contract net mechanism. Then each of the agents can be represented as a problem-solving agent in MACE. The manager decomposes the problem. It then sends requests for bids to all the other agents, about which it knows nothing except their addresses. As the other agents respond, the manager can build up its model of them. Using that model, it can choose the agents to whom it wishes to award bids. The chosen agents perform their tasks and then send reply messages to the manager.

At another extreme, suppose we want to build a system that is composed of competing agents. We can model such a system in a MACE architecture, again by building a set of problem-solving agents, but this time their models of each other must be more sophisticated. In particular, it will be necessary to model each other's goals and plans.

Although MACE directly supports a message-passing communication protocol, it can be used to simulate a blackboard system. A single problem-solving agent, or a collection of them, can be used to simulate each blackboard knowledge source. Additional agents can simulate the blackboard itself. Agents send messages to the blackboard, which in turn routes the messages to the other agents that should be triggered as a result of the posting.

As this example suggests, there really is no dichotomy between blackboard and message-passing systems so much as there is a continuum. At one extreme, an agent can do nothing but broadcast its message to everyone. At the other, an agent can do nothing except send its message to a specific other agent. There are many in-between positions that can be very useful. For example, an agent may not know exactly which other agent should receive its message, but it may know that it is some agent belonging to a particular class. In a message-passing architecture, this can be implemented by arranging agents into a hierarchy of classes and allowing messages to be sent to a class and thus delivered to all members of the class. In a blackboard system, this same capability can be implemented by creating a type hierarchy for blackboard elements. Then each KS is marked with the types of elements that will be considered as triggering events. When an element is posted, only those KSs that are defined for elements of that type will be given a chance to trigger on it.
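The in-between position just described, sending a message to a class of agents rather than to one specific agent, can be sketched as follows (Python; the class hierarchy and the agents are invented for illustration).

    # Invented class hierarchy: each class maps to its parent.
    PARENT = {"word-ks": "ks", "phrase-ks": "ks", "ks": None}

    def is_member(agent_class, target_class):
        # An agent belongs to a target class if its class or any ancestor matches.
        while agent_class is not None:
            if agent_class == target_class:
                return True
            agent_class = PARENT[agent_class]
        return False

    agents = {"A1": "word-ks", "A2": "phrase-ks", "A3": "word-ks"}

    def send_to_class(target_class, message):
        # Deliver the message to every agent whose class falls under target_class.
        return {name: message for name, cls in agents.items() if is_member(cls, target_class)}

    print(send_to_class("ks", "new-hypothesis"))        # delivered to A1, A2, A3
    print(send_to_class("word-ks", "new-hypothesis"))   # delivered to A1, A3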

16.3.3 Distributed Reasoning Algorithms

So far we have discussed various issues that arise when planning and plan execution are distributed across multiple agents. But we have not considered any modifications to any other reasoning algorithms. We have implicitly assumed that such standard procedures as matching and inheritance would work in a distributed system just as they do in a single-agent system. In many cases they do. But there are some reasoning algorithms, particularly ones that operate globally on a knowledge base, that need to be redesigned to support distributed reasoning. We consider one example of such an algorithm here.

Consider again the justification-based truth maintenance system (JTMS) that we described in Section 7.5.2. The JTMS works by considering an entire knowledge base and labeling the nodes in the knowledge base so that the labeling is consistent and well-founded. Both of these are global properties. But consider a distributed reasoning system in which there are several agents, each of which has its own knowledge base. Although we expect that each of these knowledge bases will be locally consistent, we do not want to insist that, taken together, they be globally consistent. This is important, since one of the benefits of a distributed system is that agents that represent different points of view and positions can interact. So what does it mean to label the nodes in such an inconsistent knowledge base?

A second question arises when we extend the notion of a JTMS to a distributed system. In a single-agent system, a justification is created as part of the reasoning process. It stays with the resulting node and can be used to update the belief status of the node if any of the assumptions on which the reasoning depended ever change. But what if one agent does the reasoning and then communicates its result to another? It may not make sense to communicate the justification, since it may involve knowledge-base objects that the receiver of the result knows nothing about. This will often happen if one agent asks another to solve a problem about which it knows very little.

Both of these problems can be solved by introducing the idea of a distributed truth maintenance system. In this system, interagent justifications work as follows. Assume A1 solves a problem and reports the result to A2. Then A1 also reports to A2 a justification that says "Because A1 says so." This justification is treated by A2 essentially like a premise justification. But A1 must also remember the justification, and it must remember that it sent this justification to A2. If the justification ever becomes invalid in A1, then A1 must send a message to A2 saying that A1 no longer says so. At that point, the conclusion must go OUT in A2 unless there exists some other justification that is still valid.
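This protocol can be sketched as follows (Python; the node and justification structures are invented and far simpler than those of Bridgeland and Huhns [1990]). Each agent records which conclusions it has vouched for and to whom; on retraction it notifies each recipient, which drops the corresponding "says so" justification and relabels the node OUT if nothing else supports it.

    class Agent:
        def __init__(self, name):
            self.name = name
            self.justifications = {}   # node -> set of justifications supporting it
            self.sent = []             # (recipient, node) pairs this agent has vouched for

        def label(self, node):
            # A node is IN if at least one justification for it is still valid.
            return "IN" if self.justifications.get(node) else "OUT"

        def tell(self, other, node):
            # Share a conclusion; the receiver records "because <self.name> says so".
            other.justifications.setdefault(node, set()).add(f"{self.name}-says-so")
            self.sent.append((other, node))

        def retract(self, node):
            # Our own support for the node is gone; notify everyone we told.
            self.justifications.pop(node, None)
            for other, shared in self.sent:
                if shared == node:
                    other.justifications[node].discard(f"{self.name}-says-so")

    a1, a2 = Agent("A1"), Agent("A2")
    a1.justifications["p"] = {"local-proof"}
    a1.tell(a2, "p")
    print(a2.label("p"))   # IN
    a1.retract("p")
    print(a2.label("p"))   # OUT (no other justification for p in A2)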

Node labeling in the distributed truth maintenance system works similarly to node labeling in a single-agent system except that we need to redefine consistency. Rather than insisting on global consistency, we instead insist on extended local consistency, by which we mean that the labels within the knowledge base of a single agent must be consistent and the labels that are attached to nodes that have been explicitly shared among agents must be consistent across agents. But we do not insist that the labels attached to nodes that have not been explicitly shared be consistent across agents. For more information on how to do this, see Bridgeland and Huhns [1990]. For a similar discussion of ways to create a distributed assumption-based truth maintenance system, see Mason and Johnson [1989].

16.4 Summary

In this chapter, we discussed parallel and distributed aspects of AI. We examined psychological factors as well as efficiency concerns. The last section described the issues that arise when we attempt to extend the problem-solving mechanisms of earlier chapters to distributed reasoning systems. We have by no means covered all of them. For more information in this area, see the following collections: Huhns [1987], Bond and Gasser [1988], and Gasser and Huhns [1989].

Before we end this chapter, we should point out that as distributed systems become more complex, it becomes harder to see how best to organize them. One thing that has proved promising is to look for analogies in the organization of other complex systems. One of the most promising sources of such analogies is the structure of human organizations, such as societies and corporations. A team or a corporation or a government is, after all, a distributed goal-oriented system. We have already seen one example of this idea, namely the bidding that is exploited in the contract net framework. See Fox [1981], Malone [1987], and Kornfeld and Hewitt [1981] for further discussion of this idea.

Another source of ideas is the way a single human brain functions. The book The Society of Mind [Minsky, 1985] explores the notion that single minds are also distributed systems, composed of collections of heterogeneous agents that simultaneously cooperate and compete.

16.5 Exercises

  1. Consider a situation in which one agent A1 requests help from a second agent A2 in finding a movie that A1 would like. A1 knows what it likes, and A2 knows about movies.
    1. Using the belief and communication operators that we have defined (plus any others that you find it useful to define), write a plan that could be used by A1.
    2. Write a similar plan for A2.
  2. Consider the following payoff matrix:

    If Q assumes P is rational, what move should Q make?

  3. Show how the HEARSAY-II blackboard system could be extended to support the whole natural language understanding process that we described in Chapter 15.
  4. Show how a speech understanding system could be built using a MACE-style architecture.