

Understanding Organizations

Flores and Winograd, pp. 112-117


9.4  What does it mean to understand?

In light of this critique, we may be puzzled when Newsweek (Machines That Think, 1980) reports that "Computers can . . . draw analogies among Shakespearean plays and understand tales involving friendship and adultery," and Schank and Riesbeck (Inside Computer Understanding, 1981, p.6) state that their program SAM "was a major advance . . . because its use of scripts allowed it to understand real stories."  Are these claims true or false?

 To answer this last question in its own terms would be an obvious violation of the theory of language we have been presenting.  If objective truth conditions cannot be defined for 'water', how could they possibly be found for 'understand'?  We need instead to analyze the web of commitments into which we have entered when we seriously utter a sentence in the form of 'X understands Y'.  We will first illustrate some simple 'language understanding' programs as a basis for comparison.

Program 1

prints out the time of day whenever the precise sequence "What time is it?" is typed in.  Any other sequence is simply ignored.  Such a program might well operate to the satisfaction of those who use it, and they might want to claim that it 'understands the question', since it responds appropriately.
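
A minimal sketch of Program 1 may make the point concrete. The function name `program_1`, the optional `now` parameter, and the `HH:MM` output format are our own assumptions for illustration; the text specifies only the exact trigger sequence.

```python
from datetime import datetime

def program_1(line, now=None):
    """Return the time of day for one exact input; ignore everything else."""
    if line == "What time is it?":
        if now is None:
            now = datetime.now()
        return now.strftime("%H:%M")  # output format is an assumption
    return None  # any other sequence is simply ignored

print(program_1("What time is it?"))   # prints the current time
print(program_1("what time is it?"))   # prints None: not the precise sequence
```

The comparison is a literal string equality test, so even a change of capitalization falls outside what the program responds to.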

Program 2

accepts sequences of the form "What . . . is it?" where the gap is filled by 'time', 'day', 'month', or 'year'.  It types out the appropriate answer to each of those and ignores any sequence not matching this pattern.
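
Program 2 can be sketched in the same hedged way. The names `FORMATS`, `PATTERN`, and `program_2`, and the particular output formats, are assumptions introduced here; only the pattern "What . . . is it?" and the four gap words come from the description.

```python
import re
from datetime import datetime

# One output format per permitted gap word (the formats are assumptions).
FORMATS = {"time": "%H:%M", "day": "%A", "month": "%B", "year": "%Y"}

# The fixed frame with a single gap restricted to the four words.
PATTERN = re.compile(r"What (time|day|month|year) is it\?")

def program_2(line, now=None):
    """Answer the four permitted questions; ignore any other sequence."""
    match = PATTERN.fullmatch(line)
    if match is None:
        return None
    if now is None:
        now = datetime.now()
    return now.strftime(FORMATS[match.group(1)])

print(program_2("What year is it?"))
print(program_2("What century is it?"))  # prints None: gap word not in the list
```

The added variability is entirely contained in the four-entry table; nothing outside it is recognized.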

Program 3

has a collection of patterns that are matched against the input.  For each of these there is a corresponding form to be printed out, where that printout may include fragments of the pattern that was entered.  The program finds the first pattern that matches successfully and prints out the associated response.  For example, it may respond to inputs matching "My name is ..." with "Hello, ..., how are you today?" where the response is filled in with the name.

 Those familiar with artificial intelligence will recognize Program 3 as ELIZA (Weizenbaum, ELIZA, 1966).  This program was used (under the name DOCTOR) to simulate a non-directive psychiatrist in an interview with a patient.  Its patterns were those relevant to psychoanalysis.  For example, given an input of the form "I am ..." it responded "How long have you been ...?" filling in the blank.  Given "I hope ..." it responded "What would it mean to you if ...?" and given "... everybody ..." it responded "Are you thinking of somebody in particular?"
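
 The first-match procedure of Program 3 can be sketched as follows, using only the four patterns quoted above.  This is an illustration under our own assumptions, not Weizenbaum's code: the names `RULES` and `program_3` are hypothetical, regular expressions stand in for whatever matching the original used, and refinements such as pronoun substitution are omitted.

```python
import re

# (pattern, response template) pairs, tried in order; the first match wins.
RULES = [
    (re.compile(r"My name is (.+)", re.IGNORECASE), "Hello, {0}, how are you today?"),
    (re.compile(r"I am (.+)", re.IGNORECASE), "How long have you been {0}?"),
    (re.compile(r"I hope (.+)", re.IGNORECASE), "What would it mean to you if {0}?"),
    (re.compile(r".*everybody.*", re.IGNORECASE), "Are you thinking of somebody in particular?"),
]

def program_3(line):
    """Return the response for the first matching pattern, or None."""
    text = line.strip().rstrip(".!")  # drop trailing punctuation before matching
    for pattern, template in RULES:
        match = pattern.fullmatch(text)
        if match:
            # Copy the matched fragments of the input into the response.
            return template.format(*match.groups())
    return None

print(program_3("My name is Ada."))
print(program_3("I think everybody hates me."))
```

 Because the rules are tried in order, an input such as "I hope everybody likes me" is answered by the "I hope" rule and never reaches the "everybody" rule.
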
 The behavior of the DOCTOR program is strikingly unlike popular preconceptions of computers.  As Weizenbaum reported:

 I was startled to see how quickly and how very deeply people conversing with DOCTOR became emotionally involved with the computer and how unequivocally they anthropomorphized it...  Another widespread, and to me surprising, reaction to the ELIZA program was the spread of a belief that it demonstrated a general solution to the problem of computer understanding of natural language.

— Weizenbaum, Computer Power and Human Reason (1976), p.6.

Program 4

has a collection of 'scripts', each corresponding to a particular sequence of events.  When a particular pattern is entered that matches the 'title' of the script, the program then compares each subsequent input with one of the event patterns in the script and fills in values based on the input (as ELIZA filled in the "..." in the examples above).  If the input does not match the next event in line, it skips over that one and compares it to the next.  Once the input is complete, it can use the values to answer questions (themselves in the form of simple patterns).  For example, the program might be given a script corresponding to: "When a person goes to a restaurant, the following happens: the person enters, is seated by a host, is brought a menu by a waiter, orders some food, is brought the food by a waiter, eats the food, is brought a check by the waiter, pays the check and leaves."  Given a sequence of inputs such as "John went to a restaurant.  John ate a hamburger," it can use the script to answer the question "What did John order?" with "a hamburger."

 Again, this is a description (slightly simplified but not in any essential way) of an existing program — the SAM program that Schank and Riesbeck described as "understanding real stories."  It has served as a model for a series of more elaborate programs done by Schank and his group, as described in Schank and Riesbeck, Inside Computer Understanding (1981).
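
 The script-matching procedure described for Program 4 can be sketched in a few lines.  Everything in the sketch is a hypothetical encoding chosen for illustration (the dictionary layout, the slot names `person` and `food`, the three event patterns, and the function names); it is not SAM's actual representation, and it covers only a fragment of the restaurant script.

```python
import re

# Hypothetical encoding: a title pattern, an ordered list of event patterns
# whose named groups fill slots, and question patterns that read a slot.
RESTAURANT = {
    "title": re.compile(r"(?P<person>\w+) went to a restaurant\."),
    "events": [
        re.compile(r"\w+ ordered (?P<food>.+)\."),
        re.compile(r"\w+ ate (?P<food>.+)\."),
        re.compile(r"\w+ paid the check\."),
    ],
    "questions": [
        (re.compile(r"What did \w+ order\?"), "food"),
        (re.compile(r"What did \w+ eat\?"), "food"),
    ],
}

def read_story(script, sentences):
    """Match the title, then walk the events in order, filling slots."""
    if not sentences:
        return None
    first, *rest = sentences
    title = script["title"].fullmatch(first)
    if title is None:
        return None  # the story does not invoke this script
    slots = dict(title.groupdict())
    position = 0
    for sentence in rest:
        # Skip over events the story leaves out until one matches.
        for index in range(position, len(script["events"])):
            match = script["events"][index].fullmatch(sentence)
            if match:
                slots.update(match.groupdict())
                position = index + 1
                break
    return slots

def answer(script, slots, question):
    """Answer a question by looking up the slot its pattern names."""
    for pattern, slot in script["questions"]:
        if pattern.fullmatch(question) and slot in slots:
            return slots[slot]
    return None

story = ["John went to a restaurant.", "John ate a hamburger."]
slots = read_story(RESTAURANT, story)
print(answer(RESTAURANT, slots, "What did John order?"))
```

 The answer to "What did John order?" comes from the slot filled by the eating event: the ordering event was never mentioned in the story, and the inference is available only because the script's author placed both events in the same fixed structure.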

 With these examples in mind, let us return to the question of what it would mean for a computer to understand language.  We might say that the computer understands when it responds appropriately.  The obvious problem lies in determining what constitutes an appropriate response.  In one sense, the simple clock program always responds appropriately.  Asked "What time is it?" it types out the time.  But of course we could equally well have designed it so that it responds with the time when we type in "Why is the sky blue?" or "Twas brillig in the slithy tove," or for that matter, 'X'.  The appropriateness of the response must be understood with respect to a background of other things that might be said, and in the case of the time keeper (or the more elaborate Program 2 that allows some variability in the patterns) this range is too limited to warrant being called understanding.

 But as we move up in complexity to ELIZA and SAM, the essential issue doesn't change.  The range of patterns grows larger and, as illustrated in Weizenbaum's discussion of ELIZA, it may be difficult for a person to recognize the program's limitations.  Nonetheless, the program is still responding on the basis of a fixed set of patterns provided by a programmer who anticipated certain inputs.  This anticipation may be clever (as in DOCTOR's response to sentences including 'everybody') but it still represents a permanent structure of blindness.  This limitation is not one of insufficient deductive power.  It applies equally to programs like SHRDLU that include routines for reasoning with representations, and would hold for systems with 'frame-like' reasoning.  It lies in the nature of the process by which representations are fixed in a computer program.

 It is important to recognize that this limitation is not dependent on the apparent breadth of subject.  SHRDLU operates in a micro-world, in which the set of objects, properties and relations is fixed and limited in an obvious way.  The DOCTOR apparently deals with all aspects of human life, but it is really working with an even more limited set of objects and properties, as specified in its patterns.  Given the sentence "I am swallowing poison," it will respond "How long have you been swallowing poison?" rather than responding as a person would to implications of what is being said that were not anticipated in creating the pattern.

 In saying that computers "understand tales involving friendship and adultery," the Newsweek article was reporting on a program called BORIS (Lehnert et al., BORIS: An Experiment in Depth Understanding of Narratives, 1981), a more elaborate version of SAM.  Instead of dealing with inputs like "John went to a restaurant.  He ate a hamburger," BORIS works on stories containing sentences like "When Paul walked into the bedroom and found Sarah with another man, he nearly had a heart attack.  Then he realized what a blessing it was."  It responds to questions like "What happened to Paul at home?" and "How did Paul feel?" with "Paul caught Sarah committing adultery," and "Paul was surprised."

 If we examine the workings of BORIS we find a menagerie of script-like representations (called MOPs, TOPs, TAUs, and META-MOPs) that were used in preparing the system for the one specific story it could answer questions about.  For example, TAU-RED-HANDED is activated "when a goal to violate a norm, which requires secrecy for its success, fails during plan execution due to a witnessing."  It characterizes the feeling of the witness as "surprised."  In order to apply this to the specific story, there are MOPs such as M-SEX (which is applied whenever two people are in a bed together) and M-ADULTERY (which includes the structure needed to match the requirements of TAU-RED-HANDED).  The apparent human breadth of the program is like that of ELIZA.  A rule that 'if two people are in bed together, infer they are having sex,' is as much a micro-world inference as 'if one block is directly above another, infer that the lower one supports the upper'.  Subject matter that allows people to imagine that complex and subtle processing is taking place produces the illusions described by Weizenbaum.

 In a similar vein, the program that can "draw analogies among Shakespearean plays," (Winston, Learning and Reasoning by Analogy, 1980) operates in a micro-world that the programmer fashioned after his reading of Shakespeare.  The actual input is not a Shakespeare play, or even a formal representation of the lines spoken by the characters, but is a structure containing a few objects and relations based on the plot.  The version of Macbeth used for drawing analogies consisted of the following:

{Macbeth is a noble} before {Macbeth is a king}.  Macbeth marry Lady-Macbeth.  Lady-Macbeth is a woman — has-property greedy ambitious.  Duncan is a king.  Macduff is a noble — has-property loyal angry.  Weird-sisters is a hag group — has-property old ugly weird — number 3.

Weird-sisters predict {Macbeth murder Duncan}.  Macbeth desire {Macbeth kind-of king} [cause {Macbeth murder Duncan}].  Lady-Macbeth persuade {Macbeth murder Duncan}.  Macbeth murder Duncan {coagent Lady-Macbeth — instrument knife}.  Lady-Macbeth kill Lady-Macbeth.  Macbeth murder Duncan [cause {Macduff kill Macbeth}].

 The reasoning employed in the program includes the use of simple rules like "whenever a person persuades another to do an action, the action is caused by the persuasion and the persuaded person has 'control' of the action."  As with all of the examples so far, the program's claim to understanding is based on the fact that the linguistic and experiential domains the programmer is trying to represent are complex and call for a broad range of human understanding.  As with the other examples, however, the program actually operates within a narrowed micro-world that reflects the blindness of that representation.
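
 To see how little is involved in a rule of this kind, the persuasion rule can be sketched over a fragment of the Macbeth description.  The nested-tuple encoding and the names `MURDER`, `FACTS`, and `apply_persuasion_rule` are our own assumptions for illustration, not the representation used in Winston's program.

```python
# A fragment of the Macbeth description as (subject, relation, object) tuples;
# an object may itself be a tuple describing an action.
MURDER = ("Macbeth", "murder", "Duncan")
FACTS = {
    ("Macbeth", "marry", "Lady-Macbeth"),
    ("Weird-sisters", "predict", MURDER),
    ("Lady-Macbeth", "persuade", MURDER),
    MURDER,
}

def apply_persuasion_rule(facts):
    """Whenever a person persuades another to do an action, add that the
    persuasion causes the action and that the persuaded person controls it."""
    derived = set(facts)
    for fact in facts:
        subject, relation, obj = fact
        if relation == "persuade" and isinstance(obj, tuple):
            persuaded = obj[0]  # the actor of the persuaded action
            derived.add((fact, "cause", obj))
            derived.add((persuaded, "control", obj))
    return derived

for fact in sorted(apply_persuasion_rule(FACTS) - FACTS, key=repr):
    print(fact)
```

 The rule fires on the single 'persuade' relation and adds exactly two new structures; the prediction by the Weird-sisters, which has the same shape, yields nothing because no rule was written for it.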

 But, one might argue, aren't people subject to blindness too?  If we don't want to describe these programs as 'understanding language', how can we coherently ascribe understanding to anyone?

 To answer this we must return to the theory of language presented in Chapter 5.  We argued there that the essence of language as a human activity lies not in its ability to reflect the world, but its characteristic of creating commitment.  When we say that a person understands something, we imply that he or she has entered into the commitment implied by that understanding.  But how can a computer enter into a commitment?

 As we pointed out in Chapter 8, the use of mental terms like 'understand' presupposes an orientation towards the object being characterized in which it is taken as an autonomous agent.  In spite of this, it is often convenient to use mental terms for animals and machines.  It seems natural to say, "This program only understands commands asking for the time and date," and to find this way of talking effective in explaining behavior.  In this case, 'understand a command' means to perform those operations that I intend to invoke in giving the command.  But the computer is not committed to behave in this way — it is committed to nothing.  I do not attribute to it the kind of responsibility that I would to a person who obeyed (or failed to obey) the same words.

 Of course there is a commitment, but it is that of the programmer, not the program.  If I write something and mail it to you, you are not tempted to see the paper as exhibiting language behavior.  It is part of a medium through which you and I interact.  If I write a complex computer program that responds to things you type, the situation is still the same; the program is still a medium through which my commitments to you are conveyed.  This intermediation is not trivial, and in Chapter 12 we will describe the roles that computers can play as an 'active structured communication medium'.  Nonetheless, it must be stressed that we are engaging in a particularly dangerous form of blindness if we see the computer — rather than the people who created the program — as doing the understanding.

 This applies equally to systems like TEIRESIAS (Davis, Interactive Transfer of Expertise, 1979) that can respond to queries about the details of the representation itself and the way it has been used in a particular calculation.  The 'meta-knowledge' programmed into such a system is a representation of exactly the kind we have been talking about throughout the paper.  It may play a useful role in operating the program, but it reflects a pre-determined choice of objects, properties, and relations, and is limited in its description of the program in the same way the program is limited in its description of a domain.  Hofstadter (Gödel, Escher, Bach, 1979) argues that these limitations might not apply to a system that allows multiple levels of such knowledge, including 'strange loops' in which a level of description applies to itself.  However, he admits that this is an unsupported intuition and is not able to offer explanations of just why we should expect such systems to be really different.

 As we have pointed out in earlier chapters, a person is not permanently trapped in the same kind of blindness because of the potential to respond to breakdown with a shift of domains in which we enter into new commitments.  Understanding is not a fixed relationship between representation and the things represented, but is a commitment to carry out a dialog within the full horizons of both speaker and hearer in a way that permits new distinctions to emerge.

 What does all this have to say about practical applications of language processing on computers?  Our critique is not a condemnation of the technical work that has been done or even of the specific devices (representations, deductive logic, frames, meta-description, etc.) that have been developed.  It challenges the common understanding of how these devices are related to the human use of language.  Chapter 10 describes some practical applications of computer programs in which linguistic structures (e.g., English words and syntax) provide a useful medium for building or accessing formal representations.  The deductive techniques developed in artificial intelligence (including the frame-like reasoning discussed in this chapter) may serve well in producing useful responses.

 What is important is that people using the system recognize (as those duped by ELIZA did not) two critical things.  First, they are using the structures of their natural language to interact with a system that does not understand the language but is able to manipulate some of those structures.  Second, the responses reflect a particular representation that was created by some person or group of people, and embodies a blindness of which even the builders cannot be fully aware.

