The Holorobophobe’s Dilemma

Eric Lormand, Dept. of Philosophy, Univ. of Michigan
e-mail: lormand@umich.edu

December, 1991

The Robot’s Dilemma Revisited, Pylyshyn and Ford, eds., Ablex<1>

Much research in AI (and cognitive science, more broadly) proceeds on the assumption that there is a difference between being well-informed and being smart. Being well-informed has to do, roughly, with the content of one’s representations--with their truth and the range of subjects they cover. Being smart, on the other hand, has to do with one’s ability to process these representations and with packaging them in a form that allows them to be processed efficiently. The main theoretical concern of artificial intelligence research is to solve "process-and-form" problems: problems with finding processes and representational formats that enable us to understand how a computer could be smart.

Of course, to understand how a computer can deal with particular domains, we must also solve "content" problems involving figuring out which particular representations it should have, so that it is well-informed about the domains. Although AI has not solved many such content problems, they are not primarily AI’s problems. One can make headway into process-and-form problems in the AI lab, but to make headway into content problems, one must incorporate empirical investigations in particular domains ranging from medical diagnosis to the mechanics of middle-sized objects to sociology to chess to laundromats to train stations.<2> These investigations appear to demand enormous resources; if we decide not to allocate these resources, it may be that neither AI nor cognitive science will ever provide a complete account of robot or human cognition. But such content problems are practical rather than principled, in the specific sense that they do not lead one to doubt the partial accounts that may be provided. They are difficult problems for AI and cognitive science as human activities, but not difficult problems for theories within AI and cognitive science.

By contrast, holorobophobes think that AI faces underlying principled problems revealed by the striking absence of holorobots, robots that achieve human levels of flexibility in dealing with the real world in all its variety and interconnectedness. The frame problem is widely reputed among holorobophobes to be the deepest and most difficult of these problems. I will discuss three recent attempts to display this problem: Dennett’s problem of ignoring obviously irrelevant knowledge, Haugeland’s problem of efficiently keeping track of salient side effects, and Fodor’s problem of avoiding the use of "kooky" concepts. In a negative vein, I will argue that these problems bear nothing but a superficial similarity to the frame problem of AI, so that they do not provide reasons to disparage standard attempts to solve it. More positively, I will argue that holorobophobes face a dilemma about how to construe their frame problems. If these problems are construed as process-and-form problems, they are easily solved by slight variations on familiar AI themes. If, on the other hand, they are construed as content problems, they constitute no principled objection to AI, just as the absence of a complete list of individual molecules or prices constitutes no principled objection to chemistry or economics.

1 PERSISTENCE AND THE FRAME PROBLEM

1.1 A Fable

Once upon a time there was a causation-computer, named C2 by its creators. Its only task was to read about simple events and to report their likely effects, in as much detail as it could. One day its designers arranged for it to learn that a bomb was in a room, resting on a wagon, and that the wagon was pulled through the doorway. C2 quickly reached the obvious conclusion that the bomb rode out of the room. "CONTRADICTION!" it printed, to the surprise of its teachers. "THE BOMB WAS BOTH IN AND OUT OF THE ROOM. CONTRADICTION! CONTRA"--they were forced to unplug it. Poor C2 could not understand that the time at which the bomb was out of the room was different from the time at which it was in the room.

Back to the drawing board. "The solution is obvious", said the designers. "Since states may change from one moment to the next, our next computer must represent the particular moments at which they obtain." They called their next model, the chronological-causation-computer, C3. C3 was told that the bomb was on the wagon at t1, and that the wagon was pulled a moment later, at t2. Then the programmers put it to the test:

"Tell us as much as you can about the effects at t3."

"THE WAGON WAS OUT OF THE ROOM AT t3."

"Anything else? Did anything happen to the bomb?"

"I DON’T KNOW. WHERE WAS IT WHEN THE WAGON WAS PULLED?"

"We just told you it was on the wagon, you tin ninny!"

"SURE, IT WAS THERE AT t1, BUT MAYBE THAT CHANGED BY t2."

Further questioning confirmed the worst: they had neglected to teach C3 how to tell which changeable facts persisted from one time to the next. "What color is the wagon?" "I DON’T KNOW--MAYBE IT CHANGED BECAUSE THE WAGON WAS PULLED." "What is your name?" "I DON’T KNOW--IT WAS ‘C3’ BEFORE YOU TOLD ME ABOUT THE ROOM." After a few more questions, mercifully, someone pulled the plug.

Back to the drawing board. "We might try giving it ‘frame axioms’", said the designers, "which put a border around the effects of an event." They soon realized that this was hopeless, however, since the number of frame axioms would mushroom. They would have to teach their next model that reading about a wagon does not change its color, that pulling a wagon does not change one’s name or change the number of pink elephants in the world, and so on. This presented the "frame problem": how to design a system that could, unlike C3, infer the persistence of nonchanges, but that could do so automatically--that is, without explicitly storing or accessing frame axioms for them.

Before long, the programmers discovered various ways for a system to infer automatically the persistence of nonchanges. Their favorite was the suggestion that representations of facts should refer not to particular moments but to time intervals. Thus was born a chronological-causation-computer-for-persistence, named C3P. C3P was given the same problem that had stumped C3. When C3P learned that the bomb was on the wagon at t1, it generated this internal representation:

R: THE BOMB IS ON THE WAGON FROM t1 ONWARD.

R did not need to be updated with each passing moment to handle persistence, since R itself meant that the bomb was on the wagon at t2, t3, and so on. This allowed C3P, unlike C3, to infer the bomb’s motion, when it was told that the wagon was pulled at t2. The programmers also gave C3P the ability to "snip" representations such as R, by representing finite intervals. For example, when C3P learned that the bomb was taken off the wagon at t100, it substituted "TO t99" for "ONWARD" in R. As a result of all of this, C3P was able genuinely to ignore facts that it took to be unchanged by a given event, focusing only on purported changes. Since there was no longer a need for storing and accessing frame axioms, the designers of C3P were satisfied that they had solved the frame problem.

All was calm, all was bright, until one night three wise men arrived from the East. C3P received no homage from them, however--much less any expensive gifts. The first wise man deemed the frame problem "a new, deep epistemological problem" which "whatever it is, is certainly not solved yet". The second wise man intensified the point, suggesting that the frame problem is "foisted on unsuspecting epistemology by misguided presumptions underlying AI as a discipline." Needless to say, the programmers found this completely mystifying. "You may suppose that you have solved the frame problem," explained the third wise man, "but in fact you are begging it. How could the depth, beauty, and urgency of the frame problem have been so widely misperceived?" In answer to his own question, he pronounced, "It’s like the ancient doctrine of the music of the spheres. If you can’t hear it, that’s because it’s everywhere." Satisfied that their hosts were completely at a loss for words, the wise men bid them farewell. As they left, the first wise man turned and issued the ominous warning, "If there is ever to be a robot with the fabled perspicacity and real-time adroitness of C3P0, robot-designers must solve the frame problem."<3>

1.2 Persistence and Sleeping Dogs

The frame problem as it is most commonly construed in AI was first described and named by John McCarthy and Pat Hayes (1969). These authors were interested in exploring a certain formalism for reasoning about change, called the "situation calculus". In the situation calculus, changeable facts are represented as being relative to particular moments of time, as in the chronological-causation-computer, C3. The facts that are represented as obtaining at a given moment are said to constitute a "situation". Given that an event E occurs in a situation S, and that certain surrounding conditions hold in S, the system’s task is to calculate what is true in the next situation, S+1. It does so by applying conditional rule-symbols that describe purported effects of E-like events, given that certain facts obtain in S. McCarthy and Hayes called these rule-symbols "axioms", although these axioms needn’t be unsupported or irrefutable.

McCarthy and Hayes discovered, though presumably not the hard way, that the situation calculus deals with nonchanges very inefficiently. Such a system makes no inferences about what is true in situations unless these inferences are sanctioned by axioms. Consequently, a system needs axioms relating each event of which it has an idea (e.g., E) to each changeable fact of which it has an idea. This is true even of facts that are not (purported to be) changed by a given event. These "frame axioms" have (to a near enough approximation) the form: "if E occurs in S and F is true in S, then F is true in S+1". Without such an axiom, the system would not infer that F persists, as illustrated by C3 in the fable. It is difficult to see how inferential processes that access huge numbers of frame axioms could be implemented (in brains or in known computers) without serious degradations in speed. For this reason, McCarthy and Hayes posed what they called the "frame problem": how can the persistence of nonchanges be inferred without accessing frame axioms? To minimize confusion with other problems (to be discussed) that have come to be called the "frame problem", I will adopt the more descriptive term "persistence problem" for this original frame problem (see Shoham 1988).
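
To make the bookkeeping cost explicit (what follows is a gloss on the schema just quoted, not McCarthy and Hayes’s own notation), each frame axiom has roughly the shape

\[ \mathit{Holds}(F, S) \wedge \mathit{Occurs}(E, S) \rightarrow \mathit{Holds}(F, S+1), \]

and since one such axiom is needed for each pairing of an event type with a changeable fact that the event leaves untouched, a system with m event types and n changeable facts needs on the order of m × n frame axioms, nearly all of which record nonchanges.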

The standard strategy for solving the persistence problem has usefully been labeled the "sleeping-dog strategy".<4> According to the sleeping-dog strategy, instead of using frame axioms a system should assume by default that a fact persists, unless there is an axiom specifying that it is changed by an occurring event (given existing conditions). In this way, given that an event E occurs in situation S, the system can use axioms to infer new facts existing in S+1, and then simply "copy" the remainder of its beliefs about S over to S+1. In turn, the copying process can be avoided by abandoning the framework of momentary situations, in favor of that of extended time intervals, as illustrated in C3P. If a representation specifies a fact as holding over an interval of time, then the representation need not be accessed at all unless an axiom of change becomes applicable to it (Shoham, 1988). By modifying both the situations and the calculations of the situation calculus, the sleeping-dog strategy allows the persistence of facts to be inferred "automatically", that is, without accessing representations of the facts. A system can let sleeping representations lie, unless there is a positive reason to wake them.<5>
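
For concreteness, here is a minimal sketch, in Python, of the interval-based version of the sleeping-dog strategy, in the spirit of C3P. It is purely illustrative: the fact names, the table of effect axioms, and the helper functions are invented, and nothing here is drawn from McCarthy and Hayes or Shoham.

from dataclasses import dataclass
from typing import Optional

@dataclass
class Fact:
    content: str                 # e.g. "THE BOMB IS ON THE WAGON"
    start: int                   # first moment the fact holds
    end: Optional[int] = None    # None plays the role of "ONWARD"

    def holds_at(self, t: int) -> bool:
        return self.start <= t and (self.end is None or t <= self.end)

class Database:
    def __init__(self):
        self.facts = []          # persisting representations, each stored once

    def assert_fact(self, content, t):
        self.facts.append(Fact(content, start=t))

    def snip(self, content, t):
        # Close ("snip") a persisting fact when an axiom of change says it ends at t.
        for fact in self.facts:
            if fact.content == content and fact.end is None:
                fact.end = t - 1

# Hypothetical axioms of change: event -> (facts that end, facts that begin).
EFFECT_AXIOMS = {
    "BOMB TAKEN OFF WAGON": (["THE BOMB IS ON THE WAGON"], ["THE BOMB IS OFF THE WAGON"]),
}

def process_event(db, event, t):
    ended, begun = EFFECT_AXIOMS.get(event, ([], []))
    for content in ended:        # only the purported changes are touched...
        db.snip(content, t)
    for content in begun:
        db.assert_fact(content, t)
    # ...everything else sleeps: no frame axioms, no copying from moment to moment.

db = Database()
db.assert_fact("THE BOMB IS ON THE WAGON", t=1)
process_event(db, "BOMB TAKEN OFF WAGON", t=100)
print(db.facts[0].holds_at(50), db.facts[0].holds_at(100))   # True False

Nothing in this database is copied from moment to moment; the representation of the bomb’s location simply sleeps until the event at t100 wakes it.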

The sleeping-dog strategy is an example of what I will call a "computational shortcut", a technique that reduces the number of representations that need to be accessed in implementing particular inferential processes. Computational shortcuts are of special interest to cognitive science (including AI). We want to understand how it is that humanly possible (or robotically desirable) inferential tasks can be implemented in computationally limited processors (e.g., slow neurons or feasible computers). With luck, a combination of shortcuts can sufficiently reduce the computational burden of these tasks so that this becomes plausible. To drive the search for computational shortcuts, it is very useful to consider specific threats to the project of reducing the number of representations that inferential processes need to access. The sleeping-dog strategy has itself been threatened by the frame problems posed by our wise holorobophobes three. I will argue negatively that the sleeping-dog strategy is not susceptible to criticism based on these problems, and positively that (on one horn of a dilemma) the problems are easy to solve, and (on the other horn) they are practically difficult but theoretically uninteresting.

2 RELEVANCE AND THE FRAME PROBLEM

2.1 The Relevance Problem

As I mentioned, my introductory fable is a twist on a fable with which the first wise man, Daniel Dennett, introduces the frame problem of AI (Dennett 1987, pp. 41-42). I will first retell his tale, and then explain how it is misleading in this role. The robots in Dennett’s fable are charged with the task of mentally testing a plan, given a goal to be reached and some idea of the initial conditions under which the plan is to be executed. We can think of each of them as coming complete with these three states:

G: the goal of saving its spare battery from a live bomb.

I: knowledge of the initial conditions that the battery and the bomb are on a wagon in a room.

P: the plan of pulling the wagon out of the room (to remove the battery).

Plan testing in these robots also requires a fourth element, a set R of "inference rules". To test a plan, one tries to find a sequence of rules in R that allows the goal to be inferred from the plan and the initial conditions. In other words, one searches for an "inferential path" from the plan and the initial conditions to the goal, one for which each step along the way is sanctioned by an inference rule.<6> Very roughly, if such a path exists, the plan passes the test.

Dennett begins with a simple robot, R1, which can recognize "the intended implications of its acts", but not "the implications about their side effects". In other words, in testing a plan, R1 uses only inference rules that correspond to intended effects of the plan. Since G is an intended effect of P, of course, P passes R1’s test. So R1 proceeds to pull the wagon out of the room, without recognizing the tragic side effect due to the fact that the bomb is also on the wagon. Back to the drawing board go the designers; out pops the robot-deducer, R1D1, which can test its plans for side effects. It does so by removing all restrictions on which inference rules and initial conditions it can consider in testing a plan. As a result, in searching for an inferential path from P to G it "deduces" everything it can: that P "[does] not change the color of the room’s walls", that P "cause[s] [the wagon’s] wheels to turn more revolutions than there [are] wheels on the wagon", and so on. Boom! Therefore, the designers install in their next robot a method for tagging implications as relevant or irrelevant to its goals. They call the new model R2D1, the robot-relevant-deducer. The relevance tags don’t help, however, since not only does R2D1 waste time inferring all the same irrelevant implications, but it also generates more inferences to the effect that they are irrelevant. "All these robots suffer from the frame problem," Dennett concludes. "If there is ever to be a robot with the fabled perspicacity and real-time adroitness of R2D2, robot-designers must solve the frame problem".

R1D1 and R2D1 do seem to illustrate the original frame problem--the persistence problem--since they engage in explicit inferences about nonchanges such as the color of the walls. The persistence problem requires one not to use frame axioms to infer the persistence of nonchanges. My claim is that a good dose of the sleeping-dog strategy would cure this ill, and I will argue for this claim throughout this paper. However, these robots suffer from a further problem that is not even addressed by the sleeping-dog strategy: not only do they bother with the noneffects of their plans, but they also bother with many genuine effects that are obviously irrelevant to their goals, such as the number of revolutions of the wagon’s wheels. The extra problem facing their programmers, then, is how to design systems that test plans without bothering with obviously irrelevant inferences.

This problem may be generalized in a straightforward way, since there are other kinds of goal-oriented searches besides plan testing. To generate a plan, for example, one may search for an inferential path from the initial conditions to the goal that requires performing some actions. To generate subgoals for a current goal, one may search for an inferential path to the goal that requires that certain subgoals be reached. From this general perspective, Dennett’s problem becomes that of designing a system that finds inferential paths between initial conditions and goals without considering inferences that "obviously" do not point in the right direction. I will call this the "relevance problem".

2.2 Relations to the Frame Problem of AI

Despite the similarities between the persistence and relevance problems, it is something of a mystery why, in Dennett’s hands, the shift takes place. He seems to feel that the original frame problem is merely an instance of the more general relevance problem. Thus, he calls the relevance problem a "broader" problem than the "narrowly conceived" original frame problem (Dennett 1987, p. 43). Although this may have some initial appeal, I think it should be resisted.

First, consider what Dennett can say in defense of the claim that the persistence problem is an instance of the relevance problem. A first attempt might be to argue that the desirability of ignoring noneffects of an event follows from the desirability of ignoring all irrelevant knowledge. The situation is not so simple, however. Often, noneffects are highly relevant to one’s goals: in Dennett’s fable, for example, pulling the wagon does not change the fact that the battery is on the wagon, and this is directly relevant to the robot’s goal. Therefore, the robot might need to access the knowledge that the battery will stay on the wagon.

Nevertheless, it is possible for Dennett to reply that, even if noneffects are often relevant to a system’s goals, processing them with explicit frame axioms is irrelevant. However, this substitution of "irrelevant processing" for "irrelevant knowledge" forces an unwelcome shift in the construal of the relevance problem. What is "irrelevant processing" supposed to mean? Useless processing? But if a robot needs to know about a certain (relevant) noneffect, a corresponding frame axiom might be very useful in supplying this knowledge. Of course, given that systems can use the sleeping-dog strategy instead of frame axioms, the latter are too costly. But being too costly is not the same as being irrelevant. If it were, any problem about processing costs would be a problem about irrelevant processing. On this view, for example, electrical engineers debating the relative processing virtues of various home computers would be discussing an "instance" of the relevance problem! But then the relevance problem would no longer be Dennett’s problem of accessing useful knowledge at the right time. Therefore, appealing to the irrelevance of processing noneffects fails to show that the persistence problem is an instance of Dennett’s relevance problem.

There is a more direct reason not to assimilate the persistence problem to the relevance problem. The persistence problem arises completely independently of goals, planning, action, or problem-solving. It deals purely with causal reasoning--keeping track of change. In my fable, C3 and friends are "pure predictors"; the only "goal" they ever have is to report as much as they can about the effects of an event. As a result, every effect is "relevant" to them, and no effect is irrelevant. Therefore, no instance of the relevance problem can arise for pure predictors like C3; there are no irrelevant effects to ignore. Since the persistence problem is present in its full force for C3, it cannot be an instance of the relevance problem. Nevertheless, the point remains that if there are ever to be smart robots such as R2D2 and C3P0, the relevance problem must be solved.

2.3 The Role of Bidirectional Search

The task facing the plan-tester is, as I have described it, that of searching for an inferential path from a plan and some initial conditions to a goal. In this respect it is rather like walking through a labyrinth, searching for an unobstructed path from the entrance to the exit. Now, compare three strategies for negotiating a labyrinth. First, there is "forward search": starting at the entrance and walking around (marking one’s path, of course) until one happens upon the exit. Second, there is "backward search": starting at the exit, and trying to make one’s way to the entrance. Third, there is "bidirectional search": searching forward while a partner searches backward, until one finds a path marked by the other. Bidirectional search is clearly the most efficient strategy, in general (see Barr and Feigenbaum (eds.) 1981, pp. 46-53).
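
As a toy illustration of the third strategy (the graph and its node names are invented, and this is ordinary textbook bidirectional search rather than anything specific to Dennett’s robots), the two frontiers can be expanded in alternation until they meet:

from collections import deque

def bidirectional_search(graph, start, goal):
    if start == goal:
        return [start]
    # Each side keeps a frontier and a map from visited node to its predecessor.
    frontiers = {"fwd": deque([start]), "bwd": deque([goal])}
    parents = {"fwd": {start: None}, "bwd": {goal: None}}

    def build_path(meet):
        path = []
        node = meet
        while node is not None:              # walk back to the entrance...
            path.append(node)
            node = parents["fwd"][node]
        path.reverse()
        node = parents["bwd"][meet]
        while node is not None:              # ...then forward to the exit
            path.append(node)
            node = parents["bwd"][node]
        return path

    while frontiers["fwd"] and frontiers["bwd"]:
        for side, other in (("fwd", "bwd"), ("bwd", "fwd")):
            node = frontiers[side].popleft()
            for nbr in graph.get(node, []):
                if nbr in parents[side]:
                    continue
                parents[side][nbr] = node
                if nbr in parents[other]:    # the two searches have met
                    return build_path(nbr)
                frontiers[side].append(nbr)
    return None

# A made-up labyrinth, treated as undirected by adding reverse edges.
labyrinth = {"entrance": ["a", "b"], "a": ["c"], "b": ["d"], "c": ["exit"], "d": []}
undirected = {}
for u, vs in labyrinth.items():
    for v in vs:
        undirected.setdefault(u, []).append(v)
        undirected.setdefault(v, []).append(u)
print(bidirectional_search(undirected, "entrance", "exit"))   # ['entrance', 'a', 'c', 'exit']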

From this perspective, it appears that a major defect of Dennett’s robots is that they engage only in forward search. His robots start with their plan P and initial conditions I and keep making inferences from these (and from their consequences, and so on) until they happen upon their goal G (or its negation). As a result, they infer consequences more or less at random, with respect to the goal, and so suffer from the relevance problem. We can account for one aspect of R2D2’s fabled perspicacity and real-time adroitness if we suppose that it uses bidirectional search instead. Supposing this, how would R2D2 test P?

We can imagine R2D2 first searching backward from G. The procedure is to look at some inference rules of the form "IF <condition>, THEN G", and to mark these conditions as plausible parts of paths from P to G. (Recall that G is the goal of saving the battery from the bomb.) This set of inference rules is likely to refer to the condition that the battery and the bomb are far apart, but is unlikely to refer to the number of revolutions of a wagon’s wheel, or the color of the walls.<7> So the locations of the battery and bomb would be marked as "relevant" to G.

At this point, R2D2 can ask itself: what happens to the whereabouts of the battery and bomb if I roll the wagon out of the room? More precisely, R2D2 can let the details of this question guide its forward search from this plan. That is, instead of looking at all the rules of the form "IF . . . A WAGON ROLLS . . ., THEN <consequence>", it can look only at those with potential consequences for the positions of batteries and bombs. Presumably, it finds inference rules such as these:<8>

IF A WAGON ROLLS, AND x IS IN THE WAY, THEN x IS PUSHED ALONG.

IF A WAGON ROLLS, AND x IS ON THE WAGON, THEN x RIDES ALONG.

R2D2 therefore checks whether it believes that the battery and bomb satisfy x in the antecedents of these rules. It finds that, in fact, it does believe that the two are on the wagon, so it infers that the two will ride along, and will not be far apart. Finally, it infers that the battery will not be saved, and can try to find a better plan based on what went wrong with this one.
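
Schematically, and with invented rules and condition names (this is a sketch of the idea, not a claim about how any particular planner works), the backward pass marks the conditions the goal rules care about, and the forward pass fires only those event rules whose effects touch the marked conditions:

GOAL_RULES = [
    # IF <conditions> THEN <goal>: saving the battery depends on where the battery and bomb end up.
    ({"battery location", "bomb location"}, "battery saved"),
]
EVENT_RULES = [
    # IF <event and surrounding conditions> THEN <conditions possibly affected>
    ({"wagon rolls", "bomb on wagon"}, {"bomb location"}),
    ({"wagon rolls", "battery on wagon"}, {"battery location"}),
    ({"wagon rolls"}, {"wheel revolution count"}),   # a genuine effect, but goal-irrelevant
    ({"someone reads aloud"}, {"ambient noise"}),    # likewise irrelevant here
]

def relevant_conditions(goal):
    # Backward pass: mark the conditions that the goal rules care about.
    marked = set()
    for conditions, conclusion in GOAL_RULES:
        if conclusion == goal:
            marked |= conditions
    return marked

def forward_step(events, beliefs, marked):
    # Forward pass: consider only event rules whose effects touch a marked condition.
    affected = set()
    for antecedent, affects in EVENT_RULES:
        if not (affects & marked):
            continue                          # never even looked at
        if antecedent <= (events | beliefs):  # rule applies in this situation
            affected |= affects
    return affected

marked = relevant_conditions("battery saved")
print(forward_step({"wagon rolls"}, {"bomb on wagon", "battery on wagon"}, marked))
# prints {'bomb location', 'battery location'} (in some order); the rules about
# wheel revolutions and ambient noise are skipped without ever being applied.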

As I mentioned above, the relevance problem arises for tasks other than plan testing, such as subgoal generation and plan generation. Given that R2D2 can paint the wagon, draw the drapes, or pace up and down the room, what keeps it from considering these options in generating a plan to rescue its battery? Bidirectional search does. R2D2 can search backward from the goal, to find subgoals and actions most likely in general to lead to the goal. It can then direct its forward search from the initial conditions, to determine which of these subgoals and actions are most likely to be suitable under these conditions. Other subgoals and actions should be considered only if none of these are suitable or if subsequent plan testing rules them out.

Although bidirectional search is a computational shortcut in that it greatly reduces the number of representations that need to be accessed in problem solving, it does not itself bring these computational costs to a minimum. In my illustration, I vaguely described R2D2 as looking at "some" inference rules of the form "IF <condition>, THEN G". But which? If it looks at them all, it is likely to bother with many irrelevancies. I discuss this problem in the next section, in connection with "relevance holism". There I will focus on the problems generated by holism generally. My aim here is to argue provisionally that goals (and relevance-to-goals) do not exacerbate the holism problem in any interesting way--in other words, that the goal-relevance aspects of Dennett’s frame problem do not themselves present difficulties over and above the holism of pure causal reasoning, or of inference generally.

Bidirectional search is, at best, a solution to the relevance problem construed as a process-and-form problem, i.e., "What procedures should I follow to find out what’s relevant to my goals, given what I believe?" As is normal in discussions of the frame problem, however, it may be tempting here to reconstrue the relevance problem as a content problem, e.g., "What goals should I have when?" However, as is typical with content problems, the lack of a complete answer to this question is of very little theoretical interest. We might suppose that some goals are always current in R2D2, e.g., the goal of staying out of danger, and that some goals are reflexively "triggered" by certain beliefs, e.g., given that there is a potential danger, R2D2 can generate the goal of finding out if any valuables are in danger and removing them from the danger. Once R2D2 learns that there is a live bomb in the room (i.e., a potential danger), but that there is some time to work with (so R2D2 itself is not yet in danger), R2D2 can search for valuables near the bomb (i.e., in danger). We can imagine that it can discover that the battery is near the bomb either by quickly looking around the room, or else by being told this, as in Dennett’s fable. Consequently, it can generate the goal of removing the danger and, as I have described, it can generate and test plans to meet this goal. Of course, for all we know it might take enormous effort to catalog the basic goals and basic triggering relations necessary for holorobothood, but for all we know this might be of merely practical interest.

3 HOLISM AND THE FRAME PROBLEM

3.1 The Holism Problem

The second wise man, John Haugeland, construes the frame problem as arising from the fact that inferential relations in the real world are holistic: what it is reasonable to infer from a given condition may depend on many other "surrounding conditions". First, virtually any inference can be warranted by virtually any condition, if the right surrounding conditions hold. From the premise that a wagon is pulled, for example, one may infer that a bomb moves (if there is one on the wagon), that one pulls a muscle (if the load is heavy), that the wheels will squeak (if they aren’t oiled), that one will please a coworker (if he has asked for the slab on the wagon), and so on. Second, virtually any inference can fail to be warranted by virtually any condition if the wrong surrounding conditions hold. As Haugeland points out, there are many possible situations in which pulling a wagon might fail to make a bomb ride along, even though the bomb is on the wagon:

But what if [the bomb] is also tied to the doorknob with a string? Or what if, instead of [rolling], [the wagon] tips over? (Haugeland 1987, p. 85)

This holism leads to Haugeland’s problem:

The so called frame problem is how to ‘notice’ salient [inferences] without having to eliminate all of the other possibilities that might conceivably have occurred had the facts somehow been different. (Haugeland 1985, p. 204)

In other words, the problem is to come up with an efficient algorithm for respecting the fact that what may be inferred from a given condition may depend on virtually any surrounding condition. Such an algorithm would have to make tractable the number of surrounding conditions a system must check, without blinding it to the "salient" ones. To distinguish this problem from others that have gone by the name "frame problem", I will refer to it as the "holism problem".<9>

The holism problem intensifies the relevance problem. My illustration of bidirectional search in section 2.3 proceeds under the assumption that the inference rules associated with R2D2’s goal of saving the battery from the bomb do not refer to the precise number of revolutions of a wagon’s wheel, or the color of the walls, or any other "obviously" irrelevant conditions. However, if the bomb is activated by the squeaking of the wagon’s wheels, the precise number of revolutions of the wheels may be of crucial relevance. Even the color of the walls may be relevant, if the room is painted in such a way as to camouflage the door. As a result of this holism, to deal with the real world R2D2 is likely to need inference rules to handle these possibilities, raising the combined "relevance-holism" problem: how can a system know which knowledge is relevant to a goal in its particular situation, without having to think about a vast number of possibilities?

3.2 Relations to the Frame Problem of AI

As Haugeland points out, the sleeping-dog strategy does not provide a solution to the holism problem. But more than this is needed to show that something is wrong with the sleeping-dog strategy, of course. (After all, the sleeping-dog strategy also "fails" to solve the problem of world hunger.) Haugeland therefore makes a stronger claim, to the effect that the sleeping-dog strategy raises the holism problem. The sleeping-dog strategy requires there to be "positive indications" to the effect that certain facts are changed by an event, so that the system can focus only on these facts. These positive indications are provided by inference rules (e.g., the non-frame axioms of the situation calculus). Therefore, he concludes, it is the sleeping-dog strategy that "raises formidable design questions about how to get the needed positive indications for all the important [inferences]", i.e., the holism problem (Haugeland 1985, p. 206). On closer inspection, however, it is easy to see that it’s not the sleeping-dog strategy that raises the holism problem; the problem arises for any system that has to make inferences about the real world, whether or not it uses the sleeping-dog strategy. For example, in my introductory fable the computer C3 does not use the sleeping-dog strategy. Nevertheless, of course, it must make inferences, and these inferences must be sensitive to salient surrounding conditions, so it must face problems about inferential holism.

As a consequence, something more is needed to show that the sleeping-dog strategy is inadequate for the problem it’s intended to solve, namely, the persistence problem. Perhaps Haugeland’s idea is that the persistence problem cannot be solved without simultaneously solving the holism problem. Since he does not even attempt to provide reasons for bringing inferential holism into discussions of the frame problem, there is room for speculation about why he is tempted to do so. Perhaps the reasoning goes like this: to be a solution to the persistence problem, a system must ignore the facts that are not changed (by an event), so it must be able to tell which facts are changed, so it must respect the holism of change, and, more generally, the holism of inference. The problem with this argument is that it misconstrues the persistence problem. The persistence problem is posed as a process-and-form problem--"What procedures should I use to ignore the facts that are unchanged, given what I believe?" In other words, it is not a content problem about which particular facts a system should take to be unchanged by which events. Consider again the frame axiom proposal (see section 1). Frame axioms turned out to be a bad idea, not because they didn’t capture reliable information about nonchanges (we may suppose that they did), but because there were too many of them. The persistence problem therefore arises regardless of how reliable or unreliable a system is about which facts are unchanged. As a result, to solve it all we need to do is to design a system that has the capacity to ignore the facts that are not changed, if it knows which facts really are unchanged.

This shows that the holism problem does not lurk behind the persistence problem. To be a solution to the persistence problem, a system only needs to ignore the facts it thinks are not changed by an event. But to do that, the system needn’t be able to tell which facts really are changed. Since a solution to the persistence problem needn’t insure that systems are right about which facts are changed, it needn’t insure that systems have the capacity to keep track of the holism of change. So the sleeping-dog strategy can solve the persistence problem without solving the holism problem. Of course, I am not denying that we need to solve the holism problem in order to get intelligent holorobots that can deal reliably with the real world. In the rest of this section I focus on attempts in AI to solve this very problem. The point here is merely that the fate of this problem is irrelevant to the fate of the sleeping-dog strategy.

3.3 The Role of Heuristic Search

At root, the holism problem is this: for any set of conditions one wishes to make inferences from, there are always too many potentially applicable inference rules to consider, rules that may require one to check virtually any surrounding conditions. Returning to the labyrinth analogy, the problem is that, from any fork, there are so many paths that one can’t follow them all. If one knows nothing about the particular labyrinth one is in, one must select a path more or less at random. This is called "blind search" in AI. However, in some cases one can use specific information about the labyrinth to help one select the paths that are likely to be the best to follow. This is called "heuristic search". For example, one might know that the better paths in a certain garden tend to be wider, while those in another tend to be better lit. Such heuristics can help one to achieve better results than blind search (see Barr and Feigenbaum (eds.) 1981, pp. 58-63).

Now, when a computer is searching for inferential paths, it can use similar heuristics to avoid blindly checking every inference rule. For example, associated with each inference rule might be some measure of its general reliability. The inference rule "IF A WAGON IS PULLED, IT ROLLS" might, for instance, be deemed more reliable than "IF A WAGON IS PULLED, THE COLOR OF THE WALLS CHANGES". In addition, or instead, each inference rule might make reference to the antecedent probability that it will "apply", that is, to the antecedent probability of the surrounding conditions it presupposes. Take the rule "IF A WAGON ROLLS, AND x IS ON THE WAGON, THEN x RIDES ALONG". As Haugeland says, this rule can fail if x is tied to the doorknob, but then the antecedent probability of such failure might be deemed to be very low.

Given some such metric, a computer can constrain searches by looking initially only at the set of rules with the best marks (the size of the set depends on how many rules can be processed at the same time). It can thereby focus on the rolling of the wagon rather than the potential change of color of the walls, and it can assume "by default" that x is not tied to the doorknob.<10>,<11> If this set doesn’t get it where it wants to go, it can try the next best set, and so on down the "search hierarchy".

If one’s special concern is relevance holism, one might prefer (instead, or in addition) to use heuristics regarding the general usefulness of inference rules. For instance, the rule "IF A WAGON ROLLS, AND x IS ON THE WAGON, THEN x RIDES ALONG" may be deemed to be generally more useful than the rule "IF A WAGON ROLLS, THEN THE NUMBER OF REVOLUTIONS OF ITS WHEELS IS PROPORTIONAL TO THE DISTANCE". This may be so even though the former is less reliable (since x might be tied to the doorknob) and less likely to be applicable (since the wagon might be empty).<12>
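
Here is a minimal sketch of how such a search hierarchy might be organized; the rules and the numerical scores are invented, and a real system would have to learn them or be given them, as discussed below. Each rule is tagged with reliability, applicability, and usefulness scores, the rules are bucketed by a combined score, and the system descends to the next bucket only when the top one fails:

RULES = [
    # (rule, general reliability, antecedent probability of applying, general usefulness)
    ("IF A WAGON IS PULLED, IT ROLLS",                                     0.95, 0.9, 0.9),
    ("IF A WAGON ROLLS, AND x IS ON THE WAGON, THEN x RIDES ALONG",        0.90, 0.6, 0.9),
    ("IF A WAGON ROLLS, THEN THE NUMBER OF REVOLUTIONS OF ITS WHEELS "
     "IS PROPORTIONAL TO THE DISTANCE",                                    0.99, 0.9, 0.1),
    ("IF A WAGON IS PULLED, THE COLOR OF THE WALLS CHANGES",               0.01, 0.1, 0.1),
]

def score(rule):
    _, reliability, probability, usefulness = rule
    return reliability * probability * usefulness   # one crude way to combine the heuristics

def build_hierarchy(rules, bucket_size=2):
    ranked = sorted(rules, key=score, reverse=True)
    return [ranked[i:i + bucket_size] for i in range(0, len(ranked), bucket_size)]

def solve(problem, levels, try_bucket):
    # Treat the context as normal: use the top bucket, and descend only on failure.
    for depth, bucket in enumerate(levels):
        result = try_bucket(problem, [rule for rule, *_ in bucket])
        if result is not None:
            return result, depth
    return None, len(levels)

def try_bucket(problem, rules):
    # Stand-in for real plan testing: "succeed" if some rule mentions the problem keyword.
    return next((rule for rule in rules if problem in rule), None)

levels = build_hierarchy(RULES)
print(solve("RIDES ALONG", levels, try_bucket))   # found in the top bucket
print(solve("REVOLUTIONS", levels, try_bucket))   # found only after descending a level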

It is natural to wonder how context-free heuristics--such as general reliability, antecedent probability, and general usefulness--can be efficient or even effective. For surely the reliability, probability, and usefulness of inference rules are themselves highly dependent on one’s specific situation. Dreyfus and Dreyfus put the point nicely, in discussing a slightly different proposal:

[E]ven if the [rules] were stored in a context-free form the computer still couldn’t use them. To do that the computer requires rules enabling it to draw on just those [rules] which are relevant in each particular context. Determination of relevance will have to be based on further facts and rules, but the question will again arise as to which facts and rules are relevant for making each particular determination. One could always invoke further facts and rules to answer this question, but of course these must be only the relevant ones. And so it goes. It seems that AI workers will never be able to get started here unless they can settle the problem of relevance beforehand by cataloguing types of context and listing just those facts which are relevant in each. (Dreyfus and Dreyfus, 1986, p. 80)

The idea of repeated searches down a hierarchy is very important for blocking this sort of objection to heuristic search. For given search hierarchies, it isn’t true that a thinker needs to zero in on just the relevant rules. When the context is most normal (in the respects built into the search hierarchy), the top of the hierarchy will contain only the relevant rules, which can then be used successfully in action, inference, plan testing, etc. When the context is abnormal, the top of the hierarchy will contain irrelevant rules, which will be unsuccessful guides in action, inference, and planning. If Dreyfus and Dreyfus are correct, this in itself shows that the system should not even access them, much less use them. I agree that to avoid accessing them would require a regress (of some length or other) of rules for relevance. On a search-hierarchy model, however, the guiding idea is that a system should treat its context as normal until something goes wrong, and the context is thereby revealed to be abnormal. When use of some rules leads to a blocked action, or a subjectively incredible inference, or an unwieldy plan, etc., then the system should access the next level of rules down the hierarchy. (Of course, this procedure does not apply if the system already believes, or wonders whether, the context is abnormal.) Moving down the hierarchy in this fashion amounts to treating the situation as abnormal. A consequence of this model is that the less normal the context, the worse the system tends to perform (with respect to speed, errors, stops-and-starts, etc.), even if the system knows, "deep down" the hierarchy, what to do in abnormal contexts. The decreased performance in abnormal contexts is simply the price a system pays for increased performance in normal contexts. This rather unsurprising "prediction" is less an indication of the overwhelming plausibility of search hierarchies than it is of the underwhelming plausibility of the idea that context-dependence somehow forces upon AI a regress of rules.

Although Haugeland doesn’t particularly discuss heuristics as an approach to the holism problem, Jerry Fodor, the third wise man, registers this complaint:

So far as I can tell, the usual assumption about the frame problem in AI is that it is somehow to be solved ‘heuristically’. . . . Perhaps a bundle of such heuristics, properly coordinated and rapidly deployed, would suffice to make the central processes of a robot as [holistic] as yours, or mine, or the practicing scientist’s ever actually succeed in being. Since there are, at present, no serious proposals about what heuristics might belong to such a bundle, it seems hardly worth arguing the point. (Fodor 1983, pp. 115-116)

Fodor appears to be insisting that the trouble with the idea of heuristic search is that it raises the hard question: which heuristics should be used to establish search hierarchies of inference rules?

It is unclear whether Fodor construes this as a domain-general process-and-form problem--"What basic procedures should I use to form search hierarchies?"--or as a domain-specific content problem--"What particular search hierarchies should result from these procedures, in each particular domain?" He seems to be asking for a domain-general answer when he calls for a "principled solution to the frame problem" (Fodor 1983, p. 116), although he doesn’t attempt to explain the difference between principled and unprincipled solutions. Looked at this way, however, "serious proposals" about heuristics are a dime a dozen. I’ve just seriously proposed three principled heuristics, regarding the general reliability of an inference rule, its antecedent probability of applying, and its general usefulness. Of course, these principles leave open the various domain-specific problems about which inference rules are generally more reliable for dealing with the real world than which others, about which conditions in the real world are antecedently more likely to hold than which others, and about which inference rules are more likely to be useful than which others. Perhaps, then, Fodor is referring to the difficulty of these domain-specific "hierarchy problems".

How is a computer to establish the search hierarchies of inference rules necessary to solve hierarchy problems? Well, if we could set robots loose to gather data for themselves, they could rely on their own past experience, experience of which conditions have in fact obtained most often, or of which inference rules have in fact been most reliable and useful. Although this is undoubtedly necessary, as I mentioned above we can’t leave all learning up to the robots (see note <1>). Typically, then, a system must rely on the hierarchies we program into it. Can Fodor argue that the solution to the "frame problem" escapes our grasp by swinging away on this loose end? After all, how do we know which hierarchies to program into a reasoning system? Alas, for many domains, we don’t! Hierarchy problems are domain-specific content problems; to solve them, we have to do a lot of science. In this respect, hierarchy problems are no deeper than any other content problems, say, the "price problem": how are computers to know the prices of objects in the real world? Well, we’ve got to tell them, since as things stand we can’t very well turn them loose to find out for themselves. And for us to know, we’ve got to split up and do a lot of domain-specific investigations: you’ve got to find out about the prices of wagons, I’ve got to find out about the prices of bombs, etc. Similarly with hierarchy problems: you’ve got to find out how often wagons malfunction, I’ve got to find out how often bombs are put on wagons, etc. If AI is to build a well-informed computer, at least without the benefit of letting it grow up from infancy through college, it must incorporate the findings of experts in diverse domains. The important point is that, for all the holorobophobes have argued, AI’s "problems" of ranking conditions according to their relative probabilities, and of ranking rules according to their relative reliability and usefulness, are no more surprising or principled than its "problem" with specifying the prices of objects.

3.4 Summary

Before moving on, it may be helpful to summarize the main conclusions thus far. First, the relevance problem and the holism problem have nothing important to do with the frame problem as it is understood in AI, namely, the persistence problem. As a result, it is improper to use them in arguments against the sleeping-dog strategy. Second, the two problems, construed as process-and-form problems, are easily solved by appeal to two familiar AI shortcuts: bidirectional and heuristic search. Finally, although AI does not have a complete solution to certain domain-specific problems, the musings of the three wise men have not shown this to be a deep, epistemological threat to theories in AI.

4 KOOKINESS AND THE FRAME PROBLEM

4.1 The "Fridgeon" Problem

The third wise man, Jerry Fodor, raises a novel and interesting objection to the sleeping-dog strategy based on the kooky predicate "fridgeon", defined as follows: x is a fridgeon at t iff x is a physical particle at t and Fodor’s fridge is on at t. Fodor points out that when he turns his fridge on, he makes billions of changes--namely, he turns each particle in the universe into a fridgeon. Therefore, he argues:

If I let the facts about fridgeons into my database . . ., pursuing the sleeping dog strategy will no longer solve the frame problem. . . . [A] strategy which says ‘look just at the facts which change’ will buy you nothing; it will commit you to looking at indefinitely many facts. (Fodor 1987, pp. 144-145; emphasis Fodor’s)

The point is quite general. As Fodor explains, "there are arbitrarily many kooky concepts which can be defined with the same apparatus that you use to define perfectly kosher concepts," namely, the apparatus of "basic concepts" and "logical syntax" (Fodor 1987, pp. 145-146). "So," he continues, "the problem--viz. the FRAME problem--is to find a RULE that will keep the kooky concepts out while letting the non-kooky concepts in" (Fodor 1987, p. 146; emphasis Fodor’s). But this would be tantamount to "a rigorous account of our commonsense estimate of the world’s taxonomic structure," which would require "formalizing our intuitions about inductive relevance" (Fodor 1987, pp. 147-148). It’s no wonder, then, that Fodor claims the frame problem is "too important to leave to the hackers" (Fodor 1987, p. 148)!<13>

4.2 Three Kinds of Memory

Before turning directly to Fodor’s problem of formalizing inductive kookiness, it will help to get clearer about what a system should do in the face of kookiness. What I will argue is that a system should represent kooky facts implicitly in its representations of nonkooky facts. The basic idea can be explained by reference to the way people (like yourself) deal with the mental predicate "FRIDGEON". If Fodor is right, then you must keep representations of fridgeon facts out of your "database". But this doesn’t mean you must keep the definition of "FRIDGEON" out of your memory; if you did, you wouldn’t even be able to understand Fodor’s argument! On a natural view, then, you must have something like a mental dictionary, in which you can store the definition of "FRIDGEON".<14> If (for some odd reason) you want to check whether Nancy-the-Neutron is a fridgeon, you must first find "FRIDGEON" in your mental dictionary, and then check your database to determine whether Nancy satisfies the definition--that is, whether Nancy is a particle and whether Fodor’s fridge is on. Given that "FRIDGEON" appears in your mental dictionary, then, representations of fridgeon facts needn’t appear in your database. So you don’t need to update them when you discover that Fodor has turned his fridge on. The same is true for an AI system with both a dictionary and a database. When Fodor turns his fridge on, the system only needs to change one representation, namely, its representation of the state of Fodor’s fridge.

The most obvious objection to this strategy is that even representations of fridgeon facts must sometimes be explicit. Otherwise, one could never use the predicate "FRIDGEON", as you are in thinking about Fodor’s argument. In the example, once you find "FRIDGEON" in your dictionary, and check whether Nancy satisfies the definition, you still must infer explicitly that Nancy is a fridgeon. In other words, apparently, you must put the representation "NANCY IS A FRIDGEON" in your database. Since this representation is explicit, however, it needs to be updated explicitly, when Fodor turns his fridge off. It might seem, then, that the distinction between the dictionary and the database cuts too little ice. The proper response to this objection is to appeal to a third kind of memory, which cognitive scientists call "working memory". Working memory is a temporary storage space, for representations that are being used at a given time. The important thing about working memory for present purposes is that once representations in working memory are used, they can be erased. Now, while it is true that fridgeon facts sometimes need to be represented explicitly, they need only be explicit in working memory, not in the long-term database. Therefore, after generating and using the explicit representation "NANCY IS A FRIDGEON", you can simply erase it, without worrying about updating it. The same is true for an AI system with a working memory.

But Fodor can also object to this. The situation is different when a system is told that Nancy is a fridgeon--that is, when this is new information. If the system simply erases this representation from working memory, it will lose the information about Nancy. So, apparently, it must first copy the representation into the database, in which case it needs to worry about updating the copy. The response to this objection is simple. If the system is to keep fridgeon facts out of the database, it must translate representations of them into nonkooky representations (using the dictionary), and copy these nonkooky representations into the database. So, when a system is told that Nancy is a fridgeon, it should put the representations "NANCY IS A PARTICLE" and "FODOR’S FRIDGE IS ON" into the database.
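
Here is a minimal sketch of the three-memory proposal; the predicate names and helper functions are invented for illustration. The definition of "FRIDGEON" lives in the dictionary, only basic facts live in the long-term database, and derived fridgeon facts appear briefly in working memory before being erased:

DICTIONARY = {
    # defined predicate -> function from an individual to the basic facts its definition requires
    "FRIDGEON": lambda x: [f"{x} IS A PARTICLE", "FODOR'S FRIDGE IS ON"],
}
database = {"NANCY IS A PARTICLE", "FODOR'S FRIDGE IS ON"}   # basic facts only
working_memory = set()                                       # temporary, erasable

def query(pred, x):
    # Expand the defined predicate via the dictionary and check the basic facts.
    basics = DICTIONARY[pred](x)
    answer = all(fact in database for fact in basics)
    working_memory.add(f"{x} IS A {pred}")   # explicit only here, while being used...
    working_memory.clear()                   # ...and erased afterward, never updated
    return answer

def tell(pred, x):
    # Being told a defined fact: store only its nonkooky translation.
    for fact in DICTIONARY[pred](x):
        database.add(fact)

print(query("FRIDGEON", "NANCY"))            # True
database.discard("FODOR'S FRIDGE IS ON")     # Fodor turns his fridge off
print(query("FRIDGEON", "NANCY"))            # False -- only one basic representation changed

When Fodor flips the switch, the only representation that changes is the basic one for the fridge itself; there are no fridgeon representations in the database waiting to be updated.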

4.3 How to Rule out Kooky Predicates

Even given the viability of keeping kooky mental predicates in the dictionary and in working memory, the "fridgeon" problem has not been addressed. For how does a system know which predicates to keep there, and which to allow into the database? Mustn’t it follow a rule which, as Fodor claims, codifies "our intuitions about inductive relevance"? Not obviously. I agree with Fodor that no one knows how to formalize inductive kookiness, but I disagree with his claim that we need to do this to save the sleeping-dog strategy. As Fodor himself insists, kooky predicates are defined in terms of basic predicates, so representations involving kooky predicates can always be left implicit in representations involving only basic predicates. Suppose, then, that a system follows this rule: allow only basic predicates into the database, and keep all defined predicates in the dictionary and in working memory.<15> Even though this rule does not formalize kookiness, it is generally applicable to any kooky predicate Fodor chooses to define.

Call a system using this rule a "basic system", since all of its inferential processes are carried out over representations involving only basic predicates. Although a basic system does not need to appeal to inductive relevance in order to exclude kooky predicates, if it is to be well-informed about the real world then it needs to know which particular "basic representations" to infer from which particular others.<16> Call this the "basic learning problem". It may appear that my appeal to basic systems simply begs Fodor’s questions, since the basic learning problem is similar to Fodor’s problem of formalizing inductive relevance.<17> If this is Fodor’s question, however, it deserves begging, for it is deprived of any interest. Given the possibility of basic systems, Fodor cannot support his (interesting) claim that the sleeping-dog strategy raises special problems about kooky predicates. All he can claim, then, is that the sleeping-dog strategy must work hand-in-hand with a solution to the basic learning problem. But this is no surprise. The basic learning problem arises for any system that has to make inferences about the real world, whether or not it uses the sleeping-dog strategy (compare the discussion of C3 in section 3.2). Therefore, the sleeping-dog strategy does not raise or intensify the problem. More importantly, Fodor has not shown any principled difficulties with solving the problem. If we want well-informed robots, then we must do two things: we must engage in lots of domain-specific scientific investigations about what may be inferred from what, and we must occupy ourselves with issues surrounding how machines can learn as children do. The basic learning problem is a familiar example of an unprincipled, domain-specific content problem.

Another objection is that the rule that defines basic systems is much too strong. It not only keeps kooky predicates out of the database, but also excludes nonkooky defined predicates, like "MY BULGARIAN GRANDMOTHER" and "VEGETABLE CRISPER". The problem is that if one often uses these predicates, one might need to have representations involving them at one’s mental fingertips--that is, one might need to have them explicit in the database. In other words, it might take too much time and energy to deal with all the basic predicates each time one needs to use one of these complex predicates. Fair enough. The rule needs to be weakened in the following way: allow only representations involving basic predicates into the database, except for representations (involving defined predicates) that are so useful that you need to have them at your fingertips. In other words, when a particular combination of basic predicates recurs very frequently in the course of problem solving, the system may introduce into the database an abbreviation for it (i.e., a derived predicate). As amended, however, the rule needn’t mention anything about "our commonsense estimate of the world’s taxonomic structure."

One last argument on behalf of Fodor: he can object that weakening the rule may allow fridgeon facts back into the database after all. If individual fridgeon facts were (somehow) vitally important to a system, it might indeed need to have fridgeon information at its fingertips. But then it would be forced to update many representations when Fodor turns his fridge on or off. This is true. For such a system, however, "FRIDGEON" would not be a kooky predicate at all--at least, it would not be something the system should want to rule out of the database! A system with kooky enough needs would have to update indefinitely many beliefs; that’s just tough kookies.<18> The sleeping-dog strategy is not supposed magically to eliminate this possibility, but only to help minimize the number of updates, given a fixed set of needs. I conclude, then, that Fodor has not shown that the sleeping-dog strategy faces a problem about formalizing our intuitions about inductive kookiness.

4.4 The New Riddle of Induction

Although Fodor doesn’t explicitly mention Nelson Goodman’s "new riddle of induction" (the "grue" problem), he certainly nudges the reader toward the conclusion that AI must solve it to solve the frame problem. As I have mentioned, he (incorrectly) touts his "fridgeon" problem as a problem about "inductive relevance". This can make it seem similar to the problem of showing why "grue" facts are irrelevant to inductive inference. Elsewhere, Fodor writes, "the frame problem is not distinguishable from the problem of nondemonstrative confirmation" (Fodor, 1983, p. 138). To philosophers, at least, the "grue" problem is the paradigm paradox of nondemonstrative confirmation. Whatever his intentions on the matter, then, it is common for Fodor’s readers to come away with the distinct impression that he has shown the frame problem to be or to include the "grue" problem.

For all its distinctness, this impression is wrong. Although Fodor’s problem and Goodman’s problem are both problems about kooky predicates, they are only superficially similar. "Grue" is, like "fridgeon", a kooky predicate, which may be defined as follows: x is grue if and only if x is a green thing that has been examined or x is a blue thing that has not been examined. Suppose that every emerald ever examined is green. According to the definition, then, every emerald ever examined is grue. The trouble with "grue" is that, unlike the nonkooky predicate "green", it is not "projectible" to unexamined emeralds. "Green" is projectible, since the fact that all examined emeralds are green helps confirm the hypothesis that all unexamined emeralds are green. But the fact that all examined emeralds are grue does not help confirm the hypothesis that all unexamined emeralds are grue; by the definition, this would amount to the hypothesis that all unexamined emeralds are blue. The problem Goodman poses, then, is to show why "grue" and its like are not projectible from examined instances to unexamined instances, while "green" and its like are projectible.

By contrast, Fodor’s problem simply has nothing to do with projectibility, since "fridgeon", unlike "grue", is perfectly projectible. For if at least one particle is a fridgeon, it follows that Fodor’s fridge is on, so it follows that all particles are fridgeons. Therefore, even if Fodor is right that the sleeping-dog strategy converts the frame problem into a serious problem about inductive relevance, it would not follow that the frame problem would include the problem of avoiding the projection of "GRUE"-like mental predicates.<19>

Whether or not Goodman’s problem has anything to do with the frame problem, AI must solve something like it to design holorobots. Cast in Fodor’s computational terms, the problem is to find a "rule" that will keep kooky predicates (like "grue") from being projected while letting nonkooky predicates (like "green") be projected. Fortunately, AI can follow Goodman’s own lead in finding such a rule (Goodman, 1965, chapter 4). Goodman’s proposal centers on the "entrenchment" of a predicate. Roughly, a predicate is entrenched in proportion to the number of times it and its superordinates (e.g., "colored" for "green") have actually been projected in the past. Again roughly, Goodman suggests that a predicate (like "green") is projectible if it is much more entrenched than any "competing" predicate (like "grue"). Cast in terms of a "rule", what Goodman is saying is, roughly: "Project the predicates you’ve projected often in the past, and don’t project ones that lead to conflicting projections". It is hard to believe that AI would run into much difficulty with implementing this sort of rule! To exclude "grue", for instance, all that needs to be done is to endow a system with an "innate" disposition to project "green" (an endowment presumably shared by humans), and a disposition to favor entrenched predicates over conflicting, nonentrenched predicates.
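For concreteness, here is a minimal sketch (in Python) of such a rule; the entrenchment tallies, the superordinate lists, and the numbers are invented for illustration and carry no theoretical weight.

    # Entrenchment: how often a predicate (and its superordinates) has actually
    # been projected in the past. The figures below are purely illustrative.
    PAST_PROJECTIONS = {"GREEN": 10000, "COLORED": 50000, "GRUE": 0}
    SUPERORDINATES = {"GREEN": ["COLORED"], "GRUE": []}

    def entrenchment(pred):
        own = PAST_PROJECTIONS.get(pred, 0)
        inherited = sum(PAST_PROJECTIONS.get(s, 0) for s in SUPERORDINATES.get(pred, []))
        return own + inherited

    def projectible(pred, competitors):
        """Project pred to unexamined cases only if it is more entrenched than
        every predicate that competes with it on the same evidence."""
        return all(entrenchment(pred) > entrenchment(c) for c in competitors)

    # projectible("GREEN", ["GRUE"]) is True; projectible("GRUE", ["GREEN"]) is False.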

If Goodman’s "grue" problem sends shivers down philosophical spines, this is probably due to the worry that Goodman’s appeal to entrenchment merely describes our inductive practices, without justifying them. Why should we project "green" rather than "grue"? Because, he says, this coheres better with our established practices. Surely this is reasonable; it follows from a more general principle of conserving existing theories and practices until something goes positively wrong with them. But the worry is that even if we had normally projected "grue" much more often than "green", that practice would still have been worse than our actual one. Fortunately for AI, however, even if Goodman does need to say more to dispel this worry, AI doesn’t. AI might succeed in building a system with our inductive practices, without even beginning to justify these practices. As Pat Hayes writes in response to Fodor, "We [in AI] wouldn’t need to solve the philosophical problem of other minds in order to build a conversational program, or the problem of the ding an sich in order to build a vision system, either" (Hayes, 1987, p. 134). The musings of the third wise man, far from leading to a new, deep problem for AI, lead to an old, deep problem for philosophy.

4.5 Summary

In several respects, the "fridgeon" problem shares the fate of the relevance problem and the holism problem. First, none of them are properly identified with the frame problem as it is understood in AI (i.e., with the persistence problem), and none of them weigh against the sleeping-dog strategy. Second, if we construe them as content problems, they merely serve as reminders of severe but familiar practical problems with generating a complete understanding of holothinkers. But not even holorobophobia has led its victims to rest content with problems about what thinkers believe; each of the frame problems is supposed to be about how thinkers access and use these beliefs. Finally, however, if we construe the various frame problems as serious attempts to focus on the hows rather than the whats, then they are easily solved.

 

NOTES

<1>This paper is a moderately revised version of my "Framing the Frame Problem", Synthese, vol. 82, pp. 353-374, March 1990. Portions are reprinted with the permission of Kluwer Academic Publishers, Dordrecht.

<2>Couldn’t we avoid having to gather this information for the computers, by designing them to investigate the world for themselves, as children do? Not completely, for two broad reasons. First, setting computers loose in the world involves implanting them in robots; but we don’t yet know how to build robots that can see, feel, hear, hop, skip, and jump well enough to cross a city street safely. In part this is a problem about giving the robots general information about the looks and feels of things, a problem we might approach by allowing the robots to go through a long stage of infancy, in which we carry them across streets and constantly monitor their proximity to the oven. But in part it is a problem about giving the robots working eyes, tongues, fingertips, and funny bones, tasks that are surely at least as difficult as proceeding with sociology and the like. Second, there is the "blank slate" problem. It appears impossible to learn efficiently about a domain unless one already has some reliable information about what sorts of data to concentrate on, what sorts of hypotheses to try out, etc. Thus, building robot learners requires endowing them with considerable domain-specific innate knowledge, which requires us to engage in domain-specific investigations after all. Add to this the explicit instruction (e.g., "book learning") that must be gathered and presented to children, and it appears that a robot’s need to be spoon-fed extensive amounts of domain-specific beliefs is unsurprising and rather humanlike.

<3>I have transcribed the words of the three wise men from the reports of Daniel Dennett (whose original "R2D2" fable is the model for mine), John Haugeland, and Jerry Fodor, respectively (Dennett 1987, pp. 42-43; Haugeland 1987, p. 93; Fodor 1987, p. 142).

<4>The strategy originated in Fikes and Nilsson, 1971, and has since appeared in a number of more sophisticated guises, e.g., Shoham, 1988. The term is due to Haugeland, 1987.

<5>There is some room for the suspicion that the sleeping-dog strategy is nearly vacuous, perhaps because the notion of a "positive reason" may itself seem nearly vacuous. I find it helpful to pursue this worry by considering the following sort of case, suggested to me by Jim Higginbotham. Imagine an agent who is placed on a disk that rotates at a constant speed, and who (for whatever reason) has to keep track of certain of his momentary, egocentrically-specified relations to objects that are off the disk and out of sight. Since these relations change continuously, the agent has continuous "positive reason" to access and update his corresponding representations. But since these relations change predictably, the system can employ a regular procedure for these updates. In this case, the system can adopt a "sleeping-procedure strategy": don’t consider changing the procedure unless there is positive reason to do so. Both the sleeping-dog and sleeping-procedure strategies might then be seen as unremarkable instances of the nearly vacuous "sleeping-thing strategy": don’t consider changing anything (e.g., one’s wallpaper) unless there is a positive reason to do so. Why isn’t the sleeping-dog strategy equally uninteresting?

There is good reason to give special attention to the sleeping-dog strategy (formulated as a principle specifically governing the direct alteration of beliefs about changeable facts, rather than the alteration of procedures or "things" generally). If we focus on the general sleeping-thing strategy, I think, we remove the possibility of placing substantive constraints on what it is to be a "positive" reason. In the general case, maybe a positive reason to consider changing something is simply that one is in the mood to do so, or, even more weakly, simply that one finds oneself about to do so. With respect to the sleeping-dog strategy conceived in the standard way, however, we can motivate at least a skeletal account of what it is for a given computational system to have a positive reason for considering the revision of a belief that p: perhaps the sort of system under consideration must have an "axiom" whose antecedent is believed to be satisfied, and whose consequent is that not-p. Furthermore, we can add that the axiom should not be a frame axiom--that is, its antecedent should not presuppose that not-p. This account can be elaborated by requiring the "axioms" to have theoretical or observational support, to admit of certain degrees of confidence, etc.
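Very roughly, this skeletal account might be coded as follows (a Python sketch; the Axiom structure and the example sentence are placeholders of my own for whatever a particular system supplies, and the detection of frame axioms is simply assumed rather than implemented):

    from dataclasses import dataclass

    @dataclass
    class Axiom:
        antecedent: str        # e.g. "THE WAGON WAS PULLED THROUGH THE DOORWAY"
        consequent: str        # for present purposes, a sentence of the form "not-p"
        is_frame_axiom: bool   # True if the antecedent already presupposes not-p

    def positive_reason_to_revise(p, beliefs, axioms):
        """There is positive reason to consider revising the belief that p just
        in case some non-frame axiom has a believed antecedent and the
        consequent not-p."""
        return any(
            ax.consequent == "not-" + p
            and ax.antecedent in beliefs
            and not ax.is_frame_axiom
            for ax in axioms
        )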

Furthermore, the sleeping-dog strategy has special interest as a technique for reducing computational load, at least by comparison with the sleeping-procedure strategy. Generally, there are more representations (of changeable facts) than there are procedures, so the sleeping-dog strategy has greater potential for reducing computational load. Indeed, the sleeping-dog strategy has merit even in Higginbotham’s spinning-disk example. Although in this case it is possible that the agent must continually access and update a huge number of representations, without the sleeping-dog strategy the agent would have to access and update an even huger number of representations, e.g., about the unchanging but changeable aspects of his situation.

<6>For a good introduction to the AI literature on "search", see chapter II of Barr and Feigenbaum (eds.) 1981. The term "operators" is standardly used for inference rules as well as other goal-reaching devices that do not concern me here.

<7>As I explain in section 2.1, "relevance holism" creates a difficulty here, but one that can be solved.

<8>I omit nuances such as the temporal factors mentioned in the introduction and the exceptions discussed in section 2.1.

<9>In The Modularity of Mind, Fodor anticipates Haugeland’s treatment of the frame problem as a problem about holism. He writes that one of the things that "makes [the frame] problem so hard" is that "which beliefs are up for grabs depends intimately upon which actions are performed and upon the context of the performances" (Fodor 1983, p. 114).

<10>Occasionally, when the stakes are high, it may be advantageous for a system to go into a more careful mode in which it avoids making some default assumptions, explicitly checking the surrounding conditions instead. I ignore this nicety here, since to the degree that a system needs to be careful, the holism problem is made less important: if the stakes are so high that a system needs explicitly to check surrounding conditions, it can hardly be faulted for doing so.

<11>AI researchers have had mixed success in trying to develop a "nonmonotonic logic" for reasoning with default assumptions (for a review of this literature, see Shoham 1988). From the perspective adopted here, however, default (or nonmonotonic) reasoning is an ordinary example of heuristic search, which is generally thought not to require the development of a corresponding "logic". This is one way of seeing that we may not need nonmonotonic logic (as opposed to nonmonotonic reasoning), so that the shortcomings of nonmonotonic logics may not be important.

<12>A good illustration of this method is in Holland, et al. 1986. Their "strength" parameters reflect the past usefulness of rules, and are used to constrain search.

<13>Some of Fodor’s readers have been struck by similarities between his "fridgeon" problem and Nelson Goodman’s infamous "grue" problem (Goodman, 1965). I will discuss the relevance of the "grue" problem in section 4.4.

<14>For simplicity, we can suppose that this dictionary is wholly separate from the database of "facts", although it is not necessary for my purposes. More on this in the next note.

<15>Strictly speaking, this rule may have to be modified to be implemented in familiar models. The distinction between basic and derived predicates is a semantic distinction of considerable intricacy, not a syntactic distinction. Given this, it is difficult to see how a computational system could classify its predicates literally as basic or as derived, especially if there are no separate storage areas conveniently labeled "Dictionary" and "Database" or, even more conveniently, "Analytic" and "Synthetic". However, a number of computational tests can combine to generate a useful classification which is near enough to the basic/derived distinction: call it the distinction between "quasi-basic" and "quasi-derived" predicates.

The first step is to classify predicates as either syntactically complex or syntactically simple. Since this is a formal distinction, nearly enough, I presume it is not difficult to understand how a system could draw it, nearly enough. Syntactically complex predicates may be classified as quasi-derived.

The harder task is to draw a distinction that can do duty for the distinction between basic and derived syntactically simple predicates. It may appear that we need a test for when a syntactically simple predicate (e.g., "fridgeon") is literally defined by a syntactically complex predicate. However, all that is actually required is that there be a test for syntactically simple predicates for which the system is disposed to substitute a particular syntactically complex predicate, in any situation the system judges to be both likely and important. (This seems to be the situation we find ourselves in when Fodor tells us that "x is a fridgeon" means "x is a particle and Fodor’s fridge is on".) In many cognitive models such dispositions are realized in such a way that processes may access and modify them (e.g., as "strengths" of productions in production systems or as "weights" of connections in connectionist networks). I presume, then, that it is possible for computational processes to access these dispositions in distinguishing quasi-basic from quasi-derived predicates.
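For concreteness, the two-step classification might be sketched as follows (in Python); the substitution "strengths" and the threshold are invented stand-ins for whatever dispositional parameters a particular model provides.

    # Hypothetical substitution dispositions: simple predicate -> (the complex
    # predicate the system is disposed to swap in, the strength of that disposition).
    SUBSTITUTIONS = {
        "FRIDGEON": ("PARTICLE & FODORS-FRIDGE-IS-ON", 0.95),
        "EMERALD": (None, 0.0),
    }
    DISPOSITION_THRESHOLD = 0.8  # arbitrary cutoff for a "strong enough" disposition

    def is_syntactically_complex(predicate):
        # Crude formal test: complex predicates wear their connectives on their sleeve.
        return any(op in predicate for op in ("&", " OR ", "NOT "))

    def classify(predicate):
        """Return "quasi-derived" or "quasi-basic"."""
        if is_syntactically_complex(predicate):
            return "quasi-derived"
        substitute, strength = SUBSTITUTIONS.get(predicate, (None, 0.0))
        if substitute is not None and strength >= DISPOSITION_THRESHOLD:
            return "quasi-derived"  # treated, in effect, as an abbreviation
        return "quasi-basic"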

<16>I should note that my use of the term "basic" differs from Haugeland’s in "An Overview of the Frame Problem" (Haugeland, 1987, pp. 84-85). I mean "semantically primitive" representations (see the previous note), while he means "causally primitive" facts. His criticisms of appeals to basic facts do not apply to my appeals to basic representations (and, of course, they are not intended to).

<17>I thank Joelle Proust for pressing this point.

<18>Even if a system had the kooky need to allow fridgeon facts into the database, it would not necessarily run into computational problems. Once the system has determined that Fodor’s fridge is on, it can form the one explicit representation "ALL PARTICLES ARE FRIDGEONS". When it changes its representation of Fodor’s fridge, it can change this one representation. Surely this is computationally feasible! But Fodor can object that even this move leaves implicit all the billions of particular-fridgeon facts, say, the fact that Nancy-the-Neutron is a fridgeon, and so on. In order to establish that Nancy is a fridgeon, the system has to infer this from the general fact about all particles. Perhaps each one of the billions of fridgeon facts is so vital to a system’s microsecond-to-microsecond welfare that, in the heat of the moment, it cannot afford to waste time on this inference. Such a system would be very likely to fail. But if this is Fodor’s frame problem, it should simply be ignored. Any human would fail, too, in the same situation. So AI could design a machine as smart as any human, without running up against this problem.
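The maneuver can be pictured in a toy illustration (Python; nothing here is a proposal about any actual system):

    class FridgeonKeeper:
        """Keeps all fridgeon information in one explicit generalization."""

        def __init__(self):
            self.general_facts = {"ALL PARTICLES ARE FRIDGEONS": False}

        def set_fridge(self, on):
            # One update, no matter how many particles there are.
            self.general_facts["ALL PARTICLES ARE FRIDGEONS"] = bool(on)

        def is_fridgeon(self, particle):
            # Particular fridgeon facts (e.g. about Nancy-the-Neutron) are derived
            # on demand by a one-step inference from the generalization.
            return self.general_facts["ALL PARTICLES ARE FRIDGEONS"]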

<19>Conversely, the mental predicate "GRUE" does not create the (alleged) problems "FRIDGEON" does. Using "FRIDGEON" rather than nonkooky predicates leads to Fodor’s problem, because doing so (apparently) increases the number of representations one must update. Using "GRUE" rather than "GREEN", on the other hand, does not (even apparently) increase the number of updates. When one observes an object x, one can classify it as "GREEN" by storing the single representation "x IS GREEN". Even if one instead classifies it as "GRUE", however, one need only store the single representation "x IS GRUE". (One wouldn’t also need to store "x IS GREEN", since this may be left implicit in "x IS GRUE" and "x IS OBSERVED".) Likewise, one can project "GREEN" to an unobserved object y by adding the single representation "y IS GREEN". Even if one instead projects "GRUE", however, one need only add the single representation "y IS GRUE". (One wouldn’t also need to add "y IS BLUE", since this may be left implicit in "y IS GRUE" and "y IS UNOBSERVED".)

 

REFERENCES

Barr, A., and E. Feigenbaum, eds., 1981: The Handbook of Artificial Intelligence, vol. I, William Kaufmann, Inc., Los Altos, CA.

Dennett, D., 1984: "Cognitive Wheels: the Frame Problem of AI", in Pylyshyn, ed., 1987.

Dreyfus, H. and S. Dreyfus, 1986: Mind Over Machine, Free Press, New York.

Fikes, R., and N. Nilsson, 1971: "STRIPS: A New Approach to the Application of Theorem Proving to Problem Solving", Artificial Intelligence, vol. 2, pp. 189-208.

Fodor, J., 1983: The Modularity of Mind, MIT Press, Cambridge.

Fodor, J., 1987: "Modules, Frames, Fridgeons, Sleeping Dogs, and the Music of the Spheres", in Pylyshyn, ed., 1987.

Goodman, N., 1965: Fact, Fiction, and Forecast, Bobbs-Merrill, Indianapolis.

Haugeland, J., 1985: Artificial Intelligence: the Very Idea, MIT Press, Cambridge.

Haugeland, J., 1987: "An Overview of the Frame Problem", in Pylyshyn, ed., 1987.

Hayes, P., 1987: "What the Frame Problem Is and Isn’t", in Pylyshyn, ed., 1987.

Holland, J., K. Holyoak, R. Nisbett, and P. Thagard, 1986: Induction, MIT Press, Cambridge.

Janlert, L., 1987: "Modeling Change--The Frame Problem", in Pylyshyn, ed., 1987.

McCarthy, J., and P. Hayes, 1969: "Some Philosophical Problems from the Standpoint of Artificial Intelligence", in B. Meltzer and D. Michie, eds., Machine Intelligence 4, Edinburgh University Press, Edinburgh.

Pylyshyn, Z., ed., 1987: The Robot’s Dilemma: the Frame Problem in Artificial Intelligence, Ablex, Norwood, NJ.

Shoham, Y., 1988: Reasoning about Change, MIT Press, Cambridge.