Stanford University Libraries, Dept. of Special Collections
APPENDIX A

These notes are supposed to serve as a partial updating of several sections of "Steps Toward Artificial Intelligence."

p. 2: Search

Before resorting to incomplete search methods, one should make quite sure that the underlying brute-force search is as efficient as practical. It has been discovered that there often exist modifications of exhaustive search procedures involving no sacrifice. Notable is McCarthy's "alpha-beta" technique (described in "The Tree Prune Algorithm," T. Hart and D. J. Edwards, M.I.T. Artificial Intelligence Project Memo 30, Dec. 1961, M.I.T. Computation Center; publication of these results will occur in the near future). Briefly, by keeping track of the results obtained during a search, one can approach a reduction in the branching rate of a tree by a factor of two—thus allowing about twice the search depth. This does not make games like Chess and Checkers amenable to exhaustive search, but it did bring into this domain an interesting game called Kalah. For example, in the diagram on p. 13, the player need not investigate further the middle branch once he sees that the opponent can hold him to 2, because he can already do better than that on the first or third branches—assuming that he has already explored one of those.

p. 2: Note 2

These remarks are slightly amplified in [C], vol. 2, pp. 565-567.
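The "alpha-beta" cutoff described under "Search" above can be sketched in a few lines. This is a modern illustrative sketch, not part of the memo; the function name and the nested-list tree encoding are my own, with scores echoing the p. 13 example.

```python
def alphabeta(node, alpha=float("-inf"), beta=float("inf"), maximizing=True):
    """Minimax value of a game tree given as nested lists (leaves are
    scores for the maximizing player), with alpha-beta cutoffs."""
    if not isinstance(node, list):            # leaf: a terminal score
        return node
    if maximizing:
        value = float("-inf")
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False))
            alpha = max(alpha, value)
            if alpha >= beta:                 # opponent already has a better line
                break                         # prune the remaining children
        return value
    else:
        value = float("inf")
        for child in node:
            value = min(value, alphabeta(child, alpha, beta, True))
            beta = min(beta, value)
            if beta <= alpha:                 # we already do better elsewhere
                break                         # prune the remaining children
        return value

# Three branches for the player; the opponent then picks the minimum.
# Once the middle branch is seen to be held to 2 (worse than the 3 already
# guaranteed by the first branch), its second leaf is never examined.
tree = [[3, 5], [2, 9], [4, 6]]
assert alphabeta(tree) == 4
```

It is this skipping of leaves that cannot affect the result which roughly halves the effective branching rate.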

p. 3: Hill-Climbing

The discussion on hill-climbing is greatly amplified in a forthcoming paper by Selfridge and Minsky, "Multiple Simultaneous Optimization," to be published in the proceedings of the next Soviet Automatic Control Conference. We develop there some further ideas about the growth rate of such processes as a function of the number of variables. The "Mesa" phenomenon was discussed in further detail in paper [7]. The chief idea here is that local improvement methods work only when one has a reasonable starting position. In the case of complex problem-solving situations, hill-climbing is no substitute for the concept-formation phase essential to get the process on the right track.
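Hill-climbing and the "Mesa" difficulty can be illustrated with a toy sketch (mine, not the memo's; the one-dimensional surface and all names are invented for illustration):

```python
def hill_climb(f, x, step=1.0, max_steps=1000):
    """Greedy local improvement: move to a neighboring point only if it
    scores strictly better; stop when no neighbor improves."""
    for _ in range(max_steps):
        best = max((x - step, x + step), key=f)
        if f(best) <= f(x):     # a peak, or a flat "mesa": no local gradient
            return x
        x = best
    return x

# A broad flat mesa at height 0, with a single peak at x = 10.
def surface(x):
    return max(0.0, 5.0 - abs(x - 10.0))

# From the mesa there is no local improvement to follow; from a
# reasonable starting position, the climber reaches the peak.
assert hill_climb(surface, 0.0) == 0.0
assert hill_climb(surface, 8.0) == 10.0
```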

p. 4: Prototype-Derived Patterns

Those wishing to review work in this field should consult the extensive review by Mary Stevens, which obsoletes ref. [41]: "Automatic Character Recognition: State of the Art," NBS Tech. Note 112, PB161613, Wash., D.C., May 1961.


p. 5: Property Lists

The property-list method, despite its limitations, is the most popular idea used in planning of pattern-recognition machines (with the possible exception of the Bayes nets, about which more later). When people make machines using this or a modified template-matching method, they usually run into trouble on a few important discriminations. At this point it is necessary to build into the equipment some post-processing—an additional decision-tree structure.

If I were rewriting "Steps," I would introduce here a new section on "Growing Decision Trees," based on the notions implicit in the EPAM model of Feigenbaum and Simon. The idea here is that one can build up a "discrimination net" on the basis of experience, by introducing new tests where they are needed in a growing decision tree. Judicious use of such techniques may make it possible to do very nearly as well in a serial machine as one expects to do in a parallel machine—at accordingly much greater computational efficiency. One should consider very seriously the EPAM (1961 WJCC) model in designing problem-solving and pattern-recognition machines, not so much because of their alleged psychological verisimilitude but because of their basic economy—one does not compute a property function until it is necessary. Slagle's experience in the Calculus program (SAINT: to appear soon in the JACM) was that property-list computations consumed an appreciable part of the running time; he found it useful to introduce heuristics to reduce the frequency of using this procedure. An EPAM process here might have worked better.

The EPAM processes, because of the clear context for new properties, also help one make a more directed search for appropriate new properties, along the lines of GPS goals.
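The idea of growing a discrimination net can be sketched as follows. This is my own toy rendering of the general idea, not Feigenbaum and Simon's program: each internal node holds one test, and a new test is spliced in only where two items collide, so a property is computed only when it is needed to discriminate.

```python
class Net:
    def __init__(self):
        self.root = None                # a leaf (name, item) or a dict node

    def classify(self, item):
        node = self.root
        while isinstance(node, dict):   # apply tests until a leaf is reached
            node = node[node["test"](item)]
        return node[0] if node else None

    def learn(self, name, item, tests):
        """Insert item; on a collision, splice in the first test that
        separates the two items, growing the tree only where needed."""
        if self.root is None:
            self.root = (name, item)
            return
        parent, key, node = None, None, self.root
        while isinstance(node, dict):   # descend to the leaf item now reaches
            parent, key = node, node["test"](item)
            node = node[key]
        old_name, old_item = node
        for test in tests:              # lazily pick a separating test
            if test(item) != test(old_item):
                new_node = {"test": test,
                            test(item): (name, item),
                            test(old_item): (old_name, old_item)}
                if parent is None:
                    self.root = new_node
                else:
                    parent[key] = new_node
                return
        # no test separates the items: the sketch simply leaves the net as is

tests = [lambda w: w[0] == "c", lambda w: len(w) > 3]
net = Net()
net.learn("cat", "cat", tests)
net.learn("dog", "dog", tests)      # first test separates c- from d-words
net.learn("crow", "crow", tests)    # a second test grows under the c-branch
assert net.classify("crow") == "crow"
```

Note that classifying "dog" here evaluates only the first test—the economy the memo points to.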

p. 6: Generating Properties

An elaborate scheme related to the ideas in this section is that of Uhr and Vossler, as presented in the 1962 IFIPS symposium, to be published shortly. Work in the area of "concept-formation" is surely among the most pressing aspects of today's artificial intelligence research. One should see also A. Hormann, "Programs for Machine Learning," SDC, Santa Monica, TM-669/000/01, May 1962.

p. 7: "Bayes Nets"

The analysis for single-layer nets given here has finally become popular, and several devices have been reported using our initial weightings. We must reiterate our caveats concerning the nonindependence of most generators of properties. (These remarks are developed in much greater detail in a paper by Selfridge and me, "Learning in Random Nets, II," a sequel to ref. [7], to be available soon.) Perhaps the most notable development along this line is the PAPA machine family of A. Gamba,* which makes use of random masks together with the basic probabilistic analysis. Another family of related devices are the "majority logic" machines of Widrow, subject to the same criticisms; good performance on simple discrimination problems is liable to be followed by hopelessly poor performance on somewhat more difficult problems. Recent experiments in these areas show that these machines usually converge to a positive error rate; they do not approach perfection as the number of "A-units" is increased indefinitely.
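The single-layer weighting can be sketched in modern terms. This is my own simplified rendering, not the memo's: assuming binary properties that are independent within each class, the decision reduces to a weighted vote with log-odds weights. The smoothing constant, names, and toy figures are invented, and a full treatment would also carry terms for properties that are off (and the class priors), which this simplified vote folds into the threshold.

```python
import math

def bayes_weights(p, q, eps=1e-9):
    """p[i], q[i]: probability that property i is on in class A resp.
    class B.  Weight i is the log-odds ratio for that property."""
    return [math.log((pi + eps) * (1 - qi + eps) / ((qi + eps) * (1 - pi + eps)))
            for pi, qi in zip(p, q)]

def decide(weights, properties, threshold=0.0):
    """True (class A) iff the weighted vote of the on-properties exceeds
    the threshold."""
    return sum(w for w, on in zip(weights, properties) if on) > threshold

# Toy figures: property 0 favors class A, property 1 favors class B.
w = bayes_weights(p=[0.9, 0.2], q=[0.1, 0.8])
assert decide(w, [1, 0])        # strong A evidence
assert not decide(w, [0, 1])    # strong B evidence
```

The nonindependence caveat above is exactly the point at which this analysis breaks down: correlated property generators make the summed weights overcount shared evidence.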

p. 9: Descriptions

There has been a considerable amount of work on the problem of descriptions, most of it in process of publication or Thesis-writing. T. Evans of AFCRC has applied articular description methods to pictorial-analogy intelligence tests with good results. R. Canaday has used such methods successfully to deal with such patterns as

which is recognized by his program (Canaday, R. H., "The Description of Overlapping Figures," M.S.E.E. Thesis, M.I.T., June 1962) to be a "triangle over a rectangle over a square," etc. Some most striking results in this area are those of Sutherland (M.I.T. Thesis in preparation) on a recursive picture-description language for the purpose of drawing figures on-line at the computer. This program deals with pictures using a symbolic system which makes the pictures highly amenable for use in other problem-solving procedures.

p. 11: Learning

The statistical learning procedures mentioned here continue to appear in all sorts of simple learning systems. As noted at the end of the section, I take the view that these are not so central as are the organizational problems. In

*Nuovo Cimento, Supplement 2, vol. 20, series X, pp. 221-231.


connection with the cited work of Miller et al., the reader is referred to Hormann's (op. cit.) experiments, hopefully to be completed soon, which are based on a recursive planning procedure related to that of Miller. Those interested in the basic reinforcement schemes may be interested in the work of Marzocco (IFIPS 1962 Proceedings) on simulation of stimulus-sampling learning à la Estes. For physiological models of learning one should know about the volume Brain Mechanisms and Learning [C. C. Thomas, Springfield, 1961]. Some recent work in this area will be reported in a volume edited by R. Gerard which will appear in connection with the 1962 International Congress of Physiologists; this will contain papers by a number of workers active in the artificial intelligence area as well as in neurophysiology.

While on the subject of neurophysiology, several groups, including ours, are considering simulated models of the visual system, as understood from the experiments of Lettvin and Maturana and of Hubel and Wiesel, and others. It will be interesting to see how well one can do, with Bayes and other network-learning systems, using these as pre-processing schemes for the source of input property-functions. We should note the work of Maron here, as an interesting instance of a network learning program based on first-order probabilities modified by a scheme for rejection of poor input properties, to be replaced by others (with greater delay).

p. 12: Samuel's Program

Samuel has continued to improve his program, so that the 7090 version is now very fast and powerful. Recently the program defeated a former State Champion in a rather well-played game; the score of this game appears in a short expository article by me in Discovery, October 1962. I expect Samuel to publish a more critical discussion of this and other games.

No such success has been reported for Chess programs, despite general public belief to the contrary. A chess program developed at M.I.T. by McCarthy and others is the best current program, and the first one, I believe, to pose a very strong threat to weak players—it uses McCarthy's "alpha-beta" heuristic. Given Queen odds and first move it recently beat an expert player, whatever that means. We do expect stronger Chess machines to appear within a couple of years.

p. 14: LT and Other Theorem-Proving Programs

The LT program has served its historical purpose and is quiescent, but I stand by my remarks about the necessity for heuristics in theorem-proving. There has been a great deal of activity in the direction of improving decision procedures and partial decision procedures in the predicate calculus, with the goal of getting machines to prove nonelementary theorems in mathematics.


While there has been no notable single achievement in this direction, it does appear that such programs can be made to be very much more efficient than appeared at first, and several workers are attempting to get proofs in elementary number theory and group theory. None of these programs, to my knowledge, are planned to be able to make use of previously established theorems or "lemmas," and it will be most interesting to see how far they can go, using their inhuman methods!

p. 16: The Subproblem-Selection Problem

I have not seen any great progress on this problem. Evidently most readers did not understand that the game described in Figure 11 on page 16 was not supposed to be of interest in itself, but was to serve as a metaphor for presenting the subproblem decision problem. (See, e.g., Mullin's review of "Steps" in the PGEC reviews.) For those interested in this problem, note the remarkable solution to the game discovered by O. Gross and presented in Gardner's column in Scientific American, some time last year. No elegant solution has been found for the superficially similar game of "Hex."

p. 18-19: GPS and Planning

The experiments described in [15] did not, apparently, come off successfully, and Newell's paper (which is to accompany the present one in the final proceedings) may explain the organizational complications involved in attempting to apply GPS to itself. I hope that he will also review Simon's work on a "Heuristic Compiler" which is intended to solve some planning problems.

p. 21: Conclusions

One problem not discussed at any length in "Steps" is that of information retrieval—in particular, the problem of finding a relevant conclusion from a large body of factual data. There is quite a lot of current activity on this problem. The first important paper—"Baseball" [B. F. Green et al., 1961 WJCC]—describes a question-answering program which deals with a highly organized body of data; it answers questions about this data, when the questions are written in a restricted form of ordinary English. There seem to be a good many projects on the current scene devoted to extensions and variations of this kind of problem.

The aspect of current results in Artificial Intelligence that seems most dubious, suspicious to skeptical observers, is the narrowness of the area within which each program will work. In many cases, this could be broadened through the use of better relevancy-retrieval techniques.

While one must agree with this criticism that current programs are too specialized, I think it would be unwise for all of us to be too sensitive on this score. We are still very weak on the basic organizational processes needed for intelligent reasoning. It is my feeling that we have still more to benefit from pursuing a few problems (e.g., Chess, Theorem-proving, Semantic models) in very great depth and detail than we have from trying to extend our correct understanding to broader problem domains. The latter will be done, perforce, in applications and development laboratories—those in a more academic situation should seize the chance to study details of methods for solving very hard problems.
