Scientific Conceptualization of Information


A Survey

The article surveys the development of a scientific conceptualization of information during and in the decade following World War II. It examines the roots of information science in nineteenth- and early twentieth-century mathematical logic, physics, psychology, and electrical engineering, and then focuses on how Warren McCulloch, Walter Pitts, Claude Shannon, Alan Turing, John von Neumann, and Norbert Wiener combined these diverse studies into a coherent discipline.

Categories and Subject Descriptors: E.4 [Coding and Information Theory]; F.1 [Computation by Abstract Devices]: Models of Computation, Modes of Computation; F.4.1 [Mathematical Logic and Formal Languages]: Mathematical Logic - computability theory, recursive function theory; H.1.1 [Models and Principles]: Systems and Information Theory - information theory; K.2 [History of Computing] - people

General Terms: Theory

Additional Key Words and Phrases: W. McCulloch, W. Pitts, C. Shannon, A. Turing, J. von Neumann, N. Wiener

Modern scholarship has tended to equate the history of information processing with the history of computing machinery. Because of the phenomenal growth of a new generation of more powerful machines every few years, other important events in information-processing history have been overshadowed. One such event is the scientific conceptualization of information that occurred during and in the decade following World War II. In that period a small group of mathematically oriented scientists developed a theory of information and information processing. For the first time, information became a precisely defined concept amenable to scientific study. Information was given the status of a physical parameter, such as entropy, which could be quantified and examined using mathematical tools. Around this concept grew a number of research areas involving study of both machines and living organisms. They included the mathematical theory of communication, mathematical modeling of the brain, artificial intelligence, cybernetics, automata theory, and homeostasis.

Of course, the word information was in common usage for many years before its scientific conceptualization. It was recorded in print in 1390 to mean communication of the knowledge or news of some fact or occurrence (Oxford English Dictionary). Information also found a place in the traditional scientific discourse of physics, mathematical logic, electrical engineering, psychology, and biology, in some instances as early as the nineteenth century. These disciplines provided the avenues to the study of information for the scientists reformulating the concept during and after World War II.

The pursuit of five research areas in particular led to the scientific conceptualization of information.¹

¹ Quantification is not universally possible in information science. For example, it is possible to quantify coding counts, but not most semantic concerns.

© 1985 by the American Federation of Information Processing Societies, Inc. Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the AFIPS copyright notice and the title of the publication and its date appear, and notice is given that the copying is by permission of the American Federation of Information Processing Societies, Inc. To copy otherwise, or to republish, requires specific permission.

Author's address: Charles Babbage Institute, 104 Walter Library, University of Minnesota, Minneapolis, MN 55455.

© 1985 AFIPS 0164-1239/85/020117-140$01.00/00

    Anna ls of the History of Computing, Volume 7, Number 2, April 1985 * 117


1. James Clerk Maxwell, Ludwig Boltzmann, and Leo Szilard's work in thermodynamics and statistical mechanics, especially on the concept of entropy and its mathematical formulation.

2. The emergence of control and communication as a new branch of electrical engineering, supplementary to power engineering, as a result of the development of telegraphy, radio, and television.

3. The study of the physiology of the nervous system, beginning in the nineteenth century with the work of H. von Helmholtz and Claude Bernard, and continuing into the twentieth century, especially with the work of Walter Cannon on homeostasis and the self-regulation of living organisms.

4. The development of functionalist and behaviorist theories of the mind in psychology, leading to a view of the brain as a processor of information and to the demand for experimental verification of theories of mind through observation of external behavior.

5. The development of recursive function theory in mathematical logic as a formal, mathematical characterization of the human computational process.

What was new after the war was a concerted effort to unify these diverse roots through a common mathematical characterization of the concepts of information and information processing. The seminal idea was that an interdisciplinary approach is appropriate to solve problems in both biological and physical settings in cases where the key to the problems is the manipulation, storage, or transmission of information and where the overall structure can be studied using mathematical tools. For these scientists, both the human brain and the electronic computer were considered types of complicated information processors whose similar laws of functioning could be better understood with the help of the abstract results deduced from the mathematical models of automata theory.

The major figures in this movement were Claude E. Shannon, Norbert Wiener, Warren S. McCulloch, Walter Pitts, Alan M. Turing, and John von Neumann. They came to the subject from various established scientific disciplines: mathematics, electrical engineering, psychology, biology, and physics. Despite their diverse backgrounds and research specialties, they functioned as a cohesive scientific community. They shared a common educational background in mathematical logic. Most had wartime experience that sharpened their awareness of the general importance of information as a scientific concept. Each appreciated the importance of information to his own research. Each was familiar with the others' work and recognized its importance to his own work and to the science of information generally. Most were personally acquainted,² and often collaborated with or built directly upon the work of the others. They reviewed one another's work in the scientific literature³ and often attended the same conferences or meetings, some of which were designed specifically to study this new discipline. Typical was a 1944 conference in Princeton organized by Wiener and von Neumann for mathematicians, engineers, and physiologists to discuss problems of mutual interest in cybernetics and computing.⁴ Outsiders recognized them as forming a cohesive community of scholars devoted to a single area of research.

The introduction to Wiener's Cybernetics (1948) describes the sense of community and common purpose among these diversely trained scientists. Perhaps

William Aspray (B.A., M.A. Wesleyan, 1973; M.A. 1976, Ph.D. 1980, Wisconsin) is associate director of the Charles Babbage Institute for the History of Information Processing at the University of Minnesota in Minneapolis. Prior to joining CBI, Aspray taught mathematical science at Williams College and history of science at Harvard University. His historical … science, social diffusion of …, computer industry.

² It is not clear whether Pitts knew Turing personally. Wiener, McCulloch, and Pitts were associates at MIT. Von Neumann was involved in several conferences with McCulloch and Wiener. Turing worked with von Neumann in Princeton in 1937 and 1938, and with Shannon during the war at the Bell Telephone Laboratories in New York City. Wiener and McCulloch both visited Turing in England. Shannon had ample opportunity to become personally acquainted with Wiener, von Neumann, McCulloch, and Pitts, and he was certainly familiar with their work.

³ See, for example, Shannon's reviews of four papers authored or coauthored by Pitts, including the famous McCulloch and Pitts paper, in Mathematical Reviews 5 (1944), p. 45, and 6 (1945), p. 12.

⁴ Turing, as a confirmed loner and the only non-American among the major pioneers, is the least likely to have had contact with the group. Nevertheless, he discussed artificial-intelligence issues with Shannon (in fact, both had developed chess programs); Wiener sought him out in England and regarded him as one of the cyberneticists; McCulloch also went out of his way to visit him in Manchester, although there is reason to believe that Turing did not think highly of McCulloch (Hodges 1983); von Neumann offered Turing a job as his assistant at the Institute for Advanced Study to continue his theoretical work (a position Turing refused). Turing was also in contact with others in England working in this area, most notably Ross Ashby and Grey Walter.



more telling were both Wiener's attempts to develop an interdisciplinary science, known as cybernetics, around the concept of feedback information, and von Neumann's attempts to unify the work of Shannon, Turing, and McCulloch and Pitts into a theory of automata.

Mathematical logic does not have many real-world applications. It did here because it studies the laws of thought in abstraction, and more particularly because in the 1930s logicians were concerned with finding a mathematical characterization of the process of computation.

In fact, training in mathematical logic was the most common tie among these early pioneers. Shannon completed a master's thesis (Shannon 1940) in electrical engineering at the Massachusetts Institute of Technology on the application of mathematical logic to the analysis of switching systems. Wiener studied mathematical logic with Bertrand Russell and averred its influence on his later work in cybernetics (Wiener 1948). Turing received his doctoral degree from Princeton University for work on ordinal logics (Turing 1938). Pitts studied mathematical logic under Rudolf Carnap, while his associate, McCulloch, was a neurological psychologist interested in questions concerning the learning of logic and mathematics. Early in his career, von Neumann contributed significantly to the two branches of logic known as proof theory and set theory.

The similarity in their work does not end with the common use of mathematical logic to solve problems in a variety of fields. They also shared the conviction that the newly discovered concept of information could tie together, in a fundamental way, problems in different branches of science. While the content of their work shows this common goal,⁵ so does the social organization of their research. Despite widely varying backgrounds and research interests, these scientists were in close contact through collaboration, mutual review of one another's work, and frequent meetings.

The growth of this interdisciplinary science in the 1940s and early 1950s was at least partially the result of the massive cooperative and interdisciplinary scientific ventures of World War II that carried scientists to subjects beyond the scholarly boundaries of their specialties. At no previous time had there been such a mobilization of the scientific community. Wiener was led to develop cybernetics at least partly on account of his participation in Vannevar Bush's computing project at MIT, his work with Y. W. Lee on wave filters, and his collaboration with Julian Bigelow on fire control for antiaircraft artillery. Each of these projects was related to the war effort. Both Turing and von Neumann applied wartime computing experience to their postwar work on artificial intelligence and automata theory. Shannon's theory of communication resulted partly from the tremendous advances in communications engineering spawning from the development of radar and electronics and partly from the need for secure communications during the war.

The focus in this paper is on the sources and contributions of the six most important early figures in this movement: Shannon, Wiener, McCulloch, Pitts, Turing, and von Neumann. It is impossible in a work of this length to present an exhaustive treatment of the contributions of any of these pioneers, or even to mention the less centrally related work of their colleagues. Instead, the intent is to sketch the general outlines of this new conceptualization, with hope that others will contribute the fine brushstrokes necessary to complete the picture.

⁵ An article by E. Colin Cherry (1952) shows that the unity of this field was already recognized by outsiders.

Claude Shannon and the Mathematical Theory of Communication

While working at Bell Laboratories in the 1940s on communication problems relating to radio and telegraphic transmission, Shannon developed a general theory of communication that would treat of the transmission of any sort of information from one point to another in space or time.⁶ His aim was to give specific technical definitions of concepts general enough to obtain in any situation where information is manipulated or transmitted: concepts such as information, noise, transmitter, signal, receiver, and message.

At the heart of the theory was a new conceptualization of information. To make communication theory a scientific discipline, Shannon needed to provide a precise definition of information that transformed it into a physical parameter capable of quantification. He accomplished this transformation by distinguishing information from meaning. He reserved meaning for the content actually included in a particular message. He used information to refer to the number of different possible messages that could be carried along a channel, depending on the message's length and on the number of choices of symbols for transmission at

⁶ Note that Shannon had already begun to work on his theory of communication prior to his arrival at Bell Labs in 1941 (Shannon 1949a), and he continued to develop the theory while there. Witness his later paper (Shannon 1949b), originally written as a Bell Confidential Report based on work in the labs during the war (Shannon 1945). For more information, see the biography of Turing by Hodges (1983).



Selective Chronology of Events in the Scientific Conceptualization of Information

1390         First recorded printed use of the word "information"
1850s-1860s  Helmholtz, investigations of the physiology and psychology of sight and sound
             Bernard, work on homeostasis
1868         Maxwell, "On Governors"
1881         Ribot, Diseases of Memory
1890         James, Principles of Psychology
1891         Waldeyer, neurone theory
             Boltzmann, work on statistical physics
1906         Sherrington, The Integrative Action of the Nervous System
1915         Holt, The Freudian Wish
1919         Watson, Psychology from the Standpoint of a Behaviorist
             McCulloch, work on logic of transitive verbs
1923         Lashley, "The Behaviorist Interpretation of Consciousness"
1924         Nyquist, "Certain Factors Affecting Telegraph Speed"
             Szilard, work on entropy and statistical mechanics
1928         Hartley, "Transmission of Information"
1932         Von Neumann, Foundation of Quantum Mechanics
1930s        Development of recursive function theory (work of Gödel, Church, Kleene, Post, Turing)
1930s        Feedback concept studied by electrical engineers (work of Black, Nyquist, Bode)
1930s        Development of electroencephalography
1936         Turing, "On Computable Numbers"
1937-1938    Turing and von Neumann, first discussions about computing and artificial intelligence at Princeton University
1938         Shannon, "Symbolic Analysis of Relay and Switching Circuits" (published 1938; M.A. thesis 1940)
             Harvard Medical School seminar led by Cannon
             Wiener, work at MIT on computers
             Shannon, "Communication in the Presence of Noise" (submitted; published 1948)
             Macy Foundation meeting on central inhibition in the nervous system
1943         Rosenblueth, Wiener, and Bigelow, "Behavior, Purpose and Teleology"
1943         Turing, work with Shannon and Nyquist at Bell Laboratories in New York City
1943         McCulloch and Pitts, "A Logical Calculus of the Ideas Immanent in Nervous Activity"
             Pitts accepts position at MIT to work with Wiener
1944         Princeton conference organized by Wiener and von Neumann on topics related to computers and control
1945         Von Neumann, draft report on the EDVAC
1945         Shannon, A Mathematical Theory of Cryptography (Bell Laboratories report, published in 1949 under a different title)
1945         McCulloch, "A Heterarchy of Values Determined by the Topology of Nerve Nets"
1946         Macy Foundation meetings on feedback organized by McCulloch
1946         Turing, NPL report on the ACE computer
1946-1948    Burks, Goldstine, and von Neumann, IAS computer reports
1947         Ashby, "The Nervous System as Physical Machine"
1947         McCulloch and Pitts, work on a prosthetic device to enable the blind to read by ear
1947         McCulloch and Pitts, "How We Know Universals"
1948         Wiener, Cybernetics
1948         Turing, "Intelligent Machinery" (NPL report)
1948         Turing, first work programming the Manchester computer to carry out purely mental activities
1948         Von Neumann, "General and Logical Theory of Automata" (Hixon Symposium, Pasadena)
1948         Shannon, "The Mathematical Theory of Communication"
1949         Von Neumann, "Theory and Organization of Complicated Automata" (University of Illinois lectures)
1949         Shannon and Weaver, The Mathematical Theory of Communication
1950         Shannon, "Programming a Computer to Play Chess"
1950         McCulloch, "Machines That Think and Want"
1950         McCulloch, "Brain and Behavior"
1950         Turing, "Computing Machinery and Intelligence"
1950         Ashby, "The Cerebral Mechanism of Intelligent Action"
1952         Turing, "The Chemical Basis of Morphogenesis"
1952         McCulloch accepts position at MIT to be with Pitts and Wiener
1952         Ashby, Design for a Brain
1952         Von Neumann, "Probabilistic Logics and the Synthesis of Reliable Organisms from Unreliable Components" (California Institute of Technology lectures)
1952-1953    Von Neumann, "The Theory of Automata: Construction, Reproduction, Homogeneity"
1953         Walter, The Living Brain
1953         Turing, "Digital Computers Applied to Games: Chess"
1953         Turing, "Some Calculations on the Riemann Zeta-Function"
1956         Von Neumann, The Computer and the Brain (prepared for the Silliman Lectures, Yale; published 1958)
1956         Ashby, An Introduction to Cybernetics
1956         Ashby, "Design for an Intelligence Amplifier"

point in the message. Information in Shannon's theory was a measure of orderliness (as opposed to randomness) in that it indicated the number of possible messages from which a particular message to be sent was chosen. The larger the number of possibilities, the larger the amount of information transmitted, since the actual message is distinguished from a larger set of possible alternatives.

Shannon admitted the importance of previous work in communications engineering to his interest in a general theory of information, in particular the work of Harry Nyquist and R. V. Hartley.

The recent development of various methods of modulation such as PCM and PPM which exchange bandwidth for signal-to-noise ratio has intensified the interest in a general theory of communication. A basis



for such a theory is contained in the important papers of Nyquist and Hartley on this subject. In this paper we will extend the theory to include a number of new factors. (Shannon 1948, pp. 31-32)

Nyquist was conducting research at Bell Laboratories on the problem of improving transmission speeds over telegraph wires when he wrote a paper on the transmission of intelligence (Nyquist 1924).⁷ This paper discussed two factors affecting the maximum speed at which intelligence can be transmitted by telegraph: signal shaping and choice of codes. As Nyquist stated,

The first is concerned with the best shape to be impressed on the transmitting medium so as to permit of greater speed without undue interference either in the circuit under consideration or in those adjacent, while the latter deals with the choice of codes which will permit of transmitting a maximum amount of intelligence with a given number of signal elements. (Nyquist 1924, p. 324)

While most of Nyquist's article considered the practical engineering problems associated with transmitting information over telegraph wires, one theoretical section, entitled "Theoretical Possibilities Using Codes with Different Numbers of Current Values," was of importance to Shannon's work. In this section Nyquist stated the first logarithmic rule governing the transmission of information.

Nyquist proved that the speed at which intelligence can be transmitted over a telegraph circuit obeys the law

W = k log m

where W is the speed of transmission of intelligence, m is the number of current values that can be transmitted, and k is a constant. He also prepared the following table, by which he illustrated the advantage of using a greater number of current values for transmitting messages (Nyquist 1924, p. 334).

Number of         Relative Amount of Intelligence That Can Be Transmitted
Current Values    with the Given Number of Signal Elements
2                 100
3                 158
4                 200
5                 230
8                 300
16                400

Although Nyquist's work was primarily empirical and concerned with engineering issues, and although his use of the term intelligence masked the difference between information and meaning, his work was important for presenting the first statement of a logarithmic law for communication and the first examination of the theoretical bounds for ideal codes for the transmission of information. Shannon later gave a more general logarithmic rule as the fundamental law of communication theory, which stated that the quantity of information is directly proportional to the logarithm of the number of possible messages. Nyquist's law became a specific case of Shannon's law, because the number of current values is directly related to the number of symbols that can be transmitted. Nyquist was aware of this relation, as his definition of speed of transmission indicates.

By the speed of transmission of intelligence is meant the number of characters, representing different letters, figures, etc., which can be transmitted in a given length of time. (Nyquist 1924, p. 333)

By "letters, figures, etc." he meant a measure proportional to what Shannon later would call bits of information. Nyquist's table, listing the relative amount of intelligence transmitted, illustrates the gain in information consequent to a greater number of possible choices. That he listed the relative amount indicates his awareness that there is an important relation between the number of figures and the amount of intelligence (information) being transmitted. This relation is at the heart of Shannon's theory of communication. Nyquist did not generalize his concept of intelligence beyond telegraphic transmissions, however.

⁷ Nyquist worked closely with Shannon during the war. He also had discussions about communications theory and engineering with Turing while Turing was a visitor at the labs in 1943. See Hodges (1983) for details.

Shannon's other predecessor in information theory, Hartley, was also a research engineer at Bell Laboratories. Hartley's intention was to establish a quantitative measure to compare the capacities of various systems to transmit information. His hope was to provide a theory general enough to include telegraphy, telephony, television, and picture transmission: communications over both wire and radio paths. His investigation (Hartley 1928) began with an attempt to establish theoretical limits of information transmission under idealized situations. This important step led him away from the empirical studies of engineering adopted by most earlier researchers and toward a mathematical theory of communication.⁸

Before turning to concrete engineering problems, Hartley addressed more abstract considerations. He began by making the first attempt to distinguish a notion of information amenable to use in a scientific context. He realized that any scientifically usable definition of information should be based on what he called physical instead of psychological considerations. He meant that information is an idea involving


quantity of physical data and should not be confused with the meaning of a message.

The capacity of a system to transmit a particular sequence of symbols depends upon the possibility of distinguishing at the receiving end between the results of the various selections made at the sending end. The operation of recognizing from the received record the sequence of symbols selected at the sending end may be carried out by those of us who are not familiar with the Morse code. We could do this equally well for a sequence representing a consciously chosen message and for one sent out by the automatic selecting device already referred to. A trained operator, however, would say that the sequence sent out by the automatic device was not intelligible. The reason for this is that only a limited number of the possible sequences have been assigned meanings common to him and the sending operator. Thus the number of symbols available to the sending operator at certain of his selections is here limited by psychological rather than physical considerations. Other operators using other codes might make other selections. Hence in estimating the capacity of the physical system to transmit information we should ignore the question of interpretation, make each selection perfectly arbitrary, and base our results on the possibility of the receiver's distinguishing the result of selecting any one symbol from that of selecting any other. By this means the psychological factors and their variations are eliminated and it becomes possible to set up a definite quantitative measure of information based on physical considerations alone. (Hartley 1928, pp. 537-538; emphasis added)

Hartley distinguished between psychological and physical considerations, that is, between meaning and information. The latter he defined as the number of possible messages, independent of whether they are meaningful. He used this definition of information to derive a logarithmic law for the transmission of information:

H = K log sⁿ

where H is the amount of information, K is a constant, n is the number of symbols in the message, s is the size of the set of symbols, and therefore sⁿ is the number of possible symbolic sequences of the specified length. His law included the case of telegraphy and subsumed Nyquist's earlier law. Once the law for the discrete transmission of information had been established, Hartley showed how it could be modified to treat continuous transmission of information, as in the case of telephone voice transmission.

Hartley turned next to questions of interference, examining how the distortions of a system limit the rate of selection at which differences between transmitted symbols may be distinguished with certainty. His concern was with the interference caused by the storage and subsequent release of energy in induction and capacitance, a source of noise

of great concern to electrical engineers at the time. He found that the total amount of information that could be transmitted over a steady-state system of alternating currents limited to a given frequency-range is proportional to the product of the frequency-range on which it transmits and the time during which it is available for transmission.

Hartley had arrived at many of the most important ideas of the mathematical theory of communication: the difference between information and meaning, information as a physical quantity, the logarithmic rule for transmission of information, and the concept of noise as an impediment in the transmission of information.

Hartley's aim had been to construct a theory capable of evaluating the information transmitted by any of the standard communication technologies. Starting with these ideas, Shannon developed a general theory of communication, not restricted to the study of technologies designed specifically for communication. Later, Shannon's collaborator, Warren Weaver, described the theory clearly.

The word communication will be used here in a very broad sense to include all of the procedures by which one mind may affect another. This, of course, involves not only written and oral speech, but also music, the pictorial arts, the theatre, the ballet, and in fact all human behavior. In some connections it may be desirable to use a still broader definition of communication, namely, one which would include the procedures by means of which one mechanism (say automatic equipment to track an airplane and to compute its probable future positions) affects another mechanism (say a guided missile chasing this airplane). (Shannon and Weaver 1949, Introductory Note)
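The logarithmic laws of Nyquist and Hartley described above can be checked numerically. The following sketch is illustrative only; the function names are ours, and so are the normalizations (Nyquist's constant k is chosen so that two current values score 100, matching the first row of his table, and Hartley's K is taken as 1 with base-2 logarithms, so that H comes out in what Shannon would later call bits):

```python
import math

def nyquist_relative_intelligence(m):
    # Nyquist: W = k log m, with k chosen so that m = 2 current
    # values give a relative score of 100.
    return 100 * math.log2(m)

def hartley_information(n, s):
    # Hartley: H = K log s**n = n log s, with K = 1 and base-2
    # logarithms, so H is measured in bits.
    return n * math.log2(s)

# Recompute Nyquist's table of relative intelligence.
for m in (2, 3, 4, 5, 8, 16):
    print(m, round(nyquist_relative_intelligence(m)))

# Hartley: a 10-character message over a 26-symbol alphabet.
print(hartley_information(10, 26))  # about 47.0 bits
```

The computed rows agree with Nyquist's table except m = 5, which comes out 232 rather than the printed 230, presumably a rounding choice in the 1924 table. Note also how Hartley's law contains Nyquist's as the case of a single selection (n = 1) whose symbol set is the set of current values.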

What began as a study of transmission over telegraph lines was developed by Shannon into a general theory of communication applicable to telegraph, telephone, radio, television, and computing machines: in fact, to any system, physical or biological, in which information is being transferred or manipulated through time or space.

In Shannon's theory a communication system consists of five components related to one another, as illustrated in Figure 1 (Shannon and Weaver 1949, p. 34). These components are:

1. An information source which produces a message or sequence of messages to be communicated to the receiving terminal. . . .
2. A transmitter which operates on the message in some way to produce a signal suitable for transmission over the channel. . . .

⁸ Cherry (1952) reviews the research emanating from or related to Hartley's work.


    source Transmitter

    Message MessageSignal IL

    then it is arbitrarily said hat the information,associatedwith this situation, is unity. Note that it ismisleading although often convenient) to say that oneor the other message onveys unit information. Theconcept of information appliesnot to the individualmessagesas the concept of meaning would), but ratherto the situation as a whole, the unit informationindicating that in this situation one has an amount offreedomof choice, n selecting a message, hich isconvenient to regard asa standard or unit amount.NoiseS O W X (Shannon and Weaver 1949,pp. 8-9)

Figure 1. Schematic diagram of a general communication system. (Redrawn from Shannon and Weaver (1949).)

3. The channel is merely the medium used to transmit the signal from transmitter to receiver. . . .
4. The receiver ordinarily performs the inverse operation of that done by the transmitter, reconstructing the message from the signal.
5. The destination is the person (or thing) for whom the message is intended.

The importance of this characterization is its applicability to a wide variety of communication problems, provided the five components are appropriately interpreted. For example, it applies equally well to conversations between humans, interactions between machines, and even to communication between parts of an organism. Properly interpreted, the communication between the stomach and the brain and between the target and the guided missile could serve as examples of a communication system. Using this theory, previously unrecognized connections between the biological and the physical worlds could be unmasked.

While Hartley recognized that a distinction must be drawn between information and meaning, Shannon made the distinction by giving the first definition of information sufficiently precise for scientific discourse. Weaver described the importance of this definition:

The word information, in this theory, is used in a special sense that must not be confused with its ordinary usage. In particular, information must not be confused with meaning. In fact, two messages, one of which is heavily loaded with meaning and the other of which is pure nonsense, can be exactly equivalent, from the present viewpoint, as regards information. . . .
To be sure, this word information in communication theory relates not so much to what you do say, as to what you could say. That is, information is a measure of one's freedom of choice when one selects a message. If one is confronted with a very elementary situation where he has to choose one of two alternative messages . . .

Shannon recognized that information could be measured by any increasing monotonic function, provided the number of possible messages is finite. He chose from among these the logarithmic function for the same reason as Hartley: that it accords well with our intuition of what the appropriate measure should be. We intuitively feel that two punched cards should convey twice the information of one punched card. Assuming one card can carry n symbols, two cards will carry n^2 combinations. The logarithmic function then measures the two cards as conveying log n^2 = 2 log n bits of information, twice that conveyed by one card (log n). Shannon chose base 2 for his logarithmic measure since log2 assigns one unit of information to a switch with two positions. Then N two-position switches could store log2 2^N (= N) binary digits of information.9 If there were N equiprobable choices, the amount of information would be given by log2 N. Shannon generalized this equation to the nonequiprobable situation, where the amount of information H would be given by

H = -(p1 log2 p1 + ... + pn log2 pn)

if the choices have probabilities p1, . . . , pn.

Shannon recognized that this formulation of information is closely related to the concept of entropy, since the information parameter measures the orderliness of the communication channel.10

Quantities of the form H = -K Σ pi log pi (the constant K merely amounts to a choice of a unit of measure) play a central role . . . as measures of information, choice and uncertainty. The form of H will be recognized as that of entropy as defined in certain formulations of statistical mechanics where pi is the probability of a system being in cell i of its phase space. H is then, for example, the H in Boltzmann's famous H theorem. (Shannon and Weaver 1949)

9 "Binary digits" was shortened to "bits" by John Tukey, a Princeton University professor who also worked at Bell Laboratories (see the Annals, Vol. 6, No. 2, April 1984, pp. 152-155). The introduction of a new term such as "bit" is a good indication of the introduction of a new concept.
10 Shannon refers the reader to Tolman's book (1938) on statistical mechanics.
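Shannon's measure is easy to verify numerically. The following sketch is a modern Python illustration (not part of the original article); it checks that the equiprobable case reduces to log2 N, that a fair two-position switch carries exactly one bit, and that a biased choice carries less:

```python
import math

def shannon_entropy(probs):
    """H = -(p1 log2 p1 + ... + pn log2 pn), in bits.
    Terms with probability 0 contribute nothing to the sum."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# N equiprobable choices reduce to log2 N bits of information:
print(shannon_entropy([0.25] * 4))            # 2.0  (= log2 4)

# A two-position switch with equal probabilities carries one bit:
print(shannon_entropy([0.5, 0.5]))            # 1.0

# A biased choice conveys less than one bit:
print(round(shannon_entropy([0.9, 0.1]), 3))  # 0.469

# Two punched cards of n symbols each: log2(n**2) = 2 * log2(n)
n = 45
assert math.isclose(math.log2(n ** 2), 2 * math.log2(n))
```

The last assertion restates the punched-card intuition Hartley and Shannon appeal to: doubling the number of cards doubles the information, because the logarithm turns the multiplication of combinations into addition.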

Annals of the History of Computing, Volume 7, Number 2, April 1985 • 123



Entropy has a long history in physics, and during the twentieth century had already become closely associated with the amount of information in a physical system. Weaver carefully credited these roots of Shannon's theory:

Dr. Shannon's work roots back, as von Neumann has pointed out, to Boltzmann's work on statistical physics (1894), that entropy is related to missing information, inasmuch as it is related to the number of alternatives which remain possible to a physical system after all the macroscopically observable information concerning it has been recorded. L. Szilard (Zsch. f. Phys., Vol. 53, 1925) extended this idea to a general discussion of information in physics, and von Neumann (Math. Foundation of Quantum Mechanics, Berlin, 1932, Chap. V) treated information in quantum mechanics and particle physics. (Shannon and Weaver 1949, p. 3, fn)

The close relation of information to entropy is not surprising, for information is related to the amount of freedom of choice one has in constructing messages. The connection between thermodynamics, statistical mechanics, and communication theory suggests that communication theory involves a basic and important property of the physical universe and is not simply a by-product of modern communication technology.11

Shannon used his theory to prove several theoretical results about communication systems and to demonstrate its applications to the communications industry. He established theoretical limits applicable to practical communication systems.12 His theory furnished a definition of information that could be applied in a wide variety of physical settings. It provided a mathematical basis for the study of theoretical problems of information transmission and processing. Shannon and Weaver continued the theoretical study of this subject. Meanwhile, the theory provided the basis for interdisciplinary information studies carried out on electronic computing machines and on physical and biological feedback systems.

While Shannon concentrated mainly on applications of information theory to communications engineering, Wiener stressed its application to control problems

11 In the Szilard article cited by Weaver, "Über die Entropieverminderung in einem thermodynamischen System bei Eingriffen intelligenter Wesen," p. 840, Szilard discusses Maxwell's Demon. He points out that the entropy lost by the gas through the separation of the high- and low-energy particles corresponds to the information used by the Demon to decide whether or not to let a particle through.
12 According to Hodges (1983), Bell Labs was beginning to use Shannon's ideas in its research by 1943. Also see Cherry (1952).

involving other physical and complicated biological phenomena. Wiener recognized that several diverse problems he had confronted during the war had yielded to quite similar approaches involving feedback control and communication mechanisms. This realization was the beginning of his new interdisciplinary science, cybernetics, which considered problems of control and communication wherever they occurred.13 Many of his subsequent scientific projects were designed to illustrate the power of cybernetics in understanding biological functioning.

Wiener recognized the importance of his war-related work to his later development of cybernetics. Elaborating on the conviction he shared with physiologist Arturo Rosenblueth that the most fruitful areas for the growth of the sciences were those which had been neglected as a no-man's land between the various established fields (Wiener 1948, p. 8), he wrote:

We had agreed on these matters long before we had chosen the field of our joint investigations and our respective parts in them. The deciding factor in this new step was the war. I had known for a long time that if a national emergency should come, my function in it would be determined largely by two things: my close contact with the program of computing machines developed by Dr. Vannevar Bush, and my own joint work with Dr. Yuk Wing Lee on the design of electrical networks. In fact, both proved important. (Wiener 1948, p. 9)

In 1940 Wiener began work, in contact with Bush at MIT, on the development of computing machinery for the solution of partial differential equations. One outcome of this project was a proposal by Wiener, purportedly made to Bush, of features to be incorporated into future computing machines. Included were many of the features critical in the following decade to the development of the modern computer: numerical instead of analog central adding and multiplying equipment, electronic tubes instead of gears or mechanical relays for switching, binary instead of decimal representation, completely built-in logical facilities with no human intervention necessary after the introduction of data, and an incorporated memory with capability for rapid storage, recall, and erasure. For Wiener the importance of, and presumably the source of, these suggestions was that they were all "ideas which are of interest in connection with the study of the nervous system" (Wiener 1948, p. 11). Wiener may have been the first to compare explicitly features of the electronic computer and the human brain.14

13 Wiener's interest in these subjects had first been piqued before the war in an interdisciplinary seminar he attended at the Harvard Medical School (discussed later in this section). See Wiener's introduction to Cybernetics (1948) for details.


W. Aspray • Conceptualization of Information

His comments certainly illustrated the similarity of structure in diverse settings, which he emphasized later in his cybernetics program.

Another war-related program, undoubtedly the most important to Wiener's formulation of cybernetics, involved the development of fire-control apparatus for antiaircraft artillery. This problem was made urgent at the beginning of the war by the threat of German air attack on the weakly defended English homeland. The appreciable increase in velocity of the new military aircraft made earlier methods for directing antiaircraft fire obsolete. Wiener's research suggested that a new device for antiaircraft equipment might effectively incorporate a feedback system to direct fire. Thus Wiener and Julian Bigelow worked toward a theory of prediction (for the flight of the aircraft) and on its effective application to the antiaircraft problem at hand.

It will be seen that for the second time I had become engaged in the study of a mechanico-electrical system which was designed to usurp a specifically human function - in the first case, the execution of a complicated pattern of computation; and in the second, the forecasting of the future. (Wiener 1948, p. 13)

Bigelow and Wiener recognized the importance of the concept of feedback in a number of different mechanical and biological systems. For example, the movement of the tiller to regulate the direction of a ship was shown to involve a feedback process similar to that used in hand-eye coordinations necessary to pick up a pencil. (The Wiener-Bigelow work was apparently never implemented in a fire-control device.) Wiener quickly saw that the mathematics of feedback control was closely associated with aspects of statistical mechanics and information theory.

On the communication engineering plane, it had already become clear to Mr. Bigelow and myself that the problems of control engineering and of communication engineering were inseparable, and that they centered not around the technique of electrical engineering but around the much more fundamental notion of the message, whether this should be transmitted by electrical, mechanical, or nervous means. The message is a discrete or continuous sequence of measurable events distributed in time - precisely what is called a time-series by the statisticians. (Wiener 1948, p. 16)

These feedback problems often reduced to partial differential equations representing the stability of the system. Wiener's third war-related project, the work with Lee on wave filters, reinforced the close tie to information theory; the purpose of that research was to remove extraneous background noise from electrical networks.

Wiener contributed significantly to the mathematical theory underlying these diverse engineering problems. Like Shannon, Wiener was moving from the art of engineering to the precision of science. Using the statistical methods of time-series analysis, he was able to show that the problem of prediction could be solved by the established mathematical technique of minimization.

Minimization problems of this type belong to a recognized branch of mathematics, the calculus of variations, and this branch has a recognized technique. With the aid of this technique, we were able to obtain an explicit best solution of the problem of predicting the future of a time series, given its statistical nature; and even further, to achieve a physical realization of this solution by a constructible apparatus.
Once we had done this, at least one problem of engineering design took on a completely new aspect. In general, engineering design has been held to be an art rather than a science. By reducing a problem of this sort to a minimization principle, we had established the subject on a far more scientific basis. It occurred to us that this was not an isolated case, but that there was a whole region of engineering work in which similar design problems could be solved by the methods of the calculus of variations. (Wiener 1948, p. 17)

The recurrence of similar problems of control and communication in widely diverse fields of engineering and the availability of a mathematical theory with which to organize these problems led Wiener to the creation of his new interdisciplinary science of cybernetics. "We have decided to call the entire field of control and communication theory, whether in the machine or in the animal, by the name of Cybernetics" (Wiener 1948, p. 19).

Long before Wiener's formulation of the science of cybernetics in 1947, results had been obtained that Wiener included as cybernetic. The word cybernetics derived from the Greek kybernetes (steersman). Kybernetes in Latin was gubernator, from which our word governor derived. The connotations, both of a steersman of public policy and of a self-regulating mechanism on a steam engine, are faithful to the word's ancient roots. The governor on a steam engine is a feedback mechanism that increases or decreases the speed of the engine depending on its current speed. Maxwell published a paper (1868) giving a mathematical characterization of governors. Similar feedback

14 Both Turing and von Neumann made direct comparisons between the computer and the brain publicly in the 1950s. McCulloch and Pitts might also be regarded as having made this comparison in their joint paper (McCulloch and Pitts 1943). It is hard to determine when, if ever, Wiener first made the comparison. He suggests in the introduction to Cybernetics a date as early as 1940. The author and others have searched unsuccessfully for written documentation.



    hanisms were discussed by the physiologist Ber-his discussion of homeostasis, the means byich an organism regulates its internal equilibrium.history of feedback control, see Mayr (1970).Cannon (1932) for a discussion of Bernardsrk.) In the 1930s and 1940s Nyquist, H. S. Black,H. W. Bode renewed the study of feedback inpractical and theoretical studies of amplifiersother electrical devices.Although Wiener only arrived at the name cyber-in 1947, as early as 1942 he had participated inmeetings to discuss problems centralthe subject. One early meeting held in New York inunder the auspices of the Josiah Macy Founda-s devoted to problems of central inhibition innervous system. Bigelow, Rosenblueth, and Wie-read a joint paper, Behavior, Purpose, Teleology

    et al. 1943), which used cybernetic prin-examine the functioning of the mind. Vonmann and Wiener called another interdisciplinarying at Princeton early in 1944. Engineers, phys-and mathematicians were invited to discussnetic principles and computing design. As Wie-At the end of the meeting, it had becomeclear to all thatthere was a substantial commonbasisof ideasbetweenthe workers of the different fields, that people n eachgroup could already usenotions which had beenbetterdevelopedby the others, and that someattempt shouldbe made o achieve a commonvocabulary. (Wiener 1948,P. 23)In fact, from discussions with electrical engineersand down the East Coast, Wiener reported, Every-re we met with a sympathetic hearing, and thelary of the engineers soon became contami-with the terms of the neurophysiologist and theologist (Wiener 1948, p. 23). In 1946 McCullochfor a series of meetings to be held in Newthe subject of feedback-again under thes of the Josiah Macy Foundation. Among those

    a number of these meetings were the math-Wiener, von Neumann, and Pitts, thets McCulloch, Lorente de No, and Rosen-and the engineer Herman H. Goldstine (whos associated with the ENIAC, EDVAC, and IAS com-Thus, there was widespread interac-the United States among the participants innew information sciences. Typical of internationalwas a visit by Wiener to England andwhere he had a chance to exchange informa-on cybernetics and artificial intelligence withthe National Physical Laboratory atand mathematical results on the relationstatistics and communication engineering withnch mathematicians at a meeting in Nancy.

    Wieners cybernetics work paid at least as muchattention to biological as to electromechanical appli-cations. This interest was rooted in his participationin a series of informal monthly discussions in the1930s on scientific method led by Walter Cannon atthe Harvard Medical School. A few members of theMIT faculty, including Wiener, attended these meet-ings. Here Wiener met Rosenblueth, with whom hewas to collaborate on biocybernetics throughout theremainder of his career.Bigelow and Wiener, perhaps as a result of theirwork on antiaircraft artillery, pointed to feedback asan important factor in voluntary activity. To illus-trate, Wiener described the process of picking up apencil. He pointed out that we do not will certainmuscles to take certain actions-instead, we will topick the pencil up.

    Once we have determined on this, our motion proceedsin such a way that we may say roughly that the amountby which the pencil is not yet picked up is decreased teachstage.This part of the action is not in fullconsciousness.To perform an action in such a manner, there must bea report to the nervous system,consciousorunconscious, f the amount by which we have failed topick the pencil up at each nstant. (Wiener 1948,p. 14)They advanced their claims about the biological im-portance of feedback mechanisms so far as to usethem to explain pathological conditions retarding vol-untary actions such as ataxia (where the feedbacksystem is deficient) and purpose tremor (where thefeedback system is overactive). Such initial successesassured Wiener that this approach could provide val-uable new insights into neurophysiology.

    We thus found a most significant confirmation of ourhypothesisconcerning the nature of at least somevoluntary activity. It will be noted that our point of viewconsiderably ranscended hat current amongneurophysiologists.The central nervous systemnolonger appearsas a self-contained organ, receivinginputs from the senses nd discharging nto the muscles.On the contrary, someof its most characteristicactivities are explicable only as circular processes,emerging rom the nervous system nto the muscles, ndre-entering the nervous system hrough the senseorgans,whether they be proprioceptors or organsof thespecialsenses. his seemedo us to mark a new step inthe study of that part of neurophysiology whichconcernsnot solely the elementary processes f nervesand synapses ut the performance of the nervous systemas an integrated whole. (Wiener 1948,p. 15)The revelation that cybernetics provided a new ap-

    proach to neurophysiology prompted the joint paperby Rosenblueth, Wiener, and Bigelow. As the titleindicates, they gave an outline of behavior, purpose,and teleology from a cybernetic approach. They arguedl Annals of the History of Computing, Volume 7, Number 2, April 1985

  • 8/7/2019 Scientific Conceptualization of Information

    11/24

    W. Aspray l Conceptualization of Information

    teleological behavior thus becomes synonymousth behavior controlled by negative feedback, andtherefore in precision by a sufficiently restricted(Rosenblueth et al. 1943, pp. 22-23).also argued that the same broad classificationsbehavior (see Figure 2) hold for machines as holdr living organisms. The differences, they main-are in the way these functional similarities areout: colloids versus metals, large versus smallces in energy potentials, temporal versus spa-multiplication of effects, etc.

    BehaviorehaviorNonactivk Jctiveonactivk Jctive(passwe)passive)

    / Pl$osef ulPl$osef ulonpurposefulonpurposeful(random)random)

    Nonfeedback hbackonfeedback hback(nonteleological)nonteleological) (teleological)teleological)Nonpredictiveonpredictive / 1.1.Predictweredictwe

    (nonextrapolative)nonextrapolative) (extrapolative)extrapolative)For the most part, however, Wieners work in bio-was less philosophical and more physio-joint paper with Rosenblueth andwould indicate. More typical was a joint proj-between Rosenblueth and Wiener on the muscleof a cat (Wiener 1948, pp. 28-30). In this

    they used the (cybernetic) methods of McCallon servomechanisms to analyze the system iname way one would study an electrical or me-system, while using data provided by theirexperimentation on cats. For the re-er of his career Wiener split his time betweendge and Mexico City, where Rosenbluethat the national medical school. Wiener couldcontinue his collaboration with Rosenblueth onseries of physiological projects utilizing the cyber-approach to understand physiological processesbiological organisms.

    Figure 2. Behavior classifications. Note that predictivemeans order of prediction (depending on the number ofparameters). (Redrawn from Rosenblueth et al. (1943, p.21).)

    Walter Pitts, and theof Mathematical Models of the

    r biological application of the new informations to the study of nerve systems, and into the study of the human brain. Mathe-al models resulted, based partly on physiologypartly on philosophy. The most famous applica-was made in a joint paper by McCulloch and Pittsin which they presented a mathematical modelthe neural networks of the brain based on Carnapsalculus and on Turings work on theoretical

    rial functioning, that this functioning is amenable toscientific study, and consequently that the brain isamenable to mathematical analysis. Beginning withthe work of H. von Helmholtz, T. A. Ribot, and Wil-liam James at the end of the nineteenth century, phys-iological psychology became identified with the studyof the physiological underpinnings of behavior and ex-perience (Murphy 1949). The subject drew heavily onphysiological research concerning the central nervoussystem. Of special importance was the work of W.Waldeyer and C. S. Sherrington (Sherrington 1906).Waldeyers neurone theory, which argued for the in-dependence of the nerve cells and the importance ofthe synapses, was quickly accepted by psychologistsand became the basis of the next generation of phys-iological study of the brain. Sherringtons work on re-flex arc was instrumental in convincing psychologiststhat they should consider a neurophysiological ap-proach. McCulloch pointed explicitly to Sherringtonswork as a precursor of his own research. Both menadopted a highly idealized approach. Sherrington re-alized that his model of simple reflex was not physio-logically precise, but only a convenient abstraction.McCulloch and Pitts made a similar claim for theirmodel of neuron nets, but their model was even less re-alistic than the Sherrington model.

    The application of the information sciences to psy-was linked to several active movements withinThe rise of physiological psychology, thement of functionalism, the growth of behav-and the infusion of materialism into the bio-and psychological sciences all contributed tostudy by mathematical models of the functioninge brain.

    Research in physiological psychology had been car-ried out since the beginning of the century. Its impor-tance increased rapidly in the 1930s because of twodevelopments: the implementation of electroence-phalography enabled researchers to make precisemeasurements of the electrical activity of the brain;and the growth of mathematical biology, especiallyunder Nicholas Rashevskys Chicago school, contrib-uted a precise, mathematical theory of the functioningof the brain that could be tested experimentallv.15 ThePhysiological psychology was important to infor-on science because it contributed the idea thatcan understand the brain by examining its mate- I5 The Rashevsky scho ol published mainly in its own journal, Bul-letin of Mathematical Biophysics.Annals of the History of Computing, Volume 7, Number 2, April 1985 l 127

  • 8/7/2019 Scientific Conceptualization of Information

    12/24

    l Conceptualization of Information

    on the material properties of the brain, chanical thinking machines.17 Watsons behaviorismemphasis on its functioning instead of on its states did not convince the majority of American psycholo-consciousness, and the mathematical approach of gists, but a group of dedicated behaviorists did conductall set the stage for a mathematical theory experiments using the condition-response method.the functioning brain as an information processor. Most influential on McCulloch and Pitts was the workMost physiological psychologists and many other in the 1930s of the behaviorist K. S. Lashley, whoseists accepted the functionalist position. In viewpoint is indicated by the following quotation.ples of Psychology (1890) William James argued To me the essence f behaviorism s the belief that thely that mind should be conceived of dynam- study of man will reveal nothing except what isnot structurally, and by the end of the nine- adequatelydescribable n the conceptsof mechanicsandh century there was a consensus that psychology chemistry, and this study far outweighs he questionofconcentrate on mental activity instead of on the method by which the study is conducted. (Lashleyof experience. E. B. Holt took a radical-and 1923, p. 244)t cybernetic-position16 (Holt 1915) toward psy- McCulloch was trained within this psychologicalwhen he argued that consciousness is merely tradition of experimental epistemology. As an under-servomotor adjustment to the object under consid- graduate at Haverford and Yale, he majored in philos-on. As one historian of psychology has assessed ophy and psychology. He then went to Columbia,

    importance of functionalism, where he received a masters degree in psychology forFunctionalism did not long maintain itself asa school; work in experimental aesthetics. Afterward, he en-but much of the emphasisived on in behaviorism . . . tered the College of Physicians and Surgeons of Co-and in the increasing endency to ask lessabout lumbia University, where he studied the physiology ofconsciousness, ore about activity. (Murphy 1949,p. the nervous system.223) In 1928 was n neurology at Bellevue Hospital and inThe importance of functionalism to information 1930at Rockland State Hospital for the Insane, but myis clear. It concentrated on the functional purpose to manufacture a logic of transitive verbs)never changed. t was hen that I encounteredEilhardof the brain, and represented the brain as a von Doramus, he great philosophic student of(of information)-as a doer as well as a psychiatry, from whom I learned to understand the

    logical difficulties of true cases f schizophrenia and theBehaviorist psychology, by concentrating on behav- developmentof psychopathia-not merely clinically, asand not consciousness, helped to break down the he had learned hem of Berger, Birnbaum, Bumke,between the mental behavior of humans Hoche, Westphal, Kahn, and others-but as hethe information processing of lower animals and understood hem from his friendship with Bertrand. This step assisted the acceptance of a uni- Russell,Heidegger,Whitehead, and Northrop-underd theory of information processors, whether in hu- the last of whom he wrote his great unpublished hesis,The Logical Structure of the Mind: An Inquiry into theor machines.American behaviorism was a revolt against the old- Foundations of Psychology and Psychiatry. It is to himand to our mutual friend, CharlesHolden Prescott, thatintrospective psychology of Wilhelm Wundt and I am chiefly indebted for my understandingof paranoiaB. Titchener. As part of the general shift in scien- uera and of the possibility of making the scientificc attitude, there was a movement in psychology method applicable o systemsof many degrees fard materialism in the last half of the nineteenth freedom. (McCulloch 1965,pp. 2-3)y. Because behavior is observable, it can be McCulloch left Rockland to return to Yale, whereto scientific study. J. B. Watson, the leader he studied experimental epistemology with Dusser deAmerican behaviorism, concentrated on scientific Barenne. Upon de Barennes death, he moved to theas effector, receptor, and learning as University of Illinois Medical School in Chicago as ato the old concepts of sensation, feeling, and professor of psychiatry; he continued his work one (Watson 1919). Watson conceived of mental experimental epistemology and began his collabora-ons as a type of internal behavior that could be tion with Pitts. He completed his career at the MITby scientific probing. 
Turing adopted a similar Research Laboratory of Electronics, where he collab-in his unified treatment of human and me- orated with Pitts, Wiener, and others in the study of

    electrical circuit theory of the brain.Of course, this is an anachr onistic characterization becaus e ther was not yet invented and the principles of cybernetics I7 Th is attitude is most evident in Turing s 1950 paper, Computingnot yet been enunciated. Nevertheless, there is a striking Machinery and Intelligence, but his 1937 paper, On Computableto those later ideas. Numbers, also sugge sts the view.


Using as axioms the rules McCulloch prescribed for his psychons and as logical framework an amalgam of Carnap's logical calculus and Russell and Whitehead's Principia Mathematica, McCulloch and Pitts presented a logical model of neuron nets showing their functional similarity to Turing's computing machines.19

What Pitts and I had shown was that neurons that could be excited or inhibited, given a proper net, could extract any configuration of signals in its input. Because the form of the entire argument was strictly logical, and because Gödel had arithmetized logic, we had proved, in substance, the equivalence of all general Turing machines - man-made or begotten. (McCulloch 1965, pp. 9-10)

As von Neumann emphasized in his "General and Logical Theory of Automata" (1951), the essence of McCulloch and Pitts's contribution was to show how any functioning of the brain that could be described clearly and unambiguously in a finite number of words could be expressed as one of their formal neuron nets. The close relationship between Turing machines and neuron nets was one of the goals of the authors; by 1945 they understood that neuron nets, when supplied with an appropriate analog of Turing's infinite tape, are equivalent to Turing machines.20 With the Turing machines providing an abstract characterization of thinking in the machine world and McCulloch and Pitts's neuron nets providing one in the biological world, the equivalence result suggested a unified theory of thought that broke down barriers between the physical and biological worlds.

Their paper not only pointed out the similarity in abstract function between the human brain and computing devices; it also provided a way of conceiving of the brain as a machine in a more precise way than had been available before. It provided a means for the study of the brain, starting from a precise

But we had done more than this, thanks to Pitts' modulo mathematics. In looking into circuits composed of closed paths of neurons wherein signals could

19 It is more proper to say that they had shown an exact correspondence between the class of Turing machines and the class of neural nets, such that each Turing machine corresponded to a functionally equivalent neural net, and vice versa. Later, McCulloch and Pitts explicitly stated that Turing's work on computable numbers was their inspiration for the neural net paper. See McCulloch's comment in the discussion following von Neumann (1951). McCulloch also alludes to this (1965, p. 9).
20 Arthur Burks has pointed out that because the threshold functions are all positive, the addition of a clocked source pulse will make the nets universal (private communication). Von Neumann (1945) was the first to see that the switches and delays of a stored-program computer could be described in McCulloch and Pitts notation. This observation led him to the logical equivalence of finite nets and the tables describing Turing's machines.

reverberate, we had set up a theory of memory - to which every other form of memory is but a surrogate requiring reactivation of a trace. (McCulloch 1965, p. 10)

In a series of papers (McCulloch 1945; 1947; 1950; 1952), McCulloch and Pitts carried out the mathematical details of this theory of the mind, providing, for example, a model of how humans believe universal ("for all") statements.

The precision of their mathematical theory offered opportunity for additional speculation about the functioning of the mind. This precision was accomplished at the expense of a detailed theory of the biological structure and functioning of the individual nerve cells. Similar to Sherrington's model of the simple reflex, which he had called a convenient abstraction, McCulloch and Pitts's neurons were idealized neurons. One knew what the input and output would be; but the neurons themselves were black boxes, closed to inspection of their internal structure and operation. Practicing physiologists objected21 (von Neumann 1951) that not only was this model of neurons incomplete, it was inconsistent with experimental knowledge. They argued further that the simplicity of the idealized neuron was so misleading as to vitiate any positive results the work might achieve. Von Neumann popularized McCulloch and Pitts's work among biologists, arguing that the simple, idealized nature of the model was necessary to understand the functioning of these neurons, and contending that once this

    was understood, biologists could account more easilyfor secondary effects related to physiological detailsof the neurons.McCulloch and Pitts were able to use their mathe-matical theory to analyze a number of aspects of thefunctioning of the human nervous system. A 1950article by McCulloch, entitled Machines that Thinkand Want (McCulloch 1950b), provided a prospectiveof the possible applications of their theory, emphasiz-ing especially the application of cybernetic techniquesto understanding the functioning of the central nerv-ous system. Typical of McCulloch and Pittss appli-cation of information theory to physiology was a jointproject in 1947 on prosthetic devices designed to en-able the blind to read by ear (Wiener 1948, pp. 31-32;de Latil 1956, pp. 12-13). The problem resolved intoone of pattern recognition involving the translation ofletters of various sizes into particular sounds. Usingcybernetic techniques, McCulloch and Pitts produceda theory correlating the anatomy and the physiologyof the visual cortex that also drew a similarity betweenhuman vision and television.l McCulloch and Pitts recognized and admitted in their paper thattheir neurons were highly idealized and did not fit all the empirica levidence about neurons.
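The idealized neuron at the center of this controversy is simple enough to sketch in a few lines of modern code. The sketch below is a hypothetical illustration, not McCulloch and Pitts's own logical calculus: a unit fires (outputs 1) exactly when the weighted sum of its binary inputs reaches a threshold, and single such units suffice to realize the logical connectives.

```python
def mp_neuron(inputs, weights, threshold):
    # Idealized McCulloch-Pitts unit: a black box that fires (1)
    # iff the weighted sum of its binary inputs reaches the threshold.
    return 1 if sum(i * w for i, w in zip(inputs, weights)) >= threshold else 0

# Logical connectives realized by single idealized neurons
# (the weights and thresholds here are illustrative choices):
AND = lambda a, b: mp_neuron([a, b], [1, 1], 2)
OR  = lambda a, b: mp_neuron([a, b], [1, 1], 1)
NOT = lambda a:    mp_neuron([a],    [-1],   0)

print(AND(1, 1), OR(0, 1), NOT(1))  # -> 1 1 0
```

The inputs and outputs are all that can be observed; nothing inside `mp_neuron` corresponds to the biology of an actual nerve cell, which is precisely the idealization the physiologists objected to.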

Annals of the History of Computing, Volume 7, Number 2, April 1985


W. Aspray • Conceptualization of Information

Turing, Automata Theory, and Artificial Intelligence

While McCulloch and Pitts endeavored to show how the physical science of mathematics helped to explain the biological functioning of the brain, Turing was busy demonstrating how the computer, a product of the physical sciences, could mimic certain essential features of the biological thinking process.22 In fact, Turing's most famous paper, "On Computable Numbers" (1937), presented a basic mathematical model of the computer, proved several fundamental mathematical theorems about automata, and marked the first step in his lifelong battle to break down what he saw as an artificial distinction between the computer and the brain. His postwar work at the National Physical Laboratory designing the ACE computer can be viewed as an attempt to determine whether his theoretical machines could be built "in the metal." His later programming work at Manchester offered perhaps the first attempt to achieve artificial intelligence by means of programming stored-program computers, instead of building specialized hardware, to exhibit particular forms of intelligent behavior.23

Turing showed an early proclivity toward mathematics and computing science. As an undergraduate at Cambridge in the mid-1930s he first encountered Riemann's hypothesis and sought to calculate mechanically the real parts of the zeros of the zeta function. He returned to this approach several times, unsuccessfully, before, during, and after the war in attempts to settle the hypothesis using computing equipment. This project, and an interest in the computability and decision problems of Kurt Godel and other logicians of the 1930s, led Turing to speculate on the question of which numbers in mathematics are mechanically computable. Upon reflection, he concluded that the mechanically computable numbers are exactly those that can be computed by the theoretical machines described in his 1937 paper, known today as Turing machines.

The computable-numbers paper made other contributions as well. The Turing machine provided a mathematically precise characterization of the basic functions and components common to all computing machines: control, memory, arithmetic, input, and output. These functions were described for each machine.24

These machines had arbitrarily large amounts of storage space and computation time, and unlimited flexibility of programming, so they represented a theoretical bound on computability by physical machine. Because of the precise mathematical characterization of these limits, Turing machines served as the starting point for the modern theory of automata. Moreover, because they were consciously designed to provide a formal analog of how the human "computer" functions, they gave a rudimentary, but precise, mathematical model of how the mind functions when carrying out computations. This point was not lost on McCulloch and Pitts, who used Turing's machine characterization as the basis for their characterization of human neuron nets as information processors.

In fact, Turing attempted to model his machines, down to specific functional details, after the way the human carries out computations. His machines were each supplied with a tape (the analogue of paper) divided into squares that were scanned for symbols. The square being scanned at a given time, he wrote, "is the only one of which the machine is directly aware." The machines were designed so that by altering the internal configuration they could "remember some of the symbols they had seen previously." Turing concluded this anthropomorphic description of the machine by stating:

    We may now construct a machine to do the work of this [human] computer. To each state of mind of the [human] computer corresponds an m-configuration of the machine. The machine scans B squares corresponding to the B squares observed by the [human] computer. . . . The move which is done, and the succeeding configuration, are determined by the scanned symbol and the m-configuration. . . . A computing machine can be constructed to compute . . . the sequence computed by the [human] computer. (Turing 1937, pp. 231-232)

This paper was the first of many occasions on which Turing publicly expressed his convictions that computers and human brains carry out similar functions (information processing, to use modern terminology) and, consequently, that there is no reason to believe that machines will not be able to exhibit intelligent behavior.

22 This attitude is clear in "On Computable Numbers" (Turing 1937), where Turing explicitly modeled the machine processes after the processes of a human carrying out mathematical computation. See the discussion later in the text. Also see Hodges (1983).

23 See Hodges (1983) for details. John McCarthy was the first person to point this out to the author (private communication).

24 Although functions were differentiated, components were not. Memory, input, and output were all located on the tape. Control and arithmetic were both housed in the rules of description of the machine.

In 1936 Turing visited Princeton University for a year to study mathematical logic with Alonzo Church, who was pursuing research in recursion theory, an
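The tape, scanned square, and m-configurations of Turing's description translate almost line for line into a small simulator in modern code. The sketch below is a hypothetical illustration (the function and variable names are assumptions, not Turing's notation); the machine table is modeled on the first example of the 1937 paper, which prints 0 and 1 on alternate squares.

```python
def run(machine, steps, start):
    # Minimal simulator of Turing's model: an unbounded tape of squares,
    # one scanned square, and a table of m-configurations.  Each rule
    # maps (m-configuration, scanned symbol) to (symbol to write,
    # head move, next m-configuration).
    tape, head, state = {}, 0, start
    for _ in range(steps):
        write, move, state = machine[(state, tape.get(head, ' '))]
        tape[head] = write
        head += {'R': 1, 'L': -1, 'N': 0}[move]
    return ''.join(tape.get(i, ' ') for i in range(min(tape), max(tape) + 1))

# A machine in the spirit of Turing's first example:
# it prints 0 and 1 on alternate squares, forever.
alternating = {
    ('b', ' '): ('0', 'R', 'c'),   # print 0, move right
    ('c', ' '): (' ', 'R', 'e'),   # skip a square
    ('e', ' '): ('1', 'R', 'f'),   # print 1, move right
    ('f', ' '): (' ', 'R', 'b'),   # skip a square, start over
}
print(run(alternating, 7, 'b'))   # -> "0 1 0 1"
```

Here the dictionary `tape` plays the role of the paper tape, `head` the scanned square, and `state` the machine's "state of mind"; memory, input, and output all live on the tape.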




area of logic with direct relation to Turing's work on computable numbers. Turing decided later to stay, and completed a Ph.D. in 1938. Immediately afterward he returned to England because of the worsening political situation. Soon the war was upon England, and Turing volunteered to work at Bletchley Park, where the British were trying to break mechanically produced German codes with the aid of symbolic equipment. There he learned about electronics and about the design and use of electromechanical and electronic calculating devices. This experience proved invaluable after the war when he was hired to design a computing machine (ACE) for the National Physical Laboratory in Teddington.

In many ways, the NPL design represented more a physical embodiment of his theoretical machines than a machine for practical use. For example, Turing was determined not to construct additional hardware whenever software could achieve the same end, no matter how roundabout the solution. He would also alter or construct software in order to make the hardware job easier (Carpenter and Doran 1977). Another indication of Turing's lack of interest in the practical side of computing was his decision to leave NPL before the ACE was completed. While several factors were probably involved in his decision to leave NPL (to assume chief programming responsibilities for the new Manchester computer), it seems clear that he had satisfied himself that his theoretical machines could be embodied physically. At least partly for this reason, his interest in the ACE project was dissipated.

Turing's work in Manchester was among the earliest examples of the use of electronic computers for research. He was among the first to believe that electronic machines were capable of performing not only numerical computations, but also general information processing. He was convinced that machines would soon have the capacity to carry out any mental activity of which the human mind is capable. He attempted to break down the distinctions between human and machine intelligence and to provide a single standard of intelligence, based on mental behavior, upon which both machines and biological organisms could be judged. In providing this standard, he considered only the information that entered and exited the automata. Like Shannon and Wiener, Turing was moving toward a unified theory of information and information processing applicable to both the machine and the biological worlds. The beginnings of this theory can be found in two papers he wrote at the time, "Intelligent Machinery" and "Computing Machinery and Intelligence" (Turing 1950; 1969), in addition to being found in his Manchester programming activities.

In "Intelligent Machinery" Turing began by addressing the question: What happens when we make up a machine in a comparatively unsystematic way from some kind of standard components? (Turing 1970, p. 9). He called these "unorganized machines" and created a new mathematical theory for analyzing them, based on the flow-diagramming techniques of Goldstine and von Neumann. The major aim of the paper was to determine what sorts of machines could be constructed to display evidence of intelligence. After a lengthy analysis, Turing concluded that the best approach was not that of robotics (building specialized hardware to mimic the various aspects of human intelligence), because he felt that the result would always fall short of its human model in some aspect or another, the human having so many properties incidental to intelligence. Instead, he decided that the best approach was to simulate human mental behavior on a general-purpose computer in such a way that the computer would react to purely mental activities (among which he counted games such as chess, language learning and translation, mathematics, and cryptography) in the same way the human responds to these activities. Indeed, Turing was among the earliest, if not the earliest, to see the advantages of the software-simulation approach to artificial intelligence. His approach was in marked contrast to that of other British researchers, such as Grey Walter or Ross Ashby, who favored using robotics to achieve artificial intelligence (Ashby 1947; 1950; 1952; 1956a; 1956b; Walter 1953).

Turing's specific plan for an intelligent machine has the flavor of a science fiction story. He reasoned that a thinking machine should be given the essentially blank mind of an infant, instead of an adult mind replete with fully formed opinions and ideas. The plan was to incorporate a mechanism, analogous to the function of childhood education, by which the infant electronic brain could be educated. The possibility of such an approach depended on Turing's belief in nature over nurture, and on his understanding of the human cortex.

    We believe then that there are large parts of the brain, chiefly in the cortex, whose function is largely indeterminate. In the infant these parts do not have much effect: the effect they have is uncoordinated. In the adult they have great and purposive effect: the form of this effect depends on the training in childhood. A large remnant of the random behavior of infancy remains in the adult. All of this suggests that the cortex of the infant is an unorganized machine, which can be organized by suitable interference training. (Turing 1969, p. 16)

Turing's plan called not only for an unorganized machine, but also for a method by which the machine




could change. Turing believed that humans learn from the interference created by other humans, so he proposed that interference be designed into the education of computers. For interference to instigate a learning experience, the machine needed to be able to adapt to the outside stimulus. Turing achieved a learning capacity by including a Pavlovian pleasure/pain mechanism with which humans could reinforce or disarrange the machine's circuitry by means of electrical signals, according to whether the machine showed the desired behavior in reaction to the stimulus.

In the second paper, "Computing Machinery and Intelligence," Turing continued his assault on what he supposed to be an artificial distinction between the computer and the brain. The paper began with a rebuttal of nine of the objections he had heard most often to the possibilities of intelligent machinery, such as that machines do not have the consciousness to write, say, a sonnet according to their emotions, except by a chance manipulation of symbols. While Turing's responses were characteristically interesting and ingenious, perhaps a more important contribution of the paper was his presentation of the imitation game. Turing assumed a behaviorist approach to the question: Can machines think? The imitation game offered a precise way of answering this question. In the game, an interrogator was able, by terminal, say (in which case the respondents remain unseen by the interrogator), to ask questions of and receive responses from a human and a computer. If in a statistically significant number of cases the interrogator could not determine which was which, then, claimed Turing, the machine could be said to think because it exhibited the same mental behavior as the human. The test provided the first precise criterion for determining machine intelligence.

At Manchester, Turing attempted on a small scale to program existing computing equipment to carry out mental activities. For example, he programmed the Manchester computer to play chess (weakly) and to solve mathematical problems25 (Turing 1953a; 1953b). He recognized that large efforts were required for machinery to exhibit any significant amount of intelligence, however, and estimated optimistically that it would require a battery of programmers 50 years of full-time work to bring his learning machine from childhood to adult mental maturity. Turing's death in 1954 did not allow him sufficient time to bring these or more modest ideas to fruition. In fact, it is hard to assess what effect Turing might have had on the development of computer theory and practice had he lived longer.

25 Both Turing and Shannon had an interest in chess programming. The most accessible account is found in Hodges (1983).

John von Neumann and the General Theory of Automata

Von Neumann is the culminating figure of the early period in the information sciences because of his work toward a unified scientific treatment of information encompassing the work of all six scientists featured in this article. He had social contacts as well as intellectual interests in common with the other scientists studying information. He discussed computers and artificial intelligence with Turing when they were together in Princeton in 1937 and 1938. He had an active correspondence with Wiener. He and Wiener were the principal organizers of the interdisciplinary Princeton meetings in 1943 on cybernetics and computing. A paper by Pitts on the probabilistic nature of neuron nets started von Neumann on his research in probabilistic automata (McCulloch 1965, Introduction).

Early in his career von Neumann made valuable contributions to several areas of mathematical logic. His first discussions of automatic computing machinery were with Turing in Princeton. Problems of applied mathematics related to the war effort required von Neumann to seek the additional computing power possible only with electronic computing equipment. Accidentally, von Neumann heard about the computer project being carried out for Army Ordnance at the University of Pennsylvania. He was soon involved with J. Presper Eckert and John Mauchly's group, which was then placing the finishing touches on the ENIAC and beginning to work on the design plans for the EDVAC. The upshot was his central role in the logical design of the EDVAC (incorporating ideas from mathematical logic) (von Neumann 1945) and his leadership of the Institute for Advanced Study computer project.

Von Neumann's war-related computer activities spurred his further interest in theoretical issues of the information sciences. His main concern was for developing a general, logical theory of automata. His hope was that this general theory would unify the work of Turing on theoretical machines, of McCulloch and Pitts on neural networks, and of Shannon on communication theory. Whereas Wiener attempted to unify cybernetics around the idea of feedback and control problems, von Neumann hoped to unify the various results, in both the biological and mechanical realms, around the concept of an information processor, which he called an automaton. (The term "automaton" had been in use since antiquity to refer to a device that carries out actions through the use of a hidden motive power; von Neumann was concerned with those automata whose primary action was the processing of information.)




The task of constructing a general and logical theory of automata was too large for von Neumann to carry out in detail within the final few years of his career. Instead, he attempted to provide a programmatic outline for the future development of the general theory and limited himself to developing specific aspects, including the logical theory of automata, the statistical theory of automata, the theory of complexity and self-replication, and the comparison of the computer and the brain.

Von Neumann's general program for a theory of automata was laid out in five documents, which also contain his specific contributions.

1. "The General and Logical Theory of Automata," presented at the Hixon Symposium on September 20, 1948, in Pasadena, California (von Neumann 1951).

2. "Theory and Organization of Complicated Automata," a series of five lectures delivered at the University of Illinois in December 1949 (von Neumann 1966, pp. 29-87).

3. "Probabilistic Logics and the Synthesis of Reliable Organisms from Unreliable Components," based on notes taken by R. S. Pierce of von Neumann's lectures in January 1952 at the California Institute of Technology (von Neumann 1961-1963, V, pp. 329-378).

4. "The Theory of Automata: Construction, Reproduction, Homogeneity," a manuscript written by von Neumann in 1952 and 1953, completed and edited by Arthur Burks (von Neumann 1966, pp. 89-380).

5. "The Computer and the Brain," a series of lectures von Neumann intended to deliver as the Silliman Lectures at Yale University in 1956. They were never completed or delivered because of von Neumann's illness from cancer, but were published posthumously (von Neumann 1958).

Von Neumann's ultimate aim in automata theory was to develop a precise mathematical theory that would allow comparison of computers and the human nervous system. His concern was not for the particular mechanical or physiological devices that carry out the information processing, but only for the structure and functioning of the system.

Von Neumann treated the workings of the individual components of the systems, whether natural or artificial, as black boxes: devices that work in a well-defined way, but whose internal mechanism is unknown (and need not be known for his purposes). The approach amounted to axiomatizing the behavior of the elements. The advantage, he pointed out, was that all situations were idealized and that the components were assumed to act universally in a precise, clear-cut manner. This precision allowed a study of the highly complicated behavior of organisms such as computers or the human nervous system, a study that would be impossible unless such regularities and simplifications were assumed.

Of course, as many neurophysiologists criticized, the disadvantage of such an approach was its inherent inability to test the validity of its axioms against physiological evidence. These critics suggested that the accepted physiological evidence indicated that the situation in the human nervous system is not as simple as von Neumann's analysis made it out to be. Further, they argued, even accepting von Neumann's simplifications, one learns nothing about the physiological operation of the individual elements. Nonetheless, von Neumann was convinced that the axiomatic approach, which had worked so successfully for him in clarifying complicated situations in quantum mechanics, logic, and game theory, was the best way to begin to understand the problems of information processing in complicated automata such as electronic computers or the human nervous system.

Von Neumann pointed to Turing's work on computable numbers and to McCulloch and Pitts's axiomatic model of the neural networks of the brain as the two most significant developments toward a formal theory of automata, and indicated how each of these developments was equivalent to a particular system of formal logic. Although von Neumann believed these were important steps toward a mathematical theory of automata, he was dissatisfied with what the approach of formal logics could contribute to a theory of automata useful in the actual construction of computing machinery.

Von Neumann pointed out, for example, that formal logic has never been concerned with how long a finite computation actually is. In formal logic all finite computations receive the same treatment. Formal logic does not take into consideration the fact, important for the theory of computing, that certain finite computations are so long as to be practically prohibitive, or even practically impossible if they require more time or space than there is in the physical universe. Second, he pointed out that in practice people allot a designated fixed time to completion of their computations, a fact to which formal logics are not sensitive. Finally, he observed that at each step in a computation there is a nonzero probability of error; consequently, if computations were allowed to become arbitrarily long, the probability of a reliable computation would approach zero. These considerations led him to suggest that the formal logical approach be modified in two ways to develop a logic of automata: by considering the actual lengths of the chain of




reasoning, and by allowing for a small degree of error in logical operations. He indicated that such a logic would be based more on analysis (the branch of mathematics) and less on combinatorics than is formal logic; in fact, it would resemble formal logic less than it would resemble Boltzmann's theory of thermodynamics, which implicitly manipulates and measures a quantity related to information.

Von Neumann's overriding concern in the development of a statistical (also known as probabilistic) theory of information was the question of the reliability of automata with unreliable components. His aims were a theory that would determine the likelihood of errors and malfunctions, and a plan that would make errors that did occur nonlethal. The problem of reliability led him to abandon a logical and adopt a statistical approach. He pointed out that the logical and the statistical theories were not distinct. Adopting the well-known philosophical position that probability can be considered as an extension of logic, he argued that the statistical theory of automata was simply an extension of the logical theory of automata.

Von Neumann's extension of automata theory from a logical to a statistical theory is strikingly similar to his work in the foundations of quantum mechanics. Quite naturally, von Neumann turned to theoretical physics for an approach to the statistical theory of automata. He stated explicitly that two statistical theories of information are quite relevant in this context "although they are not conceived from the strictly logical point of view" (von Neumann 1966), referring to the work of Boltzmann, Hartley, and Szilard on thermodynamics and of Shannon on the effect of noise and information on a communication channel. Von Neumann proceeded to give an informal account of these theories, arguing that this work on thermodynamics should be incorporated into the formal statistical account of automata. In "Probabilistic Logics and the Synthesis of Reliable Organisms from Unreliable Components" he tried to develop this formal statistical theory.

Von Neumann recognized that the problem of reliability confronting information processors containing components prone to error, no matter whether the processors are biological or electromechanical, is not that incorrect information might be obtained occasionally, but instead that untrustworthy results might be produced regularly. He argued that if one assumes a small positive probability e for a basic component of an automaton to fail, the probability of an error in the final output of the automaton tends over time to 1/2. In other words, the significance of the machine output is lost because the behavior of the machine (its output following a given input) is no different from random behavior, as a result of the accumulation of errors in the basic components over time.26

Von Neumann proposed a technique he called "multiplexing" to resolve the problem of unreliable components. This technique enables the probability d of error in the final output to be made arbitrarily small for most fixed
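Both halves of this argument, the degradation of an unprotected chain of components and the rescue by redundancy, can be checked with a small simulation. The sketch below is a hypothetical illustration under stated assumptions: the basic component is an inverter that fails with probability e, and the restoring organ is crudely modeled as a majority vote over a bundle of identical lines (a simplification of von Neumann's actual multiplexing construction, which kept the bundles stochastic).

```python
import random

def noisy_not(x, e):
    # A basic component: an inverter that gives the wrong
    # output with small probability e.
    out = 1 - x
    return out if random.random() > e else 1 - out

def chain(x, depth, e):
    # Signal passed through a long chain of noisy components.
    for _ in range(depth):
        x = noisy_not(x, e)
    return x

def multiplexed_chain(x, depth, e, lines=33):
    # Von Neumann-style multiplexing, crudely sketched: carry the
    # signal on a bundle of redundant lines and restore it by
    # majority vote after every stage.
    bundle = [x] * lines
    for _ in range(depth):
        bundle = [noisy_not(b, e) for b in bundle]
        majority = int(sum(bundle) > lines / 2)  # restoring organ
        bundle = [majority] * lines
    return bundle[0]

random.seed(0)
trials, e, depth = 300, 0.05, 100  # depth is even, so the correct output equals the input
plain = sum(chain(1, depth, e) for _ in range(trials)) / trials
multi = sum(multiplexed_chain(1, depth, e) for _ in range(trials)) / trials
print(plain, multi)  # plain hovers near 0.5 (pure noise); multi stays near 1.0
```

The unprotected chain agrees with the correct answer with probability (1 + (1 - 2e)^depth)/2, which at depth 100 is already indistinguishable from coin flipping; the multiplexed bundle, at the price of 33-fold redundancy, keeps the error probability d negligibly small.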

