CHAPTER 11 – HISTORICAL
USES AND ABUSES OF
INTELLIGENCE TESTING
Dr. Nancy Alvarado
Motivation for Intelligence Testing
In schools, the first intelligence tests were developed
in France to enable public schools to measure
children for proper grade placement.
Rural schools were primarily one-room with all ages
taught by a single teacher.
Schools in cities were stratified by academic
accomplishment (not age as is now done).
Children moving to large cities needed to be placed.
Other, concurrent efforts focused on measuring
intelligence as an individual difference.
Broca’s Craniometry
Broca measured the body to understand its
functions, including the head.
He equated a larger head with greater intelligence
and concluded that men were more intelligent than
women because their heads were larger.
He concluded that the sex difference was greater in
contemporary people than in the past.
His assumptions exemplified the biases of the times
against women, the elderly, and “primitive” peoples; he
believed differences in brain size supported those biases.
Broca and Darwin
Broca used ideas from Darwin’s evolutionary theory
to support his thinking.
“I would rather be a transformed ape than a
degenerate son of Adam.”
Broca believed that men struggle to survive
whereas women are protected, so bigger brains
are selected for in men but not women.
Broca’s work was cited to justify denying education
to women.
Criticisms of Broca
Stephen Jay Gould pointed out that brain weight
decreases with age – the women studied were
older than the men, introducing a confound.
Taking cause of death into account, Gould concluded
that there is probably no difference in brain weight
between men and women.
A man of the same height would have the same size
brain as a woman of that height.
The sample size for prehistoric brains is too small (7
male and 6 female brains).
Alfred Binet (1857-1911)
Binet developed the first psychological scales to
measure intelligence, supplanting earlier attempts
using physical measures and subjective judgments.
Informal, subjective assessments may be correct or
wrong, but are prone to prejudice and cause trouble
when people place excess confidence in them.
An important result of Binet’s work was replacement of
these haphazard and prejudiced methods with
standard, uniform, objective methods of assessment.
Binet’s Early Education
Binet read Darwin, Galton & John Stuart Mill – he
was a self-taught library psychologist.
This deprived him of interaction with others and training
in critical thinking.
Binet accepted a staff position at La Salpêtrière
working with Charcot as his mentor.
Charcot used circular reasoning – people who could be
hypnotized had unstable nervous systems – as evidence
of this, they could be hypnotized.
Binet accepted Charcot’s reasoning without question.
Studies of Hypnosis
Binet and Féré claimed that hypnotic phenomena
could be transferred from one side of the body to
the other using magnets.
They also reported “polarization” in which a red
hallucination would turn green with use of a magnet.
They believed the magnetic field was responsible.
Patients had full knowledge of what was expected,
so the experiments were poorly controlled and carelessly
conducted. Ultimately they had to admit their errors.
Hypnotizability was not necessarily linked to hysteria.
Binet’s Research on Cognition
Binet was humiliated and became obsessively
concerned with suggestibility in experiments.
He became increasingly withdrawn and more shy.
Studying his own children, he published 3 papers
describing their cognitive development.
He devised a number of tests of their thinking.
These studies anticipated Piaget’s work – Piaget later
worked with Binet’s collaborator, Simon, analyzing the
wrong answers children gave on intelligence tests.
In 1891 at the Sorbonne, he did a variety of studies
Binet’s Test of Intelligence
In 1882, a law established mandatory primary
education for children from 6 to 14 years old.
A national system of exams had been established to
select students for secondary and university
education and vocational schooling.
Competition was intense, with 969 applicants to 1
opening at university (compared to 290 to 1 in the US).
Concern about “retarded” children in the schools
(children unable to learn in school) motivated
interest in a systematic way of identifying them.
Test Questions
Binet & Simon developed 20 subtests and
investigated a variety of other measures and
relationships between them.
They concluded craniometry had little value.
Tests included: association tests, sentence completion,
themes on a given topic, picture descriptions and
memory tests, object drawing and description, digit
repetition and other memory and attention tests,
tests of moral judgment.
They carefully specified controlled testing conditions.
Revised Binet-Simon Scale
They administered their tests to larger numbers of
schoolchildren and a small number of retarded
children, to develop norms.
In 1908, they developed a revised scale consisting
of 14 of the original tests, 7 modified, 33 new tests.
Tests were arranged by age level, from 3 to 13.
The average 5 year old should score at a mental level
of 5. If a majority (75-90%) passed a test it was
assigned to that age level.
Binet and Simon rejected the concept of mental age.
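The age-level rule above can be sketched in a few lines of Python. This is only an illustration of the stated 75-90% criterion: the function name, the choice of the youngest qualifying age, and the pass rates are all assumptions made for the example, not Binet and Simon's actual procedure or data.

```python
def assign_age_level(pass_rates, low=0.75, high=0.90):
    """Return the youngest age whose pass rate lies in [low, high], else None.

    pass_rates maps age in years to the fraction of children of that age
    who passed the test. Picking the youngest qualifying age is an
    assumption of this sketch.
    """
    for age in sorted(pass_rates):
        if low <= pass_rates[age] <= high:
            return age
    return None


# Hypothetical pass rates for a single test, keyed by age in years.
example_rates = {4: 0.40, 5: 0.80, 6: 0.95}
print(assign_age_level(example_rates))  # 5
```

Here the test would be placed at age level 5: too few 4 year olds pass it, and nearly all 6 year olds do.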
IQ Scores
They believed that even retarded children could
raise their mental levels and devised a system of
training for the retarded (like Montessori’s).
Louis William Stern introduced the concept of mental quotient as
a ratio of mental age to chronological age.
A quotient below 1 indicated retardation and a quotient above 1
indicated superior intelligence; quotient x 100 = IQ score.
Binet and Simon strongly opposed this concept of IQ.
Despite their objections, IQ became the standard
way of depicting performance on intelligence tests.
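The quotient arithmetic can be made concrete with a short sketch. It takes the quotient as mental age divided by chronological age, so that a value below 1 means the child's mental level lags behind their age. The function name is illustrative.

```python
def ratio_iq(mental_age, chronological_age):
    """Ratio IQ: mental age divided by chronological age, times 100."""
    if chronological_age <= 0:
        raise ValueError("chronological_age must be positive")
    return 100 * mental_age / chronological_age


print(ratio_iq(5, 5))    # 100.0, mental level matches age
print(ratio_iq(8, 10))   # 80.0, quotient below 1
print(ratio_iq(12, 10))  # 120.0, quotient above 1
```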
Testing Spreads
The Binet-Simon scale was easy to administer and
reasonably brief, so was quickly in wide use.
By WWI in 1914 the tests were being used in a
dozen countries, often simply translated without any
attempt to standardize them for the new setting.
Before the end of WWI, 1.7 million inductees to the
US Army had been tested.
Terman revised the scale for use in the US and 4
million children were tested.
Henry H. Goddard (1866-1957)
In 1984, the editors of Science named development
of the IQ test as one of the 20 most significant
discoveries in science, technology & medicine of the
20th century.
Henry Goddard and Lewis Terman were the two
men primarily responsible for introducing the IQ test
to America.
Goddard earned a doctorate at Clark University,
then was appointed research director of a New
Jersey home for 230 “feeble-minded” children.
Goddard’s Studies
Goddard became convinced of the need for a way
to distinguish between normal and feeble-minded
children, and a reliable way to identify levels.
He was given a copy of the Binet-Simon test in Europe.
He translated the scale into English, with some minor
changes, such as names of coins.
He administered the test to 400 children at
Vineland and 2000 in NJ public schools. The scores
at Vineland agreed with their records.
The scores of public school kids varied widely.
Gregor Mendel (1822-1884)
Hothersall reviews Mendel’s work to put the study
of the Kallikaks into perspective.
Mendel did the first systematic experiments studying
genetics and heritability of characteristics.
First Mendel bred wild mice with albinos to see
what color coats they would have, then bred bees.
Next he bred peas to study blossom color, smooth
or wrinkled seeds, green or yellow seeds, tall or
dwarf plants – 10,000 plants, 300,000 peas.
His work established valid principles of inheritance.
Mendel’s Findings
First he bred tall & short plants – the resulting
hybrids were all tall.
Next he bred hybrids with each other – about three-quarters
were tall and one-quarter were short (a 3:1 ratio).
He guessed that height was controlled by a pair of hereditary
factors (one from each parent), now called alleles.
Tall height was dominant, short
recessive.
His ideas did not catch on and his
papers were burned.
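The tall/short results follow from simple counting, which the sketch below illustrates. It assumes one factor pair with `T` (tall, dominant) and `t` (short, recessive); the letters and function names are illustrative labels, not Mendel's notation.

```python
from collections import Counter
from itertools import product


def cross(parent_a, parent_b):
    """Count offspring genotypes from every pairing of one factor per parent."""
    return Counter("".join(sorted(pair)) for pair in product(parent_a, parent_b))


def phenotype(genotype):
    """A plant is tall if it carries at least one dominant T factor."""
    return "tall" if "T" in genotype else "short"


# First cross: pure tall x pure short gives only hybrids.
f1 = cross("TT", "tt")
print(dict(f1))  # {'Tt': 4}

# Second cross: hybrid x hybrid.
f2_phenotypes = Counter()
for genotype, count in cross("Tt", "Tt").items():
    f2_phenotypes[phenotype(genotype)] += count
print(dict(f2_phenotypes))  # {'tall': 3, 'short': 1}
```

Every first-generation plant is a tall hybrid. In the second generation, three of the four equally likely combinations are tall, which is why short plants are the minority.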
[Figure: example using pea blossom color, showing results across multiple generations]
Mendel is Rescued from Obscurity
William Bateson published “Mendel’s Principles of
Heredity: A Defence” (1902). Dutch botanist Hugo
de Vries also described Mendel’s work.
Goddard read de Vries’s report and applied it to
intelligence – a major leap influenced by Galton’s
reports of hereditary genius.
Goddard discovered that many of the siblings of
the inmates of his institution had themselves been
evaluated as feeble-minded.
The Kallikak Family
Deborah Kallikak was found to have a mental age
of 9 (at age 22). Goddard traced her ancestry
back to Martin Kallikak Sr. in the American Revolution.
Deborah was descended from an illegitimate liaison
with a feeble-minded barmaid, starting the “bad
side” of the family tree, full of “riff-raff.”
Later Martin married a Quaker woman and
founded the “good side” of the family tree, which
was found to have little feeble-mindedness.
Goddard concluded that feeble-mindedness is inherited.
Family Tree
A=Alcoholic, Sx=Sexually Immoral, E=Epileptic
http://psychclassics.yorku.ca/Goddard/chap4.htm
Good side: 496 descendants; 3 degenerate (2 A, 1 Sx); 15 infant deaths.
Bad side: 480 descendants; 143 feeble-minded; 33 Sx, 3 E, 24 A; 36 illegitimate; 82 infant deaths.
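The percentages implied by these counts can be worked out directly. The sketch below uses only the figures listed above; the dictionary keys and the `rate` helper are illustrative names. The counts are Goddard's own reported classifications, so the percentages describe his claims rather than verified facts.

```python
# Counts as reported in Goddard's family tree (not independently verified).
good_side = {"descendants": 496, "degenerate": 3, "infant_deaths": 15}
bad_side = {"descendants": 480, "feeble_minded": 143, "infant_deaths": 82}


def rate(count, total):
    """Percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


print(rate(good_side["degenerate"], good_side["descendants"]))     # 0.6
print(rate(good_side["infant_deaths"], good_side["descendants"]))  # 3.0
print(rate(bad_side["feeble_minded"], bad_side["descendants"]))    # 29.8
print(rate(bad_side["infant_deaths"], bad_side["descendants"]))    # 17.1
```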
Criticisms of Goddard’s Study
The study took 2 years, which seems short.
Conducted by untrained staff, perhaps biased.
Little objective testing of the relatives – reliance on
reports by family & associates. Position in society
used to infer intelligence, etc.
Criminal behavior and feeble-mindedness were
equated.
Assumption of a single gene for IQ is implausible.
Influence of environment was totally ignored.
Pictures of Kallikaks
Stephen Jay Gould claimed that Goddard tampered with photos to
make them appear less normal. Fancher suggested the publisher
perhaps tried to eliminate blank, staring expressions. Goddard
believed the feeble-minded look normal, so he would have been less
likely to modify them – undercutting Gould’s claim.
Pictures of Deborah
are attractive.
Eugenic Sterilization
Similar studies of the Jukes, the Hill Folk, the Nams,
the Ishmaelites, and the Zeros, reportedly showed
reproduction rates twice those of “normal” families.
Goddard spoke about practical methods for
eliminating “defective people” from the US
population.
Mainstream psychologists supported eugenics, including
Yerkes, Thorndike, Cannon, Terman.
US involuntary sterilization laws were upheld by the
courts & stayed in place until the 1960s.
Goddard at Ellis Island
In 1910, one-third of the US population was foreign
born, raising fears that the US was being swamped.
Teddy Roosevelt appointed a commission to study this.
More recent immigrants were from Eastern & Southern Europe.
It was feared that immigrants would be an impetus for
development of unions (to keep them out), which would
threaten the US economic system.
New immigrants were Catholic not Protestant.
It was claimed that many immigrants were mentally
defective – 2% were denied entry and sent back.
Goddard’s Innovations
Goddard began using psychological methods and
the number of feeble-minded increased
dramatically – 350% in 1913, 570% in 1914.
Goddard claimed that 83% of Jews, 80% of
Hungarians, 79% of Italians, 87% of Russians were
feeble-minded, based on culturally biased testing.
Restrictive immigration quotas were enacted.
Some people were
considered too inferior to
become citizens – such as
the Irish.
"Now the fact is, that workmen may
have a 10 year intelligence while
you have a 20. To demand for him
such a home as you enjoy is as
absurd ....... How can there be a
thing such as social equality with
this wide range of mental
capacity?" - Goddard, before a
group of Princeton undergraduates,
1919
Eugenics Demonstrators
Goddard and Gifted Children
In 1918, Goddard left Vineland for a position as
director of Ohio State Bureau of Juvenile Research,
then became professor at Ohio State University.
Goddard was hired as consulting psychologist to help
establish classes for gifted children.
Those with IQs above 120 were included.
Goddard advocated enrichment, not rapid promotion.
The program produced long-lasting, positive results.
Lewis M. Terman (1877-1956)
Terman grew up on a farm in Indiana, then was sent
to Central Normal College in Danville to become a
teacher. He earned an M.A. from Indiana University.
A former student of G.S. Hall helped him obtain a
fellowship to Clark University to work with Hall.
Hall disapproved of mental tests so Terman
switched to Edmund Sanford to direct his thesis.
After becoming a high school principal in San
Bernardino, he taught at the Los Angeles Normal School
(which later became UCLA), then joined Stanford University.
Terman’s Stanford-Binet IQ Test
At Stanford, Terman revised the Binet-Simon, as
described in “The Measurement of Intelligence.”
He used a large standardization sample (2300,
including 1700 children, 200 “defective” and superior,
and 400 adults).
His goal was to make the median chronological and
mental ages coincide, to prevent IQs from changing
across different ages, with an average of 100.
This became the standard measure of intelligence, with
a standardization sample in 1916 of 10,000 people.
Terman’s Studies of Genius
In 1921, Terman began an ambitious longitudinal
study of children with exceptionally high IQs of 140+.
The study was continued after his death.
Those participating in the study were called “Termites.”
His findings contradict the stereotype of geniuses as
sickly weaklings interested in nothing but books,
“early ripe, early rot.”
Exceptional performance continued in adult careers.
The sample was unrepresentative, admittedly.
Robert Mearns Yerkes (1876-1956)
Yerkes worked his way through college, then worked
with Münsterberg for his doctorate in comparative
psychology, publishing “The Great Apes.”
He was offered a job and remained at Harvard for his
whole career.
He replaced photos of James, Royce & Palmer with
pictures of great apes – his “philosophers.”
He also worked at Boston State Psychopathic
Hospital, which focused him on the need for better
ways of measuring mental abilities.
Army Alpha & Beta Tests
At the start of WWI, Yerkes organized a meeting to
figure out how psychologists might aid the war effort.
Yerkes traveled to Canada to study their war
experiences.
They decided to focus on adapting mental
measurement to military needs – IQ testing in the Army.
40 psychologists prepared tests for the Army, to
identify the mentally incompetent, classify men by
mental ability and select individuals for special
training and extra responsibility.
Test Requirements
Group administration.
Measuring “native wit” not education.
Steeply graded in difficulty – hard enough to tax
those with high ability but easy enough for those of
lesser ability.
Could not take more than an hour and had to be simple
to score objectively.
Alpha test – for those who are literate, Beta test for
those illiterate or non-English speaking.
Results of Army Testing
Only a minute percentage of inductees were
discharged due to low test scores.
A 900-page report concluded that the average
mental age was 13 years, much lower than assumed.
Racist, antidemocratic conclusions were part of
popularized versions of this report.
Goddard proposed a meritocracy based on IQ to
replace our democracy.
Studies blamed non-Nordic immigrants for the low
scores (Brigham). Quotas were established.
Dissenting Voices
In The New Republic, Lippmann lambasted Terman,
Goddard & Yerkes, criticizing the assumptions that
IQ tests measure intelligence and that the average mental age is 13.
He stressed differences in early environment and
experiences making comparisons across class/race
meaningless.
He argued it is logically impossible for the average intelligence of
adults to equal that of a child, and that labeling kids is contemptible.
Terman’s reply was sarcastic and hostile.
Later Controversies
Cyril Burt’s twin studies – did he fake his data?
No way to know for certain, but Burt’s findings have
been replicated by other researchers.
Debates over social bias in testing arose in the
1940s & 1950s (working class vs upper class).
Debates over racial bias arose in the 1960s with
Arthur Jensen’s claim that IQs cannot be raised.
The Bell Curve (Herrnstein & Murray) in 1994 reignited
debates about racial differences.
Current Trends
Earl Hunt, Robert Sternberg & Howard Gardner
have proposed cognitive approaches studying the
knowledge structures underlying intelligent behavior.
Hunt developed the “cognitive correlates”
approach, correlating response times with scores on
cognitive tasks.
Sternberg proposed a “cognitive components”
approach decomposing performance on analogies
into a series of cognitive processes.
Current Trends (Cont.)
Gardner proposed a “theory of multiple
intelligences” based on a decomposition of factors
contributing to performance.
This recapitulates the debate between Spearman and
Thurstone over “g” – a single factor correlating
performance across multiple tests, versus specific skills.
There remain few alternatives to objective, group-
administered standardized tests and intelligence
testing remains controversial today.