Gifted Today but Not Tomorrow? Longitudinal Changes in Ability and Achievement During Elementary School

David F. Lohman and Katrina A. Korb

Journal for the Education of the Gifted. Vol. 29, No. 4, 2006, pp. 451–484. Copyright ©2006 Prufrock Press Inc., http://www.prufrock.com

David F. Lohman is Professor of Educational Psychology at the University of Iowa. His primary research interests are the nature and measurement of reasoning abilities, the identification of children with extraordinary academic aptitude, and adapting instruction to the needs and proclivities of learners. Katrina Korb is a doctoral student in Educational Psychology at the University of Iowa. Her primary interest is the measurement of individual differences in cognitive abilities, especially the quantitative reasoning abilities of school children.

The term gifted implies a permanent superiority. However, the majority of children who score in the top few percentiles on ability and achievement tests in 1 grade do not retain their status for more than a year or 2. The tendency of those with high scores on one occasion to obtain somewhat lower scores on a later occasion is one example of regression to the mean. We first summarize some of the basic facts about regression to the mean. We then discuss major causes of regression: errors of measurement, individual differences in growth, changes in the content of the developmental score scale, and changes in the norming population across age or grade cohorts. We then show that year-to-year regression is substantial, even for highly reliable test scores. Different ways of combining achievement and ability test scores to reduce regression effects are illustrated. Implications for selection policies and research on giftedness are also discussed.

Longitudinal studies of intellectually exceptional students have produced some of the most important findings in the field of gifted education (Lubinski, Webb, Morelock, & Benbow, 2001; Terman & Oden, 1959). However, there is a paradox in the literature on the relationship between estimates of ability in childhood and accomplishments in adulthood. On the one hand, in any group of children, the child who obtains the highest score on a measure of scholastic aptitude is the one who is most likely later to attain the highest level of academic excellence. On the other hand, the student who obtains the highest score is also the person whose test score at some later date is most likely to show the greatest amount of regression to the mean. How is this possible?

Statistically, the paradox of high aptitude being associated both with high accomplishment and large regression effects merely restates what it means for two variables to be imperfectly correlated.

The problem, however, is that the sort of probabilistic thinking that is captured in correlations runs counter to the tendency to think categorically about labeled concepts. We speak of learning-disabled or gifted students as if there were sharp boundaries separating individuals in the categories from those outside of them. Even those who understand that the boundaries are arbitrary often think that if we agreed on the location of the category cut points and had perfectly reliable and valid measures, then category membership would remain constant over time. In the case of academically advanced children, the expectation is that if we could measure giftedness well, then the child who is considered gifted at age 6 would still be considered gifted at 16. If retesting the child at age 8 or 10 suggested a lower score, the typical reaction would be to question either the dependability of scores (especially the latter, lower score) or the validity of the test that produced them. Indeed, it is common practice to administer different ability tests to individuals or groups in the hopes of identifying additional gifted students. The assumption is that any high score is legitimate, whereas lower scores underestimate ability.

Confusions about giftedness thus reflect more than a common fondness for typologies. They also result from assumptions about the nature of intellectual development and the characteristics of tests used to measure that development. For example, the assumption that the child whose performance is unusual at one point in time will be equally unusual at another point in time assumes (a) that errors of measurement do not substantially affect either test score, (b) that growth from Time 1 to Time 2 is constant for all who have the same initial score, (c) that tests measure the same mix of constructs at all points along the score scale that spans the developmental continuum, and (d) that the population of test takers is constant across time. To the extent that these assumptions are not true, we will see a regression in scores from Time 1 to Time 2.

Regression to the Mean

Regression to the mean occurs whenever scores are not perfectly correlated. The amount of regression in standard scores can easily be estimated from the correlation between the two sets of scores. The predicted score on Test 2 is simply $z_2 = z_1 \times r_{12}$, where $z_2$ is the predicted standard score on Test 2, $z_1$ is the standard score on Test 1, and $r_{12}$ is the correlation between Tests 1 and 2.

The expected score at Time 2 will equal the score at Time 1 only if the correlation is 1.0 or if the standard score at Time 1 is zero (i.e., the mean). The lower the correlation, the greater is the expected regression. Indeed, when the correlation between two tests is zero, then the expected test score at Time 2 is the mean (i.e., 0) for all test takers. Although there is no regression at the mean (i.e., $z_1 = z_2 = 0$), the amount of regression increases as scores depart from the mean. Students who receive extremely high scores on Test 1 are unlikely to receive similarly high scores on Test 2.

The equation for $z_2$ can be used to estimate the expected regression in status scores such as IQs. The first step is to convert the IQ to a z score by subtracting the mean IQ and dividing by the population SD for the test. For example, if the mean is 100 and the SD is 15, then an IQ of 130 converts to a z score of $(130 - 100)/15 = 2.0$. If the correlation between scores at Time 1 and Time 2 is r = .8, then the expected z score at Time 2 is 2.0 × .8 = 1.6. This converts to an IQ of (1.6 × 15) + 100 = 124. The expected regression is 6 IQ points. If the IQ were 145, then the expected regression would be 9 IQ points.
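The arithmetic in this example can be sketched in a few lines of Python. The function name and its default mean and SD are illustrative assumptions, not part of the original analysis; the computation itself is only the equation for $z_2$ applied to a status score.

```python
def expected_retest_score(score, r, mean=100.0, sd=15.0):
    """Expected Time 2 status score for a given Time 1 score, using z2 = z1 * r."""
    z1 = (score - mean) / sd   # convert the status score to a z score
    z2 = z1 * r                # predicted standard score on the retest
    return z2 * sd + mean      # convert back to the original score scale


# The two cases discussed in the text: IQs of 130 and 145 with r = .8
for iq in (130, 145):
    expected = expected_retest_score(iq, r=0.8)
    print(iq, round(expected, 1), round(iq - expected, 1))
```

Because the expected loss is proportional to the distance from the mean, the same correlation produces larger regression for more extreme scores.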

The standardized scores used in the equation for $z_2$ may be inappropriate if the variance of scores is not the same across occasions.¹ This often occurs when using attainment scores (such as mental age or developmental scale scores) rather than status scores (such as percentile rank or IQ). Whether the variance of attainment scores increases or decreases over time depends on the nature of the abilities that are measured, the dependent measures that are used, and how score scales are constructed. The variance of scores tends to decrease with practice when the domain is closed rather than open (Ackerman, 1989). A closed skill set is one that is relatively small and bounded. For example, learning to count to 10 is a closed skill. Learning mathematics is an open skill. The dependent measure also matters. For example, as individuals learn a new skill, the variance of accuracy scores often declines. However, response speed or other measures of learning and transfer can show improvements with additional practice. Such scores may show an increase in variance with extended practice.

How tests are scaled can have a substantial impact on whether the variance of scores increases over time. For example, the Iowa Tests of Basic Skills, Form A (ITBS; Hoover, Dunbar, & Frisbie, 2001) and the Cognitive Abilities Test, Form 6 (CogAT; Lohman & Hagen, 2001) were jointly normed on the same national sample. ITBS scaled scores show considerable increase in variance across grades, whereas CogAT scaled scores do not. This is because the ITBS is scaled using a growth model that assumes that individual differences in achievement increase over grades. The CogAT is scaled using the Rasch (1960) model that makes no assumptions about changes in score variance across time. These differences in scaling procedures are masked when status scores such as percentile ranks or standard age scores are reported.

Developmental psychologists recognize that regression to the mean is a pervasive phenomenon when retesting students (Marsh & Hau, 2002; Nesselroade, Stigler, & Baltes, 1980; Phillips, Norris, Osmond, & Maynard, 2002). Regression to the mean is also commonly cited as a problem when working with learning-disabled students (e.g., Milich, Roberts, Loney, & Caputo, 1980). However, this statistical fact of life is less commonly applied to gifted students.² Many who recognize the problem often ascribe it entirely to errors of measurement (e.g., Callahan, 1992; Mills & Jackson, 1990). However, measurement error is only part of the picture.

Any factor that reduces the correlation between two sets of scores contributes to regression toward the mean. We discuss five: errors of measurement, conditional errors of measurement, differential growth, changes in the content of the developmental scale, and changes in the norming sample.

Errors of Measurement

Error is the most obvious contributor to regression toward the mean. Sources of error that might lower a score on a particular occasion are called negative error; they include factors such as temporary inattention or distractions when taking the test. Error can also contribute to higher scores. Examples of positive error are lucky guessing or good fortune in having learned the solutions to particular items. These sorts of seemingly random fluctuations in behavior across situations are what most people understand as errors of measurement.

A larger source of measurement error for most examinees, however, is the particular collection of tasks and items that are presented. For example, the estimate one obtains of a student’s reasoning abilities depends on the format of the task (e.g., matrices, analogies, or classification problems) and the particular sample of items presented in each of these tasks. Factor analyses of large test batteries commonly show that the loading of a test on its task-specific factor is often not much smaller than its loading on the factor that it helps define. This means that the scores on the test are as likely to reflect something specific to the task and measurement occasion as something that would be shared with other measures of the same construct. For this reason, measurement experts have long advocated estimating ability using tests that present as many items as possible in as many different formats as possible. However, even when tests contain many items in multiple formats, one is almost never interested in the student’s score on a particular form of a test that is administered on a particular occasion. The ideal score would be one that is averaged across all acceptable conditions of observation: test formats, samples of items, test occasions, and other conditions of testing.

Several of these factors are varied when scores are obtained for representative samples of students on different individually administered ability tests. Test tasks, test occasions, and perhaps even examiners or other conditions of testing vary. Correlations between individually administered ability tests range from approximately r = .7 to .85. For example, Phelps (in McGrew & Woodcock, 2001) reported a correlation of r = .71 between the Woodcock-Johnson III General Intellectual Ability score and the Full Scale IQ on the WISC-III for a sample of 150 randomly chosen students from grades 3 to 5. Flanagan, Kranzler, and Keith (in McGrew & Woodcock, 2001) reported a correlation of r = .70 between the Woodcock-Johnson III Brief Intellectual Ability score and the Full Scale Score on the Cognitive Assessment System. Roid (2003) reported a correlation of r = .84 between the Stanford-Binet V and the WISC-III (see also Daniel, 2000). As shown later, correlations of this magnitude will result in substantial regression when students who receive a high score on one of these tests are administered a different test. For example, given a correlation of r = .84, only about half of the students who score in the top 3% of the distribution on one test will also score in the top 3% of the distribution on the other test (see Table 1).
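Proportions of the kind reported in Table 1 can be approximated numerically. The sketch below assumes that the two sets of scores follow a standard bivariate normal distribution; the function name, the trapezoidal integration, and the step count are illustrative choices rather than the authors' procedure. It integrates, over all Test 1 scores above the cut, the probability that the Test 2 score also exceeds the cut.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def proportion_still_above_cut(top_fraction, r, steps=4000, upper=8.0):
    """P(z2 > c | z1 > c) for a bivariate normal pair with correlation r (|r| < 1),
    where c is the cut that selects the top `top_fraction` on each test."""
    cut = STD_NORMAL.inv_cdf(1.0 - top_fraction)
    resid_sd = sqrt(1.0 - r * r)   # SD of z2 around its predicted value r * z1
    width = (upper - cut) / steps

    def integrand(z1):
        # density of z1 times the chance that z2 exceeds the cut given z1
        p_exceed = 1.0 - STD_NORMAL.cdf((cut - r * z1) / resid_sd)
        return STD_NORMAL.pdf(z1) * p_exceed

    # trapezoidal rule from the cut up to `upper` SDs above the mean
    total = 0.5 * (integrand(cut) + integrand(upper))
    for i in range(1, steps):
        total += integrand(cut + i * width)
    return total * width / top_fraction


# Top 3% cut at three of the correlations tabled in Table 1
for r in (0.75, 0.85, 0.90):
    print(r, round(proportion_still_above_cut(0.03, r), 2))
```

The same function can be evaluated at r = .84 to check the statement that only about half of the students selected on one individually administered test would be selected on another.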

Averaging can reduce the impact of errors of measurement. For example, one could compute the average of a student’s reading achievement scale scores or ability test scores across 2 years rather than using the score from a single testing. Averaged scores will regress toward the mean, and so the average of two test scores cannot be interpreted using the norms that are derived for a single administration of the test. But even norms for individual test scores may be misleading. Norms for ability tests, especially nonverbal tests, have changed dramatically over the past 40 years (Flynn, 1987, 1999; Thorndike, 1975). Schools should not use published norms on ability tests that are inadequate (e.g., see Tannenbaum, 1965, on the Culture Fair Intelligence Test) or severely out of date (e.g., the Stanford-Binet L-M). When it is impossible to administer multiple tests of a particular construct, one should endeavor to use tests that present items in multiple formats rather than a single item format. Such tests typically have higher generalizability than those that use a single response format for all items. Finally, as shown later, averaging scores on a domain-specific test of achievement and a test of reasoning abilities in the symbol systems used to communicate new knowledge in that domain can dramatically reduce the amount of regression in test scores.
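The reason an average cannot be read against single-test norms can be made concrete. For n standard scores that all intercorrelate r, the variance of their mean is (1 + (n − 1)r)/n, which is less than 1 whenever r is less than 1. The sketch below uses hypothetical function names and illustrative scores; it rescales an averaged z score by that smaller SD so that it can be compared with a unit-normal distribution.

```python
from math import sqrt


def sd_of_average(r, n_scores=2):
    """SD of the mean of n z scores that all intercorrelate r."""
    return sqrt((1.0 + (n_scores - 1) * r) / n_scores)


def rescale_average(z_scores, r):
    """Re-express the mean of several z scores as a z score in the
    distribution of such means (hypothetical helper for illustration)."""
    mean_z = sum(z_scores) / len(z_scores)
    return mean_z / sd_of_average(r, len(z_scores))


# Illustrative values: z scores of 2.0 and 1.6 on two occasions, r = .8
print(round(sd_of_average(0.8), 3))
print(round(rescale_average([2.0, 1.6], 0.8), 2))
```

With these illustrative values, an average of 1.8 is more unusual among averages than a score of 1.8 is among single scores, which is why separate norms are needed.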

Conditional Errors of Measurement

Although many researchers understand that the concept of error includes more than random fluctuations across test occasions, fewer understand that the amount of error in test scores is generally not uniform across the score scale. Formulas for estimating the standard error of measurement (SEM) from the reliability coefficient generally assume that the variability of errors is constant across score levels. This is a reasonable assumption for most examinees. It is often not a reasonable assumption for those who obtain very high or very low scores on the test.

Table 1
Proportions of Students Exceeding a Cut Score on Test 1 Who Exceed the Same Cut Score on Test 2, by Cut Score and Correlation Between the Two Tests

             Correlation between tests
Cut score    0.50   0.55   0.60   0.65   0.70   0.75   0.80   0.85   0.90   0.95   0.975
Top 1%       0.13   0.16   0.19   0.22   0.27   0.32   0.38   0.45   0.54   0.67   0.76
Top 2%       0.17   0.20   0.23   0.27   0.31   0.36   0.42   0.49   0.58   0.70   0.79
Top 3%       0.20   0.23   0.26   0.30   0.35   0.40   0.45   0.52   0.60   0.72   0.80
Top 4%       0.22   0.25   0.29   0.33   0.37   0.42   0.48   0.54   0.62   0.73   0.81
Top 5%       0.24   0.28   0.31   0.35   0.39   0.44   0.50   0.56   0.64   0.74   0.82
Top 6%       0.26   0.29   0.33   0.37   0.41   0.46   0.51   0.57   0.65   0.75   0.82
Top 7%       0.28   0.31   0.35   0.38   0.43   0.47   0.53   0.59   0.66   0.76   0.83
Top 8%       0.30   0.33   0.36   0.40   0.44   0.49   0.54   0.60   0.67   0.77   0.83
Top 9%       0.31   0.34   0.38   0.41   0.46   0.50   0.55   0.61   0.68   0.77   0.84
Top 10%      0.32   0.36   0.39   0.43   0.47   0.51   0.56   0.62   0.69   0.78   0.84
Top 15%      0.38   0.42   0.45   0.48   0.52   0.56   0.61   0.66   0.72   0.80   0.86
Top 20%      0.44   0.47   0.50   0.53   0.56   0.60   0.65   0.69   0.75   0.82   0.88
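For reference, the constant-error formula from classical test theory is SEM = SD × sqrt(1 − r_xx'), where r_xx' is the reliability coefficient. The sketch below applies it and builds a score band; the function names and the example values (an SD of 15 and a reliability of .89) are illustrative assumptions. A conditional SEM reported by a test publisher for a particular score level would simply replace the single value computed here.

```python
from math import sqrt


def standard_error_of_measurement(sd, reliability):
    """Classical SEM, which assumes the same error variance at every score level."""
    return sd * sqrt(1.0 - reliability)


def score_band(score, sem, z=1.96):
    """Approximate band of +/- z SEMs around an observed score."""
    return (score - z * sem, score + z * sem)


# Illustrative values only: SD = 15, reliability = .89, observed score = 130
sem = standard_error_of_measurement(15.0, 0.89)
low, high = score_band(130.0, sem)
print(round(sem, 2), round(low, 1), round(high, 1))
```

Because the band scales directly with the SEM, a score near the ceiling of a test, where the conditional SEM of scaled scores is large, carries a much wider band than the constant-error formula suggests.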

Conditional errors of measurement are errors that depend on the location of a score on the score scale. The typical patterns of errors of measurement for raw and scaled scores on a fixed-length test are shown in Figure 1. As Figure 1 shows, the pattern for raw scores (i.e., number correct) is the opposite of the pattern for scaled scores and other normative scores based on scaled scores (e.g., IQ scores). Differences in the patterns of errors for raw and scaled scores are caused by the way the scaling process expands the score scale at the extremes of the distribution. This means that passing or failing a single item will have a much larger effect on scale scores for those who score near the ceiling or floor of a test than for those who score near the mean. This most commonly occurs on tests in which all students in a grade are administered the same level of a test. The level of the test that is appropriate for the majority of students in a class will often be too easy for the most able students.

Figure 1. Conditional Standard Errors of Measurement for raw scores (dashed line) and scaled scores (solid line) on the CogAT (Form 6) Verbal Battery, Level A. [Figure not reproduced. Horizontal axis: Number Correct, 0 to 65; vertical axis: SEM, 0 to 18.]

Tests that are scaled using Item Response Theory such as the Otis-Lennon School Ability Test (Otis & Lennon, 2003) and the CogAT (Lohman & Hagen, 2001) typically report conditional errors of measurement for scale scores. Conditional errors of measurement can be dramatically reduced by administering a higher level of the test to more able students. For example, consider the student who receives a Verbal scale score of 221 on CogAT. Table 5.7 in Lohman and Hagen (2002) shows that the error of measurement at this score is 14.8 on Level A of the test but only 7.4 at Level D. Thus, administering the higher level of the test halves the expected error of measurement.

Differential Growth Rates

If errors of measurement were the only factor that contributed to regression to the mean, no additional regression should occur after the first retest. Suppose that only students who obtain high scores on the initial test are selected. On average, scores would be expected to decline when the students were retested. After this first retest, however, scores would regress to the mean true (or universe) score of the group, with some individuals getting higher scores on subsequent retests, some getting lower scores, but the mean true score staying the same. Put differently, the correlation between the initial test score and every subsequent retest would be the same. All of these correlations would estimate the reliability of the test. However, longitudinal studies of ability do not show this pattern. Rather, the correlations tend to decrease as the interval between test administrations increases (Bayley, 1949; Humphreys & Davey, 1988). For example, the upper diagonal of the matrix in Table 2 shows correlations among Composite scores on the ITBS for 6,321 students who were tested every year from third grade to eighth grade (Martin, 1985). The lower diagonal shows correlations among IQ scores for the same intervals estimated from Thorndike’s (1933) meta-analysis of 36 studies in which students were readministered the Stanford-Binet after intervals that ranged from less than a month to 5 years. The pattern in both matrices approximates a simplex: High correlations near the diagonal of the matrix decline as one moves away from the diagonal. Correlations are higher for the longer and therefore more reliable achievement test (median $r_{xx'}$ = .98) than for the Binet test (estimated $r_{xx'}$ = .89). The fact that correlations decline as the interval between tests increases means that factors other than error must affect retest scores on both ability and achievement tests.


One possibility is differential rates of growth. A simplex pattern for correlations will be obtained as long as true-score gains are not perfectly correlated with the true-score base (Humphreys & Davey, 1988). Put differently, year-to-year gains do not have to be random, as some have hypothesized (Anderson, 1939). Rather, they only need to vary across individuals. There is in fact considerable evidence that students show different patterns of growth on ability tests. For example, McCall, Appelbaum, and Hogarty (1973) investigated changes in Stanford-Binet IQ scores for 80 middle-class children who were given the same test 17 times between ages 2½ and 17. IQ profiles for 67 of the 80 children could be classified into one of five groups. The largest group showed a slightly rising pattern of scores over childhood. Other groups showed patterns of sharp declines or increases at different ages. In general, major shifts occurred most frequently at ages 6 and 10. Note that changes in IQ reflect changes in rank within successive age groups rather than changes in ability to perform tasks. An IQ score will decline even when ability improves if the improvement is slower than that of age-mates who obtained the same initial IQ score.

Students’ growth on both ability and achievement tests from year to year is affected by maturation, interest, quality of instruction, out-of-school experiences, and many other personal and social factors. For example, instruction that engages and appropriately challenges a student can result in cognitive growth that is larger than expected. However, the same student may be placed in a classroom with many distractions in the subsequent year and thus show less growth.³

Table 2
Correlations Between ITBS Composite Scores and Binet IQ Scoresᵃ

Grade     3     4     5     6     7     8
3        --    91    89    87    85    83
4        86    --    93    91    89    87
5        83    86    --    94    92    91
6        80    83    86    --    94    93
7        75    80    83    86    --    94
8        70    75    80    83    86    --

Note. Decimals omitted. ᵃAbove the diagonal, from Martin, 1985; below the diagonal, from Thorndike, 1933.
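The simplex pattern in Table 2 can be checked directly by grouping the correlations by the number of grades separating the two testings. The sketch below enters the Table 2 values with decimals restored; the data layout and helper names are illustrative assumptions. If only measurement error were at work, the mean correlation would be roughly the same at every lag.

```python
# Table 2 above the diagonal (ITBS Composite): row grade -> later grades
ITBS_ROWS = {3: [.91, .89, .87, .85, .83], 4: [.93, .91, .89, .87],
             5: [.94, .92, .91], 6: [.94, .93], 7: [.94]}
# Table 2 below the diagonal (Binet IQ): row grade -> earlier grades 3, 4, ...
BINET_ROWS = {4: [.86], 5: [.83, .86], 6: [.80, .83, .86],
              7: [.75, .80, .83, .86], 8: [.70, .75, .80, .83, .86]}


def itbs_lags():
    """(lag in grades, correlation) pairs for the upper triangle."""
    return [(i + 1, r) for values in ITBS_ROWS.values() for i, r in enumerate(values)]


def binet_lags():
    """(lag in grades, correlation) pairs for the lower triangle."""
    return [(grade - (3 + i), r)
            for grade, values in BINET_ROWS.items() for i, r in enumerate(values)]


def mean_by_lag(pairs):
    """Mean correlation at each lag, ordered from shortest to longest lag."""
    grouped = {}
    for lag, r in pairs:
        grouped.setdefault(lag, []).append(r)
    return {lag: sum(v) / len(v) for lag, v in sorted(grouped.items())}


def declines_with_lag(pairs):
    means = list(mean_by_lag(pairs).values())
    return all(a > b for a, b in zip(means, means[1:]))


print(mean_by_lag(itbs_lags()))
print(mean_by_lag(binet_lags()))
print(declines_with_lag(itbs_lags()), declines_with_lag(binet_lags()))
```

Grouping by lag is only a summary check; it does not test the stronger requirement that every row and column declines monotonically away from the diagonal.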

Although growth rates vary across individuals, the stability of individual differences in scores that average across tasks and domains is substantial. Indeed, Humphreys (1985) estimated that between the ages of 9 and 17, true scores on a test of general ability would correlate approximately r = .965 with true scores on the same test administered 1 year later. As he then put it:

It becomes easy to understand the belief in a fixed intelligence when one looks only at the small difference in true score stability from year to year between an estimated [correlation of .965] and the 1.00 required by [the assumption of] a fixed intelligence. (p. 200)

Humphreys (1985) also showed that the correlation between true scores across years could be estimated by $r^y$, where r is the correlation between true scores 1 year apart and y is the number of years separating the two test administrations. Thus, the estimated correlation between true IQ scores at ages 9 and 17 is given by $.965^8 = .75$. This means that about 60% of the children whose true scores fall in the top 3% of the distribution at age 9 would not fall above that cut at age 17. Of course, error in both tests would lower the observed correlation and thus result in substantially less stability across time. We never know true scores, only error-encumbered observed scores. One-year retest correlations typically range from r = .8 to r = .9. If a parallel form of the test is used, then the correlation is even lower.
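The two steps in this argument, decay of the true-score correlation with the interval and further lowering by measurement error, can be sketched as follows. The first function is the $r^y$ rule described above. The second applies the standard attenuation relation from classical test theory (observed correlation equals true-score correlation times the square root of the product of the two reliabilities); the reliability values of .90 are hypothetical and chosen only for illustration.

```python
from math import sqrt


def true_score_correlation(one_year_r, years):
    """True-score correlation across an interval, using the r ** y rule."""
    return one_year_r ** years


def observed_correlation(true_r, reliability_1, reliability_2):
    """Correlation expected between error-laden scores (classical attenuation)."""
    return true_r * sqrt(reliability_1 * reliability_2)


# Ages 9 to 17 with a 1-year true-score correlation of .965
r_true = true_score_correlation(0.965, 8)
print(round(r_true, 2))

# Hypothetical reliabilities of .90 on both occasions
print(round(observed_correlation(r_true, 0.90, 0.90), 2))
```

Even modest unreliability therefore moves the expected observed correlation well below the true-score value, and the proportion of students remaining above a cut falls with it.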

Changes in Score Scales

Both the magnitude and the interpretation of changes in scores are influenced by the psychological and statistical properties of the score scale. Quite commonly, the content of ability and achievement tests differs across score levels. One can reduce these effects by presenting items in a common format at all points on the scale, by checking to ensure that individual differences in items conform to a unifactor model, and by using scaling procedures that attempt to make the scale properties constant throughout its range. However, none of these controls can guarantee that the units of the scale will indeed be uniform, especially at the extremes. For example, the fact that all items are presented in a common format does not mean that items require the same cognitive processes. Matrix tests use a common format. However, difficult items on the Progressive Matrices test require the application of rules not required on simpler items (Carpenter, Just, & Shell, 1990). Nor does the fact that a unidimensional IRT scale can be fit to the data guarantee an equal-interval scale, especially when the full scale is constructed by vertically equating overlapping tests that are administered to examinees of different ages (see Kolen & Brennan, 2004).

Changes in the Norming Population

Longitudinal changes in status are easily confounded by nonrandom loss of cases over time. Although developmental psychologists recognize this as a potential confound in their own research, many who use test scores, particularly those normed on school children, often fail to take into account the fact that a substantial fraction of low-scoring students drop out of school. Nationally, only about two thirds of students complete high school (Barton, 2005). Dropout rates also vary across ethnic groups, states, and grades. Dropout rates have decreased between 11th and 12th grade and increased between 9th and 10th grade (Haney et al., 2004). The upshot is that rank within grade cohort means different things at 12th grade than at 8th grade or at 4th grade. Because less able students tend to drop out at a higher rate than more able students, a percentile rank of 90 means better performance for 12th graders than it does for 8th graders.
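A deliberately extreme simplification shows the direction of this effect. Suppose, hypothetically, that every student who leaves school comes from the very bottom of the original score distribution and that the remaining students keep their relative order. The function name and the 20% dropout figure below are assumptions for illustration only; actual dropout is related to achievement far less perfectly than this.

```python
def percentile_in_original_cohort(pr_in_remaining, dropout_fraction):
    """Percentile rank in the full original cohort that corresponds to a
    percentile rank among the students who remain, ASSUMING all dropouts
    came from the bottom of the distribution (an idealization)."""
    return 100.0 * dropout_fraction + (1.0 - dropout_fraction) * pr_in_remaining


# Hypothetical: 20% of the cohort has left school by the later grade
print(percentile_in_original_cohort(90.0, 0.20))
```

Under this idealization, a student at the 90th percentile of the remaining cohort stands higher than the 90th percentile of the cohort as originally constituted, so the same percentile rank reflects stronger performance at the later grade.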

Summary

For educational, psychological, and statistical reasons, test scores obtained by high-scoring students will change from year to year. This change reflects errors of measurement in the tests that are common to all and errors that are particularly severe for extreme scorers, differential growth of students from year to year, changes in the content of score scales or the tests, and systematic changes in the representativeness of samples on which norms are derived. How large would these changes be as the result of the combination of these factors? One way to address this question is to set a criterion for giftedness and then either estimate (from correlations) or count (from scores) the number of students at later grades who would fail to meet the criterion. This has important implications for policy, such as how best to identify those students who will continue to excel, or how frequently schools should retest to determine eligibility for TAG services.

Estimating the Size of Regression Effects From Longitudinal Studies

One of the most important limitations of most longitudinal studies in the field of gifted education is that they follow only that portion of the population identified as gifted at one point in time. A better procedure, of course, would be to follow the entire cohort of students. However, longitudinal studies in which an entire cohort of students is repeatedly administered ability tests are rare, generally dated, and more often than not, quite small. For example, the classic Berkeley Growth Study (Bayley, 1949) had only 40 children. The Fels data used by McCall et al. (1973) had 80 subjects. Correlations computed on such small samples have large standard errors. For 40 cases, the 95% confidence interval for a population correlation of ρ = .65 is r = .43 to r = .80. Further, the cases are often not representative of the population. Even when samples are much larger, as in the Wilson (1983) study, differential dropout and variation in sample size across occasions at best complicate and at worst seriously bias the analyses.⁴

Achievement tests, on the other hand, are often administered every year to large groups of students. If the sample is large, the data can be reweighted better to represent the population distributions of achievement. This can in significant measure control for nonrandom loss of cases over time. Large samples also mean that correlations are quite stable. Correlations among achievement test scores exhibit the same simplex structure that is observed for ability tests. This should be expected, given the high correlations between ability and achievement tests. Indeed, ability tests are probably best understood as achievement tests that sample general reasoning abilities developed in a culture, whereas achievement tests sample those abilities specifically developed through formal schooling. The belief that ability and achievement tests measure (or ought to measure) qualitatively different constructs has inhibited the interpretation and use of both types of tests since the earliest days of testing (Lohman, in press-a).

Martin (1985) reported a longitudinal analysis of ITBS scores for 6,321 students who were tested every year from third to eighth grade. Prior to computing the correlations, Martin reweighted the data to better approximate the distribution of grade 5 Composite achievement for Iowa students. We used his correlations to estimate the percent of students who fell in the top 3% of the Reading Total, Language Total, Mathematics Total, and Composite score distributions at grade 3 who were also in the top 3% on each subsequent retest. (See Table 2 for Composite score correlations.) The estimates assume a bivariate normal distribution of each pair of test scores.⁵ The results are shown graphically in Figure 2.
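An estimate of this kind can also be obtained by simulation, which amounts to counting cases in artificial data. The sketch below is a Monte Carlo illustration under the same bivariate normal assumption and is not the authors' computation; the function name, sample size, and random seed are arbitrary choices. The correlations are the grade 3 Composite entries from Table 2.

```python
import random
from math import sqrt
from statistics import NormalDist


def simulate_retention(r, top_fraction=0.03, n=100_000, seed=0):
    """Share of simulated students above the cut at Time 1 who are also
    above the same cut at Time 2, for bivariate normal scores with correlation r."""
    rng = random.Random(seed)
    cut = NormalDist().inv_cdf(1.0 - top_fraction)
    resid_sd = sqrt(1.0 - r * r)
    selected = retained = 0
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        if z1 <= cut:
            continue                      # not selected at Time 1
        selected += 1
        z2 = r * z1 + resid_sd * rng.gauss(0.0, 1.0)
        if z2 > cut:
            retained += 1
    return retained / selected


# Grade 3 Composite correlations with grades 4 through 8 (Table 2)
GRADE3_COMPOSITE_R = {4: 0.91, 5: 0.89, 6: 0.87, 7: 0.85, 8: 0.83}
for grade, r in GRADE3_COMPOSITE_R.items():
    print(grade, round(simulate_retention(r), 2))
```

Simulated proportions fluctuate with the seed and sample size, so they should agree with estimates derived analytically from the correlations only to within sampling error.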

The greatest regression, which is largely due to errors of measurement, occurs from Year 1 to Year 2. Only about 60% of the students who had Composite scores in the top 3% in third grade also scored in the top 3% in fourth grade. Note that this occurs in spite of the fact that Composite ITBS scores are highly reliable (K-R 20 $r_{xx'}$ = .98) and show substantial stability across years (r = .91 for grade 3 to grade 4). As would be expected, regression effects were greater for the Reading, Language, and Mathematics subtest scores than for the Composite score that combines them. For each of these content scores, the fallout was approximately 50% in the first year. As the figure shows, however, regression continues at a slower rate across grades. This means that regression effects reflect more than errors of measurement. By eighth grade, the correlations indicate that only 35–40% of those who scored in the top 3% at grade 3 would still score in that range.

The procedures that many schools use to identify exceptional students were not designed to cope with regression effects of this magnitude. Indeed, some use procedures that exacerbate these effects. Others, however, use procedures that, wittingly or unwittingly, reduce regression effects.


Regression and Common Identification Procedures

Schools use nominations, rating scales, and test scores in many different ways when selecting students for participation in special classes for the gifted. In this section, we examine some of the more common rules. The first policy is to require a high score on two or more tests. We call this the “and” rule. The second possibility is to accept a high score on either of two or more tests. We call this the “or” rule. Although rarely employed, another possibility is to average scores across two or more measures. We call this the “average” rule. The three rules are illustrated in Figure 3.
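The logic of the three rules can be written out for the two-test case. The sketch below is a minimal illustration: the function names, the cut score of 1.88 (roughly the top 3% of a normal distribution), and the four student records are hypothetical. In particular, it compares the raw average of two z scores with the same cut used for single tests, which is a simplification.

```python
def and_rule(z1, z2, cut):
    """Admit only if both scores reach the cut."""
    return z1 >= cut and z2 >= cut


def or_rule(z1, z2, cut):
    """Admit if either score reaches the cut."""
    return z1 >= cut or z2 >= cut


def average_rule(z1, z2, cut):
    """Admit if the mean of the two scores reaches the cut."""
    return (z1 + z2) / 2.0 >= cut


CUT = 1.88  # hypothetical cut, about the top 3% on a single test
STUDENTS = {"A": (2.1, 1.5), "B": (2.0, 1.9), "C": (2.6, 1.4), "D": (1.0, 1.2)}

for name, (z1, z2) in STUDENTS.items():
    print(name, and_rule(z1, z2, CUT), or_rule(z1, z2, CUT), average_rule(z1, z2, CUT))
```

In this toy data the “or” rule admits the most students and the “and” rule the fewest, while the “average” rule admits a student with one very high score and one moderately high score whom the “and” rule would exclude.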

The “And” Rule

Many TAG programs set up a series of hurdles and admit only those students who surmount all of them. For example, the potential pool of applicants is first restricted to those students who are nominated by a teacher or who score above a certain score on a screening test. These students are then administered a second test. Only those who exceed some score on the second test are admitted to the program.

Figure 2. Percent of cases in the top 3% of the grade 3 distribution also in the top 3% of the score distributions at grades 4 through 8 for ITBS Reading, Language, Mathematics, and Composite Total scores. [Line plot. X-axis: grades compared (3 vs 3 through 3 vs 8). Y-axis: percent still in top 3% (0% to 100%). Series: Reading Total, Language Total, Mathematics Total, Composite.]

There are both advantages and disadvantages to this procedure. The primary advantage is that it reduces the number of students who must be administered the second test. This can be important when the second test must be individually administered by a trained examiner. The second advantage of the multiple hurdles procedure is that it decreases the amount of regression that will be observed on future occasions when compared to a selection rule that uses only one test or takes the highest score on any of several tests. However, as will be shown, this effect is only observed if both tests are used to validate student classifications on the second occasion.

The primary disadvantage of the “and” rule is that the procedure assumes that the two tests are exchangeable measures of the same construct. If the tests are not exchangeable, then the sample will be biased unless a very liberal cut score is used for the first test. For example, suppose that the first “test” is a teacher nomination scale. Students who do not conform to the teacher’s model of giftedness but who would have exceeded the cut score on the second test will not be considered for the program. This was one of the limitations of the Terman study (Terman & Oden, 1959). A second disadvantage is that the selection rule is noncompensatory. A very high score on one test cannot compensate for a score on the second test that is just below the cut. Therefore, requiring students to score above a particular cut score on Test 1 and Test 2 restricts the number of students who are identified compared to a rule that admits on the basis of a high score on either test. But, by how much?

Figure 3. Plots of the conjunctive “and” rule (left panel), the disjunctive “or” rule (center panel), and the statistically optimal “average” rule (right panel). [Three panels titled “Test 1 and Test 2,” “Test 1 or Test 2,” and “Average of Test 1 and Test 2.” X-axis: Test 1. Y-axis: Test 2.]

Table 1 shows the amount of regression to expect with various common cut scores for two selection tests that are correlated to different degrees. The table shows the proportion of students above a common cut score on both tests as the correlation between two tests varies from r = .5 to r = .975. For example, consider the case in which the common cut score is set at the top 3% and the correlation between the tests is r = .80. Table 1 shows that 45% of the students in the population who score in the top 3% on Test 1 are expected to score in the top 3% on Test 2. This means that 45% of the 3% who met the criterion on Test 1, or 1.35% of the total student population, will be admitted when a score in the top 3% is required on both tests.

If a more lenient cut score is used for the initial nomination procedure and the same cut score is used for the final admissions test (top 3%), then the effects are much smaller. Table 1 cannot be used to estimate these effects, because it assumes a common cut score.⁶ For example, once again assume that the correlation between Test 1 and Test 2 is r = .80. Suppose that we take the top 10% of the cases on Test 1. The top 10% on the first test includes 79% of the cases in the top 3% on the second test. An even more lenient criterion of the top 20% on Test 1 gets 93% of those who score in the top 3% on Test 2.
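The percentages quoted for r = .80 can be approximated directly from the bivariate normal model. The sketch below is an illustration only and is not the procedure used for the original estimates (End Note 5 says those came from the StaTable program). It assumes standardized scores and integrates the conditional tail probability numerically with Simpson's rule; the function name `joint_upper_tail` and its parameters are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution


def joint_upper_tail(p1, p2, r, steps=4000, upper=8.0):
    """Approximate P(X in its top p1 and Y in its top p2) for a
    standard bivariate normal pair (X, Y) with correlation r."""
    c1 = STD.inv_cdf(1 - p1)   # z cut score on Test 1
    c2 = STD.inv_cdf(1 - p2)   # z cut score on Test 2
    s = sqrt(1 - r * r)        # SD of Y given X
    h = (upper - c1) / steps   # step size; steps must be even

    def f(x):
        # density of X at x times P(Y > c2 | X = x)
        return STD.pdf(x) * (1 - STD.cdf((c2 - r * x) / s))

    # composite Simpson's rule on [c1, upper]
    total = f(c1) + f(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(c1 + i * h)
    return total * h / 3


# Share of Test 1's top 3% who are also in Test 2's top 3% (common cut).
share_top3 = joint_upper_tail(0.03, 0.03, 0.80) / 0.03
# Share of Test 2's top 3% captured by a lenient screen on Test 1.
share_top10 = joint_upper_tail(0.10, 0.03, 0.80) / 0.03
share_top20 = joint_upper_tail(0.20, 0.03, 0.80) / 0.03
print(f"{share_top3:.2f} {share_top10:.2f} {share_top20:.2f}")
```

Under these assumptions the three shares should land close to the 45%, 79%, and 93% reported in the text. Small differences would reflect rounding and the numerical method.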

The policy implications are clear. If the goal is to reduce the number of students who must be administered the second test but to exclude as few as possible of those who would obtain high scores on the second test, then one must use a lenient criterion on the screening test. This is increasingly important as the correlation between the two tests declines. If, however, both tests are equally reliable and are assumed to measure the same construct, then similar criteria can be used on both. Nevertheless, the proportion of students who clear both hurdles will be considerably smaller than the proportion who clear either hurdle. The lower the correlation between the tests, the smaller this proportion will be. Finally, if the two tests are in fact exchangeable, then a compensatory model such as the “average” rule is more defensible.

The “Or” Rule

The disjunctive “or” rule has quite different effects. Table 1 allows one to estimate the effects of this rule, as well. As before, assume a correlation of r = .80 and a common cut score of the top 3%. Test 1 admits 3% of the population. Test 2 also admits 3%, but 45% of these students (as Table 1 shows) were already admitted by Test 1. The remaining 55% will be new. Therefore, 3% + (.55)(3%) = 4.65% of the student population would be admitted. Changing the rule from “and” to “or” more than triples the number of students admitted, from 1.35% to 4.65%.
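The arithmetic behind the two admission rates can be written out in a few lines. This is a minimal sketch that takes the 45% overlap from Table 1 as given; the function name `and_or_rates` is hypothetical.

```python
def and_or_rates(p_cut, overlap):
    """Admission rates under the "and" and "or" rules.

    p_cut   -- proportion admitted by either test alone (e.g., 0.03)
    overlap -- share of one test's top group that is also in the other
               test's top group (e.g., 0.45 from Table 1 for r = .80)
    """
    and_rate = p_cut * overlap               # must clear both tests
    or_rate = p_cut + p_cut * (1 - overlap)  # Test 1 admits, plus new Test 2 admits
    return and_rate, or_rate


and_rate, or_rate = and_or_rates(0.03, 0.45)
print(f"and: {and_rate:.4f}  or: {or_rate:.4f}  ratio: {or_rate / and_rate:.2f}")
```

With these inputs the "and" rule admits 1.35% of the population and the "or" rule 4.65%, a ratio of roughly 3.4 to 1.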

The disjunctive “or” rule is most defensible if the two tests measure different constructs such as language arts or mathematics. If programs (or acceleration options) are available in both domains, then one should seek to identify students who excel in either domain, not just those who excel in both domains. However, as is shown later, multiple measures of aptitude for each domain are preferred to a single measure.

The “or” rule is not defensible, however, when both tests are assumed to measure the same construct. For example, the test scores may represent multiple administrations of the same ability test or consecutive administrations of several different ability tests. Error of measurement is defined as the difference between a particular test score for an individual and the hypothetical mean test score for that individual that would be obtained if many parallel forms of the test could be administered. The highest score in a set of presumably parallel scores is actually the most error-encumbered score in that set. Therefore, unless one has a good reason for discounting a particular score as invalid, taking the highest of two or more presumably parallel test scores will lead to even more regression to the mean than would be observed by using just one score.

The “Average” Rule

If both tests measure the same construct, however, the statistically optimal rule is neither “or” nor “and” but rather “average.” The “average” rule will admit more students than the restrictive “and” rule but fewer students than the liberal “or” rule. It allows for more compensation than the “and” rule but less compensation than the “or” rule. The student who has a high score on one test but a score that is just below the cut on the other test will be admitted. Essentially, students are ranked on the basis of where they fall on the 45° diagonal in the plot of scores on Test 1 versus scores on Test 2 rather than either the X-axis or the Y-axis. However, because the average of two scores will immediately regress to the mean, fewer students will meet an arbitrary cut score than will meet it if just one test is administered. With a correlation of r = .8, for example, 2.4% of the students would be expected to have an average score that exceeded the cut score that admitted 3% on either test alone.⁶
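The 2.4% figure follows from the variance of an average. The sketch below assumes that both tests are standardized and bivariate normal with correlation r, so that their mean has variance (1 + r)/2. The function name `average_rule_rate` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def average_rule_rate(p_cut, r):
    """Proportion whose two-test average exceeds the single-test cut score."""
    std = NormalDist()
    cut = std.inv_cdf(1 - p_cut)  # z score that admits p_cut on one test
    sd_avg = sqrt((1 + r) / 2)    # SD of the mean of two unit-variance scores
    return 1 - std.cdf(cut / sd_avg)


rate = average_rule_rate(0.03, 0.80)
print(f"{rate:.3f}")
```

Because the average has a smaller standard deviation than either score alone (about 0.95 when r = .8), the same cut score sits further out in the tail of the averaged distribution, which is why the cut must be lowered to admit the same proportion.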

Regression Effects on Subsequent Retest

One of the most important considerations for any selection rule is the extent to which it effects a reasonable compromise between obtaining the most stable scores and the most valid scores. The most stable scores will generally be obtained by combining scores across different tests and occasions, with each weighted by its reliability. However, a score that averages across several domains will generally be less valid as a measure of aptitude for a specific domain than scores that capture the general and specific aptitudes needed to attain excellence in that domain. We discuss both of these issues but first focus on the stability of scores in the common scenario in which students must be nominated before they are tested. Do the admission test scores for these students exhibit greater stability than would be observed if no screening test had been administered? Intuitively, it seems reasonable to expect less regression to the mean over time, say, in IQ scores for a group of students who were first nominated by their teachers as the most able students in their class than for a group identified solely on the basis of their IQ scores. As we shall see, however, intuitions can be wrong.

To Nominate or Not to Nominate

Simulations provide a useful method for investigating the regression effects of different decision rules when more than two variables must be considered. Here, we investigated the typical scenario in which only those students who are nominated by a teacher take an intelligence test. As already demonstrated, the number of students admitted depends on the cut scores established for the nomination procedure and the correlation between scores on the nomination rating scale and the admissions test. To simplify matters, we assume that 10% of students with the highest scores on the nomination scale are administered the intelligence test. The cut score for the intelligence test is set so that in an unselected population, 3% of the students would be admitted. For an intelligence test with SD of 15, this would be an IQ > 128.

Nomination procedures vary in the extent to which they measure the same characteristics as the intelligence test. In this simulation, we started with a population of 10,000 students. We varied the correlation between the nomination scale and the intelligence test from r = .1 to r = .9. A high correlation such as r = .9 simulates the case in which the nomination procedure is highly effective in identifying those who will obtain the highest scores on the intelligence test. The critical question is whether the nomination process reduces the amount of regression that will be seen a year later when the intelligence test is readministered. We assumed that the correlation between these two administrations of the intelligence test was r = .8. Table 3 shows the results.

The first column of the table shows the correlation between the nomination rating scale and the intelligence test. The second column of the table shows the number of students in a population of 10,000 students who scored in the top 10% on the nomination scale and then obtained an IQ > 128 on the intelligence test. These are the students who would be admitted to the program. When the correlation between the nomination scale and the intelligence test was r = 1.0, then 300 students (i.e., 3% of 10,000) would be admitted. This simulates the case in which the nomination procedure was not used and all students took the intelligence test. As the correlation between the nomination scale and the admissions test declines, many students who would have obtained IQs greater than 128 on the admissions test were excluded because they were not nominated. When the correlation is high, however, one might argue that many of the excluded students did not belong in the group in the first place and would be the students least likely to score above an IQ of 128 when the intelligence test was readministered 1 year later. The third and fourth columns of the table show that this is not the case. Although the nomination procedure reduced the number of students who were admitted, it did not significantly reduce the regression effects observed when the intelligence test was readministered 1 year later.

Table 3
Effects of Nomination on Subsequent Regression to the Mean of IQ Scores

Correlation between     Number of students    Number of admitted        Percent of admitted
nomination scale and    nominated with        students with IQ > 128    students with IQ > 128
intelligence test       IQ > 128 (a)          1 year later (b)          1 year later (c)

1.0 (d)                 300                   126                       42
0.9                     274                   122                       45
0.7                     202                   101                       50
0.5                     137                    65                       47
0.3                      84                    37                       44
0.1                      41                    20                       49

(a) Number of students from a population of 10,000 scoring in the top 10% on the nomination scale and the top 3% on the admissions test.
(b) Number admitted scoring in top 3% on retesting; correlation between the two administrations of the admissions test was r = .8.
(c) (Column 3 / Column 2) × 100.
(d) A correlation of r = 1.0 simulates the case in which the nomination step is omitted and all students are administered the intelligence test.
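A simulation of this general kind can be sketched as follows. It is not a reconstruction of the original simulation, whose details are not given here. In particular, it assumes that the nomination rating is related to the retest only through the first intelligence score, which is one of several possible structures, and it uses a larger population than 10,000 to reduce sampling noise. The function name `simulate` and its parameters are hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist


def simulate(r_nom, r_retest=0.8, n=100_000, seed=1):
    """Return (number admitted, number still above the cut on retest).

    Assumption: the nomination score and the retest score each depend on
    the first IQ score plus independent noise (all in z-score units).
    """
    rng = random.Random(seed)
    iq_cut = NormalDist().inv_cdf(0.97)  # top 3% of an unselected population
    rows = []
    for _ in range(n):
        iq1 = rng.gauss(0, 1)
        nom = r_nom * iq1 + sqrt(1 - r_nom ** 2) * rng.gauss(0, 1)
        iq2 = r_retest * iq1 + sqrt(1 - r_retest ** 2) * rng.gauss(0, 1)
        rows.append((nom, iq1, iq2))
    # Only the top 10% on the nomination scale take the intelligence test.
    nom_cut = sorted(row[0] for row in rows)[int(0.9 * n)]
    admitted = [row for row in rows if row[0] >= nom_cut and row[1] > iq_cut]
    retained = [row for row in admitted if row[2] > iq_cut]
    return len(admitted), len(retained)


for r in (1.0, 0.7, 0.3):
    admitted, retained = simulate(r)
    print(r, admitted, retained, round(100 * retained / admitted))
```

Under this assumed structure the number admitted falls as the nomination correlation falls, while the percentage still above the cut on retest stays in the same general range. A different assumed relationship between the nomination rating and the retest could produce a different pattern.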

Note the important difference between this procedure and the case in which scores on the screening test (or nomination rating scale) and the admissions test are first combined and students are selected on the basis of their scores on the resulting composite. One of the easiest ways to combine scores is simply to sum them or average them, after first putting all scores on the same scale.⁷

Combining Ability and Achievement Test Scores

The identification of academically talented students ultimately resolves to the estimation of aptitude for rapid or advanced learning in the particular educational programs that can be offered. Aptitude is a multidimensional concept. It has cognitive, affective, and conative dimensions. The primary cognitive aptitudes for academic learning are current knowledge and skills in a domain and the ability to reason in the symbol systems used to communicate new knowledge in that domain. The primary affective aptitude is interest in the domain. The primary conative aptitude is the ability to persist in one’s pursuit of excellence. Different instructional programs require or afford the use of different aptitudes. One of the most important goals for research should be to better understand the relationships between those aptitude characteristics that can be measured prior to identification and that contribute to the prediction of success in different kinds of programs.

There is much research, however, on the critical importance of the two primary aspects of cognitive aptitude for learning: prior achievement and reasoning abilities. The best way to use both is to combine the scores so that they best predict subsequent achievement. When done well, both immediate and long-term regression effects will be minimized. In this section we explore some basic options for achieving this goal. To do this, we need a longitudinal data set that has both achievement and ability scores for a large sample of students.

Gustafson (2002) collected ability and achievement test scores for 2,362 students in a large Midwestern school district who were tested first in grade 4, then in grade 6, and again in grade 9. The ability test was CogAT Form 5 (Thorndike & Hagen, 1993) and the achievement test was the ITBS Form K Survey Battery (Hoover, Hieronymus, Frisbie, & Dunbar, 1993). In order to illustrate how different selection models perform over time, we looked at high achievers in two domains: reading (Reading Vocabulary plus Reading Comprehension) and Mathematics (Mathematics Concepts, Mathematical Problem Solving, and Math Computation) at fourth grade.

Table 4 shows the correlations across the three grades for the three batteries of the ITBS and the three batteries of CogAT. In all cases, the correlation between grades 4 and 9 was smaller than the correlation between grades 4 and 6 or between grades 6 and 9.

The solid line in the left panel of Figure 4 shows the percentage of high-achieving students identified on the fourth-grade reading test who also met the same percentile-rank cut score in sixth and in ninth grade. We could not use a criterion of the top 3% because of a ceiling effect on the grade 9 tests.⁸ Therefore, we selected the 7% of students with the highest scores at grade 4. As in the analyses of the Martin (1985) data, Figure 4 shows a dramatic decline in the percent of students identified at grade 4 who continued to score at or above the 93rd percentile between the first test (grade 4) and the second test (grade 6), and then a smaller decline between grades 6 and 9. The right panel of Figure 4 shows similar effects for Mathematics.

Table 4
Correlations Across Grades for ITBS (Form K) and CogAT (Form 5) Scores (N = 2,363)

                        Grades
Test                    4 with 6    6 with 9    4 with 9

ITBS
  Reading               0.76        0.77        0.73
  Language              0.77        0.72        0.67
  Mathematics           0.74        0.73        0.67
  Composite             0.86        0.84        0.79

CogAT
  Verbal                0.81        0.80        0.75
  Quantitative          0.75        0.77        0.71
  Nonverbal             0.72        0.74        0.68
  Composite             0.85        0.87        0.82

Note. Data from Gustafson (2002).

Figure 4. Students in the top 7% of the ITBS distribution who were still in the top 7% at grades 6 and 9 using three identification models. Left panel for Reading Total scores; right panel for Mathematics Total scores. [Two line plots. X-axis: Grade 4, Grade 6, Grade 9. Y-axis: percent still in top 7% (0% to 100%). Series: ITBS, ITBS & CogAT Pred, CogAT Pred.]

Although current achievement is a critical aspect of academic talent, it is also important to consider other characteristics that indicate readiness to continue to achieve at a high level such as reasoning abilities in the major symbol systems used in that domain, interest in the domain, and persistence. We did not have measures of interest or persistence, but did have CogAT reasoning scores in verbal, quantitative, and figural domains. Therefore, we also looked at the linear combination of the three CogAT reasoning scores that best predicted ninth-grade reading.⁹ The percentage of top readers that would be identified using this estimate from CogAT scores is shown by the dashed line in Figure 4. Clearly, using predicted rather than observed reading achievement at grade 4 missed many of the best readers at grade 4. However, the figure shows that most of those who were missed did not fall in the top 7% of the reading distribution at grade 6. And by grade 9, grade 4 Reading and grade 4 CogAT scores both identified the same proportion of students who were still in the top 7% of the reading distribution. For mathematics (right panel of Figure 4), grade 4 Mathematics and grade 4 CogAT identified the same proportion of students still in the top 7% at grade 6. By grade 9, the regression estimate based on grade 4 CogAT scores identified more of those who were in the top 7% of the Math distribution than did grade 4 ITBS mathematics scores.

Because both prior achievement and reasoning abilities function as aptitudes for learning, a more effective selection model would combine current achievement and reasoning abilities in the symbol systems used to communicate new knowledge in the domain. Estimating achievement at grade 4 is straightforward. We used the child’s ITBS Reading Total and Mathematics Total scaled scores. But which of the three reasoning scores from CogAT should we use? In previous analyses of these data, we estimated the optimal weights to apply to ITBS and CogAT scaled scores at grade 4 to predict achievement in reading and mathematics at grade 9 (see Tables 2 and 3 in Lohman, 2005). These analyses showed that grade 9 Reading was best predicted by the grade 4 CogAT Verbal score, with minor contributions from the CogAT Quantitative and Nonverbal scores. Similarly, grade 9 Mathematics was best predicted by the grade 4 Quantitative score, although both the Nonverbal and Verbal batteries contributed significantly, as well. Although we used the optimal weights, one can do about as well by using only the CogAT Verbal score to predict reading and the sum of all three CogAT scores (i.e., the Composite) to predict mathematics. This gave us two aptitude scores for each child in each domain: current achievement in that domain and a composite CogAT reasoning score for that domain.

Next, we combined observed achievement in fourth grade with our estimate of predicted achievement (in reading or in math) in ninth grade. Observed fourth-grade achievement and predicted ninth-grade achievement in reading or math were converted to standard or z scores and then summed. This ensured that both scores contributed equally to the composite. We weighted each equally because our previous analyses showed that prior achievement and reasoning abilities made approximately equal contributions to the prediction of achievement at grade 9. For detailed instructions on how to create and combine standard scores in a Microsoft Excel worksheet, see Lohman (in press-b).
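The standardize-then-sum step can be illustrated with a short sketch. The scores below are invented for five hypothetical students and are not taken from the Gustafson (2002) data. The function names `z_scores` and `composite` are likewise hypothetical, and the population standard deviation is used as one reasonable choice.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Convert raw scores to z scores (mean 0, SD 1 within the group)."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def composite(achievement, predicted):
    """Equally weighted sum of two standardized scores per student."""
    return [a + b for a, b in zip(z_scores(achievement), z_scores(predicted))]


# Invented scores for five students (illustration only).
itbs_grade4 = [210, 195, 230, 185, 220]  # observed grade 4 achievement
pred_grade9 = [300, 310, 305, 280, 330]  # predicted grade 9 achievement
scores = composite(itbs_grade4, pred_grade9)

# Student indices ordered from highest to lowest composite.
ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
print(ranked)
```

Standardizing first matters because the two measures are on different scales. Summing raw scores would let the measure with the larger standard deviation dominate the ranking.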

The dotted lines in Figure 4 show how this selection variable performed. The largest effect was at grade 4. Although the composite score did not identify all the high scorers at grade 4, it did identify about 70% in reading and about 80% in math. At grade 6, the composite performed as well as grade 4 reading achievement alone and significantly better for math than grade 4 math achievement. By grade 9, the composite achievement-ability measure was the best predictor for reading and was about as good as CogAT scores alone for math.

As Figure 4 shows, there is a tradeoff between measurement of current achievement and aptitude for future achievement. Measures of domain-specific achievement best identify high performers at a particular point in time. However, many of these students do not continue to perform at the same stellar levels of achievement even after 1 year (see also Figure 2). On the other hand, although reasoning abilities do not identify all of the high achievers at a grade, those that they do identify are those who are most likely to continue as high achievers in subsequent years. Indeed, in mathematics at least, by grade 9 those with the highest predicted achievement based on grade 4 CogAT scores were even more likely to still be identified as high achievers than those who were identified on grade 4 ITBS Math alone. The final set of analyses using both achievement and ability test scores suggests that a sensible policy for identifying talented and gifted students would combine both current achievement in particular domains and that combination of reasoning abilities that best predicts later achievement in those domains.

Policy Implications

The stability of test scores has important implications for educational policy. First, multiple scores should always be used to make educational decisions about gifted students. There are two ways that this could be done. Each student’s previous test scores could be taken into account when making educational decisions, such as considering achievement test scores over the course of a few years. For example, one could look at the average of scaled scores on the two most recent assessments.¹⁰ However, when scores are averaged, the cut score must be lowered: The more reliable average score will show some regression to the mean.

Multiple scores can also be used by combining both achievement and ability test scores that are administered at roughly the same time. Figure 4 shows that the average of ITBS achievement and the combination of CogAT scores that best predicted later achievement performed better than either measure alone in identifying those students who continued to exhibit academic excellence in particular domains. Schools should also investigate the use of measures of interest and persistence, although these measures should surely be given much less weight than measures of achievement and ability.¹¹ Combining scores that estimate different aptitudes needed for the development of future competence is the best way to identify talented students. However, judgments about aptitude are best made by comparing a student’s scores on the relevant aptitude variables to those of other students who have had similar opportunities to develop the knowledge, skills, interests, or other attributes sampled by the assessment (see Lohman, in press-b).

Another important policy issue is the amount of time that should be allowed before students are retested for continued participation in gifted programs. Applying the entries in Table 1 to the correlations reported in Table 2 suggests that 3 years is an outside limit (2 would be better), especially if the first test is administered during the early primary grades (K–2). Finally, because test scores are especially unstable for those students with extreme scores, students who would qualify as gifted based on one test will not necessarily qualify as gifted when retested even 1 year later. Therefore, instead of using terms that imply fixed categories, such as gifted, perhaps educators should use words that focus less on a fixed state and instead on current accomplishment, such as superior achievement or high accomplishment.

Conclusions

Our first goal in this paper was to summarize some of the basic facts about regression to the mean for researchers and practitioners in the field of gifted education. We hoped to dispel notions that regression to the mean is attributable solely to errors of measurement. Rather, regression is determined by the correlation between two sets of scores. Anything that lowers the correlation increases regression to the mean. The data that we presented show that, even for highly reliable test scores, approximately half of the students who score in the top 3% of the score distribution in 1 year will not fall in the top 3% of the distribution in the next year. This has important implications for both research and practice.

The research implication is that we need more longitudinal investigations of individual differences in abilities of all sorts. Retesting those who are identified as gifted at one point in time provides useful information. However, as shown here, this will commonly miss many, or even most, of those who attain high scores on the attribute at some later point in time. Therefore, much more information about the origin and development of academic excellence (rather than precocity) could be obtained from studies in which the entire population of learners was followed over time.

The primary implication for practice is that one can substantially reduce the amount of regression by combining the information from multiple assessments. However, different ways of combining scores have dramatically different effects on the number of students who are admitted and the amount of regression seen in their test scores. In general, the statistically optimal method of combining similar scores is to average them.

In the end, in addition to multiple measures, local norms provide a better way to identify students for inclusion in special programs that are based in the school. Understanding that all abilities are developed and that schools play a critical role in that process can lead to policies in which children’s reasoning abilities are assessed if not as regularly as their achievement, then at least at several points in their academic careers. Lacking such understanding, both selection policies and research on the gifted will continue to give mute testimony to the robustness of regression to the mean.

References

Ackerman,P.L.(1989).Abilities,elementaryinformationprocesses,andothersightstoseeatthezoo.InR.Kanfer,P.L.Ackerman,&R.Cudeck(Eds.),abilities, motivation, and methodology: the Minnesota symposium on learning and individual differences (pp.281–293).Hillsdale,NJ:Erlbaum.

Anderson,J.E.(1939).Thelimitationsofinfantandpreschooltestsinthemeasurementof intelligence.Journal of Psychology, 3,351–379.

Barton,P.E.(2005).one-third of a nation: rising dropout rates and declining opportunities.Princeton,NJ:EducationalTestingService,PolicyEvaluationandResearchCenter.

Bayley,N.(1949).Consistencyandvariability inthegrowthofintelligence frombirthtoeighteenyears.Journal of Genetic Psychology, 25,165–196.

Callahan,C.M.(1992).Determiningtheeffectivenessofeduca-tionalservices:Assessmentissues.Inchallenges in gifted educa-tion: developing potential and investing in knowledge for the 21st

Page 30: Gifted Today but Not Tomorrow? Longitudinal Changes in Ability ...

Journal for the Education of the Gifted480

century (pp.109–114).Columbus,OH:OhioDepartmentof Education. (Eric Document Reproduction Service No.ED344416)

Carpenter,P.,Just,M.A.,&Shell,P.(1990).Whatoneintelligencetestmeasures:AtheoreticalaccountoftheprocessingintheRavenProgressiveMatricesTest.Psychological review, 97,404–431.

Daniel,M.H.(2000).Interpretationofintelligencetestscores.InR.J.Sternberg(Ed.),Handbook of intelligence(pp.477–491).Cambridge,UK:CambridgeUniversityPress.

Flynn,J.R.(1987).MassiveIQgainsin14nations:WhatIQtestsreallymeasure.Psychological Bulletin, 101,171–191.

Flynn,J.R.(1999).Searchingforjustice:ThediscoveryofIQgainsovertime.american Psychologist, 54,5–20.

Gagné,F.,&StPère,F.(2001).WhenIQiscontrolled,doesmotiva-tionstillpredictachievement?intelligence, 30,71–100.

Gustafson,J.P.(2002).Empirical Bayes estimators as an indicator of educational effectiveness: the Bryk and raudenbush school perfor-mance model.Unpublisheddoctoraldissertation,UniversityofMinnesota,Minneapolis.

Haney,W.,Maduas,G.,Abrams,L.,Wheelock,A.,Miao,J.,&Gruia,I.(2004).the education pipeline in the united States 1970–2000.Boston:BostonCollege,NationalBoardonEducationalTestingandPublicPolicy.

Hoover,H.D.,Dunbar,S.B.,&Frisbie,D.A.(2001).the iowa tests of Basic Skills, form a.Itasca,IL:Riverside.

Hoover,H.D.,Hieronymus,A.N.,Frisbie,D.A.,&Dunbar,S.B.(1993).iowa tests of Basic Skills, form K: Survey Battery.Chicago:Riverside.

Humphreys,L.G.(1985).Generalintelligence:Anintegrationofactor,test,andsimplextheory.InB.B.Wolman(Ed.),Handbook of intelligence: theories, measurement, and applications (pp.201–224).NewYork:Wiley.

Humphreys,L.G.,&Davey,T.C.(1988).Continuityinintellectualgrowthfrom12monthsto9years.intelligence, 12,183–197.

Kolen,M.J.,&Brennan,R.L.(2004).test equating, scaling, and linking: Methods and practices(2nded.).NewYork:Springer-Verlag.

Page 31: Gifted Today but Not Tomorrow? Longitudinal Changes in Ability ...

Gifted Today but Not Tomorrow 481

Lohman,D.F.(2005).Theroleofnonverbalabilitytestsintheiden-tificationofacademicallygiftedstudents:Anaptitudeperspec-tive.Gifted child Quarterly, 49,111–138.

Lohman,D.F.(inpress-a).Beliefsaboutdifferencesbetweenabil-ityandaccomplishment:Fromfolktheoriestocognitivescience.roeper review.

Lohman,D.F.(inpress-b). identifying academically talented minor-ity students (ResearchMonograph).Storrs,CT:TheNationalResearchCenterontheGiftedandTalented.

Lohman,D.F.,&Hagen,E.P.(2001).cognitive abilities test (form 6).Itasca,IL:Riverside.

Lohman,D.F.,&Hagen,E.P.(2002).cognitive abilities test (form 6): research handbook.Itasca,IL:Riverside.

Lubinski,D.,Webb,R.M.,Morelock,M.J.,&Benbow,C.P.(2001).Top1in10,000:A10-yearfollow-upoftheprofoundlygifted.Journal of applied Psychology, 86,718–729.

Marsh,H.W.,&Hau,K.-T.(2002).Multilevelmodelingoflongitudi-nalgrowthandchange:Substantiveeffectsorregressiontowardthemeanartifacts?Multivariate Behavioral research, 37,245–282.

Martin,D.J. (1985).the measurement of growth in educational achievement.Unpublisheddoctoraldissertation,UniversityofIowa,IowaCity.

McCall, R . B., Appelbaum, M., & Hogarty, P. S. (1973).Developmentalchangesinmentalperformance.Monographs of the Society for child development, 38(3,SerialNo.150).

McGrew,K.S.,&Woodcock,R.W.(2001).Woodcock-Johnson iii technical manual.Itasca,IL:Riverside.

Milich, R ., Roberts, M. A., Loney, J., & Caputo, J. (1980).Differentiating practice effects and statistical regression ontheConnersHyperactivityIndex.Journal of abnormal child Psychology, 8,549–552.

Mills,J.R.,&Jackson,N.E.(1990).Predictivesignificanceofearlygiftedness:Thecaseofprecociousreading.Journal of Educational Psychology, 82,410–419.

Nesselroade,J.R.,Stigler,S.M.,&Baltes,P.B.(1980).Regressiontowardthemeanandthestudyofchange.Psychological Bulletin, 88,622–637.

Page 32: Gifted Today but Not Tomorrow? Longitudinal Changes in Ability ...

Journal for the Education of the Gifted482

Otis,A.S.,&Lennon,R.T.(2003).otis-lennon School ability test, Eighth Edition.SanAntonio,TX:Harcourt.

Phillips,L.M.,Norris,S.P.,Osmond,W.C.,&Maynard,A.M.(2002).Relativereadingachievement:A longitudinal studyof 187 children from first through sixth grades. Journal of Educational Psychology, 94,3–13.

Rasch,G.(1960).Probabilistic models for some intelligence and attain-ment tests. Copenhagen, Denmark: Denmarks PædagogiskeInstitut.

Roid, G. (2003). Stanford Binet Intelligence Scale, Fifth Edition: Technical manual. Itasca, IL: Riverside.

Schafer, J. L., & Graham, J. W. (2002). Missing data: Our view of the state of the art. Psychological Methods, 7, 147–177.

Spangler, R. S., & Sabatino, D. A. (1995). Temporal stability of gifted children’s intelligence. Roeper Review, 17, 207–210.

Tannenbaum, A. J. (1965). Review of the Culture Fair Intelligence Test. The Sixth Mental Measurements Yearbook, 445, 721–722.

Terman, L. M., & Oden, M. H. (1959). The gifted at mid-life, thirty-five years follow-up of the superior child: Vol. 3. Genetic studies of genius. Stanford, CA: Stanford University Press.

Thorndike, R. L. (1933). The effect of the interval between test and retest on the constancy of the IQ. Journal of Educational Psychology, 24, 543–549.

Thorndike, R. L. (1975). Mr. Binet’s test 70 years later. Educational Researcher, 4, 3–7.

Thorndike, R. L., & Hagen, E. (1993). The Cognitive Abilities Test (Form 5). Itasca, IL: Riverside.

Tibbetts, K. A. (2004). When the test fails: The invalidity of assumptions of normative stability in above-average populations. Unpublished doctoral dissertation, University of Hawai’i, Mānoa.

Wainer, H. (1999). Is the Akebono school failing its best students? A Hawaiian adventure in regression. Educational Measurement: Issues and Practice, 18, 26–31, 35.

Willerman, L., & Fiedler, M. F. (1977). Intellectually precocious preschool children: Early development and later intellectual accomplishments. Journal of Genetic Psychology, 131, 13–20.

Wilson, R. S. (1983). The Louisville twin study: Developmental synchronies in behavior. Child Development, 54, 198–216.

End Notes

1. The more general equation for predicting regression effects when the assumption of equal variance is inappropriate can be expressed in several ways. A useful equation is $Y_p = b_{yx}(X_p - \bar{X}) + \bar{Y}$, where $Y_p$ is the predicted Y score for person p, $\bar{Y}$ is the mean Y score, $b_{yx}$ is the unstandardized coefficient for the regression of Y on X, $X_p$ is the X score for person p, and $\bar{X}$ is the mean X score.

2. Noteworthy exceptions are the dissertation by Tibbetts (2004) and Wainer’s (1999) discussion of the same data. Also see the paper by Willerman and Fiedler (1977) for an example of regression in IQ scores for gifted 4-year-olds.

3. In a recent study, Spangler and Sabatino (1995) did not observe changes in mean retest IQs for 66 gifted children in a southern Appalachian school district. However, children were excluded from the study if they “experienced remarkable sensory, physical, health-related, social, personal or family problems” (p. 208). Further, the initial test scores may have been depressed by poor educational opportunities for some of the children. The fact that the SD of WISC-R IQ scores more than doubled on retest supports this conjecture.

4. Recent improvements in statistical methods for making inferences from sparsely populated data matrices offer another avenue (see, e.g., Schafer & Graham, 2002).

5. Estimates were derived using a program called StaTable, which is available as a free download at http://www.cytel.com/statable. For bivariate normal distributions, StaTable asks for the z scores that restrict the distribution (1.8808 for the top 3%), as well as the correlation between the two measures. The proportion of scores falling in the restricted range is then given by StaTable. To determine the percentage of students falling in the top 3% upon the second measure, the proportion given by StaTable was divided by .03, the proportion falling in the top 3% at the first measure.

6. Tables illustrating the effects of averaging test scores and of using different cut scores for one test than another had to be deleted from the manuscript to save space. These are available from the authors on request.

7. Averaging or summing standard scores effectively weights each the same. Regression procedures allow estimation of more nearly optimal weights. However, the unit weights implied by summing scores generally function about as well as optimal weights on cross-validation as long as each score makes a reasonable contribution to the prediction.

8. Missing one more item thus resulted in a substantial shift in percentile rank (PR). We moved down the distribution until this was no longer a problem.

9. We used grade 9 rather than grade 4 or 6 because we were interested in predicting success over the long haul. However, using the regression weights that best predicted grade 4 or grade 6 reading would not make much difference.

10. Note that an average of two assessments is recommended rather than a policy of requiring that the student meet the cut score on two successive assessments. The latter rule, whether applied to the same assessment administered in different years or to different assessments (e.g., achievement and ability) administered in a given year, misses many capable students.

11. For a summary of research on the contribution of measures of motivation to the prediction of academic success, see the excellent literature review in Gagné and St Père (2001). However, the Gagné and St Père study itself probably underestimates the contribution of motivation because the major motivation variable was a difference score.
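
The calculations described in Notes 1 and 5 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the StaTable program the authors used: the function names, the numerical integration (a simple midpoint rule), and all example values are hypothetical. The sketch assumes standardized, bivariate normal scores with a correlation strictly between -1 and 1.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, SD 1


def predicted_score(x_p, mean_x, mean_y, b_yx):
    """Note 1: predicted Y for person p from the regression of Y on X."""
    return mean_y + b_yx * (x_p - mean_x)


def z_cut(top_fraction):
    """z score that cuts off the top `top_fraction` of a normal distribution."""
    return STD_NORMAL.inv_cdf(1.0 - top_fraction)


def proportion_retained(top_fraction, r, steps=20000):
    """Note 5: among students above the cut on the first measure, the share
    who are also above the same cut on the second measure.

    Assumes bivariate normal scores with correlation r, where -1 < r < 1.
    """
    cut = z_cut(top_fraction)
    upper = cut + 10.0                 # the tail beyond this is negligible
    width = (upper - cut) / steps
    resid_sd = sqrt(1.0 - r * r)       # SD of measure 2 given measure 1
    joint = 0.0
    for i in range(steps):
        x = cut + (i + 0.5) * width    # midpoint of the i-th slice
        # P(second score > cut | first score = x)
        p_second = 1.0 - STD_NORMAL.cdf((cut - r * x) / resid_sd)
        joint += STD_NORMAL.pdf(x) * p_second * width
    # Divide the joint proportion by the proportion above the cut on
    # the first measure, as Note 5 describes.
    return joint / top_fraction


if __name__ == "__main__":
    # Hypothetical values: both means 100, slope 0.8, first score 130.
    print(predicted_score(130, 100, 100, 0.8))
    # Hypothetical correlation of .80 between the two occasions.
    print(round(proportion_retained(0.03, 0.80), 3))
```

With the hypothetical slope of 0.8, a first score 30 points above the mean yields a predicted second score 24 points above the mean. When the correlation is 0, `proportion_retained` returns the base rate itself (.03 for a top-3% cut), which gives a quick check on the integration.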

