Date post: 24-Apr-2020
The Validity of Standardized Tests for Evaluating Curricular Interventions in Mathematics and Science

Joshua Sussman
Postdoctoral Scholar, Berkeley Evaluation and Assessment Research (BEAR) Center
University of California, Berkeley

Talk overview

• Three studies that examine the use of standardized academic tests for evaluating the impact of curricular interventions
• Analyze the validity (AERA, APA, & NCME, 2014) of the test for evaluating the intervention
• The studies lead to political and methodological solutions to an enduring problem in applied educational measurement.

Three studies: Research questions

1. How often do investigators use standardized tests to evaluate the impact of educational interventions; are the tests valid for their intended purpose?

2. How much alignment at the item level is necessary for valid evaluation?

3. What research designs can investigators use to mitigate validity problems with standardized tests as outcome measures?

About me

• The goal of my work is to advance applied measurement in schools.
• My research experience includes curriculum development projects funded by the Institute of Education Sciences (IES) and National Science Foundation (NSF). Dissertation research funded by an IES pre-doctoral fellowship in the Research in Cognition and Mathematics Education Program.
• Experience in test construction and validation (Black racial identity, sustained attention, early childhood development, non-cognitive predictors of academic success, mathematics and science).

Reasons to evaluate educational interventions using standardized tests as outcome measures

• They are reliable measures of grade-level academic proficiency, in a major subject area, for groups of students.
• They provide a “fair” measure of the impact of an academic intervention.
• Curriculum-independent and not subject to researcher biases or “training effects.”
• Schools are accountable for improving test scores

Problems with the use of standardized tests as outcome measures

Problems with the use of standardized tests as outcome measures: content mismatch

• What if the domain of the educational intervention is narrower than “mathematics?”
  • E.g., fractions
• The broad test design can be problematic.
• A longstanding consensus is that we should evaluate interventions by determining the degree to which the goals of the program are being realized in students (Baker, Chung, & Cai, 2016; Tyler, 1942).

Problems with the use of standardized tests as outcome measures: cognitive mismatch

• Standardized tests do not measure everything that is important in academic competence (Darling-Hammond et al., 2013; NRC, 2001).
• Specific issues: NRC (2004) found serious problems with the validity of standardized tests in 86 evaluations of 25 different math curricula.
• New standardized tests in mathematics do a better job of measuring modern learning goals but serious shortcomings continue to exist (Doorey & Polikoff, 2016).
• In science, existing tests are not designed to measure the modern learning goals in the Next Generation Science Standards (DeBarger, Penuel, & Harris, 2013; Wertheim et al., 2016).

Study 1: A focus on prevalence and validity of standardized tests as outcome measures

1. How often do investigators use standardized tests as key outcome measures?
2. Are the tests valid?
  • Do the goals of the intervention appear to align with the measurement target of the standardized test?
  • Do investigators establish validity evidence for the specific use of the test per recommendations in the literature (AERA, APA, & NCME, 2014)?
  • Is the validity evidence adequate?

A focus on the alignment aspects of test validity

• Evaluate the validity evidence with an emphasis on the alignment between the tests and the interventions (Bhola, Impara, & Buckendahl, 2003; Roach, Niebling, & Kurz, 2009; Porter, 2002)
• A principled way to study the match between a test and an intervention
  • Content alignment
  • Cognitive process alignment
• Well-developed investigations into the alignment between standardized tests and interventions are a relatively new area of the literature (e.g., May, Johnson, Haimson, Sattar, & Gleason, 2009)

Method

• A secondary analysis of 85 projects funded by the IES mathematics and science education program (2003–2015).
• Data sources
  a) IES database entries (study goals, description of intervention, key measures, etc.)
  b) Reports to IES received from project PIs
  c) Peer-reviewed articles associated with projects
  d) Test information on the internet

The prevalence of standardized tests as outcome measures

Analysis: Calculate the proportion of the projects that evaluated a curricular intervention using data from a standardized test

Results:
• Most projects developed and evaluated a curricular intervention (82%)
• Most intervention projects used, or planned to use, a standardized test for impact evaluation (72%)
• Thus, evaluation of new curricular interventions using standardized tests is widespread practice

The validity of standardized tests as outcome measures

Analysis: Three raters, using a validity rubric to score each project, reached consensus on the projects with misalignment between the intervention and the standardized test used as an outcome measure.

Results: The raters flagged 54% of the projects for a mismatch between the intervention and the test.
• Tests measured too much academic content
• Learning goals were difficult to measure with a typical standardized test
  • E.g., conducting scientific investigations; participating in a learning community.

The validity of standardized tests as outcome measures

Analysis: For each project flagged for validity issues, the same three raters closely examined the corpus of data for validity evidence and judged the adequacy of that evidence.

Data: Reports from PIs
• Emailed 68 unique PIs for reports and 48 responded (70.6%)
• 33 PIs provided reports
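
The response rate on this slide can be checked with a few lines of arithmetic. The sketch below is illustrative only: the helper name `percent` is hypothetical, and the second figure (reports as a share of PIs emailed) is derived here and is not reported in the talk.

```python
def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return round(100.0 * part / whole, 1)


# Counts taken from the slide.
emailed, responded, provided = 68, 48, 33

# Share of emailed PIs who responded (the slide reports 70.6%).
response_rate = percent(responded, emailed)

# Derived here, not stated in the talk: share of emailed PIs who sent a report.
report_rate = percent(provided, emailed)
```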

Reports from PIs

[Figure: 33 reports provided; 25 projects flagged; 11 projects]

The validity of standardized tests as outcome measures

• Analyzed reports and published articles

Results: Validity discussions

• Five out of the 11 did not even mention validity issues.
• Six out of the 11 contained validity discussions.

Results: Adequacy of validity evidence

• Only one established adequate validity evidence

Measurement issues uncovered during the analysis

• The standardized test did not have enough test items that tapped the content taught by the intervention.
• I learned a lesson to “be more specific about the learning outcomes I want to measure and select an assessment that will be more sensitive to measuring those outcomes.”
• One investigator could not evaluate the intervention because the standardized test did not measure the appropriate construct.
• In follow-up research, one investigator selected a subset of items from the test (i.e., the useful ones).

Summary

• Majority of projects engaged in applied research and evaluation using a standardized test
• About half of these projects were flagged as potentially problematic
• Only 6 of 11 projects established any validity evidence for the specific use of the test
• Only 1 of 11 established adequate validity evidence

Recommendations

• Cautiously interpret evaluations of new curricula that position data from standardized tests as the primary outcome measure; they may not provide accurate and useful information for data-based decision making.
• Careful item selection
• Proposals that include impact evaluation should require investigators to discuss measurement in detail

Study 2: How much alignment is enough for valid evaluation?

• In many cases, only a few items on the standardized test align with the intervention (Sussman, 2016).
• This data simulation study develops a psychometric model of the relationship between alignment and the treatment sensitivity of an evaluation, defined as the ability of an evaluation to detect the effect of an educational intervention (Lipsey, 1990; May et al., 2009).
• The practical goal is to develop a method, akin to power analysis, that helps researchers account for misalignment when they design evaluations.

Alignment between a math test and an intervention

[Figure: grid crossing academic content (Addition, Subtraction) with cognitive complexity (Single digit; Double digit; Double digit with carrying or borrowing). One region is marked “Intervention teaches this area.”]
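
One way to read the content-by-complexity grid on this slide is as a rule for counting aligned items. The sketch below is a minimal illustration under stated assumptions: the function name `alignment_percent`, the item tags, and the choice of which cell the intervention teaches are all hypothetical and are not taken from the talk.

```python
def alignment_percent(item_cells, taught_cells):
    """Percentage of test items whose (content, complexity) cell is taught."""
    if not item_cells:
        raise ValueError("the test must contain at least one item")
    aligned = sum(1 for cell in item_cells if cell in taught_cells)
    return 100.0 * aligned / len(item_cells)


# Hypothetical: the intervention teaches double-digit addition only.
taught = {("addition", "double digit")}

# Hypothetical five-item test, each item tagged with its grid cell.
items = [
    ("addition", "single digit"),
    ("addition", "double digit"),
    ("subtraction", "double digit"),
    ("addition", "double digit"),
    ("subtraction", "double digit with carrying or borrowing"),
]

# Two of the five items fall in the taught cell.
alignment = alignment_percent(items, taught)
```

Tagging items by cell keeps content alignment and cognitive alignment in the same bookkeeping, so an item counts as aligned only when both match.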

Method

• Data simulation of hypothetical evaluations with an outcome measure that is more or less aligned with an intervention
• The primary outcome is the average statistical power, calculated as a function of test alignment and intervention effect size.
  • Power to detect a true difference between experimental and control groups
• Psychometric models for data generation and for data analysis from the Rasch family of item response models (Rasch, 1960/1980; Adams, Wilson, & Wang, 1997).

Key assumptions of the simulation

• Effective treatments increase the probability that a student succeeds on a test item that is aligned
• The treatment has no impact on an item that is considered not aligned
• The control group is unaffected
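
These three assumptions can be expressed as a single item response function. The sketch below uses the standard Rasch form; treating the treatment effect as an additive shift on the ability scale is an assumption made here for illustration, and the name `p_correct` and its parameters are hypothetical rather than taken from the study.

```python
import math


def p_correct(theta, difficulty, treated, aligned, effect):
    """Rasch probability of a correct response.

    Assumption for illustration: the treatment adds `effect` to a student's
    ability, but only for treated students on aligned items.
    """
    shift = effect if (treated and aligned) else 0.0
    return 1.0 / (1.0 + math.exp(-(theta + shift - difficulty)))


# Treated student, aligned item: probability rises above the baseline.
boosted = p_correct(0.0, 0.0, treated=True, aligned=True, effect=0.5)

# Treated student, non-aligned item: no change from the baseline.
unaligned = p_correct(0.0, 0.0, treated=True, aligned=False, effect=0.5)

# Control student: unaffected regardless of alignment.
control = p_correct(0.0, 0.0, treated=False, aligned=True, effect=0.5)
```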

Simulation variables

Hold sample size consistent but vary alignment over a range of effect sizes for the intervention.
• Fixed sample size (N = 600; 300 each in experimental and control groups)
• Fixed test length (N = 50 items)
• Vary alignment (1–50 items)
• Vary effect size of the intervention (0.1–2.0 SD)
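
The design above can be sketched as a small Monte Carlo loop. This is a simplified illustration, not the study's procedure: it swaps the Rasch-family analysis model for a two-sample z-test on sum scores, draws item difficulties uniformly from [-2, 2], and applies the effect as an ability shift on aligned items. All function names, defaults, and those modeling choices are assumptions, so the output should not be expected to reproduce the figures reported in the talk.

```python
import math
import random
import statistics


def rasch_prob(theta, difficulty):
    """Rasch model: probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


def simulate_scores(n_students, difficulties, n_aligned, effect, rng):
    """Sum scores for one group; `effect` shifts ability on aligned items only."""
    scores = []
    for _ in range(n_students):
        theta = rng.gauss(0.0, 1.0)  # assumed ability distribution: N(0, 1)
        score = 0
        for j, b in enumerate(difficulties):
            shift = effect if j < n_aligned else 0.0  # first n_aligned items are aligned
            if rng.random() < rasch_prob(theta + shift, b):
                score += 1
        scores.append(score)
    return scores


def estimate_power(n_aligned, effect, n_items=50, n_per_group=300,
                   n_reps=200, seed=0):
    """Share of replications in which a two-sided z-test (alpha = .05) rejects."""
    rng = random.Random(seed)
    difficulties = [rng.uniform(-2.0, 2.0) for _ in range(n_items)]
    rejections = 0
    for _ in range(n_reps):
        treated = simulate_scores(n_per_group, difficulties, n_aligned, effect, rng)
        control = simulate_scores(n_per_group, difficulties, 0, 0.0, rng)
        se = math.sqrt(statistics.variance(treated) / n_per_group
                       + statistics.variance(control) / n_per_group)
        if se == 0.0:
            continue  # degenerate replication: no score variance, no rejection
        z = (statistics.mean(treated) - statistics.mean(control)) / se
        if abs(z) > 1.96:
            rejections += 1
    return rejections / n_reps


if __name__ == "__main__":
    # Small run: power at 30 of 50 aligned items (60%) for a 0.2 SD effect.
    print(estimate_power(n_aligned=30, effect=0.2, n_reps=50))
```

Sweeping `n_aligned` from 1 to 50 and `effect` over the range on the slide would give a power surface of the kind the study describes.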

Results

[Figure: statistical power (0.0 to 1.0) plotted against simulated alignment (0 to 100%), with one series per effect size: 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.5, 2.0.]

Conclusions

• Alignment should be no less than 60% for adequate statistical power to detect treatment effects with an effect size of 0.2 SD.
• Researchers must balance alignment against sensitivity to detect small effect sizes.
• Use of multiple measures with different levels of alignment represents an ideal scenario for developing a compelling evaluation argument (Cronbach, 1963; House, 1977; Penuel, 2016).

Study 3: One solution to the alignment problem

• Research methods that coordinate data and theory can present stronger arguments for the efficacy of an intervention.
• Empirical study that documents the effectiveness of the Learning Mathematics through Representations (LMR) lesson sequence for teaching English Learners (ELs) mathematics.
• The evaluation coordinates data from a researcher-developed test and a standardized test with theory about how the features of LMR meet the needs of ELs in the mathematics classroom.

Learning Mathematics through Representations (LMR)

• LMR is a 19-lesson number line-based curriculum unit that supports upper elementary students’ understandings of integers and fractions (Saxe, de Kirby, Le, Sitabkhan, & Kang, 2015)
• The unit supports mathematical learning through (a) the use of the number line as a central representational context, and (b) the building of mathematical definitions in classroom communities that become resources to support student argumentation and problem solving.

Method

• 571 students in 21 classrooms (4th and 5th grade) containing both ELs and English Proficient (EP) students participated in a quasi-experimental study.
• There were 95 ELs in the sample: 44 ELs in 11 LMR classrooms and 51 ELs in 10 comparison classrooms.
• Students completed a set of four (pre, interim, post-test, and follow-up) researcher-developed assessments of integers and fractions
• Students also completed the state test in mathematics in the prior year and at the end of the intervention year

The empirical results support the efficacy of LMR for students classified as ELs

• Multilevel analysis revealed that the ELs in LMR classrooms gained more in mathematics than the ELs in the matched comparison group on both an assessment of integers and fractions (p = 0.011; ES = 0.68) and a standardized assessment in mathematics (p = 0.010; ES = 0.49)
• LMR eliminated or narrowed the achievement gap between ELs and EPs
• In addition, theory supports LMR’s potential as a mathematics intervention benefitting ELs’ achievement, both narrowly (integers and fractions) and broadly (grade-level achievement)

Two resources for meeting the needs of ELs in the mathematics classroom

1. Participation in mathematical communication & argumentation (Darling-Hammond, 2007; Moschkovich, 2012; NCTM, 2000; Schoenfeld, 2002).

2. Multimodal opportunities for learning using visual and embodied representations (Bustamante & Travis, 1999; Hakuta & Santos, 2012; Moschkovich, 1999, 2002; Schleppegrell, 2007).

Providing ELs access to participating in mathematics lessons

[Figure: lesson structure with the stages Opening Problem, Opening Discussion, Partner Work, Closing Discussion, Closing Problem, labeled “Student Thinking & Problem Solving.”]

1. Participation in mathematical communication & argumentation

2. Multimodal learning (visual and embodied representations)

Providing ELs access to mathematical discussions

1. Participation in mathematical communication & argumentation

2. Multimodal learning (visual and embodied representations)

Visual resources for mathematical learning

1. Participate in mathematical communication & argumentation

2. Multimodal opportunities for learning (visual and embodied representations)

Embodied resources for mathematical learning

1. Participation in mathematical communication & argumentation

2. Multimodal opportunities for learning (visual and embodied representations)

Meeting the needs of ELs in the mathematics classroom

1. Participation in mathematical communication & argumentation

2. Multimodal opportunities for learning (visual and embodied representations)

Conclusions

• Standardized tests have a place and purpose, but they need to be well aligned to serve as outcome measures
• Alignment should be no less than 60% to detect reasonable effect sizes (0.2 SD).
• High-quality evaluations of educational interventions coordinate data and theory.

[Figure: statistical power (0.0 to 1.0) plotted against simulated alignment (0 to 100%), with one series per effect size: 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.5, 2.0.]

Plans for future research

• Measurement in special education
• Measuring progress towards Individualized Education Plan goals
• Support data-based decision making for future student eligibility, goals and services (interventions).

References

Sussman, J., & Wilson, M. The use and validity of preexisting achievement tests for evaluating new curricular interventions in science and mathematics. Under review (revise and resubmit): American Journal of Evaluation.

Sussman, J. Standardized tests as outcome measures in applied research: A psychometric simulation of the relationship between alignment and treatment sensitivity. To be submitted to Applied Measurement in Education.

Sussman, J., & Saxe, G. B. Mathematics learning in language inclusive classrooms: Supporting the achievement of English learners as well as their English proficient peers. To be submitted to American Educational Research Journal.
