Page 1

Enhancing the Technical Quality of the North Carolina Testing Program: An Overview of Current Research Studies

Nadine McBride, NCDPI
Melinda Taylor, NCDPI
Carrie Perkis, NCDPI

Page 2

Overview

• Comparability
• Consequential validity
• Other projects on the horizon

Page 3

Comparability

• Previous Accountability Conference presentations provided early results
• Research funded by an Enhanced Assessment Grant from the US Department of Education
• Focused on the following topics:
  – Translations
  – Simplified language
  – Computer-based
  – Alternative formats

Page 4

What is Comparability?

Not just “same score”:
• Same content coverage
• Same decision consistency (see the sketch below)
• Same reliability & validity
• Same other technical properties (e.g., factor structure)
• Same interpretations of test results, with the same level of confidence
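
To make one of these criteria concrete, here is a minimal sketch of estimating decision consistency between a general assessment and a test variation; the scores, sample size, and cut score are invented for illustration.

import numpy as np

# Hypothetical paired scores for the same examinees on two test variations.
rng = np.random.default_rng(0)
general = rng.normal(60, 10, 500)            # general-assessment scores
variation = general + rng.normal(0, 4, 500)  # test-variation scores

CUT = 55  # illustrative proficiency cut score

# Decision consistency: the proportion of examinees classified the same
# way (proficient vs. not proficient) by both forms.
same_decision = (general >= CUT) == (variation >= CUT)
print(f"Decision consistency: {same_decision.mean():.3f}")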

Page 5

Goal

• Develop and evaluate methods for determining the comparability of scores from test variations with scores from the general assessments

• It should be possible to draw the same inferences, with the same level of confidence, from variations of the same test.

Page 6

Research Questions

• What methods can be used to evaluate score comparability?

• What types of information are needed to evaluate score comparability?

• How do different methods compare in the types of information about comparability they provide?

Page 7

Products

• Comparability Handbook
  – Current Practice
    • State Test Variations
    • Procedures for Developing Test Variations and Evaluating Comparability
  – Literature Reviews
  – Research Reports
  – Recommendations
    • Designing Test Variations
    • Evaluating Comparability of Scores

Page 8

Results – Translations

• Replication methodology is helpful when faced with small samples and widely different proficiency distributions
  – Gauge variability due to sampling (random) error
  – Gauge variability due to distribution differences
• Multiple methods for evaluating structure are helpful
• Effect size criteria are helpful for DIF (see the sketch below)
• Congruence between structural & DIF results
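
As a rough illustration of the DIF bullet above, this sketch computes a Mantel-Haenszel effect size on the ETS delta scale and applies effect-size-only A/B/C categories; the function names and data layout are assumptions, and the studies' actual criteria (which also involve significance tests) may differ.

import numpy as np

def mh_delta(correct, group, total_score):
    # Mantel-Haenszel common odds ratio pooled over total-score strata,
    # expressed on the ETS delta scale (negative = favors reference group).
    num = den = 0.0
    for s in np.unique(total_score):
        m = total_score == s
        ref, foc = m & (group == "ref"), m & (group == "focal")
        a, b = correct[ref].sum(), (1 - correct[ref]).sum()
        c, d = correct[foc].sum(), (1 - correct[foc]).sum()
        if (a + b) == 0 or (c + d) == 0:
            continue  # skip strata with no reference or no focal examinees
        n = m.sum()
        num += a * d / n
        den += b * c / n
    return -2.35 * np.log(num / den)

def ets_category(delta):
    # Effect-size-only version of the ETS A/B/C classification.
    if abs(delta) < 1.0:
        return "A (negligible)"
    return "C (large)" if abs(delta) >= 1.5 else "B (moderate)"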

Page 9

Results – Simplified Language

• Development procedures that are carefully documented and followed, and that focus on maintaining the item construct, can support comparability arguments.

• Linking/equating approaches can be used to examine and/or establish comparability (see the sketch below).

• Comparing item statistics using the non-target group can provide information about comparability.
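
One simple example of the linking/equating idea in the second bullet is linear (mean-sigma) equating; this is a minimal sketch with invented score distributions, not necessarily the method used in the studies.

import numpy as np

def mean_sigma_equate(x, scores_x, scores_y):
    # Map a form-X score onto the form-Y scale so that equated scores
    # share form Y's mean and standard deviation.
    mu_x, sd_x = scores_x.mean(), scores_x.std(ddof=1)
    mu_y, sd_y = scores_y.mean(), scores_y.std(ddof=1)
    return sd_y / sd_x * (x - mu_x) + mu_y

# Hypothetical: X = simplified-language form, Y = general form.
rng = np.random.default_rng(1)
x_scores = rng.normal(48, 9, 400)
y_scores = rng.normal(52, 11, 400)
print(f"A score of 50 on X maps to {mean_sigma_equate(50, x_scores, y_scores):.1f} on Y")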

Page 10

Results – Computer-based

• Propensity score matching produced results similar to those from studies using within-subjects samples.

• Propensity score method provides a viable alternative to the difficult-to-implement repeated measures study.

• Propensity score method is sensitive to group differences. For instance, the method performed better when 8th and 9th grade groups were matched separately.
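
For readers unfamiliar with the technique, here is a minimal sketch of propensity score matching for a testing-mode comparison, assuming a hypothetical covariate matrix X (e.g., prior scores, demographics) and a 0/1 computer-mode indicator; it is an illustration, not the studies' actual procedure.

import numpy as np
from sklearn.linear_model import LogisticRegression

def match_on_propensity(X, mode):
    # Estimate each examinee's probability of testing on computer, then
    # pair each computer-based examinee (mode == 1) with the paper
    # examinee (mode == 0) whose propensity score is closest
    # (nearest neighbor, with replacement).
    ps = LogisticRegression(max_iter=1000).fit(X, mode).predict_proba(X)[:, 1]
    cb = np.where(mode == 1)[0]
    pp = np.where(mode == 0)[0]
    return [(i, pp[np.abs(ps[pp] - ps[i]).argmin()]) for i in cb]

Per the last bullet, running this separately within each grade (8th and 9th) would be one way to respect group differences when matching.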

Page 11

Results – Alternative Formats

• The burden of proof is much heavier for this type of test variation.

• A study based on students eligible for the general test can provide some, but not solid, evidence of comparability.

• Judgment-based studies combined with empirical studies are needed to evaluate comparability.

• More research is needed on methods for evaluating what constructs each test type is measuring.
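
One empirical tool for the construct question in the last bullet is Tucker's congruence coefficient between factor loadings estimated for each test type; the loading values below are invented for illustration.

import numpy as np

def congruence(a, b):
    # Tucker's congruence coefficient between two factor-loading vectors;
    # values near 1 suggest the two factors are essentially the same.
    return a @ b / np.sqrt((a @ a) * (b @ b))

general_loadings = np.array([0.62, 0.55, 0.70, 0.48, 0.66])
alternate_loadings = np.array([0.58, 0.49, 0.73, 0.41, 0.61])
print(f"Congruence: {congruence(general_loadings, alternate_loadings):.3f}")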

Page 12

Lessons Learned

• It takes a village…

– Cooperative effort of SBE, IT, districts and schools to implement special studies

– Researchers to conduct studies, evaluate results

– Cooperative effort of researchers and TILSA members to review study design and results

– Assessment community to provide insight and explore new ideas

Page 13

Consequential Validity

• What is consequential validity?
  – An amalgamation of evidence regarding the degree to which the use of test results has social consequences
  – Can be both positive and negative; intended and unintended

Page 14

Whose Responsibility?

• Role of the Test Developer versus the Test User?

• Responsibility and roles are not clearly defined in the literature

• State may be designated as both a test developer and a user

Page 15

Test Developer Responsibility

• Generally responsible for…
  – Intended effects
  – Likely side effects
  – Persistent unanticipated effects
  – Promoted use of scores
  – Effects of testing

Page 16

Test Users’ Responsibility

• Generally responsible for…
  – Use of scores
    • The further from the intended uses, the greater the responsibility

Page 17

Role of Peer Review

• Element 4.1
  – For each assessment, including the alternate assessment, has the state documented the issue of validity… with respect to the following categories:
    • g) Has the state ascertained whether the assessment produces intended and unintended consequences?

Page 18

Study Methodology

• Focus Groups
  – Conducted in five regions across the state
  – Led by NC State’s Urban Affairs
  – Completed in December 2009 and January 2010
  – Input from teachers and administrative staff
  – Included large, small, rural, urban, and suburban schools

Page 19

Study Methodology

• Survey Creation
  – Drafts currently modeled after surveys conducted in other states
  – However, most of those were conducted 10+ years ago
  – Surveys will be finalized after focus group results are reviewed

Page 20

Study Methodology

• Survey Administration
  – Testing Coordinators to receive survey notification
  – Survey to be available from late March through April

Page 21

Study Results

• Stay tuned!
  – Hope to make the report publicly available on the DPI testing website

Page 22

Other Research Projects

• Trying out different item types
• Item location effects
• Auditing

Page 23

Contact Information

• Nadine [email protected]

• Melinda [email protected]

• Carrie PerkisData [email protected]

