1 Copyright © 2014. Pearson, Inc., or its affiliates. All rights reserved.
CELF-5 Reliability and Validity
Adam Scheller, Ph.D.
Pearson Clinical Assessment
Understanding CELF-5 Reliability &
Validity to Improve Diagnostic Decisions
Adam Scheller, Ph.D. Senior Educational Consultant
Pearson
Copyright 2013. Pearson Education and its Affiliates. All rights reserved.
Disclosures
• Dr. Scheller is an employee of Pearson, publisher of the CELF-5. No other language assessments will be discussed in this presentation.
Agenda
1. Introduction
   1. Disclosure
   2. Agenda
   3. Learning Objectives
2. Research Overview
   1. Standardization
   2. Reliability
   3. Validity
3. Summary/Questions
Learning Outcomes
1. Name at least one study conducted to evaluate the reliability of the CELF-5.
2. Describe two procedures used to evaluate test score and index score differences to determine if the differences are significant.
3. Describe the average difference of CELF-4 and CELF-5 scores in a study conducted with typically developing students.
4. Identify the sensitivity/specificity on a chart using the cut score used in the clinicians’ place of employment.
5. Name the optimal cut score on CELF-5 that provides the best balance between the sensitivity and specificity measures.
Research Overview
• Multiple research phases
• Over 4000 students tested in standardization and related reliability and validity studies
• Students tested from March through December 2012
• Over 450 SLPs across the U.S. participated in standardization testing
Multiple Bias Studies
• Multiple phases of objective and subjective reviews of administration directions, cues, test items, and test formats
– Assessment/bias experts examined test items for potential bias related to
• Socioeconomic status
• Race/Ethnicity
• Gender
• Culture
• Region
– Clinicians in the field provided feedback about students’ responses and engagement in test tasks
– Statistical analysis of bias verified or refuted subjective bias concerns
A Diverse STDZ Sample: Race/Ethnicity
Race/Ethnicity N %
Asian 87 3.7%
Black 328 13.8%
Hispanic 476 20.0%
Other 137 5.8%
White 1352 56.8%
Total Sample 2380 100%
A Diverse STDZ Sample: Parent Education Level
Parent Education Level N %
Less than 11 years 292 12.3%
H.S. Diploma or GED 544 22.9%
1-3 Years College or Technical School 817 34.3%
4 or more Years of College 727 30.6%
Total Sample 2380 100%
A Diverse STDZ Sample: Region
Region of the US N %
Midwest 567 23.8%
Northeast 363 15.3%
South 873 36.7%
West 577 24.3%
Total Sample 2380 100%
Other Demographic Variables
Gender N %
Female 1190 50%
Male 1190 50%
Total Sample 2380 100%
Raw Score Means and SDs
Raw Score Means and SDs (cont.)
Reliability
Reliability 101
• How confident are you in the accuracy of a test score?
• Reliability = accuracy, consistency and stability of test scores across situations
• Observed Score = True Score + Error
– Errors are systematic and random
Internal Consistency & Test-retest
• Internal consistency: estimates how consistently the items of the test measure one construct (homogeneity)
– Split-half method (Spearman-Brown correction): correlation between the total scores of two half-tests
• Test-retest stability: correlation between test and retest scores.
– Time interval between the test and retest is as short as possible.
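The split-half method can be sketched in a few lines. This is a minimal illustration, not the procedure used in the CELF-5 studies: the odd/even item split, the function names, and the small matrix of item scores are all assumptions made for the example.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman_brown(r_half):
    """Step the half-test correlation up to an estimate for the full-length test."""
    return 2 * r_half / (1 + r_half)


def split_half_reliability(item_scores):
    """item_scores holds one list of item scores per examinee.

    Assumption: the two half-tests are the odd- and even-numbered items.
    """
    odd_totals = [sum(row[0::2]) for row in item_scores]
    even_totals = [sum(row[1::2]) for row in item_scores]
    return spearman_brown(pearson_r(odd_totals, even_totals))


# Hypothetical item responses for five examinees (not CELF-5 data).
responses = [
    [1, 1, 1, 1, 1, 0],
    [1, 1, 0, 1, 0, 0],
    [1, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 0, 0],
]
print(round(split_half_reliability(responses), 2))
```

The same `pearson_r` helper applied to test and retest scores gives a test-retest stability coefficient.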
Test Reliability Coefficients
CELF-5 Test  Average Reliability Coefficient (across target ages)
Sentence Comprehension .87 Good
Linguistic Concepts .91 Excellent
Word Structure .89 Good
Word Classes .90 Excellent
Following Directions .91 Excellent
Formulated Sentences .86 Good
Recalling Sentences .94 Excellent
Understanding Spoken Paragraphs .85 Good
Word Definitions .89 Good
Sentence Assembly .93 Excellent
Semantic Relationships .89 Good
Pragmatics Profile .98 Excellent
Reading Comprehension .87 Good
Structured Writing .75 Acceptable
Index Score Reliability Coefficients
CELF-5 Index Score  Average Reliability Coefficient (across target ages)
Core Language Score .96 Excellent
Receptive Language Index .95 Excellent
Expressive Language Index .95 Excellent
Language Content Index .95 Excellent
Language Structure Index .96 Excellent
Language Memory Index .95 Excellent
Reliabilities for Clinical Groups
Test  Language Disorder (n=166)  Learning Disability, Reading & Writing (n=69)  Autism Spectrum Disorder (n=66)  Average rxx
Sentence Comprehension  .94 Excellent  --  .96 Excellent  .95 Excellent
Linguistic Concepts  .96 Excellent  --  .98 Excellent  .97 Excellent
Word Structure  .93 Excellent  --  .94 Excellent  .94 Excellent
Word Classes  .96 Excellent  .92 Excellent  .97 Excellent  .95 Excellent
Following Directions  .96 Excellent  .90 Excellent  .98 Excellent  .96 Excellent
Formulated Sentences  .97 Excellent  .89 Good  .96 Excellent  .95 Excellent
Recalling Sentences  .98 Excellent  .92 Excellent  .97 Excellent  .96 Excellent
Reliabilities for Clinical Groups (cont.)
Test  Language Disorder (n=166)  Learning Disability, Reading & Writing (n=69)  Autism Spectrum Disorder (n=66)  Average rxx
Understanding Spoken Paragraphs  .81 Good  .75 Acceptable  .91 Excellent  .84 Good
Word Definitions  .87 Good  .91 Excellent  .95 Excellent  .92 Excellent
Sentence Assembly  .92 Excellent  .94 Excellent  .97 Excellent  .95 Excellent
Semantic Relationships  .88 Good  .89 Good  .96 Excellent  .92 Excellent
Pragmatics Profile  .99 Excellent  .99 Excellent  .99 Excellent  .99 Excellent
Reading Comprehension  .93 Excellent  .86 Good  .93 Excellent  .91 Excellent
Standard Error of Measurement (SEM) & Confidence Interval
Standard Error of Measurement (SEM) & Confidence Interval (cont.)
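As a rough sketch of how an SEM and a confidence interval follow from a reliability coefficient, the standard formulas SEM = SD × √(1 − rxx) and score ± z × SEM are shown below. The SD of 15, the observed score of 85, and the function names are assumptions for illustration; published confidence intervals in a test manual may be computed differently (for example, centered on an estimated true score).

```python
from math import sqrt
from statistics import NormalDist


def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - r_xx)."""
    return sd * sqrt(1 - reliability)


def confidence_interval(observed_score, sem, level=0.90):
    """Symmetric interval around the observed score: score +/- z * SEM."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return observed_score - z * sem, observed_score + z * sem


# Assumption: a standard-score metric with SD = 15. The .96 is the average
# Core Language Score reliability reported in this presentation.
sem = standard_error_of_measurement(sd=15, reliability=0.96)
low, high = confidence_interval(observed_score=85, sem=sem, level=0.90)
print(f"SEM = {sem:.1f}; 90% CI = {low:.1f} to {high:.1f}")
```

Under these assumptions the SEM is about 3 standard-score points; a lower reliability coefficient widens the interval.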
Test-retest Stability
Test-retest stability (n = 137)
Test-retest interval: 7-46 days
Mean interval: 19 days
Inter-scorer Agreement
Inter-scorer agreement Reliability
Word Structure .99
Formulated Sentences .95
Word Definitions .91
Structured Writing .96
Score Differences
• Interpretation of performance:
1. Examine if difference is statistically significant
• Reflection of Standard Error
2. Examine if difference is clinically significant (rare)
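The two steps above can be sketched as follows. This is a hedged illustration, not the CELF-5 manual's procedure: it assumes the common critical-value formula z × √(SEM_a² + SEM_b²) for statistical significance and treats "rare" as a low base rate in a reference sample. The SD of 15, the function names, and the list of reference differences are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def critical_difference(sd, rel_a, rel_b, alpha=0.05):
    """Smallest difference between two scores that is significant at alpha (two-tailed)."""
    sem_a = sd * sqrt(1 - rel_a)
    sem_b = sd * sqrt(1 - rel_b)
    se_diff = sqrt(sem_a ** 2 + sem_b ** 2)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z * se_diff


def base_rate(observed_diff, sample_diffs):
    """Proportion of a reference sample whose absolute difference is at least as large."""
    n_at_least = sum(1 for d in sample_diffs if abs(d) >= abs(observed_diff))
    return n_at_least / len(sample_diffs)


# Step 1: two index scores with reliabilities of .95 each (values from this presentation).
print(round(critical_difference(sd=15, rel_a=0.95, rel_b=0.95), 1))

# Step 2: hypothetical score differences from a reference sample (invented data).
reference_diffs = [-12, -8, -5, -3, -1, 0, 2, 4, 6, 9,
                   11, 15, -2, 1, 3, -4, 7, -6, 5, -10]
print(base_rate(10, reference_diffs))
```

A difference can pass the first check and still be common in the reference sample, which is why both steps are examined.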
Validity
Validity 101
• How well can your test results predict the presence of a disorder (or, predict one’s skill)?
• Validity is demonstrated through evidence supporting a test’s interpretations and uses.
– Validity Evidence: the degree to which specific data, research, or theory supports that:
1. A test measures the concepts it’s supposed to measure
2. The test is applicable to its intended population
Evidence Based on Test Content
• Validity evidence related to test content
– Content Relevance: when the content areas being measured are accepted as relating to the proposed construct
– Content Coverage: when the content areas measured by the test are accepted to be an adequate sampling of these areas (Also developmentally appropriate)
• Validity evidence includes: literature review; users’ feedback; and expert review
• CELF-5 designed to measure…
Intercorrelation Studies
Relationship with Other Variables (CELF-4)
CELF-5 – CELF-4 (cont.)
Language Disorder Study
LD Study (cont.)
Sensitivity and Specificity 101
Sensitivity: The probability that someone who has the “condition” will test positive for it.
Specificity: The probability that someone who does not have the “condition” will test negative.
Errors
False Positive: A student who is falsely identified as having a condition or disorder.
False Negative: A student with a condition or disorder who is not correctly identified by a test (the most serious error).
CELF-5 Diagnostic Accuracy
Hypothetical using 1.5 SD (77) cutoff with CELF-5: 10,000 Students, 10% Prevalence
Test Result  Language Disorder (.85)  No Language Disorder (.99)  Total
Positive Test Results  850  90  940
Negative Test Results  150  8,910  9,060
Total  1,000  9,000  10,000
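The counts in this hypothetical follow directly from the three inputs: 10,000 students, 10% prevalence, and the (.85) and (.99) column values read as sensitivity and specificity. A minimal sketch of that arithmetic is below; the function and variable names are assumptions for illustration.

```python
def expected_counts(n, prevalence, sensitivity, specificity):
    """Expected 2x2 cell counts for n students at the given rates."""
    with_condition = round(n * prevalence)
    without_condition = n - with_condition
    true_pos = round(with_condition * sensitivity)
    false_neg = with_condition - true_pos
    true_neg = round(without_condition * specificity)
    false_pos = without_condition - true_neg
    return {"tp": true_pos, "fp": false_pos, "fn": false_neg, "tn": true_neg}


counts = expected_counts(n=10_000, prevalence=0.10, sensitivity=0.85, specificity=0.99)
print(counts)

# Share of positive test results that belong to students with a language disorder,
# and share of negative results that belong to students without one.
share_of_positives_correct = counts["tp"] / (counts["tp"] + counts["fp"])
share_of_negatives_correct = counts["tn"] / (counts["tn"] + counts["fn"])
print(round(share_of_positives_correct, 3), round(share_of_negatives_correct, 3))
```

Changing `prevalence` while holding the other two inputs fixed shows how the mix of correct and incorrect positive results depends on how common the disorder is in the group tested.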
Another Example: October is Breast Cancer Awareness Month!! (These numbers are approximate and based on a 12% prevalence rate)
http://www.memorialbreastcenter.org/how-accurate-are-mammograms/
Mammogram Result  Breast Cancer  No Breast Cancer  Total
Positive Mammogram Results  100  62 (≈7%)  162
Negative Mammogram Results  20 (≈17%)  818  838
Total  120 (≈12%)  880  1,000
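Going the other direction, sensitivity and specificity can be recovered from the cell counts in the mammogram table. The sketch below uses only those counts; the function names are assumptions for illustration.

```python
def sensitivity(true_pos, false_neg):
    """Proportion of people with the condition who test positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Proportion of people without the condition who test negative."""
    return true_neg / (true_neg + false_pos)


# Cell counts from the mammogram table.
tp, fp, fn, tn = 100, 62, 20, 818

sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")

# The percentages shown in the table are the two error rates:
# false negatives among those with cancer, false positives among those without.
print(f"false negative rate = {1 - sens:.0%}, false positive rate = {1 - spec:.0%}")
```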
Learning Disorder (Reading and/or Writing)
Autism Spectrum Disorder
Summary/Questions?
• Reliability is a reflection of measurement error
• Validity helps us determine how we apply test interpretations and make predictions
• Choosing a cutoff score has implications for who is and is not included in a group.
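To illustrate the last point, the sketch below shows how moving a cut score trades sensitivity against specificity. All scores here are invented for illustration and are not CELF-5 data; the rule that a score at or below the cut counts as a positive result is also an assumption.

```python
def rates_at_cut(cut, scores_with, scores_without):
    """Return (sensitivity, specificity) when scores <= cut count as positive."""
    true_pos = sum(s <= cut for s in scores_with)
    true_neg = sum(s > cut for s in scores_without)
    return true_pos / len(scores_with), true_neg / len(scores_without)


# Hypothetical standard scores, invented for illustration only.
with_disorder = [62, 68, 70, 73, 75, 77, 79, 82, 86, 91]
without_disorder = [78, 84, 88, 92, 95, 99, 102, 106, 110, 118]

for cut in (70, 77, 80, 85):
    sens, spec = rates_at_cut(cut, with_disorder, without_disorder)
    print(f"cut {cut}: sensitivity {sens:.2f}, specificity {spec:.2f}")
```

In this made-up sample, raising the cut score identifies more students with the disorder but also flags more students without it.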
CELF5Family.PearsonClinical.com
PEARSON Customer Service
1-800-627-7271 (USA) PearsonClinical.com
1-866-335-8418 (Canada) PearsonAssess.ca
Facebook.com/SpeechandLanguage
Twitter.com/SpeechnLanguage
pinterest.com/speechandlang/the-new-celf-5/