DDM Part II: Analyzing the Results
Dr. Deborah Brady
Agenda
- Overview of how to measure growth in 4 "common sense" ways
- Quick look at "standardization"
- Not all analyses are statistical or new; we'll use familiar ways of looking at student work
- Excel might help when you have a whole grade's scores, but it is not essential
- Time for your questions; exit slips
- My email: [email protected]; PowerPoint and handouts at http://tinyurl.com/k23opk6
2 Considerations for Local DDMs
1. Comparable across Schools
- Example: teachers with the same job (e.g., all 5th grade teachers)
- Where possible, measures are identical. Identical measures are easier to compare, but do they provide meaningful information about all students?
- Exceptions: when might assessments not be identical?
  - Different content (different sections of Algebra I)
  - Differences in untested skills (reading and writing on a math test for ELL students)
  - Other accommodations (fewer questions for students who need more time)
- NOTE: Roster verification and group size will be considerations for DESE
2. Comparable across the District
- Aligned to your curriculum (comparable content), K-12 in all disciplines
- Appropriate for your students; aligned to your district's content; informative and useful to teachers and administrators
- "Substantial" assessments (comparable rigor): "substantial" units with at least 2 standards and/or concepts assessed. (DESE has recently described midterms and finals as preferable.) See the Core Curriculum Objectives (CCOs) on the DESE website if you are concerned: http://www.doe.mass.edu/edeval/ddm/example/
- Quarterly assessments, benchmarks, midterms, and common end-of-year exams
- NOTE: All of this data stays in your district. Only the H/M/L rating goes to DESE, with a MEPID for each educator.
Examples of the 4+1 Methods for Calculating Growth (each is in the handout)
1. Pre-test/post-test
2. Repeated measures
3. Holistic rubric (analytical rubric)
4. Post-test only
+1. A look at "standardization" with percentiles
Typical Gradebook and Distribution (page 1 of handout)
- Alphabetical order (random), then sorted low to high
- Determine "cut scores" (validate them in the student work)
- Use the "stoplight method" to help see cut scores
- Graph the distribution of all scores, then of the High, Moderate, Low groups

Random order: 90, 76, 92, 72, 80, 98, 91, 75, 60, 52, 76, 77, 96, 61, 63, 78, 79, 95, 80, 85, 86, 84, 65
Sorted low to high: 52, 60, 61, 63, 65, 72, 75, 76, 76, 77, 78, 79, 80, 80, 84, 85, 86, 90, 91, 92, 95, 96, 98
[Chart: distribution of the whole class, all 23 scores, low to high]
[Chart: High, Moderate, Low distribution (High: 6, Moderate: 12, Low: 5)]
“Cut” Scores and “common sense”: validate them with performances.
What work is not moving at an average rate?
What work shows accelerated growth?
Some benchmarks have determined rates of growth over time
Pre/Post Test
Description: The same or similar assessments administered at the beginning and at the end of the course or year.
Example: A Grade 10 ELA writing assessment aligned to College and Career Readiness Standards, given at the beginning and end of the year.
Measuring Growth: The difference between pre- and post-test scores. Check that all students have an equal chance of demonstrating growth.
Pre/Post Tests

Pre-test (lowest to highest) | Post-test | Difference (post minus pre) | % growth (difference/pre)
20 | 35 | 15 | 75%
25 | 30 |  5 | 20%
30 | 50 | 20 | 67%
35 | 60 | 25 | 71%
35 | 60 | 25 | 71%
40 | 70 | 30 | 75%
40 | 65 | 25 | 62%
50 | 75 | 25 | 50%
50 | 80 | 30 | 60%
50 | 85 | 35 | 70%

Analysis: look at the range of growth.
Distribution of growth: low = 2, moderate = 5, high = 3.
How many L/M/H? Where is the cut score? Look at the work. Look at the distribution.
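A minimal sketch of the table's arithmetic, assuming the ten pre/post pairs above: growth is post minus pre, and % growth divides that difference by the pre-test score.

```python
# Sketch: pre/post growth and percentage growth (difference / pre).
pairs = [(20, 35), (25, 30), (30, 50), (35, 60), (35, 60),
         (40, 70), (40, 65), (50, 75), (50, 80), (50, 85)]

for pre, post in pairs:
    growth = post - pre
    pct = growth / pre * 100
    print(f"pre={pre:2d}  post={post:2d}  growth={growth:2d}  ({pct:.0f}%)")
```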
Holistic
Description: Assess growth across student work collected throughout the year.
Example: Tennessee Arts Growth Measure System
Measuring Growth: A growth rubric (see example)
Considerations: An option for multifaceted performance assessments; rating can be challenging and time consuming.
Holistic Example (unusual rubric)
Details (scored 1-4):

1 = No improvement in the level of detail. One is true:
- No new details across versions
- New details are added, but not included in future versions
- A few new details are added that are not relevant, accurate, or meaningful

2 = Modest improvement in the level of detail. One is true:
- There are a few details included across all versions
- Many added details are included, but they are not included consistently, or none are improved or elaborated upon
- There are many added details, but several are not relevant, accurate, or meaningful

3 = Considerable improvement in the level of detail. All are true:
- There are many examples of added details across all versions
- At least one detail is improved or elaborated in future versions
- Details are consistently included in future versions
- The added details reflect relevant and meaningful additions

4 = Outstanding improvement in the level of detail. All are true:
- On average, multiple details are added across every version
- There are multiple examples of details that build and elaborate on previous versions
- The added details reflect the most relevant and meaningful additions
Example taken from Austin, a first grader from Anser Charter School in Boise, Idaho. Used with permission from Expeditionary Learning. Learn more about this and other examples at http://elschools.org/student-work/butterfly-drafts
HOLISTIC: Easier for Large-Scale Assessments like MCAS
A holistic rubric puts all criteria in one cell (e.g., Topic or Conventions) and is useful when categories overlap.

Criteria (in one cell): Writing: 1) claims/evidence, 2) counterclaims, 3) organization, 4) language/style

Advanced: 1) Insightful, accurate, carefully developed claims and evidence. 2) Counterclaims are thoughtfully, accurately, and completely discussed and argued. 3) The whole essay and each paragraph are carefully organized and show interrelationships among ideas. 4) Sentence structure, vocabulary, and mechanics show control over language use.
Proficient: Adequate, effective; "gets it."
NI: Misconceptions; some errors.
At Risk: Serious errors.
MCAS Has 2 Holistic Rubrics

Topic/Development (scored 6 to 1):
6: Rich topic/idea development; careful, subtle organization; effective, rich use of language
5: Full topic/idea development; logical organization; strong details; appropriate use of language
4: Moderate topic/idea development and organization; adequate, relevant details; some variety in language
3: Rudimentary topic/idea development and/or organization; basic supporting details; simplistic language
2: Limited or weak topic/idea development, organization, and/or details; limited awareness of audience and/or task
1: Little topic/idea development, organization, and/or details; little or no awareness of audience and/or task

Conventions (scored 4 to 1):
4: Control of sentence structure, grammar, usage, and mechanics (the length and complexity of the essay provide opportunity for the student to show control of standard English conventions)
3: Errors do not interfere with communication, and/or few errors relative to the length of the essay or the complexity of sentence structure, grammar and usage, and mechanics
2: Errors interfere somewhat with communication, and/or too many errors relative to the length of the essay or the complexity of sentence structure, grammar and usage, and mechanics
1: Errors seriously interfere with communication, AND little control of sentence structure, grammar and usage, and mechanics
Pre and Post Rubric (2 Criteria) Growth: add the scores.

Pretest (Topic/Conventions) | Post-test (Topic/Conventions) | Difference | Raw gain (criteria gains added) | In rank order | % growth (difference/pre)
1/1 | 1/1 | 0/0 | 0 | 0 | 0
1/2 | 2/2 | 1/0 | 1 | 1 | 100%
1/2 | 2/3 | 1/1 | 2 | 1 | 100%
2/3 | 3/3 | 1/0 | 1 | 2 | 50%
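A minimal sketch of the two-criteria arithmetic, assuming the four students above: gains on Topic and Conventions are added into one raw growth score, which can then be rank-ordered for cut scores.

```python
# Sketch: add gains on both rubric criteria into one raw growth score.
pre_scores  = [(1, 1), (1, 2), (1, 2), (2, 3)]   # (Topic, Conventions) pre
post_scores = [(1, 1), (2, 2), (2, 3), (3, 3)]   # (Topic, Conventions) post

raw_gains = [(t1 - t0) + (c1 - c0)
             for (t0, c0), (t1, c1) in zip(pre_scores, post_scores)]

print(raw_gains)          # [0, 1, 2, 1]
print(sorted(raw_gains))  # [0, 1, 1, 2] -- rank-ordered for cut scores
```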
Rubrics do not represent percentages. A student who received a 1 would probably receive a 50. An F?
1 = 50 (F): seriously at risk
2 = 60-72, perhaps 75 (D to C-): at risk
3 = 76-88, perhaps 89 (C+ to B+): average
4 = 90-100 (A to A+): above most
Holistic Rubric or Holistic Descriptor (keeping the 1-4 scale)

Pre | Post | Difference
0 | 1 | +1
0 | 1 | +1
0 | 1 | +1
1 | 0 | -1
1 | 1 |  0
1 | 1 |  0
1 | 3 | +2
1 | 1 |  0
2 | 3 | +1

Differences in rank order: -1, 0, 0, 0, +1, +1, +1, +1, +2. Cut into low / moderate / high.
[Chart: distribution of growth scores]
Converting Rubrics to Percentages
Not recommended for classroom use because it distorts the meaning of the descriptors. It may facilitate large-scale use, however. A district decision.
Pre | Converted "grade" | Post | Converted "grade" | Difference | % growth (difference/pre)
0 |  0 | 1 | 50 |  50 |  50%
0 |  0 | 1 | 50 |  50 |  50%
0 |  0 | 1 | 50 |  50 |  50%
1 | 50 | 0 |  0 | -50 | -50%
1 | 50 | 1 | 50 |   0 |   0
1 | 50 | 1 | 50 |   0 |   0
1 | 50 | 3 | 82 |  32 |  64%
1 | 50 | 1 | 50 |   0 |   0
2 | 65 | 3 | 82 |  17 |  26%
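A minimal sketch of the conversion, assuming a hypothetical mapping (1 to 50, 2 to 65, 3 to 82, 4 to 95) consistent with the table above; a district would choose its own. Note that % growth is undefined when the converted pre-test grade is 0, one reason the analysis below questions this conversion.

```python
# Sketch: convert 0-4 rubric levels to percentage "grades" and compute growth.
# The CONVERSION mapping is a hypothetical district choice.
CONVERSION = {0: 0, 1: 50, 2: 65, 3: 82, 4: 95}

def converted_growth(pre, post):
    """Return (difference, % growth) on the converted percentage scale."""
    pre_pct, post_pct = CONVERSION[pre], CONVERSION[post]
    diff = post_pct - pre_pct
    pct = diff / pre_pct * 100 if pre_pct else float("nan")  # undefined if pre is 0
    return diff, pct

print(converted_growth(1, 3))  # (32, 64.0)
print(converted_growth(2, 3))  # (17, ~26%)
```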
Common-sense analysis:
- Was the assessment too difficult? Zeros in the pretest (3).
- Zero growth for most students; only 1 student improved.
- Change the assessment scale?
- Look at all of the grade-level assessments.
- Is the % conversion unhelpful in this case?
Repeated Measures
Description: Multiple assessments given throughout the year.
Example: running records, attendance, the mile run
Measuring Growth: Graphically; methods range from the sophisticated to the simple.
Considerations: Less pressure on each administration; authentic tasks (reading aloud, running).
Repeated Measures Example: Running Record Errors in Reading
Average errors of the high, moderate, and low error groups across six administrations:

Group           | Sept | Nov | Jan | Mar | Apr | Jun
High errors     |  65  |  48 |  30 |  15 |  15 |  13
Moderate errors |  30  |  35 |  20 |  22 |  18 |  10
Low errors      |  22  |  10 |  12 |   5 |   2 |   1
[Chart: error averages from each of the six assessments]
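A minimal sketch of handling repeated measures, assuming the group averages from the table above: it computes the whole-class average at each administration, the kind of series an "Error Chart of Averages" would plot.

```python
# Sketch: average the three error groups at each administration to get the
# series the "Error Chart of Averages" plots.
months = ["Sept", "Nov", "Jan", "Mar", "Apr", "Jun"]
errors = {
    "high":     [65, 48, 30, 15, 15, 13],
    "moderate": [30, 35, 20, 22, 18, 10],
    "low":      [22, 10, 12, 5, 2, 1],
}

for i, month in enumerate(months):
    avg = sum(group[i] for group in errors.values()) / len(errors)
    print(f"{month}: {avg:.1f} average errors")
```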
Post Test Only
AP exam: use the exam as a baseline to show growth at each score level. Note that this assessment does not have a "normal curve." An alternative for a post-test-only measure in a classroom, and a way to show student growth, is to give a mock AP exam pre and post.

[Chart: Post Test Only AP Exam Example, distribution of scores from five down to one]
Looking for Variability

[Charts: number of students with Low, Moderate, and High growth; a "Good" distribution versus a "Problematic" one]

The second graph is problematic because it doesn't give us information about the difference between average and high growth: too many students fall into the "high" growth category.
NOTE: Look at the work and make "common sense" decisions. Consider the whole grade level; one class's variation may be caused by the teacher's effectiveness.
Critical question: do all students have an equal possibility for success?
"Standardizing" Local Norms: Percentages versus Percentiles
- Percentages: within a class/course
- Percentiles: across all courses in the district
Many Assessments with Different Standards
Student A's raw scores: English 15/20, Math 22/25, Art 116/150, Social Studies 6/10, Science 70/150, Music 35/35
"Standardized" (Normal Curve)
Student A's percentiles: English 62nd, Math 72nd, Art 59th, Social Studies 71st, Science 70th, Music 61st
Percentage (out of 100%)
Student A's percentages: English 75%, Math 88%, Art 77%, Social Studies 60%, Science 46%, Music 100%
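A minimal sketch of the contrast, assuming a hypothetical set of English raw scores for Student A's class: a percentage grades the student against the test, while a percentile ranks the student against peers.

```python
# Sketch: percentage vs. percentile for Student A's English score of 15/20.
# The class scores are hypothetical, just to illustrate the difference.
def percentage(raw, total):
    return raw / total * 100

def percentile_rank(score, all_scores):
    """Percent of scores at or below this one (one common definition)."""
    return sum(1 for s in all_scores if s <= score) / len(all_scores) * 100

english_class = [8, 10, 11, 12, 13, 14, 15, 16, 17, 19]  # hypothetical, out of 20
print(percentage(15, 20))                  # 75.0 -- the percentage grade
print(percentile_rank(15, english_class))  # 70.0 -- standing among peers
```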
Standardization in Everyday Terms
Standardization is a process of putting different measures on the same scale.
For example: most cars cost $25,000, give or take $5,000; most apples cost $1.50, give or take $0.50. Getting a $5,000 discount on a car is about equal to what discount on an apple? (A $0.50 discount: each is one "give or take" unit below the typical price.)
Technical terms: "most are" = mean; "give or take" = standard deviation.
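A minimal sketch of the car/apple example as z-scores, the standard way to express both discounts in "give or take" (standard deviation) units.

```python
# Sketch: both discounts expressed as z-scores (distance from the mean in
# standard deviations), the arithmetic behind the car/apple comparison.
def z_score(value, mean, sd):
    return (value - mean) / sd

print(z_score(25000 - 5000, 25000, 5000))  # -1.0  (car, $5,000 off)
print(z_score(1.50 - 0.50, 1.50, 0.50))    # -1.0  (apple, $0.50 off)
```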
Percentile/Standard Deviation: Excel Functions
- Sort high to low or low to high; graphing functions; statistical functions, including percentiles and standard deviation
- Student grades can be sorted from highest to lowest score with one command
- A table of student scores can be easily graphed with one command
- Excel will also calculate percentages easily, but this is probably not necessary
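For readers without Excel handy, a minimal Python equivalent using only the standard library, assuming the same 23 gradebook scores used earlier.

```python
# Sketch: the Excel operations above, done with Python's standard library.
import statistics

scores = [90, 76, 92, 72, 80, 98, 91, 75, 60, 52, 76, 77,
          96, 61, 63, 78, 79, 95, 80, 85, 86, 84, 65]

print(sorted(scores))                     # one-command sort
print(statistics.mean(scores))            # class mean
print(statistics.stdev(scores))           # sample standard deviation
print(statistics.quantiles(scores, n=4))  # quartile cut points
```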
"Common Sense"
- The purpose of DDMs is to assess teacher impact.
- The student scores and the Low, Moderate, and High growth rankings stay totally internal. DESE (in two years) will see only MEPIDs, with L, M, or H next to each MEPID.
- The important part of this process needs to be the focus: your discussions about student learning with colleagues and with your evaluator, as an ongoing process.