Traditional Assessment Info.:
Norm-Referenced Tests: Testing in which scores are compared with the average performance of others (class, school, district, state, national). Scores reflect general knowledge rather than mastery of specific skills. They are useful when trying to measure overall achievement and/or selecting a few top candidates (e.g., only the top 5% on the ACT accepted to Harvard).
Examples: standardized tests such as the ACT, SAT, Stanford-Binet, NTE (Praxis I).
Limitations: they don't tell you whether students are ready to move on to more advanced material (e.g., a score in the top 5% of the class on a trigonometry test doesn't tell you whether the student is ready to move on to calculus; everyone in the class may have limited understanding of trigonometry). Less appropriate for measuring affective and psychomotor objectives. Tend to encourage competition and score comparison.
Criterion-Referenced Tests: Testing in which scores are compared to a given criterion or set performance standard. Used to measure mastery of specific objectives. Can be used to "ability group" students. A standard for mastery (e.g., 17 out of 20) must be set.
Examples: driver's license exam, scuba diving, sailing, orals.
Limitations: absolute standards are often difficult to set in some areas. Standards tend to be arbitrary.
Standardized tests: Tests given, usually nationwide, under uniform conditions and scored according to uniform procedures (i.e., standard methods of administration, scoring, and reporting).
Test items and instructions have been tried out; final versions have been administered to a norming sample (comparison group).
Examples: ACT, DAT, NTE (Praxis I), high-stakes tests, etc.
Types of standardized tests-
Achievement test: measures how much students have learned in a given content area. Examples: TerraNova (for basic skills), Iowa test, WRAT, NTE (?)
Diagnostic test: individually administered tests used to identify specific learning problems. Examples: hearing test, vision test, etc.
Aptitude test: tests meant to predict future performance. Examples: PSAT, SAT, ACT, GRE, NTE (?), WISC-III
What do standardized test scores mean? Interpreting your students' standardized test scores.
Frequency Distribution: Record showing number of people who obtained each score or a range of scores.
For example: 11 students took a spelling test. The scores were: 55, 65, 75, 75, 75, 80, 85, 95, 95, 95, 100.
A frequency distribution (depicted as a histogram) could look like this:
[Histogram: scores on the test (50 to 100, in steps of 5) on the horizontal axis; number of students making each score (1 to 3) on the vertical axis.]
Central tendency: the typical score for a group of scores. There are three basic methods for determining central tendency. The mean (arithmetic average) is the most widely used measure of central tendency.
Mean = the sum of all scores divided by the total number of scores.
X̄ = ΣX / N
So in the case of our last example, the mean on the spelling test would be:
55 + 65 + 75 + 75 + 75 + 80 + 85 + 95 + 95 + 95 + 100 = 895
X̄ = 895/11 = 81.36
*Note: X̄ is the symbolic notation for the mean, and Σ is the symbol for sum.
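The mean calculation above can be sketched in a few lines of Python (illustrative only; the scores are the spelling-test example from the slides):

```python
# Spelling-test scores from the example above
scores = [55, 65, 75, 75, 75, 80, 85, 95, 95, 95, 100]

# Mean = sum of all scores divided by the number of scores (N)
mean = sum(scores) / len(scores)
print(round(mean, 2))  # 81.36
```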
Central tendency continued…
Another measure of central tendency is the median. Median = the middle score in the distribution.
So for our spelling test scores of: 55, 65, 75, 75, 75, 80, 85, 95, 95, 95, 100.
The median= 80, with 5 scores above and 5 scores below it.
The third measure of central tendency is the mode.
Mode= the most frequently occurring score.
In the case of our example, both 75 and 95 occur three times. Hence, in this example there are two modes: 75 and 95. We call this type of distribution bimodal.
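All three measures of central tendency for the example can be checked with Python's standard `statistics` module (a sketch; `multimode` returns every most-frequent value, which is how the bimodal case shows up):

```python
import statistics

scores = [55, 65, 75, 75, 75, 80, 85, 95, 95, 95, 100]

median = statistics.median(scores)    # middle score of the sorted list
modes = statistics.multimode(scores)  # all most-frequently occurring scores

print(median)  # 80
print(modes)   # [75, 95] -> two modes, i.e. a bimodal distribution
```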
While measures of central tendency give you a number that is representative of the group of scores, they do not tell you anything about how the scores are distributed.
For example look at the following 2 sets of scores for a spelling test:
Class 1: 50, 45, 55, 55, 45, 50, 50
Class 2: 100, 0, 50, 90, 10, 50, 50
In both cases the mean, median, and mode are 50, but the distributions are very different.
Standard deviation gives us a way of measuring how widely scoresvary from the mean.
Standard deviation: a measure of how widely scores vary from the mean. The larger the standard deviation, the more spread out the scores. (Note: standard deviation is abbreviated as SD or σ.)
The smaller the standard deviation, the more the scores cluster around the mean. Hence, distributions with small standard deviations have less variability in the individual scores.
So in our last example:
Class 1: 50, 45, 55, 55, 45, 50, 50
Class 2: 100, 0, 50, 90, 10, 50, 50
Class 1 has a smaller SD than Class 2.
Calculating standard deviation:
1. Calculate the mean (X̄) of the scores.
2. Subtract the mean from each individual score. Written as (X − X̄).
3. Square each difference (multiply it by itself) so the differences don't sum to zero. Written as (X − X̄)².
4. Add all the squared differences. Written as Σ(X − X̄)².
5. Divide this total by the total number of scores. Written as Σ(X − X̄)² / N.
6. Find the square root. This gives the formula for SD:
SD = √( Σ(X − X̄)² / N )
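The six steps above translate directly into code. A minimal Python sketch, applied to the two example classes (this is the population SD with N in the denominator, matching the formula above):

```python
import math

def standard_deviation(scores):
    """Population SD, following the six steps above."""
    n = len(scores)
    mean = sum(scores) / n                              # step 1
    squared_diffs = [(x - mean) ** 2 for x in scores]   # steps 2-3
    return math.sqrt(sum(squared_diffs) / n)            # steps 4-6

class_1 = [50, 45, 55, 55, 45, 50, 50]
class_2 = [100, 0, 50, 90, 10, 50, 50]

print(round(standard_deviation(class_1), 2))  # 3.78
print(round(standard_deviation(class_2), 2))  # 34.23
```

Both classes have a mean of 50, but Class 2's much larger SD reflects its far more spread-out scores.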
Why standard deviation is useful: it tells you about variability. Variability = degree of difference or deviation from the mean. It gives you a better picture of the meaning of an individual score.
For example, suppose you got a score of 90 on a history exam. You would be very pleased if the mean of the test was 75 and the standard deviation was 5. In that case your score would be three standard deviations above the mean!
But now consider the difference if the mean of the test remained at 75, but the standard deviation was 15. Now your score of 90 is only 1 standard deviation above the mean. Your score would now be considered only slightly above average.
Knowing the standard deviation tells you more than just knowing the range (distance between highest and lowest scores).
Standard deviations are especially helpful when dealing with tests that have a normal distribution.
Normal distribution = the most commonly occurring distribution, in which scores are distributed symmetrically about the mean. Also known as the BELL CURVE. In the normal distribution the mean, median, and mode all fall at the same point. The majority of scores fall in the middle, giving the curve its puffy appearance.
[Figure: the normal (bell) curve. Vertical axis: number of cases; horizontal axis: standard deviations from the mean (−4 to +4), with the mean test score at the center. Approximate percentage of scores in each band from −3SD to +3SD: 2%, 14%, 34%, 34%, 14%, 2%. Corresponding percentile ranks along the curve: .1, 1, 10, 30, 50, 70, 90, 99, 99.9.]
Percentile Ranks: A convenient property of the normal distribution is that the percentage of scores falling within each area of the curve is known. Hence, by looking at the curve you can see that 68% of all scores are located within 1 SD of the mean, while only 32% fall more than 1 SD above or below the mean.
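These percentages can be verified from the normal curve itself using Python's standard-library `statistics.NormalDist` (a sketch; any mean/SD works since the areas depend only on distance in SDs):

```python
from statistics import NormalDist

nd = NormalDist(mu=0, sigma=1)  # the standard normal curve

# Fraction of scores within 1 SD of the mean: cdf(+1) - cdf(-1)
within_1sd = nd.cdf(1) - nd.cdf(-1)
print(round(within_1sd * 100))  # 68 (%)

# Percentile rank of a score exactly 1 SD above the mean
print(round(nd.cdf(1) * 100))  # 84
```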
Percentile rank score: the percentage of those in the norming sample who scored at or below an individual's score.
[Figure: raw scores (75, 79, 83, 87, 95, 99) plotted against percentile ranks (50, 60, 90, 99), marking Joan's and Alice's scores on the math and language subtests.]
Cautions:
• Refers to the percentage of people in the norming sample who scored at or below a particular score, not the number of items the student answered correctly.
• Differences in percentile ranks do not mean the same thing in terms of raw scores at the middle of the scale as they do at the tails. It takes a greater difference in raw scores to make a difference in percentile rank at the extreme ends.
In the figure, a 4-point raw-score difference in the middle equals a 10-point difference in percentile rank, but an 8-point raw-score difference yields only a 9-point percentile-rank difference at the tail.
Grade equivalent score: a measure of grade level based on comparison with the norming sample at each grade. The average of the scores of the norming sample at each grade level determines the grade equivalent score.
Generally listed as numbers such as 2.4, 4.5, 8.6, 10.8
Limitations:
• Different forms of the test are often used at different grade levels.
• A high score indicates superior mastery of material at grade level, not necessarily capacity for doing advanced work.
• Often misleading; in general, grade equivalent scores should not be used.
Standard Scores: Scores based on the SD (standard deviation)
The most commonly used forms of standard scores are the Z score, T score, and Stanine score.
Z score: a standard score indicating the number of SDs above or below the mean for a specific raw score.
Z = (X − X̄) / SD = (X − X̄) / √( Σ(X − X̄)² / N )
Because the Z score can yield negative numbers, which are inconvenient, we often convert it to a T score.
T score: a standard score with a mean of 50 and an SD of 10.
To convert a Z score to a T score, simply multiply the Z score by 10 and add 50: T = 10Z + 50.
The SAT uses a similar procedure but sets the mean = 500 and the SD = 100.
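The conversions above can be sketched in Python, using the history-exam example (score 90, mean 75, SD 15) from earlier in the notes:

```python
def z_score(raw, mean, sd):
    """Number of SDs the raw score is above (+) or below (-) the mean."""
    return (raw - mean) / sd

def t_score(z):
    """T score: mean 50, SD 10."""
    return 10 * z + 50

def sat_score(z):
    """SAT-style scale: mean 500, SD 100."""
    return 100 * z + 500

z = z_score(90, mean=75, sd=15)       # the history-exam example
print(z, t_score(z), sat_score(z))    # 1.0 60.0 600.0
```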
Stanine score (Standard Nine): whole-number scores from 1 to 9, each representing a wide range of raw scores.
On the stanine scale the mean (X̄) = 5 and SD = 2. Each unit from 2 to 8 covers 0.5 SD.
SD        −3SD  −2SD  −1SD    0   +1SD  +2SD  +3SD
Z score    −3    −2    −1     0    +1    +2    +3
T score    20    30    40    50    60    70    80
SAT       200   300   400   500   600   700   800
Stanine    1   2   3   4   5   6   7   8   9
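Since a stanine has mean 5 and SD 2, a Z score maps onto it as roughly 2Z + 5, clamped to the 1–9 range. A sketch (this is the common approximation; published tests actually assign stanines from fixed percentage bands):

```python
def stanine(z):
    """Approximate stanine from a Z score: mean 5, SD 2, clamped to 1-9."""
    return max(1, min(9, round(2 * z + 5)))

print(stanine(0))     # 5 (score at the mean)
print(stanine(1.5))   # 8
print(stanine(-3))    # 1 (clamped at the bottom of the scale)
```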
Recall from our discussion of Learner Differences in chapter 4 that we said any test must be evaluated in terms of three criteria:
Validity: the degree to which a test measures what it is intended to measure.
Reliability: the consistency of test results.
Bias: when the content of a test discriminates against a group of students on the basis of gender, SES, race, ethnicity, etc.
In addition, with standardized tests keep in mind:
• Tests are imperfect estimators of the qualities/skills they seek to measure. There are errors in every testing situation (sometimes in the test taker's favor, sometimes not).
Ideally, standardized tests would report a student's:
True score: the hypothetical average of all of an individual's scores if repeated testing under ideal conditions were possible.
In reality, students take most standardized tests only once. Test makers take this into account in a measure called the:
Standard error of measurement (SEM): a hypothetical estimate of the variation in scores if testing were repeated.
Never base an opinion about a student's ability or achievement on the exact score a student achieves. Instead, use the:
Confidence interval (standard error band): the range of scores within which an individual's true score is likely to fall. Determined by test makers using the standard error of measurement.
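A confidence interval built from the SEM can be sketched like this (illustrative; the SEM value itself comes from the test publisher, and the score of 110 with SEM 4 is a made-up example):

```python
def confidence_interval(observed, sem, band=1):
    """Range in which the true score likely falls.
    band=1 spans +/-1 SEM (~68% confidence); band=2 spans +/-2 SEM (~95%)."""
    return (observed - band * sem, observed + band * sem)

# Hypothetical: observed score 110, published SEM of 4
print(confidence_interval(110, 4))      # (106, 114)
print(confidence_interval(110, 4, 2))   # (102, 118)
```

The practical point: report "likely between 106 and 114," never "exactly 110."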
Innovative Assessment Info.:
Portfolio:
"A record of learning that focuses on the student's work and his/her reflection on that work. Material is collected through collaborative effort between student and staff members and is indicative of progress towards essential outcomes." -NEA
The process (in a nutshell):
• Collection
• Selection
• Reflection
• Projection
(from Brayko & Laraby (2004), Portfolio assessment, Oshkosh Writer's Conference)
Guidelines for creating a portfolio (from Woolfolk, p. 559):
1. Students should be involved in selecting the pieces. During a unit, students select work that fits certain criteria (e.g., most improved, best work, three approaches to…, etc.).
2. The portfolio should include information that shows student self-reflection and self-criticism. Ask students to include a rationale for their selections. Have students include peer critiques.
3. The portfolio should reflect the students' activities in learning. Include a representative selection of projects, writings, drawings, etc. Ask students to relate learning goals to the contents of the portfolio.
Guidelines for creating a portfolio (from Woolfolk, p. 559), continued:
4. The portfolio can serve different functions at different times of the year. Early in the year it might hold unfinished or problem pieces; at the end of the year it should contain only what the student is willing to make public.
5. Portfolios should show growth. Ask students to create a history of their progress/learning along certain dimensions and to illustrate this growth through specific works.
6. Teach students how to create and use portfolios. Provide models of well-done portfolios, but stress that each portfolio is an individual statement. Examine your students' portfolios on a regular basis and give constructive feedback.
Process Portfolio vs. Best Works Portfolio (Woolfolk, p. 558)

Process Portfolio (individual student/team):
Science: documentation (records/logs) of use of the scientific method to solve lab problems
Mathematics: documentation of mathematical reasoning (e.g., double-column problem solving: computation on the left, commentary on the right)
Language Arts: evolution of compositions from notes to draft to final version

Best Works Portfolio (individual student/team):
Language Arts: best compositions in a variety of styles
Social Studies: best historical research paper, essay, etc.
Fine Arts: best creative products: drawings, paintings, performances, etc.
Evaluating portfolios: developing a rubric (Woolfolk, p. 561)
• Look at models
• List criteria
• Articulate gradations of quality
• Practice on models
• Use self and peer assessment
• Revise
• Use teacher assessment
• Look at websites
Aligning different assessment tools with their targets (Woolfolk, p. 563):
Again, recall from our discussion of Learner Differences in chapter 4 that we said any test/measure must be evaluated in terms of three criteria:
Validity: the degree to which a test measures what it is intended to measure.
Reliability: the consistency of test results.
Bias: when the content of a test discriminates against a group of students on the basis of gender, SES, race, ethnicity, etc.
This applies as much to innovative/alternative assessment as it does to traditional assessment. Right?