Educational Research
Chapter 5: Selecting Measuring Instruments
Gay, Mills, and Airasian
Topics Discussed in this Chapter
Data collection
Measuring instruments
  Terminology
  Interpreting data
  Types of instruments
Technical issues
  Validity
  Reliability
Selection of a test
Data Collection
Scientific inquiry requires the collection, analysis, and interpretation of data
Data – the pieces of information that are collected to examine the research topic
Issues related to the collection of this information are the focus of this chapter
Data Collection
Terminology related to data
Constructs – abstractions that cannot be observed directly but are helpful when trying to explain behavior
  Intelligence
  Teacher effectiveness
  Self-concept
Obj. 1.1 & 1.2
Data Collection
Data terminology (continued)
Operational definition – the ways by which constructs are observed and measured
  Wechsler IQ test
  Virgilio Teacher Effectiveness Inventory
  Tennessee Self-Concept Scale
Variable – a construct that has been operationalized and has two or more values
Obj. 1.1 & 1.2
Data Collection
Measurement scales
Nominal – categories
  Gender, ethnicity, etc.
Ordinal – ordered categories
  Rank in class, order of finish, etc.
Interval – equal intervals
  Test scores, attitude scores, etc.
Ratio – absolute zero
  Time, height, weight, etc.
Obj. 2.1
Data Collection
Types of variables
Categorical or quantitative
  Categorical variables reflect nominal scales and measure the presence of different qualities (e.g., gender, ethnicity, etc.)
  Quantitative variables reflect ordinal, interval, or ratio scales and measure different quantities of a variable (e.g., test scores, self-esteem scores, etc.)
Obj. 2.2
Data Collection
Types of variables
Independent or dependent
  Independent variables are purported causes
  Dependent variables are purported effects
Example: Two instructional strategies, cooperative groups and traditional lectures, were used during a three-week social studies unit. Students' exam scores were analyzed for differences between the groups.
  The independent variable is the instructional approach (of which there are two levels)
  The dependent variable is the students' achievement
Obj. 2.3
Measurement Instruments
Important terms
Instrument – a tool used to collect data
Test – a formal, systematic procedure for gathering information
Assessment – the general process of collecting, synthesizing, and interpreting information
Measurement – the process of quantifying or scoring a subject's performance
Obj. 3.1 & 3.2
Measurement Instruments
Important terms (continued)
Cognitive tests – examining subjects’ thoughts and thought processes
Affective tests – examining subjects’ feelings, interests, attitudes, beliefs, etc.
Standardized tests – tests that are administered, scored, and interpreted in a consistent manner
Obj. 3.1
Measurement Instruments
Important terms (continued)
Selected response item format – respondents select answers from a set of alternatives
  Multiple choice
  True-false
  Matching
Supply response item format – respondents construct answers
  Short answer
  Completion
  Essay
Obj. 3.3 & 11.3
Measurement Instruments
Important terms (continued)
Individual tests – tests administered on an individual basis
Group tests – tests administered to a group of subjects at the same time
Performance assessments – assessments that focus on processes or products that have been created
Obj. 3.6
Measurement Instruments
Interpreting data
Raw scores – the actual score made on a test
Standard scores – statistical transformations of raw scores
  Percentiles (0.00 – 99.9)
  Stanines (1 – 9)
  Normal Curve Equivalents (0.00 – 99.99)
Obj. 3.4
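The standard scores listed above can be illustrated with a minimal Python sketch. It assumes the raw scores are roughly normally distributed and uses the conventional transformations (stanine bands half a standard deviation wide, NCE = 50 + 21.06z). The function names are illustrative and not taken from the chapter.

```python
from math import floor
from statistics import NormalDist, mean, pstdev

def z_score(raw, scores):
    """Distance of a raw score from the group mean, in standard deviations."""
    return (raw - mean(scores)) / pstdev(scores)

def percentile_rank(z):
    """Percent of a normal distribution that falls below z (assumes normality)."""
    return 100 * NormalDist().cdf(z)

def stanine(z):
    """Stanine 1-9: bands half a standard deviation wide, centered on 5."""
    return max(1, min(9, floor(2 * z + 5.5)))

def nce(z):
    """Normal Curve Equivalent: mean 50, standard deviation 21.06."""
    return 50 + 21.06 * z

# Hypothetical class scores, used only to show the calls.
scores = [40, 50, 60]
z = z_score(60, scores)
print(round(percentile_rank(z), 1), stanine(z), round(nce(z), 1))
```

Percentile ranks computed this way depend on the normality assumption. With real norming data they would be read from the observed score distribution instead.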
Measurement Instruments
Interpreting data (continued)
Norm-referenced – scores are interpreted relative to the scores of others taking the test
Criterion-referenced – scores are interpreted relative to a predetermined level of performance
Self-referenced – scores are interpreted relative to changes over time
Obj. 3.5
Measurement Instruments
Types of instruments
Cognitive – measuring intellectual processes such as thinking, memorizing, problem solving, analyzing, or reasoning
  Achievement – measuring what students already know
  Aptitude – measuring general mental ability, usually for predicting future performance
Obj. 4.1 & 4.2
Measurement Instruments
Types of instruments (continued)
Affective – assessing individuals' feelings, values, attitudes, beliefs, etc.
Typical affective characteristics of interest
  Values – deeply held beliefs about ideas, persons, or objects
  Attitudes – dispositions that are favorable or unfavorable toward things
  Interests – inclinations to seek out or participate in particular activities, objects, ideas, etc.
  Personality – characteristics that represent a person's typical behaviors
Obj. 4.1 & 4.5
Measurement Instruments
Types of instruments (continued)
Affective (continued)
Scales used for responding to items on affective tests
  Likert
    Positive or negative statements to which subjects respond on scales such as strongly disagree, disagree, neutral, agree, or strongly agree
  Semantic differential
    Bipolar adjectives (i.e., two opposite adjectives) with a scale between each adjective
    Dislike: ___ ___ ___ ___ ___ :Like
  Rating scales – rankings based on how a subject would rate the trait of interest
Obj. 5.1
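Scoring a Likert instrument that mixes positive and negative statements can be sketched as follows. The 1-5 point values and the reverse-scoring of negatively worded items are common conventions assumed here, not details given in the chapter. `score_likert` is a hypothetical helper.

```python
# Assumed point values for the five response options named above.
LIKERT_POINTS = {
    "strongly disagree": 1,
    "disagree": 2,
    "neutral": 3,
    "agree": 4,
    "strongly agree": 5,
}

def score_likert(responses, negative_items=()):
    """Sum the points for each response.

    responses: list of response labels, one per item.
    negative_items: indexes of negatively worded items, which are
    reverse-scored so that a high total always means a favorable attitude.
    """
    total = 0
    for index, response in enumerate(responses):
        points = LIKERT_POINTS[response.lower()]
        if index in negative_items:
            points = 6 - points  # 1 <-> 5, 2 <-> 4, 3 stays 3
        total += points
    return total

# Item 1 is negatively worded, so "strongly disagree" counts as 5 points.
print(score_likert(["agree", "strongly disagree"], negative_items={1}))
```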
Measurement Instruments
Types of instruments (continued)
Affective (continued)
Scales used for responding to items on affective tests (continued)
  Thurstone – statements related to the trait of interest to which subjects agree or disagree
  Guttman – statements representing a unidimensional trait
Obj. 5.1
Measurement Instruments
Issues for cognitive, aptitude, or affective tests
Problems inherent in the use of self-report measures
Bias – distortions of a respondent's performance or responses based on ethnicity, race, gender, language, etc.
Responses to affective test items
  Socially acceptable responses
  Accuracy of responses
  Response sets
Alternatives include the use of projective tests
Obj. 4.3, 4.4
Technical Issues
Two concerns Validity Reliability
Technical Issues
Validity – extent to which interpretations made from a test score are appropriate
Characteristics
  The most important technical characteristic
  Situation specific
  Does not refer to the instrument but to the interpretations of scores on the instrument
  Best thought of in terms of degree
Obj. 6.1 & 7.1
Technical Issues
Validity (continued)
Four types
Content – to what extent does the test measure what it is supposed to measure
  Item validity
  Sampling validity
  Determined by expert judgment
Obj. 7.1 & 7.2
Technical Issues
Validity (continued)
Criterion-related
  Predictive – to what extent does the test predict a future performance
  Concurrent – to what extent does the test predict a performance measured at the same time
  Estimated by correlations between two tests
Construct – the extent to which a test measures the construct it represents
  Underlying difficulty defining constructs
  Estimated in many ways
Obj. 7.1, 7.3, & 7.4
Technical Issues
Validity (continued)
Consequential – to what extent are the consequences that occur from the test harmful
  Estimated by empirical and expert judgment
Factors affecting validity
  Unclear test directions
  Confusing and ambiguous test items
  Vocabulary that is too difficult for test takers
Obj. 7.1, 7.5, & 7.7
Technical Issues
Factors affecting validity (continued)
  Overly difficult and complex sentence structure
  Inconsistent and subjective scoring
  Untaught items
  Failure to follow standardized administration procedures
  Cheating by the participants or someone teaching to the test items
Obj. 7.7
Technical Issues
Reliability – the degree to which a test consistently measures whatever it is measuring
Characteristics
  Expressed as a coefficient ranging from 0 to 1
  A necessary but not sufficient characteristic of a test
Obj. 6.1, 8.1, & 8.7
Technical Issues
Reliability (continued)
Six reliability coefficients
Stability – consistency over time with the same instrument
  Test-retest
  Estimated by a correlation between the two administrations of the same test
Equivalence – consistency with two parallel tests administered at the same time
  Parallel forms
  Estimated by a correlation between the parallel tests
Obj. 8.1, 8.2, 8.3, & 8.7
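Stability and equivalence coefficients are both estimated with a correlation between two sets of scores from the same people. A minimal sketch follows, assuming the Pearson product-moment correlation is the one intended. The scores are hypothetical and `pearson_r` is an illustrative name.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two score lists of the same length (2 or more)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares around each mean.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical scores for five students tested twice with the same instrument.
first_administration = [72, 85, 90, 64, 78]
second_administration = [70, 88, 91, 60, 80]
print(round(pearson_r(first_administration, second_administration), 2))
```

For parallel forms, the two lists would hold each person's scores on Form A and Form B rather than on two administrations of one form.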
Technical Issues
Reliability (continued)
Six reliability coefficients (continued)
Equivalence and stability – consistency over time with parallel forms of the test
  Combines attributes of stability and equivalence
  Estimated by a correlation between the parallel forms
Internal consistency – consistency among the items within a single test, estimated for example by artificially splitting the test into halves
  Several coefficients – split halves, KR 20, KR 21, Cronbach alpha
  All coefficients provide estimates ranging from 0 to 1
Obj. 8.1, 8.4, 8.5, & 8.7
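Two of the internal consistency coefficients named above can be sketched briefly. This assumes the standard formulas: Cronbach alpha computed from item and total-score variances, and the Spearman-Brown correction applied to a split-half correlation. With items scored 0/1 and population variances, this alpha is numerically the same as KR 20. The data are hypothetical.

```python
from statistics import pvariance

def cronbach_alpha(item_scores):
    """Cronbach alpha for a score matrix.

    item_scores: one row per test taker, one column per item.
    """
    k = len(item_scores[0])                      # number of items
    columns = list(zip(*item_scores))            # scores grouped by item
    item_variance_sum = sum(pvariance(col) for col in columns)
    total_variance = pvariance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)

def spearman_brown(half_test_r):
    """Step a split-half correlation up to an estimate for the full-length test."""
    return 2 * half_test_r / (1 + half_test_r)

# Hypothetical right/wrong (1/0) answers: four test takers, three items.
answers = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
print(cronbach_alpha(answers))   # 0.75
print(spearman_brown(0.6))       # approximately 0.75
```

The function fails with a division by zero if every test taker has the same total score, since reliability cannot be estimated without score variation.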
Technical Issues
Reliability (continued)
Six reliability coefficients (continued)
Scorer/rater – consistency of observations between raters
  Inter-judge – two observers
  Intra-judge – one judge over two occasions
  Estimated by percent agreement between observations
Obj. 8.1, 8.6, & 8.7
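Percent agreement is simple enough to show directly. The sketch below assumes two raters (or one rater on two occasions) coded the same observations in the same order. The codes are hypothetical.

```python
def percent_agreement(ratings_a, ratings_b):
    """Percentage of observations on which two sets of ratings match."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty rating lists of the same length")
    matches = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return 100 * matches / len(ratings_a)

# Two observers coding the same four classroom intervals.
observer_1 = ["on-task", "off-task", "on-task", "on-task"]
observer_2 = ["on-task", "on-task", "on-task", "on-task"]
print(percent_agreement(observer_1, observer_2))  # 75.0
```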
Technical Issues
Reliability (continued)
Six reliability coefficients (continued)
Standard error of measurement (SEM) – an estimate of how much difference there is between a person's obtained score and his or her true score
  Function of the variation of the test and the reliability coefficient (e.g., KR 20, Cronbach alpha, etc.)
  Estimated by specifying an interval rather than a point estimate of a person's score
Obj. 8.1, 8.7, & 9.1
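The SEM and the interval it produces can be sketched as follows, assuming the usual formula SEM = SD × √(1 − reliability). The test values are hypothetical, and `score_band` is an illustrative helper name.

```python
from math import sqrt

def standard_error_of_measurement(sd, reliability):
    """SEM from the test's standard deviation and a reliability coefficient."""
    if not 0 <= reliability <= 1:
        raise ValueError("reliability must be between 0 and 1")
    return sd * sqrt(1 - reliability)

def score_band(obtained, sd, reliability, n_sem=1.0):
    """Interval of n_sem SEMs on either side of the obtained score."""
    error = n_sem * standard_error_of_measurement(sd, reliability)
    return obtained - error, obtained + error

# Hypothetical test: SD = 15, reliability = .91, obtained score = 100.
print(standard_error_of_measurement(15, 0.91))  # approximately 4.5
print(score_band(100, 15, 0.91))                # approximately (95.5, 104.5)
```

A band of one SEM on either side is conventionally read as covering the true score about 68% of the time. Passing `n_sem=2` widens it to roughly 95%.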
Selection of a Test
Sources of test information
Mental Measurements Yearbook (MMY)
  The reviews in MMY are most easily accessed through your university library and the services to which it subscribes (e.g., EBSCO)
  Provides factual information on all known tests
  Provides objective test reviews
  Comprehensive bibliography for specific tests
  Indices: titles, acronyms, subject, publishers, developers
  Buros Institute
Obj. 10.1 & 12.1
Selection of a Test
Sources (continued)
Tests in Print
  Tests in Print is published by the Buros Institute
  The reviews in it are most easily accessed through your university library and the services to which it subscribes (e.g., EBSCO)
  Bibliography of all known commercially produced tests currently available
  Very useful to determine availability
  Tests in Print
Obj. 10.1 & 12.1
Selection of a Test
Sources (continued)
ETS Test Collection
  Published and unpublished tests
  Includes test title, author, publication date, target population, publisher, and description of purpose
  Annotated bibliographies on achievement, aptitude, attitude and interests, personality, sensory motor, special populations, vocational/occupational, and miscellaneous
ETS Test Collection
Obj. 10.1 & 12.1
Selection of a Test
Sources (continued)
  Professional journals
  Test publishers and distributors
Issues to consider when selecting tests
Psychometric properties
  Validity
  Reliability
  Length of test
  Scoring and score interpretation
Obj. 10.1, 11.1, & 12.1
Selection of a Test
Issues to consider when selecting tests
Non-psychometric issues
  Cost
  Administrative time
  Objections to content by parents or others
  Duplication of testing
Obj. 11.1
Selection of a Test
Designing your own tests
Get help from others with experience in developing tests
Item writing guidelines
  Avoid ambiguous and confusing wording and sentence structure
  Use appropriate vocabulary
  Write items that have only one correct answer
  Give information about the nature of the desired answer
  Do not provide clues to the correct answer
  See Writing Multiple Choice Items
Obj. 11.2
Selection of a Test
Test administration guidelines
  Plan ahead
  Be certain that there is consistency across testing sessions
  Be familiar with any and all procedures necessary to administer a test
Obj. 11.4