by: LLOYD PSYCHE T. BALTAZAR
Validity
Validity - is the extent to which a measurement tool
measures what it's supposed to measure.
Remember your thermometer?
It's measuring the room temperature, not your body temperature. Since it's supposed to be measuring your body temperature, the thermometer is not valid.
Validity
VALIDITY is an indication of how sound your
research is.
▪ Validity applies to both the design and the
methods of your research.
▪ Validity in data collection means that your
findings truly represent the phenomenon you are
claiming to measure. Valid claims are solid
claims.
Validity
Internal Validity - is affected by flaws within the
study itself such as not controlling some of the major
variables (a design problem), or problems with the
research instrument (a data collection problem).
"Findings can be said to be internally invalid because they may have been affected by factors other than those thought to have caused them, or because the interpretation of the data by the researcher is not clearly supportable" (Seliger & Shohamy 1989, 95).
Validity
Here are some factors which affect internal validity:
✓ Subject variability
✓ Size of subject population
✓ Time given for the data collection or experimental treatment
✓ History
✓ Maturation
✓ Instrument/task sensitivity
Validity
External Validity - is the extent to which you can
generalize your findings to a larger group or other
contexts. If your research lacks external validity, the
findings cannot be applied to contexts other than the
one in which you carried out your research.
"Findings can be said to be externally invalid because [they] cannot be extended or applied to contexts outside those in which the research took place" (Seliger & Shohamy 1989, 95).
Validity
Here are some factors which affect external validity:
✓ Population characteristics (subjects)
✓ Interaction of subject selection and research
✓ The effect of the research environment
✓ Researcher or experimenter effects
✓ Data collection methodology
✓ The effect of time
4 Types of Validity
Construct Validity
Content Validity
Criterion Validity
Face Validity
4 Types of Validity
Construct validity occurs when the theoretical
constructs of cause and effect accurately represent
the real-world situations they are intended to model.
This is related to how well the research is
operationalized. Good research turns the theory
(constructs) into actual things you can measure.
Example: A test designed to measure depression must measure only that particular construct, not closely related ideas such as anxiety or stress.
4 Types of Validity
Content validity occurs when the
experiment provides adequate coverage of
the subject being studied. This includes
measuring the right things as well as
having an adequate sample.
4 Types of Validity
Examples of measurements that are content valid:
➢ Height (construct) measured in centimeters (measurement).
➢ AP Physics knowledge (construct) measured by the AP exam (measurement).
Examples of measurements that have debatable content validity:
➢The Bar Exam is not a good measure of ability to practice law.
➢IQ tests are not a good way to measure intelligence.
4 Types of Validity
Criterion validity (or criterion-related
validity) measures how well one measure
predicts an outcome for another measure.
A test has this type of validity if it is useful
for predicting performance or behavior in
another situation (past, present, or future).
4 Types of Validity
Examples of Criterion validity:
➢ A job applicant takes a performance test during the interview process. If this test accurately predicts how well the employee will perform on the job, the test is said to have criterion validity.
➢ A graduate student takes the GRE (Graduate Record Exam). The GRE has been shown to be an effective tool (i.e., it has criterion validity) for predicting how well a student will perform in graduate studies.
4 Types of Validity
Face validity, also called logical validity, is a
simple form of validity where you apply a
superficial and subjective assessment of
whether or not your study or test measures
what it is supposed to measure.
▪ You can think of it as being similar to “face value”, where you just skim the surface in order to form an opinion.
Reliability
Reliability - is the degree to which an assessment
tool produces stable and consistent results.
➢ It is a measure of the stability or consistency of test
scores. You can also think of it as the ability for a
test or research findings to be repeatable.
▪ For example, a reliable medical thermometer gives the same reading each time it is used under the same conditions.
4 Types of Reliability
Inter-rater Reliability
Test-retest Reliability
Parallel Forms Reliability
Internal Consistency Reliability
4 Types of Reliability
Inter-rater reliability is a measure of reliability used to assess the degree
to which different judges or raters agree in their assessment decisions.
Type of Reliability: Inter-rater Reliability
When to Use: When you want to know whether there is consistency in the rating of some outcome.
How to Use: Examine the percent of agreement between raters.
An Example of What You Can Say When You’re Done: The inter-rater reliability for the best-dressed football player judging was 0.9, which indicates a high degree of agreement among judges.
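The "percent of agreement between raters" can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure: the function name `percent_agreement` and the two judges' ratings are hypothetical, chosen so that the result matches the 0.9 figure in the example statement.

```python
def percent_agreement(rater_a, rater_b):
    """Return the fraction of items on which two raters gave the same rating."""
    if len(rater_a) != len(rater_b):
        raise ValueError("Both raters must rate the same set of items")
    if not rater_a:
        raise ValueError("At least one rated item is required")
    # Count the items where the two ratings are identical
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


# Hypothetical data: two judges rate 10 players (1 = best dressed, 0 = not)
judge_1 = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
judge_2 = [1, 0, 1, 1, 0, 1, 0, 1, 1, 1]

agreement = percent_agreement(judge_1, judge_2)
print(agreement)  # the judges disagree on one player out of ten
```

Simple percent agreement does not correct for agreement that would occur by chance; statistics such as Cohen's kappa are commonly used when that matters.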
4 Types of Reliability
Test-retest reliability is a measure of reliability obtained by administering
the same test twice over a period of time to a group of individuals. The
scores from Time 1 and Time 2 can then be correlated in order to
evaluate the test for stability over time.
Type of Reliability: Test-retest Reliability
When to Use: When you want to know whether a test is reliable over time.
How to Use: Correlate the scores from a test given in Time 1 with the same test given in Time 2.
An Example of What You Can Say When You’re Done: The Bonzo test of identity formation for adolescence is reliable over time.
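Correlating Time 1 scores with Time 2 scores can be sketched as follows. The slides do not name a specific coefficient, so the use of the Pearson correlation here is an assumption. The function name `pearson_r` and the scores are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("Need two score lists of equal length (at least 2 scores)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squared deviations from the means
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("Correlation is undefined when all scores are identical")
    return cross / sqrt(ss_x * ss_y)


# Hypothetical scores for the same five people at Time 1 and Time 2
time_1 = [10, 12, 14, 16, 18]
time_2 = [11, 13, 14, 17, 19]

r = pearson_r(time_1, time_2)
print(round(r, 2))  # a value close to 1 suggests the scores are stable over time
```

The same calculation could be applied to parallel forms reliability by passing the scores from one form and the scores from the second form instead of Time 1 and Time 2.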
4 Types of Reliability
Parallel Forms reliability is a measure of reliability obtained by
administering different versions of an assessment tool (both versions
must contain items that probe the same construct, skill, knowledge base,
etc.) to the same group of individuals.
Type of Reliability: Parallel Forms Reliability
When to Use: When you want to know if several different forms of a test are reliable or equivalent.
How to Use: Correlate the scores from one form of the test with the scores from a second, different form of the same test of the same content.
An Example of What You Can Say When You’re Done: Set A and Set B of the Math Exams are equivalent to one another.
4 Types of Reliability
Internal Consistency reliability is a measure of reliability used to
evaluate the degree to which different test items that probe the same
construct produce similar results. Cronbach's alpha (α) is usually used.
Type of Reliability: Internal Consistency Reliability
When to Use: When you want to know if the items on a test assess one, and only one dimension.
How to Use: Correlate each individual item score with the total score.
An Example of What You Can Say When You’re Done: All of the items on the emotional intelligence test assess the same construct.
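Cronbach's alpha, mentioned above as the usual statistic, can be sketched with the standard formula α = k/(k − 1) × (1 − sum of item variances / variance of total scores), where k is the number of items. The function name `cronbach_alpha` and the response data are hypothetical. Using sample variance throughout is an assumption of this sketch.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    if len(scores) < 2:
        raise ValueError("At least two respondents are required")
    k = len(scores[0])
    if k < 2:
        raise ValueError("At least two items are required")
    # Transpose so that each tuple holds every respondent's score on one item
    items = list(zip(*scores))
    item_variance_sum = sum(variance(item) for item in items)
    # Variance of each respondent's total score across all items
    total_variance = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical data: 5 respondents answering 3 items on a 1-5 scale
responses = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [1, 2, 1],
    [3, 3, 3],
]

alpha = cronbach_alpha(responses)
print(round(alpha, 2))  # values near 1 suggest the items behave consistently
```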
4 Types of Reliability
Type of Reliability: Description
Inter-rater Reliability: Different people, same test
Test-retest Reliability: Same people, different times
Parallel Forms Reliability: Same people, same time, different but equivalent tests
Internal Consistency Reliability: Different questions, same construct