Validity
Psych 395 - DeShon
Example: Validity of a Measure
“The use of the polygraph (lie detector test) is not nearly as valid as some say and can easily be beaten and should never be admitted into evidence in courts of law, say psychologists from two scientific communities who were surveyed on the validity of polygraphs.” – APA News Release
Measures and Constructs, Again!
[Diagram: X → Y shown at two levels: the unobservable (construct) variables and the measured variables.]
Basic Insights…
Differences in Observed Measures are Caused by Variations in the Unobserved Construct.
One Way to Think About Validity: How well does the observed variable capture the unobserved variable?
Apply this Idea to the Polygraph…
Issues of Validity
Does the test actually measure what it is purported to measure?
Do differences in tests scores reflect true differences in the underlying construct?
Are inferences based on the test scores justified?
It’s All About Inferences….
Cronbach (1971): Validation is the process of collecting evidence to support the types of inferences that are drawn from test scores.
There is no such thing as “the” validity of a test. Why? Many different kinds of inferences can be made from the same test.
“Validity for what?” Inferences and decisions based on test scores
A person with this score is likely to:
– Be a better parent
– Do well in law school
– Be most satisfied as an engineer
– Steal from his/her employer
Types of validity
– Content (more theory-based)
– Criterion-related (more data-based)
– Construct (general evidence-gathering)
Content Validity of a Measure
Collectively, do the items adequately represent all of the domains of the construct of interest?
Starting Point: A Well-Defined Construct. Often a panel of experts judges whether the items adequately sample the domain of interest.
Example: 1st Grade Math Objectives
What 1st Graders in School District X Should:
A. Be able to add any two positive numbers whose sum is 20 or less.
B. Subtract any two numbers (each less than 15) whose difference is a positive number.
Item Pool – Which are Content Valid?
1. 13 + 2 = ___
2. 12 - 5 = ___
3. 10 - 5 = ___
4. 26 - 15 = ___
5. 13 + 4 - 7 = ___
Sammy has 10 pennies. He lost 2. How many pennies does Sammy have?
   A. 2 pennies   B. 8 pennies   C. 10 pennies   D. 12 pennies
Example: Depression (Modified from the DSM – IV)
A complex of symptoms marked by:
– Disruptions in appetite and weight
– Insomnia or hypersomnia
– Loss of interest or pleasure in activities
– Loss of energy
– Feelings of worthlessness
– Feels sad or empty nearly every day
– Frequent death-related thoughts
Item Pool – Which are Content Valid?
– I feel blue or sad.
– I feel nervous when speaking to someone in authority.
– I have crying spells.
– I'm always willing to admit it when I make a mistake.
– I felt that everything I did was an effort.
– I never resent being asked to return a favor.
– I experience spells of terror or panic.
Contamination & Deficiency
[Diagram: overlapping circles for Construct and Measure. The overlap is Relevance (content validity); the part of the Measure falling outside the Construct is Contamination; the part of the Construct not covered by the Measure is Deficiency.]
What do we want?
A measure that samples from all important domains or aspects (Low Deficiency)
A measure that does not include anything irrelevant (Low Contamination)
That is, a measure that adequately captures all of the domains of the construct that it is intended to measure. (High Content Validity)
What Else Do We Want: A Measure that Predicts Something It Should!
Criterion-related Evidence for a Measure
What should this test predict? What inferences are we going to use this test to make?

Criterion-related validation is data based. Does the test actually predict the behavior that it is supposed to predict?
– Correlate an honesty test with employee theft
– Correlate a paper-and-pencil measure of delinquency with arrest records
– Correlate a measure of study habits with actual grades
Two types of criterion-related validity
Predictive validity – future criteria
Concurrent validity – current criteria
This distinction makes no procedural difference (both are computed as correlations)
Think of a Relevant Criterion
SAT or ACT Scores A Measure of Conscientiousness A Measure of Political Liberalism A Measure of Relationship Satisfaction
Criterion-related validity: Concurrent validity
Students who have been admitted to MSU take the SAT. Their GPA is recorded at the same time.
The correlation between the test scores and performance is computed. This correlation is sometimes called a validity coefficient.
Criterion-related validity: Predictive validity
Students take the SAT (or ACT) during High School and then some are selected into MSU. Later, their SAT scores are correlated with their college GPA.
This correlation is also sometimes called a validity coefficient.
If SAT scores and college GPA are correlated, then the SAT has some degree of predictive validity for predicting college GPA.
In both cases the degree of criterion-related validity is inferred from the size of the correlation.
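As a minimal sketch, a validity coefficient is nothing more than the Pearson correlation between predictor and criterion scores. The SAT and GPA numbers below are invented for illustration, not real admissions data:

```python
import numpy as np

# Hypothetical data: SAT scores and later college GPA for ten students.
sat = np.array([1050, 1200, 1340, 980, 1480, 1120, 1260, 1400, 1010, 1310])
gpa = np.array([2.8,  3.1,  3.6,  2.5,  3.9,  3.0,  3.2,  3.7,  2.9,  3.4])

# The validity coefficient is simply the Pearson correlation
# between the predictor (SAT) and the criterion (GPA).
validity = np.corrcoef(sat, gpa)[0, 1]
print(round(validity, 2))
```

The same computation applies whether the criterion is measured at the same time (concurrent validity) or later (predictive validity); only the timing of data collection differs.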
Issues
What is our criterion? How do we measure it?
– Reliability of predictor and criterion
– Recall: what does measurement error do?

What sample will we use?
– Small samples: more imprecision
– Issues of generalization

Restriction of range
– Want variability on both predictor and criterion variables

Predictor-criterion overlap
– Same "items" on both measures … bad!
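The restriction-of-range point can be demonstrated with a small simulation (all names and values are ours): when criterion data exist only for people already selected on the predictor, the observed validity coefficient shrinks even though the underlying relationship is unchanged.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 20_000

# Predictor and criterion with a true correlation of about .50.
x = rng.standard_normal(n)                                   # e.g., test score
y = 0.5 * x + np.sqrt(1 - 0.5**2) * rng.standard_normal(n)   # e.g., performance

r_full = np.corrcoef(x, y)[0, 1]

# "Select" only cases above the predictor median, mimicking a setting
# where only admitted/hired people ever supply criterion data.
selected = x > np.median(x)
r_restricted = np.corrcoef(x[selected], y[selected])[0, 1]

# The restricted correlation is noticeably smaller than the full-range one.
print(round(r_full, 2), round(r_restricted, 2))
```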
Measurement Error
Reliability – Index of the presence of measurement error (1.0 reliability = No error)
Unreliability in the predictor and criterion increases the error variance and therefore serves to reduce (attenuate) the observed correlation between them
When/where might we find unreliability? … Everywhere!
Tests used as predictors (e.g., measures of depression)
Criterion measures (e.g., ratings of client well-being)
Unreliability is a concern for both predictors and criteria – unreliability in both can reduce correlations
Correcting Correlations for Attenuation
r_xy = observed correlation between x and y
r_xx and r_yy = reliability coefficients of x and y

Corrected (disattenuated) correlation: r_true = r_xy / sqrt(r_xx * r_yy)
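A quick numeric sketch of the correction (all values invented): with an observed correlation of .30 and reliabilities of .80 and .60, the estimated correlation between the error-free scores is noticeably larger.

```python
import math

r_xy = 0.30   # observed predictor-criterion correlation
r_xx = 0.80   # reliability of the predictor
r_yy = 0.60   # reliability of the criterion

# Spearman's correction for attenuation: divide the observed correlation
# by the square root of the product of the two reliabilities.
r_true = r_xy / math.sqrt(r_xx * r_yy)
print(round(r_true, 2))  # 0.43
```

Note that the correction can only be as good as the reliability estimates fed into it; overestimating unreliability inflates the corrected value.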
Construct Validity – How Well Does a Measure Actually Assess the Underlying Conceptual Variable?
• Often the focus is on the Construct (i.e., the idea) and NOT just the properties of a single measure.
• How does this construct fit into a nomological network (a lawful network of expected relations)?
• Can we get convergence across different measures of the SAME construct?
• Can we get divergence? Are measures of different constructs unrelated?
Key Terms (Campbell & Fiske, 1959)
Convergent Validity: Associations Between Different Methods of Assessing the Same Construct. Confirmation of the Measurement of the Construct using Multiple Methods.
Discriminant Validity: Distinctiveness of Constructs. This is indicated by a lack of association between measures of different constructs.
Jingle Fallacy (Kelley, 1927)
Jingle fallacy: Belief that because the same name is applied to measures of different constructs, these measures are really assessing the same thing.
– Smith's Measure of Extraversion and Robert's Measure of Extraversion might not actually measure the same thing.
Jangle fallacy (Kelley, 1927)
Jangle fallacy: Belief that because measures are called by different names they are measuring different constructs.
– Smith's Measure of Sociability and Robert's Measure of Surgency might both actually measure Extraversion.
Q: How do we examine all of these ideas?
A: Use Correlation Matrices!
Multitrait-Multimethod Matrices (MTMM)
Suppose we measure three different personality traits:
– Extraversion
– Conscientiousness
– Neuroticism

Suppose we measure each of these traits in three different ways:
– Self-report
– Informant report
– Behavioral test (won't show this on the charts)
                  Self-Report              Informant
                  E      C      N          E      C      N
Self-Report  E  (.93)
             C   .22   (.94)
             N  -.11    .11   (.84)
Informant    E   .57    .32   -.16       (.89)
             C          .45    .19        .30   (.89)
             N                 .39       -.28   -.25   (.76)

Where is the convergent validity? Where is the discriminant validity?

Note: E = Extraversion, C = Conscientiousness, N = Neuroticism. Reliabilities are in parentheses on the diagonal; blank cells are not reported.
Convergent Evidence
Same construct assessed using different methods (self versus informant)
Convergent Validity diagonal (blue font)– E: .57– C: .45– N: .39
Technical Label: Monotrait-Heteromethod correlation (“trait correlation”)
– Same Trait – Different Method
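As a sketch, the validity diagonal can be pulled out of the self-informant block programmatically. The array layout and NaN placeholders below are our own; the correlation values are the ones reported on these slides (convergent validities .57, .45, .39):

```python
import numpy as np

traits = ["E", "C", "N"]

# Self-report x informant block of the MTMM matrix (rows = self-report
# E, C, N; columns = informant report E, C, N). Cells not reported on
# the slides are entered as NaN.
self_x_informant = np.array([
    [ 0.57, np.nan, np.nan],   # self E with informant E, C, N
    [ 0.32,  0.45, np.nan],    # self C with informant E, C, N
    [-0.16,  0.19,  0.39],     # self N with informant E, C, N
])

# The monotrait-heteromethod ("validity") diagonal: same trait, different method.
convergent = np.diag(self_x_informant)
for trait, r in zip(traits, convergent):
    print(f"{trait}: {r:.2f}")  # E: 0.57, C: 0.45, N: 0.39
```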
Divergent evidence
Different traits assessed using the same method (want low correlations)
– Technical: Heterotrait-Monomethod ("method correlation")
– Glop or method variance

Different traits assessed using different methods (want low correlations)
– Technical: Heterotrait-Heteromethod ("neither")
– Should be the lowest correlations in the MTMM matrix
Differentiation Between Groups
Examine the difference in test scores arising from groups known to differ on the construct– Kids with ADHD versus Kids Without ADHD– Depressed versus Non-Depressed People– Criminals versus Non-Criminals– Masculinity versus Femininity
– Discriminant group Validity
Factor Analysis
Basic Ideas
Figure out what is related and what is not. This is a construct-validity question (it is about convergence and divergence).
We do factor analysis in our heads all the time in real life!
Statistical Procedure to reduce a large number of intercorrelations to a smaller number of factors that summarize the pattern of observed correlations between variables
What is a Factor?
Unobserved variables that give rise to correlations between items on a questionnaire.
The existence of factors is inferred from patterns of association between observed variables.
Factors are sometimes called Source Traits or Latent Traits.
Goal of Factor Analysis is to identify these latent (unobserved) variables.
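One way to see the idea concretely is a toy simulation (names and loadings are ours): six items generated from two latent traits yield a correlation matrix with exactly two large eigenvalues, which is the pattern factor-analytic methods look for.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5_000

# Two latent "source traits" that are never observed directly.
extraversion = rng.standard_normal(n)
neuroticism = rng.standard_normal(n)

# Six observed items, each driven mostly by one latent trait plus noise.
items = np.column_stack([
    0.8 * extraversion + 0.6 * rng.standard_normal(n),
    0.7 * extraversion + 0.7 * rng.standard_normal(n),
    0.8 * extraversion + 0.6 * rng.standard_normal(n),
    0.8 * neuroticism + 0.6 * rng.standard_normal(n),
    0.7 * neuroticism + 0.7 * rng.standard_normal(n),
    0.8 * neuroticism + 0.6 * rng.standard_normal(n),
])

# Items driven by the same latent trait correlate; items driven by
# different traits do not. The eigenvalues of the correlation matrix
# reveal how many latent factors are at work.
R = np.corrcoef(items, rowvar=False)           # 6 x 6 item correlations
eigvals = np.sort(np.linalg.eigvalsh(R))[::-1] # largest first
print(np.round(eigvals, 2))  # two eigenvalues well above 1, the rest below
```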