Post on 23-Jun-2015
By:video.edhole.com
Measurement Error
Whatever measurement we might make with regard to some psychological construct, we do so with some amount of error
Any observed score for an individual is their true score with error added in
There are different types of “error”, but here we are concerned with a measure’s inability to capture the true response for an individual
  Observed score = True score + Error of measurement
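As a rough illustration of the equation above, the following Python sketch simulates observed scores as true scores plus random error. The score scale (mean 100, SD 15), the error SD of 5, and the function name `simulate_scores` are assumptions made for the example, not values from the slides. Under the classical test theory assumption that error is unrelated to true score, the share of observed-score variance that comes from true scores is one common way to express reliability.

```python
import random
import statistics


def simulate_scores(n_people, true_sd, error_sd, seed=0):
    """Simulate observed = true + error for n_people individuals (hypothetical values)."""
    rng = random.Random(seed)
    true_scores = [rng.gauss(100, true_sd) for _ in range(n_people)]
    # Error is drawn independently of the true score, with mean zero.
    observed = [t + rng.gauss(0, error_sd) for t in true_scores]
    return true_scores, observed


true_scores, observed = simulate_scores(n_people=5000, true_sd=15, error_sd=5)

# Proportion of observed-score variance that is true-score variance.
reliability = statistics.pvariance(true_scores) / statistics.pvariance(observed)
print(f"Estimated reliability: {reliability:.2f}")
```

With these assumed values the ratio should land near 15² / (15² + 5²) = 0.90, give or take sampling noise.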
Reliability
Reliability refers to a measure’s ability to capture an individual’s true score, i.e. to distinguish accurately one person from another
While a reliable measure will be consistent, consistency can actually be seen as a by-product of reliability, and in a case where we had perfect consistency (everyone scores the same and gets the same score repeatedly), reliability coefficients could not be calculated
  No variance/covariance to give a correlation
The error in our analyses is due to individual differences but also to the measure not being perfectly reliable
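The perfect-consistency case can be made concrete with a small Python sketch. The helper `pearson` and the score lists are hypothetical. The helper returns `None` when either variable has zero variance, because the correlation formula would then divide by zero.

```python
import math


def pearson(x, y):
    """Return Pearson's r, or None when either variable has no variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None  # nothing varies, so there is nothing to correlate
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


# Everyone scores 50 on both occasions: perfectly consistent, no variance.
same_time1 = [50, 50, 50, 50]
same_time2 = [50, 50, 50, 50]
r_constant = pearson(same_time1, same_time2)

# People differ from one another and keep roughly the same standing.
varied_time1 = [42, 55, 61, 48]
varied_time2 = [44, 53, 63, 47]
r_varied = pearson(varied_time1, varied_time2)

print(r_constant, r_varied)
```

Only the second pair of lists yields a coefficient, since only there do individuals differ from one another.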
Reliability
Criteria of reliability
  Test-retest
  Test components (internal consistency)
Test-retest reliability
  Consistency of measurement for individuals over time
  Individuals score similarly, e.g. today and 6 months from now
  Issues
    Memory
      If too close in time, the correlation between scores is due to memory of item responses rather than true score captured
    Chance covariation
      In a sample, any two variables will almost always have a non-zero correlation
    Reliability is not constant across subsets of a population
      General IQ scores: good reliability
      IQ scores for college students: less reliable
        Restriction of range, fewer individual differences
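The restriction-of-range point can be sketched with a simulation. Everything numeric here is an assumption for illustration: an IQ-like scale (mean 100, SD 15), an error SD of 5 on each testing occasion, and a "college student" subset defined as true scores above 115. The same test, with the same amount of error, gives a lower test-retest correlation in the restricted group.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(1)
true_scores = [rng.gauss(100, 15) for _ in range(20000)]

# Two administrations of the same test, each with independent error (SD 5).
time1 = [t + rng.gauss(0, 5) for t in true_scores]
time2 = [t + rng.gauss(0, 5) for t in true_scores]

r_full = pearson(time1, time2)

# Hypothetical restricted subset: only people with true scores above 115.
subset = [i for i, t in enumerate(true_scores) if t > 115]
r_restricted = pearson([time1[i] for i in subset], [time2[i] for i in subset])

print(f"Full population:   r = {r_full:.2f}")
print(f"Restricted subset: r = {r_restricted:.2f}")
```

The measurement error is identical in both groups. What shrinks in the subset is the spread of true scores, so error makes up a larger share of the observed variance.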
Internal Consistency
We can get a sort of average correlation among items to assess the reliability of some measure
As one would most likely intuitively assume, having more measures of something is better than few
It is the case that having more items which correlate with one another will increase the test’s reliability
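One common way to express the link between item count and reliability is the standardized form of coefficient alpha, which is the Spearman-Brown formula applied to the mean inter-item correlation. The slides do not name a specific coefficient, so treat this choice, the function name `standardized_alpha`, and the mean correlation of 0.3 as assumptions for illustration.

```python
def standardized_alpha(n_items, mean_inter_item_r):
    """Standardized alpha from the number of items and the mean inter-item correlation."""
    k, r = n_items, mean_inter_item_r
    return (k * r) / (1 + (k - 1) * r)


# Same average correlation among items (assumed 0.3), different test lengths.
for k in (5, 10, 20):
    print(k, round(standardized_alpha(k, 0.3), 2))
```

Holding the average inter-item correlation at 0.3, the coefficient rises from roughly 0.68 with 5 items to about 0.81 with 10 and 0.90 with 20.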
What’s good reliability?
While we have conventions, it really kind of depends
  As mentioned, reliability of a measure may be different for different groups of people
  What we may need to do is compare reliability to those measures which are in place and deemed ‘good’, as well as get interval estimates to provide an assessment of the uncertainty in our reliability estimate
Note also that reliability estimates are biased upwardly and so are a bit optimistic
Also, many of our techniques do not take into account the reliability of our measures, and poor reliability can result in lower statistical power, i.e. an increase in type II error
  Though technically increasing reliability can potentially also lower power
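One way to see why poor reliability costs power is Spearman's classical attenuation formula, which the slides do not state but which follows from the true score model: the correlation between two observed measures is the correlation between their true scores multiplied by the square root of the product of the two reliabilities. The function name and the numbers below are hypothetical.

```python
import math


def attenuated_r(true_r, reliability_x, reliability_y):
    """Expected observed correlation given the true-score correlation and both reliabilities."""
    return true_r * math.sqrt(reliability_x * reliability_y)


# Assumed true-score correlation of 0.5 under good and poor measurement.
good = attenuated_r(0.5, 0.9, 0.9)
poor = attenuated_r(0.5, 0.6, 0.6)
print(round(good, 2), round(poor, 2))
```

A smaller expected correlation is harder to detect at a given sample size, which is where the loss of power comes from.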
Replication and Reliability
While reliability implies replicability, assessing reliability does not provide a probability of replication
  Note also that statistical significance is not a measure of reliability or replicability
Replication is perhaps not conducted as much as it should be in psychology for a number of reasons
  Practical concerns, lack of publishing outlets etc.
Furthermore, knowing our estimates are biased and variable themselves, we might even think that in many cases we would not expect consistent research findings
In psychology, many people spend a lot of time debating back and forth about the merits of some theory, citing cases where it did or did not replicate
However, the lack of replication could be due to low power, low reliability, problem data, incorrectly carrying out the experiment etc.
  In other words, the finding didn’t repeat because of methodology, not because the theory was wrong
Factors affecting the utility of replications
You can’t step in the same river twice!
  Heraclitus
When
  Later replications are not providing as much information, however they can contribute greatly to the overall assessment of an effect
    Meta-analysis
How
  There is no perfect replication (different people involved, time it takes to conduct etc.)
  Doing ‘exact’ replication gives us more confidence in the original finding (should it hold), but may not offer much in the way of generalization
    Example: doing a gender difference study at UNT over and over. Does it work for non-college folk? People outside of Texas?
Factors affecting the utility of replications
By whom
  It is well known that those with a vested interest in some idea tend to find confirming evidence more than those that don’t
  Replications by others are still being done by those with an interest in that research topic and so may have a ‘precorrelation’ inherent in their attempt
    Direct: correlation of attributes of persons involved
    Indirect: correlation of data to be obtained
  Gist: we can’t have truly independent replication attempts, but must strive to minimize bias
  The more independent replication attempts are, the more informative they will be
Validity
Validity refers to the question of whether our measurements are actually hitting on the construct we think they are
While we can obtain specific statistics for reliability (even different types), validity is more of a global assessment based on the evidence available
We can have reliable measurements that are invalid
  Classic example: the scale which is consistent and able to distinguish from one person to the next but actually off by 5 pounds
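The scale example can be simulated directly. In this sketch the `weigh` function, the range of true weights, and the noise level are all hypothetical. The scale reads 5 pounds heavy every time, so repeated weighings agree closely and still rank people correctly, yet every reading is wrong.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(2)
true_weights = [rng.uniform(120, 220) for _ in range(200)]


def weigh(true_weight):
    """Hypothetical scale: constant +5 lb bias plus a little random noise."""
    return true_weight + 5 + rng.gauss(0, 0.5)


first = [weigh(w) for w in true_weights]
second = [weigh(w) for w in true_weights]

# Reliability: the two weighings agree almost perfectly.
consistency = pearson(first, second)
# Validity problem: on average the readings miss the true weight by about 5 lb.
mean_bias = sum(r - w for r, w in zip(first, true_weights)) / len(true_weights)

print(f"Agreement between weighings: r = {consistency:.3f}")
print(f"Average error in pounds: {mean_bias:.2f}")
```

A correlation-based reliability coefficient cannot see a constant bias, since adding 5 to every reading leaves the correlation unchanged.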
Validity Criteria in Psychological Testing
Content validity
Criterion validity
  Concurrent
  Predictive
Construct-related validity
  Convergent
  Discriminant
Content validity
  Items represent the kinds of material (or content areas) they are supposed to represent
  Are the questions worth a flip in the sense they cover all domains of a given construct?
    E.g. job satisfaction = salary, relationship w/ boss, relationship w/ coworkers etc.
Validity Criteria in Psychological Testing
Criterion validity
  The degree to which the measure correlates with various outcomes
    Does some new personality measure correlate with the Big 5?
  Concurrent
    Criterion is in the present
    Measure of ADHD and current scholastic behavioral problems
  Predictive
    Criterion is in the future
    SAT and college GPA
Validity Criteria in Psychological Testing
Construct-related validity
  How much is it an actual measure of the construct of interest?
  Convergent
    Correlates well with other measures of the construct
    Depression scale correlates well with other depression scales
  Discriminant
    Is distinguished from related but distinct constructs
    Depression scale != Stress scale
Validity Criteria in Experimentation
Statistical conclusion validity
  Is there a causal relationship between X and Y?
  Correlation is our starting point (i.e. correlation isn’t causation, but is a first step toward establishing it)
  Related to this is the question of whether the study was sufficiently sensitive to pick up on the correlation
Internal validity
  Has the study been conducted so as to rule out other effects which were controllable?
  Poor instruments, experimenter bias
External validity
  Will the relationship be seen in other settings?
Construct validity
  Same concerns as before
  Ex. Is reaction time an appropriate measure of learning?
Summary
Reliability and Validity are key concerns in psychological research
Part of the problem in psychology is the lack of reliable measures of the things we are interested in
Assuming that they are valid to begin with, we must always press for more reliable measures if we are to progress scientifically
  This means letting go of supposed ‘standards’ when they are no longer as useful and looking for ways to improve current ones