Date post: | 28-Dec-2015 |
Category: | Documents |
Upload: | beatrix-bell |
Introduction to Descriptive Statistics
Objectives:
Determine the general purpose of correlational statistics in assessment & evaluation
“Data have a story to tell. Statistical analysis is detective work in which we apply our intelligence and our tools to discover parts of that story.”
- Hamilton (1990)
Correlation
Once you know:
– Middle
– Spread
– Shape
– Relative position of specific cases
It is now useful to know relationships between variables.
Correlation
Direction of Relationships
– Positive or Negative
Magnitude of Relationships
– Weak, Moderate, Strong
Scatterplots
Outliers
Correlation
Quantitative index of association
Scaling of Pearson r
– -1 = perfect negative relationship
– 0 = no relationship
– +1 = perfect positive relationship
Most common measure of association for interval and ratio variables
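The scaling of Pearson r can be illustrated with a small calculation. The sketch below is a minimal, from-scratch version of the usual product-moment formula; the function name `pearson_r` and the sample lists are illustrative assumptions, not data from the slides.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations and sums of squared deviations
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable has no variability")
    return sxy / sqrt(sxx * syy)


# Hypothetical scores, chosen only to show the ends of the scale
x = [1, 2, 3, 4, 5]
print(pearson_r(x, [2, 4, 6, 8, 10]))  # y rises in step with x
print(pearson_r(x, [10, 8, 6, 4, 2]))  # y falls in step with x
print(pearson_r(x, [2, 1, 4, 3, 5]))   # y mostly rises with x
```

The first two calls sit at the ends of the scale (+1 and -1); the third falls in between because the points do not lie exactly on a line.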
Examples
Parent educational level and student academic achievement
Parent income or SES and student academic achievement
Coping strategies and perceived stress
Correlation
For positive correlations between two variables:
High values on x tend to be associated with high values on y
Low values on x tend to be associated with low values on y
[Scatterplot: High Positive Correlation, r = .825. x-axis: Curriculum; y-axis: Total Score]
[Scatterplot: x-axis: GABIRTH; y-axis: WEIGHT]
[Scatterplot: r = .337, 2001-2002 NC State System Level Data. x-axis: FRL; y-axis: TURNOVER]
Correlation
For negative correlations between two variables:
Low values on x tend to be associated with high values on y
High values on x tend to be associated with low values on y
[Scatterplot: r = -.613. x-axis: Perceived Control; y-axis: PSS total]
[Scatterplot: r = -.716, 2001-2002 NC State System Level Data. x-axis: FRL; y-axis: EOG]
[Scatterplot: r = -.560, 2001-2002 NC State System Level Data. x-axis: TURNOVER; y-axis: EOG]
Interpretation Guidelines
Correlation is not causality.
Correlation is necessary for causal inference, but not sufficient.
Causal inference requires experimental designs.
Interpreting the Correlation Coefficient
Correlation does NOT imply causation!!!
Possible Explanations:
1. X causes Y
2. Y causes X
3. A third factor, or multiple extraneous factors, cause both X and Y
[Diagram: X → Y; Y → X; a → X and a → Y]
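The third explanation can be made concrete with a small simulation. The sketch below is an illustration under stated assumptions: a hypothetical third factor `a` is added to both X and Y, which otherwise share nothing. The variable names, the sample size, and the normal noise model are all assumptions made for the example.

```python
import random
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation, computed from deviations about the means."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((p - mean_x) * (q - mean_y) for p, q in zip(x, y))
    sxx = sum((p - mean_x) ** 2 for p in x)
    syy = sum((q - mean_y) ** 2 for q in y)
    return sxy / sqrt(sxx * syy)


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 5000

# Hypothetical third factor that feeds into both variables
a = [rng.gauss(0, 1) for _ in range(n)]
x = [ai + rng.gauss(0, 1) for ai in a]  # X = a + its own noise
y = [ai + rng.gauss(0, 1) for ai in a]  # Y = a + its own noise

# X and Y correlate even though neither one causes the other
r_xy = pearson_r(x, y)

# Take the third factor back out: only independent noise is left
x_rest = [xi - ai for xi, ai in zip(x, a)]
y_rest = [yi - ai for yi, ai in zip(y, a)]
r_rest = pearson_r(x_rest, y_rest)

print(f"r(X, Y) with shared factor:    {r_xy:.3f}")
print(f"r(X, Y) with factor removed:   {r_rest:.3f}")
```

With this setup the first correlation should land near .5 and the second near 0, which is the pattern the third explanation describes.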
Interpretation Guidelines
Rum use and number of people entering the priesthood.
Square footage of home and student academic achievement.
Percent of women in a state who earn high salaries and percent of public officials who are women.
Interpretation Guidelines
The third variable problem.
– SES and home size.
The risk factor vs. causal agent problem.
– Length of time smoking and life expectancy.
The direction of causality problem.
– Productivity and job satisfaction
Interpreting Magnitude
    Strong   Moderate    Weak      Weak    Moderate   Strong
-1.0      -0.7      -0.3       0.0       0.3       0.7       1.0
Perfect Negative         No Relationship         Perfect Positive
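The scale above can be turned into a small lookup. In the sketch below the function name `describe_r` is hypothetical, and the handling of values that fall exactly on .3 or .7 is an assumption, since the slide only shows the cut points and not which side they belong to.

```python
def describe_r(r):
    """Label a correlation using the cut points on the magnitude scale.

    Assumption: a value exactly at .3 counts as moderate and a value
    exactly at .7 counts as strong; the slide does not specify this.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    size = abs(r)
    if size == 0:
        return "no relationship"
    direction = "positive" if r > 0 else "negative"
    if size == 1:
        return f"perfect {direction}"
    if size < 0.3:
        magnitude = "weak"
    elif size < 0.7:
        magnitude = "moderate"
    else:
        magnitude = "strong"
    return f"{magnitude} {direction}"


# Correlations reported on the scatterplot slides
for r in (0.825, 0.337, -0.613, -0.716, -0.560):
    print(f"r = {r:+.3f}: {describe_r(r)}")
```

These labels are rules of thumb; what counts as a strong relationship varies by field and by what is being measured.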
What would you expect?
Perceived stress
Depression
[Scatterplot: r = .582. x-axis: BDI Total; y-axis: PSS total]
What would you expect?
Depression
Self-acceptance
[Scatterplot: r = -.596. x-axis: Self-Acceptance; y-axis: BDI Total]
Uses of Correlation in Assessment
Inter-rater reliability
Split-half reliability
Construct validity
Concurrent validity
Convergent and Discriminant validity
An Example Using the PRI
Table 2
Correlation between PRI scale scores and scores from measures of psychological functioning

                           Perceived   Maintaining Social      Self-                   Preventive
Measure                    Control     Perspective Resource.   Acceptance  Scanning    Resources
Loss of Control            -0.544      -0.619      -0.380      -0.544      -0.558      -0.542
Loss of Efficacy           -0.585      -0.520      -0.447      -0.597      -0.467      -0.599
Perceived Stress           -0.640      -0.572      -0.467      -0.646      -0.519      -0.646
Beck Depression Inventory  -0.505      -0.539      -0.405      -0.597      -0.427      -0.543
Interpersonal Sensitivity  -0.549      -0.558      -0.447      -0.587      -0.476      -0.579
Interpersonal Ambivalence  -0.211      -0.194      -0.262      -0.229      -0.193      -0.233
Aggression                 -0.332      -0.351      -0.248      -0.393      -0.316      -0.359
IIP Total Score            -0.450      -0.455      -0.393      -0.499      -0.407      -0.482
Neuroticism                -0.153      -0.157      -0.164      -0.205      -0.150      -0.180
Extroversion                0.110       0.120       0.153       0.163       0.093       0.139
Openness                    0.089       0.077       0.102       0.110       0.062       0.099
Agreeableness               0.071       0.066       0.091       0.129       0.076       0.094
Conscientiousness           0.206       0.231       0.168       0.253       0.244       0.238

Note. n=344. All correlation coefficients with an absolute value greater than .106 are statistically significant at p<.05. IIP = Inventory of Interpersonal Problems.
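The .106 cutoff in the note can be checked with the standard t-test for a correlation coefficient, where t = r * sqrt(n - 2) / sqrt(1 - r^2) with n - 2 degrees of freedom. The sketch below is a rough check rather than the authors' procedure: it assumes a two-tailed test, and the critical t of about 1.967 for 342 degrees of freedom is an assumed value read from a t table. The function names are hypothetical.

```python
from math import sqrt

N = 344         # sample size from the table note
T_CRIT = 1.967  # assumption: two-tailed .05 critical t for df = 342 (t table)


def t_from_r(r, n):
    """t statistic for testing whether a Pearson r differs from zero."""
    return r * sqrt(n - 2) / sqrt(1 - r * r)


def critical_r(t_crit, n):
    """Smallest |r| that reaches the given critical t with n cases."""
    return t_crit / sqrt(t_crit ** 2 + (n - 2))


def is_significant(r, n=N, t_crit=T_CRIT):
    """True when |t| for this r meets or exceeds the critical t."""
    return abs(t_from_r(r, n)) >= t_crit


print(f"critical r for n = {N}: {critical_r(T_CRIT, N):.3f}")

# A few entries from the Perceived Control column of Table 2
for label, r in [("Perceived Stress", -0.640),
                 ("Neuroticism", -0.153),
                 ("Extroversion", 0.110),
                 ("Openness", 0.089)]:
    print(f"{label:18s} r = {r:+.3f}  significant: {is_significant(r)}")
```

With 344 cases even small correlations clear the threshold, so statistical significance here says little about magnitude.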
Determine whether each statement is TRUE or FALSE:
T/F 1. A measure of central tendency is a summary score that represents a set of scores.
T/F 2. The mean is a score that occurs most frequently.
T/F 3. A distribution of scores may have more than 1 mode (bi- or multi- modal) or no mode (amodal).
T/F 4. The mode is the point in a distribution where 50% of the scores are above and 50% are below.
T/F 5. The exact median can always be computed by averaging the two middle scores together.
T/F 6. The median is calculated by dividing the total sum of the scores by the number of scores.
T/F 7. The mode is the measure of central tendency that is used the most.
T/F 8. The mean should be used with categorical data.
T/F 9. To describe a distribution, a measure of central tendency AND variability should be reported.
T/F 10. In groups with a narrow spread of scores, the range and SD are larger than in groups where scores spread out.
Describing Data: Self Assessment
Fill in the blank with the appropriate descriptive statistic.
11. The _____ is the measure of central tendency that should be used with nominal data.
12. The _____ is not appropriate when extreme scores are present, because it will be misleading. In these cases, the ____ should be used.
13. _____ and _____ are most often used for descriptive statistics and not for inferential statistics.
14. The _____ is used for both descriptive and inferential statistics.
15. The _____ is the distance between the highest and lowest scores.
16. The _____ is the average distance of scores from the mean.
17. _____ is used most extensively to describe the variability of a distribution.
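The statistics named in this self-assessment can be tried out with Python's standard `statistics` module. The sketch below uses made-up scores (an assumption for illustration) and adds one extreme score to show how each statistic reacts. Note that `statistics.stdev` is the sample standard deviation (n - 1 denominator), which is only loosely an "average distance from the mean".

```python
import statistics

# Hypothetical scores; the second list adds one extreme score
scores = [2, 3, 3, 4, 5, 5, 6]
with_outlier = scores + [40]


def summarize(data):
    """Common measures of central tendency and variability."""
    return {
        "mean": statistics.mean(data),        # sum of scores / number of scores
        "median": statistics.median(data),    # middle score (or mean of the two middle scores)
        "modes": statistics.multimode(data),  # every most-frequent score
        "range": max(data) - min(data),       # highest minus lowest score
        "sd": statistics.stdev(data),         # sample standard deviation
    }


for name, data in [("scores", scores), ("with outlier", with_outlier)]:
    print(name)
    for stat, value in summarize(data).items():
        print(f"  {stat}: {value}")
```

The extreme score pulls the mean, range, and SD upward far more than it moves the median, and the modes do not change at all.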