12/14/2015Slide 1 The dependent variable, poverty, is plotted on the vertical axis. The independent...

Date post: 17-Jan-2016
Upload: susanna-reed
Transcript

04/21/23 Slide 1

The dependent variable, poverty, is plotted on the vertical axis.

The independent variable, enrolPop, is plotted on the horizontal axis.

Each dot represents the combination of scores on both variables for one or more cases. If two or more cases have the same scores, they will be shown by the same dot.

The relationship between two quantitative variables is pictured with a scatterplot. This scatterplot depicts the relationship between Percent of the population living below the national poverty line [poverty] and Percent of population enrolled in primary, secondary, and tertiary schools [enrolPop].

The SPSS syntax file CorrelationAndRegression.sps was used to produce the following output.


04/21/23 Slide 2

The histogram at the bottom of the display shows the distribution of the independent variable. It is color coded in red to link the histogram and the scatterplot.

The histogram at the top of the display shows the distribution of the dependent variable. It is color coded in green to link the histogram and the scatterplot.

To facilitate our understanding of the distribution of variables in the scatterplot, we add a histogram for each variable to the display of charts. The histograms support the evaluation of skewness and the presence of outliers.

Each histogram has a normal curve overlay, skewness, and kurtosis, to help us evaluate the shape.


04/21/23 Slide 3

For the dependent variable, the mean is green and the standard deviation units are tan.

To support the location of outliers and the evaluation of normality, lines have been added at the location of the mean and standard deviation units to both charts.

In the distribution of the dependent variable, we see that all of the cases fall within three tan standard deviations of the green mean, so there are no outliers more than 3 standard deviations from the mean.


04/21/23 Slide 4

For the independent variable, the mean is red and the standard deviation units are orange.

The skewness and kurtosis statistics for each histogram tell us that we satisfy the criteria for a nearly normal distribution.


04/21/23 Slide 5

We add a blue trend line or linear fit line that summarizes the overall pattern of the cases in the scatterplot.

The strength of the relationship is depicted by the narrowness of the band around the trend line, though this is somewhat distorted by the desire to spread the points out throughout the graph space. Strength is measured with greater precision by the r and rho statistics in the scatterplot title.


04/21/23 Slide 6

We add the purple colored loess smoother fit line to evaluate the linearity of the relationship. A loess smoother averages subsets of points and thus tracks more closely where the points are concentrated. Differences between the linear fit line and the loess smoother are a visual tool for determining whether the relationship is linear.

I would judge the overall pattern in this plot to be linear because the differences between the linear fit line and the loess smoother are small and do not suggest a well-defined curve.


04/21/23 Slide 7

This chart shows a clear pattern of non-linearity. At the left side of the chart, increasing per capita health expenditures has a substantial impact on the infant mortality rate, but increasing per capita health expenditures past $1,000 does not appear to produce further reductions in the infant mortality rate.


04/21/23 Slide 8

• To quantify the relationship between two quantitative variables, we use a correlation coefficient, Pearson’s r or Spearman’s rho.

• The correlation coefficient tells us:
  • If there is a relationship between the variables
  • The strength of the relationship
  • The direction of the relationship

• Correlation coefficients vary from -1.0 to +1.0.
  • A correlation coefficient of 0.0 indicates that there is no relationship.
  • A correlation coefficient of -1.0 or +1.0 indicates a perfect relationship, i.e. the scores on one variable can be accurately determined by the scores on the other variable.


04/21/23 Slide 9

• If a correlation coefficient is negative, it implies an inverse relationship, i.e. the scores on the two variables move in opposite directions: higher scores on one variable are associated with lower scores on the other variable.

• If a correlation coefficient is positive, it implies a direct relationship, i.e. the scores on the two variables move in the same direction: higher scores on one variable are associated with higher scores on the other variable.

• When we talk about the size of a correlation, we refer to the value irrespective of the sign: a correlation of -.728 is just as large or strong as a correlation of +.728.

• Pearson’s r correlation coefficient treats the data as interval.

• Spearman’s Rho treats the data as ordinal, using the rank order of the scores for each variable rather than the values.


04/21/23 Slide 10

• Suppose I had the data below showing the relationship between GPA and income.

GPA    Income
3.2    45000
3.3    42000
3.5    48000
3.7    50000
3.8    55000

• SPSS would calculate Pearson’s r for this data to be .911 and Spearman’s rho to be .900.

• The ranks for the values for each of the variables are shown in the table below.

GPA Rank    Income Rank
1           2
2           1
3           3
4           4
5           5

• Using the ranks as data, SPSS would calculate both Pearson’s r and Spearman’s rho to be .900.


04/21/23 Slide 11

• Suppose the fifth subject had an income of 100,000 instead of 55,000.

GPA    Income
3.2    45000
3.3    42000
3.5    48000
3.7    50000
3.8    100000

• SPSS would calculate Pearson’s r for this data to be .733 and Spearman’s rho to be .900.

GPA Rank    Income Rank
1           2
2           1
3           3
4           4
5           5

• The ranks for the values did not change. The fifth subject still had the highest income, so Spearman’s rho has the same value.

• Pearson’s r decreased from .911 to .733.

• Outliers, and the skewing of the distribution by outliers, have a greater effect on Pearson’s r than they do on Spearman’s rho.
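The GPA and income figures above can be checked outside SPSS. The following is a minimal Python sketch, not the SPSS procedure itself; the function names (pearson_r, ranks, spearman_rho) are illustrative assumptions, and Spearman’s rho is computed here as Pearson’s r applied to the ranks, which matches the slide’s observation that both statistics are .900 when the ranks are used as data.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson's r computed from deviations about the two means."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def ranks(values):
    """Rank from 1 (smallest) upward; tied values share their average rank."""
    ordered = sorted(values)
    # Average of the 1-based positions a value occupies in sorted order.
    return [(2 * ordered.index(v) + ordered.count(v) + 1) / 2 for v in values]

def spearman_rho(x, y):
    """Spearman's rho as Pearson's r applied to the ranks."""
    return pearson_r(ranks(x), ranks(y))

# Data taken from the two slides above.
gpa = [3.2, 3.3, 3.5, 3.7, 3.8]
income = [45000, 42000, 48000, 50000, 55000]
income_outlier = [45000, 42000, 48000, 50000, 100000]

print(round(pearson_r(gpa, income), 3), round(spearman_rho(gpa, income), 3))
# 0.911 0.9
print(round(pearson_r(gpa, income_outlier), 3),
      round(spearman_rho(gpa, income_outlier), 3))
# 0.733 0.9
```

Only the raw income value of the fifth subject changes between the two runs; its rank stays at 5, which is why rho is unaffected while r drops.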


04/21/23 Slide 12

• In the scatterplot, outliers (the case I changed from 55,000 to 100,000) will draw the loess line toward them and away from the linear fit line, making the pattern of points appear less linear.

The lines demonstrate the point, but the cyan line is really a quadratic fit rather than a loess line because I can’t do much smoothing with only 5 data points.


04/21/23 Slide 13

• Outliers, and the skewing of the distribution by outliers, have a greater effect on Pearson’s r than they do on Spearman’s rho.

• As the outliers become more extreme, and the distribution becomes more skewed, Spearman’s rho becomes larger than Pearson’s r, and the overall trend in the data is non-linear.

• To accurately model the relationship, we have three choices:
  1. Use a more complex non-linear model to analyze the relationship
  2. Re-express the data to reduce skewing and the impact of outliers, and analyze the relationship with a linear model
  3. Exclude outliers to reduce skewing, and analyze the relationship with a linear model

• The second alternative is preferred, though it may not always be possible.


04/21/23 Slide 14

• If the following three conditions are present, re-expressing the data may reduce the skewness and increase the size of Pearson’s r enough to justify treating the relationship as linear:
  1. The model appears non-linear because of the difference between the loess line and the linear fit line,
  2. Spearman’s rho is larger than Pearson’s r (by .05 or more),
  3. One or both of the variables violates the skewness criteria for a normal distribution.

• We will employ the transformations we have used previously: if the distribution is negatively skewed, we re-express the data as squares; if the distribution is positively skewed, we re-express the data as logarithms.
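The choice of transformation described above can be sketched in Python. This is an illustration only: the data are hypothetical, the function names are assumptions, and the skewness formula is the common small-sample-adjusted version, which is assumed (not confirmed by the slides) to correspond to the statistic shown in the output.

```python
from math import log10, sqrt

def skewness(values):
    """Adjusted sample skewness (assumed to match the reported statistic)."""
    n = len(values)
    m = sum(values) / n
    s = sqrt(sum((v - m) ** 2 for v in values) / (n - 1))  # sample S.D.
    return n / ((n - 1) * (n - 2)) * sum(((v - m) / s) ** 3 for v in values)

def re_express(values):
    """Logarithms for positive skew, squares for negative skew."""
    if skewness(values) > 0:
        return [log10(v) for v in values]  # requires all values > 0
    return [v ** 2 for v in values]

# Hypothetical, positively skewed scores (not from the course data set).
raw = [1, 2, 2, 3, 3, 4, 5, 8, 20, 60]
transformed = re_express(raw)

print(round(skewness(raw), 2), round(skewness(transformed), 2))
```

For this made-up variable the logarithm pulls in the long right tail, so the second number printed is noticeably smaller than the first. Whether a re-expression helps with real data still has to be checked case by case.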


04/21/23 Slide 15

• There are two sets of guidelines used to translate the r correlation coefficient into a narrative phrase, guidelines attributed to Tukey and guidelines attributed to Cohen.

• Tukey’s guidelines interpret a correlation:
  • from 0.0 up to ±0.20 as very weak;
  • equal to or greater than ±0.20 up to ±0.40 as weak;
  • equal to or greater than ±0.40 up to ±0.60 as moderate;
  • equal to or greater than ±0.60 up to ±0.80 as strong; and
  • equal to or greater than ±0.80 as very strong.

• Cohen’s guidelines interpret a correlation:
  • less than ±0.10 = trivial;
  • equal to or greater than ±0.10 up to ±0.30 = weak or small;
  • equal to or greater than ±0.30 up to ±0.50 = moderate;
  • equal to or greater than ±0.50 = strong or large.
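The two sets of cut-offs can be written as small lookup functions. This is a minimal sketch with hypothetical function names; it uses the absolute value of the coefficient, since the size of a correlation is judged irrespective of its sign.

```python
def tukey_label(r):
    """Narrative phrase for a correlation under the guidelines attributed to Tukey."""
    size = abs(r)
    if size < 0.20:
        return "very weak"
    if size < 0.40:
        return "weak"
    if size < 0.60:
        return "moderate"
    if size < 0.80:
        return "strong"
    return "very strong"

def cohen_label(r):
    """Narrative phrase for a correlation under the guidelines attributed to Cohen."""
    size = abs(r)
    if size < 0.10:
        return "trivial"
    if size < 0.30:
        return "weak or small"
    if size < 0.50:
        return "moderate"
    return "strong or large"

for r in (-0.728, 0.25, 0.911):
    print(r, "| Tukey:", tukey_label(r), "| Cohen:", cohen_label(r))
```

A correlation of -.728 is labeled strong under both sets of guidelines, while .25 is weak under both.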


Examples

Slide 16


Slide 17

In this chart, both variables are positively skewed, and are more peaked than expected for normally distributed variables. The scatterplot is clearly not linear and Spearman’s rho suggests a much stronger relationship than Pearson’s r.


Slide 18

The log re-expression of both variables was effective in reducing the skewness of both variables, improving the linearity of the relationship, and producing a Pearson’s r that is the same strength as Spearman’s rho.


Slide 19

Re-expression is not always effective at improving the relationship.

In this example, Percent of females enrolled in primary education [enrFPri] is negatively skewed, and the scatterplot suggests a non-linear relationship.

The birth rate declines rapidly at low levels of female enrollment (35% to 60%).

At high levels of female enrollment (80% to 100%), birth rates range from 10 to 45.


Slide 20

The re-expression reduced the skewness from -1.47 to -1.09, but it was not successful in improving the linearity of the relationship. There is little advantage to reporting the more complicated model that utilizes re-expression.

The birth rate declines rapidly at low levels of female enrollment.

At high levels of female enrollment, birth rates continue to be spread across a wide range.


04/21/23 Slide 21

• We might legitimately choose not to use any transformation, or to ignore the non-linearity of the relationship, but we should look at the plots and statistics for the distribution of the variables in the analysis so we are making an informed choice.

• The consequence of ignoring the issue of linearity is usually that we understate the actual importance of a relationship, though there are occasions when we might cite a relationship as important when its strength is the result of extreme outliers.

• I am not confident that we can always draw a correct conclusion about re-expression by visually inspecting histograms and scatterplots. While I can avoid those that move the distribution in the wrong direction, I usually test all that are potentially applicable before making a decision.


Homework Problems

Slide 22


Slide 23

SOLVING THE HOMEWORK PROBLEMS

Pearson's r correlation coefficient measures the strength of the linear relationship between the distributions of two quantitative variables. If the relationship is not linear, the application of statistics that assume linearity may give questionable results. Determining whether a relationship should be characterized as linear or non-linear is challenging.

One indicator of non-linearity is the difference between the rank-order correlation coefficient (Spearman's rho) and Pearson's r. When Spearman's rho is larger than Pearson's r, the relationship is likely to be non-linear, and Pearson's r may understate the strength of the relationship.

However, when one or both variables are badly skewed due to outliers, re-expressing the data can often correct the skewness, improve the linearity of the relationship, and justify the use of statistics that assume linearity.


04/21/23 Slide 24

This example is from: Applied Statistics: From Bivariate through Multivariate Techniques by Rebecca M. Wagner, page 303.

Problems for this assignment are based on the following summary of a correlation analysis.


04/21/23 Slide 25

Correlation of Quantitative Variables - 1

This is a sample of the problems in this assignment, with the correct answers displayed.

In these problems, we will assess the normality conditions for both variables, but we will not re-express the variables or omit outliers.

We will interpret the direction and strength of the relationship.

We will interpret the difference between the correlation measures.


04/21/23 Slide 26

Correlation of Quantitative Variables - 2

The first paragraph asks about the number of cases included in the analysis.

The notes provide information about:
  • the data set and variables to use,
  • the criteria for evaluating normality, and
  • the criteria for assessing effect size.


04/21/23 Slide 27

Correlation of Quantitative Variables - 3

To include only the cases that have valid data for both variables, choose the Select Cases command from the Data menu.

When we use z-scores for outlier detection, we could create z-scores for all cases in the distribution of a variable without regard to cases that are missing data for the other variable. However, these z-scores would use information from cases that are not included in the correlation.

To make sure that the z-scores we use for outlier detection are based on the same cases as the rest of the analysis, we will explicitly exclude cases that are missing data for either variable.


04/21/23 Slide 28

Correlation of Quantitative Variables - 4

First, mark the option button: If condition is satisfied

Second, when the option button is marked, the If… button is activated. Click on the If… button to specify the condition.


04/21/23 Slide 29

Correlation of Quantitative Variables - 5

The NMISS function counts the number of variables that have missing data for a case. If NMISS equals 0, the case has valid data for all of the variables and should be included in the analysis.

Click on the Continue button to close the dialog box.

SPSS includes function commands that perform specific calculations which we can use for creating new variables or for selecting cases.
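The selection rule "include a case only if NMISS equals 0" can be mimicked outside SPSS. The sketch below is a Python analogue rather than SPSS syntax: the nmiss helper, the use of None to stand for a missing value, and the four cases are all hypothetical.

```python
def nmiss(*values):
    """Count how many of the supplied values are missing (None here)."""
    return sum(v is None for v in values)

# Hypothetical cases; None marks missing data for a variable.
cases = [
    {"id": 1, "poverty": 21.0, "prison": 130.0},
    {"id": 2, "poverty": None, "prison": 95.0},
    {"id": 3, "poverty": 35.5, "prison": None},
    {"id": 4, "poverty": 12.0, "prison": 210.0},
]

# Keep only the cases with valid data for both variables.
selected = [c for c in cases if nmiss(c["poverty"], c["prison"]) == 0]

print([c["id"] for c in selected])  # [1, 4]
```

Cases 2 and 3 each have one missing value, so they are left out of every later step, including the z-scores used for outlier detection.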


04/21/23 Slide 30

Correlation of Quantitative Variables - 6

Having entered the condition, click on the OK button to complete the selection.

The condition we entered is printed to the right of the If… button.


04/21/23 Slide 31

Correlation of Quantitative Variables - 7

SPSS marks the cases that will not be included with slashes through the case number.


04/21/23 Slide 32

Correlation of Quantitative Variables - 8

To compute the descriptive statistics in SPSS, select the Descriptive Statistics > Descriptives command from the Analyze menu.

Now that we have specifically included only the cases that are not missing data for any variable, we create the statistics and standard scores we need to assess the normality of the distribution of the variables to be correlated.


04/21/23 Slide 33

Correlation of Quantitative Variables - 9

Move the variables for the analysis, poverty and prison, to the Variable(s) list box.

Click on the Options button to select optional statistics.


04/21/23 Slide 34

Correlation of Quantitative Variables - 10

The check boxes for Mean and Std. Deviation are already marked by default.

Click on the Continue button to close the dialog box.

Mark the Kurtosis and Skewness check boxes. This will provide the statistics for assessing normality.


04/21/23 Slide 35

Correlation of Quantitative Variables - 11

Click on the OK button to produce the output.

Mark the check box Save standardized values as variables.


04/21/23 Slide 36

Correlation of Quantitative Variables - 12

The table of Descriptive Statistics tells us the number of valid cases for the problem, 128.


04/21/23 Slide 37

Correlation of Quantitative Variables - 13

We enter the number of valid cases from the Descriptives output table, 128.

The next paragraph asks us to evaluate the normality of the distribution based on three criteria:
  • skewness
  • kurtosis
  • outliers more than three standard deviations from the mean.

The criteria to be applied are listed in note 2.

We postpone the characterization of the distribution until we have examined the three individual criteria.


04/21/23 Slide 38

Correlation of Quantitative Variables - 14

We can use the Correlation and Regression script to produce the histograms and scatterplot.

Highlight the dependent variable, poverty, and the independent variable, prison, and click on the Run button.

Mark the checkboxes to overlay means and standard deviations on the chart.


04/21/23 Slide 39

Correlation of Quantitative Variables - 15

The histogram for poverty has a nearly normal shape, and the skewness and kurtosis are both between -1.0 and +1.0. It is not clear from the histogram whether there are any outlier cases to the right, above three standard deviations.

The three standard deviation line does not appear in the scatterplot, reinforcing the interpretation that there are no outliers. To make certain, we need to check the z-scores for the variable.

Chart reference lines: Mean - 1 S.D., Mean + 1 S.D., Mean + 2 S.D.

NOTE: the apparent number of outliers in the scatterplot may not be accurate because two or more cases with the same scores will appear as a single dot.


04/21/23 Slide 40

Correlation of Quantitative Variables - 16

The histogram for prison does not appear nearly normal. It is skewed to the right and shows 3 outliers to the left of the 3 standard deviation orange line.

Both skewness and kurtosis are well above +1.0.

The three outliers clearly show up in the scatterplot.

Rho indicates a stronger relationship than r, suggesting that re-expression may result in a stronger relationship.

Chart reference lines: + 1 S.D., + 2 S.D., + 3 S.D.

NOTE: the apparent number of outliers in the scatterplot may not be accurate because two or more cases with the same scores will appear as a single dot.


04/21/23 Slide 41

Correlation of Quantitative Variables - 17

We can create a histogram in SPSS.

Select Legacy Dialogs > Histogram from the Graphs menu.


04/21/23 Slide 42

Correlation of Quantitative Variables - 18

Move the variable prison to the Variable: text box.

Mark the check box for Display normal curve.

Click on the OK button to produce the output.


04/21/23 Slide 43

Correlation of Quantitative Variables - 19

The histogram with the normal curve overlay appears in the SPSS Viewer.

Reference lines can be added to the histogram manually if we calculated the descriptive statistics, but we can answer our questions just by examining the statistical output.


04/21/23 Slide 44

Correlation of Quantitative Variables - 20

The first item in the second sentence asks us to enter and characterize the degree and direction of the skewness for the distribution.


04/21/23 Slide 45

Correlation of Quantitative Variables - 21

Since the skewness is positive, we characterize it as skewness to the right.

Since skewness (.57) is smaller than +1.0, we characterize it as slightly skewed to the right.

We enter the value of skewness from the table of descriptive statistics.


04/21/23 Slide 46

Correlation of Quantitative Variables - 22

The second part of the sentence asks us to enter and characterize the kurtosis of the distribution.


04/21/23 Slide 47

Correlation of Quantitative Variables - 23

Since the kurtosis is negative, we characterize it as flat. Since the kurtosis is greater than -1.0, we characterize it as slightly flatter.

We enter the value of kurtosis from the table of descriptive statistics.


04/21/23 Slide 48

Correlation of Quantitative Variables - 24

The next sentence asks us to identify the number of extreme outliers, defined in note 2 as standard scores that were three or more standard deviations from the mean.


04/21/23 Slide 49

Correlation of Quantitative Variables - 25

In this example, we will count the number of outliers by sorting the column of data values.

First, click on the column header for the variable (Zpoverty) containing the standard scores to select the column of data.

Second, right click on the column header (Zpoverty) and select Sort Ascending from the popup menu. This will show any negative outliers at the top of the column.


04/21/23 Slide 50

Correlation of Quantitative Variables - 26

Scroll down in the data editor, past the cases with missing values.

With the data for Zpoverty sorted in ascending order, we see that the smallest z-score was -1.55323. There are no outliers at the negative end of the distribution.


04/21/23 Slide 51

Correlation of Quantitative Variables - 27

Click the right mouse button again on the column header for Zpoverty, and select Sort Descending from the pop-up menu. This will show any positive outliers at the top of the column.


04/21/23 Slide 52

Correlation of Quantitative Variables - 28

With the data for Zpoverty sorted in descending order, we see that the largest z-score was 2.67952. There are no outliers at the positive end of the distribution.

Since there were no outliers at either the positive or negative ends of the distribution, there are no outliers for this variable.
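Sorting the z-score column and reading off its two ends amounts to checking the smallest and largest standard scores against the cut-off of 3. The Python sketch below shows the same check under stated assumptions: the scores are hypothetical (not the poverty data), the function names are illustrative, and the z-scores use the sample standard deviation (n - 1 in the denominator), which is assumed to match the saved standardized values.

```python
from statistics import mean, stdev

def z_scores(values):
    """Standard scores based on the sample standard deviation (n - 1)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def count_extreme_outliers(values, cutoff=3.0):
    """Number of cases three or more standard deviations from the mean."""
    return sum(abs(z) >= cutoff for z in z_scores(values))

# Hypothetical scores: 19 ordinary cases plus one extreme case (95).
scores = [10, 12, 11, 13, 12, 11, 10, 14, 12, 13,
          11, 12, 10, 13, 12, 11, 14, 12, 11, 95]

z = z_scores(scores)
print(round(min(z), 2), round(max(z), 2))   # the two ends of the sorted column
print(count_extreme_outliers(scores))       # 1
print(count_extreme_outliers(scores[:-1]))  # 0 once the extreme case is dropped
```

Checking the minimum and maximum is equivalent to the Sort Ascending and Sort Descending steps: if neither end reaches the cut-off, no case in between can.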


04/21/23 Slide 53

Correlation of Quantitative Variables - 29

We enter 0 for the number of extreme outliers.


04/21/23 Slide 54

Correlation of Quantitative Variables - 30

Since the distribution was slightly skewed to the right, slightly flatter than expected, and contained no extreme outliers, it is nearly normal.


04/21/23 Slide 55

Correlation of Quantitative Variables - 31

The next paragraph asks us to evaluate the normality of the distribution of the second variable based on the same three criteria:

  • skewness
  • kurtosis
  • outliers more than three standard deviations from the mean.

The criteria to be applied are listed in note 2.

We postpone the characterization of the distribution until we have examined the three individual criteria.


04/21/23 Slide 56

Correlation of Quantitative Variables - 32

The first item in the second sentence asks us to enter and characterize the degree and direction of the skewness for the distribution.

Correlation of Quantitative Variables - 33

Since the skewness is positive, we characterize it as skewness to the right.

Since the skewness of 1.93 is larger than +1.0, we characterize it as badly skewed to the right.

We enter the value of skewness from the table of descriptive statistics.
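The skewness rule can be sketched in Python as a rough illustration. The function names and the sample data are hypothetical. The skewness statistic uses the bias-adjusted sample formula that statistical packages commonly report, and the "slightly" label for values inside the ±1.0 cutoff, along with the mirrored rule for negative values, is an assumption, since the criteria in note 2 are not reproduced here.

```python
from math import sqrt


def sample_skewness(values):
    """Bias-adjusted sample skewness (assumed to match the reported statistic)."""
    n = len(values)
    if n < 3:
        raise ValueError("skewness needs at least three values")
    mean = sum(values) / n
    # Sample standard deviation (divides by n - 1).
    sd = sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    cubed = sum(((v - mean) / sd) ** 3 for v in values)
    return n / ((n - 1) * (n - 2)) * cubed


def describe_skewness(skewness):
    """Turn a skewness value into a direction and degree."""
    if skewness == 0:
        return "not skewed"
    direction = "right" if skewness > 0 else "left"
    # Beyond 1.0 in absolute value counts as badly skewed; the "slightly"
    # label for smaller values is an assumption.
    degree = "badly" if abs(skewness) > 1.0 else "slightly"
    return f"{degree} skewed to the {direction}"


# The skewness reported in the walkthrough:
print(describe_skewness(1.93))

# Hypothetical data with one large value in the right tail:
print(describe_skewness(sample_skewness([1, 1, 1, 2, 10])))
```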

Correlation of Quantitative Variables - 34

The second part of the sentence asks us to enter and characterize the kurtosis of the distribution.

Correlation of Quantitative Variables - 35

Since the kurtosis is positive, we characterize the distribution as peaked. Since the kurtosis is greater than +1.0, we characterize it as much more peaked than expected.

We enter the value of kurtosis from the table of descriptive statistics.
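The kurtosis rule follows the same pattern. In this sketch the function name is hypothetical, the value 2.5 is made up because the actual kurtosis is not shown in this part of the walkthrough, and the labels for negative or small values are assumptions modeled on the wording used elsewhere in these slides.

```python
def describe_kurtosis(kurtosis):
    """Characterize a kurtosis value as peaked or flat, and by how much."""
    if kurtosis == 0:
        return "as peaked as expected"
    # Values beyond 1.0 in absolute value get the stronger label.
    large = abs(kurtosis) > 1.0
    if kurtosis > 0:
        if large:
            return "much more peaked than expected"
        return "slightly more peaked than expected"
    if large:
        return "much flatter than expected"
    return "slightly flatter than expected"


# Hypothetical kurtosis value, for illustration only:
print(describe_kurtosis(2.5))
```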

Correlation of Quantitative Variables - 36

The next sentence asks us to identify the number of extreme outliers, defined in note 2 as standard scores that were three or more standard deviations from the mean.
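As a rough cross-check on the counting done in the data editor, the same definition can be sketched in Python. The function names and the list of z-scores are hypothetical (only -1.05312 comes from the walkthrough), and the z-scores are assumed to be computed with the sample standard deviation. Missing values are represented as None and skipped.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standard scores for the non-missing values (sample SD assumed)."""
    present = [v for v in values if v is not None]
    m = mean(present)
    s = stdev(present)
    return [(v - m) / s for v in present]


def count_extreme_outliers(z, cutoff=3.0):
    """Count z-scores three or more standard deviations from the mean."""
    negative = sum(1 for v in z if v <= -cutoff)
    positive = sum(1 for v in z if v >= cutoff)
    return negative, positive


# Hypothetical standard scores, for illustration only:
z = [-1.05312, -0.4, 0.1, 0.8, 3.2, 3.9, 4.4]

# Sorting ascending or descending puts the extremes at the top,
# which is what the Sort Ascending / Sort Descending steps do.
print(sorted(z)[0], sorted(z, reverse=True)[0])
print(count_extreme_outliers(z))
```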

Correlation of Quantitative Variables - 37

In this example, we will count the number of outliers by sorting the column of data values.

First, click on the column header for the variable (Zprison) containing the standard scores to select the column of data.

Second, right-click on the column header (Zprison) and select Sort Ascending from the pop-up menu. This will show any negative outliers at the top of the column.

Correlation of Quantitative Variables - 38

Scroll down in the data editor, past the cases with missing values.

With the data for Zprison sorted in ascending order, we see that the smallest z-score was -1.05312. There are no outliers at the negative end of the distribution.

Correlation of Quantitative Variables - 39

Click the right mouse button again on the column header for Zprison, and select Sort Descending from the pop-up menu. This will show any positive outliers at the top of the column.

Correlation of Quantitative Variables - 40

With the data for Zprison sorted in descending order, we see that there are three outliers at the positive end of the distribution that have z-scores greater than 3.0.

This reinforces our conclusion that the distribution was badly skewed to the right.

Correlation of Quantitative Variables - 41

We enter 3 for the number of extreme outliers.

Correlation of Quantitative Variables - 42

Since the distribution was badly skewed to the right, more peaked than expected, and contained three extreme outliers, it is not nearly normal.

Correlation of Quantitative Variables - 43

Though one of the variables did not satisfy the normality assumption, we can still compute and present the findings for the correlation of the two variables.

At worst, we can acknowledge the violation of the assumption as a limitation of the analysis. Or, further analysis may suggest re-expression to meet the expected assumption.

The next paragraph presents the correlation between the two quantitative variables.

Correlation of Quantitative Variables - 44

The first sentence in this paragraph asks us to enter the direction of the relationship.

To answer this question, we compute the correlation.

NOTE: the correlation measures are included in the script output in the title for the scatterplot.

Correlation of Quantitative Variables - 45

To compute correlations, select Correlate > Bivariate from the Analyze menu.

Correlation of Quantitative Variables - 46

First, move the variables poverty and prison to the Variables list box.

Second, mark the check box for Spearman and leave the check box for Pearson marked.

Third, click on the OK button to produce the output.
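For readers who want to see what the two coefficients requested in this dialog measure, here is a minimal Python sketch. It is not the SPSS procedure itself. The function names and the data are hypothetical, and Spearman's rho is computed the usual way, as Pearson's r applied to the ranks of the scores, with tied scores given their average rank.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """Rank values from 1 upward; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson's r computed on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical scores with one extreme value in y:
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 100]
print(pearson_r(x, y), spearman_rho(x, y))
```

Because rho depends only on the ordering of the scores, the extreme value in y lowers Pearson's r but leaves rho at 1.0 in this made-up example.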

Correlation of Quantitative Variables - 47

The Pearson Correlation r was -0.16.

The negative sign of r means that the relationship is negative.

Correlation of Quantitative Variables - 48

The negative sign of r means that the relationship is negative.

The next sentence further interprets the direction of the relationship, i.e. whether higher scores on the independent variable are associated with higher or lower scores on the dependent variable.

Correlation of Quantitative Variables - 49

A negative correlation means that the scores of the two variables move in opposite directions, i.e. as one increases the other decreases. Thus, larger prison populations are associated with lower rates of poverty.

The next sentence asks us to enter the value of Pearson’s r.

Correlation of Quantitative Variables - 50

The Pearson Correlation is entered into the narrative.

Correlation of Quantitative Variables - 51

The next sentence interprets the Pearson Correlation as the Coefficient of Determination, r².

r² is not computed as part of the Correlations output, but we can compute it with a calculator or in Excel. It is computed as r multiplied by itself.

Correlation of Quantitative Variables - 52

By my calculations, r² is equal to:

(-0.16) × (-0.16) = 0.0256, or 0.03 when rounded to two decimal places

The r² of 0.03 converts to 3%, which is interpreted as the percent of variance in the dependent variable explained or predicted by the independent variable.
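The same arithmetic can be checked in Python instead of a calculator or Excel. This is a small sketch, and the function name is hypothetical.

```python
def coefficient_of_determination(r):
    """r squared: the correlation multiplied by itself."""
    return r * r


r = -0.16
r_squared = coefficient_of_determination(r)

print(round(r_squared, 4))  # squared value to four decimal places
print(round(r_squared, 2))  # rounded to two decimal places
print(f"{r_squared:.0%}")   # expressed as a whole percent
```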

Correlation of Quantitative Variables - 53

The final sentence in the paragraph asks for the effect size interpretation of the correlation.

We use Cohen’s criteria for characterizing the effect size, as shown in Note 3.

Correlation of Quantitative Variables - 54

The last paragraph calls for us to enter Spearman’s rho and compare its size to Pearson’s r.

A Pearson r of -0.16 falls in the interval from -0.10 to -0.30, which is characterized as a weak relationship.
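The effect size lookup can be sketched as a small Python function. Only the "weak" band (0.10 to 0.30 in absolute value) is confirmed by the text above. The other bands and labels follow commonly cited versions of Cohen's criteria and are assumptions, as is the treatment of values that fall exactly on a boundary, so Note 3 should take precedence where it differs. The function name is hypothetical.

```python
def effect_size_label(r):
    """Label a correlation by its absolute size (bands partly assumed)."""
    size = abs(r)  # the sign gives direction, not strength
    if size < 0.10:
        return "trivial"   # assumed label
    if size < 0.30:
        return "weak"      # band stated in the walkthrough
    if size < 0.50:
        return "moderate"  # assumed band and label
    return "strong"        # assumed band and label


print(effect_size_label(-0.16))
```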

Correlation of Quantitative Variables - 55

Spearman’s rho is entered into the narrative.

Spearman’s rho of -0.25 is larger in absolute value than Pearson’s r of -0.16, so it implies a stronger relationship.

Correlation of Quantitative Variables - 56

The final part of the sentence focuses on whether there is a stronger relationship between the variables than the one represented by Pearson’s r, and whether we should consider re-expression.

Correlation of Quantitative Variables - 57

When Spearman’s rho indicates a stronger relationship than Pearson’s r, we characterize Pearson’s r as understating the strength of the relationship.
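The comparison is made on absolute values, since the sign only gives the direction of the relationship. A minimal Python sketch, with a hypothetical function name, follows. The wording returned when rho is not larger is an assumption, because the text only covers the case where rho is stronger.

```python
def compare_rho_to_r(rho, r):
    """Say whether Pearson's r understates the strength shown by rho."""
    if abs(rho) > abs(r):
        return "Pearson's r understates the strength of the relationship"
    # Assumed wording for the case the walkthrough does not cover.
    return "Pearson's r does not understate the strength of the relationship"


print(compare_rho_to_r(rho=-0.25, r=-0.16))
```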

If one or both of the variables were not nearly normal, we could try re-expression, but it is not required for these problems.

Correlation of Quantitative Variables - 58

Submitting the problem for grading supports the correctness of our answers.

