Date post: 28-Mar-2015
Upload: gabrielle-mcguire
SESSION 2

ANOVA and regression


Only the starting point

• In ANOVA, the rejection of the null hypothesis leaves many questions unanswered.

• Further analysis is needed to pinpoint the crucial patterns in the data.

• So, unlike the t test, the ANOVA is often just the first step in what may be quite an extensive statistical analysis.


Comparisons among the five treatment means


Simple and complex comparisons

• You might want to make SIMPLE COMPARISONS between the mean for each of the four drug conditions and the Placebo mean.

• Or you might want to compare the Placebo mean with the mean of the four drug means. This is a COMPLEX COMPARISON.


Non-independence of comparisons

• The simple comparison of M5 with M1 and the complex comparison are not independent.

• The value of M5 feeds into the value of the average of the means for the drug groups.


Systems of comparisons

• With a complex experiment, interest centres on SYSTEMS of comparisons.

• Which comparisons are independent or ORTHOGONAL?

• What is the probability, under the null hypothesis, that at least one comparison will show significance?

• How much variance can we attribute to different comparisons?
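
The third question has a simple answer when the comparisons are independent and each is tested at the same significance level: the probability of at least one significant result under the null hypothesis is 1 − (1 − α)^c. The sketch below is illustrative only; the function name and the values of alpha and c are assumptions, not taken from the slides.

```python
def familywise_error(alpha: float, c: int) -> float:
    """P(at least one significant result | all nulls true), for c independent tests."""
    return 1.0 - (1.0 - alpha) ** c

# Illustrative values: 1, 4 and 10 independent comparisons, each tested at the .05 level.
for c in (1, 4, 10):
    print(c, round(familywise_error(0.05, c), 3))
```

The probability grows quickly with the number of comparisons, which is why a system of comparisons needs more care than a single test.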


The crumpled paper fallacy

• We owe this to Thouless.

• Uncrumple a piece of paper.

• The wrinkles are unique.

• Therefore, they are statistically significant.

• Data sets from complex experiments may, ex post facto, show all manner of interesting patterns.

• Inferences from such patterns are dangerous.


Over-analysis?

• You have run a complex experiment and submitted a paper to a journal.

• Your reviewers will need to be convinced that what you are reporting isn’t just a chance pattern thrown up by sampling error.

• You may well be asked to specify orthogonal comparisons and test them for significance.


Linear functions

• Y is a linear function of X if the graph of Y upon X is a straight line.

• For example, temperature in degrees Fahrenheit is a linear function of temperature in degrees Celsius.


F is a linear function of C

F = (9/5)C + 32

[Figure: graph of degrees Fahrenheit against degrees Celsius. The line crosses the vertical axis at the intercept, 32, and its slope is Q/P = 9/5.]
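
The Fahrenheit example can be written as a one-line function. The sketch below (the function name is an assumption) shows the defining property of a linear function: equal steps in C produce equal steps in F.

```python
def fahrenheit(celsius: float) -> float:
    """Linear function F = (9/5) * C + 32: slope 9/5, intercept 32."""
    return 9 / 5 * celsius + 32

# Equal steps of 10 degrees Celsius give equal steps in Fahrenheit,
# which is what makes the graph a straight line.
values = [fahrenheit(c) for c in (0, 10, 20, 30)]
steps = [b - a for a, b in zip(values, values[1:])]
print(values)
print(steps)
```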


Linear contrasts

• Any comparison can be expressed as a sum of terms, each of which is a product of a treatment mean and a coefficient such that the coefficients sum to zero.

• When so expressed, the comparison is a LINEAR CONTRAST, because it has the form of a linear function.

• It looks artificial at first, but this notation enables us to study the properties of systems of comparisons among the treatment means.
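
As a rough illustration, the simple and complex comparisons described earlier can both be written as coefficient vectors applied to the treatment means. In this sketch the Placebo, Drug A and Drug D means are the values reported later in the slides; the Drug B and Drug C means, and all names in the code, are invented for the example.

```python
from fractions import Fraction as F

# Treatment means in the order Placebo, A, B, C, D.
# The B and C values (10 and 11) are hypothetical.
means = [F(8), F("7.9"), F(10), F(11), F(13)]

def contrast_value(coeffs, means):
    """Value of a linear contrast: the sum of coefficient * mean."""
    assert sum(coeffs) == 0, "contrast coefficients must sum to zero"
    return sum(c * m for c, m in zip(coeffs, means))

# Simple comparison: Drug A minus Placebo.
simple = [F(-1), F(1), 0, 0, 0]
# Complex comparison: Placebo minus the mean of the four drug means.
complex_ = [F(1), F(-1, 4), F(-1, 4), F(-1, 4), F(-1, 4)]

print(float(contrast_value(simple, means)))
print(float(contrast_value(complex_, means)))
```

Both comparisons use the same function; only the coefficients differ, which is the point of the notation.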


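More compactly, if there are k treatment groups with means M1, M2, …, Mk, we can write the contrast as ψ = c1M1 + c2M2 + … + ckMk = Σ cjMj, where the coefficients satisfy Σ cj = 0.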


Helmert contrasts

• Compare the first mean with the mean of the other means.

• Drop the first mean and compare the second mean with the mean of the remaining means. Drop the second mean.

• Continue until you arrive at a comparison between the last two means.


Helmert contrasts…

• Our first contrast is 1, -¼, -¼, -¼, -¼

• Our second contrast is 0, 1, -⅓, -⅓, -⅓

• Our third contrast is 0, 0, 1, -½, -½

• Our fourth is 0, 0, 0, 1, -1
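
The drop-one-mean-at-a-time recipe can be sketched as a short function. This is a minimal illustration (the function name is an assumption), using exact fractions so that the rows match the coefficients listed above.

```python
from fractions import Fraction

def helmert_contrasts(k: int):
    """Rows of the k-1 Helmert contrasts for k means, as exact fractions."""
    rows = []
    for i in range(k - 1):
        remaining = k - i - 1  # number of means to the right of mean i
        row = [Fraction(0)] * i + [Fraction(1)] + [Fraction(-1, remaining)] * remaining
        rows.append(row)
    return rows

for row in helmert_contrasts(5):
    print([str(c) for c in row])
```

Each row compares one mean with the mean of those that follow it, and every row sums to zero.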


Orthogonal contrasts

• The first contrast in no way constrains the value of the second, because the first mean has been dropped.

• The first two contrasts do not affect the third, because the first two means have been dropped.

• This is a set of four independent or ORTHOGONAL contrasts.


The orthogonal property

• The sum of the products of corresponding coefficients in any pair of rows is zero.

• This means that we have an ORTHOGONAL contrast set.
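
The orthogonal property is easy to check mechanically. The sketch below uses the Helmert contrasts for five means, scaled to whole numbers (multiplying a row by a constant does not affect orthogonality); the variable and function names are assumptions.

```python
from itertools import combinations

# Helmert contrasts for five means, each row multiplied up to whole numbers.
helmert = [
    [4, -1, -1, -1, -1],
    [0, 3, -1, -1, -1],
    [0, 0, 2, -1, -1],
    [0, 0, 0, 1, -1],
]

def sum_of_products(row_a, row_b):
    """Sum of the products of corresponding coefficients in two rows."""
    return sum(a * b for a, b in zip(row_a, row_b))

for (i, a), (j, b) in combinations(enumerate(helmert, start=1), 2):
    print(f"contrasts {i} and {j}: {sum_of_products(a, b)}")

is_orthogonal = all(sum_of_products(a, b) == 0 for a, b in combinations(helmert, 2))
print(is_orthogonal)
```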


Size of an orthogonal set

• In our example, with five treatment means, there are four orthogonal contrasts.

• In general, for an array of k means, you can construct a set of, at most, k-1 orthogonal contrasts.

• In the present ANOVA example, k = 5, so the rule tells us that there can be no more than 4 orthogonal contrasts in the set.

• Several different orthogonal sets, however, can often be constructed for the same set of means.


Accounting for variability

• The building block for any variance estimate is a DEVIATION of some sort.

• The TOTAL DEVIATION of any score from the grand mean (GM) can be divided into 2 components:

1. a BETWEEN GROUPS component;

2. a WITHIN GROUPS component.

total deviation (score − grand mean) = between groups deviation (group mean − grand mean) + within groups deviation (score − group mean)


Breakdown (partition) of the total sum of squares

• If you sum the squares of the deviations over all 50 scores, you obtain an expression which breaks down the total variability in the scores into BETWEEN GROUPS and WITHIN GROUPS components.
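
The partition can be verified numerically. The following sketch uses simulated scores (5 groups of 10, so 50 scores in all); the data, the seed and the variable names are invented for illustration and are not the data from the lecture.

```python
import random
from statistics import mean

random.seed(1)
# Simulated data: 10 invented scores in each of the five groups.
groups = {
    name: [random.gauss(mu, 2) for _ in range(10)]
    for name, mu in zip(["Placebo", "A", "B", "C", "D"], [8, 8, 10, 11, 13])
}

scores = [x for g in groups.values() for x in g]
gm = mean(scores)  # grand mean

ss_total = sum((x - gm) ** 2 for x in scores)
ss_between = sum(len(g) * (mean(g) - gm) ** 2 for g in groups.values())
ss_within = sum((x - mean(g)) ** 2 for g in groups.values() for x in g)

# The two printed values agree: SS total = SS between + SS within.
print(round(ss_total, 6), round(ss_between + ss_within, 6))
```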


Contrast sums of squares

• We have seen that in the one-way ANOVA, the value of SSbetween reflects the sizes of the differences among the treatment means.

• In the same way, it is possible to measure the importance of a contrast by calculating a sum of squares which reflects the variation attributable to that contrast alone.

• We can use an F statistic to test each contrast for significance.


Formula for a contrast sum of squares
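
For equal group sizes, a standard textbook form of the contrast sum of squares is sketched below. The symbols are assumptions made for illustration: n is the number of scores in each group, M_j is the j-th treatment mean and c_j is its coefficient.

```latex
\hat{\psi} = \sum_{j=1}^{k} c_j M_j, \qquad
SS_{\psi} = \frac{n\,\hat{\psi}^{2}}{\sum_{j=1}^{k} c_j^{2}}
```

Dividing by the sum of the squared coefficients means that rescaling a contrast (for example, clearing fractions) leaves its sum of squares unchanged.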


Here, once again, is our set of Helmert contrasts, to which I have added the values of the five treatment means.


Testing a contrast sum of squares for significance


Two approaches

• A contrast is a comparison between two means.

• You can therefore run a one-way, 2-group ANOVA.

• Or you can use a t-test.

• The tests are equivalent.
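
The equivalence of the two approaches can be seen by computing both statistics for one contrast. This is a sketch under stated assumptions: the scores are simulated, the group sizes are equal, and the error term is the within groups mean square from the full five-group ANOVA.

```python
import random
from math import sqrt
from statistics import mean

random.seed(2)
# Invented data: 5 groups of 10 scores, in the order Placebo, A, B, C, D.
data = [[random.gauss(mu, 2) for _ in range(10)] for mu in (8, 8, 10, 11, 13)]
coeffs = [4, -1, -1, -1, -1]  # Placebo versus the mean of the drug means

n = len(data[0])  # equal group sizes assumed
means = [mean(g) for g in data]
df_within = sum(len(g) - 1 for g in data)
ms_within = sum((x - m) ** 2 for g, m in zip(data, means) for x in g) / df_within

psi = sum(c * m for c, m in zip(coeffs, means))  # value of the contrast
sum_c2 = sum(c * c for c in coeffs)

ss_contrast = n * psi ** 2 / sum_c2          # one degree of freedom
f_stat = ss_contrast / ms_within             # F on 1 and df_within degrees of freedom
t_stat = psi / sqrt(ms_within * sum_c2 / n)  # t on df_within degrees of freedom

# The two printed values agree: F equals t squared.
print(round(f_stat, 6), round(t_stat ** 2, 6))
```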


Degrees of freedom of a contrast sum of squares

• A contrast sum of squares compares two means.

• A contrast sum of squares, therefore, has ONE degree of freedom, because the two deviations from the grand mean sum to zero.


Contrasts with SPSS

• Two approaches:

• The simpler is through the One-Way option in the Compare Means menu.

• The General Linear Model, however, provides many more useful statistics.

• I suggest you begin by exploring contrasts with the One-Way procedure first, then move on to the General Linear Model menu.


Contrasts with SPSS

The coefficients must be integers


Our Helmert contrasts

• Each ringed item is a MEAN.

• In the top row, the Placebo mean is compared with the mean of the drug means.

• In the third row, the mean for Drug B is compared with the mean of the means for Drug C and Drug D.


Summary

• A contrast is a comparison between two means.

• The contrasts can therefore be tested with either F or t. (F = t².)

• The sums of squares of a complete set of orthogonal contrasts sum to the value of SSbetween.
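
The last point can be checked with a few lines of code. The sketch below assumes equal group sizes and uses simulated scores; the helper name contrast_ss and the data are invented for the example.

```python
import random
from statistics import mean

random.seed(3)
# Invented data: 5 groups of 10 scores.
data = [[random.gauss(mu, 2) for _ in range(10)] for mu in (8, 8, 10, 11, 13)]
n = len(data[0])
means = [mean(g) for g in data]
gm = mean(means)  # grand mean (with equal n, the mean of the group means)

# The four Helmert contrasts, scaled to whole numbers.
helmert = [[4, -1, -1, -1, -1], [0, 3, -1, -1, -1], [0, 0, 2, -1, -1], [0, 0, 0, 1, -1]]

def contrast_ss(coeffs):
    """Sum of squares for one contrast (equal group sizes)."""
    psi = sum(c * m for c, m in zip(coeffs, means))
    return n * psi ** 2 / sum(c * c for c in coeffs)

ss_between = n * sum((m - gm) ** 2 for m in means)
ss_contrasts = [contrast_ss(c) for c in helmert]

# The two printed values agree.
print(round(sum(ss_contrasts), 6), round(ss_between, 6))
```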


Heterogeneity of variance

• The lower part of the table shows the results of tests of the same contrasts when homogeneity of variance is not assumed.

• Notice that the degrees of freedom have lower values.
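
One common way of obtaining reduced degrees of freedom is a Welch-Satterthwaite approximation, sketched below. Whether SPSS uses exactly this formula is an assumption here, and the variances and group sizes are invented.

```python
def satterthwaite_df(coeffs, variances, sizes):
    """Approximate df for a contrast when the group variances are not pooled."""
    terms = [c * c * v / n for c, v, n in zip(coeffs, variances, sizes)]
    return sum(terms) ** 2 / sum(t * t / (n - 1) for t, n in zip(terms, sizes))

coeffs = [4, -1, -1, -1, -1]  # Placebo versus the mean of the drug means
sizes = [10] * 5
pooled_df = sum(n - 1 for n in sizes)  # df when homogeneity of variance is assumed

equal = satterthwaite_df(coeffs, [4.0] * 5, sizes)
unequal = satterthwaite_df(coeffs, [1.0, 2.0, 4.0, 8.0, 16.0], sizes)
print(pooled_df, round(equal, 2), round(unequal, 2))
```

The approximate df cannot exceed the pooled value, so the tests that do not assume homogeneity are the more conservative ones.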


Non-orthogonal contrasts

• Contrasts don’t have to be independent.

• For example, you might wish to compare each of the four drug groups with the Placebo group.

• What you want are SIMPLE CONTRASTS.


Simple contrasts

• These are linear contrasts – each row sums to zero.

• But they are not orthogonal – with some pairings, the sum of products of corresponding coefficients is not zero.
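
A sketch of the simple contrasts as a coefficient matrix makes both points concrete. The sign convention (drug mean minus Placebo mean) and the group order Placebo, A, B, C, D are assumed for illustration.

```python
from itertools import combinations

# Simple contrasts: each drug group versus Placebo.
simple = [
    [-1, 1, 0, 0, 0],
    [-1, 0, 1, 0, 0],
    [-1, 0, 0, 1, 0],
    [-1, 0, 0, 0, 1],
]

row_sums = [sum(row) for row in simple]
cross_products = {
    (i, j): sum(a * b for a, b in zip(simple[i], simple[j]))
    for i, j in combinations(range(len(simple)), 2)
}
print(row_sums)        # every row sums to zero, so each is a linear contrast
print(cross_products)  # non-zero sums of products, so the set is not orthogonal
```

The contrasts overlap because every one of them involves the Placebo mean.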


Simple contrasts with SPSS

• Here are the entries for the first contrast, which is between the Placebo and Drug A groups.

• Below that are the entries for the final contrast between the Placebo and Drug D groups.


The results

• In the column headed ‘Value of Contrast’, are the differences between pairs of treatment means.

• For example, Drug A mean minus Placebo mean = 7.90 - 8.00 = -.10. Drug D – Placebo = 13.00 – 8.00 = 5.00.


Trend analysis

• Sometimes the factor (independent variable) may be quantitative and continuous.

• The theory of contrasts can be extended to study trends in the relationship between the factor and the dependent variable.

• The following slides outline the procedure.


Polynomials

• A POLYNOMIAL is a sum of terms, each of which is a product of a constant and a power of the same variable.

• The highest power n is the DEGREE of the polynomial.


Graphs of some polynomials

[Figure: example graphs of LINEAR (y = x − 3), QUADRATIC, CUBIC and QUARTIC (with an x⁴ term) polynomials.]


Fitting points with polynomials

• A first-order polynomial (line) does not change direction at all. But you can adjust the constants to fit any TWO points.

• A second-order polynomial (parabola) changes direction ONCE and can be fitted to any THREE points.

• A third-order polynomial changes direction TWICE and can be fitted to any FOUR points.


Fitting points with polynomials…

In general, any k points with distinct X values can be fitted perfectly by a polynomial of order k – 1.
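
The general rule can be demonstrated with Lagrange interpolation, one standard way of constructing the polynomial of order k − 1 through k points. The five points below are invented, and exact fractions are used so that the fit can be checked without rounding error.

```python
from fractions import Fraction

def lagrange_fit(points):
    """Return a function evaluating the polynomial of order k-1 through k points."""
    def poly(x):
        total = Fraction(0)
        for i, (xi, yi) in enumerate(points):
            term = Fraction(yi)
            for j, (xj, _) in enumerate(points):
                if i != j:
                    term *= Fraction(x - xj, xi - xj)
            total += term
        return total
    return poly

# Five arbitrary points with distinct x values, so a quartic fits them exactly.
points = [(0, 8), (10, 7), (20, 12), (30, 9), (40, 13)]
quartic = lagrange_fit(points)
print([int(quartic(x)) for x, _ in points])
```

A perfect fit to five means therefore proves nothing by itself, which is why each trend component is tested separately.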


[Figure: panels labelled LINEAR, QUADRATIC and CUBIC.]


Another drug experiment

• In the drug experiment, the independent variable (or factor) comprised a set of five qualitatively different conditions.

• There was no intrinsic ordering of the categories. The order in which the variables appeared in Data View was entirely arbitrary.

• Now suppose that the five groups vary in the extent to which the same drug was present.

• The Placebo, A, B, C and D groups have dosages of 0, 10, 20, 30 and 40 units of the drug, respectively.

• The five groups are now ordered with respect to a CONTINUOUS INDEPENDENT VARIABLE.


A linear trend

• There is evidence of a linear TREND in these data.

• The pattern, however, is imperfect – other trends (e.g. quadratic) may be present as well. On the other hand, the irregularity may reflect random error.


Capturing the linear trend

• Consider the linear contrast -2 -1 0 1 2

• If we plot these values against X (the concentration of the drug), we shall have the graph of a straight line.

[Figure: LINEAR, y = x − 3]


Polynomial coefficients

• The coefficients in this contrast are actually values of the polynomial

y = x – 3

• The sum of squares of this contrast captures or reflects the linear trend in the data.


Orthogonal polynomial contrasts

• Here is a set of orthogonal contrasts.

• The values in each row are values of one polynomial for various values of X, the continuous independent variable.

• The top row is a first degree (linear) polynomial, the next row is a second degree (quadratic) polynomial and so on.
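
For reference, the sketch below writes out the usual tabled orthogonal polynomial coefficients for five equally spaced levels and checks their properties. It is assumed, not confirmed by this transcript, that these are the values shown on the slide.

```python
from itertools import combinations

# Tabled orthogonal polynomial coefficients for five equally spaced levels.
trends = {
    "linear":    [-2, -1,  0,  1, 2],
    "quadratic": [ 2, -1, -2, -1, 2],
    "cubic":     [-1,  2,  0, -2, 1],
    "quartic":   [ 1, -4,  6, -4, 1],
}

# The linear row is y = x - 3 evaluated at x = 1, ..., 5.
assert trends["linear"] == [x - 3 for x in range(1, 6)]

row_sums = {name: sum(row) for name, row in trends.items()}
cross = {
    (a, b): sum(p * q for p, q in zip(trends[a], trends[b]))
    for a, b in combinations(trends, 2)
}
print(row_sums)  # every row sums to zero
print(cross)     # every pair of rows has a zero sum of products
```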


Trend analysis

• Although the entries in a row are values of the same polynomial (whether linear or not), they are still the coefficients of a linear contrast: they sum to zero; moreover, the products of the corresponding coefficients also sum to zero. We have an ORTHOGONAL SET of contrasts.

• Associated with each contrast is a sum of squares which captures that particular trend in the data.

• The contrasts are tested in the usual way.


Ordering a linear polynomial contrast

• You must check the Polynomial box and specify the order of the polynomial.

• Orthogonal polynomial sets are obtainable from tables in statistics books, such as Howell (2007), which provide orthogonal sets for sets of means of various sizes.

[Screenshot: SPSS dialog with the Polynomial box checked and a linear (1st degree) polynomial specified.]


Ordering a quadratic polynomial contrast

• You must now specify a Quadratic (2nd degree) polynomial.

• The coefficients are entered in the usual way.

[Screenshot: SPSS dialog with a 2nd degree (quadratic) polynomial specified.]


A trend analysis

• The relevant results are ringed.

• You can see that only the linear trend is significant.

• This formal analysis confirms the appearance of the profile plot.


Partition of the between groups sum of squares

Since we have an orthogonal set of contrasts, their sums of squares sum to the ANOVA between groups sum of squares.


Deviations in the ANOVA table

• The DEVIATION sum of squares is what remains of SSbetween when the last contrast sum of squares has been subtracted.

• Each deviation has one degree of freedom fewer than the previous deviation (if there is one).


The deviations

The first deviation SS (with df = 3) is obtained by subtracting the linear SS from SSbetween.

The second deviation has df = 2. Both the linear and the quadratic trends have now been removed.
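
The successive subtractions can be sketched as follows. The scores are simulated with a roughly linear rise across the five dosages, the group sizes are equal, and the helper name contrast_ss is an assumption.

```python
import random
from statistics import mean

random.seed(4)
doses = [0, 10, 20, 30, 40]
# Invented scores: 10 per dosage group.
data = [[random.gauss(8 + 0.12 * d, 2) for _ in range(10)] for d in doses]
n = len(data[0])
means = [mean(g) for g in data]
gm = mean(means)  # grand mean (equal n)

def contrast_ss(coeffs):
    """Sum of squares for one contrast (equal group sizes)."""
    psi = sum(c * m for c, m in zip(coeffs, means))
    return n * psi ** 2 / sum(c * c for c in coeffs)

ss_between = n * sum((m - gm) ** 2 for m in means)  # df = 4
ss_linear = contrast_ss([-2, -1, 0, 1, 2])          # df = 1
ss_quadratic = contrast_ss([2, -1, -2, -1, 2])      # df = 1

deviation_1 = ss_between - ss_linear      # df = 3: everything except the linear trend
deviation_2 = deviation_1 - ss_quadratic  # df = 2: linear and quadratic both removed
print(round(ss_between, 3), round(deviation_1, 3), round(deviation_2, 3))
```

What is left in the second deviation is the variation carried by the cubic and quartic components.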


The deviation terms


The t tests

• The t tests produce exactly the same p-values as the F tests.

• As usual, F = t².


Equivalence of F and t


Alternative analyses

• As usual, t-tests are also made without assuming homogeneity of variance (lower half).

• The values of df are markedly lower, suggesting that we should go by the tests in the lower part of the table.


A useful question

• Are you making comparisons or measuring association?

• If you’re making comparisons, you may need statistics such as the t-test and ANOVA.

• If you’re investigating associations, you will need techniques such as correlation and regression.


Purpose of this section

• Today I intend to build some bridges between the statistics of comparison and association.

• I hope to show that in some circumstances, the making of a comparison and the investigation of an association are equivalent.
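
One well-known instance of this equivalence is the two-group case: the independent-samples t test gives the same t as the test of the correlation between the scores and a 0/1 code for group membership. Whether this is the example the lecture goes on to use is an assumption; the scores below are invented.

```python
from math import sqrt
from statistics import mean

# Invented scores for two groups.
group0 = [7, 9, 8, 6, 10, 8]
group1 = [11, 9, 12, 10, 13, 11]

# Comparison: independent-samples t with a pooled variance estimate.
n0, n1 = len(group0), len(group1)
m0, m1 = mean(group0), mean(group1)
ss_within = sum((x - m0) ** 2 for x in group0) + sum((x - m1) ** 2 for x in group1)
pooled_var = ss_within / (n0 + n1 - 2)
t_comparison = (m1 - m0) / sqrt(pooled_var * (1 / n0 + 1 / n1))

# Association: correlate the scores with a 0/1 code for group membership.
x = [0] * n0 + [1] * n1
y = group0 + group1
mx, my = mean(x), mean(y)
sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
r = sxy / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
t_association = r * sqrt(len(y) - 2) / sqrt(1 - r ** 2)

# The two printed values agree.
print(round(t_comparison, 6), round(t_association, 6))
```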


Some regression fundamentals


A scatterplot


A strong linear association

A narrowly elliptical scatterplot like this indicates a strong positive linear association between the two variables.


The Pearson correlation
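
The usual deviation-score definition of the Pearson correlation is given below for reference. The notation is an assumption: M_X and M_Y are the means of X and Y over the N paired scores.

```latex
r = \frac{\sum_{i=1}^{N} (X_i - M_X)(Y_i - M_Y)}
         {\sqrt{\sum_{i=1}^{N} (X_i - M_X)^2 \; \sum_{i=1}^{N} (Y_i - M_Y)^2}}
```

The value of r lies between −1 and +1, with 0 indicating no linear association.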


Warning!

• This significance test presupposes that the distribution is BIVARIATE NORMAL, which implies that the scatterplot is elliptical (or circular) in shape.

• ALWAYS CHECK THIS OUT BY INSPECTING THE SCATTERPLOT.


Independence

• Select a large sample at random from a population and array the values in a column.

• Select another sample from the same population at random and array those values alongside the values of the first sample.

• The two samples are independent, because the data are not paired in any meaningful sense.

• The correlation between the two columns of values should be approximately zero.
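
This thought experiment is easy to simulate. In the sketch below the population is assumed to be normal with mean 100 and standard deviation 15, and the sample size of 5000 and the seed are arbitrary choices.

```python
import random
from math import sqrt
from statistics import mean

def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of scores."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

random.seed(5)
# Two large samples drawn independently from the same population.
first = [random.gauss(100, 15) for _ in range(5000)]
second = [random.gauss(100, 15) for _ in range(5000)]

r = pearson_r(first, second)
print(round(r, 3))  # close to zero, but not exactly zero, because of sampling error
```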


Scatterplot indicating no association


Regression

• Regression is a set of techniques for exploiting the presence of statistical association among variables to make predictions of values of one variable (the DV or CRITERION) from knowledge of the values of other variables (the IVs or REGRESSORS).


Simple and multiple regression

• In the simplest case, there is just one IV. This is known as SIMPLE regression.

• In MULTIPLE regression, there are two or more IVs.


The regression line of actual violence upon film preference


The regression line of Violence upon Preference

• The REGRESSION LINE is the line that fits the points best from the point of view of predicting Actual Violence from Preference.

• (A different line would be drawn were we to try to predict Preference from Actual Violence.)
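
The point in parentheses can be illustrated numerically. The paired scores below are invented; x stands for Preference and y for Actual Violence.

```python
from statistics import mean

# Invented paired scores: x = Preference, y = Actual Violence.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2, 1, 4, 3, 6, 4, 7, 6]

mx, my = mean(x), mean(y)
sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
sxx = sum((a - mx) ** 2 for a in x)
syy = sum((b - my) ** 2 for b in y)

slope_y_on_x = sxy / sxx  # line for predicting Violence from Preference
slope_x_on_y = sxy / syy  # line for predicting Preference from Violence

# Drawn on the same axes (y against x), the second line has slope 1 / slope_x_on_y.
print(round(slope_y_on_x, 3), round(1 / slope_x_on_y, 3))
```

The two lines coincide only when the correlation is perfect; the product of the two slopes is r squared.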


Here is the equation of the regression line


Residual scores

• Suppose we use the regression line of Y upon X to predict the value of a person’s score Y from a particular value of X.

• A RESIDUAL (e) is the difference between the person’s actual score on Y and the predicted score Y′, the point on the regression line above that value of X: e = Y – Y′.
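A minimal sketch of how residuals are obtained, again with invented scores (an assumption, not the slides' data):

```python
from statistics import mean

# Hypothetical scores, invented for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]

# Least-squares slope (b1) and intercept (b0) of the regression of Y upon X.
mx, my = mean(x), mean(y)
b1 = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum((xi - mx) ** 2 for xi in x)
b0 = my - b1 * mx

predicted = [b0 + b1 * xi for xi in x]                  # points on the line
residuals = [yi - pi for yi, pi in zip(y, predicted)]   # e = Y - Y'

for xi, yi, pi, ei in zip(x, y, predicted, residuals):
    print(f"X={xi}  Y={yi}  Y'={pi:.1f}  e={ei:+.1f}")

# Residuals about a least-squares line sum to zero (up to rounding error).
print(abs(sum(residuals)) < 1e-9)
```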


The residuals are shown in the next picture


Summary

• B1 is the slope and B0 is the intercept.

• Y′ is the Y-coordinate of the point on the line above the value X.

• An increase of one unit on variable X will result in an estimated increase of (B1) units on variable Y.

• A NEGATIVE value of B1 means that an increase of one unit on variable X will result in an estimated REDUCTION of |B1| units on Y.

Y′ = B0 + B1X, where B0 is the regression constant (intercept) and B1 is the regression coefficient (slope).


The ‘least-squares’ criterion

The regression line is the ‘best-fitting’ line in the sense that it minimises the sum of the squares of the residuals.
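The least-squares criterion can be checked numerically. The sketch below uses invented scores and a hypothetical helper, `sum_squared_residuals`, and compares the fitted line with a few arbitrarily shifted or tilted alternatives.

```python
from statistics import mean

def sum_squared_residuals(b0, b1, x, y):
    """Sum of squared residuals for the line Y' = b0 + b1 * X."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

# Hypothetical scores, invented for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]

mx, my = mean(x), mean(y)
b1 = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum((xi - mx) ** 2 for xi in x)
b0 = my - b1 * mx

best = sum_squared_residuals(b0, b1, x, y)
print(f"Least-squares line: SS residual = {best:.3f}")

# Arbitrary changes to the intercept and/or slope; each gives a larger sum.
changes = [(0.5, 0.0), (-0.5, 0.0), (0.0, 0.2), (0.0, -0.2), (0.3, -0.1)]
others = [sum_squared_residuals(b0 + d0, b1 + d1, x, y) for d0, d1 in changes]
for (d0, d1), ss in zip(changes, others):
    print(f"intercept {d0:+.1f}, slope {d1:+.1f}: SS residual = {ss:.3f}")
```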


Breakdown of the total sum of squares


Coefficient of determination


Explanation


The coefficient of determination (r²)

• The COEFFICIENT OF DETERMINATION (r²) is the proportion of the variance of the predicted variable accounted for by regression.

• The coefficient of determination can take values within the range from 0 to +1, inclusive.
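As a sketch of the breakdown of the total sum of squares and of r² as a proportion, using invented scores (an assumption, not the slides' data):

```python
from math import sqrt
from statistics import mean

# Hypothetical scores, invented for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]

mx, my = mean(x), mean(y)
sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
sxx = sum((xi - mx) ** 2 for xi in x)
b1 = sxy / sxx
b0 = my - b1 * mx
predicted = [b0 + b1 * xi for xi in x]

ss_total = sum((yi - my) ** 2 for yi in y)
ss_regression = sum((pi - my) ** 2 for pi in predicted)
ss_residual = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))

r_squared = ss_regression / ss_total   # proportion of variance accounted for
r = sxy / sqrt(sxx * ss_total)         # Pearson correlation

print(f"SS total = {ss_total:.2f} = {ss_regression:.2f} + {ss_residual:.2f}")
print(f"r squared = {r_squared:.2f}; r = {r:.2f}")
```

With one regressor, the proportion SS regression / SS total is the square of the Pearson correlation.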


Range of r


Positive bias

• The coefficient of determination is positively biased as an estimator.

• The statistic known as ‘adjusted R²’ attempts to correct this bias.
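The slides do not show the formula. The sketch below assumes the commonly used adjustment, 1 – (1 – R²)(n – 1)/(n – p – 1), for n cases and p regressors; the function name is hypothetical and the input values are invented.

```python
def adjusted_r_squared(r_squared, n, p):
    """Adjusted R squared for n cases and p regressors (constant not counted).

    Assumes the common formula 1 - (1 - R^2)(n - 1) / (n - p - 1).
    """
    return 1 - (1 - r_squared) * (n - 1) / (n - p - 1)

# Invented values: the same R squared from a tiny and from a large sample.
small = adjusted_r_squared(0.64, n=5, p=1)
large = adjusted_r_squared(0.64, n=500, p=1)
print(f"n = 5:   adjusted R squared = {small:.3f}")
print(f"n = 500: adjusted R squared = {large:.3f}")
```

The correction is substantial in small samples and shrinks towards nothing as n grows.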


Using more than one regressor

• By analogous methods, we could try to predict a person’s actual violence from exposure to screen violence and number of years of education.

• This is a problem in MULTIPLE REGRESSION.


Multiple regression


Geometrical interpretation

• The multiple regression equation, Y′ = B0 + B1X1 + B2X2 + … + BpXp, is the equation of a plane (or hyperplane) with slopes B1, B2, …, Bp with respect to axes X1, X2, …, Xp and intercept B0.

• The slopes are the PARTIAL REGRESSION COEFFICIENTS and the intercept is the CONSTANT.


Regression coefficients

• In simple regression, the REGRESSION COEFFICIENT (B1) is the estimated change in units of the DV that would result from an increase of one unit in the IV.

• In multiple regression, a PARTIAL REGRESSION COEFFICIENT such as B1 is the estimated change in the DV resulting from an increase of one unit in the IV X1 with ALL OTHER IVs HELD CONSTANT.
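The difference between a simple and a partial regression coefficient can be sketched as follows. The data are invented so that Y = 1 + 2·X1 – X2 exactly, and `solve` and `ols` are hypothetical helpers that fit the model through the normal equations.

```python
def solve(a, b):
    """Solve the linear system a.x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(columns, y):
    """Least-squares coefficients [B0, B1, ..., Bp]; regressors given as columns."""
    rows = [[1.0, *vals] for vals in zip(*columns)]  # design matrix with constant
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

# Hypothetical, correlated regressors and a DV built as Y = 1 + 2*X1 - X2.
exposure = [1, 2, 3, 4, 5]     # X1
education = [2, 1, 4, 3, 6]    # X2
violence = [1 + 2 * a - b for a, b in zip(exposure, education)]

simple = ols([exposure], violence)              # X1 alone
partial = ols([exposure, education], violence)  # X1 with X2 held constant

print(f"Simple regression coefficient for X1:  {simple[1]:.2f}")
print(f"Partial regression coefficient for X1: {partial[1]:.2f}")
```

Because the two regressors are correlated, the coefficient for X1 changes once X2 is in the equation.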


The multiple correlation coefficient R

• The MULTIPLE CORRELATION COEFFICIENT is the correlation between the estimates Y′ and the actual values of the DV (Y).

• The COEFFICIENT OF DETERMINATION (R²) is the proportion of the variance of Y that is accounted for by regression.


Range of R

• The multiple correlation coefficient R can only have non-negative values:

• 0 ≤ R ≤ +1

• This is because the regression line (or plane) cannot have a slope of opposite sign to that of the elliptical (or hyperelliptical) scatterplot.
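A small numerical check with invented, negatively related scores: the Pearson correlation r is negative, but R, computed as the correlation between Y′ and Y, is positive. `pearson` is a hypothetical helper.

```python
from math import sqrt
from statistics import mean

def pearson(a, b):
    """Pearson correlation between two equal-length lists."""
    ma, mb = mean(a), mean(b)
    sab = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    saa = sum((ai - ma) ** 2 for ai in a)
    sbb = sum((bi - mb) ** 2 for bi in b)
    return sab / sqrt(saa * sbb)

# Hypothetical scores with a negative relationship.
x = [1, 2, 3, 4, 5]
y = [5, 3, 4, 1, 2]

mx, my = mean(x), mean(y)
b1 = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum((xi - mx) ** 2 for xi in x)
b0 = my - b1 * mx
predicted = [b0 + b1 * xi for xi in x]

r = pearson(x, y)                  # correlation between X and Y
big_r = pearson(predicted, y)      # correlation between Y' and Y

print(f"r = {r:.2f}, slope = {b1:.2f}, R = {big_r:.2f}")
```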


Attribution of variance to regressors

If the IVs are uncorrelated, it is easy to attribute variance in Y to each of the independent variables X.


Correlated IVs

• When the IVs are measured (rather than experimentally manipulated), they always correlate to at least some extent.

• It is then impossible to attribute variance unequivocally to any particular IV.


Dummy variables

• Information about group membership is carried by a grouping variable.

• A DUMMY VARIABLE has only two values: 0 and 1, where 0 usually denotes the control or comparison condition – in this case the Placebo.


Point-biserial correlation

• If we correlate the scores in the Score column with the dummy variable in the Group column, we obtain what is known as a POINT-BISERIAL CORRELATION.

• The meaning of ‘point-biserial’ is lost in the mists of antiquity.

• The point is that we are correlating a measured variable with code numbers for category membership.


A link

• The point biserial correlation is of limited value as a descriptive statistic.

• However, it forms a useful conceptual bridge between the statistics of comparison (t-test) and association (correlation).
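The bridge can be sketched numerically. The two small groups below are invented (they are not the Caffeine experiment data), and the conversion t = r√(n – 2)/√(1 – r²) is the standard test of a correlation coefficient.

```python
from math import sqrt
from statistics import mean

def pearson(a, b):
    """Pearson correlation between two equal-length lists."""
    ma, mb = mean(a), mean(b)
    sab = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    saa = sum((ai - ma) ** 2 for ai in a)
    sbb = sum((bi - mb) ** 2 for bi in b)
    return sab / sqrt(saa * sbb)

# Hypothetical scores, invented for illustration only.
placebo = [8, 9, 10, 9]
caffeine = [11, 12, 13, 12]

scores = placebo + caffeine
group = [0] * len(placebo) + [1] * len(caffeine)   # dummy variable
n = len(scores)

# Point-biserial correlation and the t statistic derived from it.
r_pb = pearson(group, scores)
t_from_r = r_pb * sqrt(n - 2) / sqrt(1 - r_pb ** 2)

# Independent-samples t test with a pooled variance estimate.
ss_within = (sum((s - mean(placebo)) ** 2 for s in placebo)
             + sum((s - mean(caffeine)) ** 2 for s in caffeine))
pooled_var = ss_within / (n - 2)
t_pooled = (mean(caffeine) - mean(placebo)) / sqrt(
    pooled_var * (1 / len(placebo) + 1 / len(caffeine)))

print(f"r_pb = {r_pb:.3f}; t from r = {t_from_r:.3f}; pooled t = {t_pooled:.3f}")
```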


Regression upon dummy variables

• We shall now regress the scores that people achieved in the Caffeine experiment against those of the dummy variable carrying group membership.


The regression line will pass through the group means


Why?

• OLS regression minimises the sum of the squares of the residuals.

• Within each group of scores, the sum of the squared deviations is a minimum when the deviations are taken about the group mean, so the line must pass through both group means.


The sum of squares of deviations about the mean is a minimum


The regression statistics

• When we regress the Score variable against the dummy variable, the intercept of the regression line is the mean score of the Placebo group.

• The slope of the regression line is the difference between the means of the Caffeine and Placebo groups.
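A minimal sketch with invented scores (not the Caffeine experiment data) confirming that the intercept is the mean of the group coded 0 and the slope is the difference between the group means:

```python
from statistics import mean

# Hypothetical scores, invented for illustration only.
placebo = [8, 9, 10, 9]
caffeine = [11, 12, 13, 12]

scores = placebo + caffeine
dummy = [0] * len(placebo) + [1] * len(caffeine)   # 0 = Placebo, 1 = Caffeine

md, ms = mean(dummy), mean(scores)
slope = (sum((d - md) * (s - ms) for d, s in zip(dummy, scores))
         / sum((d - md) ** 2 for d in dummy))
intercept = ms - slope * md

print(f"Intercept = {intercept:.2f}; Placebo mean = {mean(placebo):.2f}")
print(f"Slope = {slope:.2f}; difference between means = "
      f"{mean(caffeine) - mean(placebo):.2f}")
```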


The regression statistics


Significance tests

• The intercept (Constant) is 9.25, the value of the Placebo mean.

• The slope is 2.65, which is 11.90 – 9.25, the difference between the Caffeine and Placebo means.

• t(38) = 2.604; p = .013. This is exactly the result we obtained with the independent samples t test.


Equivalence of ANOVA and regression

• When we test the slope of the regression line for significance, we are also testing the difference between the Caffeine and Placebo means for significance.

• Since (in the 2-group case) the F and t tests are equivalent, the regression ANOVA table is identical with the one-way ANOVA table we obtained before.
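The equivalence can be sketched with invented two-group scores (not the slides' data): the t statistic for the slope, squared, equals the F ratio from a one-way ANOVA on the same scores.

```python
from math import sqrt
from statistics import mean

# Hypothetical scores, invented for illustration only.
placebo = [8, 9, 10, 9]
caffeine = [11, 12, 13, 12]
scores = placebo + caffeine
dummy = [0] * len(placebo) + [1] * len(caffeine)
n = len(scores)

# Regression of Score upon the dummy variable, with the t test of the slope.
md, ms = mean(dummy), mean(scores)
sxx = sum((d - md) ** 2 for d in dummy)
b1 = sum((d - md) * (s - ms) for d, s in zip(dummy, scores)) / sxx
b0 = ms - b1 * md
ss_residual = sum((s - (b0 + b1 * d)) ** 2 for d, s in zip(dummy, scores))
se_b1 = sqrt(ss_residual / (n - 2) / sxx)
t_slope = b1 / se_b1

# One-way ANOVA on the same two groups.
groups = (placebo, caffeine)
ss_between = sum(len(g) * (mean(g) - ms) ** 2 for g in groups)
ss_within = sum((s - mean(g)) ** 2 for g in groups for s in g)
f_anova = (ss_between / (len(groups) - 1)) / (ss_within / (n - len(groups)))

print(f"t for the slope = {t_slope:.3f}; t squared = {t_slope ** 2:.3f}")
print(f"ANOVA F = {f_anova:.3f}")
```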


Dummy coding for the k-group case

• Since the between groups source of variance has only four degrees of freedom, regression will predict the treatment means perfectly if the Score variable is regressed upon four dummy variables X1, X2, X3 and X4.

• As with the two-group example, an interesting equivalence emerges.


Dummy coding for the k-group case


The one-way ANOVA statistics


The regression statistics

We see that B0 is the Placebo mean and B1, B2, B3 and B4 are the differences between the means for the 4 drug conditions and the Placebo mean.

Same as the ANOVA value of F.


In summary

• When the scores in the five-group drug experiment are regressed upon 4 dummy variables,

• The regression constant or intercept B0 is the Placebo mean.

• The partial regression coefficients are the differences between the drug conditions and the Placebo mean.

• The regression sum of squares is equal to the ANOVA between groups sum of squares.
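These points can be checked on a smaller scale. The sketch below assumes three invented groups (so two dummy variables rather than four) with hypothetical group names; `solve` and `ols` are hypothetical helpers that fit the model through the normal equations.

```python
from statistics import mean

def solve(a, b):
    """Solve the linear system a.x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(columns, y):
    """Least-squares coefficients [B0, B1, ..., Bp]; regressors given as columns."""
    rows = [[1.0, *vals] for vals in zip(*columns)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

# Hypothetical three-group data; the first group is the comparison group.
groups = {"Placebo": [8, 9, 10], "Drug A": [11, 12, 13], "Drug B": [9, 11, 13]}
names = list(groups)
scores = [s for name in names for s in groups[name]]

# One dummy variable per non-placebo group: 1 for members, 0 otherwise.
dummies = [[1.0 if name == target else 0.0 for name in names for _ in groups[name]]
           for target in names[1:]]

b = ols(dummies, scores)
predicted = [b[0] + sum(bj * d[i] for bj, d in zip(b[1:], dummies))
             for i in range(len(scores))]

grand = mean(scores)
ss_regression = sum((p - grand) ** 2 for p in predicted)
ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values())

print("B0, B1, B2:", [round(v, 3) for v in b])
print(f"SS regression = {ss_regression:.3f}; SS between = {ss_between:.3f}")
```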


In summary …

• The t tests of the regression coefficients are equivalent to the F tests of the sums of squares associated with the four contrasts comparing each drug mean with the Placebo mean: for each contrast, F = t².


Eta squared

• Returning to the one-way ANOVA, recall that eta squared (also known as the CORRELATION RATIO) is defined as the ratio of the between groups sum of squares to the total sum of squares.

• Its theoretical range of variation is from zero (no differences among the means) to unity (no variance in the scores of any group, but different values in different groups).

• In our example, η² = .447


Eta squared revisited

• If the scores from a k – group experiment are regressed upon k – 1 dummy variables, the square of the multiple correlation coefficient R is the proportion of variance of the scores accounted for by differences among the treatment means.

• Eta squared is R², which I think is why it is also termed the ‘correlation ratio’.
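A sketch of this identity with three invented groups. It relies on the point made earlier that regression upon dummy variables predicts the group means, so each score's fitted value is taken to be its own group mean.

```python
from math import sqrt
from statistics import mean

def pearson(a, b):
    """Pearson correlation between two equal-length lists."""
    ma, mb = mean(a), mean(b)
    sab = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    saa = sum((ai - ma) ** 2 for ai in a)
    sbb = sum((bi - mb) ** 2 for bi in b)
    return sab / sqrt(saa * sbb)

# Hypothetical three-group data, invented for illustration only.
groups = [[8, 9, 10], [11, 12, 13], [9, 11, 13]]
scores = [s for g in groups for s in g]
fitted = [mean(g) for g in groups for _ in g]   # group mean for every score

grand = mean(scores)
ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
ss_total = sum((s - grand) ** 2 for s in scores)
eta_squared = ss_between / ss_total

multiple_r = pearson(fitted, scores)            # correlation between Y' and Y

print(f"eta squared = {eta_squared:.3f}; R squared = {multiple_r ** 2:.3f}")
```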


Formula for SSψ

• We can think of a contrast sum of squares as the between treatments variability that is accounted for by a particular contrast.

• The sums of squares for orthogonal contrasts add up to the ANOVA between groups sum of squares.
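The formula itself was an image on the slide and is not reproduced here. The sketch below assumes the usual equal-n form, SSψ = nψ² / Σc², where ψ is the contrast applied to the treatment means; the data, contrast weights and function name are invented for illustration.

```python
from statistics import mean

def contrast_ss(weights, means, n):
    """Contrast sum of squares for equal group sizes: n * psi^2 / sum(c^2)."""
    psi = sum(c * m for c, m in zip(weights, means))
    return n * psi ** 2 / sum(c ** 2 for c in weights)

# Hypothetical three-group data with n = 3 scores per group.
groups = [[8, 9, 10], [11, 12, 13], [9, 11, 13]]
n = 3
means = [mean(g) for g in groups]
grand = mean(s for g in groups for s in g)
ss_between = sum(n * (m - grand) ** 2 for m in means)

# Two orthogonal contrasts: their weights have a zero sum of cross-products.
first = [2, -1, -1]    # first group versus the average of the other two
second = [0, 1, -1]    # second group versus third group
ss_first = contrast_ss(first, means, n)
ss_second = contrast_ss(second, means, n)

print(f"SS for contrasts: {ss_first:.2f} + {ss_second:.2f} = "
      f"{ss_first + ss_second:.2f}")
print(f"SS between groups: {ss_between:.2f}")
```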


The contrast sum of squares revisited


Building bridges

• In these two sessions, in addition to revising (and adding to) some material with which you are already familiar, I have tried to demonstrate some striking equivalences between techniques which many think of as having quite different contexts and purposes.


Assignment

• Please complete the project and hand it in to Anne before noon on Wednesday 31st October.

• I shall return your answers (with comments) by Wednesday 7th November.

