SADC Course in Statistics

Comparing Regressions

(Session 14)

2To put your footer here go to View > Header and Footer

Learning Objectives

At the end of this session, you will be able to

• understand and interpret the components of a linear model with one quantitative variable and one categorical factor

• interpret output from such models

• write regression equations for each level of the categorical variable using the model estimates

Return to the Paddy example

In the paddy example, consider the possible effects of fertiliser and variety together.

The objective is to explore whether fertiliser, variety, or both affect paddy yields.

Note that the two explanatory variables (we will call them factors) being considered here are of different types: one is a quantitative variable, the other is a categorical variable.

Models with each factor in turn

Previously we have fitted each variable one at a time. Thus the model with fertiliser alone is:

$y_i = \beta_0 + \beta_1 (\text{fert})_i + \varepsilon_i$

while the model with variety alone is:

$y_{ij} = \beta'_0 + v_i + \varepsilon_{ij}$

In the models above, $\beta_0$ and $\beta'_0$ represent constants, $\beta_1$ is the slope of the line in the first model, and $v_i$ ($i = 1, 2, 3$) represent the variety effects in the second model.

One model with both factors

We can put the two factors together into a single model as:

$y_{ij} = \beta_0 + \beta_1 (\text{fert})_{ij} + v_i + \varepsilon_{ij}$

This model fits regression lines with a common slope for each variety, i.e. it represents three parallel lines.

The intercepts of the lines are:

$(\beta_0 + v_1)$, $(\beta_0 + v_2)$ and $(\beta_0 + v_3)$.

Anova results (sequential)

Source       d.f.    S.S.     M.S.       F    Prob.
Fertiliser      1   29.94    29.94   130.8    0.000
Variety         2   12.29     6.14    26.9    0.000
Residual       32    7.32   0.2288
Total          35   49.55

The Residual M.S. ($s^2$) = 0.2288. It describes the variation not explained by fertiliser and variety.

How may the above results be interpreted?
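As a starting point for interpretation, the mean squares and F ratios in the sequential anova can be reproduced from the sums of squares and degrees of freedom alone. The sketch below uses only figures from the table; the variable and function names are illustrative, and small differences from the printed values are due to rounding in the table.

```python
# Sums of squares and degrees of freedom copied from the sequential anova table.
seq_anova = {
    "Fertiliser": {"df": 1, "ss": 29.94},
    "Variety": {"df": 2, "ss": 12.29},
    "Residual": {"df": 32, "ss": 7.32},
}


def mean_square(row):
    """Mean square = sum of squares / degrees of freedom."""
    return row["ss"] / row["df"]


# Residual mean square (s^2): the variation left unexplained by the model.
residual_ms = mean_square(seq_anova["Residual"])

# F ratio for each model term = its mean square / residual mean square.
f_ratios = {
    term: mean_square(row) / residual_ms
    for term, row in seq_anova.items()
    if term != "Residual"
}

# In a sequential anova the term S.S. and residual S.S. add up to the total S.S.
total_ss = sum(row["ss"] for row in seq_anova.values())

print("Residual M.S.:", round(residual_ms, 4))
for term, f in f_ratios.items():
    print(term, "F =", round(f, 1))
print("Total S.S.:", round(total_ss, 2))
```

Both F ratios are very large relative to 1, which is consistent with the small probabilities reported in the table.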

Anova results (adjusted)

Source       d.f.   Adj.SS.   Adj.MS.      F    Prob.
Fertiliser      1     6.95      6.95    30.4    0.000
Variety         2    12.29      6.14    26.9    0.000
Residual       32     7.32    0.2288
Total          35    49.55

In the anova above, each term has been adjusted for the other. So the S.S. for fertiliser, variety and residual do not add to the total S.S.

What conclusions may be drawn from above?
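A quick arithmetic check of the point about the sums of squares, using only the figures in the sequential and adjusted anova tables (the variable names are illustrative):

```python
total_ss = 49.55

# Sums of squares copied from the sequential and adjusted anova tables.
sequential_ss = {"Fertiliser": 29.94, "Variety": 12.29, "Residual": 7.32}
adjusted_ss = {"Fertiliser": 6.95, "Variety": 12.29, "Residual": 7.32}

# Sequential S.S. account for the whole of the total S.S. ...
seq_gap = total_ss - sum(sequential_ss.values())

# ... whereas adjusted S.S. leave part of the total unattributed.
adj_gap = total_ss - sum(adjusted_ss.values())

# The shortfall equals the drop in the fertiliser S.S. once it is
# adjusted for variety.
fert_drop = sequential_ss["Fertiliser"] - adjusted_ss["Fertiliser"]

print("Gap, sequential:", round(seq_gap, 2))
print("Gap, adjusted:  ", round(adj_gap, 2))
print("Fertiliser drop:", round(fert_drop, 2))
```

The variety S.S. is the same in both tables because variety was fitted last in the sequential anova, so it was already adjusted for fertiliser. The large drop in the fertiliser S.S. suggests that fertiliser use differed between the varieties in these data, so part of the apparent fertiliser effect cannot be separated from variety differences.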

Model estimates

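Parameter                Coeff.   Std.error       t   t prob
$\beta_0$ : constant      4.776       0.322    14.9    0.000
$\beta_1$ : fertiliser    0.526       0.096    5.51    0.000
g1 (new)                      0           -       -        -
g2 (old)                 -1.207       0.269   -4.49    0.000
g3 (trad)                -2.179       0.304   -7.16    0.000

Here g1, g2 and g3 are the estimates of the variety effects $v_1$, $v_2$ and $v_3$, with the new variety as the base level (g1 = 0).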

What do these results tell us?
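One thing the table tells us is how large each estimate is relative to its standard error. As a minimal sketch, the t values can be recomputed as coefficient divided by standard error; the figures are copied from the table, the names are illustrative, and the results differ slightly from the printed t values because the table entries are rounded.

```python
# (coefficient, standard error) pairs copied from the model estimates table.
estimates = {
    "constant": (4.776, 0.322),
    "fertiliser": (0.526, 0.096),
    "old vs new": (-1.207, 0.269),
    "trad vs new": (-2.179, 0.304),
}


def t_ratio(coeff, std_error):
    """t statistic for testing whether a coefficient is zero."""
    return coeff / std_error


t_values = {name: t_ratio(c, se) for name, (c, se) in estimates.items()}

for name, t in t_values.items():
    print(f"{name:12s} t = {t:6.2f}")

# For a term with 1 d.f., t squared corresponds to its adjusted F ratio.
fert_f = t_values["fertiliser"] ** 2
print(f"fertiliser t^2 = {fert_f:.1f}")
```

The squared t value for fertiliser is close to the F ratio of 30.4 in the adjusted anova, as expected for a term with one degree of freedom.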

Comparing variety means

As before, comparisons with the base level can be made using the model estimates.

Thus: Old - New = -1.207 = Estimate of g2

Trad - New = -2.179 = Estimate of g3

In addition, because the results need to be adjusted for the effect of fertiliser, they should be reported in terms of adjusted means.

These are usually calculated at the overall mean of the fertiliser variable, here 1.444.
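The adjusted mean for a variety is the value of its fitted line at the overall fertiliser mean of 1.444. A minimal sketch of that calculation, using the model estimates (names are illustrative):

```python
# Estimates copied from the model estimates table.
b0 = 4.776          # constant
b1 = 0.526          # common fertiliser slope
variety_effect = {
    "New improved": 0.0,      # base level
    "Old improved": -1.207,
    "Traditional": -2.179,
}
mean_fert = 1.444   # overall mean of the fertiliser variable

# Adjusted mean = predicted yield for each variety at the mean fertiliser level.
adjusted_means = {
    variety: b0 + effect + b1 * mean_fert
    for variety, effect in variety_effect.items()
}

for variety, mean in adjusted_means.items():
    print(f"{variety:13s} {mean:.2f}")
# New improved  5.54
# Old improved  4.33
# Traditional   3.36
```

Because all three lines share the same slope, differences between adjusted means equal the differences between the variety estimates, whatever fertiliser level is chosen.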

Raw means and adjusted means

Variety         Sample size (n)   Raw mean   Std.error (s.d./√n)
New improved                  4       5.96                 0.128
Old improved                 17       4.54                 0.173
Traditional                  15       3.00                 0.168

Variety means adjusted for fertiliser effect:

Variety         Adjusted mean   Std.error
New improved             5.54       0.251
Old improved             4.33       0.122
Traditional              3.36       0.139

Parallel lines for each variety

Equations describing the regression of yield on fertiliser for each variety are:

$y = \beta_0 + \beta_1 (\text{fert}) + v_i$

$y = (\beta_0 + v_i) + \beta_1 (\text{fert})$

Thus for the new improved variety,

$y = (4.776 + 0) + 0.526 (\text{fert})$, i.e. $y = 4.776 + 0.526 (\text{fert})$

Similarly, equations can be found for the remaining two varieties.
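The same substitution can be sketched for all three varieties at once. The estimates are those from the model estimates table; the function and variable names are illustrative.

```python
# Estimates copied from the model estimates table.
B0, B1 = 4.776, 0.526
VARIETY_EFFECT = {
    "New improved": 0.0,      # base level
    "Old improved": -1.207,
    "Traditional": -2.179,
}


def equation_for(variety):
    """Return the fitted line for one variety: intercept is constant + variety effect."""
    intercept = B0 + VARIETY_EFFECT[variety]
    return f"y = {intercept:.3f} + {B1:.3f} (fert)"


equations = {variety: equation_for(variety) for variety in VARIETY_EFFECT}
for variety, eq in equations.items():
    print(f"{variety}: {eq}")
# New improved: y = 4.776 + 0.526 (fert)
# Old improved: y = 3.569 + 0.526 (fert)
# Traditional: y = 2.597 + 0.526 (fert)
```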

Model with different slopes

We can put the two factors together into a single model as:

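$y_{ij} = \beta_0 + \beta_1 (\text{fert})_{ij} + v_i + \delta_i (\text{fert})_{ij} + \varepsilon_{ij}$

This model fits regression lines with different intercepts $(\beta_0 + v_i)$ and different slopes $(\beta_1 + \delta_i)$, where $\delta_i$ is the change in slope for variety $i$.

The separate slopes are:

$(\beta_1 + \delta_1)$, $(\beta_1 + \delta_2)$ and $(\beta_1 + \delta_3)$.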
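To make the structure of the two models concrete, the sketch below builds one row of the design matrix for each. It assumes the new variety is the base level, as in the model estimates, and also assumes the same base-level constraint for the slope adjustments; the function names and the fertiliser value passed in are hypothetical. Counting the columns gives the number of parameters, and hence the residual degrees of freedom for the 36 observations.

```python
VARIETIES = ["new", "old", "trad"]  # "new" is taken as the base level


def parallel_row(fert, variety):
    """Design row for the common-slope model: constant, fert, old dummy, trad dummy."""
    dummies = [1.0 if variety == v else 0.0 for v in VARIETIES[1:]]
    return [1.0, fert] + dummies


def separate_row(fert, variety):
    """Adds fert * dummy columns: one slope adjustment per non-base variety (assumed coding)."""
    base = parallel_row(fert, variety)
    interactions = [fert * d for d in base[2:]]
    return base + interactions


n = 36  # total d.f. of 35 plus one

p_parallel = len(parallel_row(1.0, "new"))   # number of parameters, parallel lines
p_separate = len(separate_row(1.0, "new"))   # number of parameters, separate lines

resid_df_parallel = n - p_parallel
resid_df_separate = n - p_separate

print(separate_row(2.0, "old"))
print(p_parallel, resid_df_parallel)
print(p_separate, resid_df_separate)
```

The parallel-lines model has 4 parameters and 32 residual d.f.; allowing separate slopes adds 2 parameters and leaves 30 residual d.f.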

Anova with different slopes

Source       d.f.   Adj.SS.   Adj.MS.     F    Prob.
Fertiliser      1    0.391     0.391    1.6    0.211
Variety         2    1.610     0.805    3.4    0.048
Fert*Var        2    0.143     0.071    0.3    0.745
Residual       30    7.180     0.239
Total          35    49.55

Fitting separate lines involves fitting an interaction term (the Fert*Var row above).

What are your conclusions?
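One way to reach a conclusion is to compare the residual sums of squares of the parallel-lines and separate-lines models directly. The sketch below uses the residual S.S. and d.f. from the two anova tables; the names are illustrative, and the result matches the Fert*Var row only up to rounding.

```python
# Residual S.S. and d.f. copied from the two anova tables.
rss_parallel, df_parallel = 7.32, 32    # common slope
rss_separate, df_separate = 7.180, 30   # separate slopes

# Reduction in residual S.S. gained by allowing separate slopes.
extra_ss = rss_parallel - rss_separate
extra_df = df_parallel - df_separate

# F ratio: extra mean square over the residual mean square of the larger model.
f_interaction = (extra_ss / extra_df) / (rss_separate / df_separate)

print("Extra S.S.:", round(extra_ss, 3), "on", extra_df, "d.f.")
print("F for interaction:", round(f_interaction, 2))
```

An F ratio well below 1 means the two extra slope parameters explain less variation per degree of freedom than the residual, so there is no evidence that the slopes differ.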

Final model

It is clear from the anova above that the term added to the model to allow for different slopes is non-significant.

Hence return to the parallel lines model, i.e.

$y = 4.776 + 0.526 (\text{fert})$, for the new variety
$y = 3.569 + 0.526 (\text{fert})$, for the old variety
$y = 2.597 + 0.526 (\text{fert})$, for the traditional variety
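As a small illustration of what the final model implies, the sketch below evaluates the three fitted lines at a few fertiliser levels. The levels 0, 1 and 2 are illustrative inputs, not values taken from the data, and the names are hypothetical.

```python
# Intercepts and common slope of the final parallel-lines model.
INTERCEPTS = {"new": 4.776, "old": 3.569, "traditional": 2.597}
SLOPE = 0.526


def predict(variety, fert):
    """Predicted yield for a variety at a given fertiliser level."""
    return INTERCEPTS[variety] + SLOPE * fert


gaps = []
for fert in (0.0, 1.0, 2.0):  # illustrative fertiliser levels
    yields = {v: predict(v, fert) for v in INTERCEPTS}
    gap = yields["new"] - yields["traditional"]
    gaps.append(gap)
    print(fert, {v: round(y, 3) for v, y in yields.items()}, "gap:", round(gap, 3))
```

Because the lines are parallel, the predicted difference between the new and traditional varieties is 2.179 at every fertiliser level, and each extra unit of fertiliser adds 0.526 to the predicted yield of any variety.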

Practical work follows to ensure learning objectives are achieved.
