CE3502. Environmental Measurements, Monitoring & Data Analysis

Date post: 11-Apr-2022
Page 1: CE3502. Environmental Measurements, Monitoring & Data Analysis

3/31/2009

CE3502. Environmental Measurements, Monitoring & Data Analysis

ANOVA: Analysis of Variance

Background

• (boss)A NOVA - style of Brazilian music created by Antônio Carlos Jobim, Vinicius de Moraes and João Gilberto and was first introduced in Brazil in 1958, ...

• BosaNova is an electronic guide to archival holdings at Nova Scotia Archives and Records Management

• The BOSaNOVA 1000 Series models are powered by a 1GHz VIA processor with a 4x AGP video adapter

ANOVA – Analysis of Variance

Motivation: ANOVA

Comparison of 2 sets of measurements, or one set of measurements and a fixed value (e.g., regulatory limit), can be done with t-tests.

In many cases, it is desirable to compare multiple sets of data to assess whether they are all part of the same population or whether they differ (statistically) from one another. For such comparisons, the most common technique is ANOVA.

Examples: ANOVA

1. Multiple groundwater wells have been sampled to determine whether any are contaminated (several replicates at each well);

2. Samples are collected along multiple transects in Lake Superior to determine if horizontal spatial differences exist;

3. Air samples are collected on multiple days under different weather conditions to determine if air quality varies systematically with weather conditions;

4. Multiple soil samples are collected within several areas of a Brownfield site to determine if any areas are contaminated.

Theory: ANOVA

Analysis of Variance (ANOVA) consists of the following steps:

1. Test whether groups of data meet assumptions required to perform ANOVA (lack of heteroscedasticity);

2. Assess variability (“noise”) within individual “groups” of data;

3. Assess variability between “groups” of data;

4. Compare “within-group” and “between-group” variability;

5. If “within-group” and “between-group” variability are of comparable magnitude, the groups are deemed to be “not different” or to belong to the same population.

Preliminary test: Heteroscedasticity

Parametric ANOVA assumes that groups of data exhibit similar “within-group” variance. Heteroscedasticity is the property of exhibiting different variances. Tests of heteroscedasticity include:

1. Box plots
2. Levene’s test
3. Bartlett’s test
4. Probability plot of “residuals”

If the data fail to meet acceptable criteria in these tests, either (1) the data must be transformed (e.g., log-transformation) and retested, or (2) a non-parametric ANOVA must be performed.
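As an illustration of item 2 in the list, the sketch below computes Levene's W statistic by hand for the aerosol data given in Example 1 later in these slides. It is a minimal sketch, not the course's prescribed procedure: it assumes the variant of the test that uses absolute deviations from each group mean (some software defaults to the median instead), and it compares W with the F crit value for df = (2, 9) and α = 0.05 taken from the Excel output in Example 1. The function and variable names are illustrative.

```python
from statistics import mean

# Aerosol concentrations (µg/m3) from Example 1 of these slides.
groups = {
    "Room A": [12, 10, 9, 13],
    "Room B": [13, 17, 20, 14],
    "Room C": [18, 16, 21, 17],
}


def levene_w(samples):
    """Levene's W using absolute deviations from each group mean (assumed variant)."""
    # z[j][i] = |x_ij - mean of group j|
    z = [[abs(x - mean(g)) for x in g] for g in samples]
    z_means = [mean(zj) for zj in z]
    n_total = sum(len(zj) for zj in z)
    k = len(z)
    z_grand = sum(sum(zj) for zj in z) / n_total
    # One-way ANOVA carried out on the absolute deviations.
    between = sum(len(zj) * (zm - z_grand) ** 2 for zj, zm in zip(z, z_means))
    within = sum((v - zm) ** 2 for zj, zm in zip(z, z_means) for v in zj)
    return (n_total - k) / (k - 1) * between / within


w_stat = levene_w(list(groups.values()))
F_CRIT = 4.256492  # F crit for df = (2, 9), alpha = 0.05, copied from the Excel output

print(f"Levene W = {w_stat:.4f}, F crit = {F_CRIT}")
print("no evidence of heteroscedasticity" if w_stat < F_CRIT else "heteroscedastic")
```

Because W is an F-type ratio, it is judged against the same critical value as the ANOVA itself; a W below F crit gives no reason to transform the data.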

ANOVA. Step 1. Test for Heteroscedasticity

Calculate the mean of each group and the residuals, $x_{ij} - \mu_i$.

Homoscedasticity – Probability Plot
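The coordinates of such a probability plot can be sketched as follows, again using the aerosol data from Example 1 of these slides. This is an assumption-laden sketch: the plotting position (i − 0.5)/n is one common convention among several, and the names are illustrative. Plotting the sorted residuals against the normal quantiles should give a roughly straight line if the residuals behave like a single normal population.

```python
from statistics import NormalDist, mean

# Aerosol concentrations (µg/m3) from Example 1 of these slides.
groups = {
    "Room A": [12, 10, 9, 13],
    "Room B": [13, 17, 20, 14],
    "Room C": [18, 16, 21, 17],
}

# Residual = measurement minus the mean of its own group.
residuals = sorted(x - mean(g) for g in groups.values() for x in g)

n = len(residuals)
# Plotting positions (i - 0.5) / n: an assumed convention, others exist.
probs = [(i - 0.5) / n for i in range(1, n + 1)]
# Standard normal quantile for each plotting position.
z_scores = [NormalDist().inv_cdf(p) for p in probs]

print("normal quantile  residual")
for z, r in zip(z_scores, residuals):
    print(f"{z:15.3f}  {r:8.2f}")
```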

ANOVA: Step 2

Consider a set of measurements comprising $k$ groups, each with $n_j$ measurements ($x_{i,j}$).

The variance for each group is defined as usual:

$$ s_j^2 = \frac{\sum_{i=1}^{n_j} \left( x_{i,j} - \bar{x}_j \right)^2}{n_j - 1} $$

where $\bar{x}_j$ is the group average.

From these variances, a pooled “within-group” variance may be obtained:

$$ s_w^2 = \frac{\sum_{j=1}^{k} (n_j - 1)\, s_j^2}{\sum_{j=1}^{k} (n_j - 1)} $$
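As a numerical check of the pooled within-group variance, the sketch below applies the definition to the aerosol data from Example 1 of these slides. The variable names are illustrative; `statistics.variance` uses the $n_j - 1$ denominator, matching the group variance defined here.

```python
from statistics import variance

# Aerosol concentrations (µg/m3) from Example 1 of these slides.
groups = {
    "Room A": [12, 10, 9, 13],
    "Room B": [13, 17, 20, 14],
    "Room C": [18, 16, 21, 17],
}

# Sample variance of each group (denominator n_j - 1).
group_var = {name: variance(values) for name, values in groups.items()}
# Degrees of freedom of each group, n_j - 1.
dof = {name: len(values) - 1 for name, values in groups.items()}

# Pooled within-group variance: variances weighted by their degrees of freedom.
s_w2 = sum(dof[name] * group_var[name] for name in groups) / sum(dof.values())

for name in groups:
    print(f"{name}: s^2 = {group_var[name]:.6f}")
print(f"pooled within-group variance s_w^2 = {s_w2:.4f}")
```

The group variances and the pooled value can be compared with the Variance column and the Within Groups MS entry of the Excel output in Example 1.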

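ANOVA: Step 2

The “between-group” variance is calculated as:

$$ s_b^2 = \frac{\sum_{j=1}^{k} n_j \left( \bar{y}_j - \bar{y} \right)^2}{k - 1} $$

where $\bar{y}_j$ is the average of group $j$ and $\bar{y}$ is the average of all samples from all groups.

If the groups do not differ from one another, $s_b^2$ should be approximately equal to $s_w^2$.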
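The between-group variance can be checked the same way. The sketch below, with illustrative names, evaluates it for the aerosol data from Example 1 of these slides, weighting each squared deviation of a group average from the overall average by the group size and dividing by k − 1.

```python
from statistics import mean

# Aerosol concentrations (µg/m3) from Example 1 of these slides.
groups = {
    "Room A": [12, 10, 9, 13],
    "Room B": [13, 17, 20, 14],
    "Room C": [18, 16, 21, 17],
}

k = len(groups)
# Average of all samples from all groups.
grand_mean = mean([x for values in groups.values() for x in values])

# Between-group variance: group-size-weighted squared deviations over k - 1.
s_b2 = sum(
    len(values) * (mean(values) - grand_mean) ** 2 for values in groups.values()
) / (k - 1)

print(f"grand mean = {grand_mean}, between-group variance s_b^2 = {s_b2:.4f}")
```

The result can be compared with the Between Groups MS entry of the Excel output in Example 1.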

ANOVA: Step 3

Comparison of variances is performed with the F-test:

$$ F = \frac{s_b^2}{s_w^2} $$

The Fcrit value is read from Table A.7 as $F_{df_1,\, df_2,\, \alpha}$, with $df_1 = k - 1$ and $df_2 = \sum_{j=1}^{k} (n_j - 1)$.

If F is less than Fcrit, then the two variances are statistically indistinguishable and the groups are not statistically different from one another.
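The decision rule can be written as a few lines of code. The sketch below is illustrative only: the two variances (52 and 6) and the critical value (4.256492) are copied from the Excel output in Example 1 of these slides rather than computed here, and the function name is hypothetical.

```python
def anova_f_decision(s_b2, s_w2, f_crit):
    """Return the F ratio and whether it reaches the critical value."""
    f_ratio = s_b2 / s_w2
    return f_ratio, f_ratio >= f_crit


# Example 1: k = 3 rooms, 12 measurements in total.
k, n_total = 3, 12
df1 = k - 1        # degrees of freedom between groups
df2 = n_total - k  # degrees of freedom within groups

f_ratio, groups_differ = anova_f_decision(s_b2=52, s_w2=6, f_crit=4.256492)

print(f"df1 = {df1}, df2 = {df2}, F = {f_ratio:.4f}")
print("groups differ" if groups_differ else "groups not statistically different")
```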

Trouble spots

• If n1 or n2 are not in the F table, either go to the Web and use an Fcrit calculator or interpolate;

• How does one remember which is on top, sb or sw?
– F ratio is always > 1; which is bigger, sb or sw?
– “b” comes before “w” and goes on top;
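When the required entry is missing from the table, the critical value can also be computed. The sketch below covers only the special case of three groups (df1 = 2), where the F distribution has a simple closed form; it is not a general replacement for the table. The function names are illustrative. For other degrees of freedom a statistics library is needed (for instance, if SciPy is available, `scipy.stats.f.ppf` and `scipy.stats.f.sf`).

```python
def f_crit_df1_2(df2, alpha):
    """Critical F value for df1 = 2 only (closed form of the inverse CDF)."""
    return (df2 / 2) * (alpha ** (-2 / df2) - 1)


def p_value_df1_2(f_ratio, df2):
    """Upper-tail probability P(F > f_ratio) for df1 = 2 only."""
    return (1 + 2 * f_ratio / df2) ** (-df2 / 2)


# Example 1 of these slides: 3 rooms, 12 measurements, so df1 = 2 and df2 = 9.
f_crit = f_crit_df1_2(df2=9, alpha=0.05)
p_value = p_value_df1_2(f_ratio=52 / 6, df2=9)

# Compare with F crit = 4.256492 and P-value = 0.007977 in the Excel output.
print(f"F crit = {f_crit:.4f}, P-value = {p_value:.6f}")
```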

ANOVA in Practice:

The procedure just outlined is correct, but it is not the one most often followed. The procedure outlined in Navidi (2006) is more amenable to automation and is the one most commonly employed.

Within Excel (as well as within many other readily available software packages), an ANOVA routine is available. It is well worth your while to learn this procedure and how to interpret the output. The procedure is available under the pulldown menu Tools, Data analysis, Anova: Single factor.

Example 1: ANOVA

Fine aerosols were measured in 3 rooms with the following results (conc. in µg/m3).

Room A   Room B   Room C
12       13       18
10       17       16
9        20       21
13       14       17

Assess whether any room differs significantly from the others.

Example 1: cont’d

In Excel we choose Anova: Single Factor, highlight the area with the input data, select an area for the output, and click on OK. The output looks like:

Anova: Single Factor

SUMMARY
Groups   Count   Sum   Average   Variance
Room A   4       44    11        3.333333
Room B   4       64    16        10
Room C   4       72    18        4.666667

ANOVA
Source of Variation   SS    df   MS   F          P-value    F crit
Between Groups        104   2    52   8.666667   0.007977   4.256492
Within Groups         54    9    6
Total                 158   11

Because F > Fcrit, we can conclude with greater than 99% certainty that these 3 rooms do NOT all have similar aerosol concentrations.
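The SS, df, MS, and F columns of this output can be reproduced from the raw data. The sketch below is a minimal sum-of-squares implementation written for illustration; it is not Excel's internal routine, and the function and key names are hypothetical. The P-value and F crit columns require the F distribution and are not computed here.

```python
from statistics import mean


def one_way_anova(samples):
    """Single-factor ANOVA table entries computed from sums of squares."""
    all_values = [x for g in samples for x in g]
    grand_mean = mean(all_values)
    # Between groups: group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in samples)
    # Within groups: each value around its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in samples for x in g)
    df_between = len(samples) - 1
    df_within = len(all_values) - len(samples)
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ss_between": ss_between, "df_between": df_between, "ms_between": ms_between,
        "ss_within": ss_within, "df_within": df_within, "ms_within": ms_within,
        "ss_total": ss_between + ss_within, "f": ms_between / ms_within,
    }


rooms = [[12, 10, 9, 13], [13, 17, 20, 14], [18, 16, 21, 17]]  # Rooms A, B, C
table = one_way_anova(rooms)
for key, value in table.items():
    print(f"{key:>11}: {value:.4f}")
```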

Beyond ANOVA

ANOVA tells us only if the groups are similar. If they are not all similar, ANOVA does not identify which of the groups is dissimilar from the others.

How can we determine which groups are similar and which are dissimilar?

• t-tests (available in Excel)
• Bonferroni t-tests (available in statistics software)

Example 1: Which are different?

t-Test: Two-Sample Assuming Equal V

                      Room A     Room B
Mean                  11         16
Variance              3.333333   10
Observations          4          4
Pooled Variance       6.666667
Hypothesized Mea      0
df                    6
t Stat                -2.73861
P(T<=t) one-tail      0.016899
t Critical one-tail   1.943181
P(T<=t) two-tail      0.033798
t Critical two-tail   2.446914

                      Room A     Room C
Mean                  11         18
Variance              3.333333   4.666667
Observations          4          4
Pooled Variance       4
Hypothesized Mea      0
df                    6
t Stat                -4.94975
P(T<=t) one-tail      0.001289
t Critical one-tail   1.943181
P(T<=t) two-tail      0.002579
t Critical two-tail   2.446914

Example 1: concluded

                      Room B     Room C
Mean                  16         18
Variance              10         4.666667
Observations          4          4
Pooled Variance       7.333333
Hypothesized Mea      0
df                    6
t Stat                -1.04447
P(T<=t) one-tail      0.168256
t Critical one-tail   1.943181
P(T<=t) two-tail      0.336512
t Critical two-tail   2.446914

                      Room C     Room B
Mean                  18         16
Variance              4.666667   10
Observations          4          4
Pooled Variance       7.333333
Hypothesized Mea      0
df                    6
t Stat                1.044466
P(T<=t) one-tail      0.168256
t Critical one-tail   1.943181
P(T<=t) two-tail      0.336512
t Critical two-tail   2.446914
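The t statistics in these tables can be reproduced with a pooled-variance two-sample t-test, and the Bonferroni idea can be illustrated by dividing the significance level by the number of comparisons. The sketch below makes several assumptions that are not stated in the slides: α = 0.05 (consistent with the tabulated critical values), three pairwise comparisons, and two-tail P-values copied from the Excel output rather than computed. The names are illustrative.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, variance

groups = {
    "Room A": [12, 10, 9, 13],
    "Room B": [13, 17, 20, 14],
    "Room C": [18, 16, 21, 17],
}


def pooled_t(a, b):
    """Two-sample t statistic assuming equal variances; returns (t, pooled variance, df)."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    return (mean(a) - mean(b)) / sqrt(sp2 * (1 / na + 1 / nb)), sp2, df


T_CRIT_TWO_TAIL = 2.446914  # df = 6, copied from the Excel output
P_TWO_TAIL = {              # two-tail P-values copied from the Excel output
    ("Room A", "Room B"): 0.033798,
    ("Room A", "Room C"): 0.002579,
    ("Room B", "Room C"): 0.336512,
}
ALPHA = 0.05                                 # assumed significance level
bonferroni_alpha = ALPHA / len(P_TWO_TAIL)   # alpha shared among 3 comparisons

t_stats = {}
for pair in combinations(groups, 2):
    t, sp2, df = pooled_t(groups[pair[0]], groups[pair[1]])
    t_stats[pair] = t
    plain = abs(t) > T_CRIT_TWO_TAIL
    adjusted = P_TWO_TAIL[pair] < bonferroni_alpha
    print(f"{pair[0]} vs {pair[1]}: t = {t:.5f}, pooled var = {sp2:.4f}, "
          f"different (plain t-test): {plain}, different (Bonferroni): {adjusted}")
```

Under these assumptions, Room A versus Room B is flagged by the plain t-test but not after the Bonferroni adjustment, which shows why the choice of follow-up test matters.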

Beyond 1-way ANOVA

• If heteroscedasticity tests fail, either transform data or use non-parametric test;

• If there are “interactions” between groups (e.g., season and % sunny days), use 2-way ANOVA;

• To identify which groups are similar/dissimilar, use t-tests, Bonferroni’s t-tests, or a host of other available tests.

Example 2. Indoor CO2

[Figure: CO2 Conc. (ppm), axis 0 to 1800, versus Time (15:30:14 to 15:36:00) for four locations: Dow 826, Atrium, Dow 852, Dow 8th floor office]

Anova: Single Factor

SUMMARY
Groups   Count   Sum     Average   Variance
826      60      76409   1273      3060
Atrium   60      68886   1148      86
852      60      84696   1412      9632
Office   60      74034   1234      206

ANOVA
Source of Variation   SS        df    MS       F       P-value    F crit
Between Groups        2171028   3     723676   222.9   1.43E-68   2.6
Within Groups         766074    236   3246
Total                 2937102   239
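The raw CO2 readings are not reproduced in these slides, but the ANOVA rows can be approximately recovered from the SUMMARY block alone. The sketch below uses only the Count, Sum, and Variance columns; because the tabulated variances are rounded to whole numbers, the within-group sum of squares differs slightly from the Excel value. The names are illustrative.

```python
# (count, sum, variance) per location, copied from the SUMMARY block.
summary = {
    "826":    (60, 76409, 3060),
    "Atrium": (60, 68886, 86),
    "852":    (60, 84696, 9632),
    "Office": (60, 74034, 206),
}

k = len(summary)
n_total = sum(n for n, _, _ in summary.values())
grand_sum = sum(s for _, s, _ in summary.values())

# Between groups: sum of (group sum)^2 / n_j minus (grand sum)^2 / N.
ss_between = sum(s * s / n for n, s, _ in summary.values()) - grand_sum ** 2 / n_total
# Within groups: each group variance times its degrees of freedom.
ss_within = sum((n - 1) * v for n, _, v in summary.values())

ms_between = ss_between / (k - 1)
ms_within = ss_within / (n_total - k)
f_ratio = ms_between / ms_within

print(f"SS between = {ss_between:.0f}, SS within = {ss_within:.0f}")
print(f"MS between = {ms_between:.0f}, MS within = {ms_within:.0f}, F = {f_ratio:.1f}")
```

With F far above the tabulated F crit of 2.6, the four locations clearly do not share a common CO2 level.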
