Chapter 10/11

Post on 20-Jan-2016


$Z = \dfrac{\bar{X} - \mu}{\sigma / \sqrt{n}}$

Chapter 11

• Type 1 error: reject a true null hypothesis (send an innocent man to jail).

• Type 2 error: don’t reject a false null hypothesis (a guilty man goes free).

[Figure: sampling distribution under our original hypothesis… and under our new assumption…]

Chapter 11

• The p-value of a test is the probability of observing a test statistic at least as extreme as the one computed, given that the null hypothesis is true.

[Figure: standard normal curve; for Z = 2.46 the shaded upper-tail area is the p-value = .0069]
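As a rough check of the number on this slide, the upper-tail p-value for Z = 2.46 can be computed with nothing but the standard library. This is a minimal sketch; the helper name `upper_tail_p_value` is made up for illustration and is not from the course material.

```python
import math

def upper_tail_p_value(z: float) -> float:
    """P(Z > z) for a standard normal Z, using the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))

# z statistic taken from the slide
p_value = upper_tail_p_value(2.46)
print(round(p_value, 4))
```

For a two-tail test the same area would be doubled.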

Chapter 11 – Type II Error
• Recall Example 11.1…
• H0: µ = 170
• H1: µ > 170

• At a significance level of 5% we rejected H0 in favor of H1 since our sample mean (178) was greater than the critical value (175.34).

• To compute β, the question has to give you the new mean to test (here, a mean of $180).

• β = P(x̄ < 175.34, given that µ = 180), thus…

[Figure: Example 11.1 (revisited). Sampling distribution under our original hypothesis (µ = 170) and under our new assumption (µ = 180); the area of the new curve below 175.34 is β, the chance we set a guilty man free]
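The β calculation can be sketched as below. The critical value (175.34) and the new mean (180) come from the slide; the population standard deviation (65) and sample size (400) are NOT stated in these notes and are assumed here only because they reproduce the 175.34 critical value at the 5% level. Treat the result as illustrative.

```python
import math

def normal_cdf(z: float) -> float:
    """P(Z < z) for a standard normal Z."""
    return 0.5 * math.erfc(-z / math.sqrt(2))

critical_value = 175.34  # from the slide
new_mean = 180           # from the slide
sigma = 65               # ASSUMPTION: not given in these notes
n = 400                  # ASSUMPTION: not given in these notes

standard_error = sigma / math.sqrt(n)
# beta = P(x-bar < critical value, given mu = new mean)
beta = normal_cdf((critical_value - new_mean) / standard_error)
print(round(beta, 4))
```

A larger sample size shrinks the standard error and therefore shrinks β.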

Chapter 12: Inference about a pop

Student-t distribution:

1. for a large population (defined as greater than 20 times the sample size!)

2. Interval data

3. Used when the population standard deviation is unknown

• T-test Mean: tests a sample of results against a hypothesized mean value that you need to provide (µ = 450 boxes an hour).

• It gives you the ‘t’ stat, to see if it is greater than the critical t value (e.g. 1.656), which is defined by the significance level and degrees of freedom.

Chapter 12: Inference about a pop

t-Test: Mean

                       Packages
Mean                   460.38
Standard Deviation     38.83
Hypothesized Mean      450
df                     49
t Stat                 1.89
P(T<=t) one-tail       0.0323
t Critical one-tail    1.6766
P(T<=t) two-tail       0.0646
t Critical two-tail    2.0096

t Stat 1.89 > t Critical one-tail 1.6766, so reject H0.
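The t Stat in this printout can be reproduced from the summary numbers. A minimal sketch: the sample size of 50 is inferred from df = 49, and the function name is hypothetical.

```python
import math

def one_sample_t(sample_mean: float, sample_sd: float, n: int, hypothesized_mean: float) -> float:
    """t statistic for a test of a single mean with unknown population sd."""
    return (sample_mean - hypothesized_mean) / (sample_sd / math.sqrt(n))

# Values from the printout; n = df + 1 = 50
t_stat = one_sample_t(460.38, 38.83, 50, 450)
t_critical_one_tail = 1.6766  # read off the printout, not computed here
reject_h0 = t_stat > t_critical_one_tail
print(round(t_stat, 2), reject_h0)
```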

• Student t-distribution is ROBUST – if the population is nonnormal, the results of the t-test and confidence interval estimate are still valid unless it is ‘extremely nonnormal’.

Chapter 12: Inference about a pop


Estimator of µ (using the t-estimate, adjusted for a finite population)

Estimating Totals for Finite Populations: take the t-estimate and multiply the limits by the population size to get the limits of the total for the population.

(t-estimate of the value of all purchases in store)

t-Estimate: Mean

                      Sale-CDs
Mean                  59.04
Standard Deviation    20.63
LCL                   55.35
UCL                   62.74

Estimate of the total amount:

$N\left[\bar{x} \pm t_{\alpha/2}\,\dfrac{s}{\sqrt{n}}\right] = 2800\,(55.35 \ldots 62.74)$

LCL = $154,980

UCL = $175,672
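The total-amount limits are just the mean limits scaled by the population size. A small sketch using the numbers from this slide (N = 2800, LCL = 55.35, UCL = 62.74); variable names are illustrative.

```python
population_size = 2800               # N, from the slide
lcl_mean, ucl_mean = 55.35, 62.74    # limits of the mean, from the t-estimate printout

# Multiply each limit of the mean by N to get the limits of the total
lcl_total = population_size * lcl_mean
ucl_total = population_size * ucl_mean
print(round(lcl_total), round(ucl_total))
```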

• Inference: Population Proportion (z-test: proportion)

• Counting the number of occurrences of each value (voting poll, to see the probability of x winning, using the sample proportion of votes).
– NOMINAL data (voting, fruit, colours)
– Describing a Population
– sampling distribution is approx normal with mean of ‘p’ if n*p and n(1-p) are BOTH > 5.

Chapter 12: Inference about a pop


z-Test: Proportion

                           Votes
Sample Proportion          0.532
Observations               765
Hypothesized Proportion    0.5
z Stat                     1.77
P(Z<=z) one-tail           0.0382
z Critical one-tail        1.6449
P(Z<=z) two-tail           0.0764
z Critical two-tail        1.96

Q1. Did their total vote tallies differ? NO. P-value: .0764 (two-tail)
Q2. Did the Conservatives get more than the Liberals? YES. P-value: .0382 (one-tail)
Q3. Did the Liberals get more than the Conservatives? NO. P-value: .9618 (1 – .0382)

In the Question screen, the test was set up on the Conservative votes.
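To see where the three p-values come from, here is a hedged sketch that recomputes the z Stat from the printout values. Because the sample proportion on the printout is rounded to 0.532, the p-values land within a few ten-thousandths of the printed ones rather than matching exactly. Function names are illustrative.

```python
import math

def normal_cdf(z: float) -> float:
    """P(Z < z) for a standard normal Z."""
    return 0.5 * math.erfc(-z / math.sqrt(2))

def z_test_proportion(p_hat: float, n: int, p0: float) -> float:
    """z statistic for a test of a single proportion against p0."""
    return (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

z = z_test_proportion(0.532, 765, 0.5)
p_greater = 1 - normal_cdf(z)    # H1: p > 0.5 (Conservatives got more)
p_less = normal_cdf(z)           # H1: p < 0.5 (Liberals got more)
p_different = 2 * min(p_greater, p_less)  # H1: p != 0.5 (tallies differ)
print(round(z, 2))
```

The one-tail p-value in the "wrong" direction is always 1 minus the one-tail value on the printout, and the two-tail value is double it.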

Population Estimator (estimator of p)
• just like the t-estimate, once you get the LCL and UCL you can multiply them by the population size to see how many watched a certain TV show (in this instance, how many watched Friends at 8pm)

Chapter 12: Inference about a pop

z-Estimate: Proportion

                     Shows
Sample Proportion    0.113
Observations         2000
LCL                  0.0991
UCL                  0.1269

Multiply 100,000 viewers by .0991 and .1269 to get the lower and upper limits.
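A sketch of how the LCL and UCL on this printout arise, and how they scale to viewer counts. The 95% confidence level (z = 1.96) is an assumption: it is not stated on this slide, but it reproduces the printed limits.

```python
import math

def proportion_interval(p_hat: float, n: int, z: float = 1.96):
    """Confidence interval for a proportion; z = 1.96 assumes 95% confidence."""
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width

lcl, ucl = proportion_interval(0.113, 2000)

viewers = 100_000  # population size used on the slide
lower_viewers = viewers * lcl
upper_viewers = viewers * ucl
print(round(lcl, 4), round(ucl, 4))
```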

Chapter 13 (inference about comparing two population)

2 populations, can get difference between

1) means

2) ratio of 2 variances

3) proportions

1. Difference between Means (t-test: two samples assuming unequal/equal variances)

• These are independent samples (if they were related, you would use a matched pairs technique!)

• use t-test

– SO 2 cases:
• Equal variances
• Unequal variances

– TO FIND OUT IF EQUAL/UNEQUAL:
1. The question will tell you
2. Look at the F-Estimator of the ratio of 2 variances
3. Look at the equal/unequal chart given to see if the variances differ by more than a 2:1 ratio – if so = UNEQUAL variances!

• Compare the t-stat and the t critical value (for 1 or 2 tail test) on the chart. If the t-stat falls outside the critical value, then reject the null.

Chapter 13 (inference about comparing two population)


t-Test: Two-Sample Assuming Unequal Variances

                                Consumers     Nonconsumers
Mean                            604.02        633.23
Variance                        4102.97564    10669.76565
Observations                    43            107
Hypothesized Mean Difference    0
df                              123
t Stat                          -2.09
P(T<=t) one-tail                0.0193
t Critical one-tail             1.6573
P(T<=t) two-tail                0.0386
t Critical two-tail             1.9794

t-test (equal/unequal variances): t-test of µ1 − µ2
Q. Calories eaten at lunch by 2 separate populations – do they differ?
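The t Stat and df for the unequal-variances case can be reproduced from the summary rows of the printout. This is a sketch of the standard unequal-variances formulas; the function name is made up for illustration.

```python
import math

def unequal_variances_t(mean1, var1, n1, mean2, var2, n2, hypothesized_diff=0.0):
    """Return (t statistic, approximate degrees of freedom) for unequal variances."""
    v1 = var1 / n1
    v2 = var2 / n2
    t_stat = (mean1 - mean2 - hypothesized_diff) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t_stat, df

# Consumers vs. nonconsumers, values from the printout
t_stat, df = unequal_variances_t(604.02, 4102.97564, 43, 633.23, 10669.76565, 107)
print(round(t_stat, 2), round(df))
```

Since |t| = 2.09 exceeds the two-tail critical value 1.9794 on the printout, the two-tail test rejects the null.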

Chapter 13 (inference about comparing two population)

• t-estimator – 2 means (unequal/equal variances)

• same as before: take the upper/lower limits of the difference in means and multiply by the population.

t-Estimate of the Difference Between Two Means (Unequal-Variances)

                      Sample 1   Sample 2    Confidence Interval Estimate
Mean                  604.02     633.2       -29.21 ± 27.65
Variance              4102       10671       Lower confidence limit   -56.86
Sample size           43         107         Upper confidence limit   -1.56
Degrees of freedom    122.62
Confidence level      0.95

NOTE: If the populations are very nonnormal and have equal variances, then you can use the Wilcoxon rank sum test. If the variances are unequal – there is no test!

Chapter 13 (inference about comparing two population)

t-Test: Two-Sample Assuming Equal Variances

                                Method A   Method B
Mean                            6.47       6.17
Variance                        1.30       1.36
Observations                    25         25
Pooled Variance                 1.33
Hypothesized Mean Difference    0
df                              48
t Stat                          0.9198
P(T<=t) one-tail                0.1811
t Critical one-tail             1.6772
P(T<=t) two-tail                0.3623
t Critical two-tail             2.0106

Equal Variances Chart (times to assemble)

Chapter 13 (inference about comparing two population)

Matched Pairs Experiment (t-test and estimator of µD)
• if you can find a way to pair the observations of the two samples, then you can use this method. Just because they have the same number of observations doesn’t mean they are matched. Even if they are ordered, they NEED to be matched on another variable (GPA buckets etc).

Chapter 13 (inference about comparing two population)

t-Test: Paired Two Sample for Means

                                Finance        Marketing
Mean                            65,438         60,374
Variance                        444,981,810    469,441,785
Observations                    25             25
Pearson Correlation             0.9520
Hypothesized Mean Difference    0
df                              24
t Stat                          3.81
P(T<=t) one-tail                0.0004
t Critical one-tail             1.7109
P(T<=t) two-tail                0.0009
t Critical two-tail             2.0639

Matched Pairs Experiment (t-test and estimator of µD)

Q. Do finance majors make more than marketing majors? You could take 25 random people from each major and then do a t-test of equal/unequal variances. BUT here they took 25 buckets of GPAs and took 1 random person per major from each GPA range = matched pairs.

Chapter 13 (inference about comparing two population)

2) Inference about the ratio of two variances: (F-test: two sample for variances)

• This is to see if the variances of 2 populations are different. The null is always σ1²/σ2² = 1.

• If you want to show they are different, then you want a two-tail ‘does not equal’ test.
– PRINTOUT will give you the one-tail p-value (you will need to double it to get the two-tail p-value)

F-Test Two-Sample for Variances

                       Consumers   Nonconsumers
Mean                   604.02      633.23
Variance               4103        10670
Observations           43          107
df                     42          106
F                      0.38
P(F<=f) one-tail       0.0004
F Critical one-tail    0.6371

Chapter 13 (inference about comparing two population)

• F-Estimator of ratio of 2 variances – same as usual

F-Estimate of the Ratio of Two Variances

                    Sample 1   Sample 2    Confidence Interval Estimate
Sample variance     4103       10670       Lower confidence limit   0.2374
Sample size         43         107         Upper confidence limit   0.6594
Confidence level    0.95

That is, we estimate that σ1²/σ2² lies between .2374 and .6594.
Note that one (1.00) is not within this interval…

Chapter 13 (inference about comparing two population)

• 3) Inference about the difference between population proportions (with nominal data) – z-test of p1 – p2

• Using nominal data, so win/lose categories – to give you proportions.

• same restriction that n*p and n*(1-p) > 5 (but now for both populations)

• depending on the null hypothesis, there are 2 different formulas (one for a difference = 0 and one for a difference = D (not 0)) – look at the Hypothesized Difference line in the table.

Chapter 13 (inference about comparing two population)

Z-test of p1 – p2 type 1

- eg: testing for the proportion of a certain product being sold in 2 different stores – with a difference of 0 (so seeing if supermarket 1 sold more than supermarket 2)

z-Test: Two Proportions

                           Supermarket 1   Supermarket 2
Sample Proportions         0.1991          0.1493
Observations               904             1038
Hypothesized Difference    0
z Stat                     2.90
P(Z<=z) one tail           0.0019
z Critical one-tail        1.6449
P(Z<=z) two-tail           0.0038
z Critical two-tail        1.96

Chapter 13 (inference about comparing two population)

Z-test of p1 – p2 type 2

- eg: testing for the proportion of a certain product being sold in 2 different stores – with a difference of 3% (so seeing if supermarket 1 sold 3% more than supermarket 2)

z-Test: Two Proportions

                           Supermarket 1   Supermarket 2
Sample Proportions         0.1991          0.1493
Observations               904             1038
Hypothesized Difference    0.03
z Stat                     1.14
P(Z<=z) one tail           0.1261
z Critical one-tail        1.6449
P(Z<=z) two-tail           0.2522
z Critical two-tail        1.96
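The two cases differ only in the standard error: with a hypothesized difference of 0 the two samples are pooled, otherwise each sample proportion is used on its own. A minimal sketch under that standard textbook convention; the function name is illustrative, and because the printed proportions are rounded, the second z Stat comes out within about 0.01 of the printout rather than matching exactly.

```python
import math

def z_two_proportions(p1, n1, p2, n2, hypothesized_diff=0.0):
    """z statistic for p1 - p2; pools the samples only when the hypothesized difference is 0."""
    if hypothesized_diff == 0:
        pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
        std_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    else:
        std_error = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2 - hypothesized_diff) / std_error

# Supermarket values from the two printouts
z_case1 = z_two_proportions(0.1991, 904, 0.1493, 1038)                          # difference of 0
z_case2 = z_two_proportions(0.1991, 904, 0.1493, 1038, hypothesized_diff=0.03)  # difference of 3%
print(round(z_case1, 2))
```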

Chapter 15 : Analysis of Variance

• comparing 2 or more populations of INTERVAL data
• determine whether differences exist between population means
– done by analyzing sample variance.
• ANOVA uses the errors within groups and between groups, to essentially determine if the means differ.

3 TYPES OF ANOVA:
• Single Factor: For populations which have only 1 factor that you are comparing them against, you use the ANOVA: Single Factor. This is like comparing sales from 3 cities with the factor being the marketing strategy.

• Two Factor: When you have 2 or more independent samples with 2 factors (comparing 3 cities based on marketing strategy and marketing medium) – NOT ON EXAM

• Randomized Block (two-way Anova): when you have 1 factor, but the samples in each treatment are grouped according to some variable (like age/weight, or GPA).

Chapter 15 : Analysis of Variance

1. SINGLE FACTOR
• Condition:
• F = MST/MSE; compare this value with F crit on the chart. If the F value is greater than F-crit then the means differ.

• REQUIRED CONDITIONS: the random variables must be normally distributed with equal variances. Check with histograms. If NOT normal, then you can use the Kruskal-Wallis Test.

• If pop variances are unequal – we have NO test!!

Chapter 15 : Analysis of Variance
Single Factor:
- Comparing 3 independent populations, with the factor being marketing strategy.

Q. Is there enough evidence to support that the sales of this product differ?

Anova: Single Factor

SUMMARY
Groups         Count   Sum      Average   Variance
Convenience    20      11551    577.6     10775.0
Quality        20      13060    653.0     7238.1
Price          20      12173    608.7     8670.2

ANOVA
Source of Variation   SS          df   MS        F      P-value   F crit
Between Groups        57512.23    2    28756.1   3.23   0.0468    3.16
Within Groups         506983.5    57   8894.4
Total                 564495.7    59

All this says is that at least 2 of the means differ!
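The MS and F columns of the ANOVA table follow directly from the SS and df columns. A short sketch using the printout values (k = 3 groups, 60 observations in total); the function name is illustrative.

```python
def single_factor_f(ss_between: float, ss_within: float, k: int, n_total: int):
    """Return (MST, MSE, F) for a one-way ANOVA from the sums of squares."""
    mst = ss_between / (k - 1)       # mean square for treatments
    mse = ss_within / (n_total - k)  # mean square for error
    return mst, mse, mst / mse

mst, mse, f_stat = single_factor_f(57512.23, 506983.5, k=3, n_total=60)
f_critical = 3.16  # read off the printout, not computed here
means_differ = f_stat > f_critical
print(round(f_stat, 2), means_differ)
```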

Chapter 15 : Analysis of Variance
Must be normal and equal variances:
- If nonnormal, replace the test with the Kruskal-Wallis test (making the numbers ordinal).
- If unequal variances – we CANNOT DO!

Chapter 15 : Analysis of Variance

RANDOMIZED BLOCK (Two-Factor Without Replication)

• Here you are essentially only comparing across 1 factor, but splitting the samples into blocks.

• SO:
• F = MST/MSE still holds if you want to compare treatments (standard case). But if you want to compare between the BLOCKS, then:
• F = MSB/MSE (the ROWS line).
• In the chart, you will get the variation between the rows (between the buckets, which we usually aren’t looking for, because we know they will be different, but it is still testable) and the variation between columns (what we are looking for: the difference between treatments). Compare the F value and the F-crit.

• Requirement: Must be normal, with equal variances. If nonnormal – use the Friedman Test

Chapter 15 : Analysis of Variance

Anova: Two-Factor Without Replication

SUMMARY   Count   Sum      Average   Variance
1         4       30.60    7.65      17.07
2         4       22.30    5.58      10.20
…
22        4       112.10   28.03     5.00
23        4       89.40    22.35     13.69
24        4       93.30    23.33     7.11
25        4       113.10   28.28     4.69

Drug 1    25      438.70   17.55     32.70
Drug 2    25      452.40   18.10     73.24
Drug 3    25      386.20   15.45     65.72
Drug 4    25      483.00   19.32     36.31

ANOVA
Source of Variation   SS       df   MS       F       P-value   F crit
Rows                  3848.7   24   160.36   10.11   0.0000    1.67
Columns               196.0    3    65.32    4.12    0.0094    2.73
Error                 1142.6   72   15.87
Total                 5187.2   99

Q: testing 4 drugs on 25 blocks of men, each matched with others in the same age/weight category. Their cholesterol difference was measured (interval). Are any drugs more successful than others?
Q2. Does the success of the drug differ between age/weight categories?

Summary of ANOVA

one-way analysis of variance

two-factor analysis of variance

two-way analysis of variance, a.k.a. randomized blocks

Ch 15 : Multiple Comparisons

When we conclude from the one-way analysis of variance that at least two treatment means differ (i.e. we reject the null hypothesis H0), we often need to know which treatment means are responsible for these differences.

We will examine three statistical inference procedures that allow us to determine which population means differ:
• Fisher’s least significant difference (LSD) method
• Bonferroni adjustment, and
• Tukey’s multiple comparison method.

Ch 15 : Multiple Comparisons
Fisher’s Least Significant Difference (LSD)

– Compare the mean difference of all combinations with the calculated LSD

– Take the ABSOLUTE value of each difference to see if it is bigger than the LSD.

Ch 15 : Multiple Comparisons
• Bonferroni Adjustment

– It takes the LSD format, but instead of using the sig level as is, it divides it by k(k-1)/2 (k being the number of treatments) to get a smaller sig level and a lower chance of type 1 error.

– Use the Multiple Comparisons method, but you need to PRE-adjust the sig level before entering it in; it is still an LSD method.
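The pre-adjustment of the significance level can be sketched in a couple of lines. The function name is hypothetical; k is the number of treatments, as on the slide.

```python
def bonferroni_alpha(alpha: float, k: int) -> float:
    """Divide the significance level by the number of pairwise comparisons, k(k-1)/2."""
    comparisons = k * (k - 1) // 2
    return alpha / comparisons

# With 3 treatments there are 3 pairwise comparisons; with 4 there are 6
adjusted_3 = bonferroni_alpha(0.05, 3)
adjusted_4 = bonferroni_alpha(0.05, 4)
print(round(adjusted_3, 4), round(adjusted_4, 4))
```

The adjusted value is what would be entered as the sig level in the LSD-style comparison.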

Ch 15 : Multiple Comparisons

• Tukey’s Multiple Comparison Method (w)

– Same concept, see which is higher

RULE OF THUMB: if you have 2 or 3 pairwise comparisons, use Bonferroni’s adjustment. If you want to compare ALL possible combinations, use Tukey’s.

Chapter 16: Chi-Squared Tests
• Goodness of Fit Test – used in 2 ways:
• used to describe one population of data with more than 2 nominal options (not heads/tails, but rock/paper/scissors)
– trials must be independent
– must have expected frequency > 5 for each option (n*p)
– the null hypothesis sets the expected count to n*p for each option, and the goodness of fit test determines if the actual results differ from them.
• used to determine if two classifications of a pop are statistically independent
- also interpreted as a comparison of 2+ populations

Given proportions of .45, .40, and .15, the expected frequency of each option is n times its proportion.

This test compares the expected to the actual, and gives the p-value.

Chapter 16: Chi-Squared Tests

• Chi-Squared test of a Contingency Table

• is there enough evidence to infer that two nominal variables are related, or

• to infer that differences exist between two or more populations of nominal data.

Contingency Table

             MBA Major
Degree       1     2     3     TOTAL
1            31    13    16    60
2            8     16    7     31
3            12    10    17    39
4            10    5     7     22
TOTAL        61    44    47    152

chi-squared Stat        14.7019
df                      6
p-value                 0.0227
chi-squared Critical    12.5916
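The chi-squared Stat and df on this printout can be recomputed from the observed counts alone: each expected count is (row total × column total) / grand total. A self-contained sketch with the table from the slide; the function name is illustrative.

```python
def chi_squared_contingency(observed):
    """Return (chi-squared statistic, degrees of freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (count - expected) ** 2 / expected
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return stat, df

# Rows: degree 1-4; columns: MBA major 1-3 (counts from the slide)
table = [
    [31, 13, 16],
    [8, 16, 7],
    [12, 10, 17],
    [10, 5, 7],
]
chi2_stat, chi2_df = chi_squared_contingency(table)
print(round(chi2_stat, 4), chi2_df)
```

Since the statistic exceeds the critical value 12.5916 on the printout, the two classifications are judged to be related at the 5% level.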

Chapter 17: Linear Regression and Correlation
• Regression:
USED TO: analyze the relationship between interval variables.

Deterministic Model: set of equations to mathematically determine the value of the dependent variable from the values of the independent variables.

Probabilistic Model: method used to capture the randomness (try to fit an equation relating size of houses and cost of houses)

Deterministic Model: y = 200 + 4x – the price of a car goes up by exactly $4 for each unit of horsepower.
Probabilistic Model: y = 200 + 4x + ε (error) – this represents the real world variability.

Chapter 17: Linear Regression and Correlation

• Eg: resell value of car with x miles on the odometer

Standard Error (Se) = how large the typical error of the points around the line is (judge it relative to the mean of y)
Coefficient of Determination (R^2) = how much of the variation in y is explained by the independent variable (if 1: no error, and all variation is due to the independent variable; if 0: no linear relationship between the variables, and all variation is error).

Chapter 17: Linear Regression and Correlation

• Correlation (Pearson)

• to see the direction of the relationship; the correlation will be between -1 and 1

• do a 2 tail test if you want to know if there is a relationship

• do a 1 tail test if you want to know if there is a positive/negative relationship

Chapter 17: Linear Regression and Correlation

Prediction Interval Method
• To find out the predicted value of an individual item (prediction interval) or the expected value of the mean of a population (confidence interval estimate)

• The confidence interval estimate of the expected value of y will be narrower than the prediction interval for the same given value of x and confidence level. This is because there is less error in estimating a mean value as opposed to predicting an individual value.

Prediction Interval

Confidence Interval Estimator of the mean price

Point Prediction Estimate of the range of 1 population

The interval that the mean of a large number of trials will fall in.

Chapter 18: Multiple Regression (multiple variables, all first-order)

Eg: Hotel profit margin – based on 6 factors.

Chapter 19: Model Building

• Checking the regression tool’s output…

The model fits the data well

and it’s valid…

Uh oh: multicollinearity

Chapter 21: Nonparametric Techniques

• USED TO:
– Test characteristics of ORDINAL DATA
– Used when interval data is NONNORMAL

– so it is all about seeing where the population locations are when comparing 2 populations

Chapter 21: Nonparametric Techniques

Wilcoxon Rank Sum Test

• compare 2 pop’s

• ordinal (or nonnormal interval)

• independent samples

You rank all the values from both populations together from lowest to highest, then sum the ranks of each population’s values.

You need to compare the z-stat and the z-critical for one-tail / two-tail, depending on whether you want to see if the population locations are different (two-tail) or if one is greater than the other (one-tail).

NOTE: the populations need to have identical spreads (variance) and shape (distribution)

Chapter 21: Nonparametric Techniques

compare… p-value

Eg: testing quality of new painkiller vs plain old aspirin. It ranks the answers, sums the ranks, and the test determines if there is a difference. This is testing to see if the new one IS better than the old one – 1 tail test.

Chapter 21: Nonparametric Techniques

Sign Test
• compare 2 pop’s
• ordinal or interval (nonnormal)
• samples are MATCHED PAIRS
• must be similar in shape and spread
• For each matched pair, denote a -1 or 1 (which one is bigger).
• Look at the z-stat / p-value to see if the z-stat is bigger than z-crit (one or two tail)

Chapter 21: Nonparametric Techniques

compare… p-value

Eg: testing to see if European cars are more comfortable than American cars.

Chapter 21: Nonparametric Techniques

• Wilcoxon Signed Rank Sum Test
• used only for comparing
– 2 populations
– nonnormal interval data
– matched pairs

• Compute paired differences
• Rank absolute values of differences
• Sum the ranks of +ve and –ve differences
• Use Rank Sum Test and compare z-stat / p-value.

Chapter 21: Nonparametric Techniques

• Eg: Flex time experiment. Takes the differences of the matched pairs, and adds up the ranks of the negative differences and of the positive differences.

• Q1. Are flextime times different from the standard times?

• Q2. Is flextime faster than normal?

compare…

p-value

Chapter 21: Nonparametric Techniques

• Kruskal-Wallis Test

• used to compare two or more populations (3+)

– have ordinal or nonnormal interval data (independent samples)

• can only test to see if they differ!

– Compare the H-stat to the critical value, or the p-value to the sig level

compare…

p-value

Chapter 21: Nonparametric Techniques

Friedman Test
• compare 2 or more pops of ordinal or nonnormal interval data generated from a randomized block experiment.

• Seeing if at least 2 pop locations differ

• Eg: 4 managers evaluating 8 candidates – 4 treatments, and 8 blocks.

• Compare Fr-stat and chi-squared critical

Chapter 21: Nonparametric Techniques

• Spearman Rank Correlation Coefficient

• to see if a relationship exists (pos or neg relationship) between 2 ordinal variables or nonnormal interval variables (or if 1 is interval and 1 is ordinal!!)

• compare stat to critical – usually 2 tail since you are looking for a general relationship.

Q1. Is there a relationship??

Q2. Is there a positive relationship??

compare…

p-value

SAMPLE EXAM QUESTIONS

Which technique to use?

• The bookstore has a policy that the proportion of books returned should be less than 10%. To see if the policy is working, a random sample of book titles was drawn, and the fraction of the total originally ordered that are returned is recorded. Can we infer at the 10% significance level that the mean proportion of returns is less than 10%.

t-test of μ

• The bookstore has a policy that the proportion of books returned should be less than 10%. To see if the policy is working, a random sample of book titles was drawn, and the fraction of the total (interval) originally ordered that are returned is recorded. Can we infer at the 10% significance level that the mean proportion of returns is less than 10%.

Which technique to use?

• Has the recent drop in airplane passengers resulted in better on time performance? Before the recent downturn, one airline bragged that 92% of its flight were on time. A random sample of 165 flights reveals that 153 were on time. Can we conclude at the 5% significance level that the airline’s on time performance has improved?

z -test of p

• Has the recent drop in airplane passengers resulted in better on time performance? Before the recent downturn, one airline bragged that 92% of its flight were on time. A random sample of 165 flights reveals that 153 were on time (nominal- either on time, or not). Can we conclude at the 5% significance level that the airline’s on time performance has improved?

Which Technique?

• Guggul is a popular remedy in India to lower cholesterol levels. 103 Philadelphia area adults were divided into 3 groups. Group 1 took placebos, 2 took 1000 milligrams of guggul, group 3 took 2000 milligrams of guggul. The changes in low-density cholesterol were recorded. Can we infer that there are differences in the reduction of cholesterol between the groups? Histograms are bell shaped and similar.

ANOVA (one-way)


• -PO: Compare 2 or more populations: group 1, 2 and 3.
• -DT: Interval: reduction in cholesterol levels.
• -Samples are independent: nothing relating the samples
• -Normally Distributed: stated in question

Excel Output

Which technique?

• Who spends more on vacations, golfers or skiers? A travel agency surveyed 15 customers who regularly take their spouses on either a golfing or skiing vacation and asked how much money they spend. Can we infer that golfers and skiers differ in their vacation expenses. Assume variance is a 1:1.5 ratio between the golfer and skier spending. Normally distributed.

Equal-variances t-test of µ1 − µ2

• Who spends more on vacations, golfers or skiers? A travel agency surveyed 15 customers who regularly take their spouses on either a golfing or skiing vacation and asked how much money they spend. Can we infer that golfers and skiers differ in their vacation expenses? Assume the variance is in a 1:1.5 ratio between golfer and skier spending (rule of thumb: a ratio of less than 1:2 means the variances are equal).

Which Technique?

• Do waitresses or waiters earn larger tips? To answer this, a study was done involving a measure of the percentage of the total bill left as a tip. One randomly selected waiter and one randomly selected waitress were chosen in each of 50 restaurants during a one-week period. What conclusions can be drawn from the data? Normal data

t-test of µD

• 2 pop’s: waiters and waitresses. Data type is interval because percentages were recorded, we are measuring central location (average tip percentage), and the pairs are matched (one male and one female from each restaurant).

Which technique to use?

• During the winter some grape vines die from the extreme cold. In the spring the vines are pruned; if a vine is brown the plant is dead, and green means healthy. A random sample of vines is selected to see how well they survived the winter. Each vine is considered (1) alive, or (2) dead. Estimate with 90% confidence the degree of winter kill for this vineyard.

Estimator of p


Which technique to use?

• A random sample of 223 children aged 8-12, of whom 41 were obese. Each child’s metabolic rate (calories burned per hour) was measured while at rest and also while the children watched television. The differences between the 2 rates were recorded. Column 1 contains the difference in metabolic rate and column 2 codes the children as 1=obese and 2=not obese. nonnormal data

• B) Can we conclude that the decrease in metabolism while watching television is greater among obese children?

Wilcoxon rank sum test


• -PO: Compare 2 populations
• -DT: Ordinal
• -Independent Samples

Excel Output

             Rank Sum   Observations
Obese        5750       41
Non Obese    19226      182

z Stat                 3.1
P(Z<=z) one tail       0.001
z Critical one tail    1.6449
P(Z<=z) two tail       0.002
z Critical two tail    1.96

P-value = 0.001
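The z Stat in this output can be recovered from the rank sum and the two sample sizes. A minimal sketch of the usual large-sample normal approximation for the rank sum test, with no correction for tied ranks (an assumption; the Excel add-in may handle ties differently). The function name is illustrative.

```python
import math

def rank_sum_z(rank_sum_1: float, n1: int, n2: int) -> float:
    """z statistic for the Wilcoxon rank sum test, based on sample 1's rank sum."""
    expected = n1 * (n1 + n2 + 1) / 2
    std_dev = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (rank_sum_1 - expected) / std_dev

# Obese children are sample 1: rank sum 5750, n1 = 41, n2 = 182
z_stat = rank_sum_z(5750, 41, 182)
print(round(z_stat, 1))
```

Because the question asks whether the decrease is greater among obese children, the one-tail p-value is the relevant one.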

Which technique to use?

• A fast food franchiser wants to build a restaurant downtown, but based on financial analysis, the location is only acceptable if the number of pedestrians passing by the location averages more than 200 per hour. To help decide whether to build or not, a statistics practitioner observes the number of pedestrians who pass the site each hour over a 40-hour work week. Should the franchiser build on the street?

t-test of μ

• A fast food franchiser wants to build a restaurant downtown, but based on financial analysis, the location is only acceptable if the number of pedestrians passing by the location averages more than 200 per hour. To help decide whether to build or not, a statistics practitioner observes the number of pedestrians who pass the site each hour over a 40-hour work week. Should the franchiser build on the street? (Question is asking to describe the population, because we want to know if the pop is larger than 200/hour).

t-Test: Mean

                       Pedestrians
Mean                   209.125
Standard Deviation     60.0078
Hypothesized Mean      200
df                     39
t Stat                 0.9617
P(T<=t) one-tail       0.1711
t Critical one-tail    1.6849
P(T<=t) two-tail       0.3422
t Critical two-tail    2.0227

P-value = .1711

Which Technique to use?

• To test if there are differences in teaching methods, a professor taught section 1 by lecture, section 2 by case study, and section 3 by excel. At the end of the course each student then ranked the course on a 7 point scale (1=atrocious… 7=excellent). From each section, 25 evaluations were chosen at random. Is there evidence that differences in student satisfaction exists with respect to at least 2 of the 3 teaching methods.

Kruskal Wallis test


• -PO: Compare 2 or more populations – 3 sections
• -DT: Ordinal or Interval (non-normal distribution if interval)
– In this case data was ordinal – 7 point scale
• -Independent Samples

Excel Output

Kruskal-Wallis Test

Group       Rank Sum   Observations
Lecture     767.5      25
Case        917        25
Computer    1165.5     25

H Stat                  6.8072
df                      2
p-value                 0.0333
chi-squared Critical    5.9915

P-value = 0.0333
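The H Stat follows from the three rank sums and group sizes. A sketch of the standard H formula without a tie correction (an assumption, though it reproduces the printed value here); the function name is illustrative.

```python
def kruskal_wallis_h(rank_sums, sizes):
    """H statistic from each group's rank sum and sample size (no tie correction)."""
    n = sum(sizes)
    weighted = sum(t * t / n_j for t, n_j in zip(rank_sums, sizes))
    return 12 / (n * (n + 1)) * weighted - 3 * (n + 1)

# Lecture, Case, Computer sections from the output
h_stat = kruskal_wallis_h([767.5, 917, 1165.5], [25, 25, 25])
h_df = 3 - 1  # number of groups minus 1
print(round(h_stat, 4), h_df)
```

Since H exceeds the chi-squared critical value of 5.9915, at least two of the teaching methods differ in satisfaction.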

Which technique?

• In textbook example 12.52, we described the problem with changing light bulbs. We decided they need to be fixed, but there are 2 brands of bulbs to use. The mean and variance of the lengths of life are important, we therefore randomly sample both brands by leaving them on until they burn out. The times were recorded. Can we conclude that the variances differ?

F-test of σ1²/σ2²

• In textbook example 12.52, we described the problem with changing light bulbs. We decided they need to be fixed, but there are 2 brands of bulbs to use. The mean and variance of the lengths of life are important; we therefore randomly sample both brands (2 pop’s) by leaving them on until they burn out. The times were recorded. Can we conclude that the variances differ?

What technique to use?

• To determine the effect of full-page advertisements, the owner of a store asked 200 randomly selected people who visited the store whether or not they had seen the ad. He also determined whether or not the customers bought anything and if so, how much they spent.

• A) Can the owner conclude that customers who see the ad are more likely to make a purchase than those who do not see the ad?

z-test of p1 − p2

-Compare 2 populations (those who saw the ad and those who didn’t)

-Nominal data (Buy or not buy)

Excel output• Can the owner conclude that customers who see the add are more likely to make a purchase than those who do not see the ad?• Can we conclude that the purchase tendency between people who see the ad and people who don’t is different?

z-test of the Difference Between 2 Proportions

                      Sample 1   Sample 2
Sample Proportion     0.4336     0.2414
Sample Size           113        87
Alpha                 0.05

z stat                2.83
P(Z<=z) one tail      0.0024
z critical one tail   1.6449
P(Z<=z) two-tail      0.0052
Z critical two tail   1.96
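The z stat in this output can be checked by hand. The sketch below is one way to do it in Python, assuming the usual pooled-proportion form of the test for a hypothesized difference of 0; the function name `two_proportion_z` is illustrative and not part of the Excel macro.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(p1, n1, p2, n2):
    """z statistic for H0: p1 - p2 = 0, using the pooled sample proportion."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    std_error = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / std_error


# Sample proportions and sizes copied from the Excel output.
z = two_proportion_z(0.4336, 113, 0.2414, 87)

# One-tail p-value for H1: p1 - p2 > 0, and the 5% one-tail critical value.
p_one_tail = 1 - NormalDist().cdf(z)
z_critical = NormalDist().inv_cdf(1 - 0.05)

print(round(z, 2), round(p_one_tail, 4), round(z_critical, 4))
```

Because z is above the one-tail critical value, the conclusion matches the output: reject H0 at the 5% level.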

Which technique to use?

• 20 people were recruited who were more than 50 pounds overweight to compare 4 diets. The people were matched by age in groups of 4. The number of pounds that each person lost were recorded. Can we infer that there are differences between the four diets? All histograms are bell shaped and similar.

Analysis of variance (randomized blocks)

• -PO: Compare 2 or more populations (different diets)

• -DT: Interval (amount of weight lost)

• -Blocks (according to age)

• -Distributed Normally

What technique to use?

• To determine the effect of full-page advertisements, the owner of a store asked 200 randomly selected people who visited the store whether or not they had seen the ad. He also determined whether or not the customers bought anything and if so, how much they spent.

• B) Can the owner conclude that customers who see the ad spend more than those who do not see the ad? (normal dist)

t-test of $\mu_1 - \mu_2$

• B) Can the owner conclude that customers who see the ad spend more than those who do not see the ad? (assume variances are equal)

-Compare 2 pop’s (see the ad and don’t see the ad)

-Interval data (Amount of money Spent)

-Independent Samples, normal

EQUAL or UNEQUAL????

Excel Output

• B) Can the owner conclude that customers who see the ad spend more than those who do not see the ad? (assume variances are equal)

t-test: Two Sample Assuming Equal Variance

                      Ad       No Ad
Mean                  97.38    92.01
Variance              621.97   283.26
Observations          49       21
Pooled Variance       522.35
df                    68
t stat                0.9
P(T<=t) one tail      0.1853
t Critical one tail   1.6676
P(T<=t) two tail      0.3705
t Critical two tail   1.9955
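The pooled variance, df and t stat in this output follow from the two sample summaries. A minimal sketch of that arithmetic, using only the numbers printed above (the variable names are illustrative, and the critical value is copied from the output rather than computed):

```python
from math import sqrt

# Summary statistics copied from the Excel output.
mean_ad, var_ad, n_ad = 97.38, 621.97, 49
mean_no_ad, var_no_ad, n_no_ad = 92.01, 283.26, 21

# Equal-variances t-test: pool the two sample variances, weighted by df.
df = n_ad + n_no_ad - 2
pooled_var = ((n_ad - 1) * var_ad + (n_no_ad - 1) * var_no_ad) / df
t_stat = (mean_ad - mean_no_ad) / sqrt(pooled_var * (1 / n_ad + 1 / n_no_ad))

# One-tail critical value taken from the output (alpha = 0.05, df = 68).
t_critical = 1.6676
reject_h0 = t_stat > t_critical

print(df, round(pooled_var, 2), round(t_stat, 2), reject_h0)
```

The t stat is well below the critical value, so there is not enough evidence that customers who see the ad spend more.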

P value= 0.1853

T-test: assuming unequal variances

Which technique to use?

• The weekly returns of 2 stocks for a 13-week period were recorded and listed here. Assuming that the returns are not normally distributed, can we infer at the 5% significance level that the stock returns are correlated?

Stock 1:  -7  -4  -7  -3   2  -10  -10   5   1  -4   2   6  -13
Stock 2:   6   6  -4   9   3   -3    7  -3   4   7   9   5   -7

Spearman Rank Correlation

• -PO: Analyze the relationship between 2 variables – stock returns of 2 different stocks

• -DT: Ordinal or Interval (non-normal distribution if interval) – here interval, non-normal
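The Spearman coefficient is the ordinary correlation coefficient computed on the ranks, with tied values sharing the average of their ranks. A rough sketch for the 13 weekly returns listed above, assuming that definition; the helper names `average_ranks`, `pearson` and `spearman` are illustrative only.

```python
from math import sqrt


def average_ranks(values):
    """Rank from 1 (smallest); tied values get the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Sample correlation coefficient of two equal-length lists."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


stock_1 = [-7, -4, -7, -3, 2, -10, -10, 5, 1, -4, 2, 6, -13]
stock_2 = [6, 6, -4, 9, 3, -3, 7, -3, 4, 7, 9, 5, -7]
r_s = spearman(stock_1, stock_2)
print(round(r_s, 4))
```

With only 13 pairs, r_s would be compared with a critical value from the Spearman table rather than with a z stat.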

Which Technique to use?

• A spokesperson for the postal service said it had a success rate of more than 95% in delivering priority mail letters within a 2-day deadline. An angry mailman decided to conduct an experiment to test this statement by sending letters by priority mail and ordinary mail from his hometown to Waterloo. Letters that arrived on time were recorded with a 2 and late letters were recorded as a 1.

• A) Does the data provide sufficient evidence to support the spokesperson’s claim?

z-test of p

• Describe a population: do they deliver letters within the 2-day deadline more than 95% of the time?

• Nominal Data- 1 for late letters, 2 for on time.

Excel Output

z-Test: Proportion

                          2
Sample Proportion         0.9713
Observations              244
Hypothesized Proportion   0.95
z Stat                    1.5274
P(Z<=z) one-tail          0.0633
z Critical one-tail       1.6449
P(Z<=z) two-tail          0.1266
z Critical two-tail       1.96

P value = 0.0633
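A quick check of this output, sketched in Python under the assumption that the macro uses the standard z-test of a proportion with H0: p = 0.95 and H1: p > 0.95. The sample proportion is taken as printed (rounded to 4 decimals), so the z stat agrees with the output only to about two decimals.

```python
from math import sqrt
from statistics import NormalDist

# Values copied from the Excel output.
p_hat, n, p0 = 0.9713, 244, 0.95

# The standard error uses the hypothesized proportion, not the sample one.
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
p_value = 1 - NormalDist().cdf(z)  # one-tail, H1: p > 0.95
reject_h0 = p_value < 0.05

print(round(z, 2), round(p_value, 3), reject_h0)
```

The p-value is just above 5%, so the data do not quite support the spokesperson's claim at that level.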

Which technique to use?

• A spokesperson for the postal service said it had a success rate of more than 95% in delivering priority mail letters within a 2-day deadline. An angry mailman decided to conduct an experiment to test this statement by sending letters by priority mail and ordinary mail from his hometown to Waterloo. Letters that arrived on time were recorded with a 2 and late letters were recorded as a 1.

• B) Does the data provide sufficient evidence to prove that Priority mail met the 2 day deadline more frequently than ordinary mail?

z-test of $p_1 - p_2$

• Compare two populations- priority mail and ordinary mail.

• Nominal Data- 1 fail, 2 success

• 2 categories- on time or not

Excel Output

z-Test: Two Proportions

                          Priority   Ordinary
Sample Proportions        0.9714     0.9101
Observations              245        378
Hypothesized Difference   0
z Stat                    3.018
P(Z<=z) one tail          0.0013
z Critical one-tail       2.3263
P(Z<=z) two-tail          0.0026
z Critical two-tail       2.5758

P value= .0013

Which Technique?

• A television executive wants to know whether the amount of money spent gambling on football affects the enjoyment of the viewers. A random sample of 200 people who watch football and gamble on the games was asked how much they wagered on the game and to rate enjoyment (1 = not enjoyable, 2 = somewhat enjoyable, 3 = enjoyable). Do these data provide enough evidence to infer that, the greater the wager, the more enjoyable the game is?

Spearman Rank Correlation

• -PO: Analyze the relationship between 2 variables – amount of money wagered and enjoyment.

• -DT: Ordinal or Interval (non-normal distribution if interval) – ordinal (enjoyment)

Excel Output

Spearman Rank Correlation

Wager and Enjoyment

Spearman Rank Correlation   0.3912
z Stat                      5.5192
P(Z<=z) one tail            0
z Critical one tail         1.6449
P(Z<=z) two tail            0
z Critical two tail         1.96

P-value= 0
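For a large sample the z stat can be recovered from the rank correlation. The sketch below assumes the usual large-sample approximation z = r_s·sqrt(n − 1), with n = 200 taken from the question; r_s is used as printed, so the result matches the output only to about two decimals.

```python
from math import sqrt
from statistics import NormalDist

r_s = 0.3912  # Spearman rank correlation from the Excel output
n = 200       # sample size stated in the question

# Large-sample approximation: under H0, r_s * sqrt(n - 1) is roughly standard normal.
z = r_s * sqrt(n - 1)
p_one_tail = 1 - NormalDist().cdf(z)

print(round(z, 2), p_one_tail < 0.0001)
```

The p-value is not literally 0; it is simply smaller than the four decimals the output displays.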

What technique?

• A popular game of craps is based on the probabilities of rolling certain sums with a pair of dice. Ex. Probability of rolling a sum of 3= 2/36, 4=3/36… A stats professor suspects that the dice are not fairly balanced and records each of 1000 throws. A) Does the data allow us to infer that the dice are not fair?

$\chi^2$ goodness-of-fit test

• -PO: Describe a population- sums when rolling dice.

• -DT: Nominal – sum of the rolls

• -Multinomial experiment- more than 2 categories (this is why it is not z test of p)
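The slides give no output for this question, so the sketch below only illustrates the mechanics. The observed counts are HYPOTHETICAL (made up so that they add to 1000) and are not the professor's data; the expected counts come from the fair-dice probabilities 1/36, 2/36, ..., 6/36, ..., 1/36 for the sums 2 to 12. The helper `chi2_sf_even_df` is an illustrative name; it uses a closed form for the chi-squared tail probability that only holds for even df.

```python
from math import exp, factorial


def chi2_sf_even_df(x, df):
    """P(chi-squared > x) for an EVEN number of degrees of freedom."""
    if df % 2 != 0:
        raise ValueError("this closed form only holds for even df")
    half = x / 2
    return exp(-half) * sum(half ** k / factorial(k) for k in range(df // 2))


# Number of ways (out of 36) to roll each sum 2, 3, ..., 12 with fair dice.
ways = [1, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1]

# HYPOTHETICAL observed counts for the sums 2..12 over 1000 throws.
observed = [30, 50, 90, 105, 140, 170, 135, 115, 80, 60, 25]

n = sum(observed)
expected = [n * w / 36 for w in ways]
chi2_stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(ways) - 1  # k - 1 categories
p_value = chi2_sf_even_df(chi2_stat, df)

print(round(chi2_stat, 3), df, round(p_value, 4))
```

Every expected count here is at least 5, which is the usual rule of thumb for the test to be valid.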

Which technique to use?

• CFCs are banned because they damage the ozone layer. The new legislation in Ontario will affect those who use CFCs in the air conditioners in their cars. To see how many vehicles will be affected by the new ban on CFCs, a survey of 650 vehicles was taken and each car was identified as either (1) uses CFCs or (2) does not. If there are 5 million vehicles registered in Ontario, estimate with 95% confidence the number of vehicles affected by the new law.

Estimator of p

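The survey result itself is not shown in the slides, so the count below is HYPOTHETICAL and only illustrates the two steps: build the confidence interval for p, then multiply both limits by the 5 million registered vehicles. The function name `proportion_interval` is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(successes, n, confidence=0.95):
    """Confidence interval for p: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


cfc_count = 195       # HYPOTHETICAL number of surveyed vehicles coded 1
surveyed = 650        # sample size from the question
registered = 5_000_000

low_p, high_p = proportion_interval(cfc_count, surveyed)
low_vehicles, high_vehicles = low_p * registered, high_p * registered

print(round(low_vehicles), round(high_vehicles))
```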
Which Technique to use?

• To examine whether age is a factor in determining who drinks alcohol, 1054 adults were polled and asked “do you ever drink alcohol?” Responses were recorded as 1=yes, 2=no. They were also asked to report their age category where 1=18-29, 2=30-49, 3=50+. Can we infer that differences exist among age categories with respect to alcohol use?

$\chi^2$ test of a contingency table

-PO: Compare 2 or more populations

-DT: Nominal

Which Technique to Use?

• In a taste test of a new beer, 100 people rated the new beer and the leading brand. Possible ratings were poor (1), fair (2), good (3), very good (4) and excellent (5). Can we infer that the new beer is rated higher than the leading brand?

Sign test

• -PO: Compare 2 populations (new beer and old)

• -DT: Ordinal (numbers with meaningful order)

• -Matched Pairs (same people tested both beers)

Which Technique to use?

• The manager of a sports store wants to renovate and increase the floor space in 2 possible areas, the tennis department and the swimming department. She decided that if the swim department has higher gross sales she will choose that area to renovate. She has collected the 2 departments weekly gross sales for 6 months. Which department should she renovate? (Histograms highly skewed and not similar)

Wilcoxon signed rank sum test

• -PO: Compare 2 populations (tennis and swim)

• -DT: Interval (weekly gross sales)

• -Measure of Central Location- mean sales

• -Matched Pairs (matched week by week).

• -non normally distributed (stated in question)

Excel Output

Wilcoxon Signed Rank Sum Test

Difference Tennis - Swimming

T+                        111
T-                        240
Observations (for test)   26
z Stat                    -1.638
P(Z<=z) one-tail          0.0507
z Critical one-tail       2.3263
P(Z<=z) two-tail          0.1014
z Critical two-tail       2.5758

P-value= 0.0507
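The z stat here can be rebuilt from T+ and the number of non-zero differences. A minimal sketch, assuming the usual normal approximation for the signed rank sum, E(T) = n(n+1)/4 and σ_T = sqrt(n(n+1)(2n+1)/24):

```python
from math import sqrt
from statistics import NormalDist

# Values copied from the Excel output (difference = Tennis - Swimming).
t_plus, t_minus, n = 111, 240, 26

# Sanity check: the two rank sums must add up to 1 + 2 + ... + n.
assert t_plus + t_minus == n * (n + 1) // 2

expected_t = n * (n + 1) / 4
sigma_t = sqrt(n * (n + 1) * (2 * n + 1) / 24)
z = (t_plus - expected_t) / sigma_t

# H1: tennis sales are lower than swimming sales, so the left tail is used.
p_one_tail = NormalDist().cdf(z)

print(round(z, 3), round(p_one_tail, 4))
```

The one-tail critical value 2.3263 in the output corresponds to a 1% significance level, and the p-value is above both 1% and 5%.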

Which technique to use?

• A company wants to select one of three couriers to act as its sole delivery method. An experiment was performed whereby letters were sent using each of the three couriers at 12 different times of the day to a delivery point downtown. The number of minutes required for delivery was recorded. Can we conclude that there are differences in delivery times between the three couriers? Data is highly skewed and non-normal.

Friedman Test

• -PO: Compare 2 or more populations (3 couriers)

• -DT: Ordinal or Interval (non-normal distribution if interval)- data is interval (minutes) and non-normal

• -Randomized Block Experiment (blocked according to times of the day)

Excel Output

Friedman Test

Group       Rank Sum
Courier 1   28.5
Courier 2   22.5
Courier 3   21

Fr Stat                2.625
df                     2
p-value                0.2691
chi-squared Critical   5.9915

P-value = 0.2691
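The Fr stat follows directly from the three rank sums. The sketch below assumes the standard Friedman formula with b = 12 blocks (times of day) and k = 3 treatments (couriers); because df = 2, the chi-squared tail probability reduces to exp(−x/2), which is used here in place of a statistics library.

```python
from math import exp, log

rank_sums = [28.5, 22.5, 21.0]  # Couriers 1-3, from the Excel output
b = 12                          # blocks: times of the day
k = len(rank_sums)              # treatments: couriers

# Friedman statistic: 12 / (b k (k + 1)) * sum(T_j^2) - 3 b (k + 1)
fr = 12 / (b * k * (k + 1)) * sum(t ** 2 for t in rank_sums) - 3 * b * (k + 1)

df = k - 1
p_value = exp(-fr / 2)           # valid only because df == 2
chi2_critical = -2 * log(0.05)   # 5% critical value for df == 2

print(round(fr, 3), df, round(p_value, 4), round(chi2_critical, 4))
```

Fr is well below the critical value, so there is not enough evidence of a difference in delivery times between the couriers.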

Which technique?

• To test if a computer came with enough memory, the age of the computers was tested. Random samples were taken of computer users where each was asked the brand of the computer and its age in months. Does the data provide sufficient evidence to conclude that there are differences in age between the computer brands? All histograms are bell shaped and similar. Brands identified were Dell, Hewlett Packard, IBM and other.

ANOVA (one-way)

• -PO: Compare 2 or more populations- HP, Dell, IBM, other

• -DT: Interval- months

• -Samples are independent- nothing relating the samples

• -Normally Distributed- stated in question

Which technique?

• It was decided to upgrade the skills of workers because they were unable to master required skills. Experts identified 6 skills and each worker was rated based on these skills and also on the quality of work done on the machine. All data is interval. Identify the skills that affect the quality of work.

Multiple Regression: t-test of $\beta_i$

• -PO: Analyze the relationship between two or more variables – skills and quality of work

• -DT: Interval

What technique?

• A professor of statistics hands back his graded midterms in class by calling out the name of each student and personally handing the exam over to its owner. At the end of the process he notes that there are several exams left over, the result of students missing that class. He forms the theory that the absence is caused by a poor performance by those students on the test. If the theory is correct, the leftover papers will have lower marks than those papers handed back. He recorded the marks (out of 100) for the leftover papers and the marks of the returned papers. Do the data support the professor's theory? Histograms are bell shaped and one is more than three times the width of the other.

t-test of $\mu_1 - \mu_2$ (unequal variances)

• A professor of statistics hands back his graded midterms in class by calling out the name of each student and personally handing the exam over to its owner. At the end of the process he notes that there are several exams left over, the result of students missing that class. He forms the theory that the absence is caused by a poor performance by those students on the test. If the theory is correct, the leftover papers will have lower marks than those papers handed back (2 POPULATIONS). He recorded the marks (out of 100) (INTERVAL) for the leftover papers and the marks of the returned papers. Do the data support the professor's theory? Histograms are bell shaped (normal) and one is more than three times the width of the other (unequal variances).