Essentials of Marketing Research (Second Edition)


Essentials of Marketing Research, Second Edition: Kumar, Aaker & Day


Instructor’s Presentation Slides

Chapter Thirteen

Hypothesis Testing:

Basic Concepts and Tests of Association, Means and Proportion

Hypothesis Testing: Basic Concepts

A hypothesis is an assumption made about a population parameter (not a sample statistic)

Purpose of Hypothesis Testing

To make a judgement about the difference between two sample statistics or the sample statistic and a hypothesized population parameter

Evidence has to be evaluated statistically before arriving at a conclusion regarding the hypothesis.

Hypothesis Testing

The null hypothesis (Ho) is tested against the alternative hypothesis (Ha).

At least the null hypothesis is stated.

Decide upon the criteria to be used in making the decision whether to “reject” or "not reject" the null hypothesis.

The Logic of Hypothesis Testing

Evidence has to be evaluated statistically before arriving at a conclusion regarding the hypothesis

The approach depends on whether the information generated from the sample comes from a small or a large number of observations

Problem Definition

1) Clearly state the null and alternative hypotheses

2) Choose the relevant test and the appropriate probability distribution

3) Determine the significance level, determine the degrees of freedom, and decide if one- or two-tailed test

4) Choose the critical value

5) Compute relevant test statistic

6) Compare test statistic and critical value

7) Does the test statistic fall in the critical region? Yes: reject null. No: do not reject null.

Basic Concepts of Hypothesis Testing (Contd.)

The Three Criteria Used Are

Significance Level

Degrees of Freedom

One or Two Tailed Test

Significance Level

Indicates the percentage of sample means that is outside the cut-off limits (critical value)

The higher the significance level (α) used for testing a hypothesis, the higher the probability of rejecting a null hypothesis when it is true (Type I error)

Accepting a null hypothesis when it is false is called a Type II error and its probability is (β)

Significance Level (Contd.)

When choosing a level of significance, there is an inherent tradeoff between these two types of errors

Power of hypothesis test (1 - β)

A good test of hypothesis ought to reject a null hypothesis when it is false

1 - β should be as high a value as possible
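The tradeoff between the two error types can be made concrete with a small sketch. It assumes an upper-tailed z-test and a true mean lying a hypothetical 2 standard errors above the hypothesized value; the function name `power_one_sided_z` and the shift of 2 are illustrative choices, not part of the slides.

```python
from statistics import NormalDist


def power_one_sided_z(alpha, shift_in_se):
    """Power (1 - beta) of an upper-tailed z-test when the true mean lies
    `shift_in_se` standard errors above the hypothesized value."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)        # cut-off for the chosen alpha
    beta = std_normal.cdf(z_crit - shift_in_se)   # P(do not reject Ho | Ho false)
    return 1 - beta


# Hypothetical shift of 2 standard errors, chosen only for illustration.
for alpha in (0.01, 0.05, 0.10):
    print(f"alpha = {alpha:.2f}  power = {power_one_sided_z(alpha, 2.0):.3f}")
```

Raising α lowers β and so raises the power, at the price of more Type I errors.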

Degree of Freedom

The number or bits of "free" or unconstrained data used in calculating a sample statistic or test statistic

A sample mean (X̄) has n degrees of freedom

A sample variance (s²) has (n-1) degrees of freedom

One or Two-tail Test

One-tailed Hypothesis Test

Determines whether a particular population parameter is larger or smaller than some predefined value

Uses one critical value of test statistic

Two-tailed Hypothesis Test

Determines the likelihood that a population parameter is within certain upper and lower bounds

May use one or two critical values

Basic Concepts of Hypothesis Testing (Contd.)

Select the appropriate probability distribution based on two criteria

Size of the sample

Whether the population standard deviation is known or not

Hypothesis Testing

                         Data Analysis Outcome
In Population            Accept Null Hypothesis    Reject Null Hypothesis
Null Hypothesis True     Correct Decision          Type I Error
Null Hypothesis False    Type II Error             Correct Decision

Hypothesis Testing

Tests in this class        Statistical Test
Frequency Distributions    χ²
Means (one)                z (if σ is known)
                           t (if σ is unknown)
Means (two)                t
Means (more than two)      ANOVA

Cross-tabulation and Chi Square

In Marketing Applications, Chi-square Statistic Is Used As

Test of Independence

Are there associations between two or more variables in a study?

Test of Goodness of Fit

Is there a significant difference between an observed frequency distribution and a theoretical frequency distribution?

Statistical Independence

Two variables are statistically independent if a knowledge of one would offer no information as to the identity of the other

Chi-Square As a Test of Independence

Null Hypothesis Ho

Two (nominally scaled) variables are statistically independent

Alternative Hypothesis Ha

The two variables are not independent

Use Chi-square distribution to test

Chi-square As a Test of Independence (Contd.)

Chi-square Distribution

A probability distribution

Total area under the curve is 1.0

A different chi-square distribution is associated with different degrees of freedom

Chi-square As a Test of Independence (Contd.)

Degree of Freedom

v = (r - 1) * (c - 1)

r = number of rows in contingency table

c = number of columns

Mean of chi-square distribution = Degrees of freedom (v)

Variance = 2v

Chi-square Statistic (χ²)

Measures the difference between the actual number observed in cell i (Oi) and the number expected (Ei) under independence, that is, if the null hypothesis were true

χ² = Σ (Oi - Ei)² / Ei, summed over all cells

With (r-1)*(c-1) degrees of freedom

r = number of rows c = number of columns

Expected frequency in each cell: Ei = pc * pr * n

Where pc and pr are proportions for independent variables and n is the total number of observations

Chi-square Step-by-Step

1) Formulate Hypotheses

2) Calculate row and column totals

3) Calculate row and column proportions

4) Calculate expected frequencies (Ei)

5) Calculate χ² statistic

6) Calculate degrees of freedom

7) Obtain Critical Value from table

8) Make decision regarding the Null-hypothesis

Example of Chi-square as a Test of Independence

Grade A    10     8
Grade B    20    16
Grade C    45    18
Grade D    16     6
Grade E     9     2

This is a ‘Cell’

Chi-square As a Test of Independence - Exercise

Own Expensive     Income
Automobile        Low    Middle    High
Yes               45     34        55
No                52     53        27

Task: Make a decision whether the two variables are independent!

The chi-square distribution

Probability distributions that are continuous, have one mode, and are skewed to the right. Exact shape varies according to the number of degrees of freedom.

The critical value of a test statistic in a chi-square distribution is determined by specifying a significance level and the degrees of freedom.

Ex: Significance level = .05

Degrees of freedom = 4

CVχ² = 9.49

The decision rule when testing hypotheses by means of chi-square distribution is:

If χ² is <= CVχ², accept H0

If χ² is > CVχ², reject H0

Thus, for 4 df and α = .05:

If χ² is <= 9.49, accept H0

If χ² is > 9.49, reject H0

[Figure: chi-square density F(χ²) for df = 4; the critical value 9.49 cuts off 5% of the area under the curve]

Cross Tabulation Example

In a nationwide study of 1,402 adults a question was asked about institutions:

“I am going to name some institutions in this country. As far as the people running these institutions are concerned, would you say you have a great deal of confidence, only some confidence, or hardly any confidence at all in them?”

One of the institutions was television.

Answers to the question about television are cross-tabulated with three levels of income below.

Amount of confidence     Annual Family Income
in television            Under $10,000    $10,000 – 20,000    Over $20,000    Total
A great deal             95               57                  39              191
Only some                272              274                 214             760
Hardly any               140              163                 148             451
Total                    507              494                 401             1,402

Calculations for income-confidence data

Cell      Observed    Expected    Contribution (Oij - Eij)² / Eij
Cell11    95          69.1        9.71
Cell12    57          67.3        1.58
Cell13    39          54.6        4.46
Cell21    272         274.8       .03
Cell22    274         267.8       .14
Cell23    214         217.4       .05
Cell31    140         163.1       3.27
Cell32    163         158.9       .11
Cell33    148         129.0       2.80

χ²ts = 22.15

df = 4 [(r-1)(c-1)]

n = 1402

χ²cv = 9.5

[Figure: chi-square density F(χ²) for df = 4; χ²cv = 9.5 cuts off 5% of the area under the curve, and χ²ts = 22.15 lies in the critical region]
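The chi-square steps can be traced in a short sketch applied to the income-confidence table. The function name `chi_square_independence` is illustrative, and the critical value is copied from the slides rather than computed.

```python
def chi_square_independence(observed):
    """Return (chi-square statistic, degrees of freedom, expected counts)
    for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    # Expected count under independence: row proportion * column proportion * n
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi2, df, expected


# Rows: a great deal / only some / hardly any; columns: the three income levels.
tv_confidence = [
    [95, 57, 39],
    [272, 274, 214],
    [140, 163, 148],
]
chi2, df, expected = chi_square_independence(tv_confidence)
critical_value = 9.49  # from the slides: significance level .05, df = 4
print(f"chi-square = {chi2:.2f}, df = {df}")
print("reject Ho" if chi2 > critical_value else "do not reject Ho")
```

Because the sketch keeps unrounded expected counts, its statistic differs slightly in the second decimal from the slide's 22.15, which sums rounded contributions. The decision is the same.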

Strength of Association

Measured by contingency coefficient

C = √(χ² / (χ² + n)), 0 < C < 1

0 - no association (i.e. variables are statistically independent)

Maximum value depends on the size of table - compare only tables of same size
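As a quick check of scale, the contingency coefficient for the income-confidence data can be computed as below. This is a sketch assuming the usual form C = √(χ² / (χ² + n)); the values 22.15 and 1402 are taken from the earlier slides.

```python
import math


def contingency_coefficient(chi2, n):
    """Contingency coefficient C = sqrt(chi2 / (chi2 + n))."""
    return math.sqrt(chi2 / (chi2 + n))


c = contingency_coefficient(22.15, 1402)
print(f"C = {c:.3f}")
```

A value this close to 0 shows that a statistically significant chi-square can still correspond to a weak association when n is large.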

Limitations As an Association Measure

It Is Basically Proportional to Sample Size

Difficult to interpret in absolute sense and compare cross-tabs of unequal size

It Has No Upper Bound

Difficult to obtain a feel for its value

Does not indicate how two variables are related

Chi-square Goodness of Fit

Used to investigate how well the observed pattern fits the expected pattern

Researcher may determine whether the population distribution corresponds to a normal, Poisson or binomial distribution

Chi-square Degrees of Freedom

Employ (k-1) rule

Subtract an additional degree of freedom for each population parameter that has to be estimated from the sample data

Goodness-of-Fit Test

Suppose a researcher is investigating preferences for four possible names of a new lightweight brand of sandals: Camfo, Kenilay, Nemlads, and Dics. Since the names are generated from random combinations of syllables, the researcher expects preferences will be equally distributed across the four names (that is, each name will receive 25 percent of the available preferences). After sampling 300 people at random and asking them which one of the four names was most preferred, the following distribution resulted (each expected value is 300 * .25 = 75).

Possible Name Observed Preferences Expected Preferences

Camfo 30 75

Kenilay 80 75

Nemlads 120 75

Dics 70 75

Goodness-of-Fit Test (cont.)

There are (k – 1) or three degrees of freedom in this instance. If α is specified as 0.01, the critical value is 11.345 from Statistical Appendix Table 3.18

Given this information, the hypothesis to be tested can be stated as:

H0: preferences are equal for the names

Ha: preferences are not equal for the names

And the decision rule is

If χ² is <= 11.345, accept H0.

If χ² is > 11.345, reject H0.

The test statistic is calculated as

χ² = (30-75)² / 75 + (80-75)² / 75 + (120-75)² / 75 + (70-75)² / 75

= 27.00 + .33 + 27.00 + .33

= 54.66
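The same calculation can be reproduced in a few lines. This is a minimal sketch; the function name `chi_square_goodness_of_fit` is illustrative, and the equal 25 percent split is the expectation stated in the example.

```python
def chi_square_goodness_of_fit(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


observed = {"Camfo": 30, "Kenilay": 80, "Nemlads": 120, "Dics": 70}
n = sum(observed.values())               # 300 respondents
expected = [n * 0.25] * len(observed)    # equal preference across four names
chi2 = chi_square_goodness_of_fit(observed.values(), expected)
df = len(observed) - 1                   # (k - 1) rule, no parameters estimated
print(f"chi-square = {chi2:.2f}, df = {df}")
```

The statistic is far larger than the critical value quoted in the decision rule, so the hypothesis of equal preferences is rejected.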

Hypothesis Testing For Differences Between Means

Commonly used in experimental research

Statistical technique used is analysis of variance (ANOVA)

Hypothesis Testing Criteria Depends on

Whether the samples are obtained from different or related populations

Whether the population standard deviation is known or not known

If the population standard deviations are not known, whether they can be assumed to be equal or not

The Probability Values (P-value) Approach to Hypothesis Testing

P-value provides researcher with alternative method of testing hypothesis without pre-specifying α

Largest level of significance at which we would not reject Ho

Difference Between Using α and p-value

Hypothesis testing with a pre-specified α

Researcher is trying to determine, "is the probability of what has been observed less than α?"

Reject or fail to reject Ho accordingly

The Probability Values (P-value) Approach to Hypothesis Testing (Contd.)

Using the p-Value

Researcher can determine "how unlikely is the result that has been observed?"

Decide whether to reject or fail to reject Ho without being bound by a pre-specified significance level

In general, the smaller the p-value, the greater is the researcher's confidence in sample findings

P-value is generally sensitive to sample size

A large sample should yield a low p-value

P-value can report the impact of the sample size on the reliability of the results
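To illustrate the p-value approach, the sketch below computes a two-tailed p-value for the single-mean z-test example that appears later in these slides (n = 100, X̄ = 4960, σ = 250, hypothesized mean 5000). The helper name `two_tailed_p_value` is illustrative.

```python
from statistics import NormalDist


def two_tailed_p_value(z):
    """Probability of a standard normal value at least as extreme as z,
    counting both tails."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


n, sample_mean, sigma, mu0 = 100, 4960, 250, 5000
z = (sample_mean - mu0) / (sigma / n ** 0.5)
p = two_tailed_p_value(z)
print(f"z = {z:.2f}, p-value = {p:.4f}")
```

A researcher working at α = .05 would not reject Ho here, since the p-value exceeds .05. Reporting the p-value itself lets readers apply their own threshold.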

Hypothesis Testing About a Single Mean - Step-by-Step

1) Formulate Hypotheses

2) Select appropriate formula

3) Select significance level

4) Calculate z or t statistic

5) Calculate degrees of freedom (for t-test)

6) Obtain critical value from table

7) Make decision regarding the Null-hypothesis

Hypothesis Testing About a Single Mean - Example 1

Ho: μ = 5000 (hypothesized value of population)

Ha: μ ≠ 5000 (alternative hypothesis)

n = 100, X̄ = 4960, σ = 250, α = 0.05

Rejection rule: if |zcalc| > zα/2 then reject Ho.

Ho: μ = 1000 (hypothesized value of population)

Ha: μ ≠ 1000 (alternative hypothesis)

n = 12, X̄ = 1087.1, s = 191.6, α = 0.01

Rejection rule: if |tcalc| > tdf, α/2 then reject Ho.

Ho: μ ≤ 1000 (hypothesized value of population)

Ha: μ > 1000 (alternative hypothesis)

n = 12, X̄ = 1087.1, s = 191.6, α = 0.05

Rejection rule: if tcalc > tdf, α then reject Ho.
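The three rejection rules above can be worked through in one sketch. The helper `standardized_stat` is an illustrative name. The t critical values for df = 11 are copied from a standard t table, because the Python standard library has no t quantile function; they are not taken from the slides.

```python
import math
from statistics import NormalDist


def standardized_stat(sample_mean, mu0, std_dev, n):
    """(sample mean - hypothesized mean) / standard error; the same form
    serves for z (sigma known) and t (sigma estimated by s)."""
    return (sample_mean - mu0) / (std_dev / math.sqrt(n))


# First example: sigma known, two-tailed z-test at alpha = 0.05
z_calc = standardized_stat(4960, 5000, 250, 100)
z_crit = NormalDist().inv_cdf(1 - 0.05 / 2)
reject_z = abs(z_calc) > z_crit

# Second and third examples: sigma unknown, n = 12, so df = n - 1 = 11
t_calc = standardized_stat(1087.1, 1000, 191.6, 12)
T_CRIT_TWO_TAILED_01 = 3.106   # table value: df = 11, alpha/2 = 0.005
T_CRIT_ONE_TAILED_05 = 1.796   # table value: df = 11, alpha = 0.05
reject_two_tailed = abs(t_calc) > T_CRIT_TWO_TAILED_01
reject_one_tailed = t_calc > T_CRIT_ONE_TAILED_05

print(f"z = {z_calc:.2f} vs {z_crit:.2f}: reject = {reject_z}")
print(f"t = {t_calc:.3f}: two-tailed reject = {reject_two_tailed}, "
      f"one-tailed reject = {reject_one_tailed}")
```

In all three cases the calculated statistic stays inside the acceptance region, so Ho is not rejected.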

Confidence Intervals

Hypothesis testing and Confidence Intervals are two sides of the same coin.

Interval estimate

X̄ ± t · s(X̄)

Confidence Interval Estimation

If (1 - α) = .95 then,

P(X̄ - zα/2 · σ/√n ≤ μ ≤ X̄ + zα/2 · σ/√n) = .95

Problem: n = 75, α = .01

Since the CI is for both sides, the z-value is obtained for α/2 = .005: zα/2 = 2.58

P(290 - 2.58(15/√75) ≤ μ ≤ 290 + 2.58(15/√75)) = .99

P(285.54 ≤ μ ≤ 294.46) = .99

Test the hypothesis that the true mean weight of the Hawkeyes football team is greater than or equal to 300 pounds with α = .05

H0: μW ≥ 300

H1: μW < 300

At α = 0.05, CVZ = -1.645 (for a one-tailed test)

Since Zts falls in the critical region

We ______________________ the null hypothesis

Test the hypothesis that the true mean weight of the Hawkeyes football team is equal to 286 pounds with α = 0.01

H0: μW = 286

H1: μW ≠ 286

At α = .01

CVZ = 2.58

Since Zts < CVZ we __________________ the null hypothesis
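The confidence interval and the two test statistics for the Hawkeyes example can be recomputed as a check. The sketch assumes a sample mean of 290 and a standard deviation of 15 with n = 75, which are the values that appear in the interval computation; the critical values are the rounded ones quoted on the slides.

```python
import math

n, sample_mean, s = 75, 290, 15   # values read off the interval computation
std_error = s / math.sqrt(n)

# 99% confidence interval, using the slides' rounded z-value of 2.58
z = 2.58
ci_low = sample_mean - z * std_error
ci_high = sample_mean + z * std_error

# One-tailed test of H0: mu >= 300 at alpha = .05 (critical region: z < -1.645)
z_300 = (sample_mean - 300) / std_error
in_critical_region_300 = z_300 < -1.645

# Two-tailed test of H0: mu = 286 at alpha = .01 (critical region: |z| > 2.58)
z_286 = (sample_mean - 286) / std_error
in_critical_region_286 = abs(z_286) > 2.58

print(f"99% CI: ({ci_low:.2f}, {ci_high:.2f})")
print(f"Zts for 300: {z_300:.2f}, in critical region: {in_critical_region_300}")
print(f"Zts for 286: {z_286:.2f}, in critical region: {in_critical_region_286}")
```

The second result also follows from the interval: 286 lies inside the 99% confidence interval, so the two-tailed test at α = .01 cannot place it in the critical region.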

H0: PA = PB

HA: PA not equal to PB

Chain    N     Proportion of Stores Open for 24 hours
A        40    .45
B        75    .40

p̂ = weighted average of sample proportions

p̂ = (40(.45) + 75(.40)) / 115 = .42

Computation of tts would proceed as follows:

df = n1 + n2 - 2 = (n1 - 1) + (n2 - 1)

df = 113

Critical values: -1.96 and +1.96 (.025 in each tail)
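The slide's formula for tts did not survive extraction, so the sketch below assumes the usual pooled standard error for a difference of two proportions, √(p̂(1 - p̂)(1/n1 + 1/n2)). That formula is an assumption, not a quotation from the slides; the sample sizes, proportions and critical values are the ones given above.

```python
import math

n_a, p_a = 40, 0.45   # chain A
n_b, p_b = 75, 0.40   # chain B

# Weighted average of the sample proportions
p_pooled = (n_a * p_a + n_b * p_b) / (n_a + n_b)

# Assumed pooled standard error of the difference in proportions
std_error = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n_a + 1 / n_b))
t_ts = (p_a - p_b) / std_error
df = n_a + n_b - 2

reject = abs(t_ts) > 1.96   # critical values from the slides
print(f"p-hat = {p_pooled:.2f}, df = {df}, tts = {t_ts:.2f}, reject H0: {reject}")
```

Under this assumption the statistic falls well inside ±1.96, so the two chains' proportions would not be judged different.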

Descriptive Statistics for two samples of students, liberal arts majors (n = 317) and engineering majors (n = 592) include

      Liberal arts majors    Engineering majors
X̄     2.59                   2.29
S     1.00                   1.10

The smaller the mean, the more students agree with the statement.

The formula for a t-test of mean differences for independent samples is

t = (X̄1 - X̄2) / s(X̄1 - X̄2)

With s(X̄1 - X̄2) being the standard error of the mean difference

sp is a weighted average of sample standard deviations.

In this situation the hypothesis:

Pooled Std. dev = 1.07

Tts = (2.59 - 2.29) / .07 = .30 / .07 = 4.29
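The pooled standard deviation and the test statistic can be recomputed without intermediate rounding. This is a sketch that assumes the usual pooled-variance form, with each sample variance weighted by its degrees of freedom; the variable names are illustrative.

```python
import math

n1, mean1, s1 = 317, 2.59, 1.00   # liberal arts majors
n2, mean2, s2 = 592, 2.29, 1.10   # engineering majors

# Pooled variance: sample variances weighted by their degrees of freedom
pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)
pooled_sd = math.sqrt(pooled_var)

# Standard error of the difference between the two sample means
std_error = pooled_sd * math.sqrt(1 / n1 + 1 / n2)
t_ts = (mean1 - mean2) / std_error

print(f"pooled sd = {pooled_sd:.2f}, standard error = {std_error:.3f}, t = {t_ts:.2f}")
```

The pooled standard deviation matches the slide's 1.07. The unrounded standard error is a little above .07, so t comes out somewhat below the slide's 4.29, which divides by the rounded .07. Either way the difference is clearly significant.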

Statistical techniques

Analysis of Variance (ANOVA)

Correlation Analysis

Regression Analysis

Analysis of Variance

• ANOVA mainly used for analysis of experimental data

• Ratio of “between-treatment” variance and “within- treatment” variance

Analysis of Variance (ANOVA)

Response variable - dependent variable (Y)

Factor(s) - independent variables (X)

Treatments - different levels of factors (r1, r2, r3, …)

One - Factor Analysis of Variance

Studies the effect of 'r' treatments on one response variable

Determine whether or not there are any statistically significant differences between the treatment means μ1, μ2, ..., μr

Ho: all treatments have same effect on mean responses

H1: At least 2 of μ1, μ2, ..., μr are different

Example (Book p.495)

               Product Sales
Price Level    1     2     3     4     5     Total    X̄p
39 ¢           8     12    10    9     11    50       10
44 ¢           7     10    6     8     9     40       8
49 ¢           4     8     7     9     7     35       7

Overall sample mean: X̄ = 8.333

Overall sample size: n = 15

No. of observations per price level: np = 5


One - Factor ANOVA - Intuitively

If the ratio (Between Treatment Variance) / (Within Treatment Variance)

is large then there are differences between treatments

is small then there are no differences between treatments

To Test Hypothesis, Compute the Ratio Between the "Between Treatment" Variance and "Within Treatment" Variance

One - Factor ANOVA Table

Source of Variation       Variation (SS)    Degrees of Freedom    Mean Sum of Squares    F-ratio
Between (price levels)    SSr               r-1                   MSSr = SSr/(r-1)       MSSr / MSSu
Within (price levels)     SSu               n-r                   MSSu = SSu/(n-r)
Total                     SSt               n-1

Between Treatment Variance

SSr = Σ(p=1..r) np (X̄p - X̄)² = 23.3

Within-treatment variance

SSu = Σ(p=1..r) Σ(i=1..np) (Xip - X̄p)² = 34

SSr = treatment sums of squares

r = number of groups

np = sample size in group ‘p’

X̄p = mean of group p

X̄ = overall mean

Xip = sales at store i at level p

Between variance estimate (MSSr)

MSSr = SSr/(r-1) = 23.3/2 = 11.65

Within variance estimate (MSSu)

MSSu = SSu/(n-r) = 34/12 = 2.8

Where n = total sample size

r = number of groups

Total variation (SSt): SSt = SSr + SSu = 23.3+34 = 57.3

F-statistic: F = MSSr / MSSu = 11.65/2.8 = 4.16

DF: (r-1), (n-r) = 2, 12

Critical value from table: CV(α, df) = 3.89
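The whole one-factor ANOVA for the price-level example can be reproduced from the raw sales figures. This is a minimal sketch; the function name `one_way_anova` is illustrative and the critical value is the one quoted from the table.

```python
def one_way_anova(groups):
    """Return (SS between, SS within, F-ratio, (df between, df within))."""
    all_obs = [x for g in groups for x in g]
    n, r = len(all_obs), len(groups)
    grand_mean = sum(all_obs) / n
    group_means = [sum(g) / len(g) for g in groups]
    # Between-treatment: group size times squared distance of group mean from grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Within-treatment: squared distance of each observation from its own group mean
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    f_ratio = (ss_between / (r - 1)) / (ss_within / (n - r))
    return ss_between, ss_within, f_ratio, (r - 1, n - r)


# Sales at five stores for each price level (in cents)
sales_by_price = {
    39: [8, 12, 10, 9, 11],
    44: [7, 10, 6, 8, 9],
    49: [4, 8, 7, 9, 7],
}
ss_between, ss_within, f_ratio, dfs = one_way_anova(list(sales_by_price.values()))
critical_value = 3.89  # from the slides, for df = (2, 12)
print(f"SSr = {ss_between:.1f}, SSu = {ss_within:.0f}, F = {f_ratio:.2f}, df = {dfs}")
print("reject Ho" if f_ratio > critical_value else "do not reject Ho")
```

Without intermediate rounding the F-ratio comes out slightly below the slide's 4.16, which divides 11.65 by the rounded 2.8. It still exceeds the critical value, so the hypothesis of equal treatment means is rejected.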
