Date post: | 02-Jan-2016 |
Principles of Statistics
Assoc. Prof. Dr. Abdul Hamid b. Hj. Mar Iman
Former Director, Centre for Real Estate Studies
Faculty of Geoinformation Science and Engineering,
Universiti Teknologi Malaysia, Skudai, Johor.
E-mail: [email protected]
Hypothesis Testing
Content:
• Concepts of hypothesis testing
• Test of statistical significance
• Hypothesis testing one variable at a time
Hypothesis
• Unproven proposition
• Supposition that tentatively explains certain facts or phenomena
• Assumption about nature of the world
• E.g. the mean price of three-bedroom single-storey houses in Skudai is RM 155,000.
Hypothesis (contd.)
• An unproven proposition or supposition that tentatively explains certain facts or phenomena:
  – Null hypothesis
  – Alternative hypothesis
• Null hypothesis is that there is no systematic relationship between independent variables (IVs) and dependent variables (DVs).
• Research hypothesis is that any relationship observed in the data is real.
Null Hypothesis
• Statement about the status quo
• No difference
• Statistically expressed as:
Ho: b = 0
where b is the parameter under test, estimated from the sample to describe the population.
Alternative Hypothesis
• Statement that indicates the opposite of the null hypothesis
• There is difference
• Statistically expressed as:
H1: b ≠ 0
H1: b < 0
H1: b > 0
Significance Level
• Critical probability in choosing between the Ho and H1.
• Put simply, the cut-off point (COP) beyond which a result is too unlikely to be attributed to chance.
• Tells how likely a result is to be due to chance.
• The most common confidence level, used to mean “something is good enough to be believed”, is .95.
• Loosely, it means the finding has a 95% chance of being true.
• What is the COP at 95% chance?
Significance Level (contd.)
• Denoted as α
• Tells how much of the probability mass is in the tails of a given distribution
• The probability or significance level selected is typically .05 or .01
• A probability below this level is too low to warrant support for the null hypothesis
• In other words, it gives strong grounds to support the alternative hypothesis
• Main purpose of statistical testing: to reject the null hypothesis
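As an illustration of how α fixes the cut-off point, the sketch below looks up the two-tailed critical value of the standard normal distribution for the two usual levels. It assumes a normally distributed test statistic and uses only Python's standard library; the function name `two_tailed_critical_value` is illustrative and does not come from the slides.

```python
from statistics import NormalDist

def two_tailed_critical_value(alpha: float) -> float:
    """Return Zc such that alpha/2 of the probability mass lies in each tail."""
    return NormalDist().inv_cdf(1 - alpha / 2)

for alpha in (0.05, 0.01):
    print(f"alpha = {alpha}: Zc = {two_tailed_critical_value(alpha):.3f}")
# alpha = 0.05: Zc = 1.960
# alpha = 0.01: Zc = 2.576
```

A smaller α pushes the cut-off point further into the tails, so stronger evidence is needed before the null hypothesis is rejected.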
Suppose we have the following relationship:
$$Y_i = \beta + e_i, \quad i = 1, \ldots, T, \quad e_i \sim N(0, \sigma^2) \tag{1}$$
The least squares estimator for β is:
$$b = \frac{1}{T} \sum_{i=1}^{T} Y_i \tag{2}$$
with the following properties:
$$1)\; E[b] = \beta \tag{3a}$$
$$2)\; \mathrm{Var}(b) = E[(b - \beta)^2] = \sigma^2 / T \tag{3b}$$
$$3)\; b \sim N(\beta, \sigma^2 / T) \tag{3c}$$
The “standardized” normal random variable for b is:
$$Z = \frac{b - \beta}{\sqrt{\sigma^2 / T}} \sim N(0, 1) \tag{4}$$
The critical value of Z, i.e. $Z_c$, such that α = 0.05 of the probability mass is in the tails of the distribution, is given by:
$$P[Z \ge 1.96] = P[Z \le -1.96] = 0.025 \tag{5a}$$
and
$$P[-1.96 \le Z \le 1.96] = 1 - 0.05 = 0.95 \tag{5b}$$
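Substituting the standardized normal variable (Eqn. 4) into Eqn. (5b), we get:
$$P\left[-1.96 \le \frac{b - \beta}{\sqrt{\sigma^2 / T}} \le 1.96\right] = 0.95 \tag{6}$$
Solving for β, we get:
$$P\left[b - 1.96\,\sigma/\sqrt{T} \le \beta \le b + 1.96\,\sigma/\sqrt{T}\right] = 0.95 \tag{7}$$
In general:
$$P\left[b - Z_c\,\sigma/\sqrt{T} \le \beta \le b + Z_c\,\sigma/\sqrt{T}\right] = 1 - \alpha \tag{8a}$$
Also, for a 2-tail test:
$$P\left[\frac{b - \beta}{\sigma/\sqrt{T}} \le -Z_c\right] = P\left[\frac{b - \beta}{\sigma/\sqrt{T}} \ge Z_c\right] = \alpha/2 \tag{8b}$$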
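One way to check the interval in Eqn. (8a) is by simulation: draw many samples from the model in Eqn. (1), build the interval each time, and count how often it contains the true β. The sketch below does this with hypothetical values (β = 10, σ = 2, T = 30) and assumes σ is known; the function names are illustrative only.

```python
import random
from statistics import NormalDist, fmean

def z_interval(sample, sigma, alpha=0.05):
    """Interval b -/+ Zc*sigma/sqrt(T) from Eqn. (8a), with sigma assumed known."""
    T = len(sample)
    b = fmean(sample)                          # least squares estimator, Eqn. (2)
    zc = NormalDist().inv_cdf(1 - alpha / 2)   # two-tailed critical value
    half_width = zc * sigma / T ** 0.5
    return b - half_width, b + half_width

def coverage(beta, sigma, T, trials=2000, seed=1):
    """Share of simulated samples whose interval contains the true beta."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(beta, sigma) for _ in range(T)]
        lower, upper = z_interval(sample, sigma)
        hits += lower <= beta <= upper
    return hits / trials

# Hypothetical values, chosen only for illustration.
print(coverage(beta=10.0, sigma=2.0, T=30))
```

The printed share should be close to 1 − α = 0.95, which is the claim made by Eqn. (8a).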
Example
You suspect that the mean rental of 225 purpose-built office units in Johor is RM 3.00/sq.ft. If the std. dev. is RM 1.50/sq.ft., what is the 95% confidence interval of the mean?
The null hypothesis that the mean is equal to 3.0:
Ho: μ = 3.0
The alternative hypothesis that the mean is not equal to 3.0:
H1: μ ≠ 3.0
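The confidence interval asked for in this example can be worked out with a few lines of Python. This is a minimal sketch that assumes the RM 3.00 figure is treated as the sample mean of the 225 units and that the normal critical value applies.

```python
from statistics import NormalDist

mean_rental = 3.00   # RM/sq.ft., treated here as the sample mean (assumption)
std_dev = 1.50       # RM/sq.ft.
n = 225              # office units

zc = NormalDist().inv_cdf(0.975)     # two-tailed critical value for 95%
std_error = std_dev / n ** 0.5       # 1.50 / 15 = 0.10
lower = mean_rental - zc * std_error
upper = mean_rental + zc * std_error
print(f"95% CI: RM {lower:.2f} to RM {upper:.2f} per sq.ft.")
# 95% CI: RM 2.80 to RM 3.20 per sq.ft.
```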
Type I and Type II Errors in Hypothesis Testing

State of null hypothesis     Decision
in the population            Accept Ho             Reject Ho
Ho is true                   Correct (no error)    Type I error
Ho is false                  Type II error         Correct (no error)
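The two error types can be given probabilities once a true mean is assumed. The sketch below reuses the figures of the office-rental example (Ho: μ = 3.0, std. dev. 1.50, 225 units) for a two-tailed Z test; the alternative true mean of RM 3.20 is a hypothetical value chosen for illustration, and the function name is not from the slides.

```python
from statistics import NormalDist

def rejection_probability(true_mean, null_mean, sigma, n, alpha=0.05):
    """Probability that a two-tailed Z test rejects Ho for a given true mean."""
    z = NormalDist()
    zc = z.inv_cdf(1 - alpha / 2)
    shift = (true_mean - null_mean) / (sigma / n ** 0.5)
    # Ho is kept when the test statistic falls between -Zc and +Zc.
    prob_accept = z.cdf(zc - shift) - z.cdf(-zc - shift)
    return 1 - prob_accept

# Ho is true: any rejection is a Type I error, so its probability equals alpha.
type_1 = rejection_probability(3.0, 3.0, sigma=1.5, n=225)
# Ho is false (hypothetical true mean of RM 3.20): accepting Ho is a Type II error.
type_2 = 1 - rejection_probability(3.2, 3.0, sigma=1.5, n=225)
print(f"P(Type I) = {type_1:.3f}, P(Type II) = {type_2:.3f}")
```

The Type I probability is fixed by the chosen significance level, while the Type II probability depends on how far the true mean lies from the hypothesised one.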
Example
You estimate the average price, μ, of single- and double-storey houses in Malaysia’s major industrialised towns to be RM 1,600/sq.m. Based on a sample of 101 houses, you found that the mean price, X̄, is RM 1,579.44/sq.m. with a std. dev. of RM 350.13/sq.m.
(a) Would you reject your initial estimate at the 0.05 significance level?
(b) What is the confidence interval of the price at the 5% s.l.?
Answer (a)
Ho: μ = 1,600
H1: μ ≠ 1,600
Test statistic:
$$Z = \frac{1{,}579.44 - 1{,}600}{350.13/\sqrt{101}} = \frac{-20.56}{34.84} \approx -0.59$$
For a 2-tail test at α = 0.05: $P[Z \ge Z_c] = P[Z \le -Z_c] = 0.025$
From the Z-table, Zc = 1.96
Since |Z| = 0.59 < Zc, do not reject Ho. ∴ The initial estimate of RM 1,600/sq.m. stands.
Answer (b)
1,579.44 − 1.96(34.84) = RM 1,511.15 (lower limit)
1,579.44 + 1.96(34.84) = RM 1,647.73 (upper limit)
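The test in answer (a) can be reproduced with the short sketch below. It assumes a two-tailed test with α/2 in each tail, as in Eqn. (8b), so the critical value is looked up at 1 − α/2; the variable names are illustrative.

```python
from statistics import NormalDist

mu_0 = 1600.0        # hypothesised mean price, RM/sq.m.
x_bar = 1579.44      # sample mean, RM/sq.m.
s = 350.13           # sample standard deviation, RM/sq.m.
n = 101              # sample size
alpha = 0.05

std_error = s / n ** 0.5
z = (x_bar - mu_0) / std_error
zc = NormalDist().inv_cdf(1 - alpha / 2)   # two-tailed critical value
reject = abs(z) > zc

print(f"SE = {std_error:.2f}, Z = {z:.2f}, Zc = {zc:.2f}, reject Ho: {reject}")
# SE = 34.84, Z = -0.59, Zc = 1.96, reject Ho: False
```

Because the test is two-tailed, the decision compares the absolute value of Z with the critical value.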
t-Distribution
• Symmetrical, bell-shaped distribution
• Mean of zero and a standard deviation that approaches one as the degrees of freedom increase
• Shape influenced by degrees of freedom
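Confidence Interval Estimate Using the t-distribution
$$\mu = \bar{X} \pm t_{c.l.} S_{\bar{X}}$$
or
$$\text{Upper limit} = \bar{X} + t_{c.l.} \frac{S}{\sqrt{n}}$$
$$\text{Lower limit} = \bar{X} - t_{c.l.} \frac{S}{\sqrt{n}}$$
where
$\mu$ = population mean
$\bar{X}$ = sample mean
$t_{c.l.}$ = critical value of t at a specified confidence level
$S_{\bar{X}}$ = standard error of the mean
$S$ = sample standard deviation
$n$ = sample size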
Confidence Interval Estimate Using the t-distribution
Suppose that a production manager believes the average number of defective assemblies each day to be 20. The factory records the number of defective assemblies for each of the 25 days it was open in a given month. The mean, X̄, was calculated to be 22, and the standard deviation, S, to be 5.
Univariate Hypothesis Test Utilizing the t-Distribution
The researcher desires 95 percent confidence, so the significance level is .05. The researcher must then find the upper and lower limits of the confidence interval to determine the region of rejection. Thus, the value of t is needed. For 24 degrees of freedom (n − 1 = 25 − 1), the t-value is 2.064.
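The limits for this example can be sketched as follows. The code assumes the limits are centred on the hypothesised mean of 20, so that a sample mean outside them falls in the region of rejection; the t-value of 2.064 is taken from the text rather than computed.

```python
n = 25               # days observed
x_bar = 22.0         # sample mean of defective assemblies per day
s = 5.0              # sample standard deviation
mu_0 = 20.0          # manager's hypothesised mean
t_crit = 2.064       # t-value for 24 degrees of freedom, as quoted in the text

std_error = s / n ** 0.5                  # 5 / 5 = 1.0
lower = mu_0 - t_crit * std_error
upper = mu_0 + t_crit * std_error
reject = not (lower <= x_bar <= upper)

print(f"Limits: {lower:.3f} to {upper:.3f}; reject Ho: {reject}")
# Limits: 17.936 to 22.064; reject Ho: False
```

The sample mean of 22 lies just inside the upper limit, so under these assumptions the manager's belief of 20 defective assemblies per day is not rejected.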
Testing a Hypothesis about a Distribution
• Chi-Square test
• Test for significance in the analysis of frequency distributions
• Compare observed frequencies with expected frequencies
• “Goodness of Fit”
Chi-Square Test
$$\chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}$$
where
$\chi^2$ = chi-square statistic
$O_i$ = observed frequency in the ith cell
$E_i$ = expected frequency in the ith cell
Chi-Square Test: Estimation of Expected Number for Each Cell
$$E_{ij} = \frac{R_i C_j}{n}$$
where
$R_i$ = total observed frequency in the ith row
$C_j$ = total observed frequency in the jth column
$n$ = sample size
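To show how the expected number for each cell and the chi-square statistic fit together, here is a minimal sketch in plain Python. It assumes the standard definitions, expected count = row total × column total / n and chi-square = sum of (O − E)²/E over all cells. The 2×2 table is hypothetical and the function names are illustrative.

```python
def expected_counts(observed):
    """Expected frequency for each cell: E_ij = R_i * C_j / n."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def chi_square(observed):
    """Sum of (O - E)^2 / E over all cells of the table."""
    expected = expected_counts(observed)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )

# Hypothetical 2x2 table of observed frequencies (not from the slides).
observed = [[30, 20],
            [20, 30]]
print(expected_counts(observed))   # [[25.0, 25.0], [25.0, 25.0]]
print(chi_square(observed))        # 4.0
```

For a 2×2 table there is one degree of freedom, and the critical chi-square value at the .05 level is 3.841, so a statistic of 4.0 would lead to rejecting the null hypothesis for this hypothetical table.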
Hypothesis Test of a Proportion
π is the population proportion
p is the sample proportion
π is estimated with p
Hypothesis Test of a Proportion: Another Example
$$p = .20, \quad n = 1{,}200$$
$$S_p = \sqrt{\frac{pq}{n}} = \sqrt{\frac{(.2)(.8)}{1200}} = \sqrt{\frac{.16}{1200}} = \sqrt{.000133} = .0115$$
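The standard error in this example can be checked numerically. The sketch below also adds a 95% interval around the sample proportion, which is an assumption for illustration (normal critical value of 1.96) and does not appear in the slides.

```python
p = 0.20             # sample proportion
n = 1200             # sample size
q = 1 - p

s_p = (p * q / n) ** 0.5      # standard error of the proportion
print(f"S_p = {s_p:.4f}")
# S_p = 0.0115

# Assumption: a 95% interval with the usual normal critical value of 1.96.
lower = p - 1.96 * s_p
upper = p + 1.96 * s_p
print(f"95% CI for the proportion: {lower:.3f} to {upper:.3f}")
# 95% CI for the proportion: 0.177 to 0.223
```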