Statistical Model - fisher.utstat.toronto.edufisher.utstat.toronto.edu/~hadas/STA286/Lecture...

STA286 week 9 1

Statistical Model

• A statistical model for some data is a set of distributions $\{f_\theta : \theta \in \Omega\}$, one of which corresponds to the true unknown distribution that produced the data.

• The statistical model corresponds to the information a statistician brings to the application about what the true distribution is, or at least what he or she is willing to assume about it.

• The variable θ is called the parameter of the model, and the set Ω is called the parameter space.

• From the definition of a statistical model, we see that there is a unique value $\theta \in \Omega$ such that fθ is the true distribution that generated the data. We refer to this value as the true parameter value.

Examples

• Suppose there are two manufacturing plants for machines. It is known that the life lengths of machines built by the first plant have an Exponential(1) distribution, while machines manufactured by the second plant have life lengths distributed Exponential(1.5). You have purchased five of these machines and you know that all five came from the same plant, but you do not know which plant. Further, you observe the life lengths of these machines, obtaining a sample (x1, …, x5), and want to make inference about the true distribution of the life lengths of these machines.

• Suppose we have observations of heights in cm of individuals in a population and we feel that it is reasonable to assume that the distribution of height in the population is normal with some unknown mean and variance. The statistical model in this case is $\{f_{(\mu,\sigma^2)} : (\mu,\sigma^2) \in \Omega\}$, where Ω = R×R+ and R+ = (0, ∞).

Point Estimate

• Most statistical procedures involve estimation of the unknown value of the parameter of the statistical model.

• A point estimate, $\hat\theta = \hat\theta(x_1, \dots, x_n)$, is an estimate of the parameter θ. It is a statistic based on the sample and therefore it is a random variable with a distribution function.

• The standard deviation of the sampling distribution of an estimator is usually called the standard error of the estimator.

• For a given statistical model with unknown parameter θ there could be more than one point estimate.

• The parameter θ of a statistical model can have more than just one element.

Properties of Point Estimators

• Let $\hat\theta$ be a point estimator for a parameter θ. Then $\hat\theta$ is an unbiased estimator if $E(\hat\theta) = \theta$.

• The variance of a point estimator is $\mathrm{var}(\hat\theta) = E(\hat\theta^2) - \left(E(\hat\theta)\right)^2$.

Among all unbiased estimators of a parameter θ, the one with the smallest variance is called the most efficient estimator of θ.

Common Point Estimators

• A natural estimate for the population mean μ is the sample mean (in any distribution). The sample mean is an unbiased estimator of the population mean.

• A common estimator for the population variance is the sample variance s2.


Claim

• Let X1, X2,…, Xn be a random sample of size n from a normal population. The sample variance s2 is an unbiased estimator of the population variance σ2.

• Proof…
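
The claim can be checked empirically. The sketch below is a simulation, not a proof: it draws many normal samples (the values of mu, sigma, n and the seed are arbitrary choices made for illustration) and averages three estimators across samples. The averages of the sample mean and of s2 should settle near μ and σ2, while the variance computed with divisor n should fall short by roughly the factor (n-1)/n.

```python
import random

def average_estimates(mu=10.0, sigma=2.0, n=5, reps=20000, seed=286):
    """Average x-bar, s^2 (divisor n-1) and the divisor-n variance over many samples."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    sum_mean = sum_s2 = sum_div_n = 0.0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = sum(sample) / n
        ss = sum((x - xbar) ** 2 for x in sample)  # sum of squared deviations
        sum_mean += xbar
        sum_s2 += ss / (n - 1)
        sum_div_n += ss / n
    return sum_mean / reps, sum_s2 / reps, sum_div_n / reps

avg_mean, avg_s2, avg_div_n = average_estimates()
print(f"average of x-bar           : {avg_mean:.3f} (mu = 10)")
print(f"average of s^2 (n-1)       : {avg_s2:.3f} (sigma^2 = 4)")
print(f"average of variance with n : {avg_div_n:.3f} (expected about 3.2)")
```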

Example

• Suppose X1, X2,…, Xn are i.i.d. Poisson(λ). Let $\hat\lambda = \bar X$, then…

The Likelihood Function - Introduction

• Recall: a statistical model for some data is a set of distributions $\{f_\theta : \theta \in \Omega\}$, one of which corresponds to the true unknown distribution that produced the data.

• The distribution fθ can be either a probability density function or a probability mass function.

• The joint probability density function or probability mass function of iid random variables X1, …, Xn is $f_\theta(x_1, \dots, x_n) = \prod_{i=1}^{n} f_\theta(x_i)$.

The Likelihood Function

• Let x1, …, xn be sample observations taken on corresponding random variables X1, …, Xn whose distribution depends on a parameter θ. The likelihood function defined on the parameter space Ω is given by $L(\theta \mid x_1, \dots, x_n) = f_\theta(x_1, \dots, x_n)$.

• Note that for the likelihood function we are fixing the data, x1,…, xn, and varying the value of the parameter.

• The value L(θ | x1, …, xn) is called the likelihood of θ. It is the probability (or, for continuous data, the density) of the data values we observed given that θ is the true value of the parameter. It is not the probability of θ given that we observed x1, …, xn.

Examples

• Suppose we toss a coin n = 10 times and observed 4 heads. With no knowledge whatsoever about the probability of getting a head on a single toss, the appropriate statistical model for the data is the Binomial(10, θ) model. The likelihood function is given by

• Suppose X1, …, Xn is a random sample from an Exponential(θ) distribution. The likelihood function is
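
To make the two likelihood functions concrete, here is a minimal sketch. For the coin, the Binomial(10, θ) model with 4 observed heads gives $L(\theta \mid 4) = \binom{10}{4}\theta^4(1-\theta)^6$. For the exponential sample, the code assumes the rate parametrization $f_\theta(x) = \theta e^{-\theta x}$; under the mean parametrization the formula changes accordingly. The function names and the lifetimes are made up for illustration.

```python
from math import comb, exp

def binomial_likelihood(theta, n=10, heads=4):
    """L(theta | heads): Binomial(n, theta) probability of the observed count."""
    return comb(n, heads) * theta**heads * (1 - theta) ** (n - heads)

def exponential_likelihood(theta, data):
    """L(theta | x_1..x_n) = theta^n * exp(-theta * sum(x)), rate form (assumed)."""
    return theta ** len(data) * exp(-theta * sum(data))

# The data stay fixed; only theta varies.
for theta in (0.2, 0.4, 0.6):
    print(f"coin: L({theta}) = {binomial_likelihood(theta):.4f}")

lifetimes = [0.8, 1.9, 0.3, 1.2, 0.6]  # hypothetical sample, not from the lecture
for theta in (1.0, 1.5):
    print(f"exponential: L({theta}) = {exponential_likelihood(theta, lifetimes):.5f}")
```

Among the three coin values tried, θ = 0.4 gives the largest likelihood, which matches the observed proportion of heads.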

Maximum Likelihood Estimators

• In the likelihood function, different values of θ will attach different probabilities to a particular observed sample.

• The likelihood function, L(θ | x1, …, xn), can be maximized over θ, to give the parameter value that attaches the highest possible probability to a particular observed sample.

• We can maximize the likelihood function to find an estimator of θ.

• This estimator is a statistic: it is a function of the sample data. It is denoted by $\hat\theta$.

The log likelihood function

• l(θ) = ln(L(θ)) is the log likelihood function.

• Both the likelihood function and the log likelihood function have their maximums at the same value of $\hat\theta$.

• It is often easier to maximize l(θ).
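
As an illustration of why the log likelihood is convenient, the sketch below maximizes l(θ) numerically for an exponential sample and compares the result with the calculus answer. It assumes the rate parametrization $f_\theta(x) = \theta e^{-\theta x}$, for which $l(\theta) = n\ln\theta - \theta\sum x_i$ and setting $l'(\theta) = 0$ gives $\hat\theta = 1/\bar x$. The five lifetimes are hypothetical. The same function also handles the two-plant example, where the parameter space contains only the values 1 and 1.5.

```python
from math import log

def log_likelihood(theta, data):
    """l(theta) = n*ln(theta) - theta*sum(x) for an Exponential sample (rate form, assumed)."""
    return len(data) * log(theta) - theta * sum(data)

lifetimes = [0.8, 1.9, 0.3, 1.2, 0.6]  # hypothetical life lengths of five machines

# Crude grid search over theta in (0, 5]; fine enough for illustration.
grid = [i / 1000 for i in range(1, 5001)]
theta_grid = max(grid, key=lambda t: log_likelihood(t, lifetimes))

theta_closed_form = len(lifetimes) / sum(lifetimes)  # 1 / x-bar
print(f"grid maximizer : {theta_grid:.3f}")
print(f"1 / x-bar      : {theta_closed_form:.3f}")

# Two-plant model: the parameter space is just {1, 1.5}.
best_plant_rate = max((1.0, 1.5), key=lambda t: log_likelihood(t, lifetimes))
print(f"more likely rate among {{1, 1.5}}: {best_plant_rate}")
```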

Examples


Important Comment

• Some MLEs cannot be determined using calculus. This occurs whenever the support of the distribution is a function of the parameter θ.

• These are best solved by graphing the likelihood function.

• Example:

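
The example from the slide is not reproduced in this text. A common illustration (an assumption here, not necessarily the one used in lecture) is a Uniform(0, θ) sample: the likelihood is $\theta^{-n}$ for $\theta \ge \max x_i$ and 0 otherwise, so it jumps at $\max x_i$ and has no stationary point there. Evaluating it on a grid, as one would when graphing it, shows that the maximum sits at the largest observation. The data are made up.

```python
def uniform_likelihood(theta, data):
    """L(theta) for Uniform(0, theta): theta^(-n) if theta covers every observation, else 0."""
    if theta < max(data):
        return 0.0
    return theta ** (-len(data))

data = [2.1, 0.7, 3.4, 1.9]  # hypothetical observations

grid = [i / 10 for i in range(1, 61)]  # theta = 0.1, 0.2, ..., 6.0
values = [uniform_likelihood(t, data) for t in grid]
theta_hat = grid[values.index(max(values))]

print(f"largest observation: {max(data)}")
print(f"grid maximizer     : {theta_hat}")
```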

Confidence Intervals – Introduction

• A point estimate provides no information about the precision and reliability of estimation.

• For example, the sample mean $\bar X$ is a point estimate of the population mean μ, but because of sampling variability it is virtually never the case that $\bar x = \mu$.

• A point estimate says nothing about how close it might be to μ.

• An alternative to reporting a single sensible value for the parameter being estimated is to calculate and report an entire interval of plausible values: a confidence interval (CI).

Confidence level

• A confidence level is a measure of the degree of reliability of a confidence interval. It is denoted as 100(1-α)%.

• The most frequently used confidence levels are 90%, 95% and 99%.

• A confidence level of 100(1-α)% means that, over repeated sampling, 100(1-α)% of all samples would produce an interval that includes the true value of the parameter being estimated.

• The higher the confidence level, the more strongly we believe that the true value of the parameter being estimated lies within the interval.

• Ideally, we want a short interval with a high degree of confidence.
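
The repeated-sampling reading of the confidence level can be illustrated by simulation. The sketch below uses the known-σ interval $\bar x \pm z_{\alpha/2}\cdot\sigma/\sqrt{n}$ for a normal population (all numeric settings are arbitrary choices for illustration), builds the interval for many independent samples, and reports the fraction of intervals that contain μ. That fraction should be close to 0.95.

```python
import random
from statistics import NormalDist

def coverage(mu=50.0, sigma=10.0, n=25, alpha=0.05, reps=4000, seed=9):
    """Fraction of simulated 100(1-alpha)% z-intervals that contain the true mean."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(reps):
        xbar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if xbar - half_width < mu < xbar + half_width:
            hits += 1
    return hits / reps

observed = coverage()
print(f"observed coverage: {observed:.3f}")
```

Any single interval either contains μ or does not; the 95% describes the procedure, not one realized interval.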


CI for μ When σ is Known

• Suppose X1, X2,…,Xn is a random sample from some population with unknown mean μ and known variance σ2.

• A 100(1-α)% confidence interval for μ is $\bar x \pm z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$.

• Proof:


Example

• The National Student Loan Survey collected data about the amount of money that borrowers owe. The survey selected a random sample of 1280 borrowers who began repayment of their loans between four and six months prior to the study. The mean debt for the selected borrowers was $18,900 and the standard deviation was $49,000. Find a 95% CI for the mean debt for all borrowers.
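
A minimal sketch of the calculation, assuming the large-sample interval $\bar x \pm z_{\alpha/2}\cdot s/\sqrt{n}$ with the sample standard deviation standing in for σ (reasonable here because n = 1280 is large):

```python
from math import sqrt
from statistics import NormalDist

n, xbar, s = 1280, 18_900.0, 49_000.0  # figures quoted in the example
z = NormalDist().inv_cdf(0.975)        # z_{0.025}, about 1.96

margin = z * s / sqrt(n)
lower, upper = xbar - margin, xbar + margin
print(f"margin of error: {margin:,.0f}")
print(f"95% CI for mean debt: ({lower:,.0f}, {upper:,.0f})")
```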


Width and Precision of CI

• The precision of an interval is conveyed by the width of the interval.

• If the confidence level is high and the resulting interval is quite narrow, our knowledge of the value of the parameter is reasonably precise.

• A very wide CI implies that there is a great deal of uncertainty concerning the value of the parameter we are estimating.

• The width of the CI for μ is ….

Sample Size for Desired Error

• A (1-α)100% confidence interval for the population mean will have an error that does not exceed a specified amount e when the sample size is $n \ge \left( \frac{z_{\alpha/2}\,\sigma}{e} \right)^2$.

• Example: A limnologist wishes to estimate the mean phosphate content per unit volume of lake water. It is known from previous studies that the standard deviation has a fairly stable value of 4 mg. How many water samples must the limnologist analyze to be 90% certain that the error of estimation does not exceed 0.8 mg?
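
The limnologist example can be worked directly from the sample-size formula. The sketch below plugs in σ = 4, e = 0.8 and α = 0.10 and rounds up, since n must be a whole number of samples; the variable names are illustrative.

```python
from math import ceil
from statistics import NormalDist

sigma, e, alpha = 4.0, 0.8, 0.10
z = NormalDist().inv_cdf(1 - alpha / 2)  # z_{0.05}, about 1.645

n_exact = (z * sigma / e) ** 2
n_required = ceil(n_exact)               # round up to meet the error bound
print(f"(z * sigma / e)^2 = {n_exact:.2f}")
print(f"samples required  = {n_required}")
```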


Important Comment

• Confidence intervals do not need to be central; any a and b that solve $P\left( a < \frac{\bar X - \mu}{\sigma/\sqrt{n}} < b \right) = 1 - \alpha$ define a 100(1-α)% CI for the population mean μ.


One Sided CI

• A CI gives both lower and upper bounds for the parameter being estimated.

• In some circumstances, an investigator will want only one of these two types of bound.

• A large sample upper confidence bound for μ is $\mu < \bar x + z_\alpha \cdot \frac{\sigma}{\sqrt{n}}$.

• A large sample lower confidence bound for μ is $\mu > \bar x - z_\alpha \cdot \frac{\sigma}{\sqrt{n}}$.

CI for μ When σ is Unknown

• Suppose X1, X2,…,Xn is a random sample from N(μ, σ2) where both μ and σ are unknown.

• If σ2 is unknown we can estimate it using s2 and use the $t_{n-1}$ distribution.

• A 100(1-α)% confidence interval for μ in this case is …


Large Sample CI for μ

• Recall: if the sample size is large, then the CLT applies and we have $\frac{\bar X - \mu}{\sigma/\sqrt{n}} \xrightarrow{d} Z \sim N(0, 1)$.

• A 100(1-α)% confidence interval for μ, from a large iid sample, is $\bar x \pm z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$.

• If σ2 is not known we estimate it with s2.

Example – Binomial Distribution

• Suppose X1, X2,…,Xn is a random sample from a Bernoulli(θ) distribution. A 100(1-α)% CI for θ is….

• Example…
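
The formula is left elided above. One standard large-sample choice, stated here as an assumption rather than as the interval derived in lecture, is the Wald interval $\hat\theta \pm z_{\alpha/2}\sqrt{\hat\theta(1-\hat\theta)/n}$ with $\hat\theta = \bar x$, which follows from the CLT with the variance θ(1-θ) estimated by plugging in $\hat\theta$. The counts below are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

n, successes = 100, 40                 # hypothetical Bernoulli sample
theta_hat = successes / n              # sample proportion = sample mean
z = NormalDist().inv_cdf(0.975)        # 95% confidence

se = sqrt(theta_hat * (1 - theta_hat) / n)  # plug-in standard error
lower, upper = theta_hat - z * se, theta_hat + z * se
print(f"theta_hat = {theta_hat:.2f}")
print(f"approximate 95% CI: ({lower:.3f}, {upper:.3f})")
```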

Prediction Intervals

• Sometimes, an experimenter may also be interested in predicting the possible value of a future observation.

• Suppose X1, X2,…,Xn is a random sample from a normal population with unknown mean μ and known variance σ2.

• A 100(1-α)% prediction interval for a future observation x0 is $\bar x \pm z_{\alpha/2} \cdot \sigma \sqrt{1 + \frac{1}{n}}$.

• If the variance σ2 is unknown, we estimate it by the sample variance S2 and use the t distribution with n-1 degrees of freedom. The interval is then $\bar x \pm t_{\alpha/2;\, n-1} \cdot s \sqrt{1 + \frac{1}{n}}$.
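
The factor $\sqrt{1 + 1/n}$ comes from adding two independent sources of variability: the future observation itself and the estimate of μ. A short sketch of the argument for the known-variance case, assuming $X_0$ is drawn from the same normal population independently of the sample:

$$
X_0 - \bar X \sim N\!\left(0,\; \sigma^2 + \frac{\sigma^2}{n}\right)
\quad\Longrightarrow\quad
\frac{X_0 - \bar X}{\sigma\sqrt{1 + \frac{1}{n}}} \sim N(0, 1),
$$

$$
P\left( \bar X - z_{\alpha/2}\,\sigma\sqrt{1 + \tfrac{1}{n}} \;<\; X_0 \;<\; \bar X + z_{\alpha/2}\,\sigma\sqrt{1 + \tfrac{1}{n}} \right) = 1 - \alpha .
$$

A prediction interval is therefore always wider than the corresponding CI for μ, and its width does not shrink to zero as n grows.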

Two Sample Problem

• Sometimes we are interested in comparing means in two independent populations (e.g. mean income for males and females).

• We will select two independent samples, one from each population, and use the sample means to estimate the difference between the population means.

• Example: A medical researcher is interested in the effect of added calcium in our diet on blood pressure. She conducted a randomized comparative experiment in which one group of subjects receives a calcium supplement and a control group gets a placebo.

Case 1 – Variances are known

• Two independent populations, variances known…

Example

• A regional IRS auditor runs a test on a sample of returns filed by March 15 to determine whether the average return this year is larger than last year. The sample data are shown here for a random sample of returns from each year.

              Last Year   This Year
Mean             380         410
Sample size      100         120

• Assume that the standard deviation of returns is known to be about 100 for both years. Find a 95% CI for the difference in average between this year and last year.
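
A sketch of the calculation for the IRS example, assuming the standard known-variance interval for a difference of means, $(\bar x_1 - \bar x_2) \pm z_{\alpha/2}\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}$, with this year taken as population 1:

```python
from math import sqrt
from statistics import NormalDist

mean_this, n_this = 410.0, 120   # this year
mean_last, n_last = 380.0, 100   # last year
sigma = 100.0                    # known standard deviation, both years

z = NormalDist().inv_cdf(0.975)  # 95% confidence
diff = mean_this - mean_last
se = sqrt(sigma**2 / n_this + sigma**2 / n_last)
lower, upper = diff - z * se, diff + z * se
print(f"difference in means: {diff:.1f}")
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```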

Case 2 – Variances Unknown but Equal

• Two independent populations, both are normal, variances are unknown but equal…


Example

• In a study of heart surgery, one issue was the effect of drugs called beta blockers on the pulse rate of patients during surgery. The available subjects were divided into two groups of 30 patients each. The pulse rate of each patient at a critical point during the operation was recorded. The treatment group had mean 65.2 and standard deviation 7.8. For the control group the mean was 70.3 and the standard deviation was 8.3.

• Find a 99% CI for the difference in mean pulse rates.

• Denoting the control group as 1 and the treatment group as 2, the solution is …

• The pooled standard deviation is $s_p = \sqrt{\frac{29 \cdot 8.3^2 + 29 \cdot 7.8^2}{30 + 30 - 2}} = 8.05$.

• The 99% CI is $(70.3 - 65.2) \pm 2.66 \cdot 8.05 \cdot \sqrt{\frac{1}{30} + \frac{1}{30}} = (-0.429, 10.629)$.

• Do beta blockers reduce the pulse rate?

• MINITAB command: Stat > Basic Statistics > 2 Sample t.
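
The arithmetic on this slide can be checked with a few lines. The sketch takes the critical value 2.66 directly from the slide rather than computing a t quantile; the variable names are illustrative.

```python
from math import sqrt

n1, mean1, s1 = 30, 70.3, 8.3  # control group
n2, mean2, s2 = 30, 65.2, 7.8  # treatment group (beta blockers)
t_crit = 2.66                  # critical value used on the slide (99%, 58 df)

# Pooled variance weights each sample variance by its degrees of freedom.
sp = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
margin = t_crit * sp * sqrt(1 / n1 + 1 / n2)
diff = mean1 - mean2
lower, upper = diff - margin, diff + margin

print(f"pooled standard deviation: {sp:.2f}")
print(f"99% CI: ({lower:.3f}, {upper:.3f})")
```

Small differences from the slide's endpoints in the third decimal come from rounding the pooled standard deviation to 8.05 before multiplying.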

Example

• A study compared various characteristics of 68 healthy and 33 failed firms. One of the variables was the ratio of current assets to current liabilities.

Row   Firms (Healthy/Failed)   Ratio
  1   h                        1.50
  2   h                        0.10
  3   h                        1.76
...
 99   f                        0.13
100   f                        0.88
101   f                        0.09

Stem-and-leaf of Ratio failed N = 33

Leaf Unit = 0.10

   5   0  00111
   7   0  22
  11   0  4455
  12   0  6
 (10)  0  8888899999
  11   1  111111
   5   1  33
   3   1  4
   2   1  6
   1   1
   1   2  0

Stem-and-leaf of Ratio healthy N = 68

Leaf Unit = 0.10

   1   0  1
   2   0  2
   2   0
   4   0  66
  10   0  899999
  15   1  00011
  19   1  2223
  26   1  4445555
  34   1  66666777
  34   1  88888889999
  23   2  0000111
  16   2  222223
  10   2  455
   7   2  6677
   3   2  8
   2   3  01

Two Sample T-Test and Confidence Interval

Two sample T for Ratio
Firms      N    Mean   StDev  SE Mean
failed    33   0.824   0.481    0.084
healthy   68   1.726   0.639    0.078

95% CI for mu (f) - mu (h): ( -1.129, -0.675)
T-Test mu (f) = mu (h) (vs <): T = -7.90  P = 0.0000  DF = 81

Two Sample T-Test and Confidence Interval (pooled test and CI)

Two sample T for Ratio
Firms(He   N    Mean   StDev  SE Mean
f         33   0.824   0.481    0.084
h         68   1.726   0.639    0.078

95% CI for mu (f) - mu (h): ( -1.151, -0.652)
T-Test mu (f) = mu (h) (vs <): T = -7.17  P = 0.0000  DF = 99
Both use Pooled StDev = 0.593
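
The two MINITAB analyses can be approximately reproduced from the summary statistics alone. The sketch assumes the first output is the unpooled analysis with Welch-Satterthwaite degrees of freedom truncated to an integer, which is consistent with DF = 81, and that the second is the pooled analysis. Because the inputs are rounded summaries, the last digit may differ from MINITAB's.

```python
from math import sqrt

n_f, mean_f, s_f = 33, 0.824, 0.481  # failed firms
n_h, mean_h, s_h = 68, 1.726, 0.639  # healthy firms
diff = mean_f - mean_h

# Pooled analysis: one common variance estimate, n_f + n_h - 2 degrees of freedom.
sp = sqrt(((n_f - 1) * s_f**2 + (n_h - 1) * s_h**2) / (n_f + n_h - 2))
t_pooled = diff / (sp * sqrt(1 / n_f + 1 / n_h))

# Unpooled analysis: separate variances, Welch-Satterthwaite degrees of freedom.
v_f, v_h = s_f**2 / n_f, s_h**2 / n_h
t_unpooled = diff / sqrt(v_f + v_h)
df_welch = (v_f + v_h) ** 2 / (v_f**2 / (n_f - 1) + v_h**2 / (n_h - 1))

print(f"pooled StDev = {sp:.3f}, T = {t_pooled:.2f}, DF = {n_f + n_h - 2}")
print(f"unpooled T = {t_unpooled:.2f}, DF = {int(df_welch)}")
```

The unpooled interval is slightly narrower here, and both intervals lie entirely below zero.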

Case 3 – Variances Unknown and Unequal

• Two independent populations, both approximately normal, variances are unknown and unequal...


Example

• The weight gains for n1 = n2 = 8 rats tested on diets 1 and 2 are summarized in the following table.

            Diet 1   Diet 2
n              8        8
Std dev.    0.033    0.070
mean          3.1      3.2

• Find a 95% CI for the difference in the average weight gain between the two diets.

Paired Observations

• In a matched pairs study, subjects are matched in pairs and the outcomes are compared within each matched pair. The experimenter can toss a coin to assign the two treatments to the two subjects in each pair. One situation calling for matched pairs is when observations are taken on the same subjects under different conditions.

• A matched pairs analysis is needed when there are two measurements or observations on each individual and we want to examine the difference. This corresponds to the case where the samples are not independent.

• For each individual (pair), we find the difference d between the measurements from that pair. Then we treat the di as one sample and use the one sample t confidence interval to estimate the difference between the treatment effects.

Example

• Seneca College offers summer courses in English. A group of 20 students were given the TOEFL test before the course and after the course. The results are summarized in the next slide.

• Find a 95% CI for the average improvement in the TOEFL score.

Data Display

Row  Student  Pretest  Posttest  improvement
  1        1       30        29           -1
  2        2       28        30            2
  3        3       31        32            1
  4        4       26        30            4
  5        5       20        16           -4
  6        6       30        25           -5
  7        7       34        31           -3
  8        8       15        18            3
  9        9       28        33            5
 10       10       20        25            5
 11       11       30        32            2
 12       12       29        28           -1
 13       13       31        34            3
 14       14       29        32            3
 15       15       34        32           -2
 16       16       20        27            7
 17       17       26        28            2
 18       18       25        29            4
 19       19       31        32            1
 20       20       29        32            3

• One sample t confidence interval for the improvement

T-Test of the Mean

Test of mu = 0.000 vs mu > 0.000

Variable    N    Mean   StDev  SE Mean     T      P
improvem   20   1.450   3.203    0.716  2.02  0.029

• MINITAB commands for the paired t-test: Stat > Basic Statistics > Paired t

Paired T-Test and Confidence Interval

Paired T for Posttest - Pretest

             N    Mean   StDev  SE Mean
Posttest    20   28.75    4.74     1.06
Pretest     20   27.30    5.04     1.13
Difference  20   1.450   3.203    0.716

95% CI for mean difference: (-0.049, 2.949)
T-Test of mean difference = 0 (vs > 0): T-Value = 2.02  P-Value = 0.029
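
The paired analysis can be reproduced from the 20 pretest and posttest scores in the data display. The sketch computes the within-student differences and applies the one sample t interval to them; the critical value 2.093 (95% confidence, 19 degrees of freedom) is read from a standard t table rather than computed.

```python
from math import sqrt
from statistics import mean, stdev

pretest  = [30, 28, 31, 26, 20, 30, 34, 15, 28, 20,
            30, 29, 31, 29, 34, 20, 26, 25, 31, 29]
posttest = [29, 30, 32, 30, 16, 25, 31, 18, 33, 25,
            32, 28, 34, 32, 32, 27, 28, 29, 32, 32]

# Work with the within-student differences, then treat them as one sample.
improvement = [post - pre for pre, post in zip(pretest, posttest)]
n = len(improvement)
d_bar = mean(improvement)
s_d = stdev(improvement)   # sample standard deviation (divisor n - 1)
se = s_d / sqrt(n)
t_stat = d_bar / se        # statistic for testing mean difference = 0

t_crit = 2.093             # t table value, 95% confidence, 19 df
lower, upper = d_bar - t_crit * se, d_bar + t_crit * se

print(f"mean = {d_bar:.3f}, StDev = {s_d:.3f}, SE = {se:.3f}, T = {t_stat:.2f}")
print(f"95% CI for mean difference: ({lower:.3f}, {upper:.3f})")
```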

Character Stem-and-Leaf Display

Stem-and-leaf of improvement N = 20
Leaf Unit = 1.0

   2  -0  54
   4  -0  32
   6  -0  11
   8   0  11
  (7)  0  2223333
   5   0  4455
   1   0  7

[Histogram of improvement: horizontal axis "improvement" (-4 to 8), vertical axis "Frequency" (0 to 6)]

Summary


One Sample Variance

• In many cases we will be interested in making inference about the population variance.

• Suppose X1, X2,…,Xn is a random sample from N(μ, σ2) where both μ and σ are unknown. A CI for σ2 is …

Example


Two Sample Variance

• In many cases we will be interested in comparing the variances of two independent populations.

Example
