Date post: | 16-Dec-2015 |
Upload: | javen-wittman |
Topic 7
Statistical Estimation and Sampling Distributions
Statistical Inference
• Statistical inference is a method for investigating properties of unknown population parameters (estimation) based on the results of sample statistics.
Point Estimates
• A parameter, denoted by θ, is an unknown property of a population, for example the mean, variance, proportion, or a particular quantile of the probability distribution.
• A statistic is a property of a sample, for example the sample mean, sample variance, sample proportion, or a particular sample quantile.
• Estimation is a procedure by which the information contained within a sample is used to investigate properties of the population from which the sample is drawn.
Point Estimates of Parameters
• A point estimate of an unknown parameter θ is a statistic $\hat{\theta}$ that represents a “best guess” at the value of θ. There may be more than one sensible point estimate of a parameter. For example:
Estimate population parameter (θ) | with sample statistic ($\hat{\theta}$)
Mean (µ) | $\bar{X}$
Standard deviation (σ) | $S$
Proportion (p) | $\hat{p}$
Unbiased and Biased Point Estimates
A point estimate $\hat{\theta}$ for a parameter θ is said to be unbiased if
$$E(\hat{\theta}) = \theta$$
Unbiasedness is a very good property for a point estimate to possess.
If a point estimate is not unbiased, then its bias is defined to be
$$\text{Bias} = E(\hat{\theta}) - \theta$$
Point Estimate of a Success Probability
The obvious point estimate of p is
$$\hat{p} = \frac{X}{n}$$
Notice that the number of successes X has a binomial distribution, X ~ B(n, p). Therefore
$$E(X) = np$$
$$E(\hat{p}) = E\left(\frac{X}{n}\right) = \frac{1}{n}E(X) = \frac{1}{n}\,np = p$$
so that $\hat{p}$ is indeed an unbiased point estimate of p.
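The unbiasedness of the sample proportion can be checked exactly by summing x/n against the binomial probabilities. The following is a minimal sketch; the function name `expected_p_hat` and the values n = 10, p = 3/10 are illustrative assumptions, not taken from the slides.

```python
from fractions import Fraction
from math import comb

def expected_p_hat(n, p):
    """Exact E(X/n) for X ~ B(n, p), summing over the binomial pmf."""
    return sum(
        Fraction(x, n) * comb(n, x) * p**x * (1 - p)**(n - x)
        for x in range(n + 1)
    )

# Illustrative values (assumed for this sketch): 10 trials, p = 3/10.
p = Fraction(3, 10)
e_p_hat = expected_p_hat(10, p)
print(e_p_hat, e_p_hat - p)   # expectation and bias E(p_hat) - p
```

Exact fractions are used so that the bias comes out as exactly zero rather than as a small rounding error.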
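Point Estimate of a Population Mean
The obvious point estimate of the population mean μ is the sample mean, $\hat{\mu} = \bar{X}$. Clearly it is unbiased, since
$$E(X_i) = \mu, \quad 1 \le i \le n$$
[ Remember, fair coin (n = 2, μ = p = ½) and fair dice (n = 6, μ = p = 1/6) ]
so that
$$E(\hat{\mu}) = E(\bar{X}) = E\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right) = \frac{1}{n}E\left(\sum_{i=1}^{n} X_i\right) = \frac{1}{n}\sum_{i=1}^{n} E(X_i) = \frac{1}{n}\,n\mu = \mu$$
Then $\hat{\mu} = \bar{X}$ is indeed an unbiased point estimate of μ.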
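Point Estimate of a Population Variance
We know that the sample variance is
$$S^2 = \frac{\sum_{i=1}^{n}(X_i - \bar{X})^2}{n-1}$$
Then
$$\begin{aligned}
E(S^2) &= \frac{1}{n-1}E\left(\sum_{i=1}^{n}(X_i-\bar{X})^2\right) \\
&= \frac{1}{n-1}E\left(\sum_{i=1}^{n}\bigl((X_i-\mu)-(\bar{X}-\mu)\bigr)^2\right) \\
&= \frac{1}{n-1}E\left(\sum_{i=1}^{n}(X_i-\mu)^2 - 2\sum_{i=1}^{n}(\bar{X}-\mu)(X_i-\mu) + \sum_{i=1}^{n}(\bar{X}-\mu)^2\right) \\
&= \frac{1}{n-1}E\left(\sum_{i=1}^{n}(X_i-\mu)^2 - 2(\bar{X}-\mu)\sum_{i=1}^{n}(X_i-\mu) + n(\bar{X}-\mu)^2\right)
\end{aligned}$$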
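Point Estimate of a Population Variance (continued)
Since
$$\sum_{i=1}^{n}(X_i-\mu) = n\bar{X} - n\mu = n(\bar{X}-\mu)$$
then
$$\begin{aligned}
E(S^2) &= \frac{1}{n-1}E\left(\sum_{i=1}^{n}(X_i-\mu)^2 - 2(\bar{X}-\mu)\sum_{i=1}^{n}(X_i-\mu) + n(\bar{X}-\mu)^2\right) \\
&= \frac{1}{n-1}E\left(\sum_{i=1}^{n}(X_i-\mu)^2 - 2n(\bar{X}-\mu)^2 + n(\bar{X}-\mu)^2\right) \\
&= \frac{1}{n-1}E\left(\sum_{i=1}^{n}(X_i-\mu)^2 - n(\bar{X}-\mu)^2\right) \\
&= \frac{1}{n-1}\left(\sum_{i=1}^{n}E\bigl((X_i-\mu)^2\bigr) - nE\bigl((\bar{X}-\mu)^2\bigr)\right)
\end{aligned}$$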
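Point Estimate of a Population Variance (continued)
We notice that
$$E\bigl((X_i-\mu)^2\bigr) = \mathrm{Var}(X_i) = \sigma^2, \quad 1 \le i \le n$$
and, for independent observations,
$$E\bigl((\bar{X}-\mu)^2\bigr) = \mathrm{Var}(\bar{X}) = \mathrm{Var}\left(\frac{1}{n}\sum_{i=1}^{n}X_i\right) = \frac{1}{n^2}\mathrm{Var}\left(\sum_{i=1}^{n}X_i\right) = \frac{1}{n^2}\sum_{i=1}^{n}\mathrm{Var}(X_i) = \frac{1}{n^2}\,n\sigma^2 = \frac{\sigma^2}{n}$$
Remember,
$$\mathrm{Var}(X) = E\bigl((X-E(X))^2\bigr) = E\bigl(X^2 - 2XE(X) + (E(X))^2\bigr) = E(X^2) - 2E(X)E(X) + (E(X))^2 = E(X^2) - (E(X))^2$$
$$\mathrm{Var}\left(\frac{X}{n}\right) = E\left(\left(\frac{X}{n}\right)^2\right) - \left(E\left(\frac{X}{n}\right)\right)^2 = \frac{1}{n^2}\bigl(E(X^2) - (E(X))^2\bigr) = \frac{1}{n^2}\mathrm{Var}(X)$$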
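Point Estimate of a Population Variance (continued)
Putting this all together gives
$$E(S^2) = \frac{1}{n-1}\left(\sum_{i=1}^{n}E\bigl((X_i-\mu)^2\bigr) - nE\bigl((\bar{X}-\mu)^2\bigr)\right) = \frac{1}{n-1}\left(n\sigma^2 - n\,\frac{\sigma^2}{n}\right) = \frac{1}{n-1}\left(n\sigma^2 - \sigma^2\right) = \sigma^2$$
so the sample variance $S^2$ is an unbiased point estimate of σ².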
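The result E(S²) = σ² can be verified exactly on a small case by enumerating every equally likely sample. The sketch below assumes a fair six-sided die as the population and samples of size n = 3; both choices, and the helper name `expected_sample_variance`, are illustrative and not part of the slides.

```python
from fractions import Fraction
from itertools import product

def expected_sample_variance(values, n, divisor_offset):
    """Exact E[sum (x_i - xbar)^2 / (n - divisor_offset)] over all equally
    likely samples of size n drawn independently from `values`."""
    total = Fraction(0)
    count = 0
    for sample in product(values, repeat=n):
        xbar = Fraction(sum(sample), n)
        ss = sum((x - xbar) ** 2 for x in sample)
        total += ss / (n - divisor_offset)
        count += 1
    return total / count

die = range(1, 7)
mu = Fraction(sum(die), 6)
sigma2 = sum((x - mu) ** 2 for x in die) / 6    # population variance

unbiased = expected_sample_variance(die, 3, 1)  # divide by n - 1
biased = expected_sample_variance(die, 3, 0)    # divide by n
print(sigma2, unbiased, biased)
```

Dividing by n instead of n - 1 gives an expectation of (n - 1)σ²/n, which is why the n - 1 divisor is used in S².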
Minimum Variance Estimates
• The best situation is to construct a point estimate that is unbiased and that also has the smallest possible variance.
• An unbiased point estimate that has a smaller variance than any other unbiased point estimate is called a minimum variance unbiased estimate (MVUE).
• The efficiency of an unbiased point estimate is measured by its relative efficiency.
• The relative efficiency of an unbiased point estimate $\hat{\theta}_1$ to an unbiased point estimate $\hat{\theta}_2$ is given by
$$\text{rel.} = \frac{\mathrm{Var}(\hat{\theta}_2)}{\mathrm{Var}(\hat{\theta}_1)}$$
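Mean Square Errors
• In the case that two point estimates have different expectations and different variances, we prefer the point estimate that minimizes the value of the mean square error (MSE), which is defined to be
$$\begin{aligned}
\mathrm{MSE}(\hat{\theta}) &= E\bigl((\hat{\theta}-\theta)^2\bigr) \\
&= E\Bigl(\bigl((\hat{\theta}-E(\hat{\theta})) + (E(\hat{\theta})-\theta)\bigr)^2\Bigr) \\
&= E\bigl((\hat{\theta}-E(\hat{\theta}))^2\bigr) + 2\bigl(E(\hat{\theta})-\theta\bigr)E\bigl(\hat{\theta}-E(\hat{\theta})\bigr) + \bigl(E(\hat{\theta})-\theta\bigr)^2 \\
&= \mathrm{Var}(\hat{\theta}) + \text{bias}^2
\end{aligned}$$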
Exercises
• Suppose that E(X1) = μ, Var(X1) = 10, E(X2) = μ, and Var(X2) = 15, and consider the point estimates
$$\hat{\theta}_1 = \frac{X_1}{2} + \frac{X_2}{2}, \quad \hat{\theta}_2 = \frac{X_1}{4} + \frac{3X_2}{4}, \quad \hat{\theta}_3 = \frac{X_1}{6} + \frac{X_2}{3} + 9$$
a. Calculate the bias of each point estimate. Is any one of them unbiased?
b. Calculate the variance of each point estimate. Which one has the smallest variance?
c. Calculate the mean square error of each point estimate. Which point estimate has the smallest mean square error when μ = 8?
d. What is the relative efficiency of $\hat{\theta}_2$ to the point estimate $\hat{\theta}_1$?
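Exercise Solution
a.
$$E(\hat{\theta}_1) = E\left(\frac{X_1}{2} + \frac{X_2}{2}\right) = \frac{1}{2}E(X_1) + \frac{1}{2}E(X_2) = \frac{1}{2}\mu + \frac{1}{2}\mu = \mu, \quad \text{Bias of } \hat{\theta}_1 = E(\hat{\theta}_1) - \mu = 0$$
$$E(\hat{\theta}_2) = E\left(\frac{X_1}{4} + \frac{3X_2}{4}\right) = \frac{1}{4}E(X_1) + \frac{3}{4}E(X_2) = \frac{1}{4}\mu + \frac{3}{4}\mu = \mu, \quad \text{Bias of } \hat{\theta}_2 = E(\hat{\theta}_2) - \mu = 0$$
$$E(\hat{\theta}_3) = E\left(\frac{X_1}{6} + \frac{X_2}{3} + 9\right) = \frac{1}{6}E(X_1) + \frac{1}{3}E(X_2) + 9 = \frac{\mu}{6} + \frac{\mu}{3} + 9 = \frac{\mu}{2} + 9, \quad \text{Bias of } \hat{\theta}_3 = E(\hat{\theta}_3) - \mu = 9 - \frac{\mu}{2}$$
So $\hat{\theta}_1$ and $\hat{\theta}_2$ are unbiased.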
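Exercise Solution
b. Treating X1 and X2 as independent,
$$\mathrm{Var}(\hat{\theta}_1) = \mathrm{Var}\left(\frac{X_1}{2} + \frac{X_2}{2}\right) = \frac{1}{4}\mathrm{Var}(X_1) + \frac{1}{4}\mathrm{Var}(X_2) = \frac{1}{4}(10 + 15) = 6.25$$
$$\mathrm{Var}(\hat{\theta}_2) = \mathrm{Var}\left(\frac{X_1}{4} + \frac{3X_2}{4}\right) = \frac{1}{16}\mathrm{Var}(X_1) + \frac{9}{16}\mathrm{Var}(X_2) = \frac{1}{16}(10 + 135) = 9.0625$$
$$\mathrm{Var}(\hat{\theta}_3) = \mathrm{Var}\left(\frac{X_1}{6} + \frac{X_2}{3} + 9\right) = \frac{1}{36}\mathrm{Var}(X_1) + \frac{1}{9}\mathrm{Var}(X_2) + 0 = \frac{10}{36} + \frac{15}{9} = 1.9444$$
So $\hat{\theta}_3$ has the smallest variance.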
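Exercise Solution
c. With μ = 8,
$$\mathrm{MSE}(\hat{\theta}_1) = \mathrm{Var}(\hat{\theta}_1) + \text{bias}_1^2 = 6.25 + 0 = 6.25$$
$$\mathrm{MSE}(\hat{\theta}_2) = \mathrm{Var}(\hat{\theta}_2) + \text{bias}_2^2 = 9.0625 + 0 = 9.0625$$
$$\mathrm{MSE}(\hat{\theta}_3) = \mathrm{Var}(\hat{\theta}_3) + \text{bias}_3^2 = 1.9444 + \left(9 - \frac{8}{2}\right)^2 = 26.9444$$
So $\hat{\theta}_1$ has the smallest mean square error.
d.
$$\text{rel.} = \frac{\mathrm{Var}(\hat{\theta}_1)}{\mathrm{Var}(\hat{\theta}_2)} = \frac{6.25}{9.0625} = 0.69$$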
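The arithmetic in parts a to d can be reproduced with a short script. This is a sketch that assumes X1 and X2 are independent (as the variance calculation in part b does); the helper `summarize` and the names `theta1`, `theta2`, `theta3` are labels introduced here for the three estimates.

```python
def summarize(w1, w2, const, mu, var1=10.0, var2=15.0):
    """Bias, variance and MSE of w1*X1 + w2*X2 + const, assuming X1 and X2
    are independent with common mean mu and variances var1 and var2."""
    bias = (w1 + w2) * mu + const - mu
    var = w1**2 * var1 + w2**2 * var2
    return bias, var, var + bias**2

mu = 8.0  # value of mu used in part c
estimates = {
    "theta1": (1/2, 1/2, 0.0),
    "theta2": (1/4, 3/4, 0.0),
    "theta3": (1/6, 1/3, 9.0),
}
results = {name: summarize(*w, mu) for name, w in estimates.items()}
for name, (bias, var, mse) in results.items():
    print(f"{name}: bias={bias:.4f} var={var:.4f} mse={mse:.4f}")

# Ratio of variances used in part d.
rel_eff = results["theta1"][1] / results["theta2"][1]
print(f"Var(theta1)/Var(theta2) = {rel_eff:.2f}")
```

Changing `mu` shows how the ranking by mean square error depends on the true mean, because only the third estimate has a bias that varies with μ.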
Sampling Distributions
Since the summary measures of one sample differ from those of another sample, we need to consider the probability distributions, or sampling distributions, of the sample mean $\bar{X}$, the sample variance $S^2$, and the sample proportion $\hat{p}$.
Sampling Means
If X1, … , Xn are observations from a population with a mean μ and a variance σ², then the central limit theorem indicates that the sample mean $\bar{X}$ has the approximate distribution
$$\hat{\mu} = \bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$$
The standard deviation of the sample mean is referred to as its standard error (SE):
$$SE(\bar{X}) = \frac{\sigma}{\sqrt{n}}$$
Since the standard deviation σ is usually unknown, it can be replaced by S.
Sample Variances
If X1, … , Xn are normally distributed with a mean μ and a variance σ², then the sample variance S² has the distribution
$$S^2 \sim \sigma^2\,\frac{\chi^2_{n-1}}{n-1}$$
where $\chi^2_{n-1}$ is a chi-square distribution with n – 1 degrees of freedom.
In the case that the variance is unknown, if X1, … , Xn are normally distributed with a mean μ, then
$$\frac{\sqrt{n}\,(\bar{X}-\mu)}{S} = \frac{\bar{X}-\mu}{S/\sqrt{n}} \sim t_{n-1}$$
where $S/\sqrt{n}$ is the standard error $SE(\bar{X})$ with σ replaced by S, and $t_{n-1}$ is Student's t distribution with n – 1 degrees of freedom.
Sample Proportions
If X ~ B(n, p), then the sample proportion $\hat{p} = X/n$ has the approximate distribution
$$\hat{p} \sim N\left(p, \frac{p(1-p)}{n}\right)$$
The standard error of the sample proportion is
$$SE(\hat{p}) = \sqrt{\frac{p(1-p)}{n}}$$
Exercises
1) The capacitances of certain electronic components have a normal distribution with a mean μ = 174 and a standard deviation σ = 2.8. If an engineer randomly selects a sample of n = 30 components and measures their capacitances, what is the probability that the engineer’s point estimate of the mean μ will be within the interval (173, 175)?
2) A scientist reports that the proportion of defective items from a process is 12.6%. If the scientist’s estimate is based on the examination of a random sample of 360 items from the process, what is the standard error of the scientist’s estimate?
3) The pH levels of food items prepared in a certain way are normally distributed with a standard deviation of σ = 0.82. An experimenter estimates the mean pH level by averaging the pH levels of a random sample of n items. What sample size n is needed to ensure that there is a probability of at least 99% that the experimenter’s estimate is within 0.5 of the true mean value?
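Exercise Solutions
1) Recall that if $X \sim N(\mu, \sigma^2)$, then $\bar{X} \sim N(\mu, \sigma^2/n)$ and
$$Z = \frac{\bar{X}-\mu}{\sigma/\sqrt{n}} \sim N(0, 1)$$
$$Z_{lower} = \frac{173-174}{2.8/\sqrt{30}} = -1.96, \quad Z_{upper} = \frac{175-174}{2.8/\sqrt{30}} = 1.96$$
$$P(173 \le \bar{X} \le 175) = P(-1.96 \le Z \le 1.96) = P(Z \le 1.96) - P(Z \le -1.96) = 0.9750 - 0.0250 = 0.9500$$
(look up the table!)
2) $\hat{p} = 12.6\% = 0.126$
$$SE(\hat{p}) = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = \sqrt{\frac{0.126\,(1-0.126)}{360}} = 0.0175$$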
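Solutions 1 and 2 can be checked numerically without a table. The sketch below uses `statistics.NormalDist` from the Python standard library; the variable names are chosen here for illustration.

```python
from math import sqrt
from statistics import NormalDist

# Exercise 1: X-bar ~ N(mu, sigma^2 / n), so its standard deviation is sigma / sqrt(n).
mu, sigma, n = 174, 2.8, 30
xbar_dist = NormalDist(mu, sigma / sqrt(n))
prob = xbar_dist.cdf(175) - xbar_dist.cdf(173)
print(f"P(173 < X-bar < 175) = {prob:.4f}")

# Exercise 2: standard error of a sample proportion.
p_hat, m = 0.126, 360
se = sqrt(p_hat * (1 - p_hat) / m)
print(f"SE(p_hat) = {se:.4f}")
```

The computed probability differs slightly from the table-based 0.9500 because the table lookup rounds the z-values to ±1.96.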
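Exercise Solutions
3) Recall
$$Z = \frac{\bar{X}-\mu}{\sigma/\sqrt{n}}, \quad \text{then} \quad \bar{X}-\mu = Z\,\frac{\sigma}{\sqrt{n}}$$
$$99\% = 0.9900 = 0.9950 - 0.0050 = P(a \le Z \le b)$$
$$P(Z \le a) = 0.0050 \text{ and } P(Z \le b) = 0.9950, \quad a = -2.575 \text{ and } b = 2.575$$
The estimate is within 0.5 of the true mean value when $|\bar{X}-\mu| \le 0.5$, then
$$|Z|\,\frac{\sigma}{\sqrt{n}} \le 0.5 \quad \Rightarrow \quad n \ge \frac{Z^2\sigma^2}{0.5^2} = \frac{2.575^2 \times 0.82^2}{0.25} = 17.8 \approx 18$$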
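The sample-size calculation in solution 3 can be wrapped in a small helper. This is a sketch under the same assumptions as the solution (normal data, known σ); the function name `sample_size` and its parameters are introduced here for illustration.

```python
from math import ceil
from statistics import NormalDist

def sample_size(sigma, margin, confidence):
    """Smallest n with P(|X-bar - mu| <= margin) >= confidence,
    for normally distributed data with known sigma."""
    # Two-sided critical value, e.g. about 2.576 for 99%.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z * sigma / margin) ** 2)

n_needed = sample_size(sigma=0.82, margin=0.5, confidence=0.99)
print(n_needed)
```

Rounding up with `ceil` matters here: any n below the computed bound would leave the probability under the required 99%.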
Maximum Likelihood Estimates
We have considered the obvious point estimates for a success probability, a population mean and variance. However, it is often of interest to estimate parameters that require less obvious point estimates. For example, how should the parameters of the Poisson, exponential, beta or gamma distributions be estimated?
Maximum likelihood estimation is one of the more general and more technical methods of obtaining point estimates.
Maximum Likelihood Estimate for One Parameter
If a data set consists of observations x1, x2, …, xn from a probability distribution f(x, θ) depending upon one unknown parameter θ, the maximum likelihood estimate $\hat{\theta}$ of the parameter is found by maximizing the likelihood function
$$L(x_1, \dots, x_n, \theta) = f(x_1, \theta) \times \dots \times f(x_n, \theta)$$
In practice, the maximization of the likelihood function is usually performed by taking the derivative of the natural log of the likelihood function and setting it equal to zero:
$$\frac{d \ln L}{d\theta} = 0$$
Example
Suppose again that x1, x2, …, xn are a set of Bernoulli observations, each taking the value 1 (success) with probability p and the value 0 (no success) with probability 1 – p:
$$f(1, p) = p \quad \text{and} \quad f(0, p) = 1 - p$$
We can write this as
$$f(x_i, p) = p^{x_i}(1-p)^{1-x_i}$$
The likelihood function is therefore
$$L(x_1, \dots, x_n, p) = f(x_1, p) \times \dots \times f(x_n, p) = \prod_{i=1}^{n} p^{x_i}(1-p)^{1-x_i} = p^{x}(1-p)^{n-x}$$
where x = x1 + x2 + … + xn, and the maximum likelihood estimate $\hat{p}$ is the value that maximizes this.
Example
$$\ln L = x \ln p + (n - x)\ln(1 - p)$$
and
$$\frac{d \ln L}{dp} = \frac{x}{p} - \frac{n-x}{1-p}$$
Setting this expression equal to 0 and solving for p produces
$$\hat{p} = \frac{x}{n}$$
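As a numerical cross-check of the calculus, the Bernoulli log-likelihood can be evaluated on a grid of p values and its maximizer compared with x/n. This is a minimal sketch; the data (7 successes in 20 trials) and the grid spacing are illustrative assumptions, not values from the slides.

```python
from math import log

def log_likelihood(p, x, n):
    """ln L = x ln p + (n - x) ln(1 - p) for x successes in n Bernoulli trials."""
    return x * log(p) + (n - x) * log(1 - p)

# Illustrative data (assumed for this sketch): 7 successes in 20 trials.
x, n = 7, 20
grid = [k / 1000 for k in range(1, 1000)]   # candidate p values in (0, 1)
p_best = max(grid, key=lambda p: log_likelihood(p, x, n))
print(p_best, x / n)
```

A grid search is only a sanity check; the closed-form estimate x/n is what the derivation delivers.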
Maximum Likelihood Estimate for Two Parameters
If a data set consists of observations x1, x2, …, xn from a probability distribution f(x, θ1, θ2) depending upon two unknown parameters, the maximum likelihood estimates $\hat{\theta}_1$ and $\hat{\theta}_2$ are the values of the parameters that jointly maximize the likelihood function
$$L(x_1, \dots, x_n, \theta_1, \theta_2) = f(x_1, \theta_1, \theta_2) \times \dots \times f(x_n, \theta_1, \theta_2)$$
Again the best way to perform the joint maximization is usually to take derivatives of the log-likelihood with respect to θ1 and θ2 and to set the two resulting expressions equal to 0.
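Example
The normal distribution is an example of a distribution with two parameters, with a probability density function
$$f(x, \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(x-\mu)^2 / 2\sigma^2}$$
The likelihood of a set of normal observations is therefore
$$L(x_1, \dots, x_n, \mu, \sigma^2) = \prod_{i=1}^{n} f(x_i, \mu, \sigma^2) = \left(\frac{1}{2\pi\sigma^2}\right)^{n/2} \exp\left(-\frac{\sum_{i=1}^{n}(x_i-\mu)^2}{2\sigma^2}\right)$$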
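Example
So that the log-likelihood is
$$\ln L = -\frac{n}{2}\ln(2\pi\sigma^2) - \frac{\sum_{i=1}^{n}(x_i-\mu)^2}{2\sigma^2}$$
Taking derivatives with respect to the parameter values μ and σ² gives
$$\frac{d \ln L}{d\mu} = \frac{\sum_{i=1}^{n}(x_i-\mu)}{\sigma^2}$$
$$\frac{d \ln L}{d\sigma^2} = -\frac{n}{2\sigma^2} + \frac{\sum_{i=1}^{n}(x_i-\mu)^2}{2\sigma^4}$$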
Example
Setting d ln(L)/dμ = 0 gives
$$\hat{\mu} = \bar{x}$$
And setting d ln(L)/dσ² = 0 then gives
$$\hat{\sigma}^2 = \frac{\sum_{i=1}^{n}(x_i-\hat{\mu})^2}{n} = \frac{\sum_{i=1}^{n}(x_i-\bar{x})^2}{n}$$
Did you see any difference from the variance estimate that we have discussed before?
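One way to explore this question is to compute both variance estimates on the same data. The sketch below uses the standard library functions `statistics.pvariance` (divisor n, matching the maximum likelihood formula) and `statistics.variance` (divisor n - 1, matching S²); the measurements are made up for illustration.

```python
from statistics import fmean, pvariance, variance

# Illustrative measurements (made up for this sketch, not from the slides).
data = [4.9, 5.1, 5.0, 5.3, 4.7, 5.2]
n = len(data)

mu_hat = fmean(data)          # maximum likelihood estimate of mu
sigma2_mle = pvariance(data)  # sum of squared deviations divided by n
s2 = variance(data)           # sum of squared deviations divided by n - 1

print(mu_hat, sigma2_mle, s2)
print(sigma2_mle / s2, (n - 1) / n)   # the two ratios agree up to rounding
```

The two estimates differ by the factor (n - 1)/n, so the gap shrinks as the sample size grows.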
Exercises
Suppose that the quality inspector at a glass manufacturing company inspects 30 randomly selected sheets of glass and records the number of flaws found in each sheet. The data values are as follows:
0 , 1 , 1 , 1 , 0 , 0 , 0 , 2 , 0 , 1 , 0 , 1 , 0 , 0 , 0 ,
0 , 0 , 1 , 0 , 2 , 0 , 0 , 3 , 1 , 2 , 0 , 0 , 1 , 0 , 0
If the distribution of the number of flaws per sheet is taken to have a Poisson distribution, how should the parameter λ of the Poisson distribution be estimated? And find its value.
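Exercise Solutions
The parameter λ can be estimated by maximum likelihood. The probability mass function of each observation is
$$f(x_i, \lambda) = \frac{e^{-\lambda}\lambda^{x_i}}{x_i!}$$
So that the likelihood is
$$L(x_1, \dots, x_n, \lambda) = \prod_{i=1}^{n} f(x_i, \lambda) = \frac{e^{-n\lambda}\,\lambda^{x_1 + \dots + x_n}}{x_1! \times \dots \times x_n!}$$
The log-likelihood is therefore
$$\ln L = -n\lambda + (x_1 + \dots + x_n)\ln\lambda - \ln(x_1! \times \dots \times x_n!)$$
Taking its derivative w.r.t. λ and setting it to zero, we get
$$\frac{d \ln L}{d\lambda} = -n + \frac{x_1 + \dots + x_n}{\lambda} = 0 \quad \Rightarrow \quad \hat{\lambda} = \bar{x}$$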
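Exercise Solutions
Therefore
$$\hat{\lambda} = \bar{x} = \frac{0 + 1 + 1 + 1 + 0 + \dots + 0}{30} = \frac{17}{30} = 0.567$$
Since the variance of each observation is λ, then
$$\mathrm{Var}(\hat{\lambda}) = \mathrm{Var}(\bar{X}) = \mathrm{Var}\left(\frac{X_1 + \dots + X_n}{n}\right) = \frac{\mathrm{Var}(X_1 + \dots + X_n)}{n^2} = \frac{n\lambda}{n^2} = \frac{\lambda}{n}$$
The standard error of the estimate of a Poisson parameter is
$$SE(\hat{\lambda}) = \sqrt{\frac{\hat{\lambda}}{n}} = \sqrt{\frac{0.567}{30}} = 0.1375$$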
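The estimate and its standard error can be recomputed directly from the 30 flaw counts listed in the exercise. This is a short sketch; the variable names are chosen here for illustration.

```python
from math import sqrt

# Flaw counts for the 30 inspected sheets, as listed in the exercise.
flaws = [0, 1, 1, 1, 0, 0, 0, 2, 0, 1, 0, 1, 0, 0, 0,
         0, 0, 1, 0, 2, 0, 0, 3, 1, 2, 0, 0, 1, 0, 0]

n = len(flaws)
lam_hat = sum(flaws) / n      # maximum likelihood estimate: the sample mean
se = sqrt(lam_hat / n)        # from Var(lam_hat) = lambda / n
print(f"lambda_hat = {lam_hat:.3f}, SE = {se:.4f}")
```

The standard error printed here uses the unrounded estimate 17/30, so its last digit can differ from the hand calculation, which rounds λ̂ to 0.567 first.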