Statistical Inference
Luo Shan
Department of Mathematics
Shanghai Jiao Tong University
March 31st, 2015
Luo Shan (SJTU), March 31st, 2015
Lecture 5
1 What is in the last lecture
2 Asymptotic Criteria and Inference
    Consistency
    Asymptotic bias, variance and mse
3 UMVUE
    Sufficient and complete statistics
    A necessary and sufficient condition
    Information inequality
What is in the last lecture
Sufficiency, minimal sufficiency, complete statistics, statistical inference.
Today's homework 1: Exercises 34, 47, 97 (pages 147-156).
Asymptotic Criteria and Inference: Consistency
Definition 1 (consistency of point estimators)
Let X1, X2, · · · , Xn be a sample from P ∈ P and Tn(X) be a point estimator of ν for every n.
(i) Tn(X) is called consistent for ν iff Tn(X) →p ν w.r.t. any P ∈ P (weak consistency).
(ii) Let {an} be a sequence of positive constants diverging to ∞. Tn(X) is called an-consistent for ν iff an(Tn(X) − ν) = Op(1) w.r.t. any P ∈ P.
(iii) Tn(X) is called strongly consistent for ν iff Tn(X) →a.s. ν w.r.t. any P ∈ P.
(iv) Tn(X) is called Lr-consistent for ν iff Tn(X) →Lr ν w.r.t. any P ∈ P for some fixed r > 0. (When r = 2, it is also called consistency in mse.)
The LLN, the CLT, Slutsky's theorem and the continuous mapping theorem are typically applied to establish consistency of point estimators.
Example 2
Let X1, X2, · · · , Xn be i.i.d. from P ∈ P.
- If ν = µ = EX1 is finite, then by the SLLN, X̄ is strongly consistent for µ.
- If σ² = var(X1) is finite, then by the CLT, X̄ is √n-consistent for µ; S² is strongly consistent for σ² by the SLLN.
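The two claims above can be illustrated numerically. The following sketch (our own illustration with arbitrary µ, σ, not part of the lecture) estimates the average error of X̄ at two sample sizes and checks the 1/√n decay implied by √n-consistency.

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 2.0, 3.0

# Average |Xbar - mu| over many replications: consistency says it shrinks,
# and sqrt(n)-consistency says it shrinks at the rate 1/sqrt(n).
def mean_abs_error(n, reps=500):
    x = rng.normal(mu, sigma, size=(reps, n))
    return np.abs(x.mean(axis=1) - mu).mean()

e1, e2 = mean_abs_error(100), mean_abs_error(10_000)
print(e1, e2, e1 / e2)  # ratio should be near sqrt(10_000/100) = 10
```

Increasing n by a factor of 100 shrinks the average error by roughly a factor of 10, the √n rate.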
Example 3
Let X1, X2, · · · , Xn be i.i.d. from an unknown P with a continuous c.d.f. F satisfying F(θ) = 1 for some θ ∈ R and F(x) < 1 for any x < θ (e.g., U(0, θ)). For any ε > 0, F(θ − ε) < 1 and P(|X(n) − θ| ≥ ε) = [F(θ − ε)]^n, which is summable in n; by the Borel-Cantelli lemma, X(n) is strongly consistent for θ. If we assume that F^(i)(θ−) vanishes for i ≤ m and F^(m+1)(θ−) ≠ 0, then

1 − F(X(n)) = [(−1)^m F^(m+1)(θ−)/(m + 1)!] (θ − X(n))^(m+1) + o(|θ − X(n)|^(m+1)) a.s. (1)

Note that P(n[1 − F(X(n))] ≥ s) = (1 − s/n)^n → e^(−s), which implies n[1 − F(X(n))] = Op(1) and hence n(θ − X(n))^(m+1) = Op(1). If m = 0, then X(n) is n-consistent; if m = 1, then it is √n-consistent.
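For the U(0, θ) case of Example 3 (where m = 0), the n-consistency of X(n) can be checked by simulation. This sketch (ours, with illustrative parameter values) shows that the scaled gap n(θ − X(n)) stays on a fixed scale as n grows.

```python
import numpy as np

rng = np.random.default_rng(1)
theta, reps = 5.0, 500

# For U(0, theta), theta - X(n) has mean theta/(n+1), so the scaled gap
# n*(theta - X(n)) should hover near theta for every n (Op(1) behaviour).
def scaled_gap(n):
    xmax = rng.uniform(0, theta, size=(reps, n)).max(axis=1)
    return (n * (theta - xmax)).mean()

g1, g2 = scaled_gap(100), scaled_gap(10_000)
print(g1, g2)  # both near theta = 5
```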
Remark: In statistical theory, inconsistent estimators should not be used, but a consistent estimator is not necessarily good. Consistency should be used together with one or a few more criteria.
Asymptotic Criteria and Inference: Asymptotic bias, variance and mse
(i) In some cases, there is no unbiased estimator. For example, assume X follows the binomial distribution Bi(n, p), where n is known and p ∈ (0, 1) is unknown; then there is no unbiased estimator for ν = p^(−1).
(ii) There are many reasonable point estimators whose expectations are not well defined. For example, if (X1, Y1), · · · , (Xn, Yn) are i.i.d. from a bivariate normal distribution, the expectation of the ratio estimator Tn = X̄/Ȳ of ν = µx/µy is not defined for any n.
Definition 4
- If E(Tn) exists for every n and lim_{n→∞} E(Tn − ν) = 0 for any P ∈ P, then Tn is said to be approximately unbiased.
- Let ξ1, ξ2, · · · be random variables and {an} be a sequence of positive numbers satisfying an → ∞ or an → a > 0. If an ξn →d ξ and E|ξ| < ∞, then Eξ/an is called an asymptotic expectation of ξn.
- Let Tn be a point estimator of ν for every n. An asymptotic expectation of Tn − ν, if it exists, is called an asymptotic bias of Tn and is denoted by b_Tn(P); if lim_{n→∞} b_Tn(P) = 0 for any P ∈ P, then Tn is said to be asymptotically unbiased.
- Suppose that there is a sequence of random variables {ηn} such that an ηn →d Y and an²(Tn − ν − ηn) →d W, where Y, W are random variables with finite means, EY = 0 and EW ≠ 0. Then we may define an^(−2) to be the order of b_Tn(P), or define EW/an² to be the an^(−2) order asymptotic bias of Tn.
Example 5
Let X1, X2, · · · , Xn be i.i.d. random k-vectors with finite Σ = var(X1). Let X̄ = n^(−1) Σ_{i=1}^n Xi and Tn = g(X̄), where g is a function on R^k that is second-order differentiable at µ = EX1 ∈ R^k. Consider Tn as an estimator of ν = g(µ). By Taylor's expansion,
Tn − ν = [∇g(µ)]^τ (X̄ − µ) + (1/2)(X̄ − µ)^τ ∇²g(µ)(X̄ − µ) + op(1/n).
Since n(X̄ − µ)^τ ∇²g(µ)(X̄ − µ) →d ZΣ^τ ∇²g(µ) ZΣ, where ZΣ ∼ Nk(0, Σ), E(ZΣ^τ ∇²g(µ) ZΣ)/(2n) is the n^(−1) order asymptotic bias of Tn = g(X̄).
Definition 6
Let Tn be an estimator of ν for every n and {an} be a sequence of positive numbers satisfying an → ∞ or an → a > 0. Assume that an(Tn − ν) →d Y with 0 < EY² < ∞.
- The asymptotic mean squared error of Tn, denoted by amse_Tn(P) (or amse_Tn(θ) if P is in a parametric family indexed by θ), is defined to be the asymptotic expectation of (Tn − ν)², i.e., EY²/an². The asymptotic variance of Tn is defined to be var(Y)/an².
- Let T′n be another estimator of ν. The asymptotic relative efficiency of T′n w.r.t. Tn is defined to be e_{T′n,Tn}(P) = amse_Tn(P)/amse_T′n(P).
- Tn is said to be asymptotically more efficient than T′n iff lim sup_n e_{T′n,Tn}(P) ≤ 1 for any P and < 1 for some P.
UMVUE
Unbiased estimation
Unbiased or asymptotically unbiased estimation plays an important role in point estimation theory. How do we derive and find the best unbiased estimators?
Definition 7
X is a sample from an unknown population P ∈ P, and ν is a real-valued parameter related to P. An estimator T(X) of ν is unbiased iff ET(X) = ν for any P ∈ P. If there exists an unbiased estimator of ν, then ν is called an estimable parameter.
Definition 8 (UMVUE)
An unbiased estimator T(X) of ν is called the uniformly minimum variance unbiased estimator (UMVUE) iff var(T(X)) ≤ var(U(X)) for any P ∈ P and any other unbiased estimator U(X) of ν.
Theorem 9 (Rao-Blackwell Theorem)
Assume that A is a convex subset of R^k and that, for any P ∈ P, L(P, a) is a convex function of a. Suppose also that T is a sufficient statistic for P ∈ P and that T0 is a nonrandomized decision rule satisfying E‖T0‖ < ∞, and let T1 = E(T0(X)|T). Then T1 is R-equivalent to or R-better than T0 for any P ∈ P.
The derivation of a UMVUE is relatively simple if there exists a sufficient and complete statistic for P ∈ P.
Theorem 10 (Lehmann-Scheffé Theorem)
Suppose that there exists a sufficient and complete statistic T(X) for P ∈ P. If ν is estimable, then there is a unique unbiased estimator of ν that is of the form h(T) with a Borel function h. Furthermore, h(T) is the unique UMVUE of ν. (Two estimators that are equal a.s. P are treated as one estimator.)
Remark: Both of these theorems can be considered as applications of Jensen's inequality.
UMVUE: Sufficient and complete statistics
How to derive a UMVUE when a sufficient and complete statistic T is available?
The first method: directly solving for h
- Need the distribution of T.
- Try some function h to see if E[h(T)] is related to ν.
- If E[h(T)] = ν for all P, what should h be?
Example 11
Let X1, X2, · · · , Xn be i.i.d. from the uniform distribution on (0, θ), θ > 0. Consider ν = θ. The sufficient statistic X(n) has the Lebesgue p.d.f. nθ^(−n) x^(n−1) I(0,θ)(x). Suppose Ef(X(n)) = 0; then nθ^(−n) ∫_0^θ f(x) x^(n−1) dx = 0 for all θ > 0. Differentiating w.r.t. θ gives f(θ)θ^(n−1) = 0 for a.e. θ > 0, so f = 0 a.e. Therefore, X(n) is also a complete statistic.
Example 11 continued.
Consider ν = g(θ), where g is a differentiable function on (0, ∞). An unbiased estimator h(X(n)) of ν must satisfy g(θ) = nθ^(−n) ∫_0^θ h(x) x^(n−1) dx. Differentiating both sides of the equation leads to ng(θ) + θg′(θ) = nh(θ) for all θ > 0. Hence, the UMVUE of ν is h(X(n)) = g(X(n)) + n^(−1) X(n) g′(X(n)). For example, if g(θ) = θ, the UMVUE of θ is (n + 1)X(n)/n.
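A quick Monte Carlo check of Example 11 (our own sketch, with illustrative θ and n): (n + 1)X(n)/n averages to θ, while X(n) itself is biased low.

```python
import numpy as np

rng = np.random.default_rng(2)
theta, n, reps = 3.0, 10, 200_000

xmax = rng.uniform(0, theta, size=(reps, n)).max(axis=1)
umvue = (n + 1) * xmax / n        # UMVUE of theta from Example 11

print(xmax.mean())   # near n*theta/(n+1) = 30/11, biased low
print(umvue.mean())  # near theta = 3, unbiased
```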
Example 12
Let X1, X2, · · · , Xn be i.i.d. from the Poisson distribution P(θ) with an unknown parameter θ > 0.
- The joint p.d.f. is exp(−nθ + (Σ_{i=1}^n xi) ln θ)(Π_{i=1}^n xi!)^(−1); T(X) = Σ_{i=1}^n Xi is sufficient and complete for θ > 0 and has the Poisson distribution P(nθ) (by comparing their ch.f.'s).
- Suppose that ν = g(θ), where g is a smooth function such that g(x) = Σ_{j=0}^∞ aj x^j, x > 0. An unbiased estimator h(T) of ν must satisfy Σ_{t=0}^∞ e^(−nθ) (nθ)^t h(t)/t! = Σ_{j=0}^∞ aj θ^j for all θ > 0, i.e.,

Σ_{t=0}^∞ (nθ)^t h(t)/t! = [Σ_{i=0}^∞ (nθ)^i/i!][Σ_{j=0}^∞ aj θ^j].

By comparing the coefficients of θ^t for all t ≥ 0, we obtain h(t) = (t!/n^t) Σ_{i,j: i+j=t} n^i aj/i! for any nonnegative integer t. For example, if ν = θ^r for r ≥ 1, then
h(t) = 0 for t < r, and h(t) = t!/[n^r (t − r)!] for t ≥ r.
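For r = 2 the formula above reduces to h(T) = T(T − 1)/n², a falling factorial divided by n^r. A simulation (our own sketch, illustrative θ and n) confirms unbiasedness for θ².

```python
import numpy as np

rng = np.random.default_rng(3)
theta, n, reps = 2.0, 5, 400_000

# T = sum of X_i ~ Poisson(n*theta); the UMVUE of theta^2 is T(T-1)/n^2,
# the r = 2 case of h(t) = t!/(n^r (t-r)!)  (which is 0 for t < r).
t = rng.poisson(n * theta, size=reps)
h = t * (t - 1) / n**2
print(h.mean())  # near theta^2 = 4
```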
Example 13
Let X1, X2, · · · , Xn be i.i.d. from a power series distribution, i.e., P(Xi = x) = γ(x)θ^x/c(θ), x = 0, 1, 2, · · · , with a known function γ(x) ≥ 0 and an unknown parameter θ > 0.
- T(X) = Σ_{i=1}^n Xi is sufficient and complete for θ > 0, and it is also in a power series distribution family: P(T(X) = x) = γn(x)θ^x/[c(θ)]^n, where [c(θ)]^n = Σ_{x=0}^∞ γn(x)θ^x.
- Consider ν = g(θ) = θ^r/[c(θ)]^p, where r, p are nonnegative integers. Suppose the UMVUE is h(T); then Σ_{x=0}^∞ γn(x)θ^x h(x)/[c(θ)]^n = θ^r/[c(θ)]^p, i.e., Σ_{x=0}^∞ γn(x)h(x)θ^x = Σ_{x=0}^∞ γ_{n−p}(x)θ^(x+r). Hence
h(x) = 0 for x < r, and h(x) = γ_{n−p}(x − r)/γn(x) for x ≥ r.
Example 14
Let X1, X2, · · · , Xn be i.i.d. from an unknown population P in a nonparametric family P. Suppose the vector of order statistics, T = (X(1), X(2), · · · , X(n)), is sufficient and complete (for example, when P is the family of distributions on R having Lebesgue p.d.f.'s). Since an estimator ϕ(X1, · · · , Xn) is a function of T iff the function ϕ is symmetric in its n arguments, a symmetric unbiased estimator of any estimable ν is the UMVUE. For example, X̄ is the UMVUE of ν = EX1; S² is the UMVUE of var(X1); Fn(t) is the UMVUE of P(X1 ≤ t) for any fixed t.
Remark: If T is not sufficient and complete for P ∈ P (for example, if n > 2 and P contains all symmetric distributions having Lebesgue p.d.f.'s and finite means), then there is no UMVUE for ν = EX1. (Today's homework 1)
The second method: conditioning
- Find an unbiased estimator of ν, say U(X).
- Condition on a sufficient and complete statistic T(X): then E(U(X)|T(X)) is the UMVUE of ν.
Remark: By the uniqueness of the UMVUE, it does not matter which U(X) is used; thus, we should choose U(X) so as to make the calculation of E(U(X)|T(X)) as easy as possible.
Example 15
Let X1, X2, · · · , Xn be i.i.d. from the exponential distribution E(0, θ), Fθ(x) = (1 − e^(−x/θ)) I(0,∞)(x). Consider ν = 1 − Fθ(t) = e^(−t/θ) for a fixed t > 0.
X̄ is sufficient and complete for θ, and U(X) = I(t,∞)(X1) is an unbiased estimator of ν; hence E(U(X)|X̄) = P(X1 > t|X̄) is the UMVUE of ν. Since X1/X̄ is an ancillary statistic for θ (scale family), by Basu's theorem, X1/X̄ and X̄ are independent, so P(X1 > t|X̄ = x̄) equals P(X1/X̄ > t/x̄). Note that Σ_{i=2}^n Xi is independent of X1 and follows the distribution Γ(n − 1, θ), whose Lebesgue p.d.f. is x^(n−2) θ^(−(n−1))/Γ(n − 1) · exp(−x/θ). For nx̄ > t,

P(X1/X̄ > t/x̄) = P((nx̄ − t)X1 > t Σ_{i=2}^n Xi)
= ∫_0^∞ exp(−tx/((nx̄ − t)θ)) · x^(n−2)/(Γ(n − 1)θ^(n−1)) · exp(−x/θ) dx
= (1 − t/(nx̄))^(n−1).

The UMVUE of ν is (1 − t/(nX̄))^(n−1) (interpreted as 0 on the event nX̄ ≤ t, where P(X1 > t|X̄) = 0).
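A Monte Carlo check of Example 15 (our own sketch, illustrative θ, n, t): the estimator (1 − t/(nX̄))^(n−1) averages to e^(−t/θ).

```python
import numpy as np

rng = np.random.default_rng(4)
theta, n, t, reps = 2.0, 8, 1.0, 200_000

x = rng.exponential(theta, size=(reps, n))
nxbar = x.sum(axis=1)                      # n * Xbar
# UMVUE of P(X1 > t) = exp(-t/theta); set to 0 on the event n*Xbar <= t.
umvue = np.where(nxbar > t, (1 - t / nxbar) ** (n - 1), 0.0)
print(umvue.mean(), np.exp(-t / theta))  # both near 0.6065
```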
Example 16
Let X1, X2, · · · , Xn be i.i.d. from N(µ, σ²) with unknown µ and σ² > 0. T = (X̄, S²) is sufficient and complete for θ = (µ, σ²). Note that: (i) X̄ and S² are independent; (ii) X̄ ∼ N(µ, σ²/n); (iii) (n − 1)S²/σ² ∼ χ²(n − 1), whose Lebesgue p.d.f. is x^((n−1)/2−1) e^(−x/2) 2^(−(n−1)/2)/Γ((n − 1)/2), where Γ(t) = ∫_0^∞ x^(t−1) e^(−x) dx for t > 0. Moreover, for m > (1 − n)/2,

E[((n − 1)S²/σ²)^m] = 2^m Γ((n − 1)/2 + m)/Γ((n − 1)/2). (2)

Denote k_{n,r} = n^(r/2) Γ(n/2)/(2^(r/2) Γ((n + r)/2)); then σ^r = k_{n−1,r} E(S^r) for r > 1 − n.
By solving for h directly, we can find that: (i) the UMVUE of µ is X̄; (ii) the UMVUE of µ² is X̄² − S²/n; (iii) the UMVUE of σ^r with r > 1 − n is k_{n−1,r} S^r; (iv) the UMVUE of µ/σ is k_{n−1,−1} X̄/S, if n > 2; (v) if ν satisfies P(X1 ≤ ν) = p with a fixed p ∈ (0, 1), let Φ be the c.d.f. of N(0, 1); then ν = µ + σΦ^(−1)(p) and its UMVUE is X̄ + k_{n−1,1} S Φ^(−1)(p).
Example 16 cont.
Let c be a fixed constant and ν = P(X1 ≤ c) = Φ((c − µ)/σ). Choose U(X) = I(−∞,c)(X1); then E(U(X)|T) = P(X1 ≤ c|T) is the UMVUE of ν. Z(X) = (X1 − X̄)/S is an ancillary statistic, and by Basu's theorem it is independent of T. P(X1 ≤ c|T = (x̄, s)) = P(Z(X) ≤ (c − x̄)/s). The Lebesgue p.d.f. of Z is

f(z) = [√n Γ((n − 1)/2) / (√π (n − 1) Γ((n − 2)/2))] [1 − nz²/(n − 1)²]^((n/2)−2) I(0,(n−1)/√n)(|z|). (3)

(Today's homework 2: prove (3))
Hence the UMVUE of ν is P(X1 ≤ c|T) = ∫_{−(n−1)/√n}^{(c−X̄)/S} f(z) dz.
Example 16 cont.
Let ν = σ^(−1) Φ′((c − µ)/σ), where Φ′ is the first-order derivative of Φ; ν is the Lebesgue p.d.f. of X1 evaluated at a fixed c. The conditional p.d.f. of X1 given T = (X̄, S²) = (x̄, s²) is s^(−1) f((x − x̄)/s), with f as in (3). Hence the UMVUE of ν is (1/S) f((c − X̄)/S).
Remark: The UMVUE does not necessarily have the minimum MSE. Let h(X) = (1/n) Σ_{i=1}^n (Xi − X̄)²; then E(h(X) − σ²)² ≤ var(S²).
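The remark can be checked by simulation (our own sketch, normal data with σ² = 1 and n = 5): the biased (1/n)-divisor estimator has strictly smaller MSE than the unbiased S².

```python
import numpy as np

rng = np.random.default_rng(5)
sigma2, n, reps = 1.0, 5, 400_000

x = rng.normal(0.0, np.sqrt(sigma2), size=(reps, n))
s2 = x.var(axis=1, ddof=1)   # unbiased S^2
h = x.var(axis=1, ddof=0)    # (1/n) * sum (Xi - Xbar)^2, biased
mse_s2 = ((s2 - sigma2) ** 2).mean()
mse_h = ((h - sigma2) ** 2).mean()
print(mse_s2, mse_h)  # exact values: 2/(n-1) = 0.5 and (2n-1)/n^2 = 0.36
```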
Example 17
Let X1, X2, · · · , Xn be i.i.d. with Lebesgue p.d.f. fθ(x) = θx^(−2) I(θ,∞)(x), where θ > 0 is unknown.
- X(1) is sufficient and complete for θ. The Lebesgue p.d.f. of X(1) is nθ^n x^(−(n+1)) I(θ,∞)(x), from which completeness can be verified.
- Suppose ν = P(X1 > t) for a constant t > 0. P(X1 > t|X(1)) is the UMVUE of ν, and P(X1 > t|X(1) = x(1)) = P(X1/X(1) > t/x(1)) (Basu's theorem). If t ≤ x(1), then X1 ≥ X(1) = x(1) ≥ t, so P(X1 > t|X(1) = x(1)) = 1. Otherwise, let s = t/x(1) > 1 and, without loss of generality, assume θ = 1. Then

P(X1/X(1) > s) = (n − 1) P(X1/X(1) > s, X(1) = Xn)
= (n − 1) ∫_{x1 > s·xn, x2 > xn, · · · , x_{n−1} > xn} Π_{i=1}^n (1/xi²) dx1 · · · dxn
= (n − 1)/(ns) = (n − 1)x(1)/(nt).

The UMVUE of ν is
h(X(1)) = (n − 1)X(1)/(nt) for X(1) < t, and h(X(1)) = 1 for X(1) ≥ t.
UMVUE: A necessary and sufficient condition
Finding a UMVUE can be difficult if a sufficient and complete statistic is not available. In some cases, the following results can be applied if we have knowledge about the unbiased estimators of 0.
Theorem 18
Let U be the set of all unbiased estimators of 0 with finite variances and T be an unbiased estimator of ν with ET² < ∞.
(i) A necessary and sufficient condition for T(X) to be a UMVUE of ν is that E[T(X)U(X)] = 0 for any U ∈ U and P ∈ P.
(ii) Suppose that T = h(T̃), where T̃ is a sufficient statistic for P ∈ P and h is a Borel function. Let U_T̃ be the subset of U consisting of Borel functions of T̃. Then a necessary and sufficient condition for T to be a UMVUE of ν is that E[T(X)U(X)] = 0 for any U ∈ U_T̃ and P ∈ P.
Proof of Theorem 18.
(i) If T is a UMVUE of ν, then Tc = T + cU, where U ∈ U and c is a fixed constant, is also an unbiased estimator of ν, so var(Tc) ≥ var(T) for all c ∈ R and P ∈ P. Since var(Tc) − var(T) = c²var(U) + 2c·E(TU), this can hold for every c only if E[T(X)U(X)] = 0 for any U ∈ U and P ∈ P. For the converse, let T0 be another unbiased estimator of ν with var(T0) < ∞. Then T − T0 ∈ U, so E[T(T − T0)] = 0 for all P ∈ P. Combined with the fact that ET = ET0, this gives var(T) = cov(T, T0), and hence, by the Cauchy-Schwarz inequality, var(T) ≤ var(T0) for any P ∈ P.
(ii) It suffices to show that E[T(X)U(X)] = 0 for any U ∈ U_T̃ and P ∈ P implies E[T(X)U(X)] = 0 for any U ∈ U and P ∈ P. Let U ∈ U. Then E(U|T̃) ∈ U_T̃, and E(TU) = E[E(TU|T̃)] = E[h(T̃)E(U|T̃)] = 0. □
Remark: If there is a sufficient statistic, then by the Rao-Blackwell theorem we only need to consider functions of the sufficient statistic, so (ii) in Theorem 18 is more convenient to use.
Application I of Theorem 18: to find a UMVUE
Example 19
Let X1, X2, · · · , Xn be i.i.d. from the uniform distribution on the interval (0, θ), with Θ = [1, ∞). X(n) is sufficient for θ but not complete. Its Lebesgue p.d.f. is nθ^(−n) x^(n−1) I(0,θ)(x). Suppose EU(X(n)) = 0; then ∫_0^θ U(x) x^(n−1) dx = 0 for all θ ≥ 1. This implies that U(x) = 0 a.e. Lebesgue measure on [1, ∞) and ∫_0^1 U(x) x^(n−1) dx = 0. Let ν = θ and consider T = h(X(n)); to have E(TU) = 0, we must have ∫_0^1 h(x) U(x) x^(n−1) dx = 0. Consider
h(x) = c for 0 ≤ x ≤ 1, and h(x) = bx for x > 1,
where c, b are some constants. ET = θ implies θ = cP(X(n) ≤ 1) + bE[X(n) I(1,∞)(X(n))] = cθ^(−n) + bn(θ − θ^(−n))/(n + 1) for all θ ≥ 1. Hence c = 1 and b = (n + 1)/n. The UMVUE of θ is then
h(X(n)) = 1 for 0 ≤ X(n) ≤ 1, and h(X(n)) = (n + 1)X(n)/n for X(n) > 1.
Application II of Theorem 18: to show the nonexistence of any UMVUE
Example 20
Let X be a sample (of size 1) from the uniform distribution U(θ − 1/2, θ + 1/2), θ ∈ R. We can show that there is no UMVUE of ν = g(θ) for any nonconstant function g. Any U ∈ U must satisfy ∫_{θ−1/2}^{θ+1/2} U(x) dx = 0 for all θ ∈ R. Differentiating both sides leads to U(x) = U(x + 1) a.e. m, where m is the Lebesgue measure on R. If T is a UMVUE, then E(TU) = 0 for all such U, which forces T(x)U(x) = T(x + 1)U(x + 1) a.e. m and hence T(x) = T(x + 1) a.e. m. Since g(θ) = ∫_{θ−1/2}^{θ+1/2} T(x) dx for all θ ∈ R, we get g′(θ) = T(θ + 1/2) − T(θ − 1/2) = 0 a.e. m, which contradicts the fact that g is a nonconstant function.
Corollary 21
(i) Let Tj be a UMVUE of νj, j = 1, 2, · · · , k, where k is a fixed positive integer. Then Σ_{j=1}^k cj Tj is a UMVUE of ν = Σ_{j=1}^k cj νj for any constants c1, c2, · · · , ck.
(ii) Let T1, T2 be two UMVUEs of ν. Then T1 = T2 a.s. P for any P ∈ P.
These are consequences of Theorem 18.
UMVUE: Information inequality
Theorem 22 (Cramér-Rao lower bound)
Let X = (X1, X2, · · · , Xn) be a sample from P ∈ P = {Pθ : θ ∈ Θ}, where Θ is an open set in R^k. Suppose that T(X) is an estimator with ET(X) = g(θ) being a differentiable function of θ; that Pθ has a p.d.f. fθ w.r.t. a measure ν for all θ ∈ Θ; and that fθ is differentiable as a function of θ and satisfies

(∂/∂θ) ∫ h(x) fθ(x) dν = ∫ h(x) (∂/∂θ) fθ(x) dν, θ ∈ Θ, (4)

for h(x) ≡ 1 and h(x) = T(x). Then

var(T(X)) ≥ [(∂/∂θ)g(θ)]^τ [I(θ)]^(−1) (∂/∂θ)g(θ) (information inequality), (5)

where I(θ) = E{(∂/∂θ) log fθ(X) [(∂/∂θ) log fθ(X)]^τ} is assumed to be positive definite for any θ ∈ Θ. It is called the Fisher information matrix. The greater I(θ) is, the easier it is to distinguish θ from neighboring values, and the more accurately θ can be estimated.
Proof of Theorem 22.
Taking h(x) ≡ 1 in (4) gives E[(∂/∂θ) log fθ(X)] = 0; taking h(x) = T(x) gives E[T(X)(∂/∂θ) log fθ(X)] = (∂/∂θ)g(θ). Denote qθ(X) = (∂/∂θ) log fθ(X). It suffices to show that

var(T(X)) ≥ [cov(T(X), qθ(X))]^τ [var(qθ(X))]^(−1) cov(T(X), qθ(X)). (6)

Denote Uθ(X) = T(X) − [cov(T(X), qθ(X))]^τ [var(qθ(X))]^(−1) qθ(X); then (6) follows from observing that var(Uθ(X)) ≥ 0. □
Remark: I(θ) depends on the particular parametrization. If θ = ψ(η) and ψ(η) is differentiable, then the Fisher information that X contains about η is (∂/∂η)ψ(η) I(ψ(η)) [(∂/∂η)ψ(η)]^τ.
Proposition
(i) If X, Y are independent with Fisher information matrices IX(θ), IY(θ), then the Fisher information about θ contained in (X, Y) is IX(θ) + IY(θ). In particular, if X1, X2, · · · , Xn are i.i.d. and I1(θ) is the Fisher information about θ contained in X1, then the Fisher information about θ contained in X1, X2, · · · , Xn is nI1(θ).
(ii) Suppose that X has the p.d.f. fθ that is twice differentiable in θ and that (4) holds with h(x) ≡ 1 and fθ replaced by ∂fθ/∂θ. Then I(θ) = −E[(∂²/∂θ∂θ^τ) log fθ(X)].
Example 23
Let X1, X2, · · · , Xn be i.i.d. with the Lebesgue p.d.f. (1/σ) f((x − µ)/σ), where f(x) > 0 and f′(x) exists for all x ∈ R, µ ∈ R, σ > 0. Let θ = (µ, σ). Then the Fisher information about θ contained in X1, X2, · · · , Xn is

I(θ) = (n/σ²) ×
  [ ∫ [f′(x)]²/f(x) dx                    ∫ f′(x)[xf′(x) + f(x)]/f(x) dx ]
  [ ∫ f′(x)[xf′(x) + f(x)]/f(x) dx        ∫ [xf′(x) + f(x)]²/f(x) dx    ]    (7)

For example, let f(x) = (1/√(2π)) exp(−x²/2). If θ = (µ, σ), then
I(θ) = (n/σ²) [ 1  0 ; 0  2 ];
if θ = (µ, σ²), then
I(θ) = (n/σ²) [ 1  0 ; 0  1/(2σ²) ].
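The entries of (7) for the standard normal f can be verified numerically (our own sketch): the three integrals equal 1, 0 and 2, giving I(θ) = (n/σ²) diag(1, 2) for θ = (µ, σ).

```python
import numpy as np

x = np.linspace(-12.0, 12.0, 400_001)
dx = x[1] - x[0]
f = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)   # standard normal density
fp = -x * f                                   # its derivative f'(x)

a11 = np.sum(fp**2 / f) * dx                  # -> 1  (mu, mu entry)
a12 = np.sum(fp * (x * fp + f) / f) * dx      # -> 0  (off-diagonal)
a22 = np.sum((x * fp + f) ** 2 / f) * dx      # -> 2  (sigma, sigma entry)
print(a11, a12, a22)
```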
Proposition
Suppose that the distribution of X is from an exponential family {fθ : θ ∈ Θ}, i.e., the p.d.f. of X w.r.t. a σ-finite measure is

fθ(x) = exp{[η(θ)]^τ T(x) − ξ(θ)} c(x), (8)

where Θ is an open subset of R^k.
(i) The regularity condition (4) is satisfied for any h with E|h(X)| < ∞, and I(θ) = −E[(∂²/∂θ∂θ^τ) log fθ(X)].
(ii) If Ī(η) is the Fisher information matrix for the natural parameter η, then the variance-covariance matrix var(T) = Ī(η).
(iii) If I(ν) is the Fisher information matrix for the parameter ν = ET(X), then var(T) = [I(ν)]^(−1).
UMVUE Information inequality
Proof.
(ii) fη(x) = exp{ητT (x)− ζ(η)}c(x),∂
∂ηlog fη(x) = T (x)− ∂
∂ηζ(η) =
T (x)− ET (X ), var(T ) =∂2
∂η∂ητ.
(iii) I¯(η) =
∂ν
∂ηI (ν)(
∂ν
∂η)τ =
∂2
∂η∂ητI (ν)[
∂2
∂η∂ητ]τ
�
Remark: for exponential family, the variance of any linear function of Tattains the Cramer-Rao lower bound. The following result gives anecessary condition for var(U(X )) of an estimator U(X ) to attain theCramer-Rao lower bound.
Proposition. Assume that the conditions in Theorem 22 hold with T(X) replaced by U(X) and that Θ ⊂ R.
(i) If var(U(X)) of an estimator U(X) attains the Cramér-Rao lower bound, then a(θ)[U(X) − g(θ)] = g′(θ)(∂/∂θ) log fθ(X) a.s. Pθ for some function a(θ), θ ∈ Θ.
(ii) Let fθ and T be given in (8). If var(U(X)) of an estimator U(X) attains the Cramér-Rao lower bound, then U(X) is a linear function of T(X) a.s. Pθ, θ ∈ Θ.
Let X1, X2, · · · , Xn be i.i.d. from N(µ, σ²), where µ is unknown and σ² is known. X̄ is sufficient and complete, and I(µ) = n/σ².
- For ν = µ: g′ = 1 and var(X̄) = σ²/n = [I(µ)]^(−1), so X̄ attains the Cramér-Rao lower bound.
- For ν = µ²: g′(µ) = 2µ, so the Cramér-Rao lower bound is 4µ²σ²/n. The UMVUE of ν is h(X̄) = X̄² − σ²/n, with var(h(X̄)) = 4µ²σ²/n + 2σ⁴/n², which exceeds the lower bound.
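A final simulation check (our own sketch, with illustrative µ, σ, n): X̄² − σ²/n is unbiased for µ², and its variance sits strictly above the Cramér-Rao bound.

```python
import numpy as np

rng = np.random.default_rng(6)
mu, sigma, n, reps = 1.5, 2.0, 10, 400_000

xbar = rng.normal(mu, sigma / np.sqrt(n), size=reps)  # Xbar ~ N(mu, sigma^2/n)
h = xbar**2 - sigma**2 / n                            # UMVUE of mu^2
crlb = 4 * mu**2 * sigma**2 / n                       # Cramer-Rao bound: 3.6

print(h.mean(), mu**2)   # unbiased: both near 2.25
print(h.var(), crlb)     # variance near 3.92, strictly above 3.6
```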