Chapter 10: Variance Estimation
Jae-Kwang Kim
Iowa State University
Fall, 2014
Kim (ISU) Ch. 10: Fall, 2014 1 / 43
1 Introduction
2 Taylor series linearization
3 Replication variance estimation
Introduction

Use of variance estimates in sampling:
- Inferential purpose: constructing confidence intervals, hypothesis testing
- Descriptive purpose: evaluation of survey estimates, future survey planning

What is a good variance estimator?
- Unbiased, or nearly unbiased (a positive bias is conservative)
- Stable: the variance of the variance estimator is low
- Nonnegative
- Simple to calculate
Introduction

HT variance estimator (or SYG variance estimator): some problems
1 Can take negative values.
2 Requires the joint inclusion probabilities $\pi_{ij}$, which can be cumbersome to compute for large sample sizes.
Variance of variance estimator

Parameter of interest: $V(\hat\theta)$.
Let $\hat V$ be an (unbiased) estimator of $V(\hat\theta)$. We may assume that
$$\frac{d \hat V}{V(\hat\theta)} \sim \chi^2(d)$$
for some $d$ (the degrees of freedom of $\hat V$).
Variance of variance estimator (Cont'd)

By the properties of the $\chi^2$ distribution,
$$E(\hat V) = V(\hat\theta) \quad \text{and} \quad V(\hat V) = \frac{2\{V(\hat\theta)\}^2}{d}.$$
Thus,
$$CV(\hat V) = \frac{\sqrt{V(\hat V)}}{E(\hat V)} = \sqrt{\frac{2}{d}}.$$
How to compute $d$?
1 Method of moments: requires an estimate of $V(\hat V)$.
2 Rule of thumb: use $d = n_{PSU} - H$, where $n_{PSU}$ is the number of sampled PSUs and $H$ is the number of strata.
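As a quick numeric sketch of the rule of thumb above (the PSU and stratum counts below are hypothetical), the coefficient of variation $\sqrt{2/d}$ shrinks only slowly as the degrees of freedom grow:

```python
import math

def variance_estimator_cv(d):
    """CV of a variance estimator with d degrees of freedom,
    under the chi-square approximation d*Vhat/V ~ chi2(d)."""
    return math.sqrt(2.0 / d)

# Rule of thumb: d = (number of sampled PSUs) - (number of strata).
n_psu, n_strata = 62, 12
d = n_psu - n_strata
print(round(variance_estimator_cv(d), 3))  # sqrt(2/50) → 0.2
```

Halving the CV thus requires roughly four times the degrees of freedom.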
Alternative to HT variance estimation

Simplified variance estimator: motivation
1 Consider the variance estimator for PPS sampling:
$$\hat V_0 = \frac{1}{n(n-1)} \sum_{i \in A} \left( \frac{y_i}{p_i} - \frac{1}{n} \sum_{j \in A} \frac{y_j}{p_j} \right)^2,$$
which is always nonnegative and simple to compute.
2 What if we use $\hat V_0$ as an estimator for the variance of $\hat Y_{HT} = \sum_{i \in A} y_i/\pi_i$ by treating $\hat Y_{HT} \cong \hat Y_{PPS} = \frac{1}{n} \sum_{i \in A} y_i/p_i$?
3 Simplified variance estimator: use the PPS sampling variance estimator ($\hat V_0$) to estimate the variance of $\hat Y_{HT}$.
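A minimal sketch of $\hat V_0$ in Python (the $y$-values and single-draw probabilities $p_i = \pi_i/n$ below are hypothetical):

```python
def v0_simplified(y, p):
    """Simplified (PPS-style) variance estimator
    V0 = 1/(n(n-1)) * sum_i (y_i/p_i - (1/n) sum_j y_j/p_j)^2.
    Always nonnegative and needs no joint inclusion probabilities."""
    n = len(y)
    z = [yi / pi for yi, pi in zip(y, p)]
    zbar = sum(z) / n
    return sum((zi - zbar) ** 2 for zi in z) / (n * (n - 1))

# Hypothetical sample of n = 4 with p_i = pi_i / n.
y = [3.0, 5.0, 2.0, 8.0]
p = [0.25, 0.5, 0.125, 0.125]
print(v0_simplified(y, p))  # → 166.25
```

When $y_i/p_i$ is constant (exact PPS to size), $\hat V_0$ is exactly zero, reflecting that PPS sampling is then optimal.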
Theorem

$$E(\hat V_0) - \mathrm{Var}(\hat Y_{HT}) = \frac{n}{n-1} \left\{ \mathrm{Var}(\hat Y_{PPS}) - \mathrm{Var}(\hat Y_{HT}) \right\}$$
where $\mathrm{Var}(\hat Y_{PPS})$ is the variance of $\hat Y_{PPS}$ using $p_k = \pi_k/n$ as the selection probability for unit $k$ in each PPS draw, and
$$\mathrm{Var}(\hat Y_{PPS}) = \frac{1}{n} \sum_{i=1}^{N} p_i \left( \frac{y_i}{p_i} - Y \right)^2.$$
Remark

1 In most cases the bias is positive (thus, conservative estimation).
2 Under SRS, the relative bias of the simplified variance estimator is
$$\frac{E(\hat V_0) - \mathrm{Var}(\hat Y_{HT})}{\mathrm{Var}(\hat Y_{HT})} = \frac{n}{N-n}$$
and it is negligible if $n/N$ is negligible.
Remark (Cont'd)

3 Application to multi-stage sampling: express
$$\hat Y_{HT} = \sum_{i \in A_I} \frac{\hat Y_i}{\pi_{Ii}}.$$
The resulting simplified variance estimator can be written
$$\hat V_0 = \frac{1}{n(n-1)} \sum_{i \in A_I} \left( \frac{\hat Y_i}{p_i} - \hat Y_{HT} \right)^2 = \frac{n}{n-1} \sum_{i \in A_I} \left( \frac{\hat Y_i}{\pi_{Ii}} - \frac{1}{n} \hat Y_{HT} \right)^2$$
where $p_i = \pi_{Ii}/n$ and $n$ is the number of sampled PSUs.
Remark (Cont'd)

4 The bias is negligible if the primary sampling rate is negligible. If the sampling design is also a stratified (multi-stage) sampling such that
$$\hat Y_{HT} = \sum_{h=1}^{H} \sum_{i=1}^{n_h} w_{hi} \hat Y_{hi},$$
where $A_{Ih} = \{1, 2, \cdots, n_h\}$, the simplified variance estimator can be written
$$\hat V_0 = \sum_{h=1}^{H} \frac{n_h}{n_h - 1} \sum_{i=1}^{n_h} \left( w_{hi} \hat Y_{hi} - \frac{1}{n_h} \sum_{j=1}^{n_h} w_{hj} \hat Y_{hj} \right)^2.$$
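The stratified form of $\hat V_0$ can be sketched as follows (the weights and estimated PSU totals below are hypothetical):

```python
def v0_stratified(w, y):
    """Simplified variance estimator for stratified (multi-stage) sampling:
    V0 = sum_h n_h/(n_h-1) * sum_i (w_hi*Y_hi - mean_j w_hj*Y_hj)^2.
    `w` and `y` are per-stratum lists of PSU weights and estimated PSU totals."""
    v = 0.0
    for wh, yh in zip(w, y):
        nh = len(wh)
        t = [whi * yhi for whi, yhi in zip(wh, yh)]
        tbar = sum(t) / nh
        v += nh / (nh - 1) * sum((ti - tbar) ** 2 for ti in t)
    return v

# Two hypothetical strata with 3 and 2 sampled PSUs.
print(v0_stratified([[2.0, 2.0, 2.0], [3.0, 3.0]],
                    [[4.0, 6.0, 5.0], [7.0, 9.0]]))  # → 48.0
```

Only the weighted PSU totals within each stratum are needed, which is why this "ultimate cluster" form is the workhorse in production survey software.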
Taylor series linearization

- Estimate the variance of a nonlinear estimator by approximating the estimator by a linear function.
- First-order Taylor linearization: for $p$-dimensional $\bar{\mathbf{y}}$, if $\bar{\mathbf{y}}_n = \bar{\mathbf{Y}}_N + O_p(n^{-1/2})$, then
$$g(\bar{\mathbf{y}}_n) = g(\bar{\mathbf{Y}}) + \sum_{j=1}^{p} \frac{\partial g(\bar{\mathbf{Y}})}{\partial y_j} \left( \bar y_{jn} - \bar Y_j \right) + O_p(n^{-1}).$$
- Linearized variance:
$$V\{g(\bar{\mathbf{y}}_n)\} \doteq \sum_{i=1}^{p} \sum_{j=1}^{p} \frac{\partial g(\bar{\mathbf{Y}})}{\partial y_i} \frac{\partial g(\bar{\mathbf{Y}})}{\partial y_j} \mathrm{Cov}\{\bar y_{in}, \bar y_{jn}\}.$$
Two methods of obtaining linearized variance estimation

1 Direct method: use
$$\hat V\{g(\bar{\mathbf{y}}_n)\} \doteq \sum_{i=1}^{p} \sum_{j=1}^{p} \frac{\partial g(\bar{\mathbf{y}}_n)}{\partial y_i} \frac{\partial g(\bar{\mathbf{y}}_n)}{\partial y_j} \hat C\{\bar y_{in}, \bar y_{jn}\}.$$
2 Residual technique:
  1 Obtain a first-order Taylor expansion to get
  $$g(\bar{\mathbf{y}}_n) \doteq g(\bar{\mathbf{Y}}) + \frac{1}{N} \sum_{i \in A} \frac{1}{\pi_i} e_i$$
  for some $e_i$.
  2 The variance of $g(\bar{\mathbf{y}}_n)$ is then approximated by the variance of $N^{-1} \sum_{i \in A} \pi_i^{-1} e_i$. If we observed the $e_i$, we could estimate this variance directly; in practice, obtain a variance estimator of $N^{-1} \sum_{i \in A} \pi_i^{-1} e_i$ and replace $e_i$ by $\hat e_i$.
Example: Ratio

$$\hat R = \frac{\bar y}{\bar x}, \qquad R = \frac{\bar Y}{\bar X}$$
Taylor expansion:
$$\hat R = R + \bar X^{-1} (\bar y - R \bar x) + O_p(n^{-1})$$
Method 1 (direct method):
$$\hat V(\hat R) \doteq \bar x^{-2} \hat V(\bar y) + \bar x^{-2} \hat R^2 \hat V(\bar x) - 2 \bar x^{-2} \hat R \hat C(\bar x, \bar y)$$
Method 2 (residual technique):
$$\hat V(\hat R) \doteq \frac{1}{N^2} \sum_{i \in A} \sum_{j \in A} \frac{\pi_{ij} - \pi_i \pi_j}{\pi_{ij}} \frac{\hat e_i}{\pi_i} \frac{\hat e_j}{\pi_j}$$
where $\hat e_i = \bar x^{-1} (y_i - \hat R x_i)$.
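A sketch of the residual technique for the ratio, assuming SRS so that the double sum collapses to the familiar $(1 - n/N)\, s_e^2 / n$ with $\hat e_i = (y_i - \hat R x_i)/\bar x$ (data below hypothetical):

```python
def ratio_var_srs(x, y, N):
    """Residual-technique (linearized) variance estimate for R = ybar/xbar
    under SRS: (1 - n/N) * s_e^2 / n, with e_i = (y_i - R*x_i)/xbar.
    A sketch of the general HT double-sum specialized to SRS."""
    n = len(x)
    xbar = sum(x) / n
    R = sum(y) / sum(x)
    e = [(yi - R * xi) / xbar for xi, yi in zip(x, y)]
    s2e = sum(ei ** 2 for ei in e) / (n - 1)  # residuals sum to zero
    return (1 - n / N) * s2e / n

# When y is exactly proportional to x, all residuals vanish:
print(ratio_var_srs([1.0, 2.0, 3.0], [2.0, 4.0, 6.0], 100))  # → 0.0
```

The example illustrates why the ratio estimator is efficient when $y \approx R x$: the residuals, not the raw $y$-values, drive the variance.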
Ratio estimator $\hat Y_r = \bar X \hat R$

$$\hat V_1 \doteq \frac{1}{N^2} \sum_{i \in A} \sum_{j \in A} \frac{\pi_{ij} - \pi_i \pi_j}{\pi_{ij}} \frac{y_i - \hat R x_i}{\pi_i} \frac{y_j - \hat R x_j}{\pi_j}$$
$$\hat V_2 \doteq \frac{1}{N^2} \left( \frac{\bar X}{\bar x} \right)^2 \sum_{i \in A} \sum_{j \in A} \frac{\pi_{ij} - \pi_i \pi_j}{\pi_{ij}} \frac{y_i - \hat R x_i}{\pi_i} \frac{y_j - \hat R x_j}{\pi_j}$$
Which one do you prefer?
Variance estimation for the GREG estimator

For simplicity, assume that $c_i = \boldsymbol{\lambda}' \mathbf{x}_i$ so that
$$\hat Y_{GREG} = \sum_{i \in A} \frac{1}{\pi_i} g_i y_i$$
where
$$g_i = \mathbf{X}' \left( \sum_{i \in A} \frac{1}{\pi_i c_i} \mathbf{x}_i \mathbf{x}_i' \right)^{-1} \frac{1}{c_i} \mathbf{x}_i.$$
Two types of variance estimators:
$$\hat V_1 = \sum_{i \in A} \sum_{j \in A} \frac{\pi_{ij} - \pi_i \pi_j}{\pi_{ij}} \frac{\hat e_i}{\pi_i} \frac{\hat e_j}{\pi_j}$$
$$\hat V_2 = \sum_{i \in A} \sum_{j \in A} \frac{\pi_{ij} - \pi_i \pi_j}{\pi_{ij}} \frac{g_i \hat e_i}{\pi_i} \frac{g_j \hat e_j}{\pi_j}$$
They are asymptotically equivalent because $g_i \doteq 1$. $\hat V_2$ has a good conditional property.
Variance estimation for the poststratified estimator

Poststratified estimator:
$$\hat Y_{post} = \sum_{g=1}^{G} \frac{N_g}{\hat N_g} \hat Y_g$$
Unconditional variance estimator:
$$\hat V_1 = \sum_{i \in A} \sum_{j \in A} \frac{\pi_{ij} - \pi_i \pi_j}{\pi_{ij}} \frac{\hat e_i}{\pi_i} \frac{\hat e_j}{\pi_j}$$
where $\hat e_i = y_i - \bar y_g$ for $x_{ig} = 1$.
Conditional variance estimator:
$$\hat V_2 = \sum_{i \in A} \sum_{j \in A} \frac{\pi_{ij} - \pi_i \pi_j}{\pi_{ij}} \frac{g_i \hat e_i}{\pi_i} \frac{g_j \hat e_j}{\pi_j}$$
where $\hat e_i = y_i - \bar y_g$ and $g_i = N_g/\hat N_g$ for $x_{ig} = 1$.
Variance estimation for the poststratified estimator (Cont'd)

Under SRS:
$$\hat V_1 = \frac{N^2}{n} \left( 1 - \frac{n}{N} \right) \sum_{g=1}^{G} \frac{n_g - 1}{n - 1} s_g^2$$
$$\hat V_2 = \left( 1 - \frac{n}{N} \right) \frac{n}{n-1} \sum_{g=1}^{G} \frac{N_g^2}{n_g} \frac{n_g - 1}{n_g} s_g^2$$
where
$$s_g^2 = \frac{1}{n_g - 1} \sum_{i \in A_g} (y_i - \bar y_g)^2.$$
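A sketch of the two SRS formulas above (group sizes, population counts, and $y$-values are all hypothetical):

```python
def poststrat_vars_srs(groups, n, N):
    """Unconditional (V1) and conditional (V2) variance estimators for the
    poststratified estimator under SRS, following the slide formulas.
    `groups` maps a label g -> (N_g, list of y-values observed in A_g)."""
    f = 1 - n / N
    V1 = 0.0
    V2 = 0.0
    for Ng, ys in groups.values():
        ng = len(ys)
        ybar = sum(ys) / ng
        s2 = sum((yi - ybar) ** 2 for yi in ys) / (ng - 1)
        V1 += (ng - 1) / (n - 1) * s2
        V2 += Ng ** 2 / ng * (ng - 1) / ng * s2
    V1 *= N ** 2 / n * f
    V2 *= f * n / (n - 1)
    return V1, V2

# Hypothetical population of N = 100 in two poststrata, SRS of n = 5.
V1, V2 = poststrat_vars_srs({1: (50, [1.0, 2.0, 3.0]), 2: (50, [4.0, 6.0])}, 5, 100)
print(V1, V2)
```

$\hat V_2$ conditions on the realized group sample sizes $n_g$, which is why it is usually preferred when the $n_g$ are known after sampling.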
Replication method: Idea

1 Interested in estimating the variance of $\hat\theta$.
2 From the original sample $A$, generate $G$ resamples $A^{(1)}, A^{(2)}, \cdots, A^{(G)}$.
3 Based on the observations in resample $A^{(g)}$ $(g = 1, 2, \cdots, G)$, compute the replicate $\hat\theta^{(g)}$ of $\hat\theta$.
4 The replication variance estimator for $\hat\theta$ is computed as
$$\hat V = K_G \sum_{g=1}^{G} \left( \hat\theta^{(g)} - \hat\theta^{(\cdot)} \right)^2$$
for some suitable $K_G$, where $\hat\theta^{(\cdot)} = G^{-1} \sum_{g=1}^{G} \hat\theta^{(g)}$.
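The generic recipe above fits in a few lines (the replicate values below are hypothetical; for $G$ independent random groups, $K_G = 1/\{G(G-1)\}$):

```python
def replication_variance(theta_reps, K_G):
    """Generic replication variance estimator:
    V = K_G * sum_g (theta^(g) - theta^(.))^2,
    where theta^(.) is the mean of the replicates."""
    G = len(theta_reps)
    center = sum(theta_reps) / G
    return K_G * sum((t - center) ** 2 for t in theta_reps)

# Hypothetical replicates from G = 4 random groups.
reps = [10.2, 9.8, 10.5, 9.5]
print(replication_variance(reps, 1 / (4 * 3)))
```

All the methods that follow (random groups, jackknife, BRR, bootstrap) differ only in how the resamples are formed and in the choice of $K_G$.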
Replication methods for variance estimation

- Random group method
  - Independent random group method: Mahalanobis (1939, 1946), Deming (1946)
  - Non-independent random group method
- Balanced repeated replication: Plackett and Burman (1946), McCarthy (1966)
- Jackknife: Quenouille (1949), Tukey (1958)
- Bootstrap: Efron (1979)
Independent Random Group Method

Procedure:
1 A sample $A_1$ is drawn from the finite population according to the design $p$. Compute $\hat\theta_1$ from the observations in $A_1$.
2 Sample $A_1$ is replaced into the population and a second sample $A_2$ is drawn according to the same sampling design $p$. Compute $\hat\theta_2$ from the observations in $A_2$.
3 This process is repeated $G$ ($\ge 2$) times.
4 Use
$$\hat\theta_{RG} = \frac{1}{G} \sum_{k=1}^{G} \hat\theta^{(k)} \quad (1)$$
as an estimator for $\theta$, and use
$$\hat V(\hat\theta_{RG}) = \frac{1}{G} \frac{1}{G-1} \sum_{k=1}^{G} \left( \hat\theta^{(k)} - \hat\theta_{RG} \right)^2 \quad (2)$$
as a variance estimator for $\hat\theta_{RG}$.
Independent Random Group Method: Property

Let $\hat\theta_1, \cdots, \hat\theta_G$ be uncorrelated random variables with common expectation $E(\hat\theta_1) = \theta$. Then,
1 $\hat\theta_{RG}$ in (1) is unbiased for $\theta$.
2 $\hat V(\hat\theta_{RG})$ in (2) is unbiased for $V(\hat\theta_{RG})$.
Independent Random Group Method: Example

Suppose that a sample of households is to be drawn using a multistage sampling design. Two random groups are desired. An areal frame exists, and the target population is divided into two strata (defined, say, on the basis of geography). Stratum 1 contains $N_1$ PSUs and stratum 2 consists of one PSU that is to be selected with certainty. $G = 2$ independent random groups are to be used. Each sample is selected independently according to the following plan:
- Stratum 1: Two PSUs are selected using some $\pi$ps sampling design. From each selected PSU, an equal-probability systematic sample of $m_1$ households is selected.
- Stratum 2: The certainty PSU is divided into city blocks, with the block size varying between 10 and 15 households. An unequal-probability systematic sample of $m_2$ blocks is selected with probability proportional to the block sizes. All households in selected blocks are enumerated.
Independent Random Group Method: Example (Cont'd)

For point estimation, use
$$\hat\theta = \left( \hat\theta^{(1)} + \hat\theta^{(2)} \right) / 2.$$
For variance estimation, use
$$\hat V(\hat\theta) = \frac{1}{2(1)} \sum_{g=1}^{2} \left( \hat\theta^{(g)} - \hat\theta \right)^2 = \left( \hat\theta^{(1)} - \hat\theta^{(2)} \right)^2 / 4.$$
Easy concept and an unbiased variance estimator, but unstable; not often used in practice.
Non-independent Random Groups

Idea:
1 Given sample $A$, use a random mechanism to divide $A$ into $A = \cup_{g=1}^{G} A^{(g)}$, where $A^{(1)}, \cdots, A^{(G)}$ are disjoint.
2 Calculate $\hat\theta^{(1)}, \cdots, \hat\theta^{(G)}$ and treat them as independent.
3 Use
$$\hat V = \frac{1}{G} \frac{1}{G-1} \sum_{k=1}^{G} \left( \hat\theta^{(k)} - \hat\theta_{RG} \right)^2$$
as a variance estimator for $\hat\theta$.

Requirement: each $A^{(g)}$ should have the same design as $A$.
Impractical in some cases; unstable.
Non-independent Random Groups: Property

Let $\hat\theta_1, \cdots, \hat\theta_G$ be random variables with common expectation $E(\hat\theta_i) = \theta$. Then,
$$E\{\hat V(\hat\theta_{RG})\} - V(\hat\theta_{RG}) = -\frac{1}{G(G-1)} \mathop{\sum\sum}_{i \ne j} \mathrm{Cov}\left( \hat\theta^{(i)}, \hat\theta^{(j)} \right).$$
- If $\hat\theta^{(1)}, \cdots, \hat\theta^{(G)}$ are independent, then the RHS $= 0$.
- If $\hat\theta^{(1)}, \cdots, \hat\theta^{(G)}$ are identically distributed, then the RHS $= -\mathrm{Cov}(\hat\theta^{(1)}, \hat\theta^{(2)})$.
Example: Non-independent random group method under simple random sampling

- Interested in variance estimation for $\hat\theta = \bar y$ under simple random sampling.
- Partition the sample into $G$ groups of dependent samples $A = \cup_{g=1}^{G} A^{(g)}$, where $A^{(g)}$ is a simple random sample of size $b = n/G$.
- Compute $\hat\theta^{(g)} = \bar y^{(g)}$ from $A^{(g)}$. Note that
$$\hat\theta = \frac{1}{G} \sum_{g=1}^{G} \bar y^{(g)}.$$
- How large is the bias of $\hat V(\hat\theta_{RG})$?
$$\mathrm{Bias}(\hat V) = -\mathrm{Cov}\left( \bar y^{(1)}, \bar y^{(2)} \right) = \frac{1}{N} S^2$$
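The bias formula can be checked exactly on a tiny population by enumerating all ordered pairs of distinct units (a sketch for the simplest case $n = 2$, $G = 2$, one unit per group; the population values are hypothetical):

```python
from itertools import permutations

def rg_covariance(pop):
    """Exact covariance of the two random-group means when the sample is an
    SRS of size 2 split into two groups of one unit each.
    Theory (slide above): Cov = -S^2/N, so Bias(V) = +S^2/N."""
    pairs = list(permutations(pop, 2))
    m1 = sum(a for a, b in pairs) / len(pairs)
    m2 = sum(b for a, b in pairs) / len(pairs)
    return sum((a - m1) * (b - m2) for a, b in pairs) / len(pairs)

pop = [1.0, 2.0, 4.0, 7.0]
N = len(pop)
mean = sum(pop) / N
S2 = sum((y - mean) ** 2 for y in pop) / (N - 1)
print(rg_covariance(pop), -S2 / N)  # the two agree: -1.75 each
```

The negative covariance comes from sampling without replacement, and it vanishes as $N \to \infty$, matching the remark that the bias is negligible for large populations.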
Jackknife method for variance estimation

Motivation:
- Basic setup: let $(x_i, y_i)$ be IID from a bivariate distribution with mean $(\mu_x, \mu_y)$. Let $\theta = \mu_y/\mu_x$. The standard ratio estimator of $\theta$, $\hat\theta = \bar x^{-1} \bar y$, has bias of order $O(n^{-1})$.
- Quenouille (1949) idea: propose a bias-reduced estimator of $\theta$:
$$\hat\theta^{(\cdot)} = \frac{1}{n} \sum_{k=1}^{n} \hat\theta^{(k)}$$
where $\hat\theta^{(k)} = n\hat\theta - (n-1)\hat\theta^{(-k)}$ and
$$\hat\theta^{(-k)} = \left( \sum_{i \ne k} x_i \right)^{-1} \sum_{i \ne k} y_i.$$
Jackknife method for variance estimation (Cont'd)

Tukey (1958): treat $\hat\theta^{(1)}, \cdots, \hat\theta^{(n)}$ as an independent random group of size $n$ to get
$$\hat V_{JK}(\hat\theta) \doteq \frac{1}{n} \frac{1}{n-1} \sum_{k=1}^{n} \left( \hat\theta^{(k)} - \hat\theta^{(\cdot)} \right)^2 = \frac{n-1}{n} \sum_{k=1}^{n} \left( \hat\theta^{(-k)} - \hat\theta^{(-\cdot)} \right)^2,$$
where $\hat\theta^{(-\cdot)} = n^{-1} \sum_{k=1}^{n} \hat\theta^{(-k)}$.
Taylor theorem 2

Let $\{X_n, W_n\}$ be a sequence of random variables such that
$$X_n = W_n + O_p(r_n)$$
where $r_n \to 0$ as $n \to \infty$. If $g(x)$ is a function with continuous $s$-th derivative on the line segment joining $X_n$ and $W_n$, and the $s$-th order partial derivatives are bounded, then
$$g(X_n) = g(W_n) + \sum_{k=1}^{s-1} \frac{1}{k!} g^{(k)}(W_n)(X_n - W_n)^k + O_p(r_n^s)$$
where $g^{(k)}(a)$ is the $k$-th derivative of $g(x)$ evaluated at $x = a$.
Properties

- If $\hat\theta_n = \bar y$, then
$$\hat V_{JK} = \frac{1}{n} \frac{1}{n-1} \sum_{i=1}^{n} (y_i - \bar y)^2 = \frac{1}{n} s_y^2.$$
- Under some regularity conditions, $\bar y^{(-k)} - \bar y = O_p(n^{-1})$.
- For $\hat\theta = f(\bar x, \bar y)$, we have
$$\hat\theta^{(-k)} - \hat\theta = \frac{\partial f}{\partial x}(\bar x, \bar y)\left( \bar x^{(-k)} - \bar x \right) + \frac{\partial f}{\partial y}(\bar x, \bar y)\left( \bar y^{(-k)} - \bar y \right) + o_p(n^{-1}).$$
- The jackknife variance estimator defined by
$$\hat V_{JK} = \frac{n-1}{n} \sum_{k=1}^{n} \left( \hat\theta^{(-k)} - \hat\theta \right)^2$$
is asymptotically equivalent to the linearized variance estimator.
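A delete-one jackknife sketch confirming the first property, $\hat V_{JK} = s_y^2/n$ when $\hat\theta = \bar y$ (the data are hypothetical):

```python
def jackknife_variance(data, stat):
    """Delete-one jackknife:
    V_JK = (n-1)/n * sum_k (theta^(-k) - mean of replicates)^2."""
    n = len(data)
    reps = [stat(data[:k] + data[k + 1:]) for k in range(n)]
    center = sum(reps) / n
    return (n - 1) / n * sum((r - center) ** 2 for r in reps)

mean = lambda v: sum(v) / len(v)
y = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
n = len(y)
s2 = sum((yi - mean(y)) ** 2 for yi in y) / (n - 1)
# For theta = ybar, the jackknife reproduces s^2/n exactly:
print(jackknife_variance(y, mean), s2 / n)
```

For nonlinear statistics the equality is only asymptotic, which is the content of the last property above.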
Example: Ratio

$$\hat R = \frac{\bar y}{\bar x}.$$
Jackknife replicates for $\hat R$:
$$\hat R^{(-k)} = \frac{\bar y^{(-k)}}{\bar x^{(-k)}} = \frac{n\bar y - y_k}{n\bar x - x_k}.$$
By Taylor theorem 2:
$$\hat R^{(-k)} = \hat R + (\bar x)^{-1}\left( \bar y^{(-k)} - \hat R \bar x^{(-k)} \right) + O_p(n^{-2}).$$
Example: Ratio (Cont'd)

Jackknife variance estimator:
$$\hat V_{JK} = \frac{n-1}{n} \sum_{k=1}^{n} \left( \hat R^{(-k)} - \hat R \right)^2 \doteq \frac{n-1}{n} \sum_{k=1}^{n} \left\{ (\bar x)^{-1} \left( \bar y^{(-k)} - \hat R \bar x^{(-k)} \right) \right\}^2 = \frac{1}{n(n-1)} \frac{1}{\bar x^2} \sum_{k=1}^{n} \left( y_k - \hat R x_k \right)^2.$$
The jackknife variance estimator for the ratio estimator $\hat Y_r = \bar X \hat R$ is asymptotically equivalent to
$$\hat V_{JK} = \left( \frac{\bar X}{\bar x} \right)^2 \frac{1}{n(n-1)} \sum_{k=1}^{n} \left( y_k - \hat R x_k \right)^2,$$
which is often called the conditional variance estimator.
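A sketch comparing the delete-one jackknife for $\hat R$ with the closed-form expression derived above (the data are hypothetical; the two agree only asymptotically, so they are close but not identical):

```python
def jackknife_var_ratio(x, y):
    """Delete-one jackknife variance for R = ybar/xbar, centered at R."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    R = sy / sx
    reps = [(sy - y[k]) / (sx - x[k]) for k in range(n)]
    return (n - 1) / n * sum((r - R) ** 2 for r in reps)

def linearized_var_ratio(x, y):
    """Closed-form approximation from the slide:
    1/(n(n-1)) * (1/xbar^2) * sum_k (y_k - R*x_k)^2."""
    n = len(x)
    xbar = sum(x) / n
    R = sum(y) / sum(x)
    return sum((yk - R * xk) ** 2 for xk, yk in zip(x, y)) / (n * (n - 1) * xbar ** 2)

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.2, 2.1, 2.9, 4.4, 5.1]
print(jackknife_var_ratio(x, y), linearized_var_ratio(x, y))  # close but not equal
```

The gap between the two shrinks at rate $O_p(n^{-1})$ relative to the variance itself, consistent with the asymptotic-equivalence claim.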
Example: Post-stratification (under SRS)

Point estimator:
$$\hat Y_{post} = \sum_{g=1}^{G} N_g \bar y_g = \sum_{g=1}^{G} \frac{N_g}{n_g} \sum_{i \in A_g} y_i.$$
$k$-th jackknife replicate of $\hat Y_{post}$: for $k \in A_h$,
$$\hat Y_{post}^{(-k)} - \hat Y_{post} = N_h \left( \bar y_h^{(-k)} - \bar y_h \right) = N_h (n_h - 1)^{-1} (\bar y_h - y_k).$$
Example: Post-stratification (under SRS, Cont'd)

Jackknife variance estimator:
$$\hat V_{JK}(\hat Y_{post}) = \frac{n-1}{n} \sum_{k=1}^{n} \left( \hat Y_{post}^{(-k)} - \hat Y_{post} \right)^2 = \frac{n-1}{n} \sum_{g=1}^{G} N_g^2 (n_g - 1)^{-1} s_g^2.$$
This is asymptotically equivalent to the conditional variance estimator.
Extension to complex sampling

Stratified cluster sampling design:
$$\hat Y_{HT} = \sum_{h=1}^{H} \sum_{i=1}^{n_h} w_{hi} \hat Y_{hi}.$$
Jackknife replicates: delete the $j$-th cluster in the $g$-th stratum:
$$\hat Y_{HT}^{(-gj)} = \sum_{h=1}^{H} \sum_{i=1}^{n_h} w_{hi}^{(-gj)} \hat Y_{hi}$$
where
$$w_{hi}^{(-gj)} = \begin{cases} 0 & \text{if } h = g \text{ and } i = j \\ (n_h - 1)^{-1} n_h w_{hi} & \text{if } h = g \text{ and } i \ne j \\ w_{hi} & \text{otherwise.} \end{cases}$$
Extension to complex sampling (Cont'd)

Jackknife variance estimator:
$$\hat V_{JK}(\hat Y_{HT}) = \sum_{h=1}^{H} \frac{n_h - 1}{n_h} \sum_{i=1}^{n_h} \left( \hat Y_{HT}^{(-hi)} - \frac{1}{n_h} \sum_{j=1}^{n_h} \hat Y_{HT}^{(-hj)} \right)^2.$$
Property:
$$\hat V_{JK}(\hat Y_{HT}) = \sum_{h=1}^{H} \frac{n_h}{n_h - 1} \sum_{i=1}^{n_h} \left( w_{hi} \hat Y_{hi} - \frac{1}{n_h} \sum_{j=1}^{n_h} w_{hj} \hat Y_{hj} \right)^2 \equiv \hat V_0.$$
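The identity $\hat V_{JK}(\hat Y_{HT}) \equiv \hat V_0$ can be checked numerically; below is a sketch with hypothetical two-stratum weights and estimated PSU totals ($\hat V_0$ is restated so the block is self-contained):

```python
def jackknife_var_stratified(w, y):
    """Stratified delete-one-PSU jackknife for Y_HT = sum_h sum_i w_hi*Y_hi,
    using the replicate weights from the slide."""
    total = sum(whi * yhi for wh, yh in zip(w, y) for whi, yhi in zip(wh, yh))
    v = 0.0
    for wg, yg in zip(w, y):
        ng = len(wg)
        tg = sum(whi * yhi for whi, yhi in zip(wg, yg))
        # Replicate totals after deleting PSU j in this stratum.
        reps = [total - tg + ng / (ng - 1) * (tg - wg[j] * yg[j]) for j in range(ng)]
        center = sum(reps) / ng
        v += (ng - 1) / ng * sum((r - center) ** 2 for r in reps)
    return v

def v0_stratified(w, y):
    """V0 = sum_h n_h/(n_h-1) * sum_i (w_hi*Y_hi - mean_j w_hj*Y_hj)^2."""
    v = 0.0
    for wh, yh in zip(w, y):
        nh = len(wh)
        t = [whi * yhi for whi, yhi in zip(wh, yh)]
        tbar = sum(t) / nh
        v += nh / (nh - 1) * sum((ti - tbar) ** 2 for ti in t)
    return v

w = [[2.0, 2.0, 2.0], [3.0, 3.0]]
y = [[4.0, 6.0, 5.0], [7.0, 9.0]]
print(jackknife_var_stratified(w, y), v0_stratified(w, y))  # 48.0 48.0
```

The algebraic equality means the full replicate computation buys nothing for a linear statistic; the jackknife pays off only for nonlinear estimators built from the same replicate weights.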
Balanced repeated replication

- Basic set-up: stratified sampling with $n_h = 2$, $\hat\theta = \sum_{h=1}^{H} W_h \bar y_h$.
- Half-sample replication: pick one element from each stratum:
$$\hat\theta^{(v)} = \sum_{h=1}^{H} W_h \left\{ \delta_h^{(v)} y_{h1} + (1 - \delta_h^{(v)}) y_{h2} \right\}, \quad v = 1, \cdots, 2^H,$$
where
$$\delta_h^{(v)} = \begin{cases} 1 & \text{if } y_{h1} \text{ is selected} \\ 0 & \text{if } y_{h2} \text{ is selected.} \end{cases}$$
- Note that
$$\hat\theta^{(v)} - \hat\theta = \sum_{h=1}^{H} 0.5\, W_h \left( 2\delta_h^{(v)} - 1 \right)(y_{h1} - y_{h2}).$$
- Also, the standard variance estimator for stratified sampling (with $n_h = 2$) is
$$\hat V(\hat\theta) = \sum_{h=1}^{H} W_h^2 (y_{h1} - y_{h2})^2 / 4.$$
Balanced repeated replication (Cont'd)

- Balanced condition: there always exist $\delta^{(1)}, \cdots, \delta^{(G)}$ with $H < G \le H + 4$ such that
$$\sum_{v=1}^{G} (2\delta_h^{(v)} - 1)(2\delta_{h'}^{(v)} - 1) = 0 \quad \text{if } h \ne h'.$$
- Variance estimator:
$$\hat V_{BRR} = \frac{1}{G} \sum_{v=1}^{G} (\hat\theta^{(v)} - \hat\theta)^2.$$
- Property: if the $\delta_h^{(v)}$ satisfy the balanced condition, then
$$\hat V_{BRR} = \sum_{h} W_h^2 (y_{h1} - y_{h2})^2 / 4,$$
which is equal to the variance estimator of a stratified sampling estimator with $n_h = 2$.
Balanced repeated replication: Example (BRR for H = 3)

$M_G$: Hadamard matrix of order $G$:
- a $G \times G$ matrix of $\pm 1$ with $M_G' M_G = G I_G$.
- If $M_G$ satisfies $M_G' M_G = G I_G$, then
$$M_{2G} = \begin{pmatrix} M & M \\ M & -M \end{pmatrix}.$$
- For $G = 4$,
$$M_4 = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & -1 & 1 & -1 \\ 1 & -1 & -1 & 1 \\ 1 & 1 & -1 & -1 \end{pmatrix}.$$
- The columns of $M_G$ are mutually orthogonal, so they satisfy the balanced condition.
Balanced repeated replication: Example (BRR for H = 3, Cont'd)

$G = 4$ BRR replicates can be constructed as follows:
$$\hat\theta^{(1)} = W_1 y_{11} + W_2 y_{21} + W_3 y_{31}$$
$$\hat\theta^{(2)} = W_1 y_{12} + W_2 y_{21} + W_3 y_{32}$$
$$\hat\theta^{(3)} = W_1 y_{12} + W_2 y_{22} + W_3 y_{31}$$
$$\hat\theta^{(4)} = W_1 y_{11} + W_2 y_{22} + W_3 y_{32}$$
One can check that
$$\frac{1}{G} \sum_{v=1}^{G} (\hat\theta^{(v)} - \hat\theta)^2 = \sum_{h} W_h^2 (y_{h1} - y_{h2})^2 / 4.$$
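The BRR identity can be verified numerically; a sketch for $H = 3$, $G = 4$, with sign patterns taken from $M_4$ after dropping the all-ones column (stratum weights and values hypothetical):

```python
def brr_variance(W, y1, y2, signs):
    """BRR variance: V = (1/G) * sum_v (theta^(v) - theta)^2, where replicate v
    picks y_h1 when signs[v][h] == +1 and y_h2 otherwise (n_h = 2 design)."""
    H = len(W)
    theta = sum(W[h] * (y1[h] + y2[h]) / 2 for h in range(H))
    v = 0.0
    for row in signs:
        rep = sum(W[h] * (y1[h] if row[h] == 1 else y2[h]) for h in range(H))
        v += (rep - theta) ** 2
    return v / len(signs)

# Rows of M4 with the all-ones column removed: a balanced set of half-samples.
signs = [( 1,  1,  1),
         (-1,  1, -1),
         (-1, -1,  1),
         ( 1, -1, -1)]
W = [0.5, 0.3, 0.2]
y1 = [4.0, 2.0, 6.0]
y2 = [6.0, 5.0, 2.0]
target = sum(W[h] ** 2 * (y1[h] - y2[h]) ** 2 / 4 for h in range(3))
print(brr_variance(W, y1, y2, signs), target)  # equal: 0.6125 0.6125
```

With only $G = 4$ replicates instead of all $2^3 = 8$ half-samples, the balanced set reproduces the stratified variance estimator exactly, which is the whole point of the Hadamard construction.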