
Post on 26-Mar-2015


1

A Dual-Frame Design for an RDD Survey That Screens for a Rare Population

K.P. Srinath, Abt Associates Inc.

Michael P. Battaglia, Abt Associates Inc.

Meena Khare, NCHS, CDC

2

Approach and Application

Dual-Frame Approach for a Survey That Screens for a Rare Population

Potential Application to the National Immunization Survey (NIS)

3

RDD Surveys

Many large surveys use list-assisted random-digit-dialing (RDD) samples.

Some surveys focus on a specific target population.

Example: the NIS targets children between 19 and 35 months of age.

4

Screening Households for Eligibility

•Large pool of interviewers

•Considerable time and effort

•Substantial cost

•Adds to nonresponse because of screening attempts

•Nonresponse correlated with eligibility criteria

5

Dual-Frame Approach

Frame A:

RDD frame containing entire population of telephone numbers

Frame A offers complete coverage of the target population.

Frame B:

List frame: Telephone numbers of households in targeted population.

Frame B is a subset of Frame A and is incomplete.

We select simple random samples from each frame.

6

Notation

$M_a$: Number of telephone numbers in Frame A only

$M_b$: Number of telephone numbers in Frame B

$M_{ab}$: Number of telephone numbers that belong to both frames. Because Frame B is a subset of Frame A, $M_{ab} = M_b$.

$M_a$ and $M_b$ are known.

7

Notation, cont’d.

Let $M = M_a + M_b$ be the total number of telephone numbers in Frame A. Let $N$ (unknown) be the number of eligible households in Frame A.

$e = N / M$ is the eligibility rate in Frame A.

$e_b = N_b / M_b$ is the eligibility rate in Frame B, where $N_b$ is the number of eligible households in Frame B.

8

Notation, cont’d

$e_b < 1.0$ because of errors in the list frame, but $e_b > e$.

Let $\alpha = M_b / M$ be the proportion of telephone numbers in Frame B out of the total telephone numbers in Frame A.

$$e = (1 - \alpha)\, e_a + \alpha\, e_b$$

where $e_a = N_a / M_a$ is the eligibility rate for the part of Frame A that is not in Frame B.
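The identity can be inverted to recover $e_a$ when only $e$, $e_b$ and $\alpha$ are available. The following is a minimal Python sketch, assuming the stratum 1 figures reported in the Q4/2003 tables (1.28%, 23.8% and 0.162%); the function name is illustrative and does not come from the slides.

```python
def eligibility_outside_list(e: float, e_b: float, alpha: float) -> float:
    """Solve e = (1 - alpha) * e_a + alpha * e_b for e_a."""
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must be in [0, 1)")
    return (e - alpha * e_b) / (1.0 - alpha)


# Stratum 1 inputs, converted from percentages to proportions.
e, e_b, alpha = 0.0128, 0.238, 0.00162
e_a = eligibility_outside_list(e, e_b, alpha)
print(f"e_a = {e_a:.5f}")
```

Because $\alpha$ is tiny, $e_a$ comes out only slightly below the overall rate $e$.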

9

Notation, cont’d.

$m$: Number of telephone numbers sampled from Frame A.

$m_b$: Number of telephone numbers sampled from Frame B.

Total sample size: $m_0 = m + m_b$.

10

Notation, cont'd.

Match the $m$ telephone numbers selected from Frame A with the $M_b$ numbers in Frame B. Let the number that match be $m_{ab}$.

As a result of matching, the sample from Frame A is split into two parts. Let $m_a = m - m_{ab}$.

We have 3 samples of telephone numbers. The total sample is $m_0 = m_a + m_{ab} + m_b$.
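The matching step can be sketched in Python with a set lookup. This is only an illustration: the telephone numbers, frame sizes and the helper name `split_rdd_sample` are hypothetical and are not taken from the NIS.

```python
import random


def split_rdd_sample(rdd_sample, list_frame):
    """Split an RDD sample into numbers absent from the list (a) and on the list (ab)."""
    list_numbers = set(list_frame)
    sample_ab = [t for t in rdd_sample if t in list_numbers]
    sample_a = [t for t in rdd_sample if t not in list_numbers]
    return sample_a, sample_ab


rng = random.Random(2003)  # fixed seed so the sketch is repeatable
frame_a = [f"555-{i:04d}" for i in range(10_000)]  # hypothetical RDD frame
frame_b = rng.sample(frame_a, 200)                 # hypothetical list frame, a subset of A

rdd_sample = rng.sample(frame_a, 1_000)  # m numbers drawn from Frame A
list_sample = rng.sample(frame_b, 50)    # m_b numbers drawn from Frame B

sample_a, sample_ab = split_rdd_sample(rdd_sample, frame_b)
m, m_a, m_ab, m_b = len(rdd_sample), len(sample_a), len(sample_ab), len(list_sample)
m_0 = m_a + m_ab + m_b
print(m_a, m_ab, m_b, m_0)
```

The three lists correspond to the three samples of telephone numbers, and their sizes add up to $m_0 = m + m_b$.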

11

Notation,cont’d.

Let $n_a$ denote the number of eligible households resulting from calling the $m_a$ telephone numbers.

Let $n_{ab}$ and $n_b$ be the number of eligible households out of the $m_{ab}$ and the $m_b$ telephone numbers, respectively. We have 3 samples of eligible households. The total is $n_0 = n_a + n_{ab} + n_b$.

12

Estimator

We are interested in estimating a population ratio $R$. For example, in the NIS, we want to estimate the proportion of children who are up-to-date with respect to a specific vaccine. We first estimate the population totals. Let $\hat{Y}_a$ be the estimated population total of $Y$ for the population in Frame A and not in Frame B.

13

Estimator, cont’d.

Since we have simple random samples,

$$\hat{Y}_a = \frac{M}{m}\, y_a$$

where $y_a$ is the sample total, that is, the number of persons having a characteristic of interest among the $n_a$ persons. Similarly, $\hat{Y}_{ab}$ and $\hat{Y}_b$ are the estimated totals for the population in Frame B.

14

Estimator, cont’d.

$\hat{N}_a$, $\hat{N}_{ab}$ and $\hat{N}_b$ are the estimated sizes of the eligible population in the two frames.

The estimator of the population ratio $R$ is of the form proposed by Hartley (1962) and is

$$\hat{R} = \frac{\hat{Y}_a + p\,\hat{Y}_{ab} + (1 - p)\,\hat{Y}_b}{\hat{N}_a + p\,\hat{N}_{ab} + (1 - p)\,\hat{N}_b}.$$

$p$ is the weighting factor used to combine the two estimates from Frame B.
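A short Python sketch of the estimator follows. It assumes, by analogy with $\hat{Y}_a = (M/m)\,y_a$, that the `ab` sample is expanded by $M/m$ and the Frame B sample by $M_b/m_b$, and that the $\hat{N}$ terms are built the same way from $n_a$, $n_{ab}$ and $n_b$; the slides only say "similarly", so treat these weights as an assumption. All counts are made up.

```python
def dual_frame_ratio(y_a, y_ab, y_b, n_a, n_ab, n_b, M, m, M_b, m_b, p):
    """Hartley-type ratio estimate combining the a, ab and b samples."""
    w_rdd = M / m       # assumed expansion weight for numbers sampled from Frame A
    w_list = M_b / m_b  # assumed expansion weight for numbers sampled from Frame B
    y_hat = w_rdd * y_a + p * w_rdd * y_ab + (1 - p) * w_list * y_b
    n_hat = w_rdd * n_a + p * w_rdd * n_ab + (1 - p) * w_list * n_b
    return y_hat / n_hat


# Hypothetical counts, for illustration only.
counts = dict(y_a=60, y_ab=3, y_b=30, n_a=80, n_ab=4, n_b=40,
              M=600_000, m=6_000, M_b=1_000, m_b=160)
r_hat = dual_frame_ratio(**counts, p=0.25)
print(round(r_hat, 4))
```

In this toy input every sample has the same proportion (0.75), so the estimate equals 0.75 whatever value of `p` is used; `p` only matters when the samples disagree.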

15

Variance

We want the variance of $\hat{R}$ under the assumption that $n_a$, $n_{ab}$ and $n_b$ are fixed.

$$V(\hat{R}) = R(1 - R)\,\frac{1}{N^2}\left[\frac{M^2}{m}\left\{(1 - \alpha)\,e_a + p^2\,\alpha\,e_b\right\} + \frac{M_b^2}{m_b}\,(1 - p)^2\,e_b\right]$$
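The variance expression is easy to evaluate numerically. The sketch below is a direct transcription into Python with hypothetical inputs. As a sanity check, setting $p = 1$ gives the Frame B sample zero weight, and with $N = eM$ the expression collapses to $R(1-R)/(e\,m)$, the usual variance of a proportion based on $e\,m$ completes.

```python
def variance_r_hat(R, N, M, M_b, m, m_b, e_a, e_b, p):
    """Approximate variance of the ratio estimate for weighting factor p."""
    alpha = M_b / M
    rdd_term = (M ** 2 / m) * ((1 - alpha) * e_a + p ** 2 * alpha * e_b)
    list_term = (M_b ** 2 / m_b) * (1 - p) ** 2 * e_b
    return R * (1 - R) / N ** 2 * (rdd_term + list_term)


# Hypothetical design values, for illustration only.
R, M, M_b, m, m_b = 0.8, 600_000, 1_000, 6_000, 160
e_a, e_b = 0.012, 0.24
alpha = M_b / M
e = (1 - alpha) * e_a + alpha * e_b  # overall eligibility rate
N = e * M                            # eligible households in Frame A

v_rdd_only = variance_r_hat(R, N, M, M_b, m, m_b, e_a, e_b, p=1.0)
v_mixed = variance_r_hat(R, N, M, M_b, m, m_b, e_a, e_b, p=(e_a / e_b) ** 0.5)
print(v_rdd_only, v_mixed)
```

With these made-up inputs, using the Frame B sample (`v_mixed`) gives a smaller variance than ignoring it (`v_rdd_only`).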

16

Minimizing Variance

We can determine values of $p$, $m$ and $m_b$ such that, for a given screening sample size $m_0$, the variance of $\hat{R}$ is a minimum. The screening sample size should be such that we get the expected number of completes $n_0$ in the survey.

17

Optimum Values

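The values of $p$, $m$ and $m_b$ that minimize $V(\hat{R})$ are

$$p_{opt} = \sqrt{\frac{e_a}{e_b}},$$

$$m = \frac{m_0\,\sqrt{e_a}}{\alpha\left(\sqrt{e_b} - \sqrt{e_a}\right) + \sqrt{e_a}},$$

$$m_b = \frac{m_0\,\alpha\left(\sqrt{e_b} - \sqrt{e_a}\right)}{\alpha\left(\sqrt{e_b} - \sqrt{e_a}\right) + \sqrt{e_a}}.$$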
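One way to gain confidence in these closed forms is to compare them with a brute-force search. The Python sketch below uses hypothetical design parameters and works with the variance up to the constant $R(1-R)\,M^2/N^2$, which does not affect the location of the minimum. It checks that no point on a coarse grid of $p$ values and allocations beats the closed-form solution.

```python
import math


def relative_variance(p, m, m_b, alpha, e_a, e_b):
    """Variance of the ratio estimate up to the constant R(1-R) M^2 / N^2."""
    rdd_term = ((1 - alpha) * e_a + p ** 2 * alpha * e_b) / m
    list_term = alpha ** 2 * (1 - p) ** 2 * e_b / m_b
    return rdd_term + list_term


def optimum_design(m_0, alpha, e_a, e_b):
    """Closed-form p, m and m_b for a total screening sample m_0."""
    root_a, root_b = math.sqrt(e_a), math.sqrt(e_b)
    denom = alpha * (root_b - root_a) + root_a
    return root_a / root_b, m_0 * root_a / denom, m_0 * alpha * (root_b - root_a) / denom


# Hypothetical design parameters, for illustration only.
m_0, alpha, e_a, e_b = 7_000, 0.004, 0.012, 0.24
p_opt, m, m_b = optimum_design(m_0, alpha, e_a, e_b)
v_opt = relative_variance(p_opt, m, m_b, alpha, e_a, e_b)

# Coarse grid over p and over the share of m_0 given to Frame B.
grid_best = min(
    relative_variance(i / 200, m_0 * (1 - s / 1000), m_0 * s / 1000, alpha, e_a, e_b)
    for i in range(201)
    for s in range(1, 1000)
)
print(p_opt, m, m_b, v_opt, grid_best)
```

At $p_{opt}$ the term $(1-\alpha)\,e_a + p^2\,\alpha\,e_b$ reduces to $e_a$, which is why only $\sqrt{e_a}$ and $\sqrt{e_b}$ appear in the allocation.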

18

Total Screening Sample Size

We want $e\,m + e_b\,m_b = n_0$, where $n_0$ is the total number of eligible households in the sample. Therefore,

$$m_0 = n_0\,\frac{\alpha\left(\sqrt{e_b} - \sqrt{e_a}\right) + \sqrt{e_a}}{e_b\,\alpha\left(\sqrt{e_b} - \sqrt{e_a}\right) + e\,\sqrt{e_a}}.$$

For a given $n_0$, find $m_0$. Allocate $m_0$ to the two frames using the allocation formulas. Use the value of $p$ determined earlier for minimum variance. The loss in efficiency due to unequal weights is minimized.
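The planning steps just described can be sketched in Python: derive $e_a$, compute $m_0$ for a target $n_0$, then split $m_0$ between the frames. The target of 100 completes and the rates are hypothetical, and the function name is illustrative.

```python
import math


def screening_sample_size(n_0, e, e_b, alpha):
    """Total screening sample m_0, and its split, for n_0 expected completes."""
    e_a = (e - alpha * e_b) / (1 - alpha)  # from e = (1 - alpha) e_a + alpha e_b
    root_a, root_b = math.sqrt(e_a), math.sqrt(e_b)
    list_part = alpha * (root_b - root_a)
    m_0 = n_0 * (list_part + root_a) / (e_b * list_part + e * root_a)
    m = m_0 * root_a / (list_part + root_a)      # numbers to draw from Frame A
    m_b = m_0 * list_part / (list_part + root_a)  # numbers to draw from Frame B
    return m_0, m, m_b


# Hypothetical target and rates, for illustration only.
n_0, e, e_b, alpha = 100, 0.013, 0.24, 0.004
m_0, m, m_b = screening_sample_size(n_0, e, e_b, alpha)
rdd_only = n_0 / e  # screening sample needed if only the RDD frame were used
print(round(m_0), round(m), round(m_b), round(rdd_only))
```

Comparing `m_0` with `rdd_only` shows the screening effort saved by drawing part of the sample from the list frame.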

19

Allocation for a given “p”

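$$m = m_0\,\frac{\sqrt{(1 - \alpha)\,e_a + p^2\,\alpha\,e_b}}{\sqrt{(1 - \alpha)\,e_a + p^2\,\alpha\,e_b} + \alpha\,(1 - p)\,\sqrt{e_b}}$$

$$m_b = m_0\,\frac{\alpha\,(1 - p)\,\sqrt{e_b}}{\sqrt{(1 - \alpha)\,e_a + p^2\,\alpha\,e_b} + \alpha\,(1 - p)\,\sqrt{e_b}}$$

$$m_0 = n_0\,\frac{\alpha\,(1 - p)\,\sqrt{e_b} + \sqrt{(1 - \alpha)\,e_a + p^2\,\alpha\,e_b}}{e_b\,\alpha\,(1 - p)\,\sqrt{e_b} + e\,\sqrt{(1 - \alpha)\,e_a + p^2\,\alpha\,e_b}}$$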
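These formulas cover any fixed $p$, not only the optimum. A minimal Python sketch with hypothetical inputs is given below; it also confirms numerically that plugging in $p = \sqrt{e_a/e_b}$ reproduces the optimum allocation, since the square-root term then reduces to $\sqrt{e_a}$.

```python
import math


def allocate_for_p(m_0, p, alpha, e_a, e_b):
    """Split m_0 between Frame A (m) and Frame B (m_b) for a fixed weighting factor p."""
    rdd_part = math.sqrt((1 - alpha) * e_a + p ** 2 * alpha * e_b)
    list_part = alpha * (1 - p) * math.sqrt(e_b)
    return m_0 * rdd_part / (rdd_part + list_part), m_0 * list_part / (rdd_part + list_part)


def screening_size_for_p(n_0, p, alpha, e_a, e_b):
    """Screening sample m_0 that yields n_0 expected completes for a fixed p."""
    e = (1 - alpha) * e_a + alpha * e_b
    rdd_part = math.sqrt((1 - alpha) * e_a + p ** 2 * alpha * e_b)
    list_part = alpha * (1 - p) * math.sqrt(e_b)
    return n_0 * (rdd_part + list_part) / (e * rdd_part + e_b * list_part)


# Hypothetical inputs, for illustration only.
n_0, alpha, e_a, e_b = 100, 0.004, 0.012, 0.24
e = (1 - alpha) * e_a + alpha * e_b

m_0_half = screening_size_for_p(n_0, 0.5, alpha, e_a, e_b)
m_half, m_b_half = allocate_for_p(m_0_half, 0.5, alpha, e_a, e_b)

p_opt = math.sqrt(e_a / e_b)
m_at_opt, m_b_at_opt = allocate_for_p(7_000, p_opt, alpha, e_a, e_b)
print(round(m_0_half), round(m_half), round(m_b_half))
```

A smaller $p$ puts more weight on the Frame B sample, so the allocation shifts more of $m_0$ to Frame B.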

20

National Immunization Survey

•Large ongoing RDD survey, conducted quarterly by the CDC.

•Measures vaccination coverage rates among children aged 19-35 months.

•78 geographic strata

•Quarterly list-assisted RDD samples

Only 3.4% of the households contain age-eligible children.

21

Dual Frame

•RDD frame offers coverage of entire population of residential telephone numbers except those in “0” banks.

•Commercial lists vary in quality of information and coverage of eligible population.

•Experian New Babies List is available for all 50 states and D.C.

22

Dual-Frame Experiment

•Identified 5 urban strata for a test, conducted in Q4/2003.

•Selected random sample of telephone numbers from the New Babies List.

•Selected the RDD sample.

•Matched the RDD sample with the Experian New Babies List sampling frame.

23

Results from Q4/2003

| Stratum | RDD numbers (M) | Experian numbers (Mb) | Alpha (%) |
|---|---|---|---|
| 1 | 608,400 | 986 | 0.162 |
| 2 | 1,604,500 | 3,875 | 0.241 |
| 3 | 699,300 | 2,356 | 0.337 |
| 4 | 937,000 | 4,479 | 0.478 |
| 5 | 457,200 | 1,739 | 0.380 |
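The Alpha column follows from the definition $\alpha = M_b / M$. A quick Python check using the table values (variable names are illustrative) suggests the reported percentages agree with the ratio to within about 0.001 percentage points:

```python
# (stratum, RDD numbers M, Experian numbers M_b, reported alpha in %)
rows = [
    (1, 608_400, 986, 0.162),
    (2, 1_604_500, 3_875, 0.241),
    (3, 699_300, 2_356, 0.337),
    (4, 937_000, 4_479, 0.478),
    (5, 457_200, 1_739, 0.380),
]


def alpha_percent(M, M_b):
    """Share of Frame A telephone numbers that also appear in Frame B, in percent."""
    return 100.0 * M_b / M


for stratum, M, M_b, reported in rows:
    print(stratum, f"{alpha_percent(M, M_b):.4f}", reported)
```

In every stratum the list frame covers well under 1% of the RDD frame.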

24

Observed Eligibility Rates

| Stratum | Overall e (%) | Experian eb (%) | p = sqrt(ea/eb) |
|---|---|---|---|
| 1 | 1.28 | 23.8 | 0.229 |
| 2 | 1.17 | 21.6 | 0.228 |
| 3 | 1.50 | 20.6 | 0.264 |
| 4 | 1.73 | 28.7 | 0.236 |
| 5 | 1.13 | 28.8 | 0.189 |
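The $p$ column can be reproduced from $e$, $e_b$ and the Alpha values of the Q4/2003 results by first solving $e = (1-\alpha)\,e_a + \alpha\,e_b$ for $e_a$. The Python sketch below does this for the five strata; the function name is illustrative.

```python
import math

# (stratum, overall e in %, Experian e_b in %, alpha in %, reported p)
rows = [
    (1, 1.28, 23.8, 0.162, 0.229),
    (2, 1.17, 21.6, 0.241, 0.228),
    (3, 1.50, 20.6, 0.337, 0.264),
    (4, 1.73, 28.7, 0.478, 0.236),
    (5, 1.13, 28.8, 0.380, 0.189),
]


def optimum_p(e_pct, e_b_pct, alpha_pct):
    """Weighting factor sqrt(e_a / e_b), with e_a derived from e, e_b and alpha."""
    e, e_b, alpha = e_pct / 100, e_b_pct / 100, alpha_pct / 100
    e_a = (e - alpha * e_b) / (1 - alpha)
    return math.sqrt(e_a / e_b)


for stratum, e_pct, e_b_pct, alpha_pct, reported in rows:
    print(stratum, f"{optimum_p(e_pct, e_b_pct, alpha_pct):.4f}", reported)
```

The computed values agree with the table to three decimal places.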

25

Screening Sample Size

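| Stratum | RDD sample | Dual frame | m | mb | Diff. |
|---|---|---|---|---|---|
| 1 | 9,844 | 6,888 | 6,720 | 168 | 2,956 |
| 2 | 9,402 | 6,411 | 6,411 | 162 | 2,991 |
| 3 | 7,200 | 4,790 | 4,600 | 190 | 2,600 |
| 4 | 5,029 | 3,835 | 3,758 | 77 | 1,194 |
| 5 | 9,292 | 6,317 | 6,195 | 122 | 2,975 |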

26

Number of Completes

| Stratum | RDD completes | Experian completes | Total |
|---|---|---|---|
| 1 | 86 | 40 | 126 |
| 2 | 75 | 35 | 110 |
| 3 | 69 | 39 | 108 |
| 4 | 65 | 22 | 87 |
| 5 | 70 | 35 | 105 |
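The reported completes appear consistent with the relation $e\,m + e_b\,m_b = n_0$. The Python check below takes $m$ and $m_b$ from the screening sample size table and the rates from the observed eligibility table; whether the slides derived the counts this way is an assumption, and the function name is illustrative.

```python
# (stratum, m, m_b, e in %, e_b in %, reported RDD completes, reported Experian completes)
rows = [
    (1, 6_720, 168, 1.28, 23.8, 86, 40),
    (2, 6_411, 162, 1.17, 21.6, 75, 35),
    (3, 4_600, 190, 1.50, 20.6, 69, 39),
    (4, 3_758, 77, 1.73, 28.7, 65, 22),
    (5, 6_195, 122, 1.13, 28.8, 70, 35),
]


def expected_completes(m, m_b, e_pct, e_b_pct):
    """Expected eligible households from the Frame A and Frame B samples."""
    return m * e_pct / 100, m_b * e_b_pct / 100


for stratum, m, m_b, e_pct, e_b_pct, rdd_rep, list_rep in rows:
    rdd_exp, list_exp = expected_completes(m, m_b, e_pct, e_b_pct)
    print(stratum, round(rdd_exp), rdd_rep, round(list_exp), list_rep)
```

After rounding, the expected counts match the reported RDD and Experian completes in all five strata.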

27

Conclusions

•Dual-frame designs can offer savings during data collection.

•Experian New Babies List performed reasonably well.

•Need estimates of sample design parameters to plan allocation of the sample to the two frames.

•Need to balance savings in screening cost against a possible increase in variance.

28

Abt Associates Inc.