1
A Dual-Frame Design for an RDD Survey That Screens for a Rare Population
K.P. Srinath, Abt Associates Inc.
Michael P. Battaglia, Abt Associates Inc.
Meena Khare, NCHS, CDC
2
Approach and Application
Dual-Frame Approach for a Survey That Screens for a Rare Population
Potential Application to the National Immunization Survey (NIS)
3
RDD Surveys
Many large surveys use list-assisted random-digit-dialing (RDD) samples.
Some surveys focus on a specific target population.
Example: the NIS targets children between 19 and 35 months of age.
4
Screening Households for Eligibility
• Large pool of interviewers
•Considerable time and effort
•Substantial cost
•Adds to nonresponse because of screening attempts
•Nonresponse correlated with eligibility criteria
5
Dual-Frame Approach
Frame A:
RDD frame containing entire population of telephone numbers
Frame A offers complete coverage of the target population.
Frame B:
List frame: Telephone numbers of households in targeted population.
Frame B is a subset of Frame A and is incomplete.
We select simple random samples from each frame.
6
Notation
$M_a$: Number of telephone numbers in Frame A only
$M_b$: Number of telephone numbers in Frame B
$M_{ab}$: Number of telephone numbers that belong to both frames; $M_{ab} = M_b$
$M_a$ and $M_b$ are known.
7
Notation, cont’d.
Let $M = M_a + M_b$ be the total number of telephone numbers in Frame A.
Let $N$ (unknown) be the number of eligible households in Frame A.
$e = N/M$ is the eligibility rate in Frame A.
$e_b = N_b/M_b$ is the eligibility rate in Frame B, where $N_b$ is the number of eligible households in Frame B.
8
Notation, cont’d
$e_b < 1.0$ because of errors in the list frame, but $e_b > e$.
Let $\alpha = M_b/M$ be the proportion of telephone numbers in Frame B out of the total telephone numbers in Frame A.
$$e = (1-\alpha)\,e_a + \alpha\,e_b$$
where $e_a = N_a/M_a$ is the eligibility rate for the part of Frame A that is not in Frame B.
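The relation $e = (1-\alpha)e_a + \alpha e_b$ can be inverted to recover the eligibility rate outside the list frame from quantities that are easier to observe. The sketch below is illustrative only: the function name is hypothetical, and the inputs are the stratum 1 figures reported later in the deck, converted from percentages to proportions.

```python
def eligibility_outside_list(e: float, e_b: float, alpha: float) -> float:
    """Solve e = (1 - alpha) * e_a + alpha * e_b for e_a."""
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    return (e - alpha * e_b) / (1.0 - alpha)


# Rates are proportions, not percentages (assumed inputs for illustration).
e_a = eligibility_outside_list(e=0.0128, e_b=0.238, alpha=0.00162)
print(f"e_a = {e_a:.4f}")
```

Because $\alpha$ is small, $e_a$ is only slightly below the overall rate $e$.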
9
Notation, cont’d.
$m$: Number of telephone numbers sampled from Frame A
$m_b$: Number of telephone numbers sampled from Frame B
Total sample size: $m_0 = m + m_b$.
10
Notation, cont'd.
Match the $m$ telephone numbers selected from Frame A with the $M_b$ numbers in Frame B. Let the number that match be $m_{ab}$.
As a result of matching, the sample from Frame A is split into two parts. Let $m_a = m - m_{ab}$.
We have 3 samples of telephone numbers. Total sample is $m_0 = m_a + m_{ab} + m_b$.
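The matching step can be sketched in a few lines. This is a toy illustration, not the survey's production procedure: the frames, the telephone-number format, the sample sizes and the function name are all made up for the example.

```python
import random


def split_rdd_sample(rdd_sample, list_frame):
    """Split the Frame A sample into numbers not on the list and numbers on it."""
    list_numbers = set(list_frame)
    unmatched = [t for t in rdd_sample if t not in list_numbers]
    matched = [t for t in rdd_sample if t in list_numbers]
    return unmatched, matched


rng = random.Random(0)
frame_a = [f"555-{i:04d}" for i in range(10_000)]  # toy RDD frame (Frame A)
frame_b = rng.sample(frame_a, 200)                 # toy list frame, a subset of A

sample_a = rng.sample(frame_a, 1_000)              # m numbers from Frame A
sample_b = rng.sample(frame_b, 50)                 # m_b numbers from Frame B

unmatched, matched = split_rdd_sample(sample_a, frame_b)
m_a, m_ab, m_b = len(unmatched), len(matched), len(sample_b)
m_0 = m_a + m_ab + m_b
print(m_a, m_ab, m_b, m_0)
```

Using a set for the list frame keeps each lookup constant-time, which matters when the RDD sample has thousands of numbers.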
11
Notation,cont’d.
Let $n_a$ denote the number of eligible households resulting from calling the $m_a$ telephone numbers.
Let $n_{ab}$ and $n_b$ be the number of eligible households out of the $m_{ab}$ and the $m_b$ telephone numbers, respectively.
We have 3 samples of eligible households. Total is $n_0 = n_a + n_{ab} + n_b$.
12
Estimator
We are interested in estimating a population ratio $R$. For example, in the NIS, we want to estimate the proportion of children who are up-to-date with respect to a specific vaccine. We first estimate the population totals. Let $\hat{Y}_a$ be the estimated population total of $Y$ for the population in Frame A and not in Frame B.
13
Estimator, cont’d.
Since we have simple random samples,
$$\hat{Y}_a = \frac{M}{m}\,y_a$$
where $y_a$ is the sample total of the number of persons having a characteristic of interest, based on the $n_a$ persons. Similarly, $\hat{Y}_{ab}$ and $\hat{Y}_b$ are the estimated totals for the population in Frame B.
14
Estimator, cont’d.
$\hat{N}_a$, $\hat{N}_{ab}$ and $\hat{N}_b$ are the estimated sizes of the eligible population in the two frames.
The estimator of the population ratio $R$ is of the form proposed by Hartley (1962) and is
$$\hat{R} = \frac{\hat{Y}_a + p\,\hat{Y}_{ab} + (1-p)\,\hat{Y}_b}{\hat{N}_a + p\,\hat{N}_{ab} + (1-p)\,\hat{N}_b}.$$
$p$ is the weighting factor used to combine the two estimates from Frame B.
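A minimal sketch of this ratio estimator follows. It assumes simple expansion weights, $M/m$ for both parts of the Frame A sample and $M_b/m_b$ for the Frame B sample, and it assumes the eligible-population sizes are estimated the same way as the totals. The function name and the numbers in the usage lines are hypothetical.

```python
def hartley_ratio(y_a, y_ab, y_b, n_a, n_ab, n_b, M, m, M_b, m_b, p):
    """Dual-frame ratio estimate with weighting factor p on the overlap."""
    w_a = M / m      # expansion weight, Frame A sample (assumed form)
    w_b = M_b / m_b  # expansion weight, Frame B sample (assumed form)
    numerator = w_a * y_a + p * w_a * y_ab + (1 - p) * w_b * y_b
    denominator = w_a * n_a + p * w_a * n_ab + (1 - p) * w_b * n_b
    return numerator / denominator


# Made-up counts in which 80% of eligible persons in every sample
# have the characteristic, so the estimate should also be 0.8.
r_hat = hartley_ratio(
    y_a=40, y_ab=8, y_b=16,
    n_a=50, n_ab=10, n_b=20,
    M=600_000, m=6_000, M_b=1_000, m_b=100,
    p=0.25,
)
print(round(r_hat, 3))
```

Setting `p=1` ignores the Frame B sample entirely, and `p=0` uses only the Frame B sample for the overlap domain.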
15
Variance
We want the variance of $\hat{R}$ under the assumption that $n = n_a + n_{ab}$ and $n_b$ are fixed.
$$V(\hat{R}) = \frac{R(1-R)}{N^2}\left[\frac{M^2}{m}\left\{(1-\alpha)\,e_a + p^2\,\alpha\,e_b\right\} + (1-p)^2\,\frac{M_b^2}{m_b}\,e_b\right]$$
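As a rough check on the variance expression, the sketch below codes it as $V(\hat{R}) = \frac{R(1-R)}{N^2}\left[\frac{M^2}{m}\{(1-\alpha)e_a + p^2\alpha e_b\} + (1-p)^2\frac{M_b^2}{m_b}e_b\right]$ and confirms that with $p = 1$ it collapses to the familiar single-frame value $R(1-R)/(m\,e)$. The function name and all input values are assumptions chosen for illustration.

```python
def var_ratio(R, N, M, M_b, m, m_b, e_a, e_b, p):
    """Approximate variance of the dual-frame ratio estimator."""
    alpha = M_b / M
    term_a = (M**2 / m) * ((1 - alpha) * e_a + p**2 * alpha * e_b)
    term_b = (1 - p) ** 2 * (M_b**2 / m_b) * e_b
    return R * (1 - R) / N**2 * (term_a + term_b)


# Illustrative inputs only.
M, M_b = 608_400, 986
e_a, e_b = 0.0124, 0.238
alpha = M_b / M
e = (1 - alpha) * e_a + alpha * e_b  # overall eligibility rate
N = M * e                            # eligible households in Frame A
R, m, m_b = 0.8, 6_720, 168

v_single = var_ratio(R, N, M, M_b, m, m_b, e_a, e_b, p=1.0)
print(v_single, R * (1 - R) / (m * e))
```

With $p = 1$ the Frame B sample gets zero weight, so the result depends only on the expected number of eligible households $m\,e$ found through the RDD sample.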
16
Minimizing Variance
We can determine values of $p$, $m$ and $m_b$ such that, for a given screening sample size $m_0$, the variance of $\hat{R}$ is a minimum. The screening sample size should be such that we get the expected number of completes $n_0$ in the survey.
17
Optimum Values
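Values of $p$, $m$ and $m_b$ that minimize $V(\hat{R})$ are
$$p_{opt} = \sqrt{\frac{e_a}{e_b}},$$
$$m = \frac{m_0\,\sqrt{e_a}}{\alpha\,(\sqrt{e_b} - \sqrt{e_a}) + \sqrt{e_a}},$$
$$m_b = \frac{m_0\,\alpha\,(\sqrt{e_b} - \sqrt{e_a})}{\alpha\,(\sqrt{e_b} - \sqrt{e_a}) + \sqrt{e_a}}.$$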
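The optimum values can be computed directly, taking them as $p_{opt} = \sqrt{e_a/e_b}$, with $m$ proportional to $\sqrt{e_a}$ and $m_b$ proportional to $\alpha(\sqrt{e_b} - \sqrt{e_a})$. In this sketch the function name and the screening sample size of 10,000 are hypothetical; the rates are illustrative proportions.

```python
import math


def optimum_design(m_0, e_a, e_b, alpha):
    """Return (p, m, m_b) for a fixed total screening sample size m_0."""
    root_a, root_b = math.sqrt(e_a), math.sqrt(e_b)
    list_share = alpha * (root_b - root_a)  # relative weight of Frame B
    denom = list_share + root_a
    p = root_a / root_b
    m = m_0 * root_a / denom
    m_b = m_0 * list_share / denom
    return p, m, m_b


p, m, m_b = optimum_design(m_0=10_000, e_a=0.0124346, e_b=0.238, alpha=0.00162)
print(f"p = {p:.3f}, m = {m:.0f}, m_b = {m_b:.0f}")
```

The results are not rounded inside the function, so `m + m_b` equals `m_0` up to floating-point error; round to whole telephone numbers as a final step.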
18
Total Screening Sample Size
We want $e\,m + e_b\,m_b = n_0$, where $n_0$ is the total number of eligible households in the sample. Therefore,
$$m_0 = n_0\,\frac{\alpha\,(\sqrt{e_b} - \sqrt{e_a}) + \sqrt{e_a}}{e_b\,\alpha\,(\sqrt{e_b} - \sqrt{e_a}) + e\,\sqrt{e_a}}.$$
For a given $n_0$, find $m_0$. Allocate $m_0$ to the two frames using the allocation formulas. Use the value of $p$ determined earlier for minimum variance. The loss in efficiency due to unequal weights is minimized.
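The sketch below works through this step under the optimum allocation: it computes $m_0$ for a target $n_0$, splits it between the frames, and checks that the expected number of eligible households, $e\,m + e_b\,m_b$, comes back to $n_0$. The function name, the target of 100 eligible households and the rates are assumptions for illustration.

```python
import math


def screening_sample_size(n_0, e, e_a, e_b, alpha):
    """Total screening sample m_0 expected to yield n_0 eligible households."""
    root_a, root_b = math.sqrt(e_a), math.sqrt(e_b)
    list_share = alpha * (root_b - root_a)
    return n_0 * (list_share + root_a) / (e_b * list_share + e * root_a)


e_a, e_b, alpha = 0.0124346, 0.238, 0.00162
e = (1 - alpha) * e_a + alpha * e_b

m_0 = screening_sample_size(n_0=100, e=e, e_a=e_a, e_b=e_b, alpha=alpha)

# Split m_0 using the optimum allocation, then check the expected yield.
root_a, root_b = math.sqrt(e_a), math.sqrt(e_b)
denom = alpha * (root_b - root_a) + root_a
m = m_0 * root_a / denom
m_b = m_0 * alpha * (root_b - root_a) / denom
expected_eligible = e * m + e_b * m_b
print(round(m_0), round(expected_eligible, 6))
```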
19
Allocation for a given “p”
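$$m = m_0\,\frac{\sqrt{(1-\alpha)\,e_a + p^2\,\alpha\,e_b}}{\sqrt{(1-\alpha)\,e_a + p^2\,\alpha\,e_b} + (1-p)\,\alpha\,\sqrt{e_b}}$$
$$m_b = m_0\,\frac{(1-p)\,\alpha\,\sqrt{e_b}}{\sqrt{(1-\alpha)\,e_a + p^2\,\alpha\,e_b} + (1-p)\,\alpha\,\sqrt{e_b}}$$
$$m_0 = n_0\,\frac{(1-p)\,\alpha\,\sqrt{e_b} + \sqrt{(1-\alpha)\,e_a + p^2\,\alpha\,e_b}}{e_b\,(1-p)\,\alpha\,\sqrt{e_b} + e\,\sqrt{(1-\alpha)\,e_a + p^2\,\alpha\,e_b}}$$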
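When $p$ is fixed in advance rather than set to its optimum, the allocation follows the same pattern with $\sqrt{(1-\alpha)e_a + p^2\alpha e_b}$ as the Frame A term and $(1-p)\alpha\sqrt{e_b}$ as the Frame B term. The following sketch, with a hypothetical function name and illustrative inputs, returns $m_0$, $m$ and $m_b$ for a target $n_0$.

```python
import math


def allocate_for_p(n_0, p, e, e_a, e_b, alpha):
    """Return (m_0, m, m_b) for a chosen weighting factor p."""
    s_a = math.sqrt((1 - alpha) * e_a + p**2 * alpha * e_b)  # Frame A term
    s_b = (1 - p) * alpha * math.sqrt(e_b)                   # Frame B term
    m_0 = n_0 * (s_a + s_b) / (e * s_a + e_b * s_b)
    m = m_0 * s_a / (s_a + s_b)
    m_b = m_0 * s_b / (s_a + s_b)
    return m_0, m, m_b


e_a, e_b, alpha = 0.0124346, 0.238, 0.00162
e = (1 - alpha) * e_a + alpha * e_b

m_0, m, m_b = allocate_for_p(n_0=100, p=0.5, e=e, e_a=e_a, e_b=e_b, alpha=alpha)
print(round(m_0), round(m), round(m_b))
```

At $p = \sqrt{e_a/e_b}$ the Frame A term reduces to $\sqrt{e_a}$, so this allocation coincides with the optimum one.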
20
National Immunization Survey
•Large ongoing RDD survey, conducted quarterly by the CDC.
•Measures vaccination coverage rates among children aged 19-35 months.
•78 geographic strata
•Quarterly list-assisted RDD samples
Only 3.4% of the households contain age-eligible children.
21
Dual Frame
•RDD frame offers coverage of entire population of residential telephone numbers except those in “0” banks.
•Commercial lists vary in quality of information and coverage of eligible population.
•Experian New Babies List is available for all 50 states and D.C.
22
Dual-Frame Experiment
•Identified 5 urban strata for a test, conducted in Q4/2003.
•Selected random sample of telephone numbers from the New Babies List.
•Selected the RDD sample.
•Matched the RDD sample with the Experian New Babies List sampling frame.
23
Results from Q4/2003
Stratum   RDD numbers (M)   Experian numbers (M_b)   alpha (%)
1                 608,400                      986       0.162
2               1,604,500                    3,875       0.241
3                 699,300                    2,356       0.337
4                 937,000                    4,479       0.478
5                 457,200                    1,739       0.380
24
Observed Eligibility Rates
Stratum   Overall e (%)   Experian e_b (%)   p = sqrt(e_a/e_b)
1                  1.28               23.8               0.229
2                  1.17               21.6               0.228
3                  1.50               20.6               0.264
4                  1.73               28.7               0.236
5                  1.13               28.8               0.189
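The p column can be reproduced from the reported rates. The sketch below assumes the overall and Experian eligibility rates are percentages, takes alpha (%) from the Q4/2003 results slide, backs out $e_a$ from $e = (1-\alpha)e_a + \alpha e_b$, and then evaluates $\sqrt{e_a/e_b}$. The function name is hypothetical.

```python
import math

# (stratum, e in %, e_b in %, alpha in %, reported p)
rows = [
    (1, 1.28, 23.8, 0.162, 0.229),
    (2, 1.17, 21.6, 0.241, 0.228),
    (3, 1.50, 20.6, 0.337, 0.264),
    (4, 1.73, 28.7, 0.478, 0.236),
    (5, 1.13, 28.8, 0.380, 0.189),
]


def weighting_factor(e_pct, e_b_pct, alpha_pct):
    """p = sqrt(e_a / e_b), with e_a backed out of the overall rate."""
    e, e_b, alpha = e_pct / 100, e_b_pct / 100, alpha_pct / 100
    e_a = (e - alpha * e_b) / (1 - alpha)
    return math.sqrt(e_a / e_b)


computed = [round(weighting_factor(e, e_b, a), 3) for _, e, e_b, a, _ in rows]
reported = [p for *_, p in rows]
print(computed)
print(reported)
```

The computed values agree with the reported column to three decimal places in all five strata.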
25
Screening Sample Size
Stratum   RDD sample   Dual frame       m     m_b   Diff.
1              9,844        6,888   6,720     168   2,956
2              9,402        6,411   6,411     162   2,991
3              7,200        4,790   4,600     190   2,600
4              5,029        3,835   3,758      77   1,194
5              9,292        6,317   6,195     122   2,975
26
Number of Completes
Stratum   RDD completes   Experian completes   Total
1                    86                   40     126
2                    75                   35     110
3                    69                   39     108
4                    65                   22      87
5                    70                   35     105
27
Conclusions
•Dual-frame designs can offer savings during data collection.
•Experian New Babies List performed reasonably well.
•Need estimates of sample design parameters to plan allocation of the sample to the two frames.
•Need to balance savings in screening cost against a possible increase in variance.
28
Abt Associates Inc.