STAT 4060 Design and Analysis of Surveys

xlpeng, 2015-7-3 (www.uic.edu.hk/~xlpeng, 23/4/19)

Assessment: Exam 60%, Mid Test 20%, Mini Project 10%, Continuous assessment 10%


What we have learned:

1. Simple random sampling, confidence interval and choice of sample size.

2. Ratio and regression estimators, systematic sampling.

3. Stratified random sampling, allocation of stratum weights.

4. Cluster sampling.


Population Parameter


Sample Statistics


Simple random sampling

We shall consider the use of simple random samples for estimating three population characteristics:

the population mean, denoted $\bar{Y}$, $\bar{Y} = \frac{1}{N}\sum_{j=1}^{N} Y_j$;

the population total, denoted $Y_T$, $Y_T = \sum_{j=1}^{N} Y_j$;

and the proportion P.

We shall discuss how the estimators behave in terms of their sampling distributions. The variance is often a crucial measure.



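Proof of (1.9)

Here $y_1, \ldots, y_n$ is a simple random sample drawn without replacement, $f = n/N$ and $S^2 = \frac{1}{N-1}\sum_{j=1}^{N}(Y_j - \bar{Y})^2$, so that $\mathrm{Var}(y_i) = \frac{N-1}{N}S^2$ and $\mathrm{cov}(y_i, y_j) = -S^2/N$ for $i \ne j$.

\[
\begin{aligned}
\mathrm{Var}(\bar{y}) &= E(\bar{y}^2) - [E(\bar{y})]^2 = E\Big[\Big(\sum_{i=1}^{n} y_i / n\Big)^2\Big] - \bar{Y}^2 \\
&= \frac{1}{n^2} E\Big(\sum_{i} y_i^2 + \sum_{i \ne j} y_i y_j\Big) - \bar{Y}^2 \\
&= \frac{1}{n^2}\{\, n E(y_i^2) + n(n-1) E(y_i y_j) - n^2 \bar{Y}^2 \,\} \\
&= \frac{1}{n^2}\{\, n(\mathrm{Var}(y_i) + \bar{Y}^2) + n(n-1)(\mathrm{cov}(y_i, y_j) + \bar{Y}^2) - n^2 \bar{Y}^2 \,\} \\
&= \frac{1}{n^2}(\, n \mathrm{Var}(y_i) + n(n-1)\, \mathrm{cov}(y_i, y_j) \,) \\
&= \frac{1}{n}\,\frac{N-1}{N} S^2 - \frac{n-1}{n}\,\frac{S^2}{N} \\
&= \frac{N-n}{Nn} S^2 = (1-f)\frac{S^2}{n}.
\end{aligned}
\]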


Confidence interval for the population mean
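The interval itself did not survive on this slide. As a minimal sketch, assuming the usual large-sample normal approximation $\bar{y} \pm z_{\alpha/2}\sqrt{(1-f)s^2/n}$ built on (1.9), the calculation could look as follows. The function name `srs_mean_ci` and the data are hypothetical, and with a sample as small as the one shown the normal approximation is only rough.

```python
import math
from statistics import NormalDist, mean, variance


def srs_mean_ci(sample, N, level=0.95):
    """Approximate confidence interval for the population mean.

    sample: values y_1..y_n from a simple random sample without replacement
    N:      population size (assumed known)
    """
    n = len(sample)
    f = n / N                            # sampling fraction
    ybar = mean(sample)                  # sample mean
    s2 = variance(sample)                # sample variance, divisor n - 1
    se = math.sqrt((1 - f) * s2 / n)     # estimated standard error, from (1.9)
    z = NormalDist().inv_cdf(0.5 + level / 2)   # two-sided normal quantile
    return ybar, se, (ybar - z * se, ybar + z * se)


# Hypothetical data: n = 8 observations from a population of N = 200.
ybar, se, (lo, hi) = srs_mean_ci([12, 15, 11, 14, 13, 16, 12, 15], N=200)
print(f"mean={ybar:.3f}  se={se:.3f}  CI=({lo:.3f}, {hi:.3f})")
```

The factor `(1 - f)` is the finite population correction; dropping it gives the familiar infinite-population interval, which is slightly wider.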







Ratio Estimation and Regression Estimation (Chapter 4, Textbook, Barnett, V., 1991)

2.1 Estimation of a population ratio: The ratio estimator

In some situations it is useful to estimate a (positive) ratio of two population characteristics: the totals, or means, of two (positive) variables X and Y.

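Two obvious estimators of $R = Y_T / X_T = \bar{Y} / \bar{X}$ are:

The sample average of ratios

$\bar{r} = \frac{1}{n}\sum_{i=1}^{n} (y_i / x_i) = \frac{1}{n}\sum_{i=1}^{n} r_i$,

unbiased for estimating the population mean

$\bar{R} = \frac{1}{N}\sum_{j=1}^{N} R_j = \frac{1}{N}\sum_{j=1}^{N} (Y_j / X_j)$,

but biased for estimating R.

The ratio of the sample averages

$r = \bar{y} / \bar{x} = y_T / x_T$

is widely used.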

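The bias in estimating R by r

The bias in estimating R by r is the expectation of the following difference:

\[
\begin{aligned}
r - R = \frac{\bar{y} - R\bar{x}}{\bar{x}} &= \frac{\bar{y} - R\bar{x}}{\bar{X}}\left(1 + \frac{\bar{x} - \bar{X}}{\bar{X}}\right)^{-1} \qquad (2.3) \\
&= \frac{\bar{y} - R\bar{x}}{\bar{X}}\left[1 - \frac{\bar{x} - \bar{X}}{\bar{X}} + \left(\frac{\bar{x} - \bar{X}}{\bar{X}}\right)^{2} - \cdots\right].
\end{aligned}
\]

Keeping the first two terms of the expansion,

\[
E(r - R) \approx E\left(\frac{\bar{y} - R\bar{x}}{\bar{X}}\right) - \frac{E[(\bar{y} - R\bar{x})(\bar{x} - \bar{X})]}{\bar{X}^{2}}.
\]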

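Discussion about the bias

\[
\mathrm{Var}(r) \approx \frac{1-f}{n\bar{X}^{2}}\,\frac{\sum_{j=1}^{N}(Y_j - RX_j)^{2}}{N-1} = \frac{1-f}{n\bar{X}^{2}}\,(S_Y^{2} - 2RS_{YX} + R^{2}S_X^{2}) \qquad (2.5)
\]

where $Z_j = Y_j - RX_j = (Y_j - \bar{Y}) - (RX_j - R\bar{X})$.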

2.2 Ratio estimation of a population mean or total


$\bar{y}_R = r\bar{X} = (\bar{X}/\bar{x})\,\bar{y}$

$y_{TR} = rX_T = N(\bar{X}/\bar{x})\,\bar{y} = N\bar{y}_R$

Variance of ratio estimator
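Since $\bar{y}_R = r\bar{X}$ with $\bar{X}$ known, its approximate variance is $\bar{X}^2$ times that of r in (2.5). The sketch below computes r, $\bar{y}_R$ and sample-based variance estimates, replacing the population sum of $(Y_j - RX_j)^2/(N-1)$ by its sample analogue. The function name `ratio_estimate` and the data are hypothetical, and the formulas are the large-sample approximations, not exact results.

```python
from statistics import mean


def ratio_estimate(x, y, X_bar, N):
    """Ratio estimate of R and of the population mean, with variance estimates.

    x, y:   paired sample values (simple random sample of size n)
    X_bar:  known population mean of X
    N:      population size
    """
    n = len(x)
    f = n / N
    x_bar, y_bar = mean(x), mean(y)
    r = y_bar / x_bar                        # ratio of the sample averages
    # Sample analogue of sum (Y_j - R X_j)^2 / (N - 1)
    s2_r = sum((yi - r * xi) ** 2 for xi, yi in zip(x, y)) / (n - 1)
    var_r = (1 - f) / (n * X_bar ** 2) * s2_r    # estimate of (2.5)
    y_R = r * X_bar                          # ratio estimator of the mean
    var_y_R = X_bar ** 2 * var_r             # equals (1 - f) * s2_r / n
    return r, var_r, y_R, var_y_R


# Hypothetical data: n = 4 pairs from a population of N = 40 with X_bar = 10.
r, var_r, y_R, var_y_R = ratio_estimate(
    x=[10, 12, 8, 11], y=[21, 25, 15, 23], X_bar=10.0, N=40
)
print(f"r={r:.4f}  var(r)={var_r:.6f}  y_R={y_R:.3f}  var(y_R)={var_y_R:.4f}")
```

When y is exactly proportional to x, every residual $y_i - rx_i$ is zero and the estimated variance vanishes, which is the situation in which the ratio estimator works best.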




The estimate of the ratio R of the present weight to prestudy weight for the herd is:

Solution:

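\[
\mathrm{Var}(r) = \frac{1-f}{n\bar{X}^{2}}\,S_r^{2} = \frac{1}{880^{2}}\left(1 - \frac{12}{500}\right)\frac{8{,}848.646}{12} = 0.000929
\]

\[
se(r) = \sqrt{0.000929} = 0.030485
\]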
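The arithmetic in this solution can be checked in a few lines. This is only a sketch: it assumes that 880 is $\bar{X}$, that n = 12 and N = 500, and that 8,848.646 is $S_r^2$, which is how the figures appear to be used in the solution.

```python
import math

n, N = 12, 500        # sample size and herd size, as read from the solution
X_bar = 880           # assumed to be the mean prestudy weight
s2_r = 8848.646       # assumed to be S_r^2

f = n / N                                     # sampling fraction
var_r = (1 - f) / (n * X_bar ** 2) * s2_r     # estimated Var(r)
se_r = math.sqrt(var_r)                       # standard error of r

print(round(var_r, 6), round(se_r, 6))
```

Rounded to six decimal places, the two values agree with the 0.000929 and 0.030485 quoted in the solution.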


This examines when the variance of (2.10) could be less or greater than that of (1.9)


2.3 Regression estimation

Condition (2.15.1) demands that X and Y be linearly related. If the linear relationship does not pass through the origin, however, this suggests considering an alternative estimator, known as the regression estimator.




An ideal (perfect) linear relationship is

$Y_j = \bar{Y} + b(X_j - \bar{X})$   (2.16)

A practicable simple linear regression model is

$Y_j = \bar{Y} + b(X_j - \bar{X}) + E_j$   (2.17)



Consider the average (mean) of either (2.16) or (2.17),

$\bar{y}_L = \bar{y} + b(\bar{X} - \bar{x})$   (2.19)



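\[
\begin{aligned}
\mathrm{Var}(\bar{y}_L) &= E[(\bar{y}_L - \bar{Y})^{2}] = E\{[(\bar{y} - \bar{Y}) - b(\bar{x} - \bar{X})]^{2}\} \\
&= \frac{1-f}{n}\,(S_Y^{2} - 2bS_{YX} + b^{2}S_X^{2}) \qquad (2.20)
\end{aligned}
\]

For the best choice of $b$ this becomes $\frac{1-f}{n}S_Y^{2}(1 - \rho_{YX}^{2})$, to be compared with $\frac{1-f}{n}S_Y^{2} = \mathrm{Var}(\bar{y})$.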



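From (2.20),

\[
\min_b \{\mathrm{Var}(\bar{y}_L)\} = \min_b \frac{1-f}{n}\,(S_Y^{2} - 2bS_{YX} + b^{2}S_X^{2}) = \frac{1-f}{n}\,S_Y^{2}(1 - \rho_{YX}^{2})
\]

The minimum is obtained with $b = b_{\min} = S_{YX}/S_X^{2} = \rho_{YX}S_Y/S_X$.

Thus the most efficient regression estimator of $\bar{Y}$ is

$\bar{y}_L = \bar{y} + \rho_{YX}(S_Y/S_X)(\bar{X} - \bar{x})$   (2.22)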



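The optimal value of b in (2.22) suggests the obvious estimate:

\[
\hat{b} = \hat{b}_{\min} = \frac{s_{yx}}{s_x^{2}} = \frac{\sum_{i=1}^{n}(y_i - \bar{y})(x_i - \bar{x})}{\sum_{i=1}^{n}(x_i - \bar{x})^{2}} \qquad (2.24)
\]

$\bar{y}_L = \bar{y} + \hat{b}(\bar{X} - \bar{x})$   (2.25)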

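which enjoys the following asymptotic properties:

$E(\bar{y}_L) = \bar{Y} + O(n^{-1})$

Asymptotic properties:

\[
\mathrm{Var}(\bar{y}_L) = \frac{1-f}{n}\,(S_Y^{2} - S_{YX}^{2}/S_X^{2}) + O(n^{-3/2}) = \frac{1-f}{n}\,S_Y^{2}(1 - \rho_{YX}^{2}) + O(n^{-3/2}) \qquad (2.26)
\]

\[
V(\bar{y}_L) = \frac{1-f}{n}\,(s_y^{2} - \hat{b}\,s_{yx}) \qquad (2.27)
\]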

2.4 Comparison of ratio and regression estimators



To leading order,

\[
\mathrm{Var}(\bar{y}_R) - \mathrm{Var}(\bar{y}_L) = \frac{1-f}{n}\,(R^{2}S_X^{2} - 2R\rho_{YX}S_XS_Y + \rho_{YX}^{2}S_Y^{2}) = \frac{1-f}{n}\,(RS_X - \rho_{YX}S_Y)^{2} \ge 0
\]
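The comparison can be tried numerically. The sketch below computes the regression estimator with the estimated slope (2.24), its variance estimate (2.27), and the ratio estimator with the sample analogue of its approximate variance. The function name `regression_estimate` and the data are hypothetical, and all variances are the large-sample approximations from these notes.

```python
from statistics import mean


def regression_estimate(x, y, X_bar, N):
    """Regression and ratio estimates of the population mean, with variances."""
    n = len(x)
    f = n / N
    x_bar, y_bar = mean(x), mean(y)
    s_yx = sum((yi - y_bar) * (xi - x_bar) for xi, yi in zip(x, y)) / (n - 1)
    s_x2 = sum((xi - x_bar) ** 2 for xi in x) / (n - 1)
    s_y2 = sum((yi - y_bar) ** 2 for yi in y) / (n - 1)

    b = s_yx / s_x2                            # estimated slope, (2.24)
    y_L = y_bar + b * (X_bar - x_bar)          # regression estimator, (2.25)
    v_L = (1 - f) / n * (s_y2 - b * s_yx)      # variance estimate, (2.27)

    r = y_bar / x_bar                          # ratio of sample averages
    y_R = r * X_bar                            # ratio estimator
    v_R = (1 - f) / n * (s_y2 - 2 * r * s_yx + r ** 2 * s_x2)
    return {"b": b, "y_L": y_L, "v_L": v_L, "y_R": y_R, "v_R": v_R}


# Hypothetical data: roughly linear, with an intercept well away from zero.
out = regression_estimate(
    x=[2, 4, 6, 8, 10], y=[13, 14, 17, 17, 21], X_bar=5.5, N=50
)
print({k: round(v, 4) for k, v in out.items()})
```

In this sample version `v_R - v_L` equals `(1 - f) / n * (r * s_x - s_yx / s_x) ** 2`, so the regression variance estimate never exceeds the ratio one; the two coincide only when the fitted line passes through the origin.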


Stratified Simple Random Sampling (Chapter 5, Textbook, Barnett, V., 1991)

Consider another sampling method:

Some Notations


To estimate the population mean of a finite population, we assume that the population is stratified, that is to say it has been divided into k non-overlapping groups, or strata, of sizes:

$N_1, N_2, \ldots, N_k$, with $\sum_{i=1}^{k} N_i = N$.

The stratum means and variances are denoted by

$\bar{Y}_i$ and $S_i^{2}$, $i = 1, \ldots, k$.


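Estimation of Population Characteristics in Stratified Populations

Estimating $\bar{Y}$

The stratified sample mean is defined as

$\bar{y}_{st} = \sum_{i=1}^{k} W_i \bar{y}_i$

Here we assume the weights $W_i = N_i/N$ are given (known).

The mean and variance of $\bar{y}_{st}$

Note that

$E(\bar{y}_{st}) = \sum_{i=1}^{k} W_i E(\bar{y}_i) = \sum_{i=1}^{k} W_i \bar{Y}_i = \bar{Y}$.

Since

$\mathrm{Var}(\bar{y}_i) = (1 - f_i)S_i^{2}/n_i$, with $f_i = n_i/N_i$,

and because it is assumed that sampling in different strata is independent, that is

$\mathrm{cov}(\bar{y}_i, \bar{y}_j) = 0$ for $i \ne j$, we have $\mathrm{Var}(\bar{y}_{st}) = \sum_{i=1}^{k} W_i^{2}(1 - f_i)S_i^{2}/n_i$.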


Simple random sampling

Stratified sampling with proportional allocation


(a) When stratum size is large enough:

$N_i / N$


(b) When stratum size is not large enough:

The stratified sample mean will be more efficient than the s.r. sample mean if and only if variation between the stratum means is sufficiently large compared with within-strata variation!
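A toy calculation can make this comparison concrete. The sketch below assumes a small, fully known population (all values hypothetical) and compares $\mathrm{Var}(\bar{y}) = (1-f)S^2/n$ for a simple random sample with $\mathrm{Var}(\bar{y}_{st}) = \frac{1-f}{n}\sum W_i S_i^2$ under proportional allocation $n_i = nN_i/N$, ignoring any rounding of the $n_i$. The function name `compare_srs_stratified` is illustrative only.

```python
from statistics import variance


def compare_srs_stratified(strata, n):
    """Exact variances of the s.r. sample mean and the stratified mean.

    strata: list of lists holding every population value in each stratum
    n:      total sample size (proportional allocation assumed)
    """
    N = sum(len(s) for s in strata)
    f = n / N
    population = [v for s in strata for v in s]
    S2 = variance(population)                       # divisor N - 1
    var_srs = (1 - f) * S2 / n
    # Proportional allocation: n_i = n * N_i / N, so every f_i equals f.
    var_prop = (1 - f) / n * sum(len(s) / N * variance(s) for s in strata)
    return var_srs, var_prop


# Stratum means far apart: stratification helps a great deal.
print(compare_srs_stratified([[1, 2, 3, 4], [11, 12, 13, 14], [21, 22, 23, 24]], n=6))

# Stratum means equal, small strata: stratification can even lose.
print(compare_srs_stratified([[1, 4], [2, 3]], n=2))
```

In the second call both strata have mean 2.5, so there is no between-strata variation to remove, and with strata this small the stratified variance comes out larger than the s.r. one, which is case (b).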

Optimum Choice of Sample Size


To achieve required precision of estimation

Some cost limitation

The simplest form assumes that there is some overhead cost, c0, of administering the survey, and that individual observations from the ith stratum each cost an amount ci. Thus the total cost is:

$C = c_0 + \sum_{i=1}^{k} c_i n_i$


I. Minimum variance for fixed cost (Cont.)



Then

II. Minimum cost for fixed variance


Consider choosing the $n_i$ to satisfy $\mathrm{Var}(\bar{y}_{st}) = V$ for the minimum possible total cost.


Given $w_i$, $n_i = n w_i$.


Comparison of proportional allocation and optimum allocation


Thus the extent of the potential gain from optimum (Neyman) allocation compared with proportional allocation depends on the variability of the stratum variances: the larger this is, the greater the relative advantage of optimum allocation.
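The allocation rules can be compared numerically. The sketch below assumes the standard result that, for a linear cost function, the optimum sampling weights $w_i = n_i/n$ are proportional to $W_iS_i/\sqrt{c_i}$ (Neyman allocation when all $c_i$ are equal), and it uses $\mathrm{Var}(\bar{y}_{st}) = \sum W_i^2S_i^2/n_i - \sum W_iS_i^2/N$. The stratum figures and function names are hypothetical, and the $n_i$ are not rounded to integers.

```python
import math


def allocation_weights(W, S, c=None):
    """Sampling weights w_i = n_i / n proportional to W_i * S_i / sqrt(c_i)."""
    if c is None:
        c = [1.0] * len(W)          # equal unit costs: Neyman allocation
    raw = [Wi * Si / math.sqrt(ci) for Wi, Si, ci in zip(W, S, c)]
    total = sum(raw)
    return [value / total for value in raw]


def var_stratified(W, S, w, n, N):
    """Var of the stratified mean when n_i = n * w_i (no rounding)."""
    first = sum(Wi ** 2 * Si ** 2 / (n * wi) for Wi, Si, wi in zip(W, S, w))
    second = sum(Wi * Si ** 2 for Wi, Si in zip(W, S)) / N
    return first - second


# Hypothetical strata: weights W_i and standard deviations S_i.
W = [0.5, 0.3, 0.2]
S = [2.0, 5.0, 10.0]
n, N = 100, 10_000

v_prop = var_stratified(W, S, W, n, N)                  # proportional: w_i = W_i
w_neyman = allocation_weights(W, S)
v_neyman = var_stratified(W, S, w_neyman, n, N)
print([round(w, 4) for w in w_neyman], round(v_prop, 5), round(v_neyman, 5))
```

Here the stratum standard deviations differ by a factor of five, so Neyman allocation shifts the sample toward the most variable stratum and lowers the variance; with equal $S_i$ the two allocations coincide.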


Cluster Sampling (Chapter 6, Textbook, Barnett, V., 1991)






Comparison of s.r. sampling with cluster sampling

Systematic Sampling


A systematic sample can be viewed as a cluster sample of size m = 1!

Systematic sample mean



Comparison of s.r. sampling with systematic sampling
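Because a 1-in-k systematic sample is one of only k possible samples (one cluster chosen out of k), its variance can be computed exactly for a small, fully listed population. The sketch below assumes N = nk and hypothetical population values; it compares that exact variance with $(1-f)S^2/n$ for a simple random sample of the same size. The function name `systematic_vs_srs` is illustrative only.

```python
from statistics import mean, variance


def systematic_vs_srs(population, k):
    """Exact Var of the systematic sample mean versus the s.r. sample mean.

    population: all N values in list order, with N divisible by k
    k:          sampling interval (number of possible systematic samples)
    """
    N = len(population)
    if N % k != 0:
        raise ValueError("this sketch assumes N = n * k")
    n = N // k
    Y_bar = mean(population)
    # The k possible systematic samples: start at r, then take every k-th unit.
    sample_means = [mean(population[r::k]) for r in range(k)]
    var_sy = sum((m - Y_bar) ** 2 for m in sample_means) / k
    var_srs = (1 - n / N) * variance(population) / n
    return var_sy, var_srs


# Linear trend in list order: systematic sampling does well.
print(systematic_vs_srs(list(range(1, 21)), k=4))

# Period equal to the sampling interval: systematic sampling does badly.
print(systematic_vs_srs([1, 2, 3, 4] * 5, k=4))
```

The two calls show that the outcome depends on how the list is ordered: a steady trend favors the systematic sample, while a periodicity matching the interval k makes each systematic sample internally homogeneous and inflates the variance.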


Two ways of estimating ---


Y
