

Sequential Analysis: Design Methods and Applications. Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/lsqa20

Two-Stage Estimation of Mean in a Negative Binomial Distribution with Applications to Mexican Bean Beetle Data. N. Mukhopadhyay (a) and B. M. de Silva (b). (a) Department of Statistics, University of Connecticut, Storrs, Connecticut, USA. (b) Department of Mathematics and Statistics, RMIT University, Melbourne, Australia.

Version of record first published: 15 Feb 2007.

To cite this article: N. Mukhopadhyay & B. M. de Silva (2005): Two-Stage Estimation of Mean in a Negative Binomial Distribution with Applications to Mexican Bean Beetle Data, Sequential Analysis: Design Methods and Applications, 24:1, 99-137.

To link to this article: http://dx.doi.org/10.1081/SQA-200046838



Sequential Analysis, 24: 99-137, 2005. Copyright © Taylor & Francis, Inc. ISSN: 0747-4946 print / 1532-4176 online. DOI: 10.1081/SQA-200046838

Two-Stage Estimation of Mean in a Negative Binomial Distribution with Applications to Mexican Bean Beetle Data

N. Mukhopadhyay, Department of Statistics, University of Connecticut, Storrs, Connecticut, USA

B. M. de Silva, Department of Mathematics and Statistics, RMIT University, Melbourne, Australia

Abstract: Working with insect counts, Anscombe (1949) emphasized negative binomial modeling by introducing a parameterization involving μ(>0) and κ(>0). The parameters (μ, κ) stood for average "infestation" and "clumping," respectively. Assuming that κ was known, Willson and Folks (1983) adopted purely sequential sampling to estimate μ, whereas Mukhopadhyay and Diaz (1985) developed a two-stage methodology because of its operational convenience. We first prove a new striking result (Theorem 2.2) that claims the asymptotic second-order efficiency property of the two-stage procedure.

In order to handle the case when κ is unknown, we develop a new approach (section 3) for evaluating estimators of μ. We control a new criterion, namely the integrated coefficient of variation (ICV), by averaging the CV with respect to a weight function for κ. A two-stage methodology is proposed, and both first- and second-order properties are highlighted (Theorems 3.1-3.3).

We summarize findings from extensive sets of simulations of the two-stage methodologies, both when κ is known and when it is unknown. When κ is unknown, the robustness of the proposed methodology with respect to choices of a weight function is critically examined. In the end, both methodologies are applied to four sets of Mexican bean beetle data with encouraging findings.

Keywords: Bounded CV; Bounded integrated CV; CV approach; Integrated CV approach; Known clumping parameter; Practical application; Second-order properties; Simulations; Unknown clumping parameter; Weight function.

Subject Classifications: 62L12; 62G35.

Received July 2003, Revised March and May 2004, Accepted June 2004. Recommended by Linda Young. Address correspondence to N. Mukhopadhyay, Department of Statistics, UBox 4120, University of Connecticut, Storrs, CT 06269-4120, USA; E-mail: [email protected]

1. INTRODUCTION

The practical uses of a negative binomial distribution in modeling data from biological and agricultural studies are too numerous to list them all. Early on, working with insect counts, Anscombe (1949) emphasized the role of negative binomial modeling. Bliss and Owen (1958) considered some related issues as well. Kuno (1969, 1972) developed sequential sampling procedures for estimating the mean of a population whose variance was a quadratic function of the population mean. Binns (1975), on the other hand, considered sequential confidence interval procedures for the mean of a negative binomial model. The works of Willson (1981) and Willson and Folks (1983), however, constitute comprehensive studies of modern sequential sampling approaches for estimating the mean of a negative binomial model. The purely sequential nature of sampling, advocated in Willson (1981) and Willson and Folks (1983), can sometimes be operationally unattractive, and hence Mukhopadhyay and Diaz (1985) developed a two-stage estimation methodology in the case of the point estimation problem.

References to a negative binomial model or sequential and multistage sampling in agricultural experiments, including the area of monitoring pests such as weeds and insects, are quite numerous. Other than the citations that have already been provided, we simply mention the following works in alphabetical order for brevity: Allen, Gonzalez, and Gokhale (1972), Anscombe (1950), Barrigossi (1997), Berti et al. (1992), Johnson et al. (1995), Marshall (1988), Mukhopadhyay (2002), Mulekar and Young (1991, 2004), Mulekar, Young, and Young (1993), Nyrop and Binns (1991), Onsager (1976), Plant and Wilson (1985), Sterling (1976), Sylvester and Cox (1961), Waters (1955), Wiles et al. (1992), Wilson (1982), Young (1994, 2004), and Zou (1998). The sequential and two-stage sampling results most relevant to the present investigation can be found in Willson (1981), Willson and Folks (1983), and Mukhopadhyay and Diaz (1985), where the clumping parameter in a negative binomial model is assumed known.


Let us suppose that we can observe a sequence of responses {X_1, X_2, ...} of independent random variables, each having the probability mass function

P_{μ,κ}(X = x) = \binom{κ + x - 1}{κ - 1} (μ/(μ + κ))^x (κ/(μ + κ))^κ,  x = 0, 1, 2, ....   (1.1)

The parameters μ, κ are both assumed finite and positive. Briefly, we write that X has the NB(μ, κ) distribution.

Here, X may, for example, stand for the count of insects on a sampling unit of plants, or it may be the count of some kind of weed on a sampling unit of an agricultural plot. In these examples, for the distribution (1.1), the parameter μ stands for the average number of insects or average number of weeds per sampling unit, whereas the parameter κ points toward the degree of clumping of infestation per sampling unit. A small (large) κ indicates heavy (light) clumping. The parameterization laid down in (1.1) was introduced by Anscombe (1949). For the distribution (1.1), it turns out that the mean and variance are given by

E_{μ,κ}(X) = μ  and  σ² = V_{μ,κ}(X) = μ + κ^{-1}μ².   (1.2)

Willson (1981) and Willson and Folks (1983) investigated purely sequential point and interval estimation of μ when κ was known. Mukhopadhyay and Diaz (1985) came up with a two-stage sampling strategy for the point estimation problem considered earlier by Willson (1981) and Willson and Folks (1983), because a two-stage sampling design is operationally simpler than full-blown purely sequential sampling. Mukhopadhyay and Diaz (1985) continued to assume that κ was known. With regard to the loss function, the main theme in these papers revolved around controlling the coefficient of variation (CV) associated with the mean estimator.

In section 2 of this paper, we first discuss (Theorem 2.2) a very striking property of the two-stage sampling strategy of Mukhopadhyay and Diaz (1985), in which we claim that for large sample sizes the difference between the average sample size and the optimal fixed sample size remains bounded! This shows that the two-stage procedure of Mukhopadhyay and Diaz (1985) is asymptotically second-order efficient in the sense of Ghosh and Mukhopadhyay (1981).

Then, we develop a suitable two-stage sampling strategy in section 3 for the point estimation of μ assuming that κ is unknown. One may be tempted to plug in an appropriate estimate of κ in the two-stage procedure of Mukhopadhyay and Diaz (1985) and hope to proceed accordingly.


But that approach does not work well, because estimation of κ when μ is also unknown runs into difficulties. The method of maximum likelihood or the method of moments estimation of κ comes to mind, and yet either method fails to estimate κ well in a large segment of the parameter space! Willson (1981) investigated and reported some of the ramifications when κ had to be estimated.

When κ is unknown, we take a completely different approach with the realization that a practitioner may be able to combine the associated uncertainty in the form of a suitable weight function g(k), where k(>0) is a typical value of κ. Then, while estimating the mean μ, instead of trying to control the coefficient of variation, we make it our goal to control a new criterion, namely the integrated coefficient of variation (ICV). Here, the averaging of the CV is carried out with the help of the weight function g(·). It is interesting to note that this procedure enjoys a broad sense of robustness with respect to possible choices of g(·).

With the help of computer simulations, we evaluate the methodologies both when κ is known and when it is unknown. The role of the weight function g(·) is critically examined. We suggest guidelines for a practitioner to choose an appropriate weight function g(·) that is to be used in the methodology.

We have also applied the original two-stage methodology (assuming that κ is known) and the new two-stage methodology (assuming that κ is unknown) to the Mexican bean beetle datasets. Dr. Jose Barrigossi gathered these massive datasets when he was a Ph.D. student under Dr. Leon G. Higley in the Department of Entomology, University of Nebraska-Lincoln. Dr. Barrigossi and Dr. Higley, the sources for these datasets, and Professor Linda Young kindly made these datasets available to us. Some prior information about the parameters came along these lines: the parameter κ for these data is anticipated to be quite small, near 0.3 or 0.4, but it could be smaller. Incidentally, we have applied the original two-stage methodology to these data assuming first that κ is known to be 0.3 or 0.4. Then, we have proceeded to apply the new two-stage methodology with the mean ICV (MICV) approach assuming that κ is unknown. In this latter situation, the weight function g(·) is chosen so that the weighted average of κ is close to 0.3 or 0.4, but κ is also "allowed" to be smaller. The sense of uncertainty about κ is thus built into the weight function g(·).

2. KNOWN CLUMPING PARAMETER: THE CV APPROACH

In this section, we assume that the mean μ(>0) is unknown but the clumping parameter κ(>0) is known.


Having recorded n observations X_1, ..., X_n, along the lines of Willson (1981), we suppose that the loss incurred in estimating μ by the sample mean X̄_n (= n^{-1} ∑_{i=1}^{n} X_i) is given by

L_n = μ^{-2}(X̄_n - μ)².   (2.1)

The risk function associated with (2.1) is then given by E(L_n) = σ²(nμ²)^{-1}. Observe that this risk function may be interpreted as the square of the CV, and hence we refer to the associated methodology as the CV approach.

A very reasonable goal of an experimenter is to make E(L_n) "small." One may attempt to achieve this goal by first fixing a small preassigned number c(>0) and then designing a sampling strategy that would enable one to claim E(L_n) ≤ c². Now, the fixed sample size required to achieve the goal E(L_n) ≤ c² turns out to be the smallest integer

n ≥ σ²(cμ)^{-2} = c^{-2}(μ^{-1} + κ^{-1}) = n*, say.   (2.2)

Here, n* is referred to as the "optimal" fixed sample size, but its magnitude remains unknown! Hence, aiming for the implementation of the optimal fixed-sample-size design to collect data is definitely out of the question. At this point, Willson (1981) and Willson and Folks (1983) developed an appropriate purely sequential estimation strategy. But purely sequential estimation strategies may be time-consuming, costly, and operationally cumbersome in some situations. The two-stage estimation strategy of Mukhopadhyay and Diaz (1985) is operationally more convenient. For a general overview of the area of sequential estimation, one may refer to Ghosh, Mukhopadhyay, and Sen (1997), a comprehensive resource. First we summarize the two-stage sampling design (section 2.1) and then establish some of its associated second-order properties (section 2.2).

2.1. Sampling Design of Mukhopadhyay and Diaz

Recall the expression of n* from (2.2) and observe that n* > (κc²)^{-1}, whereas this lower bound is a known entity. Let us define the pilot sample size

m ≡ m(c) = ⟨(κc²)^{-1}⟩ + 1,   (2.3)

where ⟨u⟩ denotes the largest integer smaller than u. Let X_1, ..., X_m denote the pilot observations. Next, we choose and fix a number γ(>0) and define

N ≡ N(c) = ⟨{(X̄_m + m^{-γ})^{-1} + κ^{-1}} c^{-2}⟩ + 1.   (2.4)


It should be clear that N is an estimator of n* based on the pilot observations. The role of the term m^{-γ} is to make sure that the pilot estimator X̄_m + m^{-γ} for μ remains positive with probability one.

Now, having determined N, we sample the difference (N - m) in the second stage, thereby obtaining additional observations X_{m+1}, ..., X_N. Then, based on the totality of all gathered observations X_1, ..., X_m, X_{m+1}, ..., X_N, Mukhopadhyay and Diaz (1985) proposed to estimate μ by the associated sample mean, X̄_N.
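To make the mechanics of (2.3)-(2.4) concrete, here is a minimal simulation sketch of the two-stage rule. It is ours, not from the paper: it assumes NumPy, maps NB(μ, κ) of (1.1) onto NumPy's negative_binomial(n, p) with n = κ and p = κ/(μ + κ), approximates ⟨u⟩ by the floor function, and the names draw_nb and two_stage_cv are ours.

    import numpy as np

    def draw_nb(rng, mu, kappa, size):
        # NB(mu, kappa) of (1.1): mean mu, clumping kappa.
        return rng.negative_binomial(kappa, kappa / (mu + kappa), size)

    def two_stage_cv(rng, mu, kappa, c, gamma=2.0):
        # Pilot size (2.3): m = <(kappa c^2)^{-1}> + 1.
        m = int(np.floor(1.0 / (kappa * c**2))) + 1
        pilot = draw_nb(rng, mu, kappa, m)
        xbar_m = pilot.mean()
        # Final size (2.4): N = <{(xbar_m + m^{-gamma})^{-1} + kappa^{-1}} c^{-2}> + 1.
        N = int(np.floor(((xbar_m + m ** (-gamma)) ** -1 + 1.0 / kappa) / c**2)) + 1
        second = draw_nb(rng, mu, kappa, N - m)           # second-stage observations
        xbar_N = np.concatenate([pilot, second]).mean()   # final estimator of mu
        return N, xbar_N

    rng = np.random.default_rng(1)
    print(two_stage_cv(rng, mu=3.0, kappa=3.0, c=0.1))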

Mukhopadhyay and Diaz (1985) proved the following first-order results, among others, for the two-stage estimation methodology (2.3)-(2.4): with any fixed γ(>0), as c → 0, one has

(i) N/n* → 1 w.p. 1;  (ii) E(N/n*) → 1;  (iii) E(L_N)/c² → 1.   (2.5)

These exact same properties were also proved by Willson (1981) and Willson and Folks (1983) for their purely sequential estimation strategy.

2.2. Second-Order Properties

There is some chance that the two-stage procedure (2.3)-(2.4) may not proceed beyond the pilot stage, but Theorem 2.1 clearly shows that the probability of that happening is very small indeed, especially when c is small. In what follows, we also considerably strengthen the result given in part (ii) of (2.5) by deriving second-order bounds (Theorem 2.2) for E(N) - n* for small c. Theorem 2.3 states some of the other specific characteristics that are important in their own right.

Theorem 2.1. For the two-stage procedure (2.3)-(2.4), with γ > 0, we have as c → 0:

P_μ(N = m) = O(m^{-p}),

where p is any arbitrary positive number.

Theorem 2.2. For the two-stage procedure (2.3)-(2.4), with γ > 1, we have as c → 0:

η + o(1) ≤ E(N) - n* ≤ η + 1 + o(1),

where η = κσ²μ^{-3}.

Theorem 2.3. For the two-stage procedure (2.3)-(2.4), we have:

(i) n*^{-1/2}(N - n*) →_L N(0, ρ) as c → 0, if γ > 1/2;


(ii) n*^{-1}(N - n*)² is uniformly integrable for 0 < c < c_0 with sufficiently small c_0, if γ > 1;

(iii) V(N) = ρn* + o(n*), if γ > 1.

The proofs are deferred to the appendix. It is easy to see that part (ii) in (2.5) follows from Theorem 2.2, that is, the conclusion in Theorem 2.2 is indeed stronger if we choose γ > 1. Part (ii) in (2.5), however, holds for all γ > 0.

2.3. Simulation Results

We generated simulated data from negative binomial distributions with a wide range of values of μ and κ. Having fixed a set of values of c, μ, κ, and γ, we determined m and N (denoted by n) independently 15,000 (= r, say) times and obtained n_min = min_{1≤i≤r} n_i, n_max = max_{1≤i≤r} n_i, and n̄ = r^{-1} ∑_{i=1}^{r} n_i, the minimum, maximum, and average estimated sample sizes, respectively. For every fixed pair of values of κ and c, however, it became clear to us that the performances of our simulated experiments generally showed unmistakably similar features whether we had fixed μ = 3, μ < 3, or μ > 3. Hence, we have taken the liberty of summarizing our findings only from those simulations that were run with fixed μ = 3 and known κ, which is specified within Figures 1-4 that follow. Plots of n_min, n_max, and n̄ have been provided in Figure 1.
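A compact sketch of this replication experiment (ours; it restates the stopping rule of (2.3)-(2.4) so the block is self-contained, and r is reduced in the example only to keep the run short):

    import numpy as np

    def replicate_two_stage(mu, kappa, c, gamma, r=15_000, seed=0):
        rng = np.random.default_rng(seed)
        p = kappa / (mu + kappa)                      # NumPy's NB(n, p) with n = kappa has mean mu
        m = int(np.floor(1.0 / (kappa * c**2))) + 1   # pilot size (2.3)
        n_values = np.empty(r, dtype=int)
        for i in range(r):
            xbar_m = rng.negative_binomial(kappa, p, m).mean()
            n_values[i] = int(np.floor(((xbar_m + m ** (-gamma)) ** -1 + 1.0 / kappa) / c**2)) + 1  # (2.4)
        return n_values.min(), n_values.max(), n_values.mean()

    print(replicate_two_stage(mu=3.0, kappa=3.0, c=0.1, gamma=2.0, r=2_000))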

Figures 1(a)-1(d) illustrate our results obtained in the case of small sample sizes (n* ≤ 100), low-moderate sample sizes (100 < n* ≤ 200), high-moderate sample sizes (1000 < n* ≤ 2000), and large sample sizes (n* > 10000), respectively. These plots show the variation of n_min, n_max, and n̄ with different choices of γ ∈ (1, 3). These plots show clearly that the choice of γ used in (2.4) did not appreciably influence N, the estimator of the sample size n*. Hence, in what follows, we fix only one value for γ, namely γ = 2.

We do know that (N - m)/m would converge to κ/μ in probability as c → 0. Figure 2 plots (n̄ - m)/m, (n_min - m)/m, and (n_max - m)/m for different values of c. We have also drawn a horizontal line that corresponds to the constant value κ/μ. For example, in Figure 2(a), we have μ = κ = 3, and hence a horizontal line is drawn at 1 to highlight the limiting value. Clearly, the values of (n̄ - m)/m always stayed very close to the limiting value κ/μ whatever the choice of c within the range under consideration. Also, the values of (n_min - m)/m and (n_max - m)/m respectively increase and decrease to the same limiting value κ/μ as c → 0.

In Figure 3, we see some "empirical justification" of the second-order result stated as Theorem 2.2. We should expect the values of n̄ - n* to lie between η and η + 1, especially for "large" values of n*.


Figure 1. Maximum, minimum, and average values of N versus γ (+ = n_max, × = n_min, · = n̄, and horizontal line at n*).

As we did in the case of Figure 1, we had again chosen c and κ so that we could illustrate the behavior of n̄ - n* in the case of small, moderate, and large values of n*.
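For orientation, an illustrative calculation of ours (these particular numbers are not quoted in the paper): if, say, μ = 3 and κ = 3, then σ² = μ + κ^{-1}μ² = 6, so that

η = κσ²μ^{-3} = 3 · 6/27 = 2/3 ≈ 0.67,

and Theorem 2.2 predicts that E(N) - n* should eventually settle between roughly 0.67 and 1.67.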


Figure 2. (N - m)/m versus c for γ = 2 (+ = n_max, × = n_min, · = n̄, and horizontal line at κ/μ).

What we have found is that the values of n̄ - n* clearly lie between η and η + 1 for all choices of c under consideration, whatever the known value of κ, large or small. In other words, the asymptotic bound for E(N - n*) stated in Theorem 2.2 does guide us fairly accurately about what we may expect of E(N - n*) for small and moderate values of n*.

In Theorem 2.2, we concluded that E(N - n*) was asymptotically bounded, but we could not provide the limiting value of E(N - n*) as c → 0. In order to investigate the status of the limiting value of E(N - n*), we decided to proceed empirically as follows. For a fixed set of values of c, κ with μ = 3, γ = 2, we found one n̄ value based on r = 10,000 simulations. We recorded this as replication #1 and rewrote the corresponding n̄ value as n̄_1. Now, for the same fixed set of values of μ, c, κ, and γ, we ran 99 more independent replications, each based on r = 10,000 simulations, which successively gave rise to 99 more n̄_i values, i = 2, ..., 100. This way, we obtained an entire set of 100 independent estimated values of E(N), each based on r = 10,000 simulations. Figure 4 plots n̄_i - n*, i = 1, ..., 100, for μ = 3, γ = 2, and four fixed sets of values of c and κ. We have included two small values, one medium value, and one very large value of n* in this exercise.


Figure 3. (n̄ - n*) versus c for γ = 2, with vertical lines drawn from η to η + 1.

Only in Figure 4(d) did we notice a minuscule number (seven) of values of n̄ - n* out of 100 such values straying outside of the interval [η, η + 1]. In each situation, the n̄ - n* values clearly appeared to converge to a limiting value that depended upon μ and κ. At this point, we conjecture that lim_{c→0} E(N - n*) is a constant η_0 ≡ η_0(μ, κ). It will be an interesting mathematical exercise in the future to find an appropriate expression for the limiting value η_0.

3. UNKNOWN CLUMPING PARAMETER: THE ICV APPROACH

In this section, we assume that the mean μ(>0) is unknown and the clumping parameter κ(>0) is also unknown. Having recorded n observations X_1, ..., X_n, we reconsider the loss function L_n from (2.1) incurred in estimating μ by the sample mean X̄_n (= n^{-1} ∑_{i=1}^{n} X_i). The risk function associated with L_n can be written as

E_{μ,κ}(L_n) = σ²(nμ²)^{-1} = n^{-1}(μ^{-1} + κ^{-1}).   (3.1)


Figure 4. (n̄ - n*) versus replication number for γ = 2.

Now, let us assume an appropriate weight function (or a density function) g(k) for κ such that

(i) g(k) = 0 if k ≤ 0;  (ii) g(k) ≥ 0 if k > 0;  (iii) ∫_0^∞ g(k) dk = 1;
(iv) ∫_0^∞ k^{-s} g(k) dk is finite for all integers s(>0).   (3.2)

Let us denote ∫_0^∞ k^{-1} g(k) dk = a^{-1}, with a(>0).

If g(k) is chosen as a discrete weight function, then these and other integrals over k are to be replaced by the analogous finite (or infinite) sums over the appropriate domain of k. For brevity, however, we continue to work with the integrals with respect to g(k) whenever needed.
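As an aside (our own illustration, not part of the paper), a^{-1} is the only summary of g(·) that the methodology will require, and it is easily computed in either the discrete or the continuous case; the sketch below assumes NumPy and SciPy are available:

    import numpy as np
    from scipy.integrate import quad

    def a_inv_discrete(support, weights):
        # a^{-1} = sum over k of g(k)/k for a discrete weight function g.
        return float(np.sum(np.asarray(weights, float) / np.asarray(support, float)))

    def a_inv_continuous(g, lower, upper):
        # a^{-1} = integral of k^{-1} g(k) dk over the support of g.
        value, _ = quad(lambda k: g(k) / k, lower, upper)
        return value

    print(a_inv_discrete([2, 4, 6, 8], [0.25] * 4))    # ~0.2604, the weight g7(k) used in section 3.3
    print(a_inv_continuous(lambda k: 10.0, 0.3, 0.4))  # ~2.877, the uniform weight used in section 4.2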

In section 2, the risk function (3.1) was interpreted as the square of the CV. Now, the integrated risk (IR) function associated with (3.1) can be expressed as

IR_n = ∫_0^∞ E_{μ,k}(L_n) g(k) dk = n^{-1}(μ^{-1} + a^{-1}),   (3.3)


where IR_n is interpreted as the average of the squared CV, averaged with respect to the chosen weight function g(k). Hence, we refer to the methodology that we are about to present as the integrated CV (ICV) approach.

A very reasonable goal of an experimenter is to try and make IR_n "small." One may attempt to achieve this goal by first fixing a small preassigned number c(>0) and then designing a sampling strategy that would enable one to claim that IR_n ≤ c². Now, the fixed sample size required to achieve this goal, namely to have IR_n ≤ c², turns out to be the smallest integer

n ≥ c^{-2}(μ^{-1} + a^{-1}) = n**, say.   (3.4)

Here, n** may be referred to as the "optimal" fixed sample size, but its magnitude remains unknown because it involves μ! Hence, aiming to implement the "optimal fixed-sample-size design" to collect data is again definitely out of the question.

First, we develop a new two-stage sampling design and then summarize some of its associated properties in section 3.1, analogous to those discussed earlier in sections 2.1-2.2.

3.1. A New Sampling Design and Its Properties

Recall the expression of n** from (3.4) and observe that n** > (ac²)^{-1}, whereas this lower bound is a known entity with a^{-1} = ∫_0^∞ k^{-1} g(k) dk. Let us define the pilot sample size

m ≡ m(c) = ⟨(ac²)^{-1}⟩ + 1,   (3.5)

and let X_1, ..., X_m be the pilot observations. Next, we choose and fix a number γ(>0) and define

N ≡ N(c) = ⟨{(X̄_m + m^{-γ})^{-1} + a^{-1}} c^{-2}⟩ + 1.   (3.6)

It should be clear that N is an estimator of n** based on the pilot observations. The role of the term m^{-γ} is again to make sure that the pilot estimator X̄_m + m^{-γ} for μ remains positive with probability one.

Now, having determined N, we sample the difference (N - m) in the second stage, thereby obtaining additional observations X_{m+1}, ..., X_N. Then, based on the totality of all gathered observations X_1, ..., X_m, X_{m+1}, ..., X_N, we propose to estimate μ by the associated sample mean, X̄_N.
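Operationally, the only change from the sketch given after section 2.1 is that the known κ^{-1} is replaced by a^{-1} in both the pilot size and the stopping rule. A minimal sketch (ours), assuming NumPy and a user-supplied function draw(size) that returns observations from the population under study:

    import numpy as np

    def two_stage_icv(draw, a_inv, c, gamma=2.0):
        # Pilot size (3.5): m = <(a c^2)^{-1}> + 1, with a^{-1} computed from the weight function g.
        m = int(np.floor(a_inv / c**2)) + 1
        pilot = np.asarray(draw(m), dtype=float)
        xbar_m = pilot.mean()
        # Final size (3.6): N = <{(xbar_m + m^{-gamma})^{-1} + a^{-1}} c^{-2}> + 1.
        N = int(np.floor(((xbar_m + m ** (-gamma)) ** -1 + a_inv) / c**2)) + 1
        second = np.asarray(draw(N - m), dtype=float)
        xbar_N = np.concatenate([pilot, second]).mean()
        return m, N, xbar_N

    # Example: a negative binomial population with mu = 3 and kappa = 0.7, both unknown to the procedure.
    rng = np.random.default_rng(7)
    mu, kappa = 3.0, 0.7
    sampler = lambda size: rng.negative_binomial(kappa, kappa / (mu + kappa), size)
    print(two_stage_icv(sampler, a_inv=1.2987, c=0.1))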


Two important entities we should carefully analyze are the following. First, we define the integrated average sample number (IASN)

IASN ≡ IASN_c = ∫_0^∞ E_{μ,k}(N) g(k) dk,   (3.7)

and then the integrated sequential risk (ISR)

ISR ≡ ISR_c = ∫_0^∞ E_{μ,k}(L_N) g(k) dk.   (3.8)

Obviously, both IASN and ISR depend upon c. The following theorems summarize crucial asymptotic behavior of the IASN and ISR criteria.

Theorem 3.1. For the two-stage procedure (3.5)-(3.6), with γ > 0, we have as c → 0:

n**^{-1} IASN_c → 1, that is, [∫_0^∞ E_{μ,k}(N) g(k) dk] / n** → 1,

where g(k) satisfies the conditions (i)-(iv) in (3.2). This property is referred to as the (first-order) integrated asymptotic efficiency.

Theorem 3.2. For the two-stage procedure (3.5)-(3.6), with γ > 0, we have as c → 0:

c^{-2} ISR_c → 1, that is, [∫_0^∞ E_{μ,k}(L_N) g(k) dk] / c² → 1,

where g(k) satisfies the conditions (i)-(iv) in (3.2). This property is referred to as the (first-order) integrated asymptotic risk efficiency.

Theorem 3.1 shows that the two-stage procedure is "asymptotically efficient" in the sense that the integrated average sample size may be expected to be close to n**, whereas Theorem 3.2 shows that the integrated sequential risk may be expected to be near c², the preassigned target.

Theorem 3.3. For the two-stage procedure (3.5)-(3.6), with γ > 1, we have as c → 0:

η* + o(1) ≤ IASN_c - n** ≤ η* + 1 + o(1),

where g(k) satisfies the conditions (i)-(iv) in (3.2) and η* = aμ^{-3}(μ + a^{-1}μ²). This property is referred to as the (second-order) integrated asymptotic efficiency.


Theorem 3.3 provides the asymptotic second-order bounds for IASN_c - n**. Obviously, this result is stronger than Theorem 3.1 when γ > 1. When we could assume that κ was known, Theorem 2.2 gave analogous asymptotic second-order bounds η and η + 1 with η ≡ η(μ, κ) = κμ^{-3}(μ + κ^{-1}μ²). One may note that η* is not merely given by ∫_0^∞ η(μ, k) g(k) dk, that is, by averaging η(μ, k) with the assumed weight function g(k) of κ.

We require complicated new techniques to prove these results. The details are deferred to the appendix.

3.2. Some Guidelines to Choose a Weight Function

In order to implement the methodology (3.5)-(3.6), one only has to obtain a, where a^{-1} = ∫_0^∞ k^{-1} g(k) dk. If one feels comfortable with the characteristics cited in Theorems 3.1-3.2, then any choice of γ > 0 would work just fine. The boundedness of IASN_c - n**, however, holds when γ is chosen to exceed one. The condition regarding the finiteness of ∫_0^∞ k^{-s} g(k) dk for all positive integral s is used only in the proofs of Theorems 3.1-3.3. A practitioner, however, needs to specify g(k) so that a can be found easily. Other specific elicitations of g(k) beyond this are not essential for the methodology to work, or for its characteristics as stated in Theorems 3.1-3.2 to hold. That is, the proposed methodology is very "robust" with regard to many choices of the weight function g(k).

Now, if the support of g(k) is chosen compact, then the conditions stated in (3.2) will be automatically satisfied. Again, in this situation there may be many possible choices of g(k). Suppose, for example, that the unknown parameter κ is believed to be around 0.8, but there is considerable uncertainty around this value. In what follows, we give examples (Table 1) of six possible supports and some associated g(k) functions. These choices tend to model the uncertainty of κ around the value 0.8 differently from one another, and yet for each g(k), one may easily verify that a^{-1} = 1.2987.

For these six and innumerable other choices of g functions with the same value of a^{-1}, the methodology (3.5)-(3.6) will coincide!

Table 1. Examples of weight functions g(k) for κ; a^{-1} = 1.2987

k      g1(k)   g2(k)     g3(k)     g4(k)    g5(k)     g6(k)
0.7    0.3     0.25631   0.53343   0.4313   0.48961   0.50000
0.8    0.6     0.70000   0.06657   0.3000   0.20000   0.30000
0.9    0.1     0.04369   0.40000   0.2687   0.30000   0.15093
1.5    -       -         -         -        0.01039   0.04907


The g function's support may instead be (0, ∞) or some particular subinterval of (0, ∞). In reality, one may alternatively think of a specific g function first and then accordingly determine the a that is to be used in (3.5)-(3.6). But then there can be many other possible g functions with the matching value a. In this sense, we can again claim that the proposed estimation methodology (3.5)-(3.6) is "robust." This is indeed an attractive characteristic of the proposed approach.

From a purely methodological point of view, pondering over a specific choice of the g function may amount more to mental conditioning than to something that is an absolute necessity. When the support is (0, ∞), we may want to focus on the density function corresponding to an inverse gamma distribution, among a host of other choices. Again, a specific choice hardly matters as long as a remains unchanged. We may let

g(k) ≡ g(k | α, β) = 0 if k ≤ 0, and g(k | α, β) = {β^α Γ(α)}^{-1} k^{-α-1} exp{-(βk)^{-1}} if k > 0,   (3.9)

with known α > 0, β > 0. Obviously, in this case one has ∫_0^∞ k^{-s} g(k) dk = β^s Γ(s + α)/Γ(α), s > 0.

An experimenter may start by plotting the g(k) function with several choices of α, β, examine which one captures best the sense of uncertainty about κ, and then proceed from there. On the other hand, even though κ is unknown, an experimenter may be able to specify what may be expected on an "average." That is, one may start with a conceived value of the reciprocal of β(α - 1), which immediately demands that α be chosen exceeding one. But, having fixed β(α - 1) with some α > 1, the experimenter would still face many possible choices! Again, one may plot some of these g functions and fairly quickly zero in on one that captures best the sense of uncertainty in practice.

For example, suppose that g has to be chosen in such a way that the "average" of the unknown κ with respect to g should be 1/2, that is, ∫_0^∞ k g(k | α, β) dk = β^{-1}(α - 1)^{-1} = 1/2. One can check that (α, β) = (1.1, 20), (2, 2), (3, 1), (4, 2/3) are then some possible choices. One can clearly see from Figure 5 that these four g functions capture the sense of uncertainty about κ differently from one another. Once one zeroes in on a specific pair of α(>1), β(>0), then a^{-1} is simply αβ.

Alternatively, we may consider the density function associated with a lognormal distribution and let

g(k) ≡ g(k | θ, τ) = 0 if k ≤ 0, and g(k | θ, τ) = [τ√(2π)]^{-1} k^{-1} exp{-½(log k - θ)² τ^{-2}} if k > 0,   (3.10)


Figure 5. Inverse gamma density (3.9) with (α, β) = (1.1, 20), (2, 2), (3, 1), (4, 2/3): the "average" κ is 1/2 for each choice.

with known -∞ < θ < ∞, τ > 0. Then, one obviously has

∫_0^∞ k^{-s} g(k) dk = exp(-sθ + ½ s² τ²).

An experimenter may again start by plotting the g(k) function with several choices of θ, τ, examine which one captures best the sense of uncertainty about κ, and then proceed from there. We have a^{-1} = exp(-θ + ½τ²), and we pick, for example, the pairs (θ, τ²) = (2, 2), (3, 4), (5/4, 1/2), (9/8, 1/4). In Figure 6, plots of these four g functions show how differently each captures the uncertainty about κ, and yet we have a^{-1} = e^{-1} ≈ 0.36788 for each choice. That is, under any of these g functions, the statistical methodology (3.5)-(3.6) would coincide!
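A quick numeric check of this claim (our own verification, using only the Python standard library):

    from math import exp

    # a^{-1} = exp(-theta + tau^2 / 2) for the lognormal weight (3.10).
    for theta, tau_sq in [(2.0, 2.0), (3.0, 4.0), (5 / 4, 1 / 2), (9 / 8, 1 / 4)]:
        print(theta, tau_sq, exp(-theta + tau_sq / 2.0))  # each line ends with ~0.36788 = 1/e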

It should be reassuring to know that an explicit model for the uncertainty of κ would, for all practical purposes, largely affect neither data collection nor the analysis of such data, as long as one can zero in on the magnitude of a.

Remark 3.1. A referee remarked that although an infinite number of g(k)'s result in any given value of a, the integrated risk for values of κ within the domain of g(k) would likely depend on the shape of g(k), and that it would be nice to know how the two might interrelate.


Figure 6. Lognormal density (3.10) with (θ, τ²) = (2, 2), (3, 4), (5/4, 1/2), (9/8, 1/4): a^{-1} = e^{-1} ≈ 0.36788 for each choice.

From Figures 5 and 6, one notes that we have considered several kinds of shapes for the weight function g(k). Admittedly, we have not experimented with all possible shapes for g(k), but for those shapes that we have included here, we find that the actual shape of g(k) has no impact on the performance of our proposed methodology (3.5)-(3.6) as long as one can zero in on the magnitude of a.

3.3. Simulation Results

A statistical summary of the simulation results obtained for the case where κ is assumed unknown is given in the following tables. As in section 2.3, the simulation study was again conducted using a wide range of values of μ and κ. Each simulation run was replicated 15,000 times for a fixed set of values of c, μ, κ and γ = 2. We considered c = 0.2, 0.1, and 0.03 in order to include examples corresponding to small, moderate, and large sample sizes n**. But, for every fixed pair of values of κ and c, we again noted that the performances of our simulated experiments generally showed unmistakably similar features whether we had fixed μ = 3, μ < 3, or μ > 3. Hence, we summarize our findings only from those simulations that were run with fixed μ = 3.


Table 2. Simulated values of the integrated risk IR_{n**} with fixed γ = 2, a^{-1} = 1.2987, and g1(k)

κ      avg IR_{n**}   min(IR_{n**})   max(IR_{n**})

c = 0.20, n** = 40.80
0.7    0.03937        0.02911         0.04315
0.8    0.03938        0.02819         0.04234
0.9    0.03940        0.02852         0.04234

c = 0.10, n** = 163.20
0.7    0.00996        0.00922         0.01040
0.8    0.00996        0.00930         0.01027
0.9    0.00996        0.00941         0.01030

c = 0.03, n** = 1813.37
0.7    0.00090        0.00089         0.00091
0.8    0.00090        0.00089         0.00091
0.9    0.00090        0.00089         0.00091

In the first row of Table 2, having fixed μ = 3, c = 0.2, κ = 0.7, and γ = 2, we first used the weight function g1(k) from Table 1, so that a^{-1} = 1.2987, which gave rise to n** = 40.801. We generated a negative binomial distribution with μ = 3, κ = 0.7, but, pretending that they were unknown, we took a random sample of size n = 41, estimated IR_n given by (3.3), and replicated this process 15,000 times independently. The corresponding estimated average, estimated minimum (min(IR_{n**})), and estimated maximum (max(IR_{n**})) values of the computed (integrated) risk are provided in the first row of Table 2. The other rows in Table 2 were constructed analogously by using the same weight function g1(k) from Table 1. Clearly, we found the average IR_{n**} ≤ c² in every situation; however, in a very small number of the 15,000 simulations the computed value of IR_{n**} exceeded c² ever so slightly, and hence max(IR_{n**}) provided in the last column turned out slightly larger than c².

In Table 3, we have supplied the average sample size (n̄) and the estimated population mean (x̄), together with their estimated standard errors s_n̄ and s_x̄, found in columns 4 and 8, respectively. We emphasize that the results presented in Table 3 were obtained assuming the weight function g1(k) for κ from Table 1, using 15,000 simulations to produce each row. Further, n_min from column 5 clearly shows that the simulations never terminated with pilot samples alone, n_min < n** < n_max, and (n̄ - n**) is positive as well as small for all values of c, κ under consideration.

The sets of simulations that provided Tables 2-3 were then repeated assuming the other g functions from Table 1.


Table 3. Estimated average, minimum, and maximum sample size, and average estimated μ, with fixed γ = 2, a^{-1} = 1.2987, and g1(k)

                κ     n̄         s_n̄      n_min   n_max   x̄       s_x̄
c = 0.20        0.7   41.75      0.0177   37      60      3.031   0.005
m = 33          0.8   41.74      0.0170   37      63      3.018   0.005
n** = 40.80     0.9   41.67      0.0160   37      62      3.026   0.005

c = 0.10        0.7   164.17     0.0329   152     184     3.006   0.003
m = 130         0.8   164.14     0.0312   153     186     3.004   0.002
n** = 163.20    0.9   164.05     0.0293   153     181     3.007   0.002

c = 0.03        0.7   1814.27    0.1069   1770    1867    3.001   0.001
m = 1443        0.8   1814.14    0.0987   1773    1862    3.001   0.001
n** = 1813.37   0.9   1814.32    0.0942   1775    1861    3.000   0.001

But all these weight functions had the same associated a^{-1} value, namely 1.2987, and hence those additional sets of simulations produced results that were practically indistinguishable from what we found in Tables 2 and 3.

Next, in order to get some idea about the performance of the proposed methodology for unknown but larger values of κ (>1), we included the following two weight functions in our investigation:

g7(k) = 0.25 for k = 2, 4, 6, 8, and
g8(k) = 0.25, 0.50, 0.25 for k = 10, 15, 20, respectively.

Note that the a^{-1} values associated with the weight functions g7(k) and g8(k) are 0.2604 and 0.0708, respectively. Thus, these two weight functions are associated with different n** and η* values.
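Both values follow directly from the finite-sum form of a^{-1}:

a^{-1}(g7) = 0.25 (1/2 + 1/4 + 1/6 + 1/8) ≈ 0.2604,  a^{-1}(g8) = 0.25/10 + 0.50/15 + 0.25/20 ≈ 0.0708.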

The integrated average sample number (IASN_c) from (3.7) was estimated in order to examine the validity of the result stated in Theorem 3.3 when the sample size n** was small or moderate. These were estimated from 10,000 simulations each, successively assuming the weight functions g_i(k), i = 1, ..., 8. For the sake of brevity, however, we present η̂, the estimated values of IASN_c - n**, in Table 4 only for the weight functions g_i(k), i = 1, 2, 3, 5, 7, and 8. Clearly, the entries in Table 4 show that the estimated values of IASN_c - n** lie between η* and η* + 1. This feature was intuitively expected in view of Theorem 3.3, regardless of the specific nature of the weight function g(k) that was used.

Remark 3.2. From Table 3, it may appear that the proposed estimator of μ has a slight positive bias, especially when c = 0.20, m = 33, and n** = 40.80.


Table 4. Estimated IASN_c - n** (= η̂) values associated with γ = 2 and the weight function g(k)

                   g1(k)     g2(k)     g3(k)     g5(k)     g7(k)    g8(k)
a^{-1}             1.2987    1.2987    1.2987    1.2987    0.2604   0.0708
η*                 0.42      0.42      0.42      0.42      0.76     1.90

c = 0.20:  η̂       0.93      0.91      0.94      0.92      1.27     1.93
           n**     40.80     40.80     40.80     40.80     14.84    10.10

c = 0.10:  η̂       0.94      0.92      0.94      0.94      1.32     2.36
           n**     163.20    163.20    163.20    163.20    59.37    40.41

c = 0.03:  η̂       0.83      0.91      0.81      0.84      1.20     2.40
           n**     1813.40   1813.40   1813.40   1813.40   659.70   449.04

But, in this particular case, we should note that both m and n** are quite small, whereas c appears fairly large. Hence, we would rather stay away from inferring too much from the three given simulated average x̄ values! In the same table, however, when we consider the x̄ and s_x̄ values together for c = 0.10 and 0.03, we note that the true μ (=3) value is included in five out of the six two-standard-deviation intervals around x̄, namely [x̄ - 2s_x̄, x̄ + 2s_x̄]. Mathematically speaking, the magnitude of the bias in X̄_N certainly depends upon μ, κ, m, and the bias is of the order O(m^{-2}). That is, any bias in the estimator of μ is expected to be quite small in practice if c is small. This sentiment is clearly validated by Table 3.

4. APPLICATION TO MEXICAN BEAN BEETLE DATASETS

In this section, we handle four datasets consisting of beetle infestation of Mexican bean crop, originally collected and investigated by Dr. Jose Barrigossi in his Ph.D. dissertation on integrated pest management (IPM) under his advisor, Dr. Leon G. Higley, in the Department of Entomology, University of Nebraska-Lincoln. For purposes of identification, we have named these datasets dataset 1, dataset 2, dataset 3, and dataset 4, respectively.

We aim at estimating the mean beetle infestation (μ) in Mexican bean crop with preassigned risk-bound c². Initially, we assumed that κ was known and hence used the CV approach (2.3)-(2.4), originally developed by Mukhopadhyay and Diaz (1985), to determine the required sample size for achieving the pre-fixed precision of the final estimator of μ.


Figure 7. Bar graphs of the four datasets on beetle infestation of Mexican bean crop.

It was anticipated that κ was close to 0.3 or 0.4, and hence the analysis was carried out using both these values. Next, the integrated coefficient of variation (ICV) approach developed in section 3 was applied to these datasets for the situation assuming that κ was unknown.

Figures 7(a)-7(d) present the bar graphs of datasets 1-4. The numbers 3840, 1695, 5210, and 1545 in these figures respectively stand for the number of observations in the datasets.


Table 5. Descriptive statistical summaries of the four datasets on beetle infestation of Mexican bean crop

              Dataset 1   Dataset 2   Dataset 3   Dataset 4
No. of obs.   3840        1695        5210        1545
Average       0.0148      0.0743      0.1451      0.1942
Variance      0.0157      0.0901      0.1690      0.2835
Std. dev.     0.1252      0.3002      0.4111      0.5325

The gaps seen between bars correspond to 0 (zero) observed values. From Figure 7(a), we see that even though there are 3,840 recorded observations in dataset 1, many of them are 0's. Datasets 3-4 look more appropriate for our analysis because these consist of a large number of positive values and also include some high values. For example, dataset 3 shows two 4's, and dataset 4 shows two 6's. In Table 5, we also present brief statistical summaries of these datasets. We assume that each dataset represents a random sample from a respective population, and hence a random sample drawn with replacement from such a dataset would represent independent negative binomial random variables. In this analysis, we treat each dataset as a "population" and proceed to estimate the mean (μ) of each population by drawing random samples with replacement. That is, in each case, the observations so drawn can be treated as independent random samples from a negative binomial population.

4.1. Analysis with Known Clumping Parameter

As explained earlier, we started our analysis assuming that the clumping parameter κ was known. First, we proceeded with the assumption that κ = 0.3, and then the analysis was repeated assuming κ = 0.4. From the simulation study given in section 2.3, we found that the results were not going to be influenced by the choice of γ, and so we fixed the value γ = 2 for all subsequent applications. We chose c = 0.1 in order to come up with moderately large sample sizes associated with the two-stage procedure (2.3)-(2.4). For a given value of κ, the initial sample size m was computed from (2.3). Table 6 shows m = 334 when κ = 0.3 and m = 250 when κ = 0.4. Clearly, a smaller value of κ is associated with a larger initial sample size m. The final sample size N was computed from (2.4). Here, X̄_m was obtained by taking a random sample of size m with replacement from the dataset under consideration.


Table 6. Application of the two-stage procedure (2.3)-(2.4) using datasets 1-4: c = 0.1 and γ = 2

                    Dataset 1   Dataset 2   Dataset 3   Dataset 4
κ = 0.3   m         334         334         334         334
          N         5898        1618        1076        890
          X̄_N       0.0129      0.0773      0.1292      0.1809
          n*        7070.2      1678.6      1022.5      848.3

κ = 0.4   m         250         250         250         250
          N         5247        1337        926         794
          X̄_N       0.0133      0.0733      0.1350      0.1877
          n*        6986.8      1595.2      939.2       765.0

Next, by taking another random sample of size N - m with replacement from the same dataset and then combining the two sets of observations, we obtained the sample mean X̄_N.
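A minimal sketch of this bootstrap-style application (ours, assuming NumPy; data stands for one of the four datasets loaded as a numeric array, which is not reproduced here):

    import numpy as np

    def apply_two_stage_cv(data, kappa, c=0.1, gamma=2.0, seed=0):
        # Treat `data` as the "population" and draw with replacement, as in section 4.1.
        rng = np.random.default_rng(seed)
        data = np.asarray(data, dtype=float)
        m = int(np.floor(1.0 / (kappa * c**2))) + 1                                   # pilot size (2.3)
        pilot = rng.choice(data, size=m, replace=True)
        xbar_m = pilot.mean()
        N = int(np.floor(((xbar_m + m ** (-gamma)) ** -1 + 1.0 / kappa) / c**2)) + 1  # final size (2.4)
        second = rng.choice(data, size=N - m, replace=True)
        xbar_N = np.concatenate([pilot, second]).mean()
        n_star = (1.0 / data.mean() + 1.0 / kappa) / c**2  # (2.2) with mu replaced by the dataset mean
        return m, N, xbar_N, n_star

    # e.g., apply_two_stage_cv(dataset3, kappa=0.3), where dataset3 is a placeholder name.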

Since there are 3,785 (that is, 98.6%) zero values in dataset 1, it turns out that with c = 0.1 the required number of observations from this population unfortunately exceeds the number of observations in the dataset. Further, for the given parameter values, the optimal sample size, n*, was computed using (2.2), replacing the population mean μ with the mean from the dataset. In the case of dataset 1, we note that n* = 7070.2 when κ = 0.3 and n* = 6986.8 when κ = 0.4 (see Table 6). Clearly, these n* values exceed the maximum number of observations available in the dataset. Both sample means obtained from dataset 1 using κ = 0.3 and κ = 0.4 respectively underestimate the mean of the dataset.

While using datasets 1, 3, and 4 with κ = 0.3 and κ = 0.4, we find that the sample mean underestimates the respective mean of each dataset. In the case of dataset 2, the sample mean overestimates (underestimates) the mean of this dataset when κ = 0.3 (κ = 0.4). However, the margin of over- or underestimation is rather small in all cases.

4.2. Analysis with Unknown Clumping Parameter

We were told that the unknown clumping parameter κ could be somewhere around 0.3 and 0.4, and hence for simplicity we assumed the weight function associated with a uniform distribution on the interval (0.3, 0.4). That is, we fixed

g(k) = 10 I(0.3 ≤ k ≤ 0.4),

with a^{-1} = 2.876821 ≈ 2.877. Here and elsewhere, I(·) stands for the indicator function of (·).
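The quoted value of a^{-1} is just the integral in (3.2) evaluated for this uniform weight:

a^{-1} = ∫_0^∞ k^{-1} g(k) dk = ∫_{0.3}^{0.4} 10 k^{-1} dk = 10 log(4/3) ≈ 2.876821.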


Table 7. Application of the two-stage procedure (3.5)-(3.6) using datasets 1-4: c = 0.1, γ = 2, and a^{-1} = 2.877

         Dataset 1   Dataset 2   Dataset 3   Dataset 4
m        288         288         288         288
N        5085        1488        1027        842
X̄_N      0.0136      0.0726      0.1315      0.1876
n**      7024.5      1632.9      976.9       802.7

Table 7 gives the results obtained by using the ICV approach (3.5)-(3.6) with a^{-1} = 2.877, c = 0.1, and γ = 2 successively on datasets 1-4. As expected, both the initial and final sample sizes in this situation fell between the corresponding values of m and N that were obtained (Table 6) when we had assumed a known value κ = 0.3 or κ = 0.4. The final estimator X̄_N underestimated μ for all datasets 1-4.

Table 7 (and Table 6) shows that the accuracy of the estimated value of μ will depend on the calculated value of a^{-1}.

Figure 8. Sample size N and sample mean x̄_N versus a^{-1}, with c = 0.1 and γ = 2, for dataset 1.


Figure 9. Sample size N and sample mean x̄_N versus a^{-1}, with c = 0.1 and γ = 2 (-·- = dataset 2, - - = dataset 3, and -+- = dataset 4).

To examine the sensitivity with respect to the chosen value of a^{-1}, we obtained N and X̄_N for different choices of a^{-1}. Figures 8 and 9 show the nature of the sensitivity of N and X̄_N over the chosen values of a^{-1} while using datasets 1-4. We may suppose, for example, that the unknown clumping parameter κ has a weight function g(k) which is positive whenever k ∈ (0.20, 1.0) and 0 elsewhere. Then, clearly, we have 1 < a^{-1} < 5. Therefore, in Figures 8 and 9, we present a "sensitivity analysis" with a^{-1} = 1.0, 1.2, 1.4, ..., 5.0. From these figures, we note that although N increases with a^{-1}, the values of X̄_N become quite stable when a^{-1} > 2.5.

Remark 4.1. A referee has pointed out that in practice, the mean (and hence the variance) would be related to the sampling unit size. We could not agree more. In order to take the effect of varying sampling unit size into account, one may start with a kind of stratification with respect to sampling unit size. Then, one might try to come up with some appropriate modifications of the proposed methodologies. The idea sounds simple enough, but it may be something else altogether when one proceeds to check some of the associated theoretical properties. We hope to pursue this problem in a later communication.


APPENDIX A. DERIVATIONS AND PROOFS

We start with some preliminaries and then supply proofs of Theorems 2.1-2.3 and Theorems 3.1-3.3. In order to prove Theorems 3.1-3.3, we need some lemmas, which are proved here too.

In what follows, we frequently use a generic function f(X̄_m) and obtain the Taylor expansion of f(x) with a remainder term by treating x as a positive continuous variable. We emphasize, however, that in the sequel an entity such as E{f(X̄_m)} or E_{μ,κ}{f(X̄_m)} is evaluated using the true discrete distribution of the random variable X̄_m under consideration.

A.1. Some Preliminaries

We suppose that X has the distribution NB(μ, κ). Let us temporarily drop the subscript for expectations, namely μ or (μ, κ), as the case may be. Recall that E[X] = μ and V[X] = μ + κ^{-1}μ². The following factorial moments of X are easily found in Johnson and Kotz (1969, pp. 125-126):

E[X(X - 1)] = (κ + 1)κ^{-1}μ²,
E[X(X - 1)(X - 2)] = (κ + 2)(κ + 1)κ^{-2}μ³,   (A.1)
E[X(X - 1)(X - 2)(X - 3)] = (κ + 3)(κ + 2)(κ + 1)κ^{-3}μ⁴.

Now, expressing X³ and X⁴ as

X³ = X(X - 1)(X - 2) + 3X(X - 1) + X,
X⁴ = X(X - 1)(X - 2)(X - 3) + 6X(X - 1)(X - 2) + 7X(X - 1) + X,   (A.2)

we obtain from (A.1)-(A.2):

E[(X - μ)³] = 2κ^{-2}μ³ + 3κ^{-1}μ² + μ,
E[(X - μ)⁴] = 3(κ + 2)κ^{-3}μ⁴ + 6(κ + 2)κ^{-2}μ³ + (3κ + 7)κ^{-1}μ² + μ.   (A.3)

Next, observing that ∑_{i=1}^{m} X_i has the distribution NB(mμ, mκ), from (A.3) we immediately obtain

E[(X̄_m - μ)³] = (2κ^{-2}μ³ + 3κ^{-1}μ² + μ)m^{-2} = a_1 m^{-2}, say,
E[(X̄_m - μ)⁴] = 3μ²(1 + μκ^{-1})² m^{-2} + (6κ^{-3}μ⁴ + 12κ^{-2}μ³ + 7κ^{-1}μ² + μ)m^{-3}
             = a_2 m^{-2} + a_3 m^{-3}, say.   (A.4)


We remark in passing that we did not find the expressions given by (A.3)-(A.4) in a readily accessible source, so these are included here for completeness.
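As a quick empirical sanity check of (A.3) (ours, not part of the authors' argument), the two expressions can be compared against Monte Carlo moments, using the same NB(μ, κ) mapping as in the earlier sketches:

    import numpy as np

    mu, kappa = 3.0, 3.0
    rng = np.random.default_rng(0)
    x = rng.negative_binomial(kappa, kappa / (mu + kappa), size=2_000_000).astype(float)

    m3_formula = 2 * mu**3 / kappa**2 + 3 * mu**2 / kappa + mu
    m4_formula = (3 * (kappa + 2) * mu**4 / kappa**3 + 6 * (kappa + 2) * mu**3 / kappa**2
                  + (3 * kappa + 7) * mu**2 / kappa + mu)

    print(np.mean((x - mu) ** 3), m3_formula)  # Monte Carlo vs. (A.3), third central moment
    print(np.mean((x - mu) ** 4), m4_formula)  # Monte Carlo vs. (A.3), fourth central moment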

A.2. Proof of Theorem 2.1

We can obviously claim that

m ≤ κ^{-1}c^{-2} + 1  and  N ≥ c^{-2}{(X̄_m + m^{-γ})^{-1} + κ^{-1}}.

Thus, we can express

P_μ(N = m) ≤ P_μ[m ≥ c^{-2}{(X̄_m + m^{-γ})^{-1} + κ^{-1}}]
         ≤ P_μ[κ^{-1}c^{-2} + 1 ≥ c^{-2}{(X̄_m + m^{-γ})^{-1} + κ^{-1}}]
         = P_μ[X̄_m - μ ≥ c^{-2} - μ - m^{-γ}].   (A.5)

It is clear that E_μ[|X̄_m - μ|^{2p}] = O(m^{-p}) for any fixed p(>0). Now, we pick c sufficiently small (≤ c_0, with c_0 > 0) such that c^{-2} - μ - m^{-γ} is positive. Then, for c ≤ c_0, using the Markov inequality, we can rewrite (A.5) as

P_μ(N = m) ≤ P_μ[|X̄_m - μ| ≥ c^{-2} - μ - m^{-γ}]
          ≤ (c^{-2} - μ - m^{-γ})^{-2p} E_μ[|X̄_m - μ|^{2p}]
          = (c^{-2} - μ - m^{-γ})^{-2p} O(m^{-p})
          = O(m^{-3p}).   (A.6)

But p(>0) is arbitrary! ∎

A.3. Proof of Theorem 2.2

Let us assume that γ > 1. Now, using (A.4) and a Taylor expansion, with some random variable U between X̄_m and μ, we can write

E_μ[(X̄_m + m^{-γ})^{-1}] = (μ + m^{-γ})^{-1} + (μ + m^{-γ})^{-3} E_μ[(X̄_m - μ)²]
   - (μ + m^{-γ})^{-4} E_μ[(X̄_m - μ)³] + (μ + m^{-γ})^{-5} E_μ[(X̄_m - μ)⁴] - E_μ[(X̄_m - μ)⁵(U + m^{-γ})^{-6}]
   = μ^{-1} + O(m^{-γ}) + {μ^{-3} + O(m^{-γ})}σ²m^{-1} + O(m^{-2}) - E_μ[(X̄_m - μ)⁵(U + m^{-γ})^{-6}].   (A.7)


Next, we proceed to prove that

E_μ[(X̄_m - μ)⁵(U + m^{-γ})^{-6}] = O(m^{-2}).   (A.8)

Now, we first note that U > ½μ on the set [X̄_m ≥ ½μ], and hence

E_μ[|(X̄_m - μ)⁵(U + m^{-γ})^{-6} I(X̄_m ≥ ½μ)|] ≤ (½μ)^{-6} E_μ[|X̄_m - μ|⁵] = O(m^{-5/2}),   (A.9)

which is O(m^{-2}). Recall that we write I(·) for the indicator function of (·).

Next, we first use Hölder's inequality and then the Markov inequality to obtain

E_μ[|(X̄_m - μ)⁵(U + m^{-γ})^{-6} I(X̄_m < ½μ)|]
   ≤ m^{6γ} E_μ[|X̄_m - μ|⁵ I(X̄_m < ½μ)]
   ≤ m^{6γ} E_μ^{5/6}[|X̄_m - μ|⁶] P_μ^{1/6}(X̄_m < ½μ)
   ≤ O(m^{6γ - 5/2}) P_μ^{1/6}(|X̄_m - μ| > ½μ)
   = O(m^{6γ - 5/2}) O(m^{-p/6}),   (A.10)

which is O(m^{-2}) if p(>0) is chosen sufficiently large. Thus, (A.8) follows from (A.9) and (A.10). Now, (A.7) is rewritten as

E_μ[(X̄_m + m^{-γ})^{-1}] = μ^{-1} + O(m^{-γ}) + {μ^{-3} + O(m^{-γ})}σ²m^{-1} + O(m^{-2}).   (A.11)

Now, from (2.4), we obtain

{κ^{-1} + (X̄_m + m^{-γ})^{-1}} c^{-2} ≤ N ≤ {κ^{-1} + (X̄_m + m^{-γ})^{-1}} c^{-2} + 1,   (A.12)

and hence, utilizing (2.2) and (A.11), we can write

{E_μ[(X̄_m + m^{-γ})^{-1}] - μ^{-1}} c^{-2} ≤ E(N - n*) ≤ {E_μ[(X̄_m + m^{-γ})^{-1}] - μ^{-1}} c^{-2} + 1.   (A.13)

But the left-hand side of (A.13) reduces to κσ²μ^{-3} once we exploit the fact that c^{-2} ≈ mκ. ∎


A.4. Proof of Theorem 2.3

Part (i). With γ > 1/2, observe that

m^{1/2}(X̄_m + m^{-γ} - μ) →_L N(0, σ²) as c → 0
⇒ Y_m ≡ m^{1/2}{(X̄_m + m^{-γ})^{-1} - μ^{-1}} →_L N(0, σ²μ^{-2}) as c → 0,   (A.14)

via a Taylor expansion. Next, from (A.12), note that n*^{-1/2}(N - n*) and n*^{-1/2}c^{-2}{(X̄_m + m^{-γ})^{-1} - μ^{-1}} must have the same asymptotic distribution. But we can rewrite

n*^{-1/2}c^{-2}{(X̄_m + m^{-γ})^{-1} - μ^{-1}} = n*^{-1/2}c^{-2}m^{-1/2} Y_m.

That is, in view of (A.14), we can conclude that

n*^{-1/2}(N - n*) →_L N(0, ρ) as c → 0,

since lim_{c→0} n*^{-1/2}c^{-2}m^{-1/2} = σ^{-1}μκ^{1/2}.

Part (ii). Let us denote Q = {κ^{-1} + (X̄_m + m^{-γ})^{-1}} c^{-2}, so that we obviously have Q ≤ N ≤ Q + 1, and hence we can write

N - n* ≤ Q - n* + 1.   (A.15)

Thus, part (ii) would follow if we verify that the following assertion holds:

(Q - n*)²/n* is uniformly integrable for sufficiently small c, if γ > 1/2.   (A.16)

Now, with δ = min(2, γ), we can rewrite (A.11) as

E_μ[(X̄_m + m^{-γ})^{-1}] = μ^{-1} + μ^{-3}σ²m^{-1} + o(m^{-δ}).   (A.17)

Using a similar technique, we can also have

E_μ[(X̄_m + m^{-γ})^{-2}] = μ^{-2} + (μ^{-2} + 2μ^{-4})σ²m^{-1} + o(m^{-δ}).   (A.18)

Then, combining (A.17)-(A.18), one obtains

E_μ[{(X̄_m + m^{-γ})^{-1} - μ^{-1}}²] = μ^{-2}σ²m^{-1} + o(m^{-δ}),   (A.19)


so that

E_μ[m{(X̄_m + m^{-γ})^{-1} - μ^{-1}}²] → μ^{-2}σ²

when γ > 1. In other words, E_μ[mc⁴(Q - n*)²] → μ^{-2}σ², and hence E_μ[(Q - n*)²/n*] → ρ when γ > 1. Also, along the lines of part (i), we can claim that n*^{-1/2}(Q - n*) →_L N(0, ρ). This distributional convergence, along with the immediately preceding moment convergence, implies the validity of the assertion made in (A.16).

Part (iii). We start by writing

V(N) = E_μ[{N - E(N)}²] = E_μ[(N - n*)²] - {E(N) - n*}².   (A.20)

Next, in view of parts (i) and (ii), we have E_μ[(N - n*)²] = ρn* + o(n*) if γ > 1. From Theorem 2.2, we immediately conclude that

n*^{-1/2}{E(N) - n*} = o(1) ⇒ {E(N) - n*}² = o(n*)   (A.21)

when γ > 1. Now, part (iii) follows immediately by combining (A.20) and (A.21). ∎

A.5. Proof of Theorem 3.1

Along the line of (A.7), with arbitrary γ > 0 and some random variable U between X̄_m and μ, we can write

E_{μ,k}[(X̄_m + m^{-γ})^{-1}] = (μ + m^{-γ})^{-1} + E_{μ,k}[(X̄_m - μ)²(U + m^{-γ})^{-3}]
                            = (μ + m^{-γ})^{-1} + E_{μ,k}(W), say.   (A.22)

Thus, along the line of (A.13), we can obtain

{a^{-1} + E_{μ,k}[(X̄_m + m^{-γ})^{-1}]} c^{-2} ≤ E_{μ,k}(N) ≤ {a^{-1} + E_{μ,k}[(X̄_m + m^{-γ})^{-1}]} c^{-2} + 1,

so that we have

{a^{-1} + (μ + m^{-γ})^{-1} + ∫_0^∞ E_{μ,k}(W) g(k) dk} c^{-2} ≤ IASN_c
   ≤ {a^{-1} + (μ + m^{-γ})^{-1} + ∫_0^∞ E_{μ,k}(W) g(k) dk} c^{-2} + 1

⇒ (a^{-1} + μ^{-1})^{-1}[{a^{-1} + (μ + m^{-γ})^{-1}} + ∫_0^∞ E_{μ,k}(W) g(k) dk]
   ≤ IASN_c/n** ≤ (a^{-1} + μ^{-1})^{-1}[{a^{-1} + (μ + m^{-γ})^{-1}} + ∫_0^∞ E_{μ,k}(W) g(k) dk] + n**^{-1}.   (A.23)



Now, the desired result clearly follows from (A.23) if we prove the following:

∫₀^∞ E_{μ,k}(W) g(k) dk → 0 as c → 0. (A.24)

In doing so, let us upgrade the argument that led to (2.6) in Mukhopadhyay and Diaz (1985) to obtain

E_{μ,k}[W I(X̄_m ≥ μ/2)] ≤ 8(μ + k^{-1}μ²) μ^{-3} m^{-1}

⇒ 0 ≤ ∫₀^∞ E_{μ,k}[W I(X̄_m ≥ μ/2)] g(k) dk ≤ 8(μ + a^{-1}μ²) μ^{-3} m^{-1}

⇒ lim_{c→0} ∫₀^∞ E_{μ,k}[W I(X̄_m ≥ μ/2)] g(k) dk = 0. (A.25)

Similarly, upgrading the argument that led to (2.7) in Mukhopadhyay and Diaz (1985), with Z = |X̄_m − μ| and an arbitrary positive integer p, we claim

E_{μ,k}[W I(X̄_m < μ/2)] ≤ (2/μ)^{2p} m^{3γ} E^{1/2}_{μ,k}(Z^4) E^{1/2}_{μ,k}(Z^{2p}). (A.26)

At this point, referring to the expression of the jth factorial moment of X given in Johnson and Kotz (1969, p. 126), it follows that for sufficiently small c, with some positive integer q ≡ q(p), we can express

E_{μ,k}(Z^4) E_{μ,k}(Z^{2p}) ≤ m^{-2-p} ∑_{s=0}^{q} b_s k^{-s},

where the b_s's are nonnegative and may involve μ and other constants, but not k. Then, we obtain

0 ≤ ∫₀^∞ E_{μ,k}[W I(X̄_m < μ/2)] g(k) dk ≤ (2/μ)^{p} m^{3γ} m^{-1-q/2} β*,

where β* = ∫₀^∞ √(∑_{s=0}^{q} b_s k^{-s}) g(k) dk, β* > 0. (A.27)

But, by Jensen's inequality, one observes that

0 < β* ≤ √(∑_{s=0}^{q} ∫₀^∞ b_s k^{-s} g(k) dk), which is finite and does not involve c,

⇒ lim_{c→0} ∫₀^∞ E_{μ,k}[W I(X̄_m < μ/2)] g(k) dk = 0, (A.28)


once p is chosen sufficiently large. Now combining the last steps from (A.25) and (A.28), the validity of (A.24) is immediate. □
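For completeness, the way (A.24) delivers the theorem can be recorded as a one-line worked step (assuming, as the bracketing in (A.23) indicates, that the assertion being proved is IASN_c/n** → 1): as c → 0,

(a^{-1} + μ^{-1})^{-1} {a^{-1} + (μ + m^{-γ})^{-1}} → (a^{-1} + μ^{-1})^{-1} (a^{-1} + μ^{-1}) = 1 and n**^{-1} → 0,

so with ∫₀^∞ E_{μ,k}(W) g(k) dk → 0, both bounds in (A.23) converge to 1.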

A.6. Proof of Theorem 3.2

Let us first denote X_m = (X_1, . . . , X_m), S_m = ∑_{i=1}^{m} X_i, S_N = ∑_{i=1}^{N} X_i, S*_N = ∑_{i=1}^{N−m} X_i, and express

E_{μ,k}[(X̄_N − μ)²]

= E_{μ,k}[(S_m − mμ)²/N² + {S*_N − (N − m)μ}²/N² + 2(S_m − mμ){S*_N − (N − m)μ}/N²]

= E_{μ,k}{E[(S_m − mμ)²/N² + {S*_N − (N − m)μ}²/N² + 2(S_m − mμ){S*_N − (N − m)μ}/N² | X_m]}.

(A.29)

Now, while taking the conditional expectation, we observe that given X_m, we can claim that (1) both S_m and N are fixed, and (2) S*_N is the sum of N − m independent and identically distributed random variables. Thus, (A.29) simplifies to

E_{μ,k}[(X̄_N − μ)²] = E_{μ,k}[(S_m − mμ)²/N²] + E_{μ,k}[(N − m)σ²/N²]

= J^(1)_{μ,k} + J^(2)_{μ,k}, say. (A.30)
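The passage from (A.29) to (A.30) can be recorded as a short worked step; with σ² denoting the variance of a single observation and conditioning on X_m (so that N is fixed),

E[S*_N − (N − m)μ | X_m] = 0 and E[{S*_N − (N − m)μ}² | X_m] = (N − m)σ²,

so the cross term in (A.29) has conditional expectation zero and the middle term reduces to E_{μ,k}[(N − m)σ²/N²].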

Lemma A.1. For the two-stage procedure (3.5)–(3.6), with γ > 0, we have:

c^{-2} μ^{-2} ∫₀^∞ σ² E_{μ,k}[1/N] g(k) dk → 1 as c → 0.

Lemma A.2. For the two-stage procedure (3.5)–(3.6), with γ > 0, we have:

m c^{-2} μ^{-2} ∫₀^∞ σ² E_{μ,k}[1/N²] g(k) dk → μ/(μ + a) as c → 0.

Lemma A.3. For the two-stage procedure (3.5)–(3.6), with γ > 0, we have:

c^{-2} μ^{-2} ∫₀^∞ J^(1)_{μ,k} g(k) dk → μ/(μ + a) as c → 0,

where J^(1)_{μ,k} = E_{μ,k}[(S_m − mμ)²/N²].

Subsequently, we prove these three results. Recall from (3.7) that ISR_c/c² = μ^{-2} c^{-2} ∫₀^∞ E_{μ,k}[(X̄_N − μ)²] g(k) dk. Hence, assuming the validity of these three lemmas, the desired result follows immediately from (A.30). □
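For readability, the way the three limits assemble can be made explicit (this only restates (A.30) together with the identity (N − m)/N² = 1/N − m/N²): writing ℓ₁, ℓ₂, ℓ₃ for the limits asserted in Lemmas A.1, A.2, and A.3,

μ^{-2} c^{-2} ∫₀^∞ J^(2)_{μ,k} g(k) dk = μ^{-2} c^{-2} ∫₀^∞ σ² E_{μ,k}[1/N] g(k) dk − m μ^{-2} c^{-2} ∫₀^∞ σ² E_{μ,k}[1/N²] g(k) dk → ℓ₁ − ℓ₂,

so that, by (A.30), ISR_c/c² → ℓ₃ + ℓ₁ − ℓ₂ (= 1 with the values displayed above) as c → 0.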


A.6.1. Proof of Lemma A.1

Let us first define

M_d = c^{-2} {a^{-1} + (X̄_m + m^{-γ})^{-1}} + d with d = 0, 1,

and investigate the behavior of c^{-2} μ^{-2} σ² E_{μ,k}[1/M_d]. Observe that

M_d^{-1} = a c² (X̄_m + m^{-γ}) / [(X̄_m + m^{-γ})(1 + d a c²) + a] = f(X̄_m), say, (A.31)

where f(x) = a c² (x + m^{-γ}) / [(x + m^{-γ})(1 + d a c²) + a], x > 0. Now, we note that

f′(x) = a² c² [(x + m^{-γ})(1 + d a c²) + a]^{-2} and

f″(x) = −2 a² c² (1 + d a c²) [(x + m^{-γ})(1 + d a c²) + a]^{-3}.

Then, by Taylor expansion, with some random variable U between X̄_m and μ, we can write

M_d^{-1} = f(μ) + (X̄_m − μ) f′(μ) + (1/2)(X̄_m − μ)² f″(U)

⇒ E_{μ,k}(M_d^{-1}) = a c² (μ + m^{-γ}) / [(μ + m^{-γ})(1 + d a c²) + a]

− a² c² (1 + d a c²) E_{μ,k}[(X̄_m − μ)² {(U + m^{-γ})(1 + d a c²) + a}^{-3}]

= J^(3)_{μ,k} − a² c² (1 + d a c²) J^(4)_{μ,k}, say. (A.32)

First, it is easy to see that

lim_{c→0} ∫₀^∞ c^{-2} μ^{-2} σ² J^(3)_{μ,k} g(k) dk

= lim_{c→0} a (μ + m^{-γ}) [(μ + m^{-γ})(1 + d a c²) + a]^{-1} μ^{-2} ∫₀^∞ σ² g(k) dk

= lim_{c→0} a (μ + m^{-γ}) [(μ + m^{-γ})(1 + d a c²) + a]^{-1} μ^{-2} (μ + a^{-1}μ²) = 1. (A.33)

Next, we estimate J�4���� as follows: We write

J�4���� ≤ �m− �1+ dac2�+ a�−3E���

[�Xm − ��2

]≤ �m− �1+ dac2�+ a�−3�2m−1�

Now, we consider the second term in the last step of (A.32). With �∗ ≡∫ 0 �2g�k�dk = � + a−1�2, we have

c−2�−2a2c2�1+ dac2��m− �1+ dac2�+ a�−3m−1∫

0�2g�k�dk

≤ �∗�−2a2�1+ dac2��� 12� +m− ��1+ dac2�+ a�−3m−1� (A.34)


which converges to zero as c → 0. A combination of (A.31)–(A.34) proves the following result for all γ > 0:

lim_{c→0} c^{-2} μ^{-2} ∫₀^∞ σ² E_{μ,k}[1/M_d] g(k) dk = 1 with d = 0, 1. (A.35)

Along the line of (A.12), from (3.5) we can claim that

M_1^{-1} ≤ N^{-1} ≤ M_0^{-1},

and hence, in view of (A.35), the lemma follows. □
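The sandwich used in this last step is simply the reciprocal form of the bracketing that defines the stopping variable: by the definition of M_d with d = 0, 1,

M_0 ≤ N ≤ M_0 + 1 = M_1 ⇒ M_1^{-1} ≤ N^{-1} ≤ M_0^{-1},

so any limit established simultaneously for d = 0 and d = 1 transfers to E_{μ,k}[1/N] (and, in the next two subsections, to E_{μ,k}[1/N²]).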

A.6.2. Proof of Lemma A.2

We reconsider Md from section A.6.1 and investigate the behavior of

m c^{-2} μ^{-2} σ² E_{μ,k}[1/M_d²].

Observe that

M−2d = a2c4�Xm+m− �2

�Xm+m− ��1+dac2�+a�2= f�Xm�� say� (A.36)

where f�x� = a2c4�x+m− �2

�x+m− ��1+dac2�+a�2� x > 0. Observe that

f ′�x� = 2a3c4�x +m− ��x +m− ��1+ dac2�+ a�−3 and

f ′′�x� = 2a3c4�−2�x +m− ��1+ dac2�+ a��x +m− �

× �1+ dac2�+ a�−4�

Then, by Taylor expansion, with some random variable W between X̄_m and μ, we can write

M−2d = f���+ �Xm − ��f ′���+ 1

2 �Xm − ��2f ′′�W��

which implies that

E���M−2d � = a2c4��+m− �2

��+m− ��1+dac2�+a�2+ a3c4E���

[�Xm − ��2 �−2�W+m− ��1+dac2�+a�

�W+m− ��1+dac2�+a�4

]= J�5�

��� + a3c4E���

[J�6����

]� say� (A.37)

First, it is easy to see that

limc→0

c−2�−2∫

0m�2J

�5���kg�k�dk = lim

c→0�−2m a2c2��+m− �2

��+m− ��1+dac2�+a�2

0�2g�k�dk

= limc→0

�−2m a2c2��+m− �2

��+m− ��1+dac2�+a�2�� + a−1�2� = �

� + a� (A.38)


Next, we first write

J�6����I�Xm ≥ 1

2�� ≤ �−�� + 2m− ��1+ dac2�+ a�a−4�2m−1� (A.39)

and with �∗ ≡ ∫ 0 �4g�k�dk, we can immediately express

limc→0

c−2�−2a3c4∫

0m�2E��k

[J�6���kI�Xm ≥ 1

2��]g�k�dk

≤ �−2a−1�∗ limc→0

c2�−�� + 2m− ��1+ dac2�+ a�� (A.40)

which converges to zero as c → 0. Now, along the lines of (A.26)–(A.29), with a positive integer p and an appropriate positive integer q ≡ q(p), and with a positive number β* not involving c, we can claim:

limc→0

c−2�−2a3c4∫

0m�2E��k

[J�6���kI�Xm < 1

2��]g�k�dk

≤ �∗�−2�2/��p limc→0

c2m1− 12 q = 0� (A.41)

since q can be made arbitrarily large by choosing p large. A combination of (A.40)–(A.41) proves the following result for all γ > 0:

limc→0

c−2�−2a3c4∫

0m�2E��k

[J�6���k

]g�k�dk = 0 with d = 0� 1� (A.42)

Now, a combination of (A.37), (A.38), and (A.42) proves the following result for all γ > 0:

limc→0

c−2�−2m∫

0�2E��k

[1

M2d

]g�k�dk = �

� + awith d = 0� 1� (A.43)

Again, since we have M_1^{-1} ≤ N^{-1} ≤ M_0^{-1}, we can claim that the lemma follows. □

A.6.3. Proof of Lemma A.3

In order to handle c^{-2} μ^{-2} J^(1)_{μ,k}, we reconsider M_d yet one more time from section A.6.1 and investigate the behavior of c^{-2} μ^{-2} E_{μ,k}[(S_m − mμ)²/M_d²]. Using techniques similar to those in (A.36)–(A.37), with some random variable W between X̄_m and μ, we note that

E���M−2d �Sm −m��2� = a2c4��+m− �2m�2

��+m− ��1+dac2�+a�2+ 2a3c4m2

×E���

[�Xm − ��3 �W+m− �

�W+m− ��1+dac2�+a�3

]= J�7�

��� + 2a3c4J�8����� say� (A.44)


First, it is easy to see that

limc→0

c−2�−2∫

0J�7���kg�k�dk

= limc→0

�−2m a2c2��+m− �2

��+m− ��1+dac2�+a�2

0�2g�k�dk

= limc→0

�−2m a2c2��+m− �2

��+m− ��1+dac2�+a�2�� + a−1�2� = �

� + a�

(A.45)

Next, we can obviously claim:

∣∣J�8����

∣∣ ≤ E���

[∣∣Xm − �∣∣3 1

�W+m− ��1+dac2�+a�2

]≤ a−2E���

[∣∣Xm − �∣∣3] ≤ E3/4

���

[∣∣Xm − �∣∣4]

= a2 + a3�3/4m−3/2� using (A.4)� (A.46)

Thus, with ξ* ≡ ∫₀^∞ {a₂ + a₃}^{3/4} g(k) dk, which does not involve c, from (A.46) we can immediately write

limc→0

c−2�−2a3c4∫

0

∣∣∣J�8���k

∣∣∣ g�k�dk ≤ �−2a3�∗ limc→0

c2m−3/2 = 0� (A.47)

In other words, by combining (A.44), (A.45), and (A.47), we have shown that

limc→0

c−2�−2∫

0E��k

[�Sm −m��2

1

M2d

]g�k�dk = �

� + a�

Again, we claim that the lemma follows since M_1^{-1} ≤ N^{-1} ≤ M_0^{-1}. □

A.7. Proof of Theorem 3.3

Let us assume that γ > 1. In what follows, when we include terms such as O(m^{-s}), it is our understanding that these terms do not involve the unknown parameter μ. Now, using (A.4), (A.7), and some random variable U between X̄_m and μ, we can write

E_{μ,k}[(X̄_m + m^{-γ})^{-1}]

= (μ + m^{-γ})^{-1} + (μ + m^{-γ})^{-3} σ² m^{-1} − (μ + m^{-γ})^{-4} a₁ m^{-2}

+ (μ + m^{-γ})^{-5} (a₂ m^{-2} + a₃ m^{-3}) − E_{μ,k}[(X̄_m − μ)^5 (U + m^{-γ})^{-6}], (A.48)


where the expressions of a_i ≡ a_i(μ, k), i = 1, 2, 3, are explicitly available. Let us denote

a*_i ≡ a*_i(μ) = ∫₀^∞ a_i(μ, k) g(k) dk, i = 1, 2, 3.

Next, employing arguments similar to those used previously, one can prove that

lim_{c→0} m² ∫₀^∞ E_{μ,k}[|X̄_m − μ|^5 (U + m^{-γ})^{-6}] g(k) dk = 0. (A.49)

Now, combining (A.48)–(A.49), we can express

∫₀^∞ {k^{-1} + E_{μ,k}[(X̄_m + m^{-γ})^{-1}]} c^{-2} g(k) dk

= c^{-2} a^{-1} + c^{-2} [(μ + m^{-γ})^{-1} + (μ + m^{-γ})^{-3} m^{-1} (μ + a^{-1}μ²)

− (μ + m^{-γ})^{-4} a*_1 m^{-2} + (μ + m^{-γ})^{-5} (a*_2 m^{-2} + a*_3 m^{-3}) + O(m^{-2})], (A.50)

so that using the lower bound in (A.12), we have

∫₀^∞ E_{μ,k}(N) g(k) dk − n**

≥ ∫₀^∞ {k^{-1} + E_{μ,k}[(X̄_m + m^{-γ})^{-1}]} c^{-2} g(k) dk − c^{-2} (a^{-1} + μ^{-1})

= c^{-2} [O(m^{-γ}) + {μ^{-3} + O(m^{-γ})} m^{-1} (μ + a^{-1}μ²) − {μ^{-4} + O(m^{-γ})} a*_1 m^{-2}

+ {μ^{-5} + O(m^{-γ})} (a*_2 m^{-2} + a*_3 m^{-3}) + O(m^{-2})]. (A.51)

Now, noting that lim_{c→0} m^{-1} c^{-2} = a and simplifying (A.51), we can write

lim_{c→0} [∫₀^∞ E_{μ,k}(N) g(k) dk − n**] ≥ a μ^{-3} (μ + a^{-1}μ²). (A.52)

The upper bound in Theorem 3.3 follows similarly from (A.50) after using the upper bound from (A.12). □
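As a small check on (A.52), the lower bound simplifies algebraically:

a μ^{-3} (μ + a^{-1}μ²) = a μ^{-2} + μ^{-1},

so the limiting excess of the integrated average sample size over n** is at least aμ^{-2} + μ^{-1} and, by the matching upper bound indicated above (which carries the extra "+1" from (A.12)), at most one more than this.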

ACKNOWLEDGMENTS

Dr. Jose Barrigossi gathered the Mexican bean beetle datasets when he was a Ph.D. student in the Department of Entomology, University of Nebraska–Lincoln. We express our sincerest gratitude to Dr. Jose Barrigossi, Dr. Leon Higley, and Professor Linda Young for kindly making these datasets available to us. We are also immensely grateful to


three referees for sharing with us a number of insightful comments on an earlier draft. Some of the thoughtful concerns and queries raised on an earlier draft have led us to rethink our position and add Remarks 3.1–3.2 and Remarks 4.1–4.3 for clarifications. We take this opportunity to thank the associate editor and the referees for sharing their enthusiasm as well as for giving helpful pointers and positive feedback.

REFERENCES

Allen, J. D., Gonzalez, D., and Gokhale, D. V. (1972). Sequential Sampling Plans for the Bollworm, Heliothis zea, Environmental Entomology 1:771–780.

Anscombe, F. J. (1949). The Statistical Analysis of Insect Counts Based on the Negative Binomial Distribution, Biometrics 5:165–173.

Anscombe, F. J. (1950). Sampling Theory of the Negative Binomial and Logarithmic Series Distribution, Biometrika 37:358–382.

Barrigossi, J. A. F. (1997). Development of an IPM System for the Mexican Bean Beetle (Epilachna varivestis Mulsant) as a Pest of Dry Bean (Phaseolus vulgaris L.), Ph.D. diss., University of Nebraska–Lincoln.

Berti, A., Zanin, G., Baldoni, G., Grignani, C., Mazzoncini, M., Montemurro, P., Tei, F., Vazzana, C., and Viggiani, P. (1992). Frequency Distribution of Weed Counts and Applicability of a Sequential Sampling Method to Integrated Weed Management, Weed Research 32:39–44.

Binns, D. (1975). Sequential Estimation of the Mean of a Negative Binomial Distribution, Biometrika 62:433–440.

Bliss, C. L., and Owen, A. R. C. (1958). Negative Binomial Distributions with a Common k, Biometrika 45:37–38.

Ghosh, M., and Mukhopadhyay, N. (1981). Consistency and Asymptotic Efficiency of Two-Stage and Sequential Estimation Procedures, Sankhya, Ser. A, 43:220–227.

Ghosh, M., Mukhopadhyay, N., and Sen, P. K. (1997). Sequential Estimation, New York: Wiley.

Johnson, G., Mortensen, D. A., Young, L. J., and Martin, A. R. (1995). The Stability of Weed Seedling Population Models and Parameters in Eastern Nebraska Corn (Zea mays) and Soybean (Glycine max) Fields, Weed Science 43:604–611.

Johnson, N. L., and Kotz, S. (1969). Distributions in Statistics: Discrete Distributions, New York: Wiley.

Kuno, E. (1969). A New Method of Sequential Sampling to Obtain Population Estimates with a Fixed Level of Accuracy, Research in Population Ecology 11:127–136.

Kuno, E. (1972). Some Notes on Population Estimation by Sequential Sampling, Research in Population Ecology 14:58–73.

Marshall, E. J. P. (1988). Field-Scale Estimates of Grass Weed Population in Arable Land, Weed Research 28:191–198.

Mukhopadhyay, N. (2002). Sequential Sampling, in The Encyclopedia of Environmetrics, vol. 4, A. H. Shaarawi and W. W. Piegorsch, eds., pp. 1983–1988, Chichester, England: Wiley.


Mukhopadhyay, N., and Diaz, J. (1985). Two-Stage Sampling for Estimating the Mean of a Negative Binomial Distribution, Sequential Analysis 4:1–18.

Mulekar, M. S., and Young, L. J. (1991). Approximations for a Fixed Sample Size Selection Procedure for Negative Binomial Populations, Communications in Statistics—Theory & Methods 20:1767–1776.

Mulekar, M. S., and Young, L. J. (2004). Sequential Estimation in the Agricultural Sciences, in Applied Sequential Methodologies, N. Mukhopadhyay, S. Datta, and S. Chattopadhyay, eds., pp. 293–318, New York: Marcel Dekker.

Mulekar, M. S., Young, L. J., and Young, J. H. (1993). Introduction to 2-SPRT for Testing Insect Population Densities, Environmental Entomology 22:346–351.

Nyrop, J. P., and Binns, M. (1991). Quantitative Methods for Designing and Analyzing Sampling Program for Use in Pest Management, in Handbook of Pest Management, vol. 2, D. Pimental, ed., pp. 67–132, Boca Raton, Fla.: CRC Press.

Onsager, J. A. (1976). The Rationale of Sequential Sampling, with Emphasis on Its Use in Pest Management, Technical Bulletin 1526, Washington, D.C.: Agricultural Research Service, USDA.

Plant, R. E., and Wilson, L. T. (1985). A Bayesian Method for Sequential Sampling and Forecasting in Agricultural Pest Management, Biometrics 41:203–214.

Sterling, W. L. (1976). Sequential Decision Plans for the Management of Cotton Arthropods in Southeast Queensland, Australian Journal of Ecology 1:265–274.

Sylvester, E. S., and Cox, E. I. (1961). Sequential Plans for Sampling Aphids on Sugar Beets in Kern County, California, Journal of Economic Entomology 54:1080–1085.

Waters, W. E. (1955). Sequential Analysis of Forest Insect Surveys, Forest Science 1:68–79.

Wiles, L. J., Oliver, G. W., York, A. C., Gold, H. J., and Wilkerson, G. G. (1992). Spatial Distribution of Broadleaf Weeds in North Carolina Soybean (Glycine max), Weed Science 40:554–557.

Willson, L. J. (1981). Estimation and Testing Procedures for the Parameters of the Negative Binomial Distribution, Ph.D. diss., Oklahoma State University.

Willson, L. J., and Folks, J. L. (1983). Sequential Estimation of the Mean of the Negative Binomial Distribution, Sequential Analysis 2:55–70.

Wilson, L. T. (1982). Development of an Optimal Monitoring Program in Cotton: Emphasis on Spider Mites and Heliothis spp., Entomophaga 27:45–50.

Young, L. J. (1994). Computations of Some Exact Properties of Wald's SPRT When Sampling from a Class of Discrete Distributions, Biometrical Journal 36:627–637.

Young, L. J. (2004). Sequential Testing in the Agricultural Sciences, in Applied Sequential Methodologies, N. Mukhopadhyay, S. Datta, and S. Chattopadhyay, eds., pp. 381–410, New York: Marcel Dekker.

Zou, G. (1998). Weed Population Sequential Sampling Plan and Weed Seedling Emergence Pattern Prediction, Ph.D. diss., University of Connecticut.
