Sampling for estimating characteristics of mackerel in northeast Brazil
Item Type text; Thesis-Reproduction (electronic)
Authors Albuquerque, José Jackson Lima de, 1937-
Publisher The University of Arizona.
Rights Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction or presentation (such as public display or performance) of protected items is prohibited except with permission of the author.
Download date 08/07/2018 08:01:52
Link to Item http://hdl.handle.net/10150/318361
SAMPLING FOR ESTIMATING CHARACTERISTICS
OF MACKEREL IN NORTHEAST BRAZIL
by
Jose Jackson Lima de Albuquerque
A Thesis Submitted to the Faculty of the
COMMITTEE ON STATISTICS
In Partial Fulfillment of the Requirements For the Degree of
MASTER OF SCIENCE
In the Graduate College
THE UNIVERSITY OF ARIZONA
1969
STATEMENT BY AUTHOR
This thesis has been submitted in partial fulfillment of requirements for an advanced degree at the University of Arizona and is deposited in the University Library to be made available to borrowers under rules of the Library.
Brief quotations from this thesis are allowable without special permission, provided that accurate acknowledgement of sources is made. Request for permission for extended quotation from or reproduction of this manuscript in whole or in part may be granted by the head of the major department or the Dean of the Graduate College when in his judgment the proposed use of the material is in the interests of scholarship. In all other instances, however, permission must be obtained from the author.
SIGNED:
APPROVAL BY THESIS DIRECTOR
This thesis has been approved on the date shown below:
Dr. Henry Tucker, Professor of Systems Engineering        Date
ACKNOWLEDGEMENTS
I am grateful to my major professor, Dr. Robert O. Kuehl, from whom I have always received secure guidance and friendly advice throughout my study, and to Dr. Alan B. Humphrey for his effort in criticizing the manuscript.
To Dr. Henry Tucker, Chairman of the Committee on Statistics, I express my sincere thanks for his orientation during the course of this investigation. He suggested the problem and provided the assistance that made the completion of this thesis possible.
Special appreciation is given to my wife, Claudia, and my sons, Ricardo, Eduardo and Adriano, for their patience, love and understanding during this extended period of graduate training.
I thank the United States Agency for International Development (USAID) and The University of Arizona/University of Ceara contract for providing the financial support that made my stay at Tucson possible.
TABLE OF CONTENTS
LIST OF TABLES
LIST OF ILLUSTRATIONS
ABSTRACT
CHAPTER 1: INTRODUCTION
CHAPTER 2: REVIEW OF LITERATURE
CHAPTER 3: SAMPLING MODELS
    Section 1: Three-stage Random Sampling
    Section 2: Stratified Three-stage Random Sampling
CHAPTER 4: MATERIAL
CHAPTER 5: COST FUNCTION
    Section 1: Three-stage Random Sampling
    Section 2: Stratified Three-stage Random Sampling
CHAPTER 6: OPTIMIZATION
    Section 1: Three-stage Random Sampling
    Section 2: Stratified Three-stage Random Sampling
CHAPTER 7: WEIGHT OF STRATUM
    Section 1: Cost Study
    Section 2: Cost Efficiency
CHAPTER 8: SUMMARY AND CONCLUSIONS
APPENDIX: DERIVATION OF VARIANCE OF OVER-ALL MEAN
LIST OF REFERENCES
LIST OF TABLES
Table
1. Analysis of variance of two-stage unbalanced hierarchal classification for king and Spanish mackerel
2. Means, variances and number of units in three-stage random sampling for king and Spanish mackerel
3. Analysis of variance for king and Spanish mackerel considering two strata
4. Values of means, variances and number of units for king and Spanish mackerel in stratified three-stage random sampling
5. Optimum allocation values for king and Spanish mackerel at various levels of precision in three-stage random sampling
6. Optimum sub-sampling number in stratified three-stage random sampling at different precision and equal weight for king and Spanish mackerel
7. Optimum psu's for king and Spanish mackerel at various levels of precision in stratified three-stage random sampling
8. Cost efficiency for king and Spanish mackerel at 1% level of precision (% of $\bar{Y}_{st}$)
LIST OF ILLUSTRATIONS
Figure
1. Total cost of sampling for king and Spanish mackerel at 1% of precision (% of $\bar{Y}_{st}$) in stratified three-stage random sampling
2. Total cost of sampling for king and Spanish mackerel at 5 and 10% of precision (% of $\bar{Y}_{st}$)
ABSTRACT
In this paper the problem of choosing the optimum sub-sample number in a finite population for king mackerel, Scomberomorus cavalla (Cuvier, 1829), and Spanish mackerel, Scomberomorus maculatus (Mitchill, 1815), was studied.
Two designs were considered, three-stage random sampling and stratified three-stage random sampling, both with sampling without replacement. The stages considered were days, fishermen/day and fish/fisherman/day.
The Lagrange multiplier technique was used for minimizing the total cost of sampling for a fixed precision (% of over-all mean). Stratification reduced the optimum number of first-stage units for both species. The optimum numbers of second- and third-stage units were the same for both designs at all levels of precision.
Cost efficiency of the two designs for both species was evaluated by the index $CE = C_{mrs}/C_{smrs}$, where $C_{mrs}$ was defined as the total cost of sampling for three-stage random sampling and $C_{smrs}$ for stratified three-stage random sampling. Stratification was efficient only at the 1% level of precision when the weight of stratum was between .35 and .65.
CHAPTER 1
INTRODUCTION
A small but flourishing fish industry exists in the cities along the coast of Northeast Brazil. Several species of fish of commercial importance are caught in this area.
Among the species of fish in Northern Brazil, king mackerel, Scomberomorus cavalla (Cuvier, 1829), and Spanish mackerel, Scomberomorus maculatus (Mitchill, 1815), are the most important. Both species occur during the entire year and their abundance is a function of the seasonal variation.
They are fished mostly from rafts and some small motor boats with trawling hook lines baited chiefly with sardines. The length of the rafts and boats varies between four and six meters.
These vessels are rowed by a fisherman and two or three members of his family or hired men to within several kilometers of the coast. The distance rowed depends upon the direction and velocity of the winds, the season, and the availability of fish. The catch is greatest during the early hours of the day. Only one trip is made per day. The vessels return before dusk and are left overnight on the beaches.
With the progress of these growing fish industries, there is now an ever-pressing demand to examine population growth and mortality, migration, effect of catch and biology of the several species of commercial importance.
In order to provide some scientific information to these industries, the Federal University of Ceara, through its Marine Biology Station, has developed a program of scientific research on the species of commercial importance in this area.
The difficulties of making reliable estimates of biological measurements are evident, since variability occurs among seasons, days and fishermen.
The aim of the present study was to determine the optimum distribution of sample and sub-sample units for physical measurement according to the variation due to days, fishermen and fish for both species cited above. Precision and economy will be the criteria in the choice of an appropriate sub-sample number.
Two models will be used, three-stage random sampling and stratified three-stage random sampling, both with sampling without replacement.
CHAPTER 2
REVIEW OF LITERATURE
The term nested or hierarchal classification has been used to designate a sampling technique, frequently used in many fields of work, which consists of multistage random sampling. The principal purpose is to investigate the total variance per element ascribable to the various stages of sampling and to estimate the mean.
Perhaps the first detailed work with general results for multistage sampling was done by Ganguli (1941). In his paper, formulas for the expected values of the mean squares were developed for the case when there are unequal numbers in the various subclasses.
Cochran (1939) demonstrated that the individual components of variance in a multistage design can be identified by analysis of variance.
Pulley (1957) presented a computer program to calculate the sums of squares, mean squares and all coefficients in the mean square expectations for the unbalanced hierarchal analysis of variance, with as many as 4 stages.
Recently, computational methods for any number of levels and a more compact notation for Ganguli's (1941) results have been presented by Gates and Shiue (1962) and Gower (1962).
A detailed description of the algebraic calculation of expectations of sums of squares and estimation of variances for nested classifications using more than three stages was reported by King and Henderson (1954).
One of the most common uses of variance components is in the planning of an experiment in which it is desired to minimize the cost of obtaining a sample estimate when the precision is fixed, or conversely, to maximize the precision of an estimate obtained for a given cost. An excellent discussion of this use of variance components was given by Marcuse (1949), which includes formulas for optimum allocation for different stages using the Lagrange multiplier.
Singh (1958) has shown how the method of analysis of variance can be suitably used for studying variance components in multistage sampling from finite populations. He has also shown how a fairly good approximation of these estimates can be obtained without undertaking detailed computation.
Practical applications of components of variance obtained from hierarchal classification have been reported by the Southern Cooperative Group (1951) and Stearman, Ward and Webster (1953).
A detailed discussion of variance component analysis was presented by Anderson (1960). He discussed its uses in quantitative genetics studies, and showed that a non-balanced sampling scheme may be desirable in order to obtain efficient estimates of all variance components. Methods of analysis for the mixed model were considered, with examples of split-plot experiments.
Roy (1957) studied the effects on the variance of the mean when the units are chosen with different probabilities (but with replacement) at each stage. The total variation is split up into different meaningful components depending on the type of sampling used, and unbiased estimators for these components were derived.
Theoretical consideration of three-stage sampling when probabilities vary at every stage of selection has been given by Ecimovic (1956). He also studied the effects of selection with and without replacement and presented formulas for modifications of the three-stage sampling design.
One of the most extensive theoretical investigations of multistage sampling has been conducted by Johnson and Rao (1959). They defined various mathematical sampling models and a design for comparing and contrasting them with respect to estimation and efficiency. Formulas for the variance of the mean were derived by expectation of unconditional variance.
One of the modifications of multistage sampling is technically termed stratified multistage random sampling. All the first-stage units to be surveyed are first divided into subdivisions called strata, and from each of these a certain number of experimental units are selected at random by the procedure of successive stages of selection.
Uses of this design have been reported by Sukhatme (1950), Sukhatme and Panse (1951), Mokashi (1954), Sukhatme, Panse and Sastry (1958), Sen and Chakrabarty (1964), Sen, Chakrabarty and Sarkar (1966a) and Sen, Sarkar and Chakrabarty (1966b).
Formulae appropriate for estimating gains in precision due to stratification in a sub-sampling design from finite populations were developed and illustrated with yield data of wheat by Sukhatme (1950).
Mokashi (1954) showed how stratified multistage sampling methods could improve the estimation of crop acreage for cereals in India and investigated the appropriate type of sub-unit.
Sukhatme et al. (1958) described a stratified multistage sampling design used in India for estimating the monthly catch of marine fish brought to the coast by fishing boats. They presented a method to choose the optimum first-stage unit for sampling and the number of such units required for estimating the monthly catch for the entire coast with given precision.
For estimating the average yield of cotton and cereals in India, Sukhatme and Panse (1951) used stratified multistage sampling. They showed that the simple arithmetic mean of plot yields is the most efficient of the estimates and that geographical stratification increases the precision of the mean.
Sen and Chakrabarty (1964) adopted a stratified three-stage sampling design to estimate the loss of tea crops due to insects and diseases in some states of Northeast India.
Sen et al. (1966a) discussed the use of a stratified multistage sampling design and double sampling to estimate the degree of infestation of pests in tea states in India. The authors also estimated an optimum scheme using both simple and double sampling.
In order to estimate the loss of tea crop due to all pests and diseases in Assam (India), Sen et al. (1966b) used a stratified three-stage design.
An extensive study of this design has been done by Stuart (1964), who studied the gain in efficiency from preliminary random stratification for any multistage design using sampling with and without replacement. This study considered an original design (any multistage sampling design in which sampling is with replacement) and a modified design (the original design with first-stage units stratified in n groups). Stuart pointed out that if n/N is negligible and a uniform sampling fraction is used in each randomly-formed stratum, there can be at most only negligible gains in efficiency from any preliminary random stratification.
CHAPTER 3
SAMPLING MODELS
Two models of multistage sampling were adopted in this paper. They were three-stage random sampling and stratified three-stage random sampling with the first-stage units stratified.
Section 1
Three-stage Random Sampling
The situation discussed here is confined to the three-stage balanced design. The population to be sampled consists of $N$ primary sampling units (psu-days). The $i$th psu contains $M$ second-stage units (ssu-fishermen) and each $j$th ssu consists of $K$ third-stage units (usu-fish). The small letters ($n$, $m$ and $k$) denote the corresponding numbers in the sample.
Let $Y_{iju}$ ($i = 1, 2, \ldots, N$; $j = 1, 2, \ldots, M$; $u = 1, 2, \ldots, K$) represent the record of the $u$th fish from the $j$th fisherman in the $i$th day. Since the sources of variability act independently and additively, the linear mathematical model appropriate to this scheme may be written as:
$$Y_{iju} = \mu + \alpha_i + \beta_{ij} + \delta_{iju} \qquad (3.1.1)$$
where $\mu$ represents the general mean and is thus a fixed constant; $\alpha_i$ is the effect of the $i$th day; $\beta_{ij}$ is the effect of the $j$th fisherman in the $i$th day and $\delta_{iju}$ is the effect of the $u$th fish from the $j$th fisherman in the $i$th day. Let us assume that $\alpha_i$, $\beta_{ij}$ and $\delta_{iju}$ are normally distributed random variables with expectations and covariances zero and variances $S_1^2$, $S_2^2$ and $S_3^2$, respectively.
Let us assume that selection at any stage is independent of the other stages and that sampling within any unit at a given stage is independent of the sampling within other units at that stage. Further, we shall assume that the units at each stage are selected with equal probability.
An unbiased estimate of the population mean per usu is the
simple average of all the elements in the sample, or
$$\bar{Y}_{\ldots} = \frac{1}{nmk} \sum_{i}^{n} \sum_{j}^{m} \sum_{u}^{k} Y_{iju} \qquad (3.1.2)$$
For obtaining an unbiased estimate of the variance of the mean,
use was made of the lemma (Appendix) adopted by Johnson and Rao (1959).
Applying this lemma it can be shown (Appendix) that
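$$V(\bar{Y}_{\ldots}) = E_1[E_2\{V_3(\bar{Y}_{\ldots})\}] + E_1[V_2\{E_3(\bar{Y}_{\ldots})\}] + V_1[E_2\{E_3(\bar{Y}_{\ldots})\}] \qquad (3.1.3)$$
Equation (3.1.3), after simplification and using Cochran's notation, may be written
$$V(\bar{Y}_{\ldots}) = \frac{1 - f_1}{n} S_1^2 + \frac{1 - f_2}{nm} S_2^2 + \frac{1 - f_3}{nmk} S_3^2 \qquad (3.1.4)$$
where $f_1 = n/N$, $f_2 = m/M$ and $f_3 = k/K$ are the sampling fractions, and $S_1^2$, $S_2^2$ and $S_3^2$ are the variances among primary, secondary and tertiary units, respectively.
The variance is seen to be made up of three components corresponding to the three stages of sampling.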
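The three-component structure of the variance can be illustrated numerically. The following is a minimal sketch, assuming that (3.1.4) takes Cochran's usual three-stage form with finite-population corrections $f_1 = n/N$, $f_2 = m/M$ and $f_3 = k/K$; the function name and the example allocation are illustrative and do not come from the thesis.

```python
def three_stage_variance(n, m, k, N, M, K, s1_sq, s2_sq, s3_sq):
    """Variance of the sample mean per third-stage unit (assumed Cochran form).

    Assumes a balanced design, equal-probability selection and sampling
    without replacement at every stage.
    """
    f1, f2, f3 = n / N, m / M, k / K  # sampling fractions at each stage
    between_days = (1 - f1) / n * s1_sq
    between_fishermen = (1 - f2) / (n * m) * s2_sq
    between_fish = (1 - f3) / (n * m * k) * s3_sq
    return between_days + between_fishermen + between_fish


# King mackerel variances and population sizes, with an illustrative allocation
v_king = three_stage_variance(n=63, m=10, k=3, N=300, M=100, K=50,
                              s1_sq=36, s2_sq=72, s3_sq=144)

# A complete census at every stage leaves no sampling variance
v_census = three_stage_variance(n=300, m=100, k=50, N=300, M=100, K=50,
                                s1_sq=36, s2_sq=72, s3_sq=144)
```

With these inputs the first-stage term dominates, which is why the number of days sampled drives the precision far more than the number of fishermen or fish.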
Section 2
Stratified Three-stage Random Sampling
One of the most common designs in surveys is stratified multistage sampling. In this design the population of psu's is first divided into strata.
Let us assume that the psu's are classified into $L$ strata (seasons) with $N_h$ psu's (first-stage units) in the $h$th stratum and $n_h$ psu's in the sample from the $h$th stratum. Let $M_h$ and $m_h$ be the population and sample numbers, respectively, of second-stage units (ssu's) in the $hi$th psu, and $K_h$ and $k_h$ the population and sample numbers of third-stage units (usu's) in the $hij$th ssu.
The first, second and third stages have the same definitions as described previously.
If $Y_{hiju}$ represents the record of the $u$th fish from the $j$th fisherman in the $i$th day in the $h$th season, the additive model for this design can adequately be represented by
$$Y_{hiju} = \mu + \gamma_h + \alpha_{hi} + \beta_{hij} + \delta_{hiju} \qquad (3.2.1)$$
$h = 1, 2, \ldots, L$ where $h$ refers to strata or seasons
$i = 1, 2, \ldots, N$ where $i$ refers to day in season
$j = 1, 2, \ldots, M$ where $j$ refers to fisherman in day
$u = 1, 2, \ldots, K$ where $u$ refers to fish in fisherman.
All the components except the general mean ($\mu$) are normally and independently distributed with mean and covariance zero and variances $S_0^2$, $S_1^2$, $S_2^2$ and $S_3^2$, respectively.
It seems worthwhile to consider the above model and (3.1.1) to be Model II as defined by Eisenhart (1947), that is, all the elements of the linear model within strata except $\mu$ are regarded as random variables.
An unbiased estimate of the population mean per usu is given
by
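$$\bar{Y}_{st} = \frac{\sum_h^L N_h M_h K_h \bar{Y}_h}{\sum_h^L N_h M_h K_h} = \sum_h^L W_h \bar{Y}_h \qquad (3.2.2)$$
where $W_h = N_h M_h K_h / \sum_h^L N_h M_h K_h$ is the relative size of the stratum in terms of usu's and $\bar{Y}_h$ is the sample mean in the stratum.
The variance of $\bar{Y}_{st}$ is given by
$$V(\bar{Y}_{st}) = \sum_h^L W_h^2 V(\bar{Y}_h) \qquad (3.2.3)$$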
Using the same lemma referred to before it can easily be shown
(Appendix) that
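$$V(\bar{Y}_h) = E_1[E_2\{V_3(\bar{Y}_h)\}] + E_1[V_2\{E_3(\bar{Y}_h)\}] + V_1[E_2\{E_3(\bar{Y}_h)\}] \qquad (3.2.4)$$
After simplification, equation (3.2.4) becomes
$$V(\bar{Y}_h) = \frac{1 - f_{1h}}{n_h} S_{1h}^2 + \frac{1 - f_{2h}}{n_h m_h} S_{2h}^2 + \frac{1 - f_{3h}}{n_h m_h k_h} S_{3h}^2 \qquad (3.2.5)$$
Substituting (3.2.5) in (3.2.3) we have finally:
$$V(\bar{Y}_{st}) = \sum_h^L W_h^2 \left[ \frac{1 - f_{1h}}{n_h} S_{1h}^2 + \frac{1 - f_{2h}}{n_h m_h} S_{2h}^2 + \frac{1 - f_{3h}}{n_h m_h k_h} S_{3h}^2 \right] \qquad (3.2.6)$$
where $f_{1h} = n_h/N_h$, $f_{2h} = m_h/M_h$, $f_{3h} = k_h/K_h$, and $S_{1h}^2$, $S_{2h}^2$ and $S_{3h}^2$ are the variances among units corresponding to the first, second and third stage within the $h$th stratum.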
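The stratified variance is a weighted sum of per-stratum variances, which the sketch below makes concrete. It assumes the per-stratum variance has the three-stage form with finite-population corrections, and it uses hypothetical values $M_h = 100$ and $K_h = 50$ for the within-day population sizes; all function names are illustrative.

```python
def stratum_weights(sizes):
    """Relative stratum sizes W_h = N_h*M_h*K_h / sum over strata."""
    totals = [N * M * K for (N, M, K) in sizes]
    grand_total = sum(totals)
    return [t / grand_total for t in totals]


def stratum_variance(n, m, k, N, M, K, s1_sq, s2_sq, s3_sq):
    """Variance of the stratum mean (assumed three-stage form with fpc)."""
    return ((1 - n / N) / n * s1_sq
            + (1 - m / M) / (n * m) * s2_sq
            + (1 - k / K) / (n * m * k) * s3_sq)


def stratified_variance(weights, stratum_variances):
    """Variance of the stratified mean: sum of W_h^2 * V(mean_h)."""
    return sum(w ** 2 * v for w, v in zip(weights, stratum_variances))


# Two strata of equal size give equal weights
weights = stratum_weights([(150, 100, 50), (150, 100, 50)])

# King mackerel variances per stratum; M=100 and K=50 are hypothetical
v1 = stratum_variance(28, 10, 3, 150, 100, 50, 35, 69, 139)
v2 = stratum_variance(31, 10, 3, 150, 100, 50, 39, 77, 154)
v_st = stratified_variance(weights, [v1, v2])
```

Because the weights enter squared, two equal strata each contribute only a quarter of their own variance to the over-all figure.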
CHAPTER 4
MATERIAL
The data used in this paper have been discussed by Costa and Paiva (1965). They consist of 5,628 king mackerel and 3,492 Spanish mackerel caught off the coast of Brazil near Fortaleza in the State of Ceara, and correspond to samples from 232 days of fishing carried out during the period from January 1st to December 31st, 1964.
The number of fish sampled per day and per month differed for the two species studied. The measurement used in the analysis was fork length (cm). Since the data provide information only about variation among months and among fish within months, a two-stage hierarchal analysis of variance was carried out for both species. The results are shown in Table 1.
Table 1. Analysis of variance of two-stage unbalanced hierarchal classification for king and Spanish mackerel.

                          King Mackerel       Spanish Mackerel
Sources of Variation      df       MS         df       MS
Among months              11     6597.8       11     2983.0
Among fish/months       5616      144.0     3480       82.3
Comparing the values obtained in Table 1, it is seen that king mackerel showed larger variances among fish within months and among months than Spanish mackerel.
It was necessary to provide values for the variance among days and among fishermen within days. Based on the results of Table 1 it can be seen that for both species the variance for months compared with the variance for fish/months was very small. In order to provide some basis for computation, we divided the variance among fish within months by four to find the variance among days. The variance among fishermen within days was obtained by dividing the same variance by two. These divisors underestimate the true unknown coefficients and as such make the results conservative.
Table 2 shows the respective variances for both species.
Table 2. Means, variances and number of units in three-stage random sampling for king and Spanish mackerel.

                               Variance                 Number of Units
Species             Mean    $S_1^2$  $S_2^2$  $S_3^2$     N     M     K
King mackerel       71.85     36       72      144       300   100    50
Spanish mackerel    53.91     20       41       82       300   100    50
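The way the assumed variance components follow from the mean square among fish within months can be checked with a short calculation. This is only a sketch of the stated rule (divide by four for days, by two for fishermen within days); the function name is illustrative.

```python
def assumed_components(ms_fish_within_months):
    """Return (S1^2, S2^2, S3^2) under the divisors assumed in the text.

    S3^2 is taken as the mean square among fish within months,
    S2^2 (fishermen within days) as half of it and
    S1^2 (days) as a quarter of it.
    """
    s3_sq = ms_fish_within_months
    return s3_sq / 4, s3_sq / 2, s3_sq


king = assumed_components(144.0)      # mean square from Table 1
spanish = assumed_components(82.3)    # Table 2 reports whole numbers (20, 41, 82)
```

For king mackerel this reproduces the Table 2 entries exactly; for Spanish mackerel the tabulated values are the same quantities reduced to whole numbers.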
The estimation of variance among fish within months and among
months when the psu’s were stratified was obtained by a two-stage
unbalanced hierarchal analysis of variance. The values are shown in
Table 3.
Table 3. Analysis of variance for king and Spanish mackerel considering two strata.

                               King Mackerel                    Spanish Mackerel
                         Stratum I      Stratum II        Stratum I      Stratum II
Sources of Variation     df     MS      df     MS         df     MS      df     MS
Among months              5   4769.6     5   2469.5        5   5508.3     5   2373.8
Among fish/months      3701    139.1  1915    153.8     2329     81.5  1151     84.2
Only two strata of the same size were considered in our study. For both species studied, stratum I corresponds to the months with the greatest indices of catch and stratum II to the months with the lowest indices of catch.
In this design, the variances among days and among fishermen within days for both species were obtained using the same criteria as described before and are given in Table 4.
Table 4. Values of means, variances and number of units for king and Spanish mackerel in stratified three-stage random sampling.

                                         Variance                       Stage Units
Species    Stratum   Mean $\bar{Y}_h$   $S_{1h}^2$  $S_{2h}^2$  $S_{3h}^2$    $N_h$    $M_h$     $K_h$
King       I              74.3             35          69         139        150    80-130    20-100
King       II             69.4             39          77         154        150    70-100    10-50
Spanish    I              54.9             20          41          82        150    80-130    20-100
Spanish    II             52.9             21          42          84        150    70-100    10-50
CHAPTER 5
COST FUNCTION
The cost of a survey includes travel, labour, statistical analysis and contingencies, but for this study only the cost of sampling measurement was considered. The cost functions used in this paper were measured in terms of the time taken to complete the field work. Three components of cost were identified in both sampling designs used.
Section 1
Three-stage Random Sampling
Consider the total cost of the survey $C$ for a particular year to be expressed in time (hours), since there was no information about expenditure on fish sampling in the area studied.
The total time required to complete the field work can be represented approximately by:
$$C = n C_1 + n m C_2 + n m k C_3 \qquad (5.1.1)$$
where $nC_1$ comprises time that varies directly with the number of psu's selected; thus $C_1$ includes the journey time and the time taken in organizing the survey and preparing the frame of the ssu's. The component $nmC_2$ is proportional to the total number of ssu's chosen, representing the time spent in drawing a sample of fishermen from a day. The last component $nmkC_3$ is proportional to the total number of usu's selected and represents the time taken to obtain information from a fish.
Based on past experience the following estimates were used: $C_1 = 8.00$ hours; $C_2 = .16$ hours and $C_3 = .04$ hours.
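As a quick numerical illustration of the cost function (5.1.1), the sketch below evaluates the total field time for one allocation using the unit costs quoted above. The function name is illustrative, and the allocation (63 days, 10 fishermen per day, 3 fish per fisherman) is chosen only as an example.

```python
def three_stage_cost(n, m, k, c1=8.00, c2=0.16, c3=0.04):
    """Total field time in hours: C = n*C1 + n*m*C2 + n*m*k*C3."""
    per_day = n * c1              # travel, organisation and framing per day
    per_fisherman = n * m * c2    # drawing fishermen within each day
    per_fish = n * m * k * c3     # measuring each fish
    return per_day + per_fisherman + per_fish


# 63 days, 10 fishermen per day, 3 fish per fisherman
cost = three_stage_cost(63, 10, 3)   # about 680.4 hours
```

At these unit costs each additional day adds 8 + 1.6 + 1.2 = 10.8 hours, so the first-stage cost accounts for roughly three quarters of the total.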
Section 2
Stratified Three-stage Random Sampling
Consider that the cost of selecting a particular stratum can be disregarded and that the total cost of carrying out the survey can be approximated well enough by a simple cost function of the following form
$$C = \sum_h^L n_h C_{1h} + \sum_h^L n_h m_h C_{2h} + \sum_h^L n_h m_h k_h C_{3h} \qquad (5.2.1)$$
where $C$ represents the total cost; $\sum_h^L n_h C_{1h}$ represents the time that varies with the number of days selected from each stratum; $\sum_h^L n_h m_h C_{2h}$ is proportional to the total number of ssu's selected from each day and $\sum_h^L n_h m_h k_h C_{3h}$ is directly related to the total number of fish selected.
The values of $C_{1h}$, $C_{2h}$ and $C_{3h}$ for this design are those given for $C_1$, $C_2$ and $C_3$, respectively, in the previous design.
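The stratified cost function is simply the three-stage cost summed over strata. A minimal sketch, assuming the same unit costs in every stratum as stated above; the function name and the example allocation are illustrative.

```python
def stratified_cost(allocations, c1=8.00, c2=0.16, c3=0.04):
    """Total field time in hours, summed over strata.

    `allocations` is a list of (n_h, m_h, k_h) tuples, one per stratum.
    """
    return sum(n * c1 + n * m * c2 + n * m * k * c3
               for n, m, k in allocations)


# Two strata with 21 and 23 days, 10 fishermen per day, 3 fish per fisherman
cost_two_strata = stratified_cost([(21, 10, 3), (23, 10, 3)])  # about 475.2 hours
```

Since the unit costs do not differ between strata, only the total number of days across strata matters for the cost when $m_h$ and $k_h$ are equal.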
CHAPTER 6
OPTIMIZATION
It is well known that for any good sample survey one of the prerequisites is to obtain estimates quickly, cheaply and accurately. In order to meet these requirements it is necessary to find the optimum number of units at the various stages of sampling.
Precision, expressed as a percentage of the over-all mean, and economy are the criteria in the choice of an appropriate sub-sampling number, which is attained in the present instance with fixed variance and minimum cost of sampling.
Section 1
Three-stage Random Sampling
To solve the optimum allocation problem in this case, the cost function $C$ is minimized subject to the constraint that the variance of the mean (precision) is fixed. Using the method of Lagrange multipliers¹ in the usual way, the function $F$ is minimized, where
$$F = C + \lambda V \qquad (6.1.1)$$
$C$ is the total cost of the survey defined by (5.1.1), $\lambda$ is the Lagrange multiplier and $V$ is the variance of the over-all mean given by (3.1.4).
1. For a discussion of the Lagrange method of obtaining a relative minimum or maximum value of a function subject to a condition, see Taylor, A. E., Advanced Calculus, 1955, Ginn and Co., p. 198, or other texts.
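The minimization can be sketched as follows. The sketch assumes that the variance in (3.1.4) has Cochran's three-stage form with finite-population corrections; the shorthand $A_1$, $A_2$, $A_3$ and the symbol $V_0$ for the fixed variance are introduced here for illustration only and are not the author's notation.

```latex
\begin{align*}
V &= \frac{A_1}{n} + \frac{A_2}{nm} + \frac{A_3}{nmk} - \frac{S_1^2}{N},
  \qquad A_1 = S_1^2 - \frac{S_2^2}{M},\quad
         A_2 = S_2^2 - \frac{S_3^2}{K},\quad
         A_3 = S_3^2, \\
\frac{\partial F}{\partial k} &= nmC_3 - \lambda\,\frac{A_3}{nmk^2} = 0, \\
\frac{\partial F}{\partial m} &= nC_2 + nkC_3
  - \lambda\left(\frac{A_2}{nm^2} + \frac{A_3}{nm^2k}\right) = 0, \\
\frac{\partial F}{\partial n} &= C_1 + mC_2 + mkC_3
  - \frac{\lambda}{n^2}\left(A_1 + \frac{A_2}{m} + \frac{A_3}{mk}\right) = 0. \\
\intertext{Eliminating $\lambda$ between the first two conditions, and then between the last two, gives}
k^2 &= \frac{C_2}{C_3}\,\frac{A_3}{A_2}, \qquad
m^2 = \frac{C_1}{C_2}\,\frac{A_2}{A_1}, \\
\intertext{and the constraint $V = V_0$ then fixes the number of first-stage units:}
n &= \frac{A_1 + A_2/m + A_3/(mk)}{V_0 + S_1^2/N}.
\end{align*}
```

Under these assumptions neither $m$ nor $k$ depends on $V_0$, so changing the required precision changes only $n$.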
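It is shown (Appendix) that the values of $n$, $m$ and $k$ which minimize $F$ are given by:
$$n = \ldots \qquad (6.1.2)$$
$$m = \sqrt{\frac{C_1}{C_2}} \cdot \frac{\sqrt{S_2^2 - S_3^2/K}}{\sqrt{S_1^2 - S_2^2/M}} \qquad (6.1.3)$$
$$k = \sqrt{\frac{C_2}{C_3}} \cdot \frac{S_3}{\sqrt{S_2^2 - S_3^2/K}} \qquad (6.1.4)$$
where $S_i$ is the standard deviation of the mean within the $i$th stage of sampling.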
The computed optimum values of n, m and k shown in Table
5 were obtained by using numerical quantities from Table 2.
Table 5. Optimum allocation values for king and Spanish mackerel at various levels of precision in three-stage random sampling.

Precision                     King Mackerel      Spanish Mackerel
(% of $\bar{Y}_{\ldots}$)      n    m    k         n    m    k
 1                            63   10    3        49   10    3
 5                            13   10    3        10   10    3
10                             6   10    3         5   10    3
15                             4   10    3         3   10    3
It will be seen that only the number of first-stage units was affected by the precision of the estimates, and since king mackerel measurements had a larger variance of the mean, more sample units were required than for the other species.
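The constancy of $m$ and $k$ across precision levels can be reproduced with a short calculation. This is a minimal sketch, assuming the standard Lagrange solution for a three-stage design with finite-population corrections and the unit costs of 8.00, .16 and .04 hours; the function name is illustrative, and the formula for $n$ is deliberately left out.

```python
import math


def optimum_subsampling(s1_sq, s2_sq, s3_sq, M, K, c1=8.00, c2=0.16, c3=0.04):
    """Return the unrounded optimum (m, k) for a three-stage design.

    Assumes m = sqrt(C1/C2) * sqrt(A2/A1) and k = sqrt(C2/C3) * sqrt(S3^2/A2),
    with A1 = S1^2 - S2^2/M and A2 = S2^2 - S3^2/K.
    """
    a1 = s1_sq - s2_sq / M
    a2 = s2_sq - s3_sq / K
    m = math.sqrt(c1 / c2) * math.sqrt(a2 / a1)
    k = math.sqrt(c2 / c3) * math.sqrt(s3_sq / a2)
    return m, k


# Variances and population sizes for the unstratified design
m_king, k_king = optimum_subsampling(36, 72, 144, M=100, K=50)
m_spanish, k_spanish = optimum_subsampling(20, 41, 82, M=100, K=50)
```

Rounded to whole units, both species give 10 fishermen per day and 3 fish per fisherman, and no precision level enters the calculation.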
Section 2
Stratified Three-stage Random Sampling
The optimum sampling values in this design were obtained using the same procedure described in the preceding section. The cost function (5.2.1) was minimized for a fixed variance (3.2.6).
The derivation of the optimum values of $n_h$, $m_h$ and $k_h$ is shown in the Appendix.
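These values are:
$$n_h = \ldots \qquad (6.2.1)$$
$$m_h = \sqrt{\frac{C_{1h}}{C_{2h}}} \cdot \frac{\sqrt{S_{2h}^2 - S_{3h}^2/K_h}}{\sqrt{S_{1h}^2 - S_{2h}^2/M_h}} \qquad (6.2.2)$$
$$k_h = \sqrt{\frac{C_{2h}}{C_{3h}}} \cdot \frac{S_{3h}}{\sqrt{S_{2h}^2 - S_{3h}^2/K_h}} \qquad (6.2.3)$$
The numerical values of $n_h$, $m_h$ and $k_h$ for king mackerel and Spanish mackerel using equal weights are presented in Table 6 and were found by solving with the quantities shown in Table 4.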
Table 6. Optimum sub-sampling number in stratified three-stage random sampling at different precision and equal weight for king and Spanish mackerel.

                              King Mackerel                           Spanish Mackerel
Precision              Stratum I          Stratum II           Stratum I          Stratum II
(% of $\bar{Y}_h$)   $n_1$ $m_1$ $k_1$   $n_2$ $m_2$ $k_2$    $n_1$ $m_1$ $k_1$   $n_2$ $m_2$ $k_2$
 1                    28    10    3       31    10    3        21    10    3       23    10    3
 5                     6    10    3        7    10    3         5    10    3        5    10    3
10                     3    10    3        4    10    3         2    10    3        3    10    3
15                     2    10    3        2    10    3         2    10    3        2    10    3
As can be seen, $m_h$ and $k_h$ were not affected by the basic structure and variance of the design, and only $n_h$ changed with the change in precision. In this case the among-fishermen component and the among-fish component were approximately the same size for both species and both strata, hence the allocation of $m_h$ and $k_h$ presented above.
CHAPTER 7
WEIGHT OF STRATUM
Section 1
Cost Study
This section deals with the effect of weight of stratum on the
cost of sampling and number of units for both species studied.
As mentioned in Chapter 4, only two strata (seasons) were considered. The first corresponded to high monthly production and the second to low monthly production of fish. Each species had a different monthly cycle according to the season.
For king mackerel, the greatest indices of monthly yield showed the existence of a harvest in the period from April to July and from November to December; the harvest of Spanish mackerel was from January to February and from September to December (Costa and Paiva 1965). The season of low indices of monthly yield for both species corresponded to the remaining months of the year.
During the season of great production there was an increase in the number of fishermen per day and in the number of fish caught per fisherman.
The two seasons were considered to have equal population sizes, i.e., $N_1 = N_2 = 150$ days, but the number of fishermen per day and of fish per fisherman varied between the seasons (Table 4).
The weight of stratum in terms of usu's was defined in Section 3.2 by $W_h$, where $\sum_h^L W_h = 1$.
Table 7 shows the variation in psu's computed from (6.2.1) for both species as related to the weight of stratum at different levels of precision.
As can be seen, $n_h$ reached its minimum when the weights of both strata were approximately the same.
A study was done in order to examine the effect of the weight of stratum and the level of precision on the total cost of sampling. Costs were computed from equation (5.2.1) using the optimum values shown in Tables 6 and 7. The graphs in Figures 1 and 2 show that the minimum cost of sampling at all levels of precision for king and Spanish mackerel was reached when the weight of stratum was between .45 and .55.
Section 2
Cost Efficiency
The efficiency of the two designs used was compared by the index defined by
$$\text{Cost Efficiency} = (CE) = \frac{C_{mrs}}{C_{smrs}} \qquad (7.2.1)$$
where $C_{mrs}$ is the total cost of sampling for multistage random sampling from (5.1.1) and $C_{smrs}$ for stratified multistage random sampling from (5.2.1).
Stratified designs will be superior (less expensive) when CE > 1. It can be seen from the results presented in Table 8 that for both species, at the 1% level of precision, stratification was efficient when the weight of stratum was between .35 and .65. At the other levels of precision there was no great advantage in stratification.
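The index can be evaluated directly from the two cost functions. The sketch below assumes the unit costs of 8.00, .16 and .04 hours and uses the optimum allocations reported for Spanish mackerel at 1% precision with equal stratum weights (49 days unstratified; 21 and 23 days in the two strata; 10 fishermen per day and 3 fish per fisherman throughout). The function names are illustrative.

```python
def sampling_cost(allocations, c1=8.00, c2=0.16, c3=0.04):
    """Total field time in hours for a list of (n, m, k) allocations."""
    return sum(n * c1 + n * m * c2 + n * m * k * c3
               for n, m, k in allocations)


def cost_efficiency(cost_mrs, cost_smrs):
    """CE = C_mrs / C_smrs; values above 1 favour the stratified design."""
    return cost_mrs / cost_smrs


# Spanish mackerel, 1% precision, equal stratum weights
c_mrs = sampling_cost([(49, 10, 3)])                  # unstratified design
c_smrs = sampling_cost([(21, 10, 3), (23, 10, 3)])    # two strata
ce_spanish = cost_efficiency(c_mrs, c_smrs)
```

Because $m$ and $k$ are the same in both designs, the index reduces here to the ratio of the total numbers of days, 49/44.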
Table 7. Optimum psu's for king and Spanish mackerel at various levels of precision in stratified three-stage random sampling.

                      King Mackerel                   Spanish Mackerel
                Precision (% of $\bar{Y}_{st}$)   Precision (% of $\bar{Y}_{st}$)
Weight  psu's      1     5    10    15               1     5    10    15
 .25    $n_1$     34     8     4     3              26     6     3     2
        $n_2$     37     9     5     3              28     7     3     2
 .30    $n_1$     31     7     4     2              24     6     3     2
        $n_2$     35     8     4     3              26     6     3     2
 .35    $n_1$     30     7     3     2              23     5     3     2
        $n_2$     33     8     4     3              25     6     3     2
 .40    $n_1$     28     7     3     2              22     5     3     2
        $n_2$     32     8     4     3              24     5     3     2
 .45    $n_1$     28     6     3     2              21     5     2     2
        $n_2$     31     7     4     2              23     5     3     2
 .50    $n_1$     28     6     3     2              21     5     2     2
        $n_2$     31     7     4     2              23     5     3     2
 .55    $n_1$     28     6     3     2              21     5     2     2
        $n_2$     31     7     4     2              23     5     3     2
 .60    $n_1$     28     7     3     2              22     5     3     2
        $n_2$     31     7     4     2              24     5     3     2
 .65    $n_1$     29     7     3     2              23     5     3     2
        $n_2$     32     8     4     3              25     6     3     2
 .70    $n_1$     31     7     4     3              24     6     3     2
        $n_2$     34     8     4     3              26     6     3     2
 .75    $n_1$     33     8     4     3              26     6     3     2
        $n_2$     36     9     4     3              28     6     3     2
 .80    $n_1$     35     8     4     3              28     7     3     2
        $n_2$     39    10     5     3              30     7     3     2
 .85    $n_1$     37     9     5     3              30     7     4     2
        $n_2$     42    10     5     3              32     8     4     3
 .90    $n_1$     41    10     5     3              33     8     4     3
        $n_2$     45    11     6     4              35     8     4     3
 .95    $n_1$     44    11     6     4              36     9     4     3
        $n_2$     48    13     6     4              37     9     5     3
[Figure 1: total cost of sampling (hours) plotted against the weight of stratum I, with one curve for king mackerel and one for Spanish mackerel.]
Figure 1. Total cost of sampling for king and Spanish mackerel at 1% of precision (% of $\bar{Y}_{st}$) in stratified three-stage random sampling.
[Figure 2: total cost of sampling (hours) plotted against weight of stratum I, with curves for Spanish mackerel and king mackerel.]
Figure 2. Total cost of sampling for king and Spanish mackerel at 5 and 10% of precision (% of Y).
Table 8. Cost efficiency for king and Spanish mackerel at 1% level of precision (% of Y).

Weight of        Cost Efficiency
stratum          King      Spanish
  .25             .89        .91
  .30             .90        .98
  .35            1.00       1.02
  .40            1.05       1.07
  .45            1.08       1.11
  .50            1.08       1.11
  .55            1.08       1.11
  .60            1.08       1.07
  .65            1.03       1.02
  .70             .97        .98
  .75             .91        .91
  .80             .85        .84
  .85             .78        .79
  .90             .73        .72
  .95             .68        .67
CHAPTER 8
SUMMARY AND CONCLUSIONS
Three-stage random sampling and stratified three-stage random sampling designs were used in a sample of 5,628 king mackerel and 3,492 Spanish mackerel. The optimum subsampling numbers for both designs were determined by Lagrange multipliers, using a simple cost function which was minimized for a fixed over-all variance of the mean obtained by expectation of unconditional variance.
The results of the investigation presented in the preceding
sections demonstrated that king mackerel required more psu’s than
Spanish mackerel at the same level of precision in both designs
studied.
The optimum number of ssu's and usu's for both species showed the same value since they were not affected by the level of precision or weight of stratum.
The relationship between total cost of sampling and weight of stratum was determined for a range of stratum weight between .25 and .95. At all the weights, king mackerel presented a higher total sampling cost than the other species at the same level of precision.
The cost efficiency of the two designs adopted was compared by the index $C.E. = C_{mrs}/C_{smrs}$, where $C_{mrs}$ is the total cost in the multistage random design and $C_{smrs}$ is the total cost when stratification is done in the psu's.
For both species stratification was efficient when the weight of stratum was between .35 and .65. Beyond these values multistage sampling was more efficient.
APPENDIX A
DERIVATION OF VARIANCE OF OVER-ALL MEAN
In this Appendix the derivation of the variance of the over-all mean in the two models studied will be given. The technique used to obtain them depends on a well known lemma from statistical theory (see Johnson and Rao 1959).
Lemma: In a k-stage sampling scheme, the over-all variance $V_{12\ldots k}$ of a sample statistic $\theta$ is composed of k components and is given by
\[ V_{12\ldots k}(\theta) = \{E_1E_2\cdots E_s\cdots E_{k-1}V_k + E_1E_2\cdots E_s\cdots E_{k-2}V_{k-1}E_k + \cdots + E_1E_2\cdots E_sV_{s+1}E_{s+2}\cdots E_k + \cdots + V_1E_2E_3\cdots E_s\cdots E_k\}(\theta) \tag{*} \]
Proof:
\[ V_{12\ldots k}(\theta) = E_1E_2\cdots E_s\cdots E_k(\theta^2) - \{E_1E_2\cdots E_s\cdots E_k(\theta)\}^2 \]
\[ = E_1E_2\cdots E_{k-1}\left[V_k(\theta) + \{E_k(\theta)\}^2\right] - \{E_1E_2\cdots E_k(\theta)\}^2 \]
\[ = E_1E_2\cdots E_{k-1}V_k(\theta) + E_1E_2\cdots E_{k-2}\left[V_{k-1}E_k(\theta) + \{E_{k-1}E_k(\theta)\}^2\right] - \{E_1\cdots E_s\cdots E_k(\theta)\}^2 \]
\[ = E_1E_2\cdots E_{k-1}V_k(\theta) + E_1E_2\cdots E_{k-2}V_{k-1}E_k(\theta) + E_1E_2\cdots E_{k-3}\left[V_{k-2}E_{k-1}E_k(\theta) + \{E_{k-2}E_{k-1}E_k(\theta)\}^2\right] - \{E_1E_2\cdots E_k(\theta)\}^2. \]
By proceeding with expansion of the terms in brackets we obtain the result (*).
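For k = 2 the lemma reduces to $V(\theta) = E_1V_2(\theta) + V_1E_2(\theta)$. The following Python sketch checks this identity exactly on a toy two-stage scheme. The cluster values and function names are illustrative assumptions, not data from the study: one cluster is drawn with equal probability, then one element is drawn with equal probability within it, and $\theta$ is the y-value of that element.

```python
from fractions import Fraction


def mean(values):
    """Exact arithmetic mean of a list of Fractions."""
    return sum(values, Fraction(0)) / len(values)


def pop_variance(values):
    """Exact variance with divisor len(values) (variance of one random draw)."""
    centre = mean(values)
    return mean([(v - centre) ** 2 for v in values])


def two_stage_components(clusters):
    """Return (total variance, E1 V2, V1 E2) for the toy two-stage scheme."""
    clusters = [[Fraction(y) for y in c] for c in clusters]
    n_clusters = len(clusters)
    # Direct computation from the probability of each possible outcome.
    outcomes = [(Fraction(1, n_clusters * len(c)), y) for c in clusters for y in c]
    expectation = sum(p * y for p, y in outcomes)
    total = sum(p * (y - expectation) ** 2 for p, y in outcomes)
    # E1 V2: average over clusters of the within-cluster variance.
    within = mean([pop_variance(c) for c in clusters])
    # V1 E2: variance over clusters of the within-cluster mean.
    between = pop_variance([mean(c) for c in clusters])
    return total, within, between


if __name__ == "__main__":
    total, within, between = two_stage_components([[1, 3], [5]])
    print(total, within, between)
```

Exact fractions are used so that the two sides can be compared with equality rather than a tolerance.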
Three-stage random sampling
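The first situation considered is the one in which the units have unequal subclass numbers in all three stages.
The unbiased estimator of the mean when sampling is with equal probabilities at every stage is given by:
\[ \bar{Y}_{...} = \frac{N}{Tn}\sum_{i}^{n}\frac{M_i}{m_i}\sum_{j}^{m_i}\frac{K_{ij}}{k_{ij}}\sum_{u}^{k_{ij}} y_{iju} \tag{A.1} \]
or
\[ \bar{Y}_{...} = \frac{N}{Tn}\sum_{i}^{n}\frac{M_i}{m_i}\sum_{j}^{m_i}K_{ij}\,\bar{y}_{ij} \tag{A.2} \]
where $T = \sum_{i}^{N}\sum_{j}^{M_i}K_{ij}$.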
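Since at the third stage each of the selected ssu's is sampled, and $k_{ij}$ elements are selected out of $K_{ij}$, this is a stratified sample. By Theorem 2.2 in Cochran (1963)
\[ V(\bar{y}_{ij}) = \frac{\sigma_{ij}^2}{k_{ij}}\left(1 - \frac{k_{ij}}{K_{ij}}\right) \tag{A.3} \]
where $\sigma_{ij}^2$ is the population variance in the second stage unit from which the sample was drawn.
The variance of (A.2) is
\[ V_3(\bar{Y}_{...}) = \left(\frac{N}{Tn}\right)^2\sum_{i}^{n}\left(\frac{M_i}{m_i}\right)^2\sum_{j}^{m_i}K_{ij}^2\,\frac{\sigma_{ij}^2}{k_{ij}}\left(1 - \frac{k_{ij}}{K_{ij}}\right) \tag{A.4} \]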
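Now in order to find the first term of (3.1.3), take the expectation of (A.4):
\[ E_1[E_2V_3(\bar{Y}_{...})] = \frac{N}{T^2n}\sum_{i}^{N}\frac{M_i}{m_i}\sum_{j}^{M_i}K_{ij}^2\,\frac{\sigma_{ij}^2}{k_{ij}}\left(1 - \frac{k_{ij}}{K_{ij}}\right). \tag{A.5} \]
It can be shown that
\[ E_3(\bar{Y}_{...}) = \frac{N}{Tn}\sum_{i}^{n}\frac{M_i}{m_i}\sum_{j}^{m_i}T_{ij} \tag{A.6} \]
where $T_{ij}$ is the total y-value in the ijth unit.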
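The variance of (A.6) is given by
\[ V_2\{E_3(\bar{Y}_{...})\} = \left(\frac{N}{Tn}\right)^2\sum_{i}^{n}M_i^2\,\frac{\sigma_{T_{ij}}^2}{m_i}\left(1 - \frac{m_i}{M_i}\right) \tag{A.7} \]
where $\sigma_{T_{ij}}^2$ is the variance between ssu totals within the ith psu,
\[ \sigma_{T_{ij}}^2 = \frac{1}{M_i - 1}\sum_{j}^{M_i}\left(T_{ij} - \bar{T}_{i}\right)^2. \]
Taking the expectation of (A.7) yields
\[ E_1[V_2\{E_3(\bar{Y}_{...})\}] = \frac{N}{T^2n}\sum_{i}^{N}M_i^2\,\frac{\sigma_{T_{ij}}^2}{m_i}\left(1 - \frac{m_i}{M_i}\right). \tag{A.8} \]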
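Finally, the last term of (3.1.3) is given by
\[ E_2\{E_3(\bar{Y}_{...})\} = \frac{N}{Tn}\sum_{i}^{n}\sum_{j}^{M_i}T_{ij}. \tag{A.9} \]
The variance of (A.9) after some manipulation is
\[ V_1[E_2\{E_3(\bar{Y}_{...})\}] = \left(\frac{N}{T}\right)^2\frac{\sigma_{T_i}^2}{n}\left(1 - \frac{n}{N}\right) \tag{A.10} \]
where $\sigma_{T_i}^2$ is the variance between totals of the y-values in the psu's, i.e.,
\[ \sigma_{T_i}^2 = \frac{1}{N-1}\sum_{i}^{N}\left(T_i - \bar{T}\right)^2, \qquad \bar{T} = \frac{1}{N}\sum_{i}^{N}T_i, \]
and $T_i = \sum_{j}T_{ij}$.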
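Substituting (A.5), (A.8) and (A.10) in (3.1.3) yields
\[ V(\bar{Y}_{...}) = \frac{N^2}{T^2}\frac{\sigma_{T_i}^2}{n}\left(1 - \frac{n}{N}\right) + \frac{N}{T^2n}\sum_{i}^{N}M_i^2\,\frac{\sigma_{T_{ij}}^2}{m_i}\left(1 - \frac{m_i}{M_i}\right) + \frac{N}{T^2n}\sum_{i}^{N}\frac{M_i}{m_i}\sum_{j}^{M_i}K_{ij}^2\,\frac{\sigma_{ij}^2}{k_{ij}}\left(1 - \frac{k_{ij}}{K_{ij}}\right). \tag{A.11} \]
In the symmetrical case, and using Cochran's notation:
$M_i = M$ for all i, $m_i = m$ for all i,
$K_{ij} = K$ for all i, j, $k_{ij} = k$ for all i, j.
Since then $T = NMK$, the expression (A.11) may be written as
\[ V(\bar{Y}_{...}) = \left(1 - \frac{n}{N}\right)\frac{\sigma_{T_i}^2}{n(MK)^2} + \left(1 - \frac{m}{M}\right)\frac{1}{nm}\,\frac{1}{NK^2}\sum_{i}^{N}\sigma_{T_{ij}}^2 + \left(1 - \frac{k}{K}\right)\frac{1}{nmk}\,\frac{1}{NM}\sum_{i}^{N}\sum_{j}^{M}\sigma_{ij}^2. \tag{A.12} \]
Define the following:
\[ \bar{Y}_{i..} = T_i/MK \]
\[ \sigma_1^2 = S_1^2 = \frac{1}{N-1}\sum_{i}^{N}\left(\bar{Y}_{i..} - \bar{Y}_{...}\right)^2 = \sigma_{T_i}^2/(MK)^2 \]
\[ \sigma_2^2 = S_2^2 = \frac{1}{N(M-1)}\sum_{i}^{N}\sum_{j}^{M}\left(\bar{Y}_{ij.} - \bar{Y}_{i..}\right)^2 = \frac{1}{NK^2}\sum_{i}^{N}\sigma_{T_{ij}}^2 \]
\[ \sigma_3^2 = S_3^2 = \frac{1}{NM(K-1)}\sum_{i}^{N}\sum_{j}^{M}\sum_{u}^{K}\left(Y_{iju} - \bar{Y}_{ij.}\right)^2 = \frac{1}{NM}\sum_{i}^{N}\sum_{j}^{M}\sigma_{ij}^2 \]
where $\sigma_1^2$, $\sigma_2^2$ and $\sigma_3^2$ are the variance between means in the first, second and third stage respectively.
Finally, substituting these values in (A.12) gives
\[ V(\bar{Y}_{...}) = \left(1 - \frac{n}{N}\right)\frac{S_1^2}{n} + \left(1 - \frac{m}{M}\right)\frac{S_2^2}{nm} + \left(1 - \frac{k}{K}\right)\frac{S_3^2}{nmk}. \tag{A.13} \]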
Stratified three-stage random sampling
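The over-all variance of the sample mean in this model is derived in the same fashion described in the previous section.
An estimate of the population mean per usu is given by
\[ \bar{Y}_{st} = \sum_{h}^{L}\frac{T_h}{T}\,\bar{Y}_h = \sum_{h}^{L}W_h\bar{Y}_h \tag{A.14} \]
where $W_h = T_h/T$ is the weight of stratum and $\bar{Y}_h$ is the sample mean per usu in the hth stratum defined by
\[ \bar{Y}_h = \frac{N_h}{T_hn_h}\sum_{i}^{n_h}\frac{M_{hi}}{m_{hi}}\sum_{j}^{m_{hi}}\frac{K_{hij}}{k_{hij}}\sum_{u}^{k_{hij}}y_{hiju} \tag{A.15} \]
where $T_h = \sum_{i}^{N_h}\sum_{j}^{M_{hi}}K_{hij}$ and $T = \sum_{h}^{L}T_h$.
The variance of $\bar{Y}_{st}$ is given by
\[ V(\bar{Y}_{st}) = \sum_{h}W_h^2\,V(\bar{Y}_h). \tag{A.16} \]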
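Proceeding in the same fashion as before, $V(\bar{Y}_h)$ was computed, which yielded
\[ V(\bar{Y}_h) = (1 - f_{1h})\frac{S_{1h}^2}{n_h} + (1 - f_{2h})\frac{S_{2h}^2}{n_hm_h} + (1 - f_{3h})\frac{S_{3h}^2}{n_hm_hk_h} \tag{A.17} \]
where $S_{1h}^2$, $S_{2h}^2$ and $S_{3h}^2$ are respectively the variance among unit means in the first, second and third stage in the hth stratum.
Substituting (A.17) into (A.16)
\[ V(\bar{Y}_{st}) = \sum_{h}W_h^2\left[(1 - f_{1h})\frac{S_{1h}^2}{n_h} + (1 - f_{2h})\frac{S_{2h}^2}{n_hm_h} + (1 - f_{3h})\frac{S_{3h}^2}{n_hm_hk_h}\right] \tag{A.18} \]
where $f_{1h} = n_h/N_h$; $f_{2h} = m_h/M_h$ and $f_{3h} = k_h/K_h$.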
Derivations of Optimum Allocation Formulas
Three-stage random sampling
The computational details of the derivation of the optimum
values n, m and k in three-stage random sampling are presented
here.
As defined previously in Section (5.1.1) the cost function C may be represented by
\[ C = nC_1 + nmC_2 + nmkC_3. \]
The over-all variance of the mean was shown to be
\[ V = V(\bar{Y}_{...}) = \left(1 - \frac{n}{N}\right)\frac{S_1^2}{n} + \left(1 - \frac{m}{M}\right)\frac{S_2^2}{nm} + \left(1 - \frac{k}{K}\right)\frac{S_3^2}{nmk} \]
or
\[ V = \frac{1}{n}\left(S_1^2 - S_2^2/M\right) + \frac{1}{nm}\left(S_2^2 - S_3^2/K\right) - S_1^2/N + S_3^2/nmk. \tag{A.19} \]
To obtain the values of n, m and k which minimize the cost C subject to a fixed variance V, the calculus method of Lagrange multipliers was used, differentiating with respect to n, m and k the quantity $F = C + \lambda V$, where $\lambda$ is the Lagrange multiplier.
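The equations are:
\[ \frac{\partial F}{\partial n} = C_1 + mC_2 + mkC_3 - \frac{\lambda}{n^2}\left[(S_1^2 - S_2^2/M) + (S_2^2 - S_3^2/K)/m + S_3^2/mk\right] = 0 \tag{A.20} \]
\[ \frac{\partial F}{\partial m} = nC_2 + nkC_3 - \frac{\lambda}{m^2}\left[(S_2^2 - S_3^2/K)/n + S_3^2/nk\right] = 0 \tag{A.21} \]
\[ \frac{\partial F}{\partial k} = nmC_3 - \frac{\lambda}{k^2}\,S_3^2/nm = 0 \tag{A.22} \]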
Taking the values of $\lambda$ in (A.21) and (A.22), comparing and solving for k yields
\[ k = \sqrt{\frac{C_2}{C_3}}\,\frac{S_3}{\sqrt{S_2^2 - S_3^2/K}}. \tag{A.23} \]
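Similarly, multiplying (A.20) by n and subtracting (A.21) multiplied by m gives
\[ \lambda = \frac{n^2C_1}{S_1^2 - S_2^2/M}. \]
Substituting $\lambda$ in (A.22) yields
\[ m = \sqrt{\frac{C_1}{C_2}}\,\frac{\sqrt{S_2^2 - S_3^2/K}}{\sqrt{S_1^2 - S_2^2/M}}. \tag{A.24} \]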
The value of n is obtained by substituting (A.23) and (A.24) in (A.19). Thus, after simplification and ordering the terms, the final expression is
\[ n = \frac{S_{u1}\sum_{i}^{3}\left(S_{ui}\sqrt{C_i}\right)}{\sqrt{C_1}\,(V + S_1^2/N)} \tag{A.25} \]
where $S_{ui}$ is the standard deviation among unit means within each stage of sampling. In this particular case
$S_{u1} = \sqrt{S_1^2 - S_2^2/M}$ for the first stage
$S_{u2} = \sqrt{S_2^2 - S_3^2/K}$ for the second stage
$S_{u3} = S_3$ for the third stage
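As a numerical illustration of (A.23) to (A.25), the following Python sketch computes the unrounded optimum n, m and k and evaluates the variance in the symmetrical three-stage form. The function names and any input values passed to them are hypothetical; they are not the estimates obtained in this study, and the results would still have to be rounded to whole units in practice.

```python
import math


def variance(n, m, k, S1, S2, S3, N, M, K):
    """Over-all variance of the mean in the symmetrical three-stage case.

    S1, S2, S3 are standard deviations among unit means at each stage.
    """
    return ((1 - n / N) * S1 ** 2 / n
            + (1 - m / M) * S2 ** 2 / (n * m)
            + (1 - k / K) * S3 ** 2 / (n * m * k))


def optimum_allocation(V, S1, S2, S3, C1, C2, C3, N, M, K):
    """Unrounded n, m, k minimizing C = n*C1 + n*m*C2 + n*m*k*C3 for fixed V."""
    Su1 = math.sqrt(S1 ** 2 - S2 ** 2 / M)  # first stage
    Su2 = math.sqrt(S2 ** 2 - S3 ** 2 / K)  # second stage
    Su3 = S3                                # third stage
    k = math.sqrt(C2 / C3) * Su3 / Su2      # (A.23)
    m = math.sqrt(C1 / C2) * Su2 / Su1      # (A.24)
    weighted = Su1 * math.sqrt(C1) + Su2 * math.sqrt(C2) + Su3 * math.sqrt(C3)
    n = Su1 * weighted / (math.sqrt(C1) * (V + S1 ** 2 / N))  # (A.25)
    return n, m, k


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to exercise the formulas.
    n, m, k = optimum_allocation(0.05, 2.0, 3.0, 4.0, 4.0, 1.0, 0.25, 200, 50, 100)
    print(n, m, k, variance(n, m, k, 2.0, 3.0, 4.0, 200, 50, 100))
```

Evaluating the variance at the returned values should reproduce the preassigned V, which gives a quick check on the algebra. Note that m and k do not depend on V, in line with the statement in Chapter 8 that the numbers of ssu's and usu's were not affected by the level of precision.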
Stratified three-stage random sampling
In this design the cost function C as defined in Chapter 5 can be represented by
\[ C = \sum_{h}n_hC_{1h} + \sum_{h}n_hm_hC_{2h} + \sum_{h}n_hm_hk_hC_{3h} \]
and the over-all variance of the mean given by (3.2.6), after rearranging the terms, is
\[ V = V(\bar{Y}_{st}) = \sum_{h}W_h^2\left[(S_{1h}^2 - S_{2h}^2/M_h)/n_h + (S_{2h}^2 - S_{3h}^2/K_h)/n_hm_h + S_{3h}^2/n_hm_hk_h - S_{1h}^2/N_h\right]. \tag{A.26} \]
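The optimum values of $n_h$, $m_h$ and $k_h$ which minimize the total cost function of sampling, subject to the constraint that the amount of variance is preassigned, were found by the same method used before.
Setting up the Lagrangian $F = C + \lambda V$ and taking the partial derivatives with respect to $n_h$, $m_h$ and $k_h$ yields
\[ \frac{\partial F}{\partial n_h} = -\frac{\lambda W_h^2}{n_h^2}\left[(S_{1h}^2 - S_{2h}^2/M_h) + (S_{2h}^2 - S_{3h}^2/K_h)/m_h + S_{3h}^2/m_hk_h\right] + C_{1h} + m_hC_{2h} + m_hk_hC_{3h} = 0 \tag{A.27} \]
\[ \frac{\partial F}{\partial m_h} = -\frac{\lambda W_h^2}{m_h^2}\left[(S_{2h}^2 - S_{3h}^2/K_h)/n_h + S_{3h}^2/n_hk_h\right] + n_hC_{2h} + n_hk_hC_{3h} = 0 \tag{A.28} \]
\[ \frac{\partial F}{\partial k_h} = -\frac{\lambda W_h^2}{k_h^2}\left[S_{3h}^2/n_hm_h\right] + n_hm_hC_{3h} = 0. \tag{A.29} \]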
Comparing the value of $\lambda$ from (A.28) and (A.29) and solving for $k_h$
\[ k_h = \sqrt{\frac{C_{2h}}{C_{3h}}}\,\frac{S_{3h}}{\sqrt{S_{2h}^2 - S_{3h}^2/K_h}}. \tag{A.30} \]
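Multiplying (A.27) by $n_h$ and subtracting (A.28) multiplied by $m_h$ gives
\[ \lambda W_h^2 = \frac{n_h^2C_{1h}}{S_{1h}^2 - S_{2h}^2/M_h}. \]
Substituting $\lambda W_h^2$ in (A.29) yields
\[ m_h = \sqrt{\frac{C_{1h}}{C_{2h}}}\,\frac{\sqrt{S_{2h}^2 - S_{3h}^2/K_h}}{\sqrt{S_{1h}^2 - S_{2h}^2/M_h}}. \tag{A.31} \]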
The optimum value of $n_h$ is obtained by substituting (A.30) and (A.31) in the variance function. Thus, after simplification, the final expression is
\[ n_h = \frac{W_hS_{u1h}}{\sqrt{C_{1h}}}\;\frac{\sum_{h}W_h\left[\sum_{i}^{3}\left(S_{uih}\sqrt{C_{ih}}\right)\right]}{V + \sum_{h}W_h^2S_{1h}^2/N_h} \tag{A.32} \]
where $S_{uih}$ is the standard deviation among unit means corresponding to each stage of sampling.
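The stratified allocation can be sketched numerically in the same way. In the Python sketch below, $m_h$ and $k_h$ follow (A.30) and (A.31), and $n_h$ is obtained from the Lagrangian conditions (A.27) to (A.29) together with the fixed-variance constraint (A.26). The function names, the dictionary layout and any inputs are hypothetical assumptions for illustration, not values from this study.

```python
import math


def stratified_allocation(V, strata):
    """Unrounded (n_h, m_h, k_h) per stratum for a preassigned variance V.

    strata: list of dicts with keys "W" (weight), "N", "M", "K",
    "S" = (S1h, S2h, S3h) and "C" = (C1h, C2h, C3h).
    """
    prepared = []
    for s in strata:
        S1, S2, S3 = s["S"]
        C1, C2, C3 = s["C"]
        Su = (math.sqrt(S1 ** 2 - S2 ** 2 / s["M"]),
              math.sqrt(S2 ** 2 - S3 ** 2 / s["K"]),
              S3)
        k = math.sqrt(C2 / C3) * Su[2] / Su[1]   # (A.30)
        m = math.sqrt(C1 / C2) * Su[1] / Su[0]   # (A.31)
        weighted = sum(su * math.sqrt(c) for su, c in zip(Su, s["C"]))
        prepared.append((s, Su, m, k, weighted))
    numerator = sum(s["W"] * w for s, _, _, _, w in prepared)
    denominator = V + sum(s["W"] ** 2 * s["S"][0] ** 2 / s["N"]
                          for s, *_ in prepared)
    root_lambda = numerator / denominator        # square root of the multiplier
    return [(s["W"] * Su[0] / math.sqrt(s["C"][0]) * root_lambda, m, k)
            for s, Su, m, k, _ in prepared]


def stratified_variance(allocation, strata):
    """Over-all variance of the stratified mean, in the form of (A.26)."""
    total = 0.0
    for (n, m, k), s in zip(allocation, strata):
        S1, S2, S3 = s["S"]
        total += s["W"] ** 2 * ((S1 ** 2 - S2 ** 2 / s["M"]) / n
                                + (S2 ** 2 - S3 ** 2 / s["K"]) / (n * m)
                                + S3 ** 2 / (n * m * k)
                                - S1 ** 2 / s["N"])
    return total
```

For two hypothetical strata with weights .4 and .6, calling `stratified_allocation(0.05, strata)` and passing the result to `stratified_variance` should return the preassigned 0.05, which serves as a consistency check.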
LIST OF REFERENCES
Anderson, R. L. 1960. Uses of variance components analysis in the interpretation of biological experiments. Bull. of the Int. Stat. Instit. 37:3: 71-90.
Cochran, W. G. 1939. The use of analysis of variance in enumeration by sampling. Jour. Amer. Stat. Assoc. 34: 492-501.
Cochran, W. G. 1963. Sampling Techniques. 2nd Ed. John Wiley & Sons, Inc., New York.
Costa, R. S. and M. P. Paiva. 1965. Notas sobre a pesca da cavala e da serra no Ceara-Dados de 1964. Arq. Est. Biol. Mar. Univ. Ceara 5: 93-101.
Ecimovic, J. P. 1956. Three-stage sampling with varying probabilities of selection. Jour. Ind. Soc. Agr. Stat. 8: 14-44.
Eisenhart, C. 1947. The assumptions underlying the analysis of variance. Biometrics 3: 1-21.
Ganguli, M. 1941. A note on nested sampling. Sankhya 5: 449-452.
Gates, C. E. and C. Shine. 1962. The analysis of variance of the s-stage hierarchal classification. Biometrics 18: 529-536.
Gower, J. C. 1962. Variance component estimation for unbalanced hierarchical classification. Biometrics 18: 537-542.
Johnson, P. O. and M. S. Rao. 1959. Modern Sampling Methods. The University of Minnesota Press, Minneapolis.
King, S. E. and C. R. Henderson. 1954. Variance components analysis in heritability studies. Poultry Science 33: 147-154.
Marcuse, S. 1949. Optimum allocation and variance components in nested sampling with an application to chemical analysis. Biometrics 5: 189-205.
Mokashi, V. K. 1954. Investigation on sampling for estimation of crop acreages-II. Jour. Ind. Soc. Agr. Stat. 7: 115-126.
Pulley, Paul, Jr. 1957. A program for the analysis of variance of hierarchal classification design. M. S. Thesis. Oklahoma State University, Stillwater, Oklahoma.
Roy, J. 1957. A note on estimation of variance components in multistage sampling with varying probabilities. Sankhya 17: 367-372.
Sen, A. R. and R. P. Chakrabarty. 1964. Estimation of loss of crop from pest and diseases of tea from sample surveys. Biometrics 20: 492-504.
Sen, A. R., R. P. Chakrabarty and A. R. Sarkar. 1966a. Sampling techniques for estimation of incidence of red spider mite on tea crop in North-East India. Biometrics 22: 385-403.
Sen, A. R., A. R. Sarkar and R. P. Chakrabarty. 1966b. Sample survey of pests and diseases of tea in North-East India. Expl. Agric. 2: 161-172.
Singh, D. 1958. Estimates of variance components in finite population. Jour. Ind. Soc. Agr. Stat. 10: 1-14.
Southern Cooperative Group. 1951. Studies of sampling techniques and chemical analysis of vegetables. Sout. Coop. Series Bull. No. 10.
Stearman, R. L., T. G. Ward and R. A. Webster. 1953. Uses of components of variance techniques in biological experimentation. Amer. Jour. Hyg. 58: 340-351.
Stuart, A. 1964. Multistage sampling with preliminary stratification of first stage units. Rev. Inst. Int. Stat. 32: 193-198.
Sukhatme, P. V. 1950. Efficiency of sub-sampling design in yield survey. Jour. Ind. Soc. Agr. Stat. 2: 212-228.
Sukhatme, P. V. and V. G. Panse. 1951. Crop surveys in India II. Jour. Ind. Soc. Agr. Stat. 3: 96-168.
Sukhatme, P. V., V. G. Panse and K. V. R. Sastry. 1958. Sampling techniques for estimating the catch of sea fish in India. Biometrics 14: 78-96.
Taylor, A. E. 1955. Advanced Calculus. Ginn and Co., New York.