
Generalized estimators of avian abundance from count survey data ...


Animal Biodiversity and Conservation 27.1 (2004)

© 2004 Museu de Ciències Naturals. ISSN: 1578–665X

Royle, J. A., 2004. Generalized estimators of avian abundance from count survey data. Animal Biodiversity and Conservation, 27.1: 375–386.

Abstract

Generalized estimators of avian abundance from count survey data. I consider modeling avian abundance from spatially referenced bird count data collected according to common protocols such as capture–recapture, multiple observer, removal sampling and simple point counts. Small sample sizes and large numbers of parameters have motivated many analyses that disregard the spatial indexing of the data, and thus do not provide an adequate treatment of spatial structure. I describe a general framework for modeling spatially replicated data that regards local abundance as a random process, motivated by the view that the set of spatially referenced local populations (at the sample locations) constitutes a metapopulation. Under this view, attention can be focused on developing a model for the variation in local abundance independent of the sampling protocol being considered. The metapopulation model structure, when combined with the data generating model, defines a simple hierarchical model that can be analyzed using conventional methods. The proposed modeling framework is completely general in the sense that broad classes of metapopulation models may be considered, site level covariates on detection and abundance may be considered, and estimates of abundance and related quantities may be obtained for sample locations, groups of locations, and unsampled locations. Two brief examples are given, the first involving simple point counts, and the second based on temporary removal counts. Extension of these models to open systems is briefly discussed.

Key words: Abundance estimation, Avian point counts, Detection probability, Hierarchical models, Metapopulation models, Population size.

Resumen

Estimadores generalizados de abundancia en aves a partir de datos de estudios de recuento. En el presente estudio se analiza la modelación de la abundancia en aves mediante datos de recuento de aves, referenciados espacialmente y obtenidos a partir de protocolos comunes, como los de captura–recaptura, muestreo por observadores múltiples, muestreo por eliminación y recuentos de puntos simples. Las muestras de pequeño tamaño, así como el amplio número de parámetros, han propiciado numerosos análisis que no tienen en cuenta la indexación espacial de los datos y, por consiguiente, no proporcionan un tratamiento adecuado de la estructura espacial. En este trabajo se describe un marco general para la modelación de datos replicados en el espacio, que considera la abundancia local como un proceso aleatorio, todo ello basado en el punto de vista de que el conjunto de poblaciones locales referenciadas espacialmente (en los lugares de toma de muestras) constituye una metapoblación. De este modo, la atención puede centrarse en el desarrollo de un modelo para la variación en la abundancia local que sea independiente del protocolo de muestreo que se esté utilizando. La estructura del modelo metapoblacional, en combinación con el modelo de generación de datos, define un modelo jerárquico simple que puede analizarse mediante el empleo de métodos convencionales. El marco de modelación propuesto es de carácter general, en el sentido de que permite considerar amplias clases de modelos metapoblacionales, covariantes del nivel del emplazamiento sobre datos de detección, y la abundancia, pudiendo obtenerse estimaciones de abundancia y cantidades relacionadas para emplazamientos de muestreo, grupos de emplazamientos y emplazamientos no muestreados. A tal efecto, se incluyen dos breves ejemplos; el primero trata de los recuentos de puntos simples, mientras que el segundo se basa en los recuentos por extracción temporal. También se apunta la posibilidad de ampliar estos modelos a sistemas abiertos.

Palabras clave: Estimación de la abundancia, Recuentos de puntos aviares, Probabilidad de detección, Modelos jerárquicos, Modelos metapoblacionales, Tamaño poblacional.

J. Andrew Royle, USGS Patuxent Wildlife Research Center, 12100 Beech Forest Road, Laurel MD 20708, U.S.A.

Introduction

The detectability of individuals is a fundamental consideration in many studies of animal populations. The need to properly account for detectability has given rise to an extensive array of sampling protocols and statistical procedures for estimating demographic parameters in the presence of imperfect detection (Williams et al., 2002).

Conventional capture–recapture methods in which individual animals are marked, released, and recaptured (or resighted) constitute the most useful class of methods in terms of the information content provided by the data, and the complexity of detection process models that may be considered. In studies of avian populations, implementation of capture–recapture methods is often difficult in field situations. Because of this, there has been considerable recent interest in methods based on avian point counting that are capable of controlling for imperfect detection while remaining efficient to implement in field situations. These methods include those based on multiple observer sampling (Cook & Jacobson, 1979; Nichols et al., 2000), temporary removal (Farnsworth et al., 2002), distance sampling (Rosenstock et al., 2002) and even simple point counts (Royle, 2004a). These and similar methods are also widely used in the study of other organisms including marine mammals, ungulates, and amphibians. My motivation derives from studies of bird populations, and so subsequent discussion and examples focus on bird sampling problems.

Many small–scale studies of animal populations and large–scale monitoring efforts rely on sampling designs in which one (or more) of these common sampling protocols is replicated spatially. This is partly out of necessity (many species exist at low densities, and effective sampling areas are small), but often there is direct interest in characterizing spatial variation in abundance. These replicated surveys yield spatially indexed count data $y_i$ for sample location (or site) $i = 1, 2, \ldots, R$. In most sampling protocols for which it is possible to estimate abundance in the presence of imperfect detection, $y_i = \{y_{ik};\ k = 1, 2, \ldots, K\}$ is a vector of counts. As an example, if a removal protocol is used then $y_i = (y_{i1}, y_{i2}, y_{i3})$ are the numbers of animals first observed (and "removed" from further counting) in consecutive time intervals of, say, 3 minutes. The precise nature of the count statistic under other common sampling protocols is described in the "Notation and preliminaries" section.

There are two objectives considered in many studies of animal populations that are demographically closed. First is estimation of "abundance", population size, or density. In the context of spatially replicated surveys, this is often defined as total or average abundance of the sampled area. The second objective is estimation of the effects of (site–specific) covariates on abundance or density. Typical covariates of interest are those that describe habitat or landscape structure. Interestingly, there have been few general suggestions for accommodating site–specific covariates in these common sampling protocols. Royle et al. (2004a) describe an approach for incorporating abundance covariate effects in distance sampling models that is related to the models described here.

One important difficulty present in most spatially replicated bird counting surveys (of breeding birds) is that typical abundance at individual sampled locations is very small. Consequently, site–specific sample sizes (number of observed birds) are small. The general small sample situation is problematic when it comes to estimation because the likelihood contains many (abundance) parameters, each of which is ill–informed by the available sparse data. Estimation of spatially explicit abundance is usually infeasible. A common solution is to aggregate data across sites and apply conventional estimation methods to the aggregate counts. In doing so, site–specific information is lost so that, for example, estimation and modeling of site–specific covariate effects on abundance and detection is infeasible. In addition, spatial scale becomes a concern when deciding how data should be aggregated. While it may be reasonable to combine multiple samples within a small forest or woodlot, additional considerations should be relevant at larger scales. Finally, the use of aggregated counts cannot generally be justified based on the likelihood for the observed data, i.e., the site–specific counts. That is, the aggregated counts are not sufficient statistics for the objective (total) abundance "parameter" under a sampling scheme involving spatial replication. Additional assumptions are required to formally justify aggregation of count statistics among sites; this is elaborated on in "The likelihood under spatial replication" section. These deficiencies motivate the need for a more general approach to dealing with spatially replicated count survey data.

In this paper, I describe a general framework for modeling and estimation of abundance from spatially replicated animal count data. The key idea is introduction of a metapopulation model that characterizes the (spatial) variation in abundance of the spatially referenced populations being sampled. The metapopulation view provides a concise framework for combining the data collected at multiple sample locations, regardless of the sampling protocol used to collect data. Specification of a metapopulation model is a great advantage because it allows the biologist to focus on explicit formulation of the abundance model at the level of the sample unit, independent of the detection process. The main benefit of adopting the metapopulation view is that a broad class of more complex models is possible, including models which describe variation in site–specific abundance explicitly (e.g., with covariates), and models which allow for latent spatial variation (overdispersion, spatial correlation) that is not modeled explicitly by covariates. These metapopulation models form the basis for the development of generalized estimators of abundance based on any of the previously mentioned protocols. These are generalized in the sense that they can accommodate variation in site–specific abundance, factors that influence detectability, and additional considerations described in "The metapopulation view" section.

Under this metapopulation formulation, local abundance is regarded as a "random effect", and the general model structure is commonly referred to in statistics as a hierarchical model. Hierarchical models are commonly analyzed by either integrated likelihood or Bayesian methods. This is described in the "Estimation and inference" section. Modeling abundance effects, estimating density, estimating local population size, and even predicting abundance at unsampled locations are straightforward problems under this hierarchical modeling framework.

Notation and preliminaries

Let $N_i$ be the number of birds available to be counted at location $i$; $i = 1, 2, \ldots, R$. Sampling yields the vector of counts $y_i$ for each sample location. The precise nature of the data vector $y_i$ depends on the sampling protocol used. For several of the more common sampling protocols, the data structure is as follows:

(1) For independent double or multiple observer protocols, $k$ indexes an "observer detection history". For example, with two independent observers, $K = 3$ and $y_{i1}$ is the number of birds seen by observer 1 (but not observer 2), $y_{i2}$ is the number seen by observer 2 (but not 1), and $y_{i3}$ is the number seen by both observers. In general, with $T$ observers, there are $K = 2^T - 1$ observable observer histories.

(2) For a removal protocol, $k$ indexes the time interval of (first) detection, i.e., $y_{i1}$ is the number of birds first seen in interval 1, $y_{i2}$ in interval 2, $y_{i3}$ in interval 3, etc.

(3) For distance sampling, the count statistics are indexed by distance, so that $y_{ik}$ is the number of birds seen in distance class $k$ at site $i$.

(4) For conventional "capture–recapture" experiments, the data structure is analogous to that obtained under multiple observer sampling except that the capture history is organized in time. For example, in a two period study, $K = 3$; let $y_{i1}$ be the number of individuals with capture history "10" (seen in the first interval, but not the second), $y_{i2}$ be the number of individuals with capture history "01", and $y_{i3}$ be the number of individuals with capture history "11".

Various other protocols may also be considered, including that based on simple point counts (see "Point counts" section).
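As a small illustration of the data structures listed above, the following sketch tabulates per-bird detection histories into the count vector $y_i$ for a two-observer (or two-period capture–recapture) protocol. The string encoding of histories and the function names are assumptions made here for illustration; they do not come from the paper.

```python
from collections import Counter

# Assumed encoding: one two-character history per detected bird,
# "10" = first observer/period only, "01" = second only, "11" = both.
HISTORY_ORDER = ("10", "01", "11")


def count_vector(histories):
    """Return (y_i1, y_i2, y_i3) for one site from per-bird histories."""
    tally = Counter(histories)
    unknown = set(tally) - set(HISTORY_ORDER)
    if unknown:
        # "00" (never detected) is unobservable and must not appear in data.
        raise ValueError(f"unobservable or malformed histories: {sorted(unknown)}")
    return tuple(tally[h] for h in HISTORY_ORDER)


def n_observable_histories(T):
    """Number of observable histories with T observers: 2^T - 1."""
    return 2 ** T - 1


site_histories = ["10", "11", "01", "11", "10"]
y_i = count_vector(site_histories)
```

The all-zero history is excluded because birds that were never detected do not appear in the data; their number is the unknown quantity $N_i$ minus the total count at the site.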

A final bit of notation will be useful. In some applications that involve spatial replication, putative interest lies in estimation of the total abundance at the sampled sites: $N_{total} = \sum_i N_i$. The familiar "dot notation" will be used to indicate various sums. Let $y_{i0} = N_i - y_{i\cdot}$ be the number of birds not detected at each site, where $y_{i\cdot} = \sum_k y_{ik}$ (the total number detected at site $i$). Thus, $N_{total}$ can be expressed as the total count $\sum_i y_{i\cdot}$ plus the sum of the unobserved individuals at each location, $y_{\cdot 0} = \sum_i y_{i0}$.

The likelihood under spatial replication

Hereafter, I assume that the observations $y_i$ are independent when conditioned on local population size $N_i$ and parameters of the detection process. Under this assumption, the sampling distribution of the data $y = \{y_i;\ i = 1, 2, \ldots, R\}$ under most common sampling protocols is the product multinomial

$$L(\{N_i\}, p \mid y) = \prod_{i=1}^{R} \frac{N_i!}{y_{i1}!\, y_{i2}!\, y_{i3}!\, y_{i0}!}\, \pi_1^{y_{i1}} \pi_2^{y_{i2}} \pi_3^{y_{i3}} \pi_0^{y_{i0}} \tag{1}$$

for a sampling protocol yielding 3 observable frequencies (e.g., 3 period removal, 2 observers, etc.). The cell probabilities, $\pi_k$, are functions of one or more detection probability parameters $p$ (the precise function depends on the protocol being used) and $\pi_0 = 1 - \sum_k \pi_k$ (the probability that an individual is not captured). For example, under a removal sampling protocol with three removal periods, the cell probabilities have the following form when detection probability is assumed constant:

$$\pi_1 = p$$

$$\pi_2 = (1 - p)\,p$$

$$\pi_3 = (1 - p)^2\,p$$

$$\pi_0 = (1 - p)^3.$$

For other sampling protocols, these cell probabilities are different functions of various detection probability parameters but their precise form is not relevant in any of the following discussion.
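The constant-$p$ removal cell probabilities can be written as a short function. This is a minimal sketch; the function name and the generalization from three to an arbitrary number $K$ of removal periods are assumptions made for illustration.

```python
def removal_cell_probs(p, K=3):
    """Cell probabilities for a K-period removal protocol with constant p.

    Returns (pis, pi0): pis[k-1] = (1 - p)^(k-1) * p is the probability of
    first detection in period k, and pi0 = (1 - p)^K is the probability
    that an individual is never detected.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    pis = [(1.0 - p) ** k * p for k in range(K)]
    pi0 = (1.0 - p) ** K
    return pis, pi0


pis, pi0 = removal_cell_probs(0.5)
```

The observable cells and the unobserved cell together sum to one, which is a convenient check on any protocol-specific parameterization.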

While the product multinomial likelihood (1) is not inherently intractable, in many practical situations there are important considerations that render it so. In particular, there are usually many unknown abundance parameters (the $N_i$'s), in addition to the parameters that describe the detection process. Also, local population sizes are frequently very small and, consequently, the sample sizes (number of captured individuals) for each location are small. In many surveys, there may in fact be many zero counts. One common solution to dealing with these problems is to aggregate the counts (i.e., pool data from multiple sample locations). This is discussed subsequently.

Spatial aggregation of count statistics

A common goal of many studies is estimation of the total abundance, $N_{total}$, across all sampled locations. Ignoring the fact that the data are indexed by sample location, one might focus on the likelihood of the aggregated count statistics:

$$L(N_{total}, p \mid y) = \frac{\left(\sum_i N_i\right)!}{\left(\sum_i y_{i1}\right)!\,\left(\sum_i y_{i2}\right)!\,\left(\sum_i y_{i3}\right)!\,\left(\sum_i y_{i0}\right)!}\; \pi_1^{\sum_i y_{i1}}\, \pi_2^{\sum_i y_{i2}}\, \pi_3^{\sum_i y_{i3}}\, \pi_0^{\sum_i y_{i0}}. \tag{2}$$

To be more concise, using the notation introduced in the "Notation and preliminaries" section, Eq. (2) is

$$L(N_{total}, p \mid y) = \frac{N_{total}!}{y_{\cdot 1}!\, y_{\cdot 2}!\, y_{\cdot 3}!\, y_{\cdot 0}!}\; \pi_1^{y_{\cdot 1}}\, \pi_2^{y_{\cdot 2}}\, \pi_3^{y_{\cdot 3}}\, \pi_0^{y_{\cdot 0}}. \tag{3}$$

Interestingly, the use of aggregated counts cannot be justified under the (correct) likelihood for the disaggregated data given by (1) without some additional assumptions elaborated on shortly. While it is true that Eq. (3) is the correct likelihood for the total counts if the site–specific counts are unknown, it is not equivalent to the likelihood based on the disaggregated site–specific counts. That is, given the site–specific counts $y_{ik}$, the totals $y_{\cdot k} = \sum_i y_{ik}$ are not the sufficient statistics for $N_{total}$ when the $N_i$ are viewed as fixed but unknown parameters.

I believe that the idea of pooling the site–specific sufficient statistics is partially motivated by convenience. The main support for use of (3) over (1) seems to be that there are too many $N_i$ parameters in the joint likelihood (1) and this motivates one to consider them as nuisance parameters. However, estimation based on aggregated data does not appear consistent with usual notions of the treatment of nuisance parameters, for example, integration of the nuisance parameters from the likelihood under a suitable prior distribution, or conditioning on sufficient statistics, both of which are fairly conventional treatments of nuisance parameters.

It can be demonstrated that if the $N_i$ are assumed to have a Poisson distribution with mean $\lambda$, then one can justify aggregation (i.e., (3)) from likelihood (1). In this sense, estimation based on aggregated counts can be viewed as having implied a Poisson assumption on $N_i$ with constant mean. Importantly, it precludes other possibilities: that the $N_i$ are over–dispersed relative to the Poisson, or that the mean is not constant. Thus, technical details aside, the important reason that one should not aggregate data is that it renders impossible the consideration of covariate effects on both abundance and detection probability, and consideration of more complex variance structure.
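The Poisson justification can be sketched in a few lines. The derivation below is an illustration under the stated assumption that the $N_i$ are independent Poisson($\lambda$) variables, with $y_{i\cdot} = \sum_k y_{ik}$ and three observable cells; it is not reproduced from the paper. Summing the multinomial term for one site against the Poisson mass function, and then multiplying over sites, gives

```latex
\begin{align*}
\sum_{N_i = y_{i\cdot}}^{\infty}
  \frac{N_i!}{y_{i1}!\,y_{i2}!\,y_{i3}!\,(N_i - y_{i\cdot})!}\,
  \pi_1^{y_{i1}} \pi_2^{y_{i2}} \pi_3^{y_{i3}} \pi_0^{N_i - y_{i\cdot}}\,
  \frac{e^{-\lambda}\lambda^{N_i}}{N_i!}
&= \frac{e^{-\lambda}\prod_{k=1}^{3} (\lambda\pi_k)^{y_{ik}}}{\prod_{k=1}^{3} y_{ik}!}
   \sum_{m=0}^{\infty} \frac{(\lambda\pi_0)^{m}}{m!} \\
&= \prod_{k=1}^{3} \frac{e^{-\lambda\pi_k}\,(\lambda\pi_k)^{y_{ik}}}{y_{ik}!}, \\[6pt]
\prod_{i=1}^{R} \prod_{k=1}^{3}
  \frac{e^{-\lambda\pi_k}\,(\lambda\pi_k)^{y_{ik}}}{y_{ik}!}
&= \frac{\prod_{k=1}^{3} e^{-R\lambda\pi_k}\,(\lambda\pi_k)^{y_{\cdot k}}}
        {\prod_{i=1}^{R}\prod_{k=1}^{3} y_{ik}!}.
\end{align*}
```

The second step uses $e^{-\lambda} e^{\lambda \pi_0} = e^{-\lambda(\pi_1 + \pi_2 + \pi_3)}$. In the final expression the parameters $\lambda$ and $p$ meet the data only through the totals $y_{\cdot k}$, which is why pooling loses nothing under a constant-mean Poisson model and why it fails once the mean varies among sites or the counts are over-dispersed.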

The conditional likelihood under spatial aggregation

As an alternative to using the likelihood (1), it is common to use so–called "conditional" estimators based on obtaining an estimate of $\pi_0$ from the conditional likelihood

$$L(p \mid y) = \prod_{i=1}^{R} \frac{y_{i\cdot}!}{y_{i1}!\, y_{i2}!\, y_{i3}!} \left(\frac{\pi_1}{1 - \pi_0}\right)^{y_{i1}} \left(\frac{\pi_2}{1 - \pi_0}\right)^{y_{i2}} \left(\frac{\pi_3}{1 - \pi_0}\right)^{y_{i3}}. \tag{4}$$

The likelihood given by (4) is motivated by noting that the sufficient statistic for $N_i$ is $y_{i\cdot}$, and so by conditioning on $y_{i\cdot}$, $N_i$ is removed from the problem. Estimators based on the conditional likelihood (4) and the "unconditional" likelihood (3) are asymptotically equivalent (Sanathanan, 1972), and both specifications are commonly used in practice.

For the common parameterizations of $\pi_k$ (under the sampling protocols described previously), it is clear that the aggregated counts are sufficient statistics for those model parameters contained in $\pi_k$, and hence use of aggregated counts can be justified under likelihood (4) if interest is focused on estimating detection probability parameters. Estimation of $N_{total}$ is then based on the assertion that $y_{\cdot\cdot} = \sum_i y_{i\cdot}$ is Binomial($N_{total}$, $1 - \pi_0$). While this may be true, it should be noted that it was not $y_{\cdot\cdot}$ that was conditioned on in order to obtain Eq. (4), but rather $y_{i\cdot}$. The neglected likelihood component is

$$\prod_{i=1}^{R} \binom{N_i}{y_{i\cdot}} (1 - \pi_0)^{y_{i\cdot}}\, \pi_0^{N_i - y_{i\cdot}}.$$

Once again, there is no way to reformulate this in terms of $N_{total}$ without additional model structure on $N_i$ (e.g., if $N_i$ has a Poisson distribution).

The metapopulation view

A more appealing and general solution to the problem of spatial replication can be achieved by regarding the collection of local populations as a metapopulation (Levins, 1969; Hanski & Gilpin, 1977). For the present purposes, a useful operational definition of metapopulation is simply "a population of (local) populations indexed by space". Interest in the study of metapopulation biology has exploded in recent years both in terms of theoretical development and applications of metapopulation concepts to many taxa. Patch occupancy, local extinction and local colonization are all metapopulation characteristics of some theoretical and practical interest.

In the present context, that of demographically closed systems during the time of sampling, the local population trait in question is size, but in general (open systems) local population mortality and recruitment events are also of interest, the metapopulation summaries being local extinction and colonization probabilities, respectively. Note that mortality at the local population level is the aggregate of individual mortality and emigration processes, and recruitment at the local population level is the aggregate of individual recruitment and immigration processes. The relationships between local population processes and several important metapopulation parameters are given in table 1.

Note that demographic closure during sampling is not inconsistent with metapopulation theory, which requires that populations mix across time to some extent. In a demographically closed system, I view local population size as being a more general description of patch occupancy. The event that a patch is occupied is equivalent to the event that $N_i > 0$, and patch occupancy is $\Pr(N > 0)$ for a collection of homogeneous patches. In general, $\Pr(N > 0)$ is a function of density, and the variation in local population sizes as described shortly. Extinction and colonization events are also intimately linked to local abundance (among other things). Thus, models of variation in abundance are of more relevance than simply as a characterization of abundance per se.
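To make the link between abundance and patch occupancy concrete, the sketch below evaluates $\Pr(N > 0)$ under two abundance distributions. Both choices are assumptions made for illustration: a Poisson distribution with mean `lam`, and a negative binomial with mean `lam` and variance `lam + lam**2 / alpha`. The function names are hypothetical.

```python
import math


def occupancy_poisson(lam):
    """Pr(N > 0) when N ~ Poisson(lam): 1 - exp(-lam)."""
    return 1.0 - math.exp(-lam)


def occupancy_negbin(lam, alpha):
    """Pr(N > 0) when N is negative binomial with mean lam and
    variance lam + lam**2 / alpha: 1 - (alpha / (alpha + lam))**alpha."""
    return 1.0 - (alpha / (alpha + lam)) ** alpha


psi_pois = occupancy_poisson(1.0)
psi_nb = occupancy_negbin(1.0, alpha=1.0)
```

At the same mean density the over-dispersed model implies lower occupancy, because more of the probability mass sits at zero. This is one way to see that occupancy depends on the variation in local population sizes as well as on density.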

Probabilistic characterization of metapopulations

It is natural to express the notion of a metapopulation probabilistically, by imposing a probability distribution on abundance. This is expressed by

$$N_i \sim g(N \mid \theta)$$

where "$\sim$" is read "is distributed as" and $g(N \mid \theta)$ is some discrete probability density. The local populations may be independent, or not, but considerable simplicity arises when they are independent. Practically, independence means that individuals cannot occur in more than one local population (i.e., the $N_i$'s do not overlap). The generality that this probability characterization permits is that $\theta$ may be allowed to vary spatially in a number of ways, and any discrete probability density may be considered for $g(N \mid \theta)$.

The main practical benefit of this metapopulation view is that the metapopulation structure serves as a framework for combining a large number of spatially referenced count data surveys. In essence, this model is a prior distribution on abundance. More generally, I believe that the structure of the metapopulation is of fundamental interest. That is, the goal of many (if not most) studies of avian abundance can be formulated in terms of the metapopulation distribution or its summaries such as $E[N]$ (density), covariate effects (on density), etc.

The simplest example of a metapopulation model is that resulting from a uniform distribution of individuals across the landscape. Then, aggregating occurrence events into non–overlapping sample areas yields $N_i \sim \text{Poisson}(\lambda)$.

This seems a natural choice for describing variation in abundance because it arises under a homogeneous Poisson point process, the standard null distribution for the spatial arrangement of organisms. Moreover, it is also an assumption that underlies many common animal sampling methods (e.g., distance sampling). More importantly, there is an obvious and simple extension to accommodate a non–uniform distribution of individuals. One can consider that $\lambda$ varies spatially, for example:

$$N_i \sim \text{Poisson}(\lambda_i)$$

where $\log(\lambda_i) = b_0 + b_1 x_{i1}$ and $x_{i1}$ is the value of some covariate at site $i$. Several further extensions are also obvious. One is to allow for excess Poisson variation by inclusion of a random effect, $e_i$, as:

$$\log(\lambda_i) = b_0 + e_i$$

where $e_i \sim \text{Normal}(0, \sigma^2)$. Alternatively, a more natural model of over–dispersion for $N_i$ is the negative binomial distribution

$$N_i \sim \text{NegBin}(\lambda, \alpha)$$

with variance $\lambda + \lambda^2/\alpha$. In any spatial sampling problem, it is natural to consider the possibility that the spatial process is correlated, that is, that there exists latent structure beyond any covariates that are contained in the model. Royle et al. (2004b) consider a model in which the log–linear model for the mean contains a spatially indexed random effect that is (spatially) correlated. Such structure may be appealing in many animal abundance modeling problems where it is likely that habitat affinities are only known imprecisely, or there is limited ability to quantify the relevant habitat components.
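The log-linear abundance models described above are easy to simulate, which can help in judging how much spatial variation a given parameter setting implies. The sketch below is illustrative only: it combines the covariate effect and the normal random effect in a single linear predictor (an assumption; the text presents them separately), and the function names, seed handling and the simple Poisson sampler are hypothetical choices.

```python
import math
import random


def rpois(lam, rng):
    """Draw one Poisson(lam) variate by Knuth's multiplication method.

    Adequate for the small means typical of point-count data; it is slow
    and numerically poor for very large lam.
    """
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def simulate_abundance(x, b0, b1, sigma=0.0, seed=0):
    """Simulate N_i ~ Poisson(lam_i), log(lam_i) = b0 + b1 * x_i + e_i,
    with e_i ~ Normal(0, sigma^2); sigma = 0 gives the pure Poisson model."""
    rng = random.Random(seed)
    abundances = []
    for x_i in x:
        e_i = rng.gauss(0.0, sigma) if sigma > 0 else 0.0
        lam_i = math.exp(b0 + b1 * x_i + e_i)
        abundances.append(rpois(lam_i, rng))
    return abundances


# Ten sites along a made-up habitat gradient.
habitat = [i / 9 for i in range(10)]
N = simulate_abundance(habitat, b0=0.0, b1=1.0, sigma=0.5, seed=42)
```

Setting `b1 = 0` and `sigma = 0` recovers the homogeneous Poisson case, so the same function can be used to compare simulated counts under the null model and its extensions.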

Open systems

The focus of this paper is on modeling and estimation of abundance in demographically closed systems. The linkage between local abundance and patch occupancy in closed systems has been mentioned previously. However, similar relationships between other metapopulation attributes and abundance can also be made. For example, local colonization probability is $\Pr(N_{t+1} > 0 \mid N_t = 0)$ and local extinction probability is $\Pr(N_{t+1} = 0 \mid N_t > 0)$. In fact, one can characterize $\Pr(N_{t+1} \mid N_t)$ in general, for each discrete state $N_t$, which represents an important generalization over the current treatments of the problem that characterize occurrence as being the binary event that $N > 0$, extinction as the event that $N_{t+1} = 0 \mid N_t > 0$ and colonization as the event that $N_{t+1} > 0 \mid N_t = 0$. Under this coarse characterization of metapopulation dynamics, there is no consideration of density dependent mechanisms, and variation in abundance leads to heterogeneity in detection probability (Royle & Nichols, 2003) which must then be modeled indirectly. These issues are beyond the scope of this paper.

Table 1. Summary of metapopulation concepts.

| Time scale | Closure status | Local population attribute | Metapopulation parameter |
|---|---|---|---|
| Within year | Closed local pops. | Occurrence | Patch occupancy |
| Within year | Closed local pops. | Size | Density |
| Across years | Open local pops. | Mortality | Extinction probability |
| Across years | Open local pops. | Recruitment | Colonization probability |

Estimation and inference

The metapopulation description of local abundance as a random (spatial) process seems a natural way to describe spatially referenced populations and may be appealing to many ecologists. However, local abundance is never observed, instead being informed by survey data according to one of the many possible sampling protocols described in the "Notation and preliminaries" section (among others). Thus, it is necessary to incorporate this metapopulation model into a framework that is amenable to estimation and inference from data.

The metapopulation model is essentially a "random effects" distribution for local abundance, $N_i$. The classical approach to handling random effects (e.g., Laird & Ware, 1982) is to base inference on the marginal likelihood of the data, having removed the random effects from the likelihood by integration. In the multinomial sampling problems considered here, the integrated likelihood of $y_i$ is:

$$L(p, \theta \mid y_i) = \sum_{N_i = y_{i\cdot}}^{\infty} \frac{N_i!}{y_{i1}!\, y_{i2}!\, y_{i3}!\, y_{i0}!}\, \pi_1^{y_{i1}} \pi_2^{y_{i2}} \pi_3^{y_{i3}} \pi_0^{y_{i0}}\; g(N_i \mid \theta).$$

Integrated likelihood has been considered under similar models by Royle & Nichols (2003), Dorazio et al. (2004), Royle (2004b) and Royle et al. (2004).

The Poisson distribution seems to be the de facto standard for the distribution $g(N \mid \theta)$ as it can be used to justify analysis based on the aggregated counts, and its motivation as a random distribution of individuals in space (a homogeneous point process) is appealing. Subsequently, I will focus on the Poisson case. In this case, the integrated likelihood is:

$$L(p, \lambda \mid y_i) = \sum_{N_i = y_{i\cdot}}^{\infty} \frac{N_i!}{y_{i1}!\, y_{i2}!\, y_{i3}!\, y_{i0}!}\, \pi_1^{y_{i1}} \pi_2^{y_{i2}} \pi_3^{y_{i3}} \pi_0^{y_{i0}}\; \frac{e^{-\lambda} \lambda^{N_i}}{N_i!}$$

where $\pi_k$ are functions of $p$ (depending on the sampling protocol used). This does have a closed form that is more amenable to computation. In particular,

$$L(p, \lambda \mid y) = \prod_{i=1}^{R} \prod_{k=1}^{3} \frac{e^{-\lambda \pi_k} (\lambda \pi_k)^{y_{ik}}}{y_{ik}!}. \tag{5}$$

This is just the product of (independent) Poisson random variables. Maximization of (5) yields estimates of $\lambda$ or any covariate effects on abundance, and detection probability parameters. The fact that $\lambda$ appears as a product with each $\pi_k$ in Eq. (5) may lead one to question identifiability of model parameters. However, the $\pi_k$ are not freely varying parameters, but instead are constrained by the sampling protocol to depend on a smaller set of detection probability parameters. One can easily write down consistent moment estimators for $\lambda$ and detection probability parameters from Eq. (5) under the common sampling protocols.

It is a simple matter to maximize Eq. (5) numerically using conventional methods found in many popular software packages. For example, the free software package R (Ihaka & Gentleman, 1996) was used in the analyses of the "Applications" section (routines are available from the author upon request).
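As a rough sketch of such a numerical maximization (in Python rather than the author's R routines, which are not reproduced here), the code below fits the constant-$\lambda$, constant-$p$ Poisson model to removal counts. It assumes the removal cell probabilities $\pi_k = (1 - p)^{k-1} p$ and uses only the twelve sites listed in table 3, so the estimates are illustrative and will not reproduce table 4, which is based on all 70 locations. For fixed $p$ the Poisson likelihood is maximized at $\lambda = \sum_{i,k} y_{ik} / (R \sum_k \pi_k)$, so a one-dimensional grid search over $p$ suffices; the grid size is an arbitrary choice.

```python
import math

# Rows of table 3 (points 34-45): number first seen in each of four intervals.
counts = [
    [0, 0, 0, 0], [0, 0, 0, 0], [2, 0, 1, 0], [1, 0, 0, 0],
    [0, 1, 0, 0], [1, 1, 0, 0], [1, 0, 0, 0], [2, 1, 0, 0],
    [0, 1, 0, 0], [1, 2, 0, 0], [0, 0, 0, 1], [1, 0, 1, 0],
]

def removal_probs(p, n_intervals=4):
    """pi_k = (1 - p)^(k - 1) * p for k = 1, ..., n_intervals."""
    return [(1.0 - p) ** k * p for k in range(n_intervals)]

def log_lik(lam, p, data):
    """Log of Eq. (5) summed over sites: y_ik ~ Poisson(lam * pi_k)."""
    pi = removal_probs(p, len(data[0]))
    total = 0.0
    for y in data:
        for yk, pk in zip(y, pi):
            mu = lam * pk
            total += yk * math.log(mu) - mu - math.lgamma(yk + 1)
    return total

def lambda_given_p(p, data):
    """Closed-form maximizer of the Poisson likelihood in lam for fixed p."""
    total_count = sum(sum(y) for y in data)
    return total_count / (len(data) * sum(removal_probs(p, len(data[0]))))

def fit(data, grid_size=9999):
    """Profile-likelihood grid search over p in (0, 1)."""
    best = None
    for j in range(1, grid_size + 1):
        p = j / (grid_size + 1)
        lam = lambda_given_p(p, data)
        ll = log_lik(lam, p, data)
        if best is None or ll > best[2]:
            best = (lam, p, ll)
    return best

lam_hat, p_hat, ll_hat = fit(counts)
aic = 2 * 2 - 2 * ll_hat        # two parameters: lam and p
print(lam_hat, p_hat, aic)
```

A general-purpose optimizer would normally replace the grid search, particularly once covariates enter $\lambda_i$ or $p$ and the profile step is no longer available in closed form.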

A natural alternative to integrated likelihood for fitting random effects models is to adopt a Bayesian view and focus on characterizing the posterior distribution of the model unknowns conditional on the data using common Markov chain Monte Carlo (MCMC) methods. While this is straightforward in the present problem, I neglect those details here. While there are considerable philosophical differences between the two approaches, I believe that the main practical difference has to do with estimating the random effects (or summaries of them) and characterizing uncertainty in those estimates. This is discussed in the following section. In more complex models, such as when additional random effects are considered in a model for $\lambda_i$, estimation by integrated likelihood becomes difficult and so adopting a Bayesian formulation of the problem might become necessary (see Royle et al., 2004b).

Estimating abundance and related quantities

The MLE of $\lambda$, $\hat{\lambda}$, is an estimate of the prior mean abundance at a site. Or, in the case where $\lambda_i$ varies (e.g., with covariates), one obtains $\hat{\lambda}_i$ as a function of abundance covariates. To estimate $N_{total}$ note that, under the Poisson assumption on $N_i$, $N_{total} \sim \mathrm{Poisson}(R\lambda)$ and so

$$\hat{N}_{total} = R\hat{\lambda}$$

where $\hat{\lambda}$ is the MLE from the integrated likelihood.

Generally, interest may not be in the estimated prior means, but rather in estimating the realized abundance either for the collection of sample locations, or aggregated in some manner (over some spatial domain, or a collection of sample sites). For this, the classical method of estimating random effects is referred to as Best Unbiased Prediction (BUP). That is,

$$E[N_i \mid y_i, \theta] = \sum_{N} N \Pr(N \mid y_i, \theta)$$

where $\hat{\theta}$ is used in place of $\theta$. This is a simple calculation (see Royle, 2004a for an example).
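To make the prediction step concrete, the Python sketch below computes the conditional mean of $N_i$ given the counts at one site by direct summation, assuming a Poisson prior with mean $\lambda$ and the removal cell probabilities $\pi_k = (1 - p)^{k-1} p$. The plug-in values are the constant-model estimates reported in this paper ($\hat{\lambda} = 1.138$, $\hat{p} = 0.572$) and the counts are those of point 36 in table 3; the truncation bound `n_max` is an assumption. Under these assumptions the undetected individuals are Poisson with mean $\lambda \pi_0$, where $\pi_0 = 1 - \sum_k \pi_k$, so the sum can be checked against $\sum_k y_{ik} + \lambda \pi_0$.

```python
import math

def removal_probs(p, n_intervals=4):
    """pi_k = (1 - p)^(k - 1) * p for k = 1, ..., n_intervals."""
    return [(1.0 - p) ** k * p for k in range(n_intervals)]

def bup(y, lam, pi, n_max=200):
    """E[N | y] under N ~ Poisson(lam) and multinomial sampling with cells pi."""
    pi0 = 1.0 - sum(pi)               # probability of never being detected
    y_tot = sum(y)
    num = den = 0.0
    for n in range(y_tot, n_max + 1):
        n0 = n - y_tot                # undetected individuals
        # Pr(N = n | y) is proportional to lam^n * pi0^n0 / n0!
        w = math.exp(n * math.log(lam) + n0 * math.log(pi0) - math.lgamma(n0 + 1))
        num += n * w
        den += w
    return num / den

lam_hat, p_hat = 1.138, 0.572         # constant-model estimates from the text
pi_hat = removal_probs(p_hat)
y36 = [2, 0, 1, 0]                    # point 36 in table 3
n36_hat = bup(y36, lam_hat, pi_hat)
closed_form = sum(y36) + lam_hat * (1.0 - sum(pi_hat))
n_total_hat = 70 * lam_hat            # R * lam_hat for the R = 70 locations
print(n36_hat, closed_form, n_total_hat)
```

With detection this high over four intervals, the prediction sits only slightly above the observed total at the site.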


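The Bayesian treatment of the problem is more general in the sense that variation in $\theta$ is directly considered. For example, the Bayes estimator of $N_i$ is the posterior mean:

$$E[N_i \mid y] = \int E[N_i \mid y, \theta]\, f(\theta \mid y)\, d\theta$$

where $f(\theta \mid y)$ is the posterior distribution of $\theta$. In effect, the dependence on $\theta$ has been removed by integration. Consequently, one could expect the Bayes estimate of $N_i$ to be more variable than the BUP in practical sample sizes.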

Estimates of patch occupancy, say $\psi$, can also be obtained from these random effects models. For example, under the Poisson model for $N_i$,

$$\psi = \Pr(N_i > 0) = 1 - e^{-\lambda}$$
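As a small worked example (a sketch, not a result from the paper), the occupancy probability implied by a Poisson mean can be computed directly. The value 1.138 is the constant-model estimate of mean ovenbird abundance reported in the removal example; treating it as the Poisson mean $\lambda$ here is purely for illustration.

```python
import math

def occupancy_from_poisson_mean(lam):
    """Pr(N > 0) when N ~ Poisson(lam)."""
    return 1.0 - math.exp(-lam)

psi_hat = occupancy_from_poisson_mean(1.138)
print(round(psi_hat, 3))
```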

Goodness–of–fit and model selection

One convenient implication of the closed form likelihood given by Eq. (5) is that one can use conventional deviance statistics for Poisson data to assess goodness–of–fit (see Dorazio et al. [2005] and Royle et al. [2004a] for examples). Under negative binomial models, or when the likelihood is not multinomial, bootstrap procedures appear to be necessary (Dorazio et al., 2005; Royle, 2004b; Dodd & Dorazio, in press). Model selection based on integrated likelihood may be carried out using AIC (Burnham & Anderson, 1998) regardless of the form of the likelihood.

Applications

The modeling framework presented here can be easily applied to any of the common bird sampling protocols described previously. To illustrate, we consider application to data collected using a conventional point count protocol, and also data collected according to a temporal removal protocol. A comprehensive analysis of a large–scale capture–recapture data set is considered by Royle et al. (2004b) and an application to distance sampling data is given by Royle et al. (2004a). Dodd & Dorazio (in press) provide a comprehensive integrated–likelihood analysis of frog count data collected according to a point counting protocol.

Point counts

Point counts are often considered to be of marginal value to statisticians with an interest in conventional modeling of marked animal data because there is a widespread misperception that information on abundance cannot be disentangled from detection probability. Royle (2004a) showed that if point counts are spatially and temporally replicated within a demographically closed system, then the integrated likelihood methods described in the "Estimation and inference" section can be used to effectively model both detection and abundance effects.

An important distinction between the point count protocol and the others considered previously is that temporal replication is necessary to estimate detection from simple point counts. This is because given simple binomial counts, $y_i$, with index $N_i$ and probability $p$, where $N_i$ are independent random variables from $g(N \mid \theta)$, $p$ appears as a product with the location parameter of $g$ in the integrated likelihood. For example, in the Poisson case with mean $\lambda$, the marginal distribution (the integrated likelihood) of $y_i$ is Poisson with mean $p\lambda$. Royle et al. (unpublished report) gave a heuristic explanation to demonstrate that additional information from spatial and temporal replication is available. In particular, a moment estimator for $p$ is simply the correlation between counts made in successive sample periods, i.e.,

$$\tilde{p} = \mathrm{corr}(y_1, y_2) \qquad (6)$$

for counts made at two sampling occasions. Then, $\tilde{\lambda}$ is $\bar{y}/\tilde{p}$.
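The moment estimator can be illustrated by simulation. Under the Poisson model the covariance between two counts at a site is $p^2 \lambda$ and the variance of each count is $p\lambda$, so their correlation is $p$. The Python sketch below simulates replicated counts with hypothetical values of $\lambda$, $p$ and the number of sites (far more sites than any real survey, so that sampling error is small), then applies the correlation-based estimator. The helper functions are written for this illustration and use only the standard library.

```python
import math
import random

def rpois(lam, rng):
    """Poisson draw by Knuth's multiplication method (adequate for small lam)."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def rbinom(n, p, rng):
    """Binomial draw as a sum of n Bernoulli trials."""
    return sum(1 for _ in range(n) if rng.random() < p)

def correlation(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def moment_estimates(y1, y2):
    """p from the between-occasion correlation, lambda from mean count / p."""
    p_tilde = correlation(y1, y2)
    y_bar = (sum(y1) + sum(y2)) / (len(y1) + len(y2))
    return p_tilde, y_bar / p_tilde

rng = random.Random(1)
true_lam, true_p, n_sites = 2.0, 0.5, 20000    # hypothetical values
N = [rpois(true_lam, rng) for _ in range(n_sites)]
y1 = [rbinom(n, true_p, rng) for n in N]       # first count at each site
y2 = [rbinom(n, true_p, rng) for n in N]       # second count, same N (closure)
p_tilde, lam_tilde = moment_estimates(y1, y2)
print(p_tilde, lam_tilde)
```

With only 50 sites, as in the example that follows, the correlation is estimated much less precisely, which is one reason to prefer the likelihood-based estimates.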

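More formally, the integrated likelihood under the replicated point count protocol is

$$L(p, \theta \mid y) = \prod_{i=1}^{R} \left\{ \sum_{N_i = \max_t y_{it}}^{\infty} \left[ \prod_{t=1}^{T} \mathrm{Bin}(y_{it};\, N_i, p) \right] g(N_i \mid \theta) \right\} \qquad (7)$$

$p$ can vary as a function of covariates, and even temporally, but we neglect that generality here. Note that Eq. (7) does not have a closed form, contrary to the multinomial likelihood case that yields Eq. (5).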
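Because Eq. (7) has no closed form, the sum over $N_i$ is evaluated numerically in practice. The Python sketch below shows one way to do this under a Poisson model for abundance, truncating the sum at an assumed upper bound `n_max`; the counts and parameter values are hypothetical, and this is not the author's implementation. As a check, with a single visit the site likelihood should reduce to a Poisson probability with mean $p\lambda$.

```python
import math

def binom_pmf(y, n, p):
    return math.comb(n, y) * p ** y * (1.0 - p) ** (n - y)

def poisson_pmf(k, mu):
    return math.exp(k * math.log(mu) - mu - math.lgamma(k + 1))

def site_likelihood(y_site, lam, p, n_max=100):
    """Inner sum of Eq. (7) for one site, with N ~ Poisson(lam)."""
    total = 0.0
    for n in range(max(y_site), n_max + 1):
        detection = math.prod(binom_pmf(y, n, p) for y in y_site)
        total += detection * poisson_pmf(n, lam)
    return total

def log_likelihood(data, lam, p, n_max=100):
    """Log of Eq. (7): sum of log site likelihoods over all sites."""
    return sum(math.log(site_likelihood(y_site, lam, p, n_max)) for y_site in data)

data = [[1, 0], [2, 2], [0, 0], [1, 3]]    # hypothetical: 4 sites x 2 visits
ll = log_likelihood(data, lam=2.0, p=0.5)
single_visit = site_likelihood([3], 2.0, 0.4)
print(ll, single_visit, poisson_pmf(3, 0.8))
```

The bound `n_max` should be large relative to $\lambda$; a quick sensitivity check is to increase it and confirm that the log-likelihood does not change.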

Data considered here are a subset of those analyzed by Royle (2004a) consisting of replicated point counts at 50 stops along a North American Breeding Bird Survey route. The point counts were replicated 11 times within approximately a one month period during the breeding season. Here, we consider only the first two counts for all 50 stops. Poisson and negative binomial models were considered for abundance. Under the Poisson model, the moment estimates of $p$ and $\lambda$ were also computed ($\tilde{p}$ and $\tilde{\lambda}$, based on Eq. (6)). For comparison, an abundance index being the mean (across sites) of the maximum count (over the two samples) was also computed. The 4 species considered are: Ovenbird (Seiurus aurocapillus), Hermit thrush (Catharus guttatus), Wood thrush (Hylocichla mustelina) and American robin (Turdus migratorius). Results of the model fitting are given in table 2, along with AIC scores (Burnham & Anderson, 1998).

Generally, the overdispersed negative binomial appears favored (except for the Hermit thrush). Estimated mean abundance differs considerably from that reported by Royle (2004a) based on analysis of all 11 replicates. This is consistent with lack of closure over the longer time period or higher rates of temporary emigration, which is why I have restricted attention to the first two replicate observations here.

The main purpose of this example is to demonstrate that it is feasible to estimate abundance from simple point counts while controlling for (i.e., modeling) detection probability. In point count sampling, there is some advantage to reducing the time interval between counts to the extent possible in order to minimize temporary emigration, which leads to some complication in interpreting $\lambda$ as density (applicable to a known area). Thus, consecutive counts (e.g., consecutive three minute point counts) may be the best strategy for implementing the point count estimator.

Removal counts

Next we consider avian point count data collected in Frederick County, Maryland. The data were collected at 70 locations within a large forest tract, according to a conventional removal sampling protocol (Farnsworth et al., 2002) with four sample intervals of length three minutes. The main objective was to evaluate the effect of two habitat covariates: understory foliage cover (UFC) and the basal area of large trees (BA). See Royle et al. (2004) for further description and an alternative analysis of some of these data.

We focus here on data for the Ovenbird (Seiurus aurocapillus). The data for each sample point are $y_i = (y_{i1}, y_{i2}, y_{i3}, y_{i4})$ where $y_{ik}$ is the number of males first seen in interval $k$. For this illustration, we assume that detection probability, $p$, is constant so that the multinomial cell probabilities are:

$$\pi_1 = p$$

$$\pi_2 = (1 - p)\,p$$

$$\pi_3 = (1 - p)^2\,p$$

$$\pi_4 = (1 - p)^3\,p$$
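The removal cell probabilities follow a geometric pattern, which a short Python sketch makes explicit. The function name is chosen for this illustration; the value 0.572 is the detection estimate reported in table 4. The sum of the four cells is the probability that a bird is detected at all during the count, $1 - (1 - p)^4$.

```python
def removal_cell_probs(p, n_intervals=4):
    """pi_k = (1 - p)^(k - 1) * p: missed in the first k - 1 intervals, seen in the k-th."""
    return [(1.0 - p) ** (k - 1) * p for k in range(1, n_intervals + 1)]

pi = removal_cell_probs(0.572)      # p-hat from table 4
p_detected = sum(pi)                # probability of detection in any interval
print([round(x, 4) for x in pi], round(p_detected, 4))
```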

Several covariates were collected that are thought to influence $p$ (e.g., time of day) and a more complete analysis of these data is in progress. Here, we consider only the habitat effects on abundance. That reasonable covariates on both detection and abundance can be identified is important motivation for considering the mixture models elaborated on in the "Estimation and inference" section. Removal data from several sites are shown in table 3, highlighting the typical small sample data sets that arise from local scale bird counting.

Models were fit using the Poisson metapopulation model assuming that

$$N_i \sim \mathrm{Poisson}(\lambda_i)$$

where

$$\log(\lambda_i) = b_0 + b_1 \mathrm{UFC}_i + b_2 \mathrm{BA}_i$$

Results for several models are summarized in table 4, including AIC scores for evaluating the relative merits of each model.

For example, under the constant model $\hat{\lambda} = 1.138$ (SE = 0.093), or 1.138 male ovenbirds per point count sample. Point counts in this study were of radius 100 m, so one could interpret this as density if so inclined. More importantly, the habitat effects appear important so that density changes as a function of UFC and BA. There is a large positive effect of UFC and negative effect of BA. Because ovenbirds are ground nesters, and therefore would benefit from protection afforded by understory foliage, these results appear sensible. Also, the fact that the model containing both effects was not favored is not unexpected because UFC and BA are negatively correlated. These results are broadly consistent with those reported by Royle et al. (2004a) obtained using a distance sampling protocol (data were collected in a manner consistent with multiple protocols).
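The link between the coefficients in table 4 and the abundance scale can be sketched in a few lines of Python. The estimates and AIC values below are copied from table 4; the function name and the default covariate values of zero are assumptions of this sketch, and the scaling of UFC and BA is not stated in this section, so only the constant model is converted to a mean abundance. Exponentiating the tabulated intercept recovers the reported 1.138 up to rounding of $b_0$.

```python
import math

# Estimates from table 4: (b0, b_UFC, b_BA, AIC); None marks a term not in the model.
table4 = {
    "Constant":   (0.130, None, None, 303.86),
    "+ UFC":      (0.113, 1.859, None, 303.32),
    "+ BA":       (0.106, None, -0.829, 302.35),
    "+ UFC + BA": (0.102, 1.042, -0.643, 303.72),
}

def expected_abundance(b0, b_ufc=None, b_ba=None, ufc=0.0, ba=0.0):
    """lambda_i = exp(b0 + b1 * UFC_i + b2 * BA_i); absent terms contribute 0."""
    eta = b0 + (b_ufc or 0.0) * ufc + (b_ba or 0.0) * ba
    return math.exp(eta)

lam_constant = expected_abundance(table4["Constant"][0])
best_aic = min(row[3] for row in table4.values())
delta_aic = {name: round(row[3] - best_aic, 2) for name, row in table4.items()}
print(round(lam_constant, 3), delta_aic)
```

The AIC differences are all small, so the table gives only modest grounds for preferring one model over the others.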

Conclusions

In this paper I have considered the problem of modeling spatially replicated avian count data that are collected according to many common sampling protocols. These include methods that yield a multinomial sampling distribution including conventional capture–recapture methods, multiple observer sampling, temporary removal and even simple point counts. One important statistical consideration is that data are frequently sparse (low counts and many zeros), owing to generally low densities of most breeding birds, and small sample areas. In addition, the likelihood under spatial replication may contain a large number of abundance parameters ($N_i$ for each sample location) that render it intractable using conventional methods.

Table 2. MLEs and AIC for Poisson and negative binomial hierarchical models fitted to the avian point count data. The "Moment" columns give the Poisson moment estimates $\tilde{p}$ and $\tilde{\lambda}$.

                          Moment            Poisson MLE                 Negative binomial MLE
Species         Index     p      lambda     p      lambda   AIC         p      lambda   alpha   AIC
Ovenbird        0.96      0.53   1.25       0.43   1.53     215.68      0.33   2.01     1.69    213.70
Hermit thrush   0.10      0.55   0.13       0.55   0.13      47.82      0.55   0.13     inf      49.82
Wood thrush     0.52      0.60   0.63       0.58   0.65     150.59      0.53   0.72     0.78    147.44
Am. robin       1.12      0.46   1.65       0.38   2.02     241.73      0.17   4.53     1.21    235.16

Conventional methods of analyzing bird count data often focus on estimating total abundance over the collection of sample locations. Under this limited treatment of the problem, variation at the level of the sample location is, in effect, averaged out. Covariates cannot be considered, and one must consider spatial scale in deciding how to aggregate data. Importantly, aggregation may only be justifiable under certain spatial homogeneity assumptions. For example, if local abundance (at the level of the sample locations) is assumed to be Poisson with constant mean, then aggregation can be justified. However, this may not be a reasonable assumption in many problems.

Table 3. Ovenbird removal data (number first seen in four consecutive intervals).

            t = 1   t = 2   t = 3   t = 4
point 34    0       0       0       0
point 35    0       0       0       0
point 36    2       0       1       0
point 37    1       0       0       0
point 38    0       1       0       0
point 39    1       1       0       0
point 40    1       0       0       0
point 41    2       1       0       0
point 42    0       1       0       0
point 43    1       2       0       0
point 44    0       0       0       1
point 45    1       0       1       0

Table 4. Results of models fit to ovenbird counts obtained under a temporary removal protocol.

Model         p       b0      UFC     BA       AIC
Constant      0.572   0.130                    303.86
+ UFC         0.572   0.113   1.859            303.32
+ BA          0.572   0.106           –0.829   302.35
+ UFC + BA    0.572   0.102   1.042   –0.643   303.72

Alternatively, the spatial attribution of the data is an important consideration in many studies, and can be exploited to develop more general models for describing abundance. For example, the goal of many studies is to estimate abundance covariate effects. And, factors that influence detectability may also vary among sample locations. Explicitly acknowledging spatial variation in local abundance facilitates investigation of these possibilities.

The solution to the problem of modeling spatially replicated data proposed here is to view local abundance as a random process. Then, attention can be focused on developing a model for the variation in local abundance free of detection probability considerations. This is appealing in the context of familiar metapopulation ideas that seek to characterize the structure among spatially referenced (local) populations that constitute the metapopulation. Taken together, the data model (the multinomial likelihood) and metapopulation model define a simple hierarchical model for which formal and rigorous methods of analysis are possible. For example, one can estimate parameters and conduct inference based on the integrated likelihood (having removed the random effects by integration). Alternatively, Bayesian analysis based on the posterior distribution is relatively straightforward.

The generality of the proposed modeling strategy is appealing. Mean abundance ($\lambda$ under the Poisson model) may be parameterized in terms of additional parameters that describe variation in the Poisson mean (and hence abundance), and there is no need even to restrict attention to a Poisson random effects distribution. Such generality is easily dealt with formally within the context of the hierarchical model specification.

Two brief examples were given to demonstrate how a classical analysis of such models might proceed. The first example made use of simple point counts (replicated temporally) and considered a simple constant detection model and both Poisson and negative binomial models for local abundance. In the second example, data collected according to a removal sampling protocol were considered. In that example, habitat covariates were considered as possible effects on local abundance.

Extension to demographically open systems

Considerable generality can be achieved by considering extensions of hierarchical abundance models to systems that are demographically open, such as might occur if sampling is conducted during the breeding season in multiple years. There are several interesting "open population" situations that may be considered: (1) Many monitoring programs that generate counts in multiple years may not yield information on individual animals across years. This is common of most "point counting" surveys. In this situation, a simple metapopulation model structure such as $N_{i,t} \sim \mathrm{Poi}(\gamma_t N_{i,t-1})$ may be useful for integrating data across years. Moreover, such models facilitate a characterization of metapopulation dynamics that represents a generalization over methods considered by, for example, MacKenzie et al. (2003) that are based on detection/non–detection data; (2) A common lack of closure is due to the phenomenon of "temporary emigration". In this case, let $M_i$ be the size of some super–population located at sample location $i$. Let $N_{i,t} \sim \mathrm{Bin}(M_i, \phi)$ be the number of individuals available for sampling during occasion $t$ at site $i$. Finally, let $y_{i,t}$ be the multinomial data with index $N_{i,t}$ collected according to one of the standard sampling protocols. Note that $N_{i,t}$ may be removed by integration so that, marginally, $y_{i,t}$; $t = 1, 2, \ldots$ are multinomial random variables with index $M_i$ and cell probabilities $\phi \pi_k$. Consequently, the joint likelihood of the data is a product multinomial, similar to that described in the "Point counts" section. However, here the temporal replication, combined with some protocol other than simple point counts, allows estimation of the additional parameter $\phi$, which is 1 minus the temporary emigration probability; (3) The third type of open scenario is that in which there exists encounter information on individual animals across years such as that arising from sampling based on networks of mist net stations. In this case, $N_t$ must be decomposed into a survival component and a recruitment component where the survival component is $\mathrm{Bin}(N_{t-1}, \phi_t)$ and the recruitment component is $\mathrm{Poi}(N_{t-1} \gamma_t)$. Note that individual encounter information is directly informative about $\phi_t$ whereas a spatial model for abundance is informative about the total of the survival and recruitment processes. It stands to reason that such models will yield improved estimates of (local) survival and recruitment parameters.
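The marginalization in the temporary emigration case (2) can be checked numerically in its simplest form, a single binomial count. Writing $\phi$ for the probability that an individual is available (a symbol chosen for this sketch) and $p$ for detection, summing $N \sim \mathrm{Bin}(M, \phi)$ out of $y \mid N \sim \mathrm{Bin}(N, p)$ should leave $y \sim \mathrm{Bin}(M, \phi p)$. The Python sketch below verifies this for hypothetical values of $M$, $\phi$ and $p$.

```python
import math

def binom_pmf(y, n, p):
    return math.comb(n, y) * p ** y * (1.0 - p) ** (n - y)

def marginal_pmf(y, m, phi, p):
    """Pr(y) after summing N ~ Bin(m, phi) out of y | N ~ Bin(N, p)."""
    return sum(binom_pmf(n, m, phi) * binom_pmf(y, n, p) for n in range(y, m + 1))

m, phi, p = 12, 0.7, 0.4     # hypothetical super-population size and probabilities
max_diff = max(abs(marginal_pmf(y, m, phi, p) - binom_pmf(y, m, phi * p))
               for y in range(m + 1))
print(max_diff)
```

The same argument applies cell by cell to the multinomial protocols, which is why availability and detection enter the marginal likelihood only through products and cannot be separated without temporal replication.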

Acknowledgements

The author would like to thank Deanna K. Dawson, USGS Patuxent Wildlife Research Center, and Scott Bates, U.S. National Park Service, for the data collected in Frederick County, Maryland.

References

Burnham, K. P. & Anderson, D. R., 1998. Model Selection and Inference: A Practical Information–Theoretic Approach. Springer–Verlag, New York.

Cook, R. D. & Jacobson, J. O., 1979. A design for estimating visibility bias in aerial surveys. Biometrics, 35: 735–742.

Dodd, C. K. & Dorazio, R. M., 2004. Using point–counts to simultaneously estimate abundance and detection probabilities in a salamander community. Herpetologica, 60: 68–78.

Dorazio, R. M., Jelks, H. & Jordan, F., 2005. Improving removal–based estimates of abundance by sampling spatially distinct subpopulations. Biometrics (to appear).

Farnsworth, G. L., Pollock, K. H., Nichols, J. D., Simons, T. R., Hines, J. E. & Sauer, J. R., 2002. A removal model for estimating detection probabilities from point–count surveys. Auk, 119(2): 414–425.

Ihaka, R. & Gentleman, R., 1996. R: A language for data analysis and graphics. Journal of Computational and Graphical Statistics, 5: 299–314.

Laird, N. M. & Ware, J. H., 1982. Random–effects models for longitudinal data. Biometrics, 38: 963–974.

Hanski, I. A. & Gilpin, M. E. (Ed.), 1997. Metapopulation biology: ecology, genetics, and evolution. Academic Press, San Diego, U.S.A.

Levins, R., 1969. Some demographic and genetic consequences of environmental heterogeneity for biological control. Bulletin of the Entomological Society of America, 15: 237–240.

MacKenzie, D. I., Nichols, J. D., Hines, J. E., Knutson, M. G. & Franklin, A. D., 2003. Estimating site occupancy, colonization and local extinction probabilities when a species is detected imperfectly. Ecology, 84: 2200–2207.

Nichols, J. D., Hines, J. E., Sauer, J. R., Fallon, F. W., Fallon, J. E. & Heglund, P. J., 2000. A double–observer approach for estimating detection probability and abundance from point counts. Auk, 117(2): 393–408.

Rosenstock, S. S., Anderson, D. R., Giesen, K. M., Leukering, T. & Carter, M. F., 2002. Landbird counting techniques: Current practices and an alternative. Auk, 119(1): 46–53.

Royle, J. A. & Nichols, J. D., 2003. Estimating Abundance from Repeated Presence Absence Data or Point Counts. Ecology, 84: 777–790.

Royle, J. A., 2004a. N–Mixture Models for estimating population size from spatially replicated counts. Biometrics, 60: 108–115.

Royle, J. A., 2004b. Modeling abundance index data from anuran calling surveys. Conservation Biology (to appear).

Royle, J. A., Connery, B. & Sharp, M., 2005. Estimating avian abundance from simple point counts. U. S. FWS (unpublished report).

Royle, J. A., Dawson, D. K. & Bates, S., 2004a. Estimating abundance effects in distance sampling models. Ecology (to appear).

Royle, J. A., Kéry, M., Schmid, H. & Gautier, R., 2004b. Spatial modeling of avian abundance. Unpublished report.

Sanathanan, L., 1972. Estimating the size of a multinomial population. Annals of Mathematical Statistics, 43: 142–152.

Williams, B. K., Nichols, J. D. & Conroy, M. J., 2002. Analysis and management of animal populations. Academic Press, San Diego, California, U.S.A.

