A comparison of conditional autoregressive models used in Bayesian disease mapping

Spatial and Spatio-temporal Epidemiology 2 (2011) 79–89

journal homepage: www.elsevier.com/locate/sste

Duncan Lee

School of Mathematics and Statistics, University Gardens, University of Glasgow, Glasgow G12 8QW, United Kingdom

E-mail address: [email protected]

Article info

Article history:
Received 14 October 2010
Revised 14 February 2011
Accepted 7 March 2011
Available online 12 March 2011

Keywords:
Conditional autoregressive models
Disease mapping
Spatial correlation

1877-5845/$ - see front matter © 2011 Elsevier Ltd. All rights reserved.
doi:10.1016/j.sste.2011.03.001

Abstract

Disease mapping is the area of epidemiology that estimates the spatial pattern in disease risk over an extended geographical region, so that areas with elevated risk levels can be identified. Bayesian hierarchical models are typically used in this context, which represent the risk surface using a combination of available covariate data and a set of spatial random effects. These random effects are included to model any overdispersion or spatial correlation in the disease data, that has not been accounted for by the available covariate information. The random effects are typically modelled by a conditional autoregressive (CAR) prior distribution, and a number of alternative specifications have been proposed. This paper critiques four of the most common models within the CAR class, and assesses their appropriateness via a simulation study. The four models are then applied to a new study mapping cancer incidence in Greater Glasgow, Scotland, between 2001 and 2005.

© 2011 Elsevier Ltd. All rights reserved.

1. Introduction

Modelling data that relate to contiguous spatial units, such as electoral wards or pixels, is a common problem in a number of statistical applications, including disease mapping (MacNab et al., 2006), geographical association studies (Lee et al., 2009), image analysis (Molina et al., 1999) and agricultural field trials (Besag and Higdon, 1999). The response variables in these applications typically display spatial dependence, that is, observations from units close together are more similar than those relating to units further apart. A number of statistical approaches have been adopted for modelling spatial correlation in such data, including geostatistical models (Biggeri et al., 2006), simultaneously autoregressive models (Kissling and Carl, 2008) and conditional autoregressive models (MacNab, 2003).

In this paper we focus on disease mapping, the aim of which is to map the spatial pattern in disease risk over a predefined study region. Bayesian hierarchical models are typically used in such analyses, where any spatial correlation in the disease data is modelled at the second level of the hierarchy by a set of random effects. These effects are most commonly represented by a conditional autoregressive (CAR) prior distribution, which is a type of Markov random field. A number of models have been proposed within this general class of CAR priors, including the intrinsic and convolution models (both Besag et al., 1991), as well as alternatives proposed by Cressie (1993) and Leroux et al. (1999). However, to our knowledge, no formal comparison has been made of the appropriateness of each of these prior models. Therefore this paper presents such a critique, by comparing both their theoretical properties and practical performance.

The remainder of this paper is organised as follows. Section 2 provides a background to Bayesian disease mapping, while Section 3 presents the four commonly used CAR prior models. Section 4 compares the performance of the four models via simulation, while Section 5 applies them to a new disease mapping study of cancer incidence in Greater Glasgow, UK. Finally, Section 6 contains a concluding discussion and areas for future work.

2. Disease mapping

In disease mapping studies the region of interest is split into $n$ contiguous small-areas, such as census tracts or electoral wards, and the aim of the study is to detect which areas exhibit elevated disease risks. The observed numbers of disease cases in each small-area are collectively denoted by $y = (y_1, \ldots, y_n)$, where $y_k$ denotes the number of cases in area $k$. To fairly assess which areas exhibit elevated levels of disease risk, the numbers of cases expected to occur in each small-area are calculated. These expected numbers of cases are denoted by $E = (E_1, \ldots, E_n)$, and are based on the size and demographic structure of the population living within each small-area. They are calculated by dividing the population living in each small-area into a number of strata, based on their age and sex. The number of people in each stratum is multiplied by the incidence rate for that stratum, and the results are summed over strata to produce the expected number of cases. From these data the simplest measure of disease risk is the standardised incidence ratio (SIR), which is calculated for area $k$ as $\mathrm{SIR}_k = y_k / E_k$. Values above one represent areas with elevated levels of disease risk, while values below one correspond to comparatively healthy areas. However, elevated risks (as measured by the SIR) are likely to happen by chance if $E_k$ is small, which can occur if the disease in question is rare and/or the population at risk is small.
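The standardisation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the four strata, the reference rates and the case counts are hypothetical and are not taken from the study, and the function names are ours.

```python
def expected_cases(stratum_populations, stratum_rates):
    """Expected cases for one area: population times reference rate, summed over strata."""
    return sum(pop * rate for pop, rate in zip(stratum_populations, stratum_rates))


def sir(observed, expected):
    """Standardised incidence ratio y_k / E_k for one area."""
    return observed / expected


# Hypothetical reference incidence rates (per person) for four age-sex strata.
rates = [0.001, 0.002, 0.010, 0.015]

# Hypothetical populations by stratum for two small areas.
area_a = [1000, 1200, 500, 300]   # larger population
area_b = [10, 12, 5, 3]           # same structure, 100 times smaller

E_a = expected_cases(area_a, rates)
E_b = expected_cases(area_b, rates)

# One extra case barely moves the SIR in the large area,
# but it dominates the SIR in the small area.
print("large area:", sir(13, E_a), sir(14, E_a))
print("small area:", sir(0, E_b), sir(1, E_b))
```

In this hypothetical example a single case in the small area produces an SIR far above one, which is the instability for small $E_k$ that motivates the model-based approach.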

To overcome this problem a Bayesian model-based approach is typically adopted, which estimates the set of disease risks using covariate information and a set of random effects. The random effects borrow strength from values in neighbouring areas, which reduces the likelihood of excesses in risk occurring by chance. The class of Bayesian hierarchical models typically used in this context have been described in detail by Elliott et al. (2000), Banerjee et al. (2004) and Lawson (2008), and a general formulation is given by

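$$
\begin{aligned}
Y_k \mid E_k, R_k &\sim \text{Poisson}(E_k R_k) \quad \text{for } k = 1, \ldots, n,\\
\ln(R_k) &= \mu + x_k^{\mathrm{T}} \beta + \phi_k,\\
\beta_i &\sim \mathrm{N}(0, 10) \quad \text{for } i = 1, \ldots, p,\\
\mu &\sim \mathrm{N}(0, 10).
\end{aligned}
\tag{1}
$$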

In the above model $R_k$ denotes the risk of disease in area $k$, which is modelled by an intercept term $\mu$, a set of $p$ covariates $x_k^{\mathrm{T}} = (x_{k1}, \ldots, x_{kp})$ and a random effect $\phi_k$. The regression parameters $\beta = (\beta_1, \ldots, \beta_p)$ and the intercept term $\mu$ are assigned weakly informative Gaussian prior distributions, with mean zero and variance 10. The random effects are included to model any overdispersion and/or spatial correlation in the data, that persist after adjusting for the available covariate information. Overdispersion can occur because the Poisson likelihood enforces the restriction that $\mathrm{Var}[Y_k] = \mathrm{E}[Y_k]$, whereas in most studies of this type, $\mathrm{Var}[Y_k] > \mathrm{E}[Y_k]$. The existence of overdispersion and spatial correlation is likely due to the presence of unmeasured risk factors, which thus cannot be included as covariates in the model. Inference for this type of model is typically based on Markov chain Monte-Carlo (MCMC) simulation, utilising a combination of Gibbs sampling and Metropolis–Hastings steps.
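To see why a random effect in model (1) accommodates overdispersion, it may help to write out the marginal moments of $Y_k$. The following is a short worked derivation using the laws of total expectation and total variance, treating $E_k$ as a fixed constant and $R_k$ as random through $\phi_k$; the notation $\mathrm{E}[\cdot]$ for expectation is used to keep it distinct from the expected count $E_k$.

```latex
\begin{align*}
\mathrm{E}[Y_k]   &= \mathrm{E}\{\mathrm{E}[Y_k \mid R_k]\} = E_k\,\mathrm{E}[R_k],\\
\mathrm{Var}[Y_k] &= \mathrm{E}\{\mathrm{Var}[Y_k \mid R_k]\} + \mathrm{Var}\{\mathrm{E}[Y_k \mid R_k]\}\\
                  &= E_k\,\mathrm{E}[R_k] + E_k^{2}\,\mathrm{Var}[R_k]\\
                  &= \mathrm{E}[Y_k] + E_k^{2}\,\mathrm{Var}[R_k] \;\geq\; \mathrm{E}[Y_k].
\end{align*}
```

The inequality is strict whenever $\mathrm{Var}[R_k] > 0$, so any variation in the risks that is not explained by the covariates inflates the marginal variance above the Poisson mean.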

3. Conditional autoregressive models

In disease mapping studies the random effects $\phi = (\phi_1, \ldots, \phi_n)$ are commonly modelled by the class of conditional autoregressive (CAR) prior distributions, which are a type of Markov random field model. These models are specified by a set of $n$ univariate full conditional distributions $f(\phi_k \mid \phi_{-k})$ (where $\phi_{-k} = (\phi_1, \ldots, \phi_{k-1}, \phi_{k+1}, \ldots, \phi_n)$), for $k = 1, \ldots, n$, rather than by a single multivariate distribution $f(\phi)$. Spatial correlation between the random effects is determined by a binary $n \times n$ neighbourhood matrix $W$, whose $jk$th element $w_{jk}$ is equal to one if areas $(j, k)$ are defined to be neighbours, and is zero otherwise. If two areas are defined to be neighbours their random effects are correlated, while random effects in non-neighbouring areas are modelled as being conditionally independent given the remaining elements of $\phi$. The most common approach is to assume that areas $(j, k)$ are neighbours (i.e., $w_{jk} = 1$) if and only if they share a common border, which is denoted in this paper by $j \sim k$. A number of different conditional autoregressive prior models have been proposed in a disease mapping context, and the remainder of this section describes the four that are most commonly used.
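As a small illustration of the border-sharing rule, the sketch below builds $W$ from a list of shared borders. The 2 x 2 grid of areas and the function name `neighbourhood_matrix` are hypothetical choices made for this example only.

```python
def neighbourhood_matrix(n, borders):
    """Binary n x n matrix W with w_jk = 1 if areas j and k share a border."""
    W = [[0] * n for _ in range(n)]
    for j, k in borders:
        if j == k:
            raise ValueError("an area cannot be its own neighbour")
        W[j][k] = 1
        W[k][j] = 1   # the neighbour relation j ~ k is symmetric
    return W


# Hypothetical study region: a 2 x 2 grid of areas numbered
#   0 1
#   2 3
# where two areas are neighbours if they share an edge (not only a corner).
borders = [(0, 1), (0, 2), (1, 3), (2, 3)]
W = neighbourhood_matrix(4, borders)
n_k = [sum(row) for row in W]   # number of neighbours of each area

for row in W:
    print(row)
print("neighbours per area:", n_k)
```

The row sums give the neighbour counts $n_k$ that appear in the conditional variances of the CAR priors described in this section.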

3.1. Intrinsic model

The simplest CAR prior is the intrinsic autoregressive (IAR) model, which was proposed by Besag et al. (1991) and has full conditional distributions given by

$$
\phi_k \mid \phi_{-k}, W, \tau_I^2 \sim \mathrm{N}\left( \frac{1}{n_k} \sum_{j \sim k} \phi_j,\; \frac{\tau_I^2}{n_k} \right).
\tag{2}
$$

The conditional expectation of $\phi_k$ is equal to the mean of the random effects in neighbouring areas, while the conditional variance is inversely proportional to the number of neighbours $n_k$. This variance structure recognises the fact that in the presence of strong spatial correlation, the more neighbours an area has the more information there is in the data about the value of its random effect. The variance parameter $\tau_I^2$ controls the amount of variation between the random effects, and the choice of hyperprior is discussed in Section 3.5.

The above model is the simplest possible CAR prior, and as a result is rather restrictive. Its single parameter does not determine the strength of the spatial correlation between the random effects, because multiplying each $\phi_k$ by 10 will increase $\tau_I^2$ but leave the spatial correlation structure unchanged. Therefore model (2) can only represent strong spatial correlation structures, and is hence not appropriate if the data are only weakly correlated. In addition, the joint distribution for $f(\phi)$ corresponding to (2) is improper, because it is possible to add a constant to each $\phi_k$ without changing the distribution. This impropriety can be remedied by enforcing a constraint such as $\sum_{j=1}^{n} \phi_j = 0$, which can be implemented numerically at each iteration of an MCMC algorithm.
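The full conditional in (2) and the sum-to-zero constraint can be illustrated with a minimal sketch. The chain of four areas, the values of $\phi$ and the helper names are hypothetical, and centring by subtracting the sample mean is one simple way to impose the constraint, assumed here for illustration rather than taken from the paper.

```python
def iar_full_conditional(k, phi, W, tau2):
    """Mean and variance of phi_k given the rest under the intrinsic model (2)."""
    neighbours = [j for j, w in enumerate(W[k]) if w == 1]
    n_k = len(neighbours)
    mean = sum(phi[j] for j in neighbours) / n_k
    variance = tau2 / n_k
    return mean, variance


def centre(phi):
    """Impose the sum-to-zero constraint by subtracting the sample mean."""
    mean = sum(phi) / len(phi)
    return [value - mean for value in phi]


# Hypothetical chain of four areas: 0 - 1 - 2 - 3.
W = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
phi = [0.5, 0.25, -0.25, 1.5]

mean_1, var_1 = iar_full_conditional(1, phi, W, tau2=0.5)

# Adding a constant to every phi_k shifts each conditional mean by the same
# constant, so phi_k minus its conditional mean is unchanged. The conditionals
# cannot distinguish the two configurations, which is the source of impropriety.
shifted = [value + 10.0 for value in phi]
mean_1_shifted, _ = iar_full_conditional(1, shifted, W, tau2=0.5)

print(mean_1, var_1)
print(phi[1] - mean_1, shifted[1] - mean_1_shifted)
print(sum(centre(phi)))
```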

3.2. Convolution model

The convolution model was also proposed by Besag et al. (1991), and combines the intrinsic model with an additional set of independent random effects. The model is given by

$$
\begin{aligned}
\phi_k &= \theta_k + \psi_k,\\
\theta_k \mid \sigma^2 &\sim \mathrm{N}(0, \sigma^2),\\
\psi = (\psi_1, \ldots, \psi_n) \mid W, \tau_I^2 &\sim \mathrm{IAR}(W, \tau_I^2),
\end{aligned}
\tag{3}
$$

where $\psi$ is represented by the intrinsic CAR prior described in the previous subsection. The second set of random effects $\theta = (\theta_1, \ldots, \theta_n)$ is independent between areas, and different strengths of spatial correlation can be represented by varying the relative sizes of the two components $(\theta, \psi)$. However, the disadvantage of this flexibility is that each data point is represented by two random effects, and hence only their sum $\theta_k + \psi_k$ is identifiable. Eberly and Carlin (2000) assess the extent of this problem, and find that MCMC convergence is slow for this model, and that the individual components $(\theta_k, \psi_k)$ are not reliably estimated.

3.3. Cressie model

An alternative approach for modelling varying strengths of spatial correlation was proposed by Cressie (1993) and Stern and Cressie (2000), who use a single set of random effects, but introduce an additional spatial correlation parameter. The implementation of their model we consider here is given by

$$
\phi_k \mid \phi_{-k}, W, \tau^2, \rho, \mu \sim \mathrm{N}\left( \rho \, \frac{1}{n_k} \sum_{j \sim k} \phi_j + (1 - \rho)\mu,\; \frac{\tau^2}{n_k} \right).
\tag{4}
$$

This CAR prior has the same conditional variance as the intrinsic model, while the conditional expectation is a weighted average of the mean of the random effects in neighbouring areas and an overall mean $\mu$. The existence of $\mu$ in model (4) means that the intercept term in Eq. (1) is not required. The weight parameter $\rho$ controls the strength of the spatial correlation between the random effects, with $\rho = 0$ corresponding to independence, while increasing its value towards one corresponds to increasingly strong spatial correlation ($\rho = 1$ simplifies to the intrinsic model). This set of full conditional distributions correspond to a proper multivariate Gaussian distribution for $\phi$ if $0 \leq \rho < 1$, which has a constant mean of $\mu$. The covariance matrix is equal to $\tau^2 Q_C^{-1}$, where $Q_C$ has $jk$th element equal to $n_k$ if $j = k$, $-\rho$ if $j \sim k$, and zero otherwise. The major drawback with this model is the form of the conditional variance, which is unappealing when $\rho$ is close to zero. This is because in the absence of spatial correlation (when $\rho = 0$) there is no reason for the conditional variance of $\phi_k$ to be inversely proportional to the number of neighbours, as they provide no information about $\phi_k$.
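The drawback described above can be seen numerically with a minimal sketch of the full conditional (4). The star-shaped region, the values of $\phi$ and the function name are hypothetical and chosen only to contrast an area with three neighbours against an area with one.

```python
def cressie_full_conditional(k, phi, W, tau2, rho, mu):
    """Mean and variance of phi_k given the rest under model (4)."""
    neighbours = [j for j, w in enumerate(W[k]) if w == 1]
    n_k = len(neighbours)
    neighbour_mean = sum(phi[j] for j in neighbours) / n_k
    mean = rho * neighbour_mean + (1.0 - rho) * mu
    variance = tau2 / n_k   # does not depend on rho
    return mean, variance


# Hypothetical star-shaped region: area 0 borders areas 1, 2 and 3.
W = [[0, 1, 1, 1],
     [1, 0, 0, 0],
     [1, 0, 0, 0],
     [1, 0, 0, 0]]
phi = [0.0, 0.3, 0.6, 0.9]

for rho in (0.0, 0.5, 0.95):
    hub_mean, hub_var = cressie_full_conditional(0, phi, W, 1.0, rho, mu=0.0)
    _, leaf_var = cressie_full_conditional(1, phi, W, 1.0, rho, mu=0.0)
    print(rho, hub_mean, hub_var, leaf_var)
```

Even at $\rho = 0$, where the conditional mean ignores the neighbours entirely, the area with three neighbours is assigned one third of the conditional variance of the area with one neighbour.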

3.4. Leroux model

The final model considered here was originally proposed by Leroux et al. (1999), and has been further explored by MacNab (2003). It is based on a single set of random effects $\phi = (\phi_1, \ldots, \phi_n)$, which are represented by the multivariate Gaussian distribution

$$
\phi \mid W, \tau^2, \rho, \boldsymbol{\mu} \sim \mathrm{N}\left( \boldsymbol{\mu},\; \tau^2 \left[ \rho W^{*} + (1 - \rho) I_n \right]^{-1} \right).
\tag{5}
$$

In common with model (4), this prior has a constant non-zero mean $\boldsymbol{\mu} = (\mu, \ldots, \mu)$, which is consequently not required in Eq. (1). The precision matrix is given by $Q_L = \rho W^{*} + (1 - \rho) I_n$, where $I_n$ is an $n \times n$ identity matrix and the elements of $W^{*}$ are equal to

$$
w^{*}_{jk} =
\begin{cases}
n_k & \text{if } j = k,\\
-1 & \text{if } j \sim k,\\
0 & \text{otherwise.}
\end{cases}
\tag{6}
$$

The precision matrix is hence a weighted average of spatially dependent (represented by $W^{*}$) and independent (represented by $I_n$) correlation structures, where the weight is equal to $\rho$. This model can represent a range of weak and strong spatial correlation structures, with the special case of $\rho = 0$ simplifying to a model with independent random effects. The joint distribution (5) is proper if $0 \leq \rho < 1$, while $\rho = 1$ corresponds to the improper intrinsic model given by (2). The univariate full conditional distributions corresponding to (5) are given by

$$
\phi_k \mid \phi_{-k}, W, \tau^2, \rho, \mu \sim \mathrm{N}\left( \frac{\rho \sum_{j \sim k} \phi_j + (1 - \rho)\mu}{n_k \rho + 1 - \rho},\; \frac{\tau^2}{n_k \rho + 1 - \rho} \right).
\tag{7}
$$

The conditional expectation is a weighted average of the random effects in neighbouring areas and the overall mean $\mu$, while the conditional variance has a more attractive form than that in model (4). When there is strong spatial correlation in the data $\rho$ will be close to one and the conditional variance is approximately $\tau^2 / n_k$, which is the same as in the intrinsic model (2). In contrast, if the random effects are independent $\rho = 0$, and the conditional variance of $\phi_k$ is a constant (equal to $\tau^2$). This is at odds with model (4), but is more theoretically appealing because there is no longer any information about $\phi_k$ in the neighbouring random effects.
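The link between the joint distribution (5) and the full conditionals (7) can be checked numerically. The sketch below builds $Q_L$, derives the conditional moments of $\phi_k$ from the precision matrix using the standard Gaussian identity (variance $\tau^2 / Q_{kk}$, mean $\mu - \sum_{j \neq k} Q_{kj}(\phi_j - \mu) / Q_{kk}$), and compares them with (7). The star-shaped region, the numerical values and all function names are hypothetical.

```python
def leroux_precision(W, rho):
    """Q_L = rho * W_star + (1 - rho) * I_n, with W_star as in equation (6)."""
    n = len(W)
    Q = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for k in range(n):
            if j == k:
                Q[j][k] = rho * sum(W[k]) + (1.0 - rho)
            elif W[j][k] == 1:
                Q[j][k] = -rho
    return Q


def conditional_from_precision(k, phi, Q, tau2, mu):
    """Full conditional of phi_k when phi ~ N(mu * 1, tau2 * Q^{-1})."""
    off_diagonal = sum(Q[k][j] * (phi[j] - mu) for j in range(len(phi)) if j != k)
    return mu - off_diagonal / Q[k][k], tau2 / Q[k][k]


def leroux_full_conditional(k, phi, W, tau2, rho, mu):
    """Mean and variance of phi_k given the rest, as written in equation (7)."""
    neighbours = [j for j, w in enumerate(W[k]) if w == 1]
    denominator = len(neighbours) * rho + 1.0 - rho
    mean = (rho * sum(phi[j] for j in neighbours) + (1.0 - rho) * mu) / denominator
    return mean, tau2 / denominator


# Hypothetical star-shaped region: area 0 borders areas 1, 2 and 3.
W = [[0, 1, 1, 1], [1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0]]
phi = [0.2, 0.3, 0.6, 0.9]
tau2, mu = 1.0, 0.1

for rho in (0.0, 0.5, 0.95):
    Q = leroux_precision(W, rho)
    print(rho,
          conditional_from_precision(0, phi, Q, tau2, mu),
          leroux_full_conditional(0, phi, W, tau2, rho, mu))
```

At $\rho = 0$ both routes give conditional variance $\tau^2$ for every area, regardless of its number of neighbours, in contrast to model (4).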

3.5. Hyperpriors

3.5.1. Hyperprior for the correlation parameter $\rho$

For the Cressie and Leroux models the correlation parameter is assigned a discrete hyperprior, because it leads to faster MCMC inference than if a continuous distribution was specified. This is because for each value of $\rho$ proposed by the MCMC algorithm the determinant of the precision matrix needs to be calculated, which will be computationally demanding if the number of data points $n$ is large. If the hyperprior is continuous a new value of $\rho$ is proposed at each iteration of the MCMC algorithm, whereas if a discrete prior is adopted only a small number of $\rho$ values are possible (and hence only a small number of determinant calculations are required). The discrete hyperprior adopted here is given by

$$
\rho \sim \text{discrete uniform}(a_1, \ldots, a_r).
$$

A uniform prior is adopted (i.e., the possible values of $\rho$, $a_1, \ldots, a_r$, have the same prior probability) because before the analysis it is unknown whether the data contain strong, moderate or weak spatial correlation. The discrete prior is implemented on the $\rho$ scale (rather than say a Fisher transformation of it) because it is interpretable, with 0 corresponding to independence of the random effects, while increasing its value towards 1 corresponds to increasingly strong spatial correlation. Negative spatial correlation is rarely seen in disease mapping data, so the minimum allowable value for $\rho$ is set at $a_1 = 0$. For models (4) and (7) to correspond to proper multivariate Gaussian distributions, $\rho$ must be less than 1. Therefore we set $a_r = 0.95$, because the simulation study in the next section shows that for both models this is large enough to represent strong spatial correlation. In the absence of any prior information the remaining possible values $a_2, \ldots, a_{r-1}$ are assumed to be equally spaced between 0 and 0.95. Finally, the number of possible values $r$ needs to be specified, and a value of 20 is used in this paper, giving possible values of $0, 0.05, 0.1, \ldots, 0.9, 0.95$. A sensitivity analysis to this value is conducted in Section 5.
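The computational argument for the discrete prior can be sketched as follows: the log-determinant of the precision matrix is computed once for each of the 20 grid values and then reused at every MCMC iteration. The sketch uses the Leroux precision matrix on a hypothetical chain of four areas. The function names are ours, and the form of the conditional probabilities (proportional to $|Q_L|^{1/2} \exp\{-(\phi - \boldsymbol{\mu})^{\mathrm{T}} Q_L (\phi - \boldsymbol{\mu}) / 2\tau^2\}$ under the uniform prior) follows from (5) but is our assumption about how such an update might be coded, not a description of the author's software.

```python
import math


def leroux_precision(W, rho):
    """Q_L = rho * W_star + (1 - rho) * I_n for a binary neighbourhood matrix W."""
    n = len(W)
    return [[rho * sum(W[j]) + (1.0 - rho) if j == k else -rho * W[j][k]
             for k in range(n)] for j in range(n)]


def log_determinant(Q):
    """Log-determinant of a symmetric positive definite matrix via Cholesky."""
    n = len(Q)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = Q[i][j] - sum(L[i][m] * L[j][m] for m in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return 2.0 * sum(math.log(L[i][i]) for i in range(n))


def rho_probabilities(phi, mu, tau2, W, rho_grid, log_dets):
    """Conditional probabilities of each grid value under a uniform prior."""
    n = len(phi)
    d = [value - mu for value in phi]
    log_weights = []
    for rho in rho_grid:
        Q = leroux_precision(W, rho)
        quad = sum(d[j] * Q[j][k] * d[k] for j in range(n) for k in range(n))
        log_weights.append(0.5 * log_dets[rho] - quad / (2.0 * tau2))
    top = max(log_weights)                     # subtract the maximum for stability
    weights = [math.exp(lw - top) for lw in log_weights]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical chain of four areas: 0 - 1 - 2 - 3.
W = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]

rho_grid = [round(0.05 * i, 2) for i in range(20)]   # 0, 0.05, ..., 0.95
log_dets = {rho: log_determinant(leroux_precision(W, rho)) for rho in rho_grid}

phi = [0.4, 0.3, -0.2, -0.5]                         # hypothetical current state
probs = rho_probabilities(phi, 0.0, 0.25, W, rho_grid, log_dets)
for rho, p in zip(rho_grid, probs):
    print(rho, round(p, 4))
```

Because the grid is fixed, `log_dets` never needs to be recomputed during sampling; only the quadratic form changes with the current value of $\phi$.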

3.5.2. Hyperprior for the variance parameters $(\tau_I^2, \sigma^2, \tau^2)$

Following the work of Gelman (2006), all variance parameters are assigned uniform $(0, M)$ priors on the standard deviation scale. The commonly used class of inverse-gamma $(\epsilon, \epsilon)$ priors are sensitive to the value of $\epsilon$ if the true variance is close to zero, and are therefore not used. A value of $M = 10$ is specified as the upper limit of the uniform prior throughout this paper, although a small sensitivity analysis is conducted to this value in Section 5.

3.6. Inference

Inference for all models is based on Markov Chain Monte-Carlo simulation, using a combination of Gibbs sampling and Metropolis-Hastings steps. All regression and intercept parameters are updated using Metropolis steps, utilising a random walk proposal distribution. The random effects are updated using a block Metropolis-Hastings algorithm, where the block size can be tuned to obtain the desired acceptance rates. Variance parameters are Gibbs sampled from their full conditional inverse-gamma distributions, while $\rho$ is straightforward to update because it has a discrete sample space. A function to run the Leroux model in the statistical package R (R Development Core Team, 2009) as well as supporting documentation and an example data set, are available in the Supplementary material accompanying this paper.
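To make the random walk Metropolis step concrete, the following is a minimal sketch for the intercept $\mu$ alone, with the covariate and random effect terms held fixed in a hypothetical `offset`. It is written in Python rather than R, uses made-up counts, and is not the author's supplementary function; the proposal standard deviation and chain length are arbitrary choices for illustration.

```python
import math
import random


def log_posterior_mu(mu, y, E, offset):
    """Log posterior of the intercept, up to an additive constant.

    Poisson likelihood with mean E_k * exp(mu + offset_k), where offset_k stands
    for x_k' beta + phi_k held fixed, combined with the N(0, 10) prior of model (1).
    """
    log_lik = sum(y_k * (mu + o_k) - E_k * math.exp(mu + o_k)
                  for y_k, E_k, o_k in zip(y, E, offset))
    log_prior = -mu ** 2 / (2.0 * 10.0)
    return log_lik + log_prior


def random_walk_metropolis(y, E, offset, n_iter, proposal_sd, seed=1):
    """Random walk Metropolis sampler for mu; returns draws and acceptance rate."""
    rng = random.Random(seed)
    mu = 0.0
    current = log_posterior_mu(mu, y, E, offset)
    draws, accepted = [], 0
    for _ in range(n_iter):
        proposal = mu + rng.gauss(0.0, proposal_sd)
        candidate = log_posterior_mu(proposal, y, E, offset)
        # The proposal is symmetric, so the acceptance probability is
        # min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            mu, current = proposal, candidate
            accepted += 1
        draws.append(mu)
    return draws, accepted / n_iter


# Hypothetical data: three areas, no covariates and no random effects.
y = [100, 120, 90]
E = [100.0, 100.0, 100.0]
offset = [0.0, 0.0, 0.0]

draws, rate = random_walk_metropolis(y, E, offset, n_iter=5000, proposal_sd=0.1)
kept = draws[1000:]                      # discard a burn-in period
posterior_mean = sum(kept) / len(kept)
print("acceptance rate:", rate)
print("posterior mean of mu:", posterior_mean)
print("log(sum y / sum E):", math.log(sum(y) / sum(E)))
```

With this much data the prior has little influence, so the posterior mean should lie close to $\ln(\sum_k y_k / \sum_k E_k)$; changing `proposal_sd` alters the acceptance rate, which is the tuning referred to above.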

4. Simulation study

In this section we present a simulation study that compares the performance of the four CAR priors described in the previous section.

4.1. Data generation and study design

Simulated disease data are generated for the 271 intermediate geographies in the Greater Glasgow health board, which is the region used in the cancer mapping study in Section 5. The data are generated from model (1), where for simplicity, the percentage of the population who are income deprived is the only covariate. Its regression parameter is equal to $\beta = 0.1$, while the intercept term is fixed at $\mu = -0.2$. The expected numbers of disease cases, $E$, are those used in Section 5 for all cancer cases. The purpose of this study is to determine how well each of the CAR priors can represent different types of spatial correlation, and we simulate disease data under each of the following scenarios:

• Scenario 1: Independence – The random effects are generated from independent normal distributions, with mean zero and standard deviation equal to 0.2.
• Scenario 2: Moderate spatial dependence – The random effects are a convolution of the independent and spatially correlated processes used in Scenarios 1 and 3.
• Scenario 3: Strong spatial dependence – The random effects are generated from a multivariate Gaussian distribution with mean zero, with a correlation matrix specified by the Matern class with smoothness parameter equal to 2.5. The spatial range is fixed at 5 km, which corresponds to the median correlation between pairs of areas in the study region being 0.5.
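The three scenarios can be sketched as below. Several details are assumptions made for illustration and are not stated in the paper: the Matern correlation is written in one common closed form for smoothness 2.5, which may differ from the parametrisation of the 5 km range used by the author; the spatial process is given the same standard deviation (0.2) as the independent one; Scenario 2 is taken to be the plain sum of the two processes; and the 3 x 3 grid of centroids stands in for the 271 intermediate geographies.

```python
import math
import random


def matern_25(distance, range_km):
    """Matern correlation with smoothness 2.5 (one common parametrisation)."""
    s = math.sqrt(5.0) * distance / range_km
    return (1.0 + s + s * s / 3.0) * math.exp(-s)


def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][m] * L[j][m] for m in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def simulate_random_effects(coords, scenario, rng, sd=0.2, range_km=5.0):
    """Draw one realisation of the random effects under scenario 1, 2 or 3."""
    n = len(coords)
    independent = [rng.gauss(0.0, sd) for _ in range(n)]
    if scenario == 1:
        return independent
    corr = [[matern_25(math.dist(a, b), range_km) for b in coords] for a in coords]
    for i in range(n):
        corr[i][i] += 1e-10            # tiny jitter for numerical stability
    L = cholesky(corr)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    spatial = [sd * sum(L[i][j] * z[j] for j in range(i + 1)) for i in range(n)]
    if scenario == 3:
        return spatial
    return [a + b for a, b in zip(independent, spatial)]   # scenario 2


# Hypothetical centroids (in km) on a 3 x 3 grid with 3 km spacing.
coords = [(3.0 * i, 3.0 * j) for i in range(3) for j in range(3)]

rng = random.Random(2011)
for scenario in (1, 2, 3):
    phi = simulate_random_effects(coords, scenario, rng)
    print(scenario, [round(value, 3) for value in phi])
```

Disease counts would then be drawn from the Poisson likelihood in model (1) given these random effects; that step is omitted here.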

Two hundred sets of disease counts were generated under each of the three scenarios, and the four CAR priors outlined in Section 3 were applied in each case. Each simulated data set is generated from a different realisation of the random effects, because this prevents the results from being affected by the particular set of random effects drawn. The relative performances of the four models are assessed by the following metrics.

• Regression parameter: $\beta$ – Bias and root mean square error (RMSE) for the estimated regression parameter, presented as a percentage of its true value (0.1).
• Disease risks: $R_k$ – Bias and RMSE for the set of disease risks $R_k = \exp(\mu + \phi_k + x_k \beta)$, which are again presented as a percentage of their true values.
• Residual spatial correlation: A permutation test (at the 5% level) based on Moran's I statistic is applied to the residuals from each model, and the percentage of significant results is reported.
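The metrics above can be sketched as follows. The estimates, residuals and neighbourhood matrix are hypothetical; the function names are ours; and the permutation test is written as a one-sided test for positive spatial correlation, which is an assumption because the paper does not state the form of the test.

```python
import math
import random


def percent_bias(estimates, truth):
    """Bias of the estimates as a percentage of the true value."""
    mean_estimate = sum(estimates) / len(estimates)
    return 100.0 * (mean_estimate - truth) / truth


def percent_rmse(estimates, truth):
    """Root mean square error as a percentage of the true value."""
    mse = sum((e - truth) ** 2 for e in estimates) / len(estimates)
    return 100.0 * math.sqrt(mse) / truth


def morans_i(x, W):
    """Moran's I statistic for values x and binary neighbourhood matrix W."""
    n = len(x)
    mean = sum(x) / n
    d = [value - mean for value in x]
    s0 = sum(sum(row) for row in W)
    cross = sum(W[i][j] * d[i] * d[j] for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(value * value for value in d)


def moran_permutation_pvalue(x, W, n_perm=999, seed=1):
    """One-sided permutation p-value for positive spatial correlation."""
    rng = random.Random(seed)
    observed = morans_i(x, W)
    values = list(x)
    at_least_as_large = 0
    for _ in range(n_perm):
        rng.shuffle(values)            # break any spatial arrangement
        if morans_i(values, W) >= observed:
            at_least_as_large += 1
    return (at_least_as_large + 1) / (n_perm + 1)


# Hypothetical estimates of beta from three simulated data sets (true value 0.1).
estimates = [0.11, 0.09, 0.10]
print(percent_bias(estimates, 0.1), percent_rmse(estimates, 0.1))

# Hypothetical residuals on a chain of four areas: 0 - 1 - 2 - 3.
W = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
residuals = [1.0, 2.0, 3.0, 4.0]
print(morans_i(residuals, W), moran_permutation_pvalue(residuals, W))
```

In a simulation study of this kind, the percentage of data sets whose p-value falls below 0.05 would be reported as the residual spatial correlation metric.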

4.2. Results

The results from all metrics and models are presented in Table 1. Overall, all of the models produce close to unbiased estimates of $\beta$ and $R_k$, with relative percentage biases ranging between −0.85% and 3.01% across all models and scenarios. In the independence scenario (Scenario 1) the intrinsic model performs the worst in terms of RMSE for both the regression parameter and the disease risks, while the other models produce similar results. This is not surprising, as the intrinsic model is the only one considered here that cannot represent independence or weak spatial correlation. In contrast, in the presence of strong spatial correlation (Scenario 3) the convolution model performs worst in terms of RMSE. In addition, it failed to remove the spatial correlation in 73.5% of the data sets, which suggests that it is not appropriate for modelling strong spatial correlation. The model proposed by Cressie also performed less well in terms of RMSE than the intrinsic or Leroux models in the presence of strong spatial correlation, which suggests that one of the latter is the appropriate model in this scenario. In summary, the model proposed by Leroux appears to be the most appropriate across the three scenarios considered here, as it performs consistently well in the presence of independence and strong spatial correlation.

Table 1
Summary of the simulation study results. The bias and RMSE are presented as a percentage of the true values, while the fifth section summarises the percentage of data sets for which the model did not remove the spatial correlation. The bottom part of the table summarises the estimates of the spatial correlation parameter $\rho$.

Metric                  Scenario   Intrinsic   Convolution   Cressie   Leroux
Bias – $\beta$          1          -0.74       -0.41         -0.85     -0.52
                        2          -0.08        0.84          1.65      1.41
                        3           1.77        3.01          2.20      1.77
RMSE – $\beta$          1          18.0        15.6          14.8      14.5
                        2          11.2        11.5          11.9      11.8
                        3          12.8        16.3          14.5      13.1
Bias – $R_k$            1          -0.57       -0.56         -0.57     -0.57
                        2          -0.26       -0.30         -0.30     -0.31
                        3          -0.18       -0.37         -0.24     -0.20
RMSE – $R_k$            2           9.6         9.2           9.4       9.3
                        3           6.5         9.2           7.2       6.7
% Spatial correlation   1           0           0             0         0
                        2           0           0             0         0
                        3           0          73.5           0         0
Mean value of $\rho$    1           –           –             0.195     0.042
                        2           –           –             0.621     0.342
                        3           –           –             0.950     0.948

Finally, the mean value (over the 200 data sets) of the spatial correlation parameter $\rho$ in the Cressie and Leroux models is displayed in the bottom part of Table 1. The table shows that on average both models produce estimates of approximately the appropriate size, with values close to zero under independence (Scenario 1) and close to one in the presence of strong spatial correlation (Scenario 3). However, under independence the true value of $\rho$ should be zero, and the estimates from the Leroux model are much closer to this than the values from the Cressie model.

5. Application

This section presents a study mapping the spatial pattern of cancer risk in Greater Glasgow, Scotland, between 2001 and 2005.

5.1. Data description

The data for this study are publicly available, and come from the Scottish Neighbourhood Statistics database, which is available on-line at http://www.sns.gov.uk. The study region is the Greater Glasgow and Clyde health board, which contains the largest city in Scotland (Glasgow) as well as the surrounding area. The health board is split up into $n = 271$ administrative units called intermediate geographies (IG), which have a median area of 124 hectares and a median population of 4239.

5.1.1. Cancer data

In this study we map the spatial pattern of: (a) lung cancer cases; and (b) all cancer cases; which are classified by the International classification of disease – 10th revision (ICD-10) as (a) C33–C34 and (b) C00–C97, respectively. Our response variables are the numbers of new cases diagnosed between 2001 and 2005, for each of the 271 intermediate geographies that comprise our study region. The expected numbers of cases are calculated by external standardisation, using age and sex adjusted rates for the whole of Scotland, which were obtained from the information services division (ISD) of the National Health Service. Maps of the standardised incidence ratios for both cancer types are shown in Fig. 1, where the top panel relates to lung cancer while the bottom one displays all cancer cases. The figure shows that cancer incidence in Greater Glasgow is higher than in the rest of Scotland, with average SIRs across the study region of 1.186 (lung cancer) and 1.034 (all cancer), respectively. Within the study region the majority of the highest SIR values are in the east, which corresponds to the heavily deprived east end of Glasgow.

5.1.2. Covariates

A small number of covariates are available to describe the spatial variation in cancer risk across Greater Glasgow. The first is a modelled estimate of the percentage of the population in each IG who smoke, and further details about its construction are available from Whyte et al. (2007). A number of measures of socio-economic deprivation are also available, but the majority of these are highly correlated with the smoking covariate, and therefore should not be used as it would lead to collinearity problems. Therefore we represent deprivation by the natural log of the median house price in each area (correlation with smoking of −0.69), which is the measure of deprivation available that is least correlated with smoking. In addition, the percentage of school children from ethnic minorities (i.e., non-white) is also available, which is used here as a proxy measure of the ethnic make-up of each area.

Fig. 1. Standardised incidence ratios (SIR) for lung cancer (top) and all cancer (bottom) in Greater Glasgow between 2001 and 2005.

The final covariate is the estimated annual mean concentration of particulate matter air pollution in 2001, which is measured as PM$_{10}$. Particulate matter concentrations have been shown to be significantly or borderline significantly associated with lung cancer in previous studies (see, for example, Pope et al., 2002; Vineis et al., 2004; Jerrett et al., 2005), which is the reason for its inclusion here. However, modelled estimates are only available for each 1 km grid square across the UK, and not at the intermediate geography resolution. These gridded estimates can be obtained from the Department for Environment Food and Rural Affairs (DEFRA, http://laqm1.defra.gov.uk), and further details about their construction are available from Steadman et al. (2002). Here, we use the median concentration over the grid squares lying within each intermediate geography as our exposure measure.

5.2. Modelling

The cancer data sets are represented using the general model (1), where the covariates included are those described above. Both data sets are modelled using the four prior distributions for φ described in Section 3. Inference for all models is based on 50,000 MCMC samples generated from five Markov chains, that were initialised at dispersed locations in the sample space. Each chain is burnt in until convergence (40,000 iterations), and the next 10,000 samples are used for the analysis.
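The sample counts above fit together as 5 chains × 10,000 retained iterations = 50,000 samples. A minimal sketch of discarding the burn-in and pooling the chains follows; the array layout, function name and simulated draws are assumptions made for illustration, not the paper's implementation.

```python
import numpy as np

N_CHAINS, BURN_IN, KEEP = 5, 40_000, 10_000


def pool_chains(chains, burn_in=BURN_IN, keep=KEEP):
    """Drop the burn-in from each chain and stack the retained draws.

    chains: array of shape (n_chains, n_iterations) or
    (n_chains, n_iterations, n_params) (assumed layout).
    """
    chains = np.asarray(chains)
    kept = chains[:, burn_in:burn_in + keep]
    # Collapse the chain and iteration axes into one sample axis.
    return kept.reshape(-1, *kept.shape[2:])


# Stand-in draws; a real analysis would use the MCMC output instead.
rng = np.random.default_rng(0)
fake_chains = rng.normal(size=(N_CHAINS, BURN_IN + KEEP))
pooled = pool_chains(fake_chains)
```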

The residuals from each model were tested for the presence of spatial correlation, using a permutation test based on Moran's I statistic. For both cancer types all models adequately remove the spatial correlation present in the data, as all the corresponding p-values (not shown) were greater than 0.05. The deviance information criterion (DIC, Spiegelhalter et al., 2002) was also calculated in each case, and is a measure of how well a model fits a set of data. Lower values of the DIC indicate a better fitting model, and the results across the four models are very similar. For both data sets the intrinsic model appears to be the worst fit, with DIC values of 1717 and 2205, respectively, for lung and all cancer cases. The remaining three models have similar DIC values, ranging between 1705 and 1708 for lung cancer and 2200 and 2201 for all cancer.

5.3. Results

The results of our study are presented below. The first part displays the level of spatial correlation estimated by the models, the second describes the covariate effects, while the third presents the fitted risk surfaces.

5.3.1. Spatial correlation

The posterior distributions of the spatial correlation parameter (ρ) are shown in Fig. 2, for both the Cressie and Leroux models and both cancer types. Firstly, the figure shows that in all cases the data are informative about the value of ρ, as the posterior distributions are not similar to the uniform priors that were adopted. Secondly, the spatial correlation parameters from the two models do not have the same calibration, as their posterior distributions are different for both cancer types. Finally, the lung cancer data appear to contain weak spatial correlation after adjusting for the covariates (posterior median of 0.2 from the Leroux model), while the all cancer data contain moderate correlation (posterior median of 0.45 from the Leroux model).

Fig. 2. Posterior distributions for the spatial correlation parameter from the models proposed by Cressie and Leroux. Panels: lung cancer and all cancer under the Cressie model; lung cancer and all cancer under the Leroux model. Vertical axis: posterior probability (0.00 to 0.15); horizontal axis: 0 to 0.9.

Table 2. Estimates and 95% credible intervals for the regression parameters. The results are presented on the relative risk scale, for a standard deviation increase in each covariate's value.

Covariate     Intrinsic           Convolution         Cressie             Leroux

(a) Lung cancer
Ethnicity     0.92 (0.88, 0.96)   0.93 (0.89, 0.97)   0.94 (0.90, 0.98)   0.94 (0.90, 0.98)
House price   0.92 (0.88, 0.96)   0.92 (0.87, 0.96)   0.91 (0.87, 0.95)   0.91 (0.87, 0.95)
PM10          1.12 (1.02, 1.20)   1.12 (1.06, 1.19)   1.10 (1.06, 1.15)   1.10 (1.05, 1.15)
Smoking       1.27 (1.22, 1.34)   1.28 (1.22, 1.35)   1.29 (1.22, 1.35)   1.28 (1.22, 1.34)

(b) All cancer
Ethnicity     0.95 (0.92, 0.97)   0.95 (0.93, 0.97)   0.95 (0.93, 0.97)   0.95 (0.93, 0.97)
House price   0.95 (0.93, 0.97)   0.95 (0.92, 0.97)   0.95 (0.92, 0.97)   0.95 (0.93, 0.97)
PM10          1.05 (1.00, 1.09)   1.06 (1.02, 1.10)   1.06 (1.03, 1.09)   1.05 (1.02, 1.08)
Smoking       1.03 (1.00, 1.06)   1.03 (1.00, 1.06)   1.03 (1.01, 1.06)   1.03 (1.01, 1.06)
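The residual check described in Section 5.2, a permutation test based on Moran's I, can be sketched as follows. The paper does not state the number of permutations or whether the test is one- or two-sided, so the 999 permutations and the one-sided p-value used here are assumptions, as are the function names and the toy neighbourhood matrix.

```python
import numpy as np


def morans_i(x, W):
    """Moran's I for values x and a symmetric neighbourhood matrix W."""
    x = np.asarray(x, dtype=float)
    W = np.asarray(W, dtype=float)
    z = x - x.mean()
    return (x.size / W.sum()) * (z @ W @ z) / (z @ z)


def moran_permutation_test(x, W, n_perm=999, seed=0):
    """One-sided permutation p-value for positive spatial correlation."""
    rng = np.random.default_rng(seed)
    observed = morans_i(x, W)
    # Permuting the values over areas breaks any spatial structure.
    permuted = np.array(
        [morans_i(rng.permutation(x), W) for _ in range(n_perm)]
    )
    p_value = (1 + np.sum(permuted >= observed)) / (n_perm + 1)
    return observed, p_value


# Toy example: six areas in a line, each adjacent to its neighbours.
n = 6
W = np.zeros((n, n))
for i in range(n - 1):
    W[i, i + 1] = W[i + 1, i] = 1.0
residuals = np.arange(1.0, n + 1)  # hypothetical, strongly trended values
observed_i, p = moran_permutation_test(residuals, W)
```

A p-value above 0.05 for a model's residuals would correspond to the conclusion drawn in the text, namely that no substantial spatial correlation remains.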

5.3.2. Covariate effects

The effects of the covariates are presented in Table 2, which displays the estimates (posterior medians) as well as 95% credible intervals. All results are presented on the relative risk scale, for a one standard deviation increase in each covariate's value. The table shows that the choice of CAR prior has little effect on the results, as both the relative risks and credible intervals are consistent across the four models. There is convincing evidence that areas with higher proportions of ethnic minorities are at less risk of developing cancer, with relative risks around 0.94 (lung) and 0.95 (all cancer) for a 12% increase in the non-white school population. Populations living in areas that are less deprived (as measured by the log of average house price) are also less at risk of cancer, with relative risks of 0.92 (lung) and 0.95 (all cancer), respectively.

Exposure to higher concentrations of particulate matter air pollution appears to be associated with increased cancer incidence, with relative risks of 1.10 (lung) and 1.05 (all cancer) for a 1.8 μg m⁻³ increase in the yearly average concentration. The association observed for lung cancer is consistent with the existing research described above, while the association for all cancer is much smaller. Finally, there is strong evidence that increasing the percentage of the population in each IG that smoke is related to increased lung cancer risk, with a relative risk around 1.28 for a 10% increase in the smoking prevalence. The results for all cancer are less strong, with a relative risk around 1.03. However, the smoking covariate is moderately correlated with house price (correlation of −0.69), so the models were re-run without the latter to observe whether there was any change in the estimated smoking effects. The observed relative risks increased to 1.36 (lung cancer) and 1.07 (all cancer), respectively, suggesting that adjusting for deprivation (as measured by house price) impacts on the magnitude of both associations.
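The relative risks quoted above are reported for a one standard deviation increase in each covariate. Assuming a log-linear model for disease risk (the usual form for this class of model, to be checked against model (1)), a regression coefficient β maps to a relative risk of exp(β × sd). The sketch below summarises posterior samples in this way; the function name and the simulated samples are hypothetical and are not the paper's results.

```python
import numpy as np


def relative_risk_summary(beta_samples, covariate_sd):
    """Posterior median and 95% credible interval of exp(beta * sd)."""
    rr = np.exp(np.asarray(beta_samples, dtype=float) * covariate_sd)
    lower, med, upper = np.percentile(rr, [2.5, 50.0, 97.5])
    return med, lower, upper


# Hypothetical posterior draws for a coefficient on the log scale,
# with a covariate standard deviation of 10 percentage points.
rng = np.random.default_rng(1)
beta = rng.normal(loc=0.025, scale=0.002, size=5000)
rr_median, rr_lower, rr_upper = relative_risk_summary(beta, covariate_sd=10.0)
```

Because the exponential is monotone, transforming the samples first and then taking percentiles gives the same interval as exponentiating the percentiles of β × sd.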

5.3.3. Risk maps

Fig. 3 shows the estimated disease risks (i.e., posterior medians of R_k) from the Leroux model, where the scales are the same as those used for the SIRs in Fig. 1. The estimated risk maps are smoother than the raw SIR values, and are also less extreme. For example, the SIR for all cancer ranges between 0.61 and 1.55, while the corresponding model estimates range between 0.75 and 1.43. However, the estimated risk surface exhibits a similar spatial pattern to the SIR map, with the highest risks for both cancer types being observed in the heavily deprived east end of Glasgow, in the east of the study region.

Fig. 4 summarises the uncertainty in the disease risks R_k, and splits the areas into three categories. Areas shaded in black exhibit elevated risks of cancer compared with the whole of Scotland, having 95% credible intervals for R_k that do not include the null risk of one. Areas shaded in mid-grey have credible intervals for R_k that include the null risk of one, while those coloured light grey have decreased risks (credible intervals that are less than one). The figure shows there are more areas that exhibit elevated risks than decreased risks, suggesting that Greater Glasgow has a higher risk of cancer compared with the Scottish average. The areas of elevated risk lie mainly in the east end of Glasgow and along the south bank of the main river (the Clyde), the latter of which is illustrated by the thin white line moving south east across the study region.
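The three-way classification used in Fig. 4 can be sketched from posterior samples of the area-level risks. The sketch assumes equal-tailed 95% intervals computed from the 2.5th and 97.5th percentiles; the function name, category labels and sample values are hypothetical.

```python
import numpy as np


def classify_areas(risk_samples, null_risk=1.0):
    """Label each area by where its 95% credible interval lies.

    risk_samples: array of shape (n_samples, n_areas) holding posterior
    draws of the risk for each area (assumed layout).
    """
    lower, upper = np.percentile(risk_samples, [2.5, 97.5], axis=0)
    labels = np.full(lower.shape, "includes one", dtype=object)
    labels[lower > null_risk] = "elevated"   # whole interval above one
    labels[upper < null_risk] = "decreased"  # whole interval below one
    return labels


# Three hypothetical areas with risks centred at 1.3, 1.0 and 0.7.
spread = np.linspace(-0.1, 0.1, 201)[:, None]
samples = spread + np.array([1.3, 1.0, 0.7])
labels = classify_areas(samples)
```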

5.3.4. Sensitivity to hyperpriors

To assess the effects of the hyperpriors on posterior inference, a small sensitivity analysis was undertaken using the Leroux model. Firstly, the model was implemented with a Uniform(0, M) prior distribution for τ (the standard deviation), where M = 5, 10, 20. The estimates of τ² were invariant to this change, with, for example, posterior medians ranging between 0.0214 and 0.0231 for all cancer cases. Secondly, the model was implemented with four different discrete prior distributions for the spatial correlation parameter ρ. As described in Section 2, a uniform prior with values ranging between 0 and 0.95 is adopted in this paper, where the possible values are equally spaced between the two endpoints. Here, we assess the sensitivity of the posterior distribution to having 10, 20, 30 and 40 possible values (i.e., r = 10, 20, 30, 40) between these endpoints. The posterior distributions of ρ exhibited almost identical shapes and centres for each value of r, with posterior medians for all cancer ranging between 0.422 and 0.463.
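The discrete uniform prior for ρ used in this sensitivity analysis can be sketched as a grid of r equally spaced values with equal prior mass. Whether the endpoints 0 and 0.95 are themselves included in the grid is not stated in this section, so including them is an assumption here; the function names are hypothetical.

```python
import numpy as np


def rho_grid(r, lower=0.0, upper=0.95):
    """r equally spaced candidate values for rho, each with prior mass 1/r."""
    values = np.linspace(lower, upper, r)  # endpoints included (assumption)
    prior = np.full(r, 1.0 / r)
    return values, prior


def discrete_median(values, probs):
    """Smallest grid value whose cumulative probability reaches 0.5."""
    cdf = np.cumsum(probs)
    return values[np.searchsorted(cdf, 0.5)]


# The four grids compared in the sensitivity analysis.
grids = {r: rho_grid(r) for r in (10, 20, 30, 40)}
```

With a posterior probability vector over the same grid in place of the prior, `discrete_median` gives the kind of posterior median reported above, which allows the results for different r to be compared directly.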


Fig. 3. Estimated disease risks from the Leroux model for lung cancer (top) and all cancer (bottom).


6. Discussion

This paper has critiqued four of the most commonly used conditional autoregressive prior distributions in Bayesian disease mapping, which include the intrinsic and convolution models (both Besag et al., 1991), as well as the alternatives proposed by Cressie (1993) and Leroux et al. (1999). The performance of these models has been quantified by simulation, specifically assessing the accuracy with which they can estimate regression parameters and disease risk surfaces. The paper then applies each of these models to a new study mapping cancer incidence in Greater Glasgow, Scotland, between 2001 and 2005.

Fig. 4. Uncertainty in the estimated disease risks from the Leroux model for lung cancer (top) and all cancer (bottom). Areas shaded in black have elevated risks of disease (credible intervals for R_k that are greater than 1), areas shaded in mid grey have credible intervals that contain one, while areas in light grey have substantially decreased risks.

The simulation study shows that all four prior models produce close to unbiased estimates of the regression parameters and the set of disease risks, regardless of the amount of spatial correlation in the data. However, there are differences in the corresponding root mean square errors of β and R_k. If the data do not contain spatial correlation the intrinsic model has the largest RMSE, while in the presence of strong spatial correlation the convolution and Cressie models have the largest values. These results suggest that the model proposed by Leroux et al. (1999) is the best overall, because it produces consistently good results across the range of spatial correlation scenarios considered here. It is also the most theoretically appealing of the four prior models for a number of reasons. Firstly, it can represent a range of strong and weak spatial correlation structures with a single set of random effects, which is beyond the capability of the convolution and intrinsic models. Secondly, it corresponds to a proper joint distribution for the random effects, which is not true for the intrinsic model. Thirdly, its full conditional distributions (given by (7)) have an appropriate mean and variance structure, regardless of whether the data are independent or contain strong spatial correlation. This is not the case for the model proposed by Cressie (1993), which assumes the conditional variance is inversely proportional to the number of neighbouring areas, regardless of whether there is any spatial correlation in the data.
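The contrast between the Leroux and Cressie conditional variances can be made concrete with a small sketch. Equation (7) is not reproduced in this part of the text, so the forms below are the full conditionals as they are usually stated for these two models with a binary neighbourhood matrix; they are an assumption to be checked against Section 3, and the function names and numbers are hypothetical.

```python
def leroux_conditional(neighbour_values, rho, tau2):
    """Conditional mean and variance of one random effect (Leroux form)."""
    n_k = len(neighbour_values)
    denom = rho * n_k + 1.0 - rho
    return rho * sum(neighbour_values) / denom, tau2 / denom


def cressie_conditional(neighbour_values, rho, tau2):
    """Conditional mean and variance of one random effect (Cressie form)."""
    n_k = len(neighbour_values)  # must be at least one neighbour
    return rho * sum(neighbour_values) / n_k, tau2 / n_k


neighbours = [1.0, 2.0, 3.0]  # hypothetical neighbouring random effects
# With rho = 0 the Leroux variance is tau2, as expected under
# independence, whereas the Cressie variance is still tau2 / n_k.
leroux_indep = leroux_conditional(neighbours, rho=0.0, tau2=2.0)
cressie_indep = cressie_conditional(neighbours, rho=0.0, tau2=2.0)
# With rho = 1 both reduce to the intrinsic model's conditional.
leroux_strong = leroux_conditional(neighbours, rho=1.0, tau2=2.0)
```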

The study presented in Section 5 is based on ecological data, which means the results relate to the health of the overall population rather than applying directly to individuals. Overall, Greater Glasgow appears to have a higher risk of cancer than the Scottish average, with mean risks (from the Leroux model) of 1.24 (lung) and 1.05 (all) across the set of 271 intermediate geographies considered in this study. The south and east of the city of Glasgow have the highest risks of cancer, whereas the more rural surrounding areas have much lower risks. These differences in risk appear to be partly due to the covariates. Increased levels of smoking, socio-economic deprivation and air pollution appear to inflate an area's risk of cancer, while increasing the proportion of ethnic minorities appears to decrease the risk.

All of the CAR priors considered here assume that if two areas share a common border their random effects will be correlated, which is unlikely to be true in all cases. This is especially true in urban areas, where neighbourhoods of rich and poor people are often geographically adjacent. Therefore future work will involve relaxing this assumption, and only forcing pairs of contiguous areas to have correlated random effects if those areas are in some way similar. Such similarity could be measured in numerous ways, including geographical distance or differences in their relative levels of socio-economic deprivation.

Acknowledgements

The author gratefully acknowledges the valuable comments and suggestions made by two referees, all of which have greatly improved the focus and presentation of this paper. The data and shapefiles used in this paper were provided by the Scottish Government.

Appendix A. Supplementary data

Supplementary data associated with this article can be found, in the online version, at doi:10.1016/j.sste.2011.03.001.

References

Banerjee S, Carlin B, Gelfand A. Hierarchical modelling and analysis for spatial data. 1st ed. Chapman and Hall; 2004.

Besag J, Higdon D. Bayesian analysis of agricultural field experiments. J R Stat Soc Ser B 1999;61:691–746.

Besag J, York J, Mollie A. Bayesian image restoration with two applications in spatial statistics. Ann Inst Stat Math 1991;43:1–59.

Biggeri A, Dreassi E, Catelan D, Rinaldi L, Lagazio C, Cringoli G. Disease mapping in veterinary epidemiology: a Bayesian geostatistical approach. Stat Methods Med Res 2006;15:337–52.

Cressie N. Statistics for spatial data, revised ed. New York: Wiley; 1993.

Eberly L, Carlin B. Identifiability and convergence issues for Markov chain Monte Carlo fitting of spatial models. Stat Med 2000;19:2279–94.

Elliott P, Wakefield J, Best N, Briggs D. Spatial epidemiology: methods and applications. 1st ed. Oxford University Press; 2000.

Gelman A. Prior distributions for variance parameters in hierarchical models. Bayesian Anal 2006;1:515–33.

Jerrett M, Burnett R, Ma R, Pope C, Krewski D, Newbold K, et al. Spatial analysis of air pollution and mortality in Los Angeles. Epidemiology 2005;16:727–36.

Kissling W, Carl G. Spatial autocorrelation and the selection of simultaneous autoregressive models. Global Ecol Biogeogr 2008;17:59–71.

Lawson A. Bayesian disease mapping: hierarchical modelling in spatial epidemiology. 1st ed. Chapman and Hall; 2008.

Lee D, Ferguson C, Mitchell R. Air pollution and health in Scotland: a multi-city study. Biostatistics 2009;10:409–23.

Leroux B, Lei X, Breslow N. Estimation of disease rates in small areas: a new mixed model for spatial dependence. In: Halloran M, Berry D, editors. Statistical models in epidemiology, the environment and clinical trials. New York: Springer-Verlag; 1999. p. 135–78.

MacNab Y. Hierarchical Bayesian modelling of spatially correlated health service outcome and utilization rates. Biometrics 2003;59:305–16.

MacNab Y, Kmetic A, Gustafson P, Sheps S. An innovative application of Bayesian disease mapping methods to patient safety research: a Canadian adverse medical event study. Stat Med 2006;25:3960–80.

Molina R, Katsaggelos A, Mateos J. Bayesian regularization methods for hyperparameter estimation in image restoration. IEEE Trans Image Process 1999;8:231–46.

Pope C, Burnett R, Thun M, Calle E, Krewski D, Ito K, et al. Lung cancer, cardiopulmonary mortality, and long-term exposure to fine particulate air pollution. J Am Med Assoc 2002;287:1132–41.

R Development Core Team. R: a language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2009. ISBN: 3-900051-07-0. Available from: <http://www.R-project.org>.

Spiegelhalter D, Best N, Carlin B, Van der Linde A. Bayesian measures of model complexity and fit. J R Stat Soc Ser B 2002;64:583–639.

Steadman J, Bush T, Vincent K. UK air quality modelling for annual reporting 2001 on ambient air quality assessment under Council Directives 96/62/EC and 1999/30/EC. Department for Environment Food and Rural Affairs; 2002.

Stern H, Cressie N. Posterior predictive model checks for disease mapping models. Stat Med 2000;19:2377–97.

Vineis P, Forastiere F, Hoek G, Lipsett M. Outdoor air pollution and lung cancer: recent epidemiologic evidence. Int J Cancer 2004;111:647–52.

Whyte B, Gordon D, Haw S, Fischbacher C, Harrison R. An atlas of tobacco smoking in Scotland: a report presenting estimated smoking prevalence and smoking attributable deaths within Scotland. NHS Health Scotland; 2007.

