Demographic Research a free, expedited, online journal of peer-reviewed research and commentary in the population sciences published by the Max Planck Institute for Demographic Research Konrad-Zuse Str. 1, D-18057 Rostock · GERMANY www.demographic-research.org DEMOGRAPHIC RESEARCH VOLUME 8, ARTICLE 3, PAGES 61-92 PUBLISHED 11 February 2003 www.demographic-research.org/Volumes/Vol8/3/ DOI: 10.4054/DemRes.2003.8.3 Research Article Bayesian spatial analysis of demographic survey data: An application to contraceptive use at first sexual intercourse Riccardo Borgoni Francesco C. Billari © 2003 Max-Planck-Gesellschaft.

Table of Contents

1 Introduction
2 Data: Geo-referenced FFS
3 Methods: Why Bayesian inference?
  3.1 A Bayesian spatial model
  3.2 Monte Carlo Markov Chain inference
4 Application to contraceptive use: methodological aspects and results
  4.1 Monte Carlo Markov Chain setup
  4.2 Results
  4.3 Is there a problem with standard multilevel modeling approaches?
5 Conclusion and future developments
6 Acknowledgements
Notes
References
Appendix 1. Identification issues
Appendix 2. Diagnostics of MCMC inference


Bayesian spatial analysis of demographic survey data: An application to contraceptive use at first sexual intercourse

Riccardo Borgoni 1

Francesco C. Billari 2

Abstract

In this paper we analyze the spatial patterns of the risk of unprotected sexual intercourse for Italian women during their initial experience with sexual intercourse. We rely on geo-referenced survey data from the Italian Fertility and Family Survey, and we use a Bayesian approach relying on weakly informative prior distributions. Our analyses are based on a logistic regression model with a multilevel structure. The spatial pattern uses an intrinsic Gaussian conditional autoregressive (CAR) error component. The complexity of such a model is best handled within a Bayesian framework, and statistical inference is carried out using Markov Chain Monte Carlo simulation.

In contrast with previous analyses based on multilevel models, our approach avoids the restrictive assumption of independence between area effects. This model allows us to borrow strength from neighbors in order to obtain estimates for areas that may, on their own, have inadequate sample sizes. We show that substantial geographical variation exists within Italy (Southern Italy has higher risks of unprotected first-time sexual intercourse). The findings are robust with respect to the specification of the prior distribution. We argue that spatial analysis can give useful insights on unmet reproductive health needs.

1 Department of Social Statistics, University of Southampton, Highfield, SO17 1BJ, Southampton, UK. E-mail: [email protected]

2 Istituto di Metodi Quantitativi, Università Bocconi, viale Isonzo 25, I-20135 Milano, Italy. E-mail: [email protected]


1. Introduction

In the literature on potential health risks caused by sexual behavior, a particular emphasis is placed on the behavior of adolescents. Adolescents are more likely to contract sexually transmitted diseases (STDs), including HIV/AIDS. Adolescents also have a higher risk of unplanned pregnancies (Kotckick et al., 2001). Consequently, adolescent sexual behavior has been considered one of the fundamental factors for the study of health risks among young people. This is mainly due to the fact that half of all new cases of HIV infections involve young people between 15 and 24 years of age (UNFPA, 2000). Also, the definition of risk behavior is often linked with sexual activity at a particularly early age (Duberstein Lindberg et al., 2000), but attention is also being placed on the use of contraceptive methods at first-time sexual intercourse (Hogan et al., 2000). Ku and collaborators (Ku et al., 1994) have also shown that the use of contraception at one’s initial experience with intercourse has a decisive influence on future contraceptive decisions.

In this paper we analyze data from Italy. In this country, young adults tend to wait to have sexual intercourse considerably longer than their counterparts in other Western countries do (Bozon and Kontula 1997; Cazzola 1999; Ongaro 2001). Our attention is directed towards the reproductive health risks of young people who do not use contraceptive methods at first sexual intercourse. Obviously, this indicator needs to be studied in a multi-dimensional context and may be insufficient when considered on its own. More precisely, the study of risk connected with behavior is flawed when the contraceptive method is not known, as the method may not be suitable for the prevention of STDs and/or pregnancy. This is a concern particularly in Italy where traditional contraceptive methods are still used, even by adults (Note 1) (Bonarini 1999; Spinelli et al., 2000). Because of the limits of available data, this study concentrates on the non-use of contraception at first-time sexual intercourse as a risk indicator, not on different types of birth control methods. This indicator will provide an underestimated risk measure of being exposed to pregnancy (because even when birth control is used, the method involved may not be fully effective), and an even more underestimated risk measure for the exposure to STDs (even effective contraceptive methods such as oral contraceptives do not provide protection against STDs) (Note 2). The indicator may, however, overestimate risks because, especially when first sexual intercourse takes place after first marriage, non-contraception may be a choice, although first sexual intercourse in Italy for recent cohorts is almost completely detached from marriage (Ongaro, 2001).

A fundamental factor in the study of contraceptive decision-making is the environment in which young people operate. The information that flows through social networks, public services, local media, and schools, and the availability of contraceptives are important contextual characteristics in this regard (Brewster et al. 1993; Billy et al. 1994; Teitler and Weiss 2000). This speaks to the importance of geo-referencing individual-level data. Italy is a particularly interesting case for scholars interested in studying geographical differentials, as social and historical heterogeneity within the country corresponds, to a wide extent, to geographical heterogeneity. Throughout Italy, for example, the age at which people have sexual intercourse for the first time varies widely, especially for women (Billari and Borgoni 2002; Ongaro 2001). Mapping territorial influence is also important because any potential policy intervention for risk reduction is more effective when planned at a local level, as in any Western country. With reference to the United States, Teitler and Weiss (2000) found educational context to be of great importance. Furthermore, Mauldon and Luker (1996) showed that contraceptive education makes it one-third more likely that contraception will be used at first-time intercourse, and that condoms will be the method used. The absence of official statistics on the subject creates problems both in terms of measurement and statistical appraisal of reproductive health risks (with the exception of voluntary abortions). In these cases, survey data become extremely useful. Furthermore, if one wants to take into account the importance of spatial distribution, one should apply adequate mapping techniques. Such techniques have to be suitable in terms of controlling for sample variability, possibly by exploiting the data’s geographical distribution and the differential spatial incidence of the phenomenon of interest.

This paper proposes methods for the study of spatial distribution in Italy pertaining to the non-use of contraceptives at first-time sexual intercourse by studying the data provided by the Fertility and Family Survey. More precisely, with a spatial statistical model featuring a Bayesian approach, we produce maps at the provincial level. The paper is structured as follows: Section 2 provides an introduction to the data studied. The methodological approach and estimation methods are given in Section 3. The application and results are given in Section 4, and Section 5 contains some concluding remarks. The appendix provides further diagnostics and sensitivity analysis of the MCMC inference.

2. Data: Geo-referenced FFS

The data analyzed for this research are from the Fertility and Family Survey (FFS), a sample survey conducted by means of an interviewer-administered questionnaire (De Sandre et al. 1997) which contains questions about the respondent’s present situation and biography. The study was part of a program implemented by the Population Activities Unit (PAU) of the United Nations Economic Commission for Europe (UN/ECE) with the intent of collecting comparative information on an international scale.

The sampling strategy, which was created by the National Statistical Institute (ISTAT), is described in detail by Zannella et al. (1997). The survey has a three-stage design (municipality, electoral section, and individual) and the sample comprises 6,030 individuals, with two independent samples for men and women. More precisely, 4,824 women and 1,206 men born between 1946 and 1975 were interviewed. The interviews were carried out between November 1995 and January 1996 and contained questions pertaining to a series of bio-demographic events. The age of the respondents when these events were experienced was also registered (Note 3).

For the following analysis, we select the subset of individuals who had experienced sexual intercourse. The vast majority of the sample, 5,279 individuals, is included in this subset. Furthermore, all of the respondents whose answer on whether they had experienced sexual intercourse was missing are excluded from the data set (95 cases). The data are considered to be reliable and of good quality on the basis of cross-validation with other surveys (Cazzola 1999). In general, data on retrospectively reported contraceptive use at first intercourse are of sufficiently high reliability in a wide variety of contexts (Wilder 2000). In what follows, we only differentiate between respondents who had used contraception during their initial sexual intercourse experience and those who had not used any contraceptive method. We include cases in which respondents answered "don’t know" on the use of contraception (24 cases) as part of the set of those who had not used any contraception. We thus assume that in the absence of explicit reporting of contraceptive use, contraception was not used.

In order to study where respondents lived when they had their first sexual intercourse, we use the respondent’s place of main residence during the first fifteen years of life (Note 4). This may not have been the respondent’s actual hometown, but we assume that it is a fair reflection of the environment in which the first sexual intercourse was experienced. Individuals living abroad at age fifteen and individuals who did not indicate any residence location are excluded. The municipal data are then aggregated at the provincial level. Given the methodological focus of the present work, we limit our attention exclusively to the sub-sample of women. Finally, cases in which age at first sexual intercourse was not given are excluded from the analysis (32 cases). The final data set includes 4,006 female cases. Among them, 47.6% used some contraceptive method during their first sexual experience. The sample distribution across age and cohorts is reported in Table 1.

Figure 1 (a) shows the sample size by province while Figure 1 (b) shows the number of women who used contraception during their first sexual intercourse. As there are 103 provinces in Italy, the sample size is quite small for many provinces (40% of the provinces have fewer than 18 observations, and the same share has fewer than 8 cases). This results in a wide variability of the summary statistics, like relative frequencies, in those provinces with few observations.

Table 1: Sample size and percentage of contraceptive use by age at first intercourse and cohort

                  Number of women (by cohort)           Percentage of contraceptive use (by cohort)
Age at first
intercourse       1946-55  1956-65  1966-76   total     1946-55  1956-65  1966-76   total
under 18              193      388      341     922        35.8     51.8     63.9    52.9
18-20                 549      643      577    1769        35.9     44.2     72.1    50.7
21-24                 416      289      234     939        28.4     38.4     65.4    40.7
over 25               193      139       44     376        32.1     40.3     52.3    37.5
total                1351     1459     1196    4006        33.0     44.7     67.7    47.6

Figure 1: Sample size (a) and contraceptive use counts (b)

[Two maps of the Italian provinces, panels a) and b). Legend classes: 0 - 3, 3 - 8, 8 - 14, 14 - 25, 25 - 175 and 1 - 8, 8 - 17, 17 - 34, 34 - 55, 55 - 307.]


In Figure 2, we show the empirical probabilities for the 8 provinces with the largest sample size (more than 78 units, which is approximately the upper decile of the distribution of the number of observations across provinces) and for the 10 provinces with the smallest sample size (Note 5) (less than or equal to 5, approximately the first decile of the distribution of the sample size across provinces) along with their 95% confidence intervals (Note 6).

[Figure 2 plot: relative frequency of contraceptive use (vertical axis, 0.0 to 1.0) by province. Markers: ♦ smaller sample size, • bigger sample size.]

Figure 2: Frequency of contraceptive use for the provinces with more than 78 observations and less than 5

The map in Figure 3, where observed percentages of contraceptive use are depicted across provinces, already shows a spatial pattern. The following analysis thus aims at using a model that borrows strength from neighbors in order to obtain estimates for areas that may, on their own, have inadequate sample sizes.

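Figure 3: Relative frequency of contraceptive use by province

[Map of the Italian provinces. Legend classes: 0 - 0.33, 0.33 - 0.43, 0.43 - 0.54, 0.54 - 0.63, 0.63 - 1.]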

3. Methods: Why Bayesian inference?

This section describes the statistical techniques employed; the results are analyzed in detail in Section 4. We use a Bayesian computational approach to statistical inference. This approach makes use of simulation to estimate the posterior distribution of parameters that otherwise cannot be estimated using standard techniques.

Let O denote the observed data, β the parameter of a model for the data, L(O|β) the likelihood function of the model, and π(β) the prior distribution which conveys past knowledge of the parameter β, or the absence of such past knowledge in the case of prior distributions with high variability. In Bayesian inference, the posterior distribution, π(β|O) ∝ L(O|β) × π(β), links the assumptions made (the prior distribution) with the empirical evidence (the likelihood). The goal is to use the characteristics of this distribution (say the mean or the quantiles) to make inferences about β. This approach can be extended to multivariate contexts where β becomes a vector of parameters. Because the parameters are themselves random variables, it is natural to deal with them in a hierarchical way. This means that we are assuming that their distribution may depend on other parameters, called hyperparameters. These hyperparameters are also random variables with their own prior distributions, called hyperpriors. Such hierarchical thinking helps to understand multiple-parameter problems, to develop computational strategies and, in practice, to provide enough parameters to deal with complex, hierarchical data structures (Gelman et al. 1995).

The derivation of any posterior quantity requires evaluation of integrals of the form ∫ξ(β)π(β|O)dβ, for some function ξ(β). Direct integration of this expression is usually problematic both in analytic and numerical terms. In order to overcome this problem we use the approach known as Markov Chain Monte Carlo (MCMC). MCMC is used to indirectly obtain the posterior distribution: given a sample generated from the posterior distribution, the characteristics of the probability density function are approximated using the sample counterparts. The key to MCMC is to simulate a Markov process whose stationary distribution (the distribution to which the transition distributions converge) is the posterior we are interested in. The simulation has to iterate long enough so that the distribution of the last generated draws is sufficiently close to the posterior distribution.
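As an illustration of how a posterior integral can be replaced by a sample average, the following minimal sketch runs a random-walk Metropolis sampler for a deliberately simple case: a single logit intercept β with a flat prior and Bernoulli data. The data (90 successes in 200 trials), the step size, the chain length and all function names are hypothetical choices made for this example; this is not the sampler used in the paper, which is described in Section 3.2.

```python
import math
import random


def log_posterior(beta, successes, trials):
    """Unnormalised log posterior: Bernoulli-logit log-likelihood plus a flat prior."""
    # log(1 + exp(beta)), written so that exp never overflows
    softplus = max(beta, 0.0) + math.log1p(math.exp(-abs(beta)))
    return successes * beta - trials * softplus


def metropolis(successes, trials, n_iter=20000, burn_in=2000, step=0.3, seed=0):
    """Random-walk Metropolis draws of beta, returned after discarding the burn-in."""
    rng = random.Random(seed)
    beta = 0.0
    current = log_posterior(beta, successes, trials)
    draws = []
    for _ in range(n_iter):
        proposal = beta + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, successes, trials)
        # accept the proposal with probability min(1, posterior ratio)
        if rng.random() < math.exp(min(0.0, candidate - current)):
            beta, current = proposal, candidate
        draws.append(beta)
    return draws[burn_in:]


def posterior_expectation(draws, xi):
    """Sample counterpart of the integral of xi(beta) * pi(beta | O)."""
    return sum(xi(b) for b in draws) / len(draws)


# Hypothetical data: 90 users of contraception among 200 respondents.
draws = metropolis(successes=90, trials=200)
mean_probability = posterior_expectation(draws, lambda b: 1.0 / (1.0 + math.exp(-b)))
```

Here ξ(β) is the inverse logit, so the average approximates the posterior mean of the probability of use. Any other posterior summary (a quantile, a variance) can be obtained from the same draws.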

A full discussion of the advantages and the features of Bayesian modeling and MCMC is, however, far beyond the scope of this paper. The reader may refer, for instance, to Congdon (2001) for an applied perspective, and to Gelman et al. (1995) and Gilks et al. (1996) for a more formal presentation.

We now give a description of the statistical model we used, and we describe the estimation algorithms used in the inferential procedures based on the MCMC approach.

3.1 A Bayesian spatial model

Epidemiologists often study disease risk clustered within geographical areas (see, for instance, Elliot and Wakefield 2001, Lawson 2001). The clustering of risks has been defined in different ways (Wakefield et al. 2000). Knox (1989) defines a cluster in a qualitative and general manner as a group of occurrences that are geographically and/or temporally bounded, that are related to each other through some social or biological mechanism or that have a common relationship with some other event or circumstance. One may for instance hypothesize that contraceptive use varies geographically without an underlying spatial trend, but only due to various local effects. Testing this hypothesis requires the introduction of a variance component into the model. The assumption of independent and identically distributed variance components is, however, generally untenable. Clusters, in fact, often depend on cultural and economic factors which we can reasonably assume to be more similar in adjacent areas than in distant ones. In other words, we can expect the effects within adjacent areas to be correlated. In Section 4.3, we return to the problems arising when one does not consider this type of correlation.

The dependent variable, Y, is dichotomous, where the value 1 represents the respondent’s use of a contraceptive method at first intercourse (0 otherwise). The probability assumption we make is that Y follows a Bernoulli distribution with mean p, and that p is linked to a linear predictor through a logit link function. To model spatial patterns, we add two random variables to the linear predictor. The first random variable, U, is assumed to be independently and identically distributed for all areas (the unstructured component). The second random variable, S, represents a spatial process (the structured component). The model is formally specified as follows:

\[ \operatorname{logit}(p_{gi}) = X_{gi}'\beta + U_g + S_g, \qquad g = 1, \dots, G \tag{1} \]

where G stands for the number of areas taken into consideration (in our case the 103 Italian provinces) and the subscript gi refers to the generic i-th sample unit in province g.
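To make the role of the three terms in the linear predictor concrete, the sketch below evaluates the probability of contraceptive use for one respondent from a covariate vector, a coefficient vector and the two area effects of her province. All numerical values and function names are hypothetical and serve only to show how equation (1) is inverted.

```python
import math


def inverse_logit(eta):
    """Map a linear predictor on the logit scale back to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def use_probability(x, beta, u_g, s_g):
    """p_gi implied by logit(p_gi) = x'beta + U_g + S_g."""
    eta = sum(x_k * beta_k for x_k, beta_k in zip(x, beta)) + u_g + s_g
    return inverse_logit(eta)


# Hypothetical respondent: intercept plus two dummy covariates.
x = [1.0, 0.0, 1.0]
beta = [-0.7, 0.1, 0.5]      # hypothetical fixed effects
p_without_area = use_probability(x, beta, u_g=0.0, s_g=0.0)
p_with_area = use_probability(x, beta, u_g=0.05, s_g=0.30)
```

Because the area effects enter additively on the logit scale, a positive U_g + S_g raises the probability for every respondent in province g, whatever her covariates.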

Hereafter, the β’s are referred to as fixed effects and the spatial components as random effects. We should keep in mind that in a Bayesian framework all the parameters of the model are random variables. Unlike in the frequentist tradition, the term 'fixed' means that no higher hierarchical level is present beyond the assumed prior, while the term 'random' means that a further prior (called hyperprior) is elicited for the parameters (the so-called hyperparameters) of such distributions (Gelman et al. 1995).

Regarding the unstructured component, we assume that the vector U’ = {U1, …, UG} follows a normal prior distribution with zero mean vector and variance-covariance matrix σ²I (with I being the identity matrix and σ² > 0 unknown). For the structured component S’ = {S1, …, SG}, we assume that the prior is represented by a Gaussian Markov random field or conditional Gaussian autoregressive model (Besag et al. 1991).

In this case, therefore, with S_{-g} indicating the vector of the effects excluding that of the g-th location, we assume that:

\[ S_g \mid S_{-g} \sim N\Big( \sum_{r} w_{gr} S_r,\ \tau_g^2 \Big) \]

where w_gr and τ_g are defined in terms of the precision matrix. More specifically, we use a limit model in the form of intrinsic Gaussian autoregression (Besag and Kooperberg 1995). In this model, the only relevant weights for the determination of the conditional law are those of the adjacent areas. Denoting the set of areas adjacent to province g by ∂g and its cardinality by mg, a weight equal to 1/mg is assigned to each adjacent area and a weight of 0 otherwise. The mean and the variance are given respectively by:

\[ E(S_g \mid S_{-g}) = \sum_{r \in \partial_g} S_r / m_g, \qquad \operatorname{Var}(S_g \mid S_{-g}) = \tau^2 / m_g \tag{2} \]

which are expressions that make the model’s Markov structure clear.

The rationale behind this hypothesis is that a spatial effect is usually a surrogate of many unobserved influential factors. Some such factors may obey a strong spatial structure, while others may act only locally. The two different random processes are supposed to capture such a double source of randomness.

The structured part of the prior allows us to borrow strength from neighbors in order to make the estimates based on inadequate sample sizes more robust. Accordingly, equation (2) states that the conditional mean of Sg is the average of the neighboring effects, and that the conditional variance decreases as the number of neighbors increases (Note 7).
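The conditional moments in equation (2) can be computed directly from an adjacency list. The sketch below does so for a hypothetical map of four areas arranged in a row; the map, the effect values and the value of τ² are invented for illustration and are not taken from the Italian data.

```python
def car_conditional(g, effects, neighbors, tau2):
    """Conditional mean and variance of S_g given all other effects (intrinsic CAR)."""
    adjacent = neighbors[g]
    m_g = len(adjacent)                      # number of neighbors of area g
    mean = sum(effects[r] for r in adjacent) / m_g
    variance = tau2 / m_g
    return mean, variance


# Hypothetical map: areas 0-1-2-3 in a row, so the end areas have one neighbor each.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
effects = [0.4, 0.1, -0.2, -0.5]             # hypothetical current values of S_1..S_G
tau2 = 1.0                                   # hypothetical variance parameter

mean_inner, var_inner = car_conditional(1, effects, neighbors, tau2)
mean_edge, var_edge = car_conditional(0, effects, neighbors, tau2)
```

The inner area, with two neighbors, has half the conditional variance of the edge area, which is the sense in which areas with more neighbors borrow more strength.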

Finally, in the model’s third stage, we select prior distributions for the fixed effects and for the hyperparameters. We assume a highly dispersed but proper inverse gamma distribution both for the σ² and for the τ² parameter (specifically, we used an inverse gamma law with scale parameter b=1 and shape parameter a=0.005). As we do not have any specific information about the fixed parameters β, concerning either their range or their sign, we have assumed a "diffuse" prior distribution for them, that is, π(βj) is proportional to a constant for each regression parameter βj. Our assumptions on prior distributions give almost all the weight to the likelihood function in the construction of posterior distributions (Note 8). For models such as the one we illustrated, identification is a potential problem, and an assessment of a possible under-identification of the problem requires the calculation of model diagnostics. We discuss such problems in detail in Appendix 1.

3.2 Monte Carlo Markov Chain inference

For complex models such as the one introduced in Section 3.1, the posterior distribution is both analytically and numerically intractable. Monte Carlo simulation methods are based on the principle of sampling from the posterior distribution and subsequently using these simulated samples to estimate the characteristics (moments) of the posterior distribution. In particular, Markov Chain Monte Carlo methods (Gilks et al. 1996) are based on the simulation of observations of a Markov chain whose transition probabilities converge to the selected posterior distribution. The obtained sample is adopted for inference, after the exclusion of an initial number of observations used to let the chain converge (burn-in). For the model adopted in this study, we make the following assumptions:

• conditional on explanatory variables and on the entire set of parameters, observations are independent;

• prior distributions for fixed and random effects and hyperpriors are mutually independent.

Given these assumptions, we denote the number of explanatory variables in the model by K, the conditional distributions generically by π(z|y), and the contribution to the likelihood of the i-th unit in area g by Lgi(ygi), i = 1, …, ng (where ng represents the number of observations in the g-th area). The posterior distribution is then factorized as follows:

\[
\pi(s, u, \tau^2, \sigma^2, \beta \mid y_1, \dots, y_n) \propto \prod_{g=1}^{G} \prod_{i=1}^{n_g} L_{gi}(y_{gi} \mid s_g, u_g, \beta) \; \prod_{g=1}^{G} \pi(u_g \mid \sigma^2)\, \pi(\sigma^2) \; \prod_{g=1}^{G} \pi(s_g \mid s_{-g}, \tau^2)\, \pi(\tau^2) \; \prod_{j=1}^{K} \pi(\beta_j)
\]

MCMC inference is based, therefore, on sample simulation from each full conditional distribution.

The algorithms used for the MCMC simulations in this paper are described in detail in Fahrmeir and Lang (2001-A), so we provide only a brief description here. We adopt the algorithm proposed by Gamerman (1997) to update the model’s fixed parameter values and those of the random unstructured intercepts. The essential idea behind Gamerman's algorithm is to generate an observation by means of a Metropolis-Hastings step inside the iteratively weighted least squares (IWLS) procedure (Note 9). The algorithm we use for the updating of the structured component (CAR) was originally proposed by Knorr-Held (1999) within the framework of dynamic generalized linear models. This algorithm aims at improving the mixing and convergence of simulated chains by updating, at each iteration, a block of values (Note 10).

The full conditional distributions for the variance parameters of the spatial effects are again inverse gamma distributions. Therefore, we can update the values of the chain using a Gibbs sampler (Fahrmeir and Lang 2001-A).
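A minimal sketch of such a Gibbs step is given below for the variance σ² of the unstructured effects only. It assumes the inverse gamma density is parameterized as proportional to x^(-a-1) exp(-b/x), in which case standard conjugacy gives a full conditional that is inverse gamma with shape a + G/2 and scale b + ΣU_g²/2. The parameterization, the function name and the example values of U are assumptions made for illustration; the update for τ² has the same form but involves a quadratic form in the neighboring differences of S, which is not shown here.

```python
import random


def draw_sigma2(u, a=0.005, b=1.0, rng=random):
    """One Gibbs draw of sigma^2 from its inverse gamma full conditional."""
    shape = a + len(u) / 2.0
    rate = b + sum(value ** 2 for value in u) / 2.0
    # If X ~ Gamma(shape, scale = 1 / rate), then 1 / X ~ InverseGamma(shape, rate).
    return 1.0 / rng.gammavariate(shape, 1.0 / rate)


rng = random.Random(42)
# Hypothetical current values of U_1..U_G for G = 103 areas.
u = [0.1 * ((g % 5) - 2) for g in range(103)]
sigma2_draws = [draw_sigma2(u, rng=rng) for _ in range(20000)]
```

In a full sampler this draw would be made once per iteration, using the values of U produced by the preceding Metropolis-Hastings step.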


4. Application to contraceptive use: methodological aspects and results

After giving a brief description of the application’s technical aspects, this section contains a discussion of our results (Note 11). We then compare our results to those that would likely have been obtained by adopting an approach that is common in multilevel statistical models, where the usual assumption is the mutual independence between areas.

4.1 Monte Carlo Markov Chain setup

As indicated in the previous section, the basic idea of MCMC methods is to give an approximation of the posterior distribution by means of a sample simulated through a Markov chain. The observations of the Markov chain are thus correlated. In principle, as many authors have observed (e.g. Gelman 1996), no substantial problems arise when considering correlated observations in MCMC inference. Nevertheless, correlation may imply the need for simulating very long chains. In this case, computing the results can become burdensome in terms of calculation. One solution is to use for inference only one observation out of every k simulated (this is known as thinning). At the same time, choosing the interval k according to the auto-correlation in the chain reduces the correlation in the sample actually used for inference (Note 12).
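The effect of thinning on autocorrelation can be sketched as follows. The chain here is a simulated first-order autoregressive series standing in for MCMC output; its coefficient (0.9), its length and the thinning interval k = 20 are hypothetical values chosen for illustration, not the settings used in the paper.

```python
import random


def autocorrelation(chain, lag):
    """Sample autocorrelation of a chain at a given lag."""
    n = len(chain)
    mean = sum(chain) / n
    variance = sum((x - mean) ** 2 for x in chain)
    covariance = sum((chain[t] - mean) * (chain[t + lag] - mean) for t in range(n - lag))
    return covariance / variance


def thin(chain, k):
    """Keep one observation out of every k."""
    return chain[::k]


# Stand-in for correlated MCMC output: an AR(1) series with coefficient 0.9.
rng = random.Random(1)
chain, x = [], 0.0
for _ in range(20000):
    x = 0.9 * x + rng.gauss(0.0, 1.0)
    chain.append(x)

thinned = thin(chain, 20)
rho_full = autocorrelation(chain, 1)
rho_thinned = autocorrelation(thinned, 1)
```

The thinned sample is twenty times shorter, which lowers storage and post-processing cost, while its lag-one autocorrelation is much closer to zero than that of the full chain.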

4.2 Results

We include as explanatory variables, in addition to the province the respondents lived in up until age 15, the respondent’s age at first sexual intercourse (divided into four categories: 14-17 years old as the reference group, 18-20 years old, 21-24 and over 25 years old), and birth cohort (divided into three categories: 1946-55 as the reference group, 1956-65 and 1966-75).

The results of the analyses regarding fixed effects are given in Table 2. Age at first sexual intercourse does not have a significant effect, meaning that the credibility interval, given by the 10th and 90th percentiles of the posterior distribution, contains 0 for all categories examined. In contrast, birth cohort significantly influences the probability of using contraception. In other words, on the basis of this analysis, the propensity of Italian women to use contraception at first intercourse does not depend on age (even though respondents aged 21-24 appeared to be less inclined to use contraception, as their odds were 7.8% lower than those of the 14-17 year old reference group). Regarding cohorts, the odds for the 1956-65 cohort are 69.5% higher than those for the oldest cohort, and the odds for the cohort that was youngest at the date of the interview (those born 1966-75) are 345% higher than those for the reference cohort. These values confirm the strong differentials in frequency of use of contraception mentioned in the introduction.

Table 2: Fixed effects estimates.

Variable           Mean     S.D.    1st decile   Median   9th decile
Intercept         -0.716    0.096     -0.838     -0.715     -0.593
Age 18-20          0.066    0.088     -0.049      0.065      0.181
Age 21-24         -0.081    0.105     -0.210     -0.081      0.052
Age 25 and over    0.003    0.141     -0.177      0.002      0.183
1956-65 cohort     0.528    0.087      0.417      0.528      0.638
1966-75 cohort     1.493    0.093      1.377      1.491      1.615

Note: reference groups: age 14-17 and 1946-55 cohort.
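The percentage changes in odds quoted in the text can be recovered from the posterior means in Table 2, since the coefficients are on the log-odds scale. The short Python sketch below shows the conversion; the dictionary and function names are ours and purely illustrative.

```python
import math

# Posterior means from Table 2 (log-odds scale).
posterior_means = {
    "Age 18-20": 0.066,
    "Age 21-24": -0.081,
    "Age 25 and over": 0.003,
    "1956-65 cohort": 0.528,
    "1966-75 cohort": 1.493,
}


def percent_change_in_odds(coef):
    """Percentage change in the odds implied by a logit coefficient."""
    return (math.exp(coef) - 1.0) * 100.0


for name, coef in posterior_means.items():
    print(f"{name}: {percent_change_in_odds(coef):+.1f}%")
```

Up to rounding of the published coefficients, this reproduces the figures in the text: roughly 7.8% lower odds for ages 21-24, and roughly 69.5% and 345% higher odds for the two younger cohorts.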

Turning to the spatial aspects, studied while controlling for age and cohort effects, Figure 4 (a) shows immediately that the effect of the area on the odds becomes increasingly marked moving from the south to the north of the peninsula. Figure 4 (b) shows the provinces in which the effect is positively significant, negatively significant or not significant, meaning that the credibility interval given by the 10th and 90th posterior percentiles lies above 0, lies below 0 or contains 0, respectively.
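The significance rule used for the maps can be sketched in a few lines of Python. This is an illustration under assumptions, not the authors' code: the function name and the artificial posterior draws are hypothetical, and the deciles are computed with the default method of the standard library, which may differ slightly from the percentile definition used in the paper.

```python
import statistics


def classify_effect(draws):
    """Classify an effect using the 10th and 90th percentiles of its posterior draws."""
    deciles = statistics.quantiles(draws, n=10)  # nine cut points: 10%, 20%, ..., 90%
    lower, upper = deciles[0], deciles[-1]
    if lower > 0:
        return "positive"
    if upper < 0:
        return "negative"
    return "not significant"


# Hypothetical posterior draws for three provinces (illustration only).
north = [0.2 + 0.01 * i for i in range(100)]    # all draws above 0
south = [-0.2 - 0.01 * i for i in range(100)]   # all draws below 0
centre = [-0.5 + 0.01 * i for i in range(100)]  # draws on both sides of 0

for name, draws in [("north", north), ("south", south), ("centre", centre)]:
    print(name, classify_effect(draws))
```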

The maps in Figure 4 show a significant level of spatial heterogeneity between the various Italian provinces which, among other issues, reflects the south-north differentials in age at first sexual intercourse found in the same data (Billari and Borgoni 2002; Ongaro 2001).

Considering only the structured component and not the spatially unstructured one may sometimes be misleading, and may not provide complete information. Only by comparing the two can we understand which areas deviate from the latent structure (in our case, the south-north gradient).

Figure 4: Structured spatial effects (posterior mean) (a) and their credibility intervals (b)

Figure 5 (a) represents unstructured spatial effects (posterior means). The values used for the map are determined according to the distance, in units of standard deviation, from the average province value. White provinces constitute the middle group, as they include the overall mean (0) of the estimated effects. The upper and lower values used to delimit this middle group are obtained by adding and subtracting one standard deviation from the overall mean (Note 13). In the unstructured component, no particular trends are evident. In any case, the effects are less relevant in statistical terms than in the structured component. Only eight provinces show significant effects (in terms of an 80% credibility interval). These provinces are highlighted in color in Figure 5 (b), whereas the provinces with non-significant effects are left in white. The provinces of Genoa, Florence, Rome and Siena stand out as having a positively significant unstructured effect. The provinces with a negatively significant unstructured effect are Caserta, Como, Salerno and Taranto.

Figure 5: Unstructured spatial effects (posterior mean) (a) and their credibility intervals (b)
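The map classes for the unstructured effects (a middle group within one standard deviation of the mean, then bands of one standard deviation each) can be sketched as follows. The function name, the handling of values falling exactly on a class limit and the example effects are assumptions made for illustration; the paper does not spell out these details.

```python
import math
import statistics


def sd_class(effect, mean, sd):
    """Map class of an effect: 0 is the middle group (within one SD of the mean),
    +1/-1 the next one-SD band above/below it, and so on."""
    z = (effect - mean) / sd
    band = max(0, math.ceil(abs(z)) - 1)
    return band if z > 0 else -band


# Hypothetical posterior means of unstructured effects (illustration only).
effects = [-0.30, -0.12, -0.05, 0.00, 0.04, 0.10, 0.33]
mean = statistics.fmean(effects)
sd = statistics.pstdev(effects)

for e in effects:
    print(f"{e:+.2f} -> class {sd_class(e, mean, sd):+d}")
```

With this convention the middle group spans two standard deviations and every other group spans one, as stated in Note 13.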

Finally, by comparing the distribution of structured and unstructured provincial effects across Italy, we observe that the probability interval given by the 5th and 95th percentiles is equal to (-0.274, 0.329) for unstructured effects and (-0.693, 0.362) for structured effects. The structured effects thus span a wider range (1.055 against 0.603). To sum up, we conclude that the structured component has more relevance than the unstructured component (Note 14).

4.3 Is there a problem with standard multilevel modeling approaches?

In social-demographic research, multilevel statistical models are often used to study spatial clustering problems (Goldstein 1999). With such models, it is usually assumed that the random components at the contextual level are mutually independent. Even though quite common, this assumption is not actually implied by the multilevel approach, so correlated random residuals can also be specified (e.g. Langford et al. 1999). The independence assumption has an inherent problem of inconsistency: if the spatial context matters, it makes sense to assume that areas close to each other are more similar than areas that are far apart. That is, the spatial correlation (i.e. structured component) that we used in the model presented in Section 3.1 applies here as well. In order to emphasize the differences that can be found by adopting this approach in a spatial context and the possible risks involved with the violation of the assumption of independence between aggregated spatial units, we used Model (1) with the non-structured variance component only.

The results regarding the fixed parts of the model are very similar to the results obtained by including both spatial components, and are therefore not reported here. Instead, the model’s posterior effects are presented in Figure 6. The map is composed in the same way as in Figure 4 (a), using quintiles of the territorial distribution of the effects. Even though the center-northern area may appear, on average, to be darker compared to the center-south, detecting a clear spatial trend is now much harder than before: local and trend effects appear to be mixed. Thus, had we neglected the correlation between areas, significant information would have been lost.


Figure 6: Unstructured spatial effects (posterior mean) for a multilevel model without the structured spatial component

[Map legend, class limits: -1.351 to -0.418; -0.418 to -0.168; -0.168 to 0.025; 0.025 to 0.239; 0.239 to 0.434; 0.434 to 0.977]

5. Conclusion and future developments

In this paper, we show how it is possible to map a social-demographic risk indicator with the use of data taken from a sample survey that is not directly designed to ensure coverage of geographical units such as those of interest. The Bayesian computational approach we adopted enables us to provide estimates of parameters in an otherwise too complex model. It also allows us to refrain from the assumption of mutual independence between areas usually imposed in multilevel statistical models.

The absence of information from the survey regarding the type of method used limits the potential of the data studied. Thus, for future reference, it is necessary for this information to be collected on a geo-referenced basis. One could gain a considerable advantage for spatial analysis with a time-varying geo-referencing of life-course events. In this way, it would also be possible to refer an event to the individual’s context at the time it was experienced.

When considering the structured component, our results reveal how the southern areas have a higher risk at initial sexual intercourse, while the unstructured component indicates that the southern provinces of Caserta, Salerno and Taranto have higher risk factors. We can speculate that less available information and more pressure by the partner against using contraception lie at the root of the territorial differences we find, even though further research is definitely necessary to verify this assumption. As a final caveat, however, we must add that the data provided do not allow the study of the reasons why contraception is not used: therefore, we cannot exclude that there are cases where conception is actually desired in first-time intercourse. More generally, maps enable us to produce "narratives" which are valuable for theory formation (Lesthaeghe and Neels 2002): the mere mapping of a phenomenon does not explain the phenomenon, although it constitutes a strong foundation.

6. Acknowledgements

This paper was written almost entirely while the two authors were working at the Max Planck Institute for Demographic Research. Earlier versions of this paper were presented at workshops and seminars at the University of Pisa and at the University of Siena, where we received numerous useful comments. We are grateful to Arnstein Aassve, Renée Flibotte-Lüskow, Sheila Mulrooney Eldred, and three anonymous reviewers of Demographic Research for comments and suggestions on the paper. We are also deeply indebted to Dr. Mauro Preda (“Studi su popolazione e territorio” Institute, Catholic University of Milan) for his enlightening advice on Geographical Information Systems and practical support in several of the ArcView analyses presented in this paper.


Notes

1. In the sample we worked with, which is described in detail in Section 2, only 47.6% of the female respondents declared that they had used contraception during first-time intercourse. However, this percentage varies considerably across cohorts, rising from 33% for the 1946-55 birth cohort to 44.7% for the 1956-65 birth cohort and 64.7% for the 1966-75 birth cohort.

2. On the other hand, in some cases there may also have been a desire for pregnancy at first intercourse, which makes the underestimation less serious.

3. The questions relevant to this analysis are given in the ‘Fertility Regulation’ section. The respondents were asked: ‘In order to avoid further questions that may not be relevant, can I ask whether you have ever experienced complete intercourse?’ (possible answers: ‘yes’, ‘no’, and ‘no answer’). In case of an affirmative reply, the person was asked ‘At what age did you have complete intercourse for the first time?’ (with the possibility of stating ‘no answer’) and ‘On that occasion, was any contraceptive method used?’ (possible answers: ‘yes’, ‘no’, ‘don’t know’, and ‘no answer’). Unfortunately, the respondents were not asked what type of contraceptive method they had used, if any; this was investigated only for the last 4 weeks before the interview (Bonarini 1999).

4. The following question was asked: ‘Where did you live for most of the time until you were fifteen years old?’

5. From the set of those Italian provinces eligible to appear in the picture we exclude three provinces with 0 cases of contraceptive use and one province with one observation only.

6. Given the small sample size, we use BCa bootstrap intervals (Davison and Hinkley 1997).

7. We can suggest a demographic meaning for the formulation of the variance in equation (2). If we assume that behavior is influenced in an important way by the context (in our case by geography), and that spatial autocorrelation helps in capturing such influence, the mere fact of having more neighbors allows one to draw more precise conclusions about the average behavior in a given province.

8. An alternative to the Bayesian approach could be a "mixed effects" model where only the spatial variation is regarded as random. Examples of this approach are given by Pinheiro and Bates (2000), where both random effects and errors follow a Gaussian distribution, and by Langford et al. (1999) and Leyland (2001) for a Poisson model. Langford and his colleagues estimate the model via iterative generalized least-squares (IGLS). The main advantage of the Bayesian approach consists in providing the full posterior distribution, while IGLS produces estimates of spatial residuals and their standard errors, the latter using sample estimates only (Leyland, 2001). On the other hand, the Bayesian approach is much more computationally intensive than IGLS.

9. This procedure is usually chosen for maximum likelihood estimation within the framework of generalised linear models (McCullagh and Nelder 1989).

10. These types of algorithms gain considerable computational advantages from the sparse structure of the correlation matrix, which also reflects the adjacency relations between areas. For further developments and details on highly efficient calculation methods for the generation of Gaussian Markov fields, see Rue (2001).

11. We conducted the analyses presented in this study using BayesX (Lang and Brezger 2000). BayesX is a package for Bayesian data analysis, freely available through the Internet at www.stat.unimuenchen.de. The analyses could alternatively have been conducted with other software products for Bayesian analysis, particularly WinBUGS (Spiegelhalter et al. 2000). S-Plus functions (Venables and Ripley 2000) were used for the final preparation of the results and for the graphics, whereas the various maps were generated with ArcView GIS (ESRI 1996).

12. For this reason, we first simulated a preliminary chain of 22,000 iterations (of which 2,000 were used for burn-in), from which we analysed the autocorrelation function. The correlation turned out to be negligible (< 0.15) for lags larger than 25-30 for all the parameters studied in the model, so that in the final simulation a thinning interval of one observation every 30 was used. This final simulation consisted of 95,000 iterations (5,000 for the chain burn-in), of which 3,000 (one out of every 30) were used for final estimation. A diagnostic on the results obtained by the MCMC inference is given in Appendix 2.

13. The middle group thus covers a two-standard-deviation interval, while other groups cover one-standard-deviation intervals.

14. In our analyses, we evaluated the trade-off between S and U in a graphical way, looking at maps and at some summary statistics of the distribution of effects (as in Fahrmeir and Lang, 2001-B). Other more formal approaches are also possible and have been suggested in the literature. For instance, Best et al. (1999) define the quantity

$$\psi = \frac{\mathrm{SD}(S)}{\mathrm{SD}(S) + \mathrm{SD}(U)},$$

where SD(⋅) is the empirical marginal standard deviation of the considered random effect. ψ summarizes the posterior proportion of variation due to an excess of variation of the structured effect. A posterior of the index concentrated near 1 suggests that most of the excess variation is due to the structured part, while a posterior close to 0 suggests that the unstructured component is the most relevant. Eberly and Carlin (2000) discuss whether ψ can be fruitfully used to summarize the level of Bayesian learning about structured and unstructured components.
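As a hedged illustration of the index described in Note 14, the following Python sketch computes ψ for one set of structured and unstructured effects. The function name and the example values are hypothetical; in an actual analysis the computation would be repeated for every retained MCMC draw to obtain the posterior distribution of ψ.

```python
import statistics


def psi(structured, unstructured):
    """psi = SD(S) / (SD(S) + SD(U)) for one set of structured/unstructured effects.

    Both sets have the same number of areas, so using the population or the
    sample standard deviation gives the same ratio.
    """
    sd_s = statistics.pstdev(structured)
    sd_u = statistics.pstdev(unstructured)
    return sd_s / (sd_s + sd_u)


# Hypothetical effects for five areas in one MCMC iteration (illustration only).
s_draw = [-0.6, -0.3, 0.0, 0.2, 0.4]
u_draw = [-0.1, 0.05, 0.0, 0.1, -0.05]
print(round(psi(s_draw, u_draw), 2))  # above 0.5: the structured part dominates here
```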


References

Besag, J., Kooperberg, C. (1995). "On conditional and intrinsic autoregression", Biometrika, 82(4): 733-746.

Besag, J., York, J., Mollié A. (1991). "Bayesian image restoration, with two applications in spatial statistics", Annals of the Institute of Statistical Mathematics, 43(1): 1-59.

Best N.G., Arnold R. A., Thomas A. (1999) Bayesian models for spatially correlated disease and exposure data, In Bernardo J. M., Berger J.O., Dawid A. P. and Smith A. F. M. Bayesian Statistics 6, Oxford University Press, 131-156.

Billari, F.C., Borgoni, R. (2002). "Spatial profiles in the analysis of event histories: an application to first sexual intercourse in Italy", International Journal of Population Geography, 8: 261-275.

Billy, J., Brewster, K., Grady, W. (1994). "Contextual effects on the sexual behavior of adolescent women". Journal of Marriage and the Family, 56: 387-404.

Bonarini, F. (1999). L'uso della contraccezione in Italia: dalla retrospezione del 1979 a quella del 1995-96. In De Sandre, P., Pinnelli, A., Santini, A., editors. Nuzialità e fecondità in trasformazione: percorsi e fattori del cambiamento. Bologna: il Mulino: 395-411.

Bozon, M, Kontula O. (1997). "Initiation sexuelle et genre. Comparaison des évolutions de douze pays européens", Population, 6: 1367-1400.

Brewster, K.L., Billy, J.O., Grady, W.R. (1993). "Social context and adolescent behaviour: The impact of community on the transition to sexual activity", Social Forces, 71(3): 713-740.

Brooks, S. P., Gelman, A. (1998). "General methods for monitoring convergence of iterative simulations", Journal of Computational and Graphical Statistics, 7: 434-455.

Cazzola, A. (1999). L'ingresso nella sessualità adulta. In De Sandre, P., Pinnelli, A., Santini, A. editors, Nuzialità e fecondità in trasformazione: percorsi e fattori del cambiamento. Bologna: il Mulino: 311-326.

Congdon, P. (2001). Bayesian statistical modelling, Wiley, New York.

Davison A.C., Hinkley D.V., (1997). Bootstrap Methods and their Application. Cambridge: University Press.


De Sandre, P., Ongaro, F., Rettaroli, R., Salvini, S. (1997). Matrimonio e figli: tra rinvio e rinuncia. Bologna: il Mulino.

Duberstein Lindberg, L., Boggess, S., Porter, L., Williams S. (2000). Teen risk-taking: A statistical portrait. Washington, DC: Urban Institute.

Eberly L. E., Carlin B. P. (2000). Identifiability and Convergence Issues for Markov Chain Monte Carlo Fitting of Spatial Models, Statistics in Medicine, 19: 2279-2294.

Elliot, P., Wakefield, J. (2001). "Disease clusters: Should they be investigated, and, if so, when and how", Journal of the Royal Statistical Society, Series A, 164: 3-12.

ESRI (1996). Using ArcView GIS, Redlands CA: Environmental Systems Research Institute.

Fahrmeir, L., Lang, S. (2001-A). "Bayesian Inference for generalised additive mixed models based on Markov random field priors", Applied Statistics, 50: 201-220.

Fahrmeir, L., Lang, S. (2001-B). Bayesian Semiparametric Regression Analysis of Multicategorical Time-Space Data. Annals of the Institute of Statistical Mathematics, 53: 10-30.

Gamerman, D. (1997). "Sampling from the posterior distribution in generalised linear mixed model", Statistics and Computing, 7: 57-68.

Gelfand A.E., Sahu S.K. (1999). Identifiability, Improper priors, and Gibbs Sampling for generalized linear model. Journal of the American Statistical Association, 94: 247-253.

Gelfand A.E., Carlin B.P., Trevisani M. (2001). On computation using Gibbs sampling for multilevel models. Statistica Sinica, 11: 981-1003.

Gelman A., (1996). Inference and monitoring convergence. In Gilks, W. R., Richardson, S., Spiegelhalter, D. J., editors. Markov Chain Monte Carlo in practice. London: Chapman and Hall: 131-143.

Gelman, A., Carlin, J. B., Stern, H. S., Rubin, D. B. (1995). Bayesian data analysis. London: Chapman and Hall.

Gelman, A., Rubin, D. (1992). "Inference for iterative simulation using multiple sequences", Statistical Science, 7: 457-511.

Gilks, W.R., Richardson, S., Spiegelhalter D.J. (1996). Markov Chain Monte Carlo in practice, London: Chapman and Hall.


Goldstein, H. (1999). Multilevel statistical models. First Internet Edition, http://multilevel.ioe.ac.uk/index.html.

Hogan, D.P., Sun, R., Cornwell, G.T. (2000). "Sexual and fertility behaviors of American females aged 15-19 Years: 1985, 1990 and 1995", American Journal of Public Health, 90(9): 1421-1425.

Knorr-Held, L. (1999). "Conditional prior proposals in dynamic models", Scandinavian Journal of Statistics, 26: 129-144.

Knox, G. (1989). Detection of clusters. In Elliot, P. editors. Methodology of enquiries into disease clustering. London: Small Area Health Statistic Unit: 17-22.

Kotchick, B.A., Shaffer A., Forehand, R., Miller, K.S. (2001). "Adolescent sexual risk behavior: A multi-system perspective", Clinical Psychology Review, 21(4): 493-519.

Ku, L., Sonenstein, F., Pleck, J. (1994). "The dynamics of young men’s condom use during and across relationships", Family Planning Perspectives, 26(6): 246-251.

Lang, S., Brezger, A. (2000). BayesX, Munich: University of Munich.

Langford, I. H., Leyland, A. H., Rasbash, J., Goldstein, H. (1999). "Multilevel modeling of the geographical distributions of diseases", Journal of Royal Statistical Society, Series A (Applied Statistics), 48: 253-268.

Lawson, A.B. (2001) Statistical Methods in Spatial Epidemiology, New York: Wiley & Sons.

Leyland A.H. (2001) Spatial Analysis. In Leyland, A.H., Goldstein, H. editors. Multilevel modelling of Health Statistics, New York: Wiley: 143-157.

Lesthaeghe R., Neels K. (2000). "Maps, narratives and demographic innovation". IPD-WP 2000-8, Interface Demography, Brussels.

Mauldon, J., Luker, C. (1996). "The effects of contraceptive education on method use at first intercourse", Family Planning Perspectives, 28: 19-24.

McCullagh, P., Nelder, J.A. (1989). Generalised linear models (2nd ed.). London: Chapman and Hall.

Ongaro, F. (2001). "First sexual intercourse in Italy: A shift towards an ever more personal experience". Paper presented at the 14th IUSSP General Conference, Salvador.


Pinheiro J.C., Bates D.M. (2000). Mixed Effects Models in S and S-Plus. Berlin: Springer.

Poirier D. J. (1998) "Revising Beliefs in Non-identified Models". Econometric Theory, 14: 483-509.

Rue, H. (2001). "Fast sampling of Gaussian Markov random fields", Journal of the Royal Statistical Society Series B, 63: 325-38.

Spiegelhalter, D., Thomas, A., Best, N. (2000). WinBUGS: User manual, http://www.mrc-bsu.cam.ac.uk/bugs.

Spinelli, A., Figà Talamanca, I., Lauria, L., and the European Study Group on Infertility and Subfecundity (2000). "Patterns of contraceptive use in 5 European countries", American Journal of Public Health, 90(9): 1403-1408.

Teitler, J.O., Weiss, C.C. (2000). "Effects of neighborhood and school environments on transitions to first sexual intercourse", Sociology of Education, 73(2): 112-132.

UNFPA (United Nations Population Fund) (2000). Preventing infection-promoting reproductive health. UNFPA’s response to HIV/AIDS, New York: United Nations.

Venables, W. N., Ripley, B. D. (2000). S programming, Berlin: Springer.

Wakefield, J. C., Kelsall, J. E., Morris, S. E. (2000). Clustering, cluster detection and spatial variation in risk. In Elliott, P., Wakefield, J. C., Best, N.G., Briggs D. J. editors. Spatial Epidemiology: Methods and applications, Oxford: University Press: 128-152.

Wilder, E.I. (2000). "Contraceptive use at first intercourse among Jewish women in Israel, 1962-1988", Population Research and Policy Review, 19: 113-141.

Zannella, F., Rinaldelli, C., De Marchis, A. (1997). Strategia di campionamento e lavoro sul campo. In De Sandre P., Ongaro F., Rettaroli R., Salvini S. editors. Matrimonio e figli: tra rinvio e rinuncia. Bologna: il Mulino: 171-187.


Appendix 1. Identification issues

Identification is an issue in models like the one illustrated in Section 3.1, mostly because of two problems. The first problem concerns the CAR prior: as this prior is defined conditionally, the parameter is uniquely specified only up to an additive constant. This problem is well known in spatial statistics, and several solutions have been proposed in the literature (e.g. imposing a sum-to-zero constraint on the structured effects, as in Besag and Kooperberg, 1995). A second problem concerns the identifiability of the two sources of randomness included in the model, namely the unstructured and structured spatial components. This problem is well described by Eberly and Carlin (2000) in the case of a Poisson spatial regression for aggregate area-level data. In that context, letting $N_g$ be the number of events in area $g$, it is usually assumed that $N_g$ is Poisson-distributed with expected value $E_g \exp(\mu_g)$, $E_g$ being a known expected number of events for area $g$ and $\mu_g$ the log-rate of the event of interest. The model is specified as $\mu_g = Z_g'\beta + S_g + U_g$, where $Z$ is a set of area-level covariates, and $S$ and $U$ are random effects allowing for overdispersion due to clustering and area-level heterogeneity respectively. Given the system of priors described in Section 3.1, an identification problem arises because the number of events $N_g$ cannot possibly provide information about both $S_g$ and $U_g$, but only about their sum $W_g = S_g + U_g$. This basically means that once one reparameterizes the model in terms of $(U, W)$, the conditional distribution $\pi(u_g \mid u_{r \neq g}, w, y)$ does not depend on the data $y$. This is known as Bayesian unidentifiability (Gelfand and Sahu, 1999). Bayesian unidentifiability does not, however, preclude Bayesian learning about the parameter: that would require $\pi(u_g \mid y) = \pi(u_g)$, a stronger condition implying that the marginal (rather than the conditional) distribution does not depend on the data (for a more detailed discussion of these issues see Poirier, 1998).

From a Bayesian perspective, formal identifiability is not an issue. When proper priors are assigned to the parameters, the posterior distribution is also proper, and hence all the parameters of the model can be estimated. Nevertheless, for some unknown parameters, the posterior distribution may differ only slightly from the prior distribution, which leads to poor Bayesian learning. Iterative simulation-based estimation, such as the Gibbs algorithm, may then yield poor results. In addition, when rather vague priors are specified (as spatial statisticians usually do), Markov chain trajectories for weakly identified parameters tend to drift to extreme values, the assessment of convergence tends to be difficult, and computation tends to become unstable (Gelfand et al., 2001). Gelfand and colleagues also observe that such problems can arise even in a multilevel model of individual binary data, where more than one observation is available for each area.
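The second identification problem can be made concrete with a small Python sketch of the Poisson setting described above. All numbers and names are hypothetical and chosen only for illustration: the point is that moving mass between the structured and unstructured effects while keeping their sum fixed leaves the likelihood unchanged, so the data inform only the sum.

```python
import math


def poisson_loglik(counts, expected, linpred, s, u):
    """Poisson log-likelihood with mean E_g * exp(linpred_g + s_g + u_g)."""
    total = 0.0
    for n, e, eta, sg, ug in zip(counts, expected, linpred, s, u):
        mu = e * math.exp(eta + sg + ug)
        total += n * math.log(mu) - mu - math.lgamma(n + 1)
    return total


# Hypothetical data for four areas (illustration only).
counts = [3, 7, 2, 5]
expected = [4.0, 5.0, 3.0, 4.5]
linpred = [0.1, -0.2, 0.0, 0.3]   # stands in for Z_g' beta
s = [0.30, 0.10, -0.20, -0.20]    # structured effects
u = [-0.05, 0.15, 0.00, -0.10]    # unstructured effects

# Move mass from S to U area by area, keeping W_g = S_g + U_g unchanged.
shift = [0.4, -0.3, 0.2, 0.1]
s_alt = [sg + c for sg, c in zip(s, shift)]
u_alt = [ug - c for ug, c in zip(u, shift)]

print(poisson_loglik(counts, expected, linpred, s, u))
print(poisson_loglik(counts, expected, linpred, s_alt, u_alt))  # same up to rounding error
```

Any separation of the two components therefore comes from the priors, which is why the sensitivity of the results to the prior specification matters.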


Taking all the above issues into account, the identification problem in such models (Fahrmeir and Lang, 2001-B) is more a data-related problem (how many isolated areas there are, how many observations there are in each area) than a problem connected to the estimation algorithm, such as which sampler, which starting values and which precise prior values are chosen (Eberly and Carlin, 2000). A full analysis of identifiability in models such as the one discussed in Section 3.1 is beyond the scope of the present paper, and this Appendix is only provided to make the reader aware of the problem. Moreover, as Eberly and Carlin (2000) and Gelfand et al. (2001) observed, even though the effect of over-specification can still be assessed to some extent when fixed values are assumed for the variance parameters (that is, when the model consists of two levels of hierarchy only), assessing it clearly becomes much more difficult when hyperpriors on the variance parameters are used instead, as we do in the present paper.

The model diagnostics presented in Appendix 2 indicate that the analyses reported in this paper do not suffer from particular identification problems: they show a very stable behavior of the estimates and the absence of anomalous drifts. Moreover, a way to prevent unidentifiability is to put informative priors on the parameters one suspects to be weakly identified. Informative priors can be seen as the Bayesian equivalent of the frequentist approach to identifiability, namely imposing constraints on the parameter space. Appendix 2 reports the results for the model estimated with informative priors for the variance parameters. Even though, as previously mentioned, interpretation is somewhat more difficult when hyperpriors are involved, it can be clearly observed that the spatial pattern is basically unaffected by this choice, which reassures us about the proper identification of the model.


Appendix 2. Diagnostics of MCMC inference

In this appendix, we report diagnostics on the analysis of Section 4.2. Although no diagnostic analysis whatsoever can be considered conclusive proof of the validity of inference conducted with MCMC methods, a study based on this approach cannot do without such diagnostics. Figures A.1 and A.2 report the ergodic averages (first row), the simulated values (second row) and the autocorrelation functions (third row) for the 3,000 values used for the inference on the model’s fixed effects. The chain presents a good mixing and a sufficiently quick convergence (even though it has to be considered that the graphs refer to the sampled chain with a 1-out-of-30 thinning). The same type of diagnosis was implemented for the structured and unstructured spatial effects as well. Showing all the pertinent graphs would have taken up too much space (there are 103 effects, one for each province, to be considered in both cases). However, these diagnostics were carried out and showed very satisfactory convergence and mixing.
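For readers unfamiliar with the first of these diagnostics, the ergodic average at iteration t is simply the mean of the first t retained draws; a trace of it that settles quickly suggests convergence. A minimal Python sketch, with a hypothetical function name and a toy trace, is:

```python
def ergodic_means(draws):
    """Running (ergodic) averages: the mean of the first t draws, for t = 1..len(draws)."""
    means = []
    total = 0.0
    for t, value in enumerate(draws, start=1):
        total += value
        means.append(total / t)
    return means


# Hypothetical short trace (illustration only).
trace = [1.0, 2.0, 3.0, 4.0]
print(ergodic_means(trace))  # [1.0, 1.5, 2.0, 2.5]
```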

Figure A.1: Ergodic means, simulated values and autocorrelation functions of each parameter of variable "age"

[Plot panels labelled Age 17-20, Age 21-24 and Age >25. For each parameter: ergodic mean and simulated values against iteration (0 to 3000), and autocorrelation coefficient against lag (0 to 250).]


Figure A.2: Ergodic means, simulated values and autocorrelation functions of the intercept (first column) and of the 1956-65 and 1966-75 cohort parameters (second and third columns respectively)

Concerning spatial effects, we provide here a brief sensitivity analysis on the prior distributions. Particular attention is paid to the prior distribution of the variance parameter of the CAR model, on which more informative, proper priors are imposed. In particular, we consider inverse gamma (IG) distributions with scale parameter b = 1 and shape parameter a equal to 0.001, 0.01, 0.05 and 0.5 respectively. Note that the probability mass these distributions assign to the interval [0,100], for example, rises from 0.004 for an inverse gamma with parameters 1 and 0.001 to 0.89 for parameters 1 and 0.5. For each prior distribution, the model is estimated by running a Markov chain of 95,000 iterations, with a burn-in of 5,000 iterations and 1-out-of-30 thinning.
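The prior masses quoted above can be checked numerically. The sketch below assumes the parametrization in which IG(a, b) has density proportional to x^(-a-1) exp(-b/x), so that 1/X follows a gamma distribution with shape a and rate b and P(X <= x) equals the regularized upper incomplete gamma function Q(a, b/x). The function names are hypothetical and only the Python standard library is used. The last line also recovers the number of retained draws implied by the run length, burn-in and thinning.

```python
import math

def regularized_lower_gamma(a, z, tol=1e-16, max_terms=10_000):
    """P(a, z) = gamma(a, z) / Gamma(a), using the power series
    gamma(a, z) = z**a * exp(-z) * sum_n z**n / (a * (a+1) * ... * (a+n))."""
    term = 1.0 / a
    total = term
    for n in range(1, max_terms):
        term *= z / (a + n)
        total += term
        if term < tol * total:
            break
    # Work on the log scale to avoid overflow of Gamma(a) for tiny a.
    return math.exp(a * math.log(z) - z - math.lgamma(a)) * total

def inverse_gamma_cdf(x, shape, scale=1.0):
    """P(X <= x) for X ~ IG(shape, scale), i.e. Q(shape, scale / x)."""
    return 1.0 - regularized_lower_gamma(shape, scale / x)

# Mass assigned to [0, 100] by each prior considered (scale b = 1).
prior_mass = {a: inverse_gamma_cdf(100.0, shape=a) for a in (0.001, 0.01, 0.05, 0.5)}

# Draws kept per chain: (iterations - burn-in) / thinning interval.
retained_draws = (95_000 - 5_000) // 30
```

Under this assumed parametrization, the values for a = 0.001 and a = 0.5 come out close to the 0.004 and 0.89 reported in the text, and the mass on [0,100] increases with the shape parameter.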

The results obtained are given in Figure A.3 for the estimated spatial effects and in Figure A.4 for the credibility intervals. In Figure A.3 the maps are colored according to quintiles. Substantial stability clearly stands out, both in the spatial trend of the phenomenon and in which provinces do or do not show significant effects (positive or negative). The differences between priors appear to influence mainly the estimates in the tails of the distribution of the province effects, whereas the percentiles appear to be very stable.

Figure A.3: Structured spatial effects (posterior mean) relative to different priors: (a) IG(1,0.001), (b) IG(1,0.05), (c) IG(1,0.01), (d) IG(1,0.5)

[Figure: four maps, panels a) to d). Legend classes of the four maps, in the order extracted:
-0.85 to -0.58; -0.58 to 0; 0 to 0.24; 0.24 to 0.31; 0.31 to 0.37; 0.37 to 0.44
-1.12 to -0.54; -0.54 to 0.01; 0.01 to 0.17; 0.17 to 0.31; 0.31 to 0.41; 0.41 to 0.59
-0.75 to -0.57; -0.57 to -0.01; -0.01 to 0.22; 0.22 to 0.29; 0.29 to 0.34; 0.34 to 0.39
-0.76 to -0.56; -0.56 to 0; 0 to 0.22; 0.22 to 0.29; 0.29 to 0.33; 0.33 to 0.4]


Figure A.4: Credibility intervals of structured spatial effects relative to different priors: (a) IG(1,0.001), (b) IG(1,0.05), (c) IG(1,0.01), (d) IG(1,0.5)

[Figure: four maps, panels a) to d), each classifying provinces as negative effect, no effect or positive effect.]

Finally, Figure A.5 portrays the posterior distributions of some of the 103 structured spatial random effects under the different priors, including those used in Section 4 to estimate the model. Once again, they evidently differ very little, even though the informative prior distribution tends to yield a more symmetric and mesokurtic curve. We carried out an analogous analysis for the unstructured effects, with similar results, which, however, cannot be reported here.


Figure A.5: Structured spatial effect distributions of six provinces for different inverse gamma priors (scale parameter equal to 1 and several shape parameter values)

Finally, although no indicator-based diagnostic (e.g. Gelman and Rubin 1992; Brooks and Gelman 1998) was implemented for the model parameters beyond the graphs presented at the beginning of this appendix, the large number of chains analyzed showed substantial convergence to the values discussed in Section 4. This reassures us of the robustness of the estimates obtained.
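For readers who wish to complement the graphical checks with an indicator of the kind cited above, the following is a minimal sketch of the basic potential scale reduction factor in the spirit of Gelman and Rubin (1992). It was not part of the analysis reported here. The function name is hypothetical, the chains are simulated, and the degrees-of-freedom correction of the original proposal is omitted.

```python
import random

def potential_scale_reduction(chains):
    """Basic R-hat for m chains of equal length n (one scalar parameter)."""
    m = len(chains)
    n = len(chains[0])
    chain_means = [sum(c) / n for c in chains]
    grand_mean = sum(chain_means) / m
    # B: between-chain variance; W: average within-chain variance.
    between = n / (m - 1) * sum((mu - grand_mean) ** 2 for mu in chain_means)
    within = sum(
        sum((v - mu) ** 2 for v in c) / (n - 1)
        for c, mu in zip(chains, chain_means)
    ) / m
    pooled = (n - 1) / n * within + between / n
    return (pooled / within) ** 0.5

# Assumption: simulated draws stand in for parallel chains of one parameter.
random.seed(1)
agreeing = [[random.gauss(0.0, 1.0) for _ in range(3000)] for _ in range(4)]
# Shift each chain by a different constant to mimic chains that disagree.
disagreeing = [[v + k for v in c] for k, c in enumerate(agreeing)]

rhat_agreeing = potential_scale_reduction(agreeing)
rhat_disagreeing = potential_scale_reduction(disagreeing)
```

Values close to 1 indicate that the chains agree with one another. Values well above 1, as for the shifted chains in this toy example, indicate that they have not converged to a common distribution.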

[Figure A.5 panels: six density plots, one per province (x-axis: effect; y-axis: density), each overlaying the curves for shape parameter values 0.005, 0.001, 0.01, 0.05 and 0.5.]

