A necessarily complex model to explain the biogeography of Madagascar's amphibians and reptiles

Jason L. Brown1,2, Alison Cameron3, Anne D. Yoder1, Miguel Vences4

1 Department of Biology, Duke University, Durham, NC 27708, USA

2 Current address: Department of Biology, The City College of New York, NY, USA

3 School of Biological Sciences, Queen's University Belfast, 97 Lisburn Road, Belfast BT9 7BL, UK

4 Zoological Institute, Technical University of Braunschweig, Mendelssohnstr. 4, 38106 Braunschweig, Germany

Abstract

A fundamental limitation of biogeographic analyses is that pattern and process are inextricably

linked, and whereas we can observe pattern, we must infer process. Yet, such inferences are

often based on ad-hoc comparisons using a single spatial predictor such as climate, topography,

vegetation, or assumed barriers to dispersal without taking into account competing explanatory

factors. Here we present an alternative approach, using mixed-spatial models to measure the

predictive potential of combinations of spatially explicit hypotheses to explain observed

biodiversity patterns. In this study we compiled a comprehensive dataset of 8362 occurrence

records from 745 amphibian and reptile species from Madagascar. These data were used to

estimate species richness, corrected weighted endemism, and species turnover (based on

generalized dissimilarity modeling). We also created or incorporated, when previously available,

18 spatially explicit predictions of 12 major diversification and biogeography hypotheses, such

as: mid-domain, topographic heterogeneity, sanctuary, and climate-related factors. Our results

clearly demonstrate that mixed-models greatly improved our ability to explain the observed

amphibian and reptile biodiversity patterns. Hence, the observed biogeographic patterns were

likely influenced by a combination of diversification processes rather than by a single

predominant mechanism. Further, selected genera of Malagasy amphibians and reptiles differed

in the major factors explaining their spatial patterns of richness and endemism. These differences

suggest that key factors in diversification are lineage specific and vary among major endemic

clades. Our study therefore emphasizes the importance of comprehensive analyses across

taxonomic, temporal, and spatial scales for understanding the complex diversification history of

Madagascar's biota. A "one-size-fits-all" model does not exist.

Keywords: Conditional Autoregressive Models, Orthogonally Transformed Beta Coefficients,

Generalized Dissimilarity Modelling, Species Distribution Modelling

The spatial distribution of biodiversity is at the core of biogeography, macroecology,

evolutionary biology, and conservation biology1,2. Biodiversity mapping indices are multi-

faceted concepts with the main components being local endemism, species richness, and species

turnover, of which the two latter correspond to alpha- and beta-diversity as used in community

ecology3,4. In different combinations, these components are invoked to identify biogeographic

regions5-7, prioritize geographic areas for conservation8, assess the effects of conservation

measures9, and/or delimit centers of speciation or extinction. These indices, however, are not

independent of one another. For instance, species turnover across an area is closely related to the

numbers of endemic species within each geographical unit or community, which in turn is often

used to estimate areas of endemism (AOE). These represent the coincident restrictedness of

taxa10-12 and are often used to identify unique geographic areas for biodiversity conservation or

biogeography studies13,14. Clearly, there is an inescapable circularity to these measures, and thus

also to the consequent inferences made regarding biogeographic processes.

Inferences of speciation mechanism fall prey to similar limitations. For example, it is

generally assumed that species formation and diversification of a range of co-distributed taxa

will be triggered or inhibited by similar barriers to gene flow, topographical and geological

settings, climatic conditions and shifts, and competition. Accordingly, it is the default

expectation that similar barriers (e.g., rivers, ecotones, climatic transitions) will lead to similar

patterns of species endemism, turnover, and richness; again, with the underlying assumption that

the observation of similar patterns among diverse species reveals a general causal mechanism of

diversification across all taxa. But there are additional processes by which species richness may

be generated. For example, climatic factors, environmental stability, land area, habitat

heterogeneity, paleogeography, and energy available all could be spatially correlated with

geographical barriers. Thus, any of these mechanisms might be indirectly, but not causally,

related to diversification15. Patterns of endemism, on the other hand, are generally considered to

reflect a particular evolutionary history, with areas of endemism corresponding to centers of

diversification16 and often including some element of stochasticity. Consequently, it can be the

case that areas of high endemism often are also characterized by high species richness, though

the inverse is not necessarily true.

Species distribution models (SDMs) allow sophisticated calculations of centers of

historical habitat stability17. Yet, their spatial comparison with current patterns usually follows

narrative approaches and is similar to classical hypotheses of diversification mechanisms, with

no accounting for autocorrelation among the different explanatory variables. Working from a

single explanatory variable, or without employing statistics at all, biogeography researchers often

rely on ad-hoc comparisons with spatial distributions of single environmental factors such as

climate, topography, vegetation, or assumed barriers to dispersal. As a sign of progress, many

methodological advances are being developed to address the various problems described here.

For example, assessments of spatial biodiversity have typically used simple geographic measures

as the unit of analysis, such as the distribution range of individual species, though recent

methodological refinements include the inclusion of phylogenetic relationships among species

and their evolutionary age2,7. Moreover, carefully parameterized SDMs can generate accurate

estimates of distribution ranges18 and novel approaches are being developed to translate patterns

of species richness, endemism and turnover more objectively for determining those

biogeographic regions in greatest need for conservation and protection2,7,8,19-21. Despite this

progress in conceptual and statistical tools, biological explanation of these patterns is still in its

methodological infancy.

Here we apply recent, improved statistical

methods to identify the causal mechanisms that have determined the spatial distribution of

Madagascar's herpetofauna. Though the search for the drivers of biological diversification was

initially focused on the Neotropics, considerable attention has more recently been focused on

other areas such as the Australian wet tropics22 and Madagascar23. Madagascar is the world's

fourth largest island and hosts an extraordinary number of endemic flora and fauna. For example,

100% of the native species of amphibians and terrestrial mammals, 92% of reptiles, 44% of

birds, and >90% of flowering plants occur nowhere else24. This megadiverse micro-continent,

initially part of Gondwana, has been isolated from other continents since the Mesozoic. Its

current vertebrate fauna is a mix of only a few ancient Gondwanan clades and numerous

endemic radiations, originating from Cenozoic overseas colonizers arriving mainly from Africa25-

The extraordinary levels of endemism at the level of entire clades in Madagascar, and

their long isolation from their sister lineages, provide a unique opportunity to study the

mechanisms driving divergence and diversification in situ28. Over the past decade, numerous

general mechanisms and models have been formulated to explain biodiversity distribution

patterns and species diversification in Madagascar, pertaining to environmental stability (or

instability), solar energy input, geographic vicariance triggered by topographic or habitat

complexity, intrinsic traits of organisms, or stochastic effects23,29-36. Evidence has supported

numerous hypotheses, though the evidence has typically been marshalled from limited or

phylogenetically-constrained taxa. Comprehensive statistical approaches comparing their relative

importance are rare37.

In this paper we apply an integrative approach to simultaneously test which of several

competing and complementary hypotheses are most strongly correlated with empirical

biodiversity patterns (Fig. 1). We first translate a total of 12 diversification mechanisms or

diversity models into explicit spatial representations. We then use diverse statistical approaches

to assess spatial concordance of these predictor variables with species richness, endemism and

turnover as calculated from original occurrence data of Madagascar's amphibians and reptiles,

with full species-level coverage. Our results best agree with the hypothesis that various

assemblages of species are under the influence of differing causal mechanisms. The clear

message is that the distribution of diverse organismal lineages will depend upon idiosyncratic

factors determined by their specific organismal life-history traits combined with stochastic

historical factors. Thus, any model that endeavors to explain island-wide patterns must

necessarily be complex.

RESULTS

To understand spatial distribution patterns in Madagascar's herpetofauna, we first

compared range sizes, and computed species richness and endemism from the modeled

distribution areas of amphibians and non-avian reptiles (hereafter reptiles). Mean range size (±

standard deviation) in our data set is smaller in amphibians than reptiles taking into account all

species (41,673 ± 55,413 km2 vs. 50,205 ± 84,078 km2; t= 3.981, p< 0.001; df=649.7) and after

excluding species known from only 1 or 2 localities (64,106 ± 57,532 km2 vs. 95,294 ± 87,495

km2; t= 4.511, p<0.001; df=427.4). Microendemics (species with distributions less than

1000km2) constitute 36.5% of all amphibian and 33.6 % of all reptile species in Madagascar

(difference not significant; Z =0.411, p=0.682).

Spatial patterns of species richness are quite similar between the two groups (Fig 2A &

E) and reach highest values in the eastern rainforest; in amphibians, richness peaks in the central

east, whereas in reptiles, it is more evenly distributed across the rainforest biome, with some

areas of high diversity also in the north, west, and southwest. Spatial patterns of endemism in

both groups (Fig 2B & F) reveal two centers in the north around the Tsaratanana Massif and in

the central east. Endemism values for reptiles are also high in southwestern Madagascar, the

most arid region of the island.

We applied Generalized Dissimilarity Modelling (GDM)38,39 to identify areas of

endemism on the basis of turnover patterns for non-avian reptiles and amphibians together. The

major AOE obtained in a 4-class categorization of the originally continuous GDM results (Figure

2C & H) largely mirrors the bioclimatic regions of Cornet (1974).

Our test includes a total of 12 predictor hypotheses, some of which focus on the

geographical pattern in which species diversity is distributed, but without making any clear

assumption about how the species originated (e.g., the mid-domain or topography heterogeneity

hypotheses). Others explicitly refer to mechanisms of diversification and make predictions about

how these processes affected the distribution of species diversity over geographical space (see

Supplementary Documents for detailed accounts). We divided all the hypotheses into two

categories: one in which predictions for continuous two-dimensional spatial richness and

endemism can be derived, and another in which nominal AOE predictions can be derived. The

first category includes (1) the Mid-domain Effect, (2) Topographic Heterogeneity, (3) Climatic

Refugia, (4) Museum (montane refugia), (5) Disturbance-Vicariance, (6) Climate Stability, (7)

Sanctuary and (8) Montane Species Pump. The second category includes (9) the River-Refuge

(large river model), (10) Riverine Barrier (minor and major rivers), (11) Climatic Gradient and

(12) Watershed. All these hypotheses were transformed into explicit spatial representations

(Supplementary Materials) and used as predictor variables for further analyses.

We calculated unbiased correlation of the continuous predictor and test variables

following the method of Dutilleul40, which reduces the degrees of freedom according to the level

of spatial autocorrelation between two variables (detailed results in Supplementary Materials

Table S3). We found that measures of both reptile and amphibian endemism significantly

correlated to the predictor hypotheses of Topographic Heterogeneity, Disturbance-Vicariance

and Museum (montane refugia). Reptile endemism (but not amphibian) is also correlated to

Sanctuary. Correlations to species richness were not tied to measures of endemism. Whereas

reptile species richness is correlated to the Mid-domain Effect (distance) and Sanctuary

hypotheses, amphibian richness is correlated to the Sanctuary hypothesis as well as to the

Topographic Heterogeneity, Montane Species Pump, Disturbance-Vicariance, Museum, and

River-Refuge hypotheses.

In the univariate correlation analyses of nominal geospatial data (those related to AOE

predictions) we compared the biogeographic zonation of Madagascar as suggested by the GDM

analysis of amphibian and reptile distributions with zonations derived from five predictor

hypotheses. We found all predictor variables (corresponding to the hypotheses Riverine-major

and Riverine-minor, Gradient, River-refuge, and Watershed) to be significantly correlated to the

15-class GDM, and all but watershed with the 4-class GDM zonation (Table 1). Both GDM

classifications share the most overlap with the Riverine and Gradient hypotheses (between 40.9–

54.3% and 56.2–71.1%, respectively; Table 1).

Given the significant correlation of each of the spatial amphibian and reptile biodiversity

patterns with various predictor variables we used mixed conditional autoregressive spatial

models (CAR models) to test the influences of various predictors simultaneously. To avoid over-

parameterization we used AICc (corrected Akaike Information Criterion), an information-

theoretical approach, to compare models with different sets of predictors. We found that complex

models including most of the biogeography hypotheses (continuous predictor variables)

performed best, based on the lowest AICc values, and consequently used these for further

analysis. Detailed contributions of each predictor to the models of richness, endemism and GDM

zonation are summarized in Supplementary Materials Table S4. The top-five variables

contributed 49.4–75.9% to the models. For a simpler graphical representation (Figs 3 & 4),

we grouped the predictors into three sets: the three Mid-domain Effect hypotheses (latitude,

longitude, and distance), the three principal components representing the Climate Gradient

hypothesis, and the three hypotheses focused on topography (Topographic Heterogeneity,

Disturbance-vicariance, Montane Species Pump). We found relevant influences of the

Mid-domain Effect especially on the GDM and the species richness and endemism of reptiles

(30.9%, 32.9% and 45.5%, respectively). However, almost all the

Mid-domain correlation coefficients were negative, indicating that Mid-domain Effects do

not play a key role in determining spatial patterning. Climate Gradient effects influenced all the

models of biodiversity equally, contributing roughly a quarter to each (25.1–27.7%), though in

many cases the sign of the contribution varied. In this case, however, a positive correlation was

not expected. The topography variables contributed positively to the richness and endemism

models of amphibians and reptiles, with joint influences of 9.1% and 22.4% on richness, and

6.5% and 17.3% on endemism. The two unique hypotheses, Sanctuary and Museum, each

contributed positively to all models, with Museum contributing between 7.1–17.1% (one of the

few hypotheses to contribute >5% and to be positively correlated to all biodiversity

measurements). The Sanctuary hypothesis also contributed positively to all models, though

to a lesser degree than the Museum hypothesis (which demonstrated little contribution to reptile

endemism).
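The AICc comparison described in this paragraph can be illustrated with a minimal sketch. It assumes each candidate model yields a log-likelihood and a parameter count; the model names, log-likelihoods, and number of grid cells below are hypothetical and are not values from this study.

```python
def aicc(log_likelihood: float, k: int, n: int) -> float:
    """Corrected Akaike Information Criterion for k parameters and n observations."""
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined unless n > k + 1")
    aic = 2 * k - 2 * log_likelihood
    # Small-sample correction term; vanishes as n grows relative to k
    return aic + (2 * k * (k + 1)) / (n - k - 1)


# Hypothetical candidate models: name -> (log-likelihood, number of parameters)
candidates = {
    "topography_only": (-520.0, 3),
    "topography_plus_climate": (-495.0, 6),
    "all_predictors": (-470.0, 12),
}
n_cells = 200  # hypothetical number of grid cells used as observations

scores = {name: aicc(ll, k, n_cells) for name, (ll, k) in candidates.items()}
best = min(scores, key=scores.get)

# Report models from best to worst with their difference from the best model
for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: AICc = {score:.1f}, delta = {score - scores[best]:.1f}")
```

The model with the lowest AICc is preferred; the correction term penalizes extra predictors more heavily when the number of observations is small.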

To assess variation in biogeography patterns among major groups of the Malagasy

herpetofauna, we calculated mixed CAR models using the same methods for richness and

endemism of four exemplar sub-clades: the leaf chameleons (Brookesia), tree frogs (Boophis),

day geckos (Phelsuma) and Oplurus iguanas (including the monotypic iguana genus

Chalarodon). The top contributors to the models were drastically different for several of these

clades (Fig. 4). For instance, the topography variables had strong influences on Boophis richness,

with a joint contribution of 24.5%, but contributed much less to explaining the patterns of most

other groups. Further, the Sanctuary hypothesis had a strong influence on the Brookesia and

iguana models, though it contributed very little to the predictions of endemism in Boophis and

Phelsuma. Mid-domain Effects were apparent in most models but the sign on the correlation and

the contribution of each Mid-domain hypothesis varied considerably.

DISCUSSION

The results of this study clearly demonstrate that single-mechanism explanatory

hypotheses of spatial patterning in Madagascar's herpetofauna (and presumably, other Malagasy

vertebrates) are inadequate. Thus, we propose a novel method for examining and synthesizing

spatial parameters such as species richness, endemism, and community similarity. In this

framework, biogeographic hypotheses are explanatory variables. The resulting mixed-model

geospatial approach to biogeography analyses is both more robust and more realistic. Our

approach has the potential to reduce bias and subjectivity in the search for prevalent factors

influencing the distribution of biodiversity, both in Madagascar and elsewhere. Currently,

researchers typically approach such questions by univariate and sometimes narrative analyses

that examine the fit of the observed patterns to only single explanatory models or mechanisms

(e.g. in Madagascar33,35,41) or compare a limited number of competing variables in univariate

approaches37. Such analyses are hampered, however, by spatial autocorrelation of biodiversity

patterns and predictor variables thereby inflating type-I errors in traditional statistical tests42,43.

Several solutions have been proposed for this problem. Some authors attempt to exclude spatial

autocorrelation from models44, whereas others incorporate spatial autocorrelation as a predictive

parameter in geospatial models45-47, as was applied in this study.

The results obtained here for some sub-clades are in agreement with previous analyses,

while others are not. For example, the high influence of the mid-domain effect on Boophis

treefrogs, one of the most species-rich frog genera in Madagascar, agrees with a previous

analysis performed by Colwell & Lees48 for all Malagasy amphibians (with a high representation

of Boophis). On the contrary, the negative contributions of the mid-domain effects on the

biodiversity patterns of the other genera in the analysis are obvious given that their centers of

richness and endemism are in either southern or northern Madagascar, but not in central parts of

the island. Previous studies postulated a high influence of topography on the diversification of

leaf chameleons (Brookesia),41,49 though this is not supported by our analysis. This latter example

exemplifies a dilemma of scale, inherent in all comparisons of spatial data sets. In fact, the

distribution of Brookesia is highly specific to certain mountain massifs in northern Madagascar

while the genus is largely absent from the equally topographically heterogeneous south-east.

This absence is probably due to its evolutionary history, with a diversification mainly in the

north and limited capacity for range expansion41. This historical distribution pattern probably

accounts for low influence of the topographic hypotheses on Madagascar-wide Brookesia

richness and endemism, while at a smaller spatial scale (northern Madagascar) these hypotheses

might well have a strong predictive value.

While patterns of richness and endemism of the Malagasy herpetofauna have been

analyzed several times for various purposes based on partial data sets8,35,37,41,48 the analysis of

turnover of species composition and the definition of biogeographic regions following from such

explicit analyses are still in their infancy. For reptiles, Angel's50 proposal of biogeographic

regions based on classical phytogeography, i.e., regions based on plant community

composition51, has usually been adopted52. Later, Schatz53 refined this zonation of Madagascar

based on explicit bioclimatic analyses, and Glaw & Vences54 proposed a detailed geographical

zonation based on the areas of endemism of Wilmé33. The GDM approach herein is the first

explicit analysis of a large herpetofaunal dataset to geographically delimit regions distinguished

by abrupt changes in their amphibian and reptile communities. This model turned out to agree

remarkably well with classical bioclimatic and phytogeographic zonations of Madagascar51,53,

and was strongly correlated to climatic explanatory variables (Fig. 3). Especially in the 4-class GDM,

the regions almost perfectly correspond with those proposed by Schatz53 based on bioclimate,

i.e., eastern humid, central highland/montane, western arid, southwestern subarid zone. Although

the coincidence of the precise boundaries of these regions might be methodologically somewhat

biased, as we interpolated community distribution using climate variables in the analysis, the

model is still mainly based on real distributional information of species and thus provides

convincing evidence that amphibian and reptile communities strongly differ among the major

bioclimatic zones of Madagascar.

Several authors have suggested that the current distribution of biotic diversity in the

tropics resulted from a complex interplay of a variety of diversification mechanisms55,56. This

implies that no single hypothesis adequately explains the diversification of broad taxonomic

groups — our results support this assumption. Richness, endemism and turnover of large and

heterogeneous groups exemplified by the all-species amphibian and reptile data sets were in all

cases best explained by complex CAR models. These models have the advantage of

incorporating most or all of the originally included explanatory variables.

Several alternative explanations may account for this outcome. Patterns of biodiversity

may not be strongly correlated to any of the predictor mechanisms simply because none of them

provide the causal mechanism underlying the diversification processes. As another consideration,

spatial predictions of some of the biodiversity hypotheses may have been inaccurate, though we

took great care to avoid such mistakes. In any event, improvements in these methods may result

in different outcomes in future analyses.

Caveats aside, the results of this study almost certainly support a third explanation, that

different clades of organisms are each predominantly influenced by a different set of

diversification mechanisms. In turn, these are driven by intrinsic factors, such as morphological

or physiological constraints, or by extrinsic factors, such as an initial diversification in an area

characterized by a certain topography, climate, or biotic composition. This alternative is

supported by the observation that the patterns of several of the smaller subgroups in our analysis

were indeed best explained by opposing predominant variables, e.g., topographic heterogeneity

and museum (Boophis endemism) vs. climate stability and sanctuary (Brookesia endemism). An

overarching message is that the taxonomic scale of analysis is of extreme importance when

attempting to derive global explanations of biodiversity distribution patterns. Including too many

taxa will blur the existing differences among clades and lead to complex explanatory models,

whereas patterns within specific clades may be best explained by simple models.

The method proposed herein allows for a more objective quantification of the influences

of particular diversification mechanisms on biodiversity patterns, compared to traditional,

univariate approaches. Further developments of the method should especially focus on including

a phylogenetic dimension, and when appropriate (for predictor hypotheses), a temporal

component. Geospatial analyses of biodiversity patterns typically use species as equivalent and

independent data points, though in reality, they are entities with substantial variation in

parameters such as evolutionary age, dispersal capacity and population density, and with

different degrees of relatedness depending on their position in the tree of life. This multilayered

information can be included in various ways in the CAR/OTBC approach, e.g. by plotting

richness and endemism of evolutionary history rather than taxonomic identity, calculating

turnover only for sister species with adjacent ranges, or repeating the calculations for sets of

species defined by particular nodes on a phylogenetic tree. This latter approach— iterating the

analysis for successively more inclusive clades — appears particularly promising for identifying

those moments in evolutionary history wherein shifts in prevalent diversification mechanisms

have occurred. Given this perspective, we can begin to tease apart the diversification histories of

individual clades versus prevailing biogeoclimatic events that shape entire biotas.

MATERIALS AND METHODS

Biodiversity Estimates

Species Distribution Modeling

Species data consisted of 8362 occurrence records of 745 Malagasy amphibian and

reptile species (325 and 420 species, respectively). Species distribution models (SDMs) were

limited to species that had, at minimum, 3 unique occurrence points at the 30 arc-second spatial

resolution (ca. 1 km). The reduced dataset represented 453 species (consisting of 5440 training

points of 248 reptile and 205 amphibian species), with a mean of 12 training points per species

(max= 131). For 107 amphibian and 119 reptile species with only 1-2 occurrence records a 10km

buffer was applied to point localities in place of SDM. The species distribution models were

generated in MaxEnt v3.3.3e57 using the following parameters: random test percentage = 25,

regularization multiplier = 1, maximum number of background points = 10000, replicates = 10,

replicated run type = cross validate.

One limitation of presence-only data SDM methods is the effect of sample selection bias,

where some areas in the landscape are sampled more intensively than others58. MaxEnt requires

an unbiased sample. To account for sampling biases, we used a bias file representing a Gaussian

kernel-density of all species occurrence localities. The bias file upweighted presence-only data

points with fewer neighbors in the geographic landscape59. Species distributions were modeled

for the current climate using the 19 standard bioclimatic variables derived from interpolation of

climatic records between 1950 and 2000 from weather stations around the globe (Worldclim

1.460). The non-climatic variables geology, aspect, elevation, solar radiation, and slope were also

included61,62. All layers were projected to the Africa Albers Equal-Area Cylindrical projection in

ArcMap at a resolution of 0.91 km2.

Correcting SDMs for Over-prediction

To limit over-prediction of SDMs, a problem common with modeling distributions of

Madagascar biota8,37, we clipped each SDM following the approach of Kremen et al.8. This

method produces models that represent suitable habitat within an area of known occurrence

(based on a buffered MCP), excluding suitable habitat greatly outside of observed range. The

size of the buffer was based on the area of the MCP. We used buffer distances of 20km, 40km,

and 80km, respectively, for three MCP area classes, 0-200km2, 200-1000 km2, and >1000 km2.

All corrected SDMs were proofed by taxonomic experts to ensure reliability; if a model did not

tightly match knowledge of areas where distributions were well documented, or if little prior

information existed regarding a species' distribution, or taxonomy was convoluted and, because

of this, its expected distribution could not be evaluated, the species was excluded from analyses (n=

Range Sizes, Species Richness and Corrected Weighted Endemism

For descriptive range-size statistics, distribution range-sizes were sampled for all species

at 0.01 degrees2 from corrected SDMs (or buffered point data where applicable) and a Student's

t-test with unequal variance was performed between amphibian and reptile species. To assess

differences in the frequency of microendemics among the two groups, we converted all

distributions greater than or less than 1000 km2 to a value of 0 or 1, respectively. We then

calculated the mean frequency for each group and ran a binomial test between the two groups.
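The two comparisons described above can be sketched as follows. This is a minimal illustration and not the authors' code: it assumes the unequal-variance test is Welch's t-test and that the comparison of microendemic frequencies can be approximated by a pooled two-proportion z-test. The range sizes are hypothetical.

```python
import math
from statistics import NormalDist, mean, variance


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of freedom."""
    va = variance(sample_a) / len(sample_a)
    vb = variance(sample_b) / len(sample_b)
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(sample_a) - 1) + vb ** 2 / (len(sample_b) - 1))
    return t, df


def two_proportion_z(hits_a, n_a, hits_b, n_b):
    """Two-sided z-test for a difference between two proportions (pooled standard error)."""
    pooled = (hits_a + hits_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (hits_a / n_a - hits_b / n_b) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical range sizes in km2 (not the study's data)
amphibian_ranges = [120.0, 800.0, 15000.0, 42000.0, 90000.0]
reptile_ranges = [500.0, 3000.0, 60000.0, 110000.0, 250000.0, 400.0]

t_stat, df = welch_t(amphibian_ranges, reptile_ranges)
print(f"Welch t = {t_stat:.3f}, df = {df:.1f}")

# Flag microendemics: 1 if the range is below 1000 km2, else 0
amph_flags = [1 if r < 1000 else 0 for r in amphibian_ranges]
rept_flags = [1 if r < 1000 else 0 for r in reptile_ranges]
z, p = two_proportion_z(sum(amph_flags), len(amph_flags), sum(rept_flags), len(rept_flags))
print(f"Z = {z:.3f}, p = {p:.3f}")
```

The non-integer degrees of freedom reported in the Results (e.g., df=649.7) are consistent with a Welch-type correction, which is why that variant is sketched here.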

Species richness was calculated separately for amphibians and reptiles by summing the

respective corrected binary SDMs (based on a maximum training sensitivity plus specificity

threshold) and, for species with 1-2 occurrence records, buffered points in ArcGIS. This

provided a high resolution estimate of richness that is less affected by spatial scale and

incomplete sampling than traditional measurements based solely on occurrence records.
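As a minimal sketch of this summation step, the following treats each species' map as a small grid of values. The grids, species, and the 0.5 cutoff are hypothetical; in the study the threshold is derived per species from the MaxEnt output.

```python
def threshold(suitability, cutoff):
    """Convert a continuous suitability grid to a binary presence/absence grid."""
    return [[1 if value >= cutoff else 0 for value in row] for row in suitability]


def species_richness(binary_maps):
    """Sum per-species binary grids cell by cell to obtain a richness grid."""
    maps = list(binary_maps)
    n_rows, n_cols = len(maps[0]), len(maps[0][0])
    richness = [[0] * n_cols for _ in range(n_rows)]
    for grid in maps:
        for r in range(n_rows):
            for c in range(n_cols):
                richness[r][c] += grid[r][c]
    return richness


# Hypothetical 2 x 2 landscape
sdm_species_a = threshold([[0.9, 0.2], [0.6, 0.1]], cutoff=0.5)  # thresholded SDM
sdm_species_b = [[1, 1], [0, 0]]                                 # already binary SDM
buffered_species_c = [[0, 0], [0, 1]]                            # buffered point locality

richness = species_richness([sdm_species_a, sdm_species_b, buffered_species_c])
print(richness)
```

Species modeled with SDMs and species represented only by buffered points enter the sum in the same way, as binary grids.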

Measures of endemism are inherently dependent on spatial scale. We chose a grid scale

of 82 x 63 km, separating Madagascar into 24 latitudinal and 8 longitudinal rows, to reduce

problems associated with estimating endemism over too small or large areas12,35. Endemism was

measured as corrected weighted endemism (CWE), where the proportion of endemics is

inversely weighted by their range size (species with smaller ranges are weighted more than those

with large ranges63) and this value is divided by the local species richness12. CWE emphasizes areas that

have a high proportion of animals with restricted ranges, but not necessarily high species

richness. We calculated CWE separately for reptiles and amphibians using SDMtoolbox v1

(Brown in review).
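The CWE calculation can be sketched in a few lines. This is an illustration under stated assumptions rather than the SDMtoolbox implementation: range size is taken as the number of grid cells a species occupies, and the species and cell identifiers are hypothetical.

```python
def corrected_weighted_endemism(presence):
    """Compute CWE per grid cell.

    presence maps each species to the set of grid-cell ids it occupies.
    Each species contributes 1 / (number of occupied cells) to every cell in
    its range (weighted endemism); dividing by local richness gives CWE.
    """
    weighted = {}
    richness = {}
    for species, cells in presence.items():
        if not cells:
            continue  # a species with no occupied cells contributes nothing
        weight = 1.0 / len(cells)
        for cell in cells:
            weighted[cell] = weighted.get(cell, 0.0) + weight
            richness[cell] = richness.get(cell, 0) + 1
    return {cell: weighted[cell] / richness[cell] for cell in weighted}


# Hypothetical species ranges over four grid cells
presence = {
    "species_a": {"c1"},                    # microendemic, one cell
    "species_b": {"c1", "c2"},
    "species_c": {"c1", "c2", "c3", "c4"},  # widespread
}

cwe = corrected_weighted_endemism(presence)
for cell in sorted(cwe):
    print(cell, round(cwe[cell], 3))
```

In this toy example cell c1 scores highest because it holds the single-cell species, while c3 and c4 score lowest even though CWE is already normalized by richness.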

Generalized Dissimilarity Modeling

Generalized Dissimilarity Modeling (GDM) is a statistical technique extended from

matrix regressions designed to accommodate nonlinear data commonly encountered in ecological

studies38. One use of GDM is to analyze and predict spatial patterns of turnover in community

composition across large areas. In short, a GDM is fitted to available biological data (the absence

or presence of species at each site and environmental and geographic data), then compositional

dissimilarity is predicted at unsampled localities throughout the landscape based on

environmental and geographic data in the model. The result is a matrix of predicted

compositional dissimilarities (PCD) between pairs of locations throughout the focal landscape.

To visualize the PCD, multidimensional scaling was applied, reducing the data to three

ordination axes, and in a GIS each axis was assigned a separate RGB color (red, green or blue).

Due to computation limitations associated with pairwise comparisons of large datasets,

we could not predict compositional dissimilarities among all sites in our high resolution

Madagascar dataset. To address this, we randomly sampled 2500 points throughout Madagascar

from a ca. 10 km2 grid. We then measured the absence or presence of each of the 679 species at

each locality. We used the same high resolution environmental and geographic data used in the

SDM. These 23 layers were reduced to nine vectors in a principal component analysis, which

represented 99.4% of the variation of the original data. These data were sampled at the same

2500 localities. Both datasets (species presence and environmental data) were input into a

generalized dissimilarity model using the R package: GDM R distribution pack v1.1

(www.biomaps.net.au/gdm/GDM_R_Distribution_Pack_V1.1.zip). We then extrapolated the

GDM into the high resolution climate dataset by assigning ordination scores using k-nearest

neighbor classification (k=3, numeric Manhattan distance), calculating each ordination axis

independently38.
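
The extrapolation step can be illustrated with a small k-nearest neighbor sketch (k=3, Manhattan distance). It assumes that a numeric score is assigned as the mean of the three nearest sampled sites in environmental space; that averaging rule, the function names, and the toy values are assumptions, not details taken from the GDM software.

```python
def manhattan(a, b):
    """Manhattan (city-block) distance between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_predict(train_env, train_score, query_env, k=3):
    """Predict one ordination score as the mean of the k nearest sites."""
    ranked = sorted(
        range(len(train_env)),
        key=lambda i: manhattan(train_env[i], query_env),
    )
    nearest = ranked[:k]
    return sum(train_score[i] for i in nearest) / len(nearest)


# Hypothetical sampled sites: environmental PC scores and one ordination axis.
train_env = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
axis1 = [0.1, 0.2, 0.3, 0.9]

# Score for an unsampled grid cell; each axis would be handled separately.
pred = knn_predict(train_env, axis1, (0.2, 0.2))
```

Running the same function once per ordination axis mirrors the statement that each axis was calculated independently.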

The continuous GDM was transformed into a model with four major classes, and each of

these was then classified separately into 3-5 minor classes. The numbers of major and minor

classes were based on hierarchical cluster analyses in SPSS v19 (ref. 64) using a “bottom up”

approach. The number of classes equaled the number of dendrogram nodes with relative

distances (scaled from 0-1) at 0.71 and 0.63 for major and minor groups, respectively. The

distance cutoff can be somewhat arbitrary; however, in our data there were obvious

discontinuities (long dendrogram branches between nodes) at these two values. The resulting

classified models were interpolated into high resolution climate space using a k-nearest neighbor

classification as described above.

Biogeography hypotheses

We examined which specific spatial predictions of three biodiversity patterns in Madagascar

(species richness, endemism, and areas of endemism [AOE], the coincident restrictedness of

taxa) could be derived from each of 12 biogeography hypotheses, and then

translated these predictions into spatial models in a GIS.

In a GIS, spatially explicit predictions of the three biodiversity patterns (species richness,

endemism or areas of endemism) were estimated for each biogeography hypothesis. For some of

the hypotheses not all three metrics of biodiversity were calculated due to lacking, or incomplete,

expectations (e.g. not all hypotheses make predictions about AOE). Because of these incomplete

biodiversity pattern predictions, comparisons among hypotheses are statistically non-trivial. This

is in part because few diversification hypotheses capture all facets of biodiversity (species

richness, endemism, AOE). Further, many estimates of biodiversity patterns rely on components

of climate or geography, thus some are based on the same data and are not entirely independent

of each other. Each hypothesis was generated at the spatial resolution of 30 arc-seconds

(matching the resolution of GDM and species richness estimates, later transformed to 0.91 km2).

For the endemism analyses, each biogeography hypothesis was upscaled to match resolution of

the endemism analyses by averaging all values encompassed in each coarse cell.
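
The upscaling step can be sketched as a block average. The example assumes square aggregation blocks and uses None for no-data cells; both are simplifications for illustration (the endemism grid in this study was rectangular), and the function name is hypothetical.

```python
def upscale_mean(grid, factor):
    """Aggregate a fine grid by averaging factor x factor blocks.

    Cells holding None (no data, e.g. ocean) are ignored; a block with
    no valid cells yields None.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    coarse = []
    for r0 in range(0, n_rows, factor):
        row = []
        for c0 in range(0, n_cols, factor):
            values = [
                grid[r][c]
                for r in range(r0, min(r0 + factor, n_rows))
                for c in range(c0, min(c0 + factor, n_cols))
                if grid[r][c] is not None
            ]
            row.append(sum(values) / len(values) if values else None)
        coarse.append(row)
    return coarse


# Hypothetical 2 x 4 fine grid aggregated into 1 x 2 coarse cells.
fine = [
    [1.0, 3.0, 0.0, 2.0],
    [5.0, 7.0, None, 4.0],
]
coarse = upscale_mean(fine, 2)
```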

Spatial Statistics

The spatial predictions derived from the various biodiversity hypotheses resulted in either

continuous or nominal categorical data. Conducting statistical tests between data types is

nontrivial and, in some cases, not logical or impossible. Spatial data are represented in GIS by

two different formats: raster and vector. Geospatial raster data are composed of equal sized

squares, tessellated in a grid, with each cell representing a value (often continuous data), such as

elevation. Spatial vector data (commonly called ‘shapefiles’) can be represented by points, lines,

or polygons, such as: localities, roads and countries, respectively. Vector data are non-

topological and represent discrete features. They are often used to depict nominal data, where the

relationship of data categories to others is unknown or non-linear.

Raster data can be converted to vector data (and vice versa) and the data type (i.e.

nominal or continuous) may or may not change when converted. For example, in some cases

continuous data can be converted to ordered categories (ordinal data) when converted from raster

to polygon. However, if the same data were converted back to a raster file, they would remain

categorical due to data loss in the first conversion. Regardless of GIS data format, statistical

tests need to be chosen according to the data types; however, GIS data format remains equally

important, as often a single data format is required to perform a spatial statistic of interest (i.e.

software input limitations).

Analyses- Continuous Data

To assess a global measurement of correlation between continuous data, we calculated

Pearson correlations following the unbiased correlation method of Dutilleul40 and using the

software Spatial Analysis in Macroecology65.

Analyses- Nominal Categorical Data

Comparisons of nominal categorical spatial data (i.e. AOE predictions compared to

classified GDM) focused on the spatial distributions of the borders between the subunits. We

used the following methods to measure similarities and significance: (1) border overlap, and (2)

Pearson correlation coefficients (r) with Dutilleul’s spatial correlation (see above).

(1) Border overlap was calculated by sampling the landscape at 1 km resolution for the

presence of a border. If present, a point was placed. We then measured the spatial overlap of the

sampled borders of two landscapes. In all analyses, a 10 km buffer was applied to the overlap

calculation, and points from the two datasets lying within 10 km of each other were considered overlapping

boundaries. To account for differences in the level of subdivision of layers, overlap was

converted to a percentage and averaged for both layers being compared. Country outline was

excluded from all comparisons and thus, only intra-country boundaries were compared.

(2) To assess global correlation between two polygon shapefiles, each shapefile was

converted to a distance raster, measuring the closest distance from any point in the landscape to a

boundary. Using these layers we measured a Pearson correlation (unbiased correlation after

Dutilleul40), where high correlation coefficients represent two landscapes that have congruent

areas that are isolated from boundaries and other congruent areas that are adjacent to

boundaries. Each distance landscape was evenly sampled by 2000 points and correlations were

assessed on the values of these points.
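
Both boundary comparisons can be sketched in a few lines of plain Python. The example assumes border and sample points in projected coordinates measured in km, uses invented coordinates, and computes an ordinary Pearson coefficient; Dutilleul's correction of the degrees of freedom, used for significance testing in the study, is not implemented here. All function names are hypothetical.

```python
import math


def percent_within(points_a, points_b, buffer_km):
    """Percentage of points in A lying within buffer_km of any point in B."""
    if not points_a:
        return 0.0
    hits = sum(
        1
        for ax, ay in points_a
        if any(math.hypot(ax - bx, ay - by) <= buffer_km for bx, by in points_b)
    )
    return 100.0 * hits / len(points_a)


def border_overlap(points_a, points_b, buffer_km=10.0):
    """Method (1): overlap in both directions and their mean, in percent."""
    a_in_b = percent_within(points_a, points_b, buffer_km)
    b_in_a = percent_within(points_b, points_a, buffer_km)
    return a_in_b, b_in_a, (a_in_b + b_in_a) / 2.0


def distance_to_boundary(point, boundary_points):
    """Method (2): distance from a sample point to the nearest border point."""
    px, py = point
    return min(math.hypot(px - bx, py - by) for bx, by in boundary_points)


def pearson_r(x, y):
    """Plain Pearson product-moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Method (1): hypothetical border points of two classified landscapes.
gdm_border = [(0, 0), (0, 5), (0, 10), (0, 40)]
hyp_border = [(6, 0), (6, 5)]
a_pct, b_pct, mean_pct = border_overlap(gdm_border, hyp_border)

# Method (2): correlate distance-to-boundary values at shared sample points.
samples = [(float(x), 0.0) for x in range(11)]
dist_a = [distance_to_boundary(p, [(0.0, 0.0)]) for p in samples]
dist_b = [distance_to_boundary(p, [(10.0, 0.0)]) for p in samples]
r_opposed = pearson_r(dist_a, dist_b)
```

Averaging the two directional percentages keeps a finely subdivided layer from dominating the comparison, which is the purpose of the averaging described in method (1).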

Analyses- Mixed Models of Continuous Data

To determine the influence of each biogeography hypothesis in predicting the observed

biodiversity patterns, we integrated all continuous biogeography hypotheses into a single mixed

conditional autoregression model (CAR) using the software Spatial Analysis in Macroecology65.

To normalize the predictor variables, Box-Cox transformations (Box and Cox 1964) were

performed. The lambda parameter was estimated by maximizing the log-likelihood profile in R

package GeoR47. A Gabriel connection matrix was used to describe the spatial relationship

among sample points66. Using Gabriel networks, with short connections between neighboring points,

is preferable to (i.e. more conservative than67) using inverse-decaying distances because in most

empirical datasets the residual spatial autocorrelation tends to be stronger at smaller distance

classes68.
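
As an illustration of the Box-Cox step, the sketch below estimates lambda by a grid search over the profile log-likelihood. It is written in plain Python and is not the geoR routine used in the study; the function names, the search grid, and the toy data are assumptions. Input values must be strictly positive and not all identical.

```python
import math


def box_cox(values, lam):
    """Box-Cox transform; lam = 0 corresponds to the natural logarithm."""
    if abs(lam) < 1e-12:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


def profile_log_likelihood(values, lam):
    """Profile log-likelihood of lam under a normal model (constants dropped)."""
    n = len(values)
    transformed = box_cox(values, lam)
    mean = sum(transformed) / n
    variance = sum((t - mean) ** 2 for t in transformed) / n
    jacobian = (lam - 1.0) * sum(math.log(v) for v in values)
    return -0.5 * n * math.log(variance) + jacobian


def best_lambda(values, grid=None):
    """Return the grid value of lam with the highest profile log-likelihood."""
    if grid is None:
        grid = [i / 100.0 for i in range(-200, 201)]  # -2.00 ... 2.00
    return max(grid, key=lambda lam: profile_log_likelihood(values, lam))


# Hypothetical predictor whose logarithm is evenly spaced, so the estimate
# should fall at or very near lam = 0 (a log transform).
data = [math.exp(x / 4.0) for x in range(1, 21)]
lam_hat = best_lambda(data)
transformed = box_cox(data, lam_hat)
```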

The main goal of our mixed spatial analyses was to determine the combination of

biogeography hypotheses that best predicts the observed biodiversity patterns. If each explanatory

variable was incorporated natively, due to considerable multicollinearity, often only a few

variables would end up contributing to a majority of the model. To estimate the true contribution

of each hypothesis in the context of a mixed model (even if highly correlated to others), we

developed a novel approach that removes collinearity from the explanatory variables (but in the

process explicit variable identity is temporarily lost). The transformed explanatory variables are

then run in a CAR analysis and the resulting standardized model contributions are then

transformed back into the original variable identities, reflecting the relative contribution of

each in the model. This process is casually referenced here as Orthogonally Transformed Beta

Coefficients (OTBCs).

Orthogonally Transformed Beta Coefficients

Each biogeography hypothesis was standardized from zero to one. This ensured that the

component loadings reflected the relative contribution of each biogeography hypothesis. A

principal component analysis was performed on the standardized biogeography hypotheses using

a covariance matrix. All the resulting principal components (PCs) were extracted and then loaded

as explanatory variables in the CAR model. The CAR analyses were run iteratively, starting with

all PCs as explanatory variables and then excluding each PC that did not contribute significantly to

the model (α = 0.05) until the final model included only PCs that contributed significantly to the

model. Because each PC represented a linearly uncorrelated variable, only the relevant,

independent data were incorporated into the final CAR model. The resulting standardized beta

coefficients (βj from the CAR analyses, Fig. 1 and Equation 1) were then multiplied by the value

of the corresponding component loadings (αij from the PCA, see Equation 1). The absolute value

of the product reflects the relative contributions of each biogeography hypothesis to each PC,

which are weighted by the PC’s contribution in the CAR model (herein termed the weighted

component loadings or WCLij, Equation 1). The weighted component loadings (WCLij, Equation

1) were then summed for each biogeography hypothesis across all PCs (Hi) and depict the

contributions of each hypothesis in the CAR model. The Hi value was then converted to

percentages (HPi) to allow comparison among all CAR analyses. A positive or negative

correlation was determined for each biogeography hypothesis by running a separate CAR

analysis using each raw biogeography variable as a single explanatory variable (all other parameters

were matched).

Equation 1

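\[
\mathrm{WCL}_{ij} = |\beta_j| \times |\alpha_{ij}|, \qquad
H_i = \sum_{j} \mathrm{WCL}_{ij}, \qquad
H_{All} = \sum_{i}\sum_{j} \mathrm{WCL}_{ij}, \qquad
HP_i = \left( \frac{H_i}{H_{All}} \right) \times 100
\]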
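
The back-transformation in Equation 1 can be sketched as follows, where i indexes biogeography hypotheses and j indexes the principal components retained in the final CAR model. The function name and all numeric values are hypothetical; the beta coefficients and loadings would in practice come from the CAR analysis and the PCA, respectively.

```python
def otbc_percentages(betas, loadings):
    """Convert CAR betas on PCs into percent contributions per hypothesis.

    betas:    dict mapping retained PC name -> standardized beta coefficient
    loadings: dict mapping hypothesis -> dict of PC name -> component loading
    Returns a dict mapping hypothesis -> HP_i (percent contribution).
    """
    h = {}
    for hypothesis, row in loadings.items():
        # WCL_ij = |beta_j| * |alpha_ij|, summed over the retained PCs only.
        h[hypothesis] = sum(abs(betas[pc]) * abs(row[pc]) for pc in betas)
    h_all = sum(h.values())
    return {hypothesis: 100.0 * value / h_all for hypothesis, value in h.items()}


# Hypothetical final model: PC2 was dropped as non-significant.
betas = {"PC1": 0.5, "PC3": -0.25}
loadings = {
    "topography": {"PC1": 0.8, "PC2": 0.1, "PC3": 0.4},
    "climate": {"PC1": -0.2, "PC2": 0.9, "PC3": 0.8},
}
hp = otbc_percentages(betas, loadings)
```

Because absolute values are used, the percentages carry no sign; the direction of each relationship has to be determined separately, as described in the text.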

Acknowledgments

We are grateful to numerous friends and colleagues who provided invaluable assistance

during fieldwork and previous discussions of Madagascar's biogeography. We would like to

particularly thank Franco Andreone, Parfait Bora, Christopher Blair, Lauren Chan, Sebastian

Gehring, Frank Glaw, Steve M. Goodman, Jörn Köhler, Peter Larsen, David C. Lees, Brice P.

Noonan, Maciej Pabijan, Ted Townsend, Krystal Tolley, Roger Daniel Randrianiaina,

Fanomezana Ratsoavina, David R. Vieites, and Katharina C. Wollenberg. Fieldwork of MV was

funded by the Volkswagen Foundation. JLB was supported by the National Science Foundation

under Grant No. 0905905 and by Duke University start-up funds to ADY.

References

1 Kent, M. Biogeography and macroecology. Progress in Physical Geography 29, 256-264, doi:10.1191/0309133305pp447pr (2005).

2 Beck, J. et al. What's on the horizon for macroecology? Ecography 35, 673-683, doi:10.1111/j.1600-0587.2012.07364.x (2012).

3 Whittaker, R. H. Vegetation of the Siskiyou Mountains, Oregon and California. Ecol Monogr 30, 279-338, doi:10.2307/1943563 (1960).

4 Whittaker, R. H. Evolution and Measurement of Species Diversity. Taxon 21, 213-251, doi:10.2307/1218190 (1972).

5 Williams, P. H. Mapping variations in the strength and breadth of biogeographic transition zones using species turnover. Proceedings of the Royal Society B-Biological Sciences 263, 579-588, doi:10.1098/rspb.1996.0087 (1996).

6 Kreft, H. & Jetz, W. A framework for delineating biogeographical regions based on species distributions. Journal of Biogeography 37, 2029-2053, doi:10.1111/j.1365-2699.2010.02375.x (2010).

7 Holt, B. G. et al. An Update of Wallace's Zoogeographic Regions of the World. Science 339, 74-78, doi:10.1126/science.1228282 (2013).

8 Kremen, C. et al. Aligning conservation priorities across taxa in Madagascar with high-resolution planning tools. Science 320, 222-226, doi:10.1126/science.1155193 (2008).

9 Hoffmann, M. et al. The Impact of Conservation on the Status of the World's Vertebrates. Science 330, 1503-1509, doi:10.1126/science.1194442 (2010).

10 Platnick, N. I. On areas of endemism. Australian Systematic Botany 4, 2pp.-2pp. (1991).

11 Harold, A. S. & Mooi, R. D. Areas of endemism: definition and recognition criteria. Systematic Biology 43, 261-266, doi:10.2307/2413466 (1994).

12 Crisp, M. D., Laffan, S., Linder, H. P. & Monro, A. Endemism in the Australian flora. Journal of Biogeography 28, 183-198 (2001).

13 Terborgh, J. & Winter, B. A method for siting parks and reserves with special reference to Columbia and Ecuador. Biological Conservation 27, 45-58 (1983).

14 Ackery, P. R. & Vane-Wright, R. I. Milkweed butterflies: Their cladistics and biology. (British Museum of Natural History and Cornell University Press, 1984).

15 Hawkins, B. A. et al. Energy, water, and broad-scale geographic patterns of species richness. Ecology 84, 3105-3117, doi:10.1890/03-8006 (2003).

16 Jetz, W., Rahbek, C. & Colwell, R. K. The coincidence of rarity and richness and the potential signature of history in centres of endemism. Ecology Letters 7, 1180-1191, doi:10.1111/j.1461-0248.2004.00678.x (2004).

17 Carnaval, A. C., Hickerson, M. J., Haddad, C. F. B., Rodrigues, M. T. & Moritz, C. Stability Predicts Genetic Diversity in the Brazilian Atlantic Forest Hotspot. Science 323, 785-789, doi:10.1126/science.1166955 (2009).

18 Kozak, K. H., Graham, C. H. & Wiens, J. J. Integrating GIS-based environmental data into evolutionary biology. Trends in Ecology & Evolution 23, 141-148, doi:10.1016/j.tree.2008.02.001 (2008).

19 Lamoreux, J. F. et al. Global tests of biodiversity concordance and the importance of endemism. Nature 440, 212-214, doi:10.1038/nature04291 (2006).

20 Linder, H. P. et al. The partitioning of Africa: statistically defined biogeographical regions in sub-Saharan Africa. Journal of Biogeography 39, 1189-1205, doi:10.1111/j.1365-2699.2012.02728.x (2012).

21 Olivero, J., Márquez, A. L. & Real, R. Integrating fuzzy logic and statistics to improve the reliable delimitation of biogeographic regions and transition zones. Systematic biology 62, 1-21 (2013).

22 Graham, C. H., Moritz, C. & Williams, S. E. Habitat history improves prediction of biodiversity in rainforest fauna. Proceedings of the National Academy of Sciences of the United States of America 103, 632-636, doi:10.1073/pnas.0505754103 (2006).

23 Vences, M., Wollenberg, K. C., Vieites, D. R. & Lees, D. C. Madagascar as a model region of species diversification. Trends in Ecology & Evolution 24, 456-465, doi:10.1016/j.tree.2009.03.011 (2009).

24 Goodman, S. M. & Benstead, J. P. The natural history of Madagascar. (University of Chicago Press Chicago, 2003).

25 Yoder, A. D. & Nowak, M. D. Has vicariance or dispersal been the predominant biogeographic force in Madagascar? Only time will tell. Annual Review of Ecology, Evolution, and Systematics, 405-431 (2006).

26 Crottini, A. et al. Vertebrate time-tree elucidates the biogeographic pattern of a major biotic change around the K-T boundary in Madagascar. Proceedings of the National Academy of Sciences of the United States of America 109, 5358-5363, doi:10.1073/pnas.1112487109 (2012).

27 Samonds, K. E. et al. Spatial and temporal arrival patterns of Madagascar's vertebrate fauna explained by distance, ocean currents, and ancestor type. Proceedings of the National Academy of Sciences of the United States of America 109, 5352-5357, doi:10.1073/pnas.1113993109 (2012).

28 Yoder, A. D. et al. A multidimensional approach for detecting species patterns in Malagasy vertebrates. Proceedings of the National Academy of Sciences of the United States of America 102, 6587-6594, doi:10.1073/pnas.0502092102 (2005).

29 Pastorini, J., Thalmann, U. & Martin, R. D. A molecular approach to comparative phylogeography of extant Malagasy lemurs. Proceedings of the National Academy of Sciences of the United States of America 100, 5879-5884, doi:10.1073/pnas.1031673100 (2003).

30 Goodman, S. M. & Ganzhorn, J. U. Biogeography of lemurs in the humid forests of Madagascar: the role of elevational distribution and rivers. Journal of Biogeography 31, 47-55, doi:10.1111/j.1365-2699.2004.00953.x (2004).

31 Yoder, A. D. & Heckman, K. L. Mouse lemur phylogeography revises a model of ecogeographic constraint in Madagascar. (2006).

32 Dewar, R. E. & Richard, A. F. Evolution in the hypervariable environment of Madagascar. Proceedings of the National Academy of Sciences of the United States of America 104, 13723-13727, doi:10.1073/pnas.0704346104 (2007).

33 Wilme, L., Goodman, S. M. & Ganzhorn, J. U. Biogeographic evolution of Madagascar's microendemic biota. Science 312, 1063-1065, doi:10.1126/science.1122806 (2006).

34 Wollenberg, K. C., Vieites, D. R., Glaw, F. & Vences, M. Speciation in little: the role of range and body size in the diversification of Malagasy mantellid frogs. Bmc Evolutionary Biology 11, doi:10.1186/1471-2148-11-217 (2011).

35 Wollenberg, K. C. et al. Patterns of endemism and species richness in Malagasy cophyline frogs support a key role of mountainous areas for speciation. Evolution; international journal of organic evolution 62, 1890-1907, doi:10.1111/j.1558-5646.2008.00420.x (2008).

36 Pabijan, M., Wollenberg, K. C. & Vences, M. Small body size increases the regional differentiation of populations of tropical mantellid frogs (Anura: Mantellidae). Journal of evolutionary biology 25, 2310-2324, doi:10.1111/j.1420-9101.2012.02613.x (2012).

37 Pearson, R. G. & Raxworthy, C. J. The evolution of local endemism in madagascar: watershed versus climatic gradient hypotheses evaluated by null biogeographic models. Evolution; international journal of organic evolution 63, 959-967, doi:10.1111/j.1558-5646.2008.00596.x (2009).

38 Ferrier, S., Manion, G., Elith, J. & Richardson, K. Using generalized dissimilarity modelling to analyse and predict patterns of beta diversity in regional biodiversity assessment. Diversity and Distributions 13, 252-264, doi:10.1111/j.1472-4642.2007.00341.x (2007).

39 Allnutt, T. F. et al. A method for quantifying biodiversity loss and its application to a 50-year record of deforestation across Madagascar. Conservation Letters 1, 173-181, doi:10.1111/j.1755-263X.2008.00027.x (2008).

40 Dutilleul, P., Clifford, P., Richardson, S. & Hemon, D. Modifying the t test for assessing the correlation between two spatial processes. Biometrics, 305-314 (1993).

41 Townsend, T. M., Vieites, D. R., Glaw, F. & Vences, M. Testing Species-Level Diversification Hypotheses in Madagascar: The Case of Microendemic Brookesia Leaf Chameleons. Systematic Biology 58, 641-656, doi:10.1093/sysbio/syp073 (2009).

42 Kreft, H. & Jetz, W. Global patterns and determinants of vascular plant diversity. Proceedings of the National Academy of Sciences of the United States of America 104, 5925-5930, doi:10.1073/pnas.0608361104 (2007).

43 Hoeting, J. A. The importance of accounting for spatial and temporal correlation in analyses of ecological data. Ecological Applications 19, 574-577, doi:10.1890/08-0836.1 (2009).

44 Ohlemuller, R., Walker, S. & Wilson, J. B. Local vs regional factors as determinants of the invasibility of indigenous forest fragments by alien plant species. Oikos 112, 493-501, doi:10.1111/j.0030-1299.2006.13887.x (2006).

45 Bacaro, G. & Ricotta, C. A spatially explicit measure of beta diversity. Community Ecology 8, 41-46 (2007).

46 Bacaro, G. et al. Geostatistical modelling of regional bird species richness: exploring environmental proxies for conservation purpose. Biodiversity and Conservation 20, 1677-1694 (2011).

47 Diggle, P. J. & Ribeiro, P. J. J. Model-based Geostatistics. (Springer, 2007).

48 Colwell, R. K. & Lees, D. C. The mid-domain effect: geometric constraints on the geography of species richness. Trends in Ecology & Evolution 15, 70-76, doi:10.1016/s0169-5347(99)01767-x (2000).

49 Raxworthy, C. J. & Nussbaum, R. A. Systematics, speciation and biogeography of the dwarf chameleons (Brookesia; Reptilia, Squamata, Chamaeleontidae) of northern Madagascar. Journal of Zoology 235, 525-558 (1995).

50 Angel, F. Les Lézards de Madagascar. (Academie Malgache, 1942).

51 Humbert, H. Les territoires phytogéographiques de Madagascar. Année Biologique 31, 439–448 (1955).

52 Glaw, F. & Vences, M. Amphibians and Reptiles of Madagascar. (Vences, M. and Glaw Verlags, F. GbR., 1994).

53 Schatz, G. E. in Diversity and Endemism in Madagascar (eds W.R. Lourenço & S.M. Goodman) 1–9 (Société de Biogéographie, MNHN, ORSTOM, 2000).

54 Glaw, F. & Vences, M. Field Guide to the Amphibians and Reptiles of Madagascar. Third Edition edn, (Vences and Glaw Verlag, 2007).

55 Bush, M. B. Amazonian speciation: a necessarily complex model. Journal of Biogeography 21, 5-17, doi:10.2307/2845600 (1994).

56 Oneal, E., Otte, D. & Knowles, L. L. Testing for biogeographic mechanisms promoting divergence in Caribbean crickets (genus Amphiacusta). Journal of Biogeography 37, 530-540, doi:10.1111/j.1365-2699.2009.02231.x (2010).

57 Phillips, S. J., Anderson, R. P. & Schapire, R. E. Maximum entropy modeling of species geographic distributions. Ecological Modelling 190, 231-259, doi:10.1016/j.ecolmodel.2005.03.026 (2006).

58 Phillips, S. J. et al. Sample selection bias and presence-only distribution models: implications for background and pseudo-absence data. Ecological Applications 19, 181-197, doi:10.1890/07-2153.1 (2009).

59 Elith, J. et al. A statistical explanation of MaxEnt for ecologists. Diversity and Distributions 17, 43-57, doi:10.1111/j.1472-4642.2010.00725.x (2011).

60 Hijmans, R. J., Cameron, S. E., Parra, J. L., Jones, P. G. & Jarvis, A. Very high resolution interpolated climate surfaces for global land areas. International Journal of Climatology 25, 1965-1978, doi:10.1002/joc.1276 (2005).

61 Moat, J. & Du Puy, D. (ed Kew Royal Botanic Gardens) (1997).

62 Jarvis, A., Reuter, H. I., Nelson, A. & Guevara, E. (CGIAR-CSI SRTM 90m Database, 2008).

63 Williams, P. H. Some properties of rarity scores for site-quality assessment. British Journal of Entomology and Natural History 13, 73-86 (2000).

64 IBM SPSS Statistics for Windows v. 19.0 (IBM Corporation, Armonk, NY, 2010).

65 Rangel, T. F., Diniz-Filho, J. A. F. & Bini, L. M. SAM: a comprehensive application for Spatial Analysis in Macroecology. Ecography 33, 46-50, doi:10.1111/j.1600-0587.2009.06299.x (2010).

66 Legendre, P. & Legendre, L. Numerical ecology. Vol. 2nd (Elsevier, 1998).

67 Griffith, D. A. in Practical handbook of spatial statistics (ed S. L. Arlinghaus) 65-82 (CRC Press, 1996).

68 Bini, L. M. et al. Coefficient shifts in geographical ecology: an empirical evaluation of spatial and non-spatial regression. Ecography 32, 193-204, doi:10.1111/j.1600-0587.2009.05717.x (2009).

Figure 1. Overview of work protocol and dataflow. Three types of original data were input

into the analyses: (1) biogeography hypotheses, (2) geography and climate data and (3) species

locality data. These data were used to predict distribution of species, and the distribution models

used to calculate biodiversity patterns (species richness, corrected weighted endemism,

turnover). We then tested for correlation of these biodiversity patterns with spatial predictions

derived from biogeography hypotheses, and used a multivariate model to simultaneously test the

influences of these hypotheses on the biodiversity patterns. *The response variables constituted

standardized principal components of the raw biogeography hypotheses. ** The CAR models

were iterated until only response variables that contributed significantly to the model were

included.

Figure 2. Observed Biodiversity Data. Reptile and amphibian species richness (A & E)

measures the number of species present. Endemism (B & F), based on a corrected weighted

calculation, reflects the proportion of unique species present within certain areas. The

generalized dissimilarity model (GDM) analyzes compositional turnover of communities (here

jointly for amphibians and reptiles) and predicts dissimilarity throughout the landscape based on

an interpolation based on variation in climate and geographic data. The continuous GDM (G) can

then be classified into 4 major and 15 minor areas of endemism (H & C).

Figure 3. Explanatory contribution of continuous biogeography hypotheses to a conditional

autoregressive spatial model of each observed biodiversity measurement. Only hypotheses

contributing >5% are shown (see Supplementary Table S4 for complete and detailed data).

Mid-domain: I. latitude, II. longitude, III. distance. Climate-Gradient: I. PC1, II. PC2, III. PC3.

Climate-Stability: I. Precipitation Stability, II. Climate Stability (temperature and precipitation).

Topography: I. Topographic Heterogeneity, II. Disturbance-vicariance, III. Montane Species

Pump. An asterisk marks hypotheses that contributed negatively to the mixed CAR model.

Figure 4. Contribution of continuous biogeography hypotheses to a conditional

autoregressive spatial model of species richness and endemism for four focal groups. Only

hypotheses contributing >5% are shown (see Supplementary Table S4 for complete and detailed

data). Mid-domain: I. latitude, II. longitude, III. distance. Climate-Gradient: I. PC1, II. PC2, III.

PC3. Climate-Stability: I. Precipitation Stability, II. Climate Stability (temperature and

precipitation). Topography: I. Topographic Heterogeneity, II. Disturbance-vicariance, III.

Montane Species Pump. An asterisk marks hypotheses that contributed negatively to the mixed

CAR model. (1) Brookesia chameleons (number of species = 27, number of original distribution

points = 178). (2) Boophis treefrogs (n of spp.= 77, n of pts.=460) (3) Phelsuma day geckos (n of

spp.= 28, n of pts.= 304) (4) oplurid iguanas (Oplurus plus the monotypic Chalarodon; n of

spp.= 7, n of pts.= 147). Subgroups and colors of pie charts as in Fig. 3. For all hypotheses

(with the exception of the climate-gradient variables) a positive correlation with the biodiversity

metrics was expected.

Table 1. Correlations of nominal biodiversity hypotheses to Generalized Dissimilarity Models (upper: 15-class; lower: 4-class

transformation of the original model). The left three columns show calculations based on the percentage of overlapping cells between

boundaries. High mean values depict high levels of shared boundaries among GDM and AOE (derived from the respective

hypothesis). R-values reflect non-spatial Pearson product-moment correlation coefficients. To assess significance of raster data, we

used an unbiased correlation following the method of Dutilleul (1993) that reduces the degrees of freedom according to the level of

spatial autocorrelation between the two variables.

Comparison with GDM, 15 classes

                    Percent overlapping cells (10 km buffer)    Correlation to Generalized Dissimilarity Model
Hypothesis          GDM 15 classes   Hypothesis   Mean          r        F-stat    df        p
Riverine - Major    77.11%           35.31%       56.21%        0.427    13.984    62.537    <.001
Riverine - Minor    72.43%           49.41%       60.92%        0.503    28.599    84.415    <.001
Gradient            74.63%           67.51%       71.07%        0.403    24.974    128.641   <.001
Watershed           60.14%           25.92%       43.03%        0.229    5.887     106.824   0.017
GDM-4               100.00%          38.71%       69.35%        0.551    26.257    60.171    <.001
Riverine - Refuge   71.00%           20.69%       45.85%        0.440    11.450    47.693    0.001

Comparison with GDM, 4 classes

                    Percent overlapping cells (10 km buffer)    Correlation to Generalized Dissimilarity Model
Hypothesis          GDM 4 classes    Hypothesis   Mean          r        F-stat    df        p
Riverine - Major    41.6%            40.3%        40.9%         0.556    12.519    28.042    0.001
Riverine - Minor    39.4%            58.1%        48.7%         0.51     16.345    46.550    <.001
Gradient            39.6%            69.0%        54.3%         0.378    13.012    77.926    <.001
Watershed           33.2%            33.3%        33.2%         0.213    3.775     79.701    0.056
GDM-15              52.1%            100.0%       76.1%         0.551    26.257    60.171    <.001
Riverine - Refuge   30.3%            22.1%        26.2%         0.837    52.355    22.433    <.001

SUPPLEMENTARY MATERIALS

Hypotheses

Climate Stability Hypothesis (Fig S2.J)

Climate stability is thought to create greater climatic stratification across environmental

gradients (Dynesius and Jansson 2000, Jansson and Dynesius 2002). In stable climates, orbitally

forced species’ range dynamics (ORD) are low, allowing localized populations to persist, and

thus become highly specialized and differentiated. Like the Gradient Hypothesis (following), this

model also focuses on bioclimatic disparities, but additionally incorporates climate stability. This

hypothesis states that areas of climate stability, particularly those with climatic stratification,

should possess higher species richness and endemism.

To estimate this model for Madagascar, collinearity was measured for all BIOCLIM layers

(Bio1–19) using a Pearson Coefficient. If R2 values exceeded 0.5, one of the layers was

excluded. We preferentially selected layers based on raw data (e.g. selecting mean annual

precipitation over seasonality). The following layers were excluded: Bio3, Bio7, Bio9, Bio13–

18. For each remaining BIOCLIM layer we calculated the standard deviation of each cell

throughout the four time periods for which climate data were available (0 kya, 6kya, 21 kya, 120

kya). The resulting standard deviation of each BIOCLIM layer was standardized to 1 to account

for different units in raw data. All standardized stability layers were summed to create the final

climate stability layer; with lower values representing higher climate stability through time.
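
The construction of the stability layer can be sketched as below. Two readings of the text are assumed and should be treated as such: the standard deviation is the population form, and "standardized to 1" is taken to mean dividing each layer by its maximum so that values range up to 1. The function names and toy values are hypothetical.

```python
import math


def cell_std(values):
    """Population standard deviation of one cell across time periods."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))


def stability_layer(layers_by_period):
    """Sum per-variable standardized temporal SDs for every cell.

    layers_by_period maps a variable name to a list of grids (one per time
    period), each grid flattened to a list of cell values. Lower output
    values indicate higher stability through time.
    """
    n_cells = len(next(iter(layers_by_period.values()))[0])
    total = [0.0] * n_cells
    for periods in layers_by_period.values():
        sd = [cell_std([grid[i] for grid in periods]) for i in range(n_cells)]
        top = max(sd)
        if top > 0:
            sd = [s / top for s in sd]  # rescale so the layer maximum is 1
        for i in range(n_cells):
            total[i] += sd[i]
    return total


# Two hypothetical variables, four time periods, two cells. Cell 0 never
# changes; cell 1 fluctuates in both variables.
layers = {
    "bio1": [[20.0, 25.0], [20.0, 26.0], [20.0, 23.0], [20.0, 26.0]],
    "bio12": [[1000.0, 800.0], [1000.0, 1200.0], [1000.0, 800.0], [1000.0, 1200.0]],
}
stability = stability_layer(layers)
```

Rescaling each variable before summing prevents variables with large raw units, such as precipitation in mm, from dominating the combined layer.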

Disturbance-Vicariance Hypothesis (Fig S2.I)

Under this model, the major factor contributing to diversification was temperature fluctuations

(Colinvaux 1993, Bush 1994, Haffer 1997), rather than fluctuations in precipitation and forest

fragmentation (as in the preceding Refuge and River hypotheses). This hypothesis states that cyclic

fluctuations of temperature during the Quaternary caused reoccurring displacement of taxa

towards lower and higher altitudes (during cool and warm periods, respectively). Range

retractions of taxa into the highlands occasionally resulted in allopatric divergence into new

species. The gradual displacement of temperature specialized taxa would cause reoccurring

invasions and counter-invasions of heterogeneous landscapes by both montane and lowland taxa.

This hypothesis predicts species richness and endemism would be highest in areas of topographic

heterogeneity and temperature instability.

This hypothesis was generated by measuring collinearity between all BIOCLIM layers

corresponding to temperature (Bio1– Bio11, see Climate Stability above for details). The

following layers were excluded: Bio3, Bio7 and Bio11. For each remaining BIOCLIM layer we

calculated the standard deviation of each cell throughout the four time periods for which climate

data were available (0 kya, 6kya, 21 kya, 120 kya). This layer was standardized from 0 to 1 to

account for different units between layers. All standardized stability layers were summed to

create the final temperature stability layer with lower values representing higher temperature

stability. This layer was inverted and multiplied by a standardized version of the final

topographic heterogeneity layer (see above). Higher values represented areas with high

temperature instability through time and high topographic heterogeneity.

Gradient Hypothesis (Fig S1.E)

According to the Gradient Hypothesis, diversification of Malagasy taxa was driven by

bioclimatic disparities throughout the island (i.e. between the east and west), causing parapatry

of populations along environmental gradients. This hypothesis was first formulated by Endler

(1982); more recently, it was adapted and formulated in detail for Madagascar by

Vences et al. (2010). Vences et al. focused on the climatic stratification longitudinally between

the dry west and humid east of Madagascar, terming this special case the Ecogeographic

hypothesis. Under the Gradient hypothesis, populations adapt to local ecotones, diverging from a

generalist ancestor (or one of broader ecological tolerance). Due to local specialization, gene

flow among ecologically similar sites is higher than among ecologically different sites

across a gradient. The subdivision of populations creates a scenario where drift or selection can

override gene flow among ecogeographic subpopulations (Fisher 1930) and daughter species

occupy separate, adjacent niches. The exact mechanism of speciation is controversial (e.g.

prezygotic isolation, behavioral isolation or reproductive barriers) and is beyond the focus of this

study. This hypothesis invokes no barriers or mechanism of allopatry and predicts species

richness and endemism to be highest in areas of high bioclimatic stratification. This GIS

prediction was obtained from Pearson & Raxworthy (2009). In our continuous CAR-OBTC

analyses, this hypothesis was represented by the first three PCs from a PCA on the 19 current

BIOCLIM layers for Madagascar (Hijmans et al. 2005).
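
Extracting the first three principal components from a set of climate layers can be sketched as below. This is an illustrative NumPy example only: the z-scoring of each variable before the PCA, the function name and the random toy matrix standing in for the 19 BIOCLIM layers are assumptions, not details taken from the study.

```python
import numpy as np

def first_pcs(cell_by_var, n_components=3):
    """Return scores on the first principal components.

    cell_by_var: (n_cells, n_variables) array, one row per grid cell.
    Variables are z-scored first so that differing units do not dominate.
    """
    x = np.asarray(cell_by_var, dtype=float)
    z = (x - x.mean(axis=0)) / x.std(axis=0)
    u, s, vt = np.linalg.svd(z, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = (s ** 2) / np.sum(s ** 2)      # proportion of variance per PC
    return scores, explained[:n_components]

# Toy stand-in (assumed) for 200 grid cells x 19 climate layers.
rng = np.random.default_rng(1)
bioclim = rng.normal(size=(200, 19))
bioclim[:, 1] = 2 * bioclim[:, 0] + 0.01 * rng.normal(size=200)  # a collinear pair
scores, explained = first_pcs(bioclim)
```

Each column of `scores` can then be reshaped back to the raster grid and used as one continuous predictor.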

Mid-domain Effect Hypothesis (Fig S2. A-C)

Initially described to explain latitudinal trends in species richness, this mathematical hypothesis

demonstrates that if species’ ranges are distributed randomly between northern and southern

geographic limits, the highest overlap of species ranges would be in the middle (Lees 1996, Lees

and Colwell 2007). For Madagascar, this hypothesis (in two dimensions) results in increased

overlap of species toward the center of the country, over the Ankaratra highlands, resulting from the

sum of randomly overlapping species ranges. We explore four variants of this hypothesis: three in one

dimension (the mid-domain of latitudinal, altitudinal, or longitudinal gradients) and one in two dimensions, as

described above (the mid-domain of both latitudinal and longitudinal gradients). Note that the mid-

domain of altitude uses the same GIS calculation as the Museum hypothesis. Thus, throughout the

manuscript this hypothesis is referred to as the Museum hypothesis rather than the mid-domain

altitude hypothesis. In one dimension, the latitude hypothesis predicts that species richness

would be highest around 18° S, or in two dimensions, centered around 18° S and 46.5° E. The

mid-domain hypotheses are included as a null hypothesis for spatial variation in species richness;

these hypotheses invoke no barriers or ecological/habitat heterogeneity and rely solely on the

random distribution of species in a defined geographic space.
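
The geometric logic of the mid-domain effect can be illustrated with a small worked example. The sketch below is one simple null model chosen for illustration (every possible contiguous range on a bounded one-dimensional domain is counted once); it is not the GIS layer used in the study, and the function name and domain size are assumptions.

```python
import numpy as np

def mid_domain_richness(n_cells):
    """Count how many of all possible contiguous ranges cover each cell
    of a bounded one-dimensional domain with n_cells cells."""
    richness = np.zeros(n_cells, dtype=int)
    for start in range(n_cells):
        for end in range(start, n_cells):
            richness[start:end + 1] += 1   # this range covers cells start..end
    return richness

richness = mid_domain_richness(11)
peak_cell = int(np.argmax(richness))
```

With 11 cells, the edge cells are covered by 11 ranges each and the central cell by 36, so overlap peaks in the middle of the domain without invoking any barrier or environmental gradient.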

Montane Species Pump Hypothesis (Fig S2.M)

This hypothesis predicts that montane regions have higher species richness because of their

topographic complexity and climatic zonation, both of which increase opportunities for allopatric and

parapatric speciation (e.g., Moritz et al. 2000; Rahbek and Graves 2001; Hall 2005; Fjeldså

and Rahbek 2006; Kozak and Wiens 2007). This hypothesis predicts that speciation should be

highest in areas of high topographic and climatic heterogeneity. Within those habitats, rates of

speciation should be highest in mid-elevations. This hypothesis predicts species richness and

endemism would be highest in areas of topographic and climatic heterogeneity.

To create a spatially explicit model of this hypothesis, we calculated the mean standard

deviation of the first three climate PCs (based on current BIOCLIM data) within a ca. 10 km² square

neighborhood of each cell. The three resulting layers were summed and the sum was standardized

from 0 to 1. The resulting layer was then multiplied by the standardized (0–1) topographic heterogeneity

layer. High values depict areas of high topographic and climatic heterogeneity.
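
A neighborhood (focal) standard deviation and the combination step can be sketched as follows. This is a hedged illustration in NumPy: the 3 x 3 cell window standing in for the ca. 10 km² neighborhood, the edge padding, the function names and the toy rasters are assumptions, and the same focal statistic applied to elevation is used here as a stand-in for the topographic heterogeneity layer.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def focal_sd(raster, size=3):
    """Standard deviation within a size x size window around each cell.
    Edges are padded by repeating the border values."""
    pad = size // 2
    padded = np.pad(raster.astype(float), pad, mode="edge")
    windows = sliding_window_view(padded, (size, size))
    return windows.std(axis=(-2, -1))

def standardize01(layer):
    """Rescale a raster to the 0-1 range (a constant raster maps to 0)."""
    lo, hi = layer.min(), layer.max()
    return np.zeros_like(layer, dtype=float) if hi == lo else (layer - lo) / (hi - lo)

def montane_species_pump(climate_pcs, topo_heterogeneity, size=3):
    """climate_pcs: (3, rows, cols); topo_heterogeneity: (rows, cols)."""
    climate_het = sum(focal_sd(pc, size) for pc in climate_pcs)
    return standardize01(climate_het) * standardize01(topo_heterogeneity)

# Toy rasters (assumed): the left half is flat and climatically uniform.
rows, cols = 6, 8
rng = np.random.default_rng(2)
climate_pcs = np.zeros((3, rows, cols))
climate_pcs[:, :, 4:] = rng.normal(size=(3, rows, 4))
elevation = np.zeros((rows, cols))
elevation[:, 4:] = rng.uniform(0, 2000, size=(rows, 4))
topo = focal_sd(elevation)
pump = montane_species_pump(climate_pcs, topo)
```

Because the two standardized layers are multiplied, a cell scores highly only when both climatic and topographic heterogeneity are high.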

Museum Hypothesis (Fig S2.D)

According to this hypothesis, speciation occurred in montane habitats. In the Montane Museum

hypothesis, more species exist at intermediate elevations because these elevations were simply

occupied the longest; as a result, there has been more time for speciation and the

accumulation of species in these habitats when compared to habitats at lower and higher

elevations (Stebbins 1974; Stephens and Wiens 2003). This hypothesis predicts levels of

endemism and richness will be highest in the middle elevations. Note that in execution, the

prediction for this hypothesis is identical to a Mid-domain hypothesis of altitude. The GIS

representation of this hypothesis for Madagascar is centered on the median elevation (378 m), where

the overlap of species should peak. This elevation was given a value of one and from this

elevation, values linearly transitioned to zero at the maximum and minimum elevations.
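
The linear scoring described above can be written compactly. The sketch below is illustrative only; the function name and the five toy elevation values are assumptions, and only the peak value of 378 m comes from the text.

```python
import numpy as np

def museum_layer(elevation, peak=None):
    """Score 1 at the peak elevation, falling linearly to 0 at the
    minimum and maximum elevations of the raster."""
    elev = np.asarray(elevation, dtype=float)
    if peak is None:
        peak = np.median(elev)       # default: median elevation of the raster
    lo, hi = elev.min(), elev.max()
    return np.interp(elev, [lo, peak, hi], [0.0, 1.0, 0.0])

# Toy elevations in meters (assumed values).
elevation = np.array([0.0, 189.0, 378.0, 1627.0, 2876.0])
scores = museum_layer(elevation, peak=378.0)
```

Here the cells at 189 m and 1627 m both score 0.5, because each lies halfway between the peak and the nearer elevation extreme.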

Paleogeographic Hypothesis

This hypothesis states that vicariant differentiation of Malagasy lineages is associated with

the formation of geologic barriers to dispersal. Each hypothesis is specific to the focal

paleogeographic event. There are two major classes of geological events that have occurred since

the separation from India (ca. 65 Mya). The first, a direct result of the separation from India, is

the orogeny of the eastern escarpment. This hypothesis states that deep lineage splits (pre-65 Mya)

should exist between eastern and western species. A second, more local class of barrier is

volcanism in several regions: Tsaratanana, Manongarivo, Ampasindava (ca. 50 Mya), Ankaratra

(phase 1 ca. 28 Mya, phase 2 ca. 15 Mya, phase 3 ca. 2 Mya) and Ambre (ca. 2 Mya; Krause 2007).

Clades should exhibit breaks around areas of volcanic activity, though given the localization of

these barriers, biota in most cases should have been able to disperse around them, albeit perhaps

with reduced gene flow. All paleogeographic hypotheses are difficult to test because, more strongly

than other hypotheses, they depend on the location of ancestral populations and the timing of

geological events relative to a clade's diversification. Given the variation in ages of origin among

Madagascar's vertebrates, it is likely that some clades were affected by most of the major

paleogeographical events while others were only affected by some. Because of these factors, it is

unlikely that species exhibit congruent biogeographic patterns. Thus, in this paper, we were

unable to test this hypothesis.

Refuge Hypothesis (Fig S2.D)

This hypothesis holds that episodic fragmentation of forests resulted in isolated patches of wet

forest, which caused vicariant differentiation between adjacent patches. These transitions were

driven by periodic changes (every 20-100 Ky) in insolation associated with Milankovitch cycles

(aberrations of the orbit of the earth around the sun due to the slightly asymmetrical shape of the

earth). Recurrent changes in insolation caused dry periods followed by humid periods, which were

particularly pronounced at tropical latitudes. The evolutionary consequences of paleoecological

changes in climate likely depend on the regional topography, predominant weather patterns and the

impact on the ecosystem (for example, a slight reduction in rain may have different biological

consequences in a spiny forest versus a rainforest). It has been argued that in Amazonia, during dry

climatic periods of the late Tertiary and Quaternary, extensive humid forests survived due to subtle

topographic variation that facilitated rainfall gradients adjacent to the Andes, Guianan highlands,

Rondonia and hilly areas east of Pará (Haffer 1969; Vanzolini 1970, 1973; Brown et al. 1974;

Prance 1982, 1996).

In addition to changes in insolation associated with Milankovitch cycles, the gradual

drifting of Madagascar northward towards the equator likely led to habitat changes, presenting

another source of refugial isolation. Thus, it is plausible that refugia persisted and species with

narrow ecological niches became isolated and diverged into separate species. To create a

continuous spatial representation of this hypothesis for Madagascar, collinearity was first measured between

all pairs of BIOCLIM precipitation layers (Bio12–Bio19) using a Pearson correlation. If R² values

exceeded 0.5, one layer of the pair was excluded. This resulted in the exclusion of Bio13, Bio15,

Bio16 and Bio17. We preferentially selected layers based on raw data (e.g. selecting mean

precipitation over seasonality). For each remaining BIOCLIM layer, we calculated the standard

deviation of each cell throughout the four time periods for which climate data were available (0

kya, 6 kya, 21 kya, 120 kya). Each resulting layer was standardized from 0 to 1 to account for different

units/decimal places in raw data between layers. All standardized stability layers were summed

to create the final precipitation stability layer with lower values representing higher precipitation

stability.
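
The collinearity screen described above can be sketched as a simple greedy filter. This is an assumption-laden illustration rather than the procedure actually used: the function name, the priority ordering (standing in for the preference for layers based on raw data) and the toy layers are hypothetical.

```python
import numpy as np

def drop_collinear(layers, names, priority, r2_max=0.5):
    """Greedy filter: visit layers in priority order and keep a layer
    only if its squared Pearson r with every kept layer is <= r2_max."""
    flat = {n: np.ravel(l) for n, l in zip(names, layers)}
    kept = []
    for name in priority:
        r2 = [np.corrcoef(flat[name], flat[k])[0, 1] ** 2 for k in kept]
        if all(v <= r2_max for v in r2):
            kept.append(name)
    return kept

# Toy layers (assumed): layer_b is a linear function of layer_a.
cells = np.arange(100.0)
layer_a = cells
layer_b = 2 * cells + 1
layer_c = (-1.0) ** cells            # alternating values, nearly uncorrelated
kept = drop_collinear([layer_a, layer_b, layer_c],
                      ["layer_a", "layer_b", "layer_c"],
                      priority=["layer_a", "layer_b", "layer_c"])
```

The retained layers would then go through the same through-time standard deviation, standardization and summation steps as the other stability layers.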

Riverine Barrier Hypothesis (Fig S1.A,B)

Malagasy rivers are thought to have acted as barriers separating populations, resulting in intra-

riverine species and subspecies. As the geology of Madagascar has remained relatively stable

since its complete isolation, major rivers should have persisted (at least seasonally). Under this

hypothesis, we expect vicariant differentiation of Malagasy lineages associated with large

tributaries. This hypothesis has drawn several criticisms: rivers frequently

change course, causing land and its inhabitants to passively transfer across the barrier; rivers

cease to act as barriers due to the lack of geographic separation at headwaters (Wallace 1852); and

temporal fluctuations of climate cause rivers to change in size (e.g. the sizes of many rivers were

dramatically reduced during the Pleistocene).

To create a spatial prediction for this hypothesis, we selected all major permanent rivers

that have headwaters above 1000 m and created polygons from the lowland areas between major

rivers and the 1000 m contour line. A second calculation focused on major riverine units composed

of areas between rivers with headwaters above 2000 m; here, biogeographic units were created from

lowland areas between these rivers and the 1000 m contour line.

River-Refuge Hypothesis (Fig S1.C)

The River-Refuge hypothesis, initially described by Haffer (1992, 1993) for the Neotropics, was

more recently proposed as a model of Malagasy diversification (Craul et al. 2007). This

hypothesis combines aspects of the Riverine Barrier hypothesis and the Refuge hypothesis: it

states that lowland vicariant speciation occurs in refugia separated by broad lowland

rivers and by considerable unsuitable terrain in the headwaters.

To estimate this model, we combined the Riverine layer and a binary refuge layer. The

continuous refuge layer was converted to a binary model by converting the top quartile to 1 and

all other values to 0. To account for differences in precipitation stability between arid and

wet areas, we excluded areas with less than 50 cm of precipitation in any of the time periods. Areas above

1000 m were also excluded. Regions of the Riverine layer and binary refuge layer were then

combined into smaller river-refuge subunits.
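
The binary conversion and masking steps can be sketched as below. Several choices here are assumptions rather than details from the text: the input refuge scores are assumed to be oriented so that higher values are more refuge-like, the exclusions are applied before the quartile cutoff is computed, and the function names and toy arrays are hypothetical.

```python
import numpy as np

def binary_refuge(refuge, precip_by_period_cm, elevation_m,
                  min_precip_cm=50.0, max_elev_m=1000.0):
    """Top quartile of the refuge scores -> 1, everything else -> 0,
    after excluding dry cells and cells above the elevation limit."""
    wet_enough = precip_by_period_cm.min(axis=0) >= min_precip_cm
    valid = wet_enough & (elevation_m <= max_elev_m)
    cutoff = np.quantile(refuge[valid], 0.75)
    return ((refuge >= cutoff) & valid).astype(int)

def river_refuge_units(river_units, refuge_binary):
    """Keep the river-unit ID where the binary refuge is 1, else 0."""
    return np.where(refuge_binary == 1, river_units, 0)

# Toy data (assumed): 8 cells, 2 time periods.
refuge = np.array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8])
precip = np.full((2, 8), 100.0)
precip[1, 7] = 40.0                       # cell 7 is too dry in one period
elevation = np.full(8, 500.0)
elevation[6] = 1500.0                     # cell 6 lies above 1000 m
river_units = np.array([1, 1, 1, 1, 2, 2, 2, 2])

binary = binary_refuge(refuge, precip, elevation)
units = river_refuge_units(river_units, binary)
```

In the toy example only cells 4 and 5 survive both the exclusions and the quartile cutoff, and they inherit the ID of river unit 2.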

Sanctuary Hypothesis (Fig S2.L)

Past climate changes greatly altered the distributions of organisms through time, causing local

extinctions, bottlenecks, isolation, range expansion and contraction of populations. Sanctuaries

represent specific areas of habitat stability that have remained present through time, differing

from refugia, which do not invoke geospatial consistency and in which species track suitable habitat

across geography (Recuero & García-París 2011). We estimated sanctuaries in Madagascar by

calculating species distribution models (SDMs) for 453 species (a subset of the GDM dataset, using

species with three or more unique occurrence localities). Each SDM was projected onto three historic

time periods: 6 kya, 21 kya (LGM) and 120 kya. All SDMs were converted to binary models using the

maximum training sensitivity plus specificity threshold. Summary statistics across species were:

threshold, 0.080 (SD ± 0.047); area, 0.316 (SD ± 0.196); training omission, 0.005 (SD ± 0.017);

number of training samples, mean 12.321 (max 137, min 3, SD 14.414). For each species, all

four binary SDMs were summed. The resulting layer was reclassified so that values of 3 and

below were converted to zero and values of 4 were converted to 1. Under this classification,

areas where the species was present for all four time periods were considered ‘sanctuaries’. This

was repeated for all species and all sanctuaries were summed to estimate areas of higher species

richness and endemism.
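
The summation and reclassification steps can be sketched as follows. This is a minimal NumPy illustration; the array layout (species x period x rows x cols), the function name and the toy presence data are assumptions.

```python
import numpy as np

def sanctuary_richness(binary_sdms):
    """binary_sdms: (n_species, n_periods, rows, cols) array of 0/1.

    A cell is a sanctuary for a species only if the species is predicted
    present there in every time period; sanctuaries are then summed
    across species.
    """
    n_periods = binary_sdms.shape[1]
    periods_present = binary_sdms.sum(axis=1)             # 0..n_periods
    sanctuaries = (periods_present == n_periods).astype(int)
    return sanctuaries.sum(axis=0)

# Toy data (assumed): 2 species, 4 time periods, 1 x 3 grid.
sdms = np.zeros((2, 4, 1, 3), dtype=int)
sdms[0, :, 0, 0] = 1      # species 0 occupies cell 0 in all four periods
sdms[0, :3, 0, 1] = 1     # species 0 occupies cell 1 in only three periods
sdms[1, :, 0, 0] = 1      # species 1 occupies cell 0 in all four periods
sdms[1, :, 0, 2] = 1      # species 1 occupies cell 2 in all four periods
richness = sanctuary_richness(sdms)
```

Cell 1 scores 0 even though species 0 occurs there in three of the four periods, which mirrors the reclassification of summed values of 3 and below to zero.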

Topographic Heterogeneity (Fig S2.E)

In several studies, the level of topographic variation has been observed to be positively

correlated to species richness patterns and centers of endemism (Kerr & Packer 1997; Rahbek

and Graves 2001; Jetz & Rahbek 2002; Jetz et al. 2004). To characterize this hypothesis for

Madagascar, we measured the standard deviation of elevation within ca. 10 km² of each pixel.

Watershed Hypothesis (Fig S1.D)

One of the more recent diversification hypotheses is the Watershed hypothesis (Wilmé et al.

2006). According to this model, climatic changes caused retraction of forests to the areas surrounding

major rivers. If the headwaters of adjacent rivers were at lower elevations, the intervening areas

between rivers (the watersheds) became arid and forest populations became isolated, serving as

areas of speciation. By contrast, if the headwaters of rivers were at higher elevations, the watersheds

served as areas of retreat and forest refugia remained connected among rivers. These watersheds

are expected to contain proportionally much lower diversity and endemism. This GIS prediction

was obtained from Wilmé et al. (2006).

Additional references.

Brown Jr., K.S., Sheppard, P.M., Turner, J.R.G. 1974. Quaternary refugia in tropical America:

evidence from race formation in Heliconius butterflies. Proc. R. Soc. Lond. B. 187: 369-

Bush, M.B. 1994. Amazonian speciation: a necessarily complex model. Journal of

Biogeography 21:5–17.

Colinvaux, P.A. 1993. Pleistocene biogeography and diversity in tropical forests of South

America. In Goldblatt, P. (Ed.). Biological Relationships between Africa and South America.

Yale University Press, New Haven, CT.

Craul, M., Zimmermann, E., Rasoloharijaona, S., Randrianambinina, B., Radespiel, U. 2007.

Unexpected species diversity of Malagasy primates (Lepilemur spp.) in the same

biogeographical zone: a morphological and molecular approach with the description of

two new species. BMC Evolutionary Biology 7:83.

Dynesius, M., Jansson, R. 2000. Evolutionary consequences of changes in species'

geographical distributions driven by Milankovitch climate oscillations. Proc. Natl Acad.

Sci. U.S.A. 97:9115–9120.

Endler, J. 1982. Pleistocene forest refuges: fact or fancy? In Prance, G.T. (Ed.). Biological

Diversification in the Tropics. New York: Columbia University Press, p. 179-200.

Fisher, R.A. 1930. The Genetical Theory of Natural Selection. Clarendon Press.

Goodman, S.M., Benstead, J.P. (Eds.). 2003. The Natural History of Madagascar. University of

Chicago Press, Chicago.

Fjeldså, J., Rahbek, C. 2006. Diversification of tanagers, a species-rich bird group, from the

lowlands to montane regions of South America. Integr. Comp. Biol. 46:72–81.

(doi:10.1093/icb/icj009)

Haffer, J. 1969. Speciation in Amazonian forest birds. Science 165: 131-137.

Haffer, J. 1997. Alternative models of vertebrate speciation in Amazonia: an overview.

Biodiversity and Conservation 6:451–476.

Haffer, J. 1992. On the “river effect” in some forest birds of southern Amazonia. Bol. Mus.

Para. Emilio Goeldi, sér. Zool. 8:217-245.

Haffer, J. 1993. Time’s cycle and time’s arrow in the history of Amazonia. Biogeographica

69:15-45.

Hall, J. P. 2005. Montane speciation patterns in Ithomiola butterflies (Lepidoptera:

Riodinidae): are they consistently moving up in the world? Proc. R. Soc. B 272:2457–

2466. (doi:10.1098/rspb.2005.3254)

Jansson, R., Dynesius, M. 2002. The fate of clades in a world of recurrent climatic change:

Milankovitch oscillations and evolution. Annu. Rev. Ecol. Syst. 33:741–777.

Jetz, W., Rahbek, C. 2002. Geographic range size and determinants of avian species

richness. Science 297:1548–1551.

Jetz, W., Rahbek, C., Colwell, R.K. 2004. The coincidence of rarity and richness and the

potential signature of history in centers of endemism. Ecol. Lett. 7:1180–1191.

Kerr, J.T., Packer, L. 1997. Habitat heterogeneity determines mammalian species richness in

high energy environments. Nature 385:252–254.

Kozak, K.H., Wiens, J.J. 2007. Climatic zonation drives latitudinal variation in speciation

mechanisms. Proc. R. Soc. B 274:2995-3003. doi: 10.1098/rspb.2007.1106

Krause, D. W. 2003. Late Cretaceous vertebrates of Madagascar: A window into Gondwanan

biogeography at the end of the Age of Dinosaurs. Pp. 40-47 in S. M. Goodman and J. P.

Benstead (eds.), The Natural History of Madagascar. University of Chicago Press,

Chicago.

Lees, D. C. 1996. The Périnet effect? Diversity gradients in an adaptive radiation of

butterflies in Madagascar (Satyrinae: Mycalesina) compared with other rainforest taxa,

Pages 479-490 in W. R. Lourenço, ed. Biogéographie de Madagascar. Paris, Editions de

l'ORSTOM.

Lees, D. C., Colwell R.K. 2007. A strong Madagascan rainforest MDE and no equatorward

increase in species richness: Re-analysis of 'The missing Madagascan mid-domain

effect', by Kerr J.T., Perring M. & Currie D.J (Ecology Letters 9:149-159, 2006). Ecology

Letters 10:E4-E8.

Moritz, C., Patton, J. L., Schneider, C. J., Smith, T. B. 2000. Diversification of rainforest

faunas: an integrated molecular approach. Annu. Rev. Ecol. Syst. 31:533–563.

(doi:10.1146/annurev.ecolsys.31.1.533)

Pearson, R.G., Raxworthy C.J. 2009. The evolution of local endemism in Madagascar:

watershed versus climatic gradient hypotheses evaluated by null biogeographic

models. Evolution 63:959–967.

Prance, G.T. 1982. Biological diversification in the tropics. New York: Columbia University.

Prance, G.T. 1996. Islands in Amazonia. Phil. Trans. R. Soc. Lond. B 351: 823-833.

Rahbek, C. 1997. The relationship among area, elevation, and regional species richness in

Neotropical birds. Am. Nat. 149:875–902.

Rahbek, C., Graves, G. R. 2001. Multiscale assessment of patterns of avian species richness.

Proc. Natl Acad. Sci. USA 98:4534–4539. (doi:10.1073/pnas.071034898)

Recuero, E., García-París, M. 2011. Evolutionary history of Lissotriton helveticus: Multilocus

assessment of ancestral vs. recent colonization of the Iberian Peninsula. Molecular

Phylogenetics and Evolution 60: 170-182.

Stebbins, G. L. 1974. Flowering plants: evolution above the species level. Harvard University

Press, Cambridge, Mass.

Stephens, P. R., Wiens, J.J. 2003. Explaining species richness from continents to

communities: the time-for-speciation effect in emydid turtles. American Naturalist

161:112–128.

Vanzolini, P.E. 1970. Zoología sistemática, geografía e a origem das espécies. São Paulo:

Instituto Geográfico de São Paulo. 56 p.

Vanzolini, P.E. 1973. Paleoclimates, relief, and species multiplication in equatorial forests. In

Meggers, B.J., Ayensu, E.S., Duckworth, W.D. (Eds.). Tropical forest ecosystems in Africa

and South America: A comparative review. Washington: Smithsonian Institution. p.

255-258.

Wallace, A.R. 1852. On the monkeys of the Amazon. London. Proc. Zool. Soc., 1852:107-110.

Wilmé, L., Goodman, S.M., Ganzhorn, J.U. 2006. Biogeographic evolution of Madagascar's

microendemic biota. Science 312:1063–1065.

Supplementary Figure S1. Nominal Biogeography hypotheses. These hypotheses

comprise unordered categories for which relationships among the contained data are unknown

or non-linear. Four hypotheses fell into this category: the Riverine (major and minor), River-

Refuge, Watershed and Gradient. See Table S1 for a summary of each hypothesis.

Supplementary Figure S2. Continuous Biogeography hypotheses. Nine hypotheses fell into

this category: Mid-domain (longitude, latitude and distance), Museum, Topographic

Heterogeneity, Gradient* (PC1-PC3), Disturbance-vicariance, Climate Stability, Precipitation

Stability, Sanctuary and Montane Species Pump (see Table S1 for a summary of each hypothesis).

Inset tables show the percent contribution of each corresponding hypothesis in the CAR

model with the lowest AICc for each observed biodiversity measurement. *The values of the

three climate principal components are not necessarily assumed to reflect a positive correlation with

endemism and species richness; rather, we assume that each reflects a prediction of a linear

correlation (either positive or negative). The inclusion of the three climate principal components

is made possible by the power of CAR models and their ability to include multiple explanatory variables.

Due to the statistical limitations associated with nominal hypotheses, if a hypothesis could be

depicted by continuous data (even if it required several variables), it was converted to this

format and incorporated into the CAR. For example, if nominal data represented classified

continuous data, we included the continuous data (such as the 3 PCs of climate data).

Supplementary Table S1. Major Biogeographic Predictions Relevant to Madagascar.

Columns: Hypothesis; Description; Key Factors; Effects; Predictions for Reptiles and Amphibians; GIS Prediction; Temporal Scope; Key Citations.

Hypothesis: Climate Stability (Fig S2.J)
Description: Climate stability creates greater climatic stratification across environmental gradients
Key Factors: Stable climate, both seasonally and through geologic time; no barrier is necessary
Effects: In stable climates, orbitally forced species' range dynamics (ORD) are low, allowing localized populations to persist, and thus become highly specialized and differentiated.
Predictions for Reptiles and Amphibians: Higher levels of endemism in areas of climatic stability
GIS Prediction: Use GIS and spatiotemporal explicit climate data to estimate climate stability; stable areas should harbor higher species richness and endemism.
Temporal Scope: None
Key Citations: Dynesius & Jansson 2000; Jansson & Dynesius 2002

Hypothesis: Disturbance-vicariance (Fig S2.L)
Description: Allopatry results from altitudinal range retractions caused by temperature fluctuations.
Key Factors: Temperature fluctuations, changes in CO2 and habitat heterogeneity (usually associated with changes in altitude)
Effects: Decreased temperatures allow cool-adapted species to disperse south. Cyclic fluctuations in temperature cause populations to track habitat altitudinally; when temperatures are at their highest, populations become isolated on sky islands
Predictions for Reptiles and Amphibians: Diversity within monophyletic lineages is associated with a single region. Most common ancestor of sister clades dates to the Pleistocene.
GIS Prediction: Estimate areas of high temperature fluctuations adjacent to areas of slope; higher values reflect [...]
Temporal Scope: Quaternary
Key Citations: Colinvaux 1993; Bush 1994; Haffer 1997; Raxworthy & Nussbaum 1995

Hypothesis: Gradient (Fig S1.E)
Description: Parapatry of populations due to environmental gradients
Key Factors: Divergent selection and an environmental gradient; no barrier is necessary
Effects: Parapatric speciation across climate space
Predictions for Reptiles and Amphibians: Sister taxa are found in different habitats along an environmental gradient
GIS Prediction: Cluster analyses of all current climate data to estimate areas of endemism.
Temporal Scope: None
Key Citations: Endler 1982

Hypothesis: Mid-domain Effect (Fig S2.A-C)
Description: If species' ranges are distributed randomly between northern and southern geographic limits, the highest overlap of species ranges would be in the middle.
Key Factors: Geographic space
Effects: No mechanism invoked
Predictions for Reptiles and Amphibians: Species richness should be highest in mid-domains
GIS Prediction: Richness is highest in the mid-domain of latitude, longitude and elevation.
Temporal Scope: None
Key Citations: Lees 1996; Lees and Colwell 2007

Hypothesis: Montane Species Pump (Fig S2.M)
Description: Topographic complexity and climatic zonation of mountains increase opportunities for allopatric and parapatric speciation
Key Factors: Topographic and climatic heterogeneity
Effects: Allopatric and parapatric speciation across elevations
Predictions for Reptiles and Amphibians: Sister taxa share common ancestry with montane ancestor. Extant montane species display higher levels of intraspecific genetic variation.
GIS Prediction: Estimate areas of topographic and climatic heterogeneity; high values reflect centers of high endemism and species richness
Temporal Scope: None
Key Citations: Moritz et al. 2000; Rahbek and Graves 2001; Hall 2005; Fjeldså and Rahbek 2006; Kozak and Wiens 2007; Roy et al. 1997; Fjeldså et al. [...]

Hypothesis: Museum (Fig S2.D)
Description: More species occur at intermediate elevations simply because these elevations were occupied the longest and there has been more time for speciation and the accumulation of species in these habitats relative to those at lower and higher elevations
Key Factors: Extended time occupying mid-elevations
Effects: Increased differentiation at mid-elevations
Predictions for Reptiles and Amphibians: More species occur at intermediate elevations because these elevations were occupied the longest
GIS Prediction: Use GIS to calculate the median elevation of Madagascar, which should possess the highest species richness and endemism
Temporal Scope: None
Key Citations: Stephens and Wiens [...]

Hypothesis: Paleogeographic
Description: Vicariant differentiation of Malagasy lineages is associated with formation of geologic barriers to dispersal. Each hypothesis is specific to the focal paleogeographic event.
Key Factors: Geological changes resulting in vicariant events
Effects: Vicariant differentiation across barriers
Predictions for Reptiles and Amphibians: Distinct east and west lineages associated with the central mountains, and between the southern/central/northern massifs, tsingys.
GIS Prediction: Not calculated
Temporal Scope: Specific to geological [...]
Key Citations: Goodman & Benstead 2003

Hypothesis: Refuge (Fig S2.K)
Description: Allopatry due to retraction of wet habitats
Key Factors: Repeated cycles of drastic fluctuation in precipitation
Effects: Episodic fragmentation of forests resulting in isolated patches of wet forest causing vicariant differentiation between adjacent patches
Predictions for Reptiles and Amphibians: Evolutionary lineages associated with refugia (areas of continued precipitation relative to regional mosaic of habitats)
GIS Prediction: Use GIS and spatiotemporal explicit climate data to estimate stable wet habitats; these areas reflect centers of endemism
Temporal Scope: Cenozoic (Tertiary, Quaternary)
Key Citations: Haffer 1969, 1990, 1999; Endler 1982; Brown 1987; Nores 1999

Hypothesis: Riverine (Fig S1.A,B)
Description: Allopatry due to rivers acting as barriers.
Key Factors: Permanent large rivers
Effects: Vicariant differentiation of Malagasy lineages associated with large tributaries
Predictions for Reptiles and Amphibians: Reciprocal monophyly of clades on opposite sides of [...]
GIS Prediction: Measured inter-riverine areas which are areas of endemism
Temporal Scope: None
Key Citations: Wallace 1853; Capparella 1991; Patton et al. 1994; Goodman & Ganzhorn 2004

Hypothesis: River-Refuge (Fig S1.C)
Description: Allopatry due to the restriction of wet habitats to lower elevations; higher elevation habitats and intervening rivers act as barriers
Key Factors: Reduced precipitation, maintenance of permanent rivers
Effects: Similar to Riverine Hypothesis; fragmental faunal distributions into intra-riverine corridors, isolation is associated with increased aridity adversely affecting habitat suitability at headwater regions
Predictions for Reptiles and Amphibians: [...] adjacent intra-riverine corridors
GIS Prediction: Combine the Riverine Hypothesis subunits and a binary precipitation stability/low elevation layer. Resulting areas depict areas of high predicted endemism.
Temporal Scope: Tertiary (Post-Miocene)
Key Citations: Haffer 1992, 1993; Craul et al. 2007

Hypothesis: Sanctuary (Fig S2.L)
Description: Extinction occurs more often in unstable habitats; thus stable areas accumulate species.
Key Factors: Climate fluctuations through time across heterogeneous landscapes
Effects: Areas of niche stability accumulate species through time.
Predictions for Reptiles and Amphibians: Areas of niche stability (specific aspects of climate vs. overall climate in Climate stability) provide sanctuary for species through time.
GIS Prediction: Use GIS and SDM to estimate stable areas in each species distribution through time; areas of highest stability should harbor higher species richness and endemism
Temporal Scope: None
Key Citations: Recuero & García-París 2011

Hypothesis: Topographic Heterogeneity (Fig S1.D)
Description: The level of topographic variation has been observed to be positively correlated to species richness patterns and centers of endemism
Key Factors: Topography
Effects: No mechanism invoked
Predictions for Reptiles and Amphibians: Species richness should be highest in areas of high topographic heterogeneity
GIS Prediction: Areas of high topographic heterogeneity harbor [...]
Temporal Scope: None
Key Citations: Kerr & Packer 1997; Rahbek and Graves 2001; Jetz & Rahbek 2002; Jetz et al. 2004

Hypothesis: Watershed (Fig S1.D)
Description: Allopatry results from altitudinal range retractions caused by simultaneous decreases in temperature and precipitation. Lower elevation rivers act as barriers.
Key Factors: Repeated cycles of drastic simultaneous increases of both temperature and precipitation. Large permanent rivers and mountains adjacent to lowlands.
Effects: Dispersal during warm-humid periods allows lowland species to disperse altitudinally, across headwater habitat (previously too arid to occupy). Allopatry occurs as climate cools and the species descends into lowlands and lowland rivers prevent gene flow between populations, now occurring on both sides of river.
Predictions for Reptiles and Amphibians: [...] adjacent intra-riverine corridors. Most common ancestor of sister clades dates to the Pleistocene.
GIS Prediction: See Wilmé et al. 2006.
Temporal Scope: Quaternary
Key Citations: Wilmé et al. 2006

Supplementary Table S3. Correlations of biodiversity hypotheses to observed biodiversity patterns. R-values reflect non-spatial

Pearson product-moment correlation coefficients. To assess significance of raster data, we used an unbiased correlation following the

method of Dutilleul (1993). This method reduced the degrees of freedom according to the level of spatial autocorrelation between the

two variables.

                            Reptile                                 Amphibian
Hypothesis                  r         F-stat    df        p         r         F-stat    df        p

Correlation to Observed Endemism
Mid-domain: Distance        -0.148    0.539     24.037    0.470     -0.026    0.018     26.753    0.894
Topographic Heterogeneity   0.343     5.425     40.738    0.025*    0.662     21.664    27.724    <.001*
Refuge                      0.036     0.04      32.328    0.848     0.154     0.552     22.722    0.465
Montane Species Pump        0.303     2.639     48.488    0.107     0.613     13.852    22.990    0.001*
Disturbance-vicariance      0.379     4.101     22.413    0.055     0.616     15.758    22.829    <.001*
Climate stability           0.241     0.62      14.580    0.444     0.356     2.033     15.124    0.174
Sanctuary                   0.296     2.438     25.313    0.131     0.606     7.611     13.080    0.016*
Museum                      0.285     4.971     56.429    0.030*    0.335     17.709    49.836    0.015*
River-Refuge (binary)       0.207     3.87      86.869    0.052     0.066     0.954     216.959   0.330

Correlation to Observed Richness
Mid-domain: Distance        -0.480    [...]5    34.353    0.003*    -0.204    1.072     24.698    0.310
Topographic Heterogeneity   0.140     2.398     119.852   0.124     0.307     6.769     64.952    0.011*
Refuge                      -0.101    0.332     31.948    0.598     -0.093    0.216     24.605    0.646
Montane Species Pump        0.056     0.507     160.576   0.477     0.296     5.506     57.196    0.022*
Disturbance-vicariance      0.019     2.297     62.251    0.135     0.303     5.286     41.554    0.027*
Climate stability           0.233     1.506     26.191    0.231     0.210     0.723     15.672    0.408
Sanctuary                   0.419     [...]9    47.267    0.003*    0.818     34.069    16.885    <.001*
Museum                      0.234     4.037     69.472    0.048*    0.250     4.481     66.989    0.038*
River-Refuge (binary)       0.108     2.378     201.187   0.125     0.234     4.94      85.022    0.029*

Supplementary Table S4. Mixed CAR spatial models of observed biodiversity data. A principal component analysis was performed

on the standardized biogeography hypotheses. All the resulting principal components (PCs) were extracted and then loaded as

explanatory variables. The CAR analyses were run iteratively, starting with all PCs as explanatory variables and then excluding each PC

that did not contribute significantly to the model (α = 0.05) until the final model included only PCs that contributed significantly to the

model. The standardized beta coefficients (β) were then used to calculate contributions of each biogeography hypothesis (see methods

on OTBCs) in the final CAR analysis. To compare the contributions of each biogeography hypothesis among models of observed

biodiversity patterns (richness, endemism, GDM), β coefficients from each OTBC/CAR analysis were converted to the percentage of

contribution. *The mean of the three MDS vector loadings was calculated and contributed as a single value to the total mean.

Percent Contribution to CAR   Endemism            Richness            GDM                                     Mean contribution to all
                              Amphibian Reptile   Amphibian Reptile   MDS-D1    MDS-D2    MDS-D3    Mean*     observed biodiversity models*
Mid-domain- Latitude          0.0%      27.7%     3.5%      3.6%      5.3%      20.7%     1.6%      9.2%      8.8%
Mid-domain- Longitude         8.1%      8.0%      6.7%      9.2%      10.9%     10.2%     8.1%      9.7%      8.4%
Mid-domain- Distance***       15.1%     9.7%      7.7%      20.0%     14.4%     5.7%      15.8%     11.9%     12.9%
Climate- PC1                  15.4%     0.1%      11.4%     14.5%     15.1%     4.5%      13.8%     11.1%     10.5%
Climate- PC2                  10.1%     12.7%     8.5%      11.0%     10.1%     11.7%     7.4%      9.7%      10.4%
Climate- PC3                  2.2%      12.8%     4.6%      1.8%      4.4%      11.9%     3.9%      6.7%      5.6%
Refuge                        1.4%      4.7%      6.9%      0.0%      3.8%      10.5%     0.0%      4.8%      3.5%
Climate Stability***          4.7%      4.6%      7.4%      3.0%      0.0%      1.0%      3.8%      1.6%      4.2%
Topographic Heterogeneity     6.5%      1.7%      7.1%      3.7%      3.9%      0.5%      8.0%      4.2%      4.6%
Disturbance-vicariance        5.9%      4.8%      8.1%      3.1%      3.4%      2.2%      6.8%      4.1%      5.2%
Montane Species Pump          5.0%      0.0%      7.1%      2.3%      2.9%      0.0%      7.2%      3.4%      3.6%
Sanctuary                     10.9%     0.3%      13.7%     12.6%     11.7%     3.1%      4.7%      6.5%      8.8%
Museum                        14.8%     13.0%     7.1%      15.1%     14.3%     18.1%     19.0%     17.1%     13.4%

CAR Model Summary
Explained by Predictor
Variables: r² (AICc)          (-372.3)  (-468.5)  (7861.9)  (21849.9) (-4037.8) (-1084.2) (-3327.7)
Total Explained (Predictor
and Space): r² (AICc)         (-426.0)  (-556.4)  (6349.3)  (19364.2) (-6659.7) (-6760.4) (-5700.9)
Model significance: n         141       141       2501      2501      2501      2501      2501      NA
Model significance: F         42.5      9.7       303.5     278.1     1265.9    147.2     396.8     NA
Model significance: p-val     <0.001    <0.001    <0.001    <0.001    <0.001    <0.001    <0.001    NA
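
The averaging described in the footnote to Table S4 can be checked against the tabulated percentages. The sketch below assumes that the final column is the unweighted mean of five values: the two endemism contributions, the two richness contributions, and the GDM mean of the three MDS vectors. The equal weighting and the function name `overall_mean` are assumptions inferred from the table, not a description of the authors' code.

```python
def overall_mean(endemism, richness, gdm_vectors):
    """Average percent contributions across the observed-biodiversity models.

    endemism, richness: (amphibian, reptile) percent contributions.
    gdm_vectors: contributions for MDS-D1, MDS-D2 and MDS-D3, which are
    averaged first so that GDM enters the overall mean as a single value.
    """
    gdm_mean = sum(gdm_vectors) / len(gdm_vectors)
    values = [*endemism, *richness, gdm_mean]
    return gdm_mean, sum(values) / len(values)


# Values taken from the Mid-domain- Latitude and Museum rows of Table S4
gdm_lat, mean_lat = overall_mean((0.0, 27.7), (3.5, 3.6), (5.3, 20.7, 1.6))
gdm_mus, mean_mus = overall_mean((14.8, 13.0), (7.1, 15.1), (14.3, 18.1, 19.0))

print(f"Mid-domain- Latitude: GDM mean {gdm_lat:.1f}%, overall mean {mean_lat:.1f}%")
print(f"Museum: GDM mean {gdm_mus:.1f}%, overall mean {mean_mus:.1f}%")
```

For these two rows the computed means match the tabulated GDM means (9.2% and 17.1%) and overall means (8.8% and 13.4%). Other rows can differ in the last digit because the inputs are already rounded to one decimal place.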

Supplementary Table S5. Mixed CAR spatial models of focal subgroups. A principal component analysis was performed on the standardized biogeography hypotheses. All resulting principal components (PCs) were extracted and loaded as explanatory variables. The CAR analyses were run iteratively, starting with all PCs as explanatory variables and then excluding each PC that did not contribute significantly to the model (α = 0.05), until the final model included only PCs that contributed significantly. The standardized beta coefficients (β) were then used to calculate the contribution of each biogeography hypothesis (see methods on OTBCs) in the final CAR analysis. To compare the contributions of each biogeography hypothesis among models of observed biodiversity patterns (richness, endemism, GDM), β coefficients from each OTBC/CAR analysis were converted to percentage contributions.

*The mean of the three MDS vector loadings was calculated and contributed as a single value to the total mean.

Percent Contribution to CAR   Endemism                                Richness
                              Boophis   Brookesia Oplurus*  Phelsuma  Boophis   Brookesia Oplurus*  Phelsuma
Mid-domain- Latitude          3.0%      4.0%      2.5%      17.0%     0.0%      5.1%      2.3%      4.2%
Mid-domain- Longitude         8.4%      9.9%      10.6%     11.8%     6.1%      8.3%      6.3%      10.2%
Mid-domain- Distance***       16.0%     9.9%      6.3%      6.4%      10.6%     12.0%     11.0%     20.4%
Climate- PC1                  12.2%     12.2%     13.5%     5.8%      14.4%     11.7%     15.9%     13.1%
Climate- PC2                  6.0%      14.1%     13.2%     7.0%      13.0%     11.2%     8.9%      9.7%
Climate- PC3                  5.2%      4.3%      2.8%      18.3%     1.6%      7.9%      0.0%      3.9%
Refuge                        0.0%      10.5%     11.3%     2.3%      9.9%      4.5%      3.3%      0.0%
Climate Stability***          3.6%      8.8%      11.2%     5.9%      8.9%      3.6%      5.8%      1.8%
Topographic Heterogeneity     9.0%      2.2%      1.0%      5.1%      2.8%      2.1%      6.5%      4.1%
Disturbance-vicariance        7.0%      4.7%      4.1%      7.2%      5.8%      2.5%      6.8%      2.7%
Montane Species Pump          8.4%      0.6%      0.0%      4.3%      2.7%      0.0%      4.5%      2.7%
Sanctuary                     0.1%      18.9%     22.6%     0.0%      20.3%     17.2%     23.6%     7.9%
Museum                        21.1%     0.0%      0.8%      8.8%      3.8%      13.9%     5.2%      19.3%

CAR Model Summary
Explained by Predictor
Variables: r² (AICc)          (-252.2)  (-176.1)  (-317.9)  (-253.4)  (9710.5)  (5392.1)  (9789.1)  (6731.4)
Total Explained (Predictor
and Space): r² (AICc)         (-316.0)  (-235.3)  (-436.8)  (-354.6)  (6851.0)  (2680.8)  (8530.6)  (4837.3)
Model significance: n         141       141       141       141       2501      2501      2501      2501
Model significance: F         46.6      49.7      20.5      20.1      364.7     220.6     550.1     632.4
Model significance: p-val     <0.001    <0.001    <0.001    <0.001    <0.001    <0.001    <0.001    <0.001
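
The captions of Tables S4 and S5 state that coefficients were converted to percentage contributions, and each column of percentages sums to roughly 100%. A minimal sketch of such a conversion is given below, assuming that each hypothesis receives a weight whose magnitude is rescaled so that all weights sum to 100. The use of absolute values, the function name `to_percent`, and the example weights are hypothetical; the OTBC calculation that produces the weights is described in the methods and is not reproduced here.

```python
def to_percent(weights):
    """Rescale hypothesis weights by magnitude so that they sum to 100."""
    magnitudes = [abs(w) for w in weights]  # assumption: the sign is ignored
    total = sum(magnitudes)
    if total == 0:
        raise ValueError("at least one weight must be non-zero")
    return [100.0 * m / total for m in magnitudes]


# Hypothetical weights for three hypotheses (not values from the study)
example = to_percent([0.30, -0.10, 0.60])

# Boophis endemism column of Table S5, Mid-domain- Latitude through Museum
boophis_endemism = [3.0, 8.4, 16.0, 12.2, 6.0, 5.2, 0.0, 3.6, 9.0, 7.0, 8.4, 0.1, 21.1]

print([round(p, 1) for p in example])
print(round(sum(boophis_endemism), 1))
```

Summing a column of the table in this way is a quick check that no hypothesis row was lost when the table was transcribed.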