
Post on 18-Jan-2016


Religion and Economic Change over a Century:

Linking Diverse Historical Data

New Technologies and Interdisciplinary Research on Religion

Harvard, 2010

Robert D. Woodberry, Juan Carlos Esparza

University of Texas at Austin

Sociology Department and Population Research Center

The challenge:

Roots of current differences may go back decades, even centuries. How to test?

Religious records:

valuable information

seldom used

Linking diverse sources over time

The data:

Source: Electronic datasets
Data: Recent censuses, surveys and geo-climatic data
Characteristics: Polygons & grids of cells

Source: Historical data
Data: Historic censuses and colonial records
Characteristics: Polygons

Source: Protestant data
Data: Missionaries, education, etc.
Characteristics: Points (mission stations)
Languages: English, Danish, Norwegian, French, German, and Spanish

Source: Catholic data
Data: Missionaries, education, etc.
Characteristics: Polygons (ecclesiastical jurisdictions)
Languages: English, Chinese, Italian, French, German, Latin, Spanish, Polish, and Portuguese

Problems:

Gathering complete data

Digitizing data & maps

Normalizing and linking data from different sources

Dealing with missing data

Creating database for geo-spatial statistical modeling

Complete data

Locating and evaluating “the universe” of sources

Temporal coverage

Spatial coverage

Data Quality

Variables included

Complete data

Complete data often only available in archives, e.g., “Vatican Secret Archives” & “Archives of Propaganda Fide”

Negotiating access

Locating, copying and digitizing sources

Spatial Linking

Issues:

1) Data given for different spatial units

2) Spatial units change over time

3) Accuracy of base map

Spatial Linking

1) Data given for different spatial units

Protestant: points

Catholic: polygons

Censuses, surveys, geo-climatic data:

different polygons and grids of cells

Spatial Linking

2) Spatial units change over time

Cities’ & towns’ names change

Catholic ecclesiastical jurisdictions evolve

National, provincial, and other state boundaries change

Spatial Linking

Why important?

Connecting data to proper geographic referent (e.g., ecclesiastical jurisdictions (EJs) & provinces in 1913)

Linking data over time

For statistical analysis

For imputation

(How does data in 1892 relate to data in 1934 and 2009)

Spatial Linking

3) Historic maps inaccurate (limited usefulness)

Points:

Why matters:

1) change over time

2) link to proper polygon

3) link to proper geo-climatic conditions

Find place in modern gazetteer

Link locations between sources:

known alternative names

consistent institutions
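The gazetteer lookup above can be sketched as a small alias table. This is only an illustration under stated assumptions: the `ALIASES` entries, the `gazetteer` list, and the helper names are hypothetical and not taken from the project's actual database.

```python
# Hypothetical alias table: historical spelling -> modern gazetteer name.
# The two entries are illustrative assumptions, not the project's real list.
ALIASES = {
    "tongataboo": "Tongatapu",
    "port jackson": "Sydney",
}

def normalize(name):
    """Lower-case and collapse whitespace so spelling variants compare equal."""
    return " ".join(name.lower().split())

def to_modern_name(name, gazetteer):
    """Return the modern gazetteer name for a historical place name, or None."""
    key = normalize(name)
    if key in ALIASES:
        return ALIASES[key]
    # Fall back to a direct match against the modern gazetteer.
    for modern in gazetteer:
        if normalize(modern) == key:
            return modern
    return None  # unmatched names need manual review

gazetteer = ["Tahiti", "Huahine", "Tongatapu", "Sydney"]  # hypothetical
names = ["Tongataboo", "tahiti", "Port  Jackson", "Atlantis"]
linked = {n: to_modern_name(n, gazetteer) for n in names}
```

Names that return `None` are left unlinked rather than guessed, so they can be resolved by hand from other sources.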

Spatial Linking

Historic maps inaccurate (limited usefulness)

Territories: map spaghetti

Why matters:

1) Arbitrarily linking borders

2) Imputing data to artificial slivers

3) How link data when no maps

Spatial Linking

Improving accuracy:

Start with accurate modern maps

Reconstruct border change from legal documents

Reconstruct border overlap from legal documents

(e.g., Catholic and state jurisdiction borders)

Bring modern borders back through time

Linking (cont.)

Accurate base maps:

Current world maps have insufficient accuracy (e.g., mission stations in ocean or wrong country)

Improve coastlines, islands, borders, and maritime boundaries

Remove slivers

Allows automatic linking of point and polygon data
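A minimal sketch of automatic point-to-polygon linking, assuming planar coordinates and simple, non-overlapping polygons. The ray-casting test, the function names, and the sample coordinates are illustrative assumptions; real work would normally use GIS software and account for map projection.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon,
    given as a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge crosses that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def link_points(points, polygons):
    """Map each point ID to the first polygon ID containing it (None if unmatched)."""
    links = {}
    for pid, (x, y) in points.items():
        links[pid] = next(
            (gid for gid, poly in polygons.items() if point_in_polygon(x, y, poly)),
            None,
        )
    return links

# Hypothetical polygons (two adjacent squares) and mission-station points.
polygons = {
    "A": [(0, 0), (4, 0), (4, 4), (0, 4)],
    "B": [(4, 0), (8, 0), (8, 4), (4, 4)],
}
points = {"station_1": (1, 1), "station_2": (6, 2), "station_3": (9, 9)}
links = link_points(points, polygons)
```

A point that falls in no polygon (here `station_3`) comes back as `None`, which is the kind of case an inaccurate base map produces, such as a station placed in the ocean.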

Maritime Boundaries

Reconstructing historic borders:

Papal decrees document changes in EJs & identify corresponding government borders

Linking (cont.)

Reconstructing historic borders:

Check accuracy with country & empire records

Smallest unit in legal sources determines size of MCGUs and precision of data linking

When possible use modern borders, when not digitize border from relatively accurate historical maps

Linking (cont.)

Determine Maximum Consistent Geographic Unit (MCGU) before creating digital maps

MCGUs foundation for all linking and imputation

Only one base map (easy to update)

All other geographic units are unions of MCGUs
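Because every larger unit is a union of MCGUs, data stored once at MCGU level can be re-aggregated to any other geography with a membership table. The sketch below assumes additive counts; all IDs, counts, and the `aggregate` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical membership tables: each larger unit is a union of MCGUs.
province_of = {"m1": "P1", "m2": "P1", "m3": "P2"}  # state provinces
ej_of = {"m1": "E1", "m2": "E2", "m3": "E2"}        # ecclesiastical jurisdictions

# Hypothetical counts stored once, at MCGU level.
schools = {"m1": 3, "m2": 5, "m3": 2}

def aggregate(values, membership):
    """Sum MCGU-level values up to the larger units named in `membership`."""
    totals = defaultdict(int)
    for mcgu, value in values.items():
        totals[membership[mcgu]] += value
    return dict(totals)

by_province = aggregate(schools, province_of)  # {'P1': 8, 'P2': 2}
by_ej = aggregate(schools, ej_of)              # {'E1': 3, 'E2': 7}
```

The same MCGU table serves both geographies, so a border correction only has to be made in one base map.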

Linking (cont.)

Maximum Consistent Geographic Unit (MCGU)

All point and cell data link to MCGUs

Protestant data

Geo-climatic data

Missionary mortality data

Also allow contextual analysis

(spatial autocorrelations, etc.)

Minimizes over-aggregation of data

Linking (cont.)

Linking geo-climatic data (endogeneity)

Aggregate as grid of cells: Grid of boxes covering world

Assign unique IDs and vectorize raster data

Normalize so boxes perfectly overlap and IDs match between layers

(very hard and time consuming)

Aggregate for MCGUs
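The final aggregation step can be sketched as follows, assuming each grid cell has already been assigned a unique ID and a single MCGU. The cell IDs, values, and the unweighted mean are illustrative assumptions; an area-weighted mean may be more appropriate when cells straddle borders.

```python
from collections import defaultdict

# Hypothetical layer: cell ID -> rainfall value (units left unspecified).
rainfall = {"c1": 100.0, "c2": 120.0, "c3": 80.0, "c4": 60.0, "c5": 10.0}

# Hypothetical lookup: cell ID -> MCGU ID; "c5" (e.g., open ocean) has no MCGU.
cell_to_mcgu = {"c1": "m1", "c2": "m1", "c3": "m2", "c4": "m2"}

def mean_by_mcgu(layer, cell_to_mcgu):
    """Unweighted mean of cell values within each MCGU."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for cell, value in layer.items():
        mcgu = cell_to_mcgu.get(cell)
        if mcgu is None:
            continue  # skip cells that fall outside every MCGU
        sums[mcgu] += value
        counts[mcgu] += 1
    return {m: sums[m] / counts[m] for m in sums}

rain_by_mcgu = mean_by_mcgu(rainfall, cell_to_mcgu)  # {'m1': 110.0, 'm2': 70.0}
```

Matching cell IDs across layers is what lets the same `cell_to_mcgu` lookup be reused for every geo-climatic variable.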

Linking (cont.)

Linking mortality data (endogeneity)

Data on over 100,000 missionary lives

Calculate comparative mortality estimates by linking lives to

1) points (mission stations)

2) polygons (Countries, EJs & MCGUs)

Can be generalized to other areas based on geo-climatic conditions, etc.

Name                 Sex  Born  Sailed  Loc_01      Begin  End   Loc_02        Begin2  End2
Cover, James Fleet   1    1762  1796    Tahiti      1797   1798  Port Jackson  1798    1800
Eyre, John           1    1768  1796    Tahiti      1797   1808  Huahine       1808    1809
Jefferson, John      1    1760  1796    Tahiti      1797   1807
Lewis, Thomas        1    1765  1796    Tahiti      1797   1799
Bicknell, Henry      1    1766  1796    Tahiti      1797   1808  Port Jackson
Bowell, Daniel       1    1774  1796    Tongataboo  1797   1799
Broomhall, Benjamin  1    1776  1796    Tahiti      1797   1801
Buchanan, John       1    1765  1796    Tongataboo  1797   1800  Port Jackson  1800    1800
Cooper, James        1    1768  1796    Tongataboo  1797   1800  Port Jackson  1800    1801
Cock, John           1    1773  1796    Tahiti      1797   1798  Port Jackson  1798
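As a rough illustration of how such records link lives to locations, the sketch below sums person-years of service by location for four rows of the table. Person-years are the exposure a mortality rate would be divided by; the excerpt lists no deaths, so no rate is computed. Treating End minus Begin as whole years and skipping spells with a missing end year are simplifying assumptions, and the data layout is hypothetical.

```python
from collections import defaultdict

# Four rows from the table above: (name, [(location, begin, end), ...]).
# An end of None marks a spell whose end year is missing in the source.
records = [
    ("Cover, James Fleet", [("Tahiti", 1797, 1798), ("Port Jackson", 1798, 1800)]),
    ("Eyre, John", [("Tahiti", 1797, 1808), ("Huahine", 1808, 1809)]),
    ("Bowell, Daniel", [("Tongataboo", 1797, 1799)]),
    ("Cock, John", [("Tahiti", 1797, 1798), ("Port Jackson", 1798, None)]),
]

def person_years_by_location(records):
    """Sum years of service per location, skipping spells with no end year."""
    exposure = defaultdict(int)
    for _name, spells in records:
        for location, begin, end in spells:
            if begin is None or end is None:
                continue  # incomplete spell: leave for imputation or review
            exposure[location] += end - begin
    return dict(exposure)

exposure = person_years_by_location(records)
```

For these four rows the totals are 13 person-years for Tahiti, 2 for Port Jackson, 1 for Huahine, and 2 for Tongataboo.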

Missing Data

Problems:

Changing categories between sources/years

Inconsistent categories within same source

Missing places in source

Inconsistent years between sources

Missing Data (cont.)

Strategies:

Finding missing data:

Letters of bishops to Pope

Triangulating between sources:

- To identify missing institutions & organizations

- To identify estimates from inconsistencies

- To fill in missing data

Missing Data (cont.)

Strategies:

Imputing missing data (multiple imputation):

Using: 1) trend over time in MCGUs

- e.g., using linked MCGUs in 1913 & 1932 to estimate 1923

2) pattern with neighbor

Can compare results with and without imputed data
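The trend-over-time idea can be illustrated with a plain linear interpolation between two observed years for one MCGU. This is only the deterministic part of the idea: multiple imputation would additionally draw several plausible values to reflect uncertainty. The function name and the counts (40 in 1913, 78 in 1932) are made up for the example.

```python
def interpolate(year, year0, value0, year1, value1):
    """Linear interpolation of a value between two observed years."""
    if not year0 <= year <= year1:
        raise ValueError("year must lie between the two observed years")
    fraction = (year - year0) / (year1 - year0)
    return value0 + fraction * (value1 - value0)

# Hypothetical counts for one MCGU observed in 1913 and 1932.
estimate_1923 = interpolate(1923, 1913, 40, 1932, 78)
```

With these made-up counts, 1923 lies 10/19 of the way between the two observations, so the estimate is about 60.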

An example: Mexico

Reconstruct all locality changes back to 1815

Reconstruct all EJ changes from 1850

Link historical censuses & modern surveys

Re-aggregate data according to any geographic unit (MCGU or larger)

Mexico (cont.)

Once completed:

All census, Catholic, and Protestant data linked for about 120 years

Multiple current surveys linked so can analyze modern consequences

Longitudinal database of MCGUs

Mexico (cont.)

Interrupted Time Series:

impact of introducing Protestant missions on Catholic church behavior

impact of Catholic and Protestant interventions on the change in literacy between censuses

Cumulative Influence:

Endogeneity: test correlates of when and where Protestants and Catholics invest in particular areas.

Thank You!