Date post: | 30-Dec-2015 |
Upload: | georgia-matthews |
Spatial Analysis of Surnames in Great Britain
James Cheshire
Department of Geography and CASA, UCL
jamescheshire.co.uk
“It may be thought by some that the investigation of the distribution of names is an idle amusement, productive of no utility of man. I have come to think, however...that it is a matter of much importance to the antiquarian, the historian the ethnologist and also to the more practical politician” Henry Guppy, 1890.
Outline
- Surnames in Great Britain*.
- Surnames and Geography.
- Research aims.
- Surnames and Genetics.
- Unearthing Great Britain’s surname regions.
- Effects of scale.
- 2 interesting examples.
- Surname regions in Great Britain?
- Future research.
* I will be talking about every surname registered in Great Britain. The majority would have originated in Britain; these remain the dominant driver of surname regions and will therefore be the focus of the contextual information that follows. When I refer to Great British surnames I am, however, referring to those registered, not necessarily those originating in Britain.
What are surnames in Great Britain?
- In 1066 the Normans “brought with them a new, upper class fashion for surnames” (Miles, 2005).
- Main purpose was to clarify the right to ownership of land.
- Indicated family place of origin in France or land acquired in England.
What are surnames in Great Britain?
It took around 300 years for surnames to be widely adopted, with people taking naming inspiration from every aspect of their lives:

Category | Example | Explanation

Occupational (Metonyms)
Profession | Smith | Blacksmith/metal worker
Office/Trade | Reeve | Chief magistrate/overseer
Rank/Status | Knight | A knighted person
Occupation Features | Falconer | One who kept/trained falcons

Local Surnames (50% of surnames)
Toponymic (from landscape) | Rivers | Dweller near river
Toponymic (from village/region) | Cornwall | Man from Cornwall
Habitation (residence) | Gate | Habitation at/near a gate
Habitation (work) | Hall | A worker at the hall

Surnames of Relationship
From personal name (patronymic) | Johnson/Jones | Son of John
From personal name (metronymic) | Margaretson | Son of Margaret
Personal name from other relative | Also: Johnson | Related to John
Personal name from diminutive | Dickens | Son of Dick (Richard)
Clan or tribal names | MacBain | Related to the MacBain clan

Nicknames
From animals | Fox | Slyness or other attributes
From characteristic traits | Careless | Free from care/responsibility
From objects | Shorthose | Someone who wore short boots
From physical features | Little | A small person
From times and seasons | Pasque | Person born at Easter
From iconic description | Drinkwater | Heavy drinker
- With greater recording of the population (starting with the Domesday Book of 1086) surnames became patronymical (inherited from the father).
- They became fixed to a family lineage rather than location and could move with their “owners”.
- Many names originated in only one place/ region due to different conventions throughout Britain.
Surnames and Geography
Is it the case that these places of origin remain the areas of highest concentration for these names?
...or over the 1000 years since surnames arrived in Britain has population movement (including international migrants) caused spatial mixing of surnames?
Surnames and geography: some examples of individual surnames
[Figure panels: Lewis, Smith, Macleod, Buckley]
Surnames and geography: some examples of groups of surnames
[Figure panels: genitival “s” names; patronymic/metronymic names. Source: Schurer, K. 2004]
Surnames and Geography
- The surnames of Britain appear to exhibit a clear geography.
- This presents an interesting regionalisation problem.
- It also has broader cultural significance.
Aims
- Aggregate the multiple surname distributions to establish broad regional variations.
- Undertake the first study of this kind on two “complete” population registers.
- Establish the extent to which the derived regions are genetic/ cultural.
- Develop a methodological framework for the future spatial analysis of names.
- Demonstrate the inherently spatial nature of surnames and their utility as a resource.
Data
1881 Census
- 29 Million People
- 425,793 Surnames
- 345,781 with <10 occurrences
- Principal level of geography: 657 Registration Districts

2001 Enhanced Electoral Roll
- 45.6 Million People
- 1,597,805 Surnames
- 1,457,681 with <10 occurrences
- Principal level of geography: 410 Districts* (excl. N. Ireland)
- Additional analysis on: approx. 10,650 Wards (inc. N. Ireland)
* In the analysis the 32 London Boroughs have been aggregated to a single district (leaving 379 districts) as their high dissimilarity in comparison with the rest of Britain and each other was distorting the results of the regionalisation.
[Figure: 1881 Surname Frequencies (top 500 names)]
[Figure: 2001 Surname Frequencies (top 500 names)]
Genes and Surnames
- If surnames are inherited then they behave much like a genetic attribute.
- Obviously only really works for men (unless women keep their maiden names).
[Figure: King and Jobling, 2009]
- Previous diagram does not account for geography (the surname possessors could live anywhere).
- The fact that many surnames have stayed concentrated around their point of origin suggests that the groups of people possessing them have not moved much. They are therefore even more likely to be related.
Isonymy (“same name”)
- The concept forms the basis of this analysis.
- George Darwin (son of Charles) was interested in isonymous marriages.
- His perspective was a genetic one. He wanted to quantify the effects of inbreeding between cousins*.
* His father and mother were cousins so he had a vested interest!
Coefficient of Isonymy
“The probability of members of two populations or subpopulations having genes in common by descent as estimated from sharing the same surnames” (Lasker, 1985:142).
The coefficient is calculated from S_i1, the number of occurrences of the ith surname in a sample from Area 1, and S_i2, the number of occurrences of the same surname in a sample from Area 2.
The resulting values can be considered as the proportional correspondence, in terms of a shared surname pool, between a particular place and all others in the country.
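For reference, Lasker's coefficient is commonly written in the form below. This is a hedged reconstruction from the symbol definitions above: the symbol R and the factor of 2 in the denominator are assumptions, not something stated in the presentation.

```latex
R = \frac{\sum_i S_{i1}\, S_{i2}}{2 \sum_i S_{i1} \sum_i S_{i2}}
```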
[Figure: worked example with three districts, A, B and C, each holding a list of surnames drawn from Cheshire, Mateos, Singleton, Longley, O’Brien, Adnan, Lewis, Smith, Dormandy, Evans, Pope, Rohde, Penny, Buckle, Richards, Whitfield, Johns and Dolan]
Take Cheshire from A, probability of removing Cheshire from:
B = 1 in 17
C = 1 in 17
Take Mateos from B, probability of removing Mateos from:
A = 2 in 20
C = 1 in 17
Take Johns from C, probability of removing Johns from:
A = 0
B = 1 in 17
Repeat this process for each name and sum the probabilities for each comparison...
Coefficient of Isonymy
[Figure: surname lists for districts A, B and C]
Coefficient of isonymy between districts A, B and C:

    A                     B                     C
A   1                     1/17+1/10+0 ≈ 0.16    1/17+0+0 ≈ 0.06
B   1/17+1/10+0 ≈ 0.16    1                     1/17+1/17 ≈ 0.12
C   1/17+0+0 ≈ 0.06       1/17+1/17 ≈ 0.12      1
Lasker Distance
Here L is the Lasker distance and i and j are two separate populations.
- This takes the Coefficient of Isonymy values and does the following:
  - Turns them from very small numbers into larger ones.
  - Inverts them so that smaller values represent greater similarity (rather than greater difference).
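The two steps described above, computing a coefficient of isonymy from surname counts and turning it into a distance, can be sketched in Python. This is an illustration only: the function names, the toy surname lists, the factor of 2 in the coefficient and the use of a plain negative natural logarithm for the distance are all assumptions, since the exact formulas are not reproduced in the text.

```python
import math
from collections import Counter


def isonymy(names_1, names_2):
    """Lasker-style coefficient: sum_i(S_i1 * S_i2) / (2 * N_1 * N_2)."""
    counts_1, counts_2 = Counter(names_1), Counter(names_2)
    # Counter returns 0 for surnames missing from the second area.
    shared = sum(counts_1[name] * counts_2[name] for name in counts_1)
    return shared / (2 * len(names_1) * len(names_2))


def lasker_distance(names_1, names_2):
    """Negative natural log of the coefficient (assumed form)."""
    r = isonymy(names_1, names_2)
    # Areas with no surname in common are treated as infinitely far apart.
    return math.inf if r == 0 else -math.log(r)


# Hypothetical toy districts, not data from the study.
district_a = ["Cheshire", "Mateos", "Mateos", "Smith", "Lewis"]
district_b = ["Cheshire", "Mateos", "Smith", "Evans"]
district_c = ["Johns", "Dolan", "Richards", "Whitfield"]

d_ab = lasker_distance(district_a, district_b)
d_ac = lasker_distance(district_a, district_c)
```

Taking the logarithm stretches very small coefficients into a more workable range, and the minus sign makes similar districts sit close together.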
Lasker Distance Matrices

2001 Matrix
        95Z        99ZZ       OOLN
00BL    7.520982   7.336616   7.219516
00BM    7.428889   7.315671   7.425037
00BN    7.347616   7.356772   7.394888
00BP    7.452982   7.299915   7.330886
00BQ    7.410027   7.300150   7.387787

1881 Matrix
               Yarmouth   Yeovil     York
Aberayron      6.389540   6.289929   6.438361
Aberdeen       6.356152   7.019357   6.213222
Abergavenny    6.412893   6.361753   6.566717
Aberystwith    6.327093   6.319481   6.467985
Abingdon       6.353814   6.559106   6.621873
Can be thought of as placing the districts in “surname space”.
Analysing the Lasker Distance Matrix
The purpose is to group/ split the data by surname similarity.
- Clustering
- Multidimensional Scaling
[Figure: axes labelled “District i or j” and “Lasker’s Distance”]
Clustering: K-Means
- The K-means algorithm randomly allocates a set of k seeds within the data matrix and then allocates all data points to their nearest seed.
- A new mean cluster centroid is then calculated for each cluster, and a new partitioning of the data points is made based on the new nearest centroid.
- Centroids are then recalculated for the new clusters, and the algorithm repeats these steps until no more switching takes place.
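The three steps above can be sketched in plain Python. This is a minimal illustration and not the implementation used in the study: the function names, the fixed random seed and the tiny hypothetical distance-matrix rows are assumptions, and a real analysis would more likely rely on a statistics package.

```python
import random


def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    n = len(points)
    return [sum(coords) / n for coords in zip(*points)]


def k_means(points, k, seed=0, max_iter=100):
    """Return a cluster label for each point."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # random seeds drawn from the data
    labels = None
    for _ in range(max_iter):
        # Allocate every point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:  # no more switching, so stop
            break
        labels = new_labels
        # Recalculate each centroid as the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = mean_point(members)
    return labels


# Hypothetical rows of a small distance matrix (one row per district).
rows = [
    [0.0, 1.0, 6.0, 6.2],
    [1.0, 0.0, 6.1, 6.3],
    [6.0, 6.1, 0.0, 0.9],
    [6.2, 6.3, 0.9, 0.0],
]
labels = k_means(rows, k=2)
```

Because the seeds are random, different runs can give different partitions on real data, which is one reason to compare K-means with other methods.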
[Figure: K-Means Clustering (K=15), 1881]
[Figure: K-Means Clustering (K=15), 2001]
Clustering: Ward’s Hierarchical Clustering
- Considers union of every cluster pair.
- The two clusters with the minimum increase in ‘information loss’ are combined.
- Information loss is defined by Ward in terms of an error sum-of-squares criterion.
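The merging rule above can be sketched as a naive Python routine. It assumes the usual form of Ward's criterion, in which merging two clusters raises the error sum of squares by n_a * n_b / (n_a + n_b) times the squared distance between their centroids. The function names and the toy data are hypothetical, and this brute-force version is only practical for small inputs.

```python
from itertools import combinations


def centroid(points):
    return [sum(coords) / len(points) for coords in zip(*points)]


def ward_increase(cluster_a, cluster_b):
    """Increase in the error sum of squares if the two clusters merge."""
    ca, cb = centroid(cluster_a), centroid(cluster_b)
    gap = sum((x - y) ** 2 for x, y in zip(ca, cb))
    na, nb = len(cluster_a), len(cluster_b)
    return na * nb / (na + nb) * gap


def ward_clusters(points, k):
    """Merge clusters greedily until only k remain."""
    clusters = [[p] for p in points]  # start with one cluster per point
    while len(clusters) > k:
        # Consider the union of every cluster pair and pick the cheapest.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: ward_increase(clusters[pair[0]], clusters[pair[1]]),
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for n, c in enumerate(clusters) if n not in (i, j)]
        clusters.append(merged)
    return clusters


# Hypothetical rows of a small distance matrix (one row per district).
rows = [
    [0.0, 1.0, 6.0, 6.2],
    [1.0, 0.0, 6.1, 6.3],
    [6.0, 6.1, 0.0, 0.9],
    [6.2, 6.3, 0.9, 0.0],
]
groups = ward_clusters(rows, k=2)
```

Unlike K-means, this procedure is deterministic, and stopping at different values of k gives the nested levels of a hierarchy.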
[Figure: Ward’s Hierarchical Clustering, 1881 and 2001]
[Figure: Ward’s Hierarchical Clustering (K=15), 1881 and 2001]
[Figure: Multidimensional Scaling, 1881 and 2001]
Summary
- There is undoubtedly a regionalisation to Great British surnames.
- The underlying causes appear to be cultural rather than explicitly environmental: i.e. surname dissemination does not appear to be related to topographic barriers.
- The Scotland/ England transition is a lot more discrete than the Wales/ England transition.
- To what extent is this patterning an artefact of the spatial units used in the Lasker Distance calculations...?
Higher Resolution Analysis
- Does calculating the Lasker Distance between smaller areas create a different picture of the surname regions in Britain?
- Is small scale variation sufficient to mask broader trends/effects?
- These questions are explored with 2001 CAS Wards.
- Some considerations:
  - Data size: at Ward level the Lasker Distance calculation involves 1,597,805 * 10,500 * 10,500 cells of data.
  - Small numbers problem.
  - Key advantage is the reduced influence of London (accounts for only 6% of the units of analysis instead of 13%). It can therefore be included in the cluster analysis.
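To give a sense of the data-size consideration above, the product quoted for the Ward-level calculation can be evaluated directly. The variable names are illustrative, and the figures are the ones given in the text.

```python
surnames = 1_597_805   # surnames in the 2001 Enhanced Electoral Roll
wards = 10_500         # approximate number of wards quoted in the text

pairs = wards * wards          # every ward compared with every ward
cells = surnames * pairs       # one surname term per ward pair

print(f"{pairs:,} ward pairs, {cells:.2e} cells")
```

The result is on the order of 10^14 cells. Since the distance from ward i to ward j equals the distance from j to i, roughly half of the pairs are redundant, though the count stays very large.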
Higher Resolution Analysis: CAS Wards
Corby
[Figure: Corby in the MDS, Ward’s and K-Means results, 1881 and 2001]
- In 1932 Stewarts and Lloyds built a new iron and steel works in Corby.
- Labour sourced from closing Scottish steelworks, mainly in Lanarkshire.
- Into the 1970s, 50% of the incoming population was Scottish.
- Transformed the population from 1,500 to 34,000.
- Annual Highland Games.
Danelaw
[Figure: Danelaw, 1881, 2001 and 2001 Ward Level]
“It might appear...that the family of nomenclature of Englishmen was for the most part in a confused jumble, and that on account of the rapid means of inter-communication, which we enjoy in the present Century, most of the distinctions that existed in the past would have been lost in the whirl and bustle of the industrial era in which we live. It might have seemed...that chance had played such a part in the intermingling of inhabitants of different counties and districts, that it would seem a hopeless task to unravel the entangled skein...I found it was yet possible to pick up the threads. By this means I have found order where I expected disorder and method where I only looked for chance.” Henry Guppy, 1890.
Surname Regions in Great Britain?
- Multiple levels from broad contiguous regions to small areas of intra-region similarities.
- Each level representing a different slice through time?
- Likely to reflect areas of genetic and cultural similarities/difference.
Spatial Analysis of Surnames
[Diagram headed Methods, Applications and Augmentation, listing: Clustering; Visualisation; Surname Sampling; Surface Analysis; Geodemographics; Genetic Characteristics; Functional/Uniform Regions?; Population Sampling; Geo-Genealogy; Hypothesis generation; Migration flows; Temporal Analysis]
A Population Geology of the UK?
Effective Population Sampling
- Using surname regions to inform sample design across the regions of Britain:
  - For example, there is little point in sampling a person from Corby if you wish to genetically characterise the Northamptonshire population.
  - Equally, the Corby population may have unrepresentative views on Scottish devolution.
- Do the sub-regional groups show more allegiance to each other than the broader regions they fall within?
Conclusions
- Surname regions exist in contemporary Britain.
- To a remarkable degree they remain unchanged from their conception nearly 1000 years ago.
- Unearthing these regions by establishing a clear methodological framework and utilising complete population registers provides a firm basis for future research.