CHAPTER 1: How species richness and total abundance constrain the
distribution of abundance
CHAPTER 2: Efficient algorithms for sampling feasible sets
Rank-abundance curve (RAC)
[Figure: rank-abundance curve; x-axis: Rank in abundance, y-axis: Abundance]
Frequency distribution
Species abundance distribution (SAD)
[Figure: species abundance distribution; x-axis: Abundance class, y-axis: Frequency]
Integer Partitioning
Integer partition: A positive integer expressed as an unordered sum of positive integers
e.g. 6 = 3+2+1 = 1+2+3 = 2+1+3
Written in non-increasing order, e.g. 3+2+1
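The definition can be made concrete with a short sketch. The function names `partitions` and `canonical` are illustrative only and are not taken from the talk's code repositories; the enumeration is only practical for small integers.

```python
def partitions(n, max_part=None):
    """Yield every partition of n as a tuple in non-increasing order."""
    if max_part is None or max_part > n:
        max_part = n
    if n == 0:
        # The empty sum is the single partition of 0.
        yield ()
        return
    # Choose the largest part first, then partition what is left
    # using parts no larger than the one just chosen.
    for first in range(max_part, 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def canonical(parts):
    """Rewrite any ordering of the same parts in non-increasing order."""
    return tuple(sorted(parts, reverse=True))


if __name__ == "__main__":
    # 3+2+1, 1+2+3 and 2+1+3 are the same partition of 6.
    print(canonical([1, 2, 3]), canonical([2, 1, 3]))
    for p in partitions(5):
        print("+".join(str(part) for part in p))
```

Because order is ignored, every rearrangement of the same parts maps to one canonical tuple.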
Rank-abundance curves are integer partitions
Rank-abundance curve
N = total abundance
S = species richness
S unlabeled abundances that sum to N
=
Integer partition
N = positive integer
S = number of parts
S unordered positive integers that sum to N
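As a small illustration of this equivalence, the sketch below drops species labels and keeps only the sorted abundances. The community dictionaries and function names are made up for the example.

```python
from collections import Counter


def rank_abundance(abundances):
    """Rank-abundance curve: abundances sorted from most to least abundant."""
    return sorted(abundances, reverse=True)


def sad_frequencies(abundances):
    """Frequency distribution: number of species in each abundance class."""
    return dict(sorted(Counter(abundances).items()))


# Two hypothetical communities with N = 5 individuals and S = 3 species.
community_a = {"sp1": 3, "sp2": 1, "sp3": 1}
community_b = {"sp1": 1, "sp2": 3, "sp3": 1}

if __name__ == "__main__":
    # Different labels, same unlabeled shape: the integer partition 3+1+1.
    print(rank_abundance(community_a.values()))
    print(rank_abundance(community_b.values()))
    print(sad_frequencies(community_a.values()))
```

Both communities give the same rank-abundance curve, so they count as one shape in the feasible set.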
Combinatorial Explosion
N      S      Shapes of the SAD
1000   10     > 886 trillion
1000   100    > 302 trillion trillion
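Counts like these can be computed without listing any partitions. The following is a minimal sketch, assuming the standard recurrence p(n, s) = p(n-1, s-1) + p(n-s, s) for partitions of n into exactly s parts; `count_partitions` is an illustrative name, not a function from the talk's repositories.

```python
def count_partitions(n, s):
    """Number of partitions of n into exactly s positive parts."""
    if n < 0 or s < 0:
        raise ValueError("n and s must be non-negative")
    # table[k][m] = number of partitions of m into exactly k parts
    table = [[0] * (n + 1) for _ in range(s + 1)]
    table[0][0] = 1
    for k in range(1, s + 1):
        for m in range(k, n + 1):
            # Either the smallest part is 1 (remove it), or every part
            # is at least 2 (subtract 1 from each of the k parts).
            table[k][m] = table[k - 1][m - 1] + table[k][m - k]
    return table[s][n]


if __name__ == "__main__":
    print(count_partitions(5, 3))  # 3+1+1 and 2+2+1
    print(count_partitions(1000, 10))
    print(count_partitions(1000, 100))
```

Python integers have arbitrary precision, so the large counts are exact rather than floating-point approximations.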
Random integer partitions
Goal: Random partitions for N = 5, S = 3:
Partitions of 5: 5, 4+1, 3+2, 3+1+1, 2+2+1, 2+1+1+1, 1+1+1+1+1
Partitions of 5 with S = 3 parts: 3+1+1, 2+2+1
Nijenhuis and Wilf (1978) Combinatorial Algorithms for Computers and Calculators. Academic Press, New York.
DATA
Ethan P. White, Katherine M. Thibault, and Xiao Xiao. 2012. Characterizing species abundance distributions across taxa and ecosystems using a simple maximum entropy model. Ecology 93:1772–1778.
Dataset Number of sites
Christmas Bird Count 1992
North American Breeding Bird Survey 2769
Gentry’s Forest Transect 222
Forest Inventory & Analysis 10356
Mammal Community Database 103
TOTAL 15442
Dataset Number of sites
Indoor Fungal Communities 128
Terrestrial metagenomes (Chu Arctic Soils, Lauber 88 Soils) 128
Aquatic metagenomes (Catlin Arctic Waters, Hydrothermal Vents) 252
TOTAL METAGENOMES 508
GRAND TOTAL 15950
Microbial metagenomic datasets obtained from MG-RAST (metagenomics.anl.gov)
TOOL LOGO COOLNESS
Sage mathematical software 8
Amazon Web Services 2
Weecology Servers (in-house) 10
TOTAL COMPUTING CORES 180
Generating random samples of the feasible set
Dataset total sites analyzable sites
Christmas Bird Count 1992 129 (6.5%)
North American Breeding Bird Survey 2769 1586 (57%)
Gentry’s Forest Transect 222 182 (82%)
Forest Inventory & Analysis 10356 7359 (71%)
Mammal Community Database 103 42 (41%)
Indoor Fungal Communities 128 124 (97%)
Terrestrial metagenomes 128 92 (72%)
Aquatic metagenomes 252 48 (19%)
TOTAL 15950 9562 (60%)
[Figure: Observed abundance vs. abundance at the center of the feasible set, log-log axes from 10^0 to 10^2; North American Breeding Bird Survey (1583 sites); R^2 = 0.93]
Public code and data repository
https://github.com/weecology/feasiblesets
General Conclusions
Feasible set: A primary way to account for how variables constrain ecological patterns…before attributing a pattern to a process
General Conclusions
Extending the feasible set approach:
○ Spatial abundance distribution
○ Species area relationship
○ Distributions of wealth and abundance
The ubiquitous hollow curve
[Figure: two panels of observed values vs. center of the feasible set. Panel titles: Urban population sizes among nations (1960-2009, rescaled); Oil-related CO2 emission among nations (1980-2009, rescaled). Values printed on the panels: 0.91 and 0.92]
Combinatorial Explosion
N S SAD shapes
1000 10 > 886 trillion
1000 1,...,1000 > 2.4 × 10^31
Probability of generating a random partition of 1000 having 10 parts: < 10^-17
A recipe for random SADs
N = total abundance
S = species richness
1. Generate a random partition of N with S as the largest part
2. Conjugate the partition
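The conjugation step of this recipe can be sketched as follows. Conjugating a partition transposes its Ferrers diagram, so a partition whose largest part is S becomes a partition with exactly S parts. The function name `conjugate` is illustrative, and the input is assumed to be in non-increasing order.

```python
def conjugate(partition):
    """Conjugate of a partition given in non-increasing order.

    The i-th part of the conjugate is the number of original parts
    that are larger than i (for i = 0, 1, ..., largest part - 1).
    """
    if not partition:
        return []
    largest = partition[0]
    return [sum(1 for part in partition if part > i) for i in range(largest)]


if __name__ == "__main__":
    p = [3, 2]           # partition of 5 with largest part 3
    q = conjugate(p)     # partition of 5 with 3 parts
    print(p, "->", q)
    print(conjugate(q))  # conjugating twice returns the original
```

The total N is unchanged by conjugation, so the result is a valid SAD with N individuals and S species.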
Generate a random partition of N with S as the largest part
Divide & Conquer
5, 4+1, 3+2, 3+1+1, 2+2+1, 2+1+1+1, 1+1+1+1+1
Multiplicity
Top down
Bottom up
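For orientation, here is one generic way to draw a uniform random partition of N whose largest part is S, followed by conjugation. This is a counting-table sketch written for illustration; it is not claimed to be any of the four algorithms named above, and the names `build_counts`, `random_partition_largest_part`, and `random_sad` are assumptions of this example.

```python
import random


def build_counts(n, k):
    """counts[m][j] = number of partitions of m into parts of size <= j."""
    counts = [[0] * (k + 1) for _ in range(n + 1)]
    for j in range(k + 1):
        counts[0][j] = 1  # the empty partition
    for m in range(1, n + 1):
        for j in range(1, k + 1):
            counts[m][j] = counts[m][j - 1]
            if m >= j:
                counts[m][j] += counts[m - j][j]
    return counts


def random_partition_largest_part(n, s, rng=random):
    """Uniform random partition of n whose largest part is exactly s."""
    if not 1 <= s <= n:
        raise ValueError("need 1 <= s <= n")
    remaining = n - s
    counts = build_counts(remaining, s)
    parts = [s]
    bound = s
    while remaining > 0:
        bound = min(bound, remaining)
        # Pick the next part j with probability proportional to the
        # number of ways to finish the partition after choosing j.
        pick = rng.randrange(counts[remaining][bound])
        for j in range(bound, 0, -1):
            weight = counts[remaining - j][j]
            if pick < weight:
                break
            pick -= weight
        parts.append(j)
        remaining -= j
        bound = j
    return parts


def conjugate(partition):
    """Transpose the Ferrers diagram of a non-increasing partition."""
    return [sum(1 for part in partition if part > i)
            for i in range(partition[0])]


def random_sad(n, s, rng=random):
    """Random SAD shape: s positive abundances that sum to n."""
    return conjugate(random_partition_largest_part(n, s, rng))


if __name__ == "__main__":
    print(random_sad(1000, 10))
```

The table holds (N - S + 1) × (S + 1) exact integer counts, which is small for N = 1000 and S = 10, and no draw is ever rejected.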
Old Apples: probability of generating a partition for N = 1000 & S = 10: < 10^-17
New Oranges: Seconds to generate a partition for N = 1000 & S = 10: 0.07
Integer partitions
S positive integers that sum to N without respect to order
What if a distribution has zeros?
• subplots with 0 individuals
• people with 0 income
• publications with 0 citations
[Figure: frequency distribution that includes a zero class; x-axis: Abundance class (0 to 5), y-axis: Frequency]
Intraspecific spatial abundance distribution (SSAD)
N = abundance of a species
S = number of subplots
SSAD
N = total abundance
S = no. subplots
S non-negative abundances that sum to N without respect to order
=
(weak) Integer partition
N = positive integer
S = number of parts
S non-negative integers that sum to N without respect to order
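Weak partitions relate to ordinary partitions in two simple ways, sketched below under illustrative names. A partition of N with at most S parts can be padded with zeros, and adding 1 to each of the S parts maps weak partitions of N one-to-one onto ordinary partitions of N + S with exactly S parts.

```python
def pad_with_zeros(partition, s):
    """Turn a partition with at most s parts into a weak partition with s parts."""
    if len(partition) > s:
        raise ValueError("partition has more than s parts")
    return list(partition) + [0] * (s - len(partition))


def weak_to_positive(weak):
    """Map a weak partition of N (S parts) to a partition of N + S (S positive parts)."""
    return [part + 1 for part in weak]


def positive_to_weak(parts):
    """Inverse map: subtract 1 from each positive part."""
    if min(parts) < 1:
        raise ValueError("all parts must be positive")
    return [part - 1 for part in parts]


if __name__ == "__main__":
    # A species with N = 5 individuals spread over S = 4 subplots.
    ssad = pad_with_zeros([3, 2], 4)
    print(ssad)
    print(weak_to_positive(ssad))  # a partition of N + S = 9 with 4 parts
```

Under this mapping, any sampler for partitions with exactly S positive parts could also be used for distributions with zeros: draw for N + S, then subtract 1 from every part.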
Intraspecific spatial abundance distribution (SSAD)
[Figure: two frequency distributions side by side, an SSAD and an SAD; x-axis: Abundance class, y-axis: Frequency]
“…frequency distributions of intraspecific abundance among sample sites resemble distributions … that have been used to characterize the distribution of abundances among species” (Brown et al. 1995)
[Figure: SSAD panel (species abundance = 1K, subplots = 100) and SAD panel (community abundance = 1K, species = 50); x-axis: Abundance class]
Conclusions
•How do empirical SSADs compare to the feasible set of possible SSAD shapes?
•Other ecological patterns/distributions:
– Occupancy frequency distribution
– Collector’s curve
– Species-area curve
– Species-time relationship
Public code repository
https://github.com/klocey/partitions
PeerJ Preprint
https://peerj.com/preprints/78/
Locey KJ, McGlinn DJ. (2013) Efficient algorithms for sampling feasible sets of macroecological patterns. PeerJ PrePrints 1:e78v1
Acknowledgements
For collecting, managing and providing datasets:
North American Breeding Bird Survey
Christmas Bird Count
Gentry’s Forest Transect Data
Forest Inventory and Analysis dataset
Microbial metagenomic datasets accessed from MG-RAST
Mammal Community Database
My committee: Morgan Ernest, David Koons, Jeannette Norton, Jacob Parnell
Past: Mike Pfrender, Paul Cliften
Colleagues: Justin Kitzes, James O’Dwyer, Bill Burnside, Jay Lennon, Paul Stone and the Stone Crew
Faculty and Staff of the Biology Dept: esp. Brian Joy, Kami McNeil
Funding:
W. L. Eccles Graduate Research Fellow 2008-2011
James A. and Patty MacMahon Scholarship
Joseph E. Greaves Scholarship in Biology
Dissertation Fellowship
CAREER grant from NSF to Ethan White (DEB-0953694)
Research grant from Amazon Web Services
American Museum of Natural History Theodore Roosevelt Memorial Grant
Sampling the SAD feasible set
[Figure: three density plots of evenness; x-axis: Evenness, y-axis: Density; panels: Sample size = 300, Sample size = 500, Sample size = 700]