Combinatorial insights into distributions of wealth, size,
and abundance
Ken Locey
[Figure: rank-abundance curve (RAC); abundance vs. rank in abundance]
[Figure: species abundance distribution (SAD) as a frequency distribution; frequency vs. abundance class]
[Figure: ranked curve (RC); abundance/wealth/size vs. rank in abundance, wealth, or size]
[Figure: distributions of wealth, size, abundance as a frequency distribution; frequency vs. abundance, wealth, or size class]
[Figure: Wheat Production (tons)]
[Map: Poverty in Rural America, 2008; percent in poverty, legend classes 54 – 25.1, 25 – 20.1, 20 – 14.1, 14 – 12.1, 12 – 10.1, 10 – 3.1]
Distributions used to predict variation in wealth, size, & abundance
1. Pareto (80-20 rule)
2. Log-normal
3. Log-series
4. Geometric series
5. Dirichlet
6. Negative binomial
7. Zipf
8. Zipf-Mandelbrot
Predicting, modeling, & explaining the Species abundance distribution (SAD)
[Figure: rank-abundance curve (RAC), abundance vs. rank in abundance; frequency distribution, frequency vs. abundance class]
[Figure: abundance (log scale, 10^0 to 10^4) vs. rank in abundance; series: observed, resource partitioning, demographic stochasticity]
[Figure: abundance (log scale, 10^0 to 10^4) vs. rank in abundance; N = 1,700, S = 17]
How many forms of the SAD for a given N and S?
[Figure: abundance (log scale, 10^0 to 10^4) vs. rank in abundance]
Integer Partitioning
Integer partition: A positive integer expressed as the sum of unordered positive integers
e.g. 6 = 3+2+1 = 1+2+3 = 2+1+3
Written in non-increasing (lexical) order, e.g. 3+2+1
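The definition above can be made concrete with a small enumeration. This is a minimal sketch, not code from the talk or its repositories; the function name `partitions` and its `max_part` argument are illustrative assumptions. It lists every partition of n once, each written in non-increasing order.

```python
def partitions(n, max_part=None):
    """Yield every partition of n as a tuple in non-increasing order."""
    if max_part is None or max_part > n:
        max_part = n
    if n == 0:
        yield ()  # the empty partition closes off a finished sum
        return
    # Choose the largest part first, then partition the remainder
    # using parts no larger than the one just chosen.
    for first in range(max_part, 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


if __name__ == "__main__":
    for p in partitions(6):
        print("+".join(map(str, p)))
```

Because order is ignored, 3+2+1, 1+2+3, and 2+1+3 all collapse to the single tuple `(3, 2, 1)`. Full enumeration is only practical for small n.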
Rank-abundance curves are integer partitions
Rank-abundance curve
  N = total abundance
  S = species richness
  S unlabeled abundances that sum to N
=
Integer partition
  N = positive integer
  S = number of parts
  S unordered positive integers that sum to N
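The correspondence can be sketched in a few lines. The abundance vector below is a made-up community used only for illustration, and the function names are assumptions, not part of the talk's code. Sorting the abundances in non-increasing order gives the rank-abundance curve, which is exactly an integer partition of N with S parts; tallying the same values gives the frequency distribution.

```python
from collections import Counter


def rank_abundance(abundances):
    """Return the abundances sorted from most to least abundant (a partition of N)."""
    return sorted(abundances, reverse=True)


def frequency_distribution(abundances):
    """Return {abundance: number of species with that abundance}."""
    return dict(sorted(Counter(abundances).items()))


if __name__ == "__main__":
    community = [2, 5, 1, 2]  # hypothetical abundances of S = 4 species
    rac = rank_abundance(community)
    print("N =", sum(rac), "S =", len(rac), "RAC =", rac)
    print("frequency distribution =", frequency_distribution(community))
```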
Combinatorial Explosion
N S Shapes of the SAD
1000 10 > 886 trillion
1000 100 > 302 trillion trillion
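Counts like these can be computed without listing any partitions. The sketch below is an illustration under stated assumptions rather than the talk's own code: it uses the standard recurrence p(n, k) = p(n-1, k-1) + p(n-k, k) for the number of partitions of n into exactly k parts, and the name `count_partitions` is hypothetical.

```python
def count_partitions(n, s):
    """Number of partitions of n into exactly s positive parts."""
    if n < 0 or s < 0 or s > n:
        return 0
    # table[k][m] = number of partitions of m into exactly k parts
    table = [[0] * (n + 1) for _ in range(s + 1)]
    table[0][0] = 1
    for k in range(1, s + 1):
        for m in range(k, n + 1):
            # Either the smallest part is 1 (drop it), or every part is
            # at least 2 (subtract 1 from each of the k parts).
            table[k][m] = table[k - 1][m - 1] + table[k][m - k]
    return table[s][n]


if __name__ == "__main__":
    for n, s in [(1000, 10), (1000, 100)]:
        print(n, s, count_partitions(n, s))
```

Python integers have arbitrary precision, so the large counts for N = 1000 are exact and can be compared against the table.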
Random integer partitions
Goal: Random partitions for N = 5, S = 3:
5, 4+1, 3+2, 3+1+1, 2+2+1, 2+1+1+1, 1+1+1+1+1
Nijenhuis and Wilf (1978) Combinatorial Algorithms for Computers and Calculators. Academic Press, New York.
SAD feasible sets are dominated by hollow curves
[Figure: frequency vs. log2(abundance)]
The SAD feasible set
[Figure: ln(abundance) vs. rank in abundance; N=1000, S=40]
Question: Can we explain variation in abundance based on how N and S constrain observable variation?
Dataset                                 Communities
Christmas Bird Count                    129
North American Breeding Bird Survey     1586
Gentry’s Forest Transect                182
Forest Inventory & Analysis             7359
Mammal Community Database               42
Indoor Fungal Communities               124
Terrestrial metagenomes                 92
Aquatic metagenomes                     48
TOTAL                                   9562
The center of the feasible set
[Figure: ln(abundance) vs. rank in abundance; N=1000, S=40]
[Figure: observed abundance vs. abundance at the center of the feasible set, both axes 10^0 to 10^2; labels: R^2 per site, R^2 = 1.0]
[Figure: Breeding Bird Survey (1,583 sites); observed abundance vs. abundance at the center of the feasible set, both axes 10^0 to 10^2; R^2 per site; R^2 = 0.93]
[Figures: further panels of observed abundance vs. abundance at center of the feasible set]
Public code and data repository
https://github.com/weecology/feasiblesets
[Figure: observed home runs vs. center of the feasible set; panel values 0.93, 0.88, 0.91, 0.91, 0.94, 0.93; data source: http://mlb.mlb.com]
Combinatorics is only one way to examine feasible sets
Other (more common) ways:
• Mathematical optimization
• Linear programming
Dataset                                 Total sites   Analyzable sites
Christmas Bird Count                    1992          129 (6.5%)
North American Breeding Bird Survey     2769          1586 (57%)
Gentry’s Forest Transect                222           182 (82%)
Forest Inventory & Analysis             10356         7359 (71%)
Mammal Community Database               103           42 (41%)
Indoor Fungal Communities               128           124 (97%)
Terrestrial metagenomes                 128           92 (72%)
Aquatic metagenomes                     252           48 (19%)
TOTAL                                   15950         9562 (60%)
Efficient algorithms for generating random integer partitions with restricted numbers of parts
Combinatorial Explosion
N       S             SAD shapes
1000    10            > 886 trillion
1000    1,...,1000    > 2.4 x 10^31
Probability of generating a random partition of 1000 having 10 parts: < 10^-17
Task: Generate random partitions of N=9 having S=4 parts
[Figure sequence: 3+2 = 5; 4+3+2 = 9; conjugating 4+3+2 gives 3+3+2+1 = 9]
A recipe for random partitions of N with S parts
1. Generate a random partition of N - S with S or less as the largest part
2. Append S to the front
3. Conjugate the partition
4. Let cool & serve with garnish
5, 4+1, 3+2, 3+1+1, 2+2+1, 2+1+1+1, 1+1+1+1+1
Generate a random partition of N-S with S or less as the largest part
Divide & Conquer
Multiplicity
Top down
Bottom up
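The recipe can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not the implementation in the public repository: step 1 here uses a plain counting table to draw the bounded partition uniformly, which is not claimed to be any of the four named approaches (divide & conquer, multiplicity, top down, bottom up). All function names are hypothetical.

```python
import random


def _count_table(m, k):
    """q[j][i] = number of partitions of i whose largest part is <= j."""
    q = [[0] * (m + 1) for _ in range(k + 1)]
    for j in range(k + 1):
        q[j][0] = 1  # the empty partition
    for j in range(1, k + 1):
        for i in range(1, m + 1):
            q[j][i] = q[j - 1][i] + (q[j][i - j] if i >= j else 0)
    return q


def random_bounded_partition(m, k, rng=random):
    """Uniform random partition of m with every part <= k (step 1)."""
    q = _count_table(m, k)
    parts = []
    while m > 0:
        k = min(k, m)
        # Pick the largest part f with probability proportional to the
        # number of partitions of m - f whose parts are all <= f.
        r = rng.randrange(q[k][m])
        for f in range(k, 0, -1):
            c = q[f][m - f]
            if r < c:
                break
            r -= c
        parts.append(f)
        m -= f
        k = f
    return parts


def conjugate(partition):
    """Transpose the diagram: part j of the result counts the parts >= j."""
    if not partition:
        return []
    return [sum(1 for p in partition if p >= j)
            for j in range(1, partition[0] + 1)]


def random_partition(n, s, rng=random):
    """Uniform random partition of n with exactly s parts (steps 1 to 3)."""
    if not 1 <= s <= n:
        raise ValueError("need 1 <= s <= n")
    tail = random_bounded_partition(n - s, s, rng)  # step 1
    return conjugate([s] + tail)                    # steps 2 and 3
```

For N = 9 and S = 4, `random_partition(9, 4)` might draw 3+2 in step 1, form 4+3+2, and return its conjugate `[3, 3, 2, 1]`. Conjugation swaps "largest part" and "number of parts", which is why a partition whose largest part is S becomes one with exactly S parts.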
Un(bias)
[Figure: density vs. skewness of partitions in a random sample]
Speed
[Figure: Sage/algorithm vs. number of parts (S); panels: N = 50, N = 100, N = 150, N = 200]
Old Apples: probability of generating a partition for N = 1000 & S = 10: < 10^-17
New Oranges: Seconds to generate a partition for N = 1000 & S = 10: 0.07
Integer partitions: S positive integers that sum to N without respect to order
What if a distribution has zeros?
• subplots with 0 individuals
• people with 0 income
• publications with 0 citations
[Figure: frequency vs. abundance class (0 to 5)]
Intraspecific spatial abundance distribution (SSAD)
N = abundance of a species
S = number of subplots
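One way to handle zeros, sketched here as an assumption rather than as the method used in the repository, relies on a standard identity: S non-negative integers that sum to N correspond one-to-one with S positive integers that sum to N + S (subtract 1 from every part). The function names are hypothetical, and enumeration is shown only for small values.

```python
def partitions_exact(n, s, max_part=None):
    """Yield partitions of n into exactly s positive parts, non-increasing."""
    if max_part is None:
        max_part = n
    if s == 0:
        if n == 0:
            yield []
        return
    # The largest part must leave at least 1 for each remaining part ...
    for first in range(min(max_part, n - s + 1), 0, -1):
        # ... and must be large enough that s parts of this size reach n.
        if first * s < n:
            break
        for rest in partitions_exact(n - first, s - 1, first):
            yield [first] + rest


def partitions_with_zeros(n, s):
    """Yield all ways to spread n individuals over s subplots, zeros allowed."""
    for positive in partitions_exact(n + s, s):
        yield [p - 1 for p in positive]


if __name__ == "__main__":
    # N = 3 individuals of one species across S = 3 subplots
    for ssad in partitions_with_zeros(3, 3):
        print(ssad)
```

The same shift works for sampling: draw a random partition of N + S with exactly S parts, then subtract 1 from each part.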
Public code repository
https://github.com/klocey/partitions
PeerJ Preprint
https://peerj.com/preprints/78/
Locey KJ, McGlinn DJ. (2013) Efficient algorithms for sampling feasible sets of macroecological patterns. PeerJ PrePrints 1:e78v1
Future Directions in Combinatorial Feasible Sets
Future Directions: metrics of Evenness, diversity, & inequality
[Figures: frequency distributions]
[Figure: percentile in feasible set vs. Gini’s coefficient of inequality]
Future Directions: New combinatorial feasible sets
Integer composition: all ordered ways that S positive integers can sum to N
6 = 3+2+1 = 1+2+3 = 3+1+2 (three distinct compositions, one partition)
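A feasible set of compositions can be sketched with the usual "cut points" argument: writing N as a row of N units, each choice of S - 1 cuts among the N - 1 gaps gives one ordered sum, so there are C(N-1, S-1) compositions. The function name is an illustrative assumption.

```python
from itertools import combinations
from math import comb


def compositions(n, s):
    """Yield every ordered way to write n as a sum of s positive integers."""
    # Choose s - 1 cut points among the n - 1 gaps between n units.
    for cuts in combinations(range(1, n), s - 1):
        bounds = (0,) + cuts + (n,)
        yield tuple(b - a for a, b in zip(bounds, bounds[1:]))


if __name__ == "__main__":
    all_c = list(compositions(6, 3))
    print(len(all_c), "compositions; formula gives", comb(5, 2))
    for c in all_c:
        print("+".join(map(str, c)))
```

Unlike partitions, (3, 2, 1), (1, 2, 3), and (3, 1, 2) are counted separately here, so this feasible set is larger than the corresponding set of partitions.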
[Figures: log abundance vs. rank]
Pragmatic: explanations & predictions using few inputs
Mathematical: combinatorics can be used to characterize and understand observable variation in nature
System specific: patterns attributed to specific processes are constrained by general variables. What drives the values of the variables?
Policy, management, & philosophy: Would you want to know if the most costly, likely, preferred outcome was 95% similar to 95% of all others? Why?
http://figshare.com/articles/Combinatorial_insight_into_distributions_of_wealth_size_and_abundance/866822