Conservation genetics of Redside Dace (Clinostomus elongatus):
insights from environmental DNA and phylogeography
A Thesis Submitted to the Committee on Graduate Studies
in Partial Fulfillment of the Requirements for the Degree of Master of Science
in the Faculty of Arts and Science
TRENT UNIVERSITY
Peterborough, Ontario, Canada
© Copyright by Natasha R. Serrao 2016
Environmental and Life Sciences M.Sc. Graduate Program
May 2016
Abstract
Conservation genetics of Redside Dace (Clinostomus elongatus):
insights from environmental DNA and phylogeography
Natasha R. Serrao
Recent range reductions of endangered species have been linked to urban
development, increased agricultural activities, and introduction of non-native species. I
used Redside Dace (Clinostomus elongatus) as a focal species to examine the utility of
novel monitoring approaches, and to understand historical and contemporary processes
that have influenced their present distribution. I tested the efficacy of environmental DNA
(eDNA) to detect Redside Dace, and showed that eDNA was more sensitive for detecting
species presence than traditional electrofishing. Parameters such as season, number of
replicates, and spatial versus temporal sampling need to be accounted for when designing
an eDNA monitoring program, as they influence detection effectiveness and power. I also
assessed the species’ phylogeographic structure using both mitochondrial and
microsatellite DNA analysis. The data from the microsatellite markers indicate that
Redside Dace populations are genetically structured, with the exception of several
populations from the Allegheny River basin. Combined sequence data from three
mitochondrial genes (cytochrome b, ATPase 6 and ATPase 8) indicated that Redside
Dace persisted within three Mississippian refugia during the last glaciation. Secondary
contact between two lineages was indicated by both mitochondrial and microsatellite
data. The combined results from the eDNA and conservation genetics studies can be used
to inform Redside Dace recovery efforts, and provide a template for similar efforts for
other aquatic endangered species.
Keywords: Redside Dace (Clinostomus elongatus), endangered species, environmental
DNA (eDNA), DNA, detection probability, conservation genetics, phylogeography
Dedication
This thesis is dedicated with love to my grandmother Rosy Serrao who passed
away at the start of my degree. She was one of the most influential female figures in my
life and I aspire to have her strength. I love you nana, and think about you all the time!
Acknowledgements
I am extremely grateful for all the amazing experiences and wonderful people I have
had the opportunity to work with over the last three years. I have had the lab support and
assistance from a number of talented individuals in the Fish Genetics Lab. I especially thank
Maggie Boothroyd for being there to listen and offer advice during my entire degree, and
being such a great friend to me. Kristyne Wozney has helped me with all aspects of my
thesis, and has offered invaluable lab troubleshooting, as well as life advice along the way.
I am grateful to Cait Nemeczek for teaching me how to do my first eDNA extraction, her
friendship, and words of encouragement. I thank Caleigh Smith for help with everything
microsatellite and mitochondrial, and always managing to put a smile on my face. I also
thank Jenn Bronnenhuber and Anne Kidd for their insight and advice on various aspects of
my project.
I am very appreciative to Scott Reid’s Aquatic Endangered Species Research Team
for all their help in the field, and collecting water samples for my project during Fall 2012.
To Victoria Kopf, thank you for all your efforts for everything Redside Dace, teaching me
to climb, and entertaining my terrible jokes. I wouldn’t have been able to survive this degree
without your friendship and support. I would also like to thank Matt Sweeting for all his
patience with me in the field; I could not have asked for a better person to have experienced
my first field season with. I also thank Andrea Dunn and colleagues (Conservation Halton)
for helping me collect buccal swabs at Sixteen Mile Creek.
I would like to acknowledge all my American collaborators for going out into the
field and collecting Redside Dace samples for my project. Douglas Carlson (New York
State Department of Environmental Conservation), Konrad Schmidt and Jenny
Kruckenberg (North American Native Fishes Association), Holly Jennings (Forest Service)
and John Pagel (Ottawa National Forest), Brant Fisher (Indiana Department of Natural
Resources), Brian Zimmerman (Ohio State University), David Thorne and Isaac Gibson
(West Virginia Division of Natural Resources), John Lyons (Wisconsin Department of
Natural Resources), Nate Tessler (EnvironScience), David Miko (Pennsylvania Fish
and Boat Commission), and Matthew R. Thomas (Kentucky Department of Fish and Wildlife
Resources) generously provided samples that would not otherwise have been possible to
collect. Additionally, Aaron Clauser (Clauser Environmental), Wayne Starnes (North
Carolina State Museum of Natural Sciences), Aaron Snell (Streamside Ecological Services,
Inc), and Douglas Fischer (Pennsylvania Fish and Boat Commission) also provided
valuable advice during my project.
I thank my mum and my dad for their unconditional love and support, and being
there to help me with all four Peterborough moves! I wouldn’t be where I am without the
values you have instilled in me. I thank my siblings Nicole and Daniel for providing
comedic relief, and helping me through the final stretch of this degree. I am extremely
grateful to Shannon Fera for her words of wisdom, and for taking me in under her wing
and helping me through the initial rough patch of my graduate degree. I also thank Jessica
Tomlin for her daily calls, editing thesis drafts, and 17 years of sustained friendship; I am
so lucky to have you in my life! Lauren Banks, Tim Bartley, Ryan Franckowiak, Bob
Hanner, Christine Terwissen, Spencer Walker, and Cristen Watt also deserve special thanks
for valuable insight into various aspects of my project.
Lastly, I would like to thank my graduate committee for all their help and
guidance over my three years. I thank Al Dextrase for his patience, hours of hands-on
help with occupancy modelling, and always being there to help despite short notice. I
thank Joanna Freeland for her speedy feedback, attention to details, and for her words of
encouragement during my project. I am appreciative to Scott Reid for providing me with
the field technical support, and teaching me the importance of being a professional. You
are a brilliant research scientist and I am grateful to have had the opportunity to work
with you. To Chris, thank you for your mentoring, life counselling, and for helping me
grow. Your friendship has been one of my most valued ones at Trent and I have come to
view you as my academic father. Thank you!
Table of Contents
Abstract
Dedication
Acknowledgements
Table of Contents
List of Figures
Chapter 1: General introduction
    References
Chapter 2
    Abstract
    Introduction
    Results
    Discussion
    References
Chapter 3
    Abstract
    Introduction
    Methods
    Results
    Discussion
    References
Chapter 4: General Discussion
List of Tables
Table 2.1: Mean, standard deviation, maximum, and minimum values of environmental
variables for 29 sites sampled for eDNA testing for Redside Dace.
Table 2.2: Estimates of Redside Dace detection probability and occupancy and ΔAICc
values from models for spring and fall field seasons (horizontal headings), at temporal
sampling (R) of 3, 4, and 5 replicates.
Table 3.1: Locations with drainage, jurisdiction, code names, latitude/longitude and
number of samples obtained for mtDNA and microsatellite genetic samples used for
study.
Table 3.2: ATPase variable sites for 23 unique haplotypes of C. elongatus (1st column),
nucleotide positions at which mutations occur (1st row), number of individuals (N) and
populations that contain that particular haplotype. Haplotype 1A represents reference
sequence for table; dots within a cell represent nucleotide positions identical to the
reference sequence.
Table 3.3: Summary of ATPase 6 and 8 and Cytochrome b sequence results for 27
Redside Dace populations, showing numbers of sequenced individuals (N), number of
haplotypes detected (Nh), haplotypic richness (HR), haplotype diversity (h), and
nucleotide diversity (π).
Table 3.4: Cytochrome b variable sites for 35 unique haplotypes of C. elongatus (1st
column), showing nucleotide positions at which mutations occur (1st row), number of
individuals (N) and populations that contain that particular haplotype. Dots within a cell
represent nucleotide positions identical to the reference sequence.
Table 3.5: Haplotype name, number of individuals and population occurrences for 47
unique haplotypes based on combined sequences (ATPase 6 and 8, and cytochrome b).
Table 3.6: Genetic description of 28 Redside Dace populations (see Table 3.1 for
localities). Columns represent letter codes, number of individuals genotyped (N),
observed number of alleles (Na), standardized allelic richness (AR) for n=20 gene copies,
observed heterozygosity (HO), expected heterozygosity (HE), and inbreeding coefficient
(FIS).
Table 3.7: Pairwise FST values among 28 Redside Dace populations along with sample
size for each population.
Table 3.8: Analysis of Molecular Variance (AMOVA) for total evidence (cytochrome b
and ATPase 6 and 8) mitochondrial DNA data based on hypothesized (i) Mississippi and
Atlantic refugia (2 groups), (ii) mitochondrial DNA bootstrap supported groups (3
refugia), and (iii) microsatellite Principal Coordinate Analysis clustering (3 refugia
hypothesis), and hierarchical FST analysis for (iv) eastern versus western groups, (v) three
groups identified by STRUCTURE, and (vi) contemporary drainage patterns.
List of Figures
Figure 2.1: Map of 29 Redside Dace eDNA sampling sites from Fall 2012 and Spring
2013 sampling seasons (grey circles), 10 lake negative control sampling sites (black
triangles) from Spring 2013, and Otonabee River field blank (star) to help establish
detection threshold.
Figure 2.2: Plot of log10 transformed template DNA copy number (x-axis) versus
dilutions for four Redside Dace eDNA samples in order to test for inhibition at four
sampling locations (LC1= Lynde Creek 1, LC2= Lynde Creek 2, MC1= Mitchell Creek 1,
MC2= Mitchell Creek 2).
Figure 2.3: Histogram of negative control copy numbers/reaction of amplified Redside
Dace eDNA (x-axis) versus frequency (y-axis) for four types of control: (a) filter control (n=168,
x̅=0.091, s=0.288), (b) lake control (n=32, x̅=0.048, s=0.13), (c) DNA extraction control
(n=31, x̅=0.081, s=0.29), (d) field control (n=27, x̅=0.18, s=0.39).
Figure 2.4: Scatter plot for mean copy number/reaction of each sample run in triplicate
(y-axis) versus the coefficient of variation of those values (CV; x-axis) (left) and
histogram of CV versus the frequency of samples that fall under the CV (right).
Figure 2.5: Boxplot of qPCR standards with known DNA concentrations (1000 copies/reaction
down to 1 copy/reaction) set as “eDNA unknowns” versus copy number log10
transformed (y-axis), as a test for qPCR accuracy.
Figure 2.6: Boxplot of Redside Dace standards (10^6 down to 10^0 copies/reaction) at the
threshold cycle (Ct) where the copy number passes the baseline threshold for (A) omitted
(data points for the standard curve were removed to improve the R^2 value) and (B) all standards
(no data points excluded).
Figure 2.7: Barplot of total temporal Redside Dace detections (x-axis) found at each of
the eleven sampled sites (y-axis). Five temporal replicates were collected at each season
(fall and spring) twice, with a time lapse of approximately 10 d between sampling weeks
within a season. Site labels on y-axis are listed in Appendix 2.1.
Figure 2.8: Detection probabilities (y-axis) within seasons (x-axis) for Fall Week 1
(FW1), Fall Week 2 (FW2), Spring Week 1 (SW1), and Spring Week 2 (SW2) (error bars
represent upper and lower 95% confidence limits of estimates).
Figure 2.9: Individual site detection probability estimates for index of flow versus
detection probability (top), and temperature versus detection probability (bottom), during
Spring at 5 replicates.
Figure 2.10: A comparison of the number of sites (out of n=29) with Redside Dace DNA
detections (x-axis), versus the number of replicates sampled (y-axis), for a) the four
spatially replicated samples collected at each site and b) the four temporally replicated
samples collected at each site in each season.
Figure 3.1: Distribution map of sampling locations for Redside Dace (Clinostomus
elongatus). Inset map shows the species’ global range (reproduced from COSEWIC 2007
report, with permission), with the polygon enclosing the species range.
Figure 3.2: Mutational network observed among C. elongatus haplotypes for ATPase 6
and 8 based on parsimony. Each numeric circle corresponds to a haplotype listed in Table
3.2; each node represents one nucleotide substitution. Branch lengths do not correspond
to genetic distance. Inset map shows the geographic distribution.
Figure 3.3: Neighbour-joining dendrogram of relationships among ATPase 6 and 8
haplotypes based on p-distances with 500 bootstrap replicates. Haplotype numbers (Table
3.2) are represented by numbers outside brackets, while numbers of individuals are
represented by numbers inside brackets. Numbers at branch nodes show bootstrap support
values >50%.
Figure 3.4: Mutational network observed among C. elongatus haplotypes for cytochrome
b based on parsimony. Each numeric circle corresponds to a haplotype listed in Table 3.4;
each node represents one nucleotide substitution. Branch lengths do not correspond to
genetic distance. Inset map shows the geographic distribution of haplogroups
(purple=haplogroup C; light green=haplogroup D).
Figure 3.5: Neighbour-joining dendrogram of relationships among cytochrome b
haplotypes based on p-distances with 500 bootstrap replicates. Haplotype numbers (Table
3.4) are represented by numbers outside brackets, while numbers of individuals are
represented by numbers inside brackets. Numbers at branch nodes show bootstrap support
values >50%.
Figure 3.6: Mutational network observed among C. elongatus haplotypes for total
evidence for cytochrome b and ATPase 6 and 8 based on parsimony. Each numeric circle
corresponds to a haplotype listed in Table 3.5; each node represents one nucleotide
substitution. Branch lengths do not correspond to genetic distance. Inset map shows the
geographic distribution of haplogroups (orange=haplogroup 1, olive green=haplogroup 3,
red=haplogroup 2).
Figure 3.7: Neighbour-joining dendrogram of relationships among total evidence
(cytochrome b and ATPase 6 and 8) haplotypes based on p-distances with 500 bootstrap
replicates. Haplotype numbers (Table 3.5) are represented by numbers outside brackets,
while numbers of individuals are represented by numbers inside brackets. Numbers at
branch nodes show bootstrap support values >50%.
Figure 3.8: Distribution of haplogroups (orange=haplogroup 1, green=haplogroup 3,
red=haplogroup 2, black=unassigned) for combined cytochrome b and ATPase 6 and 8
data using groups identified via mutational network (Figure 3.6) and genetic distance
(Figure 3.7).
Figure 3.9: Results from Bayesian clustering analyses in STRUCTURE for Redside Dace
individuals, where K represents number of genetically unique populations. Analyses were
run from K=1 to K=29, using methods outlined in Chapter 2. Results analysed using (i) log
likelihood (L(K)), and (ii) ∆K approach outlined in Evanno et al. (2005).
Figure 3.10: Bayesian clustering assignment implemented in STRUCTURE for 28
populations at (a) K=3 for range-wide analysis (b) results of separate STRUCTURE runs
on the three identified subsets for fine-scale analysis, showing optimal K values along
with preceding and successive K values. All runs were implemented with no admixture,
and independent allele frequencies. Colours between different runs have no association
with each other.
Figure 3.11: Results from Bayesian clustering analyses in STRUCTURE for Redside
Dace individuals, for three genetic groups identified by K=3 on Figure 3.10, where K
represents number of genetically unique populations. Analyses were run for red group
from K=1 to K=10 (top left), green group from K=1 to K=20 (top right), and blue group
from K=1 to K=10 (bottom) using methods outlined in Chapter 2. Results
analysed using (i) log likelihood (L(K)), and (ii) ∆K approach outlined in Evanno et al.
(2005).
Figure 3.12: Principal coordinate analysis (PCoA) of genetic structure across all sampled
Redside Dace populations (red=cluster A, blue=cluster B, green=cluster C). Inset map
shows the geographic distributions of each genetic group.
Figure 3.13: Neighbour-joining dendrogram of genetic relationships among sampled
populations based on Nei et al. (1983) DA genetic distance for 10 microsatellite loci.
Numbers at branch nodes represent bootstrap support values >50% based on 500
bootstrap replicates. Groups correspond to those identified in Figure 3.12.
Figure 3.14: Plot of isolation by distance for pairwise population comparisons of
transformed geographic distance [ln (distance in km+1)] versus genetic divergence
[(FST)/(1-FST)].
Figure 3.15: Isolation by distance plot of transformed geographic distance [ln (distance in
km+1)] versus genetic divergence [(FST)/(1-FST)] for population pairs with
geographic distances of less than 123 km. Points in yellow represent pairwise
comparisons among the Allegheny River populations.
List of Appendices
Appendix 2.1: Field data collected at 29 sites including sampling dates, fish caught,
habitat characteristics (channel width, channel depth, conductivity, temperature), and
GPS coordinates.
Appendix 2.2a: Raw qPCR values (copies/reaction) for fast mix during Spring (S) and
Fall (F) sampling season at 29 sampled Redside Dace sites for temporal (T1-T5) and
spatial (S1-S4) replicates.
Appendix 2.2b: Raw qPCR values (copies/reaction) for environmental mix during
Spring (S) sampling season at 29 sampled Redside Dace sites for temporal (T1-T5) and
spatial (S1-S4) replicates.
Appendix 2.3: Ten lake control sites where Redside Dace are absent, along with their GPS
coordinates and date sampled.
Appendix 2.4: Comparison of Environmental versus Fast mastermix
Appendix 2.5: Separating the signal from the noise: using receiver operator
characteristics to optimize sensitivity and specificity of environmental DNA detections
Appendix 2.6: Comparison of electrofishing and eDNA detections during Fall and Spring
sampling season (total of 29 sites).
Appendix 2.7: Estimates of Redside Dace detection probability and occupancy, AICc,
ΔAICc, AIC weights, number of parameters, and -2log values from models for spring and
fall field seasons (horizontal headings), at temporal sampling (R) of 3, 4, and 5 replicates.
Appendix 2.8: Comparison of costs for eDNA versus electrofishing
Appendix 3.1: Primer sequences used for microsatellite DNA analysis, along with their
GenBank accession numbers, repeat motifs, size range (bp) and annealing temperatures.
Appendix 3.2: Proportion of polymorphic loci across ten microsatellite primers for 28
Redside Dace populations.
Appendix 3.3: List of 20 populations that deviate from Hardy-Weinberg equilibrium
expectations.
Appendix 3.4: Total evidence haplotype numbers with corresponding ATPase 6 and 8
and cytochrome b haplotypes.
Chapter 1: General introduction
Within the last century, the rate of vertebrate species loss has increased to
approximately 100 times greater than the background extinction rate (Ceballos et al.
2015). This rapid loss of biodiversity has occurred primarily as a result of anthropogenic
causes, including habitat fragmentation and loss (Soulé and Kohm 1989, Frankham et al.
2002). The North American freshwater fauna are experiencing a loss of biodiversity
comparable to biota in tropical forests; extinction rates for freshwater mussels are
projected to increase by 6% within the next decade, with 39% of fish species listed as
imperiled (Ricciardi and Rasmussen 1999, Dudgeon et al. 2006, Jelks et al. 2008,
Ceballos et al. 2015).
Conservation biology is a multidisciplinary science that encompasses monitoring,
biogeography, ecology, and genetics in order to protect species (Soulé et al. 1985). To
reduce the loss of biodiversity, the field of conservation biology seeks to counter some of
the threats that drive species decline (Soulé et al. 1985). The main goals are to protect species or
populations that have been negatively impacted by human-mediated activities including
habitat loss, invasive species, urbanization, and agricultural activities (Magurran 2009).
Many initiatives have been implemented to protect freshwater systems (Suski and Cooke
2006, George et al. 2009), while research continues on the influence of anthropogenic
processes on individual species and community assemblages (Leidy et al. 2011, Ramirez-Llodra
et al. 2011). Current research has explored the extent of habitat loss (Skole and
Tucker 1993), the effects of fragmentation and its contribution to lower fish species
density (Layman et al. 2004), and the negative impact of urbanization on species richness
and composition (McKinney 2002).
Conservation units
Within Canada and the United States, conservation units below the species level
can be applied to protect species at risk. For species with ranges spanning multiple
jurisdictions, with different laws and policies at federal, state, and provincial levels,
effective conservation can be challenging (Mooers et al. 2010, Petrou et al. 2013, Favaro
et al. 2014). The Endangered Species Act in the United States provides protection for
threatened and endangered species, as well as subspecies or distinct population segments
within species, along with their corresponding habitats. Defining a distinct population
segment can be challenging due to ambiguity in classification below the subspecies level,
so the evolutionary unit (EU) concept was developed to help identify such segments. An EU
is considered distinct if there is evidence of genetic isolation, geographic and temporal
isolation, and behavioural and reproductive isolation (National Research Council 1995).
Distinctness can be assessed using ecological, genetic, behavioural and morphological
data. Additionally, the term “evolutionarily significant unit” exists to help conserve species
at risk, and there are many interpretations of its meaning throughout the literature. Ryder
(1986) used it to identify conservation units based on adaptive variation, while Waples
(1991) used it to identify populations based on adaptive variation and reproductive
isolation. More recently, Moritz (1994) used a genetics-based approach with conservation
units being identified by reciprocal monophyly using mitochondrial DNA (mtDNA) and
allele frequency divergences using nuclear DNA. Lastly, a management unit (MU) is intended
to conserve populations in the short term, and is identified as populations containing high
levels of mtDNA and nuclear genetic diversity (Moritz 1994).
Within Canada, “Designatable Units” are applied to identify conservation units
based on: (i) the presence of subspecies or varieties, (ii) whether populations are discrete from
each other, and (iii) whether that discreteness has evolutionary significance (COSEWIC
2014). A population can be classified as discrete if it is genetically unique, based on either
neutral markers or inherited traits, if there are natural disjunctions between
large portions of a species’ range so that movement between the locations is difficult, and
if the species occupies various eco-geographic locations (COSEWIC 2014). Based on the
above criteria, a population or group can be considered evolutionarily significant if (i) it
has deep phylogenetic divergences from other populations, (ii) it is in a unique ecological
setting that could have resulted in local adaptations, (iii) there is evidence that it is the
only population (or group of populations) left in the
species’ native range, or (iv) its loss would result in an extensive disjunction in the
species’ range (COSEWIC 2014).
Using genetics to identify conservation units
Patterns of genetic structure and diversity within species reflect historical and
contemporary influences, and an understanding of the impacts of past and current events
and processes is beneficial for effective management (Bernatchez and Wilson 1998,
McDermid et al. 2011, Ginson et al. 2015). At a historical scale, genetic data reflect large-scale
processes such as vicariant events and climatic changes, even though these occurred
thousands of years ago, and can be used to understand contemporary genetic structuring
(Avise 2000, Gum et al. 2005, Borden and Krebs 2009). In particular, Pleistocene
glaciations played important roles in shaping the contemporary distributions of many
freshwater fish species within North America (Hocutt and Wiley 1986 and chapters
therein). Advancing ice sheets displaced fish species into refugia at glacial margins, and
their current distributions reflect postglacial dispersal during and after
deglaciation (Hocutt and Wiley 1986, Mandrak and Crossman 1992, Wilson and Hebert
1996, McDermid et al. 2011). Phylogeographic studies using mitochondrial DNA
(mtDNA) have been used to make inferences about glaciation events and postglacial
dispersal routes that provide insight into evolutionary lineages important for management
units (Bernatchez and Wilson 1998, Ginson et al. 2015). Mitochondrial DNA is well
suited for this type of study, as it is maternally inherited without recombination, is
considered selectively neutral, and has a higher mutation rate than most nuclear genes
(Brown et al. 1979, Avise et al. 1987, Moritz et al. 1987).
At a more contemporary scale, knowledge of connectivity between populations,
inbreeding levels, and metapopulation structuring, can also be inferred using genetic data
(Berendzen and Dugan 2008, Blakney et al. 2014, Ginson et al. 2015). These data can
also be used to examine the impact of anthropogenic habitat alterations such as
urbanization and habitat fragmentation (Blakney et al. 2014, Mather et al. 2015). Patterns
of historical structuring can be complemented using faster-evolving markers such as
nuclear microsatellite DNA loci, which are biparentally inherited, selectively neutral, and
have high mutation rates (Li et al. 2002, Ellegren 2004, Selkoe and Toonen 2006, Kirk
and Freeland 2011). Microsatellite data can provide information on genetic diversity,
effects of inbreeding, gene flow, and population structuring within and among
contemporary populations (McCusker et al. 2014, Ginson et al. 2015, Glass et al. 2015).
These complementary genetic data sources can be used separately or in concert to help
inform conservation and management decisions (Moritz 1994,
Neff et al. 2011).
Monitoring challenges
An effective monitoring program should incorporate knowledge of a species’
distribution and habitat, and allow for adaptive management practices (Campbell et al.
2002). It is common for species at risk to have poorly identified habitat ranges and
requirements at the time of listing, and few studies have accounted for population
viability and unoccupied habitat when making designations (Camaclang et al. 2015).
Monitoring efforts to obtain this information have often been undertaken rather
haphazardly, with specific objectives, experimental design and statistical analysis being
overlooked, likely due to the limited resources available for the recovery of a species
(Noss 1990). Once comprehensive monitoring has been undertaken, sites can be assessed
for recovery efforts.
A novel application for conservation genetics is using environmental DNA
(eDNA) to document occurrences of aquatic endangered species. Environmental DNA
detection is growing in popularity for its ability to detect occurrences of aquatic species
(Ficetola et al. 2008, Darling and Mahon 2011, Mahon et al. 2013). Using this technique,
species’ presence can be documented based on the presence of their DNA in water or
sediments (Ficetola et al. 2008). This technique is non-invasive and has the potential to
detect species at low densities, yet its application to endangered species has been
limited (Bronnenhuber and Wilson 2013, Wilcox et al. 2013). The focus of most eDNA
work has been on invasive species (Olson et al. 2012, Goldberg et al. 2013, Jerde et al.
2013). Environmental DNA has been evaluated as an effective tool in controlled
environments (Ficetola et al. 2008, Thomsen et al. 2012, Goldberg et al. 2013), marine
and freshwater systems (Thomsen et al. 2012, Jerde et al. 2013, Goldberg et al. 2013) and
aquatic sediments (Turner et al. 2014). Studies have also evaluated the effect of flow on
detection rates (Deiner and Altermatt 2014), the effectiveness of various DNA extraction
methods (Deiner et al. 2014), the number of PCR replicates needed to avoid false
positives and negatives in metabarcoding (Ficetola et al. 2014), and seasonal effects on
detection rates (Deiner and Altermatt 2014, Jane et al. 2014). These advances have helped
refine eDNA as a monitoring tool, although several limitations still exist that need to be
addressed in future studies (Roussel et al. 2015). Standard reporting procedures need to
be incorporated across studies so that basic information including (i) copy numbers
present in negative controls, (ii) limits of detection for target species and (iii)
standardized measures of what constitutes a “false positive” are known (Roussel et al.
2015). To date, the detection effectiveness of eDNA has rarely been compared with that of
traditional sampling methods or with occupancy/abundance estimates (Roussel et al. 2015).
Also, while detection probability estimates are useful for determining optimal sampling
conditions, few studies have used this approach to account for imperfect detection
(Schmidt et al. 2013, Ficetola et al. 2014, Hunter et al. 2015).
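As an illustration of how site-occupancy models account for imperfect detection, the following minimal Python sketch gives the likelihood of one site's detection history under a simple single-season model. The function name, the parameter values, and the assumption of a constant per-replicate detection probability are illustrative only and are not taken from the studies cited above.

```python
from math import prod


def site_likelihood(history, psi, p):
    """Likelihood of one site's detection history (illustrative sketch).

    history: sequence of 0/1 outcomes, one per replicate sample
    psi:     probability that the site is occupied (assumed value)
    p:       detection probability per replicate, given occupancy
             (assumed constant across replicates)
    """
    # Probability of this exact history if the site is occupied.
    given_occupied = prod(p if y else 1.0 - p for y in history)
    if any(history):
        # At least one detection: the site must be occupied.
        return psi * given_occupied
    # No detections: occupied but missed every time, or truly unoccupied.
    return psi * given_occupied + (1.0 - psi)


# Hypothetical values: three replicates with no detections.
print(site_likelihood([0, 0, 0], psi=0.5, p=0.5))
```

The second branch is the point of the approach: a site with no detections still contributes some probability of being occupied, rather than being scored as a confirmed absence.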
Test Species
The Redside Dace, Clinostomus elongatus (Teleostei: Cyprinidae), is a small
freshwater minnow that typifies many of the conservation concerns and information needs
involved in creating effective recovery strategies. Redside Dace are stream fish that are generally found
in pools with overhanging vegetation that support terrestrial insects, their primary food
source (Novinger and Coon 2000). The species has a disjunct distribution throughout the
upper Mississippi River Drainage, Ohio River, Allegheny River and upper Susquehanna
River, as well as many tributaries in the Great Lakes Basin (COSEWIC 2007). Within
Canada, Redside Dace populations are restricted to southern Ontario, with the exception
of one population east of Sault Ste. Marie (Redside Dace Recovery Team 2010).
Populations have been declining due to habitat loss and degradation throughout their
range (Parker et al. 1988, COSEWIC 2007, Redside Dace Recovery Team 2010). Redside
Dace is thought to be extirpated from 10 of 24 Ontario watersheds, with eight of the
remaining 14 locations experiencing decline (COSEWIC 2007). Scott and Crossman
(1973) identified Redside Dace as a species to study based on the documented declines.
Despite the species being designated as Endangered in 2007 by the Committee on the
Status of Endangered Wildlife in Canada (COSEWIC 2007), it is not yet protected at the
federal level. Within Ontario, Redside Dace and its habitat are protected under the
province’s Endangered Species Act, and a provincial recovery strategy has been
developed (Redside Dace Recovery Team 2010).
Research on Redside Dace has largely focused on habitat associations, monitoring
approaches, and threats; however, limited information exists on their spatial genetic
structure and diversity (Berendzen and Dugan 2008, Houston et al. 2010, Redside Dace
Recovery Team 2010, Sweeten 2012). Although a large portion of the current Redside
Dace range was glaciated during the Pleistocene, their postglacial origins still remain
unresolved. Based on distributional data, Underhill (1986) suggested that Redside Dace
colonized their contemporary range from a single (Mississippian) glacial refugium,
whereas Mandrak and Crossman (1992) suggested that contemporary populations may
have originated from two refugia (Atlantic and Mississippian). Additionally, few studies
have looked at contemporary Redside Dace structuring using microsatellite DNA
(Berendzen and Dugan 2008, Sweeten 2012), which can provide important information
on genetic diversity levels, gene flow, and potential inbreeding. Information on geographic
variation in both mitochondrial (mtDNA) and microsatellite DNA is needed to investigate
large-scale historical and fine-scale contemporary influences on genetic structure and
diversity within and among populations of Redside Dace, and would help inform
conservation efforts.
My research aimed to advance the conservation of Redside Dace by applying
conservation genetic tools to identify and map genetic diversity within the species range,
and to improve monitoring efforts. In Chapter 2, I sought to assess the sensitivity of
eDNA for documenting the presence of Redside Dace, and developed basic sampling
protocols to minimize false positive and false negative error rates. In Chapter 3, I
characterized the phylogeography and contemporary genetic structure and diversity for
Redside Dace across its global range, with an emphasis on Ontario populations. The
results from these two studies provide new knowledge and tools to help inform Redside
Dace conservation and address actions identified in the Ontario Recovery Strategy for the
species.
References
Avise JC, Arnold J, Ball RM, Bermingham E, et al. (1987) Intraspecific phylogeography:
the mitochondrial DNA bridge between population genetics and systematics. Annual
Review of Ecology and Systematics, 18, 489-522.
Avise JC (2000) Phylogeography. Harvard University Press, Cambridge, MA.
Berendzen PB, Dugan JF (2008) Establishing conservation units and population genetic
parameters of fishes of greatest conservation need distributed in Southeast Minnesota.
State Wildlife Grants Program. Division of Ecological Resources, Minnesota Department
of Natural Resources. Available from:
http://files.dnr.state.mn.us/eco/nongame/projects/consgrant_reports/2008/2008_berendzen_etal.pdf.
Bernatchez L, Wilson CC (1998) Comparative phylogeography of Nearctic and Palearctic
fishes. Molecular Ecology, 7, 431–452.
Blakney JR, Loxterman JL, Keeley ER (2014) Range-wide comparisons of northern
leatherside chub populations reveal historical and contemporary patterns of genetic
variation. Conservation Genetics, 15, 757-770.
Borden WC, Krebs RA (2009) Phylogeography and postglacial dispersal of smallmouth
bass (Micropterus dolomieu) into the Great Lakes. Canadian Journal of Fisheries and
Aquatic Sciences, 66, 2142-2156.
Bronnenhuber JE, Wilson CC (2013) Combining species-specific COI primers with
environmental DNA analysis for targeted detection of rare freshwater species.
Conservation Genetics Resources, 5, 971–975.
Brown WM, George M, Wilson AC (1979) Rapid evolution of animal mitochondrial
DNA. Proceedings of the National Academy of Sciences, 76, 1967-1971.
Camaclang AE, Maron M, Martin TG, Possingham HP (2015) Current practices in the
identification of critical habitat for threatened species. Conservation Biology, 29, 482-
492.
Campbell SP, Clark JA, Crampton LH, Guerry AD, et al. (2002) An assessment of
monitoring efforts in endangered species recovery plans. Ecological Applications, 12,
674-681.
Ceballos G, Ehrlich PR, Barnosky AD, García A, et al. (2015) Accelerated modern
human–induced species losses: Entering the sixth mass extinction. Science Advances, 1,
e1400253.
COSEWIC (2007) COSEWIC assessment and updated status report on the Redside Dace
Clinostomus elongatus in Canada. Committee on the Status of Endangered Wildlife in
Canada. Ottawa, ON.
COSEWIC (2014) Guidelines for Recognizing Designatable Units. Committee on the
Status of Endangered Wildlife in Canada. Available from:
http://www.cosewic.gc.ca/eng/sct2/sct2_5_e.cfm.
Darling JA, Mahon AR (2011) From molecules to management: adopting DNA-based
methods for monitoring biological invasions in aquatic environments. Environmental
Research, 111, 978-988.
Deiner K, Altermatt F (2014) Transport distance of invertebrate environmental DNA in a
natural river. PLoS ONE, 9, e88786.
Deiner K, Walser JC, Mächler E, Altermatt F (2014) Choice of capture and extraction
methods affect detection of freshwater biodiversity from environmental DNA. Biological
Conservation, 183, 53-63.
Dudgeon D, Arthington AH, Gessner MO, Kawabata Z, et al. (2006) Freshwater
biodiversity: importance, threats, status and conservation challenges. Biological Reviews,
81, 163-182.
Ellegren H (2004) Microsatellites: simple sequences with complex evolution. Nature
Reviews Genetics, 5, 435-445.
Favaro B, Claar DC, Fox CH, Freshwater C, et al. (2014) Trends in extinction risk for
imperiled species in Canada. PLoS ONE, 9, e113118.
Ficetola GF, Miaud C, Pompanon F, Taberlet P (2008) Species detection using
environmental DNA from water samples. Biology Letters, 4, 423-425.
Ficetola GF, Pansu J, Bonin A, Coissac E, et al. (2014) Replication levels, false
presences, and the estimation of presence / absence from eDNA metabarcoding data.
Molecular Ecology Resources, 15, 543–556.
Frankham R, Briscoe DA, Ballou JD (2002) Introduction to Conservation Genetics.
Cambridge University Press, Cambridge.
George AL, Kuhajda BR, Williams JD, Cantrell MA, et al. (2009) Guidelines for
propagation and translocation for freshwater fish conservation. Fisheries, 34, 529-545.
Ginson R, Walter RP, Mandrak NE, Beneteau CL, et al. (2015) Hierarchical analysis of
genetic structure in the habitat-specialist Eastern Sand Darter (Ammocrypta pellucida).
Ecology and Evolution, 5, 695–708.
Glass WR, Walter RP, Heath DD, Mandrak NE, et al. (2015) Genetic structure and
diversity of spotted gar (Lepisosteus oculatus) at its northern range edge: implications for
conservation. Conservation Genetics, 16, 889-899.
Goldberg CS, Sepulveda A, Ray A, Baumgardt J, Waits LP (2013) Environmental DNA as
a new method for early detection of New Zealand mudsnails (Potamopyrgus
antipodarum). Freshwater Science, 32, 792–800.
Gum B, Gross R, Kuehn R (2005) Mitochondrial and nuclear DNA phylogeography of
European grayling (Thymallus thymallus): evidence for secondary contact zones in central
Europe. Molecular Ecology, 14, 1707-1725.
Hocutt CH, Wiley EO (1986) The Zoogeography of North American Freshwater Fishes.
John Wiley and Sons, New York.
Houston DD, Shiozawa DK, Riddle BR (2010) Phylogenetic relationships of the western
North American cyprinid genus Richardsonius, with an overview of phylogeographic
structure. Molecular Phylogenetics and Evolution, 55, 259-273.
Hunter ME, Oyler-McCance SJ, Dorazio RM, Fike JA, et al. (2015) Environmental DNA
(eDNA) sampling improves occurrence and detection estimates of invasive Burmese
pythons. PLoS ONE, 10, e0121655.
Jane S, Taylor WM, McKelvey KS, Young MK (2014) Distance, flow and PCR
inhibition: eDNA dynamics in two headwater streams. Molecular Ecology Resources, 15,
216-227.
Jelks HL, Walsh SJ, Burkhead NM, Contreras-Balderas S, et al. (2008) Conservation
status of imperiled North American freshwater and diadromous fishes. Fisheries, 33, 372-
407.
Jerde CL, Chadderton WL, Mahon AR, Renshaw MA, et al. (2013) Detection of Asian
carp DNA as part of a Great Lakes basin-wide surveillance program. Canadian Journal of
Fisheries and Aquatic Sciences, 70, 522-526.
Sweeten J (Manchester University) (2012) Redside dace (Clinostomus elongatus) in Mill
Creek, Wabash County, Indiana: A strategy for research and augmentation. Indiana
Department of Natural Resources.
Kirk J, Freeland JR (2011) Applications and implications of neutral versus non-neutral
markers in molecular ecology. International Journal of Molecular Sciences, 12, 3966-
3988.
Layman CA, Arrington DA, Langerhans RB, Silliman BR (2004) Degree of
fragmentation affects fish assemblage structure in Andros Island (Bahamas)
estuaries. Caribbean Journal of Science, 40, 232-244.
Leidy RA, Cervantes‐Yoshida K, Carlson SM (2011) Persistence of native fishes in small
streams of the urbanized San Francisco Estuary, California: acknowledging the role of
urban streams in native fish conservation. Aquatic Conservation: Marine and Freshwater
Ecosystems, 21, 472-483.
Li YC, Korol AB, Fahima T, Beiles A, Nevo E (2002) Microsatellites: genomic
distribution, putative functions and mutational mechanisms: a review. Molecular
Ecology, 11, 2453-2465.
Magurran AE (2009) Threats to freshwater fish. Science, 325, 1215-1216.
Mahon AR, Barnes MA, Li F, Egan SP, et al. (2013) DNA-based species detection
capabilities using laser transmission spectroscopy. Journal of the Royal Society Interface,
10, 20120637.
Mandrak NE, Crossman E (1992) Postglacial dispersal of freshwater fishes into Ontario.
Canadian Journal of Zoology, 70, 2247-2259.
Mather A, Hancox D, Riginos C (2015) Urban development explains reduced genetic
diversity in a narrow range endemic freshwater fish. Conservation Genetics, 16, 625-634.
McCusker MR, Mandrak NE, Egeh B, Lovejoy NR (2014) Population structure and
conservation genetic assessment of the endangered Pugnose Shiner, Notropis
anogenus. Conservation Genetics, 15, 343-353.
McDermid JL, Wozney JK, Kjartanson SL, Wilson CC (2011) Quantifying historical,
contemporary, and anthropogenic influences on the genetic structure and diversity of lake
sturgeon (Acipenser fulvescens) populations in northern Ontario. Journal of Applied
Ichthyology, 27, 12–23.
McKinney ML (2002) Urbanization, biodiversity, and conservation. BioScience, 52, 883-
890.
Mooers AO, Doak DF, Findlay CS, Green DM, et al. (2010) Science, policy, and species
at risk in Canada. BioScience, 60, 843-849.
Moritz C (1994) Defining ‘evolutionarily significant units’ for conservation. Trends in
Ecology & Evolution, 9, 373-375.
Moritz C, Dowling TE, Brown WM (1987) Evolution of animal mitochondrial DNA:
relevance for population biology and systematics. Annual Review of Ecology and
Systematics, 18, 269-292.
National Research Council (NRC) (1995) Science and the Endangered Species Act.
National Academy Press, Washington, D.C.
Neff BD, Garner SR, Pitcher TE (2011) Conservation and enhancement of wild fish
populations: preserving genetic quality versus genetic diversity. Canadian Journal of
Fisheries and Aquatic Sciences, 68, 1139-1154.
Noss RF (1990) Indicators for monitoring biodiversity: a hierarchical
approach. Conservation Biology, 4, 355-364.
Novinger DC, Coon TG (2000) Behavior and physiology of the redside dace,
Clinostomus elongatus, a threatened species in Michigan. Environmental Biology of
Fishes, 57, 315–326.
Olson ZH, Briggler JT, Williams RN (2012) An eDNA approach to detect eastern
hellbenders (Cryptobranchus a. alleganiensis) using samples of water. Wildlife Research,
39, 629-636.
Parker BJ, McKee P, Campbell RR (1988) Status of the redside dace, Clinostomus
elongatus, in Canada. Canadian Field Naturalist, 102, 163-169.
Petrou EL, Hauser L, Waples RS, Seeb JE, et al. (2013) Secondary contact and changes in
coastal habitat availability influence the nonequilibrium population structure of a
salmonid (Oncorhynchus keta). Molecular Ecology, 22, 5848-5860.
Ramirez-Llodra E, Tyler PA, Baker MC, Bergstad OA, et al. (2011) Man and the last
great wilderness: human impact on the deep sea. PLoS ONE, 6, e22588.
Redside Dace Recovery Team (2010) Recovery Strategy for Redside Dace (Clinostomus
elongatus) in Ontario. Ontario Ministry of Natural Resources. Peterborough, ON.
Ricciardi A, Rasmussen JB (1999) Extinction rates of North American freshwater
fauna. Conservation Biology, 13, 1220-1222.
Roussel JM, Paillisson JM, Tréguier A, Petit E (2015) The downside of eDNA as a
survey tool in water bodies. Journal of Applied Ecology, 52, 823-826.
Ryder OA (1986) Species conservation and systematics: the dilemma of
subspecies. Trends in Ecology & Evolution, 1, 9-10.
Schmidt BR, Kery M, Ursenbacher S, Hyman OJ, Collins JP (2013) Site occupancy
models in the analysis of environmental DNA presence/absence surveys: a case study of
an emerging amphibian pathogen. Methods in Ecology and Evolution, 4, 646–653.
Scott WB, Crossman EJ (1973) Freshwater Fishes of Canada. Fisheries Research Board
of Canada, Bulletin 183, Toronto, ON. 966 pp.
Selkoe KA, Toonen RJ (2006) Microsatellites for ecologists: a practical guide to using
and evaluating microsatellite markers. Ecology Letters, 9, 615-629.
Skole D, Tucker C (1993) Tropical deforestation and habitat fragmentation in the
Amazon: satellite data from 1978 to 1988. Science, 260, 1905-1910.
Soulé ME (1985) What is conservation biology? A new synthetic discipline addresses the
dynamics and problems of perturbed species, communities, and ecosystems. BioScience,
35, 727-734.
Soulé ME, Kohm KA (1989) Research Priorities for Conservation Biology. Island Press,
Washington, DC.
Suski CD, Cooke SJ (2007) Conservation of aquatic resources through the use of
freshwater protected areas: opportunities and challenges. Biodiversity and
Conservation, 16, 2015-2029.
Thomsen PF, Kielgast J, Iversen LL, Wiuf C, et al. (2012) Monitoring endangered
freshwater biodiversity using environmental DNA. Molecular Ecology, 21, 2565–2573.
Turner CR, Uy KL, Everhart RC (2014) Fish environmental DNA is more concentrated in
aquatic sediments than surface water. Biological Conservation, 183, 93-102.
Underhill JC (1986) The fish fauna of the Laurentian Great Lakes, the St. Lawrence
Lowlands, Newfoundland and Labrador. In: Zoogeography of North American
Freshwater Fishes (eds Hocutt CH, Wiley EO), pp. 105-136. John Wiley and Sons, NY.
Waples RS (1991) Pacific salmon, Oncorhynchus spp., and the definition of “species”
under the Endangered Species Act. Marine Fisheries Review, 53, 11-22.
Wilcox TM, McKelvey KS, Young MK, Jane SF, Lowe WH (2013) Robust detection of
rare species using environmental DNA: the importance of primer specificity. PLoS ONE,
8, e59520.
Wilson CC, Hebert PDN (1996) Phylogeographic origins of lake trout (Salvelinus
namaycush) in eastern North America. Canadian Journal of Fisheries and Aquatic
Sciences, 53, 2764-2775.
Chapter 2: Using environmental DNA (eDNA) to detect endangered Redside Dace,
Clinostomus elongatus
Abstract
Detection and monitoring of species at risk in aquatic environments necessitates
methods that are non-intrusive and are able to identify target organisms at low densities.
Environmental DNA (eDNA) as a monitoring tool has been applied extensively to invasive
species, but research on species at risk has been limited. In this study, Redside Dace
(Clinostomus elongatus), an endangered fish native to southwestern Ontario within Canada,
was used to determine if eDNA is a sensitive tool for monitoring cyprinids and other stream
fishes. A total of 29 historic Redside Dace sites were sampled, with five temporal and four
spatial eDNA replicates collected at each site, and later analyzed using qPCR. Additionally,
I assessed if seasonal differences in spawning activity and stream flow would impact
Redside Dace eDNA detections. Occupancy modelling indicated that overall detection
probability was higher in the spring than in the fall and varied between sampling weeks
within a season, that spatial and temporal water sampling gave comparable results, and
that a minimum of three replicates is needed to reliably detect
Redside Dace at a specific site. A comparison of naive detections for electrofishing versus
eDNA monitoring in the fall indicated that eDNA was able to detect Redside Dace at more
sites than electrofishing. The results from my study indicate that eDNA surveying is a
sensitive tool for species detection that can complement conventional sampling methods
to increase detection success.
Introduction
The rapid loss of biodiversity globally has provoked a need to conserve species
(Frankham et al. 2002). To counter the effects of biodiversity loss, many jurisdictions
have enacted legislation to protect species at risk and their habitats. Important activities
associated with these laws include gathering scientific information regarding the species
distribution, biology and habitat requirements, and encouraging stewardship activities that
facilitate recovery. For this to be successful, an extensive inventory and monitoring
program needs to be put in place to determine the range of occupied habitat for a species,
and site-specific population trends (Campbell et al. 2002). A basic understanding of
habitat ranges for most species is lacking, and this can therefore be a difficult undertaking
(Thompson 2004).
The detection and monitoring of aquatic species at risk requires methods that are
non-intrusive and are able to identify target organisms at low densities. Traditional
monitoring methods for aquatic organisms, including electrofishing, snorkelling and netting,
can be costly and time consuming, and require trained labour (Darling and Mahon
2011). From a conservation perspective, these monitoring approaches could also
increase stress and reduce the fitness of already at-risk species (Nielsen 1998). These
limitations can be overcome with environmental DNA (eDNA) detection,
an emerging technique that is increasingly used because of its potential for
species detection (Ficetola et al. 2008). Environmental DNA detection refers to the detection
of target DNA from an environmental sample; its benefit lies in documenting a species’
presence without invasive survey techniques. This DNA-based identification holds potential for
unambiguous species identification at various life stages (Hebert et al. 2003, Victor et al.
2009, Serrao et al. 2014), and for greater sensitivity at detecting the target organism over
traditional methods for monitoring surveys (Darling and Mahon 2011).
Environmental DNA methodology has been applied to a wide range of aquatic
species to document their occurrence, distribution, and habitat occupancy. The primary
application of eDNA to date has been on aquatic invasive species such as the American
Bullfrog (Rana catesbeiana) (Ficetola et al. 2008), Bighead Carp (Hypophthalmichthys
nobilis), Silver Carp (H. molitrix) (Jerde et al. 2013) and New Zealand Mudsnail
(Potamopyrgus antipodarum) (Goldberg et al. 2013). In comparison, eDNA methods for
monitoring species at risk have been largely understudied (Bronnenhuber and Wilson
2013, Janosik and Johnston 2015, Laramie et al. 2015). Since the advent of eDNA
technology, lab methodology has progressed from using traditional polymerase chain
reaction (PCR) (Ficetola et al. 2008), to using newer platforms such as real-time PCR
(qPCR) (Wilcox et al. 2013), next-generation sequencing (Thomsen et al. 2012) and laser
transmission spectroscopy (Mahon et al. 2013). Our understanding of DNA’s properties
has also increased over this period. For example, Goldberg et al. (2013) determined that
in a controlled environment, eDNA from the New Zealand Mudsnail could no longer be
detected 45 days after individuals were removed. Similarly, Takahara et al. (2012) found
that when Common Carp (Cyprinus carpio) were placed inside a 9-L tank, the eDNA shed
by the organisms increased until around day 6, after which it reached an equilibrium.
Despite the utility of eDNA monitoring in many study systems, the importance of
sampling design and factors that influence detection have been largely overlooked
(Yoccoz 2012, Schmidt et al. 2013).
Redside Dace (Clinostomus elongatus) present a unique opportunity for testing the
efficacy of eDNA detection for documenting species occurrence because limited
knowledge is available on their habitat distribution, and few studies have focused on
endangered aquatic species. Redside Dace typically consume terrestrial insects and
therefore require clear waters to visually detect their prey, as well as overhanging
vegetation to attract their prey (Daniels and Wisniewski 1994, Novinger and Coon 2000).
They are stream fish that are found at mid-water depths within pools, and move to riffles
for spawning when water temperatures reach 16-18 °C (Novinger and Coon
2000, Redside Dace Recovery Team 2010). Redside Dace occupy a disjunct distribution
throughout the upper Mississippi River Drainage, Great Lakes Basin, Ohio River and
upper Susquehanna River (COSEWIC 2007, Novinger and Coon 2000). Within Canada,
Redside Dace populations are restricted to southern Ontario, with the exception of the
Two Tree River population of unknown origin, recently found on St. Joseph Island near
Sault Ste. Marie (Redside Dace Recovery Team 2010). Populations
have been declining as a result of habitat loss and degradation throughout their range
(Parker et al. 1988, COSEWIC 2007, Redside Dace Recovery Team 2010). In 1973,
Redside Dace was identified as a species to study on the grounds that they were less
common than 30 years prior (Scott and Crossman 1973). In 1987, Redside Dace was
designated as being of Special Concern by the Committee on the Status of Endangered
Wildlife in Canada (COSEWIC) (Parker et al. 1988) and reassessed as Endangered in
2007 (COSEWIC 2007). The species was listed as Endangered under Ontario’s
Endangered Species Act in 2009 (OMNRF 2015).
The distribution of this species is poorly documented and large knowledge gaps
exist regarding extant populations, relative abundance, and occupancy (Redside Dace
Recovery Team 2010). Additionally, a need exists to implement a long-term monitoring
program to examine Redside Dace populations and their habitats through time (Redside
Dace Recovery Team 2010). Before 1979, Redside Dace were not a targeted species in
surveys, and therefore historical knowledge gaps exist regarding their distribution and
abundance (Poos et al. 2012). A few studies have looked at the effectiveness of assessing
Redside Dace abundance using a backpack electrofisher and a bag seine (Reid et al. 2008,
Poos et al. 2012), and standardized approaches for monitoring Redside Dace presence and
abundance exist (Wilson and Dextrase 2008).
The overall objective of my study was to evaluate whether eDNA is a sensitive
tool for regional monitoring of stream fishes and the detection of aquatic species at risk,
using Redside Dace as a study species. I also investigated whether seasonal differences in
stream flow and spawning activity influence Redside Dace detectability. I predicted that
high water flow within a season would result in fewer Redside Dace detections because of a
dilution effect. Specific objectives were to: (i) characterize the repeatability of sampling results
within a season, (ii) determine the minimum number of water samples to be collected at a
site to ensure confidence in detecting Redside Dace that are present, (iii) compare
detection rates between multiple water samples collected over a one-hour temporal scale
versus spatially replicated samples, and (iv) compare presence/absence of eDNA and
electrofishing-based detections of Redside Dace. The results of this study will be
important for informing future collection efforts for Redside Dace.
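Objective (ii) can be illustrated with a short Python sketch. Under the simplifying assumptions that replicate water samples are independent and share one per-sample detection probability p, the probability of at least one detection in n replicates is 1 - (1 - p)^n. The function names, the 95% confidence target, and the example value of p are hypothetical and are not estimates from this study.

```python
def cumulative_detection(p, n):
    """Probability of at least one detection in n independent replicates."""
    return 1.0 - (1.0 - p) ** n


def min_replicates(p, target=0.95):
    """Smallest n for which cumulative detection reaches the target.

    Assumes independent replicates with a constant per-sample
    detection probability p (a simplification).
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    if not 0.0 < target < 1.0:
        raise ValueError("target must be in (0, 1)")
    n = 1
    while cumulative_detection(p, n) < target:
        n += 1
    return n


# Hypothetical per-sample detection probability.
p_example = 0.65
n_needed = min_replicates(p_example)
print(n_needed, round(cumulative_detection(p_example, n_needed), 3))
```

Lower per-sample detection probabilities, such as might occur under high flow, raise the number of replicates needed to reach the same confidence.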
Methods
Field sampling
Water samples were collected from 29 sites in southern Ontario streams where
Redside Dace are present at varying population densities (COSEWIC 2007, Reid et al.
2008, Poos et al. 2012) (Figure 2.1). Samples were collected during the spring (May and
June 2013) and fall (September 2012 and 2013). Habitat measurements taken at each site
were channel width (mean = 3.60 m, median = 3.22 m), maximum channel depth (mean =
0.22 m, median = 0.22 m), water temperature (mean = 15.1 °C, median = 15.1 °C), and
conductivity (mean = 779.6 μS, median = 724 μS). Average habitat characteristics for
each season are listed in Table 2.1, and a list of sites, locations, and habitat characteristics
is provided in Appendix 2.1.
At each of the 29 sites sampled (Appendix 2.1), nine 1-L water samples were
collected to compare spatial and temporal sensitivity, and evaluate the effect of sample
replicates on Redside Dace detections. Throughout this chapter, the use of the term “site”
refers to the area at which temporal and spatial replicates were collected from the
watercourses outlined in Appendix 2.1. The starting point of sampling for each site was
located approximately 2-3 m downstream of a pool. Pools were targeted for sampling,
as they are the preferred habitat for Redside Dace (Novinger and Coon 2000). Site length
was set at 10 times the wetted stream width, with a minimum length of 40 m, and
contained a minimum of one riffle-pool sequence according to the Ontario Stream
Assessment Protocol (Stanfield 2013). Temporal sampling consisted of collecting five 1-L
water samples at the downstream end of the site at 15-minute intervals (time 0, 15 min,
30 min, 45 min, and 60 min). Five spatial samples were also collected within each site,
using the t = 60 min temporal sample as the first spatial replicate. The remaining four
samples were collected systematically in an upstream direction, each separated by 10 m (if
stream width ≤ 4 m) or by a longer distance (if width > 4 m) defined as wetted width × 10/4.
Each site was sampled in both spring and fall to assess seasonal variation in eDNA
detection probability. Eleven of the 29 sites (Appendix 2.1) were resampled
approximately one week after initial sampling for both fall and spring field seasons, in
order to assess repeatability within each season.
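The site-length and spacing rules described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic; the function names are hypothetical and were not part of the field protocol.

```python
def site_length_m(wetted_width_m: float) -> float:
    """Site length: 10 x wetted stream width, with a 40 m minimum."""
    return max(10.0 * wetted_width_m, 40.0)


def spatial_spacing_m(wetted_width_m: float) -> float:
    """Distance between consecutive spatial samples.

    10 m for streams up to 4 m wide, otherwise wetted width x 10/4.
    """
    if wetted_width_m <= 4.0:
        return 10.0
    return wetted_width_m * 10.0 / 4.0


def spatial_sample_positions_m(wetted_width_m: float, n_samples: int = 5) -> list[float]:
    """Upstream distance of each spatial sample from the downstream start point."""
    spacing = spatial_spacing_m(wetted_width_m)
    return [i * spacing for i in range(n_samples)]
```

With five samples, the four intervals span the full site length in both cases: 40 m for narrow streams, and 10 times the wetted width for streams wider than 4 m.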
A 1-L field control sample was collected at the start of each sampling trip from
the Otonabee River (known absence of Redside Dace) to ensure sterile field techniques.
During the spring of 2013, three 1-L water samples were collected from 10 lakes that do
not support Redside Dace populations (Mandrak and Crossman 1992) in order to define a
minimum copy number per DNA reaction that would be accepted as a positive detection
(Figure 2.1).
After water sampling at a site was complete, the same area was sampled with a
Smith-Root™ backpack electrofisher during the fall season. It was assumed that Redside
Dace populations with densities greater than 1 individual/100 m² would be detectable via
electrofishing (Reid et al. 2008). If Redside Dace were absent from the first sampling
pass, additional electrofishing passes (up to three in total) were completed. The median
electrofishing effort per site was 929 s, the average number of passes
per site was two, and the total effort across all 29 sites was 30,416 s. Each pass was
separated by 10 min to allow the water to clear. Counts of Redside Dace for each pass
were recorded at all 29 sites, and all individuals were released after sampling.
Electrofishing was the only traditional method used to collect Redside Dace.
Sample filtration
Water samples were refrigerated (4 °C) until filtration. Filtration took
place within 48 h of sample collection. One-litre samples were individually filtered
through a three-manifold filtering apparatus (EZ-Stream™ vacuum pump) with filter
sizes of 47 mm GFC and 934-AH membrane pore. Between samples, filter funnels and
their bases were immersed in 10% bleach solution for 10-15 min to destroy DNA on
filtering equipment. Equipment was then rinsed thoroughly with tap water, followed by a
final rinse with double-distilled water to ensure that residual bleach was removed.
Forceps were flame-sterilized between samples in 95% ethanol after contact with
processed filters. Filters were then placed in small labeled petri dishes and stored at -80
°C until DNA extraction.
eDNA marker development
Primer and probe sets were designed to amplify an 83 base pair segment of the
barcoding region of the mitochondrial cytochrome c oxidase subunit I (COI) gene in
Redside Dace. This gene region, which is approximately 650 base pairs in length, was
used in this study because it serves as a tool for species identification and discovery
(Hebert et al. 2003). The primers and probe were designed using Primer Express 3.0.1
software (Applied Biosystems). To verify the specificity of the primers and ensure no
cross-reaction between species found in the same area, the primers were tested across a
PCR temperature gradient of 55-65 °C against tissue-derived DNA from Blacknose Dace
(Rhinichthys atratulus) and Creek Chub (Semotilus atromaculatus). The two species are
commonly found in waters with Redside Dace. The fishes most genetically similar to
Redside Dace include Rosyside Dace (Clinostomus funduloides) and cyprinids in the
genera Richardsonius and Iotichthys. The distributions of these species do not overlap
with that of Redside Dace, so their DNA is unlikely to be present in the samples and amplify (Houston et al. 2010). No
cross-amplification was apparent at any temperature, indicating that the designed
primers were specific to Redside Dace.
DNA extraction
To avoid aerosol contamination, water filtering, DNA extraction and PCR
amplification of samples took place in separate rooms. Samples were extracted following
the MoBio PowerWater DNA Isolation Kit (http://www.mobio.com) protocols, with
several modifications shown in bold. Filters were removed from the -80 °C freezer,
allowed to thaw, and transferred to a 15 mL falcon tube using forceps. Forceps were
flame-sterilized between samples. 1000 µL of PW1 heated to 70 °C was added to each
tube. Tubes were placed in the shaker to lyse for a minimum of 30 min, after which they
were centrifuged at 2,000 x g for 1 min to spin the liquid to the bottom.
Using a transfer pipette, supernatant was placed in a clean 2-mL collection tube, after
which tubes were centrifuged at 13,000 x g for 1 min. The supernatant was pipetted into
a clean 2-mL tube, after which 200 µL of PW2 was added. The entire solution was mixed
using a vortex (VWR Advanced Digital Shaker). Tubes were stored at 4 °C for 5 min.
PW2 was used to remove inhibitory compounds, such as proteins, cell debris and other
non-DNA materials. Tubes were centrifuged at 13,000 x g for 1 min and the supernatant
was transferred into a clean 2-mL centrifuge tube, leaving the pellet in the tube. 650 µL
of PW3 was placed in the incubator at 70 °C for several minutes, and was added into a
new tube and mixed using a vortex. PW3 is a high-concentration salt solution that
allows the DNA to bind to the silica-based spin column. In the fumehood, 650 µL from
each tube was pipetted into a spin column, centrifuged at 13,000 x g for 1 min, and the
flow-through was discarded. This step was repeated twice until all supernatant from each
tube was washed through the spin column. The spin column was placed into a clean 2-mL
tube. Next, 650 µL of PW4 was added to the 2-mL tube, spun at 13,000 x g for 1 min and
the flow-through was subsequently discarded; PW4 was used to remove any residual salts
that may inhibit downstream PCR reactions. 650 µL of PW5 was added to the same tube,
centrifuged at 13,000 x g for 1 min, and the flow-through was discarded again. The same
tubes were immediately re-centrifuged at 13,000 x g for 2 min; PW5 was used to make
sure that PW4 was completely removed from the DNA. Spin baskets were placed into the
final tube, after which the DNA was eluted with 100 µL of low TE (10 mM Tris pH 8,
0.1 mM EDTA) and the solution was centrifuged at 13,000 x g for 1 min. The spin basket
was then discarded. Low TE was used instead of buffer PW6 (as specified in the kit
protocol) because it is superior for long-term storage of DNA.
qPCR amplification
Quantitative PCR (polymerase chain reaction) or real-time PCR (referred to as
qPCR herein), was used as the assay for DNA detection. Real-time PCR is similar to
conventional PCR, with the major technical difference being the use of a fluorescently
labeled probe. This assay uses a TaqMan probe with a minor groove binder (MGB). During the
annealing phase, the Redside Dace-specific forward (RSD-F: 5’-
GCTAGCTTCTTCTGGCGTTGA-3’) and reverse primer (RSD-R: 5’-
CTGCATGGGCAAGGTTACCT-3’) bind to the target strand. Additionally, a probe
(6FAM-CGGAACAGGATGAACGG-MGBNFQ), which consists of a 5’ fluorescent
reporter and a 3’ quencher, hybridizes to the target strand. When the probe is intact, the 3’
quencher absorbs the signal from the 5’ fluorescent reporter through fluorescence
resonance energy transfer (FRET). However, when the Taq polymerase extends the
complementary strand and reaches the probe, it cleaves the
probe through its 5’-3’ nuclease activity, which separates the reporter from the quencher and allows the reporter to emit a
fluorescence signal. The fluorescence signal is proportional to the starting
template quantity, which is determined through the use of standard interpolation (Heid et al.
1996). Benefits to using qPCR over traditional PCR include: 1) qPCR does not require
gel electrophoresis; this reduces contamination because the amplified qPCR product is
unopened; 2) the qPCR reaction takes less time to run; and, 3) qPCR measures copy
number after each cycle (Heid et al. 1996).
Development of qPCR standards
DNA standards were generated using two reference Redside Dace specimens
(RSD1-4, RSD2-2), and were developed to quantify the copies of target DNA present in
the environmental sample. The reference specimens were PCR-amplified to target the 707
bases of COI, inclusive of primers (Ivanova et al. 2007). PCR product was quantified
using a PicoGreen plate (BMG FluoStar Galaxy 96-well plate system). The volume of
Redside Dace DNA needed for 10 billion (10¹⁰) copies of DNA/reaction was calculated, based
on the PicoGreen reading and the molecular weight for the COI region. The calculated
volume was used as a starting point for serial dilution: 10 µL of 10¹⁰ copies/reaction were
added to 90 µL of low TE, to achieve a concentration of 10⁹ copies/reaction, after which
the solution was mixed thoroughly (Wozney and Wilson 2012). A new tip was used to
pipette 10 µL of 10⁹ copies/reaction to 90 µL of low TE, to achieve 10⁸ copies/reaction.
This was repeated until a concentration of 1 copy/reaction was achieved. For each qPCR
run, 10⁶ copies/reaction down to 1 copy/reaction were assayed as quantitative controls.
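The copy-number arithmetic behind the standards can be sketched as follows. This is a minimal illustration, not the calculation sheet used in the lab: it assumes the commonly used average mass of 660 g/mol per double-stranded base pair, and the function names and the example concentration are hypothetical.

```python
AVOGADRO = 6.022e23          # molecules per mole
BP_MASS_G_PER_MOL = 660.0    # assumed average mass of one double-stranded base pair


def copies_per_microlitre(conc_ng_per_ul: float, amplicon_bp: int) -> float:
    """Convert a DNA concentration (ng/uL) to amplicon copies per microlitre."""
    grams_per_ul = conc_ng_per_ul * 1e-9
    grams_per_mole = amplicon_bp * BP_MASS_G_PER_MOL
    return grams_per_ul / grams_per_mole * AVOGADRO


def volume_for_copies_ul(target_copies: float, conc_ng_per_ul: float, amplicon_bp: int) -> float:
    """Volume of PCR product (uL) that contains the requested number of copies."""
    return target_copies / copies_per_microlitre(conc_ng_per_ul, amplicon_bp)


def tenfold_series(start_exponent: int = 10) -> list[int]:
    """Nominal copies/reaction at each step of a 1:10 dilution series down to 1 copy."""
    return [10 ** e for e in range(start_exponent, -1, -1)]
```

For example, `volume_for_copies_ul(1e10, conc, 707)` gives the starting volume for a 707-base product at a measured concentration `conc`, and `tenfold_series(10)` lists the eleven nominal concentrations from 10¹⁰ copies/reaction down to 1 copy/reaction.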
A standard curve was generated by plotting the known concentration of DNA
against the cycle at which the fluorescence signal crossed the threshold (cycle threshold, Ct). The threshold is set
well above the background fluorescence in order to distinguish
signal from noise. Standards were used to identify the cycle number at
which a "known" quantity of DNA passes the fluorescence threshold. Cycle number was
then used to infer the number of DNA copies associated with each sample. For each qPCR
assay, two standards were run, each in duplicate, to compare within-pipette variability.
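Standard interpolation of this kind is conventionally done by regressing Ct on the log10 of the known copy number and inverting the fitted line for unknown samples. The sketch below shows that general approach under the assumption of a simple least-squares fit; it is not the instrument software's implementation, and the function names are hypothetical.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate copies/reaction from a sample Ct."""
    return 10 ** ((ct - intercept) / slope)
```

Because the curve is linear on a log scale, a small error in Ct near the bottom of the curve translates into a large relative error in the estimated copy number, which is one reason low-copy standards are the least reliable points.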
The data for the raw qPCR values across all sites can be found in Appendix 2.2,
and the locations of the 10 negative control lakes can be found in Appendix 2.3.
qPCR reactions
PCR reactions were set up in a laminar flow UV fumehood to minimize the risk of
DNA contamination. Standards were pipetted into wells in a separate room that was
dedicated for amplified Redside Dace DNA to avoid cross-contamination. A preliminary
test for inhibition was done based on Redside Dace eDNA lab samples that were
previously collected. Two sample replicates were collected from Lynde Creek (LC1 and
LC2), and two eDNA sample replicates were collected from Mitchell Creek (MC1 and
MC2). Samples were diluted to determine if inhibitors were present using a dilution series
of undiluted DNA, 1:2, 1:5, 1:10, 1:20, 1:30; each sample was run twice on the qPCR
assay to determine the within-sample variation. If inhibitors were present in the reaction,
an increase in copy number would be expected in more diluted samples. For both field
seasons (Sept 2nd 2012-June 11th 2014), each eDNA sample was run with 15 µL of the
following cocktail: 10 µL of TaqMan® Fast Universal PCR Master Mix (2×) (referred to
as fast mix hereafter), 0.4 µL of RSD-R, 0.4 µL of RSD-F, 0.4 µL of RSD-probe, and 3.8 µL
of ddH2O; 5 µL of stock DNA was then added. Each sample was run in triplicate to assess the level of
within-sample variability. There was no evidence of inhibition, as copy numbers decreased
with increasing dilution (Figure 2.2). StepOnePlus thermocycling
conditions for the fast mix were as follows: initial denaturation for 2 min at 95 °C,
followed by a two-step process of 1 s denaturation at 95 °C and 20 s annealing at 60 °C,
repeated for 40 cycles.
Water samples from fall 2012 were run using fast mix, while water samples from
spring 2013 were run using both fast mix and TaqMan® Environmental Master Mix 2.0
(referred to as environmental mix herein), the latter of which was first used in the lab
during 2013. The environmental mix could not be tested on fall 2012 water samples due
to the potential for DNA degradation during the year that they had been stored in the freezer. The
fast mix was compared to the environmental mix in order to see if there was a significant
difference between the two mixes at detecting low copy numbers. Based on these results,
no significant difference would indicate that the data from the environmental mix and the
fast mix could be analyzed in conjunction, but a significant difference would indicate that
the data from each mix would have to be analyzed separately. The paired comparison of
the PCR results from each of the TaqMan master mixes is detailed in Appendix 2.4.
Statistical analyses
Each qPCR sample was run in triplicate and the average of the three runs was
used for the analysis. The precision and accuracy of the qPCR platform were assessed in
order to determine how reliable the platform was for determining the DNA copy number
present within a sample. The coefficient of variation (CV) was calculated to assess the
within-trial precision for each triplicate sample. This was determined by dividing the
standard deviation by the mean for each set of PCR triplicates. Values that had a standard
deviation of 0 and a mean of 0 were left as 0 for the coefficient of variation (since 0/0 is
undefined). To measure the qPCR accuracy, known concentrations of DNA (1000
copies/reaction down to 1 copy/reaction) were treated as “unknown” DNA samples. This
was replicated seven times to get a true measure of data variability. A second measure of
accuracy was determined by graphing the copy number of each standard against the cycle
at which it passes the threshold (Ct) for (i) all standards generated in the assay and (ii)
standard points used in the regression to generate the copy numbers (often points at 1
copy/reaction were omitted because they deviated far away from the line of best fit).
Failure rates were calculated for each mix at the 1 copy/reaction to determine how many
failed to generate a Ct at 40 cycles. These values were assigned a Ct of 40 cycles.
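The CV calculation, including the 0/0 convention described above, can be sketched as below. The use of the sample (n - 1) standard deviation is an assumption; under that assumption a triplicate with a single non-zero reading has a CV of about 173%, whatever the reading's magnitude.

```python
import statistics


def coefficient_of_variation(replicates):
    """CV (%) of one set of qPCR replicates, using the sample standard deviation.

    A set whose mean and standard deviation are both 0 is assigned a CV of 0,
    since 0/0 is undefined.
    """
    mean = statistics.mean(replicates)
    sd = statistics.stdev(replicates)  # assumption: n - 1 denominator
    if mean == 0 and sd == 0:
        return 0.0
    return sd / mean * 100.0
```

For example, `coefficient_of_variation([0, 0, 3.0])` and `coefficient_of_variation([0, 0, 0.3])` give the same value, so very high CVs mainly flag triplicates in which only one replicate amplified.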
An occupancy modelling approach based on multiple temporal samples was used
to estimate detection probabilities, and to assess the influence of water temperature,
stream flow levels, and number of water samples collected on Redside Dace detection
(MacKenzie et al. 2002). The detection probability associated with electrofishing was not
specifically modelled, so the detection ability of the gears was compared using naïve
detections. A site was considered positive for Redside Dace eDNA presence if at least
one of the nine sampling replicates had Redside Dace DNA. The approach estimates the
probability of site occupancy and detection probability (the probability of detecting a
species in an individual survey or sample if it is present) using maximum likelihood
procedures (MacKenzie et al. 2002). Detection probabilities can be more accurately
estimated by adding covariates and using the logistic formula exp(XB)/(1+exp(XB)),
where X represents covariate data, and B represents the vector of model parameters
(MacKenzie et al. 2002). Detection probability is an important statistic for monitoring
programs because it allows one to determine the conditions under which sampling would
be most effective and the amount of sampling effort required to increase chances of
detection. Single-season occupancy modelling assumes that: 1) during the sampling
period, the site is closed to occupancy changes, 2) false positives do not occur, 3)
detecting a species at one site is independent of detecting a species at another site, and 4)
probability of occupancy and detection are constant across sites or are modelled as a
function of covariates (MacKenzie et al. 2002). For this study, it was assumed that
occupancy is constant across sites.
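A minimal sketch of the single-season occupancy likelihood of MacKenzie et al. (2002), for one site with a constant per-sample detection probability, is given below together with the logistic link. It only illustrates the structure of the model that PRESENCE fits; the function names are hypothetical, and the sketch assumes no missing samples.

```python
import math


def logistic(xb: float) -> float:
    """Logistic link: exp(XB) / (1 + exp(XB))."""
    return math.exp(xb) / (1.0 + math.exp(xb))


def detection_probability(covariates, coefficients, intercept=0.0):
    """Detection probability for one site from its covariates and model coefficients."""
    xb = intercept + sum(x * b for x, b in zip(covariates, coefficients))
    return logistic(xb)


def site_likelihood(history, psi, p):
    """Likelihood of one site's detection history (list of 0/1 values).

    psi is the probability that the site is occupied and p is the probability
    of detection in a single sample, given occupancy.
    """
    detections = sum(history)
    k = len(history)
    occupied = psi * p ** detections * (1.0 - p) ** (k - detections)
    if detections == 0:
        # An all-zero history arises from an occupied site that was missed
        # every time, or from an unoccupied site.
        return occupied + (1.0 - psi)
    return occupied
```

The full likelihood is the product of `site_likelihood` over all sites, and the maximum likelihood estimates are the values of psi and p (or the coefficients behind p) that maximize it.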
Program PRESENCE 6.2 (Hines 2006) was used to estimate detection
probabilities and standard errors. Three copies/reaction were used as a threshold for a
positive detection in individual samples (see Appendix 2.5), because it incorporates all
the negatives that had a reading of greater than 0 copies/reaction. To examine the
repeatability of sampling results within a season using eDNA, detection probabilities
were calculated at the 11 repeated sites using a null model (no covariates) during spring
week one and week two, and fall week one and week two (objective i). Additionally,
candidate sets of four models including detection covariates were tested at three, four, and
five temporal replicates during the spring and the fall to look at between season
repeatability (objective ii). The models included in each candidate set were: 1) constant
detection probability (null model, detection probability is the same across all surveys and
sites), p(.); 2) detection probability with water temperature as a site-specific covariate,
p(temp); 3) detection probability with an index of flow as a covariate p(flow); and 4)
detection probability with water temperature and flow as covariates, p(temp+flow).
Correlations between the two covariates were assessed using Spearman’s rank correlation
rho, using the stats package in R (R Core Team 2013). Temperature was chosen as
a covariate because it could influence factors that break down/preserve DNA; I predicted
that an increase in temperature would result in lower detection probability. An index of
flow (depth x wetted width) was chosen as a covariate because environmental DNA
concentrations are considered to negatively covary with flow (Klymus et al. 2014).
Estimated occupancy and detection probabilities were obtained for models analyzed using
PRESENCE. An information theoretic approach was used to compare competing models
based on Akaike’s Information Criterion corrected for small sample sizes (AICc). The
number of sampled sites was used as the sample size when calculating AICc (e.g., n = 29
for the spring and fall sampling seasons). Goodness of fit was assessed using the global
model in each candidate set (p(temp+flow)), using the Pearson chi-square statistic and
10,000 bootstraps; overdispersion in the data was assessed by estimating ĉ (MacKenzie
and Bailey 2004). Where there was overdispersion (ĉ > 1), Quasi-AIC corrected for small
sample size (QAICc) was used for model selection.
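The model selection criteria named above follow standard formulas, sketched here for reference. This is not PRESENCE output: the function names are hypothetical, and whether an extra parameter is counted for estimating ĉ differs between treatments, so the sketch uses the parameter count exactly as supplied.

```python
import math


def small_sample_correction(n_params: int, n_sites: int) -> float:
    """Second-order correction term 2k(k + 1) / (n - k - 1)."""
    return 2.0 * n_params * (n_params + 1) / (n_sites - n_params - 1)


def aicc(log_likelihood: float, n_params: int, n_sites: int) -> float:
    """AIC corrected for small sample size."""
    return -2.0 * log_likelihood + 2.0 * n_params + small_sample_correction(n_params, n_sites)


def qaicc(log_likelihood: float, n_params: int, n_sites: int, c_hat: float) -> float:
    """Quasi-AICc: the deviance term is divided by the overdispersion estimate c-hat."""
    return -2.0 * log_likelihood / c_hat + 2.0 * n_params + small_sample_correction(n_params, n_sites)


def akaike_weights(scores):
    """Relative support for each model in a candidate set (weights sum to 1)."""
    best = min(scores)
    relative = [math.exp(-0.5 * (s - best)) for s in scores]
    total = sum(relative)
    return [r / total for r in relative]
```

Model-averaged estimates of p can then be formed by weighting each model's estimate by its Akaike weight.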
The detection probability of the spatial sampling could not be directly modelled
due to the spatial nature of the replicates. Therefore the spatial and temporal sampling
schemes were directly compared by examining detection histories for both methods at
sites where Redside Dace were detected (objective iii). To compare the efficiencies of fall
electrofishing and eDNA samples, the number of unique sites with Redside Dace
detections for each gear type, and the number of sites with detections by both sampling
methods were determined (objective iv). The minimum number of temporal water
samples needed to be 95% confident of detecting Redside Dace when present at a
particular site was calculated using the formula 1-[1-p]^k, where k represents the number of
replicates, and p represents the model-averaged detection probability (Pellet and Schmidt
2005). Minimum sample effort was calculated for each season and each replicate
scenario.
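The cumulative detection formula can be solved for the smallest number of replicates that reaches a chosen confidence level. The sketch below shows that arithmetic; the function names are hypothetical, and the detection probabilities in the usage lines are illustrative values rather than the model-averaged estimates.

```python
import math


def cumulative_detection(p: float, k: int) -> float:
    """Probability of at least one detection in k samples: 1 - (1 - p)^k."""
    return 1.0 - (1.0 - p) ** k


def min_replicates(p: float, confidence: float = 0.95) -> int:
    """Smallest k for which 1 - (1 - p)^k reaches the requested confidence."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p))


print(min_replicates(0.65))  # 3
print(min_replicates(0.80))  # 2
```

Lower per-sample detection probabilities increase the required effort quickly: at p = 0.5, five samples are needed for the same 95% confidence.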
Results
The fall eDNA samples (n = 316) had a mean value of 13.0 copies/reaction (based on
triplicate mean per sample), a maximum value of 1156.9 copies/reaction, and a median of
1.3 copies/reaction. The spring eDNA samples (n = 316) had a mean value of 8.8
copies/reaction, a maximum value of 154.8 copies/reaction, and a median value of 2.8
copies/reaction. These values were obtained for all water samples, including those with
no detections. Overall, 69% of the qPCR negative controls (field blanks, lab negatives,
lake controls) contained no Redside Dace DNA; however, the samples that did have a
copy number reading contained minute amounts of DNA. Control samples (n = 258) had
a maximum copy number of 3.3 copies/reaction at the filter control, and had an overall
mean copy number of 0.09 copies/reaction, standard deviation of 0.3, and a median of 0
copies/reaction (Figure 2.3).
High variability was found between qPCR sample triplicates for both control and
environmental samples. The coefficient of variation (CV) (n = 890) had a mean of 57.7%
(median = 36.7%, max = 173%) (Figure 2.4). The mean copy number of samples (n =
88) having a CV value of greater than 150% was 0.2 copies/reaction (median=0.2
copies/reaction, max=1.9 copies/reaction). Real-time PCR results from eDNA standards
of known concentrations indicated a decrease in the accuracy of qPCR to quantify DNA
copy numbers within a reaction, as copy number decreased (Figure 2.5). The variance in
qPCR copy number increased as copy number decreased, as indicated by the greater
variation (larger boxplot spread) at 10 copies/reaction and 1 copy/reaction. At a test
concentration of 1 template copy/reaction, the control assay was unable to detect Redside
Dace in two of the seven replicates. The qPCR output for all controls had a reading of 0
copies/reaction. A comparison of all standards used to test environmental samples
indicated high overlap between the Ct values of the 10 copies/reaction and 1 copy/reaction
standards (Figure 2.6), and 34% of the 1 copy/reaction standards failed to amplify.
Redside Dace eDNA was detected at 18 of 29 sites sampled during fall, and 16 of
29 sites sampled during spring using only eDNA monitoring. Redside Dace were detected
at 13 sites in both spring and fall and were detected at 5 unique sites in the fall and at 3
unique sites in the spring using only eDNA detection. Electrofishing detected
Redside Dace at only 14 of 29 sites sampled in the fall; eDNA was able to detect Redside Dace
at 7 sites that had no detections with electrofishing, while electrofishing had 3 sites with
detections that were not detected by eDNA. Redside Dace were detected using both
sampling methods at 11 of 29 sites (See Appendix 2.6).
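The gear comparison amounts to simple set arithmetic on the sites with detections. The sketch below reproduces the fall tallies using placeholder integer labels; the labels are hypothetical and do not correspond to the actual sites listed in Appendix 2.6.

```python
def compare_detections(edna_sites: set, electrofishing_sites: set) -> dict:
    """Count sites detected by both methods and by each method alone."""
    return {
        "both": len(edna_sites & electrofishing_sites),
        "edna_only": len(edna_sites - electrofishing_sites),
        "electrofishing_only": len(electrofishing_sites - edna_sites),
        "either": len(edna_sites | electrofishing_sites),
    }


# Placeholder labels only: 18 eDNA-positive and 14 electrofishing-positive sites.
edna = set(range(1, 19))
electrofishing = set(range(8, 19)) | {19, 20, 21}
summary = compare_detections(edna, electrofishing)
```

With these counts the two methods share 11 sites, eDNA adds 7 and electrofishing adds 3, so Redside Dace were detected by at least one method at 21 of the 29 sites in the fall.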
The number of sites with positive detections differed between sampling weeks
(sampling separated by approximately 10 d) within the same season (Figure 2.7). Of the
11 sampled sites, six were positive for Redside Dace during fall week one while only
three of these same sites tested positive for Redside Dace during fall week two. The number
of positive detections was higher during spring, with seven sites testing positive for
Redside Dace during week one and nine positives during week two (with all sites that
tested positive in week one testing positive in week two). At two sites, Redside Dace
eDNA was not detected during either season or either week (Figure 2.7). The
inconsistency between sampling weeks is also reflected by differences in estimated
detection probabilities. Fall water sampling during week one resulted in a detection
probability of 0.63 ± 0.09. The detection probability increased by approximately 0.40 to 1.0
during week two, implying perfect detection (Figure 2.8). Spring detection probabilities
were more similar between weeks (week one detection probability = 0.68 ± 0.08; week
two detection probability = 0.80 ± 0.06) (Figure 2.8).
Detection probabilities for eDNA sampling were consistently high, ranging from
0.62 to 0.90 depending on the model and season (Table 2.2). There were no significant correlations
between flow index and water temperature during spring (rs = -0.1, p > 0.05) or fall (rs = -0.2,
p > 0.05). None of the candidate sets showed evidence of lack of fit (i.e., p > 0.05) and
only the candidate set for fall samples with five replicates showed signs of overdispersion
(ĉ = 1.39). For each model that was tested in both seasons, spring detection probability
was always higher than fall detection probability. For example, the highest model-averaged
estimate of p during the fall was 0.73 (± 0.096), while the lowest model-averaged estimate
of p during the spring was 0.74 (± 0.081). Within a season, the importance of covariates
for both spring and fall detection probabilities was variable. During the fall season, across the
replicate scenarios, the null model was always the best or second-ranked model (Appendix
2.7). For the spring season, the temperature model and the additive model of
temperature and flow were the best models (Appendix 2.7). A plot of individual site
detection probabilities against flow and temperature indicates that detection probabilities
increase as flow decreases and temperature increases. Detection probabilities were lowest
at temperatures below 13 ºC (Figure 2.9).
As the number of temporal samples increased, detection probability decreased,
although there was an increase in the estimated occupancy rates for both fall and spring.
At three replicates for the spring null model, there was an occupancy estimate of 0.52 (±
0.09) and a detection probability of 0.89 (± 0.05), while there was an occupancy estimate
of 0.58 (± 0.09) and a detection probability of 0.80 (± 0.04) at five temporal replicates. A
similar pattern was present during the fall sampling season. At three replicates for the fall
null model, there was an occupancy estimate of 0.45 (± 0.09) and detection probability of
0.76 (± 0.07), while at five temporal replicates there was an occupancy of 0.53 (± 0.09)
and detectability of 0.65 (± 0.06) (Table 2.2). Standard errors of detection probability
estimates tended to be slightly smaller for the models with additional temporal replicates.
Using the formula 1-[1-p]^k and model-averaged estimates of p for each replicate during
each season, two or three temporal samples were needed to be 95% confident of detecting
Redside Dace when present at a site (Appendix 2.7).
A comparison of water samples collected over a short time period (temporal) at
single locations versus along a stream reach (spatial) indicated that both approaches
provided similar results in the fall. Spatial sampling performed slightly better in the
spring. With one water sample, 13 sites tested positive for Redside Dace eDNA in the
spring for both temporal and spatial sampling (Figure 2.10). In the fall, one temporal
replicate resulted in 12 positive detections and one spatial replicate resulted in 10 positive
detections. When the number of samples increased to the fourth collection replicate, both
temporal and spatial sampling in the spring detected Redside Dace at 15 sites. In the fall,
temporal and spatial sampling detected Redside Dace at 13 sites and 15 sites, respectively
(Figure 2.10).
Discussion
A successful monitoring program includes efficient survey design, statistical
power needed to detect change, and sensitive methodology (Legg and Nagy 2006). Major
advantages of using eDNA as a monitoring tool are that it is non-intrusive to target and
non-target species and their habitats, sampling can be done without specialized or costly
field equipment, and volunteers are able to participate in field collections with limited
expertise (Portt et al. 2006, Darling and Mahon 2011, Biggs et al. 2015). Despite the
increased use of eDNA monitoring (Ficetola et al. 2008, Darling and Mahon 2011, Barnes
et al. 2014), few studies have explored basic sampling procedures needed to be confident
of a species’ absence or presence at a particular site (Schmidt et al. 2013, Ficetola et al.
2014). This study has helped to fill in these knowledge gaps by intensively collecting
water samples spatially, temporally and seasonally throughout 29 sites in watersheds
where Redside Dace are known to be present.
Environmental DNA detection probabilities differed between the fall and spring
sampling seasons, as well as between weeks within the same season. Spring had a higher
number of sites with positive detections, as well as a higher detection probability estimate
compared to the fall. These results were contrary to my initial prediction, in which I
expected that the higher water flow during the spring would have created a DNA dilution
effect, thereby reducing the genetic signal present within the water. The lack of a dilution
effect could result from the limited differences in stream flow between seasons. The
results from this study were, however, consistent with those of Jane et al.
(2014), who found that eDNA copy number was steadier across distances at high flow than
at low flow. Additionally, different sampling weeks within the same season (approximately
10 days apart) had varying detection probability estimates, although all were
relatively high. While estimates were fairly consistent between weeks during the spring,
the fall detection probability had both the highest and lowest detection values of the 4
weeks (Figure 2.8). A limitation of these detection probability estimates is that sample
size was low (n = 11), which could bias the estimates. Additionally, differences in detection
probability may be the result of changes in Redside Dace abundance between sampling
weeks, changes in occupancy status (for example, the closure assumption may be
violated, and the weeks could represent different sampling seasons), or it could be the
result of other factors not measured. Detection probability might be higher in the spring
as a result of spawning season (Scott and Crossman 1973), which would cause higher fish
activity and higher local abundances of adult fish, and therefore more cells would be shed
into the environment (Klymus et al. 2014).
Environmental DNA had the highest detection probability at three temporal
replicates, while highest occupancy rates were obtained at five temporal replicates. A
tradeoff occurs between sampling more sites and obtaining more replicates at a site, and
one must be able to determine if the added effort at one location is worth both the time
and the cost (Gibbs et al. 1998). Based on the results from this study, it is recommended
that three temporal replicates be sampled at each site. The extra effort to collect five
temporal samples only resulted in an additional one and two sites with positive detections
during the spring and fall, respectively, and therefore the additional efforts could be
aimed at either targeting other reaches within a stream, or sampling additional
watersheds.
The spatial replicate sampling was more effective than temporal replicates for
total number of positive detections; however, the differences between the two sampling
approaches were not substantial. A study of how eDNA varies horizontally and vertically
within a water column determined that high eDNA variability does exist over a smaller
spatial scale (~10 metre intervals), and that before more extensive sampling for aquatic
species occurs, initial surveys should employ fine-scale collections (Eichmiller et al.
2014, Laramie et al. 2015). In the context of my study, the main advantage of using
temporal replicates would be to obtain detection probability estimates (since spatial
replicates are more difficult to model, and they are likely sampling a larger total sample
site). However, temporal replicates at a single location may fail to detect DNA from
Redside Dace further upstream that are detected by upstream spatial replicates within the
site. Taking spatial replicates would (i) allow DNA to be collected over a wider
spatial range, given that DNA is unevenly distributed throughout the waterbody, and (ii)
shorten sample collection, because no time lapse is needed between spatial
replicates other than the time required to move upstream with sampling gear.
Despite a strong correspondence of sites with Redside Dace detections via
electrofishing and eDNA monitoring, eDNA was able to detect Redside Dace at a greater
number of sites. This finding is consistent throughout the literature, as demonstrated
within the Slackwater Darter (Etheostoma boschungi) (Janosik and Johnston 2015),
Eastern Hellbender (Cryptobranchus a. alleganiensis) (Spear et al. 2015), and Great
Crested Newt (Triturus cristatus) (Biggs et al. 2015) study systems. A point worth noting
is that eDNA did not detect Redside Dace at three sites where the species was detected by
electrofishing. This could be the result of low-density populations, an example being
Lynde Creek, which has been the focus of several conservation efforts (Redside Dace
Recovery Team 2010). Therefore, both methods have their own advantages and
disadvantages, and instead of one method replacing the other, the two should be able to
complement each other. Important considerations when using traditional gear are the
seasonal bias, age class bias, and difficulty standardizing efforts when using multiple gear
types (Pope and Willis 1996, Bonvechio et al. 2008, Meye and Ikomi 2012, Fischer and
Quist 2014).
Conclusion
The major findings of the study were (i) spring sampling resulted in higher
detection probabilities than fall sampling, (ii) eDNA repeatability was variable between
sampling weeks of the same season, (iii) a minimum of three replicates was needed at a
site to ensure confidence that Redside Dace were detected when present, (iv) results were
comparable for temporal versus spatial sampling, (v) eDNA detected Redside Dace at
more sites than electrofishing, and (vi) detection probability using eDNA was consistently
high. Despite the increased use of eDNA for invasive species, its application to species at
risk has been largely lacking. This study demonstrated that eDNA is a sensitive tool for
detecting Redside Dace; however, sampling design is key to increasing the chances
of detecting a species. Based on these results, a recommendation for future monitoring efforts
that can be extended to other cyprinids and stream fishes would be to use electrofishing as
a first pass to monitoring. An eDNA signal may be obtained where electrofishing fails
because the fish are upstream of the sampling site, or because low population densities
make the fish difficult to catch. Although eDNA had a higher success rate of detecting
Redside Dace, its logistical costs exceeded those of electrofishing, which favours
electrofishing as a first pass; eDNA should be implemented if traditional gear is unable
to detect a species’ presence (see
Appendix 2.8). Beyond budgetary considerations, traditional gear has the benefit of
physically verifying species presence. If Redside Dace are not detected at a site using
traditional gear, eDNA sampling should be used as a second approach because of its
potential for increased sensitivity. Precise documentation of an endangered species’
distribution is critical not only for their protection, but also for the protection of other
species that occupy similar habitats. It is also important to be able to recognize true
absences, so that project developers are able to use the land without worry of future
restrictions if an endangered species were to be found after development. With limited
resources available for the recovery of a species, comprehensive knowledge of their
distribution is critical so that appropriate efforts can be invested on locations that contain
the target species.
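As an illustration of the replicate recommendation in this Conclusion, the following minimal Python sketch computes the cumulative probability of at least one detection across n replicates, 1 - (1 - p)^n. It assumes that replicates are independent and share a constant per-replicate detection probability p. The function names and the 0.95 confidence target are illustrative assumptions; they are not taken from the occupancy analysis in this chapter. The example values of p (0.76 for fall and 0.89 for spring) are the ψ(.)p(.) estimates at three replicates in Table 2.2.

```python
def cumulative_detection(p: float, n: int) -> float:
    """Probability of at least one detection across n independent replicates."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n < 0:
        raise ValueError("n must be non-negative")
    return 1.0 - (1.0 - p) ** n


def replicates_needed(p: float, confidence: float = 0.95, max_n: int = 1000) -> int:
    """Smallest n whose cumulative detection probability reaches `confidence`.

    The 0.95 default is an assumed target, not a value from the thesis.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be in (0, 1)")
    for n in range(1, max_n + 1):
        if cumulative_detection(p, n) >= confidence:
            return n
    raise ValueError("confidence not reached within max_n replicates")


if __name__ == "__main__":
    # Per-replicate detection probabilities from Table 2.2 (null model, R = 3)
    for season, p in (("fall", 0.76), ("spring", 0.89)):
        print(season, replicates_needed(p), round(cumulative_detection(p, 3), 3))
```

Under these assumptions, a per-replicate detection probability of 0.76 yields a cumulative detection probability of about 0.986 after three replicates.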
Although this study contributes to the growing field of
eDNA, much remains to be learned. Future efforts could focus on (i) determining how other
factors such as turbidity and overhanging vegetation influence Redside Dace occupancy,
(ii) adding more sampling sites so that within-season detection probability can be more
thoroughly investigated, and (iii) investigating the effectiveness of eDNA to detect other
stream fishes of management concern, such as Brook Trout (Salvelinus fontinalis). While
more information needs to be collected before eDNA can be used routinely, it is a
sensitive tool for species monitoring and its use should continue to be explored for other
aquatic endangered species and systems.
References
Barnes MA, Turner CR, Jerde CL, Renshaw MA, et al. (2014) Environmental conditions
influence eDNA persistence in aquatic systems. Environmental Science & Technology,
48, 1819–1827.
Biggs J, Ewald N, Valentini A, Gaboriaud C, et al. (2015) Using eDNA to develop a
national citizen science-based monitoring programme for the great crested newt (Triturus
cristatus). Biological Conservation, 183, 19–28.
Bonvechio TF, Pouder WF, Hale MM (2008) Variation between electrofishing and otter
trawling for sampling Black Crappies in two Florida Lakes. North American Journal of
Fisheries Management, 28, 188–192.
Bronnenhuber JE, Wilson CC (2013) Combining species-specific COI primers with
environmental DNA analysis for targeted detection of rare freshwater species.
Conservation Genetics Resources, 5, 971–975.
Campbell SP, Clark JA, Crampton LH, Guerry AD, et al. (2002) An assessment of
monitoring efforts in endangered species recovery plans. Ecological Applications, 12,
674–681.
COSEWIC (2007) COSEWIC assessment and updated status report on the Redside Dace
Clinostomus elongatus in Canada. Committee on the Status of Endangered Wildlife in
Canada. Ottawa. vii + 59 pp. (www.sararegistry.gc.ca/status/status_e.cfm).
Daniels RA, Wisniewski SJ (1994) Feeding ecology of redside dace, Clinostomus
elongatus. Ecology of Freshwater Fish, 3, 176-183.
Darling JA, Mahon AR (2011) From molecules to management: adopting DNA-based
methods for monitoring biological invasions in aquatic environments. Environmental
Research, 111, 978–988.
Eichmiller JJ, Bajer PG, Sorensen PW (2014) The relationship between the distribution of
common carp and their environmental DNA in a small lake. PLoS ONE, 9, e112611.
Ficetola GF, Miaud C, Pompanon F, Taberlet P (2008) Species detection using
environmental DNA from water samples. Biology letters, 4, 423–425.
Ficetola GF, Pansu J, Bonin A, Coissac E, et al. (2014) Replication levels, false
presences, and the estimation of presence / absence from eDNA metabarcoding data.
Molecular Ecology Resources, 15, 543–556.
Fischer JR, Quist MC (2014) Gear and seasonal bias associated with abundance and size
structure estimates for lentic freshwater fishes. Journal of Fish and Wildlife Management,
5, 394–412.
Frankham R, Ballou JD, Briscoe DA (2002) Introduction to Conservation Genetics.
Cambridge, UK: Cambridge University Press. 617 p.
Gibbs JP, Droege S, Eagle P (1998) Monitoring populations of plants and animals.
American Institute of Biological Sciences, 48, 935–940.
Goldberg CS, Sepulveda A, Ray A, Baumgardt J, Waits P (2013) Environmental DNA as
a new method for early detection of New Zealand mudsnails (Potamopyrgus
antipodarum). Freshwater Science, 32, 792–800.
Hebert PDN, Cywinska A, Ball SL, deWaard JR (2003) Biological identifications through
DNA barcodes. Proceedings of the Royal Society of London. Series B, 270, 313–321.
Houston DD, Shiozawa DK, Riddle BR (2010) Phylogenetic relationships of the western
North American cyprinid genus Richardsonius, with an overview of phylogeographic
structure. Molecular Phylogenetics and Evolution, 55, 259-273.
Heid C, Stevens J, Livak JK, Williams PM (1996) Real time quantitative PCR. Genome
Research, 6, 986–994.
Hines JE (2006) PRESENCE2 - Software to estimate patch occupancy and related
parameters. USGS-PWRC. http://www.mbr-pwrc.usgs.gov/software/presence.html.
Ivanova NV, Zemlak TS, Hanner RH, Hebert PDN (2007) Universal primer cocktails for
fish DNA barcoding. Molecular Ecology Notes, 7, 544–548.
Jane S, Taylor WM, McKelvey KS, Young MK (2014) Distance, flow and PCR
inhibition: eDNA dynamics in two headwater streams. Molecular Ecology Resources, 15,
216-227.
Janosik AM, Johnston CE (2015) Environmental DNA as an effective tool for detection
of imperiled fishes. Environmental Biology of Fishes, 98, 1889–1893.
Jerde CL, Chadderton WL, Mahon AR, Renshaw MA, et al. (2013) Detection of Asian
carp DNA as part of a Great Lakes basin-wide surveillance program. Canadian Journal of
Fisheries and Aquatic Sciences, 70, 522–526.
Klymus KE, Richter CA, Chapman DC, Paukert C (2014) Quantification of eDNA
shedding rates from invasive bighead carp Hypophthalmichthys nobilis and silver carp
Hypophthalmichthys molitrix. Biological Conservation, 183, 77-84.
Laramie MB, Pilliod DS, Goldberg CS (2015) Characterizing the distribution of an
endangered salmonid using environmental DNA analysis. Biological Conservation, 183,
29–37.
Legg CJ, Nagy L (2006) Why most conservation monitoring is, but need not be, a waste
of time. Journal of environmental management, 78, 194–199.
MacKenzie DI, Nichols JD, Lachman GB, Droege S, et al. (2002) Estimating site
occupancy rates when detection probabilities are less than one. Ecology, 83, 2248-2255.
MacKenzie DI, Bailey LL (2004) Assessing the fit of site-occupancy models. Journal of
Agricultural, Biological, and Environmental Statistics, 9, 300-318.
MacKenzie DI, Nichols JD, Royle JA, Pollock KH (2006) Occupancy Estimation and
Modeling: Inferring Patterns and Dynamics of Species Occurrence. Elsevier, Burlington,
MA.
Mahon AR, Barnes MA, Li F, Egan SP, et al. (2013) DNA-based species detection
capabilities using laser transmission spectroscopy. Journal of the Royal Society Interface,
10, 20120637.
Mandrak NE, Crossman EJ (1992) A Checklist of Ontario Freshwater Fishes. Royal
Ontario Museum, Toronto, ON.
Meye JA, Ikomi RB (2012) Seasonal fish abundance and fishing gear efficiency in river
Orogodo, Niger Delta, Nigeria. World Journal of Fish and Marine Sciences, 4, 191–200.
Nielsen JL (1998) Scientific sampling effects : electrofishing California’s endangered fish
populations. Fisheries, 23, 6–12.
Novinger DC, Coon TG (2000) Behavior and physiology of the Redside Dace,
Clinostomus elongatus, a threatened species in Michigan. Environmental Biology of
Fishes, 57, 315–326.
Parker B, McKee P, Campbell RR (1987) COSEWIC status report on the redside dace
Clinostomus elongatus in Canada. Committee on the Status of Endangered Wildlife in
Canada. Ottawa. 1-20pp.
Parker BJ, McKee P, Campbell RR (1988) Status of the redside dace, Clinostomus
elongatus, in Canada. Canadian Field Naturalist, 102, 163-169.
Pellet J, Schmidt BR (2005) Monitoring distributions using call surveys: estimating site
occupancy, detection probabilities and inferring absence. Biological Conservation, 123,
27-35.
Poos M, Lawrie D, Tu C, Jackson DA, Mandrak NE (2012) Estimating local and
regional population sizes for an endangered minnow, redside dace (Clinostomus
elongatus), in Canada. Aquatic Conservation: Marine and Freshwater Ecosystems, 22,
47–57.
Pope KL, Willis DW (1996) Seasonal influences on freshwater fisheries sampling data.
Reviews in Fisheries Science, 4, 57–73.
Portt CB, Coker GA, Ming DL, Randall RG (2006) A review of fish sampling
methods commonly used in Canadian freshwater habitats. Canadian Technical Report of
Fisheries Aquatic Sciences. 2604 p.
R Core Team (2013) R: A language and environment for statistical computing. R
Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0, URL
http://www.R-project.org/.
Redside Dace Recovery Team (2010) Recovery Strategy for Redside Dace (Clinostomus
elongatus) in Ontario. Ontario Recovery Strategy Series. Prepared for the Ontario
Ministry of Natural Resources, Peterborough, Ontario. vi + 29 pp.
Reid SM, Jones NE, Yunker G (2008) Evaluation of single-pass electrofishing and rapid
habitat assessment for monitoring Redside Dace. North American Journal of Fisheries
Management, 28, 50–56.
Schmidt BR, Kery M, Ursenbacher S, Hyman OJ, Collins JP (2013) Site occupancy
models in the analysis of environmental DNA presence/absence surveys: a case study of
an emerging amphibian pathogen. Methods in Ecology and Evolution, 4, 646–653.
Scott WB, Crossman EJ (1973) Freshwater Fishes of Canada. Fisheries Research Board
of Canada, 184.
Serrao NR, Steinke D, Hanner RH (2014) Calibrating snakehead diversity with DNA
barcodes: expanding taxonomic coverage to enable identification of potential and
established invasive species. PLoS ONE, 9, e99546.
Spear SF, Groves JD, Williams LA, Waits LP (2015) Using environmental DNA
methods to improve detectability in a hellbender (Cryptobranchus alleganiensis)
monitoring program. Biological Conservation, 183, 38–45.
Stanfield L (2013) Ontario Stream Assessment Protocol. Version 9.0.
Fisheries Policy Section. Ontario Ministry of Natural Resources. Peterborough, Ontario.
505 Pages.
Takahara T, Minamoto T, Yamanaka H, Doi H, Kawabata Z (2012) Estimation of fish
biomass using environmental DNA. PLoS ONE, 7, e35868.
Thompson WL (2004) Sampling Rare or Elusive Species: Concepts and Techniques for
Estimating Population Parameters. Island Press, Washington DC, USA.
Thomsen PF, Kielgast J, Iversen LL, Wiuf C, et al. (2012) Monitoring endangered
freshwater biodiversity using environmental DNA. Molecular Ecology, 21, 2565–2573.
Victor BC, Hanner R, Shivji M, Hyde J, Caldow C (2009) Identification of the larval and
juvenile stages of the Cubera Snapper, Lutjanus cyanopterus, using DNA barcoding.
Zootaxa, 2215, 24–36.
Wilcox TM, McKelvey KS, Jane SF, Lowe WH, et al. (2013) Robust detection of rare
species using environmental DNA: the importance of primer specificity. PLoS ONE, 8,
e59520.
Wozney KM, Wilson PJ (2012) Real-time PCR detection and quantification of elephantid
DNA: Species identification for highly processed samples associated with the ivory
trade. Forensic Science International, 219, 106-112.
Wilson CC, Dextrase AJ (2008) Sampling protocols for Redside Dace. Prepared for the
Ontario Ministry of Natural Resources, Peterborough, Ontario. vi + 4 pp.
Yoccoz NG (2012) The future of environmental DNA in ecology. Molecular
Ecology, 21(8), 2031-2038.
Table 2.1: Mean, standard deviation, maximum, and minimum values of environmental
variables for 29 sites sampled for eDNA testing for Redside Dace.
                            Spring                            Fall
Habitat Characteristic      Mean   SD     Min    Max          Mean   SD     Min    Max
Mean Channel Width (m)      3.54   1.83   1.02   7.20         3.66   2.00   0.86   7.80
Mean Water Depth (m)        0.24   0.08   0.07   0.40         0.20   0.07   0.09   0.36
Index of Flow (m²)          0.90   0.64   0.12   2.64         0.75   0.53   0.09   2.09
Temperature (ºC)            15.8   2.2    12.0   20.2         14.4   3.3    7.6    20.6
Conductivity (μS)           674    244.6  183.0  1302.0       852.4  289.7  490.0  1514.0
SD = standard deviation.
Table 2.2: Estimates of Redside Dace detection probability and occupancy and ΔAICc
values from models for spring and fall field seasons (horizontal headings), at temporal
sampling (R) of 3, 4, and 5 replicates.
                       Fall                                      Spring
R  Model               ψ (± SE)        p (± SE)        ΔAICc     ψ (± SE)        p (± SE)        ΔAICc
3  ψ(.)p(.)            0.45 (0.094)    0.76 (0.07)     0         0.52 (0.0929)   0.89 (0.050)    2.26
   ψ(.)p(temp)         0.49 (0.10)     0.69 (0.093)    0.22      0.5 (0.0929)    0.81 (0.087)    0
   ψ(.)p(flow)         0.45 (0.094)    0.77 (0.11)     2.35      0.52 (0.093)    0.90 (0.06)     3.82
   ψ(.)p(temp+flow)    0.49 (0.10)     0.71 (0.11)     2.38      0.56 (0.099)    0.84 (0.076)    0.24
4  ψ(.)p(.)            0.44 (0.092)    0.77 (0.060)    1.29      0.55 (0.092)    0.86 (0.044)    7.53
   ψ(.)p(temp)         0.48 (0.10)     0.70 (0.082)    0         0.59 (0.09)     0.79 (0.069)    1.14
   ψ(.)p(flow)         0.45 (0.092)    0.76 (0.092)    3.56      0.55 (0.093)    0.85 (0.059)    7.26
   ψ(.)p(temp+flow)    0.48 (0.1014)   0.70 (0.10)     2.69      0.58 (0.099)    0.78 (0.088)    0
5  ψ(.)p(.)            0.52 (0.093)    0.65 (0.056)    0.56      0.58 (0.092)    0.80 (0.044)    8.19
   ψ(.)p(temp)         0.52 (0.093)    0.66 (0.077)    2.14      0.59 (0.093)    0.76 (0.077)    5.72
   ψ(.)p(flow)         0.53 (0.096)    0.62 (0.085)    0         0.58 (0.092)    0.79 (0.058)    5.23
   ψ(.)p(temp+flow)    0.53 (0.096)    0.63 (0.10)     1.76      0.61 (0.097)    0.73 (0.081)    0
ψ = occupancy; p = detection probability; SE = standard error.
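As an aid to reading the ΔAICc column of Table 2.2, the following minimal Python sketch converts ΔAICc values into Akaike weights, w_i = exp(-Δ_i / 2) / Σ_j exp(-Δ_j / 2), which express the relative support for each model within a candidate set. This is a supplementary illustration and was not part of the analysis reported in this chapter; the function name and the ASCII model labels are assumptions introduced here. The example uses the fall ΔAICc values at three temporal replicates.

```python
import math


def akaike_weights(delta_aicc):
    """Convert ΔAICc values for one candidate model set into Akaike weights."""
    if not delta_aicc:
        raise ValueError("at least one ΔAICc value is required")
    # Relative likelihood of each model given the data
    relative_likelihoods = [math.exp(-0.5 * d) for d in delta_aicc]
    total = sum(relative_likelihoods)
    # Normalise so that the weights sum to one
    return [r / total for r in relative_likelihoods]


# Fall season, three temporal replicates (ΔAICc values from Table 2.2);
# "psi" stands in for ψ in these illustrative labels.
fall_r3 = {
    "psi(.)p(.)": 0.0,
    "psi(.)p(temp)": 0.22,
    "psi(.)p(flow)": 2.35,
    "psi(.)p(temp+flow)": 2.38,
}
weights = dict(zip(fall_r3, akaike_weights(list(fall_r3.values()))))

if __name__ == "__main__":
    for model, w in weights.items():
        print(f"{model:<20} {w:.3f}")
```

Models separated by small ΔAICc values receive similar weights, which is why the null and temperature models in this example are close in support.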
Figure 2.1: Map of 29 Redside Dace eDNA sampling sites from Fall 2012 and Spring
2013 sampling season (grey circles), 10 lake negative control sampling sites (black
triangles) from Spring 2013, and Otonabee River field blank (star) to help establish
detection threshold.
Figure 2.2: Plot of log10 transformed template DNA copy number (x-axis) versus
dilutions for four Redside Dace eDNA samples in order to test for inhibition at four
sampling locations (LC1= Lynde Creek 1, LC2= Lynde Creek 2, MC1= Mitchell Creek 1,
MC2= Mitchell Creek 2).
Figure 2.3: Histogram of negative controls copy numbers/reaction of amplified Redside
Dace eDNA (x-axis) versus frequency (y-axis) for four types of: (a) filter control (n=168,
x̅ =0.091, s=0.288), (b) lake control (n=32, x̅ =0.048, s=0.13), (c) DNA extraction control
(n=31, x̅=0.081, s=0.29), (d) field control (n=27, x̅=0.18, s=0.39).
Figure 2.4: Scatter plot for mean copy number /reaction of each sample run in triplicates
(y-axis) versus the coefficient of variation of those values (CV; x-axis) (left) and
histogram of CV versus the frequency of samples that fall under the CV (right).
Figure 2.5: Boxplot of qPCR standards with known DNA concentrations (1000 copies/
reaction down to 1 copy/ reaction) set as “eDNA unknowns” versus copy number log10
transformed (y-axis), as a test for qPCR accuracy.
Figure 2.6: Boxplot of Redside Dace standards (10^6 down to 10^0 copies/reaction) at the
threshold cycle (Ct) where the copy number passes the baseline threshold for (A) standards
with data points omitted (data points for the standard curve were removed to improve the
R² value) and (B) all standards (no data points excluded).
Figure 2.7: Barplot of total temporal Redside Dace detections (x-axis) found at each of the eleven sampled sites (y-axis). Five
temporal replicates were collected at each season (fall and spring) twice, with a time lapse of approximately 10 d between sampling
weeks within a season. Site labels on y-axis are listed in Appendix 2.1.
[Figure 2.7 plot. x-axis: Number of Detections (0 to 5); y-axis: sites D1, L1, L2, DU1, DU2, DU3, P1, F1, F2, F3, R2; series: Fall 1, Fall 2, Spring 1, Spring 2.]
Figure 2.8: Detection probabilities (y-axis) within seasons (x-axis) for Fall Week 1
(FW1), Fall Week 2 (FW2), Spring Week 1 (SW1), and Spring Week 2 (SW2) (error bars
represent upper and lower 95% confidence limits of estimates).
Figure 2.9: Individual site detection probability estimates for index of flow versus
detection probability (top), and temperature versus detection probability (bottom), during
Spring at 5 replicates.
[Figure 2.9 plots. Top panel: Detection Probability (y-axis) versus Flow index (m²) (x-axis). Bottom panel: Detection Probability (y-axis) versus Temperature (ºC) (x-axis).]
Figure 2.10: A comparison of the number of sites (out of n=29) with Redside Dace DNA detections (x-axis), versus the number of
replicates sampled (y-axis), for a) the four spatially replicated samples collected at each site and b) the four temporally replicated
samples collected at each site in each season.
[Figure 2.10 plot. x-axis: Number of Sites (0 to 16); y-axis: Number of Replicates (one to four); legend: Spatial-Spring, Temporal-Spring, Spatial-Fall, Temporal-Fall.]
Chapter 3: Conservation genetics of Redside Dace (Clinostomus elongatus):
phylogeography and contemporary spatial structure
Abstract
Redside Dace Clinostomus elongatus (Teleostei: Cyprinidae) is a species of
conservation concern that is declining throughout its range as a result of urban
development and agricultural activities. The purpose of this study was to use
mitochondrial and microsatellite data to characterize genetic diversity for Redside Dace
to understand how past and current events have shaped their genetic relationships.
Phylogeographic structure among 28 Redside Dace populations throughout Ontario and
the United States was assessed by sequence analysis of the mitochondrial cytochrome b
and ATPase 6 and 8 genes. Populations were also genotyped using 10 microsatellite loci
to examine genetic diversity within and among populations as well as contemporary
spatial structuring. Mitochondrial DNA data revealed three geographically distinct
lineages, which were highly concordant with the three groups identified via microsatellite
analysis. Additionally, secondary contact was observed within the Allegheny River and
tributaries to Lake Ontario. The refugial groups in this study differed from the one
refugium (Mississippian), and two refugia (Mississippian and Atlantic) hypotheses
presented in the literature. With the exception of three allopatric populations within the
Allegheny watershed, high genetic structuring between populations suggests their
isolation, indicating that recovery efforts should be population-based.
Introduction
Contemporary species’ distributions have largely been influenced by historical
environmental changes during the Quaternary (Hocutt and Wiley 1986, Hanfling et al.
2002, Gum et al. 2005). Glaciation events during the Pleistocene played an important role
in influencing evolutionary history; in North America, most of Canada and New England
was repeatedly covered by ice sheets (Hocutt and Wiley 1986, Overpeck et al. 1992).
During cycles of glacial advance and retreat, species now found in formerly glaciated
areas were repeatedly displaced into peripheral (usually southern) refugia in order to
survive (Hocutt and Wiley 1986). These historical events have had a profound impact on
contemporary species’ distributions and genetic structure and diversity within species
(Bernatchez and Wilson 1998, Hewitt 2004). More recently, anthropogenic influences
have also had profound impacts on contemporary species distributions (Wang et al. 2001,
Leidy et al. 2011). Anthropogenic influences can negatively impact aquatic environments
by restricting fish dispersal, causing habitat loss, and degrading water quality (Helfman
2007). This includes the construction of dams that impede fish migration, the removal of
riparian vegetation which increases turbidity and sedimentation into waters, and increased
agricultural activities resulting in the release of harmful chemicals into the water
(Helfman 2007). Disentangling how past and more recent events have influenced fish
distribution can be a daunting task; however, it is necessary to understand how historic
and contemporary processes have contributed to observed patterns.
In many cases, genetic information can improve our understanding of species
biology, ecology, and behaviour, and can help inform management decisions. Many
studies have demonstrated that freshwater fishes have retained genetic historical
signatures that reflect glacial influences and events (Bernatchez and Wilson 1998); this
can provide important information on postglacial colonization routes and evolutionary
lineages for management and conservation (Wilson and Hebert 1998, Ginson et al. 2015).
At a more contemporary scale, conservation genetics provides information on
evolutionary lineages within species, their genetic diversity, migration levels and
population structure, which can aid in the identification of appropriate conservation and
management units for a species of concern (Frankham et al. 2002, McDermid et al. 2011).
This information can also be used to assess the genetic consequences of natural or
anthropogenic factors that impact population health and viability (Avise 2000, Frankham
et al. 2002). Genetic information at multiple temporal scales, along with information on
distribution range, abundances, and critical habitat designation are some of the
considerations required to conserve rare species (Frankham et al. 2002, MacKenzie et al.
2006, Helfman 2007).
The Redside Dace, Clinostomus elongatus (Teleostei: Cyprinidae), is a small
freshwater fish that typifies many of these conservation concerns and information needs.
Redside Dace are stream fish that are generally found in pools, and occupy a disjunct
distribution throughout the upper Mississippi River Drainage, Great Lakes Basin, Ohio
River and upper Susquehanna River (Novinger and Coon 2000, COSEWIC 2007). Within
Canada, Redside Dace populations are largely restricted to southern Ontario, with the
exception of one northern population near Sault Ste. Marie (Redside Dace Recovery
Team 2010). Populations have been declining throughout their range as a result of
urbanization and agricultural activities, and 40% of Ontario populations are thought to be
extirpated (Parker et al. 1988, COSEWIC 2007, Redside Dace Recovery Team 2010). In
1973, Redside Dace was identified as a species of conservation concern on the grounds
that they were less common than 30 years prior (Scott and Crossman 1973). In 1987,
Redside Dace was designated as being of Special Concern by the Committee on the
Status of Endangered Wildlife in Canada (COSEWIC) (Parker et al. 1988) and reassessed
as Endangered in 2007 (COSEWIC 2007). The species was listed as Endangered under
Ontario’s Endangered Species Act in 2009 (OMNRF 2015). Within Ontario, urban
development is a major threat to populations in watersheds draining into western Lake
Ontario (Redside Dace Recovery Team 2010). Research to date has focused attention on
habitat associations, monitoring techniques, threats, and approaches to augmentation; by
contrast, only limited information exists on its historical origins and genetic structuring
(Berendzen and Dugan 2008, Houston et al. 2010, Redside Dace Recovery Team 2010,
Sweeten 2012).
Redside Dace were able to persist in glacial refugia and subsequently recolonize
Ontario; however, competing hypotheses exist in the literature of one versus two refugia.
Based on distributional data, Hocutt and Wiley (1986) suggested that Redside Dace
colonized their contemporary range from a single (Mississippian) glacial refugium,
whereas Mandrak and Crossman (1992) suggested that contemporary populations may
have originated from two (Atlantic and Mississippian) refugia. There exists the potential
for more than two glacial refugia, as these hypotheses are only based on distributional
data; the use of genetic data can provide insight into distinct lineages that can be used to
infer number of refugial groups. Additionally, large scale phylogeography has never been
applied across the Redside Dace range, and can be informative for detecting structure,
ancestry and relationships within a species range (Avise 2000). It is worth noting that
multiple refugial groups are a common interpretation (Mandrak and Crossman 1992,
Soltis et al. 2006, Ginson et al. 2015) among other southern Ontario freshwater fishes. A
better understanding of glacial refugia and postglacial dispersal routes should provide
valuable insights into how these affected present patterns in genetic diversity and
structure of Redside Dace, as well as other species occupying similar ranges.
Few studies have looked at contemporary and historical genetic structuring of
Redside Dace populations. Houston et al. (2010) assessed molecular systematic
relationships within a subset of the cyprinid family based on mitochondrial (mtDNA) and
nuclear data, and resolved Clinostomus as the sister group to the Richardsonius-Iotichthys
clade, refuting previous interpretations that grouped Clinostomus as a sister
group to Richardsonius. They also used mtDNA to infer that C. elongatus and C.
funduloides diverged approximately 2.6 million years ago, and determined the mutation
rate of cytochrome b to be ~1.7% per million years (Houston et al. 2010). Berendzen et
al. (2008) assessed genetic patterns of Redside Dace using cytochrome b within and
among several tributaries of the upper Mississippi River, and found shallow patterns of
genetic divergences among the three tributaries. When the same populations were assessed
using microsatellite analysis, high among-population variation was observed, indicating
that conservation efforts would need to be drainage-based (Berendzen et al. 2008). More
recently, Pitcher et al. (2009) designed a set of eight microsatellite loci for Redside Dace
and its congener (C. funduloides) to be used in characterizing population structuring and
genetic diversity within Redside Dace. These loci have subsequently been used to assess
mate choice and reproductive success in captive Redside Dace (Beausoleil et al. 2012),
but have yet to be applied to spatial or conservation genetic questions. A comprehensive
understanding of how Redside Dace declines throughout Ontario may affect local and
regional levels of genetic diversity can be used to select candidate source populations
for re-introductions, and to examine the effects of inbreeding depression
and fragmentation. Examining the global and local variation in genetic diversity of
Redside Dace has been identified as a research priority in the Ontario Redside Dace
Recovery Strategy (Redside Dace Recovery Team 2010).
In this study, I assessed the phylogeography and contemporary genetic structure
and diversity of Clinostomus elongatus across its North American range, with an
emphasis on Ontario populations. Sequence analysis of mtDNA was used to look at large-
scale phylogeography and infer post-glacial dispersal patterns, while microsatellite data
were used to characterize contemporary genetic structuring, patterns of gene flow, and
genetic diversity levels within and among sampled populations. My major study
objectives were to (i) identify phylogeographic lineages based on mtDNA sequencing in
order to determine the number of glacial refugia from which contemporary populations
originated, and (ii) characterize the genetic diversity and structure of Redside Dace
populations in Ontario and across its range using microsatellite loci. Using mtDNA, a
single refugium for all Redside Dace during the Pleistocene would be supported by
shallow mtDNA lineages with low bootstrap support and little or no spatial structuring
(Avise 2000, Maggs et al. 2008). Alternatively, if Redside Dace dispersed after glaciation
from more than one refugium, I would predict that multiple divergent genetic clusters
with high bootstrap support and largely distinct geographic distributions would be
detected. For microsatellite data, strong contemporary genetic structuring among Redside
Dace populations would be expected because of their specific habitat requirements
(Novinger and Coon 2000) and limited dispersal abilities between watersheds (Poos and
Jackson 2012).
Methods
Field Sampling
Redside Dace samples were obtained across the species’ range. Samples were
collected intensively throughout Ontario and broadly across other parts of their range in
order to test contrasting postglacial origin hypotheses as well as to assess how the spatial
genetic structuring of the Ontario populations compare with that throughout the rest of
their range. Samples were collected across Ontario using a bag seine or backpack
electrofisher (Table 3.1; Figure 3.1). Eleven populations were sampled within Ontario
during 2012 and 2013 and DNA samples were collected. The buccal swab technique was
used to collect DNA samples when more than 30 fish were caught during the first
sampling day (Reid et al. 2012). The caudal fin clip technique was used to collect DNA
samples when fewer than 30 fish were caught during the first sampling day. Fin clipping
ensured that individuals from populations that were sampled repeatedly (in order to obtain
target sample sizes) were not resampled. DNA tissue samples from 21 populations across
other parts of its range were provided by collaborators in the United States including the
Mississippi River, Great Lakes, Ohio River, Allegheny River, Monongahela River and
Susquehanna River basins in order to assess genetic diversity across the global range
(Table 3.1).
DNA extraction
DNA from two samples was used for primer optimization of microsatellite and
mitochondrial loci, and was extracted using Qiagen DNeasy Blood and Tissue kits
(QIAGEN); all other samples were extracted using a simple lysis and isopropanol method
(Sambrook et al. 1989). For the kit extractions, DNA was extracted from fin clips following the
manufacturer’s instructions with some exceptions: the samples were lysed at 65 ºC in the
incubator overnight; after adding AW2, each sample was spun at 13,000 x g for 5 min;
and sample DNA was eluted in low TE (10 mM Tris pH 8, 0.1 mM EDTA).
DNA from fin clips and buccal swabs was extracted in 96-well plates using an
isopropanol protocol with approximately 250 µL of
1x TNES lysis buffer (50 mM Tris pH 8, 100 mM NaCl, 1 mM EDTA, 1% SDS weight
per volume) and 1 mg proteinase K. Plates were sealed with a silicon mat and placed in a
37 ºC incubator overnight to lyse samples.
After incubation, samples were centrifuged for one min at 1030 Relative
Centrifugal Force (RCF) in order to spin condensation down to the well bottom. The
silicon mat was removed, and 10 µL of 5 M NaCl was placed in each well, followed by
500 µL of 80% isopropanol. The plate was then resealed and centrifuged at 2360 RCF for
45 min. The supernatant was discarded, 1000 µL of 70% EtOH was added to each well,
the plate was resealed and quickly vortexed, and then centrifuged at 2360 RCF for 45
min. The supernatant was again discarded, and the deep well plate was dried in a 65 ºC
incubator for approximately 30 to 40 minutes. The DNA pellet at the bottom of the plate
was eluted in 150 µL of sterile 1x TE buffer (10 mM Tris pH 8, 1 mM EDTA). The plate
was resealed, vortexed, and incubated overnight at 4 ºC.
Gel Visualization
After overnight incubation, the resuspended DNA was transferred into a stock
plate. To determine if DNA was successfully extracted, samples were visualized using a
1.5% agarose gel; 2 µL of 2X SYBR Green was added to 2 µL of DNA, centrifuged
down to combine SYBR Green with DNA, and the 4 µL mixture was added into each
well. A DNA mass ladder (BioShop) was included in each row of the gel as a size
standard (3 µL of mass ladder plus 3 µL of SYBR Green). The gel was run at 95 V for
approximately 1.5 hours, and a picture was taken using ultraviolet (UV) photoimaging.
Depending on band strength, samples were diluted with varying volumes of low TE
(10 mM Tris pH 8, 0.1 mM EDTA) to make 100 µL of working DNA solutions of
approximately 6 ng/µL.
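The dilution to a working concentration follows the usual C1V1 = C2V2 relationship. The sketch below is only illustrative: stock concentrations were judged from band strength and are not reported, so the 30 ng/µL stock and the function name are assumptions.

```python
def dilution_volumes(stock_ng_per_ul, target_ng_per_ul=6.0, final_volume_ul=100.0):
    """Return (stock volume, low TE volume) in µL for a C1V1 = C2V2 dilution."""
    if stock_ng_per_ul < target_ng_per_ul:
        raise ValueError("stock is already more dilute than the target")
    stock_volume = target_ng_per_ul * final_volume_ul / stock_ng_per_ul
    return stock_volume, final_volume_ul - stock_volume

# Hypothetical stock of 30 ng/µL (illustrative value, not from the study)
stock_ul, te_ul = dilution_volumes(30.0)
print(f"{stock_ul:.1f} µL stock + {te_ul:.1f} µL low TE")
```

A stock weaker than the 6 ng/µL target cannot be diluted to it, so the function raises an error in that case.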
mtDNA PCR amplification
The ATPase 6 and 8, and cytochrome b mitochondrial genes were used for
assessing phylogeographic structure in Redside Dace. The primer sequences used to
amplify ATPase 6 and 8 were ATP 8.2_L8331 (5’-AAAGCRTYRGCCTTTTAAGC-3’)
and CO3.2_H9236 (5’-GTTAGTGGTCAKGGGCTTGGRTC-3’) (Sivasundar et al.
2001). Each 10 µL PCR reaction consisted of 2 µL template DNA [6 ng/µL], 1 µL BSA
[200 ng/µL], 0.2 µL dNTPs [10 mM], 2 µL 10X Buffer (with 15 mM MgCl2), 0.4 µL
MgCl2 [25 mM], 0.2 µL of each primer [10 mM], 0.05 µL Taq [5 units/µL], and 3.95 µL
ddH2O. PCR cycling conditions for ATPase 6 and 8 were an initial hot start of 94 ºC for 3
min, followed by 30 cycles of denaturation at 94 ºC for 1 min, annealing at 56 ºC for 90 s
and extension at 72 ºC for 90 s, with a final extension at 72 ºC for 1 min. For cytochrome
b, the primers used to amplify the fragment were LA-a (5’-GTGACTTGAAAAACCACCGTT-3’)
and HA-a (5’-CAACGATCTCCGGTTTACAAGAC-3’) (Dowling and Naylor
1997). Each 10 µL PCR reaction consisted of 2 µL template DNA [6 ng/µL], 1 µL BSA
[200 ng/µL], 0.2 µL dNTPs [10 mM], 2 µL 10X Buffer (with 15 mM MgCl2), 0.4 µL
MgCl2 [25 mM], 0.2 µL of each primer [10 mM], 0.05 µL Taq [5 units/µL], and 3.95 µL
ddH2O. The PCR cycling conditions for this gene region were an initial hot start of 94 ºC
for 8 min, followed by 35 cycles of denaturation at 95 ºC for 30 s, annealing at 50 ºC for
30 s and extension at 72 ºC for 90 s, with a final extension at 72 ºC for 7 min. PCR
products were visualized on a 1.5% agarose gel to determine amplification success.
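The per-reaction volumes listed for the mtDNA PCRs can be checked against the stated 10 µL total and scaled to a plate-sized master mix. This is a minimal sketch: the 10% pipetting overage, the function name, and the choice to add template separately are assumptions, not part of the protocol described above.

```python
# Per-reaction volumes (µL) as listed in the text for the mtDNA PCRs
REACTION_UL = {
    "template DNA": 2.0,
    "BSA": 1.0,
    "dNTPs": 0.2,
    "10X buffer": 2.0,
    "MgCl2": 0.4,
    "forward primer": 0.2,
    "reverse primer": 0.2,
    "Taq": 0.05,
    "ddH2O": 3.95,
}

def master_mix(n_reactions, overage=0.10):
    """Scale every component except template DNA.

    `overage` is an assumed safety margin for pipetting loss.
    """
    factor = n_reactions * (1.0 + overage)
    return {name: round(vol * factor, 2)
            for name, vol in REACTION_UL.items() if name != "template DNA"}

# Confirm the listed components add up to the stated reaction volume
total = round(sum(REACTION_UL.values()), 2)
print("per-reaction total (µL):", total)

for name, vol in master_mix(96).items():
    print(f"{name}: {vol} µL")
```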
Sequencing
Post-PCR clean-up was done using ExoSAP to remove excess primers and
dNTPs. Before sequencing, 8 µL of PCR product had the following mixture added to it:
0.9 µL Antarctic Phosphatase Buffer, 5,000 units/mL Antarctic Phosphatase, and 20,000
units/mL Exonuclease (New England Biolabs). Reactions were placed in the
thermocycler with the following conditions: 37 ºC for 15 min, 80 ºC for 15 min followed
by a cool-down to 10 ºC.
Sequencing reactions were carried out in 12 µL reactions with each well
containing the following quantities of reagents: 0.5 µL of BigDye dye terminator mix 3.1
(Applied Biosystems), 1 µL of 5X buffer, 9 µL of ddH2O and 0.5 µL of PCR
product. With PCR reactions that generated faint bands, the reaction quantities were
changed to 1.0 µL of PCR product and 8.5 µL of ddH2O. PCR products were sequenced
in both directions using amplification primers as well as internal primers 8.3
(5’-AAYCCTGARACTGACCATG-3’) for ATPase 6 and 8, and LDrs
(5’-CCATTTGTCATCGCCGGTGC-3’) and HDrs (5’-GGGTTATTTGACCCTGTTTCGT-3’)
for cytochrome b (Dowling and Naylor 1997, Houston et al. 2010). The PCR plate was
then placed into the thermocycler under the following conditions: an initial hot start of 96
°C for 2 min, followed by 30 cycles of denaturation at 96 °C for 30 s, annealing at 55 °C
for 15 s, and an extension at 60 °C for 4 min. Following the cycle sequencing reaction,
ethanol precipitation was done. 1.1 µL of 5 M sodium acetate was heated to 37 °C prior to
being added to the reaction, followed by 37 µL of 95% ethanol. The plate was vortexed
and placed in the centrifuge at 2360 RCF for 45 min. The supernatant was discarded, 150
µL of 70% ethanol was added to each well, and the plate was centrifuged at 2360 RCF for
45 min. Post centrifugation, the supernatant was discarded and the plate was dried in a 65
°C incubator. Once dried, the pellet was reconstituted in 10 µL of HiDi formamide
(Applied Biosystems), and sequenced product was visualized on an ABI 3730 sequencer.
Microsatellite PCR amplification
For microsatellite amplification, primers for ten published loci (Dimsoski et al.
2000, Bessert and Orti 2003, Pitcher et al. 2009) were optimized to be run in multiplex
(see Appendix 3.1 for list of primers). Each PCR reaction consisted of 2 µL template
DNA [6 ng/µL], 1 µL BSA [200 ng/µL], 0.2 µL dNTPs [10 mM], 2 µL 10X Buffer (with
15 mM MgCl2), 0.05 µL Taq [5 units/µL], and remainder ddH2O depending on number
of primers added to the reaction with a total volume of 10 µL. Primer concentrations for
multiplex reactions were as follows: Multiplex 1- RSD 86 [0.3 µM], RSD 42A [0.3 µM],
RSD 70 [0.3 µM], RSD 2-91 [0.3 µM]; Multiplex 2- RSD 142 [0.05], RSD 179 [0.05];
Multiplex 3- RSD 2-58 [0.3 µM], CA 12 [0.3 µM]; Multiplex 4- CA11 [0.3 µM],
Ppro118 [0.3 µM]. Multiplexes MP1 and MP2 had an initial hot start of 94 ºC for 10 min,
followed by 30 cycles of denaturation at 92 ºC for 1 min, annealing at 54 ºC for 1 min and
extension at 72 ºC for 1 min 30 s, with a final extension at 72 ºC for 7 min. Multiplexes
MP3 and MP4 had an initial hot start of 94 ºC for 10 min, followed by 30 cycles of
denaturation at 92 ºC for 1 min, annealing at 57 ºC for 1 min and extension at 72 ºC for 90
s, with a final extension at 72 ºC for 7 min. Post PCR, 1 mL of HiDi was mixed with 4 µL
of 350 ROX size standard (Applied Biosystems). 10 µL of the HiDi formamide (Applied
Biosystems) and 350 ROX mixture was pipetted into each well of a 96-well plate and 1
µL of PCR product was added. The plate was spun down and used for genotyping using
an ABI 3730 sequencer.
Data Analysis
Mitochondrial Data
Sequencher v.4.05 (GeneCodes) was used to trim primers, assemble and manually
edit bidirectional sequences from raw electropherogram “trace” files. Sequence data were
aligned using MEGA v.6.06 (Tamura et al. 2007). Basic haplotype information was
obtained using DnaSP v.5.10.01, including the number of unique haplotypes, variable
sites, and haplotype and nucleotide diversity for each gene region (Librado and Rozas
2009). Haplotype richness was calculated for ATPase 6 and 8, and cytochrome b, in
HAPLOTYPE ANALYSIS v.1.05 (Eliades and Eliades 2009) using a sample size of ten
for cytochrome b, and nine for ATPase 6 and 8, to account for differences in sample size
across populations. Mutational differences among haplotypes were assessed in PopART
v.1.7.2 (http://popart.otago.ac.nz) using minimum spanning networks (Bandelt et al.
1999). Dendrograms were created in MEGA v.6.06 (Tamura et al. 2007) for neighbour-
joining (p-distance) (Saitou and Nei 1987), and maximum parsimony with 500
bootstrap replicates (Felsenstein 1985). Mutational networks and genetic distance
dendrograms were created for ATPase 6 and 8, and cytochrome b, separately, as well as
for the combination of the two gene regions (referred to as “total evidence” hereafter). In
order to test various refugial hypotheses, the populations were grouped into a priori
subsets to examine variation present at various hierarchical levels. An analysis of
molecular variance (AMOVA) was implemented in Arlequin 3.5 (Excoffier and Lischer
2010) to determine how the variation was partitioned.
Additionally, to determine time of divergence for Redside Dace cytochrome b
lineages, the C. elongatus - C. funduloides divergence of 2.6 million years and a rate
of 1.7% per million years (Houston et al. 2010) were applied to cytochrome b haplogroups
detected in this study.
Microsatellite Data
Genotype data were manually scored using GeneMapper v.4.1 (Applied
Biosystems). Microchecker v.2.2.3 (Van Oosterhout et al. 2004) was used to test for null
alleles and genotyping errors in the dataset. Genetic polymorphism levels were calculated
for number of alleles (Na), observed heterozygosity (HO), expected heterozygosity (HE),
inbreeding coefficient (FIS), and Hardy-Weinberg equilibrium using GenAlEx
v.6.5 (Peakall and Smouse 2006). To account for sample size differences, HP-Rare v.1.1
(Kalinowski 2005) was used to calculate allelic richness standardized to a sample size of
10 individuals or 20 alleles. Genepop v4.2 on the web (http://genepop.curtin.edu.au/) was
used to determine if linkage disequilibrium occurred among genotyped loci (Raymond
and Rousset 1995). Pairwise population FST values were calculated with Arlequin v.3.5
(Excoffier and Lischer 2010), and statistical significance was based on 10,000
dememorization steps.
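Standardizing allelic richness to a common sample size is usually done by rarefaction. The sketch below uses the standard rarefaction formula for the expected number of alleles in a subsample of g gene copies; it is not HP-Rare's code, and the allele counts and function name are hypothetical.

```python
from math import comb

def rarefied_allelic_richness(allele_counts, g=20):
    """Expected number of distinct alleles in a random subsample of g gene copies.

    allele_counts: number of copies of each allele observed at one locus
    in one population (their sum is the number of gene copies sampled).
    """
    n = sum(allele_counts)
    if g > n:
        raise ValueError("subsample size exceeds the number of gene copies sampled")
    # For each allele: probability that at least one copy is drawn
    return sum(1 - comb(n - c, g) / comb(n, g) for c in allele_counts)

# Hypothetical allele counts at one locus (30 diploid fish = 60 gene copies)
counts = [40, 15, 4, 1]
print(round(rarefied_allelic_richness(counts, g=20), 3))
```

Rare alleles contribute less than one expected allele each, which is why standardized richness is lower than the raw allele count in large samples.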
Individual-based analysis was first run to determine if population structuring
exists, followed by population-based analysis to examine genetic differentiation between
groups. STRUCTURE, a Bayesian-based clustering program, was used to assign
individuals to clusters (K). The program was run with a 50,000 step burn-in period,
followed by 50,000 Markov Chain Monte Carlo (MCMC) resampling steps, with a total
of 8 iterations per value of K. Allele frequencies were assumed to be uncorrelated, and the no
admixture model was selected because the populations under study are distinct (Pritchard
et al. 2000). The K value was set from 1 (panmixia) to 29 (indicating discrete population
structuring for all sampled groups). Additionally, STRUCTURE was re-run based on the
populations that clustered together when K=3 (one western and two eastern refugial
groups). The objective of these runs was to characterize population structure at regional
and local scales. Cluster 1 was run from K = 1 to 10, Cluster 2 from K = 1 to 10, and
Cluster 3 from K = 1 to 20. Structure Harvester v.0.6.94 (Earl and vonHoldt 2011)
was used to visualize the optimal number of clusters using the data generated from
STRUCTURE with (i) maximum likelihood for K and (ii) highest second order rate of
change using ∆K (Evanno et al. 2005). CLUMPP v.1.1.2 (Jakobsson and Rosenberg
2007) was used to collate independent run results from STRUCTURE, which were then
visualized using program DISTRUCT v.1.1 (Rosenberg 2007). Principal Coordinate
Analysis (PCoA) was run using GenAlEx (Peakall and Smouse 2006). The purpose of
this test was to use an ordination method to visualize genetic distances among sites
sampled. POPTREE2 (Takezaki et al. 2010) was used to generate genetic distance
dendrograms using neighbour joining of DA genetic distance measure (Nei et al. 1983).
The DA genetic distance measure has been shown to construct precise and accurate
dendrograms under many evolutionary models (Takezaki et al. 2010). Lastly, a
hierarchical FST analysis was completed in Arlequin v.3.5 to account for the
subpopulation substructure variation by running an AMOVA. This was run to examine
groupings among (i) eastern versus western groups, (ii) clustering according to K=3 for
STRUCTURE output and PCoA analysis, and (iii) current drainage distributions (Great
Lakes, Upper Mississippi, Ohio River, and Susquehanna). A Mantel test was run in
GenAlEx to test for a significant relationship between genetic and geographic distance
between all population pairs, using pairwise FST values and geographic distances based
on map coordinates, with a total of 999 permutations. This relationship was then graphed
with ln(geographic distance + 1) on the x-axis and linearized genetic distance,
FST/(1 − FST), on the y-axis (Rousset 1997).
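The logic of the Mantel test can be sketched as follows: correlate the two sets of pairwise values, then permute the site labels of one matrix to build a null distribution. This is a plain-Python illustration, not GenAlEx's implementation; the matrices are hypothetical, and the one-tailed p-value convention is an assumption.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def upper(matrix, order):
    """Upper-triangle values of a square matrix after reordering rows/columns."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]

def mantel(fst, km, permutations=999, seed=1):
    """One-tailed Mantel test of FST/(1 - FST) against ln(distance + 1)."""
    n = len(fst)
    gen = [[fst[i][j] / (1 - fst[i][j]) for j in range(n)] for i in range(n)]
    geo = [[math.log(km[i][j] + 1) for j in range(n)] for i in range(n)]
    ident = list(range(n))
    geo_vec = upper(geo, ident)
    r_obs = pearson(upper(gen, ident), geo_vec)
    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        perm = ident[:]
        rng.shuffle(perm)  # relabel sites in the genetic matrix only
        if pearson(upper(gen, perm), geo_vec) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)

# Hypothetical pairwise matrices for five sites (illustrative values only)
fst = [
    [0.00, 0.10, 0.18, 0.30, 0.45],
    [0.10, 0.00, 0.15, 0.28, 0.40],
    [0.18, 0.15, 0.00, 0.20, 0.35],
    [0.30, 0.28, 0.20, 0.00, 0.25],
    [0.45, 0.40, 0.35, 0.25, 0.00],
]
km = [
    [0, 12, 60, 240, 700],
    [12, 0, 50, 230, 690],
    [60, 50, 0, 180, 640],
    [240, 230, 180, 0, 460],
    [700, 690, 640, 460, 0],
]
r, p = mantel(fst, km)
print(f"Mantel r = {r:.3f}, p = {p:.3f}")
```

Whole rows and columns are permuted together because pairwise distances are not independent observations; shuffling individual cells would understate the p-value.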
Results
Mitochondrial polymorphism
A total of 312 individuals were successfully sequenced for ATPase 6, ATPase 8,
and cytochrome b. Twenty-three unique haplotypes were identified from ATPase 6 and 8
sequences obtained from 338 Redside Dace individuals (Table 3.2). No gaps were present
in the data. Of the 806 nucleotide positions in this gene region, 29 were variable
(polymorphic), with 25 parsimony-informative sites, and 4 singleton sites. Haplotype
diversity within populations ranged from 0 to 0.667, and nucleotide diversity ranged from
0 to 0.00127 (Table 3.3). Allegheny River drainage populations WOO and BHR had the
highest haplotype diversities, but also had larger sample sizes (Table 3.3). These results
were consistent when haplotype richness was calculated, standardized for a sample size of
ten. For cytochrome b, 35 unique haplotypes were identified from 327 Redside Dace
individuals. Three sequences were removed from the analysis due to ambiguity in base
calling. Of the 1101 nucleotide positions in this region, no gaps were present, and 41 sites
were variable, of which 33 were parsimony informative, while 8 were singleton sites
(Table 3.4). The overall haplotype diversity for all sequenced individuals was 0.897,
while nucleotide diversity was 0.003. Haplotype diversity within populations ranged from
0 to 0.838, while nucleotide diversity ranged from 0 to 0.005 (Table 3.3). Twelve
populations were characterized by a single haplotype. The highest haplotype and
nucleotide diversities were associated with three Allegheny River populations (DOD,
BHR, and EBM) (Table 3.3). These results were consistent when haplotype richness was
calculated, standardized to a sample size of ten.
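For readers unfamiliar with the statistic, haplotype diversity is commonly computed with Nei's unbiased estimator, h = n/(n − 1) × (1 − Σp²). The sketch below assumes that estimator; the haplotype labels and function name are hypothetical.

```python
from collections import Counter

def haplotype_diversity(haplotypes):
    """Nei's unbiased haplotype diversity for a list of haplotype labels."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sequences")
    freqs = [count / n for count in Counter(haplotypes).values()]
    return n / (n - 1) * (1 - sum(p * p for p in freqs))

# Hypothetical haplotype labels for two populations
fixed = ["H1"] * 8                 # a single haplotype
mixed = ["H1", "H1", "H2", "H3"]   # three haplotypes
print(haplotype_diversity(fixed), round(haplotype_diversity(mixed), 3))
```

A population carrying a single haplotype scores 0, which corresponds to the lower bound of the ranges reported above.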
For the total evidence analysis, ATPase 6 and 8 and cytochrome b sequences from
312 individuals were combined to create a 1908 base pair fragment. This resulted in 47
unique haplotypes with 37 parsimony informative sites and 10 singletons (Table 3.5).
Mutational Network and Distance Based Analysis
For ATPase, a mutation network analysis identified the presence of two
haplogroups, A and B, which were separated by a minimum of six mutational steps
(Figure 3.2). The TTR population was represented by haplotype 9, which was separated
from the two haplogroups by a minimum of five mutational steps. Haplogroup A was
represented by 18 unique haplotypes with a maximum of five mutations between any two
haplotypes, while B was represented by 4 unique haplotypes, with a maximum of three
mutations between any two haplotypes. The two groups were supported by high genetic
structuring between the eastern and western groups, with high bootstrap support (77%)
separating haplogroup A from haplogroup B (Figure 3.3). Phylogeographical structuring was
observed, with haplogroup A representing an eastern distribution, and haplogroup B
representing a western distribution (Figure 3.2). Haplotypes 1, 3, 9, and 15 were the most
common haplotypes found in multiple drainages. Groups were defined as being
separated from each other by a minimum of six mutational steps and having high
bootstrap support (>70%) within the neighbour joining dendrogram.
For cytochrome b, a mutation network analysis revealed the presence of two
haplogroups (labelled C and D) which were separated by a minimum of seven mutational
steps (Figure 3.4). Haplogroup C was represented by 26 unique haplotypes with a
maximum of six mutations between any two haplotypes, while haplogroup D was
represented by nine unique haplotypes with a maximum of three mutations between any
two haplotypes. The two groups had high bootstrap support (99%) (Figure
3.5). Haplogroup D was restricted to the Allegheny and Ohio River drainages, whereas
haplogroup C was present within all sampled drainages. Haplotypes 2, 9, and 19 were the
most common haplotypes observed. Based on the molecular clock estimated by Houston
et al. (2010) of 1.7% per million years for cytochrome b, the genetic distance of 0.9%
observed between cytochrome b haplogroups C and D would suggest their divergence
took place approximately 0.5 million years ago.
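The divergence estimate is a simple ratio of observed divergence to clock rate. The sketch below reproduces that arithmetic, assuming (as the calculation above implies) that the 1.7% figure is applied directly to the observed between-haplogroup distance; the function name is illustrative.

```python
def divergence_time_my(divergence_pct, rate_pct_per_my):
    """Time since divergence (million years) from percent divergence and clock rate."""
    return divergence_pct / rate_pct_per_my

# Values from the text: 0.9% between haplogroups C and D, 1.7% per million years
t = divergence_time_my(0.9, 1.7)
print(f"{t:.2f} million years")  # about 0.53, i.e. roughly 0.5
```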
For the total evidence data (ATPase 6 and 8 and cytochrome b combined), a mutation
network analysis revealed the presence of three composite haplogroups (labelled 1, 2, and
3) which were separated by a minimum of five mutational differences between each
group (Figure 3.6; Appendix 3.4). Haplogroup 1 was the most common and was
represented by 27 unique haplotypes, haplogroup 2 was represented by 8 unique
haplotypes, and haplogroup 3 was represented by 11 unique haplotypes. The three groups
were supported by high bootstraps (>70%), and haplotype 7 did not fall within any of
the lineages and therefore grouped on its own (Figure 3.6, 3.7). A map of the three
haplogroups and their distributions indicated that haplogroup 2 was restricted to the
western portion of the species range (upper Mississippi River), while haplogroups 1 and 3
were only observed in its eastern range (Figure 3.8). Haplogroup 1 was found exclusively
within the Ohio River, Susquehanna River, and Great Lakes drainages, and haplogroup 3
was found exclusively within the Ohio River watershed. The distributions of the two
eastern haplogroups overlapped within the Allegheny River watershed, a headwater
system of the Ohio River drainage (Figure 3.8).
Microsatellite polymorphism
All ten microsatellite loci were polymorphic across all populations. Within
individual populations, the percentage of polymorphic loci ranged from 60% to 100%, with a
mean of 88.6% and a standard error of 1.9% (Appendix 3.2). The number of alleles at
each locus ranged from 11 to 27 (mean = 17.5). The mean number of alleles per locus
within a population was 3.8 when standardized (at 20 alleles) and 4.8 when uncorrected.
Across all populations, linkage disequilibrium was found between loci RSD2-91 and
RSD42A, and between CA11
and RSD142 (p<0.05). With few exceptions, pairwise comparisons within populations
were not significant, and therefore all loci were used for analysis. Most loci by population
tests (93%) were in Hardy-Weinberg equilibrium (Appendix 3.3). When null alleles were
analyzed on a population-by-population basis using MICROCHECKER, there was no
evidence for null alleles in at least 26 of 28 populations analyzed, and all loci were used
in subsequent analysis.
Allelic richness and observed heterozygosity were greatest in eastern mid-latitude
Redside Dace populations, specifically those in the Allegheny River (WOO, EBM, BHR,
and DOD) (Table 3.6). The lowest values were found in Lake Huron tributary populations
in Ontario (TTR, SAU, and GUL). High levels of Redside Dace genetic diversity were
also measured from some of the Ohio River drainage populations (EFI and BRU) at the
more southern part of the species’ range. Populations on the north shore of Lake Ontario
tributaries had low to moderate genetic diversity levels, while the populations found in
the western portion of its range were characterized by low levels of diversity (Table 3.6).
Population Structure Analysis
Hierarchical STRUCTURE analysis confirmed population differentiation at
range-wide, regional, and local scales. The number of clusters (K) identified within
STRUCTURE was 17 using the likelihood [Pr(X/K)] values (Figure 3.9), while the
number of clusters using the Evanno et al. (2005) ΔK method was 2 (Figure 3.9).
STRUCTURE was used at a range-wide level to confirm the unique lineages identified
via mtDNA analysis (Figure 3.10). A K=1 value, which would result if Redside Dace
were panmictic, was not supported within STRUCTURE because it had the lowest
likelihood [Pr(X/K)] value (Figure 3.9). At K=2, the clustering analysis partitioned the
populations into “eastern” versus “western” groups. Although the Evanno et al. (2005)
method suggested K=2 to be the best solution based on second order rate of change in ∆K,
population assignments based on K=3 groups showed better concordance with the
mtDNA neighbour-joining dendrogram and likelihood [Pr(X/K)] value. K=3 was best
supported at the range-wide level, and with the exception of Lake Ontario tributaries, the
populations that fell into the major mtDNA groups matched that of STRUCTURE results.
STRUCTURE partitioned the populations into a western group (the Northwest
Mississippi River and the Great Lakes [Lake Superior]) and two eastern groups (Figure 3.10).
The first eastern group was observed in the Ohio River, tributaries of Lake Huron and
Lake Erie, and the lower Mississippi River; the second eastern group was observed in
Lake Ontario tributaries, the Allegheny River, and the Susquehanna River.
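The ΔK statistic referred to above can be sketched as the absolute second difference of the mean log-likelihood across successive K values, divided by the standard deviation among replicate runs at that K. This is one common way to compute it and may differ in detail from Structure Harvester; the log-likelihood values below are hypothetical.

```python
from statistics import mean, stdev

def delta_k(lnp_runs):
    """Compute ΔK for each K that has both neighbours K-1 and K+1.

    lnp_runs maps K -> list of Ln P(X|K) values from replicate runs.
    Runs with zero variance at a given K would cause a division by zero.
    """
    means = {k: mean(v) for k, v in lnp_runs.items()}
    out = {}
    for k in sorted(lnp_runs):
        if k - 1 in means and k + 1 in means:
            second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
            out[k] = second_diff / stdev(lnp_runs[k])
    return out

# Hypothetical replicate log-likelihoods (illustrative values only)
runs = {
    1: [-5000, -5010, -4990],
    2: [-4200, -4205, -4195],
    3: [-4100, -4120, -4080],
    4: [-4050, -4090, -4010],
}
result = delta_k(runs)
best_k = max(result, key=result.get)
print(result, "best K by ΔK:", best_k)
```

Because ΔK needs a value on either side of K, it cannot be evaluated at K=1 or at the largest K tested, and it tends to pick out the uppermost level of hierarchical structure.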
Further STRUCTURE analysis was separately undertaken within each of the three
groups identified at K=3. Each subgroup showed evidence of genetic substructure: the
substructure runs indicated that the first eastern (blue) group had a total of eight groups,
while the western (red) group had a total of five groups, and the second eastern (green)
group had a total of sixteen (Figure 3.10, 3.11). These values were obtained based on the
ln(P(K)) method because it provided the most accurate optimal solutions for K. Within
each of the three groups, strong population structuring was observed across all
populations. The exceptions were three of the Allegheny River populations within
the second eastern lineage (DOD, WOO, BHR), as well as the Monongahela population
(STR), which consistently grouped together across different solutions for K (Figure 3.10).
PCoA identified the same groups as STRUCTURE at K=3 (Figure 3.10, 3.12).
Group B and group C (eastern populations) grouped closer together, while group A
(western) appeared to be more genetically distinct (Figure 3.12). Axes one and two
accounted for 33.2% (eigenvalue = 10.1) and 14.2% (eigenvalue = 4.3) of the variation in
the microsatellite data, for a cumulative value of 47.4%. The neighbour joining
dendrogram of pairwise genetic distances generated using Nei et al.’s (1983) DA indicated
substantial genetic divergence among populations (Figure 3.13). Groups A and C
identified by PCoA analysis were supported by high bootstrap values in the neighbour-
joining dendrogram; however, group B showed two discordances: (i) one of the
populations within the neighbour joining dendrogram in group B (MIL) was more
genetically similar to the western group, and (ii) the Ohio River populations within group
B were more genetically similar to group C than to the Great Lakes populations
within Group B.
Pairwise genetic differences (FST tests) between populations were all significant (p
< 0.05), and pairwise FST values ranged from 0.08 to 0.62 (Table 3.7). Significant
isolation by distance was revealed when geographical distance was plotted against
pairwise genetic distance (R² = 0.28, p < 0.0001) (Figure 3.14). The second IBD plot, based
on a subset of populations and ln(1 + distance)-transformed geographic distance, showed that
genetic differences were higher between geographically proximate populations, whereas lower
genetic distances were found between more distant Allegheny River populations (Figure
3.15).
Hierarchical Analysis of Molecular Variance (AMOVA)
AMOVA identified significant genetic variation at all hierarchical levels, for the
hypotheses tested (Table 3.8). For mtDNA, the greatest amount of variation was observed
among the three groups identified via total evidence analysis (71.1%), while 21.9% of the
variation was attributed to populations within groups, and 7.0% of the variation was
attributed to within populations. The second highest amount of variation was observed
between the Atlantic and Mississippian populations (53.8%), with 32.1% of the variation
attributed to populations within the drainage groups, and 14.1% of the variation attributed
to within populations. The least amount of variation was observed among the three groups
identified by STRUCTURE and PCoA (47.1%), with 35.2% of the variation attributed to
populations within groups, and 17.7% of the variation attributed to within populations
(Table 3.8). The AMOVA results from the FST analysis indicated that most of the
variation occurred within populations for all three hypotheses tested (54.6% for two
groups, 59.2% for three groups, 61.4% for four groups), rather than among populations
within groups (24.6% for two groups, 21.1% for three groups, 28.5% for four groups), or
among groups (20.8% for two groups, 19.7% for three groups, 10.1% for four groups)
(Table 3.8).
Discussion
The genetic data from this study indicated that Redside Dace were isolated within
multiple glacial refugia during the last ice age, and post-glacial recolonization routes can
be inferred based on phylogeographic patterns. Both mitochondrial and microsatellite
data exhibited evidence of three distinct phylogeographic lineages with restricted
geographic distributions. The restricted geographic ranges of the three mtDNA
haplogroups, and the high genetic variation among them, suggest their historical persistence in separate
refugia. The mtDNA lineages co-occurred within the Allegheny River
drainage, suggesting historical secondary contact during postglacial colonization. Using
microsatellite data, high genetic structuring was observed, suggesting little to no gene
flow and reciprocal isolation among sampled populations. Moderate genetic diversity
levels were found in urbanized areas of Ontario, whereas high diversity levels were found
within the Allegheny River drainage, likely as a result of fewer landscape-level
disturbances, and in the southern part of the Redside Dace range, likely because those
areas were unglaciated. The data generated from this study can be used as a
baseline for future Redside Dace genetic monitoring.
Evidence for multiple glacial lineages
Multiple glacial refugia were supported by the highest among-group variation in
the AMOVA being explained by the mtDNA total evidence lineages. My findings do not
support the single Mississippian hypothesis (Hocutt and Wiley 1986); there are three
lineages with high bootstrap support and spatial structuring, which is contrary to the shallow
mtDNA lineages and limited spatial structuring expected under a single refugium. My data
support a multiple refugia hypothesis, as indicated by both mtDNA and microsatellite
evidence, all reflective of a Mississippian origin. However, instead of one Atlantic and
one Mississippian lineage as hypothesized, three Mississippian groups were identified.
An Atlantic origin, as suggested by Mandrak and Crossman (1992), may be more difficult
to achieve because the Appalachian mountains could have created a barrier to dispersal,
as seen with the Slider turtle (Trachemys scripta), Bowfin (Amia calva), and Largemouth
Bass (Micropterus salmoides) (Soltis et al. 2006). If an Atlantic lineage were present, the
eastern Lake Ontario populations should have separated into their own haplogroup, instead
of grouping along with the eastern Mississippian lineages, as seen with Banded Killifish
(Fundulus diaphanus) (April and Turgeon 2006) and Lake Trout (Salvelinus
namaycush) (Wilson and Hebert 1998). Redside Dace diverged from their congener (C.
funduloides) approximately 2.6 million years BP during the Pliocene (Houston et al.
2010), and so the splitting of the three lineages would have had to take place after the
speciation event.
The narrow western distribution of haplogroup two was characterised by high
genetic distances from haplogroups one and three, suggesting its long-term isolation.
Borden and Krebs (2009) identified a similar pattern for Smallmouth Bass (Micropterus
dolomieu), which was also characterized by low genetic diversity levels west of the
Mississippi River. The “Driftless Area,” located in southwestern Wisconsin, was not
covered by glaciers and likely served as a refugium for Redside Dace and other freshwater
fishes such as Brook Trout (Salvelinus fontinalis) (Berendzen and Dugan 2008, Hoxmeier
et al. 2015). To reach their current distributions within the western range, Redside Dace
likely used the Brule-Portage outlet of Lake Duluth (precursor to Lake Superior), which
formed approximately 11,500 years BP (Mandrak and Crossman 1992). Redside Dace
may have reached as far as ancestral Lake Huron or Lake Algonquin either using Lake
Duluth or the Michigan Upper Peninsula. Similar eastward dispersal from a western
source has been inferred for the Common Gartersnake (Thamnophis sirtalis) (Placyk et al.
2007), which would have used terrestrial corridors. Of the three lineages, haplogroup one
was characterized by the highest number of haplotypes and also had the widest
geographic range, with tributaries draining into the Ohio River, Lake Erie, Lake Ontario,
Allegheny River and the Susquehanna River. Co-occurrence of haplogroups one and three
was found within the Allegheny River basin, suggesting their secondary contact following
deglaciation. Within haplogroup one, the highest genetic diversity levels were found
within the tributaries of the Kentucky and Licking rivers (tributaries of the Ohio River),
and were consistent with high diversity levels in southern, unglaciated ranges (Hewitt
1996, Bernatchez and Wilson 1998). These populations are located within the Eastern
Highland Region, which was unaffected by the ice sheets and contained high amounts of
endemism and species richness (Mayden 1988). The Kentucky River served as an
important refugial source for Smallmouth Bass (Borden and Krebs 2009), the Painted-
hand Mudbug crayfish (Cambarus polychromatus) (Simon and Burskey 2014) and the
Orangethroat Darter (Etheostoma spectabile) (Bossu et al. 2013). In general, the southern
tributaries of the Teays River system were a refugial source for freshwater fishes during
the last ice age (Hocutt et al. 1986).
Using cytochrome b haplotype divergence data, and the temporal divergence
estimate for C. elongatus and C. funduloides published by Houston et al. (2010),
haplogroups one and three may have separated from each other ~0.5 million years BP
during the Kansan glaciation (Hocutt et al. 1986), and were likely connected via the
Teays River System. During the early Pleistocene, the Teays originally started at the
headwaters of West Virginia and went northwest through Ohio, Indiana and Illinois,
where it then connected to the ancestral Mississippi River (Ver Steeg 1946). Connections
within this drainage system are thought to have been altered starting in the Nebraskan
around 1 million years BP, resulting in the current Ohio River System (Hocutt et al.
1986). Haplogroup 3 was found within the upper Ohio River system. Based on the genetic
haplotype distribution and knowledge of past events, I hypothesize that Redside Dace
were present in unglaciated West Virginia, and gained access to the Allegheny and
Susquehanna River drainage via proglacial Lake Monongahela (Hocutt and Wiley 1986,
Mandrak and Crossman 1992).
The data presented here suggest the presence of three separate Mississippian
refugia. For haplogroups one and two, the distinct geographical distributions and the high
number of mutations are reflective of two separate refugia. The two haplogroups also
appear to have not come into contact based on the haplogroups corresponding to non-
overlapping geographic localities. Similarly, haplogroups one and three likely persisted in
separate refugia: their divergence and distinct geographic distributions reflect their
isolation and subsequent dispersal from separate refugia (Maggs et al. 2008). The two
glacial groups appear to have come into contact with each other after the ice melted; this
is supported by co-occurrence of haplogroups one and three within the Allegheny, as well
as the high mtDNA and microsatellite diversity level in this area, which is found when
two groups come into contact with each other (Maggs et al. 2008).
A point worth mentioning is the discordance between the ATPase 6 and 8 and
the cytochrome b data sets. ATPase 6 and 8 and cytochrome b each identify two groups
that are different from each other. This could be the result of selective sweeps, mutation,
or background selection (Ballard and Whitlock 2004). While some authors have argued
for analysing data from each gene region separately as a result of heterogeneity in
datasets (Miyamoto and Fitch 1995), other studies have taken a total evidence approach,
because it is thought that multiple genes evolving at different rates could interact in a
positive way to resolve phylogenetic relationships (Kluge 1989, Mousson et al. 2005).
Given the high congruence between the microsatellite and the total evidence mtDNA
dataset, the latter approach seemed to be the most appropriate.
The combined use of mtDNA and microsatellite markers provides invaluable
contributions to understanding the interplay between historical and contemporary genetic
structuring. Co-occurrence of the two eastern mtDNA lineages occurred within the
Allegheny River drainage, and microsatellite data suggest that historical secondary
contact extended up to the western tributaries of Lake Ontario. The predominance of
haplogroup one over haplogroup three in secondary contact zones within western Lake
Ontario tributaries could have been the result of natural selection, genetic drift, or a smaller
number of founding individuals for haplogroup three coming into contact with an already
established haplogroup one (Bernatchez and Wilson 1998, Avise 2000, Galtier et al.
2009, Jezkova et al. 2013). Mitochondrial DNA is haploid and maternally inherited; it
therefore has a lower effective population size than nuclear DNA and reaches fixation
faster (Avise 2000, Avise 2001), and would therefore underestimate the extent of
secondary contact. Microsatellite markers
are diploid and have a high mutation rate; in conjunction with the mtDNA, they allow for
a comprehensive understanding of the extent of secondary contact based on the
discrepancy between the two lines of evidence. The use of microsatellite data on their
own would make it difficult to infer secondary contact, while the use of mtDNA on its
own would underestimate the degree of secondary contact.
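The difference in effective population size between the two marker types can be made explicit with a standard textbook approximation. This is an illustrative sketch only, not a calculation from this study: it assumes an equal sex ratio, strictly maternal inheritance of mtDNA, and neutrality, and the symbols (N_e for the nuclear effective size, N_f for the effective number of females, and the mean fixation times) are introduced here for illustration.

```latex
% Gene copies contributing to the next generation (assumes N_f = N_e / 2):
\frac{\text{mtDNA copies}}{\text{nuclear copies}}
  \approx \frac{N_f}{2N_e} = \frac{N_e/2}{2N_e} = \frac{1}{4}

% Mean time to fixation of a new neutral variant, in generations:
\bar{t}_{\mathrm{nuc}} \approx 4N_e ,
\qquad
\bar{t}_{\mathrm{mt}} \approx 2N_f = N_e
```

Under these assumptions an mtDNA lineage is expected to drift to fixation roughly four times faster than a nuclear allele, which is consistent with mtDNA retaining less evidence of secondary contact than the microsatellite data.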
I hypothesized that variation in mtDNA data was likely the result of multiple
interstadials, which would have allowed for long-distance dispersal of haplogroup one via
glacial meltwaters (Lewis et al. 1995, Dyke 2004). Lake Ontario was likely an important
area of secondary contact for Redside Dace, along with other north-eastern freshwater
fishes during deglaciation (Mandrak and Crossman 1992, Stepien and Faber 1998, April
et al. 2013). Additionally, Lake Erie was a critical postglacial colonization route for species
dispersal during the Pleistocene because it was the first area to be deglaciated (Lewis et
al. 2008). Lake Ontario was covered by ice sheets during the last ice age, and it was not
until 13,000 years ago that the glaciers started to melt, resulting in the formation of Lake
Iroquois, which received waters from Lake Whittlesey (ancestral Lake Erie) (Hocutt and
Wiley 1986). The end of the Erie ice lobe was marked by a large flood through New
York to the Hudson Valley, and subsequently to the Atlantic Ocean; this could have
contributed to Redside Dace refugial mixing (Schmidt 1986, Mandrak and Crossman
1992). Another possible scenario is that the two refugial groups mixed prior to
colonization, and then later dispersed. If haplogroup one reached fixation before
dispersal, this would produce results similar to those seen under the secondary contact scenario.
Contemporary structuring
As well as reflecting intraspecific phylogeographic lineages, microsatellite data
show evidence of high genetic structuring within Redside Dace populations, indicating a
general lack of gene flow among populations. While microsatellite markers are usually
used for looking at contemporary genetic structuring (Freeland et al. 2011), the PCoA and
STRUCTURE results reflect phylogeographic ancestry, suggesting that historical factors
have had an important influence on present day genetic patterns. Despite the evidence of
historical signatures, however, high genetic structuring was found among most Redside
Dace populations, suggesting that local populations became reciprocally isolated after
colonization. The lack of connectivity between populations as a result of limited dispersal
abilities (Poos & Jackson 2012) can promote genetic drift and therefore result in genetic
differentiation between populations. Redside Dace are also habitat specialists:
they prefer slow-moving waters with overhanging vegetation, occupy mid-water positions in
pools, and require cool, clear waters in order to persist, resulting in smaller home
ranges and patchy distributions (Novinger & Coon 2000, Poos & Jackson 2012). The
findings presented here are consistent with other Redside Dace genetic studies, which
documented high genetic structuring among Ohio River drainage populations (Sweeten
2012), as well as populations in Mississippi River tributaries (Berendzen et al. 2008).
This study greatly expanded on these previous studies by providing evidence for similar
genetic structuring throughout the species range. Similar results are observed in other
small-bodied fishes in Ontario including Eastern Sand Darter (Ammocrypta pellucida)
(Ginson et al. 2015), Greenside Darter (Etheostoma blennioides) (Beneteau et al. 2009),
and Pugnose Shiner (Notropis anogenus) (McCusker et al. 2014).
Effects of urbanization and agriculture on genetic diversity levels
Increasing urban development in adjacent landscapes is thought to have resulted
in the decline and isolation of Redside Dace populations (MNR 2011). Despite over 80%
of Ontario populations residing within areas of high urbanization (COSEWIC 2007), high
genetic diversity levels were found throughout much of the Redside Dace range. This
could be the result of samples being collected from headwater populations that are still
abundant and on the edge of the advancing front of urbanization. Similar patterns have
been reported in freshwater mussels (Galbraith et al. 2015) and Lake Sturgeon (Wozney
et al. 2011), where genetic diversity levels did not reflect population status. The lowest
genetic diversity levels were observed in the Saugeen River and Gully Creek (Lake
Huron drainage), which are both characterized by high agricultural activities. Redside
Dace were historically widespread in the Saugeen River watershed, but are now limited to
a small number of headwater catchments. Within Gully Creek, Redside Dace numbers are
thought to be below those needed for long-term population viability (Poos et al. 2012).
Poos et al. (2012) found that Redside Dace were located throughout Gully Creek, but only
at low densities, and suggested that their low numbers could be the result of the chronic,
as opposed to episodic, nature of chemicals being released into the water. The reduced
genetic diversity in these systems might therefore reflect reduced numbers of Redside
Dace and increased genetic drift, or might also reflect genetic loss due to local inbreeding
(Redside Dace Recovery Team 2010). It is worth noting that the low sample sizes for
Saugeen River sites might have artificially reduced diversity estimates. Poos et al. (2012)
also flagged Gully Creek as a population of special concern, with current demographics
falling below the population size required to ensure long-term viability.
Allegheny River populations
STRUCTURE analysis identified that three populations within the larger
Allegheny River watershed (DOD, BHR, and WOO) were genetically similar. The STR
population may also belong to this group, as it is most genetically similar to BHR despite
being from the Monongahela watershed. While EBM is part of the Allegheny River
watershed, it was genetically different from the other populations. Two potential explanations
for the genetic similarity of the three populations are natural dispersal
and connectivity among the Allegheny River populations, or human transfer among sites
via bait bucket introductions. Redside Dace is more abundant within Pennsylvania than
in the rest of its range and, as a result, is permitted for use as bait within this
jurisdiction. Under the natural dispersal scenario, high connectivity among the three
Allegheny River populations would have produced genetically similar populations through
high gene flow. There is no evidence to refute the natural dispersal scenario; while the
populations surveyed are geographically distant and Redside Dace are non-migratory fish,
they are also very abundant in the Allegheny watershed, and no published literature
exists on genetic structuring and/or dispersal. The focus of past Redside Dace studies
has been on populations in areas of urbanization and agricultural activities
(Sweeten 2012, Poos & Jackson 2012, Poos et al. 2012), and this is the first study to look
at this species in an area of high abundance. Bait bucket introductions into streams are
another possible scenario. If bait bucket transfers are occurring, high genetic similarities
would result due to recent shared ancestry, irrespective of geographic distance, because
gene flow is human-mediated, and can therefore span large distances. The lower FST
values between the Allegheny River populations over greater distances could be
explained by such transfers. Literature suggests that anglers still dispose of live baitfish
into the waters even though there are strict regulations that prohibit this, which could have
resulted in the release of Redside Dace as baitfish into non-native waters (Litvak &
Mandrak 1993).
The available data do not refute the predictions of either
hypothesis. To resolve the probable cause underlying the observed genetic patterns in the
Allegheny watershed, fine-scale genetic sampling within the three Allegheny streams
would be required to document potential metapopulation structure and test both
hypotheses. If a natural dispersal scenario were valid, I would expect to see genetic
similarity between all sampled populations, and a plot of isolation by distance would
reveal low divergence over low to moderate geographic distances, with divergence
increasing with geographic distance. If a bait bucket scenario were valid, I would not
expect to see the same pattern of increased genetic distance over increased geographic
distance, because the introductions would be human-mediated and fish could be moved over large
distances.
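The isolation-by-distance contrast described above could be examined with a Mantel test of linearized FST (Rousset 1997) against waterway distance. The following is a minimal sketch under stated assumptions, not the analysis used in this study: the five sites, their distances, and both FST sets are invented for illustration, distance is left untransformed on the assumption that a stream network behaves as a roughly linear habitat, and all function names are hypothetical.

```python
import math
import random


def linearized_fst(fst):
    """Rousset's (1997) transformation FST / (1 - FST)."""
    return fst / (1.0 - fst)


def symmetric_matrix(n_sites, pair_values):
    """Build a full symmetric matrix from {(i, j): value} with i < j."""
    matrix = [[0.0] * n_sites for _ in range(n_sites)]
    for (i, j), value in pair_values.items():
        matrix[i][j] = matrix[j][i] = value
    return matrix


def upper_triangle(matrix, order):
    """Pairwise values above the diagonal, with sites relabelled by `order`."""
    n = len(matrix)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel_test(genetic, geographic, permutations=999, seed=1):
    """One-tailed Mantel test for a positive genetic-geographic correlation."""
    rng = random.Random(seed)
    identity = list(range(len(genetic)))
    geo = upper_triangle(geographic, identity)
    observed = pearson(geo, upper_triangle(genetic, identity))
    at_least_as_large = 1  # the observed arrangement counts once
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # relabel sites in the genetic matrix only
        if pearson(geo, upper_triangle(genetic, order)) >= observed:
            at_least_as_large += 1
    return observed, at_least_as_large / (permutations + 1)


# Hypothetical waterway distances (km) among five made-up sites.
DIST_KM = {(0, 1): 10, (0, 2): 25, (0, 3): 60, (0, 4): 100, (1, 2): 15,
           (1, 3): 50, (1, 4): 90, (2, 3): 35, (2, 4): 75, (3, 4): 40}
# Hypothetical pairwise FST: divergence rising with distance (natural dispersal).
FST_DISPERSAL = {(0, 1): 0.03, (0, 2): 0.06, (0, 3): 0.13, (0, 4): 0.20,
                 (1, 2): 0.05, (1, 3): 0.10, (1, 4): 0.19, (2, 3): 0.09,
                 (2, 4): 0.15, (3, 4): 0.08}
# Hypothetical pairwise FST: low and unrelated to distance (bait bucket transfer).
FST_BAIT = {(0, 1): 0.05, (0, 2): 0.02, (0, 3): 0.04, (0, 4): 0.03,
            (1, 2): 0.06, (1, 3): 0.02, (1, 4): 0.05, (2, 3): 0.03,
            (2, 4): 0.02, (3, 4): 0.06}

geo_matrix = symmetric_matrix(5, DIST_KM)
for label, fst in (("natural dispersal", FST_DISPERSAL), ("bait bucket", FST_BAIT)):
    gen_matrix = symmetric_matrix(5, {pair: linearized_fst(v) for pair, v in fst.items()})
    r, p = mantel_test(gen_matrix, geo_matrix)
    print(f"{label}: Mantel r = {r:.2f}, one-tailed p = {p:.3f}")
```

With real data, a strong positive correlation would be consistent with natural dispersal, whereas a weak or absent correlation despite large distances would be more consistent with human-mediated transfer. With only a handful of sites the number of distinct permutations is small, so such a test would have limited power.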
Management Implications
Defining conservation units below the species level has yet to be done in a large-
scale context for Redside Dace, and the genetic data generated from this study can be
used to inform recovery efforts. The three monophyletic lineages identified using
microsatellite and mtDNA analyses, and their genetic distinctiveness from each other,
indicate their long-term separation. The reciprocal monophyly observed among the three
groups meets one of the criteria for defining Evolutionarily Significant Units (Moritz
1994) and Management Units (Moritz 1994) within the United States, and Designatable
Units (COSEWIC 2012) within Canada. For identifying DUs, however, evolutionary
and/or ecological significance is also necessary for recognizing distinct groups
(COSEWIC 2014). As only limited ecological data exist for comparison to date, it may
be worth considering whether the genetic groups identified should be
recognized as separate DUs or as a single DU.
Regardless of whether populations in Canada are classified as a single DU or
multiple DUs, focusing on multiple distinct genetic conservation units
will be important for recovery efforts. Successfully conserving a species involves
preserving multiple populations from a variety of ecological settings that are able to
support themselves (Redford et al. 2011). Identifying multiple populations to protect
within these three major groups would ensure adequate conservation of all lineages
throughout the Redside Dace range. Additionally, the jurisdiction of Ontario is unique in
that it contains all three lineages identified by both mtDNA and microsatellite analysis, so
all groups and surrounding habitat receive protection under the Ontario Endangered
Species Act.
The genetic data will be invaluable for informing translocation and population
augmentation efforts, which have been limited for Redside Dace recovery. The mtDNA
data (total evidence network and dendrogram) will be crucial for delineating major
lineages, while the microsatellite data (genetic diversity levels) will be important for
translocation efforts to avoid both outbreeding and inbreeding depression, and to avoid
introducing novel genotypes to the population (George et al. 2009). To date, only one
augmentation initiative has been attempted, with approximately 500 individuals
translocated from Asher Creek to Mill Creek in Indiana (Sweeten 2012). Preliminary
results suggest limited spawning. A modelling study undertaken by Poos et al. (2012)
suggests that the minimum number of Redside Dace individuals required for population
viability is between 2900 and 4300 breeding individuals. Novinger and Coon (2000)
assessed the feasibility of moving New York individuals to the endangered
Michigan population, and determined this would not be appropriate based on
physiological and behavioural experiments. Genetic data presented here also indicate
that this would be inappropriate on genetic grounds, because the two populations fall
within different evolutionary lineages. For future recovery efforts, genetic data will allow
for higher confidence in identifying appropriate “source” and “recipient” candidates for
translocation experiments, before investing resources into physiological and behavioural
experiments.
References
April J, Hanner RH, Dion‐Côté AM, Bernatchez L (2013) Glacial cycles as an allopatric
speciation pump in north‐eastern American freshwater fishes. Molecular Ecology, 22,
409-422.
April J, Turgeon J (2006) Phylogeography of the banded killifish (Fundulus diaphanus):
glacial races and secondary contact. Journal of Fish Biology, 69, 212-228.
Avise JC (2000) Phylogeography. Harvard University Press, Cambridge, MA.
Avise JC (2001) Cytonuclear genetic signatures of hybridization phenomena: rationale,
utility, and empirical examples from fishes and other aquatic animals. Reviews in Fish
Biology and Fisheries, 3, 253-263.
Bandelt H, Forster P, Röhl A (1999) Median-joining networks for inferring intraspecific
phylogenies. Molecular Biology and Evolution, 16, 37–48.
Beausoleil JMJ, Doucet SM, Heath DD, Pitcher TE (2012) Spawning coloration, female
choice and sperm competition in the redside dace, Clinostomus elongatus. Animal
Behaviour, 83, 969-977.
Beneteau CL, Mandrak NE, Heath DD (2009) The effects of river barriers and range
expansion on the population genetic structure and stability in Greenside Darter
(Etheostoma blennioides) populations. Conservation Genetics, 10, 477-487.
Berendzen PB, Dugan JF (2008) Establishing conservation units and population genetic
parameters of fishes of greatest conservation need distributed in Southeast Minnesota.
State Wildlife Grants Program. Division of Ecological Resources, Minnesota Department
of Natural Resources. Available from:
http://files.dnr.state.mn.us/eco/nongame/projects/consgrant_reports/2008/2008_berendzen
_etal.pdf.
Bernatchez L, Wilson CC (1998) Comparative phylogeography of Nearctic and Palearctic
fishes. Molecular Ecology, 7, 431–452.
Bessert ML, Orti G (2003) Microsatellite loci for paternity analysis in the fathead
minnow, Pimephales promelas (Teleostei: Cyprinidae). Molecular Ecology Notes, 3,
532–534.
Borden WC, Krebs RA (2009) Phylogeography and postglacial dispersal of smallmouth
bass (Micropterus dolomieu) into the Great Lakes. Canadian Journal of Fisheries and
Aquatic Sciences, 66, 2142-2156.
Bossu CM, Beaulieu JM, Ceas PA, Near TJ (2013) Explicit tests of palaeodrainage
connections of southeastern North America and the historical biogeography of
Orangethroat Darters (Percidae: Etheostoma: Ceasia). Molecular Ecology, 22, 5397-
5417.
Ceballos G, Ehrlich PR, Barnosky AD, García A, et al. (2015) Accelerated modern
human–induced species losses: entering the sixth mass extinction. Science Advances, 1,
e1400253.
COSEWIC (2007) COSEWIC assessment and updated status report on the Redside Dace
Clinostomus elongatus in Canada. Committee on the Status of Endangered Wildlife in
Canada. Ottawa, ON.
COSEWIC (2012) Guidelines for Recognizing Designatable Units. Committee on the
Status of Endangered Wildlife in Canada.
http://www.cosewic.gc.ca/eng/sct2/sct2_5_e.cfm.
Dimsoski P, Toth GP, Bagley MJ (2000) Microsatellite characterization in central
stoneroller Campostoma anomalum (Pisces: Cyprinidae). Molecular Ecology, 9, 2155–
2234.
Dowling TE, Naylor GJP (1997) Evolutionary relationships of minnows in the genus
Luxilus (Teleostei : Cyprinidae) as determined from cytochrome b sequences. Copeia,
1997, 758–765.
Dyke AS (2004) An outline of North American deglaciation with emphasis on central and
northern Canada. Quaternary Glaciations: Extent and Chronology, 2, 373-424.
Earl DA, vonHoldt BM (2012) STRUCTURE HARVESTER: a website and program for
visualizing STRUCTURE output and implementing the Evanno method. Conservation
Genetics Resources, 4, 359-361.
Eliades N-G, Eliades DG (2009) HAPLOTYPE ANALYSIS: software for analysis of
haplotypes data. Distributed by the authors. Forest Genetics and Forest Tree Breeding,
Georg-August University Goettingen, Germany.
Evanno G, Regnaut S, Goudet J (2005) Detecting the number of clusters of individuals
using the software STRUCTURE: a simulation study. Molecular Ecology, 14, 2611–
2620.
Excoffier L, Lischer HEL (2010) Arlequin suite ver 3.5: a new series of programs to
perform population genetics analyses under Linux and Windows. Molecular Ecology
Resources, 10, 564-567.
Felsenstein J (1985) Confidence limits on phylogenies: an approach using the bootstrap.
Evolution, 39, 783-791.
Frankham R, Briscoe DA, Ballou JD (2002) Introduction to Conservation Genetics.
Cambridge University Press, Cambridge.
Freeland JR, Kirk H, Petersen S (2011) Molecular Ecology, Second Edition. Wiley-
Blackwell, New York.
Galbraith HS, Zanatta DT, Wilson CC (2015) Comparative analysis of riverscape genetic
structure in rare, threatened and common freshwater mussels. Conservation Genetics, 16,
845-857.
Galtier N, Nabholz B, Glémin S, Hurst GDD (2009) Mitochondrial DNA as a marker of
molecular diversity: a reappraisal. Molecular Ecology, 18, 4541-4550.
George AL, Kuhajda BR, Williams JD, Cantrell MA, et al. (2009) Guidelines for
propagation and translocation for freshwater fish conservation. Fisheries, 34, 529-545.
Ginson R, Walter RP, Mandrak NE, Beneteau CL, et al. (2015) Hierarchical analysis of
genetic structure in the habitat-specialist Eastern Sand Darter (Ammocrypta pellucida).
Ecology and Evolution, 5, 695–708.
Gum B, Gross R, Kuehn R (2005) Mitochondrial and nuclear DNA phylogeography of
European grayling (Thymallus thymallus): evidence for secondary contact zones in central
Europe. Molecular Ecology, 14, 1707-1725.
Hänfling B, Hellemans B, Volckaert FAM, Carvalho GR (2002) Late glacial history of
the cold‐adapted freshwater fish Cottus gobio, revealed by microsatellites. Molecular
Ecology, 11, 1717-1729.
Helfman GS (2007) Fish Conservation: A Guide to Understanding and Restoring Global
Aquatic Biodiversity and Fishery Resources. Island Press, Washington, DC.
Hewitt GM (1996) Some genetic consequences of ice ages, and their role in divergence
and speciation. Biological Journal of the Linnean Society, 58, 247-276.
Hewitt GM (2004) Genetic consequences of climatic oscillations in the Quaternary.
Philosophical Transactions of the Royal Society of London B: Biological Sciences, 359,
183-195.
Hocutt CH, Jenkins RE, Stauffer Jr JR (1986) Zoogeography of the fishes of the central
Appalachians and central Atlantic coastal plain. In: Zoogeography of North American
Freshwater Fishes (eds Hocutt CH, Wiley EO), pp. 161-211. John Wiley and Sons, New
York.
Hocutt CH, Wiley EO (1986) The Zoogeography of North American Freshwater Fishes.
John Wiley and Sons, New York.
Houston DD, Shiozawa DK, Riddle BR (2010) Phylogenetic relationships of the western
North American cyprinid genus Richardsonius, with an overview of phylogeographic
structure. Molecular Phylogenetics and Evolution, 55, 259–73.
Hoxmeier RJH, Dieterman DJ, Miller LM (2015) Brook Trout Distribution, Genetics, and
Population Characteristics in the Driftless Area of Minnesota. North American Journal of
Fisheries Management, 35, 632-648.
Jakobsson M, Rosenberg NA (2007) CLUMPP: a cluster matching and permutation
program for dealing with label switching and multimodality in analysis of population
structure. Bioinformatics, 23, 1801-1806.
Jezkova T, Leal M, Rodríguez‐Robles JA (2013) Genetic drift or natural selection?
Hybridization and asymmetric mitochondrial introgression in two Caribbean lizards
(Anolis pulchellus and Anolis krugi). Journal of Evolutionary Biology, 26, 1458-1471.
Kalinowski ST (2005) HP-Rare 1.0: a computer program for performing rarefaction on
measures of allelic richness. Molecular Ecology Notes, 5, 187–189.
Sweeten J (Manchester University) (2012) Redside dace (Clinostomus elongatus) in Mill
Creek, Wabash County, Indiana: A strategy for research and augmentation. Indiana
Department of Natural Resources.
Kluge AG (1989) A concern for evidence and a phylogenetic hypothesis of relationships
among Epicrates (Boidae, Serpentes). Systematic Zoology, 38, 7-25.
Leidy RA, Cervantes‐Yoshida K, Carlson SM (2011) Persistence of native fishes in small
streams of the urbanized San Francisco Estuary, California: acknowledging the role of
urban streams in native fish conservation. Aquatic Conservation: Marine and Freshwater
Ecosystems, 21, 472-483.
Lewis CM, Moore TC, Rea DK, Dettman DL, et al. (1995) Lakes of the Huron basin:
their record of runoff from the Laurentide Ice Sheet. Quaternary Science Reviews, 13,
891-922.
Lewis CM, Karrow PF, Blasco SM, McCarthy FM, et al. (2008) Evolution of lakes in the
Huron basin: deglaciation to present. Aquatic Ecosystem Health & Management, 11, 127-
136.
Librado P, Rozas J (2009) DnaSP v5: a software for comprehensive analysis of DNA
polymorphism data. Bioinformatics, 25, 1451–1452.
Litvak MK, Mandrak NE (1993) Ecology of freshwater baitfish use in Canada and the
United States. Fisheries, 18, 6-13.
MacKenzie DI, Nichols JD, Royle JA, Pollock KH (2006) Occupancy Estimation and
Modeling: Inferring Patterns and Dynamics of Species Occurrence. Elsevier, Burlington,
MA.
Maggs CA, Castilho R, Foltz D, Henzler C, et al. (2008) Evaluating signatures of glacial
refugia for North Atlantic benthic marine taxa. Ecology, 89, S108-S122.
Mandrak NE, Crossman E (1992) Postglacial dispersal of freshwater fishes into Ontario.
Canadian Journal of Zoology, 70, 2247-2259.
Mayden RL (1988) Vicariance biogeography, parsimony, and evolution in North
American freshwater fishes. Systematic Biology, 37, 329-355.
McCusker MR, Mandrak NE, Egeh B, Lovejoy NR (2014) Population structure and
conservation genetic assessment of the endangered Pugnose Shiner, Notropis
anogenus. Conservation Genetics, 15, 343-353.
McDermid JL, Wozney JK, Kjartanson SL, Wilson CC (2011) Quantifying historical,
contemporary, and anthropogenic influences on the genetic structure and diversity of lake
sturgeon (Acipenser fulvescens) populations in northern Ontario. Journal of Applied
Ichthyology, 27, 12–23.
Ministry of Natural Resources (MNR) (2011) DRAFT Guidance for Development
Activities in Redside Dace Protected Habitat. Ontario Ministry of Natural Resources,
Peterborough, Ontario.
Miyamoto MM, Fitch WM (1995) Testing species phylogenies and phylogenetic methods
with congruence. Systematic Biology, 44, 64-76.
Moritz C (1994) Defining ‘evolutionarily significant units’ for conservation. Trends in
Ecology & Evolution, 9, 373-375.
Mousson L, Dauga C, Garrigues T, Schaffner F, Vazeille M, Failloux AB (2005)
Phylogeography of Aedes (Stegomyia) aegypti (L.) and Aedes (Stegomyia) albopictus
(Skuse) (Diptera: Culicidae) based on mitochondrial DNA variations. Genetical
Research, 86, 1-11.
Nei M, Tajima F, Tateno Y (1983) Accuracy of estimated phylogenetic trees from
molecular data. Journal of Molecular Evolution, 19, 153-170.
Novinger DC, Coon TG (2000) Behavior and physiology of the redside dace,
Clinostomus elongatus, a threatened species in Michigan. Environmental Biology of
Fishes, 57, 315–326.
Overpeck JT, Webb RS, Webb T (1992) Mapping eastern North American vegetation
change of the past 18 ka: No-analogs and the future. Geology, 20, 1071-1074.
Parker BJ, McKee P, Campbell RR (1988) Status of the redside dace, Clinostomus
elongatus, in Canada. Canadian Field Naturalist, 102, 163-169.
Peakall R, Smouse PE (2006) Genalex 6: genetic analysis in Excel. Population genetic
software for teaching and research. Molecular Ecology Notes, 6, 288–295.
Pitcher TE, Beneteau CL, Walter RP, Wilson CC, et al. (2009) Isolation and
characterization of microsatellite loci in the redside dace, Clinostomus elongatus.
Conservation Genetics Resources, 1, 381–383.
Placyk JS, Burghardt GM, Small RL, King RB, et al. (2007) Post-glacial recolonization
of the Great Lakes region by the common gartersnake (Thamnophis sirtalis) inferred from
mtDNA sequences. Molecular Phylogenetics and Evolution, 43, 452-467.
Poos MS, Jackson DA (2012) Impact of species-specific dispersal and regional
stochasticity on estimates of population viability in stream metapopulations. Landscape
Ecology, 27, 405-416.
Poos MS, Lawrie D, Tu C, Jackson DA, et al. (2012) Estimating local and regional
population sizes for an endangered minnow, redside dace (Clinostomus elongatus), in
Canada. Aquatic Conservation: Marine and Freshwater Ecosystems, 22, 47-57.
Pritchard JK, Stephens M, Donnelly P (2000) Inference of population structure using
multilocus genotype data. Genetics, 155, 945–959.
Raymond M, Rousset F (1995) GENEPOP (Version 1.2): Population Genetics Software
for Exact Tests and Ecumenicism. Journal of Heredity, 86, 248-249.
Redford KH, Amato G, Baillie J, Beldomenico P, et al. (2011) What does it mean to
successfully conserve a (vertebrate) species? BioScience, 61, 39-48.
Redside Dace Recovery Team (2010) Recovery Strategy for Redside Dace (Clinostomus
elongatus) in Ontario. Ontario Ministry of Natural Resources. Peterborough, ON.
Reid SM, Kidd A, Wilson CC (2012) Validation of buccal swabs for noninvasive DNA
sampling of small‐bodied imperiled fishes. Journal of Applied Ichthyology, 28, 290-292.
Rosenberg NA (2004) DISTRUCT: a program for the graphical display of population
structure. Molecular Ecology Notes, 4, 137-138.
Rousset F (1997) Genetic differentiation and estimation of gene flow from F-statistics
under isolation by distance. Genetics, 145, 1219–1228.
Saitou N, Nei M (1987) The neighbor-joining method: a new method for reconstructing
phylogenetic trees. Molecular Biology and Evolution, 4, 406-425.
Sambrook J, Fritsch EF, Maniatis T (1989) Molecular Cloning. Cold Spring Harbor
Laboratory Press, New York.
Schmidt RE (1986) Zoogeography of the Northern Appalachians. In: Zoogeography of
North American Freshwater Fishes (eds Hocutt CH, Wiley EO), pp. 137-159. John Wiley
and Sons, New York.
Scott WB, Crossman EJ (1973) Freshwater Fishes of Canada. Bulletin of Fisheries
Research Board of Canada 184.
Simon TP, Burskey JL (2014) Spatial Distribution and Dispersal Patterns of Central
North American Freshwater Crayfish (Decapoda: Cambaridae) with Emphasis on
Implications of Glacial Refugia. International Journal of Biodiversity.
Sivasundar A, Bermingham E, Orti G (2001) Population structure and biogeography of
migratory freshwater fishes (Prochilodus: Characiformes) in major South American
rivers. Molecular Ecology, 10, 407-417.
Soltis DE, Morris AB, McLachlan JS, Manos PS, et al. (2006) Comparative
phylogeography of unglaciated eastern North America. Molecular Ecology, 15, 4261–
4293.
Stepien CA, Faber JE (1998) Population genetic structure, phylogeography and spawning
philopatry in walleye (Stizostedion vitreum) from mitochondrial DNA control region
sequences. Molecular Ecology, 7, 1757-1769.
Takezaki N, Nei M, Tamura K (2010) POPTREE2: Software for constructing population
trees from allele frequency data and computing other population statistics with Windows
interface. Molecular Biology and Evolution, 27, 747–752.
Tamura K, Dudley J, Nei M, Kumar S (2007) MEGA4: Molecular evolutionary genetics
analysis (MEGA) software version 4.0. Molecular Biology and Evolution, 24, 1596–
1599.
Van Oosterhout C, Hutchinson WF, Willis DP, Shipley P (2004) MICRO‐CHECKER:
software for identifying and correcting genotyping errors in microsatellite
data. Molecular Ecology Notes, 4, 535-538.
Ver Steeg K (1946) The Teays River. Ohio Journal of Science, 46, 297-307.
Wang L, Lyons J, Kanehl P, Bannerman R (2001) Impacts of urbanization on stream
habitat and fish across multiple spatial scales. Environmental Management, 28, 255-266.
Wilson CC, Hebert PD (1996) Phylogeographic origins of lake trout (Salvelinus
namaycush) in eastern North America. Canadian Journal of Fisheries and Aquatic
Sciences, 53, 2764-2775.
Wilson CC, Hebert PD (1998) Phylogeography and postglacial dispersal of lake trout
(Salvelinus namaycush) in North America. Canadian Journal of Fisheries and Aquatic
Sciences, 55, 1010-1024.
Wozney KM, Haxton TJ, Kjartanson S, Wilson CC (2011) Genetic assessment of lake
sturgeon (Acipenser fulvescens) population structure in the Ottawa River. Environmental
Biology of Fishes, 90, 183-195.
Figure 3.1: Distribution map of sampling locations for Redside Dace (Clinostomus elongatus). Inset map shows the species’ global
range (reproduced from COSEWIC 2007 report, with permission), with the polygon enclosing the species range.
Figure 3.2: Mutational network observed among C.
elongatus haplotypes for ATPase 6 and 8 based on parsimony. Each numeric circle corresponds to a haplotype listed in Table 3.2; each
node represents one nucleotide substitution. Branch lengths do not correspond to genetic distance. Inset map shows the geographic
distribution.
[Figure labels: Haplogroup A, Haplogroup B]
Figure 3.3: Neighbour-joining dendrogram of relationships among ATPase 6 and 8
haplotypes based on p-distances with 500 bootstrap replicates. Haplotype numbers (Table
3.2) are represented by numbers outside brackets, while numbers of individuals are
represented by numbers inside brackets. Numbers at branch nodes show bootstrap support
values >50 %.
[Figure labels: Haplogroup A, Haplogroup B]
Figure 3.4: Mutational network observed among C. elongatus haplotypes for cytochrome b based on parsimony. Each numeric circle
corresponds to a haplotype listed in Table 3.4; each node represents one nucleotide substitution. Branch lengths do not correspond to
genetic distance. Inset map shows the geographic distribution of haplogroups (purple=haplogroup C; light green=haplogroup D).
[Figure labels: Haplogroup C, Haplogroup D]
Figure 3.5: Neighbour-joining dendrogram of relationships among cytochrome b
haplotypes based on p-distances with 500 bootstrap replicates. Haplotype numbers (Table
3.4) are represented by numbers outside brackets, while numbers of individuals are
represented by numbers inside brackets. Numbers at branch nodes show bootstrap support
values >50 %.
[Figure labels: Haplogroup C, Haplogroup D]
Figure 3.6: Mutational network observed among C. elongatus haplotypes for total evidence for cytochrome b and ATPase 6 and 8
based on parsimony. Each numeric circle corresponds to a haplotype listed in Table 3.5; each node represents one nucleotide
substitution. Branch lengths do not correspond to genetic distance. Inset map shows the geographic distribution of haplogroups
(orange=haplogroup 1, olive green=haplogroup 3, red=haplogroup 2).
[Figure labels: Haplogroup 1, Haplogroup 2, Haplogroup 3]
Figure 3.7: Neighbour-joining dendrogram of relationships among total evidence
(cytochrome b and ATPase 6 and 8) haplotypes based on p-distances with 500 bootstrap
replicates. Haplotype numbers (Table 3.5) are represented by numbers outside brackets,
while numbers of individuals are represented by numbers inside brackets. Numbers at
branch nodes show bootstrap support values >50%.
Haplogroup 1
Haplogroup 2
Haplogroup 3
Figure 3.8: Distribution of haplogroups (orange=haplogroup 1, green=haplogroup 3, red=haplogroup 2, black= unassigned) for
combined cytochrome b and ATPase 6 and 8 data using groups identified via mutational network (Figure 3.6) and genetic distance
(Figure 3.7).
Figure 3.9: Results from Bayesian clustering analyses in STRUCTURE for Redside Dace individuals, where K represents the number of
genetically unique populations. Analyses were run from K=1 to K=29 using methods outlined in Chapter 2. Results were analysed using (i) log
likelihood (L(K)), and (ii) the ∆K approach outlined in Evanno et al. (2005).
[Figure 3.9 plot: Mean LnP(K) and Delta K plotted against Number of Clusters (K).]
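As an illustration of the ∆K statistic referred to in Figure 3.9, the following Python sketch computes ∆K from replicate STRUCTURE log-likelihoods. It is a minimal example under stated assumptions, not the procedure used in this thesis: the function name evanno_delta_k and the input values are hypothetical, and ∆K is taken here as |L(K+1) - 2L(K) + L(K-1)| calculated on the run means and divided by the standard deviation of L(K) across runs, which is one common way of computing it.

```python
from statistics import mean, stdev


def evanno_delta_k(lnp_by_k):
    """Return {K: Delta K} from {K: [LnP(K) for each replicate run]}.

    Delta K is only defined for K values that have both neighbours
    (K-1 and K+1) and at least two replicate runs.
    """
    means = {k: mean(v) for k, v in lnp_by_k.items()}
    delta = {}
    for k, runs in lnp_by_k.items():
        if k - 1 not in means or k + 1 not in means or len(runs) < 2:
            continue
        sd = stdev(runs)
        if sd == 0:
            continue  # Delta K is undefined when replicate runs are identical
        second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        delta[k] = second_diff / sd
    return delta


# Hypothetical LnP(K) values for two replicate runs at K = 1..4
example = {
    1: [-100.0, -102.0],
    2: [-80.0, -82.0],
    3: [-78.0, -79.0],
    4: [-77.0, -79.0],
}
delta_k = evanno_delta_k(example)
best_k = max(delta_k, key=delta_k.get)  # K with the largest Delta K
```

In this toy input the largest ∆K falls at the K where the likelihood curve flattens, which is the pattern the ∆K approach is designed to pick out.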
Figure 3.10: Bayesian clustering assignment implemented in STRUCTURE for 28 populations: (a) K=3 for range-wide
analysis, and (b) results of separate STRUCTURE runs on the three identified subsets for fine-scale analysis, showing optimal K
values along with preceding and successive K values. All runs were implemented with no admixture, and independent allele
frequencies. Colours between different runs have no association with each other.
Figure 3.11: Results from Bayesian clustering analyses in STRUCTURE for Redside Dace individuals, for three genetic groups
identified at K=3 in Figure 3.10, where K represents the number of genetically unique populations. Analyses were run for the red group from
K=1 to K=10 (top left), the green group from K=1 to K=20 (top right), and the blue group from K=1 to K=10 (bottom) using methods
outlined in Chapter 2. Results were analysed using (i) log likelihood (L(K)), and (ii) the ∆K approach outlined in Evanno et al. (2005).
Figure 3.12: Principal coordinate analysis (PCoA) of genetic structure across all sampled Redside Dace populations (red= cluster A,
blue= cluster B, green=cluster C). Inset map shows the geographic distributions of each genetic group.
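[Figure 3.12 plot: PCoA 1 (33.2%) versus PCoA 2 (14.2%). Points are labelled with population codes (EFI, BRU, LCR, NFR, HUM, HAN,
MIL, FOU, KET, SMC, TTR, CAR, ROU, DON, MIT, GUL, DOD, OST, COB, UNN, WOO, BHR, EBM, EBC, SAU, STR, LRR, RED) and assigned to
Group A, Group B or Group C.]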
Figure 3.13: Neighbour joining dendrogram of genetic relationships among sampled
populations based on Nei et al. (1983) DA genetic distance for 10 microsatellite loci.
Numbers at branch nodes represent bootstrap support values > 50% based on 500
bootstrap replicates. Groups correspond to those identified in Figure 3.12.
Group A
Group B
Group C
Figure 3.14: Plot of isolation by distance for pairwise population comparisons of transformed geographic distance [ln (distance in
km+1)] versus genetic divergence [(FST)/(1-FST)].
Figure 3.15: Isolation by distance plot of transformed geographic distance [ln (distance in km+1)] versus genetic divergence
[(FST)/(1-FST)] for population pairs with geographic distances of less than 123 km. Points in yellow represent
pairwise comparisons among the Allegheny River populations.
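The axis transformations used in Figures 3.14 and 3.15 can be written out explicitly. The short Python sketch below is illustrative only: the function name ibd_axes is hypothetical, the FST value of 0.08 is the EFI-BRU comparison from Table 3.7, and the 20 km distance is an assumed placeholder rather than a measured distance from this study.

```python
import math


def ibd_axes(distance_km, fst):
    """Return (x, y) for an isolation-by-distance plot.

    x = ln(distance in km + 1); y = FST / (1 - FST).
    """
    if distance_km < 0:
        raise ValueError("distance must be non-negative")
    if fst >= 1:
        raise ValueError("FST must be below 1 for FST / (1 - FST) to be defined")
    return math.log(distance_km + 1.0), fst / (1.0 - fst)


# FST = 0.08 (EFI versus BRU, Table 3.7); 20 km is a hypothetical distance
x, y = ibd_axes(20.0, 0.08)
```

The linearized FST grows faster than FST itself as divergence increases, so strongly differentiated pairs are spread out along the vertical axis.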
Table 3.1: Locations with drainage, jurisdiction, code names, latitude/longitude and number of samples obtained for mtDNA and
microsatellite genetic samples used for study.
State/Province  Drainage                Population                 Code  Latitude (ºN)  Longitude (ºW)  ATPase  Cyt b  Microsatellite
Minnesota       Mississippi R           Little Cannon River        LCR   44.35          -92.96          12      14     33
Minnesota       Mississippi R           North Fork Zumbro          NFR   44.30          -92.79          14      15     35
Michigan        Lake Superior           Unnamed Creek              UNN   46.68          -90.02          12      12     36
Michigan        Lake Superior           Schroeder Creek            SCH   46.56          -89.83          7       6      --
Wisconsin       Mississippi R           Little Rib River           LRR   45.09          -89.81          12      10     23
Wisconsin       Mississippi R           East Fork Raccoon Creek    EFR   42.53          -89.13          1       1      --
Indiana         Ohio R                  Mill Creek                 MIL   40.77          -85.90          13      14     35
Indiana         Ohio R                  Hanna Creek                HAN   39.66          -84.88          13      10     44
Ontario         Lake Huron              Two Tree River             TTR   46.25          -84.03          22      21     40
Kentucky        Kentucky R (Ohio R)     East Fork Indian Creek     EFI   37.87          -83.66          11      14     29
Kentucky        Kentucky R (Ohio R)     Red River                  RED   37.83          -83.63          10      9      10
Kentucky        Licking R (Ohio R)      Brushy Fork                BRU   37.95          -83.51          14      14     29
Ontario         Lake Huron              Gully Creek                GUL   43.61          -81.66          14      15     36
Ohio            Lake Erie               East Branch Chagrin River  EBC   41.54          -81.29          16      16     28
Ontario         Lake Huron              Saugeen River              SAU   44.25          -80.41          2       0      14
West Virginia   Monongahela R (Ohio R)  Straight Fork              STR   38.85          -80.37          10      9      12
West Virginia   Monongahela R (Ohio R)  Whiteday Creek             WDC   39.43          -79.97          3       2      --
Ontario         Lake Ontario            Sixteen Mile Creek         SMC   43.57          -79.89          --      --     45
Ontario         Lake Ontario            Fourteen Mile              FOU   43.42          -79.77          12      12     43
Ontario         Lake Simcoe             Kettleby Creek             KET   44.00          -79.56          18      16     35
Ontario         Lake Ontario            Humber River               HUM   43.92          -79.56          --      --     36
Ontario         Lake Ontario            Mitchell Creek             MIT   43.97          -79.14          --      --     34
Ontario         Lake Ontario            Carruthers Creek           CAR   43.92          -79.03          14      12     50
Pennsylvania    Allegheny R (Ohio R)    East Branch Mahoning       EBM   41.01          -78.76          12      12     34
Pennsylvania    Allegheny R (Ohio R)    Bloomster Hollow Run       BHR   41.75          -78.52          22      21     31
New York        Allegheny R (Ohio R)    Dodge Creek                DOD   42.13          -78.24          18      17     21
New York        Susquehanna R           Otselic River              OST   42.76          -75.74          12      14     17
New York        Black R (Lake Ontario)  Kidder Creek               KID   43.93          -75.64          5       5      --
New York        Black R (Lake Ontario)  Cobb Creek                 COB   43.85          -75.64          12      13     22
Table 3.2: ATPase variable sites for 23 unique haplotypes of C. elongatus (1st column), nucleotide positions at which mutations occur
(1st row), number of individuals (N) and populations that contain that particular haplotype. Haplotype 1A represents reference
sequence for table; dots within a cell represent nucleotide positions identical to the reference sequence.
Hap, followed by one column per variable nucleotide position (left to right: 35, 50, 58, 67, 136, 148, 150, 152, 181, 205, 214, 274, 295, 347, 358, 386, 400, 511, 532, 556, 568, 598, 607, 650, 652, 667, 677, 679, 694), then N and Populations
1A C C C G C G G G G G G A A A A C A C T T G A C C G G C A G 136 DOD, OST, COB, KID, DON, CAR, FOU, HAN, MIL, KET, EBC, WOO, BHR
2A . . . . . A . . . . . . . . . . . . . . . . . . . . . . . 1 DOD
3A . . . . . . . . . . . . . . G . . . . . . . . . . . . . . 39 DOD, WOO, BHR, EBM, STR, WDC
4A . . . . . . . . . . . . . . . . . . . . . G A . . . . . . 2 OST
5A . . . . . . . . . . . . . . . . G . . . . . . . . . . . . 3 CAR
6A . . . . . . . . . . . . . . . . . . . . . . . . . . . G . 4 GUL
7A . . . . . . . A . . . . . . . . . . . . . . . . . . . G . 10 GUL
8A . . . . . . . . . . . . . . . . . . . . . . . . . . T . . 2 HAN
9A T . . . . . . . . A A . . . G . . T . C . . . . . . . . . 23 EFR, TTR
10A . . . . . . . . . . . . . . . . . . . . . . . . . A . . . 11 MIL
11A . . . . . . . . . . . . . . . . . . . . A . . . . . . . . 5 KET
12A . . . . . . . . C . . . G . . . . . . . . . . . . . . . . 23 EFI, RED
13A . . . . . . . . C . . . G . . . . . . . A . . . . . . . . 12 BRU
14A T . . . T . . . . . . . . . G . . . . C . . . T A . . . . 15 SCH, UNN
15A T . . . T . . . . . . . . . G . . . . C . . . . A . . . . 38 SCH, LCR, NFR, LRR
16A T . . . T . . . . . . . . G G . . . . C . . . . A . . . . 2 LCR
17A T . . . T . A . . . . . . . G . . . C C . . . . A . . . . 2 NFR
18A . . . . . . . . . . . . . . G . . . . . . . . . . . . . C 1 WOO
19A . . . . . . . . . . . G . . G . . . . . . . . . . . . . . 3 BHR
20A . T T . . . . . . . . . . . . . . . . . A . . . . . . . . 1 BHR
21A . . . . . . . . . . . . . . . . . T . . . . . . . . . . . 1 EBM
22A . . . . . . . . . . . . . . . T . . . . . . . . . . . G . 2 SAU
23A . . . A . . . . . . . . . . G . . . . . . . . . . . . . . 2 WDC
Table 3.3: Summary of ATPase 6 and 8 and cytochrome b sequence results for 27 Redside Dace populations, showing numbers of
sequenced individuals (N), number of haplotypes detected (Nh), haplotypic richness (Rh), haplotype diversity (h) and nucleotide
diversity (π).
            ATPase 6 and 8                              Cytochrome b
Population  N   Nh  Rh (n=10)  h     π                  N   Nh  Rh (n=9)  h     π
LCR         12  2   0.99       0.30  3.80 x 10^-4       14  3   1.8       0.56  9.00 x 10^-4
NFR         14  2   0.93       0.27  6.50 x 10^-4       15  1   0.00      0     0
UNN         12  1   0.00       0     0                  12  1   0.00      0     0
SCH         7   2   --         0.57  7.1 x 10^-4        6   1   --        0     0
LRR         12  1   0.00       0     0                  10  3   1.9       0.60  6.1 x 10^-4
MIL         13  2   0.96       0.28  3.5 x 10^-4        14  2   1.00      0.53  4.8 x 10^-4
HAN         13  2   0.96       0.28  3.5 x 10^-4        10  1   0.00      0     0
TTR         22  1   0.00       0     0                  21  1   0.00      0     0
EFI         11  1   --         0     0                  14  2   0.64      0.14  1.3 x 10^-4
RED         10  1   0.00       0     0                  9   1   0.00      0     0
BRU         14  2              0.26  3.3 x 10^-4        14  1   0.00      0     0
GUL         14  2   1.0        0.44  5.5 x 10^-4        15  2   0.96      0.34  6.2 x 10^-4
EBC         16  1   0.00       0     0                  16  1   0.00      0     0
SAU         2   1   --         0     0                  0   0   --        0     0
STR         10  1   0.00       0     0                  9   2   1.00      0.40  3.5 x 10^-4
WDC         3   2   --         0.67  8.3 x 10^-4        2   1   --        0     0
FOU         12  1   0.00       0     0                  12  3   1.96      0.68  7.6 x 10^-4
KET         18  2   1.0        0.43  5.3 x 10^-4        16  2   0.98      0.40  1.09 x 10^-3
WOO         13  3   1.77       0.61  8.3 x 10^-4        11  4   2.64      0.67  4.82 x 10^-4
DON         14  1   0.00       0     0                  13  1   0.00      0     0
CAR         14  2   0.99       0.36  4.5 x 10^-4        12  1   0.00      0     0
EBM         12  2   0.83       0.17  4.1 x 10^-4        12  5   3.46      0.80  3.00 x 10^-3
BHR         22  4   2.30       0.64  1.27 x 10^-3       21  7   3.66      0.84  5.22 x 10^-3
DOD         18  3   1.53       0.45  5.9 x 10^-4        17  6   3.80      0.83  4.07 x 10^-3
OST         12  2   0.99       0.30  7.5 x 10^-4        14  2   0.64      0.14  1.3 x 10^-4
KID         5   1   --         0     0                  5   1   --        0     0
COB         12  1   0.00       0     0                  12  2   0.75      0.17  1.50 x 10^-4
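As a worked check on the haplotype diversity (h) column of Table 3.3, the Python sketch below computes h from haplotype counts. It assumes the standard unbiased estimator h = n/(n-1) x (1 - sum of squared haplotype frequencies); the thesis does not state the formula at this point, so this is an assumption, and the function name is hypothetical. The example counts for LCR (10 and 2 individuals) follow from Tables 3.2 and 3.3: 12 individuals, two ATPase haplotypes, one of which (16A) was found in 2 individuals.

```python
def haplotype_diversity(counts):
    """Unbiased haplotype diversity from a list of haplotype counts."""
    n = sum(counts)
    if n < 2:
        raise ValueError("at least two sequences are required")
    sum_sq_freq = sum((c / n) ** 2 for c in counts)
    return n / (n - 1) * (1.0 - sum_sq_freq)


# LCR, ATPase 6 and 8: haplotype 15A in 10 individuals, 16A in 2
h_lcr = haplotype_diversity([10, 2])

# A population fixed for one haplotype has zero diversity
h_fixed = haplotype_diversity([12])
```

Under this assumption the LCR value rounds to the 0.30 reported in the table.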
Table 3.4: Cytochrome b variable sites for 35 unique haplotypes of C. elongatus (1st column), showing nucleotide positions at which
mutations occur (1st row), number of individuals (N) and populations that contain that particular haplotype. Dots within a cell represent
nucleotide positions identical to the reference sequence.
Hap, followed by one column per variable nucleotide position (left to right: 11, 30, 90, 103, 126, 128, 212, 213, 227, 234, 255, 288, 330, 375, 399, 402, 408, 426, 435, 438, 456, 461, 513, 539, 573, 608, 612)
1B A A T A A T C C G T T C G C G G C A G T A G A G A C C
2B . . . . . . . . A . C T . . T . . . A . . A . . . . T
3B . . . . . . . . A C C T . . T . . . A . . A . . . . T
4B . . . . . . . . A . C T . . T . . G A . . A . . . . T
5B . . . . . . . . A . C T . . T A . . A . . A . . . . T
6B . . . . . . T . . . . . . T . . . . . . . . . . . . .
7B . . . . . . . . A . C T . . T . . . A . . A . . . . T
8B . . . . . . . . A . C T . . T . . . A . . A . . . . T
9B . . . . . . . . A . C T . . . . . . A . . A . . . . T
10B . . . . . . . . A . C T . . T . . . A . . A . . . . T
11B . . . . . . . . A . C T . . . . . . A . . A . . . . T
12B . . . . G . . . A . C T . . . . . . A . . A . . . . T
13B . . . . . . . . A . . T . . . . . . A . . A . A . . T
14B . . . . . . . . A . C T . . . . T . A . . A . . . . T
15B . G . . . . . . A . C T . . . . T . A . . A . . . . T
16B . . . . . . . . A . C T . . T . . . A C . A . . . T T
17B . . . . . . . . A . C T . . . . . . A . . A . . . . T
18B . . . . . . . . A . C T . . . . . . A . . A . . G . T
19B . . . . . . . . A . . T . . . . . . A . . A . . . . T
20B . . . . . . . . A . . T . . . . T . A . . A . . . . T
21B . . . . . . . . A . . T . . . . . . A . . A . . . . T
22B . . . . . . . T A . C T . . T . . . A . . A . . . . T
23B . . . G . . . . A . C T . . T . . . A . . A . . . . T
24B . . . . . C . . . . . . . . . . . . . . . . . . . . .
25B . . . . . . . . . . . . . . . . . . . . . . G . . . .
26B . . . . . . T . . . . . . . . . . . . . . . . . . . .
27B . . . . . C . . A . C T . . T . . . A . . A . . . . T
28B . . . . . . . . A . C T A . T . . . A . G A . . . . T
29B . . . . . . . . . . . . . . . . . . . . . . . . . . .
30B . . C . . . . . . . . . . . . . . . . . . . . . . . .
31B . . . . . . . . . . . . . . . . . . . . . . . . . . .
32B . . . . . . . . A . C T . . T . . . A . . A . . . . T
33B . . . . . . . . A . . T . . . . . . A . . A . . . . T
34B G . . . . . . . A . . T . . . . . . A . . A . . . . T
35B . . . . . . . . . . . . . . . . . . . . . . . . . . .
Table 3.4 (continued)
Hap, followed by one column per variable nucleotide position (left to right: 642, 672, 675, 699, 725, 729, 744, 837, 839, 885, 1017, 1020, 1023, 1061), then N and Populations
1B C G T G G T A C C A C G T A 23 DOD, WOO, EBM, STR, WDC
2B . . . . . . . . . . T . . . 56 DOD, COB, KID, OST, CAR, FOU, BHR
3B . . . . . . . . . . T . . . 1 DOD
4B . . . . . . . . . . T . . . 3 DOD, BHR
5B . . . . . . . . . . T . . . 5 DOD, BHR
6B . . . . . . . . . . . . . . 1 DOD
7B . . . . . . . . . . T . C . 1 OST
8B . . . . . . . . T . T . . . 1 COB
9B . . . . . . . . . . T . . . 56 DON, FOU, HAN, KET, EBC,
10B . . . . . . . . . . T . . G 2 FOU
11B T . . . . . . . . . T . . . 3 GUL
12B . . . . . . . . . . T . . . 12 GUL
13B . . . . . . . . . . T . . G 22 EFR, TTR
14B . . . . . . . . . . T . . . 8 MIL
15B . . . . . . . . . . T . . . 6 MIL
16B . . . . . . . . . . T . . . 4 KET
17B . . . . . . . . . . T . . G 36 EFI, BRU, RED
18B . . . . . . . . . . T . . G 1 EFI
19B . . . . . . . . . . T . . G 48 UNN, SCH, LCR, NFR, LRR
20B . . . . . . . . . . T . . G 2 LCR
21B . . . . . . . T . . T A . G 3 LCR
22B . . . . . C . . . . T . . . 3 WOO
23B . . . . . . . . . . T . . . 1 WOO
24B . . . . . . . . . . . . . . 1 WOO
25B . . . . . . . . . . . . . . 6 BHR
26B . . . . . . . . . . . . . . 4 BHR
27B . . . . . . . . . . T . . . 1 BHR
28B . . . . . . . . . . T . . . 3 BHR
29B . . . T . . . . . . . . . . 1 EBM
30B . C . T . . . . . . . . . . 4 EBM
31B . . . . . . G . . . . . . . 2 EBM
32B . . . . . . . . . G T . . . 1 EBM
33B . . . . A . . . . . T . . G 3 LRR
34B . . . . . . . . . . T . . G 1 LRR
35B . . C . . . . . . . . . . . 2 STR
Table 3.5: Haplotype name, number of individuals and population occurrences for 47
unique haplotypes based on combined sequences (ATPase 6 and 8, and cytochrome b).
Haplotype N Populations
1 2 WDC
2 18 STR, EBM, WOO, DOD
3 2 STR
4 3 LRR
5 27 LRR, LCR, NFR, SCH
6 1 LRR
7 21 TTR, EFR
8 4 EBM
9 2 EBM
10 1 EBM
11 1 EBM
12 3 BHR
13 3 BHR, DOD
14 6 BHR
15 49 BHR, FOU, CAR, KID, DOD, OST, COB
16 3 BHR
17 1 BHR
18 5 BHR, DOD
19 3 WOO
20 1 WOO
21 1 WOO
22 1 WOO
23 21 EFI, BRU, RED
24 54 EBC, KET, HAN, FOU, DON
25 2 NFR
26 1 LCR
27 3 LCR
28 2 LCR
29 15 SCH, UNN
30 1 EFI
31 12 BRU
32 4 KET
33 7 MIL
34 4 MIL
35 1 MIL
36 1 MIL
37 2 HAN
38 10 GUL
39 3 GUL
40 2 FOU
41 2 CAR
42 1 COB
43 2 OST
44 1 OST
45 1 DOD
46 1 DOD
47 1 DOD
Table 3.6: Genetic description of 28 Redside Dace populations (see Table 3.1 for
localities). Columns represent letter codes, number of individuals genotyped (N),
observed number of alleles (Na), standardized allelic richness (AR) for n=20 gene copies,
observed heterozygosity (HO), expected heterozygosity (HE), and inbreeding coefficient
(FIS).
Pop N Na AR HO HE FIS
LCR 33.0 5.5 3.89 0.39 0.38 0.08
NFR 34.0 3.5 2.86 0.36 0.35 -0.03
UNN 36.0 2.9 2.51 0.40 0.43 0.11
LRR 27.9 5.3 3.69 0.42 0.38 -0.09
MIL 35.0 3.9 3.25 0.49 0.49 -0.01
HAN 44.0 7.0 4.92 0.61 0.62 0.03
TTR 39.5 2.9 2.50 0.36 0.36 0.10
EFI 27.7 5.2 4.36 0.59 0.59 0.00
RED 10.0 3.3 3.30 0.61 0.51 -0.18
BRU 29.9 5.2 4.27 0.62 0.60 -0.03
GUL 33.7 3.1 2.63 0.39 0.41 0.08
EBC 28.0 6.1 4.91 0.65 0.60 -0.10
SAU 13.4 2.8 2.69 0.35 0.35 0.01
STR 12.0 3.5 3.39 0.45 0.44 0.06
SMC 44.0 5.4 4.01 0.54 0.54 0.00
FOU 42.5 4.7 3.85 0.57 0.55 -0.04
KET 33.4 3.8 3.34 0.48 0.45 -0.04
HUM 35.7 5.4 4.17 0.53 0.50 -0.04
WOO 31.0 6.8 5.19 0.58 0.57 0.06
DON 29.9 3.5 3.11 0.49 0.47 -0.06
ROU 45.9 5.1 3.83 0.59 0.56 -0.07
MIT 33.9 5.0 4.23 0.58 0.58 0.02
CAR 49.9 3.3 2.82 0.45 0.45 0.02
EBM 33.8 6.6 5.08 0.68 0.64 -0.07
BHR 31.0 7.8 5.72 0.59 0.61 0.03
DOD 21.0 6.1 4.96 0.60 0.56 -0.06
OST 16.8 4.7 4.13 0.51 0.48 -0.07
COB 22.0 5.1 4.03 0.50 0.48 -0.04
Table 3.7: Pairwise FST values among 28 Redside Dace populations along with sample size for each population.
Pop N EFI BRU LCR NFR HUM HAN MIL FOU KET SMC TTR CAR ROU DON MIT GUL DOD OST
EFI 28 0.00
BRU 30 0.08 0.00
LCR 33 0.47 0.47 0.00
NFR 34 0.49 0.48 0.10 0.00
HUM 36 0.27 0.25 0.50 0.51 0.00
HAN 44 0.15 0.14 0.43 0.45 0.27 0.00
MIL 35 0.30 0.34 0.44 0.47 0.41 0.33 0.00
FOU 43 0.23 0.22 0.46 0.48 0.10 0.24 0.38 0.00
KET 34 0.28 0.27 0.57 0.58 0.21 0.30 0.46 0.21 0.00
SMC 45 0.26 0.27 0.49 0.50 0.19 0.24 0.37 0.18 0.25 0.00
TTR 40 0.48 0.49 0.39 0.38 0.51 0.44 0.45 0.49 0.58 0.51 0.00
CAR 50 0.33 0.31 0.47 0.48 0.21 0.33 0.39 0.22 0.31 0.32 0.48 0.00
ROU 46 0.22 0.21 0.46 0.48 0.13 0.21 0.40 0.13 0.18 0.17 0.49 0.25 0.00
DON 30 0.21 0.25 0.54 0.57 0.24 0.29 0.41 0.22 0.19 0.22 0.57 0.34 0.20 0.00
MIT 34 0.18 0.19 0.47 0.49 0.16 0.19 0.35 0.15 0.12 0.14 0.50 0.27 0.11 0.15 0.00
GUL 36 0.30 0.35 0.60 0.62 0.43 0.33 0.45 0.39 0.46 0.35 0.62 0.49 0.38 0.39 0.35 0.00
DOD 21 0.26 0.26 0.51 0.54 0.24 0.27 0.43 0.23 0.18 0.23 0.54 0.35 0.21 0.21 0.15 0.42 0.00
OST 17 0.34 0.35 0.54 0.57 0.34 0.30 0.47 0.32 0.30 0.28 0.57 0.44 0.28 0.28 0.23 0.48 0.13 0.00
COB 22 0.29 0.32 0.54 0.56 0.22 0.29 0.40 0.24 0.28 0.13 0.56 0.35 0.17 0.23 0.17 0.36 0.22 0.23
UNN 36 0.45 0.45 0.29 0.32 0.47 0.42 0.44 0.42 0.51 0.46 0.33 0.45 0.44 0.50 0.44 0.57 0.47 0.49
WOO 31 0.26 0.24 0.50 0.52 0.20 0.26 0.41 0.21 0.14 0.22 0.53 0.32 0.21 0.19 0.13 0.40 0.02 0.17
BHR 31 0.23 0.22 0.47 0.49 0.21 0.23 0.39 0.21 0.14 0.21 0.50 0.31 0.19 0.20 0.13 0.38 0.02 0.12
EBM 34 0.24 0.22 0.47 0.49 0.20 0.22 0.39 0.22 0.19 0.25 0.49 0.29 0.20 0.25 0.16 0.38 0.12 0.22
EBC 28 0.22 0.27 0.46 0.47 0.36 0.21 0.29 0.32 0.39 0.30 0.47 0.38 0.31 0.32 0.27 0.33 0.35 0.37
SAU 14 0.37 0.39 0.61 0.62 0.47 0.38 0.53 0.40 0.53 0.43 0.62 0.53 0.42 0.48 0.43 0.44 0.46 0.52
STR 12 0.30 0.29 0.57 0.59 0.35 0.31 0.44 0.33 0.30 0.33 0.59 0.42 0.32 0.35 0.24 0.50 0.18 0.35
LRR 28 0.47 0.47 0.36 0.33 0.50 0.43 0.46 0.46 0.57 0.49 0.35 0.45 0.47 0.55 0.48 0.61 0.51 0.56
RED 10 0.09 0.18 0.52 0.54 0.32 0.20 0.38 0.29 0.37 0.31 0.53 0.40 0.27 0.29 0.24 0.32 0.31 0.37
Table 3.7: (continued)
COB UNN WOO BHR EBM EBC SAU STR LRR RED
EFI
BRU
LCR
NFR
HUM
HAN
MIL
FOU
KET
SMC
TTR
CAR
ROU
DON
MIT
GUL
DOD
OST
COB 0.00
UNN 0.51 0.00
WOO 0.23 0.46 0.00
BHR 0.21 0.43 0.04 0.00
EBM 0.25 0.43 0.12 0.08 0.00
EBC 0.31 0.44 0.35 0.33 0.32 0.00
SAU 0.47 0.57 0.45 0.42 0.41 0.39 0.00
STR 0.35 0.53 0.18 0.13 0.19 0.40 0.55 0.00
LRR 0.54 0.16 0.50 0.48 0.46 0.45 0.61 0.57 0.00
RED 0.32 0.49 0.31 0.27 0.26 0.25 0.41 0.37 0.52 0.00
Table 3.8: Analysis of Molecular Variance (AMOVA) for total evidence (Cytochrome b and ATPase 6 and 8) mitochondrial DNA
data based on hypothesized (i) Mississippi and Atlantic refugia (2 groups), (ii) mitochondrial DNA bootstrap supported groups (3
refugia), and (iii) microsatellite Principal Coordinate Analysis clustering (3 refugia hypothesis), and hierarchical FST analysis of
microsatellite data for (iv) eastern versus western groups (v) three groups identified by STRUCTURE, and (vi) contemporary drainage
patterns.
Source of variation               d.f.  Sum of squares  Variance components  % variation  P-value
(i) mtDNA: Mississippi versus Atlantic
Among groups                      1     337.08          2.73                 53.79        <0.001
Among populations within groups   25    486.73          1.63                 32.14        <0.001
Within populations                288   205.29          0.71                 14.07        <0.001
Total                             314   1029.10         5.07
(ii) mtDNA: 3 refugial groups (based on mtDNA bootstrap support for total evidence)
Among groups                      2     613.21          3.55                 71.10        <0.001
Among populations within groups   28    316.88          1.09                 21.91        <0.001
Within populations                284   99.01           0.35                 6.99         <0.001
Total                             314   1029.10         4.99
(iii) 3 refugial groups (mtDNA data, grouped by microsatellite for STRUCTURE and PCoA)
Among groups                      2     416.34          1.90                 47.07        <0.001
Among populations within groups   24    407.45          1.42                 35.24        <0.001
Within populations                288   205.29          0.71                 17.69        <0.001
Total                             314   1029.10         4.03
(iv) Eastern versus Western populations
Among groups                      1     568.63          0.96                 20.78        <0.001
Among populations within groups   26    1908.36         1.13                 24.64        <0.001
Within populations                1736  4359.26         2.51                 54.58        <0.001
Total                             1763  6836.24         4.60
(v) K=3 STRUCTURE populations
Among groups                      2     1015.34         0.84                 19.71        <0.001
Among populations within groups   25    1461.65         0.90                 21.12        <0.001
Within populations                1736  4359.25         2.51                 59.17        <0.001
Total                             1763  6836.24         4.24
(vi) Contemporary drainage patterns
Among groups                      3     660.92          0.41                 10.09        <0.001
Among populations within groups   24    1816.07         1.17                 28.50        <0.001
Within populations                1736  4359.25         2.51                 61.40        <0.001
Total                             1763  6836.24         4.90
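The "% variation" column in Table 3.8 expresses each variance component as a share of the total variance. The Python sketch below reproduces that arithmetic for analysis (i) as an illustration; the function name is hypothetical, and because the inputs are the rounded components printed in the table, the results agree with the reported percentages only to within rounding.

```python
def percent_variation(components):
    """Convert variance components to percentages of their total."""
    total = sum(components.values())
    if total <= 0:
        raise ValueError("total variance must be positive")
    return {level: 100.0 * value / total for level, value in components.items()}


# Variance components for analysis (i), Mississippi versus Atlantic (Table 3.8)
amova_i = {
    "Among groups": 2.73,
    "Among populations within groups": 1.63,
    "Within populations": 0.71,
}
pct = percent_variation(amova_i)
```

The among-group share is the quantity compared across the alternative groupings in the table: the grouping with the largest among-group percentage partitions the data best.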
Chapter 4: General Discussion
The results presented in both data chapters highlighted the value of genetic tools
and information for improving knowledge and management plans for species at risk.
Chapter 2 results demonstrated the effectiveness of environmental DNA (eDNA) for
documenting the presence of Redside Dace, and that eDNA monitoring can be more
sensitive than electrofishing at species detection. Environmental DNA results also
showed that sampling design, number of replicates, and season are all important
considerations for the application of the technique. Used properly, eDNA monitoring
should help to increase the chances of detecting species when present at a site, and
support the implementation of recovery actions (Darling and Mahon 2011). Chapter 3
results similarly showed the value of genetic information for species conservation, with
both mitochondrial and microsatellite DNA identifying three phylogeographic lineages
within Redside Dace. Combined mtDNA and microsatellite data also indicate the
occurrence and extent of secondary contact between two of the lineages during
postglacial colonization. Microsatellite data also showed that contemporary populations
of Redside Dace are highly structured with little to no gene flow occurring, and that levels
of genetic diversity within populations do not reflect the regional declines that have been
observed.
Although false positive and false negative detections are a recognized risk in eDNA
studies (Darling and Mahon 2011), Chapter 2 is the first study to set qPCR detection
thresholds using the Receiver Operator Characteristic (ROC) approach (Fan et al. 2006).
While other studies accounted for false negatives by estimating imperfect detection
(Schmidt et al. 2013, Ficetola et al. 2014, Hunter et al. 2015), few quantify false positive
error rates (see critique by Wilson et al. 2015), and many fail to acknowledge
amplification in the negative control wells (Laramie et al. 2015, Roussel et al. 2015).
Although widely used in medical diagnostics to evaluate the efficacy of diagnostic tests
(false versus true positive and negative test results; Kumar and Indrayan 2011), ROC has
not previously been applied to interpreting eDNA detection levels and error rates. The
application of ROC criteria to eDNA detections could substantially reduce the error rates
and associated uncertainties highlighted by Darling and Mahon (2011). Setting an
arbitrarily low eDNA threshold to maximize potential detections could increase the rate of
false positives, and lead to mistakenly protecting unoccupied habitats. Alternatively,
setting a higher or overly conservative threshold would result in an increased risk of false
negatives, potentially leading to a lack of protection for habitats supporting species of
conservation concern, as seen in other species (reviewed in Miller et al. 1989). In at least
two cases [European weather loach (Misgurnus fossilis) and spotted gar (Lepisosteus
oculatus)], species were presumed locally extirpated, but eDNA yielded positive
detections (Sigsgaard et al. 2015, Boothroyd et al. 2015). Integrating the ROC approach
into future eDNA studies would be useful to reduce the risk of the negative conservation-
related consequences of setting too low or too high a threshold.
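The threshold trade-off described above can be sketched in code. The following Python example is illustrative only and is not the analysis performed in Chapter 2: it assumes that each qPCR reaction yields a single detection score where higher values indicate a stronger signal, uses hypothetical scores from known-negative and known-positive samples, and selects the cut-off that maximizes Youden's J (true positive rate minus false positive rate), which is one common ROC criterion among several.

```python
def youden_threshold(scores, is_positive):
    """Pick the score cut-off that maximizes Youden's J = TPR - FPR.

    A reaction is called a detection when its score >= threshold.
    Ties are resolved in favour of the lowest threshold.
    """
    n_pos = sum(is_positive)
    n_neg = len(is_positive) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need both known positives and known negatives")
    best = None
    for t in sorted(set(scores)):
        tp = sum(1 for s, p in zip(scores, is_positive) if p and s >= t)
        fp = sum(1 for s, p in zip(scores, is_positive) if not p and s >= t)
        tpr, fpr = tp / n_pos, fp / n_neg
        j = tpr - fpr
        if best is None or j > best["J"]:
            best = {"threshold": t, "TPR": tpr, "FPR": fpr, "J": j}
    return best


# Hypothetical scores: five known negatives followed by four known positives
scores = [0.05, 0.1, 0.2, 0.3, 0.6, 0.5, 0.9, 1.2, 1.5]
truth = [False, False, False, False, False, True, True, True, True]
result = youden_threshold(scores, truth)
```

With these toy data the selected cut-off accepts one false positive in exchange for detecting every known positive; a stricter cut-off would remove the false positive but miss a true detection, which is the trade-off discussed in the text.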
The identification of three phylogeographic lineages within Redside Dace
(Chapter 3) is a significant contribution towards species management and recovery plans.
Conservation below the species level is critical for recovery efforts in order to maintain
the adaptive resources and potential of intraspecific lineages, as well as conserving the
species’ evolutionary legacy (Moritz 1994, Frankham et al. 2002, Geist 2011). My study
is the first initiative to identify Redside Dace evolutionary lineages and hierarchical
genetic structuring across the species range. Based on these results, conservation
strategies for managing Redside Dace should incorporate phylogeographic ancestry into
recovery planning, as well as for reintroduction or translocation efforts.
Microsatellite data were also useful for identifying and mapping the different
lineages within individual jurisdictions. Although all three lineages are present in the
United States, Redside Dace populations in individual states showed membership to only
one major microsatellite group, despite mitochondrial evidence of secondary contact in
Pennsylvania and New York. By contrast, all three microsatellite-based groups were
detected within Ontario, although all but one sampled population belonged to the same
mitochondrial lineage. Strong spatial structuring and limited gene flow shown by
microsatellite data in Chapter 3 also suggest that conservation efforts should be focused
on local or regional scales. The majority of populations were genetically distinct from all
others regardless of geographic distance or proximity. Accordingly, conservation efforts
should take a population-based approach to manage Redside Dace where possible. It
should be noted that the genetic diversity data did not always reflect population status.
Demographic data from some populations included in Chapter 3 indicate they may be
below the numbers needed for long-term population viability (Poos et al. 2012) despite
exhibiting moderate to high genetic diversity.
Both microsatellite and mitochondrial data suggest that Redside Dace populations
in Ontario represent three distinct groups, which should be taken into account for future
recovery efforts. Although the federal Species at Risk Act and the provincial Endangered
Species Act allow protection of units below the species level, all populations in Ontario
are currently assessed at the species level provincially, and are treated by COSEWIC as a
single designatable unit (DU) for conservation (COSEWIC 2007). While evidence of genetic
discreteness is important for identifying DUs, evidence of evolutionary and/or ecological
significance is also required (COSEWIC 2014). Although few ecological data are
available for population comparisons, Novinger and Coon (2000) observed physiological
differences between Redside Dace populations from separate lineages, suggesting
adaptive ecological differences among the different genetic groups. It may therefore be
worth considering whether the genetic groups identified in my thesis should be
recognized as separate DUs. Regardless of whether populations in Canada are classified
as a single or multiple DUs, recognition of multiple distinct genetic conservation units is
important for recovery efforts; translocation of fish between different DUs may not be
successful, as demonstrated by Novinger and Coon (2000).
The combined results from Chapters 2 and 3 provide potent information and tools
for aiding Redside Dace restoration and recovery efforts. Environmental DNA monitoring
can be used in advance of re-introduction efforts to determine if candidate habitats still
support unrecognized remnant populations. Negative eDNA results from a presumed
extirpated site should reduce the risk of inadvertently stocking fish on top of a local
population, with potential negative consequences (George et al. 2009). Conversely, if a
site considered to be extirpated yields positive eDNA detections for Redside Dace and
translocation is still considered appropriate, caution should be taken that the source
population for translocations or stocking is from the same lineage as the recipient
population. Translocating individuals into a re-introduction site that contains an
undetected remnant population could substantially alter the genetic composition of the
remnant population, as well as result in outbreeding depression (George et al. 2009).
Additionally, if an extirpated site tests “negative” for Redside Dace, eDNA could be used
after the reintroduction has occurred to evaluate the success of translocation or stocking
experiments.
For re-introduction efforts, the combination of genetic data and population
abundance estimates (e.g. Redside Dace Recovery Team 2010, Poos et al. 2012) will be
important to identify source populations for reintroductions at extirpated sites.
Mitochondrial and microsatellite data can be used in advance of re-introduction efforts to
identify suitable populations to serve as sources for re-introduction based on evolutionary
lineage(s) and levels of genetic diversity. Microsatellite markers can be used post re-
introduction to determine translocation or stocking success; genetic diversity levels can
be monitored after introduction, and assignment tests can be employed to identify how
well the source population(s) has/have contributed genetic material to successive
generations (Hansen et al. 2001, Piller et al. 2005). While captive breeding would allow
for a higher number of fish to be released into the wild with minimal consequences to
source populations, a potential undesirable genetic consequence of relying on hatchery
production could be the release of maladapted fish (Araki et al. 2009). As microsatellite
data have been used in sperm competition trials to assess the fitness importance of mate
choice in Redside Dace (Beausoleil et al. 2012), it would also be possible to assess the
effectiveness of hatchery mating based on mate choice to ensure that multiple fish are
contributing to future generations in order to avoid inbreeding, versus random mating to
select for specific strains. Augmentation using closely-related wild fish would result in
minimal negative genetic consequences to the recipient population; however, population
abundance estimates would also be advisable to ensure that the genetic diversity of source
populations is not compromised (Weeks et al. 2011). Additionally, genetic recapture
initiatives can take place in lieu of physical tagging for mark-recapture in order to obtain
population size estimates (Lukacs and Burnham 2005).
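As a minimal sketch of how genetic recaptures could yield an abundance estimate, the Python example below applies the Chapman-corrected Lincoln-Petersen estimator to two sampling occasions in which individuals are identified by multilocus genotype. The genotype identifiers are hypothetical, the estimator assumes a closed population and error-free genotype matching, and it is a deliberate simplification rather than the models reviewed by Lukacs and Burnham (2005).

```python
def chapman_estimate(first_sample, second_sample):
    """Chapman-corrected Lincoln-Petersen abundance estimate.

    Each sample is an iterable of genotype identifiers; an individual
    counts as a recapture when its genotype appears in both samples.
    """
    first, second = set(first_sample), set(second_sample)
    n1, n2 = len(first), len(second)
    recaptures = len(first & second)
    return (n1 + 1) * (n2 + 1) / (recaptures + 1) - 1


# Hypothetical genotype IDs: 10 fish on occasion 1, 8 on occasion 2, 3 shared
occasion_1 = ["g%02d" % i for i in range(1, 11)]
occasion_2 = ["g01", "g02", "g03", "g11", "g12", "g13", "g14", "g15"]
n_hat = chapman_estimate(occasion_1, occasion_2)
```

Few recaptures relative to sample sizes imply a larger population; in practice, genotyping error and unequal capture probability would need to be modelled explicitly.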
Future Directions
My thesis has made significant contributions to knowledge of Redside Dace
conservation, and is a starting point for future recovery efforts. Environmental DNA can
be used for long-term monitoring (Chapter 2), and the genetic data (Chapter 3) will serve
as a baseline for future translocation efforts (Redside Dace Recovery Team 2010). As per
recommendations of Chapter 3, multiple populations within each evolutionary lineage
should be protected, so that the evolutionary potential and historical legacy of Redside
Dace can be maintained. Once sites are selected for re-introduction, degraded habitats can
be restored, and suitable genetic stocks can be identified as source populations for
recovery (Meffe 1995). Considerations for translocations should include employing
ancestry matching versus environmental matching for selecting appropriate source and
recipient populations, determining if one or multiple source populations should be
translocated (Meffe 1995), and deciding whether transferring wild fish between
populations or relying on captive bred fish would be more appropriate (George et al.
2009, Houde et al. 2015).
References
Araki H, Cooper B, Blouin MS (2009) Carry-over effect of captive breeding reduces
reproductive fitness of wild-born descendants in the wild. Biology Letters, 5, 621-624.
Beausoleil JMJ, Doucet SM, Heath DD, Pitcher TE (2012) Spawning coloration, female
choice and sperm competition in the redside dace, Clinostomus elongatus. Animal
Behaviour, 83, 969-977.
Boothroyd M, Mandrak NE, Fox M, Wilson CC (2015) Environmental DNA (eDNA)
detection and habitat occupancy of threatened spotted gar (Lepisosteus oculatus). Aquatic
Conservation: Marine and Freshwater Ecosystems. Manuscript accepted.
COSEWIC (2007) COSEWIC assessment and updated status report on the Redside Dace
Clinostomus elongatus in Canada. Committee on the Status of Endangered Wildlife in
Canada. Ottawa, ON.
COSEWIC (2014) Guidelines for Recognizing Designatable Units. Committee on the
Status of Endangered Wildlife in Canada. Available from:
http://www.cosewic.gc.ca/eng/sct2/sct2_5_e.cfm.
Darling JA, Mahon AR (2011) From molecules to management: adopting DNA-based
methods for monitoring biological invasions in aquatic environments. Environmental
Research, 111, 978-988.
Fan J, Upadhye S, Worster A (2006) Understanding receiver operator characteristics
(ROC) curves. Canadian Journal of Emergency Medicine, 8, 19-20.
Ficetola GF, Pansu J, Bonin A, Coissac E, et al. (2014) Replication levels, false
presences, and the estimation of presence / absence from eDNA metabarcoding data.
Molecular Ecology Resources, 15, 543–556.
Frankham R, Briscoe DA, Ballou JD (2002) Introduction to Conservation Genetics.
Cambridge University Press, Cambridge.
Geist J (2011) Integrative freshwater ecology and biodiversity conservation. Ecological
Indicators, 11, 1507-1516.
George AL, Kuhajda BR, Williams JD, Cantrell MA, et al. (2009) Guidelines for
propagation and translocation for freshwater fish conservation. Fisheries, 34, 529-545.
Hansen MM, Kenchington E, Nielsen EE (2001) Assigning individual fish to populations
using microsatellite DNA markers. Fish and Fisheries, 2, 93-112.
Houde ALS, Garner SR, Neff BD (2015) Restoring species through reintroductions:
strategies for source population selection. Restoration Ecology, 23, 746-753.
Hunter ME, Oyler-McCance SJ, Dorazio RM, Fike JA, et al. (2015) Environmental DNA
(eDNA) sampling improves occurrence and detection estimates of invasive Burmese
pythons. PLoS ONE, 10, e0121655.
Kumar R, Indrayan A (2011) Receiver operating characteristic (ROC) curve for medical
researchers. Indian Pediatrics, 48, 277-287.
Laramie MB, Pilliod DS, Goldberg CS (2015) Characterizing the distribution of an
endangered salmonid using environmental DNA analysis. Biological Conservation, 183,
29-37.
Lukacs PM, Burnham KP (2005) Review of capture–recapture methods applicable to
noninvasive genetic sampling. Molecular Ecology, 14, 3909-3919.
Meffe GK (1995) Genetic and ecological guidelines for species reintroduction programs:
application to Great Lakes fishes. Journal of Great Lakes Research, 21, 3-9.
Miller RR, Williams JD, Williams JE (1989) Extinctions of North American fish during
the past century. Fisheries, 6, 22-38.
Minckley WL (1995) Translocation as a tool for conserving imperiled fishes: experiences
in western United States. Biological Conservation, 72, 297-309.
Moritz C (1994) Defining ‘evolutionarily significant units’ for conservation. Trends in
Ecology & Evolution, 9, 373-375.
Novinger DC, Coon TG (2000) Behavior and physiology of the redside dace,
Clinostomus elongatus, a threatened species in Michigan. Environmental Biology of
Fishes, 57, 315–326.
Piller KR, Wilson CC, Lee CE, Lyons J (2005) Conservation genetics of inland lake trout
in the upper Mississippi River basin: stocked or native ancestry? Transactions of the
American Fisheries Society, 134, 789-802.
Poos MS, Jackson DA (2012) Impact of species-specific dispersal and regional
stochasticity on estimates of population viability in stream metapopulations. Landscape
Ecology, 27, 405-416.
Redside Dace Recovery Team (2010) Recovery Strategy for Redside Dace (Clinostomus
elongatus) in Ontario. Prepared for the Ontario Ministry of Natural Resources.
Peterborough, ON.
Roussel JM, Paillisson JM, Tréguier A, Petit E (2015) The downside of eDNA as a
survey tool in water bodies. Journal of Applied Ecology, 52, 823-826.
Schmidt BR, Kery M, Ursenbacher S, Hyman OJ, Collins JP (2013) Site occupancy
models in the analysis of environmental DNA presence/absence surveys: a case study of
an emerging amphibian pathogen. Methods in Ecology and Evolution, 4, 646–653.
Sigsgaard EE, Carl H, Møller PR, Thomsen PF (2015) Monitoring the near-extinct
European weather loach in Denmark based on environmental DNA from water
samples. Biological Conservation, 183, 46-52.
Weeks AR, Sgro CM, Young AG, Frankham R, et al. (2011) Assessing the benefits and
risks of translocations in changing environments: a genetic perspective. Evolutionary
Applications, 4, 709-725.
Wilson CC, Wozney KM, Smith CM (2015) Recognizing false positives: synthetic
oligonucleotide controls for environmental DNA surveillance. Methods in Ecology and
Evolution, doi: 10.1111/2041-210X.12452
Appendix 2.1: Field data collected at 29 sites including sampling dates, fish caught, habitat characteristics (channel width, channel
depth, conductivity, temperature), and GPS coordinates.
Site Name | Code | Lat | Long. | Date sampled | Spring resample date | Mean channel width (m) | Mean water depth (m) | Mean width * mean depth | Water temp (°C) | Conductivity (μS) | Total fish caught
Lynde Creek 4 | L4 | 43.90 | -78.96 | 30-May-13 |  | 4.78 | 0.32 | 1.53 | 17.9 | 704 | 0
Lynde Creek 3 | L3 | 43.97 | -78.96 | 30-May-13 |  | 5.07 | 0.24 | 1.22 | 14.2 | 643 | 0
Lynde Creek 1 | L1 | 43.92 | -78.98 | 14-May-13 | 28-May-13 | 0.86 | 0.11 | 0.09 | 8.9 | 596 | 1
Lynde Creek 2 | L2 | 43.92 | -78.99 | 14-May-13 | 28-May-13 | 5.09 | 0.19 | 0.97 | 7.6 | 670 | 0
Lynde Creek 5 | L5 | 43.95 | -78.99 | 11-Jun-14 |  | 1.72 | 0.27 | 0.46 | 17.9 | 704 | 0
East Carruthers Creek 2 | E2 | 43.92 | -79.03 | 11-Jun-14 |  | 3.49 | 0.12 | 0.41 | 15.8 | 951 | 2
East Carruthers Creek 1 | E1 | 43.93 | -79.03 | 07-Jun-13 |  | 2.28 | 0.36 | 0.82 | 13.1 | 803 | 1
Spring Creek | DU2 | 43.93 | -79.07 | 15-May-13 | 28-May-13 | 7.25 | 0.22 | 1.60 | 11.5 | 494 | 0
Ganateskiagon Creek | DU3 | 43.88 | -79.10 | 15-May-13 | 28-May-13 | 3.40 | 0.09 | 0.31 | 12.1 | 588 | 1
Mitchell Creek | DU1 | 43.97 | -79.14 | 15-May-13 | 28-May-13 | 1.23 | 0.16 | 0.20 | 9.5 | 490 | 34
Morning Side Creek | R4 | 43.82 | -79.21 | 16-May-13 |  | 3.24 | 0.22 | 0.71 | 14 | 1281 | 0
Morning Side Creek | R3 | 43.83 | -79.23 | 16-May-13 |  | 2.33 | 0.17 | 0.40 | 15.9 | 1367 | 0
Berczy Creek | R6 | 43.88 | -79.33 | 17-May-13 |  | 3.51 | 0.21 | 0.74 | 12.4 | 1282 | 1
Trib to Berczy Creek | R2 | 43.88 | -79.35 | 17-May-13 | 06-Jun-13 | 1.39 | 0.10 | 0.14 | 15.1 | 1514 | 2
Leslie St. Trib | R5 | 43.88 | -79.39 | 23-May-13 |  | 3.97 | 0.24 | 0.95 | 17.6 | 960 | 21
Leslie St. Trib | R1 | 43.90 | -79.39 | 21-May-13 |  | 2.30 | 0.18 | 0.41 | 16.9 | 806 | 8
Don River Creek 1 | D1 | 43.86 | -79.46 | 21-May-13 | 06-Jun-13 | 2.45 | 0.12 | 0.29 | 18.9 | 1046 | 0
East Humber Drive | H2 | 43.93 | -79.53 | 06-Jun-13 |  | 4.07 | 0.23 | 0.9 | 14.6 | 724 | 0
East Humber River | H1 | 43.92 | -79.56 | 31-May-13 |  | 7.80 | 0.27 | 2.1 | 20.6 | 759 | 2
Purpleville Creek | P1 | 43.84 | -79.60 | 23-May-13 | 06-Jun-13 | 4.11 | 0.27 | 1.1 | 17.8 | 909 | 0
Humber Trail | W1 | 43.90 | -79.61 | 27-May-13 |  | 6.98 | 0.25 | 1.7 | 12.6 | 795 | 0
Fourteen Mile 2 | F2 | 43.42 | -79.73 | 24-May-13 | 04-Jun-13 | 6.56 | 0.11 | 0.72 | 12.3 | 1340 | 0
Fourteen Mile 1 | F1 | 43.42 | -79.76 | 24-May-13 | 04-Jun-13 | 3.05 | 0.13 | 0.40 | 17.9 | 1300 | 3
Fourteen Mile 3 | F3 | 43.42 | -79.77 | 24-May-13 | 04-Jun-13 | 2.39 | 0.28 | 0.67 | 10.4 | 775 | 6
Sixteen Mile 1 | SI | 43.57 | -79.89 | 31-May-13 |  | 2.43 | 0.10 | 0.24 | 17.9 | 670 | 7
Silver Creek | CR1 | 43.64 | -79.92 | 24-May-13 |  | 5.83 | 0.22 | 1.28 | 11.2 | 650 | 0
Stan J 2 | SJ2 | 43.49 | -81.66 | 03-Jun-13 |  | 0.88 | 0.19 | 0.17 | 16.8 | 609 | 0
Gully Creek 1 | GC1 | 43.61 | -81.68 | 03-Jun-13 |  | 6.15 | 0.12 | 0.74 | 13.4 | 600 | 3
Stan J 1 | SJ1 | 43.50 | -81.70 | 03-Jun-13 |  | 1.38 | 0.21 | 0.29 | 13.3 | 690 | 0
Appendix 2.2a: Raw qPCR values (copies/reaction) for TaqMan Fast qPCR mastermix (Applied Biosystems Inc.) during Spring (S)
and Fall (F) sampling seasons at 29 sampled Redside Dace sites for temporal (T1-T5) and spatial (S1-S4) replicates.
Code S-S1 S-S2 S-S3 S-S4 S-T1 S-T2 S-T3 S-T4 S-T5 F-S1 F-S2 F-S3 F-S4 F-T1 F-T2 F-T3 F-T4 F-T5
L4 0.26 0.14 0.00 0.00 0.18 0.09 1.47 0.58 0.051 0.00 0.00 0.03 0.02 0.06 0.00 0.04 0.00 0.00
L3 0.00 0.00 0.68 0.66 0.20 0.00 0.00 0.31 0.20 0.00 0.34 0.18 0.23 0.00 0.13 0.08 0.00 0.00
L1 0.10 0.00 0.08 0.17 0.00 0.15 0.03 0.00 0.00 0.73 1.68 1.15 0.00 0.19 0.83 0.11 0.60 0.00
L2 0.32 0.33 0.00 0.62 0.00 0.65 0.87 1.84 0.00 0.40 0.14 2.79 0.71 0.00 0.00 0.40 0.14 3.30
L5 0.09 0.00 0.00 0.25 0.00 0.00 0.00 0.19 0.00 0.00 0.00 0.00 0.19 0.00 0.00 0.00 0.27 0.00
E2 0.00 0.51 0.19 0.211 0.85 0.21 0.190 0.401 0.48 2.73 6.84 0.12 3.75 3.69 3.17 7.15 4.85 1.05
E1 5.80 10.55 10.03 14.59 13.62 11.84 9.95 9.08 9.65 0.00 0.41 0.63 1.29 0.37 0.00 0.00 1.40 0.36
DU2 0.42 0.39 0.30 0.00 0.00 0.53 1.43 0.54 0.00 6.30 1.32 2.03 1.94 1.96 0.60 0.75 1.90 1.64
DU3 0.51 2.47 1.65 1.127 2.35 3.86 4.31 2.77 0.61 0.42 1.00 2.17 1.88 2.35 1.11 3.12 5.30 2.14
DU1 0.33 0.76 0.78 1.12 0.32 0.70 1.55 0.83 0.91 107.5 14.07 81.39 90.33 12.21 80.90 14.21 6.72 27.45
R4 0.73 1.23 0.45 0.070 1.11 0.00 0.56 0.64 0.26 0.20 0.00 0.00 0.31 0.00 0.00 0.00 0.00 0.65
R3 0.36 1.31 1.33 1.24 0.30 0.83 0.90 1.99 2.56 1.29 0.40 0.00 0.37 6.30 1.06 0.00 1.75 0.30
R2 8.91 3.18 5.25 3.46 7.83 4.50 4.67 6.40 13.41 0.74 0.53 1.38 0.76 0.77 0.00 0.00 0.32 0.05
R5 8.32 4.66 7.04 7.85 4.12 4.38 11.74 55.32 4.45 7.20 5.07 5.93 6.47 4.38 1.75 3.64 5.91 4.04
R6 9.53 6.53 5.46 8.06 5.40 5.48 8.16 5.67 3.78 1.15 4.130 3.59 4.71 11.95 4.77 4.90 1.29 2.55
R1 0.25 40.46 79.43 61.73 35.59 74.18 27.05 15.71 12.15 11.80 21.11 9.55 9.66 14.91 8.97 55.98 10.50 18.13
D1 14.35 8.88 6.90 5.37 24.64 23.61 25.83 11.12 33.23 3.53 1.98 0.79 5.56 4.35 1.54 1.35 2.58 1.72
H2 1.24 1.32 1.77 0.48 1.85 0.08 1.89 1.18 1.82 2.63 0.22 1.50 0.84 0.34 2.26 0.51 0.52 0.12
H1 5.14 11.28 7.94 6.97 6.28 11.41 12.11 10.52 10.26 1.08 1.50 1.04 27.88 1.27 0.71 0.25 0.74 0.18
P1 5.34 3.12 6.16 1.63 1.88 0.96 2.97 1.53 3.70 1.77 3.23 5.23 3.04 5.98 3.44 2.08 5.26 5.20
W1 1.23 0.00 0.65 1.74 1.03 0.84 0.00 0.58 0.90 27.87 0.96 0.00 2.53 0.15 2.68 0.00 0.05 1.28
F2 4.24 4.55 2.80 3.89 3.49 3.21 3.89 2.70 1.52 0.80 0.20 2.04 0.70 1.52 0.00 0.20 0.28 0.71
F1 8.98 16.36 14.15 8.24 0.77 16.56 154.80 10.21 7.67 1.04 6.32 20.51 6.23 4.74 3.20 4.63 3.26 1.51
F3 9.65 7.16 8.48 7.62 4.71 1.40 20.08 15.24 13.39 1156.9 587.06 3.57 3.03 17.96 22.93 0.18 14.29 731.15
SI 21.74 5.13 8.44 17.04 13.23 12.38 21.69 12.44 8.80 18.96 11.56 15.92 4.04 9.73 7.36 18.87 6.08 15.32
CR1 1.85 1.61 0.63 2.36 0.76 1.93 2.76 2.31 0.94 0.00 0.00 0.00 0.19 0.00 0.00 0.33 0.00 0.75
SJ2 72.52 72.58 44.31 52.94 42.09 35.19 59.55 64.88 71.06 0.00 0.60 0.00 0.00 0.37 0.52 0.42 0.00 2.41
GC1 2.15 4.12 6.10 3.746 4.64 7.52 0.39 3.02 1.67 4.65 2.61 2.28 0.38 1.10 2.06 2.06 0.31 3.84
SJ1 15.40 19.96 7.48 33.20 21.51 17.45 23.78 23.72 18.13 11.53 7.98 11.21 18.11 3.01 3.42 6.35 6.93 11.29
Appendix 2.2b: Raw qPCR values (copies/reaction) for TaqMan Environmental qPCR mastermix (Applied Biosystems Inc.) during
Spring (S) sampling season at 29 sampled Redside Dace sites for temporal (T1-T5) and spatial (S1-S4) replicates.
Code S-S1 S-S2 S-S3 S-S4 S-T1 S-T2 S-T3 S-T4 S-T5
L4 0.10 0.28 0.36 1.91 0.39 0.28 3.98 0.88 0.60
L3 0.15 0.21 0.86 0.47 0.49 0.348 0.48 0.50 0.15
L1 0.17 0.45 0.53 1.745 0.61 0.15 0.40 0.02 1.06
L2 0.24 2.14 0.64 3.11 1.45 3.27 1.86 5.65 1.51
L5 0.00 0.53 0.08 0.00 0.15 0.09 0.088 0.30 0.12
E2 0.50 0.00 0.55 0.75 0.97 0.24 0.49 1.39 0.55
E1 8.51 9.54 11.25 25.38 20.92 18.78 19.81 18.21 17.50
DU2 0.385 0.73 0.72 0.95 0.49 0.32 1.19 0.77 2.97
DU3 1.386 3.42 1.89 2.08 3.04 2.31 3.96 5.30 3.12
DU1 0.11 3.64 0.73 2.65 1.83 0.29 3.50 2.65 4.36
R4 1.24 0.36 0.00 0.00 0.31 0.56 1.58 0.57 0.60
R3 0.98 5.24 3.45 2.62 3.38 2.19 4.30 6.58 4.75
R2 4.65 2.76 3.99 4.20 9.80 7.05 7.76 10.18 12.65
R5 8.38 6.82 8.57 4.80 5.46 8.69 8.01 48.55 14.21
R6 10.07 5.73 7.67 10.07 7.30 8.90 14.10 8.94 5.83
R1 0.54 65.89 76.62 70.28 51.20 110.13 52.47 34.94 28.12
D1 23.85 13.94 16.33 6.13 39.68 31.05 31.29 17.34 43.11
H2 1.11 0.60 0.26 0.52 0.29 0.11 1.56 0.00 0.40
H1 4.62 14.03 7.90 5.53 9.71 15.01 13.46 11.32 11.80
P1 6.79 4.62 7.78 3.57 1.57 2.01 1.56 3.68 4.06
W1 0.59 1.77 1.46 1.26 1.50 0.52 3.01 1.57 1.73
F2 1.53 5.14 1.22 6.44 1.39 1.64 4.81 2.22 2.32
F1 10.18 21.75 11.44 7.82 11.88 16.55 146.93 10.60 11.58
F3 6.97 4.17 7.97 10.53 5.81 0.20 8.53 10.71 8.60
SI 17.69 7.51 12.34 15.62 13.83 12.35 24.92 10.84 13.71
CR1 2.35 2.92 0.89 2.72 1.53 1.285 3.12 2.09 3.53
SJ2 58.88 49.59 40.01 44.93 30.04 18.73 37.12 34.03 50.50
GC1 3.99 2.70 1.93 4.77 3.92 3.62 2.66 4.97 3.41
SJ1 10.65 13.48 6.52 18.92 16.38 17.74 17.16 13.21 17.30
Appendix 2.3: Ten lake control sites where Redside Dace are absent, with their GPS
coordinates and dates sampled.
Site Name Given | Latitude | Longitude | Date Sampled
Stoco Lake | 44.47 | -77.31 | June 14th, 2013
Moira Lake | 44.48 | -77.47 | June 14th, 2013
Eels Lake | 44.89 | -78.11 | June 13th, 2013
Cedar Lake | 44.97 | -77.76 | June 14th, 2013
Paudask Lake | 44.99 | -78.07 | June 13th, 2013
Monck Lake | 44.99 | -78.11 | June 13th, 2013
Auger Lake | 45 | -78.05 | June 13th, 2013
Mayo Lake | 45.05 | -77.57 | June 13th, 2013
Weslemkoon Lake 1 | 45.08 | -77.45 | June 13th, 2013
Fraser Lake | 45.19 | -77.64 | June 13th, 2013
Appendix 2.4: Comparison of Environmental versus Fast mastermix
Introduction
Two real-time qPCR mastermixes, TaqMan® Fast Universal PCR Master Mix and
Environmental Master Mix 2.0, are both used in quantification assays; however, the
sensitivity of the two has yet to be compared. The purpose of this section was to
determine which mix was better at detecting eDNA at lower copy numbers. This has
important implications for future eDNA studies; a more sensitive mix will result in a
higher proportion of true positives for monitoring programs.
Methods
Water samples from fall 2012 were run using TaqMan® Fast Universal PCR
Master Mix (referred to as fast mix herein), while water samples from spring 2013 were
run using both fast mix and TaqMan® Environmental Master Mix 2.0 (referred to
as environmental mix herein), the latter of which was first used in the lab during 2013.
The environmental mix could not be tested on fall 2012 water samples because the DNA
may have degraded through freezing and thawing over the year it spent in the freezer.
The fast mix was compared to the environmental mix to determine whether the two
produced significantly different results. No significant difference would indicate that
data from the environmental and fast mixes could be analyzed together, whereas a
significant difference would indicate that the data from each mix would have to be
analyzed separately.
The environmental mix was run with the samples collected in spring 2013/2014
and fall 2013. For two replicates, each sample was run with 15 μL of the following
cocktail: 10 μL TaqMan® Environmental Master Mix 2.0, 0.4 μL of RSD-R, 0.4 μL of
RSD-F, 0.4 μL of RSD-probe, and 3.8 μL of ddH2O, plus 5 μL of stock DNA. For one of the
replicate plates, each sample was run with 15 μL of the following cocktail in order to test
for inhibition: 10 μL TaqMan® Environmental Master Mix 2.0, 0.4 μL of RSD-R, 0.4 μL
of RSD-F, 0.4 μL probe, 1.2 μL ddH2O, 2.2 μL 10x TaqMan® Exogenous Internal
Positive Control, and 0.4 μL 50x TaqMan® Exogenous Internal Positive Control DNA. The
internal positive control was run with the environmental mastermix reaction to determine
if inhibition was present within the sample. StepOnePlus thermocycling conditions for
TaqMan® Environmental Master Mix 2.0 were as follows: initial denaturation for 10 min
at 95 °C, followed by a two-step process of a 15 s denaturation at 95 °C and a 1 min
annealing at 60 °C, repeated for 40 cycles. The fast mix was run on all field samples,
while the environmental mix could not be used for the 2012 field collections because the
lab only began using the mix in 2013. A paired t-test was used to determine if there was a
significant difference between the means of the environmental and fast mixes. To
determine the direction and range of this difference, a Bland-Altman plot was generated
by graphing the mean DNA copies/reaction of the fast and environmental mixes for each
sample against the difference between the two methods (Bland and Altman 2003). Due to
the wide range of x-values, the x-axis was log-transformed.
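As an illustration of the comparison described above, the following minimal Python sketch computes the quantities behind a Bland-Altman plot and a paired t statistic on transformed values. It is not the code used in this thesis. It assumes that the limits of agreement are the mean difference ± 1.96 standard deviations of the differences (the conventional definition), that differences are taken as environmental minus fast, and that a constant of 0.5 is added before log10 transformation. The function names are hypothetical, and the nine paired values are the spring replicates for site E1 from Appendices 2.2a and 2.2b, so the output will not reproduce the study-wide statistics.

```python
import math
import statistics


def bland_altman(env, fast):
    """Return per-sample means, differences (env - fast), bias, and limits of agreement."""
    means = [(e + f) / 2 for e, f in zip(env, fast)]
    diffs = [e - f for e, f in zip(env, fast)]
    bias = statistics.mean(diffs)
    # Assumption: 95% limits of agreement = bias +/- 1.96 * SD of the differences.
    half_width = 1.96 * statistics.stdev(diffs)
    return means, diffs, bias, (bias - half_width, bias + half_width)


def paired_t(env, fast, constant=0.5):
    """Paired t statistic and degrees of freedom on log10(x + constant) values."""
    log_diffs = [
        math.log10(e + constant) - math.log10(f + constant)
        for e, f in zip(env, fast)
    ]
    n = len(log_diffs)
    t_stat = statistics.mean(log_diffs) / (statistics.stdev(log_diffs) / math.sqrt(n))
    return t_stat, n - 1


# Site E1, spring replicates S1-S4 and T1-T5 (copies/reaction).
fast_e1 = [5.80, 10.55, 10.03, 14.59, 13.62, 11.84, 9.95, 9.08, 9.65]
env_e1 = [8.51, 9.54, 11.25, 25.38, 20.92, 18.78, 19.81, 18.21, 17.50]

means, diffs, bias, (lower, upper) = bland_altman(env_e1, fast_e1)
t_stat, df = paired_t(env_e1, fast_e1)

print(f"mean difference = {bias:.2f} copies/reaction")
print(f"limits of agreement = {lower:.2f} to {upper:.2f}")
print(f"paired t = {t_stat:.2f} on {df} degrees of freedom")
```

The sign of both the mean difference and the t statistic depends only on which mix is subtracted from the other, so the order should be reported alongside the values.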
Results
The mean DNA copies/reaction for samples assayed using the environmental and
fast mixes were significantly different based on a combination of a paired t-test and a
Bland-Altman plot. A paired t-test (n = 393 for both) showed that the log means of the
two mixes were significantly different from each other, indicating that the data could not
be pooled (t = -6.5, p < 0.05). Assumptions of normality and homogeneity of
variance (F392 = 1.04, p > 0.05) were met by adding a constant of 0.5 to copy number
values and applying a log10 transformation. For values between zero and one
copy/reaction, there appeared to be no difference between mixes because values within
this range fell on the y = 0 line. Above one copy/reaction, however, an increase in the
mean of the fast and environmental mixes was accompanied by an increase in the
difference between mixes. Although the points on the graph fall both above and below
y = 0, the majority of y-values are greater than zero, indicating a bias towards higher
values for the environmental mix. This is supported by an overall mean difference
between the two mixes of y = 0.88 DNA copies/reaction, with limits of agreement of
y = -10.05 and y = 11.80 (Figure A4-1). Despite the environmental mix being more
sensitive at lower copy numbers, the fast mix results were used in the analysis because
fast mix qPCRs were run for all samples.
The IPC assay for inhibition could only be run for the environmental mix, and
results indicated that, for the most part, environmental water samples did not contain
inhibitors, as illustrated by the similar spread in values of the control and
environmental samples (Figure A4-2). Environmental samples had a median IPC Ct
value of 28.87 (n = 392, x̅ = 28.60, s = 0.81), while the control samples had a
median IPC Ct value of 28.84 (n = 139, x̅ = 28.73, s = 1.48). Of the 139 controls,
11 samples did not have a Ct value reading, while two environmental samples
(L5-S1-Fall and H2-T4) did not have a Ct value reading, indicating that these samples
were potentially inhibited. Further tests were not done on these samples due to time
constraints.
Reference
Bland JM, Altman DG (2003) Applying the right statistics: analyses of measurement
studies. Ultrasound in Obstetrics & Gynecology, 22, 85-93.
Figure A4-1: Bland-Altman analysis of method comparison between TaqMan Fast and Environmental qPCR mastermixes (Applied
Biosystems Inc.), with the x-axis showing the average copy number of a particular sample using the fast and environmental mix, and
the y-axis showing the difference in copy number between the mixes using that same sample.
[Figure A4-1 plot annotations: upper limit of agreement (y = 11.80); mean difference of mixes (y = 0.88); lower limit of agreement (y = -10.05).]
Figure A4-2: Boxplot of threshold cycle (Ct) values for the Internal Positive Control
(IPC) during the environmental mix qPCR assay. All negative controls are
represented by the “controls” boxplot on the left, while all eDNA water samples are
represented by the “environmental” boxplot on the right.
Appendix 2.5: Separating the signal from the noise: using receiver operator
characteristics to optimize sensitivity and specificity of environmental DNA detections
Introduction
A successful monitoring program is one that is able to detect change within a
system soon after it takes place (Lindenmayer et al. 2013). DNA-based approaches are
being utilized for monitoring because of their potential for higher specificity and
sensitivity, non-intrusiveness, as well as their reduced costs (Darling and Mahon 2011).
In particular, environmental DNA (eDNA) detection, which refers to detecting target
DNA from a water sample, is more commonly being applied as a monitoring tool for both
invasive and endangered species (Ficetola et al. 2008, Wilcox et al. 2013). False positive
and false negative errors are not uncommon in species monitoring programs. A ‘false
positive’ refers to a species being reported as present at a particular site when it is
actually absent, while a ‘false negative’ refers to a species being reported as absent at a
particular site when it is actually present (i.e., it goes undetected). Despite the growing
number of studies using real-time PCR (qPCR) methodology to determine species
presence, the accuracy of this platform in the eDNA field has been largely under-evaluated.
A Receiver Operator Characteristic (ROC) analysis is a statistical method that can be
used to measure the sensitivity and specificity of a test at varying thresholds (Metz 1978).
The sensitivity of a test is defined as True Positive (TP) / [True Positive (TP) + False
Negative (FN)], and the test’s specificity as True Negative (TN) / [True Negative (TN) +
False Positive (FP)], at varying pre-defined thresholds (Figure A5-1; Metz 1978). First used
in World War Two to assess the ability of radar operators to distinguish “noise” from
“true signal” (Fan et al. 2006), it is now more commonly used in the medical literature to
evaluate the efficacy of diagnostic tests (Kumar and Indrayan 2011). ROC has
previously been used to evaluate the efficacy of qPCR (Nutz et al. 2011). However, ROC has
yet to be used to design eDNA monitoring programs and compare the effect of different
DNA copy thresholds on detection limits.
Methods
All eDNA samples were classified as either “control” or “detection”, in order for
ROC analysis to proceed. The “control” group consisted of filter, DNA extraction, field
and lake negative controls; these are the samples that are theoretically free of Redside
Dace DNA (identified as the “negative” group). The “detection” group consisted of any
environmental samples that had a qPCR output of greater than zero, indicating Redside
Dace DNA presence (identified as the “positive” group). The ROC analysis arranges all
datapoints in numerical order, and goes through every point (“positive” and “negative”),
and sets that particular value as the threshold. Each datapoint is classified as one of four
options: true positive (TP), true negative (TN), false positive (FP), and false negative
(FN) at each threshold. The TP is any “positive” that has a value above the threshold,
while a TN is any “negative” value that falls below the threshold. The FP is any
“negative” value that falls above the threshold, while the FN is any “positive” value that
falls below the threshold. This is calculated for all “positive” and “negative” cases.
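The classification step described above can be sketched in a few lines of Python. This is a minimal illustration rather than the procedure run in pROC or MedCalc: the function name is hypothetical, a value equal to the threshold is assumed to count as a detection, the “positive” values are the spring replicates for site DU3 from Appendix 2.2a, and the “negative” control values are invented for the example because the raw control readings are not tabulated here.

```python
from collections import Counter


def classify(positives, negatives, threshold):
    """Count TP, FN, FP and TN for one candidate threshold (copies/reaction)."""
    counts = Counter({"TP": 0, "FN": 0, "FP": 0, "TN": 0})
    for value in positives:
        # Assumption: a value at or above the threshold is treated as a detection.
        counts["TP" if value >= threshold else "FN"] += 1
    for value in negatives:
        counts["FP" if value >= threshold else "TN"] += 1
    return counts


# "Positive" group: site DU3 spring replicates (Appendix 2.2a).
detections = [0.51, 2.47, 1.65, 1.127, 2.35, 3.86, 4.31, 2.77, 0.61]
# "Negative" group: hypothetical control readings, for illustration only.
controls = [0.0, 0.0, 0.0, 0.0, 0.4, 1.2]

at_one = classify(detections, controls, threshold=1)
at_three = classify(detections, controls, threshold=3)
print("threshold 1:", dict(at_one))
print("threshold 3:", dict(at_three))
```

Repeating the call for every observed value, rather than for two hand-picked thresholds, gives the full set of counts from which the ROC curve is built.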
To measure the accuracy of the qPCR assay at each threshold, sensitivity and
specificity values are calculated. Sensitivity is measured as TP/(TP+FN), which is the
proportion of “positives” that are recognized as detections at a particular threshold.
Specificity, on the other hand, is measured as TN/(TN+FP), and is the proportion of
“negatives” that are recognized as non-detections at a particular threshold. Once these
values are calculated for all data points, a curve of sensitivity versus specificity can be
generated. Overall, an increase in threshold results in lower sensitivity and higher
specificity, while a lower threshold results in higher sensitivity and lower specificity.
The area under the curve (AUC) can then be calculated from all the data points to
determine the probability that the qPCR assay will return a greater value for a randomly
chosen Redside Dace water sample (in this case) than for a randomly chosen negative
control sample. An AUC value of 1.0 indicates that the test is able to perfectly separate
“positive” from “negative”, while a value of 0.5 indicates that the assay performs no
better than chance at distinguishing a “positive” test result from a “negative” test result
(Figure A5-1).
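The relationship between threshold, sensitivity, specificity and AUC can be illustrated with a short Python sketch. It is not a substitute for the pROC and MedCalc calculations; the function names are hypothetical, values equal to the threshold are assumed to count as detections, and the AUC is computed directly from its probabilistic interpretation (the fraction of positive-negative pairs in which the positive value is larger, with ties counted as one half). The “positive” values are the spring replicates for site DU3 from Appendix 2.2a, and the “negative” control values are invented for the example.

```python
def sensitivity_specificity(positives, negatives, threshold):
    """Sensitivity TP/(TP+FN) and specificity TN/(TN+FP) at one threshold."""
    tp = sum(value >= threshold for value in positives)
    tn = sum(value < threshold for value in negatives)
    return tp / len(positives), tn / len(negatives)


def auc(positives, negatives):
    """Probability that a random positive exceeds a random negative (ties count 0.5)."""
    wins = sum(
        (p > n) + 0.5 * (p == n)
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# "Positive" group: site DU3 spring replicates (Appendix 2.2a).
detections = [0.51, 2.47, 1.65, 1.127, 2.35, 3.86, 4.31, 2.77, 0.61]
# "Negative" group: hypothetical control readings, for illustration only.
controls = [0.0, 0.0, 0.0, 0.0, 0.4, 1.2]

# Sweep the candidate thresholds of 1 to 10 copies/reaction.
sweep = {th: sensitivity_specificity(detections, controls, th) for th in range(1, 11)}
for th, (sens, spec) in sweep.items():
    print(f"threshold {th:>2}: sensitivity = {sens:.2f}, specificity = {spec:.2f}")

area = auc(detections, controls)
print(f"AUC = {area:.3f}")
```

Because the sweep only ever moves values from the detected side to the non-detected side as the threshold rises, sensitivity can only fall and specificity can only rise along it.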
Setting the threshold is a trade-off between sensitivity and specificity. Values
between one and ten copies/reaction were chosen as candidate thresholds,
reflecting the higher variance observed within this range in the standards run with
the fast mastermix. The ROC curves and AUC values for the fast and
environmental mixes were generated in R using the pROC package, and sensitivities and
specificities were calculated using MedCalc (http://www.medcalc.org/). Additionally, the
approach taken by Kumar and Indrayan (2011) was taken into consideration when
identifying the optimal threshold.
Results
At one copy/reaction, sensitivity was highest (60.2%), while specificity was
lowest (98.5%). The rate of true positives was highest at one copy/reaction, with a value of
42.8% (Table A5-1). The inverse relationship was observed when a higher threshold was
set; higher specificity was obtained at the cost of lower sensitivity. For example, at 10
copies/reaction, sensitivity was lowest (18.7%) and specificity was highest (100%). While
there were no false positives detected at the ten copy threshold, the rate of true positives
dropped from 42.8% at 1 copy/reaction to 13.2% at 10 copies/reaction (Table A5-1). The
AUC for the qPCR assay was 0.89 for the fast mix and 0.84 for the environmental mix
(Figure A5-1).
Based on the highest sensitivity:specificity ratio (Kumar and Indrayan 2011), the
optimal threshold was 0.5 copies/reaction. This threshold was not used in my study
because 0.5 copies/reaction is theoretically impossible. Although 1 copy/reaction had
the highest sensitivity:specificity ratio of the remaining candidate thresholds, a threshold
of 3 copies/reaction was selected. A more conservative threshold was used because of the
inconsistencies and failures associated with amplifying 1 copy/reaction. This decision is
supported by four lines of evidence: (i) the standards generated for each qPCR assay were
quantified using DNA copies from 1 × 10^6 down to 1 copy/reaction, and when generating
the standard curve, some of the data points at 1 copy/reaction had to be removed as
outliers (Figure 2.6); (ii) there was a failure rate of 34% when amplifying DNA at 1
copy/reaction across all standards; (iii) in the experiment in which the eDNA standards
were set as unknowns to assess how accurately qPCR could identify copy numbers, two of
the seven replicates did not amplify at 1 copy/reaction, despite the DNA at 1
copy/reaction being taken from the same working stock solution; and (iv) the MIQE
guidelines suggest that 3 copies/reaction is the most sensitive limit of detection that is
theoretically possible. A threshold of 1 copy/reaction would therefore leave no room for
error and would assume ideal PCR conditions, an assumption that is difficult to
meet given the stochastic nature of qPCR (Bustin et al. 2009). From a management
perspective, a threshold of 3 copies/reaction would (a) still be more sensitive than
electrofishing, and (b) reduce the chance of false positives relative to a threshold of 1
copy/reaction.
Discussion
Contamination has been identified as a major issue for the reliable eDNA
detection of target species, because of its ability to inflate false positive rates (Darling and
Mahon 2011, Thomsen and Willerslev 2014). Despite this, many studies have reported
that contamination was not evident in their study (Turner et al. 2014, Laramie et al. 2015,
Spear et al. 2015), but it is unclear if this means (i) all negative controls had a qPCR value
of 0 copies/reaction, or (ii) if there was background reading present but the values fell
below the threshold and were therefore dismissed. My study indicated that positive
readings in the negative controls are not frequent, but do occur due to the stochastic
nature of PCR. This poses a problem for data analysis because of the difficulty of
differentiating between a “true” positive and a “false” positive among low-level detections.
The use of a Receiver Operator Characteristics analysis in future studies is therefore
encouraged in order to quantify error rates associated with setting different thresholds.
Setting an appropriate threshold is a critical first step for monitoring surveys
because it dictates which sites are considered “positive” versus “negative”, and therefore
impacts downstream data analysis. In spite of the growing number of studies and their
focus on answering more complex ecological questions in the eDNA literature, very
little attention has been given to detection thresholds and their impacts on false positive
error rates (Ficetola et al. 2014, Schmidt et al. 2013). At lower levels of DNA within a
sample, quantification error exists, which could negatively influence monitoring
programs (Klymus et al. 2014). Furthermore, among the studies that have discussed
detection limits, there is no accepted standard threshold for qPCR. For example, Wilcox
et al. (2013) set a threshold of 0.5 target copies/reaction, Turner et al. (2014) set 30
copies/reaction as their 95% limit of detection (LOD), and the LOD for Eichmiller et al.
(2014) was 50 copies/reaction. This study attempts to address this inconsistency in the
literature, and is the first to use ROC analysis to examine the implications of
setting various thresholds on error rates.
Low error rates are important for implementing new technology into a monitoring
program. The false negative error rate calculated in this study using the ROC framework
was approximately 40%, higher than the rate of approximately 8.2% calculated by
Laramie et al. (2015). Laramie et al. (2015) determined false negative rates by
calculating (number of replicates where no Chinook eDNA was detected) / (number of
replicate sites where eDNA was confirmed to be present). The limitation of this approach
is that it does not account for low-level detections that fall very close to the threshold set
for all replicates, as this was only done for samples where there was at least one positive.
A second study derived a false negative eDNA rate of 8.7%, estimated as the proportion
of eDNA samples that tested negative for the target species out of the total samples
collected at sites where newts are present in large numbers (Biggs et al. 2015). A
limitation of this approach is that it only captures the false negative error associated with
field sampling, that is, the potential to fail to collect eDNA in the water bottle given that
the species is present at the site. The estimate produced in this study differs from these
two because a more holistic approach was taken, by calculating the proportion of
environmental samples that had a qPCR output greater than 0 copies/reaction but below
3 copies/reaction.
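The false negative calculation described above can be made concrete with a small Python sketch. It assumes that the denominator is every environmental sample with a qPCR output above zero (the “detection” group), which is an interpretation rather than a stated formula; the function name is hypothetical, and the values are the spring replicates for a single site (GC1, Appendix 2.2a), so the result illustrates the arithmetic only and is not the study-wide estimate.

```python
def false_negative_rate(copies, threshold=3.0):
    """Share of non-zero qPCR outputs that fall below the detection threshold."""
    # Assumption: the denominator is all samples with an output above zero.
    detections = [value for value in copies if value > 0]
    if not detections:
        raise ValueError("no samples with a qPCR output above zero")
    missed = [value for value in detections if value < threshold]
    return len(missed) / len(detections)


# Site GC1, spring replicates S1-S4 and T1-T5 (copies/reaction).
gc1_spring = [2.15, 4.12, 6.10, 3.746, 4.64, 7.52, 0.39, 3.02, 1.67]

rate = false_negative_rate(gc1_spring)
print(f"false negative rate at 3 copies/reaction = {rate:.1%}")
```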
This study indicated that electrofishing was less sensitive than eDNA for detecting
Redside Dace presence at a lower threshold (Th=1), while electrofishing was more
sensitive than eDNA monitoring when a higher threshold was applied (Th=10). At a
higher eDNA threshold, one can be more confident that the positive sites identified via
eDNA methodology are “true detections,” and this was demonstrated by the high
agreement of sites that tested positive using both methods (Figure A5-1). At the other end
of the spectrum, with a threshold of 1 copy/reaction, eDNA produced detections at all
sites that tested positive using electrofishing, as well as at additional sites where
traditional methods did not detect the species (Figure A5-3). The important consideration
here is whether these additional sites actually contain Redside Dace, or whether setting a
lower threshold has increased the false positive error rate (Type I error) (Burns and
Valdivia 2007). The limitation of setting a higher threshold is that we can be less certain
that we have identified all of the sites containing Redside Dace using eDNA (i.e.,
increasing our false negative rate; Type II error) (Burns and Valdivia 2007); however, this
comes with the advantage of being able to invest time and resources into populations that
are likely to contain the target species. Additionally, extra sampling effort could be
invested in sites with eDNA detections that fall below the copy threshold.
References
Biggs J, Ewald N, Valentini A, Gaboriaud C, et al. (2015) Using eDNA to develop a
national citizen science-based monitoring programme for the great crested newt (Triturus
cristatus). Biological Conservation, 183, 19–28.
Burns M, Valdivia H (2007) Modelling the limit of detection in real-time quantitative
PCR. European Food Research and Technology, 226, 1513-1524.
Bustin SA, Benes V, Garson JA, Hellemans J, et al. (2009) The MIQE guidelines:
minimum information for publication of quantitative real-time PCR experiments. Clinical
Chemistry, 55, 611-622.
Darling JA, Mahon AR (2011) From molecules to management: adopting DNA-based
methods for monitoring biological invasions in aquatic environments. Environmental
Research, 111, 978–988.
Eichmiller JJ, Bajer PG, Sorensen PW (2014) The relationship between the distribution of
common carp and their environmental DNA in a small lake. PLoS ONE, 9, e112611.
Fan J, Upadhye S, Worster A (2006) Understanding receiver operator characteristics
(ROC) curves. Canadian Journal of Emergency Medicine, 8, 19-20.
Ficetola GF, Miaud C, Pompanon F, Taberlet P (2008) Species detection using
environmental DNA from water samples. Biology Letters, 4, 423–425.
Ficetola GF, Pansu J, Bonin A, Coissac E, et al. (2014) Replication levels, false
presences, and the estimation of presence / absence from eDNA metabarcoding data.
Molecular Ecology Resources, 15, 543–556.
Kumar R, Indrayan A (2011) Receiver Operator Characteristic (ROC) curve for medical
researchers. Indian Pediatrics, 48, 277-287.
Klymus KE, Richter CA, Chapman DC, Paukert C (2014) Quantification of eDNA
shedding rates from invasive bighead carp Hypophthalmichthys nobilis and silver carp
Hypophthalmichthys molitrix. Biological Conservation, 183, 77-84.
Laramie MB, Pilliod DS, Goldberg CS (2015) Characterizing the distribution of an
endangered salmonid using environmental DNA analysis. Biological Conservation, 183,
29–37.
Lindenmayer D, Piggott M, Wintle B (2013) Counting the books while the library burns:
why conservation monitoring programs need a plan for action. Frontiers in Ecology and
the Environment, 11, 549-555.
Metz CE (1978) Basic principles of ROC analysis. Seminars in Nuclear Medicine, 4, 283-
298.
Nutz S, Döll K, Karlovsky P (2011) Determination of the LOQ in real-time PCR by
receiver operating characteristic curve analysis: application to qPCR assays for Fusarium
verticillioides and F. proliferatum. Analytical and Bioanalytical Chemistry, 401, 717-726.
Schmidt BR, Kery M, Ursenbacher S, Hyman OJ, Collins JP (2013) Site occupancy
models in the analysis of environmental DNA presence/absence surveys: a case study of
an emerging amphibian pathogen. Methods in Ecology and Evolution, 4, 646–653.
Spear SF, Groves JD, Williams LA, Waits LP (2015) Using environmental DNA
methods to improve detectability in a hellbender (Cryptobranchus alleganiensis)
monitoring program. Biological Conservation, 183, 38–45.
Thomsen P, Willerslev E (2014) Environmental DNA – An emerging tool in conservation
for monitoring past and present biodiversity. Biological Conservation, 183, 4-18.
Turner CR, Uy KL, Everhart RC (2014) Fish environmental DNA is more concentrated in
aquatic sediments than surface water. Biological Conservation, 183, 93-102.
Wilcox TM, McKelvey KS, Jane SF, Lowe WH, Whiteley AR, Schwartz MK (2013)
Robust detection of rare species using environmental DNA: the importance of primer
specificity. PLoS ONE, 8, e59520.
Table A5-1: Comparison of fast and environmental TaqMan® mastermixes (Applied Biosystems Inc.) for True Positive (TP), True Negative (TN),
False Positive (FP), False Negative (FN), Sensitivity (Sn) and Specificity (Sp) rates at thresholds of 1, 3, 5, and 10 target eDNA copies/reaction.
Threshold            True Negative   True Positive   False Negative   False Positive   Sensitivity   Specificity
(copies/reaction)    (TN)            (TP)            (FN)             (FP)             (Sn)          (Sp)
1                    28.5% (254)     42.8% (381)     28.2% (251)      0.5% (4)         60.28%        98.45%
3                    28.9% (257)     29.2% (260)     41.8% (372)      0.1% (1)         41.40%        99.61%
5                    29% (258)       22.3% (199)     48.7% (433)      0% (0)           31.90%        100%
10                   29% (258)       13.2% (118)     57.8% (514)      0% (0)           18.67%        100%
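The relationship between the tabulated counts and the sensitivity and specificity rates can be sketched as follows. This is a minimal illustration rather than the procedure used to build the table: it assumes that the 632 eDNA samples with a qPCR output of > 0 copies/reaction and the 258 negative control samples reported for the TaqMan Fast mastermix (Figure A5-2) form the positive and negative reference classes, and the function name `rates` is hypothetical. Only the true positive and true negative counts are taken from Table A5-1; the misclassified counts are derived from the class totals.

```python
# Reference class sizes, taken from the Figure A5-2 caption (assumed to apply here).
N_EDNA_SAMPLES = 632       # eDNA samples with > 0 copies/reaction
N_NEGATIVE_CONTROLS = 258  # negative control samples

# threshold (copies/reaction) -> (true positives, true negatives), from Table A5-1
COUNTS = {1: (381, 254), 3: (260, 257), 5: (199, 258), 10: (118, 258)}


def rates(tp, tn, n_pos=N_EDNA_SAMPLES, n_neg=N_NEGATIVE_CONTROLS):
    """Return (sensitivity, specificity, false negatives, false positives)."""
    fn = n_pos - tp  # eDNA samples that fall below the threshold
    fp = n_neg - tn  # negative controls that reach the threshold
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity, fn, fp


if __name__ == "__main__":
    for threshold, (tp, tn) in COUNTS.items():
        sn, sp, fn, fp = rates(tp, tn)
        print(f"Th={threshold:>2}: Sn={sn:.2%}  Sp={sp:.2%}  FN={fn}  FP={fp}")
```

Raising the threshold moves samples from the true positive to the false negative class, which lowers sensitivity while specificity approaches 100%.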
Figure A5-1: Schematic representation of using Receiver Operator Characteristic (ROC)
to discriminate between true and false positive and negative PCR results, as well as to
evaluate test sensitivity and specificity. TN= True Negative; FN= False Negative; FP=
False Positive; TP= True Positive.
Figure A5-2: Receiver Operator Characteristic (ROC) curves of specificity (x-axis) plotted
against sensitivity (y-axis). (Left) TaqMan Fast qPCR mastermix (Applied Biosystems, Inc.),
AUC = 0.89; a total of 632 eDNA samples with a qPCR output of > 0
copies/reaction and 258 negative control samples were used to generate the curve. (Right)
Environmental Mastermix 2.0, AUC = 0.84; a total of 388 eDNA samples and 170
negative controls were used. A curve that falls below the diagonal line would indicate an inaccurate
assay.
Figure A5-3: Venn diagrams indicating the number of sites (n=29) with Redside Dace
detections in Fall 2012 using electrofishing only (blue), eDNA only (green), and
both electrofishing and eDNA (overlap of green and blue). Redside Dace were
detected at 14 of the 29 locations using single-pass electrofishing (Reid et al. 2008);
eDNA detections are shown based on qPCR thresholds of 1, 3, 5, and 10 DNA
copies/reaction.
Appendix 2.6: Comparison of electrofishing and eDNA detections during Fall and Spring
sampling seasons (total of 29 sites).
                                                                  Fall   Spring
Total eDNA detections                                             18     16
eDNA positive detections unique to season under study             5      3
Positive electrofishing sites                                     14     n/a
Positive electrofishing sites with no positive eDNA detections    3      n/a
Positive eDNA sites with no positive electrofishing               7      n/a
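The Fall counts above partition the 29 sites into four groups (detected by both methods, by eDNA only, by electrofishing only, or by neither). The short sketch below works through that arithmetic using only the tabulated values; the function name `partition_sites` is hypothetical, and the "neither" count is a derived figure rather than one reported in the table.

```python
def partition_sites(total_sites, edna_positive, electro_positive,
                    edna_only, electro_only):
    """Split sites into both / eDNA only / electrofishing only / neither."""
    both_from_edna = edna_positive - edna_only
    both_from_electro = electro_positive - electro_only
    if both_from_edna != both_from_electro:
        raise ValueError("counts are not mutually consistent")
    either = both_from_edna + edna_only + electro_only
    return {
        "both": both_from_edna,
        "edna_only": edna_only,
        "electrofishing_only": electro_only,
        "neither": total_sites - either,
    }


if __name__ == "__main__":
    # Fall values from Appendix 2.6
    print(partition_sites(total_sites=29, edna_positive=18,
                          electro_positive=14, edna_only=7, electro_only=3))
```

The consistency check confirms that the overlap implied by the eDNA counts (18 - 7) equals the overlap implied by the electrofishing counts (14 - 3).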
Appendix 2.7: Estimates of Redside Dace detection probability and occupancy, AICc, ΔAICc, AIC weights, number of parameters,
and -2log values from models for spring and fall field seasons (horizontal headings), at temporal sampling (R) of 3, 4, and 5 replicates.
*QAICc used for model selection due to overdispersion.
Model               Occupancy ψ      Detection probability p   AICc      ΔAICc   AIC       No. of       -2log
                    (± SE)           (± SE)                                      weights   parameters   likelihood

Fall
Replicates=3 (n = 29, χ2 = 4.39, p = 0.59, ĉ = 0.75)
Model average (± SE): 0.73 (0.096); No. of replicates required for reliable detection: 3
ψ(.)p(.)            0.45 (0.094)     0.76 (0.07)               86.14     0       0.4       2            81.68
ψ(.)p(temp)         0.49 (0.10)      0.69 (0.093)              86.36     0.22    0.36      3            79.4
ψ(.)p(flow)         0.45 (0.094)     0.77 (0.11)               88.49     2.35    0.12      3            81.53
ψ(.)p(temp*flow)    0.49 (0.10)      0.71 (0.11)               88.52     2.38    0.12      4            78.85

Replicates=4 (n = 29, χ2 = 12.69, p = 0.51, ĉ = 0.91)
Model average (± SE): 0.72 (0.079); No. of replicates required for reliable detection: 3
ψ(.)p(temp)         0.48 (0.10)      0.70 (0.082)              99.17     0       0.52      3            92.21
ψ(.)p(.)            0.44 (0.092)     0.77 (0.060)              100.46    1.29    0.27      2            96
ψ(.)p(temp+flow)    0.48 (0.1014)    0.70 (0.10)               101.86    2.69    0.13      4            92.19
ψ(.)p(flow)         0.45 (0.092)     0.76 (0.092)              102.73    3.56    0.09      3            95.77

Replicates=5 (n = 29, χ2 = 41.51, p = 0.08, ĉ = 1.39)*
Model average (± SE): 0.64 (0.075); No. of replicates required for reliable detection: 3
ψ(.)p(.)            0.52 (0.093)     0.65 (0.056)              105.39    0       0.44      3            136.82
ψ(.)p(flow)         0.53 (0.096)     0.62 (0.085)              105.39    0       0.29      4            134.26
ψ(.)p(temp)         0.52 (0.093)     0.66 (0.077)              107.80    2.14    0.13      4            136.4
ψ(.)p(temp+flow)    0.53 (0.096)     0.63 (0.10)               109.03    3.64    0.07      5            134.02

Spring
Replicates=3 (n = 29, χ2 = 1.96, p = 0.96, ĉ = 0.29)
Model average (± SE): 0.84 (0.083); No. of replicates required for reliable detection: 2
ψ(.)p(temp)         0.5 (0.0929)     0.81 (0.087)              74.22     0       0.46      3            67.26
ψ(.)p(temp*flow)    0.56 (0.099)     0.84 (0.076)              75.17     0.95    0.28      4            65.5
ψ(.)p(.)            0.52 (0.093)     0.89 (0.050)              75.98     1.76    0.19      2            71.52
ψ(.)p(flow)         0.52 (0.093)     0.90 (0.06)               78.04     3.82    0.07      3            71.08

Replicates=4 (n = 29, χ2 = 5.40, p = 0.99, ĉ = 0.37)
Model average (± SE): 0.79 (0.081); No. of replicates required for reliable detection: 2
ψ(.)p(temp+flow)    0.58 (0.099)     0.78 (0.088)              90        0       0.53      4            80.33
ψ(.)p(temp)         0.59 (0.09)      0.79 (0.069)              90.43     0.43    0.43      3            83.47
ψ(.)p(.)            0.55 (0.092)     0.86 (0.044)              96.32     6.32    0.02      2            91.86
ψ(.)p(flow)         0.55 (0.093)     0.85 (0.059)              96.55     6.55    0.02      3            89.59

Replicates=5 (n = 29, χ2 = 21.71, p = 0.84, ĉ = 0.73)
Model average (± SE): 0.74 (0.081); No. of replicates required for reliable detection: 3
ψ(.)p(temp+flow)    0.61 (0.097)     0.73 (0.081)              121.87    0       0.82      4            112.2
ψ(.)p(flow)         0.58 (0.092)     0.79 (0.058)              126.39    4.52    0.09      3            119.43
ψ(.)p(temp)         0.59 (0.093)     0.76 (0.077)              126.88    5.01    0.07      3            119.92
ψ(.)p(.)            0.58 (0.092)     0.80 (0.044)              128.85    6.98    0.03      2            124.39
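The derived columns in this table can be illustrated with a short sketch. It uses the standard Akaike weight formula, treats the model average as the weight-averaged detection probability p, and assumes that "reliable detection" means a cumulative detection probability of at least 0.95 across replicates. That 0.95 cut-off is an assumption made here for illustration (it is not stated in this appendix), and the function names are hypothetical.

```python
import math


def akaike_weights(delta_aicc):
    """Convert ΔAICc values into Akaike weights that sum to 1."""
    relative_likelihoods = [math.exp(-0.5 * d) for d in delta_aicc]
    total = sum(relative_likelihoods)
    return [r / total for r in relative_likelihoods]


def model_average(estimates, weights):
    """Weighted mean of per-model estimates."""
    return sum(w * e for w, e in zip(weights, estimates))


def replicates_needed(p, confidence=0.95):
    """Smallest n such that 1 - (1 - p)**n >= confidence (assumed criterion)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p))


if __name__ == "__main__":
    # Fall, 3 temporal replicates: ΔAICc and p for the four candidate models
    delta = [0.0, 0.22, 2.35, 2.38]
    p_hat = [0.76, 0.69, 0.77, 0.71]
    weights = akaike_weights(delta)
    p_avg = model_average(p_hat, weights)
    print([round(w, 2) for w in weights], round(p_avg, 2), replicates_needed(p_avg))
```

For the Fall data with three replicates, this gives weights of 0.4, 0.36, 0.12 and 0.12, a model-averaged p of 0.73, and three replicates, in agreement with the tabulated values. Under the same assumed criterion, the other model averages in the table (0.72, 0.64, 0.84, 0.79 and 0.74) give 3, 3, 2, 2 and 3 replicates respectively.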
Appendix 2.8: Comparison of costs for eDNA versus electrofishing
Past studies have indicated that eDNA water collection yields higher detection and
entails lower costs than conventional sampling (Fukumoto et al. 2015, Goldberg et al.
2013), which is an important consideration when implementing monitoring programs. I estimated the
cost of using either method to sample Redside Dace sites. The cost comparison of eDNA
versus electrofishing depends on (i) how many eDNA water samples are taken
at a site, (ii) water sample turbidity (which determines filtration time), and (iii) how many qPCR
replicates are run. Based on my fall electrofishing data, 0.5 hours was spent monitoring each site
with two field technicians, excluding the time required to process and identify fishes.
At a cost of approximately $20/hour for field technician support, this comes to
a total of $20 per site (one hour of technician time in total). A conservative estimate of the cost of
processing one eDNA sample in the lab, run in triplicate, is approximately $35 (excluding the cost of
lab-tech hours). In terms of lab-tech hours, processing 90 samples would take approximately 7 hours to filter samples
(assuming that filtration takes no longer than 10-15 minutes per sample), 24 hours to do
DNA extractions (30 extractions in 8 hours), one hour to run a gel with the 96 samples,
and 1.5 hours to run each sample in triplicate. Therefore, on a per-sample basis, it
would take an average of 0.5 hours to process one sample, or a total of 1.5 hours to
process three samples (assuming triplicate water samples were taken at each site). This
cost does account for the costs of running Redside Dace standard controls, pre- and
post-filter controls, field blanks and extraction negatives. If temporal replicates were taken at
each site at 15-minute intervals, the time spent electrofishing and collecting the eDNA
samples would be equal, and eDNA water collection would incur more of an expense.
However, if spatial replicates were taken at each site, the field-tech time spent
collecting water samples at each site, and the time spent electrofishing, would equal the time
spent to run two samples in the lab (this does not take into account the $35/sample).
Additionally, other considerations that could influence initial
start-up costs for eDNA monitoring are (i) primer optimization, (ii) investment in lab
equipment for eDNA techniques, and (iii) the turbidity of samples, which influences filtration time.
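The per-site arithmetic above can be sketched as follows. This is an illustration under stated assumptions, not a full costing: the lab-tech hourly rate is not given in the text and is assumed here to equal the $20/hour field-technician rate, field time for collecting water samples is excluded, and the function names are hypothetical.

```python
FIELD_TECH_RATE = 20.0       # $/hour, from the text
ELECTROFISHING_HOURS = 0.5   # hours per site, from the text
FIELD_TECHS = 2              # technicians per site, from the text
LAB_COST_PER_SAMPLE = 35.0   # $ per eDNA sample run in triplicate, from the text
LAB_HOURS_PER_SAMPLE = 0.5   # lab-tech hours per sample, from the text
LAB_TECH_RATE = 20.0         # $/hour; ASSUMPTION, not stated in the text


def electrofishing_cost_per_site():
    """Field-technician cost of one electrofishing visit."""
    return ELECTROFISHING_HOURS * FIELD_TECHS * FIELD_TECH_RATE


def edna_lab_cost_per_site(n_samples=3, lab_tech_rate=LAB_TECH_RATE):
    """Lab cost (consumables plus lab-tech time) for the water samples from one site."""
    consumables = n_samples * LAB_COST_PER_SAMPLE
    labour = n_samples * LAB_HOURS_PER_SAMPLE * lab_tech_rate
    return consumables + labour


if __name__ == "__main__":
    print(f"Electrofishing: ${electrofishing_cost_per_site():.2f} per site")
    print(f"eDNA lab work:  ${edna_lab_cost_per_site():.2f} per site (3 samples)")
```

Passing `lab_tech_rate=0.0` isolates the $35/sample consumables component, which makes it easy to see how sensitive the comparison is to the assumed labour rate and to the number of water samples taken per site.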
References
Fukumoto S, Ushimaru A, Minamoto T (2015) A basin-scale application of
environmental DNA assessment for rare endemic species and closely related exotic
species in rivers: a case study of giant salamanders in Japan. Journal of Applied Ecology,
52, 358–365.
Goldberg CS, Sepulveda A, Ray A, Baumgardt J, Waits P (2013) Environmental DNA as
a new method for early detection of New Zealand mudsnails (Potamopyrgus
antipodarum). Freshwater Science, 32, 792–800.
Appendix 3.1: Primer sequences used for microsatellite DNA analysis, along with their GenBank accession numbers, repeat motifs,
size range (bp) and annealing temperatures.
Locus     GenBank     Primer Sequence (5’-3’)          Repeat Motif              Range (bp)   Annealing   Reference
          Accession                                                                           Temp (ºC)
RSD42A    GQ150754    F: AACTGCAGACAGGGATCTGG          (TC)14(AC)7               171-205      54          Pitcher et al. 2009
                      R: TATCTGTGCCTGCTGGTGAG
RSD2-58   GQ150756    F: TGAAATCAAAATGGTCAGTCCTT       (CA)13(TA)6               195-223      57          Pitcher et al. 2009
                      R: TGCGCTAAACGTCATCAGAG
RSD70     GQ150757    F: TGCAGTGGTTTGCAATCTAAG         (GT)14                    239-255      54          Pitcher et al. 2009
                      R: CCGACGACCCCTTTAAGAAT
RSD86     GQ150758    F: CACAAAAACGGGATGAATTG          (TG)20                    209-227      54          Pitcher et al. 2009
                      R: GCGAACTGCAGCACTTACAG
RSD2-91   GQ150759    F: ACAGCCACTATACCTGAAATCAA       (TCTA)21                  181-273      54          Pitcher et al. 2009
                      R: CGCAAATAAAGGTGACTTGAC
RSD142    GQ150760    F: CACCCTGCTGTTTCTGTTCA          (TATC)20                  191-311      54          Pitcher et al. 2009
                      R: ATTGCTTTCCCTGTGAATCG
RSD179    GQ150761    F: GCTAGTCAAACTGGTCTCTTTCC       (AT)2GTCT(GT)16           197-219      54          Pitcher et al. 2009
                      R: GGCTGCCAGCAAATATTAGAA
CA11      AF277582    F: TCCCTCACTGTGCCCTACA           (TAGA)7                   251-355      57          Dimsoski et al. 2000
                      R: GGCGTAGCAATCATTATACCT
CA12      AF277584    F: GTGAAGCATGGCATAGCACA          (TAGA)10(CAGA)4(TAGA)2    169-301      57          Dimsoski et al. 2000
                      R: CAGGAAAGTGCCAGCATACAC
Ppro118   AY254352    F: CCGGATGCACTGGTGGAGAAAA        (CTCA)2(CA)2              199-291      57          Bessert et al. 2003
                      R: CCAGCAATCATAGCAGGCAGGAAC
Appendix 3.2: Proportion of polymorphic loci across ten microsatellite primers for 28
Redside Dace populations.
Population %P
LCR 90
NFR 60
UNN 90
LRR 80
MIL 90
HAN 90
TTR 80
EFI 90
RED 90
BRU 100
GUL 80
EBC 100
SAU 70
STR 80
SMC 100
FOU 100
KET 90
HUM 90
WOO 100
DON 90
ROU 90
MIT 90
CAR 90
EBM 100
BHR 100
DOD 90
OST 70
COB 90
Appendix 3.3: List of 20 population-locus combinations that deviate from Hardy-Weinberg
equilibrium expectations.
Population: Locus:
EFI RSD142
LCR RSD179
HAN RSD86
MIL Ppro118
KET CA12
KET Ppro118
KET RSD2-91
KET RSD42A
TTR RSD70
MIT CA12
MIT RSD179
GUL CA11
GUL Ppro118
GUL RSD86
GUL RSD2-58
UNN Ppro118
WOO RSD70
EBM RSD42A
STR RSD2-58
LRR RSD2-91
Appendix 3.4: Total evidence haplotype numbers with corresponding ATPase 6 and 8
and cytochrome b haplotypes.
Total Evidence      ATPase 6 and 8      Cytochrome b
Haplotype Number    Haplotype Number    Haplotype Number
1                   23                  1
2                   3                   1
3                   3                   35
4                   15                  33
5                   15                  19
6                   15                  34
7                   9                   13
8                   3                   30
9                   3                   31
10                  21                  32
11                  3                   1
12                  1                   28
13                  1                   4
14                  3                   25
15                  1                   2
16                  19                  26
17                  1                   27
18                  1                   5
19                  1                   22
20                  18                  1
21                  3                   24
22                  1                   23
23                  12                  17
24                  1                   9
25                  17                  19
26                  15                  20
27                  15                  21
28                  16                  19
29                  14                  19
30                  12                  18
31                  13                  17
32                  11                  16
33                  10                  14
34                  10                  15
35                  1                   14
36                  1                   15
37                  8                   9
38                  7                   12
39                  6                   11
40                  1                   10
41                  5                   2
42                  1                   8
43                  4                   2
44                  1                   7
45                  2                   2
46                  3                   6
47                  1                   3