BASICS OF WET STATISTICS
SETAC Expert Advisory Panel
Performance Evaluation and Data Interpretation
GRAPH THE DATA
[Figure: response (% effect) versus concentration (% effluent), showing raw data and means]
ANALYZE DATA FOLLOWING EPA WET STATISTICAL FLOWCHARTS
• Hypothesis Tests
  – NOAEC (Acute)
  – NOEC (Chronic)
• Point Estimation
  – LC50 (Acute)
  – EC25 or IC25 (Chronic)
PURPOSE OF HYPOTHESIS TESTS AND BASIC CONSIDERATIONS
• Purpose - Determine if two things (responses) are different
• Relevance of initial (control) condition(s)
• Power of statistical test
[Figure: Effects associated with the NOEC in fathead minnow growth data; effect at NOEC versus test number]
EPA HYPOTHESIS TEST FLOWCHART (MULTI-CONC)
• Test assumptions of ANOVA
  – Transform data if necessary
  – Normally distributed data
    • Shapiro-Wilk’s Test
  – Variance is equal
    • Bartlett’s test
• Select appropriate test
  – Parametric Tests
    • Assumptions met
  – Non-Parametric Tests
    • Assumptions NOT met
MULTIPLE CONCENTRATION PARAMETRIC TESTS
• Dunnett’s Test
  – Equal number of replicates in each treatment
• Multiple t-tests with Bonferroni adjustment
  – Unequal number of replicates in each treatment
MULTIPLE CONCENTRATION NON-PARAMETRIC TESTS
• Steel’s Many-one Rank Test
  – Equal number of replicates in each treatment
• Wilcoxon Rank Sum
  – Unequal number of replicates in each treatment
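The test selection described on the parametric and non-parametric slides reduces to two yes/no checks. The sketch below is only an illustration of that decision; the function name and its boolean arguments are hypothetical, and it does not run the assumption tests (Shapiro-Wilk, Bartlett) themselves.

```python
def select_multi_conc_test(assumptions_met: bool, equal_replicates: bool) -> str:
    """Return the multiple-concentration test named on the slides.

    assumptions_met:  True if the data passed the normality and
                      equal-variance checks (after any transformation).
    equal_replicates: True if every treatment has the same number of replicates.
    """
    if assumptions_met:
        # Parametric branch
        if equal_replicates:
            return "Dunnett's test"
        return "Multiple t-tests with Bonferroni adjustment"
    # Non-parametric branch
    if equal_replicates:
        return "Steel's many-one rank test"
    return "Wilcoxon rank sum test"


if __name__ == "__main__":
    print(select_multi_conc_test(assumptions_met=True, equal_replicates=False))
```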
PASS/FAIL TESTS
• Control and critical concentration (IWC)
• Test assumptions
  – Transformations - Arc sine square root
  – Normality - Shapiro-Wilk’s test
  – Homogeneity - F-test
• Test for statistical difference
  – Normal/homogeneous - t-test
  – Non-normal - Wilcoxon rank sum test
  – Normal/heterogeneous - Modified t-test
PURPOSE OF POINT-ESTIMATION AND BASIC CONSIDERATIONS
Describe relationship between two parameters
Selection of a significant response
Elucidation of relationship
Confidence in relationship
EPA POINT-ESTIMATE METHOD SELECTION
• Binomial Data
  – Probit
  – Spearman-Karber
    • Untrimmed or trimmed
  – Graphical
• Continuous Data
  – ICp / Linear Interpolation
PROBIT ANALYSIS
• Binomial data only (two choices)
  – Dead or alive, normal/abnormal, etc.
• Normally distributed
• Adjusted for control mortality
  – Abbott’s correction
• At least two partial mortalities
• Sufficient fit
  – Chi-square test for heterogeneity
• Designed for LC50/EC50 and confidence intervals
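Abbott's correction for control mortality, mentioned above, can be written as a one-line formula: corrected = (treatment - control) / (1 - control), with responses expressed as proportions. The sketch below assumes that standard form; the function name and the example proportions are hypothetical.

```python
def abbott_corrected(p_treatment: float, p_control: float) -> float:
    """Adjust a treatment mortality proportion for control mortality.

    Both arguments are proportions dead (0 to 1). Values that fall
    below the control mortality are clipped to zero.
    """
    if not 0.0 <= p_control < 1.0:
        raise ValueError("control mortality must be in [0, 1)")
    corrected = (p_treatment - p_control) / (1.0 - p_control)
    return max(0.0, corrected)


if __name__ == "__main__":
    # Hypothetical example: 55 % mortality in a treatment, 10 % in the control.
    print(abbott_corrected(0.55, 0.10))
```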
SPEARMAN-KARBER
• Nonparametric model
• Monotonic concentration response
  – Smoothing
• Adjusted for control mortality
• Zero response in the lowest concentration
• 100% response in the highest concentration
• Calculates LC50/EC50
• Confidence interval calculation requires at least one partial response
TRIMMED SPEARMAN-KARBER
• Same basic procedure as Spearman-Karber
• Requires at least 50% mortality in one concentration
• The trimming procedure is employed when the zero and/or 100% response requirements of Spearman-Karber method are not met.
GRAPHICAL METHOD
• Specifics
  – Nonparametric procedure
  – Adjusted for control mortality
  – Monotonic concentration response
    • Smoothing
  – Linear interpolation of “all or nothing” response
  – Calculates LC50/EC50 - No CIs
INHIBITION CONCENTRATION (ICp)
• Specifics
  – Nonparametric procedure
  – Calculates any effect level
  – Monotonic concentration response
    • Smoothing
  – Random, independent, and representative data
  – Piecewise linear interpolation
  – Bootstrapped confidence intervals
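A minimal sketch of the smoothing and piecewise linear interpolation steps listed above, assuming equal replicate counts so that pooled means are simple averages. The function names and the example concentrations and means are hypothetical, and the bootstrapped confidence intervals are not reproduced here.

```python
def smooth_means(means):
    """Pool adjacent treatment means until they are non-increasing."""
    blocks = []  # each block is [sum_of_means, number_of_means]
    for m in means:
        blocks.append([m, 1])
        # Merge while the newest block's mean exceeds the block before it.
        while (len(blocks) > 1 and
               blocks[-1][0] / blocks[-1][1] > blocks[-2][0] / blocks[-2][1]):
            s, n = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += n
    smoothed = []
    for s, n in blocks:
        smoothed.extend([s / n] * n)
    return smoothed


def icp(concs, means, p=25.0):
    """Concentration giving a p % reduction from the smoothed control mean.

    concs[0] is the control (0 % effluent). Returns None when the
    target response is not bracketed by the tested concentrations.
    """
    m = smooth_means(means)
    target = m[0] * (1.0 - p / 100.0)
    for j in range(len(m) - 1):
        if m[j] > m[j + 1] and m[j] >= target >= m[j + 1]:
            # Linear interpolation between the two bracketing concentrations.
            return concs[j] + (target - m[j]) * (concs[j + 1] - concs[j]) / (m[j + 1] - m[j])
    return None


if __name__ == "__main__":
    concs = [0.0, 6.25, 12.5, 25.0, 50.0]        # hypothetical % effluent
    means = [100.0, 110.0, 80.0, 50.0, 20.0]     # hypothetical mean responses
    print(smooth_means(means))
    print(icp(concs, means, p=25.0))
```

In this example the control and first concentration are averaged before interpolation, which is the step that can shift the estimated ICp when the data are non-monotonic.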
SOFTWARE PROGRAMS
• Many software packages/programs are available
• DO NOT assume they follow the EPA recommended analysis
• DO verify the software by running example datasets from the methods manuals
DO THE RESULTS MAKE SENSE ???
[Figure: percent effect versus concentration (% effluent), showing raw data, means, probit fit, % MSD, and EC25]
TOXIC UNITS IN WET TESTS
• Goals
  1) Standardize the results of toxicity tests to simulate chemical-specific criteria.
  2) Create a reporting value which increases with sample toxicity.
DEFINITIONS OF TU VALUES
• Acute
  – TUa = 100/LC50
OR
• Chronic
  – TUc = 100/NOEC
  – where the NOEC is defined by hypothesis testing or the IC25
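The two definitions above share one form: 100 divided by an endpoint expressed in percent effluent. The sketch below illustrates that arithmetic; the function name and the example endpoint values are hypothetical.

```python
def toxic_units(endpoint_percent_effluent: float) -> float:
    """Toxic units = 100 / endpoint, with the endpoint in % effluent.

    Pass an LC50 for acute units (TUa) or a NOEC (or IC25) for
    chronic units (TUc).
    """
    if endpoint_percent_effluent <= 0:
        raise ValueError("endpoint must be a positive percentage")
    return 100.0 / endpoint_percent_effluent


if __name__ == "__main__":
    print(toxic_units(50.0))   # hypothetical LC50 of 50 % effluent -> TUa = 2.0
    print(toxic_units(25.0))   # hypothetical NOEC of 25 % effluent -> TUc = 4.0
```

A lower endpoint (a more toxic sample) gives a larger TU value, which is the second goal stated above.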
SUMMARY OF THE ANALYSIS OF WET DATA
• STEP 1: Graph The Data
• STEP 2: Analyze The Data By EPA Methods
• STEP 3: Do The Results Make Sense?
ANALYSIS OF MULTIPLE CONTROL TOXICITY TESTS
WHAT IS A CONTROL SAMPLE ?
• A treatment in a toxicity test that duplicates all the conditions of the exposure treatments but contains no test material. The control is used to determine the absence of toxicity of basic test conditions (e.g. health of test organisms, quality of dilution water). Rand and Petrocelli, 1985.
WHAT IS A REFERENCE SAMPLE?
• “A reference sample is the “control” by which to gauge the instream effects of a discharge at a particular site.” Grothe et al., 1996.
  – site-specific
  – ecoregional
WHEN ARE MULTIPLE CONTROLS USED?
• When manipulations are made to SOME of the test concentrations or treatments.
• To compare “standard” and “alternative” methods.
• When testing control and/or reference samples in which the quality is unknown.
• When a sample used for toxicity testing possesses physico-chemical properties significantly different from water in which surrogate test organisms were cultured.
• TIEs - Toxicity Identification Evaluations.
WHEN ARE MULTIPLE CONTROLS USED?
Example #1
• When manipulations are made to SOME of the test concentrations or treatments.
BRINE ADDITION IN MARINE TESTS
Concentration      Effluent Volume    Brine Volume   Seawater Volume   Salinity
                   (0 ppt)            (68 ppt)       (34 ppt)
Seawater Control   0 ml               0 ml           1000 ml           34 ppt
1.25 %             12.5 ml            0 ml           987.5 ml          34 ppt
2.5 %              25 ml              0 ml           975 ml            33 ppt
5 % (IWC)          50 ml              0 ml           950 ml            32 ppt
10 %               100 ml             100 ml         800 ml            34 ppt
20 %               200 ml             200 ml         600 ml            34 ppt
Brine Control      0 ml (+ 200 ml)    200 ml         600 ml            34 ppt
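The salinity column in the table above follows from a volume-weighted average of the three components. The sketch below assumes simple conservative mixing with the salinities given in the table header (effluent 0 ppt, brine 68 ppt, seawater 34 ppt); the function name is hypothetical.

```python
def mixed_salinity(effluent_ml: float, brine_ml: float, seawater_ml: float,
                   effluent_ppt: float = 0.0, brine_ppt: float = 68.0,
                   seawater_ppt: float = 34.0) -> float:
    """Volume-weighted salinity (ppt) of an effluent/brine/seawater mix."""
    total_ml = effluent_ml + brine_ml + seawater_ml
    if total_ml <= 0:
        raise ValueError("total volume must be positive")
    salt = (effluent_ml * effluent_ppt
            + brine_ml * brine_ppt
            + seawater_ml * seawater_ppt)
    return salt / total_ml


if __name__ == "__main__":
    # 5 % (IWC) row: no brine added, so salinity drops below 34 ppt.
    print(round(mixed_salinity(50, 0, 950), 1))
    # 10 % row: brine volume equal to effluent volume restores 34 ppt.
    print(round(mixed_salinity(100, 100, 800), 1))
```

Because the brine is twice the seawater salinity, adding a brine volume equal to the effluent volume brings the mix back to 34 ppt, which is why only the 10 % and 20 % treatments in the table are manipulated.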
ANALYSIS OF TWO-CONTROL TOXICITY TESTS WHEN SOME CONCENTRATIONS WERE MANIPULATED
[Flowchart]
• Both Controls Valid?
  – Yes: Control t-Test Non-Significant?
    • Yes: Pool Controls and Analyze All Data Using EPA Flowcharts
    • No: Analyze IWC and Like Treated Concs. and Control Using EPA Flowcharts
  – No: IWC Treated Control Valid?
    • Yes: Analyze IWC and Like Treated Concs. and Control Using EPA Flowcharts
    • No: Repeat Test
WHEN ARE MULTIPLE CONTROLS USED ?
Example #2
• To compare “standard” and “alternative” methods.
• To determine treatment effects.
EFFECT OF KELP STORAGE ON SENSITIVITY TO COPPER
[Figure: copper concentration (ppb) for fresh versus stored kelp, shown in five panels]
[Figure: change in EC values (stored - fresh; ppb Cu) by effect level (EC1, EC5, EC10, EC15, EC25); asterisks flag individual effect levels]
WHEN ARE MULTIPLE CONTROLS USED?
Example #3
• When testing control and/or reference samples in which the quality is unknown.
  – Use of a reference not previously tested (ambient).
  – Quality of reference may vary from season to season (ambient).
  – When the potential exists for a sample to be impacted or impaired.
EFFECT OF A NON-POINT DISCHARGE ON AN INSTREAM DILUTION WATER
[Figure: C. dubia control survival (percent) by test date (Apr-96, May-96, Jun-96, Aug-96, Dec-96); series: Lab Control, Upstream]
WHEN ARE MULTIPLE CONTROLS USED ?
Example #4
• When a sample used for toxicity testing possesses physico-chemical properties significantly different from water in which surrogate test organisms were cultured
  – As a natural phenomenon
  – Due to sample manipulation
WHEN ARE MULTIPLE CONTROLS USED ?
Example #5
• TIEs - Toxicity Identification Evaluations.
  – Methods require the use of multiple controls called “blanks” which are exact manipulations on the dilution water.
TAKE HOME POINTS
• Multiple negative controls are a good idea if:
  – Using a new reference or control sample.
  – Performing any sample manipulations.
  – Comparing “standard” vs. “alternative” methods. Multiple Positive Controls (e.g. Ref Tox tests) should be used in this situation.
  – Using multiple organisms with different sensitivities.
REFERENCES:
• Short-Term Methods for Estimating the Chronic Toxicity of Effluents and Receiving Water to Freshwater Organisms. EPA/600/4-91/002. July, 1994.
• Methods for Measuring the Acute Toxicity of Effluents and Receiving Waters to Freshwater and Marine Organisms. EPA/600/4-90/027F. August, 1993.
  – Have recommendations for multiple controls under certain conditions.
• Methods for Aquatic Toxicity Identification Evaluations. Phase I Toxicity Characterization Procedures. EPA/600/6-91/003. February, 1991.
  – Has recommendations for multiple controls (“blanks”).
• Whole Effluent Toxicity Testing: An Evaluation of Methods and Prediction of Receiving System Impacts. Grothe et al., 1996.
SUSPICIOUS DATA AND OUTLIER DETECTION
CONCERNS
• Outliers make interpretation of WET data difficult by
– Increasing the variability in test responses
– Biasing mean responses
IDENTIFYING OUTLIERS
• Graph raw data, means and residuals
[Figure: Raw Data and Means; proportion alive versus copper concentration (ppb)]
[Figure: Residuals; residual (predicted - observed) versus copper concentration (ppb)]
IDENTIFYING OUTLIERS
• Formal statistical test - Chauvenet’s Criterion
  – Using the previous mysid data, the critical values are:
    • Mean = 0.80, Std. Dev. = 0.302, n = 8
  – Chauvenet’s Criterion Value = n/2 = 4
  – Z-score = 2.054 (two-tailed probability of n/2 = 4 %)
  – The calculations are:
    • Equation 1) (Z-score)(Std. Dev.) = (2.054)(0.302) = 0.620
    • Mean ± Equation 1 = 0.80 ± 0.620 = 1.42 and 0.18
    • Outlier Range is >1.42 or <0.18
  – A value of 0.2 is not an outlier.
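The arithmetic on this slide can be checked with a short script. The sketch below takes the slide's 4 % two-tailed probability as given rather than deriving it, and uses the standard normal distribution from the Python standard library; the function name is hypothetical.

```python
from statistics import NormalDist


def outlier_range(mean: float, sd: float, two_tailed_p: float):
    """Return (low, high, z): values outside low..high are flagged."""
    # z-score that leaves two_tailed_p / 2 in each tail of a standard normal
    z = NormalDist().inv_cdf(1.0 - two_tailed_p / 2.0)
    half_width = z * sd
    return mean - half_width, mean + half_width, z


if __name__ == "__main__":
    low, high, z = outlier_range(mean=0.80, sd=0.302, two_tailed_p=0.04)
    print(round(z, 3), round(low, 2), round(high, 2))
    suspect = 0.2
    print("outlier" if suspect < low or suspect > high else "not an outlier")
```

With the slide's inputs this reproduces the z-score of 2.054 and the 0.18 to 1.42 range, so the suspect value of 0.2 falls just inside the range.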
CAN A CAUSE BE ASSIGNED TO THE OUTLIER(S) ?
• Review analyst’s daily observations
• Check water chemistry data
• Check data entry
• Check calculations
• If cause can be assigned to outlier, then reanalyze data without outlier
DETERMINE EFFECT ON TEST INTERPRETATION
• Keep all data unless cause is found
• Analyze data with and without suspect data
• Determine effect of suspect data on test interpretation
• Results reported will depend on effect of outlier(s) on test interpretation, best professional judgement, and discussions with regulatory agency
REPORTING OF RESULTS
• Insignificant Effect
  – With Outlier
    • IC25 = 131 (96.9-158) ppb
    • NOEC = 100 ppb
    • % MSD = 28.1 %
  – Without Outlier
    • IC25 = 124 (93.6-152) ppb
    • NOEC = 100 ppb
    • % MSD = 20.9 %
  – Report results with suspect data included
• Significant Effect
  – With Outlier
    • IC25 = 131 (96.9-158) ppb
    • NOEC = 100 ppb
    • % MSD = 28.1 %
  – Without Outlier
    • IC25 = 106 (83.8-126) ppb
    • NOEC = 50 ppb
    • % MSD = 12.2 %
  – Report results from both analyses
CONCENTRATION-RESPONSE CURVES IN WET TESTS
NON-MONOTONICITY vs. HORMESIS
• Hormesis is a toxicological response to a single toxicant characterized by stimulation at low concentrations and inhibition at higher concentrations.
• Non-monotonicity is a relationship where a smaller response (e.g. mortality) is observed at the higher of two consecutive concentrations.
TYPICAL TRAITS OF HORMESIS
• Calabrese and Baldwin, 1998
• Hormetic concentration range
• Magnitude of hormetic stimulation
• Range from maximum stimulation to NOEL (NOEC)
[Figure: response versus concentration, annotated with Max. Stimulation (30-60%), Hormetic Range (10 x), Max. Stimulation to NOEL Range (4-5 x), and NOEL]
WHY IS HORMESIS DIFFICULT TO DETECT IN TOXICITY TESTS?
• Inadequate concentration series
• Inadequate description of concentration - response
• Inadequate statistical power
• Hormesis is not the cause
[Figure: Well Defined Hormetic Response; response versus concentration]
[Figure: Poorly Defined "Hormetic" Response; response versus concentration]
EFFECTS OF NON-MONOTONIC DATA
NOEC > LOEC
[Figure: Sea Urchin Fertilization Data; percent fertilized versus percent effluent, with statistically significant reductions marked]
NOEC = 6.0 %
LOEC = 0.36 %
% MSD = 5.82 %
IC25 = > 6.0 %
• Limited replicates (4)
• High control & low concentration variability
• High Statistical Power
• NOEC > LOEC
EFFECTS OF NON-MONOTONIC DATA
HETEROGENEITY IN PROBIT ANALYSIS
• Limited replicates (5)
• High control & low concentration variability
• Significant chi-square
• Inflated confidence intervals
• Reanalyze with non-parametric models
[Figure: Significant Chi-Square for Heterogeneity; response (0.0 to 1.0) versus dose (ppb, 1 to 10000)]
EFFECTS OF NON-MONOTONIC DATA
SMOOTHING IN ICp ANALYSIS
• Smoothing is used in all non-parametric models.
• Smoothing procedure averages treatment responses
• Increases estimated toxicity
[Figure: Selenastrum Cell Growth Data; response (% of control) versus percent effluent; series: Actual Response, Smoothed Response]
REMEDIES FOR PROBLEMS ASSOCIATED WITH NON-MONOTONIC DATA
• Better concentration series selection
• Increase number of replicates
• % MSD limits (NOECs)
• Use of more robust parametric models
  – Bailer and Oris, 1997
  – Kerr and Meador, 1996
  – Baird et al., 1996
• Concentration-response curve criterion
CONFIRMATION OF A CONCENTRATION-RESPONSE CURVE
• Graphical
• Linear regression analysis
• Correlation analysis
GRAPHIC ANALYSIS OF CONCENTRATION-RESPONSE CURVES
[Figure: Concentration-Response Curve Absent; response (% effect) versus concentration (% effluent), showing raw data, means, and % MSD]
[Figure: Concentration-Response Curve Present; response (% effect) versus concentration (% effluent), showing raw data, means, and % MSD]
GRAPHIC ANALYSIS OF CONCENTRATION-RESPONSE CURVES
[Figure: Concentration-Response Curve Present ???; response (% effect) versus concentration (% effluent), showing raw data, means, and % MSD]
LINEAR REGRESSION ANALYSIS OF CONCENTRATION-RESPONSE CURVES
[Figure: Concentration-Response Curve Absent; response (% effect) versus concentration (% effluent), showing raw data, means, probit fit, and % MSD. Negative slope, not significantly different from zero]
[Figure: Concentration-Response Curve Present; response (% effect) versus concentration (% effluent), showing raw data, means, probit fit, and % MSD. Positive slope, significantly different from zero]
LINEAR REGRESSION ANALYSIS OF CONCENTRATION-RESPONSE CURVES
[Figure: Concentration-Response Curve Present ???; response (% effect) versus concentration (% effluent), showing raw data, means, probit fit, and % MSD. Positive slope, not significantly different from zero]
CORRELATION ANALYSIS OF CONCENTRATION-RESPONSE CURVES
[Figure: Concentration-Response Curve Present; response (% effect) versus concentration (% effluent), showing raw data, means, and % MSD. Significant negative correlation (r = -0.965, P = 0.000)]
[Figure: Concentration-Response Curve Absent; response (% effect) versus concentration (% effluent), showing raw data, means, and % MSD. Insignificant correlation (r = -0.0931, P = 0.593)]
CORRELATION ANALYSIS OF CONCENTRATION-RESPONSE CURVES
[Figure: Concentration-Response Curve Present ???; response (% effect) versus concentration (% effluent), showing raw data, means, and % MSD. Significant negative correlation (r = -0.389, P = 0.021)]
SUMMARY
• Identification of a significant C-R curve is an important QA check.
• Graphical analysis is simple but subjective.
• Linear regression analysis is objective and conservative but requires parametric analysis.
• Correlation analysis is objective and liberal, and non-parametric methods are available.
BIOLOGICAL INTERFERENCE IN FATHEAD CHRONIC TESTS
TOXICITY CHARACTERISTICS
• Seasonal (cold months)
• Affects only fathead minnows
• High variability
• Poor dose response
• Fungus-like growth
[Photos: Normal Gills and Pharynx; Bacterial Clogging]
% Survival on Day of Test
Rep   Day 3   Day 4   Day 7
1     100     13      0
2     100     25      0
3     100     100     100
4     100     88      88
5     100     50      13
STERILIZATION
[Figure: % survival, untreated versus treated, in four panels: UV Light (25%, 50%, 100%), Autoclaved (50%, 100%), Pasteurize (25%, 50%, 100%), Antibiotic (25%, 50%, 100%)]
ANTIBIOTIC ADDITION
[Figure: % survival, Baseline versus Diluent Only; series: Rec control, 32%, 42%, 56%, 75%, 100%]
ANTIBIOTIC ADDITION
[Figure: % survival, Baseline versus Diluent + Effluent; series: Rec control, 32%, 42%, 56%, 75%, 100%]
EFFECT OF ISOLATION
[Figure: % alive since previous day versus day of test (1 to 6); series: Sick Fish Removed, Dead Fish Removed]
CONCLUSION
• “Toxicity” due to a naturally occurring pathogen
• Best viewed as a kind of interference
CONTROLLING BIOLOGICAL INTERFERENCE
• Heat
• Filtration (0.2 µm)
• UV light
• Antibiotics
HEAT
Advantages:
• Simple, no specialized equipment
Disadvantages:
• May be more “intrusive” (e.g. removal of volatile components)
• Must re-aerate sample
FILTRATION (0.2 µm)
Advantages:
• Usually very effective
Disadvantages:
• Impractical with high suspended solids
• Requires specialized equipment for filtering large volumes
• May remove particulate-bound contaminants
UV LIGHT
Advantages:
• Usually very effective
• Uses common equipment
Disadvantages:
• Less effective with high suspended solids or stained water
• May degrade organic contaminants or enhance organic toxicity (e.g. PAHs)
ANTIBIOTICS
Advantages:
• Usually very effective
• Chemicals inexpensive and widely available
• Easy to treat large volumes
Disadvantages:
• May require determination of proper dose
SUMMARY
• Chronic WET tests using fathead minnows may show evidence of interference due to pathogens.
• Interference = high variability, poorly defined dose response
• Most common with surface waters
• Control measures = sample treatment to kill or remove pathogens.
STATISTICAL AND BIOLOGICAL SIGNIFICANCE
TOXIC VS. NON-TOXIC
• WET Tests Developed to Identify Toxic Samples
• Two Methods Used
  – Hypothesis testing - Statistical difference
  – Point-estimation - Standard level of effect
TOXICITY ASSUMPTIONS OF HYPOTHESIS TESTING
• Non-Toxic = No statistical difference between control and critical concentration response
• Toxic = Statistical difference between control and critical concentration response
TOXICITY ASSUMPTIONS OF POINT-ESTIMATION
• A preselected level of effect is considered toxic
  – Acute test: 50 % effect
  – Chronic test: 25 % effect
• Toxic = ECx/ICx is less than the critical concentration (IWC)
• Non-Toxic = ECx/ICx is equal to or greater than the critical concentration (IWC)
BOTH APPROACHES HAVE STRENGTHS AND LIMITATIONS
• Complete Discussion in:
  – Grothe et al., Eds. 1996. Whole Effluent Toxicity Testing: An Evaluation of Methods and Prediction of Receiving System Impacts. SETAC Press, Pensacola, FL, USA.
STRENGTHS AND LIMITATIONS OF HYPOTHESIS TESTS
• Strengths
  – Suited for comparison of treatments
  – Simple to calculate (no modeling)
  – Not model dependent
• Limitations
  – NOEC is concentration dependent
  – Variability reduces statistical power and increases the effect size needed for significance
  – No confidence intervals
  – Results are independent of concentration-response curve
STRENGTHS AND LIMITATIONS OF POINT ESTIMATES
• Strengths
  – Uses concentration-response curve
  – Not limited to tested concentrations
  – Confidence intervals
• Limitations
  – Selection of effect level
  – Partial responses increase accuracy
  – Model dependent
  – More difficult computations
WHICH METHOD IS BEST?
• Both Approaches Are Supported By The TSD And The Methods Manuals
• Depends On The Purpose Of The WET Test
  – Hypothesis test - Identify statistical difference from control response
  – Point-estimate - Concentration which shows a standard effect
TOXIC MAY NOT = ECOLOGICAL IMPACT
• Hypersensitive Hypothesis Tests
• Relatively Sensitive Test Species
• Inconsistent Exposure Parameters Between the Toxicity Test and Receiving Water
  – Magnitude, duration, frequency of exposure
  – Water chemistry
• Population/Community Structure Dynamics
NONTOXIC MAY NOT = NO ECOLOGICAL IMPACT
• Hyposensitive Hypothesis Tests
• High Effect Level In Point-Estimates
• Relatively Insensitive Test Species
• Inconsistent Exposure Parameters Between the Toxicity Test and Receiving Water
  – Magnitude, duration, frequency of exposure
  – Water chemistry
• Undetected Biological Effects
• Population/Community Structure Dynamics
WHAT CONCLUSIONS CAN BE MADE?
• The Sample Is Toxic/Non-Toxic As Defined By The WET Program
• The Biological Impact Was Significant/Insignificant In The Beaker
• The Receiving Water May or May not Become Impacted
WAYS TO INCREASE THE ECOLOGICAL RELEVANCE
• Identification of Toxic Agent(s)
• Consider the Use Of Indigenous Species In Toxicity Tests
• Consider Exposure Parameters Found In Receiving Water
  – Magnitude, duration, frequency of exposure
  – Water chemistry
  – Ambient water tests
• In Situ Bioassays
• Detection and Study Of Other Biological Effects
• Comprehensive Study Of Population/Community Structure Dynamics In Receiving Water
• Further Studies In A Variety Of Ecosystems Which Examine The Relationship Between WET Tests And Ecological Impact.
COST OF “ECOLOGICALLY RELEVANT” WET TESTS
• Very Expensive
  – Methods Research and Development
  – Receiving water characterization
  – Field bioassessments
• Loss Of Comparability
• Increase In Complexity Of Water Quality Standards and Interpretation
SUMMARY
• WET Tests Were Developed To Identify Toxic and Nontoxic Samples
• WET Tests Are Useful In Conjunction With Chemical And Field Assessment Data To Protect Aquatic Ecosystems
• Adaptation Of WET Tests To Be Ecologically Relevant Can Be Helpful But Comes At A Cost
FALSE POSITIVES FALSE NEGATIVES
GUIDING PRINCIPLE = REPEATABILITY
Repeatable test results are taken as “true” or “real” or “correct”.
FALSE POSITIVES/NEGATIVES IN CONTEXT OF WET TESTS
Depends on presumed function of WET tests:
• WET Test as “predictor” of instream effects.
• WET Test as “detector” of toxic amounts of toxic chemicals
WET TEST AS “PREDICTOR” OF INSTREAM EFFECTS.
• False Positive = false indication of instream effects
• False Negative = failure to indicate instream effects
WET TEST AS “DETECTOR” OF TOXIC AMOUNTS OF TOXIC CHEMICALS
• False Positive = false indication of presence of toxic amounts of toxic chemicals
• False Negative = failure to indicate presence of toxic amounts of toxic chemicals
WHAT IS “TOXICITY”?
• Statistically significant difference between effluent concentration and control
• An LC50 or other point estimate that is less than some predetermined value
The operational definition of toxicity is often statistical
TOXICITY AS A STATISTICAL CONCEPT
• False Positive = Statistically significant effect that is not “Real” (spurious, artifactual).
• False Negative = Effect that should be observed but is not.
THERE ARE REASONS WHY STATISTICALLY SIGNIFICANT RESULTS HAPPEN
At most, 4 things are present in a test beaker:
• Diluent
• Sample
• Organism(s)
• Food
TOXICITY NOT DUE TO SAMPLE
• Technician error
• Bias in test chamber location or in assigning organisms to treatments.
• Statistical sampling error (Type 1 error)
• Other
TECHNICIAN ERROR
• Expertise
• Experience
BIAS IN ORGANISM/CHAMBER ASSIGNMENT
• Bias in organism assignment is a tendency to assign healthier or less healthy organisms to certain test concentrations
• Systematic arrangement of test chambers can result in systematic bias in organism response (e.g. Selenastrum algal growth test)
• Can be eliminated through proper randomization. (See Davis et al., 1998)
STATISTICAL OUTCOMES
Types of Errors in Hypothesis Testing
                        If Ho is True    If Ho is False
If Ho is rejected       Type I error     No error
If Ho is not rejected   No error         Type II error
HYPOTHESIS TESTING FACTS
• NOECs are not point estimates
• Cannot calculate coefficients of variation or confidence intervals
• NOEC is a lower concentration level than the LOEC when the dose response curve is smooth
• LOEC may represent a different amount of effect from test to test
[Figure: distributions of the test statistic when the null hypothesis is TRUE and when it is FALSE, with the points o, msd, and a marked on the axis]
α = 0.05 = Type 1 Error
β = 0.2 = Type 2 Error
Power = 1 - β = 0.8
STATISTICAL SAMPLING ERROR
• Type 1 error.
• Should be rare (P < alpha)
• Not repeatable
• Can be reduced by decreasing alpha but at cost of increasing Type 2 error (False Negatives)
“UNINTERESTING” TOXICITY
Toxic response due to a sample that deviates from culture conditions but is still within standard test conditions. E.g. The toxic response is due to a slight difference in pH (0.2 units).
FALSE NEGATIVE: FAILURE OF THE TEST SYSTEM TO INDICATE TOXICITY
• Operator error
• Bias
• Type 2 error
• Intrinsically variable data
• Interference
CONCLUSIONS
False +/- are “wrong” answers.
• In the absence of technician error, biased test design and biased sampling, the False +/- rate = Type I and II error rate, respectively.
• Repeatable results, in the absence of technician error and biased sampling, cannot be False +/-’s.
• An estimate of the False + rate could be obtained through testing of blanks.
INTRA- AND INTER- TEST VARIABILITY
TYPES OF VARIABILITY
• Variability inherent in any analytical procedure
• Intra-test: among and between concentrations
• Intra-lab: within one lab, same method
• Inter-lab: between labs, same method
• Method specific: within limits of method
  – organism age, length of test, dilution water, food type, etc.
INTRA-TEST VARIABILITY
Group     N   Mean Survival   s.d.    CV (%)
control   4   0.975           0.050   5.1
2         4   0.975           0.050   5.1
3         4   0.975           0.050   5.1
4         4   0.950           0.058   6.1
5*        4   0.675           0.150   22.2
6*        4   0.275           0.222   80.6
MSE = 0.033   MSD = 13.9 %
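The CV (%) column above is the sample standard deviation divided by the mean, times 100. The sketch below shows that calculation from replicate values; the function name is hypothetical, and the four replicate survival proportions are illustrative values chosen so that the mean and s.d. match the control row, not data taken from the slide.

```python
from statistics import mean, stdev


def cv_percent(values) -> float:
    """Coefficient of variation (%) = 100 * sample s.d. / mean."""
    return 100.0 * stdev(values) / mean(values)


if __name__ == "__main__":
    # Hypothetical replicate survival proportions (n = 4).
    control_survival = [1.0, 1.0, 1.0, 0.9]
    print(round(mean(control_survival), 3))
    print(round(stdev(control_survival), 3))
    print(round(cv_percent(control_survival), 1))
```

Because the mean appears in the denominator, the same absolute scatter produces a much larger CV at low survival, as in groups 5 and 6 of the table.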
INTRA-TEST VARIABILITY AND ENDPOINT UNCERTAINTY
EC   Conc.   Lower 95% CL   Upper 95% CL   Conf. Int/EC
1    220     95             310            0.98
10   332     196            422            0.68
50   553     440            670            0.41
90   919     744            1416           0.73
99   1392    1024           2906           1.35
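The Conf. Int/EC column appears to be the width of the 95 % confidence interval divided by the point estimate. The sketch below assumes that reading; the function name is hypothetical, and the inputs are the tabulated values, which reproduce the tabulated ratios to within about 0.01.

```python
def relative_ci_width(estimate: float, lower: float, upper: float) -> float:
    """Width of the confidence interval as a fraction of the estimate."""
    return (upper - lower) / estimate


if __name__ == "__main__":
    # (EC level, concentration, lower 95 % CL, upper 95 % CL) from the table
    rows = [(1, 220, 95, 310), (10, 332, 196, 422), (50, 553, 440, 670),
            (90, 919, 744, 1416), (99, 1392, 1024, 2906)]
    for level, conc, lower, upper in rows:
        print(f"EC{level}: {relative_ci_width(conc, lower, upper):.2f}")
```

The ratio is smallest near the EC50 and grows toward both tails, which is the point of the table: endpoints far from the middle of the curve carry more relative uncertainty.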
POINT ESTIMATE INTRA-LAB VARIABILITY
[Figure: LC50 (mg/l SDS) by test; series: LC50, 95% UCI, 95% LCI, mean LC50]
HYPOTHESIS TESTS INTRA-LAB VARIABILITY
[Figure: NOEC (ppb Cu) by test number. Horizontal lines = acceptance limits for two dilution series (red dotted = 0.5; blue dashed = 0.75)]
SOURCES OF INTRA-TEST VARIABILITY
• Genetic variability
• Organism handling and feeding
• Toxicity among and between treatments
• Non-homogeneous sample source
• Sample toxicity
SOURCES OF INTRA-TEST VARIABILITY
• Abiotic conditions
• Dilution scheme
• Number of organisms/treatment
• Dilution water pathogens
• Randomization important!
SOURCES OF INTRA-LAB VARIABILITY
• Intra-test sources
• Analyst experience and practice
• Organism age and health
• Acclimation
• Dilution water
• Type of sample
SOURCES OF INTRA-LAB VARIABILITY
• Sample quality
• Test chamber characteristics
• Organisms/source
• Food type/rate/source
SOURCES OF INTRA-LAB VARIABILITY
• Replicate volume
• Test duration
• Procedures
SOURCES OF INTER-LAB VARIABILITY
• All of previous are important
• Differences allowed in methods
  – Could be significant between labs
• Differences in protocols
  – State, federal, local, etc.
  – Use promulgated standard
• ANALYST EXPERIENCE
VARIABILITY AND POINT ESTIMATE UNCERTAINTY
              Test #1      Test #2
Mean CV (%)   9.9          33.8
IC25 (%)      27.2         26.0
MSE           34.5         290.6
95% CI        25.7-28.5    17.2-31.3
HYPOTHESIS TESTS HIGH VARIABILITY - LOW STATISTICAL POWER
Group     n   Mean wt (ug/ind)   s.d.   CV%
Control   4   632                552    87.4
2         4   727                674    92.7
3         4   1080               408    37.7
4         4   564                493    87.5
5         4   748                235    31.4
MSD = 131 %
HYPOTHESIS TESTS LOW VARIABILITY - HIGH STATISTICAL POWER
Group     n   Mean wt (mg/survivor)   s.d.    CV
Control   4   0.30                    0.012   4.0%
10%       4   0.30                    0.013   4.3%
18%       4   0.31                    0.008   2.6%
32%       4   0.30                    0.010   3.3%
56%       4   0.27*                   0.013   4.8%
100%      4   0.27*                   0.013   4.8%
MSD = 6.5 %
ACTIONS TO REDUCE VARIABILITY
• Establish performance criteria
• QA program
• Establish and follow strict procedures
• MAXIMIZE ANALYST SKILL
• Contract lab selection
• Additional QA/QC criteria
WHY DETERMINE METHOD VARIABILITY AND WHY CONTROL VARIABILITY?
• If inherent variability of each method is known there will be less chance of making errors concerning toxicity.
• Variability too high: may not detect toxicity when it is present. Variability too low: might detect toxicity when it is not there.
• At present there is little incentive to reduce variability.
EXAMPLES OF ADDITIONAL QC TEST CRITERIA
• EPA Region IX: upper MSD limits
• Washington: upper MSD limits, change in
• N. Carolina: limit control CVs, C. dubia “Practical Sensitivity Criteria”
• EPA Region VI: limit control CV, increase number of replicates, biological significance
THE CHRONIC TEST GROWTH ENDPOINT
CHANGE IN GROWTH ENDPOINT CALCULATION
Pre-Nov., 1995 Approach
Growth = D.W. surviving organisms / # surviving organisms
Post-Nov., 1995 Approach
Growth = D.W. surviving organisms / # initial organisms

Treatment   % Mortality   Before Promulgation   After Promulgation
Control     5.1           325                   308
2           2.6           353                   341
3           5.0           345                   329
4           17.9          387                   306
5           47.5          319                   167
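The two growth calculations differ only in the denominator. The sketch below applies both to a single replicate; the function names and the example replicate (10 organisms at test start, 8 survivors, 3.2 mg total dry weight) are hypothetical.

```python
def growth_pre_1995(total_dry_weight: float, n_surviving: int) -> float:
    """Pre-Nov. 1995: dry weight of survivors / number of survivors."""
    if n_surviving <= 0:
        raise ValueError("no survivors: mean weight per survivor is undefined")
    return total_dry_weight / n_surviving


def growth_post_1995(total_dry_weight: float, n_initial: int) -> float:
    """Post-Nov. 1995: dry weight of survivors / number of initial organisms."""
    if n_initial <= 0:
        raise ValueError("initial organism count must be positive")
    return total_dry_weight / n_initial


if __name__ == "__main__":
    total_mg, n_initial, n_surviving = 3.2, 10, 8   # hypothetical replicate
    print(round(growth_pre_1995(total_mg, n_surviving), 3))
    print(round(growth_post_1995(total_mg, n_initial), 3))
```

With no mortality the two values are identical; the gap widens as mortality rises, which matches the pattern in the table, where the largest before/after difference occurs in the treatment with 47.5 % mortality.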
EFFECT ON MEAN TREATMENT RESPONSES
[Figure: CV (%) by observation; series: After, Before]
INTRA-TREATMENT VARIABILITY AND WEIGHT CALCULATIONS
[Figure: old MSE/new MSE ratio by test (1 to 10); series: Ref. Tox., Effluent]
EFFECTS ON HYPOTHESIS TEST ENDPOINTS
         Before Promulgation   After Promulgation
Test #   %MSD    NOEC          %MSD    NOEC
1        16.4    50            16.7    50
2        10.8    10            29.1    10
3        11.9    5             39.0    5
4        19.7    25            18.5    25
EFFECTS ON HYPOTHESIS TEST ENDPOINTS
         Before Promulgation                 After Promulgation
Test #   %MSD   NOEC   Avg. wgt. at NOEC     %MSD   NOEC   Avg. wgt. at NOEC
1        20.9   100    296                   23.4   100    296
2        19.5   100    268                   25.1   100    233
3        22.1   100    254                   24.1   100    227
4        21.4   100    387                   22.8   100    313
EFFECTS ON POINT ESTIMATE ENDPOINTS
         Before Promulgation    After Promulgation
Test #   IC25   95% CI          IC25   95% CI
1        56.2   45.4-79.3       48.3   43.3-61.9
2        NC     NC              12.4   6.4-13.8
3        NC     NC              4.2    1.5-7.3
4        33.7   28.2-40.6       30.0   19.4-35.0
EFFECTS ON POINT ESTIMATE ENDPOINTS
         Before Promulgation    After Promulgation
Test #   IC25   95% CI          IC25   95% CI
1        291    NC              234    191-262
2        386    NC              176    140-256
3        227    179-258         138    111-155
4        >400   NC              144    104-162
NOEC/IC25 RELATIONSHIP
Test #   Test Type   NOEC      IC25 Before   IC25 After
1        Effluent    50%       56.2          48.3
2        Effluent    25%       33.7          30.0
3        Ref. Tox.   100 ppb   291           234
4        Ref. Tox.   100 ppb   386           176
5        Ref. Tox.   100 ppb   227           138
6        Ref. Tox.   100 ppb   >400          144
IMPACT ON TEST INTERPRETATION
• Hypothesis Test Results - most cases show little change, but not always
• Point Estimate Results - usually increases predicted toxicity
ISSUES RELATED TO CHANGE IN APPROACH
• Test growth or biomass?
• Accurate representation of growth?
• Correlation between new results and instream responses?
ISSUES RELATED TO CHANGE IN APPROACH
• Conflict between new results and unchanged effluent quality?
• Effect on reference toxicant control charts
• Relationship between NOEC and IC25
AGE-RELATED SENSITIVITY OF FISH IN ACUTE WET TESTS
REVISIONS TO FISH AGES IN EPA ACUTE TEST MANUALS
• From: 1-90 days old in the 3rd edition of the acute manual (1985; EPA/600/4-85/013)
• To: 1-14 days old (or 9-14 days old for silversides) in the 4th edition of the acute manual (1993; EPA/600/4-90/027F)
COMMONLY USED TEST SPECIES
• Fathead minnows
• Sheepshead minnows
• Silversides (inland, Atlantic, and tidewater)
RATIONALE
• Younger life stage is generally more sensitive than older life stage
• Reduction in range of acceptable ages from 1-90 to 1-14 days will reduce variability
CONCERN
• Use of younger fish in NPDES testing may show an increase in apparent toxicity, without any changes in effluent conditions
COMMON QUESTIONS
• Are <14-day old fish more sensitive than <90-day old fish to toxicants?
• Does the use of <14-day old fish reduce intertest variability when compared to <90 day-old fish?
• How does the sensitivity and precision vary within the 1 to 14 day old age range?
SENSITIVITY OF 14, 30, AND 90 DAY-OLD FATHEAD MINNOWS
[Figure: mean 96 hr LC50 versus age (14, 30, 90 days) in two panels: Copper (ppb) and Unionized Ammonia (ppm); letters A, B, and C label the age groups]
INTER-TEST PRECISION OF 14, 30, AND 90-DAY OLD FATHEAD MINNOWS
[Figure, two panels. Copper and Unionized Ammonia: coefficient of variation vs. age (14, 30, 90 days).]
SENSITIVITY OF 1-14 DAY-OLD FATHEAD MINNOWS
[Figure, four panels. Sodium Pentachlorophenol (ppb), Hexavalent Chromium (ppm), SDS (ppm), and Unionized Ammonia (ppm): mean 48 hr LC50 vs. age (1, 4, 7, 10, 14 days). Bars are labeled with the letters A and B.]
INTER-TEST PRECISION OF 1-14 DAY-OLD FATHEAD MINNOWS
[Figure: coefficient of variation vs. age range (1-14, 4-14, 7-14, 10-14 days) for NaPCP, Cr+6, SDS, and NH3.]
SUMMARY
• 14-day-old fathead minnow larvae are more sensitive to copper & ammonia than 90-day-old fish.
• The inter-test precision of 90-day-old fish is equal to or better than that of 14-day-old fish for copper & ammonia.
SUMMARY - Cont.
• Within the 1-14 day age range, 1-day-old larvae are less sensitive to several toxicants.
• Sensitivity to these toxicants becomes constant after 4-7 days of age.
• Maximum inter-test precision for these toxicants is observed when the age range is limited to 7-14 day-old larvae.
REASONABLE POTENTIAL AND TOXICITY TEST DESIGN
RP DETERMINATION DEFINITION
• “to determine whether the discharge causes, has the reasonable potential to cause, or contributes to an excursion of numeric or narrative water quality criteria” (TSD, 1991)
REASONABLE POTENTIAL
• 40 CFR 122.44(d)(1) requires that the RP procedure address the following:
– effluent variability
– existing controls on all pollution sources
– available dilution
– species sensitivity
• WERF POTW survey found that RP is not consistent among regulatory agencies
REASONABLE POTENTIAL EXAMPLES
• Virginia definition is that 75% of tests must meet decision criterion
• Region IX uses a statistical approach adopted from the TSD
• Some states do not issue limits
• Some states issue limits to all major dischargers
VARIABILITY AND RP
• Primarily an inter-test issue
– effluent variability
– method variability
• How is it determined?
– Assumptions
  • TSD
  • Similar facilities
– Collecting sufficient data
  • Monthly?, Quarterly?, Annually?
VARIABILITY ASSUMPTION ISSUES
• TSD assumption (CV=0.6) may not be accurate
• Data from similar facilities may be used, which reduces some uncertainty
• Actual data always best - greater certainty in decision to issue limit
• Reduce potential for erroneous conclusions based on a few data points
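As a rough illustration of checking the default variability assumption against actual data, the sketch below computes a coefficient of variation from a facility's own test results and compares it with the TSD default of 0.6. The toxic-unit values and the function name are hypothetical and serve only to show the arithmetic.

```python
import statistics

TSD_DEFAULT_CV = 0.6  # default assumption cited on the slide


def coefficient_of_variation(results):
    """Sample standard deviation divided by the mean."""
    return statistics.stdev(results) / statistics.mean(results)


# Hypothetical chronic toxic units (TUc) from eight tests on one discharge
tuc = [1.0, 1.3, 2.0, 1.0, 4.0, 1.6, 1.0, 2.5]

cv = coefficient_of_variation(tuc)
print(f"observed CV = {cv:.2f}, TSD default = {TSD_DEFAULT_CV}")
```

With only a few data points the observed CV is itself uncertain, which is the reason given above for collecting sufficient data before relying on it.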
[Figure labels: 95% (1), WLA, 95% (2).]
HOW TO ADDRESS VARIABILITY THROUGH TEST DESIGN
• Consistency between tests:
– dilution schemes
– dilution water type and characteristics
– test vessel dimensions and material
– test replicate volume
– increase sample size per rep. or conc.
– test organism age (acute tests)
– species sensitivity affects variability
SPECIES SENSITIVITY AND RP
• Two Components:
– Representative of condition to be protected?
– Magnitude of toxicity
• Both components affected by:
– species
– age of life stage
– dilution water quality
– test type (static, renewal, flow-through)
– culturing/handling of organisms
SPECIES SENSITIVITY AND REPRESENTATION OF TOXICITY
• Important that tests be reliable indicators of toxicity, dependent on some test design parameters:
– pH
– hardness
– alkalinity
– treatment renewals
TEST AND INSTREAM HARDNESS
• C. dubia sensitive to hardness
• C. dubia acclimated and tested at 120 ppm hardness
• Instream and effluent hardness is 300 ppm
• Test result due to effluent or sensitivity to hardness?
• Solution: test different organism or C. dubia cultured at higher hardness
SPECIES SENSITIVITY, TOXICITY, ORGANISM AGE & RP
• Flexibility in organism age tested
– acute: significant
– chronic: minimal
• Data indicate that age affects sensitivity
SPECIES SENSITIVITY, TOXICITY, DILUTION WATER QUALITY & RP
• Example: pH
• If ammonia is present and pH rises artificially in the test beyond real-world levels, ammonia may contribute to toxicity and affect the results used to determine RP
• Solution: control pH in tests at levels occurring at the condition of interest (IWC, 100% discharge, etc.) using direct control (CO2 headspace) or flow-through testing
DILUTION & RP
• EPA’s RP approach compares data distribution to WLA
• If the WLA is predicted to be exceeded by a specific percentile of the distribution, then RP exists
• WLA consists of numeric standard and dilution
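The comparison of a distribution percentile with the WLA can be sketched as follows. This is a minimal sketch that assumes the toxicity data are lognormally distributed and are summarized by a long-term average and a CV; the slides do not state the distribution, and the long-term average and the two WLA values below are hypothetical. The CV of 1.06 is the value shown for Ceriodaphnia sp. in the accompanying figure.

```python
import math
from statistics import NormalDist


def lognormal_percentile(long_term_average, cv, percentile=0.95):
    """Percentile of a lognormal distribution given its mean and CV."""
    sigma2 = math.log(1.0 + cv ** 2)                 # variance of the logs
    mu = math.log(long_term_average) - 0.5 * sigma2  # mean of the logs
    z = NormalDist().inv_cdf(percentile)             # standard normal quantile
    return math.exp(mu + z * math.sqrt(sigma2))


def reasonable_potential(long_term_average, cv, wla, percentile=0.95):
    """RP exists if the chosen percentile exceeds the WLA."""
    return lognormal_percentile(long_term_average, cv, percentile) > wla


lta_tuc = 2.0  # hypothetical long-term average, chronic toxic units
p95 = lognormal_percentile(lta_tuc, cv=1.06)
print(f"95th percentile = {p95:.2f} TUc")
print("RP against WLA of 4 TUc:", reasonable_potential(lta_tuc, 1.06, wla=4.0))
print("RP against WLA of 8 TUc:", reasonable_potential(lta_tuc, 1.06, wla=8.0))
```

The same data set can show RP against one WLA and not against another, so the dilution built into the WLA matters as much as the toxicity data.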
[Figure: Ceriodaphnia sp., CV = 1.06. Relative frequency vs. chronic toxic units, marking the long-term average, WLA1, the 95th percentile, and WLA2.]
ADDRESSING DILUTION & RP IN TEST DESIGN
• Center test dilutions on respective effluent concentrations of concern
• Test dilutions below and above these concentrations
• Avoid testing concentrations/conditions which are unlikely to naturally occur
• Maximize dilution factor with intra-test and inter-test uncertainty in mind
CHOOSING TEST DILUTIONS
• Example:
– Chronic IWC = 25%
– Dilutions of 23%, 24%, 25%, 26% and 27% may miss toxicity at 28%, which is well within the uncertainty of most chronic endpoints, and may result in a false negative indication of toxicity
– If dilutions are 6.25%, 12.5%, 25%, 50% and 100%, there is little environmental relevance to results at a concentration 4x the IWC
– Choose something in between, like 12%, 17%, 25%, 35% and 50% (dilution factor 0.7)
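A dilution series centered on the IWC can be generated from the dilution factor alone. The sketch below is one way to do that; the function name and its parameters are hypothetical, and the cap at 100 % effluent is an added assumption.

```python
def dilution_series(iwc, factor, n_below=2, n_above=2):
    """Concentrations (% effluent) spaced by a constant dilution factor
    and centered on the IWC, listed from lowest to highest."""
    if not 0 < factor < 1:
        raise ValueError("dilution factor must be between 0 and 1")
    series = [iwc * factor ** k for k in range(n_below, -n_above - 1, -1)]
    # Assumption: nothing above 100 % effluent can be tested
    return [min(c, 100.0) for c in series]


series = dilution_series(iwc=25.0, factor=0.7)
print([round(c, 1) for c in series])
```

For an IWC of 25 % and a factor of 0.7, the computed values fall close to the rounded 12 %, 17 %, 25 %, 35 % and 50 % series in the example.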
RP TEST DESIGN SUMMARY
• Minimize inter-test method variability
• Ensure representative test results through control of parameters not limited by methods
• Account for dilution in tests
• Balance maximum dilution factors in tests with endpoint uncertainty
MOST SENSITIVE SPECIES SELECTION
SETAC Expert Advisory Panel Performance Evaluation and Data Interpretation
MOST SENSITIVE SPECIES (MSS) DETERMINATION
• Purpose
– To determine which test species is “most sensitive” to an effluent source or ambient water
• Desired Toxicity Information from MSS
– Variability/Seasonality
– Magnitude or frequency of “sensitive” response
COMMON CONSIDERATIONS
• Test Frequency
• Species Selection
• Dilution Water Type
• Sample Type
• Concentration Series
• Statistical Analysis
FREQUENCY AND TIMING OF MSS SCREENS
• Balance of Cost and Adequate Information
• Initial or Reevaluation
• Seasonal or Summary Information Desired
SELECTION OF TEST SPECIES
• Diversity of Organism Types
– Plant, vertebrate, invertebrate
• Nature of Receiving Water
– Salinity, resident species
• Non-promulgated, Resident Species
• Suspected Toxicant(s)
– USEPA Region 9 & 10 Guidance Document
SELECTION OF DILUTION WATER
• Method Defined Synthetic Dilution Water
• Natural Receiving Waters
• Receiving Water Defined Synthetic Dilution Water
SELECTION OF SAMPLE TYPE
• Whole Effluents
• Receiving Water
• Composite or Grab Samples
CONCENTRATION SERIES SELECTION
• Multiple Concentration Tests
– Preferred experimental design for MSS screens
– Select concentrations based upon IWC and elucidation of concentration-response (C-R) relationship.
• Single Concentration Tests (Pass/Fail)
– Effective if cost is prohibitive
– Control and IWC
STATISTICAL ANALYSIS AND INTERPRETATION
• Multiple Biological Endpoints
• Combining Multiple Screen Results
• Statistical Analysis Method
MULTIPLE BIOLOGICAL ENDPOINT ANALYSIS
• Evaluate each biological endpoint
• Use most “toxic” endpoint
[Figure: Kelp Germination and Germ Tube Length. Effluent concentration (%) by statistical endpoint (NOEC, EC/IC25) for the germination and tube length endpoints.]
METHODS OF COMBINING MSS RESULTS
• Proportion (X times out of Y screens)
• Averaging
Multiple MSSS Data Using FW Chronic Tests
[Figure: EC25/IC25 (% effluent) by screen number (1, 2, 3) for FH, CD, and SC; asterisks mark three of the values.]
Species                Proportion (X/Y)   Average
Fathead Minnow (FH)    67 % (2/3) *       87 %
Ceriodaphnia (CD)      33 % (1/3)         70 % *
Selenastrum (SC)       0 % (0/3)          97 %
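The two ways of combining screens can disagree, as the asterisks in the table suggest. The sketch below shows both calculations on hypothetical EC25/IC25 values. They were chosen so that the summary statistics resemble those in the table, but they are not the data behind the slide, and the function name is an illustration only.

```python
from statistics import mean

# Hypothetical EC25/IC25 values (% effluent), one per screen; NOT the slide's data
screens = {
    "FH": [80, 81, 100],
    "CD": [90, 90, 30],
    "SC": [100, 100, 91],
}


def proportion_most_sensitive(results):
    """Fraction of screens in which each species had the lowest value.
    Ties go to the species listed first (a simplifying assumption)."""
    n_screens = len(next(iter(results.values())))
    wins = {species: 0 for species in results}
    for i in range(n_screens):
        lowest = min(results, key=lambda s: results[s][i])
        wins[lowest] += 1
    return {s: w / n_screens for s, w in wins.items()}


proportions = proportion_most_sensitive(screens)
averages = {s: mean(values) for s, values in screens.items()}

mss_by_proportion = max(proportions, key=proportions.get)
mss_by_average = min(averages, key=averages.get)
print("most sensitive by proportion:", mss_by_proportion)
print("most sensitive by average:", mss_by_average)
```

One strongly toxic screen can dominate the average without changing the proportion, so the method of combining results should be settled before the screens are run.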
STATISTICAL ANALYSIS METHODS FOR MSS SCREENS
• NOEC’s
• Point-estimates
• Probability of effect at critical concentration (pECC)
NOEC’S
• Experimental Question
Which method/species is most likely to identify a change from control response?
ADVANTAGES OF NOEC’S
• Common method
• Integrates effect and intratest variability
[Figure: NOEC (% effluent) by species (FH, CD, SC); one value is marked with an asterisk.]
DISADVANTAGES OF NOEC’S
• Cannot separate biological effect and statistical sensitivity
• Cannot average
• NOEC’s may not be environmentally relevant
[Figure: NOEC and EC/IC25 (effluent concentration, %) by species (FH, CD, SC), with the IWC marked; two values are labeled >100.]
POINT ESTIMATES
• Experimental Question
Which method/species shows the specified effect at the lowest concentration?
ADVANTAGES OF POINT ESTIMATES
• Evaluates a common effect level
• Utilizes the entire concentration-response curve (parametric models)
• Can use proportion or average analysis
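To make the idea of a common effect level concrete, the sketch below estimates an IC25 by smoothing the treatment means so that the response never increases with concentration, then interpolating linearly. It is a simplified sketch in the spirit of a linear interpolation point estimate: it assumes equal replicate numbers in every treatment, produces no confidence limits, and uses hypothetical data and function names.

```python
def smooth_monotone(means):
    """Pool adjacent means so the response never rises with concentration."""
    blocks = [[m, 1] for m in means]  # [pooled mean, treatments pooled]
    i = 0
    while i < len(blocks) - 1:
        if blocks[i + 1][0] > blocks[i][0]:
            total = blocks[i][0] * blocks[i][1] + blocks[i + 1][0] * blocks[i + 1][1]
            count = blocks[i][1] + blocks[i + 1][1]
            blocks[i:i + 2] = [[total / count, count]]
            i = max(i - 1, 0)  # re-check against the previous block
        else:
            i += 1
    smoothed = []
    for pooled_mean, count in blocks:
        smoothed.extend([pooled_mean] * count)
    return smoothed


def icp(concs, means, p=25):
    """Concentration giving a p % reduction from the (smoothed) control mean.
    Returns None if that reduction is not reached in the tested range."""
    smoothed = smooth_monotone(means)
    target = smoothed[0] * (1 - p / 100)
    for j in range(len(concs) - 1):
        upper, lower = smoothed[j], smoothed[j + 1]
        if upper >= target >= lower and upper > lower:
            fraction = (upper - target) / (upper - lower)
            return concs[j] + fraction * (concs[j + 1] - concs[j])
    return None


# Hypothetical chronic test: % effluent and mean young per female
concs = [0, 6.25, 12.5, 25, 50, 100]
means = [20, 21, 19, 16, 12, 4]
ic25 = icp(concs, means, p=25)
print("IC25 (% effluent):", ic25)
```

A result of None corresponds to reporting the point estimate as greater than the highest concentration tested.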
[Figure: effect (%) vs. concentration (%). FH: EC25/IC25 = 70 % *; CD: EC25/IC25 = 90 %; SC: EC25/IC25 > 100 %.]
DISADVANTAGES OF POINT ESTIMATES
• Effect level selection
• Concentration-response required
• Smoothing
• No consideration of endpoint precision
• EC values may not be environmentally relevant
[Figure: effect (%) vs. concentration (%), with the IWC marked. FH: EC25/IC25 = 70 % *; CD: EC25/IC25 = 90 %; SC: EC25/IC25 > 100 %.]
PROBABILITY OF EFFECT AT THE CRITICAL CONCENTRATION (pECC)
• Experimental Question
At the concentration of environmental concern, which method/species had the greatest effect at the lower 95 % confidence limit?
ADVANTAGES OF pECC
• Considers precision of response estimate
• Can use proportion or average analysis
• Environmental relevance
• No concentration-response required
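The slides do not give the pECC computation. One possible reading of the experimental question is sketched below: estimate the percent effect at the critical concentration relative to the control, then take a lower one-sided 95 % limit from bootstrap resampling of the replicates. The function names, the resampling scheme, and the data are all assumptions made for illustration.

```python
import random
from statistics import mean


def percent_effect(control, treatment):
    """Percent reduction of the treatment mean relative to the control mean."""
    return 100.0 * (mean(control) - mean(treatment)) / mean(control)


def lower_limit_effect(control, treatment, n_boot=2000, alpha=0.05, seed=1):
    """Lower one-sided (1 - alpha) bootstrap percentile limit of the effect."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    effects = []
    for _ in range(n_boot):
        c = rng.choices(control, k=len(control))      # resample with replacement
        t = rng.choices(treatment, k=len(treatment))
        effects.append(percent_effect(c, t))
    effects.sort()
    return effects[int(alpha * n_boot)]


# Hypothetical replicate responses (e.g., young per female)
control = [30, 28, 32, 29]
at_iwc = [24, 22, 27, 23]

effect = percent_effect(control, at_iwc)
lower = lower_limit_effect(control, at_iwc)
print(f"effect at IWC = {effect:.1f} %, lower 95 % limit = {lower:.1f} %")

# With zero replicate variance the limit collapses onto the point estimate
flat_effect = percent_effect([10, 10, 10], [8, 8, 8])
flat_lower = lower_limit_effect([10, 10, 10], [8, 8, 8])
```

The second case shows why zero replicate variance is a practical problem for this kind of analysis: the confidence limit carries no information beyond the point estimate.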
[Figure: effect (%) by species (FH, CD, SC), showing ECC and pECC; one value is marked with an asterisk.]
DISADVANTAGES OF pECC
• Zero replicate variance
• Boot-strapping
• Obtaining 95% confidence intervals at IWC
[Figure: effect (%) by species (FH, CD, SC), showing ECC and pECC; one value is marked with an asterisk and two are labeled 0.]
SUMMARY
• Discuss the MSS procedure in detail during permit development
• Select a variety of organism types
• Initially test for trends in toxicity
• Continue periodic screening
• Select type of statistical analysis carefully
• Make sure that statistical analysis and the raw results “make sense”
WHOLE EFFLUENT TOXICITY TEST DESIGN
WET TESTING DESIGN
• Important factors
– discharge concentration of concern
– type of statistical analysis
– typical toxicant(s)
– dilution/control water
– receiving water quality
– number of concentrations tested
– stage in testing program (initial, advanced)
DISCHARGE CONCENTRATION OF CONCERN (COC)
• Acute
– initial dilution, if allowed, at edge of acute mixing zone multiplied by 3.3 (TSD, 1991) to convert concentration at LC1 to concentration at LC50
• Chronic
– dilution available at edge of chronic mixing zone
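The acute and chronic COC arithmetic can be written out as a short sketch. The 3.3 multiplier is the one cited above; the function names, the example instream waste concentrations, and the cap at 100 % effluent are assumptions added for illustration.

```python
LC1_TO_LC50_FACTOR = 3.3  # multiplier cited from the TSD (1991)


def acute_coc(iwc_percent):
    """Acute concentration of concern (% effluent) for an LC50 test.
    Assumption: capped at 100 %, since stronger effluent cannot be tested."""
    return min(100.0, iwc_percent * LC1_TO_LC50_FACTOR)


def chronic_coc(iwc_percent):
    """Chronic COC is the effluent concentration at the edge of the chronic mixing zone."""
    return iwc_percent


# Hypothetical: 10 % effluent at the edge of the acute mixing zone
print("acute COC:", round(acute_coc(10.0), 1), "% effluent")
# Hypothetical: 4 % effluent at the edge of the chronic mixing zone
print("chronic COC:", chronic_coc(4.0), "% effluent")
```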
TYPES OF WET TESTS
• COC and control
• Multiple concentrations and control
WET TESTS WITH MULTIPLE CONCENTRATIONS
• Recommended design for discharge monitoring
• Usually includes small number of replicates
• Focus more on concentration-response relationship
• Dilutions center on COC
• EPA recommends dilution factor > 0.5
• Maximize dilution factor with endpoint uncertainty and inter-test variability in mind
WET TESTING ONLY THE COC
• Design for ambient and some discharge monitoring
• Little flexibility in test design
• Increase number of replicates and/or organisms to increase confidence in results
• Information on concentration/response relationship not available and not considered
WET TESTS & WATER QUALITY PARAMETERS
• Important that parameters match goals of testing, either:
– instream condition of discharge upon dilution, or
– inherent toxicity of discharge independent of instream condition
WET TEST WATER QUALITY PARAMETERS
• Most common parameters of concern
– hardness
– salinity
– pH
– temperature
– conductivity
• Test design solution: extra controls
EXAMPLE OF ADDITIONAL CONTROL TO ADDRESS HARDNESS
• Example goal: test instream condition of discharge after dilution
• Daphnids cultured at 120 ppm
• Discharge and receiving water are at 300 ppm
• Prepare extra controls at 300 ppm hardness and compare results with dilutions tested
WET TEST DESIGN AND TYPICAL TOXICANTS
• The toxicant(s) suspected determine if and which test conditions are important
• Good example is ammonia:
– pH affects ammonia toxicity
– pH is not strictly limited by the methods
– pH drift beyond realistic levels may bring unionized ammonia to unrealistic levels
• Test design solution: use pH control in WET tests
WET TEST DESIGN & DILUTION/CONTROL WATER
• Depends on test goals
• Instream mixed discharge condition
– use of water upstream from discharge preferred
– second choice is water similar to upstream
– the more culture and dilution water differ, the more important acclimation prior to testing becomes
WET TESTING FREQUENCY
• Dependent on variability in condition (instream or discharge)
• As variability increases, frequency should increase
• Balance variability and frequency of testing with cost
• Goal is to accurately represent the condition in question
WET TEST DESIGN & STAGE OF TESTING
• Species sensitivity varies with biological endpoints and test conditions
• Frequency of testing and number of endpoints tested can decrease as data set increases
WET TEST DESIGN & STATISTICS
• Statistical approach used to analyze results affects test design and usually is permit-defined
• Point estimates benefit from fewer replicates but more treatments
• Hypothesis testing benefits from greater numbers of replicates but the number of treatments minimally affects results
WET TEST DESIGN SUMMARY
• Focus on condition to be tested and question being asked
• Ensure test parameters are representative of condition being tested
• Testing frequency is driven by temporal variability in condition
• Design tests to meet requirements of statistical approaches to be used
Ambient Water Testing:
Experimental Design and Data Analysis
SETAC Expert Advisory Panel Performance Evaluation and Data Interpretation
AMBIENT TOXICITY TESTING
OBJECTIVES OF AMBIENT TOXICITY TESTING
• Objectives vary
– General assessment of water quality in streams, rivers, bays, ocean
• Determine whether water body should receive more focused assessment
• Assess whether water body or segment thereof should be placed on or taken off the CWA 303(d) list of impaired waterways
• Ascertain source of water contamination
OBJECTIVES OF AMBIENT TOXICITY TESTING - Cont.
• Compare results of effluent toxicity tests with receiving water tests
• In conjunction with TIEs, and associated chemical analysis, identify the cause(s) of contamination
• Assess the success of remediation efforts
• Determine compliance with water quality standard for toxicity
INFORMATION PROVIDED BY AMBIENT TOXICITY TESTING
• Toxicity testing procedures with TIEs and chemical analyses have been used effectively to identify the chemical causes and sources of water quality contamination.
• When applied in conjunction with carefully designed sampling regimes (e.g., site selection and timing of collection) these procedures can describe:
– Magnitude of toxicity
– Temporal extent (duration and frequency)
– Spatial/geographic distribution
– Land use practices responsible for toxicity
STRENGTHS OF SINGLE SPECIES TESTS
• An integrative measure of aggregate, additive toxicity
• Provide a direct measure of toxicity and bioavailability
• In combination with TIEs, they can identify chemical cause(s) of toxicity
• Measure toxicological responses to chemicals for which there are no chemical specific water quality standards
STRENGTHS OF SINGLE SPECIES TESTS - Cont.
• Reliable predictors of instream impacts
• Afford reliable, repeatable, and comparable results compared to other types of biological and chemical tests
• Furnish an early warning signal so that actions can be taken to minimize ecosystem impacts from toxic chemicals
• Can be performed quickly and inexpensively compared to other biological monitoring procedures
LIMITATIONS OF SINGLE SPECIES TESTS
• Do not characterize the persistence/duration or frequency of exposures in ambient waters without repeated sampling and testing
• Do not directly measure biotic community responses
• Do not encompass the range of species, sensitivities, or functions (endpoints) responsive to toxic chemicals which occur in biological communities
LIMITATIONS OF SINGLE SPECIES TESTS - Cont.
• Do not measure delayed impacts nor effects due to bioaccumulation or bioconcentration, mutagenicity, carcinogenicity, teratogenicity, and enrichment.
• Laboratory tests do not reflect the multivariate and complex exposure conditions which exist in many aquatic ecosystems
• Results may underestimate biotic community responses to chemicals because of multiple stressors acting on aquatic ecosystems
• Use of surrogate species may not represent toxicological sensitivities in some aquatic ecosystems
AMBIENT TESTING METHODS
• Usually U.S. EPA marine or freshwater methods
• Other (e.g., ASTM) protocols or indigenous species tests are sometimes used
DEVIATIONS FROM U.S. EPA EFFLUENT TESTING PROCEDURES
• Ambient water testing follows U.S. EPA protocols for testing effluents with a few exceptions
• A dilution series usually is not included in testing until TIEs are performed on toxic samples
• Water renewals may be from a single sample
• Number of control replicates may be increased
• Tests are conducted in glass or Teflon containers
“TIERED” APPROACH TO AMBIENT TESTING
• Initial surveys intended to characterize watershed or waterbody sites over several years or hydrologic cycles - sampling may be monthly
• Focused follow-up studies may include:– Increased number of sites and frequency of sampling
– TIEs conducted
– Evaluation monitoring to assess toxicity reduction/remediation efforts
EXPERIMENTAL DESIGN
• Centers around selection of:
– Surface waterbody or segment(s) thereof to be monitored
– Number and location of sampling sites
– Sample type
– Timing/period and frequency of sampling
FACTORS TO CONSIDER WHEN SELECTING SAMPLING SITES
• Significant source of flow or loads into the watershed?
• Representative type of drainage (agriculture, urban, mining, etc.)?
• Receives runoff from particular land use?
• Predicted or suspected toxicity?
• “Integrator” site indicative of inputs and/or of waterway (e.g., near mouth of river)
• Previously identified toxicity?
• Critical or sensitive habitat?
TYPE OF SAMPLE
• Composite collected over various time periods
• Sub-surface grab sample
SELECTING PERIOD AND FREQUENCY OF SAMPLING
• Selecting sampling period depends on objectives of investigation
• Selecting sampling frequency relates to defining duration and frequency of toxic events
DATA ANALYSIS
• EPA recommends t-tests to compare laboratory control to single ambient water sample
• ANOVA and Dunnett’s multiple comparison are appropriate for multiple sites/samples
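For the single-sample case, the comparison with the laboratory control can be sketched as a one-tailed, equal-variance two-sample t-test. The sketch assumes that normality and equal variance have already been checked; the growth data and function name are hypothetical. The critical value 1.943 is the standard tabled value for a one-tailed test at alpha = 0.05 with 6 degrees of freedom.

```python
import math
from statistics import mean, variance


def pooled_t_statistic(control, sample):
    """Equal-variance two-sample t statistic; positive when the control mean is higher."""
    n1, n2 = len(control), len(sample)
    pooled_var = ((n1 - 1) * variance(control) + (n2 - 1) * variance(sample)) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(control) - mean(sample)) / standard_error


# Hypothetical mean dry weight per replicate (mg), four replicates each
lab_control = [0.62, 0.65, 0.60, 0.63]
ambient_site = [0.48, 0.52, 0.45, 0.50]

T_CRITICAL = 1.943  # one-tailed, alpha = 0.05, df = 4 + 4 - 2 = 6

t_stat = pooled_t_statistic(lab_control, ambient_site)
significant_reduction = t_stat > T_CRITICAL
print(f"t = {t_stat:.2f}, significant reduction: {significant_reduction}")
```

When several sites or samples are compared with one control at the same time, the multiple comparison procedures named above are the appropriate tool rather than repeated unadjusted t-tests.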
ECOLOGICAL RELEVANCE QUESTION
• Are the results of the U.S. EPA tests, or other single species tests, reliable predictors of biotic community responses/impacts?
TWO REVIEWS OF ECOLOGICAL RELEVANCE ISSUE
• Waller W.T., et al. 1996.
• de Vlaming V, Norberg-King T.J. 1999.
ENCAPSULATED CONCLUSIONS OF REVIEWS
• SETAC Panel - “It is unmistakable and clear that when U.S. EPA toxicity test procedures are used properly, they are reliable predictors of environmental impact provided that the duration and magnitude of exposure are sufficient to resident biota.” and “a strong predictive relationship exists between ambient toxicity and ecological impact.”
ENCAPSULATED CONCLUSIONS OF REVIEWS - Cont.
• de Vlaming and Norberg-King - The U.S. EPA, and other single species toxicity test results are, in a majority of cases, reliable qualitative predictors of responses in aquatic ecosystem populations.
DE VLAMING AND NORBERG-KING SUMMARY
• Available literature yields a weight of evidence demonstration that WET, and other indicator species, toxicity test results are reliable qualitative predictors of biotic responses.
• There are no empirical data which demonstrate that the indicator species results consistently fail to provide reliable predictions of instream biological responses.
DE VLAMING AND NORBERG-KING SUMMARY - Cont.
• When toxicity test results fail to provide a reliable prediction, they more frequently underestimate instream biological responses.
• Lab toxicity test results do not tend to overestimate bioavailability of chemicals.
• Reliability with which toxicity test results predict instream biological responses increases when tests are performed on ambient waters and with magnitude of toxicity.
• Reliability with which toxicity test results predict instream biological responses increases with characterization of persistence and frequency of toxicity.
• Reliability with which toxicity test results predict instream biological responses increases with effective matching (or accounting for) of lab and field exposures.
TIE/TRE TEST DESIGN
TIE/TRE GOAL
• To identify, confirm and remove toxicant(s) in order to bring effluent into compliance with water quality standards
• Test design is dependent on the phase of the TIE and the magnitude/variability of toxicity present
• As toxicity decreases, number of replicates and identification/confirmation trials may need to increase
TEST DESIGN AND PHASE I TIE
• Use species that were used in testing which suggests toxicity
• Many sample manipulations
• Minimum number of replicates/treatment
• Primarily analyze with hypothesis testing and BPJ
• Test at 100% concentration or concentration providing significant response compared to controls
• Minimum QA/QC
TEST DESIGN AND PHASE III TIE
• May use more than one species to compare sensitivities in supporting hypothesis
• Few sample manipulations
• Number of replicates and treatments similar to normal tests
• May use hypothesis or point estimate statistical approaches - depends on permit
• Usually test at multiple concentrations to support point estimates and to capture concentration-response relationships
• Standard QA/QC
OTHER TIE/TRE TEST DESIGN ISSUES
• Flexibility
• Temporal variability within and between samples
• Screening
• Dilution water
• Controls for manipulations
• QA/QC
FLEXIBILITY
• Be creative
• Do not be constrained by required methods
• Consider toxicology in test design and interpretation
– rate of action
– changes with organism age or development
• Consider magnitude of toxicity for chronic TIEs - can you use acute tests?
REFERENCE TEST APPROACH
FLUORIDE LC50S FOR EFFLUENT AND LAB WATER
Age (days)   Series #1    Series #2    Series #3
2            7.8, 4.7     7.1, 4.4     8.0, 5.0
4            11.0, 6.8    11.7, 6.8    9.5, 7.3
6            16.3, 9.3    17.6, 8.0    18.6, 9.2
TEST DESIGN & TEMPORAL VARIABILITY
• Variability can occur within and between samples, as well as between toxicant(s), over time
• As toxicity persistence within samples decreases, the need for renewals may increase
• As temporal variability in toxicant identity and magnitude of toxicity increases, the number of trials increases
TIE/TRE TEST DESIGN AND SCREENING
• Only possible if screen can be a reliable predictor of toxicity in definitive test
• Utility of screens impacted when toxicity is not persistent
• Good idea when toxicity is unpredictable between samples - saves resources
• Difficult for chronic TIE/TREs
TIE/TRE TEST DESIGN AND DILUTION WATER
• Should use same dilution water as that in tests which originally suggested toxicity
• Advisable to test another dilution water to see if it impacts test results
• Dilution water may influence toxicity and TIE interpretation
• Differences may be biological, chemical or physical
TIE/TRE TEST DESIGN AND ADDITIONAL CONTROLS
• Phase I includes numerous manipulations of tested sample
• Manipulations may cause toxicity independent of samples
• Be wary of chemical additions which oxidize or reduce (examples will be provided)
• Solution: treat control water in same fashion as sample and add to test as another control
TIE/TRE TEST DESIGN SUMMARY
• Design changes with stage of study
• Focus resources on issues specific to each stage of study
• Maintain flexibility and creativity
• Avoid false conclusions with multiple controls and checks
• Expertise