Case Studies
of Change-of-Support Problems
L. BERTINO & H. WACKERNAGEL
IMPACT Project Report No 20 (Contract IST-1999-11313)
December 2002
Technical Report N-21/02/G, ENSMP - ARMINES, Centre de Géostatistique
35 rue Saint Honoré, F-77305 Fontainebleau, France
http://cg.ensmp.fr
Contents
Summary
I Hamburg harbor oxygen deficit
1 Oxygen deficit problem and data
1.1 Oxygen deficit
1.2 Goals of the case study
1.3 The Elbe monitoring network
1.4 Prediction strategy: using past upstream observations
1.5 A simplified linear model
2 Estimation of transport times
2.1 Definition of a transformed time coordinate
2.2 Analysis of conductivity time series
2.3 Accounting for mixing
2.4 An empirical predictor
3 Oxygen losses from Cumlosen to Seemannshoeft
3.1 Identifying a covariate
3.2 Conditioning to discharge
3.3 Conditioning to temperature
4 Multivariate analysis using both helicopter and station data
4.1 Intersections of the station and helicopter data
4.2 Preliminary analysis: variables selection
4.3 Predicting the location of the oxygen minimum
4.4 Analysis of the raw variables
4.5 Defining a seasonal normal
4.6 Analysis of the residuals
5 Conclusion
Acknowledgement
II Downscaling Paris area ozone pollution
6 Points, blocks, panels
6.1 The point-block-panel problem
6.2 Uniform conditioning
7 Uniform conditioning by external drift kriging
7.1 Change of support model
7.2 Global statistics for blocks and panels
7.3 Uniform conditioning
8 Conclusion
Acknowledgements
III Exploring a decade of hourly Helsinki ozone data
9 Seasonal behavior
9.1 Data presentation
9.2 Exploring seasonality
10 Decadal trends
11 Daily cycle
12 Conclusion
Acknowledgements
Bibliography
Summary
This report¹ is Deliverable 20 of the EC-funded IMPACT IST project on Estimation of Human
Impact in Presence of Natural Fluctuations². It is subdivided into three parts, each presenting a
different case study undertaken during the last year of the IMPACT project.
Part I examines the periodic oxygen depletion that has been observed in Hamburg harbor
every summer for almost twenty years. This work was performed in collaboration with the GKSS
Forschungszentrum (Geesthacht, Germany).
Part II treats a downscaling problem based on a change-of-support model: using point station
data and the output of a transport model on 6 × 6 km² cells, evaluate the proportion of 1 × 1 km²
surface units that are above prescribed ozone exposure levels. This study was prepared after
discussions with the air pollution group of the LMD and the Airparif association (Paris, France).
The study will be given in a more detailed form, with computational explanations, in Deliverable 4
to serve as an example of application of geostatistical techniques.
Part III presents an exploratory data analysis of eleven years of hourly data of ozone and nitrogen
oxide/dioxide at three stations, as well as of meteorological variables from an airport station. This work
was performed in collaboration with the Finnish Meteorological Institute (Helsinki, Finland).
In view of Deliverable 4 about software use, we already gather in this report some details about
computation, and in particular about the demonstrative use of statistical (S-Plus, R) and geostatistical
(Isatis) software. The close contact we had with end users during the IMPACT project, mainly in the
French air pollution community, has taught us that the community is in great need of practical case
studies showing how to analyze and model air pollution data to quantify and understand the impact of
natural and human factors.
¹ The report is available as a pdf file and is best viewed with the Acrobat Reader, which allows one to take full advantage of the internal and external links.
² See the website: http://www.mai.liu.se/impact/.
Part I
Hamburg harbor oxygen deficit
Abstract
The Hamburg harbor oxygen deficit study is subdivided into five chapters:
Chapter 1 presents the objectives and the available station and helicopter data.
Chapter 2 deals with the estimation of transport times.
Chapter 3 examines the oxygen losses between an upstream and a downstream station.
Chapter 4 sets up a multivariate analysis between station data and helicopter profiles.
Chapter 5 draws conclusions and perspectives.
Chapter 1
Oxygen deficit problem and data
In this chapter the oxygen depletion problem and the goals of the study are first discussed. The two
automatic measurement stations to be used and the helicopter profiles are then presented in detail.
1.1 Oxygen deficit
A periodic oxygen depletion has been observed in Hamburg harbor every summer for almost twenty
years. Even though the deficit has been less dramatic since 1990, low oxygen concentrations around
3 mg/l are still often observed (see for example a summer helicopter profile in Figure 1.1). The oxygen
deficit has dramatic ecological consequences; in particular, such low concentrations do not allow
migrating fish to reach the upstream part of the river.
The oxygen deficit problem. The oxygen balance is controlled by several processes (atmospheric
aeration, primary production, biochemical oxygen demand (BOD) from both the water and the sedi-
ment phase and transport processes). The oxygen deficit has been interpreted [17] as the result of the
decay of algae in the deep waters of Hamburg harbor, and a zero-dimensional model (WAMPUM) has
shed some light on the difficulties of ecological modeling of the river and has shown that the oxygen
balance is very sensitive to meteorological (light and wind conditions) and biological parameters (algal
growth, respiration and loss rates). Detailed discussions of the modeling of different processes have
been given in [2], and the biomass balance has been described by [1], showing substantial losses during
the transport. Physical modeling of transport in the estuarine part of the Elbe has been performed
by the German Federal Waterways Engineering and Research Institute (BAW-AK, Hamburg) using
the TRIM hydrodynamical model. It should be noted, however, that most of the ecological modeling
of the Elbe has been performed without including transport terms, i.e. under lake-like hypotheses.
The Elbe is one of the best observed rivers in Europe: continuous measurements at fixed stations
provide temporally dense information [13], while helicopter profiles (6 per year: typically in February,
May, June, July, August and November) give information at high spatial resolution. In [2] changes in
the oxygen deficit have been related to changes in the nutrient loads in the Elbe river. The weekly
nutrient measurements at 20 fixed stations along the Elbe have been used for process identification
using Principal Component Analysis and Graphical Gaussian models [12]. Empirical computation of
transport times from cross-covariance analysis has been performed by [3].
Figure 1.1: Measurements from the helicopter profile of 4 July 2001. The helicopter flies from the North Sea (left) to Geesthacht weir (right). The oxygen deficit in Hamburg harbor is clearly visible between km 630 and km 650, where the critical threshold of 3 mg/l is reached.
It appears that the oxygen deficit is dependent on both human and natural factors. The dependence
on human activities is explained by at least two factors: the main one is the nutrient load, mostly
coming from urban, agricultural and industrial activities. Further, the Hamburg port authorities contribute
significantly to an increased water turbidity - thereby enhancing the processes responsible for the
oxygen deficit - by dredging the river banks, which entails a lowering of water temperatures.
Natural factors are the light conditions, the water temperature and the river discharge.
1.2 Goals of the case study
In close collaboration with the GKSS group we have set up this study of the estuarine part of the
river Elbe over the last ten years, after dramatic changes of biological regime due to the German
reunification that indirectly caused a reduction of nutrient loads of the river. The domain limits are
from its upstream boundary at weir Geesthacht to the North Sea through the city of Hamburg. With
the aim to assist a proper description of the phenomena responsible for the oxygen deficit in Hamburg
and to quantify the probability that the oxygen concentrations get below a critical level, geostatistical
tools can be applied to address the following questions:
1. Knowing the transport time from the part of the river where the algae bloom to Hamburg harbor
benefits the modeling of all biogeochemical processes, because it determines the final time of the
reactions. The transport time is related to the river discharge; transport of particulate matter is
slower than that of dissolved matter and is also subject to the influence of tides. The transport
times are usually computed using a simplistic model of successive unmixed water sections under
the assumption that the river level is constant. In case this way of computing transport times is
found unsatisfactory, this information can alternatively be accessed by cross-covariance analysis
of the time series of different variables measured in stations Wehr Geesthacht (upstream) and
Seemannshoeft (in Hamburg harbor). The main difficulty in this task is that the numerous
biogeochemical processes occurring in the river partially mask the signature of the transport
phenomenon.
2. Multivariate linear statistics can be used to link the characteristics of the oxygen deficit (its posi-
tion and the value of the minimum oxygen concentration) to variables expressing (1) the natural
variations of the water characteristics (light, temperature, discharge) and (2) variables reflecting
human activities (nutrients). This provides a basis for normalizing the oxygen concentration
data and for evaluating the impact of human activities on the oxygen deficit phenomenon.
3. The relative share of different processes likely to cause the oxygen deficit might vary in time,
which makes the calibration of models a difficult task. A thorough geostatistical analysis of
spatio-temporal multivariate data like helicopter profiles and measurements at fixed stations
can provide insight on unexpected and complex relationships between the numerous biologi-
cal, chemical and physical variables. Although these relationships are often nonlinear (see
for example the scatterplots of dissolved oxygen versus river discharge in Figure 3.4), linear
multivariate geostatistics can be used to exhibit the main relationships and to compute their
spatial and temporal scales. Nonlinear geostatistics could then be applied to explore problems
of support normalization and threshold characterization.
Only the first two topics are detailed further in the present report. The resolution of the first problem
provides a change-of-coordinates model [3] that has direct consequences on the second. Empirical
short-term predictions of the Elbe oxygen deficit, using only observations and no model, are studied
thereafter.
1.3 The Elbe monitoring network
Three different types of data are collected along the German part of the Elbe: the first type are the
weekly manual measurements taken at 20 locations measuring ordinary physical and chemical param-
eters (temperature, conductivity, pH, oxygen concentration) and biological parameters such as nutrient
concentrations (phosphate, nitrate, nitrite and ammonium). An analysis of these data from 1993 to
1997 can be found in [3]. Here we focus on the two other types of data: the continuous records of
automatic stations and the helicopter profiles, keeping in mind that the latter have a much higher cost,
related to hiring the helicopter and to laboratory work.
Figure 1.2: Map of the middle Elbe with the station Cumlosen. The physical boundary of the middle Elbe is at Geesthacht weir. The flow comes from the South East to the North West.
Automatic measurement stations
Two stations have been selected because of their location: Cumlosen and Seemannshoeft.
Upstream data: the station Cumlosen
The Cumlosen station is located in the region of maximum primary production, at the former East-West
German borderline and downstream from the junction of the Elbe with its main tributaries (Havel and
Saale). See the map in Figure 1.2.
The data considered are the 5 years from 1997 to 2001 included. These years were interesting
because Chlorophyll-a data were not available before. This station will be denoted below as the up-
stream station, at position x1. The Cumlosen data have been preferred to the Weir Geesthacht station
data, being more representative of the region of primary production. However it should be noted that
depending on the light and discharge conditions, the maximum of primary production activity can be
located at varying locations between Cumlosen and Weir Geesthacht. A separate study could account
for this. The sensors are located at the water surface. The parameters automatically recorded are the
following: oxygen, pH, water temperature, conductivity, global solar radiation and Chlorophyll-a con-
centrations. A plot of the original time series can be found at the end of the chapter in Figure 1.5, p16
(some outliers have been removed from the dataset). The Chlorophyll-a is a measure of the primary
production activity in the Elbe. The pH and dissolved oxygen concentration are also related to
primary production and should react to the solar irradiation and the water temperature. The conductivity
is indifferent to primary production and varies inversely with the river discharge.
Some problems have been encountered with these data, apart from missing or outlying values. The
oxygen concentrations before 1 January 2001 are capped at 15 mg/l: all higher oxygen
concentrations are recorded as 15 mg/l, so that 166 hourly measurements (among 43848) are
underestimated. As the oxygen concentrations at Cumlosen exhibit a strong daily cycle, a way to
correct this sampling problem could be local spectral fitting, since the lower half of every cycle is
correctly sampled.
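As an illustration of how such capped values can be located before any correction is attempted, here is a minimal Python sketch (not the S-Plus code of the study). The function name `count_censored`, the tolerance and the toy series are assumptions made for the example; only the 15 mg/l ceiling comes from the text.

```python
def count_censored(values, ceiling=15.0, tol=1e-9):
    """Count measurements sitting at the sensor ceiling; None marks a missing value."""
    return sum(1 for v in values if v is not None and abs(v - ceiling) <= tol)


# Toy hourly oxygen series in mg/l (hypothetical values, not the Cumlosen data).
oxygen = [11.2, 13.8, 15.0, 15.0, 14.1, None, 15.0, 12.7]
n_censored = count_censored(oxygen)
print(n_censored, "of", len(oxygen), "values sit at the 15 mg/l ceiling")
```

A value recorded as exactly 15 mg/l may be either a genuine observation or a capped one, so such a count is only an upper bound on the number of underestimated measurements.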
Hamburg data: the station Seemannshoeft
The Seemannshoeft station is located at Hamburg harbor, approximately where the oxygen deficit
starts. See the map in Figure 1.3. The Seemannshoeft data kept for the study were the 5 years
from 1996 to 2000 inclusive. The year 2001 was not delivered by the Arge Elbe, so that most of the
work has been done with only the 4 years of data common to both stations, from 1997 to 2000 inclusive.
This estuarine station location will be denoted x2 hereafter. The parameters are the same as in
Cumlosen, with the exception of Chlorophyll-a and solar radiation, which are not measured. The water
levels at Seemannshoeft or at nearby stations (Blankenese or Bunthaus) are recorded. They vary
according to the tides, to storms in the North Sea or to the Elbe discharge. The time series are plotted
at the end of the chapter in Figure 1.6, p17.
The pH time series has been corrected: the values drifting to 9.6 have been removed from the data.
The water level (Pegel) variable is more complex than the others since, from one year to another,
the measurements given do not all come from the same station, as below:
Year  Station 1      Station 2
1996  Seemannshoeft  Blankenese
1997  Seemannshoeft
1998  Seemannshoeft
1999  Kattwich       Blankenese
2000  Blankenese     Kattwich
The selected station is the one listed first (Station 1). In the year 2000, the scatterplot between
Kattwich and Blankenese looks strange, as if the variable names were inverted. Therefore we inverted
the selection for 2000.
The main difference between the water levels at the two locations is an additive constant. As a
consequence, we subtracted the data median in each year to put all water levels on the same baseline.
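The yearly median subtraction can be sketched as follows in Python (not the S-Plus code of the study). The function name `center_by_year` and the toy water levels are hypothetical; missing values are represented by `None`.

```python
from statistics import median


def center_by_year(years, levels):
    """Subtract each year's median water level; missing values (None) stay None."""
    by_year = {}
    for y, v in zip(years, levels):
        if v is not None:
            by_year.setdefault(y, []).append(v)
    medians = {y: median(vs) for y, vs in by_year.items()}
    return [None if v is None else v - medians[y] for y, v in zip(years, levels)]


# Toy example: two years recorded at stations with different reference levels.
years = [1999, 1999, 1999, 2000, 2000, 2000]
levels = [510.0, 530.0, 520.0, 110.0, None, 130.0]
centered = center_by_year(years, levels)
print(centered)
```

The median is preferred here to the mean in the same spirit as the text: it is insensitive to the few extreme levels caused by storms.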
Details of the Splus database. The original hourly time series are stored in the Cuml and Seem
data frames under Splus; the frame columns are the different variables and two time coordinates.
Datum is the decimal day coordinate inherited from the date conversion to numeric in Excel. A more
convenient time coordinate is defined as the decimal year, Year
Figure 1.3: Map of the Elbe estuary with the station Seemannshoeft as the green point at kilometer 628.8. The Elbe estuary starts in principle from Geesthacht weir, but actual mixing with marine waters most often occurs a little downstream from Hamburg. The Elbe river flows from the South East to the North Sea.
december 2001 were missing in the original Excel files so that the values in the file were declared as
missing values.
A particular data problem arose with the Cumlosen solar irradiation data (Glob), since all night
measurements showed no value instead of zeros. The problem in Splus was to convert all these NA
to zero radiation without turning genuine missing values into no-light values, which would bias the
whole radiation dataset. The problem was solved by extracting the dawn and dusk times from the
Glob dataset when sufficient data are available during the day, and by setting to zero all NA before
dawn and after dusk. A few singular situations were tuned by hand. Moreover, the solar
radiation values from number 10000 to 11350 (in 1997) were 10 times lower than all other radiation
values; multiplying them by 10 made them consistent with the other periods.
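The night-time filling rule described above can be sketched for a single day as follows, in Python rather than in the S-Plus of the study. The function name `fill_night_zeros`, the threshold `min_obs` for "sufficient data" and the toy values are assumptions; here dawn and dusk are simply taken as the first and last hours of the day with a recorded value.

```python
def fill_night_zeros(day, min_obs=6):
    """day: 24 hourly radiation values for one day, None meaning no value recorded."""
    observed = [h for h, v in enumerate(day) if v is not None]
    if len(observed) < min_obs:
        return list(day)  # too few daytime values: leave the day untouched
    dawn, dusk = observed[0], observed[-1]
    # Missing values before dawn or after dusk are night values (zero radiation);
    # missing values between dawn and dusk remain genuine gaps.
    return [0.0 if v is None and (h < dawn or h > dusk) else v
            for h, v in enumerate(day)]


# Toy day: no value at night, one genuine gap at 08:00 (hypothetical numbers).
day = [None] * 6 + [10.0, 50.0, None, 200.0, 300.0, 250.0, 120.0, 40.0, 5.0] + [None] * 9
filled = fill_night_zeros(day)
print(filled)
```

Days with too few observations are returned unchanged, which corresponds to the situations that the text says had to be handled separately.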
Discharge measurements
The river discharge is measured every day from 1985 to 2001 at station Neu Darchau (see Figure 1.2).
As the Elbe has few tributaries downstream from Neu Darchau, this measure is expected to be relevant
all along the river down to the estuary, until the tide influences the runoff.
Splus database. The original daily discharge measurement data have been stored in the data frame
Discharge. As the discharge varies slowly from one day to another, it has been linearly interpolated to an
hourly support and copied to the data frames Seem and Cuml as the variable Q. In the Discharge
data frame, the Year time coordinate was set exactly equal to the Year above; the transformation
from the Excel numeric time coordinate Datum is:
Discharge$Year
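The linear interpolation of the daily discharge to an hourly support mentioned above can be sketched in Python as follows (this is not the S-Plus code of the study). The function name `daily_to_hourly` and the toy discharge values are hypothetical, and each daily value is assumed to be attached to hour 0 of its day.

```python
def daily_to_hourly(daily):
    """Linearly interpolate a non-empty list of daily values to hourly values."""
    hourly = []
    for q0, q1 in zip(daily, daily[1:]):
        # 24 hourly steps from one daily value towards the next
        hourly.extend(q0 + (q1 - q0) * h / 24 for h in range(24))
    hourly.append(daily[-1])
    return hourly


# Toy daily discharge in m3/s (hypothetical values).
discharge = [700.0, 724.0, 712.0]
q_hourly = daily_to_hourly(discharge)
print(len(q_hourly), q_hourly[1], q_hourly[36])
```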
and suspended matter. Each of the tables contains the measurements from the 31 helicopter profiles
at 28 locations; the curvilinear coordinate of the measurement location (starting from the Elbe source:
585.5 km is Geesthacht weir and 757 km is the North Sea) is reported in all tables as a profile zero.
The numerical date was created by extracting integers from the character date with the command
subscript(), and is defined from 1996 to 28/02/2000 as below:
day + 365*year + 26 + Year.cum[month] ,
where Year.cum[i] is the cumulated number of days in a year before month i (0, 31, 59, etc.).
From 29 February 2000 to the end of 2001, we add 27 days instead of 26.
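The date formula above can be written as a small Python function for checking individual flight dates (a sketch, not the S-Plus code of the study). It assumes that `year` is counted from 1900 (1997 gives 97, 2000 gives 100), which is an assumption suggested by the discharge time axis of Figure 1.6 but not stated in the text; the names `YEAR_CUM` and `numeric_date` are hypothetical.

```python
# Cumulated number of days before each month in a non-leap year (index 0 = January).
YEAR_CUM = [0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334]


def numeric_date(day, month, year):
    """Numeric date of the text; year is counted from 1900 (assumption)."""
    # 26 up to 28/02/2000 and 27 from 29/02/2000 onwards, as stated in the text.
    offset = 27 if (year, month, day) >= (100, 2, 29) else 26
    return day + 365 * year + offset + YEAR_CUM[month - 1]


# Two consecutive helicopter flights of the year 2000 (dates from Figure 1.7).
d1 = numeric_date(22, 2, 100)  # 22.02.2000
d2 = numeric_date(8, 5, 100)   # 08.05.2000
print(d1, d2, d2 - d1)
```

With these conventions the two flights are 76 days apart, which agrees with the calendar since 2000 is a leap year.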
1.4 Prediction strategy: using past upstream observations
The experimental variograms of the oxygen concentration time series (Figure 1.4) show that the
estimation of future oxygen concentrations by linear combinations of the past ones is not a reasonable
task. Indeed, the average temporal range of both variograms (upstream and estuary) is too short: only
15 days, which means that present oxygen concentrations are uncorrelated with the ones
that were observed 15 days or more before.
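For readers unfamiliar with the tool, the experimental temporal variogram of a regularly sampled series is half the mean squared increment at each lag; the range is the lag beyond which it stops increasing. A minimal Python sketch follows (the variograms of Figure 1.4 were not computed with this code; the function name and the toy series are hypothetical).

```python
def temporal_variogram(z, max_lag):
    """Experimental variogram gamma(h) = mean((z[t+h] - z[t])**2) / 2 for h = 1..max_lag.

    z is a regularly sampled series; None marks a missing value.
    """
    gamma = []
    for h in range(1, max_lag + 1):
        sq = [(b - a) ** 2 for a, b in zip(z, z[h:])
              if a is not None and b is not None]
        gamma.append(sum(sq) / (2 * len(sq)) if sq else None)
    return gamma


# Toy periodic series: the variogram oscillates with the period of the signal.
gamma = temporal_variogram([0, 1, 0, 1, 0, 1], max_lag=3)
print(gamma)
```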
However, useful empirical predictions can be produced by multivariate statistics using upstream
information. The transport times from the upstream sampling site to the common location of the
oxygen minimum depend on the river runoff: they can take a few days in case of high runoff, or more
than two weeks in case of low runoff [2]. As the oxygen deficit can be explained by the biological
interactions in the Elbe water [17], it should be possible to correlate the oxygen concentration in a
water body in the estuary with upstream covariates observed at the time the same water body passed
the upstream measurement station. A prediction could then be produced, with the transport time
from upstream to downstream as its lead time.
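The idea of matching a water body in the estuary with the time it passed the upstream station can be illustrated by a lagged cross-covariance: the lag at which the downstream series covaries most strongly with the upstream one is a crude estimate of the transport time. The Python sketch below works in plain time steps on two series of equal length without missing values, which are simplifying assumptions; the function name `best_lag` and the toy data are hypothetical.

```python
def best_lag(upstream, downstream, max_lag):
    """Return (lag, covariances): the lag in 0..max_lag maximizing
    cov(upstream[t], downstream[t + lag]), and all the covariances."""
    def cov(x, y):
        n = len(x)
        mx, my = sum(x) / n, sum(y) / n
        return sum((a - mx) * (b - my) for a, b in zip(x, y)) / n

    covs = []
    for lag in range(max_lag + 1):
        x = upstream[:len(upstream) - lag]  # upstream values at time t
        y = downstream[lag:]                # downstream values at time t + lag
        covs.append(cov(x, y))
    return max(range(max_lag + 1), key=lambda k: covs[k]), covs


# Toy signal: the downstream series is the upstream one delayed by 3 steps.
up = [0, 0, 1, 3, 1, 0, 0, 0, 0, 0]
down = [0, 0, 0, 0, 0, 1, 3, 1, 0, 0]
lag, covs = best_lag(up, down, max_lag=5)
print(lag)
```

On real data the transport time varies with the discharge, so a single lag over a whole year is only a first approximation.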
1.5 A simplified linear model
We assume that the oxygen losses from upstream to the estuary depend linearly on some upstream
physical and biological parameters, after a time delay $\tau$ reflecting transport times. We therefore
write the following linear regression:
$$O_2(x_2, t+\tau) = O_2(x_1, t) + \sum_i \alpha_i Z_i(x_1, t),$$
where the upstream variables $Z_i$ are, up to a possible transformation, Chlorophyll-a, water temperature,
solar radiation, water pH, water conductivity and Elbe runoff. The regression parameters $\alpha_i$ are
unknown and can be computed via a Principal Component Analysis.
If we assume some reactor-like dynamics¹, an unknown multiplicative parameter can also be set
for the upstream oxygen measurements, but we will see hereafter that this might not be necessary.
¹ Following a remark by Renata Romanowicz (Lancaster).
Figure 1.4: Temporal variograms of oxygen concentration time series upstream and in the estuary, computed on whole years (variogram against temporal distance, from 0 to 60 days). The average temporal range is no longer than 15 days. In the estuary station, the seasonal normal was removed from the data to make the variogram stationary; this might explain the presence of the hole effect in the estuary variogram at around 45 days.
Accounting for mixing, the water body observed in the estuary is made of particles that have
passed the upstream station at different times, thus the previous equation becomes:
$$O_2(x_2, t+\tau) = \int \Big[ O_2(x_1, t+\delta t) + \sum_i \alpha_i Z_i(x_1, t+\delta t) \Big] f_t(\delta t)\, d\delta t,$$
where $\delta t$ is distributed along the density $f_t$, which may also vary in time (according to the various
hydrodynamical parameters, mainly river discharge). A simpler presentation² considers the first equation
and makes the transport time a random variable. Here we will first estimate the mean transport time
in an adequate time coordinate and suppose that the distribution $f_t$ is uniform on a given interval.
² Following a remark by Julien Sénégas (Fontainebleau).
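When the mixing density is taken as uniform on an interval, the integral reduces to a plain average of the upstream quantity over a time window. The Python sketch below illustrates this on a discrete series; the function name `mixed_upstream`, the symmetric window of half-width `half_width` time steps and the truncation of the window at the ends of the series are all assumptions made for the example.

```python
def mixed_upstream(series, t, half_width):
    """Average of series[t + dt] for dt uniform on -half_width..half_width.

    series holds the upstream quantity (oxygen plus weighted covariates)
    on a regular time grid; the window is truncated at the series ends.
    """
    lo = max(0, t - half_width)
    hi = min(len(series) - 1, t + half_width)
    window = series[lo:hi + 1]
    return sum(window) / len(window)


# Toy upstream series (hypothetical values).
series = [1.0, 2.0, 3.0, 4.0, 5.0]
print(mixed_upstream(series, t=2, half_width=1))  # average of 2.0, 3.0 and 4.0
print(mixed_upstream(series, t=0, half_width=1))  # truncated window: 1.0 and 2.0
```

Under this assumption the predicted estuary value at the delayed time is simply the window average computed around the upstream passage time.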
Figure 1.5: Upstream hourly measurements at Cumlosen (panels T, O2, pH, Cond, Chl and Glob against time in years, 1997 to 2002).
Figure 1.6: Estuary hourly measurements at Seemannshoeft (Hamburg): temperature, oxygen concentration (mg/l), pH, conductivity (microS/cm) and discharge (m3/s) against time in years.
Figure 1.7: Oxygen concentration profiles (helicopter$O2, one panel per flight from 12.02.1997 to 05.11.2001), average temperatures are given above.
Figure 1.8: pH helicopter profiles (helicopter$pH, one panel per flight from 12.02.1997 to 05.11.2001).
Figure 1.9: Suspended matter helicopter profiles. [One panel per helicopter flight, from 12.02.1997
to 05.11.2001; horizontal axis: position along the Elbe (km); vertical axis: suspended matter
concentration (helicopter$Filtr).]
Figure 1.10: Ortho-phosphate helicopter profiles. [One panel per helicopter flight; horizontal axis:
position along the Elbe (km); vertical axis: PO4 concentration (helicopter$PO4).]
Chapter 2
Estimation of transport times
In this chapter we attempt to substitute a simple statistical model for a physical hydrodynamic model
of the river Elbe in order to obtain the transport times of a water body from the upstream to the
estuarine continuous measurement stations (from Cumlosen x1 to Seemannshoeft x2). For this purpose we
need measurements of a chemically passive tracer variable that remains almost unchanged from the
upstream region of primary production to the estuary, where the primary production is reversed.
Conductivity has been chosen for this task, although one could expect the mixing of the Elbe
freshwater with salty marine water to increase the water conductivity in the estuary. This effect
will prove negligible, as we will see below.
2.1 Definition of a transformed time coordinate
The most basic knowledge we have about transport times is their relation to river runoff. If we assume
that the Elbe water bodies (or slices) move downstream without mixing with neighboring slices, and
if we assume the absence of tributaries from x1 to x2 in the Elbe, the time τ that a water body takes
to flow from x1 down to x2 is the curvilinear integral of the inverse of the water body speed Q/S:

\begin{align}
\tau &= t_2 - t_1 \tag{2.1} \\
     &= \int_{x_1}^{x_2} \frac{S(x, t(x))}{Q(x, t(x))} \, dx \tag{2.2} \\
     &= \int_{x_1}^{x_2} \frac{S(x)}{Q(t(x))} \, dx \tag{2.3} \\
     &\approx \frac{1}{Q(t_1)} \int_{x_1}^{x_2} S(x) \, dx. \tag{2.4}
\end{align}
The no-mixing assumption allowed us to write time as a function of space t(x), whereas x(t) would not
be bijective in case of mixing. Of course, it is difficult to deny any mixing in the river Elbe, but
the purpose here is to propose a more convenient time coordinate. The river section area S(x) is not
time dependent for huge rivers like the Elbe, and the river runoff Q(t) is independent of the location
x along the Elbe since the Elbe tributaries are marginal in the part of the Elbe that we are studying.
Noting that the discharge varies slowly from day to day, we approximated Q(t(x)) by Q(t1).
Therefore the transport times are inversely proportional to the river discharge, which means that, by
a simple stretching of the time coordinate, they can be made discharge independent and therefore time
independent.
We suggest here (as in [3]) the transformed time coordinate, the cumulated runoff expressed in km3
of water:

\[ t' = \int_{0}^{t} Q(s) \, ds . \]

S+> change.coords
function(dat = Seem$CondRes, nnew = dim(SeemQ)[1], coord = Seem$Qcum,
         index = 8785:43824, freq = 0.04)
{
# dat  : input data
# nnew : length of the input time series
# coord: new coordinate time series (cumulated runoff)
# index: origin and end times of the new coordinate
# freq : frequency in the new time domain (in km3 of water)
home
selecdat
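As a rough illustration of this change of coordinate, the following Python sketch accumulates the runoff and resamples a series on a regular grid of cumulated runoff. It is not a transcription of the Splus routine: the function names `cumulated_runoff` and `resample_on_volume_grid`, the hourly sampling step and the linear interpolation are assumptions made for the example; only the grid step of 0.04 km3 is taken from the text.

```python
from bisect import bisect_right


def cumulated_runoff(q_m3s, dt_s=3600.0):
    """Cumulated runoff (km3 of water) at each sample time, starting from 0.

    q_m3s: discharge samples in m3/s; dt_s: sampling step in seconds (assumed hourly).
    """
    cum, total = [], 0.0
    for q in q_m3s:
        cum.append(total)
        total += q * dt_s / 1e9  # m3 -> km3
    return cum


def resample_on_volume_grid(values, cum_km3, step_km3=0.04):
    """Linearly interpolate `values` on a regular grid of cumulated runoff."""
    out = []
    n_steps = int(cum_km3[-1] / step_km3)
    for k in range(n_steps + 1):
        target = k * step_km3
        # index of the last sample whose cumulated runoff is <= target
        j = min(bisect_right(cum_km3, target) - 1, len(cum_km3) - 2)
        span = cum_km3[j + 1] - cum_km3[j]
        w = (target - cum_km3[j]) / span if span > 0 else 0.0
        out.append(values[j] + w * (values[j + 1] - values[j]))
    return out
```

With a constant discharge of 1000 m3/s, one grid step of 0.04 km3 corresponds to about 11.1 hours, so the new coordinate simply rescales time; when the discharge varies, the same volume step spans more hours at low runoff and fewer at high runoff.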
Figure 2.1: Experimental cross-correlation functions of conductivity time series from Cumlosen to
Seemannshoeft; time is expressed in the old coordinate (days). Left: high runoff (Q > 500 m3/s),
right: low runoff (Q < 500 m3/s). The dotted vertical line indicates the average lag on the whole
time series (6 days). [Horizontal axis: lag (days), from -20 to 20; vertical axis: correlation.]
2.2 Analysis of conductivity time series
To measure the delay between the conductivity time series at the two stations Cumlosen and
Seemannshoeft, we use the experimental cross-correlation function:

\[ \rho_{1,2}(\tau) = \mathrm{Cor}\big( Z(x_1, t), \, Z(x_2, t + \tau) \big), \]

Z(x, t) being here the conductivity at location x and time t. In Figures 2.1 and 2.2 the
cross-correlation functions are shifted symmetrical functions, the maximum correlation being also the
center of symmetry. Note that the value of this maximum correlation is very high, as the conductivity
time series are very similar at both stations. The computation has been performed with raw
conductivities; using residuals from the seasonal normal instead produced similar results with only
slightly lower correlations. The time lag corresponding to the maximum correlation can be interpreted
as the average delay between the time series. It is always positive since Seemannshoeft is downstream
from Cumlosen.
The cases of high and low runoff (respectively above and below the median runoff of 500 m3/s) were
separated to test for discharge dependence. In Figure 2.1, the delay is naturally shorter for higher
runoff (4.5 days on average for Q > 500 m3/s) and longer for lower runoff (8.5 days on average for
Q < 500 m3/s), as the runoff is proportional to the average water speed.
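The estimation of the delay from the experimental cross-correlation function can be sketched as follows. This is a minimal illustration on synthetic data, not the computation used for the figures: the function names and the white-noise test series are assumptions, and the downstream series is built as an exact copy of the upstream one delayed by 6 samples.

```python
import random
from math import sqrt


def correlation(a, b):
    """Pearson correlation of two equally long sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / sqrt(va * vb)


def cross_correlation(z1, z2, max_lag):
    """Return {lag: Cor(z1[t], z2[t + lag])} for lags in [-max_lag, max_lag]."""
    n = len(z1)
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        lo, hi = max(0, -lag), min(n, n - lag)
        result[lag] = correlation(z1[lo:hi], z2[lo + lag:hi + lag])
    return result


def lag_of_maximum(ccf):
    """Lag at which the cross-correlation is maximal."""
    return max(ccf, key=ccf.get)


# Synthetic example: downstream[t + 6] == upstream[t]
rng = random.Random(0)
base = [rng.gauss(0.0, 1.0) for _ in range(206)]
upstream, downstream = base[6:], base[:200]
ccf = cross_correlation(upstream, downstream, max_lag=20)
best_lag = lag_of_maximum(ccf)
```

On this synthetic pair the maximum is found at a lag of 6 samples, with a correlation of 1. On real series the peak is lower and broader, and its position is only an estimate of the average delay.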
Figure 2.2 is somewhat harder to interpret: the delay expressed in the new coordinate is still
runoff dependent, but in an inverse way. The higher the runoff, the higher the delay. This means that
for high runoff, it takes more water to bring the conductivity variation signal from Cumlosen down
to Seemannshoeft. The mixing process is the most likely explanation of why the delay is still runoff
dependent in the new coordinate. In case of strong mixing, a water drop can remain longer at
a single location because of turbulence, eddies and recirculations, so that its individual Lagrangian
speed is lower than the Eulerian water speed, as seen from the river banks. Therefore the effect of
mixing can be identified on the cross-correlation function by a higher lag of maximum correlation and
a broader peak¹.

Figure 2.2: Experimental cross-correlation functions of conductivity time series from Cumlosen to
Seemannshoeft; time is expressed in the new coordinate (km3 of water). Left: high runoff
(Q > 500 m3/s), right: low runoff (Q < 500 m3/s). The dotted vertical line indicates the average lag
on the whole time series (0.28 km3 of water). [Horizontal axis: lag (km3), from -0.5 to 0.5; vertical
axis: correlation.]

Figure 2.2 tends to indicate that mixing is more important in the Elbe when the discharge is
higher². For further analysis, we assumed that the transport times are constant in the new coordinate,
although we can see in Figure 2.2 that this is not the case. The average transport time computed over
4 full years (1997 to 2000 inclusive) is 0.28 km3 of water; this represents the average water volume
in the Elbe between Cumlosen and Seemannshoeft.
2.3 Accounting for mixing
Some mixing occurs between Cumlosen and Seemannshoeft; this is a common physical process in
rivers. A difficult question is how to quantify the degree of mixing experimentally. We suggest here
that a way to determine to which extent the Elbe waters are mixed is to measure the roughness of a
tracer time series, here water conductivity.

¹ If τ were a stochastic parameter, its distribution would be more widespread on the right for high mixing.
² Wilhelm Petersen finds this odd; he would rather expect more turbulence and mixing when the runoff is low.
Here the distribution of the transport times, denoted f_t above, is set arbitrarily to a
uniform density defined on the interval from 0.16 km3 to 0.44 km3 of water, approximately centered on
the average transport time of 0.28 km3. The boundaries of this interval were set by inspecting the
cross-correlations in Figure 2.2 for the cases of high and low runoff.
If we consider again the predictor we have built, the predicted value for oxygen is simply a moving
average of the upstream measurements over a certain period of time, which is runoff dependent: if
the Elbe runoff is low, say 300 m3/s, which is rather common in the summer, then the predictor is
an average of upstream observations taken from 17 days to 6 days ago. In case of high runoff, 1500
m3/s for example, this time interval shrinks to 3.4 days to 1.23 days before the target time, which
leaves little time for environmental management. Fortunately, the problem of oxygen deficit
seldom occurs during high runoff situations.
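The time windows quoted above follow from dividing the bounds of the uniform density (0.16 and 0.44 km3 of water) by the discharge. A short Python check, assuming a discharge that stays constant over the window (the function name `lag_in_days` is ours):

```python
SECONDS_PER_DAY = 86400.0


def lag_in_days(volume_km3, discharge_m3s):
    """Time (days) needed for `volume_km3` of water to pass at a constant discharge."""
    return volume_km3 * 1e9 / discharge_m3s / SECONDS_PER_DAY


for q in (300.0, 1500.0):
    oldest = lag_in_days(0.44, q)   # upper bound of the uniform density
    newest = lag_in_days(0.16, q)   # lower bound of the uniform density
    print(f"Q = {q:6.0f} m3/s: from {oldest:.2f} to {newest:.2f} days ago")
```

At 300 m3/s this gives roughly 17 and 6.2 days, and at 1500 m3/s roughly 3.4 and 1.23 days, in agreement with the figures in the text.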
Some validation of this simple model of mixing by moving average can be seen in Figure 2.3.
The upstream conductivity temporal variogram (squares) is continuous at the origin, but with a sharp
slope, which indicates that the upstream conductivity is a continuous but non-differentiable function
of time. In the estuary, the experimental variogram (triangles) is continuous and also differentiable
at the origin. This indicates that the estuary conductivity time series is smoother than the upstream
one. The difference in roughness between the time series can be explained by an effect of mixing. If
we plot the cross-variogram of the upstream conductivity and its moving average on Figure 2.3, we see
that the upstream time series has been smoothed, and even oversmoothed, by the moving average since
it turns out to be a little smoother than the estuary conductivity time series. This should be fixed
by choosing a more adequate distribution f_t than the arbitrary uniform density above.
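The link between smoothing and the behaviour of the variogram at the origin can be illustrated on synthetic data. The sketch below is an assumption-laden toy example, not the conductivity data: a random walk stands in for a rough (continuous but non-differentiable) series, and a moving average of arbitrary width 10 stands in for the mixing.

```python
import random


def variogram(z, max_lag):
    """Experimental variogram: half the mean squared increment at lags 1..max_lag."""
    gamma = []
    for h in range(1, max_lag + 1):
        incr = [(z[t + h] - z[t]) ** 2 for t in range(len(z) - h)]
        gamma.append(0.5 * sum(incr) / len(incr))
    return gamma


def moving_average(z, width):
    """Uniformly weighted moving average over `width` consecutive samples."""
    return [sum(z[t:t + width]) / width for t in range(len(z) - width + 1)]


rng = random.Random(1)
walk = [0.0]
for _ in range(999):
    walk.append(walk[-1] + rng.gauss(0.0, 1.0))  # rough series

smooth = moving_average(walk, 10)                # "mixed" series
g_rough = variogram(walk, 5)
g_smooth = variogram(smooth, 5)
```

The variogram of the rough series grows about linearly from the origin, whereas the variogram of the averaged series starts much lower and grows almost quadratically over the first lags, which is the signature of a smoother series.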
Moreover, the above predictor is constructed to predict downstream concentrations at the location
of the Seemannshoeft station (km 628.8), whereas the oxygen minimum is generally located a dozen
kilometers downstream from Seemannshoeft. The predictor is therefore not necessarily adequate, and a
physical model of the Elbe is needed to produce better predictions.
The predictor can be made more rigorous in a geostatistical sense by performing ordinary kriging
instead of a moving average.
\[ Z(x_2, t_n) = \sum_k \ldots \]
Figure 2.3: Experimental conductivity variograms. Squares: upstream simple variogram, triangles:
estuary (downstream) simple variogram, dotted line: cross-variogram upstream/predictor (moving
average). [Horizontal axis: distance, from 0 to 4; vertical axis: variogram, from 0 to 16000.]
However, we should be critical of this predictor since the aim is not to predict conductivity but
oxygen, which may depend on several variables (solar radiation, discharge, nutrients ...), and the
most relevant time delay should be different for each of these variables.

- The Elbe water does not receive solar radiation at a single location but all along its course in
  the upper and middle Elbe and in the tributaries, so that the actual distribution of the delays may
  be more widespread for this variable.
- The Elbe discharge is measured at Neu Darchau, a station located between Cumlosen and Hamburg.
  Additional cross-correlations between the discharge measurements and the conductivities
  in Cumlosen and Seemannshoeft³ indicate that the discharge variations are 0.28 km3 of water
  ahead of the Seemannshoeft conductivity variations and 0.12 km3 behind the Cumlosen
  conductivity variations. The fact that these two delays sum up to 0.40 km3 of water,
  which is greater than the average delay of 0.28 km3 between Cumlosen and Seemannshoeft, remains
  puzzling. Maybe the maximum of the cross-correlation is a biased estimator of the average
  transport time. This calls for further probabilistic work.
- The nutrients are not necessarily in the dissolved phase like the salt; they can interact with the
  suspended matter, which moves down the river at a slower rate than the water. Therefore the
  transport times might actually be longer for biological processes than for the conductivity.

³ Not shown here.
Figure 2.4: Predicted (blue) and observed (black) conductivity at Seemannshoeft; the predictor is a
shifted and smoothed average of the upstream conductivity in a modified time coordinate. [Horizontal
axis: time (yr), from 1997 to 2001; vertical axis: conductivity (mS/cm), from 600 to 1200.]
In practice it was not possible to estimate empirical delays between each of these variables and the
oxygen losses since the corresponding cross-correlation functions did not exhibit a clear peak.
Splus routine used to build the predictor
S+> simple.predictor
function(date = heli$Datum, var = CumlQ2$Chl)
{
# date: vector of target times for prediction
# var : predictor time series (new time coordinate)
n
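The moving-average predictor described in this chapter can be sketched in Python as follows. This is not a transcription of the Splus routine: the function name `moving_average_predictor` and its arguments are hypothetical, and the series is assumed to be already resampled on a regular grid of cumulated runoff with a step of 0.04 km3, so that the uniform density on 0.16 to 0.44 km3 covers lags of 4 to 11 grid steps.

```python
def moving_average_predictor(upstream, step_km3=0.04,
                             lag_min_km3=0.16, lag_max_km3=0.44):
    """Predict the downstream value at each grid index n as the uniformly
    weighted mean of the upstream values lagged by lag_min..lag_max.

    `upstream` is assumed to be sampled on a regular cumulated-runoff grid.
    Entries for which the full window is not available are set to None.
    """
    k_min = round(lag_min_km3 / step_km3)   # shortest lag, in grid steps
    k_max = round(lag_max_km3 / step_km3)   # longest lag, in grid steps
    pred = [None] * len(upstream)
    for n in range(k_max, len(upstream)):
        window = upstream[n - k_max:n - k_min + 1]
        pred[n] = sum(window) / len(window)
    return pred
```

Replacing the uniform weights by weights derived from a fitted delay distribution, or by kriging weights, would only change the line that averages the window.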
Chapter 3
Oxygen losses from Cumlosen to Seemannshoeft
In this chapter we study the time series from the automatic stations Cumlosen and Seemannshoeft
over 4 years (1997 to 2000). The oxygen losses are defined as the delayed difference
O2(x1, t) - O2(x2, t + τ) in the new time coordinate. The best covariates explaining the losses are
identified.
3.1 Identifying a covariate
Figures 3.1 and 3.2 show two of the most meaningful scatterplots. Chlorophyll-a is the best
covariate; it is also the parameter best related to primary production. The
scatterplot of oxygen losses against upstream oxygen shows no trend at all. The upstream
oxygen saturation index might be a better covariate than the oxygen concentrations since it would be
independent of the water temperature and pressure. The water temperature is also linked to the
oxygen losses (Figure 3.2) since primary production always happens in the summer when the
temperature is high, but there are also many situations where the temperature is high and no oxygen
loss can be observed. Other scatterplots with relevant upstream variables can be found in Figure 3.4;
in particular, it is clear that oxygen losses only occur in case of low runoff.
We will then focus on the most relevant covariate (Chlorophyll-a) and try to refine this relationship.
3.2 Conditioning to discharge
If we split the scatterplot of Figure 3.1 by conditioning on 6 discharge intervals¹, we obtain the
scatterplots of Figure 3.5. The sigmoid shape has disappeared and each cloud appears to follow a
linear relation with a slope decreasing with discharge. The linear fit appears weaker for the lower
values of discharge (see the bottom left plot of Figure 3.5), possibly because of the influence of
temperature. In Figure 3.1, the cloud aggregated over all discharge situations loses this linear aspect.
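The conditioning can be sketched as follows: the samples are sorted by discharge, split into intervals holding equal numbers of points, and a straight line is fitted in each interval. This is a minimal Python illustration of the idea, not the Trellis code used for the figures; the function names are ours.

```python
def ols_slope(x, y):
    """Slope of the least-squares line y = a + b * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx


def slopes_by_discharge(chl, o2_loss, discharge, n_bins=6):
    """Regression slope of O2 loss on Chl within each of `n_bins` discharge
    intervals containing (almost) equal numbers of points, from the lowest
    to the highest discharge."""
    order = sorted(range(len(discharge)), key=lambda i: discharge[i])
    size = len(order) // n_bins
    slopes = []
    for k in range(n_bins):
        end = (k + 1) * size if k < n_bins - 1 else len(order)
        idx = order[k * size:end]
        slopes.append(ols_slope([chl[i] for i in idx],
                                [o2_loss[i] for i in idx]))
    return slopes
```

A sequence of slopes that decreases from the first to the last interval corresponds to the pattern described above, where the oxygen loss per unit of chlorophyll concentration is smaller at higher discharge.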
The nonlinear effect of discharge on the scatterplots can be explained by a dilution effect. It
causes a nonlinear relationship between chlorophyll and the oxygen loss variable and raises the
question whether concentrations are the adequate quantities with which to look for linear
relationships. Is it possible to remove the dilution effect by replacing chlorophyll concentrations,
and upstream and downstream oxygen concentrations, by the associated loads²?

¹ After a suggestion by Christian Lajaunie, easily experimented with the Splus library Trellis.

Figure 3.1: Scatterplot of the oxygen losses against upstream Chlorophyll-a. The sigmoid shape is
partly due to the 2000 chlorophyll concentrations, which are much higher than for the previous and
following years. [Horizontal axis: Chl-a (mg/l), from 0 to 150; vertical axis: O2 loss (mg/l), from
-2 to 12.]
Figure 3.6 shows the transformed scatterplots. For higher runoff, the oxygen and chlorophyll
concentrations that were close to zero in Figure 3.5 are amplified by the multiplication with the
discharge value and exhibit a noisy behaviour that carries no meaning. The lower runoff situations
are more interesting since they contain the cases of severe oxygen deficit. In these cases, the
linear relationships are better defined, and even better with loads than with concentrations.
Furthermore, the regression slopes seem to be independent of the river discharge.
The most natural way to make proper use of the above remark is to carry on the study
with loads, but discarding all data taken at higher runoff, which would perturb the statistics and
bring no information about the oxygen deficit since it does not occur in case of high runoff.
A last remark that can be made about the change from concentrations to loads is that the
exceptionally high chlorophyll concentration measurements in 2000 produce quite ordinary loads in
Figure 3.3, similar to the 1999 loads. The regression slope nevertheless seems a bit lower, but we
should wait for the 2001 Seemannshoeft data to come to a conclusion.
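The conversion from concentrations to loads is a simple product with a unit change, since 1 mg/l equals 1 g/m3. A small Python helper (the name `load_kg_per_s` is ours) makes the arithmetic explicit:

```python
def load_kg_per_s(concentration_mg_per_l, discharge_m3_per_s):
    """Load (kg/s) = concentration (mg/l, i.e. g/m3) * discharge (m3/s) / 1000."""
    return concentration_mg_per_l * discharge_m3_per_s / 1000.0


# The same concentration carries five times more mass per second
# when the discharge is five times higher.
low_runoff_load = load_kg_per_s(10.0, 300.0)
high_runoff_load = load_kg_per_s(10.0, 1500.0)
```

A concentration of 10 mg/l thus corresponds to a load of 3 kg/s at 300 m3/s and 15 kg/s at 1500 m3/s, which is why small concentration fluctuations at high runoff turn into large, noisy load values.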
² The load is the concentration multiplied by the discharge; it is expressed in kg/s.

Figure 3.2: Scatterplot of the oxygen losses against temperature. [Horizontal axis: temperature
(deg C), from 0 to 25; vertical axis: O2 loss (mg/l).]
3.3 Conditioning to temperature
By the same conditioning technique as above, the influence of the upstream water temperature on
the relation between oxygen and chlorophyll (expressed as loads) is explored. In Figure 3.7 the 6
scatterplots show this relation conditioned on 6 intervals of temperature with equal numbers of
points. At lower temperature, the plots are as messy as they were for higher discharge in Figure 3.6
because the winter and spring seasons are periods of low temperature and high discharge at the same
time. We will therefore comment only on the two upper scatterplots of Figure 3.7. First, the clouds
are more spread than in the discharge conditioning case (Figure 3.6), which shows that the
temperature has a weaker impact than discharge. However, the regression slope is clearly higher on
the top right plot than on the top left. This illustrates the effect of a temperature change from
15 °C to 20 °C: the oxygen loss is much more sensitive to chlorophyll at higher temperature.
Figure 3.3: Scatterplots of the oxygen load losses against chlorophyll loads, for the 4 successive
years (panels 1997, 1998, 1999 and 2000). Only data sampled at low discharge (Q < 500 m3/s) are
represented. The 2000 measurements are not outlying anymore. [Horizontal axis: Chl-a loadings
(kg/s); vertical axis: O2 losses (loadings, kg/s).]

Figure 3.4: Scatterplots of the oxygen losses (mg/l) against, respectively, discharge (run off,
m3/s), upstream oxygen concentration (mg/l), upstream pH and global solar radiation.

Figure 3.5: Scatterplots of the oxygen losses against chlorophyll conditioning on river discharge,
split into 6 intervals containing equal numbers of points. The dotted lines are plotted between
successive points in order to follow the trajectories, and the linear regression is drawn.
[Horizontal axis: Chl-a (mg/l); vertical axis: O2 losses (mg/l).]

Figure 3.6: Scatterplots of the oxygen load losses against chlorophyll loads, conditioning on river
discharge, as in the previous figure. [Horizontal axis: Chl-a load (kg/s); vertical axis: O2 load
losses (kg/s).]

Figure 3.7: Scatterplots of the oxygen load losses against chlorophyll loads, conditioning on
upstream river temperature. [Horizontal axis: Chl-a load (kg/s); vertical axis: O2 load losses
(kg/s).]
Chapter 4
Multivariate analysis using both helicopter and station data
In this chapter a Principal Component Analysis is set up, in order to describe the multivariate relations
between the predictors and the observed characteristics of a helicopter profile.
4.1 Intersections of the station and helicopter data
The helicopter provides spatially rich but temporally poor information, and vice versa for the
continuous station. Having too little time to set up a multivariate spatio-temporal model that mimics
both the mineralization of the upstream biomass and the mixing with marine water, we extracted a few
interesting features of the helicopter profiles (value and location of the oxygen minimum, average
PO4 and suspended matter concentrations) and explored their linear statistical relations with
upstream predictors. The samples for the principal component analysis are therefore irregular time
series with at least one month between two consecutive samples. We can thus expect the samples
to be independent since the biological situation in the estuary changes completely every 15 days (see
Figure 1.4).
From 1997 to 2001 inclusive, there are 31 helicopter profiles that we can try to predict using
upstream data previously recorded and estuarine data that are easily predictable (water levels by the
prediction of tides). Following the previous chapter, a natural data pretreatment would be to
transform all concentrations and all variables subject to dilution (conductivity) into loads and to
select the data with low discharge only. However, this selection would remove 11 of the 31
intersecting points between the fixed stations and the helicopter data. We therefore prefer keeping
concentrations in the analysis, i.e. using linear statistics with nonlinearly linked variables,
rather than performing a principal component analysis of 20 samples.
As each helicopter profile samples the Seemannshoeft location where continuous measurements
are performed, we can simply compare both measurement protocols for the variables oxygen, pH,
conductivity and temperature. As the helicopter samples are not analyzed directly on board, but wait
for later treatment in the lab, the continuous measurements are supposed to be of better
quality for the oxygen and pH variables. This can be observed for the oxygen concentrations in
Figure 4.1. The helicopter data are higher than the Seemannshoeft station measurements by about
1 mg/l. A support effect cannot be responsible for such a bias as the oxygen measurements are both
spatially and temporally smooth. For example, on 14 August 2000 at 10:25 am at the Seemannshoeft
station, the measured oxygen concentration was 1.8 mg/l and was varying slowly from one hour to the
next. At 10:48 am, the helicopter sampled the same location and the analysis produced a higher
concentration of 2.9 mg/l.

Figure 4.1: Comparison between helicopter and hourly oxygen concentrations taken at the same time
at Seemannshoeft (km 628.8). The solid line is the first bisector, and the dotted line is the linear
regression. The helicopter data are systematically 1 mg/l higher than the station data. [Horizontal
axis: helicopter O2 data (mg/l); vertical axis: automatic station data (mg/l); both from 0 to 14.]

The most convincing explanation for this bias is the fact that the Elbe water is not at
thermodynamic equilibrium and the oxygen content of estuarine water is often below saturation.
Between the time the Elbe water is sampled and the time it is analyzed, the water dissolves oxygen
at the air interface, which increases the oxygen concentrations. However, the standardization of the
data prior to the principal component analysis removes such an additive bias.
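The last remark can be checked in a few lines: centering and scaling a variable gives the same standardized values whether or not a constant has been added to it. The numbers below are hypothetical oxygen concentrations chosen for the example, not measurements from the report.

```python
from math import sqrt


def standardize(values):
    """Center to zero mean and scale to unit standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]


station = [1.8, 4.2, 6.5, 9.1, 11.0]         # hypothetical values (mg/l)
helicopter = [v + 1.0 for v in station]      # same values with a +1 mg/l bias

z_station = standardize(station)
z_helicopter = standardize(helicopter)       # identical up to rounding errors
```

A multiplicative bias would also be removed by the scaling step, but a bias that depends on the oxygen level in a nonlinear way would not.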
4.2 Preliminary analysis: variables selection
As a preliminary principal component analysis, we put together all variables of potential interest.
The first EOF (horizontal in Figure 4.2) can be viewed as the seasonal opposition between the
variables that have their maximum in the summer (on the left) and those that are maximum in the
winter (on the right). The second EOF shows an opposition between the Tide (water level) variable
and both the location of the oxygen minimum and the river discharge. The location of the oxygen
minimum is outside the circle of total correlations because this variable is not defined in the
winter and only part of the samples are used for computing correlations with locO2. Therefore the
correlation matrix is not positive definite. This second EOF is harder to interpret: it tends to
separate situations when (1) the oxygen minimum is located far downstream in the estuary, (2) the
water levels are low and (3) the discharge is high, paradoxically. As the time integration of the
water level variable (in the new coordinate: over 0.04 km3 of water) might not be consistent with the
water body sampled by the helicopter, it seemed hazardous to interpret this second EOF; instead we
removed the locO2, Tide and Susp variables, which may be controlled by processes that have no link
with the oxygen deficit. Furthermore, the helicopter profiles are flown at low tide, which
constitutes a preferential sampling strategy. It is frustrating not to exploit the information
contained in the water level data. A possible way to use it may be to separate the 12-hour mean water
level (varying with the sea water levels and the river discharge) from the cycle amplitude (varying
with the moon's influence). These two parts certainly have a different influence on the mixing
process and should be treated as two distinct variables. We will hereafter focus on variables that we
are able to comment upon.

Figure 4.2: Projection of the raw variables on the first two EOFs. The helicopter variables are:
locO2, the location of the O2 minimum; O2min, the oxygen minimum value; Susp, the average suspended
matter concentration over the whole estuary. locO2 is outside the correlation circle because it is
defined on only 20 profiles and the correlation matrix is therefore not positive definite. The
Seemannshoeft predictor is Tide, the water level averaged in the new time coordinate. The Cumlosen
predictors (delayed and averaged upstream observations) are O2up, the upstream oxygen concentration;
Chl, the chlorophyll concentration; pH; T, the water temperature; Glob, the solar irradiation; Q, the
discharge; and Cond, the water conductivity. [Axes: Axe 1 and Axe 2, both from -1.0 to 1.0.]

Figure 4.3: Location of the oxygen minimum (from helicopter profiles; km 600 is Hamburg, km 750 is
in the North Sea) against river discharge; the text labels give the date of each helicopter profile.
The position of the oxygen minimum seems to be further downstream when the discharge is high, but
this may be a seasonal effect since the profiles performed in May alone can produce the correlation.
[Horizontal axis: run off (m3/s), from 300 to 700; vertical axis: location of the O2 minimum (km),
from 630 to 660.]
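The reason why correlations computed on different subsets of samples can yield an invalid correlation matrix, and hence a variable plotted outside the correlation circle, can be shown on a deliberately extreme toy example. The data below are invented for the illustration; a valid 3 x 3 correlation matrix must have a non-negative determinant, which fails here.

```python
from math import sqrt


def pairwise_correlation(a, b):
    """Correlation computed only on the samples where both variables are defined."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    n = len(pairs)
    ma = sum(x for x, _ in pairs) / n
    mb = sum(y for _, y in pairs) / n
    cov = sum((x - ma) * (y - mb) for x, y in pairs)
    va = sum((x - ma) ** 2 for x, _ in pairs)
    vb = sum((y - mb) ** 2 for _, y in pairs)
    return cov / sqrt(va * vb)


def det3(r_ab, r_ac, r_bc):
    """Determinant of the 3 x 3 correlation matrix with these off-diagonal terms."""
    return 1.0 + 2.0 * r_ab * r_ac * r_bc - r_ab ** 2 - r_ac ** 2 - r_bc ** 2


N = None  # missing value
a = [1, 2, 3, 1, 2, 3, N, N, N]
b = [1, 2, 3, N, N, N, 1, 2, 3]
c = [N, N, N, 1, 2, 3, 3, 2, 1]

r_ab = pairwise_correlation(a, b)   # computed on samples 1 to 3
r_ac = pairwise_correlation(a, c)   # computed on samples 4 to 6
r_bc = pairwise_correlation(b, c)   # computed on samples 7 to 9
determinant = det3(r_ab, r_ac, r_bc)
```

Here a is perfectly correlated with both b and c, while b and c are perfectly anti-correlated, which no complete data set could produce; the determinant is negative, so the matrix has a negative eigenvalue. Restricting all correlations to the samples where every variable is defined avoids the problem at the cost of discarding data.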
A statistical problem occurring when selecting the oxygen minimum in the helicopter profiles
is that this random variable is a minimum among 28 observations. The statistical treatment of an
extremum is much trickier than the estimation of the expected value at a given location: it is
more variable than an average and it is highly dependent on the chosen support of the observations.
Nevertheless, the oxygen minimum value will be fed to the linear statistical analysis procedures in
order to reveal multivariate dependences.
4.3 Predicting the location of the oxygen minimum
Figure 4.3 shows the empirical relation between river runoff and the position of the oxygen minimum;
a seasonal effect may be responsible for the trend since discharge also has seasonal variations. The
remaining uncertainty on the position is still very high. Predicting this position seems to be a
fairly difficult task: the influence of mixing with marine water is undoubtedly a crucial factor, and
the river temperature may also influence this position, as Wilhelm Petersen suggested.
Figure 4.4: Scree graph of the principal component analysis of the raw variables (left) and
projection of the raw variables on the first two EOFs (right). The scree graph shows the overall
domination of the first EOF. The O2loss variable is the difference between the oxygen predicted by
transport and mixing of the upstream oxygen measurements and the value of the oxygen minimum measured
from the helicopter. [Left panel, titled "PCA: Concentrations": % explained variance for EOF 1 to
EOF 7. Right panel: Axe 1 and Axe 2, both from -1.0 to 1.0, with the variables Chl, T, Q, pH, Cond,
Glob and O2loss.]
4.4 Analysis of the raw variables
Figure 4.4 shows the result of the principal component analysis with selected raw variables. The
scree graph on the left shows that most of the data variance can be explained by the first EOF
alone. The projection on the right graph clearly shows that the first EOF opposes the summer
variables (right-hand side) to the winter variable Q (left-hand side). The opposition between
discharge and conductivity is a typical pattern of the dilution effect (see [3] for example). As dilution
is also a seasonal phenomenon, it is grouped with the biological processes on the first EOF. All
variables more or less directly linked with biology (solar radiation Glob, pH, chlorophyll, water
temperature and oxygen losses) are concentrated in the same location in the EOF1-EOF2 plane, as
if all these variables carried exactly the same information. However, as we have seen earlier on
the scatterplots, water temperature is a less informative variable than chlorophyll for predicting the
oxygen losses, and the present principal component analysis is not able to separate them. In other
words, the principal component analysis revealed a strong covariation of the biological variables, but
only because they follow the same seasonal cycles. Such an analysis of raw variables with
strong seasonal cycles is clearly inadequate for short-term predictions. We therefore preferred an
analysis of deviations from the seasonal normal.
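The argument of this section, that a shared seasonal cycle is enough to make the first EOF dominate, can be checked on synthetic data. The following Python sketch is illustrative only and is not the report's S-Plus code. Its assumptions: four invented variables each follow the same sinusoidal yearly cycle (one with opposite sign, like Q) plus independent Gaussian noise; the seasonal term is known exactly and simply subtracted to form the residuals; the leading eigenvalue of the correlation matrix is obtained by power iteration. The helper names `correlation_matrix` and `first_eof_share` are hypothetical.

```python
import math
import random


def correlation_matrix(columns):
    """Correlation matrix of a list of equally long data columns."""
    n = len(columns[0])
    standardized = []
    for col in columns:
        mean = sum(col) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / n)
        standardized.append([(v - mean) / sd for v in col])
    p = len(columns)
    return [[sum(a * b for a, b in zip(standardized[i], standardized[j])) / n
             for j in range(p)] for i in range(p)]


def first_eof_share(columns, n_iter=500):
    """Fraction of the total variance carried by the first EOF."""
    corr = correlation_matrix(columns)
    p = len(corr)
    vec = [1.0 / (i + 1) for i in range(p)]  # arbitrary non-uniform start vector
    for _ in range(n_iter):  # power iteration towards the leading eigenvector
        new = [sum(corr[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(v * v for v in new))
        vec = [v / norm for v in new]
    eigenvalue = sum(vec[i] * corr[i][j] * vec[j] for i in range(p) for j in range(p))
    return eigenvalue / p  # the trace of a correlation matrix equals p


rng = random.Random(1)
days = range(730)  # two synthetic years of daily values
season = [math.sin(2 * math.pi * d / 365.0) for d in days]
loadings = [2.0, 2.0, 2.0, -2.0]  # last variable is in opposition, like discharge
noise = [[rng.gauss(0.0, 0.5) for _ in days] for _ in loadings]
raw = [[a * s + e for s, e in zip(season, eps)] for a, eps in zip(loadings, noise)]
# Residuals: deviations from the (here exactly known) seasonal term.
residuals = [[x - a * s for x, s in zip(col, season)] for a, col in zip(loadings, raw)]

share_raw = first_eof_share(raw)
share_res = first_eof_share(residuals)
print("first EOF share, raw variables:", round(share_raw, 2))
print("first EOF share, residuals    :", round(share_res, 2))
```

In this toy setting the first EOF of the raw variables carries most of the variance although the variables are unrelated apart from the common cycle, while the residuals show no dominant axis. With real data the seasonal normal has to be estimated, and the residuals may keep genuine dependences.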
[Figure 4.5: time series of O2 concentrations (mg/l, 2 to 12) over the years 1997 to 2002. Legend: seasonal normal; O2 min (helicopter).]
Figure 4.5: Oxygen minimum time series and the seasonal normal. Crosses: minimum oxygen values from helicopter profiles; solid line: seasonal normal defined from the Seemannshoeft continuous station. The helicopter measurement bias (+1 mg/l) has not been removed from the data.
4.5 Defining a seasonal normal
After the massive change in the Elbe river regime between 1991 and 1994, the seasonal normal value
for many river variables is no longer what it used to be. Therefore, we used the continuous
measurements of the stations Cumlosen and Seemannshoeft to define the expected normal value. The
procedure is the same as the one used in [3]:
- Construct an average year hour-by-hour from the data available (for the example of temperature below, we use both stations as they have a common seasonal normal for temperature).
A
A drawback of smoothing splines is that they are not designed for periodic variables. As a result,
the smoothed yearly minimum is overestimated and the smoothed yearly maximum is underestimated.
Tuning the number of degrees of freedom makes it possible to mask this effect.
Some variables are more problematic than temperature. The solar radiation seasonal normal
should be zero every night; we therefore dropped the smoothing step from the above
procedure and kept only the yearly average value. The normal oxygen minimum value is even more
problematic, since the continuous station Seemannshoeft is generally a little upstream of the true
location of the oxygen minimum. However, we considered that the oxygen seasonal normal for station
Seemannshoeft is a reasonable normal value in this region of the estuary. The helicopter minimum
values are plotted together with the yearly normal curve in Figure 4.5, and we can see that most of the
fluctuations are present in the seasonal normal, although it is defined for the wrong location.
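The construction of a seasonal normal described in this section can be sketched as follows. This is an illustration in Python under stated assumptions, not the report's S-Plus routine. The data are a synthetic daily temperature-like series with invented parameters; a period of 365 slots is used for brevity, where the report builds the average year hour by hour. A circular moving average stands in for the smoothing splines used in the report, so that the smoothed curve wraps correctly around the year end. The function names `average_year` and `circular_smooth` are hypothetical.

```python
import math
import random
import statistics


def average_year(values, period):
    """Average all observations that fall on the same slot of the year."""
    sums = [0.0] * period
    counts = [0] * period
    for index, value in enumerate(values):
        if value is None:  # skip missing measurements
            continue
        slot = index % period
        sums[slot] += value
        counts[slot] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]


def circular_smooth(year, half_width):
    """Moving average that wraps around the year end (periodic smoothing).

    Assumes every window contains at least one non-missing slot.
    """
    period = len(year)
    smoothed = []
    for slot in range(period):
        window = [year[(slot + k) % period]
                  for k in range(-half_width, half_width + 1)]
        window = [v for v in window if v is not None]
        smoothed.append(sum(window) / len(window))
    return smoothed


period = 365
rng = random.Random(2)
true_cycle = [12.0 + 8.0 * math.sin(2 * math.pi * (d - 110) / period)
              for d in range(period)]
# Four synthetic years: the yearly cycle plus day-to-day noise.
series = [true_cycle[d % period] + rng.gauss(0.0, 1.5) for d in range(4 * period)]

normal = circular_smooth(average_year(series, period), half_width=15)
anomalies = [value - normal[i % period] for i, value in enumerate(series)]
print("largest error of the seasonal normal:",
      round(max(abs(n - t) for n, t in zip(normal, true_cycle)), 2))
print("mean anomaly:", round(statistics.fmean(anomalies), 6))
```

A moving average also flattens the yearly extrema slightly, so the drawback noted above for the splines does not disappear; the window width plays the role of the number of degrees of freedom. For a variable such as solar radiation, the smoothing step would be skipped and only `average_year` used, as described in the text.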
S-Plus routine used to compute the annual mean:
S+> seasonal.mean <- function(x1=Seem$O2, x2=NULL, window.length=8766, weights=F){
n1
[Figure 4.6: left panel "PCA: Residuals", percentage of explained variance (0 to 30) for the principal axes EOF 1 to EOF 7. Right panel: projection of the variables Chl, Glob, T, pH, Q, Cond and O2loss on Axe 1 and Axe 2 (both from -1.0 to 1.0).]
Figure 4.6: Scree graph of the principal component analysis of the residuals (left) and projection on the first two EOFs (right). The scree graph shows the domination of the first two EOFs.
an abnormally high discharge will cause an abnormally low conductivity. The cluster of variables
linked to the biological activity in Figure 4.4 has split up in Figure 4.6, so that we are now able to
distinguish:
- the variables that are directly linked to the oxygen deficit: upstream pH and chlorophyll, which mark the primary production;
- those that are indirectly linked to it: solar radiation and temperature, which are seasonally linked to the oxygen losses but apparently have no direct relationship with them.
In the residuals analysis, we can still observe an opposition between the river discharge and the oxygen
losses, which was already present in the raw data analysis. This opposition means that a
discharge stronger than normal makes the oxygen losses lower than normal. This may be interpreted
in terms of a dilution effect: too much water can stop some biological processes.
Chapter 5
Conclusion
From the analysis of residuals from the seasonal normal, it appears that predictions of the estuary
oxygen deficit should take into account the pH, chlorophyll and discharge measured on the same
water body when it crossed the upstream monitoring station.
Although a simple statistical analysis can provide a reasonable predictor for the inert conductivity,
the prediction of oxygen turns out to be a much more complex task. More variables must be
taken into account, and transport times should be allowed to differ from one variable to another. Here
it was assumed that all active material was in the dissolved phase, which is of course an
oversimplification. An additional mixing term by moving average was supposed to absorb all the
uncertainty on the transport times, but this may not be enough. A probabilistic study using a random
delay should make these things clearer. Furthermore, measurements at the stations Weir Geesthacht and
Bunthaus, located between the two automatic stations studied here, could provide more insight into the
transport and mixing phenomena. This might also provide an answer to the following unresolved question:
how to predict the oxygen loss if we do not know where it will happen? As the mixing with marine