Extended Prediction of North Indian Ocean Tropical Cyclones using the
ECMWF Variable Ensemble Prediction System
James I. Belanger*, Peter J. Webster, and Judith A. Curry
School of Earth and Atmospheric Sciences, Georgia Institute of Technology,
Atlanta, GA
July 2011
Submitted to Monthly Weather Review
*Corresponding Author Address: James I. Belanger School of Earth & Atmospheric Sciences Georgia Institute of Technology 311 Ferst Drive Atlanta, GA 30332-0340
Email: [email protected]
Abstract
This analysis examines the predictability of several key forecasting parameters using the
ECMWF Variable Ensemble Prediction System (VarEPS) for tropical cyclones (TCs) in the
North Indian Ocean (NIO), including tropical cyclone genesis, pre-genesis and post-genesis track and
intensity projections, and regional outlooks of tropical cyclone activity for the Arabian Sea and
the Bay of Bengal. Based on the evaluation period from 2007 to 2010, the VarEPS TC genesis
forecasts demonstrate low false alarm rates and moderate to high probabilities of detection for
lead-times of one to seven days. In addition, VarEPS pre-genesis track forecasts on average
perform similarly to the VarEPS post-genesis forecasts through 120 hrs and feature a total track
error growth of 41 nm per day. The VarEPS provides superior track forecasts for lead-times
greater than 12 hrs compared to other models, including UKMET, NOGAPS, and GFS, and
slightly lower track errors than the Joint Typhoon Warning Center. Given the societal need for a
modern forecasting system that extends beyond three days in the NIO, we conclude with a
discussion of how the VarEPS can provide much of this extended predictability in a probabilistic
framework.
1. Introduction
Tropical cyclones (TCs) in the North Indian Ocean have a profound impact on the littoral
countries of the Arabian Sea and the Bay of Bengal. The combination of a shallow coastal plain
along with a thermodynamically favorable environment allows TCs to impart high surface
winds, torrential rains, and significant wave heights (wave setup plus storm surge) as these
systems move inland. Low socioeconomic conditions in the region combined with the world’s
highest population density allow landfalling TCs that historically have had little advance warning
to become devastating natural disasters. In fact, eight of the ten deadliest TCs of all time have
occurred in the Bay of Bengal and the Arabian Sea, with five impacting Bangladesh and three
making landfall in India (WMO-TD No. 84). In addition, the highest surge height ever recorded
for a TC (45 feet) occurred in the North Indian Ocean near the Meghna Estuary, Bangladesh in
1876 (WMO-TD No. 84).
The complex meteorological and socioeconomic environment in the Bay of Bengal and
throughout the North Indian Ocean necessitates a warning system that provides several days of
advance notice for TC formation, track movement, intensity change, horizontal distribution of
surface winds, and maximum coastal wave heights. However, the World Meteorological
Organization (WMO) only requires a small subset of this critical forecast information to be
generated daily. The India Meteorological Department (IMD), which is the WMO-designated
Regional Specialized Meteorological Centre (RSMC), provides the official forecasts and
warnings for tropical cyclones in the North Indian Ocean region. As mandated by the WMO, the
IMD is required to coordinate and release its forecasts daily with each member country within
the North Indian Ocean. However, ultimate responsibility for forecast development and warning
dissemination lies with each country’s national meteorological service. As part of the daily
operational procedure when a TC is not present in the region, the IMD is required to prepare a
daily tropical weather outlook, which assesses the possibility of tropical depression development
in the Bay of Bengal and the Arabian Sea. Unlike the National Hurricane Center’s Tropical
Weather Outlook, which provides the likelihood of TC genesis during the next 48 hours
(Rappaport et al. 2009), the WMO provides no time constraint for this outlook product, as the
time period that is covered by this forecast is determined separately by each RSMC. In addition,
this product does not provide any quantitative, probabilistic information about the potential for
tropical cyclone formation.
After a TC has formed in the North Indian Ocean (NIO), the IMD begins issuing advisories, which contain
information for the TC including its past movement, current location and intensity, and forecast
information including its future location, translation speed, wind intensity, maximum average
surface wind speed, and highest surface wind gust. However, the WMO only requires that these
tropical cyclone advisories and forecasts extend out to three days in the NIO even though most
numerical guidance for tropical cyclone forecasting today spans a full five-day forecast window
(Rappaport et al. 2009). Since the average lifetime of a tropical cyclone in the North Indian
Ocean is about four days, these limited outlooks do not provide enough lead-time for disaster
preparation or evacuation. Also, even though this region has historically experienced devastating
impacts from tropical cyclone-induced storm surge in the Bay of Bengal, the IMD does not
produce storm surge forecasts, as these remain the responsibility of each country’s national
meteorological service (WMO-TD No. 84).
Given the societal need for a modern forecasting system that incorporates several additional
forecast parameters with a lead-time well beyond three days, here we show how forecasts from
the European Centre for Medium-Range Weather Forecasts (ECMWF) Variable Ensemble
Prediction System (VarEPS) can provide much of this new information.
2. Data and Methods
a) ECMWF Variable Ensemble Prediction System
The predictability of tropical cyclones in the North Indian Ocean is evaluated using the
European Centre for Medium-Range Weather Forecasts (ECMWF) Variable Ensemble
Prediction System (VarEPS). During the period 2007–2010, the ECMWF VarEPS (hereafter
VarEPS) underwent a number of important changes (footnote 1) that include increasing the horizontal
and vertical resolution of the modeling system, expanding data assimilation procedures to
include a greater number of satellite radiance measurements, updating model physics and
cumulus convection parameterization schemes, and changing how initial and stochastic
perturbations are generated using singular vectors.
As of 26 January 2010, the VarEPS includes the ECMWF global model that is run at
TL1279 spectral truncation (horizontal resolution ~ 16 km) with 91 vertical levels out to ten days
along with 51 ensemble members (50 perturbed members + 1 control run) at TL639 spectral
truncation (horizontal resolution ~ 32 km) with 62 vertical levels. For days 11–15, the
51-member ensemble is processed at a reduced TL319 spectral truncation (horizontal resolution
~ 63 km).
To represent the uncertainty in initial conditions, ensemble perturbations are constructed
using singular vectors, which capture the fastest growing errors in the first 48 hours (Buizza and
Palmer 1995). Stochastic perturbations are also added during the model integration to account
for the uncertainty in parameterized physical processes. Five additional singular vectors are
computed and perturbed in the six grid spaces enclosing each TC using a diabatic, adjoint
version of the ECMWF global atmospheric model at TL42 spectral truncation with 42 vertical
levels (Barkmeijer et al. 2001, Puri et al. 2001). The 15-day VarEPS 00UTC forecasts for the
period 1 January 2007 to 31 December 2010 were obtained through the THORPEX Interactive
Grand Global Ensemble (TIGGE; http://tigge.ecmwf.int/) project. The North Indian Ocean
domain covers the region 0–30°N, 40–110°E, and the fields have a horizontal resolution of
0.25° x 0.25°.
Footnote 1: A complete description of these changes may be found at: http://www.ecmwf.int/products/data/technical/model_id/index.html.
b) Tropical Cyclone Tracking Scheme
To isolate tropical cyclones in the VarEPS analysis and forecast fields, a modified version of
the Suzuki-Parker tracking scheme is used (Holland et al. 2010). This tracking scheme uses
the following variables: 850 hPa relative vorticity, mean sea level pressure, 10 m
winds, and geopotential heights at 925, 850, 700, 500, and 200 hPa. To increase the maximum
intensity retrievals from the VarEPS, 10 m winds are replaced with winds averaged in the lower
troposphere (10 m, 925 hPa, and 850 hPa). The tracking scheme processes each ensemble
member for tropical cyclones by first identifying candidate vortices that exhibit a local minimum
in mean sea level pressure. This initial set of vortices is filtered by removing systems that do not
have a maximum lower tropospheric wind speed greater than 16 kts and 850 hPa relative
vorticity greater than 1 x 10^-4 s^-1. The next component of the tracking scheme confirms that each
identified vortex possesses a warm core as defined using the Hart (2003) phase-space method.
After the tracking scheme has been implemented for each ensemble member, the land-sea mask
from the VarEPS is used to remove any tracks that originate over land unless the ensemble
tracks are within 300 nm of an observed tropical cyclone. Furthermore, all ensemble forecast
tracks must have a lifetime of at least one day. The modified tracking scheme still has some
deficiencies, especially in localized regions where vorticity and winds in the lower troposphere
exceed tracking thresholds. These exceedances may occur in association with topography and
with the Somalian easterly jet over northeastern Africa. Accordingly, because a large number
of spurious forecast tracks originate within the region 10–17°N, 40–55°E, any candidate
tracks that emerge from this region are automatically removed unless they are associated
with an observed tropical cyclone.
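The post-processing filters described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the tracking scheme itself: the `Vortex` record, its field names, and `keep_track` are hypothetical; the warm-core test is reduced to a precomputed flag; the 300 nm proximity test and the "associated with an observed tropical cyclone" test are collapsed into one flag; and the exclusion box is taken to lie at 40–55° east, inside the study domain.

```python
from dataclasses import dataclass

# Thresholds quoted in the text.
MIN_WIND_KT = 16.0        # maximum lower-tropospheric wind speed (kts)
MIN_VORTICITY = 1.0e-4    # 850 hPa relative vorticity (s^-1)
MIN_LIFETIME_HR = 24.0    # a track must last at least one day
BOX_LAT = (10.0, 17.0)    # exclusion box, degrees north
BOX_LON = (40.0, 55.0)    # exclusion box, degrees east (assumed)


@dataclass
class Vortex:
    """Hypothetical summary of one candidate track from one ensemble member."""
    lat: float              # starting latitude, degrees north
    lon: float              # starting longitude, degrees east
    max_wind_kt: float
    vorticity: float        # s^-1
    warm_core: bool         # outcome of a phase-space test done elsewhere
    starts_over_land: bool
    lifetime_hr: float
    near_observed_tc: bool  # stands in for both proximity tests


def in_exclusion_box(v):
    """True if the track starts inside the spurious-track region."""
    return (BOX_LAT[0] <= v.lat <= BOX_LAT[1]
            and BOX_LON[0] <= v.lon <= BOX_LON[1])


def keep_track(v):
    """Apply the filters in the order given in the text."""
    if v.max_wind_kt <= MIN_WIND_KT or v.vorticity <= MIN_VORTICITY:
        return False
    if not v.warm_core:
        return False
    if v.lifetime_hr < MIN_LIFETIME_HR:
        return False
    # Land-origin and exclusion-box tracks survive only near an observed TC.
    if (v.starts_over_land or in_exclusion_box(v)) and not v.near_observed_tc:
        return False
    return True
```

Keeping each criterion as a separate early return makes it easy to count how many candidate tracks each filter removes.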
c) Filtering Tropical Cyclone Forecast Tracks and Determining False Alarms
After applying the tropical cyclone tracking scheme and post-processing routines to the
VarEPS, a tropical cyclone filtering algorithm is then used to determine which ensemble track
forecasts are false alarms versus those associated with an observed tropical cyclone. The
essential information is the initial location and time of the ensemble track forecast, which is
compared to observed locations and times of tropical cyclones. If the forecast track precedes or
follows the date of tropical cyclone genesis, defined as the date when the first tropical depression
advisory is issued by the Joint Typhoon Warning Center (JTWC), then the forecast track is
classified as a pre-genesis or post-genesis forecast track, respectively. To determine the
clustering of pre-genesis tracks, the initial location and time of the tropical cyclone when genesis
occurred is used as the observation point. To determine the post-genesis clustering, the
observation point is the location and time of the tropical cyclone based on the initialized date of
the VarEPS forecast. Initially, the distance and time threshold used for defining an ensemble
forecast track as a tropical cyclone forecast track is ‘loose’, such that an ensemble track forecast
that is within 1500 nm and within 15 days of the observed TC is considered a ‘tropical cyclone
forecast track’.
The pre-genesis ensemble tracks for all tropical cyclones during the period 2007–2010 are
then used to determine ‘strict’ distance and time separation thresholds, which replace the set of
‘loose’ thresholds used initially. In this case, the thresholds that were derived include an initial
distance threshold of 500 nm, which increases by 100 nm for every 24 hrs of forecast lead-time.
In addition, the initial temporal separation threshold is ± 5 days with the upper threshold
increasing by one day for every 24 hrs of additional forecast lead-time.
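As a rough sketch of how the ‘strict’ thresholds could be applied, the Python below computes the lead-time-dependent limits and tests one ensemble track against an observation point. Several details are assumptions rather than the paper's stated method: the thresholds are taken to grow continuously with lead-time rather than in 24-hr steps, separation is measured as great-circle (haversine) distance, the time offset is defined as track start time minus observed time in days, and the function names are hypothetical.

```python
import math

EARTH_RADIUS_NM = 3440.065  # mean Earth radius in nautical miles


def great_circle_nm(lat1, lon1, lat2, lon2):
    """Haversine distance in nautical miles; inputs in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlat = p2 - p1
    dlon = math.radians(lon2 - lon1)
    a = (math.sin(dlat / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_NM * math.asin(math.sqrt(a))


def strict_thresholds(lead_hr):
    """Return (max distance in nm, lower time bound, upper time bound in days)."""
    lead_days = lead_hr / 24.0
    max_dist_nm = 500.0 + 100.0 * lead_days  # 500 nm plus 100 nm per day
    return max_dist_nm, -5.0, 5.0 + lead_days  # only the upper bound grows


def is_tc_track(track_lat, track_lon, offset_days, obs_lat, obs_lon, lead_hr):
    """True if a track start falls inside both the distance and time limits."""
    max_dist, lower, upper = strict_thresholds(lead_hr)
    dist = great_circle_nm(track_lat, track_lon, obs_lat, obs_lon)
    return dist <= max_dist and lower <= offset_days <= upper
```

For example, at a 48-hr lead-time `strict_thresholds(48)` gives a 700 nm distance limit and a time window of -5 to +7 days.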
Upon determining which ensemble forecast tracks are TC tracks, there is an implication that
all other ensemble forecast tracks must be false alarms. Using this definition of a false alarm
track leads to the conclusion that >50% of all VarEPS forecast tracks for the period 2007–2010
are actually false alarms. The major problem with this false alarm definition is that it does not
take advantage of the probabilistic framework of the VarEPS, where for a particular forecast the
distribution of all ensemble track forecasts provides a probabilistic estimate for the potential
formation and movement of a tropical cyclone. Therefore, we have developed a false-alarm
clustering algorithm such that a false alarm occurs when a cluster of VarEPS ensembles produces
a localized set (in space and time) of forecast tracks. The method uses the tropical cyclone
filtering routine discussed earlier, as well as k-means clustering (Mirkin 1996), which is an
algorithm designed to divide a set of points (or in this case initial ensemble forecast coordinates)
into k clusters whose membership is based on minimization of the distance between each
ensemble’s forecast point and each cluster’s mean or centroid location.
The procedure used to determine the number and location of the false alarm clusters for a
particular forecast is shown schematically in Figure. First, a probability threshold is assigned
such that a false alarm cluster occurs when the normalized number of ensemble tracks within a
cluster is in excess of the false alarm probability. Next, all ensemble tracks for a particular
forecast are compared with the set of ensemble forecast tracks that were defined as either
pre-genesis or post-genesis tropical cyclone tracks. Those ensemble tracks that are not in the set of
tropical cyclone forecast tracks become the ‘candidate false alarms’ creating a new set of
ensemble forecast tracks for further analysis. Thereafter, the clustering routine begins first with
the assumption that all ensemble tracks belong to the same set, so that an ensemble mean initial
location and forecast time is defined. This coordinate information then becomes the ‘observation
point’ that the tropical cyclone filtering routine uses to determine which ensemble forecast tracks
are within the distance and time thresholds of the cluster’s ensemble mean starting time and
initial location. After a subset of ensemble forecast tracks is identified, if the total number of
forecast tracks within this set is in excess of the false alarm probability threshold, then the
process described previously begins again except that the number of clusters is increased by one
(k = 1 → k = 2), so that now, two k-means clusters are created from the initial set of ensemble
forecast tracks. Each cluster’s coordinate information is then used in the tropical cyclone
filtering routine. If ensemble track membership of both clusters is in excess of the false alarm
probability threshold, then the complete process is repeated except that three clusters (k = 3) are
defined. This iterative process continues until the number of ensemble tracks for any cluster
drops below the false alarm probability threshold. When this occurs, the finalized number of
false alarm clusters is determined to be k - 1 clusters.
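The iterative procedure above might be sketched as follows. This is one possible reading, not the authors' implementation, and it rests on several assumptions: each candidate false alarm is reduced to a start point `(x, y, t)`; k-means uses plain Euclidean distance on those mixed-unit coordinates with a deterministic farthest-point start; a cluster's track count includes only its own members that also pass the caller-supplied `within` test; and counts are normalized by the number of ensemble members. All function names are hypothetical.

```python
import math


def assign(points, centroids):
    """Group points by the index of their nearest centroid."""
    groups = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)),
                      key=lambda i: math.dist(p, centroids[i]))
        groups[nearest].append(p)
    return groups


def kmeans(points, k, iters=50):
    """Lloyd's k-means with a deterministic farthest-point start."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(
            points, key=lambda p: min(math.dist(p, c) for c in centroids)))
    for _ in range(iters):
        groups = assign(points, centroids)
        # Move each centroid to its group mean; keep it if the group is empty.
        updated = [tuple(sum(col) / len(g) for col in zip(*g)) if g else c
                   for g, c in zip(groups, centroids)]
        if updated == centroids:
            break
        centroids = updated
    return centroids, assign(points, centroids)


def count_false_alarm_clusters(starts, n_members, prob_threshold, within):
    """Increase k until some cluster falls to or below the threshold.

    within(point, centroid) applies the distance and time thresholds.
    Returns k - 1, the last k at which every cluster passed.
    """
    k = 1
    while k <= len(starts):
        centroids, groups = kmeans(starts, k)
        fractions = [sum(within(p, c) for p in g) / n_members
                     for g, c in zip(groups, centroids)]
        if min(fractions) <= prob_threshold:
            break
        k += 1
    return k - 1
```

One consequence of this reading is that two well-separated groups of tracks can fail the k = 1 test, because the single centroid falls between them, so the result depends on how generous the distance and time thresholds are.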
3. Results and Discussion
In Section 3a we present an analysis of how well the VarEPS predicts tropical cyclone
formation in the North Indian Ocean for the period 2007–2010. The analysis is accomplished by
evaluating the probability of detection and the false alarm rate as a function of forecast lead-time
and increasing probability threshold. We also include an evaluation of how well the pre-genesis
tropical cyclone forecast tracks and intensities from the VarEPS perform relative to observations.
In Section 3b, a traditional evaluation of forecast skill (i.e. post-TC genesis) is presented through
an examination of the track and intensity error statistics. Finally, in Section 3c we examine the
basin-wide predictability of regional TC activity in the North Indian Ocean by evaluating the
VarEPS forecasts for the Arabian Sea and the Bay of Bengal.
a) Tropical Cyclone Formation and Pre-genesis Forecasts
To illustrate some of the forecast information that is contained within the VarEPS forecasts
for tropical cyclones, Figurea shows the spatial distribution of tropical cyclone forecast tracks for
Severe Cyclone Nargis from the VarEPS initialized on 23 April 2008 00UTC, which is about
four days prior to the initiation of tropical depression advisories by the JTWC. In this case, the
VarEPS forecasts are in good agreement that the pre-cyclonic vortex that would become Nargis
would reach advisory criteria around 27 April 2008 in the central Bay of Bengal. Thereafter, the
tropical cyclone is forecast to move generally towards the east-northeast on a track that would
cause the system to make landfall in Myanmar around 30 April or 1 May with a high (60%+)
probability as a hurricane and a very low (5%) probability as a severe cyclone/major hurricane.
In reality, Nargis intensified to category 4 hurricane level on the Saffir-Simpson intensity scale
with maximum sustained winds of 115 kts. The system made landfall in southern Myanmar at
12UTC on 2 May, propagating eastward across the Irrawaddy delta (Webster 2008). The forecast
performance of the VarEPS for Severe Cyclone Nargis is exceptional in nearly all facets of
prediction: tropical cyclone genesis was forecast correctly eight days in advance, ensemble mean
track errors never exceeded 375 nm even based on pre-genesis track forecasts, and maximum
intensity forecasts, although underestimated, indicated a moderate (30%+) probability of Nargis
reaching hurricane intensity nearly six days in advance of TC formation.
A more systematic study of all tropical cyclones in the North Indian Ocean is now conducted
to determine whether the Nargis forecast performance is a distinguishing characteristic of the
VarEPS. Figure 3 is a relative operating characteristic (ROC) evaluation of the VarEPS
forecasts for NIO tropical cyclone genesis using the metrics of probability of detection (POD;
also known as hit rate) and false alarm rate (FAR; also known as probability of false detection)
for all 23 tropical cyclones occurring during the 2007–2010 period. The POD is the
fraction of observed tropical cyclones for which TC genesis was forecast correctly relative to all
observed TCs. The FAR is the number of forecasts where TC genesis was forecast but did not
occur (false alarms) relative to the sum of the false alarms and the correct rejections (i.e. TC
genesis was not forecast to occur and it did not occur). The false alarm rate should not be confused with the false
alarm ratio, which is the proportion of all forecasts of TC formation for which formation
did not occur (Barnes et al. 2009). Since the false alarm rate is a function not only of the false alarms
but also the number of correct rejections, the number of VarEPS forecasts included in this
evaluation will modulate the false alarm rate mainly through the number of correct rejections. If
all VarEPS forecasts from 2007–2010 are included, the false alarm rate as a function of forecast
probability threshold is significantly lower than if only the months traditionally associated with
NIO TC activity are included. Therefore, for this analysis, the VarEPS evaluation is restricted to
the months of April to June and August to December for the 2007–2010 period.
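The three verification metrics distinguished above follow directly from a 2 x 2 contingency table. The short Python sketch below uses hypothetical function names, and the counts in the usage note are invented purely to show the arithmetic; they are not results from this study.

```python
def probability_of_detection(hits, misses):
    """POD (hit rate): hits relative to all observed events."""
    return hits / (hits + misses)


def false_alarm_rate(false_alarms, correct_rejections):
    """FAR (probability of false detection): false alarms relative to
    all occasions on which the event did not occur."""
    return false_alarms / (false_alarms + correct_rejections)


def false_alarm_ratio(hits, false_alarms):
    """False alarm ratio: false alarms relative to all forecasts of the event."""
    return false_alarms / (hits + false_alarms)
```

With 20 false alarms and 180 correct rejections the false alarm rate is 0.10; adding another 200 correct rejections (for example by including inactive months) halves it to 0.05 without any change in the forecasts themselves, while the false alarm ratio is unaffected.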
The POD and FAR statistics are also sensitive to the time window that is used for verification.
For instance, if one were to verify all one-day lead-time TC genesis forecasts, but did not require
that the VarEPS correctly forecast the actual date and time of TC genesis (i.e. the full period of
the VarEPS integration is used), the POD (FAR) would be much larger (smaller) than if a short
window centered on the time of TC genesis is required. Figure 3a-c show the POD and FAR of
the VarEPS forecasts using 48-hr, 96-hr, and 360-hr time windows calculated at a 10% forecast
probability interval ranging from 0 to 100%. Using the 48-hr time window (Figure 3a), the
VarEPS forecasts for lead-times of one to five days in advance exhibit moderate probabilities of
detection (0.4–0.7) with very low false alarm rates (0.1–0.2) mainly for forecast probability
thresholds of 10–40%. Although the false alarm rate never exceeds 0.2 even at extended
lead-times, VarEPS forecasts made over seven days in advance tend to have low POD once forecast
probability thresholds increase beyond 20%. This observation reflects a decreased frequency of
VarEPS forecasts at long lead-times where the forecast probability exceeds 20%, which is in
contrast with the occurrence frequency of higher forecast probability levels at shorter lead-times
(not shown). If a 96-hr time window (Figure 3b) is used, the probability of detection (false alarm
rate) of the VarEPS on average increases by about 0.2 (0.1) for forecasts of TC genesis made less
than seven days in advance. Forecasts greater than seven days in advance benefit even more
from the increase in time window, as the POD for a ten-day lead-time forecast increases on
average by about 0.15, and the FAR increases by only 0.05 for a forecast probability threshold of
10–30%. Finally, using the full 15-day period of the VarEPS to define the time window of TC
genesis (Figure 3c), the average POD exceeds 0.7 and the FAR ranges from 0.3 to 0.6 based on
forecast probability thresholds of 10–40%.
A key component of any operational forecasting system for TC genesis is knowing what
combination of forecast lead-time and time window maximizes the probability of detection while
minimizing the false alarm rate. The area under the ROC curve, known as the ROC score, varies
from 0 to 1, where 0.5 indicates no forecast skill, and a value of 1 indicates a perfect forecast
system (Mason and Graham 1999). If the ROC score is calculated as a function of forecast
lead-time for various time windows, the optimum time window is found by determining at which time
window the ROC score obtains a maximum value. For TC genesis forecasts with lead-times of
one to five days in advance, a 48-hr time window around the forecast date of genesis maximizes
the probability of detection while minimizing the false alarm rate. For TC genesis forecasts
beyond a lead-time of five days, the ROC score is maximized if the full 15 days of the VarEPS
integration is used to determine the forecast time of TC genesis.
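A ROC score of this kind can be approximated from a set of (FAR, POD) pairs, one per forecast probability threshold, by trapezoidal integration. The Python sketch below is an illustration only: the function name is hypothetical, and closing the curve with the end points (0, 0) and (1, 1) is an assumption about how the curve is completed, not a detail taken from the text.

```python
def roc_score(far_values, pod_values):
    """Trapezoidal area under the ROC curve (POD plotted against FAR)."""
    points = sorted(zip(far_values, pod_values))
    # Close the curve at the trivial "never forecast" and "always forecast" ends.
    points = [(0.0, 0.0)] + points + [(1.0, 1.0)]
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area
```

Points lying on the diagonal (POD equal to FAR) give a score of 0.5, and a single point at FAR = 0, POD = 1 gives a score of 1. Repeating the calculation for each candidate time window at a fixed lead-time and taking the window with the largest score corresponds to the selection described above.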
The following analysis allows a greater understanding of the NIO false alarms that occur in
the VarEPS. The spatial distribution of each false alarm cluster’s starting centroid location from
2007 to 2010 is shown in Figure 4a. Although the false alarms in the Arabian Sea and southern
Bay of Bengal tend to be distributed uniformly during the 2007–2010 period, there is a relatively
high concentration of false alarm clusters that stretch from the northwestern Bay of Bengal into
the extreme northeastern Arabian Sea. The false alarm clusters in the northwestern Bay of
Bengal tend to occur in a localized region with a large gradient in topography between the Bay
of Bengal and the northern portion of the Eastern Ghats mountains. This concentration of false
alarms may be defining the preferential track of pre-existing cyclonic vortices that move through
the Bay of Bengal and into India but never become TCs. The localized nature of this false alarm
concentration may indicate a relationship between current convective parameterization schemes,
topography, and the forecast frequency of TC genesis in the VarEPS.
Figure 4b shows the false alarm ratio for the period 2007–2010 with Figure 4c indicating the
probability density function of all false alarms as a function of forecast lead-time. The false
alarm ratio obtains a peak value around 0.5 for a forecast probability level of 10% and decays
nearly exponentially as the forecast probability threshold increases. At a forecast probability
level of 25%, the false alarm ratio and the forecast probability level are equivalent. Figure 4c
shows that nearly half of all false alarms during the 2007–2010 period occurred at a forecast
lead-time of 96 to 240 hours. Relative to the period 2007–2009, the distribution of false alarms
for 2010 shifts to earlier lead-times with the mean forecast lead-time decreasing from 153 to 93
hours.
We now evaluate how well the VarEPS forecasts for track and maximum intensity perform
prior to TC genesis. Figure 5a-b show the total track error (in nm) and absolute wind error (in
kts) as a function of lead-time for all pre-genesis VarEPS forecasts during the period 2007–2010.
The average ensemble mean track error at a lead-time of 24 hrs is 69 nm with a 50% interval of
42–86 nm, and at 120 hrs it is 241 nm with a 50% interval of 116–324 nm. From a lead-time of
24 hrs to 240 hrs, the mean ensemble track error growth is nearly linear at 41 nm per day, so that
by a lead-time of 240 hrs, the total mean ensemble error is 409 nm with a 50% interval of
184–504 nm. From a lead-time of 240 hrs to 360 hrs, the track error growth plateaus to an error
of about 443 nm; however, this lack of error growth at extended lead-times is likely due to the
decreasing number of pre-genesis VarEPS forecasts relative to their higher forecast
frequency at lead-times less than 240 hrs. Figure 5a also shows that the VarEPS track forecasts
perform similarly regardless of year, indicating that even though the VarEPS has undergone
several major changes during the 2007–2010 period, there has not been a substantial change in
forecast track performance in the NIO. To place these pre-genesis track errors in perspective, the
India Meteorological Department’s post-genesis track errors at a lead-time of 48 hrs and 72 hrs
typically average 162 nm and 270 nm, respectively (footnote 2). This result suggests that the VarEPS can
provide a track forecast prior to TC formation, at a lead-time of 138 hrs, and achieve the same
track accuracy as the IMD’s 72-hr track forecasts, which are not provided until after a TC has
formed.
Footnote 2: http://www.imd.gov.in/section/nhac/dynamic/faq/FAQP.htm
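The comparison with the IMD's 72-hr track error can be checked with a short linear extrapolation. The Python sketch below uses only the rounded values quoted in the text (241 nm at 120 hrs, growth of 41 nm per day, and a target of 270 nm); the function name is hypothetical, and assuming strictly linear growth beyond 120 hrs is a simplification.

```python
def lead_time_for_error(target_nm, ref_lead_hr, ref_error_nm, growth_nm_per_day):
    """Lead-time (hrs) at which a linearly growing error reaches target_nm."""
    extra_days = (target_nm - ref_error_nm) / growth_nm_per_day
    return ref_lead_hr + 24.0 * extra_days


# Pre-genesis VarEPS error of 241 nm at 120 hrs, growing at 41 nm per day,
# compared with the 270 nm typical of 72-hr post-genesis IMD forecasts.
equivalent_lead_hr = lead_time_for_error(270.0, 120.0, 241.0, 41.0)
```

With these rounded inputs the equivalent lead-time comes out at roughly 137 hrs, within about an hour of the 138 hrs quoted; the small difference presumably reflects rounding of the published error values.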
Figure 5b shows the absolute wind error (in kts) for all pre-genesis ensemble forecasts during
the period 2007–2010. Forecasts with a lead-time of 24 hrs have a mean absolute wind error of
12 kts with a 50% interval of 4–16 kts, and by 120 hrs the mean absolute wind error
grows to 23 kts with a 50% interval of 7–29 kts. However, unlike the pre-genesis TC track
forecasts where the 95th percentile interval is approximately 2.5 times as large as the mean
ensemble track error at 120 hrs, the 95th percentile interval for absolute wind error is 3.4 times as
large, reflecting a substantial negative wind bias for several of the most intense tropical cyclones
during the 2007–2010 period (not shown). In terms of the interannual variation in forecast
performance of maximum intensity, Figure 5b indicates that the VarEPS forecasts for 2008–2010
have on average performed substantially better than the VarEPS forecasts from 2007, with 2008
and 2009 showing three times the improvement relative to 2007. This marked change in forecast
skill of maximum intensity is likely due to the horizontal and vertical resolution increase that
occurred after 2007 in the VarEPS.
Forecasting tropical cyclone formation requires an estimate of not only the likely location of
TC genesis, but also the time when a system is likely to reach advisory thresholds. Figure 5c
shows the relative error (in days) in the VarEPS’s forecast timing of TC genesis. Positive values
indicate the VarEPS TC genesis forecasts occur sooner than observations, while negative values
indicate a later genesis date. For a lead-time of 24 hours, the VarEPS mean ensemble error is
-0.5 days with a 50% interval of -1 to 0.5 days. By a lead-time of 120 hrs, the VarEPS mean
ensemble error exhibits a slight positive bias of 0.5 days with a 50% interval of -0.9 to 1.9 days.
After a lead-time of 168 hrs, the VarEPS mean ensemble error begins to increase more rapidly
such that at 240 hrs the VarEPS mean ensemble bias grows to 2.2 days with a 50% interval of
0.8 to 3.3 days.
A possible explanation for why the VarEPS’s forecasts for timing of TC genesis are
well-constrained through a lead-time of 168 hrs is the dispersion or spread among the VarEPS’s
ensembles. Figure 5d shows the distribution of ensemble spread in genesis time for each TC
from 2007–2010 as a function of forecast lead-time. In this case, the ensemble spread is
calculated as the difference in time (in days) of TC genesis between the earliest and latest
ensemble member. As shown in Figure 5d, the ensemble spread in TC genesis time grows
rapidly for the first seven days of forecast lead-time, then reaches a time limit plateau of around
12 days after a forecast lead-time of 168 hrs. This limit is likely in response to the integration
limit of 15 days imposed by the VarEPS and could explain why the VarEPS begins showing
significant positive biases in the time of TC genesis relative to observations after 168 hrs (cf.
Figure 5c).
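The timing error and ensemble spread used in Figures 5c and 5d can be written compactly. In the Python sketch below, the function names are hypothetical, genesis times are expressed as days since a common reference, and the sign convention follows the text (positive when the forecast genesis is sooner than observed).

```python
def genesis_timing_error(observed_day, forecast_days):
    """Mean of (observed - forecast) genesis time in days.

    Positive values mean the ensemble forecasts genesis sooner than observed.
    """
    return sum(observed_day - f for f in forecast_days) / len(forecast_days)


def genesis_spread(forecast_days):
    """Spread in days between the latest and earliest forecast genesis times."""
    return max(forecast_days) - min(forecast_days)
```

For instance, members forecasting genesis on days 8, 9, and 10 for a storm that formed on day 10 give a mean timing error of +1 day and a spread of 2 days.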
b) Post-genesis Tropical Cyclone Forecasts
Figure 6a shows the distribution of all VarEPS track forecasts during the 2007–2010 period
indicating how the total track error statistic varies as a function of lead-time once TC genesis has
occurred. From Figure 6a it is seen that at a lead-time of 24 hrs, the mean track error is 72 nm
with a 50% interval of 36–105 nm. From 24 to 120 hrs, the VarEPS track errors increase
linearly at a rate of 53 nm per day such that by a lead-time of 120 hrs, the mean track error is 287 nm
with a 50% interval of 162–332 nm. Similar to the VarEPS pre-genesis track forecasts, the
VarEPS post-genesis track errors show no significant improvement in annual track performance
for the period 2007–2010.
To place the VarEPS’s post-genesis track forecasts in perspective, Figure 6b compares the
VarEPS control and ensemble mean forecasts with forecasts from other sources, including the
JTWC, the U.S. Navy’s version of the GFDL (GFDN), the United Kingdom Meteorological
Office’s global model (UKMET), the National Centers for Environmental Prediction’s (NCEP) global
forecast system model (GFS), and the U.S. Navy’s NOGAPS model. Since these forecasts were
obtained through the U.S. Navy’s Automated Tropical Cyclone Forecasting System, most of the
forecast guidance is limited to 72 hrs, similar to the temporal limit of the JTWC’s forecasts prior
to 2010. Although the VarEPS control and ensemble mean forecasts on average begin with the
largest initial track error, from 12 hrs onward the VarEPS control and ensemble mean
exhibit the lowest track errors of all model forecasts. In addition, the VarEPS
control and ensemble mean on average exhibit slightly lower track errors than the JTWC through
a lead-time of 72 hrs, although this difference is not statistically significant at the 95%
confidence level. Relative to the next best performing forecast model, the VarEPS ensemble
mean’s 24-hr, 48-hr, and 72-hr track forecast error is on average 10%, 19%, and 27% smaller
than NCEP’s GFS. Since the linear track error growth per day for the VarEPS ensemble mean
(41 nm/day) is considerably smaller than the GFS (66 nm/day), greater track forecast utility is
obtained at longer lead-times with the VarEPS in comparison to other model forecasts.
Figure 6c shows the distribution of absolute wind error for all VarEPS post-genesis forecasts
during the 2007–2010 period. At the analysis time (0 hrs), the mean absolute wind error of all
VarEPS forecasts is high at 21 kts with a 50% interval of 13–48 kts. By a lead-time of 72 hrs
the mean absolute wind error reaches 32 kts with a 50% interval of 8–54 kts, and by 108 hrs the
VarEPS mean intensity forecasts reach their maximum wind error of 41 kts with a 50% interval
of 21–52 kts. Unlike the intensity error statistics for the pre-genesis VarEPS forecasts, the
interannual variation of post-genesis intensity forecasts reflects a more substantial improvement
for the 2008–2010 period compared to 2007. Using the first 72 hrs of lead-time as a reference,
an average improvement of 67% relative to 2007 is evident. Finally, Figure 6d compares the
VarEPS control and ensemble mean absolute wind error to other forecast models and the JTWC.
Generally, for the 2007–2010 period, the VarEPS begins with much higher initial error than any
other forecast model but also shows the lowest intensity error growth through the first 72 hrs. If
2007 is excluded from the analysis, the VarEPS control and ensemble mean perform better than
any other global numerical weather model (GFS, NOGAPS, UKMET) and similarly to the
high-resolution GFDN model and the JTWC’s operational forecasts.
One rationale for not issuing tropical cyclone forecasts prior to TC genesis is the notion that
pre-genesis track forecasts have significantly larger errors than track forecasts for systems that
have already formed. Although a large number of pre-genesis and post-genesis forecasts is
needed to test this hypothesis robustly, the sample of around 100 VarEPS mean forecasts from
2007–2010 for the North Indian Ocean is sufficient for this purpose. Figure 7a shows the
probability distribution function of pre-genesis and post-genesis VarEPS track forecast errors
for lead-times less than 48 hours. Consistent with the hypothesized notion of track forecasting
performance, the VarEPS forecasts made after TC genesis are on average more skillful at
predicting track movement (median track error of 62 nm) than forecasts made prior to tropical
cyclone formation (median track error of 75 nm), with the difference being statistically
significant at the 90% confidence level. Similar to Figure 7a, Figure 7b shows the probability
distribution function of total track error for the VarEPS forecasts, but at longer lead-times (48
to 120 hours). Unlike the first 48 hours, pre-genesis TC track forecasts on average exhibit lower
total track errors at these extended lead-times (median track error of 136 nm versus 171 nm), a
difference that is statistically significant at the 99% confidence level. This difference between
pre-genesis and post-genesis forecasts involves a systematic, statistically significant shift in
the first moment of the distributions. However, there is also a narrowing of the VarEPS
post-genesis track error distribution, which implies that there is greater dispersion among the
VarEPS ensembles prior to TC genesis.
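One way to assess whether two median track errors differ is a bootstrap on the difference in medians. The sketch below is illustrative only: the text does not specify which significance test was applied to the pre-genesis and post-genesis samples, the function name is hypothetical, and the two error samples are synthetic placeholders rather than VarEPS data.

```python
import random
import statistics


def bootstrap_median_difference(sample_a, sample_b, n_boot=2000, alpha=0.10, seed=0):
    """Return (observed, lo, hi) for median(sample_a) - median(sample_b).

    lo and hi bound the central (1 - alpha) bootstrap interval; the difference is
    treated as significant at that level when the interval excludes zero.
    """
    rng = random.Random(seed)
    observed = statistics.median(sample_a) - statistics.median(sample_b)
    diffs = []
    for _ in range(n_boot):
        # Resample each group independently, with replacement.
        resample_a = rng.choices(sample_a, k=len(sample_a))
        resample_b = rng.choices(sample_b, k=len(sample_b))
        diffs.append(statistics.median(resample_a) - statistics.median(resample_b))
    diffs.sort()
    lo = diffs[int((alpha / 2) * n_boot)]
    hi = diffs[int((1 - alpha / 2) * n_boot) - 1]
    return observed, lo, hi


# Synthetic track errors in nm (placeholders, not values from this study).
pre_genesis_errors = list(range(100, 150))
post_genesis_errors = list(range(10, 60))

observed, lo, hi = bootstrap_median_difference(pre_genesis_errors, post_genesis_errors)
print(observed, lo > 0)
```

With alpha set to 0.10 the interval corresponds to the 90% confidence level; setting alpha to 0.01 gives the 99% level.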
A number of factors might be causing this rather counterintuitive longer lead-time result for
pre-genesis TC tracks. One possible explanation is that even though ensemble spread is
generated through moist singular vectors introduced into a 5° × 5° region enclosing the TC
after genesis has occurred, this process produces less dispersion than simply modifying the
initial TC location prior to model integration, a process which implicitly occurs with forecasts
prior to TC formation. Therefore, since the pre-genesis track forecasts begin with larger track
dispersion, this larger initial spread results in a greater sampling of the forecast phase space
and lower track error growth rates, and allows the ensemble mean on average to perform better
than for forecasts with smaller initial dispersion (i.e., post-TC genesis).
c) Regional Outlooks of Tropical Cyclone Activity
We now consider the VarEPS’s ability to produce skillful regional outlooks of TC activity
(i.e., the probability that a vortex of tropical depression strength or greater will be located
within a region) by dividing the North Indian Ocean into two sub-domains: the Arabian Sea
and the Bay of Bengal. Figure 8a shows the ROC for the Arabian Sea using the VarEPS
forecasts from April–June and August–December of 2007–2010. Similar to the ROC analysis of
the VarEPS TC genesis forecasts, the VarEPS forecasts for TC activity in the Arabian Sea
exhibit moderate probabilities of detection with very low false alarm rates, even as the forecast
decision threshold decreases to lower probabilities. As forecast lead-time increases, a transition
in forecast performance occurs after a lead-time of ten days. The ROC curve for forecasts with
lead-times of five to ten days has POD and FAR similar to those of forecasts at shorter
lead-times. However, comparing the ROC curve for lead-times of ten to fifteen days with that
for five to ten days reveals a large decrease in the probability of detection, although the false
alarm rate remains essentially constant. From this analysis it is unclear whether this decrease in
POD is due to an inherent lack of predictability at this longer time scale or depends on the
current configuration of the VarEPS, which has reduced horizontal resolution at lead-times
greater than 240 hrs. However, it is clear that the current configuration of the VarEPS is
incapable of generating forecast probabilities of TC activity in the Arabian Sea greater than
30% for lead-times of ten to fifteen days, which is one reason why the POD is so much lower
than for forecasts with lead-times less than ten days. Figure 8b shows the ROC for the Bay of
Bengal using the same set of VarEPS forecasts as in Figure 8a. Relative to the Arabian Sea, the
VarEPS’s forecasts of TC activity in the Bay of Bengal have lower probabilities of detection at
shorter lead-times but higher POD values at longer lead-times. Although forecasts in the Bay of
Bengal at lead-times of ten to fifteen days also exhibit slightly higher false alarm rates than
those in the Arabian Sea, the relative change in POD compared with FAR indicates that overall
the VarEPS forecasts are more skillful in the Bay of Bengal than in the Arabian Sea at extended
lead-times.
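To make the ROC construction concrete, the sketch below computes one (POD, FAR) point per forecast decision threshold from a 2 × 2 contingency table. It assumes the conventional definitions POD = hits / (hits + misses) and false alarm rate = false alarms / (false alarms + correct negatives); the function name and the forecast and observation values are hypothetical and are not drawn from the VarEPS verification.

```python
def roc_points(probs, observed, thresholds):
    """Return a list of (threshold, POD, FAR) tuples.

    probs: forecast probabilities of TC activity in a region (0 to 1)
    observed: 1 if TC activity was observed, 0 otherwise
    """
    points = []
    for thr in thresholds:
        hits = misses = false_alarms = correct_negatives = 0
        for p, o in zip(probs, observed):
            warned = p >= thr  # an event is forecast when the probability meets the threshold
            if warned and o:
                hits += 1
            elif warned and not o:
                false_alarms += 1
            elif o:
                misses += 1
            else:
                correct_negatives += 1
        n_events = hits + misses
        n_non_events = false_alarms + correct_negatives
        pod = hits / n_events if n_events else float("nan")
        far = false_alarms / n_non_events if n_non_events else float("nan")
        points.append((thr, pod, far))
    return points


# Hypothetical forecast probabilities and matching observations.
probs = [0.05, 0.10, 0.20, 0.35, 0.50, 0.70, 0.90, 0.15]
observed = [0, 0, 0, 1, 0, 1, 1, 0]

points = roc_points(probs, observed, [0.3, 0.6])
for thr, pod, far in points:
    print(f"threshold={thr:.1f}  POD={pod:.2f}  FAR={far:.2f}")
```

Lowering the threshold moves a point up and to the right along the ROC curve. A forecast system that never issues probabilities above a given threshold yields a POD of zero there, which is the behavior described for the Arabian Sea at lead-times of ten to fifteen days.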
To quantify how well the VarEPS forecasts of TC activity perform in the Arabian Sea and the
Bay of Bengal as a function of forecast lead-time, two skill score metrics are used: the Brier
skill score (BSS) and the ROC score. The BSS measures the accuracy or relative skill of a
forecast over climatology by comparing whether or not an event is forecast to occur relative to
observations. A BSS greater than zero implies forecast skill beyond climatology. In this
analysis, a 30-yr climatology (1980–2009) of tropical cyclones was developed from the JTWC
best-track dataset. Although the BSS metric reveals how skillful a forecast system is relative to
climatology, the BSS is regarded as a harsh forecast standard, as it can often hide useful
formation information even when the BSS is less than 0 (Mason 2004). Therefore, we use the
ROC score as another skill score metric. Table 1 provides the BSS and ROC scores for the
Arabian Sea and the Bay of Bengal based on the VarEPS forecasts from April–June and
August–December from 2007 to 2010 as a function of forecast lead-time. To establish statistical
significance at the 95% confidence level, a nonparametric bootstrap test was used. In this case,
the BSS (ROC score) is deemed statistically significant if its 95% confidence interval lies
entirely above 0 (0.50). From Table 1, the BSS metric indicates that the VarEPS forecasts for
TC activity in the Arabian Sea are skillful beyond climatology for forecasts less than two days,
two to five days, and five to ten days in advance. Beyond ten days, however, forecasts for TC
activity are not skillful relative to climatology. In contrast to the BSS metric, the ROC score is
well above 0.50 for all time periods, which indicates that the VarEPS provides skillful forecasts
of TC activity through fifteen days in the Arabian Sea.
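The two skill scores and the bootstrap significance check can be sketched as follows. This is a minimal illustration under stated assumptions: the BSS is taken as one minus the ratio of the forecast Brier score to the climatological Brier score, the ROC score is taken as the area under the ROC curve computed from pairwise comparisons of events and non-events, and the bootstrap resamples forecast cases with replacement. The function names and all numerical values are hypothetical and do not reproduce Table 1.

```python
import random


def brier_score(probs, obs):
    """Mean squared difference between forecast probability and outcome (0 or 1)."""
    return sum((p - o) ** 2 for p, o in zip(probs, obs)) / len(probs)


def brier_skill_score(probs, obs, clim_probs):
    """Skill relative to climatology: > 0 means better than the climatological forecast."""
    return 1.0 - brier_score(probs, obs) / brier_score(clim_probs, obs)


def roc_score(probs, obs):
    """Area under the ROC curve: chance that an event outranks a non-event (ties count half)."""
    events = [p for p, o in zip(probs, obs) if o]
    non_events = [p for p, o in zip(probs, obs) if not o]
    wins = sum((e > n) + 0.5 * (e == n) for e in events for n in non_events)
    return wins / (len(events) * len(non_events))


def bootstrap_interval(metric, cases, n_boot=1000, alpha=0.05, seed=0):
    """Central (1 - alpha) bootstrap interval of metric over resampled forecast cases."""
    rng = random.Random(seed)
    values = []
    while len(values) < n_boot:
        sample = rng.choices(cases, k=len(cases))
        n_events = sum(case[1] for case in sample)
        if 0 < n_events < len(sample):  # both outcomes must be present in the resample
            values.append(metric(*zip(*sample)))
    values.sort()
    return values[int(alpha / 2 * n_boot)], values[int((1 - alpha / 2) * n_boot) - 1]


# Hypothetical forecast probabilities, outcomes, and climatological probabilities.
probs = [0.9, 0.8, 0.7, 0.2, 0.1, 0.1, 0.3, 0.6]
obs = [1, 1, 1, 0, 0, 0, 0, 1]
clim = [0.5] * len(obs)

bss = brier_skill_score(probs, obs, clim)
roc = roc_score(probs, obs)
bss_lo, bss_hi = bootstrap_interval(brier_skill_score, list(zip(probs, obs, clim)))
roc_lo, roc_hi = bootstrap_interval(roc_score, list(zip(probs, obs)))
print(f"BSS={bss:.3f}, significant={bss_lo > 0}")
print(f"ROC score={roc:.2f}, significant={roc_lo > 0.5}")
```

Under this convention a score is flagged as significant only when the lower bound of its interval exceeds the no-skill value, 0 for the BSS and 0.50 for the ROC score.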
Table 1 also provides the BSS and ROC scores for TC forecasts in the Bay of Bengal.
According to the BSS, forecasts less than five days in advance in the Bay of Bengal are nearly
60% less skillful than in the Arabian Sea, which is likely due to the higher frequency of false
alarms in the Bay of Bengal (cf. Figure 4c and Figure 8a-b). However, consistent with the ROC
interpretation of Figure 8a-b, the BSS is positive and statistically greater than 0 at the 95%
confidence level through a lead-time of ten days. For the ten to fifteen day forecast period, the
BSS indicates that the VarEPS performs as well as climatology in the Bay of Bengal. Although
TC forecasts from the VarEPS in the Arabian Sea are more skillful than those in the Bay of
Bengal for lead-times less than ten days, this result does not hold at longer time scales, where
the Bay of Bengal forecasts have on average an 18% higher BSS. If the ROC score is used to
determine forecast skill, the VarEPS forecasts for TCs in the Bay of Bengal are skillful through
fifteen days, which is similar to the ROC score results for the Arabian Sea.
The results in Table 1 may be compared to a similar analysis that Belanger et al. (2010)
performed for the tropical North Atlantic using the ECMWF Monthly Forecast System during
the hurricane seasons of 2008 and 2009. They found that the most predictable region for TC
activity in the North Atlantic is the Main Development Region, where ROC scores for forecast
days 8–14 (15–21) are 0.81 (0.75). Relative to these findings, the results presented here indicate
similar predictability in the Bay of Bengal as in the North Atlantic’s Main Development Region,
with TC predictability in the Arabian Sea more characteristic of that of the Caribbean Sea at
extended forecast lead-times. Physically, the authors attribute the predictability at these time
scales to the ability of the VarEPS to predict accurately the magnitude of deep-layer (850–200
hPa) vertical wind shear as well as the correct frequency of pre-existing cyclonic vortices such
as African easterly waves. However, unlike the Caribbean Sea, where predictability is
modulated largely by the variability in intensity and location of the tropical upper-tropospheric
trough, predictability in the Arabian Sea is likely modulated by the combination of dry
environmental air and changes in deep-layer vertical wind shear associated with the onset and
end of the South Asian monsoon. In addition, Vitart (2009) along with Belanger et al. (2010)
show that regional TC predictability in the tropical Atlantic is strongly modulated by the phase
and amplitude of the Madden-Julian Oscillation (MJO). Given the more pronounced impact that
the MJO has in the NIO (Webster and Hoyos 2004; Hoyos and Webster 2007), we expect this
finding to be even more applicable to the results presented here.
4. Summary and Conclusions
The performance of the VarEPS in forecasting tropical cyclones in the North Indian Ocean
has been examined. An evaluation of the VarEPS forecasts for TC genesis from 2007–2010 was
conducted, in which the VarEPS is shown to have low false alarm rates and moderate to high
probabilities of detection for lead-times through seven days. The VarEPS TC genesis forecast
performance is sensitive to the time window that is used to define whether or not an event is
forecast to occur. Based on an optimization procedure to achieve the highest ROC score (i.e.,
maximizing probability of detection and minimizing the false alarm rate), the optimum
forecasting combination to predict TC genesis is a 48-hr time window for forecast lead-times
through five days. Thereafter, the full time period of the VarEPS integration should be used to
generate TC genesis forecasts five to ten days in advance.
An evaluation of the pre-genesis and post-genesis track and maximum intensity forecasts
from the VarEPS was presented, in which the total track errors for pre-genesis forecasts are
shown to be comparable to those of the track forecasts produced after TC genesis. For
lead-times of 24 to 240 hours, the total mean track error grew at a rate of 41 nm per day, such
that by a lead-time of 120 hours (240 hours) the average track error of all VarEPS forecasts is
241 nm (409 nm) with a 50% interval of 116–324 nm (184–504 nm). The performance of these
track forecasts prior to TC genesis is remarkable considering that the India Meteorological
Department’s average 72-hr forecast track error is 500 km (270 nm), and these forecasts are
issued only after TC genesis has occurred. In terms of post-genesis TC track forecasting, the
VarEPS track errors at a lead-time of 24 hrs average 72 nm with a 50% interval of 36–105 nm,
and by 120 hours the mean track error is 287 nm with a 50% interval of 162–332 nm. Although
the VarEPS control and ensemble mean forecasts on average start with the largest initial track
error when compared to other forecast models (i.e., NOGAPS, UKMET, GFS, GFDN), from
12 hrs onward they have the lowest track errors of all the model forecasts. In fact, the VarEPS
control and ensemble mean on average exhibit slightly lower track errors than the JTWC
through a lead-time of 72 hrs. Since these forecasts are not bias-adjusted to account for the
difference between the starting location of the observed TC and the initialized location in the
VarEPS model, additional statistical post-processing steps could be applied to significantly
lower the average track errors at extended lead-times.
In addition, the VarEPS forecasts were translated into regional outlooks to provide the
likelihood of TC activity in the Arabian Sea and the Bay of Bengal. Skill score metrics including
the Brier skill score and the relative operating characteristic score were used to evaluate the
VarEPS forecasts. The BSS indicates that the VarEPS TC forecasts for the Arabian Sea and Bay
of Bengal are skillful relative to climatology through ten days, whereas the ROC score shows
that TC activity in both regions is predictable through two weeks.
Based on this evaluation of the VarEPS TC forecasts, we recommend that warning agencies
in the NIO begin providing a probabilistic TC formation outlook that assesses the potential for
TC development through a lead-time of seven days. When the probability of formation is at
moderate (30–60%) levels, the VarEPS’s probability of detection will average around 60% with
a false alarm rate of about 30% for a lead-time of seven days. In addition, since the distribution
of the VarEPS forecasts provides a dynamical measure of the forecast uncertainty in the
atmosphere’s future state, some TCs will be more predictable than others. Therefore, we also
recommend including a probabilistic track and intensity outlook that is developed based on the
distribution of VarEPS forecast tracks. Although the VarEPS tends to be slightly
underdispersive at longer forecast lead-times (Majumdar and Finocchio 2010), some additional
statistical post-processing steps could be incorporated to ensure that the final forecast track and
intensity probabilities are well-conditioned relative to observations.
Acknowledgments
The authors would like to thank Mark Jelinek for his helpful suggestions during manuscript
preparation and the TIGGE project for supplying the ECMWF VarEPS data. Also, we
appreciate continuing interactions with Dr. Frederic Vitart and his colleagues at ECMWF. The
Climate Dynamics Division of the National Science Foundation under grant NSF 0826909
provided funding support for this research.
References
Barkmeijer, J., R. Buizza, T. N. Palmer, K. Puri, and J.-F. Mahfouf, 2001: Tropical singular
vectors computed with linearized diabatic physics. Quart. J. Roy. Meteor. Soc., 127, 685–708.
Barnes, L. R., D. M. Schultz, E. C. Gruntfest, M. H. Hayden, and C. C. Benight, 2009:
Corrigendum: False alarm rate or false alarm ratio? Wea. Forecasting, 24, 1452–1454,
doi:10.1175/2009WAF2222300.1.
Belanger, J. I., J. A. Curry, and P. J. Webster, 2010: Predictability of North Atlantic tropical
cyclones on intraseasonal time scales. Mon. Wea. Rev., 138, 4393–4401.
Buizza, R., and T. N. Palmer, 1995: The singular vector structure of the atmospheric global
circulation. J. Atmos. Sci., 52, 1434–1456.
Hart, R. E., 2003: A cyclone phase space derived from thermal wind and thermal asymmetry.
Mon. Wea. Rev., 131, 585–616.
Holland, G. J., J. Done, C. Bruyere, C. Cooper, and A. Suzuki-Parker, 2010: Model
investigations of the effects of climate variability and change on future Gulf of Mexico TC
activity. Offshore Technology Conference, 20690, 1–13.
Hoyos, C. D., and P. J. Webster, 2007: The role of intraseasonal variability in the nature of
Asian monsoon precipitation. J. Climate, 20, 4402–4424.
Majumdar, S. J., and P. M. Finocchio, 2010: On the ability of global ensemble prediction
systems to predict tropical cyclone track probabilities. Wea. Forecasting, 25, 659–680,
doi:10.1175/2009WAF2222327.1.
Mason, S. J., and N. E. Graham, 1999: Conditional probabilities, relative operating
characteristics, and relative operating levels. Wea. Forecasting, 14, 713–725.
Mason, S. J., 2004: On using “climatology” as a reference strategy in the Brier and ranked
probability skill scores. Mon. Wea. Rev., 132, 1891–1895.
Mirkin, B., 1996: Mathematical Classification and Clustering. Kluwer Academic, 428 pp.
Puri, K., J. Barkmeijer, and T. N. Palmer, 2001: Ensemble prediction of tropical cyclones using
targeted diabatic singular vectors. Quart. J. Roy. Meteor. Soc., 127, 709–734.
Rappaport, E. N., and Coauthors, 2009: Advances and challenges at the National Hurricane
Center. Wea. Forecasting, 24, 395–419, doi:10.1175/2008WAF2222128.1.
Tropical cyclone operational plan for the Bay of Bengal and Arabian Sea, World Meteorological
Organization Tech. Document WMO/TD-No. 84, 1–97. [Available from World Meteorological
Organization, Case Postale No. 5, CH-1211, Geneva 20, Switzerland.]
Vitart, F., 2009: Impact of the Madden-Julian Oscillation on tropical storms and risk of landfall
in the ECMWF forecast system. Geophys. Res. Lett., 36, L15802, doi:10.1029/2009GL039089.
Webster, P. J., and C. Hoyos, 2004: Prediction of monsoon rainfall and river discharge on 15–30
day time scales. Bull. Amer. Meteor. Soc., 85, 1745–1765.
Webster, P. J., 2008: Myanmar’s deadly daffodil. Nature Geoscience, 1, 488–490,
doi:10.1038/ngeo257.
Figure Captions
Figure 1: Schematic of the false alarm clustering routine. Input variables include $P_T$, $t_t$,
$x_t$, $\vec{O}_i(\vec{x})$, and $\vec{E}_j(\vec{x})$. $P_T$ is the false alarm probability
threshold, $t_t$ is a time threshold of 120 hrs, $x_t = \frac{100}{24}\,t + 500$ (nm),
$\vec{O}_i(\vec{x})$ contains location and time information for the ith observed TC during a
particular season, and $\vec{E}_j(\vec{x})$ is the ECMWF forecast track for the jth ensemble
member. Other variables include $\vec{E}^{\mathrm{TC}}_j(\vec{x})$, which is the ECMWF
forecast track for the jth ensemble member that corresponds to an observed TC.
$\vec{E}^{\mathrm{TC}}_j(\vec{x})$ is made up of pre-genesis TC forecast tracks,
$\vec{E}^{\mathrm{PRE\text{-}TC}}_j$, and post-genesis TC forecast tracks,
$\vec{E}^{\mathrm{POST\text{-}TC}}_j$. K is the number of clusters employed in the k-means
clustering algorithm, and $\vec{F}_k(\vec{x})$ contains the kth false-alarm cluster’s mean
location and starting time information. $\vec{E}^{\mathrm{CFA}}_j(\vec{x})$ is the ‘candidate
false-alarm’ for each jth ECMWF forecast track, and $\vec{E}^{\mathrm{FAC}}_k(\vec{x})$ is
the final false-alarm grouping for the kth cluster.

Figure 2a (left): Example of the VarEPS forecasts for Severe Cyclone Nargis on 23 April 2008
00 UTC, which was 3.75 days prior to tropical cyclone genesis according to the JTWC Best
Track dataset. The black line with red dots denotes the observed track of Nargis. The thin grey
lines indicate unique ensemble track forecasts from the VarEPS with the thick black (blue) line
denoting the VarEPS ensemble mean (control) track. Figure 2b (right): The maximum intensity
forecast from the VarEPS is shown as a probabilistic time series in which color shading denotes
percentile intervals of the VarEPS forecasts ranging from 10 to 90%. Note: The red line
indicates the observed maximum intensity of Nargis.

Figure 3: Relative operating characteristic of the VarEPS forecasts for tropical cyclone genesis
during the period 2007–2010. a) ROC is determined as a function of forecast days in advance
using a 48-hour window on the date of TC genesis. b) Similar to a), except for a 96-hour
window on the date of TC genesis. c) Similar to a), except the full 360-hour forecast period is
used for the TC genesis evaluation.

Figure 4: a) Mean initial location for each false alarm cluster from 2007 to 2010. b) False alarm
ratio as a function of increasing probability detection threshold from 2007 to 2010. The red line
indicates the location where the false alarm ratio is equivalent to the probability threshold. c)
Probability distribution functions of the mean genesis lead-time in hours.

Figure 5: a) VarEPS pre-genesis track errors (in nm) and b) absolute wind errors (in kts) for all
ensemble forecasts during 2007–2010. c) VarEPS forecasts of the difference between forecast
time of TC genesis and observations (in days) with values greater (less) than 0 indicating the
VarEPS forecasts are sooner (later) than observations. d) Ensemble spread in the forecast time
of TC genesis. Color shading indicates the percentile ranges for the VarEPS forecasts and the
black line is the VarEPS ensemble mean. The total number of ensemble forecasts included in
the verification is listed above the abscissa.

Figure 6: a) VarEPS post-genesis track errors (in nm) and c) absolute wind errors (in kts) for all
ensemble forecasts during 2007–2010. Color shading indicates the percentile ranges for the
VarEPS forecasts and the black line is the VarEPS ensemble mean. The total number of
ensemble forecasts included in the verification is listed above the abscissa. b) Comparison of
ECMWF control and ensemble mean track errors (in nm) and d) absolute wind errors (in kts) to
other global weather models and the Joint Typhoon Warning Center for the period 2007–2010.

Figure 7: a) Probability distribution function of the 0–48 hour total track error statistics for
pre-genesis (109 forecasts) and post-genesis (175 forecasts) VarEPS mean forecasts for the 23
North Indian Ocean TCs of the period 2007–2010. b) Similar to a), except for the period 48–120
hours, which includes 171 pre-genesis (51 post-genesis) forecasts. The thick line indicates the
actual PDF with shading denoting the 95% confidence intervals determined by a bootstrap
resampling technique.

Figure 8: Relative operating characteristic for the a) Arabian Sea and b) Bay of Bengal using the
VarEPS forecasts from 2007 to 2010 during the months of April–June and August–December for
various forecast lead-times.
Tables
Table 1. Brier skill scores (BSS) and relative operating characteristic scores (ROCS) for the
Arabian Sea and the Bay of Bengal based on VarEPS forecasts for tropical cyclone activity
during the months of April–June and August–December for 2007–2010. BSS (ROCS) in bold
are statistically different from 0 (0.5) at the 95% confidence level.

Arabian Sea             Brier Skill Score    Relative Operating Characteristic Score
All Forecast Days             0.17                          0.82
Forecast Days < 2             0.47                          0.85
Forecast Days 2–5             0.32                          0.87
Forecast Days 5–10            0.04                          0.82
Forecast Days 10–15          -0.14                          0.70

Bay of Bengal           Brier Skill Score    Relative Operating Characteristic Score
All Forecast Days             0.09                          0.80
Forecast Days < 2             0.32                          0.78
Forecast Days 2–5             0.18                          0.80
Forecast Days 5–10            0.18                          0.83
Forecast Days 10–15           0.00                          0.75