Qin Danyu
With thanks to contributor Dr Sun Fenglin
National Satellite Meteorological Center (NSMC), China Meteorological Administration (CMA)
New approaches for Convective Initiation Nowcasting based on the TV-L1 optical flow and BP_Adaboost neural network algorithm
AOMSUC-10, 4-6 Dec 2019, Melbourne, Australia
Outline
1. Overview of Convective Initiation (CI) by Satellite
2. Current FY-4A CI Product
3. New approaches for Convective Initiation
• TV-L1 optical flow
• BP_Adaboost neural network algorithm
4. Validation
5. Summary
China suffers from severe convective storms every year
Such convective activity usually produces severe weather such as heavy rainfall, strong winds, hail and tornadoes, and can cause great damage, losses and even deaths.
For 0-2 hour nowcasting
◼ Numerical Weather Prediction (NWP) ◼ In situ ◼ Radar ◼ Satellite
[Figure labels: @ 6 minutes; @ 5-15 minutes; 6 hrs]
Disadvantages of radar:
• Depends on the elevation angle, so low-level information about convection, such as CI, may sometimes be lost.
• Does not cover all areas, such as highlands (Tibetan Plateau), deserts and oceans.
• Calibration problems
GEO satellites, in contrast, are excellent for monitoring CI and its environment, especially new-generation satellites such as FY-4 and H8, because they carry more powerful imagers such as ABI, AHI and AGRI.
1816 TD
FY-4 AGRI 0.64μm 5-minute animation
CI
*Advantages:
• Broad coverage
• High spatial resolution
• Higher calibration accuracy
• More stable
• More lead time
1. Overview of Convective Initiation (CI) by Satellite
Motivation of Geo Sat CI Algorithm
Detect Convective Initiation using geostationary satellites to provide increased lead times for ANY event
Mecikalski and Bedka first presented the CI concept in 2006 (MB06)
[Figure: CI algorithm workflow with three numbered steps. Labels: convective target identification; tracking; CI interest field check; CI sensitive factors; CI evolution]
Channel type    No.  Band (μm)        Spatial resolution (km)  Primary application      Utilized in CI
Visible         1    0.45-0.49        1                        Aerosol                  No
                2    0.55-0.75        0.5                      Fog, cloud               Yes
Near-infrared   3    0.75-0.90        1                        Vegetation               No
                4    1.36-1.39        2                        Cirrus                   No
                5    1.58-1.64        2                        Cloud, snow              No
                6    2.1-2.35         2                        Cloud, aerosol           No
SW*-infrared    7    3.4-4.0 (high)   2                        Fire, land, and surface  No
                8    3.4-4.0 (low)    4                                                 No
Water vapor     9    5.8-6.7          4                        WV*                      Yes
                10   6.9-7.3          4                        WV*                      Yes
LW*-infrared    11   8.0-9.0          4                        WV*, cloud               Yes
                12   10.3-11.3        4                        SST*, cloud              Yes
                13   11.5-12.5        4                        SST*, cloud              Yes
                14   13.2-13.8        4                        Cloud                    Yes
FY-4A/AGRI specifications and channels used in CI algorithm
FY-4 CI/RDC Algorithm
1. Convective target identification
2. Multi-target tracking
3. Cloud-top cooling rate
• Watershed method
• IR and VIS thresholds
• Multi-channel tests
[Figure panels: IR; VIS]
Choose one of the two tracking methods according to the combined observation mode
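As a rough illustration of step 3 (cloud-top cooling rate), the sketch below normalizes the change in a tracked target's brightness temperature to K per 15 min, so that 5-minute and 15-minute scans can be compared. The function name, the sample temperatures and the threshold value are assumptions made for this example only; they are not taken from the FY-4 algorithm.

```python
def cooling_rate_k_per_15min(tbb_prev_k, tbb_curr_k, dt_minutes):
    """Brightness-temperature change scaled to K per 15 min (negative = cooling)."""
    if dt_minutes <= 0:
        raise ValueError("dt_minutes must be positive")
    return (tbb_curr_k - tbb_prev_k) / dt_minutes * 15.0


# Hypothetical tracked target: coldest IR brightness temperature fell
# from 268 K to 265 K between two scans taken 5 minutes apart.
rate = cooling_rate_k_per_15min(268.0, 265.0, 5.0)

# Illustrative threshold only (assumed, not the operational value).
ASSUMED_THRESHOLD_K_PER_15MIN = -4.0
is_rapidly_cooling = rate <= ASSUMED_THRESHOLD_K_PER_15MIN
```

Scaling to a common interval is one simple way to apply a single threshold to both observation modes; an operational algorithm may handle the two modes differently.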
AGRI combined observation: 15-min FD and 5-min RRS
⚫ Baseline observation: one FD (15 min) every hour
⚫ Every 3 hours, two more FD observations (15 min each), used for deriving AMVs
⚫ During 17-19 (BJT), AGRI is suspended to ensure its safety.
⚫ At all other times, RRS (5 min × 9 = 45 min)
FY-4A
Convective Initiation (pink) and Possible Convective Initiation (blue)
• The FY-4A convection product provides convective initiation and rapid convective growth information to end users and is well suited to nowcasting.
• The FY-4A convective initiation results contain false alarms: many CI detections do not produce severe weather.
Alternative Approaches-False Alarm Reduction
In order to track rapidly developing CIs, the first step is to derive the cloud-tracking motion with the classical TV-L1 optical flow (OF) method. The OF field represents the motion of pixels between two consecutive image frames under the brightness constancy assumption. Previous studies have developed a variational formulation of the optical-flow problem. The main equation of the TV-L1 OF method can be written as follows:

$$\min_{u(x),\, x \in \Omega} \sum_{x \in \Omega} \left\{ |\nabla u_1(x)| + |\nabla u_2(x)| + \lambda \, |I_0(x + u(x)) - I_1(x)|^2 \right\},$$

where $I_0(x)$ and $I_1(x)$ are the consecutive images sampled at times $T$ and $T + \Delta T$, $u(x) = (u_1(x), u_2(x))$ is the motion field to be reconstructed, $\Omega$ is the neighborhood region of $x$, and $\lambda$ is the weight between the data fidelity term and the regularization term.
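To make the roles of the two terms concrete, the following minimal sketch evaluates the energy in the equation above for a candidate flow on a tiny image pair. It is an illustration rather than the solver behind the FY-4A product: the function name `flow_energy`, the forward-difference gradient, the integer-only displacements and the `data_exponent` parameter are assumptions introduced here (exponent 2 follows the equation as printed on the slide; exponent 1 gives an absolute-value data term).

```python
import math


def flow_energy(i0, i1, u1, u2, lam, data_exponent=2):
    """Energy of a candidate flow (u1, u2) between images i0 and i1.

    All arguments are equally sized lists of rows. u1 holds column
    displacements and u2 row displacements; both are integers here so no
    interpolation is needed. Pixels whose target x + u(x) falls outside
    the image are skipped in the data term.
    """
    rows, cols = len(i0), len(i0[0])

    def grad_norm(field, r, c):
        # Forward differences, set to zero at the far image border.
        dx = field[r][c + 1] - field[r][c] if c + 1 < cols else 0.0
        dy = field[r + 1][c] - field[r][c] if r + 1 < rows else 0.0
        return math.hypot(dx, dy)

    energy = 0.0
    for r in range(rows):
        for c in range(cols):
            # Regularization: total variation of both flow components.
            energy += grad_norm(u1, r, c) + grad_norm(u2, r, c)
            # Data fidelity: I0(x + u(x)) - I1(x).
            rr, cc = r + u2[r][c], c + u1[r][c]
            if 0 <= rr < rows and 0 <= cc < cols:
                energy += lam * abs(i0[rr][cc] - i1[r][c]) ** data_exponent
    return energy


# Toy pair: a single bright pixel sits at column 2 in i0 and column 1 in i1,
# so u1 = +1 satisfies I0(x + u) = I1(x) everywhere.
i0 = [[0, 0, 0, 0, 0], [0, 0, 10, 0, 0], [0, 0, 0, 0, 0]]
i1 = [[0, 0, 0, 0, 0], [0, 10, 0, 0, 0], [0, 0, 0, 0, 0]]
zeros = [[0] * 5 for _ in range(3)]
ones = [[1] * 5 for _ in range(3)]

energy_no_motion = flow_energy(i0, i1, zeros, zeros, lam=0.5)
energy_aligned = flow_energy(i0, i1, ones, zeros, lam=0.5)
```

The aligned flow has zero energy because it is spatially constant (no regularization cost) and removes the brightness mismatch, whereas the zero flow pays the data penalty at the two mismatched pixels.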
Why use TV-L1 optical flow?
correlation coefficient
Smaller targets are more likely to be tracked incorrectly.
The optical flow method can increase the percentage of correct tracking results.
[Figure panels: optical flow + overlap; overlap]
Blue: previous time
Red: current time
Green bar: estimated velocity
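The difference between the two panels can be sketched as follows: plain overlap tracking fails when a small target moves farther than its own width between frames, while shifting the previous footprint by the estimated velocity first restores the overlap. The pixel-set representation and the helper names `advect` and `match` are assumptions made for this illustration, not the operational implementation.

```python
def advect(pixels, d_row, d_col):
    """Shift a set of (row, col) pixels by an integer displacement."""
    return {(r + d_row, c + d_col) for r, c in pixels}


def match(previous, current_objects, displacement=(0, 0)):
    """Index of the current object that best overlaps the (shifted) previous one.

    Returns None when no current object shares a pixel with it.
    """
    moved = advect(previous, *displacement)
    scores = [len(moved & obj) for obj in current_objects]
    best = max(range(len(scores)), key=scores.__getitem__, default=None)
    if best is None or scores[best] == 0:
        return None
    return best


# Hypothetical small target: a 2x2 cloud that moved 3 columns between frames.
previous = {(0, 0), (0, 1), (1, 0), (1, 1)}
current_objects = [{(0, 3), (0, 4), (1, 3), (1, 4)}]

plain_overlap = match(previous, current_objects)                       # no shared pixel
flow_overlap = match(previous, current_objects, displacement=(0, 3))   # shifted by flow
```

In practice the displacement would come from the optical flow field averaged over the target; here it is supplied by hand.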
Algorithm 1: The AdaBoost method
Input: a sequence of $N$ labeled examples $\langle (x_1, y_1), \ldots, (x_N, y_N) \rangle$
  a distribution $D$ over the $N$ examples
  a weak learning algorithm WeakLearn (BP neural network)
  an integer $T$ specifying the number of iterations
Initialize the weight vector: $w_i^1 = D(i)$ for $i = 1, \ldots, N$.
Do for $t = 1, \ldots, T$:
1. Set $p^t = w^t / \sum_{i=1}^{N} w_i^t$ ($w^t = \langle w_i^t \rangle$; $p^t = \langle p_i^t \rangle$, $i = 1, \ldots, N$).
2. Call WeakLearn, providing it with the distribution $p^t$; get back a hypothesis $h_t: X \to [0, 1]$.
3. Calculate the error of $h_t$: $\epsilon_t = \sum_{i=1}^{N} p_i^t \, |h_t(x_i) - y_i|$.
4. Set $\beta_t = \epsilon_t / (1 - \epsilon_t)$.
5. Update the weights: $w_i^{t+1} = w_i^t \, \beta_t^{\,1 - |h_t(x_i) - y_i|}$.
Output the hypothesis
$$h_f(x) = \begin{cases} 1 & \text{if } \sum_{t=1}^{T} \left( \log \frac{1}{\beta_t} \right) h_t(x) \ge \frac{1}{2} \sum_{t=1}^{T} \log \frac{1}{\beta_t} \\ 0 & \text{otherwise} \end{cases}$$
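The steps of Algorithm 1 can be sketched in a few lines of Python. To keep the example self-contained, a one-dimensional threshold classifier (a decision stump) is swapped in for the BP neural network as the weak learner, the initial distribution D is taken to be uniform, and a small clamp on the error keeps the weights finite; all three choices, the function names and the toy data are assumptions of this sketch.

```python
import math


def train_stump(xs, ys, p):
    """Stand-in weak learner: best 1-D threshold rule under weights p."""
    best_err, best_h = None, None
    for thr in sorted(set(xs)):
        for polarity in (1, -1):
            def h(x, thr=thr, polarity=polarity):
                return 1 if polarity * (x - thr) >= 0 else 0
            err = sum(pi * abs(h(x) - y) for pi, x, y in zip(p, xs, ys))
            if best_err is None or err < best_err:
                best_err, best_h = err, h
    return best_h


def adaboost(xs, ys, weak_learn, rounds):
    n = len(xs)
    w = [1.0 / n] * n                                   # w_i^1 = D(i), uniform D assumed
    hypotheses, betas = [], []
    for _ in range(rounds):
        total = sum(w)
        p = [wi / total for wi in w]                    # step 1
        h = weak_learn(xs, ys, p)                       # step 2
        eps = sum(pi * abs(h(x) - y) for pi, x, y in zip(p, xs, ys))  # step 3
        eps = min(max(eps, 1e-10), 1.0 - 1e-10)         # guard added here, not in Algorithm 1
        beta = eps / (1.0 - eps)                        # step 4
        w = [wi * beta ** (1 - abs(h(x) - y))           # step 5
             for wi, x, y in zip(w, xs, ys)]
        hypotheses.append(h)
        betas.append(beta)

    alphas = [math.log(1.0 / b) for b in betas]

    def final(x):
        vote = sum(a * h(x) for a, h in zip(alphas, hypotheses))
        return 1 if vote >= 0.5 * sum(alphas) else 0

    return final


# Toy data that no single threshold separates (labels 1, 1, 0, 0, 1, 1).
xs = [0, 1, 2, 3, 4, 5]
ys = [1, 1, 0, 0, 1, 1]
classifier = adaboost(xs, ys, train_stump, rounds=3)
predictions = [classifier(x) for x in xs]
```

On this toy set the best single stump misclassifies two of the six points, while the weighted vote of three stumps reproduces all six labels, which is the effect boosting is meant to deliver for the CI candidates.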
AdaBoost classification
BP classification
ML: BP+AdaBoost=BP_Adaboost
The objective is to select CI from the candidates more accurately
Severe hail weather in Rudong, Jiangsu Province, eastern China, 16 May 2018
[Figure panels: CI 09:00; Radar 09:30; image time stamps 16 May 2018 09:30 and 09:42]
30 min lead time
Testing and validation datasets
◆ A rough validation method (SATCAST v2 by Walker et al.)
[Figure: validation schematic. Labels: 75 km; T0 - 30 minutes; T0; T0 + Tk; …; T0 + 2 h; CI; ≥35 dBZ]
[Figure: validation regions. Eastern China; Southern China; North-Eastern China; Qinghai-Tibet Plateau area]
◆ A fine validation method (RDCMS)

Contingency table
                      Observed value = 1    Observed value = 0
Forecast value = 1    TP ("hit")            FP ("false alarm")
Forecast value = 0    FN ("miss")           TN

Spatial and temporal distribution and total number of CI forecast samples for the RDCMS validations
Period         Qinghai-Tibet Area   Eastern China   North-Eastern China   Southern China
14~29 Jun.     8551                 5137            3710                  5732
05~12 Jul.     5283                 2762            1804                  3440
08~24 Aug.     5782                 2115            1636                  4195
10~19 Sept.    3309                 3401            1998                  3754
09~25 Oct.     5857                 2589            1660                  6001

Statistical metrics for RDCMS. The meanings of TP, FN and FP are shown in the contingency table.
Name   Formula          Range   Optimum
POD    TP/(TP+FN)       [0,1]   1
FAR    FP/(TP+FP)       [0,1]   0
CSI    TP/(TP+FP+FN)    [0,1]   1
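The POD, FAR and CSI definitions from the metrics table translate directly into code. The sketch below is a minimal helper; the function name and the sample counts are hypothetical and are not results from the RDCMS validation.

```python
def skill_scores(tp, fp, fn):
    """Return (POD, FAR, CSI) from contingency-table counts.

    A score is NaN when its denominator is zero.
    """
    nan = float("nan")
    pod = tp / (tp + fn) if (tp + fn) else nan
    far = fp / (tp + fp) if (tp + fp) else nan
    csi = tp / (tp + fp + fn) if (tp + fp + fn) else nan
    return pod, far, csi


# Hypothetical counts for illustration: 80 hits, 40 false alarms, 20 misses.
pod, far, csi = skill_scores(tp=80, fp=40, fn=20)
```

TN does not enter any of the three scores, which is why they suit rare events such as CI, where correct negatives dominate the sample.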
Evaluation-scores
• AGRI
• AGRI+LMI
• AGRI+LMI+GIIRS
• AGRI+LMI+GIIRS+NWP
• AGRI+LMI+GIIRS+NWP+Radar or GPM…
Questions:
How can these new data be combined to better identify convective activity?
How can added-value information on severe and high-impact weather be obtained more easily and more quickly?
Future Plan
• Use machine learning and deep learning to develop new satellite convection products in the next year.
• Machine learning and deep learning will be introduced to generate better products and better applications for the FY-4B/C satellites.
• Link convective initiation with QPE.
Machine Learning for predicting Convective Storm and QPE of FY-4
Data
• NASA GPM IMERG 0.1° × 0.1° gridded data at half-hourly resolution (truth for training)
• FY-4A/AGRI or Himawari-8/AHI full-disk infrared band measurements (TBB): 6 infrared bands from FY-4A/AGRI or 9 infrared bands from Himawari-8/AHI are used to train the model
• Numerical Weather Prediction (NWP) data (GFS 0.5° × 0.5° / GRAPES 0.25° × 0.25°)
• Surface ancillary data (e.g., elevation, surface type)
Methodology
• Track Convection Cells
• Co-locate GPM data, FY-4A/AGRI or Himawari-8/AHI data, and NWP data
• Extract useful samples from the matched dataset
• Train classification and regression models for predicting Convective Storm and QPE based on Machine Learning
• Predict Convective Storm and QPE using real-time FY-4A/AGRI or Himawari-8/AHI data, NWP data and the trained models
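The co-location step can be illustrated by snapping a satellite pixel to the 0.1° grid cell and half-hour window that would hold its matching precipitation value. This is only a sketch: the function names, the grid layout (row 0 starting at 90°S, column 0 at 180°W) and the sample coordinates are assumptions made here, and a real workflow should take the layout from the data files themselves.

```python
import math
from datetime import datetime


def grid_cell(lat, lon, res=0.1):
    """(row, col) of the res-degree cell containing (lat, lon).

    Assumed layout: row 0 starts at 90S, column 0 starts at 180W.
    """
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        raise ValueError("latitude/longitude out of range")
    n_rows, n_cols = round(180 / res), round(360 / res)
    row = min(math.floor((lat + 90.0) / res), n_rows - 1)
    col = min(math.floor((lon + 180.0) / res), n_cols - 1)
    return row, col


def half_hour_slot(t):
    """Floor a datetime to the start of its 30-minute window."""
    return t.replace(minute=(t.minute // 30) * 30, second=0, microsecond=0)


# Hypothetical pixel near the Jiangsu coast, observed at 09:42.
cell = grid_cell(32.33, 121.18)
slot = half_hour_slot(datetime(2018, 5, 16, 9, 42))
```

NWP fields on coarser grids can be matched the same way by passing their resolution as `res`.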
Rank of predicted factors
rank name score
1 dtb62max 0.10849001
2 ch13 var min 0.102537347
3 dtb73max 0.094444007
4 dtb70max 0.088137094
5 dtb96max 0.080524218
6 ch16-ch13max 0.07700988
7 area 0.063007372
8 dtb96mean 0.029540668
9 ch13 var mean 0.026863466
10 ch14-ch15min 0.01872382
11 dtb86min 0.014983453
12 dtb12min 0.011430704
13 dtb12max 0.010817736
14 dtb86max 0.009642071
15 dtb11min 0.009274285
16 ch11-ch14max 0.008871846
17 dtb11max 0.008870689
18 ch16-ch13 min 0.008594803
19 dtb62mean 0.008255241
20 dtb70mean 0.008029407
21 dtb73mean 0.006768807
22 div850max 0.005789622
23 ch13 10per warm mean 0.005747327
24 thtse925min 0.005490885
25 dtb73min 0.005393001
Random Forest
n_estimators=100
max_depth=10
max_features=10
Training samples
Dates: April-October 2016
Total: 389,315 convective cloud systems
Courtesy of Min Min
Machine Learning for Predicting Convective Storm and QPE by FY-4 Data
Result
[Figure panels: TBB at 11 μm; ground rainfall observation; prediction; GPM IMERG observation]
5. Summary
• Case studies show that the skill of the new approaches has improved as expected in four regions of China. In the southern China region, the CI lead time is 17-40 min, the best probability of detection (POD) is as high as 0.80, and the FAR is lower than 0.34.
• Every day, forecasters face a torrent of data from satellites, surface stations, balloons, radar, lightning sensors, etc. Using these data to forecast severe weather is a challenge.
• Picking out the valuable information from the data automatically, more quickly and in an easier-to-use form is another challenge.
• We have to look for new approaches to solve these problems; machine learning and deep learning techniques show great potential for convection applications.