BoM / CAWCR.
Text Generation in the Next-Gen Forecast System (GFE)
J Bally & T Leeuwenburg
Background & Drivers.... Next-Gen Forecast System
Better use of NWP models
Systematic forecast process
Temporal and spatial detail
Can verify everything
Efficiency gains
Many new services: grids, graphics and text all from the same weather database
Nowcast: TIFS (objects) On-the-fly, shallow, slot filling
Text Generation… introduction
Most sophisticated meteorological text generation system ???
Large jump from “slot filling” systems (TIFS, TC, Scribe etc)
Text as a network of nodes
Goal directed multi-pass processing
64,000 lines of Python - > 15 person-years of development
Text Generation : example goals
Try for <= three weather sub-phrases (2 for wind etc.)
Describe the weather trends, rather than a sequence
Describe changes in weather only if the impact differs substantially
Try for elegant sentence structure; split out unusual weather types if they are not part of the trend
Must-goals (guarantees) vs should-goals
……….etc etc
Text Generation…multi-pass processing
Text Generation.... overview
Information representation
Data Gathering
Information Processing and Document Planning
Mapping to Words ( Surface Realisation )
Post Processing
Information Representation: Scalars, Vectors, Weather……
[Figure panels: PoP, Sky, Weather, Temp / Wind]
Information Representation: Hazards
Hazards
Data Gathering.... Grid sampling
Use statistics for scalars and vectors:
Statistics                                                          Element
30th percentile wind speed, 90th percentile wind speed              Wind Phrase
25th and 75th percentile wind directions centred on average dir     Wind Phrase
90th percentile, 10th percentile                                    Sea Height
90th percentile, 10th percentile                                    Swell Height
25th and 75th percentile swell directions centred on average dir    Swell Direction
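The percentile sampling above can be sketched in a few lines of Python. This is only an illustration, not the GFE implementation: the function name `percentile`, the linear interpolation between ranks and the sample wind speeds are all assumptions.

```python
def percentile(values, pct):
    """Return the pct-th percentile (0-100) using linear interpolation."""
    if not values:
        raise ValueError("no grid points sampled")
    ordered = sorted(values)
    rank = (len(ordered) - 1) * pct / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    frac = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


# Hypothetical wind speeds (knots) sampled from the grid points of one area.
wind_speeds = [8, 10, 11, 12, 14, 15, 17, 18, 22, 25, 30]

# Wind phrase range: 30th to 90th percentile, as in the table above.
wind_range = (percentile(wind_speeds, 30), percentile(wind_speeds, 90))
```

Reporting a percentile range rather than min and max keeps a handful of outlying grid points from widening the forecast range. Directions would need circular handling, which this sketch leaves out.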
What about weather and hazards?
How to summarise a bit of patchy rain, isolated severe thunderstorms and raised dust?
Let's concentrate on the weather ..........
Data Gathering.... Grid sampling, e.g. 3 hr time slices
[Figure: grid areas labelled NoWx, Sct SH -, Wide SH m, Patchy RA m and Sct TS n, bracketed as "Isolated Showers" and "Isolated Thunderstorms"]
Key            Number of points*   Percentage
Wide SH m      10,000              10%
Sct SH -       34,533              35%
Patchy RA m    7,644               8%
Sct TS n       10,000              10%
No Weather     45,000              45%
Reported coverage = Σ (internal coverage * grid point count) / total points
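A minimal sketch of the reported-coverage formula, using the shower rows from the table. The internal coverage fractions in `INTERNAL_COVERAGE` and the total of 100,000 grid points are assumptions made for illustration; the slides do not give them.

```python
# Hypothetical internal coverage fraction for each coverage term (assumed values).
INTERNAL_COVERAGE = {"Wide": 0.8, "Sct": 0.4, "Patchy": 0.4, "Isol": 0.1}


def reported_coverage(samples, total_points):
    """samples: list of (coverage_term, grid_point_count) for one weather type."""
    weighted = sum(INTERNAL_COVERAGE[term] * count for term, count in samples)
    return weighted / total_points


# Showers: Wide SH on 10,000 points and Sct SH on 34,533 points.
# The total of 100,000 grid points is an assumption.
sh_coverage = reported_coverage([("Wide", 10_000), ("Sct", 34_533)], 100_000)
```

Under these assumed fractions the showers cover a little over a fifth of the area, which is the kind of figure that would then be mapped back to a coverage word.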
Data Gathering.... Grid sampling
[Figure: grid areas labelled NoWx, Sct SH -, Wide SH m, Patchy RA m and Sct TS n]
Reported coverage = Σ (internal coverage * grid point count) / total points
Reported intensity = Σ (intensity contribution * grid point count) / total affected grid points
Similar calculation to collapse similar weather types…Sh/Dz/Ra
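The intensity formula can be sketched the same way. The numeric scale in `INTENSITY_VALUE` is an assumption (the slides do not say how intensities are scored), and the function name is hypothetical.

```python
# Hypothetical numeric contribution for each intensity code (assumed scale).
INTENSITY_VALUE = {"-": 1.0, "m": 2.0, "+": 3.0}


def reported_intensity(samples):
    """samples: list of (intensity_code, grid_point_count) for affected points."""
    affected = sum(count for _, count in samples)
    if affected == 0:
        return None  # nothing to report
    weighted = sum(INTENSITY_VALUE[code] * count for code, count in samples)
    return weighted / affected


# Moderate showers on 10,000 points and light showers on 34,533 points.
shower_intensity = reported_intensity([("m", 10_000), ("-", 34_533)])
```

Note that the divisor is the number of affected points, not the total, so areas with no weather do not dilute the reported intensity.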
Data Gathering.... Grid sampling
[Figure: grid areas labelled NoWx, Sct SH -, Wide SH m, Patchy RA m and Sct TS n]
Filtering the Weather List
Wx types              Coverage threshold
SN, SNSH, SL, SLSH    2.5% of total area
TS, FG, MI            5% of total area
FR                    5% of the area below 500 m
All other types       15% of the total area
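The threshold table could be applied as a simple lookup. In this sketch the function name, the use of "greater than or equal" at the threshold, and the sample coverages are assumptions; for FR the coverage passed in is assumed to be already expressed as a fraction of the area below 500 m.

```python
# Thresholds from the table above, as fractions of the relevant area.
THRESHOLDS = {
    "SN": 0.025, "SNSH": 0.025, "SL": 0.025, "SLSH": 0.025,
    "TS": 0.05, "FG": 0.05, "MI": 0.05,
    "FR": 0.05,  # assumed to be measured against the area below 500 m
}
DEFAULT_THRESHOLD = 0.15


def filter_weather(coverage_by_type):
    """Keep only weather types whose reported coverage meets the threshold."""
    return {
        wx: coverage
        for wx, coverage in coverage_by_type.items()
        if coverage >= THRESHOLDS.get(wx, DEFAULT_THRESHOLD)
    }


# Hypothetical reported coverages for one time slice.
kept = filter_weather({"SH": 0.22, "TS": 0.04, "RA": 0.03, "SN": 0.03})
```

The lower thresholds for snow, thunderstorms and fog mean that higher-impact weather is reported even when it affects only a small part of the area.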
Information Processing....
Embedded Local Effect >
Winds: Easterly 10 to 20 knots decreasing to 10 to 15 knots around midday then increasing to 15 to 20 knots during the afternoon, locally up to 30 knots in the east.
Seas: Below 0.5 metres increasing to 0.5 to 1 metres by early evening, locally up to 1.5 metres in the east.
Forecast-Split Local Effect >
In the east:
Winds: Easterly 10 to 20 knots increasing to 20 to 30 knots during the afternoon.
Seas: 0.5 to 1 metres, increasing up to 1.5 metres by early evening.
Elsewhere:
Winds: Easterly 10 to 20 knots decreasing to 10 to 15 knots around midday then increasing to 15 to 20 knots during the afternoon.
Seas: Below 0.5 metres increasing to 0.5 to 1 metres by early evening.
Information Processing....
Check for Local Effects …. Scalar Metrics
Element   Stat          Scale Value /          Embedded
                        Consideration Value    At 0.5
Wind      DIR AVG       135 deg                90 deg
Wind      SPEED MAX     15 kt                  10 kt
Sea       HEIGHT MAX    1.5 m                  1.0 m
Swell     DIR AVG       135 deg                90 deg
Swell     HEIGHT MAX    1.5 m                  1.0 m
Information Processing....
Check for Local Effects
Name         Wind Speed   Wind Dir   Sea   Swell Height   Swell Dir   Avg
East-West    0            0          0     2              2           0.8
Far West     1            1          2     2              2           1.6
Far East     0            0          0     0              0           0
Inshore      3.0          0          0     0              0           0.6
Offshore     4.0          0          0     0              0           0.8
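The Avg column above is consistent with a plain mean of the five element scores in each row, which the short sketch below reproduces. Picking the candidate with the highest average is an assumption about how the averages might be used, not something the slides state.

```python
# Scores per candidate local effect, copied from the table above, in the order:
# wind speed, wind dir, sea, swell height, swell dir.
SCORES = {
    "East-West": [0, 0, 0, 2, 2],
    "Far West": [1, 1, 2, 2, 2],
    "Far East": [0, 0, 0, 0, 0],
    "Inshore": [3.0, 0, 0, 0, 0],
    "Offshore": [4.0, 0, 0, 0, 0],
}


def average_score(scores):
    """Mean of the element scores, rounded to one decimal as in the table."""
    return round(sum(scores) / len(scores), 1)


averages = {name: average_score(scores) for name, scores in SCORES.items()}

# Assumption: the candidate with the highest average is considered first.
strongest = max(averages, key=averages.get)
```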
Information Processing.... Pre-Process Weather......
Arrange statistics in time order; Combine where appropriate, maintaining ranges; Separate co-reportable types
Subphrases before preProcessWx:
0-3am     3-6am     6-9am     9-noon    noon-3pm  3-6pm     6-9pm     9-night
NoWx      NoWx      SH+ + DU  SH+ + TS  SH+ + TS  SHm       SHm       NoWx
Subphrases after preProcessWx:
0-3am     3-6am     6-9am     9-noon    noon-3pm  3-6pm     6-9pm     9-night
-----NoWx-----                                                        NoWx
                    -----------------(SH+, SHm)------------------
                              --------TS--------
                    -DU--
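One way to picture this step: walk the time slices in order, separate each weather type into its own track, and extend a span while the type keeps appearing, collecting the intensities seen along the way. The sketch below is an assumption about the mechanics, with hypothetical names (`split_code`, `pre_process_wx`); it is not the GFE code.

```python
# One list of weather codes per 3-hour slice; an empty list means no weather.
BEFORE = [[], [], ["SH+", "DU"], ["SH+", "TS"], ["SH+", "TS"], ["SHm"], ["SHm"], []]


def split_code(code):
    """Split 'SH+' into ('SH', '+'); codes without an intensity give ('DU', '')."""
    if code[-1] in "-m+":
        return code[:-1], code[-1]
    return code, ""


def pre_process_wx(slices):
    """Return {wx_type: [(first_slot, last_slot, [intensities])]} in time order."""
    spans = {}
    for i, codes in enumerate(slices):
        for code in codes:
            wx, intensity = split_code(code)
            runs = spans.setdefault(wx, [])
            if runs and runs[-1][1] >= i - 1:
                # Same type in this or the previous slice: extend the span
                # and keep the range of intensities seen so far.
                first, _, seen = runs[-1]
                if intensity and intensity not in seen:
                    seen.append(intensity)
                runs[-1] = (first, i, seen)
            else:
                runs.append((i, i, [intensity] if intensity else []))
    return spans


after = pre_process_wx(BEFORE)
```

With slot 0 as 0-3am, the showers come out as one span from 6-9am to 6-9pm with the range (SH+, SHm), while TS and DU get their own shorter spans.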
Information Processing.... Simplify Weather......
Collapse Ranges
Subphrases before collapsing ranges:
0-3am 3-6am 6-9am 9-noon noon-3pm 3-6pm 6-9pm 9-night
NoWx    (SHm, RAm)    (RAm, RA+)    SH-
Subphrases after collapsing ranges:
0-3am 3-6am 6-9am 9-noon noon-3pm 3-6pm 6-9pm 9-night
NoWx    SHm    RA+    SH-
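The slides do not state the rule used to pick one value from a range. The sketch below uses a guessed rule (highest intensity wins, ties keep the first entry) simply because it reproduces the example above; both the rule and the name `collapse_range` are assumptions.

```python
# Assumed ordering of intensity codes, lightest to heaviest.
INTENSITY_RANK = {"-": 0, "m": 1, "+": 2}


def collapse_range(codes):
    """Pick one code from a range: highest intensity wins, ties keep the first."""
    return max(codes, key=lambda code: INTENSITY_RANK[code[-1]])


with_ranges = ["NoWx", ("SHm", "RAm"), ("RAm", "RA+"), "SH-"]

collapsed = [
    collapse_range(item) if isinstance(item, tuple) else item
    for item in with_ranges
]
```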
Information Processing.... Merge Weather …. telling little white lies
Subphrases before mergeOverlap:
0-3am 3-6am 6-9am 9-noon noon-3pm 3-6pm 6-9pm 9-night
NoWx    SH + TS    SH
Subphrases after mergeOverlap:
0-3am 3-6am 6-9am 9-noon noon-3pm 3-6pm 6-9pm 9-night
NoWx    SH + TS
Subphrases before mergeGap:
0-3am 3-6am 6-9am 9-noon noon-3pm 3-6pm 6-9pm 9-night
NoWx    Isol SH-    NoWx    Sct SH-    Areas RAm
Subphrases after mergeGap:
0-3am 3-6am 6-9am 9-noon noon-3pm 3-6pm 6-9pm 9-night
NoWx    Isol SH-    Sct SH-    Areas RAm
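A gap merge of this kind could be sketched as follows. The slot spans in the example, the one-slot gap limit and the choice to let the preceding subphrase absorb the gap are all assumptions; the slides only show that the short dry period between the two shower subphrases disappears.

```python
def merge_gap(subphrases, max_gap=1):
    """Drop short 'NoWx' gaps that sit between two weather subphrases.

    subphrases: list of (wx, first_slot, last_slot) in time order.
    """
    merged = []
    for i, (wx, first, last) in enumerate(subphrases):
        is_inner = 0 < i < len(subphrases) - 1
        if (
            wx == "NoWx"
            and is_inner
            and last - first + 1 <= max_gap
            and merged[-1][0] != "NoWx"
            and subphrases[i + 1][0] != "NoWx"
        ):
            # The "little white lie": stretch the previous subphrase over the gap.
            prev_wx, prev_first, _ = merged[-1]
            merged[-1] = (prev_wx, prev_first, last)
            continue
        merged.append((wx, first, last))
    return merged


# Hypothetical slot spans (slot 0 is 0-3am, slot 7 is 9-night).
before = [
    ("NoWx", 0, 1),
    ("Isol SH-", 2, 3),
    ("NoWx", 4, 4),
    ("Sct SH-", 5, 6),
    ("Areas RAm", 7, 7),
]
after = merge_gap(before)
```

Leading and trailing dry periods are kept, since only a gap between two weather subphrases is treated as noise.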
Recall …multi-pass processing
Information Processing....
Have we tried every processing step enough?
Have we achieved our goals for level of detail?
Can Adjust Detail by…..
Looking for more local effects?.. Split forecast?
More aggressive sub-phrase combining
Coarser sampling strategy
Start again !
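The control loop implied by these questions can be sketched as: check the should-goal, and if it is not met, apply the next, more aggressive step and check again. The two steps below (`combine_adjacent`, `drop_no_weather`) are hypothetical stand-ins for the real processing steps, and the limit of three sub-phrases comes from the example goals earlier in the talk.

```python
def combine_adjacent(subphrases):
    """Hypothetical step: merge neighbouring subphrases that are identical."""
    combined = []
    for phrase in subphrases:
        if not combined or combined[-1] != phrase:
            combined.append(phrase)
    return combined


def drop_no_weather(subphrases):
    """Hypothetical, more aggressive step: leave 'NoWx' periods unreported."""
    return [phrase for phrase in subphrases if phrase != "NoWx"]


def simplify(subphrases, steps, max_subphrases=3):
    """Apply steps in order until the should-goal (<= max_subphrases) is met."""
    passes = 0
    for step in steps:
        if len(subphrases) <= max_subphrases:
            break
        subphrases = step(subphrases)
        passes += 1
    goal_met = len(subphrases) <= max_subphrases
    return subphrases, passes, goal_met


result, passes, goal_met = simplify(
    ["NoWx", "SH", "SH", "TS", "NoWx", "RA"],
    [combine_adjacent, drop_no_weather],
)
```

If `goal_met` were still false after every step, the caller would be in the "start again" case, for example by resampling more coarsely.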
Mapping to words.... Process Trends….
Recognise and Summarise trends
Subphrases before ProcessTrends:
0-3am     3-6am     6-9am     9-noon    noon-3pm  3-6pm     6-9pm     9-night
NoWx      NoWx      NoWx      SH-       SHm       RAm       RAm       RAm
Subphrases after ProcessTrends:
0-3am     3-6am     6-9am     9-noon    noon-3pm  3-6pm     6-9pm     9-night
NoWx    SH- developing > ..skip.. > increasing to RAm
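A rough sketch of trend recognition for this one pattern (dry, then steadily intensifying weather): collapse repeated slices into runs, and if the runs keep getting more severe, report only the onset and the end state. The `SEVERITY` ranking and the function names are assumptions made for the example.

```python
from itertools import groupby

# Hypothetical severity ranking used to decide whether weather is intensifying.
SEVERITY = {"SH-": 1, "SHm": 2, "RAm": 3}


def is_increasing(codes):
    """True if every code is strictly more severe than the one before it."""
    ranks = [SEVERITY.get(code) for code in codes]
    if None in ranks:
        return False
    return all(a < b for a, b in zip(ranks, ranks[1:]))


def process_trends(slices):
    """Summarise 'dry then intensifying' sequences; otherwise return the runs."""
    runs = [code for code, _ in groupby(slices)]
    if len(runs) >= 3 and runs[0] == "NoWx" and is_increasing(runs[1:]):
        # Intermediate steps (here SHm) are skipped in the summary.
        return ["NoWx", f"{runs[1]} developing", f"increasing to {runs[-1]}"]
    return runs


before = ["NoWx", "NoWx", "NoWx", "SH-", "SHm", "RAm", "RAm", "RAm"]
after = process_trends(before)
```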
Mapping to words....
Connectors: Increasing / Decreasing, Becoming / Tending, Developing / Clearing
Winds W to NW’ly at 15 to 25 knots tending W to SW’ly then increasing to 30 knots.
Isolated showers developing during the morning then increasing to heavy widespread rain…..
Mapping to words....
Time reporting: Transition (change verbs), Over-time (nouns), Mixed (trend verbs)
Winds W to NW’ly at 15 to 25 knots tending W to SW’ly around noon then increasing to 30 knots.
Morning Fog. Isolated showers developing during the afternoon then increasing to widespread rain…
Post Processing....
Post-Process Phrases
- string replacements to cover limitations
- “band-aid”… eg
Early frost. Early fog. >> Early frost and fog.
Remove repeated words eg
W to NW’ly winds becoming NW’ly
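A string-replacement pass of the "band-aid" kind might look like the sketch below, which handles only the "Early frost. Early fog." example. The regular expression and the function name `post_process` are assumptions for illustration; the slides do not show the actual replacement rules.

```python
import re

# Merge two consecutive "Early <thing>." sentences into one phrase.
MERGE_EARLY = re.compile(r"\bEarly (\w+)\. Early (\w+)\.")


def post_process(text):
    """Apply simple string replacements to the finished forecast text."""
    return MERGE_EARLY.sub(r"Early \1 and \2.", text)


cleaned = post_process("Early frost. Early fog. Isolated showers.")
```

Because these rules run on finished text, they are easy to add but know nothing about the underlying weather data, which is why the slide describes them as covering limitations.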
Example District Forecast... inc local effects
Products: all forecasts are in XML ...
QC.. with some help from our testing infrastructure ...
Change Management ....
Importance of specifications
Agreed? policies
Big change in the role of forecasters
Forecaster edits for style and/or substance
Change management
The End
Text Generation in the GFE