Status and Plans for the International Comprehensive Ocean-Atmosphere
Data Set (ICOADS)
Scott Woodruff (NOAA/ESRL USA)
Contributions include:
Eric Freeman (NOAA/NCDC USA)
Sandy Lubker (NOAA/ESRL USA)
Shawn Smith (FSU/COAPS USA)
Clive Wilkinson (UEA, UK)
Steve Worley (NCAR USA)
4th Atmospheric Circulation Reconstructions over the Earth (ACRE) Workshop
KNMI, De Bilt, The Netherlands, 21-23 September 2011
Topics
(1) Release 2.5 overview
(2) Data rescue status
(3) ICOADS Value-Added Database (IVAD)
(4) Release 2.6.0: planned by Oct. 2012
(5) International data management issues
RECLAIM
(1) R2.5 Overview
Major delayed-mode update completed in 2009
• 1662-2007
Also: “Preliminary” near-real-time updates
• Link to Research to Operations (R2O) plans
Data, metadata, and product access
• NCAR, NCDC, and ESRL all provide complementary capabilities serving a diverse range of customers
• E.g.: ~400 unique users per year just from NCAR
Project web portal: http://icoads.noaa.gov/
Woodruff, S.D., S.J. Worley, S.J. Lubker, Z. Ji, J.E. Freeman, D.I. Berry, P. Brohan, E.C. Kent, R.W. Reynolds, S.R. Smith, and C. Wilkinson, 2011: ICOADS Release 2.5: Extensions and Enhancements to the Surface Marine Meteorological Archive. Int. J. Climatol., 31, 951-967.
R2.5: Recent platform mixture
• Voluntary Observing Ships (VOS) plus drifting and moored buoys, and other marine platform types
• VOS metadata (with help from UK NOCS)
  – WMO Pub. 47 VOS metadata 1966-2007
  – 1955-65 problematic
The official Release 2.5 period (1662-2007) is now extended monthly with “preliminary” real-time data and products based on GTS data
Post-R2.5 Assessments of Data Quality – Part 1: World Ocean Database 2005 AT Errors
• Flow of investigation – UK NOCS, NCAR, NOAA NODC
• Anticipating a correction in WOD09
• Illustrates the benefits of cross-agency teamwork and problem solving
• Somewhat outside the funding envelope in all organizations
Realistic distribution of AT with latitude from WOD
Error in one component of WOD
Data work takes time and is constantly challenging…
Post-R2.5 Assessments of Data Quality – Part 2: 20CR Feedbacks on R2.5 data
[Figure panels: Unfiltered by 20CR QC; Filtered by 20CR QC]
ΔP: ensemble mean first guess pressure minus modified obs pressure for selected “decks”
Decks shown: 155=HSST, 197=Danish, 701=Maury, 702=Norw.
(2) Data Rescue Status
Severe downsizing of NOAA/CDMP was a major loss
Can more be done now with Crowdsourcing (& OCR)?
Potential Crowdsourcing issues:
• Generating diverse public interest?
• Are multiple form types a problem?
• Competition for limited capabilities?
Wilkinson, C., S.D. Woodruff, P. Brohan, S. Claesson, E. Freeman, F. Koek, S.J. Lubker, C. Marzin, and D. Wheeler, 2011: RECovery of Logbooks And International Marine Data: The RECLAIM Project. Int. J. Climatol., 31, 968-979.
ICOADS Marine Data Rescue: Status and Future CDMP Priorities (http://icoads.noaa.gov/reclaim/pdf/marine-data-rescue.pdf)
Data Rescue Best Practices: Proposed “pipelining”
(a) Imaging
(b) Digitizing (keying)
(c) IMMA translation
• As practical, initiate concurrent processing:
  – (b) prior to completing (a)
  – (c) prior to completing (a-b)
• Can be helpful to explore data quality/characteristics in advance (e.g. duplicates)
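The proposed pipelining, where (b) starts before (a) is complete and (c) starts before (a-b) are complete, can be sketched with chained Python generators. This is only an illustration under stated assumptions: the function names, the string stand-ins for images and records, and the event log are all hypothetical and do not reflect actual ICOADS or CDMP tooling.

```python
from typing import Iterator, List, Tuple


def image_pages(pages: List[str], log: List[str]) -> Iterator[str]:
    """Stage (a): yield one page image at a time (hypothetical stand-in)."""
    for page in pages:
        log.append(f"a:{page}")
        yield f"img({page})"


def key_pages(images: Iterator[str], log: List[str]) -> Iterator[str]:
    """Stage (b): key each image as soon as it becomes available."""
    for image in images:
        log.append(f"b:{image}")
        yield f"keyed({image})"


def translate(keyed: Iterator[str], log: List[str]) -> Iterator[str]:
    """Stage (c): translate each keyed record (stand-in for IMMA translation)."""
    for record in keyed:
        log.append(f"c:{record}")
        yield f"imma({record})"


def run_pipeline(pages: List[str]) -> Tuple[List[str], List[str]]:
    """Chain the three stages lazily and return (outputs, event log)."""
    log: List[str] = []
    outputs = list(translate(key_pages(image_pages(pages, log), log), log))
    return outputs, log
```

Because each stage pulls items lazily from the previous one, the event log shows the first page being keyed and translated before the second page is imaged, which is the overlap the slide proposes.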
Resource constraints
• All these steps have been costly, but Crowdsourcing is a new (b) option
• (c) IMMA translation has generally been under-resourced – but serves as a critical foundation for applications
Current Data Rescue Candidates for Blending
Major (past) contributions from NOAA/CDMP
RECovery of Logbooks And International Marine Data (RECLAIM) Project
Atmospheric Circulation Reconstructions over the Earth (ACRE)
[Figure legend: Green = digitized; Yellow = partially digitized; Red = undigitized. Figure labels: R2.5, R2.4]
Note: all require translation
US Lightship Collection
1916-82 component; ~430K daily obs.
(CDMP-funded project initiated by WHOI)

Lightship Name        Period(s) of record
Ambrose               1937-74
Barnegat              1947-70
Boston                1958-75
Buzzards Bay          1958-80
Chesapeake            1947-79
Delaware              1961-70
Diamond Shoals        1947-74
Five Fathoms          1957-72
Frying Pan Shoals     1936-79
Georges Shoal AFS     1956-60
Nantucket             1916-18 and 1947-82
Pollock Rip           1947-69
Portland              1956-66
Savannah              1954-64

• 33 of 39 files currently available: undergoing additional QC and IMMA translation; pre-validation of data planned by ICOADS prior to blending
Greenwich Mean Noon (GMN): 1910-47 Obs.
(plus many other forms/published data – back to 1870s)
• Potential crowdsourcing project?
• Imaging 100% complete (~190K pages) and readily available
• Very large collection (2.6M rpts)
[Figure: 1910-1947 Simultaneous Observations Form Type Counts. Form types: 1201, 1201-M, 121, 1210, 1210A, 121A, 123, 138, 201, 407, 42, 4350, 911, Abstract Storm Log, Barometer, Cover, Fog Reports, Gale and Storm Reports, Greenwich Mean Noon Observations, Misc]
Additional CDMP-KNMI project possibility for Crowdsourcing: Dutch Extract Journals: 1826-92
• Imaging also 100% complete
  – 17,565 images
  – 193 logbooks from 327 ships
• 650K estimated daily reports
German Maury Collection: 1845-67; 544K
Due to resource constraints, only an estimated 50% of the collection can be QC’d and translated in the next 6-7 months
(3) ICOADS Value-Added Database
http://icoads.noaa.gov/ivad/
Project aim: Address our current inability to trace value-added improvements back to individual ICOADS observations through:
• establishment of DBMS to support development of value-added records and facilitate user access;
• implementation of supporting modifications to IMMA format
Scientifically demonstrate the impact of value-added records on air-sea flux estimates & common climate indicators
Limited funding for FY2011-13 obtained from NOAA Climate Program Office
• Divided among NOAA (ESRL and NCDC), FSU/COAPS, and NCAR
• E.g. ESRL programmer levels over three years = ~6 person months
IMMA: A Robust and Extensible Observational Data Format
International Maritime Met. Archive (IMMA) format (ASCII)
Core + optional “attachments”
[Diagram labels: core, icoads, immt, meta, model, suppl, . . .]
Key requirement: attachment (attm) of original data forms. Experience demonstrates that format translations frequently contain errors or omissions
Advantage: exact copy of original permits re-translation and cross-checks at any time
A new IVAD attachment with:
• Field number (FN)
• Value-added data (VAD)
• Author reference code (ARC)
• Uncertainty, QC, etc.
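As a rough illustration of how an IVAD attachment carrying FN, VAD, and ARC could sit alongside an unmodified core record, the Python sketch below models one attachment and builds a value-added view for a chosen author. The class name, the dictionary keyed by field number, and the function are assumptions made for this example; they are not the IMMA specification, which defines fixed-width ASCII fields.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class IvadAttachment:
    """Hypothetical in-memory form of one IVAD attachment."""
    fn: int                              # field number (FN) being adjusted
    vad: float                           # value-added data (VAD)
    arc: int                             # author reference code (ARC)
    uncertainty: Optional[float] = None  # optional uncertainty estimate
    qc: Optional[str] = None             # optional QC indicator


def value_added_view(core: Dict[int, float],
                     attms: List[IvadAttachment],
                     arc: int) -> Dict[int, float]:
    """Return a copy of the core fields with one author's VAD applied.

    The original core values are never modified, so every adjustment
    stays traceable to the observation it was derived from.
    """
    view = dict(core)
    for attm in attms:
        if attm.arc == arc:
            view[attm.fn] = attm.vad
    return view
```

For example, with `core = {35: 18.4}` (field 35 being SST, as in the schematic) and an attachment `IvadAttachment(fn=35, vad=18.1, arc=1)`, the view for ARC 1 holds 18.1 while `core` keeps 18.4. The numeric values are illustrative only.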
ICOADS Value-Added Database (IVAD): Overview Schematic
(1) IMMA core or attms: selected field (FN), e.g. field(35)=SST
(2) IMMA IVAD attm: FN, VAD, ARC, etc.
    – Value-added data (VAD; e.g. for SST if FN=35)
    – Author reference code (ARC)
(3) ICOADS web service for serving IMMA data, e.g. SST and/or VAD
(4) ARC overview table
(5) ICOADS-hosted file repository (URL1)
(6) Optional external file repository (URL2)

ARC overview table:
ARC   Text   DOI   URL1   URL2
1
2
…
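The ARC overview table in the schematic maps each author reference code to descriptive text, a DOI, and up to two repository URLs. A minimal Python sketch of such a lookup follows. The class and function names are hypothetical, and falling back from URL1 to URL2 is an assumption made for the example rather than documented IVAD behaviour.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class ArcEntry:
    """One row of a hypothetical ARC overview table."""
    arc: int
    text: str
    doi: Optional[str] = None
    url1: Optional[str] = None  # ICOADS-hosted file repository
    url2: Optional[str] = None  # optional external file repository


def build_table(entries: Iterable[ArcEntry]) -> Dict[int, ArcEntry]:
    """Index rows by ARC, rejecting duplicate codes."""
    table: Dict[int, ArcEntry] = {}
    for entry in entries:
        if entry.arc in table:
            raise ValueError(f"duplicate ARC {entry.arc}")
        table[entry.arc] = entry
    return table


def repository_url(table: Dict[int, ArcEntry], arc: int) -> Optional[str]:
    """Return URL1 if present, otherwise URL2 (assumed preference order)."""
    entry = table[arc]
    return entry.url1 or entry.url2
```

Keeping the ARC as the only key stored in each IVAD attachment means the descriptive text, DOI, and file locations can be updated in one place without touching the observational records.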
(4) Release 2.6.0 Plans
Data ingest cutoff: ~April 2012
• Available historical inputs – in IMMA format
A variety of data corrections
• Will process an updated WOD09
• Incorporate near-surface salinity (+ related data/metadata)
IMMA format improvements in conjunction with IVAD
Other new attachments:
• Physical Oceanography (Ocn)
• Automated instrumentation (Auto)
• Reanalysis feedbacks: to be developed later w/ reanalysis projects
• Platform tracking (Track): storage capacity, but no resources in sight to implement
Establish Unique Report ID (UID) (within Icoads attm)
Critical advance to provide a permanent ID for each ICOADS record to benefit reanalyses and other applications (plus IVAD work)
• 6-digit base-36 number (i.e. alphanumeric)
• Tentatively we plan to number the entire R2.6.0 intermediate product (i.e. containing all duplicates): 1, …, R2.6.0i
  – Note: R2.5i contains ~295M reports
• Subsequent additions would be numbered >R2.6.0i prior to blending (insertion of historical data would disorder the initial pure numeric sequence)
• Unresolved questions regarding data report modifications, possible report compositing, etc.
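To make the "6-digit base-36 number" concrete, the Python sketch below converts between a sequential report number and a fixed-width alphanumeric UID. The digit order (0-9 followed by A-Z), the use of uppercase letters, and the zero padding are assumptions made for illustration; the slides specify only the width and the base.

```python
import string

ALPHABET = string.digits + string.ascii_uppercase  # assumed order: 0-9, then A-Z
UID_WIDTH = 6
UID_CAPACITY = len(ALPHABET) ** UID_WIDTH           # 36**6 distinct values


def encode_uid(number: int) -> str:
    """Encode a non-negative report number as a zero-padded base-36 string."""
    if not 0 <= number < UID_CAPACITY:
        raise ValueError(f"{number} does not fit in {UID_WIDTH} base-36 digits")
    digits = []
    for _ in range(UID_WIDTH):
        number, remainder = divmod(number, len(ALPHABET))
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))  # most significant digit first


def decode_uid(uid: str) -> int:
    """Decode a 6-character base-36 UID back to its report number."""
    if len(uid) != UID_WIDTH:
        raise ValueError(f"UID must be {UID_WIDTH} characters long")
    return int(uid, len(ALPHABET))
```

Six base-36 digits give 36**6 = 2,176,782,336 possible values, roughly seven times the ~295M reports quoted for R2.5i, so under these assumptions there is headroom for subsequent additions numbered beyond the initial sequence.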
(5) International Data Management Issues
Proposed network of mirrored WMO-IOC Centres for Marine-met. and Ocean Climatological Data (CMOC)
Proposed requirements:
• Host standardized formats and QC processing
• Reliably mirror data and products
• Open data access; WIS (WMO Information System) interoperability
Benefits, e.g. historical data exchange
• Countries can be reluctant to exchange historical data without assurance of formal international repository
JCOMM Marine Climate Data System (MCDS) Workshop
28 November-2 December 2011, Hamburg, Germany
Conclusions
Regular MARCDAT/CLIMAR workshops (~every 2 yr)
• data focus; help drive progress & develop shared ownership
• latest: MARCDAT-III, 2-6 May 2011, Frascati
• Overall “CLIMAR” initiative: http://www.marineclimatology.net/web/
Involvement with satellite projects and the surface temperature (land) initiative offers an important new avenue for closer linkages between communities
• E.g. interoperable tracking of data provenance (UID)
QC and bias-adjustment improvements needed
• e.g. static QC limits extensively missing for high-latitude data
• link with IVAD work – cross-checking via 20CR results?