EarthCube: Transforming the Geosciences
UCGIS Symposium - George Mason U: May 23, 2013
A Joint Venture of the NSF Directorate of Geosciences and Office of Cyberinfrastructure
Big Questions, Big Problems!!
geohazards
climate change
life as a geologic agent
formation & evolution of the atmosphere & oceans
environmental change & resilience
extreme events: causes, periodicity, & implications
future world
the origin of life
resource discovery & abundance
human-earth interactions
continental evolution & changes thru time
deep – surface earth interactions & feedbacks
YOU ARE HERE!
Community: Crazy, Complicated, Fascinating
Present Relative State of Cyber-Sophistication and Knowledge in the Geosciences
Atmospheric and Climate Science Communities
Seismology/Earthquake and Physical Oceanography Communities
Nearly all other Geoscience groups
Geospatial/Cyberinfrastructure Communities
Age of Enlightenment
Industrial Age
Modern Age
Bronze Age
I am here
the 15%
the 85%
The 85% spend about 80% of their time looking for, collecting, and getting the necessary data together in a format they can use and about 20% of their time actually thinking/doing science
Read It and Weep
The 15% spend an increasing amount of time wrestling with unmanageably large data arrays and with scaling from global to regional or local scales
The two groups are not well integrated with each other, yet both types of data and both types of geoscience disciplines are required to solve the complex, interrelated, and pressing environmental problems we and the earth are facing
Two very different levels of investment:
• HPC, big iron, federal archives, modeling centers, data repositories, dedicated personnel and facilities
• Excel spreadsheets, hero code, dark data, cultural issues, no sustainability
Two very different relationships with data:
• Array-based: no personal ownership, don’t care about any given data point, computationally intensive processing and modeling
• Point-based: intense personal ownership, care deeply about each point, can interpret directly or simply
Two very different types of data:
• sensor, bit-stream, real-time: GB/TB size (satellite, radar, seismic)
• point-based, observations, images, multi-informational, hard to describe
The Problem (the 15% vs the 85%)
Software
Analytics
Modeling
Communities
Visualization
Interoperability
Sea of Data
CIF21
Grand Challenge
Multi-disciplinary & multi-scale integration
The Geosciences: Diverse Communities, Data Types, Cultures, and Levels of Cyber Sophistication
Accelerating Scientific Discovery
Our Biggest Present Problem
Dynamic Earth
Changing Climate
Earth & Life
Geosphere-Biosphere Connection
Water: Changing Perspectives
• Transform the conduct of data-enabled geoscience-related research.
• Create effective community-driven cyberinfrastructure.
• Allow global data discovery and knowledge management.
• Achieve interoperability and data integration across disciplines.
What Is EarthCube?
Atmosphere Chemistry
Climate Dynamics
Paleo-climate
Meteorology
Aeronomy
Cyber/Computer Science
Geodetics
Space Physics
Solar Terrestrial
Geochemistry
Tectonics Structure
Earth Science Education
Polar Programs
NCAR
Geophysics
EarthScope
Tectonics Structure
Geobiology
Biological Oceanography
Geomorphology
Hydrology
Sedimentology
Marine Geophysics
Physical Oceanography
Ocean Drilling
Chemical Oceanography
Marine Geology
Ocean Education
HPC, super computing
Biology
Glaciology
Ecosystem
Geospatial
Data management
Software Engineering
EarthCube CI
?
?
Who Is EarthCube? You Are!!!
An alternative approach to respond to daunting science and CI challenges
EarthCube is an outcome AND a process
EarthCube will require broad community involvement; new ways of doing
Path to the Vision
Unidata
IRIS
IEDA
NCAR
OOI
CUAHSI
Important Features:
• Builds off existing data/modeling systems/cyberinfrastructure investments
• Provides tools/approaches that enhance data discovery, access, and integration
• Addresses serious cyber needs in fields where individual data points and observations are important
• Leverages investments across fields
• Allows for more integrative and interdisciplinary science
Convergence Using Spiral Development
10 Years
Given: Technology improves and changes over time.
Result: EarthCube is being designed in a step-wise, modular fashion to accommodate change and allow refreshing over time.
Timeline 2013 - 2014
Release of umbrella Solicitation w/ 1st Amendment
Nov 2012
May 2013
GEO End-User Workshops Phase 1
FY 2014-FY 2016 (cycle repeats)
Proto-Gov & EC-RCN Awards – 1st Amend
Deadline of 1st & release of 2nd Amendment
Feb 2013
Oct 2012-Mar 2013
Jun 2013
Building Blocks & Concept Design
Architecture Awards – 2nd Amend
Sept 2013
Community Meeting
Release of 3rd Amendment
Nov 2013
End-User Workshops Phase 2
Feel Our Pain!
help me!
Seven Modes of Failure
Unrealistic or misaligned expectations among people presently involved in EarthCube
“Build it and they will come” mindset – users don’t show up, data is not shared, etc.
Not valuing what presently exists – current cyber/geo science efforts and initiatives that represent parts of the EarthCube vision
Not advancing the frontier in transformative ways relative to what presently exists – only automating the current state
Not engaging the 120,000+ geoscience and cyber stakeholders not presently involved in EarthCube
Not anticipating the needs of the next generation of geoscience and cyber stakeholders (today's doctoral students and postdocs, as well as the generation behind them)
“Unknown Unknowns” – additional unknown unknowns including transformational changes in the technology, catastrophic shifts in the policy arena, etc.
Barriers to Progress
Lack of cyber-readiness for some, and lack of awareness of tools and approaches that could speed discovery and analysis from those other than “the usual suspects”
Interoperability of disparate data types and formats; bringing dark data to light and allowing “power processing”
Need for automation and smart tools to create metadata and facilitate direct lab/lab notebook data delivery to data systems in the appropriate format for ingestion
Need for vastly improved handling of “big data” and ability to extract the needed information that may only be a tiny part of the whole dataset
Overcoming cultural and semantic barriers between cyber/computer scientists and geoscientists to allow acceleration of development and identification of user needs
Anticipating the needs of the next generation of geoscientists and questions/models focusing on more realistically simulating complex natural systems
“Unknown Unknowns” including extensibility into transformational changes in the technology, catastrophic shifts in the policy, etc.
Now:
• Imagine a world with easy, unlimited access to scientific data from any field.
• Imagine a world where anyone can easily plot data of interest and display it any way they want.
• Imagine a world where people can easily model their results and explore any ideas they might have.
Blue-Skying the Future
What science could they do?
What discoveries could you help them make?