Semantic Enablers for Next Generation Data Environments:
Environmental and Disaster Data Focus
Deborah L. McGuinness
Tetherless World Senior Constellation Chair
Professor of Computer and Cognitive Science
Web Science Research Center Director
Rensselaer Polytechnic Institute, Troy, NY
With thanks to the extended RPI Tetherless World Team, particularly Jim McCusker, Katie Chastain, Zach Frye, John Erickson, Rui Yan
April 19, 2013
Open Data Workflow
First Responder Network
THEMES Observatories: Science, Open Government, Health and Life Science, Social
Web Science Research Foundations
• Making Data Transparent and Actionable
• Provenance
• Semantic Methodology
• Social Network Analysis
• Semantically-Enabled Visualization
• Web Data "Challenge Response" Enablers
Social Media: Reasoning on Graph Database
Health and Human Services Data Challenge
International Open Government Data Sets
Rensselaer Tetherless World Constellation
Web Observatory Foundations & Directions
Multi-Dimensional Data Portals
Semantic eScience Data Portals
First Responder Portal (with NIST)
How do we make First Responders*
Safer and More Effective?
Answer: Leverage Semantic Technologies and Social Media Analysis
McGuinness, Erickson, McCusker, Chastain, Frye, Yan
* Emergency Medical Personnel, Firefighters, and Police Officers
Social Observatory – First Responder effort (NIST funded)
Social Media use is on the rise. Every day, we write:
• 294 billion emails
• 2 million blog posts
• Over 40 million tweets*
How can we leverage Social Media sites
• to gather requirements for active First Responders?
• to identify communities?
• to identify stakeholders within those First Responder communities?
• to identify trends?
http://tw.rpi.edu/web/project/FirstResponders
Finding Topics
Finding Users
Semantically-Enabled Environmental and Ecological Monitoring
• Where are pollution events happening?
• What are the health impacts?
• How does pollution correlate with population changes (wildlife, invasives, etc.)?
Semantic Environmental Monitoring (SemantEco)
• Enable/Empower communities (citizens & scientists) to explore pollution sites, facilities, regulations, and health impacts along with provenance
• Connections to USGS, Lake George, IBM, expanding to discussions of predictions and intervention suggestions
1. Explanation of pollution limits
2. Graphing thresholds and trends
3. Possible health effect of contaminant (EPA)
4. Filtering by facet to select type of data
5. Link for reporting problems
6. Extended with input from USGS, with population counts for birds & fish
McGuinness, Patton, Seyed, Wang, et al.
Multi-dimensional Semantic Integration and Analysis
Example questions:
- What intervention strategies correlate with decreased smoking?
  - By age group, geographic area, etc.
- What scientific topic is emerging in a time period?
Semantic Web methodology, accountable mashups, multi-dimensional analysis, aggregation semantics tools
Domain answer: bans and labeling for certain age brackets
PopSciGrid with NIH looked at “preventable cancers”, hypothesized contributors (smoking), and interventions: taxation, bans, labeling
McCusker, McGuinness, Lee, Thomas, Courtney, Tatalovich, Contractor, Morgan, Shaikh. Towards Next Generation Health Data Exploration: A Data Cube-based Investigation into Population Statistics for Tobacco, 2013
Multi-dimensional Semantic Analysis work is also being developed for the IARPA FUSE (Foresight and Understanding from Scientific Exposition) project
Creating More Accessible Health Data as Open Linked Data
Variety of health questions
• What is a good hospital for my concerns? (by condition, by bounceback, communication, etc.)
Semantic Data Environment (Prizms, DataCube Explorer,
• Health and Human Services (HHS) award-winning platform
• Being repurposed for mutations and malignant melanoma data in Melagrid
Lebo, McCusker, Graves, Gloria, Erickson, Hendler, McGuinness
Ultimately: Community Science
Extras
An Example: Hawaii
Changes in cigarette use viewed against policy changes
We link each state's records from year to year, so that a state is tracked as a single entity across time, with data added for each year.
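The year-to-year linking can be pictured as a per-state time series. The following is a minimal Python sketch of that idea, not the PopSciGrid implementation. The record layout, the measure names, and the sample values are all hypothetical placeholders, not real statistics.

```python
from collections import defaultdict

# Hypothetical observations: (state, year, measure, value).
# The values are placeholders for illustration only.
observations = [
    ("Hawaii", 2001, "smoking_rate", 0.30),
    ("Hawaii", 2000, "smoking_rate", 0.40),
    ("Hawaii", 2000, "cigarette_tax", 1.00),
    ("New York", 2000, "smoking_rate", 0.50),
]

def link_by_state(rows):
    """Group observations by state, then by year, so that each state
    is one entity with data attached for every year."""
    linked = defaultdict(lambda: defaultdict(dict))
    for state, year, measure, value in rows:
        linked[state][year][measure] = value
    # Return plain dicts with years in ascending order.
    return {
        state: {year: dict(years[year]) for year in sorted(years)}
        for state, years in linked.items()
    }

timeline = link_by_state(observations)
for year, measures in timeline["Hawaii"].items():
    print(year, measures)
```

With the data arranged this way, a policy measure and a usage measure for the same state and year sit side by side, which is the comparison the Hawaii example makes.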
Semantic Web Observatories
• Web Observatories are emerging in many areas
• Can be leveraged to engage broader types of communities
Department of Health and Human Services' Developer Challenge
A group from RPI TWC won first place in the competition using semantic technologies and software developed in-house, such as csv2rdf4lod, LODSPeaKr, Farrah and DataFAQS.
Example Workflow (SemantAqua)
[Workflow diagram; labels: Archive, CSV2RDF4LOD Enhance, CSV2RDF4LOD Direct, Publish, derive, integrate, archive, visualize]
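To make the "Direct" conversion label in the workflow concrete, here is a minimal Python sketch of a naive row-by-row CSV-to-RDF conversion that emits N-Triples lines. It is not the csv2rdf4lod tool and does not use its interface. The `http://example.org/` namespace, the column names, and the sample row are hypothetical, and the sketch assumes identifiers and column names are already safe to place in an IRI.

```python
import csv
import io

BASE = "http://example.org/"  # hypothetical namespace, not a real dataset URI

def csv_to_ntriples(text, id_column):
    """Turn each CSV row into a subject and each remaining
    non-empty cell into one triple with a plain literal object."""
    triples = []
    for row in csv.DictReader(io.StringIO(text)):
        subject = f"<{BASE}id/{row[id_column]}>"
        for column, value in row.items():
            if column == id_column or not value:
                continue
            predicate = f"<{BASE}vocab/{column}>"
            # Escape backslashes and double quotes inside the literal.
            literal = value.replace("\\", "\\\\").replace('"', '\\"')
            triples.append(f'{subject} {predicate} "{literal}" .')
    return triples

# Hypothetical one-row table of water measurements.
sample = "site,contaminant,value\nS1,arsenic,0.02\n"
for line in csv_to_ntriples(sample, "site"):
    print(line)
```

An enhancement pass would then typically replace the generic predicates and string literals with shared vocabulary terms and typed values; that step is not shown here.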
Open Government Data
TWC – Intl Open Government Data Sets
First Responder Portal (with NIST): Gather requirements
First Responders can be found via existing social media sites,
then directed to new community portals for more
focused engagement
First Responder Portal (with NIST): Finding Stakeholders and Communities
Top #wx tweeters
Top #smem tweeters
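A ranking such as "top #wx tweeters" can be produced by counting tweets per user for a given hashtag. The sketch below is one simple way to do that in Python and is not the project's actual pipeline. The user names and tweet texts are invented sample data.

```python
from collections import Counter

# Hypothetical sample: (user, tweet text). Not real accounts or tweets.
tweets = [
    ("alice", "Flood warning issued #wx #smem"),
    ("bob", "Storm cell moving east #wx"),
    ("alice", "Road closures posted #WX"),
    ("carol", "Shelter locations updated #smem"),
]

def hashtags(text):
    """Lower-cased hashtags in a tweet, with trailing punctuation removed."""
    return {
        word.lower().rstrip(".,!?:;")
        for word in text.split()
        if word.startswith("#")
    }

def top_tweeters(rows, tag, n=10):
    """Users ranked by how many of their tweets carry the hashtag."""
    counts = Counter(
        user for user, text in rows if tag.lower() in hashtags(text)
    )
    return counts.most_common(n)

print(top_tweeters(tweets, "#wx"))
```

Raw tweet counts are only a first cut at finding stakeholders; they say nothing about reach or role in the community.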
First Responder Portal (with NIST): Identifying Trends, Topics
Topics mentioned with #wx
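One simple proxy for "topics mentioned with #wx" is hashtag co-occurrence: count the other hashtags that appear in tweets containing #wx. The Python sketch below illustrates that approach under the assumption that hashtags stand in for topics; it is not the method used to produce the slide. The tweet texts are invented sample data.

```python
from collections import Counter

# Hypothetical sample tweets, not real data.
tweets = [
    "Flood warning issued #wx #flood",
    "Storm cell moving east #wx #storm #flood",
    "Shelter locations updated #smem",
]

def co_mentioned(texts, tag):
    """Count hashtags that appear in the same tweet as the given tag."""
    tag = tag.lower()
    counts = Counter()
    for text in texts:
        tags = {word.lower() for word in text.split() if word.startswith("#")}
        if tag in tags:
            counts.update(tags - {tag})
    return counts

topic_counts = co_mentioned(tweets, "#wx")
print(topic_counts.most_common())
```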