Semantic Water Quality Portal Jin Guang Zheng and Ping Wang Tetherless World Constellation.

Date post: 18-Jan-2018
Semantic Water Quality Portal

Jin Guang Zheng and Ping Wang
Tetherless World Constellation

Outline

• Introduction
• Methods
  – System Architecture
  – Ontology
  – Provenance
  – Visualization
• Demo
• Claims
• Conclusion
  – Improvements
  – Future Work
  – Contributions

Introduction

• Semantic Water Quality Project
  – Continuing the SWQP project from last semester's Semantic eScience class
  – Goal: Help citizens identify polluted water sources and potential pollution sources, thereby alleviating or controlling adverse health effects.
  – Credits: Evan, Theodora, Ping, Jin

Motivation Use Case

• Use Case:
  – Children start getting sick: vomiting
  – Residents request that the authority check the water supply.
  – The authority collects data from various sources: EPA, USGS, state regulations, etc.
  – The authority analyzes the data
  – The authority reports the analyzed results
  – And more …

SWQP

• The Semantic Water Quality Portal can ease this process:
  – Integrate data from various sources
  – Perform automatic analysis (reasoning) to identify polluted water sources and possible sources of pollutants: facilities that violate regulations
  – Present analyzed results in a user-friendly interface.

Research Question

How can we use semantic web technology to solve environment-related problems?

System Architecture

Ontology

• Two types of ontology:
  – Core Ontology
    • Encodes the main inference and reasoning rules
  – Regulation Ontologies
    • Encode regulations from different states
• Reasoning Example:
  – “Any water source that has a measurement over a certain threshold is a polluted water source” (core ontology)
  – “Any measurement that has a value of 0.01 mg/l of Arsenic is a threshold” (regulation ontology)
  – “Any water source that contains 0.01 mg/l of Arsenic is a polluted water source.” (inferred from the above rules)
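
The way the core rule and a regulation rule combine can be sketched in plain Python. This is only an illustration: the portal encodes these rules in ontologies and evaluates them with a reasoner, and the names `REGULATION_THRESHOLDS`, `Measurement`, `exceeds_threshold` and `polluted_sources`, as well as the site identifiers, are hypothetical. The sketch also assumes that a value equal to the limit already counts as a violation, which matches the inferred statement above but is not spelled out on the slide.

```python
from dataclasses import dataclass

# Stand-in for a regulation ontology (assumed shape): contaminant -> limit in mg/l.
# The 0.01 mg/l Arsenic value is the one quoted on the slide.
REGULATION_THRESHOLDS = {"Arsenic": 0.01}


@dataclass(frozen=True)
class Measurement:
    water_source: str
    contaminant: str
    value_mg_per_l: float


def exceeds_threshold(m, thresholds):
    """Regulation rule: compare a measurement with its contaminant's limit."""
    limit = thresholds.get(m.contaminant)
    # Assumption: a value equal to the limit already counts as a violation.
    return limit is not None and m.value_mg_per_l >= limit


def polluted_sources(measurements, thresholds):
    """Core rule: a source with any violating measurement is polluted."""
    return {m.water_source for m in measurements if exceeds_threshold(m, thresholds)}


# Hypothetical sample data.
measurements = [
    Measurement("site-A", "Arsenic", 0.01),
    Measurement("site-B", "Arsenic", 0.004),
]
print(polluted_sources(measurements, REGULATION_THRESHOLDS))  # prints {'site-A'}
```

Keeping the thresholds separate from the core rule mirrors the split between regulation ontologies and the core ontology: swapping in another state's limits would not touch the core rule.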

Provenance

• Data Level Provenance
  – Where are the original data?
  – Provide provenance-based query.
• Application Level Provenance
  – What data did we use in the analysis and reasoning step?
  – Provide an explanation to the user when a water source is marked as polluted

Visualization

• Map Visualization:
  – Presents analyzed results with Google Maps
    • Polluted Water Source, Polluting Facility
  – Presents an explanation of why a water source is marked as polluted
  – Uses a “facet” type filter to select the type of data
• Trend Visualization:
  – Presents data as a trend visualization for the user to explore and analyze.

Demo Time

Claim I - Problem

• Problem:
  – Data are collected from various sources:
    • EPA, USGS, etc.
  – Heterogeneous Data:
    • Difficult to query
    • Data are stored using different schemas, and the semantics of the terms in different schemas can differ greatly

Claim I

Semantic data integration helps SWQP integrate data from various sources, eases future data integration, and makes it easier to use existing reasoners to perform reasoning.

Claim I Example

• Various Data Sources:
  – Convert into RDF and load into a triple store.
  – Use SPARQL to query data
• Use the EPA ontology as the central schema to encode converted data
• Easier for future data integration:
  – Easier to accommodate schema changes: add equivalence statements, new properties, new classes, etc.
• Easier to use existing reasoners:
  – Jena, Pellet, etc.
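
The integration idea above can be sketched without any RDF library: rows from differently shaped sources are mapped onto one shared vocabulary and stored as (subject, predicate, object) triples, which a single pattern can then query. Everything here is assumed for illustration: the column names, the `ex:` vocabulary, and the helpers `to_triples` and `match` are hypothetical, and a real deployment would use an RDF converter, a triple store, and SPARQL instead.

```python
# Hypothetical source records; the field names are invented for illustration.
epa_rows = [{"site": "epa-1", "pollutant": "Arsenic", "amount": "0.02"}]
usgs_rows = [{"station_id": "usgs-7", "characteristic": "Arsenic", "result": "0.004"}]

# Per-source mapping from local column names to one shared vocabulary
# (standing in for the central schema).
FIELD_MAPS = {
    "EPA": {"site": "ex:site", "pollutant": "ex:contaminant", "amount": "ex:value"},
    "USGS": {
        "station_id": "ex:site",
        "characteristic": "ex:contaminant",
        "result": "ex:value",
    },
}


def to_triples(source, rows):
    """Turn each row into (subject, predicate, object) triples."""
    triples = set()
    for i, row in enumerate(rows):
        subject = f"ex:{source}/measurement/{i}"
        triples.add((subject, "ex:source", source))
        for column, value in row.items():
            triples.add((subject, FIELD_MAPS[source][column], value))
    return triples


def match(store, predicate, obj):
    """Subjects with the given predicate/object pair, like a one-pattern query."""
    return sorted(s for s, p, o in store if p == predicate and o == obj)


store = to_triples("EPA", epa_rows) | to_triples("USGS", usgs_rows)
print(match(store, "ex:contaminant", "Arsenic"))
# prints ['ex:EPA/measurement/0', 'ex:USGS/measurement/0']
```

Adding a third source in this sketch only requires another entry in `FIELD_MAPS`, which is the sense in which a shared schema eases future integration.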

Claim II - Problem

• Problem:
  – The process of determining whether a water source is polluted can be complex and time consuming.
• Example:
  – 10 contaminants in a water source.
  – Each contaminant has been measured 10 times.
  – There are 50 regulation limits.

Claim II

Automatic inference and reasoning, supported by semantic web technologies, help SWQP perform automatic analysis of water quality, etc.

Claim II Example

• Reasoning and Inference:
  – Identify that the measured object is a water source
  – Find all measurements for the water source
  – Validate that each measurement is measuring a water contaminant.
  – Perform reasoning on whether the measurement exceeds the threshold
    • What element? What unit? What value?
  – Identify the type of water source: polluted?
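
The steps above can be sketched as a small procedural pipeline. This is a rough illustration under stated assumptions, not the portal's reasoner: the data, the identifiers (`well-1`, `plant-9`), the lookup tables and the `classify` function are all hypothetical, and a limit is assumed to apply only when both element and unit match.

```python
# Hypothetical data; identifiers, types and limits are invented for illustration.
OBJECT_TYPES = {"well-1": "WaterSource", "plant-9": "Facility"}
CONTAMINANTS = {"Arsenic", "Lead"}
LIMITS = {("Arsenic", "mg/l"): 0.01}  # (element, unit) -> limit

MEASUREMENTS = [
    {"object": "well-1", "element": "Arsenic", "unit": "mg/l", "value": 0.03},
    {"object": "well-1", "element": "Temperature", "unit": "C", "value": 12.0},
    {"object": "plant-9", "element": "Arsenic", "unit": "mg/l", "value": 0.5},
]


def classify(obj):
    # Step 1: the measured object must be a water source.
    if OBJECT_TYPES.get(obj) != "WaterSource":
        return "not a water source"
    # Step 2: find all measurements for it.
    found = [m for m in MEASUREMENTS if m["object"] == obj]
    # Step 3: keep only measurements of water contaminants.
    found = [m for m in found if m["element"] in CONTAMINANTS]
    # Step 4: compare element, unit and value against the matching limit.
    for m in found:
        limit = LIMITS.get((m["element"], m["unit"]))
        if limit is not None and m["value"] > limit:
            # Step 5: any exceedance marks the source as polluted.
            return "polluted"
    return "not shown to be polluted"


print(classify("well-1"))   # prints polluted
print(classify("plant-9"))  # prints not a water source
```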

Claim III - Problem

• Problem:
  – User may not trust the analyzed result presented by SWQP.
    • I don’t think the Hudson River has been polluted.
  – User may trust data from certain sources only.
    • I don’t trust the data collected by a student for his class project.

Claim III

Provenance information encoded with semantic web technology helps SWQP solve trust-related problems.

Claim III Example

• Data Source Based Query:
  – User can select which data are analyzed.
  – Data Source Provenance
• Explanation on polluted water source:
  – Pop-up window shows the regulation used and the measured value
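
Both ideas above, restricting the analysis to trusted data sources and explaining why a source is flagged, can be sketched together. The records, the source labels, the `example-regulation` name and the `explain_pollution` function are hypothetical placeholders; the portal's actual provenance encoding is not shown on the slides.

```python
# Hypothetical records; source labels, sites and the limit are invented.
LIMITS = {"Arsenic": {"limit": 0.01, "regulation": "example-regulation"}}
MEASUREMENTS = [
    {"site": "river-1", "element": "Arsenic", "value": 0.02, "source": "USGS"},
    {"site": "river-1", "element": "Arsenic", "value": 0.05, "source": "class-project"},
    {"site": "lake-2", "element": "Arsenic", "value": 0.03, "source": "class-project"},
]


def explain_pollution(trusted_sources):
    """Return {site: [explanations]} using only data from trusted sources."""
    explanations = {}
    for m in MEASUREMENTS:
        if m["source"] not in trusted_sources:
            continue  # data-source-based query: skip untrusted data
        rule = LIMITS.get(m["element"])
        if rule and m["value"] > rule["limit"]:
            # Keep the regulation, the measured value and the data source,
            # which is the information a pop-up explanation would need.
            explanations.setdefault(m["site"], []).append(
                f"{m['element']} measured at {m['value']} exceeds limit "
                f"{rule['limit']} ({rule['regulation']}, data from {m['source']})"
            )
    return explanations


print(sorted(explain_pollution({"USGS"})))                   # prints ['river-1']
print(sorted(explain_pollution({"USGS", "class-project"})))  # prints ['lake-2', 'river-1']
```

A user who distrusts the class-project data sees only the first result, while each explanation string still records which source and regulation led to the flag.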

Conclusion

Improvements

• More data:
  – Regulation data from CA, NY, MASS, EPA
  – EPA, USGS data for multiple states
  – Provenance data are captured for both regulation data and EPA, USGS data.
• More Features:
  – Provenance-based data query and analysis
  – Trend visualization
• Speed:
  – ~15–30 seconds.
  – Main drawback now is real-time inference and reasoning and the large size of the data

Future Work

• Provenance:
  – Support building, linking and displaying proof traces that track how the answers are derived from source data.
• Health Related Reasoning:
  – Model the effects of drinking from a polluted water source.
    • Identify which polluted water sources cause people to vomit more quickly.
• Flood Reasoning:
  – Model floods
    • Identify which water sources will flood with high probability
    • Identify possible effects of floods w.r.t. water quality
• Other Work:
  – Pollutant-based query: e.g. interested in Arsenic

Contributions

• Ping:
  – Use Tim’s converter to convert EPA and USGS data.
  – Preprocess regulation data into CSV format
  – Implement the data visualization part of the project
  – Write part of this final class write-up, and present the visualization part of the demo.
• Jin:
  – Write script to convert data to RDF format encoded using the ontology
  – Design ontology to support automatic reasoning and inference
  – Re-implement Jena-Pellet based backend reasoner.
  – Class-related work: since this project is Ping’s out-of-class project, I am responsible for most of the project-related write-up, presentation, etc.

Questions

Thank you for your attention!
