Date post: | 19-Jan-2015 |
Category: | Technology |
Upload: | els-descheemaeker |
Challenges and Opportunities of Linked Open Energy Data
Chris Davis
http://enipedia.tudelft.nl
Who am I?
● Postdoc Energy & Industry, TBM, TU Delft
● Focus on Industrial Ecology, Open Data, Collaborative Software, Modeling, Visualization, Analytics, etc.
Motivations
● Energy and sustainability are some of the most important topics of the 21st century
● Need both aggregated and fine-grained data
● Research can be data intensive
● There's a lot out there, but connecting it is tedious
● Researchers often duplicate effort
● It would be great to revolutionize how we deal with this data
Information wants to be free because it has become so cheap
to distribute, copy, and recombine - too cheap to meter.
Stewart Brand
There's a Tension...
It wants to be expensive because it can be
immeasurably valuable to the recipient.
Stewart Brand
That tension will not go away. It leads to endless wrenching debate
about price, copyright, “intellectual property,” and the moral rightness of casual distribution,
because each round of new devices makes the tension worse, not better.
Stewart Brand
If you cling blindly to the expensive part
of the paradox, you miss all the action
going on in the free part.
The pressure of the paradox forces information
to explore incessantly.
Stewart Brand
Pirolli & Card (2005) The Sensemaking Process and Leverage Points for Analyst Technology as Identified Through Cognitive Task Analysis
Data Collectors
Data Scientists, Statisticians, Researchers
Policy/Decision Makers
A Metaphor for Open Data...
It's about Resource Efficiency
● Information is a resource just as much as physical resources
● ...however, it ideally gets better the more that it is used
● Data quality is (partly) a function of the amount of attention it gets
● Structure leads to benefits, but requires effort – figure out what has most value to the community
Inspiration from Pokemon...
http://www.youtube.com/watch?v=XpvQNn0n_Qw
OpenStreetMap (last 90 days)
http://www.itoworld.com/map/129
enipedia.tudelft.nl/maps
Enipedia.tudelft.nl
About data quality...
A tale of one (or four?) power stations and seven data sets
How the European Commission manages data
Large Combustion Plants Directive
http://ec.europa.eu/environment/air/pollutants/stationary/lcp/legislation.htm
I wish...
Kraftwerk (Anlagennummer: 0001)
Who is this? (EU ETS Data)
Who is this?
● Name: Kraftwerk (Anlagennummer: 0001)
● Account Holder: Felix Schoeller jr. Foto und Spezialpapiere GmbH & Co KG
● address1: Fabrikstraße 1
● address2: Felix Schoeller jr. Foto und Spezialpapiere GmbH & Co. KG
● City: Weißenborn/Erzgeb.
● CountryCode: DE
● InstallationIdentifier: 891
● InstallationName: Kraftwerk (Anlagennummer: 0001)
● MainActivityTypeCode: 1
● MainActivityTypeCodeLookup: Combustion installations with a rated thermal input exceeding 20 MW
● PermitIdentifier: 14310-0300
● ZipCode: 09600
Inspiration
http://www.flickr.com/photos/maxbraun/98688824/ http://www.flickr.com/photos/acme/229065626/
Matching Entities
0001 09600 1 909 anlagenkonto anlagennummer co erzgeb fabrikstrasse felix foto gmbh jr kg kraftwerk schoeller spezialpapiere technocell und weissenborn
09600 1 co erzgeb fabrikstrasse felix foto gmbh jr kg schoeller spezialpapiere und weissenborn werk
49086 burg co felix foto gmbh gretesch jr kg osnabruck schoeller spezialpapiere und
0001 09600 1 909 anlagenkonto anlagennummer co erzgeb fabrikstrasse felix foto gmbh jr kg kraftwerk schoeller spezialpapiere technocell und weissenborn
https://en.wikipedia.org/wiki/Claude_Shannon
http://en.wikipedia.org/wiki/Self-information
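The token lists above suggest that records are reduced to bags of normalized tokens before being compared, and the self-information link suggests weighting rare tokens more heavily than common ones. The following is a minimal sketch of that idea, not necessarily the method used for Enipedia. The function names `tokenize`, `self_information`, and `match_score` are hypothetical, and the scoring rule (shared information divided by total information) is an assumption made for illustration.

```python
import math
import re
import unicodedata
from collections import Counter


def tokenize(text):
    """Lowercase, transliterate ß, strip accents, and return the set of alphanumeric tokens."""
    text = text.lower().replace("ß", "ss")
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    return set(re.findall(r"[a-z0-9]+", text))


def self_information(records):
    """Map each token to -log2(p), where p is the fraction of records that contain it."""
    n = len(records)
    doc_freq = Counter(tok for rec in records for tok in rec)
    return {tok: -math.log2(count / n) for tok, count in doc_freq.items()}


def match_score(a, b, info):
    """Information in the shared tokens divided by information in all tokens of either record."""
    shared = sum(info[t] for t in a & b)
    total = sum(info[t] for t in a | b)
    return shared / total if total else 0.0


# The three distinct token bags from the slide, used here as a toy corpus.
a = tokenize("0001 09600 1 909 anlagenkonto anlagennummer co erzgeb fabrikstrasse "
             "felix foto gmbh jr kg kraftwerk schoeller spezialpapiere technocell "
             "und weissenborn")
b = tokenize("09600 1 co erzgeb fabrikstrasse felix foto gmbh jr kg schoeller "
             "spezialpapiere und weissenborn werk")
c = tokenize("49086 burg co felix foto gmbh gretesch jr kg osnabruck schoeller "
             "spezialpapiere und")

info = self_information([a, b, c])
score_ab = match_score(a, b, info)  # shares address tokens (09600, fabrikstrasse, ...)
score_ac = match_score(a, c, info)  # shares only the company-name tokens
```

In this toy corpus the company-name tokens (felix, schoeller, gmbh, ...) occur in every record, so they carry zero information. The first and third bags share nothing else, which gives them a score of 0, while the first and second also share the postcode and street tokens and therefore score higher. On a real dataset the token frequencies would be computed over all records rather than three.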
Current data management practices result in:
Unintentionally Anonymized Open Data
Optimized for Inefficient Maintenance
and an Uphill Battle to Enforce
Principles of Data Integrity
It's power laws all the way down
● Both contributors & data
● Challenge is aligning the two
Officially Curated vs. Crowdsourced Data
● Crowdsourcing generally OK for easily verifiable data
● Officially curated data needed for comprehensive, hard to verify data, small specialized communities
● Crowdsourced data is only possible because of revision control.
How to Measure Data Quality?
Data Quality = Researcher Skill/Experience × # Viewers/Editors × Ease of Independent Verification
Low Editor Diversity
High Editor Diversity
● Eric Raymond – “With many eyes all bugs are shallow”
● But... not all eyes are evenly distributed
Big Data (?)