Date posted: 21-Apr-2017
Category: Data & Analytics
Uploaded by: sally-jenkinson
AN INTRODUCTION TO OPEN DATA
Sally Jenkinson - Fronteers - Amsterdam - 09.10.2015
@sjenkinson
Digital solutions architect & consultant Records Sound the Same Ltd
Sally Jenkinson
DATA
OPEN DATA
“Big data”
90% of the world’s total data has been created within the last 2 years
(IBM, 2014)
I ♡ DATA
sallyjenkinson.co.uk/labs/teatracker
BUT…
“You agree to maintain your apps
and your systems in accordance with
industry standard quality levels…”
DATA SHARING
WHAT IS OPEN DATA?
Open data and content can be freely used, modified, and shared by anyone for any purpose.
opendefinition.org
Re-publish
Derive new content or data
Make money by selling products
Charge a fee for access
“We observed that often people think of open data as a specific ‘kind’ of data –
something separate and distinct from the data they use day-to-day in their
organisation or team – rather than a choice about how people publish data.”
theodi.org/blog/closed-shared-open-data-whats-in-a-name
theodi.org/guides/publishers-guide-open-data-licensing | theodi.org/guides/reusers-guide-open-data-licensing
OPEN LICENCES FOR CREATIVE CONTENT
Public domain (CC0)
Attribution (CC-by)
Attribution & share-alike (CC-by-sa)
OPEN LICENCES FOR DATABASES
Public domain (PDDL)
Attribution (ODC-by)
Attribution & share-alike (ODbL)
OTHER OPEN LICENCES
Open Government Licence
OS Open Licence
etc.
WHERE CAN I GET IT FROM?
wiki.dbpedia.org
musicbrainz.org
earthquake.usgs.gov/earthquakes/search/
plaidplug.com
data.id/dataset/daftar-titik-reklame-di-dki-jakarta/resource/361ce01f-34ed-4e00-a204-6062c7b9ad64
web.archive.org/web/20150520175645/http://137.189.35.203/WebUI/CatDatabase/catData.html
vision.stanford.edu/aditya86/ImageNetDogs/
{"gilded":0,"author_flair_text":"Male","author_flair_css_class":"male","retrieved_on":1425124228,"ups":3,"subreddit_id":"t5_2s30g","edited":false,"controversiality":0,"parent_id":"t1_cnapn0k","subreddit":"AskMen","body":"I can't agree with passing the blame, but I'm glad to hear it's at least helping you with the anxiety. I went the other direction and started taking responsibility for everything. I had to realize that people make mistakes including myself and it's gonna be alright. I don't have to be shackled to my mistakes and I don't have to be afraid of making them. ","created_utc":"1420070668","downs":0,"score":3,"author":"TheDukeofEtown","archived":false,"distinguished":null,"id":"cnasd6x","score_hidden":false,"name":"t1_cnasd6x","link_id":"t3_2qyhmp"}
x ~1.7 billion
reddit.com/r/datasets/comments/3bxlg7/i_have_every_publicly_available_reddit_comment/
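A record like the one above is straightforward to consume programmatically. The following is a minimal Python sketch, assuming the dump stores one JSON object per line; that layout, the helper name `count_by_subreddit`, and the second sample record are assumptions for illustration, not details from the slides. It tallies comments per subreddit while reading one line at a time, which matters at this volume.

```python
import json
from collections import Counter
from typing import Iterable


def count_by_subreddit(lines: Iterable[str]) -> Counter:
    """Count comments per subreddit from newline-delimited JSON records."""
    counts: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        comment = json.loads(line)
        counts[comment["subreddit"]] += 1
    return counts


# Two hypothetical records that reuse field names from the sample comment.
sample = [
    '{"subreddit": "AskMen", "score": 3, "author": "TheDukeofEtown"}',
    '{"subreddit": "AskMen", "score": 1, "author": "example_user"}',
]
print(count_by_subreddit(sample))
```

Passing an open file handle instead of the `sample` list would stream the records without holding the whole dump in memory.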
♥ github.com/caesar0301/awesome-public-datasets
CONSUMING OPEN DATA
d3js.org
MORE THAN WEBSITES
iquantny.tumblr.com/post/92116352544/mapping-nyc-hydrant-revenue-upper-easts-19th
Generating value & making savings
+$3 trillion / year
mckinsey.com/insights/business_technology/open_data_unlocking_innovation_and_performance_with_liquid_information
open data
Transparency
“…within two years chemical emissions nationwide (at least as reported, and presumably also in fact) had decreased by 40 percent.
Some companies were launching policies to bring their emissions down by 90 percent, just because of the release of previously sequestered information.”
maban.co.uk/80
DATA & USER EXPERIENCES
“How far do you live from your workplace? Chances are, you'd answer that question in minutes rather than miles.
An hour on the bus tells us a lot more than 47 miles. That's why we made Mapumental.
Given any start point or destination, it'll show everywhere within the chosen commute time, by public transport.”
mapumental.com/services/travel-time
“How accessible is your nearest school, post office, or GP’s surgery?
In Wales, that’s not always a simple question: the country’s mountainous landscapes, rural populations, and sometimes infrequent bus services can mean that those without cars are rather cut off from public service provision.”
mapumental.com/services/accessibility
“Just how quickly could fire engines reach a given postcode in case of a fire?
It’s a question that’s pivotal to decisions made by both the emergency services and the insurance industry.”
mysociety.org/2013/04/22/fire-fire-mapumental-and-fire-engine-journey-times
Improved efficiency
Improved effectiveness
Impact measurement
Improved or new private products or services & innovation
NOT JUST DIGITAL
opensensors.io
DOUG MCCUNE dougmccune.com
STEFANIE POSAVEC stefanieposavec.co.uk
“Air Transformed is a series of wearable data objects that communicate this physical burden in different ways. Though seemingly decorative, they are based entirely on open air quality data
from Sheffield, UK, a former steelmaking city and notorious for its bad air.”
stefanieposavec.co.uk/data/#/airtransformed
Participation & self-empowerment
LINKED DATA
New knowledge from combined data sources and patterns in large
data volumes
Misrepresentation
tylervigen.com/spurious-correlations
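The risk of misrepresentation is easy to reproduce. As a small Python sketch with invented numbers (both series and their names are made up for illustration), two unrelated quantities that simply decline over the same period produce a correlation coefficient very close to 1, which says nothing about one causing the other.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Invented figures for illustration only: two unrelated quantities
# that both happen to fall over the same five years.
browser_share = [70.0, 62.0, 55.0, 47.0, 40.0]
crime_rate = [5.8, 5.5, 5.2, 4.9, 4.7]

r = pearson(browser_share, crime_rate)
print(f"r = {r:.3f}")
```

A shared downward trend is enough to produce the high coefficient; any two series that move steadily in one direction would show the same effect.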
Combining data sets & licences
clipol.org/tools/compatibility
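When data sets are combined, their licence obligations stack. The following rough Python sketch models only the three tiers named on the licence slides (public domain, attribution, attribution & share-alike); the `OBLIGATIONS` table and both function names are simplifying assumptions for illustration and no substitute for a proper compatibility check such as the tool linked above.

```python
# Simplified assumption: each licence is reduced to the obligation tier
# it was grouped under on the licence slides.
OBLIGATIONS = {
    "CC0": set(),
    "PDDL": set(),
    "CC-by": {"attribution"},
    "ODC-by": {"attribution"},
    "CC-by-sa": {"attribution", "share-alike"},
    "ODbL": {"attribution", "share-alike"},
}


def combined_obligations(licences):
    """Union of the obligations carried by every source licence."""
    result = set()
    for name in licences:
        result |= OBLIGATIONS[name]
    return result


def needs_share_alike_review(licences):
    """True when two different share-alike licences are mixed.

    Each share-alike licence asks for derivatives under its own terms,
    so mixing two of them is flagged for a closer look.
    """
    share_alike = {n for n in licences if "share-alike" in OBLIGATIONS[n]}
    return len(share_alike) > 1


print(combined_obligations(["CC0", "ODC-by"]))
print(needs_share_alike_review(["CC-by-sa", "ODbL"]))
```

The sketch deliberately flags rather than decides: whether two specific licences can really be combined depends on their full terms.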
PUBLISHING OPEN DATA
“There are known knowns; there are things we know we know. We also know there are known unknowns; that is to say we know
there are some things we do not know. But there are also unknown unknowns – the
ones we don't know we don't know.”
en.wikipedia.org/wiki/There_are_known_knowns
STEP ONE Identification & planning
Clear licensing & usage information
Structure & quality
A plan for support
Accuracy
STEP TWO Extracting & cleaning
Data privacy & the individual
openrefine.org
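A tool such as OpenRefine handles this interactively, but the same idea can be sketched in a few lines of Python. This example assumes a hypothetical CSV export in which the `name` and `email` columns identify individuals; the column names, the sample row, and the helper `clean_rows` are all invented for illustration.

```python
import csv
import io

# Hypothetical column names that identify an individual.
PERSONAL_FIELDS = {"name", "email"}


def clean_rows(rows):
    """Yield rows with personal fields dropped and whitespace trimmed."""
    for row in rows:
        yield {
            key: value.strip()
            for key, value in row.items()
            if key not in PERSONAL_FIELDS
        }


# Invented sample data standing in for a real export.
raw = io.StringIO(
    "name,email,postcode,visits\n"
    "Ann Example, ann@example.org , AB1 2CD ,3\n"
)
cleaned = list(clean_rows(csv.DictReader(raw)))
print(cleaned)
```

Dropping columns is only a first step: remaining fields such as a full postcode can still identify a person when combined with other data, so they may need coarsening before publication.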
STEP THREE Sharing
FIVE STAR DATA 5stardata.info
★ Make your data available on the web (in whatever format) under an open license.
★★ Make it available as structured data (e.g., Excel instead of image scan of a table).
★★★ Use non-proprietary formats (e.g., CSV instead of Excel).
★★★★ Use URIs to denote things, so that people can point at your data.
★★★★★ Link your data to other data to provide context.
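The upper stars can be illustrated with a short Python sketch. It writes records to CSV (a non-proprietary format), gives each record its own URI, and adds a column pointing at an external resource for context. The records, the `example.org` URIs, and the `located_in` column are hypothetical, and the four and five star levels are more commonly met with RDF than with URI columns in a CSV, so this only conveys the idea.

```python
import csv
import io

# Hypothetical records: the example.org URIs are placeholders, and the
# located_in column links each record to a DBpedia resource for context.
sensors = [
    {
        "uri": "https://example.org/id/sensor/1",
        "name": "City centre sensor",
        "located_in": "http://dbpedia.org/resource/Sheffield",
    },
    {
        "uri": "https://example.org/id/sensor/2",
        "name": "Riverside sensor",
        "located_in": "http://dbpedia.org/resource/Sheffield",
    },
]


def to_csv(records):
    """Serialise records to CSV text with a fixed column order."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["uri", "name", "located_in"],
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


print(to_csv(sensors))
```

Because every row carries a stable identifier, other publishers can point at an individual record rather than at the file as a whole.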
OPEN DATA CERTIFICATES certificates.theodi.org
IN CONCLUSION…
1. Choose open data
2. Publish your data
3. Link it
4. Use standards
5. Promote freedom
6. Do some good
7. Be creative
@sjenkinson
recordssoundthesame.com
THANK YOU. Thank you to these lovely people for making their content open:
Linking Open Data cloud diagram 2014, by Max Schmachtenberg, Christian Bizer, Anja Jentzsch and Richard Cyganiak - lod-cloud.net
The Data Spectrum - theodi.org/data-spectrum
Doug McCune - dougmccune.com
Stefanie Posavec - stefanieposavec.co.uk
Data abstract painting - flickr.com/photos/rachubarama/2709346242
IE Market Share vs Murder Rate - imgur.com/47D7zGq
Troy Marusek - flickr.com/photos/troymars/9113025616
The Roof of Wales - flickr.com/photos/stray_croc/4743302841
Fire Wall - flickr.com/photos/epleitez/1714341218
Money - flickr.com/photos/mikephotoart/12839909303
cc - flickr.com/photos/kalexanderson/7175627336
RDF - flickr.com/photos/gertcha/8292978031
Small Parts - flickr.com/photos/oskay/2156889157/
Hydrant - flickr.com/photos/pamhule/4677109732/
Upsala Glacier Retreat - flickr.com/photos/nasamarshall/10726540434/