Date post: | 11-May-2015 |
Category: | Technology |
Upload: | rinke-hoekstra |
Linked (Open) Data
But what does it buy me?
Rinke Hoekstra
VU University Amsterdam/University of Amsterdam
Linked (Open) Data - But what does it buy me? by Rinke Hoekstra
Licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.
Monday 11 March 2013
http://www.youtube.com/watch?v=ga1aSJXCFe0
http://www.ted.com/talks/tim_berners_lee_the_year_open_data_went_worldwide.html
Linked Open Data
Texts taken from http://5stardata.info
Why people go “Meh”
• Data needs to be converted to RDF
• Data needs to be published on the Web
• An open license is required even for a single ★
What if people draw incorrect conclusions from my data?
What if journalists draw incorrect conclusions from my data?
What if combining data results in privacy infringement?
Pacific Barreleye, http://imgur.com/gallery/Mzyb5 (can rotate its eyes forwards or upwards to look through the transparent head to prey above)
Linked Data
Six Ingredients
1 The missing ★
2 Repeatable Transformation
3 Choose your Grain Size
4 Mix ‘n Mash
5 Contextualize!
6 Lower the Threshold
1 The missing ★
http://give.everything/a/URI
HTTP URIs only please! (or resolver + URN)
Version information
Version agnostic
Guessable
Messy Data
http://wetten.overheid.nl/BWBIdService/BWBIdList.xml.zip
NB: The problem with the XML processing instruction was reported and fixed, but returned some weeks later
Example: Juriconnect
• Existing identification standard: Juriconnect
• URN-like... but no naming server (cf. Digital Object Identifiers)
• Named elements do not carry identifier
• No explicit version information, only contextual
1.0:c:BWBR0005416&artikel=6
vs
http://wetten.overheid.nl/cgi-bin/deeplink/law1/bwbid=BWBR0005416/article=6/date=2005-01-14
vs
http://wetten.overheid.nl/BWBR0005416/TitelII698946/HoofdstukII/Artikel16/geldigheidsdatum_14-01-2005
Levels of Identification
• IFLA FRBR levels of a Bibliographic Entity
• Work: Regulation
• Expression: Version of regulation (realizes the Work)
• Manifestation: XML version of regulation (embodies the Expression)
• Item: XML version of regulation on my harddisk (exemplifies the Manifestation)
• Hierarchical information (work)
http://doc.metalex.eu/id/BWBR0011823/hoofdstuk/1/artikel/1
• Version and language (expression)
http://doc.metalex.eu/id/BWBR0011823/hoofdstuk/1/artikel/1/nl/2010-09-01
• Format information (manifestation)
http://doc.metalex.eu/doc/BWBR0011823/hoofdstuk/1/artikel/1/nl/2010-09-01/data.xml
Transparent = Guessable
http://doc.metalex.eu/id/BWBR0011823/artikel/1
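The layered URI pattern on this slide can be sketched in a few lines of Python. This is only an illustration that reproduces the three example URIs shown here; the class name `LegalReference`, its fields and the default file name are assumptions made for the sketch, not the actual MetaLex implementation.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

BASE = "http://doc.metalex.eu"  # host taken from the slide


@dataclass(frozen=True)
class LegalReference:
    """Hypothetical helper that mints work, expression and manifestation URIs."""
    bwb_id: str                       # e.g. "BWBR0011823"
    path: Tuple[str, ...] = ()        # hierarchical path inside the regulation
    language: Optional[str] = None    # e.g. "nl"
    version: Optional[str] = None     # ISO date of the version

    def work_uri(self) -> str:
        # Work level: hierarchical information only.
        return "/".join([BASE, "id", self.bwb_id, *self.path])

    def expression_uri(self) -> str:
        # Expression level: work URI plus language and version.
        if not (self.language and self.version):
            raise ValueError("an expression needs both a language and a version")
        return "/".join([self.work_uri(), self.language, self.version])

    def manifestation_uri(self, filename: str = "data.xml") -> str:
        # Manifestation level: /doc/ instead of /id/, plus a format-specific file.
        expression = self.expression_uri()
        return expression.replace(BASE + "/id/", BASE + "/doc/", 1) + "/" + filename


ref = LegalReference("BWBR0011823", ("hoofdstuk", "1", "artikel", "1"), "nl", "2010-09-01")
print(ref.work_uri())
print(ref.expression_uri())
print(ref.manifestation_uri())
```

Because every level is derived from the one above it, a reader who knows one URI can guess the others, which is the point of "Transparent = Guessable".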
Versioning Issues
• URIs don’t carry semantics...
• Detect changes:
• which element versions are the same
• ... and which versions are different?
Art. 44, lid 4 (2011-03-26)
Art. 44, lid 4 (2011-04-05)
From: Besluit prudentiële regels Wft, BWBR0020420
Opaque Identifiers
• Content information
• Unique SHA1 Hash of text
http://doc.metalex.eu/BWBR0011823/hoofdstuk/1/artikel/34b0cee26ee5138c74aa2c62caf2c117d3c616e9
[Figure: SW Hoofdstuk I, Artikel 10 in versions 2011-01-01 and 2011-10-12; edge labels dcterms:subject ("vermogen van de erflater", Dutch for "estate of the deceased") and owl:sameAs; hash nodes SHA1 8738ef273ea4dbc73 and SHA1 a433f53273c78a56f2]
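A minimal sketch of how such content-based identifiers could be minted, assuming the hash is taken over the whitespace-normalised text of an element. The normalisation step and the function names are assumptions for illustration; the slide only states that a unique SHA1 hash of the text is used.

```python
import hashlib
from typing import Dict, List


def normalize(text: str) -> str:
    # Assumption: collapse runs of whitespace so layout changes do not alter the hash.
    return " ".join(text.split())


def content_hash(text: str) -> str:
    # SHA1 hex digest of the normalised text (40 hexadecimal characters).
    return hashlib.sha1(normalize(text).encode("utf-8")).hexdigest()


def opaque_uri(parent_uri: str, text: str) -> str:
    # The hash replaces the human-readable element number in the URI.
    return parent_uri + "/" + content_hash(text)


def group_by_content(versions: Dict[str, str]) -> Dict[str, List[str]]:
    # Versions that end up under the same hash carry identical text,
    # so they are candidates for an owl:sameAs link.
    groups: Dict[str, List[str]] = {}
    for label, text in versions.items():
        groups.setdefault(content_hash(text), []).append(label)
    return groups


# Hypothetical version texts, for illustration only.
versions = {
    "2011-03-26": "Example text of the element.",
    "2011-04-05": "Example  text of the element.",
    "2011-10-12": "Changed text of the element.",
}
groups = group_by_content(versions)
```

In this sketch the first two versions fall into one group because they differ only in spacing, while the changed text gets its own hash.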
2 Repeatable Transformation
Transformation should be part of routine ...
... manageable and scalable ...
... repeatable ...
Linked Data will not be the official source anytime soon
http://www.w3.org/TR/prov-overview/
Provenance is key
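One way to make a conversion repeatable and traceable is to emit a small provenance record with every run. The sketch below uses property names from the W3C PROV-O vocabulary and writes plain N-Triples lines; the `convert` function is a stand-in for a real conversion, and all example URIs are hypothetical.

```python
from datetime import datetime, timezone
from typing import List, Tuple

PROV = "http://www.w3.org/ns/prov#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_DATETIME = "http://www.w3.org/2001/XMLSchema#dateTime"


def convert(source: bytes) -> str:
    # Stand-in for the real conversion: deterministic, so it can be re-run at will.
    return source.decode("utf-8").upper()


def _datetime_literal(moment: datetime) -> str:
    return '"' + moment.isoformat() + '"^^<' + XSD_DATETIME + ">"


def run_with_provenance(source_uri: str, output_uri: str, activity_uri: str,
                        source: bytes, started: datetime,
                        ended: datetime) -> Tuple[str, List[str]]:
    # Run the conversion and describe the run as N-Triples statements.
    output = convert(source)
    triples = [
        f"<{activity_uri}> <{RDF_TYPE}> <{PROV}Activity> .",
        f"<{activity_uri}> <{PROV}used> <{source_uri}> .",
        f"<{activity_uri}> <{PROV}startedAtTime> {_datetime_literal(started)} .",
        f"<{activity_uri}> <{PROV}endedAtTime> {_datetime_literal(ended)} .",
        f"<{output_uri}> <{PROV}wasGeneratedBy> <{activity_uri}> .",
        f"<{output_uri}> <{PROV}wasDerivedFrom> <{source_uri}> .",
    ]
    return output, triples


# Hypothetical URIs and timestamps, for illustration only.
start = datetime(2013, 3, 11, 9, 0, tzinfo=timezone.utc)
end = datetime(2013, 3, 11, 9, 5, tzinfo=timezone.utc)
output, provenance = run_with_provenance(
    "http://example.com/source.xml",
    "http://example.com/output.rdf",
    "http://example.com/conversion/1",
    b"some source data", start, end,
)
```

Since the official source stays elsewhere, the provenance record is what lets a consumer trace each Linked Data version back to the input it was derived from.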
40.745.554.078 Triples! (1.6 Billion)
(I tried to check the latest figures, but http://stats.lod2.eu was down)
3 Choose your Grain Size
• The document is the traditional grain size (Dublin Core)
• Linked data allows for deep links into data
• Cost versus usefulness
• Are you the right party to provide detailed descriptions?
http://creatingandeducating.blogspot.nl/2011/11/blog-post.html
RDF Report Card
Report Card Categories: Metadata, Scope, Structure, Internals
Scale: Low Detail to High Detail
RDF Report Card by Leigh Dodds, talk at Semtech Biz London, 2011, http://slideshare.net/ldodds
4 Mix ‘n Mash
• Multiple vocabularies won’t bite
• Multiple identifiers won’t bite
• Choose what’s useful for you...
• ... then map to others!
Image © David Sykes 2009 All rights reserved
Good News: the bulk has already been done for you!
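"Choose what's useful for you, then map to others" can be as simple as keeping a mapping table next to your own vocabulary. The sketch below is illustrative only: the local property URIs are hypothetical, the targets are Dublin Core terms, and `owl:equivalentProperty` is just one of several possible mapping relations.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]

OWL_EQUIVALENT_PROPERTY = "http://www.w3.org/2002/07/owl#equivalentProperty"

# Hypothetical local vocabulary mapped onto Dublin Core terms.
MAPPINGS: Dict[str, str] = {
    "http://example.com/vocab/titel": "http://purl.org/dc/terms/title",
    "http://example.com/vocab/onderwerp": "http://purl.org/dc/terms/subject",
}


def mapping_triples(mappings: Dict[str, str]) -> List[Triple]:
    # Publish the mapping itself as data, so others can follow it.
    return [(local, OWL_EQUIVALENT_PROPERTY, external)
            for local, external in sorted(mappings.items())]


def materialize(triples: List[Triple], mappings: Dict[str, str]) -> List[Triple]:
    # Keep the original statements and add a copy that uses the mapped property.
    extra = [(s, mappings[p], o) for s, p, o in triples if p in mappings]
    return list(triples) + extra


data = [("http://example.com/doc/1", "http://example.com/vocab/titel", "Successiewet")]
published = materialize(data, MAPPINGS)
```

The local statement stays untouched; consumers who only know Dublin Core can still find the title through the added statement.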
Example: Provenance
• http://doc.metalex.eu/id/BWBR0017869/2009-10-23: the expression (version) URI of a regulation
• http://doc.metalex.eu/id/process/BWBR0017869/2009-10-23: the process that generated the expression
• http://doc.metalex.eu/id/event/BWBR0017869/2009-10-23: the creation event of the regulation
• http://doc.metalex.eu/id/date/2009-10-23: the date at which the expression was created ("2009-10-23"^^xsd:date)
Classes in the figure (rdf:type): opmv:Artifact, ml:BibliographicExpression, opmv:Process, ml:LegislativeModification, sem:Event, ml:Date, time:Instant, sem:Time
Properties in the figure: opmv:wasGeneratedBy, ml:resultOf, opmv:wasGeneratedAt, ml:date, time:inXSDDateTime, time:hasEnd, sem:eventType, sem:hasTime, sem:timeType, sem:hasTimeStamp, rdf:value
5 Contextualize!
• Information is not always compatible
• Make explicit in which context the information holds ...
• ... and who stated the information, why and how.
Flat Earth and Square Earth idea courtesy of Szymon Klarman
• Namespaces don’t mean anything
• Use named graphs to compartmentalize metadata
• Add provenance information about groups of statements
[Figure: named graphs <http://example.com/workbook1/sheet1> and <http://example.com/workbook1/sheet1/corrected>, with provenance for the curation step]
• :curation20120126 (rdf:type provo:Activity), connected by provo:wasGeneratedBy, with provo:hadAgent :RinkeHoekstra
• provo:startedAt "20120126T08:30:00" and provo:endedAt "20120126T09:00:00" (blank nodes _:a and _:b, each with time:inXSDDateTime)
• Labels on the data statements around _:x: d2s:dimension :14--15_1875--1874, d2s:populationSize "11"^^xsd:int, d2s:populationSize "1"^^xsd:int, d2s:ageGroup :14-15, d2s:birthYears :1875--1874, d2s:censusYear "1889"^^xsd:int, d2s:gemeente :Assendelft
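The idea of compartmentalising statements in named graphs and then describing each graph can be sketched with a tiny in-memory quad store. Everything here is illustrative: the `GraphStore` class, the provenance graph name, the cell URI, the `d2s` namespace URI and the values are assumptions, and the provenance properties are taken from the W3C PROV-O vocabulary rather than the `provo:` names in the figure.

```python
from collections import defaultdict
from typing import Dict, Set, Tuple

Triple = Tuple[str, str, str]

PROV = "http://www.w3.org/ns/prov#"
XSD_INT = "http://www.w3.org/2001/XMLSchema#int"
D2S = "http://example.com/d2s#"                      # hypothetical namespace URI

ORIGINAL = "http://example.com/workbook1/sheet1"
CORRECTED = "http://example.com/workbook1/sheet1/corrected"
PROVENANCE = "http://example.com/provenance"         # hypothetical graph name
CELL = "http://example.com/workbook1/sheet1/cell1"   # hypothetical cell URI
ACTIVITY = "http://example.com/curation20120126"
AGENT = "http://example.com/RinkeHoekstra"


def _term(value: str) -> str:
    # Literals and blank nodes are written as-is, everything else as an IRI.
    if value.startswith('"') or value.startswith("_:"):
        return value
    return "<" + value + ">"


class GraphStore:
    """Minimal quad store: a set of triples per named graph."""

    def __init__(self) -> None:
        self._graphs: Dict[str, Set[Triple]] = defaultdict(set)

    def add(self, graph: str, s: str, p: str, o: str) -> None:
        self._graphs[graph].add((s, p, o))

    def triples(self, graph: str) -> Set[Triple]:
        return set(self._graphs.get(graph, set()))

    def to_nquads(self) -> str:
        lines = []
        for graph in sorted(self._graphs):
            for s, p, o in sorted(self._graphs[graph]):
                lines.append(f"{_term(s)} <{p}> {_term(o)} <{graph}> .")
        return "\n".join(lines)


store = GraphStore()
# The same cell holds different values in the original and the corrected graph.
store.add(ORIGINAL, CELL, D2S + "populationSize", f'"11"^^<{XSD_INT}>')
store.add(CORRECTED, CELL, D2S + "populationSize", f'"1"^^<{XSD_INT}>')
# Provenance is stated about the corrected graph as a whole.
store.add(PROVENANCE, CORRECTED, PROV + "wasGeneratedBy", ACTIVITY)
store.add(PROVENANCE, CORRECTED, PROV + "wasDerivedFrom", ORIGINAL)
store.add(PROVENANCE, ACTIVITY, PROV + "wasAssociatedWith", AGENT)
```

Both values for the cell can coexist without contradiction because each lives in its own graph, and the provenance graph records who produced the corrected one.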
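Compliance
[Figure: state diagram running from start to end; each state has a State Name with entry/action, do/activity, exit/action and event/action(arguments); states and actions are annotated with the regulation versions below]
• Regulation A (01-01-2011)
• Art 12 (04-02-2011)
• Art 14, lid 3, 2e volzin (11-06-2008)
• Art 14, lid 3, 2e volzin (01-07-2011)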
Contextual Annotation
The annotation "vermogen van de erflater" is attached with dcterms:subject at each level of context:
• Successiewet
• SW Hoofdstuk I
• SW Hoofdstuk I, Artikel 10
• SW Hoofdstuk I, Artikel 10, Zin 1
No nice background because Google Image search only returned boring images
6 Lower the Threshold
• Integrate Linked Data production into everyday tools
• Allow tools to do the work for you
• Use a built-in reward model
Image courtesy of http://themaisonette.net
Linked Data allows you to trace usage!
Wrap Legacy Systems
http://www.w3.org/TR/r2rml/
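R2RML declares how relational tables map to RDF. The sketch below is not R2RML itself; it is a hand-rolled Python stand-in, using an in-memory SQLite table, that shows the kind of row-to-triple mapping such a wrapper performs. The table, the column names, the base URI and the title value are hypothetical.

```python
import sqlite3
from typing import Dict, List, Tuple

BASE = "http://example.com/"  # hypothetical base URI for minted subjects


def rows_to_triples(conn: sqlite3.Connection, table: str, key: str,
                    column_to_property: Dict[str, str]) -> List[Tuple[str, str, str]]:
    # The table name is interpolated directly, so it must come from trusted configuration.
    cursor = conn.execute(f"SELECT * FROM {table}")
    names = [description[0] for description in cursor.description]
    triples = []
    for row in cursor:
        record = dict(zip(names, row))
        subject = f"{BASE}{table}/{record[key]}"      # one subject URI per row
        for column, prop in column_to_property.items():
            if record[column] is not None:            # NULL columns yield no triple
                triples.append((subject, prop, str(record[column])))
    return triples


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE regulation (id TEXT PRIMARY KEY, title TEXT)")
conn.execute("INSERT INTO regulation VALUES (?, ?)", ("BWBR0005416", "Example title"))
conn.execute("INSERT INTO regulation VALUES (?, ?)", ("BWBR0011823", None))

triples = rows_to_triples(conn, "regulation", "id",
                          {"title": "http://purl.org/dc/terms/title"})
```

The legacy database stays the system of record; the wrapper only exposes a Linked Data view of it, which keeps the transformation repeatable.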
• Lightweight Web Application
• Interface to API of existing data repositories
• Enrich metadata by linking to Linked Data resources
• Provide annotation services for data files
• Plugin based architecture
• Publish RDF metadata as new data publication
http://linkitup.data2semantics.org
recoprov
Reconstruct provenance using Dropbox file edit history
[Figure: provenance graph with nodes numbered 0 to 24]
Sara Magliacane and Paul Groth
plsheet
How are results calculated (1)? Automatic analysis of workflow in spreadsheets
Analyse dependencies between cells in complex spreadsheets
Martine de Vos, Jan Wielemaker and Willem van Hage
plsheet
Reconstruct and explain the workflow of computations
Martine de Vos, Jan Wielemaker and Willem van Hage
Albert Merono-Penuela, Rinke Hoekstra, Laurens Rietveld, Christophe Gueret
TabLinker
http://www.cedar-project.nl
Semi-automatic RDF converter for eccentric spreadsheets
Linked Data
Six Ingredients
1 The missing ★
2 Repeatable Transformation
3 Choose your Grain Size
4 Mix ‘n Mash
5 Contextualize!
6 Lower the Threshold