Date post: | 14-Jul-2015 |
Upload: | victor-de-boer |
Dutch Ships and Sailors Linked Data Cloud
Victor de Boer, Matthias van Rossum, Jur Leinenga, Rik Hoekstra
With input from Andrea Bravo Balado and Robin Ponstein
Netherlands Institute for Sound and Vision / VU University Amsterdam
ISWC2014
But why Linked Data?
• Heterogeneous models, one data format
– Link what can be linked
– Keep specificity of original data
– Allow integration at project level (and beyond)
• Links to other sources: re-use knowledge
• Extensible
• Allow multiple levels of semantic enrichment / normalization
– through Named Graphs
– Provenance
KB Delpher
Dutch-Asiatic Shipping (DAS) – Voyages (Huygens ING)
“VOC Opvarenden”: Mustering and payroll information (DANS Easy)
Dutch Ships and Sailors
Modeling in collaboration with historians (1)
[Diagram: RDF graph of one muster-roll record. Labels recovered from the figure:]
Classes: dss:Record / mdb:Aanmonstering; dss:Record / mdb:PersoonsContract; dss:Schip / mdb:Schip; dss:ShipType / mdb:ScheepsType; foaf:Person / dss:Person / mdb:Person; dss:Rank / mdb:Rang
Resources: mdb:aanmonstering-del_gem-1879-101; mdb:persoonscontract-del_gem-1879-101-16858-Pieter_Hoekstra; mdb:schip-del_gem-1879-101-Isadora; mdb:schip-del_gem-1879-137-Isadora; mdb:persoon-del_gem-1879-101-16858; mdb:schoener; mdb:matroos; <http://resolver.kb.nl/resolve?urn=ddd:010063756:mpeg21:a0045:ocr>
Properties: dss:ship / mdb:ship; rdfs:label / dss:shipname / mdb:scheepsnaam; dss:shiptype / mdb:scheepstype; dcterms:identifier / mdb:inventarisnummer; mdb:has_KB_article; owl:sameAs; dss:has_aanmonstering; mdb:has_person; dss:rank / mdb:rank; mdb:maandgage; foaf:firstname / mdb:voornaam; foaf:lastname / mdb:achternaam
Literals: "1870-1894", "Isadora", "32", "Pieter", "Hoekstra"
Jur Leinenga (Huygens ING): Muster-rolls Northern Provinces 1803-1937
Modeling in collaboration with historians (2)
[Diagram: RDF graph of one payroll count ("telling") record. Labels recovered from the figure:]
Classes: dss:Record / gzmvoc:Telling; dss:Ship / gzmvoc:Schip; dss:ShipType / gzmvoc:Scheepstype; dss:Record / das:Voyage
Resources: gzmvoc:telling-1046-De_Berkel; __bnode_1; gzmvoc:schip-1046-De_Berkel; gzmvoc:type-Ship; das:voyage-1918_61
Properties: gzmvoc:aziatischeBemanning; dss:has_ship / gzmvoc:schip; rdfs:label; dss:scheepsnaam / gzmvoc:scheepsnaam; dss:has_shiptype / gzmvoc:has_shiptype; gzmvoc:scheepstype; dss:azRegistratieKop; gzmvoc:azAantalMatrozen; gzmvoc:telling; gzmvoc:heeft DAS heenreis
Literals: "1046", "Schip", "De Berkel", "21", "Moorse mattroosen"
Matthias van Rossum (VU-hist): Payroll information for European vs Asiatic Sailors (17th / 18th C)
Modelling principles
• Model each dataset as directly as possible
– Only “syntactical” transformation to RDF
– No normalization
• Reusability
• Transparency, trust
• Normalize and link in second stage
– store in separate RDF Named Graphs
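The two-stage approach in the bullets above can be sketched in plain Python. This is only an illustration under stated assumptions: the source row, the graph names `graph:mdb-raw` and `graph:mdb-normalized`, and the lookup table are hypothetical, and the slides do not describe the project's actual conversion pipeline.

```python
# Sketch only: quads are (subject, predicate, object, graph) tuples.
# Graph names, row layout and the normalization table are illustrative assumptions.

RAW_GRAPH = "graph:mdb-raw"          # hypothetical graph name
NORM_GRAPH = "graph:mdb-normalized"  # hypothetical graph name


def row_to_quads(row_id, row):
    """Stage 1: purely syntactic conversion, one quad per non-empty field."""
    subject = f"mdb:schip-{row_id}"
    return [
        (subject, f"mdb:{field}", value, RAW_GRAPH)
        for field, value in row.items()
        if value is not None
    ]


def normalize(quads, ship_types):
    """Stage 2: emit normalized statements into a separate graph.

    The stage-1 quads are read but never modified.
    """
    extra = []
    for s, p, o, _graph in quads:
        if p == "mdb:scheepstype" and o.lower() in ship_types:
            extra.append((s, "dss:has_shipType", ship_types[o.lower()], NORM_GRAPH))
    return extra


# Hypothetical source row; the literal spelling "Schoener" is made up for the example.
row = {"scheepsnaam": "Isadora", "scheepstype": "Schoener"}
raw = row_to_quads("del_gem-1879-101-Isadora", row)
normalized = normalize(raw, {"schoener": "mdb:schoener"})
```

Keeping the normalized statements in their own graph means a historian can still inspect, or fall back on, the untouched transcription of the source.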
[Diagram, read as triples:]
mdb:Schip1 mdb:scheepsType mdb:Kof
das:ShipX das:typeOfShip das:Kofship
mdb:scheepsType rdfs:subPropertyOf dss:has_shipType
das:typeOfShip rdfs:subPropertyOf dss:has_shipType
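To illustrate why the rdfs:subPropertyOf links matter, here is a minimal pure-Python sketch of querying through the shared dss: property. It assumes the diagram reads as "both dataset-specific properties are sub-properties of dss:has_shipType"; the helper names are hypothetical, and a real triple store would do this entailment itself.

```python
# Sketch only: a tiny in-memory triple set based on the diagram labels.
TRIPLES = {
    ("mdb:Schip1", "mdb:scheepsType", "mdb:Kof"),
    ("das:ShipX", "das:typeOfShip", "das:Kofship"),
    ("mdb:scheepsType", "rdfs:subPropertyOf", "dss:has_shipType"),
    ("das:typeOfShip", "rdfs:subPropertyOf", "dss:has_shipType"),
}


def sub_properties(prop, triples):
    """Return prop plus every property declared (transitively) as its sub-property."""
    found = {prop}
    frontier = [prop]
    while frontier:
        current = frontier.pop()
        for s, p, o in triples:
            if p == "rdfs:subPropertyOf" and o == current and s not in found:
                found.add(s)
                frontier.append(s)
    return found


def query(prop, triples):
    """Return sorted (subject, object) pairs stated via prop or any sub-property."""
    props = sub_properties(prop, triples)
    return sorted((s, o) for s, p, o in triples if p in props)


ship_types = query("dss:has_shipType", TRIPLES)
```

A single query on the dss: property returns ship types from both datasets, while each dataset keeps its own vocabulary.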
Link properties and classes to interoperability layer
[Diagram, read as triples:]
mdb:Schip1 mdb:scheepsType mdb:Kof
das:ShipX das:typeOfShip das:Kofship
[Three skos:exactMatch links; nodes include mdb:Kof, das:Kofship, Aat:Kof and Aat:Platbodems]
Vocabulary Links
• Links to DBpedia (Ship types, places, ranks)
• Links to Getty AAT (Ship types, ranks)
• Links to GeoNames (Places)
http://semanticweb.cs.vu.nl/amalgame/
Identifying ships
• Identify ships within a dataset using Machine Learning techniques
– Based on: name, size, type, destinations etc.
– Background knowledge
• 33,435 owl:sameAs links
Date | ShipName | ShipType | ShipSize | HomePort | CurrentPort | Captain
1852-02-27 | Alberdiena | kof | NULL | NULL | Noorwegen (N) | Wolkammer Albert Augustinus
1852-07-31 | Alberdina | kof | NULL | Farmsum | Friedrichstadt (D) | Wolkammer Albert A.
1861-09-30 | Alberdina | kof | 98 | NULL | Gdansk, Danzig (PL) | Wolkammer Albert Augustinus
1870-03-08 | Alberdina | brik | 222 | NULL | NULL | Wolkammer Albert Augustinus
1875-09-22 | Alberdina | bark | 309 | NULL | Oostzee | Wolkammer Augustinus
– Robin Ponstein
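As a rough illustration of what record matching on such rows involves, the sketch below scores pairs with a hand-weighted string similarity. It is not the machine-learning approach used in the project: the weights, the threshold and the function names are assumptions, and only three of the rows above are included.

```python
from difflib import SequenceMatcher

# Three of the records from the table above: (date, name, type, size, captain).
RECORDS = [
    ("1852-02-27", "Alberdiena", "kof", None, "Wolkammer Albert Augustinus"),
    ("1852-07-31", "Alberdina", "kof", None, "Wolkammer Albert A."),
    ("1875-09-22", "Alberdina", "bark", 309, "Wolkammer Augustinus"),
]


def similarity(a, b):
    """Case-insensitive string similarity in [0, 1] (difflib ratio)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_score(r1, r2):
    """Hand-weighted score; the weights are illustrative, not learned."""
    name = similarity(r1[1], r2[1])
    captain = similarity(r1[4], r2[4])
    same_type = 1.0 if r1[2] == r2[2] else 0.0
    return 0.5 * name + 0.3 * captain + 0.2 * same_type


def same_as_candidates(records, threshold=0.8):
    """Return index pairs whose score reaches the (assumed) threshold."""
    pairs = []
    for i in range(len(records)):
        for j in range(i + 1, len(records)):
            if match_score(records[i], records[j]) >= threshold:
                pairs.append((i, j))
    return pairs


candidates = same_as_candidates(RECORDS)
```

Each accepted pair would become a candidate owl:sameAs link. Spelling variants such as Alberdiena/Alberdina and abbreviated captain names are exactly what makes exact-match joins insufficient here.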
Linking to Historical newspapers
• Use ML to detect links between ships and historical newspaper articles (delpher.nl)
– Features: ship name, time intervals, captain’s names, ship type, named entities, keywords, background knowledge
• 179,120 links
– Andrea Bravo Balado
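A few of the listed features can be sketched as simple boolean checks. This is an assumption-laden illustration, not the project's feature extractor: the record layout, the activity interval and the article year are placeholders, and only the ship name, captain and sentence fragment come from the newspaper example in this deck.

```python
import re
from datetime import date


def mentions(term, text):
    """True if term occurs in text as a whole word, ignoring case."""
    return re.search(r"\b" + re.escape(term) + r"\b", text, flags=re.IGNORECASE) is not None


def link_features(ship, article_text, article_date):
    """Compute a small feature dict for one (ship, article) candidate pair."""
    return {
        "name_in_text": mentions(ship["name"], article_text),
        "captain_in_text": mentions(ship["captain"], article_text),
        "date_in_interval": ship["active_from"] <= article_date <= ship["active_to"],
    }


# Hypothetical ship record; the dates are placeholders, not project data.
SHIP = {
    "name": "Transit",
    "captain": "Schaap",
    "active_from": date(1860, 1, 1),
    "active_to": date(1870, 12, 31),
}
ARTICLE_TEXT = (
    "dat het hier behoorende schoonerschip Transit, kapitein Schaap, "
    "in de Noordzee is gezonken"
)
ARTICLE_DATE = date(1863, 10, 24)  # placeholder year

features = link_features(SHIP, ARTICLE_TEXT, ARTICLE_DATE)
```

Feature dicts like this one would be the input to a classifier that decides whether the article really is about the ship.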
Example
[HARLINGEN, 24 October.] . «et gestrande Zweedsche schip , waarvan wij ons vorig no. melding maakten , is door de 'eepboot van hier afgebragt en hier binnengede u BiJ die gelegenheid werd ons medegeeeid, dat nog vier vaartuigen op Terschelling aren gestrand. Tevens is het berigt ontvan°e > dat het hier behoorende schoonerschip Transit, kapitein Schaap, in de Noordzee is gezonken, nadat het achterschip was weggeslagen ; een ligtmatroos verloor daarbij het leven. Mede zijn hier drie vreemde schepen met meer en minder zware averij binnengeloopen.
Spoiler alert! It sank in the North Sea.
Provenance (PROV-O)
• Individual named graphs have provenance information
– Who made it (people/software?)
– Based on what source
– Content confidence
• Matches historical science requirements
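The three kinds of provenance listed above could be attached to a named graph roughly as follows. `prov:wasAttributedTo` and `prov:wasDerivedFrom` are PROV-O terms; the confidence property `ex:confidence`, the graph names and the agent names are hypothetical, since the slides do not say how confidence is encoded.

```python
def provenance_triples(graph, agent, source, confidence):
    """Describe one named graph: who made it, from what source, how confident."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    return [
        (graph, "rdf:type", "prov:Entity"),
        (graph, "prov:wasAttributedTo", agent),
        (graph, "prov:wasDerivedFrom", source),
        (graph, "ex:confidence", confidence),  # hypothetical property
    ]


def trusted_graphs(triples, minimum):
    """Return the graphs whose stated confidence is at least minimum."""
    return sorted(
        s for s, p, o in triples
        if p == "ex:confidence" and o >= minimum
    )


# Hypothetical graphs and agents, for illustration only.
PROVENANCE = (
    provenance_triples("graph:mdb-raw", "ex:historian", "ex:muster-roll-database", 1.0)
    + provenance_triples("graph:mdb-kb-links", "ex:linking-software", "graph:mdb-raw", 0.7)
)

selected = trusted_graphs(PROVENANCE, 0.9)
```

A researcher can then choose, per question, whether to include automatically generated graphs or restrict the analysis to manually curated ones.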
ClioPatria Triplestore
• Data live at Huygens Institute for Dutch History
– http://dutchshipsandsailors.nl/data
– ~30 Million triples
• Dev. Server
– http://semanticweb.cs.vu.nl/dss
• Purl.org URIs redirect to live server w/ content negotiation
• SPARQL endpoint
• Web interface
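A SPARQL endpoint is queried over HTTP, with the `Accept` header selecting the result format. The sketch below only builds such a request with the standard library and does not send it. The `/sparql/` path appended to the dev server URL is an assumption to be checked against the server, and the query is a generic placeholder.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed endpoint path; verify against the server before use.
ENDPOINT = "http://semanticweb.cs.vu.nl/dss/sparql/"

# Generic placeholder query: fetch any five triples.
QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 5"


def build_request(endpoint, query):
    """Build a SPARQL-protocol GET request that asks for JSON results."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_request(ENDPOINT, QUERY)
# To execute against a live server: urllib.request.urlopen(request)
```

The same `Accept` mechanism underlies the content negotiation mentioned above: a client asking for an RDF serialization of a resource URI gets data, while a browser gets the web interface.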
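[Diagram: the Dutch Ships and Sailors linked data cloud]
Datasets and vocabularies: DAS, GZMVOC, MDB, VOCOPVBegunstigden, VOCOPVSoldijboeken, VOCOPVOpvarenden, PROV, AAT, foaf
Link types: owl:sameAs, dss:hasKBLink, rdfs:subClassOf, rdfs:subPropertyOf, dss:DAS link, skos:exactMatch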
Take home
• Linked Data principles are a great fit to digital history requirements
– Heterogeneous models/datasets, light-weight reusable integration
– Multiple levels of normalisation, through separate named graphs
– SW Provenance matches Historical Provenance
• Watch out when you sail your Schooner into the North Sea