Post on 15-Jul-2015
Dutch Ships and Sailors
Victor de Boer - v.de.boer@vu.nl
Digitale historische kranten als big data ("Digital historical newspapers as big data"), 24-3-2015
DIVE
Dutch Ships and Sailors
Victor de Boer, Matthias van Rossum, Jur Leinenga, Rik Hoekstra
With input from Andrea Bravo Balado and Robin Ponstein
Netherlands Institute for Sound and Vision / VU University Amsterdam v.de.boer@vu.nl
The Problem: ((Maritime) historical) data is not integrated
25+ Maritime datasets; Heterogeneous
The solution
Well, Linked Data obviously!
KB Delpher
Dutch-Asiatic Shipping (DAS) – Voyages (Huygens ING)
“VOC Opvarenden” Mustering and payroll information (DANS Easy)
Dutch Ships and Sailors
Jur Leinenga (Huygens ING) Monsterrollen Noordelijke provincies
Matthias van Rossum (VU-hist) Generale Zeemonsterrollen VOC
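[Diagram: datasets, vocabularies and link types in Dutch Ships and Sailors]
Datasets: DAS, GZMVOC, MDB, VOCOPV Begunstigden, VOCOPV Soldijboeken, VOCOPV Opvarenden
Vocabularies: PROV, AAT, foaf
Links: owl:sameAs, dss:hasKBLink, rdfs:subClassOf, rdfs:subPropertyOf, dss:DAS link, skos:exactMatch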
Links to original scans
Linking to Historical newspapers
• Use machine learning (ML) to detect links between ships and historical newspaper articles (delpher.nl)
– Features: ship name, time intervals, captains' names, ship type, named entities, keywords, background knowledge
• 179,120 links
- Andrea Bravo Balado
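The feature list above can be illustrated with a minimal sketch of how one candidate pair (ship record, newspaper article) might be turned into features for a classifier. This is not the method used in the project: the class names, field names and the four binary features are hypothetical simplifications, and the dates are invented for illustration (the slides do not give the year of the example article).

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ShipRecord:
    # Hypothetical, simplified view of a muster-roll record.
    ship_name: str
    captain: str
    ship_type: str
    start: date  # start of the period in which the ship is attested
    end: date    # end of that period


@dataclass
class Article:
    # Hypothetical, simplified view of an OCR'ed newspaper article.
    text: str
    published: date


def link_features(record: ShipRecord, article: Article) -> dict:
    """Return binary features describing one candidate (record, article) pair."""
    text = article.text.lower()
    return {
        "ship_name_in_text": record.ship_name.lower() in text,
        "captain_in_text": record.captain.lower() in text,
        "ship_type_in_text": record.ship_type.lower() in text,
        "date_in_interval": record.start <= article.published <= record.end,
    }


# Names taken from the example article in this talk; all dates are made up.
record = ShipRecord("Transit", "Schaap", "schoonerschip",
                    date(1850, 1, 1), date(1860, 12, 31))
article = Article(
    "het hier behoorende schoonerschip Transit, kapitein Schaap, "
    "in de Noordzee is gezonken",
    date(1855, 10, 24),
)
features = link_features(record, article)
```

In a real pipeline such feature dictionaries would be computed for many candidate pairs and passed to a trained classifier. Exact substring matching is used here only for brevity; noisy OCR text would likely need fuzzy matching.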
Example
[HARLINGEN, 24 October.] . «et gestrande
Zweedsche schip , waarvan wij ons vorig no.
melding maakten , is door de 'eepboot van hier
afgebragt en hier binnengede u BiJ die
gelegenheid werd ons medegeeeid, dat nog vier
vaartuigen op Terschelling aren gestrand.
Tevens is het berigt ontvan°e > dat het hier
behoorende schoonerschip Transit, kapitein
Schaap, in de Noordzee is gezonken, nadat het
achterschip was weggeslagen ; een ligtmatroos
verloor daarbij het leven. Mede zijn hier drie
vreemde schepen met meer en minder zware
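averij binnengeloopen.
(English, key sentence: word has also been received that the schooner Transit of this port, Captain Schaap, has sunk in the North Sea after its stern was knocked away; an ordinary seaman lost his life.)
Spoiler alert! It sank in the North Sea.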
Data analysis and visualisation
Results
• 30 million RDF triples of integrated maritime historical data
– 180,000 links to KB newspapers (background information on arrivals, departures, cargo, other events)
– New visualisations and query options
– Conversion process documented to allow for additional datasets (current work)
– Online RDF triple store at Huygens ING
• Linked Data principles are a great fit to digital history requirements
– Heterogeneous models/datasets, light-weight reusable integration
– Multiple levels of normalisation, through separate named graphs (including links to newspapers)
– Semantic Web (SW) provenance matches historical provenance
• Watch out when you sail your Schooner into the North Sea
DIVE INTO THE EVENT-BASED
BROWSING OF LINKED HISTORICAL MEDIA
VICTOR DE BOER, JOHAN OOMEN, OANA INEL, LORA AROYO,
ELCO VAN STAVEREN, WERNER HELMICH AND DENNIS DE BEURS
DIGITAL HUMANITIES
RESEARCHERS
Media researcher Lars Arve Røssland of the University of Bergen. (Photo: Andreas R. Graven)
https://www.flickr.com/photos/drainrat/14779928998/
EXPLORATIVE SEARCH
Erp, M. van; Oomen, J.; Segers, R.; Akker, C. van de; Aroyo, L.; Jacobs, G.; Legêne, S; Meij, L. van der; Ossenbruggen, J.R. van; Schreiber, G. Automatic Heritage Metadata Enrichment with Historic Events Museums and the Web 2011 http://www.museumsandtheweb.com/mw2011/papers/automatic_heritage_metadata_enrichment_with_hi
DATA: OPENIMAGES.EU
Open videos Netherlands Institute for Sound and Vision
3000, mostly news broadcasts
DATA: DELPHER.NL
Scans of Radio bulletins (hand annotated)
• 1937 – 1984
• 1.5 million, OCR'ed and processed with named entity recognition (NER)
ENTITY EXTRACTION
CROWDTRUTH.ORG
ENTITY EXTRACTION
EVENTS CROWDSOURCING AND LINKING TO
CONCEPTS THROUGH CROWDTRUTH.ORG
SEGMENTATION & KEYFRAMES
LINKING EVENTS AND
CONCEPTS TO KEYFRAMES
MEDIA OBJECTS LINKED THROUGH EXTRACTED ENTITIES
DIVE:MEDIA OBJECT SEM:EVENT
SEM:PLACE
SEM:TIME
SEM:ACTOR
SKOS:CONCEPT
OA:ANNOTATION
LINKS TO EUROPEANA LINKS TO DBPEDIA
SIMPLE EVENT MODEL (SEM), OPENANNOTATION (OA) AND SKOS
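The idea of media objects linked through extracted entities can be sketched as an inverted index from entities to the media objects that mention them. This is only an illustration under stated assumptions: the identifiers and entity labels below are invented, and the actual DIVE data is stored as RDF using SEM, OpenAnnotation and SKOS rather than as Python dictionaries.

```python
from collections import defaultdict

# Hypothetical media objects and the entities extracted from them.
# The prefixes only mimic the SEM/SKOS types named on the slide.
media_entities = {
    "video:1": {"event:flood", "place:Zeeland"},
    "bulletin:7": {"event:flood", "actor:Queen"},
    "video:2": {"place:Amsterdam"},
}


def build_entity_index(media_entities):
    """Map each entity to the set of media objects that mention it."""
    index = defaultdict(set)
    for media, entities in media_entities.items():
        for entity in entities:
            index[entity].add(media)
    return index


def related_media(media, media_entities, index):
    """Return other media objects sharing an entity, with the shared entities."""
    related = defaultdict(set)
    for entity in media_entities[media]:
        for other in index[entity]:
            if other != media:
                related[other].add(entity)
    return dict(related)


index = build_entity_index(media_entities)
links = related_media("video:1", media_entities, index)
```

Here `links` relates the video to the radio bulletin through their shared event, which is the kind of hop an event-based browsing interface can offer from one media object to the next.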
INFINITY OF EXPLORATION
https://www.flickr.com/photos/mibuchat/2774251415
https://www.flickr.com/photos/benjcarson/2451718885
DIGITAL SUBMARINE UI
THANK YOU
https://www.flickr.com/photos/robysaltori/
DUTCHSHIPSANDSAILORS.NL
DIVE.BEELDENGELUID.NL
v.de.boer@vu.nl
http://semanticweb.cs.vu.nl/dss/user/query
# Give me all records that have both a link to an original scan and one to a KB news article,
# that have an associated ship whose shiptype is a subtype of "kustvaarders" (coasters).
prefix dss: <http://purl.org/collections/nl/dss/>
prefix mdb: <http://purl.org/collections/nl/dss/mdb/>
prefix skos: <http://www.w3.org/2004/02/skos/core#>
SELECT * WHERE {
  ?record dss:hasOriginalScan ?scan .
  ?record dss:has_kb_link ?kblink .
  ?record mdb:schip ?schip .
  ?schip dss:has_shiptype ?shiptype .
  ?shiptype skos:exactMatch ?em .
  ?em skos:broader ?b .
}
LIMIT 50