Linked Data: past, present and futures
Pierre-Yves Vandenbussche
ESSnet May 27th 2019
@pyvandenbussche
Introduction – Python notebook
Outline
• 1. Semantics on the Web
− Vision and construction
• 2. Linked Data: Pragmatic use
• 3. Promising futures
1. Semantics on the Web: Vision and construction
Creation of the Web (1989)
https://www.w3.org/History/1989/proposal.html
• A “Graph” view of the Web with semantic relations between documents and entities
Theory towards the Semantic Web (1994)
https://www.w3.org/Talks/WWW94Tim/
https://videos.cern.ch/record/2671957
• “Flat” view of the Web for a computer. No typed relations
• A Web for agents
• Coordination needed:
Construction of the Semantic Web (1999-2012)
• SPARQL – Querying
• OWL – Formal semantics
• RDF-S – Semantics
• RDF – Representation
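The layering above can be illustrated with a minimal sketch in plain Python: RDF statements held as (subject, predicate, object) tuples, and a SPARQL-like triple pattern matched against them. The `match` helper and all `ex:` names are assumptions made for illustration only; a real application would rely on an RDF library and a SPARQL engine.

```python
# Toy triple store: each RDF statement is a (subject, predicate, object) tuple.
# All ex: identifiers are hypothetical.
TRIPLES = {
    ("ex:alice", "rdf:type", "ex:Student"),
    ("ex:alice", "ex:name", "Alice"),
    ("ex:bob", "rdf:type", "ex:Person"),
    ("ex:Student", "rdfs:subClassOf", "ex:Person"),
}


def match(triples, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard (like ?x)."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# Roughly equivalent to: SELECT ?s WHERE { ?s rdf:type ex:Student }
students = [s for s, _, _ in match(TRIPLES, p="rdf:type", o="ex:Student")]
print(students)
```

Note that the query only finds explicitly stated types: `ex:bob` is a Person, but nothing here derives that `ex:alice` is one too. That step belongs to the RDF-S and OWL layers.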
2. Linked Data: Pragmatic use of SW
Technology adoption ≠ Original thoughts
Shift from Semantic Web to Linked Data
• Ontology / AI implication fails to address most real-world data, which is uncertain, incomplete, inconsistent and contains errors
− Relations between concepts are defined as logical entailments in a formal system, e.g. Student(?x) => Person(?x)
• Linked Data
− Emphasis on sharing the information in the form of a graph
− Facilitating data integration through common vocabularies with light formal commitment
− Constraints
o ShEx (Wikidata extension, May 2019) / SHACL
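The entailment Student(?x) => Person(?x) mentioned above can be sketched as a small forward-chaining loop over subclass statements. This is a toy illustration, not a reasoner: the `infer_types` function and the `ex:` names (including `ex:Agent`) are assumptions, and only the single subclass rule is applied.

```python
def infer_types(triples):
    """Apply (?x type ?c), (?c subClassOf ?d) => (?x type ?d) until nothing new appears."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        # Snapshot both relations as lists so the set can grow inside the loops.
        sub_class = [(s, o) for s, p, o in inferred if p == "rdfs:subClassOf"]
        types = [(s, o) for s, p, o in inferred if p == "rdf:type"]
        for x, c in types:
            for c2, d in sub_class:
                if c == c2 and (x, "rdf:type", d) not in inferred:
                    inferred.add((x, "rdf:type", d))
                    changed = True
    return inferred


# Hypothetical facts: Student is a subclass of Person, Person of Agent.
facts = {
    ("ex:Student", "rdfs:subClassOf", "ex:Person"),
    ("ex:Person", "rdfs:subClassOf", "ex:Agent"),
    ("ex:alice", "rdf:type", "ex:Student"),
}
closure = infer_types(facts)
print(("ex:alice", "rdf:type", "ex:Person") in closure)
```

The sketch assumes clean, consistent input, which is exactly what real-world data tends not to provide. A mistyped or contradictory statement would propagate through the closure unchecked.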
[Figure: hype cycle, 2012 and 2015; phases shown: peak of inflated expectations, trough of disillusionment]
Shift from Semantic Web to Linked Data
• Reflected in industry
− Search Engines:
o Google 2012 (Peter Norvig) vs
o Google 2012 (Ramanathan V. Guha)
− Wikipedia (Jimmy Wales) https://www.youtube.com/watch?v=MY4s8uuHmy0
• W3C broadened its Semantic Web activity (https://www.w3.org/2001/sw/), giving rise to the Data Activity / Web of Data (https://www.w3.org/2013/data/)
[Figure source: Gartner 2012 and 2015, entries for Semantic Web and Linked Data]
Linked Data successes / limitations
• Common semantics for a community:
− Schema.org: Web pages metadata / Enhanced search engines
− Museum and art – Getty
− Library – DC/Bibframe
− Statistical data – ESSnet!
• Linked Open Data
− DBpedia, Wikidata, Eurostat, etc.
• Semantic pipeline
− BBC
− Thomson Reuters
• Limitations
− Cost / incentive
− Use Feedback
− Tools and maintenance
Jem Rayfield, https://www.slideshare.net/JemRayfield/dsp-bbcjem-rayfieldsemtech2011
The Rise of Graph data
• Enterprise Knowledge Graphs
− Google Knowledge Vault
− Microsoft Academic KG
− Facebook Graph Search
− …
• Graph mining
• Convergence with Property Graphs
− RDF* (https://www.w3.org/Data/events/data-ws-2019/)
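The convergence with property graphs can be pictured with a rough sketch: a property-graph edge carries its own key/value properties, and the RDF* idea is to let a triple itself be the subject of further statements. The data structures, the `edge_to_rdf_star` helper and the `ex:` names below are assumptions for illustration, not an RDF* serialisation or any library's API.

```python
# Property-graph style: an edge carries its own key/value properties.
# All names and values here are hypothetical.
pg_edge = {
    "source": "ex:alice",
    "label": "worksFor",
    "target": "ex:acme",
    "properties": {"since": 2015},
}


def edge_to_rdf_star(edge):
    """Turn one property-graph edge into a base triple plus annotation triples.

    Each annotation uses the base triple itself as its subject, mirroring
    the RDF* idea of writing << s p o >> key value.
    """
    base = (edge["source"], "ex:" + edge["label"], edge["target"])
    annotations = [
        (base, "ex:" + key, value)
        for key, value in sorted(edge["properties"].items())
    ]
    return base, annotations


base, annotations = edge_to_rdf_star(pg_edge)
print(base)
print(annotations)
```

Nesting the base triple inside the annotation avoids the extra reification nodes that plain RDF would need to say something about a statement.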
3. Promising futures: Knowledge Discovery and Sentient Web
Knowledge Discovery
• Literature based discovery - Swanson 1980
• Panama papers (2016)
− Neo4j
− Linkurious
• Discovery of Cancer related protein interactions
“Artificial Intelligence to win Nobel Prize and Beyond” Hiroaki Kitano - ISWC 2016
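Literature based discovery is often described through an "ABC" pattern: if A is linked to B in one body of literature and B to C in another, then A to C is a candidate hypothesis. A minimal sketch of that pattern follows. The term pairs are hand-written for illustration, loosely modelled on Swanson's fish oil and Raynaud's syndrome example, and the `abc_candidates` function is an assumption, not a published algorithm.

```python
from collections import defaultdict

# Hypothetical co-occurrence links, as if mined from separate literatures.
LINKS = [
    ("fish oil", "blood viscosity"),
    ("fish oil", "platelet aggregation"),
    ("blood viscosity", "Raynaud's syndrome"),
    ("platelet aggregation", "Raynaud's syndrome"),
    ("fish oil", "triglycerides"),
]


def abc_candidates(links, start):
    """Map each candidate C to the sorted B terms bridging it to A=start.

    Terms already linked directly to the start term are skipped, since
    they are known rather than newly suggested.
    """
    neighbours = defaultdict(set)
    for a, b in links:
        neighbours[a].add(b)
        neighbours[b].add(a)
    candidates = defaultdict(set)
    for b in neighbours[start]:
        for c in neighbours[b]:
            if c != start and c not in neighbours[start]:
                candidates[c].add(b)
    return {c: sorted(bs) for c, bs in candidates.items()}


print(abc_candidates(LINKS, "fish oil"))
```

The number of bridging terms per candidate gives a crude ranking signal; practical systems add term weighting and filtering on top of this.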
Discovery of Cancer related protein interactions
Prediction of new phosphorylation relations
Data
• Unstructured data: “Angiogenesis was expressed in the majority of cases. In CRC, the microvascular density (MVD) was higher than that from ACC. The ratio CD31/CD105 was 1 in ACC and 3 in CRC. VEGF was positive in 25% of ACC and 80% of CRC. In CRC were more mature vessels, marked only with CD31 than immature vessels or endothelial isolated cells marked with both CD31 and CD105. In ACC prevailed the neoformed vessels marked with both CD31 and CD105.” (18060184)
• LOD
[Figure: graph with nodes ESR1, Tamoxifen, PIK3CA, SGK1, AKT1, PDPK1, Copanlisib, GSK650394 and site labels S102, T291, S74, T37, T65, T369, S529. Legend: Known Links, Fujitsu Prediction]
Sentient Web (Graphs + IoT + AI/ML)
“Ecosystems of services with awareness of the world through sensors, and reasoning based upon graph data & rules together with graph algorithms and machine learning” Dave Raggett (W3C/ERCIM) / Michael N. Huhns (University of South Carolina)
• Combining symbolic information with statistics based upon prior knowledge and past experience
− Large range of reasoning techniques
o Deductive, inductive, abductive, causal, counterfactual, temporal, spatial, etc.
o Together with efficient graph algorithms
− Continuous learning
o Heuristics, simulated annealing, reinforcement learning
Thank you
Linked Data example
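[Figure: RDF graph linking the Wikidata and Eurostat descriptions of Bulgaria]
• Wikidata: wd:Q219, rdfs:label “Bulgaria”@en, linked to Sovereign State (wd:Q3624078); NUTS code (wdt:P605) “BG”; further literal values “74.61”, “8031.59”, “7,265,115”
• Eurostat: BULGARIA (dic/geo#BG), a NUTS Region (ramon:NUTSRegion), NUTS code “BG”
• Observation: Participation rates of 4-year-olds in education (R05_1), eus:geo dic/geo#BG, “2012”, “79.5”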
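The example works because both datasets share the NUTS code “BG”, which allows a join from the Wikidata entity to the Eurostat observation. A minimal sketch of that join in plain Python follows. The identifiers wd:Q219, wd:Q3624078, wdt:P605, ramon:NUTSRegion, eus:geo, R05_1 and dic/geo#BG come from the example. The predicate wdt:P31 ("instance of") and every `ex:` name are assumptions added to make the sketch self-contained.

```python
# Wikidata side. wdt:P31 ("instance of") is an assumption; the example only
# shows the class node wd:Q3624078 next to wd:Q219.
WIKIDATA = [
    ("wd:Q219", "rdfs:label", "Bulgaria"),
    ("wd:Q219", "wdt:P31", "wd:Q3624078"),  # Sovereign State
    ("wd:Q219", "wdt:P605", "BG"),          # NUTS code
]

# Eurostat side. Every ex: predicate and ex:obs1 are hypothetical placeholders.
EUROSTAT = [
    ("dic/geo#BG", "rdf:type", "ramon:NUTSRegion"),
    ("dic/geo#BG", "ex:nutsCode", "BG"),
    ("ex:obs1", "eus:geo", "dic/geo#BG"),
    ("ex:obs1", "ex:indicator", "R05_1"),
    ("ex:obs1", "ex:year", "2012"),
    ("ex:obs1", "ex:value", "79.5"),
]


def values(triples, subject, predicate):
    """Return all objects stated for a given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def observations_for(wikidata_entity):
    """Follow the shared NUTS code from a Wikidata entity to Eurostat observations."""
    results = []
    for code in values(WIKIDATA, wikidata_entity, "wdt:P605"):
        regions = [s for s, p, o in EUROSTAT if p == "ex:nutsCode" and o == code]
        for region in regions:
            for obs in [s for s, p, o in EUROSTAT if p == "eus:geo" and o == region]:
                results.append({
                    "indicator": values(EUROSTAT, obs, "ex:indicator")[0],
                    "year": values(EUROSTAT, obs, "ex:year")[0],
                    "value": float(values(EUROSTAT, obs, "ex:value")[0]),
                })
    return results


print(observations_for("wd:Q219"))
```

With real endpoints the same join would typically be written as a federated SPARQL query rather than as Python loops.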