Evolving the Web into a Giant Global Database
Marko A. Rodriguez
T-5, Center for Nonlinear Studies
Los Alamos National Laboratory
http://markorodriguez.com
February 12, 2009
Abstract
The Web as we know it today will not be the Web as we know it tomorrow. The Web of today is oriented towards the universal accessibility of files (e.g. web pages, images). The Web of today can be thought of as a large-scale, distributed file system. The Web of tomorrow will encode any datum (e.g. strings, integers, dates). The Web of tomorrow can be thought of as a large-scale, distributed database. This talk will discuss the future Web with special focus on the supporting standards and application visions.
New Mexico Internet Professionals Association Lecture (NMIPA) – Santa Fe, New Mexico – February 12, 2009
Outline
• The Space of Uniform Resource Identifiers
• The World Wide Web vs. the Semantic Web
• Relational Databases vs. Triple Stores
• Ontologies and Reasoning
• General-Purpose Computing on the Semantic Web
Internet Address Spaces
• The Uniform Resource Identifier (URI) is the superclass of the Uniform Resource Locator (URL) and Uniform Resource Name (URN).
The Uniform Resource Locator
• The set of all URLs is the address space of all resources that can be located and retrieved on the Web. URLs denote where a resource is.
  ◦ http://markorodriguez.com/index.html
    ∗ Domain name server (DNS): markorodriguez.com → 216.251.43.6
    ∗ http:// means an HTTP GET, by default at port 80.
    ∗ /index.html means the resource to get at that Internet location.
[Figure: markorodriguez.com resolves to 216.251.43.6, where a web server holds index.html]
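The decomposition above can be sketched with Python's standard library. This is a minimal illustration, not part of the talk: the helper name `describe_url` and the small `DEFAULT_PORTS` table are assumptions introduced here.

```python
from urllib.parse import urlsplit

# Default ports per scheme; only the two schemes used on the slides are listed (assumed table).
DEFAULT_PORTS = {"http": 80, "ftp": 21}

def describe_url(url):
    """Split a URL into the parts a client needs to locate the resource."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,    # protocol to speak
        "host": parts.hostname,    # name that DNS resolves to an IP address
        "port": parts.port or DEFAULT_PORTS.get(parts.scheme),  # explicit port, else the default
        "path": parts.path,        # resource to get on that host
    }

info = describe_url("http://markorodriguez.com/index.html")
print(info)
```

Resolving the host name to 216.251.43.6 is left out on purpose, since that step depends on a live DNS lookup.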
The Uniform Resource Name
• The set of all URNs is the address space of all resources within the urn: namespace.
  ◦ urn:uuid:bd93def0-8026-11dd-842be54955baa12
  ◦ urn:issn:0892-3310
  ◦ urn:doi:10.1016/j.knosys.2008.03.030
• Named resources need not be retrievable through the Web.
• URNs denote what a resource is.
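A URN has the shape urn:<namespace identifier>:<namespace-specific string>. The sketch below splits the examples above into those two parts; `parse_urn` is a hypothetical helper written for this illustration and does no validation beyond the basic shape.

```python
def parse_urn(urn):
    """Split 'urn:<nid>:<nss>' into its namespace identifier and namespace-specific string."""
    # maxsplit=2 keeps any further colons inside the namespace-specific string.
    scheme, nid, nss = urn.split(":", 2)
    if scheme.lower() != "urn" or not nid or not nss:
        raise ValueError(f"not a URN: {urn!r}")
    return nid, nss

issn = parse_urn("urn:issn:0892-3310")
doi = parse_urn("urn:doi:10.1016/j.knosys.2008.03.030")
print(issn, doi)
```

Nothing here retrieves anything: the name identifies the journal or article whether or not a copy is reachable over the Web.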
The Uniform Resource Identifier
• The URI address space is an infinite space for all Internet resources.
  ◦ http://markorodriguez.com/index.html
  ◦ urn:issn:0892-3310
  ◦ ftp://markorodriguez.com/private/markos_secrets.txt
  ◦ http://www.lanl.gov#fluffy
• Important: URIs can denote concepts, instances, and data.
[Figure: the URIs lanl:fluffy and lanl:fluffy_legs]
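Names such as lanl:fluffy are prefixed abbreviations of full URIs. The sketch below expands them, assuming that the lanl: prefix stands for http://www.lanl.gov# (suggested by the http://www.lanl.gov#fluffy example, but not stated on the slides); `PREFIXES` and `expand` are names introduced for this illustration.

```python
from urllib.parse import urlsplit

# Assumed prefix table: lanl: is taken to abbreviate http://www.lanl.gov#.
PREFIXES = {"lanl": "http://www.lanl.gov#"}

def expand(prefixed_name, prefixes=PREFIXES):
    """Turn a prefixed name such as 'lanl:fluffy' into a full URI."""
    prefix, _, local = prefixed_name.partition(":")
    if prefix not in prefixes:
        raise KeyError(f"unknown prefix: {prefix!r}")
    return prefixes[prefix] + local

fluffy = expand("lanl:fluffy")
legs = expand("lanl:fluffy_legs")
# The fragment after '#' names the thing; no file called 'fluffy' has to exist.
print(fluffy, urlsplit(fluffy).fragment)
```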
The “Uniform Resource Graph”
• We can denote where something is, what something is, but how do we denote how something relates to something else?
• How can we denote what something means, where meaning is determined by its place within a larger relational structure?
  ◦ URIs are like words. They denote things in the real or imaginary world.
  ◦ Linking URIs is like defining words. Similar to how a dictionary defines words in terms of other words.
Undirected Single-Relational Network
[Figure: an undirected network with a single edge type over the nodes Human-A, Human-B, Human-C, Human-D, Human-E, and Human-F]
Directed Single-Relational Network
[Figure: a directed network with a single edge type over the nodes Article-A, Article-B, Article-C, Article-D, Article-E, and Article-F]
From the World Wide Web to the Semantic Web
• The World Wide Web is primarily concerned with the Hypertext Transfer Protocol (HTTP) and with retrievable resources in the URL address space.
• These retrievable resources are files: HTML documents, images, audio, etc. The “web” is created when HTML documents contain URLs.
[Figure: HTML files under http://markorodriguez.com/ (index.html, Home.html, Research.html, Resume.html) connected by href links]
Directed Multi-Relational Network
[Figure: a directed network with several edge types. Nodes: Human-A, Human-B, Article-A, Article-B, Journal-A, Publisher-A. Edge labels: authored, containedIn, editorOf, publishedBy]
From the World Wide Web to the Semantic Web
• The Semantic Web is primarily concerned with URIs. If the World Wide Web is the web of files, the Semantic Web is the web of concepts. In other words, for the World Wide Web, the level of granularity is the retrievable file. For the Semantic Web, it is the ideas in that file. Moreover, these ideas are not necessarily contained in a file. Their existence is predicated on their URI. Their meaning is predicated on their relationship to other URIs. The web of URIs is the Semantic Web.
The Resource Description Framework
• The Resource Description Framework (RDF) is the standard for representing the relationship between URIs and literals (e.g. float, string, date-time, etc.). I would have preferred the name “Uniform Resource Graph” (URG).
• Relationships are directed, labeled links between URIs. A subject URI points to an object URI or literal by means of a predicate URI.
[Figure: an RDF graph of three statements, each read as subject, predicate, object]
lanl:marko foaf:knows lanl:jhw
lanl:marko foaf:name "Marko A. Rodriguez"^^xsd:string
lanl:jhw foaf:name "Jennifer H. Watkins"^^xsd:string
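A statement is just a (subject, predicate, object) triple, so a small graph can be sketched as a set of Python tuples. This is an illustration of the data model only; the `match` helper is hypothetical and is not an RDF library call.

```python
# Each statement is a (subject, predicate, object) tuple; the graph is a set of them.
triples = {
    ("lanl:marko", "foaf:knows", "lanl:jhw"),
    ("lanl:marko", "foaf:name", '"Marko A. Rodriguez"^^xsd:string'),
    ("lanl:jhw", "foaf:name", '"Jennifer H. Watkins"^^xsd:string'),
}

def match(graph, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return sorted(
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )

knows = match(triples, p="foaf:knows")      # every foaf:knows link
about_marko = match(triples, s="lanl:marko")  # everything said about lanl:marko
print(knows, about_marko)
```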
[Figure: a larger RDF graph. Resources: lanl:marko, lanl:jhw, lanl:lanl, unm:unm, urn:doi:10.1016/j.joi.2008.04.002. Classes: foaf:Person, foaf:Organization, foaf:Document. Edge labels: foaf:knows, foaf:name, foaf:member, foaf:publications, rdf:type. Name literals: "Marko A. Rodriguez", "Jennifer H. Watkins", "Los Alamos National Laboratory", "University of New Mexico"]
The RDF Data Model and its Serializations
• RDF is a data model. As such, there exist many serializations (encodings) of that model.
• RDF/XML is not RDF. It is a serialization of RDF. It is wise to avoid learning RDF/XML at all costs, as it is an unintuitive standard. Other serializations include: N-Triples, N3, TriX, TriG, ...
<http://www.lanl.gov/uc33c7c98> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.mesur.org/schemas/2007-01/mesur#Journal> .
<http://www.lanl.gov/uc33c7c98> <http://xmlns.com/foaf/0.1/name> "Journal of Neuroscience Research"^^<http://www.w3.org/2001/XMLSchema#string> .
<http://www.lanl.gov/uc33c7c98> <http://www.mesur.org/schemas/2007-01/mesur#hasDoi> "urn:doi:10.1002/(issn)1097-4547"^^<http://www.w3.org/2001/XMLSchema#anyURI> .
<http://www.lanl.gov/uc33c7c98> <http://www.mesur.org/schemas/2007-01/mesur#hasIssn> "urn:issn:0360-4012"^^<http://www.w3.org/2001/XMLSchema#anyURI> .
<http://www.lanl.gov/uc33c7c98> <http://www.mesur.org/schemas/2007-01/mesur#hasIssn> "urn:issn:1097-4547"^^<http://www.w3.org/2001/XMLSchema#anyURI> .
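The statements above are in the N-Triples serialization: one statement per line, ending in a period. A deliberately simplified reader for lines of this shape might look as follows. The `TRIPLE` pattern and `parse_line` are written for this illustration; they ignore blank nodes, language tags, comments, and escape sequences, so this is not a complete N-Triples parser.

```python
import re

# One statement: subject URI, predicate URI, then a URI or a quoted literal
# with an optional ^^<datatype>, then the closing period.
TRIPLE = re.compile(
    r'^<(?P<s>[^>]*)>\s+<(?P<p>[^>]*)>\s+'
    r'(?:<(?P<o_uri>[^>]*)>|"(?P<lex>[^"]*)"(?:\^\^<(?P<dt>[^>]*)>)?)\s*\.$'
)

def parse_line(line):
    """Parse one simplified N-Triples line into (subject, predicate, object)."""
    m = TRIPLE.match(line.strip())
    if m is None:
        raise ValueError(f"unsupported line: {line!r}")
    if m.group("o_uri") is not None:
        obj = ("uri", m.group("o_uri"))
    else:
        obj = ("literal", m.group("lex"), m.group("dt"))
    return m.group("s"), m.group("p"), obj

name_line = ('<http://www.lanl.gov/uc33c7c98> <http://xmlns.com/foaf/0.1/name> '
             '"Journal of Neuroscience Research"^^<http://www.w3.org/2001/XMLSchema#string> .')
subject, predicate, obj = parse_line(name_line)
print(subject, predicate, obj)
```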
The Semantic Web is a Distributed Database
• The URI address space is distributed.
• URIs can denote data.
• RDF denotes the relationships between URIs.
• The Semantic Web’s foundational standard is RDF.
• Therefore, the Semantic Web is a distributed database.
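Because URIs are global, statements kept on different hosts can be combined by plain set union and then queried as one graph. The sketch below uses made-up host contents modeled on the slides; `host_a`, `host_b`, and `names_of_acquaintances` are hypothetical names.

```python
# Statements held by two independent hosts (hypothetical contents).
host_a = {("lanl:marko", "foaf:knows", "lanl:jhw")}
host_b = {
    ("lanl:jhw", "foaf:name", "Jennifer H. Watkins"),
    ("lanl:jhw", "foaf:member", "lanl:lanl"),
}

# Both hosts use the same URI for the same person, so merging is a set union.
merged = host_a | host_b

def names_of_acquaintances(graph, person):
    """Names of everyone the given person foaf:knows, as far as the graph says."""
    friends = {o for s, p, o in graph if s == person and p == "foaf:knows"}
    return sorted(o for s, p, o in graph if s in friends and p == "foaf:name")

alone = names_of_acquaintances(host_a, "lanl:marko")
together = names_of_acquaintances(merged, "lanl:marko")
print(alone, together)
```

Neither host can answer the question alone; the answer only appears once the two sets of statements are joined on the shared URI lanl:jhw.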
The World Wide Web vs. the Semantic Web
[Figure: on the World Wide Web, an HTML file on the web server at 127.0.0.1 links by href to an HTML file on the web server at 127.0.0.2. On the Semantic Web, RDF on the web server at 127.0.0.1 links by foaf:knows to a resource in the triple store at 127.0.0.2]
Linked Data Cloud
[Figure: the Linked Data cloud diagram of interlinked data sets: SW Conference Corpus, DBpedia, RDF Book Mashup, DBLP Berlin, DBLP Hannover, Revyu, Project Gutenberg, FOAF profiles, SIOC profiles, Geonames, Musicbrainz, Magnatune, Jamendo, World Factbook, SemWebCentral, Eurostat, ECS Southampton, BBC Later + TOTP, BBC John Peel, Doapspace, OpenGuides, GovTrack, US Census Data, W3C WordNet, flickr wrappr, Flickr exporter, Wikicompany, OpenCyc, lingvoj, Ontoworld, AudioScrobbler, QDOS, RKB Explorer, riese]
Diagram provided by Richard Cyganiak (richard@cyganiak.de)
Relational Databases vs. Triple Stores: Technology
• A relational database’s (e.g. MySQL, PostgreSQL, Oracle) natural representation is a collection of interlinked tables.
• A triple store’s (e.g. OpenSesame, AllegroGraph, Neo4j) natural representation is a multi-relational network, or graph.
[Figure: a relational database at 127.0.0.1 beside a triple store at 127.0.0.2]
Relational Databases vs. Triple Stores: Culture
• Relational databases tend to not maintain public access points.
• Relational database users tend to not publish their schemas.
• Triple stores maintain public access points called SPARQL endpoints.
• Triple store users tend to reuse and extend public schemas called ontologies.
SQL vs. SPARQL
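SPARQL:
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?x WHERE {
  ?y foaf:name "Los Alamos National Laboratory"^^xsd:string .
  ?y foaf:member ?x .
  ?x foaf:knows ?z .
  ?z foaf:name "Marko A. Rodriguez"^^xsd:string
}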
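SQL:
-- person_knows is an assumed name for the (id, knows) join table aliased as r1
SELECT p1.id
FROM person p1, organization o1, person_knows r1, person p2
WHERE o1.name = 'Los Alamos National Laboratory'
  AND o1.id = p1.member
  AND p1.id = r1.id
  AND r1.knows = p2.id
  AND p2.name = 'Marko A. Rodriguez';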
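To make the SPARQL side concrete, the sketch below evaluates the same four-pattern query over a small in-memory set of triples by joining variable bindings one pattern at a time. The data set, `unify`, and `query` are assumptions made for this illustration; a real SPARQL engine does considerably more (typed literals, optimization, and so on).

```python
# Hypothetical data modeled on the slides; literals are plain strings for brevity.
graph = {
    ("lanl:lanl", "foaf:name", "Los Alamos National Laboratory"),
    ("lanl:lanl", "foaf:member", "lanl:jhw"),
    ("lanl:lanl", "foaf:member", "lanl:marko"),
    ("lanl:jhw", "foaf:knows", "lanl:marko"),
    ("lanl:marko", "foaf:knows", "lanl:jhw"),
    ("lanl:marko", "foaf:name", "Marko A. Rodriguez"),
    ("lanl:jhw", "foaf:name", "Jennifer H. Watkins"),
}

def unify(pattern, triple, binding):
    """Extend binding so that pattern equals triple, or return None if impossible."""
    extended = dict(binding)
    for p_term, t_term in zip(pattern, triple):
        if p_term.startswith("?"):                 # a variable
            if extended.setdefault(p_term, t_term) != t_term:
                return None                        # clashes with an earlier binding
        elif p_term != t_term:                     # a constant that does not match
            return None
    return extended

def query(data, patterns):
    """Join the patterns left to right, keeping every consistent set of bindings."""
    bindings = [{}]
    for pattern in patterns:
        next_bindings = []
        for binding in bindings:
            for triple in data:
                extended = unify(pattern, triple, binding)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    return bindings

results = query(graph, [
    ("?y", "foaf:name", "Los Alamos National Laboratory"),
    ("?y", "foaf:member", "?x"),
    ("?x", "foaf:knows", "?z"),
    ("?z", "foaf:name", "Marko A. Rodriguez"),
])
answers = sorted({b["?x"] for b in results})
print(answers)
```

Each shared variable plays the role of a join condition in the SQL version, without any table or column having to be declared first.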
Ontologies
• An ontology defines your domain of discourse.
• An ontology helps to define the types of abstract classes that exist in your domain and the types of relationships that exist between instances of those classes.
• An ontology allows you to infer implicit knowledge from explicit knowledge.
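A small sketch of inferring implicit knowledge from explicit knowledge: two RDFS-style rules are applied repeatedly until nothing new appears. The class lanl:Animal and the function `infer` are hypothetical additions for this illustration, and only the two rules named in the comments are implemented.

```python
explicit = {
    ("lanl:fluffy", "rdf:type", "lanl:Dog"),
    ("lanl:Dog", "rdfs:subClassOf", "lanl:Mammal"),
    ("lanl:Mammal", "rdfs:subClassOf", "lanl:Animal"),  # lanl:Animal is a hypothetical class
}

def infer(triples):
    """Forward-chain two rules until no new triple is produced:
    1. (a rdf:type b) and (b rdfs:subClassOf d) give (a rdf:type d)
    2. (a rdfs:subClassOf b) and (b rdfs:subClassOf d) give (a rdfs:subClassOf d)
    """
    known = set(triples)
    while True:
        new = set()
        for a, p1, b in known:
            if p1 not in ("rdf:type", "rdfs:subClassOf"):
                continue
            for c, p2, d in known:
                if c == b and p2 == "rdfs:subClassOf":
                    new.add((a, p1, d))
        if new <= known:          # fixed point reached
            return known
        known |= new

closure = infer(explicit)
implicit = sorted(closure - explicit)
print(implicit)
```

Nobody stated that lanl:fluffy is a mammal; it follows from the class hierarchy.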
[Figure: an RDF graph with an OWL ontology. Resources: lanl:marko, lanl:jhw, lanl:fluffy. Classes: foaf:Person, lanl:Dog, lanl:Mammal, and an owl:Restriction blank node _:12345. Edge labels: lanl:hasOwner, rdf:type, rdfs:subClassOf, owl:onProperty, owl:maxCardinality (with value "1"^^xsd:int), owl:differentFrom]
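One way to read the figure, offered here as an assumption rather than as the speaker's exact example: lanl:Dog is restricted to at most one lanl:hasOwner, lanl:fluffy has two owners, and the two owners are stated to be different individuals, so the data contradicts the ontology. The sketch below checks for that situation; `facts` and `cardinality_violations` are hypothetical and this is not an OWL reasoner.

```python
from itertools import combinations

# Assumed reading of the figure.
facts = {
    ("lanl:fluffy", "rdf:type", "lanl:Dog"),
    ("lanl:fluffy", "lanl:hasOwner", "lanl:marko"),
    ("lanl:fluffy", "lanl:hasOwner", "lanl:jhw"),
    ("lanl:marko", "owl:differentFrom", "lanl:jhw"),
}

def cardinality_violations(triples, cls, prop, max_card):
    """List (instance, values) where an instance of cls has more than max_card
    values of prop that are all stated to be pairwise different."""
    different = {frozenset((s, o)) for s, p, o in triples if p == "owl:differentFrom"}
    instances = sorted(s for s, p, o in triples if p == "rdf:type" and o == cls)
    found = []
    for inst in instances:
        values = sorted(o for s, p, o in triples if s == inst and p == prop)
        for group in combinations(values, max_card + 1):
            # Two URIs may name the same individual unless declared different.
            if all(frozenset(pair) in different for pair in combinations(group, 2)):
                found.append((inst, group))
                break
    return found

violations = cardinality_violations(facts, "lanl:Dog", "lanl:hasOwner", 1)
print(violations)
```

Without the owl:differentFrom statement the check reports nothing, because two owner URIs could still denote one and the same individual.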
General-Purpose Computing on the Semantic Web
• It is possible to represent a virtual computing machine and software in the Semantic Web.
• Given that the URI address space is distributed, the computing structures are inherently distributed.
• Thus, the Semantic Web can be used as a giant computer – data, programs, and virtual machines all within the same address space.
x = (3 ∗ 2) + 1
[Figure: the statement encoded in RDF as instruction resources identified by urn:uuid URIs and chained by neno:nextInst. Instruction types (rdf:type): lanl:Push (three, with neno:hasValue "3", "2", and "1" as xsd:int), lanl:Multiply, lanl:Add, and lanl:Set (with neno:hasSymbol "x"^^xsd:string)]
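To show how instructions stored as triples could be executed, here is a toy stack machine that walks a neno:nextInst chain. It is a sketch under stated assumptions, not the Neno/Fhat implementation: the inst:N identifiers are made up, the instruction order (push 3, push 2, multiply, push 1, add, set x) is one plausible compilation of the statement, and `run` is a hypothetical interpreter.

```python
# The program x = (3 * 2) + 1 as triples; inst:1 .. inst:6 are made-up identifiers.
program = {
    ("inst:1", "rdf:type", "lanl:Push"), ("inst:1", "neno:hasValue", 3),
    ("inst:1", "neno:nextInst", "inst:2"),
    ("inst:2", "rdf:type", "lanl:Push"), ("inst:2", "neno:hasValue", 2),
    ("inst:2", "neno:nextInst", "inst:3"),
    ("inst:3", "rdf:type", "lanl:Multiply"),
    ("inst:3", "neno:nextInst", "inst:4"),
    ("inst:4", "rdf:type", "lanl:Push"), ("inst:4", "neno:hasValue", 1),
    ("inst:4", "neno:nextInst", "inst:5"),
    ("inst:5", "rdf:type", "lanl:Add"),
    ("inst:5", "neno:nextInst", "inst:6"),
    ("inst:6", "rdf:type", "lanl:Set"), ("inst:6", "neno:hasSymbol", "x"),
}

def run(triples, start):
    """Follow neno:nextInst from start, executing each instruction on an operand stack."""
    # Every (subject, predicate) pair is single-valued here, so a dict lookup suffices.
    index = {(s, p): o for s, p, o in triples}
    stack, variables = [], {}
    current = start
    while current is not None:
        op = index[(current, "rdf:type")]
        if op == "lanl:Push":
            stack.append(index[(current, "neno:hasValue")])
        elif op == "lanl:Multiply":
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
        elif op == "lanl:Add":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif op == "lanl:Set":
            variables[index[(current, "neno:hasSymbol")]] = stack.pop()
        else:
            raise ValueError(f"unknown instruction type: {op}")
        current = index.get((current, "neno:nextInst"))  # None ends the program
    return variables

state = run(program, "inst:1")
print(state)
```

The program, its operands, and the result all live in the same kind of structure: a set of statements about URIs.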
[Figure: class diagram of the Fhat RDF virtual machine (RVM) ontology. Classes: RVM, Fhat, Instruction, Frame, FrameVariable, Block, ReturnStack, OperandStack, BlockStack. Properties: halt, methodReuse, programLocation, hasFrame, currentFrame, forFrame, fromBlock, returnTop, operandTop, blockTop, hasSymbol, hasValue, rdf:first, rdf:rest, rdf:li. Value types include xsd:boolean, xsd:string, and rdfs:Resource; cardinalities are shown as [0..1], [0..*], or [1]]
Conclusion
• Thank you for your time...
  ◦ My homepage: http://markorodriguez.com
  ◦ Neno/Fhat: http://neno.lanl.gov