
SuRF – Tapping into the Web of Data

Date post: 24-Jan-2015
Upload: cosbas
Description:
SuRF is an Object-RDF Mapper based on the popular rdflib Python library. It exposes RDF triple sets as sets of resources and integrates them into Python's object-oriented paradigm, much as ActiveRDF does for Ruby.
Copyright 2007 Digital Enterprise Research Institute. All rights reserved.
www.deri.org

SuRF – Tapping into the Web of Data

Cosmin Basca
Digital Enterprise Research Institute, Galway
[email protected]

Special Thanks to: Benjamin Heitman and Uldis Bojars
Digital Enterprise Research Institute, Galway
[email protected]

Outline

• About DERI
• Why Semantic Web?
  – Linked Open Data (LOD)
  – RDF (Resource Description Framework)
  – SPARQL
• O-RDF Mapping (ActiveRDF / SuRF)
  – How?
  – Architecture
  – Installation
  – Examples
    • Simple: access DBpedia (Semantic Wikipedia)
    • More complex: create a blog on top of RDF


DERI – http://www.deri.ie/

• Digital Enterprise Research Institute (DERI):
  – http://www.deri.ie/
  – main goal: enabling networked knowledge
  – research about the future of the Web
  – biggest Semantic Web research institute in the world
• 120 people
  – part of the National University of Ireland, Galway


Why?

• Develop Web applications that allow
  – Data Integration
  – Flexibility
    • Schema definition and modeling
    • Schema evolution
  – Robustness
  – Support for new Data
    • Sources
    • Types


There is a Wealth of (RDF) data out there


Popular Semantic Web Vocabularies

• FOAF = for describing people and the social network connections between them
  – http://xmlns.com/foaf/spec/
• SIOC = for describing Social Web content created by people
  – http://sioc-project.org/
• DOAP = for describing software projects
  – http://trac.usefulinc.com/doap
  – used by PyPI


Linked Open Data - Growth

[Figures: three slides of images, not included in the transcript]


The data model

• The traditional approach uses the relational model
  – Usually leads to big, ugly schemas


The RDF (Graph) Data model

• Flexible
  – Support for both schema and data evolution during runtime
  – Simple model
    • Relations are represented explicitly
    • Schema is a graph
    • Can integrate data – union of two graphs
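
The "union of two graphs" point can be illustrated with a toy model in which a graph is simply a set of (subject, predicate, object) tuples. This is a minimal sketch in plain Python 3, not rdflib or SuRF code; the `ex:` names and the triples are made up for illustration.

```python
# Toy model: an RDF graph is a set of (subject, predicate, object) triples.
# All names and triples below are illustrative, not taken from a real dataset.
people = {
    ("ex:eric", "rdf:type", "ex:Person"),
    ("ex:eric", "ex:fullName", "Eric Miller"),
}
projects = {
    ("ex:eric", "ex:worksOn", "ex:some-project"),
    ("ex:eric", "rdf:type", "ex:Person"),  # also present in `people`
}

# Integrating the two sources is a plain set union; the shared triple is kept once.
merged = people | projects

for triple in sorted(merged):
    print(triple)
```

No schema migration is involved: a new predicate such as `ex:worksOn` simply shows up as additional triples.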


The RDF (Graph) Data model

A triple

Subject     Predicate   Object
Eric        is a        Person


Example RDF graph describing Eric Miller (RDF Primer) – human readable format

Subject     Predicate            Object
Eric        is a                 Person
Eric        has full name        Eric Miller
Eric        has e-mail           [email protected]
Eric        has personal title   Dr.


Example RDF graph describing Eric Miller (RDF Primer) – machine readable format

Subject: http://www.w3.org/People/EM/contact#me

• http://www.w3.org/1999/02/22-rdf-syntax-ns#type
  – http://www.w3.org/2000/10/swap/pim/contact#Person
• http://www.w3.org/2000/10/swap/pim/contact#fullName
  – Eric Miller
• http://www.w3.org/2000/10/swap/pim/contact#mailbox
  – mailto:[email protected]
• http://www.w3.org/2000/10/swap/pim/contact#personalTitle
  – Dr.
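
To make "machine readable" concrete, the same four statements can be written as tuples and printed in an N-Triples-like form. This is a rough sketch in plain Python 3: the `URI` marker class and the `term` and `to_ntriples` helpers are hypothetical, and the mailbox address is a placeholder because the transcript hides the real one.

```python
CONTACT = "http://www.w3.org/2000/10/swap/pim/contact#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
ERIC = "http://www.w3.org/People/EM/contact#me"


class URI(str):
    """Marker type (hypothetical): distinguishes URI references from plain literals."""


triples = [
    (URI(ERIC), URI(RDF_TYPE), URI(CONTACT + "Person")),
    (URI(ERIC), URI(CONTACT + "fullName"), "Eric Miller"),
    # Placeholder address: the real mailbox is not recoverable from the transcript.
    (URI(ERIC), URI(CONTACT + "mailbox"), URI("mailto:someone@example.org")),
    (URI(ERIC), URI(CONTACT + "personalTitle"), "Dr."),
]


def term(value):
    """Render a URI as <...> and anything else as a quoted plain literal."""
    if isinstance(value, URI):
        return "<%s>" % value
    return '"%s"' % value.replace("\\", "\\\\").replace('"', '\\"')


def to_ntriples(rows):
    """Serialize triples one per line, each terminated by ' .'."""
    return "\n".join("%s %s %s ." % tuple(term(t) for t in row) for row in rows)


print(to_ntriples(triples))
```

Every node and every link is a full URI, so two datasets that use the same URIs can be merged without ambiguity.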


The RDF (Graph) Data model – Identification

• URIs provide strong references
  – A URIref is an unambiguous pointer to something of meaning
• Nodes (“Subjects”) connect via Links (“Predicates”) to Objects
  – Objects can be Nodes or Literals (plain or typed strings)


SPARQL – Querying the Semantic Web

• SPARQL is to RDF what SQL is to relational tables
• Expressive, designed with the graph data model in mind

[Figure: graph with the nodes CarrieFisher, HarrisonFord, DarrylHannah, Star Wars and Blade Runner, joined by four starred_in edges]

    SELECT ?actor ?movie WHERE {
        ?actor starred_in ?movie
    }
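
The query can be read as a triple pattern in which terms starting with `?` are variables. The sketch below is a toy matcher in plain Python 3, not a SPARQL engine. The four edges are an assumption about the figure, since the transcript lists the nodes and four `starred_in` labels but not which node connects to which.

```python
# Assumed edges: the transcript does not record how the figure connects the nodes.
triples = [
    ("CarrieFisher", "starred_in", "Star Wars"),
    ("HarrisonFord", "starred_in", "Star Wars"),
    ("HarrisonFord", "starred_in", "Blade Runner"),
    ("DarrylHannah", "starred_in", "Blade Runner"),
]


def match(pattern, data):
    """Yield one {variable: value} binding for every triple that fits the pattern."""
    for triple in data:
        binding = {}
        for want, have in zip(pattern, triple):
            if want.startswith("?"):
                # A variable binds to the value; a repeated variable must agree.
                if binding.setdefault(want, have) != have:
                    break
            elif want != have:
                break
        else:
            yield binding


# Counterpart of: SELECT ?actor ?movie WHERE { ?actor starred_in ?movie }
rows = list(match(("?actor", "starred_in", "?movie"), triples))
for row in rows:
    print(row["?actor"], row["?movie"])
```

Replacing a variable with a constant narrows the result, for example `match(("?actor", "starred_in", "Star Wars"), triples)`.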


Levels of Data abstraction

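[Figure: stack of abstraction levels, top to bottom]

• APPLICATION
• CONCEPTUAL – Relational Schemata, Ontology
• LOGICAL – SQL, SPARQL, RDQL, Prolog Queries
• PHYSICAL – Indexes, Disk / Memory, Data representation
• DATA

Access paths shown in the figure: Direct SPARQL Access; O-RDF Mapper (SuRF)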


O-RDF Mapper, Why?

• Clean OO design
• Increased productivity
  – model is free from persistence constraints
• Separation of concerns and specialization
• ORMs often reduce the amount of code that needs to be written, making the software more robust
  – 20% to 30% less code needs to be written
  – Less code – less testing – fewer errors


O-RDF Mapper, How?

• How do we see RDF data?
  – As a SET of triples?
  – As a SET of resources?
• The resource view is more suitable for the OO model
• How do we define an RDF resource?
  – All triples <S,P,O> with the same subject (ActiveRDF, SuRF)
  – And all triples <O,P,S>, in which the resource appears as the object (SuRF)
• Apply Open World principles
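
The two halves of the resource definition can be sketched by grouping triples around one node: the triples where it is the subject, and the triples where it is the object. This is an illustrative plain Python 3 sketch, not SuRF's implementation; `resource_view` and the `ex:` data are hypothetical.

```python
from collections import defaultdict

# Illustrative data only.
triples = [
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:carol", "foaf:knows", "ex:alice"),
]


def resource_view(node, data):
    """Return (direct, inverse) attribute maps for one resource.

    direct:  predicate -> objects, from triples <S,P,O> where S is the node
    inverse: predicate -> subjects, from triples where the node is the object
    """
    direct = defaultdict(list)
    inverse = defaultdict(list)
    for s, p, o in data:
        if s == node:
            direct[p].append(o)
        if o == node:
            inverse[p].append(s)
    return dict(direct), dict(inverse)


direct, inverse = resource_view("ex:alice", triples)
print(direct)   # what ex:alice says about others
print(inverse)  # what others say about ex:alice
```

Under an open-world reading, an empty or missing entry means "not known in this store", not "false".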


SuRF – Semantic Resource Framework

• Inspired by ActiveRDF
  – Developed in DERI for Ruby
  – Exposes RDF as sets of resources
• Semantic attributes exposed as a “virtual API”, generated through introspection
  – Naming convention:
    • instance.namespace_attribute
    • cosmin.foaf_knows
• Finder methods
  – Retrieve resources by type or by attributes
• Session keeps track of resources; when calling session.commit(), only dirty resources are persisted
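
The `instance.namespace_attribute` convention can be imitated with Python's `__getattr__` hook. The class below is a toy stand-in written in plain Python 3 to show the idea. It is not SuRF's actual resource class, and the triples and example.org URIs are invented.

```python
NAMESPACES = {"foaf": "http://xmlns.com/foaf/0.1/"}
FOAF = NAMESPACES["foaf"]


class Resource(object):
    """Toy resource (hypothetical): resolves prefix_local attributes against triples."""

    def __init__(self, subject, triples):
        self.subject = subject
        self._triples = triples

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails.
        prefix, sep, local = name.partition("_")
        if not sep or prefix not in NAMESPACES:
            raise AttributeError(name)
        predicate = NAMESPACES[prefix] + local
        return [o for s, p, o in self._triples
                if s == self.subject and p == predicate]


# Illustrative data only.
data = [
    ("http://example.org/cosmin", FOAF + "name", "Cosmin"),
    ("http://example.org/cosmin", FOAF + "knows", "http://example.org/alice"),
]
cosmin = Resource("http://example.org/cosmin", data)
print(cosmin.foaf_knows)
print(cosmin.foaf_name)
```

Nothing is declared up front in this sketch: which attributes return values depends entirely on the data, which is what makes the API "virtual".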


SuRF – Architecture

[Figure: architecture diagram with the components Session, Store (Reader, Writer), Resource Proxy, Serializer, Query and Namespace Manager]


SuRF – Architecture – Currently supported plugins

Store
• Reader
  – SPARQL HTTP protocol
  – Sesame2 API (Franz)
  – Sesame2 HTTP
• Writer
  – Sesame2 API (Franz)
  – Sesame2 HTTP

• Add your own plugins, extend:
    surf.store.plugins.RDFReader
    surf.store.plugins.RDFWriter
  – Redefine the __type__ attribute
  – This is the plugin identifier
• To install plugins
    import my_plugin
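
One way an identifier attribute plus a bare import can be enough to install a plugin is a registry keyed by `__type__`. The sketch below uses stand-in classes in plain Python 3. How SuRF really discovers plugins is not described in the slides, so `READERS`, `register`, `make_reader` and `InMemoryReader` are all hypothetical.

```python
READERS = {}  # plugin identifier -> reader class (hypothetical registry)


class RDFReader(object):
    """Stand-in base class; SuRF's real one lives in surf.store.plugins."""
    __type__ = None  # plugin identifier, redefined by each subclass

    def load(self, subject):
        raise NotImplementedError


def register(cls):
    """Class decorator: runs at import time and files the class under its __type__."""
    READERS[cls.__type__] = cls
    return cls


@register
class InMemoryReader(RDFReader):
    __type__ = "in-memory"

    def __init__(self, triples):
        self.triples = triples

    def load(self, subject):
        return [t for t in self.triples if t[0] == subject]


def make_reader(kind, *args):
    """Look a reader up by the identifier a user would pass as reader='...'."""
    return READERS[kind](*args)


reader = make_reader("in-memory", [("a", "p", "b"), ("c", "p", "d")])
print(reader.load("a"))
```

Because registration happens as a side effect of defining the class, importing the module that contains it is all a user has to do.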


SuRF - installation

• Available on PyPI
  – easy_install -U surf (to get the latest)
  – Open source, available on Google Code, BSD licence
    • http://code.google.com/p/surfrdf/


SuRF – simple example

DBpedia public SPARQL endpoint - read-only

• Create the store proxy

    from surf import *

    store = Store(reader='sparql-protocol',
                  endpoint='http://dbpedia.org/sparql',
                  default_graph='http://dbpedia.org')

• Create the surf session

    print('Create the session')
    session = Session(store, {})

• Map a DBpedia concept to an internal class

    PhilCollinsAlbums = session.get_class(ns.YAGO['PhilCollinsAlbums'])


• Get all Phil Collins albums

    all_albums = PhilCollinsAlbums.all()

• Do something with the albums (display the links to their covers)

    print('All covers')
    for a in all_albums:
        # skip albums that have no name attribute
        if a.dbpedia_name:
            print('\tCover %s for "%s"' % (a.dbpedia_cover, a.dbpedia_name))


SuRF – integrate into Pylons

• Create a blog on top of an RDF database
• Replace SQLAlchemy with SuRF
• Download and install either AllegroGraph Free Edition (preferred) or Sesame2
  – http://www.franz.com/downloads/clp/ag_survey
  – Free for up to 50,000,000 triples (records)
• Install Pylons: easy_install pylons
• Install SuRF: easy_install surf
• Create a Pylons application:

    paster create -t pylons MyBlog
    cd MyBlog


SuRF – Pylons Blog

• ~/MyBlog/development.ini: in the [app:main] section add

    rdf_store = localhost
    rdf_store_port = 6789
    rdf_repository = tagbuilder
    rdf_catalog = repositories

• ~/MyBlog/myblog/config/environment.py

    from surf import *

    rdf_store = Store(reader='sparql-sesame2-api',
                      writer='sesame2-api',
                      server=config['rdf_store'],
                      port=config['rdf_store_port'],
                      catalog=config['rdf_catalog'],
                      repository=config['rdf_repository'])

    rdf_session = Session(rdf_store, {})
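
Values read from an .ini file arrive as strings, which matters for settings such as the port. The sketch below parses the same four keys with the Python 3 standard library instead of Pylons. Whether the store plugin needs an integer port is not stated in the slides, so the explicit `int()` conversion is an assumption.

```python
import configparser

# Same keys as in development.ini, inlined so the sketch is self-contained.
INI_TEXT = """
[app:main]
rdf_store = localhost
rdf_store_port = 6789
rdf_repository = tagbuilder
rdf_catalog = repositories
"""

parser = configparser.ConfigParser()
parser.read_string(INI_TEXT)
settings = dict(parser["app:main"])

# Every value is a str at this point; convert where a number is expected (assumption).
port = int(settings["rdf_store_port"])

print(settings)
print(port)
```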


• ~/MyBlog/myblog/model/__init__.py

    from surf import *

    def init_model(session):
        # Blog is declared global so that controllers can reach it as model.Blog
        global rdf_session, Blog
        rdf_session = session

        # register a namespace for the concepts in my blog
        ns.register(myblog='http://example.url/myblog/namespace#')
        Blog = rdf_session.get_class(ns.MYBLOG['Blog'])

• Create the blog controller

    paster controller blog

• ~/MyBlog/myblog/controllers/blog.py

    import logging

    from myblog.lib.base import *

    log = logging.getLogger(__name__)

    class BlogController(BaseController):

        def index(self):
            c.posts = model.Blog.all(0,5)
            return render("/blog/index.html")


• Create the template directory

    mkdir ~/MyBlog/myblog/templates/blog

• ~/MyBlog/myblog/templates/blog/index.html

    <%inherit file="site.html" />
    <%def name="title()">MyBlog Home</%def>
    <p>${len(c.posts)} new blog posts!</p>
    % for post in c.posts:
    <p class="content" style="border-style:solid;border-width:1px">
        <span class="h3"> ${post.myblog_title} </span>
        <span class="h4">Posted on: ${post.myblog_date} by ${post.myblog_author}</span>
        <br> ${post.myblog_content}
    </p>
    % endfor

• ~/MyBlog/myblog/templates/blog/site.html
• Start the built-in development server:

    paster serve --reload development.ini


SuRF – Tapping into the Web of Data

• Can tap into the Web of Data
  – SPARQL endpoints
  – Local or remote RDF stores
  – Plugin framework allows more access protocols to be defined
• Code is generated dynamically (pragmatic bottom-up approach):
  – Introspection, meta-programming
  – Exposes a virtual API (defined by the data and the schema) to the developer
• Can easily be integrated into popular Python frameworks
  – Pylons
