An Introduction to Linked Data, Its Applications and Challanges
Samad Paydar [email protected] WTLab Research Group Ferdowsi University of Mashhad An Introduction to Linked Data, Its Applications and Challanges 2 nd October 2009
Transcript
Page 1: An Introduction to Linked Data, Its Applications and Challanges

Samad Paydar
[email protected]
WTLab Research Group
Ferdowsi University of Mashhad

An Introduction to Linked Data,

Its Applications and Challanges

2nd October 2009

Page 2: An Introduction to Linked Data, Its Applications and Challanges

Outline

The Web of Documents vs. the Web of Data
Linked Data
Linking Open Data Project
Linked Data Technology Stack
Linking Data Applications
Outlook
Similar Developments
Challenges

Page 3: An Introduction to Linked Data, Its Applications and Challanges

The Web of Documents vs. the Web of Data

Page 4: An Introduction to Linked Data, Its Applications and Challanges

The Web of Documents

Traditional Web, Hypertext Web
Analogy: a global filesystem
Designed for: human consumption
Primary objects: documents
Links: untyped, between documents (or parts of documents)
Degree of structure in objects: fairly low
Semantics of content and links: implicit

Page 5: An Introduction to Linked Data, Its Applications and Challanges

The Web of Documents

Page 6: An Introduction to Linked Data, Its Applications and Challanges

The Web of Documents : Challenges

The Web has radically altered the way people share knowledge
  By lowering the barrier to publishing and accessing documents
The same is not yet true for applications and data
Traditionally, data on the Web is published in formats such as HTML tables, CSV or XML files, …
  Much of the structure and semantics of the data is sacrificed

Page 7: An Introduction to Linked Data, Its Applications and Challanges

The Web of Documents : Challenges

Data integration
  “Show me all the publications from Semantic Web-related conferences in 2007”
Querying across data sources
  “Which WWW2008 papers have been written by people from companies of fewer than 100 people?”
Note that all the data required to answer these questions might already be available on the Web.

Page 8: An Introduction to Linked Data, Its Applications and Challanges

The Web of Data

Analogy: a global data space
Designed for: machines first, humans later
Primary objects: things (descriptions of things)
Links: typed, between things
Degree of structure in objects: high
Semantics of content and links: explicit

Page 9: An Introduction to Linked Data, Its Applications and Challanges

The Web of Linked Data

Page 10: An Introduction to Linked Data, Its Applications and Challanges

Linked Data

Page 11: An Introduction to Linked Data, Its Applications and Challanges

Linked Data

Linked Data is about using the Web to create typed links between data from different sources.
It refers to data published on the Web in such a way that
  It is machine-readable
  Its meaning is explicitly defined
  It is linked to other datasets
  It can be linked to from external datasets

Page 12: An Introduction to Linked Data, Its Applications and Challanges

Linked Data and Web of Data

The term Linked Data refers to a set of best practices for publishing and connecting structured data on the Web. These best practices have been adopted by an increasing number of data providers over the last three years, leading to the creation of a global data space containing billions of assertions - the Web of Data.


Page 13: An Introduction to Linked Data, Its Applications and Challanges

Properties of the Web of Data

It is generic
  Can contain any type of data
  Data about anything
Anyone can publish data
No constraints on the choice of vocabularies
Entities are connected by RDF links

Page 14: An Introduction to Linked Data, Its Applications and Challanges

A Taste of Linked Data

Page 15: An Introduction to Linked Data, Its Applications and Challanges

A Taste of Linked Data

Page 16: An Introduction to Linked Data, Its Applications and Challanges

A Taste of Linked Data

Page 17: An Introduction to Linked Data, Its Applications and Challanges

Linked Data

Page 18: An Introduction to Linked Data, Its Applications and Challanges

Linking Open Data Project

Page 19: An Introduction to Linked Data, Its Applications and Challanges

LOD Project

Linking Open Data Project
  A community project
  Founded in January 2007
  Supported by the W3C Semantic Web Education and Outreach Group
  Goal: to bootstrap the Web of Data by identifying existing datasets that are available under open licenses, converting them to RDF (according to the Linked Data principles), interlinking them with other datasets, and publishing them on the Web

Page 20: An Introduction to Linked Data, Its Applications and Challanges

LOD Cloud : May 2007

Page 21: An Introduction to Linked Data, Its Applications and Challanges

LOD Cloud

The image shows only datasets that are published according to the Linked Data principles and are interlinked with at least one other dataset in the cloud.
  Each circle represents a dataset
  The size of a circle corresponds to the number of triples
  Arrows represent the links between datasets
  The thickness of an arrow indicates the number of links between datasets
  Some datasets act as hubs
    E.g. DBpedia, Geonames, …

Page 22: An Introduction to Linked Data, Its Applications and Challanges

DBpedia

Extracts structured information from Wikipedia and makes it available on the Web under an open license

Page 23: An Introduction to Linked Data, Its Applications and Challanges

Geonames

Contains over eight million geographical names
  6.5 million unique features
  2.2 million populated places and 1.8 million alternate names
Features are categorized into one of nine feature classes
  and further subcategorized into one of 645 feature codes

Page 24: An Introduction to Linked Data, Its Applications and Challanges

Geonames

Page 25: An Introduction to Linked Data, Its Applications and Challanges

LOD Cloud : July 2007

Page 26: An Introduction to Linked Data, Its Applications and Challanges

LOD Cloud : August 2007

Page 27: An Introduction to Linked Data, Its Applications and Challanges

LOD Cloud : November 2007

Page 28: An Introduction to Linked Data, Its Applications and Challanges

LOD Cloud : February 2008

Page 29: An Introduction to Linked Data, Its Applications and Challanges

LOD Cloud : March 2009

Page 30: An Introduction to Linked Data, Its Applications and Challanges

LOD Cloud : July 2009

Page 31: An Introduction to Linked Data, Its Applications and Challanges

LOD Cloud

Content of the cloud is diverse
  Data about geographic locations, people, companies, books, scientific publications, films, music, TV programs, genes, proteins, …
Some statistics
  The Web of Data currently consists of 4.7 billion RDF triples, interlinked by around 142 million RDF links (May 2009)

Page 32: An Introduction to Linked Data, Its Applications and Challanges

A Programmer’s Point of View

Semantic technologies such as Linked Data decouple applications from data through the use of a simple, abstract data model.

Any application that understands the model can consume any data source published according to that model.
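To make the decoupling concrete, here is a minimal Python sketch. It assumes the abstract model is a plain (subject, predicate, object) tuple; the two sources and every example.org URI in them are invented for illustration and are not part of the talk.

```python
from typing import Dict, Iterable, List, Tuple

Triple = Tuple[str, str, str]

# Two hypothetical sources that share nothing except the triple model.
FILM_SOURCE: List[Triple] = [
    ("http://example.org/film/1", "http://example.org/vocab/title", "Pulp Fiction"),
    ("http://example.org/film/1", "http://example.org/vocab/year", "1994"),
]
CITY_SOURCE: List[Triple] = [
    ("http://example.org/city/berlin", "http://example.org/geo/name", "Berlin"),
]


def describe(triples: Iterable[Triple], subject: str) -> Dict[str, List[str]]:
    """Collect every predicate/object pair recorded for one subject."""
    description: Dict[str, List[str]] = {}
    for s, p, o in triples:
        if s == subject:
            description.setdefault(p, []).append(o)
    return description


# The same function consumes both sources without knowing their vocabularies.
film = describe(FILM_SOURCE, "http://example.org/film/1")
city = describe(CITY_SOURCE, "http://example.org/city/berlin")
```

The function never inspects the vocabulary, which is the point of the slide: only the shape of the data is shared.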

Page 33: An Introduction to Linked Data, Its Applications and Challanges

Don’t Miss books

To really get a feel for it, I recommend studying the books shown on this slide.
[Book covers: not captured in this transcript]

Page 34: An Introduction to Linked Data, Its Applications and Challanges

Linked Data Technology Stack

Page 35: An Introduction to Linked Data, Its Applications and Challanges

Linked Data Technology Stack

Page 36: An Introduction to Linked Data, Its Applications and Challanges

Linked Data Principles

Berners-Lee, 2006
1. Use URIs as names for things
2. Use HTTP URIs so that people can look up those names
3. When someone looks up a URI, provide useful information
4. Include links to other URIs, so that they can discover more things

Page 37: An Introduction to Linked Data, Its Applications and Challanges

URI: Uniform Resource Identifier

“URI provides a simple and extensible means for identifying a resource” (RFC 3986)

URL: for documents and other entities that can be located on the Web

URI: a more generic means to identify any entity existing in the world
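As a small illustration of the generic URI syntax, the following Python sketch splits two identifiers into their components with the standard library. The `urn:isbn` value is only an illustrative identifier, and `is_http_uri` is a hypothetical helper name reflecting the Linked Data preference for HTTP URIs.

```python
from urllib.parse import urlsplit


def uri_parts(uri: str) -> dict:
    """Split a URI into the generic components defined by the URI syntax."""
    parts = urlsplit(uri)
    return {
        "scheme": parts.scheme,
        "authority": parts.netloc,
        "path": parts.path,
        "fragment": parts.fragment,
    }


def is_http_uri(uri: str) -> bool:
    """True when the URI can be looked up over HTTP(S)."""
    parts = urlsplit(uri)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


# A URI that is also a locator on the Web.
web_uri = uri_parts("http://dbpedia.org/resource/Pulp_Fiction_%28film%29")
# A URI that names something without saying where to fetch it.
book_uri = uri_parts("urn:isbn:0451450523")
```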

Page 38: An Introduction to Linked Data, Its Applications and Challanges

HTTP

Provides URI dereferencing: a simple mechanism for retrieving
  Resources that can be serialized as a stream of bytes
    E.g. a picture of a dog
  Descriptions of entities that cannot themselves be sent across the network
    E.g. the dog itself
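A hedged sketch of what a dereferencing request might look like in Python. It assumes the client asks for an RDF description through the HTTP `Accept` header (content negotiation is common Linked Data practice but is not stated on the slide); `build_lookup` is a hypothetical helper, and the request is only constructed here, not sent.

```python
from urllib.request import Request


def build_lookup(uri: str, media_type: str = "application/rdf+xml") -> Request:
    """Prepare an HTTP GET for a URI, asking for a description in media_type."""
    return Request(uri, headers={"Accept": media_type})


request = build_lookup("http://dbpedia.org/resource/Pulp_Fiction_%28film%29")

# Sending it would perform the actual lookup, for example:
#   from urllib.request import urlopen
#   with urlopen(request) as response:
#       description = response.read()
```

Asking for `text/html` instead would request the human-readable page about the same thing.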

Page 39: An Introduction to Linked Data, Its Applications and Challanges

RDF

HTML provides a means to structure and link documents
RDF provides a generic, graph-based data model to structure and link data that describes things
A triple: [subject, predicate, object]
  Subject: a URI
  Predicate: a URI
  Object: a URI or a string literal
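The triple structure can be sketched in Python as follows. This is a simplified model that follows the slide (no blank nodes, no typed literals); `is_uri`, `is_well_formed` and `object_kind` are hypothetical helper names, and the URI check is deliberately rough.

```python
from typing import NamedTuple
from urllib.parse import urlsplit


def is_uri(value: str) -> bool:
    """Rough check: an http(s) scheme plus a host counts as a URI here."""
    parts = urlsplit(value)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


def is_well_formed(triple: Triple) -> bool:
    """Subject and predicate must be URIs; the object may be a URI or a literal."""
    return is_uri(triple.subject) and is_uri(triple.predicate)


def object_kind(triple: Triple) -> str:
    """Classify the object as a link to another thing or as a plain literal."""
    return "uri" if is_uri(triple.obj) else "literal"


label = Triple(
    "http://dbpedia.org/resource/Berlin",
    "http://www.w3.org/2000/01/rdf-schema#label",
    "Berlin",
)
link = Triple(
    "http://data.linkedmdb.org/resource/film/77",
    "http://www.w3.org/2002/07/owl#sameAs",
    "http://dbpedia.org/resource/Pulp_Fiction_%28film%29",
)
```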

Page 40: An Introduction to Linked Data, Its Applications and Challanges

RDF Link

RDF links take the form of RDF triples, where the subject of the triple is a URI reference in the namespace of one dataset, while the object of the triple is a URI reference in another.
  S: http://data.linkedmdb.org/resource/film/77
  P: http://www.w3.org/2002/07/owl#sameAs
  O: http://dbpedia.org/resource/Pulp_Fiction_%28film%29

RDF links allow client applications to navigate between data sources to discover additional data.
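A minimal Python sketch of how a client might recognize RDF links and follow them. It assumes the host name is a good enough stand-in for a dataset's namespace, which is a simplification; the second and third triples use invented example.org predicates and an invented director URI.

```python
from typing import List, Tuple
from urllib.parse import urlsplit

Triple = Tuple[str, str, str]


def host(uri: str) -> str:
    """Host part of a URI, used here as a proxy for the dataset namespace."""
    return urlsplit(uri).netloc


def is_rdf_link(triple: Triple) -> bool:
    """True when subject and object are URIs that live in different datasets."""
    subject, _predicate, obj = triple
    return bool(host(subject)) and bool(host(obj)) and host(subject) != host(obj)


def linked_resources(triples: List[Triple], start: str) -> List[str]:
    """URIs in other datasets that can be reached from start in one step."""
    return [o for s, p, o in triples if s == start and is_rdf_link((s, p, o))]


FILM = "http://data.linkedmdb.org/resource/film/77"
triples = [
    (FILM, "http://www.w3.org/2002/07/owl#sameAs",
     "http://dbpedia.org/resource/Pulp_Fiction_%28film%29"),
    # Same dataset on both ends: an ordinary internal triple, not an RDF link.
    (FILM, "http://example.org/vocab/director",
     "http://data.linkedmdb.org/resource/director/1"),
    # Literal object: not a link at all.
    (FILM, "http://example.org/vocab/title", "Pulp Fiction"),
]
targets = linked_resources(triples, FILM)
```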

Page 41: An Introduction to Linked Data, Its Applications and Challanges

RDFS / OWL

Provide a basis for creating vocabularies that can be used to describe entities in the world and how they are related
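As one example of what such a vocabulary enables, this Python sketch derives the types of an entity by following rdfs:subClassOf statements. The class hierarchy and the example.org URIs are invented for illustration; only the two rdf/rdfs property URIs are real.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"


def superclasses(cls: str, triples: List[Triple]) -> Set[str]:
    """All classes reachable from cls by following rdfs:subClassOf."""
    found: Set[str] = set()
    pending = [cls]
    while pending:
        current = pending.pop()
        for s, p, o in triples:
            if s == current and p == RDFS_SUBCLASS_OF and o not in found:
                found.add(o)
                pending.append(o)
    return found


def types_of(entity: str, triples: List[Triple]) -> Set[str]:
    """Stated types of an entity plus the types implied by the class hierarchy."""
    stated = {o for s, p, o in triples if s == entity and p == RDF_TYPE}
    inferred = set(stated)
    for cls in stated:
        inferred |= superclasses(cls, triples)
    return inferred


EX = "http://example.org/vocab/"
graph = [
    (EX + "Film", RDFS_SUBCLASS_OF, EX + "CreativeWork"),
    (EX + "CreativeWork", RDFS_SUBCLASS_OF, EX + "Thing"),
    ("http://example.org/film/1", RDF_TYPE, EX + "Film"),
]
film_types = types_of("http://example.org/film/1", graph)
```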

Page 42: An Introduction to Linked Data, Its Applications and Challanges

Linked Data

Linked Data employs
  HTTP URIs to identify resources
  The HTTP protocol to retrieve resources
  The RDF data model to represent resources
Therefore, it is built on the general architecture of the Web

Page 43: An Introduction to Linked Data, Its Applications and Challanges

Linking Data Applications

Page 44: An Introduction to Linked Data, Its Applications and Challanges

Current Applications

Numerous efforts are underway to research and build applications that exploit this Web of Data. At present, these efforts can be broadly classified into three categories:
1. Linked Data browsers
2. Linked Data search engines and indexes
3. Domain-specific Linked Data applications

Page 45: An Introduction to Linked Data, Its Applications and Challanges

Linked Data Applications

Linked Data Browsers
  Browse things, not just documents
  Browse and navigate between data
  E.g. Disco, Tabulator, Marbles

Page 46: An Introduction to Linked Data, Its Applications and Challanges

Data about Berlin on DBpedia is linked to data about Berlin on Geonames

Page 47: An Introduction to Linked Data, Its Applications and Challanges

Linked Data Search Engines and Indexes

Crawl Linked Data from the Web and provide query capabilities over aggregated data
  Human-oriented
    E.g. Falcon, SWSE
  Application-oriented
    E.g. Swoogle, Watson, …

Page 48: An Introduction to Linked Data, Its Applications and Challanges

Domain-Specific Applications

Revyu
DBpedia Mobile
Talis Aspire
BBC Programmes and BBC Music

Page 49: An Introduction to Linked Data, Its Applications and Challanges

Revyu

Page 50: An Introduction to Linked Data, Its Applications and Challanges

DBpedia Mobile

Uses DBpedia, Revyu, and Flickr

Page 51: An Introduction to Linked Data, Its Applications and Challanges

Outlook

Page 52: An Introduction to Linked Data, Its Applications and Challanges

Future Queries

Which European city has the greatest concentration of works by Caravaggio?
  and has direct flights from my home town?
  with an airline that is rated good or excellent?
  by me? ... by my friends?

Whereabouts near my home can I see buildings by architects who were influenced by the Bauhaus?
  On a Monday?
  and with a student discount?

Page 53: An Introduction to Linked Data, Its Applications and Challanges

Similar Developments

Page 54: An Introduction to Linked Data, Its Applications and Challanges

RDFa

A common serialization format for Linked Data is RDF/XML

Alternatively, Linked Data can also be serialized as RDFa
  A way of annotating XHTML Web pages with RDF data
  Idea: to publish content once, mixing the human-readable and machine-readable content together

Page 55: An Introduction to Linked Data, Its Applications and Challanges

RDFa

    <body>
      ....
      Toby's nickname is: kiwitobes
      ...
    </body>

Simple HTML: human-readable

Page 56: An Introduction to Linked Data, Its Applications and Challanges

RDFa

    <body>
      ....
      Toby's nickname is:
      <span xmlns:foaf="http://xmlns.com/foaf/0.1/"
            about="http://kiwitobes.com/toby.rdf#ts"
            property="http://xmlns.com/foaf/0.1/nick">
        kiwitobes
      </span>
      ...
    </body>

RDFa embedded in HTML: both human-readable and machine-readable
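To show that the annotated page really is machine-readable, here is a rough Python sketch that pulls one triple out of the markup with the standard library's HTML parser. It is not a conforming RDFa processor: it only handles an element that carries both `about` and `property`, it does not support nesting of the same tag, and `SpanTripleParser` is a hypothetical name.

```python
from html.parser import HTMLParser

PAGE = """
<body>
  Toby's nickname is:
  <span xmlns:foaf="http://xmlns.com/foaf/0.1/"
        about="http://kiwitobes.com/toby.rdf#ts"
        property="http://xmlns.com/foaf/0.1/nick">
    kiwitobes
  </span>
</body>
"""


class SpanTripleParser(HTMLParser):
    """Collect (subject, predicate, literal) from elements with about + property."""

    def __init__(self):
        super().__init__()
        self.triples = []
        self._open = None  # (tag, subject, predicate) of the element being read
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "about" in attrs and "property" in attrs:
            prop = attrs["property"]
            # Expand a prefixed name such as foaf:nick if the prefix is declared
            # on this element; a full URI is left untouched.
            prefix, sep, local = prop.partition(":")
            namespace = attrs.get("xmlns:" + prefix)
            if sep and namespace:
                prop = namespace + local
            self._open = (tag, attrs["about"], prop)
            self._text = []

    def handle_data(self, data):
        if self._open:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._open and tag == self._open[0]:
            _tag, subject, predicate = self._open
            self.triples.append((subject, predicate, "".join(self._text).strip()))
            self._open = None


parser = SpanTripleParser()
parser.feed(PAGE)
parser.close()
```

After parsing, `parser.triples` holds the statement that Toby's nickname is "kiwitobes", while a browser still shows the same sentence to a human reader.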

Page 57: An Introduction to Linked Data, Its Applications and Challanges

Microformats

Aim at extending the traditional Web with structured data
Define a set of simple data formats that are embedded into HTML via class attributes
Differences from Linked Data
  Linked Data is not limited in its vocabularies; vocabulary development is completely open
  Microformats are restricted to a set of vocabularies developed by a specific community

Page 58: An Introduction to Linked Data, Its Applications and Challanges

Microformats

Data items that are included in HTML pages via Microformats do not have their own identifiers. This prevents making assertions about the relationships between data items and connecting data items across pages and sites. By using URIs as global identifiers and RDF to represent relationships, Linked Data does not have these limitations.

Page 59: An Introduction to Linked Data, Its Applications and Challanges

Microformats

    …
    <div> Toby Segaran </div>
    <div> Organization: The Semantic Programmers </div>
    <div> Tel: 919-555-1234 </div>
    …

Simple HTML: human-readable

Page 60: An Introduction to Linked Data, Its Applications and Challanges

Microformats : hCard

    …
    <div class="vcard">
      <div class="fn">Toby Segaran</div>
      <div class="org">The Semantic Programmers</div>
      <div class="tel">919-555-1234</div>
    </div>
    …

Semantics embedded in HTML: both human-readable and machine-readable
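A rough Python sketch of how a program could read the hCard above using only the class attributes. It covers just the three fields shown on the slide and assumes every element inside the card has a closing tag (a void element such as `<br>` would throw off the depth counter); `HCardParser` and `HCARD_FIELDS` are hypothetical names, not part of any Microformats library.

```python
from html.parser import HTMLParser

HCARD_FIELDS = {"fn", "org", "tel"}

PAGE = """
<div class="vcard">
  <div class="fn">Toby Segaran</div>
  <div class="org">The Semantic Programmers</div>
  <div class="tel">919-555-1234</div>
</div>
"""


class HCardParser(HTMLParser):
    """Read the text of known hCard fields found inside a vcard element."""

    def __init__(self):
        super().__init__()
        self.card = {}
        self._depth = 0     # greater than zero while inside the vcard element
        self._field = None  # field whose text is currently being collected

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if self._depth:
            self._depth += 1
            matches = HCARD_FIELDS.intersection(classes)
            if matches:
                self._field = sorted(matches)[0]
                self.card[self._field] = ""
        elif "vcard" in classes:
            self._depth = 1

    def handle_data(self, data):
        if self._field:
            self.card[self._field] += data

    def handle_endtag(self, tag):
        if self._depth:
            self._depth -= 1
        self._field = None


def parse_hcard(html: str) -> dict:
    parser = HCardParser()
    parser.feed(html)
    parser.close()
    return {field: text.strip() for field, text in parser.card.items()}


contact = parse_hcard(PAGE)
```

Note that the result is a set of field values with no identifier for Toby himself, which is the limitation the Microformats slides describe.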

Page 61: An Introduction to Linked Data, Its Applications and Challanges

Web APIs

Many major Web data sources such as Amazon, eBay, Yahoo!, and Google provide access to their data via Web APIs.
Web APIs are accessed using a wide range of different mechanisms, and data retrieved from these APIs is represented using various content formats.
  E.g. they return JSON, XML, RDF, …
Most Web APIs do not assign globally unique identifiers to data items. Therefore it is not possible to set links between items in different data sources in order to connect data into a global data space.

Page 62: An Introduction to Linked Data, Its Applications and Challanges

Web APIs slice the Web into Walled Gardens

Page 63: An Introduction to Linked Data, Its Applications and Challanges

Semantic Web

“The first step is putting data on the Web in a form that machines can naturally understand, or converting it to that form. This creates what I call a Semantic Web – a web of data that can be processed directly or indirectly by machines” (Tim Berners-Lee, 2000)

The Semantic Web, or Web of Data, is the goal or end result of this process; Linked Data provides the means to reach that goal.

Over time, with Linked Data as a foundation, some of the more sophisticated proposals associated with the Semantic Web vision, such as intelligent agents, may become a reality.

Page 64: An Introduction to Linked Data, Its Applications and Challanges

Challenges

Page 65: An Introduction to Linked Data, Its Applications and Challanges

Challenges

HCI-related issues
Application architectures for linked data access
  Crawling and caching issues
    Search engines
  On-the-fly link traversal
  Federated querying

Page 66: An Introduction to Linked Data, Its Applications and Challanges

Challenges

Link management
  Link discovery, automatic link generation
    Linking algorithms
      String matching (lexical distance between labels)
      Common key matching (ISBN, MusicBrainz IDs)
      Property-based matching
  Link validation, link maintenance
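The first two linking strategies can be sketched in Python as follows. The similarity measure (difflib's ratio), the 0.85 threshold, and all example.org URIs and keys are assumptions chosen for illustration; real link discovery tools use more careful measures.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Tuple

Link = Tuple[str, str]


def label_similarity(a: str, b: str) -> float:
    """Similarity between two labels, from 0.0 (different) to 1.0 (identical)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_by_label(left: Dict[str, str], right: Dict[str, str],
                   threshold: float = 0.85) -> List[Link]:
    """String matching: link resources whose labels are similar enough."""
    links = []
    for left_uri, left_label in left.items():
        for right_uri, right_label in right.items():
            if label_similarity(left_label, right_label) >= threshold:
                links.append((left_uri, right_uri))
    return links


def match_by_key(left: Dict[str, str], right: Dict[str, str]) -> List[Link]:
    """Common key matching: link resources that share an identifier such as an ISBN."""
    by_key = {key: uri for uri, key in right.items()}
    return [(uri, by_key[key]) for uri, key in left.items() if key in by_key]


# Hypothetical datasets: URI -> label, and URI -> shared key.
label_links = match_by_label(
    {"http://example.org/a/film1": "Pulp Fiction"},
    {"http://example.org/b/movie9": "pulp fiction",
     "http://example.org/b/city3": "Berlin"},
)
key_links = match_by_key(
    {"http://example.org/a/book1": "isbn-A", "http://example.org/a/book2": "isbn-B"},
    {"http://example.org/b/item7": "isbn-B"},
)
```

Each resulting pair is a candidate for an owl:sameAs link, which would still need validation before being published.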

Page 67: An Introduction to Linked Data, Its Applications and Challanges

Challenges

Schema mapping and data fusion
  Today, most Linked Data applications display data from different sources alongside each other but do little to integrate it further. Doing so requires mapping terms from different vocabularies to the application's target schema, as well as fusing data about the same entity from different sources by resolving data conflicts.

  Data sources can publish correspondences between their local terminology and the terminology of related data sources on the Web of Data. RDFS and OWL define basic terms such as owl:equivalentClass, owl:equivalentProperty, rdfs:subClassOf, and rdfs:subPropertyOf that can be used to publish basic correspondences.
  Languages are required to specify more fine-grained schema mappings
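A minimal Python sketch of using published owl:equivalentProperty correspondences to rewrite incoming data into an application's target schema. The vocabularies under example.org are invented, and `build_mapping` and `translate` are hypothetical helper names; only one-to-one property equivalences are handled.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]

OWL_EQUIVALENT_PROPERTY = "http://www.w3.org/2002/07/owl#equivalentProperty"


def build_mapping(correspondences: List[Triple], target_terms: Set[str]) -> Dict[str, str]:
    """Map each foreign property to the target-schema property it is equivalent to."""
    mapping: Dict[str, str] = {}
    for s, p, o in correspondences:
        if p != OWL_EQUIVALENT_PROPERTY:
            continue
        # Equivalence is symmetric, so accept the target term on either side.
        if o in target_terms:
            mapping[s] = o
        elif s in target_terms:
            mapping[o] = s
    return mapping


def translate(triples: List[Triple], mapping: Dict[str, str]) -> List[Triple]:
    """Rewrite predicates into the target schema; unknown predicates are kept."""
    return [(s, mapping.get(p, p), o) for s, p, o in triples]


TARGET_NAME = "http://example.org/app/name"
mapping = build_mapping(
    [("http://example.org/src1/title", OWL_EQUIVALENT_PROPERTY, TARGET_NAME),
     (TARGET_NAME, OWL_EQUIVALENT_PROPERTY, "http://example.org/src2/label")],
    {TARGET_NAME},
)
integrated = translate(
    [("http://example.org/film/1", "http://example.org/src1/title", "Pulp Fiction"),
     ("http://example.org/film/1", "http://example.org/src2/label", "Pulp Fiction"),
     ("http://example.org/film/1", "http://example.org/src2/year", "1994")],
    mapping,
)
```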

Page 68: An Introduction to Linked Data, Its Applications and Challanges

Challenges

Data fusion
  The process of integrating multiple data items representing the same real-world object into a single, consistent, and clean representation.
  Conflict resolution challenge
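One simple conflict-resolution policy, sketched in Python: take the value most sources agree on, and prefer the earliest source on a tie. The policy and the made-up records are assumptions for illustration; the slides do not prescribe a particular strategy.

```python
from collections import Counter
from typing import Dict, List


def fuse(records: List[Dict[str, object]]) -> Dict[str, object]:
    """Merge descriptions of one entity into a single record by majority vote."""
    # Keep properties in the order in which they are first seen.
    properties: List[str] = []
    for record in records:
        for prop in record:
            if prop not in properties:
                properties.append(prop)

    fused: Dict[str, object] = {}
    for prop in properties:
        values = [record[prop] for record in records if prop in record]
        counts = Counter(values)
        best = max(counts.values())
        # On a tie, the value from the earliest source wins.
        fused[prop] = next(value for value in values if counts[value] == best)
    return fused


# Three hypothetical sources describing the same city, with made-up numbers.
sources = [
    {"name": "Berlin", "population": 100},
    {"name": "Berlin", "population": 120},
    {"population": 120, "country": "Germany"},
]
city = fuse(sources)
```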

Page 69: An Introduction to Linked Data, Its Applications and Challanges

Challenges

Licensing
  Applications that consume data from the Web must be able to access explicit specifications of the terms under which data can be reused and republished.

Trust, Quality and Relevance
  To ensure the data most relevant or appropriate to the user's needs is identified and made available
  Content-, context-, and rating-based techniques can be used to heuristically assess the relevance, quality and trustworthiness of data

Page 70: An Introduction to Linked Data, Its Applications and Challanges

Challenges

Equivalents to the PageRank algorithm will likely be important in determining coarse-grained measures of the popularity or significance of a particular data source, as a proxy for the relevance or quality of the data. However, such algorithms will need to be adapted to the linkage patterns that emerge on the Web of Data.
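For orientation, this is a plain PageRank power iteration in Python applied to a graph of datasets rather than pages. It is the textbook algorithm, not an adaptation to Web of Data linkage patterns, and the link structure between the named datasets is invented for illustration.

```python
from typing import Dict, List


def pagerank(links: Dict[str, List[str]], damping: float = 0.85,
             iterations: int = 50) -> Dict[str, float]:
    """Rank nodes of a directed graph given as node -> list of link targets."""
    nodes = set(links)
    for targets in links.values():
        nodes.update(targets)
    ordered = sorted(nodes)
    n = len(ordered)
    rank = {node: 1.0 / n for node in ordered}

    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / n for node in ordered}
        for node in ordered:
            targets = links.get(node, [])
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A dataset with no outgoing links spreads its rank evenly.
                share = damping * rank[node] / n
                for target in ordered:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical link structure: three datasets link to one hub.
dataset_links = {
    "Revyu": ["DBpedia"],
    "LinkedMDB": ["DBpedia"],
    "Geonames": ["DBpedia"],
    "DBpedia": ["Geonames"],
}
ranks = pagerank(dataset_links)
top_dataset = max(ranks, key=ranks.get)
```

In this toy graph the hub that everything links to ends up with the highest score, which is the coarse popularity signal the slide refers to.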

(Berners-Lee, 1997) proposed that browser interfaces should be enhanced with an “Oh, yeah?” button to support the user in assessing the reliability of information encountered on the Web. Whenever a user encounters a piece of information that they would like to verify, pressing such a button would produce an explanation of the trustworthiness of the displayed information.

Page 71: An Introduction to Linked Data, Its Applications and Challanges

Challenges

Privacy protection
  One problematic area is the opportunity to violate privacy that arises from integrating data from distinct sources.

Page 72: An Introduction to Linked Data, Its Applications and Challanges

LDOW Workshops

Linked Data on the Web
  The goal of the LDOW workshop is to provide a forum for the Linked Data community
  2008, 2009

