Date post: 18-Dec-2015
1

Ontology Based Extraction of RDF Data from the World Wide Web

Tim Chartrand

A Thesis Proposal

Research Supported By NSF

2

Introduction

World Wide Web
  Has a huge amount of existing information
  Designed primarily for human consumption

Semantic Web
  Is an extension of the WWW
  Gives information a well-defined meaning
  Allows automation of tasks

DEG contribution – Extract data from the WWW

Proposed solution
  Extract Semantic Web data from the WWW
  Superimpose extracted data

3

Overview of Proposed Research

[Diagram: Extraction Ontology, Extraction Engine, HTML Page, Relational Data]

[Diagram: DAML Ontology, User, Extraction Ontology, Extraction Engine, HTML Page, Relational Data, RDF Data, RDF Browser]

4

RDF – What is it?

Resource Description Framework

Language of the Semantic Web

Set of <subject> <predicate> <object> triples
  <mailto:[email protected]> <genealogy#age> "25"
  <mailto:[email protected]> <genealogy#fatherOf> <mailto:[email protected]>
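As a minimal sketch of the triple model, the snippet below stores triples as plain tuples and prints them in an N-Triples-like form. The two `mailto:` addresses and the `http://example.org/genealogy#` namespace are hypothetical stand-ins, since the addresses on the slide are not recoverable; `Literal`, `to_ntriples`, and `objects` are illustrative helper names, not part of any RDF library.

```python
from typing import List, NamedTuple, Tuple, Union


class Literal(NamedTuple):
    """A literal value such as "25"; URIs are kept as plain strings."""
    value: str


Term = Union[str, Literal]
Triple = Tuple[str, str, Term]

# Hypothetical identifiers (assumptions, not taken from the slide).
PARENT = "mailto:parent@example.org"
CHILD = "mailto:child@example.org"
GENEALOGY = "http://example.org/genealogy#"

triples: List[Triple] = [
    (PARENT, GENEALOGY + "age", Literal("25")),
    (PARENT, GENEALOGY + "fatherOf", CHILD),
]


def to_ntriples(triple: Triple) -> str:
    """Render one triple; literals get quotes, URIs get angle brackets.
    Escaping of special characters inside literals is omitted for brevity."""
    s, p, o = triple
    obj = '"%s"' % o.value if isinstance(o, Literal) else "<%s>" % o
    return "<%s> <%s> %s ." % (s, p, obj)


def objects(data: List[Triple], subject: str, predicate: str) -> List[Term]:
    """Return every object stored for the given subject and predicate."""
    return [o for s, p, o in data if s == subject and p == predicate]


for t in triples:
    print(to_ntriples(t))
```

Because a set of triples is just a list of three-part statements, lookups such as `objects(triples, PARENT, GENEALOGY + "fatherOf")` reduce to simple filtering.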

[Figure: RDF graph. Node labels: mailto:[email protected], 25. Edge labels: genealogy:age, genealogy:fatherOf.]

5

RDFS & DAML

Core Concepts

Classes
  daml:Class – defines a class
  rdfs:subClassOf – specifies the generalization of a class

Properties
  daml:Property – defines a binary relation, has a value
  rdfs:domain – specifies the class to which a property applies
  rdfs:range – specifies the possible values of a property
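The sketch below illustrates how subclass, domain, and range declarations interact, assuming a tiny hand-written schema. The class `Shareware` and the dictionaries `SUBCLASS_OF`, `DOMAIN`, and `RANGE` are hypothetical; `Program`, `Size`, and `ProgSize` are borrowed from the deck's example ontology. Treating domain and range as checks is a simplification: RDFS itself uses them to infer types rather than to reject data.

```python
# Hypothetical mini-schema (assumption for illustration only).
SUBCLASS_OF = {"Shareware": "Program"}   # child class -> parent class
DOMAIN = {"ProgSize": "Program"}         # property -> class it applies to
RANGE = {"ProgSize": "Size"}             # property -> class of its values


def is_a(cls, ancestor):
    """True if cls equals ancestor or reaches it by following subClassOf."""
    seen = set()
    while cls is not None and cls not in seen:
        if cls == ancestor:
            return True
        seen.add(cls)                    # guard against cyclic declarations
        cls = SUBCLASS_OF.get(cls)
    return False


def conforms(subject_cls, prop, object_cls):
    """Check a statement's subject and object classes against domain and range."""
    return is_a(subject_cls, DOMAIN[prop]) and is_a(object_cls, RANGE[prop])


print(conforms("Shareware", "ProgSize", "Size"))  # subject is a subclass of Program
print(conforms("Size", "ProgSize", "Size"))       # subject is outside the domain
```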


6

Example Ontology

<daml:Class rdf:ID="Program">
  <rdfs:label>Program</rdfs:label>
</daml:Class>
<daml:Class rdf:ID="Size">
  <rdfs:label>Size</rdfs:label>
</daml:Class>
. . .
<daml:Property rdf:ID="Name">
  <rdfs:domain rdf:resource="#Program"/>
  <rdfs:range rdf:resource="&rdfs;Literal"/>
  <rdf:type rdf:resource="&daml;UniqueProperty"/>
  <rdf:type rdf:resource="&daml;UnambiguousProperty"/>
</daml:Property>
<daml:ObjectProperty rdf:ID="ProgSize">
  <rdfs:domain rdf:resource="#Program"/>
  <rdfs:range rdf:resource="#Size"/>
</daml:ObjectProperty>
. . .


7

DAML to OSM

Classes -> Non-lexical object sets

Properties -> Binary relationship sets between object sets

Literal properties -> Binary relationship sets between non-lexical and lexical object sets

Cardinality restrictions -> Participation constraints
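The correspondence above can be sketched as a small conversion routine. This is only an illustration under stated assumptions, not the proposal's actual algorithm: `OSMModel` and `daml_to_osm` are hypothetical names, the input is assumed to be already parsed into Python structures, each literal property is assumed to yield a lexical object set named after the property, and cardinality restrictions are left out.

```python
from dataclasses import dataclass, field

LITERAL = "rdfs:Literal"


@dataclass
class OSMModel:
    """Hypothetical container for the converted model."""
    nonlexical: set = field(default_factory=set)
    lexical: set = field(default_factory=set)
    relationships: list = field(default_factory=list)  # (name, from_set, to_set)


def daml_to_osm(classes, properties):
    """classes: iterable of class names.
    properties: dict mapping property name -> (domain, range)."""
    model = OSMModel(nonlexical=set(classes))
    for name, (domain, range_) in sorted(properties.items()):
        if range_ == LITERAL:
            # Assumption: a literal property becomes a lexical object set
            # that takes the property's own name.
            model.lexical.add(name)
            model.relationships.append((name, domain, name))
        else:
            # Property between two classes: relationship set between
            # two non-lexical object sets.
            model.relationships.append((name, domain, range_))
    return model


model = daml_to_osm(
    classes=["Program", "Size"],
    properties={"Name": ("Program", LITERAL), "ProgSize": ("Program", "Size")},
)
print(model)
```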


8

DAML to OSM

<daml:Class rdf:ID="Program">
  <rdfs:label>Program</rdfs:label>
</daml:Class>
<daml:Class rdf:ID="Size">
  <rdfs:label>Size</rdfs:label>
</daml:Class>
. . .
<daml:Property rdf:ID="Name">
  <rdfs:domain rdf:resource="#Program"/>
  <rdfs:range rdf:resource="&rdfs;Literal"/>
  <rdf:type rdf:resource="&daml;UniqueProperty"/>
  <rdf:type rdf:resource="&daml;UnambiguousProperty"/>
</daml:Property>
<daml:ObjectProperty rdf:ID="ProgSize">
  <rdfs:domain rdf:resource="#Program"/>
  <rdfs:range rdf:resource="#Size"/>
</daml:ObjectProperty>
. . .


9

Data Frames

Lexical object sets need data frames

Use data-frame library

Match lexical object sets with data frames
  Compare names
    Stemming
    Levenshtein edit distance
    Soundex
    Longest Common Subsequence
  Choose most similar data frame
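A minimal sketch of the name-matching step, assuming only two of the four listed measures (Levenshtein edit distance and longest common subsequence) and an equal-weight average of their normalized scores. Stemming and Soundex are omitted, and the function names, the weighting, and the sample library names are assumptions for illustration.

```python
def levenshtein(a, b):
    """Edit distance via the standard dynamic program, one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution or match
        prev = cur
    return prev[-1]


def lcs_length(a, b):
    """Length of the longest common subsequence of a and b."""
    prev = [0] * (len(b) + 1)
    for ca in a:
        cur = [0]
        for j, cb in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if ca == cb else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def similarity(a, b):
    """Average of two scores in [0, 1]; the equal weighting is an assumption."""
    a, b = a.lower(), b.lower()
    longest = max(len(a), len(b)) or 1
    edit_score = 1 - levenshtein(a, b) / longest
    lcs_score = lcs_length(a, b) / longest
    return (edit_score + lcs_score) / 2


def best_data_frame(object_set_name, library):
    """Pick the data-frame name most similar to the lexical object set name."""
    return max(library, key=lambda frame: similarity(object_set_name, frame))


# Hypothetical data-frame library names.
print(best_data_frame("SizeVal", ["Size", "Price", "Date"]))
```

In practice a minimum similarity threshold would probably be needed so that an object set with no good match is left for the user to map by hand.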


10

User Modification

Cardinality Constraints
  Provide graphical ontology editor
  Allow the user to edit participation constraints
  Do not allow the user to modify the ontology structure

Data Frames
  Allow user to edit mapping
  Provide data frame editor
  Allow user to edit or add data frames


11

Extracting the Data

12

Convert to RDF

[Figure: RDF graph for http://www.downloads.com/Program1001. Node labels: software:Program, software:Size, "Stick Death 1.0 Windows 3.x/95/98/Me/NT/2000/X", "2.66 MB". Edge labels: rdf:type, software:name, software:version, software:OperatingSystem, software:ProgSize, software:SizeVal, software:SizeUnit.]
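A rough sketch of the conversion step, turning one extracted record into triples that use the property names from the figure. Several things here are assumptions rather than details from the slides: the record's field names, the split of the extracted text into name, version, and operating system, the blank node `_:size` for the Size object, and the exact shape of the graph. The operating-system string is copied as it appears, including its truncated ending.

```python
def record_to_triples(subject, record, size_node="_:size"):
    """Map one extracted record to (subject, predicate, object) triples.
    The size is modeled as a separate node with a value and a unit (assumed)."""
    value, unit = record["ProgSize"].split()
    return [
        (subject, "rdf:type", "software:Program"),
        (subject, "software:name", record["Name"]),
        (subject, "software:version", record["Version"]),
        (subject, "software:OperatingSystem", record["OperatingSystem"]),
        (subject, "software:ProgSize", size_node),
        (size_node, "rdf:type", "software:Size"),
        (size_node, "software:SizeVal", value),
        (size_node, "software:SizeUnit", unit),
    ]


# Hypothetical record layout; the values are taken from the figure labels.
record = {
    "Name": "Stick Death",
    "Version": "1.0",
    "OperatingSystem": "Windows 3.x/95/98/Me/NT/2000/X",
    "ProgSize": "2.66 MB",
}

triples = record_to_triples("http://www.downloads.com/Program1001", record)
for triple in triples:
    print(triple)
```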

13

Superimposed Data

14

Contributions

Advancement of Semantic Web

Application of Information Extraction to building Semantic Web

Semantic Web data as superimposed information

Algorithm for ontology conversion

