
Growing the Semantic Web By Charla Woodbury June 11, 2004.

Page 1:

Growing the Semantic Web
By Charla Woodbury

June 11, 2004

Page 2:

INTERNET to SEMANTIC WEB

The present internet is too large to conduct specific searches in its present format

The Semantic Web holds the promise of a much richer and more easily searchable information resource

Most current research targets small areas of development of the Semantic Web rather than looking at the whole process and showing its advantages

What is needed is a working example of the Semantic Web that demonstrates the advantages and minimizes the problems, so that webpages for the Semantic Web can start to grow

Page 3:

High-volume Information Publishers should be the first TARGET

The old adage is to deal with the new water coming in rather than changing the water already in the lake if you want to change the lake’s water in any way

By starting with high-volume information publishers, the nature of the internet lake would change very quickly

Page 4:

Obituary Prototype

[Diagram labels: Newspaper Publisher; Daily News HOME PAGE; Daily News obituaries; Embedded Obituary Ontology; Obituary vocabulary; WordNet]

Page 5:

Once the faucet is turned on, the population pool of Semantic Webpages would grow very quickly

Page 6:

Thesis Statement

A cost/benefit analysis using an obituary prototype will show that populating the Semantic Web, by building an embedded OWL ontology and the corresponding specialized vocabulary on top of WordNet for EACH information publisher, is practical and cost effective.

Page 7:

ADVANTAGES
Each information publisher

The ontology is only built once and used many times

The specialized vocabulary is only built once and accessed many times

The ontology and vocabulary belong to the publisher who can change them as the format and vocabulary of the obituaries they produce change (deletion discouraged)

Most of the cost would be incurred in setting up the ontology and the specialized vocabulary

Page 8:

ADVANTAGES

Information extraction would be done by an agent, with no other contact with the publisher

There would be no need to index the information once the information retrieval portion was in place

HTML information is easy to store and maintain

HTML files are much smaller than the digitized microfilm presently used

Page 9:

METHODS
Each Newspaper

Contact selected newspapers to produce semantic obituary webpages

Learn how they archive the HTML version of the newspaper

Get estimates on the cost to the newspaper to index, microfilm, and store their archives

Request a reporter in obituaries to list specialized vocabulary and build the vocabulary and OWL ontology to be embedded

Train a newspaper employee to test and edit the ontology and vocabulary

Test the vocabulary and ontology to make sure that they are sufficiently inclusive

Compare the time needed to set up the first newspaper with the time needed for subsequent ones

Page 10:

METHODS
Organizations using Obituary information

Contact Family History businesses, Genealogical societies, and Government agencies that would use obituary information

Find out how they get their obituary information now and how much that costs in time and money

Measure their future interest in using agents to retrieve obituary information instead

Discover what parts of the obituary information they consider the minimum needed for their work and what additional information would be desired and optimal

Present the results of the obituary prototype and re-measure their future interest in using agents to retrieve obituary information

Page 11:

PROBLEMS

The first problem is how to entice publishers to start the process

The basic problem is a semantic one: how will regional burial practices and language differences impact the process?

But the biggest problem is how to maintain the ontology and vocabulary with the least amount of human intervention

Page 12:

First Problem
How to entice publishers to start the process of making semantic webpages?

Find Grants, Research Money, and/or money from Corporate sponsorship by those companies that would profit from the information

Petition for Government Support

Office of Internet Semantic Information (e.g. Library of Congress)

Demonstrate by prototype - Obituaries

Process works well (Electric lights in large cities)

Specific information is far more easily found

Their information is more available

The maintenance process is minimal

The rewards are maximal

Everyone else is doing it

Page 13:

SECOND PROBLEM
The basic problem is a semantic one: how will regional burial practices and language differences impact the process?

The basic format of the specialized vocabulary would be the same as WordNet with rich word relationships (e.g. interred, interment, buried, burial as synonyms and related forms)

Regional and language differences would be expressed by adding rich vocabulary as deemed necessary by the individual publisher

Fine-tune and test the vocabulary and the ontology

Teach the computer to speak obituary language
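
One way to picture a specialized vocabulary with publisher-specific additions is the minimal sketch below. It is an illustration only: the concept name, the word lists, and every function name are assumptions made for this example, and the data is not taken from WordNet.

```python
# Hypothetical sketch: a publisher-specific vocabulary layered over a shared base.
# The concept name and word lists are illustrative assumptions, not WordNet data.

BASE_VOCABULARY = {
    "burial": {"interred", "interment", "buried", "burial"},
}

# A publisher adds regional or language-specific terms as it sees fit.
REGIONAL_ADDITIONS = {
    "burial": {"laid to rest", "entombed"},
}


def build_vocabulary(base, additions):
    """Merge the base vocabulary with a publisher's additions, concept by concept."""
    merged = {concept: set(words) for concept, words in base.items()}
    for concept, words in additions.items():
        merged.setdefault(concept, set()).update(words)
    return merged


def build_index(vocabulary):
    """Invert concept -> words into word -> concept for lookup."""
    return {
        word.lower(): concept
        for concept, words in vocabulary.items()
        for word in words
    }


def concept_for(term, index):
    """Return the concept a term belongs to, or None if the term is unknown."""
    return index.get(term.strip().lower())


vocabulary = build_vocabulary(BASE_VOCABULARY, REGIONAL_ADDITIONS)
index = build_index(vocabulary)
print(concept_for("Interred", index))      # burial
print(concept_for("laid to rest", index))  # burial
print(concept_for("cremated", index))      # None
```

Keeping the base and the additions separate means a publisher can extend the vocabulary without altering the shared part.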

Page 14:

THIRD PROBLEM
How to simplify and automate the testing and maintenance of the ontology and vocabulary?

TESTING and SIMPLE MAINTENANCE

Install a tool, as automated as possible, for creating and editing an OWL ontology

Set up procedures for how often to test the ontology (e.g. new reporter, new obituary template, a set length of time)

Write a program that tests how effective the ontology is and lists words in the obituaries that are not in the vocabulary for review and addition to the vocabulary

Teach the machine to add those words automatically to the vocabulary if possible
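
The program that lists words missing from the vocabulary could look roughly like the sketch below. It is a simplified illustration under stated assumptions: the function name, the `common_words` parameter, and the sample sentences are all invented for this example, and the tokenizer is a plain regular expression rather than anything obituary-specific.

```python
import re
from collections import Counter


def unknown_words(obituary_texts, vocabulary, common_words=frozenset()):
    """Count words that appear in the obituaries but in neither known word set."""
    known = {w.lower() for w in vocabulary} | {w.lower() for w in common_words}
    counts = Counter()
    for text in obituary_texts:
        # Lowercase words, optionally with one internal apostrophe (e.g. "o'clock").
        for word in re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower()):
            if word not in known:
                counts[word] += 1
    return counts


# Made-up sample data, for illustration only.
texts = [
    "Charles was interred at the city cemetery.",
    "She was entombed at the city cemetery.",
]
vocabulary = {"interred", "cemetery"}
common_words = {"was", "at", "the", "she", "city"}

for word, count in unknown_words(texts, vocabulary, common_words).most_common():
    print(word, count)
```

In this sample the review list contains "charles" and "entombed". Personal names would show up alongside real vocabulary gaps, which is one reason a human review step before adding words still seems useful.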

Page 15:

Evaluation

Cost/benefit analysis in time and money between the original process and the new Semantic Web process

Survey those testing and maintaining the Semantic Webpages about the process and the tools provided

Compare surveys given to possible information retrievers before and after demonstration of the obituary prototype
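
The cost comparison can be pictured as a break-even calculation: a one-time setup cost for the ontology and vocabulary against a per-obituary saving over human indexing. The sketch below is only an illustration; the function name, the cost model, and all figures are hypothetical and are not results from the study.

```python
import math


def break_even_obituaries(setup_cost, maintenance_per_obituary, indexing_per_obituary):
    """Smallest number of obituaries at which the semantic process costs no more
    than human indexing, or None if it never catches up."""
    saving = indexing_per_obituary - maintenance_per_obituary
    if saving <= 0:
        return None
    return math.ceil(setup_cost / saving)


# Hypothetical figures, chosen only to make the arithmetic easy to follow.
print(break_even_obituaries(setup_cost=6000,
                            maintenance_per_obituary=0.25,
                            indexing_per_obituary=1.75))  # 4000
```

With these made-up numbers each obituary saves 1.50, so the setup cost is recovered after 4000 obituaries. Real estimates would come from the newspapers and organizations surveyed.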

Page 16:

CONTRIBUTIONS

A working model of the Semantic Web

A growing pool of semantic webpages for future information extraction & retrieval

As new standards emerge, adjustments in the process could be made immediately and only once for everyone

A replacement for the cost of human indexing of the information

Page 17:

Future Work
How will agents interpret many different obituary ontologies and vocabularies?

[Diagram labels: Newspaper Publishers; Daily News obituaries; Embedded Obituary Ontologies; Obituary vocabularies]

Page 18:

Future Work
Should there be one global obituary ontology and/or one global burial vocabulary? (All languages and burial practices)

[Diagram label: GLOBAL Obituary Ontology]

Page 19:

Future Work
Or will the agent be smart enough to traverse the associated vocabulary for the correct information?

[Diagram labels: AGENT; Obituary vocabularies]

Page 20:

Future Work
How will the agents deliver the extracted obituary information?

Obituary Extracted Database

Daily News || 26 Jan 2004 || Charles Lambert || b. 12 June 1911 || d. 24 Jan 2004

HTML REPORT
All Obituaries with surname LAMBERT

URLs to the actual Newspaper Obituaries
Charles Lambert d. 24 Jan 2004
Richard Greaves Lambert d. 17 Oct 2003

[Diagram labels: Daily News obituaries; Embedded Obituary Ontology]
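
As a rough sketch of the delivery step, the code below parses one extracted-database line in the `||`-separated form shown on this slide and builds a small HTML report for one surname. The class, the function names, and the example URL are assumptions made for illustration. The slide does not specify how agents would actually store or deliver records.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class ObituaryRecord:
    newspaper: str
    published: str
    name: str
    born: str
    died: str
    url: str = ""  # link to the newspaper obituary (placeholder in this sketch)


def _strip_label(value, label):
    """Remove a leading label such as 'b.' or 'd.' if it is present."""
    value = value.strip()
    return value[len(label):].strip() if value.startswith(label) else value


def parse_record(line, url=""):
    """Split a 'field || field || ...' line with five fields into a record."""
    newspaper, published, name, born, died = [p.strip() for p in line.split("||")]
    return ObituaryRecord(newspaper, published, name,
                          _strip_label(born, "b."), _strip_label(died, "d."), url)


def surname(record):
    """Treat the last word of the name as the surname (a simplifying assumption)."""
    return record.name.split()[-1].upper()


def html_report(records, wanted_surname):
    """Build an HTML list of obituaries whose surname matches, linking to each URL."""
    wanted = wanted_surname.upper()
    items = "\n".join(
        f'  <li><a href="{escape(r.url, quote=True)}">{escape(r.name)}</a>'
        f" d. {escape(r.died)}</li>"
        for r in records if surname(r) == wanted
    )
    return f"<h1>All Obituaries with surname {escape(wanted)}</h1>\n<ul>\n{items}\n</ul>"


line = "Daily News || 26 Jan 2004 || Charles Lambert || b. 12 June 1911 || d. 24 Jan 2004"
# The URL is a hypothetical placeholder, not a real obituary address.
record = parse_record(line, url="https://example.com/obituaries/charles-lambert")
print(html_report([record], "Lambert"))
```

Taking the last word of the name as the surname is a deliberate simplification. Compound or non-Western surnames would need the ontology to mark the surname explicitly.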

Page 21:

Future Work

Will it be necessary to hire and pay obituary indexers?

Will the newspapers continue to be microfilmed or just stored in HTML? Will storage space be an issue?

Will the whole process including information retrieval be cost effective?

Page 22:

QUESTIONS?

COMMENTS?

