Post on 27-Mar-2015

Ontological Infrastructure for a Semantic Newspaper

Roberto García (1), Ferran Perdrix (1,2), Rosa Gil (1)

(1) GRIHO, Human Computer Interaction Research Group, Universitat de Lleida, Spain
(2) SEGRE Media Group, Spain

Semantic Integration and Retrieval of Multimedia Metadata

Contents

Introduction
Proposal
Ontological framework
Integration framework
Conclusions
Future Work

Introduction

Press and media companies are going digital and moving to the Web. Example: Segre, with a newspaper, radio, television and a web portal.

Multiple kinds of media: text, photo, video,…

Heterogeneous sources: agencies, journalists, partners, institutions,…

This heterogeneity makes content difficult to integrate and manage.

Introduction

Related standards:

International Press
NewsCodes: subjects reference system, taxonomy
NITF: news documents structure
NewsML: models news as multimedia packages

Multimedia
MPEG-7: descriptive multimedia metadata
TV-Anytime: multimedia taxonomies

Common aspect: no formal semantics, XML-based

Introduction

[Figure: current situation. Labels: Journalists; News Agencies; Legacy (News + Media); Receiver (News + Photos); Custom XML; NITF, NewsCodes, NewsML,…; Archivist; User]

Proposal

Semantic metadata and ontologies facilitate management and integration.

Related previous work:
ELIN (Electronic Newspaper Initiative)
NEPTUNO (Semantic Web Technologies for Digital Newspaper)
NewMARS (Multimedia Advanced Redistribution Surveillance)

Proposal

[Figure: proposed architecture. Labels: Journalists; News Agencies; Legacy; Receiver; Semantic Repository; Ontologies Framework; User]

Ontological Framework

NewsML, NITF, NewsCodes, MPEG-7, TVAnytime: from XML to the Semantic Web

“XML Semantics Reuse Methodology”, ReDeFer implementation:
XSD2OWL: schema to ontology.
XML2RDF: XML instance data to RDF instances.
CS2OWL: classification scheme to ontology.

Ontological Framework: ReDeFer

XSD2OWL mappings (XML Schema → OWL):

element | attribute → rdf:Property, owl:DatatypeProperty, owl:ObjectProperty
element@substitutionGroup → rdfs:subPropertyOf
element@type → rdfs:range
complexType | group | attributeGroup → owl:Class
complexType//element → owl:Restriction
extension@base | restriction@base → rdfs:subClassOf
@maxOccurs → owl:maxCardinality
@minOccurs → owl:minCardinality
sequence → owl:intersectionOf
choice → owl:unionOf
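
The @minOccurs and @maxOccurs mappings can be illustrated with a minimal Python sketch. This is not the ReDeFer implementation: the function name occurs_to_cardinality and the choice to collapse equal bounds into a single cardinality restriction are assumptions made for illustration. The only outside fact used is that XML Schema defaults both attributes to 1 when they are omitted. The output uses the abstract syntax that appears in the MPEG-7 example slides.

```python
def occurs_to_cardinality(prop, min_occurs=None, max_occurs=None):
    """Translate XSD occurrence constraints into OWL cardinality restrictions.

    Hypothetical helper for illustration; arguments are the raw attribute
    strings from the schema, or None when the attribute is absent.
    """
    # XML Schema defaults: minOccurs="1" and maxOccurs="1" when omitted.
    lower = 1 if min_occurs is None else int(min_occurs)
    if max_occurs is None:
        upper = 1
    elif max_occurs == "unbounded":
        upper = None  # no upper bound, so no maxCardinality restriction
    else:
        upper = int(max_occurs)

    # Assumption: equal bounds are written as one exact cardinality.
    if upper is not None and lower == upper:
        return [f"restriction({prop} cardinality({lower}))"]

    restrictions = []
    if lower > 0:
        restrictions.append(f"restriction({prop} minCardinality({lower}))")
    if upper is not None:
        restrictions.append(f"restriction({prop} maxCardinality({upper}))")
    return restrictions


if __name__ == "__main__":
    print(occurs_to_cardinality("Audio"))
    print(occurs_to_cardinality("Audio", "0", "unbounded"))
```

An element declared without occurrence attributes therefore yields an exact cardinality of 1, while minOccurs="0" with maxOccurs="unbounded" yields no restriction at all.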

Ontological Framework

NewsCodes Subjects Ontology: subjects taxonomy

NITF 3.3 Ontology: structure concepts (paragraph, subheadline,…) and metadata properties (copyright, authorship, issue date,…)

NewsML 1.2 Ontology: news multimedia structure (envelope, component, item,…)

MPEG-7 Ontology: complete ontology (2372 classes and 975 properties)

TVAnytime Ontologies: Content and Format classification schemes (CSs)

Ontological Framework: MPEG-7

Validation: comparison with other MPEG-7 ontologies:
Hunter02: not complete, RDF+DAML.
Tsinaraki04: not complete, semantic part of MDS.
Troncy03: not complete, from an ontology to MPEG-7.

Ontological Framework: MPEG-7

[Figure: Hunter02 compared with the MPEG-7 Ontology]

Ontological Framework: MPEG-7

Tsinaraki04

MPEG-7 Ontology

<complexType name="AudioType">
  <complexContent>
    <extension base="mpeg7:MultimediaContentType">
      <sequence>
        <element name="Audio" type="mpeg7:AudioSegmentType"/>
      </sequence>
    </extension>
  </complexContent>
</complexType>

Class(AudioType partial
  restriction(Audio cardinality(1))
  MultimediaContentType)

Class(AudioType partial
  restriction(Audio cardinality(1))
  restriction(Audio allValuesFrom(AudioSegmentType))
  MultimediaContentType)
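
To make the AudioType example concrete, here is a rough Python sketch that applies a few of the XSD2OWL mappings (complexType to class, extension@base to superclass, complexType//element to restriction, element@type to a value restriction) to this schema fragment. It is a simplification, not the ReDeFer XSD2OWL code: the function names are hypothetical, only elements with default occurrence constraints are handled, and the wrapping schema element with its namespace declarations is added here so the fragment parses on its own.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"

# The slide's fragment, wrapped in a <schema> element (wrapper added here).
SCHEMA = """
<schema xmlns="http://www.w3.org/2001/XMLSchema"
        xmlns:mpeg7="urn:mpeg:mpeg7:schema:2001">
  <complexType name="AudioType">
    <complexContent>
      <extension base="mpeg7:MultimediaContentType">
        <sequence>
          <element name="Audio" type="mpeg7:AudioSegmentType"/>
        </sequence>
      </extension>
    </complexContent>
  </complexType>
</schema>
"""


def local(qname):
    """Drop the namespace prefix from a QName such as 'mpeg7:AudioType'."""
    return qname.split(":")[-1]


def complex_type_to_class(ctype):
    """Render one complexType in OWL abstract syntax (illustrative only)."""
    parts = []
    for element in ctype.iter(f"{{{XSD_NS}}}element"):
        name = element.get("name")
        # With minOccurs and maxOccurs both omitted, XML Schema means
        # exactly one occurrence. Other cases are not handled in this sketch.
        if element.get("minOccurs") is None and element.get("maxOccurs") is None:
            parts.append(f"restriction({name} cardinality(1))")
        type_name = element.get("type")
        if type_name is not None:
            parts.append(f"restriction({name} allValuesFrom({local(type_name)}))")
    # extension@base | restriction@base -> superclass
    for derivation in ("extension", "restriction"):
        for node in ctype.iter(f"{{{XSD_NS}}}{derivation}"):
            parts.append(local(node.get("base")))
    return f"Class({ctype.get('name')} partial {' '.join(parts)})"


if __name__ == "__main__":
    audio_type = ET.fromstring(SCHEMA).find(f"{{{XSD_NS}}}complexType")
    print(complex_type_to_class(audio_type))
```

Run on the fragment above, the sketch prints the class with both the cardinality restriction and the allValuesFrom restriction, followed by the MultimediaContentType superclass.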

Ontological Framework: Instances

ReDeFer XML2RDF: maps the XML tree to an RDF graph.

Blank node types are deduced from the restrictions in the XSD2OWL ontologies.

[Figure: XML tree model mapped to RDF graph model. Tree labels: Root, elem, attr, Empty, Text. Graph labels: Blank nodes, rdf:Properties]
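
The tree-to-graph idea can be sketched in a few lines of Python. This is a simplified illustration, not the ReDeFer XML2RDF algorithm: elements and attributes become properties, structured content becomes a blank node, and leaf elements become literals. The function name, the blank node labels, the use of rdf:value for mixed text and the sample input are all assumptions, and the typing of blank nodes from the XSD2OWL ontologies is left out.

```python
import itertools
import xml.etree.ElementTree as ET


def xml_to_triples(xml_text):
    """Map an XML tree to (subject, property, object) triples.

    Illustrative only: namespaces, ordering and datatypes are ignored.
    """
    ids = itertools.count()
    triples = []

    def new_blank():
        return f"_:b{next(ids)}"

    def attach(subject, element):
        text = (element.text or "").strip()
        # A leaf element without attributes becomes a literal-valued property.
        if len(element) == 0 and not element.attrib:
            triples.append((subject, element.tag, text))
            return
        # Otherwise the element's content is represented by a blank node.
        node = new_blank()
        triples.append((subject, element.tag, node))
        for name, value in element.attrib.items():
            triples.append((node, name, value))
        if text:
            triples.append((node, "rdf:value", text))
        for child in element:
            attach(node, child)

    document = new_blank()  # blank node standing for the XML document
    attach(document, ET.fromstring(xml_text))
    return triples


if __name__ == "__main__":
    # Made-up sample input, loosely modelled on NewsML element names.
    sample = ('<NewsItem><Subject FormalName="04000000"/>'
              '<HeadLine>Budget approved</HeadLine></NewsItem>')
    for triple in xml_to_triples(sample):
        print(triple)
```

For the sample input the sketch produces four triples: the document node points to a blank node for NewsItem, which carries a Subject blank node with its FormalName literal and a HeadLine literal.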

Ontological Framework: Instances

XML2RDF example

Integration Framework

[Figure: integration architecture. Labels: Audio, Video; Signal Processing; MPEG-7 XML (content-based metadata); XML (context-based metadata); XML2RDF; RDF; XSD2OWL; XML Schemas: NewsML, NITF, MPEG-7...; RDFS / OWL: IPTC SRS...; NewsML Ontology; MPEG-7 Ontology; Integration; DL Classifier; SWRL Engine; Higher-level metadata; Retrieval]

Load Ontological Framework

Integration Framework

NITF packaged in NewsML container
IPTC’s NITF-to-NewsML Metadata Mapping Stylesheet

<NewsML>
  <NewsItem>
    <NewsComponent>
      <DescriptiveMetadata>
        <SubjectCode>
          <Subject FormalName="04000000"/>
        </SubjectCode>
      </DescriptiveMetadata>
      <ContentItem>
        <DataContent>
          <nitf><body>…</body></nitf>
        </DataContent>
      </ContentItem>
    </NewsComponent>
  </NewsItem>
</NewsML>

Integration Framework

NewsML multimedia items: context and content-based MPEG-7 metadata

XML2RDF:
RDF for NewsML-NITF instances
Bridge subjects to NewsCodes ontology
RDF for MPEG-7 metadata
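
One way to picture the subject bridging step is the following Python sketch. It reads the Subject FormalName codes from a NewsML document and turns each into a URI that could identify a concept in a NewsCodes subjects ontology. The namespace SUBJECTS_NS and the function name subject_uris are hypothetical; the slides do not say how ReDeFer names or links these concepts.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace for the subjects ontology (not from the slides).
SUBJECTS_NS = "http://example.org/newscodes/subject#"

# The NewsML-NITF packaging example from the Integration Framework slide.
NEWSML = """
<NewsML>
  <NewsItem>
    <NewsComponent>
      <DescriptiveMetadata>
        <SubjectCode>
          <Subject FormalName="04000000"/>
        </SubjectCode>
      </DescriptiveMetadata>
      <ContentItem>
        <DataContent>
          <nitf><body>…</body></nitf>
        </DataContent>
      </ContentItem>
    </NewsComponent>
  </NewsItem>
</NewsML>
"""


def subject_uris(newsml_text):
    """Return one ontology URI per Subject code found in the document."""
    root = ET.fromstring(newsml_text)
    return [
        SUBJECTS_NS + subject.get("FormalName")
        for subject in root.iter("Subject")
        if subject.get("FormalName")
    ]


if __name__ == "__main__":
    print(subject_uris(NEWSML))
```

Once subjects are URIs rather than plain strings, a news item's RDF description can point at the corresponding ontology concept, which is what allows retrieval to follow the subject taxonomy.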

Conclusions

Press and Media domain: heterogeneous and metadata intensive

Semantic Web technologies and ontologies facilitate management and integration

Existing work: NewsML, NITF, NewsCodes, MPEG-7, TVAnytime,…

Conclusions

XSD2OWL: takes advantage of the semantics hidden in XML Schemas. We formalise semantics explicitly when building ontologies, but also implicitly when we write XML Schemas.

XML2RDF: reuses existing XML metadata to add momentum to the Semantic Web.

Future Work

Generate an ontology for the legacy system XML
Map the legacy ontology to the NewsML-NITF ontologies
Integrate automatic and assisted MPEG-7 metadata multimedia annotation
Complete the integration framework

Future Work

User interface:
Rhizomik Media: MPEG-7, TVAnytime, DC, Copyright Ontology…
Rhizomer-based semantic portal

Rhizomer

Thank you for your attention

More at:

http://rhizomik.net …/redefer …/semanticnewspaper …/ontologies/mpeg7ontos

Contact:

roberto@rhizomik.net

{fperdrix,rgil}@diei.udl.es