
SAPIR Search in Audio-visual content using P2p IR



Yosi Mass, Raul Santos

Chorus cluster meeting, Vilamoura 16-17 April 2008

Why SAPIR?

• The searchable space created by the growing amount of existing video and multimedia files may greatly exceed the area searched by major engines.

• Traditional search engines are limited to searching the text and metadata associated with multimedia content. If content providers describe their multimedia files unclearly or inaccurately, or use inaccurate tags, this method falls short.

• Current internet search is geared mainly to relatively powerful desktop machines accessed via regular web browsers, not to lightweight mobile devices with their connectivity and interactivity limitations.


SAPIR Objectives

• Develop cutting-edge technology to index and search large-scale audio-visual information by content.

• Make information available on many devices, enhanced by social networking, while preserving privacy and preventing fraud.

• Support new trends in multimedia (MM) content production: personal producers vs. professional producers.


SAPIR challenges

• Dimensions of the search problem:
  - Efficiency (scalability is the key issue)
  - Effectiveness (quality measures of results)

• Efficiency challenges:
  - Scale in collection size
  - Scale in number of users

• Effectiveness challenges:
  - New search paradigm combining text + audio-visual content
  - Usability challenges


SAPIR Consortium

Organization  Activity type       Country         Nr. Employees  RTD Person Months
IBM           IND                 Israel          621            88
CNR           Research Institute  Italy           5962           83
MPI           Research Institute  Germany         150            64
UPD           University          Italy           2234           49
Eurix         SME                 Italy           30             66
Xerox         IND                 France          2080           17
MU-Brno       University          Czech Republic  908            46
TID           IND                 Spain           1265           66
Telenor       IND                 Norway          674            29


SAPIR approach: P2P Architecture


Search using the Query by Example Paradigm

• Search for information about a physical object by photographing it with a mobile phone, or find a song by humming its melody.

• Support similarity search for metric spaces

Image Database
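Similarity search in a metric space can be sketched as follows. The slides do not describe SAPIR's actual metric index, so this is a generic pivot-based range query that relies only on the triangle inequality. The class name `PivotIndex`, the Euclidean distance, and the 2-D points standing in for image descriptors are all assumptions made for illustration.

```python
import math
import random


def dist(a, b):
    """Euclidean distance, used here as one example of a metric."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class PivotIndex:
    """Hypothetical pivot table: stores d(object, pivot) for every object."""

    def __init__(self, objects, pivots):
        self.objects = list(objects)
        self.pivots = list(pivots)
        # Precompute the distance from every object to every pivot.
        self.table = [[dist(o, p) for p in self.pivots] for o in self.objects]

    def range_query(self, query, radius):
        """Return (objects within radius of query, number of objects verified)."""
        q_to_pivots = [dist(query, p) for p in self.pivots]
        hits, verified = [], 0
        for obj, row in zip(self.objects, self.table):
            # Triangle inequality: |d(q, p) - d(o, p)| <= d(q, o).
            # If the lower bound already exceeds the radius, skip the object.
            if any(abs(qp - op) > radius for qp, op in zip(q_to_pivots, row)):
                continue
            verified += 1
            if dist(query, obj) <= radius:
                hits.append(obj)
        return hits, verified


# Query by example: find stored points similar to the example point.
random.seed(0)
data = [(random.random(), random.random()) for _ in range(1000)]
index = PivotIndex(data, pivots=data[:4])
example = (0.5, 0.5)
hits, verified = index.range_query(example, 0.1)
```

The pruning step never discards a true match, so the result equals a brute-force scan while usually verifying fewer objects.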


Feature extraction

<SapirMMObject>
  <title>when waves collide</title>
  <Mpeg7>
    <VisualDescriptor type="ScalableColorType">
    <VisualDescriptor type="ColorStructureType">
    <VisualDescriptor type="ColorLayoutType">
    <VisualDescriptor type="EdgeHistogramType">
    <VisualDescriptor type="HomogeneousTextureType">
  </Mpeg7>
  <comments>
    <comment id="…" author="…">beautiful…</comment>
    <comment ...>very powerful…</comment>
  </comments>
  <tags>
    <tag id="254" author="12@N00">waves</tag>
    <tag …>Victoria beach</tag>
  </tags>
</SapirMMObject>
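A minimal sketch of reading such an object with Python's standard library. The slide shows the descriptors and some attributes in elided form, so the sample below is a well-formed stand-in: the self-closing descriptor elements and the ids `c1`, `a1` and `255` are placeholders invented for this example, not part of the SAPIR schema.

```python
import xml.etree.ElementTree as ET

# Well-formed stand-in for the object on the slide (placeholder values).
SAMPLE = """
<SapirMMObject>
  <title>when waves collide</title>
  <Mpeg7>
    <VisualDescriptor type="ScalableColorType"/>
    <VisualDescriptor type="ColorLayoutType"/>
    <VisualDescriptor type="EdgeHistogramType"/>
  </Mpeg7>
  <comments>
    <comment id="c1" author="a1">beautiful</comment>
  </comments>
  <tags>
    <tag id="254" author="12@N00">waves</tag>
    <tag id="255" author="12@N00">Victoria beach</tag>
  </tags>
</SapirMMObject>
"""


def summarize(xml_text):
    """Split an object into its textual parts and its visual descriptor types."""
    root = ET.fromstring(xml_text)
    return {
        "title": root.findtext("title"),
        "descriptor_types": [d.get("type") for d in root.iter("VisualDescriptor")],
        "tags": [t.text for t in root.iter("tag")],
        "comments": [c.text for c in root.iter("comment")],
    }


summary = summarize(SAMPLE)
```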


Indexing

<SapirMMObject>
  <title>when waves collide</title>
  <Mpeg7>
    <VisualDescriptor type="ScalableColorType">
    <VisualDescriptor type="ColorStructureType">
    <VisualDescriptor type="ColorLayoutType">
    <VisualDescriptor type="EdgeHistogramType">
    <VisualDescriptor type="HomogeneousTextureType">
  </Mpeg7>
  <comments>
    <comment id="…" author="…">beautiful…</comment>
    <comment ...>very powerful…</comment>
  </comments>
  <tags>
    <tag id="254" author="12@N00">waves</tag>
    <tag …>Victoria beach</tag>
  </tags>
</SapirMMObject>

• Visual Descriptors Overlay: Metric index

• Text Overlay: Text index
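The split into two overlays could be sketched like this. It assumes that the textual fields (title, tags, comments) feed the text index and the MPEG-7 descriptors feed the metric index; the slide only names the two overlays and their index types. `TwoOverlayIndex` is a hypothetical name, and the plain dictionary is a placeholder for a real metric index structure.

```python
from collections import defaultdict


class TwoOverlayIndex:
    """Hypothetical single-node stand-in for the two overlays."""

    def __init__(self):
        self.text_index = defaultdict(set)  # term -> ids of objects containing it
        self.metric_index = {}              # object id -> feature vector (placeholder)

    def add(self, obj_id, text_fields, features):
        # Text overlay: a simple inverted index over whitespace-separated terms.
        for field in text_fields:
            for term in field.lower().split():
                self.text_index[term].add(obj_id)
        # Visual descriptors overlay: keep the vector for later distance computations.
        self.metric_index[obj_id] = tuple(features)

    def text_lookup(self, term):
        return set(self.text_index.get(term.lower(), set()))


index = TwoOverlayIndex()
index.add("img1", ["when waves collide", "waves", "Victoria beach"], [0.1, 0.7, 0.2])
index.add("img2", ["city at night"], [0.9, 0.1, 0.0])
```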


Querying

Tag: names

<Mpeg7Query weight="1">
  <VisualDescriptor type="ScalableColorType">
  <VisualDescriptor type="ColorStructureType">
  <VisualDescriptor type="ColorStructureType">
</Mpeg7Query>
<Mpeg7Query weight="0.5">
  <tag>waves</tag>
</Mpeg7Query>

Visual Descriptors Overlay

Text Overlay

Merge Results

Approximation
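One possible reading of the merge step is a weighted sum of the per-overlay scores, using the weights 1 and 0.5 from the query above. The slides name the step but not its formula, so the function `merge_results`, the score convention (similarity in [0, 1], higher is better) and the treatment of missing scores as 0 are assumptions.

```python
def merge_results(visual_scores, text_scores, w_visual=1.0, w_text=0.5, k=10):
    """Combine two {object_id: score} maps into one ranked top-k list."""
    ids = set(visual_scores) | set(text_scores)
    combined = {
        i: w_visual * visual_scores.get(i, 0.0) + w_text * text_scores.get(i, 0.0)
        for i in ids
    }
    # Sort by descending combined score; break ties by id for determinism.
    return sorted(combined.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


visual = {"a": 0.9, "b": 0.5}   # scores from the visual descriptors overlay
text = {"b": 1.0, "c": 0.8}     # scores from the text overlay
ranking = [obj_id for obj_id, _ in merge_results(visual, text)]
```

Here object `b` ranks first because it matches in both overlays, even though `a` has the best visual score alone.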


Project status for Apr 2008

• A scalable, extensible and versatile architecture for P2P was defined.

• APIs for P2P content management, indexing and search were defined and implemented.

• Several scenarios were defined and tested in focus groups.

• A common schema for feature representation using MPEG-7 was defined.

• A demo of indexing and search in 10M Flickr files, combining content-based image search with text and metadata, was implemented using the SAPIR APIs.

• A testbed of 50M Flickr files was crawled by the EGEE grid, aiming at 100M towards the end of the year. This testbed collection will be available for scientific experiments (CoPhir, http://cophir.isti.cnr.it).

• The next demo (due Nov '08) will include search in music, video and speech, as well as some scenario integration.


Tests

P2P architecture for search in Audio-Visual content

• Efficiency, some initial results:
  - 1M Flickr XML files: ~500 msec per query, 50 peers (8 CPU, 16 GB)
  - 10M Flickr XML files: ~500 msec per query, 500 peers (16 CPU, 64 GB)

• Effectiveness:
  - Text + image improves over text or image only
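A short calculation helps read the two efficiency figures. If the files are spread evenly across peers (an assumption, since the slides do not state the distribution), both configurations hold the same number of files per peer, which is consistent with the query time staying at about 500 msec.

```python
# Figures taken from the efficiency results; even distribution is assumed.
configs = [
    {"files": 1_000_000, "peers": 50},
    {"files": 10_000_000, "peers": 500},
]


def files_per_peer(cfg):
    """Average number of files each peer would hold under an even split."""
    return cfg["files"] // cfg["peers"]


loads = [files_per_peer(c) for c in configs]
```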


WP9 – Dissemination and exploitation

• Public website: http://www.sapir.eu

• Dissemination:
  - First DUP was published
  - Participation in Chorus meetings and road map
  - Workshops: SIGIR'07, ECIR'08, SAC'08
  - Demos

• Publications: more than 20 SAPIR-related publications so far

• Contacts with standards bodies: MPEG-21, MPEG-A, MPEG-7

• Exploitation


WP9 – Dissemination and exploitation

• Proposed contribution to standards:
  - Extension to MPEG-7 for music and speech
  - Proposals for MPQF (MPEG-7 Query Format)
  - A DRM implementation for P2P based on Chillout
  - Propose a call for MPEG-21 Query Format


Thank You!

For more info visit http://www.sapir.eu


Results (Jan 2007 – Mar 2008)

• WP1: Scenarios and a complete guideline for usability and user interface design

• WP2: Architecture for P2P and APIs

• WP3: Definition of a common schema for feature representation using MPEG-7

• WP4, WP5: Demo of indexing and search in 10M Flickr files combining text and low-level visual descriptors

• WP6: Work on an interoperable DRM solution (Chillout) for P2P networks

• WP7: Initial design of social networking and support for mobile devices

