European Research Consortium for Informatics and Mathematics www.ercim.org Number 62, July 2005 Special: Multimedia Informatics

KEYNOTE

3 by Viviane Reding, Member of the European Commission, responsible for Information Society and Media

JOINT ERCIM ACTIONS

4 Jérôme Chailloux nominated ERCIM Manager

4 Image and Video Understanding wins the ERCIM Working Group Award

5 Grid@Asia - Advanced Grid Research Workshops through European and Asian Co-operation

6 Strategic Workshops organized by ERCIM

NEWS FROM W3C

6 W3C Celebrated Ten Years Leading the Web in Europe

6 Launch of the W3C Mobile Web Initiative

7 New Fee Structure for W3C Membership

7 W3C Internationalization Activity looks towards Africa

7 Latest W3C Recommendations

SPECIAL THEME: MULTIMEDIA INFORMATICS

Introduction

8 Multimedia Informatics by Joachim Köhler

Multimedia Indexing and Retrieval

Invited article:

10 Managing Digital Photo Collections by Lynn Wilcox

11 Multimedia Indexing: The Multimedia Challenge by Patrick Gros, Manolis Delakis, and Guillaume Gravier

12 MediAssist: Managing Personal Digital Photo Archives by Noel Murphy, Cathal Gurrin and Gareth J. F. Jones

14 The MUSCLE Benchmarking Initiative by Allan Hanbury and Michael Nölle

15 New Testbed of One Million Images by Gregory Grefenstette, Pierre-Alain Moëllic, Patrick Hède, Christophe Millet and Christian Fluhr

16 Managing the Growth of Multimedia Digital Content by David Bainbridge, Paul Browne, Paul Cairns, Stefan Rüger and Li-Qun Xu

18 Maps of Music by Andreas Rauber, Thomas Lidy and Robert Neumayer

19 Structuring Multimedia Archives with Static Documents by Denis Lalanne and Rolf Ingold

20 MultimediaN: Personalized Information Delivery by Marcel Worring and Nellie Schipper

21 SPIEGLE: A Multimedia Search Engine Generator by Arjen de Vries

23 Towards a ‘Smart Content Factory’ by Georg Güntner

24 Fischlár-News: Multimedia Access to Broadcast TV News by Alan F. Smeaton, Noel E. O’Connor and Hyowon Lee

25 Interactive Multimedia-Enabled Learning and Training by Claire Kenny, Declan McMullen, Mark Melia and Claus Pahl

Multimedia Networking

27 Representation and Communication of Multimedia Data and Metadata by Sara Colantonio, Maria Grazia Di Bono, Massimo Martinelli, Gabriele Pieri and Ovidio Salvetti

28 A Cognitive Architecture for Semantically Based Medical Image Retrieval by John Moustakas, Socrates Dimitriadis and Kostas Marias

29 Personalized and Adaptive Multimedia Retrieval by Joemon M. Jose and Jana Urban

31 ADMITS: Adaptation in Distributed Multimedia IT Systems by Laszlo Böszörmenyi

32 Solutions for an Interpreter-Enabled Multimedia Conferencing System by Ferenc Sárközy and Géza Haidegger

34 Host Recommendation in the Adaptive Distributed Multimedia Server by Ottó Hutter, Tibor Szkaliczki and Balázs Goldschmidt

35 Scalable Audio Streaming to Mobile Devices by Jonathan Sherwin and Cormac J. Sreenan

36 A Networked Approach to TV Content Distribution by Adrian Cahill, John Roche and Cormac J. Sreenan

38 Presentation, Control and Collaboration in the Networked Classroom by Leandro Navarro-Moldes and Manuel Oneto

39 Video Transcoding Architectures for Multimedia Real Time Services by Maurizio A. Bonuccelli, Francesca Lonetti and Francesca Martelli

Interactive Multimedia Applications

40 Designing Multi-Modal Multi-Device Interfaces by Silvia Berti and Fabio Paternò

41 LimSee2: A Cross-Platform SMIL Authoring Tool by Romain Deltour, Nabil Layaïda and Daniel Weck

43 Merging Virtual Reality and Television through Interactive Virtual Actors: Marilyn - Multimodal Avatar Responsive Live Newscaster by Sepideh Chakaveh

44 Interactive Multimedia for Supporting the Quality of the Production by George L. Kovács, Géza Haidegger and János Nacsa

45 The CINEMA Project: A Video-Based Human-Computer Interaction System for Audio-Visual Immersion by Renaud Dardenne, Jean-Jacques Embrechts, Marc Van Droogenbroeck, and Nicolas Werner

46 Personalised Enriched Broadcast Experience by Mounia Lalmas, Nick Bryan-Kinns and Alan Pearmain

R&D AND TECHNOLOGY TRANSFER

48 LIGHT: XML-Innovative Generation for Home Networking Technologies by Luca Tarrini and Vittorio Miori

49 Modelling of Authentic Reflectance Behaviour in Virtual Environments by Michal Haindl and Jiří Filip

51 Haptic Training Systems in Virtual Surgery by José San Martín, David Miraut, Carolina Gómez and Sofía Bayona

52 Computer Recognizes Whale Tails by Annette Kik, Eric Pauwels and Elena Ranguelova

53 Text Document Classification by Jana Novovičová

54 Advancing Black-Box Reuse in a Multimedia Application Framework by Bernhard Wagner

55 CASSEM: Vibration Control in the Smart Way by Salim Belouettar

57 Point6: The IPv6 Skill Centre - Moving to the Next-Generation Internet Protocol by César Viho and Annie Floch

58 Working Slower with More Powerful Computers by Lorenz M. Hilty, Andreas Köhler, Fabian van Schéele and Rainer Zah

59 Coordinating IST Research across Europe by Simon Lambert

60 Software Automation meets Interactive Media Development by Dirk Deridder, Thomas Cleenewerck, Johan Brichau and Theo D'Hondt

EVENTS

62 Workshop on Challenges in Software Evolution by Tom Mens

62 WWV 2005 - First Workshop on Automated Specification and Verification of Web Sites by María Alpuente, Santiago Escobar and Moreno Falaschi

63 Announcements

66 EURO-LEGAL

67 IN BRIEF

CONTENTS

Next issue: October 2005 (Special theme: Security and Trust Management)

ERCIM News No. 62, July 2005

KEYNOTE

Digital convergence between audiovisual media, high-speed networks and smart devices is a reality. Multimedia informatics is a key enabling technology in this process. It helps make media content more directly manageable by computers, in an age where new information stored on paper, film, magnetic and optical media is reckoned to double every three years.

What will be the benefits? Digital convergence presents new opportunities for businesses to unlock new markets, and for public bodies to improve the way they work. For example, national newspapers can reach customers through print, online or mobile services. Hospitals will be able to make X-Rays available instantly at the patient’s bedside.

Convergence also presents new challenges, especially in the interoperability between the network, device and content levels. Media content needs to be reformatted, restructured and re-indexed continually for multi-channel distribution, and this can best be done with automated solutions.

What is the EU’s role? On the research front, the EU’s sixth Framework Programme for Research and Development contributes to digital content, cognition and interface development as well as to eLearning and culture.

Digital content research combines semantic methods with audiovisual and Web technologies, allowing providers to create new forms of attractive and meaningful content for the consumer. In one instance, commercial producers and distributors will aim to deliver media content to end-users through different channels including interactive TV, personal computer, kiosks, mobile and handheld devices. In another case, a network of excellence - led by ERCIM - will bring together research groups in data mining and machine learning, to automate semantic-based multimedia retrieval from media content.

Cognition research breaks new ground in modeling machine perception and understanding more closely the human brain. One goal of this approach is to arrive at a new kind of computer that can describe what it sees, in real life situations. Imagine the benefits of a system which can report a traffic jam that it sees over a video camera, or which can translate sign language into words.

We also need easy-to-use interfaces, as far as possible using our own language and preferences. Research here aims, for example, at a new generation of machine translation, based on automated speech recognition and spoken language translation of broadcast news and speeches, starting with English and Spanish. Research will also explore multimodality, for example smart meeting rooms and electronic assistants which can collect, annotate and distribute different kinds of meeting materials on the fly: spoken, written or visual.

What does the future hold? My new i2010 initiative, adopted by the Commission on 1 June, sets an objective of increasing our investment in ICT research by 80% over the coming years. But i2010 does not stop there. It addresses key policy and regulatory matters, such as digital rights management. It also aims at a more inclusive information society. A recent eLearning conference explored the need to reform education and training systems, and to promote digital literacy, e-skills and lifelong learning in an ageing population. In the area of digital culture, we intend to strengthen EU policies concerning the preservation and exploitation of Europe’s written and audiovisual heritage, with the help of all the public players concerned, making easy access by citizens to Europe’s valuable resources an everyday reality.

Viviane Reding, Member of the European Commission responsible for Information Society and Media

JOINT ERCIM ACTIONS

Jérôme Chailloux nominated ERCIM Manager

ERCIM’s board of directors nominated Jérôme Chailloux as manager of ERCIM during its meeting in Helsinki on 29 May. Jérôme Chailloux was proposed by INRIA, the host of the ERCIM office.

The ERCIM manager is the valid representative of ERCIM vis-à-vis third parties. He is responsible for ensuring that the implementation of ERCIM's general policy is within the framework specified by the membership. Jérôme will be in charge of managing both W3C and ERCIM offices. An important task will be the joint coordination of the ERCIM office and W3C Europe. There is a strong demand on behalf of the ERCIM consortium for an efficiently managed W3C Europe since the ERCIM EEIG members are liable for its activities.

From 1980 to 1987, Jérôme Chailloux was a researcher at INRIA, where he became a research director whilst occupying a number of teaching positions (École Polytechnique, CERICS). He worked in the areas of automatic VLSI circuits, software engineering and knowledge-based systems, and is the main inventor and developer of the programming language Le-Lisp, specializing in artificial intelligence.

From 1987 to 1995, Jérôme Chailloux co-founded the company ILOG, the second subsidiary of INRIA, taking on the positions of Chief Scientific Officer and director. ILOG is a world leader in the production of software components in the fields of optimization, decision aid and visualization.

From 1995 to 2001, he was Chief Information Officer of the genomics company GENSET, which is listed on the Nouveau Marché and NASDAQ stock exchanges. In this role, he decided upon strategy, implemented IT and bioinformatics resources and led one of Europe’s largest teams of bioinformaticians.

Until 2000, he was a member of the Coordination Committee for Science and Information Technology and Communication of the French National Ministry for Education, Research and Technology.

He is president of the Sophia Complexity Association, partner in the investment fund of The Hyper Company, scientific adviser for the companies Genclis, Chiasma and the Thrombosis Research Institute, and manager of the consultancy company CERTICS.

Image and Video Understanding wins the ERCIM Working Group Award

The ERCIM Working Group “Image and Video Understanding” is the winner of the 2005 Working Group Award. The award consists of the right to spend up to 20,000 Euro on its activities. It was presented to the Working Group chair Eric Pauwels, CWI, by the ERCIM president Keith Jeffery during the ERCIM meetings in Helsinki on 30 May 2005.

Shortly after it became an official ERCIM Working Group, the Image and Video Understanding Group successfully submitted its proposal for the MUSCLE Network of Excellence (NOE), which covered the same scientific topics. The consortium was enlarged by approximately 20 additional scientific members in the formation of the new Network. The high level of collaborative research that arose from this network resulted in the group winning the 2005 ERCIM Working Group Award.

The 2005 award was the last award in the current format. ERCIM is currently developing a new scheme to support the Working Groups in their attempts to create new project proposals.

Photo: Keith Jeffery (left) presents the Working Group Award to Eric Pauwels.

Links:
MUSCLE Network of Excellence: http://www.muscle-noe.org
ERCIM Working Groups: http://www.ercim.org/activity/workgroup.html

Please contact:
Jérôme Chailloux, ERCIM office
Tel: +33 4 92385010
E-mail: [email protected]

Grid@Asia - Advanced Grid Research Workshops through European and Asian Co-operation

The Grid@Asia project will foster collaboration in Grid research and technologies between the European Union and Asian countries with a particular focus on China and South Korea. The project is coordinated by ERCIM and supported by the European Commission.

Grid@Asia will define a joint research agenda to address international Grid priorities relying on a core of leading European Grid research institutes. This initiative will be supported by Asian partners to ensure on-site organisation, enhanced visibility and the participation of high-profile industrial and scientific delegations. Grid@Asia is expected to provide Europe with a clear picture of the Grid community in those two Asian countries and to prepare a reliable ground for sustainable and long-term collaboration. The project is implemented through three principal steps:
• identification of Chinese and South Korean key players in Grid research and technologies,
• organisation of focused workshops around EU/Asia research and industrial agendas,
• establishment of sustainable cooperation and dissemination activities.

Grid@Asia will support long term international cooperation by weaving additional links with leading Asian Grid research communities, in particular through the integration of Asian expertise with leading European Grid initiatives in the 6th Framework Programme of the European Union (such as Networks of Excellence, Integrated Projects, STREPS, etc.) and later on within the forthcoming 7th Framework Programme (FP). This will position the European Grid community as a leading centre of excellence, enrich European expertise in the field and support the adoption of common Grid standards worldwide.

To reach these goals it was decided to organise a series of three workshops in China and South Korea (Beijing, Shanghai and Seoul). The first event took place from 21 to 23 June 2005 in Beijing, hosted by the Beihang University.

The workshop focussed on three main scientific themes (GRID Middleware, GRID Applications, Tools and Programming Environments) and allowed the European and Asian Grid communities to discuss and identify their common areas of interest. Two series of presentations were given. The first covered current European projects supported by the Commission, such as AKOGRIMO, NEXTGRID, SIMDAT, DILIGENT, DEISA, CoreGRID and GRIDLAB, and similar Chinese projects. The second featured project ideas which could lead to the submission of common proposals. National and multilateral programmes, including their funding mechanisms, were also presented with the goal of initiating collaborations between promising European and Asian research teams within the 6th and 7th FP.

The next workshop, to be organised before the end of 2005, will focus on Grid applications.

Link: http://www.gridatasia.net

Please contact:
Bruno Le Dantec, ERCIM office
Tel: +33 4 92 38 50 10
E-mail: [email protected]

Strategic Workshops organized by ERCIM
by Jessica Michel

Beyond the Horizon, a coordination action funded by the Future and Emerging Technologies (FET) activity of the EU’s IST-FP6 Programme, launched its opening workshops in June.

Six workshops will take place between June and September uniting leading European experts in IST-related research areas requiring support. The goal is to create critical scientific mass in these areas and ensure that European research stays one step ahead of general trends. The scheduled workshops include the following:
• Intelligent and Cognitive Systems, Zurich, Switzerland, 11-13 June
• Security, Trust and Dependability, Paris, France, 20-23 June
• Bio-ICT Synergies, Sophia Antipolis, France, 28-29 June
• Pervasive Computing & Communications, Vienna, Austria, 28 July
• Software Intensive Systems, Koblenz, Germany, 9-10 September
• Nanotechnologies and Nanoelectronics, Brussels, Belgium, 11-12 October
• Plenary Workshop, Brussels, Belgium, 12-13 December.

ERCIM must mobilize its extensive network of researchers to contribute maximally to the success of this project. If you would like to learn more about participating in any one of these thematic areas, or attend the Plenary Workshop during which inter-disciplinary research directions will be explored, please contact Jessica Michel.

Link: http://www.beyond-the-horizon.net

Please contact:
Jessica Michel, ERCIM office
Tel: +33 4 9238 5089
E-mail: [email protected]

Photo: Workshop participants.

News from W3C

Launch of the W3C Mobile Web Initiative

Today, mobile Web access suffers from interoperability and usability problems. Browsing the Web from a mobile device is not as convenient as expected. Users often find that their favorite Web sites are not accessible or not as easy to use on their mobile phone as on their desktop computer. Content providers have difficulties building Web sites that work well on all types and configurations of mobile phones offering Web access.

On 11 May 2005, W3C launched the Mobile Web Initiative (W3C-MWI) to make browsing the Web from mobile devices a reality. “Mobile access to the Web has been a second class experience for far too long,” explained Tim Berners-Lee, W3C Director. “MWI recognizes the mobile device as a first class participant, and will produce materials to help developers make the mobile Web experience worthwhile.”

Mobile Web Initiative participants will initially focus on the two areas ‘best practices’ and ‘mobile device descriptions’. The Mobile Web Best Practices Working Group (MWBP WG) is chartered to develop authoring guidelines, checklists and best practices to help content providers to develop Web content that works well on mobile devices. The Device Description Working Group (DD WG) is chartered to address the development of improved device description solutions, that is, a database of descriptions that can be used by content authors to adapt their content to a particular device.

W3C is already active in the mobile Web space, developing Web standards for multimodal interaction and device-independent design, as well as profiles for mobile devices; related standards include XHTML, SVG Mobile Profiles, and the SMIL Basic Profile. MWI work will complement these current efforts. MWI is also chartered to establish cooperative ties with related groups, including the Open Mobile Alliance (OMA) and 3GPP. These ties will help ensure that the needs of users are well-defined and that the efforts of the MWI and these related groups are complementary.

W3C is pleased to welcome the Founding Sponsors of the Mobile Web Initiative: Afilias, Bango.net, Drutt Corporation, Ericsson, France Telecom, HP, Jataayu Software, MobileAware, Nokia, NTT DoCoMo, Opera Software, TIM Italia, Segala M Test, Sevenval, RuleSpace, V-Enable, Vodafone and Volantis. Participation in MWI-sponsored W3C Working Groups is open to all organizations.

Links:
Mobile Web Initiative: http://www.w3.org/2005/MWI/
Mobile Web Initiative Sponsorship Program: http://www.w3.org/2005/MWI/Sponsoring.html

W3C Celebrated Ten Years Leading the Web in Europe

W3C held a celebration of its ten years in Europe on Friday, 3 June 2005, in Sophia Antipolis, France. This half-day celebration afforded W3C Members and invited guests the opportunity to reflect on the progress and the role of the Web in Europe.

W3C10 Europe was part two of the W3C Tenth Anniversary celebration and follows on the W3C10 celebration last December, which marked the anniversary of W3C's founding at the Massachusetts Institute of Technology.

W3C10 Europe speakers and panelists discussed the importance of the Web in Europe, W3C's central role in the development of the Web, and visions of the future of the Web. The program emphasized two themes:

• The Web as Unifying Force in Europe: The European Union experience provides a compelling backdrop for considering how to expand the frontiers of the Web to enable participation by new communities. Speakers addressed the integration of the Web into the lives of Europeans, some challenges of internationalization, and advances in sharing data across communities.

• Policies Shaping the Web in Europe: Speakers discussed how Europeans are tackling important Web policy issues such as privacy, ensuring access by people with disabilities, and use of the Web in the public sector.

Tim Berners-Lee, W3C Director and inventor of the Web, reunited with his former CERN colleague Robert Cailliau to share personal reflections and stories about how the Web got started at CERN. Berners-Lee also delivered a keynote stressing the importance of Web standards and addressing current challenges in the European industry and research communities.

Links:
W3C10 Europe: http://www.w3.org/2005/06/W3C10.html
Tim Berners-Lee keynote on "The Economic Importance of Standards": http://www.w3.org/2005/Talks/w3c10-TimBernersLee/
All presentations available from: http://www.w3.org/2005/Talks/w3c10-Overview/

Photos (Bert Bos, W3C): Tim Berners-Lee; Keith Jeffery; Tim Berners-Lee and Robert Cailliau.

Latest W3C Recommendations

• XML Key Management Specification (XKMS 2.0), 28 June 2005, Shivaram H. Mysore, Phillip Hallam-Baker
• XML Key Management Specification (XKMS 2.0) Bindings, 28 June 2005, Shivaram H. Mysore, Phillip Hallam-Baker
• Character Model for the World Wide Web 1.0: Fundamentals, 15 February 2005, Tex Texin, François Yergeau, Richard Ishida, Martin J. Dürst, Misha Wolf

A complete list of all W3C Technical Reports: http://www.w3.org/TR/

New Fee Structure for W3C Membership

In keeping with its international mission to lead the Web to its full potential, W3C announced a new fee structure designed to reduce the barrier of entry for organizations in developing countries. The goal is to make it easier for small companies and not-for-profit organizations to become W3C Members and become engaged in the development of foundation technologies for the World Wide Web.

Standardized technologies built in a flexible manner, with attention to internationalization needs (languages and/or infrastructure), can have a dramatic impact on life, education and commerce in a given region.

However, one of the greatest obstacles to participation in large consortia is the cost of entry. What appears to be a reasonable membership fee in Western Europe, Japan or North America is prohibitive in other parts of the world. While W3C does embrace participation from individuals as invited experts, the consortium realized that more was necessary to engage organizations around the globe.

W3C is all about building Web technologies that can be of service to the world. This new fee structure for organizations from the developing world affirms the value W3C places on their participation in, contribution to and use of the standards and guidelines developed to drive the future of the World Wide Web.

This initiative, focused on regions of the world beginning to discover Web technologies, is only the most recent in W3C's international commitments. Through its technical Internationalization Activity, volunteer-based translation program, its fourteen Offices around the globe, as well as its Patent Policy, W3C can better meet the needs and requirements of diverse populations, and can help those regions develop sound, standards-based Web infrastructure.

Links:
New fee structure: http://www.w3.org/Consortium/fees
Join W3C documentation: http://www.w3.org/Consortium/join

W3C Internationalization Activity looks towards Africa

Richard Ishida represented the W3C Internationalization Activity in Casablanca at the beginning of June as a keynote speaker at a 3-day Pan-African Localization Workshop organized by the International Development Research Center. The workshop, the first of its kind, brought together participants from twelve African countries as well as experts from other continents to discuss how to better localize ICT into indigenous languages and scripts so as to promote rapid and fair development in Africa. The workshop was also visited by M. Rachid Talbi el Alami, Moroccan Minister-Delegate to the Prime Minister in Charge of General and Economic Affairs, and Carmen Sylvain, Canadian Ambassador to Morocco. Both expressed support for its aims. The workshop sets a foundation for future networking and information sharing via the development of a collaborative, Web-based site which will provide useful information and support the initiatives of a pan-African community of localizers.

Contributing to the workshop supports the aims of the W3C to increase participation by developing countries in the process of developing Web technologies. The W3C has recently revised its member fees to encourage participation by such countries.

The W3C Internationalization Activity has the goal of proposing and coordinating any techniques, conventions, guidelines and activities within the W3C and together with other organizations that allow and make it easy to use W3C technology worldwide, with different languages, scripts, and cultures.

Articles and Tutorials on International Usage of W3C Technologies

W3C's Internationalization Activity reviews W3C technologies in production for internationalization concerns. It also regularly publishes articles and tutorials relating to international usage of W3C technologies. For example, the latest article describes the use of language tags to indicate the language of text in HTML and XML documents, as well as in HTTP headers, SMIL and SVG switch statements, CSS pseudo-elements, etc. The tutorials cover the use of multilingual Web addresses, ruby markup and styling, character sets and encodings in XHTML, HTML and CSS, and many more topics.

The Internationalization Activity welcomes the participation of individuals and organizations around the world to help improve the appropriateness of the Web for multiple cultures, scripts and languages.

Links:
W3C Internationalization Activity: http://www.w3.org/International/
Articles: http://www.w3.org/International/articles/
W3C Web Internationalization Tutorials: http://www.w3.org/International/tutorials/
Language tags in HTML and XML: http://www.w3.org/International/articles/language-tags/

SPECIAL THEME: Multimedia Informatics

Multimedia Informatics
by Joachim Köhler

Multimedia informatics is a multidisciplinary research area looking at the creation, processing, transmission and consumption of audio-visual data. It combines expertise in signal processing, pattern recognition, coding technology, networking and protocols, data modelling and user interaction. Although these are all challenging research areas in their own right, multimedia informatics stresses their applied usage, exploitation and adaptation. It covers the whole chain of processing, from content authoring to media indexing of the archived content.

Media informatics provides the basis for many practical applications, such as video conferencing, advanced digital TV and Internet services. It has gained from enormous advances in data-capture and presentation hardware, and in broadband network technologies for real-time transmission of multimedia data services.

Different standardization organizations and funding agencies have recognized the important role of multimedia technologies. The MPEG community (http://www.chiariglione.org/mpeg/) was very successful in standardizing advanced audio and video coding technologies. MPEG-2 coding and transmission technology forms the basis for all digital video broadcasting (DVB) services and has generated huge revenue in the media industry. Advanced MPEG-4 audio and video codecs allow the transmission of audio-visual data over mobile networks. The increased availability of multimedia content has also generated the need to manage and organize audio-visual data. This is covered by the MPEG-7 standard, which contains a metadata description scheme to describe multimedia content for indexing and retrieval applications. The goal of many research groups is to invent automatic methods to extract relevant metadata information, using methods from the area of signal processing and pattern recognition. This is a typical example of how algorithms and methods are applied and exploited for multimedia content processing.

Although several applications of multimedia technology already exist in our daily lives, there are still many advances and improvements to be made. This special issue contains 29 articles on a variety of research projects in this field being undertaken by ERCIM members. The invited article by Lynn Wilcox from FXPAL research describes an advanced system for managing digital photo collections using a face recognition engine to detect and classify people in digital photos. Other work is the result of national and European projects, including SAVANT, MUSCLE and SIMILAR. The importance of international projects is clear, especially when benchmarking tests must be carried out to evaluate the performance of a multimedia system.

ARTICLES IN THIS SECTION

Introduction

8 Multimedia Informatics by Joachim Köhler

Multimedia Indexing and Retrieval

Invited article:

10 Managing Digital Photo Collections by Lynn Wilcox

11 Multimedia Indexing: The Multimedia Challenge by Patrick Gros, Manolis Delakis and Guillaume Gravier

12 MediAssist: Managing Personal Digital Photo Archives by Noel Murphy, Cathal Gurrin and Gareth J. F. Jones

14 The MUSCLE Benchmarking Initiative by Allan Hanbury and Michael Nölle

15 New Testbed of One Million Images by Gregory Grefenstette, Pierre-Alain Moëllic, Patrick Hède, Christophe Millet and Christian Fluhr

16 Managing the Growth of Multimedia Digital Content by David Bainbridge, Paul Browne, Paul Cairns, Stefan Rüger and Li-Qun Xu

18 Maps of Music by Andreas Rauber, Thomas Lidy and Robert Neumayer

19 Structuring Multimedia Archives with Static Documents by Denis Lalanne and Rolf Ingold

20 MultimediaN: Personalized Information Delivery by Marcel Worring and Nellie Schipper

21 SPIEGLE: A Multimedia Search Engine Generator by Arjen de Vries

23 Towards a ‘Smart Content Factory’ by Georg Güntner

24 Fischlár-News: Multimedia Access to Broadcast TV News by Alan F. Smeaton, Noel E. O’Connor and Hyowon Lee

25 Interactive Multimedia-Enabled Learning and Training by Claire Kenny, Declan McMullen, Mark Melia and Claus Pahl

27 Representation and Communication of Multimedia Data and Metadata by Sara Colantonio, Maria Grazia Di Bono, Massimo Martinelli, Gabriele Pieri and Ovidio Salvetti

28 A Cognitive Architecture for Semantically Based Medical Image Retrieval by John Moustakas, Socrates Dimitriadis and Kostas Marias

29 Personalized and Adaptive Multimedia Retrieval by Joemon M. Jose and Jana Urban

Multimedia Networking

31 ADMITS: Adaptation in Distributed Multimedia IT Systems by Laszlo Böszörmenyi

32 Solutions for an Interpreter-Enabled Multimedia Conferencing System by Ferenc Sárközy and Géza Haidegger

34 Host Recommendation in the Adaptive Distributed Multimedia Server by Ottó Hutter, Tibor Szkaliczki and Balázs Goldschmidt

35 Scalable Audio Streaming to Mobile Devices by Jonathan Sherwin and Cormac J. Sreenan

36 A Networked Approach to TV Content Distribution by Adrian Cahill, John Roche and Cormac J. Sreenan

38 Presentation, Control and Collaboration in the Networked Classroom by Leandro Navarro-Moldes and Manuel Oneto

39 Video Transcoding Architectures for Multimedia Real Time Services by Maurizio A. Bonuccelli, Francesca Lonetti and Francesca Martelli

Interactive Multimedia Applications

40 Designing Multi-Modal Multi-Device Interfaces by Silvia Berti and Fabio Paternò

41 LimSee2: A Cross-Platform SMIL Authoring Tool by Romain Deltour, Nabil Layaïda and Daniel Weck

43 Merging Virtual Reality and Television through Interactive Virtual Actors Marilyn - Multimodal Avatar Responsive Live Newscaster by Sepideh Chakaveh

44 Interactive Multimedia for Supporting the Quality of the Production by George L. Kovács, Géza Haidegger and János Nacsa

45 The CINEMA Project: A Video-Based Human-Computer Interaction System for Audio-Visual Immersion by Renaud Dardenne, Jean-Jacques Embrechts, Marc Van Droogenbroeck and Nicolas Werner

46 Personalised Enriched Broadcast Experience by Mounia Lalmas, Nick Bryan-Kinns and Alan Pearmain

The articles are clustered into three areas, which are organized in the following manner:

Topic 1: Multimedia Indexing and Retrieval
This topic contains most of the submitted articles and includes research work on content-based image retrieval, multimedia search engines and the automatic generation and management of metadata. The goal of several research groups is to increase the semantic knowledge of multimedia resources and to close the semantic gap between high-level features coming from text processing, and low-level features from image and audio processing.

Topic 2: Multimedia Networking
Seven articles cover investigations on multimedia networking issues, and present distributed multimedia systems that are connected and managed with intelligent network components (eg proxies). Work on scalable streaming techniques and transcoding mechanisms for adapting the bandwidth for heterogeneous networks is the subject of the articles from Sherwin (page 35) and Bonuccelli (page 39).

Topic 3: Interactive Multimedia Applications
These articles describe research work on content authoring and show several interactive multimedia applications. With the toolkit LimSee2, developed by Romain Deltour (page 41), it is possible to create multimedia applications and content using the W3C standard SMIL. Another toolkit called TERESA from the research organization ISTI-CNR allows the development of multimodal user interfaces in multi-device environments (page 40). The virtual actor Marilyn (see article by Sepideh Chakaveh from Fraunhofer IMK on page 43) is applied as a virtual newscaster for an interactive TV application.

Please contact:
Joachim Köhler, Institute for Media Communication - IMK, Fraunhofer ICT Group, Germany
E-mail: [email protected]


The increased use of digital cameras is creating a need to manage large collections of photos. People want to organize their photos, browse their collections, search for a particular photo, and create slide shows to share with friends. However, people are typically unwilling to manually classify their photos for organization, and most often simply upload the images from the camera into folders on their computer.

The FXPAL photo application is a tool that allows users to automatically organize their photo collections on the fly. The user first tells the application where the photos are stored on the computer. He then selects the desired view and the system displays the photos accordingly. For example, if the user wants to see the photos organized by date, he can select the dates view to see photos grouped hierarchically by year, month, and day. Similarly, he can select the people view to see photos grouped according to who is in the picture and the place view to see photos grouped according to location. There is also an event view that automatically groups the photos into meaningful events, such as birthdays or weddings.

We achieve these groupings by automatically analyzing the photos’ content and the metadata associated with the photos. By metadata, we mean information such as the date, time, and location the picture was taken. Grouping by time and date is straightforward: we simply create a hierarchy that allows users to select all photos by year, month, or day. The FXPAL photo application provides a calendar interface that visualizes the photos within the calendar for easier selection. The use of space is optimized in sparse calendar views by expanding the days of the calendar where there are photos and shrinking the days where there are none.
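As an illustration only, a year/month/day hierarchy of this kind can be sketched in a few lines of Python. The function name `group_by_date` and the sample photos are hypothetical and are not taken from the FXPAL application; the sketch assumes each photo already carries a capture timestamp.

```python
from collections import defaultdict
from datetime import datetime


def group_by_date(photos):
    """Build a year -> month -> day -> [photo names] hierarchy.

    `photos` is an iterable of (name, capture_time) pairs.
    """
    tree = defaultdict(lambda: defaultdict(lambda: defaultdict(list)))
    for name, taken in photos:
        tree[taken.year][taken.month][taken.day].append(name)
    return tree


# Hypothetical sample data.
photos = [
    ("beach.jpg", datetime(2004, 7, 14, 10, 30)),
    ("sunset.jpg", datetime(2004, 7, 14, 21, 5)),
    ("cake.jpg", datetime(2005, 1, 2, 15, 0)),
]
tree = group_by_date(photos)
```

Selecting "all photos from July 2004" then amounts to reading `tree[2004][7]`, which is what makes the hierarchical view cheap to build.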

Grouping photos by events such as holiday gatherings or vacations is a common practice among photographers. The FXPAL system automatically detects events by clustering photos according to the time they were taken and their content. The technique is based on similarity analysis, in which the self-similarity matrix of the photos, computed from temporal and content similarity, is partitioned into disjoint events. After photos have been automatically grouped into events, users can attach semantic labels, so that when the photos are displayed in the event view the contents of the collection are easily understood.
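The idea of partitioning a self-similarity matrix can be sketched as follows. This is a simplified illustration under stated assumptions, not the FXPAL algorithm: it uses temporal similarity only (content similarity is omitted), an exponential similarity with a hypothetical time scale, and a simple threshold on a checkerboard-style boundary score instead of a proper partitioning step. All names and parameter values are assumptions.

```python
import math


def similarity_matrix(times, scale):
    """Temporal self-similarity: 1 for simultaneous photos, near 0 for distant ones."""
    return [[math.exp(-abs(a - b) / scale) for b in times] for a in times]


def novelty(S, b, w):
    """Score a candidate boundary placed just before photo index b.

    High when photos on each side are similar among themselves
    but dissimilar across the boundary.
    """
    left = range(max(0, b - w), b)
    right = range(b, min(len(S), b + w))
    within = [S[i][j] for i in left for j in left]
    within += [S[i][j] for i in right for j in right]
    across = [S[i][j] for i in left for j in right]
    return sum(within) / len(within) - sum(across) / len(across)


def detect_events(times, scale=3600.0, w=3, threshold=0.5):
    """Split time-sorted photos into events; returns lists of photo indices."""
    S = similarity_matrix(times, scale)
    events, current = [], [0]
    for b in range(1, len(times)):
        if novelty(S, b, w) > threshold:
            events.append(current)
            current = []
        current.append(b)
    events.append(current)
    return events


# Capture times in seconds: three photos in one morning, three a day later.
times = [0, 600, 1200, 90000, 90300, 90900]
events = detect_events(times)
```

With these toy timestamps the only boundary score above the threshold is the one between the third and fourth photo, so two events are returned. A real system would combine this with content similarity and choose the number of events more carefully.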

Another common method for grouping photos is according to people. Manually assigning names to people in photos is tedious and time consuming. We make use of the FSCA face detection and recognition system to make this task easier. Face detection is highly accurate and is run automatically. Faces are cropped from the photos by orienting an ellipse based on the line segment between the eye positions. Users can view only the cropped faces from the photos by selecting the face view.

To determine the identity of people in the photos, we use the FSCA face recognition system. Unlike face detection, face recognition is not reliable. Variations in lighting, pose, and eyewear cause the system to make errors. Thus we provide a user interface for semi-automatic classification of faces according to person. Using the face view, the user selects one or more faces of a particular person to create a model. The user then asks the system to find similar faces. This results in a display of faces ordered by similarity to the model. The user selects correct faces, usually found near the top of the list, and adds them to the set of faces corresponding to the person. These faces are used to update the model, thus increasing accuracy on subsequent searches for more faces of the person. This process is repeated until all faces for people in the collection have been labeled.
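One round of such a select, rank and update loop might look like the sketch below. It is not the FSCA system: the face feature vectors are made-up two-dimensional values, the person model is assumed to be the centroid of the confirmed faces, and similarity is assumed to be Euclidean distance. All names are hypothetical.

```python
import math


def centroid(vectors):
    """Mean feature vector of the faces confirmed so far (the person 'model')."""
    n = len(vectors)
    return [sum(v[k] for v in vectors) / n for k in range(len(vectors[0]))]


def rank_by_similarity(model, faces):
    """Return face ids ordered from most to least similar to the model."""
    return sorted(faces, key=lambda fid: math.dist(model, faces[fid]))


# Hypothetical feature vectors for unlabeled faces.
unlabeled = {"f1": [0.9, 0.1], "f2": [0.2, 0.8], "f3": [0.8, 0.3]}
confirmed = [[1.0, 0.0]]          # faces the user already selected for this person

model = centroid(confirmed)
ranking = rank_by_similarity(model, unlabeled)

# The user confirms the top-ranked face; the model is updated for the next round.
confirmed.append(unlabeled.pop(ranking[0]))
model = centroid(confirmed)
```

Each confirmation moves the model towards the person's typical appearance, which is why later rounds tend to place correct faces nearer the top of the list.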

The FXPAL photo application also provides interesting ways to view photos. The Pan ‘n Scan animated slide show displays full screen photo images with background music. The slide show is animated by panning and zooming through each image using rules based on location of faces. Another way to view photos is as a Stained Glass image. Selected photos are cropped using detected faces and assembled in a collage with irregular boundaries similar to stained glass windows.

The photo application created by FXPAL and FSCA is available for user testing. Please contact the author for use permission.

Link: http://www.fxpal.com/?p=PhotoApplication

Please contact:
Lynn Wilcox, FX Palo Alto Laboratory, USA
Tel: +1 650 813 7574
E-mail: [email protected]

Managing Digital Photo Collections by Lynn Wilcox

The FXPAL photo application automatically organizes digital photo collections based on date, event, person, or place.

Figure 1: Event View in the photo organizer. Events have been automatically determined. Semantic event labels are assigned by the user.

Figure 2: Face view in the photo organizer, showing cropped faces from photos and their similarity to the photo labeled ‘Andreas’.


Multimedia indexing has become a general label to designate a large domain of activities ranging from image description to description languages, from speech recognition to ontology definition. Of course, these fields existed before the expression ‘multimedia indexing’ became popular, and most continue to have an independent existence. However, the rise of multimedia has forced people to try to mix them together in order to properly manage large collections of multimedia documents. The global goal of multimedia indexing is to describe documents automatically, especially those containing images, sounds or videos, allowing users to retrieve them from large collections, or to navigate these collections easily. Such documents, which used to be rare due to the price of acquisition devices and because of the memory required, are now flooding our digital environment thanks to camera-phones, webcams and digital cameras, as well as to the networks that allow the data to be widely shared. The question is no longer “How can I acquire a digital image?”, but rather “How can I retrieve the image I want?”

What Does Multimedia Change?
While it is possible to study images or audio tracks alone for some documents, such approaches appear to be very limited when applied to multimedia documents like TV streams. This limitation is twofold. First, users (who are not specialists or documentalists) would like to access such documents semantically; second, users face huge sets of documents. As a consequence, many techniques that reduce semantics to syntactic cues in the context of small sets of documents are no longer useful, and no single medium can provide acceptable access to document semantics.

If one considers a TV stream, it is apparent that images are not able to provide a lot of semantic information. The information that can be extracted from this medium includes segmentation information (shot detection, clustering of neighbouring shots), face detection and recognition capabilities, and text and logo detection. It is possible to do a lot more but only in very limited contexts, like news reports or sports broadcasts. In such contexts, syntactic cues like outdoor/indoor classifications have a pertinent semantic translation (anchor person/outdoor reports), but these tricks cannot be used in open contexts. The situation is similar in audio analysis. Cries and applause are good indications of interesting events in sport reports, but not in drama and films. On the other hand, audio can provide useful segmentation information (music or speech detection), speaker detection and recognition, key sound detection, or speech transcription capabilities. There may be several sources of interesting text, eg internal sources like closed captions, text included in the images, speech transcription or external sources such as program guides.

The Big Challenge: Mixing Media
The best way to describe a document is to make use of all the information it carries, and thus all the media it includes. Although this statement seems obvious, it nevertheless implies many practical difficulties. The various media within a document are not synchronized temporally and spatially: the speaker is not always visible on the TV screen, the text related to an image may not be the closest thing to this image, audio and video temporal segmentations have different borders. To make things worse, audio and video do not work at the same rate (100Hz for audio, and 24, 25 or 30Hz for video).
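The rate mismatch can be made concrete with a small sketch. Assuming audio descriptors are produced at 100 frames per second and video at an integer frame rate, the hypothetical helper below lists which audio frames start inside a given video frame. Exact fractions are used to avoid floating-point rounding at frame borders.

```python
import math
from fractions import Fraction


def audio_frames_for_video_frame(v, video_fps, audio_rate=100):
    """Indices of audio frames whose start time falls inside video frame v.

    Video frame v covers the interval [v / video_fps, (v + 1) / video_fps).
    `video_fps` and `audio_rate` are assumed to be integers.
    """
    first = math.ceil(Fraction(v * audio_rate, video_fps))
    last = math.ceil(Fraction((v + 1) * audio_rate, video_fps))
    return list(range(first, last))


at_25 = [len(audio_frames_for_video_frame(v, 25)) for v in range(3)]
at_30 = [len(audio_frames_for_video_frame(v, 30)) for v in range(3)]
```

At 25Hz every video frame receives exactly four audio frames, but at 30Hz the count alternates between four and three, so even this simplest alignment is irregular before any semantic desynchronization is considered.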

From a more general point of view, audio, video and text are studied using different backgrounds, which are not always easy to mix. Text requires natural language-processing tools that use data analysis or symbolic techniques, while image and audio are branches of signal processing and use a lot of statistical tools but in the continuous domain. Other domains like geometry are also used. Mixing all these tools in one integrated model is one facet of the problem.

Two common solutions to this problem exist in the literature. The first is to use the media in a sequential manner. One medium is used to detect some event, and another medium is then used to classify it. For example, audio can be used to find the most important events in a soccer game, while video is necessary to understand what kind of event it is. Such an approach does not require a theoretical framework, remains ad hoc and is not so difficult to implement, and is a good starting point for many problems. The second uses Hidden Markov Models (HMMs) to describe and recognize sequences of events. Markov models are commonly used in sound and image processing and are well suited to identifying sequences of events. This is thanks to the Viterbi algorithm, which is based on a dynamic programming approach and provides a globally optimal solution at a reasonable cost.
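For readers unfamiliar with it, the Viterbi recursion can be sketched as below. This is a generic textbook formulation in the log domain, not the authors' code; the two states, the two audio symbols and all probabilities are invented for illustration.

```python
import math


def viterbi(obs, states, log_start, log_trans, log_emit):
    """Most likely state sequence for `obs` under a discrete HMM.

    All model parameters are log-probabilities stored in nested dicts.
    """
    best = [{s: log_start[s] + log_emit[s][obs[0]] for s in states}]
    back = []
    for o in obs[1:]:
        scores, pointers = {}, {}
        for s in states:
            # Best predecessor of state s at this time step.
            prev = max(states, key=lambda p: best[-1][p] + log_trans[p][s])
            scores[s] = best[-1][prev] + log_trans[prev][s] + log_emit[s][o]
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


def logs(table):
    """Convert a dict of probabilities to log-probabilities."""
    return {k: math.log(v) for k, v in table.items()}


# Hypothetical toy model: is the match in a rally or in a break?
states = ["rally", "break"]
log_start = logs({"rally": 0.5, "break": 0.5})
log_trans = {"rally": logs({"rally": 0.8, "break": 0.2}),
             "break": logs({"rally": 0.2, "break": 0.8})}
log_emit = {"rally": logs({"loud": 0.9, "quiet": 0.1}),
            "break": logs({"loud": 0.2, "quiet": 0.8})}

path = viterbi(["loud", "loud", "quiet", "quiet"],
               states, log_start, log_trans, log_emit)
```

The cost is proportional to the number of observations times the square of the number of states, which is the "reasonable cost" that makes global decoding practical.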

Segment Models: A Promising Approach
In the context of multimedia documents like video streams, HMMs have strong limitations due to the fact that each state may correspond to one and only one observation. On the other hand, this observation can contain a visual and an audio part. In the context of video documents, this means that a single temporal

Multimedia Indexing: The Multimedia Challenge by Patrick Gros, Manolis Delakis and Guillaume Gravier

Multimedia indexing is a very active field of research, despite most works using only a single medium. This is mainly due to the fact that while they may be correlated, media are not strongly synchronized. Segment models appear to be a good candidate to manage such a desynchronization.


granularity must be chosen for the observations, and that sound information must be aligned with video units (images or shots) or vice versa. We used such models to recover the structure of tennis videos, and despite this limitation, the models performed well in terms of shot classification precision.

We propose using an enhanced version of these models called segment models (SMs). In these models, each state can accept a variable number of observations, this number (or its distribution) being a new parameter of the state. On the one hand, such a model allows a different number of visual and audio observations for a given audio-visual event. On the other hand, it adds some complexity to learning the conditional probabilities of the observations, and to identifying the duration of each state in the data streams. Our first results show that segment models can outperform Markov models. However, the main work is now to determine how much flexibility we can gain, and what can be done that was impossible before.
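The extra cost of identifying state durations can be seen in a sketch of segment-level decoding. The code below is a generic segmental variant of the Viterbi recursion under simplifying assumptions (single observation stream, uniform start, bounded duration, independent emissions inside a segment). It is not the authors' model, and every name and number in the toy example is hypothetical.

```python
import math

NEG_INF = float("-inf")


def segmental_viterbi(obs, states, log_trans, log_dur, seg_score, max_dur):
    """Best segmentation of obs into (state, start, end) segments.

    best[t][s] is the best log-score of any segmentation of obs[:t] whose
    last segment is labelled s; back[t][s] stores (duration, previous state).
    """
    T = len(obs)
    best = [{s: NEG_INF for s in states} for _ in range(T + 1)]
    back = [{s: None for s in states} for _ in range(T + 1)]
    for t in range(1, T + 1):
        for s in states:
            for d in range(1, min(max_dur, t) + 1):
                local = log_dur[s].get(d, NEG_INF) + seg_score(s, obs[t - d:t])
                if d == t:
                    # First segment: no predecessor (uniform start assumed).
                    candidates = [(local, None)]
                else:
                    candidates = [(best[t - d][p] + log_trans[p][s] + local, p)
                                  for p in states]
                for score, p in candidates:
                    if score > best[t][s]:
                        best[t][s] = score
                        back[t][s] = (d, p)
    # Trace back from the best final state.
    s = max(states, key=lambda q: best[T][q])
    t, segments = T, []
    while t > 0:
        d, p = back[t][s]
        segments.append((s, t - d, t))
        t, s = t - d, p
    return segments[::-1]


# Hypothetical toy model: two scene types, each emitting mostly its own symbol.
emit = {"A": {"a": 0.9, "b": 0.1}, "B": {"a": 0.1, "b": 0.9}}
states = ["A", "B"]
# A scene is never followed by a scene of the same type.
log_trans = {"A": {"A": NEG_INF, "B": 0.0}, "B": {"A": 0.0, "B": NEG_INF}}
# Durations of 1 to 4 observations are equally likely.
log_dur = {s: {d: math.log(0.25) for d in range(1, 5)} for s in states}


def seg_score(state, segment):
    """Log-likelihood of a whole segment under one state."""
    return sum(math.log(emit[state][o]) for o in segment)


segments = segmental_viterbi(list("aaabbb"), states, log_trans, log_dur,
                             seg_score, max_dur=4)
```

Compared with ordinary Viterbi decoding there is an additional loop over candidate durations, so decoding cost grows with the maximum segment length. In return, `seg_score` may look at a whole segment at once, which is where different numbers of audio and visual observations could be accommodated.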

We present the performance of HMMs and SMs on a test set of three tennis games. The task is to segment the complete video into predefined scenes, namely ‘first missed serve and exchange’, ‘exchange’, ‘replay’, and ‘break’. In this context, a state in SMs represents a complete scene rather than a shot, as in HMMs. Performance is measured in terms of the percentage of shots assigned the correct scene label (C), and recall (R) and precision (P) rates on scene boundary detection. We use shot-based audio descriptors in HMMs. The video part of the observations of a scene for SMs is modelled via HMMs, operating as observation scorers. The audio part is modelled by unigram models of audio events (SM1gram) that fuse audio descriptors as in HMMs but at the scene level, or by bigram models (SM2gram) that can capture a succession of audio events inside a scene.

Please contact:
Patrick Gros, IRISA – CNRS, France
Tel: +33 2 99 84 74 28
E-mail: [email protected]

Figure 1: Structure of an HMM for tennis video analysis.

Table 1: The performance of Hidden Markov Models (HMMs) and Segment Models (SMs) on a test set of three tennis games. C: percentage of shots assigned the correct scene label; P and R: precision and recall on scene boundary detection.

            C       P       R
HMMs        74.57   73.69   82.51
SM1gram     76.95   72.28   72.47
SM2gram     79.17   75.11   80.13

Recent years have seen a revolution in photography with a move away from analog film towards digital technologies. Many users of digital cameras are now accumulating very large numbers of personal digital photos. While digital storage offers ample capacity to store these collections, technology for managing digital photos has not kept pace with advances in capture and storage technologies. The MediAssist project at the Centre for Digital Video Processing (CDVP) at Dublin City University is developing applications to enable users to efficiently search their image archives.

Users often remember where and when a photo was taken, and indeed may recall this more clearly than the actual contents of a photo, so clearly context information of this type should be effective when searching digital photo archives. MediAssist applications use the capture and exploitation of context data as the basis for organising and searching personal photo collections. The time and date of photo capture are easily accessible from the camera, and this can be augmented with location information using coincident GPS data.

A typical operational scenario for a user using MediAssist tools would require a GPS enabled digital camera and, although available, these are currently expensive. While awaiting the arrival to market of consumer-grade digital

MediAssist: Managing Personal Digital Photo Archives by Noel Murphy, Cathal Gurrin and Gareth J. F. Jones

MediAssist is creating a range of applications to help people to manage their digital photo archives by employing context to automatically annotate photo collections.


cameras with integrated GPS, we use a separate GPS device and we automatically match photos with their location from a GPS tracklog. Photos are uploaded from the camera to a PC which then automatically annotates each photo. This annotation stage extends the time labels to include weekday, weekend, month, season and year of capture, and extends the GPS location labels to include the town, city/state and country. Data from the camera is used to determine whether the photo was taken indoors or outdoors. The MediAssist tools use external information sources to further annotate images based on the time and location of capture to include whether the environment was light or dark, and even the prevailing weather conditions. In addition, various automatic content analysis tools are employed to add further content to a digital photo archive. Based on other research activities within the CDVP, we have integrated technologies which annotate each photo to identify (re-occurring) faces and buildings.
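Matching a photo to a tracklog entry by timestamp, and deriving the extended time labels, could look roughly like this. The sketch is illustrative and not MediAssist code: the tracklog format, the five-minute tolerance `max_gap`, the function names and the Northern Hemisphere season mapping are all assumptions.

```python
from bisect import bisect_left
from datetime import datetime


def nearest_fix(tracklog, photo_time, max_gap=300):
    """Return the (time, lat, lon) fix closest in time to the photo, or None.

    `tracklog` is a list of (unix_time, lat, lon) tuples sorted by time;
    fixes further than `max_gap` seconds from the photo are rejected.
    """
    times = [t for t, _, _ in tracklog]
    i = bisect_left(times, photo_time)
    candidates = tracklog[max(0, i - 1):i + 1]   # the fix before and the fix after
    if not candidates:
        return None
    best = min(candidates, key=lambda fix: abs(fix[0] - photo_time))
    return best if abs(best[0] - photo_time) <= max_gap else None


def time_labels(taken):
    """Derive coarse time labels from a capture datetime."""
    seasons = {12: "winter", 1: "winter", 2: "winter",
               3: "spring", 4: "spring", 5: "spring",
               6: "summer", 7: "summer", 8: "summer",
               9: "autumn", 10: "autumn", 11: "autumn"}  # assumed mapping
    return {
        "year": taken.year,
        "month": taken.month,
        "season": seasons[taken.month],
        "day_type": "weekend" if taken.weekday() >= 5 else "weekday",
    }


# Hypothetical tracklog with one fix per minute.
tracklog = [(1000, 53.38, -6.25), (1060, 53.39, -6.26), (1120, 53.40, -6.27)]
fix = nearest_fix(tracklog, 1070)
labels = time_labels(datetime(2005, 7, 16, 14, 0))
```

Turning the matched coordinates into town, city/state and country labels would additionally need a gazetteer or reverse-geocoding service, which is outside this sketch.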

Once a user has their digital photos organized in the archive, MediAssist applications provide support for searching and browsing. We have two streams of research, producing a desktop and a mobile device interface.

Figure 1 shows the MediAssist desktop search application. Using this, the user can search for photos based on location and/or time of capture, and filter results based on light status, weather conditions, indoor or outdoor, and whether the photos contain people or buildings. Location based searching allows the user to select combinations of desired country, city, and town, while time based searching enables selection of a specific time period, eg summer, or time ranges, eg between February and April.

When examining the results of a search, a user is presented either with an exhaustive list of images or a list of events, where each event is a combination of photos taken at the same time and place, eg a birthday party, family outing, or trip to the zoo. Each event is represented in the desktop interface with up to five key photos automatically extracted from the event, to best represent the varied themes within the event. Selecting an event automatically displays all photos from that event.

In addition to the desktop interface, we are developing a mobile photo management application. This is based on the same underlying architecture as the desktop application, with an interface tailored to suit mobile devices.

The reduced screen space and limited options for interaction on mobile devices mean that user interaction with mobile devices should differ greatly from that of standard desktop applications. Figure 2 shows the interface to the MediAssist mobile application, which addresses interactivity issues by primarily presenting a user with a personalised list of recommended ‘favourite’ photos. As a secondary access methodology it also supports location and feature based searching of the user’s archive. The search window in the mobile device is hidden from view and only appears when the user wishes to search the photo archive. The search options are more limited than in the desktop environment, and are primarily based on location and easily selected useful features, such as season and weather. The result of a search is a list of events, with a single most representative photo chosen for each event. If a user wishes, they can further browse the event by selecting the orange arrow.

The current MediAssist applications will be enhanced to incorporate advances in the annotation technologies as these become available within the CDVP. These improvements will include advances in recognition of content features within the photos, and also extensions to the annotation process to include further external knowledge.

The support of the Informatics Directorate of Enterprise Ireland is gratefully acknowledged.

Link:
Centre for Digital Video Processing: http://www.cdvp.dcu.ie/

Please contact:
Noel Murphy, Centre for Digital Video Processing, Dublin City University / Irish Universities Consortium
E-mail: [email protected]

Figure 1: MediAssist desktop search application.

Figure 2: Mobile MediAssist photo management application.


The MUSCLE Network of Excellence aims at fostering close collaboration between research groups in multimedia data-mining and machine learning. Around 40 research groups are involved in the MUSCLE network, which has been in existence since March 2004.

The network aims to address two grand challenges in the area of multimedia learning: natural high-level interaction with multimedia databases and interpreting human behaviour in videos. In order to measure the progress towards meeting these challenges, a benchmarking initiative has been started within MUSCLE. The activities taking place within this benchmarking initiative are also open to research groups not directly involved in MUSCLE.

A very important component of a benchmarking initiative is test data. A repository for such data has been set up within MUSCLE (see link below). Most of the data in this repository is publicly available, while some is restricted to use by members of MUSCLE. The publicly available data includes: video sequences of artificially generated humans in natural scenes for evaluating motion detection and tracking algorithms; videos of different human gestures and videos of various types of basketball shot for evaluating human behaviour interpretation algorithms. A collection of 10 000 images of coins for evaluating classification algorithms and content-based image retrieval approaches is also available. This Coin Image Seibersdorf database (CIS) is a result of the changeover from 12 European currencies to the Euro. After the changeover, large volumes of mixed coin collections had to be returned properly sorted to the national banks of the originating countries. The database consists of roughly 2000 patterns (classes) of coins from many different countries. Additionally, there are 100 000 coin images collected during an automatic sorting process carried out at the ARC Seibersdorf research GmbH, which will later serve as test and benchmarking data. A carefully generated and manually verified ground truth accompanies the data. This makes the CIS database and benchmark definition ideally suited to a large scale evaluation of classification or object recognition algorithms.

All researchers in the multimedia information retrieval field are encouraged to contribute useful benchmarking data or software to this repository. This is particularly encouraged if results obtained by using this data have been published, as this allows other researchers to evaluate their algorithms on the same data. Furthermore, any ground truth or annotation for the data collections can also be contributed.

Apart from making benchmarking data available, a number of benchmarking campaigns are being organised. These campaigns are open to all groups doing research in multimedia retrieval. The first campaign is the MUSCLE coin classification competition. This is an educational initiative aimed at encouraging senior students interested in pattern recognition and machine learning. Participants in the competition will submit code implementing algorithms for classifying the coin database currently available on the benchmarking webpage. These algorithms will be tested on a part of the database which has not yet been made public, and the author of the best performing algorithm will receive a prize sponsored by MUSCLE. The call for participation in this competition will appear on the MUSCLE benchmarking webpage in autumn 2005.

An evaluation campaign aimed at benchmarking image retrieval is planned for 2006 in collaboration with the CLEF image retrieval track (ImageCLEF). The first step in this collaboration is a workshop that will be held on the 20th of September 2005 in Vienna, in conjunction with the CLEF workshop and the ECDL 2005 (European Conference on Digital Libraries). The workshop is aimed at stimulating discussion on the current state of image retrieval evaluation (with well-known researchers in this field) as well as planning the 2006 evaluation campaign. More information is available on the MUSCLE benchmarking webpage.

An evaluation campaign linked to MUSCLE is ImagEVAL, funded by the French Ministry of Research. This campaign consists of the following five tasks related to image retrieval: recognition of transformed images (rotation, translation, scaling, etc.); search for photographs illustrating a text using text and image analysis; extraction and recognition of text areas in an image; object detection; and automatic image classification (night/day, indoor/outdoor, city/nature, etc.).

The MUSCLE Benchmarking Initiative by Allan Hanbury and Michael Nölle

Evaluating the performance of multimedia retrieval algorithms is part of MUSCLE, an EU Network of Excellence administered by ERCIM.

Figure 1: Image from the CIS database (provided by ARC Seibersdorf Research GmbH).

Figure 2: Image from the basketball shots database (provided by INRIA-Vista).

Participation in these campaigns is encouraged as it will lead to an objective evaluation of the current state-of-the-art in the multimedia information retrieval research area. As the best results of these campaigns will be publicised, the groups having submitted the best-performing algorithms can benefit from the publicity.

Links:
MUSCLE NoE website: http://www.muscle-noe.org

MUSCLE benchmarking page: http://muscle.prip.tuwien.ac.at

Cross-Language Evaluation Forum (CLEF): http://www.clef-campaign.org/

ImageCLEF: http://ir.shef.ac.uk/imageclef

ImagEVAL competition: http://www.imageval.org

European Conference on Digital Libraries 2005 (ECDL): http://www.ecdl2005.org

Please contact:
Allan Hanbury, Technical University Vienna, PRIP, Austria
Tel: +43 1 58801 18359
E-mail: [email protected]

Michael Nölle, Quantumtechnology, ARC Seibersdorf Research, Austria
E-mail: [email protected]

Figure 3: Image from the motion detection database (provided by Advanced Computer Vision GmbH – ACV). The image containing a computer-generated human is shown on the left, and the ground truth image (actual position of the human) is shown on the right.

Content-based image retrieval involves searching a collection of images for those relevant to a given ‘query image’. The user submits an image (or sometimes only the description of an image) and wants to find the images in the collection that most closely match. Until now, researchers have been limited to small collections in developing and testing their content-based image retrieval techniques. The largest test collections currently used (such as the University of Columbia databases, or the Corel databases) contain from hundreds up to 60,000 images. To serve as a test collection, the image database must contain labeled images, where the labels show which image is relevant to which other image. Any image can then serve as a test query, and researchers can measure the performance of their system using values such as precision (the proportion of retrieved images that actually are relevant to the query) and recall (the proportion of relevant images in the database that the system retrieves).
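The two measures follow directly from these definitions. The short sketch below computes both for a single query; the function name and the image identifiers are hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Precision and recall of one query result.

    `retrieved` is the collection of image ids returned by the system;
    `relevant` is the collection of ids labelled as relevant to the query.
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# The system returns four images; three images in the collection are relevant.
precision, recall = precision_recall(["a", "b", "c", "d"], ["a", "c", "e"])
```

Here two of the four retrieved images are relevant (precision 0.5) and two of the three relevant images were found (recall 2/3). Averaging such values over many test queries is what requires a labelled collection.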

Within the EU-sponsored Network ofExcellence MUSCLE (MultimediaUnderstanding through Semantics,Computation and Learning), a newtestbed image collection of one milliontest images has been created. Thedatabase is called CLIC (for CEA ListImage Collection) and has beenproduced by the LIC2M team at theCEA, which is one of the MUSCLE part-ners. LIC2M stands for Laboratoired’Ingenérie de la ConnaissanceMutlimédia Mutlilingue, and is a labora-tory outside of Paris that specializes inimage and text processing in manylanguages.

The CLIC image collection contains labelled images, each of which can be used as a query image. The image collection was created by hand-labelling photographs that were donated to the project by colleagues. Any photographs containing identifiable persons were removed. The remaining 15,200 photos were classified into a shallow hierarchy:

New Testbed of One Million Images
by Gregory Grefenstette, Pierre-Alain Moëllic, Patrick Hède, Christophe Millet and Christian Fluhr

The Commissariat à l’Energie Atomique (CEA) in France has produced an image database of one million images that will allow researchers in content-based image retrieval to test their systems on a life-size collection.

Some images from classes of the kernel of the CLIC content-based image retrieval testbed.


SPECIAL THEME: Multimedia Informatics

16 ERCIM News No. 62, July 2005

• Food: images of food and meals
• Architecture: images of architecture, architectural details, castles, churches, Asian temples
• Arts: paintings, sculptures, stained glass, engravings
• Botanic: various plants, trees, flowers
• Linguistic: images containing text areas
• Mathematics: fractals
• Music: images of musical instruments
• Objects: images representing everyday objects such as coins, scissors etc
• Nature&Landscapes: landscapes, valleys, hills, deserts etc
• Society: images with people
• Sports&Games: stadiums, items from games and sports
• Symbols: iconic symbols, road signs, national flags (real and synthetic images)
• Technical: images involving transportation, robotics, computer science
• Textures: rock, sky, grass, wall, sand etc
• City: buildings, roads, streets etc.

These labelled images form the kernel of the collection (see the figure for examples). Each image in the kernel was then altered in 69 different ways. The transformations applied to each original kernel image included geometric transformations (such as rotation, translation, projection, and splitting), chromatic transformations (such as negative, saturation, black-and-white, and quantization) and various other transformations (such as low-pass filtering, noise addition, border addition, text overlay, mosaic, resizing, and edge outlining). The altered images were added to the collection with the same labels, thereby generating the one million labelled images in CLIC. Any image can be used as a query, in the knowledge that at least 69 other images are relevant to the query. The transformations were designed to cover the wide variety of alterations that occur in everyday image manipulation. For example, pictures are often slightly altered or cropped to avoid detection of copyright infringement.
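The arithmetic behind the collection size can be checked with a short sketch. It assumes that the 15,200 kernel images stay in the collection alongside their 69 altered versions; if only the altered versions were counted, the total would be 15,200 × 69 = 1,048,800, which is still roughly one million. The identifier scheme and the function name are hypothetical.

```python
KERNEL_SIZE = 15_200   # hand-labelled kernel images (figure from the article)
N_TRANSFORMS = 69      # altered versions generated per kernel image


def expand_kernel(kernel):
    """Yield (image_id, label) for every original and each altered copy.

    kernel: dict mapping kernel image id -> label.
    The id scheme "<id>/tNN" is hypothetical, for illustration only.
    """
    for image_id, label in kernel.items():
        yield image_id, label                     # the original image
        for k in range(1, N_TRANSFORMS + 1):
            yield f"{image_id}/t{k:02d}", label   # altered copy keeps the label


total = KERNEL_SIZE * (1 + N_TRANSFORMS)
print(total)  # 1064000

# Toy kernel of two images: every image has 69 siblings from the same source.
toy = dict(expand_kernel({"k0001": "Food", "k0002": "Arts"}))
same_source = [i for i in toy if i.startswith("k0001") and i != "k0001"]
print(len(toy), len(same_source))  # 140 69
```

Because the label is inherited, the ground truth for every query comes for free, and images from the same class of the hierarchy add further relevant matches.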

The database will be distributed to the research community through the MUSCLE Network of Excellence, and will prove useful for testing text-based and content-based image retrieval (using the labels), evaluating algorithm behaviour over large databases, testing the invariance of algorithms to transformations, automatic classification (using the hierarchy), object and person recognition, and the detection of text in images. The database will be distributed for research free of charge: please check the MUSCLE Web site for the latest information on its availability.

Link: http://www.muscle-noe.org

For further information about the CLIC testbed, please consult the following article:
Pierre-Alain Moëllic, Patrick Hède, Gregory Grefenstette, Christophe Millet, “Evaluating Content Based Image Retrieval Techniques with the One Million Images CLIC TestBed”, Proceedings of the Second World Enformatika Congress, WEC’05, February 25-27, 2005, Istanbul, Turkey, pp 171-174.

Please contact:
Gregory Grefenstette, Commissariat à l’Energie Atomique, France
Tel: +33 1 46 54 96 56
E-mail: [email protected]

A multimedia digital libraries project has been set up to address the complex issues related to media management, with a view to developing more effective approaches for searching and browsing multimedia content. The Multimedia Information Retrieval group at Imperial College London is collaborating with human-computer interaction experts from University College London, digital library experts from the Greenstone group at the University of Waikato, New Zealand, and media management experts at BT Research. The group is pulling together digital content from archives of the BBC, the British Library, the New Zealand Digital Library and the Victoria & Albert Museum.

This project has the following four main objectives:
• develop a query-by-example retrieval approach using automated content-based analysis, in conjunction with meta-data and text-based search when and if necessary
• devise new tailored search and browsing approaches appropriate for different media collections and users' needs
• reduce information overload by presenting and summarising search results in a semantically meaningful manner
• define new interfaces which can integrate and adapt to different media and user knowledge.

Managing the Growth of Multimedia Digital Content
by David Bainbridge, Paul Browne, Paul Cairns, Stefan Rüger and Li-Qun Xu

The growth in multimedia documents, including collections of photos, music and videos, has been phenomenal in recent years. Effective management of these multimedia collections has become a necessity for both businesses and the general public. Indeed, with the advent of large, cheap storage devices and increasingly widespread broadband connections, home storage systems are becoming a practical way for ordinary users to organise multimedia files. Current approaches, used mainly by fully trained professionals, are expensive and complicated. Can the media growth be effectively managed?


For example, we will automatically create new relations between multimedia objects (ie photos and video segments) by inserting links from each object to all those objects which are most similar under different criteria, thus creating a list of nearest neighbours that we call ‘lateral associations’. This generates a network which is eminently suitable for browsing and exploration. The right panel in the figure summarises a large image database by displaying all highly connected images in the network; they can serve as entry points for browsing. The lateral neighbours of a particular image are shown in the left panel. The bottom panel shows objects in their context, which may be temporal in the case of videos or semantic (eg same category or genre) in the case of image collections.
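The construction of such a network can be illustrated with a minimal sketch. It assumes that each similarity criterion is represented by one feature vector per object and that similarity is plain Euclidean distance; the criteria names, the toy vectors and both function names are hypothetical, and the project's actual features are not specified here.

```python
from collections import Counter
from math import dist


def lateral_neighbours(features, k=1):
    """Link every object to its k nearest neighbours under each criterion.

    features: dict criterion -> dict object_id -> feature vector
    Returns dict object_id -> set of linked object ids (union over criteria).
    """
    links = {}
    for vectors in features.values():
        for obj, vec in vectors.items():
            others = sorted(
                (o for o in vectors if o != obj),
                # Ties in distance are broken by object id for determinism.
                key=lambda o: (dist(vec, vectors[o]), o),
            )
            links.setdefault(obj, set()).update(others[:k])
    return links


def entry_points(links, top=1):
    """Return the objects with the most incoming links."""
    indegree = Counter(t for targets in links.values() for t in targets)
    return [obj for obj, _ in indegree.most_common(top)]


# Toy data: four objects described under two hypothetical criteria.
features = {
    "colour": {"a": (0,), "b": (1,), "c": (3,), "d": (10,)},
    "texture": {"a": (5,), "b": (4,), "c": (0,), "d": (3,)},
}
links = lateral_neighbours(features, k=1)
print(entry_points(links))  # ['b']
```

Objects that many others point to, such as "b" in the toy data, correspond to the highly connected images that the article proposes as entry points for browsing.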

Who really needs Multimedia Digital Library Management?
The beneficiaries of such a system include students and scholars, the medical profession with their specific databases and the general public. Organisations that create and manage digital libraries could also benefit; they include museums, art galleries, and photo & movie libraries. Indeed, even casual users are starting to see a need for an effective way to manage their growing collections of digital photos, music and personal movies.

The project is meant to have a defining impact on the way documents and information are presented and accessed in digital libraries. It aims to provide easy access to digitised material that is not yet fully annotated. Such methods are likely to ultimately change the way products, goods and services are presented in digital catalogues and how they are marketed on the Internet. They are expected to change the design of large-scale video databases and to open up a range of new revenue opportunities for museums, art galleries and image libraries.

Managing Future Multimedia Collections
Traditional libraries, digital or not, use meta-data, catalogues and classification systems to facilitate access to documents. Search engines such as Google have enhanced this process by indexing and searching entire documents to give access to the meta-data, ie the link to the document. This project intends to go a step further by providing visual access modes, eg through visual search boxes into which images can be dropped. These visual access modes will be offered in addition to traditional meta-data search in libraries and full-text search boxes. In this way, we amalgamate digital library functionality, as provided for example by the successful Greenstone Digital Library (http://www.greenstone.org), with content-based multimedia access.

As applications evolve over time, their functionality and user interface complexity tend to increase substantially. User interfaces do not generally offer different approaches for people with different levels of expertise (novice, intermediate or professional). Adaptable and responsive interfaces are important for searching and browsing media libraries.

Our Multimedia Collections
The development of the multimedia digital libraries project requires a diverse collection of media in order to design and evaluate appropriate search and browsing approaches according to users’ requirements. The varying levels of meta-data available within the collections will provide a realistic testbed for validating the underlying technologies. The test collections include the following:
• 50 hours of television content from the BBC
• 1 million page images from a selection of 200 newspaper titles supplied by the British Library
• the University of Waikato's Maori newspaper collection
• the V&A collection of 29,000 images with annotation
• Imperial College London's digital Shoebox collection of 6,000 personal photos.

The multimedia digital libraries project aims to solve the complex media management issues by offering tailored browsing and search access that is adaptable to users’ needs as well as to different digital library collections.

Software that arises from this project will become open source and will be disseminated under the GNU General Public License.

Link: http://mmir.doc.ic.ac.uk/pr-mmdl-2005/

Please contact:
Stefan Rüger, Imperial College London, UK
E-mail: [email protected]

Interface design for lateral browsing.


Music is becoming one of the dominant goods in internet traffic and private storage, with corresponding business models slowly starting to catch up. Yet, to select from this wealth of music, to choose which songs to listen to in a specific situation or particular mood, we are still forced to turn to cumbersome clicking, to manually sort titles into playlists, to listen only to pre-compiled albums, or to revert to the dullest automatic selection method, namely random play.

In order to enjoy the plethora of music available, new techniques and interfaces are required that will free us from the burdensome selection of tracks. While private music collections and audio players are desperately in need of more intuitive methods of organizing and easily selecting music, commercial vendors most probably are as well. This need is evident from the popularity of currently available recommendation techniques, such as the dominant “customers who bought this album also bought this” style of recommendation. Allowing customers to casually select and browse sections of vast stocks of audio tracks and discover titles they didn't know before requires a different approach.

To this end we are developing methods that allow us to organize audio repositories by the way we perceive music, grouping audio tracks by their perceived acoustic similarity. The SOM-enhanced JukeBox system (SOMeJB) provides automatic indexing and organization of music repositories based on the perceived sound similarity of single tracks. A map metaphor is used for visualization, with similar songs being placed into similar regions on the map.

Using a variety of feature extraction techniques, the audio signal in the form of WAV or MP3 files is analysed to extract representations that allow us to compute the perceived similarity of two pieces of music. Specifically, we use, amongst other statistical features, Rhythm Patterns, which model the amplitude modulation frequency in different frequency bands while incorporating a range of psycho-acoustic transformations. In a two-stage feature extraction process, the specific loudness sensation in different frequency bands is first computed, and is then transformed into a time-invariant representation based on the modulation frequency. These features describe the complex rhythmic interactions that are characteristic of different musical styles.

On top of this, we can apply standard information retrieval techniques to search for specific pieces of music, or for tracks from a certain musical genre. Classifiers can be trained to sort audio into pre-defined genre categories. With the SOMeJB system we go one step further by clustering individual audio tracks using a self-organizing map, allowing us to overcome traditional genre boundaries. This system provides a mapping of the high-dimensional feature space into a two-dimensional map space that can be conveniently explored. Different visualization techniques, such as Smoothed Data Histograms, reveal the cluster boundaries on the map and result in ‘Islands of Music’ being depicted, with each island containing music of a specific type or style. Vector field visualizations on top of these produce weather charts that help users to interpret the map, telling them where more aggressive or quieter music is located.
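The clustering step can be illustrated with a textbook self-organizing map in plain Python. This is a minimal sketch under stated assumptions, not the SOMeJB implementation: the toy two-dimensional vectors stand in for the Rhythm Patterns features, and the function names, map size and learning parameters are hypothetical.

```python
import random
from math import dist, exp


def best_unit(weights, x):
    """Return the map unit (row, col) whose weight vector is closest to x."""
    return min(weights, key=lambda u: (dist(weights[u], x), u))


def train_som(data, rows, cols, epochs=50, lr0=0.5, radius0=None, seed=0):
    """Train a small rectangular SOM; returns dict (row, col) -> weight list."""
    rng = random.Random(seed)
    dim = len(data[0])
    radius0 = radius0 or max(rows, cols) / 2
    # One weight vector per map unit, initialised from random data samples.
    weights = {(r, c): list(rng.choice(data))
               for r in range(rows) for c in range(cols)}
    steps = epochs * len(data)
    t = 0
    for _ in range(epochs):
        for x in rng.sample(data, len(data)):       # shuffled pass over the data
            frac = t / steps
            lr = lr0 * (1 - frac)                   # learning rate decays to 0
            radius = max(radius0 * (1 - frac), 0.5) # neighbourhood shrinks
            bmu = best_unit(weights, x)
            for unit, w in weights.items():
                # Gaussian neighbourhood around the best-matching unit.
                h = exp(-dist(unit, bmu) ** 2 / (2 * radius ** 2))
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
            t += 1
    return weights


# Toy feature vectors standing in for real audio descriptors.
tracks = {
    "soft piano": (0.1, 0.2),
    "quiet strings": (0.2, 0.1),
    "hard rock": (0.9, 0.8),
    "metal": (1.0, 0.9),
}
som = train_som(list(tracks.values()), rows=3, cols=3)
for title, vec in tracks.items():
    print(title, "->", best_unit(som, vec))
```

After training, each track is assigned to the unit returned by `best_unit`, so tracks with similar feature vectors tend to land on the same or on neighbouring units of the map.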

With the PlaySOM and the PocketSOMPlayer, we have added two novel interface modules that allow us to browse a music collection by navigating a map of clustered music tracks and to select regions of interest containing similar tracks for playing. The PlaySOM system is primarily designed to allow interaction via a large, preferably touch-screen device, whereas the PocketSOMPlayer is implemented for mobile devices, supporting both local and streamed audio replay, or acting as a remote control for the audio server.

Music is selected simply by marking an area on the map or, in a slightly more sophisticated fashion, by drawing a trajectory on the map, along which music is played. This allows us to start

Maps of Music
by Andreas Rauber, Thomas Lidy and Robert Neumayer

Manually sorting and searching through directory structures is no way to enjoy music, no matter how much we can pack onto a single, tiny device. Players should be ‘intelligent’ enough to organize the music for us, allowing us to pick only what we want to listen to.

Figure 1: PlaySOM playlist selection by trajectory on a tablet PC.

Figure 2: PocketSOMPlayer running on an iPAQ.


from, say, some soft classical piano music, move on to somewhat more dynamic orchestral pieces, before returning again to softer violin pieces. Or, in a different area of the map, we can start with some electronic music, move on to more dynamic pop and rock pieces, and gradually increase in dynamics to reach the metal area on the map.
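Selecting music along a trajectory can be sketched as follows. The sketch assumes that every track has already been assigned to a map unit and that the drawn path has been converted into a sequence of units; the function name and the track titles are hypothetical, and the actual PlaySOM selection logic may differ.

```python
def playlist_from_trajectory(track_units, trajectory):
    """Collect tracks from the map units visited by a trajectory, in order.

    track_units: dict track title -> (row, col) unit on the map
    trajectory:  list of (row, col) units in the order the user drew them
    """
    by_unit = {}
    for title, unit in track_units.items():
        by_unit.setdefault(unit, []).append(title)
    playlist, seen = [], set()
    for unit in trajectory:
        if unit in seen:   # skip units that the path crosses a second time
            continue
        seen.add(unit)
        playlist.extend(sorted(by_unit.get(unit, [])))
    return playlist


track_units = {
    "piano nocturne": (0, 0),
    "string adagio": (0, 0),
    "orchestral suite": (0, 1),
    "violin sonata": (1, 1),
    "metal anthem": (3, 3),
}
path = [(0, 0), (0, 1), (1, 1), (0, 1)]
print(playlist_from_trajectory(track_units, path))
# ['piano nocturne', 'string adagio', 'orchestral suite', 'violin sonata']
```

Marking an area instead of a path amounts to the same lookup with an unordered set of units.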

This approach allows users to quickly explore the range of music available and to effortlessly select styles of music according to their preferences, simply by selecting the appropriate areas on the map. Current work is focusing on further improvements in the feature extraction process to capture the perceived characteristics of audio even better, as well as on methods for integrating different views of and interaction possibilities with large music collections.

Part of this work was supported by the European Union in the 6th Framework Programme, IST, through the Networks of Excellence ‘DELOS’ on Digital Libraries, and ‘MUSCLE’ on Multimedia Understanding through Semantics, Computation and Learning.

Link: http://www.ifs.tuwien.ac.at/~andi/somejb

Please contact:
Andreas Rauber, Vienna University of Technology, Austria
E-mail: [email protected]

Interfaces to textual-document libraries are improving, but search and browsing interfaces in multimedia-document libraries are still in the early stages of development. Most existing systems are mono-modal and allow searching for either images, videos or sound. For this reason, much current research in image and video analysis is focusing on automatically creating indexes and pictorial video summaries to help users browse through multimedia corpuses. However, such methods are often based on low-level visual features and lack semantic information. Other research projects use language-understanding techniques or text captions derived from OCR, in order to create more powerful indexes and search mechanisms. Our assumption is that in a large proportion of multimedia applications (eg lectures, meetings, news etc), classical printed documents or their electronic counterparts (referred to by the term ‘printable’) play a central role in the thematic structure of discussions.

Unlike other multimedia data, static documents are highly thematic and structured, and thus relatively easy to index and retrieve. Documents carry a variety of structures that can be useful for indexing and structuring multimedia archives, but such structures are often hard to extract from audio or video. It is therefore essential to find links between documents and multimodal annotations of meeting data, such as audio and video.

Recently a significant research trend has emerged toward recording and analysing meetings. This is done mostly in order to advance research on multimodal content analysis and multimedia information retrieval, which are key features for designing future communication systems. Many research projects aim at archiving recordings of meetings in forms suitable for later browsing and retrieval. However, most of these projects do not take into account the printed documents that often form part of the information available during a meeting. We believe printable documents could provide a natural and thematic means for browsing and searching through large multimedia repositories.

For this reason, we have designed and implemented a tool that automatically extracts the hidden structures contained in PDF documents. The semantics of the information behind layout and logical structures is largely underestimated and we believe their extraction can drastically improve both document indexing and retrieval, and linking with other media.

In order to browse multimedia corpuses using documents as interfaces, it is necessary to build links between printable documents, which are inherently non-temporal, and other temporal media. We use the term ‘temporal document alignment’ to refer to the operation of extracting the relationships between a document excerpt at variable granularity levels, and the meeting presentation time. Temporal document alignment creates links between document extracts and the time intervals in which they were in either the speech focus or the visual focus. It is thus possible to align document parts with audio and video extracts, and by extension with any annotation of audio, video and/or gesture.
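One way to picture the result of temporal document alignment is as a set of records that tie a document excerpt to a time interval and a focus type. The following minimal sketch assumes such a record layout; the class, the field names and the excerpt identifiers are hypothetical and are not taken from the authors' tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alignment:
    excerpt: str   # identifier of a document part (hypothetical scheme)
    start: float   # seconds from the start of the meeting
    end: float
    focus: str     # "speech" or "visual"


def excerpts_in_focus(alignments, t):
    """Document excerpts that were in focus at meeting time t."""
    return [a.excerpt for a in alignments if a.start <= t < a.end]


def intervals_for(alignments, excerpt):
    """Time intervals during which a given excerpt was in focus."""
    return sorted((a.start, a.end) for a in alignments if a.excerpt == excerpt)


alignments = [
    Alignment("article-3/para-2", 120.0, 185.0, "speech"),
    Alignment("article-3/figure-1", 150.0, 210.0, "visual"),
    Alignment("article-3/para-2", 400.0, 430.0, "speech"),
]
print(excerpts_in_focus(alignments, 160.0))
# ['article-3/para-2', 'article-3/figure-1']
print(intervals_for(alignments, "article-3/para-2"))
# [(120.0, 185.0), (400.0, 430.0)]
```

The first lookup goes from a moment in the recording to the documents being discussed; the second goes from a document part to the audio and video extracts that can be played back.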

Structuring Multimedia Archives with Static Documents
by Denis Lalanne and Rolf Ingold

If we consider static documents as structured and thematic vectors towards multimedia archives, they can be used as a tool for structuring events such as meetings. Here we present a method for bridging the gap between static documents and temporal multimedia data, such as audio and video. This is achieved by first extracting electronic document structures, then aligning them with multimedia meeting data, and finally using them as interfaces to access multimedia archives.


In the FRIDOC multimedia browser that we have developed, users can first search at a cross-meeting level by typing in a set of keywords: this will retrieve all relevant documents. Clicking on a document or an article then allows users to view the related multimedia data attached to this element and to jump directly to the portions of meetings in which it was in focus. At the intra-meeting level, all the components (documents, audio/video, transcription and annotations) are synchronized through the meeting time, thanks to the document alignments; clicking on one of them causes all the components to display their content for the same point in time. For instance, clicking on a journal article cues audio/video clips from the time at which it was discussed, cues the speech transcription from the same time period, and displays the document that was projected.

This work demonstrates the role of static documents as structured and thematic vectors towards multimedia archives and proposes a method for bridging the gap between static documents and multimedia meeting archives. The results obtained so far through user evaluations suggest that documents are an efficient means of accessing multimedia corpuses, such as multimedia meeting repositories or multimedia conference archives.

Link: http://diuf.unifr.ch/im2/

Please contact:
Rolf Ingold, University of Fribourg, Switzerland
E-mail: [email protected]

Multimedia meeting.

The amount of multimedia information being captured and produced is constantly increasing. While providing the right information to a user is already difficult for structured information, it is much harder in the case of multimedia information. When multimedia collections become large, complete manual annotation is no longer an option. As a consequence, automatic indexing of multimedia is becoming an essential ingredient in any modern information system.

The state of the art in automatic video indexing is evaluated in the yearly TRECVID, an international video retrieval benchmark that focuses on news data. Our Mediamill team (UvA/TNO) has participated in all editions. Much progress has already been made in this field, and the performance of automatic indexing techniques has proven to be useful for interactive retrieval. In TRECVID we have shown that for successful indexing, all information about the data should be employed. For news video this means combining information from both the speech and the visual channel. Furthermore, analysis should not be restricted to the content of the two channels, but should also consider how the data is captured and what recurring use of style can be observed. For TRECVID 2004 we indexed 32 concepts, and we are now scaling up to 50-100. From there we use ontologies to scale up the number of concepts for which an index is available by orders of magnitude.

Ultimately this will lead to automatic annotation both for produced video, like news and film, and non-produced video captured with a security camera or by someone walking around with a camera. In the latter case, the user can employ the speech channel for spoken annotation. Further, it is clear that the use of video restricts us to a two-dimensional representation of the world. Ideally, the three-dimensional world and the objects within it could also be stored in the database. We are working on 3D reconstruction methods from video for this purpose.

In the end, our information systems will be filled with large collections of images, videos, and 3D worlds, together with annotations of these data items. Deciding what to present to the user depends on a number of different factors. What is the device the person is using? Is the user sitting behind her PC in her office, or is she walking around in the

MultimediaN: Personalized Information Delivery
by Marcel Worring and Nellie Schipper

How can we make the best use of the abundance of multimedia information? The multidisciplinary PID project in the Netherlands is developing methods for capturing information, and then automatically indexing and presenting it to users in an optimal way.


field with her PDA? Furthermore, it depends on the task being performed and the context in which it is performed. A cognitive engineering approach is vital, in order to provide the user with the right information by taking into account the task, context and user capabilities.

Clearly only a multidisciplinary approach can bring together all of the above. Experts are needed in computer vision, machine learning, information systems, information visualization and human-computer interaction. These different disciplines have been brought together in the PID project.

Context
The PID project is part of the large-scale MultimediaN project funded by the Dutch government. It started on 1st April, 2004, and will have a total duration of four years. The PID consortium consists of research institutes (University of Amsterdam, TNO), system integrators (LogicaCMG, Compano/Ziuz) and application holders (Dutch Olympic Committee, Dutch Forensic Institute, the police). Thus, the project covers the whole chain from research to applications.

Applications
To study the above methodologies, a number of concrete applications are being pursued. For each, the whole chain from data capture to presentation is considered, but each application places its emphasis on one of the elements. The application being developed with the Dutch Forensic Institute is the 3D reconstruction of crime scenes using video cameras. Indexing of the crime scene will be performed using a combination of speech and visual analysis. A project in collaboration with the Dutch Olympic Committee is developing a personal coach, which captures the 3D movement of athletes and combines this information with data from other sensors, such as heart rate monitors. This information is then used for tracking and improving the athletes’ performance. For the police, the emphasis is on the cognitive side, aiming at equipping police on the job with attentive mobile devices that provide them with relevant information. Finally, home videos are considered, where the emphasis lies on creating summaries of the data and finding relations within the data.

Link: http://www.multimedian.nl

Please contact:
Marcel Worring, University of Amsterdam, The Netherlands
Tel: +31 20 5257521
E-mail: [email protected]

Nellie Schipper, TNO, The Netherlands
E-mail: [email protected]

Various applications of personalized information delivery.

Multimedia searching is never a goal in itself; in reality it is embedded in user tasks. These vary in complexity, in the collections accessed, and in their contextual parameters. For example, a film maker may seek a suitable shot to include in a documentary, a teacher might look for an animation to illustrate a lecture, a group of friends may search for photos to accompany their stories told at a wedding party, while a DJ spinning his records in a popular night club might try to find that perfect blues lick to sample over his beats.

It is to be expected that a search strategy that works well in one scenario will not necessarily be the best choice for another. Ideally, each user task would be matched by a specialized retrieval strategy. In other words, a retrieval engine should be context-aware, or at least be adaptable to the context in which it will operate. For example, while the film maker could find relevant shots in a national archive, those looking for wedding photos would be better served on the Web, or in their friends’ collective folders of digital photos. In addition, while the teacher may be satisfied with a

SPIEGLE: A Multimedia Search Engine Generator
by Arjen de Vries

When the efficiency of multimedia search system engineering is improved, multimedia search effectiveness will follow. The Semantic Multimedia Access project in the Netherlands has developed ‘Spiegle’, a parameterized search-engine generator. It generates a search system specialized for a particular set of circumstances, including user, form of data collection and background. The Netherlands Institute for Sound and Vision, the national audio-visual archive, and Van Dale publishers will test the system.


familiar example, the best answers to the DJ’s query would be unusual musical samples.

Researchers from CWI, the University of Amsterdam and the University of Twente are collaborating on MultimediaN’s Semantic Multimedia Access project (also known as MN-N5) to create new technology that eases this adaptation of a search system to the user task. The project's main goal is to develop Spiegle, a ‘parameterized search engine generator’. Spiegle takes two inputs: firstly, a collection schema and secondly, a declarative specification of a retrieval strategy suitable for the user search task. It then generates a search system specialized for the particular context at hand, including the combination of this user, this collection, and the specific background knowledge available.

Spiegle combines the results of two existing research lines, both based upon probabilistic methods for information retrieval. The first building block is the TIJAH structured document retrieval system, developed in the Cirquid project. TIJAH retrieves document components from XML documents, matching on both content and structure. We are currently extending its text retrieval models with the probabilistic models developed for image and video retrieval, and will use the resulting system for multimedia search in both TRECVID and INEX 2005.

The second building block is the RAM database front-end for processing queries over arrays. RAM (Relational Array Mapping) was originally developed to express retrieval models involving multimedia content. A recent article by Roelleke and others has demonstrated how to express many well-known retrieval models in a general matrix framework. The corresponding matrix expressions are easily expressed in RAM, so it seems the ideal starting point for specifying search strategies declaratively.
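To give a flavour of what expressing a retrieval model as matrix operations means, the following sketch scores documents as the product of a weighted document-term matrix and a query vector. It is written in plain Python, not in RAM's own syntax, and TF-IDF is chosen here only as a familiar example; it is not claimed to be one of the models used in Spiegle. The function name and the toy matrix are hypothetical.

```python
from math import log


def tf_idf_scores(doc_term, query):
    """Score documents as (TF-IDF weighted document-term matrix) x query.

    doc_term: list of rows, one per document, holding term frequencies
    query:    list with one weight per term (same column order)
    """
    n_docs = len(doc_term)
    n_terms = len(doc_term[0])
    # Document frequency: the number of documents in which each term occurs.
    df = [sum(1 for row in doc_term if row[j] > 0) for j in range(n_terms)]
    idf = [log(n_docs / d) if d else 0.0 for d in df]
    # W[i][j] = tf[i][j] * idf[j]; the score vector is the product W . q
    return [
        sum(row[j] * idf[j] * query[j] for j in range(n_terms))
        for row in doc_term
    ]


doc_term = [
    [2, 0, 1],   # document 0
    [0, 1, 1],   # document 1
    [1, 1, 0],   # document 2
]
query = [1, 0, 0]  # the query contains only the first term
scores = tf_idf_scores(doc_term, query)
ranking = sorted(range(len(scores)), key=lambda i: -scores[i])
print(ranking)  # [0, 2, 1]
```

Once a model is written in this form, changing the retrieval strategy amounts to changing the matrix expression, which is the property that makes a declarative array language attractive for specifying search strategies.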

Consequently, the Spiegle parameterized search engine generator can be realized by integrating TIJAH and RAM. TIJAH provides the techniques to adapt the search system to different collections and background knowledge, and RAM provides the declarative language for specifying the retrieval model. Since both are implemented as front-ends on the same database back-end (the open source database system MonetDB), this should be feasible without too many complications.

Search systems generated with Spiegle will be put to the test with a variety of search tasks in scientific evaluations such as TREC, TRECVID, INEX, and CLEF. Perhaps more interestingly though, the project also involves various end-user organisations, including the Netherlands Institute for Sound and Vision, and Van Dale publishers, who offer great case studies for further validation of our research results.

Sound and Vision is not only the business archive of the national broadcasting corporations, but also a cultural history institute and a unique media experience for its visitors. The institute intends to open its archive to program makers and researchers, as well as for educational purposes. Using our technology, it should be easier to support each of these user groups with search systems specialized to their search tasks. We also hope to reduce the burden of annotation by integrating search functionality into the annotation process.

Our work with Van Dale, a prominent publisher of dictionaries, demonstrates that project results are not limited to the 'multimedia search engine'. Both detection and tracking are important forms of searching, and we have applied our technology to track the development of language in terms of word usage and the shifting of meaning over time. We think the insights resulting from this project are also applicable in searching collections that span decades of text data.

In summary, the main innovation of this project is its goal of working toward a system architecture that accommodates different types of searching, using different sets of a priori knowledge and exhibiting a varying degree of heterogeneity (or homogeneity). This will considerably simplify the comparison of different types of retrieval model instantiations on a series of search problems.

Links:
http://monetdb.cwi.nl/projects/trecvid/MN5/
http://www.spiegle.nl/

Please contact:
Arjen de Vries, CWI, The Netherlands
Tel: +31 20 592 4306
E-mail: [email protected]

The Willem Frederik Hermans (WFH) portal contains the complete primary and secondary bibliography, the author's huge private archive, as well as interviews broadcast on radio and television. Courtesy Willem Frederik Hermans instituut (WFHi) and the Stichting DBNL (Digitale Bibliotheek voor de Nederlandse Letteren).


SPECIAL THEME: Multimedia Informatics

ERCIM News No. 62, July 2005 23

While the consumer market provides us with increasing possibilities to create rich media content (cameras, smart phones etc), the ‘professional’ content industries (broadcasters, news agencies etc) continue to digitize their complete content life-cycle management processes. This results in a semantic gap between the ease of content creation and the need for the utilization of content in a context-aware, individually tailored way.

The Smart Content Factory

In October 2003 a group of Austrian technology and science partners (X-Art ProDivision, ORF, Joanneum Research, coordinator: Salzburg Research) started a project entitled ‘Smart Content Factory’ (the Factory). The project aims to develop a knowledge-based infrastructure for search and retrieval in an audiovisual archive. The approach of the project is not to reinvent existing media management technology, but to create a framework superimposing a ‘semantic layer’ on top of state-of-the-art technology. The general objective of the project is to define a system architecture supporting a wide range of ‘knowledge-intensive’ user scenarios for the utilization of rich media content in the business to business (B2B) and business to consumer (B2C) areas.

The project is one of the lead projects at Salzburg NewMediaLab (SNML), the Austrian research centre in the area of digital content engineering.

Objectives and Results

This section outlines the project objectives and shows the approach the project team took during the prototype phase to meet the requirements.

Re-use of existing content: By using content ‘as is’ there is no need to reformat or transform existing content into new data structures. Hence there is no need for redundancy of high-volume data storage. All operations related to the media clips use unique references to the media objects.

Aspects of integration: The Factory’s indexing components offer interfaces for integration into the content production workflow. For example, when a new digital video editing system is introduced, the Factory will be informed of newly created content and the indexing process will be triggered as soon as new content is published. Currently the Factory supports an indexing pipeline consisting of two steps: a primary content-based indexing and a secondary semantic indexing using the concepts of the knowledge base.

Interoperability: The Factory makes use of state-of-the-art digital asset management technology. Wherever possible, standard-based interface layers are used to keep the core of the Factory independent of the underlying data and knowledge layer. The Factory uses Virage’s VS Archive™, a media management system offering powerful content-based indexing features and a well documented programming interface. In the course of the project an MPEG-7 interface was created to avoid dependencies on the proprietary system.

Adaptability and extensibility: The Factory will be adaptable and extensible to various domain knowledge models. Currently three ‘pluggable’ knowledge models form the knowledge base of the Smart Content Factory: (i) the ‘Geoname Thesaurus’ developed by Salzburg Research; (ii) the IPTC thesaurus provided by the International Press and Telecommunications Council; and (iii) a service of (German) synonyms provided by the University of Leipzig. The first two of these are represented in RDF according to the SKOS Core Meta Model (W3C).

Usability: The Factory makes use of advanced user interface paradigms for easy exploration and navigation in the repository, including reasoning and recommendations. In the prototype, hyperbolic tree navigation (based on TouchGraph) is offered for browsing the knowledge models. The detailed result view (‘video

Towards a ‘Smart Content Factory’
by Georg Güntner

A project at Salzburg NewMediaLab (SNML) introduces a pragmatic approach to the implementation of a knowledge-based infrastructure for search and retrieval in audiovisual repositories.

The prototype of the Smart Content Factory at work (April 2005).


summary’) offers a synchronized presentation of the video-, audio-, speech-, keyframe- and stripe image-track.
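The two-step indexing pipeline described under ‘Aspects of integration’ can be sketched roughly as follows. This is a minimal illustration in Python, not the Factory’s actual interface: the record structure, the tiny knowledge base, the concept identifiers and every function name are assumptions made for the example. Note that the record only holds a reference to the media object, in line with the re-use objective.

```python
from dataclasses import dataclass, field


@dataclass
class MediaRecord:
    media_ref: str  # unique reference; the media object itself is never copied
    transcript: str
    content_index: dict = field(default_factory=dict)
    concepts: list = field(default_factory=list)


# Hypothetical knowledge base: surface term -> concept identifier.
KNOWLEDGE_BASE = {
    "salzburg": "geo:Salzburg",
    "election": "iptc:politics/election",
}


def content_based_indexing(record):
    """Step 1: stand-in for content-based indexing (here: a lowercase word list)."""
    words = [w.strip(".,").lower() for w in record.transcript.split()]
    record.content_index["words"] = words
    return record


def semantic_indexing(record):
    """Step 2: attach knowledge-base concepts found in the content index."""
    words = record.content_index["words"]
    record.concepts = sorted({KNOWLEDGE_BASE[w] for w in words if w in KNOWLEDGE_BASE})
    return record


PIPELINE = [content_based_indexing, semantic_indexing]


def on_content_published(media_ref, transcript):
    """Called when the production workflow reports newly published content."""
    record = MediaRecord(media_ref, transcript)
    for step in PIPELINE:
        record = step(record)
    return record


record = on_content_published("clip-001", "Election results from Salzburg.")
print(record.concepts)
```

Keeping the steps in a list makes it straightforward to plug in a further indexing stage later without changing the trigger.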

Outlook

The project has now delivered a first vertical prototype for a knowledge-based search and retrieval scenario in an audiovisual archive. Future work will be dedicated to the extension of the core infrastructure to the needs of other ‘knowledge-intensive’ scenarios, such as the automatic creation of a summary of historical video clips related to a certain event or person. We therefore plan to integrate additional knowledge models (eg temporal categories and categories related to Austrian history and policy), and to adapt the content retrieval methods to the user’s intuitive understanding of ‘generalization’ and ‘specialization’ (using inferencing and reasoning based on the knowledge models).
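One simple reading of ‘generalization’ and ‘specialization’ over a thesaurus is traversal of broader and narrower relations, as found in SKOS-style knowledge models. The sketch below assumes a hypothetical thesaurus fragment in which each concept has at most one broader concept; the concept names and function names are illustrative and not taken from the Geoname Thesaurus.

```python
# Hypothetical thesaurus fragment: concept -> directly narrower concepts.
NARROWER = {
    "Austria": ["Salzburg (state)", "Tyrol"],
    "Salzburg (state)": ["Salzburg (city)", "Hallein"],
    "Tyrol": ["Innsbruck"],
}


def specialize(concept, narrower=NARROWER):
    """Return the concept followed by all transitively narrower concepts."""
    result, seen, stack = [], set(), [concept]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        result.append(current)
        # Reverse so that children are visited in their listed order.
        stack.extend(reversed(narrower.get(current, [])))
    return result


def generalize(concept, narrower=NARROWER):
    """Return the chain of broader concepts, nearest first."""
    broader = {child: parent for parent, children in narrower.items() for child in children}
    chain = []
    while concept in broader:
        concept = broader[concept]
        chain.append(concept)
    return chain


print(specialize("Salzburg (state)"))
print(generalize("Hallein"))
```

A query for a region could then be expanded with `specialize` so that clips annotated with a town inside it are also found.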

The prototype has shown that our approach to integrating and extending existing technology with semantic features is feasible. However complex it seems to implement a fully featured, knowledge-based infrastructure for audiovisual repositories (and some earlier projects got lost in this complexity), one must start with manageable and pragmatic steps to approach the vision of a ‘Smart Content Factory’.

Links:
http://www.newmedialab.at/
http://www.iptc.org/
http://www.touchgraph.com/
http://www.virage.com/
http://www.w3.org/2004/02/skos/

Please contact:
Georg Güntner, Salzburg NewMediaLab, Austria
Tel: +43 662 2288 400
E-mail: [email protected]

The Centre for Digital Video Processing at Dublin City University has been carrying out leading research into video analysis and structuring since the mid-1990s, with a view towards supporting content-based operations such as summarisation (movies), highlight detection (sports) and searching/browsing (TV news). Part of this work has culminated in the Físchlár-News system as part of a project which commenced in 2000. The resulting system captures broadcast TV news nightly, and automatically analyses and structures the entire broadcast into an MPEG-7 annotation. These analysis processes include shot boundary detection, detection of keyframes, identification of the exact start/end of the program, TV advertisement detection, speech/music discrimination and automatic detection of anchorperson shots. The outputs of these analyses are fed into a trained Support Vector Machine (SVM) which segments the broadcast into discrete news stories which we use as the units for retrieval. The analysis of a broadcast news program runs in approximately real time, and the program is available on the system about 30 minutes after its transmission.
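The step from per-shot analysis outputs to story units can be pictured with the sketch below. It is deliberately simplified: a hand-set linear decision function stands in for the trained SVM, and the feature names, weights and bias are all assumptions chosen for illustration rather than values from the system.

```python
# Hypothetical weights; a real system would learn a classifier from labelled broadcasts.
WEIGHTS = {"anchorperson": 2.0, "after_advert": 1.5, "music": -1.0}
BIAS = -1.0


def is_story_start(shot):
    """Linear decision function: positive score means the shot opens a new story."""
    score = BIAS + sum(WEIGHTS[name] * shot.get(name, 0.0) for name in WEIGHTS)
    return score > 0


def segment_stories(shots):
    """Group consecutive shot indices into stories using the decision function."""
    stories = []
    for index, shot in enumerate(shots):
        if not stories or is_story_start(shot):
            stories.append([index])
        else:
            stories[-1].append(index)
    return stories


shots = [
    {"anchorperson": 1.0},
    {},
    {"music": 1.0},
    {"anchorperson": 1.0},
    {},
    {"after_advert": 1.0},
]
print(segment_stories(shots))
```

Each inner list holds the shot indices of one story, which is the unit that would then be indexed for retrieval.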

A user accesses TV news video in Físchlár-News through a conventional web browser, and on logging in is presented with several ways in which to locate TV news stories. The most popular modality is access by date: a calendar option provides fast access to all the news stories for a given date. Users can also enter keywords into a text search box, which are then matched, using standard information retrieval approaches, against a text representation of each news story captured from the closed captions associated with the broadcast. Once a story is located it is presented as a set of keyframes, including an anchorperson shot plus keyframes for any outside footage for that story, interleaved with the text dialogue of that story as taken from the closed captions. For each story presented we also present a list of “related” stories, which are those from the archive (currently about 10,000 stories, and growing daily) which are most similar to the story in question. This allows a user to follow a thread of stories of, for example, a criminal event, the tracking down and capture of the criminals, their trial, sentencing, etc. It also allows a user to follow links to related stories. For example, when viewing a story of a murder in Dublin city a user would be shown links to stories of other murders in Dublin from other dates, etc. Finally, we also incorporate a personalisation and recommender system into Físchlár-News which tracks users’ viewing of stories, as well as their explicit story ratings, and we use this to provide a “recommended stories” feature which is used when users have been away from the system for some time. This is also useful to bring older stories which users may have missed, or forgotten, to their attention. The diagram in the figure shows the process of news video analysis in the system; apart from some periodic sanity-checking of the performance of the SVM, the process is entirely automatic.
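The “related stories” list can be illustrated with a standard information retrieval technique: TF-IDF weighting of the closed-caption text and cosine similarity between stories. The article does not say which similarity measure the system uses, so this choice, the sample captions and the function names are assumptions.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep purely alphabetic tokens."""
    return [w for w in text.lower().split() if w.isalpha()]


def tfidf_vectors(texts):
    """One sparse TF-IDF vector (dict term -> weight) per story."""
    docs = [Counter(tokenize(t)) for t in texts]
    n = len(docs)
    df = Counter(term for doc in docs for term in doc)
    return [{term: tf * math.log(n / df[term]) for term, tf in doc.items()} for doc in docs]


def cosine(a, b):
    """Cosine similarity of two sparse vectors; 0.0 if either is all zeros."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def related_stories(texts, story_id, top_k=2):
    """Indices of the stories most similar to story_id, best first."""
    vectors = tfidf_vectors(texts)
    scored = [(cosine(vectors[story_id], v), i) for i, v in enumerate(vectors) if i != story_id]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [i for s, i in scored[:top_k] if s > 0]


captions = [
    "murder investigation in dublin city",
    "dublin murder trial begins",
    "budget announced by finance minister",
    "weather forecast rain",
]
print(related_stories(captions, 0))
```

The same vectors could also serve keyword search, by treating the typed query as one more short document.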

The Físchlár-News system has been shown to be very useful for users but it is primarily a showcase for our underlying research in multimedia content analysis and automatic annotation using

Físchlár-News: Multimedia Access to Broadcast TV News
by Alan F. Smeaton, Noel E. O’Connor and Hyowon Lee

Físchlár-News is an operational system which provides content-based access to a growing archive of broadcast TV news.


MPEG-7. It demonstrates a variety of ways in which multimedia content (digital video in our case) should be indexed and subsequently accessed. The system has been operational on the University campus for nearly 3 years and over 1,500 unique users have used the system to keep themselves informed of TV news. It is especially useful for people when traveling, as it allows them to access their own local TV news from abroad using only a web browser and their password.

Although the research project which funded the development of Físchlár-News was finished in late 2003 we have kept the system operational, mostly because of the demands of our users who find it too useful to be without! The system also acts as an operational showcase of how video analysis and multimedia analysis and annotation can lead to useful systems to allow searching and browsing of that same multimedia content. Several extensions to the system have also been suggested including recording more than 1 TV news program per day, recording TV news from more than 1 broadcaster and allowing more explicit temporal browsing through topic threads. These suggestions came from an extensive user study we performed recently based on analysis of usage data, user diaries and pre- and post-study questionnaires. It is our intention to incorporate as many of these suggestions as we can.

The support of the Informatics Directorate of Enterprise Ireland is gratefully acknowledged.

Link: Centre for Digital Video Processing: http://www.cdvp.dcu.ie/

Please contact:
Alan F. Smeaton, Centre for Digital Video Processing, Dublin City University / Irish Universities Consortium, Ireland
E-mail: [email protected]

Process of News Video Analyses in Físchlár-News.

Database learning, like many other topics in various disciplines, requires an understanding of foundational concepts combined with skills that can only be obtained in a realistic environment. As part of the INVITE project (INfrastructures for VIrtual Teaching and learning Environments) at the School of Computing at Dublin City University, we have developed a range of interactive multimedia features in an integrated environment, called IDLE, to support active database learning and training. IDLE, the Interactive Database Learning Environment, is a Web-based, multimodal educational media environment used in undergraduate teaching for more than five years.

An Educational Multimedia Architecture

Interactive multimedia features have to structure and guide the learner’s access to educational content. Learning technology systems need to communicate content in the most appropriate form to a learner. Multimedia ideally suits these requirements. Learning content is a

Interactive Multimedia-Enabled Learning and Training
by Claire Kenny, Declan McMullen, Mark Melia and Claus Pahl

Multimedia technology is an ideal platform to support advanced forms of learning and training. Active learning, for instance, requires a high degree of interactivity in different forms, for which multimedia technology provides an infrastructure solution. The IDLE system, an interactive educational multimedia system supporting database learning and training, shall illustrate the benefits and supporting architectures for multimodal, interactive learning and training.


collection of stored media resources that presents the learner with different views and activities relating to the central concepts of the subject domain. At the core of the learning environment is a multimedia delivery system that allows the learner to access content resources and to interact with content in the most appropriate, educationally sound way. The figure shows the multimedia architecture of IDLE. The main architectural elements are:

• content resources, eg spoken and written words enhanced by images, moving pictures, or active objects

• delivery infrastructure and media players, eg basic Web browser functionality (hypertext), audio player (audio stream), animation player (animations), advanced Web browser and server functionality (active, dynamic pages using applets, HTML forms, or servlets).

Educational Multimedia Design

Concepts are at the centre of organising educational content. Usually, various perspectives on the presentation of concepts in content exist in terms of the learning and training context. Aspects such as declarative or factual knowledge, procedural knowledge, and skills form these perspectives. All three perspectives can be related to the same concept. For example, learning about database queries requires an understanding of the conceptual relational data model background and the operational aspects of query execution as well as trained skills in query formulation and execution. These different perspectives arise from the different learner objectives in relation to a given concept:

• for declarative knowledge such as data model definitions presented through synchronised audio and hypertext, learning objectives include abstraction, comprehension, and reasoning

• for procedural knowledge such as query execution presented through sequenced individual animations, controlled observation is a means to understand the operational aspects

• for skills such as query definition and formulation presented through applet- and servlet-supported active Web pages, execution and manipulation with feedback are paramount.

Consequently, learning and training require different forms of interaction between learners and content.

Multimodality is one of the central characteristics of comprehensive learning technology systems that enable a successful learning experience and support learning objectives. A multimodal media architecture is needed to support these learner objectives, ie, to facilitate the corresponding learning activities. We can associate learning activities with suitable multimedia modalities and channels. Abstraction, comprehension, and reasoning are usually enabled through spoken and written language, ie using audio and text media. Controlled observation is based on a visual learning experience using moving pictures or animations in a computer-supported learning environment. Execution and manipulation can be supported through an (almost) tactile form, where virtual objects are manipulated. Different media types in learning technology systems such as IDLE enable interactivity in order to support successful active learning.

Authoring of Interactive Educational Multimedia

The importance of multimedia in learning and training requires a systematic Interactive Educational Multimedia (IEMM) engineering approach. Learning and training activities need to be mapped to the human-computer interface and implemented through multimedia features. Due to the complexity of the domain, the design of channels and interaction languages is ideally supported in a domain-specific framework. We have used education-specific channels, such as declarative knowledge, procedural knowledge, skills, learning sequencing, feedback, and coaching, as domain-specific channels in the design of IDLE and other learning technology systems. Each of the channels is different in the way learners use and work with the content that is communicated over the channels.

Conclusions

The benefits of a multimodal learning experience that enables a variety of content interactions are undisputed. The complexity of designing these experiences and authoring IEMM content requires a domain-specific approach using educational channel notions as abstractions and particular media types to enable certain learner-content interactions. Only such a development and authoring approach will help to reduce the high development costs associated with educational content development and to make multimedia learning objects more reusable.

Please contact:
Claus Pahl, Dublin City University, School of Computing / Irish Universities Consortium
Tel: +353 1 700 5620
E-mail: [email protected]

The IDLE Multimedia Architecture.


MUSCLE (Multimedia Understanding through Semantics, Computation and Learning) is a European Network of Excellence (NoE) that aims to foster close collaboration between research groups in multimedia data mining and machine learning. Within MUSCLE, our research is focused on investigating standards and tools that allow interoperability of heterogeneous and distributed (meta)data, also by enabling data descriptions of high semantic content (eg ontologies, MPEG-7 and XML schemata) and inference schemes that can reason about these at the appropriate levels.

Metadata are used to represent the value-added information that describes the technical and semantic characteristics associated with MM data. Metadata make data more processable, allowing more efficient retrieval or classification, quality estimation and prediction based on Machine Learning techniques in both single and multiple-modality.

Many initiatives for metadata standardisation have been proposed in order to describe multimedia content in various domains. Scientific and industrial communities tend to create their own standards tailored to their particular needs. This could cause an unrestricted growth in the number of available standards, making the integration and sharing of MM data between different communities (vision, speech, text, …) very difficult. A recent approach is to combine a specific MM metadata standard with other standards that can be used to describe similar application domains, in order to provide a more comprehensive characterisation of heterogeneous MM data without creating a new standard.

From a recent survey of the state-of-the-art, we have identified two main approaches to MM data processing. On the one hand, people who employ MM data for scientific purposes use consolidated MM data processing algorithms; on the other hand, applications following a Content Based Query (CBQ) paradigm require content representation.

This scenario, which involves MPEG-7 or metadata models tailored to the specific requirements of a given community, highlights a possible limit for interoperability among different communities. We feel that two important issues must be considered in order to achieve an efficient and integrated use of MM metadata: (i) a common MM standard format able to describe and represent the intrinsic heterogeneous nature of MM data and their semantics must be defined; (ii) more abstract models (eg ontologies) and the related mapping tools are needed to “represent” and “translate” different metadata sets whose elements are correlated on the basis of the same or similar meanings so that MM applications can use ontology knowledge in addition to the metadata (see the figure).
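The “translate” step in (ii) can be pictured as mapping element names from each metadata set onto shared concepts and then back out into the target set. The sketch below assumes two invented metadata sets and invented concept labels; none of the names come from MPEG-7 or any other real standard.

```python
# Hypothetical mappings from element names in two metadata sets to shared concepts.
TO_CONCEPT = {
    "schemaA": {"Creator": "agent.author", "CreationDate": "time.created"},
    "schemaB": {"author": "agent.author", "dateCreated": "time.created", "genre": "content.genre"},
}


def translate(record, source, target, mappings=TO_CONCEPT):
    """Translate a record between metadata sets via shared concepts.

    Returns (translated, unmapped): elements with no counterpart in the
    target set are reported rather than silently dropped.
    """
    from_concept = {concept: element for element, concept in mappings[target].items()}
    translated, unmapped = {}, {}
    for element, value in record.items():
        concept = mappings[source].get(element)
        target_element = from_concept.get(concept)
        if target_element is None:
            unmapped[element] = value
        else:
            translated[target_element] = value
    return translated, unmapped


print(translate({"author": "A. Smith", "genre": "news"}, "schemaB", "schemaA"))
```

With a shared concept layer, adding a third metadata set needs one new mapping table rather than a converter for every pair of sets.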

In the first case, one strategy could be to use MPEG-7, which is currently the most mature MM metadata standard, due to its generality and extensibility. MPEG-7 permits an extensive description of multimedia content not only at the low-level feature level (visual, audio, multimedia) but also at the higher semantic level. However, its freedom in terms of structures and parameters is such that, in general, it is not easy to interpret MPEG-7 produced by others. To overcome this problem, a possible solution could be to build an MPEG-7 ontology.

In the second case, the introduction of a high-level ontology covering multiple domains could be convenient; this solution would have the advantage of being more independent of the lower metadata standards. However, the definition of a completely new ontology is a very complex task.

Representation and Communication of Multimedia Data and Metadata
by Sara Colantonio, Maria Grazia Di Bono, Massimo Martinelli, Gabriele Pieri and Ovidio Salvetti

In recent years the increasing role of Multimedia (MM) data, in the form of still pictures, graphics, 3D models, audio, speech, video or their combination (eg MM presentations), in the real world, has led to a demand for better procedures for the automatic generation and extraction of both low level and semantic features from multi-source data in order to enhance their potential for computational interpretation and processing.

Two possible solutions for MM data integration and interoperability: dynamic single metadata standard (left) or definition of an upper ontology (right).

When constructing ontologies, standard tools like XML, RDF and OWL should be considered. XML supports the definition of constraints for structure, cardinality and data-types but does not support the definition of semantic knowledge. RDF provides mechanisms to describe MM resources, group them and represent their semantic relationships, but does not offer mechanisms to define general axioms, which can provide a stronger semantic representation. OWL can be used to derive logical semantic consequences but has expressivity limitations, which could be overcome by using specific rule languages (eg RuleML, SWRL, ORL, ...) extending OWL axioms with additional and more expressive rules.

The use of a higher level ontology would seem the best way to approach the problem of offering a simpler high level access to MM data for processing and interpretation purposes. We are now investigating this solution.

Link: http://www.muscle-noe.org (Workpackage 9)

Please contact:
Ovidio Salvetti, ISTI-CNR, Italy
Tel: +39 050 3153124
E-mail: [email protected]

or Eric Pauwels, MUSCLE Scientific Coordinator
CWI, The Netherlands
Tel: +31 20 592 4225
E-mail: [email protected]

Humans undoubtedly possess the ability to process visual information efficiently and to identify images as being similar based on their visual content. However, computational approaches currently fall short of matching this ability. At FORTH-ICS we aim to develop and implement CBIR mechanisms that are perceptually motivated and based on biologically inspired architectures. Our recent work is concerned with medical image retrieval and aims to provide a reliable framework that can be customized for several imaging applications, and potentially be combined with DICOM functionality as well. Crucial to this goal is the automatic extraction of semantic information from medical images (eg asymmetry or pathology detection). Nevertheless, the semantic content of images is subjective, and depends on the specific image class. For this reason, generic similarity CBIR approaches often fall into the ‘semantic gap’ problem, meaning that the computed features can’t always properly describe the real characteristics of the image. At FORTH-ICS we developed a novel two-tier CBIR platform inspired by the human cognitive architecture. The key idea underlying our work emanates from psychological and neuroscientific studies which indicate that the human visual system processes information in several stages.

The visual system retains independent retinotopic maps for different primitive visual features (colour, form etc). In a pre-attentive or early stage of vision, the processing on these feature maps is undertaken independently and in parallel, whereas in the subsequent attentive stage the visual modalities engage in cooperative work. In other words, the pre-attentive level decomposes the optical scene in its primitive characteristics, which to a large extent are processed independently, in parallel and autonomously.

After the first stage of fixed-time pre-attentive processing, the human visual system performs a serial and selective examination of semantic objects that draw the subject's attention – that is, the attentive level of perception.

Based on this biological paradigm we developed a two-tier CBIR platform featuring both a pre-attentive and an attentive level of retrieval by extending our previous work on agent-based single-tier CBIR (ERCIM News No. 53, April 2003). In order to be able to assess the value of the proposed architecture, our platform was customized for a specific domain, ie brain MRI image retrieval. The pre-attentive layer of the proposed architecture produces independent, parallel feature maps (A, B, C in the figure), each coding an independent visual feature. During the retrieval stage, each autonomous agent compares the computed values between the query and

A Cognitive Architecture for Semantically Based Medical Image Retrieval
by John Moustakas, Socrates Dimitriadis and Kostas Marias

The automatic extraction of meaningful image semantics is an important step towards the development of intelligent systems for Content-Based Image Retrieval (CBIR). Such systems have the potential to become useful clinical decision support tools, by retrieving medical images with established diagnoses that are ‘similar’ to the images the clinician must read. In the Institute of Computer Science - FORTH, we are developing and implementing experimental platforms for the investigation of CBIR, based on biologically inspired multi-agent architectures. In this news article, we present a novel platform based on a two-level architecture inspired by human cognitive mechanisms. These two levels share the computation of generic similarity and medical image semantics.


each database image. The comparison scores from all relevant agents are passed to the voting system, resulting in the final score for the candidate retrieval image. The voting scheme is selected by the user.
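A user-selectable voting step that combines per-agent comparison scores might look like the sketch below. The three schemes shown (mean, minimum, weighted mean) and all names are assumptions for illustration; the article does not list which schemes the platform offers.

```python
def vote(agent_scores, scheme="mean", weights=None):
    """Combine per-agent similarity scores (dict agent -> score in [0, 1])."""
    names = sorted(agent_scores)
    values = [agent_scores[n] for n in names]
    if scheme == "mean":
        return sum(values) / len(values)
    if scheme == "min":
        # Conservative: an image is only as similar as its weakest feature.
        return min(values)
    if scheme == "weighted":
        total = sum(weights[n] for n in names)
        return sum(weights[n] * agent_scores[n] for n in names) / total
    raise ValueError(f"unknown voting scheme: {scheme}")


def rank_candidates(candidates, scheme="mean", weights=None):
    """Order candidate image ids by decreasing combined score."""
    return sorted(
        candidates,
        key=lambda image_id: vote(candidates[image_id], scheme, weights),
        reverse=True,
    )


# Hypothetical scores from three pre-attentive agents A, B and C.
candidates = {
    "img1": {"A": 0.9, "B": 0.6, "C": 0.3},
    "img2": {"A": 0.5, "B": 0.5, "C": 0.5},
}
print(rank_candidates(candidates, scheme="mean"))
print(rank_candidates(candidates, scheme="min"))
```

The two calls order the same candidates differently, which shows why the choice of scheme is left to the user.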

An additional, ‘attentive’ layer is designed for the agents to receive, one by one, semantic regions of interest (ROIs). A specialized group of ‘attentive’ agents then carefully examines and compares ROIs in a serial fashion (first ‘1’, then ‘2’, and so on). In our implementation, the attentive similarity of a given pair of MRI images is defined as the similarity of their ‘closest’ pair of ROIs. It is obvious that in order to implement the attentive retrieval level, the semantics must be defined for the specific application (brain MRI retrieval), since it is still difficult to automatically define important semantics in any image class without incorporating any prior knowledge.
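The ‘closest pair of ROIs’ definition can be written down directly. In the sketch below each ROI is a small feature vector, distance is Euclidean, and distance is turned into similarity with 1 / (1 + d). The feature representation and the distance-to-similarity mapping are assumptions; only the closest-pair rule comes from the text.

```python
import math


def roi_distance(a, b):
    """Euclidean distance between two ROI feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def attentive_similarity(rois_query, rois_candidate):
    """Similarity of two images, defined by their closest pair of ROIs."""
    if not rois_query or not rois_candidate:
        return 0.0  # no ROIs found in one image: nothing to compare
    closest = min(roi_distance(a, b) for a in rois_query for b in rois_candidate)
    return 1.0 / (1.0 + closest)


# Hypothetical ROI descriptors, eg (area, mean intensity) after normalisation.
query_rois = [(0.0, 0.0), (5.0, 5.0)]
candidate_rois = [(5.0, 4.0), (9.0, 9.0)]
print(attentive_similarity(query_rois, candidate_rois))
```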

The definition of semantic regions for brain MRI retrieval was based on novel algorithms for brain symmetry detection. It is well known that a normal human brain exhibits a remarkable degree of symmetry with respect to the mid-sagittal plane. In addition, the identification of regions of asymmetry is often indicative of diseases such as schizophrenia, epilepsy, and Alzheimer’s. We developed algorithms for segmenting and analysing asymmetrical regions for the ‘attentive’ level of CBIR. The authors will report initial retrieval results on publicly available data (‘The Whole Brain Atlas’ from http://www.harvard.edu) at the forthcoming IEEE International Conference on Multimedia & Expo (ICME2005).
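As a much simplified picture of asymmetry analysis, the sketch below mirrors each row of an image slice about its central column and flags pixels whose intensity differs strongly from their mirror image. It assumes the slice is already aligned so that the mid-sagittal plane falls on the central column; it is not the authors’ algorithm, which is not described in the article.

```python
def asymmetry_map(slice_rows):
    """Absolute difference between each pixel and its mirror across the vertical midline."""
    return [
        [abs(row[x] - row[len(row) - 1 - x]) for x in range(len(row))]
        for row in slice_rows
    ]


def asymmetric_pixels(slice_rows, threshold):
    """(row, column) positions whose mirror difference exceeds the threshold."""
    amap = asymmetry_map(slice_rows)
    return [
        (y, x)
        for y, row in enumerate(amap)
        for x, value in enumerate(row)
        if value > threshold
    ]


# Hypothetical 2 x 4 intensity slice; the second row has a bright spot on one side.
slice_rows = [
    [10, 20, 20, 10],
    [10, 80, 20, 10],
]
print(asymmetric_pixels(slice_rows, threshold=30))
```

Connected groups of flagged pixels would be natural candidates for the regions of interest passed to the attentive level.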

The CBIR system was developed in Java™, and makes use of the Java Advanced Imaging package. It is fully scalable and can be easily extended with additional agents or voting schemes in order to take into account the specific requirements of different experiments and classes of images. While the attentive level of retrieval can be customized for any application, provided that semantic features can be automatically defined, this is an extremely hard task. Researchers across disciplines have tried for many years to shed light on the mechanisms of decomposing any image to its primitive visual features, and extracting the true underlying semantics.

Please contact:
Kostas Marias, ICS-FORTH, Greece
Tel: +30 2810 391696
E-mail: [email protected]

Two-tier architecture for medical image retrieval consisting of a pre-attentive (a), and an attentive level (b).

There are a number of reasons for this scenario. The major one is that the state-of-the-art in image processing does not allow us to identify and extract any meaningful segments from images. Images are still represented using low-level features. However, the images’ low-level feature representation does not reflect the high-level concepts the user has in mind (semantic gap) and – partly due to this – users can experience serious difficulties in effectively formulating and communicating their information need (query formulation problem).

Any reasonable solution to CBIR should address the issue of semantic gap, and providing help with query formulation is one way to do this. In our approach to image retrieval, we employ an adaptive scheme.

Adaptive Models for ImageRetrievalDifficulties with query formulation areaddressed using an adaptive querylearning scheme and an innovative

Personalized and Adaptive Multimedia Retrievalby Joemon M. Jose and Jana Urban

Data accumulation has become an integral part of our life, be it text (eg email) ormultimedia data (eg photographs). This is mainly due to the proliferation ofcomputers, networking devices and consumer devices like digital cameras andcamcorders. However, our creation prowess is not matched by any comparablesearch facilities. As a result, finding relevant information from such archivesbecomes a cumbersome process. Even after a decade of research in content-based image retrieval (CBIR), users must plough manually through their mediaarchives in order to find what they need.

Page 30: Multimedia Informatics - ERCIM · MULTIMEDIA INFORMATICS Introduction 8 Multimedia Informatics by Joachim Köhler Multimedia Indexing and Retrieval Invited article: 10Managing Digital

SPECIAL THEME: Multimedia Informatics

30 ERCIM News No. 62, July 2005

search interface. We have developed aninterface (see Figure 1) in which a userstarts browsing with one example image.Subsequently, a new set of similarimages is presented to the user.

As a next step, the user, by selecting one of the returned images, updates the query, which now consists of the original image and the selected image from the returned set. After a couple of iterations the query is based on a set of images. In this scheme, the retrieval process is iterative: the system’s knowledge of the user’s information need is updated based on the user’s implicit feedback. In the underlying interaction model, the user builds up a browsing tree of interesting images by choosing one image from a recommended set to be appended to the browsing path in each iteration. The system’s recommendations are then based on a query constructed from the current path of images.

In this approach, the emphasis is placed on the user’s activity and the context, rather than any predefined internal representation of the data. A path represents a user’s motion through information, and taken as a whole is used to build up a representation of the instantaneous information need.

In a nutshell, our method supports both browse-based and query-based approaches. It supports a query-less interface, in which the user’s selection of an image is interpreted as evidence of its relevance to her/his current information need. It therefore allows direct searching without the need to describe the information need formally. For the query, each image in the path is considered relevant, but the degree of relevance depends on its age: it decreases over time as new images are appended. In this way, the retrieval model is a special kind of relevance feedback model, in which a query is implicitly refined by the user’s selection of images for feedback. It recognizes and addresses the dynamic nature of information needs, and has the advantage of allowing for an intuitive and user-centred search process.
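The age-dependent weighting described above can be illustrated with a minimal sketch. It assumes that each image is represented by a numeric feature vector and that relevance decays geometrically with age; the decay factor, the function names and the squared Euclidean distance are illustrative assumptions, not details of the authors’ system.

```python
from typing import Dict, List, Sequence


def path_weights(path_length: int, decay: float = 0.5) -> List[float]:
    """Normalised relevance weights for a browsing path, oldest image first.

    An image that is `age` steps old gets a raw weight of decay**age
    (hypothetical decay model), so the newest image (age 0) counts most.
    """
    raw = [decay ** (path_length - 1 - i) for i in range(path_length)]
    total = sum(raw)
    return [w / total for w in raw]


def build_query(path: Sequence[Sequence[float]], decay: float = 0.5) -> List[float]:
    """Combine the feature vectors along the path into one weighted query vector."""
    weights = path_weights(len(path), decay)
    dims = len(path[0])
    return [sum(w * vec[d] for w, vec in zip(weights, path)) for d in range(dims)]


def recommend(query: Sequence[float],
              collection: Dict[str, Sequence[float]],
              k: int = 3) -> List[str]:
    """Return the ids of the k images closest to the query (squared Euclidean)."""
    def dist(image_id: str) -> float:
        return sum((q - v) ** 2 for q, v in zip(query, collection[image_id]))
    return sorted(collection, key=dist)[:k]
```

Each time the user appends an image to the path, `build_query` would be called again on the extended path, so older selections gradually lose influence without ever being discarded outright.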

A user study involving design students demonstrated the effectiveness of this approach. The evaluation showed that people preferred the search process in the ostensive browsing scheme, felt more comfortable during the interaction, and found the system more satisfactory to use than a traditional CBIR interface.

Image Retrieval by Group Organization
Another approach to providing effective image retrieval is to develop mechanisms that support the handling of the retrieved set of images. This allows users to handle retrieved sets of images more easily by organizing them into relevant groups. The system can then capture the context of the user, and will thus be able to recommend images more closely tuned to their needs.

We introduced a system called EGO (Effective Group Organization), which facilitates retrieval in context (see Figure 2). EGO is a system for the management of image collections, supporting the user through the process of personalization and adaptation. In this system, we stress the need for a search system that provides flexible and extensible interfaces. The idea is to help the user in query formulation: by facilitating high-quality interaction, the system learns the needs of its users and can tailor its searches accordingly.

Our approach encourages the user to group and organize their search results and thus provide more finely grained feedback for the system. It combines the search and management processes, which helps the user to conceptualize their search tasks and to overcome the query formulation problem. The system assists the user by recommending relevant images for selected groups. The user can therefore concentrate on solving specific tasks rather than having to think about how to create a good query in accordance with the retrieval mechanism.
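How recommendations for a selected group might be computed is not specified here. One simple possibility, shown purely as an assumption, is to treat the group’s members as positive examples, average their feature vectors, and rank the remaining images by distance to that average. The function names and the feature representation are hypothetical.

```python
from typing import Dict, List, Sequence


def centroid(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Component-wise mean of a non-empty list of feature vectors."""
    n = len(vectors)
    return [sum(v[d] for v in vectors) / n for d in range(len(vectors[0]))]


def recommend_for_group(group_ids: Sequence[str],
                        features: Dict[str, Sequence[float]],
                        k: int = 2) -> List[str]:
    """Rank images outside the group by closeness to the group's centroid."""
    members = set(group_ids)
    centre = centroid([features[i] for i in group_ids])

    def dist(image_id: str) -> float:
        return sum((c - v) ** 2 for c, v in zip(centre, features[image_id]))

    candidates = [i for i in features if i not in members]
    return sorted(candidates, key=dist)[:k]
```

Because every group has its own centroid, moving an image from one group to another immediately changes what is recommended for both, which is one way the finer-grained feedback mentioned above could be exploited.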

Please contact:
Joemon M Jose, University of Glasgow, UK
E-mail: [email protected]

Figure 1: An interface of the system.
Figure 2: The EGO interface.


ADMITS: Adaptation in Distributed Multimedia IT Systems
by Laszlo Böszörmenyi

The ADMITS project, at the Institute of Information Technology of the University of Klagenfurt (ITEC), has developed an experimental distributed multimedia system for investigations in adaptation, an increasingly important tool for resource and media management in distributed multimedia systems.

In recent years, a large number of papers have been published on the subject of adaptation in multimedia systems. These papers reflect a diversity of interpretations, but generally have a rather limited scope. Best-effort scheduling and worst-case reservation of resources are two extreme cases, neither being well suited to cope with large-scale, dynamic multimedia systems. A middle course can be found in a system that dynamically adapts its data, resource requirements, and processing components to achieve user satisfaction. Nevertheless, there is no agreement about the questions of where, when, what and who should adapt.

A distributed multimedia system comprises several types of components, such as media servers, meta-databases, proxies, routers and clients. In addition, a large number of adaptation possibilities exist, from simple frame dropping up to virtual server systems that dynamically allocate new resources on demand. The main problem is determining which kind of component can best be used for each kind of adaptation.

Within the ADMITS project, we are seeking answers to exactly this basic question, and to a number of related questions. In building the experimental system, we explore a number of possible adaptation entities (server, proxy, clients, routers), and implement and evaluate different algorithms for media, component and application-level adaptations. Experimental data is also collected in order to gain insight into when, where and how to adapt, as well as how individual, distributed adaptation steps interoperate and interact with each other. The overall architecture of the experimental system is depicted in the figure.

Figure: The Experimental Distributed Multimedia System.

The individual components play different roles in the adaptation process. They are connected physically by the network, and ‘semantically’ by MPEG-7 Multimedia Descriptions and MPEG-21 DIA Descriptions that flow over the network. The reliance on these international standards makes the components interoperable with any standard-compliant components developed elsewhere. To our knowledge there exists no other system that handles adaptation in such a comprehensive and interoperable way.

Since ADMITS is a research project, its main emphasis is on the publication of papers. A list of these can be found on the home page of ITEC (http://www.ifi.uni-klu.ac.at/ITEC/), under ‘Publications’. However, in order to validate our ideas, and also with the aim of future industrial applicability, a large repository of software tools (called ViTooKi, for Video Toolkit) has been developed. The software tools are highly interoperable and available as open-source software (http://vitooki.sourceforge.net/). The repository is unique in providing a very rich set of interoperable video tools. The following non-exhaustive list gives an overview of the tools and their major features:

• Media server:
  - standard-compliant media streaming using RTSP and RTP/UDP
  - streams all media formats supported by the ffmpeg library (eg MPEG-1, MPEG-2, MPEG-4)
  - communicates terminal capabilities of the client device and user preferences using standardized MPEG-21 descriptors
  - supports real-time adaptation of media content according to the clients’ terminal capabilities, the user preferences, and the available network resources; for example, mobile devices get a lower stream quality than high-performance workstations with good network access
  - implements standardized RTP extensions to allow intelligent retransmission of lost video frames where necessary
  - can be run in a distributed environment that supports proactive service and content replication and migration operations; this is especially helpful when content adaptation steps are not allowed due to legal constraints or the user insists on the original stream in its full quality
  - supports proactive adaptations by actively measuring and forecasting available server and network resources on and between server nodes.

• Proxy server:
  - incorporates both a server and a client implementation (since a proxy must act as a server to the client and a client to the server)
  - caches elementary streams in different quality versions
  - implements quality-aware replacement strategies
  - can be dynamically relocated in the vicinity of requesting clients.

• Meta-database:
  - multimedia database schema based on the MPEG-7 standard
  - multimedia indexing framework
  - cost-based query optimization for range and k-nearest neighbour searches
  - application-level libraries for content-based image retrieval systems, audio recognition tools, video browsing tools, and quality-aware MPEG-4 proxies.

• Media player:
  - standard-compliant control of RTP-based media streams using RTSP
  - supports parallel presentation of many videos in different viewers, in different qualities
  - implements a general framework for SMIL- and BIFS-based multi-scene presentations.
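To make the idea of adapting a stream to terminal capabilities and network resources more concrete, the following is a minimal sketch of one possible selection rule. It does not use the MPEG-21 descriptor formats or any ViTooKi interface; the `Variant` and `Terminal` classes, their fields and the `keep_original` flag are hypothetical names introduced for illustration only.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Variant:
    """One pre-encoded or adaptable quality version of a stream (hypothetical)."""
    name: str
    width: int
    height: int
    bitrate_kbps: int


@dataclass(frozen=True)
class Terminal:
    """Simplified stand-in for the client's terminal capabilities."""
    max_width: int
    max_height: int


def select_variant(variants: Sequence[Variant],
                   terminal: Terminal,
                   available_kbps: int,
                   keep_original: bool = False) -> Optional[Variant]:
    """Pick the best variant the terminal can display and the network can carry.

    With keep_original=True (eg adaptation is not permitted), the
    highest-bitrate variant is returned regardless of the constraints.
    """
    if keep_original:
        return max(variants, key=lambda v: v.bitrate_kbps)
    feasible = [
        v for v in variants
        if v.width <= terminal.max_width
        and v.height <= terminal.max_height
        and v.bitrate_kbps <= available_kbps
    ]
    if not feasible:
        return None  # nothing fits; a real system would adapt further or refuse
    return max(feasible, key=lambda v: v.bitrate_kbps)
```

Under this rule a small mobile terminal is served the low-quality variant even on a fast link, while a workstation on a slow link is limited by bandwidth rather than by its display.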

One of the most important and unique characteristics of the ADMITS project is the consistent combination of fundamental research with international standards. To this end, ITEC has decided to participate in the standardization processes of the Moving Picture Experts Group (MPEG), especially in the area of MPEG-7 and MPEG-21. A number of contributions in the context of Digital Item Adaptation (DIA) have been submitted by ITEC (in partial collaboration with Siemens AG) and accepted by MPEG in recent years.

Link:
https://www.ifi.uni-klu.ac.at/ITEC/

Please contact:
Laszlo Böszörmenyi
Institute of Information Technology (ITEC), Klagenfurt University, Austria
Tel: +43 463 2700 3611
E-mail: [email protected]

Solutions for an Interpreter-Enabled Multimedia Conferencing System
by Ferenc Sárközy and Géza Haidegger

As video conferencing is likely to become a commonplace tool in the near future, a demand may also arise for an interpreter-enabled video conferencing system. Despite the fact that current technology is able to accomplish this, no such product is actually available. The Global Conference Network project aimed at establishing a distributed, multi-lingual, multimedia conferencing system. Research and development work at the CIM Research Laboratory of SZTAKI contributed to this project.

Video conferencing has become a widely used tool, since it is available to anyone with a mainstream computer and a broadband (xDSL, cable) Internet connection. Although telecommunication service parameters have improved remarkably, they still represent a bottleneck in some professional applications of video conferencing systems. Progress in this area has slowed recently, mostly for business-related reasons. It is hoped, however, that market push will drive further development, as has been the case in Japan. The enhancement of the local loop bandwidth and other QoS (quality of service) parameters can open up the way for such demanding applications as interpreted video conferences (IVC).

The Global Conference Network (GCN) system attempts to establish audio and video connections independently of language and actual place of residence. In order to achieve this general goal, GCN has to meet the following requirements:
• to integrate interpreters in the conference: this is the fundamental difference between GCN and currently available video conferencing solutions
• to take over the role of existing local conference systems as the software of standard hardware elements
• to integrate all kinds of participants through an appropriate Internet connection
• to integrate local network-based conferences and individual participants through the Internet.

The most important advantages of an IVC system are the following:
• to widen the applicability of video conferencing itself
• to reduce the cost of conferences by integrating remote participants, particularly interpreters
• to increase potential audiences for e-learning through interpretation.

The GCN system has three main functional parts. Firstly, the GCN database (GCNDB) contains all the information about any conferences organized with the help of that GCN system. This is a MySQL database and it is maintained by the GCN Web Server (GCNWS). Secondly, GCNWS is a PHP-based Web application. It provides Web pages that handle the entire life cycle of the conference data, but the emphasis is on pre- and post-conference functionalities. The conference structure is created during the set-up process and, after the conference, the GCNWS can be used as an archive viewer.

The third part of the GCN system, the GCN Applications (GA), is an extendable set of applications which interact with each other and with the floor control messages of GCN. The base set of these and their roles are the following:
• a Conference Organizer (CO), which stores the conference’s current status and can be queried by client applications to initialize themselves after login or reconnect, authenticates and authorizes the users, manages the voting sessions and archives GCN messages
• client applications for the different kinds of participant roles, such as organizer, chairman, interpreter, speaker and observer
• interfaces to the media-handling applications.

Our aim was to create a portable solution, so all these components are implemented on the Java platform.

Media-handling tools are loosely integrated into the GCN system. At the beginning of the development, none of the available tools had a distinct advantage over the others. Since we did not want to commit to one of them at such an early stage, we designed the GCN clients to be able to include different kinds of media-handling plug-ins.

As most standards and standard protocols in tele- and video-conferencing have some deficiencies, we must also deal with protocol development.

The Open H.323 project aims to create an open-source implementation of the H.323 teleconferencing protocol. H.323, and therefore the Open H.323 project, offers a solution based on a Multipoint Control Unit (MCU). One of our media-handling solutions is based on this open-source software.

Using an MCU necessarily leads to a centralized system. JMF (Java Media Framework) and the VideoLAN project (see below) allow a decentralized solution without a media server. The requirement is that the underlying network be multicast-enabled. JMF allowed us to implement a distributed media-handling solution. However, during the implementation, an unforeseen bug in JMF caused some problems.

The open-source VideoLAN project aims to provide a video and audio streaming tool. One of the applications resulting from VideoLAN is an open-source program named ‘VLC’, which is a media-processing unit. It can handle codecs that scale very well, and so also works at low bit-rates. VLC can stream with RTP, either unicast or multicast. VLC itself is a stand-alone application, but offers several possibilities for programmed access to its services.

The solutions based on VLC and H.323 worked equally well on a local area network. VLC provides much better quality, but also requires more processing power and is based on a lower-level standard. On an entry-level ADSL connection (384/128 kbps), the upload bandwidth was a serious bottleneck for VLC. In the download direction the bandwidth was sufficient, and both image and sound quality were satisfactory. The H.323-based solution provided the quality that two NetMeeting clients can achieve on this connection. During the modification of the MCU we developed the know-how for the integration of better codecs into the system, and as a consequence the final solution will be based on the H.323 standard.

The project leader, DIGITON Ltd, is an innovative Hungarian SME, working on solutions to match future, novel needs. This project inspired new ideas and plans, and the consortium is open to further scientific challenges.

Links:
http://www.videolan.org
http://www.digiton.hu
http://www.sztaki.hu/sztaki/divisions.jhtml

Please contact:
Ferenc Sárközy, Géza Haidegger, SZTAKI, Hungary
Tel: +36-1-2796207
E-mail: [email protected], [email protected]

Figure: General configuration of the GCN system.


Host Recommendation in the Adaptive Distributed Multimedia Server
by Ottó Hutter, Tibor Szkaliczki and Balázs Goldschmidt

The Adaptive Distributed Multimedia Server (ADMS) developed at the Klagenfurt University is able to add and remove its components to and from different nodes of the network. This novel feature of the multimedia server enables the dynamic placement of the server components according to the current requests and Quality of Service (QoS) parameters of the network. Researchers at SZTAKI and the Budapest University of Technology and Economics have developed and implemented algorithms for the optimal placement of nodes for hosting the ADMS components in the network.

The ADMS, developed at the Klagenfurt University, is an adaptive architecture for distributing video streams through networks covering large territories. It manages both network and node overloads by migrating and replicating server components in the network. The Host Recommendation module developed at the Budapest University of Technology and Economics (BUTE) is responsible for finding a proper placement of the components according to the dynamically changing network parameters and client demands. SZTAKI is also participating in the development of the Host Recommendation module, in cooperation with the Klagenfurt University and BUTE. The module was tested as a stand-alone component in a simulated network environment using randomly generated client requests, and is being integrated into the ADMS at the Klagenfurt University for experimental tests.

First, we concentrated on where to put proxies in the network. The task was to find suitable locations for proxies of the distributed multimedia server, while maximizing the clients’ satisfaction and minimizing the network load. The clients receive the same video in parallel.

Figure: Our goal is to find a suitable location for multimedia proxies and data managers in a network.

Proxy and cache placement is a heavily studied area, but we found significant differences between the placement of Web proxies and ADMS proxies. In our case, the clients do not need accurate delivery but rather strict Quality of Service. Moreover, ADMS delivers huge media streams instead of small files. Nodes hosting server applications in a network are usually assigned in a static way. However, this approach is not applicable to the ADMS. The delivery of the data streams with proper quality can start only after the placement of the server nodes. As a consequence, the dynamic reconfiguration of the server requires an automatic host recommendation whose running time complies with strict time constraints. BUTE began work on this problem by analysing two methods, namely the greedy algorithm and the particle swarm algorithm. SZTAKI joined the server development by proposing and implementing two other methods (linear programming rounding and an incremental algorithm) in 2003 and 2004.

The greedy approach was the first algorithm to be implemented, and has proven to be successful in many similar problems. The particle swarm algorithm is a kind of evolutionary algorithm. The incremental algorithm finds an initial solution as fast as possible and then incrementally improves it, complying with the time constraints, in order to approximate the optimal placement. Another approach is the linear programming rounding technique, where the integer programming formulation of the problem is derived, its linear programming relaxation is solved and the results are rounded.

We compared the results gained by running the implementations of the four proposed algorithms on different test networks. The greedy algorithm was clearly the worst of the four, while the particle swarm algorithm produced the best result. In terms of running time, the incremental algorithm was the best. There was no single winner among the particle swarm, linear programming and incremental algorithms; they can be applied in a smart combination.
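As a rough illustration of two of the strategies discussed above, the sketch below shows a greedy placement and a time-bounded incremental improvement on a toy cost model. The cost model (hop count from each client to its nearest proxy plus a fixed cost per proxy), the swap-based improvement step and all names are assumptions made for this example; they are not the algorithms implemented in the Host Recommendation module.

```python
import time
from typing import Dict, Iterable, Sequence, Set

Distances = Dict[str, Dict[str, int]]  # client -> candidate node -> hop count


def placement_cost(proxies: Iterable[str], dist: Distances, proxy_cost: float) -> float:
    """Sum of client-to-nearest-proxy hops plus a fixed cost per proxy."""
    proxies = set(proxies)
    if not proxies:
        return float("inf")
    serving = sum(min(hops[p] for p in proxies) for hops in dist.values())
    return serving + proxy_cost * len(proxies)


def greedy_placement(nodes: Sequence[str], dist: Distances, proxy_cost: float) -> Set[str]:
    """Repeatedly add the node that lowers the cost most; stop when none does."""
    chosen: Set[str] = set()
    best = float("inf")
    while True:
        candidates = [(placement_cost(chosen | {n}, dist, proxy_cost), n)
                      for n in nodes if n not in chosen]
        if not candidates:
            break
        cost, node = min(candidates)
        if cost >= best:
            break
        chosen.add(node)
        best = cost
    return chosen


def improve_until_deadline(start: Iterable[str], nodes: Sequence[str], dist: Distances,
                           proxy_cost: float, budget_s: float) -> Set[str]:
    """Improve an initial placement by single swaps until a local optimum or the deadline."""
    deadline = time.monotonic() + budget_s
    current = set(start)
    current_cost = placement_cost(current, dist, proxy_cost)
    improved = True
    while improved and time.monotonic() < deadline:
        improved = False
        for out in sorted(current):
            for new in nodes:
                if new in current:
                    continue
                trial = (current - {out}) | {new}
                cost = placement_cost(trial, dist, proxy_cost)
                if cost < current_cost:
                    current, current_cost, improved = trial, cost, True
                    break
            if improved:
                break
    return current
```

The second function reflects the general idea of the incremental approach: a quickly obtained placement is always available, and whatever time remains before the deadline is spent refining it.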

After the stand-alone tests, our next task is to integrate this module into the ADMS and to test it in an experimental environment before the multimedia service becomes available to a wider audience. This year, the three partners participating in ADMS development started a new project to complete the server and to introduce pilot applications. This is running in cooperation with two SMEs within the framework of the Hungarian Economical Competitiveness Operative Programme.

The Klagenfurt University implemented and evaluated a mechanism called Client Behaviour Prediction, which determines the most likely locations of client requests in the near future. This mechanism enables the server to execute the time-consuming process of placing the data managers and videos in advance, without significant delay in serving the client requests. For this reason, the placement of the videos has recently come into focus along with that of the proxies.

Numerous online training materials contain videos that can be streamed to the user. SZTAKI intends to apply the research results to e-learning applications containing video-based courseware. Our objective is to create an Adaptive Learning Management System through the integration of standard Learning Management Systems (LMS), our own standardized courseware repository, and the ADMS platform. This integrated system delivers the videos appearing in the training materials through the ADMS. Student behaviour in requesting the training videos can be modelled relatively well. For this reason, the student and course data gained from the LMS can significantly facilitate the prediction of the client requests.

Link:
http://143.205.180.128/ITEC/Publications/pubfiles/pdffiles/2004-0005-BGAT.pdf

Please contact:
Ottó Hutter, SZTAKI, Hungary
Tel: +36 1 279 6191
E-mail: [email protected]

Tibor Szkaliczki, SZTAKI, Hungary
Tel: +36 1 279 6172
E-mail: [email protected]

Balázs Goldschmidt, Budapest University of Technology and Economics, Hungary
Tel: +36 1 463 2649
E-mail: [email protected]

Scalable Audio Streaming to Mobile Devices
by Jonathan Sherwin and Cormac J. Sreenan

Telecoms operators are introducing new audiences to video and audio streaming as they roll out their 3G mobile phone technology throughout Europe and elsewhere. However, current implementations are really only the beginning, limited as much by commercial as by technical considerations. In comparison to the Internet, mobile environments will pose significant challenges for the quality delivery of streaming media. The use of scalable audio streaming offers promise for adapting quality in the face of mobility, but to date there has been a dearth of practical proposals and implementation experiences.

A project at University College Cork in Ireland has resulted in an experimental platform for scalable audio streaming to mobile endpoints. This research was supervised by Professor Sreenan and carried out for a research Master’s degree by Jonathan Sherwin, a Lecturer at Cork Institute of Technology. In the near future, a mobile device will be capable of connecting to many different wireless networks, eg 3G, GPRS, 802.11, Bluetooth. The choice of which interface to activate depends on factors such as availability, cost, and bandwidth, as well as coverage areas and mobility. The research addresses the issue of how to manage audio quality in such an environment.

Two key technical challenges arise due to variations in delay and variations in bandwidth. Streaming applications require uninterrupted playback that is isolated from the vagaries of network jitter. In the Internet this is commonly achieved using a playback buffer, but moving between networks can result in much larger delays due to handover protocols and the need to re-establish the flow of streamed data. In regard to bandwidth variability, in the Internet various techniques are used to determine the bandwidth available to a user and to adjust to mild and infrequent changes. But in a mobile environment extreme changes of bandwidth are likely as a mobile device roams between heterogeneous networks.

It was decided to design and implement a platform based on open standard protocols, and to implement an application-layer solution to mobility issues, meaning that the software should be able to run on a device connected to any network through which a TCP/IP connection could be established with the streaming server. The design made use of Apple Computer’s Darwin Streaming Server (DSS) and their QuickTime Player. These use the standard protocols for streaming, namely RTP/RTCP and RTSP. The approach taken was to add functionality directly to DSS, since its source code is made available by Apple, and indirectly at the client end of the network by implementing a client-side proxy, since the QuickTime Player is closed-source (see the figure).

The purpose of the proxy is to shield the client from the effects of changes of bandwidth or moves between networks. Achieving this initially required addition


A Networked Approach to TV Content Distribution
by Adrian Cahill, John Roche and Cormac J. Sreenan

The combination of broadband access and packet internetworking removes the existing spatial constraints for TV viewing by allowing residential users to obtain content that originates in any country. Allied with high-capacity networked storage, the temporal constraints can also be removed by ensuring ubiquitous access to a comprehensive library of stored TV content. This compelling vision presents several important research challenges in the area of efficient and scalable content distribution networks for high-quality streaming media.

Research taking place at University College Cork in Ireland under the direction of Prof. Sreenan is creating the elements that are needed to make this vision a reality. For his PhD degree, Adrian Cahill is investigating novel approaches for cost-effective management of TV content on a content distribution network (CDN), work that is part-funded by AT&T Labs USA. John Roche has designed and implemented an experimental network-based digital video recorder as part of his MSc research. The research commenced in 2001 and is nearing completion.

This research is motivated by several factors, including the penetration of broadband access and digital set-top boxes, availability of sophisticated techniques for bandwidth management and quality of service, and the popularity of personal digital video recorders. The latter are a class of home appliance that

of support for mid-stream scalability to DSS and its implementation in the streaming proxy. Then, components were added to the proxy to enable bandwidth monitoring and probing, and to re-establish the connection to the server in case of a move between networks. A buffer manager was designed for the proxy to maintain a continuous supply of data to the client even during a move between networks, and to replenish the buffer to a minimum level by varying the rate of transmission from the server as necessary.

The implementation was tested using a network emulator and some interesting results and insights were obtained.

Firstly, maintaining continuous playback to the user during a move between networks was demonstrated to be feasible. However, the duration of data needed in the buffer depended on a number of factors, most critically the length of time for which the mobile device is not connected to any network, and the length of time it takes to obtain an IP address on the new network. For example, obtaining an IP address via DHCP can take several seconds. These effects were quantified as part of the experiments.
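The dependence of the buffer requirement on these delays can be expressed as simple arithmetic. The sketch below is an illustration under stated assumptions: the delay components are added linearly, the stream has a constant bit-rate, and the function names and the optional restart and safety terms are hypothetical. It does not reproduce the quantities measured in the experiments.

```python
def required_buffer_seconds(disconnected_s: float,
                            address_setup_s: float,
                            stream_restart_s: float = 0.0,
                            safety_margin_s: float = 0.0) -> float:
    """Seconds of audio that must be buffered to play through a handover.

    disconnected_s: time with no network attachment at all
    address_setup_s: time to obtain an IP address on the new network (eg via DHCP)
    stream_restart_s: assumed time to re-establish the stream with the server
    safety_margin_s: extra headroom chosen by the designer
    """
    return disconnected_s + address_setup_s + stream_restart_s + safety_margin_s


def required_buffer_bytes(seconds: float, bitrate_kbps: float) -> int:
    """Storage needed to hold that much audio at a constant bit-rate."""
    return int(seconds * bitrate_kbps * 1000 / 8)
```

For instance, with two seconds of disconnection, three seconds for address acquisition and one second to restart the stream, six seconds of audio would be needed, which corresponds to 48,000 bytes for a 64 kbps stream.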

Secondly, managing the buffer in the face of changing levels of bandwidth is a complex problem. For example, if an extreme reduction in available bandwidth is detected (eg by falling levels of buffered data), the proxy requests a lowering of the quality of data sent from the server. However, before the server has received the request and acted on it, a certain amount of data is already buffered within the network, awaiting delivery. While this data trickles through, the proxy’s buffer level continues to fall, possibly to a critical level. The implementation has allowed clear observation and analysis of this problem.
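The effect described above can be reproduced with a small discrete-time simulation. This is a simplified model written for illustration, not the project’s buffer manager: it assumes a constant reduced bandwidth, two fixed stream bit-rates, a fixed amount of high-quality data already queued in the network when the request is made, and hypothetical parameter names.

```python
from typing import List


def simulate_buffer(initial_buffer_s: float, high_kbps: float, low_kbps: float,
                    bandwidth_kbps: float, in_flight_kbit: float,
                    threshold_s: float, steps: int, dt: float = 0.1) -> List[float]:
    """Return the proxy buffer level (seconds of audio) after each time step.

    Playback drains dt seconds of audio per step. Once the level drops below
    threshold_s the proxy requests the low bit-rate, but in_flight_kbit of
    high bit-rate data queued in the network must be delivered first.
    """
    buffer_s = initial_buffer_s
    requested = False
    backlog = 0.0  # high-quality kbit still queued after the request
    levels = []
    for _ in range(steps):
        if not requested and buffer_s < threshold_s:
            requested = True
            backlog = in_flight_kbit
        delivered = bandwidth_kbps * dt  # kbit reaching the proxy in this step
        if not requested:
            gained = delivered / high_kbps
        else:
            old = min(delivered, backlog)
            backlog -= old
            gained = old / high_kbps + (delivered - old) / low_kbps
        buffer_s = max(0.0, buffer_s + gained - dt)
        levels.append(buffer_s)
    return levels
```

Running the model with and without queued data shows the buffer continuing to fall well below the trigger threshold before the lower bit-rate takes effect, which is the behaviour the paragraph above describes.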

The key contributions of the project are, firstly, an open-source standards-based scalable streaming platform, and secondly, a set of valuable results and practical insights. Areas of future activity include the incorporation of a passive method of bandwidth measurement and an adaptive predictive control method to better cope with varying levels of network delay.

Link:
http://www.cs.ucc.ie/misl

Please contact:
Jonathan Sherwin, Cork Institute of Technology, Ireland
E-mail: [email protected]

Cormac J. Sreenan, University College Cork / Irish Universities Consortium, Ireland
E-mail: [email protected]

Figure: Software Architecture.


allow a user to record TV content for time-shifted viewing, but are strictly limited in their range of available content and capacity. The proposed system model is shown in the figure and envisages set-top box clients accessing network-based proxy storage servers which are organised into a large-scale CDN.

The various elements of the architecture are:
• proxies that store and deliver video objects on-demand
• a database is used as an indexing server to track what content is available and where it is located within the network
• a management server is used to keep track of accounting and managerial aspects of the network, such as what proxies are active
• a search server provides a user interface to a cached replica of the index database.

Clients connect to the Search Server and select the object to view; the Search Server redirects the client to the most appropriate Proxy Server that can deliver the object, and a streaming session is initiated. The system is technically a hybrid CDN/Peer-to-Peer architecture, consisting of proxies that are leased on-demand from Internet Service Providers and Network Operators. The proposed architecture enlists the use of idle ISP proxies during times of high load, and later, when the requests abate, the proxies are released from the network. The use of leased servers in this manner is an especially interesting and challenging feature of the work, and one which appears attractive from a commercial perspective in relation to deployment costs and operational flexibility.

Currently, the main focus of the work is on the computational placement problem, which involves deciding upon the number and location of the replicas within the network. The effectiveness of our architecture can be greatly affected by the placement strategy in use. If too many replicas were created, then storage resources would be wasted, whereas too few replicas would increase the distance (network hops) between the clients and the servers, possibly resulting in degraded performance. These factors and others need to be considered when deciding upon an optimal placement strategy. Finally, this placement strategy needs to be constantly evaluated, as object freshness and popularity diminish over time.

One solution to finding the optimal placement for video objects is to take all aspects of the problem and formulate a cost function. This cost function evaluates all placement instances and identifies the placement layout that yields the lowest resource usage. This is computationally expensive, so a heuristic approach is used, based on a hierarchical architecture that first decides the general region of the network for the replica placement and then performs an in-depth evaluation of all possibilities within this region.
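The two-stage idea can be sketched as follows. This is an illustrative toy, not the authors' cost function: the cost terms (a flat storage cost per replica plus hop-weighted demand), the region-selection rule (highest local demand) and all names and data below are assumptions made for the example.

```python
from itertools import combinations


def placement_cost(replicas, demand, hops, storage_cost):
    """Storage cost for every replica plus demand weighted by hops to the nearest replica."""
    transport = sum(
        requests * min(hops[client][proxy] for proxy in replicas)
        for client, requests in demand.items()
    )
    return storage_cost * len(replicas) + transport


def hierarchical_placement(regions, demand, hops, storage_cost):
    """Stage 1: pick the region with the highest demand. Stage 2: search it exhaustively."""
    busiest = max(
        regions,
        key=lambda name: sum(demand[c] for c in regions[name]["clients"]),
    )
    proxies = regions[busiest]["proxies"]
    # Every non-empty subset of the chosen region's proxies is a candidate layout.
    candidates = (
        subset
        for size in range(1, len(proxies) + 1)
        for subset in combinations(proxies, size)
    )
    best = min(candidates, key=lambda s: placement_cost(s, demand, hops, storage_cost))
    return best, placement_cost(best, demand, hops, storage_cost)


# Hypothetical topology: three clients, three proxies, two regions.
HOPS = {
    "c1": {"p1": 1, "p2": 3, "p3": 5},
    "c2": {"p1": 2, "p2": 1, "p3": 4},
    "c3": {"p1": 5, "p2": 4, "p3": 1},
}
DEMAND = {"c1": 10, "c2": 20, "c3": 5}
REGIONS = {
    "west": {"clients": ["c1", "c2"], "proxies": ["p1", "p2"]},
    "east": {"clients": ["c3"], "proxies": ["p3"]},
}
```

In this toy data, cheap storage (15 per replica) favours replicating on both western proxies, while doubling the storage cost makes a single replica on p2 the cheaper layout, which mirrors the trade-off between wasted storage and network distance described above.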

An experimental networked digital video recorder has been implemented to gain experience with the approach. Programme listings for all available channels are provided through an Electronic Programme Guide (EPG) based on ETSI naming standards. A GUI enables searching on a variety of information including:
• keywords
• broadcast channel
• broadcast time and date
• language
• episode number.

Yet-to-be-broadcast programmes that are identified in search results may be recorded by simply selecting an option on the GUI. Recording requests are sent to the network-based Management Server where a suitable Proxy is identified to complete the recording task. Whilst a programme is airing it is encoded by an Origin Server and transmitted by multicast to the designated Proxy and any clients wishing to view the programme in real-time. Proxy servers provide VCR-like functionality and content delivery is enabled by the utilisation of standard streaming protocols: Real-Time Streaming Protocol (RTSP) and Real-Time Transport Protocol (RTP).

Link:
http://www.cs.ucc.ie/misl

Please contact:
Adrian Cahill, University College Cork / Irish Universities Consortium, Ireland
E-mail: [email protected]

High-Level System Architecture.


This project was undertaken by the Distributed Systems group at Universitat Politècnica de Catalunya (UPC) in Barcelona (Spain), with the support of a Spanish publishing company (edebé). It was conceived in 2000 as an evaluation of the learning and collaboration possibilities in the growing number of classrooms in schools and universities that are equipped with networked PCs. The idea was to define, pilot and develop networked applications using simple PCs connected via Ethernet with no additional hardware. We started by considering reliable multicast transports for screen-sharing applications based on the Remote Frame Buffer (RFB) protocol, as a software replacement for expensive and inconvenient screen projectors. As part of the research, pilot tests with hundreds of schools from many educational levels all over Spain were carried out, which provided feedback and suggestions from the field. As a result of this process, an initial application was developed and a spin-off company (Rededia.com) was initiated in the summer of 2002. This company specializes in multicast-based synchronous applications and is also providing technical support for the product.

From that experience, the software evolved towards a complete reimplementation of a Windows COM component-based synchronous and multicast-based middleware for 1-N, N-1 and N-N communication over a LAN supporting several synchronous applications. The current application goes well beyond simply replacing a screen projector: it supports interaction between the instructor and all or several PCs in the class in both directions, and also within groups of students defined by the instructor. Screen content, files, streaming multimedia, Web URLs and objects can be efficiently shared among PCs in the class.

The research group and the company are working together on complementary topics. While Rededia.com is supporting, maintaining and commercializing the Redianet class product, on the research side the group at UPC is working on extensions of the transport for additional media, for larger-range distribution beyond LAN towards supporting multi-site collaboration, and on integrating sensors for tracking the location of people and PCs and automatically detecting the formation of groups. These new mechanisms would provide the applications with the context awareness to automatically and immediately offer collaboration support for emerging groups of people who are simply sitting at the same table, or move close to one another.

This long-term collaboration between a university research group and a technology company has proven mutually productive. Rededia.com and UPC recently applied for research and development project grants from national and European funding bodies with other ERCIM research institutions. Given the strong research component of our work, there is strong interest in collaborating with other ERCIM members, as well as in offering the product of our work to ERCIM members, particularly universities. An evaluation version of the product can now be downloaded in Catalan, Spanish and English (http://www.redianet.com), and feedback is both appreciated and useful for the evolution of the product.

Links:
Universitat Politècnica de Catalunya: http://www.ac.upc.edu

http://www.redianet.com

http://www.rededia.com

Please contact:
Leandro Navarro-Moldes, Politècnica de Catalunya, Spain
Tel: +34 9 3401 6807
E-mail: [email protected]

Manuel Oneto, REDEDIA, Spain
Tel: +34 9 3413 7952
E-mail: [email protected]

Presentation, Control and Collaboration in the Networked Classroom
by Leandro Navarro-Moldes and Manuel Oneto

Classrooms with networked PCs can be augmented by software to facilitate learning. Presentation: an IP multicast-based application allows screen content, streaming video, files and Web pages to be viewed in all PCs efficiently. Control: the instructor can observe a mosaic of thumbnail views of all PCs, and can select one for remote support. Collaboration: the instructor can define groups of students who can share screens, files or messages to facilitate collaborative learning.

Screenshot of the application from the viewpoint of the instructor.


A key technology for many applications, such as Digital TV Broadcasting, Distance Learning, Video on Demand, Video Telephony and Video Conferencing, is Digital Video Coding. In third generation telecommunication systems, communication technologies are extremely heterogeneous. Adapting the media content to the characteristics of different networks (communication links and access terminals) in order to obtain video delivery with acceptable service quality is thus an important issue. Video transcoding converts one compressed video bitstream into another with a different format, size (spatial transcoding), bit rate (quality transcoding), or frame rate (temporal transcoding). The goal of transcoding is to enable the interoperability of heterogeneous multimedia networks, reducing complexity and run time by avoiding the total decoding and re-encoding of a video stream.

We are interested in temporal video transcoding. This process skips some frames in order to change the frame rate of a video sequence without decreasing the video quality of non-skipped frames. In third generation mobile telecommunication systems (UMTS), the bandwidth of a coded video stream must be drastically reduced in order to cope with the constrained transmission channel. Frame skipping is a promising approach for transcoding one video sequence into another with a lower bit rate, while maintaining good video quality. Many multimedia services (such as videoconferencing, video telephony) have real-time features, so transcoding must guarantee a fixed communication delay. We have concentrated on this aspect, and have developed and evaluated two temporal transcoding architectures.

Temporal Transcoding Architectures
In a video sequence, many frames are coded with reference to previous frames, using motion vectors and prediction errors. In temporal transcoding, when a frame is skipped, the references of the next frame are no longer valid. Motion Vector Composition (MVC) is a procedure that computes the new motion vectors of the non-skipped frames. Once new motion vectors have been computed, new prediction errors are also needed for the transcoded frames. Another important issue in temporal transcoding is the choice of frames to be skipped. A first frame rate control architecture, Dynamic Frame Skipping (DFS), dynamically adjusts the number of skipped frames according to motion activity. Motion activity gives a measure of the motion in a frame, and frames with much motion are not skipped. Another temporal transcoding architecture, Frame Skipping Control (FSC), computes the prediction errors; this produces re-encoding errors, and frames are skipped on the basis of the effect of re-encoding errors and motion activity. The goal of this strategy is to minimize the re-encoding errors and to preserve the motion smoothness of the transcoded frames.
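A minimal sketch of the two ingredients mentioned above, under simplifying assumptions that are not taken from the article: motion vectors are composed by adding the co-located vectors of the skipped frames (in the spirit of telescopic composition; dominant-vector methods choose the reference block differently), and motion activity is taken to be the sum of absolute vector components over a frame. Function names are hypothetical.

```python
def compose_vectors(current, skipped_frames):
    """Re-reference a frame's motion vectors across skipped frames.

    current        -- list of (dx, dy) vectors, one per macroblock, for the kept frame
    skipped_frames -- list of such lists, one per skipped frame between the kept
                      frame and its new reference
    """
    composed = list(current)
    for frame in skipped_frames:
        # Add the vector of the co-located macroblock in each skipped frame.
        composed = [
            (cx + sx, cy + sy)
            for (cx, cy), (sx, sy) in zip(composed, frame)
        ]
    return composed


def motion_activity(vectors):
    """Assumed activity measure: sum of absolute motion-vector components."""
    return sum(abs(dx) + abs(dy) for dx, dy in vectors)


def may_skip(vectors, threshold):
    """Frames whose activity exceeds the threshold are kept rather than skipped."""
    return motion_activity(vectors) <= threshold
```

For instance, composing [(1, 0), (-2, 3)] across one skipped frame with vectors [(2, 1), (0, -1)] gives [(3, 1), (-2, 2)], whose activity under this measure is 8.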

The real time features of many advanced multimedia applications are not taken into account by the above architectures. In order to meet the needs of such applications, we have modified both architectures so that the output bit rate is constant, and the maximum communication delay is fixed. We achieved this by introducing a transcoder output buffer, and by skipping frames according to the buffer occupancy. The maximum communication delay depends on the buffer size.
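The buffer-driven skipping described above might be sketched like this. It is an illustration under assumptions, not the authors' algorithm: the skipping rule (drop a frame when it would overflow the buffer), the function names and the parameter values are hypothetical, and frame sizes are given in bits.

```python
def skip_by_buffer(frame_bits, buffer_bits, out_rate_bps, fps):
    """Return the indices of frames that are transmitted.

    The buffer drains at a constant out_rate_bps; a frame is skipped when
    adding it would exceed buffer_bits (assumed skipping rule).
    """
    drain_per_frame = out_rate_bps / fps  # bits leaving the buffer per frame interval
    occupancy = 0.0
    kept = []
    for index, size in enumerate(frame_bits):
        if occupancy + size <= buffer_bits:
            occupancy += size
            kept.append(index)
        # Constant-rate output during one frame interval.
        occupancy = max(0.0, occupancy - drain_per_frame)
    return kept


def max_delay_seconds(buffer_bits, out_rate_bps):
    """A full buffer takes this long to empty, which bounds the buffering delay."""
    return buffer_bits / out_rate_bps
```

With four 3000-bit frames, a 4000-bit buffer, a 32 Kbps output and 16 frames per second, the third frame is skipped and the buffering delay is bounded by 0.125 s; a larger buffer skips fewer frames at the price of a longer delay.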

Simulation Results
We implemented an MPEG4-based temporal transcoder and evaluated the performance of both our architectures over several benchmark videos. The results, in terms of PSNR (a measure indicating the quality of the transcoded sequence) are compared with those of a

Video Transcoding Architectures for Multimedia Real Time Services
by Maurizio A. Bonuccelli, Francesca Lonetti and Francesca Martelli

The video transcoding project at PisaTel, a laboratory located at ISTI-CNR in a joint collaboration between ISTI, Ericsson Lab Italy, the ‘Scuola S. Anna’ and Pisa University, aims at developing efficient solutions for real-time video coding and transcoding.

Average PSNR (dB) of temporal (DFS and FSC) and quality transcoder (QT) for ‘akiyo’ and ‘mobile’ video sequences (input bit rate = 64 Kbps, output bit rate = 32 Kbps).


quality transcoder (QT). The comparison shows that better performance is achieved by quality transcoding for videos with a lot of motion and by temporal transcoding (DFS and FSC) for videos with little motion (see the figure). Moreover, we observed that the DFS architecture has the better performance since in FSC many frames are skipped because of re-encoding errors. We obtained similar results using different MVC algorithms (Bilinear Interpolation (BI), Telescopic Vector Composition (TVC), Forward Dominant Vector Selection (FDVS), Activity Dominant Vector Selection (ADVS)).
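For readers unfamiliar with the PSNR figures used in this comparison, the standard definition can be computed as in the sketch below. It assumes 8-bit samples (peak value 255) and frames given as flat sequences of pixel values; how the authors handled skipped frames when averaging is not stated in the article and is not modelled here.

```python
import math


def psnr(original, transcoded, peak=255):
    """Peak signal-to-noise ratio in dB between two equally sized frames."""
    if len(original) != len(transcoded):
        raise ValueError("frames must have the same number of samples")
    mse = sum((a - b) ** 2 for a, b in zip(original, transcoded)) / len(original)
    if mse == 0:
        return math.inf  # identical frames
    return 10 * math.log10(peak ** 2 / mse)
```

Higher values mean the transcoded frame is closer to the original; identical frames give an infinite PSNR.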

Link:
Pisatel: http://pacinotti.isti.cnr.it/ERI/project4.htm

Please contact:
Francesca Martelli, ISTI-CNR, Italy
Tel: +39 050 315 3468
E-mail: [email protected]

Life today is becoming a multiplatform experience in which people are surrounded by different types of interactive devices, including mobile phones, personal digital assistants (PDAs), pagers, car navigation systems, mobile game machines, digital book readers, and smart watches, through which they can connect to networks in different ways.

This situation poses a number of challenges for the designers and developers of multi-device interfaces. A further complication is that these devices can use different modalities (graphics, voice, gestures, and so on, in different combinations). Thus, although there are now a number of multimodal systems in circulation, their development still remains a difficult task.

A promising solution to handle the complexity involved is to use logical device-independent, XML-based languages to represent concepts, such as user tasks and communication goals, along with intelligent transformers that can generate user interfaces in different implementation languages for different platforms depending on their interaction resources and modalities.

We have developed an authoring environment, TERESA, within the EU IST project CAMELEON to address issues related to multi-device interfaces. An extension was implemented in the SIMILAR Network of Excellence on Multi-Modal User Interfaces in order to support multi-modal user interfaces in multi-device environments. TERESA incorporates intelligent rendering to decrease the cost of developing multiple interface versions for the different target platforms, allowing designers to concentrate on logical decisions without having to deal with a variety of low-level details at the level of implementation languages.

In the design and development process, it is important to consider that there are tasks that may be meaningful only when using some specific platform or modality. For example, watching a long movie makes sense in a multimedia desktop system, whereas accessing information from a car in order to avoid a traffic jam can be done only through a mobile device, and if this task is performed while driving, it can be supported only through a vocal interface. The modality involved may also have an impact on how a task is accomplished; for example, vocal or graphical mobile phone interfaces require the user to perform tasks sequentially which could have been done concurrently in a desktop graphical interface.

In our approach, a user interface is structured into a number of presentations. A presentation identifies a set of interaction techniques that are enabled at a given time. The presentation is structured into interactors (logical descriptions of interaction techniques) and composition operators, which indicate how to put the interactors together. While at the abstract level, the interactors and their compositions are identified in terms of their semantics in a modality independent manner, at the concrete level their description and the attribute values depend on the modality involved.

Designing Multi-Modal Multi-Device Interfaces
by Silvia Berti and Fabio Paternò

The increasing availability of new types of interaction platforms raises a number of issues for designers and developers of interactive applications. There is a need for new methods and tools to support the development of Multi-Modal Multi-Device applications. The TERESA tool supports multi-modal user interfaces in multi-device environments.

A User Interface generated by the authoring environment.

In the case of both vocal and graphical support, multimodality can be exploited in different manners. Modalities can be used alternatively to perform the same interaction. They can be used synergistically within one basic interaction (for example, providing input vocally and showing the result of the interaction graphically) or within a complex interaction (for example, filling in a form partially vocally and partially graphically). A certain level of redundancy can also be supported, for example when feedback for a vocal interaction is provided both graphically and vocally. We have decomposed each interactor into three parts, and the above properties can be applied to each part: prompt, which represents the interface output indicating that the system is ready to receive an input; input, which represents how the user can actually provide the input; and feedback, which represents the output of the system after the user input.
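The decomposition into prompt, input and feedback can be pictured with a small data model. This is purely illustrative and is not TERESA's actual XML-based language: every class, field and value name below is hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Modality(Enum):
    GRAPHICAL = "graphical"
    VOCAL = "vocal"


class Use(Enum):
    SINGLE = "single"            # only one modality involved
    ALTERNATIVE = "alternative"  # either modality can be used
    SYNERGISTIC = "synergistic"  # modalities complement each other
    REDUNDANT = "redundant"      # the same content in several modalities


@dataclass(frozen=True)
class Part:
    modalities: frozenset
    use: Use = Use.SINGLE


@dataclass(frozen=True)
class Interactor:
    name: str
    prompt: Part
    user_input: Part
    feedback: Part

    def modalities_used(self):
        """All modalities involved in any of the three parts."""
        return (
            self.prompt.modalities
            | self.user_input.modalities
            | self.feedback.modalities
        )


BOTH = frozenset({Modality.GRAPHICAL, Modality.VOCAL})

# A graphical prompt, input by voice or keyboard, feedback given in both channels.
city_field = Interactor(
    name="select_city",
    prompt=Part(frozenset({Modality.GRAPHICAL})),
    user_input=Part(BOTH, Use.ALTERNATIVE),
    feedback=Part(BOTH, Use.REDUNDANT),
)
```

Attaching the property to each part, rather than to the interactor as a whole, is what lets one interaction combine, say, alternative input with redundant feedback.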

The composition operators indicate how to put such interactors together and are associated with communication goals. A communication goal is an effect that designers aim to achieve when they structure presentations. Grouping is an example of a composition operator that aims to show that a group of interface elements are logically related to each other. It can be implemented in the graphical channel through one or multiple attributes (fieldset, colour, location...), whereas in the vocal channel the grouping of elements is achieved by inserting a sound or a pause at the beginning and the end of the grouped elements. In the case of multimodal interfaces we have to consider the actual resources available. Thus, grouping on a multimodal desktop interface should be mainly graphical and the use of the vocal channel can be limited to providing additional information on the elements involved. Instead, the grouping for a multimodal PDA interface uses the vocal channel more extensively, while the graphical one is dedicated to important or explicative information (see Figure).
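As a rough illustration of how one logical grouping could be rendered per channel, the sketch below emits an XHTML fieldset for the graphical channel and wraps the spoken labels in pauses for the vocal channel. It is an assumption-laden toy, not TERESA's generator: the function name, the 500 ms pause and the use of an SSML-style break element are choices made for the example, and labels are assumed to need no escaping.

```python
def render_grouping(labels, channel):
    """Render a group of element labels for the given channel (hypothetical helper)."""
    if channel == "graphical":
        # Graphical grouping through a fieldset around the elements.
        items = "".join(f"<label>{label}</label>" for label in labels)
        return f"<fieldset>{items}</fieldset>"
    if channel == "vocal":
        # Vocal grouping: a pause at the beginning and the end of the group.
        pause = '<break time="500ms"/>'
        return pause + " ".join(labels) + pause
    raise ValueError(f"unknown channel: {channel}")
```

The same abstract operator thus yields different concrete markup, which is the point of keeping the grouping decision at the logical level.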

The current environment supports design and development for various types of platforms (form-based desktop, interactive graphical desktop, form-based mobile, interactive graphical mobile, vocal device, multimodal device) and generates the corresponding user interfaces in various implementation languages (XHTML, XHTML Mobile Profile, VoiceXML, SVG, X+V). Future work will be dedicated to supporting additional modalities, such as gestural interaction.

Links:
TERESA: http://giove.isti.cnr.it/teresa.html

SIMILAR Network of Excellence: http://www.similar.cc

HIIS Laboratory at ISTI-CNR: http://www.isti.cnr.it/ResearchUnits/Labs/hiis-lab/

Please contact:
Fabio Paternò, ISTI-CNR, Italy
Tel: +39 050 315 3066
E-mail: [email protected]

With the rapid diversification of the Web (access devices, communication networks etc), the utilization of multimedia documents is becoming common practice. For this reason, in 1998 the W3C designed a standard mark-up language dedicated to the description and synchronization of multimedia content: the Synchronized Multimedia Integration Language (SMIL, pronounced ‘smile’). However, authoring multimedia information remains a real challenge, as the traditional WYSIWYG paradigm is difficult to apply in non-deterministic multimedia presentations involving rich user interaction. Taking such parameters into account, the Web Adaptation and Multimedia (WAM) team of INRIA has developed LimSee2, an authoring tool that provides a powerful graphical user interface designed to assist in the manipulation of time-based multimedia and thereby increase productivity.

LimSee2 has a multi-view solution that renders the structure of the SMIL document at different levels during the authoring process: timing and synchronization, spatial layout, XML tree etc. The different views are synchronized (a modification in one view is immediately rendered in all the other views) and provide functionality that allows a user to manipulate and fine-tune a SMIL document without requiring a full knowledge of the language.

Spatial View
The spatial layout of a SMIL document can be edited in LimSee2 in a 2D canvas, which constitutes a WYSIWYG environment for a fixed time in the temporal scenario. SMIL regions can be easily moved, resized or created in a few clicks, media content can be directly previewed (for images, texts and videos), and region z-indexes can be adjusted with an intuitive drag-and-drop mechanism. The 2D canvas also provides traditional features such as a zooming tool or a customizable snap grid.

LimSee2: A Cross-Platform SMIL Authoring Tool
by Romain Deltour, Nabil Layaïda and Daniel Weck

LimSee2 is an open-source and cross-platform authoring tool dedicated to the manipulation of time-based multimedia documents for the Web. It relies on the SMIL standard of the World Wide Web Consortium (W3C).


Timing View
Visualizing the temporal scenario of a document during the authoring process is one of the key challenges of multimedia authoring. To fulfil this requirement, LimSee2 features a timeline view: a temporal element is represented by a box, with its length standing for the duration of the element, and its position standing for the start-time of the element. An arrow linking two boxes then represents a synchronization relation. Hence, the user can easily and intuitively adjust media synchronization by moving and resizing the boxes in the timeline. Moreover, a cursor may be moved through the timeline in order to see in the spatial view the state of the multimedia presentation at a specific time. This is very useful for previewing the results of the synchronization. In addition to SMIL-specific features (for instance SMIL temporal containers represented as collapsible box containers) the timing view also provides a zooming tool and media previewability.
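The mapping between temporal elements and timeline boxes can be sketched in a few lines. This is only an illustration of the idea, not LimSee2 code (LimSee2 itself is written in Java): the class, the function names and the pixels-per-second scale are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class TimedElement:
    name: str
    begin: float  # start time in seconds
    dur: float    # duration in seconds


def to_box(element, pixels_per_second):
    """Box position stands for the start time, box length for the duration."""
    return (element.begin * pixels_per_second, element.dur * pixels_per_second)


def active_at(elements, t):
    """Names of the elements playing when the timeline cursor is at time t."""
    return [e.name for e in elements if e.begin <= t < e.begin + e.dur]


SCENARIO = [
    TimedElement("video", begin=0, dur=10),
    TimedElement("caption", begin=4, dur=3),
]
```

Moving the cursor to 5 s selects both elements for display in the spatial view, while at 8 s only the video remains active.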

XML Features
As SMIL is an XML language, many well-established XML technologies are directly involved in LimSee2. First of all, the DTD-awareness of the application ensures the validity of the underlying SMIL document model (provided that it is modified within the LimSee2 GUI). Additionally, a built-in validator can check on demand the conformance of the document with SMIL 1.0 or 2.0 syntaxes, and allows the user to correct possible errors interactively. LimSee2 also features two XML-dedicated views: the Structure View, which renders the hierarchical structure of the SMIL document as a collapsible tree, and the Attributes View, which is a high-level DTD-aware attributes editor allowing fine-tuning of the presentation directly from the attributes values. As for the users who want to directly access the source of their SMIL documents, LimSee2 integrates an XML source editor. This features traditional syntax highlighting, pretty formatting, and incremental search and replace, and seamlessly synchronizes with the other views.

Other Features
In addition to the major features described above, LimSee2 also provides the following:
• the character encoding of documents is detected at opening and can be changed on-the-fly
• the LimSee2 interface has just been internationalized, and is available in English, French, and Japanese
• a native player (eg RealPlayer, Ambulant) can be launched from the application to externally visualize the edited document
• a ‘storing’ mechanism means entire SMIL presentations can be saved into a single directory (source, media content, linked documents), even for distant data accessed by HTTP
• slides can be automatically imported as JPG images from Microsoft PowerPoint or OpenOffice.org presentations
• a slideshow builder tool allows the creation in a few clicks of multimedia slideshow presentations, which synchronize a video/audio track, an interactive table of content, a navigation panel and imported slide images.

Collaboration and Perspectives
A new version of the application is under development. It will be more user-oriented, and will feature powerful authoring functionality such as template-based editing or reusable models of documents. In parallel, the WAM Team is cooperating closely with NRCD (National Rehabilitation Center for Persons with Disabilities, Japan) on accessibility features in LimSee2, in the context of a project aiming at establishing a natural disasters preparedness system.

Project Details
The development of LimSee2 was initiated by the Web Adaptation and Multimedia research team at the French National Institute for Research in Computer Science and Control (INRIA), Grenoble, France.

The project started in October 2002. The first public release was made on 13 June 2003. The latest available version is 1.5.2 and was released on 28 March 2005.

LimSee2 is developed with Java J2SE 1.4.2. It is currently available on Windows, Linux, and Mac OS X platforms.

Links:
LimSee2 Home Page: http://wam.inrialpes.fr/software/limsee2/

WAM Team Home Page: http://wam.inrialpes.fr/

W3C Synchronized Multimedia Page: http://www.w3.org/AudioVideo/

Please contact:
Nabil Layaïda, Romain Deltour, WAM Team, INRIA, France.
Tel: +33 4 76 61 52 84, +33 4 76 61 52 18
E-mail: [email protected], [email protected]


The LimSee2 SMIL authoring tool.


In recent years we have seen the Internet and television come closer in many ways. This began when streaming technology gradually opened the way to watching television programs on the Internet. IMK-ITV is now considering reversing this process, that is, implementing Internet-type applications on television (TV). Moreover, interactive TV (iTV) aims to combine traditional TV with additional services previously only available on the Internet. This is leading to the disappearance of the boundaries between two forms of content, namely infotainment and edutainment.

The development of iTV depends on several factors. The first of these is the development of the iTV production chain. This involves producing media content (ie a TV program), broadcasting it on a particular channel and receiving it using a TV view system. The second is the development of both the medium and paradigm of interactivity with iTV. So far, interaction with TV content is still limited, with ‘clickable’ interactivity as the sole paradigm of interaction, and the use of remote controls as the sole medium of interaction. The third factor is the development of artificial intelligence (AI) techniques that serve an intelligent iTV (IiTV) platform. To this end, speech recognition, natural language understanding and decision support systems should be deployed in the development of IiTV. This has enormous potential, as the number of Internet users, while rapidly increasing, is still small compared to television users.

Marilyn (Multimodal Avatar Responsive Live Newscaster) is a new system for interactive television, where a virtual reality three-dimensional facial avatar responds to the remote control in real time, speaking to the viewer and providing the requested information.

Marilyn informs the viewer about daily financial news at the click of a button. Here the focus is on the provision of choice as well as personalization of information in an entertaining manner. As well as offering live financial data from leading stock exchanges such as New York, London, Frankfurt and Tokyo, multilingual aspects of the information are also catered for.

Traditionally, financial news has been regarded as content-based and rather rigid in format. In contrast, the edutainment aspects of Marilyn can make such a program entertaining as well as informative.

Figure 1 shows a schematic of the structure of Marilyn. The application consists of three stand-alone but interrelated sections.

Financial Content
This section, which is primarily developed in Java, M3D (a Java3D language developed at Fraunhofer) and XML, defines the core information centre of Marilyn. Financial content from various locations (eg Frankfurt, London and New York) is fed in live to the system, and updated continuously.

3D Facial Avatar
This, which forms the heart of Marilyn, is a 3D humanoid face of the late actress Marilyn Monroe, which reads the financial news tickers live in the respective languages. The facial avatar is implemented using XML, MPEG4 and M3D1. The program implements natural mouth and eye gestures during speech, as in the case of a real human being.

Merging Virtual Reality and Television through Interactive Virtual Actors
Marilyn: Multimodal Avatar Responsive Live Newscaster
by Sepideh Chakaveh

A recent and innovative research activity of the ERCIM E-Learning Working Group and the Fraunhofer Institute for Media Communication (IMK) merges virtual reality, artificial intelligence and television.

Figure 1: A schematic of the structure of Marilyn.

Figure 2: Marilyn screenshot presentation of the Tokyo stock exchange.


Within a factory environment, there are several areas in which it is essential that human experts observe, test, control, and understand the details of the production processes. This is the case in both the design and the planning stages, as well as during manufacturing, testing, and verification. The new features of interactive multimedia services raise the level of quality, increase the efficiency, observability and controllability of production, and also allow the customer and the end-user to actively take part in the manufacturing processes.

Since tele-presence and interactive multimedia involve the integration of humans, our research work looks at man-machine interaction, and the definition of VMDs (Virtual Manufacturing Devices) devoted to human entities within the production area. A survey on wearable computers and head-mounted monitors was used to analyse current technology with a view to applying computer platforms on the human body.

Our research focused on a generalized software environment that enables tele-presence operations with interactive multimedia features. The tangible result of this work is called IMUTA: Interactive Multimedia for Tele-presence Applications. In this framework we extended Enterprise Resources with multimedia information and functionality. The Extended Resource model provides a richer and more human-oriented view of resources and processes. For example:
• machining can be observed remotely and in real time; relevant process data, product drawings, technology plans and quality reports can be chosen and shown according to actual requests
• documentation of value products can be extended with multimedia attachments to ensure the customers’ satisfaction
• customers can observe the production flow of a product.

Significant effort has been devoted to the planning of some pilot demonstrations. We looked for demonstration sites and service-intensive applications that would inspire the hosts and potential customers

Interactive Multimedia for Supporting the Quality of the Production
by George L. Kovács, Géza Haidegger and János Nacsa

In the frame of DIGITAL FACTORY, a Hungarian national R&D project on factory automation, researchers are investigating, developing and applying interactive multimedia technologies to assist design, production and test processes. The project is being undertaken at the CIM (Computer Integrated Manufacturing) Research Laboratory of SZTAKI and in the factory partner’s sites. Over the past three years, a generalized software environment (IMUTA) and some use cases have been developed.

Figure: The real and the simulated manufacturing cell.

The avatar may of course have any other face mapped to it, but, owing to the different bone structures of male and female faces, the sex of the facial avatar is pre-defined. This means the mesh structure for the avatar must be created in advance for a male or female presenter.

Equally, the voice of a male or a female speaker should be defined, as well as the desired language or accent. For example, if Marilyn reads the news in English, then information provided from the London stock exchange could be read with a British accent, or that from the New York stock exchange with an American accent. In addition, the financial data from the Frankfurt stock exchange can be read in German.

User Interface

Through the user interface the viewer may communicate with the facial avatar, which in turn selects the desired content. By clicking on the remote control it is possible to choose the stock exchange and in which language the information should be read. The user interface is first realized using FLASH and it is then implemented using MPEG4.

This project has been supported by Dr. Soha Maad, an ERCIM fellow, hosted by the Institute for Media Communications for 18 months.

Please contact:
Sepideh Chakaveh
Institute for Media Communication, Fraunhofer ICT Group
ERCIM E-Learning Working Group
E-mail: [email protected]


The CINEMA Project: A Video-Based Human-Computer Interaction System for Audio-Visual Immersion
by Renaud Dardenne, Jean-Jacques Embrechts, Marc Van Droogenbroeck and Nicolas Werner

Numerous studies are currently focusing on the modelling of humans and their behavior in 3D environments. The CINEMA project has similar goals but differentiates itself by aiming at enhanced interactions, the creation of mixed reality, and the creation of interactive and reactive acoustical environments. Part of the project consists in gesture recognition of a user, who is given the real-time control of auralization and audio spatialization processes.

The CINEMA project is a collaborative effort between teams at four Belgian universities. The project foresees the development of a real-time system that includes 3D user and environment modelling, extraction of motion parameters, gesture recognition, computation of depth maps, creation of virtual spaces, and auralization and spatialization. The merging of these techniques is expected to produce a tractable 3D model allowing users to interact with a virtual world.

Although much progress has been made in the areas of video and audio processing, there is still a need to find a way of properly combining these elements. The production of a tool for augmented or virtual reality that mixes video and audio could be used for a number of applications: as an educational tool for example, or at cultural places like museums.

This project started in January 2003 and will end in September 2006. It is funded by the Walloon Region, Belgium.

Following are details regarding the video-handling techniques and the interaction with the sound system:

The Video Analysis Engine

The major goal of the video part of the system is to identify simple human gestures. As real time operation is a compulsory requirement for any interactive system, we committed to the use of simple tools to achieve gesture recognition. Therefore our algorithm detects motion based on a known algorithm for background extraction, combined with a skin detection algorithm to isolate the head and the hands. To increase the robustness of the system, we tested several motion detection algorithms and combined them. For indoor scenes, a background extraction using a static background and a thresholding algorithm in the HSV colour space for skin detection appeared to suffice, except for the presence of shadows. Fortunately shadows have a specific range of colours and particular shapes that enable us to discard them.
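The combination of static-background extraction and HSV thresholding described for the video analysis engine can be sketched as follows. This is only an illustration under stated assumptions: the function names, the difference threshold and the HSV bounds are hypothetical and are not taken from the CINEMA project, and shadow removal is left out.

```python
import colorsys

def is_foreground(pixel, background_pixel, threshold=30):
    """Static-background extraction: a pixel counts as moving when its
    summed RGB difference from the reference frame exceeds a threshold
    (the threshold value is an illustrative assumption)."""
    return sum(abs(a - b) for a, b in zip(pixel, background_pixel)) > threshold

def is_skin(pixel, hue_max=0.14, sat_range=(0.2, 0.7), val_min=0.35):
    """Thresholding in the HSV colour space; the bounds are assumptions
    chosen for the example, not measured values."""
    r, g, b = (channel / 255.0 for channel in pixel)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    return h <= hue_max and sat_range[0] <= s <= sat_range[1] and v >= val_min

def skin_motion_mask(frame, background):
    """Mark the pixels that are both moving and skin-coloured.
    Frames are lists of rows of (R, G, B) tuples with values 0..255."""
    return [
        [is_foreground(p, bg) and is_skin(p) for p, bg in zip(row, bg_row)]
        for row, bg_row in zip(frame, background)
    ]
```

Requiring both tests at once is what keeps skin-coloured furniture (static) and moving clothing (not skin-coloured) out of the mask.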

to implement similar functions. Models were developed for the specific application scenarios in order to derive a generalized framework.

Some of these scenarios are the following:

• reuse of high-value waste workpieces is supported with complex databases, including production and design information, and quality multimedia data; users can reach these entities via an intranet environment

• a wearable computer-based test environment can support customer witness tests

• predefined deviations in the production parameters activate multimedia data acquisition at the necessary machines and operations; the effects of the time delay of the statistical data are solved with a circular multimedia buffer mechanism.
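The circular buffer mentioned in the last scenario can be illustrated with a short sketch. It assumes that the buffer always holds the most recent frames, so that footage recorded before a (delayed) statistical alarm is still available when the alarm arrives. The class and method names are hypothetical and do not come from IMUTA.

```python
from collections import deque

class CircularRecorder:
    """Keeps only the most recent `capacity` frames (hypothetical sketch)."""

    def __init__(self, capacity):
        # A deque with maxlen silently drops the oldest entry when full.
        self.frames = deque(maxlen=capacity)

    def push(self, timestamp, frame):
        """Store one captured frame together with its capture time."""
        self.frames.append((timestamp, frame))

    def snapshot_since(self, start_time):
        """Return the buffered frames captured at or after start_time,
        for example the moment the deviation actually began."""
        return [frame for t, frame in self.frames if t >= start_time]
```

For instance, with a capacity of five frames and ten frames pushed, only the last five remain, and a deviation reported late can still be matched with the frames recorded while it was happening.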

By integrating real-life vision and computer-animated entities, virtual and augmented reality services are offered for specific user applications (see Figure). For that purpose, we developed a distributed simulation and visualization environment for flexible manufacturing systems. It enables users to visualize the results of a simulation of an FMS (flexible manufacturing system) with three-dimensional graphics, or to visualize the actual data of a real system. On the client side, our system uses VRML and Java, so it needs only a Web browser with the appropriate plug-ins to use it. The users can view the scene provided by the simulation system via the Internet, choose the perspective from which they view it, and interact with it in the simulation phase. The IMUTA concept supports the integration of augmented reality.

These developments will make interactive multimedia a more widely accepted tool not only in simulation and experimentation, but in real factory applications. This will, however, require further organization and management.

Links:
http://www.sztaki.hu/sztaki/ake/cim/
http://www.digital-factory.sztaki.hu
http://www.ercim.org/publication/Ercim_News/enw51/monostori.html

Please contact:
George L. Kovács, Géza Haidegger, János Nacsa, SZTAKI, Hungary
Tel: +36 1 279 6140
E-mail: gkovacs|haidegger|[email protected]


Interactive digital television (iTV) is becoming widely available and is being promoted by broadcasters as a means of attracting viewers to digital TV and as an additional revenue stream. iTV provides additional services to the viewer that enrich the broadcast experience, including, for example, time-shifted viewing, bidirectional feedback, and additional content ranging from subtitles to interactive games. These all take the user's experience to a higher level than the linear, passive format of traditional TV. However, this comes at a cost.

Increasing the amount of content available whilst retaining current interaction devices increases the complexity of the viewer interaction, resulting in more buttons being pressed to locate additional content, and more information being crammed onto screens to show available options. Viewers could be provided with new forms of interaction devices such as personal computers, or increasingly complex remote controls. However, this would change the nature of the viewing experience from a shared, family, or social situation, to a personal tele-visual experience where one person has control, and possibly prime viewpoint.

A more promising approach is to personalise the experience so as to reduce the content presented to what is believed to be of interest to viewers. This is most commonly realised as personalised elec-

Personalised Enriched Broadcast Experience
by Mounia Lalmas, Nick Bryan-Kinns and Alan Pearmain

The SAVANT European research project developed integrated broadcast and Internet technologies, allowing users to access enriched broadcast content in an intelligent and transparent manner on a range of devices (eg TV, PC and PDA) and under varying network conditions. One aim of SAVANT was to personalise the enriched broadcast experience.

Once regions with skin have been detected, we must distinguish between the head and the hands. It is hard to propose a fast and general method that would consider all possible pathological cases for the relative positions of skinned regions. Therefore we rely on the assumption that the head lies in the middle of the region with motion. The final system is capable of recognizing simple hand positions (both hands raised above the head, both hands on the same side of the body etc), and sending appropriate instructions to the audio subsystem (see Figure 1).
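The head-in-the-middle assumption and the two hand positions named above can be sketched as a small rule-based classifier. This is a hypothetical illustration: the region centroids, the label strings and the use of image coordinates (y growing downwards) are assumptions made for the example, not details from the project.

```python
def split_head_hands(centroids, motion_x_range):
    """Pick as head the skin region closest to the horizontal middle of
    the moving region; the remaining regions are treated as hands."""
    centre_x = (motion_x_range[0] + motion_x_range[1]) / 2.0
    head_index = min(range(len(centroids)),
                     key=lambda i: abs(centroids[i][0] - centre_x))
    head = centroids[head_index]
    hands = [c for i, c in enumerate(centroids) if i != head_index]
    return head, hands

def classify_gesture(head, hands):
    """Recognise two simple hand positions from (x, y) centroids.
    Image coordinates are assumed, so a smaller y means higher up."""
    if len(hands) != 2:
        return "unknown"
    if all(y < head[1] for _, y in hands):
        return "both_hands_raised"
    if all(x < head[0] for x, _ in hands) or all(x > head[0] for x, _ in hands):
        return "both_hands_same_side"
    return "neutral"
```

A rule set of this kind is cheap enough for real-time use, at the price of failing in the pathological cases the text mentions, such as a hand passing in front of the face.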

The Audio Subsystem

To build a realistic and immersive world, we have implemented a sound spatialization system. This audio system is able to synthesize localized sound sources, which means that we can decide on the orientation and distance of all the sources. In addition, the system uses real-time software to produce a realistic sound. This last technique, called auralization, is fully configurable and takes into account the room configuration or the virtual audio environment via its impulse response, and the positions of the sound sources and the listener. For the sound rendering, a multichannel setup is used, and the audio signals are distributed to the loudspeakers according to an amplitude panning technique.
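Amplitude panning can be illustrated with a minimal sketch. The article does not say which panning law is used, so the example assumes pairwise constant-power panning over loudspeakers placed at equal angles on a circle; the function name and the speaker layout are hypothetical.

```python
import math

def pan_gains(azimuth_deg, n_speakers):
    """Return one gain per loudspeaker for a source at the given azimuth.

    Assumes n_speakers >= 2 loudspeakers equally spaced on a circle,
    with speaker 0 at 0 degrees. Only the two speakers adjacent to the
    source receive signal, with cosine/sine gains so that the total
    power (sum of squared gains) stays equal to 1."""
    spacing = 360.0 / n_speakers
    position = (azimuth_deg % 360.0) / spacing
    first = int(position) % n_speakers
    second = (first + 1) % n_speakers
    fraction = position - int(position)   # 0 at `first`, 1 at `second`
    gains = [0.0] * n_speakers
    gains[first] = math.cos(fraction * math.pi / 2)
    gains[second] = math.sin(fraction * math.pi / 2)
    return gains
```

With four loudspeakers, a source at 45 degrees is shared equally between speakers 0 and 1, each with a gain of about 0.707, so the perceived loudness does not dip as the source moves between them.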

The Communication Channel between the Video and Audio Subsystems

The video and the audio subsystems communicate through a socket interface. This approach offers the choice of running all the algorithms on a single machine (using the loopback) or on separate machines. As all the machines are presumably located on a LAN, a communication channel based on the UDP protocol was chosen. The subsystems interoperate, and in order to allow for further extensions, both understand a tag-based language built on XML.
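A minimal sketch of such a channel, using UDP datagrams that carry small XML messages, is given below. The message format (a `gesture` element with `name` and `confidence` attributes) is invented for the example and is not the CINEMA tag language; the loopback helper simply plays both the sending and the receiving side on one machine.

```python
import socket
import xml.etree.ElementTree as ET

def encode_gesture(name, confidence):
    """Build a small XML message (hypothetical format) as bytes."""
    message = ET.Element("gesture",
                         {"name": name, "confidence": "%.2f" % confidence})
    return ET.tostring(message, encoding="utf-8")

def decode_gesture(payload):
    """Parse the XML message back into (name, confidence)."""
    message = ET.fromstring(payload)
    return message.get("name"), float(message.get("confidence"))

def send_over_loopback(payload):
    """Send one datagram to a receiver on the same machine and return
    what the receiver got. Port 0 lets the OS pick a free port."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.bind(("127.0.0.1", 0))
        receiver.settimeout(2.0)
        sender.sendto(payload, receiver.getsockname())
        data, _ = receiver.recvfrom(4096)
        return data
    finally:
        sender.close()
        receiver.close()
```

Because UDP gives no delivery guarantee, a design like this suits a LAN where an occasional lost gesture message is acceptable; running the subsystems on separate machines only requires replacing the loopback address with that of the audio host.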

Link:
http://www.ulg.ac.be/telecom/research/cinema.html

Please contact: Nicolas Werner, Université de Liège, Belgium Tel: +32 4 366 2652 E-mail: [email protected]

Renaud Dardenne, Université de Liège, Belgium Tel: +32 4 366 2686 E-mail: [email protected]

Jean-Jacques Embrechts, Université de Liège, Belgium Tel: +32 4 366 2650 E-mail: [email protected]

Marc Van Droogenbroeck, Université de Liège, Belgium Tel: +32 4 366 2693 E-mail: [email protected]

Head and hands detection.


tronic programme guides (EPG). However, as the amount of content increases, personalisation needs to be introduced within programmes and sub-programmes, especially in information-rich domains such as news programmes. In this article we describe our personalisation system developed as part of the SAVANT project (Synchronised and Scalable Audio Visual content Across NeTworks).

Personalisation

A service consists of a number of service components, with a component consisting of either a segment of the main broadcast or an item of additional related content (eg MPEG-4 clips, HTML pages, 3D graphics). A service is personalised by modifying it according to user preferences so that only those service components that are of interest to the user are shown or recommended.

A SAVANT screen, for the news domain, provides an alphabetical listing of content items from the daily archive, grouped under their respective topic headings. Figure 1 shows how content is personalised according to user interests, which are expressed in user profiles. Personalisation is applied to the topic listings as well as the individual main and additional content items of a news story within each topic.

The personalisation system also builds recommendations according to user profiles (see Figure 2). A recommendation screen includes only the ranked list of content items that the user is likely to be interested in, whereas a personalisation screen includes all content items, but places the user’s ‘favourites’ at the top of the list.
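The difference between the two screens can be made concrete with a short sketch. It assumes that each content item already has an interest score between 0 and 1 for the current user; the function names and the 0.5 cut-off are hypothetical and are not taken from SAVANT.

```python
def personalised_list(items, scores):
    """Personalisation screen: every item is kept, favourites first.
    Python's sort is stable, so items with equal scores keep their
    original (for example alphabetical) order."""
    return sorted(items, key=lambda item: -scores.get(item, 0.0))

def recommended_list(items, scores, threshold=0.5):
    """Recommendation screen: only the items the user is likely to be
    interested in, ranked by score. The threshold is an assumption."""
    liked = [item for item in items if scores.get(item, 0.0) >= threshold]
    return sorted(liked, key=lambda item: -scores[item])
```

For the topics economy, football, politics and weather with scores of 0.2, 0.9, 0.6 and no score respectively, the personalised list is football, politics, economy, weather, while the recommended list contains only football and politics.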

Implementation

Personalisation was applied to the main broadcast content and any additional content included in the digital TV service itself. Any type of additional content, together with segments of the main content, was considered as service components within the complete service of a TV programme. A novel metadata model based on the existing standards TV-Anytime, MPEG-7 and MPEG-21 was developed to transparently segment the traditional linear programme into sub-components.

User profiles were represented as a list of likes and dislikes with associated probability values reflecting the degree of interest. By examining over time the type of content that users chose to view, user profiles were created and continually maintained to reflect the changing interests of the user, by employing a relevance feedback algorithm.

Information retrieval technologies were used to prioritise news items and to build recommendations. The metadata associated with news items was indexed, and this index was matched against user profiles. Matching resulted in the probability that the news item is relevant to the user profile: the higher the probability value, the more the news item is estimated to be of interest to the user.
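A profile of likes and dislikes with probability values, maintained by relevance feedback and matched against item metadata, could look like the sketch below. The article does not describe the actual algorithms, so the update rule (a simple move towards 1 for viewed content and towards 0 for skipped content) and the matching rule (the mean profile value over an item's metadata terms) are assumptions chosen for illustration, as are all names.

```python
NEUTRAL = 0.5  # assumed prior interest for terms the profile has not seen

def update_profile(profile, item_terms, viewed, rate=0.2):
    """Relevance feedback sketch: nudge the probability of each metadata
    term towards 1.0 if the user viewed the item, towards 0.0 if not.
    Returns a new profile and leaves the old one untouched."""
    target = 1.0 if viewed else 0.0
    updated = dict(profile)
    for term in item_terms:
        old = updated.get(term, NEUTRAL)
        updated[term] = old + rate * (target - old)
    return updated

def match(profile, item_terms):
    """Estimate how relevant an item is to the profile as the mean
    probability of its metadata terms (an illustrative choice)."""
    if not item_terms:
        return NEUTRAL
    return sum(profile.get(t, NEUTRAL) for t in item_terms) / len(item_terms)
```

After one viewed sports item and one skipped finance item, the profile holds 0.6 for sport and 0.4 for finance, so a new sports story is ranked above a new finance story.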

Evaluation

A study was carried out to evaluate the appropriateness of the personalisation of iTV services. We considered the appropriateness of a recommended news clip to be measurable in terms of its perceived relevance to some defined other semi-randomly generated.

Our results indicate that the recommendations were on the whole better than the semi-random generated ones, which means that appropriate recommendations were made to users. Post-feedback showed some skepticism from the participants in terms of the utility of personalised services. Participants expressed their concerns about using recommendations to structure their viewing, but were still interested in receiving such recommendations. They also indicated that personalisation may be of use when there is a lot of content available and some pre-filtering is required, but would not be appreciated if it actually selected additional possible items to view and so contributed to information overload. So, it is the filtering aspect that was seen by the participants as its key utility in their interactive TV experience.

SAVANT

This project was funded by the European Commission IST programme. Our partners were Brunel University, UK; Expway, France; Fraunhofer IPSI, Germany; Institut für Rundfunktechnik, Germany; Nederlands Omroepproduktie Bedrijf, The Netherlands; Rundfunk Berlin-Brandenburg, Germany; Siemens, Germany; STT, Spain; Telenor, Norway; TNO Telecom, The Netherlands. The interface was designed by IPSI and the content was provided by RBB (Rundfunk Berlin-Brandenburg).

Link:
http://www.savant.tv/

Please contact:
Mounia Lalmas
Queen Mary University of London, UK
E-mail: [email protected]://qmir.dcs.qmul.ac.uk

Figure 1: Personalisation.

Figure 2: Recommendation.


R&D AND TECHNOLOGY TRANSFER

Articles in this Section

48 LIGHT: XML-Innovative Generation for Home Networking Technologies by Luca Tarrini and Vittorio Miori

49 Modelling of Authentic Reflectance Behaviour in Virtual Environments by Michal Haindl and Jiří Filip

51 Haptic Training Systems in Virtual Surgery by José San Martín, David Miraut, Carolina Gómez and Sofía Bayona

52 Computer Recognizes Whale Tails by Annette Kik, Eric Pauwels and Elena Ranguelova

53 Text Document Classification by Jana Novovičová

54 Advancing Black-Box Reuse in a Multimedia Application Framework by Bernhard Wagner

55 CASSEM: Vibration Control in the Smart Way by Salim Belouettar

57 Point6: The IPv6 Skill Centre— Moving to the Next-Generation Internet Protocol by César Viho and Annie Floch

58 Working Slower with More Powerful Computers by Lorenz M. Hilty, Andreas Köhler, Fabian van Schéele and Rainer Zah

59 Coordinating IST Research across Europe by Simon Lambert

60 Software Automation meets Interactive Media Development by Dirk Deridder, Thomas Cleenewerck, Johan Brichau and Theo D'Hondt

LIGHT (xmL-Innovative Generation for Home networking Technologies) is a research project of the Domotics Lab at ISTI-CNR, Pisa, in collaboration with two local SMEs: SISTER (Sistemi-Territoriali), expert provider of professional GIS services and consultancy, and MicroComm (Microwave Communications), specialized in microwave communications circuits and systems. The project began in November 2004 and is currently developing a lightweight middleware, WSED (Web Services for Embedded Devices), for the seamless networking of embedded systems through the Web services protocols.

Increasingly, many appliances, such as video recorders, MP3 players, mobile phones and personal digital assistants have small embedded computing units that allow their users to interoperate in a ubiquitous environment. The current trend of including a growing number of transistors into ever smaller devices, while decreasing their power consumption and costs, is pointing the Web services domain in a new and interesting direction, the home environment.

We believe that the advent of service-oriented protocols opens the way to universal, platform and language neutral connectivity among appliances. This technology is offering much of real interest to the home networking community, in particular with respect to system architecture. In recent years, a number of architectures for home networking have been proposed, all aiming at facilitating dynamic cooperation among devices in order to be able to announce the presence of new services in a network, or to discover services and use them. However, so far, no single architecture provides all the capabilities of the others, and none is yet sufficiently mature to dominate the field.

Our research is thus focused on the integration of a rich suite of Web service specifications with resource-constrained devices to enable the typical networking functions. Figure 1 shows our proposed architecture.

The WSED Core Broker integrates current and emerging Web service specifications (see Figure 2), so that these can be used efficiently to offer new services with better discovery, description, and eventing models, and advanced security. In fact, the next generation of services in the home environment will be autonomous and platform-independent computational elements, described, published, discovered, orchestrated, and programmed using standard protocols. This will significantly reduce the cost and time that users have to spend in configuring and managing their pervasive devices.

To satisfy the specific needs for interoperability between the WSED core and

LIGHT: XML-Innovative Generation for Home Networking Technologies
by Luca Tarrini and Vittorio Miori

The LIGHT project aims at designing and developing a standard, service-oriented architecture for the next generation of embedded devices in order to render home networking easier and more effective for the end-user. LIGHT will exploit the Service Oriented Architecture (SOA) paradigm in a new domain, the home environment, proposing advanced and interoperable profiles for household appliances and devising solutions for the problems of overhead associated with hosting Web services on embedded systems.

Figure 1: LIGHT Architecture.


other home computing platforms, LIGHT employs DomoML (Domotics Markup Language), an XML language for communication between household appliances, standardizing the exchange of messages independently of transport media. DomoML provides a unified vocabulary for the description of devices, services and applications, thus guaranteeing interoperability among heterogeneous domotics architectures such as UPnP, Konnex, X10, and so on. The heart of DomoML is the common type system which can be used to describe services and applications.

The Personal Universal Controller (PUC) improves the interface of household appliances. It downloads a specification of the appliance’s features and then automatically generates an interface to control that appliance. The XML-based specification language abstractly describes the various functions of the appliances. This specification has been provided by the Pebbles project and we thank Jeffrey Nichols and Brad A. Myers of Carnegie Mellon University for the software.
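The idea of generating a control interface from an abstract XML description can be sketched as follows. The XML vocabulary used here (`appliance` and `state` elements with `type`, `min` and `max` attributes) and the widget names are invented for the example; they are not the actual PUC specification language.

```python
import xml.etree.ElementTree as ET

# Hypothetical appliance description, not real PUC syntax.
SPEC = """<appliance name="stereo">
  <state name="power" type="boolean"/>
  <state name="volume" type="integer" min="0" max="10"/>
</appliance>"""

# Assumed mapping from abstract state types to concrete controls.
WIDGETS = {"boolean": "toggle", "integer": "slider"}

def generate_interface(spec_xml):
    """Read the abstract description and choose one control per state.
    Unknown types fall back to a read-only label."""
    root = ET.fromstring(spec_xml)
    controls = []
    for state in root.findall("state"):
        widget = WIDGETS.get(state.get("type"), "label")
        controls.append((state.get("name"), widget))
    return root.get("name"), controls
```

Because the description names functions rather than buttons, the same specification could be rendered differently on a PDA, a phone or a speech interface by swapping the widget mapping.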

Another objective of the project is to integrate these XML-based languages into a single architecture to satisfy the need for interoperability among intelligent appliances and the generation of remote control interfaces for complex appliances. However, in order to integrate all these components within an embedded device, LIGHT must provide solutions to minimize the resource requirements. In fact, the current obstacle to hosting Web services on embedded devices is the high requirement of resources in terms of processing power, available memory, and network bandwidth. This is due to the overhead associated with the execution of applications, data processing and marshalling techniques. The processing and memory limitations of embedded devices may severely compromise the usability of XML, unless suitable optimizations are employed. LIGHT will thus investigate and offer efficient solutions in the middleware layer to overcome the constraints of limited resources typically imposed by embedded devices.

This research has been supported partly by the ‘Curiosity Driven’ programme of ISTI-CNR.

Link: http://light.isti.cnr.it/

Please contact:
Vittorio Miori, ISTI-CNR, Italy
Tel: + 39 050 315 3007
E-mail: [email protected]

Luca Tarrini, ISTI-CNR, Italy
Tel: + 39 050 315 2607
E-mail: [email protected]

Figure 2: Web Services Specification

The goal of this project is to develop a novel, image-based, physically correct visualization technology for VR-systems. The limited capabilities of current graphics systems restrict the appearance of materials covering virtual objects to low-quality unnatural approximations. The project aims to overcome this restriction by developing a physically correct simulation of light distribution and reflection, as well as an image-based real-time visualization technology for synthetic objects with complex reflectance behaviour. RealReflect (Real-time visualization of complex reflectance behaviour in virtual prototyping) is a joint research project between the University of Bonn; Vienna Technical University; Max Planck Institute for Computer Science, Saarbrücken; Institute of Information Theory and Automation, Academy of Sciences of the Czech Republic (UTIA), Prague; French National Institute for

Modelling of Authentic Reflectance Behaviour in Virtual Environments
by Michal Haindl and Jiří Filip

Recent advances in computer hardware and virtual modelling allow the view and illumination dependencies of natural surface materials to be respected. The corresponding texture representation in the form of Bidirectional Texture Function (BTF) enables significant improvements in the realism of virtual models, at the expense of an immense increase in material sample data space. Consequently, the introduction of a method for the fast compression, modelling and rendering of BTF data is inevitable. The development of such mathematical models is among the major objectives of the European RealReflect project.


Research in Computer Science and Control (INRIA), Grenoble; DaimlerChrysler AG, Germany; ICIDO GmbH, Germany; Faurecia, France; and Virtual Reality Architects, Austria. The project is funded by the EU IST-Programme. Several major applications have already come from the project, including realistic virtual models in architecture and the automotive industry; for example, the simulation of safety issues such as blinding of the driver by interior lights in a night-driving situation.

UTIA is responsible for development of mathematical models capable of simulating and compressing BTF textures. Modelling of a natural texture is a very challenging task, due to the unlimited variety of possible surfaces, illumination and viewing conditions along with the strong discriminative functionality of the human visual system. The BTF measurements span the whole hemisphere of possible light and camera positions in observed material sample coordinates, according to selected quantization steps. We used the best recent BTF measurements from our University of Bonn project partners, which contain 6561 images per material sample. In their lossless compressed form, these data require about 5 GB per material sample. Such materials present in a car interior VR scene can easily reach the range of terabytes.

UTIA has developed several BTF modelling methods, which can be divided into three distinct groups: probabilistic models, reflectance models and sampling-based models. As with other texture applications, we have confirmed the experience that there is no ideal BTF model. Models in each group have complementary properties and consequently also optimal application areas.

Our probabilistic models are based either on a set of underlying Markov random fields or probabilistic mixtures, and allow unlimited texture enlargement, BTF texture restoration, huge BTF space compression (up to 1:1 000 000) and even modelling of previously unseen BTF data. These methods require neither the storing of original measurements nor any pixel-wise parametric representation. However, such models are non-trivial, and several unsolved theoretical problems exist that must be circumvented. Nonlinear reflectance models offer BTF modelling with excellent visual quality and mild compression ratio (1:200) as well as a fast graphics hardware implementation. Sampling approaches rely on sophisticated sampling from original BTF measurements. They offer high visual quality for most textures, negligible computation complexity but only a moderate compression ratio (1:4). Another drawback of these methods is that unlike our probabilistic models, they do not allow a BTF data space restoration or modelling of unseen (unmeasured) BTF space data. Finally a hybrid method based on Gaussian distribution mixtures was developed with the aim of combining the advantages of both basic texture-modelling approaches. This hybrid model can be used either to directly synthesize textures or to statistically control sampling from the original data.
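To put the quoted figures side by side, the small calculation below works out what the three compression ratios would mean for one material sample. It rests on two assumptions that the article does not state: that the 6561 images correspond to a square grid of light and camera directions (81 × 81), and that the ratios can be applied to the 5 GB figure. The function names are hypothetical.

```python
import math

IMAGES_PER_SAMPLE = 6561   # from the text
SAMPLE_SIZE_GB = 5.0       # from the text (losslessly compressed form)
RATIOS = {"probabilistic": 1_000_000, "reflectance": 200, "sampling": 4}

def directions_per_hemisphere(images=IMAGES_PER_SAMPLE):
    """Assume every light direction is combined with every camera
    direction, so the image count is a perfect square."""
    side = math.isqrt(images)
    if side * side != images:
        raise ValueError("image count is not a perfect square")
    return side

def compressed_size_mb(ratio, sample_gb=SAMPLE_SIZE_GB):
    """Size of one sample in MB after compression at 1:ratio."""
    return sample_gb * 1024.0 / ratio

def samples_per_terabyte(sample_gb=SAMPLE_SIZE_GB):
    """How many uncompressed samples are needed to pass 1 TB."""
    return math.ceil(1024.0 / sample_gb)
```

Under these assumptions a sample shrinks from 5 GB to 1280 MB with sampling-based models, to about 25.6 MB with reflectance models, and to a few kilobytes at the best probabilistic ratio, while roughly 205 uncompressed samples already exceed one terabyte.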

Regardless of the traits of individual models, they all meet comprehensive requirements such as unlimited seamless BTF texture enlargement, high visual quality and compression. In addition, they include some less obvious features like the strict separation of analytical and synthetic parts, possible parallelization or implementation inside a graphics-processing unit.

Links:
http://www.realreflect.org
http://www.utia.cas.cz/RO

Please contact:
Michal Haindl, CRCIM (UTIA), Czech Republic
Tel: +420 2 6605 2350
E-mail: [email protected]

Figure 1: The Mercedes C100 gearbox (3D model courtesy of DaimlerChrysler) covered with synthetic BTFs generated by probabilistic models.

Figure 2: BTFs generated by the synthetic reflectance model applied to part of the C100 car interior.


Virtual Reality is acquiring increasing importance due to its potential in many areas of science and technology. Its aim is the manipulation of human senses through computer-generated virtual scenes such that users can interact with the virtual environment intuitively and in real time, without realising it is not real. In order to achieve these goals, virtual reality hardware (eg head-mounted displays, data gloves, tracking systems and haptic devices) is being used.

Thanks to this potential, virtual reality has been applied to education, military training, architectural design, psychology (for phobia therapy) and medicine, in particular for anatomical teaching and surgical training systems such as Minimal Invasive Surgery (MIS).

Typically, most efforts have been focused on composing and recreating synthetic images and sounds. However, this kind of virtual environment, which involves only visual and auditory sensations, is very limited in its ability to interact with users. A growing research community has realized the necessity of the sense of touch in performing precise tasks like surgical simulations and the remote operation of robotics in hazardous environments. During these procedures, the handling of assorted objects is critical, and haptic assistance clearly improves performance.

Recent advances in computing and haptic devices now allow the pressure and temperature sensory receptors in human skin to be ‘cheated’. In order to create the sensation of touching and manipulating virtual objects in real time, we need to generate the reaction force over them. These receptors are spread over a large area and are extremely fast, so the reflecting force must be recalculated over 1000 times per second. This is a complex challenge when we deal with deformable bodies as in MIS.
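The update-rate requirement can be illustrated with a sketch of a force loop running at 1000 Hz. The force model used here, a simple spring proportional to how far the tool tip has penetrated the surface, is a common textbook simplification and an assumption of this example; it is not the deformable-body model needed for MIS, and the stiffness value and function names are hypothetical.

```python
def reaction_force(penetration_m, stiffness_n_per_m=800.0):
    """Spring-like reaction force in newtons. A negative penetration
    means the tool is not touching the tissue, so no force is felt."""
    return stiffness_n_per_m * max(0.0, penetration_m)

def simulate_probe(penetrations_m, rate_hz=1000):
    """Evaluate the force once per haptic frame and return a list of
    (time in seconds, force in newtons). At 1000 Hz each evaluation
    has a budget of one millisecond."""
    dt = 1.0 / rate_hz
    return [(i * dt, reaction_force(p)) for i, p in enumerate(penetrations_m)]
```

The one-millisecond budget is what makes deformable tissue hard: a rigid spring model costs a single multiplication per frame, whereas recomputing a deforming mesh within the same interval is far more demanding.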

MIS techniques offer important advantages, such as decreasing patient trauma and reducing costs, but they also have an important drawback: they are complex procedures and difficult to master. Traditional learning methods, assuming the importance of force feedback, use real surgical instruments with phantoms (plastic models) for training. However this set of instruments is expensive and delicate; moreover, the plastic models degrade due to the incisions until they become unusable.

Here virtual reality and haptics have come up with the solution. Our team, in cooperation with the company GMV, is developing a successful advanced arthroscopy training simulator, based on virtual reality. Using this simulator, the practitioner learns to handle the surgical instruments and to recognize pathologies. The incorporation of a haptic device facilitates triangulation and navigation learning, making it possible to feel and distinguish between different tissues.

Current haptic MIS training systems canbe classified as either general purposedevices (eg from SensAble PHANToM)or brand new devices dedicated tosurgical simulation (eg Immersion’sLaparoscopy Impulse Engine, VirtualLaparoscopic Interface-VLI withoutforce-feedback, and Laparoscopic

Surgical Workstation withforce-feedback). We havedesigned a dedicatedsystem, but there remainsome limitations that wewant to solve.

Shoulder arthroscopy requires a very flexible working area, especially regarding inter-trocar distance (this is fixed in current devices at 135 mm). Human body dimensions differ from one patient to another, and this technique necessitates the simultaneous positioning of a trocar on each side of the patient’s shoulder.

We need both surgical instruments to face each other, one trocar against the other: this is not normally possible because of the main stand structure. It is therefore necessary to include this additional degree of freedom.

It is crucial to ensure that even if the instruments face each other, and the minimum lateral distance between them has been established, they do not touch one another.

Haptic Training Systems in Virtual Surgery
by José San Martín, David Miraut, Carolina Gómez and Sofía Bayona

In the near future, the learning process of different surgery techniques is going to be based on surgical simulators. On the one hand, emerging Minimally Invasive Surgery (MIS) means minimum patient trauma. On the other hand, the only assistance available to the surgeon is an indirect visualization of the operation, via a video monitor. In this situation, the sense of touch has become a key feature in the training of surgeons. Consequently, there is a need for the development of virtual reality haptic training systems.

A training session with Laparoscopy Impulse Engine.


52 ERCIM News No. 62, July 2005

R&D AND TECHNOLOGY TRANSFER

The suggested design, the Laparoscopy Training System, provides the usual five degrees of freedom (including a scissors handle option in order to simulate grasping forces), plus two additional degrees. These are a lateral displacement of every actuator set (adaptive inter-trocar distance) and a rotation of these sets around the device symmetry axis. Moreover, the design allows an ergonomic positioning of the surgical instruments depending on the preference of the trainee.

Our work is focused on designing an MIS training system to simulate specific laparoscopy techniques that are not generally considered. With the design stage finished, a prototype is under construction, and a working device is expected by summer 2005.

Nowadays, the features of physical devices restrict programmers due to limitations in the number of degrees of freedom. We think that this design will help to overcome these constraints.

Link: http://dac.escet.urjc.es/investigacion/GMRV/

Please contact:
José San Martín, Universidad Rey Juan Carlos/SpaRCIM, Spain
E-mail: [email protected]

Governments and organizations such as the International Whaling Commission want to know more about whale populations in order to protect biodiversity and to make informed choices about possible hunting within certain limits. Using parameters such as the number and age of female animals, scientists can estimate mathematically how the population will evolve. Identification is an important tool in collecting these data for stock management. One of the most convenient ways is photo-identification. It is less intrusive than harpooning whales for a sample of DNA, and is more extensive because of the potentially large collections of photographs from biologists, sailors and tourists.

The EUROPHLUKES project commenced in 2001. Its brief was to develop a photo-ID system and database for cetaceans (whales, dolphins and porpoises). The objective was to be able to identify if a particular cetacean had already been photographed, and if so, where and when. The network comprises more than forty partners and participants, mostly marine biologists. It is coordinated by the Universiteit Leiden and is funded by the Fifth Framework Programme of the European Union. To deal with specific computer vision problems, researchers from CWI’s Signals and Images group, a member of ERCIM’s WG on Image and Video Understanding, were invited to join the team.

Watershed Method
Human beings can easily identify individual whales, due to the unique spots and scars on the animals’ skin and the shape and indentations of their tails or dorsal fins. However, they can only compare a few pictures at a time, whereas photographic collections are growing rapidly. A computer, on the other hand, can quickly compare thousands of pictures in databases but it has great difficulties in spotting similarities. For instance, a tail can look different when it is turned, waves can occlude specific marks and the picture quality can vary enormously. In a black and white picture it is not always easy to distinguish between tail and water. In addition, a computer is not intelligent so it cannot immediately recognize the most important marks: it has to compare all features, big and small.

To recognize individual characteristics semi-automatically, CWI researchers combined and applied several mathematical techniques:

Computer Recognizes Whale Tails
by Annette Kik, Eric Pauwels and Elena Ranguelova

How many whales are there in the ocean and how do they migrate? To answer these questions it is important to identify individual animals. Until now, biologists have tried to search by hand through vast numbers of photographs. In the European EUROPHLUKES project that ended in 2004, CWI researchers developed a method for semi-automatic pattern recognition of whale tails and dorsal fins. This is a first step towards the automatic photo-identification of individual animals.

Individual whales can be identified by their markings.

CWI's software system Phluke_Phinder detects the shape of the tail and the location of its various spots.


image segmentation, contour and feature extraction and finally, comparison of these data with an image database. First, the grey-level of the image is represented as a three-dimensional picture: white is high, darker is lower. At the edge between sea and tail the difference in grey-scale will be large, or in other words, the gradient will be high. The picture of this gradient can be viewed as a ‘topographic surface’: it has mountains, plains and valleys. In the so-called watershed method, virtual water floods this surface. When the water is so high that two lakes in valleys are about to merge, a virtual watershed is placed. This procedure has been programmed in MATLAB, a technical computing language, allowing the computer to robustly identify regions of similar grey scales and thus extract the contours of the tail or dorsal fin.
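The flooding idea can be sketched in a few lines. The version below is a deliberately simplified illustration in Python, not the MATLAB code used in the project. It visits pixels of a gradient surface from low to high, starts a new lake at each local minimum, lets a pixel join the single lake it touches, and marks it as watershed (label 0) when it touches two different lakes. The function name and the handling of ties and plateaus are assumptions of this sketch, and a production implementation treats plateaus more carefully.

```python
def watershed(height):
    """Label a 2D grid of heights by simulated flooding.

    Returns a grid of the same shape in which positive integers identify
    catchment basins ("lakes") and 0 marks watershed pixels.
    """
    rows, cols = len(height), len(height[0])
    labels = [[None] * cols for _ in range(rows)]
    # Visit pixels in order of increasing height: the rising water level.
    order = sorted((height[r][c], r, c) for r in range(rows) for c in range(cols))
    next_label = 1
    for _, r, c in order:
        # Collect the basins already present among the 4-connected neighbours.
        neighbour_labels = set()
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            rr, cc = r + dr, c + dc
            if 0 <= rr < rows and 0 <= cc < cols and labels[rr][cc]:
                neighbour_labels.add(labels[rr][cc])
        if not neighbour_labels:
            labels[r][c] = next_label          # a new lake starts here
            next_label += 1
        elif len(neighbour_labels) == 1:
            labels[r][c] = neighbour_labels.pop()  # the lake grows
        else:
            labels[r][c] = 0                   # two lakes meet: watershed
    return labels


if __name__ == "__main__":
    profile = [[3, 1, 2, 5, 2, 0, 4]]  # one image row of gradient values
    print(watershed(profile))
```

In the whale application the basins would correspond to regions of similar grey level, such as tail and sea, and the watershed pixels to the contour between them.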

Spots and Scars
To compare spots and scars on a tail photographed from nearby with one that is photographed from a larger distance or a different viewpoint, the researchers attach a virtual grid to the tail that is always the same, or in other words ‘invariant under affine transformations’. This grid is used as a coordinate system. To define it, the user specifies three anatomical points in the tail: the middle notch of the fluke and both its tips. Assuming that the tail is not too flexible, parallelism and relative distances can be used to divide the tail into a large number of small regions (eg thirty). The proportion of spots to background in each of these regions can then be represented in a 30-dimensional vector, which can be compared with other vectors in the database of identified animals.
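A minimal sketch of such an affine-invariant grid follows. It assumes that the notch is taken as origin and the two vectors from the notch to the fluke tips as basis, so that every pixel gets coordinates (a, b) that do not change under an affine transformation of the picture. The 5 x 6 split into thirty cells, the function names and the pixel format are assumptions made for illustration; the article does not specify how the regions are laid out.

```python
def affine_coords(p, notch, left_tip, right_tip):
    """Express point p as notch + a*(left_tip - notch) + b*(right_tip - notch)."""
    ux, uy = left_tip[0] - notch[0], left_tip[1] - notch[1]
    vx, vy = right_tip[0] - notch[0], right_tip[1] - notch[1]
    det = ux * vy - uy * vx
    if det == 0:
        raise ValueError("the three anatomical points are collinear")
    px, py = p[0] - notch[0], p[1] - notch[1]
    a = (px * vy - py * vx) / det
    b = (ux * py - uy * px) / det
    return a, b


def spot_vector(pixels, notch, left_tip, right_tip, n_a=5, n_b=6):
    """Proportion of spot pixels in each of n_a * n_b grid cells.

    pixels: iterable of ((x, y), is_spot) pairs for pixels inside the tail.
    """
    spots = [0] * (n_a * n_b)
    totals = [0] * (n_a * n_b)
    for point, is_spot in pixels:
        a, b = affine_coords(point, notch, left_tip, right_tip)
        if not (0.0 <= a < 1.0 and 0.0 <= b < 1.0):
            continue  # outside the grid spanned by the three points
        cell = int(a * n_a) * n_b + int(b * n_b)
        totals[cell] += 1
        if is_spot:
            spots[cell] += 1
    return [s / t if t else 0.0 for s, t in zip(spots, totals)]


if __name__ == "__main__":
    pixels = [((1, 1), True), ((2, 1), False), ((7, 8), True)]
    print(spot_vector(pixels, (0, 0), (10, 0), (0, 10), n_a=2, n_b=2))
```

Two photographs of the same tail taken from different distances or angles then yield approximately the same vector, which is what makes a direct comparison with the database possible.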

The performance can be improved by combining the above feature vector with a more detailed mathematical description of specific, salient spots and scars on the animal. Morphological processing locates these marks and yields, for each one, its centre of gravity with respect to the grid and a computed approximating ellipse. Using these data, the computer gives a ranked shortlist of potential matches with pictures from the database. The user can then pick the actual match from this shortlist or confirm that there is no matching animal in the database. This method makes it possible to compare pictures with larger cetacean databases.

Links:
http://www.europhlukes.net
http://homepages.cwi.nl/~ely/projects.htm
http://homepages.cwi.nl/~pauwels/PNA4.1.html

Please contact:
Eric Pauwels, CWI, The Netherlands
Tel: +31 20 592 4225
E-mail: [email protected]

Document classification appears in many applications, including e-mail filtering, mail routing, spam filtering, news monitoring, selective dissemination of information to information consumers, automated indexing of scientific articles, automated population of hierarchical catalogues of Web resources, identification of document genre, authorship attribution, survey coding and so on. Automated text categorization is attractive because manually organizing text document bases can be too expensive, or unfeasible given the time constraints of the application or the number of documents involved.

The task of text categorization (TC) is the focus of a research project at the Institute of Information Theory and Automation, at the Academy of Sciences of the Czech Republic (UTIA). The dominant approach to TC is based on machine learning techniques. We can roughly distinguish three different phases in the design of TC systems: document representation, classifier construction and classifier evaluation. Document representation denotes the mapping of a document into a compact form of its content. A text document is typically represented as a vector of term weights (word features) from a set of terms (dictionary), where each term occurs at least once in a certain minimum number (k) of documents. A major characteristic of the TC problem is the extremely high dimensionality of text data. The number of potential features often exceeds the number of training documents. Dimensionality reduction (DR) is a very important step in TC, because irrelevant and redundant features often degrade the performance of classification algorithms both in speed and classification accuracy.
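The document representation described above can be sketched as follows. The sketch builds a dictionary of terms that occur in at least k documents and maps each document to a vector of term weights. Using raw term counts as weights, splitting on whitespace and the function names are assumptions made for illustration; real systems typically use more elaborate tokenization and weighting.

```python
from collections import Counter


def build_dictionary(documents, k):
    """Return the sorted list of terms that occur in at least k documents."""
    doc_freq = Counter()
    for doc in documents:
        # Count each term once per document (document frequency).
        doc_freq.update(set(doc.lower().split()))
    return sorted(term for term, df in doc_freq.items() if df >= k)


def to_vector(doc, dictionary):
    """Map a document to a vector of term weights (here: raw counts)."""
    counts = Counter(doc.lower().split())
    return [counts[term] for term in dictionary]


if __name__ == "__main__":
    docs = ["whale tail photo", "whale song", "tail fin photo"]
    dictionary = build_dictionary(docs, k=2)
    print(dictionary)
    print(to_vector("whale whale photo", dictionary))
```

Even with the threshold k, the dictionary of a realistic collection remains very large, which is why a dimensionality reduction step follows.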

DR in TC often takes the form of feature selection. Methods for feature subset selection for TC tasks use some evaluation function that is applied to a single feature. The best individual features (BIF) method evaluates all words individually according to a given criterion, sorts them and selects the best subset of words. Since the vocabulary usually contains several thousand or tens of

Text Document Classification
by Jana Novovičová

During the last twenty years the number of text documents in digital form has grown enormously. As a consequence, it is of great practical importance to be able to automatically organize and classify documents. Research into text classification aims to partition unstructured sets of documents into groups that describe the contents of the documents. There are two main variants of text classification: text clustering and text categorization. The former is concerned with finding a latent group structure in the set of documents, while the latter (also known as text classification) can be seen as the task of structuring the repository of documents according to a group structure that is known in advance.


Multimedia production environments like Macromedia Director face the challenge of presenting inherently complex technologies in a simplified, accessible way to the multimedia producer. When it comes to interactivity that exceeds the most basic operations (eg following links, ‘next/previous slide’ or ‘start/stop animation’), these production environments opt for scripting. While a proprietary scripting language allows for a vocabulary exactly fitted to the domain at hand, such an approach can become complex and cumbersome, due to limitations of the metaphors used or because the difficulty of scripting is often underestimated.

Like most frameworks, the object-oriented multimedia application framework MET++ has evolved into a state where most tasks can be achieved by black-box rather than white-box reuse, or by composition rather than inheritance. To avoid having to set up a whole development environment to author multimedia applications, a visual programming environment has been conceived and implemented as an orthogonal control environment to the abstractions already available within MET++.

MET++
MET++ is a portable object-oriented C++ multimedia application framework developed at the University of Zürich. It is based on the object-oriented application framework ET++. ET++ consists of several frameworks, which support the development of desktop applications with graphical user interfaces. ET++ has a layered architecture addressing the following goals: portability among operating systems and windowing systems, generic data structures, support for graphic user interfaces, and desktop applications. The abstractions in ET++ are highly integrated and anticipate all generic interaction between application components. Thus a developer using the framework need only fill predefined slots with the application-specific content.

MET++ is built on top of ET++, adhering entirely to the architecture and style defined by ET++. The multimedia extensions provided by MET++ are:
• 3D graphics
• audio and music
• video
• time synchronization
• visual programming.

MET++ has proven its worth in numerous multimedia projects and is used in commercial applications as well.

Experience in student assignments has shown that for a developer versed in the MET++ framework, application development is very efficient. The newcomer, however, must expend significant time and effort to get used to the framework and its abstractions. The difficulty comes

Advancing Black-Box Reuse in a Multimedia Application Framework
by Bernhard Wagner

MET++ is a tool for producers of multimedia packages. A visual programming environment avoids the need to set up a development environment to author multimedia applications.

thousands of words, BIF methods are popular in TC. However, such methods evaluate each word separately, and completely ignore the existence of other words and the manner in which the words work together.
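The BIF procedure can be sketched as follows, here with the chi-square statistic as the single-feature evaluation function. Documents are modelled as sets of words and the problem as one target class against the rest. The function names and the toy data are assumptions made for illustration.

```python
def chi_square(a, b, c, d):
    """Chi-square statistic of a 2x2 word/class contingency table.

    a: documents of the class containing the word
    b: documents of other classes containing the word
    c: documents of the class without the word
    d: documents of other classes without the word
    """
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denominator


def best_individual_features(docs, labels, target, n_best):
    """Score every word on its own, sort, and keep the n_best words."""
    vocabulary = set().union(*docs)
    scores = {}
    for word in vocabulary:
        a = b = c = d = 0
        for doc, label in zip(docs, labels):
            if word in doc and label == target:
                a += 1
            elif word in doc:
                b += 1
            elif label == target:
                c += 1
            else:
                d += 1
        scores[word] = chi_square(a, b, c, d)
    # Highest score first; ties broken alphabetically for reproducibility.
    ranked = sorted(vocabulary, key=lambda w: (-scores[w], w))
    return ranked[:n_best]


if __name__ == "__main__":
    docs = [{"oil", "price"}, {"oil", "export"},
            {"wheat", "price"}, {"wheat", "harvest"}]
    labels = ["crude", "crude", "grain", "grain"]
    print(best_individual_features(docs, labels, "crude", 2))
```

Each word is scored with a single pass over the documents and without reference to any other word, which makes the method cheap but blind to redundancy between words.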

UTIA proposed the use of the sequential forward selection (SFS) method based on novel improved mutual information measures as criteria for reducing the dimensionality of text data. These criteria take into consideration how features work together. The performance of the proposed criteria with SFS has been compared with that of mutual information, information gain, chi-square statistics and odds ratio with the BIF method. Experiments using a naive Bayes classifier based on the multinomial model, a linear support vector machine (SVM) and k-nearest neighbour classifiers on the Reuters data sets were performed and the results were analysed from various perspectives, including precision, recall and F1-measure. Preliminary experimental results on the Reuters data indicate that SFS methods significantly outperform BIF based on the above-mentioned evaluation functions. Furthermore, SVM on average outperforms both naive Bayes and k-nearest neighbour classifiers on the test data.
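The general shape of sequential forward selection can be sketched as below. At each step the word is added that most improves a criterion evaluated on the whole candidate subset, so that redundancy between words is taken into account. The criterion used in the example (the number of training documents covered by at least one selected word) is a toy stand-in chosen for illustration; it is not one of the mutual information measures developed at UTIA, and all names are hypothetical.

```python
def sequential_forward_selection(features, criterion, n_select):
    """Greedily grow a feature subset, one feature at a time.

    criterion: function mapping a list of features to a score
    (higher is better); it always sees the whole candidate subset.
    """
    selected = []
    remaining = list(features)
    while remaining and len(selected) < n_select:
        # Evaluate each candidate together with what is already selected.
        best = max(remaining, key=lambda f: criterion(selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected


if __name__ == "__main__":
    # Toy data: the documents (by index) in which each word occurs.
    occurs_in = {"oil": {0, 1}, "crude": {0, 1}, "wheat": {2}}

    def coverage(subset):
        covered = set()
        for word in subset:
            covered |= occurs_in[word]
        return len(covered)

    print(sequential_forward_selection(["oil", "crude", "wheat"], coverage, 2))
```

In this toy example a ranking of the words one by one would pick 'oil' and 'crude', which occur in the same documents, whereas the forward search picks 'oil' and then 'wheat' because 'crude' adds nothing once 'oil' is selected.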

Currently, text classification research at UTIA is heading in two directions. First, investigation of sequential floating search methods and oscillating algorithms (developed in UTIA) for reducing dimensionality of text data; and second, design of a new probabilistic model for document modelling based on mixtures for simultaneously solving the problems of feature selection and classification. These phases of the project rely on the involvement of PhD students from the Faculty of Mathematics and Physics at Charles University in Prague.

This research is partially supported by GAAV No A2075302, the EC MUSCLE project FP6-507752 and the project 1M6798555601 DAR.

Link: http://www.utia.cas.cz/RO/FD

Please contact:
Jana Novovičová, CRCIM (UTIA), Czech Republic
Tel: +420 2 6605 2224
E-mail: [email protected]


In many industrial and defence applications, noise and vibration are important problems. In recent years, the control of sound and vibration has been the subject of much research, and there are now numerous examples of such applications. The most common general classification of vibration control differentiates between passive, active and hybrid passive-active control. Passive control involves the use of reactive or resistive devices that either load the transmission path of the disturbing vibration or absorb vibratory energy. Active control also loads the transmission path but achieves this loading through the use of actuators that generally require external energy. In passive control, the material properties of the structure, such as damping and stiffness, are modified so as to change the response of the structure. In active control, the structural response is controlled by adding external effort to the structure.

Combining these two approaches, hybrid control integrates the passive approach with an active control structure, and is intended to reduce the amount of external power necessary to achieve control.

Normally an adaptive or smart structure contains one or more active or smart materials. It is the use of these materials that causes the whole structure to be classified as ‘smart’. These materials have the ability to change their shape, rheological properties (eg stiffness and damping), or internal electrical properties (eg dielectric constant or resistivity).

Depending on the relative positions of the viscoelastic layer and the piezoelectric actuator, the viscoelastic passive and piezoelectric active actions can operate either separately or simultaneously. Typically, these materials have a sandwich structure, in which a soft, thin viscoelastic layer is confined between identical stiff, elastic layers. These structures yield a superior energy absorption. In particular, they offer the advantage of

CASSEM: Vibration Control in the Smart Way
by Salim Belouettar

The ambition of the CASSEM project is to define the ‘best’ models and techniques that will permit us to model, simulate and validate the development of a more efficient vibration control.

from the fact that, like typical frameworks, MET++ and ET++ have evolved into a state where most tasks can be accomplished by black-box reuse and composition of already existing classes. A software system functioning in a black-box manner is hard to understand if one has only the source code to examine.

This observation led to the development of an authoring environment that not only visually displays the current application as a composition of black-box components, but also allows applications to be built visually: the visual programming environment of MET++.

Visual Programming
The metaphor of the visual programming environment in MET++ is that of a data-flow engine. Its building blocks are so-called DataUnits and DataPorts. There are several categories of DataUnits: mathematical functions, GUI components, wrappers, data containers and data mappers. The DataPorts provide the input/output to the DataUnits.
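The data-flow metaphor can be illustrated with a small sketch, written in Python rather than C++ for brevity. The class names DataUnit and DataPort are borrowed from the description above, but their methods and behaviour here are assumptions made for illustration and do not reproduce the actual MET++ interfaces.

```python
class DataPort:
    """Holds a value and notifies the units connected downstream."""

    def __init__(self):
        self.value = None
        self.listeners = []

    def connect(self, unit):
        self.listeners.append(unit)

    def set(self, value):
        self.value = value
        for unit in self.listeners:
            unit.update()


class DataUnit:
    """Applies a function to its input ports and publishes the result."""

    def __init__(self, function, *inputs):
        self.function = function
        self.inputs = inputs
        self.output = DataPort()
        for port in inputs:
            port.connect(self)

    def update(self):
        values = [port.value for port in self.inputs]
        # Fire only once every input port carries a value.
        if all(value is not None for value in values):
            self.output.set(self.function(*values))


if __name__ == "__main__":
    a, b = DataPort(), DataPort()
    adder = DataUnit(lambda x, y: x + y, a, b)        # mathematical function unit
    scaler = DataUnit(lambda s: s * 10, adder.output)  # chained unit
    a.set(2)
    b.set(3)
    print(scaler.output.value)
    a.set(4)
    print(scaler.output.value)
```

Changing a value at the start of the chain propagates through every connected unit, which is the behaviour a visual editor exposes when the user wires ports together.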

The visual programming environment in MET++ uses the Adapter design pattern to wrap existing media abstractions, thus enabling their control from within the visual programming environment. Using this environment, a user can explore the behaviour and protocol of a media abstraction available in MET++ before programming against its API using C++.

A special DataUnit type has been developed explicitly for the visualization of data, namely, the Mapper. This unit provides the necessary infrastructure for ascertaining the number and cardinality of rectangular data and for subsequently iterating over all dimensions. The data responds to the index sets generated by the mapper with the corresponding data content, which is then visualized by the mapper. Several kinds of mappers have been developed that allow visualization as two- or three-dimensional graphics and sonification of the data.

Entire applications can be built in the visual programming environment. A special HTML browser has been developed that allows visual programs to be embedded in normal HTML pages in the same way as applets or Flash components. Likewise, plug-ins have been developed for the Mozilla Firebird and Microsoft Internet Explorer browsers.

Applications
The visual programming environment has been successfully employed in the following areas:
• interactive data visualization
• animation
• sonification of animation
• visualization of sound.

Outlook
The author is currently preparing didactic material for introductory lectures in 3D graphics, using the visual programming environment of MET++. This material will interactively demonstrate concepts of computer graphics and will also contain student exercises.

Links:
MET++ material: ftp://ftp.ifi.unizh.ch/pub/projects/met++/
Commercial use of MET++: http://www.perspectix.com
Publications about Visual Programming in MET++: http://xmlizer.biz/priorArt.html

Please contact:
Bernhard Wagner, University of Zurich, Switzerland
E-mail: [email protected]


high damping with low weight addition. The interlayer-damping concept is highly compatible with the laminated configuration of composite structures and with their fabrication techniques, and provides an effective way to reduce vibrations and noise in structures. The damping is introduced by a significant transverse shear in the viscoelastic layer. This is due to the difference between in-plane displacements of the elastic layers and also to the low stiffness of the central layer.

The performance of passive and hybrid control systems depends strongly on the viscoelastic material layer and piezoelectric material properties. In this project, numerical identification based on direct/inverse approaches will be developed. An advanced non-contact laser technique (ISI-SYS vibrograph system and Polytec scanning vibrometer) will be applied for the vibration measurements. These experimental data will be used to determine the natural frequencies and corresponding loss factors by the developed modal analysis program.

Three approaches are put forward for retrieving the material parameters. The first approach is the use of neural networks as a regression analysis tool. With the development of neural networks, it has become possible to perform a meaningful parameter extraction without the knowledge of an analytical relationship between the material properties and test values. This allows mechanical parameters to be identified by combining finite element modelling and experimental testing through neural networking.

The second approach considers optimization techniques. An error function, eg the sum of squared differences between modelling and experimental results, is minimized by changing the material parameters using different methods. These are either deterministic methods involving semi-analytical or numerical gradient procedures or stochastic methods like genetic algorithms or shooting methods.
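A minimal sketch of this optimization approach follows. It assumes a deliberately simple model in place of the finite element model: a single mass on a spring, whose natural frequency is f = sqrt(k/m) / (2*pi). The unknown stiffness k is identified by minimizing the sum of squared differences between modelled and measured frequencies with a golden-section search, a derivative-free deterministic method. The model, the data and the function names are illustrative assumptions, not the CASSEM procedure.

```python
import math


def model_frequency(stiffness, mass):
    """Natural frequency (Hz) of a single mass on a linear spring."""
    return math.sqrt(stiffness / mass) / (2.0 * math.pi)


def squared_error(stiffness, masses, measured):
    """Least-squares error between modelled and measured frequencies."""
    return sum((model_frequency(stiffness, m) - f) ** 2
               for m, f in zip(masses, measured))


def golden_section_minimise(func, lo, hi, tol=1e-6):
    """Minimise a unimodal function of one variable on [lo, hi]."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    while hi - lo > tol:
        x1 = hi - ratio * (hi - lo)
        x2 = lo + ratio * (hi - lo)
        if func(x1) < func(x2):
            hi = x2   # the minimum lies in [lo, x2]
        else:
            lo = x1   # the minimum lies in [x1, hi]
    return (lo + hi) / 2.0


if __name__ == "__main__":
    masses = [1.0, 2.0, 4.0]                                # kg
    measured = [model_frequency(1000.0, m) for m in masses]  # synthetic "tests"
    k = golden_section_minimise(
        lambda s: squared_error(s, masses, measured), 1.0, 5000.0)
    print("identified stiffness (N/m):", round(k, 3))
```

With a finite element model each evaluation of the error function requires a full simulation, which is why the choice between gradient-based and stochastic search matters in practice.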

The basic idea of the third proposed approach is that simple mathematical models (response surfaces) are determined only by the finite element solutions in the reference points of the experiment design. The function to be minimized describes the difference between the measured and numerically calculated parameters of the response of the structure. By minimizing the function, the identification parameters are obtained.

Another aim of this FP6 project is to develop a general analytical and numerical (Finite Element) framework to model: (i) composite structures with piezoelectric sensors and actuators, (ii) thermal and pyroelectric effects in piezoelectric composites, and (iii) piezoelectric shunted damping.

In the context of the FP6 STREP Project ‘CASSEM’ (Composites and Adaptive Structure: Simulation, Experimentation and Modelling), we will also design a robust controller, which is stable in the presence of modelling and parameter uncertainties, and ensures optimal disturbance rejection capability. In the implementation of the controller, actuators and sensors are needed. The locations of actuators and sensors over a structure determine the effectiveness of the controller in damping vibrations.

A variety of problems must be clarified before active systems can be implemented within structures. One of these, an important and not fully recognized problem, is the proper positioning of sensors and actuators on structures in the case of active systems, and the location of dampers in the case of passive systems. In active vibration control, actuator and sensor placement is a very significant issue, since it has a direct effect on the control efficiency and cost. For example, large flexible structures require many actuators for active vibration control, and the problem of optimizing their location becomes extremely significant in maximizing system controllability. An arbitrary choice of actuator positions can seriously degrade the system performance. The controllability index, genetic algorithms, gradient-based optimization procedures and heuristic procedures have been used to determine the proper sensor or actuator locations. The problem of positioning and size of passive dampers is also important, for similar reasons.

CASSEM is a highly interdisciplinary project combining engineering and physical sciences, for example, experimental material science, numerical modelling methods, mathematics, automation systems and mechatronics. The consortium consists of nine scientific and industrial partners from seven EU countries. The application of the project results will lead to long-term innovations in composites and adaptive structures. The development of vibration control systems will allow the areas of application of multi-functional composite materials to be extended based on the advanced knowledge and understanding of vibration response.

Links:
CASSEM: http://www.cassem.lu
Centre de Recherche Henri Tudor: http://www.tudor.lu

Please contact:
Salim Belouettar, Centre de Recherche Henri Tudor, Luxembourg
Tel: +352 54 55 80 500
E-mail: [email protected]

Multi-layered structures.


Since January 1995, IPv6 has been specified and standardized with the aim of meeting global needs generated by the huge growth of the Internet. In addition to the increased (virtually unlimited) address space, IPv6 offers numerous major design improvements over IPv4. These can remedy the ‘old’ protocol limits, as well as opening the door to innovative Internet-based products and services. Among the most significant of these enhancements are auto-configuration (‘plug and play’) and reconfiguration mechanisms, mobility services, end-to-end security (with IPSec encryption and authentication features) and enhanced support for multicast and QoS. Today, with these technical advantages, IPv6 is widely available from industry and supported by recent network equipment (routers, switches, desktops, servers and operating systems). It has also been successfully applied worldwide by educational establishments, government institutions, telecom companies and research organizations. However, European companies are still behind schedule in the acceptance and deployment of this technology. The common attitude appears to be: “IPv4 is well known and functional. Why should I change? What are the benefits and the associated costs and risks?”

Researchers from the IRISA/INRIA and ENST Bretagne/GET laboratories have been involved in the IPv6 project from the very beginning. They co-developed the very first IPv6 stack ten years ago, and have been involved in the IPv6 interoperability testing and the international certification program (IPv6 Ready Logo) since 1999. This expertise develops daily through research activities, IETF standardization efforts, educational duties, technical publications and conferences, prototype implementation and international interoperability test events. The team has been further reinforced by the creation of the Point6 skill centre in Rennes (France). This unique know-how is now available for IT and business managers and engineers who need to understand and control all the IPv6 related issues, prior to organizing any associated work.

The IPv6 protocol is stable and mature but not well known, possibly because of a perception that it is difficult to learn. Indeed, IPv6 is specified by a collection of more than 60 RFCs (Requests for Comments - Internet standards). Companies that need to implement IPv6 in a product naturally ask the following questions: Which IPv6 skills should I develop regarding my product specification? Where can I find an IPv6 stack for my real-time OS? Which part of my code do I have to modify to remain IPv6-compliant? How long will it take to complete the work?

The ambition of the Point6 experts is to understand and analyse the industry needs, provide efficient solutions, and assist the R&D teams in their adoption of the IPv6 protocol, from the initial requirements to the implementation details. From our experience, the main difficulty is the initial step of IPv6 knowledge acquisition. Once this is achieved, the technical issues are generally well known and decisions are quite easy to make. Usually, products have to be both IPv4 and IPv6 compliant, and this can be managed with a minimum of effort.
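As a small illustration of code that handles both protocol versions, the sketch below uses Python's standard ipaddress module to treat IPv4 and IPv6 addresses uniformly instead of assuming a dotted-quad format. The helper names are hypothetical and the example only covers address handling, which is one of several places where application code typically needs attention.

```python
import ipaddress


def ip_version(address):
    """Return 4 or 6 for a textual IP address; raise ValueError otherwise."""
    return ipaddress.ip_address(address).version


def embedded_ipv4(address):
    """Return the IPv4 address carried by an IPv4-mapped IPv6 address.

    Returns None for native IPv6 addresses and for plain IPv4 input.
    """
    ip = ipaddress.ip_address(address)
    if ip.version == 6 and ip.ipv4_mapped is not None:
        return str(ip.ipv4_mapped)
    return None


if __name__ == "__main__":
    for text in ("192.0.2.1", "2001:db8::1", "::ffff:192.0.2.1"):
        print(text, "-> IPv%d" % ip_version(text), embedded_ipv4(text))
```

Code written against version-neutral helpers of this kind does not need to change when a host or peer moves from one protocol version to the other.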

Point6: The IPv6 Skill Centre - Moving to the Next-Generation Internet Protocol
by César Viho and Annie Floch

IPv6 was initially conceived to anticipate the shortage of IP addresses, and to respond to the deficiencies in IPv4 that appeared in the 1990s. These requirements concern both emerging communication technologies (cell phones, PDAs, Internet-connected vehicles, xDSL etc) and billions of new users (in Japan, China, India etc), and they become more significant daily. It is therefore vital that IPv6 be introduced into infrastructures and new projects.

Conformance and interoperability testing.


The average service life of a personal computer used for business is 2-3 years. After this short service life, it is usually replaced by a more powerful PC running new versions of system and application software. Therefore, it is worth considering whether the high replacement costs of PCs bring a comparable benefit. Whereas the benefits from having additional functionality (eg in multimedia applications) are self-evident, it is not a priori clear whether pre-existing functionality can be accessed and used as efficiently as before when the user goes to a new machine. The study reported here addresses only the latter issue.

Our laboratory experiment was based on PC configurations typical of the years 1997, 2000 and 2003. As Table 1 shows, the hardware used differed greatly in clock speed, working memory and hard disk capacity. Since the test only involved file handling and text editing, it was sufficient to install the operating system and a word processor. The configurations A, B and C are reconstructions of PC configurations used widely in business in the respective year. All software components were installed taking the default parameter settings. The appearance of the desktops was kept similar on all computers to optimally satisfy blind test conditions. A switch box made it possible to connect the same mouse, keyboard, and monitor to each of the three computers.

We defined two tasks for testing user performance on the three systems: a file handling task and a text editing task. The file handling task included locating, copying, moving and deleting files and groups of files varying in size from 24 kByte to 42 MByte. The text editing task focused on locating, inserting and deleting text, changing fonts and formats in a large document of 150 pages as well as copying, pasting and positioning photographs between two documents.

The laboratory experiment was performed on 42 subjects recruited from a service organization with 180 employees. Subjects had to execute each of the tasks twice on each computer system in randomized order. The whole test took 45-90 minutes per subject. We measured the time the subjects needed to complete each task, logged user interactions and processor workload on the computer and videotaped the screen signal.

The test revealed a statistically significant decrease in user

Working Slower with More Powerful Computers
by Lorenz M. Hilty, Andreas Köhler, Fabian van Schéele and Rainer Zah

An empirical study conducted by the Swiss Federal Laboratories for Materials Testing and Research showed that using a more powerful PC can significantly slow down office workers in performing everyday office tasks.

Table 1: Description of the computer systems used for the experiment.

Another representative case is the network infrastructure migration from IPv4 to IPv6. Most organizations have difficulty making accurate decisions at the right moment, even though they know that IPv6 is now reliable and commonly available in products and from Internet Service Providers. Point6 experts can analyse existing infrastructure and are capable of developing customized migration plans. The main steps typically include IPv6 connectivity, address plan definition, equipment upgrade, routing configuration, network services reconfiguration (firewalls etc), service updates (mail, Web etc), end user equipment and metrics definition. This can be done gradually, retaining the same level of service as with IPv4.
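To make the 'address plan definition' step more concrete, the following is a minimal sketch using Python's standard `ipaddress` module. It is not part of the Point6 service: the function name `build_address_plan`, the department names and the site prefix (taken from the 2001:db8::/32 range reserved for documentation) are assumptions chosen purely for illustration.

```python
import ipaddress


def build_address_plan(site_prefix, departments):
    """Assign one /64 subnet to each department, carved out of the site prefix.

    Hypothetical helper for illustration; a real plan would also reserve
    ranges for infrastructure, growth and transition mechanisms.
    """
    site = ipaddress.ip_network(site_prefix)
    subnets = site.subnets(new_prefix=64)  # lazy generator of consecutive /64 networks
    return {name: next(subnets) for name in departments}


# Example site prefix from the IPv6 documentation range (an assumption).
plan = build_address_plan("2001:db8:1234::/48", ["servers", "office", "guests"])
for name, network in plan.items():
    print(f"{name:8s} {network}")
```

A /48 site prefix leaves 16 bits for subnet numbering, so a plan of this kind can grow to 65,536 separate /64 networks before the prefix is exhausted.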

Companies that have developed their IPv6 products or want to validate any IPv6 network configuration can use the Point6 platform for both conformance and interoperability testing. Conformance tests verify that an implementation has been developed in strict accordance with its specifications (RFCs). Multi-vendor environments make it necessary to validate equipment compatibility by applying interoperability scenarios.

Testing is time-consuming, and requires skill, specialized resources (hosts, routers, probes etc) and rational impartiality. The Point6 test service is based on a platform supplied with up-to-date tests and managed by IPv6 experts. The team has the relevant skills for helping companies to obtain the ‘IPv6 Ready Logo’, which requires 100% correct results for both conformance and interoperability tests. More tests for IPv6 are available in the laboratory for additional IPv6 routing protocols (RIPng, OSPFv3, BGP4+), mobility (Nemo) and transition mechanisms (NAT/PT, 6to4). Other test suites are also under development.

Links:
Point6: http://www.point6.net
IPv6 Ready Logo Program: http://www.ipv6ready.org/

Please contact:
César Viho, IRISA, France
E-mail: [email protected]


performance on Windows XP as compared to Windows 2000 for the file handling task, despite the fact that Windows XP was run on considerably faster hardware. This result was totally unexpected because Windows 2000 had never been used in the organization in which the subjects were employed. Windows NT systems had been in use for five years. Approximately five months before the test, a migration from Windows NT to Windows XP had been initiated. By the time of the test, virtually every employee was working with Windows XP. Only four subjects reported that they usually worked with Windows 2000, and another four were still working with Windows NT.

In order to find explanations for our result, we examined in detail the effort required from the user and the machine to complete the tasks. Since most user actions during task execution were mouse positionings (usually followed by a single mouse click, less often by no click or a double click), we used the number of mouse positionings as an indicator of the user’s effort. The computer’s work was measured in CPU time, which we calculated by integrating processor utilization over time.
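The CPU-time measure described above can be sketched as a numerical integration of a utilization log. The article does not say how the integration was carried out; the trapezoidal rule, the function name `cpu_time` and the sample log below are assumptions used only to illustrate the idea.

```python
def cpu_time(samples):
    """Integrate processor utilization over time with the trapezoidal rule.

    samples: list of (timestamp_in_seconds, utilization) pairs, sorted by
    timestamp, with utilization expressed as a fraction between 0 and 1.
    Returns CPU time in seconds.
    """
    total = 0.0
    for (t0, u0), (t1, u1) in zip(samples, samples[1:]):
        # Area of the trapezoid between two consecutive samples.
        total += 0.5 * (u0 + u1) * (t1 - t0)
    return total


# Invented log: the processor ramps up, stays fully busy, then goes idle.
log = [(0.0, 0.0), (1.0, 0.5), (2.0, 1.0), (3.0, 1.0), (4.0, 0.0)]
print(cpu_time(log))  # 2.5 seconds of CPU time within 4 seconds of wall-clock time
```

Measured this way, a task that keeps a fast processor busy for longer can consume more CPU time than the same task on slower hardware, which is the kind of comparison the article draws between systems B and C.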

As Figure 1 (top) shows, the file handling task required much less CPU time on system B (year 2000) than on system A (1997), but again much more on system C (2003), and the same is true for the number of mouse positionings the users performed (significant difference, P=0.016). Whereas they managed with roughly 50 mouse positionings on Windows 2000 on average, they needed more than 80 positionings for the same task on Windows XP.

The text editing task (Figure 1, bottom) shows a decrease in mouse positionings from A to B to C, but an increase in CPU time from B to C. Given the fact that the C hardware executes more than twice as many instructions per second as the B hardware, one would instead expect a significant decrease.

The main conclusion from this experiment is that changing over to a faster computer running newer software does not necessarily lead to better performance, at least for the types of tasks we used in our experiment. On the contrary, it is even possible that both the machine and the user need to work significantly longer in order to replicate a given task with the new system.

Acknowledgements
The work reported here was conducted in cooperation with the Swedish Royal Technical University (KTH), Stockholm, and co-funded by the Board of the Swiss Federal Institutes of Technology (ETH board) as a part of the ‘Sustainability in the Information Society’ research program at Empa.

Links:
Empa: http://www.empa.ch
Technology and Society Lab at Empa: http://www.empa.ch/TSL
Sustainability in the Information Society Program: http://www.empa.ch/SIS

Please contact:
Lorenz M. Hilty, Technology and Society Lab, Empa, Swiss Federal Laboratories for Materials Testing and Research, Switzerland
Tel: +41 71 2747 500
E-mail: [email protected]

Figure 1: Human effort, measured by number of mouse positionings, versus machine work, measured by CPU time, for executing the tasks for the second time on a given computer. Top: mean values for the file handling task. Bottom: mean values for the text editing task. Error bars denote s.e.m. Arrows indicate the temporal sequence of systems.
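The error bars in Figure 1 denote the standard error of the mean (s.e.m.), that is, the sample standard deviation divided by the square root of the number of observations. The sketch below shows that calculation; the function name `mean_and_sem` is an assumption, and the values are invented for illustration, not data from the study.

```python
import math
import statistics


def mean_and_sem(values):
    """Return the mean and the standard error of the mean of a sample."""
    mean = statistics.mean(values)
    # s.e.m. = sample standard deviation / sqrt(n)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem


# Invented numbers of mouse positionings for five hypothetical subjects.
positionings = [48, 52, 50, 46, 54]
mean, sem = mean_and_sem(positionings)
print(f"mean = {mean}, s.e.m. = {sem:.3f}")
```

Because the s.e.m. shrinks with the square root of the sample size, error bars of this kind describe the uncertainty of the estimated mean rather than the spread between individual subjects.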

The future prosperity of Europe depends on its competing successfully in a number of key knowledge-based industries, of which the so-called information society technologies (IST) form a large economic block. To achieve this economic success in the longer term, it is necessary to circumvent the barriers posed by national boundaries and differing research policies and priorities, and to bring together a critical mass of resources from across Europe to research key technologies.

The CISTRANA project is a Coordination Action under the European Commission’s IST programme, with the aim of achieving coordination of national IST programmes with each other and with European programmes in order to improve the impact of all research and development efforts in Europe and to reinforce European competitiveness in this area.

Coordinating IST Research Across Europe
by Simon Lambert

The CISTRANA project will help to enhance European competitiveness through international coordination of IST research activities.


CISTRANA has a duration of three years and involves five partners from across Europe. DLR in Germany is the coordinator, and the others are TEKES (Finland), NKTH (Hungary), ANRT (France) and CCLRC (UK, an ERCIM institute). The objectives of the project are to develop a map of the national research landscape in the area of IST; to pinpoint research topic areas and strategic themes where cooperation is essential; and to establish sustainable mechanisms including common methodologies and procedures to set up trans-national coordination initiatives between different countries.

It is evident that active participation and commitment of stakeholders is essential for the success of CISTRANA. A Steering Committee has been set up, with members from around thirty countries, that gives advice on information collection and guidance on priorities and implementation. The Steering Committee acts as the interface between the CISTRANA project and national ministries. Each of the CISTRANA partners, with the exception of CCLRC whose role is explained below, is a Country Group Representative, with the responsibility of gathering and assessing information collected at the national level. The information itself is provided by National Support Organisations, one in each country, selected by the Steering Committee member.

The project started in September 2004 and so far a comprehensive survey has been conducted about national research policies, programmes and other activities. The response has been very good, and the results are currently being assessed to feed into other project activities. A series of workshops is being planned during the life of the project, aimed at various target audiences, with themes such as impact assessment, best practice in multi-national programme collaboration, and portals for information dissemination.

An additional important output of the CISTRANA project is a portal to IST research information. A study conducted for the European Commission addressed the perceived difficulty in accessing information on IST research at national levels. The main findings were that there are important barriers to accessing relevant information, including a lack of consistent thematic search possibilities; a lack of relevant information; and a lack of comparability between information sources. It concluded that the overall presence of information was insufficient in comparison with users’ requirements, and that the barriers can be largely overcome by establishing an IST research portal as a gateway providing access to national IST RTD information.

CCLRC is the partner responsible for developing the portal. The aim is to give access to diverse kinds of information about national programmes, policies and projects, either gathered specially for CISTRANA or in existing national databases. A system architecture has been developed that is sufficiently flexible to accommodate these possibilities, and in particular allows different options to national database owners for how they can make their data available via the portal. The CERIF standard is of key importance here: it is the Common European Research Information Format that has been developed over many years as a recommendation to member states of the EU for Current Research Information Systems. It is expected that the portal will be of great interest to various user communities including policy makers, researchers and those in the industrial world, offering a way of conveniently finding out what is going on in other countries.

Links:
Project website: http://www.cistrana.org
CERIF standard: http://www.eurocris.org/en/taskgroups/cerif/

Please contact:
Simon Lambert, CCLRC
Tel: +44 1235 445716
E-mail: [email protected]

Media production companies, in particular media broadcasters, are being challenged by their consumers to produce media in which the consumers can actively participate and interact with TV shows and with their peers. This new form of media, called interactive media, is a combination of traditional media and a behavioural component (software). Examples of such components are online games, quiz software, virtual community worlds etc.

Software Automation meets Interactive Media Development
by Dirk Deridder, Thomas Cleenewerck, Johan Brichau and Theo D'Hondt

Developing the software component of interactive media requires an advanced set of development tools and enforces a different view on software development in general. This is mainly due to the specific characteristics of the environment in which this development takes place. We propose a combination of software automation techniques to counter this effectively. Since the iMedia domain is in continuous flux, and these technologies are mostly designed for stable domains, the evolvability of the approach is guaranteed by rooting it into the heart of the system.


The challenge lies in the production of the behavioural component of the new form of media within the broadcasting environment. The production cycle is constrained by the imposition of extremely strict broadcasting times, the extremely short time-to-market situation caused by last-minute changes, and the extreme deployment prerequisites.

IMedia Software Generation System
IMedia development requires more advanced development tools. The approach we propose combines existing research from the areas of generative programming, transformation systems and domain engineering. This results in a system that is best described as an iMedia Software Generation System (IMSGS; see the figure), specialized for each product range. In an IMSGS, more autonomy and flexibility is given to the media producer to adapt the iMedia software product. This is achieved by generating different tailor-made ‘instances’ of the product range, given a high-level specification. The tailoring of particular instances is managed by the media producer (the domain expert).

Evolution of the IMSGS
Our research focuses on the evolvability problems of a system that is based on DSL and generative programming technology. The system and the techniques used to build the system were designed with evolution in mind: the impact of changes to the system is limited to individual easily identifiable modules.

An IMSGS for a specific product range is divided into a set of program generators, each targeted at a specific concern in the product (eg the application logic, the graphical user interface etc). For each concern a domain model is constructed with CoBro that defines the concepts in the domain and the relationships between them. These models are then used to construct concise domain-specific languages (DSLs) compliant with the definitions in the domain model. The DSL compilers are program generators implemented in the Linglet Transformation System (LTS), which translate the DSL specifications into executable code in some generic language using generic libraries and frameworks. Finally these program generators are composed using Generative Logic Meta-Programming (GLMP) in order to integrate each of their generated program parts into one application.
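The general idea of per-concern program generators whose outputs are combined into one application can be illustrated with a toy sketch. It does not use CoBro, LTS or GLMP and makes no claim about how those systems work: plain Python string templates stand in for the DSL compilers, and every name below (`generate_logic`, `generate_ui`, `compose`, `quiz_spec`) is hypothetical.

```python
def generate_logic(spec):
    """Generator for the application-logic concern: emits a scoring function."""
    lines = ["def score(answers):", "    total = 0"]
    for index, question in enumerate(spec["questions"]):
        lines.append(f"    if answers[{index}] == {question['answer']!r}:")
        lines.append(f"        total += {question['points']}")
    lines.append("    return total")
    return "\n".join(lines)


def generate_ui(spec):
    """Generator for the user-interface concern: emits a function listing the question texts."""
    lines = ["def render():", "    return ["]
    for question in spec["questions"]:
        lines.append(f"        {question['text']!r},")
    lines.append("    ]")
    return "\n".join(lines)


def compose(spec, generators):
    """Run every generator on the specification and load the parts into one namespace."""
    source = "\n\n".join(generator(spec) for generator in generators)
    namespace = {}
    exec(source, namespace)  # the generated parts become one small 'application'
    return namespace


# High-level specification, as a media producer might tailor it (invented content).
quiz_spec = {
    "questions": [
        {"text": "Capital of Belgium?", "answer": "Brussels", "points": 2},
        {"text": "2 + 2?", "answer": "4", "points": 1},
    ]
}

app = compose(quiz_spec, [generate_logic, generate_ui])
print(app["render"]())
print(app["score"](["Brussels", "5"]))
```

In this sketch, changing the specification regenerates both parts consistently, while a change to one concern stays confined to its own generator, which is the modularity property the article emphasizes.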

The domain knowledge is described in CoBro. CoBro follows a concept-centric approach in which we couple the domain concepts to their corresponding implementations in the quiz language. In this way, one can start at the level of the domain concepts to estimate which parts of the implementation (δ3 in the figure) will be affected by the evolution (δ2 in the figure). Moreover, connecting the domain knowledge to the implementation provides a valuable source of documentation of the assumptions made by the original developers.

The DSLs are constructed using the Linglet Transformation System via a composition of language components, which is expressed in a language specification. LTS modularizes the language components by specifying the necessary communication patterns among them in a separate language specification, through the customization of the language components. Hence the dependencies among the language components become explicit and are removed from their implementations. Consequently the impact of changes in the language (δ3 and δ4 in the figure) is isolated to the language specification and to individual identifiable components.

The composition of the program generators (DSL compilers) is realized with Generative Logic Meta-Programming. GLMP features a grey-box composition model of program generators that allows the specification of integration relationships among the subparts of different program generators. This mechanism is vital for adapting the generators so that they produce program parts; these can then be combined into a single application with no undesired interferences that could break their functionality (δ5 in the figure).

This research is performed in the context of the Advanced Media Project, a collaboration between Vlaamse Radio- en Televisieomroep, Vrije Universiteit Brussel, Universiteit Gent and IMEC.

Links: http://prog.vub.ac.be/

http://prog.vub.ac.be/Publications/2004/vub-prog-tr-04-19.pdf

http://www.xmt.be/sake.html

Please contact:
Dirk Deridder
Vrije Universiteit Brussel, Belgium
E-mail: [email protected]

Overview of the IMedia Software Generation System with the major evolution δ’s in an IMSGS.


EVENTS

Workshop on Challenges in Software Evolution
by Tom Mens

The annual meeting of the ERCIM Working Group on Software Evolution was held in Bern, Switzerland, on 12-13 April 2005. Together with the scientific research network RELEASE, financed by the European Science Foundation (ESF), a two-day workshop was organized focusing on the most important challenges and emerging trends in software evolution research and practice.

The workshop was co-organized by Stéphane Ducasse, Radu Marinescu, Tom Mens and Oscar Nierstrasz. The workshop brought together people from academia and industry, identifying substantial obstacles to software evolution research and practice, and proposing ways to overcome these obstacles. In total, we received seventeen contributions, and there were 37 workshop participants coming from ten different countries.

The proposed challenges were diverse in nature, ranging from the fundamental (eg what are the laws governing software evolution?), to the pragmatic (eg how can we provide software evolution techniques and tools that scale up to industrial-size software applications?). The expected time horizon of the challenges also varied, ranging from short term (typically a couple of months) to long term (several years or even decades). A classification of all proposed challenges, together with a brief summary of each, has been compiled as a result of the workshop, and is available via the workshop Web site.

In addition to its scientific purpose, the workshop also hosted the annual steering committee meeting of the RELEASE network, financed by the European Science Foundation until autumn 2005, and the annual steering committee meeting of the ERCIM Working Group on Software Evolution. During the latter meeting, we discussed the current status of the network (including over thirty members originating from research institutes all over Europe, seventeen of which belong to ten different ERCIM partner institutes). We also reported on the successful meeting organised in Rome last year (October 2004), and planned a number of new activities for 2005 and 2006. Last, but certainly not least, we discussed concrete opportunities and plans for proposing new initiatives within the IST domain of the 5th call of the EU 6th Framework Programme (particularly with the strategic objectives ‘embedded systems’ and ‘software and services’). In addition, we plan to set up a Marie Curie Research Training Network.

Links:
WG Web site: http://w3.umh.ac.be/evol/
Workshop website: http://w3.umh.ac.be/evol/meetings/evol2005.html
RELEASE network website: http://www.esf.org/release

Please contact:
Tom Mens, Institut d'Informatique, Université de Mons-Hainaut
Tel: +32 65 37 3453; E-mail: [email protected]

WWV 2005: First Workshop on Automated Specification and Verification of Web Sites
by María Alpuente, Santiago Escobar and Moreno Falaschi

The First International Workshop on Automated Specification and Verification of Web Sites (WWV 05) was held in Valencia, Spain, on 14-15 March 2005.

This was the first workshop in a series aimed at promoting a common forum for researchers from the communities of rule-based programming, automated software engineering, and web-oriented research.

The workshop was attended by 42 participants from universities, research institutes, and companies from eleven countries. The two invited speakers gave highly interesting and stimulating presentations. Anthony Finkelstein, from University College London, UK, spoke about checking complex, distributed data against business rules, reference data and industry standards as a way to avoid operational errors and large, daily losses in data intensive businesses due to inconsistent data. Shriram Krishnamurthi from Brown University, USA, spoke on the pros and cons of two different approaches to web site verification: the static view of a web site as program source and the dynamic view of a web site as an entity with different contextual or temporal behaviors.

The Programme Committee selected ten regular papers, two position papers, and six system descriptions or works in progress. Together they span formal methodologies and techniques as diverse and complementary as (i) formal models for describing and reasoning about web sites, (ii) testing, validation and categorization of web sites, (iii) accessibility evaluation, (iv) XML transformation and optimization, (v) rule-based approaches to web site analysis and verification, and (vi) model-checking and static analysis applied to the web. The proceedings of the workshop are published as a technical report of the Departamento de Sistemas Informáticos y Computación of the Universidad Politécnica de Valencia. A selection of the papers will appear in the Elsevier series Electronic Notes in Theoretical Computer Science (ENTCS).

The programme and workshop chairs wish to thank the organisations that have supported the event, especially the EU-India project ALA/95/23/2003/077-054, Università degli Studi di Siena, CologNET, Technical University of Valencia, and the Spanish Ministry of Education and Science.

Link:
WWV'05: http://www.dsic.upv.es/workshops/wwv05/

Please contact:
Maria Alpuente, DSIC-UPV, Valencia, Spain
Tel: +34 963 879 354; E-mail: [email protected]


CALL FOR PARTICIPATION

3rd International Workshop on Adaptive Multimedia Retrieval

Glasgow, UK, 28-29 July 2005

This workshop is part of the IR Festival, which is a week full of IR activities in Glasgow, and it is co-located with the 19th International Joint Conference on Artificial Intelligence (IJCAI 2005) in Edinburgh. The goals of the workshop are to intensify the exchange of ideas between different research communities such as multimedia, information retrieval and AI, to provide an overview of current activities in this area and to point out connections between the diverse involved research communities and research in AI.

The workshop focuses especially on researchers that are working on feature extraction techniques for multimedia, computer linguistic approaches, (dynamic) data analysis methods, and visualisation methods as well as user interface design. Topics for the workshop are:
• multimedia retrieval systems (for text, image, audio, video and mixed-media)
• theoretical foundations of multimedia retrieval and mining
• intelligent multimedia data modelling, indexing and structure extraction
• adaptive Hypermedia and web based systems
• metadata for multimedia retrieval
• multimedia and multi-modal mining
• semantic content analysis for multimedia
• semantic web and ontologies
• adaptive query languages
• similarity measures (especially user adaptive measures)
• user and preference modelling (including feedback models)
• methods for adaptive data visualisation and user interfaces.

Registration is free and participation is open to everyone. The workshop is sponsored by the Multimedia Knowledge Management Network (http://www.mmkm.org) which consists of research teams from seven UK universities who work in this new interdisciplinary field. The aim of this network is to enhance communication between the experts in both academia and industry, and to maintain shared resources for the direct benefit of the research community. The network is hosted at and maintained by the Multimedia Information Retrieval group at Imperial College London (http://mmir.doc.ic.ac.uk).

More information: http://www.dcs.gla.ac.uk/amr2005/

If you need help in finding a partner for the EU IST programs:

IDEAL-IST

IDEAL-IST is there to help you… again !

With the deadline for the submission of proposals to the 5th Call of the IST programme expected in September 2005, and a budget of 638 Mio €, IDEAL-IST is ready for action!

Helping IST-proposers since 1996 in finding suitable partners for their IST-project ideas, IDEAL-IST has expanded by now to a consortium of 34 countries, linking over 55,000 organizations, with incomparable experience in the IST-programme as a whole and in the issue of Partner Search in particular.

Three main services are offered free of charge:

1. Finding partners for your IST project idea
Your project idea will be distributed to over 55,000 international contacts, interested in the IST Programme & subscribed to their national IDEAL-IST Mailing List. This service boasts a 92% success rate: proposers find appropriate partners for partnership building! The IDEAL-IST network provides suitable partners within weeks, even days, with an average of 44 responses per partner search! Partner searches are call-specific and rely on a proactive mechanism. That means that you will be contacted by the national IDEAL-IST Representative who becomes your personal advisor and supports you in the implementation of the final Partner Search application before publishing it on the public Website and distributing it to the 34 IDEAL-IST Representatives and Mailing List subscribers.

2. Joining other IST proposals
3. Joining running IST projects
Your national IDEAL-IST Representative keeps you informed about partner searches. Just register your interests in the national Mailing List and your national IDEAL-IST Representative will automatically send you the new searches.

Register your details today and take advantage of this service, free of charge.

http://www.ideal-ist.net


CALL FOR PAPERS

SOFSEM 2006 – 32nd Conference on Current Trends in Theory and Practice of Computer Science

Merin, Czech Republic, 21-27 January 2006

SOFSEM (SOFtware SEMinar) is an annual international conference devoted to the theory and practice of Computer Science aiming at fostering co-operation among professionals from academia and industry working in various areas of Computer Science. SOFSEM offers a unique opportunity to quickly obtain a representative overview of the areas that are selected as the topics of the year:
• Foundations
• Wireless, Mobile, Ad Hoc and Sensor Networks
• Database Technologies
• Semantic Web Technologies.

SOFSEM is especially suited for young computer scientists. The program consists of a series of invited talks, contributed talks, working sessions and the student research forum.

SOFSEM 2006 Invited Speakers
• Foundations: S. Barry Cooper, UK: ‘How Can nature help us Compute?’
• Wireless, Mobile, Ad Hoc and Sensor Networks: Sotiris Nikoletseas, Greece: ‘Models and Algorithms for Wireless Sensor Networks (Smart Dust)’; Christian Schindelhauer, Germany: n.c.
• Database Technologies: Georg Gottlob, Germany: ‘Monadic Queries over Tree Structured Data’
• Semantic Web Technologies: Marie-Christine Rousset, France: ‘Somewhere in Semantic Web’

Submission Deadlines
• Abstracts: 15 August 2005
• Full papers: 22 August 2005

More information: http://www.cs.cas.cz/sofsem/06/

CALL FOR PARTICIPATION

GRIDs @ Work: Middlewares, Components, Users, Contest, and Plugtest
ETSI, Sophia Antipolis, 10-14 October 2005

The purpose of this event is to test and experiment with tools and protocols for Grid computing. This requires gaining access to a Grid computing infrastructure and then running an application on it. For that reason, it is necessary to gather workstations, clusters, supercomputers, computing Grids, etc., that may already be deployed all over the world, and to make them accessible during the Plugtest event. One technical challenge is to virtually merge all the gathered computing elements in order to form a single world-scale computing Grid. The result is a huge Grid at work, through which it becomes easy to test the interoperability of the various cluster and Grid computing technologies. The effective test is done through the deployment of a single application on all CPUs available at once.

In order to set up such a world-scale grid, ProActive, a high-level Grid middleware, will be used. ProActive is a library (source code under LGPL license, developed by the OASIS team at INRIA Sophia-Antipolis, and a key part of the ObjectWeb consortium, http://www.objectweb.org) for parallel, distributed, and concurrent computing, also featuring mobility and security in a uniform framework. Furthermore, ProActive interoperates with the major protocols and tools for cluster and Grid computing (HTTP, ssh, ssh tunnelling, RMI, Globus GT2, GT3 and GT4, LSF, PBS, Sun Grid Engine, etc). ProActive is open, ie, it is technically possible to enlarge the panel of Grid access tools with which it is interfaced, in order to launch jobs for instance. As such, ProActive makes it possible to orchestrate different heterogeneous Grid resources at once. ProActive was used for the first Grid Plugtest, organised by ETSI and INRIA in October 2004, which gathered 800 CPUs all over the world. This time, Grid infrastructures deployed by some of the major Grid actors such as EGEE and NorduGrid will join the testbed.

More information: http://www.etsi.org/plugtests/GRID.htm

By advertising in ERCIM News, your company or institution will be able to speak to a highly qualified audience: You can reach over 10,000 researchers, scientists and decision makers in the field of information and communication technologies.

For rates and formats, see: http://www.ercim.org/publication/Ercim_News/ads.html

This could be your advertisement!


ANNOUNCEMENT OF A COMPETITIVE CALL FOR ADDITIONAL PROJECT PARTNERS

The following project, currently active in the Sixth Framework programme of the European Community for research, technological development and demonstration activities contributing to the creation of the European research area and to innovation (2002-2006), requires the participation of new project partners (mainly SMEs operating in collaborative networks) to carry out certain tasks within the project.

Project acronym and contract number: ECOLEAD, FP6 IP 506958 (Integrated project)
Project full name: European Collaborative networked Organizations LEADership initiative
Date of opening of call: August 2005; exact date to be published at www.ecolead.org
Date of close of call: October 2005; exact date to be published at www.ecolead.org
Language in which proposal should be submitted: English
Address for further information (call webpage & Project coordinator): www.ecolead.org , [email protected]

1. ECOLEAD project and the role of new partners

ECOLEAD develops foundations, mechanisms and tools for establishing, operating and managing collaborative networked organizations (CNO). Three focus areas are addressed: Breeding Environments (BE), Dynamic Virtual Organizations (VO), and Professional Virtual Communities (PVC). The holistic approach is reinforced and sustained by two horizontal activities, the Theoretical Foundation for collaborative networks and the ICT Infrastructure, which both support and affect the three vertical focus areas. More info: www.ecolead.org. The organizations sought and selected will join the ECOLEAD consortium as new partners.

ECOLEAD will offer the new partners:
- solutions and methods supporting operation and management of CNOs (BEs, VOs and PVCs),
- the possibility to gain information on, and to have an influence on, the latest developments in collaborative networks,
- the possibility to compare their experiences with other networked organizations,
- training in the main concepts and solutions, and support in the demonstrations,
- partial funding from the European Commission (see below).

ECOLEAD expects the new partners:
- to represent a collaborative networked organization (CNO), that is a network, a virtual organization or a professional virtual community,
- to operate in the role of end-users of ECOLEAD solutions and to perform the demonstrations collaboratively, each with its own CNO,
- to participate actively in the ECOLEAD collaboration and perform all the partner duties.

2. Summary of the tasks requested from the new partners

The tasks of the new partners include:
1. Demonstration tasks: This is the main task of the new partners. The objective is to demonstrate, test and evaluate ECOLEAD solutions which have been developed in the three ECOLEAD focus areas (mentioned above). The demonstrated solutions may be tools (e.g. supporting software), but also processes, methods or operating rules which support the operation of Collaborative Networked Organizations (CNOs).
2. Research, development, innovation tasks: The R&D tasks create the basis for the demonstrations and support the after-demonstration analysis. In the first phase, ECOLEAD solutions are analysed and applied to create and document a partner-CNO-specific demonstration scenario. A further task of the new SME network partners is to participate as end-users in the generation of ideas and future visions and in the further specification and development of solutions, based on experience derived from the demonstration. Participation in impact creation and dissemination of the results outside ECOLEAD is also encouraged.
3. Management tasks: A small amount of funding is reserved to cover the administrative tasks of the project.

3. Demonstration targets and task description

The demonstrations support the ECOLEAD goal by generating information about the applicability and impact of the ECOLEAD developments. The main candidate prototype demonstrations include:
- Breeding Environment (BE) partner profiles e-catalog & skill management system, VO Creation environment
- Virtual Organization management support & VO performance measurement process and support tool, VO e-services
- Professional Virtual Community collaboration platform, operation model & management, PVC metrics, PVC expertise profiling

Demonstrations are the main task of the new partners. They will be organised in two different stages:
- First (trial) stage, whose main aim is to test and validate the ECOLEAD approach from an operative point of view, with the objective of providing feedback and improvement suggestions on the developed ECOLEAD products. As collaborative solutions are tested, the participation of collaborating partners in the demonstrations is envisaged. The new partner (proposer) coordinates its own demonstration in its collaborative networked organization. Trial metrics will be prepared in advance by ECOLEAD.
- Second (take-up) stage, in which the adopted ECOLEAD methodologies and technologies will be assessed in terms of the impact of implementing the methodology or tool in company practices and the resulting business benefits. Take-up metrics will be prepared in advance.

Participation in collaborative workshops of the ECOLEAD consortium is expected. More detailed descriptions of the demonstrations can be found at www.ecolead.org.

4. Partner selection

The call is aimed at end-users of collaborative network solutions, that is, mainly organizations managing or operating in SME enterprise networks or Professional Virtual Communities. The ECOLEAD project will conclude the potential contract with only one legal entity. The proposer is advised to name the main intended co-operating companies. The expected number of new partners is 6-8, of which some would participate only in the first stage (trial) and the rest in both stages. Information about the selection criteria is available at www.ecolead.org.

5. Schedule and resources

Activity type; estimated total costs (all the new partners together); Commission funding percentage:
- Research; € [amount illegible in source]; up to 50%
- Demonstration; € [amount illegible in source]; up to 35%
- Consortium management; € [amount illegible in source]; up to 100%

Total Commission funding available for the new partners is €669,500. Expected duration of participation in project: between Project Month 21 and Month 48 (1.1.06-31.3.08). The project budget is revised yearly.

Updated information about the call: www.ecolead.org

Advertisement


Digital Rights Management

Digital Rights Management (DRM) presents a real challenge for content communities in this digital age. Security and encryption issues as a means of avoiding unauthorized copying were the main issues for DRM in the past, basically locking away content unless it was paid for. Today DRM deals with much broader issues such as the description, identification, trading, protection, monitoring and tracking of all forms of rights usages over both tangible and intangible assets, including management of rights holders’ relationships.

Typically a DRM framework is modelled around three areas of intellectual property:
• Asset Creation and Capture: To manage the creation of content so it can be easily traded. This includes asserting rights when content is first created (or reused and extended with appropriate rights to do so) by various content creators/providers.
• Asset Management: To manage the collection and trade of content. This includes accepting content from creators into an asset management system. Systems need to manage the descriptive metadata and rights metadata (eg, parties, uses, payments, etc.).
• Asset Use: To manage the use of content once it has been traded. This includes supporting constraints over traded content in specific desktop systems/software.

A DRM functional architecture addresses the following issues in each of these three main areas:

For asset creation and capture:
• Rights Validation – to ensure that content being created from existing content includes the rights to do so
• Rights Creation – to allow rights to be assigned to the new content, such as specifying the rights owners and permissible use
• Rights Workflow – to allow for content to be processed through a series of workflow steps for review and/or approval of rights (and content).

For asset management:
• Repository functions – to enable the access/retrieval of content in potentially distributed databases and the access/retrieval of metadata. The metadata includes Parties, Rights and descriptions of the Works
• Trading functions – to enable the assignment of licenses to parties who have traded agreements for rights over content, including payments from licensees to rights holders (eg royalty payments). In some cases the content may be encrypted/protected or packaged for a particular type of desktop environment.

For asset use:
• Permissions Management – to enable the use environment to adhere to the rights associated with the content, such as taking account of any restrictions imposed under the conditions of the licence
• Tracking Management – to monitor use of content where the license conditions impose restrictions on it, for example where the number of accesses is restricted, or perhaps to track payment.
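As a rough illustration of how the two asset-use functions fit together, the following Python sketch models a licence record that is consulted before each use (permissions management) and that logs every granted use against an access limit (tracking management). The class name `Licence`, its fields and the `request` method are hypothetical and are not taken from any DRM standard or product.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Licence:
    """Hypothetical rights-metadata record for one licensee (illustration only)."""
    licensee: str
    permitted_uses: Set[str]
    max_accesses: Optional[int] = None  # None means no limit on the number of accesses
    access_log: List[str] = field(default_factory=list)

    def request(self, party: str, use: str) -> bool:
        """Grant the use only if the licence conditions allow it."""
        # Permissions management: the party and the kind of use must be covered.
        if party != self.licensee or use not in self.permitted_uses:
            return False
        # A restricted number of accesses must not be exceeded.
        if self.max_accesses is not None and len(self.access_log) >= self.max_accesses:
            return False
        # Tracking management: record every granted use.
        self.access_log.append(use)
        return True
```

A licence created as `Licence("alice", {"view"}, max_accesses=2)` would grant two `view` requests from `alice`, refuse a third, and refuse any `print` request or any request from another party.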

For more information on Digital Rights Management architectures, see http://www.dlib.org/dlib/june01/iannella/06iannella.html

DRM is heavily criticised by some supporters of the Open Access movement, who believe it serves to ‘lock up’ content rather than seeing it as a tool to ensure correct author attribution, to certify integrity and provenance, to prevent plagiarism and to encourage authors to assert their creative rights rather than restrict them. Perhaps the most important area, though, is the use of rights metadata. The JISC-funded project RoMEO (http://www.lboro.ac.uk/departments/ls/disresearch/romeo/) found that 55% of researchers wanted to limit usage of their works to certain purposes, eg educational or non-commercial use. However, an ‘all rights reserved’ model was more than most researchers wanted. The project developed an XML-based system designed to express rights and permissions in an OA environment. Around this time a number of intellectual property lawyers got together and founded Creative Commons (http://creativecommons.org/), a not-for-profit corporation founded on the principle that some creators may not want to exercise all the intellectual property rights the law affords them. Creative Commons offers a number of different licence solutions to creators, such as: Attribution – free use by others on the basis that credit is given to the creator; Non-commercial – restricted to non-commercial use only; No derivative works – the work must be used as it is and not modified in any way; Share alike – allows others to distribute derivative works but only under an identical licence.

The creation of more and more compound digital objects, inheriting all of the existing rights of the individual digital components, along with the new rights of the newly-created object, makes Digital Rights Management in an e-Research environment more challenging than ever.

Plagiarism

Anti-plagiarism software, used by universities to check student essays, is being adapted to review academic papers prior to publication. This is one of the latest initiatives to be embraced by the big publishers, like Elsevier (the world’s largest scientific publisher, which publishes around a quarter of a million papers each year) and Blackwell. The software is being tested out on the arXiv physics preprint server at Cornell University. It runs an algorithm that looks for any two documents that share six of the same words in a row. Other anti-plagiarism tools are currently being developed. Little is known about them at present, but some of them could be free to use and targeted at editors and peer reviewers. Student anti-plagiarism software services not only detect duplicate publications but can also check documents against large databases.
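The idea of flagging two documents that share six of the same words in a row can be sketched in a few lines of Python. This is a minimal illustration and not the software being tested on arXiv: the function names `word_runs` and `shared_runs` are made up for the example, and the tokenisation (lower-casing, ignoring punctuation) is an assumption the article does not specify.

```python
import re


def word_runs(text, n=6):
    """Return the set of all runs of n consecutive words in text."""
    # Assumed tokenisation: lower-case, keep letters, digits and apostrophes.
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def shared_runs(doc_a, doc_b, n=6):
    """Return the runs of n consecutive words that occur in both documents."""
    return word_runs(doc_a, n) & word_runs(doc_b, n)


a = "The quick brown fox jumps over the lazy dog today"
b = "Yesterday the quick brown fox jumps over a fence"
print(shared_runs(a, b))
```

Comparing every pair of documents this way grows quadratically with the size of the collection, so a practical service would more likely index the runs of each document once and look up collisions; that refinement is left out of the sketch.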

by Heather Weaver, CCLRC, UK

EURO-LEGAL

News about legal information relating to Information

Technology from European directives, and pan-European

legal requirements and regulations.


IN BRIEF

ERCIM News is the magazine of ERCIM. Published quarterly, the newsletter reports on joint actions of the ERCIM partners, and aims to reflect the contribution made by ERCIM to the European Community in Information Technology. Through short articles and news items, it provides a forum for the exchange of information between the institutes and also with the wider scientific community. This issue has a circulation of 10,500 copies.

Advertising
For current advertising rates and conditions, see http://www.ercim.org/publication/ERCIM_News/ or contact [email protected]

Copyright Notice
All authors, as identified in each article, retain copyright of their work.

ERCIM News online edition
http://www.ercim.org/publication/ERCIM_News/

ERCIM News is published by ERCIM EEIG, BP 93, F-06902 Sophia-Antipolis Cedex
Tel: +33 4 9238 5010, E-mail: [email protected]
ISSN 0926-4981

Director: Jérôme Chailloux, ERCIM Manager

Central Editor:
Peter Kunz, ERCIM office, [email protected]

Local Editors:
AARIT: n.a.
CCLRC: Martin Prime

[email protected]: Michal Haindl

[email protected]: Annette Kik

[email protected]: Carol Peters

[email protected]: Margherita Antona

[email protected] ICT Group:

Michael [email protected]

FNR: Patrik [email protected]

FWO/FNRS: Benoît [email protected]

INRIA: Bernard [email protected]

Irish Universities Consortium:Ray [email protected]

NTNU: Truls [email protected]

SARIT: Harry [email protected]

SICS: Kersti [email protected]

SpaRCIM: Salvador [email protected]

SRCIM: Gabriela Andrejkova [email protected]

SZTAKI: Erzsébet Csuhaj-Varjú[email protected]

VTT: Pia-Maria Linden-Linna [email protected]

W3C: Marie-Claire [email protected]

Subscription
Subscribe to ERCIM News free of charge by:
• sending e-mail to your local editor
• contacting the ERCIM office (see address above)
• filling out the form at the ERCIM website at http://www.ercim.org/

CWI – Scientists of CWI and the Universities of Twente and Konstanz launched MonetDB/XQuery during the Holland Open Software Conference in Amsterdam, on May 31. Storing and querying high volumes of XML data require new software systems. MonetDB/XQuery is an open source system that provides a complete implementation of XQuery. This makes it possible to search both the content and structure of large XML documents. MonetDB/XQuery is built on the MonetDB Relational Database Management System. More information can be found at:
http://pathfinder-xquery.org/
http://www.cwi.nl/ins1
http://www.hollandopen.nl/index.jsp?nr=2610

CWI – Alexander Schrijver has been awarded the Spinoza Prize 2005. This was announced by the Netherlands Organization for Scientific Research in The Hague on 6 June. Lex Schrijver is a researcher at CWI in Amsterdam, leader of the Probability, Networks and Algorithms research cluster (until 1 July) and a member of CWI’s management team. He is also a part-time professor at the University of Amsterdam. Schrijver received this prestigious prize, called the ‘Dutch Nobel Prize’, for his outstanding, pioneering and inspiring research in the field of combinatorics and algorithms. The Spinoza Prize is the most distinguished award in science in the Netherlands and consists of 1.5 million euros, to be spent on research of choice, and a statue of Spinoza. The official ceremony will take place on Wednesday 23 November 2005.

INRIA – It is with profound sadness that INRIA officially announces the death of Isabelle Attali and her two children, Tom and Ugo, following the December 26, 2004 tsunami.

Isabelle Attali had been the scientific head of the Oasis team since January 2000. She was very much involved in numerous national, European and international initiatives. She had been Vice-President of the Institute's Evaluation Committee since 2003, head of a work group of the GridCoord European project launched in July 2004, and an active member of the ERCIM-administered CoreGRID network of excellence. Up to 2001, Isabelle Attali took care of the Young Researcher in Programming School, which she had helped to create in 1995. She also devoted much of her time and energy to the Telecom Valley association at Sophia Antipolis.

In December 2004, Isabelle and her companion, Denis Caromel, also a member of project Oasis, were on assignment in Sri Lanka with their children. They were teaching at a winter school organized by Cimpa and Unesco and were setting up a collaboration with the University of Ruhuna. As a tribute to her major commitment to all these endeavors, the Sophia Antipolis Foundation and INRIA have decided to create a fund to facilitate exchanges between Sri Lanka and Sophia Antipolis in the field of information and communication science and technology. This fund may be provisioned by individual and corporate donations. For further information about the fund, contact [email protected]

VTT – After many years on the ERCIM Executive Committee, Seppo Linnainmaa, head of VTT's Information Systems department, now represents VTT on ERCIM's Board of Directors, succeeding Pekka Silvennoinen, Executive Director of VTT Information Technology. Seppo Linnainmaa was replaced on the Executive Committee by Seppo Valli, head of the Multimedia research group of the ‘Networks’ department.

Photo: A. Wapenaar


ERCIM – The European Research Consortium for Informatics and Mathematics is an organisation

dedicated to the advancement of European research and development, in information technology

and applied mathematics. Its national member institutions aim to foster collaborative work within

the European research community and to increase co-operation with European industry.

ERCIM is the European Host of the World Wide Web Consortium.

Fraunhofer ICT Group
Friedrichstr. 60
10117 Berlin, Germany
Tel: +49 30 726 15 66 0, Fax: +49 30 726 15 66 19
http://www.iuk.fraunhofer.de

Institut National de Recherche en Informatique et en Automatique
B.P. 105, F-78153 Le Chesnay, France
Tel: +33 1 3963 5511, Fax: +33 1 3963 5330
http://www.inria.fr/

Swedish Institute of Computer Science
Box 1263, SE-164 29 Kista, Sweden
Tel: +46 8 633 1500, Fax: +46 8 751 72 30
http://www.sics.se/

Technical Research Centre of Finland
P.O. Box 1200
FIN-02044 VTT, Finland
Tel: +358 9 456 6041, Fax: +358 9 456 6027
http://www.vtt.fi/tte

Swiss Association for Research in Information Technology
c/o Prof. Dr Alfred Strohmeier, EPFL-IC-LGL, CH-1015 Lausanne, Switzerland
Tel: +41 21 693 4231, Fax: +41 21 693 5079
http://www.sarit.ch/

Slovak Research Consortium for Informatics and Mathematics
Comenius University, Dept. of Computer Science, Mlynska Dolina M, SK-84248 Bratislava, Slovakia
Tel: +421 2 654 266 35, Fax: +421 2 654 270 41
http://www.srcim.sk

Magyar Tudományos Akadémia
Számítástechnikai és Automatizálási Kutató Intézet
P.O. Box 63, H-1518 Budapest, Hungary
Tel: +36 1 279 6000, Fax: +36 1 466 7503
http://www.sztaki.hu/

Irish Universities Consortium
c/o School of Computing, Dublin City University
Glasnevin, Dublin 9, Ireland
Tel: +3531 7005636, Fax: +3531 7005442
http://ercim.computing.dcu.ie/

Austrian Association for Research in IT
c/o Österreichische Computer Gesellschaft
Wollzeile 1-3, A-1010 Wien, Austria
Tel: +43 1 512 02 35 0, Fax: +43 1 512 02 35 9
http://www.aarit.at/

Norwegian University of Science and Technology
Faculty of Information Technology, Mathematics and Electrical Engineering, N 7491 Trondheim, Norway
Tel: +47 73 59 80 35, Fax: +47 73 59 36 28
http://www.ntnu.no/

Spanish Research Consortium for Informatics and Mathematics
c/o Esperanza Marcos, Rey Juan Carlos University, C/ Tulipan s/n, 28933-Móstoles, Madrid, Spain
Tel: +34 91 664 74 91, Fax: +34 91 664 74 90
http://www.sparcim.org

Consiglio Nazionale delle Ricerche, ISTI-CNR
Area della Ricerca CNR di Pisa, Via G. Moruzzi 1, 56124 Pisa, Italy
Tel: +39 050 315 2878, Fax: +39 050 315 2810
http://www.isti.cnr.it/

Centrum voor Wiskunde en Informatica
Kruislaan 413, NL-1098 SJ Amsterdam, The Netherlands
Tel: +31 20 592 9333, Fax: +31 20 592 4199
http://www.cwi.nl/

Council for the Central Laboratory of the Research Councils, Rutherford Appleton Laboratory
Chilton, Didcot, Oxfordshire OX11 0QX, United Kingdom
Tel: +44 1235 82 1900, Fax: +44 1235 44 5385
http://www.cclrc.ac.uk/

Foundation for Research and Technology – Hellas
Institute of Computer Science
P.O. Box 1385, GR-71110 Heraklion, Crete, Greece
Tel: +30 2810 39 16 00, Fax: +30 2810 39 16 01
http://www.ics.forth.gr/FORTH

Czech Research Consortium for Informatics and Mathematics
FI MU, Botanicka 68a, CZ-602 00 Brno, Czech Republic
Tel: +420 2 688 4669, Fax: +420 2 688 4903
http://www.utia.cas.cz/CRCIM/home.html

Fonds National de la Recherche
6, rue Antoine de Saint-Exupéry, B.P. 1777
L-1017 Luxembourg-Kirchberg
Tel: +352 26 19 25-1, Fax: +352 26 1925 35
http://www.fnr.lu

FWO
Egmontstraat 5
B-1000 Brussels, Belgium
Tel: +32 2 512.9110
http://www.fwo.be/

FNRS
rue d'Egmont 5
B-1000 Brussels, Belgium
Tel: +32 2 504 92 11
http://www.fnrs.be/

Order Form

If you wish to subscribe to ERCIM News free of charge, or if you know of a colleague who would like to receive regular copies of ERCIM News, please fill in this form and we will add you/them to the mailing list.

Send, fax or email this form to:

ERCIM NEWS
2004 route des Lucioles
BP 93
F-06902 Sophia Antipolis Cedex
Fax: +33 4 9238 5011
E-mail: [email protected]

Data from this form will be held on a computer database.

By giving your email address, you allow ERCIM to send you email

I wish to subscribe to the

❏ printed edition ❏ online edition (email required)

Name:

Organisation/Company:

Address:

Post Code:

City:

Country:

E-mail:

You can also subscribe to ERCIM News and order back copies by filling out the form at the ERCIM website at http://www.ercim.org/publication/Ercim_News/

