1
Final Year Project 2003/2004
LYU0302
PVCAIS – Personal Video Conference Archives Indexing System
Supervisor: Prof. Michael Lyu
Presented by: Lewis Ng, Philip Chan
15 March 2004
2
Outline
• Introduction
• Motivation
• Architecture of PVCAIS
  • Media Acquisition Module
  • Archive Indexing Module
  • Videoconference Accessing Module
• Implementation
• Conclusions
• Future Work
3
Introduction
• PVCAIS stands for Personal Video Conference Archives Indexing System
• A system that provides convenient searching and browsing support for videoconferencing users on past videoconference archives
4
Introduction
• What is a video conference?
  • A real-time communication technology which combines different media that may include:
    • audio, video, text chat, file transfer, whiteboard and shared applications
  • More precisely, it is a "multimedia conference"
5
Motivation
• Videoconferencing is becoming popular in education, business and personal communication
• Participants wish to keep videoconference archives for later reference
• Plain video and audio files are neither searchable nor helpful for recalling their contents
• Indexing of videoconference archives has not been investigated until now
6
Architecture of PVCAIS
• Consists of 3 modules:
  • Media Acquisition Module
  • Archive Indexing Module
  • Videoconference Accessing Module
[Figure: Media acquisition → Raw videoconference archives → Archive indexing → Indexed videoconference archives → Videoconference accessing]
8
Architecture Media Acquisition Module
• Extracts channel data and forms media files
• Videoconferencing physically contains 4 types of channels: Audio, Video, Data and Control
• Audio and Video channels: transmit incoming/outgoing audio and video information
• Data channel: carries information for user applications such as Text Chat, Whiteboard and File Transfer
• Control channel: transmits system control information such as Member Information
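The channel split above can be illustrated with a minimal sketch. The packet layout `(channel, timestamp, payload)` and the function name `demultiplex` are assumptions made for illustration; the slides do not show how PVCAIS represents channel data internally.

```python
from collections import defaultdict

# Channel names follow the slide; the packet layout is a hypothetical stand-in.
CHANNELS = ("audio", "video", "data", "control")

def demultiplex(packets):
    """Group (channel, timestamp, payload) tuples by channel, in time order."""
    streams = defaultdict(list)
    for channel, timestamp, payload in packets:
        if channel not in CHANNELS:
            raise ValueError(f"unknown channel: {channel}")
        streams[channel].append((timestamp, payload))
    for items in streams.values():
        items.sort(key=lambda item: item[0])  # order each stream by timestamp
    return dict(streams)

packets = [
    ("data", 2.0, "chat: hello"),
    ("control", 0.0, "member joined: Alice"),
    ("audio", 1.0, b"\x00\x01"),
    ("data", 1.5, "whiteboard: line"),
]
streams = demultiplex(packets)
```

Each resulting stream could then be written out as one raw media file per channel.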
11
Architecture Archive Indexing Module
• Raw files are extracted in the Media Acquisition Module
• Need to implement some multimedia indexing functions to retrieve more information
• These include:
  • Face Detection, Face Recognition, Speech Recognition, Time-based Text Merging, Keyword Selection, Title Generation
12
Architecture Archive Indexing Module
• Face Detection and Recognition
  • Associate human faces in the incoming video (Video-in) with names
  • Need to keep a face base
  • If there is no match in the face base, ask the remote user to enter the name
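The lookup-or-ask flow above can be sketched as a nearest-neighbour match against the face base. This is a sketch under stated assumptions: faces are reduced to small numeric feature vectors, and the distance threshold and the names `identify` and `identify_or_register` are hypothetical. The slides do not say which recognition algorithm PVCAIS uses.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def identify(face_vector, face_base, threshold=0.5):
    """Return the closest name in face_base, or None if nothing is close enough."""
    best_name, best_dist = None, float("inf")
    for name, known in face_base.items():
        dist = euclidean(face_vector, known)
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name if best_dist <= threshold else None

def identify_or_register(face_vector, face_base, ask_name, threshold=0.5):
    """Match a face; if there is no match, ask for a name and store the face."""
    name = identify(face_vector, face_base, threshold)
    if name is None:
        name = ask_name()              # stands in for prompting the remote user
        face_base[name] = face_vector  # grow the face base
    return name

face_base = {"Alice": [0.1, 0.9, 0.3], "Bob": [0.8, 0.2, 0.5]}
known = identify_or_register([0.12, 0.88, 0.31], face_base, ask_name=lambda: "unused")
new = identify_or_register([0.5, 0.5, 0.9], face_base, ask_name=lambda: "Carol")
```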
13
Architecture Archive Indexing Module
• Speech Recognition
  • Generate a speech transcript from the audio archive
  • The speech of a videoconference contains the most information
  • Can use a commercial library: Microsoft SAPI, IBM ViaVoice
14
Architecture Archive Indexing Module
• Time-based Text Merging
  • Merge the speech transcript, chat script, whiteboard script and slide text archive into the Text Source according to their timestamps
• Keyword Selection
  • Take the Text Source as input
  • Generate keywords for the videoconference
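The two steps above can be sketched together. This assumes each text archive is already a time-sorted list of `(timestamp, origin, text)` entries, and it uses plain term frequency with a small stop-word list as a stand-in for keyword selection. The slides do not state the actual selection method, and all names here are illustrative.

```python
import heapq
import re
from collections import Counter

# A tiny illustrative stop-word list; a real one would be much longer.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "is", "in", "we", "for", "on"}

def merge_by_time(*sources):
    """Merge time-sorted (timestamp, origin, text) lists into one Text Source."""
    return list(heapq.merge(*sources, key=lambda entry: entry[0]))

def select_keywords(text_source, count=3):
    """Pick the most frequent non-stop-words from the merged text."""
    words = []
    for _, _, text in text_source:
        words += [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]
    return [word for word, _ in Counter(words).most_common(count)]

speech = [(1.0, "speech", "We review the indexing module"), (9.0, "speech", "Indexing is done")]
chat = [(4.0, "chat", "indexing module demo on Friday")]
whiteboard = [(6.0, "whiteboard", "module diagram")]

text_source = merge_by_time(speech, chat, whiteboard)
keywords = select_keywords(text_source, count=2)
```

Keeping the origin tag on every entry lets a later playback step show where each piece of text came from.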
15
Architecture Archive Indexing Module
• Title Generation
  • Take the Text Source as input
  • Automatically generate a title for the videoconference
• Generate XML index file
  • Integrate all the archives
  • Store all the related files of a videoconference into a single directory
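An XML index of this kind might be produced as sketched below. The element and attribute names (`conference`, `keywords`, `members`, `media`, `file`) are hypothetical; the slides do not show the actual PVCAIS index schema.

```python
import xml.etree.ElementTree as ET

def build_index(title, keywords, members, media_files):
    """Build an XML index string tying together the archives of one conference."""
    root = ET.Element("conference")
    ET.SubElement(root, "title").text = title
    keyword_list = ET.SubElement(root, "keywords")
    for word in keywords:
        ET.SubElement(keyword_list, "keyword").text = word
    member_list = ET.SubElement(root, "members")
    for name in members:
        ET.SubElement(member_list, "member").text = name
    media = ET.SubElement(root, "media")
    for kind, path in media_files:
        # Paths are relative to the single directory holding the conference.
        ET.SubElement(media, "file", type=kind, src=path)
    return ET.tostring(root, encoding="unicode")

index_xml = build_index(
    title="Indexing module review",
    keywords=["indexing", "module"],
    members=["Alice", "Bob"],
    media_files=[("audio", "audio.wav"), ("keyframe", "frame_0001.jpg")],
)
```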
16
Architecture Videoconference Accessing Module
• Provides an interface for the user to manage, search and review all indexed conference archives
• Allows the user to modify the content of a conference, such as editing the title or keywords, or to delete a conference
• Allows the user to search for a conference by different criteria, such as period of meeting, member name or keyword
• Allows the user to review a conference by playing back different media in a synchronized way
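The search criteria listed above (period, member name, keyword) can be sketched as a filter over index records. The `Conference` record and the `search` function are illustrative assumptions, not the module's actual interface.

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class Conference:
    """Hypothetical in-memory view of one indexed conference."""
    title: str
    held_on: date
    members: list = field(default_factory=list)
    keywords: list = field(default_factory=list)

def search(conferences, start=None, end=None, member=None, keyword=None):
    """Return conferences matching every criterion that was supplied."""
    results = []
    for conf in conferences:
        if start is not None and conf.held_on < start:
            continue
        if end is not None and conf.held_on > end:
            continue
        if member is not None and member not in conf.members:
            continue
        if keyword is not None and keyword.lower() not in (k.lower() for k in conf.keywords):
            continue
        results.append(conf)
    return results

archive = [
    Conference("Indexing review", date(2004, 2, 10), ["Alice", "Bob"], ["indexing"]),
    Conference("SMIL playback", date(2004, 3, 1), ["Bob"], ["SMIL", "playback"]),
]
hits = search(archive, start=date(2004, 2, 15), member="Bob", keyword="smil")
```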
17
Implementation
• Face Verification Feature
  • Each registered user is assigned a user ID, and his/her face is saved in the face base
  • Before a user joins a videoconference, PVCAIS needs to verify the user's face against his/her user ID
18
Implementation
• NetMeeting 3.0
  • A Windows feature that provides Internet conferencing functions
  • Supports video, audio and data conferencing, including application sharing, chat, whiteboard and file transfer
  • Other features include remote desktop sharing
19
Implementation
• NetMeeting 3.0 SDK
  • An extension of NetMeeting that provides an interface for programmers and Web developers to integrate conferencing capabilities into their applications
  • The API is in the form of COM interfaces and functions
20
Implementation
• A simple NetMeeting-compatible videoconference program built on top of the NetMeeting 3.0 SDK
• Supports:
  • Video
  • Audio
  • Text Chat
  • File Transfer
  • Whiteboard
21
Implementation Media Acquisition Module
• By directly using the functions of the API, the following raw data can be obtained:
  • member information
  • file transfer record
  • text message record
• Video, audio and whiteboard data cannot be directly obtained
22
Implementation Media Acquisition Module
• Video
  • create a thread to check the display of the video windows
  • if a scene change is detected, the video will be captured and stored as a still image
  • the stored images are the key frames of the conference
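Key-frame capture on scene change can be sketched with simple frame differencing. This assumes frames arrive as equally sized flat lists of grayscale pixel values and that a scene change means a mean absolute difference above a hypothetical threshold; the slides do not state the actual detection rule.

```python
def frame_difference(frame_a, frame_b):
    """Mean absolute difference between two equally sized grayscale frames."""
    return sum(abs(a - b) for a, b in zip(frame_a, frame_b)) / len(frame_a)

def extract_key_frames(frames, threshold=30.0):
    """Return indices of frames that differ strongly from the last key frame."""
    key_indices = []
    last_key = None
    for index, frame in enumerate(frames):
        # The first frame is always kept; later ones only on a large change.
        if last_key is None or frame_difference(last_key, frame) > threshold:
            key_indices.append(index)
            last_key = frame
    return key_indices

frames = [
    [10, 10, 10, 10],
    [12, 11, 10, 9],         # small change: same scene
    [200, 210, 190, 205],    # large change: new scene
    [198, 212, 191, 204],    # small change: same scene
]
key_frames = extract_key_frames(frames)
```

Comparing against the last key frame, rather than the previous frame, avoids missing a slow drift that adds up to a different scene.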
23
Implementation Media Acquisition Module
• Audio
  • create a thread to record the local audio from the microphone
  • members of the conference will continuously exchange the audio data
  • all the received audio files and locally recorded audio files will be combined to generate a single audio file
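One possible reading of "combined" is sample-wise mixing of time-aligned tracks. The sketch below assumes 16-bit PCM samples held in plain lists and tracks that start at the same instant; the slides do not say whether PVCAIS mixes, concatenates or otherwise aligns the audio.

```python
def mix_tracks(tracks):
    """Mix 16-bit PCM sample lists by summing and clipping.

    Shorter tracks are treated as silent once they run out of samples.
    """
    length = max((len(track) for track in tracks), default=0)
    mixed = []
    for i in range(length):
        total = sum(track[i] for track in tracks if i < len(track))
        mixed.append(max(-32768, min(32767, total)))  # clip to the 16-bit range
    return mixed

local = [1000, 2000, 30000]
remote = [500, -2500, 10000, 400]
combined = mix_tracks([local, remote])
```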
24
Implementation Media Acquisition Module
• Whiteboard
  • cannot capture the NetMeeting whiteboard information because the format of the data is not stated in the API
  • We have designed and created our own whiteboard and data format
25
Implementation Archive Indexing Module
• The stored key-frames will be used for face detection and recognition after the conference
• The final audio file will be used for speech recognition; the speech engine used is Microsoft SAPI
26
Implementation Videoconference Accessing Module
• An interface for conference management
  • search for a conference by member name or chat content
  • review a conference by playing back its content, including audio, key frames, member information, file exchange record and chat content
27
Implementation Videoconference Accessing Module
• SMIL
  • stands for Synchronized Multimedia Integration Language
  • HTML-like language
  • can integrate streaming audio and video with images, text, or any other media type into one presentation
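A synchronized playback document of this kind might be generated as sketched below. `smil`, `body`, `par`, `seq`, `audio` and `img` are standard SMIL elements; the file names and the function `build_smil` are hypothetical, and the slides do not show the SMIL that PVCAIS actually produces.

```python
import xml.etree.ElementTree as ET

def build_smil(audio_src, key_frames):
    """Build a SMIL string that plays audio alongside a key-frame slideshow.

    key_frames is a list of (image_path, seconds_on_screen) pairs.
    """
    smil = ET.Element("smil")
    body = ET.SubElement(smil, "body")
    par = ET.SubElement(body, "par")   # children of <par> play in parallel
    ET.SubElement(par, "audio", src=audio_src)
    seq = ET.SubElement(par, "seq")    # children of <seq> play one after another
    for path, seconds in key_frames:
        ET.SubElement(seq, "img", src=path, dur=f"{seconds}s")
    return ET.tostring(smil, encoding="unicode")

smil_text = build_smil("audio.wav", [("frame_0001.jpg", 12), ("frame_0002.jpg", 30)])
```

Each image's `dur` would come from the gap between consecutive key-frame timestamps, so the slideshow stays in step with the audio.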
28
Conclusions
• We developed a videoconferencing client
• All the media can be extracted in the Media Acquisition Module
• Multimedia indexing functions are implemented
• A standalone Videoconference Accessing Module is being developed