A fuzzy video content representation for video summarization and content-based
retrieval
Anastasios D. Doulamis, Nikolaos D. Doulamis, Stefanos D. Kollias
2000
Presented by Mohammed S. Al-Logmani
Agenda
• Introduction
• Motivation/ Problem Statement
• Video Sequence Analysis
• Fuzzy Visual Content Representation
• Video Summarization
• Content-Based Retrieval
• Experimental Results
• Future Work
• Conclusion
Introduction
• The increasing amount of digital image & video data requires new technologies for efficient searching, indexing, content-based retrieval & management of multimedia databases.
• Drawbacks of keyword annotations:
  • Developing them takes a large amount of effort.
  • Text alone cannot efficiently characterize the rich visual content.
Introduction Cont.
• Content-based algorithms
  • QBIC
  • VisualSeek
  • Virage
• Cannot easily be applied to video databases.
  • Performing queries on every frame is inefficient & time consuming
  • Video databases are distributed, which imposes large storage & transmission requirements
Introduction Cont.
• Content-based sampling algorithms
  • Extract small but meaningful information (summarization)
• Require a more meaningful representation of visual content than the traditional pixel-based one
• Related Work:
  • A hidden Markov model for color image retrieval
  • An approach to image retrieval based on user sketches
  • A hierarchical color clustering method
  • Construction of a compact image map or image mosaics for video summarization
  • A pictorial summary of video sequences based on story units
Motivation/ Problem Statement
• Increase the flexibility of content-based retrieval systems
  • Provide an interpretation closer to human perception
• Obtain a more robust description of visual content
  • Possible instabilities of the segmentation are reduced
• Approach: a fuzzy representation of visual content
• Video summarization
  • Performed by minimizing a cross-correlation criterion among the video frames using a genetic algorithm (GA).
  • The correlation is computed from several features, extracted by a color/motion segmentation and formulated as a fuzzy feature vector.
• Content-based indexing & retrieval
  • The user provides queries (images or sketches), which are analyzed in the same way as video frames in the video summarization scheme.
  • A distance metric or similarity measure is then used to find the set of frames that best match the user's query.
Video Sequence Analysis
• A color/motion segmentation algorithm is applied for visual content description
• Multiresolution Recursive Shortest Spanning Tree (M-RSST)
  • Recursively applies the RSST to images of increasing resolution (a truncated image pyramid is created)
  • Produces the same results as RSST in less time
  • Eliminates small segments
• Factors affecting the segmentation efficiency
  • The initial image resolution level
    • Selected to be 3 (downsampling by 8x8 pixels)
  • The threshold used for terminating the algorithm
    • Euclidean distance of the color or motion intensities between two neighboring segments
    • The segmentation terminates if no segments are merged from one step to the next
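The merging and termination rule above can be illustrated with a small sketch. This is not the authors' M-RSST: it is a single-resolution, greedy region-merging loop in the spirit of RSST, working on a tiny grayscale image, and the function name `rsst_segment`, the use of region means as the distance, and the sample values are all assumptions made for illustration.

```python
def rsst_segment(image, threshold):
    """Greedy region merging on a small 2-D grayscale image (list of rows).

    Every pixel starts as its own region. At each step the two neighbouring
    regions with the closest mean intensity are merged. The loop stops when
    the closest pair differs by more than `threshold`, i.e. nothing merges.
    Returns a label map with the same shape as `image`.
    """
    h, w = len(image), len(image[0])
    label = [[r * w + c for c in range(w)] for r in range(h)]
    total = {r * w + c: float(image[r][c]) for r in range(h) for c in range(w)}
    size = {k: 1 for k in total}

    def mean_gap(pair):
        a, b = pair
        return abs(total[a] / size[a] - total[b] / size[b])

    def neighbour_pairs():
        # All pairs of distinct regions that touch (4-connectivity).
        pairs = set()
        for r in range(h):
            for c in range(w):
                for dr, dc in ((0, 1), (1, 0)):
                    rr, cc = r + dr, c + dc
                    if rr < h and cc < w and label[r][c] != label[rr][cc]:
                        pairs.add(tuple(sorted((label[r][c], label[rr][cc]))))
        return pairs

    while True:
        pairs = neighbour_pairs()
        if not pairs:
            break  # a single region is left
        a, b = min(pairs, key=mean_gap)
        if mean_gap((a, b)) > threshold:
            break  # no pair is similar enough, so the segmentation ends
        total[a] += total.pop(b)
        size[a] += size.pop(b)
        label = [[a if v == b else v for v in row] for row in label]
    return label


if __name__ == "__main__":
    toy = [[10, 10, 200],
           [10, 12, 200]]
    for row in rsst_segment(toy, threshold=20):
        print(row)
```

The sketch recomputes the neighbour list at every step, which is only acceptable for toy inputs; the multiresolution scheme described on the slide exists precisely to avoid that cost on full-size frames.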
Video Sequence Analysis cont.
Fuzzy visual content representation
• The size & location of segments cannot be used directly
  • The number of segments is not constant across video frames
• To overcome this problem, segments are classified into predetermined classes of color/motion properties
• To avoid classifying two similar segments into different classes, a degree of membership is allocated to each class
  • Resulting in a fuzzy classification formulation
• Create a fuzzy multidimensional histogram
Fuzzy visual content representation Cont.
• Example: a single property s is used for each segment.
  • s takes values in [0,1]
  • It is classified into Q classes using Q membership functions μ_n(s), n = 1, 2, …, Q
  • μ_n(s): degree of membership of s in the nth class
• Assume a video frame consists of K segments
  • First, evaluate the degree of membership μ_n(s_i) of the feature s_i, i = 1, 2, …, K, of the ith segment
  • Then, find the degree of membership of the frame in the nth class through the fuzzy histogram
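A minimal sketch of the one-dimensional case may help. It assumes triangular membership functions with evenly spaced centres, and it assumes the histogram value for class n is the average membership of the K segment features in that class, H(n) = (1/K) Σ μ_n(s_i). The slide does not state the membership shape or the formula, so both are illustrative choices, as are the function names.

```python
def triangular_memberships(s, q):
    """Degrees of membership of s (in [0, 1]) in q overlapping triangular classes.

    Assumption: class centres are evenly spaced on [0, 1] and each triangle
    reaches zero at the centres of its neighbours.
    """
    if q < 2:
        raise ValueError("need at least two classes")
    width = 1.0 / (q - 1)  # distance between neighbouring class centres
    centres = [n * width for n in range(q)]
    return [max(0.0, 1.0 - abs(s - c) / width) for c in centres]


def fuzzy_histogram(features, q):
    """Average membership of the K segment features in each of the q classes."""
    k = len(features)
    if k == 0:
        raise ValueError("frame has no segments")
    hist = [0.0] * q
    for s in features:
        for n, mu in enumerate(triangular_memberships(s, q)):
            hist[n] += mu / k
    return hist


if __name__ == "__main__":
    # Two frames whose middle segment falls just either side of 0.5.
    print(fuzzy_histogram([0.10, 0.49, 0.90], q=3))
    print(fuzzy_histogram([0.10, 0.51, 0.90], q=3))
```

With crisp classes, two segments on either side of a class boundary would land in different bins. With overlapping memberships the two histograms differ only slightly, which is the robustness argument made on the slide. The vector length is Q regardless of how many segments a frame has.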
Video summarization
Video summarization Cont.
• Extraction of key-frames
  • Key-frames are extracted by minimizing a cross-correlation criterion, so that the selected frames are not similar to each other.
• The genetic algorithm (GA) approach
  • The problem has similarities to the traveling salesman problem (TSP).
  • Initially, a population of m chromosomes is created.
  • Evaluate the performance of all chromosomes in population P(n) using a correlation measure.
  • Evaluate chromosome quality using fitness functions.
  • Select parents so that a fitter chromosome produces a higher number of offspring.
  • The GA terminates when the best chromosome fitness remains constant for a large number of generations.
• Examined about 170 shots, with Kf = 6 key-frames and Q = 3 classes
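The GA steps listed above can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: a chromosome is taken to be a set of Kf frame indices, the cost is the mean squared Pearson correlation over all selected pairs, and the crossover, mutation rate, population size and `patience` parameter are hypothetical choices.

```python
import random
from itertools import combinations


def correlation(x, y):
    """Pearson correlation coefficient of two equal-length feature vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5 if vx > 0 and vy > 0 else 0.0


def cost(chromosome, features):
    """Mean squared correlation over all pairs of selected frames (lower is better)."""
    pairs = list(combinations(chromosome, 2))
    return sum(correlation(features[i], features[j]) ** 2 for i, j in pairs) / len(pairs)


def select_key_frames(features, kf, pop_size=30, patience=20, seed=0):
    """Pick kf mutually dissimilar frames; stop after `patience` stale generations."""
    rng = random.Random(seed)
    n = len(features)
    population = [tuple(sorted(rng.sample(range(n), kf))) for _ in range(pop_size)]
    best, best_cost, stale = None, float("inf"), 0
    while stale < patience:
        costs = [cost(c, features) for c in population]
        i_best = min(range(pop_size), key=costs.__getitem__)
        if costs[i_best] < best_cost - 1e-12:
            best, best_cost, stale = population[i_best], costs[i_best], 0
        else:
            stale += 1
        # Lower cost gives a larger weight, hence more offspring on average.
        fitness = [1.0 / (1e-9 + c) for c in costs]
        children = [best]  # keep the best chromosome found so far
        while len(children) < pop_size:
            p1, p2 = rng.choices(population, weights=fitness, k=2)
            child = rng.sample(sorted(set(p1) | set(p2)), kf)  # crossover
            if rng.random() < 0.2:  # mutation: replace one gene
                child[rng.randrange(kf)] = rng.randrange(n)
            if len(set(child)) == kf:  # discard children with repeated frames
                children.append(tuple(sorted(child)))
        population = children
    return best, best_cost


if __name__ == "__main__":
    # Nine toy frames in three groups of identical feature vectors.
    frames = [[1, 0, 0]] * 3 + [[0, 1, 0]] * 3 + [[0, 0, 1]] * 3
    print(select_key_frames(frames, kf=3))
```

In the toy data the lowest-cost selections take one frame from each group, since frames within a group are perfectly correlated. In the setting described on the slide, `features` would hold one fuzzy feature vector per frame of a shot.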
Content-based retrieval
• Apply the previous scheme to discard all the redundant temporal video information
• The user can submit:
  • Images (query by example)
  • Sketches (query by sketch)
• Analyze the query using M-RSST
  • Extract and classify the segments
• Apply a distance or similarity measure
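The final matching step could look like the sketch below. It assumes the query and the stored key-frames have already been reduced to feature vectors of equal length, and it uses a weighted Euclidean distance as one plausible choice of measure. The function names, the weights and the sample vectors are hypothetical.

```python
def weighted_distance(query, candidate, weights):
    """Weighted Euclidean distance between two equal-length feature vectors."""
    return sum(w * (q - c) ** 2 for w, q, c in zip(weights, query, candidate)) ** 0.5


def retrieve(query, key_frames, weights, top_n=3):
    """Return the ids of the top_n stored key-frames closest to the query.

    `key_frames` maps a frame id to its feature vector.
    """
    ranked = sorted(
        key_frames,
        key=lambda fid: weighted_distance(query, key_frames[fid], weights),
    )
    return ranked[:top_n]


if __name__ == "__main__":
    stored = {
        "shot1_kf0": [0.8, 0.2, 0.0],
        "shot2_kf0": [0.1, 0.8, 0.1],
        "shot3_kf0": [0.0, 0.3, 0.7],
    }
    print(retrieve([0.7, 0.3, 0.0], stored, weights=[1.0, 1.0, 1.0], top_n=2))
```

Only the key-frames kept by the summarization step are compared, which is what makes the search cheaper than querying every frame.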
Experimental results
Future Work
• Increase the system accuracy by developing a fuzzy adaptive mechanism for estimating the distance weights.
Conclusion
• Presented a fuzzy video content representation
• Efficient for:
  • Video summarization
  • Content-based image indexing & retrieval
• Experimental results indicate that this approach outperforms the other methods in both accuracy and computational efficiency