Evaluating Audio Skimming and Frame Rate Acceleration for Summarizing BBC Rushes

Evaluating Audio Skimming and Frame Rate Acceleration for Summarizing BBC Rushes. Mike Christel, Wei-Hao Lin, and Bryan Maher, {christel, whlin, bsm}@cs.cmu.edu, School of Computer Science, Carnegie Mellon University. CIVR, July 8, 2008.
Page 1: Evaluating Audio Skimming and Frame Rate Acceleration for Summarizing BBC Rushes

Evaluating Audio Skimming and Frame Rate Acceleration for Summarizing BBC Rushes

CIVR, July 8, 2008

Mike Christel, Wei-Hao Lin, and Bryan Maher
{christel, whlin, bsm}@cs.cmu.edu

School of Computer Science Carnegie Mellon University

Page 2

Talk Outline

• TRECVID 2007 BBC Rushes Summarization Task

• Look at a few Video Summarizations

• Assessment Procedure: Are they any good?

• First Study: 25x, cluster, pz

• Second Study (focus on acceleration): 25x, 50x, 100x, pz

• Discussion

Page 3

TRECVID 2007 BBC Rushes Summarization

• Video summary is “a condensed version of some information, such that various judgments about the full information can be made using only the summary and taking less time and effort than would be required using the full information source”

• Maximum 4% duration

• Benefits of this TRECVID task: provides a reasonably large video collection to be summarized, a uniform method of creating ground truth, and a uniform scoring mechanism

Page 4

BBC Rushes

• 42 test videos (+ development ones) from BBC Archive

• Test videos:
  • minimum duration 3.3 minutes
  • maximum duration 36.4 minutes
  • mean duration 25 minutes

• Raw (unedited) rush video with a great deal of redundancy (repeated takes), mixed quality audio, “junk” frames

Page 5

Summary Demonstration

Page 6

Assessment Procedure

Page 7

Assessment (Text Inclusions of Prior Slide)

• pan left to right around table with five people eating dinner
• pan right to left around table with five people sitting talking
• curly haired man stands up from the table
• closeup of grey haired lady, dinner table not visible
• grey haired lady across dinner table, green wine bottle visible in foreground
• grey haired lady across dinner table, camera pans right
• grey haired lady across dinner table, green wine bottle not visible in foreground
• partial view of person to the right talking to grey haired lady across dinner table
• closeup of short haired man sitting, without his hands clasped together
• closeup of blonde lady as she stands up, there is a fire in the background
• closeup of curly haired man without a hand on his face
• closeup of curly haired man as he stands up

Page 8

Assessment Procedure, Grading

Page 9

Assessment Metrics

• Duration (DU, <= 4% of the target video)

• Assessor time-on-task (TT) judging which ground truth segments were included in the summary

• The fraction of listed text segments from the full video included in the summary as judged by assessor (IN)

• Ease of use to find desired content (EA)

• How redundant was the summary (RE)

• …ideal summary would have the smallest DU and TT necessary to achieve sufficient IN performance with high user satisfaction based on subjective EA and RE
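
As an illustration of the IN metric only, the sketch below computes the fraction of ground-truth text segments that an assessor marked as present in a summary. The function name `inclusion_fraction` and the sample data are hypothetical (the segment strings are shortened from the example list on the earlier assessment slide); the slides do not show how NIST actually recorded or aggregated the judgments.

```python
def inclusion_fraction(ground_truth, judged_present):
    """Fraction of ground-truth segments the assessor marked as included.

    Hypothetical helper: a plain ratio, assuming each segment counts equally.
    """
    if not ground_truth:
        raise ValueError("ground truth list is empty")
    found = sum(1 for segment in ground_truth if segment in judged_present)
    return found / len(ground_truth)


# Made-up example: four ground-truth segments, three judged as present.
truth = [
    "pan left to right around table",
    "curly haired man stands up",
    "closeup of grey haired lady",
    "camera pans right",
]
seen = {
    "pan left to right around table",
    "closeup of grey haired lady",
    "camera pans right",
}
in_score = inclusion_fraction(truth, seen)
print(f"IN = {in_score:.2f}")
```

A higher IN means more of the listed content was recognizable in the summary; DU, TT, EA, and RE would be recorded separately.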

Page 10

First Study: cluster, 25x, pz

• cluster: based on iterative color clustering with junk frame removal, backfilling of unused space and audio coherence

• pz: cluster-based, but uses the domain knowledge that “pans/zooms” are important: pan or zoom sequences are kept in 1-3 second runs as the representatives of clusters

• 25x: select every 25th frame of target video to produce 4% (1/25) video summary with apparent 25x playback (use same coherent audio as with cluster) – note that no junk frame filtering is used
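
The 25x selection rule can be sketched as simple index sampling. This is a minimal illustration, not the authors' implementation: the helper name `select_every_nth`, the choice to start at frame 0, and the 25 frames-per-second rate are all assumptions, and the separately chosen coherent audio track is not modeled.

```python
def select_every_nth(num_frames, n):
    """Return 0-based indices of the frames kept when sampling every n-th frame.

    Hypothetical helper; starting at frame 0 is an assumption.
    """
    if n < 1:
        raise ValueError("n must be >= 1")
    return list(range(0, num_frames, n))


# A 25-minute video at an assumed 25 frames per second.
num_frames = 25 * 60 * 25
kept = select_every_nth(num_frames, 25)
fraction = len(kept) / num_frames
print(len(kept), fraction)
```

Played back at the original frame rate, the kept frames run for 1/25 of the source duration, which is where the 4% summary and the apparent 25x speed come from.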

Page 11

Results from TRECVID 2007 Evaluation

Metric        Baseline 1   Baseline 2   cluster
TT (secs.)    105.66       100.48       101.83
IN            0.59         0.58         0.60
EA (5 best)   3.44         3.41         3.37
RE (5 best)   3.52         3.50         3.62

Page 12

Participants and Results, Study 1

• 4 CMU students and staff following the NIST procedure

Page 13

Study 1 Discussion

• 25x excellent method to produce summary for high IN

• TT metric for 25x also high

• RE metric poor for 25x (but inversely related to IN…)

• EA for 25x better than cluster (perhaps helped by audio)

• Subjective metrics TT, RE, and EA best for pz

Page 14

Question Leading to Study 2

How fast is too fast? (see [Wildemuth 2003] cited in paper)

25x? 50x? 100x? Will “pz” differentiate from these?

Page 15

Second Study: 25x, 50x, 100x, pzA

• 25x: as before (every 25th frame, coherent audio)

• 50x: select every 50th frame of target video to produce 2% (1/50) video summary with apparent 50x playback

• 100x: select every 100th frame, 1% summary

• pzA: as before but with audio same as 25x audio, filled to be a 4% summary
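
To make the relationship between speedup and summary size concrete, the sketch below works through the arithmetic for the three acceleration conditions, using the 25-minute mean test-video duration reported earlier. The helper names `summary_fraction` and `summary_seconds` are hypothetical, and the calculation assumes playback at the original frame rate.

```python
def summary_fraction(speedup):
    """Share of the source kept when every `speedup`-th frame is selected."""
    return 1.0 / speedup


def summary_seconds(video_minutes, speedup):
    """Summary running time in seconds for a source of `video_minutes`."""
    return video_minutes * 60.0 / speedup


MEAN_VIDEO_MINUTES = 25.0  # mean test-video duration from the BBC Rushes slide

table = {
    speedup: (summary_fraction(speedup), summary_seconds(MEAN_VIDEO_MINUTES, speedup))
    for speedup in (25, 50, 100)
}
for speedup, (fraction, seconds) in table.items():
    print(f"{speedup}x: {fraction:.0%} of source, {seconds:.0f} s")
```

Under these assumptions a mean-length video yields summaries of roughly 60, 30, and 15 seconds at 25x, 50x, and 100x respectively; pzA is filled to the same 4% length as 25x.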

Page 16

Participants and Results, Study 2

• 15 subjects (8 female, 7 male; age range [21, 35] with average age 25.7) following the NIST procedure

Page 17

Study 2 Discussion

• 25x excellent method to produce summary for high IN (0.73)

• 50x also excellent for high IN (0.68), >> pzA and 100x

• TT metric for 25x also high: 25x and pzA (the two 4% summaries) both significantly slower than 50x and 100x

• RE metric shows 25x worse than pzA

• EA for 100x worse than others (100x has fastest TT)

• 50x produces excellent IN performance at 2/3 the time cost (TT) of 25x

• 100x too fast: IN significantly worse than 50x, EA poor

Page 18

Discussion

• We believe inclusion of audio narrative along with sped-up video made 25x and 50x more playable; at 100x the audio becomes too short/choppy to contribute well

• 15 subjects for Study 2 not as careful as NIST or Study 1 assessors, e.g., TT of 77.5 vs. 110 or 102 seconds

• If these 15 better reflect true users, time savings important (and hence TT is important metric)

• How will 50x hold up as a baseline? (To be discussed in the context of TRECVID 2008 BBC rushes summarization task: it does well on IN, poorly on TT and RE)

Page 19

Conclusions

• For BBC rushes, 50x works quite well

• Domain knowledge (here, attempting to preserve pans/zooms) did not distinguish itself
  • Improve detector for “significant” pans/zooms
  • Sacrifice coverage for pan/zoom inclusion

• Interactive summary control an area of promise, e.g., 50x until neighborhood of interest found, then pz to see pans/zooms and more detail

Thanks to NIST, BBC, and TRECVID organizers for making this investigation possible. This work supported by the National Science Foundation under Grant Nos. IIS-0205219 and IIS-0705491

