Post on 02-Jan-2016
Semantic Content-based Access To Hypervideo Databases
Haitao Jiang
Major Professor: Ahmed K. Elmagarmid
Computer Science Department
Purdue University
1998
Organization Of The Talk
• Introduction And Review Of Related Work
• Logical Hypervideo Data Model (LHVDM)
• Semantic Content-based Video Queries
• A Web-based Logical Hypervideo Database (WLHVDB)
• Conclusion
Introduction
• Digital Video And Video Databases
• Basic Research Problems
• Research Motivation
• Research Goal
Unique Characteristics Of Video Data
• Semantics: rich and ambiguous
• Relationship: ill-defined
• Structure: unclear
• Dimension: spatial and temporal
• Volume: huge
Video Data Content
• Visual Content
• Audio Content
• Text Content
• Semantic Content
Research Problems
• Video Data Modeling
• Video Data Indexing
• Video Data Query
• Video Browsing
Video Data Model Requirements
• Content-based Data Access
• Video Data Abstraction
• Variable Data Access Granularity
• Dynamic And Incremental Video Annotation
Video Data Model Requirements (con.)
• Video Data Independence
• Spatial And Temporal Characteristics
• Video And Meta-data Sharing And Reuse
Related Work
• Video Data Modeling, Indexing, And Querying
• Video Objects
• Video Browsing
Video Data Modeling, Indexing, And Querying
• Traditional Database Approach
• Visual Content Or Segmentation-based Approach
• Stratification Or Annotation Layering Approach
Traditional Database Approach
• Categorize And Predefine Video Data Attributes/Values
• Use Traditional Databases And SQL
• Inflexible And Limited
• Examples: VISION, Video Database Browser
Segmentation-based Models
• Parse And Segment Video Streams
• Index On Visual Features Of RFrames
• Extract High Level Logical Structure And Semantics By Classifying Against Domain Models
Segmentation-based Models (con.)
• Can Be Fully Automated
• Lack Of Flexibility
• Limited Semantics
• Video Streams Need To Be Well-structured
• Examples: JACOB, QBIC, Informedia
Stratification
• Segment Video Semantics
• Concept Of Logical Video Data
• Allows For Semantic Content-based Video Access
• Annotation Can Be Tedious And Biased
• Examples: VideoSTAR, Algebraic Video
Stratification (con.)
Existing Models
– Have Limited Temporal Queries
– Have Limited Video Browsing Mechanisms
– Lack Multi-user Views And Data Sharing
– Lack Modeling Of Video Objects
– Lack Spatial And Spatial-Temporal Query Capabilities
Different Forms Of Video Annotation
• Multi-layer Icons - Media Streams
• Keywords
• Free Text Documents
• Other Types Of Annotation?
Sources Of Video Annotations
• Closed Caption
• Text In Video Frames: highlight detection and OCR
• Voice Recognition
• Manual Annotation
Annotation Support In A Video Data Model
• Annotation Of Arbitrary Sequence
• Incremental Creation, Deletion, And Modification
• Multi-user Annotation Sharing
• Arbitrary Overlap Of Annotations
Video Objects
• Index On Spatial And Temporal Information
• MBR As The Spatial Representation
• Narrow Focus And Lack Of Data Abstraction
• Limited Video Queries
• Examples: AVIS, CVOT
Video Browsing
• Visual Content-based Browsing
– Film Strips
– Salient Images
– Scene Clustering Graph
• Need Semantic Content-based Browsing
• Need Inter-Video Navigation
Research Motivations
• Visual Content-based Video Access IS Important BUT Lacks Semantics
• Users Often Prefer Semantic Content-based Video Data Access
• Many Applications: Digital Video Library, Distance Learning, etc.
• Web Is An Emerging Way Of Information Sharing
Research Goal
• Goal: To Provide Effective And Flexible Semantic Content-based Video Data Access In A Distributed And Multi-user Sharing Environment
• Both Spatial And Temporal Video Queries
• Heterogeneous Applications And User Views
• Semantic Content-based Browsing
JACOB Project
Ardizzone and Cascia et al. 1997
• Visual Content-based Access To Images And Videos
• RFrames Are Extracted And Serve As Descriptors Of Video Segments
• Index On Visual Features (Color, Motion, Texture, etc.)
Informedia Project
M. A. Smith, T. Kanade, M. G. Christel, D. B. Winkler et al. CMU
• Video Abstraction: Title - Poster Frame - Film Strip - Skim Video
• Speech Recognition->Transcript->Natural Language Processing->Keywords->Align to Frames
• Face And Keyword Search
VISION Digital Library
K. M. Pua, S. Gauch et al. University of Kansas, 1993 - 1994
• Practical And Cost-effective Implementation But Very Limited
• Video Storage System + IR system (Illustra - An ORDBMS)
• Text Is Stored As One Table Entry Of Video Data
• Supports Boolean Operators
OVID System
Oomoto and Tanaka, 1991
• Video Object: a set of arbitrary frame sequences with attributes and values
• Video Object Model Is Schemaless
• Data Description Sharing Via “Interval-inclusion Based Inheritance”
• User Can Decide Which Attributes Are Shared
OVID System (con.)
• Video-Object Composition: merge, interval projection, and overlap
• VideoSQL
– SELECT: continuous/Incontinuous/anyObject
– WHERE: attribute is [value] / attribute contains [value] / defineOver [frames]
• Browsing: VideoChart - bar chart representation of video objects
Virtual Video Browser
Little et al., 1993
• Predefined Schema With Fixed Attributes
• Descriptions Cannot Be Overlapped Or Nested
• Targeted At MOD: not suitable for dynamic creation and modification of video
• No Personalized View
• No Spatio-temporal Queries
Video Database Browser System
Rowe, Boreczky et al. 1994
• Classify Metadata Into: Bibliographic, Structural, And Content Data
• Use Relational Database Schema (POSTGRES RDBMS)
• Support Video Queries On Predefined Attributes
Video Stratification
Smith and Davenport, MIT, 1991 - 1992
• Associate Description To A Sequence Of Video Frames
• Simple Keyword Search
• Strata May Overlap
• Relation Among Strata Is Absent
BRAHMA
Dan et al., IBM T. J. Watson, 1996
• Browsing and Retrieval Architecture for Hierarchical Multimedia Annotations
• Each Annotation Node Is An Attribute / Value Pair
• Nodes Can Be Dynamically Created And Shared By Multiple Users
Media Streams
Davis 1993
• Goal: overcome keyword annotation weaknesses
• Iconic Video Content Annotation
• Hierarchical: general -> specific
• Represent And Match Temporal Relations
• Fixed Vocabulary
• Doesn’t Address Textual Data, e.g. Closed Caption
Algebraic Video System
Weiss et al., MIT, 1995
• Goal: Temporal Video Composition
• Basic Approach: Stratification
Algebraic Video Data Model
• Video Expression:
– multi-window, spatial, temporal, and content combination of raw video segments
– recursively constructed using video algebraic operators
• Video Algebraic Operators: creation, composition, output, and description
Algebraic Video Data Model (con.)
• Providing Multiple Coexisting Views (Nested Stratification)
• Video Query: Boolean combination of attributes
• Temporal Constraint Is Expressed As Attribute Values
• Video Browsing Within The Expression
VideoSTAR (STorage And Retrieval) System
Hjelsvold et al., 1995
• Goal: Multi-user Video Information Sharing
• Basic Approach: Stratification
VideoSTAR: Generic Video Data Model
• Continuous Media Objects (CMObjects)
• MediaStream:
– Virtual Video Streams (VideoStreams)
– Video/Audio Recordings (StoredMediaSegments)
• An Arbitrary StreamInterval Can Be Annotated
VideoSTAR: Video Querying and Browsing
• Three Kinds Of Video Context:
– Basic, Secondary, and Primary
– Unconditional context sharing
• VideoSTAR Query Algebra
– Boolean, Set, and Temporal Operators
– Based on Attribute/Value
– Users Need to Choose Query Context
VideoSTAR: Video Querying and Browsing (con.)
Two Browsing Operators
• Retrieve All Annotations Over A Video Stream Or Interval
• Retrieve All Structures Defined Over An Interval
Advanced Video Information System (AVIS)
Adali, Candan, Chen, Erol, and Subrahmanian, University of Maryland. MSJ 1996
• Basic Approach: spatial indexes + RDB
• Entities: things of interest that may or may not actually appear in the movie, including video objects, activity types, and events (roles and teams)
• Raw Video Frame Sequences
Advanced Video Information System (AVIS) (con.)
• Associate Map: entities <--> frame sequences
• Index: frame segment tree + OBJECTARRAY + EVENTARRAY + ACTIVITYARRAY
• All Clips Must Be Equal Length With No Overlap
• No Spatial And Temporal Queries
• No Logical Video Abstractions
Common Video Object Model (CVOT)
J. Li and T. Ozsu et al. University of Alberta, 1998
• Focus On Salient Objects And Based On OODBMS
• CVO Tree: each leaf is a video interval with salient objects (similar to AVIS) attached
• Video Clips Can Be Overlapped To Model Special Editing Effects (Fade In etc.)
Common Video Object Model (CVOT) (con.)
• Query Language: MOQL
– based on OQL proposed by ODMG for ODBMSs
– has both temporal and spatial operators
• Symbolic Trajectory Representation And Matching
• Logical vs. Physical Salient Objects
• Only Addresses Salient Objects
Video Browsing
• Representation Frames (RFrames)
– Sport Highlight [Yow95]
– Caption Detection [Smith95, Yeo96]
– Keyword Spotting [Smith95]
• Explicit Models (News Video) [Swanberg93, Zhang94]
Video Browsing (con.)
• Shot Clustering Based On Visual Similarity And Temporal Locality [Yeung95, Rui98]
– Scene Change Graph (CTG) [Yeung95]
Video->Shot Segmentation->Shot Clustering->Scene Segmentation
Logical Hypervideo Data Model (LHVDM)
• Definition
• Hierarchical Video Abstractions
• Hot Video Object Modeling
• Video Indexing
• Video Semantic Association And Hypervideo
• A Generic Video Database Architecture
Logical Hypervideo Data Model (con.)
(PV, PVS, LV, LVS, HO, CD, LINKS, UV, MAP)
PV: Set Of Physical Video Streams
PVS: Set Of Physical Video Segments
LV: Set Of Logical Video Streams
LVS: Set Of Logical Video Segments
HO: Set Of Hot Objects
CD: Set Of Content Descriptions
LINKS: Set Of Video Hyperlinks
UV: Set Of User Views
MAP: Set Of Mapping Relations
Logical Hypervideo Data Model (con.)
MAP includes
PV <--> PVS: Easy Data Manipulation
PVS <--> LV: Data Independence And Data Reuse
LV <--> LVS: Multi-user View
LV, LVS <--> HO: Effective Query
LV, LVS, HO, CD <--> UV: Multi-user View Sharing
LV, LVS, HO, LINKS <--> CD: Semantic Content-based Access
Video Hyperlinks: Effective Video Browsing
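The PVS <--> LV mapping can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that a logical video is an ordered list of physical segments addressed by frame ranges; the class and field names (PhysicalSegment, LogicalVideo, resolve) are hypothetical and are not taken from the LHVDM definition.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class PhysicalSegment:
    """A PVS: an inclusive frame range inside one physical video stream (PV)."""
    pv_id: str
    start_frame: int
    end_frame: int


@dataclass
class LogicalVideo:
    """An LV: an ordered list of PVSs, independent of how the PVs are stored."""
    lv_id: str
    segments: List[PhysicalSegment] = field(default_factory=list)

    def length(self) -> int:
        """Total number of logical frames."""
        return sum(s.end_frame - s.start_frame + 1 for s in self.segments)

    def resolve(self, logical_frame: int) -> Tuple[str, int]:
        """Map a logical frame number to (pv_id, physical frame number)."""
        if logical_frame < 0:
            raise IndexError("logical frame out of range")
        offset = logical_frame
        for seg in self.segments:
            span = seg.end_frame - seg.start_frame + 1
            if offset < span:
                return seg.pv_id, seg.start_frame + offset
            offset -= span
        raise IndexError("logical frame out of range")


# Two segments from different physical streams reused in one logical video.
lv = LogicalVideo("news-summary", [
    PhysicalSegment("tape1", 100, 199),
    PhysicalSegment("tape2", 0, 49),
])
```

Because annotations would refer only to logical frames in this sketch, the physical streams could be re-encoded or moved without touching the annotations, which is the data-independence property listed above.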
Hierarchical Video Abstractions
[Figure: hierarchy of abstractions in the Logical Hypervideo Data Model (LHVDM): Physical Video Streams (PVs), Physical Video Segments (PVSs), Logical Video Streams (LVs), Logical Video Segments (LVSs), Hot Objects (HOs), User Views (UVs), and Video Hyperlinks]
Hot Video Objects
• What Is A Hot Video Object?
– A Logical Video Abstraction
– A Sub-Frame Region That Is “Hot” In A Set Of Logical Frame Sequences
• Why Call Them “Hot” Objects?
– Target Of Interest
– Hyperlink Property (Hot Video Spot)
Hot Video Objects (con.)
• Implicit Hot Video Object
• Why Is Hot Object Modeling Important?
– More Precise Video Annotation And Query
– Capture Spatial Characteristics Of Video Data
Hot Video Object Model
[Figure: a Hot Object (HO) and its components: Semantic Content (T), Geometric Content (G), Visual Content, Audio Content, Live Time Intervals (LTI), and Video Hyperlinks (LINK)]
Hot Object Tracking
• Template Matching
• Active Contour
E = \alpha E_{int} + (1 - \alpha) E_{ext}
E_{int}(V_i) = |V_i'|^2 + |V_i''|^2
E_{ext} = \int_c -|N(V(t)) I(V(t))| \, dt
E_{ext}(V_i) = 1 - (D(V_i)/255 + 1) \cdot (-|N(V_i) I(V_i)|)
[The weight symbol, written \alpha here, and any operator between N and I did not survive extraction.]
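A discrete version of the active-contour energy can be sketched in Python as follows. This is a hedged illustration, not the tracker used in the talk: the contour is assumed to be a closed polygon, the derivatives V_i' and V_i'' are approximated by finite differences, and the external term is passed in as a caller-supplied function because its exact form is not recoverable from the slide. The names internal_energy, total_energy, external, and weight are hypothetical.

```python
def internal_energy(points):
    """Sum over vertices of |V_i'|^2 + |V_i''|^2 using finite differences."""
    n = len(points)
    total = 0.0
    for i in range(n):
        px, py = points[i - 1]          # previous vertex (wraps around)
        cx, cy = points[i]              # current vertex
        nx, ny = points[(i + 1) % n]    # next vertex (wraps around)
        d1 = (nx - cx, ny - cy)                      # approximates V_i'
        d2 = (nx - 2 * cx + px, ny - 2 * cy + py)    # approximates V_i''
        total += d1[0] ** 2 + d1[1] ** 2 + d2[0] ** 2 + d2[1] ** 2
    return total


def total_energy(points, external, weight):
    """E = weight * E_int + (1 - weight) * E_ext, with E_ext summed per vertex."""
    e_ext = sum(external(p) for p in points)
    return weight * internal_energy(points) + (1 - weight) * e_ext


# A unit square as a toy contour, with a constant stand-in external energy.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
energy = total_energy(square, lambda p: -1.0, 0.5)
```

A tracker built on this would move each vertex to the neighboring position that lowers the total energy, repeating until no move helps.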
Experiments With Hot Video Object Tracking
Video Indexing
• Main Indexes: Semantic Content Descriptions
• Content Descriptions
– Hot Objects’ vs. LVSs’ vs. LVs’ vs. LINKS’
– System’s vs. Users’
– Sharing & Access Control
• Auxiliary Indexes: Various Mappings
Video Semantic Association And Hypervideo
• Why Is Semantic Association Important?
– More Effective Video Data Access
• Video Hyperlinks Represent Semantic Associations
– Hypervideo And Hypervideo Databases
– Flexible And User Adaptive Hypervideo Databases
A Generic Video Database System Architecture
[Figure: generic video database system architecture. Labeled components: Users, GUI, Integrator, Video Database Engines (Knowledge Inference Engine, Geometric Engine, Information Retrieval Engine), and a Video Database holding Video Data, LV, LVS, Hot Objects, Video Hyperlinks, Video Indexes, Spatial Information, Video Annotations, and a Knowledge Base]
Semantic Content-based Video Query Language
(Expr, Granularity, Scope, Space)
• Expr: video query expression
• Granularity:
– Logical Video Streams (v)
– Logical Video Segments (s)
– Hot Objects (o)
Semantic Content-based Video Query Language (con.)
• Scope:
– System Annotations Only (s)
– Including User Annotations (u)
– Including Sharable Other Users’ Annotations (a)
Semantic Content-based Video Query Language (con.)
• Space:
– Subset Of VDB
– Can Be Result Of Another Query
– Allows Recursive Query Refinement
– Example: Q2 = (expr, o, u, Q1)
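The four-part query form can be modeled as a small Python value type, which makes the recursive use of Space explicit. This is only a sketch: the class name VideoQuery, the use of None to stand for the whole VDB, and the depth helper are assumptions made for illustration, not part of the query language itself.

```python
from dataclasses import dataclass
from typing import Optional

GRANULARITIES = {"v", "s", "o"}  # logical video streams, segments, hot objects
SCOPES = {"s", "u", "a"}         # system only, plus user, plus shared others'


@dataclass(frozen=True)
class VideoQuery:
    expr: str
    granularity: str
    scope: str
    space: Optional["VideoQuery"] = None  # None stands for the whole VDB

    def __post_init__(self):
        # Reject codes that the slides do not define.
        if self.granularity not in GRANULARITIES:
            raise ValueError(f"unknown granularity: {self.granularity!r}")
        if self.scope not in SCOPES:
            raise ValueError(f"unknown scope: {self.scope!r}")

    def depth(self) -> int:
        """Number of refinement steps that lead to this query."""
        return 0 if self.space is None else 1 + self.space.depth()


# Q2 searches hot objects only inside the result of Q1.
q1 = VideoQuery("Union speech Washington DC", "s", "u")
q2 = VideoQuery("(Vice President Gore) Sright (President Clinton)", "o", "u", q1)
```

Keeping the query immutable means a refined query can safely share its parent, so a chain such as Q1, Q2, Q3 forms a simple linked structure.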
Boolean And IR Query Operators
• AND, OR, NOT
• NOT Is Not Safe
• ADJ (adjacent)
• Regular Expression And Approximate Matching
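As a rough illustration of the ADJ operator, the Python sketch below treats an annotation as whitespace-separated words and reports whether two terms occur next to each other in order. The function name adj, the case-insensitive comparison, and the tokenization are assumptions; the system described here delegates text matching to an IR engine rather than to code like this.

```python
def adj(text, first, second):
    """Return True if `first` is immediately followed by `second` in `text`."""
    tokens = text.lower().split()
    first, second = first.lower(), second.lower()
    # Compare every pair of neighboring tokens.
    return any(a == first and b == second for a, b in zip(tokens, tokens[1:]))


annotation = "a red BMW Z3 parked near the harbor"
both_present = "red" in annotation.lower().split() and "z3" in annotation.lower().split()
```

Here `adj(annotation, "red", "BMW")` holds, while `adj(annotation, "red", "Z3")` does not even though an AND of the two terms would match.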
Temporal Query Operators
Thirteen Interval Relations [Allen83]
[Figure: interval relations illustrated: Before, After, Starts, Ends, Meets, Overlaps, Equal, During]
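The thirteen interval relations can be computed from interval endpoints. The Python sketch below is one way to do so, assuming each interval is a (start, end) pair with start < end; the relation names follow the figure labels where they exist ("ends" rather than "finishes"), and the inverse names such as "met-by" are chosen here for illustration.

```python
def allen_relation(a, b):
    """Classify how interval a relates to interval b (one of 13 relations)."""
    (a0, a1), (b0, b1) = a, b
    if not (a0 < a1 and b0 < b1):
        raise ValueError("intervals must satisfy start < end")
    if a1 < b0:
        return "before"
    if b1 < a0:
        return "after"
    if a1 == b0:
        return "meets"
    if b1 == a0:
        return "met-by"
    if a0 == b0 and a1 == b1:
        return "equal"
    if a0 == b0:
        return "starts" if a1 < b1 else "started-by"
    if a1 == b1:
        return "ends" if a0 > b0 else "ended-by"
    if b0 < a0 and a1 < b1:
        return "during"
    if a0 < b0 and b1 < a1:
        return "contains"
    # Only the two partial-overlap cases remain.
    return "overlaps" if a0 < b0 else "overlapped-by"
```

For example, with frame ranges as intervals, `allen_relation((0, 5), (3, 9))` yields "overlaps" and `allen_relation((2, 4), (0, 9))` yields "during".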
Temporal Query Operators (con.)
• Operators For LVS
– Interval Temporal Operators
• Operators For HO
– Instant (or Point) Temporal Operators: more precise query specification
– Interval Temporal Operators
Spatial Query Operators
• Directional
• Topological
• Distance
[Figure: compass directions North, South, West, East]
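The three operator families can be sketched on bounding rectangles. The Python below assumes, for illustration only, that each hot object in a frame is reduced to an axis-aligned box in image coordinates (y grows downward, so "north" means a smaller y) and that directional tests are strict separations along one axis. The type MBR and the function names are hypothetical.

```python
import math
from typing import NamedTuple


class MBR(NamedTuple):
    """Axis-aligned bounding box in image coordinates (top < bottom)."""
    left: float
    top: float
    right: float
    bottom: float


def is_north(a, b):
    """Directional: a lies entirely above b."""
    return a.bottom <= b.top


def is_west(a, b):
    """Directional: a lies entirely to the left of b."""
    return a.right <= b.left


def overlaps(a, b):
    """Topological: the two boxes share some area."""
    return a.left < b.right and b.left < a.right and a.top < b.bottom and b.top < a.bottom


def center_distance(a, b):
    """Distance: Euclidean distance between box centers."""
    ax, ay = (a.left + a.right) / 2, (a.top + a.bottom) / 2
    bx, by = (b.left + b.right) / 2, (b.top + b.bottom) / 2
    return math.hypot(ax - bx, ay - by)


bird = MBR(left=40, top=10, right=60, bottom=30)
head = MBR(left=35, top=50, right=65, bottom=90)
```

With these boxes, the bird is north of the head, the boxes do not overlap, and their centers are 50 pixels apart.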
Query Processing
• Recursive And Top-Down/Bottom-Up
• Supports Distributed Evaluation
• “Closed World Assumption”?
• Answer: No (Raw Video Data) And Yes (Within User’s View)
• Reason: Video Data Is Semantically Rich
Query Search Space
• User Definable
• System Owned Subset Of VDB Is Always Searched
• A User’s Queries Are Processed Within That User’s View
• Determined By A Query’s Granularity, Scope, Space, And User
Efficient Query Processing
• Query Augmentation Or Pre-filtering
• Query Evaluation Order
• Query Caching And Knowledge Base
Query Examples
• Simple Queries
“Find video clips that have a red BMW Z3 in them”
Q1 = ((red BMW
Query Examples (con.)
• Temporal Query
“Find video clips in which a scene with a bird flying appears after the scene with a child eating ice cream”
Q3 = ((bird flying) Tafter (child eating ice cream), o, -,-)
Q3’ = ((bird flying) Iafter (child eating ice cream), s, -,-)
Query Examples (con.)
• Spatial Query
“Find video clips in which Vice President Al Gore is standing to the right of President Clinton, who is giving his Union speech in Washington DC”
Q2a = ((Union speech Washington DC), s, -,-)
Q2b = (((Vice President Gore) Sright (President Clinton)), o, -, Q2a)
Query Examples (con.)
• Spatial Queries
“Find video clips in which a blue bird is flying over a kid’s head”
Q4 = ((blue bird flying) Sabove ((child kid boy girl) head), o, -,-)
Query Examples (con.)
• Spatio-Temporal Query
“Find video clips in which a police car with its siren on is chasing a red Porsche and hits it”
Qa = ((police car siren), o, -,-)
Qb = ((red Porsche), o, -,-)
Q = ((Qa Sapproach Qb) Ibefore (Qa Stouch Qb), o, -,-)
Web-based Logical Hypervideo Database System (WLHVDB)
• System Architecture
• Video Wrapper And Lazy Delivery
• Populating The Video Database
• Distributed Query Processing
• Access Control And User Profiling
System Architecture
[Figure: WLHVDB client-server architecture over the Internet. Server components: Video Annotation Engine, Video Parser, Query Processing Server, IR Engine, Various Tools and Scripts, Account Manager, User Profile Manager, Access Control Manager, Server Cache Manager. Stored data: Physical Video Data, Logical Video Data, Video Annotations, User Views, Video Indexes, User Profiles, Server Query Cache, Video Hyperlinks. Client components: Query Input, Query Result Presentation, Media Player, Client Query Cache, Client Cache Manager, Data Editor.]
Information Retrieval (IR) and Glimpse
• Full Text Scanning
• Signature Files
• Inversion - used by almost all commercial systems
• Vector Model And Clustering: weighting and relevance feedback
Glimpse (GLobal IMPlicit SEarch)
• Small Index: 2-4%
• Full Text Boolean Queries
• Arbitrary Approximate And Regular Expression Matching
• Efficient (<500MB): 5 seconds for finding 10 occurrences among 4500 files of total size of 69MB
Video Wrapper And Lazy Delivery
• Why Need Them?
– Huge Data Volume vs. Limited Bandwidth
• Why Lazy Delivery?
– “Avoid” Sending Video Data Information
• What Is A Video Wrapper?
– Multi-resolution Video Representation
– Adaptive Local Refinement Based On Interest
Current Video Wrapper Implementation
[Figure: video wrapper levels: PVSs, RFrames, Clip Posters, Video Poster]
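Lazy delivery through a multi-resolution wrapper can be sketched as a loader that fetches one finer level only when the user asks for it. The Python below is an assumption-laden illustration: the coarse-to-fine ordering of the four levels is inferred from the figure labels, and VideoWrapper, refine, and the fetch callback are hypothetical names rather than the prototype's interface.

```python
# Assumed ordering, from the cheapest representation to the full video data.
LEVELS = ["video poster", "clip posters", "rframes", "pvs"]


class VideoWrapper:
    """Holds only the levels of one video that the user has asked to see."""

    def __init__(self, video_id, fetch):
        self.video_id = video_id
        self._fetch = fetch      # hypothetical callable: (video_id, level) -> data
        self._loaded = {}

    def refine(self):
        """Fetch the next finer level; return its name, or None when done."""
        if len(self._loaded) == len(LEVELS):
            return None
        level = LEVELS[len(self._loaded)]
        self._loaded[level] = self._fetch(self.video_id, level)
        return level

    def loaded_levels(self):
        return list(self._loaded)


# A stand-in for the network: it just records what was requested.
request_log = []


def demo_fetch(video_id, level):
    request_log.append((video_id, level))
    return f"{level} of {video_id}"


wrapper = VideoWrapper("v1", demo_fetch)
first = wrapper.refine()  # only the poster crosses the network at this point
```

A query result would then arrive as posters only, and the heavier RFrames or video segments are transferred just for the videos a user chooses to open.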
Populating the VDB
[Figure: pipeline for populating the video database. Labeled stages and artifacts: Analog Video, Video Capture, Closed Caption Capture, Video Parsing and Segmentation, Video Representation Construction, Object Tracking, Video Wrapper, Video Indexes, LVs, LVSs and Annotations, Hot Objects. Actors: DBA, Users.]
Query Processing
[Figure: distributed query processing flow between Client and Server. Labeled steps: Video Query, Query Parsing and Syntax Checking, Show Error Message / No Error, Sending IR Sub-queries, Processing IR Sub-queries, Search User’s View, Sending Partial Results, Processing Boolean and Spatio-temporal Operators, Result]
Server and Client-Side Query Caching
[Figure: two-level query caching between Client and Server. Labeled steps: IR Sub-queries, Client-side Cache (Hit / Miss), Get Results, Update Cache, Server-side Cache (Hit / Miss), Get Result, Query IR Engine, Update Cache, Send Results, Results]
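The hit/miss flow across the two caches can be sketched in Python with one small wrapper applied twice. This is a simplified illustration under stated assumptions: sub-queries are plain strings, caches never expire, and ir_engine is a stand-in for the real IR engine. The names cached, ir_engine, and engine_calls are hypothetical.

```python
def cached(backend):
    """Wrap a lookup function with a cache: on a miss, ask `backend` and store."""
    cache = {}

    def lookup(sub_query):
        if sub_query in cache:           # hit: answer locally
            return cache[sub_query]
        result = backend(sub_query)      # miss: forward to the next tier
        cache[sub_query] = result        # update cache
        return result

    lookup.cache = cache
    return lookup


engine_calls = []


def ir_engine(sub_query):
    """Stand-in for the IR engine; records each sub-query it has to evaluate."""
    engine_calls.append(sub_query)
    return [f"match for {sub_query}"]


server = cached(ir_engine)   # server-side cache in front of the IR engine
client_a = cached(server)    # each client keeps its own cache
client_b = cached(server)

client_a("red BMW")   # misses both caches, reaches the IR engine
client_a("red BMW")   # client-side hit
client_b("red BMW")   # client-side miss, server-side hit
```

After these three calls the IR engine has evaluated the sub-query once, which is the saving the two-level scheme is meant to provide.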
Video Browsing
• Loop Of Query-Browsing-Play
• Inter- And Intra-Video Browsing
• User Adaptive
• Video Wrapper Refinement Process
Video Browsing (con.)
[Figure: browsing flow with labels: Video Query, Video Posters, Video Board, RFrames, Annotations, Audio Stream, Video Hyperlinks, Video Stream]
Access Control and User Profiling
• Different Categories Of Users And Groups
• Different Permissions On Video Data And Metadata
• Users Need To Authenticate Themselves
• User Activities And Local Environment Information Are Recorded
Summary of Major Summary of Major ContributionsContributions
• A Novel Video Data Model A Novel Video Data Model (LHVDM) That Supports(LHVDM) That Supports– Multi-level Video AbstractionMulti-level Video Abstraction
– Video Data IndependenceVideo Data Independence
– Multi-user Data SharingMulti-user Data Sharing
– Dynamic And Incremental View UpdateDynamic And Incremental View Update
– Variable Access GranularitiesVariable Access Granularities
Summary of Major Summary of Major Contributions (con.)Contributions (con.)
• A Novel Video Data Model A Novel Video Data Model (LHVDM) That(LHVDM) That Represents Represents
– Both Spatial And Temporal Video Both Spatial And Temporal Video Characteristics Characteristics
– Hot Video ObjectsHot Video Objects
– Video Semantics And Semantic Video Semantics And Semantic AssociationsAssociations
Summary of Major Contributions (con.)
• A Novel Video Data Model (LHVDM) That
– Supports User Adaptive Video Browsing
– Hyperlinks Video Entities For More Efficient Browsing
– Can Be Extended To Other Multimedia Data Such As Audio Data
Summary of Major Contributions (con.)
• A Video Query Language That Allows
– Easy Query Formulation
– Video Semantic Content-based Queries
– Both Spatial And Temporal Constraints
– Hot Object-based Video Queries
– User Selectable Granularity, Space, and Scope
Summary of Major Contributions (con.)
• A Generic Video Database System Architecture That Is
– Modular, Flexible, And Scalable
– Ready To Be Distributed
– Easy To Implement
Summary of Major Contributions (con.)
• The Design And Implementation Of A Web-based Prototype That Uses
– A Novel Video Wrapper And Lazy Evaluation Approach
– Distributed Query Processing And Sub-query Caching Scheme
– Multi-user Data Access Control And View Sharing
– User Profiling
Future Work
• Identify New Applications And Perform More Extensive Tests
• Explore And Integrate Other Forms Of Video Annotations Such As Visual Features
• Extend To Other Multimedia Data Such As Slides, Images, And Audio
Future Work (con.)
• Knowledge-based Video Access
• Automatic Generation of Video Wrappers
• Video Data Security and Access Control
Related Publications
• Survey and Books
– A. K. Elmagarmid and H. Jiang. Multimedia Video (chapter), Encyclopedia of Electrical and Electronics Engineering. John Wiley & Sons. 1998, In press.
– A. K. Elmagarmid, H. Jiang, et al. Video Database Systems: Issues, Products and Applications. Kluwer Academic Publishers, 1997.
– H. Jiang, A. Helal, A. K. Elmagarmid, and A. Joshi. Scene Change Detection Techniques for Video Database Systems. ACM Multimedia Sys., 6:186-195, May 1998.
– H. Jiang and A. K. Elmagarmid. Video Databases: State of the Art, State of the Market and State of Practice. Proc. 2nd Intl. Workshop on Multimedia Info. Sys., pages 87-91, West Point, New York, September 26-28, 1996.
Related Publications (con.)
• Video Analysis And Computer Vision
– H. Jiang and A. K. Elmagarmid. Extract Visual Content Representation in Video Databases. Proc. of Intl. Conf. on Imaging Sci., Sys., and Tech. (CISST'97), Las Vegas, Nevada, June 30 - July 3, 1997.
– H. Jiang and J. Dailey. A Video Database System for Studying Animal Behavior. Proc. SPIE Photonics East'96 - Multimedia Storage and Archiving Sys. Intl. Conf., pages 162-173, Volume SPIE-2916, Boston, MA, November 18-19, 1996.
Related Publications (con.)
• Video Data Model, Indexing, and Access
– H. Jiang, D. Montesi, and A. K. Elmagarmid. Integrate Video and Text for Content-based Accesses to Video Databases. J. of Multimedia Sys. and Tools. 1998, accepted.
– H. Jiang and A. K. Elmagarmid. WVTDB - A Web-based VideoText Database System. Special Issue on Data and Knowl. Management in Multimedia Sys., IEEE Trans. on Data and Knowl. Eng. 1998, accepted.
Related Publications (con.)
• Video Data Model, Indexing, and Access
– H. Jiang and A. K. Elmagarmid. Spatial and Temporal Content-based Queries in Hypervideo Databases. Special Issue on Multimedia Data Management, The Very Large Database J. 1998, submitted.
– F. Kokkoras, H. Jiang, I. Vlahavas, A. K. Elmagarmid, and E. N. Houstis. Smart VideoText: An Intelligent Video Database System, CSD-TR 97-049, Department of Computer Sciences, Purdue University, West Lafayette, IN 47907, USA, 1997.
END OF THE PRESENTATION
THANK YOU VERY MUCH
MPEG-I
• A Bit Stream For Compressed Video And Audio
• Optimized For Data Rate Of 1.5 Mbps
• Non-interlaced
• Typical Compression Ratio 27:1
• Quality: VHS Video
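As a rough sanity check, the 27:1 ratio and the 1.5 Mbps target above are consistent with each other under common MPEG-I source assumptions. The frame size (352 x 240), frame rate (30 fps), and 4:2:0 sampling (12 bits per pixel on average) used below are assumptions not stated on the slide.

```python
# Assumed source format (not given on the slide): SIF-sized frames,
# 30 frames per second, 4:2:0 chroma subsampling (12 bits per pixel).
WIDTH, HEIGHT = 352, 240
FRAMES_PER_SECOND = 30
BITS_PER_PIXEL = 12

# Figures taken from the slide.
COMPRESSION_RATIO = 27
TARGET_RATE_BPS = 1_500_000

raw_bps = WIDTH * HEIGHT * FRAMES_PER_SECOND * BITS_PER_PIXEL
compressed_video_bps = raw_bps / COMPRESSION_RATIO
remaining_bps = TARGET_RATE_BPS - compressed_video_bps

print(f"raw video:        {raw_bps / 1e6:.2f} Mbps")
print(f"compressed video: {compressed_video_bps / 1e6:.2f} Mbps")
print(f"left for audio and system data: {remaining_bps / 1e3:.0f} kbps")
```

Under these assumptions the raw video is about 30.4 Mbps and the compressed video about 1.13 Mbps, which leaves a few hundred kbps of the 1.5 Mbps budget for audio and multiplexing overhead.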
MPEG-II
• Digital Broadcasting Video (CCIR601) At A Data Rate Of 4 - 9 Mbps
• Compatible With MPEG-I
• Coding Of Interlaced Video
• User-selectable DCT Precision
• Scalable Extension For Multi-resolution Coding
• Wide Range Of Frame Size
MPEG-IV
• Targets Multimedia Applications
• Coding Of Video Objects For Content-based Interactivity
• Improved Temporal Random Access
• More Efficient Coding
• Supports Very Low Video Data Rate (<64 Kbps)
• Robustness For Wireless Applications
MPEG-VII:
Digital Library Initiatives
• Stanford: integrated virtual library with new services and uniform access to networked information collections
• UC Berkeley: 1) computer vision for digital documents; 2) database protocols for client/server information retrieval; 3) data acquisition technologies; 4) content-based browser
Digital Library Initiatives (con.)
• UC Santa Barbara (Alexandria): a distributed system that provides a comprehensive range of library services for collections of spatially indexed and graphical information (digital maps and images)
• UC Berkeley: high quality search and display of Internet information (SGML documents)
Digital Library Initiatives (con.)
• University of Michigan: a diverse collection of earth and space sciences and cooperating software agents
• CMU (Informedia): content-search and retrieval of digital libraries using speech understanding, computer vision, and natural language processing