EE 6850, F'02, Chang, Columbia U 1
Lecture 3: Multimedia Metadata Standards
Prof. Shih-Fu Chang
EE 6850, Fall 2002
Sept. 18, 2002
Course URL: http://www.ee.columbia.edu/~sfchang/course/vis/
References
• Digital Still Camera Image File Format Standard (Exchangeable image file format for Digital Still Cameras: Exif), Version 2.1. http://www.exif.org/
• Introduction to MPEG-7 (v2). Document ISO/IEC JTC1/SC29/WG11 N3751, Oct. 2000.
• DIG35 Image Metadata Standard. http://www.i3a.org/i_dig35.html
• S.-F. Chang, T. Sikora, and A. Puri, "Overview of the MPEG-7 Standard," IEEE Transactions on Circuits and Systems for Video Technology, special issue on MPEG-7, June 2001.
Why Metadata Standard?
• Content Exchange
  - Content owners
  - Consumers
• Interoperable Client Applications
  - Cross-operator information access
  - Meta search engines
DIG 35 - image metadata
DIG35
• Participants: Canon, Kodak, Fuji, HP, Microsoft, Polaroid, Seattle Film Works, etc.
• Time frame: started 1999; WD 1.0 March '00; V 1.0 Aug. 2000.
• Use-case scenarios:
  - Albuming, content searching, linking, information/copyright preservation
DIG 35 metadata interchange model
DIG 35 metadata subblocks
EXIF: Exchangeable image file format
• October 1996: Version 1.0; May 1997: Version 1.1; June 1998: Version 2.1.
• Supported by most digital camera manufacturers
• Consists of both image and audio file specifications
• Image file spec includes:
  - Structure of image data files
  - Tags used by this standard
  - Definition and management of format versions
EXIF Image File Spec
• Compressed files are recorded as JPEG.
• Uncompressed files are recorded in TIFF Rev. 6.0.
• A feature of Exif image files is their compatibility with standard formats in wide use today.
• Related attribute information for both compressed and uncompressed files is stored in the tag information format defined in TIFF Rev. 6.0.
• New EXIF-specific attributes are stored as private tags in TIFF.
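The tag information format mentioned above can be illustrated with a minimal sketch, assuming the TIFF Rev. 6.0 layout: an 8-byte header (byte-order mark, magic number 42, offset of the first IFD) followed by image file directories made of 12-byte entries. The sample bytes are hand-built for illustration and do not come from a real camera file; the helper names (`parse_tiff_header`, `parse_ifd`, `build_sample`) are hypothetical. Tag 0x8769 is the private TIFF tag that points to the Exif-specific IFD.

```python
import struct

def parse_tiff_header(buf):
    """Return (endian_prefix, first_ifd_offset) from the 8-byte TIFF header."""
    order = buf[:2]
    if order == b"II":
        endian = "<"          # little-endian ("Intel")
    elif order == b"MM":
        endian = ">"          # big-endian ("Motorola")
    else:
        raise ValueError("not a TIFF header")
    magic, first_ifd = struct.unpack_from(endian + "HI", buf, 2)
    if magic != 42:
        raise ValueError("bad TIFF magic number")
    return endian, first_ifd

def parse_ifd(buf, offset, endian):
    """Return (entries, next_ifd_offset) for the IFD starting at `offset`.

    Each entry is (tag, type, count, raw), where `raw` is the 4-byte
    value-or-offset field, left uninterpreted here.
    """
    (n,) = struct.unpack_from(endian + "H", buf, offset)
    entries = [
        struct.unpack_from(endian + "HHI4s", buf, offset + 2 + 12 * i)
        for i in range(n)
    ]
    (next_ifd,) = struct.unpack_from(endian + "I", buf, offset + 2 + 12 * n)
    return entries, next_ifd

def build_sample():
    """Hand-built little-endian TIFF structure (illustrative data only)."""
    header = b"II" + struct.pack("<HI", 42, 8)
    # ImageWidth (0x0100), type 4 = LONG, count 1, value 704
    width = struct.pack("<HHII", 0x0100, 4, 1, 704)
    # Exif IFD pointer (0x8769), LONG; IFD0 occupies bytes 8..37, so Exif IFD is at 38
    pointer = struct.pack("<HHII", 0x8769, 4, 1, 38)
    ifd0 = struct.pack("<H", 2) + width + pointer + struct.pack("<I", 0)
    # ExifVersion (0x9000), type 7 = UNDEFINED, count 4, value "0210" (Version 2.1)
    version = struct.pack("<HHI4s", 0x9000, 7, 4, b"0210")
    exif_ifd = struct.pack("<H", 1) + version + struct.pack("<I", 0)
    return header + ifd0 + exif_ifd

sample = build_sample()
endian, first_ifd = parse_tiff_header(sample)
ifd0, _ = parse_ifd(sample, first_ifd, endian)
tags = {tag: raw for tag, _type, _count, raw in ifd0}
image_width = struct.unpack(endian + "I", tags[0x0100])[0]
exif_offset = struct.unpack(endian + "I", tags[0x8769])[0]
exif_ifd, _ = parse_ifd(sample, exif_offset, endian)
print(image_width, exif_offset, exif_ifd)
```

In a real Exif JPEG the same TIFF structure sits inside an APP1 marker segment; locating that segment is left out of this sketch.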
MPEG-7 Standard
• Flexible, extensible, multi-level, and standard framework for describing multimedia
  - Systems, DDL, Video, Audio, MDS, Software
• Scope
  [Figure: Feature Extraction -> MPEG-7 Description -> Search/Filtering Application]
• Schedule
  - 10/98: Call For Proposals
  - 12/99: Working Draft
  - 10/00: Committee Draft
  - 9/01: International Standard
MPEG-7 Segment Types
MPEG-7 Framework
• Description Definition Language (DDL)
  - Language to create new Ds/DSs or extend existing ones
  - Extends XML-Schema
• Description Schemes (DSs)
  - Structure and semantics of relations among Ds/DSs
• Descriptors (Ds)
  - Representation of a feature of AV data
[Figure: UML diagram relating Description Definition Language, Description Scheme, Descriptor, AV Content Item, Data, Feature, and User or System, with multiplicities 1..* and 0..* and edge labels "defines", "describes", and "signifies ... to"]
MPEG-7 Application Chain
[Figure: block diagram. Blocks shown: MM Content, Description Generation, MPEG-7 Description, Encoder, MPEG-7 Coded Description, Decoder, Search/Query Engine, Filter Agents, and User or data processing system. The Description Definition Language (DDL), Description Schemes (DS), and Descriptors (D) are shown alongside the description generation.]
Parts of MPEG-7 (ISO/IEC 15938)
• Systems
  - Binary encoding, dynamic update, transport, synchronization, and IPMP tools
• Description Definition Language (DDL)
  - Language for defining new DSs and Ds and extending existing ones
• Visual
  - Visual Ds and DSs
• Audio
  - Audio Ds and DSs
• Multimedia Description Schemes (MDS)
  - Generic Ds and DSs; neither purely visual nor purely audio
• Reference Software
• Conformance
XML
• eXtensible Markup Language (XML)
  - Derived from SGML (Standard Generalized Markup Language)
  - Description of structure and semantics of documents
  - Human- and machine-readable
  - Author-defined elements and attributes: DTD or XML-Schema
<customer id="Ana2000">
  <name> Ana Benitez </name>
  <address country="US">
    <street>500 W 120</street>
    <city> New York </city>
    <state> New York </state>
    <postal> 94571 </postal>
  </address>
</customer>
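As a small illustration of the "machine-readable" point, the customer document above can be read with Python's standard `xml.etree.ElementTree` module. This is only a sketch; the function name `summarize_customer` and the choice of fields to extract are illustrative assumptions, not part of the lecture.

```python
import xml.etree.ElementTree as ET

CUSTOMER_XML = """
<customer id="Ana2000">
  <name> Ana Benitez </name>
  <address country="US">
    <street>500 W 120</street>
    <city> New York </city>
    <state> New York </state>
    <postal> 94571 </postal>
  </address>
</customer>
"""

def summarize_customer(xml_text):
    """Pull a few attributes and element values out of the customer document."""
    root = ET.fromstring(xml_text.strip())
    address = root.find("address")
    return {
        "id": root.get("id"),                          # attribute of <customer>
        "name": root.findtext("name").strip(),         # element content
        "country": address.get("country"),             # attribute of <address>
        "city": address.findtext("city").strip(),
        "postal": address.findtext("postal").strip(),
    }

summary = summarize_customer(CUSTOMER_XML)
print(summary)
```

Element content arrives as text with its surrounding whitespace, which is why the sketch strips each value; attributes such as `id` and `country` are returned as written.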
XML / DTD / XML-Schema
XML Description
<customer id="Ana2000">
<name> Ana Benitez </name>
<address country="US">
<street>500 W 120</street>
<city> New York </city>
<state> New York </state>
<postal> 94571 </postal>
</address>
</customer>
DTD Definition
<!ELEMENT customer (name, email?, address+) >
<!ATTLIST customer id ID #REQUIRED>
<!ELEMENT name (#PCDATA)>
<!ELEMENT email (#PCDATA)>
<!ELEMENT address (street, city, state, postal)>
<!ATTLIST address country CDATA #REQUIRED>
<!ELEMENT street (#PCDATA)>
<!ELEMENT city (#PCDATA)>
<!ELEMENT state (#PCDATA)>
<!ELEMENT postal (#PCDATA)>
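One way to read the DTD content models above is as regular expressions over the sequence of child element names: `(name, email?, address+)` means one `name`, an optional `email`, then one or more `address` elements. The sketch below hand-translates the two non-trivial content models and the `#REQUIRED` attributes into Python. It is not a DTD validator (Python's standard library does not include one, and ID uniqueness is not checked); the tables `CONTENT_MODELS` and `REQUIRED_ATTRS` and the function `check` are hypothetical names for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Content models rewritten by hand as regexes over space-terminated child names.
CONTENT_MODELS = {
    "customer": r"name (email )?(address )+",      # (name, email?, address+)
    "address": r"street city state postal ",       # (street, city, state, postal)
}
# Attributes declared #REQUIRED in the ATTLIST declarations.
REQUIRED_ATTRS = {"customer": ["id"], "address": ["country"]}

def check(elem):
    """Return a list of violations for `elem` and all of its descendants."""
    errors = []
    model = CONTENT_MODELS.get(elem.tag)
    if model is not None:
        children = "".join(child.tag + " " for child in elem)
        if re.fullmatch(model, children) is None:
            errors.append(
                f"<{elem.tag}>: children [{children.strip()}] do not match the content model"
            )
    for attr in REQUIRED_ATTRS.get(elem.tag, []):
        if attr not in elem.attrib:
            errors.append(f"<{elem.tag}>: missing required attribute '{attr}'")
    for child in elem:
        errors.extend(check(child))
    return errors

VALID = (
    '<customer id="Ana2000"><name>Ana Benitez</name>'
    '<address country="US"><street>500 W 120</street><city>New York</city>'
    '<state>New York</state><postal>94571</postal></address></customer>'
)
# No id attribute and no address element: two violations.
INVALID = "<customer><name>Ana Benitez</name></customer>"

valid_errors = check(ET.fromstring(VALID))
invalid_errors = check(ET.fromstring(INVALID))
print(valid_errors)
print(invalid_errors)
```

Elements declared as `(#PCDATA)` have no entry in the table and are accepted as-is.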
XML / DTD / XML-Schema (cont)
XML Description
<customer id="Ana2000">
<name> Ana Benitez </name>
<address country="US">
<street>500 W 120</street>
<city> New York </city>
<state> New York </state>
<postal> 94571 </postal>
</address>
</customer>
XML-Schema Definition
<complexType name="customer">
  <sequence>
    <element name="name" type="string"/>
    <element name="email" type="string" minOccurs="0"/>
    <element name="address" type="addressType" maxOccurs="unbounded"/>
  </sequence>
  <attribute name="id" type="ID" use="required"/>
</complexType>
<complexType name="addressType">
  <sequence>
    <element name="street" type="string"/>
    <element name="city" type="string"/>
    <element name="state" type="string"/>
    <element name="postal" type="positiveInteger"/>
  </sequence>
  <attribute name="country" type="string"/>
</complexType>
Some Useful Sites
• W3C: http://www.w3.org/xml
• XML Cover Pages: http://www.oasis-open.org/cover/
• Web Developer's Virtual Library: http://wdvl.com/
• XML Industry Portal: http://www.xml.org/
• XML Schemas Endgame: http://www.xml.com/pub
• Apache XML Project: http://xml.apache.org/
• IBM alphaWorks: http://www.alphaWorks.ibm.com/
Video Descriptors
• Color: Dominant Color, Scalable Color, Color Layout, Color Structure, GoF/GoP Color
• Texture: Homogeneous Texture, Texture Browsing, Edge Histogram
• Shape: Region Shape, Contour Shape, 3D Shape
• Motion: Camera Motion, Motion Trajectory, Parametric Motion, Motion Activity
• Localization: Region Locator, Spatio-Temporal Locator
• Other: Face Recognition
Example: Color Histogram
<GoFGoPHistogram HistogramTypeInfo="Average">
  <ColorHistogram>
    <ColorSpace> <HSV/> </ColorSpace>
    <ColorQuantization ColorQuantizationType="uniform">
      <bin_number> 4 </bin_number>
      <bin_number> 4 </bin_number>
      <bin_number> 4 </bin_number>
    </ColorQuantization>
    <Histogram HistogramNormFactor="1" NumberHistogramBins="64">
      <HistogramValue> 444 </HistogramValue>
      <HistogramValue> 34 </HistogramValue>
      <HistogramValue> 58 </HistogramValue>
      <HistogramValue> 564 </HistogramValue>
      <HistogramValue> 16 </HistogramValue>
      <!-- Other HistogramValue elements -->
    </Histogram>
  </ColorHistogram>
</GoFGoPHistogram>
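The description above records a 64-bin histogram: HSV color space, uniform quantization with 4 bins per channel (4 x 4 x 4 = 64), averaged over a group of frames. The sketch below shows how such values could be computed, assuming frames are given as lists of 8-bit RGB pixels. The bin ordering (hue-major) and the function names are assumptions made for illustration; this is not the normative MPEG-7 extraction procedure, and `HistogramNormFactor` is not modeled.

```python
import colorsys

BINS_PER_CHANNEL = (4, 4, 4)   # H, S, V, matching the three bin_number elements
NUM_BINS = 4 * 4 * 4           # 64, matching NumberHistogramBins

def bin_index(r, g, b):
    """Map an 8-bit RGB pixel to one of the 64 uniform HSV bins."""
    hsv = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    idx = 0
    for value, bins in zip(hsv, BINS_PER_CHANNEL):
        q = min(int(value * bins), bins - 1)   # a value of exactly 1.0 goes in the top bin
        idx = idx * bins + q                   # assumed ordering: H major, then S, then V
    return idx

def frame_histogram(pixels):
    """Count the pixels of one frame per bin."""
    hist = [0] * NUM_BINS
    for r, g, b in pixels:
        hist[bin_index(r, g, b)] += 1
    return hist

def average_histogram(frames):
    """Average the per-frame histograms bin by bin ("Average" aggregation)."""
    hists = [frame_histogram(frame) for frame in frames]
    return [sum(column) / len(hists) for column in zip(*hists)]

# Two tiny made-up "frames" of two pixels each.
frame_a = [(255, 0, 0), (255, 0, 0)]        # two red pixels
frame_b = [(0, 0, 0), (255, 255, 255)]      # one black, one white pixel
gof_hist = average_histogram([frame_a, frame_b])
print(gof_hist)
```

Averaging is only one possible way to aggregate the frame histograms; the attribute name `HistogramTypeInfo` suggests the type is selectable.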
Structure Description Tools
[Figure: class diagram of the structure description tools. Segment DS describes Multimedia Content. Other boxes shown: SegmentRelation DS, SegmentDecomposition DS, VideoSegment DS, MovingRegion DS, StillRegion DS, ..., TextAnnotation D, SpatialMask D, ...]
Structure Description (I)
[Figure: a video segment broken down by segment decompositions into video segments and moving regions, with a relation labeled "above"]
• Video segment: MediaTime, Mosaic, GoFGoPColor, TextAnnotation
• Moving region: MediaTime, ScalableColor, ParametricMotion, TextureBrowsing, ContourShape, TextAnnotation
Structure Description (II)
<StillRegion id="SR1">
  <TextAnnotation>
    <FreeTextAnnotation> Alex shakes hands with Ana </FreeTextAnnotation>
  </TextAnnotation>
  <SpatialDecomposition overlap="false" gap="true">
    <StillRegion id="SR2">
      <TextAnnotation> <FreeTextAnnotation> Alex </FreeTextAnnotation> </TextAnnotation>
      <VisualDescriptor xsi:type="ColorStructureType"> ... </VisualDescriptor>
    </StillRegion>
    <StillRegion id="SR3">
      <TextAnnotation> <FreeTextAnnotation> Ana </FreeTextAnnotation> </TextAnnotation>
      <MatchingHint>
        <Hint value="0.455" xpath="../../VisualDescriptor"/>
      </MatchingHint>
      <Relation xsi:type="DirectionalSpatialSegmentRelationType" name="left" target="#SR2"/>
      <VisualDescriptor xsi:type="ColorStructureType"> ... </VisualDescriptor>
    </StillRegion>
  </SpatialDecomposition>
</StillRegion>
• Still region SR1: Creation information, Text annotation
• Still region SR2: Text annotation, Color structure
• Still region SR3: Text annotation, Matching hint, Color structure
• Spatial segment decomposition: No overlap, gap
• Directional spatial segment relation: left
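To show how an application might consume such a structure description, the sketch below walks the region tree and collects the segment relations. It uses a trimmed copy of the description (visual descriptors and matching hints omitted). The `xmlns:xsi` declaration is added here as an assumption so that a namespace-aware parser accepts the `xsi:type` attribute, and the function names `walk` and `relations` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the StillRegion description; the xmlns:xsi declaration is an
# addition needed for parsing and does not appear in the original snippet.
DESCRIPTION = """\
<StillRegion id="SR1" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <TextAnnotation>
    <FreeTextAnnotation> Alex shakes hands with Ana </FreeTextAnnotation>
  </TextAnnotation>
  <SpatialDecomposition overlap="false" gap="true">
    <StillRegion id="SR2">
      <TextAnnotation> <FreeTextAnnotation> Alex </FreeTextAnnotation> </TextAnnotation>
    </StillRegion>
    <StillRegion id="SR3">
      <TextAnnotation> <FreeTextAnnotation> Ana </FreeTextAnnotation> </TextAnnotation>
      <Relation xsi:type="DirectionalSpatialSegmentRelationType"
                name="left" target="#SR2"/>
    </StillRegion>
  </SpatialDecomposition>
</StillRegion>
"""

def walk(region, depth=0):
    """Yield (depth, id, annotation) for a region and its sub-regions."""
    text = region.findtext("TextAnnotation/FreeTextAnnotation", default="").strip()
    yield depth, region.get("id"), text
    for child in region.findall("SpatialDecomposition/StillRegion"):
        yield from walk(child, depth + 1)

def relations(root):
    """Yield (source id, relation name, target id) for every Relation element."""
    for region in root.iter("StillRegion"):
        for rel in region.findall("Relation"):
            yield region.get("id"), rel.get("name"), rel.get("target").lstrip("#")

root = ET.fromstring(DESCRIPTION)
regions = list(walk(root))
links = list(relations(root))
for depth, region_id, text in regions:
    print("  " * depth + f"{region_id}: {text}")
print(links)
```

The relation is reported as a plain (source, name, target) triple; how "left" should be read between SR3 and SR2 is left to the standard's definition of the relation.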
Semantic Description Tools
[Figure: class diagram of the semantic description tools. Boxes shown: SemanticBase DS, SemanticBag DS, Semantic DS, Object DS, AgentObject DS, Event DS, Concept DS, SemanticState DS, SemanticPlace DS, SemanticTime DS, SemanticRelation DS, AnalyticModel DS, Segment DS, Multimedia Content, and Narrative World, ...; attributes Label and AbstractionLevel; edge labels "describes" and "captures"]
Semantic Description
[Figure: semantic description of the photograph in which Alex shakes hands with Ana]
• Agent object AO1: Label, Person
• Agent object AO2: Label, Person
• Event EV1: Label
• Concept C1: Label, Property, Property
• SemanticPlace SP1: Label, Place
• SemanticTime ST1: Label, Time
• Label values shown: Alex, Ana, Shake hands, Comradeship, New York, 9 September
• Relations:
  - Object-event relations: hasAgentOf, hasAccompanierOf
  - Concept-semantic base relation: hasPropertyOf
  - Semantic time-semantic base relation: hasTimeOf
  - Semantic place-semantic base relation: hasLocationOf
  - Segment-semantic base relations: hasMediaPerceptionOf (twice), hasMediaSymbolOf
An MPEG-7 Description
[Figure: the structure description (still regions SR1, SR2, SR3) and the semantic description (AO1, AO2, EV1, C1) of the photograph, combined with creation, media, and usage information]
• Event EV1: Label, Semantic time, Semantic place
• Creation information: Creation, Creator, Creation coordinates, Creation location, Creation date
  - Photographer: Seungyup; Place: Columbia University; Time: 19 September 1998
• Media information: Media profile, Media format, Media instance
  - 704x480 pixels, true color RGB, http://www.ee.columbia.edu/~ana/alex&ana.jpg
• Usage information: Rights
  - Columbia University, All rights reserved
Multimedia Description Schemes
[Figure: overview of the MDS tools]
• Content organization: Collections, Models
• Navigation & Access: Summaries, Views, Variations
• Content management: Creation & Production, Media, Usage
• Content description: Structure, Semantics
• User interaction: User Preferences, User History
• Basic elements: Schema Tools, Basic Datatypes, Links & Media Localization, Basic Tools
Other MDS
• Creation & Production
  - Description of content creation and production (e.g. title and creator), mostly author-generated
• Usage
  - Description of usage of the content (e.g. rights holders and publication)
• Media
  - Description of instances of storage media (e.g. storage format) for AV content
Other MDS Categories
• Navigation & Access
  - Description of summaries (hierarchical and sequential) and views for efficient browsing
  - Description of variations for personalized access
    - Translation, transcription, reduction, etc.
• Content Organization
  - Description of collections, classifications, and models
• User Interaction
  - Description of user's preferences pertaining to consumption of multimedia material
MPEG-7 Terminal Architecture (March '01)
[Figure: layered terminal architecture. From bottom to top: Transmission/Storage Medium; Delivery Layer (MPEG-2, IP, ATM, MP4, ...) with demultiplexing of the multiplexed streams; Compression Layer, where elementary streams (schema streams and description streams) pass through BiM/textual parsing and decoding, a schema decoder, a description decoder, and reconstruction; Application, reached through APIs. Multimedia streams and upstream data are also shown. Edge labels: "Defines", "Describe".]