INSTITUT MINES-TÉLÉCOM
Comments on carriage of timed text (and graphics)
Cyril Concolato, Jean Le Feuvre
M25978
July 2012, Stockholm, Sweden
Timed text has a long history …
■ So many formats
  • See http://wiki.videolan.org/Subtitles
■ MPEG has already looked at it, some time ago…
  • Analysis of the streaming text requirements, MPEG, Shanghai, China, October 2002, M8931
  • Existing MPEG technologies:
    − Scene description streams (BIFS, LASeR)
    − MPEG-4 Part 17
■ Now new formats (TTML, WebVTT, …)
  • Each format so far requires MPEG to standardize a new mechanism for its carriage
  • What will we do for the next formats (now or in 10 years)?
■ MPEG should design future-proof technology for the carriage of timed text in ISOBMFF!
Not only of timed text but also of timed graphics
■ Proposed new use case
  • Frame-based synchronized graphics overlay on top of a video
    − Ex: Graphics and video derived from Kinect devices
    − Ex: Recordings of Augmented Reality applications
    − Ex: SVG-based cartoons à la Flash
■ Same requirements as “timed text”
  • Selecting a graphics track
  • Playing while keeping synchronization
  • Accessing randomly in the graphics stream
  • Enabling progressive download and streaming or adaptive streaming
  • Positioning the track on top of the video …
■ Demo
Example of mis-synchronization
We need more generic requirements (1/2)
■ The ISOBMFF should be able to carry timed data, in a generic manner, for which the exact type or format can be identified.
  • Ex: to carry timed TTML, SVG, HTML, WebVTT...
■ The ISOBMFF should be able to carry samples of timed data composed of a main sample data referencing several individual pieces of data (sample resources), each of them carried efficiently, without requiring modifications to the main sample data.
  • Ex: Efficient carriage of JPEG images used by the timed text or graphics document
We need more generic requirements (2/2)
■ The ISOBMFF should be able to store sample resources together with or separately from the main sample data, possibly using movie fragments.
  • Ex: Share a JPEG across samples
■ The ISOBMFF should enable the storage of timed data in a fragmented manner across samples, for progressive loading by the application consuming sample data.
  • Ex: if an XML progressive loader can be used, use it!
Technical elements towards a solution
■ Situation
  • MPEG already has almost all the tools for timed text and graphics
    − Metadata tracks (part 12)
    − Scene description tracks (part 14)
  • Reuse existing tools as much as possible
  • Adapt them if needed
■ 2 proposals
  • Generic tool: usage of ‘meta’ in movie fragments
  • Specific adaptations to carry timed text and/or graphics
    − Option 1: Usage of timed metadata samples
    − Option 2: Usage of ‘meta’ box as samples
‘meta’ in movie fragments
■ The ‘meta’ box provides
  • Carriage of un-timed metadata
  • A useful mapping between a URL and the location of the metadata in the file (ItemInfoBox and ItemLocationBox)
  • A way to protect the metadata
■ Current situation
  • Fragmenting a movie with ‘meta’, what happens?
    − Media data not allowed in initialization segments!
  • Why is ‘meta’ not allowed in movie fragments?
■ Proposal
  • Allow at most one ‘meta’ box (and possibly one ‘meco’ box, to be consistent with the rest of the specification)
  • At the ‘traf’ level (not at the ‘moof’ level)
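
The proposal above can be checked mechanically. The following is a minimal sketch, assuming the standard ISOBMFF box header layout (32-bit big-endian size, four-character type, optional 64-bit largesize). It counts ‘meta’ boxes found directly inside each ‘traf’ of each ‘moof’. The helper names `iter_boxes`, `make_box` and `traf_meta_counts` are hypothetical and only serve this illustration; ‘meta’ inside ‘traf’ is what this contribution proposes, not an established rule.

```python
import struct


def iter_boxes(data, start=0, end=None):
    """Yield (type, payload_start, box_end) for each box in data[start:end]."""
    end = len(data) if end is None else end
    pos = start
    while pos + 8 <= end:
        size, box_type = struct.unpack_from(">I4s", data, pos)
        header = 8
        if size == 1:
            # A 64-bit largesize follows the box type.
            (size,) = struct.unpack_from(">Q", data, pos + 8)
            header = 16
        elif size == 0:
            # The box extends to the end of the enclosing range.
            size = end - pos
        if size < header or pos + size > end:
            raise ValueError("malformed box at offset %d" % pos)
        yield box_type.decode("latin-1"), pos + header, pos + size
        pos += size


def make_box(box_type, payload=b""):
    """Build a box with a 32-bit size (hypothetical helper, for test data only)."""
    return struct.pack(">I4s", 8 + len(payload), box_type.encode("latin-1")) + payload


def traf_meta_counts(data):
    """Return the number of 'meta' boxes directly inside each 'traf' of each 'moof'."""
    counts = []
    for box_type, start, end in iter_boxes(data):
        if box_type != "moof":
            continue
        for child_type, c_start, c_end in iter_boxes(data, start, end):
            if child_type == "traf":
                counts.append(sum(
                    1 for t, _, _ in iter_boxes(data, c_start, c_end) if t == "meta"
                ))
    return counts
```

A fragment would satisfy the proposed constraint when every value returned by `traf_meta_counts` is 0 or 1.
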
Option 1: Usage of “timed metadata” samples to carry timed text and graphics
■ Track handler: ‘meta’
■ Sample entry:
  • XMLMetaDataSampleEntry if XML (TTML, SVG, …)
  • TextMetaDataSampleEntry if textual (WebVTT, HTML, …)
  • URIMetaSampleEntry if needed
■ Sample
  • Use of given MIME type or namespace to identify the content of the sample
  • Complete XML document or text chunks or binary content
  • Storage of secondary resources (e.g. JPEG …) as items in a ‘meta’ box:
    − At the ‘traf’ level, if fragmented
    − At the ‘trak’ level, at the ‘moov’ level, at the file level
■ Flattening
  • Merge ‘meta’
  • Or store ‘traf’-level ‘meta’ boxes at the ‘trak’ level with ‘meco’ boxes
    − Use a new sampleGroup type to associate ‘meta’ to sample
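
The sample entry choice in Option 1 can be sketched as a lookup on the MIME type. This is only an illustration: the function name `pick_sample_entry` is hypothetical, and falling back to URIMetaSampleEntry for everything that is neither XML nor text is an assumption (the slide only says "if needed").

```python
def pick_sample_entry(mime_type):
    """Return the sample entry name suggested by Option 1 for a MIME type."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    mime = mime_type.split(";")[0].strip().lower()
    if mime in ("application/xml", "text/xml") or mime.endswith("+xml"):
        return "XMLMetaDataSampleEntry"      # e.g. TTML, SVG
    if mime.startswith("text/"):
        return "TextMetaDataSampleEntry"     # e.g. WebVTT, HTML
    # Assumption: anything else is identified through a URI.
    return "URIMetaSampleEntry"
```

For instance, `pick_sample_entry("application/ttml+xml")` selects the XML entry and `pick_sample_entry("text/vtt")` selects the text entry.
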
Option 2: Usage of ‘meta’ box as samples to carry timed text and graphics
■ Track handler: ‘metb’
■ Sample entry:
  • MetaBoxSampleEntry (merge of Text- and XMLMetaDataSampleEntry)
■ Sample format
  • A ‘meta’ box with text, XML or graphics document stored as primary item
  • Storage of secondary resources (e.g. JPEG …) as items in a ‘meta’ box:
    − This one, at the sample level (if any)
    − At the ‘traf’ level, if fragmented
    − At the ‘trak’ level, at the ‘moov’ level, at the file level
■ Flattening
  • As usual with audio, video, …
Important points
■ Sample mapping
  • Empty samples
    − Option 1: an « empty » sample is a zero-length string
    − Option 2: an « empty » sample is a ‘meta’ box with an empty primary document
  • Overlapping samples
    − Use sample start time as CTS
    − Use delta CTS as duration (as usual)
    − Let the sample define « real » presentation duration and overlap
    − No « artificial » duplicate content
■ Simple processing
  • Import/export
  • Fragmentation/Un-fragmentation
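
The sample mapping rules above can be sketched as follows, under stated assumptions: cues are given as (start, end, payload) tuples in timescale units, all start times are distinct, and an empty sample is a zero-length string as in Option 1. The function name `cues_to_samples` and the dictionary layout are hypothetical. Each cue is stored exactly once; its real presentation end time stays inside the payload, while the track-level duration is just the CTS delta to the next sample.

```python
def cues_to_samples(cues, empty=b""):
    """Map possibly overlapping cues to non-overlapping track samples."""
    ordered = sorted(cues, key=lambda cue: cue[0])
    if not ordered:
        return []
    samples = []
    # Assumption: a leading gap is filled with one empty sample starting at 0.
    if ordered[0][0] > 0:
        samples.append({"cts": 0, "duration": ordered[0][0], "data": empty})
    # The track ends when the last cue (in end-time order) disappears.
    track_end = max(end for _, end, _ in ordered)
    for index, (start, _end, payload) in enumerate(ordered):
        if index + 1 < len(ordered):
            # Duration is the CTS delta to the next sample, not the cue length.
            duration = ordered[index + 1][0] - start
        else:
            duration = track_end - start
        samples.append({"cts": start, "duration": duration, "data": payload})
    return samples
```

With two overlapping cues, the first payload is not duplicated into the second sample: the reader keeps displaying it for as long as the payload itself says.
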
Misc points: Current WD problems
■ « Time mapping »
  • What is needed?
  • What is the timeBase for TTML?
  • « adjacent time ranges » not defined in SMPTE?
■ Spatial registration
  • We agree with m25859: we want to be able to store a text/graphics track in a separate file from the video track
  • We agree to reuse 3GPP-style positioning
■ Unnecessary restrictions
  • We agree with m25859: do not add restrictions to the ISOBMFF on timescale, …
■ Be careful not to restrict to complete XML documents only
Summary
■ Proposed one new use case: timed graphics
■ Reformulated requirements (more generic)
■ Proposed clarifications for ‘meta’ box in movie fragments
■ Proposed 2 options based on existing tools to carry timed text or graphics
■ Comments about current WD
Questions?