Video Description in More than one Language


• Overview
• Need
• Capability
• Approaches
• Conclusion

Disclaimer: This presentation does not contain any recommendations, assessments or positions from or by NAB

Video Description

The term ‘video description’ means the insertion of audio narrated descriptions of a television program’s key visual elements into natural pauses between the program’s dialogue.

(S. 3304)

AKA: Descriptive Video, Visually Impaired (VI)

Video Description Insertion

[Figure: Audio with Video Description (Complete Mix). Labels: Descriptive insertions; Primary Audio Dialog; “no dialog” (three gaps); “Voice under?”]

Pause to reflect

• Receiver makers refused to support the original Dolby design to save bits by enabling supplemental audio tracks

• So service providers must consume bits to send everything for any audio service

• Wonder if that lack of innovation is in Gary’s book… moving on…

Initial Mandates

• Initial video description rules will go into effect October 2011.

– Four networks (ABC, CBS, Fox, and NBC) and the top 5 national non-broadcast networks will have to provide 50 hrs./quarter with video description (see note 1)

– Broadcast stations and MVPDs with technical capability to do so generally must pass through audio containing video descriptions.

1. For the top 25 DMAs and 50k+ subscriber systems respectively

More to Come from the FCC

• Reports
• Rule makings
• Not later than this or sooner than that
• Soonest for all DMAs: 2037
• Not tomorrow … but it is coming
• English is assumed, but the Spanish speaking population is growing …
• So what can we do?

Digital Audio Interfaces

[Figure: Studio/Master Control block diagram. Labels: Ancillary Data; Control Data; Audio Subsystem (Audio Source Coding); Video Subsystem (Video Source Coding); Work File; Router/MC Switcher; Audio; Video.]

Dolby E: 8 ch. PCM stream, compressed, 3 Mb/s total, twisted pair or coax

Eight is enough

• For one additional service (with 5.1)

– Need to replicate the path to get more than one language or type of service
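The channel arithmetic behind this slide can be sketched as follows. The sketch assumes a 5.1 mix occupies six channels and a stereo mix two, and that each Dolby E path carries eight channels as stated above. The function name and the rule that a language is never split across paths are illustrative assumptions, not part of any standard.

```python
import math

DOLBY_E_CHANNELS = 8  # channels per Dolby E path, per the slide above

# Channel counts per mix format (5.1 = L, R, C, LFE, Ls, Rs).
CHANNELS = {"5.1": 6, "2.0": 2}


def dolby_e_paths_needed(languages, mixes_per_language=("5.1", "2.0")):
    """Return how many 8-channel Dolby E paths are needed when each
    language carries the given mixes and a language is never split
    across paths (an assumption made for simplicity)."""
    per_language = sum(CHANNELS[m] for m in mixes_per_language)
    if per_language > DOLBY_E_CHANNELS:
        raise ValueError("one language does not fit in a single path")
    languages_per_path = DOLBY_E_CHANNELS // per_language
    return math.ceil(len(languages) / languages_per_path)


print(dolby_e_paths_needed(["eng"]))         # 1
print(dolby_e_paths_needed(["eng", "spa"]))  # 2
```

A 5.1 mix plus a stereo description mix fills one path exactly, which is why a second language needs a second path.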

Uncompressed Audio Interconnects

[Figure: Studio/Master Control block diagram. Labels: Work File; Ancillary Data; Control Data; Audio Subsystem (Audio Source Coding & Compression); Video Subsystem (Video Source Coding & Compression); MCSW; Digital Audio Interface; Audio; Video.]

AES-3: 2 ch / PCM stream, uncompressed, 1.92 Mb/s, twisted pair or coax (may also carry 8-channel Dolby E); then in HANC: 16 channels total
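The 1.92 Mb/s figure quoted for an AES-3 pair is consistent with one common set of parameters, which the slides do not state and which are assumed here: 48 kHz sampling and 20-bit samples. A minimal sketch of that arithmetic:

```python
def pcm_payload_rate(sample_rate_hz, bits_per_sample, channels):
    """Audio payload bit rate in bits per second (no framing overhead)."""
    return sample_rate_hz * bits_per_sample * channels


# Assumed parameters (not stated in the slides): 48 kHz, 20-bit, 2 channels.
rate = pcm_payload_rate(48_000, 20, 2)
print(rate / 1e6)  # 1.92
```

With the 32-bit subframes that AES3 uses on the wire, the same function returns 3,072,000 bit/s for a two-channel pair at 48 kHz.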

Sixteen is enough

• For a pair of 5.1 channel services, each with an associated stereo audio descriptive video mix.

• For a 5.1 service and 5 stereo services

• So audio in several languages with VI and HI could be supported – but only one could be 5.1
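The bullets above are channel-budget statements, and a short sketch makes the arithmetic explicit. It assumes six channels for a 5.1 mix and two for stereo; the `fits` helper and the layout lists are hypothetical names used only for illustration.

```python
HANC_CHANNELS = 16  # embedded audio channels available, per the slides
CHANNELS = {"5.1": 6, "2.0": 2}


def channels_used(layout):
    """Total channels consumed by a list of mixes such as ["5.1", "2.0"]."""
    return sum(CHANNELS[mix] for mix in layout)


def fits(layout, budget=HANC_CHANNELS):
    """True if the layout fits within the channel budget."""
    return channels_used(layout) <= budget


two_languages = ["5.1", "2.0", "5.1", "2.0"]  # two 5.1 services, each with a stereo VI mix
one_surround = ["5.1"] + ["2.0"] * 5          # one 5.1 service and five stereo services
three_surround = ["5.1", "5.1", "5.1"]        # three 5.1 services

print(channels_used(two_languages), fits(two_languages))    # 16 True
print(channels_used(one_surround), fits(one_surround))      # 16 True
print(channels_used(three_surround), fits(three_surround))  # 18 False
```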

Distribution

[Figure: distribution chain. Labels: Remote Production & Post Venues; Contribution (HD or SD); Network Center; Local Station; ATSC Digital; DTV.]

For MPEG links, Audio channels can be carried as program elements with PMT-based signaling

Transmission

• Digital Transmission

– A large number of audio services for a single video (MPEG-2 Transport) can be signaled and sent (depending on the number of descriptors associated with each audio)

• Analog Transmission

– Second audio or video description <choose>

ATSC Transport

[Figure: Transport Component Organization. Labels: Multi-Program Multiplex; Virtual Channel 1 (Video, Audio); Virtual Channel 2 (Audio); MDTV; PSI(P).]

[Figure: PES streams pass through a mux into a program; Program 1, Program 2 and Program 3 pass through a second mux into the multi-program transport stream.]

PES Streams

• Video + PCR
• Audio 1: CM eng
• Audio 2: VI eng
• Audio 3: CM spa
• Audio 4: VI spa

SI Tables (PMT): 4 each AC-3 and ISO-639 descriptors

PSIP Tables: each event, different descriptors
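The four-audio case in the diagram can be modeled as a simple data structure. This is a sketch only: the class name, field names and PID values are hypothetical. The stream type 0x81 is the value ATSC uses for AC-3 audio, and the bsmod codes follow the AC-3 bit stream mode table (0 = CM, 2 = VI, 3 = HI).

```python
from dataclasses import dataclass

AC3_STREAM_TYPE = 0x81               # stream_type used for AC-3 audio in ATSC
BSMOD = {"CM": 0, "VI": 2, "HI": 3}  # AC-3 bit stream mode codes


@dataclass(frozen=True)
class AudioElement:
    pid: int       # hypothetical PID, for illustration only
    service: str   # "CM", "VI" or "HI"
    language: str  # ISO 639 three-letter code

    @property
    def bsmod(self):
        return BSMOD[self.service]


# The four audio program elements listed in the diagram.
program_audio = [
    AudioElement(0x34, "CM", "eng"),
    AudioElement(0x35, "VI", "eng"),
    AudioElement(0x36, "CM", "spa"),
    AudioElement(0x37, "VI", "spa"),
]

for element in program_audio:
    # One AC-3 descriptor and one ISO-639 descriptor per element.
    print(hex(element.pid), hex(AC3_STREAM_TYPE), element.bsmod, element.language)
```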

[Figure: encoder chassis block diagram. Labels: Midplane; Management Port; CPC Control Card; ASI Output Card (2 ASI); StatMux Engine; Encoder (four shown), each with Video Encoding and Audio Encoding (3 x 2.0); Optional Audio Card (four shown), each 5.1 + 2.0 or 3 x 2.0; SDI / HD-SDI (embedded audio); PSIP input; Dual ASI Output.]

[Figure: chassis rear view. Labels: Video Inputs; Encoder Outputs; Optional Audio Cards.]

Optional Audio Cards: each can encode or transcode from Dolby E one 5.1 + one 2.0 Dolby Digital

Note this configuration supports more than my example case.

The maximum audio + video shown is:

• Four video programs (HD, SD or mixed)
• Four 5.1 surround channels
• Sixteen 2.0 stereo channels
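The stated maximum can be checked with a little arithmetic. The sketch below reads the diagram as four encoders, each with built-in 3 x 2.0 audio encoding and one optional audio card configured as 5.1 + 2.0. That reading is an assumption, and the names are illustrative.

```python
ENCODERS = 4                          # assumed from "four video programs"
BUILT_IN = {"2.0": 3}                 # built-in audio encoding per encoder: 3 x 2.0
CARD_SURROUND = {"5.1": 1, "2.0": 1}  # optional card configured as 5.1 + 2.0


def chassis_totals(card_config, encoders=ENCODERS):
    """Total 5.1 and 2.0 services when every encoder has one optional card."""
    totals = {"5.1": 0, "2.0": 0}
    for _ in range(encoders):
        for source in (BUILT_IN, card_config):
            for mix, count in source.items():
                totals[mix] += count
    return totals


print(chassis_totals(CARD_SURROUND))  # {'5.1': 4, '2.0': 16}
```

Under that reading the totals match the four 5.1 and sixteen 2.0 services quoted above.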

Based on slide from

One configuration would be to provide one 5.1 in English with the Descriptive Video on a Dolby E path, and the Spanish 5.1 with Descriptive Video in Spanish on another Dolby E path.

Announcement Paradigms

• This program has English, Spanish, with Video Description in both languages

• Separate virtual channels

– English
– English Description for the Blind
– English for the Hearing Impaired
– Spanish
– Spanish Description for the Blind
– Spanish for the Hearing Impaired

But cable may have to do something like this if delivery to NTSC sets is required

Terrestrial Emission Overview (signaling and announcement)

[Figure: events and tracks carried in the ATSC transport, with the corresponding PSIP & PSI signaling.]

Event 1: CM (5.1, eng)

– EIT 0 (partial): Event 1, AC-3 descriptor with one audio service
– PMT: one ISO-639 descriptor

Event 2: CM (5.1, eng) + VI (2, eng); CM (5.1, spa) + VI (2, spa)

– EIT 0 (partial): Event 2, AC-3 descriptor with four audio services
– PMT: four ISO-639 descriptors (one per program element)

DTV Receiver

[Figure: DTV receiver block diagram. Labels: RF Tuner & VSB De-Modulator; Transport De-Multiplex; Audio Decoding; Video Decoding; PSIP Data; Program Guide Database; Display Processor; Audio; Video; Program select from user; RF Channel Select; Audio Select.]

CEA-CEB-21

Recommended Practice for Selection and Presentation of DTV Audio

In progress since July 2008, but almost done

Key Issues

• User set up and control
• Explicit Language selection
• Explicit VI and HI selection
• Differences between stream construction (off-air and cable)

Key Recommendations

• Receivers should gather user preferences and allow them to be changed later

• Receivers should read the tables and descriptors and use the contents

• Receivers should automatically select best fit to preferences when more than one stream is present

Key Recommendations

• Receivers should consider the following items when providing for user selection of their preferred audio stream:

– Stream type (CM, VI, or HI) as signaled by the bsmod field in the AC-3_audio_stream_descriptor().

– The language field encoded in the AC-3_audio_stream_descriptor().

– The component_name_descriptor() to provide supplemental audio stream information to users, if needed.
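A best-fit selection along these lines might look like the sketch below. The scoring order (language first, then service type, falling back to a complete main) is an assumption made for illustration and is not taken from CEB-21; the class and function names are hypothetical. The bsmod codes are those of the AC-3 bit stream mode table.

```python
from dataclasses import dataclass

CM, VI, HI = 0, 2, 3  # AC-3 bsmod codes: complete main, visually impaired, hearing impaired


@dataclass(frozen=True)
class AudioStream:
    bsmod: int      # from the AC-3 audio stream descriptor
    language: str   # ISO 639 code from the descriptor
    name: str = ""  # optional text from component_name_descriptor()


def select_audio(streams, preferred_language, preferred_bsmod=CM):
    """Pick the stream that best fits the stored user preferences."""
    def score(stream):
        language_match = stream.language == preferred_language
        service_match = stream.bsmod == preferred_bsmod
        is_main = stream.bsmod == CM
        # Tuples compare left to right, so language outranks service type.
        return (language_match, service_match, is_main)

    return max(streams, key=score) if streams else None


streams = [
    AudioStream(CM, "eng"), AudioStream(VI, "eng"),
    AudioStream(CM, "spa"), AudioStream(VI, "spa"),
]
print(select_audio(streams, "spa", VI))  # the Spanish VI stream
print(select_audio(streams, "eng", HI))  # no HI stream, so the English CM stream
```

Ranking language above service type is one possible policy; a receiver could equally let the user choose which preference wins when both cannot be met.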

Conclusion

• Multiple language, multiple community service audio tracks are part of your future (unless English is declared to be the <ONE> Official Language for the United States of America)

• Force fitting to the 2-audio mold is problematic

• When breaking the mold, plan ahead

Credits

ATSC

CEA

Mike Dolan

Graham Jones
