
February 19, 2002 Vol. 13.03 Copyright © CSR 2002 1

COMMUNICATIONS STANDARDS REVIEW

Volume 13, Number 3 February 19, 2002

REPORT OF JOINT VIDEO TEAM (JVT) MEETING #2, ISO/IEC JTC1/SC29/WG11 (MPEG) AND ITU-T SG16 Q6 (VCEG),

JANUARY 29 – FEBRUARY 1, 2002, GENEVA, SWITZERLAND

The following report represents the view of the reporter and is not the official, authorized minutes of the meeting.

Joint Video Team (JVT) Meeting #2, ISO/IEC JTC1/SC29/WG11 (MPEG) and ITU-T SG16 Q6 (VCEG), Jan. 29 - Feb. 1, 2002, Geneva, Switzerland

Meeting Summary
IPR Status
UVLC (Universal Variable Length Codeword)
Context-Based Adaptive Codes (CAC)
Other VLCs and Scanning
Context-Based Adaptive Binary Arithmetic Coding (CABAC)
Motion Compensation
Macroblock Partition
Multiframe Motion Compensation
Global Motion Compensation and Motion Vector Coding
De-Blocking Filter
SP Frames
Buffering
Network Adaptation Layer (NAL)
High-Level Syntax
Transform Coding and Quantization
Transform Size
Robust Transmission
Interlaced Coding and Progressive/Interlace Interaction
Profiles and Levels
Performance Evaluation
Complexity
Intra Coding
Encoding
Fine-Grain Scalability
JVT Ad Hoc Committees

JVT Meeting Roster, Jan. 29 - Feb. 1, 2002, Geneva, Switzerland
Acronym Definitions
Communications Standards Review Copyright Policy


REPORT OF JOINT VIDEO TEAM (JVT) MEETING #2, ISO/IEC JTC1/SC29/WG11 (MPEG) AND ITU-T SG16 Q6 (VCEG),

JAN. 29 - FEB. 1, 2002, GENEVA, SWITZERLAND

The Joint Video Team (JVT) is composed of ITU-T VCEG (Q6/SG16) and ISO/IEC MPEG (ISO/IEC JTC1/SC29/WG11). The JVT chair and co-chairs are G. Sullivan (Microsoft), T. Wiegand (Heinrich Hertz Institute), and A. Luthra (Motorola BCS). JVT-B001 reports the results of this meeting. JVT-B003 is the list of Participants at this meeting, JVT-B004 is the list of JVT Experts, and JVT-000 is the list of documents. JVT-B002 is the Report of the first JVT meeting in Pattaya, Thailand (Dec. 2001, see CSR 12.48).

JVT-B005 (G. Sullivan, Microsoft) is the Ad Hoc Report on the JVT Project. It contains a review of the general status of the project, and in particular announces a new ftp site sponsored by IMTC: <ftp://ftp.imtc-files.org/jvt-experts/>. The general JVT reflector can be subscribed to at <http://www.imtc.org/scripts/imtc.pl?enter=jvt-experts>. Email for the reflector should be sent to <[email protected]>. The JVT thanks the International Multimedia Telecommunications Consortium (IMTC) for their support in hosting both this email reflector and ftp site. Preparations for this meeting had used the prior ftp site, at which the documents for the meeting are available: <http://standard.pictel.com/ftp/video-site/0201_Gen>.

Meeting Summary

The JVT adopted a second Joint Working Draft (JWD 2) design (JVT-B118), and a Joint encoding test Model (JM 2) non-normative reference encoder description. It includes the adoption of the following changes to the prior JWD 1 and JM 1:

Normative content changes adopted:

JVT-B011: Deblocking filter (general usefulness)
JVT-B029: Exp-Golomb VLC (general usefulness without CABAC)
JVT-B036: CACM+ CABAC (general usefulness with CABAC)
JVT-B063: Bitstream NAL structure with start code and emulation prevention part (use for bitstream environments)
JVT-B055: SI Frames (use for streaming, random access, error recovery)
JVT-B080: Intra Prediction (put in software as configurable feature, use it in common conditions, seek complexity analysis)
JVT-B071: Interlace frame/field switch at picture level (with field coding as the candidate baseline interlace-handling design)
VCEG-O17: MB partition alteration (general usefulness)
JVT-B101: CABAC efficiency improvement (general usefulness with CABAC)
JVT-B038: Transform (general usefulness)
Extension of quant range (general usefulness)
JVT-B042: Normative picture number update behavior (error resilience)

Tentative:

JVT-B053: ABT (prepare completely final and switchable annex description and reference software as candidate for adoption at next meeting – expected to be adopted as a non-baseline feature if this is achieved)

Non-normative content:

JVT-B022: Encoding Motion Search Range (lower complexity method not for common conditions testing)
JVT-B102: Robust reference frame selection
JVT-B042: Enhanced GOP concept description in interim file format non-normative appendix

The working draft document editor (T. Wiegand, HHI) was asked to produce the new version of the working draft within four weeks of the end of the meeting, and the software coordinator (K. Sühring, HHI) was asked to produce the corresponding version of the software by the halfway point between this meeting and the Fairfax meeting in May. The following persons are designated as subject-area assistants to the editor:

• Source Coder (Picture Formats, etc.): G. Sullivan (Microsoft)
• Syntax and Semantics
  – NAL / Slice Syntax: T. Stockhammer (Munich University of Technology)
  – Macroblock syntax, syntax Diagrams: J. Alvarez (Broadcom)
• Decoder Process
  – Slice Decoding: M. Hannuksela (Nokia)
  – Motion Compensation: J. Lainema (Nokia)
  – Transform Coefficient Decoding: H. Malvar (Microsoft)
  – Loop-Filter: P. List (Deutsche Telekom)
  – VLC: G. Bjøntegaard (RealNetworks)
  – CABAC: D. Marpe (HHI)
  – B-frames: H. Schwarz (HHI)
  – SP-frames: M. Karczewicz (Nokia)
  – Interlace: P. Borgwardt (VideoTele.com)
  – HRD: G. Sullivan (Microsoft)

The JVT notes that the working draft document is the draft of their future standard, and that it therefore represents the primary and highest-priority representation of their adopted design. The contributors to that draft are reminded that it must contain a complete and full specification of the decoding process for JVT video at a level of detail sufficient to achieve understanding and interoperable implementation of the design (using only the document as the specification of the necessary technical content).

The JVT expressed an interest in the following topics for future consideration:

• An interesting proposal (Scene Transitions, JVT-B043) was presented that needs further consideration and analysis (including other composition-style features such as picture-in-picture), which could be considered for adoption; consensus on adoption could not be reached at this meeting.

• Consideration of giving the quantization parameter a period of 8, together with an adjustment of the common conditions, as a work item for AHG investigation.

• Consideration of JVT-B109, which was produced and reviewed as an output document draft for consideration and comment, and for potential adoption into the draft at the next meeting.

The JVT adopted a plan to define and conduct the following core experiments:

JVT-B111, Core Experiment on Scattered Slices
JVT-B112, Core Experiment on SP Pictures
JVT-B115, Core Experiment on Adaptive MV Coding
JVT-B116, Core Experiment on Interlace Chroma Phase Shift
JVT-B117, Core Experiment on Macroblock Adaptive Frame/Field Interlace Coding

The JVT produced JVT-B108 (D. Lindbergh, Polycom; A. Luthra, Motorola) as its draft framework for Profile and Level definition, and requests information and comment from its parent organizations and members on its content. In particular, the JVT requests input on what applications are identified for JVT video, which tools appear to be appropriate for those applications, analysis of the complexity of the JVT codec tools, and suggestions on the best mapping of these needs into appropriate profiles and levels.

The JVT requests information and comment from its parent organizations on the consideration of the sub-picture error resilience tool JVT-B040 as a potential candidate for phase 2 work beyond the currently-defined schedule.

The JVT requests information and comments from its parent organizations on the following issues in regard to RTP payload packetization:
• Remarks on whether conformance to the draft MPEG-4 “MultiSL” packetization format should be a design constraint on JVT packetization design.
• Remarks regarding the appropriate definition of an “access unit.” Is an Access Unit defined as the smallest quantity of data that can be associated with a unique timestamp? Is this definition appropriate for JVT video? (e.g., can a slice be an access unit?)
• Remarks on MultiSL draft support in the following areas:
  – Support of distinct classes of packets defined such as mode/MV data, intra coefficient data, and inter coefficient data packets (e.g., for unequal error protection) within each slice.
  – Support for “compound packets” (muxing of multiple slices into one packet).
  – Support for placing data for parts of several pictures in one packet.

The JVT expressed its intent to move toward use of the MPEG-4 file format as the defined method for JVT video content storage, and directs its working draft editor to clearly indicate the interim nature of the JVT interim file format design in the working draft and to avoid/eliminate any slight deviations of editorial terminology with regard to the drafted interim file format description. The JVT notes that only bitstream switching and addressing of data below the picture level (addressing of fragments of access units) and the enhanced GOP concept of JVT-B042 appear to be features in the JVT interim file format design that may not be supported in the current MPEG-4 file format design. The parent bodies are requested to consider these features of the JVT interim file format design for study as the JVT progresses toward use of the MPEG-4 file format.

IPR Status

JVT-B073, JVT IPR Status Report (G. Sullivan, Microsoft), is a general review of the IPR situation with remarks on possible problem areas and open issues. It notes that the following items were reported to VCEG prior to JVT formation:

• Telenor sent email remarks to the VCEG chair about six months ago reporting that they held IPR on some aspects of the UVLC design used in H.26L. G. Sullivan recalls that this IPR was offered for inclusion in the H.26L standard by Telenor under subclause 2.2 of the ITU-T patent policy.

• Verbal remarks were made at the Santa Barbara VCEG meeting in September 2001 that Philips Corp. may have IPR on the 2-D VLC design used in H.26L and that this IPR was likely to be available under terms covered by subclause 2.2 of the ITU-T patent policy.

• Verbal remarks were made at the Santa Barbara VCEG meeting in September 2001 that Netergy Networks (which has now changed its name back to 8x8, Inc.) has IPR on long-term memory motion compensation as used in H.26L and that this IPR is likely to be available under terms covered by subclause 2.2 of the ITU-T patent policy.

As of this meeting the following IPR statements were received:
JVT-B107, 8x8 (B. Andrews), agrees with Clause 2.2, with a relaxation clause stating “For implementations of the baseline of the above Recommendation | Standard, the Patent Holder is prepared to grant a ‘royalty-free’ license to anyone on condition that all other Patent Holders do the same.”
JVT-B113, Sharp (see below, under De-Blocking Filter)
JVT-B114, Sharp (see below, under Transform Coding and Quantization)

The group was informed about the JVT’s patent policy. The IPR status was reviewed. The question regarding the support of a royalty-free baseline profile was asked. Telenor and 8x8, Inc. indicated an intent to follow sub-clause 2.2.1 (royalty free with reciprocity, see JVT-B107 for 8x8’s patent statement). Telenor indicated that the spirit of 2.2.1 was expressed in a Telenor email message six months ago. Philips indicated that they would follow sub-clause 2.2. Thus the Philips situation appears to be the only identified problem in reaching the royalty-free baseline goal.

UVLC (Universal Variable Length Codeword)

JVT-B074 (T. Chujoh, Y. Kikuchi, Toshiba) proposes a new variable length code to improve the coding efficiency. Simulation results show that the proposed VLC outperforms UVLC for high bit rates. The proposed VLC is better for all sequences when QP is small. The coding gain is up to 6.2% for Inter coding and QP=1. Gains depend on the sequence. For increasing QP, the performance becomes worse than UVLC. The codewords of the proposed VLC are regular and without interleaving. It also provides bit-error detection mechanisms.

JVT-B099 (K. Takagi, KDDI) discusses the usefulness of the reversibility of codes. It provides a method for systematically producing Reversible VLCs. The meeting agreed that results would be needed to study the efficiency of this technique.

JVT-B034, Enhanced variable-length coding (T. Halbach, NTNU), investigates error resilience properties of two codes, UVLC and the RVLC of H.263, Annex D. It is claimed that the UVLC has been used so far in H.26L standardization, and VLCD is a reversible VLC that offers error resilience/concealment possibilities which are superior to UVLC. Discussion noted that the various features are well understood. However, in order to change the design, the group needs more evidence on how these methods could be applied beneficially in a practical system.

JVT-B029, Reduced Complexity VLC (L. Kerofsky, Sharp Labs; M. Zhou, TI), supports the previous proposal VCEG-N36 (Sept. 2001, CSR 12.37) by repeating remarks about the UVLC performance under bit errors given in VCEG-L23 (Jan. 2001, CSR 12.09). As mentioned in VCEG-N36, the interleaved structure of the UVLC codewords is complex and can be reduced by using a non-interleaved structure. VCEG-L23 is used to evaluate the sacrifice in removing the resynchronization property of the UVLC. The group agreed that this provides more flexibility for adaptive codes. It can be used for binarization in CABAC.

JVT-B047 (M. Zhou, TI) suggests removing the EOB coding redundancy by transmitting the EOB only for the coded blocks in which the last coefficient is zero. By doing so there is no complexity increase, and the error detection feature is still maintained. Up to 1% overall bit-rate saving is measured. Removing the EOB coding redundancy would add one more differentiation to the H.26L coding standard. This would be unnecessary if the coefficient count techniques are adopted. The group also noted that some error detection mechanisms rely on the presence of EOB. It was decided to defer any decision.
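The EOB redundancy described in JVT-B047 can be illustrated with a small sketch. It assumes a 4x4 block of 16 coefficients already in scan order and reads "the last coefficient" as the final scan position; the function names and the run/level symbol format are hypothetical and are not taken from the contribution.

```python
BLOCK_SIZE = 16  # coefficients in one 4x4 block (assumption for this sketch)


def needs_eob(scanned):
    """EOB is only informative when the final scan position is zero.

    If the last scanned coefficient is nonzero, the decoder already knows
    the block is complete once that coefficient has been decoded.
    """
    if len(scanned) != BLOCK_SIZE:
        raise ValueError("expected a full block of scanned coefficients")
    return scanned[-1] == 0


def run_level_symbols(scanned):
    """Turn scanned coefficients into (run, level) pairs plus an optional EOB."""
    symbols = []
    run = 0
    for coeff in scanned:
        if coeff == 0:
            run += 1                      # count zeros preceding the next level
        else:
            symbols.append((run, coeff))
            run = 0
    if needs_eob(scanned):
        symbols.append("EOB")             # sent only when trailing zeros exist
    return symbols
```

For a block whose sixteenth coefficient is nonzero, the sketch emits no EOB symbol, which is the saving the proposal measures.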

In sum, the group agreed that it wants to change the UVLC. The Exp-Golomb code provides simplification and flexibility. Non-synchronizing reversible codes may provide enhanced error resilience. The group proposes to adopt the Exp-Golomb codes per JVT-B029 unless the benefit of the enhanced error resilience feature of the other codes is shown.
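For readers unfamiliar with the non-interleaved structure, the following is a minimal sketch of an order-0 Exp-Golomb code: a prefix of zeros followed by the binary representation of the code number plus one. It uses bit strings for readability; the function names are illustrative and the exact codeword layout in JVT-B029 should be checked against that document.

```python
def exp_golomb_encode(code_number):
    """Return the order-0 Exp-Golomb codeword for a non-negative integer."""
    if code_number < 0:
        raise ValueError("code_number must be non-negative")
    info = bin(code_number + 1)[2:]        # binary of n + 1, starts with '1'
    return "0" * (len(info) - 1) + info    # zero prefix, then the info bits


def exp_golomb_decode(bits, pos=0):
    """Decode one codeword starting at bits[pos].

    Returns (code_number, position of the next codeword).
    """
    zeros = 0
    while pos + zeros < len(bits) and bits[pos + zeros] == "0":
        zeros += 1                         # count the zero prefix
    end = pos + 2 * zeros + 1
    if end > len(bits):
        raise ValueError("truncated codeword")
    return int(bits[pos + zeros:end], 2) - 1, end


# Concatenate a few codewords and read them back.
stream = "".join(exp_golomb_encode(n) for n in (0, 3, 1))
```

Because the prefix length alone determines the codeword length, a decoder can count leading zeros and then read the remaining bits in one step, which is the simplification the group refers to.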

Context-Based Adaptive Codes (CAC)

JVT-B045 (G. Bjøntegaard, Telenor) proposes an adaptive, low complexity entropy coding method for transform coefficients. It contains the idea of switching between various VLCs based on contexts. Overall coding gain between 0-9% (QP=10) compared with UVLC coding is reported for Inter Coding and between 0-8% (QP=10) for Intra coding. Discussion noted the impressive gains. It was clarified that a test set was used to design tables. A question about complexity was raised.

The goal of JVT-B072, VLC Coefficients Coding for High Bit rate (M. Karczewicz, Nokia), is to show that VLC coding can be considerably improved for high rates and/or high resolution material. It also initiated the discussion on whether there is an interest in using VLC coding in such cases. The proposal uses a coefficient counter and parameterizable Exp-Golomb codes. The measured bit-rate reductions are between 0-20% at low QP values for Intra and between 0-13% at low QP values for Inter.

The JVT has great interest in these methods to achieve bit-rate savings. CAC AHG Chair G. Bjøntegaard noted that the mandate is to further study the idea of adaptive codes with regard to the trade-off between coding efficiency and complexity involving the Exp-Golomb and the CABAC entropy coding.
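A sketch of why a parameterizable Exp-Golomb code helps at low QP: the order-k variant spends k extra suffix bits on every symbol but shortens the prefix for large values. The sample level lists below are made up for illustration and are not data from JVT-B072; the function names are hypothetical.

```python
def exp_golomb_k(value, k):
    """Order-k Exp-Golomb codeword for a non-negative integer, as a bit string."""
    if value < 0 or k < 0:
        raise ValueError("value and k must be non-negative")
    info = bin(value + (1 << k))[2:]           # binary of value + 2^k
    return "0" * (len(info) - k - 1) + info    # shorter prefix as k grows


def total_bits(values, k):
    """Number of bits needed to code all values with an order-k code."""
    return sum(len(exp_golomb_k(v, k)) for v in values)


# Illustrative (invented) coefficient magnitudes:
small_levels = [0, 1, 0, 2, 1, 0, 0, 1]        # typical of coarse quantization
large_levels = [9, 14, 6, 21, 11, 8, 17, 5]    # typical of fine quantization

bits_small = {k: total_bits(small_levels, k) for k in (0, 2)}
bits_large = {k: total_bits(large_levels, k) for k in (0, 2)}
```

With these inputs the order-0 code is cheaper for the small values and the order-2 code is cheaper for the large ones, which is the kind of trade-off a context or counter can exploit when selecting the parameter.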

Other VLCs and Scanning

JVT-B056 (S. Kadono, K. Abe, M. Schlockermann, Matsushita) proposes an improved 2D-VLC encoding scheme for high bit rates (small QP). The basic idea of this technique was proposed as VCEG-O27 (S. Kadono, M. Schlockermann, Matsushita Electric, December 2001, CSR 12.48). The proposed scheme, using UVLC, improves 2D-VLC coding efficiency by modifying the VLC table that is optimized for small QP and depends on the numbers of uncoded quantized coefficients in the block:
1) Level scaling 2D-VLC table design: For the small QP case, allocation of code number is adaptive for high value.
2) Position adaptive 2D-VLC table design: Long run is removed when encoder knows there will be no long run.
3) Inverse order zigzag scanning: a simplification. In combination with the second proposal, large coefficients should be coded in the last part of the scan, hence the proposed inverse order zigzag scanning.

The proposal is verified using JVT sequences, and it is shown to reduce bits by 4-7% for I-pictures and 1-2% for P-pictures at a small QP. A fourth method is proposed to turn scaling on/off so as to never degrade performance.

JVT-B093 (Kato, S. Adachi, M. Etoh, NTT DoCoMo) reports verification results of the improved 2D-VLC encoding scheme, which was proposed in VCEG-O27. They confirm the results in VCEG-O27. At the same time, the results show the desirability of avoiding disadvantages on some encoding parameters and/or sequence characteristics.

JVT-B081 (C-W. Kim, McubeWorks; S-W. Rhie, SK Telecom) proposes to adaptively use double / simple scan. It proposes a delta value to adjust the threshold which separates simple and double scan. Results show that a bit-rate reduction of up to 5% can be achieved near QP=24. Further consideration of this proposal must be conducted in the light of the results of the CAC AHG.
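The effect of reversing the scan can be sketched as follows. The example assumes the conventional zigzag order for a 4x4 block stored in raster order; whether this matches the exact scan of the draft at the time is an assumption, and the sample block is invented.

```python
def zigzag_order(n=4):
    """Raster indices of an n x n block in conventional zigzag order."""
    order = []
    for s in range(2 * n - 1):                       # s = row + column
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        if s % 2 == 0:
            rows = reversed(rows)                    # even diagonals run upward
        order.extend(r * n + (s - r) for r in rows)
    return order


def scan(block, order):
    """Read a raster-order block in the given scan order."""
    return [block[i] for i in order]


forward = zigzag_order()
inverse = forward[::-1]                              # inverse order zigzag scan

# Invented 4x4 block with energy concentrated at low frequencies.
block = [7, 3, 0, 0,
         2, 0, 0, 0,
         1, 0, 0, 0,
         0, 0, 0, 0]

forward_scan = scan(block, forward)   # large coefficients first
inverse_scan = scan(block, inverse)   # zeros first, large coefficients last
```

In the inverse scan the large low-frequency coefficients arrive at the end, after the decoder has already seen how many positions remain, which is what lets a position-adaptive table rule out long runs.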


JVT-B062, Improved Entropy Coding with Codeword Re-Association (B. Jeon, SungKyunKwan Univ.; W. Choi, K.J. Kim, Serome Technology), proposes a method for improving the current UVLC approach by using a fixed re-association table (FRAT) that assigns existing codewords to different symbols according to QP values. It is proposed to use this approach for high bit rate applications where statistics are different from the cases with medium and low bit rates. The mapping tables are divided according to quantizer ranges, e.g., 3 Ranges: QP=1-10, QP=11-20, QP=21-31. Gains up to 8% are shown for QP=1 for Mobile for Intra coding, while normal gains are around 2-3%. Sometimes results are worse than UVLC. Results are worse for Inter coding than for Intra. Also switches between active and non-active sequences are investigated. The group encourages further work.

The JVT group stressed that further proposals in this area should show performance on top of the CAC result.

Context-Based Adaptive Binary Arithmetic Coding (CABAC)

JVT-B015, AHG Report: CABAC (D. Marpe, HHI), summarizes the CABAC-related proposals presented at the last meeting, and gives an overview of the activities within the CABAC AHG in the interim period between the last and the present meeting.

JVT-B101, New Results on Improved CABAC (D. Marpe, G. Blättermann, T. Wiegand, HHI; R. Kurceren, M. Karczewicz, J. Lainema, Nokia), proposes new coding elements and new context models for improving coding efficiency of CABAC, both for inter and intra frame coding. This proposal combines ideas from earlier contributions. Simulation results show performance gains of up to 5.2% BD bit-rate savings relative to the current CABAC specification for intra coding. In the case of inter coding (IPPP), improvements of up to 2.4% BD bit-rate reduction relative to the original CABAC method have been obtained.
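The benefit of better context models can be seen in a toy calculation of ideal code length. This sketch uses simple adaptive frequency counts per context, which is an assumption made for brevity; it is not the probability estimator or the context definitions of CABAC or JVT-B101, and the bit sequence is invented.

```python
import math


def adaptive_cost(bits, context_of):
    """Ideal code length, in bits, using one adaptive model per context.

    Each model keeps Laplace-smoothed counts of zeros and ones and charges
    -log2(p) for the symbol that actually occurs.
    """
    counts = {}
    total = 0.0
    for i, bit in enumerate(bits):
        ctx = context_of(bits, i)
        model = counts.setdefault(ctx, [1, 1])       # [zeros, ones]
        p = model[bit] / (model[0] + model[1])
        total += -math.log2(p)
        model[bit] += 1                              # adapt after coding
    return total


def no_context(bits, i):
    """A single shared model for every symbol."""
    return 0


def previous_bit_context(bits, i):
    """Select the model by the previously coded bit (0 for the first bit)."""
    return bits[i - 1] if i > 0 else 0


# Invented source with long runs, so the previous bit is a strong predictor.
bits = ([0] * 50 + [1] * 50) * 2
cost_single = adaptive_cost(bits, no_context)
cost_context = adaptive_cost(bits, previous_bit_context)
```

On this input the context-selected models need far fewer bits than the single model, because each model sees a much more skewed distribution. An arithmetic coder can approach these ideal lengths closely.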

JVT-B100, Performance of CABAC for Interlaced Video (D. Marpe, H. Schwarz, T. Wiegand, HHI), evaluates the improved CABAC entropy coding method with regard to interlaced source material. The simulation results show that CABAC significantly improves the rate-distortion performance for the specified set of test sequences. An average BD (Bjøntegaard Delta) bit-rate savings of 11%, or equivalently, BD PSNR gains of 0.45 dB in comparison to the UVLC entropy coding method is observed.

The proposal in JVT-B033, Low-Complexity Arithmetic Coding Implementation (R. J. van der Vleuten, Philips Research), significantly reduces the arithmetic coding implementation complexity, while it has a negligible influence on the compression efficiency. Although the method was originally designed and optimized for hardware implementation, it also significantly improves the software execution speed. No degradation was seen against the original CABAC.

JVT-B036, Low-Complexity Arithmetic Codec Engine (X. Wu, MinimalMass Inc.; L. Winger, M. Gallant, VideoLocus Inc.), presents two arithmetic coding engines: CACM+/ACM98 (improved AC engine from “Arithmetic Coding Revisited”, Moffat, Neal, Witten. ACM Transactions on Information Systems, 16(3):256-294, July 1998.) and WAC (X. Wu’s proposed low-complexity AC engine).

JVT-B105, Improved CABAC (E. Hamilton, D. Lelescu, N. Terterov, A. Zheludkov, Compression Science Corp.), proposes a new context-based binary arithmetic coder for use by H.26L for entropy coding of transform coefficients. The proposal considers only the texture coding part. It is demonstrated to produce greater compression performance as compared to H.26L TML 9.4 CABAC by up to 8% bit rate reduction for texture data. Overall the gains are up to 5.72%. The proposal did not use the common test conditions. Further work is encouraged.
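BD bit-rate figures such as the 11% above summarize the average horizontal gap between two rate-distortion curves. The sketch below is a simplified approximation: it interpolates log-rate piecewise-linearly over PSNR, whereas the Bjøntegaard method fits a polynomial through the measured points. The function names and the sample points are hypothetical.

```python
import math


def _interp(x, xs, ys):
    """Piecewise-linear interpolation; xs must be strictly increasing."""
    for x0, y0, x1, y1 in zip(xs, ys, xs[1:], ys[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    raise ValueError("x outside interpolation range")


def bd_rate(ref, test, samples=1000):
    """Average bit-rate change of `test` relative to `ref` as a fraction.

    ref, test: lists of (bitrate, psnr) points with distinct PSNR values.
    A negative result means `test` needs fewer bits at equal PSNR.
    """
    def curve(points):
        pts = sorted(points, key=lambda p: p[1])
        return [p[1] for p in pts], [math.log10(p[0]) for p in pts]

    ref_q, ref_lr = curve(ref)
    test_q, test_lr = curve(test)
    lo = max(ref_q[0], test_q[0])          # overlapping PSNR interval
    hi = min(ref_q[-1], test_q[-1])
    if hi <= lo:
        raise ValueError("curves do not overlap in PSNR")

    diffs = []
    for i in range(samples + 1):
        q = min(lo + (hi - lo) * i / samples, hi)    # guard against rounding
        diffs.append(_interp(q, test_q, test_lr) - _interp(q, ref_q, ref_lr))
    # Trapezoidal average of the log-rate difference over the interval.
    avg = (sum(diffs) - 0.5 * (diffs[0] + diffs[-1])) / samples
    return 10 ** avg - 1


# Invented example: the test codec uses 10% fewer bits at every PSNR point.
ref_points = [(100, 30.0), (200, 33.0), (400, 36.0), (800, 39.0)]
test_points = [(90, 30.0), (180, 33.0), (360, 36.0), (720, 39.0)]
saving = bd_rate(ref_points, test_points)
```

Swapping the roles of rate and PSNR in the same procedure gives the corresponding BD PSNR figure.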


JVT-B091 (M. Etoh, S. Adachi, M. Kobayashi, F. Bossen, NTT DoCoMo) proposes requirements for CABAC design. The first requirements concern the arithmetic coding core, in terms of feasibility with small mobile devices that require low power consumption. The second requirements concern context modeling, in terms of the coding efficiency degradation sacrificed by the Unified Binarization scheme.

In summary, the JVT group proposed to adopt the improved CABAC method of JVT-B101, CACM+ of JVT-B036, and to continue to discuss complexity reduction of the arithmetic coding engine in the AHG on CABAC. A revised version of JVT-B101 is expected.

JVT-B064 (G. Bäse, N. Oertel, Siemens) presents additional results for the CABAC-related CE according to the proposed method in VCEG-O34. The number of coefficients is used to enhance the Level coding further. The proposed method always outperforms any combination of the other methods for smaller QP values. Up to 1% bit rate savings on top of the combination of the HHI and Nokia proposals can be achieved without additional complexity. Further work is encouraged.

Motion Compensation

JVT-B018, AHG Report: Motion Interpolation (T.I. Johansen, Tandberg; T. Wedi, Univ. ofHanover), notes that T. Wedi is porting and incorporating his implementation to the JVT software.Although many experts have shown interest in the subject, no new parties have come forward withcontributions. This is mainly believed to be due to the big effort required to start this work fromscratch and the short time between the December meeting in Pattaya and this meeting. The reportrecommends to continue the AHG with the previous mandate plus finalization of the work forpotential inclusion at the next meeting.

JVT-B066, 1/8-pel MC for interlaced video (T. Wedi, Univ. of Hanover), analyses the coding efficiency of 1/8-pel motion vector resolution for interlaced sequences. For sequences with active motion (Football, Rugby,...), no coding gain is obtained with 1/8-pel compared to 1/4-pel MCP. The PSNR difference for this kind of sequence is in the range of 0.0 to -0.2 dB. For sequences with moderate motion and high spatial detail (Bus, Flower,...), a gain between 0.5 and 1.0 dB is obtained for the higher bit rates of the test set.

JVT-B077, Short Tap Filter For High Resolution Sequences (K. Chono, Y. Miyamoto, NEC), reports the experimental results comparing the TML 6-tap filter and the Telenor 4-tap filter. The gains are in the range of 0-10% BDBRS. It is pointed out that the 4-tap implementation can reduce interpolation calculation by up to 50% for that part of the decoder. The group suggested making filter size a profile and level issue.

Macroblock Partition

JVT-B090 (S. Adachi, S. Kato, M. Kobayashi, M. Etoh, NTT DoCoMo) reports the core experiment results of improved MB (Macroblock) prediction modes. It also describes B-frame syntax for VCEG-O22 (S. Adachi, S. Sekiguchi, S. Kato, M. Kobayashi and M. Etoh, NTT DoCoMo, Inc., December 2001, CSR 12.48), which is newly defined in the CE description. The results show that up to 8.5% bit rate reduction or 0.41 dB PSNR improvement can be seen by the method of VCEG-O22, and up to 5.2% or up to 0.25 dB improvement by the method of VCEG-O17 (H. Schwarz, T. Wiegand, HHI, December 2001, CSR 12.48).

JVT-B054, Core Experiment Result on Improved MB Prediction Modes (H. Schwarz, T. Wiegand, HHI), presents an improvement of the proposal “Tree-structured macroblock partition” (VCEG-O17, CSR 12.48). Furthermore, the results of the Core Experiment on Improved Macroblock Prediction Modes defined in VCEG-O61 (S. Adachi, NTT; HHI; Mitsubishi; Matsushita, December 2001) are reported. The proponents report similar or reduced complexity.

JVT-B052, Core Experiment Result on Improved MB Prediction Modes (S. Sekiguchi, Y. Yamada, K. Asai, Mitsubishi), is basically the informative verification report of the technique proposed in VCEG-O22 at Pattaya. This independent implementation shows coding gain comparable to that reported at the Pattaya meeting (December 2001).

JVT-B058, Core Experiment Result on Improved MB Prediction Modes (M. Hagai, S. Kadono, Matsushita), verifies the VCEG-O61 (VPM2) method through independently developed software. It shows an average bit rate saving of 3.60% (single reference frame) and 4.10% (5 reference frames), and an average PSNR gain of 0.17 dB (single reference frame) and 0.20 dB (5 reference frames), under the condition of CE1.2. It was noted that VPM2 shows significant improvement for all sequences.

JVT-B059, Intra Prediction for Improved MB Prediction Modes (M. Hagai, S. Kadono, M. Schlockermann, Matsushita), proposes introducing an intra-coded segment into VCEG-O22. That segment is coded using the intra 4x4 mode coding scheme. It reduces bit rate by 0.8% compared to the original VCEG-O22, giving a 4.85% reduction compared to TML 9.0. These additional intra coding modes need to be tested to see if the performance can be achieved.

In discussion it was noted that VCEG-O22 could require calculation of 4x4 SADs, which is computationally complex. How important is encoder complexity, as it is not required to check all of the new modes of VCEG-O22? The scheme of VCEG-O22 gives about the same performance regardless of whether the 4x4, 4x8 and 8x4 modes are used. The original aim of VCEG-O22 was to improve coding efficiency at low bit rates – which it achieves, but perhaps not to the extent that may have been expected.

The JVT group agreed to include VCEG-O17 into the test model at this meeting. Further information regarding the complexity of VCEG-O22 was requested.

Multiframe Motion Compensation

JVT-B009, AHG Report: Generalized ERPS (T. Wiegand, HHI), indicates that the goal of integration of ERPS (Enhanced Reference Picture Selection) into the design has been achieved.

JVT-B032, P-Frame Coding with Interpolative Prediction (M. Zhou, TI), describes a way to change the baseline codec by introducing an interpolative prediction mode in the existing P-frame coding. In the proposed method, the number of reference frames is limited to two; in addition to forward prediction from the two reference frames, a macroblock can also use interpolative prediction, which is similar to the bi-directional prediction mode in B-frames. Experimental results reveal that this may be an effective way to cut the number of reference frames while still maintaining the coding efficiency.
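
The interpolative mode can be pictured as averaging two motion-compensated blocks, much as bi-directional prediction does in B-frames. The following is a minimal sketch only, not the method of JVT-B032: the rounding rule, the SAD-based mode choice and all names (`interpolative_prediction`, `choose_mode`) are assumptions made for illustration.

```python
def interpolative_prediction(ref_a, ref_b):
    """Average two equally sized blocks (lists of rows), rounding half up.

    The rounding rule is an assumption; the report does not state one.
    """
    return [[(a + b + 1) >> 1 for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(ref_a, ref_b)]


def sad(block, pred):
    """Sum of absolute differences between a block and its prediction."""
    return sum(abs(x - p)
               for row_x, row_p in zip(block, pred)
               for x, p in zip(row_x, row_p))


def choose_mode(block, ref_a, ref_b):
    """Pick the cheapest of the three candidate predictions by SAD.

    A real encoder would also weigh the bits needed to signal the mode;
    this hypothetical decision ignores that cost.
    """
    candidates = {
        "forward_ref0": ref_a,
        "forward_ref1": ref_b,
        "interpolative": interpolative_prediction(ref_a, ref_b),
    }
    costs = {name: sad(block, pred) for name, pred in candidates.items()}
    best = min(costs, key=costs.get)
    return best, costs


# Toy 2x2 blocks: the current block lies midway between the two references.
current = [[100, 100], [100, 100]]
older = [[90, 90], [90, 90]]
newer = [[110, 110], [110, 110]]
best_mode, mode_costs = choose_mode(current, older, newer)
```

In this toy case the averaged block matches the current block exactly, so the interpolative candidate has the lowest SAD.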

JVT-B057, Proposal of Minor Changes to Multiframe Buffering Syntax for Improving Coding Efficiency of B-pictures (S. Kondo, S. Kadono, M. Schlockermann, Matsushita), builds on VCEG-O26 (December 2001, CSR 12.48). JVT-B057 explains the enhanced bi-directional prediction mode briefly and proposes a suitable method to clarify the ambiguous simulation results. Then it proposes some minor semantics changes to multi-frame buffering in order to handle the multi-frame buffering framework more effectively without loss of generality.

For the prediction method that references B-pictures when coding the next B-picture, BDBRS between 0-3% are shown overall, while for B-pictures the gains are up to 6%. For further improvement, a new direct prediction mode is introduced which almost doubles the gains when referencing B-pictures for coding of the next B-picture. Various problems in supporting such a coding method are outlined for the syntax.

The JVT group recommended forming an AHG on B-Picture coding (Chair: S. Kondo, Matsushita), Mandate: finalize B-picture syntax taking into account JVT-B057.

JVT-B043, Coding of Scene Transitions (M. Hannuksela, Nokia), proposes composition of scene transitions in the decoder. Component pictures are coded and decoded separately and a transition filter is applied to reconstructed versions of the component pictures to obtain a picture to display. The proposal is likely to improve compression efficiency and enhance picture quality in many scene transition cases. A remark was made that this technique could provide coding gains in case of long fades. Further consideration and comment on JVT-B043 was requested.

JVT-B075, Improved Multiframe MC (Motion Compensation) using Frame Interpolation (Y. Kikuchi, T. Chujoh, Toshiba), proposes a new motion compensation method employing frame interpolation with a weighted sum of multiple reference frames. Simulation results show that the proposed method gains an SNR improvement of up to 0.5 dB at high bit rates, even with a simple mean weighting. Significant coding gain of 0.5 to 2 dB is derived for fading sequences. There is no need to calculate a fading factor at the encoder side because the proposed method works as extrapolation of the brightness. Discussion noted that this should be included in the B-frame AHG work.
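
To illustrate why a weighted sum of reference frames can follow a fade without an explicit fading factor, the sketch below compares a simple mean with a linear extrapolation. It is a hedged illustration, not the scheme of JVT-B075: the weights (2, -1), the clipping to 8-bit samples and the function name `weighted_prediction` are assumptions, since the report does not give the contribution's actual weights.

```python
def weighted_prediction(refs, weights):
    """Form a prediction as a weighted sum of co-located reference samples.

    refs    : list of equally sized blocks (lists of rows), most recent first
    weights : one weight per reference block (hypothetical values)
    The result is rounded with Python's round() and clipped to 0..255.
    """
    height, width = len(refs[0]), len(refs[0][0])
    pred = []
    for y in range(height):
        row = []
        for x in range(width):
            value = sum(w * ref[y][x] for ref, w in zip(refs, weights))
            row.append(min(255, max(0, int(round(value)))))
        pred.append(row)
    return pred


# A linear fade: brightness drops by 10 per frame (120 -> 110 -> 100).
previous = [[110, 110], [110, 110]]   # frame t-1
older = [[120, 120], [120, 120]]      # frame t-2

mean_pred = weighted_prediction([previous, older], [0.5, 0.5])
extrap_pred = weighted_prediction([previous, older], [2, -1])
```

With these toy values the mean predicts 115 while the extrapolation predicts 100, the brightness the fade would reach in the current frame.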

Global Motion Compensation and Motion Vector Coding

JVT-B046, Global Motion Vector Coding (GMVC) and GMC switched by MV (H. Kimata, NTT), proposes a modified syntax for GMVC and GMC. The hook for switching a GMVC or GMC block is integrated into the motion vector syntax. The proposed scheme improves coding efficiency, especially in zoomed sequences. Bit-rate savings are reported to be up to 7.8%. The group requested experiments with B-pictures: it was pointed out that for P-pictures which are temporally further spaced apart, the gain increases. It was also pointed out that for B-pictures, the gains are typically smaller.

JVT-B019 (S. Sun, S. Lei, Sharp Labs) proposes a revised GMVC technique. Compared to the previous GMVC proposal, the “GMVC mode with coefficients” has been rearranged to gain better bit rate savings. Experiments show significant bit rate savings and some visual quality improvements at very low bit rates when global motion is present. There were no gains at medium bit rates. A comment was made with regard to skipping at the MV predictor. Gains are reduced by 30-50% for B-pictures.

An AHG on GMVC/GMC (Chair: H. Kimata, NTT, co-Chair: J. Lainema, Nokia) was created, with the mandate: study the inclusion of JVT-B046 and JVT-B019. It will also study the use of MB skip at the predictor.

De-Blocking Filter

JVT-B011, AHG Report: Loop Filter (P. List, Deutsche Telekom), recommends adopting the improved software into the JM. The reduction in software complexity was estimated to be a factor of 2-4. A remark was made that the reduction in hardware complexity is smaller than the complexity reduction in software. The changes that need to be made to the document were presented to the group. Subjective results were shown and an improvement was seen by the group. It is proposed to adopt this loop-filter. It was also proposed to continue the loop-filter AHG (chair: P. List). The mandate was extended to consider finalization of the loop-filter design, and to consider the case of varying QP on an MB-per-MB basis.

JVT-B021, A modified loop filter of VCEG-O19 (M-C. Hong, Soongsil University; H. H. Oh, LG Electronics), proposes a frame-based loop filter that introduces QP and signal dependent adaptive filtering. The difference from VCEG-O19 (December 2001, CSR 12.48) is that it does not use a temporal frame buffer, to reduce the complexity. Instead it uses different filter coefficients depending on the signal (luma or chroma). An average saving of 15% in loop filter decoding time is obtained relative to the existing JM implementation. Subjective comparisons shown so far indicate a potential subjective gain of the proposed method. The proposal will be given further consideration in the loop-filter AHG.

JVT-B037, PDS (Philips Deblocking Solution), a Low Complexity Deblocking for JVT (J. Jung, E. Lesellier, Y. Le Maguet, C. Miró, J. Gobert, Philips Research), proposes a new deblocking algorithm. The deblocking filter has been tested as a post process function and is to be implemented in the coding loop soon. The algorithm has been compared to the TML 9 deblocking in terms of both quality and complexity. The results so far show that it is equivalent to the TML 9 deblocking in terms of perceived visual quality (for CIF sequences), and is 3.5 times less complex. Subjective results indicate some improvements of the proposed method while the method gives objective losses up to 0.5 dB BD PSNR. The proposal will be given further consideration in the loop-filter AHG.

JVT-B078 (K. Chono, Y. Miyamoto, NEC) concerns two in-loop filter issues. One is the necessity of the in-loop filter. It argues that, at middle QP values, the filter should be switched off for medium bit rates, but no results were shown. The other is a comparison between MB-based and frame-based filtering. It claims that MB-based filtering shows the better performance both in coding gain and subjective quality. Discussion noted that the differences shown for the two methods are larger than some experts experienced in the past. The adopted design now uses MB-based filtering.

In JVT-B079, Loop Filtering Method Using DCT Coefficients Distribution (Y.L. Lee, I.H. Shin, J.H. Park, Samsung), the reconstructed signal is processed by a 4x4 DCT. The 4x4 DCT coefficients distribution is used to control the deblocking filtering for the 4x4 block boundary in that it provides the horizontal and vertical blocking information. Results are compared with the latest JM optimized loop filter distributed by P. List (Deutsche Telekom). PSNR results are similar, while the group preferred the subjective quality of the latest JM. The optimized JM loop filter has lower complexity. The proponent indicated the potential for further complexity reduction. The optimized JM loop filter has 0.5% more bit-rate reduction. The group remarked that the optimized JM loop filter shows some color bleeding. The proponent was encouraged to do further work.

JVT-B084, Downloadable Threshold Tables for Loop Filter (T.W. Foo, S.M. Shen, Matsushita), proposes the use of a flag to indicate whether to use the standard threshold tables or user-defined threshold tables to improve the subjective quality. The subjective results shown indicated some benefit for the proposed technique. The group found the technique to be promising and wants to further investigate it in the loop-filter AHG.

JVT-B061, Adaptive Motion Vector Coding (Y. Suzuki, Hitachi), provides a new motion vector coding method for reducing the amount of motion vector data. The idea of this proposal is to adaptively select the fractional pixel accuracy of differential motion vector components macroblock by macroblock. The proposed scheme provides improvements of 1-4% at similar complexity to JM-1. In discussion, a comment was made on a possible problem with the overhead bits needed to signal the differential motion vector accuracy. Concern was raised about the small gain. The group encourages further work.

JVT-B113 (S. Sun, Sharp) is Sharp’s contribution regarding its IPR on “Loop Filter with Boundary Strength and Skip Mode,” referring to VCEG-N17 (Sept. 2001, CSR 12.37) and VCEG-M20 (April 2001). It cites clause 2.2, with a relaxation clause stating “To support a royalty-free baseline we follow ITU-T 2.2.1 for Baseline Profile applications agreeing to a free license on condition that all other patent holders do the same.” This statement is understood by the group to indicate a claim to IPR on the content of JVT-B011, as JVT-B011 states that it includes calculation of the strength parameter as proposed in VCEG-N17.

SP Frames

Instead of quantizing the MC signal, JVT-B097, Advanced SP Coding Technique (X. Sun, W. Gao, Harbin; F. Wu, S. Li, Y-Q. Zhang, Microsoft), quantizes the reconstructed signal to achieve the synchronization feature. PSNR gains of 1.0 dB for Foreman and up to 0.5 dB for Coastguard indicate the potential of the method. A question was raised with regard to subjective performance. A core experiment to verify and analyze the proposal will be conducted. The CE document will be provided by Nokia and MS. For that, experimental conditions need to be determined: e.g., QP of sequence and QP of switch, frequency of switching. The proposals need to be compared with regard to subjective quality and decoder complexity.

VCEG-O47 (R. Kurceren and M. Karczewicz, Nokia Research Center, December 2001, CSR 12.48), proposed new frame types called SI-frames, to be used in conjunction with SP-frames. The prediction block in SI-frames is formed identically to Intra-frames whereas the rest of the decoding of the SI-frames is similar to SP-frames. The resulting property of the proposed frame is that an SI-frame/slice makes use of only spatial prediction and identically reconstructs a corresponding SP-frame/slice, which makes use of motion-compensated prediction. This property provides functionalities in random access, splicing and error resiliency/recovery. Software has also been provided that implements SI-frames in TML 9.0 and the corresponding software tools to demonstrate the usage of SI-frames in random access and bitstream switching.

JVT-B055, New Macroblock Modes for SP-Frames (R. Kurceren, M. Karczewicz, J. Lainema, Nokia), provides a description of SI-pictures and an example of how an SP-frame can be converted into an SI-frame, i.e., SI-frame encoding. In discussion it was noted that the group considers this to be a bug-fix. For the SP-frame feature to be retained in the JM, its description in the document needs to be dramatically clarified. The proponents are further requested to provide all the necessary software to use the feature.

Buffering

JVT-B013 (E. Viscito, GlobespanVirata) is the ad hoc group report on H.26L Buffering.

JVT-B050, Video Complexity Verifier (VCV) for HRD (S. Regunathan, P. Chou, J. Ribas-Corbera, Microsoft), describes a new video complexity verifier (VCV). This verifier, when used as a part of the Hypothetical Reference Decoder (HRD), characterizes the amount of delay and buffer size that is needed to decode and present a given bit stream at a certain level of computational capacity at the decoder. The VCV model proposed by Nokia is extended and integrated with the Video Buffer Verifier (VBV) specified by the current HRD. In addition to reducing the level of computational capacity required at the receiver to decode the bit stream, the new VCV model allows the bit stream to be decoded at multiple decoding speeds. These advantages are achieved at the cost of introducing further delay, while additional memory may often be unnecessary.

In discussion, it was not clear how complexity can be defined in an implementation-independent way. There was no consensus on defining compliance that depends on a limited set of specific implementation designs at some given time (year). Regarding how to apply it – there is a large mixture of decoder designs. Therefore, even if one can measure it, it is not clear how to apply it in a fashion that supports interoperability. There was no consensus that VCV should be included in profiles and/or levels.

JVT-B089, Complexity-Constrained Generalized HRD (M. Hannuksela, Nokia), combines the Generalized Hypothetical Reference Decoder design (VCEG-N58, J. Ribas-Corbera, P. Chou, Microsoft, Sept. 2001, CSR 12.37) and the Slice-Oriented Hypothetical Reference Decoder design (VCEG-O45, December 2001, CSR 12.48). In discussion it was noted that VBV parameters should be a normative part of the stream. The group proposed allowing the VBV parameters (transmission rate, buffer size, initial buffering period) to be sent either as an update of the parameter set or in an SEI (Supplementary Enhancement Information) packet. (Initial buffering period means VBV delay.) The authors would like to have only one way of sending VBV parameters. The means of transmission is for further study.
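
The role of the three VBV parameters named above can be shown with a generic constant-rate leaky-bucket check. This is a textbook-style sketch under stated assumptions, not the HRD of JVT-B089 or of the working draft: instantaneous removal of each coded picture, a fixed frame rate, and the function name `vbv_trace` are all hypothetical.

```python
def vbv_trace(frame_bits, rate_bps, buffer_bits, initial_delay_s, frame_rate):
    """Simulate decoder buffer fullness for a constant-rate channel.

    frame_bits      : coded size of each picture, in bits
    rate_bps        : transmission rate
    buffer_bits     : buffer size
    initial_delay_s : initial buffering period (the VBV delay)
    Returns (violation, trace): violation is None if the stream fits,
    and trace holds the fullness right after each picture is removed.
    """
    fullness = rate_bps * initial_delay_s      # bits received before decoding starts
    fill_per_frame = rate_bps / frame_rate     # bits arriving per picture interval
    trace = []
    if fullness > buffer_bits:
        return "overflow before first frame", trace
    for index, bits in enumerate(frame_bits):
        if bits > fullness:                    # picture not fully received in time
            return "underflow at frame %d" % index, trace
        fullness -= bits                       # picture removed instantaneously
        trace.append(fullness)
        fullness += fill_per_frame
        if fullness > buffer_bits:             # channel delivers more than fits
            return "overflow after frame %d" % index, trace
    return None, trace


# Hypothetical stream: 1 Mbit/s, 500 kbit buffer, 250 ms initial delay, 25 fps.
violation, trace = vbv_trace([100_000, 30_000, 30_000, 30_000],
                             1_000_000, 500_000, 0.25, 25)
```

A longer initial buffering period lets the stream carry larger pictures at the cost of start-up delay, which is why the three parameters have to be conveyed together.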

Network Adaptation Layer (NAL)

JVT-B016, AHG Report: Transport (Y-K. Lim, net&tv Co.; M. Hannuksela, Nokia; D. Singer, Apple), provides a review of interim discussions and work on NAL issues, including areas discussed in proposals to this meeting, File Format, and FLC vs. VLC discussion. The ToR provides both the scope of the work and relationships to other organizations.

JVT-B092 (M. Etoh, S. Adachi, NTT DoCoMo) proposes requirements for the NAL Specification. It includes requirements to keep functional separation of VCL and NAL, and requirements to have an elementary stream syntax for interoperability testing. The proposal advocates retaining separation of VCL and NAL, with emphasis on efficiency for VCL (only slices and intra support). Discussion noted that one should look at the tools available and then determine how these impact the VCL and NAL, and that test conditions can help determine appropriate actions. There was concern about whether the RTP design is appropriate for use on IMT-2000. Wireless can benefit from larger packet sizes; a compound packet design can help with this. A very simple elementary stream syntax (one potential NAL) is proposed in the contribution as useful for R&D and interop tests (simpler than use of the file format).

The principles expressed in JVT-B092 seem to be widely supported in the group. Action items for editing: 1) Add a remark to the file format section to clearly indicate its interim nature, 2) Avoid/eliminate slight deviations in terminology relative to the content of MPEG-4.

The group’s plan is to move toward the use of MPEG-4 file format as the defined method; only bitstream switching and below-picture-level addressing (fragments of access units) seem to be unsupported features in the current MPEG-4 file format. These will be raised as issues with the parent bodies.

JVT-B028, Overview of NAL Concept & VCL/NAL Interface (S. Wenger, TU Berlin; T. Stockhammer, Munich U. of Tech.), provides a summary of the concept of the H.26L Network Adaptation Layer (NAL). JVT-B028’s purpose is to serve as:
• Reference document on NAL concepts
• Tutorial for those of JVT who are not, or are only rudimentarily, familiar with the NAL concept
JVT-B028 is not a proposal; it describes a concept and the implementation of this concept that were accepted into the H.26L test model. The group appreciated the tutorial information provided.

JVT-B026, RTP-NAL and RTP packetization (S. Wenger, TELES AG), is an information document on the status of the MPEG-4 packetization effort. It includes:
• An overview of the IETF standardization process (with references)
• A description of the current MPEG-4 packetization RFCs and drafts
• A short analysis concluding that none of those drafts would be sufficient to implement the functionality of the current H.26L IP/RTP NAL design

Discussion pointed out that it is already possible to have a payload format that is optimized for the type of data to be carried (e.g., a format for JVT video). MultiSL draft format (a working draft in progress in IETF, going to WG last call soon, approximately one year to get an RFC number “draft standard”) is a generic format, which is not intended to be considered as the best format for every individual type of data. The best format can be designed in a customized fashion. A participant noted that a media-specific format should be a subset of the generic MultiSL format. Is this true of AAC format?

For reference, the last documented design for RTP use of JVT video is JWD or VCEG-N72r1 (NAL for RTP) and VCEG-N73 (draft packetization format, Sept., 2001, CSR 12.37).

It was agreed to request information from members/parent-bodies:
• Whether conformance to the draft MPEG-4 “MultiSL” packetization format should be a design constraint on JVT packetization design.
• Regarding definition of “access unit”: Is an Access Unit defined as the smallest quantity of data that can be associated with a unique timestamp? Is this definition appropriate for JVT video? (e.g., Can a slice be an access unit?)
• Does MultiSL draft support:
– Distinct classes of packets defined such as mode/MV, intra coefficient, inter coefficient (e.g., for unequal error protection) within each slice?
– Compound packets (muxing of multiple slices into one packet)?
– Parts of several pictures in one packet?
• Where is the best description of the concept of NAL/VCL? It is mostly missing from the working draft. Editorial effort is needed to get a proper description into the working draft.

JVT-B049, Start Codes and Mapping to MPEG-2 Systems (A. MacInnis, J. Alvarez, S. Chen, Broadcom), explains some problems that occur when the network or storage layer underlying the JVT video coding layer does not wrap packets precisely around slices, noting especially the potentially high cost in wasted bit rate. The issue is not unique to MPEG-2 systems protocols. The contribution explains why unique start codes form the preferred solution, explores how JVT could be made to allow unique start codes in the NAL and the changes to UVLC and CABAC that would be necessary in JVT to make this possible, and proposes a unique start code. It also presents a brief overview of how to design an NAL for use with MPEG-2 Systems (Transport and Program Streams).

Example transport scenarios include variable-length packets, and fixed-length packets (big ones or small ones). Fixed-size transport packets seem to require padding or start codes – and it is potentially very wasteful to use padding. Start codes should be unique. Start codes may also enable leveraging of existing equipment or equipment designs. JVT-B049 proposes a 32 bit start code prefix. One could enable UVLC-based start codes with a long codeword devoted to start code and have some way to avoid same-length codewords being used as coefficient data, motion vector data, etc. CABAC presents its own start code problems (possible emulation prevention per H.263 Annex E). JVT-B049 reviews potential mapping to MPEG-2 Systems.

A comment was made that the specifics of mapping to MPEG-2 systems are out of the JVT scope. In broadcast, a 2^{-24} emulation probability may sound small, but is often not acceptable over the duration of an entertainment-quality program. Also it was noted that low error rate channels have important high volume applications. It was mentioned that JPEG-2000 included some use of start codes and study of interaction with the arithmetic MQ coder; there could be something to learn from that design.
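
A rough calculation shows why a 2^{-24} emulation probability adds up over a long program. The sketch below is a back-of-the-envelope estimate under loudly stated assumptions: coded data is treated as uniformly random and independent at each position (real bitstreams are not), emulation is tested at byte-aligned positions only, and the 4 Mbit/s rate and two-hour duration are hypothetical values chosen for illustration.

```python
import math


def expected_emulations(bitrate_bps, duration_s, prefix_bits=24, byte_aligned=True):
    """Expected number of accidental start code prefix matches.

    Assumes each candidate position matches independently with
    probability 2**-prefix_bits (an idealised random-data model).
    """
    total_bits = bitrate_bps * duration_s
    positions = total_bits / 8 if byte_aligned else total_bits
    return positions * 2.0 ** -prefix_bits


def probability_of_any(expected):
    """Poisson approximation of the chance of at least one emulation."""
    return 1.0 - math.exp(-expected)


# Hypothetical broadcast program: 4 Mbit/s for two hours.
expected = expected_emulations(4_000_000, 2 * 3600)
chance = probability_of_any(expected)
```

Under this idealised model the two-hour program would be expected to contain a few hundred emulated prefixes, so at least one emulation is close to certain.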

JVT-B063r1, On Random Access and Bitstream Format (G. Sullivan, Microsoft), is a proposal to allow a random access point to start with any picture type, not just I pictures. The proposal allows prior references from the RA point, providing a means for the decoder to eventually catch up. Para. 2.2.1 is checked on the patent disclosure form.

In response to a comment that ideally one wants I-frames to show immediately, G. Sullivan said that even with that there is not a complete decoder reset. However, this does not prevent inclusion of an I frame at an RA point. Regarding time tags: Sony (VCEG-O53, December 2001, CSR 12.48) uses the 25-bit IEC method, which supports only PAL and NTSC frame rates. What is needed is a more general scheme, with “true time” (not UTC, but true relative time).

JVT-B063r1 wants a time-based limit on backward references. A “sync delay” number could work as follows: If 0, then the decoder can immediately decode. If, say, 250 ms, then it will have to parse/decode 250 ms of data stream before being able to decode fully (a “pre-roll delay” is the analogous quantity expressed as a picture count rather than time). This gives the encoder a lot of flexibility to trade off random-access responsiveness vs. coding efficiency. JVT-B063r1 proposes to permit all three of:

• Pre-roll delay (pictures)
• Init delay (time)
• Pre-roll + Init delay (may in some circumstances be faster than either of above)

In discussion:

Q/A: A decoder that starts decoding at a given point should init its buffers, and start decoding (best effort). Encoder responsibility is then to ensure that no further backward references are made after the promised init point.
Q: Implies some burden on the encoder to keep track of things. Is it practical? Is it back to basics?
Q/A: Needs more study of the effects on loop filtering of intra macroblocks.
Q: Item 3: (start with any picture type). How to support fast play/trick play?
A: Encoder can do this if it so chooses. Decoder can choose to show only those RA points attached to I frames (if it wants).
Q: Item 6: (broken link): Want to support “clear” editing process.
Q: Don’t like time tag design. Don’t think anyone would use it. Maybe supplementary enhancement info (discardable, synced with video).
A: SMPTE time code has problems. The proposal is compatible with SMPTE design, but also references true time.
Q: Init parameters as SEI.
A: Init parameters as SEI. No strong feelings.
Q: Overhead if not SEI?

One commenter fully supported SEI for init parameters. JVT-B089 is another proposal regarding timestamps.

Result (following the Proposal’s numbering):
2. Consider moving RA data down from GOP Layer. Will do so as long as it is simple.
3. RA points with any picture type. Accepted.
4. Time tag: Use simple LSBs of high rate clock. No “fixup” of SMPTE stuff, instead simple count, publish conversion to SMPTE in text (accepted pending receipt of working conversion formula).
5. ReInit/ - no for now, FFS. It may be desirable to turn off/on loop filters at boundaries. The loop filter issue is to be revisited.
6. Broken links FFS.
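
The time tag decision in item 4 (low-order bits of a high-rate clock, with a published conversion to SMPTE-style time code) can be sketched as follows. This is an illustration only and not the pending conversion formula: the 27 MHz clock is still an open issue in this report, and the 32-bit tag width, the integer frame rate, the non-drop-frame time code and all function names are assumptions.

```python
CLOCK_HZ = 27_000_000   # assumed clock rate; the report leaves this open
TAG_BITS = 32           # hypothetical number of LSBs carried in the time tag


def make_tag(ticks):
    """Keep only the low-order TAG_BITS of a running tick counter."""
    return ticks & ((1 << TAG_BITS) - 1)


def unwrap_tag(tag, previous_ticks):
    """Recover the full tick count from a tag.

    Assumes time moves forward and that less than one full wrap of the
    tag has elapsed since previous_ticks.
    """
    modulus = 1 << TAG_BITS
    candidate = previous_ticks - (previous_ticks % modulus) + tag
    if candidate < previous_ticks:
        candidate += modulus
    return candidate


def ticks_to_timecode(ticks, fps):
    """Convert ticks to a non-drop-frame hh:mm:ss:ff string (integer fps)."""
    total_frames = ticks * fps // CLOCK_HZ
    frames = total_frames % fps
    total_seconds = total_frames // fps
    return "%02d:%02d:%02d:%02d" % (total_seconds // 3600,
                                    (total_seconds // 60) % 60,
                                    total_seconds % 60,
                                    frames)


# 1 h 2 min 3 s plus 4 frames at 25 fps (1,080,000 ticks per frame).
ticks = 3723 * CLOCK_HZ + 4 * 1_080_000
timecode = ticks_to_timecode(ticks, 25)
```

Frame rates such as 30000/1001 would need a drop-frame rule, which this sketch deliberately leaves out.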

The timing indicator part of the contribution was discussed. It is similar to MPEG-2 content data description (more compressed form). Time base should be established at the sequence layer.

Open issues which remain include: How to represent time? Use 27 MHz clock? The group agreed that it is important for JVT to address the issue of how to handle the situation where the capture frame rate is different from the display rate (e.g., film capture and display on TV). It was agreed to leave this issue open for this meeting, and request further contributions and inputs. It should be closed by the May meeting. A separate ad hoc group was proposed to discuss and study the issue of timing as it relates to capture time, decoding time and display time, and transcoding from MPEG-2. Suggested chairs: G. Sullivan, and S. Chen (Broadcom). It was also suggested to think of the conversion problem in regard to the SMPTE timecode.

JVT-B104 (S. Wenger, TELES AG) is a proposal to explicitly allow start code emulations in those NALs that need start codes, in order not to hurt the coding efficiency for normal symbols (especially in those NALs that do not need start codes). It also advocates disallowing start code emulation prevention bits. It argues that emulated start codes happen mostly in error-prone environments, and there decoders need to be capable of handling incorrect syntax anyway.

JVT-B070, NAL for MPEG-2 System (T. Suzuki, N. Oishi, Y. Yagasaki, Sony Corp.), describes the contributors’ view of the requirements of VCL/NAL, and the requirements of NAL for MPEG-2 Systems. It also proposes an NAL for MPEG-2 Systems. It discusses the role of NAL and its relationship to VCL. It advocates high-level syntax element commonality (not necessarily commonality of exact bit representation). This commonality goal was supported by the group. Should the group advocate a normative VCL/NAL boundary specification conformance point to support a single decoder for multiple NAL use? No clear disagreement was expressed; however, further study of the issues surrounding conformance point specification is required. It was noted that VBV in MPEG-2 video does not include some header data. In H.263, it does. This will need further work to define it for JVT. The issue remains open.

JVT-B070 advocates a unique start code for MPEG-2 (not alignment of video packets to PES [Program Elementary Stream] packets). It advocates start codes in an MPEG-2 environment for sequence (NAL-specific issue), GOP (open issue), and picture (NAL-specific issue). Although this can be done by the UVLC, it is not advocated.

A decoder configuration mechanism is necessary for the establishment of parameter sets. This was agreed.

JVT-B070 suggests that user data should be supported; Supplemental Enhancement Information (SEI) is in the draft, and user data is one obvious candidate for use there. The group agreed that it needs to support user data. This topic will be revisited.

The draft SEI syntax has a message type ID followed by the message payload, so it can support user data and other data types. It is left to NAL to decide where to carry it. The group agreed that it should be possible to synchronize the SEI with the appropriate content of the VCL stream.

In discussion it was noted that the user data includes the manufacturer/organization. ID codes may be helpful; use of an existing manufacturer/organization ID code registration authority was suggested. It was noted that there is a code used by H.320 and H.323 consisting of a 16-bit country code + 16-bit manufacturer code. The group agreed that it seems useful to have some regulatory authority, and perhaps to also allow anonymous user data content.

It was agreed to add a start code for SEI for the bytestream environment. An XML format for SEI may also be considered.

JVT-B088, MPEG-2 Systems NAL (M. Hannuksela, Nokia), proposes an MPEG-2 Transport Packet format where identification of access units is not based on start codes. The proposed format is similar to the planned RTP payload format for JVT. No changes in VCL are required. Moreover, for MPEG-2 Program Streams, the paper proposes a PES packet format that follows the interim JVT file format design.

JVT-B088 advocates considering mapping of VCEG-N72r1 packets (in which the first byte is the packet type ID) into MPEG-2 Systems transport packets. It assumes the start of a new transport packet for each picture if the PES header timing information is needed. It may be possible using the proposed techniques to have an MPEG-2 packetization without using start codes. An alternative framing method is described, based on fixed-length packets and the ability to synchronize to the transport packet locations. A comment was made that some of the specifics seem out of the scope of JVT work.

For program stream syntax, JVT-B088 advocates using start codes. It considers using the file format (or something with similar features) within the program stream to carry the data. It assumes use of PES packets containing “boxes” of data. Note: It does not produce unique start codes, so it would need an emulation prevention mechanism (e.g., in the spirit of JVT-B063) if uniqueness is necessary. A comment was again made that some of the specifics seem out of the scope of JVT work.

Action items for JVT:
• Consider the mapping of the file format features into the bitstream format.
• Submit rX version with 2.0 statement.

In the NAL subject area, the JVT group agreed to adopt the Start Code and Bitstream Syntax Method (from JVT-B063):
• Use MPEG-2/4 start code prefix: 0x00, 0x00, 0x01.
• Use MPEG-4 visual style of byte alignment stuffing to achieve byte string payload, but flip 0 & 1 values so last byte of payload is never zero (add between 1 and 8 bits of value ‘1000…’).
• Payload carried after start code prefix, starting with packet type indicator (not emulating MPEG-2 Systems IDs).
• Types include configuration information, random access point, picture, each slice content type, sequence end?, supplemental enhancement information at each level.
• Encoder emulation prevention method: Search in payload for any string of value 0x00, 0x00, 0x01 or 0x00, 0x00, 0xFF and insert a byte of value 0xFF between the second and third byte (average expansion factor is 0.00001% for random input data).
• Decoder side: Whenever it finds 0x00, 0x00, 0xFF, it removes the 0xFF. Whenever it finds 0x00, 0x00, 0x01, it declares a next start code detection.
• At the end of payload, remove the last ‘1’ bit and all trailing zeros if any – the remainder is payload prior to alignment padding.
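The adopted emulation prevention and alignment rules can be sketched as follows. This is a minimal illustration written from the bullet points above, not code from JVT-B063; the function names are hypothetical, and the bit string representation used for the alignment padding is an assumption made for readability. For uniformly random bytes, the chance that a given byte follows two zero bytes and equals 0x01 or 0xFF is (1/256)² × 2/256 ≈ 1.2 × 10⁻⁷, which is consistent with the quoted expansion of about 0.00001%.

```python
START_CODE = b"\x00\x00\x01"


def escape_payload(payload: bytes) -> bytes:
    """Encoder: insert 0xFF before any 0x01 or 0xFF that follows two zero bytes."""
    out = bytearray()
    zeros = 0  # consecutive 0x00 bytes most recently written
    for b in payload:
        if zeros >= 2 and b in (0x01, 0xFF):
            out.append(0xFF)  # emulation prevention byte
            zeros = 0
        out.append(b)
        zeros = zeros + 1 if b == 0x00 else 0
    return bytes(out)


def unescape_payload(escaped: bytes) -> bytes:
    """Decoder: drop the 0xFF that follows two zero bytes."""
    out = bytearray()
    zeros = 0
    for b in escaped:
        if zeros >= 2 and b == 0xFF:
            zeros = 0  # inserted byte, not payload
            continue
        if zeros >= 2 and b == 0x01:
            raise ValueError("start code found inside payload")
        out.append(b)
        zeros = zeros + 1 if b == 0x00 else 0
    return bytes(out)


def pad_bits(bits: str) -> bytes:
    """Append a '1' and then zeros up to the byte boundary (1 to 8 bits added)."""
    padded = bits + "1"
    padded += "0" * (-len(padded) % 8)
    return int(padded, 2).to_bytes(len(padded) // 8, "big")


def unpad_bits(data: bytes) -> str:
    """Remove the last '1' bit and any trailing zeros."""
    bits = "".join(f"{b:08b}" for b in data)
    return bits[: bits.rindex("1")]


raw = bytes([0, 0, 1, 0, 0, 0xFF, 0, 0, 0, 1, 5])
escaped = escape_payload(raw)
print(escaped.hex(" "))
```

Because the padding always ends with a ‘1’ bit inside the last byte, the payload never ends in a zero byte, so trailing payload zeros cannot merge with the zeros of the next start code prefix.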

High-Level Syntax

JVT-B017 (T. Suzuki, Sony Corp.) summarizes the discussions of the AHG on GOP syntax. The report mentions issues such as start codes (addressed elsewhere), FLC versus VLC coding of headers (no specific proposals here, possibly an NAL issue), and the general issue of relationship to NAL and VCL. It recommends study of the GOP issue in relation to VCEG-O53, VCEG-N52 and VCEG-N72.

JVT-B041, Simple Definition of GOP for Random Access (M. Hannuksela, Nokia), proposes signaling of independently decodable GOPs in the slice header structure. It also discusses the relation of independently decodable GOPs and the operation of the multi-picture buffer for motion compensation. Furthermore, it clarifies how the interim JVT file format (VCEG-O58, December 2001, CSR 12.48) supports random access. It reviews the needs of random access as 1) for file storage access, and 2) streaming. It advocates the needs of random access as being:
• HRD resync
• Multi-picture buffer reset
• Possible gap in picture numbers (indicating that gap is intentional and not a sign of loss)
• I or SI frame (alternative proposed in JVT-B063)
• Absolute time reference
• File manipulation (cut/paste) capability

JVT-B041 proposes to segment the stream into independently decodable GOPs:
• First picture number = 0 in each GOP
• Start of GOP flag when picture number is zero

Why not allow a start of GOP flag in every picture? JVT-B041 says it will take more bits, and has a potential error resilience impact if there is loss of the first picture of the GOP. Why not allow random access points at P or B pictures? JVT-B041 says consider the loop filter impact across MB border boundaries. Solutions include avoiding border areas for subsequent MC, or a way to turn off the loop filter for some MBs or slices. The contribution provides support for random access in file format: file header box identifies random access points, segment box contains absolute time reference, random access positions not aligned with the start of the file header clump can be identified from slice header structures which are included in the alternative track box. The Intra picture flag in the file format is obsolete for random access purposes, thus it is not sufficient.

JVT-B041 advocates “complete decoder reset” random access point capability – simple random access capability. There is no real conflict between this and the JVT-B063 advocacy of other random access capability with eventual perfect recovery. The group thought it was good to have a sense of the type of reset.

JVT-B041 notes that complete decoder reset enables cut/paste edit capability. In discussion it was asked whether wrap-around of picture numbers should be prohibited. It was noted that the modulus can be set in the current design. Note that wrap-around is accompanied by a flag indicating wrap. A comment indicated a distaste for fields in the slice header that are sometimes there and sometimes not – how about taking care of GOP start indication at parameter set level or a different packet/slice type?

On packet type indication, JVT-B041 notes that it should try to avoid packet start emulation of the MPEG-2 system start code. Perhaps it should use an emulation prevention byte there or avoid values of MPEG-2 systems start codes for the packet type indicators. There is support for having a complete decoder reset capability. The group requested further detail on exactly what to adopt. JVT-B109, Report Random Access and Time, was later provided and reviewed.

JVT-B069, Group of Pictures for JVT codec (T. Suzuki, N. Oishi, Y. Yagasaki, Sony Corp.), notes that the GOP is proposed (in VCEG-O53) to support random access. Based on the discussion in the AHG, this contribution proposes incorporating the GOP with the H.26L design. JVT-B069 advocates having a random access capability such that:
1) It is able to quickly and easily identify random access point (at a high level in syntax),
2) It has absolute timing,
3) It can identify whether there is a complete decoder reset at the random access point,
4) The random access syntax seems to reside at NAL, with a need to synchronize with VCL content.
These four aspects were generally supported.

JVT-B069 proposes the following syntax: 25-bit IEC 461 timecode (NTSC and PAL), complete decoder reset indicator (does not think all random access points need complete decoder reset), and broken link editing indicator. The proposal notes that there is a need to address the issue of prior references: it proposes pre-roll count and initialization delay. Note: The HRD/VBV management issue is to be addressed at editing/random-access points to enable splicing locations. A comment was made that “Parameter set” is currently defined as stream-level information, not synchronous information – the group needs to be clear on the meaning of these terms.
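The report does not give the field layout of the proposed 25-bit timecode. As a sketch only, the code below assumes the layout of the 25-bit time_code in the MPEG-2 video GOP header (drop-frame flag 1 bit, hours 5, minutes 6, marker bit 1, seconds 6, pictures 6); whether JVT-B069 uses exactly these fields is an assumption, and the function names are hypothetical.

```python
def pack_timecode(hours: int, minutes: int, seconds: int, pictures: int,
                  drop_frame: bool = False) -> int:
    """Pack a timecode into 25 bits using the assumed MPEG-2 GOP header layout."""
    if not (0 <= hours < 24 and 0 <= minutes < 60
            and 0 <= seconds < 60 and 0 <= pictures < 60):
        raise ValueError("timecode field out of range")
    value = int(drop_frame)            # 1 bit
    value = (value << 5) | hours       # 5 bits
    value = (value << 6) | minutes     # 6 bits
    value = (value << 1) | 1           # marker bit, always 1
    value = (value << 6) | seconds     # 6 bits
    value = (value << 6) | pictures    # 6 bits
    return value


def unpack_timecode(value: int):
    """Inverse of pack_timecode; returns (hours, minutes, seconds, pictures, drop_frame)."""
    pictures = value & 0x3F
    seconds = (value >> 6) & 0x3F
    minutes = (value >> 13) & 0x3F
    hours = (value >> 19) & 0x1F
    drop_frame = bool((value >> 24) & 1)
    return hours, minutes, seconds, pictures, drop_frame


code = pack_timecode(1, 2, 3, 4)
print(f"{code:025b}", unpack_timecode(code))
```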

JVT-B042 (M. Hannuksela, Nokia) contains a proposal which enhances the concept of independently decodable GOPs presented in JVT-B041 so that disposable chains of pictures (called sub-sequences) can be easily identified and disposed of. Such a disposal property may be advantageous in streaming servers, for example. It includes a UEP (Unequal Error Protection) based on Temporal Scalability, which is proposed for “streaming” (IP-based distribution) and “storage.” It includes a simple addition to the interim file format. The MPEG-4 file format may already support it. Modification to the decoder multi-picture handling is needed. The proposal is to modify decoder multi-picture buffer handling so that intentional picture disposal is correctly handled. At the transport level, the proposal is to send the bits only as a single layer. Study is needed to see if this impacts non-IP (MPEG-2) based distribution systems. S. Chen (Broadcom) noted (verbally) IPR from Sarnoff. TELES AG may also have IPR; S. Wenger was requested to formally submit an IPR statement. (Post meeting note: S. Wenger followed up by email on Feb. 14, 2002, indicating no such IPR and thus no need for an additional IPR statement.)

Transform Coding and Quantization

JVT-B008, AHG Report: Transform and Quantization (L. Kerofsky, Sharp Labs), summarizes the three differing proposals of low complexity transform/quantization design in the view of the AHG chair. No significant coding performance difference has been demonstrated for the current quantization range. Differences in transform and quantization implementations are discussed. The group agreed to change the Luma DC transform to the Hadamard transform.

Two original approaches were taken in the design of the main 4x4 transform: a reduced complexity TML definition proposed by Texas Instruments and Sharp, and multiply free algorithms proposed by Nokia, Microsoft, and FastVDO. Three different 4x4 main transforms are proposed: TML, Nokia/Microsoft, and FastVDO. The Nokia/Microsoft and FastVDO definitions both have efficient multiply free implementations but differ in the details of their multiply free algorithms. FastVDO’s original proposal was based on a lifting approach that has minimal increase in dynamic range and enables an exact inverse. For the 9-bit residual application they have modified their definition so the lifting technique is no longer used. The AHG reported the following conclusions:

• The transform proposals differ in their definition of main 4x4 transform and associated quantization.
• All transform definitions have 16-bit matrix multiply definitions with the same complexity.
• The multiply free implementation of TML requires more than three times as many adds and 6 times as many shifts. In the current description, it is not clear the multiply free implementation can be used for the forward transform.
• Definitions differ somewhat in the memory needed. The TI numbers are subject to change if the quant range is increased or periodic quantization is included.

           TI          Nokia/Microsoft   FastVDO
Decoder    96 Bytes    128 Bytes         128 Bytes
Encoder    128 Bytes   224 Bytes         224 Bytes

• If extension to finer quantization is required this appears to be a deciding factor.
• Discussion was avoided of support for greater bit depth; however Nokia/Microsoft and FastVDO appear to have a path to supporting greater bit depth. With Nokia/Microsoft the only change which appears necessary is defining an extended quantization range. With FastVDO, the transform would be replaced with the lifting based definition. The quantization would only be modified by extending the quantization range.

From the discussion the following points were noted:
• How was complexity compared? Three suggested primary H/W architectures: general-purpose DSP, ASIC, microprocessor.
• 16-bit matrix multiply possible in all three proposals.
• Shift-add possible with all: harder with TI version, may not be possible for encoder with TI/TML design.
• Quantization requirements: TI’s TML version does not need normalization on coefficient basis in the absence of a quant matrix (e.g., 3 or more different normalizations), other two do.
• Bytes versus shorts for some memory values when looking at amount of memory needed.
• Periodic structure to quantization suggested, need QP/M and QP%M for modulus M – used to reduce memory requirement for the multi-norm proposals or to extend quant range (can be used in all three proposals).
• Significant interest expressed in finer quantization (Sony Pattaya proposal and others) down to, e.g., QP = -12.
• Greater bit depth: Extensions in some proposals.
• Two main comparison points suggested within current operating parameter range: Multiply free and quantization complexity. Memory might also be considered here, although it is a multidimensional issue.
• Explorations and discussion areas beyond the current operating range:
  – Greater bit depth (how to handle dynamic range)
  – Finer quantization
  – Coarser quantization
  – No quantization
  – Use with quantization matrices
  – Consistency of design with larger block sizes
• QP-dependent normalization has an issue with quant matrix use. It was asserted that QP-dependence can be eliminated. This should be verified when considering quant matrix use.

JVT-B031 (M. Zhou, TI) describes the updated TI proposal on 16-bit based transform and quantization. The patent statement provided has para. 2.2.1 checked. The existing TML transform and quantization is maintained; the only change is that normative scaling factors are introduced to enable 16-bit implementation. The proposal was tested on the H.26L test set and no quality loss was reported with respect to the 32-bit solution. The extensions to finer quantization scales, quantization matrices and greater bit depth are also discussed in this document.

From the discussion the following points were noted:
• This proposal downscales the forward transform and the inverse transform (with rounding). Scale down B() reconstruction matrix to 16 bit (if unsigned, otherwise 17). Normative rounding with downshift after each transform dimension. Rounding downshift for inverse quantization.
• Normative intermediate downscaling prohibits nonseparable inverse transform.
• Rounding at each stage of downscaling requires extra adds in the matrix multiply implementation.
• Extension to coarser quantization may be a problem. However, periodic quantization solution could be adopted to address that if coarser quantization capability is needed.
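The cost of a rounding downshift, and why it matters where the rounding happens, can be sketched generically. The helper name and the shift amounts below are hypothetical and are not taken from JVT-B031; the sketch only shows that rounding after each stage can give a different result from rounding once at the end, which suggests why the intermediate rounding has to be specified normatively.

```python
def round_shift(value: int, shift: int) -> int:
    """Divide by 2**shift with rounding: one extra add before the shift."""
    if shift == 0:
        return value
    return (value + (1 << (shift - 1))) >> shift


# Hypothetical example: a total downscale by 4, done in two ways.
staged = round_shift(round_shift(5, 1), 1)  # round after each stage
single = round_shift(5, 2)                  # round once at the end
print(staged, single)
```

A decoder that merged the two stages into one operation would therefore not be guaranteed to match a decoder that rounds after each transform dimension.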

JVT-B038, Transform and Quantizer - part 1: Basics (A. Hallapuro, M. Karczewicz, Nokia; H. Malvar, Microsoft), proposes an alternative set of transforms that have significantly lower computational complexity than those in the current TML draft standard. While the transforms in the draft call for 32-bit arithmetic, the transforms presented here can be computed in 16-bit arithmetic, for 9-bit residual input data. Furthermore, the proposed transforms are multiplier-free; they require only additions and a minimum number of shifts. This document also proposes a simplified quantization structure, which reduces the size of the quantization tables. Impact on performance due to these changes is negligible – typically less than 0.02 dB distance between the R/D curves for TML and the transform/quantization proposed here.

From the discussion the following points were noted:
• The proposal uses periodic quantization with three normalization factors in transform. Inverse transform coefficients are +1, -1, ½. Forward (example) transform uses +1, -1, 2 factors.
• Proposal says intra with very low QP is the critical condition for testing coding efficiency performance; test results were shown.
• Asserted to be as simple as possible for multiply-free implementation without Hadamard, but this may not be correct.
• Decoder can be kept fully within 16 bits with 9-bit residual (even for intermediate values).
• Only rounding is in final reconstruction.
• Memory: transform 192 bytes, for quant matrices 288, 144, or 54 bytes.
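A multiplier-free forward transform built only from the factors +1, -1 and 2 can be sketched as follows. The specific matrix used here is an assumption: it is the 4x4 integer transform later standardized in H.264, and the report itself states only the factor set, not the matrix. The butterfly uses additions and two one-bit shifts per 1-D transform, and is checked against a direct matrix multiply.

```python
# Assumed forward matrix (the form later used in H.264; not quoted in the report).
FORWARD = [
    [1, 1, 1, 1],
    [2, 1, -1, -2],
    [1, -1, -1, 1],
    [1, -2, 2, -1],
]


def forward_1d(x):
    """Shift-add butterfly: 8 additions and 2 shifts, no multiplies."""
    s0, s1 = x[0] + x[3], x[1] + x[2]
    d0, d1 = x[0] - x[3], x[1] - x[2]
    return [s0 + s1, (d0 << 1) + d1, s0 - s1, d0 - (d1 << 1)]


def forward_1d_matrix(x):
    """Reference: plain matrix-vector product."""
    return [sum(c * v for c, v in zip(row, x)) for row in FORWARD]


def forward_2d(block):
    """Apply the 1-D transform to the rows, then to the columns."""
    rows = [forward_1d(r) for r in block]
    cols = [forward_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


# Worst-case 9-bit residual block for the coefficient with the largest gain.
signs = (1, 1, -1, -1)
worst = [[255 * sy * sx for sx in signs] for sy in signs]
peak = max(abs(v) for row in forward_2d(worst) for v in row)
print(peak, peak < 2 ** 15)
```

With this matrix the largest absolute row sum is 6, so a 9-bit residual (magnitude up to 255) produces coefficients of magnitude at most 255 × 36 = 9180, which fits in a signed 16-bit value.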

JVT-B039, Transform and Quantizer - part 2: Extensions (A. Hallapuro, M. Karczewicz, Nokia; H. Malvar, Microsoft), is an extension of the operating range of JVT-B038 in various ways. Para. 2.2.1 is checked on the patent disclosure form. JVT-B039 considers 11 and 13 bit residual data, extended quant range, weighting matrices, and exact invertibility.

From the discussion the following points were noted:
• Extension to greater bit depth has no impact on decoder transform or inverse quantization (with 16-bit multiplies, but 32-bit memory access). Could add another downshift in encoder for higher bit depth if a 16-bit computation in encoder is wanted.
• Extended quant range in either direction (due to use of QP periodic structure) results in no change. Results shown for very small QP (down to -12). (Dequant matrices are a multiple of 4.) If adopting the extended range, then the design should use same multiple of 4 design for both cases.
• It doesn’t matter how much one extends the quant range due to periodic quantization structure in the proposal.
• Extension to exact invertibility: a small change to transform achieves invertibility.
• Quantization matrix weighting design shown using Sony method of QP offset.
• Cross-verified with independent implementations. Software was available before the Pattaya meeting in December 2001.
• A quant period of 6 is used in the contribution.
• Example method of encoding shows more than a 16-bit downshift; however no impact is expected if a smaller shift is used.
• Example method of encoding shows downscale by 1/16 for 11 & 13 bit extension. There is no expected impact unless QP extended is very small (down to QP=0 shows no impact).
• The available demo also showed no difference.
• This proposal requires calculating QP%6 and QP/6. It may be done by one multiply and shift per macroblock when QP changes, or change the period to power of 2, or store the numbers.
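The last point can be illustrated with a short sketch. The multiplier 43 and the shift of 8 are this example's own choice of constants (43/256 is slightly above 1/6), not values from JVT-B039, and the sketch assumes a non-negative QP index; a range that extends below zero would first need an offset. A stored table is shown as the third option mentioned.

```python
PERIOD = 6


def split_qp(qp: int):
    """Reference: quotient and remainder of QP for a quantization period of 6."""
    return qp // PERIOD, qp % PERIOD


def split_qp_mul_shift(qp: int):
    """One multiply and one shift instead of a division (valid for 0 <= qp <= 125)."""
    quotient = (qp * 43) >> 8
    return quotient, qp - quotient * PERIOD


# Third option from the discussion: store the numbers in a table.
QP_TABLE = [split_qp(qp) for qp in range(64)]

for qp in (0, 5, 6, 31, 63):
    print(qp, split_qp(qp), split_qp_mul_shift(qp), QP_TABLE[qp])
```

Since QP changes at most once per macroblock, any of the three variants adds little work; the choice mainly affects table memory and whether a multiplier is needed.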

The proposal in JVT-B103, FastVDO’s Unified 16-Bit Transform/Quantization (J. Liang, T. Tran, W. Dai, P. Topiwala, FastVDO), builds on contributions submitted to the Austin (April 2001) and Santa Barbara (Sept. 2001) meetings. FastVDO notes that they conditionally support para. 2.2.1 in the patent statement. JVT-B103 describes a transform structure, and proposes that JVT adopt the structure, with encoded specification of values of coefficients to use. Several example transforms are shown.

For a variety of applications, especially for use in wireless devices with limited power, memory, and CPU capacity, it is desirable to have a fixed, low bit-depth integer codec implementation (e.g., 16 bits). Integer-only codecs are also needed if a fully lossless mode is required. On the other hand, high-quality applications have other needs, such as support for higher-bit data, quantization matrices, or even adaptive block transforms. Yet the greatest value of a standard is that it can enable interoperability, as well as the reusability of content. For high-rate film/TV content to be reusable for wireless streaming, for example, it is necessary that the structures used in the high-rate application work well in the low-complexity mode of the wireless application. FastVDO’s approach to transform/quantization proposes to support these requirements.

A recursive structure is shown for larger block-size transform, in which the larger block size transform includes the smaller block size transform as an intermediate stage. It enables three types of transform methods: direct matrix multiply, multiply-free direct structure, and structure with lifting. These do not get exactly the same results (e.g., look at the need for intermediate rounding for the proposed specific transform), and there is a need to shift in some cases to compensate for scaling.

The proposal advocates separating the design of quantization from the design of transform, and focuses on the transform. Periodic quantization, for example, can be applied.

From the discussion the following points were noted:
• Properties available from the family proposed: Exact invertibility, shift-add implementation capability, low bit expansion, equivalent three types of implementation (described above).
• It was asserted that low bit expansion of the proposed design provides advantage at small quant values, especially for greater bit depth due to less need for downward scaling to avoid dynamic range expansion.
• Intra results were requested; the results provided were mixed IPPPPP. IPPPPP means one "I" or Intra picture, followed by a series of "P" or "predicted" or "Inter" pictures.
• The results used software which was not provided. Results show quality loss in some cases, and quality gain in other cases.
• The software presented uses R-D quant for the proponent’s method, not for others. It uses divisions in quantization.
• Lossless case expands the data considerably. However, some transforms in the family have been tested with reasonable lossless compression capability.
• What to use for the structure outside of the transform (quantization and inverse quantization)? A definition is needed of what is to be used if adopting something along these lines.
• Proposed transform called X5 was used for the results shown. It was proposed to adopt X5 with p=7/16, u=3/8 as a starting point and work on downloadable structure capability and quantization/inverse-quant definition.
• How about adopting the Nokia/MS as part of framework to extend upon? Exact inversion and tight bit expansion limits are drawbacks of that transform. Norm value and the need for rounding in X5 matrix multiply implementation was noted.
• Each transform needs an associated quantization method. Thus there would be a finite number of transforms.
• The recursive structure is not so sensitive in performance to exactly which numbers are used, so is there a use for that kind of flexibility? Response: The proponents responded that it is probably mostly true in common conditions at 4x4 size, but if extended to higher bit depth and higher resolution (motivating larger block size), there may be a need for more flexibility.
• The quality comparison to conventional 8x8 can get as close to DCT performance as desired, depending on particular choice within family.
• For quantization and transform as well, perhaps transmit the parameter values or pick a finite set and select among them.
• This contribution is mostly an extension beyond the current scope proposal rather than a replacement within current scope proposal. The proponent advocates creating an activity to study the extension need beyond current range and study transform needs in that extended scope range.

Transform Subject Summary and Results

Consideration aspects for current operating range:

Complexity: There is not a significant difference between the transform proposals, except for rounding requirements (TI has more rounding) and shift-add implementation (TI is less friendly), and need for same-result matrix multiply versus shift-add (X5 doesn’t appear to satisfy). There is some concern over complexity of using a quantization weighting matrix table with Nokia/MS and X5.

Quality: No difference was demonstrated. Follow up: someone should confirm the asserted result.
Testing: Fewer results were presented for X5; results for X5 were somewhat inconsistent, up to 0.5 dB at high rates.
Verification: Cross-verification of Nokia/MS was based on independent implementations.
S/W Availability: TI last week, Nokia/MS pre-Pattaya, X5 not yet (soon).

Consideration aspects for extensions of operating range:

• Greater bit depth (how to handle dynamic range): Family approach has less dynamic range problem, Nokia/MS has same inverse transform for 16-bit multiply with 32-bit memory access with greater bit depth, but would need alteration of forward transform; don’t know about TI.
• Finer quantization: Nokia/MS tested to -12, no obvious problem.
• Coarser quantization: TI would need some design change.
• No quantization: Separate design.
• Use with quantization matrices: All compatible.
• Consistency of design with larger block sizes: Family approach offers extension – and Nokia/MS is part of the family.

The group concluded that, based on the overall considerations described above and the current state of maturity and the desire to make a decision as the CD state approaches, the JVT-B038 proposal (Nokia/Microsoft) was recommended for adoption. Future proposals of extensions should be able to build upon this design. The family approach proposed in JVT-B103 (FastVDO) is considered compatible with this decision and further investigation is recommended.

The group recommends consideration of adjustment of the periodic structure to a period of 8 rather than 6 in the course of ad hoc interim work (assuming no difficulties are found in doing so) and extending the range of step size values by one or two factors of two in each direction and adjusting common conditions so that the fidelity range of common conditions remains essentially the same.
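One practical attraction of a power-of-two period is that the quotient and remainder of QP reduce to a shift and a mask. The sketch below is an illustration under that reading, not a description of an adopted design; the function name is hypothetical.

```python
PERIOD_SHIFT = 3                     # period of 8 = 2**3
PERIOD_MASK = (1 << PERIOD_SHIFT) - 1


def split_qp_pow2(qp: int):
    """Quotient and remainder of QP for a period of 8, using shift and mask only."""
    return qp >> PERIOD_SHIFT, qp & PERIOD_MASK


for qp in (-12, -1, 0, 7, 8, 31):
    print(qp, split_qp_pow2(qp))
```

In Python the shift and mask follow floor-division semantics for negative values as well, so an extended range such as QP = -12 needs no special handling here; in languages where right-shifting a negative integer is implementation-defined, an offset would be the safer choice.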

JVT-B114 (S. Sun, Sharp) is Sharp’s IPR Statement on “Transform and Quantizer - part 1: Basics,” referring to JVT-B038. It cites clause 2.2, with a relaxation clause stating “To support a royalty-free baseline we follow ITU-T 2.2.1 for Baseline Profile applications agreeing to a free license on condition that all other patent holders do the same.”

JVT-B051, Improved transform coding for inter-frame (Y. Yamada, S. Sekiguchi, Y. Moriya, K. Sugimoto, K. Asai, Mitsubishi), discusses DCT and HAT (Hadamard Transform). The DCT (Discrete Cosine Transform) has been used as the orthogonal transformation in many video coding standards (e.g., H.261, MPEG-1, H.262/MPEG-2, H.263 and MPEG-4). For transforming a video signal, it is known that the performance of DCT is almost equivalent to the KL (Karhunen-Loeve) transform, which is derived as the optimum transformation. DCT is effective especially when the block size is 8x8, as adopted by conventional coding systems.

As is well known, when the block size is 2x2, only HAT is an orthogonal transformation. Therefore, it is assumed that the optimum orthogonal transformation differs with the block size.

In the JM (Joint Model), a 4x4 block size is used for the DCT. Motion compensated prediction by adaptive block size (16x16, 16x8, 8x16, 8x8, 8x4, 4x8 and 4x4) is also used. This technique shows high coding efficiency, and the power of the signal becomes very small. It is assumed that the characteristics of the transformed video signal (motion compensated video signal) are different from the case of the conventional coding method.

JVT-B051 replaces the current 4x4 integer DCT by a 4x4 Hadamard Transform in inter-frame, and shows its advantage considering the balance between complexity reduction and coding efficiency. The Hadamard method was tested for inter frames as a complexity reduction. Para. 2.2 is checked on the patent disclosure form.
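The complexity argument can be seen in a sketch of a 4x4 Hadamard transform, which needs only additions and subtractions. The row ordering used here is this example's own choice; the report does not say which ordering JVT-B051 uses, and the function names are hypothetical.

```python
def hadamard_1d(x):
    """4-point Hadamard butterfly: 8 additions/subtractions, no shifts or multiplies."""
    a, b = x[0] + x[1], x[2] + x[3]
    c, d = x[0] - x[1], x[2] - x[3]
    # Rows: (1,1,1,1), (1,1,-1,-1), (1,-1,-1,1), (1,-1,1,-1)
    return [a + b, a - b, c - d, c + d]


def hadamard_2d(block):
    """Apply the 1-D transform to the rows, then to the columns."""
    rows = [hadamard_1d(r) for r in block]
    cols = [hadamard_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


block = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
twice = hadamard_2d(hadamard_2d(block))
print(twice[0])  # the original row scaled by 16
```

With this ordering the matrix is symmetric and its rows are mutually orthogonal, so applying the 2-D transform twice returns the input scaled by 16; the inverse is therefore the same butterfly followed by a 4-bit downshift.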

From the discussion the following points were noted:
• The proposal is to include a Hadamard versus DCT switching flag in high-level syntax, possibly even at the macroblock level.
• When TML 9.0 software was used, with common conditions (one reference picture) there was little performance penalty (usually within 0.1 dB, sometimes better).
• DCT was used for I MBs in P pictures.
• Open issues noted: Do we need multiple transforms? How to perform switching (sequence, picture, slice, MB)? Criteria, efficiency with other prediction tools (1/8 pel, multi-frame, B-picture) which improve prediction, VLC optimization and scanning order are also open issues. There are no results yet of what a switching flag would achieve.
• For intra there would be a quality penalty, potential for chroma and other visual problems. It was noted that this may work well mostly because of good prediction. Subjective quality is a key open question.
• This is also a special case of the family proposal and its selection criteria for when to use different transforms.
• How was quantization handled (normalization issue)?
• How much complexity reduction relative to JVT-B038?

The group recommended further investigation in the Transform AHG.

A weighting matrix and extension of QP range were proposed in Pattaya (VCEG-O52, December 2001, CSR 12.48). The extension of quantization to support high quality video was also proposed in Pattaya, with the weighting matrix and extended quantization table. In JVT-B067, Quantization Tools for High Quality Video (T. Suzuki, P. Kuhn, Y. Yagasaki, Sony Corp), the results of experiments are shown. JVT-B067 also discusses how to integrate this proposal with the low complexity transform, which is based on 16-bit arithmetic. The patent disclosure form indicates para. 2.2.

From the discussion the following points were noted: A weighting matrix using the QP offset approach within the existing A() and B() matrices, with clipping of the QP index range, should not cause overflow problems. DC fidelity was emphasized (DC only adjustment, for example). Note that there is interaction between the weighting matrix and inverse quantization. (This is addressed in JVT-B039.) A question was raised about encoder-only techniques in comparison to use of a weighting matrix. Suggestions included bringing test results especially for high quality video, extension of quantizer range to -8 (roughly equivalent to step size of 1 in spatial domain), and the use of a periodic quantization method.
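The QP offset approach with clipping mentioned in the discussion can be illustrated with a minimal sketch. This is not the JVT-B067 design: the offset table, the default QP limits (0 to 31) and the function name are assumptions chosen only for the example.

```python
def effective_qp(base_qp, offsets, qp_min=0, qp_max=31):
    """Return per-coefficient QP indices, each clipped to [qp_min, qp_max].

    offsets is a 4x4 table of hypothetical QP offsets, one per transform
    coefficient position; the limits are assumed, not taken from the report.
    """
    return [
        [max(qp_min, min(qp_max, base_qp + off)) for off in row]
        for row in offsets
    ]


# Hypothetical weighting: DC keeps the base QP, higher frequencies are coarser.
OFFSETS = [
    [0, 1, 2, 3],
    [1, 2, 3, 4],
    [2, 3, 4, 5],
    [3, 4, 5, 6],
]

qp_grid = effective_qp(28, OFFSETS)
for row in qp_grid:
    print(row)
```

Because every index is clipped before it is used, the per-coefficient QP can never leave the range the quantization tables were designed for, which is the property the discussion relied on when stating that overflow should not occur.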

JVT-B067 also reports a bug: TML 9.4 software crashes when encoding complex scenes with small QP. The probable issue is allocation of buffer in software that assumes no data expansion. Increasing the buffer size fixes the problem. M. Karczewicz (Nokia) volunteered to fix the problem during integration of JVT-B038.

The group expressed interest in further investigation, reporting results in May, and building on the new JVT-B038 design with consideration of the noted issues and interaction with other transform-area investigations.

Transform Size

JVT-B014 (U. Benzler, Robert Bosch GmbH) is the Ad Hoc Report for Transform Size. The AHG work will continue in hopes of achieving a 5-10% gain, especially at high rates for high-resolution sequences.

JVT-B053, ABT Coding for Higher Resolution Video (M. Wien, RWTH Aachen), presents the concept of Inter and Intra Adaptive Block Transforms (ABT). Results are given for interlaced ITU-R 601 source material (the Interlaced CE test set from VCEG-O59), both for Inter-only and Inter&Intra ABT coding. The simulations reveal an improved performance of the proposed scheme, with gains of more than 0.4 dB or 8.5% bit rate savings on average (at high rates, gains of about 1.0 dB or 15% bit rate savings can be observed).

JVT-B065 (U. Benzler, Robert Bosch GmbH) presents results of the Adaptive Block Transforms (ABT) for interlaced ITU-R 601 video source material, for both Inter-only and Inter&Intra ABT coding. Para. 2.2.1 on the patent statement is checked. The simulations reveal an improved performance of the proposed scheme compared with the TML (~0.4 dB PSNR gain / ~8% bit rate saving), validating the results presented in JVT-B053. Simulations were performed according to interlace conditions (CCIR sequences). The sequences are different from the common test set. Coding gains of typically 5-10% are reported. Most gain was at high bit rates. A D-1 tape was available for the viewing of results at this meeting, but logistical difficulties and scheduling constraints prevented the viewing of the demo. The proponent claimed visible improvement on CCIR sequences. The effect of the larger transform is mainly on larger picture formats. For intra coding, the added complexity is due to the search of more modes.

There is no VLC solution proposed in this contribution. Discussion indicated that the complexity increase should be quantified. The proponents consider this a tool for “interlace profile” – or higher profile. The group was concerned as this is a considerable change. However, the group finds this is a useful tool for high resolution interlace material. Complexity considerations indicate that the tool should not be included in all profiles. The tool should be considered when defining profiles.

The software has been available for some time, and cross-verification of results has been performed, including independent implementation with bitstream exchange. The group recommended considering tentative adoption with follow-through on harmonization with the 16-bit architecture 4x4 transform and addition of VLC capability. If adopted, it should be possible to switch off the variable size feature and use only 4x4. It was noted that adoption is tentative and the features should be switchable. The proposal needs integration with the 16-bit transform architecture (e.g., 16 bit 8x8 transform), and needs UVLC definition (requiring test results check) for higher than CIF resolution and interlace application – considered non-baseline.

Robust Transmission

JVT-B024, Coding Performance not using MV Prediction (S. Wenger, TELES AG), comments on the efficiency of the in-slice prediction mechanisms of H.26L. At a slice size of one single macroblock, and when deducting the header overhead for the slices, a bit rate overhead of 0.6 to 22% was observed. This document was submitted for information. Investigation was conducted only for UVLC and not for CABAC. The authors were encouraged to do the tests with CABAC.

JVT-B027 (S. Wenger, TELES AG; M. Horowitz, Polycom) contains a proposal for a new video coding layer based error resilience tool called Scattered Slices. Para. 2.2.1 of the patent declaration is checked. When used over packet lossy links and augmented with an appropriate error concealment technology, it greatly enhances reproduced picture quality at high packet loss rates, with a small amount of side information. In Scattered Slices, the macroblock ordering in the picture differs from that found in regular raster-scan ordered slices. The penalty for coding macroblocks in an order different from raster-scan order is reported to be less efficient entropy coding, because the in-picture prediction mechanisms (in particular the motion vector and intra-pixel prediction) will, in general, not work as efficiently. However, the total overhead incurred by not taking advantage of in-picture prediction is reported to be normally less than 10% (see JVT-B024 for details).
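To make the idea of a non-raster macroblock ordering concrete, the following sketch assigns macroblocks to slices in a dispersed pattern. The mapping rule `(x + y) % num_slices` and both function names are assumptions for illustration only; the report does not describe the actual mapping used in JVT-B027.

```python
def scattered_slice_map(mb_cols, mb_rows, num_slices):
    """Return a 2-D table giving the (hypothetical) slice id of each macroblock.

    With num_slices >= 2, horizontally and vertically adjacent macroblocks
    always fall into different slices.
    """
    return [[(x + y) % num_slices for x in range(mb_cols)] for y in range(mb_rows)]


def macroblocks_in_slice(slice_map, slice_id):
    """List the raster-scan addresses of the macroblocks carried by one slice."""
    cols = len(slice_map[0])
    return [
        y * cols + x
        for y, row in enumerate(slice_map)
        for x, sid in enumerate(row)
        if sid == slice_id
    ]


# QCIF is 11 x 9 macroblocks; two slices give a checkerboard pattern.
QCIF_MAP = scattered_slice_map(11, 9, 2)
print(macroblocks_in_slice(QCIF_MAP, 0)[:6])
```

With such a pattern, the loss of one slice leaves every missing macroblock surrounded by received neighbours, which helps concealment, while the same dispersion is what weakens motion vector and intra prediction between neighbours.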

A demo was shown in the H.263 environment. This proposal is suggested for the profiles related to error prone environments with error ≥ 3% (packet loss rates). It causes ≤ 10% coding efficiency loss which may be acceptable when there are very few I macroblocks. No hard numbers were presented with regard to performance in the JVT environment and based on accepted common conditions. There is a need to compare these results with other algorithms under consideration. The group would like to see the results of experiments against JVT common conditions before accepting these results.

Several issues remain open in slice prediction: variable slice size, and slice size adaptation to MTU size. Use is targeted towards CIF or QCIF resolution; it is not recommended for sub-QCIF. The authors are encouraged to continue the work further as part of the Robustness ad hoc group. It was agreed to define a core experiment on scattered slices (to be reported in JVT-B111r1) with participation from S. Wenger (TELES), T. Stockhammer (Munich U.), C.W. Kim (McubeWorks), M. Hannuksela (Nokia), D. W. Kang (Kookmin U.) and R. Sjoberg (Ericsson).

In the Pattaya meeting (December 2001), VCEG/JVT decided to initiate a core experiment about the sub-picture coding technique presented in VCEG-O46, titled “New Image Segmentation Method.” VCEG-O57 (presented to this meeting) describes the core experiment. Objective and subjective results (demo of Coastguard, Foreman) were shown. Intra MB update rates will affect the results; however, this had not been done yet.

JVT-B040 (Y-K. Wang, TICSP; M. Hannuksela, Nokia) shows results for the sub-picture coding core experiment (VCEG-O57). The sub-picture coding method allows segmentation of the image into rectangular foreground sub-pictures and a background sub-picture. The method is reported to improve error resiliency especially when applied with unequal error protection. The simulation results are reported to show that sub-picture coding outperforms conventional TML in a multicast streaming environment.

Carphone showed some visible improvement. For Coastguard and Foreman the improvements were not significant. The expected improvement in the sharpness was not perceptible, perhaps due to the quality of the projector. Artifacts at the border of foreground and background were visible and were somewhat annoying. It was difficult to find the Region of Interest (ROI). Encoders will be more complex. How to find the ROI is not very clear. There was no consensus that it will be helpful for error resilience.

JVT-B082, Brief Result of Core Experiment on Sub-Picture Encoding (C-W. Kim, McubeWorks; S.W. Rhie, SK Telecom), provides the results of the core experiment in VCEG-O57. Objective results are included and subjective quality was demonstrated. A demo was shown of the Silent sequence with 3% packet loss. There was a visible improvement in the sharpness of the foreground. Artifacts in the background persisted longer than those in the foreground. The technology needs improvement at the sub-picture boundaries and loss in the background.

JVT-B086 (W-S. Kim, D-S. Cho, Samsung) also verifies the Nokia proposal (VCEG-O57) of sub-picture coding under the simulation environment as described in the document. No demo was shown. Comments are similar to those given above in JVT-B082.

A QP selection method was provided in VCEG-O57 to remove the visible boundary between the foreground and background sub-picture. JVT-B087, Selection of QP for Sub-Picture Coding (W-S. Kim, D-S. Cho, Samsung), proposes a more generalized QP selection method to remove the boundary effect while the efficiency of the sub-picture coding technique is maintained. It proposes a slice header syntax for sub-picture coding which is the same as proposed in VCEG-O46 except that the background QP is added for the foreground sub-picture slice. The generalized form may help in reducing the artifacts in the boundary. A demo of Carphone for the no error case was shown. There is a need to do more experiments under packet loss conditions. No conclusions were reached. The authors will provide more results at the next meeting.

The group recommended considering sub-picture technology for Version 2 and not for Version 1 for broader classes of applications and other technologies. The authors and S-W. Kim objected.

JVT-B102, Error Robust Macroblock Mode and Reference Frame Selection (T. Stockhammer, D. Kontopodis, Munich Univ. of Technology), contains two encoder test model extensions to increase the error resilience in combination with multiple reference frames. Patent disclosure para. 2.2.1 is checked. The first part restricts the selectable reference frames in the rate-distortion optimized reference frame and macroblock selection such that no pixels are used for prediction which have been intra refreshed for error resilience reasons later. In addition the rate-“expected-decoder-distortion” optimized macroblock mode selection presented in VCEG-N50 (Sept. 2001, CSR 12.37) is extended such that the reference frame is included in the optimization process. Results are presented based on the Internet test conditions. A demo was shown for Tempete with significant improvement in error resilience. No changes are needed in the syntax and decoder. Reference frame restriction may also be useful for “dirty” random access. The proposal is to add the option in the reference encoder to allow the restriction on the selection of reference frames on an MB basis and channel optimized combined MB mode and reference frame in the test model encoder. This was agreed. Software implementation is expected to be completed as soon as a time slot is made available by the software coordinator. The authors will continue to do more experiments as part of the Robustness ad hoc group.

JVT-B095 (P. Zhou, S. Chen, Y. He, Tsinghua University) proposes a set of error detection schemes using a fragile watermark for hybrid codec based video communication. The watermark schemes proposed do not embed extra bits into video, but constrain a relation between the Q-DCT coefficients. It improves the error detection rate and error correct detection rate dramatically. To take advantage of the watermark, JVT-B095 proposes standardization of the schemes. In discussion it was noted that there is a loss of PSNR of 0.3 to 0.65 dB. Error detection rate changes from 30 to 60%. The watermark description is not included. Para. 2.0 (no patents) is checked on the patent disclosure form. The group agreed that there is a need to first establish the requirement for a new scheme for error detection before this contribution is considered further.
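Since the watermark description is not included in the contribution, the following is only a hypothetical instance of "constraining a relation between quantized coefficients": the encoder forces the coefficient sum of each block to be even, and the decoder flags any block whose sum is odd. The parity rule and the function names are assumptions, not the JVT-B095 scheme.

```python
def embed_parity(coeffs):
    """Return a copy of the block whose coefficient sum is even.

    Hypothetical constraint: if the sum is odd, the last coefficient is moved
    by one level (toward zero when positive). This costs some fidelity, in the
    same spirit as the PSNR loss noted in the discussion.
    """
    block = list(coeffs)
    if sum(block) % 2 != 0:
        block[-1] += -1 if block[-1] > 0 else 1
    return block


def block_looks_intact(coeffs):
    """Decoder-side check: an odd sum signals a likely transmission error."""
    return sum(coeffs) % 2 == 0


sent = embed_parity([5, 0, -2, 0])
corrupted = sent[:]
corrupted[0] += 1  # simulate a one-level error in a single coefficient
print(block_looks_intact(sent), block_looks_intact(corrupted))
```

A parity constraint of this kind detects any error that changes the sum by an odd amount and misses the rest, which illustrates why such schemes raise the detection rate without reaching 100%.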

JVT-B076 (Y. Kikuchi, T. Chujoh, T. Nagai, Toshiba) proposes a new frame type called “R-picture” for quick recovery from error (e.g., packet loss). Patent disclosure para. 2.2 (patents available for licensing) is checked. The motion compensated reference frame is intra coded in the R-picture and multiplexed into the stream. The server sends R-pictures only when an error occurs; the code-stream sent to the client for the error free case stays the same as the normal code-stream. Therefore, the size of the sent code-stream does not increase, unlike periodical intra refresh. This scheme is applicable for streaming service since it does not use a feedback to the encoder. In addition, combination with SP pictures enables perfect drift-free reconstruction. This is a new idea. It is applicable only in the system with back channel. The question was raised whether it can only be used with SP pictures. More experiments need to be done to clearly show the impact of R-pictures. What is the impact of SI pictures? The authors were requested to compare their approach with other previous investigations done by VCEG including VCEG-M38 (R. Kurceren, M. Karczewicz, Nokia, April 2001). VCEG-M38 provides a comparison between the coding efficiency of SP pictures (proposed in VCEG-L27r1, January 2001, CSR 12.09) and S-frames.

Interlaced Coding and Progressive/Interlace Interaction

JVT-B010 (P. Borgwardt, VideoTele.com; L. Wang, Motorola; L. Winger, VideoLocus) is the AHG Report on Interlaced Coding. Testing resulted in the following:
• Tempete, Mobile favored Frame Coding – In general, pictures with low motion favored frame coding.
• Bus, Football, Canoa and Rugby favored Field Coding – In general, fast action pictures favored field coding.

The ad hoc group concluded that:
• Frame or Field coding alone is not a good solution. Adapting is needed.
• There is a need to adapt at least at the picture level.

The ad hoc group recommended adopting option E (picture level adapting) in the standard in this meeting. Issues left open by the ad hoc group:
• Direct mode – Details about block selection need to be made (tentatively settled)
• Macroblock level coding – details and CABAC issues

JVT-B020 (M. Gallant, L. Winger, G. Côté, VideoLocus) summarizes results on the portion of the interlaced coding core experiment, defined in VCEG-O59, conducted by VideoLocus. Experiments were done for core experiments A and B. Examples:
Rugby: field coding gives up to 2.5 dB (20 to 40%) gain over frame coding.
Tempete: frame coding gives up to 1 to 3 dB (50 to 75%) gain over field coding.

JVT-B071, Adaptive Frame/Field Coding for JVT Video Coding (L. Wang, K. Panusopone, R. Gandhi, Y. Yu, A. Luthra, Motorola), presents the computer simulation results for core experiments A (frame coding), B (field coding) and E (picture level adaptive coding) for interlace coding. Para. 2.2 of the patent disclosure form is checked. The results for Experiments A and B agree with those of JVT-B020. In addition to the sequences described in the core experiments, the Coastguard sequence was also used. Tempete, Coastguard and Mobile & Calendar favored frame coding. Example: For Tempete, frame coding gives a gain of up to 2+ dB over field coding. Rugby, Football (and other fast motion sequences) favored field coding. Example: For Rugby, field coding gives gains up to 2.5 dB over frame coding. The simulation results demonstrate the advantages of adaptive coding over frame or field coding. Picture level adaptive coding adapts to the better of the two modes.
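Picture level adaptation, as described, amounts to coding each picture both ways and keeping the better result. A minimal sketch follows, assuming that "better" is judged by a Lagrangian rate-distortion cost; the cost formula, the numbers and the function names are illustrative assumptions and are not taken from JVT-B071.

```python
def rd_cost(distortion, bits, lam):
    """Lagrangian cost J = D + lambda * R (assumed selection criterion)."""
    return distortion + lam * bits


def choose_picture_mode(frame_result, field_result, lam):
    """Pick 'frame' or 'field' for one picture from two trial encodings.

    Each result is a (distortion, bits) pair produced by a trial encode.
    Ties go to frame coding.
    """
    frame_cost = rd_cost(*frame_result, lam)
    field_cost = rd_cost(*field_result, lam)
    return "frame" if frame_cost <= field_cost else "field"


# Made-up trial results for a low-motion and a high-motion picture.
low_motion = choose_picture_mode((1000.0, 500), (900.0, 800), lam=0.5)
high_motion = choose_picture_mode((2400.0, 900), (1500.0, 1000), lam=0.5)
print(low_motion, high_motion)
```

The cost of this approach is two trial encodings per picture, which is one reason the decision is made at the encoder only and the decoder merely reads the chosen mode.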

Discussion noted that direct mode is a useful mode for field pictures. Figure 3 of JVT-B071 illustrates a way of doing it when the vectors are taken from the same parity fields, i.e., by scaling up the short vector. Another approach is to take the vector from the nearest field and scale down a long vector. It is not a complexity issue. It is not clear if this will give any performance gain (VCEG-N84, VCEG-O40). No experiments were done or reported.

The group’s recommendation about Direct mode is that it is a relatively small issue. The recommendation is to adopt the method described in JVT-B071. Results of comparison of the methods in JVT-B071 and VCEG-N84/VCEG-O40 were requested; JVT will determine at the May meeting whether the decision needs to be modified.

JVT-B068r1, New Interlace Coding Tools (K. Sato, T. Suzuki, Y. Yagasaki, Sony Corp.), proposes new tools to support interlaced video. Para. 2.2 of the patent disclosure form is checked. New scanning and an MC interpolation filter are proposed to improve coding efficiency. This new scanning proposal is applicable to the frame coded sequences. It gives a small gain for frame picture types:

All     1.7%
IP      1.8%
IBBP    1.3% (Tempete)

There is a need to demonstrate larger gain for acceptance.

The MC Interpolation filter identified a good issue related to chroma phase shifting, and has good potential to reduce visible distortion in chroma. However, a core experiment will be needed to come to a conclusion. The group’s recommendation is to define the core experiment and include it as a part of the core experiments in the Interlaced AHG. Y. Yagasaki, U. Benzler, and K.Y. Yoo agreed to define the core experiments. The authors were encouraged to bring a demonstration of the results, as they may be more visible than what SNR numbers can tell.

JVT-B106 (L. Wang, R. Gandhi, K. Panusopone, Y. Yue, A. Luthra, Motorola) describes MB-level adaptive frame/field coding for interlaced video materials, and presents the performance comparisons with frame, field and picture-level adaptive frame/field coding. Para. 2.2 of the patent disclosure form is checked. MB-level adaptive frame/field coding is aligned with core experiment C for interlace testing. MB-level adaptive coding provides additional gain over picture-level adaptive coding. This shows good potential for the sequences where the pictures have mixed motion types (large and small) within a picture. More test sequences are needed with these characteristics; a suggestion was to use Akiyo with Crowd. The interaction with the deblocking filter needs to be further defined. The group’s recommendation was to continue to do the core experiments to gather more evidence with regard to the gain, to justify the added complexity, and to investigate how it interacts with CABAC.

The JVT group agreed to adopt frame/field adaptation at the picture level (both text and software), and to continue to do the core experiments for MB level adaptation for possible inclusion at the May meeting if results are positive.

JVT-B048, Supporting Film Mode in JVT Codec (S. Chen, J. Alvarez, S. MacInnis, Broadcom), describes an approach to deal with coding of film contents vs. 3:2 pull-down operations. Para. 2.2 of the patent disclosure form is checked. The issue is that film is at 24 frames/sec. Many displays run at a rate of ~60 fields/sec. How should 24 frames/sec be matched to a display rate that could be different? MPEG-2 inserts two flags, top_field_first and repeat_first_field, at the picture layer. The proposal in JVT-B048 includes:

• Define frame_rate and add a frame_rate flag at the sequence layer. Discussion – is it better to define Delta T (picture clock frequency)? A contribution is needed with details. (See also JVT-B063, above.)

• Define display_rate information in Sequence layer.

• Add two flags in “Picture display Extension”: film_mode_flag and film_mode_state (similar to MPEG-2).

Whatever solution is determined, the issue of Transcoding from MPEG-2 to JVT must also be taken into consideration. A general comment is that there needs to be a distinction between the display and decoding process. The display process is beyond the scope of the JVT standard.
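For background, the MPEG-2 mechanism referred to above can be sketched as follows: each coded film frame carries top_field_first and repeat_first_field, and the display expands them into two or three fields. The function name and the tuple representation are assumptions made for the example; the flag semantics follow the usual MPEG-2 3:2 pull-down cadence, not the syntax proposed in JVT-B048.

```python
def pulldown_fields(flags):
    """Expand per-frame (top_field_first, repeat_first_field) flags into fields.

    Returns a list of (frame_index, parity) pairs in display order.
    """
    fields = []
    for index, (top_field_first, repeat_first_field) in enumerate(flags):
        first, second = ("top", "bottom") if top_field_first else ("bottom", "top")
        fields.append((index, first))
        fields.append((index, second))
        if repeat_first_field:
            # The first field is shown again, giving this frame three fields.
            fields.append((index, first))
    return fields


# Classic 3:2 cadence over four film frames: 3, 2, 3, 2 fields.
CADENCE = [(True, True), (False, False), (False, True), (True, False)]
fields = pulldown_fields(CADENCE)
print(len(fields), fields[:5])
```

Four film frames become ten fields, so 24 frames/sec maps to 60 fields/sec, and the field parity keeps alternating across frame boundaries because top_field_first flips after every three-field frame.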

Profiles and Levels

JVT-B007, AHG Report: Profiles & Applications (D. Lindbergh, Polycom), is based on comments on VCEG-O14 (Profile Framework for H.26L, D. Lindbergh, Polycom, December 2001 [CSR 12.48], described in VCEG-O07), further discussions on the JVT reflector in the Profiles & Apps Ad Hoc, and private comments. This revised proposal attempts to address all the comments that were made, and proposes solutions for some of the issues that were brought up.

JVT-B025, On Equivalence of BERs and Packet Loss Rates (S. Wenger, TELES AG), proposes a (better) definition for the “Error Limits” section of a profile/level framework document, as made available in the Pattaya ad hoc report on Profile and Level Definitions. It tries to lay common ground for the discussion of error resilience features in the profile/level contents. Theoretical considerations lead to a formula for the conversion of packet loss rates to bit error rates. A definition for the Error Limit column of the table in VCEG-O07 is developed. Finally, higher packet loss rates are proposed; however, the bit error rates would have to be reduced according to the formula. The group agreed to tentatively accept the general outline at a high level for the discussion related to error prone environments. Feedback was requested from experts about specific numbers / details provided in the document. It will be considered for inclusion as an informative part of WD-2, in the part of the Profile and Level section that deals with error prone environments.

JVT-B023 (D. Lindbergh, Polycom; R. Koenen, InterTrust Technologies) contains remarks about JVT Profiling. It is built on past experience in MPEG and ITU with creating widely accepted interoperability points. It provides some of the ground rules, as the authors believe they should be established and adhered to. Ground rules are in the following areas:
• Number of interoperability points should be as low as possible.
• Definition of levels should be simple and flexible.
• Encoder complexity and memory requirements should be included in the profiles.
• Tools must be in the profiles.

JVT-B035 (D. Lindbergh, Polycom) offers an updated framework for the JVT codec Profiles and Levels, based on VCEG-O14 from the Pattaya meeting, discussions in Pattaya (see VCEG-O07), and comments in the Profiles and Applications Ad Hoc group since then. The high-level goals for the framework remain the same.
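The report does not reproduce the conversion formula from JVT-B025. A common textbook relation, shown here only as an assumption, treats bit errors as independent and counts a packet as lost when any of its bits is in error; the function names and the 100-byte packet size are illustrative.

```python
def packet_loss_rate(ber, packet_bytes):
    """Packet loss rate for independent bit errors: 1 - (1 - BER)^(8 * bytes)."""
    return 1.0 - (1.0 - ber) ** (8 * packet_bytes)


def bit_error_rate(plr, packet_bytes):
    """Inverse relation: the BER that yields the given packet loss rate."""
    return 1.0 - (1.0 - plr) ** (1.0 / (8 * packet_bytes))


# A 3% loss rate on 100-byte packets corresponds to a BER of a few 1e-5.
ber_for_3_percent = bit_error_rate(0.03, 100)
print(ber_for_3_percent)
```

Under this model the same packet loss rate implies a lower tolerable bit error rate as packets grow, which is consistent with the remark that bit error rates would have to be reduced when higher packet loss figures are adopted.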

Performance Evaluation

JVT-B060 (J. Boyce, Thomson Multimedia) provides experimental results about the coding efficiency of various numbers of reference frames (1, 2, 3, and 5 frames) with quarter and eighth pel motion vector resolution in TML 9.0. Using two reference frames achieves on average 62% and three reference frames 83% of the five reference frame coding gain. The use of multiple reference frames requires additional memory at the decoder, but has a relatively small addition to computational complexity. This contribution is informational to assist in defining profiles and levels. There was a suggestion to include higher resolution sequences. The plan for the Profile AHG must be re-emphasized.

Complexity

JVT-B012, AHG Report: Complexity (M. Horowitz, Polycom), covers H.26L complexity reduction work presented in and performed since the Pattaya meeting. It contains a summary of contributions related to H.26L complexity presented at the JVT meeting last December in Pattaya. Further, it contains a summary of complexity related activities occurring since Pattaya.

JVT-B030, Evaluation and Simplification of H.26L Baseline Coding Tools (M. Zhou, TI), evaluates the coding efficiency of major H.26L baseline coding tools, namely 16x16 intra prediction, multiple reference frame prediction, motion vectors of size below 8x8, Hadamard transform in ME and intra prediction decision, RD-optimization, cost-function based zero block decision, and loop filter. The sequences chosen differ from the JVT test set and it was pointed out that most of them show low activity. A remark was made regarding the use of VQEG sequences for testing. All experiments were done without the loop filter. When the loop filter is included, for this test set, at QP=20, it is shown that the gain of H.26L over MPEG-4 SP is min 20%, max 51.5%, and 35% on average. A remark was made regarding the distinction between encoder and decoder complexity.

Intra Coding

JVT-B080, More Results on New Intra Prediction Modes (G. Conklin, RealNetworks), an information document, provides additional results in support of the proposal made in VCEG-N54 (Sept. 2001, CSR 12.37). In that proposal three new diagonal modes and improvements to existing modes were proposed that achieve bit rate reductions of up to 10%. This document again shows the benefits of the proposed scheme by presenting the results generated by coding all frames of the test sequences as key frames, a suggestion made in Santa Barbara. It shows that a compression gain of 10% is achievable with minor changes to the current Intra macroblock coding method. Visual results are also contained in the document. A revised document is expected. The group agreed to adopt this proposal.

Encoding

JVT-B022, Range Decision for Motion Estimation of VCEG-N33 (M-C. Hong, Soongsil University; H.H. Oh, LG Electronics), reduces motion estimation complexity using local motion vector statistics. Para. 2.2 of the patent disclosure form is checked. Since it is well known that the motion vector of a block is highly correlated with those of its neighboring blocks, the motion vectors of the neighboring blocks are used to determine the search range of the block. Experimental results reported that an average saving of 50% in MV encoding time was obtained without loss in visual quality. The group agreed to the adoption of JVT-B022.
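The principle of deriving a search range from neighboring motion vectors can be sketched as below. The specific rule (twice the largest neighboring component, clamped between a minimum and a maximum), the default limits and the function name are assumptions for illustration; the report does not give the decision rule used in JVT-B022.

```python
def search_range(neighbor_mvs, min_range=2, max_range=16):
    """Choose a search range (in pixels) from the motion vectors of neighbors.

    neighbor_mvs is a list of (mvx, mvy) pairs from already coded blocks.
    With no neighbors available, fall back to the full range.
    """
    if not neighbor_mvs:
        return max_range
    largest = max(max(abs(mvx), abs(mvy)) for mvx, mvy in neighbor_mvs)
    # Hypothetical rule: allow twice the largest observed component.
    return max(min_range, min(max_range, 2 * largest))


print(search_range([(0, 0), (1, -1), (0, 1)]))  # calm neighborhood
print(search_range([(3, -5), (1, 0)]))          # moderate motion
```

Since the number of candidate positions grows with the square of the range, shrinking the range in calm regions is where a time saving of the reported order could come from.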

Fine-Grain Scalability

JVT-B094r1, Water Ring Scan for H.26L-based FGS (Kyunghee University, ETRI, net&tv), first discusses the case for FGS. MPEG-4 Fine Granularity Scalability (FGS) has been designed in response to the growing need for video coding methodologies for video streaming on various types of networks and bandwidths, and provides the capability to distribute a robust enhanced video bitstream at a wide range of bit rates with multiple layers. Thus, an MPEG-4 FGS decoder can accept a truncated enhancement-layer bitstream and reconstruct a quality-improved video on the basis of the partial bitstream received until then. From these facts, it appears that FGS is a highly desirable functionality for a new H.26L based video coding standard.

However, FGS is not included in the working scope of JVT, in spite of its desirable functionality, due to the tight standardization schedule of JVT Phase 1. Therefore, FGS should be considered as an important work item for JVT Phase 2 activity. JVT-B094r1 describes the limitation of the current FGS if it is directly adopted into H.26L based scalable coding methodology.

JVT-B094r1 introduces the water ring scan method as a potential technology that can improve the subjective picture quality of a decoded scalable video. No IP statement is attached. It shows some improvements over the current methods being discussed. The authors will continue the work and bring the results to the next meeting. The discussion will continue in the MPEG ad hoc group.

JVT Ad Hoc Committees

Ad HocCommittee

Chair Charter

JVT ProjectManagement

G. Sullivan (Microsoft) To further the work on the JVT project as a whole, includingproject planning, work coordination, and status review.

Editing andSoftware

T. Wiegand (HHI), co-chair K.Sühring (HHI)

To further the work on the documentation and softwareimplementation of the joint model design, includingincorporation of modifications as approved by the group, andto rapidly provide improved software for group use in futureexperiments and for eventual approval as standardizedreference software.

Deblocking P. List (Deutsche Telekom) To investigate quality and complexity issues for loop filterdesign in the JVT codec and to assess the potential forimproved visual quality, reduced decoder computationalcomplexity, and enhanced design simplicity. In particularthis ad hoc group should consider the relationship of thedeblocking filter design to interlaced-scan video and thepotential usefulness of turning off loop filtering around theboundaries of slices.

VLC G. Bjøntegaard (RealNetworks) To study the potential for improvement of the VLC designfor JVT video, including particular consideration of contextbased VLC coding for efficiency improvement.

CABAC D. Marpe (HHI) To study the improvement of CABAC with regards to rate-distortion performance and complexity.

MultiframeMotion Prediction

M. Schlockermann (Matsushita) To finalize B-picture syntax taking into account JVT-B057 , and to study other multiframe motion predictionaspects of the JVT design.

GMVC/GMC H. Kimata (NTT), co-Chair: J.Lainema (Nokia)

To study techniques for inclusion of global motion vectorcoding and global motion compensation, particularlyincluding consideration of the designs described in JVT-B046 and JVT-B019 and the use of MB skip at predictor.

AdditionalTransforms andQuantizationMethods

M. Wien (RWTH Aachen) To investigate adaptive block transforms, study the transformfamily approach with adaptive transform specification asdescribed in JVT-B103, study the use of quantizationweighting matrices, consider Hadamard transformapplicability, and to study other issues surrounding thedesign of the transform and quantization/inverse-quantizationaspects of the JVT design.

Interlace P. Borgwardt (Videotele.com), L.Wang (Motorola)

To study and complete the core experiments on adaptiveframe/field macroblock coding and chroma phase distortionand to study other aspects relating to the design of JVT videowith respect to the coding of interlaced-scan video content.

MotionInterpolation

T-I. Johansen (Tandberg), T.Wedi (Univ. of Hannover)

To study the design of the motion compensationinterpolation processing in the JVT design, includingconsideration of adaptive motion interpolation andconsideration of the rate-distortion-complexity tradeoffs inmotion interpolation design.

Page 33: COMMUNICATIONS STANDARDS REVIEWThe JVT chair and co-chairs are G. Sullivan (Microsoft), T. Wiegand (Heinrich Hertz Institute), and A. Luthra (Motorola BCS). JVT-B001 reports the results

COMMUNICATIONS STANDARDS REVIEW

February 19, 2002 Vol. 13.03 Copyright © CSR 2002 33

NAL and High-Level Syntax Y-K. Lim (net&tv), T. Stockhammer (Munich Univ. of Technology), M. Hannuksela (Nokia) To study the carriage of JVT bitstreams over various transport systems. To study harmonization of the NAL concept in the JVT design and the SL concept in MPEG-4 Systems. To identify the common aspects and to recommend methods for the use of JVT bitstreams with RTP, H.32x, MPEG-2 Systems, and within the MPEG-4 file format.

Complexity M. Horowitz (Polycom) To study the implementation complexity of the JVT codec design and to recommend methods of minimizing that complexity in terms of encoder and decoder computational and implementation complexity and design simplicity.

Robustness M. Horowitz (Polycom) To consider aspects of the JVT design in regard to robustness to lost data, including particular consideration of scattered slices error concealment and error robust macroblock mode and reference frame selection, and to define and conduct the core experiments in this area.

Film Mode / Timing Information G. Sullivan (Microsoft), S. Chen (Broadcom) To study issues of timing (e.g., capture and presentation) and film mode video in relation to the JVT design.

Profiles, Levels, and Applications D. Lindbergh (Polycom) To study the applications of the JVT codec and the appropriate methods of addressing these applications with profiles and levels of the JVT codec design, including particular emphasis on the design of a baseline profile.

Buffering E. Viscito (Globespan/Virata) To study the needs of the JVT with respect to the video buffering verifier / hypothetical reference decoder design and to consider its impact for a variety of applications.

CSR’s Fully Searchable CDs

CSR CDs are indexed for machine searching (Adobe Acrobat). They are very useful for researching technical issues as well as for prior-art searches. Your company’s patent or legal departments may also find these CDs useful.

Twelve Year CD: all CSR reports from 1990 through 2001 on one CD
$2,400 to non-subscribers; subscribers receive a $200 discount for each year of subscription during 1990 – 2001

Quarterly CDs: 3 months of CSR reports on each CD, in an annual subscription
$695 to non-subscribers but only $200 as an add-on to current subscriptions

Annual CDs: 12 months of CSR reports on a CD for each calendar year 1990 to present
$695 to non-subscribers, $200 to current subscribers


JVT Meeting Roster, Jan. 29 - Feb. 1, 2002, Geneva, Switzerland

Gary Sullivan, Microsoft, JVT Rapporteur | Chair
Thomas Wiegand, Heinrich Hertz Institute, JVT Associate Rapporteur | Co-Chair
Ajay Luthra, Motorola BCS, JVT Associate Rapporteur | Co-Chair
Host: ITU

Lowell Winger, VideoLocus Inc (Canada)
Miska Hannuksela, Nokia Corporation
Frederic Loras, France Telecom
Nathalie Laurent, France Telecom R&D
Yann Le Maguet, Philips (France)
Marc Legrand, Philips
Anne Lorette, Thomson Multimedia
Peter List, Deutsche Telekom
Detlev Marpe, Heinrich Hertz Institute
Heiko Schwarz, Heinrich Hertz Institute
Thomas Wiegand, Heinrich Hertz Institute
Martin Schlockermann, Matsushita/Panasonic
Thomas Stockhammer, Munich University of Technology
Ulrich Benzler, Robert Bosch GmbH
Mathias Wien, RWTH Aachen
Gero Bäse, Siemens AG
Stephan Wenger, TELES AG
Thomas Wedi, University of Hannover
Natan Peterfreund, Harmonic Inc.
Roberto Flaiani, AETHRA
Yoshinori Suzuki, Hitachi, Ltd.
Koichi Takagi, KDDI Corp.
Shinya Kadono, Matsushita/Panasonic
Satoshi Kondo, Matsushita/Panasonic
Shun-ichi Sekiguchi, Mitsubishi Electric Corporation
Yoshihiro Miyamoto, NEC
Hideaki Kimata, NTT
Satoru Adachi, NTT DoCoMo, Inc.
Sadaatsu Kato, NTT DoCoMo, Inc.
Teruhiko Suzuki, Sony Corp.
Yoichi Yagasaki, Sony Corp.
Takeshi Chujoh, Toshiba
Yoshihiro Kikuchi, Toshiba
Won-Sik Cheong, ETRI
Dong Wook Kang, Kookmin University
Byeong-Moon Jeon, LG Electronics Inc.
Yoon-Seong Soh, LG Electronics Inc.
Chul-Woo Kim, McubeWorks Inc.
Young-Kwon Lim, net&tv Co., Ltd.
Angelo Yong-Goo Kim, onTimetek Inc.
Sang-Wook Kim, Samsung Electronics Co. Ltd.
Woo-Shik Kim, Samsung Electronics Co. Ltd.
Shi Hwa Lee, Samsung Electronics Co. Ltd.
Hae-Kwang Kim, Sejong University
Yung Lyul Lee, Sejong University


Kyeong-Joong Kim, Serome Technology (Korea)
Sang Hee Lee, SK Telecom
Sang Woo Rhie, SK Telecom
Min-Cheol Hong, Soongsil University
Byeungwoo Jeon, SungKyunKwan Univ / Serome Technology
Joon-Ho Song, VaroVision (Korea)
Kook-Yeol Yoo, Yeungnam University
R.J. van der Vleuten, Royal Philips Electronics N.V.
Gisle Bjøntegaard, RealNetworks
Tom-Ivar Johansen, Tandberg
Marek Domanski, Poznan University of Technology
Teck-Wee Foo, Matsushita/Panasonic
S.M. Shen, Matsushita/Panasonic
Rickard Sjöberg, Ericsson Radio Systems
Pierre-André Probst, Swisscom SA
Mike Nilsson, BTexact Technologies
Leszek Cieplinski, Mitsubishi Electric
Barry Andrews, 8x8, Inc. (formerly Netergy Networks, Inc.)
Hsi-Jung Wu, Apple Computer, Inc. (USA)
José Roberto Alvarez, Broadcom
Sherman (Xuemin) Chen, Broadcom
Pankaj Topiwala, FastVDO LLC
Eric Viscito, GlobespanVirata
Jordan Isailovic, JRI Technology
Gary J. Sullivan, Microsoft Corp.
Feng Wu, Microsoft Corp.
Faisal Ishtiaq, Motorola
Ajay Luthra, Motorola
Limin Wang, Motorola
Marta Karczewicz, Nokia
Jani Lainema, Nokia
Chun-Jen Tsai, PacketVideo Corp.
Michael Horowitz, Polycom Inc.
Dave Lindbergh, Polycom Inc.
Greg Conklin, RealNetworks
Arturo A. Rodriguez, Scientific-Atlanta, Inc.
Louis Kerofsky, Sharp Labs of America
Minhua Zhou, Texas Instruments Inc.
Jill Boyce, Thomson Multimedia
Peter Borgwardt, VideoTele.com
Qunshan Gu, Vweb Corporation


Acronym Definitions

AAC Advanced Audio Coding
ABT Adaptive Block Transform
AC Arithmetic Coding
AHG Ad Hoc Group
BD Bjøntegaard Delta
BDBRS Bjøntegaard Delta Bit Rate Savings
BD-PSNR Bjøntegaard Delta PSNR
CABAC Context-based Adaptive Binary Arithmetic Coding
CCIR Comité consultatif international des radiocommunications
CD Committee Draft
CE Core Experiment
CIF Common Intermediate Format
CPU Central Processing Unit
DC Direct Current (steady state)
DCT Discrete Cosine Transform
EOB End of Block
FFS For Further Study
FGS Fine Granularity Scalability
FLC Fixed Length Codeword
GMC Global Motion Compensation
GMVC Global Motion Vector Coding
GOB Group of Blocks (H.261, H.263)
GOP Group of Pictures
H/W Hardware
HRD Hypothetical Reference Decoder
I Intra (JVT)
ID Identification
IEC International Electrotechnical Commission
IETF Internet Engineering Task Force
IMTC International Multimedia Teleconferencing Consortium
IP Internet Protocol (IETF)
IPR Intellectual Property Rights
ISO International Organization for Standardization
ITU International Telecommunication Union
ITU-T ITU Telecommunication Standardization Sector
JM Joint test Model (JVT Group)
JPEG Joint Photographic Experts Group
JTC Joint Technical Committee
JVT Joint Video Team (MPEG Video + ITU-T Q6/16)
JWD Joint Working Draft (JVT Group)
LSB Least Significant Bit
MB Macroblock
MC Motion Compensation
MCP Motion Compensated Prediction
MPEG Moving Picture Experts Group (ISO/IEC)
MTU Maximum Transmission Unit
MV Motion Vector
NAL Network Adaptation Layer
NTSC National Television System Committee
P Predicted (JVT)
PAL Phase Alternating Line


PES Packetized Elementary Stream
PSNR Peak Signal to Noise Ratio
QCIF Quarter CIF
QP Quantization Parameter (H.262)
R&D Research and Development
R-D Rate Distortion
RFC Designation for an IETF Standard
RTP Real-time Transport Protocol (IETF)
RVLC Reversible Variable Length Codes
SAD Sum of Absolute Differences
SEI Supplemental Enhancement Information
SG Study Group (ITU)
SI Still Image
SMPTE Society of Motion Picture and Television Engineers
SNR Signal to Noise Ratio
SP Switchable-P [frames]
S/W Software
TML Test Model
ToR Terms of Reference
TV Television
UTC Coordinated Universal Time
UVLC Universal Variable Length Codeword
VBV Video Buffering Verifier
VCEG Video Coding Experts Group
VCL Video Coding Layer
VCV Video Complexity Verifier
VLC Variable Length Coding
VLCD Variable Length Coding and Decoding
VPM Voice Privacy Mask
VQEG Video Quality Experts Group
WG Working Group
XML Extensible Markup Language

The CSR Library
Subscribers may order copies of documents shown in boldface type from Communications Standards Review, where not controlled. $50.00 for the first document in any order, $40.00 for the second, and $25.00 for each additional document in any order. Volume discounts available. Please contact CSR.

Documents listed with © are controlled documents. These documents are not for sale, but we can provide you with the author’s contact information. ITU and ETSI meeting documents are also not for sale, but we can provide you with the author’s contact information.

We have a large library of standards work in process and can help you locate other information you may need.

CSR recommends that you obtain published standards from Global Engineering Documents. Tel: 800 854-7179, +1 303 792-2181, Fax: +1 303 397-7935, http://global.ihs.com


Communications Standards Review Copyright Policy

Copying of individual articles/reports for distribution within an organization is not permitted, unless the user holds a multiple copy license from CSR. The single user electronic version may be mounted on a server whose access is restricted both to a single organization and to one user at a time. You are welcome to forward your single user electronic copy (deleting it on your system) to another user in your organization. CSR offers an Intranet subscription which permits unlimited copies to the subscribing organization.

Year 2002 Standards Committee Meeting Schedules
Please see the updated calendar at http://www.csrstds.com/mtgs.html.

Visit the CSR Web Pages: http://www.csrstds.com
The Web Pages include an on-line store (order subscriptions and reports), an updated Telecom Acronym Definitions list, updated meeting schedules, background material on telecom standards and CSR (the company), data sheets on both CSR technical journals, and more.

Communications Standards Review regularly covers the following committee meetings:

TIA
TR-30 Data Transmission Systems & Equipment

ITU-T
SG9 Cable Networks & Transmission
SG15 WP1 Network Access
SG15 WP2 Network Signal Processing
SG16 Multimedia

ETSI
AT Access and Terminals
TIPHON Voice over Internet
TM6 Transmission & Multiplexing

DSL Forum
xDSL, Access Technologies

Communications Standards Review (ISSN 1064-3907) reports are published within days after the related standards meetings. Publisher: Elaine J. Baskin, Ph.D. Technical Editor: Ken Krechmer. Subscription Manager: Denise Hylen Lai. Copyright © 2002, Communications Standards Review. All rights reserved. Subscriptions: $795.00 per year worldwide, electronic format; $995.00 paper format. Corporate Intranet subscriptions (Corporate license for unlimited copies) are $2,150.00. Submit articles for consideration to: Communications Standards Review, 757 Greer Road, Palo Alto, CA 94303-3024 USA. Tel: +1-650-856-9018. Fax: +1-650-856-6591. e-mail: [email protected]. Web: http://www.csrstds.com. 13097

