Freescale, the Freescale logo, AltiVec, C-5, CodeTEST, CodeWarrior, ColdFire, C-Ware, mobileGT, PowerQUICC, StarCore, and Symphony are trademarks of Freescale Semiconductor, Inc., Reg. U.S. Pat. & Tm. Off. BeeKit, BeeStack, CoreNet, the Energy Efficient Solutions logo, Flexis, MXC, Platform in a Package, Processor Expert, QorIQ, QUICC Engine, SMARTMOS, TurboLink and VortiQa are trademarks of Freescale Semiconductor, Inc. All other product or service names are the property of their respective owners. © 2010 Freescale Semiconductor, Inc.

Designing and Implementing H.264/SVC on the Multicore MSC815x and MSC825x StarCore DSPs

June, 2010

Yaniv Klein, Team Leader, Video Software

FTF-NET-F0559

Introduction

►The computational effort of DSP applications is constantly increasing as bandwidth and data rates increase.

►At the same time, processors are reaching frequency limits because of stringent power constraints.

►Multiprocessing is one approach that enables high-complexity applications while keeping power requirements relatively low.

►We will use an H.264/SVC video codec as an example of such a high-complexity application.

Agenda

►H.264/SVC Overview

►Challenges

►MSC8256 StarCore DSP overview

►MSC8256 StarCore DSP advantages

►Conclusions

H.264/SVC Overview

H.264 Overview

►H.264 is a video compression standard defined by the ITU-T (jointly with ISO/IEC MPEG), intended to provide good quality at substantially lower bit rates than previous standards (H.263, MPEG-2, MPEG-4 Part 2).

►The complexity of H.264 is higher due to the introduction of new coding tools such as:

• In-loop filter (deblocking).
• Quarter-pixel motion compensation with 6-tap interpolation.
• Enhanced intra prediction modes.
• Motion vectors for blocks as small as 4x4 pixels.

►Moreover, H.264 supports HD resolutions such as 720p (1280x720) and 1080i/p (1920x1080).
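
To give a feel for the per-pixel cost of the 6-tap interpolation listed among the coding tools above, here is a minimal sketch of half-sample luma interpolation along one row. The taps (1, -5, 20, 20, -5, 1) with a divide by 32 are the H.264 luma half-sample filter; the function names and the edge handling (repeating the border sample) are assumptions made only for this illustration.

```python
# Sketch of H.264-style half-sample luma interpolation along one row.
# Function names and edge padding are illustrative assumptions.
TAPS = (1, -5, 20, 20, -5, 1)  # the six taps sum to 32


def clip_u8(value):
    """Clamp a filtered value to the 8-bit sample range."""
    return max(0, min(255, value))


def half_pel_row(samples):
    """Return the half-sample values between samples[i] and samples[i + 1].

    The border sample is repeated twice on each side so that every
    output position has six input samples.
    """
    padded = [samples[0]] * 2 + list(samples) + [samples[-1]] * 2
    out = []
    for i in range(len(samples) - 1):
        window = padded[i:i + 6]          # samples[i - 2] .. samples[i + 3]
        acc = sum(t * s for t, s in zip(TAPS, window))
        out.append(clip_u8((acc + 16) >> 5))  # round, then divide by 32
    return out
```

Each half-sample costs six multiply-accumulates plus rounding and clipping, and quarter-sample positions are then obtained by averaging neighboring values, which is part of why motion compensation weighs heavily on the DSP load.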

H.264/SVC Overview

► H.264 Scalable Video Coding (SVC) is an extension of H.264/AVC that provides 3 scalability options:

• Temporal scalability: different frame rates.
• SNR/Quality/Fidelity scalability: different video quality.
• Spatial scalability: different video resolutions.

► The ITU standard has been in force since November 2007.

► Advantages:

• Encodes multiple "streams" in a single stream, so different consumers can be served with no additional effort.

• More efficient than simulcast (encoding each stream separately).

• Error resilient: because the stream carries much redundant information, lost data has little effect on quality.

Normal Stream

Scalable Stream

[Figure: a scalable stream plotted over time, showing the base layer together with quality, temporal, and spatial enhancement layers.]

SVC scalability types

► Temporal scalability

► Spatial scalability

► Quality scalability

Temporal scalability

►The referencing structure allows complete frames to be discarded without harming decoding of the remaining frames.

►Prediction quality degrades because the temporal distance to the reference frames is larger. Bit allocation between temporal layers is also not trivial.

►Requires adaptation of motion estimation.

Frame:  0  1  2  3  4  5  6  7  8
Layer: T0 T3 T2 T3 T1 T3 T2 T3 T0
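
The layer pattern shown above (T0 T3 T2 T3 T1 T3 T2 T3 T0 for frames 0 to 8) can be reproduced with a few lines of code. This is a sketch under the assumption of a dyadic hierarchy with a power-of-two group size of 8; the function names are hypothetical and not taken from any codec API.

```python
def temporal_layer(frame_index, gop_size=8):
    """Temporal layer of a frame in a dyadic hierarchy.

    Assumes gop_size is a power of two; frames at multiples of gop_size
    form layer T0.
    """
    max_layer = gop_size.bit_length() - 1        # 3 when gop_size is 8
    pos = frame_index % gop_size
    if pos == 0:
        return 0
    trailing_zeros = (pos & -pos).bit_length() - 1
    return max_layer - trailing_zeros


def frames_to_keep(num_frames, highest_layer, gop_size=8):
    """Indices of the frames that remain when layers above highest_layer are dropped."""
    return [i for i in range(num_frames)
            if temporal_layer(i, gop_size) <= highest_layer]
```

Dropping layer T3 keeps every second frame and halves the frame rate; dropping T2 as well keeps every fourth frame. The remaining frames still decode because no frame references a frame in a higher layer.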

Spatial scalability

►Spatial enhancement layers are coded like the base layer but have additional prediction options.

►Each spatial enhancement layer uses its own reference frames for temporal prediction, not those of the base layer.

Quality scalability

►Quality scalability is usually achieved by one of two methods:

• Re-quantization: the coefficients are quantized for each quality layer, and the residual between layers is transmitted.

• Scan partitioning: the coefficients are divided into groups and each group is transmitted in a different quality layer, so each additional layer enhances picture quality.

Example: scan partitioning [figure not recovered]
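
Both methods above can be sketched on a handful of transform coefficients. This is only an illustration: the uniform quantizer, the step sizes, and the group boundaries are assumptions chosen for readability, not values from the SVC standard or from the implementation described in these slides.

```python
def quantize(value, step):
    """Uniform quantizer: index of the nearest multiple of step."""
    sign = -1 if value < 0 else 1
    return sign * ((abs(value) + step // 2) // step)


def requantize_layers(coeffs, steps):
    """Re-quantization: each layer codes the residual left by the layers below.

    steps holds one quantizer step per quality layer, coarse to fine.
    Returns the per-layer levels and the reconstruction after all layers.
    """
    recon = [0] * len(coeffs)
    layers = []
    for step in steps:
        residual = [c - r for c, r in zip(coeffs, recon)]
        levels = [quantize(x, step) for x in residual]
        layers.append(levels)
        recon = [r + lv * step for r, lv in zip(recon, levels)]
    return layers, recon


def scan_partition(scanned, boundaries):
    """Scan partitioning: split scan-ordered coefficients into per-layer groups."""
    edges = [0] + list(boundaries) + [len(scanned)]
    return [scanned[a:b] for a, b in zip(edges, edges[1:])]
```

With coefficients [100, -37, 12, 3] and steps [16, 4], the base layer reconstructs [96, -32, 16, 0] and the refinement layer brings this to [100, -36, 12, 4]. With scan partitioning, a decoder that receives only the first group sees only the lowest-frequency coefficients of each block.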

H.264 Encoder Block Diagram

[Figure: H.264 encoder block diagram.]

SVC Encoder Block Diagram

[Figure: SVC encoder block diagram, drawn per dependency layer. Labeled blocks: motion estimation and block partitioning; intra prediction; intra/inter selector (Sel); transform (T), quantization (Q), scan, and entropy coding stages producing Bitstream(Q0), Bitstream(Q1), and Bitstream(Q2) from the Q0, Q1, and Q2 residuals; inverse quantization (Q-1) and inverse transform (T-1); boundary strength and loop filter; source frame, reference frame, reconstructed frame, and reconstructed key frame selected by a "Key Frame?" decision; inter-layer intra prediction; inter-layer intra deblocking; up-sampled residual pixels and reconstructed picture from dependency layer D(n-1).]

Challenges & Possible Solutions

Challenges

►Complex software codecs currently do not fit on any single-core DSP.

►Software solutions must therefore span multiple cores or multiple devices.

►Software implementation is simpler on a single device with multiple cores than on multiple single-core devices.

►Therefore, we will focus on the challenges that arise in multicore solutions.

Challenges

Implementing the codec poses major challenges, such as:

►How to partition the codec?

►How to implement a multi-core Rate Control?

►How to parallelize Deblocking?

►How to manage task allocation?

How to partition the codec?

Two approaches can be considered:

►Functional partitioning

►Slice based partitioning

Functional Partitioning

Functional partitioning (pipelining)

►Breaking the processing based on functional stages of encoding.

►A possible partitioning is illustrated below.

[Figure: the encoder split into three pipeline stages (Stage 1, Stage 2, Stage 3). Blocks shown: motion estimation, intra prediction, diff, transform + quantize, entropy coding, inverse quantize / inverse transform, add, deblocking filter.]
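
The pipelining idea above can be sketched with one worker per stage and a queue between neighboring stages. This is a minimal illustration using Python threads in place of DSP cores; the three stage functions are placeholders, and their split of the encoder blocks is an assumption, since the stage assignment from the original diagram is not recoverable here.

```python
import queue
import threading


def run_pipeline(stages, items):
    """Run each stage in its own thread; FIFO queues carry items between stages."""
    queues = [queue.Queue() for _ in range(len(stages) + 1)]
    done = object()  # sentinel that tells a stage to shut down

    def worker(stage, inbox, outbox):
        while True:
            item = inbox.get()
            if item is done:
                outbox.put(done)
                return
            outbox.put(stage(item))

    threads = [threading.Thread(target=worker, args=(s, queues[i], queues[i + 1]))
               for i, s in enumerate(stages)]
    for t in threads:
        t.start()
    for item in items:
        queues[0].put(item)
    queues[0].put(done)

    results = []
    while True:
        item = queues[-1].get()
        if item is done:
            break
        results.append(item)
    for t in threads:
        t.join()
    return results


# Placeholder stages (hypothetical split): each one only tags the macroblock.
def predict(mb):
    return mb + ("predicted",)


def transform_quantize(mb):
    return mb + ("quantized",)


def entropy_code(mb):
    return mb + ("coded",)
```

For example, `run_pipeline([predict, transform_quantize, entropy_code], [(0,), (1,)])` returns the macroblocks in their original order with all three tags attached. The throughput of such a pipeline is set by its slowest stage, which is the load-balancing concern with functional partitioning.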

Slice Based Partitioning

Frame partitioning (slicing)

►Breaking each video frame into slices; each slice is allocated to an available resource.

[Figure: a frame divided into Slice 1, Slice 2, and Slice 3.]
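
As a rough illustration of slicing, the sketch below divides a frame's macroblock rows as evenly as possible among a number of cores. It assumes 16x16 macroblocks and slices made of whole macroblock rows; the function names are hypothetical.

```python
def mb_rows(height, mb_size=16):
    """Number of macroblock rows in a frame (height rounded up to whole MBs)."""
    return (height + mb_size - 1) // mb_size


def split_into_slices(num_mb_rows, num_slices):
    """Return (first_row, end_row) pairs, end exclusive, one per slice.

    Rows are spread as evenly as possible; the first slices take the
    leftover rows.
    """
    base, extra = divmod(num_mb_rows, num_slices)
    slices = []
    start = 0
    for i in range(num_slices):
        size = base + (1 if i < extra else 0)
        slices.append((start, start + size))
        start += size
    return slices
```

A 720p frame has 45 macroblock rows, so six cores would get 8, 8, 8, 7, 7, and 7 rows. An equal row count does not guarantee an equal load, because the cost of a slice depends on its content.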

Functional vs. Slice Based Partitioning to Tasks

Pros and Cons

Implementation
• Slice based: some stages require the entire video frame (deblocking).
• Functional: simple; maintains the normal macroblock processing order.

Timing considerations
• Slice based: almost no time dependency between processing units, but state variables need to be synchronized (rate control, quantizers).
• Functional: requires a synchronization mechanism between stages.

Scalability
• Slice based: scalable, since a frame can be divided into many slices.
• Functional: not scalable, since the functional division is naturally limited.

Load balancing
• Slice based: easier to balance, but requires a dynamic balancing mechanism between slices.
• Functional: hard to balance, since each stage's MIPS can vary significantly with the input stream.

[Figure: the functional pipeline blocks repeated for reference: motion estimation, intra prediction, diff, transform + quantize, entropy coding, inverse quantize / inverse transform, add, deblocking filter.]

Challenges - Rate Control

►Rate control: controlling the output bit rate by adjusting the quantization factor.

• How does each core perform rate control? Two options:
  - Independently: the bit budget is divided between the cores and each core handles rate control locally.
  - Master-based: a master core collects data from all slaves and updates all the slaves with the rate-control changes.

• An adaptive algorithm must be used because the input video may vary and the budget needed for each slice might change between frames.

• If a frame is sliced, the quantization factor should not differ greatly across slice edges, or the edges may become visible.

• Rate control has to take into account the conflicting constraints of the different layers.
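
Two of the points above, an adaptive per-slice budget and a limit on quantizer jumps across slice edges, can be sketched as follows. This is a simplified illustration, not the rate-control algorithm used in the implementation: the complexity measure, the proportional split, and the `max_delta` limit are assumptions. The only fixed fact is the H.264 QP range of 0 to 51.

```python
QP_MIN, QP_MAX = 0, 51  # valid H.264 quantization parameter range


def allocate_budget(frame_bits, complexities):
    """Split a frame's bit budget across slices in proportion to complexity.

    complexities could be, for example, the bits each slice needed in the
    previous frame (an assumption for this sketch). Rounding leftovers go
    to the most complex slice so the shares always sum to frame_bits.
    """
    total = sum(complexities)
    shares = [frame_bits * c // total for c in complexities]
    leftover = frame_bits - sum(shares)
    shares[complexities.index(max(complexities))] += leftover
    return shares


def smooth_slice_qps(qps, max_delta=2):
    """Limit the QP change between vertically adjacent slices.

    Each slice QP is clamped to the legal range and then to within
    max_delta of the slice above it, to avoid visible slice edges.
    """
    smoothed = []
    for qp in qps:
        qp = max(QP_MIN, min(QP_MAX, qp))
        if smoothed:
            prev = smoothed[-1]
            qp = max(prev - max_delta, min(prev + max_delta, qp))
        smoothed.append(qp)
    return smoothed
```

In a master-based scheme, the master core would run both steps once per frame using statistics collected from the slaves and then send each slave its budget and starting QP.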

Challenges – Deblocking

►The deblocking filter is applied to blocks in decoded video to improve visual quality by smoothing the sharp edges between blocks.

• Deblocking is done in raster order: left to right, top to bottom.
• Deblocking is highly serialized because the processing of each MB depends on the MB above it and the MB to its left.
• Moreover, deblocking an MB also modifies the MBs to its left and above it.
• Parallelizing it across several cores is a major challenge.

Partition options on one device with multiple cores:

• Deblocking on a single core.
• Balance the load by deblocking luma and chroma separately.
• Deblocking a partially reconstructed frame on one core while the remainder of the frame is being encoded on the other cores.

Partition options on multiple devices:

• Task allocation is critical in order to reduce data traffic: ideally each device does most of the processing on its own and transfers minimal data to other devices.
• Each device does its own deblocking, and the devices must synchronize for the blocks on the device partition border.
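
To visualize how serialized deblocking is, the sketch below computes the earliest step at which each macroblock could be filtered if the only constraint were the one stated above: the MB to the left and the MB above must be finished first. This wavefront view is not one of the partition options listed here; it is an assumed model used only to show how much parallelism the dependency leaves.

```python
def deblocking_waves(mb_cols, mb_rows):
    """Group macroblocks by the earliest step at which they can be deblocked.

    Assumes each MB at (x, y) depends only on its left and upper
    neighbors, so it becomes ready at step x + y.
    """
    waves = [[] for _ in range(mb_cols + mb_rows - 1)]
    for y in range(mb_rows):
        for x in range(mb_cols):
            waves[x + y].append((x, y))
    return waves
```

For a 1080p frame (120 x 68 macroblocks) this gives 187 sequential steps. The first and last steps contain a single macroblock each, so several cores would sit idle at the start and end of every frame, and neighboring cores would have to exchange filtered border pixels continuously.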

Challenges – Task allocation

►The distribution and control of the different tasks can be done in several ways. Examples:

• Master/Slave
  One task acts as the master task for all slave tasks.
  The master task provides a well-defined activity to the other tasks and controls the load balancing.

• Fully distributed system
  Each task is completely independent.
  Each task must make sure that its work is not done by any other task.
  Each task is responsible for updating the work status when it ends.
  Task code is basically identical.
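
The fully distributed scheme above can be sketched with a shared work board that identical tasks claim from. Python threads and a lock stand in for DSP cores and a hardware or shared-memory semaphore; `WorkBoard` and the other names are hypothetical.

```python
import threading


class WorkBoard:
    """Shared status table: which work items are claimed and who finished them."""

    def __init__(self, num_items):
        self._lock = threading.Lock()
        self._next = 0
        self._num_items = num_items
        self.done_by = {}

    def claim(self):
        """Atomically take the next unclaimed item, or None when none are left."""
        with self._lock:
            if self._next >= self._num_items:
                return None
            item = self._next
            self._next += 1
            return item

    def mark_done(self, item, task_id):
        """Update the work status when a task finishes an item."""
        with self._lock:
            self.done_by[item] = task_id


def task(board, task_id):
    """Identical code for every task: claim, process, report."""
    while (item := board.claim()) is not None:
        # The real work (for example, encoding one slice) would happen here.
        board.mark_done(item, task_id)


def run_distributed(num_items, num_tasks):
    board = WorkBoard(num_items)
    threads = [threading.Thread(target=task, args=(board, t))
               for t in range(num_tasks)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return board.done_by
```

Because claiming is atomic, no item is processed twice, and faster tasks naturally take more items, which gives dynamic load balancing without a master. In a master/slave design, the `claim` step would be replaced by the master pushing assignments to each slave.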

MSC8256 Overview

MSC8256 Block Diagram

[Figure: MSC8256 block diagram. Labeled blocks: StarCore SC3850 DSP core subsystem with 32 KB D-cache, 32 KB I-cache, and 512 KB backside L2 cache; CLASS fabric; 1056 KB shared M3 memory; 64-bit DDR-2/3 memory controller; DMA; QUICC Engine (QE) with Ethernet and two 1GE ports; two sRIO ports; PCIe; HSSI with 8-lane 3.125G SerDes (x4 groups); clocks/reset, I2C, SPI, GPIO, DUART.]

►6x SC3850 core subsystems (6 GHz / 48 GMACS in total), each with:
• SC3850 DSP core at up to 1 GHz (8 GMACS, 16-bit or 8-bit)
• 512 KB unified L2 cache / M2 memory
• 32 KB I-cache, 32 KB D-cache, WBB, WTB, MMU, PIC
• Fully programmable

►Internal/External Memories/Caches
• 1056 KB M3 shared memory (SRAM)
• Two DDR2/3 64-bit SDRAM interfaces at up to 800 MHz

►CLASS: Chip-Level Arbitration and Switching Fabric
• Non-blocking, fully pipelined, low latency
• Full fabric, 12 masters to 8 slaves, up to 512 Gbps throughput

►High Speed Interconnects
• Dual 4x/1x Serial RapidIO at 1.25/2.5/3.125 Gbaud
• PCI-e 4x/1x

►Dual RISC QUICC Engine® supporting:
• Dual SGMII/RGMII Gigabit Ethernet ports
• Ethernet protocols, Talitos control, and sRIO offload

►Ethernet
• Dual Gigabit Ethernet ports (SGMII/RGMII)

►TDM Highway
• 1024 channels, 400 Mbps, divided into 4 ports of 256

►DMA Engine
• 16 bi-directional channels

►Other Peripheral Interfaces
• SPI, UART, I2C, 32 GPIO, 16 timers, 96 KB boot ROM, JTAG/SAP, 8 WDT

►Technology
• Process: 45 nm SOI
• Voltage: 1 V core; 2.5 V and 1.8/1.5 V I/O
• Package: FC-PBGA (29x29), 1 mm pitch, RoHS

MSC8256 StarCore Multicore Advantages

MSC8256 StarCore DSP Advantages for Multicore

►The 6-core DSP provides best-in-class performance.

►M3 and DDR are fully accessible by all 6 cores.

►DMA transfers support:
• Up to 4 dimensions
• Freeze capability after each dimension
• Up to 16 DMA channels

►DMA makes it easy to transfer a full frame MB by MB, with one-time programming at the beginning of each frame's processing.

►A large L2 cache with L2 pre-fetch capabilities can replace DMA traffic in simple cases.

►Dynamic partitioning of M2/L2.

►MMU translation and virtual addressing help abstract each core's private memory: mapping the same virtual address to a different physical address on each core makes the code simpler.

►Cores can communicate easily through a non-cacheable area in M3.
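
The point above about moving a full frame MB by MB with one-time DMA programming can be illustrated by modeling a 4-dimensional transfer as nested (count, stride) pairs. This is a software model of the addressing only; the descriptor layout and function names are assumptions and do not reflect the MSC8256 DMA register interface.

```python
import itertools


def nd_addresses(base, dims):
    """Yield byte addresses for a multi-dimensional transfer.

    dims is a list of (count, stride) pairs, innermost dimension first.
    """
    counts = [c for c, _ in dims]
    strides = [s for _, s in dims]
    # itertools.product varies its last range fastest, so the ranges are
    # passed outermost first to make the innermost dimension run fastest.
    for idx in itertools.product(*(range(c) for c in reversed(counts))):
        offset = sum(i * s for i, s in zip(idx, reversed(strides)))
        yield base + offset


def mb_frame_dims(width, height, mb=16):
    """Four dimensions that walk a luma frame macroblock by macroblock."""
    return [
        (mb, 1),                  # bytes within one MB row
        (mb, width),              # rows within one MB
        (width // mb, mb),        # MBs across the frame
        (height // mb, mb * width),  # MB rows down the frame
    ]
```

With a freeze after the second dimension, such a transfer would pause after each complete macroblock, so the core could process one MB while the next one is being fetched. For a tiny 4x4 frame with 2x2 blocks, the generated order is 0, 1, 4, 5, then 2, 3, 6, 7, and so on.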

Additional MSC8256 StarCore DSP Advantages

►SRIO provides the ~20 Gb/s throughput required for moving uncompressed video between devices.

►SRIO supports one-dimensional DMA transfers that run in parallel with device processing.

►SRIO doorbells can act as interrupts between devices, providing a possible synchronization mechanism.

►SRIO broadcasting capability can help distribute data to more than one device.

►PCI Express provides ~8 Gb/s throughput if SRIO is not fully utilized.

►2 Gigabit Ethernet ports.
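
To put the ~20 Gb/s SRIO figure above in perspective, the raw bit rate of uncompressed video can be estimated with simple arithmetic. The sketch assumes 8-bit YUV 4:2:0 video (1.5 samples per pixel) and ignores protocol overhead; both are assumptions, not figures from the slides.

```python
def raw_video_bits_per_second(width, height, fps, bits_per_sample=8):
    """Raw bit rate of YUV 4:2:0 video: 1.5 samples per pixel."""
    samples_per_frame = width * height * 3 // 2
    return samples_per_frame * bits_per_sample * fps


def streams_that_fit(link_bits_per_second, stream_bits_per_second):
    """How many streams fit in a link, ignoring protocol overhead."""
    return link_bits_per_second // stream_bits_per_second
```

Under these assumptions, one 1080p stream at 30 frames per second needs about 0.75 Gb/s, so a 20 Gb/s link could in principle carry around 26 such streams, which leaves room for reference frames and reconstructed data to move between devices as well.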

Conclusions

►DSPs are steadily moving towards multicore architectures due to power constraints and increased computational effort.

►The processing requirements of HD video codecs are constantly increasing and need a multi-task approach to be fully supported.

►Today's presentation has shown that implementing an HD video solution on a multicore device requires:

• Smart partitioning and management of tasks between cores
• A powerful device to support all application needs

►The MSC8256 StarCore DSP, combined with Freescale's knowledge of multicore applications, has proven to be a compelling solution to the challenges of high-performance video processing.
