
1 introduction to dsp processor 20140919

Date post: 18-Aug-2015
Upload: hans-kuo
Page 1: 1 introduction to dsp processor 20140919

Introduction to DSP Processor

Hans Kuo

Page 2: 1 introduction to dsp processor 20140919

OUTLINE

Introduction to DSP Processor
C6000 Architecture
C6000 Memory Map
Homework 1

Page 4: 1 introduction to dsp processor 20140919

Silicon Solutions

Decision table for designers of real-time

“Choosing the Right Architecture for Real-Time Signal Processing Designs”, Leon Adams, Texas Instruments


Page 5: 1 introduction to dsp processor 20140919

Silicon Solutions

Programmability : GPP > DSP > FPGA > ASIC
Performance : ASIC > FPGA > DSP > GPP

Example : Wireless communication
  GPP : OS, Network Protocol
  DSP : A/V Codec
  ASIC, FPGA : Reed-Solomon, Viterbi decoder

Evaluating Category    ASIC  FPGA  DSP  GPP
Programmability          1     4    5    5
Development Cycle        2     3    4    5
Performance              5     5    4    2
Power consumption        4     2    2    2

GPP : general-purpose processor
DSP : digital signal processor
FPGA : field programmable gate array
ASIC : application specific IC

Page 6: 1 introduction to dsp processor 20140919

TI Embedded Processors

Microcontrollers

  16-bit : MSP430
    Ultra-Low Power
    Up to 25 MHz
    Flash 1 KB to 256 KB
    Analog I/O, ADC, LCD, USB, RF
    Measurement, Sensing, General Purpose
    $0.49 to $9.00

  32-bit Real-time : C2000™
    Fixed & Floating Point
    Up to 300 MHz
    Flash 32 KB to 512 KB
    PWM, ADC, CAN, SPI, I2C
    Motor Control, Digital Power, Lighting, Sensing
    $1.50 to $20.00

  32-bit ARM (MCU) : ARM M3/M4
    Industry Std, Low Power
    <100 MHz
    Flash 64 KB to 1 MB
    USB, ENET, ADC, PWM, SPI
    Host Control
    $2.00 to $8.00

ARM-Based

  ARM (MPU) : ARM9, Cortex A-8
    Industry-Std Core, High-Perf GPP
    Accelerators
    MMU
    USB, LCD, MMC, EMAC
    Linux/WinCE User Apps
    $8.00 to $35.00

  ARM + DSP : DaVinci, OMAP
    Industry-Std Core + DSP for Signal Proc.
    4800 MMACs / 1.07 DMIPS/MHz
    MMU, Cache
    VPSS, USB, EMAC, MMC
    Linux/Win + Video, Imaging, Multimedia
    $12.00 to $65.00

DSPs

  DSP : C647x, C64x+, C674x, C55x
    Leadership DSP Performance
    24,000 MMACS
    Up to 3 MB L2 Cache
    1G EMAC, SRIO, DDR2, PCI-66
    Comm, WiMAX, Industrial/Medical Imaging
    $4.00 to $99.00+

Page 7: 1 introduction to dsp processor 20140919

DSP Applications

Page 8: 1 introduction to dsp processor 20140919

Why do we need DSP processors?

The Sum of Products (SOP), or Multiply-Accumulate (MAC), is the key element in most DSP algorithms:

Algorithm : Equation

Finite Impulse Response Filter :
$$y(n) = \sum_{k=0}^{M} a_k \, x(n-k)$$

Infinite Impulse Response Filter :
$$y(n) = \sum_{k=0}^{M} a_k \, x(n-k) + \sum_{k=1}^{N} b_k \, y(n-k)$$

Convolution :
$$y(n) = \sum_{k=0}^{N} x(k) \, h(n-k)$$

Discrete Fourier Transform :
$$X(k) = \sum_{n=0}^{N-1} x(n) \exp\left[-j\,(2\pi/N)\,nk\right]$$

Discrete Cosine Transform :
$$F(u) = \sum_{x=0}^{N-1} c(u) \, f(x) \cos\left[\frac{\pi}{2N}\,u\,(2x+1)\right]$$
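As a rough illustration of why the multiply-accumulate dominates, the FIR equation y(n) = sum over k of a_k * x(n-k) can be sketched in C as one multiply and one add per tap. The function name `fir_sample`, the 16-bit sample type, and the choice to treat samples before x[0] as zero are assumptions made for this sketch; they are not taken from the slides.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: computes one FIR output sample
 *   y(n) = sum_{k=0}^{num_taps-1} a[k] * x[n-k]
 * Samples before x[0] are treated as zero (an assumption of this sketch). */
long fir_sample(const short *a, size_t num_taps, const short *x, size_t n)
{
    long acc = 0;                          /* running sum of products */
    assert(a != NULL && x != NULL);
    for (size_t k = 0; k < num_taps && k <= n; k++) {
        acc += (long)a[k] * x[n - k];      /* one multiply-accumulate per tap */
    }
    return acc;
}
```

Every iteration of the loop body is a single MAC, which is the operation a DSP processor is built to execute quickly.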

Page 9: 1 introduction to dsp processor 20140919

Hardware vs. Software multiplication

DSP processors are optimized to perform multiplication and addition operations.

Multiplication and addition are done in hardware and in one cycle.

Example: 4-bit multiply (unsigned).

Hardware (one cycle):

        1011
      x 1110
    --------
    10011010

Software (shift and add, five cycles):

        1011
      x 1110
    --------
        0000    Cycle 1
       1011.    Cycle 2
      1011..    Cycle 3
     1011...    Cycle 4
    --------
    10011010    Cycle 5
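The software column of the example can be sketched in C as a shift-and-add loop: one partial product per bit of the multiplier, accumulated into the result. The function name `shift_add_mul4` is hypothetical, and the sketch models only the arithmetic, not the cycle counts of any particular processor.

```c
#include <assert.h>

/* Hypothetical helper: multiplies two unsigned 4-bit values by shift-and-add.
 * Each loop iteration handles one bit of b, like one row of the
 * software column in the slide. */
unsigned shift_add_mul4(unsigned a, unsigned b)
{
    unsigned product = 0;
    assert(a < 16u && b < 16u);            /* operands must fit in 4 bits */
    for (int bit = 0; bit < 4; bit++) {
        if ((b >> bit) & 1u) {
            product += a << bit;           /* add the shifted partial product */
        }
    }
    return product;
}
```

For the slide's operands, `shift_add_mul4(0xB, 0xE)` returns 0x9A, that is 1011 x 1110 = 10011010 (11 x 14 = 154).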

Page 11: 1 introduction to dsp processor 20140919

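C6000 System Block Diagram

[Figure: C6000 block diagram. The CPU contains functional units .D1, .M1, .L1, .S1 with register file A (A0-A15), functional units .D2, .M2, .L2, .S2 with register file B (B0-B15), and control registers. Internal buses connect the CPU to internal memory, external memory and peripherals.]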

Page 12: 1 introduction to dsp processor 20140919

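C6000 Central Processing Unit

[Figure: C6000 block diagram. The CPU contains functional units .D1, .M1, .L1, .S1 with register file A (A0-A15), functional units .D2, .M2, .L2, .S2 with register file B (B0-B15), and control registers. Internal buses connect the CPU to internal memory, external memory and peripherals.]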

Page 13: 1 introduction to dsp processor 20140919

Implementation of Sum of Products (SOP)

SOP is the key element for most DSP algorithms.

Let's write the code for this algorithm and at the same time discover the C6000 architecture.

The implementation in this module will be done in assembly.

$$Y = \sum_{n=1}^{N} a_n * x_n = a_1 * x_1 + a_2 * x_2 + \dots + a_N * x_N$$

Two basic operations are required for this algorithm:

(1) Multiplication

(2) Addition

Therefore two basic instructions are required.

Page 14: 1 introduction to dsp processor 20140919

Multiply (MPY)

$$Y = \sum_{n=1}^{N} a_n * x_n = a_1 * x_1 + a_2 * x_2 + \dots + a_N * x_N$$

The multiplication of a1 by x1 is done in assembly by the following instruction:

    MPY a1, x1, Y

This instruction is performed by a multiplier unit that is called ".M".

Page 15: 1 introduction to dsp processor 20140919

Multiply (.M unit)

$$Y = \sum_{n=1}^{40} a_n * x_n$$

The .M unit performs multiplications in hardware.

    MPY .M a1, x1, Y

[Figure: the .M unit]

Page 16: 1 introduction to dsp processor 20140919

Addition (.?)

$$Y = \sum_{n=1}^{40} a_n * x_n$$

    MPY .M a1, x1, prod
    ADD .? Y, prod, Y

[Figure: the .M unit and a second, not yet named unit (.?)]

Page 17: 1 introduction to dsp processor 20140919

Add (.L unit)

$$Y = \sum_{n=1}^{40} a_n * x_n$$

    MPY .M a1, x1, prod
    ADD .L Y, prod, Y

The C6000 uses registers to hold the operands, so let's change this code.

[Figure: the .M and .L units]

Page 18: 1 introduction to dsp processor 20140919

Register File - A

$$Y = \sum_{n=1}^{40} a_n * x_n$$

    MPY .M a1, x1, prod
    ADD .L Y, prod, Y

[Figure: Register File A (A0 to A15, 32 bits each) connected to the .M and .L units, with a1, x1, prod and Y held in registers.]

Let us correct this by replacing a, x, prod and Y by the registers as shown above.

Page 19: 1 introduction to dsp processor 20140919

Specifying Register Names

$$Y = \sum_{n=1}^{40} a_n * x_n$$

    MPY .M A0, A1, A3
    ADD .L A4, A3, A4

Register File A contains 16 registers (A0-A15), which are 32 bits wide.

[Figure: Register File A (A0 to A15, 32 bits each) connected to the .M and .L units, with a1, x1, prod and Y held in registers.]

Page 20: 1 introduction to dsp processor 20140919

Data loading

Q: How do we load the operands into the registers?

[Figure: Register File A (A0 to A15, 32 bits each) connected to the .M and .L units, with a1, x1, prod and Y held in registers.]

Page 21: 1 introduction to dsp processor 20140919

Load Unit ".D"

Q: How do we load the operands into the registers?

A: The operands are loaded into the registers by loading them from the memory using the .D unit.

Q: Which instruction(s) can be used for loading operands from the memory to the registers?

A: The load instructions (LDB, LDH, LDW, LDDW).

[Figure: Register File A (A0 to A15, 32 bits each) connected to the .M and .L units, and to Data Memory through the .D unit.]

Page 22: 1 introduction to dsp processor 20140919

Using the Load Instructions

$$Y = \sum_{n=1}^{40} a_n * x_n$$

    LDH .D *A5, A0
    LDH .D *A6, A1
    MPY .M A0, A1, A3
    ADD .L A4, A3, A4

[Figure: Register File A (A0 to A15, 32 bits each) connected to the .M and .L units, and to Data Memory through the .D unit.]

Page 23: 1 introduction to dsp processor 20140919

Creating a loop

$$Y = \sum_{n=1}^{40} a_n * x_n$$

So far we have implemented the SOP for one tap only, i.e.

    Y = a1 * x1

So let's create a loop so that we can implement the SOP for N taps.

    LDH .D *A5, A0
    LDH .D *A6, A1
    MPY .M A0, A1, A3
    ADD .L A4, A3, A4

Page 24: 1 introduction to dsp processor 20140919

Create a label to branch

$$Y = \sum_{n=1}^{40} a_n * x_n$$

    loop LDH .D *A5, A0
         LDH .D *A6, A1
         MPY .M A0, A1, A3
         ADD .L A4, A3, A4

Page 25: 1 introduction to dsp processor 20140919

Add a branch instruction, B.

$$Y = \sum_{n=1}^{40} a_n * x_n$$

    loop LDH .D *A5, A0
         LDH .D *A6, A1
         MPY .M A0, A1, A3
         ADD .L A4, A3, A4
         B   .? loop

Page 26: 1 introduction to dsp processor 20140919

Which unit is used by the B instruction?

$$Y = \sum_{n=1}^{40} a_n * x_n$$

[Figure: Register File A (A0 to A15, 32 bits each) connected to the .S, .M and .L units, and to Data Memory through the .D unit.]

    loop LDH .D *A5, A0
         LDH .D *A6, A1
         MPY .M A0, A1, A3
         ADD .L A4, A3, A4
         B   .S loop
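For comparison, the loop above can be sketched in C. The mapping in the comments (pointers for A5 and A6, `y` for A4) follows the register roles used on the slides. The pointer increments and the tap counter are additions of this sketch, since the assembly shown so far has neither, and the function name `sum_of_products` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical C counterpart of the assembly loop.
 * a and x point to 16-bit operands, matching the halfword loads (LDH). */
int sum_of_products(const short *a, const short *x, size_t count)
{
    int y = 0;                             /* accumulator, role of A4 */
    assert(a != NULL && x != NULL);
    while (count-- > 0) {                  /* loop counter: not on the slide */
        short coeff  = *a++;               /* LDH .D *A5, A0 (plus pointer advance) */
        short sample = *x++;               /* LDH .D *A6, A1 (plus pointer advance) */
        int prod = coeff * sample;         /* MPY .M A0, A1, A3 */
        y += prod;                         /* ADD .L A4, A3, A4 */
    }
    return y;                              /* loop exit replaces the unconditional B */
}
```

Calling it with `count` set to 40 corresponds to the 40-tap sum shown on the slides.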

Page 27: 1 introduction to dsp processor 20140919

How can we add more processing power to this processor?

(1) Increase the clock frequency.

(2) Increase the number of processing units.

[Figure: Register File A (A0 to A15, 32 bits each) connected to the .S, .M and .L units, and to Data Memory through the .D unit.]

Page 28: 1 introduction to dsp processor 20140919

Increase the number of processing units

[Figure: two data paths sharing Data Memory. Register File A (A0 to A15, 32 bits each) serves the .S, .M, .L and .D units; Register File B (B0 to B15, 32 bits each) serves the .S2, .M2, .L2 and .D2 units.]

Page 29: 1 introduction to dsp processor 20140919

C6211 Instruction Set (by unit)

.S Unit
  ADD, ADDK, ADD2, AND, B, CLR, EXT, MV, MVC, MVK, MVKL, MVKH, MVKLH, NEG, NOT, OR, SET, SHL, SHR, SSHL, SUB, SUB2, XOR, ZERO

.M Unit
  MPY, MPYH, SMPY, SMPYH

.L Unit
  ABS, ADD, AND, CMPEQ, CMPGT, CMPLT, LMBD, MV, NEG, NORM, NOT, OR, SADD, SAT, SSUB, SUB, SUBC, XOR, ZERO

.D Unit
  ADD, ADDA, LDB/H/W, MV, NEG, STB/H/W, SUB, SUBA, ZERO

Other
  IDLE, NOP

Page 30: 1 introduction to dsp processor 20140919

C language vs Assembly

Source        Tool                  Efficiency   Effort
C             Compiler Optimizer    70-100%      Low
Linear ASM    Assembly Optimizer    95-100%      Med
ASM           Hand Optimize         100%         High

Page 31: 1 introduction to dsp processor 20140919

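'C6x Peripherals

[Figure: C6000 block diagram. The CPU contains functional units .D1, .M1, .L1, .S1 with register file A (A0-A15), functional units .D2, .M2, .L2, .S2 with register file B (B0-B15), and control registers. Internal buses connect the CPU to internal memory, external memory and peripherals.]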

Page 32: 1 introduction to dsp processor 20140919

'C6x Peripherals

EMIF (External Memory Interface)
  - Glueless access to async/sync memory: EPROM, SRAM, SDRAM, SBSRAM

DMA/EDMA (Enhanced Direct Memory Access)
  - 4/16 channels

BOOT
  - Boot from 4M external block
  - Boot from HPI/XB

McBSP (Multi-Channel Buffered Serial Port)
  - High speed sync serial comm
  - T1/E1/MVIP interface

HPI (Host Port Interface) / Expansion Bus (XB)
  - 16/32-bit host µP access

Timer/Counters
  - Two 32-bit Timer/Counters

[Figure: 'C6x CPU surrounded by the EMIF (connected to external memory), DMA, Boot, McBSP, HPI/XB, Timer and PLL blocks.]

Page 33: 1 introduction to dsp processor 20140919

OUTLINE

Introduction to DSP Processor
C6000 Architecture
C6000 Memory Map
Homework 1
Reference

Page 34: 1 introduction to dsp processor 20140919

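C6000 Memory

[Figure: C6000 block diagram. The CPU contains functional units .D1, .M1, .L1, .S1 with register file A (A0-A15), functional units .D2, .M2, .L2, .S2 with register file B (B0-B15), and control registers. Internal buses connect the CPU to internal memory, external memory and peripherals.]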

Page 35: 1 introduction to dsp processor 20140919

C6416 Memory Map

    0000_0000   Internal memory (L2), unified (data or prog), 1024KB
    0180_0000   On-chip peripherals
    6000_0000   EMIFB, external, 64MB x 4
    8000_0000   EMIFA, external, 256MB x 4
    FFFF_FFFF   End of address space

External memory: async (SRAM, ROM, etc.) or sync (SBSRAM, SDRAM).

Level 1 cache: 16KB program, 16KB data. Not in the memory map.

[Figure: CPU with 16K program (P) and 16K data (D) L1 caches in front of the 1024K L2.]

Page 36: 1 introduction to dsp processor 20140919

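Memory Allocation

[Figure: C source code goes through the compiler and assembler to produce a COFF object file with Text, Data and Bss sections. Under SECTION, the Text, Data and Bss sections plus Stack and Heap are placed into target memory. Under MEMORY, the target memory (0x00000 to 0xfffff) is shown as ROM, external RAM and internal RAM.]

Memory Layout

    MEMORY
    {
        ISRAM : origin = 0x00000000, len = 0x00100000
    }
    SECTIONS
    {
        .text > ISRAM
    }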

Page 37: 1 introduction to dsp processor 20140919

What is stored in memory?

  Code
  Constants
  Global and static variables
  Local variables
  Dynamic memory

[Figure: memory from 0x00000 to 0xfffff]

Page 38: 1 introduction to dsp processor 20140919

How is memory organized?

  text  : Code and constant data
  data  : Initialized global and static variables
  bss   : Uninitialized global and static variables
  stack : Local variables, function return addresses, arguments of functions
  heap  : Dynamic memory

[Figure: memory from 0x00000 to 0xfffff divided into stack, heap, bss, data and text regions.]

Page 39: 1 introduction to dsp processor 20140919

How is memory allocated?

    #include <stdlib.h>

    char *f1(int n);          /* prototype, so main() can call f1() */

    long array[100];
    long bufsize = 100;

    int main(void)
    {
        int i;
        char *buf;
        i = 10;
        buf = f1(i);
        return 0;
    }

    char *f1(int n)
    {
        int k;
        return malloc(bufsize);
    }

Memory (0x00000 to 0xfffff):

  text  : int main(void) { i=10; buf=f1(i); return(0); } …
  data  : bufsize = 100
  bss   : array[100]
  heap  : 100 byte block
  stack : main return address, i, buf, f1 argument n, f1 return address, k

Page 40: 1 introduction to dsp processor 20140919

Memory Allocation & Deallocation

How, and when, is memory allocated?
  Global and static variables = program startup
  Local variables = function call
  Dynamic memory = malloc()

How, and when, is memory deallocated?
  Global and static variables = program finish
  Local variables = function return
  Dynamic memory = free()
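The three lifetimes listed above can be observed in a small C sketch. The names `total_calls`, `record_call` and `make_buffer` are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

static long total_calls;               /* static: allocated (and zeroed) at program startup */

/* Returns how many times it has been called so far. */
long record_call(void)
{
    long local_calls = 0;              /* local: allocated on the stack at each call */
    local_calls++;                     /* always 1, nothing survives from the last call */
    total_calls += local_calls;        /* the static value persists between calls */
    return total_calls;
}

/* Returns a zero-filled heap block; the caller releases it with free(). */
char *make_buffer(size_t size)
{
    char *buf;
    assert(size > 0);                  /* malloc(0) is implementation-defined */
    buf = malloc(size);                /* dynamic: allocated at malloc() */
    if (buf != NULL) {
        memset(buf, 0, size);
    }
    return buf;
}
```

Two consecutive calls to `record_call()` return 1 and then 2, because only the static variable outlives the call. A block from `make_buffer(100)` stays allocated until it is passed to `free()`, regardless of which function returns in the meantime.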

Page 41: 1 introduction to dsp processor 20140919

When is memory allocated?

    #include <stdlib.h>

    char *f1(int n);

    long array[100];          /* bss   : 0 at startup */
    long bufsize = 100;       /* data  : 100 at startup */

    int main(void)
    {
        int i;                /* stack : at function call */
        char *buf;
        i = 10;
        buf = f1(i);
        return 0;
    }

    char *f1(int n)
    {
        int k;                /* stack : at function call */
        return malloc(bufsize);   /* heap : 100 bytes at malloc() */
    }

Page 42: 1 introduction to dsp processor 20140919

When is memory deallocated?

    #include <stdlib.h>

    char *f1(int n);

    long array[100];          /* available till termination */
    long bufsize = 100;       /* available till termination */

    int main(void)
    {
        int i;                /* deallocated on return from main() */
        char *buf;
        i = 10;
        buf = f1(i);
        return 0;
    }

    char *f1(int n)
    {
        int k;                /* deallocated on return from f1() */
        return malloc(bufsize);   /* deallocated on free() */
    }

Page 43: 1 introduction to dsp processor 20140919

Sections defined in C6000 compiler

Initialized sections
  .cinit  : Initial values for global/static variables
  .const  : Global and static string literals
  .switch : Tables for switch instructions
  .text   : Code

Uninitialized sections
  .bss    : Global and static variables
  .stack  : Stack (local variables, return address, arguments)
  .far    : Global and statics declared far
  .sysmem : Memory for malloc functions (heap)

Page 44: 1 introduction to dsp processor 20140919

Example : C6416 DSK

[Figure: C6416 DSK board with a 16MB and a 512KB memory device labelled.]

Page 45: 1 introduction to dsp processor 20140919

Example : C6416 DSK

                   Base          Length
Internal Memory    0x00000000    0x00100000 (1024K)
External SDRAM     0x80000000    0x01000000 (16M)
External Flash     0x64000000    0x00080000 (512K)
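One way to read the table is as a lookup from an address to the region that contains it. The sketch below encodes the three base/length pairs from the table; the type `mem_region`, the function `region_of` and the short region names are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const char *name;                  /* label chosen for this sketch */
    uint32_t base;                     /* first address of the region */
    uint32_t length;                   /* size in bytes */
} mem_region;

/* Base and length values copied from the C6416 DSK table. */
static const mem_region dsk_regions[] = {
    { "ISRAM", 0x00000000u, 0x00100000u },   /* internal memory, 1024K */
    { "FLASH", 0x64000000u, 0x00080000u },   /* external flash, 512K */
    { "SDRAM", 0x80000000u, 0x01000000u },   /* external SDRAM, 16M */
};

/* Returns the name of the region containing address, or "unmapped". */
const char *region_of(const mem_region *table, size_t count, uint32_t address)
{
    assert(table != NULL);
    for (size_t i = 0; i < count; i++) {
        if (address >= table[i].base &&
            address - table[i].base < table[i].length) {
            return table[i].name;
        }
    }
    return "unmapped";
}
```

With this table, address 0x000FFFFF is the last byte of internal memory, while 0x00100000 falls just past it and is reported as unmapped by this sketch, since the table lists only the three DSK memories.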

Page 46: 1 introduction to dsp processor 20140919

Linker command file (*.cmd)

MEMORY Directive
  System memory description
  Name : origin = address, length = size-in-bytes

    MEMORY
    {
        ISRAM : origin = 0x00000000, len = 0x00100000
        SDRAM : origin = 0x80000000, len = 0x01000000
        FLASH : origin = 0x64000000, len = 0x00080000
    }

Page 47: 1 introduction to dsp processor 20140919

Linker command file (*.cmd)

SECTIONS Directive
  Binding sections to memory

    SECTIONS
    {
        .text  > ISRAM
        .bss   > ISRAM
        .cinit > ISRAM
        …
    }

Page 48: 1 introduction to dsp processor 20140919

C6416.cmd

    -stack 0x400
    MEMORY
    {
        ISRAM : origin = 0x00000000, len = 0x00100000
        SDRAM : origin = 0x80000000, len = 0x01000000
        FLASH : origin = 0x64000000, len = 0x00080000
    }
    SECTIONS
    {
        .text  > ISRAM
        .bss   > ISRAM
        .cinit > ISRAM
        .stack > ISRAM
        …
    }

Page 49: 1 introduction to dsp processor 20140919

DSP/BIOS Configuration Tool (*.cdb)

[Screenshot: ISRAM Properties dialog]

System memory description

Page 50: 1 introduction to dsp processor 20140919

DSP/BIOS Configuration Tool (*.cdb)

[Screenshot: Properties dialog]

Binding sections to memory

Page 51: 1 introduction to dsp processor 20140919

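Program Cases :

Case 1 :

    void main()
    {
        int Image[1000];      /* local array: lives on the stack */
        ….
    }

    stack = ?

    int Image[1000];          /* global array: lives outside the stack */
    void main()
    {
        ….
    }

    stack 0x400 (1024)

With Image declared inside main(), the stack has to hold the whole array, so the stack size must be chosen accordingly. Declared globally, the array no longer occupies the stack and a stack of 0x400 (1024) is enough.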

Page 52: 1 introduction to dsp processor 20140919

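Program Cases :

Case 2 :

    void main()
    {
        double Image[200000]; /* local array: lives on the stack */
        ….
    }

    double Image[200000];     /* global array: lives in bss */
    void main()
    {
        ….
    }

    stack 0x400 (1024)
    bss < 0x100000 (1024k)
    bss > SDRAM

Declared globally, the array no longer occupies the stack, but bss in internal memory is limited to less than 0x100000 (1024k), so bss is bound to SDRAM.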
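The arithmetic behind both cases can be checked with a short C sketch. The helper name `fits` and the element sizes (4 bytes for int, 8 bytes for double) are assumptions of this example. The region sizes are the stack size, internal memory length and SDRAM length from the linker command file.

```c
#include <assert.h>
#include <stddef.h>

#define STACK_SIZE 0x400u              /* -stack 0x400 (1024 bytes) */
#define ISRAM_SIZE 0x00100000u         /* internal memory, 1024K */
#define SDRAM_SIZE 0x01000000u         /* external SDRAM, 16M */

/* Returns 1 if count elements of elem_size bytes fit into region_size bytes. */
int fits(size_t elem_size, size_t count, size_t region_size)
{
    assert(elem_size > 0);
    return count <= region_size / elem_size;   /* division avoids overflow */
}
```

Assuming 4-byte ints, `fits(4, 1000, STACK_SIZE)` is 0, because Case 1 needs 4000 bytes against a 1024-byte stack. Assuming 8-byte doubles, `fits(8, 200000, ISRAM_SIZE)` is also 0 (1,600,000 bytes against 1,048,576), while `fits(8, 200000, SDRAM_SIZE)` is 1, which is consistent with binding bss to SDRAM in Case 2.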

Page 53: 1 introduction to dsp processor 20140919

Q&A

