
SHARC ‘S’uper ‘H’arvard ‘ARC’hitecture, Nagendra Doddapaneni

Date post: 20-Dec-2015

SHARC: ‘S’uper ‘H’arvard ‘ARC’hitecture

Nagendra Doddapaneni


Overview

• Harvard Architecture
• Super Harvard Architecture
• TigerSHARC processor


Outline

• Background

• Harvard Architecture
   − Why?
   − What?

• Modern CPU Chip Design

• Super Harvard Architecture

• TigerSHARC Processor


Background

• von Neumann Architecture
   − Single storage for instructions and data
• Digital Signal Processors
   − Specialized microprocessors designed specifically for digital signal processing, generally in real time


Why Harvard Architecture?

• von Neumann bottleneck (‘memory bound’)
• DSP applications
• In a von Neumann architecture, the CPU can do only one of these at a time:
   − Read an instruction
   − Read or write data from/to memory


Harvard Architecture (cont…)


What is Harvard Architecture?

• Physically separate storage and signal pathways for instructions and data
• The next instruction is fetched while the current instruction executes
• Program memory can be small and wide
• Data memory can be large and narrower


Modern CPU chip design

• Incorporates features from both architectures
• ‘On chip’ cache memory is divided into an instruction cache and a data cache. Harvard architecture is used when the CPU accesses cache memory.
• On a cache miss, ‘off chip’ main memory is accessed using von Neumann architecture. Main memory is not separated into data and instruction sections.


Super Harvard Architecture

• Cache used to store instructions, leaving both instruction bus and data bus free to fetch operands

• Harvard Architecture + cache = Extended Harvard Architecture or Super Harvard Architecture
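A rough way to see the benefit is to count bus cycles for a loop body that needs one instruction fetch and two operand fetches per iteration (for example, one tap of an FIR filter). The sketch below is an illustrative model only, under simplifying assumptions that are not taken from the slides: every access takes one cycle, each bus carries one access per cycle, and the instruction cache misses on the first iteration and hits afterwards. The function name and architecture labels are hypothetical.

```python
def loop_bus_cycles(iterations, architecture):
    """Count bus cycles for a loop needing 1 instruction fetch and
    2 operand fetches per iteration (simplified model, see assumptions)."""
    total = 0
    for i in range(iterations):
        if architecture == "von_neumann":
            # One shared bus: instruction and both operands are serialized.
            total += 3
        elif architecture == "harvard":
            # Cycle 1: instruction on the program bus, operand 1 on the data bus.
            # Cycle 2: operand 2.
            total += 2
        elif architecture == "super_harvard":
            # First pass: instruction is fetched over the bus (cache miss).
            # Later passes: instruction comes from the cache, so both
            # buses are free to fetch the two operands in the same cycle.
            total += 2 if i == 0 else 1
        else:
            raise ValueError(f"unknown architecture: {architecture}")
    return total


if __name__ == "__main__":
    for arch in ("von_neumann", "harvard", "super_harvard"):
        print(arch, loop_bus_cycles(100, arch))
```

Under these assumptions the cached loop approaches one cycle per iteration, which is the point of adding the instruction cache to the Harvard design.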


TigerSHARC Processor

• Processor Architecture
• Instruction Parallelism and SIMD Operation
• Integer ALU
• Computational blocks
   − X and Y Register File
   − X and Y ALU
   − Multiplier
   − Shifter
   − CLU
• Program Sequencer
• I, J and K buses
• DMA Controller
• Applications


TigerSHARC Processor Architecture

• 3 128-bit data buses
• 2 IALUs
• 2 Computational Blocks
   − ALU (float and integer)
   − Shifter
   − Multiplier
   − CLU


TigerSHARC Instruction Parallelism and SIMD Operation

• The core can execute one to four 32-bit instructions simultaneously, encoded in a single instruction line (VLIW).
• Whether instructions can execute in parallel depends on:
   − The instruction line resources each requires
   − The source and destination registers used
• Supports SIMD operations through the use of both Computational Blocks in parallel.
• Each Computational Block can execute four 16-bit or eight 8-bit SIMD computations in parallel.
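The two conditions above (resources and registers) can be pictured as a conflict check over the instructions proposed for one line. The sketch below is a deliberately simplified illustration: the rule set (one instruction per unit, no two writes to the same destination, at most four slots) and the unit and register names are assumptions made for the example, not the actual TigerSHARC issue rules.

```python
def can_share_line(instructions, max_slots=4):
    """Return True if the instructions could share one instruction line
    under the simplified rules of this sketch.

    Each instruction is a dict with:
      "unit": name of the execution resource it needs (hypothetical names)
      "dest": set of destination register names it writes
    """
    if not 1 <= len(instructions) <= max_slots:
        return False
    used_units = set()
    written = set()
    for ins in instructions:
        if ins["unit"] in used_units:
            return False  # two instructions need the same resource
        if written & ins["dest"]:
            return False  # two instructions write the same register
        used_units.add(ins["unit"])
        written |= ins["dest"]
    return True


if __name__ == "__main__":
    add = {"unit": "X_ALU", "dest": {"XR0"}}
    mul = {"unit": "X_MUL", "dest": {"XR1"}}
    print(can_share_line([add, mul]))  # different units and destinations
    print(can_share_line([add, add]))  # same unit requested twice
```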


TigerSHARC Integer ALU

• 31 32-bit general registers + 1 status register + 8 dedicated registers for circular buffers
• Performs integer ALU operations and data addressing
• ALU instructions: ADD, SUB, ARS, LRS (right shifts only), ROT (left and right), AND NOT, NOT, OR, XOR, ABS, MIN, MAX, CMP
• Status flags: zero (Z), negative (N), overflow (V), carry (C)
• Instruction conditions: EQ, LT, LE, NEQ, NLT, NLE
• Instruction options: unsigned (U), circular buffer (CB), bit reverse (BR), computed jump (CJMP)
• Address-related operations: data address generation, circular buffers, bit reverse, UREG moves, DAB control
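Circular-buffer and bit-reverse addressing are the two address modes most specific to DSP work (delay lines and FFT reordering). The sketch below shows the general arithmetic behind each; it is not a model of the IALU registers themselves, and the function and parameter names are hypothetical.

```python
def circular_next(address, step, base, length):
    """Advance `address` by `step`, wrapping inside [base, base + length)."""
    return base + (address - base + step) % length


def bit_reverse(value, bits):
    """Reverse the lowest `bits` bits of `value` (FFT-style index reordering)."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


if __name__ == "__main__":
    # Walk an 8-entry buffer starting at 0x100 with a stride of 3.
    addr = 0x100
    for _ in range(5):
        print(hex(addr))
        addr = circular_next(addr, 3, base=0x100, length=8)
    # Index order used to unscramble an 8-point FFT.
    print([bit_reverse(i, 3) for i in range(8)])
```

Doing this wrap-around in the address generator means the inner loop needs no explicit bounds check, which is why such modes are provided in hardware.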


TigerSHARC Computational Blocks

X and Y Register File

• Register File Syntax
   − Each block has 32 32-bit data registers
   − Each register can store 4x8-bit, 2x16-bit or 1x32-bit words
   − Registers can be combined into dual or quad groups. These groups can store 8, 16, 32, 40 or 64 bit words.
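The idea of one 32-bit register holding four 8-bit or two 16-bit words can be illustrated with plain bit packing. The sketch below assumes unsigned lanes with lane 0 in the least significant bits; that ordering and the helper names are assumptions made for the example, not something stated in the slides.

```python
def pack_lanes(values, lane_bits):
    """Pack unsigned lane values into one 32-bit word (lane 0 in the low bits)."""
    lanes = 32 // lane_bits
    if len(values) != lanes:
        raise ValueError(f"expected {lanes} values for {lane_bits}-bit lanes")
    mask = (1 << lane_bits) - 1
    word = 0
    for i, value in enumerate(values):
        word |= (value & mask) << (i * lane_bits)
    return word


def unpack_lanes(word, lane_bits):
    """Split a 32-bit word back into its unsigned lanes."""
    mask = (1 << lane_bits) - 1
    return [(word >> (i * lane_bits)) & mask for i in range(32 // lane_bits)]


if __name__ == "__main__":
    print(hex(pack_lanes([0x11, 0x22, 0x33, 0x44], 8)))  # four 8-bit words
    print(hex(pack_lanes([0x1234, 0xABCD], 16)))         # two 16-bit words
```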


TigerSHARC Computational Blocks

X and Y Register File

• Register File Syntax


Volatile registers in each block

• 24 volatile data registers in each block
   − XR0 – XR23
   − YR0 – YR23
• 2 ALU summation registers in each block
   − XPR0, XPR1, YPR0, YPR1
• 5 MAC accumulate registers in each block
   − XMR0 – XMR3, YMR0 – YMR3
   − XMR4, YMR4: overflow registers


TigerSHARC X and Y ALU

• 2x64-bit input paths
• 2x64-bit output paths
• 8, 16, 32, or 64 bit addition/subtraction (fixed-point)
• 32 or 64 bit logical operations (fixed-point)
• 32 or 40 bit floating-point operations


Sample ALU Instruction

• Example of 16 bit addition

• XYSR1:0 = R31:30 + R25:24

• Performs addition in X and Y Compute Blocks
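What a packed 16-bit addition on a 64-bit register pair amounts to can be sketched in software: four independent 16-bit sums, with no carry crossing a lane boundary. The sketch below assumes wrap-around on overflow (no saturation) and unsigned lanes; whether the hardware wraps or saturates depends on instruction options that the slide does not show, and the function name is hypothetical.

```python
def add_short_lanes(a, b, lanes=4, lane_bits=16):
    """Add two packed words lane by lane.

    Each lane is summed separately and truncated to `lane_bits`,
    so a carry out of one lane never leaks into the next one.
    """
    mask = (1 << lane_bits) - 1
    result = 0
    for i in range(lanes):
        shift = i * lane_bits
        lane_sum = ((a >> shift) & mask) + ((b >> shift) & mask)
        result |= (lane_sum & mask) << shift
    return result


if __name__ == "__main__":
    a = 0x0001_FFFF_0003_0004
    b = 0x0001_0001_0001_0001
    # The 0xFFFF lane wraps to 0x0000 without disturbing its neighbour.
    print(hex(add_short_lanes(a, b)))
```

With both compute blocks executing the same operation, two such 64-bit results are produced at once, giving eight 16-bit additions for the single instruction.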


TigerSHARC Multiplier

• Operates on fixed-point, floating-point and complex numbers
• Fixed-point numbers
   − 32x32 bit with 32 or 64 bit results
   − 4 (16x16 bit) with 4x16 or 4x32 bit results
• Floating-point numbers
   − 32x32 bit with 32 bit result
   − 40x40 bit with 40 bit result
• Complex numbers
   − 32x32 bit with 32 bit result
   − Fixed-point only
• Results stored in MR register
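The complex fixed-point case can be illustrated with the arithmetic a complex multiply-accumulate performs: (a + jb)(c + jd) = (ac − bd) + j(ad + bc), summed into an accumulator. The sketch below uses Python integers for the real and imaginary parts and does not model word widths, rounding, or the overflow behaviour of the MR registers; the function names are hypothetical.

```python
def complex_multiply(x, y):
    """Multiply two complex integers given as (real, imag) tuples."""
    a, b = x
    c, d = y
    return (a * c - b * d, a * d + b * c)


def complex_mac(pairs):
    """Accumulate the products of (x, y) pairs, as a MAC unit would."""
    acc_real, acc_imag = 0, 0
    for x, y in pairs:
        real, imag = complex_multiply(x, y)
        acc_real += real
        acc_imag += imag
    return (acc_real, acc_imag)


if __name__ == "__main__":
    print(complex_multiply((1, 2), (3, 4)))
    print(complex_mac([((1, 2), (3, 4)), ((0, 1), (0, 1))]))
```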


TigerSHARC Multiplier

XR0 = R1*R2;;
XR1:0 = R3*R5;;
XMR1:0 = R3*R5;; //uses XMR4 overflow
XR2 = MR3:2, XMR3:2 = R3*R5;;
XR3:2 = MR1:0, XMR1:0 = R3*R5;;

XFR0 = R1*R2;;
XFR1:0 = R3:2*R5:4;; //40 bit multiply, 32 bit mantissa


TigerSHARC Shifter

• Operates on one 64-bit, one or two 32-bit, two or four 16-bit, and four or eight 8-bit fixed-point operands
• Shifts and rotates bits
• Bit manipulation operations, like bit set, clear, toggle and test
• Bit FIFO operations to support bit streams


TigerSHARC CLU

• CLU instructions are designed to support different algorithms used for communications applications

• Algorithms supported are
   − Viterbi decoding (minimal distance decoding algorithm)
   − Turbo-code decoding (variant of Viterbi decoding)
   − De-spreading for Code Division Multiple Access (CDMA) systems (used for tasking a signal in wide Pseudo Noise spread bandwidth)
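De-spreading is the simplest of the three to show in a few lines: each data bit is recovered by correlating a block of received chips with the same pseudo-noise (PN) code that was used to spread it. The sketch below is a software illustration of that correlation, assuming bits and chips are represented as +1/−1 and that the receiver is already chip-aligned; it says nothing about how the CLU instructions implement it. The PN sequence and function names are made up for the example.

```python
def spread(bits, pn_code):
    """Spread each +1/-1 data bit across one full PN code period."""
    return [bit * chip for bit in bits for chip in pn_code]


def despread(chips, pn_code):
    """Recover +1/-1 data bits by correlating chip blocks with the PN code."""
    period = len(pn_code)
    if len(chips) % period != 0:
        raise ValueError("chip count must be a multiple of the PN code length")
    bits = []
    for start in range(0, len(chips), period):
        block = chips[start:start + period]
        correlation = sum(c * p for c, p in zip(block, pn_code))
        bits.append(1 if correlation >= 0 else -1)
    return bits


if __name__ == "__main__":
    pn = [1, -1, 1, 1, -1, -1, 1, -1]   # hypothetical 8-chip code
    data = [1, -1, -1, 1]
    received = spread(data, pn)
    received[3] = -received[3]          # one chip corrupted in the channel
    print(despread(received, pn) == data)
```

Because the decision uses the sum over the whole code period, a single flipped chip only lowers the correlation magnitude and the bit is still recovered.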


TigerSHARC Program Sequencer

• Supplies instruction addresses to memory
• The instruction alignment buffer (IAB) caches up to five fetched instruction lines waiting to execute
• Extracts an instruction line from the IAB and distributes it to the appropriate core component for execution
• Determines flow control for instructions like JMP, CALL
• Reduces branch delays using branch prediction and the branch target buffer (BTB)


TigerSHARC architecture at a glance


TigerSHARC Buses

• DRAM divided into 6 blocks of 4 Mbits
• The 6 blocks connect to four 128-bit wide internal buses through a crossbar connection
• Internal bus architecture provides a total memory bandwidth of 32 Gbytes/sec
• Core and I/O can access, per cycle:
   − twelve 32-bit data words
   − four 32-bit instructions
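These figures can be cross-checked with a little arithmetic. The bus count, bus width and memory block sizes below come from the slide; the 500 MHz core clock is an assumption introduced here because the slide does not state a clock rate, and it happens to be the value at which four 128-bit buses give 32 Gbytes/sec.

```python
BUS_COUNT = 4                  # from the slide
BUS_WIDTH_BITS = 128           # from the slide
MEMORY_BLOCKS = 6              # from the slide
BLOCK_SIZE_MBITS = 4           # from the slide
CLOCK_HZ = 500_000_000         # ASSUMPTION: not stated on the slide

bytes_per_cycle = BUS_COUNT * BUS_WIDTH_BITS // 8      # 64 bytes
words_per_cycle = BUS_COUNT * BUS_WIDTH_BITS // 32     # 16 32-bit words
bandwidth_gbytes_per_s = bytes_per_cycle * CLOCK_HZ / 1e9
total_memory_mbits = MEMORY_BLOCKS * BLOCK_SIZE_MBITS  # 24 Mbits

if __name__ == "__main__":
    print("bytes per cycle:", bytes_per_cycle)
    print("32-bit words per cycle:", words_per_cycle)
    print("bandwidth (Gbytes/sec):", bandwidth_gbytes_per_s)
    print("on-chip memory (Mbits):", total_memory_mbits)
```

The 16 words per cycle match the slide's split of twelve data words plus four instructions.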


TigerSHARC DMA Controller

• On-chip, with 14 DMA channels
• Provides zero-overhead data transfers
• Operates independently and invisibly to the DSP’s core


TigerSHARC Applications




Summary

• What is Harvard Architecture?

• What is Super Harvard Architecture?

• TigerSHARC processor architecture

• How is TigerSHARC ‘faster’ for targeted DSP applications?


Questions?

Thank You.

