
Post on 16-Jan-2016


1 - CPRE 583 (Reconfigurable Computing): Reconfigurable Computing Archs, VHDL 3 Iowa State University (Ames)

CPRE 583: Reconfigurable Computing

Lecture 3: Wed 9/2/2009 (Reconfigurable Computing Architectures, VHDL Overview 3)

Instructor: Dr. Phillip Jones (phjones@iastate.edu)

Reconfigurable Computing Laboratory
Iowa State University
Ames, Iowa, USA

http://class.ece.iastate.edu/cpre583/

Overview

• Reinforce some common questions

• Finish Chapter 1 Lecture

• Continue Chapter 2

• VHDL review

Common Questions

• How does an FPGA work?

• How does VHDL execute on an FPGA?

• How many LUTs are on the class’s FPGA? 44,000

• State machines will be covered more next lecture

• Final Project group selection: choose your own groups

• Class machine resources
– Coover 2048, 1212; Coover 2041 ML507 (will be 2)
– Distance students: xilinx.ece.iastate.edu (other servers on the way)

What you should learn

• Basic trade-offs associated with different aspects of a Reconfigurable Architecture (Chapter 2)

• Practice with timing diagrams, start state machines

Reconfigurable Architectures

• Main idea Chapter 2’s author wants to convey
– Applications often have one or more small, computationally intense regions of code (kernels)
– Can these kernels be sped up using dedicated hardware?
– Different kernels have different needs. How do a kernel’s requirements guide design decisions when implementing a Reconfigurable Architecture?

Reconfigurable Architectures

• Forces that drive a Reconfigurable Architecture
– Price
  • Mass production: 100K to millions
  • Experimental: 1 to 10’s
– Granularity of reconfiguration
  • Fine grain
  • Coarse grain
– Degree of system integration/coupling
  • Tightly
  • Loosely

All are a function of the application that will run on the architecture.

Example Points in (Price, Granularity, Coupling) Space

[Figure: example design points plotted on three axes: Price ($100’s to $1M’s), Granularity (Fine to Coarse), and Coupling (Loose to Tight). Labels in the figure: Intel/AMD, Int, float, RFU, Processor, PC, ML507, Ethernet, Decode, Exec, Store.]

What’s the point of a Reconfigurable Architecture?

• Performance metrics
– Computational
  • Throughput
  • Latency
– Power
  • Total power dissipation
  • Thermal
– Reliability
  • Recovery from faults

Increase application performance!

Typical Approach for Increasing Performance

• Application/algorithm implemented in software
– Often easier to write an application in software

• Profile application (e.g. gprof)
– Determine where the application is spending its time

• Identify kernels of interest
– e.g. application spends 90% of its time in function matrix_multiply()

• Design custom hardware/instruction to accelerate kernel(s)
– Analyze the kernel to determine how to extract fine/coarse grain parallelism (does any parallelism even exist?)

Amdahl’s Law!
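
The profiling step above can be illustrated with a small sketch. The slide names gprof, which targets compiled code; the example below instead uses Python's standard cProfile module as a stand-in. The functions my_app, load_input, and matrix_multiply are hypothetical placeholders for an application and its kernel, not code from the course.

```python
import cProfile
import io
import pstats


def matrix_multiply(a, b):
    """Naive O(n^3) matrix product; stands in for the compute-heavy kernel."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def load_input(n):
    """Cheap setup work that should show up as a small share of the run time."""
    return [[(i + j) % 7 for j in range(n)] for i in range(n)]


def my_app(n=40):
    """Hypothetical application: a little setup, then the kernel."""
    a = load_input(n)
    b = load_input(n)
    return matrix_multiply(a, b)


# Profile one run of the application.
profiler = cProfile.Profile()
profiler.enable()
result = my_app()
profiler.disable()

# Report the functions with the largest cumulative time.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```

In the printed report, the kernel should appear near the top of the cumulative-time ranking, which is the signal used to pick candidates for hardware acceleration.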

Amdahl’s Law: Example

• Application My_app
– Running time: 100 seconds
– Spends 90 seconds in matrix_mul()

• What is the maximum possible speedup of My_app if I place matrix_mul() in hardware?
– Remaining run time: 10 seconds = 10x faster

• What if the original My_app spends 99 seconds in matrix_mul()?
– Remaining run time: 1 second = 100x faster
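
The arithmetic behind these two answers can be sketched in a few lines. This is a minimal illustration of Amdahl's Law, assuming the kernel time shrinks by some factor while the rest of the program is unchanged; the function name amdahl_speedup and its parameters are chosen here for illustration.

```python
def amdahl_speedup(total_time, kernel_time, kernel_speedup=float("inf")):
    """Overall speedup when only the kernel is accelerated.

    kernel_speedup defaults to infinity, which gives the upper bound
    (the kernel takes zero time in hardware).
    """
    serial_time = total_time - kernel_time          # part that is not accelerated
    new_time = serial_time + kernel_time / kernel_speedup
    return total_time / new_time


best_case_90 = amdahl_speedup(100, 90)              # kernel is 90 of 100 seconds
best_case_99 = amdahl_speedup(100, 99)              # kernel is 99 of 100 seconds
realistic_90 = amdahl_speedup(100, 90, kernel_speedup=10)

print(best_case_90)   # 10.0
print(best_case_99)   # 100.0
print(realistic_90)   # about 5.26
```

The last line shows why the bound matters: even a 10x faster kernel yields only about a 5.26x overall speedup when 10 seconds of the program stay in software.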

Good recent FPGA paper that illustrates increasing an algorithm’s performance with Hardware

“NOVEL FPGA BASED HAAR CLASSIFIER FACE DETECTION ALGORITHM ACCELERATION”, FPL 2008

http://class.ece.iastate.edu/cpre583/papers/Shih-Lien_Lu_FPL2008.pdf

Reconfigurable Architectures

• RPF -> VIC (short slide)


Granularity

Granularity: Coarse Grain

• rDPA: reconfigurable Data Path Array
• Function units with programmable interconnects

Example:

ALU ALU ALU
ALU ALU ALU
ALU ALU ALU


Granularity: Fine Grain

• FPGA: Field Programmable Gate Array
• Sea of general purpose logic gates

CLB CLB CLB CLB
CLB CLB CLB CLB
CLB CLB CLB CLB
CLB CLB CLB CLB

CLB = Configurable Logic Block



Granularity: Trade-offs

Trade-offs associated with LUT size

Example: 2-LUT (4 = 2x2 bits) vs. 10-LUT (1024 = 32x32 bits)

[Figure: two 1024-bit areas, one labeled 2-LUT and one labeled 10-LUT. Microprocessor example: a block with inputs labeled A, B, and op (bus widths 4, 3, 3) and a 3-bit output, drawn three times.]
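
The bit counts in the example follow from the fact that a k-input LUT stores one output bit for each of its 2^k input combinations. The sketch below works through that arithmetic; it assumes, as the equal 1024-bit areas on the slide suggest, that area scales with the number of configuration bits. The function name lut_config_bits is chosen here for illustration.

```python
def lut_config_bits(k):
    """A k-input LUT stores one output bit per input combination: 2**k bits."""
    return 2 ** k


bits_2lut = lut_config_bits(2)     # 4 bits
bits_10lut = lut_config_bits(10)   # 1024 bits

# Assumption: area is proportional to configuration bits, so the area of one
# 10-LUT could instead hold this many 2-LUTs.
two_luts_per_10lut_area = bits_10lut // bits_2lut

print(bits_2lut, bits_10lut, two_luts_per_10lut_area)  # 4 1024 256
```

Under that assumption, the same silicon gives either one function of ten inputs or 256 independent functions of two inputs, which is the core of the granularity trade-off.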


Granularity: Trade-offs

Trade-offs associated with LUT size

Example: 2-LUT (4 = 2x2 bits) vs. 10-LUT (1024 = 32x32 bits)

[Figure: two 1024-bit areas, one labeled 2-LUT and one labeled 10-LUT]

Bit logic and constants

(A and “1100”) or (B or “1000”)

[Figure: 4-bit inputs A and B feeding AND and OR gates with constant 1 and 0 inputs; a marked region shows the area that was required using 2-LUTs]

It’s much worse: each 10-LUT only has one output.
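
A short enumeration shows why this expression suits small LUTs. The sketch below treats A and B as 4-bit values, finds which input bits each output bit actually depends on, and compares configuration-bit cost. It assumes one LUT per output bit in both cases; the helper names f and support are chosen here for illustration.

```python
A_MASK = 0b1100
B_MASK = 0b1000


def f(a, b):
    """(A and "1100") or (B or "1000") on 4-bit values."""
    return (a & A_MASK) | (b | B_MASK)


def support(out_bit):
    """Set of input bits that can change the given output bit."""
    deps = set()
    for a in range(16):
        for b in range(16):
            base = (f(a, b) >> out_bit) & 1
            for i in range(4):
                if ((f(a ^ (1 << i), b) >> out_bit) & 1) != base:
                    deps.add(("A", i))
                if ((f(a, b ^ (1 << i)) >> out_bit) & 1) != base:
                    deps.add(("B", i))
    return deps


supports = {bit: support(bit) for bit in range(4)}
max_inputs = max(len(s) for s in supports.values())

# Assumption: one LUT per output bit, each LUT has a single output.
bits_using_2luts = 4 * 2 ** 2     # four 2-LUTs
bits_using_10luts = 4 * 2 ** 10   # four 10-LUTs, mostly unused

print(supports)
print(max_inputs, bits_using_2luts, bits_using_10luts)  # 2 16 4096
```

No output bit depends on more than two input bits, so four 2-LUTs (16 configuration bits) cover the function, while four single-output 10-LUTs would spend 4096 bits on the same logic.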

Granularity: Example Architectures

• Fine grain: GARP

• Coarse grain: PipeRench


Granularity: GARP

[Figure: Garp chip block diagram with CPU, RFU, memory, I-cache, D-cache, and config cache. RFU labels: control (1), execution (16, 2-bit), N, PE (Processing Element).]

Example computations in one cycle:
• A<<10 | (b&c)
• (A-2*b+c)

Granularity: GARP

[Figure: Garp chip block diagram with CPU, RFU, memory, I-cache, D-cache, and config cache]

Impact of configuration size
• 1 GHz bus frequency
• 128-bit memory bus
• 512 Kbits of configuration size

On an RFU context switch, how long does it take to load a new full configuration?

4 microseconds (512 Kbits over a 128-bit bus is about 4096 transfers at 1 ns each)

An estimate of the time for the CPU to perform a context switch is ~5 microseconds.

~2x increase in context switch latency!!
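
The 4-microsecond figure can be reproduced with a short calculation. The sketch below assumes one 128-bit transfer per bus cycle and takes 1 Kbit as 1024 bits; both are assumptions made here, and the constant names are illustrative.

```python
BUS_FREQ_HZ = 1e9            # 1 GHz bus frequency
BUS_WIDTH_BITS = 128         # 128-bit memory bus
CONFIG_BITS = 512 * 1024     # 512 Kbits, assuming 1 Kbit = 1024 bits
CPU_SWITCH_US = 5.0          # estimated CPU context switch time from the slide

# Assumption: one full-width transfer completes every bus cycle.
transfers = CONFIG_BITS // BUS_WIDTH_BITS
load_time_us = transfers / BUS_FREQ_HZ * 1e6

# Context switch latency relative to a CPU-only switch.
slowdown = (CPU_SWITCH_US + load_time_us) / CPU_SWITCH_US

print(transfers)                 # 4096
print(round(load_time_us, 3))    # 4.096
print(round(slowdown, 2))        # 1.82
```

Adding roughly 4 microseconds to a 5-microsecond switch gives a ratio of about 1.8, which the slide rounds to a 2x increase.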

Granularity: GARP

Reference: “The Garp Architecture and C Compiler”
http://www.cs.cmu.edu/~tcal/IEEE-Computer-Garp.pdf

Granularity: PipeRench

• Coarse granularity

• Higher (higher) level programming

• Reference papers
– PipeRench: A Coprocessor for Streaming Multimedia Acceleration (ISCA 1999): http://www.cs.cmu.edu/~mihaib/research/isca99.pdf
– PipeRench Implementation of the Instruction Path Coprocessor (Micro 2000): http://class.ee.iastate.edu/cpre583/papers/piperench_Micro_2000.pdf

Granularity: PipeRench

[Figure: PipeRench fabric. Rows of PEs, each PE containing an 8-bit ALU and a register file; the rows are separated by interconnect and share a global bus.]


Granularity: PipeRench

[Figure: a 3x4 array of PEs and two pipeline-stage vs. cycle charts covering cycles 1 to 6. The first chart has pipeline stages 0 to 4. The second chart has only stages 0 to 2 and shows the same five stage numbers placed onto them, with each of the three stages reused as the cycles advance.]

Degree of Integration/Coupling

• Independent Reconfigurable Coprocessor
– Reconfigurable Fabric does not have direct communication with the CPU

• Processor + Reconfigurable Processing Fabric
– Loosely coupled on the same chip
– Tightly coupled on the same chip

Degree of Integration/Coupling

[Figure: system block diagram, repeated over several slides. It shows a CPU pipeline (Fetch, Decode, Execute, Memory, Write Back) with ALU and FPU, L1 cache, L2 cache, memory controller, main memory, DMA controller, and I/O controller (USB, PCI, PCI-Express, SATA; hard drive, NIC). Successive versions of the diagram add an RPF at different points in the system, a configuration interface (Config I/F), RPF I/O, and finally an RFU labeled alongside the ALU and FPU.]

MP2

[Figure: MP2 system diagram. Labels: FPGA, PC, Display.c, Ethernet (UDP/IP), Power PC, User Defined Instruction, Monitor, VGA.]


MP2 Notes

• MUCH less VHDL coding than MP1

• But you will be writing most of the VHDL from scratch

• The focus will be more on learning to read a specification (Power PC coprocessor interface protocol) and designing hardware that follows that protocol.

• You will be dealing with some pointer-intensive C code. It’s a small amount of C code, but somewhat challenging to get the pointer math right.


Lecture 3 notes / slides in progress

Granularity: PipeRench

• Scheduling virtual stages onto physical stages
• Partial/dynamic reconfiguration (each cycle)
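
The scheduling idea in these two bullets can be sketched as a small simulation. It assumes a simple round-robin policy in which one physical stage is reconfigured per cycle and virtual stage v lands on physical stage v mod P; that policy and the function name schedule are assumptions made here for illustration, not a description of the actual PipeRench hardware.

```python
def schedule(num_virtual, num_physical, num_cycles):
    """Return, per cycle, which virtual stage each physical stage holds.

    Assumed policy: each cycle configures the next virtual stage (in order,
    wrapping around) into the next physical stage (also wrapping around).
    None means the physical stage has not been configured yet.
    """
    physical = [None] * num_physical
    history = []
    for cycle in range(num_cycles):
        physical[cycle % num_physical] = cycle % num_virtual
        history.append(list(physical))
    return history


# Five virtual pipeline stages mapped onto three physical stages, six cycles.
history = schedule(num_virtual=5, num_physical=3, num_cycles=6)
for cycle, contents in enumerate(history, start=1):
    print(cycle, contents)
```

With five virtual and three physical stages, the fourth cycle overwrites physical stage 0 with virtual stage 3 while stages 1 and 2 keep running, which is the partial reconfiguration the second bullet refers to.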

Granularity: GARP

• Impact of configuration size on performance
– Context switching

• Garp features
– Dynamically reconfigurable
– Stores multiple configurations in an on-chip cache (4)
– One configuration at a time

• Example app mapping to GARP (loop)
– Amdahl’s Law

The Garp Architecture and C Compiler
• http://www.cs.cmu.edu/~tcal/IEEE-Computer-Garp.pdf

Overview

• Dimensions
– Price
– Granularity
– Coupling
– To optimize app performance (compute (throughput, latency), power, reliability)

• RPF to efficiently implement VICs
– Main picture the author wants to convey

• What’s the point of having a reconfigurable architecture?
– Example (increase app performance)
  • App -> SW/CPU
  • Profile
  • ID kernels of intense compute
  • Design custom hardware/instruction (Amdahl’s Law)
– Intel FPL paper, great example for reading by Friday