The CMU Reconfigurable Computing Project

Page 1: The CMU Reconfigurable Computing Project

SSS 4/9/99 CMU Reconfigurable Computing 1

The CMU Reconfigurable Computing Project

April 9, 1999
Mihai Budiu

[email protected]

Page 2: The CMU Reconfigurable Computing Project

Current Project Members

ECE Department
Herman Schmit
Srihari Cadambi
Matt Moe
Robert Taylor
Ronald Laufer

CS Department
Seth Copen Goldstein
Mihai Budiu

Page 3: The CMU Reconfigurable Computing Project

Why Study Reconfigurable Hardware?

It is a nice computation paradigm (wire your own computer)

Page 4: The CMU Reconfigurable Computing Project

Why Study Reconfigurable Hardware

Algorithm            Year  System         Versus               Speedup (x)
DNA matching         1992  SPLASH 2       SPARC 10             4300
FIR Filter           1998  PipeRench      UltraSparc 300 MHz   90
IDEA Encryption      1998  PipeRench      UltraSparc 300 MHz   61
SAT solver           1997  Pamette        SPARC 5 110 MHz      17-1100
Ray Casting          1995  RIPP-10        Pentium 75 MHz       33.8
Hidden Markov Model  1996  1 Xilinx FPGA  SPARC 10             24.4
DES Encryption       1996  GARP           UltraSparc 170 MHz   24
SPEC92               1994  MIPS+RC        MIPS                 1.22

Page 5: The CMU Reconfigurable Computing Project

Commercial Players

Source: In-stat April 1998
*Does not include software, hardwire or support EPROMs

Page 6: The CMU Reconfigurable Computing Project

What Is “Reconfigurable Hardware?”

Universal gates and/or storage elements
Interconnection network
Switches

Page 7: The CMU Reconfigurable Computing Project

Basic Ingredient: RAM cell

Universal gate = RAM

[Figure: a RAM holding the bits 0 0 0 1, addressed by inputs a0 and a1; the output is labeled "data" and "a1 & a2"]
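
The "universal gate = RAM" idea can be sketched in a few lines of Python. This is only an illustration: it assumes a two-input cell whose four stored bits are 0, 0, 0, 1 (the contents shown on the slide) and whose address is formed from the two inputs. The names `lut2`, `AND_CONTENTS` and `XOR_CONTENTS`, and the choice of which input is the high address bit, are hypothetical and not taken from the talk.

```python
def lut2(contents, a0, a1):
    """Read a 4-entry, 1-bit-wide RAM; the two inputs form the address."""
    address = (a1 << 1) | a0  # assumed bit order: a1 is the high address bit
    return contents[address]


AND_CONTENTS = [0, 0, 0, 1]  # the bits shown on the slide: output is 1 only at address 3
XOR_CONTENTS = [0, 1, 1, 0]  # rewriting the stored bits turns the same cell into XOR

# Print the truth table of both configurations: a0, a1, AND, XOR
for a1 in (0, 1):
    for a0 in (0, 1):
        print(a0, a1, lut2(AND_CONTENTS, a0, a1), lut2(XOR_CONTENTS, a0, a1))
```

The point of the sketch is that the "gate" is never rewired: only the stored bits change, which is what makes the cell reconfigurable.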

Page 8: The CMU Reconfigurable Computing Project

Basic Ingredients (ctd)

A switch is controlled by a 1-bit RAM cell

[Figure: switches with control bits 0, 1, 1, 1]

Page 9: The CMU Reconfigurable Computing Project

Outline

• What is reconfigurable hardware
• RH vs other computation paradigms
• Challenges in RH research
• PipeRench: the CMU project:
  – the hardware
  – the software
• Conclusions

Page 10: The CMU Reconfigurable Computing Project

RH vs ASICs

• Generally Application-Specific Integrated Circuits will be faster than RH:
  – RH wires are slow & big
  – RH bit-slices are costly to interconnect
  – RH devices must store configuration on the chip

but

• RH can be reprogrammed
  – new algorithms
  – to fix bugs
• RH cheaper in small production
• RH tolerates faults better
• RH sometimes faster with staged computation

Page 11: The CMU Reconfigurable Computing Project

RH vs Microprocessors

• RH less flexible (like a VLIW with fixed instructions)

but

• RH provides more (customized) computation elements
• RH can decrease memory traffic
• RH can be tailored for specific algorithms and data types

RH will not replace µPs, but complement them

Page 12: The CMU Reconfigurable Computing Project

Types of RH

• FPGAs: bit-level logic functionality
  (the basic processing elements compute on 1 bit)
• word-based architectures: PipeRench (CMU)
  (basic PE operates on 8 bits)
  (basic PE is a small ALU)
• coarse architectures: RAW (MIT)
  (basic PE is a MIPS 2000 core)

Page 13: The CMU Reconfigurable Computing Project

RH In A System

[Figure "coupling": image not preserved]

Page 14: The CMU Reconfigurable Computing Project

Challenges In RC

• Software tools:
  – Programming RC like software development
  – Automatic compilation from HLL
  – Automatic program partitioning
• Mapping algorithms efficiently (no ISA)
• System issues
  – interfaces
  – find “ideal” RC fabric

Page 15: The CMU Reconfigurable Computing Project

The CMU Reconfigurable Computing Project

Page 16: The CMU Reconfigurable Computing Project

Hardware Goals

• To build a complete reconfigurable hardware device
• To build the system integration hardware
• To host the device in a PC

Page 17: The CMU Reconfigurable Computing Project

Our Device:

• Word processing elements
• Pipelined architecture
• Virtualized hardware
• Local interconnection network
• Wide pipelined bus

Page 18: The CMU Reconfigurable Computing Project

[Figure labels: Configuration memory; Stripes; Data & Config controller; Processing elements]

Page 19: The CMU Reconfigurable Computing Project

Hardware Virtualization

[Figure labels: Program; Instructions currently in hardware; Instructions paged out; Actual available hardware]

Page 20: The CMU Reconfigurable Computing Project

Hardware Virtualization (2)

[Figure labels: Program in configuration memory; hardware; configure; compute; Page in; Page out]

Overlap configuration with computation.
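
One way to picture "overlap configuration with computation" is a small scheduling simulation. The sketch below rests on an assumed policy that the slides do not spell out: each cycle exactly one physical stripe is reconfigured with the next virtual stripe of the program, in round-robin order, while every other loaded stripe keeps computing. The function name `schedule` and the stripe counts used in the example are hypothetical.

```python
def schedule(virtual_stripes, physical_stripes, cycles):
    """Return, for each cycle, (slot being configured, contents of the fabric).

    Assumed policy: one physical stripe is paged in per cycle, round-robin,
    overwriting (paging out) whatever that stripe held before.
    """
    fabric = [None] * physical_stripes  # None means the stripe is still empty
    history = []
    for cycle in range(cycles):
        slot = cycle % physical_stripes
        fabric[slot] = cycle % virtual_stripes  # page in the next virtual stripe
        history.append((slot, list(fabric)))
    return history


# A 5-stripe program running on hardware with only 3 physical stripes.
for cycle, (slot, fabric) in enumerate(schedule(5, 3, 7)):
    cells = []
    for index, stripe in enumerate(fabric):
        if stripe is None:
            cells.append("idle")
        elif index == slot:
            cells.append(f"configure {stripe}")
        else:
            cells.append(f"compute {stripe}")
    print(cycle, cells)
```

Once the fabric is full, every cycle in this model has one stripe being configured and the remaining stripes computing, so configuration time is hidden behind useful work.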

Page 21: The CMU Reconfigurable Computing Project

Processing Elements

• Look-up table
• Any 3-to-1 function

[Figure labels: inputs a, b, Cin; output out; PE2, PE1, PE0]
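
As a rough illustration of "any 3-to-1 function" in a look-up table, the Python sketch below tabulates two 3-input functions and chains them through a carry signal, using the input names a, b and Cin from the figure. Splitting the work into one sum table and one carry table, and the helper names `make_lut3`, `read_lut3` and `ripple_add`, are assumptions made for the example; the slides do not describe the PE internals at this level.

```python
def make_lut3(func):
    """Tabulate any 3-input, 1-output Boolean function into 8 stored bits."""
    return [func(a, b, c) & 1 for a in (0, 1) for b in (0, 1) for c in (0, 1)]


def read_lut3(table, a, b, c):
    """Look up the stored bit; (a, b, c) form the 3-bit address."""
    return table[(a << 2) | (b << 1) | c]


# Two example configurations: the sum and carry-out of a full adder.
SUM = make_lut3(lambda a, b, cin: a ^ b ^ cin)
CARRY = make_lut3(lambda a, b, cin: (a & b) | (cin & (a ^ b)))


def ripple_add(x, y, width):
    """Add two width-bit numbers one bit at a time using only table look-ups."""
    carry = 0
    result = 0
    for i in range(width):
        a = (x >> i) & 1
        b = (y >> i) & 1
        result |= read_lut3(SUM, a, b, carry) << i
        carry = read_lut3(CARRY, a, b, carry)
    return result, carry


print(ripple_add(5, 3, 8))  # (8, 0)
```

Replacing the two lambdas with any other 3-input functions reconfigures the same structure for a different operation.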

Page 22: The CMU Reconfigurable Computing Project

The Interconnection Network

[Figure labels: word-level cross-bar; pass registers; PE 1 through PE N; widths of B bits, P*B bits, and P*B*N bits]

Page 23: The CMU Reconfigurable Computing Project

The PCI Board

[Figure chip.eps: image not preserved]

Page 24: The CMU Reconfigurable Computing Project

Software Goal

To program reconfigurable devices using the standard software development processes:

– Compile C or Java
– Do it quickly

[Figure labels: Java; Partitioner; DIL (Data-flow Intermediate Language); Configuration; Reconfigurable HW; CPU; Built]

Page 25: The CMU Reconfigurable Computing Project

Building Circuits From DIL

a = b + c * d;
e = c - d;

• variables → wires
• operators → gates

[Figure: inputs b, c, d feed a "*" gate and a "+" gate producing a, and a "-" gate producing e]
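
The "variables become wires, operators become gates" mapping can be sketched with a short Python program. This is not the DIL compiler: it borrows Python's own `ast` parser as a stand-in because the two statements on the slide happen to be valid Python, and the names `build_circuit`, `OPS` and the gate identifiers `g0`, `g1`, ... are invented for the illustration.

```python
import ast

# Operators supported by this sketch, mapped to the gate label they produce.
OPS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*"}


def build_circuit(source):
    """Return (gates, wires) for a sequence of simple assignments.

    gates: list of (gate_id, operator, left_wire, right_wire)
    wires: maps each assigned variable to the wire that carries its value
    """
    gates = []
    wires = {}

    def wire_for(node):
        if isinstance(node, ast.Name):
            # A name not assigned earlier is treated as a circuit input.
            return wires.get(node.id, node.id)
        if isinstance(node, ast.BinOp):
            left = wire_for(node.left)
            right = wire_for(node.right)
            gate_id = f"g{len(gates)}"
            gates.append((gate_id, OPS[type(node.op)], left, right))
            return gate_id
        raise ValueError("unsupported expression")

    for statement in ast.parse(source).body:
        wires[statement.targets[0].id] = wire_for(statement.value)
    return gates, wires


gates, wires = build_circuit("a = b + c * d; e = c - d;")
for gate in gates:
    print(gate)
print(wires)
```

For the slide's two statements this yields three gates (one multiplier, one adder, one subtractor), with the multiplier's output wired into the adder.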

Page 26: The CMU Reconfigurable Computing Project

Mapping Circuits To

[Figure: four panels, each showing a circuit with a "-" and a "+" operator over inputs a, b, c]

Page 27: The CMU Reconfigurable Computing Project

The DIL Compiler Front-End

[Figure labels: DIL input file; Parser; Evaluator; Loader; Loader; component library; Component circuits; Circuit; Backend]

Page 28: The CMU Reconfigurable Computing Project

The DIL Compiler Backend

[Figure labels: Front-end; Circuit (expanded); Optimizer; Placer-Router; Circuit; Circuit (placed); Code generator; Asm; C++; C++; xfig]

The whole compilation process is very fast (compared to classical CAD tools).

We can compile two orders of magnitude faster.

Page 29: The CMU Reconfigurable Computing Project

Processing Element Size Tradeoffs

Small                    Big
Efficient usage          Wasteful
Slower                   Faster bit-slice
Flexible interconnect    Coarse routing
Bigger configuration     Fewer configuration bits
Place and route easier   Constrains the compiler

Page 30: The CMU Reconfigurable Computing Project

Stripe Width Tradeoffs

Wider              Narrower
Fewer stripes      More will fit
Virtualize more    Fewer page-ins
Bandwidth waste    Less bandwidth available
Placer freedom     Placement constrained

Page 31: The CMU Reconfigurable Computing Project

Bus Width Tradeoffs

Wider            Narrower
More area        Less area
High bandwidth   Time-mux bus

Page 32: The CMU Reconfigurable Computing Project

Clock Speed Tradeoffs (run-time)

Faster                   Slower
Short critical path      Big chains
Long pipeline built      Compact circuits
Decomposition overhead   Little decomposition
Virtualized more         Less virtualized

[Figure labels: adders ("+") with 24-bit operands and adders with 8-bit operands]
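
The "decomposition" entry in this table can be illustrated with the 24-bit and 8-bit widths that appear in the figure labels. The Python sketch below splits one wide addition into a chain of narrow additions linked by a carry, which shortens the critical path of each step at the cost of more steps. The function name `add_decomposed` and its parameters are hypothetical, and the sketch models only the arithmetic, not timing or the actual PipeRench mapping.

```python
def add_decomposed(x, y, total_bits=24, pe_bits=8):
    """Add two total_bits-wide numbers using pe_bits-wide adders chained by carry.

    Returns (result, carry_out, number_of_narrow_additions).
    """
    mask = (1 << pe_bits) - 1
    carry = 0
    result = 0
    stages = 0
    for shift in range(0, total_bits, pe_bits):
        # Each narrow adder sees one slice of each operand plus the incoming carry.
        partial = ((x >> shift) & mask) + ((y >> shift) & mask) + carry
        result |= (partial & mask) << shift
        carry = partial >> pe_bits
        stages += 1
    return result, carry, stages


print(add_decomposed(0x00FFFF, 0x000001))  # (65536, 0, 3)
```

With 8-bit adders a 24-bit addition takes three chained steps; in a pipelined fabric those steps would occupy successive stages, which is the longer pipeline the "Faster" column refers to.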

Page 33: The CMU Reconfigurable Computing Project

Configuration Bits per Stripe

[Chart: configuration bits (y-axis, 0 to 1600) versus stripe width (x-axis, 64 to 144 in steps of 16); legend: PE bit width 2, 4, 8, 16, 32]

Page 34: The CMU Reconfigurable Computing Project

[Figure fir-throughput.eps: image not preserved]

Page 35: The CMU Reconfigurable Computing Project

Project Status

• Operational:
  – Behavioral and structural models of PipeRench in Verilog
  – Assembler, simulator
  – Tools for visualization and debugging
  – One tile fabricated and tested
  – Very fast compiler from intermediate language
• In progress:
  – Prototype PipeRench to be taped out this summer
  – PCI board to host PipeRench in a PC

Page 36: The CMU Reconfigurable Computing Project

Simulated Speed-up vs. UltraSparc @ 300 MHz

[Chart: log-scale y-axis from 1.0 to 1000.0; benchmarks on the x-axis: ATR, Cordic, DCT, FIR, IDEA, Nqueens, Over; bar values in extraction order: 328.8, 29.0, 20.6, 90.9, 61.8, 26.0, 76.1]

Page 37: The CMU Reconfigurable Computing Project

Future Work

• Build the PCI board
• Build the OS device drivers
• Start investigating HLL issues:
  – automatic partitioning
  – translation to DIL
  – special code transformations

Page 38: The CMU Reconfigurable Computing Project

Conclusions

• A set of important applications can benefit from RC devices
• RC offers potential for substantial performance improvement at a low cost
• RC devices will soon be mainstream in the embedded computing world; perhaps in the future they will also permeate the desktop Pentium V
