Page 1: SCORES: A Scalable and Parametric Streams-Based Communication Architecture for Modular Reconfigurable Systems Abelardo Jara-Berrocal, Ann Gordon-Ross NSF.

SCORES: A Scalable and Parametric Streams-Based Communication Architecture for Modular Reconfigurable Systems

Abelardo Jara-Berrocal, Ann Gordon-Ross

NSF Center for High-Performance Reconfigurable Computing (CHREC)

Department of Electrical and Computer Engineering

University of Florida


Introduction – Parallel Computation

[Figure: design flow in three steps. 1. System formulation of a high-performance application. 2. Application decomposition into a graph of seven tasks, numbered 1 to 7 (Source, FIR, FFT, Matrix, IFFT, Angle, Sink); edges indicate communication volume (values shown: 4000, 15000, 15000, 82500, 40000, 4000, 15000). 3. Task allocation / system placement of tasks and data onto modules (uProc, MEM, DSP1, ASIC, DSP2).]

To leverage parallel computation speedups, a system can be decomposed into smaller tasks

Parallel communication

How do designers provide efficient module communication?

Problem: Speedup can be limited by inefficient communication!

Profile 1: DSP: 0.5 ms, uProc: 2.2 ms

Profile 2: ASIC: 0.5 ms, DSP: 2.5 ms
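As a rough illustration of why task allocation matters for communication, the sketch below sums the communication volume that crosses module boundaries for a given task-to-module mapping. The task graph, its edge volumes, both allocations, and the function name `inter_module_volume` are hypothetical placeholders, not values read off the figure.

```python
from typing import Dict, Tuple

# Hypothetical task graph: (producer task, consumer task) -> communication volume.
# These numbers are illustrative only and are not taken from the slide's figure.
EDGES: Dict[Tuple[str, str], int] = {
    ("source", "fir"): 4000,
    ("fir", "fft"): 15000,
    ("fft", "sink"): 4000,
}


def inter_module_volume(edges, allocation):
    """Sum the volume of every edge whose endpoints sit on different modules."""
    return sum(
        volume
        for (src, dst), volume in edges.items()
        if allocation[src] != allocation[dst]
    )


# Hypothetical allocation A: FIR and FFT share one DSP, so their edge stays local.
alloc_a = {"source": "uProc", "fir": "DSP1", "fft": "DSP1", "sink": "uProc"}
# Hypothetical allocation B: every task sits on its own module.
alloc_b = {"source": "uProc", "fir": "DSP1", "fft": "DSP2", "sink": "MEM"}

volume_a = inter_module_volume(EDGES, alloc_a)
volume_b = inter_module_volume(EDGES, alloc_b)
print(volume_a, volume_b)
```

Under these made-up numbers, keeping the heaviest edge inside one module removes most of the traffic the communication architecture has to carry.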


Communication Architectures

[Figure: a) Bus and b) Network-on-Chip, each connecting uProc, MEM, DSP1, ASIC, and DSP2; in b) every module attaches to a NoC node.]

Bus

Advantages:
• Very well known
• Smaller hardware overhead
• SoC standards: CoreConnect®, AMBA®, Wishbone

Disadvantages:
• Does not scale well as number of modules increases
• High power consumption due to long wires
• Cross-talk issues

Network-on-Chip (NoC)

Advantages:
• Scalable
• Very high bandwidth
• Wires are broken into smaller segments
• Multiple and simultaneous parallel communications

Disadvantages:
• Significant area overhead
  – Exacerbated by store-and-forward routers
• Interfaces between modules and nodes are not standard
  – Specific signals and handshaking protocols for each design


General NoC architecture

• NoC Interface
• NoC Link
• NoC Node
  – Routers (packet switching)
  – Switches (circuit switching)
• NoC Topology
  – Varies across designs
  – Commonly 2D mesh or torus [1]

[Figure: example NoC connecting MEM, uProc, DSP1, ASIC, DSP2, and an I/O slave.]

[1] Salminen et al. Survey of Network-on-Chip Proposals. White Paper, OCP-IP, March 2008


Motivation

• Relevant NoC metrics:
  • Throughput
  • Latency
  • Area
  • Power
• 2D Mesh NoC
  • High throughput
  • Low latency
  • High communication parallelism
• Due to these advantages, some commercial 2D NoCs for ASICs have appeared:
  • Arteris®
• How about NoC implementations in FPGAs?
  • FPGAs are increasingly used in digital designs
    – Reconfigurable
    – Lower cost than ASICs
  • NoC area overhead becomes a problem
    – Area of a 3x3 2D Mesh NoC consumed 28.72% of a Xilinx V2P30 [2] (for maximum throughput of 9.5 Gbps for complete 3x3 2D NoC)
  • Problem is exacerbated with low capacity & low cost FPGA devices

[Figure: 3x3 2D mesh NoC with nodes N1 to N9, each node attached to a module; labeled "Arteris NoC".]

[2] B. Sethuraman, P. Bhattacharya, J. Khan, R. Vemuri. LiPaR: A light-weight parallel router for FPGA-based networks-on-chip. ACM Great Lakes Symposium on VLSI, 2005, pp. 452-457


SCORES - Contributions

• SCORES = Scalable Communication Architecture for Reconfigurable Embedded Systems
• Main contributions:
  • High throughput / bandwidth
    – Circuit switching scheme
  • Low area overhead
    – Linear topology
  • Multiple clock domains
  • Scalability
    – VHDL model with numerous architectural parameters
    – Allows customization for different SoC communication needs
• Implemented in a Xilinx Virtex-4 VLX25 FPGA

[Figure: reconfigurable device (FPGA) with Modules 1 to 3, each attached to SCORES through an interface. The modules run on clk1, clk2, and clk3 and SCORES runs on scores-clk (different clock domains).]


SCORES – Top Level Design

• SCORES main components:
  • Switches – communication nodes inside SCORES
  • Interfaces – communication between modules and SCORES
  • Channels – communication links between switches and other switches or interfaces
• Modules access interfaces through local input ports and local output ports

[Figure: reconfigurable device (FPGA) with Modules 1 to 3 on clk1, clk2, and clk3; each module connects through an interface to a switch inside SCORES, which runs on clk.]


SCORES – Parametric Architecture

[Figure: SCORES switch and interfaces connecting Modules 1 to 4, annotated with the parameters below.]

• kl – number of left switch channels
• kr – number of right switch channels
• ko – number of local output ports from the interface
• ki – number of local input ports to the interface

Additional parameters:
• N = number of modules
• W = width of a channel in bits

Parameters enable SCORES to conform to custom communication requirements
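The parameter set above can be pictured as a small configuration record. The following is a minimal sketch, not the authors' VHDL generics: the class name `ScoresParams`, the field names, the default values, the "positive integer" rule, and the helper `wires_per_direction` are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoresParams:
    """Hypothetical container for the SCORES parameters named on the slide."""
    n_modules: int      # N: number of modules
    width_bits: int     # W: width of a channel in bits
    kl: int = 1         # number of left switch channels
    kr: int = 1         # number of right switch channels
    ki: int = 1         # number of local input ports to the interface
    ko: int = 1         # number of local output ports from the interface

    def __post_init__(self):
        # Assumed sanity rule: every parameter must be a positive integer.
        for name, value in vars(self).items():
            if not isinstance(value, int) or value < 1:
                raise ValueError(f"{name} must be a positive integer, got {value!r}")

    def wires_per_direction(self, direction: str) -> int:
        """Data wires leaving a switch in one direction (control signals ignored)."""
        channels = {"left": self.kl, "right": self.kr}[direction]
        return channels * self.width_bits


# Example: 4 modules, 32-bit channels, 2 left and 2 right switch channels.
params = ScoresParams(n_modules=4, width_bits=32, kl=2, kr=2)
right_wires = params.wires_per_direction("right")
print(right_wires)
```

Counting data wires this way only hints at how the parameters trade area for parallel channels; the real cost depends on the switch logic, which this sketch does not model.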


SCORES – Terminology

• Producer: module which transmits data
• Consumer: module which receives data
• Streaming Data Channel (SDC):
  • Dedicated path between a producer and a consumer
  • Dynamically created and destroyed inside SCORES
  • Bidirectional path
    – Data flows from producer to consumer
    – Control synchronization signals flow from consumer to producer

[Figure: Modules 1 to 4 attached to SCORES through interfaces, with a Streaming Data Channel (SDC) linking a producer to a consumer.]


SCORES – Communication Phases

• Three communication phases
• Phase I: Channel establishment
  • Producer requests a path to the consumer
  • Path iteratively created inside switches between the producer and the consumer
  • If a switch has no available channels
    – Sends a DENY signal to the producer
    – Producer can drop or maintain the request
  • If successful, the Streaming Data Channel (SDC) is created between the producer and the consumer

[Figure: Modules 1 to 4 attached to SCORES through interfaces, with an SDC being established from a producer to a consumer.]
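Channel establishment on a linear topology can be sketched as hop-by-hop reservation. This is a toy model under stated assumptions, not the SCORES switch FSMs: modules are numbered from 0, each link between neighbouring switches has `kr` rightward and `kl` leftward channels, a denied request releases whatever it had already reserved, and the names `LinearScores`, `establish`, `release`, and the `"GRANT"` result are invented here. Only the DENY signal comes from the slide.

```python
class LinearScores:
    """Toy model: switch i links to switch i+1 with kr rightward and kl leftward channels."""

    def __init__(self, n_modules: int, kl: int = 1, kr: int = 1):
        self.n = n_modules
        # free[direction][i] = free channels on the link between switch i and i+1
        self.free = {
            "right": [kr] * (n_modules - 1),
            "left": [kl] * (n_modules - 1),
        }

    def _route(self, producer: int, consumer: int):
        """Return (direction, link indices in the order the request visits them)."""
        valid = 0 <= producer < self.n and 0 <= consumer < self.n
        if not valid or producer == consumer:
            raise ValueError("producer and consumer must be distinct valid modules")
        if producer < consumer:
            return "right", list(range(producer, consumer))
        return "left", list(range(producer - 1, consumer - 1, -1))

    def establish(self, producer: int, consumer: int) -> str:
        """Phase I: reserve one channel per link, hop by hop; DENY on the first busy link."""
        direction, links = self._route(producer, consumer)
        reserved = []
        for link in links:
            if self.free[direction][link] == 0:
                for taken in reserved:      # assumed behaviour: undo the partial path
                    self.free[direction][taken] += 1
                return "DENY"
            self.free[direction][link] -= 1
            reserved.append(link)
        return "GRANT"

    def release(self, producer: int, consumer: int) -> None:
        """Phase III: free the channels hop by hop (caller must hold the SDC)."""
        direction, links = self._route(producer, consumer)
        for link in links:
            self.free[direction][link] += 1


net = LinearScores(n_modules=4, kl=1, kr=1)
first = net.establish(0, 2)     # takes the rightward channel on links 0 and 1
second = net.establish(1, 3)    # needs link 1 rightward, which is already taken
third = net.establish(3, 2)     # leftward traffic uses separate channels
net.release(0, 2)
fourth = net.establish(1, 3)    # succeeds once the first SDC is released
print(first, second, third, fourth)
```

Raising `kr` or `kl` in this model lets overlapping requests succeed at the same time, which is the trade-off the channel-count parameters expose.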


SCORES – Communication Phases

• Phase II: Streaming transmission
  • Pipelined operation
  • If consumer buffer is full
    – Consumer asserts “Full” to inform producer to pause transmission
  • Interfaces built around asynchronous FIFOs
    – Eases crossing different clock domains
• Phase III: Channel release
  • Producer deasserts its request
  • Path between the producer and the consumer is iteratively destroyed

[Figure: Modules 1 to 4 attached to SCORES through interfaces, with an SDC from a producer to a consumer and a register marked on the path.]
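The "Full" backpressure of Phase II can be illustrated with a cycle-based toy model. It is a sketch under simplifying assumptions: a single shared clock (the slide's asynchronous FIFOs and separate clock domains are not modelled), one word offered per cycle, and a consumer that pops one word every `consumer_period` cycles. The function name `stream` and its parameters are hypothetical.

```python
from collections import deque


def stream(words, fifo_depth, consumer_period):
    """Cycle-based toy model of Phase II with 'Full' backpressure.

    The producer offers one word per cycle; the consumer pops one word every
    `consumer_period` cycles. Returns (received words, cycles the producer stalled).
    """
    fifo = deque()
    pending = deque(words)
    received = []
    stalls = 0
    cycle = 0
    while pending or fifo:
        # Consumer side: drain one word on its own (possibly slower) schedule.
        if fifo and cycle % consumer_period == 0:
            received.append(fifo.popleft())
        # 'Full' is sampled after the consumer has had its chance to pop.
        full = len(fifo) >= fifo_depth
        if pending:
            if full:
                stalls += 1                 # producer pauses while Full is asserted
            else:
                fifo.append(pending.popleft())
        cycle += 1
    return received, stalls


data = [1, 2, 3, 4, 5, 6]
slow_received, slow_stalls = stream(data, fifo_depth=2, consumer_period=2)
fast_received, fast_stalls = stream(data, fifo_depth=2, consumer_period=1)
print(slow_stalls, fast_stalls)
```

In both runs every word arrives in order; the only difference is how often the producer has to pause.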


SCORES – Simultaneous Data Transfers

[Figure: Switches 1 to 4, each with an interface; the switch internals show input registers, MUXes, and a free channel.]

• Set of FSM controllers running at each switch
• Allows SCORES to establish and operate multiple SDCs in parallel


Results – Clock Frequency

[Figure: three plots of clock frequency (MHz) versus: number of right switch channels (Kr), with 1 left switch channel; number of left and right switch channels (Kr, Kl), with 1 local input and 1 local output port per switch; number of local input and output ports (Ki, Ko) per switch, with 1 left and 1 right switch channel.]

• The achieved maximum SCORES clock frequency determines the maximum SCORES throughput

A customized SCORES switch with 32-bit channels, 2 left and right switch channels, and 1 local input and 1 local output port operates at 254 MHz (throughput = 8.0 Gbps, post place-and-route timing report).
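The link between clock frequency and throughput can be checked with one line of arithmetic. The sketch below assumes a channel moves one W-bit word per clock cycle; the slide does not state how its 8.0 Gbps figure was derived, so that assumption and the function name `channel_throughput_gbps` belong to this sketch only.

```python
def channel_throughput_gbps(width_bits: int, clock_mhz: float) -> float:
    """Peak throughput of one channel, assuming one word per clock cycle."""
    return width_bits * clock_mhz * 1e6 / 1e9


# 32-bit channel at the reported 254 MHz post place-and-route frequency.
estimate = channel_throughput_gbps(32, 254)
print(f"{estimate:.3f} Gbps")
```

Under this assumption the estimate comes out at roughly 8.1 Gbps, in the same range as the 8.0 Gbps quoted on the slide.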


Results - Area

[Figure: three plots of area (slices) versus: number of right switch channels (Kr), with 1 left switch channel; number of left and right switch channels (Kr, Kl), with 1 local input and 1 local output port per switch; number of local input and output ports (Ki, Ko) per switch, with 1 left and 1 right switch channel.]

A customized SCORES switch with 32-bit channels, 2 left and right switch channels, and 1 local input and 1 local output port consumes 315 slices (1.41% of Virtex-4 VLX25).


Conclusions

• We developed SCORES (Scalable Communication Architecture for Reconfigurable Embedded Systems), a highly parametric communication architecture
• SCORES contributions:
  – Low area overhead (315 slices for a 32-bit switch with multiple ports)
  – Modules can run at different and independent clock frequencies
  – Highly parametric design, which enables architecture optimization
• Future work
  – Optimization of switch FSM controllers
  – Development of algorithms for module placement inside SCORES
  – Tools for automatic determination of SCORES parameter values


Questions

