Basic Network-on-Chip (BANC) interconnection for Future Gigascale MCSoCs Applications: Computation & Communication Orthogonalization

Abderazek Ben Abdallah, Masahiro Sowa

TJASSST2006, Sousse, Dec. 4-9, 2006

Graduate School of Information Systems, DPL Laboratory
The Univ. of Electro-communications, Tokyo, Japan

[email protected]

MCSoCs – introduction

• Deep sub-micron technologies enable the implementation of single chips integrating:
  - Multiple software programmable processors
  - Dedicated hardware components (cores)
• Multicore designs are emerging as a key solution for today's nanoelectronics problems
• MCSoCs are driven by:
  - Wireless communication, distributed/broadband computing, multimedia applications, etc.
• ITRSC predicted that an IC will have billions of transistors by 2012

Global on-chip communication delay

• Moore's law provides exponential growth of resources
• Design does not become easier
• Deep submicron problems
  - Wire vs. transistor speed, power, signal integrity
• Design productivity gap
• IP re-use, platforms
• Verification technologies

MCSoC – Computation view: the rest of the story!

sys_performance = f(com_type_performance, compiler_performance, underlying_HW_performance, RTOS_performance) + ...   (1)

com_type_performance = f(machine_dependence, code_dependence)   (2)

code_dependence = f(application_type, QoC, optimizations_effort)   (3)

QoC = f(inst_generation_scheme, optimization_type)   (4)

Our new approach:
• Improves the QoC -> SW view
• No aggressive hardware techniques -> HW view

The QueueCore processor *

* [abderazek06] B. A. Abderazek, T. Toshinaga, M. Sowa: ICPP 2006, Columbus, USA, August 2006

Wire based communication problems

• They are unstructured and have parasitic capacitance and crosstalk to adjacent wires:
  - Difficult to predict them early in the design process
  - May differ significantly from one run of the router to the next.
• Solution: Full-swing static CMOS gates (or inverters) are employed
• The problem of the solution: High delay and high power dissipation [Lee2006]
• Long wires require repeaters at periodic intervals to keep their delay linear.
• Properly placing these repeaters is difficult and places additional constraints
• More complex with each successive technology scaling
• Average wire on a typical chip is used less than 10% of the time [Dally2000]
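
The repeater argument can be illustrated with a first-order model. The sketch below is not from the talk: it assumes the common distributed-RC approximation in which an unrepeated wire of length L has a delay of about 0.38·r·c·L², and it treats each repeater as adding a fixed delay `t_rep`. The function names and all parameter values are hypothetical, chosen only to show the quadratic versus linear trend.

```python
def unrepeated_delay(length_mm, r_per_mm, c_per_mm):
    """Delay of a distributed RC wire: grows with the square of its length."""
    return 0.38 * r_per_mm * c_per_mm * length_mm ** 2


def repeated_delay(length_mm, r_per_mm, c_per_mm, segment_mm, t_rep):
    """Delay when a repeater is inserted every `segment_mm` millimetres."""
    segments = length_mm / segment_mm
    per_segment = unrepeated_delay(segment_mm, r_per_mm, c_per_mm) + t_rep
    return segments * per_segment


# Hypothetical values: 100 ohm/mm, 0.2 pF/mm, 20 ps per repeater, 1 mm segments.
R, C, T_REP, SEG = 100.0, 0.2e-12, 20e-12, 1.0

for length in (1.0, 2.0, 4.0, 8.0):
    plain = unrepeated_delay(length, R, C)
    buffered = repeated_delay(length, R, C, SEG, T_REP)
    print(f"{length:4.1f} mm  unrepeated {plain * 1e12:7.1f} ps"
          f"  repeated {buffered * 1e12:7.1f} ps")
```

Under these assumptions, doubling the wire length quadruples the unrepeated delay but only doubles the repeated one, at the cost of the extra repeater area and power noted on the slide.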

The packet based communication approach (1)

• Resources (cores) use packets (not dedicated wires) to communicate with each other
• Each resource is placed in a square tile on the chip
• The clients communicate with each other via the network

The packet based communication approach (2)

• Organization (structure)
  - The wires' electrical properties are optimized and well controlled.
  - Low and predictable cross-talk
  - Reduced power dissipation
• Performance
  - Sharing: When one client is idle, other clients continue to make use of the network resources.
• Modularity
  - A standard interface is defined in much the same manner as a backplane bus.

NoCs challenges

• On-chip network design is different from conventional inter-chip design.
  - Wires and pins are more abundant than in inter-chip networks
  - Buffer space is less abundant (this talk focuses on this point only)
• What topologies are best suited to the huge wiring resources available on chip?
• What flow control schemes reduce buffer size and routing overhead?

Communication performance issues in NoC

• Communication performance involves:
  - Topology: how cores (nodes) and switches are interconnected
  - Routing: how to determine the route from source to destination
  - Switching strategy: how a message traverses the route
    (circuit, packet, store and forward, wormhole switching…?)
  - Flow control: schedules (resource allocation) the traversal of the message.

BANC layers

• Application layer: application-to-application
• Session layer: process-to-process
• Network layer: resource-to-resource
• Data link layer: switch-to-switch and switch-to-resource
• Physical layer: switch-to-switch and switch-to-resource

Low level

• number of bits per link (channel dimension)
• number of links
• no pipelining
• data link packet = physical packet
• data link clock = physical clock
• single packet input buffer
• no error correction

Higher level

• network layer packet = link layer packet
• XY address routing
• input buffer
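
XY address routing is usually understood as dimension-order routing on a mesh: a packet first travels along the X axis until its column matches the destination, then along the Y axis. The sketch below assumes that reading. The function names, the coordinate convention, and the port labels are illustrative and are not taken from the BANC implementation.

```python
def xy_next_hop(current, destination):
    """Return the output port for one hop of dimension-order (XY) routing.

    Coordinates are (x, y) tuples on a 2D mesh. Port names are hypothetical.
    """
    cx, cy = current
    dx, dy = destination
    if cx != dx:                      # correct the X coordinate first
        return "EAST" if dx > cx else "WEST"
    if cy != dy:                      # then correct the Y coordinate
        return "NORTH" if dy > cy else "SOUTH"
    return "LOCAL"                    # arrived: deliver to the attached resource


def xy_route(source, destination):
    """List the switches visited from source to destination, inclusive."""
    step = {"EAST": (1, 0), "WEST": (-1, 0), "NORTH": (0, 1), "SOUTH": (0, -1)}
    path = [source]
    node = source
    while (port := xy_next_hop(node, destination)) != "LOCAL":
        node = (node[0] + step[port][0], node[1] + step[port][1])
        path.append(node)
    return path


print(xy_route((0, 0), (2, 1)))
```

Because every packet resolves X before Y, the route depends only on the source and destination addresses, so a switch needs no routing table.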

Stack layers and the switch in BANC

Interconnection of a “tile” in BANC

• The Network Layer is implemented by the network interface (NI)
• The adapter is needed to connect the core (resource) to the network

Packet format in BANC

BANC's switch basic interconnection

BANC features

• Here, we want to analyze the buffer size design only.
• We described the BANC in TCL and used “ns-2” from Berkeley
• BANC features:
  - 5x5 mesh grid (50 components)
  - Duplex connection link (simultaneous transfer in both ways)
    • Adjustable delay and bandwidth
  - A FIFO at each input port
  - “Droptail” scheme for buffer overflow
  - RNG is used to select X and Y coordinate randomly
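
A drop-tail input buffer discards an arriving packet when the FIFO is already full and leaves the queued packets untouched. The following sketch is a minimal model of that behaviour together with random (X, Y) destination selection on a 5x5 mesh. It is not the authors' TCL/ns-2 description; the class name, method names, packet layout, and seed are assumptions made for illustration.

```python
import random
from collections import deque


class DropTailFifo:
    """Fixed-capacity FIFO that drops newly arriving packets when full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.queue = deque()
        self.dropped = 0

    def enqueue(self, packet):
        if len(self.queue) >= self.capacity:
            self.dropped += 1          # drop-tail: the newcomer is discarded
            return False
        self.queue.append(packet)
        return True

    def dequeue(self):
        return self.queue.popleft() if self.queue else None


def random_destination(rng, mesh_size=5):
    """Pick X and Y coordinates uniformly at random on the mesh."""
    return rng.randrange(mesh_size), rng.randrange(mesh_size)


rng = random.Random(2006)              # arbitrary seed, for repeatability
fifo = DropTailFifo(capacity=8)
for _ in range(10):                    # burst of 10 packets, none served
    fifo.enqueue({"dest": random_destination(rng)})
print(len(fifo.queue), "queued,", fifo.dropped, "dropped")
```

With a capacity of 8 and a burst of 10 unserved packets, the last two arrivals are the ones dropped.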

Effect of buffer size on drop probability

1. The packet drop probability decreases when the buffer size increases.
2. For higher traffic rates (>= 120 Mb/s), the drop probability does not decrease significantly as the buffer size increases.

Effect of communication load and drop probability

1. The drop probability increases as the communication load increases.
2. Increasing the buffer size cannot significantly compensate for the increase in drop probability.
3. The drop probability is more sensitive to the communication load than to the buffer size.
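
These trends are consistent with elementary queueing theory. As a rough analytic cross-check, and not a substitute for the ns-2 simulation, the sketch below models a single input buffer as an M/M/1/K queue (Poisson arrivals, exponential service, room for K packets in total). For that model the blocking probability at load rho is (1 - rho) * rho^K / (1 - rho^(K+1)). The choice of model, the function name, and the load values in the loop are assumptions.

```python
def drop_probability(load, capacity):
    """Blocking probability of an M/M/1/K queue with K = capacity."""
    if load == 1.0:
        return 1.0 / (capacity + 1)
    return (1.0 - load) * load ** capacity / (1.0 - load ** (capacity + 1))


for load in (0.3, 0.5, 0.9, 1.2):
    row = "  ".join(f"K={k}: {drop_probability(load, k):.4f}" for k in (2, 4, 8, 16))
    print(f"load {load:.1f}  {row}")
```

In this model a buffer of 8 packets at a load of 0.5 drops about 0.2% of arrivals, while for loads above 1 the drop probability cannot fall below 1 - 1/load however large the buffer is made.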

Packet delay and communication load over buffer sizes

1. The packet delay is < 1 ms when the communication load equals 0.3019.
   • The buffer is little utilized when the communication load is low.
2. Packet delay is not sensitive to the communication load when some packets are dropped.

Concluding remarks

I. QueueCore is a good candidate for NoC resources
II. BANC architecture
III. Buffer
  • For < ½ load, the drop probability is almost zero for a buffer size of 8 packets in each switch
  • The delay in the queue is an important part of the message delay
  • Message delay is more sensitive to buffer size than to communication load
  • The drop probability is more sensitive to the communication load than to buffer size.

Thank you. Questions?

