Efficient Link Capacity and QoS Design for Wormhole
Network-on-Chip
Zvika Guz, Isask’har Walter, Evgeny Bolotin, Israel Cidon, Ran Ginosar and
Avinoam Kolodny
Technion, Israel Institute of Technology
DATE’06 NoC Capacity Allocation 2
Problem Essence
How much capacity [bits/sec] should be assigned to each link?
- All flows must meet delay requirements
- Minimize total resources
Outline
Wormhole based NoC
The problem of link capacity allocation
Solution:
- Wormhole delay model
- Capacity allocation algorithm
Design examples
Summary
Wormhole Switching
Suits on-chip interconnect
Small number of buffers
Low latency
Virtual Channels
- Interleaving packets on the same link
NoC Design Flow
Define inter-module traffic
Place modules
Allocate link capacities
Verify QoS and cost
Too low capacity results in poor QoS
Too high capacity wastes power/area
Capacity Allocation Problem
Simulation takes too long: a simulation-based solution is not scalable
If no simulations are used:
- How to extract flows’ delays?
- How to reassign capacity?
Our solution:
- Analytical model to forecast QoS
- Capacity allocation algorithm that exploits the model
Delay Analysis
Approximate per-flow latencies
Given:
- Network topology
- Link capacities
- Communication demands
Why Previous Models Do Not Apply?
Because they assume:
- Symmetrical communication demands
- No virtual channels
- Identical link capacity!
Generally, they calculate the delay of an “average flow”
- A per-flow analysis is needed
Wormhole Delay Analysis
The delivery resembles a pipeline pass
Packet transmission can be divided into two separate phases:
- Path acquisition
- Packet delivery
We focus on the packet delivery phase
Packet Delivery Time
Packet delivery time is dominated by the slowest link
- Transmission rate
- Link sharing
[Figure label: Low-capacity link]
Analysis Basics
Determine the flow’s effective bandwidth
- Per link
- Account for interleaving
Single Hop Flow, no Sharing
$$t_j^i = \frac{1}{\frac{1}{l} C_j}$$
$t_j^i$ - mean time to deliver a flit of flow i over link j [sec]
$C_j$ - capacity of link j [bits per sec]
$l$ - flit length [bits/flit]
$\Lambda_j^i$ - total flit injection rate of all flows sharing link j except for flow i [flits/sec]
Single Hop Flow, with Sharing
$$t_j^i = \frac{1}{\frac{1}{l} C_j - \Lambda_j^i}$$
$t_j^i$ - mean time to deliver a flit of flow i over link j [sec]
$C_j$ - capacity of link j [bits per sec]
$l$ - flit length [bits/flit]
$\Lambda_j^i$ - total flit injection rate of all flows sharing link j except for flow i [flits/sec] (bandwidth used by other flows on link j)
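The two single-hop cases can be sketched in a few lines of Python. This is only an illustration of the expressions on these slides, not the authors' code: the function name, argument names, and the numbers in the example (a 1 Gbit/sec link with 16-bit flits) are assumptions.

```python
def flit_time(capacity_bps, flit_bits, other_flits_per_sec=0.0):
    """Mean time [sec] to deliver one flit of a flow over a single link.

    capacity_bps        -- link capacity [bits/sec]
    flit_bits           -- flit length [bits/flit]
    other_flits_per_sec -- injection rate of the other flows sharing
                           the link [flits/sec]; 0 means no sharing
    """
    service_rate = capacity_bps / flit_bits          # flits/sec the link carries
    residual = service_rate - other_flits_per_sec    # what is left for this flow
    if residual <= 0:
        raise ValueError("link is saturated by the other flows")
    return 1.0 / residual


# Hypothetical numbers, chosen only for illustration.
alone = flit_time(1e9, 16)             # no sharing
shared = flit_time(1e9, 16, 12.5e6)    # other flows inject 12.5 Mflits/sec
```

With these numbers the link carries 62.5 Mflits/sec, so the flit time grows from 16 ns without sharing to 20 ns with sharing.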
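The Convoy Effect
Consider inter-link dependencies
- Wormhole backpressure
- Traffic jams down the road
Basic delay of flow i on link j:
$$\hat{t}_j^i = \frac{1}{\frac{1}{l} C_j - \Lambda_j^i}$$
Delay including the subsequent hops:
$$t_j^i = \hat{t}_j^i + \sum_{k \mid k \in \pi_j^i} \frac{l \, \Lambda_k^i}{C_k} \cdot \frac{\hat{t}_k^i}{dist^i(j,k)}$$
$\frac{l \, \Lambda_k^i}{C_k}$ - link load
$\sum_{k \mid k \in \pi_j^i}$ - accounts for all subsequent hops
$\frac{\hat{t}_k^i}{dist^i(j,k)}$ - basic delay weighted by distance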
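A minimal Python sketch of the convoy correction follows: each link's basic delay is increased by the load of every subsequent link on the path, multiplied by that link's basic delay and divided by its distance. It assumes that the distance between two links is the number of hops separating them on the flow's path; the slides do not define it. All names and numbers are illustrative.

```python
def basic_flit_time(capacity_bps, flit_bits, other_rate):
    """Single-link flit time [sec] when other flows share the link."""
    return 1.0 / (capacity_bps / flit_bits - other_rate)


def convoy_flit_times(capacities, other_rates, flit_bits):
    """Per-link flit times of one flow, including the convoy effect.

    capacities[n], other_rates[n] describe the n-th link on the flow's
    path, ordered from source to destination.
    """
    n = len(capacities)
    basic = [basic_flit_time(c, flit_bits, r)
             for c, r in zip(capacities, other_rates)]
    times = []
    for j in range(n):
        extra = 0.0
        for k in range(j + 1, n):                 # all subsequent hops
            load = flit_bits * other_rates[k] / capacities[k]
            extra += load * basic[k] / (k - j)    # assumed distance: k - j hops
        times.append(basic[j] + extra)
    return times


# Hypothetical two-hop path: only the second link is shared.
times = convoy_flit_times([1e9, 1e9], [0.0, 12.5e6], flit_bits=16)
```

In this example the first link has a basic delay of 16 ns, but the loaded second link adds 0.2 x 20 ns = 4 ns, so both links end up at 20 ns.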
Total Packet Transmission Time
Weakest link dominates packet delivery time
$$T^i = m^i \cdot \max\left( t_j^i \mid j \in \pi^i \right)$$
$T^i$ - mean packet latency for flow i [sec]
$m^i$ - packet size [flits/packet]
$\max(\cdot)$ - accounts for the weakest link
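As a short worked example, the packet latency is the packet size in flits multiplied by the largest per-link flit time on the path. The function name and the numbers below are hypothetical.

```python
def packet_latency(flit_times, packet_flits):
    """Mean packet latency [sec]: packet size times the slowest link's flit time."""
    return packet_flits * max(flit_times)


# Hypothetical three-hop path with per-link flit times in seconds.
latency = packet_latency([1.6e-8, 2.0e-8, 1.8e-8], packet_flits=32)
```

Here the 20 ns link dominates, giving 32 x 20 ns = 640 ns.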
Capacity Allocation Algorithm
Greedy, iterative algorithm
For each src-dst pair:
- Use delay model to identify most sensitive link
- Increase its capacity
Repeat until delay requirements are met
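The loop can be sketched as below. This is a hedged illustration rather than the authors' implementation. The slide does not specify several choices, which are assumptions here: the starting capacities (offered load plus one step), the fixed increment `step`, and reading "most sensitive link" as the link whose increment lowers the flow's latency the most. The delay model assumes hop-count distance, and the flow dictionaries and link names are hypothetical.

```python
FLIT_BITS = 16  # assumed flit length [bits/flit]


def link_rates(flows):
    """Total flit injection rate on every link [flits/sec]."""
    total = {}
    for flow in flows:
        for link in flow["path"]:
            total[link] = total.get(link, 0.0) + flow["rate"]
    return total


def packet_latency(flow, cap, total):
    """Packet size times the worst per-flit delay along the flow's path."""
    path = flow["path"]
    others = [total[k] - flow["rate"] for k in path]  # rate of the other flows
    basic = []
    for k, lam in zip(path, others):
        residual = cap[k] / FLIT_BITS - lam  # flits/sec left for this flow
        if residual <= 0:
            return float("inf")
        basic.append(1.0 / residual)
    worst = 0.0
    for j in range(len(path)):
        t = basic[j]
        for k in range(j + 1, len(path)):  # subsequent hops (convoy term)
            load = FLIT_BITS * others[k] / cap[path[k]]
            t += load * basic[k] / (k - j)
        worst = max(worst, t)
    return flow["packet_flits"] * worst


def allocate(flows, step=1e8, max_rounds=10_000):
    """Greedy capacity allocation; `step` [bits/sec] is an assumed increment."""
    total = link_rates(flows)
    # Assumed starting point: offered load plus one step, so no link is saturated.
    cap = {k: FLIT_BITS * rate + step for k, rate in total.items()}
    for _ in range(max_rounds):
        late = [f for f in flows if packet_latency(f, cap, total) > f["max_latency"]]
        if not late:
            return cap
        for flow in late:
            if packet_latency(flow, cap, total) <= flow["max_latency"]:
                continue  # already fixed by an earlier increase in this round

            def latency_after(link):
                trial = dict(cap)
                trial[link] += step
                return packet_latency(flow, trial, total)

            # Most sensitive link: the increment that helps this flow the most.
            cap[min(flow["path"], key=latency_after)] += step
    raise RuntimeError("delay requirements not met within max_rounds")


# Hypothetical flows: two flows that share link "b".
flows = [
    {"path": ["a", "b"], "rate": 1e6, "packet_flits": 32, "max_latency": 2e-6},
    {"path": ["b", "c"], "rate": 2e6, "packet_flits": 32, "max_latency": 1e-6},
]
capacities = allocate(flows)
```

A smaller `step` gives a tighter total capacity at the cost of more iterations. Re-evaluating the analytical model in every round stays cheap, whereas re-running a simulation each time would not.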
Capacity Allocation – Example #1
Uniform traffic with identical requirements
[Figure: link capacities before and after optimization; nodes 00 to 33]
Uniform allocation: 74.4 Gbit/sec
Capacity allocation algorithm: 69 Gbit/sec
Total capacity reduced by 7%
Capacity Allocation – Example #2
A SoC-like system
- Heterogeneous traffic demands and delay requirements
[Figure: link capacities before and after optimization; nodes 00 to 23]
Uniform allocation: 41.8 Gbit/sec
Capacity allocation algorithm: 28.7 Gbit/sec
Total capacity reduced by 30%
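The quoted reductions can be checked from the totals reported for the two examples. The helper below is only an arithmetic sketch, and its name is hypothetical.

```python
def reduction_percent(uniform_gbps, optimized_gbps):
    """Relative saving in total link capacity, in percent."""
    return 100.0 * (uniform_gbps - optimized_gbps) / uniform_gbps


example1 = reduction_percent(74.4, 69.0)   # uniform traffic example
example2 = reduction_percent(41.8, 28.7)   # SoC-like example
```

The totals give roughly 7.3% and 31.3%, consistent with the rounded 7% and 30% on the slides.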
Summary
SoCs need non-uniform link capacities
- Capacity allocation
Wormhole delay analysis
- Heterogeneous link capacities
- Heterogeneous communication demands
- Multiple VCs
Greedy allocation algorithm
Design examples
- NoC cost considerably reduced
Questions?
QNoC Research Group
Backup
QNoC Architecture
Grid topology
Packet-switched
Wormhole switching
Fixed path XY routing
Heterogeneous link capacities
Quality-of-Service
[Figure: grid of modules connected through routers (R); legend: Router, Link]
E. Bolotin, I. Cidon, R. Ginosar, A. Kolodny, “QoS Architecture and Design Process for Cost-Effective Network on Chip”, Journal of Systems Architecture, 2004
Analysis Validation
Analytical model was validated using simulations
- Different link capacities
- Different communication demands
[Figure: Analysis and Simulation vs. Load; x-axis: Utilization, y-axis: Normalized Delay]
Slack Elimination
[Figure: Packet Delay Slack; x-axis: Flow, y-axis: Slack [%]]