An Analytical Model for Worst-case Reorder Buffer Size of Multi-path Minimal Routing NoCs
Gaoming Du1, Miao Li1, Zhonghai Lu2, Minglun Gao1, Chunhua Wang1
1 Hefei University of Technology, Anhui Province, China
2 KTH Royal Institute of Technology, Sweden
2014.09.17
Outline
1 Motivation  2 Concepts  3 Method  4 Evaluation
Multi-path Routing NoC
[Figure: 3×3 mesh NoC (nodes 1-9) with a flow split into sub-flows f1 and f2; packets P1-P4]
• Prospects
– Minimize network congestion and packet delay
– Improve the load balance
– Reduce power consumption
– Fault-tolerant routing
• Problem
– Out-of-order packet delivery
Disadvantages: area overhead and low hardware utilization.
With worst-case analysis, a proper flow-splitting configuration can effectively reduce the reorder buffer size.
Out of Order
• Solution 1: flow control
– Prospects
  • Easy to control
  • Less hardware overhead
– Side effects
  • More congestion
  • Longer packet delay
[Figure (b): routers R1-R4, network interface (NI) with look-up table (LUT), packets P1-P4; legend: out-of-order packets, packet in need]
[1] S. Murali, D. Atienza, L. Benini, and G. De Micheli, “A method for routing packets across multiple paths in NoCs with In-Order delivery and Fault-Tolerance guarantees,” VLSI Design, vol. 2007, pp. 1–11, 2007.
Out of Order
• Solution 2: reorder buffer
– Prospects
  • Less on-chip congestion
  • Less re-arbitration time
– Side effect
  • Area overhead
[Figure legend: out-of-order packets, packet in need]
[11] M. Daneshtalab, M. Ebrahimi, P. Liljeberg, J. Plosila, and H. Tenhunen, “Memory-efficient on-chip network with adaptive interfaces,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 31, no. 1, pp. 146–159, 2012.
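To make the reorder-buffer idea concrete, here is a minimal sketch, assuming packets carry consecutive sequence numbers starting at 0. The class name `ReorderBuffer`, its fields, and the arrival order in the example are hypothetical and are not taken from the slides.

```python
class ReorderBuffer:
    """Releases packets in sequence-number order (illustrative only)."""

    def __init__(self, capacity):
        self.capacity = capacity  # maximum number of parked packets (assumed)
        self.expected = 0         # sequence number of the packet in need
        self.parked = {}          # out-of-order packets keyed by sequence number
        self.peak = 0             # highest occupancy observed so far

    def receive(self, seq, payload):
        """Accept one packet; return the payloads that can now be delivered."""
        if seq != self.expected:
            if len(self.parked) >= self.capacity:
                raise OverflowError("reorder buffer full")
            self.parked[seq] = payload
            self.peak = max(self.peak, len(self.parked))
            return []
        delivered = [payload]
        self.expected += 1
        # Drain every parked packet that has become the packet in need.
        while self.expected in self.parked:
            delivered.append(self.parked.pop(self.expected))
            self.expected += 1
        return delivered


rb = ReorderBuffer(capacity=4)
out = []
for seq in [2, 1, 0, 3]:  # hypothetical arrival order
    out.extend(rb.receive(seq, f"P{seq + 1}"))
```

The `peak` field records the largest number of parked packets, which is the quantity a worst-case buffer size has to cover.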
Reorder Buffer Size
• Traditional approaches
– By experience
– No formal method
– Too pessimistic
• Our target
– A general analytical model for worst-case reorder buffer size
– A method to diminish the reorder buffer size
  • Traffic splitting proportion
NoC Architecture
• Assumption
– Non-intersecting sub-flows
– Sub-flow number: 2
– Delay bounds for sub-flows already known
[Figure: 4×4 mesh NoC (nodes 1-16) with flow f(1,16) split into sub-flows f1 and f2; network interface (NI) between the network and the PE containing a reorder buffer, a counter, and a packet ID look-up table, with Packet In and Packet Out ports]
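Network Calculus Basic Results
Assume:
• Linear arrival curve: $\alpha(t) = b + r t$
• Latency-Rate (LR) server: $\beta(t) = R\,(t - T)^{+}$
[Figure: arrival curve α(t) with burst b and rate r, and service curve β(t) with rate R and latency T, plotted as volume V over time t, with backlog bound B and delay bound D; a flow F with arrival curve α enters server β and leaves as F* with arrival curve α*]
The delay bound is $D = T + b/R$.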
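The bounds for this setting can be evaluated with a few lines of Python. This is a minimal sketch, assuming a linear arrival curve b + r·t and a latency-rate service curve R·(t − T)⁺ with r ≤ R; it uses the standard network calculus results D = T + b/R for the delay and B = b + r·T for the backlog. The function name `lr_bounds` and the numbers are illustrative assumptions.

```python
def lr_bounds(b, r, R, T):
    """Delay and backlog bounds for arrival curve b + r*t served by R*(t - T)^+."""
    if r > R:
        raise ValueError("bounds are finite only when r <= R")
    delay = T + b / R    # maximal horizontal distance between the two curves
    backlog = b + r * T  # maximal vertical distance between the two curves
    return delay, backlog


# Hypothetical parameters: burst 4 packets, rate 0.25 packets/cycle,
# service rate 1 packet/cycle, latency 4 cycles.
delay, backlog = lr_bounds(b=4.0, r=0.25, R=1.0, T=4.0)
```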
General Analysis
• S_rb: size of the reorder buffer
• D1: packet delay on path f1
• D2: packet delay on path f2
• Δt: packet injection interval
[Figure: 4×4 mesh with flow f(1,16) split into sub-flows f1 and f2]
• Ideal case
– No contention
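The interplay of D1, D2, and Δt in the contention-free case can be explored with a small simulation. This is a sketch under stated assumptions, not the model from the slides: packets are injected every Δt, they alternate between f1 and f2 (an even split), each path has a constant delay, and a packet leaves the buffer as soon as all earlier packets have arrived. The function name `peak_reorder_occupancy` is hypothetical.

```python
def peak_reorder_occupancy(d1, d2, dt, n_packets=100):
    """Largest number of packets waiting for an earlier packet (ideal case)."""
    # Packet k is injected at k*dt; even k travels on f1, odd k on f2 (assumed).
    arrivals = sorted(
        (k * dt + (d1 if k % 2 == 0 else d2), k) for k in range(n_packets)
    )
    expected, parked, peak = 0, set(), 0
    for _, k in arrivals:
        parked.add(k)
        # Release every packet that is now in order.
        while expected in parked:
            parked.remove(expected)
            expected += 1
        peak = max(peak, len(parked))
    return peak


balanced = peak_reorder_occupancy(d1=10, d2=10, dt=1)
skewed = peak_reorder_occupancy(d1=10, d2=4, dt=1)
more_skewed = peak_reorder_occupancy(d1=10, d2=2, dt=1)
```

With equal path delays nothing has to wait, and the peak occupancy grows as the delay difference grows.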
Worst-case Reorder Buffer Size
Definition 1
[Figure: two views of the 4×4 mesh with flow f(1,16) split into sub-flows f1 and f2]
NC Model for Multi-path Routing
[Figure: network calculus model of multi-path routing with the blocks Traffic Splitting, servers S1 and S2 (cumulative functions R1(t), R1*(t) and R2(t), R2*(t)), Tag Flow, and Traffic Convergence; inset: 4×4 mesh with flow f(1,16) split into sub-flows f1 and f2]
• Step 1
– Non-intersecting sub-flow identification
– Traffic split proportion calculation
• Step 2
– Equivalent Service Curve (ESC) calculation
  • R: equivalent minimum service rate
  • T: equivalent maximum processing latency
[2] G. Du, C. Zhang, Z. Lu, A. Saggio, and M. Gao, “Worst-case performance analysis of 2-D mesh NoCs using multi-path minimal routing,” in CODES+ISSS 2012.
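One ingredient of an equivalent service curve is the concatenation of the latency-rate servers along a path: the min-plus convolution of two LR curves is again an LR curve whose rate is the minimum of the two rates and whose latency is the sum of the two latencies. The sketch below shows only this concatenation step; it does not model contention flows, which the method in [2] accounts for. The function names and the per-hop values are illustrative assumptions.

```python
from functools import reduce


def concatenate(hop_a, hop_b):
    """Min-plus convolution of two latency-rate curves given as (R, T) pairs."""
    (rate_a, lat_a), (rate_b, lat_b) = hop_a, hop_b
    return (min(rate_a, rate_b), lat_a + lat_b)


def equivalent_service_curve(hops):
    """Collapse the per-hop (R, T) pairs of one path into a single (R, T) pair."""
    return reduce(concatenate, hops)


# Hypothetical per-router service curves along one sub-flow path.
path_f1 = [(1.0, 2.0), (0.5, 3.0), (0.8, 1.0)]
R_eq, T_eq = equivalent_service_curve(path_f1)
```

The resulting pair can be fed into the delay bound D = T + b/R to obtain a per-path delay bound.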
• Step 3
– Calculation of Worst-case Reorder Buffer Size.
Algorithm
• Step 1: Path identification
• Step 2: ESC calculation
• Step 3: Worst-case reorder buffer size calculation
Evaluation
• Experiment targets
– ΔD ~ ?
– ↓ ?
• Experiment methods
– Synthetic pattern
– Industry pattern
[Table: arrival curve and service curve per flow type (target flow, contention flows)]
Experiments Setup
[Figure: 4×4 mesh (nodes 1-16) with flows f(1,16), f(2,12), f(3,8), and f(6,11) and sub-flows f1-f4; the two paths between nodes 1 and 16 run along 1-2-3-4-8-12-16 and 1-5-9-13-14-15-16]
Delta Delay vs. Buffer Size
• The bigger the delay difference, the larger the reorder buffer size
• Balancing the traffic and choosing a proper path configuration reduce the buffer size
• Maximum reduction: 56.99%
Full Traffic Splitting
• Target flow: full traffic splitting
• The more balanced the traffic, the smaller the reorder buffer size
• Average improvement of 57.04%
Simulation
• Setup
– Px = 0.1
• Results
– No packet loss
– Fully covered by analytical results
Industry Case
• Shorter longest path
– Max hops: 3
• Fewer reorder buffers
– Number of reorder buffers: 3 (nodes 4, 6, and 7)
• Mapping 1
– Smaller worst-case reorder buffer size
– Shorter path delays
[Figure: reorder buffer size at node 4, node 6, and node 7, and total size]
• Mapping 2
– Maximum reduction: 36.50% (76 packets)
– Average reduction: 29.20% (61 packets)
– Minimum reduction: 22.12% (46 packets)
Summary
• Our analytical model
– Reduces the worst-case reorder buffer size
  • By choosing proper sub-flow pairs
  • By altering the traffic splitting proportion
– Explores mapping effects
  • Reorder buffer size
• Future work
– Extend to more general cases
Thanks for your time