Date posted: 16-Apr-2017
Interconnect Architectures for Modulo-Scheduled
Coarse-Grained Reconfigurable Arrays
Ahmed Hassan Mohammed
Outline
Introduction
Reconfigurable Arrays
Device Architecture
Mapping Technology
Proposed Architectures
Experimental Results
Overall Results
References
Introduction
[Figure: platforms compared by performance and flexibility: ASIC, reconfigurable arrays, µProcessors]
Reconfigurable Arrays
                    Fine-grained       Coarse-grained
Purpose             General purpose    Application specific
Basic unit          LUT                ALU
Level               bit-level          word-level
Re-configurability  High overhead      Reduced overhead
Performance         Low                High
Device Architecture
Mapping Technology
[Figure: a dataflow graph and the architecture it is mapped onto]
[Figure: successive loop iterations (Iteration 1, Iteration 2, Iteration 3)]
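The iteration figure can be made concrete with a small sketch of how a modulo-scheduled loop overlaps its iterations in time. This is an illustration only, not the mapping tool used in the talk: the operation names, the per-iteration schedule, and the initiation interval `ii` (cycles between the starts of successive iterations) are all assumed values.

```python
from collections import defaultdict


def overlap_iterations(schedule, ii, iterations):
    """Return {cycle: [(iteration, op), ...]} for a modulo-scheduled loop.

    schedule   -- hypothetical map from operation name to its start cycle
                  within a single iteration
    ii         -- initiation interval: a new iteration starts every ii cycles
    iterations -- how many iterations to lay out in time
    """
    timeline = defaultdict(list)
    for it in range(iterations):
        for op, cycle in schedule.items():
            # Iteration number `it + 1` is shifted by it * ii cycles.
            timeline[cycle + it * ii].append((it + 1, op))
    return dict(sorted(timeline.items()))


# Hypothetical four-operation loop body, one operation per cycle.
schedule = {"load": 0, "mul": 1, "add": 2, "store": 3}
timeline = overlap_iterations(schedule, ii=1, iterations=3)
for cycle, ops in timeline.items():
    print(cycle, ops)
```

With `ii=1`, cycle 2 executes the `add` of iteration 1, the `mul` of iteration 2 and the `load` of iteration 3 at the same time, which is the overlap the figure depicts.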
Outline
Introduction Reconfigurable Arrays
Device Architecture Mapping Technology
Proposed Architectures Experimental Results Overall Results References
14
Proposed Architectures
Closest Topology
Clique Topology
Directional Topology
Heterogeneous Topology
Finput: number of possible inputs for a CFU.
[Figure: example with Finput = 3]
Closest Topology
[Figure: closest topology for Finput = 2, 3, 4 and 5. Neighboring CFUs are labeled by closeness (labels 2 to 7 are shown); a neighbor is connected when Label <= Finput.]
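The rule "Label <= Finput" can be sketched in a few lines. This is one possible reading of "label by the closest", not the authors' exact labeling: it assumes a rectangular grid of CFUs, Manhattan distance as the measure of closeness, ties broken by grid position, and labels starting at 2 as in the figure. The function names are hypothetical.

```python
def closest_labels(rows, cols, target):
    """Label every other CFU in a rows x cols grid by closeness to `target`.

    Assumptions (not from the slides): closeness is Manhattan distance,
    ties are broken by (row, col) order, and labels are unique.
    Labels start at 2, matching the smallest label shown in the figure.
    """
    tr, tc = target
    others = [(r, c) for r in range(rows) for c in range(cols) if (r, c) != target]
    others.sort(key=lambda p: (abs(p[0] - tr) + abs(p[1] - tc), p))
    return {pos: label for label, pos in enumerate(others, start=2)}


def connected_inputs(labels, finput):
    """Keep only the CFUs whose label satisfies Label <= Finput."""
    return sorted(pos for pos, label in labels.items() if label <= finput)


labels = closest_labels(3, 3, target=(1, 1))
for finput in (2, 3, 4, 5):
    print(finput, connected_inputs(labels, finput))
```

Raising Finput only ever adds connections, so each topology for a larger Finput contains the one for a smaller Finput.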
Clique Topology
[Figure: clique topology for Finput = 2, 3, 4 and 5. CFUs are labeled by row and column (labels 2 to 6 are shown); a CFU is connected when Label <= Finput.]
Directional Topology
[Figure: directional topology for Finput = 2, 3, 4 and 5. CFUs are labeled by the next row and column (labels 2 to 7 are shown); a CFU is connected when Label <= Finput.]
Heterogeneous Topology
[Figure: heterogeneous topology for Finput = 2, 3, 4 and 5. CFUs are labeled by the third column (labels 2 to 7 are shown); a CFU is connected when Label <= Finput.]
Experimental Results
Ten benchmark kernels are used for comparison. Each is a single loop containing between 18 and 184 operations per iteration.
How does Finput affect IPC (instructions per cycle)?
How does Finput affect the area?
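As a reminder of what IPC means for a software-pipelined loop: in steady state a new iteration starts every II cycles, so IPC is the number of operations per iteration divided by II. The sketch below uses the kernel sizes quoted above (18 and 184 operations); the II values are assumed for illustration and are not measurements from the talk.

```python
def steady_state_ipc(ops_per_iteration, ii):
    """Steady-state instructions per cycle of a modulo-scheduled loop.

    ops_per_iteration -- operations in one loop iteration
    ii                -- initiation interval in cycles (assumed value here)
    """
    if ii <= 0:
        raise ValueError("initiation interval must be positive")
    return ops_per_iteration / ii


# Kernel sizes come from the slide; the II values are hypothetical.
small_kernel_ipc = steady_state_ipc(18, ii=3)
large_kernel_ipc = steady_state_ipc(184, ii=8)
print(small_kernel_ipc, large_kernel_ipc)
```

A more flexible interconnect can let the mapper reach a smaller II, which raises IPC for the same kernel.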
[Figure: Finput vs. IPC]
[Figure: Finput vs. Area]
Overall Results
[Figure: all topologies]
[Figure: Finput vs. IPC/Area]
Different interconnect topologies affect both performance and area.
A partially interconnected fabric is better than a fully connected fabric.
Software pipelining is affected by the amount of flexibility in the interconnect architecture.
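The IPC/Area metric can be sketched as follows. The numbers are placeholders chosen only to show the mechanics, not results from the experiments, and the function name is hypothetical. They are picked so that IPC flattens while area keeps growing, which is the kind of situation in which a partially connected fabric would come out ahead.

```python
def best_ipc_per_area(measurements):
    """Return (finput, ratio) for the entry with the highest IPC/Area.

    measurements -- {finput: (ipc, area)}; area in arbitrary units
    """
    ratios = {f: ipc / area for f, (ipc, area) in measurements.items()}
    best = max(ratios, key=ratios.get)
    return best, ratios[best]


# Placeholder values for illustration only (NOT measured data).
placeholder = {2: (3.0, 1.0), 4: (6.0, 1.5), 8: (6.5, 2.0)}
best, ratio = best_ipc_per_area(placeholder)
print(best, ratio)
```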
References
Steven J.E. Wilton, Noha Kafafi, Bingfeng Mei, Serge Vernalde, “Interconnect Architectures for Modulo-Scheduled Coarse-Grained Reconfigurable Arrays”, IEEE, 2004.
Frank Bouwens, Mladen Berekovic, Andreas Kanstein, and Georgi Gaydadjiev, “Architectural Exploration of the ADRES Coarse-Grained Reconfigurable Array”, 2007.
Reiner Hartenstein, “Coarse Grain Reconfigurable Architectures”, 2001.
Lu Wan, Chen Dong, Deming Chen, “A New Coarse-Grained Reconfigurable Architecture with Fast Data Relay and Its Compilation Flow”.
Thanks