1D and 2D Bitstream Relocation for Partially
Dynamically Reconfigurable Architecture
BY
Marco Novati
Thesis committee:
Shantanu Dutt (chair), Marco Domenico Santambrogio, Piotr Gmytrasiewicz
UIC Thesis Defense: May 8, 2008
Aims
Architectural support for relocation:
– Create an integrated HW/SW system to manage online relocation (1D and 2D) in a reconfigurable architecture
– Create efficient bitstream relocation solutions suitable for the target system:
• 1D - 2D
• HW – SW
Outline
Introduction
Relocation
State of Art
Proposed Solutions
Results
Concluding Remarks and Future Work
What’s Next…
Introduction
– Reconfiguration
– Xilinx FPGAs
Relocation
State of Art
Proposed Solutions
Results
Concluding Remarks and Future Work
Reconfigurable Computing
“Reconfigurable computing is intended to fill the gap between hardware and software, achieving potentially
much higher performance than software, while maintaining a higher level of flexibility than hardware”
(K. Compton and S. Hauck, Reconfigurable Computing: a Survey of Systems and Software, 2002)
5 W
who controls the reconfiguration
where the reconfigurator is located
when the configurations are generated
which is the granularity of the reconfiguration
in what dimension the reconfiguration operates
Reconfiguration in everyday life
[Figure: sports analogy for the reconfiguration types. Labels: Hockey, Football, Soccer; (Complete – Static), (Partial – Dynamic), (Partial – Static)]
Reconfigurable architecture
A basic reconfigurable architecture consists of:
– a Static area: a basic Harvard architecture
– a Reconfigurable area: a device area composed of several reconfigurable regions
Basic Definitions
Core: a specific representation of a functionality. A core can be described, for example, in VHDL, in C, or in an intermediate representation (e.g. a DFG)
IP-Core: a core described in a hardware description language, combined with its communication infrastructure (i.e. the bus interface)
Reconfigurable Functional Unit: an IP-Core that can be plugged into and/or unplugged from an already working architecture at runtime
Reconfigurable Region: a portion of the device area used to implement a reconfigurable core
Xilinx FPGAs and Configuration Memory
Frame Addressing: Virtex, Virtex-E
* Inspired by the Virtex Series Configuration Architecture User Guide
Frame Addressing: Virtex2pro
* Taken from the Virtex-II Pro and Virtex-II Pro X FPGA User Guide
Frame Addressing: Virtex 4-5 (1/2)
New frame addressing: rows and columns can both be addressed
* Inspired by the Virtex 4 & 5 Configuration Architecture User Guide
Frame Addressing: Virtex 4-5 (2/2)
* Inspired by the Virtex 4 & 5 Configuration Architecture User Guide
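The row and column addressing above is what makes 2D relocation a matter of rewriting address fields. A minimal sketch of that idea follows, assuming a frame address word split into top/bottom, block type, row, column and minor fields. The field widths in ASSUMED_FIELDS and all function names are illustrative assumptions, not taken from the slides; the device configuration guide remains the authority on the real layout.

```python
from typing import NamedTuple

# Assumed layout, MSB first: (field name, width in bits). Illustrative only.
ASSUMED_FIELDS = (
    ("top_bottom", 1),
    ("block_type", 3),
    ("row", 5),
    ("column", 8),
    ("minor", 6),
)


class FrameAddress(NamedTuple):
    top_bottom: int
    block_type: int
    row: int
    column: int
    minor: int


def unpack(word: int) -> FrameAddress:
    """Split an address word into its fields, following ASSUMED_FIELDS."""
    values = {}
    shift = sum(width for _, width in ASSUMED_FIELDS)
    for name, width in ASSUMED_FIELDS:
        shift -= width
        values[name] = (word >> shift) & ((1 << width) - 1)
    return FrameAddress(**values)


def pack(addr: FrameAddress) -> int:
    """Rebuild the address word, rejecting values that do not fit their field."""
    word = 0
    for name, width in ASSUMED_FIELDS:
        value = getattr(addr, name)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word = (word << width) | value
    return word


def relocate(word: int, d_row: int, d_col: int) -> int:
    """Move a frame address by a row and column offset (2D relocation)."""
    addr = unpack(word)
    return pack(addr._replace(row=addr.row + d_row, column=addr.column + d_col))
```

Setting d_row to 0 reduces the sketch to the 1D case, where only the column changes.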
What’s Next…
Introduction
Relocation
State of Art
Proposed Solutions
Results
Concluding Remarks and Future Work
Relocation: Rationale
Bitstream relocation technique to:
– speed up the overall system execution
– reduce the amount of memory used to store partial bitstreams
– achieve preemptive core execution
– assign the bitstream placement at runtime
Relocation: The Problem
[Figure: people demanding functionalities from a set of available functionalities; an FPGA with three reconfigurable regions (RR1, RR2, RR3) and the RFU implementations of cores A to F]
Legend: Fi Area/Time
A 2/1, B 1/2, C 2/2, D 1/1, E 1/1, F 2/2
Relocation: Scenario
[Figure: a possible scenario, plotted as Area over Time on an FPGA with regions RR1, RR2, RR3. Schedule labels: A, B, Rec. F, F, Rec. E, E, Rec. C, C, Rec. D, D. Shown with the RFU implementations of cores A to F]
Legend: Fi Area/Time
Relocation: Motivation
[Figure: a possible scenario shown as two Area over Time schedules on an FPGA with regions RR1, RR2, RR3. First schedule labels: A, B, Rec. C, C, Rec. F, F, Rec. E, E, D, Rec. D. Second schedule labels: A, B, Rec. C, C, R2 F, F, R2 E, E, D, R2 D. Shown with the RFU implementations of cores A to F]
Legend: Fi Area/Time
What’s Next…
Introduction
Relocation
State of Art
– PARBIT
– BITPOS
– BAnMaT
– REPLICA
Proposed Solutions
Results
Concluding Remarks and Future Work
PARBIT
[E. Horta and John W. Lockwood, “PARBIT: A Tool to Transform Bitfiles to Implement Partial Reconfiguration of Field Programmable Gate Arrays (FPGAs)”, Washington University, Technical Report, July 2001.]
Features:
– Pure C software
– Enables the generation of the partial bitstream file
– Small modifications, altering only the parts related to the location on the device
CONS:
– Only offline
– Only 1D reconfiguration
BITPOS
[Yana E. Krasteva, Eduardo de la Torre, Teresa Riesgo and Didier Joly, “Virtex II FPGA Bitstream Manipulation: Application to Reconfiguration Control Systems”, 2006 International Conference on Field Programmable Logic and Applications, August 2006.]
Features:
– Extracts an area from a configuration file
– Generates the new relocated bitstream
CONS:
– Only offline
– Only Virtex II, Virtex II Pro [1D]
BAnMaT
[D. Deori, “BAnMaT: un Framework per l’Analisi e la Manipolazione di un Bitstream Orientato alla Riconfigurabilità Parziale”, DEI, Milano, Politecnico di Milano, 2006]
Features:
– Bitstream correctness check
– Performs modifications on a configuration bitstream
– Allows bypassing the synthesis process from the VHDL
CONS:
– Only offline manipulation
REPLICA
[H. Kalte, G. Lee, M. Porrmann and U. Rückert, “REPLICA: A Bitstream Manipulation Filter for Module Relocation in Partial Reconfigurable Systems”, The 12th Reconfigurable Architectures Workshop (RAW 2005), 2005.]
Features:
– Hardware filter that exploits relocation
– Necessary manipulations performed during the download process
– Relocation hiding
CONS:
– Only for external reconfigurable systems
– Only 1D relocation
– Maximum frequency of 50 MHz
What’s Next…
Introduction
Relocation
State of Art
Proposed Solutions
– Polaris
– Target Architecture
– Proposed Relocation Solutions
Results
Concluding Remarks and Future Work
Polaris: Motivations
Complete workflow to generate a self dynamically reconfigurable architecture that:
– Supports 1D and 2D reconfiguration
– Has “good” area constraints for cores
– Performs runtime task placement decisions
– Exploits internal and fast core relocation
Starting from the specification of:
– Target application
– Target device info
– Reconfiguration model
– Communication infrastructure
Polaris Overview
Workflow to manage allocation and relocation of tasks in self dynamically reconfigurable architectures
Final goal: complete architecture (bitstreams and code) generation
Target Architecture: YaRA
PPC Based YaRA
[Figure: PPC based YaRA architecture, with the STATIC AREA labeled]
Proposed Relocation Solutions
Runtime support for self dynamic 1D and 2D reconfiguration
– Xilinx Virtex, Virtex-E, Virtex2pro [1D]
– Xilinx Virtex-4 and Virtex-5 [2D]
Relocation, different solutions:
– Software:
• BAnMaT Lite
– Hardware:
• BiRF [1D]
• BiRF Square [2D]
Configuration Bitstream
BiRF & BiRF Square Block Diagram
The Parser
CRC Calculation
Particular CRC value, used by Xilinx tools
Two versions of BiRF and BiRF Square:
– By using the “predefined” values
– With actual CRC calculation
x^16 + x^15 + x^2 + 1 [1D]
x^32 + x^28 + x^27 + x^26 + x^25 + x^23 + x^22 + x^20 + x^19 + x^18 + x^14 + x^13 + x^11 + x^10 + x^9 + x^8 + x^6 + 1 [2D]
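A minimal sketch of how the two polynomials above translate into a bitwise CRC routine. This is a generic textbook CRC, not the BiRF implementation: the initial value, the bit ordering and the exact data fed to the CRC in a Xilinx bitstream are not stated on the slide, so they are left as parameters. All function and constant names are illustrative.

```python
def poly_from_exponents(exponents, width):
    """Build the polynomial constant from its exponents; the x^width term is implicit."""
    return sum(1 << e for e in exponents if e != width)


# Polynomials as listed on the slide.
CRC16_POLY = poly_from_exponents([16, 15, 2, 0], 16)
CRC32_POLY = poly_from_exponents(
    [32, 28, 27, 26, 25, 23, 22, 20, 19, 18, 14, 13, 11, 10, 9, 8, 6, 0], 32
)


def reflect(value, width):
    """Reverse the order of the lowest `width` bits of value."""
    out = 0
    for _ in range(width):
        out = (out << 1) | (value & 1)
        value >>= 1
    return out


def crc(data, width, poly, init=0, refin=False, refout=False, xorout=0):
    """Bit-by-bit CRC over a bytes object, MSB first unless refin is set."""
    mask = (1 << width) - 1
    top = 1 << (width - 1)
    reg = init
    for byte in data:
        if refin:
            byte = reflect(byte, 8)
        reg ^= byte << (width - 8)
        for _ in range(8):
            # Shift out the top bit; apply the polynomial when that bit was set.
            reg = ((reg << 1) ^ poly) if reg & top else (reg << 1)
            reg &= mask
    if refout:
        reg = reflect(reg, width)
    return reg ^ xorout
```

The 32-bit polynomial is the one commonly known as CRC-32C (Castagnoli), and the 16-bit one is the classic CRC-16 polynomial. A hardware version would typically process one word per clock cycle instead of one bit per loop iteration.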
What’s Next…
Introduction
Relocation
State of Art
Proposed Solutions
Results
– Synthesis Results
– Relocation Solutions Results
Concluding Remarks and Future Work
Results
Relocation solutions:
– Small area usage (slide 37)
– High time performance (slide 38)
Relocation results:
– Internal memory saving (slides 39 – 40)
– Time saving (slides 41 – 44)
Synthesis Results: Area
37
FPGA family    Model    BiRF      BiRF        BiRF Square  BiRF Square
                        generic   optimized   generic      optimized
Virtex II Pro  vp7      11.6 %    3.6 %       −            −
Virtex II Pro  vp20     5.8 %     1.8 %       −            −
Virtex II Pro  vp30     4.2 %     1.3 %       −            −
Virtex 4       vlx40    −         −           2.2 %        0.9 %
Virtex 4       vlx60    −         −           1.5 %        0.6 %
Virtex 4       vlx100   −         −           0.8 %        0.3 %
Virtex 5       vlx50    −         −           1.1 %        0.8 %
Virtex 5       vlx85    −         −           0.6 %        0.4 %
Virtex 5       vlx110   −         −           0.5 %        0.3 %
Synthesis Results: Time Performances
BiRF:
– On a Virtex2pro with speed grade -5
• General purpose version: max frequency of 101 MHz
• Specific version: max frequency of 136 MHz
BiRF Square:
– On a Virtex-4 with speed grade -12
• General purpose version: max frequency of 160 MHz
• Specific version: max frequency of 290 MHz
– On a Virtex-5 with speed grade -3
• General purpose version: max frequency of 226 MHz
• Specific version: max frequency of 304 MHz
38
Relocation Solutions Results (1/2)
BiRF, BiRF Square, BAnMaT Lite
– Support relocation in a self partially and dynamically 1D or 2D reconfigurable system
– The occupation ratio is relatively small
– The achievable frequency is more than acceptable
– Reduction of internal memory requirements
Throughput:
– BiRF: 6 MB/s
– BiRF Square: 7.3 MB/s
– BAnMaT Lite: 2.6 MB/s
39
Relocation Solutions Results (2/2)
The total size of a configuration file is about 1 MB
Considering an architecture with:
– 1/3 of the area as fixed part
– 2/3 as reconfigurable part with 6 slots
Under these hypotheses:
– The size of a partial bitstream will be about 110 KB
– The relocation time will be about:
• 18 ms with BiRF
• 15 ms with BiRF Square
• 42 ms with BAnMaT Lite
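The figures above follow from simple arithmetic, sketched here under two assumptions that the slide does not state explicitly: bitstream size scales linearly with the occupied area, and 1 MB is taken as 1000 KB. The throughput values are the ones reported on the previous slide; the function names are illustrative.

```python
TOTAL_KB = 1000.0            # complete configuration file, about 1 MB
RECONFIGURABLE_FRACTION = 2 / 3
SLOTS = 6

# Throughput in MB/s, as reported for each relocation solution.
THROUGHPUT_MB_S = {"BiRF": 6.0, "BiRF Square": 7.3, "BAnMaT Lite": 2.6}


def partial_bitstream_kb(total_kb=TOTAL_KB,
                         fraction=RECONFIGURABLE_FRACTION,
                         slots=SLOTS):
    """Size of one slot's partial bitstream, assuming size proportional to area."""
    return total_kb * fraction / slots


def relocation_time_ms(size_kb, throughput_mb_s):
    """KB divided by MB/s gives milliseconds when 1 MB = 1000 KB."""
    return size_kb / throughput_mb_s


if __name__ == "__main__":
    print(f"partial bitstream: about {partial_bitstream_kb():.0f} KB")
    for name, rate in THROUGHPUT_MB_S.items():
        # The slide rounds the partial bitstream size to 110 KB.
        print(f"{name}: about {relocation_time_ms(110.0, rate):.0f} ms")
```

The computed size is slightly above the 110 KB quoted on the slide, and the three times round to the quoted 18, 15 and 42 ms.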
40
Relocation Time Results (1/4)
41
Relocation Time Results (2/4)
FPU1: clock time 0.01 ms, required for 3.65 s (7 add, 3 sub, 10 mul, 1 square root and 4 div)
– Feasible RR assignment: (0,0) and (6,0)
JPEG: a complete JPEG hardware compressor, compression rate 24 img (352x288)/s, required for 3 seconds (72 img 352x288)
– Feasible RR assignment: (0,0), (0,1), (6,0) and (6,1)
FPU2: clock time 0.01 ms, required for 3.13 s (6 add, 5 sub, 8 mul and 4 div)
– Feasible RR assignment: (0,0) and (6,0)
3DES: a Triple-DES 64-bit block cipher, required for 1 second, in order to process a file of 72 MB
– Feasible RR assignment: (0,0), (1,0), (3,0) and (3,1)
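A minimal sketch of how the feasible assignments above could drive a runtime placement decision. The first-fit policy is an assumption chosen for illustration, not the placement policy used by Polaris, and the sketch treats each listed position as an independent slot, ignoring the footprint of each core. The names FEASIBLE, place and place_all are hypothetical.

```python
# Feasible RR positions per core, copied from the slide.
FEASIBLE = {
    "FPU1": [(0, 0), (6, 0)],
    "JPEG": [(0, 0), (0, 1), (6, 0), (6, 1)],
    "FPU2": [(0, 0), (6, 0)],
    "3DES": [(0, 0), (1, 0), (3, 0), (3, 1)],
}


def place(core, occupied):
    """Return the first feasible position not already in use, or None."""
    for position in FEASIBLE[core]:
        if position not in occupied:
            return position
    return None


def place_all(cores):
    """Place cores in request order (first fit); unplaceable cores map to None."""
    occupied = set()
    placement = {}
    for core in cores:
        position = place(core, occupied)
        placement[core] = position
        if position is not None:
            occupied.add(position)
    return placement


if __name__ == "__main__":
    for core, position in place_all(["FPU1", "JPEG", "FPU2", "3DES"]).items():
        print(core, position)
```

With relocation, a single stored bitstream per core is enough to serve any of its feasible positions; without it, one bitstream per (core, position) pair would have to be kept in memory.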
42
Relocation Time Results (3/4)
43
Relocation Time Results (4/4)
44
What’s Next…
Introduction
Relocation
State of Art
Proposed Solutions
Results
Concluding Remarks and Future Work
Concluding Remarks
Architectural support for relocation:
– Create an integrated HW/SW system to manage online relocation (1D and 2D) in a reconfigurable architecture
– Create efficient bitstream relocation solutions suitable for the target system:
• 1D - 2D
• HW – SW
Publications:
– International conferences:
• M. Morandi, M. Novati, M. D. Santambrogio, D. Sciuto, Core allocation and relocation management for a self dynamically reconfigurable architecture, ISVLSI 2008, IEEE Computer Society Annual Symposium on VLSI
• S. Corbetta, F. Ferrandi, M. Morandi, M. Novati, M. D. Santambrogio, D. Sciuto, Two Novel Approaches to Online Partial Bitstream Relocation in a Dynamically Reconfigurable System, ISVLSI 2007, IEEE Computer Society Annual Symposium on VLSI
– IEEE Transactions on VLSI (second review phase):
• M. Morandi, M. Novati, M. D. Santambrogio, P. Spoletini, D. Sciuto, Internal and External Bitstream Relocation for Partial Dynamic Reconfiguration, TSVLSI, IEEE Transactions on Very Large Scale Integration Systems
Future Work
Validation tool for the chosen
– Reconfiguration model
– Communication infrastructure
Simulation framework
– Monitor the reconfigurable system evolution
– Evaluate different placement policies and area constraints definitions
General Information
Webpage
– www.dresd.org/?q=polaris
Mailing List
– [email protected]
Contact
– For more information regarding Polaris:
• [email protected]
– For a complete list of information on how to contact us:
• www.dresd.org/?q=contact_polaris
Questions