Page 1:

Torino, Italy – June 25, 2013

NASA/ESA Conference on Adaptive Hardware and Systems (AHS-2013)

Runtime Adaptation on Dataflow HPC Platforms

R. Cattaneo, C. Pilato, M. Mastinu, M.D. Santambrogio
Politecnico di Milano – Dip. di Elettronica, Informazione e Bioingegneria

O. Kadlcek, O. Pell
Maxeler Technologies Ltd., London, UK

Page 2:

Context Definition

• The portion of the application that needs to be accelerated is usually implemented in hardware

• Resource limitations can become a bottleneck

• In some contexts, the HPC application should be able to adapt to the environment

• Partial dynamic reconfiguration is a well-known technique to change the behavior at run time while reusing the same logic across different tasks

Page 3:

Reconfigurable Computing

“Reconfigurable computing is intended to fill the gap between hardware and software, achieving potentially much higher performance than software, while maintaining a higher level of flexibility than hardware”

(K. Compton and S. Hauck, Reconfigurable Computing: A Survey of Systems and Software, 2002)

Page 4:

Reasons Behind

• Some applications require performance that cannot be achieved in software

• Some applications need to be flexible, modifiable, and adaptable. Traditional hardware cannot achieve this

• Reconfigurable computing platforms can be altered after deployment, becoming high-performance devices able to meet resource, adaptability, and reliability constraints

Page 5:

Maxeler Architecture

• Maxeler systems are based on the interaction between a CPU and an FPGA

• Maxeler exploits FPGAs only as devices devoted to hardware acceleration

Why not try to enhance the flexibility and performance of Maxeler platforms by exploiting some intrinsic characteristics of FPGAs?

Page 6:

Objectives

Rationale

• Dynamic Partial Reconfiguration is a technique that can be applied to cope with problems such as the lack of available resources and the need for system adaptability and reliability

• Maxeler architectures are very efficient for computation, but they do not support Dynamic Partial Reconfiguration

Goals

• Design a new tool flow that supports Dynamic Partial Reconfiguration in Maxeler architectures, offering adaptivity in the HPC domain

Page 7:

Canny edge detector

Page 8:

Reconfiguration in FPGAs

Useful Definitions

• Full Bitstream

• Reconfigurable Partitions

• Reconfigurable Modules

• Partial Bitstream

• Configurations

[Figure labels: FPGA, Full bitstream]

Page 9:

Maxeler Architecture

Page 10:

Example application

[Figure labels: Manager, SLiC]

Page 11:

MaxCompiler flow

[Figure labels: MaxIDE, Java compilation, VHDL, BIT file, Java runtime]

Page 12:

Preliminary Considerations

• Hierarchical design vs. flat design

• NGDBuild, MAP, PAR, and BitGen are run once per configuration

• A PXML file is needed to drive the process

Page 13:

Proposed Approach

• Focusing on Kernels instead of the Manager:

    • Kernels in the same Reconfigurable Block must have the same characteristics;

    • In every Configuration, exactly one Kernel must be assigned to each Reconfigurable Block;

    • The same Kernel cannot be placed in two different Reconfigurable Blocks.

• Preserving the MaxCompiler/Xilinx tool flow structure as much as possible

• Hiding the details from the designer
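The three Kernel placement rules on this slide can be checked mechanically. The following is a minimal Python sketch, not part of the presented tool flow: the `Kernel` class, the `check_design` function, and the use of an I/O `interface` tuple as a stand-in for "characteristics" are all assumptions made for illustration, since the slide does not say which characteristics must match.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Kernel:
    name: str
    # Hypothetical stand-in for the "characteristics" shared within a block.
    interface: tuple


def check_design(blocks, configurations):
    """Return a list of rule violations (an empty list means the design is valid).

    blocks:         {block name: [Kernel, ...]}
    configurations: {configuration name: {block name: kernel name}}
    """
    errors = []
    owner = {}
    for block, kernels in blocks.items():
        # Rule 1: all Kernels of a block share the same characteristics.
        if len({k.interface for k in kernels}) > 1:
            errors.append(f"{block}: kernels have different interfaces")
        # Rule 3: a Kernel belongs to at most one block.
        for k in kernels:
            if owner.setdefault(k.name, block) != block:
                errors.append(f"{k.name}: placed in {owner[k.name]} and {block}")
    # Rule 2: every configuration picks exactly one Kernel per block.
    for cfg, assignment in configurations.items():
        if set(assignment) != set(blocks):
            errors.append(f"{cfg}: must assign every block exactly once")
            continue
        for block, chosen in assignment.items():
            if chosen not in {k.name for k in blocks[block]}:
                errors.append(f"{cfg}: {chosen} does not belong to {block}")
    return errors


# Illustrative design: two blocks, two kernels each, two configurations.
stream = ("in:uint8", "out:uint8")
blocks = {
    "RB0": [Kernel("K1", stream), Kernel("K3", stream)],
    "RB1": [Kernel("K2", stream), Kernel("K4", stream)],
}
configurations = {
    "A": {"RB0": "K1", "RB1": "K2"},
    "B": {"RB0": "K3", "RB1": "K4"},
}
print(check_design(blocks, configurations))  # prints []
```

Modeling a Configuration as a mapping from block to Kernel makes "at most one Kernel per block" hold by construction, so the check only has to confirm that no block is left unassigned.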

Page 14:

Reconfiguration on Kernels

Page 15:

User interface: DFE code

PRManager Main

...

Configuration A = ...

Configuration B = ...

build(A, B)

• Reconfigurable Block = Reconfigurable Partition

• Kernel = Reconfigurable Module

Page 16:

Considerations

Page 17:

User interface: Host code

max_reconfig_partial_bitstream

DFE

Page 18:

Case Study: Edge Detection

• Canny edge detection is applied to a video

• There are two Reconfigurable Blocks and a total of four filters

• Each filter is a Reconfigurable Module

• Initially, the first two filters are applied

• Then, the device is partially reconfigured and the other two filters are applied

[Figure label: DFE]
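The schedule of this case study can be sketched in a few lines of Python. This is only a software model under stated assumptions: the slide does not name the four filters, so `filter_1` to `filter_4` are placeholders that tag each frame instead of processing pixels, and the sketch assumes that all frames pass through the first configuration before the single partial reconfiguration takes place.

```python
def make_filter(name):
    """Placeholder kernel: records its name on the frame instead of filtering pixels."""
    return lambda frame: frame + (name,)


def apply_configuration(frames, configuration):
    """Stream every frame through the kernels of one configuration, in order."""
    out = []
    for frame in frames:
        for _name, kernel in configuration:
            frame = kernel(frame)
        out.append(frame)
    return out


def run_case_study(frames, config_a, config_b):
    """Apply configuration A, reconfigure both blocks once, then apply configuration B."""
    events = ["start with " + "+".join(name for name, _ in config_a)]
    intermediate = apply_configuration(frames, config_a)
    events.append("partially reconfigure to " + "+".join(name for name, _ in config_b))
    return apply_configuration(intermediate, config_b), events


# Four placeholder filters split over two configurations of two blocks each.
names = ["filter_1", "filter_2", "filter_3", "filter_4"]
kernels = [(n, make_filter(n)) for n in names]
config_a, config_b = kernels[:2], kernels[2:]

result, events = run_case_study([("frame0",), ("frame1",)], config_a, config_b)
print(result[0])  # ('frame0', 'filter_1', 'filter_2', 'filter_3', 'filter_4')
print(events)
```

Each frame ends up having passed through all four filters although only two are resident at any time, which is the resource-sharing argument behind the case study.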

Page 19:

MaxWorkstation

• The targeted platform is a MaxWorkstation

• It contains an Intel i7 870 quad-core CPU with 16 GB RAM

• The Intel CPU is connected to the DFE via PCI Express

• The DFE is a MAX3 board (Xilinx Virtex-6) with 24 GB RAM

Page 20:

Experimental Results

• The methodology is applied to a video taken from “Mission Impossible”

• It is combined with a set of compiler extensions for the automatic code generation of the kernels

• The details are totally hidden from the designer

[VIDEO]

Page 21:

Conclusions and Future Work

• The proposed approach integrates Partial Dynamic Reconfiguration in a dataflow architecture

• The process is totally transparent to the designer

• Future work will focus on the current limitations:

    • Reconfigurable Area constraints can be specified only as multiples of clock regions

    • During the partial reconfiguration of some Reconfigurable Blocks, all the Kernels are held in reset

Page 22:

Questions?

Page 23:

Implementation: design flow

The build process is divided into four main stages

Page 24:

First build stage

• When the build process starts, MaxDC, XST and NGCBuild are run for each Reconfigurable Block and for the static part independently;

• The result of this first stage is a large number of netlist files.

Page 25:

Second build stage

• The second stage consists of running NGDBuild, MAP, PAR, pr_verify and BitGen for each Configuration

• The PXML file is automatically generated

• The static part is implemented only in the first Configuration

• Each Reconfigurable Module is implemented only the first time it appears in a Configuration
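The implement-once rule of the second stage can be sketched as a small planning function. This is a Python model under stated assumptions, not the actual tool flow: `plan_second_stage`, the `RB0/K1` naming, and the idea of a "reused" list are hypothetical. Only the tool names and the two reuse rules (static part in the first Configuration only, each module on first appearance only) come from the slide.

```python
# Tools run once per Configuration, as listed on the slide.
TOOLS = ["NGDBuild", "MAP", "PAR", "pr_verify", "BitGen"]


def plan_second_stage(configurations):
    """Decide, per Configuration, what is implemented and what is reused.

    configurations: ordered list of (name, {block name: kernel name}).
    Returns a list of (name, implemented items, reused items).
    """
    implemented = set()
    plan = []
    for index, (name, assignment) in enumerate(configurations):
        to_implement, reused = [], []
        # The static part is implemented only in the first Configuration.
        (to_implement if index == 0 else reused).append("static")
        for block, kernel in sorted(assignment.items()):
            item = f"{block}/{kernel}"
            if item in implemented:
                reused.append(item)        # already implemented earlier
            else:
                implemented.add(item)      # first appearance: implement it
                to_implement.append(item)
        plan.append((name, to_implement, reused))
    return plan


configs = [
    ("A", {"RB0": "K1", "RB1": "K2"}),
    ("B", {"RB0": "K3", "RB1": "K2"}),
]
for name, to_implement, reused in plan_second_stage(configs):
    print(name, "runs", TOOLS, "implements", to_implement, "reuses", reused)
```

In this example Configuration B only implements `RB0/K3`; the static part and `RB1/K2` come from the run for Configuration A, which is where the build-time saving comes from even though the tools still run once per Configuration.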

Page 26:

Final stage

• Once the full bitstream and all the partial ones have been generated, they are encapsulated in the .max file

• The first Configuration passed to the build method is chosen as the “default” Configuration

• This means that its full bitstream will be loaded in the CFPGA when the program starts
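A rough Python model of what the final stage produces is given below. It is a sketch under assumptions: the dictionary layout, the `package` and `partials_to_load` names, the one-partial-bitstream-per-(block, kernel) convention, and the rule that only blocks whose Kernel changes need a new partial bitstream are all illustrative. The host-side call named on the host-code slide, `max_reconfig_partial_bitstream`, is not invoked here because its signature is not shown.

```python
def package(configurations):
    """Model the packaged build output: a default configuration plus partial bitstreams.

    configurations: ordered list of (name, {block name: kernel name}),
    in the order they were passed to the build method.
    """
    default_name, _ = configurations[0]  # the first Configuration is the default
    partials = sorted({
        (block, kernel)
        for _, assignment in configurations
        for block, kernel in assignment.items()
    })
    return {"default": default_name, "partial_bitstreams": partials}


def partials_to_load(current, target):
    """Assumed switching rule: reload only the blocks whose Kernel differs."""
    return sorted(
        (block, kernel)
        for block, kernel in target.items()
        if current.get(block) != kernel
    )


configs = [
    ("A", {"RB0": "K1", "RB1": "K2"}),
    ("B", {"RB0": "K3", "RB1": "K4"}),
]
bundle = package(configs)
print(bundle["default"])                                # A
print(partials_to_load(configs[0][1], configs[1][1]))   # [('RB0', 'K3'), ('RB1', 'K4')]
```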

