[IEEE 2012 International Conference on Reconfigurable Computing and FPGAs (ReConFig 2012) - Cancun,...

Dreams: A Tool for the design of Dynamically Reconfigurable Embedded and Modular Systems

Andrés Otero, Eduardo de la Torre, and Teresa Riesgo Centro de Electrónica Industrial

Universidad Politécnica de Madrid Madrid, Spain

{joseandres.otero, eduardo.delatorre, teresa.riesgo}@upm.es

Abstract—Dynamically Reconfigurable Systems are attracting a growing interest, mainly due to the emergence of novel applications based on this technology. However, commercial tools do not provide enough flexibility to design solutions, while keeping an acceptable design productivity. In this paper, a novel design flow is proposed, targeting dynamically reconfigurable systems. It is fully supported by a tool called Dreams, which is able to implement flexible systems, starting from a set of netlists corresponding to the modules, as well as a system description provided by the user. The tool automatically post-processes the nets, implementing a solution for the communications between reconfigurable regions, as well as the handling of routing conflicts, by means of a custom router. Since the design process of every module and the static system are independent, the proposed flow is compatible with system upgrade at run-time. In this paper, a use case corresponding to the design of a highly regular and parallel mesh-type architecture is described, in order to show the architectural flexibility offered by the tool.

Keywords: FPGA; Dynamic and Partial Reconfiguration; Design Flow; CAD Tool; XDL

I. INTRODUCTION

Enhancing reconfigurable systems with dynamic and partial reconfiguration (DPR) features may lead to substantial benefits, ranging from cost reductions, due to the use of smaller FPGAs [1] [2], to power consumption savings [3] [4]. Regarding power, this benefit is expected to grow in the future, since static power is becoming the leading contributor to total device consumption as CMOS technology shrinks [5]. Moreover, dynamically reconfigurable systems deliver an unrivalled capacity to be adapted or upgraded at run-time. This flexibility enables, for instance, the implementation of self-adaptive systems [6], and makes more autonomous operating conditions conceivable. Among other "self-*" properties [7], reconfigurability has been exploited in fault-tolerant and self-healing [8] [9] strategies, as well as in self-optimizing [10] [11] and self-organizing systems. Even autonomous hardware design is possible by combining reconfigurable platforms with other techniques, like Evolutionary Algorithms [12] [13].

Motivated by this increasing research interest, part of the FPGA industry has included support for the design of dynamically reconfigurable systems in commercial design tools. All those tools address the problem of communications from/to reconfigurable modules, as well as the generation of partial configuration bitstreams. On the one hand, inter-module communications must guarantee the signal integrity of the nets crossing the boundaries of the reconfigurable modules during the reconfiguration process. On the other hand, partial bitstream generation implies extracting only the configuration information corresponding to the device area occupied by the reconfigurable module, discarding the rest of the design. To this end, the logic resources and internal nets corresponding to the module must be contained in a reconfigurable area. In addition, there must be no conflict between resources corresponding to different reconfigurable modules, or between those and the static system.

In this work, we have focused on Xilinx devices, because this is currently the most widespread solution. Xilinx has provided several design flows and tools during the last years. First generations of Xilinx tools [14] relied on bus macros to guarantee routing correctness. However, the use of bus macros introduces area and delay overheads, which may not be acceptable in some designs. This is accompanied by the inability to guarantee that nets are routed inside the reconfigurable region. Those limitations have been addressed by subsequent Xilinx design flows. Thus, the latest versions of the tool included in PlanAhead [15] propose the use of proxy logic, instead of bus macros, to solve the communication problem. In addition, static signals are allowed to cross reconfigurable areas, but at the cost of disabling module relocation.

Besides commercial solutions, CAD tools have also been proposed in academic environments. GoAhead [16] and OpenPR [17] are good examples. Both works face many of the same problems addressed here, such as module relocation or the independent design of reconfigurable modules and the rest of the system. In this paper, an alternative tool called Dreams is presented, which takes a different approach in several respects. The proposed tool targets the design of highly flexible architectures, besides traditional modular ones. The Dreams design flow is conceived as a post-processing stage that, starting from a conventional, independently placed and routed netlist for each module, transforms it to meet the specific requirements of dynamic and partial reconfiguration, giving the corresponding bitstream as a result. This strategy, combined with the use of the Xilinx Description Language (XDL), also allows easy portability of designed modules among different devices. Finally, all of this is done without further intervention from the designer.

The rest of the paper is structured as follows. In section 2, background information is provided. In section 3, the main features of the tool are described, while in section 4 the strategy to model and describe reconfigurable systems is proposed. In section 5, the mechanism to implement inter-module communications is detailed, and the design flow is outlined in section 6. Section 7 describes the proposed router, and a use case is presented in section 8. Finally, conclusions are drawn in section 9.

II. BACKGROUND

The main features of XDL and of the RapidSmith APIs are described in this section, together with the related work.

A. Related Work

First generations of Xilinx tools [14] relied on bus macros to solve the problem of communications from/to reconfigurable modules. Bus macros are presynthesized interfaces containing two terminals and fixed routing between them, which are placed in fixed positions in every module. This approach has some benefits, such as allowing module relocation and independent design of each module and the static system. However, bus macros introduce area and delay overhead. This is accompanied by the inability to guarantee that nets are routed inside the reconfigurable regions. Nets belonging to the static system must also be carefully controlled to prevent them from entering any reconfigurable area. Therefore, this methodology led to low designer productivity.

In contrast, the latest tools targeting reconfigurable systems within Xilinx PlanAhead rely on proxy logic to overcome bus macro limitations. Proxy logic is a single LUT belonging to the static system, but placed in a fixed position of the reconfigurable region. Routing between the proxy logic and the system is kept unchanged during the implementation of both the static system and all the reconfigurable modules. In addition, all the reconfigurable modules and the static system are designed together, and therefore static signals are allowed to cross reconfigurable areas. This solution, while increasing designer productivity, allows neither module relocation to different reconfigurable regions nor direct communication between reconfigurable modules.

CAD tools such as GoAhead [16] and OpenPR [17] have also been proposed in academic environments. However, there are some differences between them and the Dreams tool. One is the way the routing of modules is kept within reconfigurable regions: while Dreams uses a custom router, previous works are based on blocker macros together with the vendor's router. With regard to inter-module communication, in Dreams the implemented router is also used to guarantee that interfaces between interchangeable modules use the same device wires, with no extra area or delay overhead. The data structures used for this purpose can be extracted and applied in subsequent designs reusing the same reconfigurable modules. In addition, the Dreams design flow is conceived as a post-processing stage, reducing human intervention during the design process.

B. Xilinx Description Language (XDL)

XDL is a human-readable language [21], fully equivalent to the proprietary binary NCD format used internally by Xilinx tools to describe netlists after technology mapping. Translation from NCD to XDL can be carried out using xdl (with option -ncd2xdl), a command line tool included in the ISE toolset. Besides the header, the body of an XDL design file contains three kinds of elements: (1) the design ports, (2) the design instances, which are instantiations of different primitive types containing the logic of the design, and (3) the nets connecting those instances. Each instance definition has a unique name as well as its corresponding type in the device fabric. In addition, it contains a configuration string defining the attributes needed to obtain the expected logic, and also the instance placement. With respect to the nets, each one includes a description of the pins involved. In addition, if the net is routed, the path is described by means of the PIPs (Programmable Interconnect Points) actually configured, which correspond to the combinations of wires selected in the switch matrix and determine the activation of the required multiplexers. A tutorial on XDL, with several usage examples, can be found in [19].

As follows from the description above, XDL files gather all the information about the design in a readable language, making them a suitable interface to check design features, to implement design transformations or to generate design modules and macros, among others. Moreover, since the structure and syntax of XDL files are regular and repetitive, these activities can be automated, opening the way to the development of custom design tools. Some examples of automatic macro generators based on XDL are ReCoBus-Builder [20], HMFlow [22] and the DHHarMa design flow presented in [26]. Designs obtained by modifying XDL files can later be converted back to NCD format with xdl (-xdl2ncd option). Therefore, CAD tools built upon XDL can be completely integrated within the conventional design flow.

In addition to requiring full access to design netlists, the development of a custom CAD tool requires a complete database with detailed descriptions of every resource in the device fabric. In this regard, the xdl tool is also able to generate reports (known as XDLRC) describing the resources available on the selected device part. However, those descriptions are huge, requiring up to several gigabytes in the case of larger devices.
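To illustrate why the regular structure of XDL lends itself to automation, the following minimal Python sketch collects the PIPs of each net from an XDL-like text fragment. The fragment, its instance and wire names, and the helper `parse_net_pips` are illustrative assumptions, and the regular expressions only cover `net` statements and unidirectional `pip` statements written as shown; this is not a complete XDL grammar.

```python
import re

# Hypothetical, simplified XDL fragment used only for illustration.
SAMPLE_XDL = '''
net "data0" ,
  outpin "src_slice" A ,
  inpin "dst_slice" B1 ,
  pip CLBLL_X16Y59 L_A -> SITE_LOGIC_OUTS0 ,
  pip INT_X16Y59 LOGIC_OUTS0 -> EL2BEG0 ,
  ;
net "unrouted" ,
  outpin "src_slice" B ,
  ;
'''

NET_RE = re.compile(r'net\s+"([^"]+)"')
# Only the "->" connection form is handled in this sketch.
PIP_RE = re.compile(r'pip\s+(\S+)\s+(\S+)\s+->\s+([^\s,]+)')


def parse_net_pips(xdl_text):
    """Return {net name: [(tile, start wire, end wire), ...]}."""
    nets = {}
    current = None
    for raw in xdl_text.splitlines():
        line = raw.strip()
        net_match = NET_RE.match(line)
        if net_match:
            # A new net statement starts; its PIPs follow on later lines.
            current = net_match.group(1)
            nets[current] = []
            continue
        pip_match = PIP_RE.match(line)
        if pip_match and current is not None:
            nets[current].append(pip_match.groups())
    return nets
```

A net without `pip` entries, such as `unrouted` in the sample, maps to an empty list, which is one simple way to tell routed nets from unrouted ones.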

C. RapidSmith

In this work, instead of starting from scratch, the functions required in Dreams to modify XDL files have been developed on top of RapidSmith [18]. This is a Java-based open framework developed at Brigham Young University for the development of custom CAD tools. RapidSmith was selected for its public availability, ease of use and completeness at the time this work began. A similar alternative is Torc [23], a solution based on C++. However, the Torc XDL APIs are delivered as a component of a complete design flow for reconfigurable system design, including packers, routers and bitstream interfaces. Furthermore, RapidSmith incorporates an efficient device database, enhanced with a set of APIs and data structures through which the XDL design can be loaded and subsequently processed. Unlike XDLRC reports, the RapidSmith database includes a compact and efficient description of the architecture. In addition, the data structures included in RapidSmith have a direct correspondence both with the device resources originally included in the XDLRC file, such as Tiles, Primitive Sites, Wires or PIPs, and with the main XDL design declarations, like instances and nets. The RapidSmith APIs also offer easy access to the design data structures and basic transformations on them. For instance, they allow adding new PIPs to a net, changing the position of design instances or modifying their attributes, like the LUT equations. Based on these structures, RapidSmith eases the creation of FPGA CAD tools, like the one reported in this work.

III. DESIGN TOOL FEATURES

The main features of the Dreams design flow are the following:

• Module relocation in any compatible reconfigurable region in the device, using a communication scheme without extra overhead.

• Independent design of modules and the static system, allowing system upgrading during its life-time.

• Reduced intervention required from the designer, hiding the low-level details.

• Enhanced module portability among different reconfigurable devices.

• Support for the design of highly flexible reconfigurable architectures.

The Dreams tool targets architectures made up of a set of disjoint rectangular regions. Those regions can host system modules, which can be either static or reconfigurable. Every system must have a static module, which holds the hardware elements that remain unchanged during system life-time, as well as the design IOBs. To increase architectural flexibility, each reconfigurable module can be allocated in one or more reconfigurable regions. The only limitation is that the merged region must itself be a rectangle. This way, the same architecture can be used to define modules with different requirements in terms of logic resources and shapes. When defining system regions, it is advisable to choose the granularity envisaged for the finest-grain module in the system. This choice will also impact communication among modules, as will be described later.

Two examples of possible architectures are shown in figure 1. In case (a), a slot-based architecture is defined, where each slot spans a row and is a single CLB wide. The possibility of allocating modules with different sizes in each set of slots is also exploited, avoiding the well-known internal and external fragmentation problems. In case (b), the architecture is defined as an array of regions, which can be merged to contain modules with different sizes.

Figure 1: Examples of possible architectures to be designed with the tool. In (a), a fine-grain slot architecture is shown, while (b) provides a mesh-type solution.

As the preceding examples show, defining the architecture in terms of regions does not limit system flexibility, since selecting a suitable collection of regions allows any envisaged architecture to be implemented. Whatever architecture is defined, the Dreams tool is able to detect and guarantee communication symmetry, based on the set of possible positions where each module may be relocated. Regarding the user interface, designers simply have to follow some rules for naming the ports of the reconfigurable modules, so that the tool can automatically identify the nets corresponding to communications with other modules. In addition, two system description files must be provided, as described in the following section.
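The constraint that a module spanning several regions must occupy a rectangle can be checked mechanically. The sketch below is one possible check, not the tool's actual implementation: the `Rect` type, the inclusive tile coordinates and the function names are assumptions introduced for illustration. It relies on the fact that a set of disjoint rectangles fills its bounding box exactly when the sum of their areas equals the area of the box.

```python
from typing import Iterable, NamedTuple, Optional


class Rect(NamedTuple):
    """Rectangle in tile coordinates; both corners are inclusive."""
    x0: int
    y0: int
    x1: int
    y1: int


def area(r: Rect) -> int:
    return (r.x1 - r.x0 + 1) * (r.y1 - r.y0 + 1)


def overlaps(a: Rect, b: Rect) -> bool:
    return a.x0 <= b.x1 and b.x0 <= a.x1 and a.y0 <= b.y1 and b.y0 <= a.y1


def merge_regions(regions: Iterable[Rect]) -> Optional[Rect]:
    """Return the merged rectangle, or None if the union is not a rectangle."""
    regions = list(regions)
    if not regions:
        return None
    # Regions of one architecture are expected to be disjoint.
    for i, a in enumerate(regions):
        for b in regions[i + 1:]:
            if overlaps(a, b):
                return None
    box = Rect(
        min(r.x0 for r in regions),
        min(r.y0 for r in regions),
        max(r.x1 for r in regions),
        max(r.y1 for r in regions),
    )
    # Disjoint rectangles tile the bounding box only if the areas match.
    if sum(area(r) for r in regions) == area(box):
        return box
    return None
```

For example, two horizontally adjacent regions of equal height merge into a single wider rectangle, while two regions that only touch at a corner are rejected.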

IV. RECONFIGURABLE SYSTEM DESCRIPTION AND MODELLING

Before proceeding to the implementation stage, this section describes how to model and specify the system under design so that it can be understood by the Dreams tool. Mainly, it is necessary to define the area corresponding to each reconfigurable region, and to describe the modules composing the system.

A. SYSTEM ARCHITECTURE DESCRIPTION

The system architecture is modeled by means of two objects, the Virtual Region (VR) and the Virtual Architecture (VA). A Virtual Region is defined as a set of FPGA logic resources which can be used to allocate a module, either reconfigurable or static. VRs are rectangles, defined by the coordinates of the CLBs located in the lower left and upper right corners. One of the VRs of the system can be defined without providing corner coordinates. In that case, the VR area is the area of the FPGA not included in any other VR of the system. This is intended as an easy way to define the static area. The set of Virtual Regions describing the full system makes up the Virtual Architecture.

The VA corresponding to a system under design must be described by the user in an XML file. An example is shown in figure 2, for the case of a mesh-type architecture. In this case, seventeen different regions are defined. Sixteen of them (Rec00 to Rec33) are defined using the coordinates of the CLBs in the corners, and the remaining one is described as the rest of the FPGA area.

<Architecture>
  <partName>xc5vlx110tff1136-</partName>
  <VirtualRegion>
    <Name>Rec00</Name>
    <P0>INT_X43Y80</P0>
    <Pf>INT_X44Y89</Pf>
  </VirtualRegion>
  <VirtualRegion>
    <Name>Rec01</Name>
    <P0>INT_X45Y80</P0>
    <Pf>INT_X46Y89</Pf>
  </VirtualRegion>
  ...
  <VirtualRegion>
    <Name>Rec33</Name>
    <P0>INT_X49Y110</P0>
    <Pf>INT_X50Y119</Pf>
  </VirtualRegion>
  <VirtualRegion>
    <Name>Static</Name>
    <P0></P0>
    <Pf></Pf>
  </VirtualRegion>
</Architecture>

Figure 2: Example of architecture definition, for the case of an architecture with 16 elements (only 3 of them are shown in the XML file).
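A file with the layout of figure 2 can be read with a few lines of standard-library Python. The sketch below is only an illustration of the format, not the parser used inside Dreams; the function names are assumptions, and the sample string reproduces just the entries visible in the figure. A region with empty corner fields is reported as `(None, None)`, matching the convention that it covers the rest of the FPGA.

```python
import re
import xml.etree.ElementTree as ET

# Entries copied from the figure; the elided regions are left out here.
ARCH_XML = """
<Architecture>
  <partName>xc5vlx110tff1136-</partName>
  <VirtualRegion><Name>Rec00</Name><P0>INT_X43Y80</P0><Pf>INT_X44Y89</Pf></VirtualRegion>
  <VirtualRegion><Name>Rec01</Name><P0>INT_X45Y80</P0><Pf>INT_X46Y89</Pf></VirtualRegion>
  <VirtualRegion><Name>Rec33</Name><P0>INT_X49Y110</P0><Pf>INT_X50Y119</Pf></VirtualRegion>
  <VirtualRegion><Name>Static</Name><P0></P0><Pf></Pf></VirtualRegion>
</Architecture>
"""

TILE_RE = re.compile(r"INT_X(\d+)Y(\d+)$")


def parse_tile(text):
    """Convert 'INT_X43Y80' to (43, 80); empty or missing text gives None."""
    if text is None or not text.strip():
        return None
    match = TILE_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unexpected tile name: {text!r}")
    return int(match.group(1)), int(match.group(2))


def load_architecture(xml_text):
    """Return (part name, {region name: (lower-left, upper-right)})."""
    root = ET.fromstring(xml_text)
    regions = {}
    for vr in root.findall("VirtualRegion"):
        corners = (parse_tile(vr.findtext("P0")), parse_tile(vr.findtext("Pf")))
        regions[vr.findtext("Name")] = corners
    return root.findtext("partName"), regions
```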

B. SYSTEM MODULES DESCRIPTION

Each module to be integrated in the system is modeled using a data structure called Virtual Module (VM). Each VM specifies the file (HDL or NCD) describing the module, as well as the position of the module in the FPGA. Module position is indicated as a set of neighboring Virtual Regions. In addition, other sets of VRs where the module may be relocated can be defined. Finally, module connectivity is also described in the structure, in terms of the Virtual Region and the border side affected by each interface. Information about which clock resources must be used is included as well.

The set of Virtual Modules to be processed by the tool must be defined by the designer in another XML file. An example is shown in figure 3. It describes a reconfigurable system with two modules, one static and one reconfigurable. The static one is fixed in the region called Static, and it includes two communication interfaces, called Virtual Borders. Virtual Borders are an important concept, which will be described further in the next section. In this case, both of them connect the region Static, where the module is defined, with Rec00. Module descriptions refer to the VA shown in figure 2. Indicating the border side explicitly (North, South, East or West) prevents ambiguities. In this case, the tool is able to distinguish which signals are part of the Virtual Border crossing through the south, and which through the west, even though both connect the static region with Rec00. On the other hand, the reconfigurable module in the example is located in Rec11, and it has four Virtual Borders, one with each of its closest neighbor regions. In this example, the module occupies a single Virtual Region, but it can be relocated to any position of the mesh-type architecture. Based on this information, the Dreams tool identifies which inter-module communication interfaces must be equal.

<System>
  <CLK>1</CLK>
  <Name>ReconfigurableSystemName</Name>
  <VM type = "Static">
    <Name>StaticSystemExample</Name>
    <File>top.ncd</File>
    <VirtualRegion>Static</VirtualRegion>
    <Connectivity>
      <VB> <Region>Rec00</Region> <Side>W</Side> </VB>
      <VB> <Region>Rec00</Region> <Side>S</Side> </VB>
    </Connectivity>
  </VM>
  <VM type = "Reconfigurable">
    <Name>element</Name>
    <File>pe.vhd</File>
    <VirtualRegion>Rec11</VirtualRegion>
    <Connectivity>
      <VB> <Region>Rec10</Region> <Side>W</Side> </VB>
      <VB> <Region>Rec12</Region> <Side>E</Side> </VB>
      <VB> <Region>Rec21</Region> <Side>N</Side> </VB>
      <VB> <Region>Rec01</Region> <Side>S</Side> </VB>
    </Connectivity>
    <Positions>
      <VirtualRegion>Rec00</VirtualRegion>
      ...
      <VirtualRegion>Rec33</VirtualRegion>
    </Positions>
  </VM>
</System>

Figure 3: Example of Modules definition file, with two modules.
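The modules definition file can be loaded in the same spirit. The following Python sketch is an assumption about how such a file might be turned into plain records and is not taken from the tool; the dictionary keys and the function name are illustrative. The sample keeps only the first and last relocation positions of the figure, since the intermediate entries are elided there.

```python
import xml.etree.ElementTree as ET

SYSTEM_XML = """
<System>
  <CLK>1</CLK>
  <Name>ReconfigurableSystemName</Name>
  <VM type = "Static">
    <Name>StaticSystemExample</Name>
    <File>top.ncd</File>
    <VirtualRegion>Static</VirtualRegion>
    <Connectivity>
      <VB> <Region>Rec00</Region> <Side>W</Side> </VB>
      <VB> <Region>Rec00</Region> <Side>S</Side> </VB>
    </Connectivity>
  </VM>
  <VM type = "Reconfigurable">
    <Name>element</Name>
    <File>pe.vhd</File>
    <VirtualRegion>Rec11</VirtualRegion>
    <Connectivity>
      <VB> <Region>Rec10</Region> <Side>W</Side> </VB>
      <VB> <Region>Rec12</Region> <Side>E</Side> </VB>
    </Connectivity>
    <Positions>
      <VirtualRegion>Rec00</VirtualRegion>
      <VirtualRegion>Rec33</VirtualRegion>
    </Positions>
  </VM>
</System>
"""


def load_modules(xml_text):
    """Return one dictionary per Virtual Module found in the file."""
    root = ET.fromstring(xml_text)
    modules = []
    for vm in root.findall("VM"):
        modules.append({
            "name": vm.findtext("Name"),
            "type": vm.get("type"),
            "file": vm.findtext("File"),
            # Direct children only: the original placement of the module.
            "regions": [e.text for e in vm.findall("VirtualRegion")],
            # (neighbor region, side) pairs, one per Virtual Border.
            "borders": [(vb.findtext("Region"), vb.findtext("Side"))
                        for vb in vm.findall("Connectivity/VB")],
            # Alternative positions where the module may be relocated.
            "positions": [e.text for e in vm.findall("Positions/VirtualRegion")],
        })
    return modules
```

Keeping the original placement (`regions`) apart from the relocation targets (`positions`) mirrors the distinction made in the text between where a module is captured and where it may later be placed.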

Each system module can be described either as an HDL file or as a routed netlist, and the designer can select the most suitable choice in each case. Using an HDL file for each module is suitable for simple designs, such as the modules of the parallel architecture shown in figure 1 b). However, if the module is complex, with further requirements and constraints, it is preferable to design it completely using the vendor's tools. This is the typical situation for the static module. In this case, the module position must be introduced manually by the designer in the User Constraints File (UCF), instead of being generated automatically by the Dreams tool. Distinguishing between the original module placement and the positions where it may be relocated makes it possible to capture modules from placed and routed NCD files.

V. INTER-MODULE COMMUNICATIONS: VIRTUAL BORDERS

Communication among modules is one of the main issues to solve when a reconfigurable system is implemented, since signal integrity at the module boundaries has to be guaranteed. The solution proposed in the Dreams tool is based on the definition of Virtual Borders (VB).

A Virtual Border is a data structure containing information about the specific device wires used to cross the boundary between two Virtual Regions, for each cross-border net. More specifically, for each crossing signal, the last wire in the region where it is an output and the first wire in the region where it is an input are stored. If the Virtual Borders of all the Virtual Modules that can be allocated at run-time in the same set of Virtual Regions are equal, correct communication is guaranteed. If a Virtual Module occupies more than one Virtual Region, a different Virtual Border is defined for each interface between regions. In the rest of the section, we will consider that each module occupies a single Virtual Region, for the sake of simplicity.

Reconfigurable communication interfaces with other modules are described by the user as ports in HDL, and are therefore mapped to external IOBs when synthesized independently. To allow the tool to identify these ports as members of a Virtual Border, their names have to be chosen according to the following rules:

- The port name has to include the name of the destination Virtual Region.
- The port name has to include the word "Reconf", to distinguish it from the non-reconfigurable ports.
- The port name has to include the side where the signal leaves the Virtual Region. When the module is static, the side is defined with respect to the neighbor region, while if it is reconfigurable, it is defined with respect to its own region.
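The naming rules above state what a port name must contain but not its exact layout. The Python sketch below therefore assumes a hypothetical underscore-separated convention such as `Reconf_Rec00_W_data0`; the token order, the single-letter side codes and the function name are assumptions made only to show how ports could be classified automatically.

```python
SIDES = {"N", "S", "E", "W"}


def classify_port(port_name, region_names):
    """Return (region, side) for a reconfigurable port, or None otherwise.

    Assumes underscore-separated tokens (hypothetical convention): one token
    is the word 'Reconf', one is a known region name and one is a side code.
    """
    tokens = port_name.split("_")
    if "Reconf" not in tokens:
        # Regular port: it stays mapped to an actual FPGA IOB.
        return None
    regions = [t for t in tokens if t in region_names]
    sides = [t for t in tokens if t in SIDES]
    if len(regions) != 1 or len(sides) != 1:
        raise ValueError(f"ambiguous reconfigurable port name: {port_name!r}")
    return regions[0], sides[0]
```

Under this assumed convention, `classify_port("Reconf_Rec00_W_data0", {"Rec00", "Rec01"})` yields `("Rec00", "W")`, while an ordinary port such as `clk_in` yields `None`.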

In any case, the names of the reconfigurable ports are given with respect to the original position of the module, regardless of where it can be relocated. With the Dreams tool, Virtual Borders can not only be extracted to check the correctness of inter-module communications, but also applied, and even transformed. Applying a Virtual Border implies routing each signal using the predefined resources. Virtual Borders act like masks, as in photolithography. Unlike the GoAhead or OpenPR proposals, in this work a custom FPGA router has been implemented, which is able to guarantee this condition.

Since those nets are mapped to external device ports, the first step to apply a Virtual Border is to remove those IOBs. The original situation before this process is shown in figure 4 a). Then, to route a Virtual Border net, it is necessary to instantiate temporary Fake Macros, placed in the destination region. Fake Macros are CLBs which are added in random positions of the neighbor region, and used as the pins of the nets involved in the Virtual Border. Those macros have no impact on the final system, since they are removed when extracting the partial bitstream corresponding to the module. Thus, only the net wires will be configured. Figure 4 b) shows the situation with nets routed using those fake macros, but without symmetry. Finally, figure 4 c) shows the situation after the application of the Virtual Borders to every interface. In the case described in figure 4, the same Virtual Border is applied to the interfaces with the Virtual Regions at the north and south, and at the east and west, respectively. In this case, equivalent nets cross the module boundaries using equivalent resources. Since the same Virtual Border is applied to every module involved in the interface, routing resources always match after reconfiguration, allowing successful communication between the modules. A detail of the symmetry achieved in the communication interface after the application of the Virtual Border is shown in figure 5.

Figure 4: Virtual Border application process, panels (a), (b) and (c).

Figure 5: Situation after the application of the Virtual Border
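As a rough illustration of what extracting a Virtual Border involves, the Python sketch below scans the routed path of each net and records the pair of nodes where it crosses the boundary of a region: the last wire on the driving side and the first wire on the receiving side. The representation of a path as an ordered list of `(x, y, wire)` tuples, the wire names and the function names are assumptions for this example, not the data structures used in Dreams.

```python
def inside(region, x, y):
    """region is (x0, y0, x1, y1) in tile coordinates, corners inclusive."""
    x0, y0, x1, y1 = region
    return x0 <= x <= x1 and y0 <= y <= y1


def find_crossing(path, region):
    """Return (last node before the border, first node after it), or None.

    path is an ordered list of (x, y, wire) nodes from driver to sink.
    """
    for prev, node in zip(path, path[1:]):
        if inside(region, prev[0], prev[1]) != inside(region, node[0], node[1]):
            return prev, node
    return None


def extract_border(routed_nets, region):
    """Map each cross-border net name to its crossing pair."""
    border = {}
    for name, path in routed_nets.items():
        crossing = find_crossing(path, region)
        if crossing is not None:
            border[name] = crossing
    return border
```

Only the first crossing of each path is recorded, which is sufficient when a net is not allowed to cross a border more than once.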

Figure 6: Dreams tool design flow

Furthermore, Virtual Borders can be transformed before they are applied. Valid transformations are displacements, which allow, for instance, the same Virtual Border to be applied to different frontiers between modules, and inversions. Inversions are necessary since some Virtual Borders have to be applied both to the reconfigurable module and to the static one, and therefore input signals in one case are outputs in the other. This is the same situation as the symmetric connections required in two-dimensional mesh-type architectures. Based on the positions where a Virtual Module can be relocated, the Dreams tool automatically detects, identifies and applies the Virtual Borders which have to be equal in each system. In the example of figure 2, since the module element can be relocated to any position of the array, its north Virtual Border must be symmetric with respect to the one in the south, in the same way that the east interface must be symmetric to the west interface. Moreover, since the element can be relocated to the region Rec00, those interfaces must also be symmetric with respect to the static system Virtual Borders. In order to enable system updating during its life-time, Virtual Borders can be stored in external files and loaded in different projects.
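The two transformations mentioned in this section, displacement and inversion, can be sketched on a simple record type. In the Python example below, the `Crossing` class, its field names and the wire names are assumptions for illustration; it also assumes that compatible regions expose the same wire names at the same relative tile offsets, so that a displacement only shifts coordinates.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Crossing:
    """One net of a Virtual Border (hypothetical representation)."""
    last_tile: tuple    # (x, y) of the last wire on the driving side
    last_wire: str
    first_tile: tuple   # (x, y) of the first wire on the receiving side
    first_wire: str
    direction: str      # "out" or "in", as seen from the module


def displace(border, dx, dy):
    """Shift every crossing by (dx, dy) tiles, keeping the wire names."""
    def shift(tile):
        return (tile[0] + dx, tile[1] + dy)
    return {
        name: replace(c, last_tile=shift(c.last_tile),
                      first_tile=shift(c.first_tile))
        for name, c in border.items()
    }


def invert(border):
    """Swap the direction of every net: outputs become inputs and vice versa."""
    flip = {"out": "in", "in": "out"}
    return {name: replace(c, direction=flip[c.direction])
            for name, c in border.items()}
```

With frozen dataclasses, two borders compare equal exactly when every net uses the same wires in the same tiles, which is the property that must hold for interchangeable modules.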

VI. RECONFIGURABLE SYSTEM DESIGN FLOW

This section outlines the proposed flow for the design of reconfigurable systems. The Dreams tool is divided into two layers, the front-end and the back-end. The overall process is described in figure 6. The back-end is in charge of implementing each individual module, starting from the XML Virtual Architecture definition and from the module's XML description, extracted from the Modules definition file. The extraction of the single-module XML file is carried out by the design front-end, which also assigns equivalent Virtual Borders and decides a suitable module generation order. System designers can run both layers or just the back-end, depending on whether all system modules or just a single one are to be designed. However, individual Virtual Module descriptions include information about which Virtual Border has to be extracted or applied at each Virtual Region interface. Therefore, when a new module is implemented in the system, the designer has to decide which of those interfaces are equivalent and which Virtual Borders have to be applied.

A. SYSTEM DESIGN FRONT-END

This stage is in charge of preparing the netlist corresponding to each Virtual Module to be processed by the subsequent module design stage, reducing the designer's effort and therefore increasing productivity. In addition to generating the netlists, the XML files corresponding to each module have to be prepared, starting from the system description file provided by the designer. Therefore, the following processes are carried out:

1.- Netlist Generation: Modules provided as an HDL file, instead of an NCD, are synthesized using the vendor's tools. More specifically, PlanAhead is called by the Dreams tool in command-line mode, using the placement constraints described by the user in the Virtual Modules XML file. After this point, a placed and routed NCD is available for each module in the system.

2.- System Analysis: Starting from the XML descriptions provided by the designer as input arguments, the tool analyses the system in order to (1) identify the Virtual Borders between each module and all its neighboring Virtual Regions; (2) decide which of those Virtual Borders are equivalent, based on the positions where each module can be relocated; (3) identify in which module each equivalent Virtual Border has to be extracted, and where it must be applied; (4) determine the order in which the module designs have to be launched; and finally (5) launch the design of each module in that order. Among all the equivalent Virtual Borders, it is necessary to identify which one has to be extracted, and which will be applied in the rest of the equivalent positions. If all the modules have exactly the same ports, the specific order is not relevant. However, it is possible that some of the modules require only some of the signals crossing the boundary. A solution in this case is to generate, for each set of equivalent modules, one module including all the connections, called the Reference Module. It can be a simple module, which could be built automatically.

B. MODULE DESIGN BACK-END

The back-end includes all the methods and algorithms required to carry out the implementation of each module. It takes the module XML description, the Virtual Architecture and the NCD file as inputs, and delivers a partial bitstream corresponding to the module, without requiring extra effort from the designer. This process is repeated for every module in the system, and consists of the following steps:

1.- Checking Routing Validity: The first step performed by the tool is to detect conflicting nets outside the area corresponding to the reconfigurable region, and to reroute them using the router developed within this work. The effect of this step is shown in figure 7, where the area reserved for reconfigurable modules is free.

2.- Reconfigurable IO Parsing: The specific signals crossing the Virtual Region borders depend on the logical description of the modules. Therefore, according to the previously defined naming policy, the tool automatically detects which IOBs correspond to actual FPGA ports, and which are reconfigurable connections. Reconfigurable connections will be part of Virtual Borders, and will therefore have to be routed using specific resources.

3.- Module Processing: The next step is to process the module, in order to create the connections of the input and output signals belonging to Virtual Borders. To achieve this, it is necessary to replace the eliminated IOBs with the temporary Fake Macros described in the previous section. The tool is also able to deal with special cases, such as the same net being connected to several output ports, as well as bypass situations, which appear when no logic resource belonging to the net is located in the region.

4.- Load/Extract Virtual Borders: At this point, the nets involved in a Virtual Border are routed to a Fake Macro. If the system front-end decides to use this border as a reference, the corresponding Virtual Border resources are extracted and stored in an external file. Otherwise, the tool loads the corresponding resource description from a previously created file.

5.- Transform/Apply Virtual Borders: In every boundary where a Virtual Border is not extracted, it has to be applied. This has to be done after transforming it, in order to account for the displacement between the original position where it was extracted and the position where it finally has to be applied.

6.- Clock Management: Every VM using the same VRs, as well as the static system, must share the same clocking resources. To achieve this, the custom router is also used to reroute the clock signal. Figure 7 shows the extra logic introduced to generate the clock, in the middle of the reconfigurable regions.

7.- Partial Bitstream Generation: Once the design is completely processed, the corresponding bitstream is extracted. In the case of the static system, the full bitstream is created. This process is done using a custom tool created within Dreams, on the basis of the frame addressing information. Relocation during system life-time is carried out using the reconfiguration engine provided in [24].

Figure 7: Example of rerouting conflicting nets and the generation of the clock signal.

VII. CUSTOM ROUTER FOR MODULAR SYSTEMS

Vendor routers lack flexibility and restrict the control available to the designer. For this reason, a custom router has been developed in this work, exploiting the features of RapidSmith. The design of routers for FPGAs is a complex problem. Typically, the routing process is divided into two phases: a global routing phase, which groups nets together to balance all routing channels, and a detailed routing phase, which finally assigns specific wires to each net. In this case, since the number of signals to be routed is limited, only a detailed router has been implemented, which routes every desired net sequentially.

More specifically, Dijkstra's algorithm has been selected to route each net, according to the following metric:

$$ f_i = b + (1 - a)\, f_{i-1} + a\,(d_i + \mathit{NodeLevel}) $$

where b is the base cost, which depends on the type of wire used, $f_{i-1}$ is the accumulated cost of the path, $d_i$ is the Manhattan distance to the origin, NodeLevel is the level of the node in the path, and a is a constant that balances the weight of both factors. This metric leads to a depth-first search algorithm, similar to the well-known A* search routing [25].

The implemented routing methods offer some extra features in order to meet the requirements of the Dreams tool. The most important ones are listed below:

- Possibility of skipping wires that cross region boundaries, as well as of allowing PIPs only inside a given area.

- Net routing using fixed, given resources, splitting the process into successive stretches.

- Guarantee that at least one PIP of the net is located in a given region, in order to deal properly with bypass nets.

- Prevention of a net crossing a border between modules multiple times, to allow the extraction of Virtual Borders.

The router implemented in this work offers a performance good enough to cope with the rerouting of the Virtual Border nets, as will be shown in the results section. However, the rest of the design flow can be applied independently of the underlying router, as long as it offers enough flexibility.
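To make the metric and the area restriction more concrete, the following Python sketch runs a priority-queue search, in the style of Dijkstra's algorithm, over an abstract grid of tiles and uses the metric above as the priority of each candidate node. It is not the Dreams router and does not use the RapidSmith device model: the grid, the single base cost, the function names and the `allowed` predicate are assumptions made for this example, and $d_i$ is read here as the Manhattan distance to the source tile of the net.

```python
import heapq
import itertools


def manhattan(p, q):
    """Manhattan distance between two (x, y) tiles."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def step_cost(b, f_prev, d_i, node_level, a):
    """f_i = b + (1 - a) * f_{i-1} + a * (d_i + NodeLevel)."""
    return b + (1.0 - a) * f_prev + a * (d_i + node_level)


def route_net(source, sink, width, height, a=0.5, base_cost=1.0,
              blocked=frozenset(), allowed=None):
    """Search a path from source to sink on a width x height tile grid.

    blocked: tiles that cannot be used (e.g. already occupied).
    allowed: optional predicate restricting the search to a given area.
    Returns the list of tiles of the path, or None if no path exists.
    """
    tie = itertools.count()                      # tie-breaker for equal costs
    heap = [(0.0, next(tie), 0, source, None)]   # (f, tie, level, node, parent)
    parent = {}                                  # finalised nodes
    while heap:
        f, _, level, node, prev = heapq.heappop(heap)
        if node in parent:
            continue                             # already reached at lower cost
        parent[node] = prev
        if node == sink:
            path = []
            while node is not None:              # walk back to the source
                path.append(node)
                node = parent[node]
            return path[::-1]
        x, y = node
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if not (0 <= nxt[0] < width and 0 <= nxt[1] < height):
                continue
            if nxt in blocked or nxt in parent:
                continue
            if allowed is not None and not allowed(nxt):
                continue
            f_next = step_cost(base_cost, f, manhattan(nxt, source),
                               level + 1, a)
            heapq.heappush(heap, (f_next, next(tie), level + 1, nxt, node))
    return None


# Route around two blocked tiles while staying inside columns 0..4.
path = route_net((0, 0), (4, 0), width=6, height=5,
                 blocked={(2, 0), (2, 1)},
                 allowed=lambda tile: tile[0] <= 4)
```

The `allowed` predicate plays the role of the first feature in the list above, keeping every routed tile inside a given area. The remaining features, such as forcing a PIP inside a region or limiting border crossings, are not modelled here.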

VIII. SYSTEM DESIGN EXAMPLE AND RESULTS

As a use case, this paper describes the design process of highly parallel and regular architectures, consisting of two-dimensional mesh-type arrays of fine-grain processing elements, such as systolic arrays. In this example, the connectivity of each processing element is restricted to its closest neighbors, as shown in figure 8. These kinds of architectures are widely used in data-intensive processing applications. By implementing each processing element as a separate reconfigurable module, using the architecture shown in figure 1 b), the array can be adapted with a reduced reconfiguration overhead, just by reconfiguring the necessary modules.

Figure 8: 2D highly parallel and regular mesh-type architecture, together with a single processing element with the communication interfaces.
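The nearest-neighbor connectivity of such a mesh can be sketched in a few lines of Python. The 4×4 array size, the side names and the function names below are assumptions chosen for illustration; the sketch only enumerates which processing elements are adjacent and says nothing about how Dreams implements the links.

```python
def neighbours(row, col, rows, cols):
    """Return the closest neighbours of PE (row, col) in a rows x cols mesh,
    keyed by the side of the PE they are attached to."""
    candidates = {
        "north": (row - 1, col),
        "south": (row + 1, col),
        "west": (row, col - 1),
        "east": (row, col + 1),
    }
    # Sides that fall outside the array have no neighbouring PE.
    return {side: pos for side, pos in candidates.items()
            if 0 <= pos[0] < rows and 0 <= pos[1] < cols}


def mesh_links(rows, cols):
    """List every PE-to-PE link of the mesh exactly once."""
    links = []
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols:
                links.append(((r, c), (r, c + 1)))   # horizontal link
            if r + 1 < rows:
                links.append(((r, c), (r + 1, c)))   # vertical link
    return links


corner = neighbours(0, 0, 4, 4)     # a corner PE has two neighbours
inner = neighbours(1, 1, 4, 4)      # an inner PE has four neighbours
links = mesh_links(4, 4)
```

Under these assumptions, each entry in `links` corresponds to a boundary shared by two reconfigurable modules, while the missing sides of the PEs at the edge of the array would face the static system.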

In addition, the regularity of the architecture makes some of the modules identical. Therefore, module relocation is essential. The Dreams tool must support not only static-to-reconfigurable, but also reconfigurable-to-reconfigurable module communications. Figure 7 shows the processed layout corresponding to the static system. It includes different Virtual Borders to different VRs, such as Rec00, according to the terminology described in the VA in figure 2. The nets crossing the reconfigurable module have been eliminated in a previous step. Sixteen different reconfigurable modules have been created with the tool. Each of them implements a basic 8-bit arithmetic operation, and all of them can be relocated to any position of the array. One of the reconfigurable modules is shown in figure 5, with communications compatible with the static system.

If the system designer wants to port the architecture to a new device, the Dreams tool avoids the manual repetition of tedious steps: the netlists must be resynthesized, and the new system is generated just by calling Dreams again. The same applies if the Virtual Architecture changes, for instance, if some of the modules have to be generated with a different shape. Up to now, the tool only supports Virtex-5 devices, and the tests shown in this work have been done with a Virtex-5 LX110T. In this use case, different versions of the architecture have been created, including PEs with different dimensions, such as 1×5, 1×10, 2×5 and 2×10 CLBs.

IX. CONCLUSIONS AND FUTURE WORK

In this work, a tool for the design of dynamically reconfigurable systems with enhanced architectural flexibility is provided. The tool is based on the application of a set of processing steps described throughout the paper, including a novel scheme for inter-module communications, as well as the rerouting of conflicting nets. The Dreams tool is able to post-process the netlists describing the different system modules in order to create valid and independent bitstreams, which can be relocated to any compatible region of the device. Everything is done without further intervention from the designer. In future work, the tool will be enhanced with a graphical interface and with support for automatically detecting all the Virtual Border nets without requiring the reference. The routing features will also be enhanced. The tool will be made publicly available for download.

ACKNOWLEDGMENT

This work was supported by the Spanish Ministry of Economy and Competitiveness under the project DREAMS (Dynamically Reconfigurable Embedded Platforms for Networked Context-Aware Multimedia Systems) with number TEC2011-28666-C04-03.

REFERENCES

[1] Feilen, M.; Ihmig, M.; Zahlheimer, A.; Stechele, W., "Real-time signal processing on low-cost-FPGAs using dynamic partial reconfiguration," 13th International Symposium on Integrated Circuits (ISIC), 2011.

[2] Bayar, S.; Yurdakul, A.; Tukel, M., "A self-reconfigurable platform for general purpose image processing systems on low-cost Spartan-6 FPGAs," 6th International Workshop on Reconfigurable Communication-centric Systems-on-Chip (ReCoSoC), 2011.

[3] Liu, Shaoshan; Pittman, Richard Neil; Forin, Alessandro; Gaudiot, Jean-Luc, "On energy efficiency of reconfigurable systems with run-time partial reconfiguration," 21st IEEE International Conference on Application-specific Systems Architectures and Processors (ASAP), 2010.

[4] Hübner, M.; Meyer, J.; Sander, O.; Braun, L.; Becker, J.; Noguera, J.; Stewart, R., "Fast Sequential FPGA Startup Based on Partial and Dynamic Reconfiguration," IEEE Computer Society Annual Symposium on VLSI (ISVLSI), pp. 190-194, 5-7 July 2010.

[5] http://www.altera.com/literature/wp/wp-01137-stxv-dynamic-partial-reconfig.pdf

[6] Santambrogio, M.D., "From Reconfigurable Architectures to Self-Adaptive Autonomic Systems," International Conference on Computational Science and Engineering (CSE '09), 2009.

[7] Berns, A.; Ghosh, S., "Dissecting Self-* Properties," Third IEEE International Conference on Self-Adaptive and Self-Organizing Systems (SASO '09), 2009.

[8] Paulsson, K.; Hübner, M.; Becker, J., "Strategies to On-Line Failure Recovery in Self-Adaptive Systems based on Dynamic and Partial Reconfiguration," First NASA/ESA Conference on Adaptive Hardware and Systems (AHS 2006), pp. 288-291, 15-18 June 2006.

[9] Salvador, R.; Otero, A.; Mora, J.; de la Torre, E.; Sekanina, L.; Riesgo, T., "Fault Tolerance Analysis and Self-Healing Strategy of Autonomous, Evolvable Hardware Systems," International Conference on Reconfigurable Computing and FPGAs (ReConFig), 2011.

[10] Shaobin Zhang; Tongsen Hu; Minghui Wu; Tianzhou Chen; Zening Qu, "Load-Aware Dynamic Partial Reconfiguration Implementation of Crossbar Scheduler," IEEE Ninth International Conference on Dependable, Autonomic and Secure Computing (DASC), 2011.

[11] Paulsson, K.; Hübner, M.; Becker, J., "Exploitation of dynamic and partial hardware reconfiguration for on-line power/performance optimization," International Conference on Field Programmable Logic and Applications, 2008.

[12] Otero, A.; Salvador, R.; Mora, J.; de la Torre, E.; Riesgo, T.; Sekanina, L., "A fast Reconfigurable 2D HW core architecture on FPGAs for evolvable Self-Adaptive Systems," NASA/ESA Conference on Adaptive Hardware and Systems (AHS), 2011.

[13] Upegui, A.; Sanchez, E., "Evolving Hardware with Self-reconfigurable connectivity in Xilinx FPGAs," First NASA/ESA Conference on Adaptive Hardware and Systems (AHS 2006), 2006.

[14] Xilinx Inc., Two Flows for Partial Reconfiguration: Module Based or Difference Based, May 2002.

[15] Partial Reconfiguration User Guide, UG702 (V13.1). Available at http://www.xilinx.com/support/documentation/sw_manuals/xilinx13_1/ug702.pdf

[16] Beckhoff, Christian; Koch, Dirk; Torresen, Jim, "Go Ahead: A Partial Reconfiguration Framework," 20th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM), 2012.

[17] Sohanghpurwala, A.A.; Athanas, P.; Frangieh, T.; Wood, A., "OpenPR: An Open-Source Partial-Reconfiguration Toolkit for Xilinx FPGAs," IEEE International Symposium on Parallel and Distributed Processing Workshops and Phd Forum (IPDPSW), 2011.

[18] Lavin, C.; Padilla, M.; Lamprecht, J.; Lundrigan, P.; Nelson, B.; Hutchings, B., "RapidSmith: Do-It-Yourself CAD Tools for Xilinx FPGAs," International Conference on Field Programmable Logic and Applications (FPL), 2011.

[19] Beckhoff, C.; Koch, D.; Torresen, J., "The Xilinx Design Language (XDL): Tutorial and use cases," 6th International Workshop on Reconfigurable Communication-centric Systems-on-Chip, 2011.

[20] Koch, D.; Beckhoff, C.; Teich, J., "ReCoBus-Builder - A novel tool and technique to build statically and dynamically reconfigurable systems for FPGAs," International Conference on Field Programmable Logic and Applications, 2008.

[21] Xilinx Design Language Version 1.6, Xilinx, Inc., Xilinx ISE 6.1i. Documentation in ise6.1i/help/data/xdl, July 2000.

[22] Lavin, C.; Padilla, M.; Lamprecht, J.; Lundrigan, P.; Nelson, B.; Hutchings, B., "HMFlow: Accelerating FPGA Compilation with Hard Macros for Rapid Prototyping," 19th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM), 2011.

[23] Steiner, N.; Wood, A.; Shojaei, H.; Couch, J.; Athanas, P.; French, M., "Torc: towards an open-source tool flow," Proceedings of the 19th ACM/SIGDA International Symposium on Field Programmable Gate Arrays (FPGA '11), ACM, 2011.

[24] Otero, A.; Morales-Cas, A.; Portilla, J.; de la Torre, E.; Riesgo, T., "A Modular Peripheral to Support Self-Reconfiguration in SoCs," 13th Euromicro Conference on Digital System Design: Architectures, Methods and Tools (DSD), 2010.

[25] Tessier, R., "Negotiated A* Routing for FPGAs," Proceedings of the Fifth Canadian Workshop on Field-Programmable Devices, 1998.

[26] Korf, S.; Cozzi, D.; Koester, M.; Hagemeyer, J.; Porrmann, M.; Rückert, U.; Santambrogio, M.D., "Automatic HDL-Based Generation of Homogeneous Hard Macros for FPGAs," IEEE 19th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM), pp. 125-132, 1-3 May 2011.
