Date post: | 07-Apr-2018 |
8/6/2019 FPGA Floorplanning
Introduction to Floorplanning
Why do Floorplanning?
Floorplanning is the process of identifying structures that should be placed close
together, and allocating space for them in such a manner as to meet the sometimes
conflicting goals of available space (cost of the chip), required performance, and the
desire to have everything close to everything else.
Within the Xilinx chips it is often the case that the smallest area design is also the highest
performance design. This flies in the face of many design methodologies, where area and
speed are considered to be things that should be traded off against each other.
This is probably because routing resources are limited, and the more routing resources
a design uses, the slower it will operate. Optimizing for minimum area lets the design
use fewer resources, and also lets the sections of the design sit closer together. This
leads to shorter interconnect distances, fewer routing resources used, faster end-to-end
signal paths, and even faster and more consistent place and route times. Done correctly,
there are no negatives to Floorplanning.
What negatives could there be? Well, if the Floorplanning is done with no regard for the
architecture of the chip, then it is possible to actually do a worse job than the Xilinx placer
section of the place and route software. It is also possible that there are constraints that
are not well understood until placement is complete and routing commences. So the
issue then is what constitutes "done correctly".
As a general rule, data-path sections benefit most from Floorplanning, and random logic,
state machines, and other non-structured logic can safely be left to the placer section of
the place and route software.
Data paths are typically the areas of your design where multiple bits are processed in
parallel with each bit being modified the same way with maybe some influence from
adjacent bits. Example structures that make up data paths are Adders, Subtractors,
Counters, Registers, and Muxes.
How to Floorplan a design
Although there are no hard and fast rules to Floorplanning, this section outlines the basic
structure for a Floorplanned design, and highlights the issues you need to consider when
Floorplanning a design. As described above, Floorplanning has its greatest return when
applied to data path elements. The Xilinx XC4000 devices, and all of the derivative
families (the A, D, E, EX, H, L, XL, Spartan, and SpartanXL families) all have the
following basic structure:
o A rectangular array of Configurable Logic Blocks (CLBs). These
logic blocks contain two main function generators, and two flip-flops. The
function generators can represent any number of gates that as a group
has no more than 4 inputs, one output, and no internal loops (that would
implement latch like behavior). The flip-flops are either rising or falling
edge triggered, include a clock-enable function that is implemented with
a re-circulation multiplexer from the Q output to the D input, and can
have either an active high asynchronous reset or set function.
Associated with each CLB are two tri-stateable buffers.
o Segmented interconnect including short interconnect for local
signals, and long-lines for spanning the width or height of the chip. In
many of the devices, the horizontal long-lines can be split into a left and
a right half, allowing up to twice as many lines, that span half the width of
the chip.
o The two tri-stateable buffers associated with each CLB are pre-
connected to two of the horizontal long-lines.
o Input and Output pins on all 4 sides of the array.
o Pre-built Carry logic that is pre-connected vertically in columns of
CLBs.
To take advantage of these characteristics, consistently implement all data path elements
with a bit pitch of two bits per row, and always build data path elements as vertical
structures of one or more columns.
The Xilinx FPGAs are biased to have data flow along horizontal interconnect, and to have
arithmetic functions operate in vertical columns. The bias comes from the horizontal long
lines with tri-stateable buffers, and the vertical pre-built and routed carry logic.
The carry logic is also used to build fast counters, so although you may not initially think
of a counter as an arithmetic function, it falls into the same pattern as adders,
subtractors, and arithmetic comparisons, because of its use of the carry chain. This view
can be clarified by thinking of a counter as an incrementor, followed by a holding register.
The bit pitch of two bits per row is driven primarily by the structure of the carry logic, but
is also the bit pitch that the tri-stateable buffers implement. What this means is that the
natural structure of arithmetic functions in these devices implements 2 bits of a function
(a two bit slice) in one row of CLBs, and for simple functions, in one column. A simple
function such as a ten bit synchronous up-counter will therefore take 5 rows and 1
column, a total of 5 CLBs.
Although the XC4000 devices and the A, D, E, H, and L derivatives allow the carry signal
between CLBs to interconnect in both an up and down direction within a column, the
more recent XC4000EX, XC4000XL, Spartan and SpartanXL devices only support the
carry signals being routed up a column. For all devices, within a CLB, the carry routing is
up, with regard to the two function generators. It is expected that this up only bias will
exist in future products from Xilinx. To be compatible with all these products, you should
only use the up direction for carry, and this bias then affects all other functions that are
generated. For the example 10 bit counter described in the previous paragraph, the
Floorplan will have bit 0 and 1 in the CLB at the bottom of the column of 5 CLBs, and the
top CLB will have bits 8 and 9.
Following Xilinx's standard, the two main function
generators are shown on the left of diagrams, and are
labeled F and G, and the two flip-flops are shown on the
right and are labeled X and Y.
For the example counter, in the CLB at the bottom of the
five CLB group (the one with the RLOC=R4C0 attribute),
the F function generator will be used to implement the
logic that feeds the D pin of the X flip-flop, the output of
which is the least significant bit of the counter, Q0.
The G and Y sections of the same CLB implement bit 1 of
the counter. The next CLB above (the one with the
RLOC=R3C0 attribute) implements bit 2 and 3. This
continues up the column, through to the top CLB which
implements bits 8 and 9.
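
The bit-to-CLB mapping described above can be sketched in a few lines of Python. This is only an illustration of the numbering scheme, not part of any Xilinx tool: the function name `counter_floorplan` is hypothetical, and the sketch assumes a single column (C0), an up-only carry chain, and rows numbered from the top starting at R0.

```python
import math

def counter_floorplan(width_bits):
    """Map each bit of an up-carry data path to (RLOC, function generator, flip-flop)."""
    rows = math.ceil(width_bits / 2)  # bit pitch of two bits per CLB row
    placement = {}
    for bit in range(width_bits):
        # Rows are numbered top to bottom, so bit 0 lands in the bottom row.
        row = rows - 1 - bit // 2
        # Even bits use F and X, odd bits use G and Y (carry runs upward in a CLB).
        fgen, ff = ("F", "X") if bit % 2 == 0 else ("G", "Y")
        placement[bit] = (f"R{row}C0", fgen, ff)
    return placement

plan = counter_floorplan(10)
for bit in sorted(plan):
    rloc, fgen, ff = plan[bit]
    print(f"bit {bit}: RLOC={rloc}, function generator {fgen}, flip-flop {ff}")
```

For the ten bit counter this places bits 0 and 1 at R4C0 and bits 8 and 9 in the top CLB of the column, matching the description above.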
When two or more functions of your design are Floorplanned in this way and placed side
by side, with the signals that flow from one function to the next aligned on the same row,
and in near or adjacent columns, the design will place and route much faster, and the
result will perform faster than a design that has no Floorplanning and relies on
the Xilinx place and route software to decide on placement. Of course, custom building
each function section of your design with detailed Floorplanning for each function
generator and flip-flop can be a complex, time consuming, and potentially error prone
process.
The Xilinx Place and Route software uses a hierarchical placement constraint system
called relative location attributes. Each level of the hierarchy has an origin in the top left
corner that has a relative location of row zero and column zero. As a constraint this is
represented as R0C0. Rows are numbered from top to bottom, and columns are
numbered from left to right. When a relative location attribute (RLOC) is assigned to a
part of the hierarchy that is not a single CLB, that RLOC is added to each of the
underlying RLOCs to calculate the resolved RLOC value for each of them.
This process continues throughout the hierarchy, resolving each CLB RLOC to a value
that is relative to the RLOC at the top of the hierarchy. This process, and other issues
related to how RLOCs are processed are discussed in full in the Xilinx "Libraries Guide"
document, in the "Attributes, Constraints, and Carry Logic" chapter, in the "Relative
Location (RLOC) Constraints" section. Although this section of Xilinx's documentation is
quite complex, it is recommended that you review it to better understand how the RLOCs
in the modules support Floorplanning.
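
As a rough illustration of the additive resolution described above, the following Python sketch sums row and column offsets from the top of a hierarchy down to a single CLB. It is a simplified model under stated assumptions: the helper names `parse_rloc` and `resolve_rloc` are hypothetical, only the plain RnCm form is handled, and the real rules in the Libraries Guide cover more cases than this.

```python
import re

# Only the plain "R<row>C<col>" form is modeled here (an assumption).
RLOC_PATTERN = re.compile(r"R(-?\d+)C(-?\d+)")

def parse_rloc(rloc):
    """Split a string such as 'R4C0' into a (row, column) tuple."""
    match = RLOC_PATTERN.fullmatch(rloc)
    if match is None:
        raise ValueError(f"not a plain RLOC value: {rloc!r}")
    return int(match.group(1)), int(match.group(2))

def resolve_rloc(*levels):
    """Add the RLOCs of each hierarchy level, listed from the top down to the CLB."""
    row = sum(parse_rloc(level)[0] for level in levels)
    col = sum(parse_rloc(level)[1] for level in levels)
    return f"R{row}C{col}"

# A module placed at R1C2 that contains a CLB with RLOC=R4C0.
print(resolve_rloc("R1C2", "R4C0"))
```

In this model a CLB at R4C0 inside a module that itself carries R1C2 resolves to R5C2 relative to the top of the hierarchy.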
Figure: http://www.fliptronics.com/images/fgxy.gif
An Example design, with various levels of Floorplanning
This section examines the results of Floorplanning, and compares the resulting structure,
the place and route time, and the design performance. The example, while contrived, is
typical of the types of logic that benefit from Floorplanning. The example design
comprises four sixteen bit binary up counters that all feed into a selection multiplexer.
The output of the selection multiplexer is registered, and the output of this register is
connected to the FPGA pins.
There are two basic timing path categories that need to be analyzed. The first is the
maximum delay in any of the counters, and the second is the maximum delay from any of
the counters to the multiplexer output register. For the counter, the maximum delay will
be from the clock to out time of the LSB flip-flop, through the logic that establishes the
next counter value, to the D input of the MSB flip-flop, and meeting its setup time. The
reciprocal of this maximum internal delay within the counter is the maximum clock rate at
which the counter will count reliably.
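
The reciprocal relationship between the maximum internal delay and the maximum clock rate can be checked with a short calculation. The function name `max_clock_mhz` is hypothetical, and the 13.1 ns input is one of the counter delays reported in the results tables later in this document.

```python
def max_clock_mhz(delay_ns):
    """Maximum clock rate in MHz for a given worst-case register-to-register delay in ns."""
    # 1 / (delay in ns) gives GHz, so multiply by 1000 to express the result in MHz.
    return 1000.0 / delay_ns

print(f"{max_clock_mhz(13.1):.1f} MHz")
```

A delay of 13.1 ns corresponds to roughly 76.3 MHz, which agrees with the frequency listed next to that delay in the tables.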
Seven different levels of Floorplanning are applied to this simple design, using the
XC4005E, XC4010E, and XC4010XL as targets. The '-2' speed grade is used for all
examples, and place and route programs used are as follows:
1. XC4005E-2 PPR V5.2.1
2. XC4010E-2 PPR V5.2.1
3. XC4010E-2 PAR M1.4
4. XC4010XL-2 PAR M1.4
The combination of running the XC4010E devices with both place and route programs
allows comparison of these programs on the XC4000E families. Running both the
XC4010E and XC4010XL on the M1.4 program allows comparison of these two product
families. While the goal is to show the value of Floorplanning, the program and product
comparisons are interesting.
The same seven levels of Floorplanning were applied to each of these four
product/program combinations. The seven design styles have the following
characteristics:
1. The 4 counters are binary ripple counters (CB16CE), from the
Xilinx unified library XC4000E, the multiplexer and output register are
also taken from this library. There is no Floorplanning in this style, and
the choice of a ripple counter, while available in the library, is a poor
choice.
2. The 4 counters are binary counters that use the built-in carry
logic (CC16CE), from the Xilinx unified library XC4000E, the multiplexer
and output register are also taken from this library. While there is no
explicit Floorplanning in this style, the counters include internal
Floorplanning, because the carry logic imposes a column structure on
the counters.
flops with the multiplexers. A four-to-one multiplexer requires all the gate resources of a
CLB, so to build a 16 bit wide multiplexer with four inputs will require 16 CLBs. Strictly
maintain a Floorplanning structure of two bits of data path implemented per row of
structure. The 16 CLBs are Floorplanned to use two columns by eight rows, with bits 0
and 1 on the row at the bottom, and bits 14 and 15 at the top. This exactly matches the
bit position of the counters, except the counters have an additional block at the top, for
the TC and CEO outputs. This is resolved by placing the counters with RLOC_ORIGINs
on row 1, while the multiplexer is placed on row 2.
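
The row alignment between the counters and the multiplexer can be checked with a small Python sketch. It is a simplified model: the function name `bit_rows` and its parameters are hypothetical, and it assumes rows are numbered from the top, two bits per row, and exactly one extra CLB row at the top of each counter for the TC and CEO outputs.

```python
def bit_rows(origin_row, width_bits, extra_top_rows=0):
    """Return {bit: absolute row} for a vertical data path with two bits per row."""
    rows = (width_bits + 1) // 2
    # The least significant bits sit in the bottom row of the structure.
    bottom = origin_row + extra_top_rows + rows - 1
    return {bit: bottom - bit // 2 for bit in range(width_bits)}

# Counter: origin on row 1, with one extra row at the top for TC and CEO.
counter = bit_rows(origin_row=1, width_bits=16, extra_top_rows=1)
# Multiplexer: origin on row 2, with no extra row.
mux = bit_rows(origin_row=2, width_bits=16)

print("rows match for every bit:", counter == mux)
print("bit 0 row:", counter[0], "bit 15 row:", counter[15])
```

With these origins every counter bit ends up on the same row as the matching multiplexer bit, which is what keeps the connecting signals on a single row.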
At this point you may wonder what additional improvement could be made to style 6.
Consider the routing from the left most counter to the multiplexer. It must pass through
the other three counters to get to the multiplexer. Similarly, the output of counters two
and three must also pass through the fourth counter to get to the multiplexer. Therefore,
there is more routing congestion around counter four, although it has the shortest path to
the multiplexer. The output of the first counter must traverse the furthest distance to get
to the multiplexer. In synchronous designs like this, the slowest path out of a group of
paths will be the limiting factor. For the counters to run at their fastest, they need to have
their routing congestion minimized. For the paths from the four counters to the multiplexer
to be minimized, the multiplexer and the four counters need to be placed so as to
minimize the worst-case distance. Both of these goals are achieved in style 7 by placing
the multiplexer and its output register in the middle of the structure, with two counters to
its left, and two counters to its right.
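
The benefit of moving the multiplexer to the middle can be illustrated with a crude column-distance model. This sketch is an assumption-laden simplification, not a timing analysis: the function name `worst_case_distance` is hypothetical, each counter is assumed to occupy one column and the multiplexer two, and distance is counted in whole columns only.

```python
def worst_case_distance(counter_cols, mux_cols):
    """Largest column distance from any counter column to any multiplexer column."""
    return max(abs(c - m) for c in counter_cols for m in mux_cols)

# Style 6: four counters side by side, multiplexer in the two columns to their right.
style6 = worst_case_distance(counter_cols=[0, 1, 2, 3], mux_cols=[4, 5])
# Style 7: multiplexer in the middle, two counters on each side.
style7 = worst_case_distance(counter_cols=[0, 1, 4, 5], mux_cols=[2, 3])

print("style 6 worst case:", style6, "columns")
print("style 7 worst case:", style7, "columns")
```

In this model the worst-case span drops from five columns to three, and no signal has to cross more than one other counter on its way to the multiplexer.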
As can be seen from the following tables and diagrams, style 7 delivers the fastest
counters, the fastest counter to multiplexer output register time, the fastest placement
time, and the fastest routing time. Studying the schematics for design styles 1 and 7
shows that almost no additional effort is needed to create design 7's result. Selecting
counters and multiplexers that are pre-Floorplanned, together with five placement
attributes, is all that is required. (Some thought as to what the placement constraints
should be is obviously also needed.)
XC4005EPC84-2 Processed with PPR V5.2.1c

Design  Counter     Max Frequency  Counter to MUX  Partition +         Routing   CLBs
Style   Delay (ns)  (MHz)          REG Delay (ns)  Placement Time (s)  Time (s)  Used
1       17.1        58.4           11.8            4+28                12        72
2       13.1        76.3           10.8            6+15                13        48
3       13.4        74.6           11.7            6+14                17        48
4       13.1        76.3           14.4            7+12                17        48
5       14.3        69.9           14.5            6+12                16        48
6       13.3        75.1           9.4             3+11                16        48
7       13.1        76.3           8.9             3+11                14        48

XC4010EPC84-2 Processed with PPR V5.2.1c

Design  Counter     Max Frequency  Counter to MUX  Partition +         Routing   CLBs
Style   Delay (ns)  (MHz)          REG Delay (ns)  Placement Time (s)  Time (s)  Used
1       17.5        57.1           12.9            7+53                32        88
2       13.3        75.1           11.2            4+13                12        48
3       13.5        74.0           12.6            4+11                15        48
4       13.1        76.3           14.6            4+11                17        48
5       13.2        75.7           14.2            3+11                14        48
6       13.3        75.1           10.2            2+10                16        48
7       13.1        76.3           8.9             1+10                15        48

XC4010EPC84-2 Processed with M1.3.7 (PAR L4 D5) (A)

Design  Counter     Max Frequency  Counter to MUX  Placement   Routing       CLBs
Style   Delay (ns)  (MHz)          REG Delay (ns)  Time (s)    Time (s)      Used
1       21.9        45.6           19.4            65-7=58     574-65=509    55
2       13.7        72.9           10.0            47-7=40     142-47=95     48
3       13.8        72.4           10.3            38-8=30     170-38=132    48
4       13.8        72.4           12.7            28-8=20     132-28=104    56
5       13.7        72.9           13.1            28-8=20     128-28=100    56
6       13.7        72.9           9.4             15-8=7      80-15=65      48
7       13.7        72.9           8.9             14-8=6      75-14=61      48

XC4010XLPC84-2 Processed with M1.3.7 (PAR L4 D5) (B)

Design  Counter     Max Frequency  Counter to MUX  Placement   Routing       CLBs
Style   Delay (ns)  (MHz)          REG Delay (ns)  Time (s)    Time (s)      Used
1       18.5        54.0           8.8             68-20=48    147-68=79     55
2       11.6        86.2           7.0             53-21=32    134-53=81     48
3       11.9        84.0           6.9             46-21=25    128-46=82     48
4       12.1        82.6           10.6            34-22=12    95-34=61      56
5       11.7        85.4           10.7            33-21=12    91-33=58      56
6       11.9        84.0           6.8             25-20=5     64-25=39      48
7       11.7        85.4           6.1             26-21=5     69-26=43      48

XC4010XLPC84-2 Processed with M1.4.12 (MAP K, PAR L4 D5)

Design  Counter     Max Frequency  Counter to MUX  Placement   Routing       CLBs
Style   Delay (ns)  (MHz)          REG Delay (ns)  Time (s)    Time (s)      Used
1       18.2        54.9           11.3            64-20=44    185-64=121    83
2       11.3        88.5           9.8             39-21=18    183-39=144    72
3       11.8        84.7           10.6            33-20=13    108-33=75     72
4       11.6        86.2           10.8            32-21=11    128-32=96     72
5       11.7        85.4           11.0            32-21=11    116-32=84     72
6       11.6        86.2           6.8             24-21=3     59-24=35      48
7       11.7        85.4           6.1             24-20=4     61-24=37      48

XC4010XLPC84-2 Processed with M1.4.12 (MAP K, PAR L5 D5)

Design  Counter     Max Frequency  Counter to MUX  Placement   Routing       CLBs
Style   Delay (ns)  (MHz)          REG Delay (ns)  Time (s)    Time (s)      Used
1       17.3        57.8           11.3            99-20=79    224-99=125    83
2       11.7        85.4           9.9             58-21=37    229-58=171    72
3       12.1        82.6           10.5            46-20=26    140-46=94     72
4       11.6        86.2           11.1            44-21=23    117-44=73     72
5       11.7        85.4           10.9            44-21=23    134-44=90     72
6       12.1        82.6           6.7             27-21=6     60-27=33      48
7       11.7        85.4           6.1             27-21=6     66-27=39      48

XC4010XLPC84-2 Processed with M1.4.12 (PAR L4 D5)

Design  Counter     Max Frequency  Counter to MUX  Placement   Routing       CLBs
Style   Delay (ns)  (MHz)          REG Delay (ns)  Time (s)    Time (s)      Used
1       18.8        53.2           9.1             63-20=43    199-63=136    55
2       12.0        83.3           7.7             45-20=25    132-45=87     48
3       12.2        81.9           6.7             36-21=15    116-36=80     48
4       11.9        84.0           10.3            30-20=10    97-30=67      56
5       12.0        83.3           10.5            31-21=10    103-31=72     56
6       11.6        86.2           6.8             24-20=4     58-24=34      48
7       11.7        85.4           6.1             24-20=4     61-24=37      48

XC4010XLPC84-2 Processed with M1.4.12 (PAR L5 D5)

Design  Counter     Max Frequency  Counter to MUX  Placement   Routing       CLBs
Style   Delay (ns)  (MHz)          REG Delay (ns)  Time (s)    Time (s)      Used
1       18.1        55.2           7.7             105-21=84   257-105=152   55
2       12.0        83.3           6.7             72-21=51    199-72=127    48
3       11.8        84.7           6.8             55-21=34    138-55=83     48
4       12.1        82.6           10.5            40-21=19    148-40=108    56
5       12.1        82.6           10.6            40-20=20    102-40=62     56
6       12.1        82.6           6.7             29-22=7     61-29=32      48
7       11.7        85.4           6.1             27-21=6     66-27=39      48

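
To put the last table (XC4010XLPC84-2, M1.4.12, PAR L5 D5) in perspective, the following Python sketch computes how many times better style 7 is than style 1 on three of the reported measures. The numbers are copied from rows 1 and 7 of that table; the dictionary keys and the function name `improvement_ratios` are hypothetical labels chosen for this illustration.

```python
# Values copied from rows 1 and 7 of the M1.4.12 (PAR L5 D5) table above.
style1 = {"counter_delay_ns": 18.1, "placement_s": 84, "routing_s": 152}
style7 = {"counter_delay_ns": 11.7, "placement_s": 6, "routing_s": 39}

def improvement_ratios(before, after):
    """Ratio of each 'before' value to the matching 'after' value (higher is better)."""
    return {key: before[key] / after[key] for key in before}

ratios = improvement_ratios(style1, style7)
for key, value in ratios.items():
    print(f"{key}: {value:.2f}x")
```

For this run the placement time improves by a much larger factor than the counter delay, which is consistent with the claim that Floorplanning helps place and route times as well as performance.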
Interpreting the Floorplan Pictures
The full manual has all the pictures for all 8 of the above tables of data. This page only
has the pictures for the last table, which is the M1 PAR V1.4.12 run with -L 5 and -D 5.
These settings represent high effort in both the placer and the router.
At the time of writing this page, the XC4000XL is Xilinx's leading FPGA family, and the
M1 PAR version 1.4.12 is the current version of the place and route software.
The color coding of the following Floorplans is as follows:
o All the pictures are of XC4010XL devices, which are arrays of 20 by
20 CLBs. The CLBs are represented by small squares. If a square is empty,
the CLB is not used.
o Within each CLB, colored squares on the left are the F and G function
generators, colored squares on the right are the flip-flops, and a colored
rectangle in the middle represents the H function generator.
o If a square is colored blue, then it is being used.
o If a square is colored yellow, then it is a function generator, and the
carry logic is active.
o If a square is colored magenta, then it is a function generator, and it
is being used for single ported RAM.
o If a square is colored red, then it is a function generator, and it is
being used for dual ported RAM.
o If a square is colored green, then it is a function generator, and it is
being used for ROM.
o If an I/O cell is colored red, then it is being used for a global clock
buffer.
o An "X" over an I/O cell indicates an I/O cell that is not bonded to a
package pin.
o An inward pointing arrow on an I/O cell indicates usage as an input.
o An outward pointing arrow on an I/O cell indicates usage as an
output.
o If an I/O or CLB cell has a gray background, then placement control
was used on that location.
XC4010XL-S1-F
The 4 counters are binary ripple counters (CB16CE),
from the Xilinx unified library XC4000E, the
multiplexer and output register are also taken from this
library. There is no Floorplanning in this style, and the
choice of a ripple counter, while available in the library,
is a poor choice.
This is also what you will get from synthesis if it does
not know about the carry logic in the XC4000 families.
Figure: http://www.fliptronics.com/images/xc4010xl-s1-F.gif
XC4010XL-S2-F
The 4 counters are binary counters that use the built-in
carry logic (CC16CE), from the Xilinx unified library
XC4000E, the multiplexer and output register are also
taken from this library. While there is no explicit
Floorplanning in this style, the counters include internal
Floorplanning, because the carry logic imposes a column
structure on the counters.
This is also what you will get from synthesis if it knows
about carry logic, but you do not do any Floorplanning.
While the performance for this style is not too bad for
this example, when a chip is used at 50% or more, the
lack of Floorplanning can seriously degrade
performance, and routing times may become very long.
XC4010XL-S3-F
This style adds four RLOC_ORIGIN Floorplanning
constraints to the style 2 design, placing the four
counters in adjacent columns, and aligning the MSBs of
the counters (and all other bits).
The Floorplanning is shown by the gray background to
the four columns that contain the counters. Since the
multiplexer is not Floorplanned, it appears as the CLBs that
contain logic but have a white background.
Figure: http://www.fliptronics.com/images/xc4010xl-s2-F.gif
Figure: http://www.fliptronics.com/images/xc4010xl-s3-F.gif
XC4010XL-S4-F
This style replaces the un-Floorplanned output register of
the previous styles with a Floorplanned register, and
places it in the column to the right of the fourth counter.
It also is aligned with regard to bit positions.
Note that the multiplexer logic is still scattered all
around the Floorplanned core. Although there is room in
the Floorplanned output register CLBs to merge some of
the multiplexer, the mapper in the current version of the
M1 software will not do this.
XC4010XL-S5-F
This style is like style 4, except the output register is
placed in the column to the right of the column used for
the register in style 4.
This opened up a column for the placer to move the
multiplexer into. It looks like half of the 16 bits of
multiplexer logic have been moved into this area, and
half are still floating about. Merging the multiplexer into
the Floorplanned output register CLBs has not happened.
Figure: http://www.fliptronics.com/images/xc4010xl-s4-F.gif
Figure: http://www.fliptronics.com/images/xc4010xl-s5-F.gif
XC4010XL-S6-F
This style uses a Floorplanned multiplexer and output
register built by the FlibGen module generator, and places it
in the two columns to the right of the fourth counter. The
odd bit multiplexers and output register flip-flops are in
one of these two columns, and the even bits are in the
other column.
XC4010XL-S7-F
This style uses the same components as style 6, but the
Floorplan has been changed. The first two columns
contain the first two counters, the next two columns are
the multiplexer and output register, and the last two
columns contain the third and fourth counter.
If you have read this page and found it useful, please send an email to [email protected]
FlibGen: http://www.fliptronics.com/flibgen.html
Figure: http://www.fliptronics.com/images/xc4010xl-s6-F.gif
Figure: http://www.fliptronics.com/images/xc4010xl-s7-F.gif