Recommended reading
• 7 Series FPGAs Configurable Logic Block: User Guide
  § Overview
  § Functional Details
Modern FPGA
[Figure: a modern FPGA built from three kinds of resources: logic resources, multipliers/DSP units, and RAM blocks.]
Graphics based on The Design Warrior’s Guide to FPGAs: Devices, Tools, and Flows. ISBN 0750676043
Copyright © 2004 Mentor Graphics Corp. (www.mentor.com)
4-input LUT (Look-Up Table) (used in earlier families of FPGAs)
• Look-Up tables are primary elements for logic implementation
• Each LUT can implement any function of 4 inputs
[Figure: a 4-input LUT with inputs x1, x2, x3, x4 and output y, shown next to the gate-level circuits it replaces. Two example LUT contents are given, listed as the value of y for x1x2x3x4 = 0000 through 1111:
  y = 0100010101001100 (a function of all four inputs)
  y = 1111111111110000 (y = 0 only when x1 = x2 = 1, i.e. a function of x1 and x2 alone)]
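As a rough illustration of the idea that a LUT is just a 16-entry table addressed by its inputs, the first truth table above (y = 0100010101001100) can be modeled behaviorally in VHDL. This is only a sketch: the entity name lut4_model and the constant name CONTENTS are made up for this example, and a synthesis tool decides on its own how to fill real LUTs.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical behavioral model of one 4-input LUT (names are illustrative).
entity lut4_model is
  port (
    x1, x2, x3, x4 : in  std_logic;
    y              : out std_logic
  );
end entity lut4_model;

architecture behavioral of lut4_model is
  -- Truth-table column y for x1x2x3x4 = 0000 (index 0) through 1111 (index 15).
  constant CONTENTS : std_logic_vector(0 to 15) := "0100010101001100";
  signal addr : unsigned(3 downto 0);
begin
  -- x1 is the most significant address bit, matching the row order of the table.
  addr <= x1 & x2 & x3 & x4;
  -- Reading the table is the whole "logic function".
  y <= CONTENTS(to_integer(addr));
end architecture behavioral;
```

Changing only the constant changes the implemented function, which is why any function of 4 inputs costs the same single LUT.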
Reset and Set Configurations
• No set or reset
• Synchronous set
• Synchronous reset
• Asynchronous set (preset)
• Asynchronous reset (clear)
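The difference between the synchronous and asynchronous options is easiest to see in how the flip-flop process is written. The sketch below shows one flip-flop of each kind side by side; the entity and signal names (dff_resets, sync_rst, async_rst, q_sync, q_async) are chosen for this example only.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Illustrative only: two D flip-flops that differ in their reset style.
entity dff_resets is
  port (
    clk, d    : in  std_logic;
    sync_rst  : in  std_logic;  -- sampled on the clock edge
    async_rst : in  std_logic;  -- acts immediately, regardless of the clock
    q_sync    : out std_logic;
    q_async   : out std_logic
  );
end entity dff_resets;

architecture rtl of dff_resets is
begin
  -- Synchronous reset: reset is checked inside the clock-edge branch,
  -- so only clk appears in the sensitivity list.
  sync_reset_ff : process (clk)
  begin
    if rising_edge(clk) then
      if sync_rst = '1' then
        q_sync <= '0';
      else
        q_sync <= d;
      end if;
    end if;
  end process sync_reset_ff;

  -- Asynchronous reset (clear): reset is checked before the clock edge
  -- and must appear in the sensitivity list.
  async_reset_ff : process (clk, async_rst)
  begin
    if async_rst = '1' then
      q_async <= '0';
    elsif rising_edge(clk) then
      q_async <= d;
    end if;
  end process async_reset_ff;
end architecture rtl;
```

Assigning '1' instead of '0' in the reset branch gives the corresponding set (preset) variants; omitting the branch entirely gives the no set/reset case.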
Fast Carry Logic
• Each CLB contains separate logic and routing for the fast generation of sum & carry signals
  • Increases efficiency and performance of adders, subtractors, accumulators, comparators, and counters
• Carry logic is independent of normal logic and routing resources
Carry & Control Logic in Xilinx FPGAs
[Figure: carry chain running from the LSB to the MSB through dedicated carry logic and routing.]

x  y | COUT
0  0 | y
0  1 | CIN
1  0 | CIN
1  1 | y

Propagate = x ⊕ y
Generate = y
Sum = Propagate ⊕ CIN = x ⊕ y ⊕ CIN
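The propagate/generate equations can be written down directly as one bit slice of an adder. The following is a sketch of the equations only, not of the actual Xilinx carry primitive; the entity name carry_slice and the port names are assumptions made for this example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- One adder bit expressed with the propagate/generate equations (illustrative names).
entity carry_slice is
  port (
    x, y, cin : in  std_logic;
    sum, cout : out std_logic
  );
end entity carry_slice;

architecture dataflow of carry_slice is
  signal propagate : std_logic;
begin
  -- Propagate = x xor y
  propagate <= x xor y;
  -- Sum = Propagate xor CIN
  sum <= propagate xor cin;
  -- Carry multiplexer: pass CIN when propagating, otherwise output Generate = y.
  cout <= cin when propagate = '1' else y;
end architecture dataflow;
```

When x = y the carry out equals y (0 for 00, 1 for 11), and when x and y differ the incoming carry is passed on, which matches the COUT column of the table.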
Accessing Carry Logic
• All major synthesis tools can infer carry logic for arithmetic functions
  • Addition (SUM <= A + B)
  • Subtraction (DIFF <= A - B)
  • Comparators (if A < B then…)
  • Counters (count <= count + 1)
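The four constructs listed above can be collected in one small design. This is a minimal sketch assuming ieee.numeric_std and an 8-bit default width; the entity name carry_inference, the generic WIDTH, and the port names are invented for illustration, and whether a particular tool actually places each operator on the carry chain depends on the tool and its settings.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Arithmetic written with plain operators so that synthesis can infer carry logic.
entity carry_inference is
  generic (WIDTH : positive := 8);  -- assumed width, for illustration
  port (
    clk, reset : in  std_logic;
    a, b       : in  unsigned(WIDTH - 1 downto 0);
    sum        : out unsigned(WIDTH - 1 downto 0);
    diff       : out unsigned(WIDTH - 1 downto 0);
    a_lt_b     : out std_logic;
    count      : out unsigned(WIDTH - 1 downto 0)
  );
end entity carry_inference;

architecture rtl of carry_inference is
  signal count_reg : unsigned(WIDTH - 1 downto 0) := (others => '0');
begin
  sum    <= a + b;                      -- addition
  diff   <= a - b;                      -- subtraction (wraps modulo 2**WIDTH)
  a_lt_b <= '1' when a < b else '0';    -- comparator

  -- Counter with synchronous reset.
  counter : process (clk)
  begin
    if rising_edge(clk) then
      if reset = '1' then
        count_reg <= (others => '0');
      else
        count_reg <= count_reg + 1;
      end if;
    end if;
  end process counter;

  count <= count_reg;
end architecture rtl;
```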
Xilinx Multipurpose LUT (MLUT)
[Figure: a single LUT shown in its alternative configurations.]
• 4-input LUT, 16 x 1 RAM, or 16-bit SR
• 64 x 1 ROM (logic), 64 x 1 RAM, or 32-bit SR
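The shift-register (SR) mode is typically reached by describing a shift register without a reset, which leaves the tool free to pack it into a LUT instead of separate flip-flops. The sketch below assumes a 16-bit depth; the entity name shift_reg16 and the port names are hypothetical, and whether a given synthesis tool maps it to a LUT-based shift register depends on the tool and device.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- 16-bit serial-in, serial-out shift register with clock enable and no reset.
entity shift_reg16 is
  port (
    clk, ce, d : in  std_logic;
    q          : out std_logic
  );
end entity shift_reg16;

architecture rtl of shift_reg16 is
  signal sr : std_logic_vector(15 downto 0) := (others => '0');
begin
  shift : process (clk)
  begin
    if rising_edge(clk) then
      if ce = '1' then
        -- Shift left by one position and bring the new bit in at the bottom.
        sr <= sr(14 downto 0) & d;
      end if;
    end if;
  end process shift;

  -- The bit shifted in 16 enabled clock cycles ago.
  q <= sr(15);
end architecture rtl;
```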
Question 1
How many Xilinx LUTs are necessary to implement the following functions:
A. f1 = x1x2 + x3x4 + x5x6
f2 = x1x2x3x4x5
B. f1 = x1x2 + x2x3 + x3x4 + x4x5
f2 = x1+x2+x3+x4+x5
Question 2
How many Xilinx LUTs are necessary to implement:
A. 4-to-1 MUX
B. 2-to-4 decoder with enable
C. 3-to-8 decoder with enable
Question 3
How many Xilinx LUTs are necessary to implement a 4-bit priority encoder?
[Figure: priority encoder block with inputs w0, w1, w2, w3 and outputs y0, y1, z.]

w3 w2 w1 w0 | y1 y0 | z
 0  0  0  0 |  d  d | 0
 0  0  0  1 |  0  0 | 1
 0  0  1  - |  0  1 | 1
 0  1  -  - |  1  0 | 1
 1  -  -  - |  1  1 | 1
George Mason University
Question 5
Determine the amount of Xilinx FPGA resources needed
to implement a given circuit
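Circuit 1: F-function
[Figure: block diagram of the F-function. Recoverable elements: a rotate-left-by-3 block (<<< 3) with inputs x3..x0 and outputs y3..y0; a 2-to-4 Decoder with inputs w1, w0, En and outputs y3..y0; a Full Adder with inputs x, y, cin and outputs s, cout; multiplexers, one with data inputs 0 to 7 and a 3-bit select; signals a, b, c, d, e, f, g, h; output y. The connections between the blocks are not recoverable from the extracted text.]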
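Circuit 2: F-function
[Figure: block diagram of the F-function. Recoverable elements: a shift-right-by-2 block (>> 2) with inputs x3..x0 and outputs y3..y0; a Priority Encoder with inputs w3..w0 and outputs y1, y0, z; a Half Adder with inputs x, y and outputs s, cout; multiplexers, one with data inputs 0 to 7 and a 3-bit select; signals a, b, c, d, e, f, g, h, i; output y. The connections between the blocks are not recoverable from the extracted text.]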
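Circuit 4: Top level
[Figure: top-level block diagram. Recoverable elements: input x; four registers R0, R1, R2, R3, each with enable (en) and clk inputs; an adder (+) and a subtractor (−); a comparator A > B; buses labeled 32 and 16 bits wide; outputs y(0) and y(1). The connections between the blocks are not recoverable from the extracted text.]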
Basic I/O Block Structure
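[Figure: IOB with three flip-flops, each with D, EC, Q, and SR pins and its own FF Enable: one in the Three-State Control path, one in the Output Path, and one in the Input Path. External signals: Three-State, Output, Clock, Set/Reset, Direct Input, Registered Input.]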
IOB Functionality
• IOB provides interface between the package pins and CLBs
• Each IOB can work as uni- or bi-directional I/O
• Outputs can be forced into High Impedance
• Inputs and outputs can be registered
  • advised for high-performance I/O
• Inputs can be delayed
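From the VHDL side, a bi-directional pin with registered input, output, and three-state control can be described as below. This is only a sketch: the entity name bidir_pin and all port names are invented here, and placing the registers inside the IOB (rather than in the CLBs) is generally controlled by tool settings or constraints that are not shown.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Bi-directional pin with registered data and output enable (illustrative names).
entity bidir_pin is
  port (
    clk  : in    std_logic;
    oe   : in    std_logic;  -- '1' drives the pin, '0' releases it
    dout : in    std_logic;  -- value to drive onto the pin
    din  : out   std_logic;  -- registered value read from the pin
    pad  : inout std_logic   -- the package pin
  );
end entity bidir_pin;

architecture rtl of bidir_pin is
  signal dout_reg, oe_reg, din_reg : std_logic;
begin
  -- One register each for the output path, the three-state control, and the input path.
  io_regs : process (clk)
  begin
    if rising_edge(clk) then
      dout_reg <= dout;
      oe_reg   <= oe;
      din_reg  <= pad;
    end if;
  end process io_regs;

  -- Output is forced into high impedance when not enabled.
  pad <= dout_reg when oe_reg = '1' else 'Z';
  din <= din_reg;
end architecture rtl;
```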
FPGA Design process (1)

Specification / Pseudocode:
Design and implement a simple unit that speeds up encryption with an RC5-similar cipher with a fixed key set on the 8031 microcontroller. Unlike in experiment 5, this time your unit has to be able to perform the encryption algorithm by itself, executing 32 rounds…..

On-paper hardware design (Block diagram & ASM chart)

VHDL description (Your Source Files):
library IEEE;
use ieee.std_logic_1164.all;
use ieee.std_logic_unsigned.all;

entity RC5_core is
  port(
    clock, reset, encr_decr : in  std_logic;
    data_input  : in  std_logic_vector(31 downto 0);
    data_output : out std_logic_vector(31 downto 0);
    out_full    : in  std_logic;
    key_input   : in  std_logic_vector(31 downto 0);
    key_read    : out std_logic  -- no semicolon after the last port
  );
end RC5_core;

Functional simulation
Synthesis
Post-synthesis simulation
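Logic Synthesis

VHDL description:
architecture MLU_DATAFLOW of MLU is
  signal A1 : STD_LOGIC;
  signal B1 : STD_LOGIC;
  signal Y1 : STD_LOGIC;
  signal MUX_0, MUX_1, MUX_2, MUX_3 : STD_LOGIC;
  -- Named selector: a bare concatenation is not a legal selector before VHDL-2008
  signal SEL : STD_LOGIC_VECTOR(1 downto 0);
begin
  -- Optional inversion of the inputs and of the output
  A1 <= A when (NEG_A = '0') else
        not A;
  B1 <= B when (NEG_B = '0') else
        not B;
  Y  <= Y1 when (NEG_Y = '0') else
        not Y1;

  -- The four candidate logic operations
  MUX_0 <= A1 and B1;
  MUX_1 <= A1 or B1;
  MUX_2 <= A1 xor B1;
  MUX_3 <= A1 xnor B1;

  -- 4-to-1 multiplexer selecting the operation
  SEL <= L1 & L0;
  with SEL select
    Y1 <= MUX_0 when "00",
          MUX_1 when "01",
          MUX_2 when "10",
          MUX_3 when others;
end MLU_DATAFLOW;

Circuit netlist:
[Figure: circuit netlist produced from the VHDL description.]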
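The MLU architecture refers to ports whose declaration is not shown. A matching entity could look like the sketch below; the port list is an assumption reconstructed from the names used in the architecture (A, B, NEG_A, NEG_B, NEG_Y, L1, L0, Y), not something taken from the original source files.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Assumed entity for the MLU architecture; port names are inferred from
-- the signals the architecture reads and drives.
entity MLU is
  port (
    NEG_A : in  STD_LOGIC;  -- invert input A when '1'
    NEG_B : in  STD_LOGIC;  -- invert input B when '1'
    NEG_Y : in  STD_LOGIC;  -- invert output Y when '1'
    A     : in  STD_LOGIC;
    B     : in  STD_LOGIC;
    L1    : in  STD_LOGIC;  -- operation select, high bit
    L0    : in  STD_LOGIC;  -- operation select, low bit
    Y     : out STD_LOGIC
  );
end MLU;
```

With this entity compiled before the architecture, the pair forms a complete design unit that a synthesis tool can turn into a netlist.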
Implementation
• After synthesis the entire implementation process is performed by FPGA vendor tools
Configuration
• Once a design is implemented, you must create a file that the FPGA can understand
  • This file is called a bitstream: a BIT file (.bit extension)
• The BIT file can be downloaded directly to the FPGA, or can be converted into a PROM file which stores the programming information
Two main stages of the FPGA Design Flow

Synthesis
• RTL Synthesis (technology independent)
  - Code analysis
  - Derivation of main logic constructions
  - Technology independent optimization
  - Creation of “RTL View”
• Map (technology dependent)
  - Mapping of extracted logic structures to device primitives
  - Technology dependent optimization
  - Application of “synthesis constraints”
  - Netlist generation
  - Creation of “Technology View”

Implementation
• Place & Route
  - Placement of generated netlist onto the device
  - Choosing best interconnect structure for the placed design
  - Application of “physical constraints”
• Configure
  - Bitstream generation
  - Burning device