Post on 31-Dec-2015
© 2006 Xilinx, Inc. All Rights Reserved
System On Chip
DAPNIA Day, November 10th
Presenter: Olivier REGNAULT / SILICA FAE Xilinx
CEA Saclay, Dapnia daySystem On Chip solution - 1 - 2
SOC Introduction
• Challenge:
  – Create high-frequency designs
  – Use very high-speed communication links
  – Keep flexibility for modification
• Xilinx response:
  – FPGA provides a hardware structure that enables integrated high-speed design (up to 550 MHz)
  – FPGA offers an integrated differential solution (LVDS) for DDR high-speed communication, plus hard IP transceivers (up to 3.2 Gbps)
  – FPGA is by default the most flexible hardware solution, thanks to hardware reconfiguration (even partial reconfiguration)
  – FPGA can implement a processor core as:
    • Soft IP core (MicroBlaze)
    • Hard IP core (PowerPC)
Create high-frequency designs
• This is the first thing we expect from an FPGA.
• What we know:
  – An FPGA can reproduce chipsets such as DSPs
  – An FPGA enables parallel structures
  – An FPGA integrates features that improve performance and reduce logic requirements
MAC Engine FIR Filter
[Figure: single-MAC FIR filter. A 31 × 12 sample memory and a 16 × 11 coefficient memory, each with its own address input, feed one multiplier followed by an accumulator with a load control.]
Max Sample Rate = Clock Rate / Number of Taps = 550 MHz / 31 = 17.7 MHz
• Several clock cycles between samples
• The clock rate must be higher than the sample rate
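The sample-rate limit of the single-MAC structure follows directly from the figures on this slide. The sketch below is a minimal Python illustration, assuming one multiply-accumulate per clock cycle; the function name is illustrative and not part of any Xilinx tool.

```python
def mac_fir_max_sample_rate(clock_hz: float, taps: int) -> float:
    """Maximum sample rate of a FIR filter that shares a single MAC unit.

    Assumption: one multiply-accumulate per clock cycle, so each output
    sample costs `taps` clock cycles.
    """
    if taps < 1:
        raise ValueError("taps must be at least 1")
    return clock_hz / taps


if __name__ == "__main__":
    # Slide figures: 550 MHz clock, 31 taps.
    rate = mac_fir_max_sample_rate(550e6, 31)
    print(f"{rate / 1e6:.1f} MHz")
```

Doubling the number of taps halves the achievable sample rate, which is the trade-off the parallel structures on the following slides remove.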
Full Parallel FIR Filter
• For the very highest sample rates, the full parallel structure performs all calculations in parallel, and registers provide the ultimate in "memory" bandwidth
• Max Sample Rate = Clock Rate = 400 MHz (Virtex-5™ FPGA)
• Sample rate is essentially independent of the number of taps. Size is set by the number of taps
[Figure: full parallel FIR filter. Labels: Sample in, Registers (one per tap), Adder Tree, Sample Latency, Sample out.]
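To make the point that size grows with the number of taps while the sample rate does not, the sketch below models one output of a full parallel filter: one multiplier per tap, followed by pairwise adder stages. It is a behavioral model in Python, not HDL, and the function names are illustrative. If each adder level is registered (an assumption; the slide only labels the tree and the sample latency), latency grows with ceil(log2(taps)) while throughput stays at one sample per clock.

```python
def adder_tree_sum(values):
    """Sum values pairwise, level by level; return (total, number_of_levels)."""
    level = list(values)
    if not level:
        raise ValueError("need at least one value")
    levels = 0
    while len(level) > 1:
        # Add neighbouring pairs; each pair corresponds to one hardware adder.
        nxt = [level[i] + level[i + 1] for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])  # odd element passes through to the next level
        level = nxt
        levels += 1
    return level[0], levels


def parallel_fir_output(window, coeffs):
    """One output sample: one multiplier per tap, then an adder tree."""
    products = [s * k for s, k in zip(window, coeffs)]
    return adder_tree_sum(products)


if __name__ == "__main__":
    total, levels = parallel_fir_output(range(31), [1] * 31)
    print(total, levels)
```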
Systolic FIR Filter
• Coefficients are applied from left to right, so latency grows with the number of coefficients
• The input time-delay series is created inside the DSP48 slice, giving maximum performance irrespective of the number of coefficients, at no additional cost
• Max Sample Rate = Clock Rate = 550 MHz
• This filter structure, while referred to as a systolic FIR filter, is really a Direct Form with one extra stage of pipelining
[Figure: chain of DSP48 slices with coefficients K0, K1, …, K29, K30. Labels: DSP48 slice opmode = 0010101, DSP48 slice opmode = 0000101, 12-bit sample in, sample out.]
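The claims that this is a pipelined Direct Form and that latency grows with the number of coefficients can be checked with a small behavioral model. The Python sketch below assumes two sample registers and one partial-sum register per stage; that register arrangement is an assumption made for illustration, not something stated on the slide. Under it, the cascade produces the same values as a direct-form FIR, delayed by one clock per coefficient.

```python
def direct_fir(samples, coeffs):
    """Reference direct-form FIR: y[m] = sum_j coeffs[j] * samples[m - j]."""
    return [
        sum(k * samples[m - j] for j, k in enumerate(coeffs) if m - j >= 0)
        for m in range(len(samples))
    ]


def systolic_fir(samples, coeffs):
    """Cycle-by-cycle model of a cascaded (systolic) FIR.

    Assumed structure: the sample delay line has two registers per stage,
    and each stage registers its partial sum before passing it on.
    """
    n = len(coeffs)
    data = [0] * (2 * n)  # sample delay line, two registers per stage
    psum = [0] * n        # one partial-sum register per stage
    out = []
    for x in samples:
        # All registers clock together, so the next partial sums are
        # computed from the current register contents before shifting.
        nxt = [
            (psum[i - 1] if i > 0 else 0) + coeffs[i] * data[2 * i]
            for i in range(n)
        ]
        data = [x] + data[:-1]
        psum = nxt
        out.append(psum[-1])
    return out


if __name__ == "__main__":
    xs = [1, 2, 3, 4, 5, 6, 7, 8]
    ks = [1, 2, 3]
    print(direct_fir(xs, ks))
    print(systolic_fir(xs, ks))  # same values, delayed by len(ks) clocks
```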
Use very high-speed communication links
• Component interconnection:
  – Data width may be large and may require a huge number of IOBs
    • PCB signal integrity: Xilinx Sparse Chevron + LVDS
    • Power consumption: LVDS
• System communication:
  – Ethernet: Xilinx includes a tri-mode MAC hard IP (10/100/1000 Mbps) in Virtex-4 FX and Virtex-5 LXT
  – PCI Express: Xilinx includes a PCI Express hard IP in the newest Virtex-5 LXT family
  – PCIe x1, x2, x4 and x8…
High-Speed Serial Applications
Each application has unique requirements
• Backplane interface: toughest channel to drive
• Optics interface: SFP modules (GE, OC-48), XFP modules (10GE, OC-192)
• Board-to-board: cables, short-reach optics
• Chip-to-chip: Aurora
Advantages of Serial Connectivity
Traditional I/O schemes have limited bandwidth
• Single-ended I/O (DataIn/ClkIn, DataOut/ClkOut): noise limited above ~200 Mbps (Single Data Rate)
• Differential I/O (e.g., LVDS): clock-skew limited above ~1.0 Gbps (Double Data Rate)
• Serial I/O (transceiver to transceiver, with clock and data recovery, CDR): eliminates traditional noise and clock-skew issues; 3.125 Gbps and up!
Transceiver Block Diagram
[Figure: transceiver module block diagram, sitting between the FPGA fabric and the PC board.]
• Transmitter blocks: TXDATA (32/16/8 bits), CRC, 8B/10B encode, transmit buffer (FIFO), serializer, TX+/TX-
• Receiver blocks: RX+/RX-, RX clock recovery, deserializer, comma detect, 8B/10B decode, receive buffer (elastic buffer), channel bonding and clock correction, CRC, RXDATA (32/16/8 bits)
• Clocking: REFCLK 50 – 156.3 MHz, TX clock generator with 20X multiplier, TXUSRCLK/TXUSRCLK2, RXUSRCLK/RXUSRCLK2
• Loop-back path between transmitter and receiver
• Sections: Physical Coding Sublayer, Physical Media Attachment; label: Mindspeed IP
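The reference clock range, the 20X multiplier and the 8B/10B coding in this diagram together determine the serial line rate and the usable data rate. The Python sketch below works through that arithmetic. It assumes the serial bit rate equals the reference clock times the multiplier, and that 156.25 MHz is the exact value behind the rounded 156.3 MHz on the slide; the function names are illustrative.

```python
def serial_line_rate_bps(refclk_hz: float, multiplier: int = 20) -> float:
    """Serial bit rate, assuming line rate = reference clock x multiplier."""
    return refclk_hz * multiplier


def payload_rate_bps(line_rate_bps: float) -> float:
    """User data rate after 8B/10B coding (10 line bits carry 8 data bits)."""
    return line_rate_bps * 8 / 10


if __name__ == "__main__":
    for refclk in (50e6, 156.25e6):
        line = serial_line_rate_bps(refclk)
        data = payload_rate_bps(line)
        print(f"REFCLK {refclk / 1e6:g} MHz: "
              f"line {line / 1e9:g} Gbps, payload {data / 1e9:g} Gbps")
```

With these assumptions the upper end of the reference clock range corresponds to the 3.125 Gbps figure quoted on the previous slide, of which 2.5 Gbps carries user data.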
Benefits of PCIe Hard Block
• Saves logic resources
  – 5,000 to 10,000 LUTs
• Saves system cost
• Saves power
• Saves design time
  – Automated design flow
• Guaranteed functionality and performance
[Figure: soft core with discrete PHY (Tx/Rx) versus hard core with GTP transceivers.]
Keep Flexibility: Processor embedded in an FPGA
• A processor embedded in an FPGA consists of the following:
  – FPGA hardware design
  – Software design
    • Software routines
    • Interrupt service routines (optional)
    • Real Time Operating System (RTOS) (optional)
MicroBlaze Processor-Based Embedded Design (Soft IP)
[Figure: MicroBlaze-based system built from flexible soft IP.]
• MicroBlaze 32-bit RISC core with a BRAM local memory bus, plus I-Cache and D-Cache in BRAM (configurable sizes)
• Fast Simplex Link channels 0,1….7 to custom functions
• CacheLink to SRAM
• On-Chip Peripheral Bus (OPB) with arbiter: UART, 10/100 E-Net, memory controller to off-chip memory (FLASH/SRAM)
• Possible in Virtex-II Pro: PowerPC 405 core (dedicated hard IP) on the Processor Local Bus (PLB, instruction and data sides) with arbiter and bus bridge, serving a hi-speed peripheral, GB E-Net and, e.g., a memory controller
PowerPC Processor-Based Embedded Design (Hard IP)
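Full system customization to meet performance, functionality, and cost goals
[Figure: PowerPC-based system combining dedicated hard IP and flexible soft IP.]
• PowerPC 405 core (dedicated hard IP) with RocketIO™
• ISOCM and DSOCM (instruction-side and data-side on-chip memory) in BRAM
• Processor Local Bus (PLB, instruction and data sides) with arbiter: hi-speed peripheral, GB E-Net and, e.g., a memory controller to off-chip memory (ZBT SSRAM, DDR SDRAM, SDRAM)
• Bus bridge between the PLB and the On-Chip Peripheral Bus (OPB)
• OPB with arbiter: UART, GPIO, on-chip peripheral
• DCR bus
• IBM CoreConnect on-chip bus standard: PLB, OPB, and DCR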
APU Interface
• Virtex™-4 FX devices
• Coprocessor interface
  – Connects the PowerPC™ processor to fabric
  – Offload computations to fabric; for example, hardware FPU
• Extends native PowerPC 405 processor instruction set
• Decodes but does not execute instructions
• Tighter integration between processor and fabric
[Figure: processor block with the 405 core, I-side and D-side PLB, control logic, ISOCM and DSOCM controllers with BRAM, and the APU controller.]
Embedded Development Kit
• What is the Embedded Development Kit (EDK)?
  – The Embedded Development Kit is the Xilinx software suite for designing complete embedded programmable systems
  – The kit includes all the tools, documentation, and IP that you require for designing systems with embedded IBM PowerPC™ hard processor cores, and/or Xilinx MicroBlaze™ soft processor cores
  – It enables the integration of both hardware and software components of an embedded system
Embedded System Tools
• GNU software development tools
  – C/C++ compiler for the MicroBlaze™ and PowerPC™ processors (gcc)
  – Debugger for the MicroBlaze and PowerPC processors (gdb)
• Hardware and software development tools
  – Base System Builder Wizard
  – Hardware netlist generation tool: PlatGen
  – Software library generation tool: LibGen
  – Simulation model generation tool: SimGen
  – Create and Import Peripheral wizard
  – Xilinx Microprocessor Debugger (XMD)
  – Hardware debugging using ChipScope™ Pro Analyzer cores
  – Eclipse IDE-based Software Development Kit (SDK)
  – Application code profiling tools
  – Virtual platform generator: VPGen
  – Flash Writer utility
Detailed EDK Design Flow
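[Figure: detailed EDK design flow, with two parallel branches.]
• Standard embedded hardware flow: the MHS file (system.mhs) and the processor IP MPD files feed PlatGen, which produces EDIF IP netlists; these, together with synthesized VHDL/Verilog source code and system.ucf, go through FPGA implementation (ISE/Xflow) to create the FPGA programming file (system.bit)
• Standard embedded software flow: the MSS file (system.mss) feeds LibGen, which produces libraries; C source code is compiled into object files and linked with the libraries into an executable
• Data2MEM combines system.bit and the executable into download.bit, which is loaded into the hardware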
Virtex-5 FPGA Family: The Ultimate System Integration Platform
Built on the Success of ASMBL
[Figure: platform roadmap comparing logic, on-chip RAM, DSP capabilities, serial I/Os, parallel I/Os and PowerPC for each platform, normalized to highest quantity.]
• Logic: now
• Logic/Serial: now
• DSP/Serial: very soon
• Embedded/Serial: 2007
Virtex-5 LX Platform
                        5VLX30    5VLX50    5VLX85    5VLX110   5VLX220   5VLX330
Logic Cells             30,720    46,080    82,944    110,592   221,184   331,776
LUT6/FFs                19,200    28,800    51,840    69,120    138,240   207,360
Distributed RAM Kbits   320       480       840       1,120     2,280     3,420
Block RAM Kbits         1,152     1,728     3,456     4,608     6,912     10,368
CMTs                    2         6         6         6         6         6
DSP48E Slices           32        48        48        64        128       192
EasyPath                No        No        Yes       Yes       Yes       Yes
Total I/O Banks         13        17        17        23        23        35

Package   Size (mm)   X (X = I/O capacity)
FF324     19          220, 220, 220
FF676     27          440, 400, 440, 440, 440
FF1153    35          800, 560, 800
FF1760    42.5        1,200, 1,200
Further I/O values on the slide: 800, 560, 800
Virtex-5 LXT FPGAs Industry’s First 65nm Serial I/O Solution
* Comparisons made to 90nm Virtex-4 FPGA devices
• Built on Virtex-5 LX platform 65nm ExpressFabric technology
• FPGA industry’s first built-in PCIe & Ethernet blocks
• Compliance tested at PCISIG Plugfest and UNH IOL
• Industry’s lowest power 65nm transceivers: <100mW @ 3.2Gbps
• Support for all major protocols: PCIe, GbE, XAUI, OC-48, etc.
• Six devices ranging from 30K to 330K logic cells
• 5VLX30T, 5VLX50T and 5VLX110T available now!
Virtex-5 LXT Platform
                                5VLX30T   5VLX50T   5VLX85T   5VLX110T  5VLX220T  5VLX330T
Logic Cells                     30,720    46,080    82,944    110,592   221,184   331,776
LUT6/Flip-Flops                 19,200    28,800    51,840    69,120    138,240   207,360
Total Distributed RAM (Kbits)   320       480       840       1,120     2,280     3,420
Total Block RAM (Kbits)         1,296     2,160     3,888     5,328     7,632     11,664
Clock Management Tiles (CMT)    2         6         6         6         6         6
DSP48E Slices                   32        48        48        64        128       192
RocketIO GTP Channels           8         12        12        16        16        24
PCIe Subsystem Blocks           1         1         1         1         1         1
10/100/1000 EMACs               4         4         4         4         4         4
EasyPath                        No        No        Yes       Yes       Yes       Yes

Package, size (mm)              X,Y (X = SelectIO, Y = RocketIO channels)
FF665, 27                       360,8     360,8     -         -         -         -
FF1136, 35                      -         480,12    480,12    640,16    -         -
FF1738, 42.5                    -         -         -         680,16    680,16    960,24
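The Logic Cells and LUT6/Flip-Flops rows of this table are related by a fixed factor, which helps when comparing device sizes. The Python sketch below only checks the numbers printed on the slide; the dictionary and function names are illustrative, and the ratio is an observation from the table rather than a definition quoted from Xilinx documentation.

```python
from fractions import Fraction

# Values copied from the table above.
LUT6 = {
    "5VLX30T": 19_200, "5VLX50T": 28_800, "5VLX85T": 51_840,
    "5VLX110T": 69_120, "5VLX220T": 138_240, "5VLX330T": 207_360,
}
LOGIC_CELLS = {
    "5VLX30T": 30_720, "5VLX50T": 46_080, "5VLX85T": 82_944,
    "5VLX110T": 110_592, "5VLX220T": 221_184, "5VLX330T": 331_776,
}


def logic_cell_ratios():
    """Exact ratio of logic cells to LUT6 count for each device."""
    return {dev: Fraction(LOGIC_CELLS[dev], LUT6[dev]) for dev in LUT6}


if __name__ == "__main__":
    for dev, ratio in logic_cell_ratios().items():
        print(dev, float(ratio))
```

Every device in the table comes out at the same ratio of 8/5, so one 6-input LUT and flip-flop pair is counted as 1.6 logic cells.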
Conclusion
• Depending on your digital system, a Xilinx FPGA may be the solution for System On Chip.
  – Today Xilinx can provide in one component (Virtex-4 or Virtex-5):
    • Embedded PowerPC 405
    • Embedded Ethernet MAC 10/100/1000
    • Embedded MAC DSP
    • Embedded High Speed Transceivers
    • Embedded PCI Express (Virtex-5 only)
    • Programmable Logic Cells
    • ….