Chapter 9 High Speed Clock Management
Agenda
• Inside the DCM
• Inside the DFS
• Jitter
• Inside the V5 PLL
The Digital Clock Manager
Delay Lock Loop Block Diagram
Key ideas:
• All clocks look the same (more or less)
• Future clocks resemble current clocks (i.e., they are interchangeable)
Simplified DLL Diagram
Simplified Clock Tree Driving Four CLBs
[Figure labels: CLKIN, F, D, d, FBCLK (B)]
F: exact user clock frequency: unknown
d: chip clock delay: known
D: amount of delay that needs to be inserted: unknown
Aligning FBCLK with CLKIN
[Figure labels: CLKIN, DLL, GCLK, FBCLK (B)]
Compensating for Setup Delays with DLL
Without DLL:
Tco = Td_C + Td_ff + Td_out
Tsu = (Td_D - Td_C) + Tsu_ff
Th = (Td_C - Td_D) + Th_ff
With DLL:
Tco = Td_ff + Td_out
Tsu = (Td_D - 0) + Tsu_ff
Th = (0 - Td_D) + Th_ff
[Figure labels: D, Td_D, DFF, Gclk, Td_C, Td_ff, Td_out]
Basic idea: We can set Td_C = 0
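The effect of setting Td_C = 0 can be checked numerically. The sketch below evaluates the slide's equations with and without the DLL; the function name `pin_timing` is hypothetical and the delay values are made-up illustration numbers, not device data.

```python
def pin_timing(td_c, td_d, td_ff, td_out, tsu_ff, th_ff, use_dll):
    """Evaluate the slide's pin-level timing equations (all values in ns)."""
    # With the DLL in use, the clock-tree delay Td_C is treated as 0.
    clk = 0.0 if use_dll else td_c
    return {
        "Tco": clk + td_ff + td_out,
        "Tsu": (td_d - clk) + tsu_ff,
        "Th": (clk - td_d) + th_ff,
    }

# Illustrative numbers only (assumed, not taken from a data sheet).
delays = dict(td_c=2.0, td_d=1.0, td_ff=0.5, td_out=3.0, tsu_ff=0.25, th_ff=0.25)
without_dll = pin_timing(use_dll=False, **delays)
with_dll = pin_timing(use_dll=True, **delays)
print("without DLL:", without_dll)
print("with DLL:   ", with_dll)
```

With these numbers, clock-to-out drops by exactly Td_C, while setup grows and hold shrinks by the same amount.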
Feedback from End Cell to Center Cell
[Figure labels: GCLK, DLL, FBCLK (B), center cell A, end cell B]
Tco(A) = Tco(B) - Td(A->B)
Tsu(A) = Tsu(B) + Td(A->B)
Th(A) = Th(B) - Td(A->B) (more neg)
Tco(B) = Td_ff + Td_out
Tsu(B) = (Td_D - 0) + Tsu_ff
Th(B) = (0 - Td_D) + Th_ff
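As a rough check of the relations between the center cell A and the end cell B, the sketch below applies them to example numbers. The helper name `center_cell_timing` and all values are assumptions for illustration; `td_ab` stands for the clock delay between A and B.

```python
def center_cell_timing(tco_b, tsu_b, th_b, td_ab):
    """Timing at center cell A, given timing at end cell B (all in ns).

    Feedback is taken from B, so the clock reaches A earlier by td_ab.
    """
    return {
        "Tco": tco_b - td_ab,  # clock-to-out is shorter at A
        "Tsu": tsu_b + td_ab,  # setup requirement is larger at A
        "Th": th_b - td_ab,    # hold becomes more negative at A
    }

# Illustrative numbers only (assumed).
cell_a = center_cell_timing(tco_b=3.5, tsu_b=1.25, th_b=-0.75, td_ab=0.5)
print(cell_a)
```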
Additional Skew Reduction
[Figure labels: GCLK, DLL]
Place DLL output centrally, minimize extra delays
Completing the Picture
[Figure labels: GCLK, DLL]
Clock tree geometry is crucial.
Delivering the clock uniformly to a large area is the main idea.
Can deliver to hundreds of LUT flip-flops with virtually identical “skew”.
DLL “Zoom In”
[Figure: DLL block diagram. Block labels: DLL power supply pump / regulator; huge coupling caps for DLL power supply; no-glitch counters; Phase Detector 1; Phase Detector 2; Lockgen; clock tree delay; clkgen; Z1; Z2; very complicated state machine. Signal labels: CLKIN, FBCLK, RST, LOCK, ck0, ck90, ck180, ck270, clk0, clk90, clk180, clk270, clk360, clk2x, clkdiv, clkz, p1, p2.]
Z2 Delay Exposed
Path 0: 1 unit-delay
Path 1: 3/4 unit-delay
Path 2: 1/2 unit-delay
Path 3: 1/4 unit-delay
[Figure labels: 1 unit-delay stages; path select from very complicated state machine; taps CLK0, CLK90, CLK180, CLK270, CLK360; clkz]
Main Delay & Trim Detail
Path 0: 1 unit-delay
Path 1: 3/4 unit-delay
Path 2: 1/2 unit-delay
Path 3: 1/4 unit-delay
[Figure labels: INCLK; chain of 1 unit-delay stages; trim unit; path select from very complicated state machine; clkz]
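One way to picture the main delay plus trim arrangement is as a coarse count of whole unit-delays followed by one of the four trim paths. The sketch below assumes exactly that model and searches for the combination closest to a target delay. The function name `choose_delay` is hypothetical, and the slides do not show how the real state machine performs its search.

```python
from fractions import Fraction

# Trim paths as listed on the slide, in units of one unit-delay.
TRIM_PATHS = {0: Fraction(1), 1: Fraction(3, 4), 2: Fraction(1, 2), 3: Fraction(1, 4)}

def choose_delay(target_units):
    """Return (main_taps, trim_path, achieved_delay) closest to the target.

    Assumed model: total delay = main_taps whole unit-delays + one trim path.
    """
    target = Fraction(target_units)
    best = None
    for main_taps in range(0, int(target) + 1):
        for path, trim in TRIM_PATHS.items():
            achieved = main_taps + trim
            err = abs(achieved - target)
            if best is None or err < best[0]:
                best = (err, main_taps, path, achieved)
    return best[1], best[2], best[3]

# A target of 2.75 unit-delays: two main taps plus trim path 1 (3/4 unit-delay).
print(choose_delay(Fraction(11, 4)))
```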
Configuration & Lock Process
General idea:
Don’t assert “Locked” to the outside world until Done is asserted from the configuration state machine.
Looks like instantaneous locking as the part powers up.
Clock Doubling
[Figure: waveforms for CLKIN, CLK90, CLK180, CLK270, CLK360 and CLK2X, with edge events S1, R1, S2, R2]
• Capture a period
• Identify end points
• Identify middle (50%)
• Identify 25%, 75%
• Reassemble pieces
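The reassembly step can be sketched as a set/reset sequence over one CLKIN period. The code below assumes S1, R1, S2 and R2 correspond to the 0, 90, 180 and 270 degree points; that mapping is a reading of the slide, not something it states. The function name `clk2x_waveform` is hypothetical.

```python
def clk2x_waveform(samples_per_period=8):
    """Build one CLKIN period of CLK2X from quarter-period edge events."""
    quarter = samples_per_period // 4
    # Assumed mapping: S1 at 0 deg, R1 at 90, S2 at 180, R2 at 270.
    events = {0: 1, quarter: 0, 2 * quarter: 1, 3 * quarter: 0}
    level, wave = 0, []
    for t in range(samples_per_period):
        level = events.get(t, level)  # set or reset on an event, else hold
        wave.append(level)
    return wave

print(clk2x_waveform())
```

The resulting waveform has two high pulses per CLKIN period, each with 50% duty cycle, because every edge is placed at a quarter-period boundary.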
Board Deskewing
We seek the ideal situation with A, B and C all tracking.
However, they sit in different environments.
It takes two DCMs to lock B to A and C to A, but this can make the clocks outside the chip track the clock inside it.
More on Board Deskewing
[Figure label: Forward path delay]
Note how Locked is used to enable/disable the right-hand device.
Synchronization trick delivers four clocks’ worth of reset to the DCM and stops.
System Synchronous Applications
The classic, single-clocked synchronous system. Looks good, but doesn’t account for the clock arriving skewed all over the place at the right-hand block(s).
Source Synchronous Applications
More common these days. The clock is recreated by the middle box and forwarded to the boxes on the right. This is what DDR SDRAM does: whichever device is blasting data also provides the clock for that data to the receivers. A device may receive in one case but transmit in another, so often both boxes can transmit data and clocks. It depends on which one is sourcing the data.
Compare System and Source Synchronous Timing
Most source-synchronous devices have DLL units inside. Data centering can select the appropriate setup time by phase shifting the clock.
Duty Cycle Correction
DLLs have the ability to correct duty cycle to 50%. This means data clocked with the rising edge has the same setup window as data clocked with the falling edge.
Digital Frequency Synthesizer Capabilities
It’s nice to distribute slow external clocks and be able to increase speed within the device. Frequency synthesis allows this.
Basic Internal DFS Structure
[Figure labels: CLKIN, M, D, Output Control, Up/Down, Variable Ring Oscillator, CLKFX]
Output Control manages the Up/Down signal.
Task: adjust the variable ring oscillator so that:
CLKFX = CLKIN × M / D
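The M/D relationship is easy to explore with a few lines of code. The sketch below computes CLKFX for a given M and D and does a brute-force search for a pair that hits a target frequency. The function names are hypothetical, and the default search ranges (M from 2 to 32, D from 1 to 32) are an assumption based on typical Spartan-3 DCM limits; check the data sheet for the actual device.

```python
from fractions import Fraction

def clkfx_mhz(clkin_mhz, m, d):
    """CLKFX = CLKIN * M / D, kept exact with Fraction."""
    return Fraction(clkin_mhz) * m / d

def find_m_d(clkin_mhz, target_mhz, m_range=range(2, 33), d_range=range(1, 33)):
    """Return the first (M, D) pair with the smallest frequency error.

    The default ranges are assumed limits, not taken from the slides.
    """
    best = None
    for m in m_range:
        for d in d_range:
            err = abs(clkfx_mhz(clkin_mhz, m, d) - target_mhz)
            if best is None or err < best[0]:
                best = (err, m, d)
    return best[1], best[2]

print(clkfx_mhz(50, 7, 2))  # 50 MHz * 7 / 2
print(find_m_d(50, 175))
```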
Some Common DFS Applications
Fixed Value Phase Shift
See XAPP 462: ISE software lets you place a phase shift constant into the programmable phase shifter to position clock edges.
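To get a feel for what a phase shift constant means in time, the sketch below converts it to nanoseconds. It assumes the convention that the constant is an integer from -255 to 255 in units of 1/256 of the CLKIN period, as in the Spartan-3 DCM documentation. The function name is hypothetical, and real devices may restrict the usable range further at some frequencies.

```python
def phase_shift_ns(phase_shift, clkin_mhz):
    """Convert a fixed phase shift constant to a time offset in ns.

    Assumes units of 1/256 of the CLKIN period and a -255..255 range.
    """
    if not -255 <= phase_shift <= 255:
        raise ValueError("phase_shift must be within -255..255")
    period_ns = 1000.0 / clkin_mhz
    return phase_shift / 256 * period_ns

# A constant of 64 on a 100 MHz clock is a quarter period.
print(phase_shift_ns(64, 100))
```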
Dynamic Fine Phase Shift Control
Can also do it dynamically, on the fly. Status feedback indicates where you are in the phase shifting.
Dynamic Phase Shift Controls
Spartan 3 DCM Restrictions
Clock Jitter
Jitter comes from many sources:
• clock crystal drift
• noise
• VCC ripple
• SSO ground bounce
• temperature drift, gain change
Cycle to Cycle Jitter
Peak to Peak Period Jitter Distribution
Period jitter captured with digital sampling scope (typically)
Period Jitter Spec. as % Unit Interval
Another point of view: consider the amount of time allocated for a data bit on a line. Call it a Unit Interval. Identify the amount of peak-to-peak jitter as a percentage of that Unit Interval.
Jitter is a “performance thief”: you must (usually) assume it subtracts from your setup time. That leaves less time available to handle data properly.
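The Unit Interval view reduces to a one-line calculation. The sketch below expresses peak-to-peak jitter as a percentage of the UI for a given bit rate; the function name and the example numbers are assumptions chosen for illustration.

```python
def jitter_percent_ui(jitter_pp_ps, bit_rate_mbps):
    """Peak-to-peak jitter as a percentage of one Unit Interval."""
    ui_ps = 1e6 / bit_rate_mbps  # one bit time in picoseconds
    return 100.0 * jitter_pp_ps / ui_ps

# 200 ps of peak-to-peak jitter on a 500 Mb/s line (UI = 2000 ps).
print(jitter_percent_ui(200, 500))
```

Here a tenth of every bit time is lost to jitter before any setup or hold requirement is counted.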
Peak to Peak Jitter Calculation
Adding more devices is done by squaring each device’s jitter and adding the squares under the radical. The deviation gets divided by “n” as the number of devices increases.
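The "square and add under the radical" rule is a root-sum-square combination. The sketch below applies it to a list of per-device jitter values; it assumes the jitter sources are independent of one another, and the function name and numbers are illustrative only.

```python
import math

def total_jitter_ps(device_jitters_ps):
    """Root-sum-square of per-device jitter (assumes independent sources)."""
    return math.sqrt(sum(j * j for j in device_jitters_ps))

# Two devices contributing 300 ps and 400 ps combine to 500 ps, not 700 ps.
print(total_jitter_ps([300, 400]))
```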
Jitter for DCM Cascades
Virtex 5 Clock Management Tile
The V5 clock tile doesn’t have PMCDs; it has PLLs instead.
V5 PLL – High Level
PFD = Phase & Frequency Detector
CP = Charge Pump
LF = Loop Filter (low-pass)
VCO = Voltage Controlled Oscillator
D = Divisor counter
M = Multiplier counter
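The roles of the M and D counters can be sketched with the usual PLL relationship, in which the VCO runs at CLKIN × M / D. The output divider `o` below is an assumption that does not appear on the slide, as are the function name and example values; VCO range limits are not checked.

```python
from fractions import Fraction

def pll_frequencies(fin_mhz, m, d, o):
    """Return (f_vco, f_out) in MHz for a simple PLL model.

    f_vco = fin * M / D; f_out = f_vco / O, where O is an assumed
    output divider that is not shown on the slide.
    """
    f_vco = Fraction(fin_mhz) * m / d
    return f_vco, f_vco / o

# 100 MHz in, M = 8, D = 1 puts the VCO at 800 MHz; O = 4 gives 200 MHz out.
print(pll_frequencies(100, 8, 1, 4))
```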
V5 PLL – More Detail
PLL/DCM Cascades (preferred)
Conclusions
• Clocking resources simplify design
• DCMs cover standard user domains working with the global clock networks
• DFS adds in extra clock multiplication
• Phase shifting allows “tweaking” clocks to better center to data
• PMCDs are quick/cheap ways to make more clocks and track together
• Jitter can be identified, managed and predicted
• PLLs can extend the frequency range and reduce overall jitter
• Virtex 6 and Spartan 6 resemble Virtex 5 DCMs