Post on 09-Feb-2020
Power Matters
FPGA Technology & FPGA Design
Nizar Abdallah, Ph.D.
Workshop on FPGA Design for Scientific Instrumentation and Computing, International Centre for Theoretical Physics, November 2013
A Short Bio…
INFOTRON
Outline
▪ Introduction
▪ What is a Field Programmable Gate Array (FPGA)?
▪ Essentials of FPGA Technology and FPGA Design
Design Methodology & Design Tool Flow
Modern FPGA Design Curriculum
▪ Essentials of FPGA Design
▪ Designing with VHDL
▪ Designing with Verilog
▪ Advanced VHDL
▪ SystemVerilog
▪ Timing Analysis & Design Constraints
▪ Low-Cost Design
▪ Design Debug
▪ Low-Power Design
▪ Designing with SmartFusion2
▪ Advanced FPGA Design
▪ Interface Design
▪ DSP Design
▪ Embedded Design
▪ Designing with HLS
▪ Designing with OpenCL
Today…
FPGAs & Processors are meeting: the era of Programmable SoC
Design Cost
More Intelligence in Every System
Trend in Data Center Infrastructure: Cloud Computing
Industry Mandates
Programmable Imperative
FPGA Technology Overview
What’s an FPGA?
Field Programmable Gate Array
The GAP
▪ PLDs
• Highly configurable
• Fast Design & Modification Time
• No Complex Functions
▪ ASICs
• Not Configurable
• Expensive in Design Time
• Support Complex Functions
▪ FPGAs sit in the gap between PLDs and ASICs
Where Do FPGAs Fit?
What’s an FPGA?
A simplistic old definition:
▪ A high-capacity programmable logic device
▪ An array of programmable basic logic cells surrounded by programmable interconnects
▪ Can be configured (programmed) by end users (field-programmable) to implement specific applications
▪ Capacity of up to several million logic gates and core clock speeds of up to 500 MHz, supporting giga-sample-per-second data throughput rates
▪ Popular applications: prototyping, on-site hardware reconfiguration, DSP, logic emulation, network components, etc.
FPGA Definition
Field Programmable Gate Array
A large number of logic gates in an IC array that can be connected (configured) electrically
The Four Components of an old FPGA
▪ The Configuration Element
▪ The Logic Module
▪ The Memory
▪ Control Circuits/Special Features
A Simplistic Old Architecture
Generic FPGA Architecture
A More Modern Architecture
FPGAs with Embedded Processors
▪ As you would expect…
▪ Processor
• With associated memory and peripherals
▪ FPGA Fabric
• Logic blocks, Memory blocks, Math blocks, etc.
▪ Transceiver and GPIO
[Block diagram: Processor, Memory, Peripherals, FPGA Fabric, XCVR, GPIO]
Processor, Memory & Peripherals
▪ Processor
• ARM: Single/Dual, FPU
• Interrupt, Debug, Cache
• Bus Interface
• Eco-system
▪ Memory
• Instruction and Data
• Usually SRAM (Flash from Microsemi)
• Cache (L2)
▪ Peripherals
• Internal: Timers, WDT, DMA, Security, etc.
• External: UART, SPI, I2C, CAN, SDIO, USB, ENET, Flash Controller, DDR Controller, ADC, DAC
FPGA Fabric
▪ Look-up Tables
▪ Interconnect
▪ Carry logic for counters
▪ Block memory
• Large, Small, Multi-port
▪ Math blocks
• DSP, Fixed/Floating point
▪ Interface to Processor Subsystem
Transceivers
▪ Also known as Serializer/Deserializer (SerDes)
• Low-level logic needed for high-speed serial I/Os
• Programmable PHY
– Advanced features
– 6-10 Gbps
– Adaptive Equalization, Pre-Distortion, etc.
– Testing: On-chip Eye
• Some also have hard MACs
– PCIe
GPIO, DDR Controllers
▪ GPIO
• Support for many, many I/O standards
• Bank basis
• Programmable features
▪ DDR Controllers
• Range of standards
– LPDDR, DDR2/3, etc.
• Advanced features
– PHY, Access Optimization, ECC, etc.
Software
▪ Define the system
▪ Program the Processor in “C” or Assembly
• Libraries
▪ Program the FPGA in HDL
• IP Blocks
▪ Simulate, Program and Debug
▪ Don’t forget to look at the software tools offered by the FPGA provider
Microsemi SmartFusion 2
▪ FPGA Logic
▪ SerDes Channels
▪ SECDED Memory Interface
▪ ARM CPU
▪ SEU-Free Flash FPGA Configuration Memory
▪ Encryption, Error Detection and Low Power Control
▪ SEU Protected SRAM Blocks
▪ CPU Peripherals
▪ NVM and SRAM Memory
ECE 448 – FPGA and ASIC Design with VHDL
Major FPGA Vendors
SRAM-based FPGAs
▪ Xilinx
▪ Intel (formerly Altera)
▪ Lattice Semiconductor
Flash & antifuse FPGAs
▪ Microsemi (formerly Actel)
FPGA Routing Technologies
▪ The Interconnect Switch
FPGA Routing Technologies (cont.)
▪ SRAM (6T)
• Reprogrammable
• Large switch, expensive wires
• Low logic utilization, typ. 60%
▪ Flash (1T)
• Best of both worlds: reprogrammable & nonvolatile
• Small switch, cheap wires
• High logic utilization, typ. >85%
▪ Anti-fuse
• Nonvolatile
• Smallest switch, cheapest wires
• Highest logic utilization, typ. >90%
FPGA Routing Resources
▪ Flexible High-Performance Routing Hierarchy
• Ultra-fast Local Network
• Efficient Discrete Long-line Network (1, 2 and 4 Tiles Long)
• High-speed Very-long-line Network
• Eighteen Low-skew Global Networks
▪ Benefits
• Multiple Routing Path Alternatives for Low Congestion
• Short Corner-to-corner Delays
• Enables Rapid Timing Convergence
FPGA Local and Long Line Networks
[Figure: array of VersaTiles with Long Lines spanning 1, 2 and 4 VersaTiles]
▪ Local Lines connect VersaTile outputs to nearest-neighbor VersaTiles, I/O Buffer, or Memory Block
▪ Long Lines route longer distances and support higher fanout nets
FPGA Very Long Line Network
[Figure: Very Long Line network, 16 Tiles by 12 Tiles, with groups of 24 Lines and 32 Lines]
▪ Very Long Lines reach ±12 Tiles vertically and ±16 Tiles horizontally (SmartFusion)
FPGA Global Networks
▪ FPGA fabric contains multiple Global Resources
• Chip-wide Global Networks and Quadrant Global Networks
▪ Chip-wide Global Networks
• Can reach all Tiles (Ports, RAM, I/O, and CCC Tiles)
• Driven by Clock Conditioning Circuitry (CCC)
▪ Quadrant Global Networks
• Can reach all Tiles within the Quadrant
• Driven by Clock Conditioning Circuitry (CCC), usually in all corners of the die
FPGA Global Network (SmartFusion)
▪ Left and Right CCCs Provide Access to 6 Chip-wide Global Networks
▪ CCCs in 4 Corners Provide Access to 12 Quadrant Global Networks (3 per Quadrant)
▪ Each Tile Has Access to 9 Global Resources
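The three counts above fit together arithmetically. The sketch below is only a bookkeeping check, assuming that a tile sees every chip-wide network plus the networks of its own quadrant; the function names are illustrative and do not come from any vendor tool.

```python
# Bookkeeping check for the SmartFusion global network counts quoted above.
CHIP_WIDE_NETWORKS = 6        # reached through the left and right CCCs
QUADRANTS = 4
QUADRANT_NETWORKS_EACH = 3    # reached through the CCC in each corner


def total_global_networks() -> int:
    """Chip-wide networks plus the quadrant networks of all four quadrants."""
    return CHIP_WIDE_NETWORKS + QUADRANTS * QUADRANT_NETWORKS_EACH


def networks_visible_to_tile() -> int:
    """Assumption: a tile sees all chip-wide networks and only its own quadrant's."""
    return CHIP_WIDE_NETWORKS + QUADRANT_NETWORKS_EACH


print(total_global_networks())     # 18
print(networks_visible_to_tile())  # 9
```

The total of 18 agrees with the "Eighteen Low-skew Global Networks" listed under FPGA Routing Resources, and 9 agrees with the per-tile figure on this slide.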
[Figure: SmartFusion die with MSS, PLL/CCC and CCC blocks feeding groups of 3 and 6 global lines]
*PLL on A2F500
Additional Resources
Altera Arria Web Page
Altera Arria Development Kits
Microsemi SmartFusion2 Web Page
Microsemi SmartFusion2 Development Kits
Xilinx Zynq Web Page
Xilinx Zynq Development Kits
All Programmable Planet
Warren’s APP Blog (SerDes)
Design Methodology
Motivations
▪ Accelerated time-to-market and reduced life-cycle
• Flexibility needs
Motivations
▪ High integration
• Basic: memory, logic, I/Os…
• Plus more: PLL, DSP, Micro-controller, Flash, SerDes, clock oscillator…
Motivations
▪ Design skills: One person cannot do it all
• Ideal team: System level, DSP algorithms, SW/HW co-design, HDL modeling, Design methodology, Project management, Board level, Signal integrity, High-speed I/Os
Motivation
$700 million.
ARM
Cortex-M3
SYSTEM DESIGNER’S DREAM
First Step in System Design
Design Principles
▪ Hierarchy
• Divide & conquer
• Simplification of the problem
▪ Regularity
• Divide into identical building blocks
• Simplifies verification of the assembly
▪ Modularity
• Robust definition of all components (entity)
• Allows easy interfacing
▪ Locality
• Ensuring that interaction among modules remains local
• Makes designs more predictable and reusable
Team Design Methodology
▪ Top-Down design methodology in 5 steps
1- Specifications
2- Partitioning
3- Partial Implementation
4- Assemblage
5- Implementation
Step 1: Specifications
▪ Put down the circuit concept
• Easy verification
• A reference manual for communication
• How?
▪ Put down the requirements
• Timing budget
• Power budget
• Area budget
• Financial budget
Step 2A: Partitioning
▪ Divide and conquer strategy
• Driven by technology, teams, availability (IPs), etc.
[Figure: SOFTWARE / HARDWARE partition]
Step 2B: Partitioning
▪ Divide and conquer strategy
• Driven by technology, teams, availability (IPs), etc.
Step 3: Partial Implementation
Configurators
RTL IPs
Structural
RTL
Step 4: Assemblage
▪ Hierarchical way
▪ Start from the lowest level
▪ Includes mixed-level description
▪ Final product validation is now possible
• Compare to original specifications
• Simulate
• On-board verification
Step 5: Full Design Implementation
▪ Simplified FPGA design implementation flow
• Design Entry
• Logic Synthesis
• P&R (Layout)
• Programming
• Verification
Step 6: Validation
▪ Simulation
• Bus Functional Model (BFM)
• Mixed language HDL simulation
▪ Hardware Prototype for system validation
Timing Analysis & Design Constraints
“Every circuit is considered guilty until proven innocent”
Barto's Law
Timing Golden Rule
Detecting problems as early as possible
Timing in the Design Flow
▪ Timing Driven Synthesis
▪ Timing Driven Optimization
▪ Timing Driven Floor-Planner
▪ Timing Driven Place & Route
Simulators versus Verifiers
▪ Simulators
• Circuit-Level
• Timing
• Switch-Level
• Logic-Level
▪ Verifiers (Pattern Independent)
• Static Timing Analysis
Timing Paths
▪ 4 types of basic timing paths in a synthesized digital design:
• Input to registers
• Registers to output
• Input to output
• Registers to registers
Timing Paths
▪ These basic timing paths can apply to:
• A module within an ASIC/FPGA
• A whole ASIC/FPGA
• A system with multiple chips
What are Timing Constraints?
▪ Maximum or minimum limits placed on timing paths
• Input to registers: External Setup / Hold
• Registers to output: Maximum / Minimum Clock-to-Out
• Input to output: Maximum / Minimum Delay
• Registers to registers: Maximum Clock Frequency
▪ Other exceptions: False Paths, Multi-Cycle Paths
▪ Usually expressed in ns or ps
▪ First step: understanding flip-flop timing parameters
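The register-to-register limit can be made concrete with the usual textbook relation: the clock period must cover the clock-to-out of the launching flip-flop, the logic and routing delay, and the setup time of the capturing flip-flop. The sketch below assumes that relation; the delay values are hypothetical and not taken from any datasheet.

```python
def min_clock_period_ns(t_co_ns: float, t_logic_ns: float, t_setup_ns: float,
                        clock_skew_ns: float = 0.0) -> float:
    """Smallest period that meets setup on a register-to-register path.

    clock_skew_ns is capture-clock delay minus launch-clock delay; a later
    capture clock relaxes the setup requirement (and tightens hold).
    """
    return t_co_ns + t_logic_ns + t_setup_ns - clock_skew_ns


def max_clock_frequency_mhz(t_co_ns: float, t_logic_ns: float, t_setup_ns: float,
                            clock_skew_ns: float = 0.0) -> float:
    """Maximum clock frequency implied by the minimum period (1 / ns = 1000 MHz)."""
    return 1000.0 / min_clock_period_ns(t_co_ns, t_logic_ns, t_setup_ns, clock_skew_ns)


# Hypothetical path: 1.0 ns clock-to-out, 6.5 ns logic and routing, 0.5 ns setup.
print(min_clock_period_ns(1.0, 6.5, 0.5))      # 8.0
print(max_clock_frequency_mhz(1.0, 6.5, 0.5))  # 125.0
```

A static timing analyzer applies the same relation to every register-to-register path and reports the slowest one.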
Flip-Flop Timing: Overview
▪ The level of D is sampled on the rising edge of CLK
▪ Q holds the sampled value until the next rising edge of CLK
▪ The level of D must be stable for some amount of time before and after the sampling clock edge
[Figure: D flip-flop with inputs D and CLK and output Q]
INFER: process (CLK) begin
  if (CLK'event and CLK = '1') then
    Q <= D;
  end if;
end process INFER;
Flip-Flop Timing: Clock Requirements
▪ Clock Parameters
• Clock cycle time (tCYC), minimum
• Clock pulse width high (tCH), minimum
• Clock pulse width low (tCL), minimum
Flip-Flop Timing: Input Setup
▪ Input Setup (tS)
• The minimum time that the D input must be stable before the active (rising or falling) edge of the clock
Flip-Flop Timing: Input Hold
▪ Input Hold (tH)
• The minimum time that the D input must be stable after the active edge of the clock
Flip-Flop Timing: Stability Requirements
▪ Setup and Hold define a minimum window around the active clock edge during which D must be stable
▪ tS or tH may be negative, but tS + tH > 0
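The stability window can be expressed as a simple interval test. The sketch below is a minimal illustration, assuming the window is the open interval from tS before the active edge to tH after it; the function name and the numbers are hypothetical.

```python
def violates_stability_window(clock_edge_ns: float, data_change_ns: float,
                              t_setup_ns: float, t_hold_ns: float) -> bool:
    """Return True if D changes inside the setup/hold window of a clock edge.

    Either parameter may be negative, but the window width tS + tH must be
    positive, as stated on the slide.
    """
    if t_setup_ns + t_hold_ns <= 0:
        raise ValueError("tS + tH must be greater than 0")
    window_start = clock_edge_ns - t_setup_ns
    window_end = clock_edge_ns + t_hold_ns
    return window_start < data_change_ns < window_end


# Active edge at 10 ns, tS = 0.5 ns, tH = 0.2 ns: the window spans 9.5 ns to 10.2 ns.
print(violates_stability_window(10.0, 9.7, 0.5, 0.2))    # True, setup violation
print(violates_stability_window(10.0, 9.0, 0.5, 0.2))    # False
# Negative hold time: the window closes before the edge itself.
print(violates_stability_window(10.0, 9.95, 0.5, -0.1))  # False
```

With a negative hold time the data may change slightly before the edge without a violation, which is why only the sum tS + tH has to be positive.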
Flip-Flop Timing: Clock-to-Out
▪ Clock-to-out (tCO), a.k.a. Clock-to-Q
• The time delay from the active edge of the flip-flop’s clock input to the resulting change in the Q output
▪ Specified minimum and maximum times
Constraining Designs @ Board-Level
▪ Constraining FPGA Designs with SDC
• create_clock
• set_input_delay
• set_output_delay
• set_false_path
• set_multicycle_path
Source: Mentor Graphics Corporation ©, 2002
Constraining Designs
▪ Design Environment: Corner Analysis
• Operating conditions (Process / Voltage / Temperature)
– Worst: to fix the setup violations
– Typical: mostly ignored
– Best: to fix the hold violations
▪ Timing Assertions (Design-level)
• Clock Characteristics
• Arrival Time at Each Input Port
• Required Time at Each Output Port
▪ Timing Exceptions
• False Paths
• Minimum / Maximum Path Delay
• Multi-cycle Paths
Timing Analysis: Setup/Hold Check
▪ Create Clock: reg-to-reg requirement
[Figure: regA drives regB on a common clk; setup and hold are checked between the Clk edge at regA and the Clk edges at regB (single-cycle timing relationship)]
Setup Check
▪ Arrival/Required
▪ Slack = Required_time – Arrival_time (Violation if < 0)
[Figure: FF1 to FF2 path; clock period 20; CK–FF1:CP = 10; FF1:CP–FF2:D = 20; CK–FF2:CP = 15; setup (STP) = 2]
▪ FF2:D Arrival Time = 30
▪ FF2:D Required Time = 33
▪ Slack = +3
Hold Check
▪ Arrival/Required
▪ Slack = ArrivalTime – RequiredTime (Violation if < 0)
[Figure: FF1 to FF2 path with clock CK]
▪ FF2:D Arrival Time from CK = 19
▪ FF2:D Required Time from CK = 16
▪ Slack = +3
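The two slack definitions differ only in sign convention. The sketch below reproduces the totals shown on the setup and hold slides; the breakdown of the setup numbers (launch clock delay 10, data path 20, capture clock delay 15, setup 2, period 20) is one reading of the figure annotations and should be treated as an assumption.

```python
def setup_slack(clock_period, launch_clock_delay, data_path_delay,
                capture_clock_delay, t_setup):
    """Return (arrival, required, slack) for a setup check.

    Data must arrive before the next capture edge minus the setup time.
    """
    arrival = launch_clock_delay + data_path_delay
    required = clock_period + capture_clock_delay - t_setup
    return arrival, required, required - arrival


def hold_slack(arrival, required):
    """Hold slack: data must arrive after the required time, so the sign flips."""
    return arrival - required


# Setup check, using the assumed breakdown of the slide's numbers.
print(setup_slack(20, 10, 20, 15, 2))  # (30, 33, 3)
# Hold check, using only the totals printed on the slide.
print(hold_slack(19, 16))              # 3
```

In both cases a negative result is a violation; only the direction of the comparison changes.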
Timing Analysis: Multi-cycle
▪ set_multicycle_path: reg-to-reg exception
[Figure: regA drives regB on a common clk; setup is checked against a later Clk edge at regB (multi-cycle timing relationship)]
Input Delay Constraints
▪ Captures External Setup/Hold Requirements
[Figure: Clock Generator, external Register blocks and MyDesign; input IN is constrained relative to CK (annotations: 2.0 ns, 3.0 ns)]
▪ SDC: set_input_delay 2.0 -clock CK {IN}
▪ Option “-min” for hold check
▪ Option “-max” for setup check
Design Methodology: Timing Closure Loop
▪ Enter synthesis constraints via SDC (clock, I/O, false paths, multicycle)
▪ Place and Route
▪ Static Timing Analysis
• PASS: results OK, continue with back-annotated timing simulation
• FAIL: improve results
▪ Ways to improve results
• Modify clock constraints
• Modify I/O constraints
• Set max delay
• Add False paths
• Add Multicycle paths
▪ If timing closure is not achieved in P&R, go back to the synthesis constraints
Clock-Domain-Crossing (CDC)
▪ Study the crossing over the least common multiple of the two clock periods (the common period)
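For two related clocks, the common period is the least common multiple of the two periods, and the tightest launch-to-capture edge pair inside it sets the setup requirement. The sketch below assumes both clocks have a rising edge at time 0 and uses integer picoseconds to avoid rounding; the 10 ns and 15 ns periods are hypothetical.

```python
from math import gcd


def common_period_ps(period_a_ps: int, period_b_ps: int) -> int:
    """Least common multiple of two clock periods, in picoseconds."""
    return period_a_ps * period_b_ps // gcd(period_a_ps, period_b_ps)


def tightest_setup_relation_ps(launch_ps: int, capture_ps: int) -> int:
    """Smallest gap from a launch edge to the next capture edge after it."""
    common = common_period_ps(launch_ps, capture_ps)
    best = common
    for launch_edge in range(0, common, launch_ps):
        # First capture edge strictly after this launch edge.
        capture_edge = (launch_edge // capture_ps + 1) * capture_ps
        best = min(best, capture_edge - launch_edge)
    return best


# Hypothetical 100 MHz launch clock (10 ns) and 66.7 MHz capture clock (15 ns).
print(common_period_ps(10_000, 15_000))            # 30000
print(tightest_setup_relation_ps(10_000, 15_000))  # 5000
```

Here the edges repeat every 30 ns, and the launch edge at 10 ns is captured at 15 ns, so the data path has only 5 ns, less than either clock period.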
Clock-Domain-Crossing Analysis
▪ Today’s systems have a multitude of components working in different clock domains running at varying speeds
▪ Examples range from small mobile phone chips to huge graphics processors or microprocessors that interface with a variety of buses and I/Os
▪ Signals that cross clock-domain boundaries can typically be classified into two types:
- Synchronous
- Asynchronous
Synchronous Clock-Domain-Crossing
▪ Synchronous crossings are those where the receiving domain has a phase/frequency relationship with the sending domain
▪ These crossings can be timed and robustly verified to meet the timing requirements
Asynchronous Clock-Domain-Crossing
▪ Asynchronous crossings are those where there is no relationship between the sending and receiving clocks
▪ These clocks originate from different clock generators or derivatives of those
▪ As a result, timing cannot be accurately verified since the order of clock edges cannot be guaranteed
Metastability
▪ Metastability refers to signals that do not assume stable 0 or 1 states for some duration of time at some point during normal operation of a design
▪ In a multi-clock design, metastability cannot be avoided, but its detrimental effects can be neutralized
Metastability
▪ Signals flowing from one asynchronous domain ClkA to another domain ClkB can typically be categorized into the following types:
1. Reset signals
2. Single-bit Data signals
3. Multi-bit Data signals
4. Synchronized Control signals
Mean Time Between Failures
▪ How often metastability leads to a failure can be predicted with the mean time between failures (MTBF) formula
▪ In that formula, C1 and C2 are constants that depend on the technology used to build the flip-flop; tMET is the duration of the metastable output; and fclk and fdata are the frequencies of the synchronous clock and the asynchronous input, respectively
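The formula itself did not survive in this transcript. A commonly quoted form that matches the symbols described above is MTBF = e^(C2 · tMET) / (C1 · fclk · fdata); some vendors write the exponent as tMET / C2 instead, so the form used here is an assumption. The sketch below evaluates it with purely hypothetical constants.

```python
import math


def mtbf_seconds(t_met_s: float, f_clk_hz: float, f_data_hz: float,
                 c1_s: float, c2_per_s: float) -> float:
    """Mean time between failures, assuming MTBF = exp(C2*tMET) / (C1*fclk*fdata).

    c1_s and c2_per_s are technology constants; real values come from the
    device vendor's metastability characterization.
    """
    return math.exp(c2_per_s * t_met_s) / (c1_s * f_clk_hz * f_data_hz)


# Hypothetical constants, chosen only to show the trend.
C1 = 1e-9      # seconds
C2 = 2e10      # 1 / seconds

one_ns = mtbf_seconds(1e-9, 100e6, 10e6, C1, C2)
two_ns = mtbf_seconds(2e-9, 100e6, 10e6, C1, C2)
print(two_ns / one_ns)  # MTBF grows exponentially with the settling time
```

The design lesson is in the shape of the curve: every extra nanosecond allowed for the metastable output to settle multiplies the MTBF, while higher clock or data rates reduce it only linearly.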
Synchronization, Control & Data Paths
▪ Reset signals are used to reset the logic in the receiving domain during the chip-reset phase or whenever interrupts such as software resets or aborts happen in the sending domain
▪ Single-bit data signals are typically used to convey some sort of status to the receiving domain; for example, to convey that the sending domain is busy or to gate the receiving domain clock
▪ Before being used in the receiving domain, both kinds of signals need to be synchronized (usually with a double-flop synchronizer)
▪ Synchronization is sufficient because these signals are intended to transition intermittently and be stable the rest of the time
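The double-flop synchronizer mentioned above is two flip-flops in series, both clocked by the receiving domain. The sketch below is only a cycle-level behavioral model written for illustration; it shows the resulting latency but does not model metastability itself, and the class name is hypothetical.

```python
class DoubleFlopSynchronizer:
    """Two flip-flops in series, both clocked by the receiving domain."""

    def __init__(self, reset_value: int = 0):
        self.stage1 = reset_value  # first flop, may go metastable in hardware
        self.stage2 = reset_value  # second flop, output used by the receiving logic

    def clock(self, async_in: int) -> int:
        """Apply one receiving-clock edge and return the synchronized output."""
        # Both flops update on the same edge: stage2 takes the old stage1 value.
        self.stage2, self.stage1 = self.stage1, async_in
        return self.stage2


sync = DoubleFlopSynchronizer()
inputs = [0, 1, 1, 1, 0, 0]
print([sync.clock(bit) for bit in inputs])  # [0, 0, 1, 1, 1, 0]
```

The first flop gets a full clock period to resolve before the second one samples it, which is the settling time that enters the MTBF estimate. The cost is added latency on every crossing.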
Synchronization, Control & Data Paths
▪ On the other hand, multi-bit data transfer is used to transfer buses of data signals between domains
▪ If each of these bits is synchronized individually, the outputs of the synchronizers lose their correlation because each bit may resolve metastability differently
▪ Hence, some sort of common control mechanism is required to
(i) let the receiving domain know that the transmitted multi-bit data is valid and
(ii) let the receiving domain capture that data only when it is stable
▪ This is often accomplished via handshake-based synchronized control signals or via FIFO-based synchronization
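The loss of correlation can be illustrated by enumerating what the receiving domain might capture. The sketch below assumes that every bit that changes during the transfer resolves independently to either its old or its new value; the function name is hypothetical.

```python
from itertools import product


def possible_captures(old_word: int, new_word: int, width: int) -> list:
    """All words the receiver might see if each changing bit resolves independently."""
    changing = [i for i in range(width) if ((old_word ^ new_word) >> i) & 1]
    results = set()
    for choice in product((0, 1), repeat=len(changing)):
        word = old_word
        for bit, take_new in zip(changing, choice):
            if take_new:
                word ^= 1 << bit  # this bit already shows its new value
        results.add(word)
    return sorted(results)


# A 4-bit bus going from 0111 to 1000: all four bits change at once.
print(len(possible_captures(0b0111, 0b1000, 4)))  # 16
# Only one bit changes: the receiver sees either the old or the new word.
print(possible_captures(0b0110, 0b0111, 4))       # [6, 7]
```

When all four bits toggle, any of the 16 codes can appear for a cycle, including values that were never sent. With a single changing bit only the old or the new word is possible, which is why the control signal of a handshake or the pointers of a FIFO are kept to one changing bit at a time.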
Synchronization, Control & Data Paths
▪ This kind of transfer has to be designed very carefully
Synchronization, Control & Data Paths
▪ MUX synchronizer
Synchronization, Control & Data Paths
▪ The MUX synchronizer has critical requirements on all of its inputs, in terms of both domain and functionality:
• The select input of the MUX comes from the destination domain (the domain into which the signal is being synchronized)
• One of the MUX inputs comes from the destination domain, that is, the holding loop
• The MUX inputs can be source, destination, or user-specified static signals
• The logic between the MUX synchronizer and the destination flop is driven by the destination domain or static signals
Convergence in the Crossover Path
▪ Clock domain crossover paths are false paths for timing tools
▪ Any logic in this path must be carefully crafted and verified, because the logic can cause glitches and create functional errors downstream
Other Challenges in Verification of CDC
▪ Adding CDC verification in the early design stages verifies and validates the unverified portion of the design
Questions ?
Why Power Management?
▪ “When I talk to companies, power is the number one problem they have to solve.” – Lip-Bu Tan, Cadence
▪ “Now, power is also becoming a system problem. You have to start at the top.” – Wally Rhines, Mentor Graphics
▪ “The world faces one mega-issue, which is power. Electronics could help cut power consumption by 20-30%.” –Aart de Geus, Synopsys
Why Power Management? (cont.)
Battery life Environmental concerns
Cooling Cost System Reliability
Why Power Management? (cont.)
▪ 19% of total electricity worldwide is consumed by electrical lighting
▪ 5 to 25% of power is wasted by standby TVs, PCs, games and printers!
▪ 80% of IT power is wasted (globally 100 million MWh)
Why Power Management? (cont.)
▪ Low Voltage: Leakage + Noise
▪ Technology Shrink: Variability + Design Complexity
▪ High Density & High Frequency: High Temperature + EM
Why Power Management? (cont.)
Source: Chip Design Trends Newsletter, John Blyler, April 2007
Sample size: 19,720
Why Power Management? (cont.)
What power-related issues did you encounter on your last project?
(*) Source: Synopsys 2008
Why Power Management? (cont.)
▪ Voltage Drop
▪ Electromigration
▪ Power Consumption
▪ Static & Dynamic Power Dissipation
Power Closure Challenges*
▪ System Level
• Power Optimization: Architecture selection, Voltage scaling, Clock frequency scaling
• Power Analysis based on: Estimated gate counts, Estimated activity
▪ RTL Design
• Optimization: Module clock gating, Voltage island isolation
• Analysis: Defined clocks and registers, Estimated gate counts, Realistic activity
▪ Floor Planning
• Optimization: Voltage islands, Power gating
▪ Physical Synthesis
• Optimization: Threshold voltage scaling, Advanced clock gating, Gate-level optimization
• Analysis: Actual gate counts, Realistic activity, Wireloads or global routing, Final libraries
▪ Place & Route
• Optimization: Power-aware placement, Clock tree optimization, I/O configuration
• Analysis: Actual gate counts, Realistic activity, Detailed routing, Final libraries
[Figure annotations: 10X, 10%]
(*) Adapted from Synopsys
Power Closure Challenges
▪ Power metrics: what is important, and when?
▪ Power analysis accuracy and consistency
▪ Need for a combination of spatial and temporal information
▪ Good power vectors are difficult to generate
▪ Power model complexity
• IPs, operating modes, temperature dependence of leakage
▪ Design & requirement complexity
• Power, timing, area, cost, reliability
Power Optimization: System
Technique Dynamic Static
Clock optimization
Parallelism / Pipelining (2-3x) X
Energy efficient SW & FW
Voltage & frequency scaling
System level power breakdown
Chip level power specification
Hardware and firmware algorithm partition
Power Optimization: RTL
Technique Dynamic Static
Module clock gating
Bus & State encoding
Voltage & Frequency scaling
Retiming
Power gating and sleep devices
Voltage and power islands
Power constraints and per IP power specification
RTL and IP power optimization
power coverage
Power Optimization: Physical Design
Technique Dynamic Static
Clock optimization
Activity aware P&R
I/O optimization
Voltage & Frequency scaling
Input state aware leakage
power coverage
Microsemi Power Management
▪ FPGA Power Management
• Technology
• Architecture
• Design Techniques
Microsemi Power Management: Technology
▪ Actel Flash FPGAs > 1000 times less static power
▪ Actel Flash FPGAs competitive for dynamic power
▪ RTAX-SL 50% lower standby current at 125 degree C
[Chart: Static Power (uW) versus Temperature (25°C, 70°C, 85°C) for Actel IGLOO AGL600, Xilinx Spartan-3AN XC3S400AN and Altera Cyclone-III EP3C5]
Microsemi Power Management: Architecture
▪ Low power macros
▪ Segmented clocks
▪ Low-power modes
▪ Multi-Voltage
Microsemi Power Management: Architecture
▪ Low power macros: memory
• Width Cascading: No extra logic, better performance
• Depth Cascading: Extra decoding & muxing logic, lower performance & higher area
▪ Synplicity uses depth
▪ > 50% power saving
▪ Low-power option in SmartGen
[Figure: two 1K X 4 blocks versus two 512 X 8 blocks, with WADDR, RADDR, WD and REN signals]
Microsemi Power Management: Architecture
▪ Low power macros: arithmetic
▪ SmartGen Ripple adder
• Power (-25%), Performance (-27%)
▪ SmartGen Brent-Kung and Sklansky adders
• Power (-6 to -18%), comparable performance
▪ Experiment: Multipliers
• Low-power multiplier with clock and signal gating
• We could save up to 50% in idle mode
• Project on hold
Microsemi Power Management: Design Techniques
▪ Know your system power and temperature profile
Microsemi Power Management: Analysis Tools
▪ Know your tools: Power Analysis
▪ Design Flow: Datasheet, Power Calculators, SmartPower
SmartPower: Power Analysis
▪ Average Analysis: Power budget for
• Package Selection
• Heat Dissipation
▪ Scenario Analysis: Capturing multi-functional power modes
▪ Glitch Analysis: Detecting power waste
▪ I/O Timing & Power Advisor
▪ Time Based Analysis: Peak power for
• Power Supply Specification
• Hot spot
• Voltage drop
SmartPower: High Level Flow
▪ Inputs: User Design After Layout, Signals Activity, Operating Conditions
▪ Tool: SmartPower
▪ Output: Power Analysis Report
SmartPower: Operating Conditions
▪ Impact of Temperature
• High on Static, Limited on Dynamic Power
▪ Impact of Process
• High on Static, Moderate on Dynamic
▪ Impact of Voltage
• High on Static, High on Dynamic
▪ Impact of Radiation
• For RTAX-S, very small rise in ICC at 100 Krad
• For RT-A3P, the TID reports show that this is also true for TID < 40 Krad
SmartPower: Operating Conditions (cont.)
Impact of Voltage on Static Power
▪ Reducing the voltage from 1.5V to 1.2V = 70% static power saving
[Chart: AGL600 Static ICC versus Voltage; Normalized Static Power for AGL600 against Voltage (V) from 1 to 1.8]
SmartPower: Operating Conditions (cont.)
Impact of Voltage on Dynamic Power
▪ Power = C · V^2 · Freq
▪ Reducing the voltage from 1.5V to 1.2V = 40% dynamic power saving
[Chart: Dynamic Power Normalized at 1.5V versus Core Voltage, from 0.5 to 1.9]
SmartPower: Signals Activity
▪ Through simulation data (VCD file)
• Recommended flow
• Simulation quality very important
▪ Actel’s vectorless estimator
• Clock constraints imported from SmartTime
▪ Default annotation for data and clocks
• Two fixed values per clock domain: clock frequency, data toggle rate
• Convenient and fast, no simulation required, but inaccurate
▪ User specified net by net
• Only useful to rectify the activity of specific nets
SmartPower: Signals Activity (cont.)
▪ Actel’s vectorless estimator
• Goal: Improve accuracy in the absence of simulation data (VCD)
• Input: Probabilities and transition densities on primary inputs
• Output: Switching activities on each pin
[Charts: VCD/Default and Vless/VCD ratios; annotations: +/- 10% of VCD, +20 to -40% of VCD]
SmartPower: Signal Activities (cont.)
▪ Monitor the simulation coverage
• If coverage < 95%, the VCD flow must be revisited
SmartPower: Modes and Scenarios
▪ Mode definition
• To save a set of parameters defining the power of a design
• To record specific operating conditions and activities
• Predefined Modes: Flash*Freeze, Sleep, Stand-by (Fusion Only)
▪ Scenario definition
• To combine different modes for power estimation
• Predefined scenarios
SmartPower: Modes and Scenarios (cont.)
▪ SmartPower Modes and Scenarios Pane
SmartPower: Glitch Analysis
▪ Hazard or spurious transition: caused by delay mismatch among re-convergent paths
▪ Wasted power represents 15%-20% of the global power
▪ Strongly dependent on circuit topology and test vectors
▪ Glitches smaller than a given threshold are filtered automatically; the threshold is defined per family and characterized with Spice
▪ A hazard report is accessible from the SmartPower menu
SmartPower: I/O Advisor
▪ Modification of I/O attributes
• Output load
• Output drive and slew
▪ Algorithm to optimize power while meeting timing constraints, for output drive and slew
• Positive slack: selects the attributes with the least power that still maintain positive slack
• Negative slack: selects the attributes that minimize negative slack
• No slack: selects the attributes with the least power
▪ Silicon family support
• G3 derivatives (8.6 release)
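The three selection rules can be sketched as a small decision function. This is only an illustration of the rules as stated on the slide, not the actual I/O Advisor implementation; the `IoSetting` type, its fields and the example values are all hypothetical, and zero slack is treated here as meeting timing.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class IoSetting:
    drive_ma: int
    slew: str
    power_mw: float
    slack_ns: Optional[float]  # None when the output has no timing constraint


def pick_setting(candidates: List[IoSetting]) -> IoSetting:
    """Choose drive and slew attributes following the three rules on the slide."""
    timed = [c for c in candidates if c.slack_ns is not None]
    if not timed:
        # No slack information: simply take the lowest-power setting.
        return min(candidates, key=lambda c: c.power_mw)
    meeting = [c for c in timed if c.slack_ns >= 0]
    if meeting:
        # Timing can be met: lowest power among the settings that still meet it.
        return min(meeting, key=lambda c: c.power_mw)
    # Timing cannot be met: take the least negative slack.
    return max(timed, key=lambda c: c.slack_ns)


options = [
    IoSetting(2, "slow", 1.0, -0.5),
    IoSetting(4, "slow", 1.5, 0.2),
    IoSetting(8, "fast", 2.5, 1.0),
]
print(pick_setting(options).drive_ma)  # 4
```

The 2 mA setting uses the least power but fails timing, so the 4 mA setting is chosen as the cheapest option that still meets the constraint.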
SmartPower: I/O Advisor (cont.)
▪ Introduction page
▪ Individual steps optional
SmartPower: I/O Advisor (cont.)
▪ Output load page
• Change current output load to reduce power
SmartPower: I/O Advisor (cont.)
▪ Output drive and slew page
• User can change the “current” output drive and slew
• I/O Advisor provides a “suggestion” for lower power consumption while meeting timing constraints
SmartPower: Time Based Analysis
▪ Using Cycle Accurate Analysis to debug a simulation time window
▪ Power during M1/WFI is higher than when not in WFI!
SmartPower: Time Based Analysis (cont.)
▪ Verifying that the gated clock solved the problem
▪ In WFI mode we now consume only 13 mW, compared with 32.5 mW in Burst Mode
▪ Power in the core during WFI is less than 2 mW; it was 16 mW without the gated clock
PDPR: Global Routing Optimization
Sample size: 115
Survey of end users at DAC Suite 2006
Source: Sequence Design Inc.
PDPR: Global Routing Optimization
▪ For designs without memories on IGLOO 1.2V devices
• Share of total power: 65% nets, 12% gates, 13% I/Os
PDPR: Global Routing Optimization
▪ Reducing the global network segments (spines) during placement
▪ From 7 spines and 60 ribs to 2 spines and 26 ribs.
PDPR: Global Routing Optimization
▪ 12% average (28% max) reduction in overall dynamic power
▪ 18% average (37% max) reduction in net power
▪ 1% performance loss
PDPR: Power-Driven Re-Synthesis
▪ Area minimization with sequential optimization
▪ 6.7% average timing improvement
▪ 13% average power saving
Conclusion: Best Design Practices
▪ Architectural exploration has great impact
▪ Write power friendly RTL
▪ Clock reduction scheme is very important
▪ Power-efficient memory selection is key
▪ Low-power arithmetic macros can be helpful
▪ Develop accurate power vectors
▪ Verify power early and often
▪ Run “power regressions” throughout – RTL to tapeout
Questions ?