System-on-Chip
Qiong Cai
Intel Barcelona Research Center
Aula Empresa, Facultat d’Informàtica de Barcelona, February 2010
© Intel Corporation, 2010
Outline
• System-on-Chip Illustrated
• SoC Challenges and Current Solutions
• Intel’s Moorestown Platform
  – Designed for Next Generation Smartphones
• Future SoC:
  – Programmable Accelerator
Designing Tomorrow’s Microprocessors
System-on-Chip is Everywhere!
Example: iPhone 3GS disassembled
CPU (Samsung)
NAND Flash Memory (Toshiba)
An SoC contains all the components required for a complete system.
SoC Challenges and Current Solutions
SoC Challenges
• Exponentially increasing SoC design complexity
• Endless Performance Requirements
• Maximum Power-efficiency
SoC Challenges – 1/3
• Exponentially increasing SoC design complexity
  – SoC = all components required for a complete system
  – 1995: < 1 million gates / one microprocessor
  – Today: 100s of millions of gates / multiple microprocessors
  – “Today, as always, to be competitive, you must design almost as complex theoretically possible. If you don’t, someone else will do, and the extra complexity will provide competitive edge.”
SoC Challenges – 2/3
• Endless Performance Requirement
  – Multimedia: many different codecs for images/audio/video
    • HD H.264 Decode: 10-50 GOPS for 30 fps (real time)
    • HD H.264 Encode: ~100 GOPS for 30 fps
    • Super Resolution: ~1000 GOPS for 30 fps
  – Networking: diverse and complicated standards
    • Telephony: constant bit-rate service
    • Multimedia streaming: higher bandwidth but variable bit-rate service
  – Wireless
    • Many wireless coding standards
      – Wi-Fi, WiMax, etc.
    • New standards appear all the time
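To put the GOPS figures above in per-frame terms, the following sketch divides each throughput figure by the frame rate. The 1920x1080 frame size is an assumption (the slide only says "HD"), and the function names are illustrative.

```python
# Per-frame and per-pixel operation budgets implied by the GOPS figures above.
# Assumption: "HD" means 1920x1080 pixels; the slide does not state a resolution.

FPS = 30
HD_PIXELS = 1920 * 1080  # assumed frame size

# Throughput requirements quoted on the slide, in GOPS (giga-operations per second).
WORKLOADS_GOPS = {
    "H.264 decode (low end)": 10,
    "H.264 decode (high end)": 50,
    "H.264 encode": 100,
    "Super resolution": 1000,
}


def ops_per_frame(gops, fps=FPS):
    """Operations available for one frame at the given frame rate."""
    return gops * 1e9 / fps


def ops_per_pixel(gops, fps=FPS, pixels=HD_PIXELS):
    """Operations available for one pixel of one frame."""
    return ops_per_frame(gops, fps) / pixels


if __name__ == "__main__":
    for name, gops in WORKLOADS_GOPS.items():
        print(f"{name}: {ops_per_frame(gops):.2e} ops/frame, "
              f"{ops_per_pixel(gops):.0f} ops/pixel")
```

Under the assumed frame size, the encode figure works out to roughly 1,600 operations per pixel per frame, which gives a feel for why these workloads are hard to run in software within a phone's power budget.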
SoC Challenges – 3/3
• Maximum Power-Efficiency
  – Phone: 3W power budget
    • Otherwise, the phone feels too hot to the user
  – HD H.264 Decode
    • Normal PC: 45W or more, depending on the video format
    • Hardware Decoder in SoC: 100-300mW
• A big challenge
  – How to meet different performance requirements under a very tight power budget?
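The gap between the power figures above can be made concrete with a little arithmetic. This is a minimal sketch using only the numbers quoted on the slide; 45 W is treated as a lower bound for the PC case, and the helper names are illustrative.

```python
# Compare the power figures quoted above for HD H.264 decode.

PHONE_BUDGET_W = 3.0       # phone power budget from the slide
PC_DECODE_W = 45.0         # "45W or more" on a normal PC; lower bound used here
SOC_DECODE_W = (0.1, 0.3)  # 100-300 mW for a hardware decoder in an SoC


def power_ratio(pc_w, soc_w):
    """How many times more power the PC uses than the SoC decoder."""
    return pc_w / soc_w


def budget_share(power_w, budget_w=PHONE_BUDGET_W):
    """Fraction of the phone power budget consumed by a given load."""
    return power_w / budget_w


if __name__ == "__main__":
    low, high = SOC_DECODE_W
    print(f"PC vs SoC decoder: {power_ratio(PC_DECODE_W, high):.0f}x "
          f"to {power_ratio(PC_DECODE_W, low):.0f}x")
    print(f"SoC decoder share of phone budget: "
          f"{budget_share(low):.1%} to {budget_share(high):.1%}")
    print(f"PC decode power is {budget_share(PC_DECODE_W):.0f}x the phone budget")
```

With these inputs, the PC figure is at least two orders of magnitude above the hardware decoder and well beyond the whole phone budget.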
Current SoC Design Solution: Divide and Conquer
• An SoC is divided into blocks, and each block is optimized for a different performance/power requirement.
  – Some blocks are control processors (control plane)
    • Programmable / flexible
  – Some blocks are data processing units implemented with RTL hardware (data plane)
    • Very limited programmability
Partition of Control and Data Planes
1. Data plane blocks can be optimized for different applications.
2. The IP blocks can be reused and the design complexity decreases.
(a) Control/Data plane functions in SoCs
Multiple CPUs are a reality. For example, Cisco’s router chip contains 192 CPUs.
On-chip memory is growing to meet the data-intensive requirements of the data planes.
Share of total area: 1999: 20%; 2002: 52%; 2005: 71%; 2008: 83%
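The control/data plane split described above can be illustrated in software terms. This is only a toy sketch: the block names, the stand-in computations, and the `control_processor` function are all hypothetical and are not taken from any real SoC.

```python
# Toy model of a control plane / data plane partition.
# All names and computations below are hypothetical illustrations.

from typing import Callable, Dict, List, Tuple


# Data plane: fixed-function blocks. Each one does a single job.
def video_decode_block(data: List[int]) -> List[int]:
    return [x * 2 for x in data]  # stand-in for a hardwired video pipeline


def audio_codec_block(data: List[int]) -> List[int]:
    return [x + 1 for x in data]  # stand-in for a hardwired audio pipeline


DATA_PLANE: Dict[str, Callable[[List[int]], List[int]]] = {
    "video": video_decode_block,
    "audio": audio_codec_block,
}


# Control plane: programmable software that decides which block runs each job.
def control_processor(jobs: List[Tuple[str, List[int]]]):
    results = []
    for kind, payload in jobs:
        block = DATA_PLANE.get(kind)
        if block is None:
            # No fixed-function block exists for this job type, so it stays
            # on the programmable processor (modelled here as a pass-through).
            results.append((kind, "software", list(payload)))
        else:
            results.append((kind, "hardware", block(payload)))
    return results


if __name__ == "__main__":
    print(control_processor([("video", [1, 2]), ("radio", [3])]))
```

The point of the sketch is the asymmetry: the control side can be changed by editing code, while a job type with no matching data plane block has nowhere efficient to run.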
Intel’s Moorestown Platform
– Designed for Next Gen Smartphones
Intel SoC Roadmap
Moorestown Platform Overview
Lincroft (45nm)
• ATOM uLP CPU Core
  – 45nm HighK SoC Process
  – Intel Hyper Threading Support
• Intel Graphics
  – OpenGL ES 2.0 (subset of OpenGL 3D)
  – OpenVG (for Flash/SVG)
• Memory Support
  – LPDDR1 and DDR2
  – Single Channel
  – x32 bit interface
• Hardware Accelerated Video Decode and Encode
[Block diagram labels: CPU Core, 2D/3D Graphics, Hardware Video Accelerator, Memory Controller, Link to Langwell]
Langwell (65nm)
[Block diagram labels: Audio Codec, USB Controller, NAND Controller, …]
Moorestown Highlights
• High performance for an amazing Internet experience
• Dramatically lower power
  – Up to 50x platform idle power reduction compared to the Menlow Platform
• Small size for smartphone form factor
Moorestown Performance Technology
• Intel Hyper-Threading Technology
• Intel Burst Performance Technology
Intel Hyper-Threading Technology
Intel Burst Performance Technology
Moorestown Power Reduction
• Active Power Management – Power and Clock Gating
• Use Low Power and Handheld IO
  – LPDDR
• Accelerators (GFX, video decoder)
  – Enable functionality at low power (e.g., HD video)
Active Power Management
• Enables long standby battery life needed for smartphones
  – Through low idle power
• Lower active scenario power by shutting things off that are not used in that mode
  – Switching off video decode block in web browsing
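The idea of shutting off unused blocks per usage scenario can be sketched as a simple power model. Every block name and milliwatt figure below is hypothetical and chosen only for illustration; the model also assumes a gated block draws nothing, which real hardware only approximates.

```python
# Toy model of active power management: blocks not needed in a scenario are gated.
# All block names and power numbers are hypothetical, for illustration only.

BLOCK_POWER_MW = {
    "cpu": 500,
    "graphics": 300,
    "video_decode": 200,
    "audio": 50,
}

GATED_POWER_MW = 0  # assumption: a power-gated block draws (close to) nothing

# Which blocks each usage scenario needs to keep powered.
SCENARIO_BLOCKS = {
    "web_browsing": {"cpu", "graphics"},  # video decode is switched off
    "video_playback": {"cpu", "video_decode", "audio"},
    "standby": set(),
}


def scenario_power_mw(scenario):
    """Total power when only the blocks needed by the scenario stay on."""
    active = SCENARIO_BLOCKS[scenario]
    return sum(
        power if name in active else GATED_POWER_MW
        for name, power in BLOCK_POWER_MW.items()
    )


def always_on_power_mw():
    """Total power if no block is ever gated."""
    return sum(BLOCK_POWER_MW.values())


if __name__ == "__main__":
    for scenario in SCENARIO_BLOCKS:
        print(f"{scenario}: {scenario_power_mw(scenario)} mW "
              f"(vs {always_on_power_mw()} mW with nothing gated)")
```

Changing `GATED_POWER_MW` to a small non-zero value is one way to explore how residual leakage affects the standby case.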
Power Gating
Software Development for Moorestown: Moblin
Opportunity for Customers to differentiate
Moblin-based Software Stack
Future SoC – Programmable Accelerator
Current SoC
• IP blocks based on fixed-function implementation are not programmable
  – Each data plane block cannot be used for other purposes
• Long Time-to-market (TTM)
  – The block is not programmable
• High Non-Recurring Engineering (NRE) cost
A Solution for Future SoC
• Low Power Programmable accelerator
• The challenge is to meet performance requirement under a given power and cost budget
[Figure: design options plotted on Performance/Power and Cost & TTM axes: Software on CPU, Fixed Logic, CPU + Fixed Logic, and CPU + Programmable Accelerator]
Our Programmable Accelerator: BSA
[Figure: Blue Summer Architecture (BSA), a low power programmable accelerator in a next generation SoC. Labeled blocks: x86 core, Shared Cache, Main Memory, DMA Engine, Control/Data Network, Blue Summer Controller, Shared RAM, and an array of PEs. Labeled ports: Master Data Port to Memory, Master Control Port to DMA, Slave Control Port from Core.]
Programmable Accelerator
• Different applications can be mapped into the same programmable accelerator
• TTM & NRE cost are reduced
BSA for H264 Encoding
[Figure: Blue Summer Controller and the PE array, with groups of PEs assigned to Motion Estimation, Transformation, Deblocking, Entropy Coding, and Other]
BSA for H264 Decoding
[Figure: Blue Summer Controller and the PE array, with groups of PEs assigned to Transformation, Deblocking, Entropy Coding, and Other; the remaining PEs are in sleep mode to save power]
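The encoding and decoding figures show the same PE array carrying different stage mixes, with unused PEs asleep during decoding. A minimal sketch of that mapping follows; the array size and the per-stage PE counts are hypothetical, since the slides give no numbers, and `map_stages` is an illustrative helper rather than part of BSA.

```python
# Sketch: map codec stages onto one PE array; PEs left over are put to sleep.
# The array size and per-stage PE counts are hypothetical, not from the slides.

TOTAL_PES = 64  # assumed array size

ENCODE_STAGES = {
    "Motion Estimation": 32,
    "Transformation": 12,
    "Deblocking": 8,
    "Entropy Coding": 8,
    "Other": 4,
}

# Decoding has no motion estimation stage, so it needs fewer PEs.
DECODE_STAGES = {
    "Transformation": 12,
    "Deblocking": 8,
    "Entropy Coding": 8,
    "Other": 4,
}


def map_stages(stages, total_pes=TOTAL_PES):
    """Assign consecutive PE indices to each stage; return (assignment, sleeping)."""
    needed = sum(stages.values())
    if needed > total_pes:
        raise ValueError(f"need {needed} PEs but only {total_pes} are available")
    assignment = {}
    next_pe = 0
    for stage, count in stages.items():
        assignment[stage] = list(range(next_pe, next_pe + count))
        next_pe += count
    sleeping = list(range(next_pe, total_pes))  # unassigned PEs can sleep
    return assignment, sleeping


if __name__ == "__main__":
    for label, stages in (("encode", ENCODE_STAGES), ("decode", DECODE_STAGES)):
        assignment, sleeping = map_stages(stages)
        used = sum(len(pes) for pes in assignment.values())
        print(f"{label}: {used} PEs active, {len(sleeping)} PEs asleep")
```

Because the mapping is just data, switching from encoding to decoding means loading a different stage table rather than designing a new hardware block, which is the TTM and NRE argument made on the earlier slide.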
Take-away Messages
• SoC is everywhere!
• The SoC design challenge is to meet exponentially increasing performance requirements under a tight power budget and design cost
Q & A
References
• Rapid Repair Website: iPhone 3GS disassembly
• R. Patel, Moorestown Platform: Based on Lincroft SoC Designed for Next Generation Smartphones, HotChips 2009
• Shreekant Thakkar, Moorestown: Intel’s Next Generation Platform for MIDs and Smartphones, IDF 2009
• Steve Leibson, How to Avoid the Traps and Pitfalls of SoC Design, Microprocessor Report, 2009
• Moblin: http://moblin.org