Source: meseec.ce.rit.edu/551-projects/winter2011/3-3.pdf

Date post: 21-Apr-2020
Architecture

By: Xinya (Leah) Zhao
Abdulahi Abu

Logo Source: http://gamez-gear.com/ds/images/logos/playstation3logo%20%281%29.gif

Outline

• Evolution of Game Consoles and Gaming Industry

• PlayStation 3 Architecture

• High Level Architecture Comparison

• What's next?

Evolution of Game Consoles

1st Gen (1972)
• Based around vector display, not analog video
• Ex: Odyssey 200

2nd Gen (1976)
• Contained a programmable microprocessor
• Cartridges only needed a single ROM chip to store microprocessor instructions
• Ex: Atari 2600

3rd Gen (1983)
• Supported high-resolution sprites and tiled backgrounds with more colors
• Ex: Nintendo Entertainment System

4th Gen (1988)
• Increased storage space for multimedia-based games
• 32X: added polygon processing
• Ex: Mega Drive/Genesis

5th Gen (1994)
• More polygon processing
• Use of discs, which held far more information and were cheaper
• More onscreen colors
• Ex: PlayStation

6th Gen (1999)
• PC-like architecture
• DVD for game media
• Emergence of online gaming
• Implementation of flash and hard drive storage
• Ex: PlayStation 2

7th Gen (2005)
• Support of new disc formats: Blu-ray Disc, HD DVD
• Wireless controller support
• Motion as input and IR tracking
• Ex: PlayStation 3

8th Gen (2011)
• Support for high definition graphics up to 1080p
• Controller with built-in touchscreen
• Ex: Wii U

Industry Development

Gamers
• Affordable
• High definition graphics
• Backwards compatibility

Developers
• High definition graphics compatibility
• Good hardware

Evolution of Gaming Industry Business Model
Roll out cutting-edge hardware and couple it with compelling games and novel new ways to play.

New Game Plan!!
• Integration of social media
• Extend life of console by offering add-ons

PlayStation 3 Development
• Released in 2006
o Estimated production cost: $805.85 for the 20 GB model and $840.35 for the 60 GB model
o For the first 2.5 years, Sony lost $306 (20 GB) or $241 (60 GB) per console
• Sony cut production cost by 70% in 2009
o Shrinking process technology
o Removing the Emotion Engine
o Blu-ray Disc diodes became cheaper to manufacture
• Current Cell CPU at 65nm, GPU at 45nm
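The per-console loss figures on this slide can be cross-checked with simple arithmetic. The sketch below assumes US launch retail prices of $499 (20 GB) and $599 (60 GB); those prices are not given in the slides and are an assumption, as is the reading of "cost" as production cost.

```python
# Production cost per console in USD, as quoted on the slide.
production_cost = {"20 GB": 805.85, "60 GB": 840.35}

# ASSUMPTION (not in the slides): US launch retail prices in USD.
launch_price = {"20 GB": 499.00, "60 GB": 599.00}


def loss_per_console(model: str) -> float:
    """Return production cost minus retail price, rounded to cents."""
    return round(production_cost[model] - launch_price[model], 2)


for model in production_cost:
    print(f"{model}: loss of ${loss_per_console(model):.2f} per console")
```

Under these assumed prices the differences come out close to the $306 and $241 quoted above.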

PlayStation 3 Sales

Region           Units sold                               First available
Canada           1.5 million as of October 6, 2010        November 17, 2006
Europe           16 million as of August 17, 2010         March 23, 2007
Japan            6,341,950 as of April 1, 2011            November 11, 2006
United Kingdom   3 million as of January 26, 2010         March 23, 2007
United States    13.5 million as of November 11, 2010     November 17, 2006
Worldwide        55.5 million as of September 30, 2011    November 11, 2006

Source: Wikipedia

PlayStation 3 Specs
• CPU: Cell Processor
• GPU: RSX at 550MHz
• Full HD support (up to 1080p)
• Memory:
o 256MB XDR Main RAM @3.2GHz
o 256MB GDDR3 VRAM @700MHz
• 10/100/1000Base-T Ethernet Adapter
• Wi-Fi: IEEE 802.11 b/g

PlayStation 3 Architecture Overview

Source : http://www.cg.tuwien.ac.at/events/EG06/gmgfiles/perthuis-talkeg2006.ppt

Architecture Outline
• CPU Architecture
o Power Processing Element (PPE)
o Synergistic Processing Element (SPE)
• RSX GPU Architecture
• Memory Architecture
o XDR DRAM
o XIO

CPU Architecture
• PowerPC-based core @3.2GHz
o 512KB L2 cache
o VMX (aka Altivec) ISA support
• 7 x SPE @3.2GHz
o 7 x 128 128-bit SIMD GPRs
o 7 x 256KB SRAM for SPE
o 1 of 8 SPEs reserved for redundancy
o Total floating point performance: 218 GFLOPS
o 1 VMX vector unit (Altivec) per SPE

CPU Architecture

Source:http://www.ibm.com/developerworks/power/library/pa-cellperf/

POWER Processing Element (PPE)
• Handles most of the workload of all the processors
• Dual-issue, in-order processor with dual-thread support (4 fetch, 2 issue)
• 2 instructions issued per cycle
• 64kB L1 (instruction + data), 512kB L2 cache
• Includes VMX (aka Altivec) ISA

POWER Processing Element (PPE)

• Direct Memory Access to and from main memory
• Super-scalar with deep 2-way pipeline
• Delayed-execution pipeline
• Limited out-of-order execution of load instructions

Source:http://www.unixer.de/publications/img/22c3_slides.pdf

PPE Pipeline

Source:http://www.ibm.com/developerworks/power/library/pa-cellperf/

Synergistic Processing Elements (SPEs)
• Clocked at 3.2 GHz (25.6 GFLOPS of theoretical single-precision performance)
• Based on the pervasively data parallel computing (PDPC) architecture (wide datapaths for throughput)
• Compute engine with SIMD support
• 256KB embedded SRAM for instructions and data ("Local storage")
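As a rough check of the per-SPE peak at 3.2 GHz, the sketch below assumes four 32-bit lanes per 128-bit SIMD register and one fused multiply-add per lane per cycle, counted as two floating-point operations. Both assumptions come from general knowledge of the Cell design, not from the slides.

```python
CLOCK_GHZ = 3.2       # SPE clock, from the slide

# ASSUMPTIONS (not stated in the slides):
SIMD_LANES = 4        # 32-bit single-precision lanes in a 128-bit register
FLOPS_PER_LANE = 2    # one fused multiply-add counted as two operations


def spe_peak_gflops(clock_ghz: float = CLOCK_GHZ) -> float:
    """Theoretical single-precision peak of one SPE in GFLOPS."""
    return clock_ghz * SIMD_LANES * FLOPS_PER_LANE


print(f"Peak per SPE: {spe_peak_gflops():.1f} GFLOPS")
```

Under these assumptions one SPE peaks at 25.6 GFLOPS in single precision.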

SPE Block diagram

Source:http://www.ibm.com/developerworks/power/library/pa-cellperf/

SPE continued...
• Can compute 16 8-bit integers, 8 16-bit integers, 4 32-bit integers or 4 single precision floating-point numbers in one cycle
• Usually used for small programs (i.e. threads)
• Floating point and fixed point units are on even pipeline, rest on odd pipeline
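The element counts on this slide follow from dividing a 128-bit SIMD register into equal lanes. A minimal sketch of that division, where `REGISTER_BITS` and `lanes` are illustrative names and not part of any Cell SDK:

```python
REGISTER_BITS = 128  # width of one SPE SIMD register, per the CPU Architecture slide


def lanes(element_bits: int) -> int:
    """Number of elements of the given width that fit in one register."""
    if element_bits <= 0 or REGISTER_BITS % element_bits != 0:
        raise ValueError("element width must evenly divide the register width")
    return REGISTER_BITS // element_bits


for bits in (8, 16, 32):
    print(f"{bits:>2}-bit elements per register: {lanes(bits)}")
```

A 32-bit single-precision float occupies the same lane width as a 32-bit integer, which is why both give 4 elements per operation.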

RSX GPU
• Developed jointly by NVidia and Sony specifically for the PS3
• Same architecture as GeForce 7800 GTX
• Rendering to both local and system memory

Source:http://www.spuify.co.uk/?p=645

RSX GPU
• Over 300 million transistors on 8-layer 90nm process (currently on 40nm)
• Connected to the Cell by a 35GB/s link (20GB/s write, 15GB/s read)
• Multi-way programmable parallel floating point shader pipeline
• Made of 24 parallel pixel-shader ALU pipes and 8 parallel vertex pipelines at 550MHz

RSX GPU
• 5 ALU operations per pipeline, per cycle (2 vectors, 2 scalar/dual/co-issue and fog ALU, 1 Texture ALU) for pixel shaders
• 27 floating-point operations per pipeline, per cycle for pixel shaders
• 2 ALU operations per pipeline, per cycle (1 vector4 and 1 scalar, dual issue) and 10 floating-point operations per pipeline, per cycle for parallel vertex pipelines
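The per-pipeline figures can be combined with the pipeline counts and the 550 MHz clock from the RSX slides to estimate a theoretical shader throughput. This is a sketch of the arithmetic only; it assumes every pipeline issues its maximum operation count on every cycle, which real workloads do not achieve.

```python
CLOCK_MHZ = 550               # RSX clock, from the slides

PIXEL_PIPES = 24              # parallel pixel-shader pipelines
PIXEL_FLOPS_PER_CYCLE = 27    # floating-point ops per pixel pipeline per cycle

VERTEX_PIPES = 8              # parallel vertex pipelines
VERTEX_FLOPS_PER_CYCLE = 10   # floating-point ops per vertex pipeline per cycle

# Integer arithmetic in MFLOPS keeps the results exact.
pixel_mflops = PIXEL_PIPES * PIXEL_FLOPS_PER_CYCLE * CLOCK_MHZ
vertex_mflops = VERTEX_PIPES * VERTEX_FLOPS_PER_CYCLE * CLOCK_MHZ
total_gflops = (pixel_mflops + vertex_mflops) / 1000

print(f"Pixel shaders:  {pixel_mflops / 1000:.1f} GFLOPS")
print(f"Vertex shaders: {vertex_mflops / 1000:.1f} GFLOPS")
print(f"Total:          {total_gflops:.1f} GFLOPS")
```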

RSX GPU
• 256MB GDDR3 RAM at 700 MHz
• 128-bit memory bus width, with a read and write bandwidth of 22.4 GB/s
• FlexIO provides bandwidth of 62.4 GB/s (36.4 GB/s outbound, 26 GB/s inbound) at 2.6 GHz

Source:http://www.spuify.co.uk/?p=645
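The 22.4 GB/s video memory bandwidth is consistent with a 128-bit bus at 700 MHz if data moves on both clock edges. The sketch below makes that double-data-rate assumption explicit; the slides give only the bus width, the clock, and the resulting bandwidth.

```python
BUS_WIDTH_BITS = 128      # memory bus width, from the slide
CLOCK_MHZ = 700           # memory clock, from the slide

# ASSUMPTION (not stated in the slides): two transfers per clock (double data rate).
TRANSFERS_PER_CLOCK = 2


def bandwidth_mb_per_s() -> int:
    """Peak bandwidth in MB/s: bytes per transfer times transfers per second."""
    bytes_per_transfer = BUS_WIDTH_BITS // 8
    return bytes_per_transfer * CLOCK_MHZ * TRANSFERS_PER_CLOCK


print(f"Peak GDDR3 bandwidth: {bandwidth_mb_per_s() / 1000:.1f} GB/s")
```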

PS3 Memory Architecture

Rambus XDR Memory Architecture
• 3 Primary Semiconductor Components
o XDR DRAM
o XIO
o XMC

Purpose
• To be effective in small, high bandwidth consumer systems, high-performance memory applications, and high-end GPUs.

Source : http://www.rambus.com/us/technology/solutions/xdr/index.html

XDR DRAM
Extreme Data Rate Dynamic Random Access Memory
• CMOS DRAM organized as 32M words by 16 bits
• Bi-directional Differential RSL (DRSL)
• Eight banks: bank-interleaved transactions at full bandwidth
• Capable of sustained data transfers of 8000/6400/4800 MB/s
• Dynamic request scheduling
• Early-read-write support
• Zero overhead refresh
Source: http://www.rambus.com/us/technology/solutions/xdr/xdr_dram.html
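The device organization and transfer rates on this slide can be tied together with a short calculation. The sketch assumes per-pin data rates of 4000, 3200 and 2400 Mbit/s, which are not given in the slides but reproduce the three quoted bandwidths. The device count for the PS3's 256MB of XDR is likewise an inference, not a figure from the source.

```python
WORDS = 32 * 2**20        # 32M words, from the slide
WORD_BITS = 16            # 16 bits per word, from the slide

# ASSUMPTION (not in the slides): per-pin data rates in Mbit/s.
PIN_RATES_MBIT = (4000, 3200, 2400)


def device_capacity_mib() -> int:
    """Capacity of one XDR DRAM device in MiB."""
    return WORDS * WORD_BITS // 8 // 2**20


def device_bandwidth_mb_per_s(pin_rate_mbit: int) -> int:
    """Peak bandwidth of one 16-bit-wide device in MB/s."""
    return pin_rate_mbit * WORD_BITS // 8


print(f"Capacity per device: {device_capacity_mib()} MiB")
for rate in PIN_RATES_MBIT:
    print(f"{rate} Mbit/s per pin -> {device_bandwidth_mb_per_s(rate)} MB/s")

# Inference: number of such devices needed for the 256MB of main RAM.
print(f"Devices for 256MB: {256 // device_capacity_mib()}")
```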

XIO Controller IO Cell
• High performance, low-latency controller interface
• Can support bandwidths of up to 28.8 GB/s
• Composition
o One or two 12-bit request bus blocks (RQ)
o One control block (CTL)
o A variable number of 8- or 9-bit data blocks (DQ)
Source: http://www.rambus.com/us/technology/solutions/xdr/xdr_controller.html

XDR Memory Controller (XMC)
• Configurable soft macro reference design
• Flexible with integration
o Direct integration
o Use as a reference design

• Flexible to accommodate a wide variety of expected DRAM configurations

• Supports Dynamic Point-to-Point

I/O Architecture

Flex I/O Interface
• Unidirectional 8-bit wide point-to-point path (12 lanes)
o 5 inbound and 7 outbound
• Peak bandwidth 62.4GB/s (36.4 GB/s outbound, 26GB/s inbound) at 2.6 GHz
• Can be clocked independently
• 4 inbound and 4 outbound lanes support memory coherency

IO: Blu-ray, HDD, USB, Memory Cards, Gigabit Ethernet
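The Flex I/O totals can be checked against the lane counts. The sketch below derives a per-lane rate from the quoted figures and confirms that the outbound and inbound directions agree. The per-lane rate and the resulting two bytes per clock are inferences from the slide's numbers, not values stated in the source.

```python
OUTBOUND_LANES = 7
INBOUND_LANES = 5
OUTBOUND_MB_PER_S = 36_400   # 36.4 GB/s, from the slide
INBOUND_MB_PER_S = 26_000    # 26 GB/s, from the slide
CLOCK_MHZ = 2_600            # 2.6 GHz, from the slide

# Derived per-lane rates; integer MB/s keeps the arithmetic exact.
per_lane_out = OUTBOUND_MB_PER_S // OUTBOUND_LANES
per_lane_in = INBOUND_MB_PER_S // INBOUND_LANES

total_mb_per_s = (OUTBOUND_LANES + INBOUND_LANES) * per_lane_out
bytes_per_clock = per_lane_out // CLOCK_MHZ  # inferred, not from the slides

print(f"Per-lane rate: {per_lane_out} MB/s outbound, {per_lane_in} MB/s inbound")
print(f"Total over 12 lanes: {total_mb_per_s / 1000:.1f} GB/s")
print(f"Implied bytes per clock per 8-bit lane: {bytes_per_clock}")
```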

High Level Architecture Comparison

PC and Multi-Core PC
• Memory is cached
• As long as synchronization primitives are used to avoid race conditions, the system takes care of getting the right data

Wii
• Two types of memory, both accessible by CPU and GPU
• A portion of the L1 cache could be locked and explicitly managed by DMA transfers

Source: http://beautifulpixels.blogspot.com/2008/08/multi-platform-multi-core-architecture.html
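To illustrate the PC point about synchronization primitives, here is a minimal sketch using Python's standard `threading.Lock`. It is a generic shared-memory example, not code from any console SDK. The names `add` and `run` are hypothetical.

```python
import threading

counter = 0
lock = threading.Lock()


def add(increments: int) -> None:
    """Increment the shared counter, holding the lock for each update."""
    global counter
    for _ in range(increments):
        with lock:  # the primitive that prevents a lost update
            counter += 1


def run(num_threads: int = 4, increments: int = 10_000) -> int:
    """Start the workers, wait for them, and return the final count."""
    global counter
    counter = 0
    threads = [threading.Thread(target=add, args=(increments,))
               for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter


print(run())
```

With the lock in place, the cached shared memory model needs no explicit data movement. On the PS3, by contrast, data has to be transferred into each SPU's dedicated memory.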

High Level Architecture Comparison (Cont.)

Xbox 360
• Multiple hardware threads per core
• Single memory used by CPU and GPU
• GPU is the memory controller and has access to L2

PS3
• Cell processor
• Series of co-processors named SPUs that have dedicated memory for instructions and data
• Why Floodgate was built

Source: http://beautifulpixels.blogspot.com/2008/08/multi-platform-multi-core-architecture.html

What's Next?
• PS3 > Xbox 360
• Sony wants to lead the next generation
o PS3 was the last to be released and is behind in sales
• Market is very different
o Integration with other technologies such as social media
o Customers want more than a gaming system
o Moving away from the traditional controller
• PS4 soon?

Questions?

