Contents of the Lecture
1. Introduction
2. Methods for I/O Operations
3. Buses
4. Liquid Crystal Displays
5. Other Types of Displays
6. Graphics Adapters
7. Optical Discs
12/21/2017 Input/Output Systems and Peripheral Devices (06-1)
6. Graphics Adapters
    Structure of a Graphics Adapter
    Video Memory
    Graphics Accelerators
    3D Accelerators
    Graphics Processing Units
    Digital Interfaces for Monitors
Structure of a Graphics Adapter (1)
Structure of a Graphics Adapter (2)
Graphics Controller
    Implements the main functions of the graphics adapter
System Bus Interface
    Transfers in burst mode
    Transfers with no wait states when reading the video memory
    FIFO memory for efficient writes to the video memory
Structure of a Graphics Adapter (3)
Video Memory Interface
    Allows the video images to be updated
VGA Registers and Control Registers
    Enable programming of the video adapter for operation in VGA modes
    Some adapters are no longer compatible with the VGA standard
Cursor Generator
Graphic Functions
    Implemented by graphics accelerators
Structure of a Graphics Adapter (4)
Video BIOS
    Provides video functions for access to the graphics adapter
    The BIOS programs of different adapters are different, which makes programming difficult
    VESA (Video Electronics Standards Association): standard for high-resolution BIOS functions
Video Memory
    Holds the video image (frame buffer)
Structure of a Graphics Adapter (5)
RAMDAC Circuit (RAMDAC: RAM Digital-to-Analog Converter)
    Reads the digital image and converts it into analog signals
    The RAMDAC functions may be integrated into the graphics controller
    Only required for displays with analog inputs
    Displays that operate in the digital domain reconvert the analog signals to digital form
Structure of a Graphics Adapter (6)
Video Ports
    Enable the transfer of video images to a monitor
    There are several variants of video ports
    VGA (Video Graphics Array)
        Analog interface
        Designed for CRT displays, but also used by some liquid crystal displays
        Electrical noise may occur
        15-pin DE-15 connector (often called DB-15)
Structure of a Graphics Adapter (7)
VIVO (Video In Video Out)
    Analog interface for connecting to TV sets, DVD players, game consoles (TV Out)
    Signals: S-Video (Y/C); composite video; component video (e.g., RGB)
    9-pin mini-DIN connector
DVI (Digital Visual Interface)
    Digital interface
    DVI-I (digital and analog signals) or DVI-D (digital signals only) connector
Structure of a Graphics Adapter (8)
HDMI (High-Definition Multimedia Interface)
    Digital interface for uncompressed video data
    Allows digital audio data to be sent over the same cable
    19-pin (single-link) or 29-pin (dual-link) connector
DisplayPort
    Digital interface for video and audio data
    Intended to replace the VGA and DVI interfaces
    20-pin connectors for 1, 2, or 4 lanes
Video Memory (1)
Can be single-ported or dual-ported
Single-ported video memory
    The data port is used both to refresh the screen and to write new data
Dual-ported video memory
    One of the ports is used to update the images in memory
    The second port has serial access and is used to refresh the images on the screen
Video Memory (2)
Video Memory Transfer Rate
    The maximum transfer rate is the memory bandwidth
    Affected by the video memory technology and access time
    Bandwidth has to be shared by: screen refreshing circuits, CPU, graphics controller
    30 .. 50% of the bandwidth should be reserved for functions other than refreshing
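The bandwidth sharing above can be illustrated with a short calculation. This is only a sketch: the display mode (1920 x 1080, 32 bits per pixel, 60 Hz) and the function names are assumptions chosen for illustration, not values from the lecture.

```python
def refresh_bandwidth_mb_s(width, height, bits_per_pixel, refresh_hz):
    """Bandwidth needed only to scan the frame buffer out to the screen, in MB/s."""
    bytes_per_frame = width * height * bits_per_pixel // 8
    return bytes_per_frame * refresh_hz / 1e6


def required_total_bandwidth_mb_s(refresh_mb_s, reserved_fraction):
    """Total bandwidth so that `reserved_fraction` remains for functions other than refreshing."""
    return refresh_mb_s / (1.0 - reserved_fraction)


# Assumed example mode: 1920 x 1080, 32 bits/pixel, 60 Hz.
refresh = refresh_bandwidth_mb_s(1920, 1080, 32, 60)
print(refresh)                                       # refresh traffic alone
print(required_total_bandwidth_mb_s(refresh, 0.3))   # 30% reserved
print(required_total_bandwidth_mb_s(refresh, 0.5))   # 50% reserved
```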
Video Memory (3)
DDR-400 (PC3200) memory
    Maximum transfer rate: 3,200 MB/s
    Average transfer rate: ~1,600 MB/s
DDR2-667 (PC2-5300) memory
    Maximum transfer rate: 5.336 GB/s
DDR3-2133 (PC3-17000) memory
    Maximum transfer rate: 17 GB/s
DDR4-3200 (PC4-25600) memory
    Maximum transfer rate: 25.6 GB/s
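The maximum transfer rates listed above can be reproduced from the data rate in the memory name. The sketch below assumes a 64-bit (8-byte) module data bus, which the slide does not state; the function name is hypothetical.

```python
def peak_transfer_rate_mb_s(mega_transfers_per_s, bus_width_bits=64):
    """Peak rate = transfers per second x bytes per transfer (64-bit bus assumed)."""
    return mega_transfers_per_s * bus_width_bits // 8


for name, rate in [("DDR-400", 400), ("DDR2-667", 667),
                   ("DDR3-2133", 2133), ("DDR4-3200", 3200)]:
    print(name, peak_transfer_rate_mb_s(rate), "MB/s")
```

The DDR2-667 and DDR3-2133 results differ slightly from the rounded module names (PC2-5300, PC3-17000), which are marketing designations.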
Video Memory (4)
GDDR (Graphics Double Data Rate)
    Designed by ATI Technologies in collaboration with the JEDEC committee
    Several versions: GDDR2 .. GDDR5
        GDDR2 and GDDR3: based on DDR2 technology
        GDDR4 and GDDR5: based on DDR3 technology
    Low voltage: 1.8 V .. 1.5 V, which reduces power consumption and heat output
    Separate data strobe signals for read and write
Video Memory (5)
GDDR5
    Combines high performance with stable operation and low implementation costs
    Memory organization: x32
    Differential command clock signal (CK, CK#)
    Two differential write clock signals (WCK, WCK#)
        Two data bytes are aligned to one WCK signal
    Example for a data rate of 5 Gbits/s:
        fCK = 1.25 GHz; fWCK = 2.5 GHz
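The example frequencies above are consistent with data being transferred on both edges of WCK, and with WCK running at twice the CK frequency. The sketch below encodes that relationship as an assumption inferred from the slide's numbers; the function name is hypothetical.

```python
def gddr5_clocks_ghz(data_rate_gbits_per_pin):
    """Return (fCK, fWCK) in GHz for a given per-pin data rate in Gbits/s.

    Assumption: data rate = 2 x fWCK (double data rate on WCK) and fWCK = 2 x fCK.
    """
    f_wck = data_rate_gbits_per_pin / 2
    f_ck = f_wck / 2
    return f_ck, f_wck


print(gddr5_clocks_ghz(5.0))  # the slide's example data rate
```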
Video Memory (6)
Data bus inversion
    Reduces the number of zero bits transmitted
    Indicated with a DBI# signal for each byte
    Transmission lines have high-level termination, so power dissipation is reduced
Address bus inversion
Signal training
    Phase adjustment of clock, data, and address signals
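The data bus inversion described on this slide can be sketched as follows. The threshold used here (invert a byte when it contains more than four zero bits) and the active-low meaning of DBI# are assumptions based on the commonly described scheme, not details given in the lecture; the function names are hypothetical.

```python
def dbi_encode(byte):
    """Return (lines, dbi_n): the 8 data lines as driven, and the DBI# level.

    Assumed rule: if the byte has more than four zero bits, send it inverted
    and drive DBI# low (0) to flag the inversion.
    """
    byte &= 0xFF
    zeros = 8 - bin(byte).count("1")
    if zeros > 4:
        return (~byte) & 0xFF, 0
    return byte, 1


def dbi_decode(lines, dbi_n):
    """Recover the original byte from the data lines and the DBI# level."""
    return (~lines) & 0xFF if dbi_n == 0 else lines & 0xFF


def zeros_on_wires(lines, dbi_n):
    """Count low levels across the 8 data lines plus the DBI# line."""
    return (8 - bin(lines).count("1")) + (1 - dbi_n)


print(dbi_encode(0x01))  # mostly zeros, so it is sent inverted
print(dbi_encode(0xF0))  # exactly four zeros, sent unchanged
```

With this rule, at most four of the nine lines are low for any byte, which is what reduces power dissipation on lines terminated to the high level.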
Video Memory (7)
Address training: alignment of the address bus to the CK clock signal
Alignment of the WCK signal to the CK signal
Data training: alignment of the data lines to the corresponding WCK signal
A “hidden” data re-training is possible
Calibration: improves the reliability
    Auto-calibration: drive strength, termination impedance
    Software-controlled adjustment
Video Memory (8)
Burst read/write access to the internal memory: 8 bits/pin, 256 bits in total (two CK cycles)
Maximum transfer rates of 4 .. 7 Gbits/s per pin
    16 .. 28 GB/s for 32 pins
Error detection
    Dedicated EDC (Error Detection Code) pins for sending CRC codes to the controller
    CRC code: for each data byte + DBI# line
    Allows single-bit and double-bit errors to be detected
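The burst size and transfer rates quoted on this slide follow from simple arithmetic over the 32 data pins. A minimal sketch (function names are hypothetical):

```python
def burst_bits(bits_per_pin=8, data_pins=32):
    """Bits moved by one burst across all data pins."""
    return bits_per_pin * data_pins


def device_bandwidth_gb_s(gbits_per_pin, data_pins=32):
    """Aggregate transfer rate in GB/s for a given per-pin rate in Gbits/s."""
    return gbits_per_pin * data_pins / 8


print(burst_bits())                                       # bits per burst
print(device_bandwidth_gb_s(4), device_bandwidth_gb_s(7)) # GB/s range
```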
Video Memory (9)
Power management
    Features that allow power to be consumed only when it is needed
    Scalable clock frequency and data rate: 5 Gbits/s .. 200 Mbits/s
    Low power mode for the DRAM core
    Multiple levels for termination impedance: the impedance is increased at slower data rates
    Low supply voltage: 1.5 V
    Data and address bus inversion
Graphics Accelerators (1)
Contain specialized circuits to execute the mathematical operations required for graphics rendering
Release the CPU from the task of executing these operations
The first graphics accelerators: AVGA (Accelerated VGA) adapters
Subsequent graphics accelerators: 2D accelerators
The link between the accelerator circuitry and the OS is made via a driver
Graphics Accelerators (2)
Common 2D graphics functions:
    BitBlt (Bit Block Transfer)
        Two bitmaps are combined with a raster operation (a Boolean operator)
        The result is transferred to the destination area
        Blitter: dedicated circuit for the BitBlt operation
    Tracing lines, drawing rectangles, circles
    Filling surfaces or polygons
    Adding color
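The BitBlt operation listed above can be sketched in software. In this illustration, bitmaps are lists of rows of 8-bit pixel values and the raster operation is any two-argument Boolean function; the function name and argument order are assumptions, not a real graphics API.

```python
import operator


def bitblt(dest, src, dest_x, dest_y, raster_op=operator.xor):
    """Combine `src` with the matching area of `dest` and store the result in `dest`.

    `raster_op(source_pixel, dest_pixel)` is the Boolean raster operation.
    """
    for row_offset, src_row in enumerate(src):
        for col_offset, src_pixel in enumerate(src_row):
            y, x = dest_y + row_offset, dest_x + col_offset
            dest[y][x] = raster_op(src_pixel, dest[y][x]) & 0xFF
    return dest


screen = [[0xFF] * 4 for _ in range(3)]   # 4 x 3 destination bitmap
sprite = [[0x0F, 0xF0]]                   # 2 x 1 source bitmap
bitblt(screen, sprite, 1, 1, operator.and_)
print(screen)
```

A hardware blitter performs the same loop without involving the CPU.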
Graphics Accelerators (3)
Multimedia accelerators: graphics accelerators extended with audio and video acceleration functions
Functions:
    Decoding audio data streams
    Scaling video images in the x and y directions
    Converting digital video signals into RGB signals
    Decompressing video images represented in various formats
3D Accelerators
    3D Images
    3D Operations
3D Images (1)
Are managed using abstract models
An object is represented as a set of points defined by their x, y, and z coordinates, which give the positions of its vertices
If the object's vertices are connected with lines, surfaces are obtained; these can be filled with a certain color or texture
Each 3D object is composed of a large number of triangles (or polygons) that describe its surface
3D Images (2)
Animated 3D graphics require a series of geometry computations that define the position of objects in 3D space
The geometry computations that handle the vertices of triangles can be performed by the CPU or by the graphics processor
The graphics processor must convert these triangles into solid surfaces, so intensive computations are needed
3D Images (3)
In the real world, objects interact with each other
Complex mathematical equations are used to determine whether an object is visible in a scene from a given angle
Besides the color components, an alpha value must also be stored for each pixel
    Indicates the degree of transparency of the pixel in the final image
3D Images (4)
Another piece of information that must be stored: the depth in space, or z coordinate
    The accelerator determines the z value of the objects’ pixels in a plane and displays those with a smaller z value
    The pixels’ depth information is stored in a separate buffer, the z-buffer
    Usually, 32 bits are allocated in the z-buffer for each pixel
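The depth test described above can be sketched as follows. The buffers are plain nested lists and the function names are hypothetical; a real accelerator performs this comparison in hardware for every pixel it draws.

```python
import math


def make_buffers(width, height, background=(0, 0, 0)):
    """Create a color buffer and a z-buffer initialized to 'infinitely far'."""
    color = [[background] * width for _ in range(height)]
    depth = [[math.inf] * width for _ in range(height)]
    return color, depth


def plot(color, depth, x, y, z, rgb):
    """Draw the pixel only if it is nearer (smaller z) than what is already stored."""
    if z < depth[y][x]:
        depth[y][x] = z
        color[y][x] = rgb
        return True
    return False


color, depth = make_buffers(4, 4)
plot(color, depth, 1, 1, 10.0, (255, 0, 0))   # far red pixel is drawn
plot(color, depth, 1, 1, 5.0, (0, 255, 0))    # nearer green pixel replaces it
plot(color, depth, 1, 1, 8.0, (0, 0, 255))    # farther blue pixel is rejected
print(color[1][1], depth[1][1])
```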
3D Images (5)
Each time the image is updated, the color and depth of pixels must be recomputed
Applying different 3D computations to the scene is the process of rendering
    Fills in all of the points on the surface of an object that was previously stored only as a set of vertices
    A solid object with 3D effects will be drawn on the monitor
3D Operations (1)
3D operations are performed in two stages:
    Geometry stage: clipping, transformation, lighting
    Rendering stage: shading, texture mapping with the perspective effect added, texture filtering, alpha blending
On current 3D accelerators, operations in both stages are performed by the graphics processor
3D Operations (2)
3D Operations (3)
Clipping
    Determines what part of an object is visible on the screen
    Eliminates the parts that are not visible
Lighting
    The appearance of objects is determined by the light sources in the scene
    Lighting effects create color shading, light reflection, shadows
3D Operations (4)
Transformation
    Translation: moving every point by a fixed distance in the same direction
    Reflection: transforming an object into its mirror image
    Glide reflection: combining a reflection with a translation along the reflection axis
    Scaling: linear transformation that changes the size of objects
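The transformations listed above can be sketched on individual points. For brevity the example uses 2D points and the x axis as the reflection axis; both are illustrative assumptions, and the function names are hypothetical.

```python
def translate(point, offset):
    """Move a point by a fixed offset."""
    return tuple(p + o for p, o in zip(point, offset))


def reflect_about_x_axis(point):
    """Mirror a 2D point about the x axis."""
    x, y = point
    return (x, -y)


def glide_reflect_about_x_axis(point, distance):
    """Reflect about the x axis, then translate along that same axis."""
    return translate(reflect_about_x_axis(point), (distance, 0))


def scale(point, factor):
    """Uniform scaling about the origin."""
    return tuple(p * factor for p in point)


triangle = [(0, 0), (2, 0), (0, 1)]
print([translate(p, (3, 4)) for p in triangle])
print([glide_reflect_about_x_axis(p, 5) for p in triangle])
print([scale(p, 2) for p in triangle])
```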
3D Operations (5)
Tessellation
    Dividing polygons into smaller structures for rendering
    Dividing into triangles: triangulation
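One simple way to triangulate a polygon is a triangle fan, sketched below. It is valid only for convex polygons, which is an assumption of this example; the lecture does not name a specific triangulation method.

```python
def fan_triangulate(vertices):
    """Split a convex polygon (vertices in order) into triangles sharing vertex 0."""
    if len(vertices) < 3:
        raise ValueError("a polygon needs at least three vertices")
    first = vertices[0]
    return [(first, vertices[i], vertices[i + 1])
            for i in range(1, len(vertices) - 1)]


square = [(0, 0), (1, 0), (1, 1), (0, 1)]
print(fan_triangulate(square))
```

A polygon with n vertices yields n - 2 triangles.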
3D Operations (6)
Shading
    Enables the realistic representation of 3D objects on 2D screens
    Algorithms: Gouraud, Phong
    Reading the color information of vertices
    Interpolating the intensities for the color components
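The two steps above (reading vertex colors, then interpolating the color components) correspond to Gouraud shading. The sketch below interpolates with barycentric weights inside a 2D screen-space triangle; this formulation and the function names are assumptions chosen for illustration.

```python
def barycentric_weights(p, a, b, c):
    """Weights of triangle vertices a, b, c at point p (all 2D points)."""
    (px, py), (ax, ay), (bx, by), (cx, cy) = p, a, b, c
    det = (by - cy) * (ax - cx) + (cx - bx) * (ay - cy)
    w_a = ((by - cy) * (px - cx) + (cx - bx) * (py - cy)) / det
    w_b = ((cy - ay) * (px - cx) + (ax - cx) * (py - cy)) / det
    return w_a, w_b, 1.0 - w_a - w_b


def gouraud_color(p, vertices, colors):
    """Interpolate the RGB colors stored at the three vertices for pixel p."""
    weights = barycentric_weights(p, *vertices)
    return tuple(sum(w * color[k] for w, color in zip(weights, colors))
                 for k in range(3))


vertices = [(0, 0), (4, 0), (0, 4)]
colors = [(255, 0, 0), (0, 255, 0), (0, 0, 255)]
print(gouraud_color((2, 0), vertices, colors))   # halfway between the first two vertices
```

Phong shading interpolates surface normals instead of colors and evaluates the lighting per pixel.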
3D Operations (7)
Texture Mapping
    Adding surface details (textures) to the polygons that represent objects
        Loading the texture elements (texels) from a bitmap
        Combining the texels
        Writing the resulting pixel to video memory
    Applying a single texture
    Multi-texturing: a combination of textures is applied to an object
3D Operations (8)
Textures may require a large space in memory, so compression is used
Textures must be corrected to create the perspective effect
3D Operations (9)
Texture Filtering
    Reduces some unwanted effects that may occur with texture mapping
    The color of a new pixel is determined through interpolation between the colors of several texels in the original texture
    Bilinear filtering: uses the weighted average of the four texels nearest to the sample point
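Bilinear filtering as described above can be sketched for a single-channel texture. The sample position is given in texel coordinates and reads outside the texture are clamped to the edge; both conventions are assumptions of this example.

```python
import math


def bilinear_sample(texture, u, v):
    """Weighted average of the four texels surrounding texel-space point (u, v)."""
    height, width = len(texture), len(texture[0])
    x0, y0 = int(math.floor(u)), int(math.floor(v))
    fx, fy = u - x0, v - y0          # fractional distances used as weights

    def texel(x, y):
        # Clamp coordinates so that edge texels are repeated outside the texture.
        return texture[min(max(y, 0), height - 1)][min(max(x, 0), width - 1)]

    top = texel(x0, y0) * (1 - fx) + texel(x0 + 1, y0) * fx
    bottom = texel(x0, y0 + 1) * (1 - fx) + texel(x0 + 1, y0 + 1) * fx
    return top * (1 - fy) + bottom * fy


texture = [[0, 100],
           [100, 200]]
print(bilinear_sample(texture, 0.5, 0.5))   # equal weights for all four texels
```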
3D Operations (10)
Trilinear Filtering
    The texture resolution is reduced when the distance to the object increases
    3D accelerators store in memory several variants of a texture (“MIP mapping”)
    Trilinear filtering combines this feature with bilinear filtering
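The combination of MIP mapping and bilinear filtering can be sketched as below. The example assumes a square, power-of-two, single-channel texture, normalized coordinates in [0, 1], and a fractional level-of-detail value `lod` supplied by the caller; all names are hypothetical.

```python
import math


def downsample(level):
    """Next MIP level: average each 2 x 2 block of texels."""
    size = len(level) // 2
    return [[(level[2 * y][2 * x] + level[2 * y][2 * x + 1] +
              level[2 * y + 1][2 * x] + level[2 * y + 1][2 * x + 1]) / 4
             for x in range(size)] for y in range(size)]


def build_mip_chain(texture):
    """Full-resolution texture followed by successively halved variants."""
    chain = [texture]
    while len(chain[-1]) > 1:
        chain.append(downsample(chain[-1]))
    return chain


def bilinear(level, u, v):
    """Bilinear sample of one level at normalized coordinates u, v in [0, 1]."""
    h, w = len(level), len(level[0])
    x, y = u * (w - 1), v * (h - 1)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = level[y0][x0] * (1 - fx) + level[y0][x1] * fx
    bottom = level[y1][x0] * (1 - fx) + level[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def trilinear(chain, u, v, lod):
    """Blend bilinear samples from the two MIP levels adjacent to `lod`."""
    lod = min(max(lod, 0.0), len(chain) - 1)
    low = int(math.floor(lod))
    high = min(low + 1, len(chain) - 1)
    t = lod - low
    return bilinear(chain[low], u, v) * (1 - t) + bilinear(chain[high], u, v) * t


checker = [[0, 100, 0, 100], [100, 0, 100, 0],
           [0, 100, 0, 100], [100, 0, 100, 0]]
chain = build_mip_chain(checker)
print(len(chain), trilinear(chain, 0.0, 0.0, 0.5))
```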
3D Operations (11)
Fogging
    Gradually fading objects in the distance
    The scene will appear more realistic through the illusion of distant objects
    Allows the 3D processing to be performed faster
Alpha Blending
    Used to create the transparency effect for some objects (e.g., windows)
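Both effects on this slide reduce to mixing two colors with a weight. The sketch below uses the common "source over destination" blend and a linear fog factor; the specific formulas and the names `fog_start` and `fog_end` are assumptions for illustration, not details from the lecture.

```python
def alpha_blend(source, destination, alpha):
    """Mix two RGB colors; alpha = 1 is fully opaque source, 0 is fully transparent."""
    return tuple(alpha * s + (1 - alpha) * d for s, d in zip(source, destination))


def linear_fog(color, fog_color, distance, fog_start, fog_end):
    """Fade `color` toward `fog_color` as distance grows from fog_start to fog_end."""
    visibility = (fog_end - distance) / (fog_end - fog_start)
    visibility = min(max(visibility, 0.0), 1.0)
    return alpha_blend(color, fog_color, visibility)


window = (200, 0, 0)
background = (0, 0, 100)
print(alpha_blend(window, background, 0.5))
print(linear_fog((255, 255, 255), (128, 128, 128), 75.0, 50.0, 100.0))
```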
3D Operations (12)
Anti-Aliasing
    Oblique lines are approximated by combining vertical segments with horizontal segments, so the aliasing effect occurs
    Removing this effect (“anti-aliasing”):
        Changing the color of pixels near the outlines
        The background color is gradually mixed with the object's color
    The clarity of the outlines is reduced
Graphics Processing Units
    Overview
    GPGPU Computing
    The CUDA Architecture
    The NVIDIA GP100 GPU
Overview (1)
GPU: Graphics Processing Unit
    Dedicated graphics processors for PCs, workstations, and game consoles
    Initially used to accelerate the rendering stage for 3D graphics (e.g., texture mapping)
    Later also used to accelerate the geometric computations (rotation, translation)
GPUs contain shader units, modules for texture mapping, anti-aliasing, etc.
Overview (2)
Vertex shader units
    Transform the 3D position of each vertex into 2D screen coordinates and a depth value for the z-buffer
    Modify the attributes of vertices: position, color, texture coordinates
Geometry shader units
    Generate geometric figures or add volumetric details to objects
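The vertex transformation described above can be sketched with a simple perspective projection. The camera model (camera at the origin looking along +z, a `focal` scale factor, a 640 x 480 viewport with y pointing down) is an assumption for illustration; real vertex shaders typically apply 4 x 4 matrices.

```python
def project_vertex(x, y, z, focal=1.0, width=640, height=480):
    """Map a camera-space vertex to (screen_x, screen_y, depth).

    Assumes z > 0 means 'in front of the camera'; the depth is kept for the z-buffer.
    """
    if z <= 0:
        raise ValueError("vertex is behind the camera")
    ndc_x = focal * x / z            # perspective divide: farther points move toward the center
    ndc_y = focal * y / z
    screen_x = (ndc_x + 1) * 0.5 * width
    screen_y = (1 - ndc_y) * 0.5 * height   # screen y grows downward
    return screen_x, screen_y, z


print(project_vertex(0, 0, 5))   # a point on the view axis lands at the screen center
print(project_vertex(2, 0, 4))
```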
Overview (3)
Pixel/fragment shader units
    Determine the color, z depth, and alpha value for each pixel or fragment
Unified shader units
    Programmable units
    Able to perform various shading operations (vertex, geometry, pixel)
    GPUs contain an array of computing units and a unit that distributes the operations to be performed
Overview (4)
The architecture with programmable units allows a more flexible use of the hardware resources
    The programmable units can also be used for other types of computations
    A flexible parallel architecture is obtained
GPUs also include modules for 2D acceleration, MPEG compression, high-definition video decoding
Overview (5)
GPUs can be dedicated or integrated
Dedicated GPUs
    Used in graphics cards interfaced with the motherboard via a PCI Express bus or an AGP (Accelerated Graphics Port) interface
    Have memory dedicated to the card's use
    Examples
        AMD Radeon HD 8xxxM (e.g., 8970M)
        NVIDIA GeForce GTX (e.g., GTX 1080)
Overview (6)
Integrated GPUs
    Are integrated into a chipset or processor
    Use a portion of the system memory
    Have lower performance compared to dedicated GPUs
    Examples
        Intel HD Graphics (e.g., UHD Graphics 630)
        AMD A10 APU (Accelerated Processing Unit) processor series
        NVIDIA GPUs in Tegra processors (K1, X1)
Overview (7)
The design of GPUs was influenced by the 2D and 3D programming interfaces
    GPUs implement API functions in hardware
OpenGL (Open Graphics Library)
    For various platforms and languages
    Functions to draw 3D scenes from primitives
Direct3D (component of DirectX)
    Only for Microsoft operating systems
    Low-level interface to the 3D hardware functions
Overview (8)
Technologies for connecting multiple GPUs on different graphics cards
NVIDIA: SLI (Scalable Link Interface)
    2 .. 4 identical graphics cards are connected via a motherboard (PCIe x16)
AMD: CrossFireX
    Up to 4 graphics cards can be connected
    The graphics cards do not have to be identical
    The cards have external connectors
GPGPU Computing (1)
GPGPU (General-Purpose computing on GPU)
The GPU processing cores provide massive FP computational power
    Example: a single NVIDIA GP100 GPU (3,584 cores) achieves 10.6 TFLOPS
The graphics pipeline can also be used for general-purpose applications
    The performance can be orders of magnitude higher than that of conventional CPUs
GPGPU Computing (2)
GPUs can process independent vertices and pixels/fragments, so they act as stream processors
    Stream: set of records that require similar computation
    Kernel function: applied to each element in the stream
    Shared memories cannot be used
Ideal GPGPU applications: large data sets, high parallelism, reduced dependencies
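The stream model above can be sketched on the CPU. Each record is processed by the same kernel function and no record depends on another, which is what makes the work easy to parallelize; the SAXPY kernel (a * x + y) is only an example chosen here, and the function names are hypothetical.

```python
def run_kernel(kernel, stream):
    """Apply the same kernel to every record; records are independent of each other."""
    return [kernel(record) for record in stream]


def saxpy_kernel(record, a=2.0):
    """Example kernel: compute a * x + y for one (x, y) record."""
    x, y = record
    return a * x + y


stream = [(1.0, 2.0), (3.0, 4.0), (5.0, 6.0)]
print(run_kernel(saxpy_kernel, stream))
```

On a GPU, the loop inside `run_kernel` is replaced by many hardware threads, each handling one record.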
GPGPU Computing (3)
Disadvantages of GPGPU computing:
    The programmer needs to be familiar with the graphics APIs and the GPU architecture
    Problems need to be expressed in terms of coordinates, textures, shader functions
    The need to use graphics APIs and shading languages: OpenGL, DirectX, Cg
API extensions for running some program functions on the GPU's processors: CUDA (NVIDIA), OpenCL (Khronos Group)
The CUDA Architecture (1)
CUDA (Compute Unified Device Architecture)
    Software and hardware architecture
    Enables GPUs to execute programs written in C, C++, Fortran, and OpenCL
    Allows Microsoft's DirectCompute API to be used
    Allows direct access to the GPU resources for general-purpose computing
    Exploits the GPU's capability to operate on large matrices in parallel
The CUDA Architecture (2)
A CUDA program calls kernel functions executed by threads
Threads are organized into blocks and groups of blocks (grids)
Thread block:
    Set of concurrent threads
    Communicate via a shared memory
    Each thread has an identifier, registers, private memory, inputs, outputs
The CUDA Architecture (3)
Grid of blocks:
    Group (array) of thread blocks
    The blocks execute the same kernel function
    Ensure synchronization between dependent kernel functions
    Results are shared in a global memory allocated to an application (global synchronization)
The CUDA Architecture (4)
The CUDA Architecture (5)
The hierarchy of threads is executed on a hierarchy of processors on the GPU
    Threads: executed by CUDA cores and other execution units
    Thread blocks: executed by a streaming multiprocessor (SM)
        Group of 32 threads: warp
    Grids of blocks: executed by the GPU
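The thread hierarchy can be emulated sequentially to show how each thread finds its own data element. This is a Python sketch of the indexing scheme only (one-dimensional grid and blocks assumed); it is not CUDA code, and `launch` is a hypothetical stand-in for a kernel launch.

```python
WARP_SIZE = 32  # threads per warp, as stated on the slide


def launch(kernel, grid_dim, block_dim, *args):
    """Run the kernel once per (block, thread) pair; a GPU would run these in parallel."""
    for block_idx in range(grid_dim):
        for thread_idx in range(block_dim):
            kernel(block_idx, block_dim, thread_idx, *args)


def vector_add(block_idx, block_dim, thread_idx, a, b, out):
    """Each thread computes one output element, selected by its global index."""
    i = block_idx * block_dim + thread_idx
    if i < len(out):              # threads past the end of the data do nothing
        out[i] = a[i] + b[i]


def warps_per_block(block_dim):
    """Number of 32-thread warps needed to cover one block."""
    return (block_dim + WARP_SIZE - 1) // WARP_SIZE


n, block_dim = 1000, 256
grid_dim = (n + block_dim - 1) // block_dim   # enough blocks to cover all elements
a, b, out = list(range(n)), [1] * n, [0] * n
launch(vector_add, grid_dim, block_dim, a, b, out)
print(grid_dim, warps_per_block(block_dim), out[:3], out[-1])
```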
The CUDA Architecture (6)
Unified Virtual Addressing (CUDA 4)
    Provides a single virtual memory address space for CPU and GPU memory
Unified Memory (CUDA 6)
    Part of the GPU physical memory is shared between the CPU and GPU
    CUDA software migrates data allocated in the unified memory between GPU and CPU
    The memory modified by the CPU should be synchronized with the GPU memory
The NVIDIA GP100 GPU (1)
Uses a new architecture, Pascal
    Previous architectures: Fermi, Kepler, Maxwell
Main features:
    15.3 billion transistors, 16 nm technology
    Uses NVIDIA's NVLink interconnect, with a bandwidth of up to 160 GB/s
    Integrates an HBM2 (High Bandwidth Memory 2) stacked memory, 16 GB .. 32 GB
    Improved power efficiency (performance/W)
The NVIDIA GP100 GPU (2)
A full implementation contains:
    Six Graphics Processing Cluster (GPC) units
    30 Texture Processing Cluster (TPC) units
    60 SM units, each with 64 CUDA cores and 4 texture units (3,840 cores and 240 texture units in total)
    Eight 512-bit memory controllers (4096-bit memory interface)
    Four HBM2 DRAM memory stacks
    Common L2 cache memory for the SM units
    Global scheduler: GigaThread Engine
The NVIDIA GP100 GPU (3)
The NVIDIA GP100 GPU (4)
Each SM unit contains:
    64 single-precision (SP) CUDA cores
        Integer arithmetic and logic unit
        Floating-point unit (IEEE 754-2008)
        Fused multiply-add instruction (FMA)
    32 double-precision (DP) units
    16 Load/Store units (LD/ST)
    16 special-function units (SFU) for transcendental functions
The NVIDIA GP100 GPU (5)
The NVIDIA GP100 GPU (6)
Threads are scheduled in groups of 32 (warps)
Each SM unit contains:
    Two warp schedulers, which provide increased performance and reduced power consumption
    Four instruction dispatch units
        From each warp, two instructions can be dispatched in each clock cycle
The NVIDIA GP100 GPU (7)
Memory subsystem
    Each SM unit contains an instruction cache memory
    There is a separate L1 data cache memory, which can also be used as a texture memory
    4096 KB of unified L2 cache memory: allows data to be shared between the SM units
    64 KB of shared memory
    The register files and memories are protected by an ECC code
The NVIDIA GP100 GPU (8)
The NVIDIA Tesla P100 GPU Accelerator
    Based on the Pascal GPU architecture and the CUDA parallel computing model
    Designed as an accelerator for datacenters
    Contains one GP100 GPU
    Number of CUDA cores: 56 x 64 = 3,584
        Only 56 SM units are enabled
    Single-precision FP performance: 10.6 TFLOPS
    Double-precision FP performance: 5.3 TFLOPS
    Memory size: 4 x 4 GB = 16 GB (HBM2)
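The core counts and peak performance figures above can be checked with a short calculation. The clock frequency of 1.48 GHz is an assumption (the boost clock commonly quoted for this accelerator) and does not appear in the lecture; each unit is assumed to complete one fused multiply-add, counted as 2 floating-point operations, per cycle.

```python
def cuda_cores(sm_units, cores_per_sm=64):
    """Total single-precision CUDA cores for a given number of SM units."""
    return sm_units * cores_per_sm


def peak_tflops(units, clock_ghz, flops_per_cycle=2):
    """Peak rate: units x operations per cycle (FMA = 2) x clock frequency."""
    return units * flops_per_cycle * clock_ghz / 1000


ASSUMED_CLOCK_GHZ = 1.48   # assumption, not taken from the slides

print(cuda_cores(60), cuda_cores(56))                    # full GP100 vs. Tesla P100
print(peak_tflops(cuda_cores(56), ASSUMED_CLOCK_GHZ))    # single precision
print(peak_tflops(56 * 32, ASSUMED_CLOCK_GHZ))           # double precision (32 DP units per SM)
```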
The NVIDIA GP100 GPU (9)
Summary (1)
The main component of a graphics adapter is the graphics controller
It contains the system bus interface; the speed of this interface is an important performance factor
Video ports allow video images from other sources to be combined with graphics images
The transfer rate of the video memory has a major impact on performance
Dual-ported memories: updating the images and refreshing the screen can be performed in parallel
Summary (2)
The GDDR5 memory has advanced features for high performance and stable operation
Data and address bus inversion; signal training; calibration; error detection; power management
3D accelerators are required to convert 3D objects into 2D images in a realistic manner
For each pixel of a 3D object, an alpha value and the z coordinate have to be stored
3D operations are performed in two stages: geometry stage and rendering stage
Summary (3)
GPUs are used to accelerate the geometric stage and the rendering stage of 3D graphics
    Can be dedicated or integrated in a chipset
GPUs contain a large number of processing cores, programmable for various shading operations
The processing power of GPUs can also be used for applications that require vector operations
The CUDA architecture allows direct access to the GPU resources for general-purpose computing
Concepts, Knowledge (1)
Structure of a graphics adapter
Components of the graphics controller
Function of the RAMDAC circuit
Features of the GDDR5 graphics memory
Data and address bus inversion of the GDDR5 graphics memory
Signal training of the GDDR5 graphics memory
Representation of 3D objects
3D operations performed in the geometry stage
Concepts, Knowledge (2)
3D operations performed in the rendering stage
Types of shader units contained by GPUs
Dedicated and integrated GPUs
Advantages and disadvantages of GPGPU computing
The CUDA architecture
Thread block in the CUDA architecture
Grid of blocks in the CUDA architecture
Questions
1. What is the advantage of a dual-ported video memory?
2. What are the power management features of the GDDR5 video memory?
3. What information is required for representing 3D objects?
4. What operations are performed in the rendering stage for 3D images?