

Universität Hamburg

MIN-Fakultät, Fachbereich Informatik

Introduction to GPU Computing


Matthis Hauschild

Universität Hamburg, Fakultät für Mathematik, Informatik und Naturwissenschaften, Fachbereich Informatik

Technische Aspekte Multimodaler Systeme

December 4, 2014

M. Hauschild - Introduction to GPU Computing 1


Table of Contents

1. Architecture of a GPU

2. General-purpose computing on GPUs

3. Applications of GPGPU

4. Performance evaluation examples


What is a GPU?

- Graphics processing unit
- Main GPU manufacturers:
  1. Intel
  2. AMD
  3. Nvidia
- Performance characteristics (based on the AMD Radeon R9 series, cf. [1]):
  - Process node: 28 nm
  - GPU clock speed: ∼ 1 GHz
  - Memory size: 8 GiB GDDR5
  - Memory bandwidth: 640 GiB/s


Difference between GPU and CPU[3]

- CPU: optimized for single-thread execution

- GPU: optimized for applying the same operation to many data elements in parallel


Architecture of a GPU[4]

based on the Nvidia Fermi architecture:


Summary of the Nvidia Fermi architecture:

- 16 Streaming Multiprocessors (SMs)

- 32 CUDA cores per SM

- 16 × 32 = 512 CUDA cores ⇒ 512 FMA operations per clock

⇒ This is great for generating graphics, but what else could be done with it?
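The core count and per-clock throughput above follow from simple arithmetic. The sketch below reproduces it and, as an illustration only, converts it to a peak FLOP/s figure. The 1 GHz clock is an assumed value chosen for round numbers, not a figure from the slides, and counting one FMA as two floating-point operations is the usual convention rather than something stated in the source.

```python
# Fermi figures quoted in the summary above.
SM_COUNT = 16        # streaming multiprocessors
CORES_PER_SM = 32    # CUDA cores per SM

# ASSUMPTION: illustrative clock rate, not taken from the slides.
ASSUMED_CLOCK_HZ = 1.0e9


def fma_per_clock(sm_count: int, cores_per_sm: int) -> int:
    """One fused multiply-add (FMA) per CUDA core per clock."""
    return sm_count * cores_per_sm


def peak_flops(sm_count: int, cores_per_sm: int, clock_hz: float) -> float:
    """Theoretical peak FLOP/s, counting one FMA as two operations."""
    return 2 * fma_per_clock(sm_count, cores_per_sm) * clock_hz


if __name__ == "__main__":
    print(fma_per_clock(SM_COUNT, CORES_PER_SM))                       # 512
    print(peak_flops(SM_COUNT, CORES_PER_SM, ASSUMED_CLOCK_HZ) / 1e9)  # 1024.0 (GFLOP/s)
```

This is an upper bound only: it assumes every core issues an FMA on every clock, which real kernels rarely achieve.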


What is GPGPU[5]

- General-purpose computing on graphics processing units
- Using the GPU for non-graphical computations
  - Good for data parallelism
  - Bad for instruction parallelism

- One of the first scientific programs to run faster on a GPU than on a CPU: LU factorization

- Became popular around 2001 with matrix multiplication

- Started out using the graphics APIs DirectX and OpenGL


GPGPU Frameworks

- Brook – One of the earliest GPU frameworks, by Stanford University

- CUDA – Proprietary Nvidia-only framework

- OpenCL – Open, vendor-neutral standard maintained by the Khronos Group

- C++ AMP – Open C++ extension by Microsoft

- OpenACC – Directive-based extension for C, C++ and Fortran

- ArrayFire – Wrapper for CUDA, OpenCL, etc.


General applications of GPGPU

Again, GPGPU can only be superior to CPU computing if the same algorithm is applied to a lot of data (data parallelism). For example:

- k-nearest neighbor

- Fast Fourier Transform

- Segmentation

- Audio processing

- CT reconstruction

- Weather forecasting

- Cryptography

- Database operations
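To make "the same algorithm applied to a lot of data" concrete, here is a minimal brute-force k-nearest-neighbor sketch in plain Python. It is not GPU code and not taken from the talk. It only marks which step is data parallel: each distance is computed independently of all others, so on a GPU each could be handled by its own thread. The function names and sample points are made up for illustration.

```python
from typing import List, Sequence

Point = Sequence[float]


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_nearest(query: Point, data: List[Point], k: int) -> List[int]:
    """Return the indices of the k points in data closest to query."""
    # Data-parallel step: every distance is independent of the others,
    # so on a GPU each one could be computed by a separate thread.
    distances = [squared_distance(query, p) for p in data]
    # Selection step: done sequentially in this sketch.
    order = sorted(range(len(data)), key=lambda i: distances[i])
    return order[:k]


if __name__ == "__main__":
    points = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (0.5, 0.0)]
    print(k_nearest((0.1, 0.0), points, k=2))  # [0, 3]
```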


Applications of GPGPU in Robotics[2]

For example:

- Many image processing tasks in general

- Frame transformation

- Inverse kinematic calculation

- 3D pose estimation

- Point-set registration
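Frame transformation shows why such tasks map well onto a GPU: one rigid transform is applied to every point, and no point depends on another. The sketch below uses a 2D rotation plus translation in plain Python to keep it short. The function names and the choice of 2D are assumptions for illustration, not the method of [2].

```python
import math
from typing import Callable, List, Tuple

Point2D = Tuple[float, float]


def make_transform(angle_rad: float, tx: float, ty: float) -> Callable[[Point2D], Point2D]:
    """Build a function that rotates by angle_rad, then translates by (tx, ty)."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)

    def apply(p: Point2D) -> Point2D:
        x, y = p
        return (c * x - s * y + tx, s * x + c * y + ty)

    return apply


def transform_points(points: List[Point2D], angle_rad: float,
                     tx: float, ty: float) -> List[Point2D]:
    """Apply the same transform to every point (the data-parallel step)."""
    apply = make_transform(angle_rad, tx, ty)
    return [apply(p) for p in points]


if __name__ == "__main__":
    # Rotate by 90 degrees, then shift by one unit along x.
    for p in transform_points([(1.0, 0.0), (0.0, 1.0)], math.pi / 2, 1.0, 0.0):
        print(p)
```

A 3D version would use a 4×4 homogeneous matrix in the same way, with the per-point independence unchanged.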


Performance evaluation examples

Test 1
- Sobel operator on a real image using OpenCL

- Measurement of the achievable frames per second

- On GPU and CPU

Test 2
- Matrix multiplication of two square matrices using OpenCL

- Measurement of the time needed for the calculation

- On GPU and CPU
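The two measurements described above can be sketched with a small timing harness. This is not the benchmark code used in the talk. The function names are hypothetical, and the workload is passed in as a plain Python callable standing in for the OpenCL kernel launch.

```python
import time
from typing import Callable


def frames_per_second(process_frame: Callable[[], object], frames: int = 100) -> float:
    """Run process_frame repeatedly and return the achieved frames per second."""
    start = time.perf_counter()
    for _ in range(frames):
        process_frame()
    elapsed = time.perf_counter() - start
    # Guard against a zero reading on very coarse timers.
    return frames / elapsed if elapsed > 0 else float("inf")


def elapsed_seconds(compute: Callable[[], object]) -> float:
    """Return the wall-clock time of a single call to compute."""
    start = time.perf_counter()
    compute()
    return time.perf_counter() - start


if __name__ == "__main__":
    workload = lambda: sum(i * i for i in range(10_000))  # stand-in for a real kernel
    print(f"{frames_per_second(workload):.1f} frames/s")
    print(f"{elapsed_seconds(workload):.6f} s")
```

When timing a real GPU kernel, the callable has to block until the device has finished (in OpenCL, for example, by calling `clFinish` on the command queue), otherwise only the launch overhead is measured.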


Performance evaluation examples - System characteristics

- My CPU:
  - Model: AMD Phenom II X4 965
  - Clock speed: 3400 MHz
  - Misc: 4 cores, SSE3

- My GPU:
  - Model: AMD Radeon HD 6950
  - Memory: 2048 MB
  - Core clock: 800 MHz
  - Memory clock: 1250 MHz
  - Memory bandwidth: 160 GB/s

- My RAM: 8 GB


Performance evaluation examples - Test 1

The Sobel operator:

3. $s = \sqrt{dx^2 + dy^2}$
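The gradient magnitude formula above can be tried out with a small CPU reference implementation. The slide's first two steps are not preserved here, so this sketch assumes the standard 3×3 Sobel kernels for computing dx and dy. It is plain Python, not the OpenCL kernel used in the test, and it leaves border pixels at zero as a simplification.

```python
import math
from typing import List

Image = List[List[float]]

# ASSUMPTION: standard 3x3 Sobel kernels for the horizontal and vertical derivative.
KERNEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
KERNEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel(image: Image) -> Image:
    """Return the gradient magnitude s = sqrt(dx^2 + dy^2) for each interior pixel."""
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    # Each output pixel depends only on its own 3x3 neighbourhood,
    # so on a GPU every (x, y) could be handled by its own work-item.
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            dx = dy = 0.0
            for j in range(3):
                for i in range(3):
                    value = image[y + j - 1][x + i - 1]
                    dx += KERNEL_X[j][i] * value
                    dy += KERNEL_Y[j][i] * value
            out[y][x] = math.sqrt(dx * dx + dy * dy)
    return out


if __name__ == "__main__":
    # A vertical edge: dark on the left, bright on the right.
    edge = [[0.0, 0.0, 1.0, 1.0]] * 3
    print(sobel(edge)[1])  # [0.0, 4.0, 4.0, 0.0]
```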


Performance evaluation examples - Test 2

Matrix multiplication:
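As a reference for what the second test computes, here is a naive multiplication of two square matrices in plain Python. It is a sketch of the textbook algorithm, not the OpenCL kernel from the talk. The comment marks the property that makes the problem suitable for a GPU: every output element can be computed independently.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Multiply two n x n matrices with the textbook O(n^3) algorithm."""
    n = len(a)
    c = [[0.0] * n for _ in range(n)]
    # Each c[row][col] is an independent dot product, so on a GPU
    # one work-item per output element is a natural mapping.
    for row in range(n):
        for col in range(n):
            c[row][col] = sum(a[row][k] * b[k][col] for k in range(n))
    return c


if __name__ == "__main__":
    a = [[1.0, 2.0], [3.0, 4.0]]
    b = [[5.0, 6.0], [7.0, 8.0]]
    print(matmul(a, b))  # [[19.0, 22.0], [43.0, 50.0]]
```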



Thank you for your attention!




[1] AMD. AMD Radeon™ R9 Grafikkartenserie, 2014.

[2] J. Bedkowski and A. Maslowski. GPGPU computation in mobile robot applications. Warsaw University of Technology, 2012.

[3] Nvidia. CUDA C Programming Guide, 2014.

[4] Nvidia. NVIDIA's Next Generation CUDA Compute Architecture: Fermi, 2014.

[5] Wikipedia. General-purpose computing on graphics processing units, 2014.

