Page 1: Performance modeling in  GPGPU computing

Performance modeling in GPGPU computing

Wenjing Xu, Professor: Dr. Box

Page 2: Performance modeling in  GPGPU computing

GPU-accelerated computing is the use of a graphics processing unit (GPU) together with a CPU to accelerate scientific, engineering, and enterprise applications.

What’s GPGPU?
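
A minimal sketch, not from the slides: a CUDA vector add in which the CPU allocates data and launches the kernel while the GPU does the element-wise work. The kernel name and sizes are illustrative; the block size chosen at launch is the parameter this talk models.

```cuda
#include <cuda_runtime.h>
#include <cstdio>

// Each GPU thread handles one element; the CPU drives allocation and launch.
__global__ void vecAdd(const float* a, const float* b, float* c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

int main() {
    const int n = 1 << 20;
    const size_t bytes = n * sizeof(float);
    float *a, *b, *c;
    cudaMallocManaged(&a, bytes);                // unified memory keeps the sketch short
    cudaMallocManaged(&b, bytes);
    cudaMallocManaged(&c, bytes);
    for (int i = 0; i < n; ++i) { a[i] = 1.0f; b[i] = 2.0f; }

    const int block = 256;                       // the tuning knob studied in this talk
    const int grid  = (n + block - 1) / block;
    vecAdd<<<grid, block>>>(a, b, c, n);
    cudaDeviceSynchronize();

    printf("c[0] = %f\n", c[0]);
    cudaFree(a); cudaFree(b); cudaFree(c);
    return 0;
}
```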

Page 3: Performance modeling in  GPGPU computing

A simplified representation of a system or phenomenon.

The most explicit way to describe a system or phenomenon.

It uses the parameters we set to build formulas for analyzing the system.

What's modeling?

Page 4: Performance modeling in  GPGPU computing

Hong and Kim [3] introduce two metrics, Memory Warp Parallelism (MWP) and Computation Warp Parallelism (CWP) in order to describe the GPU parallel architecture.

Zhang and Owens [4] develop a performance model based on their microbenchmarks so that they can identify bottlenecks in the program.

Supada [5] presents a performance model in which memory latencies vary with the data type and the type of memory.

Related work

Page 5: Performance modeling in  GPGPU computing

Different applications and devices cannot use the same settings.

Find the relationships among the parameters in this model, and choose the best block size for each application on each device to reach peak performance.

1 Introduction and background
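
The slides derive the block size analytically from parameters defined later in the deck. As a hedged point of comparison only, the CUDA runtime (6.5 and later) exposes an occupancy-based heuristic for the same per-kernel, per-device question; `myKernel` here is a placeholder, not a kernel from the talk.

```cuda
#include <cuda_runtime.h>
#include <cstdio>

__global__ void myKernel(float* data, int n) {   // stand-in application kernel
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;
}

int main() {
    int minGridSize = 0, blockSize = 0;
    // Ask the runtime which block size maximizes occupancy for this kernel
    // on the device it will actually run on.
    cudaOccupancyMaxPotentialBlockSize(&minGridSize, &blockSize, myKernel, 0, 0);
    printf("suggested block size: %d (minimum grid size: %d)\n", blockSize, minGridSize);
    return 0;
}
```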

Page 6: Performance modeling in  GPGPU computing

Different data sizes combined with different block sizes yield different performance.
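
A minimal sketch of how such measurements can be taken, assuming a placeholder kernel `scale` and warp-multiple block sizes from 32 up to 1024 (the compute-capability-3.0 limit):

```cuda
#include <cuda_runtime.h>
#include <cstdio>

__global__ void scale(float* data, int n) {      // placeholder workload
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;
}

int main() {
    const int n = 1 << 24;
    float* d;
    cudaMalloc(&d, n * sizeof(float));

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    // Sweep warp-multiple block sizes and time each launch.
    for (int block = 32; block <= 1024; block *= 2) {
        const int grid = (n + block - 1) / block;
        cudaEventRecord(start);
        scale<<<grid, block>>>(d, n);
        cudaEventRecord(stop);
        cudaEventSynchronize(stop);
        float ms = 0.0f;
        cudaEventElapsedTime(&ms, start, stop);
        printf("block %4d: %.3f ms\n", block, ms);
    }

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(d);
    return 0;
}
```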

Page 7: Performance modeling in  GPGPU computing

How the GPU works

Page 8: Performance modeling in  GPGPU computing

Memory latency hiding

Page 9: Performance modeling in  GPGPU computing

The structure of threads
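
The figure for this slide is not reproduced in the transcript. As a hedged sketch of the hierarchy it refers to (a grid of blocks, blocks of threads, warps of 32 threads scheduled as a unit), the kernel fragment below shows how a thread derives its coordinates; the kernel name is illustrative.

```cuda
#include <cuda_runtime.h>

// Grid -> blocks -> threads; warps are the hardware scheduling unit.
__global__ void threadCoords(int3* out, int n) {
    int globalId = blockIdx.x * blockDim.x + threadIdx.x;   // unique across the grid
    if (globalId < n)
        out[globalId] = make_int3(blockIdx.x,               // which block in the grid
                                  threadIdx.x / warpSize,   // which warp in the block
                                  threadIdx.x % warpSize);  // lane within the warp
}
```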

Page 10: Performance modeling in  GPGPU computing

Specification of GeForce GTX 650
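
The specification table itself is an image and is not reproduced here. As a sketch, the per-device limits that feed the model's parameters can be read at run time with cudaGetDeviceProperties:

```cuda
#include <cuda_runtime.h>
#include <cstdio>

int main() {
    cudaDeviceProp p;
    cudaGetDeviceProperties(&p, 0);              // device 0, e.g. the GTX 650
    printf("Device:                  %s\n",  p.name);
    printf("Threads / warp:          %d\n",  p.warpSize);
    printf("Max threads / block:     %d\n",  p.maxThreadsPerBlock);
    printf("Max threads / SM:        %d\n",  p.maxThreadsPerMultiProcessor);
    printf("Shared memory / SM (B):  %zu\n", p.sharedMemPerMultiprocessor);
    printf("Registers / block:       %d\n",  p.regsPerBlock);
    printf("Multiprocessors:         %d\n",  p.multiProcessorCount);
    return 0;
}
```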

Page 11: Performance modeling in  GPGPU computing

Parameters

Threads / Warp: number of threads in a warp (NRW)
Warps / Multiprocessor: number of warps in a multiprocessor (NWM)
Threads / Multiprocessor: number of threads that can reside in an SM (NRT)
Thread Blocks / Multiprocessor: number of blocks that can reside in an SM (NRB)
Thread Blocks needed: number of blocks needed (NB)
Max Shared Memory / Multiprocessor (bytes): size of memory that can be used by an SM (MSM)
Register File Size: registers per block (RB)
Max Registers / Thread: max number of registers that can be used by a thread
Max Thread Block Size: max number of threads in a block (NMB)
Threads: number of threads needed (NT)
Threads / Block: number of threads in a block (NTB)
Threads in Warp: number of threads in a warp (NTW)

Page 12: Performance modeling in  GPGPU computing

NMB >= NTB = N * NTW >= NRT / NRB, where N is an integer

Block size setting under threads limitation
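
An illustrative sketch of this rule, not from the slides: enumerate the warp-multiple block sizes NTB that fit between NRT / NRB and NMB. The numeric limits are the compute-capability-3.0 values (as on a GTX 650) and are assumptions of the example.

```cuda
#include <cstdio>

int main() {
    const int NTW = 32;      // threads per warp
    const int NMB = 1024;    // max threads per block      (CC 3.0 value, assumed)
    const int NRT = 2048;    // max resident threads / SM  (CC 3.0 value, assumed)
    const int NRB = 16;      // max resident blocks / SM   (CC 3.0 value, assumed)

    // NTB = N * NTW, NTB <= NMB, and NTB >= NRT / NRB so the resident
    // blocks can still fill the SM with threads.
    for (int NTB = NTW; NTB <= NMB; NTB += NTW)
        if (NTB >= NRT / NRB)
            printf("candidate block size: %d\n", NTB);
    return 0;
}
```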

Page 13: Performance modeling in  GPGPU computing

Memory resource

Memory   | Location      | Hit latency      | Program scope
Global   | off-chip      | 200-300 cycles   | global
Local    | off-chip      | same as Global   | function
Shared   | on-chip       | register latency | function
Constant | on-chip cache | register latency | global
Texture  | on-chip cache | >100 cycles      | global
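
A minimal sketch of why the table matters, assuming a fixed block size of 256: staging data once in on-chip shared memory (register-like latency) avoids repeated off-chip global reads (200-300 cycles). The block-sum kernel below is illustrative, not from the slides.

```cuda
#include <cuda_runtime.h>

// Assumes blockDim.x == 256 and a power-of-two block size.
__global__ void blockSum(const float* in, float* out, int n) {
    __shared__ float tile[256];                      // on-chip, shared by the block
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    tile[threadIdx.x] = (i < n) ? in[i] : 0.0f;      // one global read per thread
    __syncthreads();

    // Tree reduction entirely in shared memory.
    for (int s = blockDim.x / 2; s > 0; s >>= 1) {
        if (threadIdx.x < s) tile[threadIdx.x] += tile[threadIdx.x + s];
        __syncthreads();
    }
    if (threadIdx.x == 0) out[blockIdx.x] = tile[0]; // one global write per block
}
```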

Page 14: Performance modeling in  GPGPU computing

MR / MTR >= N * NTB (N is an integer)
N * NTB <= NRT
N <= MSM / MSB

Block size setting under stream multiprocessor resource
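
As a hedged sketch, the runtime can report the same per-block resource demands (registers, shared memory) and how many blocks N an SM can actually host at a candidate block size; `appKernel` is a placeholder, not a kernel from the talk.

```cuda
#include <cuda_runtime.h>
#include <cstdio>

__global__ void appKernel(float* data, int n) {      // stand-in application kernel
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = data[i] * data[i];
}

int main() {
    const int NTB = 256;                             // candidate block size

    // Per-block resource demand: registers and static shared memory.
    cudaFuncAttributes attr;
    cudaFuncGetAttributes(&attr, appKernel);
    printf("registers / thread: %d, shared memory / block: %zu bytes\n",
           attr.numRegs, attr.sharedSizeBytes);

    // How many such blocks (N) the SM resources allow to be resident.
    int N = 0;
    cudaOccupancyMaxActiveBlocksPerMultiprocessor(&N, appKernel, NTB, 0);
    printf("resident blocks / SM at block size %d: %d (%d threads resident)\n",
           NTB, N, N * NTB);
    return 0;
}
```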

Page 15: Performance modeling in  GPGPU computing

Although more threads can hide memory-access latency, more threads also consume more resources. Finding the balance point between the resource limits and latency hiding is a shortcut to peak performance. Across different applications and devices, this performance model shows its advantage: it is adaptable and, without any rework or redesign, lets the application run at its best tuning.

Conclusion

Page 16: Performance modeling in  GPGPU computing
