
CSTalks - GPGPU - 19 Jan

Transcript
Page 1: CSTalks  -  GPGPU - 19 Jan

Research in GPU Computing

Cao Thanh Tung

Page 2: CSTalks  -  GPGPU - 19 Jan

19-Jan-2011 Computing Students talk 2

Outline

● Introduction to GPU Computing
  – Past: Graphics Processing and GPGPU
  – Present: CUDA and OpenCL
  – A bit on the architecture
● Why GPU?
● GPU vs. Multi-core and Distributed
● Open problems
● Where does this go?

Page 3: CSTalks  -  GPGPU - 19 Jan

Introduction to GPU Computing

● Who has access to 1,000 processors?

YOU

Page 6: CSTalks  -  GPGPU - 19 Jan

Introduction to GPU Computing

● In the past
  – GPU = Graphics Processing Unit


Page 11: CSTalks  -  GPGPU - 19 Jan

Introduction to GPU Computing

● In the past
  – GPGPU = General-Purpose computation using GPUs

Page 12: CSTalks  -  GPGPU - 19 Jan

Introduction to GPU Computing

● Now
  – GPU = General Processing Unit (not just Graphics anymore)

__device__ float3 collideCell(int3 gridPos, uint index...
{
    uint gridHash = calcGridHash(gridPos);
    ...
    for (uint j = startIndex; j < endIndex; j++)
    {
        if (j != index)
        {
            ...
            force += collideSpheres(...);
        }
    }
    return force;
}

Page 13: CSTalks  -  GPGPU - 19 Jan

Introduction to GPU Computing

● Now
  – We have CUDA (NVIDIA, proprietary) and OpenCL (an open standard).
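The collision snippet above is a fragment of a larger kernel. For orientation, a complete, self-contained CUDA program has the same shape on a much smaller scale; the following vector-addition sketch is not from the slides, and all names and sizes in it are illustrative:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Kernel: each of the many thousands of threads handles exactly one element.
__global__ void vecAdd(const float *a, const float *b, float *c, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
    if (i < n)                                      // the grid may overshoot n
        c[i] = a[i] + b[i];
}

int main()
{
    const int n = 1 << 20;
    size_t bytes = n * sizeof(float);
    float *ha = (float*)malloc(bytes), *hb = (float*)malloc(bytes), *hc = (float*)malloc(bytes);
    for (int i = 0; i < n; i++) { ha[i] = 1.0f; hb[i] = 2.0f; }

    // Device memory is separate from host memory: allocate and copy explicitly.
    float *da, *db, *dc;
    cudaMalloc(&da, bytes); cudaMalloc(&db, bytes); cudaMalloc(&dc, bytes);
    cudaMemcpy(da, ha, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(db, hb, bytes, cudaMemcpyHostToDevice);

    // Launch enough 256-thread blocks to cover all n elements.
    vecAdd<<<(n + 255) / 256, 256>>>(da, db, dc, n);

    cudaMemcpy(hc, dc, bytes, cudaMemcpyDeviceToHost);
    printf("c[0] = %.1f\n", hc[0]);  // expect 3.0

    cudaFree(da); cudaFree(db); cudaFree(dc);
    free(ha); free(hb); free(hc);
    return 0;
}
```

The one unusual line is the launch: `<<<blocks, threads>>>` starts the kernel on every thread at once, which is the entire programming model in miniature.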

Page 14: CSTalks  -  GPGPU - 19 Jan

Introduction to GPU Computing

● A (just a little) bit on the architecture of the latest NVIDIA GPU (Fermi)
  – Very simple cores (even simpler than an Intel Atom)
  – Little cache


Page 16: CSTalks  -  GPGPU - 19 Jan


Why GPU?

● Performance

Page 17: CSTalks  -  GPGPU - 19 Jan

Why GPU?

● People have used it, and it works:
  – Bioinformatics
  – Finance
  – Fluid Dynamics
  – Data Mining
  – Computer Vision
  – Medical Imaging
  – Numerical Analytics

Page 18: CSTalks  -  GPGPU - 19 Jan

Why GPU?

● A new, promising area
  – Fast-growing
  – Ubiquitous
  – New paradigm → new problems, new challenges

Page 19: CSTalks  -  GPGPU - 19 Jan

GPU vs. Multi-core

● A lot more threads of computation are required:
  – The GPU has many more cores than a multi-core CPU.
  – A GPU core is nowhere near as powerful as a CPU core.

Page 20: CSTalks  -  GPGPU - 19 Jan

GPU vs. Multi-core

● Challenges:
  – Not all problems can easily be broken into many small sub-problems to be solved in parallel.
  – Race conditions are much more serious.
  – Atomic operations are doable, but locking is a performance killer; lock-free algorithms are much preferable.
  – Memory access is a bottleneck (memory is not that parallel).
  – Debugging is a nightmare.
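The atomics point can be made concrete with the textbook histogram kernel (a standard example, not from the talk; names are illustrative). Thousands of threads may increment the same bin at once; `atomicAdd` makes each increment safe without any lock, though threads that collide on one counter still serialize there:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// One thread per input byte; colliding increments are arbitrated in hardware.
__global__ void histogram256(const unsigned char *data, int n, unsigned int *bins)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        atomicAdd(&bins[data[i]], 1u);
}

int main()
{
    const int n = 4096;
    unsigned char *d_data;
    unsigned int *d_bins;
    cudaMalloc(&d_data, n);
    cudaMalloc(&d_bins, 256 * sizeof(unsigned int));
    cudaMemset(d_data, 7, n);                      // every input byte is 7
    cudaMemset(d_bins, 0, 256 * sizeof(unsigned int));

    histogram256<<<(n + 255) / 256, 256>>>(d_data, n, d_bins);

    unsigned int bins[256];
    cudaMemcpy(bins, d_bins, sizeof(bins), cudaMemcpyDeviceToHost);
    printf("bins[7] = %u\n", bins[7]);             // expect 4096
    cudaFree(d_data); cudaFree(d_bins);
    return 0;
}
```

A lock-based version would force every colliding thread through a critical section; the usual refinement goes further and keeps per-block sub-histograms in shared memory to cut contention before the atomic merge.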

Page 21: CSTalks  -  GPGPU - 19 Jan

GPU vs. Distributed

● The GPU allows much cheaper communication between different threads.
● GPU memory is still limited compared to a distributed system.
● GPU cores are not completely independent processors:
  – Fine-grained parallelism is needed.
  – Reaching the scalability of a distributed system is difficult.

Page 22: CSTalks  -  GPGPU - 19 Jan

Open problems

● Data structures
● Algorithms
● Tools
● Theory

Page 23: CSTalks  -  GPGPU - 19 Jan

Open problems

● Data structures
  – Requirement: able to handle a very high level of concurrent access.
  – Common data structures like dynamic arrays, priority queues, or hash tables are not very suitable for the GPU.
  – Some existing work: kD-trees, quadtrees, read-only hash tables...
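To see why a dynamic array is awkward: thousands of threads "pushing back" into one shared buffer is pure contention. The usual GPU idiom replaces it with a bulk count / prefix-scan / scatter pass, which libraries like Thrust (shipped with the CUDA toolkit) package up. A sketch of stream compaction, not from the slides:

```cuda
#include <cstdio>
#include <thrust/device_vector.h>
#include <thrust/sequence.h>
#include <thrust/copy.h>

// Predicate functor: keep even values.
struct is_even
{
    __host__ __device__ bool operator()(int x) const { return x % 2 == 0; }
};

int main()
{
    thrust::device_vector<int> in(8);
    thrust::sequence(in.begin(), in.end());        // 0 1 2 3 4 5 6 7

    thrust::device_vector<int> out(in.size());
    // copy_if does the count, scan, and scatter steps internally,
    // instead of each thread appending to a shared array.
    thrust::device_vector<int>::iterator end =
        thrust::copy_if(in.begin(), in.end(), out.begin(), is_even());
    out.resize(end - out.begin());

    for (size_t i = 0; i < out.size(); i++)
        printf("%d ", (int)out[i]);                // prints: 0 2 4 6
    printf("\n");
    return 0;
}
```

The prefix scan gives every surviving element a unique output slot up front, so no two threads ever fight over where to write.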

Page 24: CSTalks  -  GPGPU - 19 Jan

Open problems

● Algorithms
  – Most sequential algorithms need a serious redesign to make good use of such a huge number of cores.
    ● Our computational geometry research: use discrete-space computation to approximate the continuous-space result.
  – Traditional parallel algorithms may or may not work.
    ● Usual assumption: an infinite number of processors.
    ● No serious study on this so far!
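The canonical example of such a redesign (a textbook sketch, not specific to this talk): summing n numbers is a trivial sequential loop, but on the GPU it becomes a tree-shaped reduction, still O(n) work but only O(log n) parallel steps:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Each block folds its 256 inputs in half repeatedly in shared memory,
// leaving one partial sum per block.
__global__ void reduceSum(const float *in, float *out, int n)
{
    extern __shared__ float sdata[];
    unsigned int tid = threadIdx.x;
    unsigned int i = blockIdx.x * blockDim.x + threadIdx.x;

    sdata[tid] = (i < n) ? in[i] : 0.0f;
    __syncthreads();

    // Halve the number of active threads each step: a binary tree.
    for (unsigned int s = blockDim.x / 2; s > 0; s >>= 1) {
        if (tid < s)
            sdata[tid] += sdata[tid + s];
        __syncthreads();
    }
    if (tid == 0)
        out[blockIdx.x] = sdata[0];
}

int main()
{
    const int n = 1 << 16, threads = 256, blocks = n / threads;
    float *d_in, *d_out;
    cudaMalloc(&d_in, n * sizeof(float));
    cudaMalloc(&d_out, blocks * sizeof(float));

    float *ones = new float[n];
    for (int i = 0; i < n; i++) ones[i] = 1.0f;
    cudaMemcpy(d_in, ones, n * sizeof(float), cudaMemcpyHostToDevice);

    // First pass: n values -> 'blocks' partial sums.
    reduceSum<<<blocks, threads, threads * sizeof(float)>>>(d_in, d_out, n);
    // Second pass: fold the 256 partial sums down to a single value.
    reduceSum<<<1, threads, threads * sizeof(float)>>>(d_out, d_in, blocks);

    float sum;
    cudaMemcpy(&sum, d_in, sizeof(float), cudaMemcpyDeviceToHost);
    printf("sum = %.0f\n", sum);                   // expect 65536
    delete[] ones;
    cudaFree(d_in); cudaFree(d_out);
    return 0;
}
```

Note how different this is from the sequential loop: the redesign is about exposing independent work at every step, not about making each step faster.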

Page 25: CSTalks  -  GPGPU - 19 Jan

Open problems

● Tools
  – Programming languages: a better language or model to express parallel algorithms?
  – Compilers: optimize GPU code? Auto-parallelization?
    ● There is some work on OpenMP-to-CUDA translation.
  – Debugging tools? Maybe a whole new “art of debugging” is needed.
  – Software engineering is currently far behind the hardware development.

Page 26: CSTalks  -  GPGPU - 19 Jan

Open problems

● Theory
  – Some traditional approaches:
    ● PRAM (CRCW, EREW): too general.
    ● SIMD: too restricted.
  – Big-O analysis may not be good enough.
    ● Time complexity is relevant, but work complexity is more important.
    ● Most GPU computing papers only report actual running time.
  – Performance modeling for the GPU, anyone?

Page 27: CSTalks  -  GPGPU - 19 Jan

Where does this go?

● Intel/AMD already have 6-core, 12-thread processors (maybe more).
● SeaMicro has a server with 512 dual-core Atom processors.
● AMD Fusion: CPU + GPU.
● The GPU may not stay forever, but massively multithreaded processing is definitely the future of computing.

Page 28: CSTalks  -  GPGPU - 19 Jan

Where to start?

● Check your PC.
  – If it's not old enough to go to primary school, there's a high chance it has a GPU.
● Go to the NVIDIA/ATI website, download a development toolkit, and you're ready to go.

Page 29: CSTalks  -  GPGPU - 19 Jan

THANK YOU

● Any questions? Just ask.
● Any suggestions? What are you waiting for?
● Any problems or solutions to discuss? Let's have a private talk somewhere (j/k).

