History
• Ian Buck, Director of GPU Computing, received his PhD from Stanford in 2004 for his research on general-purpose GPU computing (GPGPU)
• He then joined Nvidia to commercialize GPU computing
• The first release came in 2006, when Nvidia shipped CUDA 1.0 for the G80
• In spring 2008, CUDA 2.0 was released together with the GT200
About
• With CUDA, ordinary applications can be ported to the GPU for higher performance
• No low-level or 3D graphics programming knowledge is required; CUDA works with C
CPU vs GPU
• A CPU core can execute 4 32-bit instructions per clock, while a GPU can execute 3200 32-bit instructions per clock
• A CPU is designed primarily to be an executive: it makes decisions and handles control flow
• A GPU is different: it has a far larger number of ALUs (Arithmetic/Logic Units) than a CPU
Structure
• In CUDA, you are required to specify the number of blocks and threads in each block.
• One block can contain up to 512 threads.
• Each thread in each block is executed independently.
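A minimal sketch of the launch configuration described above (the kernel name and sizes are hypothetical):

```cuda
/* 4 blocks of 256 threads each = 1024 threads in total
   (256 is within the 512-threads-per-block limit) */
my_kernel<<<4, 256>>>();
```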
Syntax
• Key parts:
• Identifying a GPU function (__global__, __device__)
• Calling a GPU function, specifying the number of blocks and the number of threads per block: function<<<block_nr, thread_nr>>>(param);
Syntax
• CPU Code:
• Calling function:
Syntax
• GPU Code:
• Calling function:
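The original GPU code is also missing; a sketch of the same hypothetical add function written as a CUDA kernel, and how it would be launched:

```cuda
#include <stdio.h>

/* __global__ marks a function that runs on the GPU
   but is called from CPU code */
__global__ void add(int n, const int *a, const int *b, int *c) {
    /* each thread handles one array element */
    int i = threadIdx.x + blockIdx.x * blockDim.x;
    if (i < n)
        c[i] = a[i] + b[i];
}

int main(void) {
    int h_a[4] = {1, 2, 3, 4}, h_b[4] = {10, 20, 30, 40}, h_c[4];
    int *d_a, *d_b, *d_c;
    cudaMalloc((void **)&d_a, sizeof h_a);
    cudaMalloc((void **)&d_b, sizeof h_b);
    cudaMalloc((void **)&d_c, sizeof h_c);
    cudaMemcpy(d_a, h_a, sizeof h_a, cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, h_b, sizeof h_b, cudaMemcpyHostToDevice);

    /* calling function: 1 block, 4 threads per block */
    add<<<1, 4>>>(4, d_a, d_b, d_c);

    cudaMemcpy(h_c, d_c, sizeof h_c, cudaMemcpyDeviceToHost);
    for (int i = 0; i < 4; i++)
        printf("%d ", h_c[i]);
    printf("\n");
    cudaFree(d_a); cudaFree(d_b); cudaFree(d_c);
    return 0;
}
```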
Bruteforce
• Because a large amount of data is processed at the same time, parallel programming has a big impact on bruteforcing
• The number of tries per second is drastically higher on a GPU than on a CPU
Examples
• Let’s say we have a password to break, and the only thing we know is that it has length 3
• A simple bruteforce would be:
Examples
• A GPU bruteforce:
• Called like this:
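The slide's code is missing; one plausible sketch is a kernel where each thread fixes the first letter and loops over the remaining two, so 26 candidates are explored in parallel (the hard-coded secret and result handling are hypothetical):

```cuda
__device__ const char secret[3] = { 'g', 'p', 'u' };   /* hypothetical target */

__global__ void crack(char *result) {
    char guess[3];
    guess[0] = 'a' + threadIdx.x;        /* each thread fixes the first letter */
    for (char b = 'a'; b <= 'z'; b++)    /* and tries every second/third letter */
        for (char c = 'a'; c <= 'z'; c++) {
            guess[1] = b; guess[2] = c;
            if (guess[0] == secret[0] && guess[1] == secret[1] &&
                guess[2] == secret[2]) {
                result[0] = guess[0];
                result[1] = guess[1];
                result[2] = guess[2];
            }
        }
}
```

Called like this, one block of 26 threads, one thread per first letter:

```cuda
crack<<<1, 26>>>(d_result);
```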
Examples
• A more efficient GPU bruteforce:
• Called like this:
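Again the original code is gone; a sketch of a more parallel version that launches one thread per candidate, so no thread loops at all:

```cuda
__device__ const char secret[3] = { 'g', 'p', 'u' };   /* hypothetical target */

__global__ void crack(char *result) {
    /* one thread per candidate: id ranges over all 26^3 = 17576 guesses */
    int id = threadIdx.x + blockIdx.x * blockDim.x;
    char guess[3];
    guess[0] = 'a' + id / (26 * 26);
    guess[1] = 'a' + (id / 26) % 26;
    guess[2] = 'a' + id % 26;
    if (guess[0] == secret[0] && guess[1] == secret[1] &&
        guess[2] == secret[2]) {
        result[0] = guess[0];
        result[1] = guess[1];
        result[2] = guess[2];
    }
}
```

Called like this, 676 blocks of 26 threads = 17576 threads, one per guess:

```cuda
crack<<<676, 26>>>(d_result);
```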
Real Life
• Let’s say we have an MD5 hash and a wordlist of 1.000.000 words
• A simple bruteforce would be:
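The slide's code was not preserved; a sketch of the sequential wordlist attack, assuming OpenSSL's MD5() function and a tiny stand-in wordlist (the real one would hold 1.000.000 words, and the target would come from outside the program):

```c
#include <stdio.h>
#include <string.h>
#include <openssl/md5.h>   /* OpenSSL's MD5(); link with -lcrypto */

/* a tiny stand-in wordlist for illustration */
static const char *wordlist[] = { "apple", "banana", "secret", "zebra" };
#define NWORDS 4

int main(void) {
    /* hypothetical target hash; derived from "secret" here for the demo */
    unsigned char target[MD5_DIGEST_LENGTH];
    MD5((const unsigned char *)"secret", 6, target);

    unsigned char digest[MD5_DIGEST_LENGTH];
    /* hash each word one after another and compare to the target */
    for (int i = 0; i < NWORDS; i++) {
        MD5((const unsigned char *)wordlist[i], strlen(wordlist[i]), digest);
        if (memcmp(digest, target, MD5_DIGEST_LENGTH) == 0) {
            printf("found: %s\n", wordlist[i]);
            return 0;
        }
    }
    return 1;
}
```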
Real Life
• A GPU bruteforce would be:
• Called like this:
• threadIdx.x + blockIdx.x * blockDim.x is the thread ID (ranging from 0 to 999.999)
• 2000*500=1.000.000 threads
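The kernel itself was lost from the slides; a sketch consistent with the launch described above, assuming a device-side md5_device() helper (hypothetical, since the CUDA runtime provides no MD5) and a wordlist already copied to GPU memory:

```cuda
/* hypothetical device-side MD5 implementation */
__device__ void md5_device(const char *word, int len, unsigned char digest[16]);

__global__ void crack(char (*words)[32], const unsigned char *target, int *found) {
    /* the thread ID picks one word: 0 .. 999999 */
    int id = threadIdx.x + blockIdx.x * blockDim.x;
    int len = 0;
    while (words[id][len] != '\0') len++;

    unsigned char digest[16];
    md5_device(words[id], len, digest);   /* every word is hashed in parallel */

    bool match = true;
    for (int i = 0; i < 16; i++)
        if (digest[i] != target[i]) match = false;
    if (match)
        *found = id;
}

/* called like this: 2000 blocks * 500 threads = 1.000.000 threads */
crack<<<2000, 500>>>(d_words, d_target, d_found);
```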