Date post: | 15-May-2015 |
Upload: | indicthreads |
Accelerating Computation in HTML5
Ashish Shah
SAS R&D India
Outline
• Multicore Computing
• Problem statement
• Demo
• Introduction to OpenCL and WebCL
• Conclusion
• References
Multicore Computing
Problem statement
Layout algorithm for node-linked graphs
Algorithm
Layout
DEMO
Demo 1 - Serial version
Demo 2 - Parallel version with multi-core CPU
Demo 3 - Parallel version with many-core GPU
Performance analysis
[Chart: time in ms versus number of particles]
Introduction to OpenCL
• Open Computing Language, a C-like language
• Framework for writing parallel algorithms
• Targets heterogeneous platforms
• Originally developed by Apple
• An open standard maintained by the Khronos Group
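Example of adding two vectors
Serial version
for (int i = 0; i < n; i++)
    c[i] = a[i] + b[i];
Using OpenCL
__kernel void add(__global const float *a,
                  __global const float *b,
                  __global float *c)
{
    int i = get_global_id(0); // index of this work-item (thread) in dimension 0
    c[i] = a[i] + b[i];
}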
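The contrast between the serial loop and the per-element kernel can be emulated in plain JavaScript. This is only a sketch: `launch` is a hypothetical helper, not an OpenCL or WebCL call, and it runs the kernel bodies one after another, whereas a real runtime would schedule them in parallel.

```javascript
// Serial version: one loop visits every element in order.
function addSerial(a, b) {
  const c = new Float32Array(a.length);
  for (let i = 0; i < a.length; i++) {
    c[i] = a[i] + b[i];
  }
  return c;
}

// Kernel-style version: the body handles exactly one element, chosen by the
// id it receives. There is no loop inside the kernel itself.
function addKernel(globalId, a, b, c) {
  c[globalId] = a[globalId] + b[globalId];
}

// Hypothetical stand-in for a kernel launch: calls the kernel once per id.
function launch(kernel, globalSize, ...args) {
  for (let id = 0; id < globalSize; id++) {
    kernel(id, ...args);
  }
}

const a = Float32Array.from([1, 2, 3, 4]);
const b = Float32Array.from([10, 20, 30, 40]);
const c = new Float32Array(a.length);
launch(addKernel, a.length, a, b, c);
console.log(Array.from(c)); // [ 11, 22, 33, 44 ]
```

Because each kernel call touches only its own index, the calls could run in any order, which is what makes the operation suitable for a parallel device.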
OpenCL Architecture
1. Platform model
2. Execution model
3. Memory model
4. Programming model
OpenCL Platform
• Device
• Host
Host (Intel CPU)
Compute Device 1 (GPU 1)
Compute Device 2 (GPU 2)
Compute units (cores)
OpenCL Execution Model
1. Kernel
2. Work-items
3. Work group
4. ND-range
5. Program
6. Memory objects
7. Command queues
__kernel void add(__global const float *a,
                  __global const float *b,
                  __global float *c)
{
    int i = get_global_id(0); // get thread/work-item id
    c[i] = a[i] + b[i];
}
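How work-items, work-groups, and the ND-range relate can be illustrated with a small emulation. This sketch assumes a one-dimensional range with a zero global offset; `runNDRange` is a hypothetical helper written for illustration and is not part of OpenCL or WebCL.

```javascript
// Emulates a 1-D ND-range: globalSize work-items split into work-groups of
// localSize items each. Work-items run sequentially here, unlike on a device.
function runNDRange(kernel, globalSize, localSize) {
  if (globalSize % localSize !== 0) {
    throw new Error('globalSize must be a multiple of localSize');
  }
  const numGroups = globalSize / localSize;
  for (let groupId = 0; groupId < numGroups; groupId++) {
    for (let localId = 0; localId < localSize; localId++) {
      // With a zero offset, the global id follows from the group and local ids.
      const globalId = groupId * localSize + localId;
      kernel({ globalId, groupId, localId });
    }
  }
}

const seen = [];
runNDRange((ids) => seen.push(ids), 8, 4);
console.log(seen[5]); // { globalId: 5, groupId: 1, localId: 1 }
```

Here eight work-items form two work-groups of four, and the sixth work-item is the second item of the second group.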
Memory Model in OpenCL
Compute device with compute units 0, 1, and 2:
• Global memory (DRAM)
• Global constant memory (DRAM)
• Local memory/cache in each compute unit
• Private registers in each compute unit
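The roles of the memory levels can be sketched with an emulated two-level sum. Everything here is plain JavaScript standing in for device memory: `groupedSum` is a hypothetical function, the "local" array only imitates per-group local memory, and the groups run one after another rather than in parallel.

```javascript
// Each work-group stages its slice of "global" memory in a small "local"
// scratch array, reduces it, and writes one partial result back.
function groupedSum(globalData, localSize) {
  const numGroups = Math.ceil(globalData.length / localSize);
  const partialSums = new Float64Array(numGroups); // lives in "global" memory
  for (let groupId = 0; groupId < numGroups; groupId++) {
    const start = groupId * localSize;
    // "Local" memory: visible only to the work-items of this group.
    const local = globalData.slice(start, start + localSize);
    let acc = 0; // "private" accumulator, like a register
    for (let i = 0; i < local.length; i++) {
      acc += local[i];
    }
    partialSums[groupId] = acc;
  }
  // The host (or a second kernel pass) combines the per-group results.
  return partialSums.reduce((sum, x) => sum + x, 0);
}

console.log(groupedSum(Float64Array.from([1, 2, 3, 4, 5, 6, 7, 8]), 4)); // 36
```

On a real device the benefit comes from doing the repeated reads and writes in the faster local and private levels and touching global memory only once per group.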
Programming model
1. Data parallel: a single function applied to multiple data elements
2. Task parallel: multiple functions executed concurrently on a single data set
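The two styles can be contrasted in a few lines of JavaScript. This is a sketch of the shape of each style only: plain JavaScript runs all of it on a single thread, and the names `sumTask` and `maxTask` are made up for illustration.

```javascript
const samples = [1, 2, 3, 4];

// Data parallel: one function, many data elements. Each call is independent,
// so a parallel runtime could assign one element to each work-item.
const squared = samples.map((x) => x * x);

// Task parallel: different functions that do not depend on each other.
// Promise.all only expresses that independence; it does not add threads.
const sumTask = async (xs) => xs.reduce((acc, x) => acc + x, 0);
const maxTask = async (xs) => Math.max(...xs);
const tasksDone = Promise.all([sumTask(samples), maxTask(samples)]);

tasksDone.then(([sum, max]) => console.log(squared, sum, max));
```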
OpenCL Stack
Applications: HTML, Java, .NET, C, C++
Kernels: string data
OpenCL API: Java, C, .NET, WebCL
OpenCL Framework: context, memory APIs
OpenCL Runtime: command queues, buffer objects, kernel execution
Compiler
Device driver
OpenCL Device (GPU/CPU hardware)
Essential Development Tasks
1. Parallelize code into a kernel
• C code with restrictions
2. Initialize the OpenCL environment
• Query compute device
• Create context
• Compile kernels
3. Initialize kernels and data
• Create memory objects
• Map data structures to OpenCL-supported data structures
• Initialize kernel parameters
4. Execute kernel
• Specify the number of threads to execute the task
• Trigger kernel execution, synchronously or asynchronously
5. Read back data to host
• Map to application data structures
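The five tasks can be walked through with a plain-JavaScript emulation. This is a sketch under loud assumptions: the "kernel" is JavaScript source rather than OpenCL C, compilation is imitated with `new Function`, the typed arrays stand in for device buffers, and all names (`scaleKernelSrc`, `inputBuf`, and so on) are hypothetical.

```javascript
// Task 1: the parallelized code, kept as source text the way kernels are.
const scaleKernelSrc = 'out[id] = input[id] * factor;';

// Task 2: initialize the environment; "compiling" is emulated here.
const scaleKernel = new Function('id', 'input', 'out', 'factor', scaleKernelSrc);

// Task 3: create buffers from application data and set kernel parameters.
const hostInput = [1.5, 2.5, 3.5];
const inputBuf = Float32Array.from(hostInput);
const outputBuf = new Float32Array(hostInput.length);
const factor = 2;

// Task 4: execute, one call per work-item (sequential in this emulation).
for (let id = 0; id < inputBuf.length; id++) {
  scaleKernel(id, inputBuf, outputBuf, factor);
}

// Task 5: read results back into an application data structure.
const hostResult = Array.from(outputBuf);
console.log(hostResult); // [ 3, 5, 7 ]
```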
Introduction to WebCL
• JavaScript bindings for OpenCL
• First announced in March 2011 by Khronos
• API definition underway
• Prototype plugin is available only for the Firefox browser
Binding OpenCL to WebCL
CPU
Host application (JavaScript)
OpenCL Framework
WebCL
OpenCL-compliant device
Coding with WebCL
// Query the platform, create a context, and list its devices
platforms = WebCL.getPlatformIDs();
context = WebCL.createContextFromType([WebCL.CL_CONTEXT_PLATFORM,
    platforms[0]], WebCL.CL_DEVICE_TYPE_CPU);
devices = context.getContextInfo(WebCL.CL_CONTEXT_DEVICES);
// Create the program from kernel source and look up the kernel by name
// (the build step is not shown on the slide)
program = context.createProgramWithSource(kernelSrc);
kernelfunction1 = program.createKernel("function1");
// Create a device buffer and a command queue, then copy host data to the device
buffparam = context.createBuffer(WebCL.CL_MEM_READ_WRITE, bufSize);
cmdQueue = context.createCommandQueue(devices[0], 0);
cmdQueue.enqueueWriteBuffer(buffparam, true, 0, bufSize, parameter, []);
// Bind the buffer to kernel argument 0 and run the kernel over the ND-range
kernelfunction1.setKernelArg(0, buffparam, WebCL.types.float2);
cmdQueue.enqueueNDRangeKernel(kernelfunction1, 1, [], totalWorkitems,
    totalWorkgroups, []);
cmdQueue.finish();
// Read the results back into the host array
cmdQueue.enqueueReadBuffer(buffparam, true, 0, bufSize, parameter, []);
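The slides do not show `kernelSrc` or how the work sizes were chosen. The sketch below gives a hypothetical kernel source string and a helper for sizing the ND-range. In OpenCL 1.x the global work size must be evenly divisible by the work-group size when one is specified, so the count is rounded up and the kernel guards against the extra work-items. The kernel body, the `count` argument, `roundUpToMultiple`, and the numeric values are all assumptions made for illustration.

```javascript
// Hypothetical kernel source; OpenCL C passed to the API as a string.
const kernelSrc = [
  '__kernel void function1(__global float2 *positions, const int count)',
  '{',
  '    int i = get_global_id(0);',
  '    if (i < count) {              // skip the padding work-items',
  '        positions[i] = positions[i] * 0.5f;',
  '    }',
  '}',
].join('\n');

// Round the work-item count up to the next multiple of the work-group size.
function roundUpToMultiple(count, groupSize) {
  return Math.ceil(count / groupSize) * groupSize;
}

const particleCount = 1000; // assumed number of particles
const workGroupSize = 64;   // assumed work-group size
const globalWorkSize = roundUpToMultiple(particleCount, workGroupSize);
console.log(globalWorkSize); // 1024
```

With these values, 1000 particles are padded to 1024 work-items in 16 groups of 64, and the last 24 work-items do nothing.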
Applications of OpenCL
• Database mining
• Neural networks
• Physics-based simulation, mechanics
• Image processing
• Speech processing
• Weather forecasting and climate research
• Bioinformatics
Conclusion
• Significant performance gains from using OpenCL for computations in client-side environments like HTML5
• Algorithms need to be ‘parallelizable’
• Further optimizations can be achieved by exploiting the memory model
Software/Hardware used in demo application
Hardware
Intel(R) Core(TM)2 Quad core CPU Q8400 @ 2.66GHz
Nvidia Quadro 160m, 8 cores @ 580 MHz
Software
OpenCL runtime for CPU: http://software.intel.com/en-us/articles/vcsource-tools-opencl-sdk/
OpenCL runtime for GPU: http://www.nvidia.com/object/quadro_nvs_notebook.html
WebCL plugin for Firefox: http://webcl.nokiaresearch.com/
References
http://www.macresearch.org/opencl
http://en.wikipedia.org/wiki/GPGPU
http://www.khronos.org/webcl/