Bringing GPU capabilities in Scilab
June 16th, 2010
Presentation
Sylvestre Ledru
● In charge of R&D projects
● Responsible for GNU/Linux, Mac OS X and Unix versions
● Developer
● Community management
● ...
GPGPU in Scilab
What is GPGPU?
● Use graphics cards for numerical computing purposes
● Much more powerful than a CPU for some algorithms
● But hard to use...
Two main solutions
● CUDA by Nvidia, for Nvidia hardware: focused on GPU computing
● OpenCL by the Khronos Group: more general
Why GPU in Scilab?
● To improve computation time
● Well suited to massively parallel algorithms
● To simplify GPU use
Advantages compared to ad hoc development
● Take advantage of all Scilab features (high-level, easy to use...)
● Hide part of the complexity
● Integration into an existing software solution
Alpha version of Scilab GPU module
Description
● One goal: leverage GPU capabilities from Scilab
● Started in the context of the Google Summer of Code 2009
● Continued within the OpenGPU context (System@tic and Cap Digital)
Features
● Handle both CUDA & OpenCL technologies
● Similar profiles and usages
● GPU functions (kernels) are compiled and transferred by the module
● Explicit transfer of variables
● Available on the Scilab forge: http://forge.scilab.org/index.php/p/sciCuda/
Features
● CUDA & OpenCL managed transparently with same profiles:
● cudaAlloc vs openclAlloc
● cudaToGpu vs openclToGpu
● buildCuda vs buildOpenCL
● cudaApplyFunction vs openclApplyFunction
● ...
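Code example - kernel
__global__ void someSimpleKernel(double* src, double* dst, int numberOfElement)
{
    // Global index of this thread across the whole 1-D grid
    int idx = threadIdx.x + blockIdx.x * blockDim.x;
    // Skip threads that fall past the end of the array
    if (idx < numberOfElement) {
        dst[idx] = src[idx];
    }
}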
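Code example – Scilab code
// Build the CUDA source file
buildCuda(abs_path+"simple.cu");

// Create a matrix on the host, copy it to the GPU, and allocate the output
A_host=rand(256,256);
A=cudaToGpu(A_host);
B=cudaAlloc(256,256);

// Load the kernel, pack its arguments, run it, and copy the result back
kernel=cudaLoadFunction("simple.ptx","someSimpleKernel");
lst=list(A,B,int32(256*256));
cudaApplyFunction(kernel,lst,128,1,256*256/128,1);
B_host=cudaFromGpu(B);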
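The launch arguments in the Scilab call read most naturally as 128 threads per block and 256*256/128 = 512 blocks, which together cover the 65536 elements exactly. That reading of the `cudaApplyFunction` argument order is an assumption, since the slides do not document it. The following C sketch emulates such a 1-D launch on the CPU to show how the index formula and the bounds check in `someSimpleKernel` fit together. The names `blocks_for` and `emulate_copy_launch` are hypothetical and not part of the module.

```c
#include <stddef.h>

/* Number of blocks needed so that blocks * threads_per_block >= n
 * (ceiling division). Hypothetical helper, not part of the Scilab module. */
size_t blocks_for(size_t n, size_t threads_per_block)
{
    return (n + threads_per_block - 1) / threads_per_block;
}

/* CPU emulation of a 1-D kernel launch. Each (block, thread) pair computes
 * idx = thread + block * block_dim, as someSimpleKernel does, and copies
 * one element only when idx is inside the array. */
void emulate_copy_launch(const double *src, double *dst, size_t n,
                         size_t grid_dim, size_t block_dim)
{
    for (size_t block = 0; block < grid_dim; ++block) {
        for (size_t thread = 0; thread < block_dim; ++thread) {
            size_t idx = thread + block * block_dim;
            if (idx < n) {
                dst[idx] = src[idx];
            }
        }
    }
}
```

When the element count is not a multiple of the block size, the last block contains threads whose index is out of range. This is why the kernel keeps the `idx<numberOfElement` test even though it is never triggered for the 256 by 256 case.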
Example: Channel flow past a cylindrical obstacle
Future
● A first tagged release
● More transparent use
● Code generation to GPU
● Xcos integration
● Linear algebra features based on cuBLAS
Thanks for your attention
www.scilab.org
Introduction
Scilab on steroids
Objective: provide an easy way to go from a scripted prototype in Scilab to a full-fledged application, more precisely an application that takes advantage of GPU acceleration
Several steps
● Integrating GPU-enabled libraries
● Introducing a pragma-based typing system
● Enabling the generation of an autonomous application on GPU
● Extending the typing system with automated type inference
HPC Project - Corporate Presentation
How to go from Scilab…
● An interpreted, dynamic environment
● Ideal for prototyping
… to a full application
● Typed variables
● To enable compilation
● For performance
● To go directly to production
Producing applications from Scilab scripts
● If you write less dynamic code
● Thanks to an internal representation of the code
● C code can be generated from a Scilab script
● So you can get an autonomous application
● From there, you can get a GPU-accelerated application
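As an illustration of what a less dynamic script makes possible, consider a Scilab loop such as `s = 0; for i = 1:n, s = s + a(i)*b(i); end`. Once `a` and `b` are known to be real vectors of length `n`, a C counterpart follows almost line by line. The function below is written by hand for illustration. It is not actual generator output, which the slides do not show, and the name `dot_product` is hypothetical.

```c
#include <stddef.h>

/* Hand-written C counterpart of the Scilab loop
 *     s = 0; for i = 1:n, s = s + a(i)*b(i); end
 * with the types fixed in advance: a and b are real vectors of length n.
 * Scilab indexes from 1 and C from 0, so the loop runs over 0..n-1. */
double dot_product(const double *a, const double *b, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; ++i) {
        s += a[i] * b[i];
    }
    return s;
}
```

The point of the sketch is that every variable has a single, statically known type. A script in which `s` could become a string or a matrix halfway through would have no such direct translation.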
Par4All source-to-source compiler
What is it?
● Source-to-source compiler
● Benefit: only one code
● Some feedback on code quality
● Additional tools
● Back-end compiler
● You can work on it:
Par4All – the compiling flow
● Hundreds of code analysis phases and reorganizations
● Scripting metalanguage
● Input: C, Fortran
● Output: C + CUDA code, OpenMP, MPI
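To make the source-to-source idea concrete, the sketch below shows a sequential C loop next to a variant annotated by hand with an OpenMP directive, which is the kind of transformation the flow targets. It is not actual Par4All output, which the slides do not show, and the function names are hypothetical. A compiler without OpenMP support enabled ignores the pragma and runs the loop sequentially.

```c
/* Sequential input: y[i] = alpha * x[i] + y[i] for every element. */
void axpy_sequential(double alpha, const double *x, double *y, long n)
{
    for (long i = 0; i < n; ++i) {
        y[i] = alpha * x[i] + y[i];
    }
}

/* Same loop with an OpenMP annotation. The iterations are independent
 * (each one reads and writes only its own y[i]), so they can be shared
 * among threads without changing the result. */
void axpy_openmp(double alpha, const double *x, double *y, long n)
{
    #pragma omp parallel for
    for (long i = 0; i < n; ++i) {
        y[i] = alpha * x[i] + y[i];
    }
}
```

Both functions compute the same values. The annotated version keeps the original loop body untouched, which is what lets a single source serve several targets.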
Scilab to GPU
● A natural extension
● High-level scripting languages: Scilab
Generating types by inference
● To ease development
● Provide an environment able to produce types
● Type generation by inference while the script is being written
● Something that is provided in some strongly typed formal languages
● Of course, it is currently possible to write a script for which no coherent typing can be generated
● This will result in some limitations on “acceptable” scripts
Appliances: Wild Systems
● A packaged system for immediate use
● High-end configuration: dual-socket X5570 & Tesla GPU
● Best HW, optimized app, best math
● In an appliance
● No cost of operation
● Ethernet
Thank you. Any questions?