High Level OpenCL Implementation

Post on 01-Jan-2016



By: Matthew Royle

Supervisor: Prof. Shaun Bangay

Multi-core CPUs

Sequential algorithms to parallel algorithms

GPUs used for more than just graphics

Use of GPGPU (general-purpose computing on graphics processing units)

Parallel programming languages for specific architectures, namely NVIDIA’s CUDA

Lack of a multi-platform open language

The OpenCL (Open Computing Language) standard

Heterogeneous parallel programming

Parallel nature of GPUs

No implementation

Implement OpenCL using existing technologies

High level translator

Use parallel frameworks

GPU most likely form of implementation

NVIDIA and AMD plan to include OpenCL

Future Apple iPhones

Lack of implementation on CPU architecture

Select a parallel processing framework

Create a high level translator

Create valid tests

Run created tests

__kernel void add_vect(); // create computation unit

cl_cmd_queue cmd_queue = CreateCommandQueue(); // create computation queue

clEnqueueTask(kernel, i); // enqueue task and execute

cl_cmd_queue CreateCommandQueue() { return cmd_queue; }

void clEnqueueTask(kernel, i) { cmd_queue[i] = kernel; }

#pragma omp parallel for
for (int k = 0; k < cmd_queue.length; k++)
    Execute(cmd_queue[k]);
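The translation idea above, where enqueued kernels are stored in a queue and drained by an OpenMP loop, can be sketched as compilable C. This is a minimal illustration under stated assumptions, not the presenter's translator: the names task_fn, task_queue, queue_push, queue_run and add_args are hypothetical and are not part of the OpenCL API, and the queue has a fixed capacity chosen for brevity.

```c
#include <stddef.h>

/* Hypothetical names throughout: none of these belong to the OpenCL API. */
#define QUEUE_CAPACITY 64

/* One queued "kernel" invocation: receives its slot index and an argument. */
typedef void (*task_fn)(int index, void *arg);

typedef struct {
    task_fn fn;
    void *arg;
} task;

typedef struct {
    task tasks[QUEUE_CAPACITY];
    int length;
} task_queue;

/* Append a task; returns 0 on success, -1 if the queue is full. */
static int queue_push(task_queue *q, task_fn fn, void *arg)
{
    if (q->length >= QUEUE_CAPACITY)
        return -1;
    q->tasks[q->length].fn = fn;
    q->tasks[q->length].arg = arg;
    q->length++;
    return 0;
}

/* Run every queued task, then empty the queue. When compiled with OpenMP
   enabled (e.g. gcc -fopenmp) the iterations are shared among threads;
   otherwise the pragma is ignored and the loop runs serially. */
static void queue_run(task_queue *q)
{
    int n = q->length;  /* loop bound must not change inside the loop */
    #pragma omp parallel for
    for (int k = 0; k < n; k++)
        q->tasks[k].fn(k, q->tasks[k].arg);
    q->length = 0;
}

/* Example kernel: element-wise vector addition for a single index. */
typedef struct {
    const int *a;
    const int *b;
    int *out;
} add_args;

static void add_vect(int index, void *arg)
{
    add_args *v = arg;
    v->out[index] = v->a[index] + v->b[index];
}
```

To add two four-element vectors, push add_vect four times with the same add_args and call queue_run once. Each task writes only to its own output slot, so the iterations are independent and can safely run on different threads.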

John Conway’s Game Of Life

Fractal Flame algorithm
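The slides list the test programs without showing their code. As a rough illustration of why Conway's Game of Life suits a parallel test, the sketch below computes one generation with the rows divided among OpenMP threads. The function names life_step and live_neighbours, the row-major 0/1 grid layout, and the choice to treat cells outside the grid as dead are all assumptions made for this example.

```c
#include <stddef.h>

/* Count live neighbours of cell (r, c); cells outside the grid count as dead. */
static int live_neighbours(const unsigned char *grid, int rows, int cols,
                           int r, int c)
{
    int count = 0;
    for (int dr = -1; dr <= 1; dr++) {
        for (int dc = -1; dc <= 1; dc++) {
            int rr = r + dr;
            int cc = c + dc;
            if ((dr != 0 || dc != 0) &&
                rr >= 0 && rr < rows && cc >= 0 && cc < cols)
                count += grid[rr * cols + cc];
        }
    }
    return count;
}

/* Compute one generation from cur into next. Both buffers hold rows*cols
   cells in row-major order with values 0 (dead) or 1 (alive). Every cell
   reads only cur and writes only its own slot in next, so rows can be
   processed by different threads without synchronisation. */
static void life_step(const unsigned char *cur, unsigned char *next,
                      int rows, int cols)
{
    #pragma omp parallel for
    for (int r = 0; r < rows; r++) {
        for (int c = 0; c < cols; c++) {
            int n = live_neighbours(cur, rows, cols, r, c);
            int alive = cur[r * cols + c];
            /* Birth on exactly 3 neighbours; survival on 2 or 3. */
            next[r * cols + c] = (unsigned char)(n == 3 || (alive && n == 2));
        }
    }
}
```

Using separate input and output buffers is what makes the loop trivially parallel; updating the grid in place would make the result depend on the order in which cells are visited.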

OpenMP (Open Multi-Processing) framework

Parallel Processing Framework

Available with the GNU Compiler Collection

Free!

OpenCL header files

/* scalar types */

typedef int8_t cl_char;

typedef uint8_t cl_uchar;

typedef int16_t cl_short __attribute__((aligned(2)));

typedef uint16_t cl_ushort __attribute__((aligned(2)));

typedef int32_t cl_int __attribute__((aligned(4)));

typedef uint32_t cl_uint __attribute__((aligned(4)));

typedef int64_t cl_long __attribute__((aligned(8)));

typedef uint64_t cl_ulong __attribute__((aligned(8)));

typedef uint16_t cl_half __attribute__((aligned(2)));

typedef float cl_float __attribute__((aligned(4)));

typedef double cl_double __attribute__((aligned(8)));
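The fixed-width types and alignment attributes above presumably exist so that host-side data has the same size and layout on every platform. The sketch below checks that property at compile time for a few of the types. It repeats a subset of the typedefs so that it compiles on its own, relies on the GCC/Clang attribute syntax used in the header and on C11 _Static_assert, and uses a hypothetical struct, host_record, purely to show the effect on layout.

```c
#include <stddef.h>
#include <stdint.h>

/* Subset of the typedefs above, repeated so this sketch is self-contained. */
typedef int8_t   cl_char;
typedef int32_t  cl_int    __attribute__((aligned(4)));
typedef int64_t  cl_long   __attribute__((aligned(8)));
typedef uint64_t cl_ulong  __attribute__((aligned(8)));
typedef float    cl_float  __attribute__((aligned(4)));
typedef double   cl_double __attribute__((aligned(8)));

/* Compile-time checks: the build fails if a size is not what is expected. */
_Static_assert(sizeof(cl_int) == 4, "cl_int must be 32 bits");
_Static_assert(sizeof(cl_long) == 8, "cl_long must be 64 bits");
_Static_assert(sizeof(cl_ulong) == 8, "cl_ulong must be 64 bits");
_Static_assert(sizeof(cl_float) == 4, "cl_float must be 32 bits");
_Static_assert(sizeof(cl_double) == 8, "cl_double must be 64 bits");

/* Hypothetical record: the 8-byte alignment of cl_long forces 7 bytes of
   padding after tag, so value starts at offset 8 on any target. */
typedef struct {
    cl_char tag;
    cl_long value;
} host_record;

_Static_assert(offsetof(host_record, value) == 8, "value must be 8-aligned");
_Static_assert(sizeof(host_record) == 16, "record must be padded to 16 bytes");
```

Without the explicit attribute, a 32-bit target could align a 64-bit integer to 4 bytes and produce a different struct layout from a 64-bit target.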

//hello.c

#include <omp.h>
#include <stdio.h>
int main() {
#pragma omp parallel num_threads(10)
    printf("Hello from thread %d, nthreads %d\n",
           omp_get_thread_num(), omp_get_num_threads());
}

Improve performance

Evaluation of OpenCL on various architectures

Heterogeneous execution

Lack of multi-platform open language

OpenCL standard

Most implementations for GPU

Implementation for CPU

High Level Translator

Use OpenMP framework