Source: llvm.org/devmtg/2017-03/assets/slides/anydsl_a_compiler-framework…

Contact: [email protected]
Website: http://anydsl.github.io

ANYDSL: A COMPILER-FRAMEWORK FOR DOMAIN-SPECIFIC LIBRARIES (DSLS)

Richard Membarth, Arsène Pérard-Gayot, Martin Weier, Philipp Slusallek
Roland Leißa, Klaas Boesche, Sebastian Hack

Motivation

− Many-Core HW is everywhere

− But cannot be programmed well

[Figure: many-core chips, each labeled CPU, GPU, or CPU/GPU: Intel Haswell Architecture (1.4B Transistors), Nvidia Kepler (~7B Transistors), Intel Knights Ferry (~5B Transistors), Intel Knights Landing, AMD Brazos. Die annotations, translated from German: graphics processor (GMA HD4000), L3 cache, memory controller (input/output), cores 1 to 4, system monitoring with memory and display controller.]

Traditional programs run only on a single core

RaTrace

− A DSL for ray traversal

− 11% faster than Embree (on average, Core i7-4790)

− 17% faster than Aila et al. (on average, GTX 970)

− 1/10th of coding time (according to Halstead measures)
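
The poster does not say which Halstead measure underlies the coding-time comparison or how the tokens were counted. As a rough sketch, assuming the standard definitions (volume, difficulty, effort, and time estimate T = E / 18 seconds), the estimate can be computed from four token counts. The struct and method names here are hypothetical, and the code is Rust rather than Impala:

```rust
/// Token counts that Halstead's measures are derived from.
/// (Hypothetical helper type; not part of AnyDSL or RaTrace.)
struct HalsteadCounts {
    distinct_operators: f64, // n1
    distinct_operands: f64,  // n2
    total_operators: f64,    // N1
    total_operands: f64,     // N2
}

impl HalsteadCounts {
    /// Volume V = N * log2(n), with N = N1 + N2 and n = n1 + n2.
    fn volume(&self) -> f64 {
        let length = self.total_operators + self.total_operands;
        let vocabulary = self.distinct_operators + self.distinct_operands;
        length * vocabulary.log2()
    }

    /// Difficulty D = (n1 / 2) * (N2 / n2).
    fn difficulty(&self) -> f64 {
        (self.distinct_operators / 2.0) * (self.total_operands / self.distinct_operands)
    }

    /// Effort E = D * V.
    fn effort(&self) -> f64 {
        self.difficulty() * self.volume()
    }

    /// Estimated coding time in seconds, T = E / 18.
    fn time_seconds(&self) -> f64 {
        self.effort() / 18.0
    }
}
```

Comparing `time_seconds()` for two implementations of the same algorithm gives a ratio of estimated coding times, which is the kind of figure the bullet above reports.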

AnyDSL Architecture

[Figure: AnyDSL architecture, top to bottom]

− DSLs: Computer Vision DSL, Physics DSL, Ray Tracing DSL, Parallel Runtime DSL, ...

− Layer labels: Layered DSL Specifications, AnyDSL Unified Program Representation, AnyDSL Compiler Framework (Thorin), Various HW Back Ends

− Components shown: Impala, Thorin, Vectorizer, LLVM, CUDA, OpenCL, SPIR, Native Code, NVVM

Stincilla

− A DSL for stencil codes

− Example: Gaussian blur filter

− Reference: OpenCV 3.0

− Intel CPU: 40% faster

− Intel GPU: 25% faster

− AMD GPU: 50% faster

− NVIDIA GPU: 45% faster

− Up to 10x shorter code
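
To make "stencil code" concrete, here is a minimal sketch of applying a 3x3 Gaussian stencil at one pixel. It is written in Rust, not Impala, and is not Stincilla's implementation: the `Field` layout, the clamp-to-edge boundary handling, and the binomial 1-2-1 weights are all assumptions chosen for illustration.

```rust
/// Row-major 2D image. The field layout is an assumption for this sketch.
struct Field {
    cols: usize,
    rows: usize,
    data: Vec<f32>,
}

impl Field {
    /// Read a pixel, clamping out-of-range coordinates to the nearest edge.
    fn at_clamped(&self, x: i64, y: i64) -> f32 {
        let cx = x.clamp(0, self.cols as i64 - 1) as usize;
        let cy = y.clamp(0, self.rows as i64 - 1) as usize;
        self.data[cy * self.cols + cx]
    }
}

/// 3x3 binomial approximation of a Gaussian; the weights sum to 1.
const GAUSS_3X3: [[f32; 3]; 3] = [
    [1.0 / 16.0, 2.0 / 16.0, 1.0 / 16.0],
    [2.0 / 16.0, 4.0 / 16.0, 2.0 / 16.0],
    [1.0 / 16.0, 2.0 / 16.0, 1.0 / 16.0],
];

/// Weighted sum of the 3x3 neighborhood centered on (x, y).
fn apply_stencil(x: usize, y: usize, field: &Field, stencil: &[[f32; 3]; 3]) -> f32 {
    let mut sum = 0.0_f32;
    for (j, row) in stencil.iter().enumerate() {
        for (i, weight) in row.iter().enumerate() {
            // Offsets run from -1 to +1 around the center pixel.
            let nx = x as i64 + i as i64 - 1;
            let ny = y as i64 + j as i64 - 1;
            sum += *weight * field.at_clamped(nx, ny);
        }
    }
    sum
}
```

Calling `apply_stencil(x, y, &field, &GAUSS_3X3)` for every pixel yields the blurred image. A stencil DSL exists so that this per-pixel rule is written once, while the loop order, boundary handling, and device mapping are left to lower layers.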

Embedding of DSLs in Impala

− Separation of concerns through code refinement

− Higher-order functions

− Partial evaluation

− Triggered code generation

Application Developer

    fn main() {
        let img = load("dragon.png");
        let blurred = gaussian_blur(img);
    }

DSL Developer

    fn gaussian_blur(field: Field) -> Field {
        let stencil: Stencil = { /* ... */ };
        let mut out: Field = { /* ... */ };

        for x, y in @iterate(out) {
            out.data(x, y) = apply_stencil(x, y, field, stencil);
        }
        out
    }

Machine Expert

    fn iterate(field: Field, body: fn(int, int) -> ()) -> () {
        let grid = (field.cols, field.rows, 1);
        let block = (128, 1, 1);

        with nvvm(grid, block) {
            // global index = thread index + block size * block index
            let x = nvvm_tid_x() + nvvm_ntid_x() * nvvm_ctaid_x();
            let y = nvvm_tid_y() + nvvm_ntid_y() * nvvm_ctaid_y();
            body(x, y);
        }
    }
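
The layering above can be imitated on a CPU to show why a higher-order `iterate` separates concerns. The following is a sketch in Rust, not Impala, and it does not reproduce Impala's partial evaluation or GPU code generation. The `Field` layout, the `Iterate` alias, the two `iterate_*` variants, and the pointwise `brighten` operation are hypothetical names introduced for illustration. The blocked variant reuses the index formula from the machine-expert code (thread index + block size * block index) and adds a bounds check, which is an assumption needed because the emulated blocks may overhang the image.

```rust
/// Row-major 2D image. The field layout is an assumption for this sketch.
struct Field {
    cols: usize,
    rows: usize,
    data: Vec<f32>,
}

/// Signature shared by every iteration strategy (hypothetical alias).
type Iterate = fn(usize, usize, &mut dyn FnMut(usize, usize));

/// Machine-expert layer, variant 1: plain nested loops.
fn iterate_sequential(cols: usize, rows: usize, body: &mut dyn FnMut(usize, usize)) {
    for y in 0..rows {
        for x in 0..cols {
            body(x, y);
        }
    }
}

/// Machine-expert layer, variant 2: emulates a block-structured launch on
/// the CPU with the poster's block size of (128, 1).
fn iterate_blocked(cols: usize, rows: usize, body: &mut dyn FnMut(usize, usize)) {
    let (ntid_x, ntid_y) = (128, 1);
    // Number of blocks per dimension, rounded up to cover the whole image.
    let nctaid_x = (cols + ntid_x - 1) / ntid_x;
    let nctaid_y = (rows + ntid_y - 1) / ntid_y;
    for ctaid_y in 0..nctaid_y {
        for ctaid_x in 0..nctaid_x {
            for tid_y in 0..ntid_y {
                for tid_x in 0..ntid_x {
                    let x = tid_x + ntid_x * ctaid_x;
                    let y = tid_y + ntid_y * ctaid_y;
                    // Skip indices in a partially filled last block.
                    if x < cols && y < rows {
                        body(x, y);
                    }
                }
            }
        }
    }
}

/// DSL-developer layer: written once, independent of the iteration strategy.
fn brighten(field: &Field, offset: f32, iterate: Iterate) -> Field {
    let cols = field.cols;
    let mut out = Field { cols, rows: field.rows, data: vec![0.0; cols * field.rows] };
    iterate(cols, field.rows, &mut |x: usize, y: usize| {
        out.data[y * cols + x] = field.data[y * cols + x] + offset;
    });
    out
}
```

An application would call `brighten(&img, 1.0, iterate_sequential)` or `brighten(&img, 1.0, iterate_blocked)` and get the same result. Only the machine-expert function changes when the target changes, which is the separation the three roles describe.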
