PLOTCON NYC: Out with the CPU, In with the GPU: Why GPUs Will Replace CPUs in Data Visualization

NOVEMBER 15-18, 2016

#plotcon

Out with the CPU, In with the GPU: Why GPUs will replace CPUs in data visualization


TODD MOSTAK, FOUNDER + CEO


Source: IDC and EMC Digital Universe Report

The data explosion is just beginning

CPUs are not keeping pace

[Figure: charts of floating point operations/sec and memory bandwidth GB/sec]

GPUs solve the data problem on multiple levels

• Traditional DBs can be highly inefficient
  • each operator in SQL treated as a separate function
  • incurs tremendous overhead and prevents vectorization


• MapD compiles queries with LLVM to create one custom function
  • Queries run at speeds approaching hand-written functions
  • LLVM enables generic targeting of different architectures (GPUs, x86, ARM, etc.)
  • Code can be generated to run a query on CPU and GPU simultaneously

Query compilation with LLVM
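The difference between operator-at-a-time evaluation and a single generated function can be illustrated with a toy sketch. This is not MapD's code generator: Python's built-in compile() stands in for LLVM, and the tuple-based expression format and all function names are hypothetical.

```python
# Toy illustration only: compile() stands in for LLVM, and the tuple-based
# expression tree is a hypothetical stand-in for a SQL query plan.
import operator

OPS = {"+": operator.add, "*": operator.mul, ">": operator.gt}

def interpret(expr, row):
    """Operator-at-a-time: one function dispatch per operator, per row."""
    if isinstance(expr, tuple):
        op, left, right = expr
        return OPS[op](interpret(left, row), interpret(right, row))
    if isinstance(expr, str):
        return row[expr]      # column reference
    return expr               # literal

def to_source(expr):
    """Flatten the whole expression tree into one source string."""
    if isinstance(expr, tuple):
        op, left, right = expr
        return f"({to_source(left)} {op} {to_source(right)})"
    if isinstance(expr, str):
        return f"row[{expr!r}]"
    return repr(expr)

def compile_query(expr):
    """Generate a single fused function for the whole expression."""
    code = compile(f"lambda row: {to_source(expr)}", "<query>", "eval")
    return eval(code)

# WHERE (price * qty) + tax > 100
predicate = (">", ("+", ("*", "price", "qty"), "tax"), 100)
rows = [
    {"price": 10, "qty": 9, "tax": 5},
    {"price": 20, "qty": 5, "tax": 5},
]
fused = compile_query(predicate)
interpreted = [interpret(predicate, r) for r in rows]
compiled = [fused(r) for r in rows]
```

Both paths return the same answers; the fused version evaluates the whole predicate in one function body instead of walking the tree for every row.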

• Traditional BI tools fail when:
  • Need to visualize large volumes of grain-level data
  • Need to send GBs of data from server to client
  • Web FEs need to render large volumes of data
• MapD uses a hybrid frontend/backend approach
  • Basic charts are FE-rendered using D3 and other related toolkits
  • Data rendered on the BE reduces to compressed PNG (~100 KB)
  • Geovisualizations are composited over a FE-rendered basemap.


Rendering with GPUs
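The reason backend rendering shrinks the payload can be sketched on the CPU with the standard library alone: once points are rasterized, the bytes sent to the client depend on the image, not on the row count. The function names, the 256x256 grayscale image, and the random points are assumptions made for illustration; a real backend would rasterize on the GPU.

```python
# CPU-only sketch; names, image size and random data are illustrative assumptions.
import random
import struct
import zlib

def rasterize(points, width, height):
    """Bin (x, y) points in [0, 1) into a grayscale pixel grid (0 or 255)."""
    pixels = [bytearray(width) for _ in range(height)]
    for x, y in points:
        pixels[int(y * height)][int(x * width)] = 255
    return pixels

def _chunk(tag, data):
    """Build one PNG chunk: length, tag, data, CRC over tag + data."""
    crc = zlib.crc32(tag + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + tag + data + struct.pack(">I", crc)

def encode_png(pixels):
    """Encode 8-bit grayscale rows as a PNG byte string."""
    height, width = len(pixels), len(pixels[0])
    # IHDR: width, height, bit depth 8, color type 0 (grayscale), defaults.
    header = struct.pack(">IIBBBBB", width, height, 8, 0, 0, 0, 0)
    # Each scanline is prefixed with filter type 0 (no filtering).
    raw = b"".join(b"\x00" + bytes(row) for row in pixels)
    return (b"\x89PNG\r\n\x1a\n" + _chunk(b"IHDR", header)
            + _chunk(b"IDAT", zlib.compress(raw, 9)) + _chunk(b"IEND", b""))

random.seed(0)
points = [(random.random(), random.random()) for _ in range(200_000)]
png = encode_png(rasterize(points, 256, 256))
raw_bytes = len(points) * 2 * 4   # two float32 coordinates per point
```

Here `raw_bytes` is what the client would receive if the points were shipped as-is, while `len(png)` is bounded by the image dimensions however many rows the query returns.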


• Vega Spec (a visualization grammar)
  • A declarative JSON format for creating visualization designs
  • Defines attributes of render primitives which can be driven by data columns and mapped by scales
  • Used to describe backend visualizations
• Shader Compilation Framework
  • Templatized: supports multiple types (ints, floats, colors, etc.), and multiple continuities (discrete, continuous)

[Diagram: the frontend sends a Vega spec to the backend query-to-render pipeline and receives a PNG]

Backend Rendering Details
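What "driven by data columns and mapped by scales" means, for both continuities, can be sketched in a few lines. This is plain Python rather than shader code, and the scale constructors, column names, and attribute names are hypothetical.

```python
# Hypothetical helpers: a continuous and a discrete scale, applied to one row.
def linear_scale(domain, range_):
    """Continuous scale: interpolate a number from domain to range, clamped."""
    d0, d1 = domain
    r0, r1 = range_
    def scale(value):
        t = (value - d0) / (d1 - d0)
        t = min(1.0, max(0.0, t))     # clamp values outside the domain
        return r0 + t * (r1 - r0)
    return scale

def ordinal_scale(domain, range_, default):
    """Discrete scale: look up a category, falling back to a default."""
    table = dict(zip(domain, range_))
    return lambda value: table.get(value, default)

size = linear_scale((0, 100), (1.0, 11.0))
color = ordinal_scale(["en", "es"], ["blue", "orange"], default="gray")

# One data row drives the attributes of one render primitive.
row = {"followers": 50, "lang": "es"}
mark = {"size": size(row["followers"]), "fillColor": color(row["lang"])}
```

In a templated shader framework, one would expect the same two shapes (interpolation for continuous scales, lookup for discrete ones) to be generated per output type, such as float for size or color for fill.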


• Render-to-data operation to get a row id
  • Use an auxiliary integer buffer to store row ids per-pixel
  • Use PBOs for GPU-to-CPU transfer for caching.
  • Apply a gaussian-weighted kernel to resolve hits near boundaries
• Run a SQL query using row id as filter

[Figure: per-pixel row-id buffer]
0 0 0 0 0 0 0 0
0 0 1 0 0 2 0 0
0 1 1 1 2 2 2 0

Hit Testing
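One possible reading of the hit-testing steps, sketched on the CPU: look at the row ids around the cursor, weight each by a gaussian of its distance, and query for the id with the largest total. The grid is the row-id buffer from the slide; the 3x3 window, sigma, the `rowid` column, and the `tweets` table are assumptions.

```python
# Sketch only: window size, sigma, table and column names are assumptions.
import math

# Row-id buffer from the slide; 0 means "no mark at this pixel".
ID_BUFFER = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 2, 0, 0],
    [0, 1, 1, 1, 2, 2, 2, 0],
]

def hit_test(ids, px, py, radius=1, sigma=1.0):
    """Return the row id with the largest gaussian-weighted vote near (px, py)."""
    votes = {}
    for y in range(max(0, py - radius), min(len(ids), py + radius + 1)):
        for x in range(max(0, px - radius), min(len(ids[0]), px + radius + 1)):
            row_id = ids[y][x]
            if row_id == 0:
                continue      # empty pixel, contributes nothing
            dist2 = (x - px) ** 2 + (y - py) ** 2
            weight = math.exp(-dist2 / (2 * sigma ** 2))
            votes[row_id] = votes.get(row_id, 0.0) + weight
    return max(votes, key=votes.get) if votes else None

def row_query(table, row_id):
    """Build a parameterized query that fetches the row under the cursor."""
    return f"SELECT * FROM {table} WHERE rowid = ?", (row_id,)

# Pixel (3, 1) is empty but sits between marks 1 and 2; mark 1 dominates nearby.
hit = hit_test(ID_BUFFER, 3, 1)
sql, params = row_query("tweets", hit)
```

The weighting lets a cursor that lands just off a mark, or on the seam between two marks, resolve to the mark that covers more of the nearby pixels.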


A Vega example
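The slide's own example is not preserved in this transcript. As a stand-in, here is a hypothetical spec in the general Vega shape (width, height, data, scales, marks), built as a Python dict and serialized to JSON. The "sql" data key, the "points" mark type, the "properties" block, and the table and column names are all assumptions for illustration.

```python
# Hypothetical spec: keys beyond the general Vega shape, plus all table and
# column names, are assumptions and not taken from the slide.
import json

spec = {
    "width": 800,
    "height": 600,
    "data": [{"name": "tweets", "sql": "SELECT lon, lat, lang, rowid FROM tweets"}],
    "scales": [
        {"name": "x", "type": "linear", "domain": [-180, 180], "range": "width"},
        {"name": "y", "type": "linear", "domain": [-90, 90], "range": "height"},
        {"name": "color", "type": "ordinal", "domain": ["en", "es"],
         "range": ["blue", "orange"], "default": "gray"},
    ],
    "marks": [{
        "type": "points",
        "from": {"data": "tweets"},
        "properties": {
            "x": {"scale": "x", "field": "lon"},
            "y": {"scale": "y", "field": "lat"},
            "fillColor": {"scale": "color", "field": "lang"},
            "size": {"value": 3},
        },
    }],
}

def undefined_scales(spec):
    """Return scale names that marks reference but the spec never defines."""
    defined = {s["name"] for s in spec["scales"]}
    used = {p["scale"] for m in spec["marks"]
            for p in m["properties"].values() if "scale" in p}
    return used - defined

payload = json.dumps(spec, indent=2)   # the JSON text a frontend would send
```

Because the spec is declarative, a check like `undefined_scales` can validate it before any query or render work starts.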


• CUDA/OpenGL Interoperability API
  • Query results are written to OpenGL buffers
  • Requires no copy or reduction operations
• Multi-GPU
  • A subset of query results is rendered per GPU
  • Copy render framebuffers to a primary GPU for compositing
  • EGL: EGLImage objects used as composite buffers and shared with siblings on each GPU
  • GLX: use NV_copy_image extension
  • 3 megapixel render on 8 GPUs in 25 ms

[Diagram: GPU 0, GPU 1, ..., GPU N]

Query to render pipeline
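The multi-GPU split can be sketched on the CPU: each device renders only its share of the rows into its own framebuffer, and a primary device composites the results. Lists stand in for GPU framebuffers, and additive compositing of hit counts is an assumption chosen so that the result is easy to check against a single-device render.

```python
# CPU stand-in for the multi-GPU path; additive compositing is an assumption.
def render(points, width, height):
    """Render one device's share of the rows into its own framebuffer."""
    fb = [[0] * width for _ in range(height)]
    for x, y in points:
        fb[y][x] += 1         # accumulate a hit count per pixel
    return fb

def composite(framebuffers):
    """Combine per-device framebuffers on the primary device by summing pixels."""
    height, width = len(framebuffers[0]), len(framebuffers[0][0])
    return [[sum(fb[y][x] for fb in framebuffers) for x in range(width)]
            for y in range(height)]

def render_multi(points, width, height, num_devices):
    """Shard rows round-robin across devices, render each shard, composite."""
    shards = [points[i::num_devices] for i in range(num_devices)]
    return composite([render(shard, width, height) for shard in shards])

points = [(i % 4, (i * 7) % 3) for i in range(100)]
single = render(points, 4, 3)
multi = render_multi(points, 4, 3, num_devices=8)
```

The composited image matches the single-device render, so sharding changes where the work happens but not the picture. At the quoted 25 ms per render, such a pipeline could produce up to 40 renders per second.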



We are witnessing an inflection point in compute driven by data growth outpacing the growth in processing power.

Fast hardware needs fast software. Our GPU-centric approach gives us technical leadership + performance advantages.


Speed at Scale. We are designed for high-value problems that combine size and a requirement for speed.

Integrated analytics deliver the best performance.

Database + Dataviz + Data Science.

Some closing thoughts

Date post:	08-Jan-2017
Category:	Data & Analytics
Upload:	plotly