Date post: | 08-Jan-2017 |
Category: | Data & Analytics |
Upload: | plotly |
NOVEMBER 15-18, 2016
#plotcon
Out with the CPU, In with the GPU: Why GPUs will replace CPUs in data visualization
TODD MOSTAK, FOUNDER + CEO
Source: IDC and EMC Digital Universe Report
The data explosion is just beginning
[Chart axes: floating point operations/sec; memory bandwidth (GB/sec)]
CPUs are not keeping pace
[Chart axes: memory bandwidth (GB/sec); floating point operations/sec]
GPUs solve the data problem on multiple levels
• Traditional DBs can be highly inefficient
  • Each operator in SQL is treated as a separate function
  • This incurs tremendous overhead and prevents vectorization
• MapD compiles queries with LLVM to create one custom function
  • Queries run at speeds approaching hand-written functions
  • LLVM enables generic targeting of different architectures (GPUs, x86, ARM, etc.)
  • Code can be generated to run a query on CPU and GPU simultaneously
Query compilation with LLVM
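The contrast between per-operator evaluation and a single compiled function can be sketched in a few lines. This is only an analogy: Python source generation stands in for LLVM code generation, and the `(column, operator, constant)` condition format and all function names are assumptions made for illustration, not MapD's API.

```python
def interpret_count(rows, predicates):
    """Operator-at-a-time: each predicate is a separate function call per row."""
    count = 0
    for row in rows:
        if all(pred(row) for pred in predicates):
            count += 1
    return count


def compile_count(conditions):
    """Fuse all conditions into one generated function, compiled once per query.

    `conditions` is a list of (column, operator, constant) triples
    (a hypothetical format chosen for this sketch).
    """
    allowed_ops = {"<", "<=", ">", ">=", "==", "!="}
    parts = []
    for column, op, constant in conditions:
        if op not in allowed_ops:
            raise ValueError(f"unsupported operator: {op}")
        parts.append(f"row[{column!r}] {op} {constant!r}")
    predicate = " and ".join(parts) or "True"
    source = (
        "def fused(rows):\n"
        "    count = 0\n"
        "    for row in rows:\n"
        f"        if {predicate}:\n"
        "            count += 1\n"
        "    return count\n"
    )
    namespace = {}
    exec(compile(source, "<query>", "exec"), namespace)
    return namespace["fused"]


# Toy table: 100 rows with two numeric columns.
rows = [{"delay": d, "dist": d * 10} for d in range(100)]
conditions = [("delay", ">", 10), ("dist", "<", 500)]
predicates = [lambda r: r["delay"] > 10, lambda r: r["dist"] < 500]

fused = compile_count(conditions)
fused_result = fused(rows)
interpreted_result = interpret_count(rows, predicates)
```

Both paths return the same count; the fused version evaluates the whole WHERE clause inline in one loop body, which is the property the slide attributes to compiled queries.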
• Traditional BI tools fail when:
  • Need to visualize large volumes of grain-level data
  • Need to send GBs of data from server to client
  • Web FEs need to render large volumes of data
• MapD uses a hybrid frontend/backend approach
  • Basic charts are FE-rendered using D3 and other related toolkits
  • Data rendered on the BE reduces to compressed PNG (~100 KB)
  • Geovisualizations are composited over a FE-rendered basemap
Rendering with GPUs
• Vega Spec (a visualization grammar)
  • A declarative JSON format for creating visualization designs
  • Defines attributes of render primitives which can be driven by data columns and mapped by scales
  • Used to describe backend visualizations
• Shader Compilation Framework
  • Templatized: supports multiple types (ints, floats, colors, etc.), and multiple continuities (discrete, continuous)
[Diagram labels: Frontend, Vega, PNG, Backend, Query-to-Render]
Backend Rendering Details
• Render-to-data operation to get a row id
  • Use an auxiliary integer buffer to store row ids per-pixel
  • Use PBOs for GPU-to-CPU transfer for caching
  • Apply a Gaussian-weighted kernel to resolve hits near boundaries
• Run a SQL query using row id as filter
[Figure: per-pixel row id buffer]
0 0 0 0 0 0 0 0
0 0 1 0 0 2 0 0
0 1 1 1 2 2 2 0
Hit Testing
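A minimal CPU-side sketch of the hit-testing steps above, assuming the row id buffer has already been copied back from the GPU. The use of 0 as the "no row" value, the kernel radius and sigma, the table name `trips`, and the `rowid` column name are all assumptions for illustration.

```python
import math


def resolve_hit(id_buffer, x, y, radius=1, sigma=1.0):
    """Return the row id with the largest Gaussian-weighted vote around (x, y).

    Pixels holding 0 are treated as empty (an assumption of this sketch).
    Returns None when no row id lies within the kernel window.
    """
    height, width = len(id_buffer), len(id_buffer[0])
    votes = {}
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            px, py = x + dx, y + dy
            if 0 <= px < width and 0 <= py < height:
                row_id = id_buffer[py][px]
                if row_id != 0:
                    weight = math.exp(-(dx * dx + dy * dy) / (2 * sigma * sigma))
                    votes[row_id] = votes.get(row_id, 0.0) + weight
    if not votes:
        return None
    return max(votes, key=votes.get)


def hit_query(row_id, table="trips"):
    """Build the follow-up SQL query that filters on the resolved row id."""
    return f"SELECT * FROM {table} WHERE rowid = {int(row_id)}"


# Row id buffer matching the three readable rows of the slide's figure.
id_buffer = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 2, 0, 0],
    [0, 1, 1, 1, 2, 2, 2, 0],
]

direct_hit = resolve_hit(id_buffer, 2, 1)    # cursor directly on row 1
boundary_hit = resolve_hit(id_buffer, 4, 1)  # empty pixel between rows 1 and 2
miss = resolve_hit(id_buffer, 0, 0)          # nothing nearby
query = hit_query(boundary_hit)
```

The weighted vote matters at `(4, 1)`: the pixel itself is empty, but nearby pixels belonging to row 2 outweigh the single diagonal pixel of row 1, so the hit resolves to row 2.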
A Vega example
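The slide's own example did not survive extraction. As a stand-in, here is a hypothetical spec in the general shape of the Vega grammar (data, scales, marks), built as a Python dict and serialized to JSON. The table, column names, colors, and exact keys are assumptions and may not match the schema the MapD backend accepts.

```python
import json

# Hypothetical Vega-style spec: one SQL-backed data source, three scales,
# and one point mark whose attributes are driven by data columns.
spec = {
    "width": 800,
    "height": 600,
    "data": [
        {"name": "points", "sql": "SELECT lon AS x, lat AS y, amount FROM trips"}
    ],
    "scales": [
        {"name": "x", "type": "linear", "domain": [-180, 180], "range": "width"},
        {"name": "y", "type": "linear", "domain": [-90, 90], "range": "height"},
        {
            "name": "color",
            "type": "quantize",
            "domain": [0, 100],
            "range": ["#115f9a", "#76c68f", "#d0f400"],
        },
    ],
    "marks": [
        {
            "type": "points",
            "from": {"data": "points"},
            "properties": {
                "x": {"scale": "x", "field": "x"},
                "y": {"scale": "y", "field": "y"},
                "fillColor": {"scale": "color", "field": "amount"},
                "size": 3,
            },
        }
    ],
}


def undefined_scales(spec):
    """Return scale names that marks reference but the spec never defines."""
    defined = {scale["name"] for scale in spec["scales"]}
    used = {
        prop["scale"]
        for mark in spec["marks"]
        for prop in mark["properties"].values()
        if isinstance(prop, dict) and "scale" in prop
    }
    return used - defined


spec_json = json.dumps(spec, indent=2)
```

Each mark property either takes a constant (`size`) or names a data column and the scale that maps it, which is the "driven by data columns and mapped by scales" idea from the backend rendering slide.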
• CUDA/OpenGL Interoperability API
  • Query results are written to OpenGL buffers
  • Requires no copy or reduction operations
• Multi-GPU
  • A subset of query results is rendered per GPU
  • Copy render framebuffers to a primary GPU for compositing
  • EGL: EGLImage objects used as composite buffers and shared with siblings on each GPU
  • GLX: use NV_copy_image extension
  • 3 megapixel render on 8 GPUs in 25 ms
[Diagram labels: GPU 0, GPU 1, GPU N]
Query to render pipeline
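The multi-GPU flow (render a subset of results per GPU, then composite on a primary GPU) can be mimicked on the CPU. In this sketch plain lists stand in for framebuffers, and additive blending of per-pixel counts is an assumption; the slides do not state which blend operation is used.

```python
def render(points, width, height):
    """Render one shard: count the points that land on each pixel."""
    framebuffer = [[0] * width for _ in range(height)]
    for x, y in points:
        if 0 <= x < width and 0 <= y < height:
            framebuffer[y][x] += 1
    return framebuffer


def composite(framebuffers):
    """Combine per-device framebuffers on the 'primary' by summing pixels."""
    height, width = len(framebuffers[0]), len(framebuffers[0][0])
    out = [[0] * width for _ in range(height)]
    for framebuffer in framebuffers:
        for y in range(height):
            for x in range(width):
                out[y][x] += framebuffer[y][x]
    return out


def render_multi(points, width, height, num_gpus):
    """Split the query results across devices, render each, then composite."""
    shards = [points[i::num_gpus] for i in range(num_gpus)]
    return composite([render(shard, width, height) for shard in shards])


# Toy query result: (x, y) pixel coordinates, one of them off-screen.
points = [(0, 0), (1, 0), (1, 0), (3, 2), (2, 1), (3, 2), (9, 9)]
single = render(points, 4, 3)
multi = render_multi(points, 4, 3, num_gpus=3)
```

Because the blend is associative and commutative, the composited result matches a single-device render regardless of how rows are sharded.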
Confidential & Proprietary
We are witnessing an inflection point in compute driven by data growth outpacing the growth in processing power.
Fast hardware needs fast software. Our GPU-centric approach gives us technical leadership + performance advantages.
Speed at Scale. We are designed for high-value problems that combine size and a requirement for speed.
Integrated analytics deliver the best performance.
Database + Dataviz + Data Science.
Some closing thoughts