© Copyright 2018 Xilinx
Video Acceleration in the Cloud Using FPGAs
Presented by
Johan Janssen, Chief Video Architect
Sean Gardner, Marketing Mgr., Cloud Video
The market is recognizing us and our technology…
Dan Rayburn is a key industry analyst and veteran
https://www.streamingmediablog.com/2018/07/fixing-the-ott-problem.html
Live streaming seeing explosive growth
[Chart: Live Streaming Market Size ($B), annual revenue in USD, 2016 vs. 2021]
Markets & Markets report
Live video demands a new approach
• Video will be 82% of all internet traffic by 2021 - Cisco VNI report
• Video will be 73% of all wireless traffic by 2023 - Ericsson Mobility report
[Chart: Increase in pixels for SD (480), HD (720), FHD (1080), UHD (4k), 8k]
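The increase in pixels can be reproduced with a few lines of arithmetic. The sketch below assumes the usual frame sizes behind each label (720x480, 1280x720, 1920x1080, 3840x2160, 7680x4320); the slide only names the resolutions, so these dimensions are assumptions rather than values taken from the deck.

```python
# Frame sizes are assumed; the slide gives only the resolution labels.
RESOLUTIONS = {
    "SD (480)": (720, 480),
    "HD (720)": (1280, 720),
    "FHD (1080)": (1920, 1080),
    "UHD (4k)": (3840, 2160),
    "8k": (7680, 4320),
}


def pixel_increase(resolutions, baseline="SD (480)"):
    """Return each resolution's pixel count as a multiple of the baseline."""
    base_w, base_h = resolutions[baseline]
    base_pixels = base_w * base_h
    return {name: (w * h) / base_pixels for name, (w, h) in resolutions.items()}


if __name__ == "__main__":
    for name, ratio in pixel_increase(RESOLUTIONS).items():
        print(f"{name:12s} {ratio:6.1f}x the pixels of SD")
```

Under these assumed frame sizes, 8k carries 96 times the pixels of SD, which is consistent with the 96 x resolution growth quoted later in the deck.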
New codecs help on one hand…
[Chart: 1080p60 bitrate (Mbps) and codec complexity for MPEG2, H.264, HEVC / VP9, AV1. Callouts: 8 x less bandwidth; codec complexity]
CPUs are too slow for live video…
Compression efficiency customers would like
What they live with for real-time applications
Intel performance scaling not keeping up
Slide taken from Intel XDF presentation
Description                   Growth in Compute
Live video growth             2 x
Resolution growth             96 x
Growth in codec complexity    125 x
Total growth in compute       223 x
The “Pareto Principle” of live video distribution
20% of video streams generate 80% of streaming traffic
Less popular streams (80% of streams but 20% of bandwidth)
[Diagram labels: Most Popular/Viewed Streams; OPEX HEAVY; CAPEX HEAVY; Lowest Bandwidth; Highest Density]
Together Xilinx & NGCodec can save on streaming costs
[Chart: PSNR (dB) vs. bitrate (Mbps) for x264 medium and N265. Callout: 45%]
Hardened solutions come at a cost
Encoder saves significant OPEX costs for bandwidth on egress traffic
[Chart: PSNR vs. bitrate (kbps) for NVenc HEVC (competitor) and NGCodec HEVC. Callout: 35% lower bitrate for same quality]
Measured June 2018 SDK
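To make the OPEX argument concrete, the sketch below turns a bitrate reduction into delivered egress volume. Only the 35% figure comes from the chart; the stream bitrate, audience size, and duration are hypothetical inputs chosen for illustration, and no price per gigabyte is assumed.

```python
def egress_gigabytes(bitrate_kbps, viewers, hours):
    """Data delivered to all viewers, in gigabytes (1 GB = 1e9 bytes)."""
    bits = bitrate_kbps * 1000 * viewers * hours * 3600
    return bits / 8 / 1e9


def egress_saving(bitrate_kbps, viewers, hours, bitrate_reduction=0.35):
    """Return (baseline_gb, reduced_gb, saved_gb) for a fractional bitrate cut."""
    baseline = egress_gigabytes(bitrate_kbps, viewers, hours)
    reduced = egress_gigabytes(bitrate_kbps * (1 - bitrate_reduction), viewers, hours)
    return baseline, reduced, baseline - reduced


if __name__ == "__main__":
    # Hypothetical event: one 4000 kbps stream, 1000 viewers, 1 hour.
    baseline, reduced, saved = egress_saving(4000, viewers=1000, hours=1)
    print(f"baseline {baseline:.0f} GB, reduced {reduced:.0f} GB, saved {saved:.0f} GB")
```

For this hypothetical event the egress volume drops from 1,800 GB to 1,170 GB. The saving scales linearly with audience size and duration, which is why the most viewed streams benefit most from a more efficient encoder.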
FPGA HEVC encode vs. x265 encoding configurations
http://x265.ru/wp-content/uploads/2014/12/diff-presets.png
[Chart: frames per second by x265 preset. Callout: commonly used x265 preset for encoder benchmarking]
HEVC encode comparison
Device                           1080p fps @ x265-slow preset   Power    Performance/W   Performance/W improvement   Available
Dual socket E5-2680 v3 2.5GHz    10 fps                         240 W    0.042 fps/W
Single socket VU9p               120 fps                        <40 W    3.0 fps/W       72x                         Dec 2018
Dual socket VU9p                 240 fps                        <80 W    3.0 fps/W       72x                         Dec 2019
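The performance-per-watt column and the 72x improvement follow directly from the fps and power columns. The sketch below recomputes them; it treats the "<40 W" and "<80 W" entries as exactly 40 W and 80 W, which is an assumption that makes the FPGA results lower bounds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Platform:
    name: str
    fps: float    # 1080p frames per second at the x265-slow quality point
    watts: float  # power draw; FPGA rows use the "<" bound as the value

    @property
    def fps_per_watt(self) -> float:
        return self.fps / self.watts


CPU = Platform("Dual socket E5-2680 v3 2.5GHz", fps=10, watts=240)
SINGLE = Platform("Single socket VU9p", fps=120, watts=40)
DUAL = Platform("Dual socket VU9p", fps=240, watts=80)


def improvement(platform, baseline=CPU):
    """Performance/W of a platform as a multiple of the baseline's."""
    return platform.fps_per_watt / baseline.fps_per_watt


if __name__ == "__main__":
    for p in (CPU, SINGLE, DUAL):
        print(f"{p.name:32s} {p.fps_per_watt:6.3f} fps/W  {improvement(p):5.1f}x")
```

Because the FPGA power figures are upper bounds, the real fps/W and improvement factor can only be equal to or better than the computed values.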
About FFmpeg
˃ FFmpeg is the leading multimedia framework, able to decode, encode, transcode, mux, demux, stream, filter and play pretty much anything that humans and machines have created
˃ It supports the most obscure ancient formats up to the cutting edge
˃ Highly portable: FFmpeg compiles and runs across Linux, Mac OS X, Microsoft Windows, under a wide variety of build environments, machine architectures, and configurations
˃ Source: https://www.ffmpeg.org/about.html
Integration into FFmpeg framework: building the ecosystem
[Diagram: FFmpeg integration stack]
Customer Application
FFmpeg (video codecs, scalers, compositing etc.)
FFmpeg plugins: FPGA HEVC encode plugin, FPGA h.264 encode plugin, FPGA VP9 encode plugin, Xilinx h.264 decode plugin, Xilinx ABR Scaler plugin, Xilinx Yolo plugin
XMA (XRT, Partial Reconfiguration, Video IP)
x86 Server
SDAccel Target Board
FPGA
Video Transcoding ABR Example (Single VU9p)
[Diagram: FFmpeg pipeline on a Xilinx Alveo PCIe card]
Input: RTSP/RTMP Live Stream
Labels: RTSP Client, Demux Aud/Vid, Decode, ABR Scaler, 2 x HEVC or VP9 1080p60 Encoder, Encoder (low frame rate), Mux Aud/Vid, Optional
Outputs: RTSP/RTMP to Media Server; RTSP/RTMP Stream to Inference
Video Transcoding ABR Example (Single VU9p)
[Diagram: FFmpeg pipeline on a Xilinx Alveo PCIe card]
Input: RTSP/RTMP Live Stream
Labels: RTSP Client, Demux Aud/Vid, Decode, ABR Scaler, HEVC or VP9 Encoder (multiple instances), 10 x H.264 1080p60 Encoder, Encoder (low frame rate), Mux Aud/Vid, Optional
Outputs: RTSP/RTMP to Media Server; RTSP/RTMP Stream to Inference
“Edge-to-Cloud Video Analytics”
[Diagram: Video Inference Streaming Server on a Xilinx Alveo PCIe card, in the cloud]
Inputs: RTSP/RTMP Live Stream; File
Labels: RTSP/RTMP Client, Demux Aud/Vid, h.264 Decode (1080p30), Inference (xDNN), Post-Process, Overlay, Metadata, Encode (Optional; h.264 or HEVC or VP9), Database (mysql/postgresql), HTTP Server
Outputs: RTSP/RTMP to Media Server; file
Integrating accelerators into FFmpeg framework
$ ffmpeg \
    -f rawvideo -pix_fmt yuv420p -s:v 1920x1080 -r 30 -an -i /home/ffmpeg/VU9P/TestSequences/Kimono1_1920x1080_24.yuv \
    -frames 240 -b:v 4000k -g 30 -c:v xlnx_HEVC_enc -f h265 -y ./hw_outdir/out1_br4000k.h264
$ ffmpeg \
    -f rawvideo -pix_fmt yuv420p -s:v 1920x1080 -r 30 -an -i /home/ffmpeg/VU9P/TestSequences/Kimono1_1920x1080_24.yuv \
    -frames 240 -c:v libx264 -preset medium -profile:v high -crf 23 -bf 4 -refs 3 -g 30 -b:v 4000k -maxrate 4000k -bufsize 8000k -f h264 -r 30 -y ./sw_outdir/x264_medium_out0_br4000k.h264
https://trac.ffmpeg.org/wiki/EncodingForStreamingSites
$ ffmpeg \
    -f rawvideo -pix_fmt yuv420p -s:v 1920x1080 -r 30 -an -i /home/ffmpeg/VU9P/TestSequences/Kimono1_1920x1080_24.yuv \
    -frames 240 -b:v 4000k -g 30 -c:v xlnx_h264_enc-hq -f h264 -y ./hw_outdir/out0_br4000k.h264
Change 20 characters to get acceleration
XMA Architecture
˃ Key features
Video domain specific interfaces with seamless integration with FFmpeg
Low-level plugin can be reused with any media framework
Supports multiple processes sharing different kernels on the same device
Supports multiple channels on a single kernel
Ensures a kernel resource is reserved for the lifetime of a video session
[Diagram: XMA architecture]
Host: Custom Applications; Media Framework (FFmpeg, GStreamer, Other); Xilinx Media Accelerator (XMA); Xilinx Runtime (XRT); OpenCL
Accelerator: Decoder; ABR Scaler; Encoder (Session-1, Session-2, Session-N)
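The last two key features, multiple channels on a single kernel and a resource reserved for the lifetime of a session, can be illustrated with a small model. The sketch below is not the XMA API: the `Kernel` and `Device` classes and the `session` method are hypothetical names, and the model runs in a single process, so it does not cover sharing across processes.

```python
from contextlib import contextmanager


class Kernel:
    """Hypothetical accelerator kernel with a fixed number of channels."""

    def __init__(self, name, channels):
        self.name = name
        self.free = set(range(channels))


class Device:
    """Hypothetical device that hands out kernel channels to sessions."""

    def __init__(self, kernels):
        self.kernels = list(kernels)

    @contextmanager
    def session(self, kernel_name):
        # Reserve the lowest free channel on the first matching kernel.
        for kernel in self.kernels:
            if kernel.name == kernel_name and kernel.free:
                channel = min(kernel.free)
                kernel.free.remove(channel)
                break
        else:
            raise RuntimeError(f"no free channel on any {kernel_name!r} kernel")
        try:
            # The channel stays reserved for as long as the session is open.
            yield kernel, channel
        finally:
            kernel.free.add(channel)


if __name__ == "__main__":
    device = Device([Kernel("encoder", channels=2)])
    with device.session("encoder") as (_, first):
        with device.session("encoder") as (_, second):
            print("two sessions share one kernel on channels", first, second)
```

Tying the reservation to a context manager means the channel is released even if the video session ends with an error, which mirrors the lifetime guarantee described in the feature list.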
Video IP Offering (each IP has FFmpeg plugin)
All throughput numbers are based on VU9P
Xilinx Alveo ABR video transcoding solution
https://www.xilinx.com/products/boards-and-kits/alveo/applications/.html