ARM Multimedia Technology Update
ARM Mali®-T880: Delivering Stunning Visual Mobile Experience
ARM Tech Forum, June 2015
Nizar Romdan
Director of Ecosystem
Media Processing Group
Anything, Anywhere, Anytime
"Arsenal vs Chelsea" by Wonker at Flickr. Licensed under CC BY 2.0 via Wikimedia Commons.
Multi-casting UHD and 4K video streams
Console-quality gaming experience
Epic Games using Enlighten Technology from Geomerics, an ARM Company
Expanding Complexity of Content
• Immersive Gaming: 90% of Google Play revenue is from mobile games [1]
• Media Capture and Sharing: 53% of mobile data was video in 2013 [2]
• User Interfaces: 73% of mobile device resolutions are HD or more [3]
• Computer Vision & Image Processing: 69% mobile video CAGR, 2013-2018 [4]
Sources:
[1] AppAnnie report
[2] Cisco VNI report
[3] Android Developer Dashboards
[4] Cisco VNI report
Complexity of Device Requirements
Balancing performance, power and price to address consumer needs
• Higher Performance: handle more complex content and higher resolutions
• Higher Energy Efficiency: deliver extra performance in the same thermal budget
• Higher Cost Efficiency: bring high-quality, affordable technology to the mainstream
• Faster Time to Market: remain competitive in a fast-paced market
Delivery Requires Innovation Throughout the SoC and Device
• Battery life
• Cost
• Slimmer and sleeker form factors
• Performance
• Thermal design
• Features
• Capabilities
ARM IP Suite Delivering 2016 Premium Mobile Experiences
• Enabling innovation on the most advanced process nodes
• Highest performance ARM Cortex CPU
• Next generation ARM CoreLink Cache Coherent Interconnect
• Premium configurations of Mali video and display processors
• Complements ARM big.LITTLE, enhancing energy efficiency
• Highest performance and most energy-efficient ARM Mali GPU
ARM® Mali™ Success
• 550M Mali-based GPUs shipped in 2014
• Total: 114 licenses, 73 licensees
• 25 new Mali licenses in FY14
• Mali is in 75% of DTVs, >50% of Android tablets and >35% of smartphones
[Chart: Mali GPU shipments in million units, 2011 to 2014 plus YTD; axis 0 to 600; on track for 600-700M]
Mali GPU Roadmap
Energy Efficient GPU Roadmap
Mali™-T604
• First OpenGL® ES 3.1 and OpenCL Full Profile mobile GPU
Mali-T624
• 50% performance uplift
• Full ASTC support
Mali-T628
• Extending scalability to 8 cores
Mali-T760
• Increased SoC energy efficiency
• Scalability to 16 cores
Mali-T860 & T880
• Energy efficiency and performance gains
• UI performance uplift
Cost Efficient GPU Roadmap
Mali-400 MP
• First OpenGL ES 2 multi-core GPU
• Leading area efficiency
Mali-450 MP
• Double the performance of Mali-400 MP
Mali-T622
• Enabling Full Profile Compute and OpenGL ES 3.1 in mid-range
Mali-T720
• Optimised area efficiency and decreased cost and time-to-market
Mali-T820 & T830
• Performance density increases
• Bandwidth efficiency with AFBC and other features
Mali-T880 & Mali-T860 Overview
Mali-T880/T860 builds on the technical advances of Mali-T760
• Significant improvements in energy efficiency: up to 40% improvement over Mali-T760
• Incorporates proven bandwidth reduction technologies: ASTC, AFBC, Smart Composition, Transaction Elimination (TE)
Increased performance
• Improved performance across real 3D, casual 2D and UI use cases
• Increased arithmetic performance for compute-heavy applications
Support for high-fidelity content for 4K and beyond
• Native 10-bit YUV input and output support
• Gamma-correct filtering for maximum color fidelity
• High-performance, full-speed implementation: no shader conversion
ARM® Mali™-T880: New Horizon for Mobile Visuals
1.8x the performance of 2014 Mali-T760-based devices
• Delivering advanced gaming and console-like experiences
Most energy-efficient Mali GPU
• 40% energy reduction for same workloads
Enabling superior system-level benefits
• Complemented by Mali-V550 video and Mali-DP550 display processors
• TrustZone secure video path for premium 4K content
Improved Performance for High-End and Casual content
Quad prioritization
• 2D blitting +40%
• 2D-like content +12%
• Also benefits advanced content +3-5%
3x ALUs
• Increased arithmetic performance
• Up to 33% Manhattan uplift from HW
• Up to 50% peak compute performance increase
Improved Early Z test throughput
• Eliminating work earlier in the pipeline
Cache size tuning, improved thread management
• Increased size of L1 and L0 instruction caches
[Chart: Performance Improvement vs. Mali-T760, axis 10% to 40%. Workloads: Quaternion Rotation, CLBenchmark, User Interface, Casual Gaming, 2D Blit, BasemarkX, GFXBench T-Rex, GFXBench Manhattan]
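The "up to 50% peak compute" figure is consistent with simple ALU-count arithmetic. The sketch below assumes that "3x ALUs" means three arithmetic pipelines per core and that the Mali-T760 baseline has two per core; the baseline count is not stated on the slide and is an assumption here, as is the function name.

```python
def peak_compute_uplift(old_alus_per_core: int, new_alus_per_core: int) -> float:
    """Fractional peak-compute uplift from ALU count alone, at equal clocks."""
    if old_alus_per_core <= 0 or new_alus_per_core <= 0:
        raise ValueError("ALU counts must be positive")
    return new_alus_per_core / old_alus_per_core - 1.0


# Assumed baseline: 2 ALUs per core; new design: 3 ALUs per core.
uplift = peak_compute_uplift(2, 3)
print(f"{uplift:.0%}")  # prints "50%"
```

Benchmark uplifts such as the Manhattan figure are lower than the peak because real workloads are not purely ALU-bound.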
Important Mali Design Values
Tile-Based
Tile-based rendering gives the best start for an energy-efficient GPU architecture. But more than a technology, it's a way of thinking: keep working data close and bandwidth to a minimum.
Multi-Core
Building multi-core capability into the core of the architecture allows ultimate scalability and partner flexibility. A valuable and unrivaled extra dimension.
Real Performance
Paper specs are far less important than real-world performance. Delivering sustained performance in real devices with real constraints is what matters.
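The bandwidth argument for tile-based rendering can be illustrated with a deliberately simplified model. The sketch below assumes a 16x16 pixel tile, a 32-bit colour format and an overdraw factor of 3; none of these values come from the slides. It also treats a non-tiled renderer as writing every shaded fragment to external memory, which ignores caches and is only a rough comparison.

```python
import math


def tile_count(width: int, height: int, tile: int = 16) -> int:
    """Number of tiles needed to cover a width x height framebuffer."""
    return math.ceil(width / tile) * math.ceil(height / tile)


def framebuffer_write_bytes(width: int, height: int, bytes_per_pixel: int,
                            overdraw: float, tile_based: bool) -> int:
    """External-memory colour writes per frame under the simplified model.

    Tile-based: each pixel is resolved on chip and written out once.
    Non-tiled:  each shaded fragment is written out (overdraw times per pixel).
    """
    writes_per_pixel = 1.0 if tile_based else overdraw
    return int(width * height * bytes_per_pixel * writes_per_pixel)


tiles = tile_count(1920, 1080)
direct = framebuffer_write_bytes(1920, 1080, 4, 3.0, tile_based=False)
tiled = framebuffer_write_bytes(1920, 1080, 4, 3.0, tile_based=True)
print(tiles)                      # prints 8160
print(f"{direct / 1e6:.1f} MB")   # prints "24.9 MB"
print(f"{tiled / 1e6:.1f} MB")    # prints "8.3 MB"
```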
Mali Scalability – True Partner Value
Mali is the only GPU IP with true scalability
• Multi-core from the ground up allows partner configurability
• A Mali MP product covers a full roadmap of competing monolithic products
Unrivaled ability to match to markets
• Perfect for hitting the right product position
• Fine-tune the balance of PPA and cost
• Ranging from super-efficient performance champions…
• … to cost-optimized mass-market products
Core hardening can be reused in all configs
• Use hardening resources efficiently
• Reduce time-to-market
Real World Performance
Mali prioritizes real performance over specs
A key example is peak architectural fill-rate
• I.e. pixel throughput per clock
Nominal fill-rate has not changed
• Same for Mali-400 as for the latest, Mali-T880
• One pixel per clock per core from start to end
Effective actual fill-rate is close to double
• A true benefit for UI, browsing and casual uses
Achieved through
• Innovation focus on real applications and use cases
• Architectural tuning for real SoC environments
[Chart: Mali-400 vs. Mali-T880]
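The nominal figure of one pixel per clock per core translates into a fill-rate as sketched below. The core count (4) and clock (600 MHz) are hypothetical values chosen for illustration, not a configuration from the slides, and the result is the nominal peak, not the effective fill-rate the slide discusses.

```python
def nominal_fill_rate(cores: int, clock_hz: int,
                      pixels_per_clock_per_core: int = 1) -> int:
    """Nominal peak fill-rate in pixels per second."""
    return cores * clock_hz * pixels_per_clock_per_core


def full_screen_layers(fill_rate: int, width: int, height: int, fps: int) -> float:
    """How many full-screen layers per frame the nominal fill-rate covers."""
    return fill_rate / (width * height * fps)


# Hypothetical configuration: 4 cores at 600 MHz.
rate = nominal_fill_rate(cores=4, clock_hz=600_000_000)
layers = full_screen_layers(rate, 1920, 1080, 60)
print(f"{rate / 1e9:.1f} Gpix/s")  # prints "2.4 Gpix/s"
print(f"{layers:.1f} layers")      # prints "19.3 layers"
```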
Real World Performance
Another example is peak architectural GFLOPS
• Comparing Mali-T604 and Mali-T760
Nominal GFLOPS has not changed
• FP32: 34 ops per clock per core for both Mali GPUs
Effective performance more than doubled
Achieved through
• Innovation focus on real applications and use cases
• Architectural tuning for real SoC environments
[Chart: Mali-T604 vs. Mali-T760]
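The nominal GFLOPS figure follows directly from the per-core number above. In the sketch below, the 34 FP32 ops per clock per core comes from the slide, while the core count and clock frequency are hypothetical example values.

```python
def peak_gflops(ops_per_clock_per_core: int, cores: int, clock_mhz: float) -> float:
    """Nominal peak FP32 GFLOPS = ops/clock/core * cores * clock (GHz)."""
    return ops_per_clock_per_core * cores * clock_mhz / 1000.0


# 34 ops per clock per core (from the slide); 4 cores at 600 MHz is assumed.
gflops = peak_gflops(34, cores=4, clock_mhz=600)
print(f"{gflops:.1f} GFLOPS")  # prints "81.6 GFLOPS"
```

Because the nominal number is identical for both GPUs at equal cores and clock, any measured difference between them reflects utilisation rather than peak capability.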
Console Class Gaming Right in your Hands
• ARM acquired Geomerics in 2013
• Enlighten: physically based dynamic lighting
• Enlighten reproduces console experiences on all platforms
• ARM and Enlighten: powerful ecosystems
• Available in Unity 5.0
Mali Video and Display Roadmap
Video Processor Roadmap
Mali-V500
• Scalable to 4K120 encode/decode
• ARM® Frame Buffer Compression (AFBC)
• TrustZone™ technology enabled
Mali-V550
• 10-bit HEVC decode
• HEVC encode
• Motion elimination
Display Processor Roadmap
Mali-DP500
• Up to 4K2K display
• Secure display layer
• AFBC
Mali-DP550
• 7-layer composition
• Co-processor interface
• Motion elimination
Mali-V550 Overview
First video IP with multi-standard codec including HEVC (H.265) for both encode and decode in a single core
• Optimal area solution enables HEVC in the mainstream
Scalable performance and multi-stream video
• 1080p60 on a single core up to 4K120 on 8 cores
• Multiple simultaneous encode/decode streams
• 10-bit YUV support
Optimized for lowest power
• More than 50% system bandwidth saving with AFBC
• Up to 35% extra system bandwidth saving for WiFi Display/Miracast with Motion Search Elimination
Robust to bus latency
• Design lower-cost memory system without dropping frames
Supported codecs: HEVC, H.264, MPEG4, VP8, VC-1, Real, H.263, MPEG2, JPEG
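The scaling claim of 1080p60 on one core up to 4K120 on eight cores can be checked with pixel-rate arithmetic. This sketch assumes "4K" means 3840x2160 and that throughput scales linearly with core count; both are assumptions for illustration, and the function names are hypothetical.

```python
import math


def pixel_rate(width: int, height: int, fps: int) -> int:
    """Pixels processed per second for a given video mode."""
    return width * height * fps


def cores_needed(target_rate: int, per_core_rate: int) -> int:
    """Cores required, assuming linear scaling across cores."""
    return math.ceil(target_rate / per_core_rate)


per_core = pixel_rate(1920, 1080, 60)   # 1080p60 on a single core
target = pixel_rate(3840, 2160, 120)    # 4K120, assuming UHD resolution
print(target / per_core)                # prints 8.0
print(cores_needed(target, per_core))   # prints 8
```

4K doubles the width and the height and 120 fps doubles the frame rate, giving 2 x 2 x 2 = 8 times the pixel rate of 1080p60.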
Mali-DP550 Overview
Energy-efficient processing all the way to the glass
Low power and low memory bandwidth usage
• Composition, rotation, scaling, post-processing and display output in a single pass
• Supports system-wide AFBC and Motion Search Elimination
• Efficient composition of up to 7 layers
• Scales to 4K resolution
Enable 3rd party display differentiation
• Co-processor interface enables 3rd party / SoC vendor IP in Mali display pipeline
Feature-rich IP
• Wide range of pre-processing and post-processing features
• Wide range of YUV/RGB pixel formats including 10-bit YUV
• Single or dual display output
• Compatible with all major display standards
ARM Mali and Future Trends
Increase value to customers
Move the limits of architecture
Drive innovation in standards
Resolution race continues, but is slowing down
• Peak around 4K for most handheld devices
Better fidelity must come from better pixels
• Increased use of complex shading algorithms
• Growing complexity in geometry
More innovation in middleware and apps
• Requires open but robust APIs and GPUs
Real-time graphics will continue to rely on tricks
• Every “correct” effect can be faked more cheaply
• The cheaper the effects, the more can be used
• More effects look better
Developer Outreach Activities
Presentations, demos and workshops at developer events
• Expected to be present at around 20 developer events in 2015
Topics:
• Best practices and optimization techniques for ARM Mali graphics (targeting game developers)
• ARM tools: Mali Graphics Debugger (MGD), DS-5 Streamline, developer resources
• GPU acceleration for UI and browser (targeting the broadcast industry and HTML5 developers)
• Computer vision workshop at EVA summit (targeting ARM partners and OpenCL / RS developers)
Mali Developer Center
A full range of resources on one easy-access portal (EN & CN):
Tools
SDKs
Drivers
Tutorials & Developer Guides
Sample Code
http://malideveloper.arm.com
http://malideveloper.arm.com/cn
Mali GPU Software Tools
Performance Analysis, Debug, and Software Development
Mali Graphics Debugger
• API trace & debug
• OpenGL® ES, OpenCL™
• Debug and improve performance at frame level
ARM DS-5 Streamline
• Mali GPU: timeline, HW counters, OpenCL visualizer
Mali Offline Compiler
• Analyze shader performance
• Generates binary shaders
• Command line tool
OpenGL ES Emulator
• Emulate OpenGL ES 2.0 and 3.0
• Windows and Linux
• Khronos Conformant
Texture Compression Tool
• Command line and GUI
• ETC, ETC2, ASTC
• 3D textures
Third Party Tools
• Integration with partners’ tools