
LCE13: Android Graphics Upstreaming

Date post: 13-Jun-2015
Upload: linaro
Description:
Resource: LCE13
Name: Android Graphics Upstreaming
Date: 09-07-2013
Speaker:
Video: https://www.youtube.com/watch?v=DA3fmzUC-Jk

Android Graphics Upstreaming

Linaro Connect Europe 2013


Overview

● Covering
  – ION
  – Sync
  – KMS/HWComposer

● Hoping for active discussion


Disclosure

I'm not a DMA expert, nor am I all that familiar with details around graphics

I'm likely to be wrong in more than one place


ION


What is the issue ION solves?

● Provides a way to allocate buffers so that they can be shared between different hardware devices (via DMA) to avoid copying

● Different devices have different constraints
  – Physically contiguous memory
  – Smaller memory aperture (32bit device accessing LPAE/64bit memory)
  – Different pagetable sizes

● Provides a method to select type of buffer that satisfies the constraints

● While mostly used for graphics, ION is not graphics specific


????

Would contrived cartoon examples help?


CPU full virtual and physical addressing


GPU supports full memory range + scatter/gather


Camera is 32bit, and can only do DMA to physically contiguous memory


Crypto engine only supports 32bits, but does support scatter/gather


MMC supports full memory range, but only contiguous physical memory


Virtual allocation


Resulting physical allocation


kmalloc for physically contiguous allocation


CMA allows the kernel to make space for physically contiguous allocations


Carveout memory is physically contiguous memory reserved at boot


ION interface

● Provides a way for userland to allocate buffers from various “pools of memory” (aka: heaps)
  – SYSTEM: Virtually contiguous (vmalloc)
  – SYSTEM_CONTIG: Small physically contiguous (kmalloc)
  – CARVEOUT: Large reserved physically contiguous
  – CHUNK: Carveout + large page tables
  – CUSTOM: Whatever hardware vendors want (ick)
  – CMA: Sometime in the future?


ION Interface (cont)

● Allows freeing, mapping and passing of those buffers to other applications and drivers
  – Buffers shared as file descriptors
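
To make "buffers shared as file descriptors" concrete, here is a minimal sketch of the general Unix mechanism for handing an fd to another process: an SCM_RIGHTS control message over a Unix domain socket. This is not ION code. An unlinked temp file stands in for an ION buffer, and make_buffer_fd, send_fd, recv_fd and map_buffer are hypothetical helper names used only for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/socket.h>
#include <unistd.h>

/* Stand-in for an allocated buffer: an unlinked temp file of `len` bytes.
 * (Assumption: a real ION buffer fd would come from the ION driver.) */
int make_buffer_fd(size_t len)
{
    char path[] = "/tmp/bufXXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                       /* keep only the fd reference */
    if (ftruncate(fd, (off_t)len) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Map the buffer behind `fd` so the CPU can read and write it. */
void *map_buffer(int fd, size_t len)
{
    return mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
}

/* Send `fd` to the peer of Unix socket `sock`. Returns 0 on success. */
int send_fd(int sock, int fd)
{
    char dummy = 'F';                   /* at least one data byte is needed */
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;           /* forces correct alignment */
    } ctrl;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    memset(&msg, 0, sizeof(msg));
    memset(&ctrl, 0, sizeof(ctrl));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;       /* payload is an array of fds */
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one fd from Unix socket `sock`. Returns the new fd or -1. */
int recv_fd(int sock)
{
    char dummy;
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } ctrl;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd;

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    if (recvmsg(sock, &msg, 0) != 1)
        return -1;
    cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS)
        return -1;
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(fd));
    return fd;
}
```

The received fd is a new descriptor number that refers to the same underlying buffer, so a write through one mapping is visible through the other. No data is copied, which is the point of sharing buffers this way.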


Using our examples

● CPU + GPU: SYSTEM
● CPU + MMC: SYSTEM_CONTIG
● CPU + CAMERA: CARVEOUT
● CPU + GPU + CAMERA: CARVEOUT
● CPU + GPU + MMC: SYSTEM_CONTIG

● Note: ION does not help calculate what the proper heap is for the given combination of hardware. It just provides userland an interface to specify a heap that userland knows satisfies the hardware constraints


ION developer priorities

● Android developers very focused on avoiding “jank” - frame drops, jerky animations
● Want very deterministic behavior
  – They worry about CMA since it may spend a variable amount of time to move memory on a large allocation
  – Delayed constraint-solving dma-buf allocation ideas are similarly not considered viable (by Android devs)
● Want to centralize as much logic as possible in ION core, so any optimizations can be made once in the core infrastructure
  – Avoid lots of per-driver tweaking


Isn't this what dma-buf does?

● ION pre-dates dma-buf
● dma-buf provides a subset of what ION does
● dma-buf is more of an encapsulation structure for buffers of different types
  – Allows buffers to be passed between different drivers and userland
  – Basically a marshaling structure
  – Does not specify how the buffers are allocated
● ION also has its own buffer encapsulation structure
  – ION added support to export dmabufs (sort of)


Isn't this what CMA does?

● Again: Sort of.
● CMA allows for large physically contiguous memory allocations by migrating memory to make room for the large allocation
● Pros:
  – Avoids wasting memory with carveouts if they aren't in use.
  – CMA has pluggable allocators and options that can allow for allocations that satisfy the constraints needed.
● Cons:
  – CMA is kernel-internal only for now, and doesn't have an interface to allow userland to allocate buffers or specify constraint options
  – Migrating pages to make room can cause non-deterministic delays. Android developers want deterministic behavior.
● Patches to support CMA via ION have been submitted by Benjamin Gaignard (Android developers plan on accepting them).


What about TTM, GEM and PRIME?

You are now in the acronym pit of despair!

DRM, DRI, DRI2, EXA, UXA, GEM, TTM, UMA, GTT


What about TTM, GEM?

● TTM: Graphics memory manager for discrete gpus that have their own video-ram.
  – Considered complicated / poorly documented
  – Provides fence synchronization facility
● GEM: More minimal approach to TTM
  – Developed by Intel, focused on their hardware
  – Limited to UMA devices (ie: integrated graphics)
  – No synchronization (fence) primitives
    ● Those have to be implemented w/ driver-specific ioctls
  – Allows for sharing of buffers between applications by named ids
● GEM-ified TTM: TTM backend w/ GEM API


What about PRIME?

● PRIME: GEM extended to use file descriptors for passing object references/buffers between drivers and userland
  – Uses dmabuf for passing buffers around
  – Required for “hybrid graphics” where there are multiple GPUs (discrete and integrated) working together.


Issues with ION

● Doesn't build on non 32-bit ARM architectures
● Quite a bit of DMA API misuse
  – Lots of ARM specific assumptions about DMA rules that aren't generically portable
● Exports kernel pointers to userland (makes compat_ioctl support difficult)
● Larger portability issue: applications have to understand the hardware buffer constraints in order to select the right heap to use
  – On different hardware, different heaps may be available, as well as different devices with different constraints
  – Same userland wouldn't necessarily work on different hardware


DMA-API Misuse

● CPUs and devices both cache memory
  – To keep coherency, we need to flush caches before initiating DMA
  – This requires a direction and a device
● ION pre-syncs data before knowing which device it's going to, and leaves the device value as NULL. This works for their uses
  – Broken for IOMMUs


What is our plan with ION?

● Working w/ Android and ARM developers to address 32bit ARM assumptions
● Working with Arnd to try to sort out if we can address the dma-api misuse, or decide if new dma-apis are needed
● Try to come up with a way for the interface to expose less hardware specific detail
  – Query devices for an opaque heap-cookie they support, which could be OR-ed with other cookies to determine which heap to use for cross device buffers
● All of this may break current interface compatibility :(
● I suspect getting ION into staging is as good as it will get
● Other ideas?
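
One way the heap-cookie idea could look, as a rough sketch only: each device reports a bitmask of its constraints, userland ORs the masks of every device that will touch the buffer, and the combined mask picks the heap. None of this is an existing ION interface. The bit names, the per-device cookie values (taken from the cartoon examples) and the mask-to-heap mapping are all assumptions. In particular, the sketch assumes the carveout lives in 32bit-addressable memory.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constraint bits a device could report as its cookie. */
#define NEEDS_CONTIG (1u << 0)  /* no scatter/gather support */
#define NEEDS_32BIT  (1u << 1)  /* cannot address memory above 4 GiB */

/* Cookies for the cartoon devices (assumed values, per the examples). */
#define COOKIE_CPU    0u
#define COOKIE_GPU    0u
#define COOKIE_MMC    NEEDS_CONTIG
#define COOKIE_CAMERA (NEEDS_CONTIG | NEEDS_32BIT)

enum heap {
    HEAP_SYSTEM,
    HEAP_SYSTEM_CONTIG,
    HEAP_CARVEOUT,
    HEAP_UNKNOWN        /* combination the examples do not cover */
};

/* OR together the cookies of every device that will use the buffer. */
uint32_t combine_cookies(const uint32_t *cookies, size_t n)
{
    uint32_t mask = 0;
    for (size_t i = 0; i < n; i++)
        mask |= cookies[i];
    return mask;
}

/* Map the combined constraints to a heap. Userland never has to
 * interpret the bits itself; it only passes the mask along. */
enum heap pick_heap(uint32_t mask)
{
    switch (mask) {
    case 0:
        return HEAP_SYSTEM;
    case NEEDS_CONTIG:
        return HEAP_SYSTEM_CONTIG;
    case NEEDS_CONTIG | NEEDS_32BIT:
        return HEAP_CARVEOUT;
    default:
        return HEAP_UNKNOWN;
    }
}
```

With these values, CPU + GPU + CAMERA combines to NEEDS_CONTIG | NEEDS_32BIT and selects CARVEOUT, matching the earlier examples. A 32bit device with scatter/gather (the crypto engine) falls into HEAP_UNKNOWN here because the examples do not say which heap it should use.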


Sync


What is Sync?

● Provides synchronization primitives that can be shared across processes
● Used mostly to synchronize both drivers and applications drawing to the screen
● Like a condition-wait variable, but can be backed by hardware primitives
  – Some gpus support hardware mutexes
● Provides lots of debugging data for sorting out synchronization issues
● In staging directory as of 3.10


Sync Interface

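● Timelines and fences
  – Applications set fences at specific points on the timeline and wait

      struct sw_sync_create_fence_data data;

      /* Create a fence that signals once the timeline reaches fence_count */
      data.value = fence_count;
      ioctl(timeline_fd, SW_SYNC_IOC_CREATE_FENCE, &data);

      /* data.fence is now a file descriptor; block on it until signaled */
      ioctl(data.fence, SYNC_IOC_WAIT, &timeout);

  – Controlling thread increments timeline, waking any processes waiting.

      /* Advance the timeline by count */
      ioctl(timeline_fd, SW_SYNC_IOC_INC, &count);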
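
The timeline/fence relationship can be modeled in a few lines of plain C. This is a toy model for illustration, not the driver: struct timeline, struct fence and the three functions are hypothetical names, nothing blocks, and counter wraparound is ignored.

```c
#include <stdbool.h>
#include <stdint.h>

/* A timeline is just a monotonically increasing counter. */
struct timeline {
    uint32_t value;
};

/* A fence is a point on one timeline. */
struct fence {
    const struct timeline *tl;
    uint32_t value;
};

/* Place a fence at `value` on timeline `tl`. */
struct fence fence_create(const struct timeline *tl, uint32_t value)
{
    struct fence f = { .tl = tl, .value = value };
    return f;
}

/* A fence is signaled once its timeline has reached the fence's point.
 * A real waiter would sleep until this becomes true. */
bool fence_signaled(const struct fence *f)
{
    return f->tl->value >= f->value;
}

/* The controlling thread advances the timeline by `count`. */
void timeline_inc(struct timeline *tl, uint32_t count)
{
    tl->value += count;
}
```

A fence placed at 2 on a fresh timeline stays unsignaled after one increment of 1 and becomes signaled after the second. Every fence at or below the new timeline value is released by the same increment.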


What about Dmabuf-fences?

● Developed by Maarten Lankhorst, Daniel Vetter and Rob Clark
● Creates similar synchronization fences that are tied to specific dma-buf buffers
● Provides implicit synchronization
  – Android's Sync is explicit synchronization, requiring developers to add the logic
● Limited to dma-buf buffers
  – Android's Sync driver can be used in more varied contexts


Daniel Vetter's take:

“The fundamental difference between android syncpoints and the dma_buf fences is that syncpoints use explicit userspace synchronization objects which get passed around as fds. Whereas dma_buf fences are all implicitly attached to the respective dma_bufs, so userspace can just pass around the buffer object fds and the kernel ensures that magic happens and everything is synced up properly.

Imo the later approach has two big upsides:

- Implicit sync objects are a _much_ simpler programming model. Think synchronous file i/o vs. aio. And if the kernel doesn't suck, there's not really a performance disadvantage, at least for the shared buffer use-case. GL drivers might still need explicit syncing for their gpu state objects for the last ounce of performance, but that's not relevant.

- Having fences attached directly to dma_buf objects is the only way to make dynamic buffers (i.e. eviction from garts/memory) possible. Currently every graphics driver on android seems to just pin their buffers into main memory so there's no need for that. And ion also only cares about pinned buffers. But I expect that this will change.”


What about wait/wound-style mutexes?

● Also developed by Maarten Lankhorst and Daniel Vetter

● Developed to handle the case where buffers are shared between devices. Since buffers may not be ordered in the same way on all devices, there is the possibility of ABBA deadlocks

● Wait/wound style mutexes provide a global ticket (or context) which orders acquisitions. If a deadlock occurs, the oldest ticket holder waits for the mutex, while the younger holders have to “back off” and drop the locks they hold.

● Kernel driver interface only, not something userspace can use.

● I suspect this to be a base for dmabuf-fences

● Queued to be merged for 3.11
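
The back-off rule described above can be sketched as a single-threaded decision function. This is a toy model of the ordering rule only, not the kernel interface: struct ww_ctx, struct ww_lock, ww_try_acquire and ww_release are hypothetical names, and real code would sleep in the WW_WAIT case instead of returning.

```c
#include <stddef.h>
#include <stdint.h>

/* Each acquire context draws a global ticket; lower ticket = older. */
struct ww_ctx {
    uint64_t ticket;
};

struct ww_lock {
    const struct ww_ctx *owner;   /* NULL when unlocked */
};

enum ww_result {
    WW_ACQUIRED,   /* lock taken (or already held by this context) */
    WW_WAIT,       /* older than the holder: allowed to wait */
    WW_BACKOFF     /* younger than the holder: drop held locks, retry */
};

enum ww_result ww_try_acquire(struct ww_lock *lock, const struct ww_ctx *ctx)
{
    if (lock->owner == NULL) {
        lock->owner = ctx;
        return WW_ACQUIRED;
    }
    if (lock->owner == ctx)
        return WW_ACQUIRED;
    /* Contended: the ticket order decides who yields. */
    if (ctx->ticket < lock->owner->ticket)
        return WW_WAIT;
    return WW_BACKOFF;
}

void ww_release(struct ww_lock *lock, const struct ww_ctx *ctx)
{
    if (lock->owner == ctx)
        lock->owner = NULL;
}
```

In the ABBA case, the older context holds A and wants B while the younger holds B and wants A. The younger gets WW_BACKOFF, releases B, and the older can then take B, so the cycle is broken without either side needing to agree on a lock order in advance.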


What is our plan with Sync?

● Try to stir discussion between community and Android developers on explicit vs implicit synchronization issues

● Follow along to see if any part of the implementations can be shared

● Other ideas?


KMS HWComposer


What is KMS?

● Kernel Mode Setting
● Makes the kernel responsible for graphics mode (resolution, refresh, orientation)
  – Avoids races with userland and hardware
  – Can switch modes on OOPs to display message


What is HWComposer?

● Per-platform userspace code that manages composition acceleration

● Part of the HAL layer
● Currently using fb
● Would be nice to convert HWComposer to KMS


What is our plan with KMS/HWC/HAL?

● Android devs likely already working on KMS enabled HAL
  – Likely to be optimized specifically for next hardware release
  – Not likely to be generic KMS HAL
● Areas that may need work:
  – Sync and vsync notifications with KMS
● Hopefully this resolves the pageflipping framebuffer issue?
  – Gralloc allocates 2x y_res
  – Most fb drivers don't support this

● Other thoughts/ideas?
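
For the "2x y_res" pageflipping point, a rough sketch of what is being asked of the fb driver: a virtual framebuffer twice the visible height, with the visible window panned between the two halves. The fields come from struct fb_var_screeninfo in <linux/fb.h>. The two helper functions are hypothetical, and a real caller would apply the values with the FBIOPUT_VSCREENINFO or FBIOPAN_DISPLAY ioctls, which is where drivers without this support fail.

```c
#include <linux/fb.h>
#include <stdint.h>

/* Ask for a virtual framebuffer tall enough to hold two full frames. */
void fb_request_double_buffer(struct fb_var_screeninfo *var)
{
    var->yres_virtual = var->yres * 2;
    var->yoffset = 0;               /* start by showing the first frame */
}

/* yoffset that displays the buffer used for frame number `frame`:
 * even frames live in the top half, odd frames in the bottom half. */
uint32_t fb_flip_offset(const struct fb_var_screeninfo *var, unsigned frame)
{
    return (frame % 2) * var->yres;
}
```

For a 480-line display this requests a 960-line virtual framebuffer and alternates yoffset between 0 and 480 on each flip.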

