David F. Bacon T.J. Watson Research Center

Post on 07-Feb-2016



1

Parallel and Concurrent Real-time Garbage Collection

Part I: Overview and Memory Allocation Subsystem

David F. Bacon

T.J. Watson Research Center


2

What It Does

(Demo)

http://www.youtube.com/user/ibmrealtime

3

What it Is

• A production garbage collector that is

– Real-time (450us worst-case latencies)

– Multiprocessing (uses multiple CPUs)

– Concurrent (can run in background)

– Robust (within and across JVMs)

4

Why It’s Important

DDG-1000 Destroyer

Telco SIP Switch

Air Java (w/ Berkeley CE)

JAviator (w/ Salzburg)

Java-based Synthesizer

Playstation/Xbox etc

Automotive Electronics

33%

7%

22%

Trade Execution

5

Who and When

Metronome (2001-2004)

Recycler (1999-2001)

WebSphere Realtime (2004-2007)


Dick Attanasio, David Bacon, V.T. Rajan, Steve Smith

Han Lee

David Bacon, Perry Cheng, V.T. Rajan

Martin Vechev

Josh Auerbach, David Bacon, Perry Cheng, Dave Grove

5 Developers, 10 Testers

5 Salespeople…

6

Digression: Keys to Success

• Intelligence

• Collaboration

• Problem Selection

7

Perspectives

• Concurrent garbage collection is

– A key language runtime component

– A challenging verification problem

– A multi-faceted concurrent algorithm

8

Goals

• Learn how to bridge:

– from abstract design…

– …to concrete implementation

• Learn how to combine different

– algorithms…

– …and implementations…

– …into a complete system

• Gain deep understanding

– highly complex, real-world system

– apply lessons to your problems

9

Where it Fits In

(Diagram: the GC and the other JVM components and activities it touches)

• JVM: JIT, Interpreter, GC

• Class Libraries, RTSJ

• Arraylets, Barriers

• AoT Compiler

• RTSJ Scopes, Threads

• Class (Un)Loader (realtime)

• JVMPI, Debug, RAS

• System Management

• Weird Refs: Weak, Soft, Phantom, JNI

• Heap Format: Dump & Parse

• Documentation

• Test: 24x7 (at least)

10

Fundamental Issues

• Functional correctness (duh)

• Liveness

– Timeliness (real-time bounds)

• Fairness

– Priorities

• Initiation and Termination

• Contention

• Non-determinism

11

Why is Concurrency Hard?

• Performance

– Contention

– Load Balancing

– Overhead -> Granularity

• “Inherent” Simultaneity

• Timing and Determinism

12

GC: A Simple Problem (?)

• Transitive Graph Closure

(Figure: a stack holding roots r and p, and heap objects T, U, W, X, Y, Z, each with reference fields a and b.)

class Foo { Foo a; Foo b; }
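As a rough illustration of "transitive graph closure", the sketch below computes the set of objects reachable from a list of roots by following the a and b fields. The `name` field, the `ReachabilitySketch` class, and the edges built in `main` are assumptions made for the example; the edges of the slide's figure are not recoverable from this transcript.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Same shape as the slide's class: two reference fields, a and b.
// The name field is an addition, used only for readability in the demo.
class Foo {
    final String name;
    Foo a;
    Foo b;
    Foo(String name) { this.name = name; }
}

class ReachabilitySketch {
    // Returns every object reachable from the roots by following a and b.
    static Set<Foo> reachable(List<Foo> roots) {
        // Identity-based set: two objects are the same only if they are the same reference.
        Set<Foo> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<Foo> work = new ArrayDeque<>();
        for (Foo r : roots) {
            if (r != null && seen.add(r)) work.push(r);
        }
        while (!work.isEmpty()) {
            Foo f = work.pop();
            if (f.a != null && seen.add(f.a)) work.push(f.a);
            if (f.b != null && seen.add(f.b)) work.push(f.b);
        }
        return seen;
    }

    public static void main(String[] args) {
        // Hypothetical graph: T and X form a cycle, U points in but is unreferenced.
        Foo t = new Foo("T"), x = new Foo("X"), u = new Foo("U");
        t.a = x;
        x.b = t;
        u.a = t;
        Set<Foo> live = reachable(List.of(t));
        System.out.println(live.size() + " live objects");
    }
}
```

Everything not in the returned set is garbage, including objects that sit on cycles or that point at live objects.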

13

Basic Approaches: Mark/Sweep

(Figure: the stack roots r and p and heap objects T, U, W, X, Y, Z, each with fields a and b; reclaimed space is labeled "free".)

• O(live) mark phase but O(heapsize) sweep

• Usually requires no copying

• Mark stack is O(maxdepth)
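The three bullets above can be made concrete with a toy heap of fixed-size cells. This is a minimal sketch, not the collector described in the talk: cells are array indices, each cell has two reference fields, and the class and method names are made up for the example. Marking only visits live cells, while sweeping walks every cell in the heap.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class MarkSweepSketch {
    static final int NULL = -1;

    final int[] a, b;               // the two reference fields of each cell
    final boolean[] allocated;      // false means the cell is free
    private final boolean[] marked;

    MarkSweepSketch(int cells) {
        a = new int[cells];
        b = new int[cells];
        allocated = new boolean[cells];
        marked = new boolean[cells];
    }

    // Takes the first free cell; a linear scan keeps the sketch short.
    int alloc(int aRef, int bRef) {
        for (int c = 0; c < allocated.length; c++) {
            if (!allocated[c]) {
                allocated[c] = true;
                a[c] = aRef;
                b[c] = bRef;
                return c;
            }
        }
        throw new OutOfMemoryError("no free cell");
    }

    // Mark phase: touches only cells reachable from the roots.
    void mark(int... roots) {
        Deque<Integer> stack = new ArrayDeque<>();   // explicit mark stack
        for (int r : roots) push(stack, r);
        while (!stack.isEmpty()) {
            int c = stack.pop();
            push(stack, a[c]);
            push(stack, b[c]);
        }
    }

    private void push(Deque<Integer> stack, int c) {
        if (c != NULL && !marked[c]) {
            marked[c] = true;       // mark on push so each cell is pushed once
            stack.push(c);
        }
    }

    // Sweep phase: one pass over the whole heap, live or not. Nothing is moved.
    int sweep() {
        int freed = 0;
        for (int c = 0; c < allocated.length; c++) {
            if (allocated[c] && !marked[c]) {
                allocated[c] = false;
                freed++;
            }
            marked[c] = false;      // reset for the next collection
        }
        return freed;
    }

    int collect(int... roots) {
        mark(roots);
        return sweep();
    }
}
```

Because surviving cells stay where they are, references never need updating, at the cost of leaving free cells scattered through the heap.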

14

Basics II: Semi-space Copying

• O(live)

• If single-threaded, no mark stack needed

• Wastes 50% of memory
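One well-known way to get a copying collector without a mark stack is Cheney's scan-pointer technique, which the slide does not name; the sketch below uses it as an assumption about what the second bullet refers to. The heap is an `int` array split into two halves, each object is three words (forwarding pointer, a, b), and all names are invented for the example.

```java
class SemiSpaceSketch {
    static final int NULL = -1;
    static final int WORDS = 3;     // object layout: [forwarding pointer, a, b]

    final int[] heap;
    private final int half;         // words per semi-space
    private int fromBase, toBase;
    private int free;               // bump pointer in the space being allocated into

    SemiSpaceSketch(int objectsPerSpace) {
        half = objectsPerSpace * WORDS;
        heap = new int[2 * half];   // only one half is usable at any time
        fromBase = 0;
        toBase = half;
        free = fromBase;
    }

    int alloc(int a, int b) {
        if (free + WORDS > fromBase + half) throw new OutOfMemoryError("semi-space full");
        int obj = free;
        free += WORDS;
        heap[obj] = NULL;
        heap[obj + 1] = a;
        heap[obj + 2] = b;
        return obj;
    }

    // Copies everything reachable from the roots and updates the roots in place.
    void collect(int[] roots) {
        int scan = toBase;
        free = toBase;
        for (int i = 0; i < roots.length; i++) roots[i] = forward(roots[i]);
        // The region between scan and free acts as the work queue, so no mark stack.
        while (scan < free) {
            heap[scan + 1] = forward(heap[scan + 1]);
            heap[scan + 2] = forward(heap[scan + 2]);
            scan += WORDS;
        }
        int t = fromBase;           // flip the roles of the two spaces
        fromBase = toBase;
        toBase = t;
    }

    // Returns the to-space address of obj, copying it on first visit.
    private int forward(int obj) {
        if (obj == NULL) return NULL;
        if (heap[obj] != NULL) return heap[obj];    // already copied
        int copy = free;
        free += WORDS;
        heap[copy] = NULL;
        heap[copy + 1] = heap[obj + 1];
        heap[copy + 2] = heap[obj + 2];
        heap[obj] = copy;                           // leave a forwarding pointer
        return copy;
    }

    int liveObjects() {
        return (free - fromBase) / WORDS;
    }
}
```

The work done by `collect` depends only on the number of reachable objects, and the unused half of `heap` is the 50% cost named in the last bullet.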

Stack

rr

pp

TTa

b XXa

b

UUa

b ZZa

b

WWa

b

YYa

b

WWa

b

ZZa

b

YYa

b

XXa

b

15

Kinds of “Concurrent” Collection

• “Stop the World”

• Parallel

• Concurrent

• Incremental

(Figure: timelines of APP and GC activity across several CPUs for each of the four kinds of collection.)

16

Our Subject: Metronome-2 System

• Parallel, Incremental, and Concurrent

• No increment exceeds 450us

• Real-time Scheduling

• Smooth adaptation from under- to over-load

• Implementation in production JVM

(Figure: timeline of interleaved APP and GC increments across several CPUs.)

17

What Does “Real-time” Mean?

• Minimal, predictable interruption of application

• Collection finishes before heap is exhausted

• “Real space” - bounded, predictable memory

• Honor thread priorities

• Micro- or macro-level determinism (cf. CK)

18

The Cycle of Life

AllocateAllocate

FreeFree

MutateMutate

• Not really a “garbage collector”…

• … but a memory management subsystem

19

Metronome Memory Organization

• Page-based

• Segregated free lists

• Ratio bounds internal & page-internal fragmentation
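A minimal sketch of how a page-based, segregated-free-list allocator with ratio-spaced size classes might look. The page size, the 1/8 ratio used in the usage note, the simulated integer addresses, and all names are assumptions for illustration; the slide does not give the values used by the production system.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class SizeClassSketch {
    static final int PAGE_SIZE = 16 * 1024;   // hypothetical page size in bytes

    final int[] cellSizes;                    // one entry per size class, ascending
    private final List<Deque<Integer>> freeCells = new ArrayList<>();
    private int nextPage = 0;                 // stands in for a pool of free pages

    // Assumes minSize <= maxSize <= PAGE_SIZE.
    SizeClassSketch(int minSize, int maxSize, double ratio) {
        List<Integer> sizes = new ArrayList<>();
        int s = minSize;
        while (s < maxSize) {
            sizes.add(s);
            // Rounding down keeps the gap to the next class at most ratio * s.
            s = Math.max(s + 1, (int) (s * (1 + ratio)));
        }
        sizes.add(maxSize);
        cellSizes = new int[sizes.size()];
        for (int i = 0; i < cellSizes.length; i++) {
            cellSizes[i] = sizes.get(i);
            freeCells.add(new ArrayDeque<>());
        }
    }

    // Index of the smallest size class that fits the request.
    int classFor(int size) {
        for (int i = 0; i < cellSizes.length; i++) {
            if (cellSizes[i] >= size) return i;
        }
        throw new IllegalArgumentException("larger than the largest size class: " + size);
    }

    // Returns the simulated address of a free cell for the request.
    int alloc(int size) {
        int c = classFor(size);
        Deque<Integer> free = freeCells.get(c);
        if (free.isEmpty()) {
            // Dedicate a fresh page to this class and carve it into cells.
            // Bytes left over at the end of the page are page-internal fragmentation.
            int base = nextPage++ * PAGE_SIZE;
            for (int off = 0; off + cellSizes[c] <= PAGE_SIZE; off += cellSizes[c]) {
                free.addLast(base + off);
            }
        }
        return free.pollFirst();
    }

    // Returns a cell to the free list of its size class.
    void free(int address, int size) {
        freeCells.get(classFor(size)).addLast(address);
    }
}
```

With `new SizeClassSketch(16, 2048, 0.125)`, any request between 16 and 2048 bytes lands in a cell that wastes less than one eighth of the requested size, which is the sense in which the ratio bounds internal fragmentation.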

20

Large Objects: Arraylets

• (Almost) eliminates external fragmentation

• (Almost) eliminates need for compaction

• Very large arrays still need contiguous pages

• Extra indirection for array access
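The "extra indirection" can be pictured with a small library-level sketch: an array stored as fixed-size chunks reached through a spine of references. In a real JVM this layout lives inside the runtime and compiler, not in user code; the class name and the chunk length of 512 elements are assumptions for the example.

```java
class ArrayletIntArray {
    static final int SHIFT = 9;                       // hypothetical: 512 elements per arraylet
    static final int ARRAYLET_LENGTH = 1 << SHIFT;
    static final int MASK = ARRAYLET_LENGTH - 1;

    private final int[][] spine;                      // one reference per arraylet
    private final int length;

    ArrayletIntArray(int length) {
        if (length < 0) throw new NegativeArraySizeException();
        this.length = length;
        int n = (length + ARRAYLET_LENGTH - 1) >>> SHIFT;
        spine = new int[n][];
        for (int i = 0; i < n; i++) {
            // Every arraylet is full-size except possibly the last one.
            int len = Math.min(ARRAYLET_LENGTH, length - i * ARRAYLET_LENGTH);
            spine[i] = new int[len];
        }
    }

    int length() {
        return length;
    }

    // Each access goes through the spine first: that is the extra indirection.
    int get(int i) {
        check(i);
        return spine[i >>> SHIFT][i & MASK];
    }

    void set(int i, int value) {
        check(i);
        spine[i >>> SHIFT][i & MASK] = value;
    }

    private void check(int i) {
        if (i < 0 || i >= length) throw new ArrayIndexOutOfBoundsException(i);
    }
}
```

Since no chunk is larger than `ARRAYLET_LENGTH` elements, the allocator only has to find many small blocks rather than one large contiguous one. The spine still grows with the array, which is one plausible reading of why very large arrays still need contiguous pages.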

21

Page Data Structures

(Figure: page lists for size classes 16, 64, and 256, plus a list of free pages.)

22

Page Data Synchronization, Take 1

(Figure: page lists for size classes 16, 64, and 256, plus a list of free pages.)

23

Page Data, Take 2

(Figure: the shared page lists for size classes 16, 64, and 256 and the free pages, plus separate 16, 64, and 256 lists for Thread 1 and for Thread 2.)
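The slides for "Take 1" and "Take 2" survive only as figures, so the following is an interpretation rather than the talk's design: each thread keeps its own page per size class and allocates from it without synchronization, touching shared state only when it needs a new page. The page size, the three size classes, the atomic page counter standing in for the shared free-page list, and all names are assumptions. Freeing is not modeled.

```java
import java.util.concurrent.atomic.AtomicInteger;

class ThreadLocalPagesSketch {
    static final int PAGE_SIZE = 4096;                 // hypothetical page size
    static final int[] CELL_SIZES = {16, 64, 256};     // size classes from the figure

    // Shared state: a counter standing in for the global list of free pages.
    private final AtomicInteger nextPage = new AtomicInteger();

    // Per-thread state: the current page (as a bump range) for each size class.
    private static final class LocalPages {
        final int[] cursor = new int[CELL_SIZES.length];
        final int[] limit = new int[CELL_SIZES.length];
    }

    private final ThreadLocal<LocalPages> local = ThreadLocal.withInitial(LocalPages::new);

    // Returns a simulated cell address for the given size class index.
    int alloc(int sizeClass) {
        LocalPages p = local.get();
        int cell = CELL_SIZES[sizeClass];
        if (p.cursor[sizeClass] + cell > p.limit[sizeClass]) {
            // Slow path: the only point where threads contend with each other.
            int page = nextPage.getAndIncrement();
            p.cursor[sizeClass] = page * PAGE_SIZE;
            p.limit[sizeClass] = p.cursor[sizeClass] + PAGE_SIZE;
        }
        // Fast path: plain reads and writes on thread-private state.
        int address = p.cursor[sizeClass];
        p.cursor[sizeClass] += cell;
        return address;
    }
}
```

With 16-byte cells and 4096-byte pages, a thread in this sketch touches the shared counter once per 256 allocations of that class.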

24

http://www.research.ibm.com/metronome

https://sourceforge.net/projects/tuningforkvp