program the 99%

Date post: 06-Nov-2020
Page 1: program the 99%Java Objectives of this talk After almost a decade working on real-time Java Self-contained overview of Real-time Garbage Collection Highlight results from Filip Pizlo’s

program the 99%

Page 2

Java

Objectives of this talk

After almost a decade working on real-time Java

Self-contained overview of Real-time Garbage Collection

Highlight results from Filip Pizlo’s PhD thesis [PLDI’10, EUROSYS’10, RTSS’09, ECOOP’09, ISMM’08, PLDI’08, ISMM’07, LCTES’07, CC’07, RTAS’06]

Page 3

Expectations

A managed language should be <2x slower than C

Real-time support should cost <2x

Worst case performance matters

Page 4

Reality: after 10 years of work… Fiji VM

[Figure: Fiji VM architecture. Java Application → Fiji VM compiler → Native Code, with the Fiji Runtime. Compiler stages: Bytecode Parser → (Fiji IR) → Transform & Optimize → (Fiji IR) → C Code Gen. Other labels: Fiji VM, C1, GCC, register allocation, everything else.]

[Figure: compiler passes, in the order they appear in the transcript: Bytecode; Parser; Fiji IR; Make SSA; Fiji SSA; Const & Copy Propagation + CFG Simplification; Intrinsics; Inlining; Global Value Numbering; Kill SSA; Unroll and Peel Loops; Make SSA; Const & Copy Propagation + CFG Simplification; Fiji IR; Fiji SSA; Allocation, Lock, Barrier Inlining; Global Value Numbering; Whole-program Dead Code Elimination; Representational Lowering; Calling Convention Lowering; Kill Types; Const & Copy Propagation + CFG Simplification; Kill SSA; Const & Copy Propagation + CFG Simplification; Generate C Code; C code; Whole-program 0CFA; Fiji IR.]

Page 5

Reality

Real-time benchmark
Aircraft collision avoidance with simulated radar frames
CDc - idiomatic C
CDj - idiomatic Java

Real-time platform
RTEMS 4.9.1 (hard RTOS)
40MHz LEON3, 64MB RAM (radiation-hardened SPARC)

Page 6

[Figure: Frame Number vs. Execution Time (ms). Frame numbers 2000 to 2200, execution times 100 to 300 ms. Marked: worst case Java, worst case C.]

Page 7

Correlation C/Java

[Figure: Java Iteration Execution Time vs. C Iteration Execution Time]

100K samples, 15 GC cycles

Page 8

Memory management and programming models

The choice of memory management affects productivity

Object-oriented languages naturally hide allocation behind abstraction barriers

Taking care of de-allocation manually is more difficult in OO style

Concurrent algorithms usually emphasize allocation

because freshly allocated data is guaranteed to be thread local

“transactional” algorithms generate a lot of temporary objects

… but garbage collection is a global, costly operation that introduces unpredictability

Page 9

Alternative 1: No Allocation

If there is no allocation, GC does not run.

This approach is used in JavaCard
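One common way to follow this discipline is to allocate everything up front and recycle objects from a fixed pool afterwards. The following is only a minimal sketch: `PooledFrame` and `FramePool` are hypothetical names, not part of the talk or of JavaCard, and the pool is not thread-safe.

```java
// Hypothetical example: a fixed-capacity pool that allocates only at start-up.
final class PooledFrame {
    final int[] samples = new int[16]; // payload allocated once, together with the frame
}

final class FramePool {
    private final PooledFrame[] free;
    private int top; // number of frames currently available

    FramePool(int capacity) {
        free = new PooledFrame[capacity];
        for (int i = 0; i < capacity; i++) {
            free[i] = new PooledFrame(); // all allocation happens here, before steady state
        }
        top = capacity;
    }

    /** Returns a recycled frame, or null if the pool is exhausted; never allocates. */
    PooledFrame acquire() {
        return top == 0 ? null : free[--top];
    }

    /** Hands a frame back for reuse. */
    void release(PooledFrame f) {
        free[top++] = f;
    }

    int available() {
        return top;
    }
}
```

Once the constructor has run, `acquire` and `release` create no garbage, so a collector would have nothing to do in the steady state. The price is manual lifetime management, which is the productivity cost discussed on the previous slide.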

Page 10

Alt 2: Allocation in Scoped Memory

RTSJ provides scratch pad memory regions which can be used for temporary allocation

Used in deployed systems, but tricky as they can cause exceptions

s = new SizeEstimator();
s.reserve(Decrypt.class, 2);      // room for two Decrypt instances
…
shared = new LTMemory(s.getEstimate());
shared.enter(new Runnable() {
  public void run() {
    // allocations made here land in the scoped region
    ... d1 = new Decrypt() ...
  }
});

Page 11

1

Page 12

GC is easy*

* good performance is hard

Page 13

Garbage Collection: Mark & Sweep

[Figure: thread#2, thread#1, heap]

Phases

Mutation

Stop-the-world

Root scanning

Marking

Sweeping

Compaction
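The phases listed above can be sketched as a toy stop-the-world collector. This is an illustration only: `ToyObject` and `ToyHeap` are hypothetical names, the "heap" is a Java list, the roots are an explicit list standing in for thread stacks and globals, and compaction is left out.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

// Toy heap object: a mark bit plus outgoing references.
final class ToyObject {
    boolean marked;
    final List<ToyObject> refs = new ArrayList<>();
}

final class ToyHeap {
    final List<ToyObject> objects = new ArrayList<>(); // every allocated object
    final List<ToyObject> roots = new ArrayList<>();   // stand-in for stacks and globals

    ToyObject allocate() {
        ToyObject o = new ToyObject();
        objects.add(o);
        return o;
    }

    /** Runs one stop-the-world cycle and returns the number of objects freed. */
    int collect() {
        // Root scanning: seed the work list from the roots.
        Deque<ToyObject> work = new ArrayDeque<>();
        for (ToyObject r : roots) {
            if (!r.marked) { r.marked = true; work.push(r); }
        }
        // Marking: trace everything reachable from the roots.
        while (!work.isEmpty()) {
            ToyObject o = work.pop();
            for (ToyObject child : o.refs) {
                if (!child.marked) { child.marked = true; work.push(child); }
            }
        }
        // Sweeping: drop unmarked objects and clear the marks on survivors.
        int freed = 0;
        for (Iterator<ToyObject> it = objects.iterator(); it.hasNext();) {
            ToyObject o = it.next();
            if (o.marked) { o.marked = false; } else { it.remove(); freed++; }
        }
        return freed;
    }
}
```

Because nothing else runs while `collect` executes, the pause grows with the number of live objects and the size of the heap, which is the problem the real-time variants later in the talk address.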

Page 21

2

Page 22

RTGC is easy*

* good performance is harder

Page 23

Incrementalizing marking

Collector marks object

Application updates reference field

Compiler-inserted write barrier marks object
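The interplay between an incremental collector and the write barrier can be sketched as follows. This is a hedged illustration, not the Fiji VM implementation: `MarkObj` and `IncrementalMarker` are hypothetical names, each object has a single reference field, and the barrier is assumed to mark the newly stored target (a Dijkstra-style barrier); the slide does not say which object the barrier marks.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class MarkObj {
    boolean marked;
    MarkObj field; // a single reference field keeps the sketch short
}

final class IncrementalMarker {
    private final Deque<MarkObj> grey = new ArrayDeque<>(); // marked but not yet scanned
    private boolean marking;

    void start(MarkObj root) {
        marking = true;
        shade(root);
    }

    /** Collector side: scan at most `budget` objects, then yield. Returns true when marking is done. */
    boolean step(int budget) {
        while (budget-- > 0 && !grey.isEmpty()) {
            MarkObj o = grey.pop();
            shade(o.field);
        }
        if (grey.isEmpty()) marking = false;
        return !marking;
    }

    /** Mutator side: what a compiler would emit in place of `holder.field = target`. */
    void writeField(MarkObj holder, MarkObj target) {
        if (marking) shade(target); // barrier (assumed Dijkstra-style): mark the stored target
        holder.field = target;
    }

    private void shade(MarkObj o) {
        if (o != null && !o.marked) { o.marked = true; grey.push(o); }
    }
}
```

A usage scenario: with `a -> b -> c`, the collector scans `a` in one step, then the application stores `c` into `a` and clears `b.field`. Without the barrier, `c` would be reachable only through an already scanned object and would be missed; with it, `c` is marked at the store.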

Page 24

Time-based GC Scheduling

GC thread

RT thread

Java thread

Page 25

Slack-based GC Scheduling

GC thread

RT thread

Java thread
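The two scheduling slides only show timelines, so the following is a sketch under the usual reading of the terms, which is an assumption here: time-based scheduling reserves fixed collector quanta in every window even when real-time work is pending, while slack-based scheduling lets the collector run only in slots the real-time thread leaves idle. `GcSchedulingSketch` and its parameters are hypothetical.

```java
// Hypothetical slot-by-slot simulation: 'G' = GC thread, 'R' = RT thread, 'J' = plain Java thread.
final class GcSchedulingSketch {

    /** Time-based: the collector owns the first gcQuanta slots of every window. */
    static String timeBased(int[] rtRelease, int window, int gcQuanta) {
        StringBuilder timeline = new StringBuilder();
        int pending = 0; // RT work units released but not yet executed
        for (int slot = 0; slot < rtRelease.length; slot++) {
            pending += rtRelease[slot];
            if (slot % window < gcQuanta) {
                timeline.append('G'); // collector runs even if RT work is pending
            } else if (pending > 0) {
                pending--;
                timeline.append('R');
            } else {
                timeline.append('J');
            }
        }
        return timeline.toString();
    }

    /** Slack-based: the collector only gets slots the RT thread leaves idle. */
    static String slackBased(int[] rtRelease, int gcWork) {
        StringBuilder timeline = new StringBuilder();
        int pending = 0;
        for (int slot = 0; slot < rtRelease.length; slot++) {
            pending += rtRelease[slot];
            if (pending > 0) {
                pending--;
                timeline.append('R'); // RT work always wins
            } else if (gcWork > 0) {
                gcWork--;
                timeline.append('G'); // assumed: collector outranks plain Java threads
            } else {
                timeline.append('J');
            }
        }
        return timeline.toString();
    }
}
```

With RT work of two units released at slots 0 and 4, `timeBased(release, 4, 1)` yields `GRRJGRRJ`, where the RT thread is delayed by the collector quantum, and `slackBased(release, 3)` yields `RRGGRRGJ`, where the RT thread is never delayed but the collector only progresses when there is slack.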

Page 26

3

Page 27

Compaction is easy*

* that’s a lie

Page 28

State of the art

Oracle HotSpot

fast & space bounded, but blocking

Oracle Java RTS

space bounds, concurrent, wait-free, but 60% slow-down

IBM WebSphere SRT

30% slow-down, concurrent, wait-free, but susceptible to fragmentation

Page 29

Minimizing fragmentation

Previous Work

Page 30

On-demand Defragmentation

Concurrent defragmentation has drawbacks

Slow-down during defrag of more than 5x [Pizlo07, Pizlo08]

[Figure: performance over time, with "Defrag starts" and "Defrag ends" marked]

Page 31

Replication-based GC

Allows concurrent defragmentation [NettlesOToole93, ChengBlelloch01]

Two spaces: one space for reads; writes “replicated” to both

… but writes not atomic

[Figure labels: Original Object, Replica, Copying, Read, Write]
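The read and write paths of a replicating collector can be sketched as below. This is a single-threaded illustration with hypothetical names (`ReplicatedObject`, `ReplicationBarrier`); it is not taken from the cited papers, and it ignores the races a real concurrent collector has to handle.

```java
final class ReplicatedObject {
    final int[] fields;
    ReplicatedObject replica; // copy in the other space, or null when no copy exists

    ReplicatedObject(int size) {
        fields = new int[size];
    }
}

final class ReplicationBarrier {
    /** Collector side: create the replica and copy the current contents. */
    static void startCopy(ReplicatedObject original) {
        ReplicatedObject copy = new ReplicatedObject(original.fields.length);
        System.arraycopy(original.fields, 0, copy.fields, 0, original.fields.length);
        original.replica = copy;
    }

    /** Reads always go to the original space. */
    static int read(ReplicatedObject o, int index) {
        return o.fields[index];
    }

    /** Writes go to both copies; the two stores are separate steps, not one atomic action. */
    static void write(ReplicatedObject o, int index, int value) {
        o.fields[index] = value;
        ReplicatedObject r = o.replica;
        if (r != null) {
            r.fields[index] = value;
        }
    }
}
```

The gap between the two stores in `write` is where the "writes not atomic" caveat bites: two threads writing the same field concurrently can leave the original and the replica holding different values.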

Page 32

Fragmented allocation

All objects split into small fragments [Siebert’99]

Fragment size is fixed at 32 bytes

Fragments are linked, application follows links on reads

Plain Object: most objects require only two fragments. Access cost is known statically, does not vary.

Array: access cost is logarithmic.
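For plain objects, the linked-fragment layout can be sketched like this. The sketch is an assumption-laden illustration, not the cited design: `Fragment` and `FragmentedObject` are hypothetical names, and a 32-byte fragment is assumed to hold eight 4-byte words, one of which is the link. Arrays, whose access cost is logarithmic, are not modeled.

```java
final class Fragment {
    static final int PAYLOAD_WORDS = 7; // assumed: 8 four-byte words per fragment, one used as link
    final int[] words = new int[PAYLOAD_WORDS];
    Fragment next;
}

final class FragmentedObject {
    final Fragment head;

    FragmentedObject(int fieldCount) {
        int fragments = Math.max(1, (fieldCount + Fragment.PAYLOAD_WORDS - 1) / Fragment.PAYLOAD_WORDS);
        head = new Fragment();
        Fragment f = head;
        for (int i = 1; i < fragments; i++) {
            f.next = new Fragment();
            f = f.next;
        }
    }

    /** Number of links to follow for a field; depends only on the field's index. */
    static int hopsFor(int fieldIndex) {
        return fieldIndex / Fragment.PAYLOAD_WORDS;
    }

    int get(int fieldIndex) {
        return locate(fieldIndex).words[fieldIndex % Fragment.PAYLOAD_WORDS];
    }

    void set(int fieldIndex, int value) {
        locate(fieldIndex).words[fieldIndex % Fragment.PAYLOAD_WORDS] = value;
    }

    private Fragment locate(int fieldIndex) {
        Fragment f = head;
        for (int hops = hopsFor(fieldIndex); hops > 0; hops--) {
            f = f.next; // the application follows links on every access
        }
        return f;
    }
}
```

Since a field's index is fixed at compile time, `hopsFor` is a constant for each field access, which is why the cost is known statically and does not vary at run time.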

Page 33

Schism [PLDI’10]

Page 34

Schism = CM&S + Replication + Fragments

Insight: replicated collectors are good for immutable data; fragmented allocation works well for fixed-size data

Combination:
Concurrent mark-sweep for fixed-size fragments
Replication for array spines

No external fragmentation, O(1) heap access, wait-free & coherent

Page 35

Arrays

Data in fixed-size fragments

Index in a variable-sized spine… which is immutable

[Figure label: Spine]
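The array layout described on this slide can be sketched as a two-level structure. This is a simplified model with hypothetical names, not Schism's actual object layout: `SpineArray` uses a Java `int[][]` as the spine, and a fragment is assumed to hold eight 4-byte elements (32 bytes).

```java
final class SpineArray {
    static final int ELEMENTS_PER_FRAGMENT = 8; // assumed: 32-byte fragments of 4-byte ints
    private final int[][] spine; // variable-sized index; its entries never change after construction
    private final int length;

    SpineArray(int length) {
        this.length = length;
        int fragments = (length + ELEMENTS_PER_FRAGMENT - 1) / ELEMENTS_PER_FRAGMENT;
        spine = new int[fragments][];
        for (int i = 0; i < fragments; i++) {
            spine[i] = new int[ELEMENTS_PER_FRAGMENT]; // every fragment has the same fixed size
        }
    }

    int get(int index) {
        checkIndex(index);
        // One spine lookup plus one fragment lookup, independent of array length.
        return spine[index / ELEMENTS_PER_FRAGMENT][index % ELEMENTS_PER_FRAGMENT];
    }

    void set(int index, int value) {
        checkIndex(index);
        // Element writes touch only a fragment; the spine itself is never modified.
        spine[index / ELEMENTS_PER_FRAGMENT][index % ELEMENTS_PER_FRAGMENT] = value;
    }

    int length() {
        return length;
    }

    int fragmentCount() {
        return spine.length;
    }

    private void checkIndex(int index) {
        if (index < 0 || index >= length) {
            throw new ArrayIndexOutOfBoundsException(index);
        }
    }
}
```

Because the spine only holds pointers that are fixed at allocation time, copying it never races with application writes, which is what makes replication a good fit for spines while the fixed-size fragments stay in place.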

Page 36

Concurrent Mark-Sweep Heap for Fragments

To-space for Spines

From-space for Spines

Small Object

Large Array?

Concurrent Replication Heap for Spines

Page 37

Proof?

Page 38

Tunable throughput/predictability trade-off

Level A (deterministic): allocate fragmented

Level C (throughput): allocate contiguously if possible

Level CW (worst-case for level C): poison all fast-paths (array accesses, write barriers, allocations)

Page 39

Summary of Results

Goal: fast

Goal: fragmentation tolerant

Goal: deterministic

Page 40

SPECjvm98 (50MB heap)

[Figure: throughput relative to HotSpot 1.6 Server (more is better), axis from 0 to 1.0. Systems: HotSpot 1.6 Server, IBM J9, Sun Java RTS 2.1, IBM Metronome SRT, Fiji VM CMR, Fiji VM Schism/cmr level C, Fiji VM Schism/cmr level A, Fiji VM Schism/cmr level CW. Annotations: "Non Real Time"; slow-downs of 63%, 38%, 35%, 50%, 57%.]

Page 42

Torture tests

% free memory allocated under fragmentation
HotSpot: 100%
Java RTS: ~80%
Metronome: ~1%
Schism: 100%

Page 44

Java vs C on CDx

[Figure: times in milliseconds, axis from 40 to 120. Series: C, Java CMR, Schism C, Schism CW, Schism A. Labeled values: 70.5 and 98.5.]

< 40% slower than C, as deterministic
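As a quick check of the headline claim, assume the two labeled values are times in milliseconds for C (70.5) and for the slowest Java configuration (98.5). The transcript does not make this pairing explicit, so treat it as an assumption:

```latex
\[
\frac{98.5\ \text{ms}}{70.5\ \text{ms}} \approx 1.397
\quad\Longrightarrow\quad
\text{slow-down} \approx 39.7\% < 40\%
\]
```

Under that reading, the numbers are consistent with the "< 40% slower than C" statement.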

Page 45

References and acknowledgements

Team

F Pizlo, E Blanton, L Ziarek, T Kalibera, T Hosking, P Maj, T Cunei, M Prochazka, J Baker

Paper trail

Schism: Fragmentation-Tolerant Real-Time Garbage Collection. PLDI10
High-level Programming of Embedded Hard Real-Time Devices. EUROSYS10
Accurate Garbage Collection in Uncooperative Environments. CCP&E09
A Study of Concurrent Real-time Garbage Collectors. PLDI08
Memory Management for Real-time Java: State of the Art. ISORC08
Hierarchical Real-time Garbage Collection. LCTES07

