Date post: 13-May-2020
Page 1: CS 3214 Computer Systemscs3214/spring2020/... · MEMORY MANAGEMENT Part 2 CS 3214 Spring 2020 Some of the following slides are taken with permission from Complete Powerpoint Lecture

CS 3214

Computer Systems

Godmar Back

Automatic Memory Management/GC


MEMORY MANAGEMENT

Part 2

CS 3214 Spring 2020

Some of the following slides are taken with permission from

Complete Powerpoint Lecture Notes for

Computer Systems: A Programmer's Perspective (CS:APP)

Randal E. Bryant and David R. O'Hallaron

http://csapp.cs.cmu.edu/public/lectures.html


Dynamic Memory Allocation

• Explicit vs. Implicit Memory Allocator

– Explicit: application allocates and frees space

• E.g., malloc and free in C

– Implicit: application allocates, but does not free space

• E.g. garbage collection in Java, ML or Lisp

• Allocation

– The memory allocator provides an abstraction of memory as a set of blocks or, in type-safe languages, as objects

– Doles out free memory blocks to application

• Will discuss automatic memory allocation today

[Figure: three boxes labeled Application, Dynamic Memory Allocator, and Heap Memory]

Implicit Memory Management

• Motivation: manually (or explicitly) reclaiming memory is difficult:

– Too early: risk access-after-free errors

– Too late: memory leaks

• Requires principled design

– Programmer must reason about ownership of objects

• Difficult & error prone, especially in the presence of object sharing

• Complicates design of APIs


Concept Map

[Figure: concept map of Implicit/Automatic Memory Management. Node labels:]

• Motivation: lack of robustness of explicit schemes

• Reference counting based approaches: manual, smart pointers

• Garbage collection

– Mechanisms: reachability graph, mark/sweep, evacuation/scavenging, barriers, generational, incremental, concurrent

– Programming issues: churn, bloat, leaks

– Efficiency considerations: memory overhead relative to live heap size, allocation rate, program throughput, GC throughput

– Policies & tuning: generation sizing, triggers, heap expansion policy, etc.

Manual Reference Counting

• Idea: keep track of how many references there are to each object in a reference counter stored with each object

– Copy a reference to an object (globalvar = q)

• increment count: “addref”

– Remove a reference (p = NULL)

• decrement count: “release”

• Uses a set of rules programmers must follow

– E.g., must ‘release’ a reference obtained from an OUT parameter in a function call

– Must ‘addref’ when storing into a global

– May not have to use addref/release for references copied within one function

• Programmer must use addref/release correctly

– Still somewhat error prone, but the rules are such that correctness of the code can be established locally without consulting the API documentation of any functions being called; parameter annotations (IN, INOUT, OUT, return value) imply the reference counting rules

• Used in Microsoft COM & Netscape XPCOM
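The addref/release discipline can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `RefCounted`, `store_in_global`, and `global_var` are hypothetical names, and the sketch does not reproduce the COM or XPCOM interfaces.

```python
class RefCounted:
    """Hypothetical object that carries its own reference counter."""

    def __init__(self, name):
        self.name = name
        self.count = 1          # the creator holds the first reference
        self.freed = False

    def addref(self):
        assert not self.freed, "addref on a reclaimed object"
        self.count += 1
        return self

    def release(self):
        assert not self.freed, "release on a reclaimed object"
        self.count -= 1
        if self.count == 0:
            self.freed = True   # stand-in for returning the memory


global_var = None


def store_in_global(obj):
    """Rule: copying a reference into a global requires an addref."""
    global global_var
    if global_var is not None:
        global_var.release()    # the overwritten reference goes away
    global_var = obj.addref()


q = RefCounted("q")             # count == 1, held by the local q
store_in_global(q)              # count == 2
q.release()                     # local reference dropped, count == 1
still_alive = not q.freed       # the global keeps the object alive
global_var.release()            # last reference removed, count == 0
```

Forgetting the `addref` in `store_in_global` would leave the global dangling after `q.release()`, which is the kind of error the rules are meant to prevent.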


Automatic Reference Counting

• Idea: force automatic reference count updates when pointers are assigned/copied

• Most common variant: C++ “smart pointers”

– C++ allows the programmer to interpose on assignments and copies via operator overloading/special purpose constructors

• Disadvantage of all reference counting schemes is their inability to handle cycles

– But a great advantage is immediate reclamation: no “drag” between last access & reclamation
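Both properties can be seen in a small simulation. The sketch below is written in Python rather than C++, and `Node`, `Frame`, `assign`, and `drop` are hypothetical names; `assign` plays the role of an overloaded assignment that updates counts automatically.

```python
class Node:
    """Hypothetical reference-counted node with one outgoing pointer."""

    def __init__(self, name):
        self.name = name
        self.count = 0
        self.next = None
        self.freed = False


class Frame:
    """Stand-in for a local variable slot."""

    def __init__(self):
        self.p = None


def drop(node):
    """Remove one reference; reclaim immediately when the count hits zero."""
    node.count -= 1
    if node.count == 0:
        node.freed = True
        if node.next is not None:       # reclaiming a node releases its target
            child, node.next = node.next, None
            drop(child)


def assign(holder, field, target):
    """Pointer assignment with automatic count updates."""
    old = getattr(holder, field)
    if target is not None:
        target.count += 1
    setattr(holder, field, target)
    if old is not None:
        drop(old)


frame = Frame()

# Acyclic chain x -> y: dropping the only root reclaims both at once.
x, y = Node("x"), Node("y")
assign(frame, "p", x)
assign(x, "next", y)
assign(frame, "p", None)
chain_freed = x.freed and y.freed

# Cycle a -> b -> a: the counts never reach zero once the root is gone.
a, b = Node("a"), Node("b")
assign(frame, "p", a)
assign(a, "next", b)
assign(b, "next", a)
assign(frame, "p", None)
cycle_leaked = (not a.freed) and (not b.freed)
```

The Python names `x`, `y`, `a`, and `b` are only handles for inspecting the simulation; they are not counted as references.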


Garbage Collection

• Determine which objects may be accessed in the future

– Don’t know which ones will be, but can determine those that can’t be accessed because there are no pointers to them

– Requires that all pointers are identifiable (e.g., no pointer/int conversion)

• Invented 1960 by McCarthy for LISP


Reachability Graph

• Roots are commonly

– Global variables (static in Java)

– Local variables that contain references (any object or array in Java). Local variables are stored in the currently active stack frames of each running thread. They change constantly as the thread calls new methods/returns from calls

– Internal roots pinned down by the JVM

• The following slides visualize this. Note that in the actual implementation, objects are not tagged with a thread id (this is just for visualization purposes)
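Reachability from a root set amounts to a graph traversal. The following is a minimal sketch, assuming the heap is given as a dictionary of hypothetical object names and the roots are grouped per thread; a real collector walks stack frames and object headers instead.

```python
def reachable(roots, edges):
    """Return the set of objects reachable from the roots.

    roots: iterable of object ids (globals, locals in active stack frames)
    edges: dict mapping an object id to the ids it references
    """
    marked = set()
    worklist = list(roots)
    while worklist:
        obj = worklist.pop()
        if obj in marked:
            continue
        marked.add(obj)
        worklist.extend(edges.get(obj, ()))
    return marked


edges = {
    "a1": ["a2"], "a2": ["shared"],
    "b1": ["shared", "b2"], "b2": [],
    "shared": [],
    "c1": ["c2"], "c2": ["c1"],     # a cycle that no root points to
}
thread_roots = {"A": ["a1"], "B": ["b1"]}

live = reachable([r for rs in thread_roots.values() for r in rs], edges)

# When thread B exits, its stack frames stop contributing roots.
thread_roots.pop("B")
live_after = reachable([r for rs in thread_roots.values() for r in rs], edges)
```

Unlike reference counting, the unreferenced cycle `c1`/`c2` is never marked, so it counts as garbage.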


[Figure sequence, six slides titled “Reachability Graph”: a root set with Thread A, Thread B, and Thread C, and heap objects labeled A, B, or C. Across the slides Thread C and then Thread B disappear from the root set and the number of labeled objects shrinks; the last slide shows only Thread A and four objects labeled A.]

GC Design Choices

• Determining which objects are reachable (cost generally proportional to the amount of live objects in the area considered)

– “marking” live objects, or

– “evacuating”/“scavenging” objects: copying live objects into a new area (if objects are movable)

• Deallocating unreachable objects

– “sweeping”: essentially calling “free()” on all unreachable objects (cost proportional to the amount of dead objects, i.e., garbage)

– more efficient if it’s possible to evacuate all live objects from an area (in theory, constant cost; in practice, dominated by the need to zero memory)
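The two deallocation strategies can be contrasted in a toy model. This sketch assumes a heap represented as a dictionary from hypothetical object names to sizes; `mark`, `sweep`, and `evacuate` are illustrative helpers, not the API of any real collector.

```python
def mark(roots, edges):
    """Mark phase: work grows with the number of live objects."""
    marked, worklist = set(), list(roots)
    while worklist:
        obj = worklist.pop()
        if obj not in marked:
            marked.add(obj)
            worklist.extend(edges.get(obj, ()))
    return marked


def sweep(heap, marked):
    """Free every unmarked object individually; returns the number of frees."""
    dead = [obj for obj in heap if obj not in marked]
    for obj in dead:
        del heap[obj]               # stand-in for free()
    return len(dead)


def evacuate(from_space, marked):
    """Copy live objects to a new area, then discard the old area wholesale."""
    to_space = {obj: from_space[obj] for obj in marked}
    copied = len(to_space)
    from_space.clear()              # one step, regardless of how much garbage
    return to_space, copied


heap = {"a": 16, "b": 32, "c": 8, "d": 64}
edges = {"a": ["b"]}
marked = mark(["a"], edges)

swept_heap = dict(heap)
frees = sweep(swept_heap, marked)

from_space = dict(heap)
to_space, copied = evacuate(from_space, marked)
```

Sweeping performs one free per dead object, while evacuation touches only the live objects and then releases the whole area.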


[Figure: “Memory Allocation Time-Profile”. Allocated memory plotted over time from start time ts to end time te, with Amax marked on the memory axis and regions labeled live and garbage.]

[Figure: “Modeling Memory Allocation”. Same axes and labels as the previous figure.]

[Figure sequence, five slides titled “Execution Time vs. Memory”: memory plotted against time between ts and te, with a “Max Heap” level marked.]

Heap Size vs. GC Frequency

• All else being equal, smaller maximum heap sizes necessitate more frequent collections

– Old rule of thumb: need between 1.5x and 2.5x the size of the live heap to limit collection overhead to 5-15% for applications with reasonable allocation rates

– [Hertz 2005] finds that GC outperforms explicit MM when given 5x memory, is 17% slower with 3x, and 70% slower with 2x

– Performance degradation occurs when live heap size approaches maximum heap size
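A back-of-the-envelope model shows why the headroom matters. It assumes, as a simplification not stated on the slide, that the live heap size stays constant and that each collection frees everything except the live data, so each cycle makes `max_heap - live` available and the number of collections is roughly the total allocation volume divided by that headroom. The function name and the numbers are illustrative.

```python
def collections_needed(total_alloc_mb, live_mb, max_heap_mb):
    """Estimate the number of collections under the simplified model:
    every collection recovers (max_heap - live) MB of allocatable space."""
    headroom = max_heap_mb - live_mb
    if headroom <= 0:
        raise ValueError("live heap does not fit into the maximum heap")
    return total_alloc_mb / headroom


# A program that allocates 10,000 MB in total with a 100 MB live heap:
at_1_5x = collections_needed(10_000, 100, 150)   # 1.5x the live heap
at_2_5x = collections_needed(10_000, 100, 250)   # 2.5x the live heap
at_5x = collections_needed(10_000, 100, 500)     # 5x the live heap
```

In this model the collection count grows without bound as the maximum heap size approaches the live heap size, which matches the degradation described in the last bullet.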


Kattis.com example

• In ICPC judging, Java programs are subjected to both a total memory limit (via -Xmx) and a total CPU consumption limit (which includes all JVM threads, including those devoted to GC!)

• Given that the JVM is unaware it’s being timed and tries to adhere to its default policies, what’s the fairest way to run the JVM under these conditions?

– Hypothesis (1): set the start heap size to the max heap size to tell the JVM it’s ok to ask for this much memory from the OS. (Otherwise, it’ll try to GC before growing its heap.)

– Hypothesis (2): even though the JVM “senses” free cores and enables concurrent GC, force it to use serial GC instead, so that GC happens in the context of the mutator thread.


Kattis.com data

• Sample of submissions to open.kattis.com written in Java

• X-axis: total CPU consumption (mutator + GC threads)

• Y-axis: log(CPU_standard/CPU_better)

– CPU_standard: -Xmx{memlimit}

• Tries to use parallel GC

– CPU_better: -Xms{memlimit} -Xmx{memlimit} -XX:+UseSerialGC
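The two configurations and the plotted metric can be written down as a short Python sketch. The flags `-Xms`, `-Xmx`, and `-XX:+UseSerialGC` are standard HotSpot options; the helper names, the class name `Main`, the megabyte suffix, and the choice of a base-10 logarithm are assumptions made for illustration, since the slide does not specify them.

```python
import math


def jvm_args(memlimit_mb, better, main_class="Main"):
    """Build the java command line for one of the two configurations."""
    args = ["java"]
    if better:
        # Start heap size equal to the maximum heap size.
        args.append(f"-Xms{memlimit_mb}m")
    args.append(f"-Xmx{memlimit_mb}m")
    if better:
        # Collect in the context of the mutator thread.
        args.append("-XX:+UseSerialGC")
    return args + [main_class]


def log_ratio(cpu_standard, cpu_better):
    """Y-axis value: positive when the 'better' settings used less CPU."""
    return math.log10(cpu_standard / cpu_better)


standard_cmd = jvm_args(1024, better=False)
better_cmd = jvm_args(1024, better=True)
```

A submission that consumed 2.0 s of CPU under the standard settings and 1.0 s under the alternative settings would plot above zero; equal CPU times plot at zero.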


“Real-World” data point: Kattis.com


Conclusion of kattis.com

• Setting the start heap size to the maximum heap size and forcing serial GC works better most of the time (and so is now the default in Kattis)

• Flipside: a larger start heap size forces a larger Eden size and less frequent GC in the nursery, resulting in bad locality.

– For one benchmark, a slowdown of 400% from setting -Xms vs. not.


Infant Mortality


Source: http://java.sun.com/docs/hotspot/gc5.0/gc_tuning_5.html


Generational Collection

• Observation: “most objects die young”

• Allocate objects in a separate area (“nursery”, “Eden space”); collect that area when it runs out of space

– Will typically have to evacuate few survivors

– “minor garbage collection”

• But: must treat all pointers into Eden as roots

– Typically requires cooperation of the mutator threads to record assignments: if ‘b’ is young and ‘a’ is old, a.x = b must add a root for ‘b’.

• Aka “write barrier”
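The effect of the write barrier on a minor collection can be sketched as follows. `Obj`, `write_field`, `extra_roots`, and `minor_collect` are hypothetical names; the sketch records the young target itself as a root, as in the a.x = b example, and never removes entries, so a stale entry could keep a young object alive longer than necessary.

```python
class Obj:
    """Hypothetical heap object that lives in the young or old generation."""

    def __init__(self, name, young):
        self.name = name
        self.young = young
        self.fields = {}


extra_roots = set()     # young objects referenced from the old generation


def write_field(holder, field, value):
    """Write barrier: every reference store goes through this function."""
    if value is not None and value.young and not holder.young:
        extra_roots.add(value)      # old -> young pointer: remember the target
    holder.fields[field] = value


def minor_collect(stack_roots):
    """Return the young objects that survive, without scanning old objects."""
    worklist = [r for r in stack_roots if r.young] + list(extra_roots)
    survivors = set()
    while worklist:
        obj = worklist.pop()
        if obj in survivors:
            continue
        survivors.add(obj)
        worklist += [v for v in obj.fields.values()
                     if v is not None and v.young]
    return survivors


a = Obj("a", young=False)
b = Obj("b", young=True)
c = Obj("c", young=True)
d = Obj("d", young=True)            # garbage: nothing refers to it

write_field(a, "x", b)              # a.x = b, recorded by the barrier
write_field(b, "y", c)              # young -> young, nothing to record

survivor_names = {o.name for o in minor_collect(stack_roots=[])}
```

Without the barrier, `b` and `c` would look unreachable to a collector that only scans the stack and the nursery.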


When to collect

• “Stop-the-world”

– All mutators stop while collection is ongoing

• Incremental

– Mutators perform small chunks of marking

during each allocation

• Concurrent/Parallel

– Garbage collection happens in a concurrently running thread; requires some kind of synchronization between mutator & collector
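The incremental variant can be sketched as a marker that does a bounded amount of work on each allocation. The class and function names are hypothetical, and the sketch assumes the object graph does not change while marking is in progress; a real incremental collector needs barriers to cope with mutation.

```python
class IncrementalMarker:
    """Marks at most `budget` objects per step instead of stopping the world."""

    def __init__(self, roots, edges, budget):
        self.edges = edges
        self.budget = budget
        self.worklist = list(roots)
        self.marked = set()

    def step(self):
        """Do one chunk of marking; return True once marking is complete."""
        done = 0
        while self.worklist and done < self.budget:
            obj = self.worklist.pop()
            if obj in self.marked:
                continue
            self.marked.add(obj)
            self.worklist.extend(self.edges.get(obj, ()))
            done += 1
        return not self.worklist


def allocate(marker, heap, name):
    """Allocation path: the mutator pays for a small chunk of marking."""
    finished = marker.step()
    heap.append(name)
    return finished


edges = {"r": ["o1"], "o1": ["o2"], "o2": ["o3"], "o3": ["o4"]}
marker = IncrementalMarker(["r"], edges, budget=2)
heap = []
steps = 0
while True:
    steps += 1
    if allocate(marker, heap, f"new{steps}"):
        break
```

With five reachable objects and a budget of two, marking completes during the third allocation.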


Example: G1GC

• See Oracle tutorial and InfoQ.

• Source: [Beckwith 2013]


Trade-Offs

• For a good discussion of other trade-offs

related to GC, see this post related to

claims in Go GC


Precise vs. Conservative Collectors

• Precise collectors keep only objects alive that are in fact part of reachability graph

• Conservative collectors may keep objects alive that aren’t

– Reason is typically that they do not know where pointers are stored and must conservatively guess

• In-between forms: some systems assume precise knowledge of heap objects, but not stack frame layouts

– Can be expensive to keep track of where references are stored on the stack, particularly in fully preemptive environments

• Conservatism makes GC usable for languages such as C, but prevents moving/compacting of objects
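Conservative guessing can be illustrated with a toy stack scan. The sketch assumes a heap described as a dictionary from block base address to size, and a stack given as a list of raw words; the addresses and the helper name are made up for illustration.

```python
def conservative_roots(stack_words, blocks):
    """Treat every stack word that falls inside a heap block as a pointer.

    blocks: dict mapping a block's base address to its size in bytes
    Returns the set of base addresses that must be kept alive.
    """
    live = set()
    for word in stack_words:
        for base, size in blocks.items():
            if base <= word < base + size:
                live.add(base)
    return live


blocks = {0x1000: 64, 0x2000: 32, 0x3000: 128}

# An interior pointer, a small integer, and a pointer to a block start.
stack_words = [0x1010, 42, 0x3000]
live = conservative_roots(stack_words, blocks)

# An integer whose value happens to look like an address into block 0x2000.
stack_words.append(0x2004)
live_with_false_pointer = conservative_roots(stack_words, blocks)
```

Because the collector cannot tell whether `0x2004` is a pointer or an integer, it must keep the block alive and cannot move it, since rewriting the word would corrupt the value if it really is an integer.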


Application Programmer’s Perspective

• Dealing with Memory Leaks

– Avoiding bloat

– Avoiding churn

• Tuning garbage collection parameters

• Garbage collection in mixed language

environments

– C code must coordinate with the garbage

collection system in place


Programmer’s Perspective

• Your program is running out of memory. What do you do?

• Possible reasons:

– Leak

– Bloat

• Your program is running slowly and unpredictably

– Churn

– “GC Thrashing”


Memory Leaks

• Objects that remain reachable, but will not be accessed in the future

– Due to application semantics

• Will ultimately lead to out-of-memory condition

– But will degrade performance before that

• Common problem, particularly in multi-layer frameworks

– Containers are a frequent culprit

– Heap profilers can help
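A container-induced leak can be observed directly with a weak reference. This is a minimal sketch using Python's standard `gc` and `weakref` modules; `Session`, `registry`, and `handle_request` are hypothetical names standing in for a framework layer that registers objects and never removes them.

```python
import gc
import weakref


class Session:
    """Hypothetical per-request object."""

    def __init__(self, user):
        self.user = user


registry = []       # long-lived container owned by some framework layer


def handle_request(user):
    session = Session(user)
    registry.append(session)        # registered, but never removed: the leak
    return weakref.ref(session)     # a weak reference does not keep it alive


probe = handle_request("alice")
gc.collect()
leaked = probe() is not None        # still reachable through the registry

registry.clear()                    # the fix: drop the container's reference
gc.collect()
reclaimed = probe() is None
```

The session is never used again after `handle_request` returns, yet the collector must keep it because it remains reachable; only the application knows it is dead.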


Bloat and Churn

• Bloat: use of inefficient, pointer-intensive

data structures increases overall memory

consumption [Chis et al 2011]

– E.g. HashMaps for small objects

• Churn: frequent and avoidable allocation

of objects that turn to garbage quickly

• Caches:

– How to implement and size caches
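One common way to keep a cache from turning into a leak is to bound its size and evict the least recently used entry. The sketch below uses `collections.OrderedDict` from the Python standard library; the `LRUCache` class and its capacity of two are illustrative choices, not a recommendation from the slides.

```python
from collections import OrderedDict


class LRUCache:
    """Bounded cache: memory use is capped at `capacity` entries."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()

    def get(self, key, compute):
        if key in self.entries:
            self.entries.move_to_end(key)       # mark as recently used
            return self.entries[key]
        value = compute(key)
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)    # evict least recently used
        return value


cache = LRUCache(capacity=2)
for key in (1, 2, 1, 3):
    cache.get(key, lambda k: k * k)

cached_keys = list(cache.entries)
```

Sizing is the trade-off: a capacity that is too small causes recomputation and churn, while an unbounded cache behaves like the container leak described on the previous slide.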
