CPSC 388 – Compiler Design and Construction
Heap Management
Areas of Memory Used by a Program
  Program Code
  Static Data
  Heap
  Stack
Heap
  Used for dynamically allocated memory
  Important operations include allocation and de-allocation
  Allocation:
    In C++, Pascal, and Java, allocation is done via “new”
    In C, allocation is done via the “malloc” function call
  De-allocation is done either automatically or explicitly by the programmer:
    Pascal: dispose
    C++: delete
    C: free
    Java: garbage collection
Managing the Heap
  Available memory is managed using a free list: a list of available “chunks”
  Each chunk includes:
    Size of the chunk
    Address of the next item on the free list
    The chunk itself
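Initial Heap Free List
  [Diagram: the heap spans addresses 0 to 103. FirstFree points to a single free chunk whose size field (100) is at address 0 and whose next field (null, drawn as "\") is at address 4.]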
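Request is made to allocate 20 bytes
  Uses the first portion of the first chunk (after the size field) and returns address 4
  [Diagram: the allocated chunk has its size field (20) at address 0 and its data at addresses 4 to 23. FirstFree now points to the remaining free chunk: size field (76) at address 24, next field (null) at address 28, extending to address 103.]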
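Request is made to allocate 10 bytes
  [Diagram: allocated chunks at address 0 (size 20, data at 4 to 23) and at address 24 (size 10, data at 28 to 37). FirstFree points to the remaining free chunk: size field (62) at address 38, next field (null) at address 42, extending to address 103.]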
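First chunk is freed
  Adds the chunk to the front of the free list
  [Diagram: FirstFree points to the freed chunk at address 0 (size 20), whose next field at address 4 points to the free chunk at address 38 (size 62, next null). The chunk at address 24 (size 10) is still allocated.]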
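The walkthrough above can be replayed with a small first-fit allocator. This is only a sketch under stated assumptions: the heap is a 104-byte array, "addresses" are offsets into it, size and next fields are 4 bytes each, and the names heap_init, heap_alloc, heap_free, and NIL are hypothetical. Alignment and coalescing are ignored.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define HEAP_BYTES 104
#define NIL (-1)                 /* hypothetical null offset, drawn as "\" on the slides */

unsigned char heap[HEAP_BYTES];
int32_t first_free;              /* offset of the first chunk on the free list */

/* Read or write a 4-byte field at a heap offset. */
int32_t get32(int32_t at) { int32_t v; memcpy(&v, heap + at, 4); return v; }
void put32(int32_t at, int32_t v) { memcpy(heap + at, &v, 4); }

void heap_init(void) {
    put32(0, HEAP_BYTES - 4);    /* size field: 100 usable bytes */
    put32(4, NIL);               /* next field: end of list */
    first_free = 0;
}

/* First fit: returns the offset of the data area, or NIL on failure. */
int32_t heap_alloc(int32_t n) {
    assert(n > 0);
    if (n < 4) n = 4;            /* assumption: leave room for a next field after free */
    int32_t prev = NIL, c = first_free;
    while (c != NIL) {
        int32_t size = get32(c), next = get32(c + 4);
        if (size >= n) {
            int32_t rest = size - n - 4;   /* bytes left after a new size field */
            int32_t repl;                  /* what replaces c on the free list */
            if (rest >= 4) {               /* split: remainder becomes a free chunk */
                repl = c + 4 + n;
                put32(repl, rest);
                put32(repl + 4, next);
                put32(c, n);
            } else {                       /* too small to split: use the whole chunk */
                repl = next;
            }
            if (prev == NIL) first_free = repl; else put32(prev + 4, repl);
            return c + 4;
        }
        prev = c;
        c = next;
    }
    return NIL;
}

/* Add the chunk whose data starts at addr to the front of the free list. */
void heap_free(int32_t addr) {
    int32_t c = addr - 4;
    put32(c + 4, first_free);
    first_free = c;
}
```

Calling heap_init(), then heap_alloc(20), heap_alloc(10), and heap_free(4) reproduces the sizes and addresses from the diagrams (76 and 62 bytes remaining, free list starting at address 0).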
Operations on Free List
  Request space: find a satisfactory chunk
  Free space: return the chunk to the free list
Goals for Operations
  Only fail to satisfy a request for n bytes if there are not n bytes available on the free list
  Do both operations quickly
Questions to Consider
  Given a request for n bytes, which n bytes to return?
  Given a de-allocation of a chunk, how to coalesce it with neighboring free chunks?
Techniques for Allocation
  Best Fit: find the chunk on the free list with the smallest size greater than or equal to the allocation request
    May require a search of the entire free list (SLOW!)
    Leaves lots of little pieces of free storage on the list
Techniques for Allocation
  First Fit: use the first chunk with size greater than or equal to n
    Faster than best fit
    Produces little pieces of free storage at the front of the list, which slows later searches
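The difference between the two search policies can be sketched as follows. The Chunk struct and the function names are assumptions made for illustration; only the search is shown, not splitting or unlinking.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical free-list node: a size plus a link to the next free chunk. */
typedef struct Chunk {
    size_t size;
    struct Chunk *next;
} Chunk;

/* First fit: stop at the first chunk that is large enough. */
Chunk *first_fit(Chunk *list, size_t n) {
    assert(n > 0);
    for (Chunk *c = list; c != NULL; c = c->next)
        if (c->size >= n)
            return c;
    return NULL;
}

/* Best fit: scan the whole list and keep the smallest adequate chunk. */
Chunk *best_fit(Chunk *list, size_t n) {
    assert(n > 0);
    Chunk *best = NULL;
    for (Chunk *c = list; c != NULL; c = c->next)
        if (c->size >= n && (best == NULL || c->size < best->size))
            best = c;
    return best;
}
```

On a free list with chunk sizes 16, 64, 24 (in that order), a request for 20 bytes selects the 64-byte chunk under first fit and the 24-byte chunk under best fit.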
Techniques for Allocation
  Circular First Fit: make the free list circular (i.e., have the last item point back to the first item)
    Satisfy requests using the first chunk with size greater than or equal to n
    Change the free-list pointer to point to the next chunk after the allocated one
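A minimal sketch of the circular search, assuming a hypothetical Chunk struct and a "rover" variable that plays the role of the free-list pointer. Unlinking and splitting the chosen chunk are left out.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Chunk {
    size_t size;
    struct Chunk *next;   /* last chunk points back to the first */
} Chunk;

/* Search once around the circle starting at *rover.
 * On success, advance *rover to the chunk after the one chosen. */
Chunk *circular_first_fit(Chunk **rover, size_t n) {
    assert(rover != NULL);
    Chunk *start = *rover;
    if (start == NULL)
        return NULL;              /* empty free list */
    Chunk *c = start;
    do {
        if (c->size >= n) {
            *rover = c->next;     /* next search starts after this chunk */
            return c;
        }
        c = c->next;
    } while (c != start);         /* stop after one full lap */
    return NULL;
}
```

Because each search resumes where the last one stopped, small leftover pieces do not pile up at one end of the list the way they do with plain first fit.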
Techniques for De-allocation
  Use a doubly-linked list
    Each chunk has a previous and a next pointer
  One bit of the size field is reserved to indicate whether the chunk is “free” or “in-use”
  Check the free bit of the storage after the chunk
    If the following chunk is free, then coalesce
  Follow example on board
Techniques for De-allocation
  Can also coalesce with the preceding chunk if you keep the size of the chunk at both the beginning and the end of the chunk
  Follow example on board
  Note that NO pointers need to be updated
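Coalescing in both directions can be sketched with a size-and-free-bit tag stored at each end of every chunk. The layout here is an assumption: the heap is an array of 32-bit words, sizes count words including both tags, and the doubly-linked free list itself is omitted so that only the neighbor checks are shown.

```c
#include <assert.h>
#include <stdint.h>

#define HEAP_WORDS 32
uint32_t heap[HEAP_WORDS];

/* tag = (chunk size in words << 1) | free bit */
uint32_t tag_size(uint32_t tag) { return tag >> 1; }
int tag_free(uint32_t tag) { return (int)(tag & 1u); }

/* Write the same tag at the beginning and the end of a chunk. */
void set_tags(uint32_t start, uint32_t size, int is_free) {
    uint32_t tag = (size << 1) | (is_free ? 1u : 0u);
    heap[start] = tag;
    heap[start + size - 1] = tag;
}

/* Free the in-use chunk at word index start, merging it with any free
 * neighbor. Returns the start index of the resulting free chunk. */
uint32_t free_and_coalesce(uint32_t start) {
    assert(!tag_free(heap[start]));
    uint32_t size = tag_size(heap[start]);

    uint32_t next = start + size;              /* header of the following chunk */
    if (next < HEAP_WORDS && tag_free(heap[next]))
        size += tag_size(heap[next]);

    if (start > 0 && tag_free(heap[start - 1])) {   /* footer of the preceding chunk */
        uint32_t prev_size = tag_size(heap[start - 1]);
        start -= prev_size;
        size += prev_size;
    }

    set_tags(start, size, 1);
    return start;
}
```

The end-of-chunk tag is what makes the backward step possible: the word just before a chunk gives the preceding chunk's size, so its start can be computed without following any pointer.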
Automatic or Explicit De-allocation
  In C++ and C, de-allocation must be done explicitly
  In Java, de-allocation is done automatically (by the garbage collector)
  Making it automatic reduces the burden on the programmer (and eliminates some types of errors)
Errors of Explicit De-allocation
  Storage Leaks
    Some storage is never freed even though it is inaccessible
      Listnode *p = malloc( sizeof(Listnode) );
      .
      .   // no copy from p in this code
      .
      p = ...;
Errors of Explicit De-allocation
  Dangling Pointers
    A pointer that points to memory that has been freed
      May read garbage
      May mess up the free list
      May corrupt other variables
Example Dangling Pointers
      Listnode *p, *q;
      p = malloc( sizeof(Listnode) );
      q = p;
      .
      .   // no assignment to q in this code
      .
      free(p);
      .
      .   // no assignment to q in this code
      .
      *q = ...;
Detecting Dangling Pointers
  Add a new field, the lock, to every allocated chunk (like the size field)
  Add a new field, the key, to every pointer (in addition to the address it stores)
  If the lock does not match the key, then raise an error
Detecting Dangling Pointers
  Each free chunk’s lock is set to 0
  When a chunk is allocated, its lock and the key of the returned pointer are both assigned the same new value (always increasing)
  When storage is freed, set the lock back to zero
  When a pointer is dereferenced, the compiler generates code that first checks that the key matches the lock, and otherwise causes an error
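The lock-and-key check can be sketched as below. Everything here is an illustrative assumption: chunks come from a small fixed pool so that a freed chunk's lock can still be inspected safely, the pointer-plus-key pair is a struct called CheckedPtr, and a failed check returns NULL where generated code would raise an error.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SIZE 4

typedef struct { unsigned lock; int value; } Chunk;        /* lock lives in the chunk */
typedef struct { Chunk *addr; unsigned key; } CheckedPtr;  /* key lives in the pointer */

Chunk pool[POOL_SIZE];        /* lock == 0 means the chunk is free */
unsigned next_key = 1;        /* always increasing, never 0 */

CheckedPtr checked_alloc(void) {
    for (int i = 0; i < POOL_SIZE; i++) {
        if (pool[i].lock == 0) {
            pool[i].lock = next_key++;
            CheckedPtr p = { &pool[i], pool[i].lock };
            return p;
        }
    }
    CheckedPtr none = { NULL, 0 };
    return none;
}

/* The check a compiler would emit before every dereference. */
int checked_valid(CheckedPtr p) {
    return p.addr != NULL && p.key != 0 && p.addr->lock == p.key;
}

void checked_free(CheckedPtr p) {
    assert(checked_valid(p));   /* also catches a double free */
    p.addr->lock = 0;
}

/* Returns the address of the payload, or NULL if the pointer is dangling. */
int *checked_deref(CheckedPtr p) {
    return checked_valid(p) ? &p.addr->value : NULL;
}
```

Because keys keep increasing, a stale copy of a pointer still fails the check after its chunk has been handed out again to a different allocation.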
Automatic De-allocation
  Determine whether a chunk of storage is no longer accessible to the program
  Make de-allocation efficient; avoid long pauses in the program’s execution during de-allocation
  Two approaches:
    Reference Counting
    Garbage Collection
Reference Counting
  Include an invisible field in every chunk of storage: its reference count field
  Value of the field is the number of pointers that point to the chunk
  Value is initialized to 1 when the chunk is allocated, and is then updated:
    When a pointer is copied, a new reference is created, so the reference count of the chunk must be incremented
    When a non-null pointer’s value is over-written, a reference is removed, so the reference count of the chunk it pointed to (before the over-write) must be decremented
  When a reference count becomes zero, nothing points to the chunk, so it can be de-allocated and added to the free list
    If the chunk contains pointers to other chunks, then their reference counts must be decremented
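These update rules can be sketched for a node with a single pointer field. The names rc_new, rc_assign, rc_release, and the live_nodes counter are hypothetical; malloc and free stand in for the free-list operations.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Node {
    int refcount;            /* the "invisible" field */
    struct Node *next;
} Node;

int live_nodes = 0;          /* demo-only counter so frees can be observed */

/* Allocate a node; the count starts at 1 for the pointer returned. */
Node *rc_new(void) {
    Node *n = malloc(sizeof *n);
    assert(n != NULL);
    n->refcount = 1;
    n->next = NULL;
    live_nodes++;
    return n;
}

/* Remove one reference; free the node, and then whatever it pointed to,
 * for as long as counts keep reaching zero. */
void rc_release(Node *n) {
    while (n != NULL && --n->refcount == 0) {
        Node *child = n->next;   /* the freed node's pointer goes away too */
        free(n);
        live_nodes--;
        n = child;
    }
}

/* Pointer assignment "*slot = value" with count maintenance.
 * The increment comes first so that self-assignment is safe. */
void rc_assign(Node **slot, Node *value) {
    if (value != NULL)
        value->refcount++;
    Node *old = *slot;
    *slot = value;
    rc_release(old);
}
```

Every pointer write goes through rc_assign in this sketch, which is where the run-time cost of reference counting comes from.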
Problems with Reference Counting
  Slows program execution
    Every write into a pointer must test whether the old value is null
    Requires updates to reference counts
  Cyclic structures cannot be de-allocated
      var p: Nodeptr;   { p is a pointer to a node }
      new(p);           { p points to new storage; reference count is 1 }
      p^.next := p;     { next field of node points to node, so
                          now the reference count is 2 }
      p := nil;         { p's value is over-written, so the node's reference
                          count is decremented (from 2 to 1). In fact, the node
                          is inaccessible (it points to itself, no other pointer
                          points to it), but we can't tell that just from the
                          reference count. }
Garbage Collection
  Wait until no storage is left, then:
    Find all accessible objects
    Free all other (inaccessible) objects
Several Approaches to Garbage Collection
  Mark and Sweep
  Stop and Copy
Mark and Sweep
  Two phases:
    Mark phase finds and marks all accessible objects
    Sweep phase sweeps through the heap, collecting all of the garbage and putting it back on the free list
  Another “invisible” value in each chunk, called the mark bit
    Initialized to 0
    Set to 1 if the chunk is reached during the mark phase
Mark Phase
  Put all “active” pointers on a worklist (“active” means the pointer is on the stack or in the static data area)
  While the worklist is not empty do:
      p = select_pointer(worklist)      (removes p from the worklist)
      if p’s object’s mark-bit is zero:
          change it to one
          put all pointers in p’s object on the worklist
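The worklist loop above can be sketched in C as follows. The Obj layout (a mark field plus two pointer fields), the fixed-size worklist, and the roots array standing in for the stack and static data area are all assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FIELDS 2
#define MAX_WORK 64        /* demo-sized worklist; a real collector must not overflow */

typedef struct Obj {
    int mark;                          /* the mark bit */
    struct Obj *field[MAX_FIELDS];     /* pointers stored inside the object */
} Obj;

/* Push a non-null pointer onto the worklist. */
static void push(Obj **worklist, size_t *top, Obj *p) {
    if (p != NULL) {
        assert(*top < MAX_WORK);
        worklist[(*top)++] = p;
    }
}

/* Mark every object reachable from the given roots. */
void mark_phase(Obj **roots, size_t nroots) {
    Obj *worklist[MAX_WORK];
    size_t top = 0;

    for (size_t i = 0; i < nroots; i++)
        push(worklist, &top, roots[i]);

    while (top > 0) {
        Obj *p = worklist[--top];      /* select and remove a pointer */
        if (p->mark == 0) {
            p->mark = 1;
            for (int f = 0; f < MAX_FIELDS; f++)
                push(worklist, &top, p->field[f]);
        }
    }
}
```

Checking the mark bit before pushing an object's fields is what makes the loop terminate on cyclic structures: an already marked object contributes nothing new to the worklist.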
Sweep Phase
  Looks at every chunk of storage in the heap
    How?
  If the mark-bit for a chunk is 0, add the chunk to the free list
  If the mark-bit for a chunk is 1, change it to 0
  When adding to the free list, coalesce neighboring chunks
  See example on board
Stop and Copy Garbage Collection
  Heap is divided into two parts:
    Old space: used for allocation of new chunks
    New space: used for garbage collection
  First-free pointer points to the first free space in old space
  When an allocation request is made for n bytes: if space is available in old space, make the allocation; otherwise perform garbage collection
Stop and Copy Garbage Collection
  Find all accessible objects (following the same method as mark and sweep)
  Copy each accessible object from old space to new space (no mark bit is needed)
  After making all copies, reverse the roles of old and new space
  First-free pointer points to the beginning of the free area in the “new” old space (just past the copied objects)
Stop and Copy Garbage Collection
  When a chunk is copied from old space to new space, ALL pointers to the chunk must be updated
  A forwarding pointer is left behind in old space and used to update other pointers to the same object
  Follow example on board
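Forwarding pointers can be sketched with fixed-size objects and two small arrays as the two spaces. The names and layout are assumptions, and the copy is written recursively for brevity rather than with the worklist used for marking.

```c
#include <assert.h>
#include <stddef.h>

#define SPACE 8

typedef struct Obj {
    struct Obj *forward;     /* NULL until this object has been copied */
    int value;
    struct Obj *field[2];
} Obj;

Obj space_a[SPACE], space_b[SPACE];
Obj *old_space = space_a;    /* allocation happens here */
Obj *new_space = space_b;    /* copies go here during collection */
size_t first_free = 0;       /* index of the next free slot in old space */
size_t copied = 0;           /* slots used so far in new space */

/* Copy p into new space (once) and return its new address. */
Obj *copy(Obj *p) {
    if (p == NULL)
        return NULL;
    if (p->forward != NULL)
        return p->forward;            /* already moved: follow the forwarding pointer */
    assert(copied < SPACE);
    Obj *q = &new_space[copied++];
    *q = *p;
    q->forward = NULL;
    p->forward = q;                   /* set before copying fields, so cycles terminate */
    for (int f = 0; f < 2; f++)
        q->field[f] = copy(p->field[f]);
    return q;
}

/* Copy everything reachable from the roots, then swap the two spaces. */
void collect(Obj **roots, size_t nroots) {
    copied = 0;
    for (size_t i = 0; i < nroots; i++)
        roots[i] = copy(roots[i]);
    Obj *tmp = old_space;
    old_space = new_space;
    new_space = tmp;
    first_free = copied;              /* free area starts just past the copies */
}
```

After collect returns, the surviving objects sit contiguously at the start of the new old space, and allocation can resume by advancing first_free.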
Advantages of Stop and Copy
  Allocation is cheaper (no need to search a free list, just advance the first-free pointer)
  No free list, just one chunk of free memory, so no need to coalesce chunks
  Cheaper than mark and sweep: no need to scan the entire heap
  Compaction places objects closer together (fewer cache misses, fewer page faults)
Identifying Pointers
  Automatic de-allocation requires the ability to find all pointers on the stack
  Possible approaches:
    Every word has a one-bit tag (0 for non-pointer, 1 for pointer)
    Maintain a separate bit-map of tags
    Associate a type tag with each variable and each object
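The one-bit tag option can be sketched by using the low bit of each word. This relies on two assumptions: pointers are at least 2-byte aligned, so their low bit is free, and integers are stored shifted left by one, which costs a bit of range. All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t Word;      /* one tagged machine word */

/* Non-pointer: value shifted left, low bit 0. */
Word tag_int(uintptr_t v) { return v << 1; }

/* Pointer: address with the low bit set to 1. */
Word tag_ptr(void *p) {
    assert(((uintptr_t)p & 1u) == 0);    /* requires an aligned address */
    return (uintptr_t)p | 1u;
}

int is_ptr(Word w) { return (int)(w & 1u); }

uintptr_t untag_int(Word w) { return w >> 1; }

void *untag_ptr(Word w) {
    assert(is_ptr(w));
    return (void *)(w & ~(uintptr_t)1);
}

/* What a collector would do when scanning a stack frame for roots. */
size_t count_pointers(const Word *frame, size_t nwords) {
    size_t n = 0;
    for (size_t i = 0; i < nwords; i++)
        if (is_ptr(frame[i]))
            n++;
    return n;
}
```

A separate bit-map gives the same information without stealing a bit from each word, at the cost of extra storage that has to be kept in step with the data.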
Summary
  Two methods of storage de-allocation
    Programmer controlled
    Automatic
  Programmer-controlled errors include:
    Storage leaks
    Corrupted memory via dangling pointers
  Automatic de-allocation
    Reference counting
      High space and time overhead
      Cannot free cyclic structures
      Cost is distributed over the execution of the program
    Garbage collection
      Mark and Sweep
      Stop and Copy