Operating Systems and Computer Networks (OSCN)
Memory Management I
Prof. Dr.-Ing. Axel Hunger, Alexander Maxeiner, M.Sc. & Dr.-Ing. Pascal A. Klein
Institute of Computer Engineering, Faculty of Engineering
University Duisburg-Essen
Agenda
– Goals of memory management
– Memory hierarchy
– Cache performance
– Cache organization
– Memory organization
  – Overview
  – Algorithms
Memory Management
Goals of memory management:
– Convenient abstraction for programming
– Allocation of scarce memory resources among competing processes
– Maximizing performance with minimal overhead
Mechanisms:
– Physical and virtual addressing
– Partitioning, paging, segmentation
– Page replacement algorithms
Memory Hierarchy (1)
(Memory pyramid, fastest to slowest:)
– Processor registers: power on, immediate term; small size, small capacity; very fast, very expensive
– Processor cache: power on, very short term; small size, small capacity; very fast, very expensive
– Random access memory: power on; medium size, medium capacity; fast, affordable
– Flash / USB memory: power off, short term; medium size, large capacity; slower, cheap
– Hard drives: power off, mid term; large size, very large capacity; slow, very cheap
– Tape backup: power off, long term; large size, very large capacity; very slow, affordable
Memory Hierarchy (2)
(Reduced hierarchy inside a typical computer, fastest to slowest:)
– Processor registers: power on, immediate term; small size, small capacity; very fast, very expensive
– Processor cache: power on, very short term; small size, small capacity; very fast, very expensive
– Random access memory: power on; medium size, medium capacity; fast, affordable
– Hard drives: power off, mid term; large size, very large capacity; slow, very cheap
Access time improvements
Amdahl's Law:
"The performance improvement to be gained from using a faster mode of execution is limited by the fraction of the time the faster mode can be used."
Limitations of Amdahl's law:
– It did not originally consider cache memory.
– Processors cannot be parallelized indefinitely.
– Hyper-threading is difficult to include.
Performance improvement
The improvement of a system in numbers:

S_overall = t_old / t_new = 1 / ((1 - P_improved) + P_improved / S_improved)

→ The overall speedup of a system is determined by the fraction of time P_improved during which the improved component is used, and by the speedup factor S_improved of that component.
Cache example
A system without cache is improved with a cache. The cache is 10 times faster than main memory and the chance of finding the required data in the cache is 90%:

S_PC = 1 / ((1 - P_cache) + P_cache / S_improvement) = 1 / ((1 - 0.9) + 0.9 / 10) = 1 / 0.19 ≈ 5.26
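The speedup calculation above can be checked with a few lines of code (a minimal sketch; the function name and the extra example are illustrative, only the 90% / 10x numbers come from the slide):

```python
def amdahl_speedup(p_improved: float, s_improved: float) -> float:
    """Overall speedup when a fraction p_improved of the work
    is served by a component that is s_improved times faster."""
    return 1.0 / ((1.0 - p_improved) + p_improved / s_improved)

# Cache example from the slide: 90% hit chance, cache 10x faster than RAM.
print(round(amdahl_speedup(0.9, 10), 2))  # 5.26
```

Note how the unimproved 10% of accesses dominates: even an infinitely fast cache could not push the speedup above 1 / 0.1 = 10.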
Speed examples
Example access times in a modern computer system. The table shows the large gap between the access times of cache and main memory; register access depends on the clock speed of the CPU.

Name:        CPU       L1 Cache  L2 Cache  Main Memory  I/O Device
Type:        Register  SRAM      SRAM      DDR3 RAM     Drives
Size:        256 Byte  32 KiB    256 KiB   4 GiB        >128 GiB
Access time: 0.28 ns   ~1 ns     ~3 ns     ~40 ns       ~5 ms
Cache performance
To calculate the average access time of a memory, the following information needs to be available:
– Access time in case of a hit
– Probability of finding the data in that memory (hit rate)
– Penalty time in case of a miss
With this information the average access time can be calculated as:

t_acc = P_hit * t_hit + (1 - P_hit) * t_penalty
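The formula translates directly into code (a sketch; the hit rate of 95% is an assumed example, the 1 ns / 40 ns times are taken from the speed table above):

```python
def avg_access_time(p_hit: float, t_hit: float, t_penalty: float) -> float:
    """Average access time: t_acc = P_hit * t_hit + (1 - P_hit) * t_penalty."""
    return p_hit * t_hit + (1 - p_hit) * t_penalty

# Assumed L1 example: 1 ns on a hit, 40 ns main-memory penalty, 95% hit rate.
print(round(avg_access_time(0.95, 1.0, 40.0), 2))  # 2.95 (ns)
```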
Cache addresses
– Addresses in cache memory are ordered by number; the starting address of a memory block is 0.
– Caches are organized in lines. Current CPUs use 64 bytes per cache line, so 6 bits of the address are needed for the offset within a line.
– The L1 cache in AMD CPUs is 32 KiB for instructions and (up to) 64 KiB for data.
– The L1 cache in Intel Haswell CPUs is 32 KiB for instructions and 32 KiB for data.
– Both CPUs work with 16-bit addresses.
– Higher-level caches increase in size and access time.
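With 64-byte lines, the low 6 bits of an address are the line offset; the next bits select the set and the rest forms the tag. A sketch of this decomposition (the set count of 64 is an assumed example, not a figure from the slide):

```python
LINE_SIZE = 64  # bytes per cache line -> 6 offset bits
NUM_SETS = 64   # assumed example -> 6 index bits

OFFSET_BITS = LINE_SIZE.bit_length() - 1  # 6
INDEX_BITS = NUM_SETS.bit_length() - 1    # 6

def split_address(addr: int) -> tuple:
    """Decompose an address into (tag, set index, byte offset)."""
    offset = addr & (LINE_SIZE - 1)
    index = (addr >> OFFSET_BITS) & (NUM_SETS - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset

print(split_address(0xABCD))  # (10, 47, 13)
```

The tag is what the cache must store per line so that the origin address of the data can be reconstructed when a modified line is written back.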
Memory allocation
– Memory allocation is part of the responsibilities of the memory management unit. It needs to be fast and precise.
– The origin addresses of data need to be tracked to ensure that modified data sets are stored back correctly.
– Allocation strategies differ depending on the type of memory:
  – The L1 cache is small, so there is no space for complicated address bookkeeping → precision and hit chance have high priority.
  – The L2 cache is bigger → the goals are to decrease swapping of data and to make maximum use of the space.
Cache organization (L1)
An L1 cache is organized in one of three ways:
– Fully associative
– Direct mapped
– N-way set associative
All of these allocation methods derive the target address in the L1 cache from the original memory address. Backtracking to the address of the original data is therefore easy and fast: if modified data is swapped out of the cache, its origin is determined and the content there is replaced.
Cache mapping
– A fully associative cache allows data to be placed anywhere in the cache; the address of origin must be stored alongside it.
– Direct-mapped allocation: data is stored in the cache block given by
  (block address) % (number of blocks in cache)
– N-way set-associative allocation: data is stored in the cache set given by
  (block address) % (number of sets in cache)
  where N is the number of cache lines per set.
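The two modulo rules can be written down directly (a sketch; the cache sizes in the usage example are assumptions, not figures from the slide):

```python
def direct_mapped_block(block_addr: int, n_blocks: int) -> int:
    """Direct mapped: each block address maps to exactly one cache block."""
    return block_addr % n_blocks

def set_associative_set(block_addr: int, n_sets: int) -> int:
    """N-way set associative: the block may go into any of the N lines of its set."""
    return block_addr % n_sets

# Assumed example: a 128-block cache, alternatively organized as 32 sets of 4 lines.
print(direct_mapped_block(1000, 128))  # 104
print(set_associative_set(1000, 32))   # 8
```

A fully associative cache is the limiting case of one set containing all lines, so the modulo disappears entirely and only the tag comparison remains.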
Mapping and hit chance
There is a direct connection between the cache mapping method and the hit chance of a data access. Higher hit chances increase the performance of the overall system.
Memory allocation
– L2 caches and higher-level memory use different allocation methods.
– The size of the data can vary, so algorithms are needed that find free space fast.
– Due to constant swapping of data, external fragmentation is a common problem of cache and memory.
– Deleting data is only acceptable if it is unused or if no free area is available for process-critical data.
– The more data is stored in a cache, the faster tasks can be executed. Since cache size is limited, only the most important data can be stored, which requires maximum data efficiency.
Example of cache memory
– Assume an L2 cache of 0.5 MiB, partially used by other tasks.
– A new request for data arrives at the memory allocation unit of the CPU.
– The behavior of the allocation unit depends on the underlying algorithm.
Memory map (512 KiB total): OS: 132 KiB | Free: 120 KiB | Process B: 100 KiB | Process D: 60 KiB | Free: 60 KiB | Free: 40 KiB
Algorithms to allocate memory
Assume a new process E requests 30 KiB. Which area should it use?
Available algorithms:
– First fit
– Next fit
– Best fit
– Worst fit
Memory map: OS: 132 KiB | Free: 120 KiB | Process B: 100 KiB | Process D: 60 KiB | Free: 60 KiB | Free: 40 KiB
Memory Allocation: First Fit
– The simplest algorithm.
– Scans along the list of areas (from the beginning) until it finds a sufficiently large free area.
– Breaks that area into two pieces: one for the process, one for the remaining unused memory (a new free area).
Memory map: OS: 132 KiB | Free: 120 KiB | Process B: 100 KiB | Process D: 60 KiB | Free: 60 KiB | Free: 40 KiB
Memory Allocation: First Fit
– Process E (30 KiB) fits into the 60 KiB hole, which is split into 30 KiB for E and a new 30 KiB free area.
– Very fast: searches as little as possible.
Memory map: OS: 132 KiB | Free: 120 KiB | Process B: 100 KiB | Process D: 60 KiB | Process E: 30 KiB | Free: 30 KiB | Free: 40 KiB
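The first-fit scan can be sketched in a few lines (the hole sizes and their scan order are taken from the slide example; the function shape itself is illustrative):

```python
def first_fit(holes, request):
    """Scan free areas from the beginning; split the first one that fits.
    Returns (hole index, leftover size), or None if nothing fits."""
    for i, size in enumerate(holes):
        if size >= request:
            return i, size - request
    return None

# Free areas in scan order as in the slide: 60, 40 and 120 KiB.
print(first_fit([60, 40, 120], 30))  # (0, 30): the 60 KiB hole is split into 30 + 30
```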
Memory Allocation: Next Fit
– A variation of first fit.
– Keeps track of where it found a free area; next time it starts searching at the position where it left off.
– Assumption: prospective fitting areas come after already-found holes.
– However, in practice it performs slightly worse than first fit.
Memory map: OS: 132 KiB | Free: 120 KiB | Process B: 100 KiB | Process D: 60 KiB | Free: 60 KiB | Free: 40 KiB
Memory Allocation: Next Fit
– As with first fit, process E (30 KiB) fits into the 60 KiB hole; the search position is remembered.
Memory map: OS: 132 KiB | Free: 120 KiB | Process B: 100 KiB | Process D: 60 KiB | Process E: 30 KiB | Free: 30 KiB | Free: 40 KiB
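Next fit can be sketched as first fit plus a rolling start position (one common formulation; here the pointer advances past the hole that was just split, which matches the slide's example of F landing in the 40 KiB hole):

```python
class NextFit:
    """First fit with a remembered start position that wraps around once."""
    def __init__(self, holes):
        self.holes = list(holes)
        self.pos = 0  # where the next search starts

    def allocate(self, request):
        n = len(self.holes)
        for step in range(n):
            i = (self.pos + step) % n
            if self.holes[i] >= request:
                self.holes[i] -= request
                self.pos = (i + 1) % n  # continue after this hole next time
                return i
        return None

nf = NextFit([60, 40, 120])  # free areas in scan order, as in the slide
print(nf.allocate(30))  # 0: process E goes into the 60 KiB hole
print(nf.allocate(30))  # 1: process F resumes the search and takes the 40 KiB hole
```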
Memory Allocation: Next Fit
– After process E (60 KiB hole), a second request arrives: process F (30 KiB). The search resumes where it left off, so F fits into the 40 KiB hole, leaving 10 KiB free.
Memory map: OS: 132 KiB | Free: 120 KiB | Process B: 100 KiB | Process D: 60 KiB | Process E: 30 KiB | Free: 30 KiB | Process F: 30 KiB | Free: 10 KiB
Memory Allocation: Best Fit
– Searches the entire list (from beginning to end) and takes the smallest free space that is large enough.
– Assumption: best spatial use of memory.
Memory map: OS: 132 KiB | Free: 120 KiB | Process B: 100 KiB | Process D: 60 KiB | Free: 60 KiB | Free: 40 KiB
Memory Allocation: Best Fit
– Process E (30 KiB) fits into the 40 KiB hole, leaving 10 KiB free.
– Takes much CPU time (for searching the whole list).
– Results in more wasted memory, because numerous tiny, useless free areas remain.
Memory map: OS: 132 KiB | Free: 120 KiB | Process B: 100 KiB | Process D: 60 KiB | Free: 60 KiB | Process E: 30 KiB | Free: 10 KiB
Memory Allocation: Worst Fit
– A variation of best fit: always takes the largest hole.
– Assumption: this avoids splitting holes into tiny fragments, so the remainder stays big enough for other processes.
Memory map: OS: 132 KiB | Free: 120 KiB | Process B: 100 KiB | Process D: 60 KiB | Free: 60 KiB | Free: 40 KiB
Memory Allocation: Worst Fit
– Process E (30 KiB) fits into the 120 KiB hole, leaving 90 KiB free.
– Takes much CPU time (for searching).
– Still wastes memory.
Memory map: OS: 132 KiB | Free: 90 KiB | Process E: 30 KiB | Process B: 100 KiB | Process D: 60 KiB | Free: 60 KiB | Free: 40 KiB
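Best fit and worst fit differ only in whether the smallest or the largest fitting hole is chosen, which a short sketch over the slide's free-area list makes explicit:

```python
def best_fit(holes, request):
    """Index of the smallest free area that still fits, or None."""
    fitting = [(size, i) for i, size in enumerate(holes) if size >= request]
    return min(fitting)[1] if fitting else None

def worst_fit(holes, request):
    """Index of the largest free area, or None if even that does not fit."""
    fitting = [(size, i) for i, size in enumerate(holes) if size >= request]
    return max(fitting)[1] if fitting else None

holes = [60, 40, 120]        # free areas from the slide example
print(best_fit(holes, 30))   # 1: the 40 KiB hole, leaving a 10 KiB fragment
print(worst_fit(holes, 30))  # 2: the 120 KiB hole, leaving 90 KiB
```

Both variants scan the whole list, which is exactly the CPU-time cost the slides point out; sorted hole lists (next slide) remove that cost.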
Memory Allocation: Enhancing performance
– Maintain separate lists for processes AND holes:
  • the algorithms only have to inspect holes, not processes;
  • BUT deallocation becomes more expensive (both lists must be updated).
– Sort the lists by size: this enhances the speed of best fit / worst fit.
In practice:
– First fit is usually better.
– Next fit is pointless when sorted lists are used.
– Worst fit is at its worst if allocated memory can't be reorganized easily.
Buddy System
– A cache organized with the buddy system is usually fast and memory-efficient.
– Memory is encapsulated in units of powers of 2; process requests are rounded up to fit into these units.
– Possible hole sizes: ..., 4 KiB, 8 KiB, 16 KiB, 32 KiB, 64 KiB, 128 KiB, ...
(Figure: a 128 KiB block split into 64 KiB, 32 KiB, 16 KiB, 8 KiB and two 4 KiB buddies.)
Buddy System – Splitting (1)
Example: suppose a memory of 128 KiB, i.e. one hole of 128 KiB.
Process A requests 6 KiB, rounded up to 8 KiB (2 KiB wasted).
(Figure: one free 128 KiB block.)
Buddy System – Splitting (2)
Example: suppose a memory of 128 KiB. Process A requests 6 KiB, rounded up to 8 KiB (2 KiB wasted). The 128 KiB block is split in halves repeatedly until an 8 KiB unit is available for A.
– The next process repeats the algorithm and is stored in the first fitting memory unit.
– If no memory unit is capable of storing the data, other data needs to be deleted.
(Figure: 64 KiB free | 32 KiB free | 16 KiB free | 8 KiB free | A: 8 KiB)
Buddy System – Merging (1)
– Example: suppose a memory of 128 KiB, filled as shown below.
– A new data request for 52 KiB arrives; no free memory block fits.
– Data is now deleted according to the underlying swapping algorithm.
(Figure, 128 KiB total: A: 8 KiB | free 8 KiB | C: 4 KiB | free 4 KiB | free 8 KiB | B: 32 KiB | D: 32 KiB | free 32 KiB)
Buddy System – Merging (2)
– If data is deleted, adjacent blocks merge.
– Blocks can only merge to 'restore' the original size from which they were split, i.e. only with the buddy they were split from.
(Figure: the same 128 KiB layout as before.)
Buddy System – Merging (3)
– Deleting data A results in a merger of both adjacent 8 KiB memory units into one 16 KiB unit, since they were split apart during the allocation process.
(Figure: A's 8 KiB unit and its free 8 KiB buddy merge into 16 KiB; B, C and D are unchanged.)
Buddy System – Merging (4)
– Deleting data B will not result in a merger, since the neighboring 32 KiB block it would need to merge with is still occupied.
(Figure: B's 32 KiB unit becomes free, but its 32 KiB buddy is still in use.)
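The splitting and merging rules above can be condensed into a toy buddy allocator (a sketch, not the allocator of any real CPU or OS; sizes are in KiB, and the XOR trick for locating a block's buddy relies on all blocks starting at multiples of their power-of-two size):

```python
def next_pow2(n):
    """Round a request up to the next power of two (e.g. 6 -> 8)."""
    p = 1
    while p < n:
        p *= 2
    return p

class Buddy:
    """Toy buddy allocator over `total` KiB (total must be a power of two)."""
    def __init__(self, total):
        self.total = total
        self.free = {total: [0]}  # block size -> start offsets of free blocks

    def alloc(self, request):
        size = next_pow2(request)
        s = size
        while s <= self.total and not self.free.get(s):
            s *= 2                # find the smallest free block that fits
        if s > self.total:
            return None
        start = self.free[s].pop(0)
        while s > size:           # split down, keeping each upper half free
            s //= 2
            self.free.setdefault(s, []).append(start + s)
        return start, size

    def release(self, start, size):
        while size < self.total:
            buddy = start ^ size  # the buddy's offset differs in exactly one bit
            peers = self.free.get(size, [])
            if buddy in peers:    # merge only with the buddy it was split from
                peers.remove(buddy)
                start = min(start, buddy)
                size *= 2
            else:
                break
        self.free.setdefault(size, []).append(start)

b = Buddy(128)
a = b.alloc(6)  # rounded up to 8 KiB; 128 splits into 64, 32, 16, 8 + 8
print(a)        # (0, 8)
b.release(*a)   # deleting A merges everything back into one 128 KiB hole
print({s: v for s, v in b.free.items() if v})  # {128: [0]}
```

The release path shows the slide's rule directly: a block only merges upward while its exact buddy is free, and stops as soon as the buddy is occupied.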
Conclusion
– The memory allocation method depends on the requirements of the hardware.
– In PC cache memory, n-way set-associative allocation is used for optimal usage, hit chance, and reduced swapping.
– Embedded systems with small caches and less data to handle might use a direct-mapped cache.
– The first-fit allocation method is usually fast; worst fit / best fit are more space-efficient depending on the data.
– Fit-type algorithms have fewer problems with internal fragmentation.
– The buddy system is fast and efficient, but has problems with internal fragmentation.
Questions?
Resources
– Tanenbaum, Andrew S., "Modern Operating Systems", 3rd edition, Pearson Education Inc., Amsterdam, Netherlands, 2008.
– Tanenbaum, Andrew S., "Moderne Betriebssysteme", 3rd edition, Pearson Education Inc., Amsterdam, Netherlands, 2009.
– Lee, Insup, "CSE 380 Computer Operating Systems", Lecture Notes, University of Pennsylvania, 2002.
– Snoeren, Alex C., "Lecture 10: Memory Management", Lecture Notes, UC San Diego, 2010.