Date posted: 26-Dec-2014
Category: Education
Uploaded by: naveed-qadri
Windows 7 Memory Management
Presented to: Mr. Ahmad Mohsin
Presented by: Muhammad Naveed Ahmad, Adnan Faisal Khan, Muhammad Amin, Muhammad Irshad
Memory
> In psychology, memory is the process by which information is encoded, stored, retrieved, and erased.
> In science and technology, the term memory identifies data storage that comes in the form of chips.
> Computer memory is a physical device used to store data or programs for use in a computer. Types of computer memory include volatile memory (RAM, cache, registers) and non-volatile memory (ROM).
Memory Management
> Memory management is the process of controlling and coordinating computer memory, assigning portions called blocks to various running programs to optimize overall system performance.
> Memory management resides in hardware, in the OS (operating system), and in programs and applications.
> MMU (Memory Management Unit): a hardware device that maps virtual addresses to physical addresses. The value in the relocation register is added to every address generated by a user process at the time the address is sent to memory.
Binding of Instructions & Data to Memory
> Compile time: if the memory location is known in advance, absolute code can be generated; the code must be recompiled if the starting location changes.
> Load time: relocatable code must be generated if the memory location is not known at compile time.
> Execution time: binding is delayed until run time if the process can be moved during its execution from one memory segment to another. This needs hardware support for address maps (registers).
Logical vs. Physical Address Space
The concept of a logical address space that is bound to a separate physical address space is central to proper memory management.
> Logical address: generated by the CPU; also referred to as a virtual address.
> Physical address: the address seen by the memory unit.
> Logical and physical addresses are the same in compile-time and load-time address-binding schemes; logical (virtual) and physical addresses differ in the execution-time address-binding scheme.
Special CPU registers: base & limit
o Base: start of the process's memory partition
o Limit: length of the process's memory partition
Address generation
o Physical address: location in actual memory
o Logical address: location from the process's point of view
o Physical address = base + logical address
o Logical address larger than limit => error
[Diagram: the OS sits at the bottom of memory (address 0); the process partition starts at Base = 0x9000 with Limit = 0x2000; memory extends to 0xFFFF.]
Logical address: 0x1204
Physical address: 0x1204 + 0x9000 = 0xA204
Swapping
> A process can be swapped temporarily out of memory to a backing store, and then brought back into memory for continued execution.
> Backing store: a fast disk large enough to accommodate copies of all memory images for all users; it must provide direct access to these memory images.
> Roll out, roll in: a swapping variant used for priority-based scheduling algorithms; a lower-priority process is swapped out so that a higher-priority process can be loaded and executed.
> The major part of swap time is transfer time; total transfer time is directly proportional to the amount of memory swapped.
> Modified versions of swapping are found on many systems (e.g., UNIX, Linux, and Windows).
Fragmentation
> External fragmentation: total memory space exists to satisfy a request, but it is not contiguous.
> Internal fragmentation: allocated memory may be slightly larger than the requested memory; this size difference is memory internal to a partition that is not being used.
> External fragmentation can be reduced by compaction:
o Shuffle memory contents to place all free memory together in one large block.
o Compaction is possible only if relocation is dynamic and done at execution time.
Paging
> Logical address space of a process can be noncontiguous; the process is allocated physical memory wherever the latter is available.
> Divide physical memory into fixed-sized blocks called frames (size is power of 2, between 512 bytes and 8192 bytes)
> Divide logical memory into blocks of same size called pages.
> Keep track of all free frames
> To run a program of size n pages, need to find n free frames and load program
> Set up a page table to translate logical to physical addresses
Implementation of Page Table
> Page table is kept in main memory
> Page-table base register (PTBR) points to the page table
> Page-table length register (PTLR) indicates the size of the page table
> In this scheme every data/instruction access requires two memory accesses: one for the page table and one for the data/instruction.
> The two memory access problem can be solved by the use of a special fast-lookup hardware cache called associative memory or translation look-aside buffers (TLBs)
Task Manager’s Main Screen
Task Manager Working Set Display
Integrated NVRAM Support
> NVRAM appears:
o Built directly into motherboards
o In solid-state drives
o In USB sticks
o As a replacement for main memory
> Windows 7 delivers tight and efficient integration of NVRAM support directly into the core memory management system.
> The same disk page can be in memory, in a ReadyBoost cache and pinned in a ReadyDrive disk all at the same time with each component unaware of the others.
> Pagefile-backed pages can be consuming space in both ReadyBoost and ReadyDrive caches even though the application (and memory management) deleted them long ago.
PFN Lock
> The PFN array is a virtually contiguous (but possibly physically sparse) data structure in which each PFN entry describes the state of one physical page of memory.
> Information includes:
o State (zeroed, free, standby, modified, modified no-write, bad, active, etc.)
o How many page table entries are mapping it
o How many I/Os are currently in progress
o The containing frame/PTE
o The PTE value to restore when the page leaves its last working set or is repurposed
o NUMA node
o etc.
> Size is critical, as is how best to manage this information.
Buffer Overflow Attacks
> One serious security threat related to memory management remains to be introduced: the buffer overflow, also known as a buffer overrun, which is defined by NIST (the National Institute of Standards and Technology) as follows:
> Buffer overrun: a condition at an interface under which more input can be placed into a buffer or data-holding area than the capacity allocated, overwriting other information.
> Attackers exploit such a condition to crash a system or to insert specially crafted code that allows them to gain control of the system.
> When done deliberately as part of an attack on a system, the transfer of control could be to code of the attacker's choosing, resulting in the ability to execute arbitrary code with the privileges of the attacked process.
> Buffer overflow attacks are one of the most prevalent and dangerous types of security attacks.
NUMA
> Non-Uniform Memory Access (NUMA) is a computer memory design used in multiprocessing, where the memory access time depends on the memory location relative to a processor. Under NUMA, a processor can access its own local memory faster than non-local memory, that is, memory local to another processor or memory shared between processors.
> Virtual memory traditionally assumes that all main memory is created equal, or at least that it is accessed equally.
> On systems with multiple CPUs, a given CPU can access some sections of main memory faster than it can access others.
> The performance difference comes from how the CPUs and memory are interconnected.
> It matters whether memory and CPUs sit on the same motherboard or on separate boards.
NUMA
> NUMA architectures logically follow in scaling from symmetric multiprocessing (SMP) architectures.
> Their commercial development came in work by Unisys, Hewlett-Packard, Group Bull, Silicon Graphics, IBM, EMC, and Compaq during the 1990s.
> Techniques developed by these companies later featured in a variety of Unix-like operating systems, and somewhat in Windows NT.
> The first commercial implementation of a NUMA-based Unix system was the Symmetrical Multi Processing XPS-100 family of servers, designed by Dan Gielan for Honeywell Information Systems Italy.
> The tremendous success of the architecture propelled HISI (Honeywell Information Systems Italy) to the #1 spot among Unix vendors in Europe.
NUMA Code Experiment
> Performed on an Intel Xeon X5570 @ 2.93 GHz, 2 CPUs, 8 cores.
> Allocate an array x of 10,000,000 bytes on core 0's NUMA node with numa_alloc_local.
> Iterate over array x 50 times, reading and writing each byte.
> Measure the elapsed time for the 50 iterations.
> Then, on each of the other cores in the server, pin a new thread and again measure the elapsed time for 50 iterations of reading and writing every byte in array x.
> Array x is large to minimize cache effects: we want to measure the speed when the CPU has to go all the way to RAM to load and store, not when caches are helping.
The Output
g++ -o numatest -pthread -lboost_thread -lnuma -O0 numatest.cpp
./numatest
numa_available() 0 <-- NUMA is available on this system
numa node 0 10101010 12884901888 <-- cores 0,2,4,6 are on NUMA node 0, which has about 12 GB
numa node 1 01010101 12874584064 <-- cores 1,3,5,7 are on NUMA node 1, which is slightly smaller than node 0
Elapsed read/write by same thread that allocated on core 0: 00:00:01.767428
Elapsed read/write by thread on core 0: 00:00:01.760554
Elapsed read/write by thread on core 1: 00:00:01.719686
Elapsed read/write by thread on core 2: 00:00:01.708830
Elapsed read/write by thread on core 3: 00:00:01.691560
Elapsed read/write by thread on core 4: 00:00:01.686912
Elapsed read/write by thread on core 5: 00:00:01.691917
Elapsed read/write by thread on core 6: 00:00:01.686509
Elapsed read/write by thread on core 7: 00:00:01.689928