Unit OS5: Memory Management
5.1. Memory Management for Multiprogramming

Copyright Notice
© 2000-2005 David A. Solomon and Mark Russinovich
These materials are part of the Windows Operating System Internals Curriculum Development Kit, developed by David A. Solomon and Mark E. Russinovich with Andreas Polze
Microsoft has licensed these materials from David Solomon Expert Seminars, Inc. for distribution to academic organizations solely for use in academic environments (and not for commercial use)

Roadmap for Section 5.1.
Memory Management Principles
Logical vs Physical Address Space
Swapping vs Segmentation
Paging

Memory Management Principles
Memory is central to the operation of a modern computer system
Memory is a large array of words/bytes
The CPU fetches instructions from memory according to the value of the program counter
Instructions may cause additional loading from and storing to specific memory addresses

Address Binding
Addresses in source programs are symbolic
The compiler binds symbolic addresses to relocatable addresses
The linkage editor/loader binds relocatable addresses to absolute addresses
Binding can be done at any step:
e.g., the compiler may generate absolute code (as for MS-DOS .COM programs)
[Diagram: source program -> compiler or assembler (compile time) -> object module -> linkage editor, combining other object modules -> load module -> loader, pulling in system libraries (load time) -> in-memory binary image, with dynamically loaded system libraries bound at execution time (run time)]

Logical vs. Physical Address Space
An address generated by the CPU is called a logical address
The memory unit deals with physical addresses
Compile-time and load-time address binding: logical and physical addresses are identical
Execution-time address binding: logical addresses are different from physical addresses
Logical addresses are also called virtual addresses
Run-time mapping from virtual to physical addresses is done by the Memory Management Unit (MMU), a hardware device
The concept of a logical address space that is bound to a different physical address space is central to memory management!

Memory-Management Unit (MMU)
Hardware device that maps virtual to physical addresses
The MMU is part of the processor
Re-programming the MMU is a privileged operation that can only be performed in privileged (kernel) mode
In the MMU scheme, the value in the relocation register is added to every address generated by a user process at the time it is sent to memory
The user program deals with logical addresses; it never sees the real physical addresses
[Diagram: dynamic relocation using a relocation register. The CPU emits logical address 642; the MMU adds the relocation register's contents (7000) and sends physical address 7642 to memory]
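The relocation-register scheme can be sketched in a few lines. This is a minimal illustration, not real MMU code; the register value 7000 and logical address 642 are the example values from the slide.

```python
# Sketch of dynamic relocation: the MMU adds the relocation register
# to every logical address a user process generates. The register is
# loaded by the OS (a privileged operation) when the process is dispatched.
RELOCATION_REGISTER = 7000  # example value from the slide

def translate(logical_address):
    """Return the physical address the memory unit actually sees."""
    return RELOCATION_REGISTER + logical_address

# The user program only ever works with logical addresses such as 642;
# the memory unit receives the corresponding physical address 7642.
print(translate(642))  # -> 7642
```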

Dynamic Loading
A routine is not loaded until it is called
All routines are kept on disk in a relocatable load format
When a routine calls another routine:
It checks whether the other routine has been loaded
If not, it calls the relocatable linking loader to load the desired routine
The loader updates the program's address tables to reflect the change
Control is passed to the newly loaded routine
Better memory-space utilization: unused routines are never loaded
No special OS support required

Dynamic Linking
Similar to dynamic loading: rather than loading being postponed until run time, linking is postponed
Dynamic libraries are not statically attached to a program's object modules (only a small stub is attached)
The stub indicates how to call (load) the appropriate library routine
All programs may use the same copy of a library's code (shared libraries, .DLLs)
Dynamic linking requires operating system support
The OS is the only instance that may locate a library in another process's address space

Memory Allocation Schemes
Main memory must accommodate the OS plus user processes
The OS needs to be protected from changes by user processes
User processes must be protected from each other
Single-partition allocation:
User processes occupy a single memory partition
Protection can be implemented by limit and relocation registers (OS in low memory, user processes in high memory, see below)
[Diagram: the CPU's logical address is compared against the limit register; if it is not below the limit, a trap (addressing error) is raised; otherwise the relocation register is added to form the physical address sent to memory. The OS resides in low memory]

Memory Allocation Schemes (contd.)
Multiple-Partition Allocation
Multiple processes should reside in memory simultaneously
Memory can be divided into multiple partitions (fixed vs. variable size)
Problem: What is the optimal partition size?
Dynamic storage allocation problem:
Multiple partitions with holes in between
Memory requests are satisfied from the set of holes
Which hole to select?
First-fit: allocate the first hole that is big enough
Best-fit: allocate the smallest hole that is big enough
Worst-fit: allocate the largest hole (produces the largest leftover hole)
First-fit and best-fit are better than worst-fit (time- and storage-wise)
First-fit is generally faster than best-fit
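The three hole-selection policies can be sketched directly. The hole list below is illustrative, not from the slides; each hole is a (start, size) pair and the functions return the chosen hole's start address.

```python
# Sketch of the three hole-selection policies for dynamic storage allocation.

def first_fit(holes, request):
    # Allocate the first hole that is big enough.
    for start, size in holes:
        if size >= request:
            return start
    return None  # no hole fits

def best_fit(holes, request):
    # Allocate the smallest hole that is big enough (smallest leftover).
    fitting = [h for h in holes if h[1] >= request]
    return min(fitting, key=lambda h: h[1])[0] if fitting else None

def worst_fit(holes, request):
    # Allocate the largest hole (produces the largest leftover hole).
    fitting = [h for h in holes if h[1] >= request]
    return max(fitting, key=lambda h: h[1])[0] if fitting else None

holes = [(0, 100), (300, 50), (600, 200)]  # illustrative (start, size) pairs
print(first_fit(holes, 40))   # -> 0: the 100-unit hole comes first
print(best_fit(holes, 40))    # -> 300: the 50-unit hole leaves the least over
print(worst_fit(holes, 40))   # -> 600: the 200-unit hole is the largest
```

Note that first-fit stops scanning as soon as a hole fits, while best-fit and worst-fit must examine every hole, which is one reason first-fit is generally faster.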

Overlays
Size of program and data may exceed the size of memory
Concept:
Separate the program into modules
Load modules alternately
An overlay driver locates modules on disk
Overlay modules are kept as absolute memory images
Compiler support required
[Diagram: multi-pass compiler example. The symbol table, common routines, and overlay driver stay resident; Pass 1 and Pass 2 are overlaid into the same memory area]

Swapping
In a multiprogramming environment, processes can temporarily be swapped out of memory to a backing store in order to allow for execution of other processes
On the basis of physical addresses: processes will be swapped in to the same memory space that they occupied previously
On the basis of logical addresses: processes can be swapped in at arbitrary physical addresses
What current OSes call swapping is rather paging out whole processes
[Diagram: process P1 is swapped out of the user space of main memory to the backing store; process P2 is swapped in]

Segmentation
What is the programmer's view of memory?
A collection of variable-sized segments (text, data, stack, subroutines, ...)
No necessary ordering among segments
Logical address: <segment-number, offset>
Hardware: a segment table containing the base address and limit for each segment
[Diagram: the CPU emits <s, d>; segment-table entry s supplies limit and base; if d < limit, the physical address is base + d; otherwise a trap (addressing error) is raised]
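Segment-table translation can be sketched as follows. The table contents are illustrative; each entry is a (base, limit) pair and an offset outside the segment traps, as in the hardware check above.

```python
# Sketch of segmentation hardware: logical address <s, d> is translated
# by looking up segment s and checking the offset against the limit.
segment_table = [(1400, 1000), (6300, 400), (4300, 1100)]  # (base, limit)

def translate(s, d):
    base, limit = segment_table[s]
    if d >= limit:
        # Hardware would raise a trap: addressing error.
        raise MemoryError("trap: offset %d outside segment %d" % (d, s))
    return base + d

print(translate(2, 53))  # -> 4353 (base 4300 + offset 53)
```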

Fragmentation
External fragmentation – total memory space exists to satisfy a request, but it is not contiguous
Internal fragmentation – allocated memory may be slightly larger than the requested memory; this size difference is memory internal to a partition, but not being used
Reduce external fragmentation by compaction:
Shuffle memory contents to place all free memory together in one large block
Compaction is possible only if relocation is dynamic, and is done at execution time
I/O problem:
Latch the job in memory while it is involved in I/O
Do I/O only into OS buffers

Paging
Dynamic storage allocation algorithms for varying-sized chunks of memory may lead to fragmentation
Solutions:
Compaction – dynamic relocation of processes
Noncontiguous allocation of process memory in equally sized pages (this avoids the memory-fitting problem)
Paging breaks physical memory into fixed-sized blocks (called frames)
Logical memory is broken into pages (of the same size)

Paging: Basic Method
When a process is executed, its pages are loaded into any available frames from the backing store (disk)
Hardware support for paging consists of a page table
Logical addresses consist of a page number and an offset
[Diagram: the CPU emits logical address <p, d> (page number, offset); the page table maps p to frame f; the physical address is <f, d>]
Page frames are typically 2-4 KB
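The basic translation can be sketched with integer arithmetic. This assumes a 4 KB page size (one of the typical values just mentioned) and reuses the page table [4, 1, 6, 3] from the paging example that follows.

```python
# Sketch of basic paging: split the logical address into page number p
# and offset d, look up frame f in the page table, and recombine.
PAGE_SIZE = 4096  # 4 KB pages, i.e. a 12-bit offset

def translate(logical_address, page_table):
    p = logical_address // PAGE_SIZE   # page number (high-order bits)
    d = logical_address % PAGE_SIZE    # offset (low-order bits)
    f = page_table[p]                  # frame number from the page table
    return f * PAGE_SIZE + d

# Logical address 10 lies in page 0 at offset 10; page 0 maps to frame 4,
# so the physical address is 4 * 4096 + 10 = 16394.
print(translate(10, [4, 1, 6, 3]))  # -> 16394
```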

Paging Example
[Diagram: logical memory holds pages 0-3; the page table maps them to frames 4, 1, 6, 3 of physical memory (frame numbers 0-7)]

Free Frames
[Diagram: before allocation, the free-frame list is 7, 8, 10, 11, 13, 16. At process creation, the new process's pages 0-3 are placed into frames 11, 8, 16, 13 (as shown) and recorded in its page table; after allocation, the free-frame list is 7, 10]

Paging: Hardware Support
Every memory access requires access to the page table
The page table should be implemented in hardware
Page tables exist on a per-user-process basis
Small page tables can be just a set of registers
Problem: size of physical memory, number of processes
Page tables should be kept in memory
Only the base address of the page table is kept in a special register
Problem: speed of memory accesses
Translation look-aside buffers (TLBs)
Associative registers store recently used page table entries
TLBs are fast, expensive, small: 8..2048 entries
The TLB must be flushed on process context switches

Associative Memory
Associative memory – parallel search
Address translation (A', A''):
If A' is in an associative register, get the frame # out
Otherwise get the frame # from the page table in memory
[Table: page # mapped to frame #]

Paging Hardware With TLB
[Diagram: the CPU emits logical address <p, d>; the page number p is presented to the TLB (page #/frame # pairs). On a TLB hit, frame f is produced directly; on a TLB miss, the page table in memory supplies f. The physical address is <f, d>]
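The hit/miss path can be sketched with a dictionary standing in for the associative registers. This ignores the bounded size and replacement policy of a real TLB; the page table is again the [4, 1, 6, 3] example.

```python
# Sketch of a TLB in front of the page table: recently used
# (page #, frame #) pairs are checked first; only on a miss is the
# in-memory page table consulted, after which the TLB caches the entry.
page_table = [4, 1, 6, 3]   # from the paging example
tlb = {}                    # page # -> frame #, standing in for associative registers

def lookup(page):
    """Return (frame, hit) for a page reference."""
    if page in tlb:
        return tlb[page], True    # TLB hit: no page-table access needed
    frame = page_table[page]      # TLB miss: extra memory access
    tlb[page] = frame             # cache the translation for next time
    return frame, False

print(lookup(2))  # -> (6, False): first reference misses
print(lookup(2))  # -> (6, True): repeat reference hits
```

Because the dictionary is keyed by page number alone, this sketch also shows why the TLB must be flushed on a context switch: another process's page 2 would otherwise hit on a stale translation.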

Effective Access Time with TLB
Associative lookup in TLB = ε time units
Assume the memory cycle time is 1 microsecond
Hit ratio α – the percentage of times that a page number is found in the associative registers; the ratio is related to the number of associative registers
Effective Access Time (EAT):
EAT = (1 + ε)α + (2 + ε)(1 – α) = 2 + ε – α
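The formula is easy to evaluate for concrete numbers. The values ε = 0.2 and α = 0.8 below are illustrative, not from the slides; times are in microseconds with the memory cycle normalized to 1.

```python
# Effective access time, following the slide's formula:
#   EAT = (1 + eps) * alpha + (2 + eps) * (1 - alpha) = 2 + eps - alpha
def eat(eps, alpha):
    hit = (1 + eps) * alpha          # TLB lookup + one memory access
    miss = (2 + eps) * (1 - alpha)   # TLB lookup + page-table access + memory access
    return hit + miss

# eps = 0.2, alpha = 0.8: EAT is about 1.4 microseconds (= 2 + 0.2 - 0.8),
# a 40% slowdown over the bare 1-microsecond memory cycle.
print(eat(0.2, 0.8))
```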

Memory Protection
Memory protection is implemented by associating control bits with each frame
Isolation of processes in main memory
A valid-invalid bit is attached to each entry in the page table:
"valid" indicates that the associated page is in the process' logical address space, and is thus a legal page
"invalid" indicates that the page is not in the process' logical address space

Valid (v) or Invalid (i) Bit in a Page Table
[Diagram: logical memory pages 0-3 map through page-table entries 0-3 (frames 4, 1, 6, 3, each marked v); page-table entries 4 and 5 are marked i]
Invalid pages may be paged out

Page Table Structure
Hierarchical Paging
Hashed Page Tables
Inverted Page Tables

Hierarchical Page Tables
Break up the logical address space into multiple page tables
A simple technique is a two-level page table
Used with 32-bit CPUs
[Address layout: page number p1 (10 bits) | p2 (10 bits) | page offset d (12 bits)]

Two-Level Paging Example
A logical address (on a 32-bit machine with 4K page size) is divided into:
a page number consisting of 20 bits
a page offset consisting of 12 bits
Since the page table is paged, the page number is further divided into:
a 10-bit page number
a 10-bit page offset
Thus, a logical address is p1 (10 bits) | p2 (10 bits) | d (12 bits), where p1 is an index into the outer page table, and p2 is the displacement within the page of the outer page table
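The 10/10/12 split is pure bit arithmetic. The example address 0x00403004 below is illustrative.

```python
# Sketch of the two-level split of a 32-bit logical address into
# outer index p1, inner index p2, and page offset d.
def split(addr):
    d = addr & 0xFFF             # low 12 bits: page offset
    p2 = (addr >> 12) & 0x3FF    # next 10 bits: index into the inner page table
    p1 = (addr >> 22) & 0x3FF    # top 10 bits: index into the outer page table
    return p1, p2, d

# 0x00403004 decomposes into outer entry 1, inner entry 3, offset 4.
print(split(0x00403004))  # -> (1, 3, 4)
```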

Two-Level Page-Table Scheme
[Diagram: the outer page table (page directory) points to individual page tables, which in turn point to pages in memory]

Address-Translation Scheme
Address-translation scheme for a two-level 32-bit paging architecture
[Diagram: logical address p1 (10 bits) | p2 (10 bits) | d (12 bits); p1 indexes the page directory, p2 indexes the selected page table, and d is the offset into the resulting page of main memory]

Hashed Page Tables
Common in address spaces > 32 bits
IA64 supports hashed page tables
The virtual page number is hashed into a page table
This page table contains a chain of elements hashing to the same location
Virtual page numbers are compared along this chain while searching for a match; if a match is found, the corresponding physical frame is extracted
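The structure can be sketched as a small chained hash table. The bucket count, hash function (modulo), and contents are all illustrative.

```python
# Sketch of a hashed page table: the virtual page number (vpn) hashes to
# a bucket; each bucket chains (vpn, frame) pairs that collide there.
NBUCKETS = 8
table = [[] for _ in range(NBUCKETS)]

def insert(vpn, frame):
    table[vpn % NBUCKETS].append((vpn, frame))

def lookup(vpn):
    # Walk the chain at the hashed slot, comparing virtual page numbers.
    for v, f in table[vpn % NBUCKETS]:
        if v == vpn:
            return f       # match found: extract the physical frame
    return None            # not mapped (a real MMU would fault)

insert(3, 42)
insert(11, 99)             # 11 % 8 == 3, so it chains behind vpn 3
print(lookup(11))          # -> 99 (found second in the chain)
print(lookup(19))          # -> None (same bucket, no matching vpn)
```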

Hashed Page Table
[Diagram: the page number p of the logical address <p, d> is run through a hash function; the chain at the resulting slot is searched for an entry whose page number matches p; the matching entry supplies frame f, and the physical address is <f, d>]

Inverted Page Table
One entry for each real page of memory
An entry consists of the virtual address of the page stored in that real memory location, with information about the process that owns that page
Decreases the memory needed to store each page table, but increases the time needed to search the table when a page reference occurs
Use a hash table to limit the search to one — or at most a few — page-table entries
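The key inversion is that the table is indexed by frame, not by page: the position of the matching entry is the frame number. A minimal sketch, with illustrative contents and a linear search (a real implementation would use the hash table just described):

```python
# Sketch of an inverted page table: one entry per physical frame,
# each holding (pid, virtual page number). The index of the matching
# entry IS the frame number.
inverted = [(1, 0), (2, 0), (1, 3), (2, 1)]  # index = frame number

def translate(pid, vpn):
    for frame, entry in enumerate(inverted):
        if entry == (pid, vpn):
            return frame   # position in the table gives the frame
    return None            # no mapping: page fault

print(translate(1, 3))     # -> 2: process 1's page 3 is in frame 2
print(translate(2, 3))     # -> None: process 2 has no page 3
```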

Inverted Page Table Architecture
[Diagram: the logical address is <pid, p, d>; the inverted page table is searched for the entry matching (pid, p); the index f of that entry is the frame number, and the physical address is <f, d>]

Shared Pages
Shared code
One copy of read-only (reentrant) code shared among processes (e.g., text editors, compilers, window systems)
Shared code must appear in the same location in the logical address space of all processes
Private code and data
Each process keeps a separate copy of the code and data
The pages for the private code and data can appear anywhere in the logical address space

Shared Pages Example
[Diagram: processes 1 and 2 each map the shared code pages cpp, cc1, cc2 to the same physical frames (1, 4, 11) via their page tables; the private data pages data1 and data2 map to frames 7 and 8 respectively (frame numbers 0-11)]

Segmentation with Paging
Paged segmentation on the GE 645 (Multics)
The innovative MULTICS operating system introduced:
Logical addresses: 18-bit segment number, 16-bit offset
A (relatively) small number of 64K segments
To eliminate fragmentation, segments are paged
A separate page table exists for each segment
[Diagram: the segment-table base register locates the segment table; entry s of the logical address <s, d> supplies the segment length and page-table base; if d >= segment length, a trap (addressing error) is raised; otherwise d is split into <p, d'>, the segment's page table maps p to frame f, and the physical address is <f, d'>]

Intel 386 Address Translation
The Intel 386 uses segmentation with paging for memory management, with a two-level paging scheme.
[Diagram: the Intel logical address is <selector, offset>; the descriptor-table entry chosen by the selector supplies base and limit, and base + offset forms the 32-bit linear address, split 10 | 10 | 12 into page-directory index (bits 31-22), page-table index (bits 21-12), and offset (bits 11-0). Register cr3 points to the per-process page directory (1024 4-byte entries); a PDE for a 4 KB page points to a page table (1024 entries) whose PTE locates a 4 KB page frame, while a PDE for a 4 MB page maps a 4 MB page frame directly using a 22-bit offset]
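The two ways of splitting the 32-bit linear address can be sketched with bit masks. This only shows the index arithmetic for the 4 KB and 4 MB cases; real PDE/PTE contents and the segmentation step are omitted, and the example address is illustrative.

```python
# Sketch of the 386/486-style linear-address split.
def split_4k(linear):
    # 10-bit page-directory index, 10-bit page-table index, 12-bit offset.
    return (linear >> 22) & 0x3FF, (linear >> 12) & 0x3FF, linear & 0xFFF

def split_4m(linear):
    # A large-page PDE maps 4 MB directly: directory index + 22-bit offset.
    return (linear >> 22) & 0x3FF, linear & 0x3FFFFF

print(split_4k(0x00403004))  # -> (1, 3, 4): directory 1, table 3, offset 4
print(split_4m(0x00403004))  # -> (1, 12292): directory 1, offset 0x3004
```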

Summary
In a multiprogrammed OS, every memory address generated by the CPU must be checked for legality and possibly mapped to a physical address
Checking cannot be implemented (efficiently) in software
Hardware support is essential:
A pair of registers is sufficient for single/multiple partition schemes
Paging/segmentation need mapping tables to define address maps
Paging and segmentation can be fast
Tables have to be implemented in fast registers (problem: size)
A set of associative registers (TLB) may reduce the performance degradation if tables are kept in memory