Unit OS5: Memory Management
5.1. Memory Management for Multiprogramming

Copyright Notice
© 2000-2005 David A. Solomon and Mark Russinovich
These materials are part of the Windows Operating System Internals Curriculum Development Kit, developed by David A. Solomon and Mark E. Russinovich with Andreas Polze
Microsoft has licensed these materials from David Solomon Expert Seminars, Inc. for distribution to academic organizations solely for use in academic environments (and not for commercial use)

Roadmap for Section 5.1.
Memory Management Principles
Logical vs Physical Address Space
Swapping vs Segmentation
Paging

Memory Management Principles
Memory is central to the operation of a modern computer system
Memory is a large array of words/bytes
The CPU fetches instructions from memory according to the value of the program counter
Instructions may cause additional loading from and storing to specific memory addresses

Address Binding
Addresses in source programs are symbolic
The compiler binds symbolic addresses to relocatable addresses
The linkage editor/loader binds relocatable addresses to absolute addresses
Binding can be done at any step:
e.g., the compiler may generate absolute code (as for MS-DOS .COM programs)
[Diagram: source program -> compiler or assembler (compile time) -> object module -> linkage editor, combining other object modules -> load module -> loader, pulling in system libraries (load time) -> in-memory binary image, with dynamically loaded system libraries bound at execution time (run time)]

Logical vs. Physical Address Space
An address generated by the CPU is called a logical address
The memory unit deals with physical addresses
Compile-time and load-time address binding: logical and physical addresses are identical
Execution-time address binding: logical addresses are different from physical addresses
Logical addresses are also called virtual addresses
Run-time mapping from virtual to physical addresses is done by the Memory Management Unit (MMU), a hardware device
The concept of a logical address space that is bound to a different physical address space is central to memory management!

Memory-Management Unit (MMU)
Hardware device that maps virtual to physical addresses
The MMU is part of the processor
Re-programming the MMU is a privileged operation that can only be performed in privileged (kernel) mode
In the MMU scheme, the value in the relocation register is added to every address generated by a user process at the time it is sent to memory
The user program deals with logical addresses; it never sees the real physical addresses
[Diagram: dynamic relocation using a relocation register. The CPU emits logical address 642; the MMU adds the relocation register's contents (7000) and sends physical address 7642 to memory]
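The relocation-register scheme can be sketched in a few lines. This is a minimal illustration, not real MMU code; the register value 7000 and logical address 642 are the example values from the slide.

```python
# Sketch of dynamic relocation: the MMU adds the relocation register
# to every logical address a user process generates. The register is
# loaded by the OS (a privileged operation) when the process is dispatched.
RELOCATION_REGISTER = 7000  # example value from the slide

def translate(logical_address):
    """Return the physical address the memory unit actually sees."""
    return RELOCATION_REGISTER + logical_address

# The user program only ever works with logical addresses such as 642;
# the memory unit receives the corresponding physical address 7642.
print(translate(642))  # -> 7642
```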

Dynamic Loading
A routine is not loaded until it is called
All routines are kept on disk in a relocatable load format
When a routine calls another routine:
It checks whether the other routine has been loaded
If not, it calls the relocatable linking loader to load the desired routine
The loader updates the program's address tables to reflect the change
Control is passed to the newly loaded routine
Better memory-space utilization: unused routines are never loaded
No special OS support required

Dynamic Linking
Similar to dynamic loading: rather than loading being postponed until run time, linking is postponed
Dynamic libraries are not statically attached to a program's object modules (only a small stub is attached)
The stub indicates how to call (load) the appropriate library routine
All programs may use the same copy of a library's code (shared libraries, .DLLs)
Dynamic linking requires operating system support
The OS is the only instance that may locate a library in another process's address space

Memory Allocation Schemes
Main memory must accommodate the OS plus user processes
The OS needs to be protected from changes by user processes
User processes must be protected from each other
Single-partition allocation:
User processes occupy a single memory partition
Protection can be implemented by limit and relocation registers (OS in low memory, user processes in high memory, see below)
[Diagram: the CPU's logical address is compared against the limit register; if it is not below the limit, a trap (addressing error) is raised; otherwise the relocation register is added to form the physical address sent to memory. The OS resides in low memory]

Memory Allocation Schemes (contd.)
Multiple-Partition Allocation
Multiple processes should reside in memory simultaneously
Memory can be divided into multiple partitions (fixed vs. variable size)
Problem: What is the optimal partition size?
Dynamic storage allocation problem:
Multiple partitions with holes in between
Memory requests are satisfied from the set of holes
Which hole to select?
First-fit: allocate the first hole that is big enough
Best-fit: allocate the smallest hole that is big enough
Worst-fit: allocate the largest hole (produces the largest leftover hole)
First-fit and best-fit are better than worst-fit (time- and storage-wise)
First-fit is generally faster than best-fit
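The three hole-selection policies can be sketched directly. The hole list below is illustrative, not from the slides; each hole is a (start, size) pair and the functions return the chosen hole's start address.

```python
# Sketch of the three hole-selection policies for dynamic storage allocation.

def first_fit(holes, request):
    # Allocate the first hole that is big enough.
    for start, size in holes:
        if size >= request:
            return start
    return None  # no hole fits

def best_fit(holes, request):
    # Allocate the smallest hole that is big enough (smallest leftover).
    fitting = [h for h in holes if h[1] >= request]
    return min(fitting, key=lambda h: h[1])[0] if fitting else None

def worst_fit(holes, request):
    # Allocate the largest hole (produces the largest leftover hole).
    fitting = [h for h in holes if h[1] >= request]
    return max(fitting, key=lambda h: h[1])[0] if fitting else None

holes = [(0, 100), (300, 50), (600, 200)]  # illustrative (start, size) pairs
print(first_fit(holes, 40))   # -> 0: the 100-unit hole comes first
print(best_fit(holes, 40))    # -> 300: the 50-unit hole leaves the least over
print(worst_fit(holes, 40))   # -> 600: the 200-unit hole is the largest
```

Note that first-fit stops scanning as soon as a hole fits, while best-fit and worst-fit must examine every hole, which is one reason first-fit is generally faster.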

Overlays
Size of program and data may exceed the size of memory
Concept:
Separate the program into modules
Load modules alternately
An overlay driver locates modules on disk
Overlay modules are kept as absolute memory images
Compiler support required
[Diagram: multi-pass compiler example. The symbol table, common routines, and overlay driver stay resident; Pass 1 and Pass 2 are overlaid into the same memory area]

Swapping
In a multiprogramming environment, processes can temporarily be swapped out of memory to a backing store in order to allow for execution of other processes
On the basis of physical addresses: processes will be swapped in to the same memory space that they occupied previously
On the basis of logical addresses: processes can be swapped in at arbitrary physical addresses
What current OSes call swapping is rather paging out whole processes
[Diagram: process P1 is swapped out of the user space of main memory to the backing store; process P2 is swapped in]

Segmentation
What is the programmer's view of memory?
A collection of variable-sized segments (text, data, stack, subroutines, ...)
No necessary ordering among segments
Logical address: <segment-number, offset>
Hardware: a segment table containing the base address and limit for each segment
[Diagram: the CPU emits <s, d>; segment-table entry s supplies limit and base; if d < limit, the physical address is base + d; otherwise a trap (addressing error) is raised]
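Segment-table translation can be sketched as follows. The table contents are illustrative; each entry is a (base, limit) pair and an offset outside the segment traps, as in the hardware check above.

```python
# Sketch of segmentation hardware: logical address <s, d> is translated
# by looking up segment s and checking the offset against the limit.
segment_table = [(1400, 1000), (6300, 400), (4300, 1100)]  # (base, limit)

def translate(s, d):
    base, limit = segment_table[s]
    if d >= limit:
        # Hardware would raise a trap: addressing error.
        raise MemoryError("trap: offset %d outside segment %d" % (d, s))
    return base + d

print(translate(2, 53))  # -> 4353 (base 4300 + offset 53)
```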

Fragmentation
External fragmentation – total memory space exists to satisfy a request, but it is not contiguous
Internal fragmentation – allocated memory may be slightly larger than the requested memory; this size difference is memory internal to a partition, but not being used
Reduce external fragmentation by compaction:
Shuffle memory contents to place all free memory together in one large block
Compaction is possible only if relocation is dynamic, and is done at execution time
I/O problem:
Latch the job in memory while it is involved in I/O
Do I/O only into OS buffers

Paging
Dynamic storage allocation algorithms for varying-sized chunks of memory may lead to fragmentation
Solutions:
Compaction – dynamic relocation of processes
Noncontiguous allocation of process memory in equally sized pages (this avoids the memory-fitting problem)
Paging breaks physical memory into fixed-sized blocks (called frames)
Logical memory is broken into pages (of the same size)

Paging: Basic Method
When a process is executed, its pages are loaded into any available frames from the backing store (disk)
Hardware support for paging consists of a page table
Logical addresses consist of a page number and an offset
[Diagram: the CPU emits logical address <p, d> (page number, offset); the page table maps p to frame f; the physical address is <f, d>]
Page frames are typically 2-4 KB
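The basic translation can be sketched with integer arithmetic. This assumes a 4 KB page size (one of the typical values just mentioned) and reuses the page table [4, 1, 6, 3] from the paging example that follows.

```python
# Sketch of basic paging: split the logical address into page number p
# and offset d, look up frame f in the page table, and recombine.
PAGE_SIZE = 4096  # 4 KB pages, i.e. a 12-bit offset

def translate(logical_address, page_table):
    p = logical_address // PAGE_SIZE   # page number (high-order bits)
    d = logical_address % PAGE_SIZE    # offset (low-order bits)
    f = page_table[p]                  # frame number from the page table
    return f * PAGE_SIZE + d

# Logical address 10 lies in page 0 at offset 10; page 0 maps to frame 4,
# so the physical address is 4 * 4096 + 10 = 16394.
print(translate(10, [4, 1, 6, 3]))  # -> 16394
```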

Paging Example
[Diagram: logical memory holds pages 0-3; the page table maps them to frames 4, 1, 6, 3 of physical memory (frame numbers 0-7)]

Free Frames
[Diagram: before allocation, the free-frame list is 7, 8, 10, 11, 13, 16. At process creation, the new process's pages 0-3 are placed into frames 11, 8, 16, 13 (as shown) and recorded in its page table; after allocation, the free-frame list is 7, 10]

Paging: Hardware Support
Every memory access requires access to the page table
The page table should be implemented in hardware
Page tables exist on a per-user-process basis
Small page tables can be just a set of registers
Problem: size of physical memory, number of processes
Page tables should be kept in memory
Only the base address of the page table is kept in a special register
Problem: speed of memory accesses
Translation look-aside buffers (TLBs)
Associative registers store recently used page table entries
TLBs are fast, expensive, small: 8..2048 entries
The TLB must be flushed on process context switches

Associative Memory
Associative memory – parallel search
Address translation (A', A''):
If A' is in an associative register, get the frame # out
Otherwise get the frame # from the page table in memory
[Table: page # mapped to frame #]

Paging Hardware With TLB
[Diagram: the CPU emits logical address <p, d>; the page number p is presented to the TLB (page #/frame # pairs). On a TLB hit, frame f is produced directly; on a TLB miss, the page table in memory supplies f. The physical address is <f, d>]
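The hit/miss path can be sketched with a dictionary standing in for the associative registers. This ignores the bounded size and replacement policy of a real TLB; the page table is again the [4, 1, 6, 3] example.

```python
# Sketch of a TLB in front of the page table: recently used
# (page #, frame #) pairs are checked first; only on a miss is the
# in-memory page table consulted, after which the TLB caches the entry.
page_table = [4, 1, 6, 3]   # from the paging example
tlb = {}                    # page # -> frame #, standing in for associative registers

def lookup(page):
    """Return (frame, hit) for a page reference."""
    if page in tlb:
        return tlb[page], True    # TLB hit: no page-table access needed
    frame = page_table[page]      # TLB miss: extra memory access
    tlb[page] = frame             # cache the translation for next time
    return frame, False

print(lookup(2))  # -> (6, False): first reference misses
print(lookup(2))  # -> (6, True): repeat reference hits
```

Because the dictionary is keyed by page number alone, this sketch also shows why the TLB must be flushed on a context switch: another process's page 2 would otherwise hit on a stale translation.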

Effective Access Time with TLB
Associative lookup in TLB = ε time units
Assume the memory cycle time is 1 microsecond
Hit ratio α – the percentage of times that a page number is found in the associative registers; the ratio is related to the number of associative registers
Effective Access Time (EAT):
EAT = (1 + ε)α + (2 + ε)(1 – α) = 2 + ε – α
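The formula is easy to evaluate for concrete numbers. The values ε = 0.2 and α = 0.8 below are illustrative, not from the slides; times are in microseconds with the memory cycle normalized to 1.

```python
# Effective access time, following the slide's formula:
#   EAT = (1 + eps) * alpha + (2 + eps) * (1 - alpha) = 2 + eps - alpha
def eat(eps, alpha):
    hit = (1 + eps) * alpha          # TLB lookup + one memory access
    miss = (2 + eps) * (1 - alpha)   # TLB lookup + page-table access + memory access
    return hit + miss

# eps = 0.2, alpha = 0.8: EAT is about 1.4 microseconds (= 2 + 0.2 - 0.8),
# a 40% slowdown over the bare 1-microsecond memory cycle.
print(eat(0.2, 0.8))
```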

Memory Protection
Memory protection is implemented by associating control bits with each frame
Isolation of processes in main memory
A valid-invalid bit is attached to each entry in the page table:
"valid" indicates that the associated page is in the process' logical address space, and is thus a legal page
"invalid" indicates that the page is not in the process' logical address space

Valid (v) or Invalid (i) Bit in a Page Table
[Diagram: logical memory pages 0-3 map through page-table entries 0-3 (frames 4, 1, 6, 3, each marked v); page-table entries 4 and 5 are marked i]
Invalid pages may be paged out

Page Table Structure
Hierarchical Paging
Hashed Page Tables
Inverted Page Tables

Hierarchical Page Tables
Break up the logical address space into multiple page tables
A simple technique is a two-level page table
Used with 32-bit CPUs
[Address layout: page number p1 (10 bits) | p2 (10 bits) | page offset d (12 bits)]

Two-Level Paging Example
A logical address (on a 32-bit machine with 4K page size) is divided into:
a page number consisting of 20 bits
a page offset consisting of 12 bits
Since the page table is paged, the page number is further divided into:
a 10-bit page number
a 10-bit page offset
Thus, a logical address is p1 (10 bits) | p2 (10 bits) | d (12 bits), where p1 is an index into the outer page table, and p2 is the displacement within the page of the outer page table
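The 10/10/12 split is pure bit arithmetic. The example address 0x00403004 below is illustrative.

```python
# Sketch of the two-level split of a 32-bit logical address into
# outer index p1, inner index p2, and page offset d.
def split(addr):
    d = addr & 0xFFF             # low 12 bits: page offset
    p2 = (addr >> 12) & 0x3FF    # next 10 bits: index into the inner page table
    p1 = (addr >> 22) & 0x3FF    # top 10 bits: index into the outer page table
    return p1, p2, d

# 0x00403004 decomposes into outer entry 1, inner entry 3, offset 4.
print(split(0x00403004))  # -> (1, 3, 4)
```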

Two-Level Page-Table Scheme
[Diagram: the outer page table (page directory) points to individual page tables, which in turn point to pages in memory]

Address-Translation Scheme
Address-translation scheme for a two-level 32-bit paging architecture
[Diagram: logical address p1 (10 bits) | p2 (10 bits) | d (12 bits); p1 indexes the page directory, p2 indexes the selected page table, and d is the offset into the resulting page of main memory]

Hashed Page Tables
Common in address spaces > 32 bits
IA64 supports hashed page tables
The virtual page number is hashed into a page table
This page table contains a chain of elements hashing to the same location
Virtual page numbers are compared along this chain while searching for a match; if a match is found, the corresponding physical frame is extracted
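The structure can be sketched as a small chained hash table. The bucket count, hash function (modulo), and contents are all illustrative.

```python
# Sketch of a hashed page table: the virtual page number (vpn) hashes to
# a bucket; each bucket chains (vpn, frame) pairs that collide there.
NBUCKETS = 8
table = [[] for _ in range(NBUCKETS)]

def insert(vpn, frame):
    table[vpn % NBUCKETS].append((vpn, frame))

def lookup(vpn):
    # Walk the chain at the hashed slot, comparing virtual page numbers.
    for v, f in table[vpn % NBUCKETS]:
        if v == vpn:
            return f       # match found: extract the physical frame
    return None            # not mapped (a real MMU would fault)

insert(3, 42)
insert(11, 99)             # 11 % 8 == 3, so it chains behind vpn 3
print(lookup(11))          # -> 99 (found second in the chain)
print(lookup(19))          # -> None (same bucket, no matching vpn)
```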

Hashed Page Table
[Diagram: the page number p of the logical address <p, d> is run through a hash function; the chain at the resulting slot is searched for an entry whose page number matches p; the matching entry supplies frame f, and the physical address is <f, d>]

Inverted Page Table
One entry for each real page of memory
An entry consists of the virtual address of the page stored in that real memory location, with information about the process that owns that page
Decreases the memory needed to store each page table, but increases the time needed to search the table when a page reference occurs
Use a hash table to limit the search to one — or at most a few — page-table entries
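The key inversion is that the table is indexed by frame, not by page: the position of the matching entry is the frame number. A minimal sketch, with illustrative contents and a linear search (a real implementation would use the hash table just described):

```python
# Sketch of an inverted page table: one entry per physical frame,
# each holding (pid, virtual page number). The index of the matching
# entry IS the frame number.
inverted = [(1, 0), (2, 0), (1, 3), (2, 1)]  # index = frame number

def translate(pid, vpn):
    for frame, entry in enumerate(inverted):
        if entry == (pid, vpn):
            return frame   # position in the table gives the frame
    return None            # no mapping: page fault

print(translate(1, 3))     # -> 2: process 1's page 3 is in frame 2
print(translate(2, 3))     # -> None: process 2 has no page 3
```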

Inverted Page Table Architecture
[Diagram: the logical address is <pid, p, d>; the inverted page table is searched for the entry matching (pid, p); the index f of that entry is the frame number, and the physical address is <f, d>]

Shared Pages
Shared code
One copy of read-only (reentrant) code shared among processes (e.g., text editors, compilers, window systems)
Shared code must appear in the same location in the logical address space of all processes
Private code and data
Each process keeps a separate copy of the code and data
The pages for the private code and data can appear anywhere in the logical address space

Shared Pages Example
[Diagram: processes 1 and 2 each map the shared code pages cpp, cc1, cc2 to the same physical frames (1, 4, 11) via their page tables; the private data pages data1 and data2 map to frames 7 and 8 respectively (frame numbers 0-11)]

Segmentation with Paging
Paged segmentation on the GE 645 (Multics)
The innovative MULTICS operating system introduced:
Logical addresses: 18-bit segment number, 16-bit offset
A (relatively) small number of 64K segments
To eliminate fragmentation, segments are paged
A separate page table exists for each segment
[Diagram: the segment-table base register locates the segment table; entry s of the logical address <s, d> supplies the segment length and page-table base; if d >= segment length, a trap (addressing error) is raised; otherwise d is split into <p, d'>, the segment's page table maps p to frame f, and the physical address is <f, d'>]

Intel 386 Address Translation
The Intel 386 uses segmentation with paging for memory management, with a two-level paging scheme.
[Diagram: the Intel logical address is <selector, offset>; the descriptor-table entry chosen by the selector supplies base and limit, and base + offset forms the 32-bit linear address, split 10 | 10 | 12 into page-directory index (bits 31-22), page-table index (bits 21-12), and offset (bits 11-0). Register cr3 points to the per-process page directory (1024 4-byte entries); a PDE for a 4 KB page points to a page table (1024 entries) whose PTE locates a 4 KB page frame, while a PDE for a 4 MB page maps a 4 MB page frame directly using a 22-bit offset]
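The two ways of splitting the 32-bit linear address can be sketched with bit masks. This only shows the index arithmetic for the 4 KB and 4 MB cases; real PDE/PTE contents and the segmentation step are omitted, and the example address is illustrative.

```python
# Sketch of the 386/486-style linear-address split.
def split_4k(linear):
    # 10-bit page-directory index, 10-bit page-table index, 12-bit offset.
    return (linear >> 22) & 0x3FF, (linear >> 12) & 0x3FF, linear & 0xFFF

def split_4m(linear):
    # A large-page PDE maps 4 MB directly: directory index + 22-bit offset.
    return (linear >> 22) & 0x3FF, linear & 0x3FFFFF

print(split_4k(0x00403004))  # -> (1, 3, 4): directory 1, table 3, offset 4
print(split_4m(0x00403004))  # -> (1, 12292): directory 1, offset 0x3004
```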

Summary
In a multiprogrammed OS, every memory address generated by the CPU must be checked for legality and possibly mapped to a physical address
Checking cannot be implemented (efficiently) in software
Hardware support is essential:
A pair of registers is sufficient for single/multiple partition schemes
Paging/segmentation need mapping tables to define address maps
Paging and segmentation can be fast
Tables have to be implemented in fast registers (problem: size)
A set of associative registers (TLB) may reduce the performance degradation if tables are kept in memory