Unit OS B: Comparing the Linux and Windows Kernels

Academic year: 2022

(1)

Unit OS B: Comparing the Linux and Windows Kernels

(2)

Copyright Notice

© 2000-2005 David A. Solomon and Mark Russinovich

These materials are part of the Windows Operating System Internals Curriculum Development Kit, developed by David A. Solomon and Mark E. Russinovich with Andreas Polze

Microsoft has licensed these materials from David Solomon Expert Seminars, Inc. for distribution to academic organizations solely for use in academic environments (and not for commercial use)

(3)

Roadmap for Section B

A Brief History of Windows and Linux

Comparing the Windows and Linux kernel architectures

Linux: becoming more like Windows

Benchmarks and other lies

What does the future hold?

(4)

Scope

We’re going to look at the technology of the kernels

We’re not going to look at:

Cost

Support

Applications

Management

Use as a desktop system

(5)

The History of Linux

The real history of Linux starts in 1969, when Ken Thompson developed the first version of UNIX at Bell Labs

After Dennis Ritchie, designer of the C programming language, joined the project, it debuted to the research community in an academic paper in 1974

Bell Labs released the first commercial version in 1976 as UNIX Version 6 (V6)

UNIX spread throughout universities, and in 1978 Bell Labs released UNIX Time-Sharing System, a version with portability in mind

(6)

Linux History Continued

Because Bell Labs distributed UNIX with source code, the early 1980s saw three major branches grow on the UNIX tree:

UNIX System III from Bell Labs’ UNIX Support Group (USG)

UNIX Berkeley Software Distribution (BSD) from the University of California at Berkeley

Microsoft’s XENIX

The UNIX market fragmented further in the 1980s, despite the IEEE’s POSIX standard and the X/Open Group’s Portability Guide

(7)

Linus and Linux

In 1991 Linus Torvalds took a college computer science course that used the Minix operating system

Minix is a “toy” UNIX-like OS written by Andrew Tanenbaum as a learning workbench

Linus wanted to make Minix more usable, but Tanenbaum wanted to keep it ultra-simple

Linus went in his own direction and began working on Linux

In October 1991 he announced Linux v0.02

In March 1994 he released Linux v1.0

(8)

The History of Windows (NT)

The history of Windows really begins in the mid-1970s, when Dick Hustvedt, Peter Lipman and David Cutler designed the VMS operating system for Digital’s 32-bit VAX processor

Digital shipped VMS v1.0 in 1978

Cutler moved to Seattle to open DECWest and worked on the Digital Mica OS for a new CPU codenamed Prism

12 engineers went with him and the facility grew to 200

In 1988 Digital cancelled the project

(9)

The History of Windows Continued

Bill Gates wanted a UNIX rival

He hired Cutler and 20 Digital engineers in 1989

The new project was called NT OS/2 because it focused on OS/2 backward compatibility

With the success of Windows 3.0’s 1990 release, Gates refocused the project on Windows compatibility

The project was renamed Windows NT

Microsoft released Windows NT 3.1 in August 1993

(10)

Windows and Linux

Both Linux and Windows are based on foundations developed in the mid-1970s

[Timeline figure, 1970 to 2000. UNIX/Linux track: UNIX born, UNIX public, UNIX V6, Linux v1.0, v2.0, v2.1, v2.2, v2.3, v2.4, v2.6. VMS/Windows track: VMS v1.0, Windows NT 3.1, NT 4.0, Windows 2000, Windows XP, Server 2003.]

(11)

Comparing the Architectures

Both Linux and Windows are monolithic

All core operating system services run in a shared address space in kernel mode

All core operating system services are part of a single module

Linux: vmlinuz

Windows: ntoskrnl.exe

Windowing is handled differently:

Windows has a kernel-mode windowing subsystem

Linux has a user-mode X Window System

(12)

Kernel Architectures

[Figure: layered diagrams of the two kernels. Linux: applications and X-Windows in user mode; system services, process management, memory management, I/O management, etc., device drivers and hardware-dependent code in kernel mode. Windows: applications in user mode; system services, Win32 windowing, process management, memory management, I/O management, etc., device drivers and hardware-dependent code in kernel mode.]

(13)

Linux Kernel

Linux is a monolithic but modular system

All kernel subsystems form a single piece of code with no protection between them

Modularity is supported in two ways:

Compile-time options

Most kernel components can be built as a dynamically loadable kernel module (DLKM)

DLKMs

Built separately from the main kernel

Loaded into the kernel at runtime and on demand (infrequently used components take up kernel memory only when needed)

Kernel modules can be upgraded incrementally

Support for minimal kernels that automatically adapt to the machine and load only those kernel components that are used

(14)

Windows Kernel

Windows is a monolithic but modular system

No protection among pieces of kernel code and drivers

Support for modularity is somewhat weak:

Windows drivers allow for dynamic extension of kernel functionality

Windows XP Embedded has special tools / packaging rules that allow coarse-grained configuration of the OS

Windows drivers are dynamically loadable kernel modules

Significant amount of code runs as drivers (including network stacks such as TCP/IP and many services)

Built independently from the kernel

Can be loaded on demand

Dependencies among drivers can be specified

(15)

Comparing Portability

Both Linux and Windows kernels are portable

Mainly written in C

Have been ported to a range of processor architectures

Windows

i486, MIPS, PowerPC, Alpha, IA-64, x86-64

Only x86-64 and IA-64 currently supported

> 64MB memory required

Linux

Alpha, ARM, ARM26, CRIS, H8300, i386, IA-64, M68000, MIPS, PA-RISC, PowerPC, S/390, SuperH, SPARC, VAX, v850, x86-64

DLKMs allow for minimal kernels for microcontrollers

> 4MB memory required

(16)

Comparing Layering, APIs, Complexity

Windows

Kernel exports about 250 system calls (accessed via ntdll.dll)

Layered Windows/POSIX subsystems

Rich Windows API (17,500 functions on top of native APIs)

Linux

Kernel supports about 200 different system calls

Layered BSD, Unix Sys V, POSIX shared system libraries

Compact APIs (1742 functions in Single Unix Specification Version 3; not including X Window APIs)

(17)

Comparing Architectures

Processes and scheduling

SMP support

Memory management

I/O

File Caching

Security

(18)

Process Management

Windows

Process

Address space, handle table, statistics and at least one thread

No inherent parent/child relationship

Threads

Basic scheduling unit

Fibers - cooperative user-mode threads

Linux

Process is called a Task

Basic address space, handle table, statistics

Parent/child relationship

Basic scheduling unit

Threads

No threads per se

Tasks can act like Windows threads by sharing handle table, PID and address space

PThreads - cooperative user-mode threads

(19)

Scheduling Priorities

Windows

Two scheduling classes

“Real time” (fixed): priority 16-31

Dynamic: priority 1-15

Higher priorities are favored

Priorities of dynamic threads get boosted on wakeups

Thread priorities are never lowered below their base priority

[Figure: Windows priority scale from 0 to 31, with dynamic priorities up to 15 (boosted on I/O) and fixed priorities 16-31.]

(20)

Scheduling Priorities

Linux

Has 3 scheduling classes:

Normal: priority 100-139

Fixed Round Robin: priority 0-99

Fixed FIFO: priority 0-99

Lower priorities are favored

Priorities of normal threads go up (decay) as they use CPU

Priorities of interactive threads go down (boost)

(21)

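Scheduling Priorities (cont)

[Figure: the two priority scales side by side. Windows: 0 to 31, with dynamic priorities up to 15 (arrow labeled I/O) and fixed priorities 16-31. Linux: 0 to 140, with fixed FIFO and fixed round-robin priorities 0-99 and normal priorities from 100 (arrows labeled CPU and I/O).]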
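The two scales run in opposite directions, which is easy to trip over. The following is a minimal sketch of the ranges given on these slides. The mapping "static priority = 120 + nice" is an assumption taken from the Linux 2.6 convention rather than from the slides, and all function names are hypothetical.

```python
# Sketch of the priority ranges described above.
# Assumption (not from the slides): Linux static priority = 120 + nice.

RT_MIN, RT_MAX = 0, 99              # Linux fixed FIFO / round-robin range
NORMAL_MIN, NORMAL_MAX = 100, 139   # Linux normal (dynamic) range


def linux_static_priority(nice: int) -> int:
    """Map a nice value (-20..19) onto the normal range 100..139."""
    if not -20 <= nice <= 19:
        raise ValueError("nice must be between -20 and 19")
    return 120 + nice


def linux_class(priority: int) -> str:
    """Name the Linux class for a priority (lower number = more favored)."""
    if RT_MIN <= priority <= RT_MAX:
        return "fixed"
    if NORMAL_MIN <= priority <= NORMAL_MAX:
        return "normal"
    raise ValueError("priority out of range")


def windows_class(priority: int) -> str:
    """Name the Windows class for a priority (higher number = more favored).

    Only levels 1..31 are modeled in this sketch.
    """
    if 16 <= priority <= 31:
        return "real time"
    if 1 <= priority <= 15:
        return "dynamic"
    raise ValueError("priority out of range")
```

For example, `linux_static_priority(0)` gives 120, which `linux_class` reports as "normal", while Windows priority 24 is reported as "real time".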

(22)

Linux Scheduling Details

Most threads use a dynamic priority policy

Normal class - similar to the classic UNIX scheduler

A newly created thread starts with a base priority

Threads that block frequently (I/O bound) will have their priority gradually increased

Threads that always exhaust their time slice (CPU bound) will have their priority gradually decreased

“Nice value” sets a thread’s base priority

Larger values = less priority, lower values = higher priority

Valid nice values are in the range of -20 to +19

Nonprivileged users can only specify positive nice values

Dynamic priority policy threads have static priority zero

Execute only when there are no runnable real-time threads
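The boost-and-decay behaviour of the normal class can be pictured with a toy model. This is only a sketch: the step size of one level and the window of five levels around the base priority are illustrative assumptions, not values from the slides, and the function names are hypothetical.

```python
# Toy model of dynamic priority adjustment (lower number = more favored).
# Assumptions: one level per event, at most 5 levels away from the base.

def adjust_priority(base: int, current: int, blocked: bool) -> int:
    """Return the next dynamic priority for one scheduling event.

    blocked=True  : the thread gave up the CPU to wait (I/O bound), so boost.
    blocked=False : the thread used its whole time slice (CPU bound), so decay.
    """
    step = -1 if blocked else 1
    best = max(100, base - 5)     # most favored value this thread may reach
    worst = min(139, base + 5)    # least favored value this thread may reach
    return min(worst, max(best, current + step))


def simulate(base: int, history: list[bool]) -> int:
    """Apply adjust_priority for each event in history and return the result."""
    current = base
    for blocked in history:
        current = adjust_priority(base, current, blocked)
    return current
```

A thread with base priority 120 that blocks three times in a row ends at 117, while one that burns ten full time slices bottoms out at 125 because of the assumed window.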

(23)

Real-Time Scheduling on Linux

Linux supports two static priority scheduling policies:

Round-robin and FIFO (first in, first out)

Selected with the sched_setscheduler() system call

Use static priority values in the range of 1 to 99

Executed strictly in order of decreasing static priority

FIFO policy lets a thread run to completion

Thread needs to indicate completion by calling sched_yield()

Round-robin lets threads run for up to one time slice

Then switches to the next thread with the same static priority

RT threads can easily starve lower-priority threads from executing

Root privileges or the CAP_SYS_NICE capability are required for the selection of a real-time scheduling policy

Long-running system calls can cause priority inversion

Same as in Windows; but compare RTLinux
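The difference between the two static policies shows up in the order in which threads run. The simulation below is a sketch under simplifying assumptions: no thread blocks, no new threads arrive, and a FIFO thread runs all its work before yielding. The function name and the tuple layout are hypothetical.

```python
from collections import deque


def run_realtime(threads, slice_ticks=2):
    """Simulate static-priority scheduling and return the execution order.

    threads: list of (name, priority, policy, work_ticks), where policy is
    "FIFO" or "RR" and a larger priority number (1..99) is more urgent.
    Returns a list of (name, ticks_run) entries in the order they ran.
    """
    queues = {}
    for name, prio, policy, work in threads:
        queues.setdefault(prio, deque()).append([name, policy, work])

    trace = []
    for prio in sorted(queues, reverse=True):    # strictly decreasing priority
        queue = queues[prio]
        while queue:
            thread = queue.popleft()
            name, policy, work = thread
            # FIFO runs to completion; round-robin runs at most one slice.
            ticks = work if policy == "FIFO" else min(work, slice_ticks)
            trace.append((name, ticks))
            thread[2] = work - ticks
            if thread[2] > 0:
                queue.append(thread)             # back of the same queue
    return trace
```

With `[("a", 10, "RR", 3), ("b", 10, "RR", 3), ("c", 50, "FIFO", 5)]` the trace is `[("c", 5), ("a", 2), ("b", 2), ("a", 1), ("b", 1)]`: the higher-priority FIFO thread starves the others until it finishes, then the two round-robin threads alternate.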

(24)

Windows Scheduling Details

Most threads run in variable priority levels

Priorities 1-15

A newly created thread starts with a base priority

Threads that complete I/O operations experience priority boosts (but never higher than 15)

A thread’s priority will never be below base priority

The Windows API function SetThreadPriority() sets the priority value for a specified thread

This value, together with the priority class of the thread's process, determines the thread's base priority level

Windows will dynamically adjust priorities for non-realtime threads
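The two invariants stated here (boosts are capped at 15, and a priority never falls below the base) can be written down as a small model. This is a sketch, not the Windows implementation: the boost amounts passed in are hypothetical, applying the boost to the base priority is an assumption, and so is the decay of one level per expired quantum.

```python
DYNAMIC_MAX = 15   # boosts never lift a thread above 15 (from the slide)


def boost_on_io_completion(base: int, current: int, boost: int) -> int:
    """Priority after an I/O completion; real-time threads are left alone.

    Assumption: the boost is added to the base priority, and a thread never
    loses priority as a result of being boosted.
    """
    if base >= 16:
        return current
    return max(current, min(DYNAMIC_MAX, base + boost))


def decay_after_quantum(base: int, current: int) -> int:
    """Assumed decay: drop one level per expired quantum, never below base."""
    return max(base, current - 1)
```

A thread with base priority 8 that receives a boost of 2 runs at 10, then drifts back to 9 and 8 over the next two quanta; a base-14 thread given a boost of 6 is clamped at 15.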

(25)

Real-Time Scheduling on Windows

Windows supports a static round-robin scheduling policy for threads with priorities in the real-time range (16-31)

Threads run for up to one quantum

Quantum is reset to a full turn on preemption

Priorities never get boosted

RT threads can starve important system services

Such as CSRSS.EXE

SeIncreaseBasePriorityPrivilege is required to elevate a thread’s priority into the real-time range (this privilege is assigned to members of the Administrators group)

System calls and DPC/APC handling can cause priority inversion

(26)

Scheduling Timeslices

Windows

The thread timeslice (quantum) is 10ms-120ms

When quanta can vary, has one of 2 values

Reentrant and preemptible

[Figure: Windows quantum lengths. Fixed: 120ms. Variable: 20ms background, 60ms foreground.]

Linux

The thread quantum is 10ms-200ms

Default is 100ms

Varies across entire range based on priority, which is based on interactivity level

Reentrant and preemptible

[Figure: Linux quantum range from 10ms to 200ms, with 100ms as the default.]
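How a quantum can vary "across the entire range based on priority" is sketched below for the Linux figures quoted above. The linear interpolation over the nice range is an assumption made for illustration; the slides give only the end points and the default, and the function name is hypothetical.

```python
MIN_SLICE_MS, MAX_SLICE_MS = 10, 200   # range quoted on the slide


def linux_timeslice_ms(nice: int) -> float:
    """Interpolate linearly between the end points (assumed shape).

    nice -20 (most favored) gets the longest slice, nice 19 the shortest.
    """
    if not -20 <= nice <= 19:
        raise ValueError("nice must be between -20 and 19")
    fraction = (19 - nice) / 39
    return MIN_SLICE_MS + (MAX_SLICE_MS - MIN_SLICE_MS) * fraction
```

Under this assumption a thread at nice 0 lands close to the 100ms default, which is at least consistent with the figures on the slide.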

(27)

Multiprocessor Support

Windows

Supports symmetric multiprocessing (SMP)

Up to 32 processors on 32-bit Windows

Up to 64 processors on 64-bit Windows

All CPUs can take interrupts

Supports Non-Uniform Memory Access systems

Scheduler favors the node a thread prefers to run on

Memory manager tries to allocate memory on the node a thread prefers to run on

Supports Hyperthreading

Scheduler favors idle physical processors when it has a choice

Doesn’t count logical CPUs against licensing limits

[Figure: a ready thread and the logical CPUs of Physical CPU 0 and Physical CPU 1.]

(28)

Multiprocessor Support

Linux

Supports SMP

No upper CPU limit: set as kernel build constant

All CPUs can take interrupts

Supports Non-Uniform Memory Access systems

Scheduler favors the node a thread last ran on

Memory manager tries to allocate memory on the node a thread is running on

Supports Hyperthreading

Scheduler favors idle physical processors when it has a choice
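"Favors idle physical processors" means that, given a choice of idle logical CPUs, the scheduler would rather use one whose sibling is idle too. The helper below is a minimal sketch of that preference; the data layout and the function name are hypothetical and neither kernel's actual algorithm is reproduced.

```python
def pick_logical_cpu(idle, siblings):
    """Choose an idle logical CPU, preferring a fully idle physical processor.

    idle: set of idle logical CPU ids.
    siblings: dict mapping each logical CPU id to the set of logical CPUs
              (including itself) that share its physical processor.
    Returns a CPU id, or None when nothing is idle.
    """
    for cpu in sorted(idle):
        if siblings[cpu] <= idle:      # every sibling is idle as well
            return cpu
    return min(idle) if idle else None
```

With two physical processors exposing logical CPUs {0, 1} and {2, 3}, and CPU 0 busy, the helper picks CPU 2 rather than CPU 1, so the new thread does not share execution resources with the busy sibling.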

(29)

Virtual Memory Management

Windows

32-bit versions split user-mode/kernel-mode from 2GB/2GB to 3GB/1GB

Demand-paged virtual memory

32 or 64-bits

Copy-on-write

Shared memory

Memory mapped files

[Figure: 4GB Windows address space, with user space from 0 to 2GB and system space from 2GB to 4GB.]

Linux

Splits user-mode/kernel-mode from 1GB/3GB to 3GB/1GB

2.6 has “4/4 split” option where kernel has its own address space

Demand-paged virtual memory

32-bits and/or 64-bits

Copy-on-write

Shared memory

Memory mapped files

[Figure: 4GB Linux address space, with user space from 0 to 3GB and system space from 3GB to 4GB.]
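Memory-mapped files are the one feature in this list that is easy to try from user mode on either system. The sketch below uses Python's standard `mmap` module, which wraps the native mapping facility of the platform it runs on; the helper names and the file contents are made up for the example.

```python
import mmap
import os
import tempfile


def patch_file_through_mapping(path: str, offset: int, data: bytes) -> None:
    """Overwrite bytes in a file by writing to a shared mapping of it."""
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), 0) as view:    # length 0 maps the whole file
            view[offset:offset + len(data)] = data
            view.flush()                          # push dirty pages to the file


def demo() -> bytes:
    """Create a scratch file, patch it through a mapping, and read it back."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(b"hello kernel")
        patch_file_through_mapping(path, 6, b"KERNEL")
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)
```

No `write()` call touches the file after it is created; the change reaches the file because the mapped pages are backed by it, which is the mechanism both kernels also use internally for their file caches.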

(30)

Physical Memory Management

Windows

Per-process working sets

Working set tuner adjusts sets according to memory needs using the “clock” algorithm

No “swapper”

[Figure labels: Process, LRU, Reused Page]

Linux

Global working set management

Uses “clock” algorithm

No “swapper” (the working set trimmer code is called the swap daemon, however)

[Figure labels: LRU, Reused Page, Other Process, LRU]
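Both systems are said to use the "clock" algorithm, an approximation of LRU that gives each recently referenced page a second chance. The class below is a textbook sketch of that algorithm, not either kernel's code; the class name, its fields, and the choice to mark a page as referenced when it is first loaded are assumptions.

```python
class ClockReplacer:
    """Second-chance ("clock") page replacement over a fixed set of frames."""

    def __init__(self, frames: int):
        self.pages = [None] * frames         # page held by each frame
        self.referenced = [False] * frames   # reference bit per frame
        self.hand = 0                        # position of the clock hand

    def access(self, page) -> bool:
        """Touch a page; return True if the access caused a page fault."""
        if page in self.pages:
            self.referenced[self.pages.index(page)] = True
            return False
        # Sweep the hand, clearing reference bits, until a victim is found.
        while self.pages[self.hand] is not None and self.referenced[self.hand]:
            self.referenced[self.hand] = False
            self.hand = (self.hand + 1) % len(self.pages)
        self.pages[self.hand] = page
        self.referenced[self.hand] = True
        self.hand = (self.hand + 1) % len(self.pages)
        return True
```

In terms of the comparison above, a per-process design would keep one such structure per working set, while a global design keeps a single one for all processes, so a fault in one process can evict another process's page.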

(31)

I/O Management

Windows

Centered around the file object

Layered driver architecture throughout driver types

Most I/O supports asynchronous operation

Internal interrupt request level (IRQL) controls interruptibility

Interrupts are split between an Interrupt Service Routine (ISR) and a Deferred Procedure Call (DPC)

Supports plug-and-play

Linux

Centered around the vnode

No layered I/O model

Most I/O is synchronous

Only sockets and direct disk I/O support asynchronous I/O

Internal interrupt request level (IRQL) controls interruptibility

Interrupts are split between an ISR and soft IRQ or tasklet

Supports plug-and-play

[Figure labels: IRQL, Masked]

(32)

File Caching

Windows

Single global common cache

Virtual file cache

Caching is at the file level rather than the disk block level

Files are memory mapped into kernel memory

Cache allows for zero-copy file serving

[Figure: the file cache sits above the file system driver, which sits above the disk driver.]

Linux

Single global common cache

Virtual file cache

Caching is at the file level rather than the disk block level

Files are memory mapped into kernel memory

Cache allows for zero-copy file serving

[Figure: the file cache sits above the file system driver, which sits above the disk driver.]

(33)

Security

Windows

Very flexible security model based on Access Control Lists

Users are defined with

Privileges

Member groups

Security can be applied to any Object Manager object

Files, processes, synchronization objects, …

Supports auditing

Linux

Two models:

Standard UNIX model

Access Control Lists (SELinux)

Users are defined with:

Capabilities (privileges)

Member groups

Security is implemented on an object-by-object basis

Has no built-in auditing support

Version 2.6 includes Linux Security Module framework for add-on security models

(34)

Monitoring - Linux procfs

Linux supports a number of special filesystems

Like special files, they are of a more dynamic nature and tend to have side effects when accessed

Prime example is procfs (mounted at /proc)

Provides access to and control over various aspects of Linux (e.g., scheduling and memory management)

/proc/meminfo contains detailed statistics on the current memory usage of Linux

Content changes as memory usage changes over time

Services for Unix implements procfs on Windows
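Because /proc/meminfo is plain text of the form `Name:   value kB`, monitoring tools can read it like any other file. The parser below is a small sketch; the function names are hypothetical, and it treats the `kB` unit as 1024 bytes.

```python
def parse_meminfo(text: str) -> dict:
    """Parse /proc/meminfo-style text into {field: value}.

    Values carrying a "kB" unit are returned in bytes (kB taken as 1024
    bytes); unit-less values are returned as plain counts.
    """
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if not sep or not parts:
            continue                      # skip blank or malformed lines
        value = int(parts[0])
        if len(parts) > 1 and parts[1] == "kB":
            value *= 1024
        info[name.strip()] = value
    return info


def read_meminfo(path: str = "/proc/meminfo") -> dict:
    """Read and parse the live statistics (Linux only)."""
    with open(path) as f:
        return parse_meminfo(f.read())
```

Calling `read_meminfo()` twice a few seconds apart on a Linux machine typically returns different values for fields such as `MemFree`, which illustrates the dynamic nature of the file.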

(35)

Windows’ Evolution Towards Linux

Services for Unix 3.5 - really targeted at POSIX, not Linux

POSIX threads, full POSIX subsystem (Interix)

X Window clients+server (X-Win32 LX)

nfs, NIS, pam

proc-file system for Windows

Configurability / Module Management

Windows XP Embedded

Target Designer/Component Designer/Component Management Database

Editions targeting new Application Domains

Windows Compute Cluster Server 2003

POSIX compatibility in Windows actually predates Linux and was one of the original design goals

(36)

Linux’s Evolution Towards Windows

I/O processing

Kernel reentrancy

Kernel preemptibility

Per-processor memory allocation

O(1) scheduler and per-CPU ready queues

Zero-Copy SendFile

Wake-One socket semantics

Asynchronous I/O

Light-weight synchronization

(37)

I/O Processing

Linux 2.2 had the notion of bottom halves (BH) for low-priority interrupt processing

Fixed number of BHs

Only one BH of a given type could be active on an SMP system

Linux 2.4 introduced tasklets, which are non-preemptible procedures called with interrupts enabled

Tasklets are the equivalent of Windows Deferred Procedure Calls (DPCs)
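The idea common to bottom halves, tasklets and DPCs is that the interrupt handler does the minimum and queues the rest for later. The class below is a user-mode sketch of that split, with a hypothetical name and log format; it models only the ordering, not interrupt masking or per-CPU queues.

```python
from collections import deque


class DeferredWorkQueue:
    """Split interrupt handling into an urgent part and a deferred part."""

    def __init__(self):
        self.pending = deque()   # work queued by the urgent part
        self.log = []            # record of what ran, in order

    def interrupt(self, device: str, payload: bytes) -> None:
        """Urgent part (ISR): acknowledge the device and queue the rest."""
        self.log.append(f"isr:{device}")
        self.pending.append((device, payload))

    def run_deferred(self) -> None:
        """Deferred part (tasklet/DPC): drain the queue once it is safe."""
        while self.pending:
            device, payload = self.pending.popleft()
            self.log.append(f"deferred:{device}:{len(payload)}")
```

Two interrupts arriving back to back are both acknowledged before either payload is processed, which is the property that keeps the time spent with interrupts masked short.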

(38)

Kernel Reentrancy

Mark Russinovich’s April 1999 Windows NT Magazine article, “Linux and the Enterprise”, pointed out that much of the Linux 2.2 kernel was not reentrant

Ingo Molnar stated in rebuttal:

“his example is a clear red herring.”

A month later he made all major paths reentrant

[Figure: timelines for cpu 1 and cpu 2 in the non-reentrant and reentrant cases, showing the time saved.]

(39)

Kernel Preemptibility

A preemptible kernel is more responsive to high-priority tasks

Through the base release of v2.4, Linux was only cooperatively preemptible

There are well-defined safe places where a thread running in the kernel can be preempted

The kernel is preemptible in v2.4 patches and v2.6

Windows NT has always been preemptible

(40)

Per-CPU Memory Allocation

Keeping accesses to memory localized to a CPU minimizes CPU cache thrashing

Cache thrashing hurts performance on enterprise SMP workloads

Linux 2.4 introduced per-CPU kernel memory buffers

Windows introduced per-CPU buffers in an NT 4 Service Pack in 1997

[Figure: CPUs 0 and 1, each with its own buffer cache (Buffer Cache 0, Buffer Cache 1)]
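
A minimal Python sketch of the per-CPU buffer idea follows. It is a model only: `PerCpuAllocator`, its `batch` parameter, and the `global_hits` counter are hypothetical, and a real kernel would protect the shared pool with a lock, which is exactly the cost the per-CPU caches are meant to avoid.

```python
class PerCpuAllocator:
    """Toy allocator: each CPU has a private free list backed by a shared pool."""

    def __init__(self, num_cpus, global_buffers, batch=2):
        self.global_pool = list(global_buffers)
        self.local = [[] for _ in range(num_cpus)]
        self.batch = batch
        self.global_hits = 0  # how often the shared pool had to be touched

    def alloc(self, cpu):
        cache = self.local[cpu]
        if not cache:
            # Slow path: refill the private list from the shared pool.
            # (A real kernel would need to take a lock here.)
            self.global_hits += 1
            for _ in range(min(self.batch, len(self.global_pool))):
                cache.append(self.global_pool.pop())
        if not cache:
            raise MemoryError("out of buffers")
        # Fast path: hand out the most recently freed buffer, which is the
        # one most likely to still be in this CPU's cache.
        return cache.pop()

    def free(self, cpu, buf):
        # Freed buffers go back to the freeing CPU's private list.
        self.local[cpu].append(buf)


allocator = PerCpuAllocator(num_cpus=2, global_buffers=range(8))
first = allocator.alloc(0)    # refills CPU 0's list from the shared pool
allocator.free(0, first)
second = allocator.alloc(0)   # served from CPU 0's list, no shared access
other = allocator.alloc(1)    # CPU 1 refills its own list
```

In this run CPU 0 gets the same buffer back on its second allocation, and the shared pool is touched only once per CPU.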

(41)

Scheduling

The Linux 2.4 scheduler is O(n)

If there are 10 active tasks, it scans all 10 of them in a list in order to decide which should execute next

This means long scans and long durations under the scheduler lock

[Figure: ready list of tasks with priorities 103, 112, 112, 101, scanned to find the highest-priority task]
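
The linear scan can be sketched as follows. This is an illustration only: the real 2.4 scheduler computed a "goodness" value per task, whereas here a fixed priority number stands in for it, and "higher number means higher priority" is an assumption made for the example. The task names are hypothetical; the priorities are the ones in the figure.

```python
def pick_next_linear(ready_list):
    """Scan every runnable task; return (best task, number of tasks examined)."""
    best = None
    examined = 0
    for task in ready_list:
        examined += 1
        # Strict '>' keeps the first task found among equal priorities.
        if best is None or task["prio"] > best["prio"]:
            best = task
    return best, examined


ready_list = [
    {"name": "A", "prio": 103},
    {"name": "B", "prio": 112},
    {"name": "C", "prio": 112},
    {"name": "D", "prio": 101},
]

chosen, examined = pick_next_linear(ready_list)
```

Every scheduling decision touches all runnable tasks, so the cost grows with the length of the ready list, and the scheduler lock is held for the whole scan.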

(42)

Scheduling

Linux 2.6 has a revamped O(1) scheduler from Ingo Molnar that:

Calculates a task’s priority at the time it makes a scheduling decision

Has per-CPU ready queues where the tasks are pre-sorted by priority

[Figure: per-priority ready queues (tasks 112, 112, 101, 103); the next task comes from the highest-priority non-empty queue]
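
A sketch of the pre-sorted queue structure, assuming one FIFO queue per priority level plus a bitmap of non-empty levels. `RunQueue` and `NUM_PRIOS` are hypothetical names, and the example treats a higher number as a higher priority to match the figure (the Linux kernel's internal numbering runs the other way).

```python
from collections import deque

NUM_PRIOS = 140  # assumed number of priority levels for this sketch


class RunQueue:
    """One ready queue per priority level plus a bitmap of non-empty levels."""

    def __init__(self):
        self.queues = [deque() for _ in range(NUM_PRIOS)]
        self.bitmap = 0  # bit p is set when queues[p] is non-empty

    def enqueue(self, name, prio):
        self.queues[prio].append(name)
        self.bitmap |= 1 << prio

    def pick_next(self):
        if self.bitmap == 0:
            return None
        # Find the highest set bit: work bounded by the fixed number of
        # priority levels, independent of how many tasks are runnable.
        prio = self.bitmap.bit_length() - 1
        queue = self.queues[prio]
        name = queue.popleft()
        if not queue:
            self.bitmap &= ~(1 << prio)
        return name, prio


# One run queue per CPU; only CPU 0 is exercised here.
run_queues = [RunQueue() for _ in range(2)]
rq = run_queues[0]
for name, prio in [("A", 103), ("B", 112), ("C", 112), ("D", 101)]:
    rq.enqueue(name, prio)

picks = [rq.pick_next() for _ in range(5)]
```

The tasks come out in priority order (B, C, A, D) without any scan over the task list, and the fifth call returns `None` because the queue is empty.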

(43)

Scheduling

Windows NT has always had an O(1) scheduler based on pre-sorted thread priority queues

Server 2003 introduced per-CPU ready queues

Linux load balances queues

Windows does not

Not seen as an issue in performance testing by Microsoft

Applications where it might be an issue are expected to use affinity

(44)

Zero-Copy Sendfile

Linux 2.2 introduced Sendfile to efficiently send file data over a socket

I pointed out that the initial implementation incurred a copy operation, even if the file data was cached

Linux 2.4 introduced zero-copy Sendfile

Windows NT pioneered zero-copy file sending with TransmitFile, the Sendfile equivalent, in Windows NT 4

[Figure: 1-copy vs. 0-copy send paths between the file data buffer, network adapter buffer, network driver, and network]
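
From user mode, the Sendfile pattern looks roughly like the Python sketch below. It relies on the standard library's `socket.sendfile()`, which uses the operating system's sendfile call where one is available and otherwise falls back to an ordinary read-and-send loop. The helper name `send_file_over_socket`, the payload, and the use of a local socket pair in place of a network connection are assumptions for the demonstration; whether the transfer is actually zero-copy depends on the kernel and driver.

```python
import os
import socket
import tempfile


def send_file_over_socket(sock, path):
    """Send a whole file over a connected stream socket; return bytes sent."""
    with open(path, "rb") as f:
        # The file contents are never read into this program's own buffers
        # when the OS-level sendfile path is taken.
        return sock.sendfile(f)


payload = b"hello from the file cache\n" * 100

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
    path = tmp.name

sender, receiver = socket.socketpair()
try:
    sent = send_file_over_socket(sender, path)
    sender.close()  # signals end-of-stream to the receiver
    chunks = []
    while True:
        chunk = receiver.recv(65536)
        if not chunk:
            break
        chunks.append(chunk)
    received = b"".join(chunks)
finally:
    sender.close()  # closing an already closed socket is harmless
    receiver.close()
    os.unlink(path)
```

The application hands the kernel a file and a socket and gets back a byte count, which is what lets the kernel choose a path with fewer copies.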

(45)

Wake-one Socket Semantics

The Linux 2.2 kernel had the thundering herd, or overscheduling, problem

In a network server application there are typically several threads waiting for a new connection

In v2.2, when a new connection came in, all the waiters would race to get it

Ingo Molnar’s response:

5/2/99: “here he again forgets to _prove_ that overscheduling happens in Linux.”

5/7/99: “as of 2.3.1 my wake-one implementation and waitqueues rewrite went in”

In Linux 2.4 only one thread wakes up to claim the new connection

Windows NT has always had wake-one semantics
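
The difference between wake-all and wake-one can be shown with a small deterministic simulation. No real threads or sockets are involved: `deliver_connection` is a hypothetical helper that models a wait queue as a list of thread names and assumes the first woken thread wins the race.

```python
from collections import deque


def deliver_connection(waiters, wake_one):
    """Simulate one incoming connection; return (winner, wasted_wakeups)."""
    sleeping = deque(waiters)
    if wake_one:
        woken = [sleeping.popleft()]   # wake only the first waiter
    else:
        woken = list(sleeping)         # wake every waiter (thundering herd)
        sleeping.clear()
    winner = woken[0]                  # assumed: the first thread to run claims it
    losers = woken[1:]                 # the rest find nothing to accept
    sleeping.extend(losers)            # and go back to sleep
    return winner, len(losers)


threads = ["t1", "t2", "t3", "t4"]
herd = deliver_connection(threads, wake_one=False)
single = deliver_connection(threads, wake_one=True)
```

With four waiters, wake-all schedules three threads that do no useful work per connection, while wake-one schedules none.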

(46)

Asynchronous I/O

Linux 2.2 only supported asynchronous I/O on socket connect operations and ttys

Linux 2.6 adds asynchronous I/O for direct-disk access

AIO model includes efficient management of asynchronous I/O

Also added alternate epoll model

Useful for database servers managing their database on a dedicated raw partition

Database servers that manage a file-based database suffer from synchronous I/O

Windows I/O is inherently asynchronous

Windows has had completion ports since NT 3.5

More advanced form of AIO
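
As a rough illustration of the event-notification style that epoll supports, the sketch below uses Python's standard `selectors` module, which selects epoll on Linux and another mechanism elsewhere. Note the difference in model: epoll reports that a descriptor is ready for I/O, whereas a completion port reports that an I/O operation has finished. The socket pair and the `"peer-b"` label are assumptions for the demonstration.

```python
import selectors
import socket

sel = selectors.DefaultSelector()  # epoll-based on Linux
a, b = socket.socketpair()
a.setblocking(False)
b.setblocking(False)

# Ask to be told when b has data to read; attach a label to the registration.
sel.register(b, selectors.EVENT_READ, data="peer-b")

# Nothing has been sent yet, so a non-blocking poll reports no ready sockets.
idle = sel.select(timeout=0)

a.send(b"ping")

# Now the selector reports b as readable and the read will not block.
ready = sel.select(timeout=1)
messages = [(key.data, key.fileobj.recv(1024)) for key, _events in ready]

sel.unregister(b)
a.close()
b.close()
sel.close()
```

One thread can watch many sockets this way and only touch those that have work pending, instead of blocking in a read on each one.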

(47)

Light-Weight Synchronization

Linux 2.6 introduces Futexes

There’s only a transition to kernel-mode when there’s contention

Windows has always had CriticalSections

Same behavior

Futexes go further:

Allow for prioritization of waits

Work interprocess as well
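
The fast-path idea shared by futexes and CriticalSections can be modelled in Python, with the caveat that this is an analogy and not the real mechanism: a non-blocking acquire stands in for the user-mode atomic operation, and a blocking acquire stands in for the call into the kernel. `FastPathLock`, `slow_path_entries`, and `contended` are hypothetical names.

```python
import threading


class FastPathLock:
    """Toy model: uncontended acquires skip the (simulated) kernel transition."""

    def __init__(self):
        self._lock = threading.Lock()
        self._stats = threading.Lock()
        self.slow_path_entries = 0
        self.contended = threading.Event()

    def acquire(self):
        # Fast path: stand-in for an atomic compare-and-swap done in user mode.
        if self._lock.acquire(blocking=False):
            return
        # Slow path: stand-in for entering the kernel to sleep until release.
        with self._stats:
            self.slow_path_entries += 1
        self.contended.set()
        self._lock.acquire()

    def release(self):
        self._lock.release()


lock = FastPathLock()

# Uncontended use: every acquire succeeds on the fast path.
for _ in range(3):
    lock.acquire()
    lock.release()
uncontended_slow = lock.slow_path_entries


def worker():
    lock.acquire()
    lock.release()


# Contended use: the worker finds the lock held and takes the slow path.
lock.acquire()
t = threading.Thread(target=worker)
t.start()
lock.contended.wait(timeout=5)  # wait until the worker has hit the slow path
lock.release()
t.join()
contended_slow = lock.slow_path_entries
```

The slow-path counter stays at zero for the uncontended loop and rises only when a second thread finds the lock held.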

(48)

A Look at the Future

The kernel architectures are fundamentally similar

There are differences in the details

The Linux implementation is adopting more of the good ideas used in Windows

For the next 2-4 years Windows has and will maintain an edge

Linux is still behind on the cutting edge of performance tricks

Large performance team and lab at Microsoft has direct ties into the kernel developers

As time goes on the technological gap will narrow

Open Source Development Labs (OSDL) will feed performance test results to the kernel team

IBM and other vendors have Linux technology centers

Squeezing performance out of the OS gets much harder as the OS gets more tuned

(49)

Linux Technology Unknowns

Linux kernel forking

Red Hat has already done it: Red Hat Enterprise Server v3.0 is Linux 2.4 with some Linux 2.6 features

Backward compatibility philosophy

Linus Torvalds makes decisions on kernel APIs and architecture based on technical reasons, not business reasons

(50)

Further Reading

Transaction Processing Performance Council: www.tpc.org

SPEC: www.spec.org

NT vs Linux benchmarks: www.kegel.com/nt-linux-benchmarks.html

The C10K problem: http://www.kegel.com/c10k.html

Linus Torvalds’s home: http://www.osdl.org/

Linux Kernel Archives: http://www.kernel.org/

Linux history: http://www.firstmonday.dk/issues/issue5_11/moon/

Veritest Netbench result: http://www.veritest.com/clients/reports/microsoft/ms_netbench.pdf

Mark Russinovich’s 1999 article, “Linux and the Enterprise”: http://www.winntmag.com/Articles/Index.cfm?ArticleID=5048

The Open Group's Single UNIX Specification: http://www.unix.org/version3/
