Chapter 1 Introduction
Hsung-Pin Chang
Department of Computer Science
National Chung Hsing University
Preference
• Based on version 2.4.18 of the Linux kernel
  – www.kernel.org
• The Linux source code is contained in more than 8,000 C and assembly language files
  – 4 million lines of code
  – 144 megabytes of disk space
Resources for Tracing Linux
• Source code browsers
  – LXR (source code navigator)
  – Global
• Books
  – Understanding the Linux Kernel, D. P. Bovet and M. Cesati, O'Reilly & Associates, 2000.
  – Linux Core Kernel Commentary, In-Depth Code Annotation, S. Maxwell, Coriolis Open Press, 1999.
  – The Linux Kernel, Version 0.8-3, D. A. Rusling, 1998.
  – Linux Kernel Internals, 2nd edition, M. Beck et al., Addison-Wesley, 1998.
  – Linux Kernel, R. Card et al., John Wiley & Sons, 1998.
Introduction
• Linux was initially developed by Linus Torvalds in 1991 for the Intel 80386
  – Open source under the GNU General Public License
• Commercial distributions bundle a large amount of additional software with the Linux kernel to provide a complete operating system
• The Linux source is usually installed in the /usr/src/linux directory
Kernel Source Code Organization
Linux Advantages over Commercial Competitors
• Linux is free
• Linux is fully customizable in all its components
  – The GPL (General Public License) allows you to freely read and modify the source code of the kernel
• Linux runs on low-end, cheap hardware platforms
Linux Advantages over Commercial Competitors (Cont.)
• Linux is powerful
  – It fully exploits the hardware features
  – Its main goal is efficiency
• Linux has a high standard for source code quality
• The Linux kernel can be very small and compact
  – Both a kernel image and a full root file system can fit on a single 1.4 MB floppy disk
Linux Advantages over Commercial Competitors (Cont.)
• Linux is highly compatible with many common operating systems
  – You can mount the file systems of many other operating systems
  – Linux supports many network types, such as Ethernet, FDDI, …
  – With suitable libraries, Linux can directly run programs written for other operating systems
Linux Advantages over Commercial Competitors (Cont.)
• Linux is well supported
  – Patches are very easy to get
  – Numerous device drivers are available
  – Numerous newsgroups and mailing lists
Hardware Dependency
• Linux makes a distinction between hardware-dependent and hardware-independent source code
• Both the arch and include directories contain subdirectories that correspond to the hardware platforms
  – alpha, arm, i386, ia64, m68k, mips, …
Linux Versions
• Linux distinguishes stable kernels from development kernels
• Linux version: X.Y.Z
  – X.Y: version number
    • If the second number (Y) is even: stable kernel
    • If the second number (Y) is odd: development kernel
  – Z: release number
    • Incremented by releases that fix bugs reported by users
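The even/odd rule above can be checked mechanically. The following is a minimal sketch in Python, assuming a plain three-part X.Y.Z string such as the 2.4.18 kernel these slides are based on; the function name `classify_kernel_version` is hypothetical and not part of any kernel tool.

```python
def classify_kernel_version(version: str) -> dict:
    """Split an X.Y.Z version string and apply the even/odd rule to Y."""
    major, minor, release = (int(part) for part in version.split("."))
    return {
        "version": f"{major}.{minor}",   # X.Y: version number
        "release": release,              # Z: release number
        # Even second number means stable, odd means development.
        "series": "stable" if minor % 2 == 0 else "development",
    }


info = classify_kernel_version("2.4.18")
print(info["version"], info["release"], info["series"])
```

A string with an odd second number, for example "2.5.7", is classified as a development kernel by the same function.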
Basic Operating System Concepts
• The operating system fulfills two main objectives
  – Interact with the hardware components
  – Provide an execution environment to applications
• Thus, when a user program wants a hardware resource
  – It issues a request to the operating system
Basic Operating System Concepts (Cont.)
• Therefore, the O.S. must rely on specific hardware features to forbid user applications from directly interacting with hardware or memory
  – The CPU provides:
    • Nonprivileged mode for user programs
    • Privileged mode for the kernel
  – Unix calls these User Mode and Kernel Mode, respectively
Multiuser Systems
• A multiuser O.S. includes several features
  – An authentication mechanism for verifying the user’s identity
  – A protection mechanism against buggy user programs that could block other applications
  – A protection mechanism against malicious user programs
  – An accounting mechanism that limits the amount of resource units assigned to each user
Users and Groups
• All users are identified by a unique number called the User ID, or UID
• To use the computer, a user
  – Provides a login name and password
• To selectively share material with other users
  – Each user is a member of one or more groups
• Each group is identified by a unique number called a Group ID, or GID
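The identifiers above are visible to any running program. The sketch below uses Python's standard `os` and `pwd` modules on a Unix system to report the UID, login name, and group IDs of the current process; the helper name `describe_current_user` is hypothetical, and the login name is left as `None` when the system has no account entry for the UID (an assumption made so the sketch also runs in minimal environments).

```python
import os
import pwd


def describe_current_user() -> dict:
    """Collect the UID, login name, and GIDs of the calling process."""
    uid = os.getuid()
    gid = os.getgid()
    try:
        login = pwd.getpwuid(uid).pw_name   # look up the login name for this UID
    except KeyError:
        login = None                        # no account entry for this UID
    # A user may belong to several groups; include the primary GID as well.
    groups = sorted(set(os.getgroups()) | {gid})
    return {"uid": uid, "login": login, "primary_gid": gid, "groups": groups}


user = describe_current_user()
print(user)
```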
Users and Groups (Cont.)
• Root is a special user that handles user accounts, performs maintenance tasks, etc.
  – Also called the superuser or supervisor
Process
• Process
  – An instance of a program in execution
  – Execution context
• A process executes a sequence of instructions in an address space
  – The address space is the set of memory addresses that the process is allowed to reference
  – Multiple sequences of instructions may be executed in the same address space, i.e., threading
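The last point, several instruction sequences in one address space, can be illustrated at user level. This is a small sketch using Python's `threading` module, not kernel code; the names `shared_log` and `worker` are made up for the example.

```python
import threading

shared_log = []  # one object in the single address space shared by all threads


def worker(tag: int) -> None:
    # Every thread sees and modifies the same list, because threads
    # of a process share its address space.
    shared_log.append(tag)


threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(shared_log))
```

All four tags end up in the same list, which would not happen if each worker ran in a separate process with its own address space.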
Process (Cont.)
• Process/kernel model
  – When a process makes a system call (i.e., a request to the kernel)
    • The hardware changes the privilege mode from User Mode to Kernel Mode
    • Thus, the O.S. acts within the execution context of the process
  – When the system call completes, the kernel forces the hardware to return to User Mode and the process continues its execution
Kernel Architecture
• Module
  – An object file whose code can be linked to the kernel at run time
• Advantages
  – A modularized design approach
  – Platform independence
    • A disk driver for SCSI works on both an IBM PC and an HP Alpha
  – Frugal main memory usage
  – No performance penalty
    • Once linked, a module performs the same as statically linked kernel code
Unix Filesystem Overview
• Read this part on your own
An Overview of Unix Kernels
• The Process/Kernel Model
• Process Implementation
• Reentrant Kernels
• Process Address Space
• Synchronization and Critical Regions
• Signals and Interprocess Communication
• Process Management
• Memory Management
• Device Drivers
The Process/Kernel Model
• Each CPU can run in either User Mode or Kernel Mode
  – The 80x86 has four different execution states
    • Linux uses only User Mode and Kernel Mode
  – Each CPU also provides special instructions to switch between these modes
The Process/Kernel Model (Cont.)
• The process/kernel model assumes that processes that require a kernel service use specific programming constructs called system calls
  – The kernel itself is not a process but a process manager that implements many system calls
The Process/Kernel Model (Cont.)
• However, Unix systems also include a few privileged processes called kernel threads
  – They run in Kernel Mode and in the kernel address space
  – They do not interact with users
  – They are usually created during system startup and remain alive until the system is shut down
The Process/Kernel Model (Cont.)
• Kernel routines can be activated in several ways
  – A process invokes a system call
  – The CPU executing the process signals an exception
  – A peripheral device issues an interrupt to the CPU
    • The corresponding interrupt handler is invoked
  – A kernel thread is executed
Process Implementation
• Each process is represented by a process descriptor
  – Also known as the PCB (Process Control Block)
• The kernel saves the contents of the processor registers in the descriptor when performing a context switch
  – Program Counter (PC) and Stack Pointer (SP)
  – General-purpose registers
  – Floating-point registers
  – Processor control registers (Processor Status Word) containing information about the CPU state
  – The memory management registers
Reentrant Kernels
• Unix kernels are reentrant
  – Several processes may be executing in Kernel Mode at the same time
  – On uniprocessor systems, only one process can make progress, but many may be blocked in Kernel Mode waiting for some event
Reentrant Kernels (Cont.)
• Implementation approaches
  – Reentrant functions: functions that modify only local variables and do not alter global data structures
  – Nonreentrant functions, protected by a locking mechanism that ensures only one process can execute a nonreentrant function at a time
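The two approaches can be contrasted with a user-level analogy. The sketch below uses Python threads in place of processes running in Kernel Mode, which is an assumption made purely for illustration; the names `reentrant_sum`, `nonreentrant_increment`, and `event_count` are hypothetical.

```python
import threading


def reentrant_sum(values):
    """Reentrant: touches only its arguments and local variables."""
    total = 0
    for v in values:
        total += v
    return total


event_count = 0                   # global data shared by every caller
event_lock = threading.Lock()     # serializes the nonreentrant part


def nonreentrant_increment(times: int) -> None:
    """Nonreentrant: updates global data, so callers must take the lock."""
    global event_count
    for _ in range(times):
        with event_lock:          # only one thread at a time runs this update
            event_count += 1


threads = [threading.Thread(target=nonreentrant_increment, args=(10_000,))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(reentrant_sum([1, 2, 3]), event_count)
```

Because every update of `event_count` happens under the lock, no increment is lost regardless of how the threads interleave.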
Kernel Control Path
• A kernel control path denotes the sequence of instructions executed by the kernel to handle a
  – System call
  – Exception
  – Interrupt
Process Address Space
• Each process runs in its private address space
  – A process running in User Mode refers to private stack, data, and code areas
  – When running in Kernel Mode, the process addresses the kernel data and code areas and uses a private kernel stack
    • Since several kernel control paths may exist, each kernel control path uses its own private kernel stack
Process Address Space (Cont.)
• Memory sharing
  – Several users may run the same program
    • The code segment is shared by all users
  – Processes may use shared memory as a kind of interprocess communication scheme
• mmap()
  – Maps a file into a part of a process address space
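As an illustration of mapping a file into an address space, here is a minimal sketch using Python's standard `mmap` module, which wraps the mmap() system call on Unix. The temporary file, its contents, and the helper name `mmap_roundtrip` are assumptions made for the example.

```python
import mmap
import tempfile


def mmap_roundtrip():
    """Map a small file, modify it through memory, and read it back."""
    with tempfile.TemporaryFile() as f:
        f.write(b"hello kernel")
        f.flush()
        # Length 0 maps the whole file; the default mapping is shared and writable.
        with mmap.mmap(f.fileno(), 0) as view:
            first = bytes(view[:5])     # read file bytes as ordinary memory
            view[:5] = b"HELLO"         # write to memory, not with write()
            view.flush()                # push the modified pages back to the file
        f.seek(0)
        return first, f.read()


before, after = mmap_roundtrip()
print(before, after)
```

The file content changes even though the program never calls write() after the mapping is created, because the mapped pages and the file are the same data.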
Synchronization and Critical Regions
• Implementing a reentrant kernel requires synchronization
  – Different kernel control paths may access the same kernel data structure
  – When the outcome depends on how the paths interleave, this is called a race condition
  – The section of code that accesses the shared data is called a critical region
Synchronization and Critical Regions (Cont.)
• Possible solution
  – Atomic operation
    • Read and decrement a variable with a single, noninterruptible operation
  – However, many kernel data structures, e.g., linked lists, cannot be accessed with a single operation
Synchronization and Critical Regions (Cont.)
• Solutions
  – Nonpreemptive kernels
  – Interrupt disabling
  – Spin locks
Nonpreemptive Kernels
• When a process executes in Kernel Mode, it cannot be arbitrarily suspended and replaced by another process
• Ineffective on multiprocessor systems
  – Two kernel control paths running on different CPUs may concurrently access the same data structure
Interrupt Disabling
• Approach
  – Disable interrupts
  – Enter the critical region
  – Reenable interrupts
• Inefficient if the critical region is large, since interrupts stay disabled for a long time
• Does not work on a multiprocessor system
Semaphores
• Semaphore
  – A counter
  – Has two atomic operations
    • Up
    • Down
• Each semaphore is associated with a list of waiting processes
  – It links the processes that are blocked on the semaphore
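The counter with its up and down operations can be tried out at user level. The sketch below uses Python's `threading.Semaphore`, where `acquire()` plays the role of down and `release()` the role of up; it is an analogy with threads, not the kernel's semaphore implementation, and the names `slots`, `use_resource`, and `max_active` are hypothetical.

```python
import threading
import time

slots = threading.Semaphore(2)    # counter starts at 2: two holders allowed
state_lock = threading.Lock()     # protects the bookkeeping below
active = 0
max_active = 0


def use_resource() -> None:
    global active, max_active
    slots.acquire()               # "down": blocks while the counter is 0
    try:
        with state_lock:
            active += 1
            max_active = max(max_active, active)
        time.sleep(0.01)          # pretend to use the protected resource
        with state_lock:
            active -= 1
    finally:
        slots.release()           # "up": wakes one blocked waiter, if any


threads = [threading.Thread(target=use_resource) for _ in range(6)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print("highest number of simultaneous holders:", max_active)
```

Threads that find the counter at zero are put to sleep rather than kept running, which is the behavior the waiting list on the slide provides.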
Spin Locks
• On multiprocessor systems, semaphores may be inefficient if the time required to update the data structures is short
  – Inserting a process into a semaphore list is relatively expensive
• Spin lock
  – Executes a tight instruction loop until the lock becomes open
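The busy-wait idea can be sketched as follows. This is a user-level illustration in Python, assuming a non-blocking `threading.Lock.acquire` call as a stand-in for the atomic test-and-set instruction a real kernel would use; the `SpinLock` class and the other names are hypothetical.

```python
import threading


class SpinLock:
    """Toy spin lock: retry in a tight loop instead of sleeping."""

    def __init__(self) -> None:
        # Stand-in for an atomic test-and-set bit (an assumption of this sketch).
        self._flag = threading.Lock()

    def acquire(self) -> None:
        while not self._flag.acquire(blocking=False):
            pass                  # spin until the lock becomes open

    def release(self) -> None:
        self._flag.release()


spin = SpinLock()
hits = 0


def bump(times: int) -> None:
    global hits
    for _ in range(times):
        spin.acquire()
        try:
            hits += 1             # very short critical region
        finally:
            spin.release()


workers = [threading.Thread(target=bump, args=(500,)) for _ in range(3)]
for w in workers:
    w.start()
for w in workers:
    w.join()

print(hits)
```

Spinning only pays off when the lock is held briefly, as in the short update above; for long waits, putting the waiter to sleep is the cheaper choice.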
Signals
• Signal: a mechanism to notify processes of system events
  – Asynchronous notification
  – Synchronous errors or exceptions
• A process may react to a signal by
  – Ignoring the signal
  – Asynchronously executing a specified procedure (the signal handler)
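Both reactions can be tried from user space. The following sketch uses Python's standard `signal` module on a Unix system and must run in the main thread; the choice of SIGUSR1 and SIGUSR2 and the handler name `on_usr1` are assumptions made for the example.

```python
import os
import signal
import time

received = []


def on_usr1(signum, frame):
    # Signal handler: runs asynchronously with respect to the main flow.
    received.append(signum)


old_usr1 = signal.signal(signal.SIGUSR1, on_usr1)          # install a handler
old_usr2 = signal.signal(signal.SIGUSR2, signal.SIG_IGN)   # ignore this signal

os.kill(os.getpid(), signal.SIGUSR2)   # ignored: nothing happens
os.kill(os.getpid(), signal.SIGUSR1)   # handled: on_usr1 is called

# Wait briefly in case the handler has not run yet.
deadline = time.monotonic() + 2.0
while not received and time.monotonic() < deadline:
    time.sleep(0.01)

# Restore the previous dispositions where they are known.
for sig, old in ((signal.SIGUSR1, old_usr1), (signal.SIGUSR2, old_usr2)):
    if old is not None:
        signal.signal(sig, old)

print(received)
```

Without the two `signal.signal` calls, the default action for either signal would terminate the process.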
Signals (Cont.)
• A set of defined signals
  – 1) SIGHUP 2) SIGINT 3) SIGQUIT
  – 4) SIGILL 5) SIGTRAP 6) SIGIOT
  – 7) SIGBUS 8) SIGFPE 9) SIGKILL
  – 10) SIGUSR1 11) SIGSEGV 12) SIGUSR2
  – 13) SIGPIPE 14) SIGALRM 15) SIGTERM
  – 17) SIGCHLD 18) SIGCONT 19) SIGSTOP
  – 20) SIGTSTP 21) SIGTTIN 22) SIGTTOU
  – 23) SIGURG 24) SIGXCPU 25) SIGXFSZ
  – 26) SIGVTALRM 27) SIGPROF 28) SIGWINCH
  – 29) SIGIO 30) SIGPWR
Signals (Cont.)
• If the process does not specify an action, the kernel performs a default action
  – Terminate the process
  – Core dump and terminate the process
  – Ignore the signal
  – Suspend the process
  – Resume the process’s execution, if it was stopped
Interprocess Communication
• System V IPC
  – Semaphores
  – Message queues
  – Shared memory
Process Management
• Creating a process
  – fork(): creates a new process
  – exec(): loads a new program
• The process that invokes fork() is the parent, while the new process is its child
Process Management (Cont.)
• The original implementation of fork() duplicated both the parent’s data and code and assigned the copies to the child
  – Current kernels use a Copy-On-Write approach
• _exit(): terminates a process
  – The kernel releases the resources owned by the process
  – The kernel sends a SIGCHLD signal to the parent process
Zombie Process
• The wait() system call allows a process to wait until one of its children terminates
  – It returns the process ID (PID) of the terminated child
• Zombie process
  – A process that has terminated but whose parent has not yet executed a wait() system call
  – It still holds its task_struct data structure
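The fork(), _exit(), and wait() sequence described on these slides can be sketched from user space with Python's `os` module, which exposes the same system calls on Unix. The helper name `run_child` and the exit code 7 are arbitrary choices for the example.

```python
import os


def run_child(exit_code: int):
    """Fork a child that exits at once, then reap it from the parent."""
    data = ["parent data"]
    pid = os.fork()
    if pid == 0:
        # Child: works on its own copy of the address space, so this
        # change is never seen by the parent.
        data.append("child only")
        os._exit(exit_code)       # terminate without running parent cleanup
    # Parent: between the child's _exit() and this call the child is a zombie.
    reaped_pid, status = os.waitpid(pid, 0)
    return pid, reaped_pid, os.WEXITSTATUS(status), data


child_pid, reaped_pid, code, parent_view = run_child(7)
print(child_pid == reaped_pid, code, parent_view)
```

The wait call returns the PID of the terminated child along with its exit status, after which the kernel can release the child's remaining bookkeeping.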
Zombie Process (Cont.)
• The related data structure is not released until the wait() call
• But what if a parent terminates without issuing a wait() call?
  – E.g., the parent starts a child in the background and then exits
Zombie Process (Cont.)
• The solution relies on the init process
  – Created during system initialization
• When a process is left with no parent
  – Its parent is changed to init
• init then routinely issues wait() system calls
Process Group
• Unix introduces the notion of process groups to represent a job
  – $ ls | sort | more
  – The process group consists of three processes: ls, sort, more
• Login session
Memory Management
• Virtual memory
Random Access Memory Usage
• RAM
  – A few megabytes are dedicated to storing the kernel image (kernel code and kernel static data structures)
Random Access Memory Usage (Cont.)
  – The remaining RAM is used to
    • Satisfy kernel requests for buffers, descriptors, and other dynamic kernel data structures
    • Satisfy process requests for generic memory areas and for memory mapping of files
    • Cache data from the hard disk
Kernel Memory Allocator
• Satisfies requests for memory areas from all parts of the system
  – Both the kernel and user applications
• Features
  – Fast
  – Minimizes the amount of wasted memory
  – Reduces the memory fragmentation problem
  – Cooperates with other memory management subsystems
Process Virtual Address Space Handling
• A process’s address space
  – Contains all the virtual memory addresses that the process is allowed to reference
  – Stored as a list of memory area descriptors
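Process’s Virtual Memory
[Figure: the mm field of task_struct points to an mm_struct (count, pgd, mmap, mmap_avl, mmap_sem); the mmap field leads to a list of vm_area_struct descriptors (vm_end, vm_start, vm_flags, vm_inode, vm_ops, vm_next) chained through vm_next, one for the code area and one for the data area]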
Process Virtual Address Space Handling (Cont.)
• A process’s virtual address space contains
  – The executable code of the program
  – The initialized data of the program
  – The uninitialized data of the program
  – The initial program stack (i.e., the User Mode stack)
  – The executable code and data of needed shared libraries
  – The heap (for memory dynamically requested by the program)
Process Virtual Address Space Handling (Cont.)
• Demand paging
  – Loads virtual pages into memory as they are accessed
    • The kernel must decide whether the page is in the swap file or elsewhere on disk
Swapping and Caching
• Swap area on disk
• Use physical memory as a cache
  – Defers writing to disk
  – The sync() system call forces disk synchronization by writing all dirty pages
  – All operating systems also periodically write dirty pages to disk
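The deferred-write behavior can be observed from user space. The sketch below uses Python's `os.sync()`, which wraps the sync() system call on Unix, together with `os.fsync()`, a per-file variant that the slides do not mention and that is added here only for comparison; the helper name `write_durably` and the temporary file are assumptions.

```python
import os
import tempfile


def write_durably(data: bytes) -> int:
    """Write data, then force the cached (dirty) pages out to disk."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, data)    # data first lands in the in-memory cache
        os.fsync(fd)          # flush the dirty pages of this one file
        os.sync()             # flush all dirty pages system-wide, like sync()
        return os.path.getsize(path)
    finally:
        os.close(fd)
        os.unlink(path)


size = write_durably(b"dirty page")
print(size)
```

Without an explicit flush the data would still reach the disk eventually, through the periodic write-back noted above.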
Device Drivers