Section 3.1. Kernel Organization

   



The FreeBSD kernel can be viewed as a service provider to user processes. Processes usually access these services through system calls. Some services, such as process scheduling and memory management, are implemented as processes that execute in kernel mode or as routines that execute periodically within the kernel. In this chapter, we describe how kernel services are provided to user processes, and some of the ancillary processing done by the kernel. Then we describe the basic kernel services provided by FreeBSD and provide details of their implementation.

System Processes

All FreeBSD user-level processes originate from a single process that is crafted by the kernel at startup. Table 3.1 lists the processes that are created immediately and exist for as long as the system runs. They are kernel processes, and they function wholly within the kernel. (Kernel processes execute code that is compiled into the kernel's load image and operate with the kernel's privileged execution mode.) The kernel also starts a kernel process for each device to handle interrupts for that device.

Table 3.1. Permanent kernel processes.

Name        Description
----------  ------------------------------------------------------------
idle        runs when there is nothing else to do
swapper     schedules the loading of processes in main memory from
            secondary storage when system resources become available
vmdaemon    schedules the transfer of whole processes from main memory
            to secondary storage when system resources are low
pagedaemon  writes parts of the address space of a process to secondary
            storage to support the paging facilities of the
            virtual-memory system
pagezero    maintains a supply of zeroed pages
bufdaemon   maintains a supply of clean buffers by writing out dirty
            buffers when the supply of clean buffers gets too low
syncer      ensures that dirty file data is written after 30 seconds
ktrace      writes system call tracing records to their output file
vnlru       maintains a supply of free vnodes by cleaning up the least
            recently used ones
random      collects entropy data to supply seeding for kernel random
            numbers and the /dev/random device
g_event     handles configuration tasks including discovery of new
            devices and cleanup of devices when they disappear
g_up        handles data coming from device drivers and being delivered
            to processes
g_down      handles data coming from processes and being delivered to
            device drivers

After creating the kernel processes, the kernel creates the first process to execute a program in user mode; it serves as the parent process for all subsequent processes. The first user-mode process is the init process (historically, process 1). This process does administrative tasks, such as spawning getty processes for each terminal on a machine, collecting exit status from orphaned processes, and handling the orderly shutdown of a system from multiuser to single-user operation. The init process is a user-mode process, running outside the kernel (see Section 14.6).

System Entry

Entrances into the kernel can be categorized according to the event or action that initiates them:

  • Hardware interrupt

  • Hardware trap

  • Software-initiated trap

Hardware interrupts arise from external events, such as an I/O device needing attention or a clock reporting the passage of time. (For example, the kernel depends on the presence of a real-time clock or interval timer to maintain the current time of day, to drive process scheduling, and to initiate the execution of system timeout functions.) Hardware interrupts occur asynchronously and may not relate to the context of the currently executing process.

Hardware traps may be either synchronous or asynchronous but are related to the currently executing process. Examples of hardware traps are those generated as a result of an illegal arithmetic operation, such as divide by zero.

Software-initiated traps are used by the system to force the scheduling of an event, such as process rescheduling or network processing, as soon as possible. Software-initiated traps are implemented by setting a flag that is checked whenever a process is preparing to exit from the kernel. If the flag is set, the software-interrupt code is run instead of exiting from the kernel.
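The flag-checking mechanism described above can be modeled in a few lines of C. This is a user-space sketch only, not kernel code: the names soft_request, return_to_user, struct cpu_state, and the event constants are hypothetical, and the "handlers" merely count how often they ran.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical event bits; a real kernel defines its own set. */
enum soft_event {
    SOFT_RESCHED = 1u << 0,  /* process rescheduling requested */
    SOFT_NET     = 1u << 1   /* network processing requested */
};

struct cpu_state {
    unsigned pending;       /* bitmask of requested software events */
    unsigned resched_runs;  /* times the reschedule handler ran */
    unsigned net_runs;      /* times the network handler ran */
};

/* Requesting an event only sets a flag; no work is done here. */
void soft_request(struct cpu_state *cpu, enum soft_event ev)
{
    assert(cpu != NULL);
    cpu->pending |= (unsigned)ev;
}

/*
 * Called when a process is about to leave the kernel. While any flag
 * is set, the corresponding software-interrupt code runs before the
 * exit completes. The loop repeats because a handler could itself
 * request more work. Returns true if any handler ran.
 */
bool return_to_user(struct cpu_state *cpu)
{
    bool ran = false;

    assert(cpu != NULL);
    while (cpu->pending != 0) {
        unsigned todo = cpu->pending;

        cpu->pending = 0;
        if (todo & SOFT_RESCHED)
            cpu->resched_runs++;
        if (todo & SOFT_NET)
            cpu->net_runs++;
        ran = true;
    }
    return ran;
}
```

In this model, two requests for the same event before the next kernel exit collapse into a single handler run, because the flag records only that work is pending, not how much.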

System calls are a special case of a software-initiated trap: the machine instruction used to initiate a system call typically causes a hardware trap that is handled specially by the kernel.

Run-Time Organization

The kernel can be logically divided into a top half and a bottom half, as shown in Figure 3.1. The top half of the kernel provides services to processes in response to system calls or traps. This software can be thought of as a library of routines shared by all processes. The top half of the kernel executes in a privileged execution mode, in which it has access both to kernel data structures and to the context of user-level processes.

The context of each process is contained in two areas of memory reserved for process-specific information. The first of these areas is the process structure, which has historically contained the information that is necessary even if the process has been swapped out. In FreeBSD, this information includes the identifiers associated with the process, the process's rights and privileges, its descriptors, its memory map, pending external events and associated actions, maximum and current resource utilization, and many other things. The second is the user structure, which has historically contained the information that is not necessary when the process is swapped out. In FreeBSD, the user-structure information of each process includes the hardware thread control block (TCB), process accounting and statistics, and minor additional information for debugging and creating a core dump.

Deciding what was to be stored in the process structure and what in the user structure was far more important in previous systems than it is in FreeBSD. As memory became a less limited resource, most of the user structure was merged into the process structure for convenience; see Section 4.2.

Figure 3.1. Run-time structure of the kernel.


The bottom half of the kernel comprises routines that are invoked to handle hardware interrupts. Activities in the bottom half of the kernel are synchronous with respect to the interrupt source but asynchronous with respect to the top half, and the software cannot depend on having a specific (or any) process running when an interrupt occurs. Thus, the state information for the process that initiated the activity is not available. The top and bottom halves of the kernel communicate through data structures, generally organized around work queues.

The FreeBSD kernel is never preempted to run another user process while executing in the top half of the kernel (for example, while executing a system call), although it will explicitly give up the processor if it must wait for an event or for a shared resource. Its execution may be interrupted, however, by interrupts for the bottom half of the kernel. When an interrupt is received, the kernel process that handles that device is scheduled to run. Normally these device-interrupt processes have a higher priority than user processes or processes running in the top half of the kernel. Thus, when an interrupt causes a device-interrupt process to be made runnable, it will usually preempt the currently running process.

When a process running in the top half of the kernel wants to add an entry to the work list for a device, it needs to ensure that it will not be preempted by that device part way through linking the new element onto the work list. In FreeBSD 5.2, the work list is protected by a mutex. Any process (top or bottom half) seeking to modify the work list must first obtain the mutex. Once the mutex is held, any other process seeking to obtain it will wait until the holder has finished modifying the list and released the mutex.
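The locking discipline for a work list can be illustrated with a small user-space analogue. The sketch below assumes POSIX threads as a stand-in for the kernel's own mutex primitives, and the types and functions (struct work_item, struct work_list, work_list_enqueue, work_list_dequeue) are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* One pending request on a device's work list (illustrative layout). */
struct work_item {
    int request_id;
    struct work_item *next;
};

/* A FIFO work list whose links may only be changed with lock held. */
struct work_list {
    pthread_mutex_t lock;
    struct work_item *head;
    struct work_item *tail;
    size_t length;
};

void work_list_init(struct work_list *wl)
{
    assert(wl != NULL);
    pthread_mutex_init(&wl->lock, NULL);
    wl->head = NULL;
    wl->tail = NULL;
    wl->length = 0;
}

/* Top-half side: link a new element onto the tail of the list. */
void work_list_enqueue(struct work_list *wl, struct work_item *item)
{
    assert(wl != NULL && item != NULL);
    item->next = NULL;

    pthread_mutex_lock(&wl->lock);
    if (wl->tail != NULL)
        wl->tail->next = item;
    else
        wl->head = item;
    wl->tail = item;
    wl->length++;
    pthread_mutex_unlock(&wl->lock);
}

/* Bottom-half side: unlink the oldest element, or NULL if empty. */
struct work_item *work_list_dequeue(struct work_list *wl)
{
    struct work_item *item;

    assert(wl != NULL);
    pthread_mutex_lock(&wl->lock);
    item = wl->head;
    if (item != NULL) {
        wl->head = item->next;
        if (wl->head == NULL)
            wl->tail = NULL;
        item->next = NULL;
        wl->length--;
    }
    pthread_mutex_unlock(&wl->lock);
    return item;
}
```

Because both sides take the same mutex before touching head, tail, or next, neither can observe the list while the other is part way through relinking it.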

Processes cooperate in the sharing of system resources, such as the disks and memory. The top and bottom halves of the kernel also work together in implementing certain system operations, such as I/O. Typically, the top half will start an I/O operation, and then relinquish the processor; then the requesting process will sleep, awaiting notification from the bottom half that the I/O request has completed.
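The division of labor for an I/O request can be sketched as a single-threaded state model. Everything here is hypothetical (struct io_request, struct device, start_io, device_interrupt), and a real driver would queue requests rather than refuse them when busy. The point of the sketch is that the bottom half learns which process to wake from the request record, not from whatever process happens to be running.

```c
#include <assert.h>
#include <stddef.h>

enum proc_state { PROC_RUNNING, PROC_SLEEPING, PROC_RUNNABLE };

/* A request record shared between the two halves. */
struct io_request {
    int block;                /* which block to transfer (illustrative) */
    int done;                 /* set by the bottom half on completion */
    enum proc_state *waiter;  /* state of the process to wake */
};

/* A device that can service one request at a time in this model. */
struct device {
    struct io_request *active;
};

/*
 * Top half: start the operation, then mark the caller as sleeping,
 * which models relinquishing the processor. Returns 0 on success or
 * -1 if the device is already busy.
 */
int start_io(struct device *dev, struct io_request *req,
             enum proc_state *self)
{
    assert(dev != NULL && req != NULL && self != NULL);
    if (dev->active != NULL)
        return -1;
    req->done = 0;
    req->waiter = self;
    dev->active = req;
    *self = PROC_SLEEPING;
    return 0;
}

/*
 * Bottom half: runs when the device interrupts. It finds the waiting
 * process through the request and makes that process runnable again.
 */
void device_interrupt(struct device *dev)
{
    struct io_request *req;

    assert(dev != NULL);
    req = dev->active;
    if (req == NULL)
        return;  /* spurious interrupt: nothing was outstanding */
    dev->active = NULL;
    req->done = 1;
    *req->waiter = PROC_RUNNABLE;
}
```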

Entry to the Kernel

When a process enters the kernel through a trap or an interrupt, the kernel must save the current machine state before it begins to service the event. For the PC, the machine state that must be saved includes the program counter, the user stack pointer, the general-purpose registers, and the processor status longword. The PC trap instruction saves the program counter and the processor status longword as part of the exception stack frame; the user stack pointer and registers must be saved by the software trap handler. If the machine state were not fully saved, the kernel could change values in the currently executing program in improper ways. Since interrupts may occur between any two user-level instructions (and on some architectures between parts of a single instruction), and because they may be completely unrelated to the currently executing process, an incompletely saved state could cause correct programs to fail in mysterious and not easily reproducible ways.

The exact sequence of events required to save the process state is completely machine dependent, although the PC provides a good example of the general procedure. A trap or system call will trigger the following events:

  • The hardware switches into kernel (supervisor) mode, so that memory-access checks are made with kernel privileges, references to the stack use the per-process kernel stack, and privileged instructions can be executed.

  • The hardware pushes onto the per-process kernel stack the program counter, processor status longword, and information describing the type of trap. (On architectures other than the PC, this information can include the system-call number and general-purpose registers as well.)

  • An assembly-language routine saves all state information not saved by the hardware. On the PC, this information includes the general-purpose registers and the user stack pointer, also saved onto the per-process kernel stack.

After this preliminary state saving, the kernel calls a C routine that can freely use the general-purpose registers as any other C routine would, without concern about changing the unsuspecting process's state.

There are three major kinds of handlers, corresponding to particular kernel entries:

  1. syscall() for a system call

  2. trap() for hardware traps and for software-initiated traps other than system calls

  3. The appropriate device-driver interrupt handler for a hardware interrupt

Each type of handler takes its own specific set of parameters. For a system call, they are the system-call number and an exception frame. For a trap, they are the type of trap, the relevant floating-point and virtual-address information related to the trap, and an exception frame. (The exception-frame arguments for the trap and system call are not the same. The PC hardware saves different information based on different types of traps.) For a hardware interrupt, the only parameter is a unit (or board) number.
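The way each kind of entry reaches a handler with its own parameter list can be sketched as a dispatch routine. The handler names and signatures below (sketch_syscall, sketch_trap, sketch_interrupt) are hypothetical and deliberately do not reproduce the kernel's actual prototypes; the floating-point information mentioned for traps is omitted to keep the example short.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the machine-dependent exception frame. */
struct exception_frame {
    unsigned long pc;
    unsigned long status;
};

enum entry_kind { ENTRY_SYSCALL, ENTRY_TRAP, ENTRY_INTERRUPT };

/* Values returned by the handlers so a caller can see which one ran. */
enum handled_by { BY_NONE, BY_SYSCALL, BY_TRAP, BY_INTERRUPT };

/* Description of one kernel entry, filled in by the low-level stub. */
struct kernel_entry {
    enum entry_kind kind;
    struct exception_frame *frame;  /* unused for interrupts */
    int code;                       /* syscall number, trap type, or unit */
    unsigned long fault_addr;       /* traps only */
};

/* System call: the system-call number and an exception frame. */
int sketch_syscall(int number, struct exception_frame *frame)
{
    (void)number;
    assert(frame != NULL);
    return BY_SYSCALL;
}

/* Trap: the trap type, address information, and an exception frame. */
int sketch_trap(int type, unsigned long fault_addr,
                struct exception_frame *frame)
{
    (void)type;
    (void)fault_addr;
    assert(frame != NULL);
    return BY_TRAP;
}

/* Hardware interrupt: only a unit (or board) number. */
int sketch_interrupt(int unit)
{
    (void)unit;
    return BY_INTERRUPT;
}

/* Route an entry to the handler for its kind. */
int dispatch_entry(const struct kernel_entry *e)
{
    assert(e != NULL);
    switch (e->kind) {
    case ENTRY_SYSCALL:
        return sketch_syscall(e->code, e->frame);
    case ENTRY_TRAP:
        return sketch_trap(e->code, e->fault_addr, e->frame);
    case ENTRY_INTERRUPT:
        return sketch_interrupt(e->code);
    }
    return BY_NONE;  /* unknown kind */
}
```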

Return from the Kernel

When the handling of the system entry is completed, the user-process state is restored, and control returns to the user process. Returning to the user process reverses the process of entering the kernel.

  • An assembly-language routine restores the general-purpose registers and user-stack pointer previously pushed onto the stack.

  • The hardware restores the program counter and processor status longword, and switches to user mode, so that future references to the stack pointer use the user's stack pointer, privileged instructions cannot be executed, and memory-access checks are done with user-level privileges.

Execution then resumes at the next instruction in the user's process.
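The entry steps described earlier and the return steps above can be modeled together as a save and restore round trip. This is a sketch under stated assumptions: the register count, struct cpu, struct trap_frame, and the four functions are hypothetical, and the split between "hardware" and "software" work follows the PC description in the text rather than any real instruction set.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NREGS 8  /* illustrative number of general-purpose registers */

enum cpu_mode { MODE_USER, MODE_KERNEL };

/* Model of the processor state visible to the running process. */
struct cpu {
    enum cpu_mode mode;
    uintptr_t pc;           /* program counter */
    uintptr_t psl;          /* processor status longword */
    uintptr_t sp;           /* user stack pointer */
    uintptr_t regs[NREGS];  /* general-purpose registers */
};

/* Model of the frame built on the per-process kernel stack. */
struct trap_frame {
    int       trap_type;    /* pushed by hardware */
    uintptr_t pc;           /* pushed by hardware */
    uintptr_t psl;          /* pushed by hardware */
    uintptr_t usp;          /* saved by software */
    uintptr_t regs[NREGS];  /* saved by software */
};

/* Entry, hardware part: switch mode and push pc, psl, and trap type. */
void hw_trap(struct cpu *cpu, struct trap_frame *tf, int trap_type)
{
    assert(cpu != NULL && tf != NULL);
    cpu->mode = MODE_KERNEL;
    tf->trap_type = trap_type;
    tf->pc = cpu->pc;
    tf->psl = cpu->psl;
}

/* Entry, software part: save what the hardware did not. */
void sw_save(const struct cpu *cpu, struct trap_frame *tf)
{
    assert(cpu != NULL && tf != NULL);
    tf->usp = cpu->sp;
    for (size_t i = 0; i < NREGS; i++)
        tf->regs[i] = cpu->regs[i];
}

/* Return, software part: restore registers and user stack pointer. */
void sw_restore(struct cpu *cpu, const struct trap_frame *tf)
{
    assert(cpu != NULL && tf != NULL);
    for (size_t i = 0; i < NREGS; i++)
        cpu->regs[i] = tf->regs[i];
    cpu->sp = tf->usp;
}

/* Return, hardware part: restore pc and psl, drop to user mode. */
void hw_return(struct cpu *cpu, const struct trap_frame *tf)
{
    assert(cpu != NULL && tf != NULL);
    cpu->pc = tf->pc;
    cpu->psl = tf->psl;
    cpu->mode = MODE_USER;
}
```

Between sw_save and sw_restore, code in this model may overwrite any field of struct cpu without affecting what the user process sees after hw_return, which is the property the full state save is meant to guarantee.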


   
 


The Design and Implementation of the FreeBSD Operating System
ISBN: 0201702452
Year: 2003
Pages: 183
