The Boot Process


Solaris™ Operating Environment Boot Camp
By David Rhodes, Dominic Butler
Chapter 2.  Booting and Halting the System


At the OK> prompt we type "boot" to cause the system to begin the process of loading Solaris, which ends with the display of the console login prompt. Here we will look at what actually happens during this process.

Once you have typed your boot command, the boot PROM performs a self-diagnostic test that verifies that the system hardware and memory are functioning correctly. The PROM then loads and runs the primary boot program called bootblk. The job of bootblk is to locate the secondary boot program (ufsboot) on the default (or specified) device, load this into memory, and then execute it. The boot process will then load the Solaris kernel. The kernel initializes itself and then begins to load the kernel modules. It will initially use ufsboot to read these modules, but once the modules needed to mount the root filesystem are loaded the kernel will unmap ufsboot and continue the boot process using its own resources. The kernel will now create the first Solaris process.

Once Solaris is up and running, new processes are created by an already running process being copied in memory and then the copy being overlaid with the new process. This cannot happen until at least one process is running, so the first process has to be created directly in memory (it is said to be "hand crafted"). This first process is the scheduler process. Since it is the first process, it is given a process ID of 0 (zero). The next process to be created is init and, not surprisingly, it gets a process ID of 1. Init is the first Solaris process to be created in the standard way; it can be thought of as the parent of all the other processes, apart from the scheduler. The scheduler process remains special and cannot be treated like other processes. One of its jobs is to swap processes between memory and disk (see Chapter 7 "Swap Space"). If it were a normal process, there would be nothing to prevent it from copying itself out to disk and then there would be nothing left in memory to swap it back in again.

Now that init is running, its first job is to read the configuration file (/etc/inittab) and start the processes defined in it. The boot process completes with init executing the run commands (rc) scripts that perform such tasks as mounting the remainder of the filesystems and starting applications. At this point Solaris is up and running and has full control of the computer. Table 2.1 summarizes the boot process.

Table 2.1. Boot Process Summary

The boot PROM phase:
    The boot PROM runs self-diagnostic tests.
    The boot PROM loads the bootblk program.

The boot programs phase:
    The bootblk program loads the ufsboot program.
    The ufsboot program loads the kernel.

The kernel initialization phase:
    The kernel initializes itself and loads the modules needed to mount the root (/) filesystem.
    The kernel starts the init process.

The init phase:
    The init process reads the /etc/inittab file and runs the system's rc scripts.

Once the boot process has completed, the server is ready for use. At this point it will normally be in multi-user mode. This means that many people are able to use the computer at the same time, usually connecting across a network. The fact that Solaris has a multi-user mode implies that it also has a single-user mode, and this is the case. The mode that the server is in is usually referred to as the run level, and one of the jobs of the init process is to control the system run level. You can find out the current run level using the who command:

 hydrogen# who -r
    .       run-level 3  Aug 28 09:22     3      0  S
 hydrogen#

In this case, we are in run-level 3, which means the system is in multi-user mode and all network services are available. The second field from the right tells us how many times we have been in this run level since the last reboot (in this case, zero times), and the field on the right tells us the last run level the system was in (in this case, single user).
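The fields of the who -r line can be picked apart with a short shell function. The following is a minimal sketch, assuming the output layout shown above (run level in the third field, previous run level last). The function name parse_run_level is ours, and it reads a saved line from standard input so that it can be tried without touching the running system.

```shell
#!/bin/sh
# parse_run_level: read one line of "who -r" output on standard input and
# report the current run level, how many times it has been entered since
# the last reboot, and the previous run level.
# Assumes the field layout shown in the example above.
parse_run_level() {
    awk '$2 == "run-level" {
        printf "current=%s count=%s previous=%s\n", $3, $(NF - 1), $NF
    }'
}

# Example using the line shown earlier.
# On a live system: who -r | parse_run_level
echo '.       run-level 3  Aug 28 09:22     3      0  S' | parse_run_level
```

For the sample line this reports a current run level of 3, a count of 0, and a previous run level of S.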

Run Levels

A run level can be thought of as a predefined known state. Each run level has a set of processes that will be started or stopped when the run level is entered or left. A number of run levels have been predefined by Solaris (and are shown in Table 2.2) but others can be defined and used if required, although I have only worked on one site that has ever used a custom run level. You can change the run level at any time by typing init followed by the level to which you wish to change (e.g., init 2). Table 2.2 describes the predefined Solaris run levels.

Table 2.2. Run Levels

Run Level     Description
0             When the system is in run-level 0 it is shut down and in a state where it is safe to switch the power off.
1 (s or S)    Run-level 1 is single-user mode. Only the console port is available for logging in. Sometimes called "run-level s" (or "S").
2             This is multi-user mode. All network services are running apart from the Network File System (NFS).
3             Run-level 3 is multi-user mode with NFS activated.
4             This run level is defined in the Solaris manual as "not currently supported." It is, however, quite possible to use it to construct your own run level.
5             This run level is similar to 0 in that the operating system has shut down, but if the hardware supports it, the power will also be switched off.
6             This run level causes the server to reboot by first shutting down to run-level 0 then booting up to the default run level.

The init process controls the movement between run levels and will run processes (or kill processes) each time a run level change is made. It knows what to do when a run level changes by looking at instructions in the system file /etc/inittab.

Here we have a copy of the /etc/inittab file from one of our servers:

 ap::sysinit:/sbin/autopush -f /etc/iu.ap
 ap::sysinit:/sbin/soconfig -f /etc/sock2path
 fs::sysinit:/sbin/rcS sysinit  >/dev/msglog 2<>/dev/msglog </dev/console
 is:3:initdefault:
 p3:s1234:powerfail:/usr/sbin/shutdown -y -i5 -g0   >/dev/msglog 2<>/dev/msglog
 sS:s:wait:/sbin/rcS            >/dev/msglog 2<>/dev/msglog </dev/console
 s0:0:wait:/sbin/rc0            >/dev/msglog 2<>/dev/msglog </dev/console
 s1:1:respawn:/sbin/rc1         >/dev/msglog 2<>/dev/msglog </dev/console
 s2:23:wait:/sbin/rc2           >/dev/msglog 2<>/dev/msglog </dev/console
 s3:3:wait:/sbin/rc3            >/dev/msglog 2<>/dev/msglog </dev/console
 s5:5:wait:/sbin/rc5            >/dev/msglog 2<>/dev/msglog </dev/console
 s6:6:wait:/sbin/rc6            >/dev/msglog 2<>/dev/msglog </dev/console
 fw:0:wait:/sbin/uadmin 2 0     >/dev/msglog 2<>/dev/msglog </dev/console
 of:5:wait:/sbin/uadmin 2 6     >/dev/msglog 2<>/dev/msglog </dev/console
 rb:6:wait:/sbin/uadmin 2 1     >/dev/msglog 2<>/dev/msglog </dev/console
 sc:234:respawn:/usr/lib/saf/sac -t 300
 co:234:respawn:/usr/lib/saf/ttymon -g -h -p "`uname -n` console login: " -T sun -d /dev/console -l console -m ldterm,ttcompat

The file contains one record per line and each record contains colon-separated fields. The field contents are described in Table 2.3.

Table 2.3. /etc/inittab Fields

Field      Description
tag        This field contains a name for each entry (it does not need to be unique).
rstate     This specifies the run level in which the process should be executed. You can specify more than one run level if required.
action     This describes how init should run the specified process when the run level is entered.
           once: Start this process (if it is not already running) and move straight on to the next inittab entry. If it terminates do not restart it.
           respawn: Start the process (if it isn't already running); if it ever terminates while in this run level, restart it.
           wait: Start the process and wait until it terminates before moving on to the next inittab entry. (If this process hangs, you can be in big trouble since init will not do anything else until it has completed.)
           boot: These entries will only be dealt with during init's initial boot time read of the inittab file. The process will be started then init will move on to the next entry without waiting. These entries are useful if you need to initialize something (e.g., a hardware device) each time the system reboots.
           bootwait: Any "bootwait" entry will be processed by init when the system first moves from single-user mode to multi-user mode. Each process listed will be started and init will wait for it to terminate before carrying on.
           sysinit: These entries will only be run during bootup and before the console login has appeared. Init will wait for these entries to complete before moving on.
           powerfail: Execute this process when the init command receives the power fail signal (SIGPWR).
           powerwait: Execute this process when the init command receives the power fail signal (SIGPWR), but wait until it has completed before continuing to process the inittab file.
           off: If the process is already running, init will kill it. If it is not running, init will do nothing.
           ondemand: Any on-demand entries in inittab are treated in exactly the same way as "respawn" entries, but they are not tied to a particular run level. See below for more information on these entries.
           initdefault: This entry is only read the first time init looks through the file. It tells init which run level we want it to take the system to.
process    This is the process to run based on the rules for that record.
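To see how the colon-separated fields fit together, the records that apply to a given run level can be pulled out with awk. This is only a sketch: the function name entries_for_level and the sample file are ours, and treating an empty rstate as "every run level" is an assumption taken from our reading of inittab(4). Note that the process field can itself contain colons (the console entry does), so the sketch rejoins everything after the third field. The -v option used here may not be accepted by the older /usr/bin/awk on Solaris, in which case nawk should work.

```shell
#!/bin/sh
# entries_for_level LEVEL FILE
# Print "tag|action|process" for each inittab record in FILE whose rstate
# field names LEVEL. Records with an empty rstate are treated as applying
# to every run level (an assumption based on inittab(4)).
entries_for_level() {
    awk -F: -v lvl="$1" '
        /^#/ || NF < 3 { next }              # skip comments and short lines
        $2 == "" || index($2, lvl) > 0 {
            proc = $4
            # The process field may itself contain colons, so rejoin it.
            for (i = 5; i <= NF; i++) proc = proc ":" $i
            printf "%s|%s|%s\n", $1, $3, proc
        }' "$2"
}

# Try it on a small sample file rather than the live /etc/inittab.
# The console entry is a shortened, illustrative version of the real one.
sample=${TMPDIR:-/tmp}/inittab.sample.$$
cat > "$sample" <<'EOF'
is:3:initdefault:
s2:23:wait:/sbin/rc2
s3:3:wait:/sbin/rc3
co:234:respawn:/usr/lib/saf/ttymon -p "console login: " -d /dev/console
EOF
entries_for_level 2 "$sample"
rm -f "$sample"
```

For run-level 2 the sample produces the s2 and co records; asking for run-level 3 would add the is and s3 records as well.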

There is one entry in /etc/inittab that does not conform to the above specification and that is the "initdefault" record. This simply tells the init process what run level to take the system to at system boot-time. It is not a good idea to set the default run level to 0 or 6. The former would cause Solaris to shut itself down every time it booted up, while the latter would cause the system to reboot itself continuously (which would make a very interesting problem to try and troubleshoot).
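Because a bad initdefault entry is so disruptive, it can be worth checking the file after editing it. The sketch below is one way to do that; the function names default_run_level and check_default_run_level are ours, and the check covers only the two values warned about above.

```shell
#!/bin/sh
# default_run_level FILE: print the rstate of the first initdefault record.
default_run_level() {
    awk -F: '$3 == "initdefault" { print $2; exit }' "$1"
}

# check_default_run_level FILE: return non-zero if the default run level is
# missing, or is one that would halt (0) or reboot (6) on every boot.
check_default_run_level() {
    level=$(default_run_level "$1")
    case $level in
    0|6)
        echo "warning: default run level $level would prevent a normal boot" >&2
        return 1
        ;;
    "")
        echo "warning: no initdefault record found" >&2
        return 1
        ;;
    *)
        echo "default run level is $level"
        ;;
    esac
}

# Typical use after editing the file:
#   check_default_run_level /etc/inittab
```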

When we want to change to a new run level, we simply type init followed by the new run level, and init reads the inittab file to see which processes it needs to run when it moves to that level. When it has completed this task, the run level is set to the one specified. This saves us, as administrators, from needing to know what processes should be running in specific situations: every time we go to a particular run level, init runs the processes defined for that run level in inittab.

We can also control processes from inittab without changing the run level. These are the on-demand processes. If an entry in the inittab file has its action field set to "ondemand," it is treated in a similar way to one that has "respawn" in that field. The difference is that an on-demand job does not have a run level in its rstate field; instead it has one of the letters a, b, or c. If you type init followed by the letter a, b, or c, init will run any on-demand jobs with the same letter in their rstate field as though they were respawn jobs. Since a, b, and c are not valid run levels, the run level does not actually change, so no other entries in inittab are acted upon.

If you make a change to the inittab file, init will not act upon the change until the next time it reads the file. It will normally only read the file at system startup or when the current run level is changed, but you can tell it to read the file with the command init q.

RC Scripts

You may have noticed that the example inittab file contains a number of entries that run programs named /sbin/rcX, where the "X" represents a run level. These programs are actually shell scripts, and each is defined as running only at its own run level (apart from /sbin/rc2, which we will look at in a moment). The purpose of these scripts is to start and stop all the processes needed when a new run level is entered. Rather than doing this directly, which would mean updating them whenever something new needed starting, they call a series of separate scripts. These scripts are located in the subdirectories of /etc named rcX.d, where the "X," again, refers to a run level. This means that if we want to add a new process that needs starting when a specific run level is entered, we do not need to change the rcX script. We can simply create a new script in the appropriate /etc/rcX.d directory. We will look at how to add a new rc script later.

The scripts located within these directories have names that follow the format Snnscriptname or Knnscriptname. The S scripts are for starting processes or applications, and the K scripts are for killing (or stopping) them; both sets in a directory are executed when the corresponding run level is entered. The "nn" is a number that controls the order in which the scripts run. The exact order can be seen by running ls | more in the relevant rc directory. During a change of run level, init calls the /sbin/rcX script, which in turn runs all the K scripts and S scripts in the equivalent /etc/rcX.d directory. The scripts are always run in ascending order according to the number after the S or K, and the K scripts are run first, followed by the S scripts.
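The way an /sbin/rcX script walks its directory can be sketched in a few lines of shell. This is a simplified illustration rather than a copy of the real Solaris scripts, which do more; the function name run_rc_dir and the scratch directory are ours. It relies on the shell expanding K* and S* in sorted order, which matches the numeric order as long as every name uses a two-digit number.

```shell
#!/bin/sh
# run_rc_dir DIR
# Run every K script in DIR with "stop", then every S script with "start".
# Empty files, and patterns that match nothing, are skipped.
run_rc_dir() {
    dir=$1
    for f in "$dir"/K*; do
        if [ -s "$f" ]; then /bin/sh "$f" stop; fi
    done
    for f in "$dir"/S*; do
        if [ -s "$f" ]; then /bin/sh "$f" start; fi
    done
    return 0
}

# Demonstration in a scratch directory: each dummy script just reports its
# own name and the parameter it was called with.
demo=${TMPDIR:-/tmp}/rcdemo.$$
mkdir "$demo"
for name in S20second K10first S05early; do
    echo 'echo "$0 $1"' > "$demo/$name"
done
run_rc_dir "$demo"
rm -rf "$demo"
```

The demonstration runs K10first with "stop" first, then S05early and S20second with "start," regardless of the order in which the files were created.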

You may notice that certain script names exist in more than one rc directory, and you will also notice that these files are actually linked. In fact, every script in an rc directory should be linked to a script in the directory /etc/init.d. This directory is the master directory for scripts called when init changes run levels. The fact that they are linked means that if you need to change any script, it only needs doing once. If you look even more closely at the links, you will see that the S scripts are also linked to the K scripts. When the appropriate /sbin/rcX script calls an S script it supplies the single parameter "start," and when it calls a K script it uses "stop" as the parameter. This again enables a single file to be edited regardless of whether changes need to be made to the way something is started, stopped, or both.

You may find that the K script is often located in a different directory to the S script. The directory containing the S script tells you which run level you need to enter to start the applications or subsystem, and the directory containing the K script tells you which run level you need to enter in order for the application or subsystem to be shut down.

We saw that in the inittab file the entry for the /sbin/rc2 script is set to run in both run-level 2 and run-level 3. This is because run-levels 2 and 3 are similar. Run-level 2 is referred to as multi-user mode without NFS running, and run level 3 is multi-user mode, with NFS running. The directory /etc/rc2.d contains the scripts that will be run when the system goes to run-level 2, but the directory /etc/rc3.d only contains the scripts to get NFS (and possibly one or two other subsystems) running. Therefore, if we enter run-level 2 we want all the S scripts in /etc/rc2.d to run (as would be expected), but if we enter run-level 3 we want to run the run-level 2 scripts (if they haven't already been run) followed by the run-level 3 scripts. If you look in the /sbin/rcX scripts, you will see that as well as knowing the run level being moved to, they also know the run level being moved from (they are stored in shell variables). So if you move from run-level 2 to run-level 3, only the scripts in /etc/rc3.d will run, but if you move from single-user mode straight to run-level 3, then the /etc/rc2.d scripts will run (from /sbin/rc2) followed by the /etc/rc3.d scripts (from /sbin/rc3).
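The rule just described can be written down as a small decision function. This is a model of the behavior as explained in the text, not the logic of the real /sbin/rc2 and /sbin/rc3 scripts; the function name rc_dirs_for_transition is ours, and it simply lists the directories whose scripts we would expect to run for a given move.

```shell
#!/bin/sh
# rc_dirs_for_transition PREV NEW
# Print, in order, the rc directories whose scripts are expected to run when
# moving from run level PREV to run level NEW, following the description in
# the text. A simplified model, not the real rc script logic.
rc_dirs_for_transition() {
    prev=$1
    new=$2
    case $new in
    3)
        case $prev in
        2)  echo /etc/rc3.d ;;                 # run-level 2 scripts already ran
        *)  echo /etc/rc2.d; echo /etc/rc3.d ;;
        esac
        ;;
    *)
        echo "/etc/rc${new}.d"
        ;;
    esac
}

rc_dirs_for_transition 2 3    # only the run-level 3 scripts
rc_dirs_for_transition S 3    # run-level 2 scripts, then run-level 3 scripts
```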

Adding a New RC Script

If you install a piece of software on one of your Solaris systems that needs to be started automatically at boot-time, the usual method of starting it will be by its own rc script. Some software packages will set up the rc script themselves, but if yours doesn't, it is a task that you will need to do yourself. This can be particularly important with some applications, for example databases, since if they are not stopped correctly when the system is shut down, they may need to perform some kind of recovery action when they are next started up.

Before you can start you need to know what the command to start the new piece of software is, what user the command should be run as, and whether any environment variables need to be defined before the command is run. You will also need to know in which run level(s) the software should run and shut down. Once you have all this information you can create the script and add it to the correct directory.

When writing the script you need to be aware that when it runs it will be called with one of two possible parameters: either "start" when it should start the new software, or "stop" when it should stop it.

For the purpose of this example, the command to start the new piece of software is:

 /opt/app_dir/bin/start_app 

And the command to stop it is:

 /opt/app_dir/bin/stop_app 

We will also assume that the command should be run as the user app_owner and an environment variable called APP_DATA needs to be set before the command can be run.

The rc script we write will look something like this:

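 #!/bin/sh
 # script to start and stop the app software
 # version 1.0
 # August 12th 2001
 #
 # Note: "su -" gives app_owner a fresh login environment, so APP_DATA
 # may also need to be set in that user's profile.
 APP_DATA=/opt/app_dir/data_area
 export APP_DATA
 # $1 is "start" or "stop", supplied by the calling /sbin/rcX script
 case "$1" in
   start)
     su - app_owner -c "/opt/app_dir/bin/start_app"
     echo "Started app"
     exit 0
     ;;
   stop)
     su - app_owner -c "/opt/app_dir/bin/stop_app"
     echo "Stopped app"
     exit 0
     ;;
   *)
     echo "usage: $0 start|stop"
     exit 1
     ;;
 esac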

We now have our script so we need to give it a name (app_control will do), and we should place it in the directory /etc/init.d.

This directory is the master directory for rc scripts that will be called, indirectly, from init. None of the scripts are ever run from this directory, but, as mentioned, they will be linked to files in the actual rc directory where they will be run from.

Before we can create the links, we need to know what run level the software should start in and when it should be stopped. This is a multi-user application, so it will need to run in multi-user mode, but even though we normally run our systems in run-level 3, it is better to put the script in /etc/rc2.d so it can still be used if we need to put the system in that run level for any reason. If the software needs to make use of NFS, then it would make sense to put the script in /etc/rc3.d since we would not want it to run if NFS was not running.

We should start the software as we enter run-level 2, so we need to link the file from /etc/init.d to /etc/rc2.d. However, we need to know what to call the file in /etc/rc2.d. Since the script will be starting the software, we need to start the name with "S," but we also need to assign it a number to determine at what stage it runs.

The current contents of /etc/rc2.d are as follows:

 hydrogen# ls -g
 total 132
 -rwxr--r--   6 sys     861 Sep  1  1998 K07dmi
 -rwxr--r--   6 sys     404 Sep  1  1998 K07snmpdx
 -rwxr--r--   6 sys    2307 Sep  1  1998 K28nfs.server
 -rw-r--r--   1 sys    1369 Sep  1  1998 README
 -rwxr--r--   3 sys    1886 Sep  9  1999 S01MOUNTFSYS
 -rwxr--r--   2 sys    2004 Sep  1  1998 S05RMTMPFILES
 -rwxr--r--   2 sys     624 Sep  1  1998 S20sysetup
 -rwxr--r--   2 sys     989 Sep  1  1998 S21perf
 -rwxr-xr-x   2 other  1644 Sep 11  1998 S30sysid.net
 -rwxr--r--   5 sys     359 Jun 29  1999 S40llc2
 -rwxr--r--   5 sys    7317 Sep  1  1998 S69inet
 -rwxr--r--   5 sys    2750 Oct 14  1999 S71rpc
 -rwxr-xr-x   2 other  1498 Sep 11  1998 S71sysid.sys
 -rwxr-xr-x   2 other  1558 Sep 11  1998 S72autoinstall
 -rwxr--r--   5 sys    7430 Sep  1  1998 S72inetsvc
 -rwxr--r--   2 sys    1113 Sep  1  1998 S73cachefs.daemon
 -rwxr--r--   3 sys    1223 Oct 14  1999 S73nfs.client
 -rwxr--r--   5 sys     364 Oct 14  1999 S74autofs
 -rwxr--r--   5 sys     867 Oct 14  1999 S74syslog
 -rwxr--r--   5 sys     942 Sep  1  1998 S74xntpd
 -rwxr--r--   5 sys     504 Sep  1  1998 S75cron
 -rwxr--r--   2 sys    2519 Sep  1  1998 S75savecore
 -rwxr--r--   5 sys     563 Sep  1  1998 S76nscd
 -rwxr--r--   5 sys     460 Sep  1  1998 S80lp
 -rwxr--r--   2 sys     256 Sep  1  1998 S80PRESERVE
 -rwxr--r--   5 sys     610 Sep  1  1998 S80spc
 -rwxr--r--   5 sys    1959 Sep  1  1998 S85power
 -rwxr--r--   5 sys     868 Sep  1  1998 S88sendmail
 -rwxr--r--   5 sys     597 Sep  1  1998 S88utmpd
 -rwxr--r--   5 sys     391 Oct 14  1999 S92volmgt
 -rwxr--r--   2 sys     364 Sep  1  1998 S93cacheos.finish
 -rwxr--r--   5 sys     447 Sep  1  1998 S99audit
 -rwxr--r--   5 sys    2804 Sep 12  1998 S99dtlogin
 -rwxr--r--   2 sys     449 Oct 14  1999 S99tsquantum
 hydrogen#

You will see that although most scripts in rc2.d are S scripts, there are three K scripts. These are the scripts that are linked to the S scripts in rc3.d, and they are needed because if we move from run-level 3 to run-level 2 we want these services to stop running. The scripts will also run if we move from single-user mode to run-level 2, but since the run-level 3 services would not have been running in that run level the scripts will have no effect.

We would ideally like our new software package to run after everything else has started. However, since there are already a few rc scripts with S99 as their prefix we will choose S95, as it doesn't matter if our script runs before the existing S99 rc scripts. To create the link we would use the following command:

 hydrogen# ln /etc/init.d/app_control /etc/rc2.d/S95app_control
 hydrogen# cd /etc/rc2.d
 hydrogen# ls -l S95app_control
 -rw-r--r--   2 root   other    373 Sep  5 13:33 S95app_control
 hydrogen#

We can see that the script now has two links, but this only affects starting the software. We will also need to create a link to a name beginning with "K" so the software can be correctly stopped when we shut the server down. However, since we also want it to stop if the system is put into single-user mode, we will need to create several links. These will be in the directories /etc/rc0.d, /etc/rc1.d, and /etc/rcS.d.

 hydrogen# cd /etc/init.d
 hydrogen# ln app_control /etc/rc0.d/K95app_control
 hydrogen# ln app_control /etc/rc1.d/K95app_control
 hydrogen# ln app_control /etc/rcS.d/K95app_control
 hydrogen#
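If the same set of links has to be created on several servers, the ln commands can be wrapped in a function. The following is a sketch only: the function name link_rc_script and its ROOT argument are ours, with ROOT being a prefix that lets the function be tried in a scratch directory tree before it is pointed at the real /etc. The sequence number 95 is the one chosen above.

```shell
#!/bin/sh
# link_rc_script ROOT NAME START_DIR STOP_DIR...
# Hard-link ROOT/etc/init.d/NAME as S95NAME in START_DIR and as K95NAME in
# each STOP_DIR. ROOT is a hypothetical prefix for testing; an empty ROOT
# acts on the real /etc.
link_rc_script() {
    root=$1
    name=$2
    start_dir=$3
    shift 3
    master=$root/etc/init.d/$name
    ln "$master" "$root/etc/$start_dir/S95$name" || return 1
    for stop_dir in "$@"; do
        ln "$master" "$root/etc/$stop_dir/K95$name" || return 1
    done
}

# Equivalent to the commands shown above (run as root on the real system):
#   link_rc_script "" app_control rc2.d rc0.d rc1.d rcS.d
```

The function stops at the first ln that fails, so a missing directory or an existing link is reported by ln itself rather than being silently skipped.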

This means that our new software will start automatically each time the system goes into multi-user mode (either run-level 2 or run-level 3). It will also be shut down when the system either goes into single-user mode or is shut down itself. Because all the files are linked, if we ever need to make a change we only need to edit the script once. We can double-check that all the links are set up correctly with the following find command:

 hydrogen# find /etc -name "*app_control" -exec ls -ild {} \;
 6415 -rw-r--r--  5 root  other  373 Sep  5 13:33 /etc/init.d/app_control
 6415 -rw-r--r--  5 root  other  373 Sep  5 13:33 /etc/rc0.d/K95app_control
 6415 -rw-r--r--  5 root  other  373 Sep  5 13:33 /etc/rc1.d/K95app_control
 6415 -rw-r--r--  5 root  other  373 Sep  5 13:33 /etc/rc2.d/S95app_control
 6415 -rw-r--r--  5 root  other  373 Sep  5 13:33 /etc/rcS.d/K95app_control
 hydrogen#

Because we included the "-i" option with the ls command, we can confirm that they all have the same inode number (6415). Of course, we can also see that they have the correct number of links set.
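The same inode comparison can be automated, which is handy if the check is to be repeated after a patch or upgrade. This is a minimal sketch; the function names inode_of and check_rc_links are ours, and the only tools assumed are ls -i and awk.

```shell
#!/bin/sh
# inode_of PATH: print the inode number of PATH (nothing if it is missing).
inode_of() {
    ls -id "$1" 2>/dev/null | awk '{ print $1 }'
}

# check_rc_links MASTER LINK...
# Report any LINK that does not share MASTER's inode, that is, any name that
# is not a hard link to it. Returns non-zero if MASTER is missing or if any
# LINK fails the check.
check_rc_links() {
    master=$1
    shift
    want=$(inode_of "$master")
    [ -n "$want" ] || return 1
    status=0
    for link in "$@"; do
        if [ "$(inode_of "$link")" != "$want" ]; then
            echo "not linked to $master: $link" >&2
            status=1
        fi
    done
    return $status
}

# For the example in this section:
#   check_rc_links /etc/init.d/app_control /etc/rc2.d/S95app_control \
#       /etc/rc0.d/K95app_control /etc/rc1.d/K95app_control \
#       /etc/rcS.d/K95app_control
```

A copy of the script, as opposed to a link, has its own inode and is therefore flagged, which is exactly the case where an edit to the master would be missed.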





    Solaris Operating Environment Boot Camp
    ISBN: 0130342874
    Year: 2002
    Pages: 301
