Certification Objective 8.01Managing System Processes | Sun Certified System Administrator for Solaris 10 Study Guide Exams 310-XXX & 310-XXX

Certification Objective 8.01—Managing System Processes

Exam Objective 5.2: Control system processes by viewing the processes, clearing frozen processes, and scheduling automatic one-time and recurring execution of commands using the command line.

Managing system processes is one of the tasks you will be performing often as a system administrator. It includes listing the processes, getting detailed information about a process, deleting a hung process, and scheduling a process. Some common commands for managing processes are listed in Table 8-1.

Table 8-1: Some commands to manage processes
Command	Description
ps	List the active processes on a system and obtain information about them.
pgrep	Display information about selected processes.
prstat	Display information about selected processes and refresh it periodically.
pstop	Stop processes.
prun	Start processes.
kill, pkill	Terminate processes.

Before we can control a process, we need to find some information, such as the process ID. Now, let's find out how Solaris identifies a process and how to view it.

Viewing Processes

In order to manage (or control) processes on your system, you need to know what processes exist on the system, and you need to have some information about those processes such as process IDs. Therefore, process management starts with the commands that let you view the processes. The ps command lets you check the status of active processes on the system and obtain some technical information about them.

Viewing Processes with the ps Command

You can view the active processes on the system by using the ps command, which has the following syntax:

    ps [<options>]

If no <options> are specified, the output of the ps command includes only those processes that have the same effective user ID and terminal as the user who issued the command. Some common options for this command are described here:

-a. Display information about the most frequently requested processes. Processes not associated with a terminal will not be included.
-A. Display information about every process currently running.
-e. Identical with the -A option.
-f. Full listing. Display additional information about each process.
-l. Generate a long listing.
-p <procList>. Display information only about those processes whose process IDs are specified by <procList>.
-u <uidList>. Display information only about those processes whose effective user IDs or login names are specified by <uidList>, which could be a single argument or a space-separated or comma-separated list.
-U <uidList>. Display information only about those processes whose real user IDs or login names are specified by <uidList>, which could be a single argument or a space-separated or comma-separated list.

Remember that the ps command takes a snapshot of the processes running at the moment the command is issued, and therefore the values of some of the fields in the output may no longer be accurate even right after the command is executed.

Exam Watch

The state of a running process is represented by the value O of the state field, not by the value R of the state field. A process with the value of the state field equal to R is not running, but is ready to run and is in the queue for running. This state is called runnable.

The output samples of the ps command are shown in Exercise 8-1. The fields displayed in the output of the ps command depend on the command options. A large set of these fields is described in Table 8-2.

Exercise 8-1: Using the ps Command to View Processes

Use the ps command without any options:

    $ ps

The output will look like the following:

    PID   TTY    TIME  COMD
    1664  pts/4  0:06  csh
    2081  pts/4  0:00  ps

Now, use the ps command with the -e and -f options:

    $ ps -ef

The output will look like the following:

    UID   PID  PPID  C  STIME   TTY  TIME  CMD
    root    0     0  0  Dec 20  ?    0:17  sched
    root    1     0  0  Dec 20  ?    0:00  /etc/init -
    root    2     0  0  Dec 20  ?    0:00  pageout
    root    3     0  0  Dec 20  ?    4:20  fsflush
    root  374   367  0  Dec 20  ?    0:00  /usr/lib/saf/ttymon

Try other options and understand the output.

Table 8-2: Summary of fields in the output of the ps command
Field	Required Option	Description
ADDR	-l	The memory address of the process
C	-f or -l	The processor utilization for scheduling (not displayed if -c option is used)
CLS	-c	The scheduling class to which the process belongs (e.g., system, time sharing)
CMD	None	The command that generated the process
NI	-l	The nice number for the process that contributes to its scheduling priority (Nicer means lower priority.)
PID	None	The process ID, a unique identifier for each process
PPID	-f or -l	The parent process ID (i.e., the unique identifier of the process that spawned this process)
PRI	-l	Scheduling priority for this process (Higher number means higher priority.)
S	-l	State of the process (For example, O indicates that the process is running, and S indicates that the process is sleeping.)
STIME	-f	The starting time of the process in hours, minutes, and seconds.
SZ	-l	Size (total number of pages the process has in the virtual memory)
TIME	None	The total CPU time the process has used since it began
TTY	None	The terminal from which the process or its parent process was started

If the output of the ps command is several pages long, it will quickly scroll to the end, so it may not be convenient to view a particular process. In this case, you can pipe the output to the more (or less) command as shown here:

    ps -f | more

Now, you can display the output page by page. If you know what process or processes you are looking for and you can figure out a string of characters, say xyz, in their entries in the output, then you can pipe the output into a grep command as shown here:

    ps -f | grep xyz
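
One wrinkle with this approach is that the grep process can show up in its own results, because its argument string contains the search string too. A common workaround, sketched below on made-up sample lines rather than live ps output, is to wrap one character of the pattern in brackets. The PIDs and the xyzd daemon name are hypothetical.

```shell
# A hypothetical line in the `ps -ef` layout for a daemon named xyzd.
daemon_line='root   812     1  0 10:02:11 ?      0:03 /usr/sbin/xyzd'

# With a plain pattern, grep's own entry ("grep xyz") also contains the string.
printf '%s\n%s\n' "$daemon_line" \
    'root  2090  1664  0 10:15:40 pts/4  0:00 grep xyz' | grep -c 'xyz'     # prints 2

# With a bracket expression, grep's own entry ("grep [x]yz") no longer matches,
# because the literal text "[x]yz" does not contain the string "xyz".
printf '%s\n%s\n' "$daemon_line" \
    'root  2090  1664  0 10:15:40 pts/4  0:00 grep [x]yz' | grep -c '[x]yz' # prints 1
```

On a live system the equivalent would be ps -f | grep '[x]yz'.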

The two most important fields in the output of the ps command are the PID, which represents the process ID (the unique identifier for the process), and S, which represents the current state of the process. The possible values for the field S are described in Table 8-3.

Table 8-3: Summary of process states
Value of Field S	Process State	Description
O	Running	The process is running.
R	Runnable	The process is ready and is in the queue for running.
S	Sleeping	The process is waiting for some event to complete.
T	Stopped or traced	The process has been stopped, either by a job control signal or because it's being traced.
Z	Zombie state	The process has been terminated, and the parent process is not waiting; it is an uncleaned dead process.

Note that it is the O state, not the R state, that indicates that the process is running. The R state indicates that the process is runnable and is in the queue for running. The S state means that the process is sleeping: for example, waiting for resources or some other event to happen before it is put into the runnable state. The T state indicates that the process has been stopped, for example, by a stop command or by the user pressing CTRL-Z while the process was running. The Z state identifies a zombie process, which is a dead process whose parent did not clean up after it, so it still occupies space in the process table.
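
As a memory aid, the state letters can be turned into a small lookup function. This is only a sketch: describe_state is a hypothetical helper name, and it covers just the five states described in this section.

```shell
# Translate a value of the S field into the state name used in this section.
describe_state() {
    case $1 in
        O) echo "Running" ;;
        R) echo "Runnable" ;;
        S) echo "Sleeping" ;;
        T) echo "Stopped or traced" ;;
        Z) echo "Zombie" ;;
        *) echo "Unknown state: $1" ;;
    esac
}

describe_state O   # prints Running
describe_state R   # prints Runnable
```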

On the Job

A zombie process, recognized by the state Z, does not use CPU resources, but it still uses space in the process table. It stays there until its parent process cleans up after it.

Even in the ps command, you can be somewhat selective in viewing the processes—for example, by specifying the list of PIDs or UIDs. However, the pgrep command lets you be even more selective in viewing the processes or in searching for the processes that you want to view.

Viewing Processes with the pgrep Command

Previously in this chapter we showed how by using the grep command along with ps you can restrict the displayed output to certain processes. Solaris 10 offers the pgrep command to handle such situations. The pgrep command lets you specify the criteria and then displays the information only about the processes that match the criteria. The syntax for the pgrep command is shown here:

    pgrep [<options>] [<pattern>]

For example, the following command will select all the processes whose real group name is poli or tics:

    pgrep -G poli,tics

You can also specify multiple criteria, and a logical AND will be assumed between the criteria. For example, consider the following command:

    pgrep -G poli,tics -U gbush,jkerry

This command will select the processes that match the following criteria:

    (group name is poli OR tics) AND (user name is gbush OR jkerry)
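
To make the AND/OR combination concrete, here is a small sketch that applies the same selection rule to a hypothetical table of processes (one "pid group user" triple per line) instead of the live process table. The function name select_pids and the sample data are assumptions for illustration; pgrep does this matching for you on a real system.

```shell
# Print the PIDs whose group is in the first list AND whose user is in the second.
# Input on stdin: lines of the form "pid group user".
select_pids() {
    groups=$1
    users=$2
    awk -v groups="$groups" -v users="$users" '
        BEGIN {
            n = split(groups, g, ","); for (i = 1; i <= n; i++) want_group[g[i]] = 1
            m = split(users, u, ",");  for (i = 1; i <= m; i++) want_user[u[i]] = 1
        }
        # OR within each list, AND between the two lists
        ($2 in want_group) && ($3 in want_user) { print $1 }
    '
}

# Made-up sample data: only PIDs 101 and 102 satisfy both criteria.
printf '101 poli gbush\n102 tics jkerry\n103 poli hillary\n104 staff gbush\n' |
    select_pids poli,tics gbush,jkerry
```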

Now that you have a handle on how the pgrep command works, let's look at some of its options:

-d <delim>. Specify delimiter string to be used to separate process IDs in the output. The newline character is the default.
-f. The regular expression pattern should be matched against the full process argument string, which can be obtained from the pr_psargs field of the /proc/nnnnn/psinfo file.
-g <pgrpList>. Select only those processes whose process group ID is in the list specified by <pgrpList>. If group 0 is included in the list, this is interpreted as the process group ID of the pgrep or pkill process.
-G <gidList>. Select only those processes whose real group ID is in the list specified by <gidList>.
-l. Use the long output format.
-n. Select only the newest (i.e., the most recently created) process that meets all other specified matching criteria. Cannot be used with the -o option.
-o. Select only the oldest (i.e., the earliest created) process that meets all other specified matching criteria. Cannot be used with the -n option.
-P <ppidList>. Select only those processes whose parent process ID is in the list specified by <ppidList>.
-s <sidList>. Select only those processes whose process session ID is in the list specified by <sidList>. If ID 0 is included in the list, this is interpreted as the session ID of the pgrep or pkill process.
-t <termList>. Select only those processes that are associated with a terminal in the list specified by <termList>.
-T <taskidList>. Select only those processes whose task ID is in the list specified by <taskidList>. If ID 0 is included in the list, this is interpreted as the task ID of the pgrep or pkill process.
-u <euiList>. Select only those processes whose effective user ID is in the list specified by <euiList>.
-U <uidList>. Select only those processes whose real user ID is in the list specified by <uidList>.
-v. Reverse the matching logic—that is, select all processes except those which meet the specified matching criteria.
-x. Select only those processes whose argument string or executable file name exactly matches the specified pattern—that is, all characters in the process argument string or executable file name must match the pattern.

Remember that in these commands, as anywhere else, a user ID may be specified either by the user name or by the numeric ID. This is also true for group IDs.

Now that you know how to use the ps and pgrep commands, here are some practical scenarios and their solutions.

SCENARIO & SOLUTION
How would you issue the pgrep command to list all the processes except those owned by user jmccain or bboxer?	pgrep -v -U jmccain,bboxer (The -v option reverses the matching logic.)
If you want to select only the oldest process on the system, what command would you issue?	pgrep -o
Which command would a user issue to get the list of all processes with the same UID and the terminal as that user?	ps (The ps command without any options or arguments accomplishes this task.)

Both the ps and pgrep commands display a snapshot of the processes. The situation might have changed immediately after the output of these commands was displayed. To get current information, you need to reissue the command. If you want to monitor the processes continuously without having to reissue the command, use the prstat command, which we explore next.

Viewing Processes with the prstat Command

The prstat command displays information about the processes similar to that displayed by the ps and pgrep commands. However, a unique feature of the prstat command is that it refreshes (updates) the output in a periodic fashion. You can determine the frequency of updates.

The prstat command has the following syntax:

    prstat [<options>] [<interval> [<count>] ]

The <interval> argument specifies the time lapse between two consecutive display updates, and the default is five seconds. The <count> argument specifies how many times the display will be updated in total, and the default is infinity—that is, until the command process is terminated. Some values of <options> for this command are described here:

-a. Display information about processes and users.
-c. Display new reports below the previous displays instead of overwriting them.
-n <number>. Display information about only the first <number> selected processes.
-p <pidList>. Display information about only those processes whose process ID is in the list specified by <pidList>.
-s <key>. Sort output lines by the field specified by <key> in descending order. Only one key can be used as an argument. The key has five possible values:
- cpu. Sort by CPU usage by the process. This is the default.
- pri. Sort by the process priority.
- rss. Sort by resident set size.
- size. Sort by size of process image.
- time. Sort by the process execution time.
-S <key>. Sort output lines by the field specified by <key> in ascending order.
-u <euiList>. Select only those processes whose effective user ID is in the list specified by <euiList>.
-U <uidList>. Select only those processes whose real user ID is in the list specified by <uidList>.
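
The <interval> and <count> arguments amount to a simple loop: produce a report, wait, and repeat a fixed number of times. The sketch below shows that idea for an arbitrary command. Note that repeat_every is a hypothetical helper, not part of Solaris, and it does not reproduce the sorting or screen handling that prstat provides.

```shell
# Run a command <count> times, pausing <interval> seconds between runs.
# Usage: repeat_every <interval> <count> <command> [<args>...]
repeat_every() {
    interval=$1
    count=$2
    shift 2
    i=0
    while [ "$i" -lt "$count" ]; do
        "$@"
        i=$((i + 1))
        # Pause between reports, but not after the last one.
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}
```

For example, repeat_every 5 3 ps -f prints three snapshots five seconds apart, which is roughly the cadence that prstat 5 3 would give you.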

You may be wondering at this point how you are going to remember all these options for these commands. Well, note that some options are related to the properties of a process and are repeated for more than one command. Those options along with the process properties they are related to are described in Table 8-4.

Table 8-4: A list of process properties that appear as options in various process management commands
Option	Process property	Description
-p	Process ID	The unique identifier for a process
-u	Effective user ID	The user ID whose permissions are being used by the process
-U	Real user ID	The user ID for the user that started the process
-g	Process group ID	The ID of the process group to which the process belongs
-G	Real group ID	The group ID of the group that owns the process

To understand the difference between real and effective, suppose a user hillary runs the passwd executable, which is owned by the user root. The executable passwd has its setuid and setgid bits set; that is, its permission mode is 6555. Therefore, although hillary started the passwd process, it runs with the privileges of root. In this case, hillary is called the real user of this process, and root is called the effective user. Accordingly, hillary's user ID is the real user ID for the process, and root's user ID is the effective user ID for the process.
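
You can inspect both IDs for your current shell with the id command. The sketch below assumes an id that accepts the POSIX -u and -r flags (on Solaris 10 that may mean the XPG4 version, /usr/xpg4/bin/id); show_ids is a hypothetical helper name.

```shell
# Print the real and effective user IDs of the current shell.
show_ids() {
    # id -u prints the effective UID; adding -r prints the real UID instead.
    printf 'real=%s effective=%s\n' "$(id -ru)" "$(id -u)"
}

show_ids
```

In an ordinary login shell the two numbers are the same; they differ only inside a process started from a setuid executable such as passwd.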

By using the ps, pgrep, or prstat command, you obtain some information about the processes. That information may tell you which process needs a control action, and that same information (such as PID) also gives you the handle that you can use to control the process.

Controlling Processes

Processes use the resources on your system such as CPU, memory, and disk space. If they remain unmonitored, they may fill your disk space or bring your system to a halt. Therefore, you need to control processes—for example, by clearing a hung process, terminating a process that has fallen into an infinite loop, stopping a process, or restarting a process.

Controlling a Process

You control a process by taking these three steps:

Obtain the process ID of the process that you want to control—for example, by issuing the following command:
```
    # pgrep <processName>
```
Issue the appropriate command to control the process. For example, issue the following command to stop the process:
```
    # pstop <pid>
```
<pid> is the process ID that you discovered in step 1.
Verify the process status to make sure you have accomplished what you wanted to; for example, issue the following command:
```
    # ps -ef | grep <pid> 
```
Repeat steps 2 and 3 if you need to. If you want to restart the stopped process, issue the following command:
```
    # prun <pid> 
```
Verify that it is actually running.
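
The same stop, verify, and restart sequence can be sketched with signals, which also works on systems that lack pstop and prun. Treat this as an illustration under assumptions: pause_and_resume is a hypothetical helper, and it uses kill -0 (which only tests that the process exists) where the steps above use ps.

```shell
# Pause a process, confirm it still exists, then let it continue.
# Returns 0 if every step succeeded.
pause_and_resume() {
    pid=$1
    kill -STOP "$pid" || return 1   # pause the process (like pstop)
    kill -0 "$pid" || return 1      # it should still exist while stopped
    kill -CONT "$pid" || return 1   # let it continue (like prun)
    kill -0 "$pid"                  # verify once more that it is still there
}
```

While the process is stopped, the S field of ps for that PID should show T.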

There will be some hung processes that you will need to clear.

Clearing a Hung Process

Solaris supports the concept of communicating with a process by sending it a signal. Sometimes, you might need to kill (stop or terminate) a process. The process, for example, might be in an endless loop, it might be hung, or you might have started a large job that you want to stop before it has completed. You can send a signal to a process by using the kill command which has the following syntax:

    kill [-<signal>] <pid>

The <pid> is the process ID, and <signal> is the signal number or name; the default is 15 (SIGTERM). If you use -9 (SIGKILL), the process terminates promptly. However, do not use the -9 signal to kill certain processes, such as a database process or an LDAP server process, because you might lose or corrupt data contained in the database. A good policy is to always use the kill command first without specifying any signal, and wait a few minutes to see whether the process terminates before you issue the kill command with the -9 signal.
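
That policy (ask politely first, force only if needed) can be sketched as a small function. The name terminate_gracefully, the default timeout, and the return codes are assumptions for illustration, and the timeout is given in seconds so that the sketch is quick to try.

```shell
# Send SIGTERM, wait up to <timeout> seconds, then fall back to SIGKILL.
# Returns 0 if SIGTERM was enough, 1 if SIGKILL was needed,
# and 2 if the signal could not be sent at all.
terminate_gracefully() {
    pid=$1
    timeout=${2:-10}
    kill "$pid" 2>/dev/null || return 2      # default signal is SIGTERM (15)
    waited=0
    while kill -0 "$pid" 2>/dev/null; do     # kill -0 only tests for existence
        if [ "$waited" -ge "$timeout" ]; then
            kill -9 "$pid" 2>/dev/null       # last resort: SIGKILL
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

For a database or directory server, a much longer timeout than the default used here would be appropriate.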

On the Job

As a superuser, you can kill any process. However, killing any of the processes with Process ID 0, 1, 2, 3, and 4 will most likely crash the system. Now, why would you do that?

You will mostly be using the SIGHUP, SIGSTOP, and SIGKILL signals. Table 8-5 describes these and some other commonly used signals for controlling the processes.

Table 8-5: Most common signals used for controlling processes
Signal	Number	Description
SIGHUP	1	Hangup. Usually means that the controlling terminal has been disconnected.
SIGINT	2	Interrupt. Pressing CTRL-C or DELETE will generate this signal.
SIGQUIT	3	Quit. This signal causes the process to quit and generates a core dump. You can generate it by pressing CTRL-\.
SIGABRT	6	Abort.
SIGKILL	9	Kill the process promptly. Process is not allowed to clean up after itself, so you can lose or corrupt data with this command.
SIGTERM	15	Terminate. Terminate the process and give the process a chance to clean up after itself. This is the default signal sent by kill and pkill.
SIGSTOP	23	Stop. Pauses a process.
SIGCONT	25	Continue. Starts a stopped process.

You can issue the following command to get a list of all the supported signals:

    kill -l
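
The kill -l command can also translate a single signal number into its name, which is handy when you want to check what a number means. The wrapper name signal_name is hypothetical; the kill -l <number> form is defined by POSIX, but it is worth confirming on your own shell.

```shell
# Print the name of a signal given its number.
signal_name() {
    kill -l "$1"
}

signal_name 15   # prints TERM
signal_name 9    # prints KILL
```

In scripts, signal names such as STOP and CONT are safer than numbers, because the numbers for some signals (SIGSTOP and SIGCONT, for example) differ between operating systems.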

If you are going to use the pgrep command to find some processes matching some criteria and then use the kill command on them, you would do better to use the pkill command, which has the functionality of both the pgrep and kill commands. For example, consider the following command:

    pkill -9 -U gbush,jkerry

This command will find all the processes owned by gbush and jkerry and kill them. You can also use pkill instead of kill to send a signal to a process that you know by name:

    pkill [-<signal>] <processName>

Note that in the kill command, you use the process ID and in the pkill command you use the process name to refer to a process.

A hung process can also freeze the system.

Dealing with a Hung System

You will find at times that the system has hung because of some software process that has become stuck. To recover from a hung system, try the following actions:

If the system is running a window environment, perform the following steps:
- Make sure the pointer is in the window in which you are typing the commands.
- Press CTRL-Q if the screen is frozen because the user accidentally pressed CTRL-S.
- Log in remotely from another system on the network, and use the pgrep command to look for the hung process. Identify the process and kill it.
Press CTRL-\ to force a "quit" on the running process.
Press CTRL-C to interrupt the program that might be running.
Log in remotely, identify the process that is hanging the system, and kill it.
Log in remotely, become superuser, and reboot the system.
If the system still does not respond, force a crash dump and reboot.
If the system still does not respond, turn the power off, wait a minute or two, then turn the power back on.
If you can't get the system to respond at all, contact your local service provider for help.

You can always start a process instantly by issuing a command. However, Solaris allows you to schedule processes that will start executing at a later time.

Scheduling Processes

The motivation for scheduling processes is threefold: to start executing a job at a time when you will not be physically present at the system to start it manually, to distribute the job load over time, and to execute a job repeatedly in a periodic fashion without having to start it manually each time.

Like everything else in Solaris, you do process scheduling through files. The management in this area includes writing and maintaining these files and determining who can write them.

Scheduling Processes with the cron Utility

The automatic scheduling of processes (also called jobs) is handled by the cron utility, named after Chronos, the Greek god of time. The job schedule is set up in the files in the /var/spool/cron/atjobs directory for jobs that will be executed only once, and in the files in the /var/spool/cron/crontabs directory for jobs that will be executed repeatedly. The cron daemon manages the automatic scheduling of the processes (commands) listed in these files by performing the following tasks:

Check for new crontab (and atjob) files.
Read the commands and their scheduled times inside these files.
Submit the commands for execution at the scheduled times.
Listen for the notifications from the crontab commands regarding updated crontab and atjobs files.

Each entry in a crontab file contains the command name and the time at which it should be executed. The structure of an entry in a crontab file is shown in Figure 8-1.

Figure 8-1: An example of an entry in the crontab file that specifies that the script diskchecker will be executed at 9:15 A.M. on each Sunday and Wednesday every week, every month

Each line in a crontab file holds one entry. The beginning of each entry contains date and time information that tells the cron daemon when to execute the command, which is listed as the last field in the entry. The fields are described in Table 8-6.

Table 8-6: Acceptable range of values for the crontab time fields
Field Position (from left)	Field	Range of Values
1	Minute	0-59. A * means every minute.
2	Hour	0-23. A * means every hour.
3	Day of month	1-31. A * means every day of the month.
4	Month	1-12. A * means every month.
5	Day of week	0-6 (0 = Sunday). A * means every day of the week.
6	Command	Command to be executed.

While writing an entry in a crontab file, follow these rules:

Use a space to separate any two consecutive fields.
Use a comma to separate multiple values for a field.
Use a hyphen (-) to specify a range of values for a field.
Use an asterisk (*) as a wildcard to indicate all legal values of a field.
Use a pound sign (#) at the beginning of a line to indicate a comment or a blank line.
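
The comma, hyphen, and asterisk rules can be made concrete with a small matcher that decides whether a given value (say, the current hour) satisfies one time field. This is only a sketch: cron_field_matches is a hypothetical helper, and it handles nothing beyond the rules listed above.

```shell
# Succeed (exit status 0) if <value> satisfies the crontab time field <field>.
# Usage: cron_field_matches <field> <value>
cron_field_matches() {
    field=$1
    value=$2
    result=1                      # assume no match until proven otherwise
    if [ "$field" = "*" ]; then
        return 0                  # wildcard: every legal value matches
    fi
    old_ifs=$IFS
    IFS=,                         # split the field on commas
    set -f                        # no filename expansion while splitting
    for part in $field; do
        case $part in
            *-*)                  # a range such as 1-5
                low=${part%-*}
                high=${part#*-}
                if [ "$value" -ge "$low" ] && [ "$value" -le "$high" ]; then
                    result=0
                fi
                ;;
            *)                    # a single value
                if [ "$value" -eq "$part" ]; then
                    result=0
                fi
                ;;
        esac
    done
    set +f
    IFS=$old_ifs
    return "$result"
}
```

For instance, cron_field_matches '0,3' 3 succeeds because 3 is in the list, while cron_field_matches '1-5' 6 fails because 6 lies outside the range.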

On the Job

Each entry in a crontab file must consist of only one line, even if that line is very long, because the crontab file does not recognize extra carriage returns. This also means that there should be no blank lines (without a # sign) in between any two entry lines.

For example, you get a crontab file named root during SunOS software installation. Consider the following two entries in this file:

    10 3 * * * /usr/sbin/logadm
    15 3 * * 0 /usr/lib/fs/nfs/nfsfind

The first entry schedules the logadm command to be run at 3:10 A.M. every day, and the second entry schedules the nfsfind script to be executed at 3:15 A.M. every Sunday.
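
To see how such an entry breaks down into its six fields, the following sketch splits an entry on whitespace and labels each part. The helper name describe_entry is hypothetical, and the sketch assumes a well-formed entry with five time fields followed by the command.

```shell
# Label the six fields of a crontab entry passed as a single argument.
describe_entry() {
    # read splits on whitespace; the sixth variable receives the rest of the line
    printf '%s\n' "$1" | {
        read -r minute hour dom month dow command
        printf 'minute=%s hour=%s day-of-month=%s month=%s day-of-week=%s command=%s\n' \
            "$minute" "$hour" "$dom" "$month" "$dow" "$command"
    }
}

describe_entry '10 3 * * * /usr/sbin/logadm'
describe_entry '15 3 * * 0 /usr/lib/fs/nfs/nfsfind'
```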

The jobs that need to be run repeatedly are scheduled by using the crontab files. A user with the appropriate privileges can create a crontab file, whereas the system administrator can create a crontab file for any user.

Managing the crontab Files

As a system administrator, you will need to manage the crontab files. The crontab files are created and edited by using the crontab command with the following syntax:

    crontab -e [<userName>]

The <userName> is the login name of the user for whom you want to create the crontab file, and it defaults to the login name of the user who issued the command. You must be a superuser to create (or edit) the crontab file for other users, but you don't need to be a superuser to create the crontab file for your own account.

You can verify that the crontab file exists from the output of the following command:

    ls -l /var/spool/cron/crontabs

You can display the content of a crontab file by using the crontab command with the following syntax:

    crontab -l [<userName>]

The <userName> specifies the login name of the user whose crontab file you want to display, and it defaults to the login name of the user who issues the command. You will need to be a superuser to display the crontab file of another user.

You can remove a crontab file by using the crontab command with the following syntax:

    crontab -r [<userName>]

The <userName> specifies the login name of the user whose crontab file you want to remove and defaults to the login name of the user who issued the command. You need to be a superuser to remove a crontab file of another user.

Processes running on a system consume system resources, and they can also damage the system depending on what they are launched to do. A regular user has a right, by default, to create a crontab file and thereby to schedule processes. However, as a system administrator, you can determine which users can have the privilege to create crontab files.

Controlling Access to crontab Files

You can control access to the crontab command by using the following two files:

    /etc/cron.d/cron.deny    /etc/cron.d/cron.allow

These files allow you to specify users who can (or cannot) use the crontab command for performing tasks such as creating, editing, displaying, or removing their own crontab files.

The cron.deny and cron.allow files consist of a list of user names, one user name per line. The permission to use the crontab command is determined by the interaction of both files as described here:

If the cron.allow file exists, only the users listed in this file can create, edit, display, or remove crontab files.
If the cron.allow file does not exist, all users except those listed in the cron.deny file can submit the crontab files.
If neither cron.allow nor cron.deny exists, only a superuser can execute the crontab command.

To be more specific, there are only four possible combinations of the existence or absence of the cron.allow and cron.deny files. The crontab access corresponding to each of these combinations is listed in Table 8-7.

Table 8-7: Access to the crontab command managed by the cron.allow and cron.deny files
Does cron.allow exist?	Does cron.deny exist?	Who has access to crontab?
Yes	Yes	Users listed in the cron.allow file, and superuser
Yes	No	Users listed in the cron.allow file, and superuser
No	Yes	All users except those listed in the cron.deny file
No	No	Only the superuser
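
The decision table can be expressed as a short function. This is a sketch under assumptions: can_use_crontab is a hypothetical helper, it treats the user name root as the superuser, and it takes the two file paths as optional arguments so that the logic can be tried against test files.

```shell
# Succeed (exit status 0) if <user> may use the crontab command.
# Usage: can_use_crontab <user> [<allowFile> [<denyFile>]]
can_use_crontab() {
    user=$1
    allow=${2:-/etc/cron.d/cron.allow}
    deny=${3:-/etc/cron.d/cron.deny}
    if [ "$user" = "root" ]; then
        return 0                               # assumption: root is the superuser
    fi
    if [ -f "$allow" ]; then
        # cron.allow exists: only listed users qualify; cron.deny is not checked
        if grep -Fqx -- "$user" "$allow"; then return 0; else return 1; fi
    fi
    if [ -f "$deny" ]; then
        # only cron.deny exists: everyone except listed users qualifies
        if grep -Fqx -- "$user" "$deny"; then return 1; else return 0; fi
    fi
    return 1                                   # neither file exists: superuser only
}
```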

Exam Watch

When the cron.allow file exists, the existence or absence of the cron.deny file does not matter; that is, the cron.deny file is not even checked.

Furthermore, only a superuser can create or edit the cron.deny and cron.allow files. When you install SunOS software, a default version of the cron.deny file is created, but no cron.allow file is created. You can display the content of the cron.deny file with the following command:

    $ cat /etc/cron.d/cron.deny

The output of this command would look like the following:

    daemon
    bin
    smtp
    nuucp
    listen
    nobody
    noaccess

None of the user names listed in the cron.deny file can access the crontab command, but all other users can. Of course, you can edit this file to add other user names that will be denied access to the crontab command, and you can create the cron.allow file as well, in which case the cron.deny file will be ignored.

Scheduling a Process for One Time Execution

Processes for one-time execution at a later time are scheduled by using the at command. You can schedule a job for one-time execution by performing the following steps:

Issue the at command with the following syntax:
```
    $ at [-m] <time> [<date>] 
```
The -m option will send you an email after the job is completed. The <time> specifies the hour at which you want to schedule the job. Add am or pm if you do not specify the hours according to the 24-hour clock. Acceptable keywords are midnight, noon, and now. Minutes are optional. For example, 1930 means 7:30 P.M. The <date> specifies the first three or more letters of a month, a day of the week, or the keywords: today or tomorrow.
At the at prompt, type the commands or scripts that you want to execute: one per line. You can type more than one command by pressing RETURN at the end of each line.
Press CTRL-D to exit the at utility and save the at job.

Your at job is assigned a queue number, which is also the job's filename in the /var/spool/cron/atjobs directory. You can control access to the at command by using the following file:

    /etc/cron.d/at.deny

This file is created when you install SunOS software and has the following list of users, one user name per line:

    daemon
    bin
    smtp
    nuucp
    listen
    nobody
    noaccess

The users who are listed in this file cannot access the at command, but all other users can. As a superuser, you can edit this file and add more user names to it.

Exercise 8-2: Scheduling a cron Job by Creating and Editing a crontab File

Issue the following command to edit your crontab file:
```
    crontab -e 
```
If you did not have a crontab file, this command will create an empty crontab file and let you write into it.
Write the following entry into your crontab file:
```
    * * * * * ls -l /etc >> cron_test.log
```
This command will execute the ls -l command on the directory /etc every minute and dump the output into the file cron_test.log. Save the file. Provide the full path for the file cron_test.log, so that you know for sure where to find it.
After each minute, verify that the output from the ls -l command is being appended to the cron_test.log file.
After a few minutes, edit the crontab file and remove the entry that you made. Verify that no more output is being appended to the cron_test.log file.

Remember that you can also use the process tool of the Solaris Management Console (SMC) GUI to view and manage processes.

Processes use system resources such as CPU, memory, and disk space. There is another important resource that users often use on a system—the printing service that the Solaris system offers. You will need to manage the printing service on the Solaris system to allow the users to share the printers on the network.