Processes

	Network Programming with Perl By Lincoln D. Stein Slots : 1
	Table of Contents

	Chapter 2. Processes, Pipes, and Signals

Content

UNIX, VMS, Windows NT/2000, and most other modern operating systems are multitasking. They can run multiple programs simultaneously, each one running in a separate thread of execution known as a process. On machines with multiple CPUs, the processes running on different CPUs really are executing simultaneously. Processes that are running on single-CPU machines only appear to be running simultaneously because the operating system switches rapidly between them, giving each a small slice of time in which to execute.

Network applications often need to do two or more things at once. For example, a server may need to process requests from several clients at once, or handle a request at the same time that it is watching for new requests. Multitasking simplifies writing such programs tremendously, because it allows you to launch new processes to deal with each of the things that the application needs to do. Each process is more or less independent, allowing one process to get on with its work without worrying about what the others are up to.

Perl supports two types of multitasking. One type, based on the traditional UNIX multiprocessing model, allows the current process to clone itself by making a call to the fork() function. After fork() executes, there are two processes, each identical to the other in almost every respect. One goes off to do one task, and the other does another task.

Another type, based on the more modern concept of a lightweight "thread," keeps all the tasks within a single process. However, a single program can have multiple threads of execution running through it, each of which runs independently of the others.

In this section, we introduce fork() and the variables and functions that are relevant to processes. We discuss multithreading in Chapter 11.

The `fork()` Function

The fork() function is available on all UNIX versions of Perl, as well as the VMS and OS/2 ports. Version 5.6 of Perl supports fork() on Microsoft Windows platforms, but not, unfortunately , on the Macintosh.

The Perl fork() function takes no arguments and returns a numeric result code. When fork() is called, it spawns an exact duplicate of the current process. The duplicate, called the child, shares the current values of all variables, filehandles (including data in the standard I/O buffers), and other data structures. In fact, the duplicate process even has the memory of calling fork() . It is like a man walking into the cloning booth of a science fiction movie. The copy wakes up in the other booth having all the memories of the original up to and including walking into the cloning booth, but thinking wait, didn't I start out over there? And who is the handsome gentleman in that other booth?

To ensure peaceful coexistence, it is vital that the parent and child processes know which one is which. Each process on the system is associated with a unique positive integer, known as its process ID, or PID.

After a call to fork() , the parent and child examine the function's return value. In the parent process, fork() returns the PID of the child. In the child process, fork() returns numeric 0. The code will go off and do one thing if it discovers it's the parent, and do another if it's the child.

$pid = fork()

Forks a new process. Returns the child's PID in the parent process, and 0 in the child process. In case of error (such as insufficient memory to fork), returns undef , and sets $! to the appropriate error message.

If the parent and child wish to communicate with each other following the fork, they can do so with a pipe (discussed later in this chapter in the Pipes section), or via shared memory (discussed in Chapter 14 in the An Adaptive Preforking Server Using Shared Memory section). For simple messages, parent and child can send signals to each others' PIDs using the kill() function. The parent gets the child's PID from fork() 's result code, and the child can get the parent's PID by calling getppid() . A process can get its own PID by examining the $$ special variable.

$pid = getppi()

Returns the PID of the parent process. Every Perl script has a parent, even those launched directly from the command line (their parent is the shell process).

The $$ variable holds the current PID for the process. It can be read, but not changed.

We discuss the kill() function later in this chapter, in the Signals section.

If it wishes, a child process can itself fork() , creating a grandchild. The original parent can also fork() again, as can its children and grandchildren. In this way, Perl scripts can create a whole tribe of (friendly, we hope) processes. Unless specific action is taken, each member of this tribe belongs to the same process group .

Each process group has a unique ID, which is usually the same as the process ID of the shared ancestor . This value can be obtained by calling getpgrp() :

$processid = getpgrp([$pid])

For the process specified by $pid , the getpgrp() function returns its process group ID. If no PID is specified, then the process group of the current process is returned.

Each member of a process group shares whatever filehandles were open at the time its parent forked. In particular, they share the same STDIN , STDOUT , and STDERR . This can be modified by any of the children by closing a filehandle, or reopening it on some other source. However, the system keeps track of which children have filehandles open, and will not close the file until the last child has closed its copy of the filehandle.

Figure 2.1 illustrates the typical idiom for forking. Before forking, we print out the PID stored in $$. We then call fork() and store the result in a variable named $child . If the result is undefined, then fork() has failed and we die with an error message.

Figure 2.1. This script forks a single child

graphics/02fig01.gif

We now examine $child to see whether we are running in the parent or the child. If $child is nonzero, we are in the parent process. We print out our PID and the contents of $child , which contains the child process's PID.

If $child is zero, then we are running in the child process. We recover the parent's PID by calling ppid() and print it and our own PID.

Here's what happens when you run fork.pl:

 % fork.pl PID=372 Parent process: PID=372, child=373 Child process:  PID=373, parent=372

The `system()` and `exec ()` Functions

Another way for Perl to launch a subprocess is with system() . system() runs another program as a subprocess, waits for it to complete, and then returns. If successful, system() returns a result code of 0 (notice that this is different from the usual Perl convention!). Otherwise it returns -1 if the program couldn't be started, or the program's exit status if it exited with some error. See the perlvar POD entry for the $? variable for details on how to interpret the exit status fully.

$status = system ('command and arguments')

$status = system ('command', 'and', 'arguments')

The system() function executes a command as a subprocess and waits for it to exit. The command and its arguments can be specified as a single string, or as a list containing the command and its arguments as separate elements. In the former case, the string will be passed intact to the shell for interpretation. This allows you to execute commands that contain shell metacharacters (such as input/output re-directs), but opens up the possibility of executing shell commands you didn't anticipate. The latter form allows you to execute commands with arguments that contain whitespace, shell metacharacters, and other special characters , but it doesn't interpret metacharacters at all.

The exec() function is like system() , but replaces the current process with the indicated command. If successful, it never returns because the process is gone. The new process will have the same PID as the old one, and will share the same STDIN , STDOUT , and STDERR filehandles. However, other opened filehandles will be closed automatically. ^[1]

^[1] You can arrange for some filehandles to remain open across exec() by changing the value of the $~F special variable. See the perlvar POD document for details.

$status = exec ('command and arguments')

$status = exec ('command', 'and', 'arguments')

exec() executes a command, replacing the current process. It will return a status code only on failure. Otherwise it does not return. The single-value and list forms have the same significance as system() .

exec() is often used in combination with fork() to run a command as a subprocess after doing some special setup. For example, after this code fragment forks, the child reopens STDOUT onto a file and then calls exec() to run the ls -l command. On UNIX systems, this command generates a long directory listing. The effect is to run ls -l in the background, and to write its output to the indicated file.

 my $child = fork(); die "Can't fork: $!" unless defined $child; if ($child == 0) { # we are in the child now   open (STDOUT,">log.txt") or die "open() error: $!";   exec ('ls','-l');   die "exec error(): $!"; # shouldn't get here }

We use exec() in this way in Chapter 10, in the section titled The Inetd Super Daemon.

Top

The fork() Function

Figure 2.1. This script forks a single child

The system() and exec () Functions

The `fork()` Function

The `system()` and `exec ()` Functions