Unnamed Pipes

An unnamed pipe is constructed with the pipe system call (see Table 5.5).

Table 5.5. Summary of the pipe System Call.

Include File(s)	`<unistd.h>`		Manual Section	2
Summary	`int pipe(int filedes[2]);`
Return	Success	Failure	Sets `errno`
	0	-1	Yes

If successful, the pipe system call returns a pair of integer file descriptors, filedes[0] and filedes[1] . The file descriptors reference two data streams. Historically, pipes were unidirectional, and data flowed in one direction only. If two-way communication was needed, two pipes were opened: one for reading and another for writing. This is still true in Linux today. However, in some versions of UNIX (such as Solaris) the file descriptors returned by pipe are full duplex (bidirectional) and are both opened for reading/writing.

In a full-duplex setting, if the process writes to filedes[0], then filedes[1] is used for reading; otherwise, the process writes to filedes[1], and filedes[0] is used for reading. In a half-duplex setting (such as in Linux), filedes[1] is always used for writing and filedes[0] is always used for reading. An attempt to write to filedes[0] or read from filedes[1] will produce an error (i.e., bad file descriptor).
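The half-duplex behavior can be checked directly. The following is a minimal sketch, assuming a Linux-style system where pipes are unidirectional; the helper name write_to_read_end_fails is invented for this illustration and is not part of any library.

```cpp
#include <cerrno>
#include <unistd.h>

// Hypothetical helper: returns true if writing to the read end of a
// freshly created pipe fails with EBADF (the half-duplex case).
bool write_to_read_end_fails() {
    int filedes[2];
    if (pipe(filedes) == -1)
        return false;                        // could not create the pipe
    errno = 0;
    ssize_t n = write(filedes[0], "x", 1);   // deliberately the wrong end
    bool failed = (n == -1 && errno == EBADF);
    close(filedes[0]);
    close(filedes[1]);
    return failed;
}
```

On a system with full-duplex pipes the same call would be expected to succeed, so the function would return false there.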

If the pipe system call fails, it returns -1 and sets errno (Table 5.6).

Table 5.6. pipe Error Messages.

#	Constant	`perror` Message	Explanation
23	ENFILE	File table overflow	System file table is full.
24	EMFILE	Too many open files	Process has exceeded the limit for number of open files.
14	EFAULT	Bad address	`filedes` is invalid.

As previously noted, data in a pipe is read on a FIFO basis. Program 5.1 shows a pair of processes (parent/child) that use a pipe to send the first argument passed on the command line to the parent as a message to the child. Notice that the pipe is established prior to forking the child process.

Program 5.1 Parent/child processes communicating via a pipe.

File : p5.1.cxx
 /* Using a pipe to send data from a parent to a child process
 */
 #include <iostream>
 #include <cstdio>
 #include <cstring>
 #include <unistd.h>
 using namespace std;
 int
 main(int argc, char *argv[ ]) {
   int f_des[2];
   static char message[BUFSIZ];             // static, so initially all zeros
   if (argc != 2) {
     cerr << "Usage: " << *argv << " message\n";
     return 1;
   }
   if (pipe(f_des) == -1) {                 // generate the pipe
     perror("Pipe"); return 2;
   }
   switch (fork( )) {
   case -1:
     perror("Fork"); return 3;
   case 0:                                  // In the child
     close(f_des[1]);                       // child does not write
     // Read at most BUFSIZ-1 bytes so message stays null terminated
     if (read(f_des[0], message, BUFSIZ - 1) != -1) {
       cout << "Message received by child: [" << message
            << "]" << endl;
       cout.flush();
     } else {
       perror("Read"); return 4;
     }
     break;
   default:                                 // In the parent
     close(f_des[0]);                       // parent does not read
     if (write(f_des[1], argv[1], strlen(argv[1])) != -1) {
       cout << "Message sent by parent : [" <<
               argv[1] << "]" << endl;
       cout.flush();
     } else {
       perror("Write"); return 5;
     }
   }
   return 0;
 }

In the parent process the "read" pipe file descriptor f_des[0] is closed, and the message (the string referenced by argv[1]) is written to the pipe file descriptor f_des[1]. In the child process the "write" pipe file descriptor f_des[1] is closed, and pipe file descriptor f_des[0] is read to obtain the message. While the closing of the unused pipe file descriptors is not required, it is a good practice. Remember that a read on an empty pipe blocks until some data is written, and that it returns 0 (end-of-file) only when all the write file descriptors for the pipe have been closed. A read may also return fewer bytes than were requested. The pipe file descriptors f_des[0] in the child and f_des[1] in the parent will be closed when each process exits. The output of Program 5.1 is shown in Figure 5.2.
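The role of closing the write end can be seen in a single process. The sketch below is illustrative only: the helper names read_until_eof and round_trip are invented here, and the message is assumed to be small enough to fit in the pipe's buffer so that the write does not block.

```cpp
#include <string>
#include <unistd.h>

// Hypothetical helper: drain a descriptor until read() reports
// end-of-file (0) or an error (-1), collecting whatever arrives.
std::string read_until_eof(int fd) {
    std::string data;
    char chunk[64];
    ssize_t n;
    while ((n = read(fd, chunk, sizeof chunk)) > 0)
        data.append(chunk, static_cast<size_t>(n));
    return data;
}

// Hypothetical helper: write msg into a new pipe, close the write end,
// then read everything back from the read end.
std::string round_trip(const std::string& msg) {
    int f_des[2];
    if (pipe(f_des) == -1)
        return "";
    if (write(f_des[1], msg.data(), msg.size()) == -1) {
        close(f_des[0]);
        close(f_des[1]);
        return "";
    }
    close(f_des[1]);      // no writers remain, so read() can report end-of-file
    std::string out = read_until_eof(f_des[0]);
    close(f_des[0]);
    return out;
}
```

If the close of f_des[1] were omitted, the loop in read_until_eof would block forever after the last byte, since the pipe would still have an open writer.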

Figure 5.2 Output of Program 5.1.

linux$ p5.1 Once_upon_a_starry_night
Message sent by parent : [Once_upon_a_starry_night]
Message received by child: [Once_upon_a_starry_night]

EXERCISE

Modify Program 5.1 so the child, upon receipt of the message, changes its case and returns the message (via a pipe) to the parent, where it is then displayed. On a system that does not support duplex pipes, you will need to generate two pipes prior to forking the child process.

At a command-line level, a pipe is specified by the | symbol. As shown in Figure 5.3, pipes are used to tie the standard output of one command to the standard input of another to create a command pipeline.

Figure 5.3. Using pipes on the command line.

For example, the command line sequence

linux$ ps -ef | grep $USER | cat -n

will execute the ps -ef command (which displays, in full form, the process status of all users) and pipe its output to the grep $USER command. The grep command prints those lines that contain the contents of the variable $USER (that is, the user's login). A second pipe passes the output of the grep command to cat, which (with option -n) displays its output as a numbered list. The redirection of the output of the ps command to be the input to the grep command and the output of the grep command to be the input of the cat command is accomplished with the inclusion of the command-line specification of a pipe. To achieve a similar arrangement with our parent/child pair, we need a way to associate standard input and standard output with the pipe we have created. This can be done either by using the dup or the dup2 system call (Tables 5.7 and 5.8).

The dup2 call supersedes the dup system call, but both bear discussion. The dup system call duplicates an original open file descriptor. The new descriptor is the lowest available nonnegative file descriptor, and it references the same system file table entry as the original. As a result, the new descriptor shares the same file pointer (offset), has the same access mode as the original, and shares locks. Both will remain open across an exec call; they do not, however, share the close-on-exec flag. An important point to consider is that when called, dup will always return the lowest available file descriptor.
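The lowest-available rule can be observed with a short experiment. This is a sketch under the assumption that the process is single threaded and nothing else opens or closes a file in between; the helper name dup_reuses_lowest_slot is invented for illustration.

```cpp
#include <unistd.h>

// Hypothetical helper: frees a descriptor slot, then checks that dup()
// hands back exactly that slot.
bool dup_reuses_lowest_slot() {
    int f_des[2];
    if (pipe(f_des) == -1)
        return false;
    int freed = f_des[0];        // pipe() took the two lowest free slots
    close(f_des[0]);             // so this one is now the lowest free slot
    int copy = dup(f_des[1]);    // duplicate the write end of the pipe
    bool same = (copy == freed);
    if (copy != -1)
        close(copy);
    close(f_des[1]);
    return same;
}
```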

Table 5.7. Summary of the dup System Call.

Include File(s)	`<unistd.h>`		Manual Section	2
Summary	`int dup( int oldfd );`
Return	Success	Failure	Sets `errno`
	Next available nonnegative file descriptor	-1	Yes

Table 5.8. Summary of the dup2 System Call.

Include File(s)	`<unistd.h>`		Manual Section	2
Summary	`int dup2( int oldfd, int newfd );`
Return	Success	Failure	Sets `errno`
	`newfd` as a file descriptor for `oldfd`	-1	Yes

A code sequence of

int f_des[2];
pipe(f_des);
close( fileno(stdout) );   // close standard output
dup(f_des[1]);             // duplicate 1st free descriptor as write end of pipe
.
.
.

declares and generates a pipe. The file descriptor for standard output (say, file descriptor 1) is closed. The following dup system call returns the lowest available file descriptor, which in this case should be the previously closed standard output file descriptor (i.e., 1). Thus, any data written to standard output in following statements would now be written to the pipe. Notice that there are two steps in this sequence: closing the descriptor and then dup-ing the write end of the pipe. There is an outside chance that the sequence will be interrupted and the descriptor returned by dup will not be the one that was just closed. This could happen if a signal was caught and the signal-catching routine closed a file.

Enter the dup2 system call. The dup2 system call closes and duplicates the file descriptor as a single atomic action. When calling dup2, there is no time at which newfd is closed and oldfd has not yet been duplicated. If the file referenced by newfd is already open, it will be closed before the duplication is performed. For those more stout of heart, both the dup and dup2 calls can be implemented with the fcntl system call (when passed the proper flag values).
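As an illustration of both points, the sketch below uses dup2 to point standard output at a pipe and fcntl with F_DUPFD (which, given a minimum of 0, acts like dup) to keep a copy of the original descriptor. The helper name capture_via_dup2 is invented here, and the message is assumed to be short enough to fit in the pipe's buffer.

```cpp
#include <fcntl.h>
#include <string>
#include <unistd.h>

// Hypothetical helper: temporarily redirects descriptor 1 into a pipe,
// writes msg through it, restores the original standard output, and
// returns what arrived at the read end of the pipe.
std::string capture_via_dup2(const std::string& msg) {
    int f_des[2];
    if (pipe(f_des) == -1)
        return "";
    // Keep a copy of the real standard output; F_DUPFD with 0 acts like dup().
    int saved = fcntl(STDOUT_FILENO, F_DUPFD, 0);
    if (saved == -1 || dup2(f_des[1], STDOUT_FILENO) == -1) {
        if (saved != -1)
            close(saved);
        close(f_des[0]);
        close(f_des[1]);
        return "";
    }
    close(f_des[1]);                 // descriptor 1 is now the only write end
    ssize_t sent = write(STDOUT_FILENO, msg.data(), msg.size());
    dup2(saved, STDOUT_FILENO);      // restore stdout; closes the last write end
    close(saved);
    std::string out;
    char chunk[128];
    ssize_t n;
    while (sent > 0 && (n = read(f_des[0], chunk, sizeof chunk)) > 0)
        out.append(chunk, static_cast<size_t>(n));
    close(f_des[0]);
    return out;
}
```

The function uses write rather than cout so that no stream buffering stands between the call and the descriptor.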

A short program that mimics the last | sort command-line sequence is shown in Program 5.2. The files/pipes for the two processes, once Program 5.2 successfully executes the fork system call in line 17, are shown in Figure 5.4.

Figure 5.4 Initial entries for files/pipes.


parent			child
0	stdin	stdin	0
1	stdout	stdout	1
2	stderr	stderr	2
3	f_des[0]	f_des[0]	3
4	f_des[1]	f_des[1]	4
5	...	...	5
6			6

Assuming a fairly standard setting (i.e., stdin = 0, stdout = 1, stderr = 2) with both stdout and stderr mapped to the same device (most likely the terminal), initially both the parent and child processes reference the same entries in the system file table. In the child process, the dup2 call closes standard output and reopens its descriptor as a duplicate of f_des[1]. The descriptor for standard output is thus associated with the file table entry for the write end of the pipe. Once this association has been made, the file descriptors f_des[0] and f_des[1] are closed, as they are not needed by the child process.

Program 5.2 A last | sort pipeline.

File : p5.2.cxx
 /* A home grown last | sort cmd pipeline
 */
 #define _GNU_SOURCE
 #include <iostream>
 #include <cstdio>
 #include <unistd.h>
 using namespace std;
 enum { READ, WRITE };

 int
 main( ) {
   int f_des[2];
   if (pipe(f_des) == -1) {
     perror("Pipe");
     return 1;
   }
   switch (fork( )) {
   case -1:
     perror("Fork");
     return 2;
   case 0:                                  // In the child
     dup2( f_des[WRITE], fileno(stdout));   // stdout is now the write end
     close(f_des[READ] );
     close(f_des[WRITE]);
     execl("/usr/bin/last", "last", (char *) 0);
     return 3;                              // reached only if execl fails
   default:                                 // In the parent
     dup2( f_des[READ], fileno(stdin));     // stdin is now the read end
     close(f_des[READ] );
     close(f_des[WRITE]);
     execl("/bin/sort", "sort", (char *) 0);
     return 4;                              // reached only if execl fails
   }
   return 0;
 }

In the parent process, the dup2 call closes standard input and reopens its descriptor as a duplicate of f_des[0]. The entries for the files/pipes would now look like those shown in Figure 5.5. In the parent process, stdout and stderr have not been modified. However, stdin is now the read end of the pipe shared with the child. In the child process, stdin and stderr keep their default values. However, stdout has been associated with the write end of the pipe shared with the parent.

Figure 5.5. End entries for files/pipes.

When running Program 5.2, the two processes (parent and child) run concurrently. The order in which they are scheduled is not guaranteed. For the processes involved, this is not a concern: a read from an empty pipe blocks until data is written, and a write to a full pipe blocks until data is read, so the pipe itself keeps the two processes in step.

We can summarize the steps involved for communication via unnamed pipes:

Create the pipe(s) needed.
Generate the child process(es).
Close/duplicate file descriptors to properly associate the ends of the pipe.
Close the unneeded ends of the pipe.
Perform the communication activities.
Close any remaining open file descriptors.
If appropriate, wait for child processes to terminate.
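
The steps above can be sketched in one short function. This is a minimal illustration, not a general-purpose routine: the helper name run_echo is invented here, and it assumes an echo program can be found on the search path.

```cpp
#include <string>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: runs "echo <word>" in a child process and returns
// the child's standard output as captured through an unnamed pipe.
std::string run_echo(const char* word) {
    int f_des[2];
    if (pipe(f_des) == -1)                      // create the pipe
        return "";
    pid_t pid = fork();                         // generate the child
    if (pid == -1) {
        close(f_des[0]);
        close(f_des[1]);
        return "";
    }
    if (pid == 0) {                             // in the child
        dup2(f_des[1], STDOUT_FILENO);          // associate stdout with the write end
        close(f_des[0]);                        // close the unneeded ends
        close(f_des[1]);
        execlp("echo", "echo", word, (char*) 0);
        _exit(127);                             // reached only if exec fails
    }
    close(f_des[1]);                            // parent does not write
    std::string out;
    char chunk[256];
    ssize_t n;
    while ((n = read(f_des[0], chunk, sizeof chunk)) > 0)   // communicate
        out.append(chunk, static_cast<size_t>(n));
    close(f_des[0]);                            // close what remains open
    int status;
    waitpid(pid, &status, 0);                   // wait for the child to terminate
    return out;
}
```

Closing f_des[1] in the parent before reading matters: otherwise the parent itself would hold a write end open and the read loop would never see end-of-file.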

If either dup or dup2 fails, it returns -1 and sets errno. The error codes for dup and dup2 are shown in Table 5.9.

Table 5.9. dup / dup2 Error Messages.

#	Constant	`perror` Message	Explanation
4	EINTR	Interrupted system call	Signal was caught during the system call.
9	EBADF	Bad file descriptor	The file descriptor is invalid.
24	EMFILE	Too many open files	Process has exceeded the limit for number of open files.
67	ENOLINK	The link has been severed	The file descriptor value references a remote system that is no longer active.

EXERCISE

Most UNIX-based systems include a utility program called tee that copies standard input to standard output and to the file(s) named on the command line. Thus, the command sequence

linux$ cat x.c | tee /dev/tty | wc

would cat the contents of the file x.c and pipe the standard output to tee. The tee program would copy its standard input (from the cat command) to the file /dev/tty and to its standard output, where it would be piped to the wc (word count) program. Using unnamed pipes, write your own version of tee called my_tee. Hint: If you do not know your terminal device, on most systems the command tty will display the device. If tty does not work, try the who command. When passing the name of the terminal device to your my_tee program, be sure to include the full path for the device.

EXERCISE

Modify Program 5.2 so a variable number of commands can be passed to the program. Each command passed to the program should be connected to the next command via a pipe. When using this new program, a three-command sequence such as

linux$ last | sort | more

would be indicated as

linux$ my_p5.2 last sort more

EXERCISE

Rework the program written for Exercise 4.3 (the producer/consumer problem in Chapter 4) so the producer and consumer now use a pipe to communicate with one another.

Since the sequence of generating a pipe, forking a child process, duplicating file descriptors, and passing command execution information from one process to another via the pipe is relatively common, a set of standard library functions is available to simplify this task: popen and pclose . See Tables 5.10 and 5.11.

Table 5.10. Summary of the popen Library Function.

Include File(s)	`<stdio.h>`		Manual Section	3
Summary	`FILE *popen( const char *command, const char *type );`
Return	Success	Failure	Sets `errno`
	Pointer to a FILE	NULL pointer	Sometimes

Table 5.11. Summary of the pclose Library Function.

Include File(s)	`<stdio.h>`		Manual Section	3
Summary	`int pclose( FILE *stream );`
Return	Success	Failure	Sets `errno`
	Exit status of command	-1	Sometimes

When successful, the popen call returns a pointer to a file stream (not an integer file descriptor). The arguments for popen are a pointer to the shell command [2] that will be executed and an I/O mode type . The I/O mode type (read or write) determines how the process will handle the file pointer returned by the popen call.

[2] This can be any valid Bourne shell command, including those with I/O redirection. Most often, the command is placed in a doubly quoted string.

When invoked, the popen call automatically generates a child process. The child process execs a Bourne shell (/bin/sh), which will execute the passed shell command. Input to and output from the child process is accomplished via a pipe. If the I/O mode type for popen is specified as w, the parent process can write to the standard input of the shell command. In other terms, writing to the file pointer reference generated by the popen in the parent process will enable the child process running the shell command to read the data as its standard input. Conversely, if the I/O type is r, using the popen file pointer, the parent process can read from the standard output of the shell command (run by the child process). By default, the I/O stream generated by popen is fully buffered.
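A write-mode stream might be used as in the following sketch. The helper name feed_lines and the three sample lines are invented for illustration; the function returns the command's exit code, or -1 if something went wrong along the way.

```cpp
#include <cstdio>
#include <sys/wait.h>

// Hypothetical helper: sends three lines to the standard input of a
// shell command started with popen in "w" mode.
int feed_lines(const char* command) {
    FILE* fout = popen(command, "w");
    if (fout == NULL)
        return -1;
    fputs("pear\napple\nfig\n", fout);   // buffered until flushed or closed
    int status = pclose(fout);           // flush, close the pipe, wait for the command
    if (status == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

A call such as feed_lines("sort") would let sort read the three lines as its standard input and write its result to the terminal.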

If popen fails due to an inability to allocate memory, errno will not be set. However, if the mode type is specified incorrectly, popen sets errno to EINVAL.

The pclose call is used to close a data stream opened with a popen call. If the data stream being closed is associated with a popen, pclose returns the exit status of the shell command referenced by the popen. If the data stream is not associated with a popen call, the pclose call returns a value of -1. If pclose is unable to obtain the status of the child process, errno is set to ECHILD.
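A read-mode stream and the value returned by pclose can be handled as in the sketch below. The value from pclose is encoded like the status from waitpid, so the macros in <sys/wait.h> are used to extract the exit code. The helper name run_command is invented for illustration.

```cpp
#include <cstdio>
#include <string>
#include <sys/wait.h>

// Hypothetical helper: runs a shell command with popen, appends its
// standard output to `output`, and returns the command's exit code
// (or -1 if popen or pclose failed or the command did not exit normally).
int run_command(const char* command, std::string& output) {
    FILE* fin = popen(command, "r");
    if (fin == NULL)
        return -1;
    char line[256];
    while (fgets(line, sizeof line, fin) != NULL)
        output += line;
    int status = pclose(fin);            // wait status of the shell
    if (status == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```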

Program 5.3 shows one way the popen and pclose calls can be used to pipe the output of one shell command to the input of another.

Program 5.3 Using popen and pclose .

File : p5.3.cxx
 /* Using the popen and pclose I/O commands
 */
 #define _GNU_SOURCE
 #include <iostream>
 #include <cstdio>
 #include <unistd.h>
 #include <limits.h>
 using namespace std;
 int
 main(int argc, char *argv[ ]) {
   FILE *fin, *fout;
   char buffer[PIPE_BUF];
   int n;
   if (argc < 3) {
     cerr << "Usage: " << *argv << " cmd1 cmd2" << endl;
     return 1;
   }
   if ((fin = popen(argv[1], "r")) == NULL) { cerr << "popen failed" << endl; return 2; }
   if ((fout = popen(argv[2], "w")) == NULL) { cerr << "popen failed" << endl; return 3; }
   fflush(fout);
   while ((n = read(fileno(fin), buffer, PIPE_BUF)) > 0)   // copy cmd1 output
     write(fileno(fout), buffer, n);                       // to cmd2 input
   pclose(fin);
   pclose(fout);
   return 0;
 }

As written, Program 5.3 requires two command-line arguments: two shell commands whose standard output/input is redirected via pipes generated when using the popen call. The first popen call, with the I/O option of r, directs the system to fork a child process that will execute the shell command referenced by argv[1]. The output of the command will be redirected so it can be read by the parent process when using the file pointer reference fin. In a similar manner, the second popen, with the I/O option of w, directs the system to fork a second child process. As this child process executes its shell command (referenced by argv[2]), its standard input will be the data written to the pipe by the parent process, and its output will go to standard output. The parent process writes data to the second pipe using the file pointer reference fout and reads data from the first pipe using the file pointer reference fin. The while loop in the program is used to copy the data from the output end of one pipe to the input end of the other. The call to fflush in line 20 of the program is used to clear buffered output so that it will not be interleaved with data in the pipe.

Figure 5.6 depicts the arrangement when the shell commands last and more are passed on the command line to Program 5.3.

Figure 5.6. Program 5.3 relationships when invoked as p5.3 last more .

EXERCISE

Using just the popen call to generate pipes, can we create a pipeline consisting of three separate shell commands (e.g., a program that, when passed three shell commands on the command line, would pipe the commands together in the manner cmd1 | cmd2 | cmd3)? If yes, write a program that shows how this can be done. If no, give the reason(s) why.
