Introduction | Sockets

One of the nice things about UNIX and its variants is that they use a common interface for access to files and devices that reside on a single host. Using a file descriptor returned by an open system call, a process can read data from and write data to whatever the descriptor refers to. This can be done without being overly concerned with the underlying mechanics and without knowing to which device the descriptor has been mapped (e.g., the screen or a file on disk).

When we discussed the use of pipes, we saw a similar approach. With pipes, we could have two-way (duplex) communications using read and write system calls (on Linux, where a single pipe carries data in one direction only, by using a pair of pipes), as long as the processes involved were related. Again, the processes communicated by using read and write system calls as if they were dealing with files. When we discussed System V-based message queues, semaphores, and shared memory as interprocess communication techniques, we began to stray from the read/write paradigm. We also found that while we could use some of these techniques for interprocess communication, even with unrelated processes, each technique had its own special method for sending and receiving information. Unfortunately, while these techniques are powerful, and certainly have their place, their arcane syntax is somewhat restrictive.

In the last chapter, we examined remote procedure calls (RPC). The RPC API (application program interface) was developed to ease the burden of writing applications that require communication between unrelated processes residing in a distributed environment. In attempting to make things easier, the developers of RPC have, in some cases, made things more complex and restrictive. RPC applications, by nature, have a large number of ancillary files whose contents and relationships may at times obscure their functionality. In an RPC-based application, it is easy to lose touch with and control of the mechanics of the communication process.

It would seem that what is needed is an extension of the basic read/write paradigm, with sufficient networking semantics added to permit unrelated processes on different hosts to communicate as if they were reading from and writing to a local file. This intermediate level of interprocess communication would lie somewhere between pipes, message queues, and shared memory techniques on one side and RPC applications on the other. Fortunately, in UNIX there are several application interfaces that support this type of communication and that are in fact the underlying basis for the RPC interface.
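To make the read/write paradigm concrete, the following is a minimal sketch (not taken from the text) in Python, whose `os` and `socket` modules are thin wrappers over the corresponding UNIX system calls. The helper names `echo_through` and `demo` are hypothetical, chosen only for illustration. The sketch assumes a UNIX-like host and shows that the same descriptor-level read and write calls work whether the descriptors belong to a pipe or to a connected pair of sockets.

```python
import os
import socket


def echo_through(read_fd, write_fd, payload):
    """Write payload to one descriptor, then read it back from another.

    Hypothetical helper: it neither knows nor cares what kind of
    object the two descriptors refer to.
    """
    os.write(write_fd, payload)
    return os.read(read_fd, len(payload))


def demo():
    # A pipe yields two descriptors: one end to read, one end to write.
    read_end, write_end = os.pipe()
    via_pipe = echo_through(read_end, write_end, b"hello")
    os.close(read_end)
    os.close(write_end)

    # A connected socket pair is also reached through descriptors,
    # so the very same helper works without modification.
    left, right = socket.socketpair()
    via_socket = echo_through(left.fileno(), right.fileno(), b"hello")
    left.close()
    right.close()

    return via_pipe, via_socket


if __name__ == "__main__":
    print(demo())
```

The helper is deliberately ignorant of what lies behind each descriptor. That is the property the rest of this chapter relies on when the other end of the channel is a process on a different host.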

The most common APIs that provide remote interprocess communications are the Berkeley socket interface, introduced in the early 1980s, and the Transport Layer Interface (TLI), implemented by AT&T in the mid-1980s. There is much discussion as to which of these offers the better solution for remote interprocess communication. As the Berkeley socket interface preceded TLI, the majority of existing remote interprocess communication code is written with sockets. However, Berkeley sockets are not transport-independent and must be used with caution in a multithreaded processing environment. TLI, on the other hand, is designed to be transport-independent (i.e., applications can access transport specifics in a protocol-independent manner). Unfortunately, to date, not all transport protocols support every TLI service. Unlike sockets, TLI is STREAMS-based and requires that the application push a special module on the stream before performing reads and writes. The concept of privileged ports (a Berkeley concept) is not supported with TLI. In addition, broadcasting (the ability to send the same message to a group of hosts) is not transport-independent. Recently, TLI has begun to be replaced by the X/Open Transport Interface (XTI).

In this chapter we explore the Berkeley socket interface. Conceptually, a socket is an abstract data structure that is used to create a channel (connection point) for sending and receiving information between unrelated processes. Once a channel is established, the connected processes can communicate using generalized, file-system-style access routines. For the most part, when using a socket-based connection, the server process creates a socket, maps the socket to a local address, and waits (listens) for requests from clients. The client process creates its own socket and determines the location specifics (such as the host name and port number) of the server. Depending upon the type of transport/connection specified, the client process will begin to send and receive data either with or without receiving a formal acknowledgment (acceptance) from the server process.
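The sequence just described can be sketched as follows. This is an illustrative outline only, written in Python because its `socket` module mirrors the Berkeley calls (`socket`, `bind`, `listen`, `accept`, `connect`) closely; it is not the code developed later in the chapter. The function names, the use of the loopback address 127.0.0.1, the kernel-assigned port, and the single-threaded echo server are all assumptions made to keep the example self-contained on one host.

```python
import socket
import threading


def serve_once(server_sock):
    """Accept one client, return its data in upper case, then hang up."""
    conn, _client_addr = server_sock.accept()   # formal acceptance
    with conn:
        data = conn.recv(1024)
        conn.sendall(data.upper())


def stream_round_trip(message):
    """Connection-oriented exchange: the server must accept the client."""
    # Server: create a socket, map it to a local address, and listen.
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))   # port 0 asks the kernel for a free port
    server.listen(1)
    host, port = server.getsockname()

    worker = threading.Thread(target=serve_once, args=(server,), daemon=True)
    worker.start()

    # Client: create its own socket and connect to the server's location.
    with socket.create_connection((host, port)) as client:
        client.sendall(message)
        reply = client.recv(1024)

    worker.join()
    server.close()
    return reply


def datagram_round_trip(message):
    """Connectionless exchange: data is sent with no acceptance step."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))
    receiver.settimeout(2.0)        # datagrams carry no delivery guarantee

    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender.sendto(message, receiver.getsockname())

    data, _sender_addr = receiver.recvfrom(1024)
    sender.close()
    receiver.close()
    return data


if __name__ == "__main__":
    print(stream_round_trip(b"hello"))
    print(datagram_round_trip(b"ping"))
```

The two functions differ mainly in the socket type requested. With `SOCK_STREAM` the client's data flows only after the server has accepted the connection, whereas with `SOCK_DGRAM` the sender simply addresses each message to the receiver's host and port.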
