Sockets: The Connection-Oriented Paradigm


When using sockets for interprocess communications, we can specify the socket type as either connection-oriented (type SOCK_STREAM) or connectionless (type SOCK_DGRAM). The sequence of events that must occur for connection-oriented communications is shown in Figure 10.6. In this setting, the process initiating the connection is the client process and the process receiving the connection is the server.

Figure 10.6. A connection-oriented client-server communication sequence.

As shown, both the client and server processes use the socket call to create a new instance of a socket. The socket will act as a queuing point for data exchange. The summary for the socket system call is shown in Table 10.4.

The socket system call takes three arguments. The arguments parallel those for the socketpair call without the fourth integer array/socket pair reference. In short, the socket call takes an integer value (one of the defined constants in the file) that indicates the protocol family as its first argument. At present, the protocol families shown in Table 10.5 are supported.

The second argument, type, denotes the socket type (such as SOCK_STREAM or SOCK_DGRAM). The third argument, protocol, is the specific protocol to be used within the indicated address/protocol family. As with the socketpair call, we will most often set this value to 0 to let the system choose the protocol based on the protocol family. If the socket call is successful, it returns a non-negative integer that serves as the socket descriptor. If the call fails, it returns -1 and sets errno. The values errno can take, and an interpretation of each error message, are shown in Table 10.6.

Table 10.4. Summary of the socket System Call.

Include File(s)			Manual Section	2
Summary	`int socket( int domain, int type, int protocol );`
Return	Success	Failure	Sets `errno`
	A non-negative integer socket descriptor	-1	Yes

Table 10.5. Supported Protocol Families.

Constant	Protocol Family
PF_APPLETALK	Appletalk
PF_ATMPVC	Access to raw ATM PVCs
PF_AX25	Amateur radio AX.25 protocol
PF_INET	IPv4 Internet protocols
PF_INET6	IPv6 Internet protocols
PF_IPX	IPX - Novell protocols
PF_NETLINK	Kernel user interface device
PF_PACKET	Low-level packet interface
PF_UNIX, PF_LOCAL	Local communication
PF_X25	ITU-T X.25 / ISO-8208 protocol x25(7)

In some, but not all, development settings, a program that makes a socket call must be explicitly linked with the socket library at compile time using the option -lsocket.

Table 10.6. socket Error Messages.

#	Constant	`perror` Message	Explanation
12	ENOMEM	Cannot allocate memory	When creating a socket, insufficient memory available.
13	EACCES	Permission denied	Cannot create a socket of the specified type/protocol.
22	EINVAL	Invalid argument	Unknown protocol or protocol family is not available.
23	ENFILE	Too many open files in system	Insufficient kernel (system) memory for socket allocation.
24	EMFILE	Too many open files	This process has reached the limit for open file descriptors.
93	EPROTONOSUPPORT	Protocol not supported	Requested `protocol` not supported on this system or within this domain.
105	ENOBUFS	No buffer space available	Socket cannot be created until resources are freed.
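
The creation step and its error handling can be sketched as follows. This is a minimal illustration rather than one of the chapter's programs: the helper name open_stream_socket is hypothetical, and a Linux/POSIX system is assumed.

```cpp
#include <cerrno>        // errno
#include <cstdio>        // fprintf
#include <cstring>       // strerror
#include <sys/socket.h>  // socket, PF_*, SOCK_STREAM
#include <unistd.h>      // close

// Hypothetical helper: create a connection-oriented (SOCK_STREAM) socket in
// the given protocol family, letting the system choose the protocol (0).
// Returns the socket descriptor, or -1 after reporting the reason.
int open_stream_socket(int family) {
    int sd = socket(family, SOCK_STREAM, 0);
    if (sd < 0) {
        // errno now holds an error code such as those listed in Table 10.6
        std::fprintf(stderr, "socket: %s (errno %d)\n",
                     std::strerror(errno), errno);
        return -1;
    }
    return sd;
}
```

A caller might write int sd = open_stream_socket(PF_UNIX); and later release the descriptor with close(sd).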

EXERCISE

Read the manual page on netstat . Issue the following command to determine all listening sockets and their associated port numbers on your system:

linux$ netstat -l -n        # alt_cmd$ netstat -a -n | grep LISTEN

How many sockets are listening for connections? If the system is listening on port 80, issue the command sequence (note the back ticks):

linux$ telnet `hostname` 80

When the system responds, type GET /. What happens? What information is returned to the screen? What does the / stand for?

Initially, when the socket is created, it is unbound (i.e., there is no name or address/port number pair associated with the socket). If the process creating the socket is to act as a server, the socket must be bound. This is similar in concept to the assignment of a phone number to an installed phone or a street name and number to a mailing address. The bind system call is used to associate a name or address/port pair with a socket. If the socket is to be used in the UNIX domain, a file name must be provided, and we say the address resides in local namespace. In the Internet domain, an address/port pair must be assigned, and we say the address resides in Internet namespace. Table 10.7 provides a summary of the bind system call.

Table 10.7. Summary of the bind System Call

Include File(s)			Manual Section	2
Summary	int bind(int sockfd, struct sockaddr *my_addr, socklen_t addrlen);
Return	Success	Failure	Sets `errno`
	0	-1	Yes

The first argument for bind is an integer value that has been returned from a previous successful socket call. The second argument, a reference to a sockaddr structure, is a real gem. The short explanation is the my_addr argument references a generic address structure of the type:

/* Structure describing a generic socket address. */
struct sockaddr {
 __SOCKADDR_COMMON (sa_); /* Common data: adr family and length. */
 char sa_data[14]; /* Address data. */
};

We will use this structure definition as a starting point for our address references. In a few paragraphs we will come back to the use of this structure. For those who enjoy a syntactical challenge, a more detailed explanation of the definition of the sockaddr structure, as well as the definition of the SOCKADDR_COMMON macro, can be found in the first 100 or so lines of the include file .

Again, bind can be used for both UNIX and Internet domain sockets. For UNIX domain sockets, a reference to a file must be bound to the socket. A UNIX socket domain address is defined in the header file <sys/un.h> [8] as

[8] If we are working with UNIX domain sockets, this file must be in the include list of the program. If you do some spelunking, you will find the full definition of the UNIX sock structure is actually found in one of the files included by the file.

#define UNIX_PATH_MAX 108
struct sockaddr_un {
 sa_family_t sun_family; /* AF_UNIX */
 char sun_path[UNIX_PATH_MAX]; /* pathname */
};

When this structure is used, the sockaddr_un.sun_family member is usually assigned the defined constant AF_UNIX to indicate UNIX addressing is being used. The second member, sun_path, is the path (absolute or relative) to the file name to be bound to the socket. In the UNIX domain, bind creates a file entry for the socket. If the file is already present, an error occurs. When listing a directory in long format with the ls command, a file that is bound to the socket will have the letter p or s as its file type, indicating it is a pipe or socket. The number of bytes in the file will be listed as 0. The maximum length for the sun_path member, including the terminating null character, is 108 characters.
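
As a hedged sketch of how such an address might be filled in, the hypothetical helper make_unix_address below copies a path into sun_path and computes a length to pass along with it. The length formula used here (offset of sun_path, plus the path length, plus one for the terminating null) is one common convention on Linux and is an assumption of this sketch, not the chapter's own expression.

```cpp
#include <cstddef>       // offsetof, size_t
#include <cstring>       // memset, strlen, strcpy
#include <sys/socket.h>  // AF_UNIX, socklen_t
#include <sys/un.h>      // struct sockaddr_un

// Hypothetical helper: fill 'adr' for the UNIX domain path 'path'.
// Returns the address length to hand to bind or connect, or 0 if the path
// (plus its terminating null) does not fit in sun_path.
socklen_t make_unix_address(const char *path, struct sockaddr_un &adr) {
    std::size_t n = std::strlen(path);
    if (n == 0 || n >= sizeof(adr.sun_path)) return 0;
    std::memset(&adr, 0, sizeof(adr));
    adr.sun_family = AF_UNIX;
    std::strcpy(adr.sun_path, path);       // safe: length checked above
    return static_cast<socklen_t>(
        offsetof(struct sockaddr_un, sun_path) + n + 1);
}
```

For the path "./my_sock" (9 characters) the helper returns the offset of sun_path plus 10.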

If the socket is to be used in the Internet domain, the addressing structure found in the file <netinet/in.h> is used. As with UNIX domain sockets, if we are working with Internet domain sockets, this file must be in the include list of the program. This address structure is defined as

struct sockaddr_in {
 sa_family_t sin_family; /* address family: AF_INET */
 u_int16_t sin_port; /* 16 bit port in network byte order */
 struct in_addr sin_addr; /* internet address structure */
};

struct in_addr {
 u_int32_t s_addr; /* 32 bit address in network byte order */
};

Keep in mind that in Internet namespace, we must map the socket to an Internet address/port number pair. To accomplish this, we use the sockaddr_in structure shown above. The first member of the structure, like the sockaddr_un structure, is an integer value that indicates the address family. In this scenario, this member is assigned the value AF_INET. The second member, sin_port , indicates the port number for the service. The port number is a 16-bit value that acts as an offset [9] to the indicated Internet address and references the actual endpoint for the communication. A list of assigned port numbers can be obtained by viewing the contents of the /etc/services file. A partial excerpt from a local /etc/services file is shown in Figure 10.7.

[9] Sticking with our phone system analogy for connection-oriented protocol, this would be similar to an extension for a given phone number.

Figure 10.7 Partial contents of a local /etc/services file.

linux$ cat /etc/services

# /etc/services:
# $Id: services,v 1.17 2001/02/28 20:11:31 notting Exp $
#
# Network services, Internet style
...

# Each line describes one service, and is of the form:
#
# service-name port/protocol [aliases ...] [# comment]
tcpmux 1/tcp # TCP port service multiplexer
tcpmux 1/udp # TCP port service multiplexer
rje 5/tcp # Remote Job Entry
rje 5/udp # Remote Job Entry
echo 7/tcp
echo 7/udp
discard 9/tcp sink null
discard 9/udp sink null
systat 11/tcp users
systat 11/udp users
...

As can be seen, each service has a name, such as echo, is associated with a specific port (e.g., 7), and uses a specific protocol (e.g., tcp). Ports with values less than 1024 are reserved (they can only be used by processes whose effective ID is root). Many of these low-numbered ports are considered to be well-known; that is, they consistently have the same value and are always associated with the same type of service using the same protocol. A port can be associated with more than one protocol.

The third member of the sockaddr_in structure, sin_addr , is a reference to an in_addr structure. This structure, with just one member, holds the actual 32-bit host Internet address value (with adjustments for byte-orderings; i.e., little endian versus big endian). We present the details of how to fill in the sockaddr_in structure in the example section.
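
The byte-order adjustments can be illustrated with a short sketch. The helper name make_inet_address is hypothetical, as are any address and port values passed to it; htons and inet_pton are the standard conversion calls assumed here.

```cpp
#include <arpa/inet.h>   // htons, inet_pton
#include <cstdint>       // std::uint16_t
#include <cstring>       // memset
#include <netinet/in.h>  // struct sockaddr_in
#include <sys/socket.h>  // AF_INET

// Hypothetical helper: fill 'adr' from a dotted-decimal IPv4 address and a
// port number given in host byte order.
// Returns true on success, false if 'dotted' is not a valid IPv4 address.
bool make_inet_address(const char *dotted, std::uint16_t port,
                       struct sockaddr_in &adr) {
    std::memset(&adr, 0, sizeof(adr));
    adr.sin_family = AF_INET;
    adr.sin_port = htons(port);   // 16-bit port, host to network byte order
    // inet_pton stores the 32-bit address already in network byte order
    return inet_pton(AF_INET, dotted, &adr.sin_addr) == 1;
}
```

A call such as make_inet_address("127.0.0.1", 8080, adr) leaves adr ready to be cast to (struct sockaddr *) for bind or connect.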

We are now ready to return to the generic sockaddr structure, which is the second argument of the bind call. There are two members in the generic sockaddr structure. The first member, the macro __SOCKADDR_COMMON (sa_) , is used to indicate the address family information. The second member, sa_data , is a reference to the actual address. To make bind work, we first populate the appropriate addressing structure ( sockaddr_un or sockaddr_in ) with correct values. Then, when passing the structure as an argument to bind , cast the reference as (struct sockaddr *) to convince bind that we are passing a reference to the proper structure type. The third argument to bind , which provides the size of the address structure, helps to resolve things such as being able to pass a UNIX domain address with a 108-byte file/path reference. Again, the details of how to calculate the size of the address structure are presented in the example section.

If bind is successful, it returns a 0; otherwise, it returns a -1 and sets the value of errno. Table 10.8 summarizes the errors associated with a failure of bind for both local (UNIX) and Internet namespace.

Table 10.8. bind Error Messages.

#	Constant	`perror` Message	Explanation
2	ENOENT	No such file or directory	Component of the path for the file `name` entry does not exist.
9	EBADF	Bad file descriptor	`sockfd` reference is invalid.
12	ENOMEM	Cannot allocate memory	Insufficient memory.
13	EACCES	Permission denied	The address is protected and the user is not the superuser. Search access denied for part of the path specified by `my_addr`.
14	EFAULT	Bad address	`my_addr` references address outside user's space.
20	ENOTDIR	Not a directory	Part of the path of `my_addr` is not a directory.
22	EINVAL	Invalid argument	`addrlen` is invalid. `sockfd` already bound to an address.
30	EROFS	Read-only file system	File would reside on a read-only file system.
36	ENAMETOOLONG	File name too long	`my_addr` name is too long.
40	ELOOP	Too many levels of symbolic links	Too many symbolic links in `my_addr` .
63	ENOSR	Out of streams resources	Insufficient STREAMS resources for specified operation.
88	ENOTSOCK	Socket operation on non-socket	`sockfd` is a file descriptor, not a socket descriptor.
98	EADDRINUSE	Address already in use	Specified address already in use.
99	EADDRNOTAVAIL	Cannot assign requested address	The specified address is not available on the local system.

While our primary concern is with Internet domain protocols, a UNIX domain socket may also be bound. In the UNIX domain, an actual file entry is generated that should be removed ( unlinked ) when the user is done with the socket.
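
A minimal sketch of binding in the UNIX domain, including the cast to the generic address type, might look like the following. The helper name bind_unix_socket is hypothetical, and passing sizeof(adr) as the length is a simplification that assumes a null-terminated path on Linux.

```cpp
#include <cerrno>        // errno, ENAMETOOLONG
#include <cstring>       // memset, strlen, strcpy
#include <sys/socket.h>  // socket, bind
#include <sys/un.h>      // struct sockaddr_un
#include <unistd.h>      // close

// Hypothetical helper: create a UNIX domain stream socket and bind it to
// 'path'. Returns the bound descriptor, or -1 with errno describing the
// failure (see Table 10.8).
int bind_unix_socket(const char *path) {
    struct sockaddr_un adr;
    if (std::strlen(path) >= sizeof(adr.sun_path)) {
        errno = ENAMETOOLONG;
        return -1;
    }
    std::memset(&adr, 0, sizeof(adr));
    adr.sun_family = AF_UNIX;
    std::strcpy(adr.sun_path, path);

    int sd = socket(PF_UNIX, SOCK_STREAM, 0);
    if (sd < 0) return -1;
    // The cast presents the UNIX domain structure as a generic sockaddr.
    if (bind(sd, reinterpret_cast<struct sockaddr *>(&adr), sizeof(adr)) < 0) {
        int saved = errno;   // close may overwrite errno
        close(sd);
        errno = saved;
        return -1;
    }
    return sd;
}
```

When the caller is done, it would close the descriptor and unlink the path so that a later bind to the same name does not fail with EADDRINUSE.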

Continuing with the server process in the connection-oriented setting, the next system call issued is to listen. This call, which only applies to sockets of type SOCK_STREAM or SOCK_SEQPACKET, creates a queue for incoming connection requests. If the queue is full and the protocol does not support retransmission, the client process generating the request will receive the error ECONNREFUSED from the server. If the protocol does support retransmission, the request is ignored, so a subsequent retry can succeed. The summary for listen is given in Table 10.9.

The first argument of the listen system call is a valid integer socket descriptor. The second argument, backlog, denotes the maximum size of the queue. Originally, BSD-based documentation indicated that there was no limit to the value for backlog. However, in many versions of BSD-derived UNIX, the limit was set to five for any backlog value greater than five. As of Linux 2.2, the backlog value specifies the queue length for completely established sockets waiting to be accepted, rather than the number of incomplete connection requests. If needed, the maximum queue size for incomplete socket requests can be set with the /sbin/sysctl command using the tcp_max_syn_backlog variable.

Table 10.9. Summary of the listen System Call.

Include File(s)			Manual Section	2
Summary	`int listen(int s, int backlog);`
Return	Success	Failure	Sets `errno`
	0	-1	Yes

Should the listen call fail, it returns -1 and sets errno to one of the values shown in Table 10.10.

Table 10.10. listen Error Messages.

#	Constant	`perror` Message	Explanation
9	EBADF	Bad file descriptor	`s` reference is invalid.
88	ENOTSOCK	Socket operation on non-socket	`s` is a file descriptor, not a socket descriptor.
95	EOPNOTSUPP	Operation not supported	Socket type (such as SOCK_DGRAM) does not support `listen` operation.
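
The restriction of listen to connection-oriented socket types can be observed with a small experiment. The sketch below is illustrative only: try_listen is a hypothetical helper, the backlog of 5 is an arbitrary choice, and binding to port 0 (letting the system pick a free port) is an assumption made to keep the example self-contained.

```cpp
#include <arpa/inet.h>   // htonl, htons
#include <cerrno>        // errno
#include <cstring>       // memset
#include <netinet/in.h>  // struct sockaddr_in, INADDR_ANY
#include <sys/socket.h>  // socket, bind, listen
#include <unistd.h>      // close

// Hypothetical helper: create an Internet domain socket of the given type,
// bind it to any local address on a system-chosen port, and call listen.
// Returns 0 on success or the errno value of the call that failed.
int try_listen(int type) {
    int sd = socket(PF_INET, type, 0);
    if (sd < 0) return errno;
    struct sockaddr_in adr;
    std::memset(&adr, 0, sizeof(adr));
    adr.sin_family = AF_INET;
    adr.sin_addr.s_addr = htonl(INADDR_ANY);
    adr.sin_port = htons(0);              // 0: let the system choose a port
    int result = 0;
    if (bind(sd, reinterpret_cast<struct sockaddr *>(&adr), sizeof(adr)) < 0 ||
        listen(sd, 5) < 0)
        result = errno;                   // saved before close can change it
    close(sd);
    return result;
}
```

On Linux, try_listen(SOCK_STREAM) is expected to succeed, while try_listen(SOCK_DGRAM) reports EOPNOTSUPP, the last entry in Table 10.10.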

At this point, the server process is ready to accept a connection from a client process (which has already established a connection-based socket). By default, the accept call will block, if there are no pending requests for connections. The summary for the accept system call is given in Table 10.11.

The first argument is a socket descriptor that has been previously bound to an address with the bind system call and is currently listening for a connection. If one or more client connections are pending, the first connection in the queue is returned by the accept call. The second argument for accept, *addr, is a pointer to a generic sockaddr structure. This structure is returned to the server once the connection with the client has been established. Its actual format, as in the bind system call, is dependent upon the domain in which the communication will occur. The structure the addr pointer references contains the client's address information. The third argument, *addrlen, initially contains a reference to the length, in bytes, of the previous sockaddr structure. When the call returns, this argument references the size (in bytes) of the returned address. If accept is successful, it returns a new connected socket descriptor with properties similar to the socket specified by the first argument to the accept system call. This new socket can be used for reading and writing. The original socket remains as it was and can, in some settings, still continue to accept additional connections. If the accept call fails, it returns a value of -1 and sets the value of errno to one of the values shown in Table 10.12.

Table 10.11. Summary of the accept System Call.

Include File(s)			Manual Section	2
Summary	int accept( int s, struct sockaddr *addr, socklen_t *addrlen );
Return	Success	Failure	Sets `errno`
	Non-negative integer new socket descriptor value	-1	Yes

Table 10.12. accept Error Messages.

#	Constant	`perror` Message	Explanation
1	EPERM	Operation not permitted	Firewall software prohibits connection.
4	EINTR	Interrupted system call	A signal was received during the `accept` process.
9	EBADF	Bad file descriptor	The socket reference is invalid.
11	EWOULDBLOCK,EAGAIN	Resource temporarily unavailable	The socket is set to non-blocking , and no connections are pending.
12	ENOMEM	Cannot allocate memory	Insufficient memory to perform operation.
14	EFAULT	Bad address	Reference for `addr` is not writeable .
19	ENODEV	No such device	Specified protocol family/type not found in the `netconfig` file.
22	EINVAL	Invalid argument	Invalid argument passed to `accept` call.
24	EMFILE	Too many open files	Process has exceeded the maximum number of files open.
63	ENOSR	Out of streams resources	Insufficient STREAMS resources for specified operation.
71	EPROTO	Protocol error	An error in protocol has occurred.
85	ERESTART	Interrupted system call should be restarted	`accept` call must be restarted.
88	ENOTSOCK	Socket operation on non-socket	The socket is a file descriptor, not a socket descriptor.
93	EPROTONOSUPPORT	Protocol not supported	Invalid protocol specified.
94	ESOCKTNOSUPPORT	Socket type not supported	Invalid socket type specified.
95	EOPNOTSUPP	Operation not supported	`s` is not of type SOCK_STREAM.
103	ECONNABORTED	Software caused connection abort	Connection aborted.
105	ENOBUFS	No buffer space available	Insufficient memory to perform operation.
110	ETIMEDOUT	Connection timed out	Unable to establish connection within specified time limit.

It is interesting to note that the Linux manual pages indicate that when called, accept will also pass on pending network errors as if they were from accept . This behavior is different from straight BSD socket implementations that do not have this quirk.
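
One common way to cope with an interrupted accept is to restart the call when errno is EINTR. The sketch below shows that pattern under stated assumptions: accept_client is a hypothetical helper, and it is written for a UNIX domain listening socket.

```cpp
#include <cerrno>        // errno, EINTR
#include <sys/socket.h>  // accept, socklen_t
#include <sys/un.h>      // struct sockaddr_un

// Hypothetical helper: block until a client connects to the listening
// socket 'listen_sd'. The call is restarted if a signal interrupts it; any
// other failure is returned as -1 with errno left as accept set it.
int accept_client(int listen_sd, struct sockaddr_un *clnt_adr) {
    for (;;) {
        // In: size of the buffer. Out: number of bytes of address stored.
        socklen_t clnt_len = sizeof(*clnt_adr);
        int new_sd = accept(listen_sd,
                            reinterpret_cast<struct sockaddr *>(clnt_adr),
                            &clnt_len);
        if (new_sd >= 0) return new_sd;   // connected socket for this client
        if (errno != EINTR) return -1;    // see Table 10.12
    }
}
```

Resetting clnt_len on every pass matters because accept overwrites it with the size of the returned address.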

In the connection-oriented setting, the client process initiates the connection with the server process with the connect system call. The summary of the connect call is shown in Table 10.13.

The first argument is a valid integer socket descriptor. The second argument, *serv_addr, is handled differently depending upon whether the referenced socket is connection-oriented (type SOCK_STREAM) or connectionless (type SOCK_DGRAM). In the connection-oriented setting, *serv_addr references the address of the socket with which the client wants to communicate (i.e., the serving process's address). For a connectionless socket, *serv_addr references the address to which the datagrams are to be sent. Normally, a stream socket is connected only once, while a datagram socket can be connected several times. Further, if the protocol domain is UNIX, *serv_addr will reference a path/file name, while in the Internet domain (i.e., AF_INET) *serv_addr will reference an Internet address/port number pair. In either case, the reference should be cast to a generic sockaddr structure reference. Clear as mud, right? Hopefully, the section with the client-server examples will help to clarify the details of the connect system call. The third argument, addrlen, conveys the size of the *serv_addr reference.

Table 10.13. Summary of the connect System Call

Include File(s)			Manual Section	2
Summary	int connect(int sockfd, const struct sockaddr *serv_addr, socklen_t addrlen);
Return	Success	Failure	Sets `errno`
	0	-1	Yes

As there are a number of ways in which the connect call can fail, the list of errors that connect can generate is quite extensive . A list of connect errors is found in Table 10.14.

Table 10.14. connect Error Messages.

#	Constant	`perror` Message	Explanation
1	EPERM	Operation not permitted	Attempt to broadcast without having broadcast flag set. Request failed due to firewall.
4	EINTR	Interrupted system call	A signal was received during `connect` process.
9	EBADF	Bad file descriptor	`sockfd` reference is invalid.
11	EAGAIN	Resource temporarily unavailable	No more free local ports.
13	EACCES	Permission denied	Search permission denied for part of path referenced by `*serv_addr` .
14	EFAULT	Bad address	Address referenced by `*serv_addr` is outside the user's address space.
22	EINVAL	Invalid argument	`addrlen` is not correct for address referenced by `*serv_addr`.
63	ENOSR	Out of streams resources	Insufficient STREAMS resources for specified operation.
88	ENOTSOCK	Socket operation on non-socket	`sockfd` is a file descriptor, not a socket descriptor.
91	EPROTOTYPE	Protocol wrong type for socket	Conflicting protocols, `sockfd` versus the `*serv_addr` reference.
97	EAFNOSUPPORT	Address family not supported by protocol family	Address referenced by `*serv_addr` cannot be used with this socket.
98	EADDRINUSE	Address already in use	Local address referenced by `*serv_addr` already in use.
99	EADDRNOTAVAIL	Cannot assign requested address	Address referenced by `*serv_addr` not available on remote system.
101	ENETUNREACH	Network is unreachable	Cannot reach specified system.
106	EISCONN	Transport endpoint is already connected	`sockfd` already connected.
110	ETIMEDOUT	Connection timed out	Could not establish a connection within time limits.
111	ECONNREFUSED	Connection refused	Connect attempt rejected; no process is listening at the address referenced by `*serv_addr`.
114	EALREADY	Operation already in progress	Socket is non-blocking, and a previous connection attempt has not yet completed.
115	EINPROGRESS	Operation now in progress	Socket set as non-blocking, and connection cannot be established immediately.
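
The client side of the UNIX domain case can be sketched as below. The helper name connect_unix is hypothetical; the sketch simply reports failure through errno so the caller can compare it with the entries in Table 10.14.

```cpp
#include <cerrno>        // errno, ENAMETOOLONG
#include <cstring>       // memset, strlen, strcpy
#include <sys/socket.h>  // socket, connect
#include <sys/un.h>      // struct sockaddr_un
#include <unistd.h>      // close

// Hypothetical helper: connect a new UNIX domain stream socket to the
// server bound at 'path'. Returns the connected descriptor, or -1 with
// errno set (for instance ENOENT if the path does not exist).
int connect_unix(const char *path) {
    struct sockaddr_un serv_adr;
    if (std::strlen(path) >= sizeof(serv_adr.sun_path)) {
        errno = ENAMETOOLONG;
        return -1;
    }
    std::memset(&serv_adr, 0, sizeof(serv_adr));
    serv_adr.sun_family = AF_UNIX;
    std::strcpy(serv_adr.sun_path, path);

    int sd = socket(PF_UNIX, SOCK_STREAM, 0);
    if (sd < 0) return -1;
    if (connect(sd, reinterpret_cast<const struct sockaddr *>(&serv_adr),
                sizeof(serv_adr)) < 0) {
        int saved = errno;   // keep connect's error across close
        close(sd);
        errno = saved;
        return -1;
    }
    return sd;
}
```

Closing the descriptor on failure avoids leaking a socket each time a connection attempt is rejected.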

Once the connection between the client and server has been established, they can communicate using standard I/O calls, such as read and write , or one of a number of specialized send/receive type calls covered in Section 10.5. When the processes are finished with the socket descriptor, they issue a standard close , which by default will attempt to send remaining queued data should the protocol for the connection (such as TCP) specify reliable delivery.
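
To see the whole sequence in one place, the sketch below plays both roles inside a single process: it binds and listens on a UNIX domain socket, connects a second socket to it, accepts the connection, and passes one message with write and read. This is only an illustration under assumptions: round_trip is a hypothetical helper, a real client and server would be separate processes, and a robust program would loop on read and write because a stream socket may deliver data in pieces.

```cpp
#include <cstddef>       // std::size_t
#include <cstring>       // memset, strncpy, strlen
#include <string>
#include <sys/socket.h>  // socket, bind, listen, connect, accept
#include <sys/un.h>      // struct sockaddr_un
#include <unistd.h>      // read, write, close, unlink

// Hypothetical helper: send 'message' from a client socket to a server
// socket through the UNIX domain address 'path' and return what the server
// side read (an empty string if any step failed).
std::string round_trip(const char *path, const char *message) {
    struct sockaddr_un adr;
    std::memset(&adr, 0, sizeof(adr));
    adr.sun_family = AF_UNIX;
    std::strncpy(adr.sun_path, path, sizeof(adr.sun_path) - 1);
    struct sockaddr *generic = reinterpret_cast<struct sockaddr *>(&adr);

    std::string received;
    unlink(path);                                     // remove a stale entry
    int listen_sd = socket(PF_UNIX, SOCK_STREAM, 0);  // server side
    int client_sd = socket(PF_UNIX, SOCK_STREAM, 0);  // client side
    int conn_sd = -1;
    if (listen_sd >= 0 && client_sd >= 0 &&
        bind(listen_sd, generic, sizeof(adr)) == 0 &&
        listen(listen_sd, 1) == 0 &&
        connect(client_sd, generic, sizeof(adr)) == 0 &&
        (conn_sd = accept(listen_sd, nullptr, nullptr)) >= 0) {
        // The client writes; the server reads from the accepted socket.
        char buf[256];
        ssize_t sent = write(client_sd, message, std::strlen(message));
        ssize_t got = (sent > 0) ? read(conn_sd, buf, sizeof(buf)) : -1;
        if (got > 0) received.assign(buf, static_cast<std::size_t>(got));
    }
    if (conn_sd >= 0) close(conn_sd);
    if (client_sd >= 0) close(client_sd);
    if (listen_sd >= 0) close(listen_sd);
    unlink(path);                                     // remove the file entry
    return received;
}
```

The connect call can complete before accept is issued because the pending connection waits in the queue created by listen.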

10.4.1 A UNIX Domain Stream Socket Example

In the following example, programs 10.2 and 10.3, we create a server process and a client process that each use a UNIX domain, connection-oriented (SOCK_STREAM) socket for communication. The server will create the socket, bind it to an address, generate a wait queue, accept a connection, and when data is available, read from the socket and display the results to the screen. The client process will create a socket, connect to the server, and obtain from the user 10 expressions, each of which it writes to the socket. The server reads the data passed (the expression) and processes the expression by passing it, via a pipe, to the bc utility for evaluation. As the programs are written, the output of the bc utility is read by the server and displayed, together with the original expression, on the screen.

Program 10.2 UNIX domain connection-oriented server.

File : p10.2.cxx
/*
   Server - UNIX domain, connection-oriented
*/
#define _GNU_SOURCE
#include <iostream>
#include <cstdio>                      // perror, popen, pclose, snprintf
#include <cstring>                     // strcpy, strlen, memset
#include <unistd.h>                    // read, close, unlink
#include <sys/socket.h>
#include <sys/un.h>                    // UNIX protocol
using namespace std;

const char *NAME = "./my_sock";
const int   MAX  = 1024;
void clean_up( int, const char * );    // Close socket and remove
int
main( ) {
  socklen_t clnt_len;                  // Length of client address
  int       orig_sock,                 // Original socket descriptor
            new_sock;                  // New socket descriptor from accept
  static struct sockaddr_un            // UNIX addresses to be used
            clnt_adr,                  // Client address
            serv_adr;                  // Server address
  static char clnt_buf[MAX],           // Message from client
              pipe_buf[MAX];           // Output from bc command
  FILE *fin;                           // File for pipe I/O
                                       // Generate socket
  if ((orig_sock = socket(PF_UNIX, SOCK_STREAM, 0)) < 0) {
    perror("generate error");
    return 1;
  }                                    // Assign address information
  serv_adr.sun_family = AF_UNIX;
  strcpy(serv_adr.sun_path, NAME);
  unlink(NAME);                        // Remove old copy if present
                                       // BIND the address
  if (bind( orig_sock, (struct sockaddr *) &serv_adr,
      sizeof(serv_adr.sun_family)+strlen(serv_adr.sun_path)) < 0) {
    perror("bind error");
    clean_up(orig_sock, NAME);
    return 2;
  }
  listen(orig_sock, 1);                // LISTEN for connections
  clnt_len = sizeof(clnt_adr);         // ACCEPT connection
  if ((new_sock = accept( orig_sock, (struct sockaddr *) &clnt_adr,
       &clnt_len)) < 0) {              // <-- 1
    perror("accept error");
    clean_up(orig_sock, NAME);
    return 3;
  }
                                       // Process 10 requests
  for (int i = 0; i < 10; i++) {
    memset(clnt_buf, 0x0, MAX);        // Clear client buffer
    read(new_sock, clnt_buf, sizeof(clnt_buf));
                                       // Build command for bc
    memset(pipe_buf, 0x0, MAX);
    snprintf(pipe_buf, MAX, "echo '%s' | bc\n", clnt_buf);
    fin = popen( pipe_buf, "r" );
    if (fin == NULL) {                 // Could not start the pipeline
      perror("popen error");
      break;
    }
    memset(pipe_buf, 0x0, MAX);
    read(fileno(fin), pipe_buf, MAX - 1);  // Leave room for terminator
    pclose(fin);                       // Release the pipe
    cout << clnt_buf << " = " << pipe_buf << endl;
  }
  close(new_sock);
  clean_up(orig_sock, NAME);
  return 0;
}
void
clean_up( int sd, const char *the_file ){
  close( sd );                         // Close socket
  unlink( the_file );                  // Remove it
}

(1) When a connection is accepted, a new socket is generated, similar in form to the original socket.

Notice the call to bind in the server program (program p10.2.cxx , lines 36 and 37). As written, the third argument, which is the length of the address structure, is an expression. The expression calculates the total size by adding the size of the sun_family member of the address structure to the string length of the sun_path member. If we just applied the sizeof operator to the whole address structure, on most platforms the value returned would be 110 (say, 2 bytes for the sun_family member plus the 108 bytes for the sun_path member).

The client program is shown in Program 10.3.

Program 10.3 UNIX domain connection-oriented client.

File : p10.3.cxx
/*
   Client - UNIX domain, connection-oriented
*/

#define _GNU_SOURCE
#include <iostream>
#include <cstdio>                      // perror
#include <cstring>                     // strcpy, strlen, memset
#include <unistd.h>                    // write, close
#include <sys/socket.h>
#include <sys/un.h>                    // UNIX protocol
using namespace std;

const char *NAME = "./my_sock";
const int   MAX  = 1024;
int
main( ) {
  int orig_sock;                       // Original socket descriptor
  static struct sockaddr_un
         serv_adr;                     // UNIX address of the server process
  static char buf[MAX];                // Buffer for messages
                                       // Generate the SOCKET
  if ((orig_sock = socket(PF_UNIX, SOCK_STREAM, 0)) < 0) {
    perror("generate error");
    return 1;
  }
  serv_adr.sun_family = AF_UNIX;
  strcpy(serv_adr.sun_path, NAME);
                                       // CONNECT
  if (connect( orig_sock, (struct sockaddr *) &serv_adr,
      sizeof(serv_adr.sun_family)+strlen(serv_adr.sun_path)) < 0) {
    perror("connect error");
    close(orig_sock);
    return 2;
  }
                                       // Prompt for expressions
  cout << "Enter an expression and press enter to process." << endl;
  for (int i = 0; i < 10; i++) {
    memset(buf, 0x0, MAX);
    cin.getline(buf, MAX-1, '\n');
    write(orig_sock, buf, sizeof(buf));
  }
  close(orig_sock);
  return 0;
}

We run the client-server pair by placing the server process in the background. We then run the client process in the foreground. The compilation sequence and some sample output generated by the client-server programs are shown in Figure 10.8.

Figure 10.8 UNIX domain client-server program compilation and run.

linux$ g++ p10.2.cxx -o server                    <-- 1
linux$ g++ p10.3.cxx -o client
linux$ ./server &                                 <-- 2
[1] 4739
linux$ ls -l my_sock                              <-- 3
srwxr-xr-x 1 gray faculty 0 May 9 15:35 my_sock
linux$ ./client                                   <-- 4
Enter an expression and press enter to process.
78 * 92
78 * 92 = 7176

89 % 6 + 34 - 2 * -9
89 % 6 + 34 - 2 * -9 = 57

1 && 0 || 1
1 && 0 || 1 = 1

!( 1 && 1 || 0 )
!( 1 && 1 || 0 ) = 0

. . .

(1) Compile each program into an executable.

(2) Place server in background.

(3) Check for presence of the socket.

(4) Run client in the foreground.

On the command line, the presence of the socket can also be confirmed by using the netstat command. This command, which has numerous options, can be used to display information about socket-based communications. Figure 10.9 shows part of the output of netstat on a local system after the UNIX domain server program has been placed in the background.

Figure 10.9 Sample output from the netstat command.

linux$ netstat -x -a
Active UNIX domain sockets (servers and established)
Proto RefCnt Flags Type State I-Node Path
unix 13 [ ] DGRAM 1384 /dev/log
unix 2 [ ACC ] STREAM LISTENING 1681 /var/lib/mysql/mysql.sock
unix 2 [ ACC ] STREAM LISTENING 2202 /tmp/.font-unix/fs7100
unix 2 [ ACC ] STREAM LISTENING 5914 /opt/ARCserve/data/ds_callback
unix 2 [ ACC ] STREAM LISTENING 65439 ./my_sock

. . .

EXERCISE

If we place the server process in the background and open, say, three windows on the same host and attempt to run multiple client processes, we find that one client will work correctly but the other clients will not. Rewrite the server program ( p10.2.cxx ) so that it will accept and process multiple client connections (each in their own window) correctly. Hint : Should the server fork a child process to handle each connection?

10.4.2 An Internet Domain Stream Socket Example

In the Internet domain, processes must have address and port information to communicate. An application may know the name of a host (such as linux , kahuna , or morpheus ) with which it wants to communicate but lack specifics about the host's fully qualified name, Internet address, services offered (on which ports), and other information. There are a number of network information calls that can be used to return this information.

The gethostbyname call will return information about a specific host when passed its name. Table 10.15 presents a summary of the gethostbyname call.

The gethostbyname call takes a single character string reference that contains the name of the host. The call queries the local network database [10] to obtain information about the indicated host. If the host name is found, the call returns a reference to a hostent structure. The hostent structure is defined in the include file <netdb.h> as

[10] Information may come from any of the sources for services specified in the /etc/nsswitch.conf file (see nsswitch.conf in Section 5 of the manual pages for details).

Table 10.15. Summary of the gethostbyname Library Function.

Include File(s)	#include <netdb.h> #include <sys/socket.h> extern int h_errno;		Manual Section	3
Summary	`struct hostent *gethostbyname(const char *name);`
Return	Success	Failure	Sets `errno`
	Reference to a `hostent` structure	NULL	NO, sets `h_errno`

struct hostent {
 char *h_name; /* Official name of host. */
 char **h_aliases; /* Alias list. */
 int h_addrtype; /* Host address type. */
 int h_length; /* Length of address. */
 char **h_addr_list; /* List of addresses from name server. */
 #define h_addr h_addr_list[0] /* Address, for backward compatibility. */
};

If the host name is not found, the gethostbyname call returns a NULL. Should the call encounter an error situation, it sets a global variable called h_errno (not errno) to indicate the error. The values h_errno can take and the associated defined constants (found in the include file <netdb.h>) are shown in Table 10.16. An obsolete error messaging function called herror (similar in spirit to perror) can be called to generate an error message.

Table 10.16. gethostbyname Error Messages.

#	Constant	Explanation
	NETDB_SUCCESS	No problem.
1	HOST_NOT_FOUND	Authoritative answer not found/no such host.
2	TRY_AGAIN	Nonauthoritative host not found or SERVERFAIL.
3	NO_RECOVERY	Nonrecoverable error.
4	NO_DATA	Valid name but no data record of requested type.

In some development environments the object code for the gethostbyname network call resides in the libnsl.a archive. In these settings, when using this call, the switch -lnsl must be added to the compilation line. Program 10.4 uses the gethostbyname call to obtain information about a host.

Program 10.4 Obtaining host information with gethostbyname.

File : p10.4.cxx
/*
   Checking host entries
*/
#include <iostream>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>                     // gethostbyname, herror, h_errno
#include <arpa/inet.h>                 // for inet_ntoa
#include <netinet/in.h>
#include <cstring>                     // for memcpy
using namespace std;
int
main( ) {
  struct hostent *host;
  static char who[60];
  cout << "Enter host name to look up: ";
  cin.width(sizeof(who));              // Read at most 59 characters
  cin >> who;
  host = gethostbyname( who );
  if ( host != (struct hostent *) NULL ) {
    cout << "Here is what I found about " << who << endl;
    cout << "Official name : " << host->h_name << endl;
    cout << "Aliases       : ";
    while ( *host->h_aliases ) {       // List ends with a NULL entry
      cout << *host->h_aliases << " ";
      ++host->h_aliases;
    }
    cout << endl;
    cout << "Address type  : " << host->h_addrtype << endl;
    cout << "Address length: " << host->h_length << endl;
    cout << "Address list  : ";
    struct in_addr in;
    while ( *host->h_addr_list ) {     // List ends with a NULL entry
      memcpy( &in.s_addr, *host->h_addr_list, sizeof (in.s_addr));
                                       // Raw bytes first (may be unprintable)
      cout << "[" << *host->h_addr_list << "] = "
           << inet_ntoa(in) << " ";
      ++host->h_addr_list;
    }
    cout << endl;
  } else
    herror(who);                       // Message is based on h_errno
  return 0;
}

In Program 10.4, the gethostbyname call is used to obtain network database information about a host. When the program is run, the user is prompted for the name of a host (as written, the name can be at most 59 characters). If the gethostbyname call is successful, the official database entry name of the host is displayed. This is followed by a list of aliases (alternate names). The address type and length are displayed next. In an Internet domain setting, we can expect these values to be 2 (the value of AF_INET) and 4 (the number of bytes in an IPv4 address). The last part of the program displays the Internet address of the host. It uses an additional Internet address manipulation call, inet_ntoa, to translate the raw, byte-encoded network address referenced by the h_addr_list member into the more standard dotted notation. The manual page on inet_ntoa provides a good explanation of how the argument to the call is translated. A run of Program 10.4 is shown in Figure 10.10.

Figure 10.10 A run of Program 10.4.

linux$ p10.4
Enter host name to look up: www-cs
Here is what I found about www-cs
Official name : zeus.hartford.edu
Aliases : www-cs.hartford.edu
Address type : 2
Address length: 4
Address list : [14] = 137.49.52.2          <-- 1

(1) The address list with and without inet_ntoa translation.
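
The [14] in the run above is what the raw address looks like when written as characters: the entry holds the four bytes 137, 49, 52, and 2, and only 49 ('1') and 52 ('4') are printable. A minimal sketch of the translation follows. It assumes inet_ntop, a relative of inet_ntoa that writes into a caller-supplied buffer, and a hypothetical helper name, dotted.

```cpp
#include <arpa/inet.h>    // inet_ntop
#include <netinet/in.h>   // struct in_addr, INET_ADDRSTRLEN
#include <sys/socket.h>   // AF_INET
#include <cstring>        // memcpy
#include <string>

// Hypothetical helper: turn the 4 raw bytes of an IPv4 address (as found in
// an h_addr_list entry) into dotted notation.
std::string dotted(const char *raw) {
  struct in_addr in;
  std::memcpy(&in.s_addr, raw, sizeof(in.s_addr));  // bytes are already in network order
  char text[INET_ADDRSTRLEN];
  if (inet_ntop(AF_INET, &in, text, sizeof(text)) == nullptr)
    return "";                                      // conversion failed
  return text;
}
```

With a successful lookup in hand, dotted(host->h_addr_list[0]) would yield the same string that inet_ntoa produces in Program 10.4.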

EXERCISE

There is a call similar to gethostbyname that returns host entry information when passed the dotted Internet address of the host. Write a program based on Program 10.4 that requests the Internet address of a host. Then use the gethostbyaddr call to display the host's information.

In addition to knowing the server's 32-bit Internet address, the client must also be able to make reference to a particular service at a given port on the server. As noted previously, there are some TCP- and UDP-based well-known ports that have standard services, such as echo, associated with them. The ports with numbers less than 1024 are reserved for processes with an effective ID of root. Ports 1024 and above are considered ephemeral, and may be used by any system user. Some further subdivide this upper range of ports into registered (1024-49151) and dynamic (49152 and greater) ports. An application can issue the getservbyname call (see Table 10.17) to obtain information about a particular service or port.
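
The port ranges just described can be captured in a few lines. This is a sketch only; the function name port_range and the labels it returns are illustrative choices, not part of any library.

```cpp
#include <string>

// Hypothetical helper: name the range a port number falls in, using the
// boundaries given in the text (reserved below 1024, registered 1024-49151,
// dynamic 49152 and above).
std::string port_range(unsigned int port) {
  if (port > 65535)  return "invalid";      // ports are 16-bit values
  if (port < 1024)   return "reserved";
  if (port <= 49151) return "registered";
  return "dynamic";
}
```

For example, the discard service on port 9 falls in the reserved range, while a port such as 2002 falls in the registered range and can be bound by an ordinary user.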

Table 10.17. Summary of the getservbyname Library Function

Include File(s)	#include <netdb.h>		Manual Section	3
Summary	struct servent *getservbyname(const char *name, const char *proto);
Return	Success	Failure	Sets `errno`
	Reference to a `servent` structure	NULL

The getservbyname call is passed the name of the service and the protocol (e.g., tcp). If successful, it returns a reference to a servent structure. The servent structure is defined in <netdb.h> as:

struct servent {
 char *s_name; /* official service name */
 char **s_aliases; /* alias list */
 int s_port; /* port number */
 char *s_proto; /* protocol to use */
};

If the call fails, it returns a NULL value. Program 10.5 uses the getservbyname library function to return information about a selected service type for a given protocol.

Program 10.5 Obtaining service information on a host using getservbyname.

File : p10.5.cxx
 /*
    Checking service -- port entries for a host
 */
 #include <iostream>
 #include <netdb.h>
 #include <arpa/inet.h>             // for ntohs
 using namespace std;
 int
 main( ) {
   struct servent *serv;
   static char protocol[10], service[10];        // <-- 1
   cout << "Enter service to look up : ";
   cin.width( sizeof(service) );                 // stay within the buffer
   cin >> service;
   cout << "Enter protocol to look up: ";
   cin.width( sizeof(protocol) );
   cin >> protocol;
   serv = getservbyname( service, protocol );
   if ( serv != (struct servent *)NULL ) {
     cout << "Here is what I found " << endl;
     cout << "Official name : " << serv->s_name << endl;
     cout << "Aliases : ";
     while ( *serv->s_aliases ) {                // list ends with a NULL pointer
       cout << *serv->s_aliases << " ";
       ++serv->s_aliases;
     }
     cout << endl;
     cout << "Port number : " << ntohs(serv->s_port) << endl;
     cout << "Protocol Family: " << serv->s_proto << endl;
   } else
     cout << "Service " << service << " for protocol "
          << protocol << " not found." << endl;
   return 0;
 }

(1) Arbitrary buffer sizes.

Before the port number is displayed, it is passed to the ntohs function. This is one of a group of functions used to ensure byte ordering is maintained when converting 16- and 32-bit integer values that represent host and network addresses. The summary for ntohs is shown in Table 10.18.

Table 10.18. Summary of the ntohs Library Function.

Include File(s)	#include <arpa/inet.h>		Manual Section	3
Summary	unsigned short int ntohs(unsigned short int netshort);
Return	Success	Failure	Sets `errno`
	The argument in proper byte order for the host.

The inverse of the ntohs call is htons (notice the switch of the letters h and n). The letter s indicates the argument is a short (16-bit) integer, as is the returned value. There are two similar routines, ntohl and htonl, that accept and return long (32-bit) integers. If byte reordering is not necessary for a given platform, these calls act as a no-op.
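
The effect of these calls is easiest to see by looking at the bytes in memory. The sketch below uses a hypothetical helper, network_bytes, to expose the two bytes that htons produces; in network order the most significant byte always comes first, whatever the host's native ordering.

```cpp
#include <arpa/inet.h>   // htons, ntohs, htonl, ntohl
#include <cstdint>
#include <cstring>       // memcpy

// Hypothetical helper: copy out the two bytes of a 16-bit value as they sit
// in memory after conversion to network byte order.
void network_bytes(uint16_t host_value, unsigned char out[2]) {
  uint16_t net = htons(host_value);
  std::memcpy(out, &net, sizeof(net));
}
```

For the value 2002 (hexadecimal 07D2) the bytes are 0x07 followed by 0xD2. Passing the converted value back through ntohs recovers 2002 on any platform.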

A sample run of Program 10.5 and a copy of the corresponding /etc/services entry are shown in Figure 10.11.

Figure 10.11 A run of Program 10.5.

linux$ p10.5
Enter service to look up : discard
Enter protocol to look up: tcp
Here is what I found
Official name : discard
Aliases : sink null
Port number : 9
Protocol Family: tcp

linux$ grep discard /etc/services          <-- 1

discard 9/tcp sink null
discard 9/udp sink null

(1) Verify the information.

EXERCISE

The manual page for getservbyname includes a description of a network function called getservent . The getservent call can be used to enumerate all the services on a host. Write a program that requests the protocol type and uses the getservent network call to display all the services on the host that use the indicated protocol. Be sure to call setservent prior to issuing the getservent call.

We now have most of the basic tools to write a client-server application that uses Internet protocol with a connection-oriented socket. In this next example, the server process receives messages from the client process. As each message is received, the server changes the case of the message and returns it to the client. Communication terminates when the client sends a string that has a dot (.) in column one. For each connection initiated by a client, the server process will fork a child process that runs concurrently and carries on communications.

All of the remaining client-server type socket examples share a common header file called local_sock.h. The content of the local_sock.h file is shown in Figure 10.12.

Figure 10.12 The local_sock.h include file for all socket example programs.

File : local_sock.h
 /*
    Local include file for socket programs
 */
 #ifndef LOCAL_SOCK_H
 #define LOCAL_SOCK_H
 #define _GNU_SOURCE
 #include <iostream>
 #include <cstdio>
 #include <cstring>
 #include <cctype>
 #include <csignal>
 #include <unistd.h>
 #include <sys/types.h>
 #include <sys/wait.h>
 #include <sys/socket.h>
 #include <sys/un.h>
 #include <netinet/in.h>
 #include <arpa/inet.h>
 #include <netdb.h>
 const int PORT=2002;                  // Arbitrary port programmer chooses
 static char buf[BUFSIZ];              // Buffer for messages
 const char *SERVER_FILE="server_socket";
 #endif
 using namespace std;

The local_sock.h file contains references to the include files needed by both the server and client programs. The defined constant PORT is an arbitrary integer port number that we will use with this application. The value for the port should be one that is currently not in use and is greater than or equal to 1024. An alternate approach is to add an entry for the port in the /etc/services file. If the port is in the /etc/services file, the port information could then be obtained dynamically with the getservbyname network call. However, most users do not have the required root access to add an entry. The character array buf is used as a temporary storage location for characters.
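
The alternate approach mentioned above, looking the port up and falling back to a fixed value when no entry exists, might be sketched as follows. The helper name service_port is hypothetical, and the sketch assumes the fallback is supplied in host byte order, as PORT is.

```cpp
#include <arpa/inet.h>   // htons
#include <netdb.h>       // getservbyname, struct servent
#include <cstdint>

// Hypothetical helper: look the service up in the services database and
// return its port in network byte order; fall back to a fixed port
// (given in host byte order) when no entry exists.
uint16_t service_port(const char *service, const char *proto, uint16_t fallback) {
  struct servent *serv = getservbyname(service, proto);
  if (serv != nullptr)
    return static_cast<uint16_t>(serv->s_port);   // already in network order
  return htons(fallback);
}
```

A server could then assign serv_adr.sin_port = service_port("upcase", "tcp", PORT), where "upcase" is a made-up service name that an administrator would have to add to /etc/services. Note that the result is already in network byte order and must not be passed through htons a second time.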

The server program, Program 10.6, is presented first.

Program 10.6 The Internet domain, connection-oriented server.

File : p10.6.cxx
     /*
        Internet domain, connection-oriented SERVER
     */
     #include "local_sock.h"
 +   void signal_catcher(int);
     int
     main( ) {
       int orig_sock,                  // Original socket in server
           new_sock;                   // New socket from accept
 10    socklen_t clnt_len;             // Length of client address
       struct sockaddr_in              // Internet addr client & server
           clnt_adr, serv_adr;
       int len, i;                     // Misc counters, etc.
                                       // Catch when child terminates
 +     if (signal(SIGCHLD , signal_catcher) == SIG_ERR) {
         perror("SIGCHLD");
         return 1;
       }
       if ((orig_sock = socket(PF_INET, SOCK_STREAM, 0)) < 0) {
 20      perror("generate error");
         return 2;
       }
       memset( &serv_adr, 0, sizeof(serv_adr) );      // Clear structure
       serv_adr.sin_family = AF_INET;                 // Set address type
 +     serv_adr.sin_addr.s_addr = htonl(INADDR_ANY);  // Any interface
       serv_adr.sin_port = htons(PORT);               // Use our fake port
                                                      // BIND
       if (bind( orig_sock, (struct sockaddr *) &serv_adr,
                 sizeof(serv_adr)) < 0){
 30      perror("bind error");
         close(orig_sock);
         return 3;
       }
       if (listen(orig_sock, 5) < 0 ) {               // LISTEN
 +       perror("listen error");
         close (orig_sock);
         return 4;
       }
       do {
 40      clnt_len = sizeof(clnt_adr);                 // ACCEPT a connect
         if ((new_sock = accept( orig_sock, (struct sockaddr *) &clnt_adr,
                                 &clnt_len)) < 0) {
           perror("accept error");
           close(orig_sock);
 +         return 5;
         }
         if ( fork( ) == 0 ) {                        // Generate a CHILD
           while ( (len=read(new_sock, buf, BUFSIZ)) > 0 ){
             for (i=0; i < len; ++i)                  // Change the case
 50            buf[i] = toupper(buf[i]);
             write(new_sock, buf, len);               // Write back to socket
             if ( buf[0] == '.' ) break;              // Are we done yet?
           }
           close(new_sock);                           // In CHILD process
 +         return 0;
         } else
           close(new_sock);                           // In PARENT process
       } while( true );                               // FOREVER
       return 0;
 60  }
     void
     signal_catcher(int the_sig){
       signal(the_sig, signal_catcher);               // reset
       wait(0);                                       // keep the zombies at bay
 +   }

The server program contains a few new bells and whistles that were not in our previous examples. The server process forks a child process to handle each connection. When the child process ends, the operating system will want to return the exit status of the child to its parent. Normally, the parent process waits for the child. As multiple connections (producing multiple child processes) are possible, we do not want the parent (the server process) to block, which is the default for wait, for a given child process. To resolve this, we associate the receipt of a SIGCHLD signal (child process has terminated) with a signal-catching routine (see lines 15 to 18). When invoked, the signal-catching routine performs the wait. This arrangement prevents a terminated child process from lingering as a zombie while it waits for the parent process to retrieve its exit status. Also, notice the use of the memset library function in line 23 to clear the address structure before its contents are assigned. [11] When assigning the address member of the server structure, the address is first passed to htonl. In this example, the server passes the defined constant INADDR_ANY, found in the header file <netinet/in.h>, to htonl. This constant, which is mapped to the value 0, indicates that the server will accept connections arriving on any of the host's network interfaces. The client program is shown in Program 10.7.

[11] An alternate approach is to use bzero, a strictly BSD string function, to fill the location with null bytes. However, bzero is a deprecated function and should not be used if portability is a concern. If bzero is used, the file <strings.h> should be included.
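
One caveat about the signal-catching approach: standard signals are not queued, so several children that terminate close together may deliver only a single SIGCHLD, and a handler that waits once can leave zombies behind. A common variation, sketched here under the assumption that sigaction and waitpid are available, reaps every finished child on each invocation. The names reap_children, install_reaper, and reaped are hypothetical; the counter exists only so the effect can be observed.

```cpp
#include <cerrno>
#include <csignal>
#include <cstring>       // memset
#include <sys/types.h>
#include <sys/wait.h>    // waitpid, WNOHANG

volatile sig_atomic_t reaped = 0;   // Hypothetical counter, for illustration only

// Collect every child that has already terminated, without blocking.
void reap_children(int) {
  int saved = errno;                          // waitpid may overwrite errno
  while (waitpid(-1, nullptr, WNOHANG) > 0)
    reaped = reaped + 1;
  errno = saved;
}

// Install the handler. SA_RESTART asks the system to resume an interrupted
// call such as accept rather than fail it with EINTR.
bool install_reaper() {
  struct sigaction sa;
  std::memset(&sa, 0, sizeof(sa));
  sigemptyset(&sa.sa_mask);
  sa.sa_flags = SA_RESTART;
  sa.sa_handler = reap_children;
  return sigaction(SIGCHLD, &sa, nullptr) == 0;
}
```

With this arrangement the handler does not need to reinstall itself, and the parent still never blocks waiting for any particular child.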

Program 10.7 The Internet domain, connection-oriented client.

File : p10.7.cxx
 /*
    Internet domain, connection-oriented CLIENT
 */
 #include "local_sock.h"
 int
 main( int argc, char *argv[] ) {
   int orig_sock,                  // Original socket in client
       len;                        // Misc. counter
   struct sockaddr_in
       serv_adr;                   // Internet addr of server
   struct hostent *host;           // The host (server) info
   if ( argc != 2 ) {              // Check cmd line for host name
     cerr << "usage: " << argv[0] << " server" << endl;
     return 1;
   }
   host = gethostbyname(argv[1]);  // Obtain host (server) info
   if (host == (struct hostent *) NULL ) {
     herror("gethostbyname");      // h_errno, not errno, holds the reason
     return 2;
   }
   memset(&serv_adr, 0, sizeof( serv_adr));   // Clear structure
   serv_adr.sin_family = AF_INET;             // Set address type
   memcpy(&serv_adr.sin_addr, host->h_addr, host->h_length);
   serv_adr.sin_port = htons( PORT );         // Use our fake port
                                              // SOCKET
   if ((orig_sock = socket(PF_INET, SOCK_STREAM, 0)) < 0) {
     perror("generate error");
     return 3;
   }                                          // CONNECT
   if (connect( orig_sock,(struct sockaddr *)&serv_adr,
                sizeof(serv_adr)) < 0) {
     perror("connect error");
     return 4;
   }
   do {                                       // Process
     write(fileno(stdout),"> ", 2);           // Prompt is 2 characters
     if ((len=read(fileno(stdin), buf, BUFSIZ)) > 0) {
       write(orig_sock, buf, len);
       if ((len=read(orig_sock, buf, len)) > 0 )
         write(fileno(stdout), buf, len);
     } else
       break;                                 // End of file or read error
   } while( buf[0] != '.' );                  // until end of input
   close(orig_sock);
   return 0;
 }

The client program expects the name of a server (host) to be passed on the command line. The gethostbyname network call is used to obtain specific host-addressing information. The returned information, stored in a hostent structure, is referenced by host. This information is used in part to fill the server Internet address information stored in the serv_adr structure. Before its members are assigned, the serv_adr structure is cleared using the memset library function. The address family is set to AF_INET. The memcpy library function is used to copy the obtained host address to the server address member. The memcpy function is used, as it will copy a specified number of bytes even if the referenced locations contain nonstandard strings (i.e., contain null bytes or do not end in a null byte). The assignment of the port number is similar to what was done in the server.

Next, a socket is created and a connection to the server process established. The client process then enters a loop. In the loop it requests user input with a > prompt. The user's input is read from the device mapped to standard input (most likely the keyboard). This input is then written to the socket where the server will read and process it (i.e., capitalize the string). The client process obtains the processed string by reading from the socket descriptor. The contents of this string (stored in the buf array) are written to the device mapped to standard output (usually the screen). The client continues to loop until a string that begins with a "." is entered.
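
The exchange can be tried on a single machine without a network. The sketch below is an illustration only: a hypothetical helper, upcase_service, repeats the read-convert-write loop of the server's child process, and socketpair stands in for the connected client and server sockets. A second hypothetical helper, round_trip, plays the client for one line.

```cpp
#include <cctype>        // toupper
#include <string>
#include <sys/socket.h>  // socketpair
#include <sys/types.h>
#include <sys/wait.h>    // waitpid
#include <unistd.h>      // fork, read, write, close, _exit

// Hypothetical helper mirroring the child loop of Program 10.6.
void upcase_service(int sock) {
  char buf[512];
  ssize_t len;
  while ((len = read(sock, buf, sizeof(buf))) > 0) {
    for (ssize_t i = 0; i < len; ++i)
      buf[i] = static_cast<char>(toupper(static_cast<unsigned char>(buf[i])));
    if (write(sock, buf, len) != len) break;   // give up on a short or failed write
    if (buf[0] == '.') break;                  // client asked to stop
  }
  close(sock);
}

// Hypothetical helper: send one line through a forked "server" and
// return the reply (empty on any failure).
std::string round_trip(const std::string &line) {
  if (line.empty()) return "";
  int sv[2];
  if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) return "";
  pid_t pid = fork();
  if (pid < 0) { close(sv[0]); close(sv[1]); return ""; }
  if (pid == 0) {                  // child plays the server
    close(sv[0]);
    upcase_service(sv[1]);
    _exit(0);
  }
  close(sv[1]);                    // parent plays the client
  std::string reply;
  char buf[512];
  if (write(sv[0], line.data(), line.size()) == static_cast<ssize_t>(line.size())) {
    ssize_t len = read(sv[0], buf, sizeof(buf));
    if (len > 0) reply.assign(buf, len);
  }
  close(sv[0]);                    // end of file makes the child's read return 0
  waitpid(pid, nullptr, 0);
  return reply;
}
```

The sketch reads the reply with a single read, which is adequate for short lines; a stream socket is free to deliver data in smaller pieces, so a production client would loop until the full reply has arrived.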

A sample run of the Internet domain, connection-oriented client-server application is shown in Figure 10.13.

Figure 10.13 A run of the Internet domain, connection-oriented client-server application.

linux$ ps                                  <-- 1
  PID TTY          TIME CMD
21604 pts/0    00:00:00 csh
23026 pts/0    00:00:00 ps

linux$ ./server &                          <-- 2
[1] 23028

linux$ ./server &                          <-- 3
bind error: Address already in use
[2] 23029
[2] - Exit 2 ./server

linux$ telnet medusa                       <-- 4
. . .
medusa$ ./client linux                     <-- 5
> this is a test of the system
THIS IS A TEST OF THE SYSTEM
> .
.
medusa$ ps
  PID TTY          TIME CMD
23095 pts/0    00:00:00 csh
23387 pts/0    00:00:00 ps

(1) We check the system for the server process; none is present.

(2) The server is placed in the background.

(3) The server has already bound the address.

(4) Run a terminal session on another local host.

(5) Run the client; pass the name of the host ( linux ) that is running the server process.

In this sequence, the user has logged onto the host linux and issued the ps command. The output of the command verifies that no server process is present. The server process is then invoked and explicitly placed in the background with the &. When the server is invoked a second time, the error message bind error: Address already in use is displayed. This message is generated by the call to bind because the previous invocation of the server program has already bound the port. The user then runs telnet to log onto another host on the network (medusa) and changes to the directory where the client-server application resides. The client program is invoked and passed the name of the host running the server program (in this example, linux). When the prompt appears, a line of text is entered. The client process passes the text to the server. The server, running on the host linux, processes the line of text and returns it to the client on host medusa. The client process displays the line (the initial line, which now is in all capitals). The application terminates with the entry of a line starting with a single period. A follow-up call to ps indicates the client process is gone.

EXERCISE

Modify Programs 10.6 and 10.7 to play a remote game of tic-tac-toe. The client (the user) will play against the server (the computer). The client (who goes first) requests that the user enter a valid location. The location is stored in the representation of the board. The board is then passed to the server. The server then generates a valid move [it tries first to win; if it cannot win, it blocks; if no block is required, it moves randomly]. The server's move is stored in the board, the board is returned to the client, and so on. The client is responsible for validating a user's requested move, displaying the board, and determining a win, loss, or tie. The server should have separate routines to generate a winning, blocking, or random move. The server should always generate a valid move. The server should create a separate process for each connected client. Note: As a starting point, a PC platform executable version (authored by a former student, Mark Cormier) of this exercise can be found with the files for this chapter. The files are called toe_server and toe_client.

EXERCISE

Further modify the tic-tac-toe game to allow two users to play against one another by connecting to a separate tic-tac-toe arbitrator process (server). Offload some of the common functions, such as checking for a win, loss, or tie, who goes first, and whose turn is next, to the arbitrator process.
