2.3 Limitations of the Socket API
The native Socket API has several limitations: it's error-prone, overly complex, and nonportable/nonuniform. Although the following discussion focuses on the Socket API, the critique also applies to other native OS IPC APIs.
2.3.1 Error-Prone APIs
As outlined in Section 2.2, the Socket API uses handles to identify socket endpoints. In general, operating systems also use handles to identify other I/O devices, such as files, pipes, and terminals. These handles are implemented as weakly typed integer or pointer types, which allows subtle errors to occur at run time. To illustrate these and other problems that can occur, consider the following echo_server() function:
 0 // This example contains bugs! Do not copy this example!
 1 #include <sys/types.h>
 2 #include <sys/socket.h>
 3
 4 const int PORT_NUM = 10000;
 5
 6 int echo_server ()
 7 {
 8   struct sockaddr_in addr;
 9   int addr_len;
10   char buf[BUFSIZ];
11   int n_handle;
12   // Create the local endpoint.
13   int s_handle = socket (PF_UNIX, SOCK_DGRAM, 0);
14   if (s_handle == -1) return -1;
15
16   // Set up the address information where the server listens.
17   addr.sin_family = AF_INET;
18   addr.sin_port = PORT_NUM;
19   addr.sin_addr.s_addr = INADDR_ANY;
20
21   if (bind (s_handle, (struct sockaddr *) &addr,
22             sizeof addr) == -1)
23     return -1;
24
25   // Create a new communication endpoint.
26   if (n_handle = accept (s_handle, (struct sockaddr *) &addr,
27                          &addr_len) != -1) {
28     int n;
29     while ((n = read (s_handle, buf, sizeof buf)) > 0)
30       write (n_handle, buf, n);
31
32     close (n_handle);
33   }
34   return 0;
35 }
This function contains at least 10 subtle and all-too-common bugs that occur when using the Socket API. See if you can locate all 10 bugs while reading the code above, then read our dissection of these flaws below. The numbers in parentheses are the line numbers where errors occur in the echo_server() function.
- Forgot to initialize an important variable. (8-9) The addr_len variable must be set to sizeof (addr). Forgetting to initialize this variable will cause the accept() call on line 26 to fail at run time.
- Use of nonportable handle datatype. (11-14) Although these lines look harmless enough, they are also fraught with peril. This code isn't portable to Windows Sockets (WinSock) platforms, where socket handles are type SOCKET, not type int. Moreover, WinSock failures are indicated via a nonstandard macro called INVALID_SOCKET rather than by returning -1. The other bugs in the code fragment above aren't obvious until we examine the rest of the function.
The next three network addressing errors are subtle and show up only at run time.
- Unused struct members not cleared. (17-19) The entire addr structure should have been initialized to 0 before setting each address member. The Socket API uses one basic addressing structure (sockaddr) with different overlays, depending on the address family, for example, sockaddr_in for IPv4. Without initializing the entire structure to 0, parts of the fields may have indeterminate contents, which can yield random run-time failures.
- Address/protocol family mismatch. (17) The addr.sin_family field was set to AF_INET, which designates the Internet addressing family. It will be used with a socket (s_handle) that was created with the UNIX protocol family, which is inconsistent. Instead, the protocol family type passed to the socket() function should have been PF_INET.
- Wrong byte order. (18) The value assigned to addr.sin_port is not in network byte order; that is, the programmer forgot to use htons() to convert the port number from host byte order into network byte order. When run on a computer with little-endian byte order, this code will execute without error; however, clients will be unable to connect to it at the expected port number.
If these network addressing mistakes were corrected, lines 21-23 would actually work!
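The three addressing fixes can be sketched together as follows. This is a minimal illustration that assumes a POSIX platform and IPv4; the helper name make_listen_address() is hypothetical and is not part of the Socket API.

```cpp
#include <arpa/inet.h>    // htons(), htonl()
#include <netinet/in.h>   // sockaddr_in, INADDR_ANY
#include <sys/socket.h>   // AF_INET
#include <cstring>        // std::memset

// Hypothetical helper: builds an IPv4 address for a server that
// listens on every local interface at the given port.
sockaddr_in make_listen_address (unsigned short port)
{
  sockaddr_in addr;
  std::memset (&addr, 0, sizeof addr);        // Clear unused members.
  addr.sin_family = AF_INET;                  // Must match PF_INET in socket().
  addr.sin_port = htons (port);               // Host to network byte order.
  addr.sin_addr.s_addr = htonl (INADDR_ANY);  // Also in network byte order.
  return addr;
}
```

The result is only meaningful for a socket created with PF_INET; pairing it with a PF_UNIX socket reproduces the family mismatch described above.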
There's an interrelated set of errors on lines 25-27. These exemplify how hard it is to locate errors before run time when programming directly to the C Socket API.
- Missing an important API call. (25) The listen() function was omitted accidentally. This function must be called before accept() to set the socket handle into so-called "passive mode."
- Wrong socket type for API call. (26) The accept() function was called for s_handle, which is exactly what should be done. The s_handle was created as a SOCK_DGRAM-type socket, however, which is an illegal socket type to use with accept(). The original socket() call on line 13 should therefore have been passed the SOCK_STREAM flag.
- Operator precedence error. (26-27) There's one further error related to the accept() call. As written, n_handle will be set to 1 if accept() succeeds and 0 if it fails. If this program runs on UNIX (and the other bugs are fixed), data will be written to either stdout or stdin, respectively, instead of the connected socket. Most of the errors in this example can be avoided by using the ACE wrapper facade classes, but this bug is simply an error in operator precedence, which can't be avoided by using ACE, or any other library. It's a common pitfall [Koe88] with C and C++ programs, remedied only by knowing your operator precedence. To fix this, add a set of parentheses around the assignment expression, as follows:
if ((n_handle = accept (s_handle, (struct sockaddr *) &addr,
&addr_len)) != -1) {
Better yet, follow the convention we use in ACE and put the assignment to n_handle on a separate line:
n_handle = accept (s_handle,
(struct sockaddr *) &addr, &addr_len);
if (n_handle != -1) {
This way, you don't have to worry about remembering the arcane C++ operator precedence rules!
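The pitfall can be reproduced without any sockets. The sketch below uses a hypothetical stand-in, fake_accept(), that returns handle 5 on success; the function names are ours and exist only for illustration.

```cpp
// Hypothetical stand-in for accept(): yields handle 5 on success,
// -1 on failure.
int fake_accept (bool succeed) { return succeed ? 5 : -1; }

// Mirrors lines 26-27: != binds tighter than =, so the handle is lost.
int buggy_handle (bool succeed)
{
  int n_handle;
  n_handle = fake_accept (succeed) != -1;  // Parsed as n_handle = (... != -1).
  return n_handle;
}

// Mirrors the separate-line convention: assign first, test afterwards.
int fixed_handle (bool succeed)
{
  int n_handle;
  n_handle = fake_accept (succeed);
  if (n_handle != -1) { /* Use the connected handle here. */ }
  return n_handle;
}
```

Here buggy_handle() can only ever yield 0 or 1, whereas fixed_handle() yields the value that fake_accept() returned.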
We're almost finished with our echo_server() function, but there are still several remaining errors.
- Wrong handle used in API call. (29) The read() function was called to receive up to sizeof buf bytes from s_handle, which is the passive-mode listening socket. We should have called read() on n_handle instead. This problem would have manifested itself as an obscure run-time error and could not have been detected at compile time since socket handles are weakly typed.
- Possible data loss. (30) The return value from write() was not checked to see if all n bytes (or any bytes at all) were written, which is a possible source of data loss. Due to socket buffering and flow control, a write() to a bytestream mode socket may only send part of the requested number of bytes, in which case the rest must be sent later.
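One common way to guard against short writes is to loop until every byte has been handed to the OS. The following is a minimal sketch that assumes a POSIX platform and a blocking handle; write_n() is a hypothetical helper name, not a Socket API function.

```cpp
#include <cerrno>      // errno, EINTR
#include <cstddef>     // std::size_t
#include <unistd.h>    // write(), ssize_t

// Hypothetical helper: keeps calling write() until all len bytes are
// sent. Returns len on success, or -1 if write() reports an error.
ssize_t write_n (int handle, const char *buf, std::size_t len)
{
  std::size_t sent = 0;
  while (sent < len) {
    ssize_t n = write (handle, buf + sent, len - sent);
    if (n == -1) {
      if (errno == EINTR) continue;   // Interrupted by a signal; retry.
      return -1;
    }
    sent += static_cast<std::size_t> (n);  // Advance past the partial write.
  }
  return static_cast<ssize_t> (sent);
}
```

In the echo loop, line 30 would then become a call such as write_n (n_handle, buf, n) whose return value is checked.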
After being burned enough times, experienced network programmers will be alert for most of the problems outlined above. A more fundamental design problem, however, is the lack of adequate type safety and data abstraction in the Socket API. The source code above will compile cleanly on some OS platforms, but not all. It will not run correctly on any platform, however!
Over the years, programmers and libraries have implemented numerous workarounds to alleviate these problems. A common solution is to use typedefs to clarify to programmers what types are involved. Although these solutions can alleviate some portability concerns, they don't cover the other problems outlined above. In particular, the diversity of IPC addressing schemes presents more variation than a mere typedef can hide.
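To see why a typedef alone doesn't add type safety, compare it with distinct types for each socket role. This is only a sketch: socket_handle, Listen_Handle, Stream_Handle, and data_handle() are hypothetical names chosen for illustration.

```cpp
#include <type_traits>

// A typedef only renames int; any integer still passes as a "handle".
typedef int socket_handle;
static_assert (std::is_same<socket_handle, int>::value,
               "a typedef creates no new type");

// Hypothetical distinct types for the two socket roles.
struct Listen_Handle { int fd; };   // Passive-mode socket: accept() only.
struct Stream_Handle { int fd; };   // Connected socket: read()/write().

static_assert (!std::is_convertible<Listen_Handle, Stream_Handle>::value,
               "roles can no longer be mixed up silently");

// Accepts only a connected handle; passing a Listen_Handle won't compile.
int data_handle (Stream_Handle h) { return h.fd; }
```

With types like these, the mistake on line 29 of echo_server(), reading from the listening socket, would be rejected by the compiler rather than surfacing at run time.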
2.3.2 Overly Complex APIs
The Socket API provides a single interface that supports multiple:
- Protocol families, such as TCP/IP, IPX/SPX, X.25, ISO OSI, ATM, and UNIX-domain sockets
- Communication/connection roles, such as active connection establishment versus passive connection establishment versus data transfer
- Communication optimizations, such as the gather-write function, writev(), that sends multiple buffers in a single system function call, and
- Options for less common functionality, such as broadcasting, multicasting, asynchronous I/O, and urgent data delivery.
The Socket API combines all this functionality into a single API, listed in the tables in Section 2.2. The result is complex and hard to master. If you apply a careful analysis to the Socket API, however, you'll see that its interface can be decomposed into the following dimensions:
- Type of communication service, such as streams versus datagrams versus connected datagrams
- Communication/connection role; for example, clients often initiate connections actively, whereas servers often accept them passively
- Communication domain, such as local host only versus local or remote host
Figure 2.1 clusters the related Socket functions according to these three dimensions. This natural clustering is obscured in the Socket API, however, because all this functionality is crammed into a single set of functions. Moreover, the Socket API can't enforce the correct use of its functions at compile time for different communication and connection roles, such as active versus passive connection establishment or datagram versus stream communication.
Figure 2.1. Taxonomy of Socket Dimensions
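Two of these dimensions map directly onto the arguments of socket(), which the following sketch makes explicit. It assumes a POSIX platform; the Service and Domain enumerations and the socket_args() helper are hypothetical, and connected datagrams are left out for brevity.

```cpp
#include <sys/socket.h>   // PF_UNIX, PF_INET, SOCK_STREAM, SOCK_DGRAM

// Hypothetical enumerations naming two of the three dimensions.
enum class Service { stream, datagram };
enum class Domain { local_only, local_or_remote };

// The family and type arguments that socket() would receive.
struct Socket_Args { int family; int type; };

Socket_Args socket_args (Domain d, Service s)
{
  Socket_Args args;
  args.family = (d == Domain::local_only) ? PF_UNIX : PF_INET;
  args.type = (s == Service::stream) ? SOCK_STREAM : SOCK_DGRAM;
  return args;
}
```

The third dimension, the connection role, doesn't appear in the socket() arguments at all. It's expressed only by which functions are called afterward, such as connect() versus bind(), listen(), and accept(), which is why the compiler can't check it.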
2.3.3 Nonportable and Nonuniform APIs
Despite its ubiquity, the Socket API is not portable. The following are some areas of divergence across platforms:
- Function names. The read(), write(), and close() functions in the echo_server() function in Section 2.3.1 are not portable to all OS platforms. For example, Windows defines a different set of functions (ReadFile(), WriteFile(), and closesocket()) that provide these behaviors.
- Function semantics. Certain functions behave differently on different platforms. For example, on UNIX and Win32 the accept() function can be passed NULL pointers for the client address and client length field. On certain real-time platforms, such as VxWorks, passing NULL pointers to accept() can crash the machine.
- Socket handle types. Different platforms use different representations for socket handles. For example, on UNIX platforms socket handles are integers, whereas on Win32 they are actually implemented as pointers.
- Header files. Different OS/compiler platforms use different names for header files containing Socket API function prototypes.
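Application code often papers over the handle-type, function-name, and header-file differences with conditional compilation. The sketch below shows the idea for UNIX and Windows only; socket_handle, invalid_handle, and close_socket() are hypothetical names, and the approach doesn't address the differences in function semantics.

```cpp
#if defined (_WIN32)
#  include <winsock2.h>            // SOCKET, INVALID_SOCKET, closesocket()
typedef SOCKET socket_handle;
const socket_handle invalid_handle = INVALID_SOCKET;
inline int close_socket (socket_handle h) { return closesocket (h); }
#else
#  include <sys/socket.h>          // socket() and related functions
#  include <unistd.h>              // close()
typedef int socket_handle;
const socket_handle invalid_handle = -1;
inline int close_socket (socket_handle h) { return close (h); }
#endif
```

Code written against these names compiles on both platforms, but every call site still has to remember to use them, and the typedef provides no more type safety than the underlying handle type.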
Another problem with the Socket API is that its several dozen functions lack a uniform naming convention, which makes it hard to determine the scope of the API. For example, it isn't immediately obvious that the socket(), bind(), accept(), and connect() functions belong to the same API. Other networking APIs address this problem by prepending a common prefix before each function. For example, a t_ is prepended before each function in the TLI API [Rag93].