16.5. Data Transfer

Since a socket endpoint is represented as a file descriptor, we can use read and write to communicate with a socket, as long as it is connected. Recall that a datagram socket can be "connected" if we set the default peer address using the connect function. Using read and write with socket descriptors is significant, because it means that we can pass socket descriptors to functions that were originally designed to work with local files. We can also arrange to pass the socket descriptors to child processes that execute programs that know nothing about sockets.
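To make the point about read and write concrete, here is a minimal sketch that pushes a string through a connected socket using nothing but those two functions. The helper name echo_through_socketpair is hypothetical, and we assume a UNIX domain socket pair created with socketpair simply so that both ends live in one process.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: send msg across a connected socket pair using
 * only write and read, and store what arrives in out (null-terminated).
 * Returns the number of bytes read, or -1 on error.
 */
ssize_t
echo_through_socketpair(const char *msg, char *out, size_t outsize)
{
    int     fd[2];
    ssize_t n;

    assert(msg != NULL && out != NULL && outsize > 0);
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fd) < 0)
        return(-1);
    if (write(fd[0], msg, strlen(msg)) < 0) {
        close(fd[0]);
        close(fd[1]);
        return(-1);
    }
    n = read(fd[1], out, outsize - 1);  /* plain read on a socket */
    if (n >= 0)
        out[n] = '\0';
    close(fd[0]);
    close(fd[1]);
    return(n);
}
```

Any function that accepts a file descriptor could stand in for read and write here; nothing in the helper depends on the descriptor being a socket once the pair exists.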

Although we can exchange data using read and write, that is about all we can do with these two functions. If we want to specify options, receive packets from multiple clients, or send out-of-band data, we need to use one of the six socket functions designed for data transfer.

Three functions are available for sending data, and three are available for receiving data. First, we'll look at the ones used to send data.

The simplest one is send. It is similar to write, but allows us to specify flags to change how the data we want to transmit is treated.

    #include <sys/socket.h>

    ssize_t send(int sockfd, const void *buf, size_t nbytes, int flags);

Returns: number of bytes sent if OK, -1 on error


Like write, the socket has to be connected to use send. The buf and nbytes arguments have the same meaning as they do with write.

Unlike write, however, send supports a fourth flags argument. Two flags are defined by the Single UNIX Specification, but it is common for implementations to support additional ones. They are summarized in Figure 16.11.

Figure 16.11. Flags used with send socket calls

    Flag            Description
    -------------   ------------------------------------------------------------------
    MSG_DONTROUTE   Don't route packet outside of local network.
    MSG_DONTWAIT    Enable nonblocking operation (equivalent to using O_NONBLOCK).
    MSG_EOR         This is the end of record if supported by protocol.
    MSG_OOB         Send out-of-band data if supported by protocol (see Section 16.7).

    (The figure also has support columns for POSIX.1, FreeBSD 5.2.1, Linux 2.4.22, Mac OS X 10.3, and Solaris 9; the per-platform marks are missing from this copy.)


If send returns success, it doesn't necessarily mean that the process at the other end of the connection receives the data. All we are guaranteed is that when send succeeds, the data has been delivered to the network drivers without error.

With a protocol that supports message boundaries, if we try to send a single message larger than the maximum supported by the protocol, send will fail with errno set to EMSGSIZE. With a byte-stream protocol, send will block until the entire amount of data has been transmitted.

The sendto function is similar to send. The difference is that sendto allows us to specify a destination address to be used with connectionless sockets.

    #include <sys/socket.h>

    ssize_t sendto(int sockfd, const void *buf, size_t nbytes, int flags,
                   const struct sockaddr *destaddr, socklen_t destlen);

Returns: number of bytes sent if OK, -1 on error


With a connection-oriented socket, the destination address is ignored, as the destination is implied by the connection. With a connectionless socket, we can't use send unless the destination address is first set by calling connect, so sendto gives us an alternate way to send a message.
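As an illustration, the sketch below sends one datagram with sendto from a socket that was never connected. We assume IPv4 UDP over the loopback address and let the system choose the receiver's port; the function name sendto_loopback is hypothetical.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: send msg as one datagram to a UDP socket bound
 * to the loopback address, then read it back from that socket.
 * Returns the number of bytes received, or -1 on error.
 */
ssize_t
sendto_loopback(const char *msg, char *out, size_t outsize)
{
    struct sockaddr_in  addr;
    socklen_t           alen = sizeof(addr);
    int                 rfd, sfd;
    ssize_t             n = -1;

    assert(msg != NULL && out != NULL && outsize > 0);
    if ((rfd = socket(AF_INET, SOCK_DGRAM, 0)) < 0)
        return(-1);
    if ((sfd = socket(AF_INET, SOCK_DGRAM, 0)) < 0) {
        close(rfd);
        return(-1);
    }
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;          /* let the system pick a free port */

    /* bind the receiver, learn its port, then address a datagram to it */
    if (bind(rfd, (struct sockaddr *)&addr, sizeof(addr)) == 0 &&
      getsockname(rfd, (struct sockaddr *)&addr, &alen) == 0 &&
      sendto(sfd, msg, strlen(msg), 0,
        (struct sockaddr *)&addr, alen) >= 0)
        n = recv(rfd, out, outsize, 0);
    close(sfd);
    close(rfd);
    return(n);
}
```

The sending socket never calls connect or bind; the destination comes entirely from the address passed to sendto.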

We have one more choice when transmitting data over a socket. We can call sendmsg with a msghdr structure to specify multiple buffers from which to transmit data, similar to the writev function (Section 14.7).

    #include <sys/socket.h>

    ssize_t sendmsg(int sockfd, const struct msghdr *msg, int flags);

Returns: number of bytes sent if OK, -1 on error


POSIX.1 defines the msghdr structure to have at least the following members:

    struct msghdr {
        void          *msg_name;         /* optional address */
        socklen_t      msg_namelen;      /* address size in bytes */
        struct iovec  *msg_iov;          /* array of I/O buffers */
        int            msg_iovlen;       /* number of elements in array */
        void          *msg_control;      /* ancillary data */
        socklen_t      msg_controllen;   /* number of ancillary bytes */
        int            msg_flags;        /* flags for received message */
        .
        .
        .
    };

We saw the iovec structure in Section 14.7. We'll see the use of ancillary data in Section 17.4.2.
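A short sketch may help show how the msghdr and iovec structures fit together when we gather output from more than one buffer. The helper send_two_parts is hypothetical; it assumes a connected socket, so the address and ancillary-data members are left zeroed.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

/*
 * Hypothetical helper: transmit head followed by body with a single
 * sendmsg call. Returns the total number of bytes sent, or -1 on error.
 */
ssize_t
send_two_parts(int sockfd, const char *head, const char *body)
{
    struct iovec    iov[2];
    struct msghdr   msg;

    assert(head != NULL && body != NULL);
    iov[0].iov_base = (void *)head;     /* sendmsg only reads these */
    iov[0].iov_len = strlen(head);
    iov[1].iov_base = (void *)body;
    iov[1].iov_len = strlen(body);

    memset(&msg, 0, sizeof(msg));       /* no address, no ancillary data */
    msg.msg_iov = iov;
    msg.msg_iovlen = 2;
    return(sendmsg(sockfd, &msg, 0));
}
```

The receiver sees the two buffers as one contiguous run of bytes, just as it would with writev on an ordinary descriptor.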

The recv function is similar to read, but allows us to specify some options to control how we receive the data.

    #include <sys/socket.h>

    ssize_t recv(int sockfd, void *buf, size_t nbytes, int flags);

Returns: length of message in bytes, 0 if no messages are available and peer has done an orderly shutdown, or -1 on error


The flags that can be passed to recv are summarized in Figure 16.12. Only three are defined by the Single UNIX Specification.

Figure 16.12. Flags used with recv socket calls

    Flag          Description
    -----------   ---------------------------------------------------------------------------------
    MSG_OOB       Retrieve out-of-band data if supported by protocol (see Section 16.7).
    MSG_PEEK      Return packet contents without consuming packet.
    MSG_TRUNC     Request that the real length of the packet be returned, even if it was truncated.
    MSG_WAITALL   Wait until all data is available (SOCK_STREAM only).

    (The figure also has support columns for POSIX.1, FreeBSD 5.2.1, Linux 2.4.22, Mac OS X 10.3, and Solaris 9; the per-platform marks are missing from this copy.)


When we specify the MSG_PEEK flag, we can peek at the next data to be read without actually consuming it. The next call to read or one of the recv functions will return the same data we peeked at.
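The effect of MSG_PEEK can be checked with a few lines of code. In this sketch, the hypothetical function peek_matches_read peeks at whatever is pending, then reads the same number of bytes normally and compares the two copies; the 64-byte buffers are an arbitrary choice.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Hypothetical helper: peek at pending data, then consume it.
 * Returns 1 if the peeked bytes equal the bytes read afterward,
 * 0 if they differ, or -1 on error.
 */
int
peek_matches_read(int sockfd)
{
    char    peeked[64], consumed[64];
    ssize_t np, nr;

    assert(sockfd >= 0);
    /* look at the data without removing it from the socket */
    if ((np = recv(sockfd, peeked, sizeof(peeked), MSG_PEEK)) < 0)
        return(-1);
    /* now read the same amount for real */
    if ((nr = recv(sockfd, consumed, (size_t)np, 0)) < 0)
        return(-1);
    return(np == nr && memcmp(peeked, consumed, (size_t)np) == 0);
}
```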

With SOCK_STREAM sockets, we can receive less data than we requested. The MSG_WAITALL flag inhibits this behavior, preventing recv from returning until all the data we requested has been received. With SOCK_DGRAM and SOCK_SEQPACKET sockets, the MSG_WAITALL flag provides no change in behavior, because these message-based socket types already return an entire message in a single read.

If the sender has called shutdown (Section 16.2) to end transmission, or if the network protocol supports orderly shutdown by default and the sender has closed the socket, then recv will return 0 when we have received all the data.
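The zero return gives the receiver a natural loop-termination test. As a small sketch, the hypothetical count_until_eof function below reads until recv reports that the sender is finished and returns how many bytes arrived; the 128-byte buffer size is arbitrary.

```c
#include <assert.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Hypothetical helper: read and discard data until the peer ends
 * transmission. Returns the total byte count, or -1 on error.
 */
ssize_t
count_until_eof(int sockfd)
{
    char    buf[128];
    ssize_t n, total = 0;

    assert(sockfd >= 0);
    /* recv returns 0 once all data is read and the sender is done */
    while ((n = recv(sockfd, buf, sizeof(buf), 0)) > 0)
        total += n;
    return(n < 0 ? -1 : total);
}
```

If the sender calls shutdown with SHUT_WR after its last send, the loop ends as soon as the remaining bytes have been drained.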

If we are interested in the identity of the sender, we can use recvfrom to obtain the source address from which the data was sent.

    #include <sys/socket.h>

    ssize_t recvfrom(int sockfd, void *restrict buf, size_t len, int flags,
                     struct sockaddr *restrict addr,
                     socklen_t *restrict addrlen);

Returns: length of message in bytes, 0 if no messages are available and peer has done an orderly shutdown, or -1 on error


If addr is non-null, it will contain the address of the socket endpoint from which the data was sent. When calling recvfrom, we need to set the addrlen parameter to point to an integer containing the size in bytes of the socket buffer to which addr points. On return, the integer is set to the actual size of the address in bytes.

Because it allows us to retrieve the address of the sender, recvfrom is usually used with connectionless sockets. Otherwise, recvfrom behaves identically to recv.

To receive data into multiple buffers, similar to readv (Section 14.7), or if we want to receive ancillary data (Section 17.4.2), we can use recvmsg.

    #include <sys/socket.h>

    ssize_t recvmsg(int sockfd, struct msghdr *msg, int flags);

Returns: length of message in bytes, 0 if no messages are available and peer has done an orderly shutdown, or -1 on error


The msghdr structure (which we saw used with sendmsg) is used by recvmsg to specify the input buffers to be used to receive the data. We can set the flags argument to change the default behavior of recvmsg. On return, the msg_flags field of the msghdr structure is set to indicate various characteristics of the data received. (The msg_flags field is ignored on entry to recvmsg). The possible values on return from recvmsg are summarized in Figure 16.13. We'll see an example that uses recvmsg in Chapter 17.
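To see msg_flags in action before Chapter 17, consider the following sketch. The hypothetical recv_check_trunc function receives one datagram into a caller-supplied buffer and reports whether recvmsg set MSG_TRUNC, which happens when the datagram is larger than the buffer.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

/*
 * Hypothetical helper: receive one datagram into buf and set
 * *truncated to 1 if recvmsg reported MSG_TRUNC in msg_flags,
 * 0 otherwise. Returns the value of recvmsg, or -1 on error.
 */
ssize_t
recv_check_trunc(int sockfd, void *buf, size_t len, int *truncated)
{
    struct iovec    iov;
    struct msghdr   msg;
    ssize_t         n;

    assert(buf != NULL && truncated != NULL);
    iov.iov_base = buf;
    iov.iov_len = len;
    memset(&msg, 0, sizeof(msg));   /* msg_flags is ignored on entry */
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    if ((n = recvmsg(sockfd, &msg, 0)) < 0)
        return(-1);
    /* on return, msg_flags describes the message just received */
    *truncated = (msg.msg_flags & MSG_TRUNC) != 0;
    return(n);
}
```

With a datagram socket, the excess bytes of a truncated message are discarded, so checking this flag is the only way to learn that data was lost.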

Figure 16.13. Flags returned in msg_flags by recvmsg

    Flag           Description
    ------------   ---------------------------------------
    MSG_CTRUNC     Control data was truncated.
    MSG_DONTWAIT   recvmsg was called in nonblocking mode.
    MSG_EOR        End of record was received.
    MSG_OOB        Out-of-band data was received.
    MSG_TRUNC      Normal data was truncated.

    (The figure also has support columns for POSIX.1, FreeBSD 5.2.1, Linux 2.4.22, Mac OS X 10.3, and Solaris 9; the per-platform marks are missing from this copy.)


Example: Connection-Oriented Client

Figure 16.14 shows a client command that communicates with a server to obtain the output from a system's uptime command. We call this service "remote uptime" (or "ruptime" for short).

This program connects to a server, reads the string sent by the server, and prints the string on the standard output. Since we're using a SOCK_STREAM socket, we can't be guaranteed that we will read the entire string in one call to recv, so we need to repeat the call until it returns 0.

The getaddrinfo function might return more than one candidate address for us to use if the server supports multiple network interfaces or multiple network protocols. We try each one in turn, giving up when we find one that allows us to connect to the service. We use the connect_retry function from Figure 16.9 to establish a connection with the server.

Figure 16.14. Client command to get uptime from server
    #include "apue.h"
    #include <netdb.h>
    #include <errno.h>
    #include <sys/socket.h>

    #define MAXADDRLEN  256
    #define BUFLEN      128

    extern int connect_retry(int, const struct sockaddr *, socklen_t);

    /* Copy everything the server sends to standard output. */
    void
    print_uptime(int sockfd)
    {
        int     n;
        char    buf[BUFLEN];

        /* recv returns 0 once the server closes the connection */
        while ((n = recv(sockfd, buf, BUFLEN, 0)) > 0)
            write(STDOUT_FILENO, buf, n);
        if (n < 0)
            err_sys("recv error");
    }

    int
    main(int argc, char *argv[])
    {
        struct addrinfo *ailist, *aip;
        struct addrinfo hint;
        int             sockfd, err;

        if (argc != 2)
            err_quit("usage: ruptime hostname");
        hint.ai_flags = 0;
        hint.ai_family = 0;
        hint.ai_socktype = SOCK_STREAM;
        hint.ai_protocol = 0;
        hint.ai_addrlen = 0;
        hint.ai_canonname = NULL;
        hint.ai_addr = NULL;
        hint.ai_next = NULL;
        if ((err = getaddrinfo(argv[1], "ruptime", &hint, &ailist)) != 0)
            err_quit("getaddrinfo error: %s", gai_strerror(err));
        for (aip = ailist; aip != NULL; aip = aip->ai_next) {
            if ((sockfd = socket(aip->ai_family, SOCK_STREAM, 0)) < 0) {
                err = errno;
                continue;       /* no socket to connect; try next address */
            }
            if (connect_retry(sockfd, aip->ai_addr, aip->ai_addrlen) < 0) {
                err = errno;
                close(sockfd);  /* don't leak the descriptor */
            } else {
                print_uptime(sockfd);
                exit(0);
            }
        }
        fprintf(stderr, "can't connect to %s: %s\n", argv[1],
          strerror(err));
        exit(1);
    }

Example: Connection-Oriented Server

Figure 16.15 shows the server that provides the uptime command's output to the client program from Figure 16.14.

To find out its address, the server needs to get the name of the host on which it is running. Some systems don't define the _SC_HOST_NAME_MAX constant, so we use HOST_NAME_MAX in this case. If the system doesn't define HOST_NAME_MAX, we define it ourselves. POSIX.1 requires the maximum host name length to be at least 255 bytes, not including the terminating null, so we define HOST_NAME_MAX to be 256 to include the terminating null.

The server gets the host name by calling gethostname and looks up the address for the remote uptime service. Multiple addresses can be returned, but we simply choose the first one for which we can establish a passive socket endpoint. Handling multiple addresses is left as an exercise.

We use the initserver function from Figure 16.10 to initialize the socket endpoint on which we will wait for connect requests to arrive. (Actually, we use the version from Figure 16.20; we'll see why when we discuss socket options in Section 16.6.)

Figure 16.15. Server program to provide system uptime
    #include "apue.h"
    #include <netdb.h>
    #include <errno.h>
    #include <syslog.h>
    #include <sys/socket.h>

    #define BUFLEN  128
    #define QLEN 10

    #ifndef HOST_NAME_MAX
    #define HOST_NAME_MAX 256
    #endif

    extern int initserver(int, struct sockaddr *, socklen_t, int);

    void
    serve(int sockfd)
    {
        int     clfd;
        FILE    *fp;
        char    buf[BUFLEN];

        for (;;) {
            clfd = accept(sockfd, NULL, NULL);
            if (clfd < 0) {
                syslog(LOG_ERR, "ruptimed: accept error: %s",
                  strerror(errno));
                exit(1);
            }
            if ((fp = popen("/usr/bin/uptime", "r")) == NULL) {
                sprintf(buf, "error: %s\n", strerror(errno));
                send(clfd, buf, strlen(buf), 0);
            } else {
                while (fgets(buf, BUFLEN, fp) != NULL)
                    send(clfd, buf, strlen(buf), 0);
                pclose(fp);
            }
            close(clfd);
        }
    }

    int
    main(int argc, char *argv[])
    {
        struct addrinfo *ailist, *aip;
        struct addrinfo hint;
        int             sockfd, err, n;
        char            *host;

        if (argc != 1)
            err_quit("usage: ruptimed");
    #ifdef _SC_HOST_NAME_MAX
        n = sysconf(_SC_HOST_NAME_MAX);
        if (n < 0)  /* best guess */
    #endif
            n = HOST_NAME_MAX;
        host = malloc(n);
        if (host == NULL)
            err_sys("malloc error");
        if (gethostname(host, n) < 0)
            err_sys("gethostname error");
        daemonize("ruptimed");
        hint.ai_flags = AI_CANONNAME;
        hint.ai_family = 0;
        hint.ai_socktype = SOCK_STREAM;
        hint.ai_protocol = 0;
        hint.ai_addrlen = 0;
        hint.ai_canonname = NULL;
        hint.ai_addr = NULL;
        hint.ai_next = NULL;
        if ((err = getaddrinfo(host, "ruptime", &hint, &ailist)) != 0) {
            syslog(LOG_ERR, "ruptimed: getaddrinfo error: %s",
              gai_strerror(err));
            exit(1);
        }
        for (aip = ailist; aip != NULL; aip = aip->ai_next) {
            if ((sockfd = initserver(SOCK_STREAM, aip->ai_addr,
              aip->ai_addrlen, QLEN)) >= 0) {
                serve(sockfd);
                exit(0);
            }
        }
        exit(1);
    }

Example: Alternate Connection-Oriented Server

Previously, we stated that using file descriptors to access sockets was significant, because it allowed programs that knew nothing about networking to be used in a networked environment. The version of the server shown in Figure 16.16 illustrates this point. Instead of reading the output of the uptime command and sending it to the client, the server arranges to have the standard output and standard error of the uptime command be the socket endpoint connected to the client.

Instead of using popen to run the uptime command and reading the output from the pipe connected to the command's standard output, we use fork to create a child process and then use dup2 to arrange that the child's copy of STDIN_FILENO is open to /dev/null and that both STDOUT_FILENO and STDERR_FILENO are open to the socket endpoint. When we execute uptime, the command writes the results to its standard output, which is connected to the socket, and the data is sent back to the ruptime client command.

The parent can safely close the file descriptor connected to the client, because the child still has it open. The parent waits for the child to complete before proceeding, so that the child doesn't become a zombie. Since it shouldn't take too long to run the uptime command, the parent can afford to wait for the child to exit before accepting the next connect request. This strategy might not be appropriate if the child takes a long time, however.

Figure 16.16. Server program illustrating command writing directly to socket
    #include "apue.h"
    #include <netdb.h>
    #include <errno.h>
    #include <syslog.h>
    #include <fcntl.h>
    #include <sys/socket.h>
    #include <sys/wait.h>

    #define QLEN 10

    #ifndef HOST_NAME_MAX
    #define HOST_NAME_MAX 256
    #endif

    extern int initserver(int, struct sockaddr *, socklen_t, int);

    void
    serve(int sockfd)
    {
        int     clfd, status;
        pid_t   pid;

        for (;;) {
            clfd = accept(sockfd, NULL, NULL);
            if (clfd < 0) {
                syslog(LOG_ERR, "ruptimed: accept error: %s",
                  strerror(errno));
                exit(1);
            }
            if ((pid = fork()) < 0) {
                syslog(LOG_ERR, "ruptimed: fork error: %s",
                  strerror(errno));
                exit(1);
            } else if (pid == 0) {  /* child */
                /*
                 * The parent called daemonize (Figure 13.1), so
                 * STDIN_FILENO, STDOUT_FILENO, and STDERR_FILENO
                 * are already open to /dev/null. Thus, the call to
                 * close doesn't need to be protected by checks that
                 * clfd isn't already equal to one of these values.
                 */
                if (dup2(clfd, STDOUT_FILENO) != STDOUT_FILENO ||
                  dup2(clfd, STDERR_FILENO) != STDERR_FILENO) {
                    syslog(LOG_ERR, "ruptimed: unexpected error");
                    exit(1);
                }
                close(clfd);
                execl("/usr/bin/uptime", "uptime", (char *)0);
                syslog(LOG_ERR, "ruptimed: unexpected return from exec: %s",
                  strerror(errno));
                exit(1);    /* exec failed; child must not fall into accept loop */
            } else {        /* parent */
                close(clfd);
                waitpid(pid, &status, 0);
            }
        }
    }

    int
    main(int argc, char *argv[])
    {
        struct addrinfo *ailist, *aip;
        struct addrinfo hint;
        int             sockfd, err, n;
        char            *host;

        if (argc != 1)
            err_quit("usage: ruptimed");
    #ifdef _SC_HOST_NAME_MAX
        n = sysconf(_SC_HOST_NAME_MAX);
        if (n < 0)  /* best guess */
    #endif
            n = HOST_NAME_MAX;
        host = malloc(n);
        if (host == NULL)
            err_sys("malloc error");
        if (gethostname(host, n) < 0)
            err_sys("gethostname error");
        daemonize("ruptimed");
        hint.ai_flags = AI_CANONNAME;
        hint.ai_family = 0;
        hint.ai_socktype = SOCK_STREAM;
        hint.ai_protocol = 0;
        hint.ai_addrlen = 0;
        hint.ai_canonname = NULL;
        hint.ai_addr = NULL;
        hint.ai_next = NULL;
        if ((err = getaddrinfo(host, "ruptime", &hint, &ailist)) != 0) {
            syslog(LOG_ERR, "ruptimed: getaddrinfo error: %s",
              gai_strerror(err));
            exit(1);
        }
        for (aip = ailist; aip != NULL; aip = aip->ai_next) {
            if ((sockfd = initserver(SOCK_STREAM, aip->ai_addr,
              aip->ai_addrlen, QLEN)) >= 0) {
                serve(sockfd);
                exit(0);
            }
        }
        exit(1);
    }

The previous examples have used connection-oriented sockets. But how do we choose the appropriate type? When do we use a connection-oriented socket, and when do we use a connectionless socket? The answer depends on how much work we want to do and what kind of tolerance we have for errors.

With a connectionless socket, packets can arrive out of order, so if we can't fit all our data in one packet, we will have to worry about ordering in our application. The maximum packet size is a characteristic of the communication protocol. Also, with a connectionless socket, the packets can be lost. If our application can't tolerate this loss, we should use connection-oriented sockets.

Tolerating packet loss means that we have two choices. If we intend to have reliable communication with our peer, we have to number our packets and request retransmission from the peer application when we detect a missing packet. We will also have to identify duplicate packets and discard them, since a packet might be delayed and appear to be lost, but show up after we have requested retransmission.
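The bookkeeping this implies on the receiving side can be sketched in a few lines. The names seq_tracker and seq_check are hypothetical, and the sketch assumes that sequence numbers start at 0, increase by 1 per packet, and never wrap around.

```c
#include <assert.h>
#include <stddef.h>

/* Result of examining an arriving packet's sequence number. */
enum seq_result {
    SEQ_IN_ORDER,   /* the packet we were waiting for; deliver it */
    SEQ_DUPLICATE,  /* already delivered; discard it */
    SEQ_GAP         /* an earlier packet is missing; ask for it again */
};

struct seq_tracker {
    unsigned long   next;   /* next sequence number we expect */
};

/*
 * Hypothetical helper: classify a packet numbered seq and advance the
 * tracker when the packet is the one expected.
 */
enum seq_result
seq_check(struct seq_tracker *t, unsigned long seq)
{
    assert(t != NULL);
    if (seq < t->next)
        return(SEQ_DUPLICATE);  /* late copy of a packet we have */
    if (seq > t->next)
        return(SEQ_GAP);        /* caller requests retransmission */
    t->next++;
    return(SEQ_IN_ORDER);
}
```

A real protocol would also need timers, buffering of packets that arrive early, and a wire format for the retransmission request; this is the work that a connection-oriented socket does for us.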

The other choice we have is to deal with the error by letting the user retry the command. For simple applications, this might be adequate, but for complex applications, this usually isn't a viable alternative, so it is generally better to use connection-oriented sockets in this case.

The drawbacks to connection-oriented sockets are that more work and time are needed to establish a connection, and each connection consumes more resources from the operating system.

Example: Connectionless Client

The program in Figure 16.17 is a version of the uptime client command that uses the datagram socket interface.

The main function for the datagram-based client is similar to the one for the connection-oriented client, with the addition of installing a signal handler for SIGALRM. We use the alarm function to avoid blocking indefinitely in the call to recvfrom.

With the connection-oriented protocol, we needed to connect to the server before exchanging data. The arrival of the connect request was enough for the server to determine that it needed to provide service to a client. But with the datagram-based protocol, we need a way to notify the server that we want it to perform its service on our behalf. In this example, we simply send the server a 1-byte message. The server will receive it, get our address from the packet, and use this address to transmit its response. If the server offered multiple services, we could use this request message to indicate the service we want, but since the server does only one thing, the content of the 1-byte message doesn't matter.

If the server isn't running, the client will block indefinitely in the call to recvfrom. With the connection-oriented example, the connect call will fail if the server isn't running. To avoid blocking indefinitely, we set an alarm clock before calling recvfrom.

Figure 16.17. Client command using datagram service
    #include "apue.h"
    #include <netdb.h>
    #include <errno.h>
    #include <sys/socket.h>

    #define BUFLEN      128
    #define TIMEOUT     20

    void
    sigalrm(int signo)
    {
    }

    void
    print_uptime(int sockfd, struct addrinfo *aip)
    {
        int     n;
        char    buf[BUFLEN];

        buf[0] = 0;
        if (sendto(sockfd, buf, 1, 0, aip->ai_addr, aip->ai_addrlen) < 0)
            err_sys("sendto error");
        alarm(TIMEOUT);
        if ((n = recvfrom(sockfd, buf, BUFLEN, 0, NULL, NULL)) < 0) {
            if (errno != EINTR)
                alarm(0);
            err_sys("recv error");
        }
        alarm(0);
        write(STDOUT_FILENO, buf, n);
    }

    int
    main(int argc, char *argv[])
    {
        struct addrinfo     *ailist, *aip;
        struct addrinfo      hint;
        int                  sockfd, err;
        struct sigaction     sa;

        if (argc != 2)
            err_quit("usage: ruptime hostname");
        sa.sa_handler = sigalrm;
        sa.sa_flags = 0;
        sigemptyset(&sa.sa_mask);
        if (sigaction(SIGALRM, &sa, NULL) < 0)
            err_sys("sigaction error");
        hint.ai_flags = 0;
        hint.ai_family = 0;
        hint.ai_socktype = SOCK_DGRAM;
        hint.ai_protocol = 0;
        hint.ai_addrlen = 0;
        hint.ai_canonname = NULL;
        hint.ai_addr = NULL;
        hint.ai_next = NULL;
        if ((err = getaddrinfo(argv[1], "ruptime", &hint, &ailist)) != 0)
            err_quit("getaddrinfo error: %s", gai_strerror(err));
        for (aip = ailist; aip != NULL; aip = aip->ai_next) {
            if ((sockfd = socket(aip->ai_family, SOCK_DGRAM, 0)) < 0) {
                err = errno;
            } else {
                print_uptime(sockfd, aip);
                exit(0);
            }
        }
        fprintf(stderr, "can't contact %s: %s\n", argv[1], strerror(err));
        exit(1);
    }

Example: Connectionless Server

The program in Figure 16.18 is the datagram version of the uptime server.

The server blocks in recvfrom for a request for service. When a request arrives, we save the requester's address and use popen to run the uptime command. We send the output back to the client using the sendto function, with the destination address set to the requester's address.

Figure 16.18. Server providing system uptime over datagrams
    #include "apue.h"
    #include <netdb.h>
    #include <errno.h>
    #include <syslog.h>
    #include <sys/socket.h>

    #define BUFLEN      128
    #define MAXADDRLEN  256

    #ifndef HOST_NAME_MAX
    #define HOST_NAME_MAX 256
    #endif

    extern int initserver(int, struct sockaddr *, socklen_t, int);

    void
    serve(int sockfd)
    {
        int         n;
        socklen_t   alen;
        FILE        *fp;
        char        buf[BUFLEN];
        char        abuf[MAXADDRLEN];

        for (;;) {
            alen = MAXADDRLEN;
            if ((n = recvfrom(sockfd, buf, BUFLEN, 0,
              (struct sockaddr *)abuf, &alen)) < 0) {
                syslog(LOG_ERR, "ruptimed: recvfrom error: %s",
                  strerror(errno));
                exit(1);
            }
            if ((fp = popen("/usr/bin/uptime", "r")) == NULL) {
                sprintf(buf, "error: %s\n", strerror(errno));
                sendto(sockfd, buf, strlen(buf), 0,
                  (struct sockaddr *)abuf, alen);
            } else {
                if (fgets(buf, BUFLEN, fp) != NULL)
                    sendto(sockfd, buf, strlen(buf), 0,
                      (struct sockaddr *)abuf, alen);
                pclose(fp);
            }
        }
    }

    int
    main(int argc, char *argv[])
    {
        struct addrinfo *ailist, *aip;
        struct addrinfo hint;
        int             sockfd, err, n;
        char            *host;

        if (argc != 1)
            err_quit("usage: ruptimed");
    #ifdef _SC_HOST_NAME_MAX
        n = sysconf(_SC_HOST_NAME_MAX);
        if (n < 0)  /* best guess */
    #endif
            n = HOST_NAME_MAX;
        host = malloc(n);
        if (host == NULL)
            err_sys("malloc error");
        if (gethostname(host, n) < 0)
            err_sys("gethostname error");
        daemonize("ruptimed");
        hint.ai_flags = AI_CANONNAME;
        hint.ai_family = 0;
        hint.ai_socktype = SOCK_DGRAM;
        hint.ai_protocol = 0;
        hint.ai_addrlen = 0;
        hint.ai_canonname = NULL;
        hint.ai_addr = NULL;
        hint.ai_next = NULL;
        if ((err = getaddrinfo(host, "ruptime", &hint, &ailist)) != 0) {
            syslog(LOG_ERR, "ruptimed: getaddrinfo error: %s",
              gai_strerror(err));
            exit(1);
        }
        for (aip = ailist; aip != NULL; aip = aip->ai_next) {
            if ((sockfd = initserver(SOCK_DGRAM, aip->ai_addr,
              aip->ai_addrlen, 0)) >= 0) {
                serve(sockfd);
                exit(0);
            }
        }
        exit(1);
    }
