17.3. Basic Socket Operations

Like most other Linux resources, sockets are implemented through the file abstraction. They are created through the socket() system call, which returns a file descriptor. Once the socket has been properly initialized, that file descriptor may be used for read() and write() requests, like any other file descriptor. When a process is finished with a socket, it should be close()ed to free the resources associated with it.

This section presents the basic system calls for creating and initializing sockets for any protocol. It is a bit abstract due to this protocol independence, and does not contain any examples for the same reason. The next two sections of this chapter describe how to use sockets with two different protocols, Unix Domain and TCP/IP, and those sections include full examples of how to use most of the system calls introduced here.

17.3.1. Creating a Socket

New sockets are created by the socket() system call, which returns a file descriptor for the uninitialized socket. The socket is tied to a particular protocol when it is created, but it is not connected to anything. As it is not connected, it cannot yet be read from or written to.

 #include <sys/socket.h>

 int socket(int domain, int type, int protocol);

Like open(), socket() returns a value less than 0 on error and a file descriptor, which is greater than or equal to 0, on success. The three parameters specify the protocol to use.

The first parameter specifies the protocol family that should be used and is usually one of the values specified in Table 17.1.

The next parameter, type, is SOCK_STREAM, SOCK_DGRAM, or SOCK_RAW.^[6] SOCK_STREAM specifies a protocol from the specified family that provides a stream connection, whereas SOCK_DGRAM specifies a datagram protocol from the same family. SOCK_RAW provides the ability to send packets directly to a network device driver, which enables user space applications to provide networking protocols that are not understood by the kernel.

^[6] A couple of other values are available, but they are not usually used by application code.

The final parameter specifies which protocol is to be used, subject to the constraints specified by the first two parameters. Usually this parameter is 0, letting the kernel use the default protocol of the specified type and family. For the PF_INET protocol family, Table 17.2 lists some of the protocols allowed, with IPPROTO_TCP being the default stream protocol and IPPROTO_UDP the default datagram protocol.

Table 17.2. IP Protocols
Protocol	Description
`IPPROTO_ICMP`	Internet Control Message Protocol for IPv4
`IPPROTO_ICMPV6`	Internet Control Message Protocol for IPv6
`IPPROTO_IPIP`	IPIP tunnels
`IPPROTO_IPV6`	IPv6 headers
`IPPROTO_RAW`	Raw IP packets
`IPPROTO_TCP`	Transmission Control Protocol (TCP)
`IPPROTO_UDP`	User Datagram Protocol (UDP)

17.3.2. Establishing Connections

After you create a stream socket, it needs to be connected to something before it is of much use. Establishing socket connections is an inherently asymmetric task; each side of the connection does it differently. One side gets its socket ready to be connected to something and then waits for someone to connect to it. This is usually done by server applications that are started and continuously run, waiting for other processes to connect to them.

Client processes instead create a socket, tell the system which address they want to connect it to, and then try to establish the connection. Once the server (which has been waiting for a client) accepts the connection attempt, the connection is established between the two sockets. After this happens, the socket may be used for bidirectional communication.

17.3.3. Binding an Address to a Socket

Both server and client processes need to tell the system which address to use for the socket. Attaching an address to the local side of a socket is called binding the socket and is done through the bind() system call.

 #include <sys/socket.h>

 int bind(int sock, struct sockaddr * my_addr, socklen_t addrlen);

The first parameter is the socket being bound, and the other parameters specify the address to use for the local endpoint.
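Because this section is protocol independent, it does not show how an address is filled in. Purely as an illustration, the sketch below binds an IPv4 socket to the loopback address; the helper name bind_loopback() and the choice of struct sockaddr_in are assumptions made for the example, and the protocol-specific sections cover address formats properly.

```c
#include <arpa/inet.h>    /* htons(), htonl() */
#include <netinet/in.h>   /* struct sockaddr_in, INADDR_LOOPBACK */
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: bind an IPv4 socket to 127.0.0.1 on the given
 * port. Passing port 0 lets the kernel choose an unused port.
 * Returns 0 on success and -1 on error, exactly as bind() does. */
int bind_loopback(int sock, unsigned short port)
{
    struct sockaddr_in addr;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);                    /* network byte order */
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);  /* 127.0.0.1 */

    /* bind() takes a generic struct sockaddr *, so the protocol-specific
     * structure is cast and its size passed as addrlen. */
    return bind(sock, (struct sockaddr *) &addr, sizeof(addr));
}
```

The cast to struct sockaddr * together with the explicit length is what allows a single bind() prototype to serve every protocol family.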

17.3.4. Waiting for Connections

After creating a socket, server processes bind() the socket to the address they are listening to. After the socket is bound to an address, the process tells the system it is willing to let other processes establish connections to that socket (at the specified address) by calling listen(). Once listen() has been called on the bound socket, the kernel is able to handle processes' attempts to connect to that address. However, the connection is not immediately established. The listen()ing process must first accept the connection attempt through the accept() system call. New connection attempts that have been made to addresses that have been listen()ed to are called pending connections until the connection has been accept()ed.

Normally, accept() blocks until a client process tries to connect to it. If the socket has been marked as nonblocking through fcntl(), accept() instead fails with errno set to EAGAIN if no connection is pending.^[7] The select(), poll(), and epoll system calls may also be used to determine whether a connection to a socket is pending (those calls mark the socket as ready to be read from).^[8]

^[7] The connect() system call can also be nonblocking, which allows clients to open multiple TCP connections much more quickly (it lets the program continue to run while TCP's three-way handshake is performed). Details on how to do this can be found in [Stevens, 2004].

^[8] The various forms of select() mark a socket as ready for reading when an accept() would not block even if the socket is not marked as nonblocking. For maximum portability, select() should be used only for accepting connections with nonblocking sockets, although under Linux it is not actually necessary. [Stevens, 2004] talks about the reasons for this in detail.
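The nonblocking behavior described above can be sketched as follows. Both helper names, set_nonblocking() and connection_pending(), are hypothetical; the sketch uses poll() to test for a pending connection, and select() or epoll could be substituted.

```c
#include <fcntl.h>
#include <poll.h>
#include <sys/socket.h>

/* Hypothetical helper: mark a descriptor nonblocking through fcntl().
 * Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Hypothetical helper: wait up to timeout_ms milliseconds for a pending
 * connection on a listening socket.
 * Returns 1 if the socket is readable (an accept() should succeed),
 * 0 if the timeout expired first, and -1 on error. */
int connection_pending(int listen_fd, int timeout_ms)
{
    struct pollfd pfd;

    pfd.fd = listen_fd;
    pfd.events = POLLIN;    /* a pending connection shows up as readable */
    pfd.revents = 0;

    int rc = poll(&pfd, 1, timeout_ms);
    if (rc < 0)
        return -1;
    return (rc > 0 && (pfd.revents & POLLIN)) ? 1 : 0;
}
```

With the listening socket marked nonblocking, a server can call accept() whenever connection_pending() returns 1 and treat a failure with errno set to EAGAIN as "nothing to do yet" rather than as a fatal error.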

Here are the prototypes of listen() and accept().

 #include <sys/socket.h>

 int listen(int sock, int backlog);
 int accept(int sock, struct sockaddr * addr, socklen_t * addrlen);

Both of these functions expect the socket's file descriptor as the first parameter. listen()'s other parameter, backlog, specifies how many connections may be pending on the socket before further connection attempts are refused. Network connections are not established until the server has accept()ed the connection; until the accept(), the incoming connection is considered pending. By providing a small queue of pending connections, the kernel relaxes the need for server processes to be constantly prepared to accept() connections. Applications have historically set the maximum backlog to five, although a larger value may sometimes be necessary. listen() returns zero on success and nonzero on failure.

The accept() call changes a pending connection to an established connection. The established connection is given a new file descriptor, which accept() returns. The new file descriptor inherits its attributes from the socket that was listen()ed to. One unusual feature of accept() is that it returns networking errors that are pending as errors from accept().^[9] Servers should not abort when accept() returns an error if errno is one of ECONNABORTED, ENETDOWN, EPROTO, ENOPROTOOPT, EHOSTDOWN, ENONET, EHOSTUNREACH, EOPNOTSUPP, or ENETUNREACH. All of these should be ignored, with the server just calling accept() once more.

^[9] BSD variants do not have this behavior; on those systems the errors go unreported.

The addr and addrlen parameters point to data that the kernel fills in with the address of the remote (client) end of the connection. Initially, addrlen should point to an integer containing the size of the buffer addr points to. accept() returns a file descriptor, or less than zero if an error occurs, just like open().
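The retry rule for accept() can be sketched as a small wrapper. The names is_transient_accept_error() and accept_retry() are hypothetical, and the sketch assumes a Linux errno.h, since ENONET in particular is not defined on every system. It also assumes the descriptor really is a listening stream socket; on any other kind of socket EOPNOTSUPP would repeat indefinitely.

```c
#include <errno.h>
#include <sys/socket.h>

/* Hypothetical helper: returns 1 if err is one of the pending network
 * errors that a server should ignore after accept(), 0 otherwise. */
int is_transient_accept_error(int err)
{
    switch (err) {
    case ECONNABORTED:
    case ENETDOWN:
    case EPROTO:
    case ENOPROTOOPT:
    case EHOSTDOWN:
    case ENONET:
    case EHOSTUNREACH:
    case EOPNOTSUPP:
    case ENETUNREACH:
        return 1;
    default:
        return 0;
    }
}

/* Hypothetical wrapper: call accept() again whenever it fails with one
 * of the errors above. Returns the new descriptor, or -1 with errno set
 * for any other failure. Assumes listen_fd is a listening stream socket. */
int accept_retry(int listen_fd, struct sockaddr *addr, socklen_t *addrlen)
{
    /* Remember the caller's buffer size so every attempt starts with it. */
    socklen_t capacity = addrlen ? *addrlen : 0;

    for (;;) {
        if (addrlen)
            *addrlen = capacity;

        int fd = accept(listen_fd, addr, addrlen);
        if (fd >= 0 || !is_transient_accept_error(errno))
            return fd;
    }
}
```

A caller would typically declare a protocol-specific address structure, set a socklen_t to its size, and pass both to accept_retry(); on return the length holds the number of bytes the kernel actually stored.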

17.3.5. Connecting to a Server

Like servers, clients may bind() the local address to the socket immediately after creating it. Usually, the client does not care what the local address is and skips this step, allowing the kernel to assign it any convenient local address.

After the bind() step (which may be omitted), the client connect()s to a server.

 #include <sys/socket.h>

 int connect(int sock, struct sockaddr * servaddr, socklen_t addrlen);

The process passes to connect() the socket that is being connected, followed by the address to which the socket should be connected.
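As an illustration of the client side, the sketch below creates a stream socket, skips the optional bind() step, and connects to a server on the local machine. The helper name connect_loopback() and the use of an IPv4 loopback address are assumptions made for the example, not part of the protocol-independent interface described here.

```c
#include <arpa/inet.h>    /* htons(), htonl() */
#include <errno.h>
#include <netinet/in.h>   /* struct sockaddr_in, INADDR_LOOPBACK */
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: connect a new TCP socket to 127.0.0.1:port.
 * Returns the connected descriptor, or -1 with errno set on failure. */
int connect_loopback(unsigned short port)
{
    struct sockaddr_in servaddr;

    int sock = socket(PF_INET, SOCK_STREAM, 0);
    if (sock < 0)
        return -1;

    /* No bind(): the kernel assigns a convenient local address. */
    memset(&servaddr, 0, sizeof(servaddr));
    servaddr.sin_family = AF_INET;
    servaddr.sin_port = htons(port);
    servaddr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

    if (connect(sock, (struct sockaddr *) &servaddr, sizeof(servaddr)) < 0) {
        int saved = errno;   /* close() may overwrite errno */
        close(sock);
        errno = saved;
        return -1;
    }
    return sock;
}
```

On success the returned descriptor can be used with read() and write() like any other file descriptor.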

Figure 17.1 shows the system calls usually used to establish socket connections, and the order in which they occur.

Figure 17.1. Establishing Socket Connections

17.3.6. Finding Connection Addresses

After a connection has been established, applications can find the addresses for both the local and remote end of a socket by using getpeername() and getsockname().

 #include <sys/socket.h>

 int getpeername(int s, struct sockaddr * addr, socklen_t * addrlen);
 int getsockname(int s, struct sockaddr * addr, socklen_t * addrlen);

Both functions fill in the structures pointed to by their addr parameters with addresses for the connection used by socket s. The address for the remote side is returned by getpeername(), while getsockname() returns the address for the local part of the connection. For both functions, the integer pointed to by addrlen should be initialized to the amount of space pointed to by addr, and that integer is changed to the number of bytes in the address returned.
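The two calls can be sketched together in one helper that formats either end of a socket as text. The name format_endpoint() is hypothetical, and the sketch assumes an IPv4 socket so that the address fits in a struct sockaddr_in; other families are rejected rather than handled.

```c
#include <arpa/inet.h>    /* inet_ntop(), ntohs() */
#include <netinet/in.h>   /* struct sockaddr_in */
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: write "address:port" for one end of socket s
 * into buf. If remote is nonzero the peer address is used
 * (getpeername()); otherwise the local address (getsockname()).
 * Returns 0 on success, -1 on error or if the socket is not IPv4. */
int format_endpoint(int s, int remote, char *buf, size_t buflen)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);   /* space available; updated by the call */
    char ip[INET_ADDRSTRLEN];
    int rc;

    memset(&addr, 0, sizeof(addr));
    if (remote)
        rc = getpeername(s, (struct sockaddr *) &addr, &len);
    else
        rc = getsockname(s, (struct sockaddr *) &addr, &len);

    if (rc < 0 || addr.sin_family != AF_INET)
        return -1;
    if (inet_ntop(AF_INET, &addr.sin_addr, ip, sizeof(ip)) == NULL)
        return -1;

    snprintf(buf, buflen, "%s:%u", ip, (unsigned) ntohs(addr.sin_port));
    return 0;
}
```

On a socket that has not been connected, getpeername() fails, so format_endpoint() with a nonzero remote argument returns -1 until the connection has been established.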