Sockets Programming Paradigm

The Sockets paradigm involves several elements that must be understood to use it properly. Let's look at these elements in a hierarchical fashion.

At the top of the hierarchy is the host. This is a source or destination node on a network to or from which packets are sent or received. (Technically, we would refer to interfaces as the source or destination, as a host may provide multiple interfaces, but we're going to keep it simple here.) The host implements a set of protocols. These protocols define the manner in which communication occurs. Within each protocol is a set of ports. Each port defines an endpoint (the final source or destination). See Table 12.1 for a list of these elements (and Figure 12.2 for a graphical view of these relationships).

Table 12.1: Sockets Programming Element Hierarchy
Element	Description
Host (Interface)	Network address (a reachable network node)
Protocol	Specific protocol (such as TCP or UDP)
Port	Client or server process endpoint

Figure 12.2: Graphical view of host/protocol/port relationship.

Hosts

Hosts are identified by addresses, and for the Internet Protocol (IP), these are called IP addresses. An IPv4 (IP version 4) address is a 32-bit value, conventionally written as four 8-bit values separated by dots. A sample address can be illustrated as:

 192.168.1.1    or    0xC0A80101

The first notation, dotted decimal, is the more popular form of IPv4 address because it is easily readable. The second notation is the same address written as a single 32-bit hexadecimal value.

Protocol

The protocol specifies the details of communication over the socket. The two most common protocols used are the Transmission Control Protocol (TCP) and the User Datagram Protocol (UDP). TCP is a reliable, stream-based protocol, and UDP is a datagram (message)-based protocol that does not guarantee delivery. We'll provide additional details of these protocols in this chapter.

Port

The port is the endpoint for a given process (interface) for a protocol. This is the application's interface to the Sockets interface. Ports are unique on a host (not interface) for a given protocol. A port is commonly called bound when it is attached to a given socket.

Ports are numbers that fall into two basic ranges. Port numbers below 1024 are reserved for well-known services (and are called well-known ports). Port numbers 1024 and above are typically used by applications.

Note

The original intent was that service port numbers (such as those for FTP, HTTP, and DNS) would fall below port number 1024. Of course, the number of services exceeded that number long ago. Now, many system services occupy the port number space above 1024 (for example, NFS at port number 2049 and X11 at port number 6000).

Addressing

From this discussion, we see that a tuple uniquely identifies an endpoint from all other endpoints on a network. Consider the following tuple:

 { tcp, 192.168.1.1, 4097 }

This defines the endpoint on the host identified by the address 192.168.1.1 with the port 4097 using the TCP protocol.

The Socket

Simply put, a Socket can be defined as an endpoint of a communications channel between two applications. An example of this is defined as two tuples:

 { tcp, 192.168.1.1, 4097 }     { tcp, 10.0.0.1, 5820 }

Figure 12.3: Visualization of a Socket between two hosts.

The first item to note is that a socket is defined as an association of two endpoints that share the same protocol. The IP addresses are different here, but they don't have to be. We could communicate via sockets within the same host. The port numbers are different here, but they could be the same, provided that the two endpoints are not on the same host. Port numbers assigned by the TCP/IP stack are called ephemeral ports. This relationship is shown visually in Figure 12.3.

Client/Server Model

In most Sockets applications, there exists a server (which receives requests and provides responses) and a client (which makes requests to the server). The Sockets API (which we'll explore in the next section) provides calls that are specific to clients and to servers. Figure 12.4 illustrates two simple applications that implement a client and a server.

Figure 12.4: Client/server symmetry in Sockets applications.

The first step in a Sockets application is the creation of a socket. The socket is the communication endpoint that is created by the socket call. Note that in the sample flow (in Figure 12.4) both the server and client perform this step.

The server requires a bit more setup as part of registering a service with the host. The bind call binds an address and port to the server's socket so that it's known. Letting the system choose the port can result in a service that is difficult to find. If we choose the port, we know what it is. Once we've bound our port, we call the listen function for the server. This makes the server accessible (puts it in the listen mode).

We establish our connection next, using connect at the client and accept at the server. The connect call starts what's known as the three-way handshake, with the purpose of setting up a connection between the client and server. At the server, the accept call returns a new socket that represents the server's side of the connection to this client; the original listening socket remains available for further clients. Once accept finishes, a new socket connection exists between the client and server, and data flow can occur.

In the data transfer phase, we have an established socket over which communication can occur. Both the client and server can send and recv data asynchronously.

Finally, we can sever the connection between the client and server using the close call. This can occur asynchronously, but upon one endpoint closing the socket, the other side will automatically receive an indication of the closure.