The Programmer's Perspective: Sockets
As you know, the Internet is a very complex system, consisting of millions of servers speaking quite elaborate languages. Luckily, most of the Internet's complexity is hidden from the programmer by some cleverly designed APIs. You don't have to deal with TCP, UDP, or IP directly, and you don't need a separate tool for each of them. You don't even have to break the information into pieces manually. From a programmer's viewpoint, accessing the Internet is pretty much like accessing a file: You can "open" a site, read from it, write to it, and so on. All this elegance is achieved through an abstraction layer that makes network programming a breeze. This abstraction layer is called the socket interface and was introduced in the 1980s at the University of California, Berkeley.
A socket is simply an input/output device (such as a file, a modem, and so on) that opens a communication pipeline between two sites. To transfer information, both sites must keep a socket open (and aimed at the other). We will soon discover that establishing the communication channel can sometimes be a tricky process that requires some specific programming. But once the socket is working, sending data back and forth is as simple as writing it to the socket: what is written at one endpoint appears at the other. Sockets can operate in either TCP or UDP mode. In both modes, they automatically route information between both sites. For TCP, the typical partition/reassembly routines are performed, as well as packet numbering and reordering. So, data is recovered at the destination endpoint in the same sequence as it was sent, much like in a FIFO queue. The socket internally reassembles the data so we don't have to bother with transmission issues. The UDP socket interface is more lightweight and offers greater speed and flexibility, but in exchange it does not guarantee that data arrives, or that it arrives in the order it was sent.
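The FIFO behavior of a TCP socket can be seen in a minimal sketch like the one below. It is only an illustration under stated assumptions: both endpoints live in the same process and talk over the loopback address, and the helper names `recv_exact` and `tcp_roundtrip` are made up for this example. Note that TCP preserves byte order but not message boundaries, so the receiver simply reads until it has the expected number of bytes.

```python
import socket


def recv_exact(sock, n):
    """Read exactly n bytes; a single recv() may return fewer than requested."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("peer closed the connection early")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def tcp_roundtrip(messages):
    """Send small messages over a loopback TCP connection; return what arrives.

    Hypothetical helper: keep the messages small, because nothing reads
    from the receiving end until every message has been written.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    listener.listen(1)

    sender = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sender.connect(listener.getsockname())
    receiver, _addr = listener.accept()
    try:
        for message in messages:
            sender.sendall(message)  # writing to the socket is all it takes
        total = sum(len(message) for message in messages)
        return recv_exact(receiver, total)
    finally:
        sender.close()
        receiver.close()
        listener.close()
```

Calling `tcp_roundtrip([b"move ", b"north ", b"fire"])` returns the three pieces joined in the order they were written, which is the stream behavior described above.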
Data transfer using the Internet is not very different from traditional file input/output (I/O). Establishing the communication is, on the other hand, a bit more complicated. To understand this complexity, we need to classify networked applications into two broad groups called clients and servers.
A client application (such as a web browser) is typically an endpoint of the communications network. It always works in connection with a server and consumes data transferred from the server to the client. Sometimes clients send data back to the server (commands, and so on), but the main raison d'être of a client is reading data provided by the server. Generally speaking, a client is connected to one (and only one) server, which acts as the data provider.
Servers, on the other hand, can be connected to many different clients simultaneously. Think of a web server such as Yahoo! and the multitude of connections it is simultaneously serving. In a more game-related scenario, think of the game server in a massively multiplayer game. Individual players use clients to retrieve data about the game world and to update the game server with their current position and status.
Clearly, communications on the client's side are very simple. One socket connects us to our game server, so data transfer takes place using a well-established route. Establishing such a connection is also pretty simple. Game servers, on the other hand, have a harder time keeping up with all the incoming connections. A new player might join the game at any time, or another player might abandon it, either by quitting the game or because of a network problem. In these server-side scenarios, communications get more complicated, and some careful code planning is required.
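To make the server-side bookkeeping concrete, here is a rough sketch of one common way to track players joining and leaving from a single thread, using Python's `selectors` module. It is not necessarily the design used later in this text. The class name `TinyServer`, its `clients` dictionary, and the `poll` method are all hypothetical names chosen for this illustration.

```python
import selectors
import socket


class TinyServer:
    """Hypothetical single-threaded server that tracks many clients at once."""

    def __init__(self, host="127.0.0.1", port=0):
        self.selector = selectors.DefaultSelector()
        self.listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self.listener.bind((host, port))  # port 0: the OS picks a free port
        self.listener.listen()
        self.listener.setblocking(False)
        self.selector.register(self.listener, selectors.EVENT_READ)
        self.address = self.listener.getsockname()
        self.clients = {}  # client socket -> bytes received so far

    def poll(self, timeout=1.0):
        """Handle whatever is ready: new players, incoming data, departures."""
        for key, _events in self.selector.select(timeout):
            sock = key.fileobj
            if sock is self.listener:
                try:
                    conn, _addr = self.listener.accept()  # a player joins
                except BlockingIOError:
                    continue
                conn.setblocking(False)
                self.selector.register(conn, selectors.EVENT_READ)
                self.clients[conn] = b""
                continue
            try:
                data = sock.recv(4096)
            except BlockingIOError:
                continue  # spurious wake-up; try again on the next poll
            except ConnectionError:
                data = b""  # treat a reset like a departure
            if data:
                self.clients[sock] += data
            else:
                # Empty read: the player quit or the network failed.
                self.selector.unregister(sock)
                sock.close()
                del self.clients[sock]

    def close(self):
        for sock in list(self.clients):
            self.selector.unregister(sock)
            sock.close()
        self.clients.clear()
        self.selector.unregister(self.listener)
        self.listener.close()
        self.selector.close()
```

A game loop would call `poll` once per iteration with a short timeout. The point of the sketch is that the server cannot assume a fixed set of connections: every pass may add a client, receive data from any of them, or drop one.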
Clients and servers can both be connection-oriented (thus coded with TCP) or connectionless (coded with UDP). In the following section, we will take a look at both TCP and UDP clients, and then move on to the server side.
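As a preview of the difference, the sketch below shows the connectionless style: a UDP socket is created with `SOCK_DGRAM` instead of `SOCK_STREAM`, and each datagram is addressed individually with `sendto`, with no connect/accept step. The function name `udp_exchange` is made up for this example, and it assumes both sockets sit on the loopback address, where datagrams are delivered in practice. On a real network they may be lost or reordered.

```python
import socket


def udp_exchange(messages, timeout=2.0):
    """Send each message as one UDP datagram over loopback; return what arrives.

    Hypothetical helper for illustration only.
    """
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    receiver.settimeout(timeout)     # do not wait forever for a lost datagram

    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        for message in messages:
            # No connection is established; every datagram carries its target.
            sender.sendto(message, receiver.getsockname())
        received = []
        for _ in messages:
            data, _addr = receiver.recvfrom(2048)  # one call, one datagram
            received.append(data)
        return received
    finally:
        sender.close()
        receiver.close()
```

Unlike a TCP stream, each `recvfrom` call here returns one whole datagram, so message boundaries are preserved even though delivery itself is not guaranteed.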