TCP/IP Architecture

There are four levels to understanding how Java applications communicate over the network using TCP. Figure 23.1 shows these levels in their simplest form.

Figure 23.1. The TCP/IP network model can be broken down into four layers.

When viewed as a layered model, TCP/IP is usually seen as being composed of four layers, each playing a specific role:
Data within these layers is usually encapsulated with a common mechanism: protocols have a header that identifies meta-information (such as the source, destination, and other attributes) and a data portion that contains the actual information. The protocols from the upper layers are encapsulated within the data portion of the lower ones. When traveling back up the protocol stack, the information is reconstructed as it is delivered to each layer. Figure 23.2 shows this concept of encapsulation.

Figure 23.2. As data moves through the TCP/IP layers, it is encapsulated.

TCP/IP Protocols

Three protocols are most commonly used within the TCP/IP scheme, and a closer investigation of their properties is warranted. Understanding how these three protocols (IP, TCP, and UDP) interact is critical to developing network applications.

Internet Protocol (IP)

IP is the keystone of the TCP/IP suite. All data on the Internet flows through IP packets, the basic unit of IP transmissions. IP is a connectionless, unreliable protocol. As a connectionless protocol, IP does not exchange control information before transmitting data to a remote system. Data packets are merely sent to the destination with the expectation that they will be treated properly. IP is unreliable because it does not retransmit lost packets or detect corrupted data. These tasks must be implemented by higher-level protocols, such as TCP.

IP defines a universal addressing scheme called IP addresses. An IP address is a 32-bit number, and each standard address is unique on the Internet. Given an IP packet, the information can be routed to the destination based upon the IP address defined in the packet header. IP addresses are generally written as four numbers, each between 0 and 255, separated by periods (for example, 128.148.157.6). Although a 32-bit number is an appropriate way for computers to address systems, humans understandably have difficulty remembering them.
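To make the relationship between the dotted notation and the underlying 32-bit number concrete, here is a minimal Java sketch. The class and method names (DottedQuad, toNumber, toDotted) are hypothetical, chosen only for illustration, and the value is held in a long purely to avoid dealing with Java's signed int.

```java
// Illustrative sketch only: DottedQuad, toNumber, and toDotted are hypothetical names.
class DottedQuad {

    // Packs "a.b.c.d" into the low 32 bits of a long, most significant number first.
    static long toNumber(String dotted) {
        String[] parts = dotted.split("\\.");
        if (parts.length != 4) {
            throw new IllegalArgumentException("expected four numbers: " + dotted);
        }
        long value = 0;
        for (String part : parts) {
            int octet = Integer.parseInt(part);
            if (octet < 0 || octet > 255) {
                throw new IllegalArgumentException("number out of range 0-255: " + part);
            }
            value = (value << 8) | octet; // shift earlier numbers up, append this one
        }
        return value;
    }

    // Unpacks the low 32 bits of the value back into dotted notation.
    static String toDotted(long value) {
        return ((value >> 24) & 0xFF) + "."
             + ((value >> 16) & 0xFF) + "."
             + ((value >> 8) & 0xFF) + "."
             + (value & 0xFF);
    }

    public static void main(String[] args) {
        long n = toNumber("128.148.157.6");
        System.out.println(n + " -> " + toDotted(n));
    }
}
```

Each of the four numbers occupies eight bits, which is why each is limited to the range 0 to 255 and why four of them together fill exactly 32 bits.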
To give people more memorable identifiers, a system called the Domain Name System (DNS) was developed to map IP addresses to more intuitive names and vice versa. For example, you can use http://www.netspace.org instead of 128.148.157.6. It is important to realize that these domain names are not used or understood by IP. When an application wants to transmit data to another machine on the Internet, it must first translate the domain name to an IP address using the DNS. A receiving application can perform a reverse translation, using the DNS to return a domain name given an IP address. There is not a one-to-one correspondence between IP addresses and domain names: a domain name can map to multiple IP addresses, and multiple domain names can map to the same IP address.

Caution

Even more important to note is that DNS data cannot be blindly trusted. Various systems throughout the world are responsible for maintaining DNS records. DNS servers can be tricked, and servers can be set up that are populated with false information.

Transmission Control Protocol (TCP)

Most Internet applications use TCP to implement the transport layer. TCP provides a reliable, connection-oriented, continuous-stream protocol. These characteristics are described here:
Because of these characteristics, it is easy to see why TCP would be used by most Internet applications. TCP makes it easy to create a network application, freeing you from worrying about how the data is broken up or about coding error-correction routines. However, TCP requires a significant amount of overhead, and you might want to code routines that provide reliable transmission more efficiently, given the parameters of your application. Furthermore, retransmission of lost data might be inappropriate for your application, because the information's usefulness might have expired. In these instances, UDP serves as an alternative. A later section of this chapter, "User Datagram Protocol (UDP)," describes this protocol.

An important addressing scheme that TCP defines is the port. Ports separate the various TCP communication streams that are running concurrently on the same system. For server applications, which wait for TCP clients to initiate contact, a specific port can be established on which the server listens for incoming connections.

These concepts come together in a programming abstraction known as sockets, which is the subject of the next section.
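As a brief preview of how addresses and ports fit together, the following minimal sketch uses the standard java.net classes. The class name LoopbackPortDemo and the method roundTrip are hypothetical, and the example assumes the loopback interface is available. A server socket listens on a port chosen by the operating system, a client connects to that address and port, and one line of text travels across the TCP connection. Both ends run in a single thread only to keep the sketch short; a real server would normally accept connections in its own thread or process.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Illustrative sketch only: LoopbackPortDemo and roundTrip are hypothetical names.
class LoopbackPortDemo {

    // Sends one line from a client socket to a server socket on the given host
    // and returns the line as the server side received it.
    static String roundTrip(String host, String message) {
        try {
            // Translate the host string to an IP address. A literal such as
            // "127.0.0.1" is parsed directly; a host name would require a lookup.
            InetAddress address = InetAddress.getByName(host);

            // Port 0 asks the operating system to pick any free port.
            try (ServerSocket server = new ServerSocket(0, 1, address)) {
                int port = server.getLocalPort();

                // The client names the server by address and port; the pending
                // connection waits in the server's queue until accept() is called.
                try (Socket client = new Socket(address, port);
                     Socket accepted = server.accept()) {

                    OutputStream out = client.getOutputStream();
                    out.write((message + "\n").getBytes(StandardCharsets.UTF_8));
                    out.flush();

                    BufferedReader in = new BufferedReader(new InputStreamReader(
                            accepted.getInputStream(), StandardCharsets.UTF_8));
                    return in.readLine();
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("127.0.0.1", "hello over TCP"));
    }
}
```

Because the port number distinguishes this stream from every other TCP stream on the machine, several such servers could run side by side on the same address, each listening on a different port.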