TCP/IP

The Internet runs over TCP/IP, the Transmission Control Protocol/Internet Protocol. TCP/IP is actually a suite of protocols, each performing a particular role to let computers speak the same language. TCP/IP is universally available and is almost certainly running on the computers you use at work and at home. This is true regardless of LAN protocols, because LAN vendors have implemented TCP/IP compatibility in their products. For example, the latest Novell NetWare product can speak TCP/IP.

TCP/IP was designed by the Defense Advanced Research Projects Agency (DARPA) in the 1970s. The design goal was to let dissimilar computers communicate freely, regardless of location. Most early TCP/IP work was done on UNIX computers, which contributed to the protocol's popularity as vendors began shipping TCP/IP software with every UNIX computer. As a technology, TCP/IP maps to the OSI reference model, as shown in Figure 2-10.

Figure 2-10: The TCP/IP stack is compliant with the seven-layer reference model

Looking at Figure 2-10, you can see that TCP/IP focuses on layers 3 and 4 of the OSI reference model. The theory is to leave network technologies to the LAN vendors. TCP/IP's goal is to move messages through virtually any LAN product to set up a connection running virtually any network application.

TCP/IP works because it closely maps to the OSI model at the lowest two levels: the data-link and physical layers. This lets TCP/IP talk to virtually any networking technology and, indirectly, any type of computer platform. TCP/IP's four abstract layers are:

  • Network interface This allows TCP/IP to interact with all modern network technologies by complying with the OSI model.

  • Internet This defines how IP directs messages through routers over internetworks such as the Internet.

  • Transport This defines the mechanics of how messages are exchanged between computers.

  • Application This defines network applications to perform tasks such as file transfer, e-mail, and other useful functions.

TCP/IP is the de facto standard that unifies the Internet. A computer that implements an OSI-compliant network technology, such as Ethernet or Token Ring, has overcome incompatibilities that would otherwise exist between platforms such as Windows, UNIX, Macintosh, IBM mainframes, and others. We've already covered layers 1 and 2 in our discussion of LAN technologies that connect groups of computers together in a location. Now we'll cover how computers internetwork over the Internet or private internetworks.

TCP/IP Messaging

All data that goes over a network must have a format so that devices know how to handle it. TCP/IP's Internet layer, which maps to the OSI model's network layer, is based on a fixed message format called the IP datagram: the bucket that holds the information making up the message. For example, when you download a Web page, the stuff you see on the screen was delivered inside datagrams.

Closely related to the datagram is the packet. Whereas a datagram is a unit of data, a packet is a physical entity consisting of a message unit that passes through the internetwork. People often use the terms interchangeably; the distinction is only important in certain narrow contexts. The key point is that most messages are sent in pieces and reassembled at the receiving end.

For example, when you send an e-mail to someone, it goes over the wire as a stream of packets. A small e-mail might be only ten packets; a big one may be split into thousands. At the opposite extreme, a request-for-service message might take only a single packet.

One advantage of this approach is that if a packet is corrupted during transmission, only that packet need be resent, not the entire message. Another advantage is that no single host is forced to wait an inordinate length of time for another's transmission to complete before being able to transmit its own message.
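The idea of sending a message in pieces can be sketched in a few lines of Python. This is only an illustration, not how a real TCP/IP stack is written: the function names, the 16-byte chunk size, and the choice of which packet gets "lost" are all assumptions made for the example.

```python
import random

def packetize(message, size):
    """Split a message into (sequence_number, chunk) pairs of at most `size` bytes."""
    return [(seq, message[start:start + size])
            for seq, start in enumerate(range(0, len(message), size))]

def missing_sequence_numbers(received, expected_count):
    """Return the sequence numbers that never arrived, so only those are resent."""
    arrived = {seq for seq, _ in received}
    return [seq for seq in range(expected_count) if seq not in arrived]

def reassemble(packets):
    """Put packets back in sequence order and join their payloads."""
    return b"".join(chunk for _, chunk in sorted(packets))

message = b"A small e-mail travels over the wire as a stream of packets."
packets = packetize(message, size=16)

# Simulate the network: packets arrive out of order and one is lost.
arrived = [p for p in packets if p[0] != 2]
random.shuffle(arrived)

# Only the missing packet is resent, not the whole message.
for seq in missing_sequence_numbers(arrived, len(packets)):
    arrived.append(packets[seq])

print(reassemble(arrived) == message)
```

Because each piece carries its own sequence number, the receiver can tell which piece is missing and rebuild the message regardless of arrival order.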

TCP vs. UDP as Transport Protocols

An IP message travels using either of two transport protocols: TCP or UDP. TCP stands for Transmission Control Protocol, the first half of the TCP/IP acronym. UDP stands for User Datagram Protocol, used in place of TCP for less critical messages. Either protocol provides the transport services necessary to shepherd messages through TCP/IP internetworks. TCP is called a reliable protocol because it checks with the receiver to make sure each packet was received: the receiver sends back an ACK (acknowledgment) message, and the sender retransmits anything that goes unacknowledged. UDP is called unreliable or connectionless because no effort is made to confirm delivery. Both transport technologies operate at layer 4 of the OSI stack.

Don't let the name TCP/IP throw you. TCP has no involvement in a UDP message. And while we're at it, don't let the name User Datagram Protocol throw you either. An IP message sent through TCP contains an IP datagram just like a UDP message does.

A key point to know is that only one transport protocol can be used to manage a message. For example, when you download a Web page, the packets are handled by TCP with no involvement from UDP. Conversely, a Trivial File Transfer Protocol (TFTP) upload or download is handled entirely by the UDP protocol.

Which transport protocol is used depends on the network application (e-mail, HTTP Web page downloads, network management, and so on). As we'll discuss, network software designers will use UDP where possible because it generates less overhead traffic. TCP goes to greater lengths to assure delivery and sends many more packets than UDP to manage connections. Figure 2-11 shows a sampling of network applications to illustrate the division between the TCP and UDP transports.

Figure 2-11: TCP and UDP handle different network applications (port numbers)

The examples in Figure 2-11 highlight a few good points. First, FTP and TFTP do essentially the same thing: transfer data files. The major difference is that TFTP is mainly used to download and back up network device software, and it uses UDP because failure of such a message is tolerable (TFTP payloads aren't for end users, but for network administrators, who are lower priority). Second, the Domain Name System (DNS), the service that translates domain names to IP addresses, uses UDP for most client-to-server name lookups and TCP for server-to-server zone transfers and for responses too large for UDP. However, it uses only one of the two for any particular DNS exchange.
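The division between the two transports can be written down as a small lookup table. The sketch below is not copied from Figure 2-11; the entries are standard port assignments, and the table and function names are assumptions made for illustration. It also shows how the choice surfaces in socket programming, where TCP corresponds to a stream socket and UDP to a datagram socket.

```python
import socket

# Application -> (usual transport, assigned port number).
APPLICATIONS = {
    "FTP":    ("TCP", 21),
    "Telnet": ("TCP", 23),
    "SMTP":   ("TCP", 25),
    "HTTP":   ("TCP", 80),
    "DNS":    ("UDP", 53),   # DNS can also run over TCP port 53
    "TFTP":   ("UDP", 69),
    "SNMP":   ("UDP", 161),
}

# How each transport is requested from the operating system's socket API.
SOCKET_TYPES = {"TCP": socket.SOCK_STREAM, "UDP": socket.SOCK_DGRAM}

def transport_for(application):
    """Return (transport name, port, socket type) for a known application."""
    transport, port = APPLICATIONS[application]
    return transport, port, SOCKET_TYPES[transport]

for name in ("FTP", "TFTP"):
    transport, port, _ = transport_for(name)
    print(f"{name}: {transport} port {port}")
```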

The IP Datagram Format

The datagram is the basic unit of data inside IP packets. The datagram's format provides fields both for message handling and for the payload data. The datagram layout is depicted in Figure 2-12. Don't be misled by the proportions of the fields in the figure; the data field is by far the largest field in most packets.

Figure 2-12: The IP datagram format is variable in length

When computers connect, the packets making up the connection contain the sender's IP address in addition to the destination address. They also contain additional information, such as an instruction to download a Web page. The other 12 packet fields are for handling purposes.

A key fact about IP packets is that they are variable in length. For example, in Ethernet LANs, one packet might be 200 bytes long, another 1,400 bytes. IP packets can grow as large as 4,000 bytes on Token Ring networks.

Note 

We're talking bytes instead of bits in this context because datagrams contain data, and computers prefer dealing with bytes. On the other hand, when discussing traffic streaming, say, over a cable, the unit of measure is bits, the preferred measure of networking.
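A quick conversion shows how the two units relate. The sketch below assumes a hypothetical 10 Mbps link and ignores framing overhead; the function name is made up for the example.

```python
def transmit_time_ms(packet_bytes, link_bits_per_second):
    """Time to clock a packet onto the wire, in milliseconds."""
    bits = packet_bytes * 8          # 8 bits per byte
    return bits / link_bits_per_second * 1000

# A 1,400-byte packet is 11,200 bits on the wire.
print(f"{transmit_time_ms(1400, 10_000_000):.2f} ms")
```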

A packet has two basic parts: header information and data. The data portion of the packet holds the cargo: the payload that's being sent across the network. The header contains housekeeping information needed by routers and computers to handle the packet and keep it in order with the other packets making up the whole message.

Keep in mind that the CPU handling the packet needs to know where each field starts down to the exact bit position; otherwise, the entire thing is just a bunch of meaningless 0's and 1's. Notice that the three fields that can vary in length are placed toward the right side of the format. If variable-length fields were to the left in the format, it would be impossible for computers to know where the subsequent fields begin. The IP datagram fields are as follows:

  • VER The version of IP being used by the station that originated the message. The most widely deployed version is IP version 4 (IPv4), which is the format described here. This field lets different versions coexist in an internetwork.

  • HLEN For header length, this tells the receiver how long the header is (counted in 32-bit words) so that the CPU knows where the data field begins.

  • Service Type A code to tell the router how the packet should be handled in terms of level of service (reliability, precedence, delay, and so on).

  • Length The total number of bytes in the entire packet, including all header fields and the data field.

  • ID, Flags, and Frag Offset These fields tell routers and the receiving host how to handle packet fragmentation and reassembly when a packet must be split to fit the different frame sizes it encounters as it travels through LAN segments using different networking technologies (Ethernet, FDDI, and so on).

  • TTL Stands for Time to Live, a number that is decremented by one each time the packet is forwarded by a router. When the counter reaches zero, the packet is dropped. TTL keeps packets that are lost or caught in routing loops from endlessly wandering internetworks.

  • Protocol The transport protocol that should be used to handle the packet. This field usually identifies TCP (protocol number 6) or UDP (protocol number 17) as the transport protocol to use, but certain other protocols, such as ICMP, can also be carried in IP packets.

  • Header Checksum A checksum is a numerical value used to help assure integrity. The IP header checksum covers only the header of the individual packet. If the value the receiving station computes doesn't match the value in this field, the station knows that the header was garbled and discards the packet.

  • Source IP Address The 32-bit address of the host that originated the message (usually a PC or a server).

  • Destination IP Address The 32-bit address of the host to which the message is being sent (usually a PC or a server).

  • IP Options Used for network testing and other specialized purposes.

  • Padding Fills in any unused bit positions so that the CPU can correctly identify the first bit position of the data field.

  • Data The payload being sent. For example, a packet's data field might contain some of the text making up an e-mail.
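The fixed positions of the header fields are what make them machine-readable. The following is a minimal sketch, assuming a 20-byte IPv4 header with no IP Options, that builds a sample header and then unpacks it with Python's struct module. The addresses, ID value, and function names are invented for the example.

```python
import socket
import struct

IPV4_HEADER = "!BBHHHBBH4s4s"   # 20 bytes, network byte order, no options

def internet_checksum(data):
    """One's-complement sum of 16-bit words, complemented (the IP header checksum)."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:                       # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def build_header(src, dst, payload_len, ttl=64, protocol=6):
    """Build a sample header: version 4, HLEN 5 words, checksum filled in."""
    fields = [(4 << 4) | 5, 0, 20 + payload_len, 0x1234, 0, ttl, protocol, 0,
              socket.inet_aton(src), socket.inet_aton(dst)]
    fields[7] = internet_checksum(struct.pack(IPV4_HEADER, *fields))
    return struct.pack(IPV4_HEADER, *fields)

def parse_header(header):
    """Unpack the fixed fields of an IPv4 header into a dictionary."""
    (ver_hlen, service, length, ident, flags_frag, ttl, protocol,
     _checksum, src, dst) = struct.unpack(IPV4_HEADER, header[:20])
    hlen_bytes = (ver_hlen & 0x0F) * 4       # HLEN counts 32-bit words
    return {
        "version": ver_hlen >> 4,
        "hlen_bytes": hlen_bytes,
        "service_type": service,
        "length": length,
        "id": ident,
        "flags": flags_frag >> 13,
        "frag_offset": flags_frag & 0x1FFF,
        "ttl": ttl,
        "protocol": protocol,
        # A valid header sums to zero when the checksum field is included.
        "checksum_ok": internet_checksum(header[:hlen_bytes]) == 0,
        "source": socket.inet_ntoa(src),
        "destination": socket.inet_ntoa(dst),
    }

header = build_header("192.168.1.10", "10.0.0.5", payload_len=100, protocol=17)
print(parse_header(header))
```

Note that VER and HLEN share a single byte, and Flags and Frag Offset share a 16-bit word, which is why the parser uses shifts and masks to separate them.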

Port Numbers

A port number identifies the network application to the upper layers of the stack. For example, each packet in an e-mail transmission contains the port number 25 in its header to indicate the Simple Mail Transfer Protocol (SMTP). There are hundreds of assigned port numbers. The Internet Assigned Numbers Authority (IANA) coordinates port number assignments according to the following system:

  • Numbers 1023 and below "Well-known" ports assigned to public applications (such as SMTP) and to companies to identify network application products

  • Numbers 1024 to 49151 Registered ports, assigned for use by specific companies and applications

  • Numbers 49152 to 65535 Dynamic (private) ports, chosen on the fly by the client software for the duration of a connection
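The three ranges above can be expressed as a simple classifier. This is an illustrative sketch; the function name and the range labels are assumptions, not IANA terminology taken from the text.

```python
def port_range(port):
    """Classify a port number into one of the three IANA ranges."""
    if not 0 <= port <= 65535:
        raise ValueError("port numbers are 16-bit values (0 to 65535)")
    if port <= 1023:
        return "well-known"
    if port <= 49151:
        return "registered"
    return "dynamic"

for port in (25, 80, 8080, 50000):
    print(port, port_range(port))
```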

Port numbers help the stations keep track of various connections being processed simultaneously. They are also useful to other devices: for security reasons, most firewalls are configured to read the port numbers in every packet header.

Many beginners are confused as to exactly how port numbers are used. For example, if you're trying to connect to a Web server from your PC, you might think that both end-stations would use port 80 (HTTP) to conduct a Web page download. In fact, the requesting client uses a random port number in the request packet's source port field and uses assigned HTTP port number 80 only in the destination port field. Figure 2-13 demonstrates how port numbers are used during a transmission.

Figure 2-13: Port numbers identify the network application the message is using

The client uses a random port number to help keep track of conversations during a connection. A conversation is a discrete port-to-port transaction between end-stations. There can be any number of conversations within a single connection.

Looking at Figure 2-13, the page downloaded in step 2 may have included one of those annoying embedded HTML commands that automatically creates a new browser window without your asking for it (a pop-up). The pop-up window requests that a new page be downloaded, thereby creating a whole new stream of HTML code, text, GIFs, and JPEGs to handle; in other words, a second conversation.

At the server end, however, a widely recognized port number, such as 80 for HTTP, must be used; otherwise, the thousands of hosts hitting the Web server would have no idea what application to ask for.
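The asymmetry between the client's port and the server's port can be observed on a single machine using the loopback address. The sketch below makes one simplifying assumption that differs from a real Web server: the listening socket asks the operating system for any free port (port 0) instead of binding to port 80, which normally requires administrator privileges. The function name is hypothetical.

```python
import socket

def demo_ports():
    """Open a loopback TCP connection and report the port numbers on each side."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server, \
         socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client:
        server.bind(("127.0.0.1", 0))        # stand-in for a fixed port such as 80
        server.listen(1)
        server_port = server.getsockname()[1]

        client.settimeout(5)
        client.connect(("127.0.0.1", server_port))
        client_source_port = client.getsockname()[1]   # picked by the OS

        server.settimeout(5)
        conn, peer = server.accept()
        conn.close()
        return server_port, client_source_port, peer[1]

server_port, client_port, port_seen_by_server = demo_ports()
print("destination port:", server_port)
print("client source port:", client_port)
print("source port as seen by the server:", port_seen_by_server)
```

The client never chooses its own source port here; the operating system assigns one, and the server uses it to address its replies to the right conversation.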

The Transport Layer

The way packets are handled differs according to the type of traffic. There are two techniques for sending packets over a TCP/IP internetwork: connection-oriented and connectionless. In the strict sense, of course, a connection is made whenever a packet reaches its destination. Connection-oriented and connectionless refer to the level of effort and control that is applied to handling a message.

Every packet that goes over an internetwork consumes bandwidth, including overhead traffic. Connection assurance mechanisms are not used for certain types of TCP/IP traffic in order to minimize overhead packets where tolerable. Discrimination in packet handling is achieved by the choice of transport protocol:

  • TCP The connection-oriented mechanism to transport IP packets through an internetwork

  • UDP The connectionless mechanism for transporting packets

The primary difference between the two is that TCP requires an ACK message from the receiver that acknowledges the successful completion of each step of a transmission, while UDP does not. That's why UDP is often called the connectionless transport. Because UDP is connectionless, it's faster and more efficient than TCP. UDP is used for network applications where it is considered tolerable for the application to retransmit should the message fail.

Both TCP and UDP operate at layer 4 of the OSI stack, just above the IP network layer. Internetworks run TCP and UDP traffic simultaneously, but an individual message may be sent using only one of the two. The difference between the two is manifested in the format of the transport wrapper around the data, called a segment. When a message is sent over an IP network, its data is wrapped in either a TCP segment or a UDP segment, each segment is carried in the data field of an IP datagram, and the packets are handled according to the rules of that particular transport protocol. The segment header holds the data used to manage the packet's transport through the internetwork. Keep in mind that this is not payload data, but information used to manage transportation of the packets.

A packet sent through a TCP connection has a much longer header than one traveling through UDP. The extra fields in the TCP header contain information used for establishing connections and handling errors. TCP is the subsystem responsible for establishing and managing IP connections, and it uses a sophisticated handshake procedure to make sure the two end-stations are properly set up for the transmission. For example, when you click a hyperlink to jump to a new Web page, TCP springs into action to "shake hands" with that Web server so that the page is downloaded properly. TCP also has procedures for monitoring transmissions and error recovery.
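The cost of the longer header can be put in numbers. The sketch below assumes minimum header sizes with no options (20 bytes for IPv4, 20 bytes for TCP, 8 bytes for UDP) and a hypothetical 100-byte payload; it counts only per-packet header bytes, not the extra handshake and ACK packets TCP also sends.

```python
IP_HEADER_BYTES = 20    # minimum IPv4 header, no options
TCP_HEADER_BYTES = 20   # minimum TCP header, no options
UDP_HEADER_BYTES = 8    # UDP header is always 8 bytes

def overhead_percent(payload_bytes, transport_header_bytes):
    """Share of each packet taken up by IP and transport headers."""
    total = IP_HEADER_BYTES + transport_header_bytes + payload_bytes
    return 100 * (total - payload_bytes) / total

payload = 100
print(f"TCP: {overhead_percent(payload, TCP_HEADER_BYTES):.1f}% header overhead")
print(f"UDP: {overhead_percent(payload, UDP_HEADER_BYTES):.1f}% header overhead")
```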

The TCP Segment Format

TCP segments are carried inside IP datagrams when transport is managed by the TCP protocol. The TCP segment format, depicted in Figure 2-14, holds certain pieces of data for establishing TCP connections and managing packet transport.

Figure 2-14: The TCP packet segment holds data used to closely manage packet transport

The data fields in the TCP segment reflect the protocol's focus on establishing and managing network connections. Each is used to perform a specific function that contributes to assuring that a connection runs smoothly:

  • Source port The application port used by the sending host.

  • Destination port The application port used by the receiving host.

  • Sequence number Positions the packet's data to fit in the overall packet stream.

  • Acknowledgment number Contains the sequence number of the next byte the receiver expects, thereby implicitly acknowledging receipt of everything before it.

  • HLEN For header length, tells the receiver how long the header will be so that the CPU knows where the data field begins.

  • Reserved Bits reserved for future use by the Internet Engineering Task Force (IETF).

  • Code bits Contains flag bits that control the connection, such as SYN (synchronize) to set up a connection, ACK to acknowledge received data, and FIN (finish) to terminate a connection.

  • Window Contains the number of bytes the receiving station can buffer or the number of bytes to be sent. This field sets a "capacity window" to ensure that the sender does not overwhelm the receiver with too many packets all at once.
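As with the IP header, the TCP header fields sit at fixed offsets and can be unpacked directly. This is a minimal sketch assuming a 20-byte header with no TCP options. The standard header also ends with a checksum and an urgent pointer, which are unpacked here even though the list above doesn't discuss them. The sample port, sequence, and window values are invented.

```python
import struct

TCP_HEADER = "!HHIIHHHH"   # 20 bytes, network byte order, no options
FLAG_NAMES = ["FIN", "SYN", "RST", "PSH", "ACK", "URG"]   # bit 0 upward

def parse_tcp_header(segment):
    """Unpack the fixed fields of a TCP segment header into a dictionary."""
    (src_port, dst_port, seq, ack, offset_flags, window,
     _checksum, _urgent) = struct.unpack(TCP_HEADER, segment[:20])
    flags = offset_flags & 0x3F              # low six bits are the code bits
    return {
        "source_port": src_port,
        "destination_port": dst_port,
        "sequence_number": seq,
        "acknowledgment_number": ack,
        "hlen_bytes": (offset_flags >> 12) * 4,   # HLEN counts 32-bit words
        "code_bits": [name for bit, name in enumerate(FLAG_NAMES)
                      if flags & (1 << bit)],
        "window": window,
    }

# A sample connection request: client port 49200 to HTTP port 80, SYN set.
syn = struct.pack(TCP_HEADER, 49200, 80, 1000, 0, (5 << 12) | 0x02, 8192, 0, 0)
print(parse_tcp_header(syn))
```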

Establishing a TCP Connection

The TCP connection process is often referred to as the "three-way handshake" because three segments are exchanged: the client sends a SYN, the receiving station replies with a single segment that carries both its own SYN and an ACK of the client's SYN, and the client answers with an ACK. The steps in Figure 2-15 show a couple of the TCP segment fields in action. The first TCP segment's sequence number serves as the initial sequence number, the base number used to keep subsequent packets in proper sequence. The Sequence field is used for reassembling out-of-sequence packets into a cogent message at the receiving end.

Figure 2-15: The three-way TCP handshake process passes SYN and ACK

The example in Figure 2-15 shows a PC connecting to a Web server. But any type of end-stations could be talking: a server connecting to another server to perform an e-commerce transaction, two PCs connecting for an IRC (Internet Relay Chat) session, or any connection between two end-stations over an IP network.
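The way sequence and acknowledgment numbers move during connection setup can be traced with a small simulation. This sketch models standard TCP behavior, in which three segments are exchanged and the server's reply carries both SYN and ACK. It does not open a real connection, and the initial sequence numbers and function name are arbitrary choices for the example.

```python
def three_way_handshake(client_isn, server_isn):
    """Return the three segments exchanged to open a TCP connection."""
    modulus = 2 ** 32                      # sequence numbers are 32-bit values
    return [
        # Step 1: the client proposes its initial sequence number.
        {"from": "client", "flags": ("SYN",), "seq": client_isn, "ack": None},
        # Step 2: the server acknowledges it and proposes its own.
        {"from": "server", "flags": ("SYN", "ACK"), "seq": server_isn,
         "ack": (client_isn + 1) % modulus},
        # Step 3: the client acknowledges the server's sequence number.
        {"from": "client", "flags": ("ACK",), "seq": (client_isn + 1) % modulus,
         "ack": (server_isn + 1) % modulus},
    ]

for segment in three_way_handshake(client_isn=1000, server_isn=5000):
    print(segment)
```

Each acknowledgment number is the other side's sequence number plus one, because the SYN itself consumes one sequence number.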

TCP Windowing

It's not enough just to establish the connection; the session must be dynamically managed to make sure things run smoothly. The major task here is to ensure that one station doesn't overwhelm the other by transmitting too much data at once.

This is done using a technique called windowing, in which the receiving station updates the other as to how many bytes it's willing to accept. Put another way, the station is saying how much memory buffer it has available to handle received packets. The TCP windowing process is depicted in Figure 2-16, with a too-small window size shown on the left and a proper window size on the right.

Figure 2-16: Windowing ensures that the receiving host has the capacity to process incoming packets

Window size is communicated through the ACK messages. Looking at the left half of Figure 2-16, a 1,000-byte window size is clearly no good because it causes a one-to-one ratio between incoming packets and outgoing ACKs, which is far too much overhead in relation to payload traffic. The right half of Figure 2-16 shows a better window size of 10,000 bytes. As the illustration shows, this lets the sending station fire off as many packets as it wants, as long as the cumulative total of unacknowledged bytes stays beneath the 10,000-byte window size limit. This permits a more favorable payload-to-overhead message ratio.

Figure 2-17: The UDP segment format doesn't have Sequence or Acknowledgement fields

The message in the lower-right area is shaded to highlight the fact that window sizes are adjusted dynamically during a session. This is done because of changing conditions within the receiving station. For example, if a Web server suddenly picks up connections from other sending hosts, it has less memory buffer available to process your packets, and it adjusts your window size downward.
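The effect of window size on ACK overhead can be estimated with a deliberately simplified model. The sketch assumes a fixed window, 1,000-byte packets, and a sender that stops and waits for one ACK whenever the window is full. Real TCP stacks acknowledge and slide the window more flexibly, so treat the numbers as an illustration of the ratio, not a prediction. The function name is hypothetical.

```python
def acks_needed(total_bytes, packet_size, window_size):
    """Count ACKs when the sender pauses for one ACK each time the window fills."""
    if packet_size > window_size:
        raise ValueError("a packet must fit inside the window")
    sent = 0
    in_flight = 0       # bytes sent but not yet acknowledged
    acks = 0
    while sent < total_bytes:
        chunk = min(packet_size, total_bytes - sent)
        if in_flight + chunk > window_size:
            acks += 1   # window is full: wait for an ACK covering in-flight data
            in_flight = 0
            continue
        in_flight += chunk
        sent += chunk
    if in_flight:
        acks += 1       # final ACK for the last burst
    return acks

print(acks_needed(10_000, packet_size=1_000, window_size=1_000))
print(acks_needed(10_000, packet_size=1_000, window_size=10_000))
```

With the small window every packet costs one ACK, while the larger window lets ten packets share a single ACK.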

Connectionless IP Packet Handling Through UDP

The User Datagram Protocol (UDP) is connectionless in that it doesn't use acknowledgments or windowing. Compared to TCP, UDP is a "best effort" transport protocol: it simply transmits the message and hopes for the best. The UDP segment format is shown in Figure 2-17. Besides the port numbers to tell which network applications to run, UDP segments basically just declare packet size. The only reliability mechanism in UDP is the checksum, used to verify the integrity of the data in the transmission. The odds that the checksum of a received packet containing altered data will match the checksum of the sent packet are minuscule.
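The simplicity of the UDP format is easy to see in code: the whole header is four 16-bit fields. The sketch below builds and parses a UDP segment with Python's struct module. It leaves the checksum at zero, which in IPv4 signals that the sender did not compute one, because a real UDP checksum also covers parts of the IP header. The port numbers, payload, and function names are invented for the example.

```python
import struct

UDP_HEADER = "!HHHH"   # source port, destination port, length, checksum

def build_udp_segment(src_port, dst_port, payload, checksum=0):
    """Prepend the 8-byte UDP header; length covers header plus payload."""
    length = 8 + len(payload)
    return struct.pack(UDP_HEADER, src_port, dst_port, length, checksum) + payload

def parse_udp_segment(segment):
    """Split a UDP segment back into its header fields and payload."""
    src_port, dst_port, length, checksum = struct.unpack(UDP_HEADER, segment[:8])
    return {
        "source_port": src_port,
        "destination_port": dst_port,
        "length": length,
        "checksum": checksum,
        "data": segment[8:length],
    }

# A sample request to a TFTP server (UDP port 69) from a dynamic client port.
segment = build_udp_segment(50000, 69, b"read config file")
print(parse_udp_segment(segment))
```

There are no sequence or acknowledgment numbers to unpack, which is exactly why UDP cannot detect a lost or reordered message on its own.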




Cisco: A Beginners Guide, Fourth Edition
ISBN: 0072263830
Year: 2006
Pages: 102
