The Individual Protocols

Data is transmitted via TCP/IP using the protocol stack. In each layer of the TCP/IP stack, individual protocols provide certain network services. These protocols can be categorized as network-level or application-level, with many individual protocols existing in each level.

Network-Level Protocols

Network-level protocols facilitate the data transport process transparently. They are invisible to the end user unless that user employs utilities to monitor system processes.

Tip: Sniffers are devices that can monitor network processes. A sniffer is a device, either hardware or software, that can read every packet sent across a network. Sniffers are commonly used to isolate network problems that, although invisible to the end user, are degrading network performance. As such, sniffers can read all activity occurring between network-level protocols. Moreover, sniffers can pose a tremendous security threat. You will examine sniffers in Chapter 15, "Sniffers."

Important TCP/IP network-level protocols include the following:

- Address Resolution Protocol (ARP)
- Internet Control Message Protocol (ICMP)
- Internet Protocol (IP)
- Transmission Control Protocol (TCP)
- User Datagram Protocol (UDP)

We will briefly examine each, moving up the stack from the data-link layer to the transport layer.

For more comprehensive information about protocols (or the stack in general), see TCP/IP Illustrated, Volume 1 by W. Richard Stevens (Addison Wesley, ISBN 0-201-63346-9).

The Address Resolution Protocol (ARP)

The Address Resolution Protocol (ARP) serves the critical purpose of mapping Internet addresses to hardware addresses; that is, it translates the network-layer address (the IP address) to the data-link address. This is vital in routing information between hosts on a local network, and out onto the Internet. Before a message (or other data) is sent, it is packaged into IP packets, or blocks of information suitably formatted for Internet transport.
These packets contain the numeric IP addresses of both the originating and destination machines. What remains is to determine the hardware, or data-link, address of the destination machine. This is where ARP makes its entrance.

An ARP request message is broadcast on a local network. If the destination IP address is active on the local network, the destination host will reply with its own hardware address. The originating machine receives this reply, and the transfer process can begin.

For those readers seeking in-depth information on ARP, see RFC 826 at http://info.internet.isi.edu:80/in-notes/rfc/files/rfc826.txt.

The Internet Control Message Protocol (ICMP)

The Internet Control Message Protocol provides error and control messages that are passed between two (or more) computers or hosts. It enables those hosts to share that information. In this respect, ICMP is critical for diagnosis of network problems. ICMP provides helpful messages, such as the following:

- Echo and reply messages to test for network availability
- Redirect messages to enable more efficient routing
- Time-exceeded messages to inform sources that a packet has exceeded its allocated time within the network

An ICMP packet can be of several types. The two most common are ICMP_ECHO_REQUEST and ICMP_ECHO_REPLY. These packets are used to test network connectivity to make sure a host or network component is active and reachable.

Tip: Perhaps the most widely known ICMP implementation involves a network utility called ping. Ping is often used to determine whether a remote machine is alive. Ping's method of operation is simple: When the user pings a remote machine, a series of ICMP_ECHO_REQUEST packets is forwarded from the user's machine to the remote host. The remote host replies with ICMP_ECHO_REPLY packets. If no reply packets are received at the user's end, the ping program usually generates an error message, indicating that the remote host is down or unreachable.
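
To make the echo request concrete, the following is a minimal Python sketch of how such a packet could be assembled. It only builds the bytes and does not send them, because sending ICMP normally requires a raw socket and elevated privileges. The function names, the identifier, the sequence number, and the payload are illustrative assumptions, not part of any particular ping implementation; the type values 8 and 0 and the checksum method follow RFC 792.

```python
import struct

ICMP_ECHO_REQUEST = 8  # type value for an echo request (RFC 792)
ICMP_ECHO_REPLY = 0    # type value for an echo reply (RFC 792)


def internet_checksum(data: bytes) -> int:
    """Compute the 16-bit one's-complement checksum used by ICMP."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    # Fold any carries back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo_request(identifier: int, sequence: int, payload: bytes) -> bytes:
    """Return the bytes of an ICMP echo request (header plus payload)."""
    # Header layout: type, code, checksum, identifier, sequence number.
    # The checksum is computed with the checksum field set to zero.
    header = struct.pack("!BBHHH", ICMP_ECHO_REQUEST, 0, 0, identifier, sequence)
    checksum = internet_checksum(header + payload)
    header = struct.pack("!BBHHH", ICMP_ECHO_REQUEST, 0, checksum, identifier, sequence)
    return header + payload


# Hypothetical identifier, sequence number, and payload for illustration.
packet = build_echo_request(identifier=0x1234, sequence=1, payload=b"ping")
print(packet.hex())
```

A receiver can validate such a packet by running the same checksum over all of its bytes; an undamaged packet yields zero.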
In-depth information about ICMP can be found in RFC 792 at http://info.internet.isi.edu:80/in-notes/rfc/files/rfc792.txt.

The Internet Protocol (IP)

The Internet Protocol provides packet delivery for all protocols within the TCP/IP suite. Thus, IP is the heart of the process by which data traverses the Internet. The IP datagram, or packet, is the vehicle for transmission of data on TCP/IP networks. The structure of an IP datagram is shown in Figure 4.3.

Figure 4.3. The IP datagram.

An IP datagram is composed of several parts. The first part, the header, is composed of important network information, including source and destination IP addresses. Together, these elements form a complete header. The remaining portion of a datagram contains whatever data is then being sent.

One of the important aspects of IP networking is that it can be used to transmit data using a number of protocols (for example, TCP and UDP). Each protocol serves a particular function; we'll be looking at some important ones soon.

In addition, IP enables the fragmentation and reassembly of data. At the data-link layer, networks can only transmit data in discrete chunks up to a specific size, called the Maximum Transmission Unit (MTU). If the data you want to transmit is larger than the MTU that a network can transmit, the data must be broken into pieces no larger than the MTU, transmitted, and then put back together at the other end. IP provides a mechanism for fragmenting the data, tracking it, and reassembling it. Fragmentation is also important from a security perspective. In some cases, it can be manipulated to work around security measures if security isn't implemented carefully.

An IP datagram also contains a time-to-live (TTL) field. A numeric value, the TTL is decremented by each router the IP datagram passes through as it traverses the network. When that value finally reaches zero, the datagram is discarded.
This ensures that the network doesn't become clogged with datagrams that can't find their destination in a timely fashion. Many other types of packets have time-to-live limitations, and some network utilities (such as Traceroute) use the time-to-live field as a marker in diagnostic routines.

IP Network Addressing

The IP address is a unique identifier for a system on the network. It is 32 bits long and is usually represented as four numbers, each a byte, separated by periods (for example, 32.96.111.130). Each byte, or octet, in an IP address can range from 0 to 255. This representation of an IP address is called dotted-decimal notation and is the most common human-readable format for working with IP addresses.

A contiguous range of IP addresses defines an IP network. This range of IP addresses is denoted by the combination of an IP address and network mask (or netmask). A netmask is a 32-bit value like an IP address, which, when combined with the IP address, defines the address boundaries of the IP network. This requires converting the IP address and netmask to binary format and combining them using binary arithmetic. Note that the first address in a contiguous range of IP addresses indicates the network address. The last address in the contiguous range denotes the network broadcast address.

The network layer in TCP/IP is usually considered to be unicast. This is in contrast to the data-link layer, where ARP operates in a broadcast mode. Unicast indicates that IP communications occur between two endpoints in a point-to-point fashion. However, an IP datagram can be addressed to the network broadcast address. This causes the IP datagram to be received and responded to by all nodes on the IP network. Several network-based denial-of-service attacks take advantage of this broadcast capability in IP.

In-depth information on the Internet Protocol can be found in RFC 760 at http://info.internet.isi.edu:80/in-notes/rfc/files/rfc760.txt. (RFC 760 has since been superseded by RFC 791.)
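
The binary arithmetic described under IP Network Addressing can be sketched in a few lines of Python. The host address is the example used in the text; the netmask 255.255.255.0 is an assumption chosen purely for illustration. The sketch shows the bitwise calculation by hand and then checks it against Python's standard ipaddress module.

```python
import ipaddress

# The host address is the example from the text; the netmask is an
# assumed value chosen for illustration.
address = int(ipaddress.IPv4Address("32.96.111.130"))
netmask = int(ipaddress.IPv4Address("255.255.255.0"))

# ANDing the address with the netmask keeps only the network bits.
network = address & netmask
# Setting every host bit (the inverse of the netmask) gives the broadcast address.
broadcast = network | (~netmask & 0xFFFFFFFF)

print(ipaddress.IPv4Address(network))    # 32.96.111.0
print(ipaddress.IPv4Address(broadcast))  # 32.96.111.255

# The standard library performs the same calculation.
net = ipaddress.ip_network("32.96.111.130/255.255.255.0", strict=False)
assert net.network_address == ipaddress.IPv4Address(network)
assert net.broadcast_address == ipaddress.IPv4Address(broadcast)
```

With this netmask, the network spans 32.96.111.0 through 32.96.111.255, so the first address is the network address and the last is the broadcast address, as described above.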

The Transmission Control Protocol (TCP)

The Transmission Control Protocol (TCP) is one of the main protocols employed on the Internet. Working at the transport level in the stack, it facilitates such mission-critical tasks as file transfers and remote sessions. TCP accomplishes these tasks through a method called reliable communication. In this respect, TCP is more reliable than other protocols within the suite because it includes mechanisms for sequencing and acknowledgment of data transmission.

As with IP, TCP has its own packet structure (see Figure 4.4), composed of source port and destination port numbers that identify services. In addition, important parts of a TCP packet are the sequence number, flags, and checksum. The sequence number tracks a TCP connection and the order in which data is sent. The flags control the connection state, whether it is being established, in use, or being closed. There are six flags that can be used in combination to describe the state of a TCP connection. The most important for this analysis are SYN, ACK, and FIN. The checksum in the TCP packet ensures that the data has not been corrupted during transmission.

Figure 4.4. TCP packet structure.

The TCP system relies on a virtual circuit between the requesting machine (client) and its target (server). This circuit is opened via a three-part process, often referred to as the three-way handshake. The process typically follows the pattern illustrated in Figure 4.5.

Figure 4.5. Establishment of TCP connection.

To establish a TCP connection, the three-way handshake must be completed as follows:

1. The client sends a TCP SYN packet to the server that it wants to establish a connection with. This is a TCP packet with only the SYN flag active. The packet also contains an initial sequence number (ISN) that will be used to track the connection.
2. The server responds with a TCP SYN packet with its own ISN.
The server also acknowledges the client's TCP SYN by setting the ACK flag on this packet and using the client's ISN plus 1 as the acknowledgment number.
3. The client acknowledges the server's TCP SYN with a TCP ACK using the server's ISN plus 1.

No data is exchanged during this process, but, when it is completed, a connection is available for data transfer between the client and server. This connection provides a full-duplex transmission path. Full-duplex transmission enables data to travel in both directions at the same time. In this way, while a file transfer (or other remote session) is underway, any errors that arise can be forwarded to the requesting machine.

TCP also provides extensive error-checking capabilities. For each block of data sent, a checksum is calculated, and the sequence number is incremented. The two machines identify each transferred block using the sequence number. For each block successfully transferred, the receiving host sends an ACK message to the sender to confirm that the transfer was clean. Conversely, if the transfer is unsuccessful, one of two things might occur:

- The requesting machine receives error information.
- The requesting machine receives nothing.

When an error is received, the data is retransmitted unless the error is fatal, in which case the transmission is usually halted. A typical example of a fatal error is a dropped connection. Similarly, if no confirmation is received within a specified time period, the information is also retransmitted. This process is repeated as many times as necessary to complete the transfer or remote session.

TCP Connection Termination

As you might expect, because TCP provides a protocol for establishing a connection, it also provides a protocol for terminating a connection. Establishing a TCP connection takes three steps, whereas terminating one takes four steps.
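
Before turning to the four termination steps, the three establishment steps can be recapped as a small Python sketch. It is a simplified model, not a working TCP implementation: the Segment class, the function name, and the ISN values 1000 and 5000 are hypothetical, and only the flags and the sequence and acknowledgment numbers from the handshake are represented.

```python
from dataclasses import dataclass
from typing import Optional

SEQ_MODULUS = 2 ** 32  # TCP sequence numbers are 32-bit values and wrap around


@dataclass(frozen=True)
class Segment:
    """A simplified TCP segment holding only the fields the handshake uses."""
    sender: str
    flags: tuple
    seq: int
    ack: Optional[int] = None


def three_way_handshake(client_isn: int, server_isn: int) -> list:
    """Return the three segments exchanged to open a connection."""
    # Step 1: the client sends a SYN carrying its initial sequence number.
    syn = Segment("client", ("SYN",), seq=client_isn)
    # Step 2: the server sends its own ISN and acknowledges client ISN + 1.
    syn_ack = Segment("server", ("SYN", "ACK"), seq=server_isn,
                      ack=(client_isn + 1) % SEQ_MODULUS)
    # Step 3: the client acknowledges server ISN + 1.
    ack = Segment("client", ("ACK",), seq=(client_isn + 1) % SEQ_MODULUS,
                  ack=(server_isn + 1) % SEQ_MODULUS)
    return [syn, syn_ack, ack]


# Hypothetical ISNs; real stacks choose them unpredictably.
for segment in three_way_handshake(client_isn=1000, server_isn=5000):
    print(segment.sender, "+".join(segment.flags), segment.seq, segment.ack)
```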
Because a TCP connection is bi-directional or full-duplex, transmission in both directions of the connection must be shut down separately. This is done by using the TCP FIN packet, much as the TCP SYN packet is used to create a connection. When a client is finished using a connection, it will issue a TCP FIN packet to the server. The server responds with a TCP ACK to acknowledge that the connection is closing. Because the connection is bi-directional, the server will also issue a TCP FIN to the client. The client will then acknowledge the server's TCP FIN, thus completing the TCP connection termination process.

In-depth information about TCP can be found in RFC 793 at http://info.internet.isi.edu:80/in-notes/rfc/files/rfc793.txt.

User Datagram Protocol (UDP)

The User Datagram Protocol (UDP) is a simple, connectionless transport layer protocol. In fact, it is so simple that the RFC that defines it is only three pages long. Unlike TCP, UDP provides no reliability, and, because it is connectionless, it doesn't have any mechanism for connection establishment or termination. It does provide data integrity checks via a checksum. Although it might seem that UDP is inferior to TCP, it is, in fact, much better for certain applications because it has very low overhead.

In-depth information about UDP can be found in RFC 768 at http://info.internet.isi.edu:80/in-notes/rfc/files/rfc768.txt.

Application-Level Protocols

The Ports

Each time a machine requests services from another, it specifies a particular destination and transport method. The destination is expressed as the Internet (IP) address of the target machine, and the transport method is the transport protocol (that is, TCP or UDP). Further, the requesting machine specifies the application it is trying to reach at the destination by using a system of ports. Just as machines on the Internet have unique IP addresses, each application (FTP or Telnet, for example) is assigned a unique address called a port.
The port defines the type of service that is being requested or provided. The application in question is bound to that particular port, and, when any connection request is made to that port, the corresponding server application responds.

Port numbers are 16-bit values, so there are 65,536 possible ports for each transport protocol on an Internet server, although most will not be active. For purposes of convenience and efficiency, a standard framework has been developed for port assignments. (In other words, although a system administrator can assign services to the ports of his choice, services are generally assigned to recognized ports commonly referred to as well-known ports.) Table 4.1 shows some commonly recognized ports and the applications typically bound to them.

Table 4.1. Common Ports and Their Corresponding Services or Applications

| Service or Application | Port |
| Hypertext Transfer Protocol (HTTP) | TCP port 80 |
| Domain Name System (DNS) | UDP and TCP port 53 |
| Telnet | TCP port 23 |
| File Transfer Protocol (FTP) | TCP ports 20 and 21 |
| Simple Mail Transfer Protocol (SMTP) | TCP port 25 |
| Secure Shell (SSH) | TCP port 22 |

Each of the ports described in Table 4.1 is assigned to an application-level protocol or service; that is, these services are visible to the user, and the user can interact with them. We will examine each of these applications in the following sections.

For a comprehensive list of all port assignments, visit ftp://ftp.isi.edu/in-notes/iana/assignments/port-numbers. This document is extremely informative and exhaustive in its treatment of commonly assigned port numbers.

Hypertext Transfer Protocol (HTTP)

Hypertext Transfer Protocol (HTTP) is perhaps the most renowned protocol of all because it enables users to surf the World Wide Web. Stated briefly in RFC 1945, HTTP is an application-level protocol with the lightness and speed necessary for distributed, collaborative, hypermedia information systems.
It is a generic, stateless, object-oriented protocol which can be used for many tasks, such as name servers and distributed object management systems, through extension of its request methods (commands). A feature of HTTP is the typing of data representation, enabling systems to be built independently of the data being transferred.

RFC 1945 describes HTTP/1.0. A more recent specification, HTTP/1.1, is given in RFC 2068, which is available at http://info.internet.isi.edu:80/in-notes/rfc/files/rfc2068.txt.

HTTP has forever changed the nature of the Internet, primarily by bringing the Internet to the masses. Using a common browser such as Netscape Navigator or Microsoft Internet Explorer, you can monitor the process of HTTP as it occurs. Depending upon the version of HTTP the server supports, your browser will contact the server for each data element (text, graphic, sound) on a WWW page. Thus, it will first grab text, then a graphic, then a sound file, and so on. In the lower-left corner of your browser's screen is a status bar. Watch it for a few moments while it is loading a page. You will see this request/response activity occur, often at a very high speed.

HTTP typically runs on port 80 using TCP. HTTP does little to protect the confidentiality of data because documents are transmitted without encryption. Some security can be added by using HTTPS, which is HTTP transmitted over Secure Sockets Layer (SSL). HTTPS typically runs on port 443 using TCP.

Domain Name System (DNS)

The Domain Name System (DNS) provides services that translate host names to IP addresses and back again. Much as Address Resolution Protocol provides a mechanism for translating addresses between the data-link and network layers (hardware address to IP address), DNS translates addresses between the network layer and the application layer (IP address to host name).
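
To illustrate what a name lookup involves, the following Python sketch builds the query message a resolver could send to ask for the IPv4 address of a host name. It follows the message layout in RFC 1035 but is only a sketch: the function names and the query ID are illustrative assumptions, the host name is borrowed from the example elsewhere in this chapter, and nothing is actually sent over the network.

```python
import struct


def encode_name(hostname: str) -> bytes:
    """Encode a host name as length-prefixed labels ending in a zero byte."""
    encoded = b""
    for label in hostname.rstrip(".").split("."):
        raw = label.encode("ascii")
        encoded += bytes([len(raw)]) + raw
    return encoded + b"\x00"


def build_query(hostname: str, query_id: int) -> bytes:
    """Build a DNS query asking for the type A (IPv4) record of hostname."""
    flags = 0x0100  # standard query with the "recursion desired" bit set
    # Header: ID, flags, then question, answer, authority, additional counts.
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    # Question: encoded name, then query type 1 (A) and class 1 (IN).
    question = encode_name(hostname) + struct.pack("!HH", 1, 1)
    return header + question


# Hypothetical query ID; real resolvers pick one at random.
query = build_query("www.fbi.gov", query_id=0x1A2B)
print(len(query), query.hex())
```

The server's reply reuses the same ID so that the resolver can match the answer to its question.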
Because IP addresses aren't exactly human friendly, the Domain Name System was developed to allow people to use human-friendly naming for systems. For example, when you enter http://www.fbi.gov into your Web browser, the name needs to be translated from that friendly format into an IP address that can be used by the network layer.

DNS has two modes of operation. The first mode is primarily for communications to clients that need names resolved to addresses. Because this is a small, easy task, transport for this mode is provided by UDP. DNS servers also must transfer large blocks of DNS records so that the workload and administration involved with resolving names to and from IP addresses can be distributed. These larger transfers (called DNS zone transfers) occur via TCP.

DNS is a very active area of discussion, and numerous Internet drafts and RFCs have been created to add functionality and security to DNS. The core RFCs for DNS are 1034 and 1035. You can find them at http://info.internet.isi.edu:80/in-notes/rfc/files/rfc1034.txt and http://info.internet.isi.edu:80/in-notes/rfc/files/rfc1035.txt.

All modern operating systems that run TCP/IP come with a DNS client (called a resolver) as part of the OS. A client program that enables a user to query DNS directly is often included. On UNIX and Microsoft Windows NT or 2000, the program nslookup is provided. This DNS client lets you interactively connect to a DNS server and perform various queries of the DNS data.

The most widely used DNS server is the Berkeley Internet Name Domain (BIND) DNS server. Developed and supported by the Internet Software Consortium, BIND is available for most UNIX systems as well as for Microsoft Windows NT.

DNS typically runs on port 53 using UDP and TCP.

Telnet

Telnet is best described in RFC 854, the Telnet protocol specification:

The purpose of the Telnet protocol is to provide a fairly general, bi-directional, eight-bit byte-oriented communications facility.
Its primary goal is to allow a standard method of interfacing terminal devices and terminal-oriented processes to each other.

Telnet not only enables the user to log in to a remote host, it also lets that user execute commands on the host. Thus, an individual in Los Angeles can telnet to a machine in New York and begin running programs on the New York machine just as though she were in New York.

For those of you who are unfamiliar with Telnet, it operates much like the interface of a bulletin board system (BBS). Telnet is an excellent application for providing a terminal-based front-end to databases. For example, many university library catalogs can be accessed via Telnet or tn3270 (a variation that emulates an IBM 3270 terminal). Figure 4.6 shows an example of a Telnet library catalog screen.

Figure 4.6. A sample Telnet session.

Even though GUI applications have taken the world by storm, Telnet, which is essentially a text-based application, is still incredibly popular. Telnet enables you to perform a variety of functions (retrieving mail, for example) at a minimal cost in network resources.

To use Telnet, the user issues whatever command is necessary to start his Telnet client, followed by the name (or numeric IP address) of the target host. In UNIX, this is done as follows:

% telnet www.fbi.gov

This command launches a Telnet session, contacts www.fbi.gov, and requests a TCP connection on port 23. That connection request will either be honored or denied, depending on the configuration at the target host. In UNIX, the telnet command has long been a native one. In addition, Telnet has been included with Microsoft Windows distributions for more than a decade.

Telnet is a simple protocol, and offers very little in the way of security. All data transmitted during a Telnet session, including the login ID and password, is sent unencrypted.
Anyone with access to a sniffer and the network between the client and server could capture critical data, including your password. Secure Shell (examined later in this chapter) provides services similar to Telnet, but adds security by encrypting the data between client and server.

Telnet typically runs on port 23 via TCP.

File Transfer Protocol (FTP)

File Transfer Protocol (FTP) is a standard method of transferring files from one system to another. Its purpose is set forth in RFC 765 as follows:

The objectives of FTP are 1) to promote sharing of files (computer programs and/or data), 2) to encourage indirect or implicit (via programs) use of remote computers, 3) to shield a user from variations in file storage systems among Hosts, and 4) to transfer data reliably and efficiently. FTP, though usable directly by a user at a terminal, is designed mainly for use by programs.

For more than two decades, researchers have investigated a wide variety of file-transfer methods. The development of FTP has undergone many changes in that time. Its first definition appeared in April 1971 in RFC 114, but a more practical document might be RFC 959, found at http://info.internet.isi.edu:80/in-notes/rfc/files/rfc959.txt.

Mechanical Operation of FTP

File transfers using FTP can be accomplished using any suitable FTP client. Table 4.2 lists some common FTP clients, by operating system.

Table 4.2. FTP Clients for Various Operating Systems

| Operating System | Clients |
| UNIX | Native, LLNLXDIR2.0, FTPtool, NCFTP |
| Microsoft Windows 95/98 | Native, WS_FTP, Netload, Cute-FTP, Leap FTP, SDFTP, FTP Explorer |
| Microsoft Windows NT/2000 | See listings for Windows 95/98 |
| Microsoft Windows 3.x | Win_FTP, WS_FTP, CU-FTP, WSArchie |
| Macintosh | Anarchie, Fetch, Freetp |
| OS/2 | Gibbon FTP, FTP-IT, Lynn's Workplace FTP |

FTP file transfers occur in a client/server environment.
The requesting machine starts one of the clients named in Table 4.2. This generates a request that is forwarded to the targeted FTP server (usually a host on another network). Typically, the request is sent by the client to port 21. For a connection to be established, the targeted file server must be running an FTP server.

FTPD: An FTP Server Daemon

FTPD is the standard FTP server daemon for UNIX. Its function is simple: to reply to connect requests received and to satisfy those requests for file transfers. An FTP daemon comes standard on most distributions of UNIX (for other operating systems, see Table 4.3).

Table 4.3. FTP Servers for Various Operating Systems

| Operating System | Servers |
| UNIX | Native (FTPD), WUFTD |
| Microsoft Windows 95/98 | WFTPD, Microsoft FrontPage, WAR FTP Daemon, Vermilion |
| Microsoft Windows NT/2000 | Serv-U, OmniFSPD, Microsoft Internet Information Server |
| Microsoft Windows 3.x | WinQVT, Serv-U, Beames & Whitside BW Connect, WFTPD FTP Server, WinHTTPD |
| Macintosh | Netpresenz, FTPd |
| OS/2 | Penguin |

FTPD waits for a connection request. When such a request is received, FTPD requests the user login. The user must either provide her valid user login and password or log in anonymously (if the server allows anonymous sessions).

When logged in, the user can download files. In certain instances and if security on the server allows, the user can also upload files.

As with Telnet, FTP is an insecure protocol. It does nothing to encrypt the user ID, password, or any of the files being transferred. Secure Shell provides a more secure method of file transfer via either Secure Copy (SCP) or Secure FTP (SFTP).

FTP uses ports 20 and 21 via TCP.

Simple Mail Transfer Protocol (SMTP)

SMTP is the protocol responsible for email transmission between servers, and the sending of email from clients to servers. Its purpose is stated concisely in RFC 821:

The objective of Simple Mail Transfer Protocol (SMTP) is to transfer mail reliably and efficiently.

SMTP is an extremely lightweight protocol. Running any SMTP-compliant client, the user sends a request to an SMTP server. The client forwards a series of instructions, indicating that it wants to send mail to a recipient somewhere on the Internet. If the SMTP server allows this operation, an affirmative acknowledgment is sent back to the client machine. At that point, the session begins. The client might then forward the recipient's identity, his IP address, and the message (in text) to be sent.

Despite the simple character of SMTP, mail service has been the source of countless security holes. The configuration of an SMTP server can be complex, depending upon the options an administrator needs to support. A combination of SMTP server application bugs and difficulty in configuration has led to numerous security holes. These security issues are covered in detail later in this book.

Most networked operating systems have SMTP servers available for use. SMTP server support is included as sendmail for most UNIX distributions, or as part of Internet Information Services for Microsoft Windows.

SMTP typically runs on port 25 via TCP.

Further information on SMTP is available in RFC 821 at http://info.internet.isi.edu:80/in-notes/rfc/files/rfc821.txt.

Secure Shell Protocol (SSH)

SSH is relatively new to the TCP/IP suite of protocols. Unlike the application protocols we've examined already, SSH has been widely implemented without completing the RFC process. This is largely because of the vast demand for a more secure method of providing services similar to Telnet and FTP.

There are two versions of the SSH protocol, and a number of implementations. The first widely used version of the protocol was SSH1, which was defined in an Internet draft (a pre-RFC document) in 1995. As of this writing, there is an Internet Engineering Task Force working group developing the second generation of SSH. Based upon that group's Internet drafts, a number of SSH2 implementations have been completed.
Information on the IETF SSH working group, along with the latest Internet drafts, can be found at http://www.ietf.org/html.charters/secsh-charter.html.

SSH allows you to log in to another computer over a network, to execute commands on a remote machine (like Telnet), and to move files from one machine to another (like FTP). It provides for strong authentication and secure, encrypted communications over otherwise insecure networks. It is intended as a replacement for Telnet and other remote access protocols like rlogin, rsh, and rcp. In SSH2, there is a replacement for FTP as well, called sftp.

Secure Shell client implementations exist for a variety of platforms, as shown in Table 4.4.

Table 4.4. SSH Clients for Various Operating Systems

| Operating System | Clients |
| UNIX | SSH Communications, F-Secure, OpenSSH, Lsh, MindTerm |
| Microsoft Windows 95/98 | SSH Communications, F-Secure, PuTTY, TeraTerm, FiSSH, SecureCRT, Cygwin32, MindTerm |
| Microsoft Windows NT/2000 | Same as 95/98 |
| Macintosh | F-Secure, NiftyTelnet 1.1 SSH, MindTerm |
| OS/2 | MindTerm |

Secure Shell server implementations are also available, although not on as many platforms as for clients (see Table 4.5).

Table 4.5. SSH Servers for Various Operating Systems

| Operating System | Servers |
| UNIX | SSH Communications, F-Secure, OpenSSH, Lsh |
| Microsoft Windows 95/98 | SSH Communications, F-Secure, Cygwin32 |
| Microsoft Windows NT/2000 | Same as 95/98 |