27.3 FTP Examples

We now look at some examples using FTP: its management of the data connection, how text files are sent using NVT ASCII, FTP's use of the Telnet synch signal to abort an in-progress transfer, and finally the popular "anonymous FTP."

Connection Management: Ephemeral Data Port

Let's first look at FTP's connection management with a simple FTP session that just lists a file on the server. We run the client on the host svr4 with the -d flag (debug). This tells it to print the commands and replies that are exchanged across the control connection. All the lines preceded by ---> are sent by the client to the server, and the lines that begin with a 3-digit number are the server's replies. The client's interactive prompt is ftp>.

    svr4 % ftp -d bsdi                            -d option for debug output
    Connected to bsdi.                            client does active open of control connection
    220 bsdi FTP server (Version 5.60) ready.     server responds it is ready
    Name (bsdi:rstevens):                         client prompts us for a login name
    ---> USER rstevens                            we type RETURN, so client sends default
    331 Password required for rstevens.
    Password:                                     we type our password; it's not echoed
    ---> PASS xxxxxxx                             client sends it as cleartext
    230 User rstevens logged in.
    ftp> dir hello.c                              ask for directory listing of a single file
    ---> PORT 140,252,13,34,4,150                 see Figure 27.4
    200 PORT command successful.
    ---> LIST hello.c
    150 Opening ASCII mode data connection for /bin/ls.
    -rw-r--r--  1 rstevens  staff  38 Jul 17 12:47 hello.c
    226 Transfer complete.
    remote: hello.c                               output by client
    56 bytes received in 0.03 seconds (1.8 Kbytes/s)
    ftp> quit                                     we're done
    ---> QUIT
    221 Goodbye.

When the FTP client prompts us for a login name, it prints the default (our login name on the client). When we type the RETURN key, this default is sent.

Asking for a directory listing of a single file causes a data connection to be established and used. This example follows the procedure we showed in Figures 27.4 and 27.5. The client asks its TCP for an ephemeral port number for its end of the data connection, and sends this port number (1174) to the server in a PORT command. We can also see that a single interactive user command (dir) becomes two FTP commands (PORT and LIST).
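The arithmetic behind the PORT argument can be sketched in a few lines. This is an illustration only, not code from any FTP client: the function names encode_port_argument and decode_port_argument are our own, and we assume the usual six-number form (four address bytes, then the high-order and low-order bytes of the port).

```python
from typing import Tuple


def encode_port_argument(ip: str, port: int) -> str:
    """Build the six comma-separated numbers carried by a PORT command."""
    if not 0 <= port <= 65535:
        raise ValueError("port must fit in 16 bits")
    octets = ip.split(".")
    if len(octets) != 4:
        raise ValueError("expected a dotted-decimal IPv4 address")
    high, low = divmod(port, 256)  # high-order byte, low-order byte
    return ",".join(octets + [str(high), str(low)])


def decode_port_argument(arg: str) -> Tuple[str, int]:
    """Recover the IP address and port number from a PORT argument."""
    fields = [int(field) for field in arg.split(",")]
    if len(fields) != 6 or any(not 0 <= f <= 255 for f in fields):
        raise ValueError("expected six numbers in the range 0-255")
    ip = ".".join(str(f) for f in fields[:4])
    port = fields[4] * 256 + fields[5]
    return ip, port


# The PORT command from the session above: 4 * 256 + 150 = 1174.
print(decode_port_argument("140,252,13,34,4,150"))
```

Decoding the argument from our session gives the address 140.252.13.34 and port 1174, the ephemeral port mentioned in the text.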

Figure 27.6 is the time line of the packet exchange across the control connection. (We have removed the establishment and termination of the control connection, along with all the window size advertisements.) We note in this figure where the data connection is opened, used, and then closed.

Figure 27.6. Control connection for FTP example.

Figure 27.7 is the time line for the data connection. The times in this figure are from the same starting point as Figure 27.6. We have removed all window advertisements, but have left in the type-of-service field, to show that the data connection uses a different type-of-service (maximize throughput) than the control connection (minimize delay). (The TOS values are in Figure 3.2.)

Figure 27.7. Data connection for FTP example.

In this time line the FTP server does the active open of the data connection, from port 20 (called ftp-data), to the port number from the PORT command (1174). Also in this example, where the server writes to the data connection, the server does the active close of the data connection, which tells the client when the listing is complete.

Connection Management: Default Data Port

If the client does not send a PORT command to the server, to specify the port number for the client's end of the data connection, the server uses the same port number for the data connection that is being used for the control connection. This can cause problems for clients that use the stream mode (which the Unix FTP clients and server always use), as we show below.

The Host Requirements RFC recommends that an FTP client using the stream mode send a PORT command to use a nondefault port number before each use of the data connection.

Returning to the previous example (Figure 27.6), what if we asked for another directory listing a few seconds after the first? The client would ask its kernel to choose another ephemeral port number (perhaps 1175) and the next data connection would be between svr4 port 1175 and bsdi port 20. But in Figure 27.7 the server did the active close of the data connection, and we showed in Section 18.6 that the server won't be able to assign port 20 to the new data connection, because that local port number was used by an earlier connection that is still in the 2MSL wait state.

The server gets around this by specifying the SO_REUSEADDR option that we mentioned in Section 18.6. This lets it assign port 20 to the new connection, and the new connection will have a different foreign port number (1175) from the one that's in the 2MSL wait (1174), so everything is OK.
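As a rough sketch of how a program requests this option, the following uses Python's socket module. It is not the BSD server's code: the helper name open_reusable_socket is ours, and we bind to the loopback address with port 0 (letting the kernel pick a port) because binding to port 20 normally requires superuser privileges.

```python
import socket


def open_reusable_socket(local_port: int, host: str = "127.0.0.1") -> socket.socket:
    """Create a TCP socket with SO_REUSEADDR set before it is bound."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # The option must be set before bind() for it to affect the bind.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, local_port))
    return sock


# Port 0 is an assumption for this sketch; a real FTP server would ask for port 20.
sock = open_reusable_socket(0)
print(sock.getsockname())
sock.close()
```

The option only permits the local port to be reused. It does not let a program create a connection whose complete socket pair is still in the 2MSL wait.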

This scenario changes if the client does not send the PORT command, specifying an ephemeral port number on the client. We can force this to happen by executing the user command sendport to the FTP client. Unix FTP clients use this command to turn off sending PORT commands to the server before each use of a data connection.

Figure 27.8 shows the time line only for the data connections for two consecutive LIST commands. The control connection originates from port 1176 on host svr4, so in the absence of PORT commands, the client and server use this same port number for the data connection. (We have removed the window advertisements and type-of-service values.)

Figure 27.8. Data connection for two consecutive LIST commands.

The sequence of events is as follows.

1. The control connection is established from the client port 1176 to the server port 21. (We don't show this.)
2. When the client does the passive open for the data connection on port 1176, it must specify the SO_REUSEADDR option since that port is already in use by the control connection on the client.
3. The server does the active open of the data connection (segment 1) from port 20 to port 1176. The client accepts this (segment 2), even though port 1176 is already in use on the client, because the two socket pairs
```
 <svr4, 1176, bsdi, 21>     <svr4, 1176, bsdi, 20> 
```
are different (the port numbers on bsdi are different). TCP demultiplexes incoming segments by looking at the source IP address, source port number, destination IP address, and destination port number, so as long as one of the four elements differs, all is OK.
4. The server does the active close of the data connection (segment 5), which puts the socket pair
```
 <svr4, 1176, bsdi, 20> 
```
in a 2MSL wait on the server.
5. The client sends another LIST command across the control connection. (We don't show this.) Before doing this the client does a passive open on port 1176 for its end of the data connection. The client must specify the SO_REUSEADDR option again, since the port number 1176 is already in use.
6. The server issues an active open for the data connection from port 20 to port 1176. Before doing this the server must specify SO_REUSEADDR, since the local port (20) is associated with a connection that is in the 2MSL wait, but from what we showed in Section 18.6, the connection won't succeed. The reason is that the socket pair for the connection request equals the socket pair from step 4 that is still in a 2MSL wait. The rules of TCP forbid the server from sending the SYN. There is no way for the server to override this 2MSL wait of the socket pair before reusing the same socket pair.
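The socket-pair reasoning in these steps can be modeled with a small sketch. It is a toy model, not TCP code: the SocketPair type and the function can_send_syn are our own names, and port 1177 stands in for whatever ephemeral port a client might have sent in a PORT command.

```python
from typing import NamedTuple, Set


class SocketPair(NamedTuple):
    """The four values TCP uses to identify a connection, in the order used above."""
    client_host: str
    client_port: int
    server_host: str
    server_port: int


def can_send_syn(pair: SocketPair, in_2msl_wait: Set[SocketPair]) -> bool:
    """An active open may proceed only if this exact socket pair is not in the 2MSL wait."""
    return pair not in in_2msl_wait


control = SocketPair("svr4", 1176, "bsdi", 21)
first_data = SocketPair("svr4", 1176, "bsdi", 20)

# The control and data connections coexist because one element (the bsdi port) differs.
assert control != first_data

# After the server's active close, the first data connection's pair is in the 2MSL wait.
in_2msl_wait = {first_data}

# Second LIST without a PORT command: identical socket pair, so the SYN is forbidden.
print(can_send_syn(SocketPair("svr4", 1176, "bsdi", 20), in_2msl_wait))  # False

# Second LIST with a PORT command naming a new port (1177 is hypothetical): allowed.
print(can_send_syn(SocketPair("svr4", 1177, "bsdi", 20), in_2msl_wait))  # True
```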

At this point the BSD server retries the connection request every 5 seconds, up to 18 times, for a total of 90 seconds. We see that segment 9 succeeds about 1 minute later. (We mentioned in Chapter 18 that SVR4 uses an MSL of 30 seconds, for a 2MSL wait of 1 minute.) We don't see any SYNs from these failures in this time line because the active opens fail and the server's TCP doesn't even send a SYN.

The reason the Host Requirements RFC recommends using the PORT command is to avoid this 2MSL wait between successive uses of a data connection. By continually changing the port number on one end, the problem we just showed disappears.

Text File Transfer: NVT ASCII Representation or Image?

Let's verify that the transmission of a text file uses NVT ASCII by default. This time we don't specify the -d flag, so we don't see the client commands, but notice that the client still prints the server's responses:

    sun % ftp bsdi
    Connected to bsdi.
    220 bsdi FTP server (Version 5.60) ready.
    Name (bsdi:rstevens):                         we type RETURN
    331 Password required for rstevens.
    Password:                                     we type our password
    230 User rstevens logged in.
    ftp> get hello.c                              fetch a file
    200 PORT command successful.
    150 Opening ASCII mode data connection for hello.c (38 bytes).    server says file contains 38 bytes
    226 Transfer complete.
    local: hello.c remote: hello.c                output by client
    42 bytes received in 0.0037 seconds (11 Kbytes/s)    42 bytes across data connection
    ftp> quit
    221 Goodbye.
    sun % ls -l hello.c
    -rw-rw-r--  1 rstevens       38 Jul 18 08:48 hello.c    but file contains 38 bytes
    sun % wc -l hello.c                           count the lines in the file
    4 hello.c

Forty-two bytes are transferred across the data connection because the file contains four lines. Each Unix newline character (\n) is converted into the NVT ASCII 2-byte end-of-line sequence (\r\n) by the server for transmission, and then converted back by the client for storage.
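The byte counts can be checked with a short sketch of the conversion. The function names are ours, and the sample data is a placeholder with the same size and line count as hello.c (38 bytes, four lines), not the file's real contents. The sketch also assumes the Unix file contains no carriage returns of its own.

```python
def unix_to_nvt(data: bytes) -> bytes:
    """Sender side: expand each Unix newline into the NVT ASCII end-of-line."""
    return data.replace(b"\n", b"\r\n")


def nvt_to_unix(data: bytes) -> bytes:
    """Receiver side: collapse each NVT ASCII end-of-line back to a newline."""
    return data.replace(b"\r\n", b"\n")


# Placeholder file: 34 ordinary bytes plus 4 newlines = 38 bytes in 4 lines.
sample = b"a" * 9 + b"\n" + b"b" * 9 + b"\n" + b"c" * 8 + b"\n" + b"d" * 8 + b"\n"

on_the_wire = unix_to_nvt(sample)
print(len(sample), len(on_the_wire))       # 38 42
print(nvt_to_unix(on_the_wire) == sample)  # True
```

Each of the four lines grows by 1 byte in transit, which accounts for the difference between the 38 bytes stored and the 42 bytes received.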

Newer clients attempt to determine if the server is of the same system type, and if so, transfer files in binary (image file type) instead of ASCII. This helps in two ways.

1. The sender and receiver don't have to look at every byte (a big savings).
2. Fewer bytes are transferred if the host operating system uses fewer bytes for the end-of-line than the 2-byte NVT ASCII sequence (a smaller savings).

We can see this optimization using a BSD/386 client and server. We'll enable the debug mode, to see the client FTP commands:

    bsdi % ftp -d slip                            specify -d to see client commands
    Connected to slip.
    220 slip FTP server (Version 5.60) ready.
    Name (slip:rstevens):                         we type RETURN
    ---> USER rstevens
    331 Password required for rstevens.
    Password:                                     we type our password
    ---> PASS XXXX
    230 User rstevens logged in.
    ---> SYST                                     this is sent automatically by client
    215 UNIX Type: L8 Version: BSD-199103         server's reply
    Remote system type is UNIX.                   information output by client
    Using binary mode to transfer files.          information output by client
    ftp> get hello.c                              fetch a file
    ---> TYPE I                                   sent automatically by client
    200 Type set to I.
    ---> PORT 140,252,13,66,4,84                  port number = 4 × 256 + 84 = 1108
    200 PORT command successful.
    ---> RETR hello.c
    150 Opening BINARY mode data connection for hello.c (38 bytes).
    226 Transfer complete.
    38 bytes received in 0.035 seconds (1.1 Kbytes/s)    only 38 bytes this time
    ftp> quit
    ---> QUIT
    221 Goodbye.

After we log in to the server, the client FTP automatically sends the SYST command, which the server responds to with its system type. If the reply begins with the string "215 UNIX Type: L8", and if the client is running on a Unix system with 8 bits per byte, binary mode (image) is used for all file transfers, unless changed by the user.
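The client's decision amounts to a prefix test on the reply line. The sketch below is our own illustration of that test, not the client's source: the function name and the local_is_8bit_unix parameter are assumptions.

```python
def use_binary_by_default(syst_reply: str, local_is_8bit_unix: bool = True) -> bool:
    """Return True if image (binary) type should be the default for transfers.

    Both conditions from the text must hold: the server's SYST reply begins
    with "215 UNIX Type: L8", and the client is an 8-bit-byte Unix system.
    """
    return local_is_8bit_unix and syst_reply.startswith("215 UNIX Type: L8")


# The reply from the session above.
print(use_binary_by_default("215 UNIX Type: L8 Version: BSD-199103"))  # True
```

Any other reply, such as an error from a server that does not support SYST, leaves the client in its default ASCII type.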

When we fetch the file hello.c the client automatically sends the command TYPE I to set the file type to image. Only 38 bytes are transferred across the data connection.

The Host Requirements RFC says an FTP server must support the SYST command (it was an option in RFC 959). But the only systems used in the text (see inside front cover) that support it are BSD/386 and AIX 3.2.2. SunOS 4.1.3 and Solaris 2.x reply with 500 (command not understood). SVR4 has the extremely unsocial behavior of replying with 500 and closing the control connection!

Aborting A File Transfer: Telnet Synch Signal

We now look at how the FTP client aborts a file transfer from the server. Aborting a file transfer from the client to the server is easy: the client stops sending data across the data connection and sends an ABOR to the server on the control connection. Aborting a receive, however, is more complicated, because the client wants to tell the server to stop sending data immediately. We mentioned earlier that the Telnet synch signal is used, as we'll see in this example.

We'll initiate a receive and type our interrupt key after it has started. Here is the interactive session, with the initial login deleted:

    ftp> get a.out                                fetch a large file
    ---> TYPE I                                   client and server are both 8-bit byte Unix systems
    200 Type set to I.
    ---> PORT 140,252,13,66,4,99
    200 PORT command successful.
    ---> RETR a.out
    150 Opening BINARY mode data connection for a.out (28672 bytes).
    ^?                                            type our interrupt key
    receive aborted                               output by client
    waiting for remote to finish abort            output by client
    426 Transfer aborted. Data connection closed.
    226 Abort successful
    1536 bytes received in 1.7 seconds (0.89 Kbytes/s)

After we type our interrupt key, the client immediately tells us it initiated the abort and is waiting for the server to complete. The server sends two replies: 426 and 226. Both replies are sent by the Unix server when it receives the urgent data from the client with the ABOR command.

Figures 27.9 and 27.10 show the time line for this session. We have combined the control connection (solid lines) and the data connection (dashed lines) to show the relationship between the two.

Figure 27.9. Aborting a file transfer (first half).

Figure 27.10. Aborting a file transfer (second half).

The first 12 segments in Figure 27.9 are what we expect. The commands and replies across the control connection set up the file transfer, the data connection is opened, and the first segment of data is sent from the server to the client.

In Figure 27.10, segment 13 is the receipt of the sixth data segment from the server on the data connection, followed by segment 14, which is generated by our typing the interrupt key. Ten bytes are sent by the client to abort the transfer:

<IAC, IP, IAC, DM, A, B, O, R, \r, \n >

We see two segments (14 and 15) because of the problem we detailed in Section 20.8 dealing with TCP's urgent pointer. (We saw the same handling of this problem in Figure 26.17 with Telnet.) The Host Requirements RFC says the urgent pointer should point to the last byte of urgent data, while most Berkeley-derived implementations have it point 1 byte beyond the last byte of urgent data. The FTP client purposely writes the first 3 bytes as urgent data, knowing the urgent pointer will (incorrectly) point to the next byte that is written (the data mark, DM, at sequence number 54). This first write with 3 bytes of urgent data is sent immediately, along with the urgent pointer, followed by the next 7 bytes. (The BSD FTP server does not have a problem with which interpretation of the urgent pointer is used by the client. When the server receives urgent data on the control connection it reads the next FTP command, looking for ABOR or STAT, ignoring any embedded Telnet commands.)
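The split into two writes can be made concrete with a sketch that builds the 10 bytes. The Telnet command values (IAC = 255, IP = 244, DM = 242) are the standard ones. The function name abort_writes is ours, and the sketch only constructs the bytes rather than sending them.

```python
from typing import Tuple

IAC = 255  # Telnet "interpret as command"
IP = 244   # Telnet "interrupt process"
DM = 242   # Telnet "data mark"


def abort_writes() -> Tuple[bytes, bytes]:
    """Return the two writes the client makes to abort a transfer.

    The first 3 bytes are written as urgent data; the remaining 7 bytes
    (the data mark followed by the ABOR command line) are written normally.
    """
    urgent = bytes([IAC, IP, IAC])
    normal = bytes([DM]) + b"ABOR\r\n"
    return urgent, normal


urgent, normal = abort_writes()
print(len(urgent), len(normal))  # 3 7
print(urgent + normal)
```

With a connected socket, one plausible way to issue the two writes is to pass the first to send with the MSG_OOB flag and the second to an ordinary send. We have not verified that against any particular client.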

Notice that despite the server saying the transfer was aborted (segment 18, on the control connection), the client receives 14 more segments of data (sequence numbers 1537 through 5120) on the data connection. These segments were probably queued in the network device driver on the server when the abort was received, but the client prints "1536 bytes received" meaning it ignores all the segments of data that it receives (segments 17 and later) after sending the abort (segments 14 and 15).

In the case of a Telnet user typing the interrupt key, we saw in Figure 26.17 that by default the Unix client does not send the interrupt process command as urgent data. We said this was OK because there is little chance that the flow of data from the client to the server is stopped by flow control. With FTP the client is also sending an interrupt process command across the control connection, and since two connections are being used there is little chance that the control connection is stopped by flow control. Why does FTP send the interrupt process command as urgent data when Telnet does not? The answer is that FTP uses two connections, whereas Telnet uses one, and on some operating systems it may be hard for a process to monitor two connections for input at the same time. FTP assumes that these marginal operating systems at least provide notification that urgent data has arrived on the control connection, allowing the server to then switch from handling the data connection to the control connection.

Anonymous FTP

One form of FTP is so popular that we'll show an example of it. It's called anonymous FTP, and when supported by a server, allows anyone to log in and use FTP to transfer files. Vast amounts of free information are available using this technique.

How to find which site has what you're looking for is a totally different problem. We mention it briefly in Section 30.4.

We'll use anonymous FTP to the site ftp.uu.net (a popular anonymous FTP site) to fetch the errata file for this book. To use anonymous FTP we log in with the username of "anonymous" (you learn to spell it correctly after a few times). When prompted for a password we type our electronic mail address.

    sun % ftp ftp.uu.net
    Connected to ftp.uu.net.
    220 ftp.UU.NET FTP server (Version 2.0WU(13) Fri Apr 9 20:44:32 EDT 1993) ready.
    Name (ftp.uu.net:rstevens): anonymous
    331 Guest login ok, send your complete e-mail address as password.
    Password:                                     we type rstevens@noao.edu; it's not echoed
    230-
    230-                Welcome to the UUNET archive.
    230-   A service of UUNET Technologies Inc, Falls Church, Virginia
    230-   For information about UUNET, call +1 703 204 8000, or see the files
    230-   in /uunet-info
                                                  more greeting lines
    230 Guest login ok, access restrictions apply.
    ftp> cd published/books                       change to the desired directory
    250 CWD command successful.
    ftp> binary                                   we'll transfer a binary file
    200 Type set to I.
    ftp> get stevens.tcpipiv1.errata.Z            fetch the file
    200 PORT command successful.
    150 Opening BINARY mode data connection for stevens.tcpipiv1.errata.Z (105 bytes).    (you may get a different file size)
    226 Transfer complete.
    local: stevens.tcpipiv1.errata.Z remote: stevens.tcpipiv1.errata.Z
    105 bytes received in 4.1 seconds (0.83 Kbytes/s)
    ftp> quit
    221 Goodbye.
    sun % uncompress stevens.tcpipiv1.errata.Z
    sun % more stevens.tcpipiv1.errata

We run uncompress because many files available for anonymous FTP are compressed using the Unix compress(1) program, resulting in a file extension of .Z. These files must be transferred using the binary file type, not the ASCII file type.

Anonymous FTP from an Unknown IP Address

We can tie together some features of routing and the Domain Name System using anonymous FTP. In Section 14.5 we talked about pointer queries in the DNS: taking an IP address and returning the hostname. Unfortunately not all system administrators set up their name servers correctly with regard to pointer queries. They often add new hosts to the file required for name-to-address mapping, but forget to add them to the file for address-to-name mapping. We often see this with traceroute, when it prints an IP address instead of a hostname.

Some anonymous FTP servers require that the client have a valid domain name. This allows the server to log the domain name of the host that's doing the transfer. Since the only client identification the server receives in the IP datagram from the client is the IP address of the client, the server can call the DNS to do a pointer query, and obtain the domain name of the client. If the name server responsible for the client host is not set up correctly, this pointer query can fail.
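A sketch of the server's side of this check follows. It is an illustration under our own assumptions, not the code of any FTP server: the function names are ours, and the resolver is passed in as a parameter so the logic can be tried without a live name server. By default it uses the standard library's socket.gethostbyaddr.

```python
import ipaddress
import socket
from typing import Callable, Optional


def ptr_query_name(ip: str) -> str:
    """Return the domain name that a pointer query for this address looks up."""
    return ipaddress.ip_address(ip).reverse_pointer


def lookup_client_name(ip: str, resolver: Callable = socket.gethostbyaddr) -> Optional[str]:
    """Map a client's IP address to a hostname, or return None if the query fails."""
    try:
        hostname, _aliases, _addresses = resolver(ip)
    except (socket.herror, socket.gaierror):
        return None  # no PTR record, or the name servers could not be reached
    return hostname


# The name queried for the address used later in this section.
print(ptr_query_name("140.252.13.67"))  # 67.13.252.140.in-addr.arpa
```

A server that requires a valid domain name would refuse the login whenever lookup_client_name returns None.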

To see this error we'll do the following steps.

1. Change the IP address of our host slip (see the figure on the inside front cover) to 140.252.13.67. This is a valid IP address for the author's subnet, but not entered into the name server for the noao.edu domain.
2. Change the destination IP address of the SLIP link on bsdi to 140.252.13.67.
3. Add a routing table entry on sun that directs datagrams for 140.252.13.67 to the router bsdi. (Recall our discussion of this routing table in Section 9.2.)

Our host slip is still reachable across the Internet, because we saw in Section 10.4 that the routers gateway and netb just sent any datagram destined for the subnet 140.252.13 to the router sun. Our router sun knows what to do with these datagrams from the routing table entry we made in step 3 above. What we have created is a host with complete Internet connectivity, but without a valid domain name. That is, a pointer query for the IP address 140.252.13.67 will fail.

We now use anonymous FTP to a server that we know requires a valid domain name:

    slip % ftp ftp.uu.net
    Connected to ftp.uu.net.
    220 ftp.UU.NET FTP server (Version 2.0WU(13) Fri Apr 9 20:44:32 EDT 1993) ready.
    Name (ftp.uu.net:rstevens): anonymous
    530- Sorry, we're unable to map your IP address 140.252.13.67 to a hostname
    530- in the DNS. This is probably because your nameserver does not have a
    530- PTR record for your address in its tables, or because your reverse
    530- nameservers are not registered. We refuse service to hosts whose
    530- names we cannot resolve. If this is simply because your nameserver is
    530- hard to reach or slow to respond then try again in a minute or so, and
    530- perhaps our nameserver will have your hostname in its cache by then.
    530- If not, try reaching us from a host that is in the DNS or have your
    530- system administrator fix your servers.
    530 User anonymous access denied..
    Login failed.
    Remote system type is UNIX.
    Using binary mode to transfer files.
    ftp> quit
    221 Goodbye.

The error reply from the server is self-explanatory.