2.5 The File Transfer Protocol

In the Web's early days, FTP was an important part of the authoring process, as discussed in Section 2.1.3. Users edited and saved pages locally, then used FTP, a standard and widely interoperable protocol, to transfer the finished pages to a Web server.

FTP servers hosted massive amounts of data when HTTP was invented, and they still do today. Web pages have been able to link to FTP resources since the CERN project's early days because Tim Berners-Lee saw that the more information was available through the Web (even if you were directed out of the Web), the more valuable the Web would be. It was natural to use FTP to fill in the gaps left by HTTP. Not only could users' FTP clients upload files to the FTP file space, these same clients could also upload files to the HTTP file space on the server. FTP supports creating, updating, moving, and deleting files and folders (a copy can be made by downloading a file and uploading it under a new name), so it filled in most of the missing HTTP authoring features.

The primary advantage of this combination is the ubiquity of FTP servers and clients. Almost every server platform that supports HTTP also has an FTP server available today, and most client operating systems come with FTP client support (even if on Windows the FTP tool is obscure and seldom used).

Since FTP is a ubiquitous tool that supports remote file authoring, why not continue to use FTP to author Web sites? The WebDAV inventors took the trouble to invent WebDAV because FTP has many subtle drawbacks, some specifically related to Web authoring. Let's look at some of those concerns.

2.5.1 Missing Functionality

FTP does not solve the lost update problem. If the user uploads a file with a name identical to an existing file on the remote server, the remote file is overwritten. There's no way to mark the server's copy of the file to let other users know it's being edited.
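The lost update problem can be illustrated with a minimal sketch. The FileStore class below is a hypothetical in-memory stand-in for an FTP server's file space, not a real FTP client or server; its retrieve and store methods only mimic the unconditional behavior of the FTP RETR and STOR commands. The author names and page contents are made up.

```python
class FileStore:
    """Hypothetical in-memory stand-in for an FTP server's file space."""

    def __init__(self):
        self.files = {}

    def retrieve(self, path):
        # Like FTP RETR: return the current contents of the file.
        return self.files[path]

    def store(self, path, data):
        # Like FTP STOR: unconditionally replace whatever is there.
        self.files[path] = data


def simulate_lost_update():
    server = FileStore()
    server.store("/index.html", "<h1>Welcome</h1>")

    # Two authors download the same page at about the same time.
    alice_copy = server.retrieve("/index.html")
    bob_copy = server.retrieve("/index.html")

    # Each edits a private copy, unaware of the other.
    alice_copy += "<p>Alice's news item</p>"
    bob_copy += "<p>Bob's contact details</p>"

    # Both upload. The second upload silently replaces the first.
    server.store("/index.html", alice_copy)
    server.store("/index.html", bob_copy)
    return server.retrieve("/index.html")


final_page = simulate_lost_update()
print("Alice's change survived:", "Alice" in final_page)
print("Bob's change survived:", "Bob" in final_page)
```

Neither author receives a warning: the server has no record that the file was checked out, and no way to refuse the second upload.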

FTP lacks metadata support. Document authoring systems require properties for bibliographic metadata such as the author of the document, subject, keywords, and a brief description, and FTP offers no way to store or retrieve such properties alongside a file.

2.5.2 Poor Navigation

FTP does not have a standard way to list directory contents. Directory listings look different on Windows and Unix servers, regardless of the user's client or operating system. Although graphical FTP clients exist, they may not work with all FTP servers, and they work better for browsing and downloading than for authoring. FTP was originally designed not for graphical clients but for command-line interfaces.
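To make the listing problem concrete, here is a sketch of what a client has to do to read the same directory entry from two kinds of servers. The two sample lines are invented for illustration, in the styles commonly produced by Unix-like and DOS-style Windows FTP servers; the patterns and the parse_listing_line helper are assumptions of this sketch and cover only these two simple layouts, not every format found in practice.

```python
import re

# The same file as it might appear in two listing styles (sample data).
UNIX_LINE = "-rw-r--r--   1 webuser  staff      2048 Mar 04 09:15 index.html"
DOS_LINE = "03-04-03  09:15AM                 2048 index.html"

# Unix style: permissions, link count, owner, group, size, date, name.
UNIX_PATTERN = re.compile(
    r"^(?P<type>[-dl])\S{9}\s+\d+\s+\S+\s+\S+\s+(?P<size>\d+)\s+"
    r"\w{3}\s+\d{1,2}\s+[\d:]+\s+(?P<name>.+)$"
)

# DOS style: date, time, then either <DIR> or a size, then the name.
DOS_PATTERN = re.compile(
    r"^\d{2}-\d{2}-\d{2,4}\s+\d{2}:\d{2}[AP]M\s+"
    r"(?P<size><DIR>|\d+)\s+(?P<name>.+)$"
)


def parse_listing_line(line):
    """Return a dict describing one entry, or None if the format is unknown."""
    match = UNIX_PATTERN.match(line)
    if match:
        return {
            "name": match.group("name"),
            "size": int(match.group("size")),
            "is_dir": match.group("type") == "d",
        }
    match = DOS_PATTERN.match(line)
    if match:
        is_dir = match.group("size") == "<DIR>"
        return {
            "name": match.group("name"),
            "size": None if is_dir else int(match.group("size")),
            "is_dir": is_dir,
        }
    return None


print(parse_listing_line(UNIX_LINE))
print(parse_listing_line(DOS_LINE))
```

A client that meets a third layout gets None back and cannot show the directory at all, which is the practical cost of having no standard listing format.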

Addresses are file system paths, not Web URLs. Since FTP is designed for file transfer between file systems, it exposes the file system hierarchy. To upload files with FTP, users need to understand the mapping between FTP file paths and HTTP URLs on their Web server. There is no way to automatically discover this mapping. A Web site author has to know the mapping scheme, and these schemes may be nonintuitive and vary from server to server. A Web authoring protocol should natively use HTTP URLs to avoid address mapping confusion.

2.5.3 Some Performance Concerns

FTP is a stateful protocol, so a server can handle fewer active users. FTP keeps the network connection open and maintains state information such as the authentication credentials and the current working directory. In 1996, Internet servers could not support very many simultaneous TCP connections, and FTP users became accustomed to annoying "connection refused" messages on high-traffic systems.

FTP requires extra connections, reducing active user limits even more. FTP requires two TCP connections, a control channel and a data channel, to be held open for each session. The client must first establish the control connection, then select the file to download, and then establish the data connection. This takes longer than establishing an HTTP connection, since HTTP only requires one TCP connection to both identify the file and download it. Sometimes an entire HTTP transaction can be completed before an FTP session is fully established.
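The difference in setup cost can be counted in a rough sketch. The step lists below are a simplified model, assuming a password login and a passive-mode FTP transfer; real clients often send further commands (for example, to change directory or set the transfer type), and server replies are not modeled. The summarize helper and the step labels are hypothetical names used only here.

```python
# Each step is (kind, description). "tcp_connect" opens a new TCP
# connection; "command" is one client request that needs a server reply.
FTP_DOWNLOAD = [
    ("tcp_connect", "open control channel"),
    ("command", "USER: send the user name"),
    ("command", "PASS: send the password"),
    ("command", "PASV: ask the server for a data port"),
    ("tcp_connect", "open data channel"),
    ("command", "RETR: request the file"),
]

HTTP_DOWNLOAD = [
    ("tcp_connect", "open connection"),
    ("command", "GET: identify and request the file"),
]


def summarize(steps):
    """Count TCP connections and client commands in a step list."""
    connections = sum(1 for kind, _ in steps if kind == "tcp_connect")
    commands = sum(1 for kind, _ in steps if kind == "command")
    return {"connections": connections, "commands": commands}


print("FTP: ", summarize(FTP_DOWNLOAD))
print("HTTP:", summarize(HTTP_DOWNLOAD))
```

Even in this minimal model, FTP needs twice as many connections and several more request and reply exchanges before the first byte of the file arrives.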

FTP Addresses

FTP sites commonly use a domain name beginning with "ftp". There are some conventions for naming FTP directories, such as putting general-use files under the directory "pub". Suppose each of the following is a valid document address:

  1. ftp://ftp.example.com/pub/contrib/index.html

  2. ftp://ftp.example.com/pub/doc/www/index.html

  3. ftp://www.example.com/pub/doc/www/index.html

What the user wants to do is edit a Web page with this address:

 
 http://www.example.com/index.html 

The client software has no way of knowing which FTP address, if any, can be used to update the correct Web page. The user must know what FTP server and path to use for each Web site.
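In practice, this means the mapping has to be supplied by hand. The sketch below shows one way an authoring tool might hold such a mapping; the SITE_MAPPINGS table and the ftp_address_for function are hypothetical, and the choice of the second address above as the correct one is an arbitrary assumption made only for the example.

```python
# Hypothetical, hand-maintained table: HTTP URL prefix -> FTP URL prefix.
# Nothing in FTP or HTTP lets a client discover these entries; the
# author has to know them and enter them for every site.
SITE_MAPPINGS = {
    "http://www.example.com/": "ftp://ftp.example.com/pub/doc/www/",
}


def ftp_address_for(http_url, mappings=SITE_MAPPINGS):
    """Translate an HTTP URL to an FTP URL, or return None if unmapped."""
    for http_prefix, ftp_prefix in mappings.items():
        if http_url.startswith(http_prefix):
            return ftp_prefix + http_url[len(http_prefix):]
    return None


print(ftp_address_for("http://www.example.com/index.html"))
print(ftp_address_for("http://intranet.example.com/index.html"))
```

The second lookup returns None: without a table entry supplied by someone who knows the server's layout, the tool has nothing to go on.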


2.5.4 Lessons Learned

FTP was intended for the transfer of files across the widely divergent storage systems available in the 1970s; storing metadata and fixing the lost update problem were simply not design goals. Hence, the lack of support for these features does not lessen the importance of FTP's achievements in interoperable cross-system file transfer. Although FTP wasn't seriously considered by experts in the area as a long-term standard Web authoring solution, it did serve as a conscious example for what a Web authoring protocol should do and how it might do it successfully.

FTP's Advantages

FTP does, of course, have some advantages over WebDAV for Web authoring. Its first advantage is its ubiquity.

A subtler advantage is that FTP always addresses the original file. FTP servers never transform files. Web servers, by contrast, transform dynamic pages (such as Java Server Pages or Active Server Pages) that contain code for the server to execute, so the contents of the downloaded Web page can change for every viewer or even every download.

As we'll see much later, WebDAV attempts to deal with the problem of whether to retrieve the original source code or the dynamically generated output, but it fails to solve the entire problem. FTP works around the source-versus-output problem because it's a completely different protocol that always retrieves and updates the original.




WebDAV. Next Generation Collaborative Web Authoring
ISBN: 130652083
EAN: N/A
Year: 2003
Pages: 146
