1.1 What Is Authoring?

Web authoring is the process of creating, updating, and managing content on a Web server. A basic Web server offers a set of text files formatted in HTML. These files contain links to other files. The content hosted by a Web server can also include non-HTML documents as well as images, programs, music, and video.

Web authoring frequently takes place remotely; the user creates content on a client machine and then transfers that content to the Web server. To offer that functionality, the Web server must provide a way for clients to create new Web pages and give the Web pages names and locations. Clients must be able to see a Web page's current content and provide new content to the server to replace old content. Clients must also be able to delete obsolete Web pages.

Web authoring frequently involves more than one author. A Web site may be managed by a team of people collaborating on the same files, contributing individual expertise to create a rich set of pages. Multiple authors frequently try to update the same pages within a short period, without knowing that others are working on them. Supporting multiple authors makes an authoring solution harder to build, yet teams of Web content authors are common.

Originally, HTTP attempted to solve the authoring problem with a few simple operations: the ability to download (GET), upload (PUT), or delete (DELETE) a Web page. Although these operations provided some authoring functionality, many problems were left unsolved. Some of the most pressing were:

  • No method for creating a new folder/directory. Web pages, like files on any file system, are typically organized in directories.

  • No methods defined to move, copy, or rename Web pages.

  • No way for clients to request a standardized (software-parsable) listing of a folder's contents.

  • No way for two authors working on the same content to easily coordinate their changes.

Because these problems are in practice quite frustrating, basic HTTP has hardly ever been used for authoring without custom extensions.
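The three basic operations described above can be sketched as raw HTTP/1.1 request messages. This is only an illustration of the protocol's shape, not a working client; the host name and paths are hypothetical.

```python
# A minimal sketch of the three basic HTTP authoring operations
# (GET, PUT, DELETE) expressed as raw request-message text.
# The host (www.example.org) and paths are hypothetical.

def build_request(method, path, host, body=None):
    """Compose a bare-bones HTTP/1.1 request message as text."""
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    if body is not None:
        lines.append(f"Content-Length: {len(body.encode('utf-8'))}")
        lines.append("Content-Type: text/html")
    message = "\r\n".join(lines) + "\r\n\r\n"
    if body is not None:
        message += body
    return message

# Download the current page:
get = build_request("GET", "/index.html", "www.example.org")
# Upload a replacement (unconditionally overwrites whatever is there):
put = build_request("PUT", "/index.html", "www.example.org",
                    body="<html><body>New content</body></html>")
# Remove an obsolete page:
delete = build_request("DELETE", "/old.html", "www.example.org")
```

Note that nothing in these messages creates a directory, lists a folder, or coordinates with another author, which is exactly the gap the bullets above identify.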

1.1.1 The Bad Old Days

The first Web sites were necessarily created with text editors because no specialized tools existed. If the Web author happened not to be working directly on the Web server machine, File Transfer Protocol (FTP) [RFC959] would commonly be used to transmit the finished HTML file to the Web server. However, FTP was standardized in 1985, long before the Web came along, so it wasn't designed for authoring Web sites. It doesn't handle multiple authors gracefully, it doesn't use the same namespace as HTTP, and its directory listing format isn't well standardized. Obviously, something new was needed.

The first generation of popular Web authoring software consisted of HTML editors. These tools greatly improved the ability to change a local HTML file, but the process of getting the changed files to a remote server remained difficult. Many HTML editors did not attempt to do this at all, instead requiring the user to transfer the files separately to the Web server.

1.1.2 Usable but Proprietary Software

The second generation of Web authoring software attempted to solve the problem of putting the files on the Web server (as well as HTML editing) but did so with proprietary or awkward methods. Each major authoring package had significant incompatibilities with other software.

  • Vermeer FrontPage, which became Microsoft FrontPage, used a proprietary protocol to remotely manage pages. A couple of non-Microsoft servers work with the FrontPage protocol, but few non-Microsoft client implementers have adopted it.

  • Macromedia Dreamweaver and Allaire's ColdFusion provided some distributed authoring capabilities (including the ability to lock Web pages for single-author edits in Dreamweaver), but through proprietary protocols or protocol extensions only.

  • Netscape Navigator Gold had the capability to place files on the server, using either FTP or the HTTP PUT request. This solution worked well enough for single-author Web sites but didn't work well with multiple authors because both FTP and HTTP PUT overwrite other authors' changes.
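The overwriting problem mentioned in the last bullet is the classic "lost update." It can be sketched by simulating the server as a simple dictionary; the paths and content are hypothetical, and real PUT behaves analogously.

```python
# A sketch of the "lost update" problem with unconditional overwrite.
# The Web server is simulated as a dict mapping paths to page content.

server = {"/index.html": "<p>original</p>"}

def http_get(path):
    return server[path]

def http_put(path, content):
    # PUT simply replaces the resource; there is no check that the
    # author's starting copy matches what the server currently holds.
    server[path] = content

# Two authors download the same page...
alice_copy = http_get("/index.html")
bob_copy = http_get("/index.html")

# ...each edits a private local copy...
alice_copy += "<p>Alice's addition</p>"
bob_copy += "<p>Bob's addition</p>"

# ...and each uploads. Bob's PUT silently wipes out Alice's change,
# because his copy never contained it.
http_put("/index.html", alice_copy)
http_put("/index.html", bob_copy)
```

After the second PUT, Alice's work is simply gone, with no error reported to either author.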

Professional Web development teams sometimes use source control repositories known as Configuration Management (CM) systems. Concurrent Versions System (CVS) on Unix and Microsoft SourceSafe are both CM servers. CM servers typically use proprietary or custom protocols, which are difficult for other products to adopt and support. Although CM provides excellent functionality for multiple authors and even allows recovery of older versions of any document, it is unwieldy for Web site development. For example, it is frequently difficult or undesirable to host the Web site directly from the files in the CM repository. Therefore, any time the Web site is tested or released, its latest pages must be copied to a Web server.

1.1.3 Web Authoring via Web Forms

A number of Internet services and sites (today including MSN, AOL, GeoCities, and Yahoo!) host small personal Web sites for a large number of members. To manage changes submitted by all these users, these services use custom-built HTML forms to allow Web page editing.

An HTML form is a text file formatted in HTML with HTML tags identifying fields for a user to fill in and usually text explaining to the user what information to enter into each field. All Web browsers are capable of displaying basic HTML forms with a standard set of fields, collecting the user input, and returning the field values in a standard format to the server.
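The "standard format" in which browsers return field values is URL-encoded form data (application/x-www-form-urlencoded). The round trip can be sketched with Python's standard library; the form and its field names are hypothetical.

```python
# A sketch of the HTML form round trip described above. The
# page-editing form and its field names (title, body) are made up
# for illustration.
from urllib.parse import urlencode, parse_qs

form_html = """
<form method="POST" action="/edit">
  Page title: <input type="text" name="title">
  Page text: <textarea name="body"></textarea>
  <input type="submit" value="Save">
</form>
"""

# What the browser submits after the user fills in the fields:
# each value is percent-encoded so that characters like '&' and '<'
# survive transmission.
submission = urlencode({"title": "My Home Page",
                        "body": "<p>Hello & welcome!</p>"})

# What the server decodes back out with a stock parser:
fields = parse_qs(submission)
```

Because every browser implements this encoding, a hosting service can accept page edits from any user without distributing special software.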

HTML forms are the lowest common denominator tool for solving the Web site authoring problem. A Web-based user interface does not require special software downloads. For example, in 2003 Yahoo! GeoCities hosted PageWizards [GeoCities02], [GeoCities03]. A PageWizard was a series of forms that walked the user through a number of steps to end up with a finished personal Web site.

Web-based user interfaces have some serious drawbacks:

  • Users have very little control over their content because the content can't be accessed directly. For example, users can't back up their Web sites or synchronize them with local working copies.

  • These interfaces are usually designed for use by one author only and cannot handle multiple authors managing the same content.

  • Web interfaces are slow because the entire page and even images are transmitted to the client with each page refresh or page change.

  • Web interfaces have very poor capabilities for text editing, so frequently authors want to use their own text editing tool.

  • Using a text editing tool together with the Web-based authoring interface requires multiple steps and is error-prone, as I'll explain in the next section.

1.1.4 Multiple Stages Multiply Errors

The authoring systems we've discussed so far (proprietary, CM, and Web form based) all had clear interoperability problems, so many Web sites continued to use FTP for authoring. Every site had different conventions for relating FTP locations to Web locations, so it was difficult for authors to figure out where to put files on the FTP server. If the site required authorization, users could find it frustrating to log in to the FTP server as well as the Web server, possibly with different passwords. However, these are only minor frustrations compared to losing all the work you've done in a day, a potential outcome when an updated file is overwritten accidentally.

When authoring requires downloading and uploading files using different pieces of software in several stages, errors multiply. The stages include:

  1. Browse to the correct file on the Web or FTP server. Save the file locally, choosing a name and location.

  2. Edit the file, making sure all links will work even when the file is in a different location.

  3. Open another piece of software, such as an FTP client.

  4. Open a connection to the correct server.

  5. Navigate to the correct remote location.

  6. Browse the local file system looking for the changed file that was just saved, and upload it.

Saving to the local file system can be a problem because the local file system might have different file-naming rules than the server. Unix has always allowed file extensions like ".html," but Windows file systems used to restrict the extension to three characters. Unix treats names with different capitalization as different file names, but Windows does not, so a Windows user could upload index.HTM instead of index.htm. Because of the lost update problem, sites frequently have backup directories, but the user then has to choose between multiple locations of the same file. Files named default.htm and index.htm are so common that the wrong one may be updated.
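The capitalization hazard is easy to demonstrate. On a case-insensitive file system the two names below refer to one file, so the author sees nothing wrong locally, but a case-sensitive Unix Web server treats them as two distinct resources. This sketch models only the name comparison, not real file system access.

```python
# A sketch of the capitalization hazard: names that differ only in
# case collide on a case-insensitive file system (as on Windows)
# but name distinct files on a case-sensitive one (as on Unix).

def same_file_windows(a, b):
    # Case-insensitive comparison, as on FAT/NTFS.
    return a.casefold() == b.casefold()

def same_file_unix(a, b):
    # Case-sensitive comparison, as on typical Unix file systems.
    return a == b

# Locally these look like one file, so the author may never notice
# the inconsistent capitalization...
local_collision = same_file_windows("index.htm", "index.HTM")   # True

# ...but after upload, the Unix Web server sees two resources, and
# requests for /index.htm miss the content saved as index.HTM.
server_collision = same_file_unix("index.htm", "index.HTM")     # False
```

The result is a page that works on the author's machine but returns stale content, or nothing at all, once uploaded.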

These mistakes may not seem too likely when considered theoretically, but I make them often enough, particularly if my work is interrupted. My most common error is to save a file to two different local directories on different occasions, creating two local copies. When I upload the file to a server, I sometimes upload the wrong copy. I've also made backups of files and then overwritten the copy that had my recent changes with the copy from backup. Most memorably, I did that with the references file for this book after a month of improvements. I know I'm sometimes careless, but I also know that other users make the same mistakes, aided by software that doesn't make it easy enough to do things properly.



WebDAV. Next Generation Collaborative Web Authoring
ISBN: 130652083
EAN: N/A
Year: 2003
Pages: 146
