14.1. Uniform Resource Locators (URLs)

A URL points to an object on the Internet.^[*] It's a text string that identifies an item, tells you where to find it, and specifies a method for communicating with it or retrieving it from its source. A URL can refer to any kind of information source. It might point to static data, such as a file on a local filesystem, a web server, or an FTP site; or it can point to a more dynamic object such as an RSS news feed or a record in a database. URLs can even refer to less tangible resources such as Telnet sessions and email addresses.

^[*] The term URL was coined by the Uniform Resource Identifier (URI) working group of the IETF to distinguish URLs from Uniform Resource Names, or URNs, described later in this chapter. Both URLs and URNs are kinds of URI, which is the more general notion (see RFC 2396).

Because there are many different ways to locate an item on the Net, and because different media and transports require different kinds of information, URLs can take many forms. The most common form has four components: a network host or server, the name of the item, its location on that host, and a protocol by which to communicate with the host:

 protocol://hostname/path/item-name

Here, protocol (also called the "scheme") is an identifier such as http or ftp; hostname is usually an Internet hostname; and the path and item-name components together form a unique path that identifies the object on that host. Variants of this form allow extra information to be packed into the URL, such as a port number for the communications protocol or a fragment identifier that references a section inside a document. Other, more specialized types of URLs, such as "mailto" URLs for email addresses or URLs for addressing things like database components, may not follow this format precisely, but they do conform to the general notion of a protocol followed by a unique identifier.
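To make these components concrete, here is a minimal sketch that pulls a URL apart using the standard java.net.URI class. The class name UrlParts, the describe() helper, and the example.com addresses are made up for illustration; they are not part of any API discussed in this chapter.

```java
import java.net.URI;

// Hypothetical helper class, for illustration only.
class UrlParts {

    // Returns a one-line summary of the components of a URL string.
    static String describe(String text) {
        URI uri = URI.create(text);
        return "protocol=" + uri.getScheme()
            + " host=" + uri.getHost()
            + " port=" + uri.getPort()           // -1 when no port is given
            + " path=" + uri.getPath()
            + " fragment=" + uri.getFragment();  // null when there is no fragment
    }

    public static void main(String[] args) {
        // The common form, with an optional port number and fragment identifier
        System.out.println(
            describe("http://www.example.com:8080/docs/guide/index.html#intro"));

        // A "mailto" URL has no host or path, only a protocol
        // followed by an identifier.
        URI mail = URI.create("mailto:someone@example.com");
        System.out.println(mail.getScheme() + " -> " + mail.getSchemeSpecificPart());
    }
}
```

For the first URL, describe() reports the protocol http, the host www.example.com, port 8080, the path /docs/guide/index.html, and the fragment intro. The mailto URL yields only the protocol and the address that follows it.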

Since most URLs have the notion of a hierarchy or path, we sometimes speak of a URL that is relative to another URL, called a base URL. In that case, we are using the base URL as a starting point and supplying additional information to target an object relative to that URL. For example, the base URL might point to a directory on a web server; a relative URL might name a particular file in that directory or in a subdirectory.
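The following sketch shows one way to combine a base URL with a relative URL, again using java.net.URI. The RelativeUrls class, its resolve() helper, and the example.com addresses are assumptions made for this example only.

```java
import java.net.URI;

// Hypothetical helper class, for illustration only.
class RelativeUrls {

    // Interprets 'relative' against 'base' and returns the resulting URL.
    static String resolve(String base, String relative) {
        return URI.create(base).resolve(relative).toString();
    }

    public static void main(String[] args) {
        // The base URL points to a directory on a web server
        // (note the trailing slash).
        String base = "http://www.example.com/docs/guide/";

        // A file in that directory
        System.out.println(resolve(base, "index.html"));
        // A file in a subdirectory
        System.out.println(resolve(base, "images/logo.gif"));
        // A file one level above the base directory
        System.out.println(resolve(base, "../faq.html"));
    }
}
```

The trailing slash on the base URL matters. Without it, the last path segment (guide) is treated as an item name rather than a directory, so index.html would resolve to http://www.example.com/docs/index.html instead.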