19.2 Uniform Resource Locators (URLs)


A Uniform Resource Locator (URL) has the form scheme:location. The scheme refers to the method used to access the resource (e.g., HTTP), and the location specifies where the resource resides.

Example 19.1

The URL http://www.usp.cs.utsa.edu/usp/simple.html specifies that the resource is to be accessed with HTTP. This particular resource, usp/simple.html, is located on the server www.usp.cs.utsa.edu.
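
The scheme:location form can be illustrated with a short C sketch that splits a URL at its first colon. The function name split_url and its buffer-based interface are hypothetical choices made for this illustration; they are not part of any library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the scheme of url into scheme (capacity len)
 * and return a pointer to the location part that follows the first ':'.
 * Returns NULL if url has no ':', has an empty scheme, or if the scheme
 * does not fit in the buffer. */
const char *split_url(const char *url, char *scheme, size_t len) {
    const char *colon;
    size_t n;

    assert(url != NULL && scheme != NULL);
    colon = strchr(url, ':');
    if (colon == NULL)
        return NULL;
    n = (size_t)(colon - url);        /* number of characters in the scheme */
    if (n == 0 || n >= len)
        return NULL;
    memcpy(scheme, url, n);
    scheme[n] = '\0';
    return colon + 1;                 /* the location starts after the ':' */
}
```

Applied to the URL of Example 19.1, this sketch yields the scheme http and the location //www.usp.cs.utsa.edu/usp/simple.html.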

While http is not the only valid URL scheme, it is certainly the most common one. Other schemes include ftp for file transfer, mailto for mail through a browser or other web client, and telnet for remote shell services. The syntax for http URLs is as follows:

 http_URL = "http:"  "//" host [ ":" port ] [abs_path [ "?" query]] 

The optional fields are enclosed in brackets. The host field should be the human-readable name of a host rather than a binary IP address (Section 18.8). The client (often a browser) determines the server location by obtaining the IP address of the specified host. If the URL does not specify a port, the client assumes port 80. The abs_path field refers to a path that is relative to the web root directory of the server. The optional query is not discussed here.

Example 19.2

The URL http://www.usp.cs.utsa.edu:8080/usp/simple.html specifies that the server for the resource is listening on port 8080 rather than the default port 80. The URL's absolute path is /usp/simple.html.
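
The http URL syntax given above can be turned into a small parser. The following is a minimal sketch, not a complete implementation: the struct http_url type, the function name parse_http_url and the buffer sizes are assumptions made for illustration. The sketch matches the scheme case-sensitively, discards any query, and substitutes "/" when the URL has no abs_path, which is how HTTP treats an empty path.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

#define DEFAULT_HTTP_PORT 80

/* Hypothetical container for the parts of an http URL. */
struct http_url {
    char host[256];
    int port;
    char abs_path[1024];
};

/* Parse url according to
 *   http_URL = "http:" "//" host [ ":" port ] [ abs_path [ "?" query ] ]
 * Returns 0 on success, -1 if url does not match or a field is too long. */
int parse_http_url(const char *url, struct http_url *out) {
    static const char prefix[] = "http://";
    const size_t prefix_len = sizeof(prefix) - 1;
    const char *p;
    size_t len;

    assert(url != NULL && out != NULL);
    if (strncmp(url, prefix, prefix_len) != 0)
        return -1;
    p = url + prefix_len;

    len = strcspn(p, ":/?");          /* host ends at ':', '/', '?' or end */
    if (len == 0 || len >= sizeof(out->host))
        return -1;
    memcpy(out->host, p, len);
    out->host[len] = '\0';
    p += len;

    out->port = DEFAULT_HTTP_PORT;    /* used when no port is given */
    if (*p == ':') {
        char *end;
        long port;
        if (!isdigit((unsigned char)p[1]))
            return -1;
        port = strtol(p + 1, &end, 10);
        if (port < 1 || port > 65535)
            return -1;
        out->port = (int)port;
        p = end;
    }

    if (*p == '\0') {                 /* no abs_path: assume "/" */
        strcpy(out->abs_path, "/");
        return 0;
    }
    if (*p != '/')                    /* abs_path must start with '/' */
        return -1;
    len = strcspn(p, "?");            /* stop before any query */
    if (len >= sizeof(out->abs_path))
        return -1;
    memcpy(out->abs_path, p, len);
    out->abs_path[len] = '\0';
    return 0;
}
```

For the URL of Example 19.2, this sketch fills in the host www.usp.cs.utsa.edu, the port 8080 and the absolute path /usp/simple.html. For the URL of Example 19.1, which names no port, it reports port 80.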

When a user opens a URL in a browser, the browser parses the URL to extract the server's host name and makes a TCP connection to that host on the specified port. The browser then uses HTTP, described in the next section, to send the server a request for the resource designated by the URL's absolute path.
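
The connection step can be sketched in C with the standard getaddrinfo and connect calls. The function name connect_to_host is hypothetical, and the use of getaddrinfo is a choice made for this sketch rather than the lookup method of Section 18.8. The sketch makes no attempt to restart a connect interrupted by a signal.

```c
#include <assert.h>
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: resolve host and open a TCP connection to it on
 * the given port. Returns a connected socket descriptor, or -1 on failure. */
int connect_to_host(const char *host, int port) {
    struct addrinfo hints;
    struct addrinfo *results;
    struct addrinfo *ai;
    char service[16];
    int fd = -1;

    assert(host != NULL);
    if (port < 1 || port > 65535)
        return -1;
    snprintf(service, sizeof(service), "%d", port);

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;      /* accept IPv4 or IPv6 addresses */
    hints.ai_socktype = SOCK_STREAM;  /* TCP */

    /* Obtain the IP address or addresses of the named host. */
    if (getaddrinfo(host, service, &hints, &results) != 0)
        return -1;

    /* Try each address in turn until one connection succeeds. */
    for (ai = results; ai != NULL; ai = ai->ai_next) {
        fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if (fd == -1)
            continue;
        if (connect(fd, ai->ai_addr, ai->ai_addrlen) == 0)
            break;                    /* connected */
        close(fd);
        fd = -1;
    }
    freeaddrinfo(results);
    return fd;
}
```

A client would pass the host and port taken from the URL, for example www.usp.cs.utsa.edu and 8080, and then write its request to the returned descriptor.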

Example 19.3

Figure 19.1 shows the location of a typical web server root directory (web) in the host file system. Only the part of the file system below the web directory root is visible and accessible through the web server. If the host name is www.usp.cs.utsa.edu, the image title.gif has the URL http://www.usp.cs.utsa.edu/usp/images/title.gif.

Figure 19.1. The root directory for the web server running on this host is /web . Only the boxed subtree is accessible through the Web.
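
A server might map a URL's absolute path onto its file system along the following lines. This is a minimal sketch under stated assumptions: the function names url_path_to_file and has_dotdot are hypothetical, and rejecting every path that contains a ".." component is one simple way to keep requests inside the web root. The sketch does not account for symbolic links that lead outside the root.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Return 1 if path contains a ".." component, 0 otherwise. */
static int has_dotdot(const char *path) {
    const char *p = path;

    while (*p != '\0') {
        size_t len = strcspn(p, "/");     /* length of the next component */
        if (len == 2 && p[0] == '.' && p[1] == '.')
            return 1;
        p += len;
        if (*p == '/')
            p++;                          /* skip the separator */
    }
    return 0;
}

/* Hypothetical helper: build the file system path for abs_path under
 * web_root and store it in buf (capacity buflen). Returns 0 on success,
 * -1 if abs_path is not absolute, tries to climb out of the root with
 * "..", or the result does not fit in buf. */
int url_path_to_file(const char *web_root, const char *abs_path,
                     char *buf, size_t buflen) {
    int n;

    assert(web_root != NULL && abs_path != NULL && buf != NULL);
    if (abs_path[0] != '/' || has_dotdot(abs_path))
        return -1;
    n = snprintf(buf, buflen, "%s%s", web_root, abs_path);
    if (n < 0 || (size_t)n >= buflen)
        return -1;
    return 0;
}
```

With the web root /web of Figure 19.1, the absolute path /usp/images/title.gif maps to the file /web/usp/images/title.gif, while a path such as /../etc/passwd is rejected.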


The specification of a resource location with a URL ties it to a particular server. If the resource moves, web pages that refer to the resource are left with bad links. The Uniform Resource Name (URN) gives more permanence to resource names than does the URL alone. The owner of a resource registers its URN and the location of the resource with a service. If the resource moves, the owner just updates the entry with the registration service. URNs are not in wide use at this time. Both URLs and URNs are examples of Uniform Resource Identifiers (URIs). Uniform Resource Identifiers are formatted strings that identify a resource by name, location or other characteristics.



Unix Systems Programming
UNIX Systems Programming: Communication, Concurrency and Threads
ISBN: 0130424110
EAN: 2147483647
Year: 2003
Pages: 274
