D.2 PICS Labels


The PICS label specification defines the syntax for document labels. Labels can be obtained over the Web from a search service using an HTTP extension defined in the PICS standard. Alternatively, labels can be automatically included with a document, as part of the document's header.

Here is a PICS label that ranks a URL using the service described in the previous section:

(PICS-1.0 "http://moviescale.org/v1.0"
   labels
   on "2002.06.01T00:01-0500"
   until "2002.12.31T23:59-0500"
   for "http://www.missionimpossible.com/"
   by "Simson L. Garfinkel"
   ratings (r 0))

This label describes the web site for the Paramount movie Mission: Impossible using the fictitious labeling service described in the previous section. The label was created on June 1, 2002, and is valid until December 31, 2002. The label is for information stored at the URL http://www.missionimpossible.com/. The label was written by Simson L. Garfinkel. Finally, the label gives the rating "(r 0)."

Although the movie Mission: Impossible had a rating of "R," the web site has a rating of "G." (The value "G" is transmitted as 0 by the http://moviescale.org/v1.0 rating service.)

Ratings may include more than one transmitted value. For example, if a rating service defined two scales, a label rating might look like this: "(r 3 n 4)."
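The parenthesized label syntax shown above can be read with a very small parser. The following Python sketch is an illustration only, not part of the PICS standard: the function names (tokenize, parse, label_fields) are hypothetical, and it assumes a single label whose options come in keyword/value pairs followed by a parenthesized ratings group, as in the example above.

```python
import re

# A token is a parenthesis, a double-quoted string, or a bare word/number.
TOKEN = re.compile(r'\(|\)|"[^"]*"|[^\s()"]+')

def tokenize(label):
    """Split a label into its tokens."""
    return TOKEN.findall(label)

def parse(tokens):
    """Turn tokens into nested lists, one list per parenthesized group."""
    stack = [[]]
    for tok in tokens:
        if tok == "(":
            stack.append([])
        elif tok == ")":
            group = stack.pop()
            stack[-1].append(group)
        elif tok.startswith('"'):
            stack[-1].append(tok[1:-1])   # strip the surrounding quotes
        else:
            stack[-1].append(tok)
    return stack[0][0]

def label_fields(label):
    """Hypothetical helper: return (version, service, options, ratings)."""
    tree = parse(tokenize(label))
    version, service, rest = tree[0], tree[1], tree[2:]
    if rest and rest[0] in ("labels", "l"):
        rest = rest[1:]
    options, ratings = {}, {}
    i = 0
    while i + 1 < len(rest):
        key, value = rest[i], rest[i + 1]
        if key in ("ratings", "r") and isinstance(value, list):
            # Ratings are scale/value pairs, e.g. (r 3 n 4).
            ratings = {value[j]: float(value[j + 1])
                       for j in range(0, len(value) - 1, 2)}
        else:
            options[key] = value
        i += 2
    return version, service, options, ratings
```

Under these assumptions, a ratings group such as "(r 3 n 4)" comes back as a dictionary with one entry per scale.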

Labels can be substantially compressed by removing nearly all information except the ratings themselves. For example, the previous label could be transmitted like this:

(PICS-1.0 "http://moviescale.org/v1.0"   r 0)

Labels can optionally include an MD5 message digest hash of the labeled document. This allows software to determine whether the fetched document has been modified in any way since the label was created. Labels can also carry digital signatures, which allow labeling services to sign their own labels. A site could then distribute labels for its content that were created by a third-party labeling service, and users would have assurance that the labels have not been modified in any way.
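As a rough illustration of the digest check, the Python sketch below computes a Base64-encoded MD5 digest of a document and compares it with the value carried in a label. The function names are hypothetical, and the sketch assumes the digest is taken over the exact bytes of the fetched document; consult the PICS specification for the precise rules.

```python
import base64
import hashlib

def mic_md5(document: bytes) -> str:
    """Return the MD5 digest of the document bytes, Base64-encoded."""
    digest = hashlib.md5(document).digest()
    return base64.b64encode(digest).decode("ascii")

def document_unchanged(document: bytes, label_digest: str) -> bool:
    """True if the fetched bytes hash to the digest stored in the label."""
    return mic_md5(document) == label_digest
```

A browser or proxy could call document_unchanged after fetching a page and treat a mismatch as a sign that the label no longer describes the content.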

Here is a complete description of all of the fields in revision 5 of the label format.

Information about the document that is labeled:

at quoted-ISO-date

The modification date of the item being rated. The standard proposes using the modification date "as a less expensive, but less reliable, alternative to the message integrity check (MIC) options."

MIC-md5 "Base64-string"
or MD5 "Base64-string"[A]

The MD5 hash value of the item being rated.

Information about the document label itself:

by name

The name of the person or organization that rated the item. The name, like all strings in the label specification, may be either a human-readable quoted name or a Base64-encoded string.

for URL

The URL of the item to which this rating applies.

generic boolean

If boolean is "true," the label applies to every item whose URL begins with the "for" URL. This is useful for rating an entire site or a set of documents within a particular directory. If it is "false," the rating applies only to the document at the "for" URL.

on quoted-ISO-date

The date on which the rating was issued.

signature-RSA-MD5 "Base64-string"

An RSA digital signature for the label.

until quoted-ISO-date
exp quoted-ISO-date

The date on which this rating expires.

Other information:

comment acomment

A comment. It is meant for people who may see the label and has no meaning to software.

complete-label quotedURL
full quotedURL

A URL of the complete label. The idea of this field is that an abridged label might be sent with a document in the interest of minimizing transmission time. Then, if a piece of software wants the complete label, that software can get it from the quotedURL.

extension quotedURL data

Extensions are a formal means by which the PICS standard can be extended. The extension keyword introduces additional data that is used by an extension. Each extension must include a URL that indicates where the extension is documented. This is designed to avoid duplication of extension names. For example, both China and Singapore could adopt "monitoring" extensions that might be used to transmit to the web browser a unique serial number used to track every download of every labeled document. However, the two countries might adopt slightly different monitoring extensions. As one extension would have a URL of http://censorship.gov.cn/monitoring.html and the other would have a URL of http://censorship.gov.sg/monitoring.html, the two extensions would not conflict even though they had the same name. A list of extensions currently in use appears at http://w3.org/PICS/extensions. There were no such extensions at the time this book was published.
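The for, generic, on, and until fields listed above are enough to decide whether a label covers a given document at a given time. The following Python sketch shows one way to make that decision. The function names are hypothetical, and the date format string assumes dates written as in the example label ("2002.12.31T23:59-0500").

```python
from datetime import datetime, timezone

# Assumed layout of a quoted-ISO-date: YYYY.MM.DDTHH:MM followed by a zone offset.
PICS_DATE_FORMAT = "%Y.%m.%dT%H:%M%z"

def parse_pics_date(text):
    """Parse a label date into a timezone-aware datetime."""
    return datetime.strptime(text, PICS_DATE_FORMAT)

def label_covers(document_url, for_url, generic=False):
    """A generic label covers every URL that starts with the "for" URL;
    a non-generic label covers only the "for" URL itself."""
    if generic:
        return document_url.startswith(for_url)
    return document_url == for_url

def label_is_current(on, until, now=None):
    """True if 'now' falls between the issue date and the expiry date."""
    if now is None:
        now = datetime.now(timezone.utc)
    return parse_pics_date(on) <= now <= parse_pics_date(until)
```

A filter would normally require both checks to pass before trusting the ratings in a label.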

D.2.1 Labeled Documents

The PICS standard allows for PICS labels to be automatically transmitted with any message that uses an RFC 822 header. These headers are used by Internet email, HTTP, and Usenet news protocols. This allows for convenient labeling of information transmitted over these systems.

The PICS RFC 822 header is PICS-Label. The format is:

PICS-Label: labellist

For example, the following email message might contain some explicit, racy material. Or, it might be about some medical experiments. Or maybe it has to do with one roommate playing a joke on another after a party. Or it could be an exercise in surreal literature. Whatever it may be, we can use the PICS label to determine something about content and whether we should avoid reading the full text, thereby saving ourselves from shock and embarrassment. (Alternatively, we could use the labels to quickly scan a mail archive and zero in on the "good ones"):

To: saras@ex.com
From: wendy@ex.com
Date: Tue, 26 Nov 2002 14:05:55 -0500
Subject: Last Night
PICS-Label: (PICS-1.1 "http://www.rsac.org/1.0/" v 0 s 4 n 4 l 4)

Dearest Sara,

You passed out last night before the action really got started, so I wanted to send you a detailed description of what we did ...
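Because PICS-Label is an ordinary RFC 822 header, standard mail-parsing libraries can extract it. The sketch below uses Python's email module to do so. The function name pics_labels and the sample message are illustrative only.

```python
from email import message_from_string

def pics_labels(raw_message):
    """Return the value of every PICS-Label header in an RFC 822 message."""
    msg = message_from_string(raw_message)
    return msg.get_all("PICS-Label", [])

# Sample message modeled on the example above (body shortened).
RAW = (
    "To: saras@ex.com\n"
    "From: wendy@ex.com\n"
    "Subject: Last Night\n"
    'PICS-Label: (PICS-1.1 "http://www.rsac.org/1.0/" v 0 s 4 n 4 l 4)\n'
    "\n"
    "Dearest Sara, ...\n"
)

labels = pics_labels(RAW)
```

A mail reader could inspect the returned labels before deciding whether to display the message body.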

D.2.2 Requesting PICS Labels by HTTP

PICS defines an extension to the HTTP protocol that allows you to request a PICS header along with the document. The extension requires that you send a Protocol-Request command after the HTTP GET command. The Protocol-Request command contains a tag that specifies the rating services whose labels you want.

For example, to request a document using HTTP with the RSAC labels, a client might send an HTTP request such as this:

GET / HTTP/1.0
Protocol-Request: {PICS-1.1 {params minimal {services "http://www.rsac.org/1.0"}}}

The keyword "minimal" in the HTTP request specifies the amount of information that is requested. Options include minimal, short, full, and complete-label.
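A client could assemble the request shown above along the following lines. This Python sketch only builds the text of the request; it does not open a network connection. The function names are hypothetical, and the header layout simply mirrors the example.

```python
def protocol_request(services, detail="minimal"):
    """Build the value of the Protocol-Request header for the given
    rating-service URLs and level of detail."""
    quoted = " ".join('"%s"' % url for url in services)
    return "{PICS-1.1 {params %s {services %s}}}" % (detail, quoted)

def build_request(path, services, detail="minimal"):
    """Build a complete HTTP/1.0 request, with CRLF line endings."""
    return (
        "GET %s HTTP/1.0\r\n"
        "Protocol-Request: %s\r\n"
        "\r\n" % (path, protocol_request(services, detail))
    )

request_text = build_request("/", ["http://www.rsac.org/1.0"])
```

Passing a different value for detail, such as "full", requests more of the label's fields.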

A PICS-enabled HTTP server might respond with this:

Date: Fri, 29 Nov 1996 21:43:40 GMT
Server: Stronghold+PICS/1.3.2 Ben-SSL/1.3 Apache/1.1.1
Content-type: text/html
PICS-Label: (PICS-1.1 "http://www.rsac.org/1.0/" v 0 s 0 n 2 l 0)

<HTML>
<HEAD>
<TITLE>Welcome to Deus Ex Machina Software, Inc.</TITLE>
...
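Once a client has the PICS-Label header, it can compare the transmitted values with limits chosen by the user. The Python sketch below reads the flat name/value pairs used in this example and applies a simple threshold test. The function names, the limits, and the choice to treat a missing scale as 0 are all assumptions made for illustration.

```python
import re

def flat_ratings(label):
    """Read the name/value pairs that follow the service URL in a
    flat label such as (PICS-1.1 "service" v 0 s 0 n 2 l 0)."""
    match = re.match(r'\(\s*PICS-[\d.]+\s+"[^"]*"\s*(.*)\)\s*$', label.strip())
    if not match:
        raise ValueError("not a flat PICS label")
    parts = match.group(1).split()
    return {parts[i]: int(parts[i + 1]) for i in range(0, len(parts) - 1, 2)}

def allowed(label, limits):
    """True if no rated scale exceeds the user's limit for that scale.
    A scale missing from the label is treated as 0 (an assumption)."""
    ratings = flat_ratings(label)
    return all(ratings.get(name, 0) <= limit for name, limit in limits.items())
```

For the response above, a user who permits nothing above 1 on the n scale would have the page blocked, because the label carries n 2.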

D.2.3 Requesting a Label from a Rating Service

The PICS standard also defines a way to request a label for a particular URL from a rating service. A rating service might be run by anybody. In 1996, the Simon Wiesenthal Center conducted a campaign asking Internet service providers to block access to Nazi hate literature that was on the Web; an alternative recommended by Resnick and Miller is that the Simon Wiesenthal Center could run a rating service, rating documents on the Web according to its view of their historical accuracy and propaganda level. SurfWatch, a vendor of blocking software, might run its own rating service that indicated the amount of nudity, sex, violence, and profane language in each particular document. Fundamentalist religious groups could rate pages on adherence to their particular beliefs. And militia groups could run a rating service that would put up increasing numbers of little black helicopter icons for pages they suspect have fallen under United Nations control. The potential is limited only by one's free time.

Rating services are supposed to respond to HTTP GET requests that encode database lookups in URLs. URLs should look like this:[B]

http://service.net/Ratings?opt=generic&u="http://www.some.com/somedoc.html"&s="http://www.some.rating.company/service.html"

[B] When the URL is actually sent in an HTTP GET request, it must be properly encoded. For example, the characters %3A must be used to represent a ":" and the characters %2F must be used to represent a "/". This encoding is specified by RFC 1738.

Several options are defined:

opt=normal

This indicates that the label for each URL specified should be sent. If no label is available for the specific URL, the server may send a generic label or a label for an ancestor URL. Omitting the opt completely has the same result.

opt=tree

This requests a tree of labels; that is, all of the labels for the site or for the requested subpart of the site.

opt=generic+tree

This requests a generic label for the specified tree.

u=objectURL

This specifies the URL for which a label is desired. More than one URL may be requested by including multiple u= specifications.

s=serviceURL

This specifies the URL for the particular rating service that is desired. If multiple services are requested, a label is returned for each.

format=aformat

This specifies the format in which the labels should be returned.

extension=aString

Specifies an extension that should be in effect for the label that is requested.

Thus, if a web browser were communicating with a rating service, the actual message sent to port 80 of the web server at service.net would be:

GET /Ratings?opt=generic&u="http%3A%2F%2Fwww.some.com%2Fsomedoc.html"&s="http%3A%2F%2Fwww.some.rating.company%2Fservice.html" HTTP/1.0

This message would be sent as a single line, with no line break or space inside the URL.
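The encoding step can be sketched in Python with urllib.parse.quote, which replaces ":" with %3A and "/" with %2F when no characters are declared safe. The function name rating_request_line is hypothetical, and the sketch follows the example above in leaving the surrounding double quotes unencoded.

```python
from urllib.parse import quote

def rating_request_line(path, object_url, service_url, opt="generic"):
    """Build the GET line sent to a rating service, percent-encoding
    the document URL and the service URL."""
    def quoted(url):
        # safe="" forces ":" and "/" to be encoded as %3A and %2F.
        return '"%s"' % quote(url, safe="")
    query = "opt=%s&u=%s&s=%s" % (opt, quoted(object_url), quoted(service_url))
    return "GET %s?%s HTTP/1.0" % (path, query)

line = rating_request_line(
    "/Ratings",
    "http://www.some.com/somedoc.html",
    "http://www.some.rating.company/service.html",
)
```

The resulting line contains no spaces other than the two that separate the method, the URL, and the protocol version.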

Web Security, Privacy and Commerce, 2nd Edition
ISBN: 0596000456
Year: 2000
Pages: 194
