Appendix A. XML Schemas and Document Type Definitions for Policy Statements


XML is an excellent method of expressing policies. It is human readable, yet software can process it. In its simplest form, XML is text surrounded by tags, in much the same vein as HTML. HTML 4 itself is defined in SGML rather than XML, but XHTML 1.0 recasts the same tag set as XML.

A tag is a name enclosed in a pair of angle brackets (< and >). Tags normally come in pairs: an opening tag with the name between the brackets, and a closing tag with a slash before the name. An element with no content can also be written as a single tag that ends with a slash, such as <State />. For example, an ILM field can be described in XML as:

 <ILM> Some ILM Policy </ILM> 

XML allows for hierarchical structures through the use of nested tags. This allows very complex structures to be designed while maintaining the readability of the text. Many XML-based applications are also tolerant of additions: like web browsers, they ignore tags that they don't understand. (A parser that validates against a strict schema or DTD will still reject undeclared tags.) That tolerance allows standard applications to process the XML document, while specialized applications can take advantage of additional information encoded in it. XML is a naturally extensible language, which makes it a prime candidate for expressing DLM and ILM policies.
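As a minimal sketch of nesting and of ignoring unknown tags, the Python fragment below reads a small policy using only the standard library. The element names follow the ILM example later in this appendix; the attribute values, the rule text, and the Audit_Note tag are hypothetical and exist only for illustration.

```python
import xml.etree.ElementTree as ET

# A nested policy. <Audit_Note> stands in for a tag added by a newer tool
# that this reader was never written to understand (hypothetical).
DOC = """
<Policy Name="archive" Owner="records" Description="Move stale files">
  <Trigger>
    <State />
    <Rule>idle for 365 days</Rule>
  </Trigger>
  <Action>Move</Action>
  <Audit_Note>added by a newer tool</Audit_Note>
</Policy>
"""

def read_policy(xml_text):
    """Pick out the fields this reader knows about; anything else is skipped."""
    policy = ET.fromstring(xml_text)
    return {
        "name": policy.get("Name"),
        "rule": policy.findtext("Trigger/Rule"),  # follows the nesting
        "action": policy.findtext("Action"),
    }

print(read_policy(DOC))
```

Because the reader asks only for the paths it understands, the extra tag does no harm. This is a property of how the application is written, not a guarantee of every XML parser.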

Unlike a database, XML does not require an external method of defining the structure of a document. The structure can instead be derived from the tags within the document. It is still good practice to design a schema, both to ensure consistency among documents and to better communicate the intent of the internal structure. Many standard schemas are designed by vertical industries or support broad applications. Web Services, which implement a standard method of communication among web applications, use a standard set of XML structures. RDF does the same for documents. As of the writing of this book, no standard ILM or DLM structures exist.
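To illustrate how structure can be derived from the tags alone, the following sketch walks a document and records which child elements appear under each element. The function name infer_structure and the sample values are assumptions made for this example; the element names are taken from the DLM policy.

```python
import xml.etree.ElementTree as ET

SAMPLE = (
    "<policy><name>Backup</name>"
    "<date><create>2005-01-01</create><revision>2005-06-01</revision></date>"
    "</policy>"
)

def infer_structure(xml_text):
    """Map each element name to the sorted names of the children seen under it."""
    structure = {}
    for elem in ET.fromstring(xml_text).iter():
        children = structure.setdefault(elem.tag, set())
        children.update(child.tag for child in elem)
    return {tag: sorted(kids) for tag, kids in structure.items()}

for tag, kids in infer_structure(SAMPLE).items():
    print(tag, "->", kids)
```

A structure inferred this way describes only the documents that were examined. It cannot say which elements are required or optional, which is one reason an explicit schema is still worth writing.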

Two common methods of defining XML document structures are XML Schemas and Document Type Definitions (DTDs). The latter is an older way of defining XML documents and is being superseded by XML Schemas. DTDs are easier to design and parse, but XML Schemas provide a richer description of the document's structure.

The XML Schema and DTD for the DLM examples from Chapter 7 would be rendered as follows:

<?xml version="1.0" encoding="UTF-8" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="asset">
    <xs:complexType mixed="true">
      <xs:attribute name="asset_types" type="xs:string" use="required" />
    </xs:complexType>
  </xs:element>
  <xs:element name="constraints">
    <xs:complexType mixed="true" />
  </xs:element>
  <xs:element name="create">
    <xs:complexType mixed="true" />
  </xs:element>
  <xs:element name="date">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="create" />
        <xs:element ref="revision" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="description">
    <xs:complexType mixed="true" />
  </xs:element>
  <xs:element name="expected_result">
    <xs:complexType mixed="true" />
  </xs:element>
  <xs:element name="name">
    <xs:complexType mixed="true" />
  </xs:element>
  <xs:element name="policy">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="name" />
        <xs:element ref="description" />
        <xs:element ref="purpose" />
        <xs:element ref="date" />
        <xs:element ref="process" />
        <xs:element ref="expected_result" />
        <xs:element ref="constraints" />
        <xs:element ref="asset" maxOccurs="unbounded" />
      </xs:sequence>
      <xs:attribute name="parent" type="xs:string" use="required" />
      <xs:attribute name="policy_type" type="xs:NMTOKEN" use="required" />
      <xs:attribute name="data_type" type="xs:string" use="required" />
    </xs:complexType>
  </xs:element>
  <xs:element name="process">
    <xs:complexType mixed="true" />
  </xs:element>
  <xs:element name="purpose">
    <xs:complexType mixed="true" />
  </xs:element>
  <xs:element name="revision">
    <xs:complexType mixed="true" />
  </xs:element>
</xs:schema>

<!ELEMENT asset ( #PCDATA ) >
<!ATTLIST asset asset_types CDATA #REQUIRED >
<!ELEMENT constraints ( #PCDATA ) >
<!ELEMENT create ( #PCDATA ) >
<!ELEMENT date ( create, revision ) >
<!ELEMENT description ( #PCDATA ) >
<!ELEMENT expected_result ( #PCDATA ) >
<!ELEMENT name ( #PCDATA ) >
<!ELEMENT policy ( name, description, purpose, date, process, expected_result, constraints, asset+ ) >
<!ATTLIST policy data_type CDATA #REQUIRED >
<!ATTLIST policy parent CDATA #REQUIRED >
<!ATTLIST policy policy_type NMTOKEN #REQUIRED >
<!ELEMENT process ( #PCDATA ) >
<!ELEMENT purpose ( #PCDATA ) >
<!ELEMENT revision ( #PCDATA ) >
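A document that follows the DLM schema and DTD above might look like the SAMPLE in the sketch below. All of the values are hypothetical. The check_policy function is a hand-written shape check built on the Python standard library, which does not validate against an XML Schema or DTD; it only approximates what a validating parser would do with the policy content model.

```python
import xml.etree.ElementTree as ET

# Child order required by the <policy> content model, followed by one or more <asset>.
POLICY_CHILDREN = ["name", "description", "purpose", "date", "process",
                   "expected_result", "constraints"]
POLICY_ATTRIBUTES = ["parent", "policy_type", "data_type"]

SAMPLE = """
<policy parent="none" policy_type="backup" data_type="email">
  <name>Nightly mail backup</name>
  <description>Back up the mail store every night.</description>
  <purpose>Protect against loss of correspondence.</purpose>
  <date><create>2005-01-15</create><revision>2005-03-01</revision></date>
  <process>Run an incremental backup at 01:00.</process>
  <expected_result>A restorable copy exists each morning.</expected_result>
  <constraints>Backup window of four hours.</constraints>
  <asset asset_types="server">mail01</asset>
</policy>
"""

def check_policy(xml_text):
    """Return a list of problems; an empty list means the basic shape is right."""
    root = ET.fromstring(xml_text)
    problems = []
    if root.tag != "policy":
        problems.append("root element is not <policy>")
    for attr in POLICY_ATTRIBUTES:
        if attr not in root.attrib:
            problems.append(f"missing attribute {attr}")
    tags = [child.tag for child in root]
    fixed, assets = tags[:len(POLICY_CHILDREN)], tags[len(POLICY_CHILDREN):]
    if fixed != POLICY_CHILDREN:
        problems.append("child elements are missing or out of order")
    if not assets or any(tag != "asset" for tag in assets):
        problems.append("expected one or more trailing <asset> elements")
    date = root.find("date")
    if date is not None and [c.tag for c in date] != ["create", "revision"]:
        problems.append("<date> must contain <create> then <revision>")
    for asset in root.findall("asset"):
        if "asset_types" not in asset.attrib:
            problems.append("<asset> is missing asset_types")
    return problems

print(check_policy(SAMPLE))
```

In practice a validating parser driven by the schema or DTD would replace this function. The sketch is meant only to make the content model concrete.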

The XML Schema shows the nested structure of the DLM policy more clearly. No attempt was made to tighten the constraints or data types: the text fields are left as free text, even though XML Schema offers a rich set of built-in data types, such as xs:date, that could be applied to them.

The ILM policy example in Chapter 8 would have a Schema and DTD such as this:

<?xml version="1.0" encoding="UTF-8" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Action" type="xs:string" />
  <xs:element name="Actions">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="Move" />
        <xs:element ref="Copy" />
        <xs:element ref="Destroy" />
        <xs:element ref="No_action" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="Attributes">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="Content_Clues" />
      </xs:sequence>
      <xs:attribute name="Owner" type="xs:string" use="required" />
      <xs:attribute name="File_Type" type="xs:string" use="required" />
    </xs:complexType>
  </xs:element>
  <xs:element name="Class">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="Attributes" />
      </xs:sequence>
      <xs:attribute name="ID" type="xs:ID" use="required" />
    </xs:complexType>
  </xs:element>
  <xs:element name="Content_Clues">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="Content_Rule" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="Content_Hash" type="xs:hexBinary" />
  <xs:element name="Content_Rule" type="xs:string" />
  <xs:element name="Copy">
    <xs:complexType>
      <xs:attribute name="ID" type="xs:ID" use="required" />
      <xs:attribute name="New_URI" type="xs:anyURI" use="required" />
    </xs:complexType>
  </xs:element>
  <xs:element name="Destroy">
    <xs:complexType>
      <xs:attribute name="ID" type="xs:ID" use="required" />
    </xs:complexType>
  </xs:element>
  <xs:element name="History">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="State" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="ILM">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="Information" />
        <xs:element ref="Actions" />
        <xs:element ref="Policies" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="Information">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="Class" />
        <xs:element ref="State" />
        <xs:element ref="Timestamp" />
        <xs:element ref="History" />
      </xs:sequence>
      <xs:attribute name="ID" type="xs:ID" use="required" />
    </xs:complexType>
  </xs:element>
  <xs:element name="Information_Paths">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="URI" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="Last_Access_Time" type="xs:dateTime" />
  <xs:element name="Move">
    <xs:complexType>
      <xs:attribute name="ID" type="xs:ID" use="required" />
      <xs:attribute name="Destination_URI" type="xs:anyURI" use="required" />
    </xs:complexType>
  </xs:element>
  <xs:element name="No_action">
    <xs:complexType>
      <xs:attribute name="ID" type="xs:ID" use="required" />
    </xs:complexType>
  </xs:element>
  <xs:element name="Policies">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="Policy" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="Policy">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="Trigger" />
        <xs:element ref="Action" />
      </xs:sequence>
      <xs:attribute name="Name" type="xs:string" use="required" />
      <xs:attribute name="Owner" type="xs:string" use="required" />
      <xs:attribute name="Description" type="xs:string" use="required" />
    </xs:complexType>
  </xs:element>
  <xs:element name="Relationships">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="URI" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="Rule" type="xs:string" />
  <xs:element name="State">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="Information_Paths" minOccurs="0" />
        <xs:element ref="Last_Access_Time" minOccurs="0" />
        <xs:element ref="Content_Hash" minOccurs="0" />
        <xs:element ref="Relationships" minOccurs="0" />
        <xs:element ref="Value" minOccurs="0" />
      </xs:sequence>
      <xs:attribute name="ID" type="xs:ID" use="optional" />
    </xs:complexType>
  </xs:element>
  <xs:element name="Timestamp" type="xs:dateTime" />
  <xs:element name="Trigger">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="State" />
        <xs:element ref="Rule" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="URI" type="xs:anyURI" />
  <xs:element name="Value" type="xs:string" />
</xs:schema>

<!ELEMENT Action EMPTY >
<!ELEMENT Actions ( Move, Copy, Destroy, No_action ) >
<!ELEMENT Attributes ( Content_Clues ) >
<!ATTLIST Attributes File_Type CDATA #REQUIRED >
<!ATTLIST Attributes Owner CDATA #REQUIRED >
<!ELEMENT Class ( Attributes ) >
<!ATTLIST Class ID CDATA #REQUIRED >
<!ELEMENT Content_Clues ( Content_Rule ) >
<!ELEMENT Content_Hash EMPTY >
<!ELEMENT Content_Rule EMPTY >
<!ELEMENT Copy EMPTY >
<!ATTLIST Copy ID CDATA #REQUIRED >
<!ATTLIST Copy New_URI CDATA #REQUIRED >
<!ELEMENT Destroy EMPTY >
<!ATTLIST Destroy ID CDATA #REQUIRED >
<!ELEMENT History ( State ) >
<!ELEMENT ILM ( Information, Actions, Policies ) >
<!ELEMENT Information ( Class, State, Timestamp, History ) >
<!ATTLIST Information ID CDATA #REQUIRED >
<!ELEMENT Information_Paths ( URI ) >
<!ELEMENT Last_Access_Time EMPTY >
<!ELEMENT Move EMPTY >
<!ATTLIST Move Destination_URI CDATA #REQUIRED >
<!ATTLIST Move ID CDATA #REQUIRED >
<!ELEMENT No_action EMPTY >
<!ATTLIST No_action ID CDATA #REQUIRED >
<!ELEMENT Policies ( Policy ) >
<!ELEMENT Policy ( Trigger, Action ) >
<!ATTLIST Policy Description CDATA #REQUIRED >
<!ATTLIST Policy Name CDATA #REQUIRED >
<!ATTLIST Policy Owner CDATA #REQUIRED >
<!ELEMENT Relationships ( URI ) >
<!ELEMENT Rule EMPTY >
<!ELEMENT State ( Information_Paths?, Last_Access_Time?, Content_Hash?, Relationships?, Value? ) >
<!ATTLIST State ID CDATA #IMPLIED >
<!ELEMENT Timestamp EMPTY >
<!ELEMENT Trigger ( State, Rule ) >
<!ELEMENT URI EMPTY >
<!ELEMENT Value EMPTY >

Once again, the XML Schema is richer. It better expresses the hierarchical structure of the ILM policy and the expected data types.
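The data types declared in the schema tell an application how to interpret each element's text. As a rough sketch, the Python fragment below converts a State element's Last_Access_Time (xs:dateTime) and Content_Hash (xs:hexBinary) into native values. The sample values are hypothetical, and the Python conversions only approximate the lexical rules of the corresponding schema types.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

STATE = """
<State ID="s1">
  <Last_Access_Time>2005-03-01T12:30:00</Last_Access_Time>
  <Content_Hash>DEADBEEF</Content_Hash>
  <Value>active</Value>
</State>
"""

def read_state(xml_text):
    """Convert the typed children of <State> into Python values."""
    state = ET.fromstring(xml_text)
    result = {}
    text = state.findtext("Last_Access_Time")
    if text is not None:
        # xs:dateTime: an ISO 8601 date and time
        result["last_access"] = datetime.fromisoformat(text.strip())
    text = state.findtext("Content_Hash")
    if text is not None:
        # xs:hexBinary: pairs of hexadecimal digits
        result["content_hash"] = bytes.fromhex(text.strip())
    text = state.findtext("Value")
    if text is not None:
        # xs:string: kept as text
        result["value"] = text
    return result

print(read_state(STATE))
```

A DTD can say only that these elements exist; it cannot state that one holds a timestamp and another a hash, so every conversion like this would rest on convention alone.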

There are some disadvantages to XML. XML documents can be long and convoluted, making it hard to read them in their raw form. XML files can take considerable application processing time, especially for a very long document. There are also security concerns with XML documents, because they are plain text that is readable by anyone who can access the document.
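One common way to limit the processing cost of a very long document is to parse it incrementally instead of loading the whole tree into memory. The sketch below uses iterparse from the Python standard library to count Policy elements as they stream past. The generated document and the function name count_policies are assumptions for illustration only.

```python
import io
import xml.etree.ElementTree as ET

# A long, artificial document: one thousand small policies (hypothetical data).
BIG_DOC = (
    "<Policies>"
    + "<Policy Name='p'><Trigger /><Action /></Policy>" * 1000
    + "</Policies>"
)

def count_policies(stream):
    """Count <Policy> elements while reading the stream piece by piece."""
    count = 0
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == "Policy":
            count += 1
            elem.clear()  # drop this policy's children and attributes once handled
    return count

print(count_policies(io.BytesIO(BIG_DOC.encode("utf-8"))))
```

Clearing each element after use keeps memory growth modest, although the parser still has to read every byte of the file.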

When using XML documents for policies, it is important to keep these limitations in mind. Documents should be kept as short as possible. The structure should be as simple as the policy goals allow, and encryption should be used whenever the documents are at rest or are being transferred between applications or over a network.



    Data Protection and Information Lifecycle Management
    ISBN: 0131927574
    Year: 2005
    Pages: 122
