Requirements

The general requirement for this utility is to convert one or more XML instance documents, each representing a single logical document, into a single flat file. Here's a summary of the required functionality.

- Inputs: A directory containing one or more XML instance documents, each in a single file. All have the same grammar. The second input is a file description document (as discussed in Chapter 6) that describes the XML documents and the flat file to be produced.
- Processing: Each input Element corresponding to a record (as described in the file description document) is written to a record in the flat output file. Each of its child Elements corresponding to a field is written to the appropriate field in the record. Data types are converted according to the field descriptions in the file description document.
- Output: A flat file with fixed length fields and variable or fixed length records.

Running the Utility

This section provides instructions for running the XML to flat file conversion utility from the command line.

For Java:

    java XMLToFlat InputDirectory OutputFile.dat FileDescription.XML

or

    java XMLToFlat -h

For C++ on Win32:

    XMLToFlat InputDirectory OutputFile.dat FileDescription.XML

or

    XMLToFlat -h

Options follow the parameters except for the help option, which may be specified by itself.

Parameters:

- First: Path specification of the input directory (required). Either a relative or an absolute path name may be specified. The trailing directory separator character is optional.
- Second: File specification of the output flat file (required). The specification may include the full or relative path name. If no path name is specified, the file will be created in the current working directory. An extension may be specified; if none is specified, the file will be created without one. If a file with the specified name already exists, it will be overwritten.
- Third: File specification of the file description document (required). If no path name is specified, the file is assumed to reside in the current working directory. The full file name must be specified, but there is no restriction on the extension name.

Options:

- -v (Validate): Validate the XML documents before processing. The documents are validated against the schema referenced in the document root Element.
- -h (Help): Display a help message and exit without further processing.

Restrictions:

The restrictions are the same as those specified for the flat file to XML utility, plus the following.

- As when flat files are the source, it is recommended (though not required) that field grammar Elements be specified in their record description Element in ascending order by offset. However, the sequence of fields in the input XML document must match the sequence defined in the record grammar.
- The record identifier field Element need not be present in the input document. It will be converted if present, but the contents will be overwritten by the record identifier specified in the grammar.
- Fields without contents in the input XML document are initialized to the fill character specified in the field's grammar.
- If field definitions overlap one or more bytes in the record, the byte(s) are written from the last field specified in the grammar.
- When writing fixed length records, the output record is initialized to ASCII spaces. If a different fill character is desired, it must be specified in a field grammar. If a byte in the record is not defined within a field, it is written with an ASCII space.
- When writing variable length records, the output record is initialized to null characters and is not written beyond the end of the last field defined for the record. A fill character other than the null character must be specified in a field grammar. If a byte between fields that have content is not defined within a field grammar, it is written with a null character.

Sample Input and Output: Purchase Order

We'll use the cocoa purchase order for converting from XML to a flat file format. The particular orders we will produce for Big Daddy's system are similar to the invoices in that they contain two logical groups of records. The top-level group is the overall purchase order, containing a header record, a ship to record, and one or more line item groups. The line item group consists of a line item record and an item description record. Table 8.2 shows the logical layout of the purchase order file. The three purchase order instance documents shown below follow the logical organization specified in Table 8.2.
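The buffer rules listed under Restrictions (initialization to spaces or nulls, fill characters for empty fields, and the last field winning when definitions overlap) can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not the utility's actual code: the names `RecordBufferSketch`, `Field`, and `buildRecord` are hypothetical, field descriptions are passed in directly rather than read from a file description document, characters are assumed to be single-byte, and content is assumed to be left-justified and truncated to the field length.

```java
import java.util.Arrays;
import java.util.List;

final class RecordBufferSketch {

    /** Hypothetical field description; content is null when the Element has no contents. */
    static final class Field {
        final int offset;
        final int length;
        final char fill;
        final String content;

        Field(int offset, int length, char fill, String content) {
            this.offset = offset;
            this.length = length;
            this.fill = fill;
            this.content = content;
        }
    }

    /**
     * Builds one output record. Fixed length records start as ASCII spaces;
     * variable length records start as nulls and end at the last defined field.
     */
    static String buildRecord(List<Field> fields, boolean fixedLength, int recordLength) {
        int end = 0;
        for (Field f : fields) {
            end = Math.max(end, f.offset + f.length);
        }
        if (fixedLength && end > recordLength) {
            throw new IllegalArgumentException("field extends past the record length");
        }
        char[] buf = new char[fixedLength ? recordLength : end];
        Arrays.fill(buf, fixedLength ? ' ' : '\0');

        // Fields are applied in grammar order, so a later field overwrites
        // any bytes it shares with an earlier one.
        for (Field f : fields) {
            Arrays.fill(buf, f.offset, f.offset + f.length, f.fill);
            if (f.content != null) {
                int n = Math.min(f.content.length(), f.length); // assumption: truncate
                f.content.getChars(0, n, buf, f.offset);        // assumption: left-justify
            }
        }
        return new String(buf);
    }

    public static void main(String[] args) {
        // First three fields of the header record, placed in a 59-byte fixed length record.
        List<Field> header = Arrays.asList(
                new Field(0, 3, ' ', "HDR"),
                new Field(3, 20, ' ', "BQ003"),
                new Field(23, 20, ' ', "AZ999345"));
        System.out.println("[" + buildRecord(header, true, 59) + "]");
    }
}
```

Applying the fields in the order they appear in the grammar is what makes the overlap rule fall out naturally: no special case is needed, because the later write simply replaces the earlier bytes.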
Sample FlatPurchaseOrders01.xml

    <?xml version="1.0" encoding="UTF-8"?>
    <PurchaseOrder xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:noNamespaceSchemaLocation="FlatPurchaseOrder.xsd">
      <Header>
        <CustomerNumber>BQ003</CustomerNumber>
        <PONumber>AZ999345</PONumber>
        <PODate>2002-11-01</PODate>
        <RequestedDeliveryDate>2002-11-15</RequestedDeliveryDate>
      </Header>
      <ShipTo>
        <ShipToName> Yazoo Grocers - NE Distribution Center </ShipToName>
        <ShipToStreet1>12 Industrial Parkway, NW</ShipToStreet1>
        <ShipToCity>Portland</ShipToCity>
        <ShipToStateOrProvince>ME</ShipToStateOrProvince>
        <ShipToPostalCode>04101</ShipToPostalCode>
      </ShipTo>
      <LineItem>
        <Item>
          <ItemID>HCVAN</ItemID>
          <OrderedQty>12</OrderedQty>
          <UnitPrice>2.59</UnitPrice>
        </Item>
        <ItemDescription>
          <Description> Instant Hot Cocoa Mix - Vanilla flavor </Description>
        </ItemDescription>
      </LineItem>
      <LineItem>
        <Item>
          <ItemID>HCMIN</ItemID>
          <OrderedQty>24</OrderedQty>
          <UnitPrice>2.53</UnitPrice>
        </Item>
        <ItemDescription>
          <Description> Instant Hot Cocoa Mix - Mint flavor </Description>
        </ItemDescription>
      </LineItem>
    </PurchaseOrder>

Table 8.2. Logical Layout for the Purchase Order

| Group | Record | Record Tag | Field Number | Field Name | Offset | Length | Data Type | Description |
|---|---|---|---|---|---|---|---|---|
| PO | Header | HDR | 1 | Record Tag | 0 | 3 | Alphanumeric | Required; record identifier |
| | | | 2 | Customer Number | 3 | 20 | Alphanumeric | Required; identifier assigned to the customer in our system |
| | | | 3 | PO Number | 23 | 20 | Alphanumeric | Required; customer purchase order number |
| | | | 4 | PO Date | 43 | 8 | Date | Required; date that the purchase order was issued |
| | | | 5 | Requested Delivery Date | 51 | 8 | Date | Optional; date on which delivery of the ordered items is requested |
| PO | Ship To | SHP | 1 | Record Tag | 0 | 3 | Alphanumeric | Required; record identifier |
| | | | 2 | Ship to Name | 3 | 40 | Alphanumeric | Required; name of the receiving location for the shipped order |
| | | | 3 | Ship to Street 1 | 43 | 30 | Alphanumeric | Required; first address line of the receiving location |
| | | | 4 | Ship to Street 2 | 73 | 30 | Alphanumeric | Optional; second address line of the receiving location |
| | | | 5 | Ship to City | 103 | 20 | Alphanumeric | Required; city of the receiving location |
| | | | 6 | Ship to State or Province | 123 | 3 | Alphanumeric | Required; state or province of the receiving location |
| | | | 7 | Ship to Postal Code | 126 | 10 | Alphanumeric | Required; postal code of the receiving location |
| | | | 8 | Ship to Country | 136 | 3 | Alphanumeric | Optional; country of the receiving location |
| PO/Line Item | Line Item | LIN | 1 | Record Tag | 0 | 3 | Alphanumeric | Required; record identifier |
| | | | 2 | Item ID | 3 | 20 | Alphanumeric | Required; our identifier for the ordered item |
| | | | 3 | Item Ordered Quantity | 23 | 10 | Decimal number, space filled | Required; the number of units ordered |
| | | | 4 | Item Unit Price | 33 | 10 | Implied decimal, two places, zero filled | Optional; unit price in U.S. dollars |
| | | | 5 | Extended Amount | 43 | 10 | Implied decimal, two places, zero filled | Optional; total amount due for the ordered item in U.S. dollars (unit price multiplied by the quantity ordered) |
| PO/Line Item | Item Description | DSC | 1 | Record Tag | 0 | 3 | Alphanumeric | Required; record identifier (record itself is optional) |
| | | | 2 | Description | 3 | 80 | Alphanumeric | Required; description of the ordered item |

Sample FlatPurchaseOrders02.xml

    <?xml version="1.0" encoding="UTF-8"?>
    <PurchaseOrder xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:noNamespaceSchemaLocation="FlatPurchaseOrder.xsd">
      <Header>
        <CustomerNumber>BQ003</CustomerNumber>
        <PONumber>AW999346</PONumber>
        <PODate>2002-11-01</PODate>
        <RequestedDeliveryDate>2002-11-15</RequestedDeliveryDate>
      </Header>
      <ShipTo>
        <ShipToName> Yazoo Grocers - SE Distribution Center </ShipToName>
        <ShipToStreet1>Dock 37</ShipToStreet1>
        <ShipToStreet2>3975 Hwy 75</ShipToStreet2>
        <ShipToCity>Atoka</ShipToCity>
        <ShipToStateOrProvince>OK</ShipToStateOrProvince>
        <ShipToPostalCode>74525</ShipToPostalCode>
      </ShipTo>
      <LineItem>
        <Item>
          <ItemID>HCVAN</ItemID>
          <OrderedQty>36</OrderedQty>
          <UnitPrice>2.59</UnitPrice>
        </Item>
        <ItemDescription>
          <Description> Instant Hot Cocoa Mix - Vanilla flavor </Description>
        </ItemDescription>
      </LineItem>
      <LineItem>
        <Item>
          <ItemID>HCMIN</ItemID>
          <OrderedQty>72</OrderedQty>
          <UnitPrice>2.53</UnitPrice>
        </Item>
        <ItemDescription>
          <Description> Instant Hot Cocoa Mix - Mint flavor </Description>
        </ItemDescription>
      </LineItem>
    </PurchaseOrder>

Sample FlatPurchaseOrders03.xml

    <?xml version="1.0" encoding="UTF-8"?>
    <PurchaseOrder xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:noNamespaceSchemaLocation="FlatPurchaseOrder.xsd">
      <Header>
        <CustomerNumber>AY001</CustomerNumber>
        <PONumber>2002-0967</PONumber>
        <PODate>2002-11-09</PODate>
        <RequestedDeliveryDate>2002-11-14</RequestedDeliveryDate>
      </Header>
      <ShipTo>
        <ShipToName>Corner Drug and Sundries</ShipToName>
        <ShipToStreet1>14 Main Street</ShipToStreet1>
        <ShipToCity>Wichita</ShipToCity>
        <ShipToStateOrProvince>KS</ShipToStateOrProvince>
        <ShipToPostalCode>67201</ShipToPostalCode>
      </ShipTo>
      <LineItem>
        <Item>
          <ItemID>HCVAN</ItemID>
          <OrderedQty>24</OrderedQty>
          <UnitPrice>2.59</UnitPrice>
        </Item>
        <ItemDescription>
          <Description> Instant Hot Cocoa Mix - Vanilla flavor </Description>
        </ItemDescription>
      </LineItem>
    </PurchaseOrder>

Figure 8.2 shows the resulting flat file.
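To make the data type conversions implied by Table 8.2 concrete, here is a rough Java sketch of how the Date, space filled decimal, and zero filled implied decimal fields might be produced from the XML values above. Several things here are assumptions rather than facts from the text: the class and method names are hypothetical, the 8-byte flat file date is assumed to be CCYYMMDD, numeric values are assumed to be non-negative and right-justified in their fields, and the actual utility takes these details from the file description document.

```java
import java.math.BigDecimal;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

final class FieldConversionSketch {

    /** Right-justifies s in a field of the given length, padding on the left with fill. */
    static String padLeft(String s, int length, char fill) {
        if (s.length() > length) {
            throw new IllegalArgumentException("value does not fit in " + length + " bytes: " + s);
        }
        StringBuilder sb = new StringBuilder(length);
        for (int i = s.length(); i < length; i++) {
            sb.append(fill);
        }
        return sb.append(s).toString();
    }

    /** Converts an XML Schema date such as 2002-11-01 to CCYYMMDD (assumed flat file layout). */
    static String formatDate(String isoDate) {
        return LocalDate.parse(isoDate.trim()).format(DateTimeFormatter.BASIC_ISO_DATE);
    }

    /** Drops the decimal point, keeping the given number of implied places, then zero fills. */
    static String impliedDecimal(String value, int length, int places) {
        BigDecimal number = new BigDecimal(value.trim());
        if (number.signum() < 0) {
            throw new IllegalArgumentException("negative values are not handled in this sketch");
        }
        // toBigIntegerExact rejects values with more fraction digits than the field allows.
        String digits = number.movePointRight(places).toBigIntegerExact().toString();
        return padLeft(digits, length, '0');
    }

    /** Writes a decimal number as text, space filled on the left (assumed justification). */
    static String spaceFilledNumber(String value, int length) {
        return padLeft(new BigDecimal(value.trim()).toPlainString(), length, ' ');
    }

    public static void main(String[] args) {
        // Values taken from the first line item and header of FlatPurchaseOrders01.xml.
        System.out.println("[" + formatDate("2002-11-01") + "]");
        System.out.println("[" + spaceFilledNumber("12", 10) + "]");
        System.out.println("[" + impliedDecimal("2.59", 10, 2) + "]");
    }
}
```

Using `BigDecimal` rather than `double` keeps prices such as 2.59 exact, which matters when the decimal point is only implied by position.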