Flat File to XML: Functionality and Operation


Requirements

This utility converts a flat file containing one or more logical documents to one or more XML instance documents, each representing a single logical document. Here's a summary of the required functionality.

  • Inputs : A flat file with fixed or variable length records, in which each field has a fixed length. A specific field in each record contains a record identifier. The input flat file may contain more than one logical document. The second input is a file description document (as discussed in Chapter 6) that describes the flat file and the grammar of the XML document to be produced.

  • Processing : An Element is created in the output document for each logical group of records in the flat file, and each input flat file record is written to an Element that is a child of that group Element. Each field in the record is written to a child Element of the record Element. The organization of records into groups, of fields into records, and their Element names are derived from the file description document. Field content is converted to schema language data types as specified in the file description document. Empty fields, that is, those with zero length after whitespace has been trimmed, do not create Elements in the output document. Optionally, a processing break occurs whenever the value of the trading partner field changes.

  • Output : One or more XML instance documents, each in a single file. The root Element name is derived from the grammar in the file description document. The file name is formed by appending a three-digit sequence number to the root Element name and adding the extension .xml. If break on trading partner has been specified, the documents for each trading partner are placed in a separate subdirectory. The subdirectories are named according to the trading partner IDs in the partner break field.
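The output naming rule can be illustrated with a few lines of Python. This is only a sketch, not the utility's Java or C++ code: the function name `output_path` and its parameters are hypothetical, and starting the sequence at 1 is an assumption based on the sample file name FlatInvoice001.xml.

```python
from pathlib import Path
from typing import Optional


def output_path(output_dir: str, root_name: str, sequence: int,
                partner_id: Optional[str] = None) -> Path:
    """Build the path of one output document (hypothetical helper).

    The file name is the root Element name, a three-digit sequence
    number, and the .xml extension. If a trading partner ID is given,
    the file is placed in a subdirectory named after that ID.
    """
    if not 1 <= sequence <= 999:
        raise ValueError("sequence number must be between 1 and 999")
    directory = Path(output_dir)
    if partner_id is not None:
        directory = directory / partner_id
    return directory / f"{root_name}{sequence:03d}.xml"


print(output_path("out", "FlatInvoice", 1).name)
print(output_path("out", "FlatInvoice", 2, "BQ003").parts)
```

The first call yields the name FlatInvoice001.xml; the second places FlatInvoice002.xml in a BQ003 subdirectory of the output directory.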

Running the Utility

This section provides instructions for running the flat file to XML conversion utility from the command line.

For Java:

java FlatToXML InputFile.dat OutputDirectory FileDescription.XML

or

java FlatToXML -h

For C++ on Win32:

FlatToXML InputFile.dat OutputDirectory FileDescription.XML

or

FlatToXML -h

Options follow the parameters except for the help option, which may be specified by itself.

Parameters:

  • First : File specification of the input flat file (required). The specification may include the full or relative path name. If no path name is specified, the file is assumed to reside in the current working directory. The full file name must be specified, but there is no restriction on the extension name.

  • Second : Path specification of the output directory (required). The directory must exist. Either a relative or an absolute path name may be specified. The trailing directory separator character is optional. If no break on trading partner is specified, all the created XML files are placed in this directory. If break on trading partner has been specified, then a subdirectory for each trading partner is created beneath this directory.

  • Third : File specification of the file description document (required). If no path name is specified, the file is assumed to reside in the current working directory. The full file name must be specified, but there is no restriction on the extension name.

Options:

  • -v (Validate) : Validate the created XML documents before writing them to disk. The documents are validated against the schema specified in the file description document.

  • -h (Help) : Display a help message and exit without further processing.

Restrictions:

Unless otherwise noted, all numeric limits may be modified by changing parameters in the program source and appropriate type definitions in the file description document schemas.

  • A field may have a maximum of 1,023 bytes.

  • A record may be no longer than 16,383 bytes.

  • A maximum of 100 fields per record is supported.

  • There is no absolute limit on the number of records; the number is only practically limited by system memory.

  • Each field must be assigned a unique Element name.

  • Element names are limited to 127 characters.

  • It is recommended, though not required, that field grammar Elements be specified in their record description Element in ascending order by offset.

  • Path lengths for complete file specifications are limited to 127 characters.

  • Schema location URIs are limited to 127 characters.

  • A maximum of 999 output XML documents from an input flat file is supported.

  • A maximum of 100 different trading partner destinations in an input flat file is supported.

  • The field indicating a break on trading partner must be in the beginning record of a logical document.

  • Trading partner IDs must be valid directory names for the operating system where the utility is run.
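Several of these restrictions are easy to check mechanically. The Python sketch below shows one way to do so for a single record layout. The function `check_record_layout` and the tuple form of a field are hypothetical, and the sketch assumes that Element names need only be unique within a record, as the repeated RecordID Element in the sample output later in this section suggests.

```python
MAX_FIELD_BYTES = 1023
MAX_RECORD_BYTES = 16383
MAX_FIELDS_PER_RECORD = 100
MAX_NAME_CHARS = 127


def check_record_layout(fields):
    """Return a list of problems found in one record layout.

    `fields` is a list of (element_name, offset, length) tuples, a
    hypothetical in-memory form of a record description.
    """
    problems = []
    if len(fields) > MAX_FIELDS_PER_RECORD:
        problems.append("more than 100 fields in record")
    names = [name for name, _, _ in fields]
    if len(set(names)) != len(names):
        problems.append("Element names are not unique")
    for name, offset, length in fields:
        if len(name) > MAX_NAME_CHARS:
            problems.append(f"{name}: Element name longer than 127 characters")
        if length > MAX_FIELD_BYTES:
            problems.append(f"{name}: field longer than 1,023 bytes")
        if offset + length > MAX_RECORD_BYTES:
            problems.append(f"{name}: field extends past 16,383 bytes")
    return problems


summary_layout = [("RecordID", 0, 3), ("TotalAmount", 3, 10),
                  ("NumberOfLines", 13, 10)]
print(check_record_layout(summary_layout))
```

An empty list means the layout passed every check.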

Sample Input and Output: Invoice

As in Chapter 7, we're going to use invoices from Big Daddy's Gourmet Cocoa for our example. However, Big Daddy has now upgraded to a more capable order management and bookkeeping system. The new system supports a more comprehensive flat, hierarchical file structure than the CSV formats supported by the previous system.

The simple invoice example is composed of two levels of record groups. The group at the top level is the invoice itself, consisting of a header record, a ship to record, one or more line item groups, and a summary record. The second group, the line item group, contains a line item record and an item description record.

This particular file has variable length records because that is what Big Daddy's system produces, but we could just as easily have specified fixed length records. Table 8.1 shows the logical layout of the invoice file.

Figure 8.1 shows the sample input invoice flat file.

Table 8.1. Logical Layout for the Invoice

Group              Record            Record Tag  Field Number  Field Name                 Offset  Length  Data Type                                 Description
-----------------  ----------------  ----------  ------------  -------------------------  ------  ------  ----------------------------------------  -----------
Invoice            Header            HDR         1             Record Tag                 0       3       Alphanumeric                              Record identifier
                                                 2             Customer Number            3       20      Alphanumeric                              Identifier we have assigned to the customer in our system
                                                 3             Invoice Number             23      20      Alphanumeric                              System-assigned invoice number
                                                 4             Invoice Date               43      8       Date                                      Date of invoice, formatted YYYYMMDD
                                                 5             PO Number                  51      20      Alphanumeric                              Customer purchase order number
                                                 6             Due Date                   71      8       Date                                      Date that invoice amount is due for payment, formatted as YYYYMMDD
Invoice            Ship To           SHP         1             Record Tag                 0       3       Alphanumeric                              Record identifier
                                                 2             Ship to Name               3       40      Alphanumeric                              Name of the receiving location for the shipped order
                                                 3             Ship to Street 1           43      30      Alphanumeric                              First address line of the receiving location
                                                 4             Ship to Street 2           73      30      Alphanumeric                              Second address line of the receiving location
                                                 5             Ship to City               103     20      Alphanumeric                              City of the receiving location
                                                 6             Ship to State or Province  123     3       Alphanumeric                              State or province of the receiving location
                                                 7             Ship to Postal Code        126     10      Alphanumeric                              Postal code of the receiving location
                                                 8             Ship to Country            136     3       Alphanumeric                              Country of the receiving location
Invoice/Line Item  Line              LIN         1             Record Tag                 0       3       Alphanumeric                              Record identifier
                                                 2             Item ID                    3       20      Alphanumeric                              Our identifier for the ordered item
                                                 3             Item Invoiced Quantity     23      10      Decimal number, space filled              The number of units to be invoiced
                                                 4             Item Unit Price            33      10      Implied decimal, two places, zero filled  Unit price in U.S. dollars
                                                 5             Extended Amount Due        43      10      Implied decimal, two places, zero filled  Total amount due for the invoiced item in U.S. dollars (unit price multiplied by the quantity invoiced)
Invoice/Line Item  Item Description  DSC         1             Record Tag                 0       3       Alphanumeric                              Record identifier
                                                 2             Item Description           3       80      Alphanumeric                              Description of the ordered item
Invoice            Summary           SUM         1             Record Tag                 0       3       Alphanumeric                              Record identifier
                                                 2             Total Amount               3       10      Implied decimal, two places, zero filled  Total amount due on invoice in U.S. dollars
                                                 3             Number of Lines            13      10      Integer, space filled                     Total number of invoice lines
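To make the layout concrete, the following Python sketch slices a Header record into its fields using the offsets and lengths from Table 8.1. It is an illustration only, not the utility's code. It assumes that offsets count from zero (so the record tag occupies offsets 0 through 2), it borrows the Element names from the sample output shown later in this section, and the names `HEADER_LAYOUT` and `split_record` are hypothetical.

```python
# (Element name, offset, length) for the Header record, from Table 8.1.
HEADER_LAYOUT = [
    ("RecordID", 0, 3),
    ("CustomerNumber", 3, 20),
    ("InvoiceNumber", 23, 20),
    ("InvoiceDate", 43, 8),
    ("PONumber", 51, 20),
    ("DueDate", 71, 8),
]


def split_record(record, layout):
    """Slice a record into trimmed fields, dropping empty ones."""
    fields = {}
    for name, offset, length in layout:
        # Slicing past the end of a short (variable length) record
        # simply yields an empty string.
        value = record[offset:offset + length].strip()
        if value:  # empty fields do not create Elements
            fields[name] = value
    return fields


# Build the first sample header record with explicit padding.
record = ("HDR" + "BQ003".ljust(20) + "2002041".ljust(20)
          + "20021112" + "AZ999345".ljust(20) + "20021212")
print(split_record(record, HEADER_LAYOUT))
```

The dates are still in YYYYMMDD form at this point; data type conversion is a separate step.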

Figure 8.1 Sample Input Flat File (Invoices.dat)

          10        20        30        40        50        60        70        80        90        100       110       120       130
HDRBQ003               2002041             20021112AZ999345            20021212
SHPYazoo Grocers - NE Distribution Center  12 Industrial Parkway NW                                    Portland            ME 04101
LINHCVAN                       120000000259000003108
DSCInstant Hot Cocoa Mix - Vanilla flavor
LINHCMIN                       240000000253000006072
DSCInstant Hot Cocoa Mix - Mint flavor
SUM0000009180         2
HDRBQ003               2002042             20021112AW999346            20021212
SHPYazoo Grocers - SE Distribution Center  Dock 37                       3975 Hwy 75                   Atoka               OK 74525
LINHCVAN                       360000000259000009324
DSCInstant Hot Cocoa Mix - Vanilla flavor
LINHCMIN                       720000000253000018216
DSCInstant Hot Cocoa Mix - Mint flavor
SUM0000027540         2
HDRAY001               2002043             200211122002-0967           20021212
SHPCorner Drug and Sundries                14 Main Street                                              Wichita             KS 67201
LINHCVAN                       240000000259000006216
DSCInstant Hot Cocoa Mix - Vanilla flavor
SUM0000006216         1
HDRBR095               2002044             200211124397-0498           20021212
SHPBig Box Discounters - Store # 97        37 MegaMall                                                 Azusa               CA 91702
LINHCMIN                      1200000000253000030360
DSCInstant Hot Cocoa Mix - Mint flavor
LINHCVAN                      3600000000259000093240
DSCInstant Hot Cocoa Mix - Vanilla flavor
LINHCDUC                      2400000000259000062160
DSCInstant Hot Cocoa Mix - Dutch Chocolate flavor
SUM0000185760         3
HDRBR095               2002045             200211124345-0498           20021212
SHPBig Box Discounters - Store # 45        45 Highway 76                                               Branson             MO 65615
LINHCMIN                       720000000253000018216
DSCInstant Hot Cocoa Mix - Mint flavor
LINHCDUC                       960000000259000024864
DSCInstant Hot Cocoa Mix - Dutch Chocolate flavor
SUM0000043080         2
HDRDQ349               2002046             20021112987-43671           20021212
SHPMaple Leaf Grocers - DC #1              987 Yorkland Blvd                                           Willowdale          ON M2J 4Y8   CAN
LINHCMOC                     36000000000269000096840
DSCInstant Hot Cocoa Mix - Mocha flavor
SUM0000096840         1
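The way the input file divides into logical documents, and then into trading partner groups, can be sketched in Python as follows. This is not the utility's algorithm, which is driven by the file description document. The sketch assumes that every logical document begins with an HDR record, and it treats the Customer Number (offset 3, length 20) as the partner break field purely for illustration. The function names are hypothetical, and the sample records are abbreviated.

```python
from collections import defaultdict


def split_documents(records, start_tag="HDR"):
    """Group flat file records into logical documents."""
    documents = []
    for record in records:
        if record.startswith(start_tag):
            documents.append([])  # a new logical document begins
        if not documents:
            raise ValueError("first record must be a " + start_tag + " record")
        documents[-1].append(record)
    return documents


def group_by_partner(documents, offset=3, length=20):
    """Map each partner ID to its documents, using the first record."""
    groups = defaultdict(list)
    for document in documents:
        partner = document[0][offset:offset + length].strip()
        groups[partner].append(document)
    return dict(groups)


# Abbreviated records: only the tag and the leading fields are present.
records = [
    "HDR" + "BQ003".ljust(20) + "2002041",
    "SUM0000009180         2",
    "HDR" + "AY001".ljust(20) + "2002043",
    "SUM0000006216         1",
]
documents = split_documents(records)
print(len(documents))
print(sorted(group_by_partner(documents)))
```

Reading the partner ID from the first record of each document reflects the restriction that the break field must be in the beginning record of a logical document.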

Listed below are the first three XML documents produced by the utility from this input file.

FlatInvoice001.xml

<?xml version="1.0" encoding="UTF-8"?>
<FlatInvoice
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:noNamespaceSchemaLocation="FlatInvoice.xsd">
  <Header>
    <RecordID>HDR</RecordID>
    <CustomerNumber>BQ003</CustomerNumber>
    <InvoiceNumber>2002041</InvoiceNumber>
    <InvoiceDate>2002-11-12</InvoiceDate>
    <PONumber>AZ999345</PONumber>
    <DueDate>2002-12-12</DueDate>
  </Header>
  <ShipTo>
    <RecordID>SHP</RecordID>
    <ShipToName>
      Yazoo Grocers - NE Distribution Center
    </ShipToName>
    <ShipToStreet1>12 Industrial Parkway NW</ShipToStreet1>
    <ShipToCity>Portland</ShipToCity>
    <ShipToStateOrProvince>ME</ShipToStateOrProvince>
    <ShipToPostalCode>04101</ShipToPostalCode>
  </ShipTo>
  <LineItemGroup>
    <LineItem>
      <RecordID>LIN</RecordID>
      <ItemID>HCVAN</ItemID>
      <ItemQuantity>12</ItemQuantity>
      <UnitPrice>2.59</UnitPrice>
      <ExtendedPrice>31.08</ExtendedPrice>
    </LineItem>
    <ItemDescription>
      <RecordID>DSC</RecordID>
      <Description>
        Instant Hot Cocoa Mix - Vanilla flavor
      </Description>
    </ItemDescription>
  </LineItemGroup>
  <LineItemGroup>
    <LineItem>
      <RecordID>LIN</RecordID>
      <ItemID>HCMIN</ItemID>
      <ItemQuantity>24</ItemQuantity>
      <UnitPrice>2.53</UnitPrice>
      <ExtendedPrice>60.72</ExtendedPrice>
    </LineItem>
    <ItemDescription>
      <RecordID>DSC</RecordID>
      <Description>
        Instant Hot Cocoa Mix - Mint flavor
      </Description>
    </ItemDescription>
  </LineItemGroup>
  <Summary>
    <RecordID>SUM</RecordID>
    <TotalAmount>91.80</TotalAmount>
    <NumberOfLines>2</NumberOfLines>
  </Summary>
</FlatInvoice>

FlatInvoice002.xml

<?xml version="1.0" encoding="UTF-8"?>
<FlatInvoice
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:noNamespaceSchemaLocation="FlatInvoice.xsd">
  <Header>
    <RecordID>HDR</RecordID>
    <CustomerNumber>BQ003</CustomerNumber>
    <InvoiceNumber>2002042</InvoiceNumber>
    <InvoiceDate>2002-11-12</InvoiceDate>
    <PONumber>AW999346</PONumber>
    <DueDate>2002-12-12</DueDate>
  </Header>
  <ShipTo>
    <RecordID>SHP</RecordID>
    <ShipToName>
      Yazoo Grocers - SE Distribution Center
    </ShipToName>
    <ShipToStreet1>Dock 37</ShipToStreet1>
    <ShipToStreet2>3975 Hwy 75</ShipToStreet2>
    <ShipToCity>Atoka</ShipToCity>
    <ShipToStateOrProvince>OK</ShipToStateOrProvince>
    <ShipToPostalCode>74525</ShipToPostalCode>
  </ShipTo>
  <LineItemGroup>
    <LineItem>
      <RecordID>LIN</RecordID>
      <ItemID>HCVAN</ItemID>
      <ItemQuantity>36</ItemQuantity>
      <UnitPrice>2.59</UnitPrice>
      <ExtendedPrice>93.24</ExtendedPrice>
    </LineItem>
    <ItemDescription>
      <RecordID>DSC</RecordID>
      <Description>
        Instant Hot Cocoa Mix - Vanilla flavor
      </Description>
    </ItemDescription>
  </LineItemGroup>
  <LineItemGroup>
    <LineItem>
      <RecordID>LIN</RecordID>
      <ItemID>HCMIN</ItemID>
      <ItemQuantity>72</ItemQuantity>
      <UnitPrice>2.53</UnitPrice>
      <ExtendedPrice>182.16</ExtendedPrice>
    </LineItem>
    <ItemDescription>
      <RecordID>DSC</RecordID>
      <Description>
        Instant Hot Cocoa Mix - Mint flavor
      </Description>
    </ItemDescription>
  </LineItemGroup>
  <Summary>
    <RecordID>SUM</RecordID>
    <TotalAmount>275.40</TotalAmount>
    <NumberOfLines>2</NumberOfLines>
  </Summary>
</FlatInvoice>

FlatInvoice003.xml

<?xml version="1.0" encoding="UTF-8"?>
<FlatInvoice
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:noNamespaceSchemaLocation="FlatInvoice.xsd">
  <Header>
    <RecordID>HDR</RecordID>
    <CustomerNumber>AY001</CustomerNumber>
    <InvoiceNumber>2002043</InvoiceNumber>
    <InvoiceDate>2002-11-12</InvoiceDate>
    <PONumber>2002-0967</PONumber>
    <DueDate>2002-12-12</DueDate>
  </Header>
  <ShipTo>
    <RecordID>SHP</RecordID>
    <ShipToName>Corner Drug and Sundries</ShipToName>
    <ShipToStreet1>14 Main Street</ShipToStreet1>
    <ShipToCity>Wichita</ShipToCity>
    <ShipToStateOrProvince>KS</ShipToStateOrProvince>
    <ShipToPostalCode>67201</ShipToPostalCode>
  </ShipTo>
  <LineItemGroup>
    <LineItem>
      <RecordID>LIN</RecordID>
      <ItemID>HCVAN</ItemID>
      <ItemQuantity>24</ItemQuantity>
      <UnitPrice>2.59</UnitPrice>
      <ExtendedPrice>62.16</ExtendedPrice>
    </LineItem>
    <ItemDescription>
      <RecordID>DSC</RecordID>
      <Description>
        Instant Hot Cocoa Mix - Vanilla flavor
      </Description>
    </ItemDescription>
  </LineItemGroup>
  <Summary>
    <RecordID>SUM</RecordID>
    <TotalAmount>62.16</TotalAmount>
    <NumberOfLines>1</NumberOfLines>
  </Summary>
</FlatInvoice>
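The data type conversions visible in these documents (YYYYMMDD dates, implied decimals, space-filled integers) can be sketched in Python with the standard library. This is an illustration under our own assumptions rather than the utility's conversion code; the helper names `to_date`, `to_implied_decimal`, and `record_element` are hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import datetime
from decimal import Decimal


def to_date(text):
    """Convert YYYYMMDD to the YYYY-MM-DD form seen in the output."""
    return datetime.strptime(text, "%Y%m%d").date().isoformat()


def to_implied_decimal(text, places=2):
    """Convert zero-filled digits with an implied decimal point."""
    # scaleb shifts the exponent, so trailing zeros are preserved.
    return str(Decimal(int(text)).scaleb(-places))


def record_element(name, fields):
    """Create a record Element with one child per non-empty field."""
    element = ET.Element(name)
    for field_name, value in fields:
        if value:  # empty fields do not create Elements
            ET.SubElement(element, field_name).text = value
    return element


# Field values taken from the first invoice's SUM record.
summary = record_element("Summary", [
    ("RecordID", "SUM"),
    ("TotalAmount", to_implied_decimal("0000009180")),
    ("NumberOfLines", str(int("         2"))),
])
print(to_date("20021112"))
print(ET.tostring(summary, encoding="unicode"))
```

The printed Summary Element carries the same three values as the Summary Element in FlatInvoice001.xml, although ElementTree writes it on a single line.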


Using XML with Legacy Business Applications
ISBN: 0321154940
Year: 2003
Pages: 181
