XML to CSV: Functionality and Operation


Requirements

The general requirement for this utility is to convert one or more XML instance documents, each representing a single logical document, into a single CSV file. As with the CSV to XML utility, this is a significant improvement over the utility presented in Chapter 2. Again, many of the restrictions on both the CSV file format and the grammar of the XML documents are removed. Here's a summary of the required functionality.

  • Inputs: A directory containing one or more XML instance documents, each in a single file. All have the same grammar. The second input is an XML file description document (as discussed in Chapter 6) describing the XML documents and the CSV file to be produced.

  • Processing: Each input Element corresponding to a row (as described in the file description document) is written as a record in the CSV output file. Each of its child Elements corresponding to a column is written to the appropriate column position. Column delimiter characters are inserted, and data types are converted according to the column descriptions in the file description document.

  • Output: A CSV file in row and column organization.
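The processing step above can be sketched in Java roughly as follows. This is only an illustration under stated assumptions, not the utility's actual code: the class name RowToRecordSketch, the COLUMNS map, and the four-column layout are hypothetical stand-ins for what the file description document would supply, and data type conversion is left out.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

final class RowToRecordSketch {
    // Hypothetical stand-in for the file description document:
    // Element name -> 1-based CSV column number.
    static final Map<String, Integer> COLUMNS = Map.of(
        "CustomerNumber", 1, "PONumber", 2, "ItemID", 3, "OrderedQty", 4);
    static final int COLUMN_COUNT = 4;

    // Returns one CSV record per Element named rowName in the XML text.
    static List<String> convert(String xml, String rowName) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse input document", e);
        }
        List<String> records = new ArrayList<>();
        NodeList rows = doc.getElementsByTagName(rowName);
        for (int i = 0; i < rows.getLength(); i++) {
            String[] fields = new String[COLUMN_COUNT];
            Arrays.fill(fields, ""); // columns with no Element stay empty
            NodeList children = rows.item(i).getChildNodes();
            for (int j = 0; j < children.getLength(); j++) {
                Node child = children.item(j);
                if (child.getNodeType() != Node.ELEMENT_NODE) {
                    continue; // skip whitespace text nodes between Elements
                }
                Integer column = COLUMNS.get(child.getNodeName());
                if (column != null) {
                    // Trim the surrounding whitespace that pretty-printed XML carries.
                    fields[column - 1] = child.getTextContent().trim();
                }
            }
            records.add(String.join(",", fields));
        }
        return records;
    }
}
```

A row Element that lacks one of the mapped children simply yields an empty field in that column position.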

Running the Utility

This section provides instructions for running the revised XML to CSV Converter utility from the command line.

For Java:

java XMLToCSV InputDirectory OutputFile.csv FileDescription.XML

or

java XMLToCSV -h

For C++ on Win32:

XMLToCSV InputDirectory OutputFile.csv FileDescription.XML

or

XMLToCSV -h

Options follow the parameters except for the help option, which may be specified by itself.

Parameters:

  • First: Path specification of the input directory (required). Either a relative or an absolute path name may be specified. The trailing directory separator character is optional.

  • Second: File specification of the output CSV file (required). The specification may include the full or relative path name. If no path name is specified, the file will be created in the current working directory. An extension may be specified, but if not specified the file will be created without one. If a file with the specified name already exists, it will be overwritten.

  • Third: File specification of the File Description XML instance document (required). If no path name is specified, the file is assumed to reside in the current working directory. The full file name must be specified, but there is no restriction on the extension name.

Options:

  • -v (Validate): Validate the XML documents before processing. The documents are validated against the schema referenced in the document root Element.

  • -h (Help): Display a help message and exit without further processing.
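As a rough illustration of the command line rules just described (three required parameters, options after them, and the help option allowed by itself), argument handling might look something like the sketch below. The class name CommandLineSketch and the choice to reject unknown options with an exception are assumptions made for this example; the utility's real error handling may differ.

```java
final class CommandLineSketch {
    final String inputDirectory;
    final String outputFile;
    final String fileDescription;
    final boolean validate;
    final boolean help;

    private CommandLineSketch(String inputDirectory, String outputFile,
                              String fileDescription, boolean validate, boolean help) {
        this.inputDirectory = inputDirectory;
        this.outputFile = outputFile;
        this.fileDescription = fileDescription;
        this.validate = validate;
        this.help = help;
    }

    static CommandLineSketch parse(String[] args) {
        // The help option may be specified by itself.
        if (args.length == 1 && "-h".equals(args[0])) {
            return new CommandLineSketch(null, null, null, false, true);
        }
        if (args.length < 3) {
            throw new IllegalArgumentException(
                "Expected: InputDirectory OutputFile FileDescription [options]");
        }
        boolean validate = false;
        boolean help = false;
        // Options follow the three positional parameters.
        for (int i = 3; i < args.length; i++) {
            if ("-v".equals(args[i])) {
                validate = true;
            } else if ("-h".equals(args[i])) {
                help = true;
            } else {
                // Assumption: unknown options are treated as errors.
                throw new IllegalArgumentException("Unknown option: " + args[i]);
            }
        }
        return new CommandLineSketch(args[0], args[1], args[2], validate, help);
    }
}
```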

Restrictions:

The restrictions are the same as those specified for the CSV to XML utility, plus the following.

  • Field Elements in the input XML document must be ordered sequentially according to their column positions in the output CSV row.
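One plausible reason for the ordering restriction is that it lets a record be written in a single forward pass, emitting a delimiter for each skipped column as it goes. The sketch below illustrates that idea only; the class name SequentialFieldsSketch and the parallel-array interface are hypothetical, and the actual utility may be organized quite differently.

```java
final class SequentialFieldsSketch {
    // columns[i] is the 1-based CSV column of values[i], in document order.
    static String writeRecord(int[] columns, String[] values, int columnCount) {
        StringBuilder record = new StringBuilder();
        int current = 1;     // column position the writer has reached
        int lastWritten = 0; // column of the most recent field written
        for (int i = 0; i < columns.length; i++) {
            int column = columns[i];
            if (column <= lastWritten || column > columnCount) {
                throw new IllegalArgumentException(
                    "Field for column " + column + " is out of sequence");
            }
            // One delimiter for each column boundary crossed, which leaves
            // any skipped columns empty.
            while (current < column) {
                record.append(',');
                current++;
            }
            record.append(values[i]);
            lastWritten = column;
        }
        // Pad trailing empty columns so every record has columnCount fields.
        while (current < columnCount) {
            record.append(',');
            current++;
        }
        return record.toString();
    }
}
```

For example, fields for columns 1, 2, and 4 in a five-column layout produce a record with empty third and fifth fields, while a field that arrives after a higher-numbered column has been written is rejected.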

Sample Input and Output: Purchase Order

Following through with our simple procurement example, we'll use a purchase order as the sample document. (See Table 7.2 for the layout. Note that the Description column in this table includes a required or optional designation. This is something we'll be concerned about validating when importing into the application but not when we export from it.) We'll have several XML documents representing purchase orders from different customers, each ordering the gourmet hot chocolate mixes. We want to convert the XML documents to CSV format, in a single file, so that we can import them into our desktop bookkeeping and order management system.

Here are three purchase orders that follow this logical organization.

Sample PurchaseOrders01.xml
<?xml version="1.0" encoding="UTF-8"?>
<PurchaseOrder>
  <POLine>
    <CustomerNumber>BQ003</CustomerNumber>
    <PONumber>AZ999345</PONumber>
    <PODate>2002-11-12</PODate>
    <RequestedDeliveryDate>2002-11-15</RequestedDeliveryDate>
    <ShipToName>
      Yazoo Grocers - NE Distribution Center
    </ShipToName>
    <ShipToStreet1>12 Industrial Parkway, NW</ShipToStreet1>
    <ShipToCity>Portland</ShipToCity>
    <ShipToState>ME</ShipToState>
    <ShipToPostalCode>04101</ShipToPostalCode>
    <ItemID>HCVAN</ItemID>
    <OrderedQty>12</OrderedQty>
    <UnitPrice>2.59</UnitPrice>
    <ItemDescription>
      Instant Hot Cocoa Mix - Vanilla flavor
    </ItemDescription>
  </POLine>
  <POLine>
    <CustomerNumber>BQ003</CustomerNumber>
    <PONumber>AZ999345</PONumber>
    <PODate>2002-11-12</PODate>
    <RequestedDeliveryDate>2002-11-15</RequestedDeliveryDate>
    <ShipToName>
      Yazoo Grocers - NE Distribution Center
    </ShipToName>
    <ShipToStreet1>12 Industrial Parkway, NW</ShipToStreet1>
    <ShipToCity>Portland</ShipToCity>
    <ShipToState>ME</ShipToState>
    <ShipToPostalCode>04101</ShipToPostalCode>
    <ItemID>HCMIN</ItemID>
    <OrderedQty>24</OrderedQty>
    <UnitPrice>2.53</UnitPrice>
    <ItemDescription>
      Instant Hot Cocoa Mix - Mint flavor
    </ItemDescription>
  </POLine>
</PurchaseOrder>

Table 7.2. Logical Layout for the Purchase Order

| Column Number | Column Name | Data Type | Description |
|---|---|---|---|
| 1 | Customer Number | Alphanumeric | Required; identifier assigned to the customer in our system |
| 2 | PO Number | Alphanumeric | Required; customer purchase order number |
| 3 | PO Date | Date | Required; date that the purchase order was issued |
| 4 | Requested Delivery Date | Date | Optional; date on which delivery of the ordered items is requested |
| 5 | Ship to Name | Alphanumeric (delimited) | Required; name of the receiving location for shipped order |
| 6 | Ship to Street 1 | Alphanumeric (delimited) | Required; first address line of the receiving location |
| 7 | Ship to Street 2 | Alphanumeric (delimited) | Optional; second address line of the receiving location |
| 8 | Ship to City | Alphanumeric | Required; city of the receiving location |
| 9 | Ship to State or Province | Alphanumeric | Required; state or province of the receiving location |
| 10 | Ship to Postal Code | Alphanumeric | Required; postal code of the receiving location |
| 11 | Ship to Country | Alphanumeric | Optional; country of the receiving location |
| 12 | Item ID | Alphanumeric | Required; identifier for the ordered item |
| 13 | Item Quantity | Decimal number | Required; number of units ordered |
| 14 | Item Unit Price | Decimal number | Optional (since price may be determined by contract); unit price in U.S. dollars |
| 15 | Item Description | Alphanumeric (delimited) | Optional; description of the ordered item |

Sample PurchaseOrders02.xml
<?xml version="1.0" encoding="UTF-8"?>
<PurchaseOrder>
  <POLine>
    <CustomerNumber>BQ003</CustomerNumber>
    <PONumber>AW999346</PONumber>
    <PODate>2002-11-12</PODate>
    <RequestedDeliveryDate>2002-11-15</RequestedDeliveryDate>
    <ShipToName>
      Yazoo Grocers - SE Distribution Center
    </ShipToName>
    <ShipToStreet1>Dock 37</ShipToStreet1>
    <ShipToStreet2>3975 Hwy 75</ShipToStreet2>
    <ShipToCity>Atoka</ShipToCity>
    <ShipToState>OK</ShipToState>
    <ShipToPostalCode>74525</ShipToPostalCode>
    <ItemID>HCVAN</ItemID>
    <OrderedQty>36</OrderedQty>
    <UnitPrice>2.59</UnitPrice>
    <ItemDescription>
      Instant Hot Cocoa Mix - Vanilla flavor
    </ItemDescription>
  </POLine>
  <POLine>
    <CustomerNumber>BQ003</CustomerNumber>
    <PONumber>AW999346</PONumber>
    <PODate>2002-11-12</PODate>
    <RequestedDeliveryDate>2002-11-15</RequestedDeliveryDate>
    <ShipToName>
      Yazoo Grocers - SE Distribution Center
    </ShipToName>
    <ShipToStreet1>Dock 37</ShipToStreet1>
    <ShipToStreet2>3975 Hwy 75</ShipToStreet2>
    <ShipToCity>Atoka</ShipToCity>
    <ShipToState>OK</ShipToState>
    <ShipToPostalCode>74525</ShipToPostalCode>
    <ItemID>HCMIN</ItemID>
    <OrderedQty>72</OrderedQty>
    <UnitPrice>2.53</UnitPrice>
    <ItemDescription>
      Instant Hot Cocoa Mix - Mint flavor
    </ItemDescription>
  </POLine>
</PurchaseOrder>

Sample PurchaseOrders03.xml
<?xml version="1.0" encoding="UTF-8"?>
<PurchaseOrder>
  <POLine>
    <CustomerNumber>AY001</CustomerNumber>
    <PONumber>2002-0967</PONumber>
    <PODate>2002-11-12</PODate>
    <RequestedDeliveryDate>2002-11-14</RequestedDeliveryDate>
    <ShipToName>Corner Drug and Sundries</ShipToName>
    <ShipToStreet1>14 Main Street</ShipToStreet1>
    <ShipToCity>Wichita</ShipToCity>
    <ShipToState>KS</ShipToState>
    <ShipToPostalCode>67201</ShipToPostalCode>
    <ItemID>HCVAN</ItemID>
    <OrderedQty>24</OrderedQty>
    <UnitPrice>2.59</UnitPrice>
    <ItemDescription>
      Instant Hot Cocoa Mix - Vanilla flavor
    </ItemDescription>
  </POLine>
</PurchaseOrder>

Successful processing of these three documents should produce a CSV file that looks like the one shown below. (Note: Line breaks and indentation have been inserted for readability.)

Sample Output CSV File
BQ003,AZ999345,11/12/2002,11/15/2002,
  "Yazoo Grocers - NE Distribution Center",
  "12 Industrial Parkway, NW",,"Portland",ME,04101,,
  HCVAN,12,2.59,"Instant Hot Cocoa Mix - Vanilla flavor"
BQ003,AZ999345,11/12/2002,11/15/2002,
  "Yazoo Grocers - NE Distribution Center",
  "12 Industrial Parkway, NW",,"Portland",ME,04101,,
  HCMIN,24,2.53,"Instant Hot Cocoa Mix - Mint flavor"
BQ003,AW999346,11/12/2002,11/15/2002,
  "Yazoo Grocers - SE Distribution Center",
  "Dock 37","3975 Hwy 75","Atoka",OK,74525,,
  HCVAN,36,2.59,"Instant Hot Cocoa Mix - Vanilla flavor"
BQ003,AW999346,11/12/2002,11/15/2002,
  "Yazoo Grocers - SE Distribution Center",
  "Dock 37","3975 Hwy 75","Atoka",OK,74525,,
  HCMIN,72,2.53,"Instant Hot Cocoa Mix - Mint flavor"
AY001,2002-0967,11/12/2002,11/14/2002,
  "Corner Drug and Sundries",
  "14 Main Street",,"Wichita",KS,67201,,
  HCVAN,24,2.59,"Instant Hot Cocoa Mix - Vanilla flavor"
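
Comparing the sample input with this output shows two conversions at work: dates move from the XML form (2002-11-12) to month/day/year (11/12/2002), and delimited text fields are trimmed and wrapped in double quotes. A minimal Java sketch of both follows. The class name FieldFormatSketch is hypothetical, and doubling an embedded quote character is a common CSV convention assumed here rather than something the text specifies.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

final class FieldFormatSketch {
    // Output date pattern inferred from the sample CSV file.
    private static final DateTimeFormatter CSV_DATE =
        DateTimeFormatter.ofPattern("MM/dd/yyyy");

    // Converts an ISO 8601 date such as 2002-11-12 to 11/12/2002.
    static String formatDate(String isoDate) {
        return LocalDate.parse(isoDate.trim()).format(CSV_DATE);
    }

    // Trims surrounding whitespace and wraps the text in double quotes.
    // Assumption: an embedded quote is escaped by doubling it.
    static String delimit(String text) {
        return "\"" + text.trim().replace("\"", "\"\"") + "\"";
    }
}
```

In the utility itself, the choice of conversion for each column would come from the column descriptions in the file description document rather than being hard-coded.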


Using XML with Legacy Business Applications
ISBN: 0321154940
Year: 2003
Pages: 181
