New DataCell Methods and Derived Classes

For the data types supported for CSV files, we add two new methods to the base class and create three new derived classes. Depending on the conversion requirements, each new derived class may have its own implementation of the fromXML or toXML method, or it may use the implementation in the base class.

New DataCell Methods

delimitText

This method checks whether the DelimitText Attribute of the cell's Grammar Element is set to true. It is primarily used by the XML to CSV utility to determine whether a column should have the text delimiter added to it before it is written to the output CSV file.

Logic for the DataCell delimitText Method

Arguments:
  None
Returns:
  Boolean - true if DelimitText Attribute is present and has a
      value of true, false otherwise
Delimit Text String <- call Grammar Element's getAttribute
    on "DelimitText"
IF Delimit Text String is null or Delimit Text String is false
  Return false
ENDIF
Return true
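As a rough illustration only, the check above might look like this in Java. The class name DelimitTextSketch and the idea of passing the attribute value in as a String are assumptions made to keep the example self-contained; they are not the DataCell code itself. The sketch follows the Returns description and treats only the literal value true as true. It also returns false for an empty string, because the DOM getAttribute method on org.w3c.dom.Element returns an empty string rather than null when the Attribute is absent.

```java
// Hypothetical stand-alone sketch; not the actual DataCell class.
final class DelimitTextSketch {

    // attributeValue is assumed to be the result of looking up the
    // "DelimitText" Attribute on the cell's Grammar Element.
    static boolean delimitText(String attributeValue) {
        // "true".equals(...) is false for null, for the empty string,
        // and for any other value, so only the literal "true" passes.
        return "true".equals(attributeValue);
    }
}
```

With this sketch, delimitText("true") yields true, while null, an empty string, and "false" all yield false.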

trimLeadingZeroes

This method trims leading zeroes from numeric data types. It also trims leading and trailing whitespace. It is designed to trim leading zeroes and spaces from the following numeric representations while retaining the sign character:

Sign character in first position followed by leading spaces or zeroes
Leading spaces followed by leading sign character
Leading spaces or zeroes with no sign character

Note that this algorithm is somewhat permissive in that it will convert and pass data that isn't numeric or that has more than one sign character. Again, we depend on schema validation to detect most of those kinds of problems. We keep our code simple here by doing the minimum required.

Logic for the DataCell trimLeadingZeroes Method

Arguments:
  None
Returns:
  Error status or throws exception
Initialize Sign Character
IF first character is + or - sign character
  Sign Character <- first Character
ENDIF
Call trim (C++ base class method or native Java) to remove
    leading and trailing whitespace from Cell Buffer
IF Cell Buffer is empty after trimming
  Return
ENDIF
Position <- 0
IF first character is + or - sign character
  Sign Character <- first Character
  Position <- 1
ENDIF
DO while Position < Buffer Length and
    Cell Buffer[Position] = "0"
  Position++
ENDDO
IF Position = Buffer Length because it contains only zeroes
  Decrement Position so that we have at least one zero
ENDIF
IF Sign Character is present
  Cell Buffer <- Sign Character + Cell Buffer substring starting
      at Position
ELSE
  Cell Buffer <- Cell Buffer substring starting at Position
ENDIF
Return success
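A minimal Java sketch of this logic follows. It is an illustration under stated assumptions, not the DataCell implementation: the class name TrimLeadingZeroesSketch is hypothetical, and the method takes and returns a String instead of working on the Cell Buffer in place. One deliberate difference from the logic as written above is that the loop skips spaces as well as zeroes after a sign character. That is an assumption made so that the first representation in the list (a sign followed by leading spaces) is handled.

```java
// Hypothetical stand-alone sketch; not the actual DataCell class.
final class TrimLeadingZeroesSketch {

    static String trimLeadingZeroes(String cellBuffer) {
        // Remove leading and trailing whitespace first.
        String s = cellBuffer.trim();
        if (s.isEmpty()) {
            return s;
        }

        // Remember a leading sign character, if there is one.
        String sign = "";
        int position = 0;
        char first = s.charAt(0);
        if (first == '+' || first == '-') {
            sign = String.valueOf(first);
            position = 1;
        }

        // Skip leading zeroes. Skipping spaces here too is an
        // assumption that goes beyond the pseudocode.
        while (position < s.length()
                && (s.charAt(position) == '0' || s.charAt(position) == ' ')) {
            position++;
        }

        // If everything after the sign was skipped and the last
        // character is a zero, back up so one zero remains.
        if (position == s.length() && position > 0
                && s.charAt(position - 1) == '0') {
            position--;
        }

        return sign + s.substring(position);
    }
}
```

For example, "  -0012" becomes "-12", "-  0012" also becomes "-12", "  007 " becomes "7", and "000" becomes "0". Like the pseudocode, the sketch is permissive: it does not reject non-numeric input.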

DataCellAN Class

This class handles conversion between an alphanumeric data type and the schema language string data type. It uses the base class fromXML method but implements its own version of the toXML method.

Logic for the DataCellAN toXML Method

Arguments:
  None
Returns:
  Error status or throws exception
Call trim (C++ utility method or native Java) to remove leading
    and trailing whitespace from Cell Buffer
Return success

DataCellReal Class

This class handles conversion between a real number (or decimal) data type and the schema language decimal data type. The main thing to note about this class is that the fromXML method trims spaces. If the source XML document is validated, the source Element with a schema language data type of decimal should never have spaces in it. However, we can't depend on validation being performed, and other methods that we'll develop later depend on there being no spaces in the Cell Buffer. So, we remove them with the fromXML method.

Logic for the DataCellReal fromXML Method

Arguments:
  None
Returns:
  Error status or throws exception
Trim Spaces
Remove leading plus sign if present
Return success
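In Java the two steps might be sketched as follows. The class name RealFromXmlSketch is hypothetical, and the method returns a new String instead of updating the Cell Buffer. The sketch also assumes that "Trim Spaces" means removing leading and trailing whitespace; if embedded spaces must be removed too, a different call would be needed.

```java
// Hypothetical stand-alone sketch; not the actual DataCellReal class.
final class RealFromXmlSketch {

    static String fromXML(String cellBuffer) {
        // Assumption: only leading and trailing whitespace is trimmed.
        String s = cellBuffer.trim();

        // Drop a leading plus sign; a minus sign is kept.
        if (s.startsWith("+")) {
            s = s.substring(1);
        }
        return s;
    }
}
```

With this sketch, " +12.50 " becomes "12.50", and "-3.0" is returned unchanged.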

Logic for the DataCellReal toXML Method

Arguments:
  None
Returns:
  Error status or throws exception
Call trimLeadingZeroes base class method to remove leading
    zeroes and leading and trailing spaces
Return success

DataCellDateMMsDDsYYYY Class

This class handles conversion between the date data type in MM/DD/YYYY format and the schema language date data type in ISO 8601 date format, that is, YYYY-MM-DD. In the MM/DD/YYYY format both the MM and the DD may be either one or two digits in length.

Note: Various library functions are available in Java and C++ that might make the implementation a bit more efficient than the one presented here. However, in the interest of keeping things simple for myself, as I'm implementing in both languages, I've chosen basic algorithms that work equally well in either language. All positions are expressed as offsets from the first character at position zero.

Also, remember that we're only putting enough validation into these routines to avoid nasty runtime exceptions. Schema validation is our primary method for ensuring that we have good dates.

Logic for the DataCellDateMMsDDsYYYY fromXML Method

Arguments:
  None
Returns:
  Error status or throws exception
IF Buffer Length != 10
  Return error
ENDIF
Month <- Cell Buffer characters at offsets 5 and 6
Day <- Cell Buffer characters at offsets 8 and 9
Year <- Cell Buffer characters at offsets 0 through 3
Cell Buffer <- Month + forward slash + Day + forward slash
    + Year
Return success
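The offset arithmetic above might be sketched in Java like this. The class name DateFromXmlSketch is hypothetical, and reporting the error by throwing IllegalArgumentException is an assumption; the logic only says "Return error". Note that Java's substring takes an exclusive end index, so offsets 5 and 6 become substring(5, 7).

```java
// Hypothetical stand-alone sketch; not the actual DataCellDateMMsDDsYYYY class.
final class DateFromXmlSketch {

    static String fromXML(String cellBuffer) {
        // YYYY-MM-DD is always exactly 10 characters long.
        if (cellBuffer.length() != 10) {
            throw new IllegalArgumentException(
                    "Expected YYYY-MM-DD but got: " + cellBuffer);
        }

        String year = cellBuffer.substring(0, 4);   // offsets 0 through 3
        String month = cellBuffer.substring(5, 7);  // offsets 5 and 6
        String day = cellBuffer.substring(8, 10);   // offsets 8 and 9

        return month + "/" + day + "/" + year;
    }
}
```

For example, "2003-07-04" becomes "07/04/2003". As in the pseudocode, only the length is checked, so a 10-character string that isn't a date passes through rearranged.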

Logic for the DataCellDateMMsDDsYYYY toXML Method

Arguments:
  None
Returns:
  Error status or throws exception
Month State = 0
Day State = 1
Year State = 2
State <- Month
Cell Buffer <- trim leading and trailing whitespace from
    Cell Buffer
IF Buffer Length > 10
  Return error
ENDIF
TempChar <- First character in Cell Buffer
DO until end of Cell Buffer
  IF TempChar = Slash
    Increment State
  ELSE
    DO CASE of State
      Month:
        Append TempChar to Month
        BREAK
      Day:
        Append TempChar to Day
        BREAK
      Year:
        Append TempChar to Year
        BREAK
      other:
        Return Error
    ENDDO
  ENDIF
  TempChar <- Next character in Cell Buffer
ENDDO
IF Length of Month is 1
  Month <- "0" + Month
ENDIF
IF Length of Day is 1
  Day <- "0" + Day
ENDIF
IF (Length of Month != 2) OR (Length of Day != 2) OR
    (Length of Year != 4)
  Return error
ENDIF
Cell Buffer <- Year + dash + Month + dash + Day
Return success
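Here is one way the state-driven scan might look in Java, offered as a sketch only. The class name DateToXmlSketch and the use of IllegalArgumentException for the error returns are assumptions, and the method returns a new String instead of replacing the Cell Buffer.

```java
// Hypothetical stand-alone sketch; not the actual DataCellDateMMsDDsYYYY class.
final class DateToXmlSketch {

    private static final int MONTH = 0;
    private static final int DAY = 1;
    private static final int YEAR = 2;

    static String toXML(String cellBuffer) {
        String s = cellBuffer.trim();
        if (s.length() > 10) {
            throw new IllegalArgumentException("Date too long: " + s);
        }

        StringBuilder month = new StringBuilder();
        StringBuilder day = new StringBuilder();
        StringBuilder year = new StringBuilder();
        int state = MONTH;

        // Each slash advances the state; every other character is
        // appended to the field that the current state selects.
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '/') {
                state++;
                continue;
            }
            switch (state) {
                case MONTH: month.append(c); break;
                case DAY:   day.append(c);   break;
                case YEAR:  year.append(c);  break;
                default:
                    throw new IllegalArgumentException("Too many slashes: " + s);
            }
        }

        // Pad one-digit months and days to two digits.
        if (month.length() == 1) month.insert(0, '0');
        if (day.length() == 1) day.insert(0, '0');

        if (month.length() != 2 || day.length() != 2 || year.length() != 4) {
            throw new IllegalArgumentException("Not MM/DD/YYYY: " + s);
        }
        return year + "-" + month + "-" + day;
    }
}
```

For example, "7/4/2003" becomes "2003-07-04" and " 12/25/2003 " becomes "2003-12-25". An input with no slashes, such as "2003-07-04", lands entirely in the month field and is rejected by the final length check.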