Having dealt with these general issues, we can now lay out the high-level design of our utilities. All the utilities for which the legacy format is the source format share a common structure, as do all those for which the legacy format is the target format. Both sets are built on several common classes developed in this chapter. Derived classes specific to each legacy format are built in each of the next three chapters.
Source Converter Processing
Figure 6.2 shows a simplified high-level collaboration diagram of the classes and processing of the source converter. Exact details vary for each of the converters, but this diagram and description should provide a good high-level understanding of the design.
Figure 6.2. Source Converter Collaboration Diagram
The main routine processes the command line arguments and then calls the processFile method of the SourceConverter class appropriate to the legacy format. The processFile method creates disk directories for the output XML documents, as determined by the processing options. It also creates the DOM Document objects and serializes them to disk. For each record in the input file, processFile calls the readRecord method of the RecordReader class appropriate to the legacy format and then calls the RecordReader's parseRecord method to extract the individual fields from the record. A DataCell object is created for each field, with the specific derived class determined by the field's data type. The toXML method of each DataCell object is then invoked to convert the contents of the field from the legacy data type to the corresponding schema language data type. Finally, the processFile method calls the RecordReader's writeRecord method to write the DataCell object contents to the DOM Elements that represent the legacy record and its fields.
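The processing sequence just described can be sketched in Java roughly as follows. This is a minimal illustration under stated assumptions, not the design developed in later chapters: the comma-delimited record layout, the element names (Records, Record, Name, OrderDate), the TextCell and DateCell classes, and all method signatures are hypothetical. Directory creation and serialization to disk are omitted, and having processFile invoke toXML directly is also an assumption.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class SourceConverterSketch {

    /** One field of a legacy record; each derived class handles one data type. */
    static abstract class DataCell {
        final String name;
        String contents;
        DataCell(String name, String contents) { this.name = name; this.contents = contents; }
        /** Converts contents from the legacy form to the schema language form. */
        abstract void toXML();
    }

    /** Hypothetical text type: legacy fields are assumed to be space padded. */
    static class TextCell extends DataCell {
        TextCell(String name, String contents) { super(name, contents); }
        @Override void toXML() { contents = contents.trim(); }
    }

    /** Hypothetical date type: legacy YYYYMMDD becomes YYYY-MM-DD. */
    static class DateCell extends DataCell {
        DateCell(String name, String contents) { super(name, contents); }
        @Override void toXML() {
            String d = contents.trim();
            contents = d.substring(0, 4) + "-" + d.substring(4, 6) + "-" + d.substring(6, 8);
        }
    }

    /** Reads, parses, and writes one record at a time (assumed comma-delimited input). */
    static class RecordReader {
        private final BufferedReader in;
        private String currentRecord;
        final List<DataCell> cells = new ArrayList<>();

        RecordReader(BufferedReader in) { this.in = in; }

        /** Reads the next record; returns false at end of input. */
        boolean readRecord() {
            try {
                currentRecord = in.readLine();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            return currentRecord != null;
        }

        /** Splits the record into fields and creates a DataCell for each one. */
        void parseRecord() {
            cells.clear();
            String[] fields = currentRecord.split(",", -1);
            cells.add(new TextCell("Name", fields[0]));
            cells.add(new DateCell("OrderDate", fields[1]));
        }

        /** Writes the cell contents as child Elements of a new Record Element. */
        void writeRecord(Document doc) {
            Element row = doc.createElement("Record");
            for (DataCell cell : cells) {
                Element field = doc.createElement(cell.name);
                field.setTextContent(cell.contents);
                row.appendChild(field);
            }
            doc.getDocumentElement().appendChild(row);
        }
    }

    /** Mirrors the processFile loop: read, parse, convert, write. */
    static Document processFile(BufferedReader input) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("could not create DOM Document", e);
        }
        doc.appendChild(doc.createElement("Records"));
        RecordReader reader = new RecordReader(input);
        while (reader.readRecord()) {
            reader.parseRecord();
            for (DataCell cell : reader.cells) {
                cell.toXML();
            }
            reader.writeRecord(doc);
        }
        return doc;
    }
}
```

Passing a BufferedReader wrapped around a file or a string to processFile yields a DOM Document with one Record Element per input line. The converter loop itself never inspects field types; that knowledge stays in the DataCell derived classes.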
Target Converter Processing
Figure 6.3 shows a simplified high-level collaboration diagram of the classes and processing of the target converter. Again, the exact details vary for each of the converters.
Figure 6.3. Target Converter Collaboration Diagram
The main routine is similar to that of the source converter in that it processes the command line arguments. However, instead of passing the name of a single legacy file to the converter class, it reads a directory to retrieve the file names of the XML documents to process. It also opens and closes the output legacy format file. For each XML document file in the input directory, it calls the TargetConverter's processDocument method. This method parses the input XML document file and loads it into a DOM Document object tree. Then, for each Element that represents a record in the legacy format, it calls the RecordWriter's parseRecord method, which creates a DataCell object for each Element in the DOM tree that corresponds to a field in the legacy format record. As with the source converter, the derived class of each DataCell object is determined by the legacy data type of the target field. Finally, the RecordWriter's writeRecord method is called to write the legacy format record. During its processing, writeRecord calls the fromXML method of each DataCell object to convert the data from the schema language data type to the legacy format data type.
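The reverse flow can be sketched in the same way. Again, this is only an illustration under assumptions: the element names (Record, OrderDate), the 12-character text width, the comma-delimited output, the TextCell and DateCell classes, and the method signatures are all hypothetical. The sketch parses XML from a string rather than a file and appends to a StringBuilder rather than an output file, and it leaves out the directory scan performed by the main routine.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class TargetConverterSketch {

    /** One field of a legacy record; each derived class handles one data type. */
    static abstract class DataCell {
        String contents;
        DataCell(String contents) { this.contents = contents; }
        /** Converts contents from the schema language form to the legacy form. */
        abstract void fromXML();
    }

    /** Hypothetical text type: pad on the right to an assumed width of 12. */
    static class TextCell extends DataCell {
        TextCell(String contents) { super(contents); }
        @Override void fromXML() { contents = String.format("%-12s", contents); }
    }

    /** Hypothetical date type: YYYY-MM-DD becomes legacy YYYYMMDD. */
    static class DateCell extends DataCell {
        DateCell(String contents) { super(contents); }
        @Override void fromXML() { contents = contents.replace("-", ""); }
    }

    static class RecordWriter {
        private final List<DataCell> cells = new ArrayList<>();

        /** Creates one DataCell per child Element of the record Element. */
        void parseRecord(Element recordElement) {
            cells.clear();
            for (Node n = recordElement.getFirstChild(); n != null; n = n.getNextSibling()) {
                if (n.getNodeType() != Node.ELEMENT_NODE) {
                    continue; // skip whitespace text nodes
                }
                // The derived class is chosen by the legacy type of the target field.
                if (n.getNodeName().equals("OrderDate")) {
                    cells.add(new DateCell(n.getTextContent()));
                } else {
                    cells.add(new TextCell(n.getTextContent()));
                }
            }
        }

        /** Converts each cell with fromXML and writes one legacy record. */
        void writeRecord(StringBuilder out) {
            for (int i = 0; i < cells.size(); i++) {
                DataCell cell = cells.get(i);
                cell.fromXML();
                if (i > 0) {
                    out.append(',');
                }
                out.append(cell.contents);
            }
            out.append('\n');
        }
    }

    /** Mirrors processDocument: parse the XML, then parse and write each record. */
    static void processDocument(String xml, StringBuilder out) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse input document", e);
        }
        RecordWriter writer = new RecordWriter();
        NodeList records = doc.getElementsByTagName("Record");
        for (int i = 0; i < records.getLength(); i++) {
            writer.parseRecord((Element) records.item(i));
            writer.writeRecord(out);
        }
    }
}
```

Calling processDocument once per input document with the same StringBuilder accumulates all the legacy records in one output, which corresponds to the main routine opening the output file once and closing it after the last document.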
Summary of Classes
The utilities are built using three families of classes: the Converter classes, the RecordHandler classes, and the DataCell classes.
Figures 6.4, 6.5, and 6.6, respectively, show the inheritance diagrams for these three families of classes. In this chapter we develop the base classes of each family. In Chapters 7, 8, and 9 we develop the derived classes appropriate for each legacy format.
Figure 6.4. Converter Class Inheritance
Figure 6.5. RecordHandler Class Inheritance
Figure 6.6. DataCell Class Inheritance
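As a rough orientation, the three families might be laid out in Java along the following lines. The placement of SourceConverter and TargetConverter under Converter, and of RecordReader and RecordWriter under RecordHandler, is inferred from the figure titles and the descriptions above; the method signatures and the DateCell class are hypothetical.

```java
class ClassFamiliesSketch {

    // Family 1: converters drive the processing of a file or document.
    static abstract class Converter { }

    static abstract class SourceConverter extends Converter {
        abstract void processFile(String inputFileName);
    }

    static abstract class TargetConverter extends Converter {
        abstract void processDocument(String documentFileName);
    }

    // Family 2: record handlers parse and write individual records.
    static abstract class RecordHandler {
        abstract void parseRecord();
        abstract void writeRecord();
    }

    static abstract class RecordReader extends RecordHandler {
        abstract boolean readRecord();
    }

    static abstract class RecordWriter extends RecordHandler { }

    // Family 3: data cells convert one field between representations.
    static abstract class DataCell {
        String contents;
        abstract void toXML();
        abstract void fromXML();
    }

    // Hypothetical derived cell for a date stored as YYYYMMDD in the legacy format.
    static class DateCell extends DataCell {
        DateCell(String contents) { this.contents = contents; }

        @Override void toXML() {
            contents = contents.substring(0, 4) + "-"
                    + contents.substring(4, 6) + "-" + contents.substring(6, 8);
        }

        @Override void fromXML() {
            contents = contents.replace("-", "");
        }
    }
}
```

In this arrangement, a converter for a particular legacy format would supply concrete subclasses of SourceConverter or TargetConverter and of RecordReader or RecordWriter, while DataCell subclasses such as the DateCell shown here could be shared by both directions of conversion.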