The specific method calls for validating instance documents against schemas fall outside the scope of the current W3C Recommendations, so we won't find many design issues in common between our Java and C++ implementations. However, there are a few.
The first issue is to decide where to put the schema files. We have a few options.
Even with locally hosted schema files, you can still read and validate an input instance document against a schema hosted somewhere else on the Internet. I'm just not going out of my way to code things so that you can create an instance document that way. It might work with the current versions of JAXP and MSXML, but it might not. Again, you can transform your document with XSLT if you need to use a schema on the Internet.
That decided, in order to accommodate as many runtime environments as possible, we'll use a local environment variable to specify the base URL of our schemas: BBSCHEMAS (for Babel Blaster schemas). For most shells used on UNIX systems, this is done by assigning a value to the variable and optionally exporting it, for example:
BBSCHEMAS=/babelblaster/schemas; export BBSCHEMAS
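Once the variable is set, a program can read it at startup and build the full schema URL from it. Here is a minimal Java sketch of that idea, not the utilities' actual code. The class name SchemaLocator, its method names, and the schema file name SimpleCSV.xsd are all assumptions made for illustration. It also assumes a runtime where System.getenv is available (Java 5 or later).

```java
// Minimal sketch: resolve a schema URL from the BBSCHEMAS environment variable.
// SchemaLocator and its method names are hypothetical, not from the utilities.
class SchemaLocator {

    // Join a base URL (or path) and a file name with exactly one slash.
    static String schemaUrl(String base, String fileName) {
        if (base.endsWith("/")) {
            return base + fileName;
        }
        return base + "/" + fileName;
    }

    // Read BBSCHEMAS and build the full schema URL; fail loudly if it is unset.
    static String fromEnvironment(String fileName) {
        String base = System.getenv("BBSCHEMAS");
        if (base == null || base.isEmpty()) {
            throw new IllegalStateException("BBSCHEMAS is not set");
        }
        return schemaUrl(base, fileName);
    }

    public static void main(String[] args) {
        // "SimpleCSV.xsd" is an assumed file name, used only for illustration.
        System.out.println(fromEnvironment("SimpleCSV.xsd"));
    }
}
```

Keeping the join logic separate from the environment lookup means the same code works whether BBSCHEMAS is written with or without a trailing slash.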
For WIN32 systems, you can use a set statement from the command line or set it through the appropriate control panel function. For example, on my Windows NT system, from System in the Control Panel, on the Environment tab, I set a User Variable for BBSCHEMAS with the following file URL:
The other design aspect shared by both implementations is the need to write the schema file reference to the output instance document. As we saw in the previous chapter, to do this we add two Attributes to the root Element of the document. This is done in the main routine of CSVToXMLBasic, immediately after creating the root Element and appending it to the Document Node. The first of the two Attributes declares the XMLSchema-instance namespace. To set this we call the root Element's setAttribute method, passing the Attribute name of xmlns:xsi and the URI value of the xsi namespace. For the second Attribute we set the value of an Attribute that lives in that namespace, noNamespaceSchemaLocation. Because the Attribute is from a different namespace we need to use the DOM Element's setAttributeNS method, which takes three arguments. The first is the namespace URI of the Attribute, the full URI value of the xsi namespace. The second is the qualified name of the Attribute, that is, the prefix and the local name. The third is the Attribute value. Here's the pseudocode.
Logic to Add Schema-related Attributes
Root Element <- Call Document's createElement method, with tagName of SimpleCSV
Document <- Call Document's appendChild to append Root to Document
Root Element <- Call Element's setAttribute method, with arguments of xmlns:xsi and http://www.w3.org/2001/XMLSchema-instance
Root Element <- Call Element's setAttributeNS method, with arguments of http://www.w3.org/2001/XMLSchema-instance, xsi:noNamespaceSchemaLocation, and the URL of our schema file
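In Java, this logic might look something like the following. It is a minimal sketch using the standard JAXP and DOM interfaces, not the actual CSVToXMLBasic code. The class name SchemaAttributes, the method name buildDocument, and the schema URL passed in main are assumptions made for illustration.

```java
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Minimal sketch of the schema-related Attribute logic.
// SchemaAttributes and buildDocument are hypothetical names.
class SchemaAttributes {

    // The XMLSchema-instance namespace URI, bound to the xsi prefix.
    static final String XSI_NS = "http://www.w3.org/2001/XMLSchema-instance";

    static Document buildDocument(String schemaUrl) throws ParserConfigurationException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document document = builder.newDocument();

        // Create the root Element and append it to the Document Node.
        Element root = document.createElement("SimpleCSV");
        document.appendChild(root);

        // First Attribute: declare the xsi namespace prefix.
        root.setAttribute("xmlns:xsi", XSI_NS);

        // Second Attribute: namespace URI, qualified name, then the value.
        root.setAttributeNS(XSI_NS, "xsi:noNamespaceSchemaLocation", schemaUrl);
        return document;
    }

    public static void main(String[] args) throws ParserConfigurationException {
        // The schema URL here is an assumed example value.
        Document document = buildDocument("file:///babelblaster/schemas/SimpleCSV.xsd");
        Element root = document.getDocumentElement();
        // Look the Attribute up by namespace URI and local name.
        System.out.println(root.getAttributeNS(XSI_NS, "noNamespaceSchemaLocation"));
    }
}
```

Note that the lookup with getAttributeNS takes the local name without the prefix, whereas setAttributeNS takes the qualified name with it.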
The other thing we need to do that is common to both implementations is to handle the new command line options. This is pretty plain-vanilla Java and C++, so I won't talk about it further here.
As a final word, since we're starting to build a more capable system and getting to more complex options, I'm adopting a different naming scheme for the root Element. For this chapter and these two more capable utilities I'm going to use CSVFile instead of SimpleCSV.