XML is not really a technology; it is a standard like Hypertext Markup Language (HTML). Most of us are familiar with HTML by now. Its primary use is to tell a web browser how to format a page visually. It contains tags for changing fonts, placing pictures, changing paragraph styles, and so forth.
XML, on the other hand, contains tags for defining and transferring data. It is less about the visual display and more about describing data. What sets XML apart from HTML is that users can extend it without prior approval from a standards body. In order to change HTML you must submit your request to the standards body (W3C) and wait for approval. XML's ability to extend itself is built into its standard. Industries can define their own tags and properties to standardize data formatting within their area. Just as HTML has cascading style sheets, XML has XSL, the Extensible Stylesheet Language. Since we are working specifically with DataSets, we will also need to understand a little bit of the XML Schema Definition (XSD).
We will not attempt a full discourse on XML here. We will go over the basics as they pertain to ADO .NET. Essentially this involves understanding the XSD file and the XML file. The two files, ending in the respective extensions .xsd and .xml, go together. The XSD file describes the data in the XML file. The XSD file contains metadata, which is data that describes other data. When you generate a DataSet using the wizard, or create it manually by entering your schema information through the Property window, what you're actually generating is an XSD file. The following is a sample XSD file for the Customers table in the Northwind database:
<?xml version="1.0" standalone="yes"?>
<xs:schema id="DataSet1" targetNamespace="http://www.tempuri.org/DataSet1.xsd" xmlns:mstns="http://www.tempuri.org/DataSet1.xsd" xmlns="http://www.tempuri.org/DataSet1.xsd" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:msdata="urn:schemas-microsoft-com:xml-msdata" attributeFormDefault="qualified" elementFormDefault="qualified">
  <xs:element name="DataSet1" msdata:IsDataSet="True">
    <xs:complexType>
      <xs:choice maxOccurs="unbounded">
        <xs:element name="Customers">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="CustomerID" type="xs:string" />
              <xs:element name="CompanyName" type="xs:string" />
              <xs:element name="ContactName" type="xs:string" minOccurs="0" />
              <xs:element name="ContactTitle" type="xs:string" minOccurs="0" />
              <xs:element name="Address" type="xs:string" minOccurs="0" />
              <xs:element name="City" type="xs:string" minOccurs="0" />
              <xs:element name="Region" type="xs:string" minOccurs="0" />
              <xs:element name="PostalCode" type="xs:string" minOccurs="0" />
              <xs:element name="Country" type="xs:string" minOccurs="0" />
              <xs:element name="Phone" type="xs:string" minOccurs="0" />
              <xs:element name="Fax" type="xs:string" minOccurs="0" />
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:choice>
    </xs:complexType>
    <xs:unique name="Constraint1" msdata:PrimaryKey="True">
      <xs:selector xpath=".//mstns:Customers" />
      <xs:field xpath="mstns:CustomerID" />
    </xs:unique>
  </xs:element>
</xs:schema>
This sample is from a demo application which we will create shortly. Only one line of code in the program was required to create this file. As you can see, the file consists of tags and end tags similar to HTML. There are headers, elements, and attributes just like in HTML. The difference is the tags describe data as opposed to describing a visual layout. This file, the XML schema, only describes the data; it does not contain the data. The XML file that goes along with it contains the data. Each element in the XSD file describes the data block in the XML file. For example, in the XSD file you see CustomerID described as a String type, and when you look in the XML file you will see a corresponding tag called CustomerID followed by the data. Let's look at the first few lines of the XML file and you will see what we mean.
<?xml version="1.0" standalone="yes" ?>
<DataSet1 xmlns="http://www.tempuri.org/DataSet1.xsd">
  <Customers>
    <CustomerID>ALFKI</CustomerID>
    <CompanyName>Alfreds Futterkiste</CompanyName>
    <ContactName>Maria Anders</ContactName>
    <ContactTitle>Sales Representative</ContactTitle>
    <Address>Obere Str. 57</Address>
    <City>Berlin</City>
    <Region />
    <PostalCode>12209</PostalCode>
    <Country>Germany</Country>
    <Phone>030-0074321</Phone>
    <Fax>030-0076545</Fax>
  </Customers>
  <Customers>
    <CustomerID>ANATR</CustomerID>
    <CompanyName>Ana Trujillo Emparedados y helados</CompanyName>
    <ContactName>Ana Trujillo</ContactName>
    <ContactTitle>Owner</ContactTitle>
    <Address>Avda. de la Constitución 2222</Address>
    <City>México D.F.</City>
    <PostalCode>05021</PostalCode>
    <Country>Mexico</Country>
    <Phone>(5) 555-4729</Phone>
    <Fax>(5) 555-3745</Fax>
  </Customers>
  <Customers>
Notice how the XML file makes no attempt to describe the data within it. That is the job of the XSD file. The URL in the second line of the XML file is a namespace name: it identifies the schema whose vocabulary the file uses, in this case DataSet1.xsd. Strictly speaking, a parser treats this URL as an identifier and does not download anything from it. The program reading the data must be given the matching XSD file; without it, the system can still parse the XML but cannot check it against the schema or recover the column types. The good thing is that the XSD file does not have to be on your own system. It can be published on any system reachable over the Internet. This is a big advantage. By keeping the XSD files in a central repository, anyone with a network connection to the repository can obtain the schema and validate the XML file. Much of the appeal of XML as an interchange format rests on this capability. If every schema had to be distributed privately to everyone who needed it, we would not have gained much over previous technologies.
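To make the role of the namespace concrete, here is a minimal sketch in Python (not part of the demo application, which uses ADO .NET) that reads a shortened copy of the data file with the standard library's ElementTree module. The function name read_customers and the prefix "ds" are our own choices for illustration. ElementTree uses the namespace URL only as part of each element's name and does not contact the address.

```python
import xml.etree.ElementTree as ET

# A shortened copy of the data file shown above (one customer only).
XML_DATA = """<?xml version="1.0" standalone="yes"?>
<DataSet1 xmlns="http://www.tempuri.org/DataSet1.xsd">
  <Customers>
    <CustomerID>ALFKI</CustomerID>
    <CompanyName>Alfreds Futterkiste</CompanyName>
    <City>Berlin</City>
    <Region />
  </Customers>
</DataSet1>"""

# The prefix "ds" is our own choice; only the URI has to match the file.
NS = {"ds": "http://www.tempuri.org/DataSet1.xsd"}


def read_customers(xml_text):
    """Return one dict per <Customers> row, keyed by column name."""
    root = ET.fromstring(xml_text)
    rows = []
    for customer in root.findall("ds:Customers", NS):
        row = {}
        for column in customer:
            # Tags look like "{namespace-uri}CustomerID"; keep the local name.
            name = column.tag.split("}", 1)[-1]
            # column.text is None for an empty element such as <Region />.
            row[name] = column.text
        rows.append(row)
    return rows


rows = read_customers(XML_DATA)
print(rows[0]["CustomerID"], rows[0]["CompanyName"])
```

Every value comes back as text (or as nothing at all for the empty Region element), which is exactly the gap the XSD file fills: the data file alone says nothing about types or which columns are required.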
XSL is to XML what cascading style sheets are to HTML. XSL is generally applied to XML to format the output so that it is understandable to an end user. For the purposes of ADO .NET, we will rarely use XSL directly; the files of interest are the XML file and the XSD file. If you want to know more about XSL, there are numerous in-depth books on the subject.
When looking at the XSD file, the two main things we're concerned about are elements and attributes. If you look back at the sample XSD file, you'll see that there are lots of tag types. The element tag and its attributes are the primary structure we need to look at. You can see that the element tag has the name and the data type as attributes. This is not the best example because all the data types are strings. How does it know what data types are allowed? Look at the opening xs:schema tag near the top of the file. It carries a number of arcane attributes pointing to URLs. Find the attribute xmlns:xs. Notice that it points to a URL at the W3C; this identifies the W3C XML Schema standard, which defines built-in types such as xs:string. The next attribute, xmlns:msdata="urn:schemas-microsoft-com:xml-msdata", names the Microsoft namespace that supplies the DataSet-specific extensions, such as msdata:IsDataSet and msdata:PrimaryKey. This is a very exciting technology because it allows us to define new extensions to the language for our own purposes (hence the word Extensible in the name). We are not limited to the structure defined by someone else. As long as your customer, your client, or whoever receives your data can obtain the XSD document, he or she can understand any file you send. The receiving system doesn't even have to run Windows; it could be Linux or UNIX. Contrast this with the "old days," when you would have to send arcane and lengthy specifications to whomever you were going to send your data. It was then up to their programmers to write some kind of conversion program to transform it into the format they would need.
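Because the XSD file is itself XML, any XML-aware program can read the element tags and their name and type attributes. The following is a minimal sketch in Python, using only the standard library, that lists the columns declared for a table in a shortened copy of the sample schema. The function name list_columns is hypothetical, and treating a missing minOccurs attribute as "required" follows the XML Schema default of 1.

```python
import xml.etree.ElementTree as ET

# Namespace URI that the xs: prefix stands for in the schema file.
XS = "http://www.w3.org/2001/XMLSchema"

# A shortened copy of the schema shown earlier (three columns only).
XSD_TEXT = """<?xml version="1.0" standalone="yes"?>
<xs:schema id="DataSet1"
    targetNamespace="http://www.tempuri.org/DataSet1.xsd"
    xmlns="http://www.tempuri.org/DataSet1.xsd"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:msdata="urn:schemas-microsoft-com:xml-msdata">
  <xs:element name="DataSet1" msdata:IsDataSet="True">
    <xs:complexType>
      <xs:choice maxOccurs="unbounded">
        <xs:element name="Customers">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="CustomerID" type="xs:string" />
              <xs:element name="CompanyName" type="xs:string" />
              <xs:element name="Region" type="xs:string" minOccurs="0" />
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:choice>
    </xs:complexType>
  </xs:element>
</xs:schema>"""


def list_columns(xsd_text, table):
    """Return (name, type, required) for each column of the named table."""
    root = ET.fromstring(xsd_text)
    columns = []
    for element in root.iter(f"{{{XS}}}element"):
        if element.get("name") != table:
            continue
        # Columns are the xs:element children of the table's xs:sequence.
        path = f"{{{XS}}}complexType/{{{XS}}}sequence/{{{XS}}}element"
        for col in element.findall(path):
            # minOccurs defaults to 1, so a missing attribute means required.
            required = col.get("minOccurs", "1") != "0"
            columns.append((col.get("name"), col.get("type"), required))
    return columns


for name, xsd_type, required in list_columns(XSD_TEXT, "Customers"):
    print(name, xsd_type, "required" if required else "optional")
```

The sketch only reads the declarations; it does not validate a data file against them. Validation needs a schema-aware library, which ADO .NET provides for DataSets.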
There are hundreds, perhaps thousands, of XSD specifications on the Web. Fortunately for us we are only interested in the one that involves DataSets. I hope this look at XML in general took some of the mystery out of the technological mumbo jumbo.