If you're working in an environment that consists entirely of platforms that support the .NET Framework, you might wonder why you need XML. You can use the facilities of ADO.NET to manipulate and transport your data, so why bother converting to and from a text format? The answer lies largely in the ubiquity of XML; XML has become the universal data representation of modern distributed applications. This representation allows data to be passed relatively easily between different platforms, different vendors, new and old applications, and so forth. The generic integration provided by Web services relies on XML to deliver interoperability. Almost all applications now have some ability to import and export XML, regardless of their background. Because of its ubiquity, XML forms an important part of the world that your applications will occupy.

XML as a Data Format

XML documents consist of data and tags that provide meaning and context for that data. XML is a text format rather than a binary one. If you need to, you can read and write XML using an ordinary text editor, such as Notepad. An XML document consists of tags, which are delimited by less-than (<) and greater-than (>) characters, and text content.
The following is a simple XML document that describes a catalog of cakes:

<?xml version="1.0" encoding="utf-8" ?>
<CakeCatalog xmlns="http://www.fourthcoffee.com/CakeCatalog.xsd">
  <CakeType style="Celebration" filling="sponge" shape="square">
    <Message>Congratulations</Message>
    <Description>General achievement</Description>
    <Sizes>
      <Option sizeInInches="10" />
      <Option sizeInInches="12" />
    </Sizes>
  </CakeType>
  <CakeType style="Celebration" filling="fruit" shape="round">
    <Message>Hi!</Message>
    <Description>Quite casual</Description>
    <Sizes>
      <Option sizeInInches="12" />
      <Option sizeInInches="18" />
    </Sizes>
  </CakeType>
  <CakeType style="Christmas" filling="fruit" shape="square">
    <Message>Season's Greetings</Message>
    <Description>Traditional, spiced Christmas cake</Description>
    <Sizes>
      <Option sizeInInches="15" />
      <Option sizeInInches="18" />
      <Option sizeInInches="20" />
    </Sizes>
  </CakeType>
  <CakeType style="Celebration" filling="sponge" shape="hexagonal">
    <Message>Happy Birthday</Message>
    <Description>An excellent cake.</Description>
    <Sizes>
      <Option sizeInInches="15" />
    </Sizes>
  </CakeType>
</CakeCatalog>

Within this document, you can see some common aspects of XML:
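One way to see the parts of this document (the root element, the default namespace, attributes, and nested elements) is to load it and print what a parser reports. The following is a minimal sketch only. It uses Python's standard library rather than the .NET classes this chapter covers, it works on a shortened copy of the catalog, and the function name summarize is hypothetical.

```python
import xml.etree.ElementTree as ET

# A shortened copy of the cake catalog (two cake types instead of four).
CATALOG = """<?xml version="1.0" encoding="utf-8" ?>
<CakeCatalog xmlns="http://www.fourthcoffee.com/CakeCatalog.xsd">
  <CakeType style="Celebration" filling="sponge" shape="square">
    <Message>Congratulations</Message>
    <Description>General achievement</Description>
    <Sizes>
      <Option sizeInInches="10" />
      <Option sizeInInches="12" />
    </Sizes>
  </CakeType>
  <CakeType style="Christmas" filling="fruit" shape="square">
    <Message>Season's Greetings</Message>
    <Description>Traditional, spiced Christmas cake</Description>
    <Sizes>
      <Option sizeInInches="15" />
      <Option sizeInInches="18" />
      <Option sizeInInches="20" />
    </Sizes>
  </CakeType>
</CakeCatalog>"""

# The default namespace applies to every element, so lookups must name it.
# The prefix "c" is an arbitrary choice local to this sketch.
NS = {"c": "http://www.fourthcoffee.com/CakeCatalog.xsd"}


def summarize(xml_text):
    """Return the root tag and one (style, shape, message, sizes) tuple per cake."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    cakes = []
    for cake in root.findall("c:CakeType", NS):
        message = cake.findtext("c:Message", namespaces=NS)   # element text content
        sizes = [int(option.get("sizeInInches"))               # attribute values
                 for option in cake.findall("c:Sizes/c:Option", NS)]
        cakes.append((cake.get("style"), cake.get("shape"), message, sizes))
    return root.tag, cakes


if __name__ == "__main__":
    tag, cakes = summarize(CATALOG)
    print(tag)  # the root tag, qualified by the namespace URI
    for cake in cakes:
        print(cake)
```

Note that the parser reports the root element's name together with its namespace URI, which is why every lookup in the sketch has to be namespace-qualified.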
You can define the expected structure of an XML document using one of two mechanisms. The XML specification itself defines a structure definition syntax known as Document Type Definition (DTD). As you'll see later, DTDs have some drawbacks, so they have been superseded by a newer standard for document structure definition called XML Schema. An XML document can be checked against an associated DTD or XML schema; this check is called validation. If the document conforms to the DTD or XML schema, it is said to be valid.

One thing that is not shown in our sample XML document is an entity. Entities are among the more obscure parts of the XML standard. An entity shows up in an XML document as a placeholder, and at some point during the processing of the XML document, a value is typically inserted into the placeholder. Entities can be either internal or external. The value for an internal entity is defined as part of the DTD or schema. The value for an external entity is obtained from an external source, such as a URL. The act of retrieving the value of an external entity is called resolving the entity. We will not cover entities in great detail in this chapter, but you'll learn more about them at certain points, when they're relevant.

Because XML is a text-based format, it has certain disadvantages compared with native data formats, such as
However, XML also has advantages:
Roles for XML

XML can perform various roles within an enterprise application:
What Applications Need from XML Support

As you've seen, XML is a flexible, text-based format. However, you still need facilities that make it easy to manipulate and apply XML:
Regarding the last point, you'll see throughout this chapter that two distinct techniques are used for manipulating XML data. Stream-based processing reads the data through once, presenting the data as soon as it arrives and discarding it once it has been read. This type of processing is ideal when you're dealing with large amounts of data or data with little context, and it's ideal for filtering or when no manipulation of the data is needed. The use of a stream results in comparatively fast processing and a comparatively small memory footprint. However, one problem with stream-based processing relates to context sensitivity. If the meaning of tags and text in your document depends on the context in which you find them, you might have to keep track of the current context yourself when you use stream-based processing. This can mean using many Boolean flags or building complex state models.

The alternative mechanism, in-memory processing of XML documents, tends to be slower and more memory intensive, but you have completely random access to the document and you can add, remove, or change parts of it as you see fit. In-memory processing does not have the context issues that are inherent in the stream-based processing model. Because you can revisit any piece of the document, you can work out the context of any part when you need it rather than having to cache the current context.

Processing XML Data

Given data in an XML format, what might you do with it? There are certain tasks you will often perform when manipulating XML. These include
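To make the contrast between stream-based and in-memory processing concrete, here is a minimal sketch of both. It is written in Python with the standard library, not with the .NET classes discussed in this chapter, and it uses a cut-down cake catalog with the namespace omitted for brevity. The function names are hypothetical. The streaming function has to remember which cake type it is currently inside (the context problem described earlier), while the in-memory function can simply navigate to the part of the tree it wants to change.

```python
import io
import xml.etree.ElementTree as ET

# Cut-down catalog for illustration; the namespace is omitted to keep the sketch short.
CATALOG = b"""<CakeCatalog>
  <CakeType style="Celebration">
    <Message>Congratulations</Message>
    <Sizes><Option sizeInInches="10" /><Option sizeInInches="12" /></Sizes>
  </CakeType>
  <CakeType style="Christmas">
    <Message>Season's Greetings</Message>
    <Sizes><Option sizeInInches="15" /><Option sizeInInches="18" /></Sizes>
  </CakeType>
</CakeCatalog>"""


def largest_size_by_style_streaming(data):
    """Stream-based: a single forward pass that tracks its own context."""
    largest = {}
    current_style = None  # context we must carry ourselves between events
    for event, elem in ET.iterparse(io.BytesIO(data), events=("start", "end")):
        if event == "start" and elem.tag == "CakeType":
            current_style = elem.get("style")
        elif event == "end" and elem.tag == "Option":
            size = int(elem.get("sizeInInches"))
            largest[current_style] = max(size, largest.get(current_style, 0))
        elif event == "end" and elem.tag == "CakeType":
            current_style = None
            elem.clear()  # discard the processed subtree to keep memory use low
    return largest


def add_size_in_memory(data, style, new_size):
    """In-memory: load the whole tree, then locate and modify any part of it."""
    root = ET.fromstring(data)
    for cake in root.findall("CakeType"):
        if cake.get("style") == style:
            ET.SubElement(cake.find("Sizes"), "Option", sizeInInches=str(new_size))
    return root


if __name__ == "__main__":
    print(largest_size_by_style_streaming(CATALOG))
    updated = add_size_in_memory(CATALOG, "Christmas", 20)
    print(ET.tostring(updated, encoding="unicode"))
```

The streaming version never needs the whole document at once, but the current_style variable is exactly the kind of hand-maintained state that grows into flags and state models for more complex documents. The in-memory version pays for the full tree up front and in return can change the document in place.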
Support for XML in Visual J# and the .NET Framework

Visual J# and the .NET Framework provide a great deal of support for the generation, consumption, and manipulation of XML when you develop applications.

Standards and Mechanisms Supported by the .NET Framework

As mentioned previously, there are two primary approaches to processing XML programmatically. The first is to perform forward-only, noncached parsing. This approach is well supported by classes in the .NET Framework. Although there is no official standard for this style of processing, it is commonly used when processing XML. (SAX provides a similar mechanism, but with a different philosophy, and it is in itself not a standard.) The second approach is to use in-memory manipulation through the Document Object Model (DOM). DOM is a standard defined by the World Wide Web Consortium (W3C) and is fully supported by classes in the .NET Framework.

The .NET Framework supports XML standards for document structure, namespaces, XSLT, and XPath. Other applicable XML-related standards might be supported in the future as they are formalized under the W3C.

Classes in the .NET Framework

The .NET classes for XML manipulation are split across several namespaces in the .NET Framework Class Library. These namespaces are
In this chapter, we'll focus primarily on the document manipulation and validation capabilities provided by the classes in the System.Xml and System.Xml.Schema namespaces. Chapter 6 covers the transformational and navigational capabilities supported by the classes in System.Xml.Xsl and System.Xml.XPath. The serialization capabilities provided by the classes under System.Xml.Serialization are discussed in Chapter 10.

Manipulating XML Files in Visual J#

If your application uses XML, you might need to edit XML documents, define XML schemas, and so on. Naturally, you are free to do this manually (in Notepad) or in a specific XML-oriented tool. However, you can perform most XML-related tasks without leaving the Visual Studio .NET environment. You can add any relevant XML files or schemas to your Visual Studio .NET project by importing existing files or by creating new ones, as shown in Figure 5-1.

Figure 5-1. XML File and XML Schema options in the Add New Item dialog box
You can edit an XML document or schema in Visual Studio .NET using the XML Designer. The XML Designer allows you to view and edit an XML file in two ways. You can manipulate the raw XML as shown in Figure 5-2, or you can work with a structured data grid form of the data, as shown in Figure 5-3. You can easily switch between the views by choosing the appropriate command from the View menu. The XML Designer ensures that the two views are kept synchronized: any changes, additions, or deletions made in one view will be reflected in the other.

Note

Data view in the XML Designer can show only regular, structured data, such as the results of a database query or a business document such as a purchase order. Other XML documents that use irregular tagging, such as marked-up text produced by an XML-based word processor, will not display correctly in Data view.

Figure 5-2. Working with raw XML in Visual Studio .NET
Figure 5-3. Working with XML in a data grid in Visual Studio .NET
The XML Designer also lets you create and manipulate XML schema documents (XSD files). Because XML schemas are themselves XML documents, the XML Designer again provides two views of the document. You can view the XML schema as raw XML or in a graphical view that shows the relationships between the different types of data in the document. (If you're familiar with databases, this will look very similar to the representation of a database schema.) You can create a schema from an existing XML document, import a preexisting schema, or create your own schema from scratch. You can then associate the XML file with a schema within the project through the XML document's properties. Once the file has a schema associated with it, it can be validated within Visual Studio .NET. For more information about the XML Designer, see the Visual Studio .NET documentation.